
Comments on Simplest way of stripping leading/trailing whitespace from file or program output

Parent

Simplest way of stripping leading/trailing whitespace from file or program output

+4
−1

What is the simplest shell idiom for stripping leading and trailing whitespace from a file or program output? Ideally I am looking for the equivalent of trim or strip methods in some languages.

The ideal solution should

  • skip empty lines at the beginning and end of the file/stream
  • provide an option to also strip leading and trailing whitespace from all non-empty lines

1 comment thread

Do tab spaces count as leading or trailing white space? (2 comments)
Post
+0
−0

I'll post this as an example of what I'm looking to do.

The following script:

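import sys

# Read all of standard input at once
a = sys.stdin.read()
# Drop whitespace (including blank lines) at the start and end of the stream
b = a.strip()
# Strip leading and trailing whitespace from each remaining line
c = map(lambda s: s.strip(), b.splitlines())

for s in c:
    print(s)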

Will remove:

  • Whitespace at the beginning and end of the file or stream
  • Whitespace at the beginning and end of each line (except for the legitimate line ending, of course)

$ echo -e " hello\nmellow \nworld\n\n\n" | python trim.py | bat -A
───────┬─────────────
       │ STDIN
       │ Size: -
───────┼─────────────
   1   │ hello␊
   2   │ mellow␊
   3   │ world␊
───────┴─────────────

Caveats:

  • In practice, you would probably want to make this script executable and put it in a directory on your shell's PATH to use it
  • It might be worth adding some CLI flags to control which whitespace exactly is removed
  • The performance (especially memory usage) is probably poor: the script reads the entire input into memory instead of handling one line at a time while maintaining a "consecutive blanks" buffer for removing trailing empty lines
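
The last two caveats could be addressed along the following lines. This is only a minimal sketch, not a tested replacement for the script above: the function name trim_stream and the strip_lines parameter are hypothetical names chosen for illustration. It processes one line at a time and holds back blank lines as a counter until it is known whether more content follows.

```python
from typing import Iterable, Iterator


def trim_stream(lines: Iterable[str], strip_lines: bool = True) -> Iterator[str]:
    """Yield lines with leading and trailing blank lines removed.

    trim_stream and strip_lines are hypothetical names for this sketch.
    If strip_lines is true, whitespace around each line is stripped too;
    otherwise only the line terminator is removed.
    """
    pending_blanks = 0    # blank lines seen since the last non-blank line
    seen_content = False  # becomes True at the first non-blank line
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            # Blank line: drop it if it is leading, otherwise hold it back
            # until we know whether more content follows.
            if seen_content:
                pending_blanks += 1
            continue
        # A non-blank line arrived, so the held-back blanks were interior.
        for _ in range(pending_blanks):
            yield ""
        pending_blanks = 0
        seen_content = True
        yield line.strip() if strip_lines else line
```

To use it as a filter, one would iterate over trim_stream(sys.stdin) and print each result. Memory use then depends on the longest line rather than on the size of the input, because trailing blank lines are tracked as a count instead of being stored. Note that whitespace-only interior lines are emitted as empty lines even when strip_lines is false.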

This seems like such an obvious task that there must surely be a Unix program for it already. However, I could not find anything better than a Python script or sed with a somewhat complex regex.


1 comment thread

Somewhat complex and inefficient (3 comments)
tripleee‭ wrote 10 months ago

The use of a lazy map() seems odd. You end up reading the whole file into memory anyway. Also, b seems redundant. Why not simply for line in sys.stdin: print(line.strip())?

tripleee‭ wrote 10 months ago

(Also, what's bat? A common utility for displaying control characters etc is cat -v, though the only option for cat known to POSIX is -u, for unbuffered I/O.)

matthewsnyder‭ wrote 10 months ago

It's https://github.com/sharkdp/bat. I assume people would search for something like bat linux :)

Yes, the Python code can be improved. But I'm wondering if someone has already written such a program, which is in wide circulation, before I go off maintaining my own.