greedy capture with sed

I am trying to greedily capture text with sed. For example, I have the string abbbc, and I want to capture all of the repeated b characters, so that my result is bbb. Here's an attempt at a solution:

$ sed -n 's/.*\(b\+\).*/\1/p' <<< abbbc
b

As shown in the output of the command, the capture only obtains a single b rather than my desired result bbb.

I know I could prepend and append the "not b" pattern ([^b]) to my capture, which would give me the desired result:

$ sed -n 's/.*[^b]\(b\+\)[^b].*/\1/p' <<< abbbc
bbb

However, this workaround is inelegant, and it gets much more complicated when the pattern is less simple. Is there another way to force the capture to be greedy?

1 answer

The b\+ part of the regex is already greedy; in sed, all repetitions are. Your problem is that the leading .* is also greedy and comes first, so it consumes the a and as many bs as it can, leaving only the single b that b\+ needs in order to match. For this example, you can change that part to [^b]*:

$ sed -n 's/[^b]*\(b\+\).*/\1/p' <<< abbbc
bbb
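
To see how the text is divided, it can help to capture the leading part as well and print both groups. The following is a minimal sketch, assuming GNU sed (the \+ operator is a GNU extension to basic regular expressions); show_groups is a hypothetical helper name used only for this illustration.

```shell
# Print what the leading pattern and the b\+ group each consumed.
# Assumes GNU sed: \+ is a GNU extension in basic regular expressions.
# show_groups is a hypothetical helper, not part of sed.
show_groups() {
  # $1: pattern for the leading part; $2: input string
  printf '%s\n' "$2" | sed -n "s/\($1\)\(b\+\).*/[\1] [\2]/p"
}

show_groups '.*' abbbc      # prints: [abb] [b]
show_groups '[^b]*' abbbc   # prints: [a] [bbb]
```

With .* in front, the first group takes abb and leaves a single b for the capture. With [^b]* in front, the first group stops at the a, so b\+ is free to match the whole run.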

For more complicated patterns, sed may not be enough, since it has no non-greedy quantifiers. grep -o, which prints only the matched text, might be a more natural fit for what you're trying to do anyway:

$ grep -o 'b\+' <<< abbbc
bbb
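
One thing to be aware of with grep -o is that it prints every match on its own line, so an input with several runs of b produces several lines. The sketch below assumes GNU grep (again because of \+) and uses a hypothetical helper named b_runs; piping through head -n 1 is one possible way to keep only the first run.

```shell
# List each run of b characters found in the input, one per line.
# Assumes GNU grep: \+ is a GNU extension in basic regular expressions.
# b_runs is a hypothetical helper used only for this illustration.
b_runs() {
  # $1: input string
  printf '%s\n' "$1" | grep -o 'b\+'
}

b_runs abbbcbd
# prints:
# bbb
# b

# Keep only the first run.
b_runs abbbcbd | head -n 1   # prints: bbb
```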