Comments on How to run a command on a list of files?

Parent

How to run a command on a list of files?

+4
−0

Suppose I have a list of files on standard input. These may be the output of find, cat filelist.txt or something else.

How can I run a command on each file in turn?


Post
+5
−0

There are several options, like xargs and for. I'll leave those for other answers and only describe my favorite, GNU Parallel. You will have to install it separately, because unlike the inferior xargs it usually does not come pre-installed.

Parallel can take a list of files on STDIN and run a command on them.

# Count lines in each regular file under the current directory using "wc -l FILENAME"
find . -type f | parallel wc -l {}

Actually, it can also take the list from other sources, like CLI arguments.

# Apply wc to only txt files in current dir (not subdirs)
parallel wc {} ::: *.txt

It supports various ways of manipulating the filename:

# Rename foo.JPG files to foo.jpg
find . -type f | grep '\.JPG$' | parallel mv {} {.}.jpg

As you can see, you can do any kind of manipulation you want on the file list before you pipe it to Parallel.
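If you want to check what a replacement string such as {.} expands to before using it in a real command, one option is to pass it to echo first. This is only a small sketch, assuming GNU Parallel is installed; show_parts is a made-up helper name and photos/foo.JPG is a made-up path that does not need to exist.

```shell
# Print the common replacement strings for each argument:
#   {}   the input unchanged
#   {.}  the input without its extension
#   {/}  the basename
#   {//} the directory name
#   {/.} the basename without its extension
show_parts() {
    parallel echo {} {.} {/} {//} {/.} ::: "$@"
}

show_parts photos/foo.JPG
```

For that input this should print photos/foo.JPG photos/foo foo.JPG photos foo on one line.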

Parallel also supports many other features like parallelization, grouping output with color, a dry run mode and so on. See man parallel for these.
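As an illustration of the dry run mode, here is a sketch that prints the mv commands for a rename without executing them. It assumes GNU Parallel is installed; preview_rename is a hypothetical helper name, and a.JPG and b.JPG are placeholder names that do not need to exist.

```shell
# Show the commands that would run, without renaming anything.
# -k (--keep-order) prints results in the same order as the input.
preview_rename() {
    parallel -k --dry-run mv {} {.}.jpg ::: "$@"
}

preview_rename a.JPG b.JPG
```

This should print mv a.JPG a.jpg and mv b.JPG b.jpg on separate lines. Dropping --dry-run performs the rename.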


Note that normally, these tools assume 1 file per line (delimited by \n). If you have exotic filenames, such as those with \n in the name, this won't work right. There are various workarounds for this, for example GNU find -print0, fd --print0 or parallel --null. You can find these in the man pages of the tools you use.
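For example, the find -print0 and parallel --null workarounds can be combined roughly like this. Treat it as a sketch rather than a definitive recipe: it assumes GNU Parallel and a find that supports -print0, and count_lines_null and the file name are made up for the demo.

```shell
# Run "wc -l" on every regular file under a directory, passing the
# names NUL-delimited so a newline inside a name is not a separator.
count_lines_null() {
    find "$1" -type f -print0 | parallel --null wc -l {}
}

# Demo on a throwaway directory holding one file whose name
# contains a newline (hypothetical name, three lines of content).
demo_dir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$demo_dir/$(printf 'abc\ndef')"
count_lines_null "$demo_dir"
rm -rf "$demo_dir"
```

The demo should report a count of 3 for the single file. How wc displays the odd name can differ between versions.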

I personally prefer a workaround called "don't put newlines in filenames, and if you find any, delete them on sight, uninstall the program that made them, and yell at the developer in the issue tracker".


2 comment threads

Parallel sounds nice! (1 comment)
You may want to add a warning that parsing the output of `find` is strongly discouraged, because it w... (3 comments)
AdminBee‭ wrote 11 months ago · edited 11 months ago

You may want to add a warning that parsing the output of find is strongly discouraged, because it will stumble on filenames with non-standard characters, which can actually include the new-line (although the most common pitfall in these scenarios is the space character). The parallel approach using globs is a safer bet, but may run into problems from the shell if the number of matches is excessive.

matthewsnyder‭ wrote 11 months ago

Does find also have that problem? I thought that was a concern just for ls.

Canina‭ wrote 11 months ago · edited 11 months ago

I believe it can, at least in some circumstances. With the GNU coreutils 8.32 ls and GNU findutils 4.8.0 find on Debian 11/Bullseye, observe:

$ cd $(mktemp -d)
$ touch abc$'\n'def
$ /bin/ls -N
abc?def
$ /bin/ls
'abc'$'\n''def'
$ find . -type f -print
./abc?def
$

Depending on what kind of parsing you are doing, I imagine that there's a non-trivial chance that it will choke when fed 'abc'$'\n''def'.

Admittedly, newlines in file names aren't exactly common; it's just an example that makes the problem very obvious.
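To see what a pipeline, rather than a terminal, receives for such a name, here is a minimal sketch that counts entries under both delimiter conventions. It assumes a find that supports -print0 (GNU and BSD find do); the directory and file name are throwaway values created for the demo.

```shell
# Create one file whose name contains a newline.
dir=$(mktemp -d)
touch "$dir/$(printf 'abc\ndef')"

# Newline-delimited: the single name arrives as two "lines".
lines=$(find "$dir" -type f -print | wc -l)

# NUL-delimited: count the NUL terminators, one per file.
nuls=$(find "$dir" -type f -print0 | tr -cd '\0' | wc -c)

echo "lines=$lines nuls=$nuls"
rm -rf "$dir"
```

With a single file present, this should report 2 for the newline-delimited count and 1 for the NUL-delimited count, which is why a line-based consumer would treat the one file as two.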