How to run a command on a list of files?
2 answers
- If I just used `find` to generate a list of files, then find's `-exec` argument is usually the way to run some other program on each file found. If you pipe the list to `xargs` instead, note that `-P n` will run up to n commands in parallel. The best value of n depends on whether the command is limited more by your CPU or by your storage system.
- If I have a program (say, `generate_lists`) that generates a list of files, one per line, `generate_lists | while IFS= read -r filename; do some_program "$filename"; done` is usually helpful. (The shorter `for filename in $(generate_lists)` splits names on whitespace before any quoting can help.) Make sure you quote your use of `$filename`; more of them have spaces than you'd think.
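As a rough sketch of the options above, the script below runs `wc -l` on a couple of throwaway files in three ways. The temporary directory, the sample file names, and the `generate_lists` stub (backed by `find` here) are assumptions made up for illustration, and `wc -l` merely stands in for `some_program`.

```shell
# Throwaway directory with two sample files; one name contains a space.
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/one.txt"
printf 'c\n' > "$workdir/two words.txt"

# 1. find -exec: run wc -l once per file; {} is replaced by the file name.
exec_out=$(find "$workdir" -type f -name '*.txt' -exec wc -l {} \;)

# 2. xargs -P: one file per command (-n 1), up to 2 commands at a time (-P 2).
#    -print0 and -0 pass names NUL-delimited so spaces survive.
xargs_out=$(find "$workdir" -type f -name '*.txt' -print0 | xargs -0 -n 1 -P 2 wc -l)

# 3. while read: consume a one-name-per-line list from any generator.
#    generate_lists is the hypothetical program from the text, stubbed with find.
generate_lists() { find "$workdir" -type f -name '*.txt'; }
loop_out=$(generate_lists | while IFS= read -r filename; do
  wc -l "$filename"   # quoted, so "two words.txt" stays one argument
done)

printf '%s\n' "$exec_out" "$xargs_out" "$loop_out"
rm -rf "$workdir"
```

With `-P`, the commands finish in whatever order they happen to, so the order of the `xargs` output should not be relied on.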
There are several options, like `xargs` and `for`. I'll leave those for other answers and only describe my favorite, GNU Parallel. You will have to install it separately, because unlike the inferior `xargs` it usually does not come pre-installed.
Parallel can take a list of files on STDIN and run a command on them.

    # Count lines in each file under the current directory using "wc FILENAME"
    find . -type f | parallel wc {}
Actually, it can also take the list from other sources, like CLI arguments.

    # Apply wc to only txt files in current dir (not subdirs)
    parallel wc {} ::: *.txt
It supports various ways of manipulating the filename:

    # Rename foo.JPG files to foo.jpg ({.} is the input name without its extension)
    find . -type f | grep '\.JPG$' | parallel mv {} {.}.jpg
As you can see, you can do any kind of manipulation you want on the file list before you pipe it to Parallel.
Parallel also supports many other features like parallelization, grouping output with color, a dry run mode and so on. See `man parallel` for these.
Note that normally, these tools assume 1 file per line (delimited by `\n`). If you have exotic filenames, such as those with `\n` in the name, this won't work right. There are various workarounds for this, for example GNU `find -print0`, `fd --print0` or `parallel --null`. You can find these in the man pages of the tools you use.
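To make the newline problem concrete, here is a small sketch that counts the same two files once line by line and once NUL-delimited. The temporary directory and file names are invented for the demonstration, and `xargs -0` is used as the consumer only because it is commonly pre-installed; it is not one of the tools named above.

```shell
# Throwaway directory with one ordinary file and one whose name contains a newline.
workdir=$(mktemp -d)
printf 'y\n' > "$workdir/plain.txt"
printf 'x\n' > "$workdir/$(printf 'odd\nname').txt"

# Line-based list: the embedded newline splits one name into two "files".
line_count=$(find "$workdir" -type f | grep -c '^')

# NUL-delimited list: each name arrives intact, so the command runs once per real file.
# The inner sh prints the line count of the file passed to it as $1.
null_count=$(find "$workdir" -type f -print0 \
  | xargs -0 -n 1 sh -c 'wc -l < "$1"' sh \
  | grep -c '^')

# With GNU Parallel installed, the matching consumer would look like:
#   find "$workdir" -type f -print0 | parallel --null wc -l {}

echo "line-based entries: $line_count, NUL-delimited entries: $null_count"
rm -rf "$workdir"
```

The line-based count comes out one higher than the number of files actually present, which is exactly the failure mode described above.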
I personally prefer a workaround called "don't put newlines in filenames, and if you find any, delete them on sight, uninstall the program that made them, and yell at the developer in the issue tracker".