In a bash shell script, how to filter the command line argument list to unique entries only, for processing each?
I have a handful of shell scripts that accept any number of command line arguments, then do some relatively expensive processing based on each command line argument in turn. The general format for these goes along the lines of
#!/bin/bash
# preliminary set-up goes here
# main loop:
while test -n "$1"
do
    do_expensive_processing_for "$1"
    shift
done
# tear-down goes here
This mostly works well. However, it has the downside that if I pass the same argument twice during an invocation, for whatever reason, that argument gets processed twice. Since the processing is expensive, I want to avoid that.
How can I ensure that each command line argument is processed only once during a single invocation of the script, while still allowing arbitrary command line argument contents (or at least not restricting them more than the above type of bash script already would)?
I'm happy with any one instance being the one that gets processed; the order of processing is not important.
To the extent that it matters, I'm using GNU bash 5.1.
3 answers
Bash
Here Bash's associative arrays come in handy. The idea is to record every argument as a key in a separate associative array, and to process an argument only if it is not already a key in that array.
#!/bin/bash
declare -A processed  # Declare that "processed" is an associative array
for e in "$@"; do     # Loop over each argument
    if [ -z "${processed["$e"]}" ]; then
        echo "Expensively processing $e"
    fi
    processed["$e"]=1
done
Note that the above script does not shift its arguments, so if you need the argument list to be empty after the processing is done, you can add a set -- line. Or use the script below, which has a syntax more along the lines of your sample anyway.
#!/bin/bash
declare -A processed
while test -n "$1"
do
    test -z "${processed["$1"]}" && echo "Expensively processing $1"
    processed["$1"]=1
    shift
done
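One caveat with the for-loop version: as far as I know, bash (including 5.1) rejects an empty string as an associative array subscript, so an empty argument reaching processed[""]=1 triggers a "bad array subscript" error. The following is a minimal sketch, not a definitive implementation, that sidesteps this by prefixing every key with a fixed character and then rewrites the positional parameters so the rest of the script can stay as it is. The names dedupe_args and unique are made up for illustration.

```bash
#!/bin/bash

# Hypothetical helper: fill the global array "unique" with each distinct
# argument exactly once, in first-seen order.
dedupe_args() {
    local -A seen=()
    local arg
    unique=()
    for arg in "$@"; do
        # The "x" prefix keeps the subscript non-empty even for an empty argument.
        if [[ -z ${seen["x$arg"]} ]]; then
            seen["x$arg"]=1
            unique+=("$arg")
        fi
    done
}

dedupe_args "$@"
set -- "${unique[@]}"   # replace the argument list with the filtered one

for arg in "$@"; do
    echo "Expensively processing $arg"
done
```

Because the filtering happens once up front, the main loop needs no per-iteration bookkeeping. Note that a `while test -n "$1"` loop would still stop at the first empty argument, which is why this sketch iterates with `for`.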
POSIX shell
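Since you don't require the order to be preserved:
- Pop the head of the argument list.
- Look at the remainder of the list for a duplicate of the decapitated head.
- If a duplicate is found, suppress running the expensive command on the head. The duplicate is then handled when it becomes the head itself, so the last occurrence of each argument is the one that gets processed.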
#!/bin/sh
while [ -n "$1" ]; do
    unset suppress
    head=$1
    shift
    for e in "$@"; do
        if [ "$head" = "$e" ]; then
            suppress=1
            break
        fi
    done
    [ -n "$suppress" ] || echo "Expensively processing $head"
done
Here's an alternative that is a one-liner drop-in for your existing script:
eval set -- $(printf "%q\n" "$@" | sort -u)
It works by escaping the initial arguments, piping the escaped arguments through sort -u, which discards any duplicates, and then unescaping them with eval (which is often unsafe when handling untrusted input, but in this case you've escaped everything first so it's okay). set -- alters the command line arguments to be whatever follows, so after this line is run, your $1/shift loop will be reading from the filtered list.
Put that before your main loop and you're golden.
(This does not remotely preserve argument order. The associative array approaches are better if you want the order of arguments to be the order in which the tasks are performed in the event of no duplicates. Since you said that doesn't matter to you, though, I thought this might be an approach worth considering.)
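If you would rather avoid eval altogether, the same sort-based idea can be expressed with NUL-delimited records. This is only a sketch under a couple of assumptions: it needs GNU sort (for -z) and bash 4.4 or later (for mapfile -d), and the names sort_unique_args and unique are invented for this example.

```bash
#!/bin/bash

# Hypothetical helper: fill the global array "unique" with the sorted,
# de-duplicated arguments. NUL cannot occur inside an argument, so it is
# a safe record separator even for arguments containing newlines.
sort_unique_args() {
    unique=()
    # With no arguments, printf would still emit one empty record, so stop early.
    (( $# > 0 )) || return 0
    mapfile -d '' -t unique < <(printf '%s\0' "$@" | sort -zu)
}

sort_unique_args "$@"
set -- "${unique[@]}"   # the $1/shift loop now sees the filtered list
```

As with the one-liner, the resulting order is whatever sort produces, not the order in which the arguments were given.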
The solution is to use an associative array to store what you've already seen:
#!/bin/bash
unset seen
declare -A seen
for arg in "$@"
do
    if [[ -z "${seen[$arg]}" ]]
    then
        echo "Doing something complicated with $arg"
        seen["$arg"]=1
    fi
done
The unset seen is just in case the caller had an exported variable named seen. The declare -A seen tells bash to treat seen as an associative array, that is, it takes arbitrary strings as index.
The loop then tests for each argument whether it has not yet been seen (in which case "${seen[$arg]}" is empty). If so, it processes it and then marks it as seen by storing something in seen["$arg"] (what it stores doesn't really matter as long as it is not empty; if you want, you can store additional information about the argument here, e.g. the result of processing this argument).
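To illustrate the idea of storing the result rather than a plain marker, here is a hedged sketch. The function expensive_processing is a stand-in for the real work, and result, calls and process_all are names chosen for this example only. It tests whether a key exists with the ${name[key]+word} expansion, so an empty result still counts as "already processed", and it prefixes keys with "x" so that an empty argument yields a valid subscript.

```bash
#!/bin/bash

# Stand-in for the real work (hypothetical; replace with your own command).
expensive_processing() {
    printf 'processed<%s>' "$1"
}

declare -A result=()   # key: "x" + argument, value: output of the processing
calls=0                # how many times the expensive step actually ran

process_all() {
    local arg key
    for arg in "$@"; do
        key="x$arg"
        # ${result[$key]+set} expands to "set" only if the key exists,
        # even when the stored value is the empty string.
        if [[ -z ${result[$key]+set} ]]; then
            result["$key"]=$(expensive_processing "$arg")
            calls=$((calls + 1))
        fi
        printf '%s\n' "${result[$key]}"
    done
}

process_all "$@"
```

With this layout a repeated argument still produces output each time it appears, but the expensive step runs only once per distinct argument.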