
In a bash shell script, how to filter the command line argument list to unique entries only, for processing each?

+11
−0

I have a handful of shell scripts that accept any number of command line arguments, then do some relatively expensive processing based on each command line argument in turn. The general format for these goes along the lines of

#!/bin/bash

# preliminary set-up goes here

# main loop:
while test -n "$1"
do
    do_expensive_processing_for "$1"
    shift
done

# tear-down goes here

This mostly works well. However, if I happen to pass the same argument twice in one invocation, that argument gets processed twice. Since the processing is expensive, I want to avoid that.

How can I ensure that each command line argument is processed only once during a single invocation of the script, while still allowing arbitrary command line argument contents (or at least not restricting them more than the above type of bash script already would)?

I'm happy with any one instance being the one that gets processed; the order of processing is not important.

To the extent that it matters, I'm using GNU bash 5.1.


3 answers


+8
−0

Bash

Here Bash's associative arrays come in handy. The idea is to record every argument as a key in an associative array, and to process an argument only if it is not already a key in that array.

#!/bin/bash
declare -A processed  #Declare that "processed" is an associative array
for e in "$@"; do     #Loop over each argument
    if [ -z "${processed["$e"]}" ]; then
        echo "Expensively processing $e"
    fi
    processed["$e"]=1
done

Note that the above script does not shift its arguments, so if you need the argument list to be empty once processing is done, add a set -- line after the loop. Alternatively, use the script below, whose syntax is closer to your sample anyway.

#!/bin/bash
declare -A processed
while test -n "$1"
do
    test -z "${processed["$1"]}" && echo "Expensively processing $1"
    processed["$1"]=1
    shift
done
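
If you'd rather leave your existing loop untouched, one option is to build the de-duplicated list first and hand it back with set --. This is only a sketch: dedupe_args and the unique array are names I made up for illustration, and skipping empty arguments is my own choice (bash, at least in the versions I know, rejects an empty associative array subscript).

```shell
#!/bin/bash
# Hypothetical helper: fills the global array "unique" with the first
# occurrence of each non-empty argument, in the original order.
dedupe_args() {
    local -A seen=()
    local arg
    unique=()
    for arg in "$@"; do
        # Assumption: empty arguments are skipped rather than processed.
        [[ -n $arg ]] || continue
        if [[ -z "${seen[$arg]}" ]]; then
            seen["$arg"]=1
            unique+=("$arg")
        fi
    done
}

dedupe_args "$@"
set -- "${unique[@]}"   # replace the argument list with the filtered one

# The original main loop can now run unchanged.
while test -n "$1"
do
    echo "Expensively processing $1"
    shift
done
```

Unlike the two scripts above, this keeps the duplicate check out of the main loop, at the cost of one extra pass over the arguments.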

POSIX shell

Since you don't require the order to be preserved, this also works without any Bash features:

  • Pop the head of the argument list.
  • Look at the remainder of the list for a duplicate of the head.
    • If one is found, skip the expensive command for the head; the later duplicate gets processed when its turn comes.
#!/bin/sh
while [ -n "$1" ]; do
    unset suppress
    head=$1
    shift
    for e in "$@"; do
        if [ "$head" = "$e" ]; then
            suppress=1
            break
        fi
    done
    [ -n "$suppress" ] || echo "Expensively processing $head"
done
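
To make it clearer which instance ends up being processed, here is the same loop wrapped in a function and run on a fixed list. Treat it as a sketch: process_unique is a name I made up, and I swapped the loop condition for a "$#" check so that empty arguments don't end the loop early, which is a small departure from the script above.

```shell
#!/bin/sh
# Hypothetical wrapper around the loop above, so it can be called with a
# fixed argument list for demonstration.
process_unique() {
    # Assumption: loop on the argument count instead of testing "$1".
    while [ "$#" -gt 0 ]; do
        head=$1
        shift
        suppress=
        # Scan the remaining arguments for a later copy of the head.
        for e in "$@"; do
            if [ "$head" = "$e" ]; then
                suppress=1
                break
            fi
        done
        [ -n "$suppress" ] || echo "Expensively processing $head"
    done
}

process_unique b a b c a
# prints:
#   Expensively processing b
#   Expensively processing c
#   Expensively processing a
```

Each value is handled at its last occurrence. The scan compares every argument against all the ones after it, so the work grows quadratically with the number of arguments, which should rarely matter for a command line.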

+5
−0

The solution is to use an associative array to store what you've already seen:

#!/bin/bash

unset seen
declare -A seen

for arg in "$@"
do
    if [[ -z "${seen[$arg]}" ]]
    then
        echo "Doing something complicated with $arg"
        seen["$arg"]=1
    fi
done

The unset seen is there just in case the caller exported a variable named seen. The declare -A seen tells bash to treat seen as an associative array, that is, an array indexed by arbitrary strings.

The loop then tests, for each argument, whether it has not yet been seen (in which case "${seen[$arg]}" is empty). If so, it processes the argument and then marks it as seen by storing something in seen["$arg"]. What it stores doesn't really matter as long as it is not empty; if you want, you can store additional information about the argument here, e.g. the result of processing it.
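
As a rough illustration of storing the result rather than a bare 1, something like the following might do. It is a sketch under assumptions: expensive_step and process_all are made-up names, the stand-in "work" just reports the argument's length, and it assumes that both the arguments and the results are non-empty.

```shell
#!/bin/bash
# Hypothetical stand-in for the real work: reports the argument's length.
expensive_step() {
    echo "length ${#1}"
}

# Hypothetical wrapper so the loop can be run on a fixed argument list.
process_all() {
    local -A seen=()
    local arg
    for arg in "$@"; do
        if [[ -z "${seen[$arg]}" ]]; then
            # Assumption: the result is never empty, so it doubles as the marker.
            seen["$arg"]=$(expensive_step "$arg")
            echo "$arg: computed ${seen[$arg]}"
        else
            echo "$arg: reused ${seen[$arg]}"
        fi
    done
}

process_all alpha "two words" alpha
# prints:
#   alpha: computed length 5
#   two words: computed length 9
#   alpha: reused length 5
```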


+5
−0

Here's an alternative that is a one-liner drop-in for your existing script:

eval set -- $(printf "%q\n" "$@" | sort -u)

It works by escaping the initial arguments with printf's %q format, one per line, piping the escaped arguments through sort -u, which discards any duplicates, and then unescaping them with eval (which is often unsafe when handling untrusted input, but in this case you've escaped everything first so it's okay). set -- replaces the command line arguments with whatever follows, so after this line is run, your $1/shift loop will be reading from the filtered list.

Put that before your main loop and you're golden.
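
For a quick check that arguments with spaces or quotes survive the round trip, a small demo along these lines may help. It is a sketch: show_unique is a made-up name, and the "$#" guard is my own precaution, since as far as I can tell printf still emits one empty quoted word when it is given no arguments at all.

```shell
#!/bin/bash
# Hypothetical demo function. Inside a function, "set --" only replaces the
# function's own positional parameters, which keeps the demo self-contained.
show_unique() {
    local arg
    # Assumption: skip the one-liner when there is nothing to filter.
    if [ "$#" -gt 0 ]; then
        eval set -- $(printf "%q\n" "$@" | sort -u)
    fi
    echo "$# unique"
    for arg in "$@"; do
        echo "[$arg]"
    done
}

show_unique "two words" b "two words" "it's" b
```

This reports 3 unique arguments, with the embedded space and the apostrophe intact; the order in which they are listed depends on how sort collates the escaped strings.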

(This does not remotely preserve argument order. The associative array approaches are better if you want the order of arguments to be the order in which the tasks are performed in the event of no duplicates. Since you said that doesn't matter to you, though, I thought this might be an approach worth considering.)
