Recursively remove files with the same name as the ones that end in `.part`

+5
−0

I want to remove all files with the ".part" extension in the current directory and its subdirectories, including files with the same name but different extension.

Is this correct?

find . -name '*.part' -exec sh -c 'base="$(basename "$1" .part)"; find . -name "$base*" -delete' sh {} \;

3 answers

+0
−3

For each file named foo.xyz, you want to delete foo.xyz.part. It doesn't matter whether foo.xyz.part exists; you can just attempt the deletion and skip errors.

You can get a list of all files with find. You don't want the ones ending in .part, so use grep to filter them out: find | grep -v '\.part$'. Here $ anchors the match to the end of the line, and the dot is written as \. because an unescaped . matches any character in a regex.

You can then attempt to delete the corresponding .part file for each one: find | grep -v '\.part$' | parallel rm {}.part

If the file doesn't exist, Parallel will show you the error message, but it will still delete the ones that do exist. You can do a bunch of extra filtering with comm to only attempt to delete those files which do exist, but there's no need in this case.
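For comparison, here is a minimal sketch of the "only delete what exists" variant. It swaps Parallel and comm for a plain while read loop with an existence test, and it runs in a scratch directory from mktemp; the file names are made up for illustration, and names containing newlines are assumed not to occur.

```shell
#!/bin/sh
# Scratch directory with illustrative (hypothetical) file names.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch foo.xyz foo.xyz.part bar.txt orphan.part

# List regular files that do not end in .part, then remove NAME.part
# only when it actually exists.
find . -type f | grep -v '\.part$' | while IFS= read -r f; do
    if [ -e "$f.part" ]; then
        rm -- "$f.part"
    fi
done

remaining=$(find . -type f | LC_ALL=C sort)
printf '%s\n' "$remaining"
cd / && rm -rf "$demo"
```

Under this interpretation orphan.part survives, because no file named orphan exists next to it.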


+7
−0

It is incorrect for two reasons.

1. File names containing glob characters

This is an edge case scenario.

Consider this structure:

.
├── abc
├── abc.part
├── cde
└── c*e.part

The outermost Find will find

  • abc.part, so base=abc and the innermost Find looks for files matching the glob abc*, which matches the abc file. Good.
  • c*e.part, so base=c*e and the innermost Find looks for files matching the glob c*e*, which matches the cde file. Bad, because cde does not contain c*e.
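The second case can be reproduced in a throwaway directory. This is only a sketch: the directory comes from mktemp, and the variable names base and matched are chosen for illustration.

```shell
#!/bin/sh
# Recreate the layout above in a scratch directory.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch abc abc.part cde 'c*e.part'

# What the inner find receives for the file named c*e.part.
base=$(basename './c*e.part' .part)

# find treats the -name argument as a pattern, so the * inside base
# acts as a wildcard: cde is listed even though it does not contain c*e.
matched=$(find . -name "$base*" | LC_ALL=C sort)
printf '%s\n' "$matched"

cd / && rm -rf "$demo"
```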

2. File names with extra characters

If you have abcde and abc.part files, the former will be deleted because it matches abc* as should be clear from the previous case discussion.

This particular problem would be easily fixed by changing $base* -> $base.*.
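A quick way to compare the two patterns, again as a sketch in a scratch directory with made-up file names:

```shell
#!/bin/sh
demo=$(mktemp -d)
cd "$demo" || exit 1
touch abc abc.part abc.txt abcde

# abc* matches anything that starts with abc, including abcde.
loose=$(find . -name 'abc*' | LC_ALL=C sort)

# abc.* requires a dot right after abc, so abcde is left alone.
strict=$(find . -name 'abc.*' | LC_ALL=C sort)

printf 'abc*:\n%s\n\nabc.*:\n%s\n' "$loose" "$strict"
cd / && rm -rf "$demo"
```

Note that abc.* also stops matching a bare abc, so a file with no extension at all would need to be handled separately.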

Proposed solution

Point 1 is the real challenge: it is quite involved to feed the file names back into another Find's -name argument and escape the meta-characters, which is always a minefield.
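To give an idea of what such escaping involves, here is a rough sketch. The helper escape_glob is hypothetical (it is not part of Find or any standard tool), it assumes that -name honours backslash escapes, and it does not handle file names containing newlines.

```shell
#!/bin/sh
# Hypothetical helper: put a backslash in front of the characters that
# find -name treats specially (\, *, ?, [ and ]).
escape_glob() {
    printf '%s\n' "$1" | sed 's/[][\\*?]/\\&/g'
}

demo=$(mktemp -d)
cd "$demo" || exit 1
touch cde 'c*e.part' 'c*e.txt'

# The * in the name is now matched literally, so cde is not picked up.
pattern=$(escape_glob 'c*e')
found=$(find . -name "$pattern.*" | LC_ALL=C sort)
printf '%s\n' "$found"

cd / && rm -rf "$demo"
```

Even in this small form, the result depends on getting the quoting right in the shell, in sed, and in Find at the same time, which is why avoiding the round trip is attractive.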

I propose instead to use a shell with support for **, the recursive glob, for example Bash or Ksh with globstar option set or Zsh.

#!/bin/bash
shopt -s globstar #Not needed in Zsh
for f in ./**/*.part; do
    rm ./**/"$(basename "$f" .part)".*
done

For a breakdown,

  • In the for line, **/*.part matches ./a.part but also ./a/b/c.part (hence "recursive glob").
  • In the rm line, "$(basename "$f" .part)" removes all directory components of the file name and its .part extension. This would boil down to a and c in our example.
    So the full line rm ./**/"$(basename "$f" .part)".* recursively removes files matching the a.* and c.* patterns.

It is crucial not to quote the * characters in the example, because we want them to act as globs (and not to be taken literally).
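The effect of basename and of quoting can be checked in isolation. This sketch uses plain sh (no globstar) and made-up file names in a scratch directory:

```shell
#!/bin/sh
demo=$(mktemp -d)
cd "$demo" || exit 1
touch c.part c.txt

# Directory components and the .part suffix are both removed.
name=$(basename ./a/b/c.part .part)

# Unquoted *: the shell expands the pattern to the matching file names.
set -- "$name".*
expanded=$#

# Quoted *: the word stays literal and names no existing file.
set -- "$name.*"
literal=$1

printf 'name=%s expanded=%s literal=%s\n' "$name" "$expanded" "$literal"
cd / && rm -rf "$demo"
```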


+1
−0

I might be inclined to try...

find . -type f -name '*.part' -exec sh -c '
  [ -f "${1%.part}" ] && rm -i -- "${1%.part}"; 
  for f in "${1%.part}".*; do 
    [ -f "$f" ] && rm -i -- "$f"; 
  done
' -- {} \;

(newlines for readability; can be elided if one-liner means something to you...)

  1. find . -type f -name '*.part' — find files ending with .part
  2. -exec sh -c '...' -- {} \; — run a shell script ... for each found file; path to file is in $1 in child script
  3. "${1%.part}" — strip .part from the end of the path in $1 (like basename's suffix removal, but it keeps the directory part and needs no extra process)
  4. [ -f "${1%.part}" ] && ...; — if a file exists with no extension, do the ... bit
  5. rm -i -- "${1%.part}" — delete the file with no extension
  6. for f in "${1%.part}".*; do ... done — loop each found path matching the filename with any extension; path is stored in $f (this includes the one with the .part extension)
  7. [ -f "$f" ] && ...; — if the path in $f exists and is a file, do the ... bit
  8. rm -i -- "$f" — remove the file in $f

Note that I'm using various checks that the thing I'm asking to delete is a file, not a directory, link, fifo, etc.
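The difference between the ${1%.part} expansion and basename is easy to see on a sample path. A small sketch, with an illustrative path that does not need to exist:

```shell
#!/bin/sh
path='./a/b/c.part'

# Parameter expansion strips the suffix but keeps the directory.
stripped=${path%.part}

# basename strips the suffix and the directory.
name=$(basename "$path" .part)

# The suffix is only removed when it is actually present.
other='./a/b/c.txt'
unchanged=${other%.part}

printf '%s\n%s\n%s\n' "$stripped" "$name" "$unchanged"
```

Keeping the directory is what lets the for loop glob in the same directory as the .part file.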

If limiting only to files is less of a concern, you might well be able to shorten this to...

find . -name '*.part' -exec sh -c 'rm -i -- "${1%.part}" "${1%.part}".*' -- {} \;

rm may write errors if its arguments don't expand to existing paths; hide them with a 2>/dev/null redirection if you care.
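A dry run of the shortened form in a scratch directory might look like the sketch below. Dropping -i so that it runs unattended is an assumption made for the demo, and the file names are invented.

```shell
#!/bin/sh
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir sub
touch a a.part a.txt sub/b.part sub/b.log keep.txt

# Same shape as the command above, without -i, and with rm's complaint
# about the missing sub/b discarded.
find . -name '*.part' -exec sh -c \
    'rm -- "${1%.part}" "${1%.part}".* 2>/dev/null' -- {} \;

remaining=$(find . -type f | LC_ALL=C sort)
printf '%s\n' "$remaining"
cd / && rm -rf "$demo"
```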

For fewer subshells, you may be able to pass all found files to the same shell in one go, with...

find . -name '*.part' -exec sh -c 'while [ -n "$1" ]; do rm -i -- "${1%.part}" "${1%.part}".*; shift; done' -- {} \+

...but this might be painful for larger file lists.
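As a variation on the batched form, the inner loop can also be written with for p do, which walks the arguments without testing for an empty string. This is a sketch only: it uses rm -f instead of rm -i so that it runs unattended and stays quiet about missing names, and the file names are invented.

```shell
#!/bin/sh
demo=$(mktemp -d)
cd "$demo" || exit 1
touch a a.part b.part b.bak keep.txt

# One sh invocation receives many paths; "for p do" iterates over "$@".
find . -name '*.part' -exec sh -c '
  for p do
    rm -f -- "${p%.part}" "${p%.part}".*
  done
' sh {} +

remaining=$(find . -type f | LC_ALL=C sort)
printf '%s\n' "$remaining"
cd / && rm -rf "$demo"
```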

In general, note that there is technically a race condition between the various tests and the eventual delete, but that's only a concern if multiple processes are acting on that directory tree. Not sure how to avoid that.

Finally, rm -i is used to prompt y/n for each file to delete, as a safety net. Remove the -i switch from the rm calls if you are confident.


1 comment thread

Different interpretation of the problem (2 comments)
