How to list the first x files in each directory
MWE
With the following tree:
l1
└── l2
    ├── d0
    │   ├── f0
    │   ├── f1
    │   ├── f2
    │   ├── f3
    │   ├── f4
    │   └── f5
    ├── d1
    │   ├── f0
    │   ├── f1
    │   ├── f2
    │   ├── f3
    │   ├── f4
    │   └── f5
    ├── d2
    │   ├── f0
    │   ├── f1
    │   ├── f2
    │   ├── f3
    │   ├── f4
    │   └── f5
    ├── d3
    │   ├── f0
    │   ├── f1
    │   ├── f2
    │   ├── f3
    │   ├── f4
    │   └── f5
    ├── d4
    │   ├── f0
    │   ├── f1
    │   ├── f2
    │   ├── f3
    │   ├── f4
    │   └── f5
    └── d5
        ├── f0
        ├── f1
        ├── f2
        ├── f3
        ├── f4
        └── f5
8 directories, 36 files
created by this script:
#!/bin/bash
path=l1/l2
mkdir -p "$path"
# Create d0..d5, each holding empty files f0..f5
for dir in {0..5}; do
    mkdir "$path/d$dir"
    for file in {0..5}; do
        touch "$path/d$dir/f$file"
    done
done
Problem
How do I list the first three files of each directory?
First three being the first three alphabetically sorted.
Desired output:
l1/l2/d0/f0
l1/l2/d0/f1
l1/l2/d0/f2
l1/l2/d1/f0
l1/l2/d1/f1
l1/l2/d1/f2
l1/l2/d2/f0
l1/l2/d2/f1
l1/l2/d2/f2
l1/l2/d3/f0
l1/l2/d3/f1
l1/l2/d3/f2
l1/l2/d4/f0
l1/l2/d4/f1
l1/l2/d4/f2
l1/l2/d5/f0
l1/l2/d5/f1
l1/l2/d5/f2
Tried
I can get each file via:
find l1 -mindepth 3 -maxdepth 3 -type f
but I can't find a way in the manual to specify the match depth.
The output from find is also unsorted, so the first three results would not necessarily be the first three alphabetically.
Notes
My goal is to make a trimmed copy of a large dataset for preliminary, quick testing as I develop my code. Performing this trim within the code unnecessarily complicates the test code and will be a wasted effort as it will be removed for the real deal. Without it in place, I am able to test what will be the final product.
The real target using the real dataset may be subdirectories and not files. In other words, "give me the first 3 directories in level 2".
Ideally, if I can find a way to list the files, I can pipe them to a cp command.
3 answers
There are three parts to this:
1. Find all directories (in your case, it sounds like you want depth=3 only)
2. Print the top 3 files in a single directory
3. Apply 2 to each directory found in 1
1 should be a separate question, but both find and fd can do it. I prefer fd:
fd --type directory . --max-depth 3 --min-depth 3
There might be a shorter way to request a depth of exactly 3, but I don't know it.
2 is obviously done with head -n 3.
3 can be done as described in https://linux.codidact.com/posts/288310.
Putting it all together, it would be something like this:
fd --type directory . --max-depth 3 --min-depth 3 \
| parallel 'fd --type file . {} | head -n 3'
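If fd and parallel are not installed, roughly the same three-part pipeline can be sketched with find, sort, and xargs. The helper name first_files and its three parameters are made up for illustration and are not part of the answer above. The sketch assumes a find, sort, and xargs that support NUL-separated records (-print0, -z, -0), as the GNU tools do. Note that find counts depth from the starting directory, so the d* directories sit at depth 2 below l1.

```shell
#!/bin/sh
# first_files ROOT DEPTH N  (hypothetical helper, for illustration only)
# Print the first N regular files, sorted by name, of every directory
# located exactly DEPTH levels below ROOT.
first_files() {
    root=$1 depth=$2 n=$3
    # Part 1: find the directories at the requested depth (NUL-separated).
    find "$root" -mindepth "$depth" -maxdepth "$depth" -type d -print0 \
        | sort -z \
        | xargs -0 -I{} sh -c '
            # Parts 2 and 3: sorted listing of one directory, cut to N lines.
            find "$1" -mindepth 1 -maxdepth 1 -type f | sort | head -n "$2"
          ' _ {} "$n"
}
```

With the tree from the question, first_files l1 2 3 should list f0, f1, and f2 for each of d0 through d5. The inner listing is line-based, so file names containing newlines are not handled.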
2 comment threads
The following users marked this post as Works for me:
User | Comment | Date |
---|---|---|
mcp | Thread: Works for me This solution works and led me to a slightly simpler solution in the threads below. | Oct 3, 2023 at 15:48 |
Is this what you want?
Edit: Credit to Kamil Maciorowski for catching an unsafe interpolation in the previous draft; it will work for non-adversarial inputs but this newer version is safer and a better example to learn from.
find l1 -mindepth 2 -maxdepth 2 \
    -type d \! -empty \
    -exec sh -c 'printf "%s\n" "$1"/* | head -n 3' _ {} \;
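Since the stated goal is a trimmed copy of the dataset, the listing above could be piped into cp along the following lines. This is only a sketch: the helper name trim_copy and the destination directory are hypothetical, and it assumes GNU cp (for --parents and -t) and GNU xargs (for -d and -r). Names containing newlines are not handled.

```shell
#!/bin/sh
# trim_copy SRC DEST N  (hypothetical helper, for illustration only)
# Copy the first N entries (glob order, i.e. sorted by name) of every
# non-empty directory two levels below SRC into DEST, keeping the
# relative paths. Assumes GNU cp and GNU xargs.
trim_copy() {
    src=$1 dest=$2 n=$3
    mkdir -p "$dest"
    find "$src" -mindepth 2 -maxdepth 2 -type d ! -empty \
        -exec sh -c 'printf "%s\n" "$1"/* | head -n "$2"' _ {} "$n" \; \
        | xargs -r -d '\n' cp -r --parents -t "$dest"
}
```

For example, trim_copy l1 trimmed 3 should leave trimmed/l1/l2/d0/f0 through f2 (and likewise for d1 to d5) while skipping f3 to f5. The -r on cp is there so the same call also works when the selected entries are subdirectories rather than files.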
4 comment threads
Here's my approach:
find l1 -type d \
    | while IFS= read -r d; do
        find "$d" -maxdepth 1 -type f \
            | head -n3
    done
If your middle directories also contain files, it will also show them (of course, only the first 3).
This is what it shows for me in a tree similar to yours, where I added files to l2:
$ find l1 -type d | while read d; do find $d -maxdepth 1 -type f | head -n3; done;
l1/l2/f2
l1/l2/f5
l1/l2/f4
l1/l2/d0/f2
l1/l2/d0/f5
l1/l2/d0/f4
l1/l2/d5/f2
l1/l2/d5/f5
l1/l2/d5/f4
l1/l2/d4/f2
l1/l2/d4/f5
l1/l2/d4/f4
l1/l2/d2/f2
l1/l2/d2/f5
l1/l2/d2/f4
l1/l2/d1/f2
l1/l2/d1/f5
l1/l2/d1/f4
l1/l2/d3/f2
l1/l2/d3/f5
l1/l2/d3/f4
If you want the first 3 files in a certain order, you'll need to sort(1) them according to your needs; otherwise, you'll get the first three files according to find(1)'s default criteria (see https://serverfault.com/questions/181787/find-command-default-sorting-order):
find l1 -type d \
    | while IFS= read -r d; do
        find "$d" -maxdepth 1 -type f \
            | sort \
            | head -n3
    done
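The loop above starts one find per directory. As an alternative sketch, a single find can list everything at a fixed depth, and awk can then keep only the first few lines per parent directory. The helper name first_n_per_parent is made up for illustration, and the approach assumes path names without newlines.

```shell
#!/bin/sh
# first_n_per_parent ROOT DEPTH N  (hypothetical helper, for illustration)
# List regular files exactly DEPTH levels below ROOT, sorted by path,
# and keep only the first N per parent directory.
first_n_per_parent() {
    find "$1" -mindepth "$2" -maxdepth "$2" -type f \
        | LC_ALL=C sort \
        | awk -v n="$3" '{
            parent = $0
            sub("/[^/]*$", "", parent)   # strip the last path component
            if (++seen[parent] <= n) print
          }'
}
```

With the tree from the question, first_n_per_parent l1 3 3 should print the 18 paths from the desired output. Swapping -type f for -type d (and lowering the depth) would give the "first 3 directories in level 2" variant mentioned in the question.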
0 comment threads