Efficiently determining disk usage of a folder (without starting from scratch every time)
When I use my computer, one question I commonly want to answer for myself is "how much space is being used by the contents of this folder?". Typical file/window managers, in my experience, answer this question the same way that Windows does: by recursing over directory contents and summing their logical sizes. This doesn't suit my needs, for three reasons:
- While the logical size of an individual file is interesting to me, a sum of logical sizes is not; I want a sum of physical sizes, because the question is about disk usage.
- It does the calculation (and directory traversal) on the fly, and doesn't show a progress bar or even a clear indication that it's done. Sometimes the file count and size sum will pause for seconds at a time and then start increasing again.
- It's very slow.
I know that I can use `du` at the command line to get physical sizes, and it's clear when `du` is finished because it outputs to the terminal and eventually returns to a terminal prompt. However, it doesn't solve the performance issue.
Is there a filesystem that natively caches this information about directories, or well-known software that maintains such a cache, so that if I e.g. check the size of `/home/user`, the size of `/home/user/Desktop` is already known and can be returned instantaneously (as long as the subfolder hasn't been modified in the meantime)? Similarly, caching the result for `/home/user/Desktop` should speed up a later check of `/home/user`, since it wouldn't have to re-examine the Desktop contents. It would also be nice to have a GUI for such a program.
I thought about making such a program, but I don't want to reinvent the wheel. I'd also be interested if there's any way to make `ext4` filesystems cache this information automatically, even though they don't appear to do so by default.
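To make the idea concrete, here's a minimal sketch of the kind of program I have in mind (Python; the cache location and format are just placeholders I made up). Each directory's physical size is memoized and reused while its mtime is unchanged; subdirectories are still revalidated recursively, since a directory's mtime doesn't reflect changes deeper in the tree:

```python
#!/usr/bin/env python3
"""Sketch only, not an existing tool: physical sizes from st_blocks,
memoized per directory and reused while the directory's mtime matches.
The cache file path is a made-up placeholder."""
import json
import os
import sys

CACHE_FILE = os.path.expanduser("~/.cache/dirsize.json")  # hypothetical

def physical_size(path, cache):
    """Bytes actually allocated under `path`, reusing cached subtree totals."""
    st = os.lstat(path)
    ent = cache.get(path)
    if ent is not None and ent["mtime"] == st.st_mtime:
        # Direct entries are unchanged, but a directory's mtime does not
        # reflect deeper changes, so subdirectories are revalidated. This
        # skips stat()ing the (many) files, which is where the time goes.
        return ent["own"] + sum(physical_size(d, cache) for d in ent["subdirs"])
    own = st.st_blocks * 512  # st_blocks counts 512-byte units (POSIX)
    subdirs = []
    with os.scandir(path) as it:
        for de in it:
            if de.is_dir(follow_symlinks=False):
                subdirs.append(de.path)
            else:
                own += de.stat(follow_symlinks=False).st_blocks * 512
    cache[path] = {"mtime": st.st_mtime, "own": own, "subdirs": subdirs}
    return own + sum(physical_size(d, cache) for d in subdirs)

if __name__ == "__main__":
    try:
        with open(CACHE_FILE) as f:
            cache = json.load(f)
    except (OSError, ValueError):
        cache = {}
    root = os.path.abspath(sys.argv[1] if len(sys.argv) > 1 else ".")
    print(physical_size(root, cache))
    os.makedirs(os.path.dirname(CACHE_FILE), exist_ok=True)
    with open(CACHE_FILE, "w") as f:
        json.dump(cache, f)
```

A real tool would also have to deal with hard links, permission errors, and cache eviction, but this illustrates why checking a parent can reuse results already computed for its children.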
2 answers
Gnome Disk Usage Analyzer (baobab) is a GUI program with a similar purpose to du and ncdu.
I haven't used it in a long time, but I believe it caches scan results. Subsequent scans should become faster, barring unexpected cache invalidation (which, granted, is a famously hard CS problem).
`ncdu` does that.
From `ncdu --help`:
```
-o FILE    Export scanned directory to FILE
-f FILE    Import scanned directory from FILE
```
You can start it on any directory you want, skip unwanted mounts, and even delete files and folders directly from inside the program.
After some tests: consecutive runs of `ncdu -o file` update the file, and do so fairly quickly. A simple script that runs `ncdu -o file $1; ncdu -f file` should be quite usable.
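For illustration, here's a rough Python rendition of that script (the export path is a placeholder; a two-line shell script would work just as well):

```python
#!/usr/bin/env python3
"""Sketch of the suggested wrapper: rescan with `ncdu -o`, then browse
the export with `ncdu -f`. The export location is a made-up example."""
import subprocess
import sys

target = sys.argv[1] if len(sys.argv) > 1 else "."
export = "/tmp/ncdu-export"  # hypothetical scratch file

# Scan `target` and write the results to the export file (no UI shown).
subprocess.run(["ncdu", "-o", export, target], check=True)
# Load the export and browse it interactively; no rescan happens here.
subprocess.run(["ncdu", "-f", export], check=True)
```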