Ergonomic way to search man pages
You often have to read man pages to use Linux/Unix software. However, many man pages are not easy to read. They are very long, not always conveniently arranged, and man does not appear to have any way to handle indices or sections. A great example is man rsync.
I am often in a situation where some popular tool has a LOOONG man page, and I just need to look up the one argument about a specific thing. I don't like having to read the whole thing just for one tiny part of it. It takes a long time, breaks my flow, and if I try to read it start to finish, I am usually very confused and overwhelmed by the time I get to the end. The default pager (I think less) lets me search with /, but it's not very user-friendly, and it's easy to end up with a search string that is too general or too specific.
I've been resorting to stuff like man rsync | rg -C 3 symlink, which is much nicer. However, this is still a bit too dumb sometimes. For example, man pacman | rg -C 3 owns does find the switch that tells you what package provides a file, but then you still have to dig through the man page to figure out that it's in the section of switches applicable to -Q (the section heading is much higher in the page, so ripgrep's context switch doesn't help).
Man pages have been around for half a century. Surely in this time, someone has come up with a way to peruse them that's better than this?
PS: For other people, there are tools like https://tldr.sh/ and https://github.com/cheat/cheat that provide shorter, alternate manual pages. Unfortunately these don't work for my question because they are not comprehensive - they obviously omit many details of the man page, and also there is not always a tldr page for every man page.
3 answers
Unix filters are quite handy.
First of all, you can get an index of any manual page with a simple grep(1):
$ man pacman | grep '^[^ ]'
PACMAN(8) Pacman Manual PACMAN(8)
NAME
SYNOPSIS
DESCRIPTION
OPERATIONS
OPTIONS
TRANSACTION OPTIONS (APPLY TO -S, -R AND -U)
UPGRADE OPTIONS (APPLY TO -S AND -U)
QUERY OPTIONS (APPLY TO -Q)
REMOVE OPTIONS (APPLY TO -R)
SYNC OPTIONS (APPLY TO -S)
DATABASE OPTIONS (APPLY TO -D)
FILE OPTIONS (APPLY TO -F)
HANDLING CONFIG FILES
EXAMPLES
CONFIGURATION
SEE ALSO
BUGS
AUTHORS
Pacman 6.0.2 2023-08-16 PACMAN(8)
If you also want to see subsections, you can expand that grep(1) a little bit:
$ man rsync | grep -e '^[^ ]' -e '^   [^ ]'
rsync(1) User Commands rsync(1)
NAME
SYNOPSIS
DESCRIPTION
GENERAL
SETUP
USAGE
COPYING TO A DIFFERENT NAME
SORTED TRANSFER ORDER
MULTI‐HOST SECURITY
ADVANCED USAGE
CONNECTING TO AN RSYNC DAEMON
USING RSYNC‐DAEMON FEATURES VIA A REMOTE‐SHELL CONNECTION
STARTING AN RSYNC DAEMON TO ACCEPT CONNECTIONS
EXAMPLES
OPTION SUMMARY
OPTIONS
DAEMON OPTIONS
FILTER RULES
SIMPLE INCLUDE/EXCLUDE RULES
SIMPLE INCLUDE/EXCLUDE EXAMPLE
FILTER RULES WHEN DELETING
FILTER RULES IN DEPTH
PATTERN MATCHING RULES
FILTER RULE MODIFIERS
MERGE‐FILE FILTER RULES
LIST‐CLEARING FILTER RULE
ANCHORING INCLUDE/EXCLUDE PATTERNS
PER‐DIRECTORY RULES AND DELETE
TRANSFER RULES
BATCH MODE
SYMBOLIC LINKS
DIAGNOSTICS
EXIT VALUES
ENVIRONMENT VARIABLES
FILES
SEE ALSO
BUGS
VERSION
INTERNAL OPTIONS
CREDITS
THANKS
AUTHOR
rsync 3.2.7 20 Oct 2022 rsync(1)
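If you use these indices often, the two greps above can be wrapped in small shell functions. This is only a sketch: the names toc and mantoc are made up for illustration, and it assumes that subsection headings are indented by exactly three spaces in the rendered page, which is what groff does by default.

```shell
# toc: read a rendered man page on stdin and print only its headings.
# Section headings start in column 1; subsection headings are assumed
# to be indented by exactly three spaces.
toc() {
    grep -e '^[^ ]' -e '^   [^ ]'
}

# mantoc: hypothetical convenience wrapper around man(1),
# e.g. `mantoc rsync` or `mantoc 8 pacman`.
mantoc() {
    man "$@" | toc
}
```

The page header and footer lines also start in column 1, so they show up in the index, just as in the output above.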
If you want to read only one of those sections, you can script a little bit more. Let's say you're only interested in the SYNC OPTIONS from pacman(8).
$ man -w pacman \
| xargs zcat \
| pee 'sed -n "/^\.TH/,/^\.SH/{/^\.SH/!p}"' \
'sed -n -e "/^\.SH \"\?SYNC OPTIONS/p" \
-e "/^\.SH \"\?SYNC OPTIONS/,/^\.SH/{/^\.SH/!p}"' \
| groff -man -Tutf8;
PACMAN(8) Pacman Manual PACMAN(8)
SYNC OPTIONS (APPLY TO -S)
-c, --clean
Remove packages that are no longer installed from the cache as well
as currently unused sync databases to free up disk space. When
pacman downloads packages, it saves them in a cache directory. In
addition, databases are saved for every sync DB you download from
and are not deleted even if they are removed from the configuration
file pacman.conf(5). Use one --clean switch to only remove packages
that are no longer installed; use two to remove all files from the
cache. In both cases, you will have a yes or no option to remove
packages and/or unused downloaded databases.
If you use a network shared cache, see the CleanMethod option in
pacman.conf(5).
-g, --groups
Display all the members for each package group specified. If no
group names are provided, all groups will be listed; pass the flag
twice to view all groups and their members.
-i, --info
Display information on a given sync database package. Passing two
--info or -i flags will also display those packages in all
repositories that depend on this package.
-l, --list
List all packages in the specified repositories. Multiple
repositories can be specified on the command line.
-q, --quiet
Show less information for certain sync operations. This is useful
when pacman’s output is processed in a script. Search will only
show package names and not repository, version, group, and
description information; list will only show package names and omit
databases and versions; group will only show package names and omit
group names.
-s, --search <regexp>
This will search each package in the sync databases for names or
descriptions that match regexp. When you include multiple search
terms, only packages with descriptions matching ALL of those terms
will be returned.
-u, --sysupgrade
Upgrades all packages that are out-of-date. Each
currently-installed package will be examined and upgraded if a
newer package exists. A report of all packages to upgrade will be
presented, and the operation will not proceed without user
confirmation. Dependencies are automatically resolved at this level
and will be installed/upgraded if necessary.
Pass this option twice to enable package downgrades; in this case,
pacman will select sync packages whose versions do not match with
the local versions. This can be useful when the user switches from
a testing repository to a stable one.
Additional targets can also be specified manually, so that -Su foo
will do a system upgrade and install/upgrade the "foo" package in
the same operation.
-y, --refresh
Download a fresh copy of the master package database from the
server(s) defined in pacman.conf(5). This should typically be used
each time you use --sysupgrade or -u. Passing two --refresh or -y
flags will force a refresh of all package databases, even if they
appear to be up-to-date.
Pacman 6.0.2 2023-08-16 PACMAN(8)
If you want to get rid of the formatting, you can replace groff with mandoc in the pipeline and strip the remaining overstrike sequences with col:
$ man -w pacman \
| xargs zcat \
| pee 'sed -n "/^\.TH/,/^\.SH/{/^\.SH/!p}"' \
'sed -n -e "/^\.SH \"\?SYNC OPTIONS/p" \
-e "/^\.SH \"\?SYNC OPTIONS/,/^\.SH/{/^\.SH/!p}"' \
| mandoc -Tutf8 2>/dev/null \
| col -pbx;
The origin of this script is in fact in the Linux man-pages repository (it's a bit different, as I deal with uncompressed pages, and often want to read the same section in several pages in the same directory). I've used it for reviewing changes to the pages.
Here's the original code (see the man_section() function):
https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/tree/scripts/bash_aliases
The above script only works with man(7) pages. mdoc(7) pages will need a slightly different (but very similar) script.
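One way to sidestep the man(7)/mdoc(7) difference, and the dependency on pee (which comes from moreutils), is to cut the section out of the rendered text instead of the roff source. The function below is a rough sketch of that idea, not the man_section() function from the repository; its name is made up, and it assumes that every top-level heading starts in column 1 of the rendered page.

```shell
# man_text_section: read a rendered man page on stdin and print one
# top-level section, from its heading up to (but not including) the next
# column-1 line. $1 is the literal text the heading must start with.
man_text_section() {
    WANT=$1 awk '
        # A column-1 line is a heading (or the page header/footer):
        # decide whether it opens the section we want.
        /^[^ ]/ { inside = (index($0, ENVIRON["WANT"]) == 1) }
        # Print every line while we are inside that section.
        inside
    '
}
```

For example, man pacman | man_text_section 'SYNC OPTIONS' should print roughly the same text as the pipeline above, minus the page header and footer. Because it works on rendered text, you lose the ability to re-render the section with a different width or output device.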
With more powerful scripts, you'll be able to do more powerful selections.
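As an example of such a selection, here is a sketch of a grep that also tells you which section each match lives in, which is the piece that was missing from the rg -C 3 approach in the question. The function name is made up, and it again assumes that section headings start in column 1 of the rendered page.

```shell
# grep_with_heading: read a rendered man page on stdin and print every
# line matching the regex in $1, preceded by the heading of the section
# it belongs to (each heading is printed at most once).
grep_with_heading() {
    PAT=$1 awk '
        # Remember the most recent column-1 line as the current heading.
        /^[^ ]/ { heading = $0; shown = 0 }
        $0 ~ ENVIRON["PAT"] {
            if (!shown) { print heading; shown = 1 }
            # Do not print the heading twice if it matched the regex itself.
            if ($0 != heading) print
        }
    '
}
```

Usage would look like man pacman | grep_with_heading owns. The regex is an awk extended regular expression, not ripgrep syntax.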
0 comment threads
Unix systems are made out of many small tools that focus on specific tasks but are general enough that the investment made in learning their specific switches and hotkeys pays off over many applications.
In that spirit, I would encourage you to get to know less better. It's not a tool specifically designed for the one use case of reading man pages; it's a general-purpose tool that will come in handy in many situations. That is why learning to use it well is worth the small ergonomic cost of not using (or writing, as I don't know of such software) a program designed specifically to make reading man pages newbie-friendly.
Did you know:
- Anything that you can tell less on the command line, you can toggle while you are viewing what you're viewing: to turn strict case sensitivity on or off, just type -i (the exact same characters you'd use on the command line to turn this mode on).
- less's filters give a great way to find a specific occurrence of a string when you don't know exactly what you're looking for. As you brought it up as an example, here's how I look for something to do with symlinks in man rsync. I'd type &symlink⏎⃣ to view a list of all the lines with the word symlink in them. That's a lot of results; I'd type /↑⃣ to highlight the word I just used to filter. Let's say that I'm interested in the line near the bottom of the results that says, ‘Symbolic links are considered unsafe if they are absolute symlinks’. Sometimes I'd type J to move down the list until that line is at the top of the screen (J, as opposed to j or down arrow, will keep moving the screen even if less is at the end of its content, which is very useful in this context), but in this case, I'm more likely to type /unsafe⏎⃣ (which searches within the already filtered results) and type n a much smaller number of times to get to the interesting line. Then I'll type &⏎⃣ to turn off the filter, and now I'm viewing that line in context in the full man page.
- Want a table of contents for the man page? Use filters again: &^[^ ]⏎⃣ shows only the section headers. J will move down them until I have arrived at the section I want, and then &⏎⃣ ‘expands’ it.
- Do you frequently fat-finger regexes like the above? You can store custom commands in lesskey files: add the command toc filter \^[\^ ]\r and now you just have to remember toc. (The standard location for lesskey files is ~/.config/lesskey, and you can learn how to write one with man lesskey.)
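To make the last point concrete, here is a sketch of what such a lesskey file could contain. It is written by a small, made-up shell function to a path of your choosing rather than straight to ~/.config/lesskey, so that an existing file is not overwritten by accident. The #command line is the optional header of the command section.

```shell
# write_lesskey_toc: write a minimal lesskey source file to the path in $1.
# It binds the key sequence "toc" to the filter command (&) with the
# section-header regex already typed in, followed by Enter (\r).
write_lesskey_toc() {
    cat > "$1" <<'EOF'
#command
toc filter \^[\^ ]\r
EOF
}
```

After checking the result, for example with write_lesskey_toc /tmp/lesskey.toc, you can merge it into ~/.config/lesskey. Older versions of less may need the file compiled with the lesskey program first; man lesskey describes what your version expects.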
All of this might seem a little arcane as ways to accomplish things like ‘view a table of contents’, but once you start to mentally decompose those problems into smaller, more general problems (like ‘display lines matching a filter’ and ‘test if a line is a section header’), a solid familiarity with the capabilities of tried-and-true Unix tools will serve you well.
0 comment threads
Not a real answer, but these days there are some nice LLMs, and they're good at summarizing text. If you have CLI scripts to interact with them, you can submit the man page as the "system prompt" (in ChatGPT parlance) and ask the question in the "user prompt".
This will be somewhat slow and will cost you money (API credits), but it is technically quite versatile. It is also possible for the LLM to hallucinate incorrect stuff.