Ergonomic way to search man pages

+1
−0

You often have to read man pages to use Linux/Unix software. However, many man pages are not easy to read. They are very long, not always conveniently arranged, and man does not appear to have any way to handle indices or sections. A great example is man rsync.

I am often in a situation where some popular tool has a LOOONG man page, and I just need to look up the one argument about a specific thing. I don't like having to read the whole thing just for one tiny part of it. It takes a long time, breaks my flow, and if I try to read it start to finish, I am usually very confused and overwhelmed by the time I get to the end. The default pager (I think less) lets me search with /, but it's not very user friendly and it's easy to end up with a search string that is too general or too specific.

I've been resorting to stuff like man rsync | rg -C 3 symlink, which is much nicer. However, this is still a bit too dumb sometimes. For example, man pacman | rg -C 3 owns does find the switch that tells you what package owns a file, but then you still have to dig through the man page to figure out that it's in the section of switches applicable to -Q (the section header is much higher in the page, so ripgrep's context switch doesn't help).

Man pages have been around for half a century. Surely in this time, someone has come out with a way to peruse them that's better than this?

PS: For other people, there are tools like https://tldr.sh/ and https://github.com/cheat/cheat that provide shorter, alternate manual pages. Unfortunately these don't work for my question because they are not comprehensive - they obviously omit many details of the man page, and also there is not always a tldr page for every man page.


1 comment thread

What about the experience of searching within less do you find user-unfriendly? Less has a lot of pow... (4 comments)

3 answers

+8
−0

Unix systems are made out of many small tools that focus on specific tasks but are general enough that the investment made in learning their specific switches and hotkeys pays off over many applications.

In that spirit, I would encourage you to get to know less better. It's not a tool specifically designed for the one use case of reading man pages; it's a general-purpose tool that will come in handy in many situations. That is why learning to use it well is worth the small ergonomic cost of not using (or writing, as I don't know of such software) a program specifically designed to make reading man pages newbie-friendly.

Did you know:

  • Anything that you can tell less on the command line, you can toggle while you are viewing what you're viewing—to turn strict case sensitivity on or off, just type -i (the exact same characters you'd use on the command line to turn this mode on).
  • less's filters give a great way to find a specific occurrence of a string when you don't know exactly what you're looking for. As you brought it up as an example, here's how I look for something to do with symlinks in man rsync. I'd type &symlink⏎⃣ to view a list of all the lines with the word symlink in them. That's a lot of results; I'd type /↑⃣ to highlight the word I just used to filter. Let's say that I'm interested in the line near the bottom of the results that says, ‘Symbolic links are considered unsafe if they are absolute symlinks’. Sometimes I'd type J to move down the list until that line is at the top of the screen (J, as opposed to j or down arrow, will keep moving the screen even if less is at the end of its content—very useful in this context), but in this case, I'm more likely to type /unsafe⏎⃣ (which searches within the already filtered results) and type n a much smaller number of times to get to the interesting line. Then I'll type &⏎⃣ to turn off the filter, and now I'm viewing that line in context in the full man page.
  • Want a table of contents for the man page? Use filters again: &^[^ ]⏎⃣ shows only the section headers. J will move down them until I have arrived at the section I want, and then &⏎⃣ ‘expands’ it.
  • Do you frequently fat-finger regexes like the above? You can store custom commands in lesskey files: add the command toc filter \^[\^ ]\r and now you just have to remember toc. (The standard location for lesskey files is ~/.config/lesskey, and you can learn how to write one with man lesskey.)

All of this might seem a little arcane as ways to accomplish things like ‘view a table of contents’, but once you start to mentally decompose those problems into smaller, more general problems (like ‘display lines matching a filter’ and ‘test if a line is a section header’), a solid familiarity with the capabilities of tried-and-true Unix tools will serve you well.
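
Some of this can also be set up before less even starts. The following is a minimal sketch, assuming that man honors the MANPAGER environment variable and that less is installed; manat is a made-up name for illustration, not an existing tool:

```shell
# Hypothetical helper: open a man page in less with case-insensitive
# searching enabled and the view positioned at the first match.
# Usage: manat rsync symlink
manat() {
    # -i          ignore case in searches unless the pattern has uppercase
    # +/pattern   run the search command /pattern as soon as less starts
    # Assumption: the pattern contains no whitespace or quotes, since man
    # splits the MANPAGER value into words.
    MANPAGER="less -i +/$2" man "$1"
}
```

Once the page is open, n, & and the other keys described above behave as usual, so this only saves the first couple of keystrokes.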


+1
−0

Not a real answer, but these days there are some nice LLMs, and they're good at summarizing text. If you have the CLI scripts to interact with them, you can submit the man page as the "system prompt" (in ChatGPT parlance) and ask the question in the "user prompt".

This will be somewhat slow and will cost you money (API credits), but it is technically quite versatile. It is also possible for the LLM to hallucinate incorrect stuff.


+0
−0

Unix filters are quite handy.

First of all, you can get an index of any manual page with a simple grep(1):

$ man pacman | grep '^[^ ]'
PACMAN(8)                            Pacman Manual                           PACMAN(8)
NAME
SYNOPSIS
DESCRIPTION
OPERATIONS
OPTIONS
TRANSACTION OPTIONS (APPLY TO -S, -R AND -U)
UPGRADE OPTIONS (APPLY TO -S AND -U)
QUERY OPTIONS (APPLY TO -Q)
REMOVE OPTIONS (APPLY TO -R)
SYNC OPTIONS (APPLY TO -S)
DATABASE OPTIONS (APPLY TO -D)
FILE OPTIONS (APPLY TO -F)
HANDLING CONFIG FILES
EXAMPLES
CONFIGURATION
SEE ALSO
BUGS
AUTHORS
Pacman 6.0.2                          2023-08-16                             PACMAN(8)

If you also want to see subsections, you can expand that grep(1) a little bit:

$ man rsync | grep -e '^[^ ]' -e '^   [^ ]'
rsync(1)                             User Commands                            rsync(1)
NAME
SYNOPSIS
DESCRIPTION
GENERAL
SETUP
USAGE
COPYING TO A DIFFERENT NAME
SORTED TRANSFER ORDER
MULTI‐HOST SECURITY
ADVANCED USAGE
CONNECTING TO AN RSYNC DAEMON
USING RSYNC‐DAEMON FEATURES VIA A REMOTE‐SHELL CONNECTION
STARTING AN RSYNC DAEMON TO ACCEPT CONNECTIONS
EXAMPLES
OPTION SUMMARY
OPTIONS
DAEMON OPTIONS
FILTER RULES
   SIMPLE INCLUDE/EXCLUDE RULES
   SIMPLE INCLUDE/EXCLUDE EXAMPLE
   FILTER RULES WHEN DELETING
   FILTER RULES IN DEPTH
   PATTERN MATCHING RULES
   FILTER RULE MODIFIERS
   MERGE‐FILE FILTER RULES
   LIST‐CLEARING FILTER RULE
   ANCHORING INCLUDE/EXCLUDE PATTERNS
   PER‐DIRECTORY RULES AND DELETE
TRANSFER RULES
BATCH MODE
SYMBOLIC LINKS
DIAGNOSTICS
EXIT VALUES
ENVIRONMENT VARIABLES
FILES
SEE ALSO
BUGS
VERSION
INTERNAL OPTIONS
CREDITS
THANKS
AUTHOR
rsync 3.2.7                           20 Oct 2022                             rsync(1)
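
The two grep(1) calls above can be wrapped in a small function so they don't have to be retyped. This is only a sketch: man_toc and its -s flag are names invented here, and it assumes the first and last matching lines are the page header and footer, as in the output above.

```shell
# Hypothetical helper: print the section titles of a formatted man page
# read from standard input. With -s, subsection titles (indented by
# exactly three spaces) are listed as well.
man_toc() {
    if [ "$1" = "-s" ]; then
        grep -e '^[^ ]' -e '^   [^ ]'
    else
        grep '^[^ ]'
    fi | sed '1d;$d'   # drop the page header and footer lines
}
```

It would be used as man rsync | man_toc -s. If your man emits overstrike formatting into a pipe, putting col -bx between man and man_toc should strip it.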

If you want to read only one of those sections, you can script a little bit more. Let's say you're only interested in the SYNC OPTIONS from pacman(8). (The pipeline below uses pee(1), which comes from the moreutils package.)

$ man -w pacman \
  | xargs zcat \
  | pee 'sed -n "/^\.TH/,/^\.SH/{/^\.SH/!p}"' \
        'sed -n -e "/^\.SH \"\?SYNC OPTIONS/p" \
                -e "/^\.SH \"\?SYNC OPTIONS/,/^\.SH/{/^\.SH/!p}"' \
  |  groff -man -Tutf8;
PACMAN(8)                        Pacman Manual                       PACMAN(8)

SYNC OPTIONS (APPLY TO -S)
       -c, --clean
           Remove packages that are no longer installed from the cache as well
           as currently unused sync databases to free up disk space. When
           pacman downloads packages, it saves them in a cache directory. In
           addition, databases are saved for every sync DB you download from
           and are not deleted even if they are removed from the configuration
           file pacman.conf(5). Use one --clean switch to only remove packages
           that are no longer installed; use two to remove all files from the
           cache. In both cases, you will have a yes or no option to remove
           packages and/or unused downloaded databases.

           If you use a network shared cache, see the CleanMethod option in
           pacman.conf(5).

       -g, --groups
           Display all the members for each package group specified. If no
           group names are provided, all groups will be listed; pass the flag
           twice to view all groups and their members.

       -i, --info
           Display information on a given sync database package. Passing two
           --info or -i flags will also display those packages in all
           repositories that depend on this package.

       -l, --list
           List all packages in the specified repositories. Multiple
           repositories can be specified on the command line.

       -q, --quiet
           Show less information for certain sync operations. This is useful
           when pacman’s output is processed in a script. Search will only
           show package names and not repository, version, group, and
           description information; list will only show package names and omit
           databases and versions; group will only show package names and omit
           group names.

       -s, --search <regexp>
           This will search each package in the sync databases for names or
           descriptions that match regexp. When you include multiple search
           terms, only packages with descriptions matching ALL of those terms
           will be returned.

       -u, --sysupgrade
           Upgrades all packages that are out-of-date. Each
           currently-installed package will be examined and upgraded if a
           newer package exists. A report of all packages to upgrade will be
           presented, and the operation will not proceed without user
           confirmation. Dependencies are automatically resolved at this level
           and will be installed/upgraded if necessary.

           Pass this option twice to enable package downgrades; in this case,
           pacman will select sync packages whose versions do not match with
           the local versions. This can be useful when the user switches from
           a testing repository to a stable one.

           Additional targets can also be specified manually, so that -Su foo
           will do a system upgrade and install/upgrade the "foo" package in
           the same operation.

       -y, --refresh
           Download a fresh copy of the master package database from the
           server(s) defined in pacman.conf(5). This should typically be used
           each time you use --sysupgrade or -u. Passing two --refresh or -y
           flags will force a refresh of all package databases, even if they
           appear to be up-to-date.

Pacman 6.0.2                      2023-08-16                         PACMAN(8)

(If you want to get rid of the formatting, you can replace groff with mandoc in the pipeline and filter the result through col.)

$ man -w pacman \
  | xargs zcat \
  | pee 'sed -n "/^\.TH/,/^\.SH/{/^\.SH/!p}"' \
        'sed -n -e "/^\.SH \"\?SYNC OPTIONS/p" \
                -e "/^\.SH \"\?SYNC OPTIONS/,/^\.SH/{/^\.SH/!p}"' \
  | mandoc -Tutf8 2>/dev/null \
  | col -pbx;

The origin of this script is in fact in the Linux man-pages repository (it's a bit different, as I deal with uncompressed pages, and often want to read the same section in several pages in the same directory). I've used it for reviewing changes to the pages. Here's the original code (see the man_section() function): https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/tree/scripts/bash_aliases

The above script only works with man(7) pages. mdoc(7) pages will need a slightly different (but very similar) script.
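
One way to sidestep the man(7)/mdoc(7) difference is to cut the section out of the already formatted text instead of the roff source. This is a different technique from the sed script above, and it loses the page header and any bold or underline formatting. A rough sketch, where man_section is a hypothetical name:

```shell
# Hypothetical helper: print one section of a formatted man page read from
# standard input. $1 is the beginning of the section title, matched
# literally at the start of an unindented line.
man_section() {
    awk -v title="$1" '
        # Every unindented, non-empty line starts a new section (the page
        # header and footer count too); remember whether it is the one
        # we are looking for.
        /^[^ ]/ { in_section = (index($0, title) == 1) }
        # Print lines only while inside the requested section.
        in_section
    '
}
```

For example: man pacman | col -bx | man_section 'SYNC OPTIONS'. The col -bx step strips overstrike formatting and may be unnecessary if your man already writes plain text to a pipe.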

With more powerful scripts, you'll be able to do more powerful selections.
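
As one illustration, here is a sketch of a search that reports which section each match belongs to, which is the piece of context the question says plain grep output lacks. The name man_grep is made up for this example, and the function works on formatted text from standard input:

```shell
# Hypothetical helper: search a formatted man page (standard input) for an
# extended regular expression and print each match under the title of the
# section it appears in.
man_grep() {
    awk -v pat="$1" '
        # Unindented, non-empty lines are section titles; remember the
        # latest one. Titles themselves are never reported as matches.
        /^[^ ]/ { section = $0; next }
        $0 ~ pat {
            # Print the section title once, before its first match.
            if (section != last) { print section; last = section }
            print
        }
    '
}
```

For example, man pacman | col -bx | man_grep owns would list the matching lines grouped under their section titles, so the operation a switch applies to is visible right away.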
