
What is cat abuse/useless use of cat?

+5
−0

Sometimes I share Unix commands online, and people chastise me for "useless use of cat" (UUOC) or "cat abuse".

My cat is quite comfy and doing very well, thank you.

What are they talking about?


3 answers

+5
−0

Overview

A "useless use" or "abuse" of cat occurs when a Unix pipeline (sequence of commands that feed into each other, using the shell | or "pipe" operator) includes a call to cat that is unnecessary for solving the problem.

Such pipelines are naturally less efficient (since the OS has to start an additional cat process) and require a bit more typing, but these things are unlikely to matter much in normal circumstances. The more interesting impact is social: some newer users may find the version of a pipeline with useless cats easier to understand, while more experienced users may find that it violates their aesthetic sense and betrays a lack of comfort with how the tools are intended to work. Learning how to remove unnecessary cat uses from a pipeline can be a helpful exercise for improving one's shell programming skills.

What is cat, and how is it used in pipelines?

The cat program is used to concatenate one or more sources of input - i.e., read the input sources and output all the content of each, one after the other in sequence. These sources can be either files or the standard input.

In particular, cat can be given a single input which it will simply output on its standard output. (This is different from echo, which will output content directly from its command line, rather than from sources named there.) Running cat by itself will treat its standard input as the single input used; interactively, this results in a prompt where everything you type is simply echoed back to you (a second time, on top of the usual terminal echo) until you use control-D to mark the end of standard input (or force the program to quit with control-C or in some other way).
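
As a rough illustration of the behaviour described above, here is a small sketch. The file names a.txt and b.txt are made up for the example, and it assumes mktemp is available (it is on typical Linux, BSD and macOS systems, but it is not required by POSIX).

```shell
# Work in a throwaway directory so nothing real is touched.
tmpdir=$(mktemp -d)
printf 'alpha\n' > "$tmpdir/a.txt"
printf 'beta\n'  > "$tmpdir/b.txt"

# Two file operands: the contents are written one after the other.
joined=$(cat "$tmpdir/a.txt" "$tmpdir/b.txt")

# No operands: cat copies its standard input to its standard output.
copied=$(printf 'gamma\n' | cat)

# A lone "-" operand stands for standard input, so it can be mixed with files.
mixed=$(printf 'gamma\n' | cat "$tmpdir/a.txt" -)

printf '%s\n' "$joined" "$copied" "$mixed"
rm -r "$tmpdir"
```

Only the first and third calls do any concatenating; the second is the "extra length of pipe" case.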

Why are the unnecessary uses unnecessary?

It's common to see pipelines which use cat to open a single file and provide it to the standard input of another program, for example wc -l. This is usually unnecessary because the other program already accepts filename arguments, and could open and read the file itself (and use that input instead of the standard input).

In more extreme cases, cat by itself does nothing useful in a pipeline. It's like adding an extra length of pipe to a physical pipeline - it does nothing to change the logic of where or when the water flows, or how it splits off or rejoins. For example, cat | wc -l is a more wasteful way to write wc -l, because the latter would already read from standard input and count the lines.
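
A minimal sketch of the wc -l case, assuming a temporary three-line file created just for the demonstration (mktemp is assumed to be available). All three forms count the same lines; they differ in which process opens the file.

```shell
tmpfile=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmpfile"

# Extra process: cat opens the file and copies it into the pipe.
with_cat=$(cat "$tmpfile" | wc -l)

# The shell opens the file and hands it to wc as standard input.
redirected=$(wc -l < "$tmpfile")

# wc opens the file itself; its output also includes the file name.
by_name=$(wc -l "$tmpfile")

printf '%s\n' "$with_cat" "$redirected" "$by_name"
rm "$tmpfile"
```

Note that the by-name form prints the file name after the count, which is one practical reason people sometimes prefer feeding wc through standard input.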


+4
−0

Especially in a pedagogical context, the issue with something like cat /dev/random | head -c 20 versus the more straightforward head -c 20 /dev/random is that it communicates that extra ceremony is necessary. It isn't. Using one program instead of two isn't about saving kilobytes of computer memory; it's about saving human thought.

Different brains are going to work differently, and if yours is ridged such that you need that extra program in there, have a great time with that. But it's simply a fact that the cat is useless in that context—even a program that only accepts input from stdin can be written fooprog </dev/random—and most people in tech culture prefer to remove useless components from their tech.
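
To make that concrete, here is a small sketch. It uses tr as a stand-in for fooprog, since tr really does read only from standard input; the temporary file is made up for the example, and head -c is assumed to be available (it is common, though not in POSIX).

```shell
# head can open the file itself, so no cat is needed.
nbytes=$(head -c 20 /dev/random | wc -c)

tmpfile=$(mktemp)
printf 'hello\n' > "$tmpfile"

# tr takes no file operands, but a redirection supplies its standard input.
upper_after=$(tr '[:lower:]' '[:upper:]' < "$tmpfile")

# The redirection may also be written before the command name.
upper_before=$(< "$tmpfile" tr '[:lower:]' '[:upper:]')

printf '%s\n' "$nbytes" "$upper_after" "$upper_before"
rm "$tmpfile"
```

Putting the redirection first keeps the left-to-right reading order that many people like about the cat version.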


1 comment thread

</dev/random fooprog (1 comment)
+1
−3

UUOC is an ancient Unix yarn. I can't find the original essay (I believe from Usenet, where else...) but if memory serves it's from the early 90s or before.

cat is actually a program for concatenating files. cat file1 file2 file3 will give you file1+file2+file3. Together with split, this is a primitive but effective system for partitioning and reassembling files. And when the files are text, it's a useful tool for transforming the text in various ways.
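
A quick sketch of the split-and-reassemble round trip, assuming a ten-byte file and the piece prefix "part." (both invented for the example; mktemp is assumed to be available):

```shell
tmpdir=$(mktemp -d)
printf 'abcdefghij' > "$tmpdir/whole"

# Cut the file into 4-byte pieces named part.aa, part.ab, part.ac.
split -b 4 "$tmpdir/whole" "$tmpdir/part."
pieces=$(ls "$tmpdir" | grep -c '^part\.')

# The glob expands in sorted order, so cat glues the pieces back in sequence.
cat "$tmpdir"/part.* > "$tmpdir/rebuilt"

# cmp -s is silent and succeeds only if the two files are byte-identical.
if cmp -s "$tmpdir/whole" "$tmpdir/rebuilt"; then
    roundtrip=same
else
    roundtrip=different
fi

printf '%s pieces, round trip: %s\n' "$pieces" "$roundtrip"
rm -r "$tmpdir"
```

This is cat doing the job it is named for: several inputs, one output.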

Of course cat file1 does nothing. It just gives the file back as is. Useful though concatenation is, many people rarely use it, and a lot of newbies acquire the misapprehension that cat file1 means "print file1", and by the time they see cat actually used with multiple arguments and discover the truth, the muscle memory has long set in.

Unix really does have a way to say "read from file1", which is <file1 - the brother of "write to file1", which is >file1. See? Perfectly logical! (In Bash and other POSIX-style shells the redirection has to be attached to a command; a bare <file1 prints nothing there, though some shells, zsh for one, will show you the file.) Of course, grep cookies <file1 | wc looks ridiculous, so even after learning this, people don't want to do it. Ditto for grep cookies file1 - yes, grep can take input not only on STDIN... And yet, why bother, when cat file1 | grep cookies | wc looks so much neater?
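
For what it's worth, the three spellings can be checked against each other. This sketch assumes a small made-up file in place of file1 and uses wc -l so the results are single numbers:

```shell
file1=$(mktemp)
printf 'cookies\ncake\nmore cookies\n' > "$file1"

# The "neat" version, with the extra cat process.
n_cat=$(cat "$file1" | grep cookies | wc -l)

# Redirection: the shell opens file1 and grep reads it as standard input.
n_redirect=$(grep cookies < "$file1" | wc -l)

# File operand: grep opens file1 itself.
n_operand=$(grep cookies "$file1" | wc -l)

printf '%s %s %s\n' "$n_cat" "$n_redirect" "$n_operand"
rm "$file1"
```

All three count the same two matching lines; only the number of processes differs.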

The problem with this so-called UUOC (or "cat abuse", someone must have felt mighty clever coming up with that one) is that you run one more program. Another thing newbies don't realize is that programs in a Unix pipeline don't run one after the other - they run all at once and process input concurrently - just like a pipeline, see? So with UUOC you add an extra program, cat, to the pipeline and waste CPU cycles. But if you did <file1, the shell opens the file and the program reads straight from it, so you don't waste the CPU cycles; instead you waste brain cycles trying to remember what < is and why it's pointing in the wrong direction.
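
If you want to see the "all at once" part for yourself, here is one rough way, assuming date +%s is available (it is nearly everywhere, though POSIX does not require it). Two 2-second sleeps in a pipeline finish in about 2 seconds, not 4, because both processes are started together.

```shell
start=$(date +%s)

# Both sleeps start at once; the pipeline ends when the slower one exits.
sleep 2 | sleep 2

end=$(date +%s)
elapsed=$((end - start))

# Whole-second timestamps are coarse, so expect 2 or 3 here, not 4.
printf 'pipeline took about %s seconds\n' "$elapsed"
```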

How many CPU cycles? Actually very few. cat is a very efficient program. If you're doing only cat file1 to see the file, your computer is a mega-turbo-overkill for doing that, cat or not. You'll be bottlenecked by the speed of printing the characters out on your screen, not cat. If you're using it as the first step of a processing pipeline, probably all the other programs are either simple, in which case it's still mega-turbo-overkill (or hey, maybe just turbo-overkill), or they're heavy, and what counts as "heavy" for a computer today is a gazillion times more than the burden of running little old cat, so it doesn't matter.

But people looove pointing out cat abuse. You gotta admit, it has a real satisfying feel to gotcha someone with that one. "What are you concatenating?" - ha ha! So often, the argument will go on to reveal more pearls of Unix wisdom:

  • "But what if you were on a tiny computer, like an SBC?" indeed, if you're running pipelines on puny little micro-computers, mayhaps the humble cat will trip you up (aaah... 🐈).
  • "But what if the file was huge, and you needed to do seeks in it?" indeed, some programs like less are zippy and can skip right to line 813,745,714 of that mammoth file, without first going through the eight hundred thirteen million seven hundred forty five thousand seven hundred thirteen preceding lines. How clever! The next time you use less to get the one line from the middle of your 5 petabyte diary, watch out for that dastardly cat 😼.
  • There are surely many others - mine own beard is but a few fingers' length, so surely there are wiser heads with greyer growths than me that see further and deeper.
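
The seeking point comes down to what the program finds on its standard input. A rough sketch, assuming /dev/stdin exists (it does on Linux, macOS and the BSDs) and using a helper function, describe_stdin, invented for this example:

```shell
tmpfile=$(mktemp)
printf 'line\n' > "$tmpfile"

# Hypothetical helper: report what kind of object standard input is.
describe_stdin() {
    if [ -f /dev/stdin ]; then
        echo "regular file"
    elif [ -p /dev/stdin ]; then
        echo "pipe"
    else
        echo "something else"
    fi
}

# Redirection hands over the file itself, which a program can seek in.
via_redirect=$(describe_stdin < "$tmpfile")

# With cat in front, the program only ever sees a pipe, which cannot seek.
via_cat=$(cat "$tmpfile" | describe_stdin)

printf 'redirect: %s\ncat: %s\n' "$via_redirect" "$via_cat"
rm "$tmpfile"
```

A program that wants to jump around in its input can do so in the first case and has to read everything in order in the second.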

But, probably you encountered this while asking for help on a forum. You posted some logs to show where the problem is. And you thought you'd be extra helpful and include your command, so they can tell exactly where you're getting the 1 MB log and what trivial filtering you're insulting your 16-core gigahertz processor's intelligence with. "They can reproduce it on their own machine that way!"

I choose to look past the semantics in such things, and examine the pragmatics. Is the helpful user really trying to micromanage your simple, lazy log read? Are they really trying to save you that 5 bytes of memory? Well, probably not. I mean, you think what, he's stupid? He knows it doesn't matter! He's trying to communicate something to you. Namely, that he is not stupid, but sharper, more keen of eye, more discerning of the smallest ripple on the boundless Ocean that is the modern Unix system. Recognize the mastery of this o-sensei and bask in his brilliance!

As for the cat, it's probably fine, don't worry about it. Just don't tell the guy you're probably gonna keep doing it anyway.


1 comment thread

`<file1 | grep cookies | wc` might work in your shell of choice but it doesn't work in Bash. I loo... (2 comments)
