The Rookery: "Dogs" of the Linux Shell
Posted on Saturday, October 19, 2002 by Louis J. Iacona
Could the command-line tools you've forgotten or never knew save time and some frustration?
One incarnation of the so-called 80/20 rule has been associated with software systems. It has been observed that 80% of a user population regularly uses only 20% of a system's features. Without backing this up with hard statistics, my 20+ years of building and using software systems tell me that this hypothesis is probably true. The collection of Linux command-line programs is no exception to this generalization. Of the dozens of shell-level commands offered by Linux, perhaps only ten commands are commonly understood and utilized, and the remaining majority are virtually ignored.
Which of these dogs of the Linux shell have the most value to offer? I'll briefly describe ten of the less popular but useful Linux shell commands, those which I have gotten some mileage from over the years. Specifically, I've chosen to focus on commands that parse and format textual content.
The working examples presented here assume a basic familiarity with command-line syntax, simple shell constructs and some of the not-so-uncommon Linux commands. Even so, the command-line examples are fairly well commented and straightforward. Whenever practical, the output of usage examples is presented under each command-line execution.
The following eight commands parse, format and display textual content. Although not all provided examples demonstrate this, be aware that the following commands will read from standard input if file arguments are not presented.
Table 1. Summary of Commands
Head/Tail
As their names imply, head and tail are used to display some amount of the top or bottom of a text block. head presents the beginning of a file to standard output while tail does the same with the end of a file. Review the following commented examples:
## (1) displays the first 6 lines of a file
head -6 readme.txt
## (2) displays the last 25 lines of a file
tail -25 mail.txt
Here's an example of using head and tail in concert to display the 11th through 20th lines of a file.
# (3)
head -20 file | tail -10
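One caveat with the pipeline above: if the file has fewer than 20 lines, tail -10 still prints the last ten lines of whatever head passed along, so the range shifts. A minimal sketch of a variant that avoids this, assuming a POSIX shell, follows. The function name lines_between is made up for illustration, and the sketch uses the -n NUM spelling that POSIX specifies rather than the older -NUM form.

```shell
# Print lines $1 through $2 (inclusive) of standard input.
# lines_between is a hypothetical helper name, not a standard utility.
lines_between() {
    # tail -n +N starts output at line N; head then keeps the count we want.
    tail -n "+$1" | head -n "$(( $2 - $1 + 1 ))"
}

# Lines 11 through 20 of a 30-line input
seq 1 30 | lines_between 11 20

# A 15-line input simply yields lines 11 through 15
seq 1 15 | lines_between 11 20
```

Running tail first means a short input yields a short result instead of the wrong lines.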
Manual pages show that the tail command has more command-line options than head. One of the more useful tail options is -f. When it is used, tail does not return when end-of-file is detected, unless it is explicitly interrupted. Instead, tail sleeps for a period and checks for new lines of data that may have been appended since the last read.
## (4) display ongoing updates to the given
## log file
tail -f /usr/tmp/logs/daemon_log.txt
Imagine that a dæmon process was continually appending activity logs to the /usr/tmp/logs/daemon_log.txt file. Using tail -f at a console window, for example, will more or less track all updates to the file in real time. (The -f option is applicable only when tail's input is a file.)
If you give multiple arguments to tail, you can track several log files in the same window.
## track the mail log and the server error log
## at the same time.
tail -f /var/log/mail.log /var/log/apache/error_log
tac--Concatenate in Reverse
What is cat spelled backwards? Well, that's what tac's functionality is all about. Like cat, it concatenates the files it is given, but it prints the lines of each file in reverse order, last line first. So what's its usefulness? It can be used on any task that requires ordering elements in a last-in, first-out (LIFO) manner. Consider the following command line, which lists the three most recently established user accounts, from the most recent through the least recent.
# (5) last 3 /etc/passwd records - in reverse
$ tail -3 /etc/passwd | tac
curly:x:1003:100:3rd Stooge:/homes/curly:/bin/ksh
larry:x:1002:100:2nd Stooge:/homes/larry:/bin/ksh
moe:x:1001:100:1st Stooge:/homes/moe:/bin/ksh
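The same newest-first idea works on any append-only file. Here is a small self-contained sketch, assuming GNU tac is installed; the file contents are made-up sample data.

```shell
# Build a small throwaway file whose lines were "appended" in order.
sample=$(mktemp)
printf '%s\n' 'first entry' 'second entry' 'third entry' 'fourth entry' > "$sample"

# Take the last three lines, then reverse them so the newest comes first.
newest_first=$(tail -n 3 "$sample" | tac)
printf '%s\n' "$newest_first"
# prints:
#   fourth entry
#   third entry
#   second entry

rm -f "$sample"
```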
nl--Numbered Line Output
nl is a simple but useful numbering filter. It displays input with each line numbered in the left margin, in a format dictated by command-line options. nl provides a plethora of options that specify every detail of its numbered output. The following commented examples demonstrate some of those options:
# (6) Display the first 4 entries of the password
# file - numbers to be three columns wide and
# padded by zeros.
$ head -4 /etc/passwd | nl -nrz -w3
001root:x:0:1:Super-User:/:/bin/ksh
002daemon:x:1:1::/:
003bin:x:2:2::/usr/bin:
004sys:x:3:3::/:
#
# (7) Prepend ordered line numbers followed by an
# '=' sign to each line -- start at 101.
$ nl -s= -v101 Data.txt
101=1st Line ...
102=2nd Line ...
103=3rd Line ...
104=4th Line ...
105=5th Line ...
.......
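One detail worth knowing is that nl, by default, numbers only non-empty lines. The sketch below, which assumes the nl from GNU coreutils and uses made-up sample input, contrasts the default with -b a, which numbers every line.

```shell
# Sample input: three lines, the middle one empty.
sample=$(printf 'alpha\n\nbeta\n')

# Default body style (-b t): the empty line gets no number.
default_numbered=$(printf '%s\n' "$sample" | nl -w2 -s': ')

# -b a: every line is numbered, including the empty one.
all_numbered=$(printf '%s\n' "$sample" | nl -b a -w2 -s': ')

printf '%s\n---\n%s\n' "$default_numbered" "$all_numbered"
```

With the default, "beta" is line 2; with -b a it becomes line 3.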
fmt--format
The fmt command is a simple text formatter that focuses on making textual data conform to a maximum line width. It accomplishes this by joining and breaking lines around white space. Imagine that you need to maintain textual content that was generated with a word processor. The exported text may contain lines whose lengths vary from very short to much longer than a standard screen length. If such text is to be maintained in a text editor (like vi), fmt is the command of choice to transform the original text into a more maintainable format. The first example below shows fmt being asked to reformat file contents as text lines no greater than 60 characters long.
# (8) No more than 60 char lines
$ fmt -w 60 README.txt > NEW_README.txt
#
# (9) Force uniform spacing:
# 1 space between words, 2 between sentences
$ echo "Hello    World.   Hello     Universe." |
> fmt -u -w80
Hello World.  Hello Universe.
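To see the width limit at work without a file on disk, here is a small sketch that wraps one long line and then measures the result. The sample sentence and the variable names are invented for illustration.

```shell
# One long line of sample text.
text='The quick brown fox jumps over the lazy dog and keeps running through the field.'

# Re-wrap it so that no line is wider than 30 characters.
wrapped=$(printf '%s\n' "$text" | fmt -w 30)
printf '%s\n' "$wrapped"

# Measure the longest line that fmt produced.
longest=$(printf '%s\n' "$wrapped" |
    awk '{ if (length($0) > max) max = length($0) } END { print max }')
printf 'longest line: %s characters\n' "$longest"
```

fmt chooses the break points itself, so the exact line breaks may differ from a simple greedy wrap; only the maximum width is guaranteed.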
fold--Break Up Input
fold is similar to fmt but is used typically to format data that will be used by other programs, rather than to make the text more readable to the human eye. The commented examples below are fairly easy to follow:
# (10) format text in 3 column width lines
$ echo oxoxoxoxo | fold -w3
oxo
xox
oxo
# (11) Parse by triplet-char strings -
# search for 'xox'
$ echo oxoxoxoxo | fold -w3 | grep "xox"
xox
# (12) One way to iterate through a string of chars
$ for i in $(echo 12345 | fold -w1)
> do
> ### perform some task ...
> print $i
> done
1
2
3
4
5
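As another illustration of fold feeding other programs, the sketch below counts how often each character occurs in a word by splitting it into one character per line and handing the result to sort and uniq. The word is arbitrary sample data.

```shell
# Split the word into single characters, then count the duplicates.
word=mississippi
counts=$(printf '%s\n' "$word" | fold -w 1 | sort | uniq -c)
printf '%s\n' "$counts"

# Pull out the count for one particular character.
s_count=$(printf '%s\n' "$counts" | awk '$2 == "s" { print $1 }')
printf 's occurs %s times\n' "$s_count"
# prints: s occurs 4 times
```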
tr
tr is a simple pattern translator. Its practical application overlaps a bit with other, more complex tools, such as sed and awk (with larger binary footprints). tr is quite useful for simple textual replacements, deletions and additions. Its behavior is dictated by "from" and "to" character sets provided as the first and second arguments. The general usage syntax of tr is as follows:
# (13) tr usage
tr [options] "set1" ["set2"] < input > output
Note that tr does not accept file arguments; it reads from standard input and writes to standard output. When two character sets are provided, tr operates on the characters contained in "set1" and performs some amount of substitution based on "set2". Listing 1 demonstrates some of the more common tasks performed with tr.
Listing 1. Common Tasks with tr
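A few of the common tasks tr handles are sketched here as a stand-in rather than as the author's original listing. The sample strings and variable names are invented for illustration.

```shell
# Translate: lowercase to uppercase, using portable character classes.
upper=$(printf 'hello world\n' | tr '[:lower:]' '[:upper:]')   # HELLO WORLD

# Delete (-d): strip carriage returns from DOS-style text.
unix_text=$(printf 'line one\r\n' | tr -d '\r')                # line one

# Squeeze (-s): collapse runs of spaces into a single space.
squeezed=$(printf 'too    many   spaces\n' | tr -s ' ')        # too many spaces

# Translate and squeeze together: put each word on its own line.
words=$(printf 'one two  three\n' | tr -s ' ' '\n')

printf '%s\n' "$upper" "$unix_text" "$squeezed" "$words"
```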
pr
pr shares features with simpler commands like nl and fmt, but its command-line options make it ideal for converting text files into a format that's suitable for printing. pr offers options that allow you to specify page length, column width, margins, headers/footers, double line spacing and more.
Aside from being the best suited formatter for printing tasks, pr also offers other useful features. These features include allowing you to view multiple files vertically in adjacent columns or columnizing a list in a fixed number of columns (see Listing 2).
Listing 2. Using pr
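The two non-printing uses mentioned above might look like the following. This is a sketch assuming the pr from GNU coreutils, not the author's original listing, and the file contents are made-up sample data.

```shell
# Columnize a list: six items in three columns, filled down each column.
# -t suppresses the page header and trailer.
down=$(seq 1 6 | pr -3 -t)

# Same list, filled across each row instead (-a).
across=$(seq 1 6 | pr -3 -a -t)

# View two files side by side: -m merges the files, one per column.
left=$(mktemp)
right=$(mktemp)
printf '%s\n' apple banana > "$left"
printf '%s\n' red yellow > "$right"
side_by_side=$(pr -m -t "$left" "$right")
rm -f "$left" "$right"

printf '%s\n\n%s\n\n%s\n' "$down" "$across" "$side_by_side"
```

The exact spacing between columns depends on the page width, which defaults to 72 characters and can be changed with -w.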
Miscellaneous
The following two commands are specialized parsers used to pick apart file path pieces.
Basename/Dirname
The basename and dirname commands are useful for presenting portions of a given file path. Quite often in scripting situations, it's convenient to be able to parse and capture a file name or the containing-directory name portions of a file path. These commands reduce this task to a simple one-line command. (There are other ways to approach this using the Korn shell or sed "magic", but basename and dirname are more portable and straightforward.)
basename is used to strip off the directory, and optionally, the file suffix parts of a file path. Consider the following trivial examples:
# (21) Parse out the Java Class name
$ basename /usr/local/src/java/TheClass.java .java
TheClass
# (22) Parse out the file name.
$ basename srcs/C/main.c
main.c
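The suffix-stripping form is handy in loops that derive one file name from another. Here is a minimal sketch; the source paths are hypothetical and need not exist, because basename works on the path string alone.

```shell
# Derive an object-file name from each C source path.
for src in srcs/C/main.c srcs/C/util.c
do
    obj="$(basename "$src" .c).o"
    printf '%s -> %s\n' "$src" "$obj"
done
# prints:
#   srcs/C/main.c -> main.o
#   srcs/C/util.c -> util.o
```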
dirname is used to display the containing directory path, as much of the path as is provided. Consider the following examples:
# (23) absolute and relative directory examples
$ dirname /homes/curly/.profile
/homes/curly
$ dirname curly/.profile
curly
#
# (24) From any Korn-shell script, the following
# line records the directory that contains
# the script
SCRIPT_HOME="$(dirname "$(whence "$0")")"
#
# (25)
# Okay, how about a non-trivial practical example?
# List all directories (under $PWD) that contain a
# file called 'core'.
$ for i in $(find $PWD -name core)
> do
> dirname $i
> done | sort -u
bin
rje/gcc
src/C
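The for loop above splits find's output on white space, so a directory name containing a space would be broken apart. A variant that sidesteps this by letting find run dirname itself is sketched below. The directory tree is a throwaway built only for the demonstration, and its names are made up.

```shell
# Build a small throwaway tree, including a directory with a space in its name.
top=$(mktemp -d)
mkdir -p "$top/bin" "$top/src/C" "$top/my docs"
touch "$top/bin/core" "$top/src/C/core" "$top/my docs/core" "$top/src/main.c"

# find calls dirname once per match, so no word splitting is involved.
core_dirs=$(cd "$top" && find . -name core -type f -exec dirname {} \; | sort -u)
printf '%s\n' "$core_dirs"
# prints:
#   ./bin
#   ./my docs
#   ./src/C

rm -rf "$top"
```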
Conclusion
The commands in this article are presented in support of a hypothesis claiming the lion's share of a given system's feature set goes unnoticed. My goal here was to increase the awareness of several of the lesser utilized and showcased utilities that offer some value. If you ever think, "there must be an easier way to accomplish 'X'", while writing a script or while struggling with something at the command-line prompt, perhaps there is. Do some digging. One of the better sources for such digging is the O'Reilly Linux in a Nutshell book--a well-organized, quick reference. I also would encourage you to examine the installed manual and info-based pages--not all command-line options were covered here.
Louis Iacona has been designing and developing applications on UNIX/Linux since 1982. Most recently, his efforts have been directed at applying leading-edge design/development techniques to the enterprise needs of Fortune 2000 companies. He is currently a Senior Consulting Engineer at OmniE Labs, Inc. (www.omnie.com).
Re: (Score: 0)
by Anonymous on Sunday, October 20, 2002
Here's something I use now and again:
find / -type f -exec grep -icH 'regex' '{}' \; | sed -e '/:0$/d' | sed 's/\(.*:\)\([0-9]*\)/\2\1/' | sort -n > results.txt
What this does is search every regular file on your system, greps it for a regex, pipes the output of that through sed a couple of times to remove results with zero hits and to put the number of hits at the front, sorts them by number then puts them in a file.
Useful when trying to find out how a particular distribution sets stuff for programs; be warned though, it can take a while to complete but that shouldn't be a problem if you need a coffee!
Re: (Score: 0)
by Anonymous on Monday, October 21, 2002
What is the Unix equivalent of Windows' "dir /s"? "dir /s" is like 'ls' but it looks recursively in all subdirectories too. I know 'find' can do something like this, but its man page is practically unreadable.. <:-
Re: line numbering (Score: 0)
by Anonymous on Monday, October 21, 2002
If you don't need anything complicated, cat -n somefile > somefile.numbered can do the trick with numbering lines.
Re: use seq, not fold, for iteration (Score: 0)
by Anonymous on Wednesday, October 23, 2002
The iteration example is less than convincing. Try iterating over 10 elements. Oops. Try 1000. Huh? ...
for i in $(echo 12345|fold -w1); do print $i; done
should be
for i in `seq 5`; do print $i;done
seq(1) lets you define start, stop, step and more.
Re: Excellent article. (Score: 0)
by Anonymous on Wednesday, October 23, 2002
Found your site from Linux Today.
My Linux tips page:
http://wolfrdr.tripod.com/linuxtips.html
Re: "Dogs" of the Linux Shell (Score: 1)
by DrScriptt on Wednesday, October 23, 2002
http://drscriptt.riverviewtech.net
Now this is a GREAT article!!! I really would like to see more articles like this one.
I've been using Linux for 3+ years now and I LOVE it. I cut my teeth on DOS batch files using DATE, FC, and TIME to do a LOT of what was done here. It was VERY hard; I ended up creating temporary files all over the place that had to be subsequently cleaned up. Unices on the other hand make it SO easy. I really do enjoy seeing all the CLI tools that are out there and knowing that people are using them. To me, using tools like these is what makes us Unix people, no matter how experienced or inexperienced (me) we may be. Using the system to its potential is what it's there for. Try doing some of these tasks and more (combine them...) in Windows
Re: One-level Deep Directory Listing (Score: 0)
by Anonymous on Wednesday, October 23, 2002
Here's a super simple command line thingy that I use all the time to see the contents of the current directory and one level down:
daemonbox [1]: ls -AF `ls -A`
I've aliased it to "l1" for convenience.
Note: this is on NetBSD-1.6; YMMV in Linux.
Re: [...]Linux Shell, Cygwin anyone? (Score: 0)
by Anonymous on Thursday, October 24, 2002
Just a reminder to those Linux/UNIX enthusiasts who have to suffer the Microsoft command line at work... Check out Cygwin for the coolest shell (and X) stuff that runs on Windows.
Rgds, Derek
Re: recall things in depth (Score: 0)
by Anonymous on Thursday, October 24, 2002
I was just involved in a data-conversion project.
An old Windows program can print its reports/data to a file such as file.prn, but the output is messy: each record is split across multiple lines,
and records are separated by one blank line.
--------------------------
head and tail are just right for capturing it:
while [ startLine -lt totalLine ]; do
use wc -c to check whether the line is empty
use cat -A and sed to strip the trailing ^M (carriage return) character
use >> and > to join each group of 3 or 4 lines into one record
done
Then import the result into a MySQL database.
Many thanks to the article's author.