Miscellaneous sort Hints (Unix Power Tools, 3rd Edition)
22.6. Miscellaneous sort Hints
Here is a grab bag of useful, if not exactly interesting,
sort features. The utility will actually do quite
a bit, if you let it.
22.6.1. Dealing with Repeated Lines
sort -u sorts the file and eliminates duplicate lines. It's more
powerful than uniq (Section 21.20) because:
It sorts the file for you; uniq assumes that the
file is already sorted and won't do you any good if
it isn't.
It is much more flexible. sort -u treats lines as
duplicates if the sort fields (Section 22.2)
you've selected match. So the lines
don't even have to be (strictly speaking) identical;
differences outside of the sort fields are ignored.
In return, there are a few things that uniq does
that sort won't do -- such as
print only those lines that aren't repeated, or
count the number of times each line is repeated. But on the whole, I
find sort -u more useful.
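As a rough illustration of the second point, the sketch below compares whole-line uniqueness with uniqueness on the first field only. The sample data is made up for the example, and the -k 1,1 key syntax is an assumption about your version of sort (it is the POSIX way to say "first field only").

```shell
# Hypothetical sample data: two lines share the same first field.
data='apple red
apple green
banana yellow'

# Whole-line uniqueness: all three lines differ, so all three survive.
whole=$(printf '%s\n' "$data" | sort | uniq | wc -l)

# Uniqueness on field 1 only: the two "apple" lines count as duplicates.
# Which of the two is kept can vary between sort implementations.
by_key=$(printf '%s\n' "$data" | sort -u -k 1,1 | wc -l)

echo "whole lines: $whole, first field only: $by_key"
```

Counting lines rather than printing them keeps the example independent of which duplicate a particular sort decides to keep.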
Here's one idea for using sort -u. When I was writing a manual,
I often needed to make
tables of error messages. The easiest way to do this was to
grep the source code for
printf statements, write some Emacs (Section 19.1) macros to
eliminate junk that I didn't care about, use
sort -u to put the messages in order and get rid
of duplicates, and write some more Emacs macros to format the error
messages into a table. All I had to do then was write the
descriptions.
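A minimal sketch of the middle of that workflow, with sed standing in for the Emacs macros. The sample source lines, the temporary file, and the sed pattern are all assumptions made for the example; real source code would need a more careful pattern.

```shell
# Hypothetical sample source; in practice you would grep your real *.c files.
src=$(mktemp)
cat > "$src" <<'EOF'
if (fd < 0) printf("cannot open file");
if (n == 0) printf("bad argument");
if (fd2 < 0) printf("cannot open file");
EOF

# Keep the printf lines, pull out the text between the quotes,
# then sort the messages and drop the duplicates.
messages=$(grep 'printf' "$src" | sed 's/.*printf("\([^"]*\)".*/\1/' | sort -u)
rm -f "$src"

printf '%s\n' "$messages"
```

The duplicated "cannot open file" message appears only once in the result, which is the part sort -u contributes.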
22.6.2. Ignoring Blanks
One important option (that
I've mentioned a number of times) is
-b; this tells sort to ignore
extra whitespace at the beginning of each field. This is absolutely
essential; otherwise, your sorts will have rather strange results. In
my opinion, -b should be the default. But they
didn't ask me.
Another thing to remember about -b: you can count on it only if
you explicitly specify which fields you want to sort. On many
traditional versions, sort -b by itself is the same as sort:
whitespace characters are counted. I call this a bug,
don't you?
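To see what -b changes, here is a small sketch that sorts on the second field with and without it. The two-line data set is invented for the example, the -k 2 key syntax is assumed to be available, and LC_ALL=C is set so that blanks are compared as ordinary characters.

```shell
# Hypothetical two-column data; the first line has extra blanks before column 2.
data='x    b
y a'

# Without -b the leading blanks are part of field 2, so "    b" sorts before " a".
plain=$(printf '%s\n' "$data" | LC_ALL=C sort -k 2)

# With -b the blanks are skipped, so the comparison is "a" against "b".
blanks=$(printf '%s\n' "$data" | LC_ALL=C sort -b -k 2)

printf 'without -b:\n%s\nwith -b:\n%s\n' "$plain" "$blanks"
```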
22.6.3. Case-Insensitive Sorts
If you don't care about
the difference between uppercase and lowercase letters, invoke
sort with the -f (case-fold)
option. This folds lowercase letters into uppercase. In other words,
it treats all letters as uppercase.
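A quick sketch of the difference, using made-up data. LC_ALL=C is an assumption added here so that the plain sort uses byte order, in which every uppercase letter comes before every lowercase one.

```shell
# Hypothetical word list with mixed case.
data='cherry
Banana
apple'

# Plain byte-order sort: the uppercase "B" sorts ahead of the lowercase letters.
plain=$(printf '%s\n' "$data" | LC_ALL=C sort)

# Case-folded sort: letters compare as if they were all uppercase.
folded=$(printf '%s\n' "$data" | LC_ALL=C sort -f)

printf 'sort:\n%s\nsort -f:\n%s\n' "$plain" "$folded"
```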
22.6.4. Dictionary Order
The -d option tells
sort to ignore all characters except for letters,
digits, and whitespace. In particular, sort -d
ignores punctuation.
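For instance, the sketch below sorts three invented words, one of them wrapped in parentheses. LC_ALL=C is an assumption added so that the plain sort compares raw bytes, where "(" comes before any letter.

```shell
# Hypothetical word list; one entry is wrapped in punctuation.
data='gamma
(beta)
alpha'

# Plain byte-order sort: the "(" pushes "(beta)" to the top.
plain=$(printf '%s\n' "$data" | LC_ALL=C sort)

# Dictionary order: the parentheses are ignored, so "(beta)" sorts as "beta".
dict=$(printf '%s\n' "$data" | LC_ALL=C sort -d)

printf 'sort:\n%s\nsort -d:\n%s\n' "$plain" "$dict"
```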
22.6.5. Month Order
The -M option tells
sort to treat the first three nonblank characters
of a field as a three-letter month abbreviation and to sort
accordingly. That is, JAN comes before FEB, which comes before MAR.
This option isn't available on all versions of Unix.
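A small sketch, assuming your sort supports -M at all. LC_ALL=C is also an assumption, added so that the English month abbreviations are the ones recognized.

```shell
# Hypothetical list of month abbreviations, out of calendar order.
data='MAR
JAN
FEB'

# Plain sort is alphabetical, which puts FEB first.
alpha=$(printf '%s\n' "$data" | LC_ALL=C sort)

# Month sort puts the names in calendar order.
months=$(printf '%s\n' "$data" | LC_ALL=C sort -M)

printf 'sort:\n%s\nsort -M:\n%s\n' "$alpha" "$months"
```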
22.6.6. Reverse Sort
The -r option tells sort to "reverse" the order of the sort;
i.e., Z comes before A, 9 comes before 1, and so on.
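A minimal sketch with made-up input. It combines -r with -n, since reversing is most often wanted for numbers.

```shell
# Numeric sort, smallest first.
ascending=$(printf '%s\n' 1 9 5 | sort -n)

# The same sort reversed, largest first.
descending=$(printf '%s\n' 1 9 5 | sort -nr)

# Letters reverse the same way: Z before M before A.
letters=$(printf '%s\n' A Z M | sort -r)

printf 'sort -n:\n%s\nsort -nr:\n%s\nsort -r:\n%s\n' "$ascending" "$descending" "$letters"
```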
You'll find that this option is really useful. For
example, imagine you have a program running in the background that
records the number of free blocks in the filesystem at midnight each
night. Your log file might look like this:
Jan 1 2001: 108 free blocks
Jan 2 2001: 308 free blocks
Jan 3 2001: 1232 free blocks
Jan 4 2001: 76 free blocks
...
The script below finds the smallest and largest number of free blocks
in your log file (head is described in Section 12.12):
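#!/bin/sh
# Sort numerically on the text after the colon, skipping leading blanks.
echo "Minimum free blocks"
sort -t: -k 2nb logfile | head -n 1
echo "Maximum free blocks"
# Same key, reversed, so the largest count comes first.
sort -t: -k 2nbr logfile | head -n 1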
It's not profound, but it's an
example of what you can do.
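To try the idea without a real log, the sketch below writes the four sample lines to a temporary file and runs the two sorts on it. The temporary file and the variable names are assumptions for the example, and the sort key is written in the -k form, which current versions of sort expect.

```shell
# Recreate the sample log in a temporary file (hypothetical stand-in for logfile).
log=$(mktemp)
cat > "$log" <<'EOF'
Jan 1 2001: 108 free blocks
Jan 2 2001: 308 free blocks
Jan 3 2001: 1232 free blocks
Jan 4 2001: 76 free blocks
EOF

# Field 2 (after the colon) holds the count; n sorts it numerically,
# b skips the blank after the colon, r reverses the order.
min=$(sort -t: -k 2nb "$log" | head -n 1)
max=$(sort -t: -k 2nbr "$log" | head -n 1)
rm -f "$log"

echo "Minimum: $min"
echo "Maximum: $max"
```

With this data the minimum is the Jan 4 line (76 blocks) and the maximum is the Jan 3 line (1232 blocks).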
-- ML
Copyright © 2003 O'Reilly & Associates. All rights reserved.