Adding Words to ispell's Dictionary (Unix Power Tools, 3rd Edition)
16.5. Adding Words to ispell's Dictionary
ispell (Section 16.2)
uses two lists for spelling verification: a master word list and a
supplemental personal word list.
The master word list for ispell is normally the
file /usr/local/lib/ispell/ispell.hash, though
the location of the file can vary on your system. This is a
"hashed" dictionary file. That is,
it has been converted to a condensed, program-readable form using the
buildhash program (which comes with
ispell) to speed the spell-checking process.
The personal word list is normally a file called
.ispell_english or
.ispell_words in your home directory. (You can
override this default with either the -p
command-line option or the
WORDLIST
environment variable (Section 35.3).) This file is simply a list of words, one
per line, so you can readily edit it to add, alter, or remove
entries. The personal word list is normally used in addition to the
master word list, so ispell does not flag a word that either list
permits.
Custom personal word lists are particularly useful for checking
documents that use jargon or special technical words that are not in
the master word list, and for personal needs such as holding the
names of your correspondents. You may choose to keep more than one
custom word list to meet various special requirements.
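As a minimal sketch of keeping a separate list for one project, the shell commands below create a small word list and point ispell at it. The file name unix-terms and the document name chapter.txt are hypothetical. The path is absolute on the assumption that your ispell, as its manual page describes, prefixes $HOME to a personal-dictionary name that does not begin with a slash.

```shell
# Hypothetical project word list; any absolute path will do.
projlist="${TMPDIR:-/tmp}/unix-terms.$$"

# A personal word list is plain text: one word per line.
printf '%s\n' awk grep perl troff > "$projlist"

# Name the list for every ispell run started from this shell ...
WORDLIST="$projlist"
export WORDLIST

# ... or name it for a single run with -p:
#   ispell -p "$projlist" chapter.txt
```

Keeping one such file per project lets you switch vocabularies by changing a single variable or option.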
You can add to your personal word list any time you use
ispell: simply use the I
command to tell ispell that the word it offered as
a misspelling is actually correct, and should be added to the
dictionary. You can also add a list of words from a file by feeding
the file to ispell -a (Section 16.3).
The words must be one to a line, but need not be sorted. Each word to
be added must be preceded by an asterisk. (Why? Because
ispell -a has other functions as well.) So, for
example, we could have added a list of Unix utility names to our
personal dictionaries all at once, rather than one-by-one as they
were encountered during spell checking.
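The asterisk prefix is easy to add mechanically. The sketch below assumes a plain file of words, one per line. The helper name to_ispell_adds and the file name unix-utils.txt are hypothetical, and the trailing "#" line assumes that your version of ispell -a documents "#" as the command that saves the personal dictionary.

```shell
# to_ispell_adds (hypothetical helper): read a plain word list on
# standard input and write the commands that ispell -a expects.
# Blank lines are dropped, each word gets a leading asterisk, and a
# final "#" line asks ispell to save the personal word list.
to_ispell_adds() {
    sed -e '/^[[:space:]]*$/d' -e 's/^/*/'
    echo '#'
}

# Preview the commands before sending them anywhere:
printf '%s\n' awk grep sed | to_ispell_adds

# To add the words for real (hypothetical file name):
#   to_ispell_adds < unix-utils.txt | ispell -a > /dev/null
```

Previewing the output first is a cheap safeguard, since anything sent to ispell -a with an asterisk ends up in your personal word list.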
Obviously, though, in an environment where many people are working
with the same set of technical terms, it doesn't
make sense for each individual to add the same word list to his own
private .ispell_words file. It would make far
more sense for a group to agree on a common dictionary for
specialized terms and always to set WORDLIST to
point to that common dictionary.
If the private word list gets too long,
you can create a "munched" word
list. The munchlist
script that comes with ispell reduces the words in
a word list to a set of word roots and permitted suffixes according
to rules described in the ispell(4) reference
page that will be installed with ispell from the
CD-ROM [see http://examples.oreilly.com/upt3]. This creates a more compact but still editable word list.
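A cautious way to try this is to munch a copy and keep the original. The sketch below assumes that munchlist reads the file named on its command line and writes the reduced list to standard output, as its manual page describes, and that the default affix file suits your word list. The helper name munch_wordlist is hypothetical.

```shell
# munch_wordlist FILE  (hypothetical helper)
# Replace FILE with its munched form, keeping the original as FILE.bak.
# If munchlist is not installed, FILE is left untouched.
munch_wordlist() {
    mw_list=$1
    if ! command -v munchlist >/dev/null 2>&1; then
        echo "munchlist not found; $mw_list left unchanged" >&2
        return 0
    fi
    # Each step runs only if the previous one succeeded, so a failed
    # munchlist run never overwrites the original list.
    cp "$mw_list" "$mw_list.bak" &&
        munchlist "$mw_list.bak" > "$mw_list.new" &&
        mv "$mw_list.new" "$mw_list"
}

# Example (hypothetical file name):
#   munch_wordlist "$HOME/.ispell_words"
```

Compare the line counts of the .bak file and the new list with wc -l to see how much the list shrank.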
Another option is to provide an
alternative master spelling list using the -d
option. This has two problems, though:
The master spelling list should include spellings that are always
valid, regardless of context. You do not want to overload your master
word list with terms that might be misspellings in a different
context. For example, perl is a powerful
programming language, but in other contexts,
perl might be a misspelling of
pearl. You may want to place
perl in a supplemental word list when
documenting Unix utilities, but you probably
wouldn't want it in the master word list unless you
document Unix utilities most of the time that you use
ispell.
The -d option must point to a hashed dictionary
file. What's more, you cannot edit a hashed
dictionary; you will have to edit a plain-text master word list and
then use (or have the system administrator use) buildhash to hash
the edited list into a new dictionary.
To build a new hashed word list, provide buildhash
with a complete list of the words you want included, one per line.
(The buildhash utility can only process a raw word
list, not a munched word list.) The standard system word list,
/usr/dict/words on many systems, can provide a
good starting point. This file is writable only by the system
administrator and probably shouldn't be changed in
any case. So make a copy of this file, and edit or add to the copy.
After processing the file with buildhash, you can
either replace the default ispell.hash file or
point to your new hashed file with the -d option.
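The copy-and-extend steps can be sketched as a small shell function. It assumes the usual buildhash argument order (word list, affix file, output hash file). The helper name build_custom_hash and every path in the example are hypothetical, and the location of the affix file (often named english.aff) varies from system to system.

```shell
# build_custom_hash BASE EXTRA AFFIX OUT  (hypothetical helper)
#   BASE   existing word list, e.g. /usr/dict/words (read, never modified)
#   EXTRA  your additional words, one per line
#   AFFIX  affix file for the language
#   OUT    hashed dictionary to create
build_custom_hash() {
    bh_words="$4.words"
    # Merge the two lists into a new file and drop duplicate entries.
    sort -u "$1" "$2" > "$bh_words" || return 1
    if command -v buildhash >/dev/null 2>&1; then
        buildhash "$bh_words" "$3" "$4"
    else
        echo "buildhash not found; merged list is in $bh_words" >&2
    fi
}

# Example (all paths hypothetical):
#   build_custom_hash /usr/dict/words my-terms.txt english.aff "$HOME/custom.hash"
#   ispell -d "$HOME/custom.hash" chapter.txt
```

Writing the merged list to a new file leaves the system word list alone, and giving -d a full path avoids depending on where ispell looks for dictionaries by default.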
--TOR and LK
Copyright © 2003 O'Reilly & Associates. All rights reserved.