How to Create an Interactive Spell-Checking Shell Script Facility

Posted on 6:48 PM by Bharathvn

#!/bin/sh

# shpell - An interactive spell-checking program that lets you step
# through all known spelling errors in a document, indicate which
# ones you'd like to fix and how, and apply the changes to the file.
# The original version of the file is saved with a .shp suffix,
# and the new version replaces the old.
#
# Note that you need a standard 'spell' command for this to work, which
# might involve installing aspell, ispell, or pspell on your system.

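tempfile="/tmp/$(basename $0).$$"
changerequests="/tmp/$(basename $0).$$.sed"
spell="ispell -l" # modify as needed for your own spell

# Remove the temp files on exit; signals call exit so the EXIT trap fires
trap "rm -f $tempfile $changerequests" EXIT
trap "exit 1" HUP INT QUIT TERM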

# Include the ansi color sequence definitions

. script-library.sh
initializeANSI

getfix()
{
  # Asks the user to specify a correction. If the user enters a replacement
  # word that's also misspelled, the loop prompts again until the user
  # supplies an acceptable word, forces one with a leading "!", ignores
  # the word, or quits.

  word=$1
  filename=$2
  misspelled=1

  while [ $misspelled -eq 1 ]
  do

    echo ""; echo "${boldon}Misspelled word ${word}:${boldoff}"
    grep -n "$word" "$filename" |
      sed -e 's/^/ /' -e "s/$word/$boldon$word$boldoff/g"
    echo -n "i)gnore, q)uit, or type replacement: "
    read fix
    if [ "$fix" = "q" -o "$fix" = "quit" ] ; then
      echo "Exiting without applying any fixes."; exit 0
    elif [ "${fix%${fix#?}}" = "!" ] ; then
      misspelled=0 # user forcing replacement, stop checking
      echo "s/$word/${fix#?}/g" >> "$changerequests"
    elif [ "$fix" = "i" -o -z "$fix" ] ; then
      misspelled=0
    else
      if [ ! -z "$(echo "$fix" | sed 's/[^ ]//g')" ] ; then
        misspelled=0 # once we see spaces, we stop checking
        echo "s/$word/$fix/g" >> "$changerequests"
      else
        # It's a single-word replacement, let's spell-check the replacement too
        if [ ! -z "$(echo "$fix" | $spell)" ] ; then
          echo ""
          echo "*** Your suggested replacement $fix is misspelled."
          echo "*** Preface the word with '!' to force acceptance."
        else
          misspelled=0 # suggested replacement word is acceptable
          echo "s/$word/$fix/g" >> "$changerequests"
        fi
      fi
    fi
  done
}

### Beginning of actual script body
if [ $# -lt 1 ] ; then
  echo "Usage: $0 filename" >&2 ; exit 1
fi

if [ ! -r "$1" ] ; then
  echo "$0: Cannot read file $1 to check spelling" >&2 ; exit 1
fi

# Note that the following invocation fills $tempfile along the way
errors="$($spell < "$1" | tee "$tempfile" | wc -l | sed 's/[^[:digit:]]//g')"

if [ $errors -eq 0 ] ; then
  echo "There are no spelling errors in $1."; exit 0
fi

echo "We need to fix $errors misspellings in the document. Remember that the"
echo "default answer to the spelling prompt is 'ignore', if you're lazy."
touch "$changerequests"

for word in $(cat "$tempfile")
do
  getfix "$word" "$1"
done

if [ $(wc -l < "$changerequests") -gt 0 ] ; then
  sed -f "$changerequests" "$1" > "$1.new"
  mv "$1" "$1.shp"
  mv "$1.new" "$1"
  echo Done. Made $(wc -l < "$changerequests") changes.
fi

exit 0

How It Works
The script revolves around the getfix function, which shows each error in context and then prompts the user for either a correction or permission to ignore it. The conditionals in getfix let users type a correction for the reported misspelling, i (or just Enter) to ignore it, or q to quit immediately without applying any fixes. Perhaps more interesting is that getfix also checks the spelling of each single-word correction that is entered, to ensure that you're not trading one misspelling for another. If the script thinks that the correction is a misspelling too, you can force acceptance by prefacing it with the "!" character. Replacements that contain spaces are accepted without being checked.
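The test for a leading "!" relies on nested parameter expansion, which can look cryptic at first glance. The following is a minimal sketch of that idiom in isolation; the helper name first_char and the sample input are made up for illustration and do not appear in the script itself.

```shell
#!/bin/sh
# first_char: print the first character of its argument using only
# POSIX parameter expansion (hypothetical helper, not part of shpell).
first_char()
{
  # ${1#?} removes the first character, leaving the rest of the string.
  # ${1%...} then removes that remainder from the end of the original,
  # so only the first character is left.
  printf '%s\n' "${1%"${1#?}"}"
}

fix='!colour'                      # sample input, as if typed at the prompt
if [ "$(first_char "$fix")" = "!" ] ; then
  forced="${fix#?}"                # drop the leading "!" to get the word
  echo "forced replacement: $forced"
fi
```

Run as written, this prints "forced replacement: colour". The inner expansion is quoted here so that any shell pattern characters in the input are treated literally.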

The fixes themselves are accumulated in a sed script whose filename is stored in $changerequests. That script is then applied to the file in a single pass once the user has finished reviewing all of the would-be mistakes.
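To make the accumulate-then-apply idea concrete, here is a small sketch that queues two substitutions in a sed script and applies them with sed -f. The file names and sample text are invented for the demo, and mktemp -d is assumed to be available (it is common but not required by POSIX).

```shell
#!/bin/sh
# Sketch: queue substitutions in a sed script, then apply them in one pass.
workdir=$(mktemp -d) || exit 1     # scratch directory for this demo only
doc="$workdir/sample.txt"
changes="$workdir/changes.sed"

printf '%s\n' 'She half believed herrself' 'the pool reippling' > "$doc"

# One sed command per accepted correction, appended as the review proceeds
echo 's/herrself/herself/g'   >> "$changes"
echo 's/reippling/rippling/g' >> "$changes"

# Apply every queued change at once
result=$(sed -f "$changes" "$doc")
echo "$result"

rm -rf "$workdir"
```

Because each queued command is a plain s/old/new/g, it also rewrites the misspelling wherever it appears inside a longer word, and a replacement containing "/" or "&" would need escaping before it could be written to the sed script.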

Also worth mentioning is that the trap command at the beginning of the script ensures that any temp files are removed. Finally, if you check the last few lines of the script, you'll note that the precorrected version of the file is saved with a .shp suffix, in case something goes wrong. Anticipating possible problems is always a wise policy, particularly for scripts that munge input files.
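The cleanup and backup steps can be sketched on their own as follows. This is an illustration under assumptions rather than an excerpt: the scratch directory, file names, and sample text are invented, and mktemp -d is assumed to exist on your system.

```shell
#!/bin/sh
# Sketch: trap-based cleanup plus a backup copy before overwriting a file.
workdir=$(mktemp -d) || exit 1     # demo scratch area (assumed mktemp -d)
doc="$workdir/ragged.txt"
tempfile="$workdir/shpell.$$"

# Clean up however the script ends. Signals are routed through exit so
# that the EXIT trap does the actual removal. (Demo only: this removes
# the whole scratch directory, backup included.)
trap 'rm -rf "$workdir"' EXIT
trap 'exit 1' HUP INT QUIT TERM

echo 'half believed herrself' > "$doc"

sed 's/herrself/herself/g' "$doc" > "$tempfile"
mv "$doc" "$doc.shp"               # original survives as ragged.txt.shp
mv "$tempfile" "$doc"              # corrected text takes its place

backup=$(cat "$doc.shp")
current=$(cat "$doc")
echo "backup:  $backup"
echo "current: $current"
```

Writing the corrected text to a separate file first and renaming afterward means the original is never half-overwritten if sed fails partway through.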

Running the Script
To run this script, specify the filename to spell-check as a command argument.

The Results
$ shpell ragged.txt
We need to fix 5 misspellings in the document. Remember that the
default answer to the spelling prompt is 'ignore', if you're lazy.

Misspelled word herrself:
1:So she sat on, with closed eyes, and half believed herrself in
i)gnore, q)uit, or type replacement: herself

Misspelled word reippling:
3:all would change to dull reality--the grass would be only rustling in the
wind, and the pool reippling to the waving of the reeds--the
i)gnore, q)uit, or type replacement: rippling

Misspelled word teacups:
4:rattling teacups would change to tinkling sheep-bells, and the
i)gnore, q)uit, or type replacement:

Misspelled word Gryphon:
7:of the baby, the shriek of the Gryphon, and all the other queer noises, would
change (she knew)
i)gnore, q)uit, or type replacement:

Misspelled word clamour:
8:to the confused clamour of the busy farm-yard--while the lowing of
i)gnore, q)uit, or type replacement:
Done. Made 2 changes.

It's impossible to reproduce here, but the ANSI color sequences let the misspelled words stand out in the output display.
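The script gets boldon and boldoff from initializeANSI in script-library.sh, which is not shown in this post. As a rough sketch, the two variables could be defined as below; the escape codes are the standard ANSI sequences for bold and normal intensity, but treat the exact definitions, and the highlight helper, as assumptions rather than the library's actual contents.

```shell
#!/bin/sh
# Assumed stand-ins for what initializeANSI provides; the real
# definitions live in script-library.sh and may differ.
esc=$(printf '\033')               # the ASCII escape character
boldon="${esc}[1m"                 # SGR 1: bold
boldoff="${esc}[22m"               # SGR 22: normal intensity

highlight()
{
  # $1 = word to emphasize; text to scan arrives on standard input
  # (hypothetical helper mirroring the sed call inside getfix)
  sed "s/$1/${boldon}$1${boldoff}/g"
}

echo "half believed herrself in" | highlight herrself
```

On a terminal that understands ANSI sequences, the word herrself appears in bold while the rest of the line is unchanged.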