nanog mailing list archives

Re: Quick fix


From: Dan Boehlke <dboehlke () mr net>
Date: Fri, 3 Mar 2000 09:55:33 -0600 (CST)


it's not so much that my shift key doesn't work, as that i gave it up to have
an extra meta key for emacs functions to be mapped to. :-)  losing
uppercase didn't seem so bad.  however, having just been through a bandwidth
crunch, i was unable to take my company's suggestion and switch from
ascii to a bit-saving character set like bcd or fielddata to save two
bits of bandwidth.

On 1 Mar 2000, Bandy Rush wrote:


I noticed that a number of individuals who post regularly to the NANOG mailing list have keyboards with 
malfunctioning shift keys.  Since working keyboards are so expensive, far beyond the financial means of most 
poorly-paid technical workers, I humbly offer the following Perl program, which you can use to filter your NANOG 
e-mail to correct capitalization errors (procmail should do the trick for that).  For example:

    i abhor cross-posting.  but sometimes ...

Gets converted to:

    I abhor cross-posting.  But sometimes ... 

It's not perfect, but it catches most occurrences.  If you have a slow computer, you probably want to disable the 
dictionary.  You can do this by setting @DICT_FILES=().  As is, it is configured to search for the word list in the 
default locations in Linux and Solaris.
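
For the procmail hookup mentioned above, a recipe along the following lines might work. This is only a sketch: the script path /usr/local/bin/fixcaps.pl and the match on "nanog@" are assumptions, not anything the original post specifies. The "b" flag passes only the message body through the filter, so the headers are left untouched.

```procmail
# Hypothetical ~/.procmailrc recipe (script path and list address are assumed)
# f = treat the pipe as a filter, b = feed the body only, w = wait for the exit code
:0 fbw
* ^TO_nanog@
| /usr/local/bin/fixcaps.pl
```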

And oh yes, it's a joke.  ;)

#!/usr/bin/perl -w  
 
# use good Perl form
use strict;
use English;
 
# location of plain-text word list
my @DICT_FILES = ("/usr/dict/words", "/usr/share/lib/dict/words");
 
# slurp in all input  
undef $INPUT_RECORD_SEPARATOR;
my $input = <>;
exit unless (defined($input));
 
# Fix end of paragraph
$input =~ s/([a-z])(\n\n.)/$1.$2/g;
 
# Fix start of sentence
$input =~ s/(\n\n+)([a-z])/$1\u$2/g;
$input =~ s/(\.\s+)([a-z])/$1\u$2/g;
 
# Fix self-referential problems
$input =~ s/(\b)i(\b)/$1I$2/g;
$input =~ s/(\b)randy(\b)/$1Randy$2/g;
$input =~ s/(\b)bush(\b)/$1Bush$2/g;
 
# Fix words that should be capitalized (based on the dictionary)
$INPUT_RECORD_SEPARATOR = "\n";
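foreach my $dict_file (@DICT_FILES) {
    # three-argument open with a lexical filehandle
    if (open(my $dict, "<", $dict_file)) {
        while (defined (my $uc_word = <$dict>)) {
            if ($uc_word =~ /[A-Z]/) {
                chomp($uc_word);
                my $lc_word = lc($uc_word);
                # \Q...\E keeps any punctuation in a dictionary word literal
                $input =~ s/(\b)\Q$lc_word\E(\b)/$1$uc_word$2/g;
            }
        }
        close($dict);
        # only the first word list that opens is used
        last;
    }
}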
 
# output our fixed text
print $input;
 
# The author releases this software into the public domain.  
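
As a quick check of the sentence-start and pronoun rules, here is a minimal sketch that wraps just those substitutions in a sub and runs them on the example sentence from the message. The name fix_caps is hypothetical and not part of the original script. The dictionary pass and the end-of-paragraph fix are left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): applies only the
# sentence-start and pronoun substitutions and returns the result.
sub fix_caps {
    my ($text) = @_;
    $text =~ s/(\n\n+)([a-z])/$1\u$2/g;   # first letter after a blank line
    $text =~ s/(\.\s+)([a-z])/$1\u$2/g;   # first letter after a full stop
    $text =~ s/\bi\b/I/g;                 # the lone pronoun "i"
    return $text;
}

print fix_caps("i abhor cross-posting.  but sometimes ...\n");
```

Note that the leading "i" is fixed by the pronoun rule, not by a sentence-start rule: neither sentence-start pattern matches at the very beginning of the input, so a message that opens with any other lowercase word would keep its lowercase first letter.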



--
Dan Boehlke, Senior Network Engineer                              Onvoy
Internet:  dboehlke () onvoy com                  Formerly MEANS and MRNet
Phone:  612-362-5814                  2829 SE University Ave. Suite 200
WWW: http://www.onvoy.com/~dboehlke/             Minneapolis, MN  55414



