CUFP
halley
I admit it: I'm one of those annoying people who will interrupt a conversation to point out spelling errors. I try to be discreet and mean no offense; my parents raised me to prefer one friendly correction over making the same mistake again in more important settings.
<p>However, blogs and other online forums are typically filled with egregious, repetitive, and predictable errors. If only I could hide those errors in my browser, I could stay mellow and calm while the decline of the rest of the world's grammar went unchecked.
<p>I pondered aloud to some friends about the best way to put a search-and-replace filter into my favorite web browser, and somebody suggested [cpan://HTTP::Proxy].
<p>This is scratch code without documentation, and I've only tested it on Linux. The simplistic filter has trouble in the rare cases where a typo appears inside a tag attribute. For the typo list, I filtered a word processor's auto-corrections file, added a few blog-common errors myself, and stripped out any non-ASCII fixes for simplicity. To use it, run this <code>typoxy</code> proxy in the background and configure your browser to access the web through it.
<readmore>
<code>
#!/usr/bin/perl
use strict;
use warnings;
#use Data::Dumper;
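my $Port      = 8080;  # port the proxy listens on
my $Highlight = 1;     # set to 0 to correct typos without highlighting
#----------------------------------------------------------
# Markup wrapped around each correction when highlighting is on.
my $Pre  = $Highlight ? '<span style="background: #ffffcc; color: #800000">' : '';
my $Post = $Highlight ? '</span>' : '';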
# Load tab-separated "wrong<TAB>right" pairs from ~/.typo.
my @Typos = ();
if (open(my $typo_fh, '<', "$ENV{HOME}/.typo"))
{
    while (<$typo_fh>)
    {
        chomp;
        my ($wrong, $right) = split /\t+/;
        next if not $right;
        next if length($wrong) < 2;
        push(@Typos, [ $wrong, $right ]);
        # Also catch the capitalized form, e.g. at the start of a sentence.
        push(@Typos, [ ucfirst($wrong), ucfirst($right) ])
            if ucfirst($wrong) ne $wrong;
    }
    close($typo_fh);
}
die "No typos loaded from ~/.typo.\n" if not @Typos;
print STDERR "$0: ", scalar @Typos, " typos filtered on port $Port.\n";
# Longer corrections first.
@Typos =
    map  { $_->[1] }
    sort { $b->[0] <=> $a->[0] }
    map  { [ length($_->[0]), $_ ] }
    @Typos;
# Escape regex metacharacters in each word; spaces are lenient.
$_->[0] = join('\s+', map { quotemeta($_) } split(/\s+/, $_->[0]))
    foreach @Typos;
# Precompile the correction patterns.
$_->[0] = qr/ (?<! [<>] ) \b ( $_->[0] ) \b/x
    foreach @Typos;
#print Dumper $Typos[0], $Typos[-1];
#----------------------------------------------------------
use HTTP::Proxy;
my $proxy = HTTP::Proxy->new(port => $Port);
$proxy->push_body_filter( response => \&typo_filter );
$proxy->start();
#----------------------------------------------------------
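# Body filter: the first argument is a reference to the response body
# data; every correction is applied to it in place.
sub typo_filter
{
    my ($body) = @_;
    foreach my $typo (@Typos)
    {
        $$body =~ s|$typo->[0]|$Pre$typo->[1]$Post|g;
    }
}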
</code>
</readmore>
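<p>To see the tag-attribute problem concretely, here is a minimal sketch that applies a pattern of the same shape outside the proxy. The one-entry table and the <code>fix_typos</code> helper are stand-ins invented for this illustration and are not part of <code>typoxy</code>; highlighting is left off to keep the output short.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical one-entry table in the same [pattern, replacement] shape
# that the proxy builds from ~/.typo.
my @typos = ( [ qr/ (?<! [<>] ) \b ( definatly ) \b /x, 'definitely' ] );

# Illustration-only helper: apply every correction to a copy of the text.
sub fix_typos {
    my ($html) = @_;
    $html =~ s/$_->[0]/$_->[1]/g for @typos;
    return $html;
}

# Ordinary body text is corrected as intended.
print fix_typos('<p>I definatly agree</p>'), "\n";
# prints: <p>I definitely agree</p>

# The same word inside an attribute is rewritten too, changing the URL.
print fix_typos('<a href="/tags/definatly">tag</a>'), "\n";
# prints: <a href="/tags/definitely">tag</a>

# A word that directly follows a tag is skipped by the (?<! [<>] ) guard.
print fix_typos('<b>definatly</b>'), "\n";
# prints: <b>definatly</b>
```

<p>With highlighting switched on, the second case would also drop a <code>span</code> into the middle of the attribute value. A filter that parses the HTML and only touches text nodes would likely avoid this, at the cost of more code.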
<p>Without benchmarking, it seems to affect connect times more than it affects actual rendering time, even with 900+ typos in the <code>~/.typo</code> configuration file. A sample typo list is <a href="http://www.halley.cc/.typo">http://www.halley.cc/.typo</a>; it's just a list of tab-delimited lines, one wrong/right pair per line: <code>"definatly\tdefinitely\n"</code>. It's set to highlight corrections in red on yellow (so you can see it working), but turning that off only takes setting <code>$Highlight</code> to 0.
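<p>As a rough sketch of how one line of that file turns into corrections, the snippet below mirrors the loader's rules in a standalone helper. The name <code>typo_entries</code> is made up for this example; the script itself does the same work inline.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustration-only helper: turn one line of the typo file into a list
# of [wrong, right] pairs, following the same rules as the loader.
sub typo_entries {
    my ($line) = @_;
    chomp $line;
    my ($wrong, $right) = split /\t+/, $line;
    return () if not $right;              # no tab, or empty correction
    return () if length($wrong) < 2;      # skip one-character "typos"
    my @entries = ( [ $wrong, $right ] );
    # Add a capitalized variant when it differs from the original.
    push @entries, [ ucfirst($wrong), ucfirst($right) ]
        if ucfirst($wrong) ne $wrong;
    return @entries;
}

for my $entry ( typo_entries("definatly\tdefinitely\n") ) {
    print "$entry->[0] => $entry->[1]\n";
}
# prints:
# definatly => definitely
# Definatly => Definitely
```

<p>So each lowercase line in the file yields two corrections, and lines without a tab are silently ignored.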
<p>--<br><tt>[ e d @ h a l l e y . c c ]</tt>