http://www.perlmonks.org?node_id=155450
Category: Web Stuff
Author/Contact Info Juerd
Description: Because the popular gnuvd is broken, I wrote this quick hack to query the Van Dale website for dictionary lookups. It's a quick hack, so don't expect production quality ;) Oh, and please don't bother me about Getopt or HTML::Parser: I don't want to use Getopt because I don't like it, and I can't use HTML::Parser because http://www.vandale.nl/ serves a lot of broken HTML and because regexes are easier (after all, it's a quick hack because I can't live without a Dutch dictionary).

This probably isn't of much use to foreigners :)

Update (2003-06-08 17:19 +0200): works with the updated vandale.nl HTML now.
#!/usr/bin/perl -w

use strict;
use LWP::Simple;

my (@switches, @woorden);

while (@ARGV) {
    $_ = shift;
    if (/^--$/) {
        # Everything after a bare "--" is a word, never a switch.
        push @woorden, @ARGV;
        last;
    } elsif (/^-/) {
        push @switches, $_;
    } else {
        push @woorden, $_;
    }
}

my $all = grep /^(?:-\w*a|--all)$/, @switches;
if (grep /^(?:-\w*h|--help)$/, @switches) {
    print qq{
        Usage: $0 [options] word ...
        
        options:
            -a  --all   List all matches
            -h  --help  Display usage information
    \n};
    exit 0;
}

for my $woord (@woorden) {
    $woord =~ s/(\W)/sprintf '%%%02x', ord $1/ge;

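    my $page =
        get "http://www.vandale.nl/opzoeken/woordenboek/?zoekwoord=$woord";
    # get() returns undef on failure; skip this word instead of matching on undef.
    next unless defined $page;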

    while ($page =~ s{<B><BIG>(.*?)</font>.*?((?:<DD>.*?</DD>)+)}{}si) {
        my ($woord, $betekenis) = ($1, $2);
        for ($woord, $betekenis) {
            s[</dd>][\n]gi;
            s/<.*?>//g;
            s/&#180;/'/g;
            s/&#(\d+);/chr $1/ge;
        }
        $betekenis =~ s/^/  /gm;
        print "$woord\n$betekenis\n";
        last if not $all;
    }
}
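
Since the extraction regex breaks whenever the site changes its HTML, it may help to try the matching and cleanup steps offline first. Below is a minimal sketch that runs the same regex and substitutions on a made-up fragment. The sample HTML and the entries() sub are my own assumptions for illustration; the real vandale.nl markup may well look different.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up fragment, shaped only to fit the regex in the script above.
# The real vandale.nl markup may well differ.
my $sample = join '',
    '<B><BIG>fiets</BIG></B></font> (de)',
    '<DL><DD>1 tweewielig voertuig</DD>',
    '<DD>2 zie ook &#180;rijwiel&#180; &#40;formeel&#41;</DD></DL>',
    '<B><BIG>fietsen</BIG></B></font>',
    '<DL><DD>op een fiets rijden</DD></DL>';

# Hypothetical helper: returns a list of [word, meaning] pairs,
# using the same regex and cleanup steps as the main loop.
sub entries {
    my ($html) = @_;
    my @found;
    while ($html =~ s{<B><BIG>(.*?)</font>.*?((?:<DD>.*?</DD>)+)}{}si) {
        my ($word, $meaning) = ($1, $2);
        for ($word, $meaning) {
            s[</dd>][\n]gi;          # one definition per line
            s/<.*?>//g;              # strip the remaining tags
            s/&#180;/'/g;            # acute accent used as an apostrophe
            s/&#(\d+);/chr $1/ge;    # decode numeric entities
        }
        $meaning =~ s/^/  /gm;       # indent the definitions
        push @found, [$word, $meaning];
    }
    return @found;
}

my @found = entries($sample);
print "$_->[0]\n$_->[1]\n" for @found;
```

If the site changes again, pasting a saved copy of the page into $sample should show quickly whether the regex still matches anything.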