
by Juerd (Abbot)
on Mar 30, 2002 at 17:54 UTC (#155450)

Category: Web Stuff
Author/Contact Info Juerd
Description: Because the popular gnuvd is broken, I made this quick hack to query the Van Dale website for dictionary lookups. It's a quick hack, so no production quality here ;) Oh, and please don't bother me with Getopt or HTML::Parser: I don't want to use Getopt because I don't like it, and I can't use HTML::Parser because the site has a lot of broken HTML, and because regexes are easier (after all, it's a quick hack because I can't live without a Dutch dictionary).

This probably isn't of much use to foreigners :)

Update (200306081719+0200) - works with html updates now.
#!/usr/bin/perl -w

use strict;
use LWP::Simple;

my (@switches, @woorden);

while (@ARGV) {
    $_ = shift;
    if (/^--$/) {
        # Everything after "--" is a word, even if it starts with "-"
        push @woorden, @ARGV;
        last;
    } elsif (/^-/) {
        push @switches, $_;
    } else {
        push @woorden, $_;
    }
}

my $all = grep /^(?:-\w*a|--all)$/, @switches;
if (grep /^(?:-\w*h|--help)$/, @switches) {
    print qq{
        Usage: $0 [options] word ...
            -a  --all   List all matches
            -h  --help  Display usage information
    };
    exit 0;
}

for my $woord (@woorden) {
    $woord =~ s/(\W)/sprintf '%%%02x', ord $1/ge;

    my $page =
        get "$wo

    while ($page =~ s{<B><BIG>(.*?)</font>.*?((?:<DD>.*?</DD>)+)}{}si) {
        my ($woord, $betekenis) = ($1, $2);
        for ($woord, $betekenis) {
            s/&#(\d+);/chr $1/ge;   # decode numeric character references
        }
        $betekenis =~ s/^/  /gm;    # indent the definition by two spaces
        print "$woord\n$betekenis\n";
        last if not $all;
    }
}
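
The switch handling at the top of the script can be tried on its own. What follows is only a sketch: `parse_args` and `wants_all` are hypothetical helper names that do not appear in the script, and the argument list is made up. It mirrors the while loop (with a `last` after `--`, so the remaining arguments are not processed a second time) and the `grep` that decides whether `-a` or `--all` was given.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): split an argument
# list into switches and words, the way the script's while loop does.
sub parse_args {
    my @args = @_;
    my (@switches, @words);
    while (@args) {
        my $arg = shift @args;
        if ($arg eq '--') {
            push @words, @args;    # everything after "--" is a word
            last;                  # stop, or the rest would be seen twice
        } elsif ($arg =~ /^-/) {
            push @switches, $arg;
        } else {
            push @words, $arg;
        }
    }
    return (\@switches, \@words);
}

# Same pattern as the script: "-a", a bundle ending in "a", or "--all".
# Returns the number of matching switches.
sub wants_all {
    my @switches = @_;
    return scalar(grep { /^(?:-\w*a|--all)$/ } @switches);
}

# Made-up command line: one switch, one word, then a word that looks
# like a switch but comes after "--".
my ($switches, $words) = parse_args(qw(-a fiets -- -huis));
print "switches: @$switches\n";
print "words:    @$words\n";
print "all:      ", (wants_all(@$switches) ? "yes" : "no"), "\n";
```

Note that the pattern only catches a bundled `a` when it is the last letter of the bundle, so `-xa` counts but `-ax` does not. That is how the regex in the script behaves as well.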

(jeffa) Re: (with Getopt::Declare)
by jeffa (Bishop) on Mar 30, 2002 at 20:38 UTC
    Regarding Option parsing modules - this is not to bug you into using them, but rather an option for others to decide.

    I thought to myself, "hmmmm ... let's use TheDamian's Getopt::Declare" and proceeded to RTFM. I had always wanted to learn this module, and now seemed like the time.

    After about 40 minutes of racking my brain (:D) i finally came up with this:

    #!/usr/bin/perl -w
    use strict;
    use LWP::UserAgent;
    use Getopt::Declare;

    # -h, -v, --help, --version are included
    # and these are tabs - not spaces!
    my $spec = q(
        -a	List all matches
        --all	[ditto]
    );
    my $args = Getopt::Declare->new($spec);
    my $all  = $args->{'--all'} || $args->{'-a'};

    for my $woord ($args->unused) {
        # insert for loop block innards from code above
    }
    But that is 40 minutes of well-spent time, because now i see the power of this module. And thanks to the bottleneck of having to retrieve the page from the Internet, it hardly matters that Getopt::Declare is slower than the option parsing code above.

    P.S. i also have no qualms about using regexes to parse HTML, just as long as the coder understands how to use the CPAN HTML parsers. Sometimes using regexes really is easier. Sometimes.


    (the triplet paradiddle with high-hat)
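
On the regex-versus-parser point, here is a rough sketch of what the extraction regex from the original post does with unbalanced markup. The sample HTML is invented for illustration (it is not copied from any real page), and `extract_entries` is a hypothetical name. Pulling the individual `<DD>` items apart is an extra step that the original script does not take.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Invented sample markup, loosely shaped like what the regex expects.
# Note the deliberately unbalanced tags (<B><BIG> closed by </font>).
my $page = <<'HTML';
<B><BIG>fiets</font> (de)
<DL><DD>tweewieler</DD><DD>caf&#233;racer</DD></DL>
<B><BIG>fietsen</font>
<DL><DD>op een fiets rijden</DD></DL>
HTML

# Hypothetical helper: returns a list of [word, meaning, meaning, ...].
# With a false $all it stops after the first match, like the script.
sub extract_entries {
    my ($html, $all) = @_;    # $html is a copy; the caller's page is untouched
    my @entries;
    while ($html =~ s{<B><BIG>(.*?)</font>.*?((?:<DD>.*?</DD>)+)}{}si) {
        my ($word, $meaning) = ($1, $2);
        for ($word, $meaning) {
            s/&#(\d+);/chr $1/ge;    # decode numeric character references
        }
        # Extra step, not in the original: one list item per <DD> element.
        my @meanings = $meaning =~ m{<DD>(.*?)</DD>}gsi;
        push @entries, [$word, @meanings];
        last if not $all;
    }
    return @entries;
}

binmode STDOUT, ':encoding(UTF-8)';    # the decoded entity is non-ASCII
for my $entry (extract_entries($page, 1)) {
    my ($word, @meanings) = @$entry;
    print "$word\n";
    print "  $_\n" for @meanings;
}
```

The regex copes with the mismatched tags because it never tries to pair them up. The flip side is that any change in the page layout silently produces no matches at all.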

      In the interest of TMTOWTDI, here's an alternate Getopt::Declare scenario that sets up the $all and @words variables in action blocks. Here I decided to make the search words required (but without an option description), and just allow -a as an abbreviation of -all (instead of using the --all version):

      #!/usr/bin/perl -w
      use strict;
      use LWP::UserAgent;
      use Getopt::Declare;
      use vars qw/$all @words/;

      my $opts = Getopt::Declare->new(<<'EOS');
      -a[ll]	List all matches
      		{$all = 1}
      <terms:s>...	[required]
      		{@words = @terms}
      EOS

      for my $word (@words) {
          # insert fetch code ...
          # ...
          last unless $all;
      }
      __END__
by cztmonk (Monk) on Jul 18, 2012 at 10:07 UTC

    When I use this code, there is no output...

      This post is ten years old, and the code was last updated nine years ago. It's likely the site in question has changed substantially in that time.

        You are right, that was a stupid remark..
