
Auto linking to words in a text file

by belize (Deacon)
on Dec 01, 2000 at 21:58 UTC ( [id://44418] : perlquestion )

belize has asked for the wisdom of the Perl Monks concerning the following question:

I know I have seen this before but can't seem to locate it. I have a text file that is returned from a database search. Before the file is displayed, I want to go through the text and match it against a link file, so that wherever a listed word appears, it is automatically linked to another file, similar to the nodes on this site.

I've searched this site using "auto link" and "linking words auto" without luck.

By the way, the answers to a previous question about generating graphics with GD resulted in a successful project. Thanks for the hints. You can see the results at The maps are generated on the fly. I would link to the SOPW entry for this, but I haven't figured out how to do that yet on here.

Replies are listed 'Best First'.
Re: Auto linking to words in a text file
by arturo (Vicar) on Dec 01, 2000 at 22:09 UTC

    One way of doing it would be defining a hash whose keys are the keywords (the ones you want to provide links for) -- this will work with phrases too, BTW -- and whose values are URIs or URLs. Then just use s///g to do it (that's the quick n' dirty way).

    #!/usr/bin/perl -w
    use strict;

    my %keywords = (
        foo  => '/foo.html',
        bar  => 'bar.html',
        cult => '',
    );

    my $textfile = "/file/to/link";
    open TEXTFILE, $textfile or die "Yikes! $textfile won't open: $!\n";

    my $data;
    {
        local $/;    # sets input record separator to undef
                     # so we can 'slurp' the file in
        $data = <TEXTFILE>;
    }
    close TEXTFILE;

    foreach (keys %keywords) {
        my $url = $keywords{$_};
        # EDITED ... I'd forgotten to escape
        # the / in front of the closing <a>
        # credit to Albannach
        $data =~ s/($_)/<a href="$url">$1<\/a>/g;
    }
    # now do something with $data

    But that's *oh so quick* and *oh so dirty*, it probably raises a lot of problems and only works on very simple kinds of words (e.g. if you have phrases in your keywords hash, you could really mess things up).

    UPDATE Implementing something like this site's linking mechanism wouldn't be so hard. Just put delimiters around the words / phrases you want to link, and something like the above should work fairly well. E.g. [bob] would say 'put a link around "bob"' (what link, you could determine in a number of ways, but the hash idea seems to work well enough).

    Just alter the above substitution line to:

    $data =~ s/\[($_)\]/<a href="$url">$1<\/a>/g;

    Bonus: that takes care of phrases, too. (credit for this idea goes to whathisname)
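    For what it's worth, here is a minimal sketch of the bracket idea done in one pass, with the lookup happening inside the substitution. The %keywords table, its URLs, and the link_brackets name are all made up for illustration; bracketed text that isn't in the table is left alone rather than linked.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical keyword => URL table; the URLs are placeholders.
my %keywords = (
    'bob'     => '/bob.html',
    'foo bar' => '/foobar.html',
);

# Replace [phrase] with a link when the phrase is a known keyword;
# unknown bracketed text is left untouched, brackets included.
sub link_brackets {
    my ($text) = @_;
    $text =~ s{\[([^\[\]]+)\]}{
        exists $keywords{$1}
            ? qq{<a href="$keywords{$1}">$1</a>}
            : "[$1]"
    }ge;
    return $text;
}

print link_brackets("Ask [bob] about [foo bar], not [alice].\n");
```

    Because the keyword is only looked up after the brackets have matched, nothing in the keys needs escaping, and plain words in the text never get linked by accident.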

    Seems like there must be a module that does something like this (I'll go to CPAN and give you an update)

    Philosophy can be made out of anything. Or less -- Jerry A. Fodor

      my $data;
      {
          local $/;    # sets input record separator to undef
                       # so we can 'slurp' the file in
          $data = <TEXTFILE>;
      }
      close TEXTFILE;

      Why didn't I think of that :)

      These are the sort of things I love about such forums as Perl Monks.


      Brother Marvell

        If you like that then you'll probably like to know that it can be done with even fewer keystrokes:
        my $data = do { local $/; <TEXTFILE> };
        No, this isn't golf, but all the same it's a nice construct to know.
      You probably want to take one additional step and sort each of the keys by length, doing the longest first. This would allow you to catch a link to "foo bar" as well as individual links to "foo" and "bar" without getting inconsistent results.
      foreach (sort { length($b) <=> length($a) } keys %keywords) { ... }
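      A rough sketch of how the longest-first idea might look when the keywords are folded into a single regex, so the text is scanned only once. The %keywords entries and the link_keywords name are assumptions for the example, not anything from the original posts.

```perl
use strict;
use warnings;

# Hypothetical keyword => URL table, for illustration only.
my %keywords = (
    'foo'     => '/foo.html',
    'bar'     => '/bar.html',
    'foo bar' => '/foobar.html',
);

# Longest keys first, so "foo bar" wins over "foo" at the same position.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } keys %keywords;
my $keyword_re = qr/\b($alternation)\b/;

# One pass over the text: inserted markup is never rescanned, so "foo"
# inside href="/foo.html" is not linked a second time.
sub link_keywords {
    my ($text) = @_;
    $text =~ s/$keyword_re/<a href="$keywords{$1}">$1<\/a>/g;
    return $text;
}

print link_keywords("A foo bar, a foo and a bar.\n");
```

      The single pass matters as much as the sort: looping over the keys one at a time can re-link a keyword that shows up inside a URL inserted by an earlier iteration.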
Re: Auto linking to words in a text file
by mdillon (Priest) on Dec 01, 2000 at 22:26 UTC
    this is very simplistic and only works within a line. also, i hadn't seen that you wanted the links to be in the data file, so this won't totally work. oh well.
    #!/usr/bin/perl -w
    use strict;
    use URI::Escape;

    # one keyword per line after __END__
    chomp(my @words = <DATA>);
    my $re_text = join '|', map quotemeta, @words;
    my $re = qr/\b($re_text)\b/;

    while (<>) {
        s/$re/build_link($1)/ge;
        print;
    }

    sub build_link {
        my $word = shift;
        # first %s is the escaped word in the URL, second is the link text
        sprintf '<A HREF="/%s">%s</A>', uri_escape($word), $word;
    }
    __END__
    foo
    bar
    baz
    PerlMonks
Re: Auto linking to words in a text file
by 2501 (Pilgrim) on Dec 01, 2000 at 22:13 UTC
    where would the links be found, and do you know all the links beforehand?
    Are you looking for something like:
    this would be one way how a [link|] is inputted
    or might you have a hash loaded up with links and words like:
     $wordlist{'link'}="";
    and then if you see the word "link" you automatically create the link.
      This is pretty much what I mean. I envision a text file with all the words needed to be linked to thus:


      Then when "link_word" appears in a returned text, anywhere that "link_word" appears, it would be replaced with:


      Hope this is more clear.

        I'm still not getting it, I'm afraid. Would this be defined at the top of a file, as in "here's a list of words and their associated links", or *within* the file, as in only certain occurrences of the words will get linked, and that link is determined by what follows the 'pipe' symbol?

        Example #1

        cult|
        banana|/orange.html
        Being in a cult can reduce your ability to eat a banana, doctors claim.

        Example #2

        Being in a cult| can reduce your ability to eat bananas|/orange.html , doctors claim.

        If the latter, it's so close to HTML already it's almost not worth the work of writing a script. But whatever. If the former, what you could do is read the word/url pairs into a hash, and do something like what I suggested above.

        Philosophy can be made out of anything. Or less -- Jerry A. Fodor

Re: Auto linking to words in a text file
by marvell (Pilgrim) on Dec 01, 2000 at 22:25 UTC

    Presuming your link file is small, it should be relatively easy to load it all into a hash.

    Let's say the link file format was something like: {phrase}=={link}

    (I use == as there may be whitespace in the phrase)

    The load code would be something like:

    while (<LINKFILE>) {
        chomp;
        ($key, $data) = split("==");
        $links{$key} = $data;
    }

    Once that is loaded in, you should be able to replace the links in the file. As these are web files, I imagine that loading the whole thing in is not an issue.

    $content = join("", <FILE>);    # this is better replaced with arturo's method
                                    # of reading in whole file
    foreach $k (keys %links) {
        $content =~ s/$k/$links{$k}/gm;
    }
    $content will now be left with all the links in it.
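    As a hedged sketch only, the loading step might look like this under strict, assuming the {phrase}=={link} format described above. The load_links name and the sample data are invented for the example, and an in-memory filehandle stands in for the real link file so the snippet runs on its own.

```perl
use strict;
use warnings;

# Parse "{phrase}=={link}" lines from any filehandle into a hash.
sub load_links {
    my ($fh) = @_;
    my %links;
    while (my $line = <$fh>) {
        chomp $line;
        next unless $line =~ /\S/;                   # skip blank lines
        my ($phrase, $link) = split /==/, $line, 2;  # split on the first == only
        next unless defined $link;                   # skip malformed lines
        $links{$phrase} = $link;
    }
    return %links;
}

# Made-up sample data read through an in-memory filehandle.
my $sample = "foo==/foo.html\nfoo bar==/foobar.html\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my %links = load_links($fh);
close $fh;

print "$_ => $links{$_}\n" for sort keys %links;
```

    The limit of 2 on split keeps any later == inside the link part intact, and lines without a separator are skipped instead of producing undefined values.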


    Brother Marvell