http://www.perlmonks.org?node_id=169469


in reply to (jcwren) Re: Text::Balanced woes..
in thread Text::Balanced woes..

I see...
thank you very much, Chris.... i never would have thought to try a test excluding everything before the tag...

You're right that Text::Balanced isn't very useful like this; i can't imagine that the author meant it to be this way.

You can see what i'm trying to do (well, actually i'll be parsing <a href> links out of messages in an effort to combat chatroom spambots). Is there another method you'd suggest?

It was looking at the docs that convinced me that Text::Balanced was the right thing for me... i'd be using the 5th (#4) element of the returned list.. i haven't found another lib that'll supply just the stripped URL inside the tag yet... and while Perl seems super-cool for text handling (i'm a duffer), i'd rather not reinvent the wheel..

in any case, thanks very much for your reply!

(jcwren) Re: Text::Balanced woes..
by jcwren (Prior) on May 27, 2002 at 03:12 UTC

    There are several packages based on HTML::Parser, such as HTML::LinkExtor, that shouldn't require you to invent too many wheels. I would take a look at that.

    I would avoid at all costs using a regular expression to extract links. Real-world HTML varies too much (quoting, attribute order, case, line breaks inside tags), so that's just a path to problems.
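For illustration, here is a minimal sketch of what the HTML::LinkExtor approach might look like for pulling the bare URL out of each <a href> in a chat message. The helper name `extract_hrefs` and the sample message are made up for this example and are not from the thread; only `new`, `parse`, and `eof` are the module's own calls. It assumes the HTML-Parser distribution is installed.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# Hypothetical helper: return the href value of every <a> tag in a
# chunk of HTML. Text before or after the tag does not matter.
sub extract_hrefs {
    my ($html) = @_;
    my @urls;

    # The callback is invoked once per link-bearing tag with the
    # lowercased tag name followed by attribute => value pairs.
    my $parser = HTML::LinkExtor->new(sub {
        my ($tag, %attr) = @_;
        push @urls, $attr{href} if $tag eq 'a' && defined $attr{href};
    });

    $parser->parse($html);
    $parser->eof;    # flush anything still buffered
    return @urls;
}

# Made-up sample message with text in front of the tag.
my $message = 'hi <b>all</b>, visit <a href="http://example.com/spam">here</a>';
print "$_\n" for extract_hrefs($message);
```

Because the parser handles the tag wherever it appears in the string, there is no need to strip the leading text first, which was the sticking point with Text::Balanced. Other link-bearing tags such as <img src> are also reported to the callback; the `$tag eq 'a'` check is what limits the result to anchors.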

    --Chris
