PerlMonks
Re: Cleaning up text for indexing in DB
by t'mo (Pilgrim) on Jul 16, 2003 at 15:23 UTC (#274864)
As other monks suggested in another thread, why not fetch the text with the program 'lynx' and let it get rid of the HTML junk for you?

$text = `lynx -dump $url`;

...every application I have ever worked on is a glorified munger...
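
The backtick one-liner hands $url to the shell, so a URL containing shell metacharacters could be misread. Below is a minimal sketch of the same idea that skips the shell by using the list form of a piped open. The helper names lynx_command and fetch_text are hypothetical, the http(s)-only check is an assumption about what you want to index, and -nolist (which drops the numbered link list lynx appends to a dump) is an optional extra that was not part of the original suggestion.

```perl
use strict;
use warnings;

# Hypothetical helper: build the argument list for a lynx text dump.
sub lynx_command {
    my ($url) = @_;
    # Assumption: only http(s) URLs are wanted. This also prevents a
    # value starting with "-" from being read by lynx as an option.
    die "unsupported URL: $url\n" unless $url =~ m{\Ahttps?://}i;
    return ('lynx', '-dump', '-nolist', $url);
}

# Hypothetical helper: run lynx and return the rendered text.
sub fetch_text {
    my ($url) = @_;
    my @cmd = lynx_command($url);
    # List-form piped open runs the program directly, without a shell,
    # so metacharacters in $url are passed through literally.
    open(my $fh, '-|', @cmd) or die "cannot run lynx: $!\n";
    my $text = do { local $/; <$fh> };   # slurp all output at once
    close($fh) or die "lynx exited with status $?\n";
    return $text;
}

# Usage: perl dump.pl http://example.com/
if (@ARGV) {
    print fetch_text($ARGV[0]);
}
```

The text that comes back is still raw page text, so whatever munging you do before indexing (case folding, stop words, and so on) happens after this step.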