My program grabs web pages and stores two versions of each in a database:
- A version with all HTML tags removed, for which I use HTML::Strip. The remaining text is used to build a full-text index for later searches;
- A version exactly as the page was when it was downloaded. This one is used to show the user the page as it looked at that date and time.
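
To make the setup concrete, here is a minimal sketch of the storage step. It assumes DBI with an in-memory SQLite database (DBD::SQLite); the `pages` table, its columns, and the helper names `create_schema`, `strip_tags`, and `store_page` are all hypothetical and not taken from my actual program. Only the `HTML::Strip` calls (`new`, `parse`, `eof`) are the module's documented interface.

```perl
use strict;
use warnings;
use DBI;
use HTML::Strip;
use POSIX qw(strftime);

# Hypothetical schema: one row per download, holding both versions.
sub create_schema {
    my ($dbh) = @_;
    $dbh->do(q{
        CREATE TABLE IF NOT EXISTS pages (
            id         INTEGER PRIMARY KEY,
            url        TEXT NOT NULL,
            fetched_at TEXT NOT NULL,
            raw_html   TEXT NOT NULL,
            plain_text TEXT NOT NULL
        )
    });
}

# Return the non-tag content of an HTML document.
sub strip_tags {
    my ($html) = @_;
    my $hs   = HTML::Strip->new();
    my $text = $hs->parse($html);
    $hs->eof;    # reset parser state before the object is discarded
    return $text;
}

# Store the page as downloaded, plus its tag-free text for indexing.
sub store_page {
    my ($dbh, $url, $html) = @_;
    my $fetched_at = strftime('%Y-%m-%dT%H:%M:%SZ', gmtime);
    $dbh->do(
        'INSERT INTO pages (url, fetched_at, raw_html, plain_text) VALUES (?, ?, ?, ?)',
        undef, $url, $fetched_at, $html, strip_tags($html),
    );
}

# Usage with an in-memory database (assumption: DBD::SQLite is installed).
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '', { RaiseError => 1 });
create_schema($dbh);
store_page($dbh, 'http://example.com/', '<html><body><p>Hello</p></body></html>');
```

The `raw_html` column is what gets shown back to the user, and `plain_text` is what the full-text index would be built from.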