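Memory that a process has allocated is generally not released back to the operating system until the process exits; freed memory only becomes available for reuse within the same process (see e.g. http://c-faq.com/malloc/free2os.es.html). The exception to this is mmapped memory, typically used for fast file access.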
I suspect your primary problem is the actual parsing of the humongous HTML pages. That simply takes a lot of memory, although the memory should still be freed and reused by the Perl interpreter afterwards. That said, I found this choice quote in the HTML::TreeBuilder documentation:
4. previous versions of HTML::TreeBuilder required you to call $tree->delete() to erase the contents of the tree from memory when you're done with the tree. This is not normally required anymore. See "Weak References" in HTML::Element for details.
One general technique for keeping the memory use to a minimum is to fork off a child for every (large) request, and pass the bare-minimum result (here, @messages) to the parent. When the child exits, all of its memory goes back to the operating system, so the parent never grows. It's far from elegant and relatively difficult to implement, and the peak memory use will still be the same, just not in the parent process. A last-resort option.
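A minimal sketch of the fork approach, assuming a Unix-like system with a real fork(). The names run_in_child and extract_messages are hypothetical; extract_messages only stands in for your actual parsing code and builds some throwaway data to simulate the memory-hungry step. The result is passed back over a pipe using the core Storable module, which is one of several ways to do it.

```perl
use strict;
use warnings;
use Storable qw(store_fd fd_retrieve);

# Hypothetical stand-in for the memory-hungry step (e.g. parsing a huge
# HTML page). It returns only the small result the parent needs.
sub extract_messages {
    my ($request) = @_;
    my @scratch = map { "${request}-$_" } 1 .. 100_000;   # large temporary data
    return [ @scratch[0 .. 2] ];                           # bare-minimum result
}

# Run $work->(@args) in a forked child and hand its result to the parent.
# $work must return a reference, because Storable serializes references.
sub run_in_child {
    my ($work, @args) = @_;

    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: do the heavy work, serialize the result, then exit.
        close $reader;
        store_fd($work->(@args), $writer) or die "store_fd failed";
        close $writer;
        exit 0;
    }

    # Parent: read the serialized result, then reap the child.
    close $writer;
    my $result = fd_retrieve($reader);
    close $reader;
    waitpid($pid, 0);
    die "child exited with status $?" if $?;
    return $result;
}

my @messages = @{ run_in_child(\&extract_messages, 'page') };
print "$_\n" for @messages;
```

Only the three-element result crosses the pipe; the 100,000-element scratch array lives and dies in the child. A real implementation would also want timeouts and handling for a child that dies before writing anything, which is part of why this remains a last resort.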