First of all, you probably don't need the exact content length. You just need to know whether a given URL returns more data than some size threshold, so start by deciding what that threshold is.
Then, instead of using the higher-level HTTP functions, open a socket and read the response yourself, up to your maximum size limit. While you're reading into your buffer, you can parse the Content-Length header if one comes along. If the header is present, you can decide right there whether to stop reading or keep going and read the full file. If there's no Content-Length header, keep reading until you either reach the end of the response or hit your threshold. Hope this makes sense.
By the way, I think it's possible for a server to lie about Content-Length and get away with it, so don't treat the header as proof of the real size.
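Here's a rough sketch of the idea in Python, in case it helps. The function names (`parse_content_length`, `body_exceeds`, `url_exceeds`), the 64 KB header cap, and the default limit are all my own assumptions for illustration. It only handles plain HTTP (no TLS, no redirects) and sends an HTTP/1.0 request with `Connection: close`, which is intended to avoid chunked responses and make the server close the socket when it's done. If the header already claims more than the limit it stops right away; otherwise it keeps counting the bytes that actually arrive, since the header can't be fully trusted.

```python
import socket

MAX_HEADER_BYTES = 65536  # assumed cap so a broken server can't feed headers forever


def parse_content_length(head: bytes):
    """Return the Content-Length value from a raw header block, or None if absent/invalid."""
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"content-length":
            try:
                return int(value.strip())
            except ValueError:
                return None
    return None


def body_exceeds(sock, limit, chunk_size=4096):
    """Read an HTTP response from sock; return True if the body is larger than limit bytes."""
    buf = b""
    # Read until the blank line that ends the headers (or EOF, or the header cap).
    while b"\r\n\r\n" not in buf and len(buf) < MAX_HEADER_BYTES:
        chunk = sock.recv(chunk_size)
        if not chunk:
            break
        buf += chunk

    head, _, body = buf.partition(b"\r\n\r\n")

    # If the server declares a size over the limit, stop reading immediately.
    declared = parse_content_length(head)
    if declared is not None and declared > limit:
        return True

    # No header, or the header claims a small body: count what actually arrives.
    received = len(body)
    while received <= limit:
        chunk = sock.recv(chunk_size)
        if not chunk:  # server closed the connection before we passed the limit
            return False
        received += len(chunk)
    return True


def url_exceeds(host, path="/", limit=1_000_000, port=80, timeout=10):
    """Fetch http://host:port/path over a raw socket and apply body_exceeds."""
    request = (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n\r\n"
    ).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        return body_exceeds(sock, limit)
```

You'd call it with something like `url_exceeds("example.com", "/big.iso", limit=5_000_000)`. Either way you never pull down more than roughly the limit plus one chunk, which is the whole point of doing the read yourself.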