There's a very good chance the OP is using a CPAN module. Further, the question is not about parsing HTML as such but about handling incomplete HTML. Finally, the OP hasn't tried "...to code it from scratch".
The OP asked a very good question (not, as he/she fears, in the least stupid) and made a reasonable stab at the answer. Corion has helpfully given some pointers that may well help find a solution.
++ to the OP and ++ to Corion. Your contribution? I'll leave that as an exercise for the reader.