http://www.perlmonks.org?node_id=347020


in reply to Re: Getting File Type using Regular Expressions
in thread Getting File Type using Regular Expressions

All we have in any situation is context and convention. Intuition won't solve everything, and computational completeness won't solve everything. Perhaps the byte sequence "P100s of Samsung are really cool phones" is a perfectly well-formed 6x6 pixel GIF file. You can only guess at the intent, and more data gives a better guess. That's why they call them 'heuristics.'

That said, malicious users will exploit any such heuristic assumptions in their favor; consider a file named Britney.jpg.exe. If your upload code expects web-intended images and only wants to accept web-intended images, it benefits the system to require that every available heuristic pass muster. If the extension is not .jpg, toss it. If the file does not start with the JPEG magic bytes, toss it. If the ImageMagick tool says the pixel dimensions are over 10000 in either dimension, toss it.
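To make the "every heuristic must pass" idea concrete, here is a rough sketch in Python (not Perl, and not anyone's production code). The function name looks_like_web_jpeg and its parameters are made up for illustration. The width and height are assumed to have been obtained elsewhere, for example from an ImageMagick run, since this sketch does not parse the image itself.

```python
import os

# A JPEG stream starts with the SOI marker (FF D8) followed by FF,
# the first byte of the next marker.
JPEG_MAGIC = b"\xff\xd8\xff"
MAX_DIMENSION = 10000  # limit taken from the post above


def looks_like_web_jpeg(filename, data, width=None, height=None):
    """Hypothetical helper: return True only if every available check passes."""
    # Check 1: only the final extension counts, so Britney.jpg.exe is rejected.
    ext = os.path.splitext(filename)[1].lower()
    if ext != ".jpg":
        return False

    # Check 2: the content must begin with the JPEG magic bytes.
    if not data.startswith(JPEG_MAGIC):
        return False

    # Check 3: reject oversized images when the dimensions are known.
    # (width/height are assumed to come from an external tool.)
    for dim in (width, height):
        if dim is not None and dim > MAX_DIMENSION:
            return False

    return True
```

Passing all three checks still proves nothing about intent; it only means the file failed to look suspicious in the ways we thought to test. Each added check narrows the room an attacker has to work with.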

--
[ e d @ h a l l e y . c c ]