<?xml version="1.0" encoding="windows-1252"?>
<node id="779308" title="Re: Need a RegExp for Images" created="2009-07-12 00:58:07" updated="2009-07-12 00:58:07">
<type id="11">
note</type>
<author id="44715">
graff</author>
<data>
<field name="doctext">
According to the "Useful tips" section (on page 7) of the [http://www.jpeg.org/public/jfif.pdf|official specs], you should be able to read the first 11 bytes of the file and match this pattern:
&lt;c&gt;
/\A\xff.\xff...JFIF\x00/s
&lt;/c&gt;
After looking at a handful of "*.jpg" files that I happened to upload from a camera (via "iPhoto", which may have "updated" some of those pictures after the upload), I would actually change that to:
&lt;c&gt;
/\A\xff.\xff...(?:JFIF|Exif)\x00/s
&lt;/c&gt;
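The spec also defines a "JFXX" identifier, but that marks an extension segment that follows the JFIF segment, so it should not turn up in the first 11 bytes. Frankly, though, I'd be content to trust the file name, and so would focus on the first reply above.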
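&lt;P&gt;
For what it's worth, here is a minimal sketch of that check, assuming the file is read as raw bytes. The sub name looks_like_jpeg is made up for illustration and is not from the spec or from any module. The \A anchor and the /s flag are additions in this sketch, so that the match starts at byte 0 and "." can match a 0x0A byte in the segment length field.
&lt;c&gt;
use strict;
use warnings;
use Fcntl qw(O_RDONLY);

# Hypothetical helper: returns 1 if the first 11 bytes of $path look
# like a JFIF or Exif JPEG header, 0 otherwise (including on open failure).
sub looks_like_jpeg {
    my ($path) = @_;
    sysopen(my $fh, $path, O_RDONLY) or return 0;
    binmode($fh);                        # raw bytes, no I/O layers
    my $got = read($fh, my $head, 11);   # only the first 11 bytes are needed
    close($fh);
    return 0 unless defined $got and $got == 11;
    # SOI marker, APPn marker, 2-byte length, identifier, NUL terminator
    return $head =~ /\A\xff.\xff...(?:JFIF|Exif)\x00/s ? 1 : 0;
}
&lt;/c&gt;
Calling looks_like_jpeg("photo.jpg") on a file name of your own should return 1 for a typical camera or JFIF file. This only inspects the header, so a truncated or otherwise damaged image would still pass.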
&lt;P&gt;
UPDATE: The above assumes that the file is being read in &lt;c&gt;:raw&lt;/c&gt; mode (which is what a plain &lt;c&gt;binmode FH&lt;/c&gt; gives you). Also, I agree with what [afoken] says below: if there is any doubt about whether the files are valid JPEGs, use a module to validate them.</field>
<field name="root_node">
779295</field>
<field name="parent_node">
779295</field>
</data>
</node>
