
Detecting 'binary' in a variable

by kirbyk (Friar)
on Jul 05, 2005 at 17:39 UTC (#472535=perlquestion)
kirbyk has asked for the wisdom of the Perl Monks concerning the following question:

I have an application running under Apache/mod_perl where a user can upload a CSV file. The file gets uploaded and sits in a Perl variable, eventually to be loaded into an Oracle CLOB.

I want to detect whether the file they've uploaded is binary or text only, and give the user a helpful error message (for example, if they upload a .xls file). Note that the file never exists on a filesystem, so I can't use any Unix tricks, and I don't want to write out a temp file.

I figure I can go character-by-character in a loop and look at the ASCII values, but that seems horribly inefficient. Is there a quick regex that could do this check? I'm not worried about Unicode characters, but it'd be nice if extended ASCII characters through, say, 165 were allowed (to get all the accented characters).

-- Kirby,

Replies are listed 'Best First'.
Re: Detecting 'binary' in a variable
by Transient (Hermit) on Jul 05, 2005 at 17:49 UTC
    Would a simple

        if ( $file =~ /[^\x00-\xA5]/ ) {
            # binary
        } else {
            # text
        }

    suffice?

    Update: Also looks like there's a CGI::UploadEasy method "fileinfo" (in case you're using or could use that module)

      Not always

      $ perl -le '{local$/; $_=<>;}print /^[\x00-\xA5]/ ? "binary" : "text"' \
          /mnt/win/WINDOWS/system32/
      text
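        That's correct, but that's not the same regexp: the test above uses /^[\x00-\xA5]/ (an anchored character class), whereas the suggestion was /[^\x00-\xA5]/ (a negated character class).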

      Thanks, that regex does the trick.

      -- Kirby,
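
A minimal sketch of the negated-character-class check discussed in this sub-thread. The regex suggested above treats every byte from 0x00 through 0xA5 as text, so NUL and other control characters pass. The variant below is an assumption, not what was posted: it allows only tab, LF, CR, and 0x20 through 0xA5. The helper name looks_binary is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $data contains any byte outside the
# set assumed to be "text" here (tab, LF, CR, and 0x20 through 0xA5).
sub looks_binary {
    my ($data) = @_;
    return $data =~ /[^\x09\x0A\x0D\x20-\xA5]/ ? 1 : 0;
}

# \x82 is an accented "e" in the CP437 flavour of extended ASCII,
# which is below the 165 (0xA5) cut-off mentioned in the question.
my $upload = "name,city\r\nJos\x82,Lima\r\n";
print looks_binary($upload) ? "binary\n" : "text\n";
```

The regex engine stops at the first offending byte, so a binary upload is usually rejected after scanning only a few characters. Whether control characters should count as text depends on the data you expect.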

Re: Detecting 'binary' in a variable
by brian_d_foy (Abbot) on Jul 05, 2005 at 18:01 UTC

    In Perl 5.8, you can open a virtual filehandle on a scalar reference. That might do the trick for you. If you are using an older perl, Tie::Handle::ToMemory does the same thing. You could then use the file test operators, or something like File::Magic. If that doesn't work for you, you can try to match a specific signature for an Excel file (or whatever you might get) with what you see in the uploaded data, but that's a lot more work.

    Good luck!

    brian d foy
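
A rough sketch of the two ideas in this reply combined: opening an in-memory filehandle on a scalar reference (available from Perl 5.8) and comparing the leading bytes with a known signature. The eight bytes used here are the OLE2 compound-document header found at the start of legacy .xls files; newer .xlsx files are ZIP archives and would need a different check. The names $OLE2_MAGIC and starts_with_ole2 are hypothetical.

```perl
use strict;
use warnings;

# OLE2 compound-document signature (start of legacy .xls, .doc, .ppt files).
my $OLE2_MAGIC = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1";

# Hypothetical helper: takes a reference to the uploaded data and returns 1
# if the data begins with the OLE2 signature, 0 otherwise.
sub starts_with_ole2 {
    my ($data_ref) = @_;

    # Perl 5.8+: a scalar reference as the third argument gives an
    # in-memory filehandle, so no temp file is written.
    open my $fh, '<', $data_ref
        or die "Cannot open in-memory handle: $!";
    binmode $fh;

    my $want = length $OLE2_MAGIC;
    my $got  = read($fh, my $head, $want);
    close $fh;

    return (defined($got) && $got == $want && $head eq $OLE2_MAGIC) ? 1 : 0;
}

my $upload = $OLE2_MAGIC . ("\x00" x 16);   # stand-in for uploaded data
print starts_with_ole2(\$upload)
    ? "OLE2 signature found\n"
    : "no OLE2 signature\n";
```

For a check this small, substr($data, 0, 8) would do the same job without a filehandle. The handle is mainly useful when the data has to be passed to code that expects to read from a file.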
