PerlMonks  

Re: Distinguishing text from binary data

by inman (Curate)
on Oct 05, 2004 at 10:41 UTC


in reply to Distinguishing text from binary data

Your code is a little restrictive: it treats linefeeds, whitespace, punctuation, etc. as non-characters and then decides that something is text if there are fewer than 100 of them. Try changing your code to work on ranges of the ASCII table, and then use a percentage as your test.
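A minimal sketch of that idea, not the original poster's code: count the bytes that fall outside the usual text ranges of the ASCII table and compare their share of the input against a threshold. The name looks_like_text, the choice of ranges (tab, LF, CR, and 0x20..0x7E), and the 10% default are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $data looks like text, 0 otherwise.
# $threshold is the largest tolerated fraction of suspicious bytes
# (the 0.10 default is an assumption; tune it for your data).
sub looks_like_text {
    my ($data, $threshold) = @_;
    $threshold = 0.10 unless defined $threshold;

    my $length = length $data;
    return 1 if $length == 0;    # treating empty input as text is an assumption

    # tr with /c and an empty replacement list counts (without modifying)
    # every character NOT in the list: tab, LF, CR, printable ASCII.
    my $suspicious = ($data =~ tr/\x09\x0A\x0D\x20-\x7E//c);

    return ($suspicious / $length) <= $threshold ? 1 : 0;
}

print looks_like_text("Hello, world!\n")     ? "text\n" : "binary\n";
print looks_like_text("\x00\x01\x02\xFFabc") ? "text\n" : "binary\n";
```

Using a ratio rather than an absolute count means a long text with a few stray bytes still passes, while a short binary blob does not slip under a fixed limit.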

Re^2: Distinguishing text from binary data
by maard (Pilgrim) on Oct 06, 2004 at 10:26 UTC
    Also, don't forget about the non-English encodings in which form data can be sent (English-speaking coders often forget about them :-) ). IMO, the presence of 0x00..0x1F bytes in data such as an HTTP response can mark it as binary (unless the form is sent in UTF-8). So maybe you should take the charset from the Content-Type header into consideration, and only then analyze the byte/character stream.
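    One way this could be sketched, under stated assumptions: read the charset parameter from the Content-Type value, decode the body with the core Encode module, and only then look for control characters. The helper names charset_from_content_type and body_is_text are hypothetical, as are the iso-8859-1 fallback and the decision to let tab, LF, and CR through.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: extract the charset parameter from a Content-Type value.
sub charset_from_content_type {
    my ($content_type) = @_;
    if (defined $content_type
        && $content_type =~ /;\s*charset\s*=\s*"?([^\s;"]+)/i) {
        return lc $1;
    }
    return;    # no charset parameter present
}

# Hypothetical helper: returns 1 if the body decodes cleanly in the declared
# charset and contains no control characters other than tab, LF, and CR.
sub body_is_text {
    my ($bytes, $content_type) = @_;

    # Fallback charset is an assumption, not something the header guarantees.
    my $charset = charset_from_content_type($content_type) || 'iso-8859-1';

    # Unknown charset name: treat the body as binary.
    my $enc = Encode::find_encoding($charset) or return 0;

    # FB_CROAK makes malformed input die; LEAVE_SRC keeps $bytes untouched.
    my $chars = eval {
        $enc->decode($bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
    };
    return 0 unless defined $chars;

    return $chars =~ /[\x00-\x08\x0B\x0C\x0E-\x1F]/ ? 0 : 1;
}

print body_is_text("caf\xC3\xA9\n", 'text/html; charset=UTF-8')
    ? "text\n" : "binary\n";
```

    Checking after decoding matters because the same byte can be harmless in one encoding and a control character in another.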
