PerlMonks  

Re^6: Encoding/decoding question

by slugger415 (Monk)
on Sep 12, 2011 at 20:20 UTC ( [id://925563]=note )


in reply to Re^5: Encoding/decoding question
in thread Encoding/decoding question

Heh, can't say I follow all that. I save the FB page as HTML from Firefox and run tidy on it to make it XHTML. I'm doing all this on Windows 7, so I have no idea how or where it's being encoded. Tidy does allow various encodings, but I seem to get wonky results no matter what I set it to.

Anyway, I tried running uniquote on a text file (test.txt) containing only this string:

sous réserve

Here's what I got:

> perl -nle 'print if /\P{ASCII}/' test.txt | uniquote.pl -vE cp1252
Can't find string terminator "'" anywhere before EOF at -e line 1.

Not sure what that means... appreciate the help...
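
The "Can't find string terminator" message is consistent with how cmd.exe on Windows treats quotes: single quotes are not quoting characters there, so perl most likely receives only `'print` as its -e program. Wrapping the one-liner in double quotes is one way around that. Another is to sidestep shell quoting entirely by putting the filter in a script file. A minimal sketch, assuming it is saved under the hypothetical name nonascii.pl (the sub name has_non_ascii is also just for illustration):

```perl
#!/usr/bin/perl
# nonascii.pl (hypothetical name): print only the lines that contain
# at least one non-ASCII character, roughly what the -nle one-liner does.
use strict;
use warnings;

# Return 1 if the line holds any character outside the ASCII range, else 0.
sub has_non_ascii {
    my ($line) = @_;
    return $line =~ /\P{ASCII}/ ? 1 : 0;
}

# Only read input when file names are given on the command line.
if (@ARGV) {
    while (my $line = <>) {
        print $line if has_non_ascii($line);
    }
}
```

It could then be run without any quoting at all: `perl nonascii.pl test.txt | uniquote.pl -vE cp1252`.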

Replies are listed 'Best First'.
Re^7: Encoding/decoding question
by Anonymous Monk on Sep 12, 2011 at 20:27 UTC
      thanks --

      OK, if I run uniquote, the first two seem correct:

      > perl -nle "print if /\P{ASCII}/" test.txt | uniquote.pl -vE cp1252
      sous r\N{LATIN SMALL LETTER E WITH ACUTE}serve
      
      > perl -nle "print if /\P{ASCII}/" test.txt | uniquote.pl -vE latin1
      sous r\N{LATIN SMALL LETTER E WITH ACUTE}serve
      
      
      > perl -nle "print if /\P{ASCII}/" test.txt | uniquote.pl -vE macroman
      sous r\N{LATIN CAPITAL LETTER E WITH GRAVE}serve
      

      So what's happening? (sorry, still clueless.)
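
The three results above are what one would expect if test.txt stores the é as the single byte 0xE9. That is an assumption, since the thread shows no hex dump of the file. Both cp1252 and latin1 map 0xE9 to é, while MacRoman maps the same byte to È, so the first two guesses look right and the third looks wrong. A rough sketch of the same comparison using the core Encode and charnames modules (the sub name name_non_ascii is made up for this example, and this is not how uniquote itself is implemented):

```perl
use strict;
use warnings;
use Encode qw(decode);
use charnames ();

# Decode raw bytes under the given encoding, then spell out every
# non-ASCII character as \N{CHARACTER NAME}.
sub name_non_ascii {
    my ($encoding, $bytes) = @_;
    my $text = decode($encoding, $bytes);
    $text =~ s/(\P{ASCII})/'\N{' . charnames::viacode(ord $1) . '}'/ge;
    return $text;
}

# Assumed file content: "sous réserve" with é stored as the byte 0xE9.
my $bytes = "sous r\xE9serve";

for my $encoding (qw(cp1252 latin1 MacRoman)) {
    print "$encoding: ", name_non_ascii($encoding, $bytes), "\n";
}
```

If the file were UTF-8 instead, the é would be two bytes (0xC3 0xA9), and decoding as cp1252 would yield two named characters rather than one.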
