PerlMonks  

Re^3: substituting 1 escaped character for another

by Marshall (Canon)
on Jul 26, 2018 at 21:35 UTC (#1219339)


in reply to Re^2: substituting 1 escaped character for another
in thread substituting 1 escaped character for another

I'm guessing a bit here, but when you say "The application executes in the WWW environment", I take it you are trying to modify an HTML page? Show us a segment of the HTML page and the code you use to decode and encode text -> HTML. Your regex will work fine on regular text, but I suspect that is not what you actually have.

In recent memory, I did a quickie kludge to handle the ampersand character in one "get it done right now" LWP application: $clubName =~ s/&/&amp;/g; The entity &amp; is what HTML needs in order to display the ASCII & character. I suspect something similar is going on. We need more info...
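To make the ampersand kludge concrete, here is a minimal sketch of that kind of substitution, extended to the other characters HTML treats specially. The helper name escape_html and the sample string are assumptions for illustration, not the code from the LWP application mentioned above.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters HTML treats specially.
# The ampersand is replaced first; otherwise the entities produced by
# the later substitutions would themselves be escaped a second time.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

my $clubName = 'Rod & Gun Club';    # assumed sample value
print escape_html($clubName), "\n"; # prints: Rod &amp; Gun Club
```

For anything beyond a quick fix, encode_entities from the CPAN module HTML::Entities is probably a better choice than hand-rolled substitutions.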

Re^4: substituting 1 escaped character for another
by Veltro (Hermit) on Jul 26, 2018 at 22:42 UTC

    Yes, good thinking there. More guessing:

    The string may contain 'non-printable' characters. It may have been copied and pasted from an editor, bringing along characters that are not visible but are actually there. (Something like this kept me awake one night.)

    This may also be Unicode related. Did some googling, found a good example here: https://www.soscisurvey.de/tools/view-chars.php

      I think we just need more info about what the OP actually has. All of these Unicode and HTML problems can be solved. The problem statement as it stands is not correct: the OP's code "works", albeit not in the best way. I have often had to resort to viewing a file in binary to find "hidden" characters. That is one possibility, although I don't think it is likely if this is an HTML page that renders properly in a browser.
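One way to hunt for "hidden" characters without leaving Perl is to print the string with everything outside printable ASCII turned into a visible escape. This is only a sketch: the helper name show_hidden and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every character outside printable ASCII
# (0x20 to 0x7E) with a visible \x{..} escape showing its code point.
sub show_hidden {
    my ($text) = @_;
    $text =~ s/([^\x20-\x7E])/sprintf('\\x{%02X}', ord $1)/ge;
    return $text;
}

# Assumed sample: a no-break space (U+00A0) and a tab hiding in the text.
my $sample = "a\x{A0}b\tc";
print show_hidden($sample), "\n";   # prints: a\x{A0}b\x{09}c
```

Running the suspect string through something like this before applying the substitution shows whether the regex is really seeing the characters you think it is.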

        Don't assume that HTML cannot have weird characters.

        I was scanning a web page one time and this is how one of the lines of my code ended up:

        if ( $tmpTxt =~ /\<option value=\"$c2\.php\?blue\=(.+?)\"/ ) { # removed the \> because for some there is some kind of weird character between the " and the >
            print $filehandle $1 . "\n" ;
        }
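An alternative to dropping the closing > from the pattern is to allow any run of characters other than > between the quote and the bracket. The sketch below assumes a made-up page name and sample line; extract_blue is a hypothetical helper, not the original scanner.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the "blue" parameter out of an <option> tag,
# tolerating stray characters between the closing quote and the >.
sub extract_blue {
    my ($html, $page) = @_;
    # \Q...\E quotes any regex metacharacters in the page name;
    # [^>]* skips whatever sits between the closing quote and the >.
    return $html =~ /<option value="\Q$page\E\.php\?blue=(.+?)"[^>]*>/
        ? $1
        : undef;
}

# Assumed sample line with a no-break space (U+00A0) before the >.
my $tmpTxt = qq{<option value="colors.php?blue=42"\x{A0}>Blue</option>};
print extract_blue($tmpTxt, 'colors'), "\n";   # prints: 42
```

For pages with less predictable markup, a real HTML parser is likely to be more robust than any regex.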
Re^4: substituting 1 escaped character for another
by Anonymous Monk on Jul 26, 2018 at 21:48 UTC

    I'm going to play with it some more. I'll be back. Thanks everyone.
