PerlMonks  

Re: Removing ANSI Color Codes

by Aristotle (Chancellor)
on May 31, 2003 at 13:33 UTC ( [id://262061]=note )


in reply to Removing ANSI Color Codes

castaway's list is no substitute for a specification of ANSI escapes, and her resulting regex suffers from the same problem as yours, although the two break on opposite cases: neither takes into account that you can put any number of colour codes (including just one) in the same escape sequence by separating them with semicolons. Yours will also match more than just escape sequences. Use:
s/\e\[\d+(?>(;\d+)*)m//g;
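
To make the semicolon point concrete, here is a minimal sketch of the substitution applied to a string that packs two parameters into one escape sequence. The sample text and the variable names are made up for illustration; only the regex comes from the post above.

```perl
use strict;
use warnings;

# Hypothetical sample: "bold" and "red" combined in one sequence (1;31),
# followed later by a single-parameter reset (0).
my $text = "\e[1;31mwarning\e[0m: disk at 91%";

# The substitution from the post: ESC, '[', one or more numeric
# parameters separated by semicolons, then the terminating 'm'.
(my $plain = $text) =~ s/\e\[\d+(?>(;\d+)*)m//g;

print "$plain\n";   # prints "warning: disk at 91%"
```

Both the two-parameter sequence and the one-parameter reset are removed by the same pattern, which is the case the earlier regexes each missed from one side.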

Makeshifts last the longest.

Re: Re: Removing ANSI Color Codes
by castaway (Parson) on May 31, 2003 at 13:39 UTC
    Point taken, thanks Aristotle. (MUDs don't ever use the semicolon syntax, they just hang the sequences one after the other; that's my excuse anyway. ;)

    C.

Re: Re: Removing ANSI Color Codes
by theorbtwo (Prior) on May 31, 2003 at 18:26 UTC

    I'm confused by your use of (?> ... ) here, Aristotle... any chance you could clear it up? (I did read the description in perlre; it didn't help clear things up.) Specifically, what does this pattern accomplish that s/\e\[\d+(;\d+)*m//g; doesn't?


    Warning: Unless otherwise stated, code is untested. Do not use without understanding. Code is posted in the hopes it is useful, but without warranty. All copyrights are relinquished into the public domain unless otherwise stated. I am not an angel. I am capable of error, and err on a fairly regular basis. If I made a mistake, please let me know (such as by replying to this node).

      Nothing, in this case. *g* It will just fail a notch faster in cases where it can't match.

      The reason is that once the (;\d+)* stops, if what follows isn't an m, the regex engine will backtrack, giving up a bit of what (;\d+)* matched, trying to find an m. Of course we know that neither the semicolon nor \d can match something that is an m, so no backtracking in the world is going to help and make it match.

      What (?>re) does is throw away all the intermediate states once re has matched, so if backtracking seems necessary, the engine will not remember how to backtrack into the middle of re. Effectively, if the engine fails to find an m after the (?>re), it will unmatch re all at once, rather than waste time doing so character by character.
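
As a rough check of the "nothing, in this case" claim, the following sketch runs both patterns over a few inputs and compares whether they match. The sample strings and variable names are assumptions for illustration; the sketch only shows that the two patterns agree on these inputs, not how much backtracking either one does.

```perl
use strict;
use warnings;

my $plain_re  = qr/\e\[\d+(;\d+)*m/;       # pattern without the atomic group
my $atomic_re = qr/\e\[\d+(?>(;\d+)*)m/;   # pattern with (?> ... )

# Hypothetical inputs: two valid colour sequences, and one that can
# never match because it ends in 'x' instead of 'm'.
my @samples = ("\e[31m", "\e[1;4;32m", "\e[1;2;3;4x");

my @results;
for my $s (@samples) {
    my $plain_hit  = $s =~ $plain_re  ? 1 : 0;
    my $atomic_hit = $s =~ $atomic_re ? 1 : 0;
    push @results, [$plain_hit, $atomic_hit];

    # Show the escape character as a literal \e so the output is readable.
    (my $shown = $s) =~ s/\e/\\e/;
    print "$shown  plain=$plain_hit atomic=$atomic_hit\n";
}
```

On the last input both patterns fail; the atomic version simply has fewer saved positions to retry before it gives up.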

      Makeshifts last the longest.
