http://www.perlmonks.org?node_id=1054398

hisyc has asked for the wisdom of the Perl Monks concerning the following question:

Is there a way to match a string mixed with binary data?
 $string=`head /bin/ls`; $string=~/.*linux.*/;
It does not match. Is there a modifier that can do that? I tried /l, /u, /a, and /x, but none of them helps. Thanks!

Replies are listed 'Best First'.
Re: Regular expression to match binary data among text data
by choroba (Cardinal) on Sep 17, 2013 at 08:07 UTC
    For me, it matches. How do you test the success of the match?
    $string = `head /bin/ls`; $string =~ /.*linux.*/ and print "Matches.\n";
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: Regular expression to match binary data among text data (not)
by Anonymous Monk on Sep 17, 2013 at 08:09 UTC

    It does not match.

    How do you know that? The code you posted won't tell you whether there was a match.

    Also, what are you trying to match? What do you think that pattern is supposed to match?

    Also, what is the data you're trying to match against?

    Basic debugging checklist item 4 ( Dumper )
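
    A minimal sketch of that kind of check. The hard-coded byte string is only a stand-in for the output of `head /bin/ls` (an assumption, so the example runs anywhere). Setting $Data::Dumper::Useqq makes Dumper print non-printable bytes as escapes, and the result of the match is tested explicitly rather than discarded.

```perl
use strict;
use warnings;
use Data::Dumper;

# Print non-printable bytes as escapes instead of raw binary.
$Data::Dumper::Useqq = 1;

# Stand-in for the output of `head /bin/ls` (hypothetical sample data).
my $string = "\x7fELF\x02\x01\x01\x00/lib64/ld-linux-x86-64.so.2\x00";

# Show exactly what the pattern is being matched against.
print Dumper($string);

# Keep the result of the match and report it.
my $matched = $string =~ /linux/ ? 1 : 0;
print $matched ? "Matches.\n" : "No match.\n";
```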

Re: Regular expression to match binary data among text data
by RichardK (Parson) on Sep 17, 2013 at 10:51 UTC

    I'm not sure if this is relevant, but using file rather than head would give you a more sensible message to match against.

Re: Regular expression to match binary data among text data
by Jim (Curate) on Sep 17, 2013 at 21:31 UTC
    $string=~/.*linux.*/;
              ^^     ^^

    The two .* subpatterns in this regular expression serve no purpose, and they make the regular expression engine do unnecessary work. You should remove them.

    $string =~ m/linux/;
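
    A small sketch of the comparison, using a hypothetical byte string (not the real contents of /bin/ls) that mixes binary bytes, newlines, and the text being searched for. Both patterns succeed or fail together; the .* parts only add work.

```perl
use strict;
use warnings;

# Hypothetical sample: text surrounded by binary bytes and newlines.
my $data = "\x00\x01\n\xff\xfelinux\x00\n\x7f";

# An unanchored pattern already finds the substring anywhere in the data.
my $with_dots    = $data =~ /.*linux.*/ ? 1 : 0;
my $without_dots = $data =~ /linux/     ? 1 : 0;

print "with .*: $with_dots, without .*: $without_dots\n";
```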

    What are you setting the encoding of the input data to?

    Jim

Re: Regular expression to match binary data among text data
by hisyc (Initiate) on Sep 18, 2013 at 08:22 UTC
    Thanks for the replies. Yes, you are right, it actually matches. The regular expression has no problem matching binary data. My problem turns out to be in XML::Simple->XMLin.
    use XML::Simple;
    use Data::Dumper;   # needed for Dumper
    my $ref = XMLin( "xml_invalid_char.input" );
    print Dumper $ref;
    Here is the input file content:
    <description>?s«ndjfr?1 S334µµ!</description>
    (Some strange characters cannot be pasted here.) The error is:
    not well-formed (invalid token) at line 1, column 22, byte 23 at /usr/lib/perl5/XML/Parser.pm line 187
    Is there a way to make XMLin accept this input?