PerlMonks  

Regexp to match regardless of whitespace

by austinj (Acolyte)
on May 03, 2012 at 16:34 UTC ( [id://968769]=perlquestion )

austinj has asked for the wisdom of the Perl Monks concerning the following question:

CONSTR(1,29) = TEXTHERE, TEXTHERE, TEXTHERE, TEXTHERE, 90.0, (10 * 0.01 /TEXTHERE), ! P.O. TEXTHERE

I have that line and many others like it. I need to be able to match it no matter what whitespace appears between the text, parentheses, commas, etc.

So the pattern should match

" w o r ds ( 345 ) 3 45 other wor ds, aks "

and

"words(345)345otherwords,aks"

and

"words (345) 345 otherwords, aks"

etc. I know I can put a \s* between every character in the expression, but is there an easier way? I looked through the modifiers and couldn't find one.

Replies are listed 'Best First'.
Re: Regexp to match regardless of whitespace
by MidLifeXis (Monsignor) on May 03, 2012 at 16:43 UTC

    How about removing the whitespace in the original string first?

    $text =~ s/\s//g;    # or tr///, etc
    do_something( $text ) if $text =~ /your_pattern_here/;

    --MidLifeXis

      Good idea; I may be able to make that work.

      The issue is that my file is several MB and the VAST majority of lines will not match - so I'd like to check for a match before wasting the processing time to remove whitespace.

      Also, I need to keep the original string (with its spaces), because I have to put it back as it was after changing a couple of things.

        With regard to the first paragraph, I would say that “this consideration is of no consequence,” because the computer can strip the whitespace from each line almost instantly using a single regex.

        With regard to the second one, however, there could be more of a problem: when you design the code to “change a couple things,” you will need to be careful that it always changes the right things and does so consistently in all cases.

        Nevertheless, these concerns are basically independent of one another, and therefore I would proceed on this course.   Removing the white space will allow you to use a regex efficiently to winnow the lines-of-interest wheat from millions of lines of chaff, and that alone is enough.
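
One way to reconcile the suggestion above with the need to keep the original spacing is to strip whitespace from a copy of each line and test the pattern against that copy. The following is only a minimal sketch: the helper name matches_ignoring_whitespace, the target pattern, and the sample lines are assumptions made up for illustration (the samples are taken from the question).

```perl
use strict;
use warnings;

# Hypothetical helper: true if $line matches $pattern once all whitespace
# is ignored. Only a copy is modified; the caller's string is untouched.
sub matches_ignoring_whitespace {
    my ($line, $pattern) = @_;
    (my $squeezed = $line) =~ s/\s+//g;    # strip whitespace from the copy
    return $squeezed =~ $pattern ? 1 : 0;
}

# Assumed target, written without any whitespace.
my $pattern = qr/words\(345\)345otherwords,aks/;

my @lines = (
    ' w o r ds ( 345 ) 3 45 other wor ds, aks ',
    'words(345)345otherwords,aks',
    'words (345) 345 otherwords, aks',
    'some unrelated line',
);

for my $line (@lines) {
    # $line still holds the original spacing at this point.
    print "match: [$line]\n" if matches_ignoring_whitespace($line, $pattern);
}
```

Because the original line is never altered, it can be edited and written back afterwards; deciding which parts to change is still a separate problem.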

Re: Regexp to match regardless of whitespace
by JavaFan (Canon) on May 03, 2012 at 17:05 UTC
    There's no modifier that lets you skip anything.

    If you're just grepping for string literals, it's trivial to change it to allow for spaces:

    my $pattern = join '\s*', split //, "whatever";
    while (<>) {
        print if /$pattern/;
    }
    Otherwise, I'd do as suggested: remove the whitespace in the line, then apply the pattern.
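
If the literal contains regex metacharacters, such as the parentheses in the question's examples, each character probably needs escaping before the \s* is joined in. A rough sketch of that variation, assuming a hypothetical helper named loose_pattern and using the sample strings from the question:

```perl
use strict;
use warnings;

# Hypothetical helper: build a regex matching $literal with optional
# whitespace allowed between any two of its non-whitespace characters.
sub loose_pattern {
    my ($literal) = @_;
    (my $compact = $literal) =~ s/\s+//g;    # ignore spaces in the literal itself
    my $body = join '\s*', map { quotemeta($_) } split //, $compact;
    return qr/$body/;
}

my $re = loose_pattern('words(345)345otherwords,aks');

for my $line (
    ' w o r ds ( 345 ) 3 45 other wor ds, aks ',
    'words(345)345otherwords,aks',
    'words (345) 345 otherwords, aks',
) {
    print "match: [$line]\n" if $line =~ $re;
}
```

Since this pattern is applied to the unmodified line, the original spacing is still there if the line has to be edited and written back.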
