
Re: regex anchoring issue

by smls (Friar)
on Feb 15, 2013 at 12:01 UTC

in reply to regex anchoring issue

If you need your regex to match one of several possible text fragments of which at least one is longer than 1 character, you have to use an alternation (...|...|...) instead of a character class ([...]).
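A minimal sketch of the difference, assuming the delimiters are the ones discussed further down (the literal text "<SOH>", the two characters "^A", or chr(1)). The sample line is made up for illustration:

```perl
use strict;
use warnings;

my $line = "8=FIX.4.2<SOH>9=112<SOH>35=A";   # made-up sample line

# Character class: matches ONE character out of '<', 'S', 'O', 'H', '>',
# so every character of "<SOH>" acts as a separate delimiter.
my @wrong = split /[<SOH>]/, $line;

# Alternation: matches the whole token "<SOH>", a literal "^A", or chr(1).
my @right = split /<SOH>|\^A|\cA/, $line;

print scalar(@wrong), " fields with the character class\n";
print scalar(@right), " fields with the alternation\n";
```

With the character class you get a pile of empty fields between the real ones, because each of the five characters in "<SOH>" splits on its own.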

Some additional comments on your code:

  • The {1} quantifier in the first regex is redundant. Matching one occurrence is the default behavior if no quantifier is given.

  • The {1,} quantifier in the third regex can be more succinctly written as +.

  • The /g modifier is not needed in the first two regexes, as kcott already noted.

  • Before the first regex, the $_ =~ is redundant because Perl will match against that variable by default. Similarly, passing $_ as the second parameter to split is redundant.

  • I'm pretty sure you don't need to nest two foreach my $line (@split) { ... } loops... :)

  • The outer parentheses in the line ( $line =~ s/[\s\n]{1,}//g ) are redundant.

  • Inside the while loop, you create a new @split array for each iteration (i.e. for each line from the input file), but then you don't do anything with it. Did you just cut out the code that does something with the current line's @split array to keep your question shorter, or did you actually intend to add all split fragments from all lines into a single array that will be available after the end of the while loop? In the latter case you need to modify your code.

  • You can probably restructure the code to avoid specifying the split regex twice. It would depend on your exact requirements. For example, if each split fragment has to be followed by one of the "<SOH>"/"^A"/chr(1) markers, i.e. no line of the file may end like "8=FIX.4.2<SOH>8=FIX.4.2<SOH>8=FIX.4.2" with a lone fragment at the end, you could use an m/../g regex like this to do the splitting without calling split, and then print the error based on whether any matches were found:

    my @split;
    foreach (m/(.+?)(?:<SOH>|\^A|\cA)/g) {
        s/\s+//g;
        push @split, $_;
    }
    if (!@split) {
        # print error here
        last;
    }
    # do stuff with @split here
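To try the m/../g approach outside of your while loop, here is a self-contained sketch. The fragments() helper and the two sample lines are hypothetical, made up for this example, and are not taken from your code:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the cleaned fragments of one line, or an
# empty list if no delimiter-terminated fragment was found.
sub fragments {
    my ($line) = @_;
    my @split;
    for my $frag ($line =~ m/(.+?)(?:<SOH>|\^A|\cA)/g) {
        $frag =~ s/\s+//g;      # strip all whitespace from the fragment
        push @split, $frag;
    }
    return @split;
}

# One line using all three delimiter variants, and one with none at all.
my @ok  = fragments("8=FIX.4.2<SOH>9=112^A35=A\cA\n");
my @bad = fragments("no delimiters here\n");

print join('|', @ok), "\n";
print "no fragments found\n" if !@bad;
```

Note that a trailing fragment without a delimiter after it is silently dropped, which is only OK under the assumption stated above that every fragment must be followed by a marker.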
