RegEX Doubt

sandy105 has asked for the wisdom of the Perl Monks concerning the following question:


Re: RegEX Doubt
by hippo (Bishop) on Aug 19, 2014 at 12:12 UTC

Yes, you do have to account for all the characters in your regex. It might be simpler just to split on the delimiters, though:

echo '[part1-date] log - [..part2..] [..part3..] part4' | perl -ne 'my @r = split (/[[\]]/, $_); print @r[1,3,6];'
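
To see why indices 1, 3 and 6 pick out the wanted parts, it may help to list every field that split returns. What follows is only a minimal sketch using the same sample line; the variable names $line, $part1, $part2 and $part4 are illustrative and not from the one-liner above.

```perl
use strict;
use warnings;

my $line = '[part1-date] log - [..part2..] [..part3..] part4';

# Split on either bracket character. The line starts with '[',
# so field 0 is the empty string in front of it.
my @r = split /[\[\]]/, $line;

# Show every field together with its index.
printf "%d: |%s|\n", $_, $r[$_] for 0 .. $#r;

# Fields 1, 3 and 6 are part1, part2 and the trailing text.
my ($part1, $part2, $part4) = @r[1, 3, 6];
$part4 =~ s/^\s+//;    # drop the space left over after the last ']'
print "$part1, $part2, $part4\n";
```

Fields 2 and 4 hold the text between the bracketed sections, and field 5 is the third bracketed part, which is why the slice skips them.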

Re: RegEX Doubt
by Athanasius (Archbishop) on Aug 19, 2014 at 12:07 UTC

Hello sandy105,

Yes, you have to allow for text such as “info - ” between the bracketed parts. But you don’t have to capture it. No need to capture the contents of the third bracketed part either, if you don’t need it:

#! perl
use strict;
use warnings;

while (<DATA>)
{
    / \[ ([^]]+) \] .* \[ ([^]]+) \] .* \[ [^]]+ \] \s+ (.*) /x or next;

    my ($part1, $part2, $part3) = ($1, $2, $3);

    print "1: |$part1| 2: |$part2| 3: |$part3|\n";
}

__DATA__
[part1-dateA] info - [..part2..] [..part3..] part4

[part1-dateB] log - [..part2..] [..part3..] part4

Output:

22:04 >perl 972_SoPW.pl
1: |part1-dateA| 2: |..part2..| 3: |part4|
1: |part1-dateB| 2: |..part2..| 3: |part4|

22:04 >

Note: I’ve added an /x modifier to the regex and whitespace within to make it easier to read.

Hope that helps,

Athanasius <°(((>< contra mundum Iustus alius egestas vitae, eros Piratica,


Re^2: RegEX Doubt
by sandy105 (Scribe) on Aug 19, 2014 at 15:58 UTC

Thank you very much. For anyone referencing this later: /x is a modifier that allows whitespace inside the pattern. Possible use cases include commenting a regex and spacing it out to improve legibility.
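
As a rough illustration of the commenting use case, the regex from the reply above could be spread over several lines with a comment on each piece. This is only a sketch; the m{...}x delimiters, the list assignment and the variable names are choices made here, not part of the original answer.

```perl
use strict;
use warnings;

my $line = '[part1-dateA] info - [..part2..] [..part3..] part4';

# In list context a successful match returns the captured groups.
my ($part1, $part2, $part4) = $line =~ m{
    \[ ([^]]+) \]    # first bracketed part, captured
    .*
    \[ ([^]]+) \]    # second bracketed part, captured
    .*
    \[  [^]]+  \]    # third bracketed part, matched but not captured
    \s+
    (.*)             # everything after the last bracket
}x;

print "1: |$part1| 2: |$part2| 3: |$part4|\n";
```

Note that under /x a literal space in the pattern is ignored, so whitespace that must actually match has to be written as \s, an escaped space, or a character class such as [ ].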


Re: RegEX Doubt
by Laurent_R (Canon) on Aug 19, 2014 at 14:54 UTC

You have been given valuable answers already, but I would suggest two possible improvements in terms of readability and ease of regex construction:

- using non-greedy quantifiers rather than a negated character class to match what is between the square brackets

- creating a sub-regex first and then using it three times.

Possibly something like this:

$_ = "[part1-date] info - [..part2..] [..part3..] part4";
$part = qr /\[(.+?)]/;      # subregex using non-greedy quantifier (slightly easier than /\[([^]]+)\]/)
print "$1 $2 $3" if /$part \w+ - $part $part \w+/;     # prints "part1-date ..part2.. ..part3.."

Re: RegEX Doubt
by soonix (Canon) on Aug 19, 2014 at 12:44 UTC

Hi sandy105,

I wanted to suggest putting [^[]* in place of the spaces between your [...sections...], but the other Monks were quicker :-)
I can't find =` in perlop; most probably you mean =~ ?
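
The [^[]* idea could look roughly like the sketch below. The regex from the original question is not shown in this thread, so the pattern here is an assumption built from the sample line used in the other replies, and the @parts array is an illustrative name.

```perl
use strict;
use warnings;

my $line = '[part1-date] info - [..part2..] [..part3..] part4';

# [^[]* skips any run of characters that are not an opening bracket,
# so it covers both " info - " and the single spaces between sections.
my @parts = $line =~ /\[([^]]+)\][^[]*\[([^]]+)\][^[]*\[([^]]+)\]\s*(.*)/;

print "1: |$parts[0]| 2: |$parts[1]| 3: |$parts[2]| 4: |$parts[3]|\n"
    if @parts;
```

Because [^[]* can never run past the next opening bracket, the regex engine has far less backtracking to do than with a greedy .* between the sections.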


Re^2: RegEX Doubt
by sandy105 (Scribe) on Aug 19, 2014 at 15:49 UTC

Yes, that's correct: "=~".


Re: RegEX Doubt
by sandy105 (Scribe) on Aug 19, 2014 at 16:11 UTC

I don't know if I should create another thread for this, but there is another hiccup. For the last part, "PART4", I need to match it against a few strings, e.g. "init code finished", "batch process code started".

Right now I am checking part 4 in a series of if statements, but it looks messy. Is there a better way to match part 4 against, say, the strings in an array @match?


Re^2: RegEX Doubt
by Laurent_R (Canon) on Aug 19, 2014 at 18:57 UTC

[..part 3..]


Re^3: RegEX Doubt
by sandy105 (Scribe) on Aug 20, 2014 at 06:39 UTC

Yes, I am capturing the fourth part and then checking whether it matches any of those keywords, but I have something like n if statements. Is there a better way to search or compare that?
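
One common way to replace a chain of if statements like this is to build a single alternation from the array. The sketch below is only an illustration under assumptions: the contents of @match are taken from the two phrases quoted earlier in the thread, and $alternation, $keywords and the test strings are hypothetical names and data.

```perl
use strict;
use warnings;

# Hypothetical keyword list; the two phrases come from the question.
my @match = ('init code finished', 'batch process code started');

# Build one alternation from the list. quotemeta escapes any regex
# metacharacters, so the phrases are matched literally.
my $alternation = join '|', map { quotemeta($_) } @match;
my $keywords    = qr/^\s*(?:$alternation)\s*$/;

for my $part4 ('init code finished', 'something else') {
    if ($part4 =~ $keywords) {
        print "'$part4' is a known message\n";
    }
    else {
        print "'$part4' is not in the list\n";
    }
}
```

If part 4 always has to equal one of the phrases exactly, a hash built from @match and a test with exists would avoid the regex altogether.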


Re^4: RegEX Doubt
by Laurent_R (Canon) on Aug 20, 2014 at 17:43 UTC

