PerlMonks  

Print only if pattern matches

by brad_nov (Novice)
on Jan 24, 2013 at 06:23 UTC ( #1015081=perlquestion )
brad_nov has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have a script like the one below:
#!/usr/local/bin/perl
use strict;
use warnings;

while (<DATA>) {
    ( my ($s_id) = /^\d+\|(\d+?)\|/ ) ;
    if ( $s_id == 1 ){
        s/^(.*\|)*.*ABC\.pi=([\d.]+|[\w.]+)*.*ABC\.id=(\d+|[\w.]+).*$/$1$2|$3/s;
        print "$1$2|$3\n";
    }
}
__DATA__
123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.66~ABC.id=789137136770
123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.67~ABC.id=789134713670
123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.68~ABC.id=789137213670
123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.69~ABC.id=78913713670
123|1|456464|645646|4546|654~abc~dhghga~12.33.44.55.70~3713670
123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.70~ABC.id=78913713670
123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.70~ABC.id=78913713670
123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.70~ABC.id=789137135670
123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.70~ABC.id=789137153670
123|1|456464|645646|4546|654~abc~dhghga~12.33.44.55.70~3713670
123|1|456464|645646|4546|654~abc~dhghga~121322~456466874~8796896
123|2|456464|645646|4546|654~abc~dhghga~121322~456466874~6788708
123|2|456464|645646|4546|654~abc~dhghga~121322~456466874~6806
When I execute it, I get the following output:
123|1|456464|645646|4546|112.33.44.55.66|789137136770
123|1|456464|645646|4546|112.33.44.55.67|789134713670
123|1|456464|645646|4546|112.33.44.55.68|789137213670
123|1|456464|645646|4546|112.33.44.55.69|78913713670
Use of uninitialized value $2 in concatenation (.) or string at split_test.pl line 14, <DATA> line 5.
Use of uninitialized value $3 in concatenation (.) or string at split_test.pl line 14, <DATA> line 5.
1|
123|1|456464|645646|4546|112.33.44.55.70|78913713670
123|1|456464|645646|4546|112.33.44.55.70|78913713670
123|1|456464|645646|4546|112.33.44.55.70|789137135670
123|1|456464|645646|4546|112.33.44.55.70|789137153670
I am looking to get rid of the error. How can I do it? I also want to write the exceptions (the lines that do not match) to a new file.

Re: Print only if pattern matches
by Kenosis (Priest) on Jan 24, 2013 at 06:37 UTC

    Place your matching regex in an if statement. If the match succeeds, you can print your captures without warnings. The else branch can handle the exceptions:

    use strict;
    use warnings;

    while (<DATA>) {
        next unless /^\d+\|(\d+?)\|/ and $1 == 1;

        if (/^(.*\|)*.*ABC\.pi=([\d.]+|[\w.]+)*.*ABC\.id=(\d+|[\w.]+).*$/) {
            print "$1$2|$3\n";
        }
        else {
            print "Exception: $_";
        }
    }

    Output on your data:

    123|1|456464|645646|4546|112.33.44.55.66|789137136770
    123|1|456464|645646|4546|112.33.44.55.67|789134713670
    123|1|456464|645646|4546|112.33.44.55.68|789137213670
    123|1|456464|645646|4546|112.33.44.55.69|78913713670
    Exception: 123|1|456464|645646|4546|654~abc~dhghga~12.33.44.55.70~3713670
    123|1|456464|645646|4546|112.33.44.55.70|78913713670
    123|1|456464|645646|4546|112.33.44.55.70|78913713670
    123|1|456464|645646|4546|112.33.44.55.70|789137135670
    123|1|456464|645646|4546|112.33.44.55.70|789137153670
    Exception: 123|1|456464|645646|4546|654~abc~dhghga~12.33.44.55.70~3713670
    Exception: 123|1|456464|645646|4546|654~abc~dhghga~121322~456466874~8796896
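A related way to follow this advice, offered as a minimal sketch rather than as Kenosis's own code: perform the match in list context so the captures land in named lexical variables and are only visible when the match succeeded. The helper name extract_fields and the simplified pattern are assumptions made for illustration; the pattern expects five pipe-terminated fields followed by ABC.pi= and ABC.id= entries separated by ~, as in the sample data, and it leaves out the check on the second field.

```perl
use strict;
use warnings;

# Hypothetical helper: returns "prefix|pi|id" on a match, undef otherwise.
sub extract_fields {
    my ($line) = @_;

    # In list context a successful match returns the captures;
    # a failed match returns an empty list, so the condition is false.
    if ( my ( $prefix, $pi, $id ) =
        $line =~ /^((?:\d+\|){5})[^|]*ABC\.pi=([\w.]+)~ABC\.id=(\w+)/ )
    {
        return "$prefix$pi|$id";
    }
    return;
}

for my $line (
    '123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.66~ABC.id=789137136770',
    '123|1|456464|645646|4546|654~abc~dhghga~12.33.44.55.70~3713670',
) {
    my $out = extract_fields($line);
    print defined $out ? "$out\n" : "Exception: $line\n";
}
```

Because the variables are declared inside the condition, there is no way to print stale values left over from an earlier match.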
Re: Print only if pattern matches
by vinoth.ree (Prior) on Jan 24, 2013 at 06:42 UTC

    Because the lines

    123|1|456464|645646|4546|654~abc~dhghga~12.33.44.55.70~3713670
    123|1|456464|645646|4546|654~abc~dhghga~12.33.44.55.70~3713670
    123|1|456464|645646|4546|654~abc~dhghga~121322~456466874~8796896
    123|2|456464|645646|4546|654~abc~dhghga~121322~456466874~6788708
    123|2|456464|645646|4546|654~abc~dhghga~121322~456466874~6806

    do not match your regular expression, the capture variables $2 and $3 have no values.
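The stray `1|` line in the question's output can be reproduced with a small sketch. A failed match or substitution does not reset the capture variables: they keep whatever the last successful match set. Here $1 still holds the 1 captured by the earlier field test, while $2 was never set. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $line = '123|1|456464|645646|4546|654~abc~dhghga~12.33.44.55.70~3713670';

# First match succeeds: $1 is the second field, "1".
$line =~ /^\d+\|(\d+?)\|/ or die "unexpected input";

# This match fails, so $1 and $2 are NOT reset.
my $matched = $line =~ /ABC\.pi=([\d.]+).*ABC\.id=(\d+)/;

my $first  = defined $1 ? $1 : '(undef)';
my $second = defined $2 ? $2 : '(undef)';

print "matched: ", ( $matched ? "yes" : "no" ), "\n";
print "\$1 = $first, \$2 = $second\n";
```

This is why testing the result of the match itself is more reliable than inspecting $1 afterwards.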

Re: Print only if pattern matches
by 2teez (Priest) on Jan 24, 2013 at 06:52 UTC

    Your long regex

    s/^(.*\|)*.*ABC\.pi=([\d.]+|[\w.]+)*.*ABC\.id=(\d+|[\w.]+).*$/$1$2|$3/s;
    can be further reduced to match your required output.
    Using the solution provided by Kenosis, like so:
    use strict;
    use warnings;

    while (<DATA>) {
        next unless /^\d+\|(\d+?)\|/ and $1 == 1;

        if (/(.+?)~.+?=(.+?)~.+=(.+?)$/) {    # note here
            print $1, $2, $3, $/;
        }
        else {
            print "Exception: ", $_, $/;
        }
    }
    __DATA__
    123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.66~ABC.id=789137136770
    123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.67~ABC.id=789134713670
    123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.68~ABC.id=789137213670
    123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.69~ABC.id=78913713670
    123|1|456464|645646|4546|654~abc~dhghga~12.33.44.55.70~3713670
    123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.70~ABC.id=78913713670
    123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.70~ABC.id=78913713670
    123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.70~ABC.id=789137135670
    123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.70~ABC.id=789137153670
    123|1|456464|645646|4546|654~abc~dhghga~12.33.44.55.70~3713670
    123|1|456464|645646|4546|654~abc~dhghga~121322~456466874~8796896
    123|2|456464|645646|4546|654~abc~dhghga~121322~456466874~6788708
    123|2|456464|645646|4546|654~abc~dhghga~121322~456466874~6806
    Output:
    123|1|456464|645646|4546|654112.33.44.55.66789137136770
    123|1|456464|645646|4546|654112.33.44.55.67789134713670
    123|1|456464|645646|4546|654112.33.44.55.68789137213670
    123|1|456464|645646|4546|654112.33.44.55.6978913713670
    Exception: 123|1|456464|645646|4546|654~abc~dhghga~12.33.44.55.70~3713670
    123|1|456464|645646|4546|654112.33.44.55.7078913713670
    123|1|456464|645646|4546|654112.33.44.55.7078913713670
    123|1|456464|645646|4546|654112.33.44.55.70789137135670
    123|1|456464|645646|4546|654112.33.44.55.70789137153670
    Exception: 123|1|456464|645646|4546|654~abc~dhghga~12.33.44.55.70~3713670
    Exception: 123|1|456464|645646|4546|654~abc~dhghga~121322~456466874~8796896
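Note that the output above runs the three captures together without the | separators the original poster asked for, and it keeps the 654 field. If the goal is simply a shorter way to reach the originally requested format, one option is to avoid the long pattern altogether and split on ~ instead. This is a different technique from the regex in this reply, sketched under the assumption that the fields of interest always appear as key=value entries named ABC.pi and ABC.id; the helper name reformat_record is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split-based alternative to one long regex.
sub reformat_record {
    my ($line) = @_;
    chomp $line;

    my ( $head, @rest ) = split /~/, $line;

    # Collect key=value entries such as ABC.pi=... and ABC.id=...
    my %field = map { split /=/, $_, 2 } grep { /=/ } @rest;

    return unless defined $field{'ABC.pi'} && defined $field{'ABC.id'};

    # Drop the last pipe-separated field of the head (654 in the sample data).
    ( my $prefix = $head ) =~ s/[^|]*$//;

    return join '|', $prefix . $field{'ABC.pi'}, $field{'ABC.id'};
}

for my $line (
    "123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.66~ABC.id=789137136770\n",
    "123|1|456464|645646|4546|654~abc~dhghga~12.33.44.55.70~3713670\n",
) {
    my $out = reformat_record($line);
    print defined $out ? "$out\n" : "Exception: $line";
}
```

Looking fields up by name also makes the code indifferent to the order in which ABC.pi and ABC.id appear.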

    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me
Re: Print only if pattern matches
by Athanasius (Abbot) on Jan 24, 2013 at 07:00 UTC

    The script can be rewritten like this:

    #! perl
    use strict;
    use warnings;

    while (my $line = <DATA>)
    {
        if ($line =~ / ^ \d+ \| (\d+?) \| /x && $1 == 1 &&
            $line =~ s{
                         ^ (.*\|)*                     # $1
                         .*ABC\.pi= ([\d.]+|[\w.]+)*   # $2
                         .*ABC\.id= (\d+|[\w.]+)       # $3
                         .* $
                      }
                      {$1$2|$3}sx)
        {
            print "$1$2|$3\n";
        }
    }

    __DATA__
    ...

    While this “works”, it is dubious: the * quantifier in a regex means match zero or more occurrences of the preceding element. In the substitution, do you really want to allow zero occurrences of (.*\|) or ([\d.]+|[\w.]+)? If not, use the + quantifier, which means one or more.
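The difference can be seen on a small, made-up input in which ABC.pi= is present but empty. This is only a sketch with a shortened pattern, not the full regex from the question. With * on the group the match still succeeds and the group's capture is undefined; with + the match fails outright. Since the group already contains a + inside, dropping the outer quantifier entirely would also require at least one character.

```perl
use strict;
use warnings;

# Made-up record: ABC.pi= is present but has no value.
my $line = 'x~ABC.pi=~ABC.id=5';

# Group quantified with *: zero repetitions are allowed,
# so the match succeeds and the first capture is undef.
my @star = $line =~ /ABC\.pi=([\w.]+)*.*ABC\.id=(\d+)/;

# Group quantified with +: at least one repetition is required,
# so the match fails and the list is empty.
my @plus = $line =~ /ABC\.pi=([\w.]+)+.*ABC\.id=(\d+)/;

printf "star: %d captures, pi is %s\n",
    scalar @star, defined $star[0] ? "defined" : "undef";
print "plus: ", ( @plus ? "matched" : "no match" ), "\n";
```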

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re: Print only if pattern matches
by LanX (Canon) on Jan 24, 2013 at 07:04 UTC
    >  Use of uninitialized value $2 in concatenation (.) or string at split_test.pl li

    > I am looking to get rid off the error. ... ANd I want to write the exceptions to new file.

    so avoid uninitialized '$2'!

    if (defined $2) {
        print "$1$2|$3\n";
    }
    else {
        print $exception_fh "$_\n";
    }
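The snippet above presumes that $exception_fh is already open. A minimal sketch of the whole flow, under stated assumptions, might look like this: the file name exceptions.txt, the helper name process_lines and the simplified pattern are not from the thread, and the branch here tests the result of the match directly instead of checking defined $2.

```perl
use strict;
use warnings;

# Hypothetical helper: matched records go to $out_fh, the rest to $exception_fh.
sub process_lines {
    my ( $lines, $out_fh, $exception_fh ) = @_;

    for my $line (@$lines) {
        # Only handle records whose second field is 1.
        next unless $line =~ /^\d+\|1\|/;

        if ( $line =~ /^(.*\|).*ABC\.pi=([\w.]+).*ABC\.id=(\w+)/ ) {
            print {$out_fh} "$1$2|$3\n";
        }
        else {
            print {$exception_fh} "$line\n";
        }
    }
    return;
}

my @lines = (
    '123|1|456464|645646|4546|654~abc~dhghga~ABC.pi=112.33.44.55.66~ABC.id=789137136770',
    '123|1|456464|645646|4546|654~abc~dhghga~12.33.44.55.70~3713670',
    '123|2|456464|645646|4546|654~abc~dhghga~121322~456466874~6806',
);

# 'exceptions.txt' is an assumed file name; it is created in the current directory.
open my $exception_fh, '>', 'exceptions.txt'
    or die "Cannot open exceptions.txt: $!";
process_lines( \@lines, \*STDOUT, $exception_fh );
close $exception_fh or die "Cannot close exceptions.txt: $!";
```

Passing the filehandles in as arguments keeps the matching logic independent of where the two streams end up.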
