PerlMonks  

Check multiple lines exist in a record

by gbwien (Sexton)
on Mar 26, 2018 at 13:40 UTC ( #1211742=perlquestion )

gbwien has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

Perl typically processes a file line by line, but how can I check for multiple lines within a record of a file? My main program reads each line of the input file using a while loop. Within the record below, I need to check whether CFU-TS10-ACT and CFB-TS10-ACT are both present:

    <SUBBEGIN
        IMSI=12345678495452;
        MSISDN=1234567890;
        DEFCALL=TS11;
        CURRENTNAM=BOTH;
        CAT=COMMON;
        TBS=TS11&TS12&TS21&TS22;
        VLRLIST=10;
        SGSNLIST=10;
        SMDP=MSC;
        CB=BAOC-ALL-PROV;
        CB=BOIC-ALL-PROV;
        CB=BOICEXHC-ALL-PROV;
        CB=BICROAM-ALL-PROV;
        CW=CW-ALL-PROV;
        CF=CFU-ALL-PROV-NONE-YES-NO-NONE-YES-65535-NO-NO-NO-NO-NO-NO-NO-NO-NO-NO;
        CF=CFB-ALL-PROV-NONE-YES-YES-NONE-YES-65535-NO-NO-NO-NO-NO-NO-NO-NO-NO-NO;
        CF=CFU-TS10-ACT-NONE-YES-NO-NONE-YES-65535-YES-YES-NO-NO-NO-NO-NO-NO-NO-NO;
        CF=CFNRY-ALL-PROV-NONE-YES-YES-NONE-YES-65535-NO-NO-NO-NO-NO-NO-NO-NO-NO-NO;
        CF=CFNRC-ALL-PROV-NONE-YES-NO-NONE-YES-65535-NO-NO-NO-NO-NO-NO-NO-NO-NO-NO;
        CF=CFB-TS10-ACT-NONE-YES-NO-NONE-YES-65535-YES-YES-NO-NO-NO-NO-NO-NO-NO-NO;
        CF=CFD-TS10-REG-9144455778-YES-YES-25-YES-65535-YES-YES-NO-NO-NO-YES-YES-YES-YES-NO;
        TCSISTATE=YES;
        OCSISTATE=YES;
        CONTROL=SUB;
        WPA=0;
        GS=HOLD&MPTY&ECT&CLIR&CLIP;
        CLIRES=TEMPALLOW;
        CLIPOC=NO;
        OCSI=10;
        CFSMS=ACT-10-914366488325207-YES-YES-NO-NO-NO;
        ARD=PROV;
        SUBRES=ALLPLMN;
        IST_ALERT_TIMER=120;
        IST_ALERT_RESPONSE=2;
        SUB_AGE=0;
        MIMSI=240076400029999-ONELIVE-2-2-1-0-0;
        MIMSI=232191400029999-ONELIVE-1-1-1-0-0;
        SID=2805158185721065;
        MCSISTATE=YES;
        CLRBSG=CLIP-YES-NO-NO-NO-NO;
        UPLCSLCK=NO;
        UPLPSLCK=NO;
        DEFOFAID=10;
        EPS_PROFILE_ID=1;
        TGPPAMBRMAXUL=50000000;
        TGPPAMBRMAXDL=150000000;
        ARD_EXT=NULL-NULL-NULL-N3GPPNOTALLOWED;
        FRAUDTPL_ID=10;
        HLR_INDEX=1;
        LTEAUTOPROV=NO;
        PSSER=1-1-10-1-NONE-DYNAMIC-00000000;
        EPSSER=1-10-10-1-NONE-DYNAMIC-00000000-1;
        MPS=NO;
    <SUBEND

In the function callForwardingsCF() below I am using an if statement. I believe I should be using a while loop instead, but I am unclear about how to implement it.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use feature 'say';

    my $HSSIN='D:\testproject\HSS-export-test-run-small.txt';
    my $ofile = 'D:\testproject\HSS-output.txt';
    open (INFILE, $HSSIN) or die "Can't open input file";
    open (OUTFILE,"> $ofile" ) or die "Cant open file";

    my $add;
    my $MSISDN;
    my $line;
    my $CFline;
    my $CBBOACline;
    my $CBBOICline;

    sub callForwardingsCF() {
        if (/\t*CF=(CFU-TS10-ACT-(NONE|\d+))/ && /\t*CF=(CFB-TS10-ACT-(NONE|\d+))/) {
            say "this case is found here .....";
        }
    } # end sub callForwardingsCFD

    while (<INFILE>) {
        if (/<SUBEND/) {
            say "SUBEND found";
            #$line = $1 if /^\s*MSISDN=(\d+);/;
            print OUTFILE "processSingle UpdateCommand GSUB MKEY $line";
            print OUTFILE "\n";
        }
        if ($_ =~ /^\t*MSISDN=(\d+);/) { #find MSISDN in file global search
            say "STARTER MSISDN is $1";
            $MSISDN = $1;
            $add = $1;
            $line = "$1"; #group 1
        }
        callForwardingsCF(); #callForwardings
    }
    close INFILE;
    close OUTFILE;

Replies are listed 'Best First'.
Re: Check multiple lines exist in a record
by Your Mother (Bishop) on Mar 26, 2018 at 15:09 UTC

    I don't think anyone has mentioned changing the record/line delimiter yet which seems like a good approach for this problem especially if you have multiple records in a file. You asked about processing line by line. That's the default because the delimiter ($/) is a newline (\n) by default (usually, different systems and read modes affect this too). You could play around with this.

        {
            local $/ = "<SUBEND"; # localize or this is a bad idea.
            my $count = 0;
            while (<YOUR_FILE_HANDLE>) {
                next unless /SUBBEGIN/; # Don't catch trailing empty stuff.
                print "\nRECORD ", ++$count, " -------------------------\n";
                print $_;
            }
        }
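    Building on the record-delimiter idea, here is a minimal sketch of how the two checks from the question might be applied to each whole record. The sample data is a shortened, made-up stand-in for the real export, the in-memory filehandle is only there to keep the example self-contained, and the name check_records is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, shortened sample data: record 1 has both items, record 2 only CFU.
my $data = <<'END';
<SUBBEGIN
    MSISDN=1234567890;
    CF=CFU-TS10-ACT-NONE-YES-NO;
    CF=CFB-TS10-ACT-NONE-YES-NO;
<SUBEND
<SUBBEGIN
    MSISDN=1234567891;
    CF=CFU-TS10-ACT-NONE-YES-NO;
<SUBEND
END

# Return a list of [MSISDN, both-present flag] pairs, one per record.
sub check_records {
    my ($fh) = @_;
    local $/ = "<SUBEND";    # read one whole record per iteration
    my @results;
    while (my $record = <$fh>) {
        next unless $record =~ /<SUBBEGIN/;    # skip trailing leftovers
        # /m lets ^ match at the start of every line inside the record
        my ($msisdn) = $record =~ /^\s*MSISDN=(\d+);/m;
        my $both = $record =~ /^\s*CF=CFU-TS10-ACT-/m
                && $record =~ /^\s*CF=CFB-TS10-ACT-/m;
        push @results, [ $msisdn, $both ? 1 : 0 ];
    }
    return @results;
}

open my $fh, '<', \$data or die "Can't open in-memory file: $!";
for my $r (check_records($fh)) {
    printf "MSISDN %s: %s\n", $r->[0],
        $r->[1] ? 'both present' : 'one or both missing';
}
close $fh;
```

    Because each read returns a complete record, both patterns can be tested against the same string, which is what the single-line if in the question cannot do.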

        Of course. Thanks. I was surprised it (apparently) hadn't come up.

Re: Check multiple lines exist in a record
by johngg (Canon) on Mar 26, 2018 at 16:02 UTC

    Process record by record rather than line by line (as suggested by Your Mother) then top-and-tail and use split to break into line items then key/value pairs, populating a HoA structure. Then use grep to see if both items of interest are present. I added a record with one item missing to test.

    The output.

        Record No. 1 - Both present
        Record No. 2 - One or both missing
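    The steps described above (read by record, top-and-tail, split into items and key/value pairs, fill a hash of arrays, then grep) could be sketched roughly as follows. This is not the code originally posted in this reply; the sample data is shortened and made up, and parse_record and both_present are hypothetical helper names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, shortened sample data; the second record lacks the CFB item.
my $data = <<'END';
<SUBBEGIN
    MSISDN=1234567890;
    CF=CFU-TS10-ACT-NONE-YES-NO;
    CF=CFB-TS10-ACT-NONE-YES-NO;
<SUBEND
<SUBBEGIN
    MSISDN=1234567891;
    CF=CFU-TS10-ACT-NONE-YES-NO;
<SUBEND
END

# Turn one record into a hash of arrays: key => [ values seen for that key ].
sub parse_record {
    my ($record) = @_;
    $record =~ s/^.*?<SUBBEGIN\s*//s;    # top ...
    $record =~ s/\s*<SUBEND.*\z//s;      # ... and tail
    my %hoa;
    for my $item (split /;\s*/, $record) {
        my ($key, $value) = split /=/, $item, 2;
        next unless defined $value;
        push @{ $hoa{$key} }, $value;
    }
    return \%hoa;
}

# Return 1 if the CF list holds both a CFU-TS10-ACT and a CFB-TS10-ACT entry.
sub both_present {
    my ($hoa) = @_;
    my @cf = @{ $hoa->{CF} || [] };
    my $has_cfu = grep { /^CFU-TS10-ACT-/ } @cf;    # count in scalar context
    my $has_cfb = grep { /^CFB-TS10-ACT-/ } @cf;
    return $has_cfu && $has_cfb ? 1 : 0;
}

open my $fh, '<', \$data or die "Can't open in-memory file: $!";
my $rec_no = 0;
{
    local $/ = "<SUBEND";    # one record per read
    while (my $record = <$fh>) {
        next unless $record =~ /<SUBBEGIN/;
        my $hoa = parse_record($record);
        printf "Record No. %d - %s\n", ++$rec_no,
            both_present($hoa) ? 'Both present' : 'One or both missing';
    }
}
close $fh;
```

    Keeping every value per key in an array matters here because keys such as CF and CB repeat within a record, so a plain hash would keep only the last one.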

    I hope this is helpful.

    Update: Corrected shonky wording ... and later deleted spurious square bracket.

    Cheers,

    JohnGG

Re: Check multiple lines exist in a record
by kgoess (Beadle) on Mar 26, 2018 at 16:35 UTC
    Rather than suck the entire file into memory (if it's bigger than the sample you posted), you could process it line-by-line and use the flip-flop operator (see perldoc perlop). The flags are checked and reset at each <SUBEND so that every record is reported on its own.

        #!/usr/bin/perl
        use strict;
        use warnings;
        use feature 'say';

        my $HSSIN = shift or die "usage: $0 <inputfile>\n";
        open my $infile, '<', $HSSIN or die "Can't open input file '$HSSIN': $!";

        my ($MSISDN, $cfu_found, $cfb_found);

        while (<$infile>) {
            # this is true for all lines between these two markers
            if (/^\s*<SUBBEGIN/ .. /^\s*<SUBEND/) {
                if (/MSISDN=(\d+)/) {
                    $MSISDN = $1;
                }
                if (/^\s*CF=CFU-TS10-ACT/) {
                    $cfu_found = 1;
                }
                if (/^\s*CF=CFB-TS10-ACT/) {
                    $cfb_found = 1;
                }
                # at the end of a record, report and reset for the next one
                if (/^\s*<SUBEND/) {
                    if ($cfu_found and $cfb_found) {
                        say "cfu and cfb both found and MSISDN is $MSISDN";
                    }
                    ($MSISDN, $cfu_found, $cfb_found) = ();
                }
            }
        }
Re: Check multiple lines exist in a record
by choroba (Bishop) on Mar 26, 2018 at 15:05 UTC
    Record somewhere (here: to the %found hash) what expressions have been found so far. At the end, check how many expressions have been found. Just counting the expressions isn't enough, as there can be 2 occurrences of expression 1, but no occurrence of expression 2.

        #!/usr/bin/perl
        use warnings;
        use strict;
        use feature qw{ say };

        sub callForwardingsCF {
            my ($found) = @_;
            $found->{1} = 1 if /\t*CF=(CFU-TS10-ACT-(NONE|\d+))/;
            $found->{2} = 1 if /\t*CF=(CFB-TS10-ACT-(NONE|\d+))/
        }

        my %found;
        while (<>) {
            if (/<SUBEND/) {
                say "SUBEND found";
                if (2 == keys %found) {
                    print "Both expressions found.\n";
                } elsif (keys %found) {
                    print "Found only expression ", keys %found, ".\n";
                } else {
                    print "None of the expressions found.\n";
                }
                %found = ();
            }
            callForwardingsCF(\%found);
        }
    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
Re: Check multiple lines exist in a record
by thanos1983 (Parson) on Mar 26, 2018 at 13:52 UTC

    Hello gbwien,

    I was under the impression that your question had already been answered in "Grouped Regular Expression not set assign default value". In case it was not: in your sample code you create a function and then appear to call it again from inside itself, without passing any parameters.

    I am not sure what you want. Do you want to pass the file e.g. inside the function and retrieve the output?

    Can you provide us with more information?

    Looking forward to your reply, BR.

    Seeking for Perl wisdom...on the process of learning...not there...yet!
      Hi thanos1983, Thanks for your help. Yes, some of my questions were answered in that post, but I still have other questions about resolving the entire problem. In the example program I provided, I am trying to detect the scenario where two lines are both found in an input file, in this case the lines CFU-TS10-ACT and CFB-TS10-ACT. If I change the if statement to the following, it works as expected:
      if (/\t*CF=(CFU-TS10-ACT-(NONE|\d+))/)
      But I do not know how to check for the existence of two lines.

        Hello again gbwien,

        Try something like this:

        Since you did not provide us with more than one subsection of your file or desired output I am not able to help you more than that.

        Hope this helps, BR.

Re: Check multiple lines exist in a record
by hippo (Chancellor) on Mar 26, 2018 at 14:12 UTC
Re: Check multiple lines exist in a record
by bliako (Vicar) on Mar 26, 2018 at 13:56 UTC
    I would read ALL the file in a single string and then apply the regular expression you used on the whole content in the string (can't say if it does what is intended) and check how many times it succeeds. Basically something like this:
        #!/usr/bin/perl
        use strict;
        use warnings;

        my $HSSIN = 'D:\testproject\HSS-export-test-run-small.txt';
        my $ofile = 'D:\testproject\HSS-output.txt';
        open (INFILE, $HSSIN) or die "Can't open input file";
        open (OUTFILE, "> $ofile") or die "Cant open file";

        my $content = undef;
        # slurp the whole file; local restores $/ when the block ends
        { local $/ = undef; $content = <INFILE> }
        close(INFILE);

        my $num_found = 0;
        while ($content =~ /\t*CF=(CFU-TS10-ACT-(NONE|\d+))/g) { $num_found++ }
        print "multiple cases: $num_found cases of ... " if $num_found > 0;
        ...
    bliako
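    As a rough sketch of how the whole-string idea might be extended to the two-pattern check, one option is to walk the slurped text record by record with a global, non-greedy match and count only the records that contain both entries. The sample string is a shortened, made-up stand-in for the file content, and count_records_with_both is a hypothetical name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, shortened stand-in for the slurped file content.
my $content = <<'END';
<SUBBEGIN
    MSISDN=1234567890;
    CF=CFU-TS10-ACT-NONE-YES-NO;
    CF=CFB-TS10-ACT-NONE-YES-NO;
<SUBEND
<SUBBEGIN
    MSISDN=1234567891;
    CF=CFU-TS10-ACT-NONE-YES-NO;
<SUBEND
END

# Count the records in $text that contain both forwarding entries.
sub count_records_with_both {
    my ($text) = @_;
    my $num_found = 0;
    # Non-greedy match with /s so each iteration covers exactly one record.
    while ($text =~ /<SUBBEGIN(.*?)<SUBEND/sg) {
        my $body = $1;    # copy before running further matches
        $num_found++
            if $body =~ /CF=CFU-TS10-ACT-(?:NONE|\d+)/
            && $body =~ /CF=CFB-TS10-ACT-(?:NONE|\d+)/;
    }
    return $num_found;
}

print "records with both entries: ", count_records_with_both($content), "\n";
```

    Testing the two patterns separately inside each record avoids the problem of counting two hits of one expression as if both had been seen.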
Re: Check multiple lines exist in a record
by jeffenstein (Pilgrim) on Mar 26, 2018 at 16:56 UTC

    You could read each record into an array, and check it as a unit:

        #!/usr/bin/env perl
        use strict;
        use warnings;
        use feature qw(say);
        use List::Util qw(first);

        my $rec_no = 0;
        my @record;
        my $msisdn;

        while(<>){
            chomp;
            if(/^<SUBBEGIN/){
                @record = ($_);
                $msisdn = undef;
                say "Found record: ", ++$rec_no;
                next;
            }elsif(/^<SUBEND/){
                push @record, $_;
                check_record(\@record);
                @record = ();
            }elsif(/^\s*MSISDN=(\d+);/){
                $msisdn = $1;
                say "Record $rec_no: MSISDN = $msisdn";
            }else{
                push @record, $_;
            }
        }
        exit 0;

        sub check_record {
            my $listref = shift;
            if( first {/^\s*CF=(CFU-TS10-ACT-(NONE|\d+))/} @{$listref}
                and first {/^\s*CF=(CFB-TS10-ACT-(NONE|\d+))/} @{$listref} ){
                say "this case is found here .....";
            }
        }

    It's a bit of a brute-force solution, but it should work.

    Sample output:

        $ ./t.pl
        Found record: 1
        Record 1: MSISDN = 1234567890
        this case is found here .....
        Found record: 2
        Record 2: MSISDN = 1234567890
        this case is found here .....
Re: Check multiple lines exist in a record
by choroba (Bishop) on Mar 26, 2018 at 14:06 UTC
    Are you the same person as the author of "Check if multiple lines exist in text file"? Crossposting is fine, but you should mention it so that people who don't follow both sites don't waste their effort.

      Yes, I am the author of that post. Point taken; I will provide that information in the future. Thank you. https://stackoverflow.com/questions/49490941/check-if-multiple-lines-exist-in-text-file
Re: Check multiple lines exist in a record
by karlgoethebier (Monsignor) on Mar 27, 2018 at 14:18 UTC
    "<SUBBEGIN..."

    I got a Déjà-vu, right? What file format is this? If I didn't ask yet...

    «The Crux of the Biscuit is the Apostrophe»

    perl -MCrypt::CBC -E 'say Crypt::CBC->new(-key=>'kgb',-cipher=>"Blowfish")->decrypt_hex($ENV{KARL});'Help
