Re: Search file for certain lines
by jethro (Monsignor) on Sep 23, 2013 at 09:44 UTC
|
Use a state machine. A state machine is a loop plus an integer variable that holds the current state. In the loop you read one line and act on it depending on the state; one of those actions may be to change the state.
In your relatively simple case you would need just two states: not between 'h's (let's call it state 0) and between 'h's (state 1). Whenever you encounter a line beginning with 'h', simply flip the state variable.
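For illustration, here is a minimal sketch of that two-state loop. The helper name `lines_between_h`, the state names and the sample lines are all made up for the example; it assumes the lines of interest are those starting with j, E or G, as described in the original question.

```perl
use strict;
use warnings;

# Named states, so the loop reads like the description above.
use constant { OUTSIDE => 0, INSIDE => 1 };

# Hypothetical helper: returns the lines that start with j, E or G and
# sit between an opening and a closing 'h' line.
sub lines_between_h {
    my @lines = @_;
    my $state = OUTSIDE;
    my @found;
    for my $line (@lines) {
        if ($line =~ /^h/) {
            # a marker line flips the state and is never reported itself
            $state = $state == OUTSIDE ? INSIDE : OUTSIDE;
            next;
        }
        push @found, $line if $state == INSIDE && $line =~ /^[jEG]/;
    }
    return @found;
}

# Made-up sample lines, not the poster's real data.
my @sample = ('a0', 'j-before', 'h1', 'j1', 'k1', 'E1', 'h2', 'G-after');
print "$_\n" for lines_between_h(@sample);
```

With the made-up sample, only `j1` and `E1` fall between the two marker lines, so only those are printed.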
| [reply] |
Re: Search file for certain lines
by Eily (Monsignor) on Sep 23, 2013 at 09:49 UTC
|
It would be easier to give you advice if you told us what you already know and what you have already thought of. Right now I'll just try to guess what I should tell you and what you already know.
Your conditions on the lines sound like a job for regular expressions. Perl also has a sort of "from .. until" operator, the flip-flop operator, which would let you do something like this (look up the next keyword too, by the way):
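while (<INPUT>)
{
    # skip the line unless we are between two lines starting with h;
    # three dots are needed here: with two dots the end test runs on the
    # same line that opened the range, so the range would close at once
    next unless /^h/ ... /^h/;
    somecode();
}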
This is nice and short, but the operator is not widely known or understood. If it looks too hard to understand, or if a lot of people are likely to read your code, you could use classic control structures instead: if, else, unless.
| [reply] [d/l] |
Re: Search file for certain lines
by mtmcc (Hermit) on Sep 23, 2013 at 09:48 UTC
|
I'm not entirely sure what you're trying to do. Surely, all lines in your file will be between two lines beginning with h, aside from those before the first line beginning with h, and after the last line beginning with h?
It might make it easier to understand the problem if you give some sample data, and the result you expect to get. Have a look through this: How do I post a question effectively?
Finally, based on your question, my best guess would be something like this:
#!/usr/bin/perl
use strict;
use warnings;
my $check = 0;    # number of lines beginning with 'h' seen so far
while (my $line = <DATA>)
{
    my $first = substr($line, 0, 1);    # first character of the line
    $check++ if $first eq 'h';
    # an odd count means we are between an opening and a closing 'h' line
    if ($check % 2 == 1)
    {
        if ($first eq 'j' || $first eq 'E' || $first eq 'G')
        {
            print $line;
        }
    }
}
__DATA__
aaaaa 1
bbbbb 2
hhhhh 3
fffff 4
rrrrr 5
lllll 6
jjjjj 7
HHHHH 8
EEEEE 9
GGGGG 10
hhhhh 11
jjjjj 12
EEEEE 13
GGGGG 14
I hope that helps.
| [reply] [d/l] |
Re: Search file for certain lines
by hdb (Monsignor) on Sep 23, 2013 at 11:46 UTC
|
use strict;
use warnings;
my $state = 0;
while( <DATA> ) {
$state = 1-$state if /^h/;
print if $state && /^[jEG]/;
}
__DATA__
h132BIK2
u3*** TEST DATA ***
u3*** COMMENT AREA FOR TEST DATA ***
j1000010017 6790194100109201301092013Test Data N PW09-3PY248
+018BIK20
k10 2R 1 0045.1011N01010215.820012.220006.0000000 0250M 1I
+nsured Only NYY01N00000.00N00000.00Y00000.
+00 000215.82000012.22000006.00
q0215.820215.820215.820215.820215.820000000000000000000002500250025002
+500250YY00000 01000215.82000215.82000215.82000215.82000215.82
l02001 0400000000
+0000000000000000000000000000000000000000
a000.00000.00000.0000
E99HEADER|004|001|
E99INSSCH|248|
E99POLCOM|3||CAP01|66|3301R7435459|||||
E99INSFAC2|MSRA01_1||||||"LNI10708"|
G3301R7435459:LNI10708
yIIDD0043.160019.0110018.9909M0000.000010.000233.08N0017.270023.500000
+43.16000019.01000018.99000000.00000010.00000233.08000017.27
h216BIK0
u3*** TEST DATA ***
u3*** COMMENT AREA FOR TEST DATA ***
pMU76 Nov 2010 A B C D E F G H
+ I J L
+
+ 0000000000
j1000010017 6790194100109201301092013Test Data M PW09-3PY248
+005BIK00
k10 2R 1 0045.1011N01010217.190012.290006.0000000 0250M 1I
+nsured Only NYY01N00000.00N00000.00Y00000.
+00 000217.19000012.29000006.00
q0217.190217.190217.190217.190217.190000000000000000000002500250025002
+500250YY00000 01000217.19000217.19000217.19000217.19000217.19
l02001 0400000000
+0000000000000000000000000000000000000000
a000.00000.00000.0000
E99HEADER|004|001|
E99INSSCH|248|
E99POLCOM|3||CAP01|66|3301R7435459|||||
E99INSFAC2|MSRA01_1||||||"LNI10708"|
G3301R7435459:LNI10708
yIIDD0043.440019.1410019.1109M0000.000010.000234.57N0017.380023.500000
+43.44000019.14000019.11000000.00000010.00000234.57000017.38
h217BIK1
u3*** TEST DATA ***
u3*** COMMENT AREA FOR TEST DATA ***
pMU76 Nov 2010 A B C D E F G H
+ I J L
+ 0000000000
j1000010017 6790194100109201301092013Test Data L PW09-3PY248
+006BIK10
k10 2R 1 0045.1011N01010222.940012.620006.0000000 0250M 1I
+nsured Only NYY01N00000.00N00000.00Y00000.
+00 000222.94000012.62000006.00
q0222.940222.940222.940222.940222.940000000000000000000002500250025002
+500250YY00000 01000222.94000222.94000222.94000222.94000222.94
l02001 0400000000
+0000000000000000000000000000000000000000
a000.00000.00000.0000
E99HEADER|004|001|
E99INSSCH|248|
E99POLCOM|3||CAP01|66|3301R7435459|||||
E99INSFAC2|MSRA01_1||||||"LNI10708"|
G3301R7435459:LNI10708
yIIDD0044.590019.6110019.6209M0000.000010.000240.78N0017.840023.500000
+44.59000019.61000019.62000000.00000010.00000240.78000017.84
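A note on the toggle: flipping the state at every h line treats the markers as open/close pairs. If each h line in fact starts a new record (the sample data suggests this, but it is an assumption), every second record is skipped. A minimal sketch of the alternative, in which every h line opens a new block, follows; `blocks_by_header` is a hypothetical name and the sample lines are shortened stand-ins, not the real data.

```perl
use strict;
use warnings;

# Hypothetical helper: every line starting with 'h' opens a new block;
# the j/E/G lines that follow are collected under that header.
sub blocks_by_header {
    my @lines = @_;
    my @blocks;    # each entry: { header => 'h...', lines => [ ... ] }
    for my $line (@lines) {
        if ($line =~ /^h/) {
            push @blocks, { header => $line, lines => [] };
        }
        elsif (@blocks && $line =~ /^[jEG]/) {
            # lines before the first header are ignored
            push @{ $blocks[-1]{lines} }, $line;
        }
    }
    return @blocks;
}

# Shortened stand-ins for the sample records, not the real data.
my @sample = ('h132', 'u3', 'j1', 'E99A', 'G33', 'h216', 'j2', 'y', 'h217', 'E99B');
for my $block (blocks_by_header(@sample)) {
    print "*** $block->{header}\n";
    print "$_\n" for @{ $block->{lines} };
}
```

Keeping the header with each block also makes it easy to tell afterwards which record a given j, E or G line came from.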
| [reply] [d/l] |
Re: Search file for certain lines
by Anonymous Monk on Sep 23, 2013 at 09:51 UTC
|
| [reply] |
Re: Search file for certain lines
by wjw (Priest) on Sep 23, 2013 at 09:51 UTC
|
..so if I read this right, you are not interested in lines that begin with h (lower case). You are only interested in outputting lines that start with j, E, or G. If that is in fact the case, the problem should be pretty simple: read the file into an array, loop through the array line by line, use a regex to check for lines beginning with j, E or G, and print them out. I get the impression I am missing something here... What importance do the lines beginning with h (lower case) have to you?
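A rough sketch of that array-based approach, assuming the whole file fits comfortably in memory. The helper name `wanted_lines` is made up, and the file name is simply taken from the command line.

```perl
use strict;
use warnings;

# Hypothetical helper: read a file into an array, then loop over it and
# keep the lines that begin with j, E or G.
sub wanted_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my @lines = <$fh>;    # list context reads the whole file at once
    close $fh;
    chomp @lines;
    my @wanted;
    for my $line (@lines) {
        push @wanted, $line if $line =~ /^[jEG]/;
    }
    return @wanted;
}

# Usage, with the file name given as the first command-line argument.
if (@ARGV) {
    print "$_\n" for wanted_lines($ARGV[0]);
}
```

Note that this ignores the h lines completely, which is only right if they carry no meaning for the selection.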
- ...the majority is always wrong, and always the last to know about it...
- The Spice must flow...
- ..by my will, and by will alone.. I set my mind in motion
| [reply] |
|
..it generally works for me. I like iterating through an array. The question I face rarely is "why not", which boils down to file size in those rare cases. It is easy to see what I am working with using the debugger when I have an array available with everything in it. I am used to working with arrays.... Mostly this is just personal preference, but it works for me, so I suggest it...
| [reply] |
Re: Search file for certain lines
by Jalcock501 (Sexton) on Sep 23, 2013 at 10:08 UTC
|
Hey Guys
Sorry it's so ambiguous. Here is some example data.
h132BIK2
u3*** TEST DATA ***
u3*** COMMENT AREA FOR TEST DATA ***
j1000010017 6790194100109201301092013Test Data N PW09-3PY248
+018BIK20
k10 2R 1 0045.1011N01010215.820012.220006.0000000 0250M 1I
+nsured Only NYY01N00000.00N00000.00Y00000.
+00 000215.82000012.22000006.00
q0215.820215.820215.820215.820215.820000000000000000000002500250025002
+500250YY00000 01000215.82000215.82000215.82000215.82000215.82
l02001 0400000000
+0000000000000000000000000000000000000000
a000.00000.00000.0000
E99HEADER|004|001|
E99INSSCH|248|
E99POLCOM|3||CAP01|66|3301R7435459|||||
E99INSFAC2|MSRA01_1||||||"LNI10708"|
G3301R7435459:LNI10708
yIIDD0043.160019.0110018.9909M0000.000010.000233.08N0017.270023.500000
+43.16000019.01000018.99000000.00000010.00000233.08000017.27
h216BIK0
u3*** TEST DATA ***
u3*** COMMENT AREA FOR TEST DATA ***
pMU76 Nov 2010 A B C D E F G H
+ I J L
+
+ 0000000000
j1000010017 6790194100109201301092013Test Data M PW09-3PY248
+005BIK00
k10 2R 1 0045.1011N01010217.190012.290006.0000000 0250M 1I
+nsured Only NYY01N00000.00N00000.00Y00000.
+00 000217.19000012.29000006.00
q0217.190217.190217.190217.190217.190000000000000000000002500250025002
+500250YY00000 01000217.19000217.19000217.19000217.19000217.19
l02001 0400000000
+0000000000000000000000000000000000000000
a000.00000.00000.0000
E99HEADER|004|001|
E99INSSCH|248|
E99POLCOM|3||CAP01|66|3301R7435459|||||
E99INSFAC2|MSRA01_1||||||"LNI10708"|
G3301R7435459:LNI10708
yIIDD0043.440019.1410019.1109M0000.000010.000234.57N0017.380023.500000
+43.44000019.14000019.11000000.00000010.00000234.57000017.38
h217BIK1
u3*** TEST DATA ***
u3*** COMMENT AREA FOR TEST DATA ***
pMU76 Nov 2010 A B C D E F G H
+ I J L
+ 0000000000
j1000010017 6790194100109201301092013Test Data L PW09-3PY248
+006BIK10
k10 2R 1 0045.1011N01010222.940012.620006.0000000 0250M 1I
+nsured Only NYY01N00000.00N00000.00Y00000.
+00 000222.94000012.62000006.00
q0222.940222.940222.940222.940222.940000000000000000000002500250025002
+500250YY00000 01000222.94000222.94000222.94000222.94000222.94
l02001 0400000000
+0000000000000000000000000000000000000000
a000.00000.00000.0000
E99HEADER|004|001|
E99INSSCH|248|
E99POLCOM|3||CAP01|66|3301R7435459|||||
E99INSFAC2|MSRA01_1||||||"LNI10708"|
G3301R7435459:LNI10708
yIIDD0044.590019.6110019.6209M0000.000010.000240.78N0017.840023.500000
+44.59000019.61000019.62000000.00000010.00000240.78000017.84
This is a small portion, but it should give you the idea. As you can see, there is other information between the lines that I need. To clarify further: I do not need the lines beginning with h; they are just markers to search between.
| [reply] [d/l] |
|
What output do you expect to get from that data?
| [reply] |
Re: Search file for certain lines
by kcott (Archbishop) on Sep 24, 2013 at 05:33 UTC
|
G'day Jalcock501,
You can read your input as multiline blocks: see '$/' in perlvar.
Each of the individual lines in those blocks can be matched for the starting characters you want: see the 'g' and 'm' modifiers, the '^' and '$' anchors and '[...]' character classes in perlre.
With the sample input you provided, this code:
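#!/usr/bin/env perl -l
use strict;
use warnings;
use autodie;
# capture any line that starts with j, E or G (/m makes ^ and $ match per line)
my $re = qr{^([jEG].*)$}m;
{
    # read one block at a time: a record ends at a newline followed by 'h'
    local $/ = "\nh";
    open my $fh, '<', 'pm_1055252_data.txt';
    while (<$fh>) {
        print "*** h-block #$.";
        print $1 while /$re/g;
    }
    close $fh;
}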
produces this output:
*** h-block #1
j1000010017 6790194100109201301092013Test Data N PW09-3PY248
+018BIK20
E99HEADER|004|001|
E99INSSCH|248|
E99POLCOM|3||CAP01|66|3301R7435459|||||
E99INSFAC2|MSRA01_1||||||"LNI10708"|
G3301R7435459:LNI10708
*** h-block #2
j1000010017 6790194100109201301092013Test Data M PW09-3PY248
+005BIK00
E99HEADER|004|001|
E99INSSCH|248|
E99POLCOM|3||CAP01|66|3301R7435459|||||
E99INSFAC2|MSRA01_1||||||"LNI10708"|
G3301R7435459:LNI10708
*** h-block #3
j1000010017 6790194100109201301092013Test Data L PW09-3PY248
+006BIK10
E99HEADER|004|001|
E99INSSCH|248|
E99POLCOM|3||CAP01|66|3301R7435459|||||
E99INSFAC2|MSRA01_1||||||"LNI10708"|
G3301R7435459:LNI10708
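To see what each record looks like when `$/` is set to "\nh", here is a small sketch that reads from an in-memory string instead of a file. The helper name `records_from_text` and the sample text are made up. Only the first record still begins with h: the separator, including the h of the next header, sits at the tail of the previous record until `chomp` removes it.

```perl
use strict;
use warnings;

# Hypothetical helper: split text into records on "\nh" and return, per
# record, a reference to the list of lines starting with j, E or G.
sub records_from_text {
    my ($text) = @_;
    my @records;
    local $/ = "\nh";    # a record ends at a newline followed by 'h'
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    while (my $record = <$fh>) {
        chomp $record;   # strips a trailing "\nh" if the record has one
        my @wanted = $record =~ /^([jEG].*)$/mg;
        push @records, \@wanted;
    }
    close $fh;
    return @records;
}

# Made-up sample text with two h-blocks.
my $text = "h1\nu\nj1\nE1\nh2\nG2\nk\n";
my $n = 0;
for my $record (records_from_text($text)) {
    $n++;
    print "*** h-block #$n\n";
    print "$_\n" for @$record;
}
```

Because the leading h is consumed as part of the separator, any code that needs the header text of the second and later records has to put the h back itself.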
| [reply] [d/l] [select] |