Regular Expressions

by Anonymous Monk
on Aug 14, 2000 at 19:47 UTC (#27762, perlquestion)

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

First, I would like to say thanks for the help that I have received so far. I have two questions on how to use regular expressions. Here is an example of the data that I am dealing with:

    Header Line One
    ***-*** 0 0
    ***-MBO 0 0
    2TO-T/V 0 0
    2TO-T/O 0 0
    POC-CNU 1285 0
    POC-A/M 0 15567
    Header Line Two
    ***-*** 0 0
    ***-MBO 0 0
    2TO-T/V 0 0
    2TO-T/O 0 0
    POC-CNU 1285 0
    POC-A/M 0 15567

1) I am looking for a way to read in a line and check its first 7 characters. Each character can be a-z, 0-9, * or /, and the fourth character will always be a -. This continues until it reads the line that says "Header Line Two", which starts the next file.

2) Also, how would I do an error check for duplication? What I mean is: if for some reason "Header Line Two" is skipped over and the next line is the same as one under "Header Line One", then I need to produce an error. How would I go about doing this?
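The rule in question 1 (three characters, a - in the fourth position, three more characters) can be written as one anchored pattern. The following is only a minimal sketch under some assumptions not stated in the question: letters may be upper or lower case, the last three characters come from the same set as the first three, and the key is followed by whitespace or the end of the line. The helper name record_key is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the 7-character key at the start of a line,
# or undef if the line does not start with one.
sub record_key {
    my ($line) = @_;
    # m{...} delimiters let "/" appear in the character class unescaped.
    # (?!\S) requires whitespace or end of line right after the key.
    return $line =~ m{^([A-Za-z0-9*/]{3}-[A-Za-z0-9*/]{3})(?!\S)} ? $1 : undef;
}

for my $line ('2TO-T/V 0 0', '***-*** 0 0', 'Header Line One') {
    my $key = record_key($line);
    print defined $key ? "key: $key\n" : "no key: $line\n";
}
```

Header lines fail the match because their fourth character is not a -, so the same test can separate records from headers.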

Replies are listed 'Best First'.
Re: Regular Expressions
by ZZamboni (Curate) on Aug 14, 2000 at 20:00 UTC
    Please use <code> tags around your data and code so that it's properly formatted. Here's the sample data that you posted (I removed empty lines for space):
        Header Line One
        ***-*** 0 0
        ***-MBO 0 0
        2TO-T/V 0 0
        2TO-T/O 0 0
        POC-CNU 1285 0
        POC-A/M 0 15567
        Header Line Two
        ***-*** 0 0
        ***-MBO 0 0
        2TO-T/V 0 0
        2TO-T/O 0 0
        POC-CNU 1285 0
        POC-A/M 0 15567
    Here's sample code (untested) that does what you explained, storing each line in a hash using the first 7 characters as the key, and checking for duplicates:
        my $data = {};
        my $file;
        while (<>) {
            chomp;
            # Skip blank lines
            next if /^\s*$/;
            if (/^Header/) {
                $data->{$_} = {};    # Create a new first-level hash.
                $file = $_;
                next;
            }
            # m{...} is used so the "/" in the character class does not end the pattern.
            if (m{^([a-zA-Z0-9*/]{3}-\S{3})\s+}) {
                my $key = $1;
                if ($file) {
                    # Check for duplicates.
                    if (exists($data->{$file}->{$key})) {
                        warn "Duplicate key $key in $file: $_\n";
                        next;
                    }
                    $data->{$file}->{$key} = $_;
                }
                else {
                    warn "Line found before a header line: $_\n";
                }
            }
            else {
                # Reject improper lines
                warn "Badly formatted line found, ignoring: $_\n";
            }
        }
    This stores the data in a structure like this:
        $data->{Header Line One}-> {***-***} -> "***-*** 0 0 ..etc"
                                   {2TO-T/V} -> "..."
                                   ...
             ->{Header Line Two}-> ....
    This may not be precisely what you want, but it should give you an idea of one way of doing it.
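Once a nested hash of this shape is populated, it can be walked one header at a time. This is a minimal sketch, assuming the structure shown above; $data here is built by hand from a few of the sample records rather than read from a file, and the helper name keys_for is hypothetical.

```perl
use strict;
use warnings;

# Hand-built sample mirroring the structure described above.
my $data = {
    'Header Line One' => {
        '***-***' => '***-*** 0 0',
        'POC-CNU' => 'POC-CNU 1285 0',
    },
    'Header Line Two' => {
        'POC-A/M' => 'POC-A/M 0 15567',
    },
};

# Hypothetical helper: sorted list of record keys stored under one header.
# An unknown header yields an empty list. Call it in list context.
sub keys_for {
    my ($data, $header) = @_;
    return sort keys %{ $data->{$header} || {} };
}

for my $header (sort keys %$data) {
    print "$header\n";
    for my $key (keys_for($data, $header)) {
        # Split the stored line into its key and the two counts after it.
        my (undef, $first, $second) = split ' ', $data->{$header}{$key};
        print "  $key: $first, $second\n";
    }
}
```

Keeping the whole line as the value, as the reply does, means the counts are only split out when they are needed.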

    --ZZamboni

Re: Regular Expressions
by Shendal (Hermit) on Aug 14, 2000 at 20:06 UTC
    First, surround any data or code with CODE tags. If I understand you correctly, your data looks something like this:
        Header Line One ***-*** 0 0 ***-MBO 0 0 2TO-T/V 0 0 2TO-T/O 0 0 POC-CNU 1285 0 POC-A/M 0 15567
        Header Line Two ***-*** 0 0 ***-MBO 0 0 2TO-T/V 0 0 2TO-T/O 0 0 POC-CNU 1285 0 POC-A/M 0 15567
    Although you do not specify, I'll also assume that the header lines alternate between 'one' and 'two', and you just want to make sure that these don't repeat (that is, they continue to alternate).

    Here's what I'd try:
        #!/usr/bin/perl -w
        use strict;

        # variable to hold the previous header
        my ($header);

        # while (rather than foreach) reads one line at a time,
        # so $. holds the current line number in the error message
        while (<>) {
            if (/^Header Line (\S+)$/) {
                die "Error - header repeated in line $.\n"
                    if ($header && $header eq $1);
                $header = $1;
                next;
            }
            if (/^[\w\*\/]{3}-[\w\*\/]{3}/) {
                # do whatever processing on the line
                print "Got a matching line...\n";
            }
        }

    Hope that helps,
    Shendal

    Update: Darn. Looks like I got your data wrong, or ZZamboni did (grin). All the more reason to use CODE tags.
      You can see the intended format of a post by viewing the html source that your browser is attempting to display.
RE: Regular Expressions
by Anonymous Monk on Aug 15, 2000 at 17:31 UTC
    Thanks, guys, for the help and also the advice. ZZamboni, what you had was the correct format.
