PerlMonks  

A better regex

by williamp (Pilgrim)
on Sep 24, 2003 at 00:21 UTC

williamp has asked for the wisdom of the Perl Monks concerning the following question:

GDay Monks,

I have a regex which I have made work but I am sure you could do better.
Basically I am reading in two types of file name and want to
grab a string which is embedded in each name type.
My code is:

    my @files=`ls AFILE_*zip BFILE_*zip `;
    foreach my $zipfile (@files) {
        my $day = undef;
        $zipfile =~ m/AFILE_(\d+)/;
        $zipfile =~ m/BFILE_(\d+)/;
        $day = $1;
    }

Thank you for your help.

Replies are listed 'Best First'.
Re: A better regex
by mildside (Friar) on Sep 24, 2003 at 01:35 UTC
    Your regex question has been answered, but I noticed you used:
    my @files=`ls AFILE_*zip BFILE_*zip `;
    Since ls is dependent on your operating system, a more general way to do this would be:

        my @files = glob('AFILE_*zip BFILE_*zip');

    Cheers!

      This also works:

          my @files = glob( '{A,B}FILE_*zip' );

      or

          my @files = <{A,B}FILE_*zip>;

      Tiago
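For anyone who wants to try the glob suggestions above, here is a minimal sketch. It assumes a scratch directory populated with made-up file names (AFILE_20030923.zip and friends are hypothetical, as is the helper name zip_files_in), and uses only the core glob function plus File::Temp and Cwd.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Create a throwaway directory holding a few hypothetical file names.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(AFILE_20030923.zip BFILE_20030924.zip CFILE_20030925.zip notes.txt)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $fh;
}

# Hypothetical helper: list the A/B zip files in a directory without calling ls.
sub zip_files_in {
    my ($where) = @_;
    my $cwd = getcwd();
    chdir $where or die "Cannot chdir to $where: $!";
    # Brace expansion covers both prefixes in a single pattern.
    my @files = glob('{A,B}FILE_*zip');
    chdir $cwd or die "Cannot chdir back to $cwd: $!";
    return @files;
}

print "$_\n" for zip_files_in($dir);
```

Unlike the backtick ls version, the names returned by glob carry no trailing newline, so no chomp is needed before using them.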
Re: A better regex
by kvale (Monsignor) on Sep 24, 2003 at 00:30 UTC
    Try the regex
    $day = $1 if $zipfile =~ m/[AB]FILE_(\d+)/;

    -Mark

Re: A better regex
by thens (Scribe) on Sep 24, 2003 at 08:36 UTC
    Others have already suggested better regexes, so I am not going to ponder over which one is best. Read the following rule, and then browse through the code that you and others have posted.

    You should always check that a regex match succeeded before using the values captured in $1..$n.

    If the match fails, $1 is either undefined or still holds a value left over from an earlier successful match. Always check for success before you assign the captured values.

    Your code should look like

    if ( $zipfile =~ m/[AB]FILE_(\d+)/ ) {
        $day = $1;
    }
    else {
        # oops.. the file name is not what we expected ..
        # panic !!
    }

    Hope this helps

    -T
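The rule above can be sketched as a small helper. The sub name day_from_name and the sample file names are made up for illustration; the regex is the one from this thread.

```perl
use strict;
use warnings;

# Return the digits after AFILE_ or BFILE_, or undef if the name does not fit.
sub day_from_name {
    my ($zipfile) = @_;
    if ( $zipfile =~ m/[AB]FILE_(\d+)/ ) {
        return $1;    # $1 is only read after a successful match
    }
    return;           # undef in scalar context, empty list in list context
}

# Hypothetical names; the last one does not match the expected pattern.
for my $zipfile (qw(AFILE_20030923.zip BFILE_20030924.zip AFILE_foo.zip)) {
    my $day = day_from_name($zipfile);
    if ( defined $day ) {
        print "$zipfile => $day\n";
    }
    else {
        warn "Unexpected file name: $zipfile\n";
    }
}
```

Whether to warn, die, or silently skip an unexpected name depends on the job; the point is only that the failure branch exists.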

Re: A better regex
by Roger (Parson) on Sep 24, 2003 at 00:59 UTC
    Hi William, a more generalized solution -
    my @files=`ls Some_*zip This_*zip`;
    foreach (@files) {
        my ($day) = /(?:Some|This)_(\d+)/;
        ....
    }
    Use (?:...) to group without capturing, so Some or This is not stored in a capture variable; this makes the regex a little more efficient.
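To see what the non-capturing group changes, here is a small comparison. The file name is hypothetical; the two patterns differ only in (Some|This) versus (?:Some|This).

```perl
use strict;
use warnings;

my $name = 'Some_20030923.zip';    # hypothetical file name

# Capturing group: the prefix lands in $1 and the digits in $2.
my @with_capture = $name =~ /(Some|This)_(\d+)/;

# Non-capturing group: only the digits are captured, so they land in $1.
my @without_capture = $name =~ /(?:Some|This)_(\d+)/;

print scalar(@with_capture),    " captures: @with_capture\n";
print scalar(@without_capture), " capture: @without_capture\n";
```

Apart from any speed difference, the non-capturing form keeps the digits in the first capture no matter how many alternatives are added to the prefix.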
Re: A better regex
by Abigail-II (Bishop) on Sep 24, 2003 at 08:24 UTC
    foreach my $zipfile (@files) {
        my $day = undef;
        $zipfile =~ m/AFILE_(\d+)/;
        $zipfile =~ m/BFILE_(\d+)/;
        $day = $1;
    }

    That code isn't correct. If @files contains for instance "AFILE_foo.zip", neither of the regexes matches, and $1 will still be the result of the previous successful match.

    Abigail

Re: A better regex
by injunjoel (Priest) on Sep 24, 2003 at 00:32 UTC
    Here is one possibility:
    $zipfile =~ /(A|B)FILE_([0-9]+)/; $day = $2;
    However, if you are just looking for digits, try:
    $zipfile =~ /([0-9]+)[^0-9]?/; $day = $1;

      Why capture when you don't have to?

      $zipfile =~ /(?:A|B)FILE_(\d+)/; $day = $1;

      Or you can use a character class (since the strings are 1 char long).

      $zipfile =~ /[AB]FILE_(\d+)/; $day = $1;

      Of course, you could always use the uber-sexy:

      ($day) = $zipfile =~ /[AB]FILE_(\d+)/;

      ;-P

      Anonymously yours,
      Anonymous Nun
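The list-assignment form also copes with names that do not match, because a failed match in list context returns an empty list. A minimal sketch, with hypothetical file names and a made-up %day_for hash to collect the results:

```perl
use strict;
use warnings;

# Hypothetical names; the last one has no digits after the prefix.
my @files = qw(AFILE_20030923.zip BFILE_20030924.zip AFILE_foo.zip);

my %day_for;
for my $zipfile (@files) {
    # On a failed match the right-hand side is an empty list, so $day is
    # undef rather than a leftover from an earlier match.
    my ($day) = $zipfile =~ /[AB]FILE_(\d+)/;
    next unless defined $day;
    $day_for{$zipfile} = $day;
}

print "$_ => $day_for{$_}\n" for sort keys %day_for;
```

Here the non-matching name is simply skipped; a warn or die in place of next would make the failure visible.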

Re: A better regex
by chanio (Priest) on Sep 25, 2003 at 05:08 UTC
    my @files=`ls AFILE_*zip BFILE_*zip `;
    foreach my $zipfile (@files) {
        my $day = undef;
        $zipfile =~ m/AFILE_(\d+)/;
        $zipfile =~ m/BFILE_($1)/;
        $day = $1;
    }

Node Type: perlquestion [id://293749]
Approved by jdtoronto