PerlMonks  

Regular Expression Question

by MistaMuShu (Beadle)
on Jul 21, 2004 at 22:43 UTC

MistaMuShu has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,
Being completely new to Perl and regexps, I thought the best way to learn would be to get some practice. I thought I was getting the hang of it with simpler searches, so I tried to use it for a problem at work. Here's where I got stuck:
There is a series of folders, and each folder is named like so: "6 digit number" "optional ." "optional 2 digit number" "optional letter"

i.e. ######, ######.##, or ######.##a

Inside each folder is a file with the same name as the folder plus "_vml_1.htm", e.g. 274813.99a_vml_1.htm. Inside this file I want to find <foldername>_vml_1.emz and replace it with <foldername>_gif_1.gif

Here's a little test program
...
$folder = shift(@dir);    # @dir is a listing of folders
$folder =~ /(\d{6}.?\d{2}?\w?)/;
open FILE, ">./$folder/$1_vml.htm";
...
I realize I must be doing something really stupid when I try to match the foldername. But I can't figure out why the expression after \d{6} does not work.
Please help shed some light on this so I don't have nightmares ;-)
Thanks in advance!
Jerry

Replies are listed 'Best First'.
Re: Regular Expression Question
by Roy Johnson (Monsignor) on Jul 21, 2004 at 23:41 UTC
    Note that a question mark after a quantifier ({2}?) does not mean optional, but rather non-greedy. You need to put parens around the sub-expression to make it optional:
    /^(\d{6}\.?(?:\d{2})?\w?)$/;
    If the optionals aren't independent, you may need to nest them:
    /^(\d{6}            # leading digits
       (?:\.            # if dot, then
          (?:\d{2}      # if two digits, then
             \w?        # maybe a character
          )?
       )?
    )$/x;
    Note the use of anchors, to prevent matching part of the name.
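The difference between a non-greedy quantifier and an optional group is easiest to see side by side. The following is a minimal sketch, not code from the thread: the sample names and the helper `matches` are made up for illustration, and both patterns are anchored with ^ and $ so that partial matches are rejected.

```perl
use strict;
use warnings;

# Hypothetical folder names, for illustration only.
my @names = ('274813', '274813.99', '274813.99a', '27481399', '274813a');

# {2}? is a non-greedy "exactly two": the two digits are still required.
my $lazy = qr/^(\d{6}\.?\d{2}?\w?)$/;

# (?:\d{2})? groups the two digits and makes the whole group optional.
my $optional = qr/^(\d{6}\.?(?:\d{2})?\w?)$/;

# Return 1 if the string matches the pattern, 0 otherwise.
sub matches {
    my ($re, $s) = @_;
    return $s =~ $re ? 1 : 0;
}

for my $name (@names) {
    printf "%-12s lazy=%d optional=%d\n",
        $name, matches($lazy, $name), matches($optional, $name);
}
```

Running it shows which names each pattern accepts; a bare six-digit name is the case where the two patterns disagree most visibly.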

    We're not really tightening our belts, it just feels that way because we're getting fatter.
      Sorry, but when you say "use of anchors", you mean the ^ and $ signs, correct?

      Yeah, I guess I oversimplified the expression I wanted to search for, but this has been most helpful. Thanks.

Re: Regular Expression Question
by swkronenfeld (Hermit) on Jul 21, 2004 at 22:58 UTC
    The . (dot) in your regular expression will match any single character (except a newline, by default), not just a literal dot. You need to escape it, i.e.

    $folder =~ /(\d{6}\.?\d{2}?\w?)/;
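As a small illustration of the escaping point, here is a sketch comparing an unescaped and an escaped dot. The name '274813x99' and the helper `dot_check` are invented for the example, and the patterns are simplified to six digits, one separator, and two digits.

```perl
use strict;
use warnings;

my $unescaped = qr/^\d{6}.\d{2}$/;    # . matches any one character except newline
my $escaped   = qr/^\d{6}\.\d{2}$/;   # \. matches only a literal dot

# Return (unescaped_result, escaped_result) as 1/0 flags for a name.
sub dot_check {
    my ($name) = @_;
    return ($name =~ $unescaped ? 1 : 0, $name =~ $escaped ? 1 : 0);
}

for my $name ('274813.99', '274813x99') {
    my ($u, $e) = dot_check($name);
    print "$name: unescaped=$u escaped=$e\n";
}
```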


    Also, this regular expression matches names of the form ######## (8 digits without a dot) and, once the two digits are made properly optional, names of the form ######a (6 digits and the optional letter), as well as a few other forms that it sounds like you don't want to capture. Is the optional 2 digit number and letter dependent on whether there is a dot? If so, you are going to need a slightly more complex regular expression.

    One last question, why do you even have the regular expression in there? Can't you use $folder instead of $1 without loss of generality? You aren't doing a conditional based on whether or not the regexp matches, so I assume all folders are meant to be replaced in...
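If the folder name is used directly, the replacement described in the question might look like the sketch below. It works on a string rather than a real file so that it can be run as-is; the helper `fix_image_refs` and the sample HTML are assumptions, not part of the original program. \Q...\E quotes regex metacharacters, so the dot in the folder name is treated literally.

```perl
use strict;
use warnings;

# Replace "<folder>_vml_1.emz" with "<folder>_gif_1.gif" in a chunk of text.
sub fix_image_refs {
    my ($folder, $html) = @_;
    # \Q$folder\E escapes the dot (and anything else special) in the name.
    $html =~ s/\Q$folder\E_vml_1\.emz/${folder}_gif_1.gif/g;
    return $html;
}

my $folder = '274813.99a';    # example name taken from the question
my $html   = qq{<img src="274813.99a_files/274813.99a_vml_1.emz">};    # made-up sample
print fix_image_refs($folder, $html), "\n";
```

In a real script the same substitution would be applied to the contents of the _vml_1.htm file after reading it in, with the result written back out.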
      Can't you use $folder instead of $1 without loss of generality?

      Well, initially I was thinking that since the actual folder names are ######.##a_files (think html folders) that I would find the first part of the name and save it in $1

      Now that I look at it again with your replies, I see that it'd be a lot easier if I just negated "_files" and did /(^_files)/

      That was silly of me to not escape the dot :( And I'll have to read a bit more on the lookahead find because ?: still does not make perfect sense to me... Thanks!
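Two points from this follow-up can be checked with a short sketch. First, /(^_files)/ does not negate anything: outside a character class, ^ is an anchor, so that pattern captures a literal "_files" at the start of the string. Capturing everything before a trailing "_files" is one way to get the base name. Second, (?:...) is a non-capturing group, not a lookahead: it groups like (...) but does not fill $1, $2, and so on. The helpers `base_name` and `captures` are made up for illustration.

```perl
use strict;
use warnings;

# Return the part before a trailing "_files", or undef if the suffix is absent.
sub base_name {
    my ($dir) = @_;
    return $dir =~ /^(.+)_files$/ ? $1 : undef;
}

# Return the list of captured groups from matching $s against $re.
sub captures {
    my ($re, $s) = @_;
    my @c = $s =~ $re;
    return @c;
}

print base_name('274813.99a_files'), "\n";

# Same pattern twice; only the grouping style of the second part differs.
my @with_capture = captures(qr/^(\d{6})(\.\d{2})?$/,   '274813.99');
my @non_capture  = captures(qr/^(\d{6})(?:\.\d{2})?$/, '274813.99');
print scalar(@with_capture), " vs ", scalar(@non_capture), " captured groups\n";
```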
