regular expressions

by wannabeboy (Novice)
on Mar 01, 2005 at 08:28 UTC ( #435314=perlquestion )

wannabeboy has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,
I have this code and I want to know if it will do the right thing: search for a motif that begins with "a", contains 1 or more characters, then ends with f or g?
$amourss .= /^a(.)+{f,g}$/;
Thanks in advance

Replies are listed 'Best First'.
Re: regular expressions
by brian_d_foy (Abbot) on Mar 01, 2005 at 08:45 UTC

    I think you want the regex I show in this little program. It starts with an "a", has one or more non-newline characters, then uses the alternation (f|g) to denote that the whole thing can end with either of those characters.

    #!/usr/bin/perl
    while( <DATA> ) {
        chomp;
        print "$_ matches!\n" if /^a.+(f|g)$/;
    }
    __DATA__
    axxxi
    bxg
    cxxf
    axxxh
    axxxg
    axxxf

    brian d foy
      Slightly picky, but I'd use a character class to match the f or g. I'm pretty certain it'd be quicker, as perl isn't jumping through hoops to catch $1.
      print "$_ matches!\n" if /^a.+[fg]$/;
      Hi, it's me again. This is really what I'm trying to do: read each line of a file, and if the word on the line begins with a, has 1 or more characters, then ends with f or g (image files like amm.gif or ammre.jpg), put the word into an array @amourss.

      #!usr/bin/perl
      # 2005-03-01 :
      # open the file containing the list of cards
      open(CARTES, "imagescartes.txt") or die "Ouverture du ficher imagescartes.txt impossible: $!\n";
      -T "imagescartes.txt" or print "ceci n'est pas un fichier texte\n";
      while ($ligne = <CARTES>) {
          chop($ligne);
          while ($amourss .= /^a.+(f|g)$/g ) {
              $total++;
          }
      }

        If you are looking for particular filenames, I would expand the regular expression. I'd specify the file extension as much as possible, including the literal full stop that separates the name and extension. The /i flag is sometimes a good idea since some things like to make everything upper case.
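
A minimal sketch of that advice, assuming the extensions of interest are gif and jpg. The names in @names are made up for illustration and are not from the original post:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample names, chosen only to exercise the pattern
my @names = ('amm.gif', 'AMMRE.JPG', 'amour.txt', 'bmm.gif', 'agif');

# \. matches a literal full stop between name and extension;
# /i makes the whole match case-insensitive
my @images = grep { /^a.+\.(?:gif|jpg)$/i } @names;

print "$_\n" for @images;
```

Here only amm.gif and AMMRE.JPG are kept: amour.txt has the wrong extension, bmm.gif does not start with a, and agif has no full stop before the extension.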

        brian d foy

        I think this will do what you want.

        #!/usr/bin/perl
        # 2005-03-01:
        # Please always use strict and warnings, they will capture
        # many common mistakes and typos
        use strict;
        use warnings;

        # we may as well test it before we try to open it
        unless (-T 'imagescartes.txt') {
            print "ceci n'est pas un fichier texte\n";
            exit 1;
        }

        # open the file containing the list of cards
        # (double quotes so that $! and \n are interpolated)
        open CARTES, '<', 'imagescartes.txt'
            or die "Ouverture du ficher imagescartes.txt impossible: $!\n";

        # this is the array where we store the image file names
        my @amourss;
        while (my $ligne = <CARTES>) {
            chomp $ligne;
            # will match exactly one word
            # beginning with a and ending in .gif or .jpg
            # the \S+ is one or more non-space chars
            # a.gif not allowed, use \S* to allow it
            if ($ligne =~ /^a\S+\.(?:gif|jpg)$/) {
                push @amourss, $ligne;
            }
        }
        close CARTES;

        print "found ", scalar @amourss, " images: ";
        print join ", ", @amourss;
        print "\n";
        __END__
        # input file used
        test.gif
        this asilly.gif
        abc.jpg
        abcd.gif
        absolutely not a.jpg

        # results of running
        >./amourss
        found 2 images: abc.jpg, abcd.gif
        >


        As Jasper points out, a space is valid in filenames on most OSes, so please feel free to change the regex to /^a.+\.(?:gif|jpg)$/ if you wish to allow spaces in filenames, or to /^a.*\.(?:gif|jpg)$/ if you also wish to allow a.gif and a.jpg.
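
To see how the three variants differ, here is a small sketch that runs each pattern over the same list. The list in @lines is an assumption: it reuses the sample names from the reply above, treats the names with spaces as single lines, and adds a.gif:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed sample input (one filename per element)
my @lines = ('test.gif', 'this asilly.gif', 'abc.jpg', 'abcd.gif',
             'absolutely not a.jpg', 'a.gif');

# The three patterns discussed in this thread, with illustrative labels
my @patterns = (
    [ 'no spaces'      => qr/^a\S+\.(?:gif|jpg)$/ ],
    [ 'spaces allowed' => qr/^a.+\.(?:gif|jpg)$/  ],
    [ 'a.gif allowed'  => qr/^a.*\.(?:gif|jpg)$/  ],
);

my %matched;
for my $p (@patterns) {
    my ($label, $re) = @$p;
    # keep every line the current pattern accepts
    $matched{$label} = [ grep { $_ =~ $re } @lines ];
    print "$label: ", join(', ', @{ $matched{$label} }), "\n";
}
```

With this input, \S+ accepts only abc.jpg and abcd.gif, .+ additionally accepts "absolutely not a.jpg", and .* also accepts a.gif.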


        Pereant, qui ante nos nostra dixerunt!
Re: regular expressions
by saintmike (Vicar) on Mar 01, 2005 at 08:49 UTC
    Hmmm ... seems like some of the choices of operators and regular expressions are somewhat arbitrary. Did you guess :) ? A couple of hints:
    • The operator to match a string with regular expression is =~, not .=
    • Capturing a single dot in parentheses ((.)) doesn't really make sense unless you want to capture the very character this dot matches and save it in a variable.
    • To match either f or g, use a character class: [fg]
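
The first hint can be illustrated with a short sketch. The variable names and sample strings are made up for this example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $word = 'abcdef';

# =~ binds the regex to $word; the match is true or false
my $ok = ($word =~ /^a.+[fg]$/) ? 1 : 0;

# .= does not bind anything: the bare regex is matched against $_,
# and its result (1 on success) is appended to the variable
$_ = 'axxg';
my $appended = 'abcdef';
$appended .= /^a.+[fg]$/;

print "ok=$ok appended=$appended\n";
```

The second assignment never tests $appended at all; it only glues the outcome of a match on $_ onto the end of the string.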

      The question remains where you got those ideas from ...

Re: regular expressions
by tirwhan (Abbot) on Mar 01, 2005 at 09:00 UTC
    Appending the matched text like that won't work; you need to do it in two steps:
    /<regex>/; $amourss.=$1;
    This is what I'm guessing you want to do with your regular expression (with comments):
    /^        # Match the start of the string
      a       # Single occurrence of the character a
      (.+)    # Grab one or more other characters
              # and put into $1
      [fg]    # a single character, either "f" or "g"
      $       # match the end of the string
    /x        # x modifier to allow comments in regex
    So your complete code would look something like this:
    if (/^a(.+)[fg]$/) {
        $amourss .= $1;    # only append when the match succeeded
    }
Re: regular expressions
by TedPride (Priest) on Mar 01, 2005 at 08:46 UTC
    You want
    if ($amourss =~ /^a.+[fg]$/) { }
    if ($amourss =~ /^a.+(?:f|g)$/) { }
    Assuming you're doing a test and not trying to return part of the motif. EDIT: The example given above is inefficient because it uses (f|g) instead of (?:f|g), requiring the regex to return this part of the match as $1 when not needed.
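
A small sketch of what capturing means in practice, using one of the sample words from earlier in the thread. It shows only what each form returns in list context and makes no claim about speed:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $word = 'axxxg';

# With a capturing group, a list-context match returns the captures
my @captured = $word =~ /^a.+(f|g)$/;

# With (?:...) or a character class there are no groups,
# so a successful match just returns the single value 1
my @non_captured = $word =~ /^a.+(?:f|g)$/;
my @char_class   = $word =~ /^a.+[fg]$/;

print "captured: @captured\n";
print "non-capturing: @non_captured\n";
print "character class: @char_class\n";
```

All three patterns accept the same strings; the difference is only whether the final letter is saved as $1.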
Re: regular expressions
by inman (Curate) on Mar 01, 2005 at 09:04 UTC
    The following code shows some variations on what you want. It matches and captures a string containing one or more non-whitespace characters (represented by \S). The various options show the difference between greedy and non-greedy matching as well as anchoring on word boundaries.

    #! /usr/bin/perl -w
    use strict;
    use warnings;

    my $data = "abcdefghijklmnopFqrstuvwxyz arrrrhhhhg!";

    # Greedily match as much non-whitespace (\S+) as we can that
    # starts with an a and ends with an f or g
    print "Greedily matched $1\n" if $data =~ /(a\S+(f|g))/i;

    # Add the ? modifier to make the expression match minimally
    print "Minimally matched $1\n" if $data =~ /(a\S+?(f|g))/i;

    # Anchor each end of the expression to a word boundary (\b)
    # so that we only match words that start with an a and end with an f or g
    print "Word matched $1\n" if $data =~ /(\ba\S+(f|g)\b)/;

    # Apply the same regular expression to pick out all of the
    # matches in the data
    while ($data =~ /(a\S+?(f|g))/ig) {
        print "Word: $1\n";
    }
Node Type: perlquestion [id://435314]
Approved by TedPride