Re: regular expressions
by brian_d_foy (Abbot) on Mar 01, 2005 at 08:45 UTC
I think you want the regex I show in this little program. It starts with an "a", has one or more non-newline characters, then uses the alternation (f|g) to denote that the whole thing can end with either of those characters.
#!/usr/bin/perl
while( <DATA> )
{
chomp;
print "$_ matches!\n" if /^a.+(f|g)$/;
}
__DATA__
axxxi
bxg
cxxf
axxxh
axxxg
axxxf
--
brian d foy <bdfoy@cpan.org>
|
Slightly picky, but I'd use a character class to match the f or g. I'm pretty certain it'd be quicker, as perl isn't jumping through hoops to catch $1.
print "$_ matches!\n" if /^a.+[fg]$/;
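The speed claim can be measured rather than guessed. The following is a minimal sketch using the core Benchmark module; the sample lines and the sub names count_capture and count_class are made up for illustration, and the timings will vary by machine and perl version.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample data (not from the thread): 50 lines that
# should match and 50 that should not.
my @lines = map { ("a" . "x" x $_ . "f", "a" . "x" x $_ . "h") } 1 .. 50;

# Count matching lines using the capturing alternation.
sub count_capture {
    my $n = 0;
    for my $line (@lines) {
        $n++ if $line =~ /^a.+(f|g)$/;
    }
    return $n;
}

# Count matching lines using the character class.
sub count_class {
    my $n = 0;
    for my $line (@lines) {
        $n++ if $line =~ /^a.+[fg]$/;
    }
    return $n;
}

# Confirm that both patterns accept the same number of lines.
print "capture: ", count_capture(), "  class: ", count_class(), "\n";

# Run each sub for roughly one CPU second and compare the rates.
cmpthese(-1, {
    capture => \&count_capture,
    class   => \&count_class,
});
```

Both subs should report the same count; only the rate table tells you whether the difference matters for your data.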
|
Hi, it's me again. What I'm really trying to do is read each line of a file and, if the word on the line begins with "a", has one or more characters after it, and ends with "f" or "g" (image files like amm.gif or ammre.jpg), store that word in an array, @amourss.
#!usr/bin/perl
# 2005-03-01 :
# open the file containing the list of cards
open(CARTES, "imagescartes.txt") or die "Ouverture du ficher imagescartes.txt impossible: $!\n";
-T "imagescartes.txt" or print "ceci n'est pas un fichier texte\n";
while ($ligne = <CARTES>)
{
chop($ligne);
while ($amourss .= /^a.+(f|g)$/g )
{
$total++;
}
}
|
If you are looking for particular filenames, I would expand the regular expression. I'd specify the file extension as much as possible, including the literal full stop that separates the name and extension. The /i flag is sometimes a good idea since some things like to make everything upper case.
/^a.*\.(jpg|gif)$/i
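To see what the literal full stop and the /i flag change, here is a small sketch. The file names and the helper image_names are hypothetical, and the group is written as non-capturing (?:...) because nothing here needs $1.

```perl
use strict;
use warnings;

# Hypothetical file names, chosen to exercise each part of the pattern.
my @names = ('amm.gif', 'AMMRE.JPG', 'a.jpg', 'agif', 'amm.gif.bak', 'bmm.gif');

# Return the names that start with "a" and end in .jpg or .gif,
# ignoring case.
sub image_names {
    return grep { /^a.*\.(?:jpg|gif)$/i } @_;
}

print "$_\n" for image_names(@names);
```

With these names the sketch prints amm.gif, AMMRE.JPG and a.jpg. The name agif is rejected because the dot is required, and amm.gif.bak is rejected because the extension must come last.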
--
brian d foy <bdfoy@cpan.org>
|
#!/usr/bin/perl
# 2005-03-01:
# Please always use strict and warnings, they will capture
# many common mistakes and typos
use strict;
use warnings;
# we may as well test it before we try to open it
unless (-T 'imagescartes.txt') {
print "ceci n'est pas un fichier texte\n";
exit 1;
}
# open the file containing the list of cards
# (the die message is double-quoted so that $! and \n are interpolated)
open CARTES, '<', 'imagescartes.txt'
or die "Ouverture du ficher imagescartes.txt impossible: $!\n";
# this is the array where we store the image file names
my @amourss;
while (my $ligne = <CARTES>)
{
chomp $ligne ;
# Matches exactly one word beginning with "a" and ending in .gif or .jpg.
# The \S+ is one or more non-space characters, so a.gif is not allowed;
# use \S* to allow it.
if ($ligne =~ /^a\S+\.(?:gif|jpg)$/)
{
push @amourss, $ligne;
}
}
close CARTES;
print "found ", scalar @amourss, " images: ";
print join ", ", @amourss;
print "\n";
__END__
# input file used
test.gif
this asilly.gif
abc.jpg
abcd.gif
absolutely not a.jpg
# results of running
>./amourss
found 2 images: abc.jpg, abcd.gif
>
Update
As Jasper points out, a space is valid in filenames on most OSes, so please feel free to change the regex to /^a.+\.(?:gif|jpg)$/ if you wish to allow spaces in filenames, or to /^a.*\.(?:gif|jpg)$/ if you also wish to allow a.gif and a.jpg.
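The three variants can be compared side by side. This is only a sketch: the labels and the helper hits are invented for the comparison, and the names come from the sample input above plus a.gif.

```perl
use strict;
use warnings;

# The three patterns discussed in this post, keyed by a short label.
my %variant = (
    'no spaces'      => qr/^a\S+\.(?:gif|jpg)$/,
    'spaces allowed' => qr/^a.+\.(?:gif|jpg)$/,
    'bare a allowed' => qr/^a.*\.(?:gif|jpg)$/,
);

# File names from the sample input, plus a.gif.
my @names = ('abc.jpg', 'absolutely not a.jpg', 'a.gif', 'test.gif');

# Return the names accepted by the given compiled pattern.
sub hits {
    my ($re, @candidates) = @_;
    return grep { $_ =~ $re } @candidates;
}

for my $label (sort keys %variant) {
    print "$label: ", join(', ', hits($variant{$label}, @names)), "\n";
}
```

Only the .* form accepts a.gif, and only the two dot-based forms accept the name containing spaces.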
Cheers, R.
Pereant, qui ante nos nostra dixerunt!
|
|
Re: regular expressions
by saintmike (Vicar) on Mar 01, 2005 at 08:49 UTC
Hmmm ... seems like some of the choices of operators and regular expressions are somewhat arbitrary. Did you guess :) ? Couple of hints:
Re: regular expressions
by tirwhan (Abbot) on Mar 01, 2005 at 09:00 UTC
Appending the matched text like that won't work; you need to do it in two steps:
/<regex>/;
$amourss.=$1;
This is what I'm guessing you want to do with your regular expression (with comments):
/^ # Match the start of the string
a # Single occurrence of the character a
(.+) # Grab one or more other characters
# and put into $1
[fg] # a single character, either "f" or "g"
$ # match the end of the string
/x # x modifier to allow comments in regex
So your complete code would look something like this:
/^a(.+)[fg]$/;
$amourss.=$1;
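One caveat with the two-step form: $1 should only be used after a match that succeeded. A minimal sketch of a guarded version follows; the sub name collect_middles and the sample strings are hypothetical.

```perl
use strict;
use warnings;

# Append the text between the leading "a" and the trailing f or g of
# each matching line. Lines that do not match contribute nothing.
sub collect_middles {
    my $amourss = '';
    for my $line (@_) {
        # Only read $1 when this match succeeded; after a failed match
        # it may still hold the capture from an earlier success.
        if ($line =~ /^a(.+)[fg]$/) {
            $amourss .= $1;
        }
    }
    return $amourss;
}

print collect_middles('axxxg', 'bxg', 'ayyf'), "\n";
```

Here the second string does not match, so only "xxx" and "yy" are appended.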
Re: regular expressions
by TedPride (Priest) on Mar 01, 2005 at 08:46 UTC
if ($amourss =~ /^a.+[fg]$/) { }
or
if ($amourss =~ /^a.+(?:f|g)$/) { }
Assuming you're doing a test and not trying to return part of the match.
EDIT: The example given above is inefficient because it uses (f|g) instead of (?:f|g), requiring the regex to return this part of the match as $1 when not needed.
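The difference between the two kinds of group is easy to observe. In this sketch the test string is made up; the point is only what ends up in $1.

```perl
use strict;
use warnings;

my $name = 'axxxg';    # hypothetical test string

# Capturing group: the final letter is saved in $1.
if ($name =~ /^a.+(f|g)$/) {
    print "capturing: got '$1'\n";
}

# Non-capturing group: the same strings match, but nothing is saved.
if ($name =~ /^a.+(?:f|g)$/) {
    print defined $1 ? "non-capturing: got '$1'\n"
                     : "non-capturing: nothing captured\n";
}
```

If all you need is a yes or no answer, the non-capturing group or the character class [fg] avoids saving text you never use.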
Re: regular expressions
by inman (Curate) on Mar 01, 2005 at 09:04 UTC
The following code shows some variations on what you want. It matches and captures a string containing one or more non-whitespace characters (represented by \S). The various options show the difference between greedy and non-greedy matching, as well as anchoring on word boundaries.
#! /usr/bin/perl -w
use strict;
use warnings;
my $data = "abcdefghijklmnopFqrstuvwxyz arrrrhhhhg!";
# Greedily match as much non-whitespace (\S+) as we can that
# starts with an a and ends with an f or g
print "Greedily matched $1\n" if $data =~ /(a\S+(f|g))/i;
# Add the ? modifier to make the expression match minimally
print "Minimally matched $1\n" if $data =~ /(a\S+?(f|g))/i;
# Anchor each end of the expression to a word boundary (\b)
# so that we only match words that start with an a and end with
# an f or g
print "Word matched $1\n" if $data =~ /(\ba\S+(f|g)\b)/;
# Apply the same regular expression to pick out all of the
# matches in the data
while ($data =~ /(a\S+?(f|g))/ig)
{
print "Word: $1\n";
}