Re: Help!! Regular Expressions
by Corion (Patriarch) on Apr 12, 2001 at 12:03 UTC
|
Since you have already read MRE, Death to Dot Star! should be clear to you. If it's not, read it and then go back to MRE :-).
The Owl book (Mastering Regular Expressions) has some quite good hints on this, and it also demonstrates an interesting technique for avoiding .* and its lazy version .*?. Here's what I came up with after reading through my copy of MRE. Note that my solutions differ from physi's solution in that they take the shortest possible match, while physi's solution always takes the longest possible text between :::.
#!/usr/bin/perl -w
use strict;
# Slurp the input into $data
my $data;
{
    local $/;
    $data = <DATA>;
};
# This is the naive way, using the non-greedy .*?
if ($data =~ /:::(.*?):::/ms) {
    print "$1\n";
} else {
    print "No match.\n";
};
# This is the "perfect" way (should be described in the Owl book
# somewhere). It's much more specific about what it wants, and
# thus longer and more complex :-)
if ($data =~ /:::            # start
              ([^:]*         # As many non-: as we can gobble
                (?:
                  ::?[^:]+   # and then one or two :'s as long as they are
                )*           # followed by something non-:
              )
              :::            # end
             /msx            # And we want to match spanning lines
                             # and use eXtended re syntax
   ) {
    print "$1\n";
} else {
    print "No match.\n";
};
__DATA__
Some foo:::This is a some text. Today you watered the dog and
::test:
bathed the plants. The server asked you what permission you
had to tell it what to do on it's day off. This was your day.:::
More foo.
|
Small point, but since you don't use . in your RE, you don't
actually need the s modifier. You also don't use ^ or $ to match the
beginning or end of a line, so you don't need the m modifier. Not that it matters much.
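A quick way to check this point is to run the same pattern with and without the modifiers. This is only a sketch: the sub names and the sample string are made up for illustration and are not from the thread.

```perl
use strict;
use warnings;

# The pattern from the post above, once without modifiers and once with /ms.
sub capture_plain {
    my ($text) = @_;
    return $text =~ /:::([^:]*(?:::?[^:]+)*):::/ ? $1 : undef;
}

sub capture_ms {
    my ($text) = @_;
    return $text =~ /:::([^:]*(?:::?[^:]+)*):::/ms ? $1 : undef;
}

# Hypothetical sample: the delimited text spans two lines and has stray colons.
my $sample = "foo:::line one\n::x: line two:::bar";

my $plain = capture_plain($sample);
my $ms    = capture_ms($sample);
print "plain: [$plain]\n";
print "ms:    [$ms]\n";
print "identical\n" if $plain eq $ms;
```

A negated character class such as [^:] matches newlines regardless of /s, which is why both versions capture the same text.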
- Ant
Re: Help!! Regular Expressions
by dws (Chancellor) on Apr 12, 2001 at 11:44 UTC
|
Sounds like you've put in a bit of work on figuring this out, so I won't cheat you out of that burst of pleasure that comes from solving a problem yourself. (I.e., a hint is all you're gonna get.)
Take a look at perlre, and note which of the regular expression "modifiers" change what '.' will match.
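For anyone who wants to experiment with the hint, here is a small sketch that tries the same pattern with and without the s modifier on text containing a newline. The sub names and the sample string are assumptions for illustration only.

```perl
use strict;
use warnings;

# Without /s, '.' does not match a newline.
sub match_without_s {
    my ($text) = @_;
    return $text =~ /:::(.*?):::/ ? $1 : undef;
}

# With /s, '.' matches any character, newline included.
sub match_with_s {
    my ($text) = @_;
    return $text =~ /:::(.*?):::/s ? $1 : undef;
}

my $sample = "a:::one\ntwo:::b";    # hypothetical two-line input

for my $case ( [ 'without /s', match_without_s($sample) ],
               [ 'with /s',    match_with_s($sample) ] ) {
    my ($label, $result) = @$case;
    print "$label: ", ( defined $result ? "[$result]" : "no match" ), "\n";
}
```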
Re: Help!! Regular Expressions
by deprecated (Priest) on Apr 12, 2001 at 11:49 UTC
|
Mastering Regular Expressions is indeed a good book, but I've heard it called dated by quite a few people. Probably what you want to do in this circumstance is something that you should rarely do, and that is tinker with $/. In this case, I suspect this shall suffice:
my $lineterm = $/;
$/ = '';
my $thingy = $_ =~ /:::(.*):::/;
$/ = $lineterm;
I don't believe Perl's regexes behave like those of sed(1) or awk(1), in that they require flags and modifiers to catch newlines. More information on the tricky $/ is of course in perlvar.
and of course you should read that book cover to cover. it is extremely helpful. I learned a great deal from that book.
brother dep.
update: me and my itchy trigger finger. well i just spoke to dws in the chatterbox, and while i wasnt able to test this out, he was. apparently the re engine (at least as recently as 5.6) does not care what $/ is set to for the end-of-line character. which means this node isnt quite worthless because, yay, it taught me something. *grumble*
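The point of the update can be checked with a short sketch. It assumes a hypothetical helper, match_under_separator, that localizes $/ to a given value and then runs the same greedy match; the sample text is made up as well.

```perl
use strict;
use warnings;

# Run the same match while $/ holds a caller-supplied value.
sub match_under_separator {
    my ($text, $separator) = @_;
    local $/ = $separator;    # affects readline and chomp, not the regex engine
    return $text =~ /:::(.*):::/ ? $1 : undef;
}

my $multi_line = "a:::one\ntwo:::b";    # hypothetical sample with a newline

for my $separator ("\n", '', undef) {
    my $label  = defined $separator ? ( $separator eq '' ? "''" : '"\n"' ) : 'undef';
    my $result = match_under_separator($multi_line, $separator);
    print "\$/ = $label: ", ( defined $result ? "matched" : "no match" ), "\n";
}
```

Whatever $/ is set to, the match fails on this input, because without the s modifier the dot will not cross the newline.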
--
Laziness, Impatience, Hubris, and Generosity.
|
It may not have the latest RE stuff... but it really
covers the basics, which, really, isn't that what most people
need? Granted a new chapter on perl would be nice, but a lot
of the stuff in that book won't be dated until real changes
are made to the core of Regular Expressions
- Ant
Re: Help!! Regular Expressions
by bjelli (Pilgrim) on Apr 12, 2001 at 14:13 UTC
|
Another solution, without regular expressions.
I assumed that you're reading from a file, and that
you want to read in more than one lump of data:
#!/usr/bin/perl -w
use strict;
{
    local $/ = ":::";   # read stuff separated by :::
    while (<DATA>) {
        s/:::$//;       # remove the ::: at the end of each lump of data
        print "found a piece -->$_<--\n\n";
    }
}
__DATA__
Some foo:::This is a some text. It talks about CGI::Carp, but
then suddenly changes subject to matching three :'s in a row
:::
The server asked you what permission you
had to tell it what to do on it's day off. This was your day.:::
More foo.
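A possible variant of the loop above, as a sketch: since chomp removes whatever $/ currently holds, it can stand in for the substitution. The read_chunks sub and the in-memory filehandle are assumptions made to keep the example self-contained; they are not part of the original script.

```perl
use strict;
use warnings;

# Split text into records separated by ':::' using the input record separator.
sub read_chunks {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    local $/ = ':::';
    my @chunks;
    while ( my $chunk = <$fh> ) {
        chomp $chunk;    # strips a trailing ':::' because $/ is ':::'
        push @chunks, $chunk;
    }
    close $fh;
    return @chunks;
}

# Hypothetical sample: a double colon inside a record is left alone.
print "found a piece -->$_<--\n" for read_chunks("Some foo:::CGI::Carp text:::More foo.");
```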
hope that helps
--
Brigitte 'I never met a chocolate I didnt like' Jellinek
http://www.horus.com/~bjelli/ http://perlwelt.horus.at
Re: Help!! Regular Expressions
by Caillte (Friar) on Apr 12, 2001 at 15:26 UTC
|
Proving TIMTOWTDI, and stirring the pot some more, how about this one? I used the same data as Corion so they can be easily compared.
$data = <DATA>;
(@array) = split /:::/, $data;
print join "\n", @array;
__DATA__
Some foo:::This is a some text. Today you watered the dog and ::test: bathed the plants. The server asked you what permission you had to tell it what to do on it's day off. This was your day.::: More foo.
This gives:
Some foo
This is a some text. Today you watered the dog and ::test: bathed the plants. The server asked you what permission you had to tell it what to do on it's day off. This was your day.
More foo.
With a little more sophistication you could quite easily change how the data is read.
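One thing to keep in mind with the snippet above is that a plain scalar read from DATA returns a single line, so it relies on the data sitting on one line. A minimal sketch of the same split applied to text that spans several lines follows; the fields sub and the sample string are hypothetical.

```perl
use strict;
use warnings;

# Split on the ':::' delimiter; newlines inside a field are preserved.
sub fields {
    my ($text) = @_;
    return split /:::/, $text;
}

# Hypothetical multi-line input, as it would look after slurping a file.
my $text  = "Some foo:::first line\nsecond line:::More foo.";
my @parts = fields($text);

print "number of fields: ", scalar(@parts), "\n";
print "delimited text:\n$parts[1]\n";
```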
$japh->{'Caillte'} = $me;
Re: Help!! Regular Expressions
by physi (Friar) on Apr 12, 2001 at 11:54 UTC
|
Hmm what do you get with your regexp ?
I guess only the first row ?
Is the text in a file ?
If so, try to set $/=undef:
$/=undef;
$t=<FILEHANDLE>;
$t =~ s/^:::(.*):::/$1/;
print $t;
This should work, if your memory is big enough to read the whole file in one $t.
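Two details are worth testing against your own data: ^ only matches if the slurped text actually begins with :::, and without the s modifier the dot stops at a newline. The sketch below drops the anchor, adds /s, and compares the greedy .* with the lazy .*?. The sub names and the sample text are assumptions for illustration.

```perl
use strict;
use warnings;

# Greedy: runs to the LAST ':::' in the text.
sub longest {
    my ($text) = @_;
    return $text =~ /:::(.*):::/s ? $1 : undef;
}

# Lazy: stops at the FIRST ':::' after the opening one.
sub shortest {
    my ($text) = @_;
    return $text =~ /:::(.*?):::/s ? $1 : undef;
}

my $sample = "x:::a\nb:::c:::d";    # hypothetical input with three delimiters

print "longest:  [", longest($sample),  "]\n";
print "shortest: [", shortest($sample), "]\n";
```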
-----------------------------------
--the good, the bad and the physi--
-----------------------------------
Re: Help!! Regular Expressions
by Rhandom (Curate) on Apr 12, 2001 at 20:38 UTC
|
I may be oversimplifying things, but this should do:
my ($thingy) = $_ =~ /:::(.*?):::/s;
There is no /g modifier on it, so it will get only the first one.
If by some chance there was more than one that you wanted, I would go with
my (@thingies) = $_ =~ /:::(.*?)(?=:::)/sg;
# ?= allows for it to start at the ::: on the next time through
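To see what the lookahead buys you, here is a sketch comparing it with a version that consumes the closing delimiter. The sub names and the sample string are made up for the comparison.

```perl
use strict;
use warnings;

# Lookahead version: the closing ':::' is not consumed, so it can
# also serve as the opening ':::' of the next match.
sub with_lookahead {
    my ($text) = @_;
    my @found = $text =~ /:::(.*?)(?=:::)/sg;
    return @found;
}

# Consuming version: each ':::' can belong to only one match.
sub consuming {
    my ($text) = @_;
    my @found = $text =~ /:::(.*?):::/sg;
    return @found;
}

my $sample = "a:::b:::c:::d";    # hypothetical input

print "lookahead: ", join( ', ', with_lookahead($sample) ), "\n";
print "consuming: ", join( ', ', consuming($sample) ),      "\n";
```

Which behavior is right depends on whether the delimiters come in open/close pairs or simply separate fields.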
I keep seeing people talk about the dot star issue, but it is much less of an issue if you have literal text before and after it that forces the match to a specific location.
Re: Help!! Regular Expressions
by providencia (Pilgrim) on Apr 12, 2001 at 20:06 UTC
|
Okay here's what I learned today from a friend.
"You are sipping when you should be slurping."
Of course if I had told you all what I was doing before
that line you would have definitely seen my real problem.
Okay here's close to what I did have (I burned that code long ago):
#!/usr/bin/perl -w
use strict;
open(FH, '-'); # using STDIN instead of a real file.
while (<FH>) {
    my $entry = $_ =~ /:::(.*?):::/ms;
    open(FH2,'>filename');
    print FH2 "$entry";
}
Of course this didn't work because I was only working with
the first line and what I wanted was further in the file
and on multiple lines.
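There is a second thing to watch for in the first attempt above: assigning a match to a plain scalar gives the success flag, not the captured text. A short sketch of the difference, with made-up sub names and sample text:

```perl
use strict;
use warnings;

# Scalar context: the result is true (1) on success, not the capture.
sub match_scalar {
    my ($text) = @_;
    my $entry = $text =~ /:::(.*?):::/s;
    return $entry;
}

# List context (note the parentheses): the result is the captured text.
sub match_list {
    my ($text) = @_;
    my ($entry) = $text =~ /:::(.*?):::/s;
    return $entry;
}

my $sample = "x:::wanted:::y";    # hypothetical input

print "scalar context: ", match_scalar($sample), "\n";
print "list context:   ", match_list($sample),   "\n";
```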
Here's what I am using.
#!/usr/bin/perl -w
use strict;
open(FH,'-');
my $file = do {local $/; <FH>};
#A really elegant way to slurp an entire file and fast too.
#It creates a "disposable subroutine" in effect.
#I can't take credit for coming up with it though.
close FH;
open(FH2,">filename");
my $entry = $file =~ /:::(.*?):::/ms;
#Thanks Corion
print FH2 "$1\n";
Thanks everyone. I liked the hint dws.
Thanks Corion for telling me about the not so greedy (.*?).
I learned from that hint but it wouldn't work for me until I slurped. :)
I'm REAL glad to know about perlre until I can make time for Mastering Regular Expressions.