Regular expressions

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks

I have another question regarding pattern matching.

I have a piece of html that I need to search through. The html looks like this:

<!-- Start_of_revision-->
revision1
<!-- End_of_revision-->
 
<!-- Start_of_revision-->
revision2
<!-- End_of_revision-->

<!-- Start_of_revision-->
revision3
<!-- End_of_revision-->

What I need to do is find the three revisions between `<!-- Start_of_revision-->` and `<!-- End_of_revision-->`.

I am using the following bit of code to do this, but I am only printing "revision1" and not the second two:

    my $file = $foo_bar_file;
    open (FILE,"<$file") || die $!;
    read FILE, my $text, -s $file;
    close(FILE);
    if($text =~ /<!--Start_of_revision-->(.*?)<!-- End_of_revision-->/sg)
    {
        print $1;
    }

I am slurping in the whole file, I am using .*? (i.e. non-greedy), and I'm using the s and g modifiers. I thought that this would be enough to find all three occurrences of anything between `<!-- Start_of_revision-->` and `<!-- End_of_revision-->`, so what am I doing wrong?

Thanks in advance,

C J


Re: Regular expressions by rev_1318 (Chaplain) on Jul 20, 2005 at 10:09 UTC
`if` will only look for the first occurrence. You're looking for `while...`

Paul
Re^2: Regular expressions by Anonymous Monk on Jul 20, 2005 at 10:14 UTC
<sheepish> Of course. Thanks! </sheepish>

C J
Re: Regular expressions by GrandFather (Saint) on Jul 20, 2005 at 10:16 UTC
Two problems:

1. Missing space in `<!--Start_of_revision-->`
2. `if` rather than `while`

A working version of your code in a form that is better for stand-alone testing is: [code (683 bytes) and output (239 bytes) not included here]

Perl is Huffman encoded by design.
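The two fixes named in this thread (the missing space after `<!--` and `while` in place of `if`) can be sketched roughly as follows. This is not GrandFather's code, which is not reproduced here. The helper name `extract_revisions` and the whitespace trimming are assumptions added for illustration, and the sample text is copied from the question.

```perl
use strict;
use warnings;

# Sample input, copied from the question.
my $text = <<'HTML';
<!-- Start_of_revision-->
revision1
<!-- End_of_revision-->

<!-- Start_of_revision-->
revision2
<!-- End_of_revision-->

<!-- Start_of_revision-->
revision3
<!-- End_of_revision-->
HTML

# extract_revisions is a hypothetical helper name, not from the thread.
sub extract_revisions {
    my ($html) = @_;
    my @found;

    # In scalar context, each /g match resumes where the previous one ended,
    # so a while loop visits every occurrence; an if stops after the first.
    # Note the space after "<!--", which the original pattern was missing.
    while ($html =~ /<!-- Start_of_revision-->(.*?)<!-- End_of_revision-->/sg) {
        my $revision = $1;
        $revision =~ s/^\s+|\s+$//g;    # assumption: trim surrounding newlines
        push @found, $revision;
    }
    return @found;
}

print "$_\n" for extract_revisions($text);
```

In list context the same pattern with `/g` returns all captures at once, so an assignment to an array would also work if no per-match processing is needed.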
Re^2: Regular expressions by ikegami (Patriarch) on Jul 20, 2005 at 14:56 UTC
That's an awful way of reading a file. It means every line must be pushed onto the stack. The OP's method was better, and the following is even better because it avoids a call to `stat` and works with all kinds of IO handles:

`my $text; { local $/; $text = <DATA>; }`

Visually, I prefer

`my $text = do { local $/; <DATA> };`

but I think I determined that this second form is equivalent to

`my $text; { local $/; my $temp = <DATA>; $text = $temp; }`
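A minimal sketch of the `local $/` idiom described above, wrapped in a helper so it can be tried on any handle. The name `slurp_handle` and the in-memory filehandle are assumptions for illustration; the thread itself reads from `DATA`.

```perl
use strict;
use warnings;

# slurp_handle is a hypothetical helper name, not from the thread.
sub slurp_handle {
    my ($fh) = @_;

    # Undefining $/ (the input record separator) for this scope only makes
    # the next readline return everything left on the handle as one string.
    local $/;
    return scalar <$fh>;
}

# An in-memory filehandle stands in for a real file here. Nothing in the
# helper asks for the file size, so it does not rely on -s or stat.
my $content = "line one\nline two\n";
open my $fh, '<', \$content or die "Cannot open in-memory handle: $!";
my $text = slurp_handle($fh);
close $fh;

print length($text), " characters read\n";
```

Because `local` restores `$/` when the sub returns, later line-by-line reads elsewhere in the program are unaffected.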
Re^3: Regular expressions by GrandFather (Saint) on Jul 20, 2005 at 20:35 UTC
The reason I am a monk is to learn from the masters. I keep forgetting about `$/`! I hope I have learned :). Thank you.

Perl is Huffman encoded by design.
Re: Regular expressions by tphyahoo (Vicar) on Jul 20, 2005 at 15:52 UTC
You could also use File::Slurp. I'm not sure whether it's advantageous speed-wise or algorithm-wise; the documentation claims it is. But I just like it for readability.
