PerlMonks  

(Dermot) RE: RE: RE: RE: Re: Stripping page headers

by Dermot (Scribe)
on Sep 28, 2000 at 01:54 UTC (#34286)


in reply to RE: RE: RE: Re: Stripping page headers
in thread Stripping page headers

OK, I've made two modifications to what I originally posted, and it now works with your data. My original script would never have worked properly with your report file; it only worked with the trivial example I tested it on. Anyway, here is a working version:
    #!/usr/bin/perl -w
    use strict;

    my $file = "report.rpt";

    # Slurp mode: read the whole file into one scalar.
    undef $/;
    open REPFILE, '<', $file or die "Can't open $file: $!\n";
    my $report = <REPFILE>;
    close REPFILE;

    # Blank out every header line; /m lets ^ match at the start of each line.
    $report =~ s/^User Report.*//mg;
    $report =~ s/^All Users.*//mg;
    $report =~ s/^User Name.*//mg;
    $report =~ s/^-> Token.*//mg;

    print $report;

Addition of m modifier to the substitution.

Because the whole report file is held in one scalar, it is effectively one string, and by default ^ and $ match at the start and end of the string, not of each line ($ also matches just before a trailing newline). To make ^ and $ match at the start and end of every line, we add the m modifier. The regex then treats the string in $report as a series of lines delimited by \n characters.
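
To see the difference for yourself, here is a minimal sketch. The three-line string is a made-up stand-in for a slurped report, not your actual data; it just counts how often the anchored pattern matches with and without m.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical sample standing in for a slurped report file.
my $text = "User Report page 1\ndata line\nUser Report page 2\n";

# Without /m, ^ matches only at the very start of the string.
my @plain = $text =~ /^User Report/g;

# With /m, ^ also matches just after each embedded newline.
my @multi = $text =~ /^User Report/mg;

print "without /m: ", scalar(@plain), " match(es)\n";
print "with /m:    ", scalar(@multi), " match(es)\n";
```

Without m only the header on the first line is found; with m both headers are.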

Addition of .* to the regex to deal with the rest of the line.

By adding .* to the regex we make it match (i) the start of a line, (ii) the piece of text we're using as a tag on that line, and (iii) the rest of the line up to, but not including, the next \n. A dot in a regex matches any character except a newline (\n); if you want it to match newlines too, use the s modifier. Just to top off the confusion, you can use both the s and m modifiers on the same regex. Most people assume they mean single-line versus multi-line, but s really means "let dot match newlines" and m means "let ^ and $ match at line boundaries rather than only at the ends of the whole string".
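
A small sketch of why the working version uses m but not s. Again the two-line string is invented for illustration, and $dot_default and $dot_s are just names I picked.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical two-line sample; "User Report" is the tag being removed.
my $text = "User Report page 1\nkeep this line\n";

# Default: .* stops at the newline, so only the header text is removed.
(my $dot_default = $text) =~ s/^User Report.*//m;

# With /s the dot also matches \n, so .* runs on to the end of the string.
(my $dot_s = $text) =~ s/^User Report.*//ms;

print "default: [$dot_default]\n";
print "with /s: [$dot_s]\n";
```

With the default dot, the header text goes but its newline and the following line survive. With s added, the greedy .* swallows everything after the tag, which is not what you want when stripping headers out of a report.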

Replies are listed 'Best First'.
RE: RE: RE: RE: RE: Re: Stripping page headers
by Anonymous Monk on Sep 29, 2000 at 04:31 UTC
    You rock! :) It works flawlessly. BIG Thanks!
