
Re: parsing question

by Wonko the sane (Deacon)
on May 28, 2003 at 13:49 UTC ( #261299=note )


in reply to parsing question

I like kilinrax's use of reverse; I had never seen that trick before.
Without knowing it, I would have suggested a capturing regex,
sort of a modification of the greedy suggestion.

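It benchmarks the fastest of the three: roughly 53,000 calls/s, against about 36,000 for the reverse version and 22,000 for the greedy one.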

#!/usr/local/bin/perl
use strict;
use Benchmark;

my $string = "<<HTML>;nbsp dont_strip_me</HTML>> <xyzfdgfghgf> ;strip_me";

# Reverse the string so the trailing ";word" comes first, strip it, reverse back.
sub reversed {
    my $reverse = reverse(shift);
    $reverse =~ s| \w* ; \s* > |>|x;
    return scalar reverse $reverse;
}

# Greedy .* runs to the last '>' that is followed by ";word", which is then dropped.
sub greedy {
    my $line = shift;
    $line =~ s|^ (.*>) \s* ; \w* |$1|x;
    return $line;
}

# Capture everything up to and including the last '>'.
# Returns the captured text in list context only.
sub capture {
    my $line = shift;
    return $line =~ /^(.+>)/;
}

print "Reversed: ", reversed($string), "\n";
print "Greedy: ", greedy($string), "\n";
print "Capture: ", capture($string), "\n";

# Run each candidate for at least 10 CPU seconds.
timethese( -10, {
    reversed => sub { reversed( $string ) },
    greedy   => sub { greedy( $string ) },
    capture  => sub { capture( $string ) },
} );
Output:
:!./test.pl
Reversed: <<HTML>;nbsp dont_strip_me</HTML>> <xyzfdgfghgf>
Greedy: <<HTML>;nbsp dont_strip_me</HTML>> <xyzfdgfghgf>
Capture: <<HTML>;nbsp dont_strip_me</HTML>> <xyzfdgfghgf>
Benchmark: running capture, greedy, reversed, each for at least 10 CPU seconds...
   capture: 10 wallclock secs (10.40 usr +  0.01 sys = 10.41 CPU) @ 53160.52/s (n=553401)
    greedy: 10 wallclock secs (10.52 usr +  0.00 sys = 10.52 CPU) @ 21887.07/s (n=230252)
  reversed: 11 wallclock secs (10.54 usr +  0.01 sys = 10.55 CPU) @ 36366.92/s (n=383671)
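
One caveat on the capture version: it returns the result of the match directly, so it only yields the captured text when called in list context (as it is inside print). In scalar context, e.g. my $r = capture($string), it would return 1 instead. A minimal sketch of a context-safe variant follows; the name capture_safe is made up for illustration and was not part of the benchmark above, so its timing is not measured here.

```perl
use strict;
use warnings;

my $string = "<<HTML>;nbsp dont_strip_me</HTML>> <xyzfdgfghgf> ;strip_me";

# Hypothetical variant of capture(): returns the captured text in any context.
sub capture_safe {
    my $line = shift;
    my ($kept) = $line =~ /^(.+>)/;   # list assignment forces list context on the match
    return $kept;                     # undef if the line contains no '>'
}

my $result = capture_safe($string);   # scalar context is fine here
print "Capture: $result\n";
```

The extra assignment may cost a little speed, so it would be worth re-running the benchmark before relying on the ranking.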

Wonko

