PerlMonks  

Re: Recommendations for efficient data reduction/substitution application

by LanX (Saint)
on Mar 05, 2015 at 11:33 UTC ( [id://1118876]=note )


in reply to Recommendations for efficient data reduction/substitution application

I solved a roughly similar problem using the trie optimization of the regex engine.

You need to put the different match clauses into one long alternation (an or-chain).

The replacement part then has to identify which alternative matched.
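A minimal sketch of that idea, assuming the patterns are literal strings kept in a hypothetical %replace table (the table contents and the reduce_text name are made up for illustration). Capturing the whole alternation lets the replacement side look up which alternative matched. Whether the engine actually applies its trie optimization depends on the Perl version and on the alternatives being literals.

```perl
use strict;
use warnings;

# Hypothetical substitution table: literal string => replacement.
my %replace = (
    'foo'    => 'F',
    'foobar' => 'FB',
    'baz'    => 'Z',
);

# Longest keys first so 'foobar' wins over its prefix 'foo';
# quotemeta keeps every key a literal.
my $alternation = join '|',
    map  { quotemeta }
    sort { length($b) <=> length($a) || $a cmp $b }
    keys %replace;
my $re = qr/($alternation)/;

sub reduce_text {
    my ($text) = @_;
    # $1 holds whichever alternative matched; the hash lookup picks its replacement.
    $text =~ s/$re/$replace{$1}/g;
    return $text;
}

print reduce_text('foo foobar baz qux'), "\n";   # prints "F FB Z qux"
```

Using a single hash lookup on $1 avoids a chain of if/elsif tests in the replacement part.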

Furthermore, you should benchmark how expensive replacements are on longer chunks.

A sliding window technique could be the answer.

Hmm... Could you identify where the bottleneck really is by benchmarking match and replace independently?
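One way to separate the two costs, sketched with the core Benchmark module. The pattern table and input text are made-up sample data, and count_matches and replace_all are hypothetical helper names; swap in the real patterns and a representative chunk of input.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Hypothetical sample data: 200 literal patterns and a 5000-word input.
my %replace = map { ("word$_" => "w$_") } 1 .. 200;
my $alternation = join '|',
    map  { quotemeta }
    sort { length($b) <=> length($a) }
    keys %replace;
my $re   = qr/\b($alternation)\b/;
my $text = join ' ', map { 'word' . (1 + $_ % 300) } 1 .. 5_000;

# Matching only: walk the string with //g and count hits.
sub count_matches {
    my ($str) = @_;
    my $n = 0;
    $n++ while $str =~ /$re/g;
    return $n;
}

# Matching plus replacing, done on a copy so every run sees the same input.
sub replace_all {
    my ($str) = @_;
    (my $copy = $str) =~ s/$re/$replace{$1}/g;
    return $copy;
}

timethese(20, {
    match_only    => sub { count_matches($text) },
    match_replace => sub { replace_all($text) },
});
```

If match_replace is much slower than match_only, the cost sits in rebuilding the string rather than in finding the matches.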

Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)

PS: Je suis Charlie!


Replies are listed 'Best First'.
Re^2: Recommendations for efficient data reduction/substitution application
by BrowserUk (Patriarch) on Mar 05, 2015 at 12:57 UTC

    Hm. Did they fix that "As it seems that trie optimization brakes for more than 15000 alternative patterns." yet?

    C:\docs\OriginOfSpecies(Darwin)\2009-h>\perl5.18\bin\perl.exe \test\1043602.pl 2009-h.htm
    Finding 203474 words (of 216808) took 0.173504 seconds using a hash
    Finding 203474 words took 2072.099258 seconds using a trie(via regex engine)
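    For comparison, a hash-based lookup of the kind the first timing refers to might look roughly like this. It is only a sketch and not the script that produced the numbers above; the %wanted word list and the count_known_words name are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical word list; in practice it would hold the thousands of search terms.
my %wanted = map { $_ => 1 } qw(species origin selection);

sub count_known_words {
    my ($text) = @_;
    my $found = 0;
    # Tokenise once, then do one hash lookup per token.
    for my $word ($text =~ /(\w+)/g) {
        $found++ if exists $wanted{ lc $word };
    }
    return $found;
}

print count_known_words('On the Origin of Species by Means of Natural Selection'), "\n";   # prints 3
```

    The work per token is a single hash lookup, so the cost does not grow with the number of search terms the way a failed alternation match can.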

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
    In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked
