 
PerlMonks  

Re^2: Need Speed:Search Tab-delimited File for pairs of names

by Laurent_R (Canon)
on Dec 16, 2013 at 19:48 UTC ( [id://1067363]=note )


in reply to Re: Need Speed:Search Tab-delimited File for pairs of names
in thread Need Speed:Search Tab-delimited File for pairs of names

Hi Ken,

your code is obviously much shorter and cleaner than the original post's, but using regexes instead of the index function is unlikely to improve performance, which is the OP's primary request. Or did I miss something?

Replies are listed 'Best First'.
Re^3: Need Speed:Search Tab-delimited File for pairs of names
by AnomalousMonk (Archbishop) on Dec 16, 2013 at 22:39 UTC

    The journey to a better program (for some definition of 'better', in this case faster) begins with a program that works and that one can understand. As suggested elsewhere, the OP code is a spaghetti monster that dare not enable strictures and warnings lest it reveal a host of naughty practices and lurking bugs.

    kcott's shorter and cleaner code, assuming it actually does what mnnb wants, is much more likely to be a good starting point for improvement. I haven't studied it closely, but it seems to me that the regexes, if insufficiently speedy, could fairly easily be replaced by the use of index. In any event, while the use of regexes will not improve performance, it is also unlikely, IMHO, to significantly degrade it versus index in this case. But only benchmarking will determine the trade-offs.
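To make the regex-to-index swap mentioned above concrete, here is a minimal sketch. The record and the two names are hypothetical (neither the OP's data nor kcott's code is reproduced in this node), so treat it as an illustration of the idea rather than a drop-in patch.

```perl
use strict;
use warnings;

# Hypothetical data: one tab-delimited record and a pair of names to look for.
my $line = "Alice Smith\tBob Jones\tCarol White";
my ($name1, $name2) = ('Alice Smith', 'Carol White');

# Regex version: \Q...\E quotes metacharacters so the names match literally.
my $regex_hit = ($line =~ /\Q$name1\E/ && $line =~ /\Q$name2\E/) ? 1 : 0;

# index version: index() returns the offset of the substring, or -1 if absent.
my $index_hit = (index($line, $name1) >= 0 && index($line, $name2) >= 0) ? 1 : 0;

print "regex: $regex_hit, index: $index_hit\n";
```

Both tests answer the same question for literal strings, which is why swapping one for the other should be a local change once the surrounding code is clean.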

    Update: Minor wording changes; no semantic change.

      I second the notion that regular expressions are a better choice, especially using precompiled patterns.

      I vaguely recall that an RE search without metacharacters should be fast. There is a short statement implying this in my camel book in the Efficiency section.

      You can always do some performance benchmarking to verify.

        Yes, indeed, a RE search without meta-characters is fast. But index is still faster:
        $ perl index_regex_bench.pl
                   Rate Regex Index
        Regex 5010020/s    --  -23%
        Index 6544503/s   31%    --
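The script index_regex_bench.pl itself is not shown in the thread. A comparison of this kind could be sketched with the core Benchmark module roughly as follows; the test string, the needle, and the labels are assumptions chosen for illustration, and the rates will differ from machine to machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data; the real input file is not shown in the thread.
my $line   = join "\t", map { "name$_" } 1 .. 50;
my $needle = 'name42';
my $re     = qr/\Q$needle\E/;    # precompiled pattern, metacharacters quoted

my %tests = (
    Regex => sub { return $line =~ $re ? 1 : 0 },
    Index => sub { return index($line, $needle) >= 0 ? 1 : 0 },
);

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, \%tests);
```

A negative count tells cmpthese to run for a minimum amount of CPU time instead of a fixed number of iterations, which keeps very fast subs from finishing too quickly to measure.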
      I definitely agree with you, AnomalousMonk, and my very first comment in my post above was that kcott's code was much cleaner and shorter.
Re^3: Need Speed:Search Tab-delimited File for pairs of names
by kcott (Archbishop) on Dec 17, 2013 at 03:31 UTC

    You are quite correct in that I haven't addressed mnnb's primary request; however, I did state my intention: "This may not be exactly what you want but should provide some direction: ...".

    There were so many issues with the posted code (e.g. "sub name_search(@_, $search_string) { ... }" and "$run_time = time() - our $start_run;") that I chose not to attempt to make this code (in its present form) faster as that didn't seem like a useful exercise.
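For readers wondering what a cleaner form of the two quoted fragments might look like, here is a hedged sketch. The body of name_search, its argument order, and the sample data are all assumptions, since the original sub is only quoted as "{ ... }".

```perl
use strict;
use warnings;

# Hypothetical rewrite: arguments arrive in @_ and are unpacked inside the sub,
# instead of being listed in parentheses after the sub name.
sub name_search {
    my ($search_string, @lines) = @_;
    return grep { index($_, $search_string) >= 0 } @lines;
}

# A lexical start time declared up front, not 'our' in the middle of an expression.
my $start_run = time();
my @hits      = name_search('Bob', "Alice\tBob", "Carol\tDave");
my $run_time  = time() - $start_run;

print scalar(@hits), " match(es) in $run_time second(s)\n";
```

With strictures and warnings enabled, this form compiles cleanly, which is the starting point both replies above argue for before any tuning.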

    Beyond that, I can only echo what ++AnomalousMonk wrote in the first response to your comment.

    -- Ken
