PerlMonks  

Re^2: How to speed up my Html parsing program? (Concurrently run Subroutines?)

by eye (Chaplain)
on Jan 06, 2009 at 07:37 UTC


in reply to Re: How to speed up my Html parsing program? (Concurrently run Subroutines?)
in thread How to speed up my Html parsing program? (Concurrently run Subroutines?)

This page contains a list of links which then need to be retrieved by the get_page_html() method and their content passed to retrieve_info();
If the get_page_html() method retrieves pages over a network (rather than from disk), there is great potential for improving performance with forking or threads. In a single process/thread, network latency is additive: each request has to finish before the next one starts. With multiple processes/threads, the waits overlap, so the latency costs are paid concurrently.
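To illustrate the point about overlapping latency, here is a minimal sketch using fork and pipes. It is not the original poster's code: get_page_html() below is a hypothetical stand-in that just sleeps to simulate a slow network request, fetch_all() is a helper name invented for this example, and the URLs are placeholders. Each child fetches one page and writes the result back to the parent through its own pipe.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep time);

# Hypothetical stand-in for get_page_html(): sleeps to simulate
# network latency instead of making a real request.
sub get_page_html {
    my ($url) = @_;
    sleep 0.3;
    return "<html><title>$url</title></html>";
}

# Assumed helper: fetch every URL in its own child process and
# return a hash of url => html.
sub fetch_all {
    my @urls = @_;
    my %reader_for;

    for my $url (@urls) {
        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # Child: fetch one page, send it to the parent, then exit.
            close $reader;
            print {$writer} get_page_html($url);
            close $writer;
            exit 0;
        }

        # Parent: keep only the read end and move on to the next URL.
        close $writer;
        $reader_for{$url} = $reader;
    }

    # Collect the results; each read returns once that child is done.
    my %html_for;
    for my $url (@urls) {
        my $reader = $reader_for{$url};
        local $/;    # slurp mode: read everything until EOF
        $html_for{$url} = <$reader>;
        close $reader;
    }

    1 while wait() != -1;    # reap all children
    return %html_for;
}

my @urls     = map { "http://example.com/page$_" } 1 .. 4;
my $start    = time;
my %html_for = fetch_all(@urls);
my $elapsed  = time - $start;

printf "fetched %d pages in %.2fs\n", scalar(keys %html_for), $elapsed;
```

Fetched one after another, the four simulated requests would take roughly 4 x 0.3 seconds; with one child per request the total should be close to a single request's latency. For a long list of links you would probably want to cap the number of simultaneous children rather than forking one per URL.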


Re^3: How to speed up my Html parsing program? (Concurrently run Subroutines?)
by BobFishel (Acolyte) on Jan 06, 2009 at 16:34 UTC
    Yes, it retrieves it over a network. I'm definitely going to look into forking; after doing some reading last night, it seems this is my best bet at this point. Now I just need to figure out how to keep my variables independent. I haven't dug into the code in the responses below yet, but from the looks of it they seem like a great starting point!
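
On keeping variables independent: with fork this largely takes care of itself, because each child works on its own copy of the parent's variables. A small sketch (the variable names are made up for illustration):

```perl
use strict;
use warnings;

my $counter = 0;
my @seen    = ('parent');

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: these changes touch only the child's private copy.
    $counter = 99;
    push @seen, 'child';
    exit 0;
}

# Parent: wait for the child, then show that nothing changed here.
waitpid($pid, 0);
print "counter=$counter seen=@seen\n";    # prints "counter=0 seen=parent"
```

The flip side is that results computed in a child do not appear in the parent on their own; they have to be sent back explicitly, for example through a pipe or a file.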
