PerlMonks  

Re^4: ithreads, locks, shared data: is that OK?

by bliako (Vicar)
on Sep 20, 2018 at 10:08 UTC (#1222704)


in reply to Re^3: ithreads, locks, shared data: is that OK?
in thread ithreads, locks, shared data: is that OK?

got it thanks

Well, I need to peek() because one of my queues is not a "work" queue (i.e. one where you enqueue() data for a thread to process). Rather, it is a list of all the data (= dictionary words) currently being processed by threads; call it the CWq queue. Another one is a list of all the words that have already been processed and are done with; call it REq. So, before a thread processes word W, it must peek() into CWq to see whether W is in there, in which case it skips W. It also skips W if the word is in REq, so that means another peek() there.

Ideally CWq and REq would be hashes, but I find sharing a hash far more complicated than lock() and peek() on a queue. A queue is definitely a weird hammer for that sort of nail. Any suggestions?
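
For comparison, a shared hash need not take much more code than a locked queue. The following is only a minimal sketch, assuming a single threads::shared hash can stand in for both CWq and REq; the names %state, claim() and finish() are hypothetical and not from the original post.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical names: %state, claim() and finish() are illustrative only.
# One shared hash replaces both lists: word => 'working' or 'done'.
my %state :shared;

# Claim a word under the lock. Returns false if some thread is already
# working on it or has finished it, so check-and-mark is one atomic step.
sub claim {
    my ($word) = @_;
    lock(%state);
    return 0 if exists $state{$word};
    $state{$word} = 'working';
    return 1;
}

# Mark a previously claimed word as fully processed.
sub finish {
    my ($word) = @_;
    lock(%state);
    $state{$word} = 'done';
    return;
}

my @words = qw(apple pear apple fig pear apple);

my @workers = map {
    threads->create(sub {
        my $n = 0;
        for my $w (@words) {
            next unless claim($w);
            # ... real processing of $w would go here ...
            finish($w);
            $n++;
        }
        return $n;
    });
} 1 .. 3;

my $total = 0;
for my $t (@workers) {
    my ($n) = $t->join;   # threads were created in list context
    $total += $n;
}
print "processed $total distinct words\n";
```

Each lookup is a hash access rather than a scan of the queue, and the lock is released automatically when claim() or finish() returns.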

Re^5: ithreads, locks, shared data: is that OK?
by choroba (Bishop) on Sep 20, 2018 at 12:29 UTC
    I don't fully understand your workflow. If you want to process each word just once, maybe create a thread that keeps a hash of the processed words, and ask it before sending any word forward?
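
One possible reading of this idea, as a rough sketch only: a "gatekeeper" thread owns a private hash and forwards each word to the workers at most once. The queue names ($in, $work, $out), the worker count, and the uc() stand-in for real processing are all assumptions made for illustration, and the end() method assumes Thread::Queue 3.01 or later.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical names throughout; assumes Thread::Queue 3.01+ for end().
my $in   = Thread::Queue->new;   # raw words, duplicates allowed
my $work = Thread::Queue->new;   # words cleared for processing
my $out  = Thread::Queue->new;   # results from the workers
my $n_workers = 3;

# Gatekeeper: the only thread that touches %seen, so no lock() is needed.
my $gate = threads->create(sub {
    my %seen;
    while (defined(my $w = $in->dequeue)) {
        $work->enqueue($w) unless $seen{$w}++;
    }
    $work->end;   # tell the workers that no more words will arrive
});

my @workers = map {
    threads->create(sub {
        while (defined(my $w = $work->dequeue)) {
            $out->enqueue(uc $w);   # stand-in for the real processing
        }
    });
} 1 .. $n_workers;

$in->enqueue(qw(apple pear apple fig pear apple));
$in->end;

$_->join for $gate, @workers;

# Collect the results; order depends on scheduling, so sort them.
my @results;
while (defined(my $r = $out->dequeue_nb)) {
    push @results, $r;
}
@results = sort @results;
print "@results\n";
```

Because a word is recorded in %seen at dispatch time, a single check covers both "currently being processed" and "already processed". Whether the extra hop through the gatekeeper costs more than locking a shared structure is something only a benchmark can answer.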

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,

      Thanks, choroba, for your time. I am a bit apprehensive about starting a thread to act as a data-query server. What about the communication overhead? Bottlenecks?

      *Or* does a Thread::Queue or a threads::shared hash imply the exact same overheads and bottlenecks because of Perl's ithreads implementation?

      Summary of what you propose and of my workflow:

      Your suggestion: start a thread to handle enquiries about the data (remember, I need to do 2 checks: 1. whether the word is currently being processed, and 2. whether the word has already been processed) and maybe also put the dictionary in there. That eliminates all shared data and the insane duplication of a read-only dictionary across all threads.

      My workflow: I am keeping track of the 2 cases above using queues, which I basically (mis-)treat as hashes. I loop over a queue's elements and peek(), trying to find my word. Or I loop and peek() and, if found, I dequeue() it. Ideally I should have used a shared hash for each of the two checks, and a shared dictionary.
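
The scan-by-peek approach described here might look roughly like the sketch below. The helper names find_in_queue() and remove_from_queue() are hypothetical; pending(), peek() and extract() are existing Thread::Queue methods, and extract() is used for the removal because dequeue() only takes items from the head of the queue.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue;

# CWq from the discussion: words currently being processed.
my $cwq = Thread::Queue->new(qw(apple pear fig));

# Linear scan under a lock. Returns the index of $word, or -1 if absent.
# Hypothetical helper, not part of Thread::Queue.
sub find_in_queue {
    my ($q, $word) = @_;
    lock($q);   # keep other threads out while scanning
    for my $i (0 .. $q->pending - 1) {
        return $i if $q->peek($i) eq $word;
    }
    return -1;
}

# Remove $word from anywhere in the queue. Returns true if it was there.
sub remove_from_queue {
    my ($q, $word) = @_;
    lock($q);   # Perl locks are recursive, so the nested lock is fine
    my $i = find_in_queue($q, $word);
    return 0 if $i < 0;
    $q->extract($i);
    return 1;
}

print "pear is at index ", find_in_queue($cwq, 'pear'), "\n";
remove_from_queue($cwq, 'pear');
print "pear is now at index ", find_in_queue($cwq, 'pear'), "\n";
```

Every lookup walks the whole queue while holding the lock, so the cost grows with the number of words in flight, which is the main argument for a hash.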

        It was just an idea. If you have enough time, try all the solutions, benchmark them, and compare readability. Using shared variables means you need locking, which is complex and can go wrong. Peeking means you need to lock the queues, which slows down the work and introduces locking, which is complex and can go wrong :-)

