
Re: Sharing globs

by renodino (Curate)
on Nov 02, 2007 at 00:49 UTC (#648575)

in reply to Sharing globs

Serves me right for yakking before thinking...

I just realized the Big Issue with installing the handle context in the shared interpreter, and once again it's the beached whale.

Since any access to a shared variable might cause interpreter state changes, and (presumably) accesses to a handle might cause even more state changes than just touching a scalar, the shared interpreter lock would have to be held for the duration of access to the handle...including during the I/O operations applied to said handle. Which means that every other thread that needs to do something to any shared variable - no matter how trivial - has to wait for some other thread's I/O operation to complete. Which might be a very long time...

Which essentially means the handle can't be used from the shared interpreter (at least, not for anything more than bookkeeping purposes).

Which leads me back to the PerlIO layer idea. If a shared scalar were associated with each handle, the handles might be treated as regular atoms (string/number literals) when passed between threads. The receiving thread would then need to instantiate the handle's context in its own private interpreter context. Which is just a fancier way of doing what we've already been doing, i.e., passing the fileno and re-opening in the receiving thread. Of course, the semantics of file operations get a bit confused at that point: if Thread A passes a handle to Thread B, and Thread B does a seek and lets Thread A know it's repositioned the file pointer, Thread A is left stuck at the old file position, since Thread B has a distinctly new handle. Not an issue for stream handles, but block I/O can get confused.
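For readers who haven't seen the "pass the fileno and re-open" workaround, here is a minimal sketch of it. It assumes a perl built with ithreads; the queue name `$fds` and the temporary file are made up for illustration, and the `'<&'` open mode asks for a dup() of the numeric descriptor, so the worker ends up with its own private handle.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;
use File::Temp qw(tempfile);

# Illustrative queue that carries a descriptor number to the worker.
my $fds = Thread::Queue->new;

# The worker is started before the file is opened, so it has no cloned
# copy of the handle; all it ever receives is the descriptor number.
my $worker = threads->create(sub {
    my $fd = $fds->dequeue;
    # '<&' with a number dup()s that descriptor into a new private handle.
    open(my $private, '<&', $fd) or die "dup of fd $fd failed: $!";
    my $line = <$private>;
    close $private;
    return $line;
});

# Scratch file so the sketch is self-contained.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "first line\nsecond line\n";
close $tmp or die "close failed: $!";

open(my $fh, '<', $path) or die "open failed: $!";
$fds->enqueue(fileno $fh);      # only the number crosses the thread boundary

my $line = $worker->join;       # $fh stays open until the worker is done
print "worker read: $line";
```

Note that the originating thread has to keep `$fh` open until the worker has done its dup; here the `join` guarantees that.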

And then there's the need to collect all the other layer info to re-instantiate the handle, which won't exist until 5.10 (maybe 5.8.9 ?)

So ideally, the resuscitation of the handle in the receiving thread would actually perform a clone operation, rather than a re-open. But then things get hairy wrt refcounts. (Thread B's handle reference goes out of scope, so the private interpreter invokes close() on it...and suddenly Thread A starts getting nasty errors when it tries to use the handle).

In summary: it's complicated.

Perl Contrarian & SQL fanboy

Re^2: Sharing globs
by BrowserUk (Pope) on Nov 02, 2007 at 05:50 UTC

    For the *{IO} portion of the glob, I would do exactly what we do now manually, dup() the IO handle and store it in the shared proxy. Whilst the underlying file/directory/socket/whatever would be shared, the internal state associated with it would be thread-specific.

    This is essentially identical to the situation when file/socket handles are shared between processes via fork. Each process can access the underlying entity, but has local internal state for things like file positions, directory positions etc. You're right about the confused semantics, but they are really no different than in the fork scenario.

    The problems I was hoping for enlightenment on go much, much deeper than this. For example, globs have associated glob magic. The way threads::shared works (now) is by adding its own form of magic to the entity being shared. My initial thought was that there was a conflict between having two types of magic applied to a single entity, but I've discovered that this is not the case.

    Perl has long been able to have multiple types of magic applied to a single entity. See the moremagic field of the magic structure referenced from the MAGIC field of the SvPVMG. It allows for an arbitrary length chain of magics to be applied to any entity.

    To verify this, I went into threads::shared and disabled the checks that produce the "Cannot share globs yet" error and, lo and behold, I can now share a glob, dup the *{IO} portion (still done externally for now), pass the shared glob to another thread, and it works.

    This makes the implementation of threaded servers very much easier, as there is no mucking around with fileno and holding onto copies of socket handles so that the socket isn't automatically closed--as the original goes out of scope--before the child thread has a chance to perform the dup(). So much simpler. Testing is limited so far, but it does work.
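For comparison, the "mucking around" being avoided looks roughly like the sketch below. This is not code from the thread: the `%pending` hash, the two queues, and the one-line echo protocol are all invented for illustration, and `socketpair` (which assumes a platform with AF_UNIX sockets) stands in for a listening socket plus accept(). The parent parks its copy of the socket until the worker reports that it has done its dup().

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;
use Socket;
use IO::Handle;

my $to_worker = Thread::Queue->new;   # carries descriptor numbers to the worker
my $acks      = Thread::Queue->new;   # worker reports descriptors it has dup'ed

# Worker: dup each descriptor it is handed, acknowledge, then serve one line.
my $worker = threads->create(sub {
    while (defined(my $fd = $to_worker->dequeue)) {
        open(my $sock, '+<&', $fd) or die "dup of fd $fd failed: $!";
        $acks->enqueue($fd);           # the parent may now drop its copy
        $sock->autoflush(1);
        my $line = <$sock>;
        print {$sock} "echo: $line";
        close $sock;
    }
    return;
});

# socketpair stands in for a listening socket plus accept().
socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";
$client->autoflush(1);

my %pending;                           # keeps handles alive until they are dup'ed
my $fd = fileno $server;
$pending{$fd} = $server;
undef $server;                         # %pending now holds the only reference
$to_worker->enqueue($fd);

delete $pending{ $acks->dequeue };     # closing the parent's copy is safe now

print {$client} "hello\n";
my $reply = <$client>;
print "client got: $reply";

$to_worker->enqueue(undef);            # undef tells the worker to stop
$worker->join;
```

Without the acknowledgement step, the parent's handle could go out of scope and close the socket before the worker ever got to dup it, which is exactly the race described above.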

    But there is a further problem. HTTP::Daemon, and probably others, hang objects off of the underlying socket glob produced by IO::Socket. That in itself is not a problem: you simply assign a reference to a shared hash to the *{HASH} slot at the same time as you assign a dup() of the socket to the *{IO} slot. And sure enough, everything (that I've tested so far) works.

    Where things go belly-up is that HTTP::Daemon also creates glob-based objects for the ClientConnection, also by using the *{HASH} slot--and that works also, except that it also stores a copy of the server globject in a hash value in the ClientConn globject. And once a shared glob has been stored in a hash value, it loses some or all of its magic.

    The underlying cause of that (as best as I can determine) seems to be that the magic handling code in threads::shared's shared.xs only handles copying one additional level of magic (besides its own), and shared elements of shared aggregates require additional 'share magic'. That means that a shared glob stored in the element of a shared hash (or array) would need a chain of 3 types of magic, but shared.xs is only written to deal with 2 levels. And at that point things start failing in interesting ways.

    As I pointed out above, Perl's SvPVMGs are designed to handle an arbitrary length chain of magics, and so (I reason) there is no fundamental reason why this couldn't be made to work. It just needs the appropriate degree of understanding and XS skills, and simple, thread-based servers would become a reality. Unfortunately, so far, my attempts to understand the application and management of magics have left me floundering, and those with the skills aren't interested enough in the use of threads to do the necessary.

    I'm at a loss as to how to take this further because the documentation of magics seems to be limited to a single paragraph in the perlguts docs and it isn't enough. As with all things XS, there does not seem to be a viable route forward in the acquisition of the required knowledge.

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Nice piece of detective work!

      I'm troubled by this statement:

      And once a shared glob has been stored in a hash value, it loses some or all of its magic.

      Presumably, the assignment to the 'httpd_daemon' element in the shared *HASH is going through the sharedsv_elem_mg_STORE() method, which should end up in sharedsv_scalar_store(). Have you tried sprinkling the latter with printf's to see what path it takes? I'd assume it's handled as an RV, but maybe something else is going on... (I have vague memories of a recent p5p posting regarding a threads::shared patch to address chained magic, but can't seem to find anything in the changelog...but make sure you're using the latest version.)

      FWIW I'm personally still not comfortable with just dup()'ing; I've experienced too many weird behaviors with both block and (esp.) stream I/O when things get dup()'ed. I'm also concerned about the possibility of piling up dup's if an app has a master thread repeatedly/arbitrarily passing file handles around to worker threads. But if the sharing magic can be fixed, then that problem can presumably be fixed by stealing the clone code for handles, and then (a) checking if the file already has context in the receiving thread and (b) cloning the state into the receiving thread at that point if it doesn't.

      (BTW: maybe this dialogue needs to be moved to the ithreads mailing list?)

      Perl Contrarian & SQL fanboy

        I made my modifications to shared.xs 1.12, which was the latest version when I started. I noticed jdhedden released 1.14 recently which I have installed but haven't got around to catching up with yet. I need to add all my changes and tracing to it before I can continue my investigations.

        Here's the output from a really simple test I ran (that won't work without the mods I've made). Sorry for the mess it will look in the browser, but if you C&P it to an editor with a reasonably wide screen it should sort itself out.

        What it shows is the output from Devel::Peek::Dump() of the same glob returned from Symbol::gensym():

        The first (leftmost) is what it looks like as returned.

        The second is how it looks after it has been shared (after mods). You can see that it is still the same animal. It has retained the glob magic, and gains the shared scalar magic. At this point, the shared copy works as the original in every way I've tried. If I use a glob returned by IO::Socket::INET, all the methods work on the copy as normal. In the originating thread and in any other thread that uses it.

        The third is what you see after you assign the original as a value to a shared hash.

        The fourth is what you see if you assign the copy.

        As you can see, all the glob and shared scalar magic has been replaced by shared element magic and all the good stuff done by share() seems to have been blown away.
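The patched threads::shared isn't available here, but the unmodified parts of this experiment can be reproduced along the following lines. This is only a sketch: `dump_to_string` is a made-up helper that captures what Devel::Peek::Dump() writes to STDERR, and it is applied to a fresh Symbol::gensym() glob and to an ordinary shared scalar. Exactly which MAGIC entries show up will depend on the perl and threads::shared versions in use, so no particular output is claimed.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Symbol qw(gensym);
use Devel::Peek qw(Dump);
use File::Temp qw(tempfile);

# Hypothetical helper: run Devel::Peek::Dump on a value and return the
# text it would normally print to STDERR.
sub dump_to_string {
    my ($value) = @_;
    my (undef, $path) = tempfile(UNLINK => 1);
    open(my $saved, '>&', \*STDERR) or die "cannot save STDERR: $!";
    open(STDERR, '>', $path)        or die "cannot redirect STDERR: $!";
    Dump($value);
    open(STDERR, '>&', $saved)      or die "cannot restore STDERR: $!";
    open(my $in, '<', $path)        or die "cannot read dump: $!";
    local $/;                       # slurp the whole file
    return scalar <$in>;
}

# An unshared glob, as returned by Symbol::gensym().
my $glob_dump = dump_to_string(gensym());

# An ordinary shared scalar, to see what the sharing magic looks like.
my $shared :shared = 42;
my $shared_dump = dump_to_string(\$shared);

print "--- gensym glob ---\n$glob_dump\n";
print "--- shared scalar ---\n$shared_dump\n";
```

Comparing the MAGIC sections of the two dumps side by side is one way to see which magic a value carries before and after sharing, and whether any of it survives a later assignment.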

        Maybe I'm misinterpreting this output; this is somewhat of a work in progress that has recently been interrupted by other things. What I do know is that after you assign a globref to a hash (or array) element, it no longer acts as a globref, which is probably why it was disabled in the first place.

        But I am convinced that it is a limitation of the magic handling in shared.xs, rather than a fundamental limitation of perl's magic handling.

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
