PerlMonks  

Re: On timely destruction?

by PhiRatE (Monk)
on Aug 28, 2002 at 23:51 UTC ([id://193624])


in reply to On timely destruction?

I think the important thing is to consider just how costly the other options are. I'm no expert when it comes to GC, but there are myriad options available, and I think timely destruction has one particular benefit: consistency.

When you create an object, it's created immediately. When you assign something to one, or call a method on one, that happens immediately too. To take one particular operation and declare "This could happen at any old time" instantly puts the programmer into headache mode. They can never be sure whether a given hard-to-find bug is being caused by an object being destroyed too late, or something equivalent. You end up with people scattering unnecessary GC forces through their code while debugging and forgetting to remove them later, or even worse, with modules ending up on CPAN with explicit collections to get around obscure issues.

I see little choice but to make timely GC an option, but I am confident that there is enough collective brainpower available to make it efficient as an option under reasonable circumstances.

For example, we could say "Yes, you may specify this class requires timely destruction, but if you do so, its *construction* will take longer than normal", and we could then take that construction time to run down the graph tagging all the parents and containers of that object to indicate that there is a child that requires timely destruction.

Or we may hold a separate list of objects that require timely destruction, and every time an object goes out of scope, if we have anything on that list, we run up the graph from each entry to see whether we can reach the object that went out of scope. If so, we trigger a GC immediately.

Not all of these are practical within the architecture available, and my knowledge of the Parrot internals is pretty much zero. But there are a number of options that, when no destructors are specified as timely, will run pretty much the same as a system with no such option, and in which, when one is specified, we start making performance compromises in return for time guarantees.
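
A rough sketch of that second idea, in Python rather than anything Parrot-specific. Every name here (Node, hold, reaches_upward, scope_exit) is made up for illustration, and it assumes each object keeps a list of whatever holds a reference to it, which a real VM may well not do:

```python
class Node:
    """Toy heap object; `holders` lists the objects holding a reference to it."""
    def __init__(self, name):
        self.name = name
        self.holders = []


def hold(container, item):
    """Record that `container` holds a reference to `item`."""
    item.holders.append(container)


def reaches_upward(start, target):
    """Walk from `start` up through its holders; True if `target` is reached."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node is target:
            return True
        if id(node) in seen:
            continue
        seen.add(id(node))
        stack.extend(node.holders)
    return False


def scope_exit(dead, timely, collect):
    """Called when `dead` leaves scope; collect only if a timely object hangs off it."""
    if any(reaches_upward(t, dead) for t in timely):
        collect()
        return True
    return False
```

With an empty timely list, scope_exit does no graph walking at all, which is the "costs nothing unless you ask for it" property I'm after. With entries on the list, every scope exit pays for one upward walk per entry.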

Of course, in the end, the one who does the coding, gets to make the choice. If I don't like it, I'll do it different later on :)

Re: Re: On timely destruction?
by Elian (Parson) on Aug 29, 2002 at 07:28 UTC
    I think the important thing is to consider just how costly the other options are. I'm no expert when it comes to GC, but there are a myriad of options available and I think that timely destruction has one particular benefit: consistency.
    The timely deterministic options (which, in a language with references, is singular--refcounting) are expensive, both in processor time and in programming time. Also rather error-prone, unfortunately.

    They also don't guarantee consistency, though they do guarantee determinism. Not that you're necessarily going to guess the time right, but you've a reasonably good chance.

    To take one particular operation and declare "This could happen at any old time" instantly puts the programmer into headache mode
    That's always true, though. Perl is sufficiently introspective, and is getting more introspective, to make timing of destruction potentially indeterminate. And most objects don't have DESTROY methods. Just wait until we start throw
    I see little choice but to make timely GC an option, but I am confident that there is enough collective brainpower available to make it efficient as an option under reasonable circumstances.
    Thanks for the vote of confidence, but I should point out that I am the brainpower in this case, along with a stack of books and papers by people rather more clever than I am. If it was easy or inexpensive to do this, I wouldn't be asking the question.
      Thanks for the vote of confidence, but I should point out that I am the brainpower in this case, along with a stack of books and papers by people rather more clever than I am. If it was easy or inexpensive to do this, I wouldn't be asking the question.

      Ah, then your solution is indeed somewhat different. I would suggest instead a garbage collection API, such that others who may have some brilliant idea we had not thought of may later come along and implement "deterministic but slow" gc, and someone else may come along and implement "never executes a destructor but is blindingly fast" gc, and yet another may simply look at the standard does-its-best gc and think "I can do better".

      If you make it practical for others to implement their own GC if necessary, you can set aside the question of whether determinism is needed right now and put it back on the stack for the people who need it (itch, scratch) to worry about.

      I think your determination that refcounting is the only solution for a language with references is premature. While I agree that, in the general case, it is unlikely there are any alternatives, in the specific case of an embedded system with a number of precisely known quantities such as memory, code to execute and others, refcounting is only one of many potential solutions to the problem, including the potential option of saying "I don't need to gc at all", and another of saying "well shit, we'll just add some gc-supporting hardware to our board since we're suffering so much from it.."

      So, in summary, it is my determination (still :) that the option needs to be available for deterministic GC. I do not, however, think that you need do it, only that the option is readily available. I'm not convinced simply being open-source is enough in this case. The capability to easily select the desired GC at build time (at least, but probably good enough) is one acceptable way of providing others with both the option and the motivation to implement the GC that fits their needs best.
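
To make the "selectable GC" idea a bit more concrete, here is a minimal sketch in Python. Nothing in it is Parrot's actual API: Collector, NeverCollect, MarkSweep and build_collector are hypothetical names, and the heap is just a dict mapping each object name to the names it references.

```python
from abc import ABC, abstractmethod


class Collector(ABC):
    """Hypothetical plug-in point: given a heap and its roots, say what to destroy."""

    @abstractmethod
    def collect(self, heap, roots):
        """heap maps object name -> names it references; returns names to destroy."""


class NeverCollect(Collector):
    """The 'never executes a destructor but is blindingly fast' option."""

    def collect(self, heap, roots):
        return []


class MarkSweep(Collector):
    """A plain tracing collector: anything unreachable from the roots is garbage."""

    def collect(self, heap, roots):
        marked, stack = set(), list(roots)
        while stack:
            name = stack.pop()
            if name not in marked:
                marked.add(name)
                stack.extend(heap[name])
        return sorted(n for n in heap if n not in marked)


# Stand-in for a build-time switch: the runtime only ever sees a Collector.
COLLECTORS = {"none": NeverCollect, "mark-sweep": MarkSweep}


def build_collector(choice):
    return COLLECTORS[choice]()
```

For example, with heap = {"a": ["b"], "b": [], "c": ["c"]} and roots = ["a"], the "mark-sweep" choice reports ["c"] as garbage while "none" reports nothing. The point is only that the rest of the runtime doesn't care which one it was built with.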

      You never know, some smart-ass research group might decide that, with Parrot supporting so many languages, and with such an easy plug-in for the GC, they could spend a bunch of research money coming up with something with stupefyingly tricky statistical optimisations we daren't consider, which take Parrot's GC beyond state of the art. Such things are inclined to happen when the architecture supports it.

        Thanks for the vote of confidence, but I should point out that I am the brainpower in this case, along with a stack of books and papers by people rather more clever than I am. If it was easy or inexpensive to do this, I wouldn't be asking the question.

        Ah, then your solution is indeed somewhat different. I would suggest instead a garbage collection API,

        While there will be a GC API, unfortunately it's not the solution in this case. Reference counting can't be added in after the fact, since it involves a lot of code, scattered through all the core and extension source, that we wouldn't otherwise be writing. That's one of the advantages to tracing collectors--you don't have to worry about GC code in your mainline code.
        I think your determination that refcounting is the only solution for a language with references is premature.
        Unfortunately not. For true timely DESTROY calling, it's the only option. (Though whether destruction can ever be truly deterministic in the presence of threads, closures, and continuations is up in the air)

        Choosing timely destruction in the face of references requires refcounting. (timely, here, meaning "as soon as the last reference to an object goes away")

        There's no way to do static analysis of a program such that you can determine at compile time when a variable is no longer used, since taking a reference allows a variable to escape its scope. Throw in some of the heavy introspection capabilities that are on the way and you're completely out of luck, since library code you may not know about can peek at and take references to your lexicals.

        Since we can't do static analysis, that requires a runtime solution. And for that it's either tracing every time a variable dies (which is really pricey), or reference counting.
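
A toy illustration of the refcounting side, in Python and with made-up names (Counted, incref, decref). It is a sketch of the bookkeeping only, not how Perl or Parrot implement it. The thing to notice is that every place a reference is copied or dropped has to make one of these calls, which is the scattered code mentioned above:

```python
class Counted:
    """Toy object with an explicit reference count and a destructor callback."""
    def __init__(self, name, on_destroy):
        self.name = name
        self.refs = 0
        self.on_destroy = on_destroy


def incref(obj):
    """Must be called wherever a new reference to `obj` is taken."""
    obj.refs += 1
    return obj


def decref(obj):
    """Must be called wherever a reference is dropped; destroys at zero."""
    obj.refs -= 1
    if obj.refs == 0:
        obj.on_destroy(obj.name)


log = []
fh = incref(Counted("fh", log.append))  # the lexical holds one reference
escaped = incref(fh)                    # a reference escapes the scope
decref(fh)                              # lexical goes out of scope: still alive
still_alive = list(log)
decref(escaped)                         # last reference goes away: destroyed now
```

After the first decref the log is still empty; the destructor runs during the second one, at the moment the last reference is dropped. Miss a single incref or decref anywhere in the core or an extension and you get either a leak or a premature destroy.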

        So, in summary, it is my determination (still :) that the option needs to be available for deterministic GC.
        But the question is still why? For what purpose? What will break, besides personal comfort, without at least the illusion of timely destruction? Abigail's given a few examples, most of which can be dealt with in other ways. How many classes have you written that both have a DESTROY and would behave oddly if the timing of those DESTROY calls weren't apparently set in stone? And if you have them, at what granularity is DESTROY calling acceptable?
      I offer a way-out suggestion just for the hell of it:

      It is clear that the non-timeliness of destruction only becomes an issue when an item of some kind of scarcity is held by the relevant object.

      In your example above, the item in question is a lock on a file, in other cases it is a database handle, in other cases it will be some other resource.

      We can solve *part* of the problem (short of refcounting) by registering such contentious resources internally. Thus, rather than closing the filehandle at the right time above, we would instead have the second open say "I wish to register my interest in a file lock on this file" at which point the registry will say "well shit, someone else already has that, lemme check if I should run a gc".
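
Something like the following, as a very rough Python sketch. LockRegistry and its methods are invented for the example, and the collect callback stands in for "run a gc and tell me which owners turned out to be dead", which is an assumption about what the collector could report:

```python
class LockRegistry:
    """Toy registry of contentious resources, keyed by resource name."""

    def __init__(self, collect):
        self.holders = {}        # resource -> current owner
        self.collect = collect   # callback: runs a GC, returns the set of dead owners

    def acquire(self, resource, owner):
        """Try to take `resource`; on contention, run a GC before giving up."""
        if resource in self.holders:
            dead = self.collect()
            if self.holders[resource] in dead:
                # The previous owner was garbage all along; release on its behalf.
                del self.holders[resource]
        if resource in self.holders:
            return False
        self.holders[resource] = owner
        return True
```

The collector is only bothered when two parties actually want the same resource, so uncontended code pays nothing beyond a dictionary lookup.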

      It's not a particularly pretty concept, in that it requires a lot of determining and registering of likely contentious resources. However, due to the nature of the Parrot design, you may find that it fits quite well as a middle-ground solution, preventing close timing issues within the same instance (although obviously distinct processes will not benefit).

        We've talked about having some sort of immediate resource registry that would be checked at some defined interval (block exit, sub exit, sub entry, something like that) and trigger a sweep for dead objects if there were any registered. While it's more expensive than truly on-demand collection, it provides some minimum guarantees on the lifetime of potentially transient objects without too much overhead or potential for bugs in the source.
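
A toy model of that registry, in Python, purely to show the shape of it. Runtime, new, sweep and block_exit are illustrative names, not Parrot internals, and the "defined interval" is assumed here to be block exit:

```python
class Runtime:
    """Toy runtime: a heap of named objects plus an 'immediate' registry."""

    def __init__(self):
        self.heap = {}          # name -> names it references
        self.roots = set()      # names currently in scope
        self.registry = set()   # live objects that asked for prompt cleanup
        self.destroyed = []     # destruction order, for inspection

    def new(self, name, refs=(), immediate=False):
        self.heap[name] = list(refs)
        self.roots.add(name)
        if immediate:
            self.registry.add(name)

    def sweep(self):
        """Mark from the roots, then destroy everything left unmarked."""
        marked, stack = set(), list(self.roots)
        while stack:
            name = stack.pop()
            if name not in marked:
                marked.add(name)
                stack.extend(self.heap[name])
        for name in sorted(set(self.heap) - marked):
            del self.heap[name]
            self.registry.discard(name)
            self.destroyed.append(name)

    def block_exit(self, leaving):
        """Drop the block's variables; sweep only if the registry is non-empty."""
        self.roots -= set(leaving)
        if not self.registry:
            return False
        self.sweep()
        return True
```

With nothing registered, block exit is just the root bookkeeping and garbage waits for the normal collector. Once something is registered, every block exit pays for a sweep until that object dies, which is the extra expense mentioned above.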

        It shouldn't be that bad as far as code goes, as usually there's not much in the way of truly contentious resources in use. (Though I suppose some folks might disagree)
