Beefy Boxes and Bandwidth Generously Provided by pair Networks
"be consistent"
 
PerlMonks  

Re^2: Completeness of Devel::Size answers

by jand (Friar)
on Sep 23, 2008 at 00:09 UTC ( [id://713120]=note: print w/replies, xml ) Need Help??


in reply to Re: Completeness of Devel::Size answers
in thread Completeness of Devel::Size answers

You can detect circles without all the bookkeeping overhead, but you wouldn't be able to calculate the total size of a structure containing circular references (you may end up counting some elements multiple times, and you can't be sure you traversed the whole structure before you detected a circle).

Replies are listed 'Best First'.
Re^3: Completeness of Devel::Size answers
by BrowserUk (Patriarch) on Sep 23, 2008 at 00:35 UTC

      To detect cycles, you need only keep a hash as large as your search is deep. That even allows you to detect cycles "immediately", but it doesn't tell you when you have the same structure referenced from two different places in your directed, acyclic graph: my @a= ( \%huge, \%huge );

      - tye        

      FWIW, I was stunned when I came across the following:

      To detect cycles you can use two pointers -- say, p and q. You step p through the data in the usual way. You step q through the data twice for each step of p, until the end is met. If p equals q before the end, you have a loop.

      The tricky bit in a recursive descent tree walk is to walk two pointers at once ! You need to build an iterator for q which contains at least a stack for recursion, and, for Perl, some way of iterating through a hash which isn't going to trip up p. (Of course, you have to guarantee that p and q will visit the nodes in the same order, especially where there is a loop !)

      Apart from that minor difficulty, this trick only tells you there is a cycle. It doesn't tell you where it is.

      "It's a good sort of brake. But it hasn't worked yet."

        That is interesting. Horribly inefficient, but interesting :)

        Also, as noted, the tracking hash in Devel::Size doesn't only detect cycles, but also prevents data structures and subtrees that are referenced from 2 (or more) places in the overall structure from being counted multiple times. This isn't just required to detect repetition in the structure at the user level, but also in the internal structures. For example, internally, some hashes can share keys, and the same hash mechanism is used to prevent the memory for shared keys being counted over.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
        This works, but it's not very efficient in terms of CPU.

        This technique is used in software verification tools that do depth-first search on transition graphs. Often these graphs are much too large to fit into memory, so they are lazily generated and discarded of, so schemes like this are the only rescue.

        Indeed. Most probably OT, but this reminded me of the appendix Secrets of Programmer Jobs Interviews from a book I enjoyed reading a couple of years ago: Expert C Programming - Deep C Secrets (with a fish on the cover ;-) by Peter van der Linden (1994) (Update: search http://books.google.com/).

        The task is to detect a loop in a linear list. First you are allowed to sketch any algorithm you like. Then more and more restrictions are imposed like the list is read-only, RAM is too limited, must use CPU registers only, etc.

      If you have a circle, then any value in that circle will repeat infinitely often. So at depths 2, 4, 8, and so on you switch to looking for the current element at deeper depths. When you back up, you stop looking for a match. At any point in time you only need to remember your current depth and possibly the one element that you're looking for duplicates of.

      This is guaranteed to find any circular references. It just won't always find them promptly, and so in a total size count you might count elements more than once.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://713120]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others about the Monastery: (7)
As of 2024-03-28 21:43 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found