PerlMonks  

Re^10: Memory leak question

by SBECK (Chaplain)
on Oct 05, 2010 at 20:09 UTC ( [id://863671] )


in reply to Re^9: Memory leak question
in thread Memory leak question

I've been using both of these... but the output is cryptic (to me at least... I suspect someone for whom memory leak tracing is second nature would find them more than adequate). I played just a bit with valgrind too... I may spend some additional time with it.

I'd certainly appreciate you looking at my module... but I don't expect you to. You've already given me tons of help. If you're up for it though, I've placed a copy at http://sullybeck.com/Date-Manip-6.13.tar.gz (it's a little bigger than I wanted to include as an attachment).

Thanks again.

Re^11: Memory leak question
by BrowserUk (Patriarch) on Oct 05, 2010 at 20:29 UTC

    I downloaded it, and attempted a build, but--as is typical with modules that use the insanely complicated, egotistically over-engineered, crappily buggy, Module::Build--it outputs Building Date-Manip and then sits chewing 100% cpu and does absolutely nothing.

    Total insanity for a pure perl module that doesn't even need a bloody compiler. The Module::Build author should be strung up by his nether regions!

    So sorry, unless you have an alternative distribution that will build on Windows, I won't be able to do anything.

    At least not tonight. Maybe tomorrow I'll have a go at a manual install--but given the complexity of the thing, I'm not making any promises.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      I wasn't aware that Module::Build was such a problem on Windows. I do very little development on Windows (basically just a small amount of debugging) and hadn't encountered that problem.

      Don't waste your time doing a manual install... with over 900 timezone modules in Date::Manip, that wouldn't be a good use of your time. Instead, I have created a new bundle which uses Makefile.PL instead and put it at: http://sullybeck.com/Date-Manip-6.13a.tar.gz.

      As a side note, the reason that Date::Manip currently uses Build.PL is that it gave me the flexibility to test what version of perl was running and then install either version 5 or version 6 of Date::Manip (Date::Manip 6 requires perl 5.10 or higher). I just threw away version 5 for the temporary bundle I created for you, so I didn't need that flexibility.

      I was planning on changing this however. After playing with this for a few versions, I've decided that I want to simply install both versions for everyone and then have Date::Manip be a wrapper to load the appropriate version. This will mean being able to go back to providing both a Build.PL and Makefile.PL. This planned change was a bit lower on my priority list, but given your previous message, it has jumped up much higher, and I'm going to try to include that in the next release.
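      A minimal sketch of how such a wrapper might choose which version to load, assuming the two implementations are installed side by side. The backend names Date::Manip::DM5 and Date::Manip::DM6 and the helper pick_backend() are hypothetical, used only for illustration; only the perl 5.10 threshold comes from the post above.

```perl
use strict;
use warnings;

# Hypothetical backend names: the actual layout of the planned release
# is not given in this thread.
sub pick_backend {
    my ($perl_version) = @_;    # numeric version in the form of $], e.g. 5.010001

    # Date::Manip 6 requires perl 5.10 or higher; older perls get version 5.
    return $perl_version >= 5.010 ? 'Date::Manip::DM6' : 'Date::Manip::DM5';
}

# $] holds the running interpreter's version as a number.
print "would load: ", pick_backend($]), "\n";
```

      A real wrapper would then require the chosen module at load time; that part is left out here because the module names are assumptions.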

      Thanks again.

        I believe you are being bitten by regex engine leaks.

        Here's what I discovered.

        1. If I replace _iso8601_rx() with the bare minimum to parse the date/time in the test, the memory leaks disappear completely.
          my %cache;

          sub _iso8601_rx {
              my($self,$rx) = @_;
              my $dmt = $$self{'tz'};
              my $dmb = $$dmt{'base'};

              return $cache{ $rx } if exists $cache{ $rx };
          }

          $cache{cdate}    = '(?<y>\d\d\d\d)-(?<m>\d\d)-(?<d>\d\d)';
          $cache{ctime}    = '(?<h>\d\d):(?<mn>\d\d):(?<s>\d\d)';
          $cache{fulldate} = "$cache{cdate}\\s+$cache{ctime}";

          1;
        2. However, if I change that to using the fully expanded regexes, it goes back to leaking like a sieve:
          my %cache;

          sub _iso8601_rx {
              my($self,$rx) = @_;
              my $dmt = $$self{'tz'};
              my $dmb = $$dmt{'base'};

              return $cache{ $rx } if exists $cache{ $rx };
          }

          $cache{cdate} = <<'ERX';
          (?i-xsm:(?:(?<y>\d\d\d\d)(?<m>\d\d)(?<d>\d\d)|(?<y>\d\d\d\d)\-(?<m>\d\d)\-(?<d>\d\d)|\-(?<y>\d\d)(?<m>\d\d)(?<d>\d\d)|\-(?<y>\d\d)\-(?<m>\d\d)\-(?<d>\d\d)|\-?(?<y>\d\d)(?<m>\d\d)(?<d>\d\d)|\-?(?<y>\d\d)\-(?<m>\d\d)\-(?<d>\d\d)|\-\-(?<m>\d\d)\-?(?<d>\d\d)|\-\-\-(?<d>\d\d)|(?<y>\d\d\d\d)\-?(?<doy>\d\d\d)|\-?(?<y>\d\d)\-?(?<doy>\d\d\d)|\-(?<doy>\d\d\d)|(?<y>\d\d\d\d)W(?<w>\d\d)(?<dow>\d)|(?<y>\d\d\d\d)\-W(?<w>\d\d)\-(?<dow>\d)|\-?(?<y>\d\d)W(?<w>\d\d)(?<dow>\d)|\-?(?<y>\d\d)\-W(?<w>\d\d)\-(?<dow>\d)|\-?(?<yod>\d)W(?<w>\d\d)(?<dow>\d)|\-?(?<yod>\d)\-W(?<w>\d\d)\-(?<dow>\d)|\-W(?<w>\d\d)\-?(?<dow>\d)|\-W\-(?<dow>\d)|\-\-\-(?<dow>\d)))
          ERX

          $cache{ctime} = <<'ERX';
          (?-xism:(?:(?<h>[0-1][0-9]|2[0-3])(?<mn>[0-5][0-9])(?<s>[0-5][0-9])(?:[\.,]\d*)?|(?<h>[0-1][0-9]|2[0-3]):(?<mn>[0-5][0-9]):(?<s>
          ... bulk of the regex elided because PM won't let me post that much! ...
          azt|ret|mot|gyt|lrt|ut|e|a|u|k|o|d|z|t|n|p|y|g|w|s|c|i|m|b|q|v|r|x|h|f|l)) \))? ))))?)
          ERX

          $cache{fulldate} = <<'ERX';
          (?x-ism:^\s*(?: (?i-xsm:(?:(?<y>\d\d\d\d)(?<m>\d\d)(?<d>\d\d)|(?<y>\d\d\d\d)\-(?<m>\d\d)\-(?<d>\d\d)|\-(?<y>\d\d)(?<m>\d\d)(?<d>\d\d)|\-(?<y>\d\d)\-(?<m>\d\d)\-(?<d>\d\d)|\-?(?<y>\d\d)(?<m>\d\d)(?<d>\d\d)|\-?(?<y>\d\d)\-(?<m>\d\d)\-(?<d>\d\d)|\-\-(?<m>\d
          ... bulk of the regex elided because PM won't let me post that much in a single post! ...
          nmt|lkt|gst|vet|tjt|eat|ept|cat|pht|pwt|nft|set|gft|hst|nut|qmt|mpt|trt|ywt|cdt|emt|met|ast|net|kst|ect|brt|bdt|mvt|cst|cvt|fmt|azt|ret|mot|gyt|lrt|ut|e|a|u|k|o|d|z|t|n|p|y|g|w|s|c|i|m|b|q|v|r|x|h|f|l)) \))?))))?) | (?-xism:(?:(?<h>[0-1][0-9]|2[0-3])|\-(?<mn>[0-5][0-9]))) )\s*$)
          ERX

          1;

        I thought that it was maybe the use of (so many) named captures, but I tried very hard to make them leak: a single regex with 175,000 named captures, matched with /g against a string that contained 10,000 matches for them, in a (very slow) loop. Memory use grew very large, but once it maxed out, it didn't leak at all.
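        A scaled-down sketch of that kind of stress test, for anyone who wants to repeat it. The group names (n1, n2, ...), the subject string, and the counts are all assumptions made for illustration; the original test script was not posted.

```perl
use strict;
use warnings;

my $n = 1_000;    # scaled down from the 175,000 named captures in the post

# One alternation with $n distinctly named groups:
#   (?<n1>x1\b)|(?<n2>x2\b)|...
my $src = join '|', map { "(?<n$_>x$_\\b)" } 1 .. $n;
my $re  = qr/$src/;

# A subject string containing exactly one match for each group.
my $str = join ' ', map { "x$_" } 1 .. $n;

my $matches = 0;
for my $pass ( 1 .. 3 ) {    # the post looped far longer than this
    while ( $str =~ /$re/g ) {
        my ($name) = keys %+;    # the one named group that matched
        $matches++;
    }
}

print "$matches matches\n";
```

        Watching the process size while raising $n and the number of passes shows whether memory plateaus or keeps climbing.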

        So then I remembered that I'd seen the regex trie optimisation cause problems with large alternations, but disabling it didn't change things.
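        One way to switch the trie optimisation off is the ${^RE_TRIE_MAXBUF} variable described in perlvar. This is only a sketch: the post does not say which mechanism was actually used, and the short alternation below reuses a few zone abbreviations from the regexes above as a stand-in for the real ones.

```perl
use strict;
use warnings;

# Assumption: the optimisation is disabled through ${^RE_TRIE_MAXBUF};
# the post does not state how it was actually done.
my @zones = qw(azt ret mot gyt lrt ut);    # a few abbreviations from the post
my $alt   = join '|', @zones;

my $re;
{
    # Per perlvar, a negative value prevents the trie optimisation.
    local ${^RE_TRIE_MAXBUF} = -1;

    # Interpolating $alt means the pattern is compiled at run time,
    # while the setting above is in effect.
    $re = qr/^(?:$alt)$/i;
}

print "gyt matches\n"        if 'gyt' =~ $re;
print "xyz does not match\n" unless 'xyz' =~ $re;
```

        The setting changes how the pattern is compiled, not what it matches, so results should be identical with and without it.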

        Then I thought to try your monster regexes in a standalone script and run them directly on the sample date in a loop:

        #! perl
        use strict;

        my %cache = ( ctime => <<'RXA', cdate => <<'RXB', fulldate => <<'RXC' );
        ##... monster regex initialisation elided;

        my $refull  = qr[$cache{ fulldate }]x;
        my $rectime = qr[$cache{ ctime }]x;
        my $recdate = qr[$cache{ cdate }]x;

        for (1..100e6) {
            "2010-02-01 01:02:03" =~ $refull;
            "2010-02-01 01:02:03" =~ $rectime;
            "2010-02-01 01:02:03" =~ $recdate;
        }

        it doesn't leak at all. Not a jot.

        So, it's not just the monster regexes, but also how they are being used, or how the results are being used, that triggers the leak.
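        To compare different ways of using the regexes and their results, a small harness that reports memory growth over many iterations may help. This is a rough sketch under stated assumptions: it reads VmRSS from /proc/self/status, so it only measures anything on Linux (on Windows it reports that memory is not measurable), and the helper names rss_kb() and growth_kb() are made up for this example.

```perl
use strict;
use warnings;

# Resident set size in kB, or undef where /proc is not available.
sub rss_kb {
    open my $fh, '<', '/proc/self/status' or return undef;
    while ( my $line = <$fh> ) {
        return $1 if $line =~ /^VmRSS:\s+(\d+)\s+kB/;
    }
    return undef;
}

# Run $code $iterations times and return the growth in kB
# (undef if memory cannot be measured on this platform).
sub growth_kb {
    my ( $code, $iterations ) = @_;

    $code->() for 1 .. 1000;    # warm-up, so one-off allocations are excluded
    my $before = rss_kb();
    $code->() for 1 .. $iterations;
    my $after = rss_kb();

    return defined $before && defined $after ? $after - $before : undef;
}

# Stand-in for the code under test: match, then copy the named captures.
my $re = qr/(?<y>\d\d\d\d)-(?<m>\d\d)-(?<d>\d\d)/;
my $growth = growth_kb( sub { my %got = "2010-02-01" =~ $re ? %+ : () }, 100_000 );

print defined $growth
    ? "grew by $growth kB\n"
    : "memory not measurable on this platform\n";
```

        Swapping the closure for progressively larger pieces of the real parsing code is one way to narrow down which use of the match results makes the growth appear.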

        I'm kinda stuck for a direction in which to go now, but I hope that this will help you zero in on the cause. I'll keep looking.



        Got it thanks. It downloaded and installed this time in 2 minutes.

        I left the M::B build running last night whilst I watched a film, and 1 1/2 hours later it was still using 100% cpu and still had done nothing. As best I can tell, it is just grepping around the build tree looking for POD, reading every file over & over and over. Dog only knows why!

        I've looked to try and fix it several times, but it is sooo (unnecessarily) complicated that I get nowhere.


