Out Of Memory error at 950MB with 14GB free RAM

by aburker (Sexton)
on Feb 10, 2004 at 06:45 UTC ( [id://327832]=perlquestion )

aburker has asked for the wisdom of the Perl Monks concerning the following question:

Hi!

I have to parse a 50MB file and HAVE TO put the WHOLE file into memory in a dynamically created structure (pointers to arrays of pointers to arrays... ;-). It's bad enough that the memory consumption is nearly 19 times the file size (> 950MB), but the real problem is that the perl script terminates with an "Out of memory!" message.

This would be no surprise if I had no more than about 1GB of RAM, but the machine has 14GB of free memory (and NO, the GB is not a typo ;-)

what i have checked:
====================
* enough free memory (physical and swap)
* I also have unlimited access rights to the memory.
* Compiling perl with the -DPACK_MALLOC and -DTWO_POT_OPTIMIZE options also doesn't help!
* The program crashes every time at the same memory consumption, not related to the input file!
* because of the things I will have to do with the data, it would be very complicated to handle smaller parts of the file at a time...

so the question(s):
===================
* is it a perl bug? or is there a compiler/runtime option I can set so that more memory can be used?
* any other suggestions?

info:
=====

$>uname -a
HP-UX edsserv3 B.11.00 U 9000/800 613309356 unlimited-user license

$>ulimit -a
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        147456
memory(kbytes)       unlimited
coredump(blocks)     4194303

$>perl -V
Summary of my perl5 (revision 5.0 version 6 subversion 0) configuration:
  Platform:
    osname=hpux, osvers=11.00, archname=PA-RISC2.0
    uname='hp-ux hostname b.11.00 u 9000800 613309356 unlimited-user license '
    config_args=''
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=undef d_sfio=undef uselargefiles=define
    use64bitint=undef use64bitall=undef uselongdouble=undef usesocks=undef
  Compiler:
    cc='cc', optimize='-O +Onolimit', gccversion=
    cppflags='-Ae'
    ccflags =' -Ae -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 '
    stdchar='unsigned char', d_stdstdio=define, usevfork=false
    intsize=4, longsize=4, ptrsize=4, doublesize=8
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, usemymalloc=y, prototype=define
  Linker and Libraries:
    ld='ld', ldflags =''
    libpth=/usr/local/lib /lib /usr/lib /usr/ccs/lib
    libs=-lnsl -lnm -lndbm -ldld -lm -lc -lndir -lcrypt -lsec
    libc=/lib/libc.sl, so=sl, useshrplib=false, libperl=libperl.a
  Dynamic Linking:
    dlsrc=dl_hpux.xs, dlext=sl, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-B,deferred '
    cccdlflags='+z', lddlflags='-b -s -a shared'

Characteristics of this binary (from libperl):
  Compile-time options: USE_LARGE_FILES
  Built under hpux
  Compiled at Mar 6 2001 18:11:51
  %ENV:
    PERL5LIB=".:/home/ucsab/bin:/hot_work/develop/main/tools"
    PERL_DEBUG_MSTATS="2"
  @INC:
    .
    /home/ucsab/bin
    /hot_work/develop/main/tools
    /opt/perl5/lib/5.6.0/PA-RISC2.0
    /opt/perl5/lib/5.6.0
    /opt/perl5/lib/site_perl/5.6.0/PA-RISC2.0
    /opt/perl5/lib/site_perl/5.6.0
    /opt/perl5/lib/site_perl
    .

$>top
System: hostname                                   Tue Feb 10 07:31:39 2004
Load averages: 0.57, 0.52, 0.47
250 processes: 245 sleeping, 5 running
Cpu states:
CPU   LOAD    USER   NICE    SYS   IDLE  BLOCK  SWAIT   INTR   SSYS
 0    0.45    1.0%   0.0%   4.0%  95.0%   0.0%   0.0%   0.0%   0.0%
 1    0.72    9.1%   0.0%   7.3%  83.5%   0.0%   0.0%   0.0%   0.0%
 2    1.00  100.0%   0.0%   0.0%   0.0%   0.0%   0.0%   0.0%   0.0%
 3    0.52   23.4%   0.0%   4.2%  72.4%   0.0%   0.0%   0.0%   0.0%
 4    0.43   11.5%   0.0%  10.9%  77.6%   0.0%   0.0%   0.0%   0.0%
 5    0.41    9.5%   0.0%   3.6%  86.9%   0.0%   0.0%   0.0%   0.0%
 6    0.41    3.0%   0.0%   6.7%  90.3%   0.0%   0.0%   0.0%   0.0%
 7    0.64    8.3%   0.0%   3.4%  88.3%   0.0%   0.0%   0.0%   0.0%
---   ----   -----  -----  -----  -----  -----  -----  -----  -----
avg   0.57   20.8%   0.0%   5.1%  74.1%   0.0%   0.0%   0.0%   0.0%

Memory: 1373464K (1254588K) real, 1364092K (1263948K) virtual, 13665688K free  Page# 1/32

CPU TTY     PID USERNAME PRI NI   SIZE    RES STATE    TIME %WCPU  %CPU COMMAND
 2  pts/1  8404 ucsab    241 20   905M   898M run      9:24 99.78 99.60 perl
 3  ?      8678 ora_rman 241 20 45404K 17088K run      1:00 62.25 61.76 oracleOBAA

$>skript.pl
...
Out of memory!
Error: Script <skript.pl> returned exit value <1>!
$>

Thanks for any advice
alex

Edited by BazB. Added readmore tags.

janitored by ybiC: Retitled from "Out of memory! at 950MB with 14GB free RAM!!!"

Replies are listed 'Best First'.
Re: Out Of Memory error at 950MB with 14GB free RAM
by ysth (Canon) on Feb 10, 2004 at 07:01 UTC
    Why the ancient perl? You can try Configuring with -Uusemymalloc and see if that makes a difference. Or try perl5.8.3; I think there were some improvements in perl's malloc (what you get when mymalloc is set) but can't recall the details offhand (other than that malloc.c got its missing LotR quote added).

    Update: http://groups.google.com/groups?threadm=m3d6jsp4eq.fsf%40franz.ak.mind.de would seem to indicate perl's malloc does have a 1GB limit (though I didn't see an actual authoritative statement to that effect in that thread.)

    I'd encourage you to submit a perl bug report (see perldoc perlbug), and try -Uusemymalloc. I note that you are not using 64-bit pointers, so you are going to have at the very best a 4GB (and quite possibly a 2GB) limit anyway. See README.hpux for information on building 64-bit perl.
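The build settings mentioned above (ptrsize, usemymalloc, use64bitall) can also be read from inside a script through the core Config module, which is handy when the same script may run under several perl builds. This is only a minimal sketch: the helper name memory_build_info is made up for illustration, and the values printed depend entirely on the perl that runs it.

```perl
use strict;
use warnings;
use Config;    # core module exposing the same values that `perl -V` prints

# Hypothetical helper: collect the build settings that bound how much
# memory this perl binary can address.
sub memory_build_info {
    my %info;
    for my $key (qw(ptrsize ivsize usemymalloc use64bitall)) {
        # Config reports unset options as undef; show them as the string 'undef'
        $info{$key} = defined $Config{$key} ? $Config{$key} : 'undef';
    }
    return \%info;
}

my $info = memory_build_info();
printf "%-12s = %s\n", $_, $info->{$_} for sort keys %$info;
printf "pointers are %d bits wide\n", 8 * $info->{ptrsize};
```

On the perl described in the question this would report ptrsize = 4 and usemymalloc = y, matching the `perl -V` output.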

      wow!

      I really didn't expect this many answers, after a week of research on the topic that turned up not too much. THANKS!

      Your link really helped and brought me to the conclusion that the perl usemymalloc switch, which is set on the server, is very likely causing the problem.

      Unfortunately the sysadmin won't recompile perl without the switch (he will only use prepacked packages provided by HP)

      He will probably upgrade to 5.8, but this can take time... and (after reading the postings in your link) it will not fix the problem!

      other answers:
      *) I can't map this to hard disk (performance issue)
      *) I know 950MB is a lot of RAM, but I have to build up a complex structure made of small strings. This structure must be FAST to access. That is why the RAM usage gets big!
      *) And normally RAM is not the problem (why would someone buy a server with 8 CPUs and 16GB RAM if not for performance...). Just too bad that perl can't keep pace with that!
      *) the 4GB limit is not the problem; my program would need just 10% more memory to handle the largest file, and by now all the possible tweaks I could find are already done :-)

      conclusion:
      +++++++++++
      I will have to rewrite the program and make it slower, but this seems to be fewer days of work than recompiling perl and checking all the other scripts :-((((((

      But anyway thanks for your response!

        It's possible you're being bitten by the HP-UX caveat of maxdsiz/maxdsiz64. This variable limits the maximum size any single process can grow to, and no amount of recompiling will fix it. Run "kmtune -q maxdsiz" to see whether the process limit is causing this. If so, then fiddling with kmtune or "Kernel Configuration" from within SAM will allow you to change it.


        davis
        It's not easy to juggle a pregnant wife and a troubled child, but somehow I managed to fit in eight hours of TV a day.
        "Unfortunately the sysadmin won't recompile perl without the switch (he will only use prepacked packages provided by HP)"
        Point the sysadmin to the HP-UX Porting Centre. (It's not immediately clear to me if the perl-5.8.3 package there is 64-bit or not.)
        Might you just build your own userland build of perl, just for this program?
Re: Out Of Memory error at 950MB with 14GB free RAM
by davido (Cardinal) on Feb 10, 2004 at 07:30 UTC
    I know that you said you "HAVE TO" put the entire 50MB file into an in-memory complex datastructure. But begging your pardon for second-guessing that strategy (which isn't working for you), is it possible that you could use Tie::MLDBM instead, so that the seemingly in-memory datastructure can actually (mostly transparently) reside on your HD rather than in memory? ...just a thought.

    I know that it's not 100% transparent, but it allows for modifiable datastructures of arbitrary complexity, while keeping them disk-based so that memory usage doesn't go through the roof. This will carry with it a performance hit, but maybe in a pinch slower is better than not at all.

    You probably have a good reason for not doing this, but just in case, I thought I'd mention it on the off chance it might solve your problem without hacking Perl's internals to allow you to utilize more of your system's memory.


    Dave
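The disk-backed idea can be sketched with core modules only. Note that this is not Tie::MLDBM itself: it swaps in a tied SDBM_File hash with Storable serialization to show the same principle, and the helper names store_record and fetch_record are invented for the example. SDBM limits each key/value pair to roughly 1 KB, so this is a toy illustration rather than something sized for the OP's data.

```perl
use strict;
use warnings;
use Fcntl;                        # O_RDWR, O_CREAT
use SDBM_File;                    # core DBM implementation (small values only)
use Storable qw(freeze thaw);     # core serializer for nested structures
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
tie(my %db, 'SDBM_File', "$dir/records", O_RDWR | O_CREAT, 0666)
    or die "cannot tie DBM file: $!";

# Hypothetical helpers: serialize a nested structure into the on-disk hash
# and read it back again.
sub store_record {
    my ($key, $data) = @_;
    $db{$key} = freeze($data);
    return;
}

sub fetch_record {
    my ($key) = @_;
    my $frozen = $db{$key};
    return defined $frozen ? thaw($frozen) : undef;
}

store_record('node1', { name => 'root', children => [ 'a', 'b' ] });

# Nested data cannot be changed in place on disk: fetch, modify, store back.
my $record = fetch_record('node1');
push @{ $record->{children} }, 'c';
store_record('node1', $record);

my @children = @{ fetch_record('node1')->{children} };
print join(',', @children), "\n";    # prints a,b,c

untie %db;
```

The fetch/modify/store round trip is the "not 100% transparent" part: every access pays for serialization, which is where the performance hit comes from.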

Re: Out Of Memory error at 950MB with 14GB free RAM
by Abigail-II (Bishop) on Feb 10, 2004 at 08:30 UTC
    950 MB for a 50 MB file is a lot of memory, even for Perl. How are you storing your file? As an array of characters? Then the memory consumption is no surprise. But if you store the file as a single string, you should consume about 50 MB of memory.

    I'd first try to find out what is causing the memory consumption. It might not even be related to your input. Something of the form $a [[1]] = 1; could cause the out of memory problem as well.

    Abigail
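To make the string-versus-array point concrete, here is a small sketch that stores the same data both ways and counts the scalars involved. The sample data is a stand-in for the real file, and the optional measurement relies on Devel::Size, a CPAN module that is assumed to be possibly absent, so it is only used if it loads.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the 50 MB input file.
my $data = 'x' x 100_000;

# One scalar holding everything versus one scalar per character.
my $as_string = $data;
my @as_chars  = split //, $data;

printf "string: 1 scalar, %d bytes of payload\n", length $as_string;
printf "array:  %d scalars, 1 byte of payload each\n", scalar @as_chars;

# Devel::Size is not a core module; measure real sizes only if it is installed.
if (eval { require Devel::Size; 1 }) {
    printf "total_size(string) = %d bytes\n", Devel::Size::total_size(\$as_string);
    printf "total_size(array)  = %d bytes\n", Devel::Size::total_size(\@as_chars);
}
```

Every element of the array is a full Perl scalar with its own bookkeeping, so the per-character layout pays that overhead 100,000 times here, and about 50 million times for a 50 MB file.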

      My God!
      Please, Abigail, could you explain why this expression is a problem?
      $a [[1]] = 1;
      takes close to 50 MB on my system.
      Many thanks.

        [1] is a reference. A reference in a numeric context gives you a memory address. Memory addresses are usually big. Storing an element out of range in an array will make Perl grow the array so it fits - creating undefined values to fill up the array. Storing an element using a big index will make Perl create a huge array, with all the elements taking two handfuls of bytes.

        I'd say you're lucky it takes only 50 MB on your system. It dies on one system I tried it on, and it used 420 MB on another.

        Abigail

        Try perl -wle 'print 0+[1]'; it will print a very big number, as it converts the reference to an anonymous array ([1]) to an integer (its memory address). Thus, $a[[1]]=1 creates a very large array, as it has to set the (0+[1])-th element of @a to 1.

        Update: Abigail-II was somewhat faster to submit an answer, and his is actually cleaner than mine.
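The effect described in these two replies can be explored safely without actually allocating the huge array. The sketch below numifies a reference and estimates a lower bound for the array that $a[[1]] = 1 would create; the helper names are invented, and the one-pointer-per-slot figure is a deliberate underestimate (the replies above note the real cost per element is higher).

```perl
use strict;
use warnings;
use Config;

# Numify a reference: the result is the memory address of the referent.
sub ref_as_index {
    my ($ref) = @_;
    return 0 + $ref;
}

# Lower bound on the bytes needed for an array whose highest index is $index:
# one pointer-sized slot per element.
sub min_array_bytes {
    my ($index) = @_;
    return ($index + 1) * $Config{ptrsize};
}

# Small-scale version of the same effect: storing at index 9 creates 10 slots.
my @small;
$small[9] = 1;
print "elements in \@small: ", scalar(@small), "\n";    # prints 10

my $index = ref_as_index([1]);
printf "0+[1] = %s\n", $index;
printf "an array that long needs at least %.0f MB\n",
    min_array_bytes($index) / (1024 * 1024);
```

The address printed varies from run to run and from system to system, which fits the different memory figures reported in this subthread.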

Re: Out Of Memory error at 950MB with 14GB free RAM
by theorbtwo (Prior) on Feb 10, 2004 at 07:29 UTC

    In that config you quoted, note the "ptrsize=4"? That means you're using normal 32-bit pointers (ptrsize, etc, are in bytes, 4 bytes * 8 bits/byte = 32 bits), which have a theoretical maximum addressable space of 4GB -- but much of that will be reserved for the kernel. You need a newer perl, properly compiled for 64-bit support all over the place (and not just for 64-bit-sized files). (You really only need 64-bit pointers, not 64-bit ints, but having intsize != ptrsize isn't all that well supported... I think. I've never used a memory space larger than 1GB.)


    Warning: All copyrights are relinquished into the public domain unless otherwise stated. I am not an angel. I am capable of error, and err on a fairly regular basis. If I made a mistake, please let me know (such as by replying to this node).

      IIUC, PA-RISC chips have a segmented architecture, just like the old 80x86, except that the segments are up to 4GB instead of 64KB. Unless HP-UX uses them very differently than MPE/iX (with which I am more familiar), the kernel doesn't use much if any of that space; it's just process data. (though memory mapped files may use some, and there will be a division of what's available between the heap and the stack.)

        32-bit pointers can't possibly address more than 2**32 bytes of memory (bytes 0..((2**32)-1)). Perhaps, by some funky technique (not so funky, really), HP-UX manages to make almost all of that 4GB available to user processes. If you want more than 4GB of memory space, you have to use 64-bit pointers (or at least pointers wider than 32 bits; 32-bit Intel chips can use 36-bit addresses through PSE-36 in some circumstances, with lots of extra work, which I don't think perl even supports). So it's a choice of 32-bit for 4 GiB, or 64-bit for up to 16,777,216 TiB. Since the OP wants more than 4GB, 64-bit it is.


        Warning: Unless otherwise stated, code is untested. Do not use without understanding. Code is posted in the hopes it is useful, but without warranty. All copyrights are relinquished into the public domain unless otherwise stated. I am not an angel. I am capable of error, and err on a fairly regular basis. If I made a mistake, please let me know (such as by replying to this node).
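The address-space figures quoted in this subthread follow directly from the pointer width, and the arithmetic is easy to check. A minimal sketch, with addressable_bytes being a name chosen for the example:

```perl
use strict;
use warnings;

# Number of distinct byte addresses a pointer of the given width can express.
sub addressable_bytes {
    my ($pointer_bits) = @_;
    return 2 ** $pointer_bits;
}

my $GiB = 2 ** 30;
my $TiB = 2 ** 40;

printf "32-bit pointers: %.0f GiB\n", addressable_bytes(32) / $GiB;   # 4 GiB
printf "64-bit pointers: %.0f TiB\n", addressable_bytes(64) / $TiB;   # 16777216 TiB
```

These are upper bounds on what a pointer can express; how much of that range the operating system actually hands to a single process is a separate question, as the replies above discuss.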

Re: Out Of Memory error at 950MB with 14GB free RAM
by Courage (Parson) on Feb 10, 2004 at 09:36 UTC
    I had an experience where perl-5.6.1 produced "Out of memory" but perl-5.8.1 successfully finished the same script.

    (it actually was a oneliner perl -we "print length(join '==','a'..'zzzzz')":)

    Courage, the Cowardly Dog

Re: Out Of Memory error at 950MB with 14GB free RAM
by mattr (Curate) on Feb 10, 2004 at 17:05 UTC
    Does your program give correct results on a smaller dataset? (i.e. have you used unit tests in development?) I think this is the most important question. Assuming the algorithm works and is not reducible to a flatter, less pathological data structure...

    It also looks like you are only maxing a single cpu. So you could put another cpu or so to work virtualizing your data structure to a database or ramdisk / ram cache that could be more easily shared.

    But like people are saying, perl can chew up memory at 2 or 3 times the data size, and you report more than what people generally get, I think. And if there's a good reason for the memory snarfing, like, I dunno, using ASCII or maybe a relatively small number of substructures that are repeated often, then you can probably trade cycles for space using packing, serializing, and indexing shortcuts.

    I'm thinking about a certain professor I once listened to, who apparently came up with an immensely fast genome pattern matcher, partly due to the algorithm and partly due to having guys who could use lots of intelligent programming tricks. So you might put some more time into thinking about implementation, and before that about how the problem could be mathematically reduced.
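One way to trade cycles for space, as suggested above, is to keep fixed-width records in a single packed string instead of an array of small arrays or hashes. The sketch below assumes each record is a 32-bit id plus a 16-bit count; that layout, like the helper names, is purely hypothetical since the OP's data format is not shown.

```perl
use strict;
use warnings;

# Assumed record layout: 32-bit unsigned id ('N') plus 16-bit unsigned count ('n').
my $TEMPLATE = 'N n';
my $RECSIZE  = length pack($TEMPLATE, 0, 0);    # 6 bytes per record

my $packed = '';    # every record lives inside this one scalar

# Append one record to the end of the packed buffer.
sub append_record {
    my ($id, $count) = @_;
    $packed .= pack($TEMPLATE, $id, $count);
    return;
}

# Random access by record number: slice out the bytes, then unpack them.
sub get_record {
    my ($i) = @_;
    return unpack($TEMPLATE, substr($packed, $i * $RECSIZE, $RECSIZE));
}

append_record($_, $_ % 100) for 1 .. 1000;

my ($id, $count) = get_record(41);    # the 42nd record
print "id=$id count=$count, buffer is ", length($packed), " bytes\n";
```

Each lookup costs a substr and an unpack, but 1000 records occupy 6000 bytes of payload in one scalar rather than thousands of separate Perl values.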

    You might consider sharing a little of the problem with us or anyway consider how the problem might translate to a database, or otherwise spend a long time compiling a data structure that can be quickly searched afterwards.

    By the way! I am also interested in:

  • What is the general kind of problem you are solving
  • Are there similar problems to the HP ones on, say, a 4 GB, 4-Xeon Linux machine? I just hooked up a new virtual private server at globalservers.com, and I'm wondering what drawbacks may come from not compiling perl for 64 bits.
Re: Out Of Memory error at 950MB with 14GB free RAM
by ctilmes (Vicar) on Feb 10, 2004 at 11:40 UTC
    Perhaps you could write your script as if the data was on disk as others have noted, but then arrange to have the files go on a RAM disk for speed?
      Exactly what I was going to suggest. If you don't want to go through the overhead of mapping your data structure(s) to disk files, the earlier post suggesting Tie::MLDBM combined with a RAMdisk would only add the overhead of going through the OS's disk routines, which have probably been subjected to some optimization already.

      --
      Spring: Forces, Coiled Again!
Re: Out Of Memory error at 950MB with 14GB free RAM
by qq (Hermit) on Feb 10, 2004 at 15:05 UTC

    * The program crashes every time at the same memory consumption, not related to the input file!

    Wouldn't this indicate that it's not a large file problem, but rather something internal to the code? Have you tried it on a small file? As Abigail-II points out, it may have to do with something else entirely.

    qq
