LukeyBoy has asked for the wisdom of the Perl Monks concerning the following question:

I'm using Plucene (the latest version from CPAN) and I'm feeding a standard indexer/analyzer a bunch of text files from my local drive. I then get this monstrous error...
SIZE IS 0 at /usr/local/share/perl/5.8.4/Plucene/Index/SegmentTermEnum.pm line 74
        Plucene::Index::SegmentTermEnum::new('Plucene::Index::SegmentTermEnum', 'Plucene::Store::InputStream=ARRAY(0x8692cd0)', 'Plucene::Index::FieldInfos=HASH(0x872a404)', 0) called at /usr/local/share/perl/5.8.4/Plucene/Index/TermInfosReader.pm line 52
        Plucene::Index::TermInfosReader::new('Plucene::Index::TermInfosReader', '/tmp/gFqzc1sPeS', '_112', 'Plucene::Index::FieldInfos=HASH(0x872a404)') called at /usr/local/share/perl/5.8.4/Plucene/Index/SegmentReader.pm line 82
        Plucene::Index::SegmentReader::new('Plucene::Index::SegmentReader', 'Plucene::Index::SegmentInfo=HASH(0x877ec50)') called at /usr/local/share/perl/5.8.4/Plucene/Index/Writer.pm line 274
        Plucene::Index::Writer::_merge_segments('Plucene::Index::Writer=HASH(0x816335c)', 1) called at /usr/local/share/perl/5.8.4/Plucene/Index/Writer.pm line 253
        Plucene::Index::Writer::_maybe_merge_segments('Plucene::Index::Writer=HASH(0x816335c)') called at /usr/local/share/perl/5.8.4/Plucene/Index/Writer.pm line 155
        Plucene::Index::Writer::add_document('Plucene::Index::Writer=HASH(0x816335c)', 'Plucene::Document=HASH(0x873248c)') called at ./updatedb line 78
        (in cleanup) SIZE IS 0 at /usr/local/share/perl/5.8.4/Plucene/Index/SegmentTermEnum.pm line 74
        Plucene::Index::SegmentTermEnum::new('Plucene::Index::SegmentTermEnum', 'Plucene::Store::InputStream=ARRAY(0x88c999c)', 'Plucene::Index::FieldInfos=HASH(0x87fe170)', 0) called at /usr/local/share/perl/5.8.4/Plucene/Index/TermInfosReader.pm line 52
        Plucene::Index::TermInfosReader::new('Plucene::Index::TermInfosReader', '/tmp/gFqzc1sPeS', '_112', 'Plucene::Index::FieldInfos=HASH(0x87fe170)') called at /usr/local/share/perl/5.8.4/Plucene/Index/SegmentReader.pm line 82
        Plucene::Index::SegmentReader::new('Plucene::Index::SegmentReader', 'Plucene::Index::SegmentInfo=HASH(0x877ec50)') called at /usr/local/share/perl/5.8.4/Plucene/Index/Writer.pm line 274
        Plucene::Index::Writer::_merge_segments('Plucene::Index::Writer=HASH(0x816335c)', 1) called at /usr/local/share/perl/5.8.4/Plucene/Index/Writer.pm line 178
        Plucene::Index::Writer::_flush('Plucene::Index::Writer=HASH(0x816335c)') called at /usr/local/share/perl/5.8.4/Plucene/Index/Writer.pm line 119
        Plucene::Index::Writer::DESTROY('Plucene::Index::Writer=HASH(0x816335c)') called at ./updatedb line 0
        eval {...} called at ./updatedb line 0
Anyone know how to prevent this, or what causes it? Thanks!

Update: The OS is Debian GNU/Linux, using Perl 5.8.4 from the distribution and Plucene from CPAN.

Re: Plucene problems.
by EverLast (Scribe) on Oct 14, 2004 at 04:58 UTC
    I use Plucene on a regular basis without any problems.

    Show me the code, and I'll try to reproduce!

    ---Lars

    Update

    PS:

    • Did you install without force, i.e. did make test pass OK?
    • Are you updating an existing index or re-creating it from scratch?
    • Again, a stripped-down version of your script that reproduces the error would help in locating the problem...

    Update2

    I went ahead and checked the Plucene Archives (for you). The problem was reported in September: error during auto merge segment.

    There's a workaround in a reply.

    I have not seen the problem myself while indexing more than 10,000 files (C++ sources). (As I currently run Win32, the problem might be OS dependent.)

      Hey Lars, thanks. I have to run out for a bit, and when I get back I'll post all the code. It installed cleanly, and I'm creating an index from scratch when this happens.
      Again thanks... Looking into it, it appears to happen after a few hundred operations regardless of the data - it is the auto-merge. What's the nicest way to reload the index that I'm working with? (I figure I'll checkpoint it every 100 data objects).
      Still no dice. Now I'm checkpointing every 50 additions by calling "undef" on the index writer, and then reloading it. Once the index grows past a few checkpoints, I get:
      Running checkpoint...
        (in cleanup) SIZE IS 0 at /usr/local/share/perl/5.8.4/Plucene/Index/SegmentTermEnum.pm line 74
        Plucene::Index::SegmentTermEnum::new('Plucene::Index::SegmentTermEnum', 'Plucene::Store::InputStream=ARRAY(0x84a5ee8)', 'Plucene::Index::FieldInfos=HASH(0x872ce40)', 0) called at /usr/local/share/perl/5.8.4/Plucene/Index/TermInfosReader.pm line 52
        Plucene::Index::TermInfosReader::new('Plucene::Index::TermInfosReader', '/tmp/swr8wHCzbm', '_55', 'Plucene::Index::FieldInfos=HASH(0x872ce40)') called at /usr/local/share/perl/5.8.4/Plucene/Index/SegmentReader.pm line 82
        Plucene::Index::SegmentReader::new('Plucene::Index::SegmentReader', 'Plucene::Index::SegmentInfo=HASH(0x872d990)') called at /usr/local/share/perl/5.8.4/Plucene/Index/Writer.pm line 274
        Plucene::Index::Writer::_merge_segments('Plucene::Index::Writer=HASH(0x84a75a4)', 5) called at /usr/local/share/perl/5.8.4/Plucene/Index/Writer.pm line 178
        Plucene::Index::Writer::_flush('Plucene::Index::Writer=HASH(0x84a75a4)') called at /usr/local/share/perl/5.8.4/Plucene/Index/Writer.pm line 119
        Plucene::Index::Writer::DESTROY('Plucene::Index::Writer=HASH(0x84a75a4)') called at ./updatedb line 86
        eval {...} called at ./updatedb line 86
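
      For readers following along, the checkpointing described above (drop the writer every N additions so its destructor flushes, then reopen it) might look roughly like the sketch below. This is not the poster's actual updatedb script, which was never shown. The helper name index_with_checkpoints and the $open_writer callback are hypothetical and exist only for illustration. The commented usage assumes the documented Plucene::Index::Writer->new($path, $analyser, $create) constructor, and $dir and @docs are placeholders.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Plucene): feed documents to a writer,
# closing and reopening the writer after every $every additions.
# $open_writer is a code ref returning a fresh writer object that
# supports add_document().
sub index_with_checkpoints {
    my ($open_writer, $docs, $every) = @_;
    my $writer = $open_writer->();
    my $count  = 0;
    for my $doc (@$docs) {
        $writer->add_document($doc);
        if (++$count % $every == 0) {
            undef $writer;               # last reference gone: DESTROY runs and flushes
            $writer = $open_writer->();  # reopen the index for the next batch
        }
    }
    undef $writer;                       # final flush
    return $count;
}

# Possible use with Plucene (assumed, untested against this bug):
#
#   my $first = 1;
#   index_with_checkpoints(sub {
#       my $create = $first; $first = 0;   # create once, then reopen
#       Plucene::Index::Writer->new($dir,
#           Plucene::Analysis::SimpleAnalyzer->new, $create);
#   }, \@docs, 50);
```

      As the trace above shows, the flush in DESTROY goes through the same _merge_segments call, so a sketch like this changes when the merge happens but does not avoid it.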
        Again,

        Show me (us) the code!

        It will provide a starting point for reproducing the problem, and you might get some suggestions for alternative solutions that do not trigger the bug(?).

        ---Lars

      Looks like it might be related to
      http://rt.cpan.org/NoAuth/Bug.html?id=6453
      http://rt.cpan.org/NoAuth/Bug.html?id=5815

        No, these are not related (IMHO). From the pathnames in the original post you can see that the OS in question is most likely Linux.

        ---Lars
