http://www.perlmonks.org?node_id=774926

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I have a file from which I need to fetch the name using the ID of the account. The file will be huge, possibly millions of records, so will the following code have issues?

    perl -ne '{ open(FILE,"/tmp/Config1"); print grep { /9344151299/i } <FILE>; close<FILE>; }'

Thanks,

Replies are listed 'Best First'.
Re: File search using Grep
by targetsmart (Curate) on Jun 26, 2009 at 05:59 UTC
    so will the following code have issues?
    You will have to run it and tell us what issues it has.

    Why can't you use the grep utility, if you are on a *nix system?


    Vivek
    -- 'I' am not the body, 'I' am the 'soul', which has no beginning or no end, no attachment or no aversion, nothing to attain or lose.
      Vivek, we were actually planning to load the contents of Config1 into a hash and then use it to find the value. But as the file grows we were worried about hash performance, so we looked at this as an alternative approach. Would the hash be better, or this? Thanks
        IMO, if the comparison is not too complex, it is better to use grep.
        But since you seem prepared to load such a big file into a hash, you could use DBM::Deep. Because the data is stored in a file, though, it will not be as quick as a normal in-memory hash.
        If you give more information about what you are trying to load and compare, and why, you will get the best possible solutions.


        Vivek
        There is no data structure that can beat a hash for lookups. The Perl hash algorithm isn't theoretically perfect, but it is damn fast. End of story.
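To make the hash option concrete, here is a minimal sketch of loading the file into a hash once and then looking accounts up by ID. The record layout (ID, a comma, then the name) and the helper name load_accounts are assumptions for illustration; the thread never shows the format of /tmp/Config1.

```perl
use strict;
use warnings;

# Hypothetical helper: read "id,name" records into a hash keyed by id.
# The comma-separated layout is an assumption, not the real Config1 format.
sub load_accounts {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my %name_for;
    while (my $line = <$fh>) {
        chomp $line;
        my ($id, $name) = split /,/, $line, 2;
        next unless defined $id && defined $name;   # skip malformed lines
        $name_for{$id} = $name;
    }
    close $fh;
    return \%name_for;
}

# Usage (path taken from the thread):
# my $accounts = load_accounts('/tmp/Config1');
# my $id = '9344151299';
# print exists $accounts->{$id} ? $accounts->{$id} : 'not found', "\n";
```

Each lookup is fast once the hash exists, but building it costs a full pass over the file on every run. That only pays off when a single run performs many lookups; for one lookup per run, a line-by-line scan does less work.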
Re: File search using Grep
by poolpi (Hermit) on Jun 26, 2009 at 10:08 UTC

    Better read the file line by line:

    perl -e 'open(FILE, "<", "/tmp/Config1") or die "Cannot open: $!"; while (<FILE>) { print if /9344151299/i } close FILE;'
    # Or
    perl -ne 'print if /9344151299/i' /tmp/Config1


    hth,
    PooLpi

    'Ebry haffa hoe hab im tik a bush'. Jamaican proverb
    Update: -options and second solution

      I'm not sure whether the OP's use of grep implies that there might be more than one record with that ID but, if not, it might be an idea to avoid reading the rest of the file once you've found the record. Something like

      $ perl -ne '
      > next unless /9344151299/;
      > print;
      > last;' /tmp/Config1

      I hope this is of interest.

      Cheers,

      JohnGG

        Hi, thanks for all your suggestions. My problem is just to get the information for a particular account using its unique ID; some IDs will not be in the list at all. We expect this file to have millions of records as the accounts grow. So we are wondering whether to load these records into a hash each time the script is called, or just to grep the file. Thanks, Priya
Re: File search using Grep
by Jenda (Abbot) on Jun 27, 2009 at 11:00 UTC

    Apart from other things, this has the problem that it looks for that string of digits anywhere in the line. Are you sure it will not be found in some other column? Are you sure there will never be an account whose ID contains that number as a substring?
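One way to avoid the substring problem is to compare the whole ID field instead of pattern-matching the line. The sketch below assumes "id,name" records and uses a hypothetical helper name, find_account; neither comes from the original post, so adjust the split to the real layout.

```perl
use strict;
use warnings;

# Hypothetical helper: return the name for an exact id match, or undef.
# Assumes "id,name" records; the real Config1 format is not shown in the thread.
sub find_account {
    my ($path, $wanted) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my $found;
    while (my $line = <$fh>) {
        chomp $line;
        my ($id, $name) = split /,/, $line, 2;
        next unless defined $id && $id eq $wanted;  # whole-field comparison
        $found = $name;
        last;    # the OP says ids are unique, so stop at the first hit
    }
    close $fh;
    return $found;
}
```

With this comparison, an account whose ID is 19344151299, or a line that mentions 9344151299 in another column, is not reported as a match.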

    If the file seldom changes and you need to search it often, you should probably import it into an on-disk hash (DB_File, GDBM, SDBM, ...) or into a database (even just DBD::SQLite would help). Or maybe you should keep the data in the database and change the other program(s) that use it to look it up in the database or on-disk hash.
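As a rough sketch of the on-disk hash idea, the following uses SDBM_File, which ships with Perl, to import the file once and then answer lookups without rereading it. The "id,name" layout and the helper names build_index and lookup are assumptions for illustration. SDBM limits the size of each key/value pair (roughly 1 KB), so DB_File or GDBM_File may suit larger records.

```perl
use strict;
use warnings;
use Fcntl;        # O_RDWR, O_CREAT, O_RDONLY
use SDBM_File;    # bundled with Perl

# Hypothetical helper: import "id,name" records into an on-disk hash.
# $db_path is a base name; SDBM creates $db_path.pag and $db_path.dir.
sub build_index {
    my ($src_path, $db_path) = @_;
    tie(my %db, 'SDBM_File', $db_path, O_RDWR | O_CREAT, 0666)
        or die "Cannot tie $db_path: $!";
    open(my $fh, '<', $src_path) or die "Cannot open $src_path: $!";
    while (my $line = <$fh>) {
        chomp $line;
        my ($id, $name) = split /,/, $line, 2;
        $db{$id} = $name if defined $id && defined $name;
    }
    close $fh;
    untie %db;
}

# Hypothetical helper: look one id up without reading the source file.
sub lookup {
    my ($db_path, $id) = @_;
    tie(my %db, 'SDBM_File', $db_path, O_RDONLY, 0666)
        or die "Cannot tie $db_path: $!";
    my $name = $db{$id};    # undef when the id is absent
    untie %db;
    return $name;
}
```

The index has to be rebuilt whenever the source file changes, which is why this approach fits a file that changes seldom but is searched often.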

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.