PerlMonks  

Re: General program and related problems

by tokpela (Chaplain)
on Aug 03, 2009 at 13:55 UTC (#785469)


in reply to General program and related problems

Basically, it seems like you are trying to get some lookup data from your first file and then use it when scanning your second file.

I would not load the data into arrays, since you mention that your files are gigabytes in size and arrays would have to hold everything in memory. I would instead use DBM::Deep, which keeps its data in a file on disk, to store your initial values.

A side benefit is that lookups by key will be pretty quick as well.

One thing that you will need to come up with is a link between the data from the two files - some common key to use in the DBM::Deep database.

Something like this:

use strict;
use warnings;
use DBM::Deep;

my $db_filepath = 'lookup.db';
my $file1       = 'XXXX.txt';
my $file2       = 'YYYY.txt';

# disk-backed hash, so the lookup data never has to fit in memory
my $db = DBM::Deep->new($db_filepath);

open(my $fh, '<', $file1) or die "[Error] COULD NOT OPEN FILE [$file1]-[$!]";
while (<$fh>) {
    my $line = $_;

    # get a common key from the data somewhere in here
    my @fields = split(/\s/, $line);
    my @output = grep /rs\d{5,}\b/, @fields;
    my $rs     = join(':', @output);
    $rs =~ s/:/\n/g;

    # placeholder: replace with the real key taken from this line
    $db->{'some-common-key-between-files'} = $rs;
}
close($fh);

# now iterate through your other file and lookup using DBM::Deep
open(my $fh2, '<', $file2) or die "[Error] COULD NOT OPEN FILE [$file2]-[$!]";
while (<$fh2>) {
    my $line = $_;

    if ($line =~ /some-common-key-between-files/) {
        my $db_record = $db->{'some-common-key-between-files'};

        # now you have linked data from both files
        # do your other coding here.
    }
}
close($fh2);
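To make the placeholder key a bit more concrete, here is a rough sketch that assumes the rs IDs themselves (for example rs1234567) appear in both files and can serve as the common key. That is only a guess based on the regex above; your real key may be something else. The helper names extract_keys, build_lookup and find_matches are made up for illustration, and the sample data is invented. The store is passed in as a hash reference so the demo runs with a plain in-memory hash. Since a DBM::Deep object can be used like a hash reference, swapping in DBM::Deep->new('lookup.db') should work the same way for large files.

```perl
use strict;
use warnings;

# Hypothetical helper: return every whitespace-separated field that is
# exactly "rs" followed by five or more digits (stricter than the
# substring match used above).
sub extract_keys {
    my ($line) = @_;
    return grep { /^rs\d{5,}$/ } split /\s+/, $line;
}

# Fill $store (any hash reference) from the first file, keyed by rs ID.
sub build_lookup {
    my ($store, $fh) = @_;
    while (my $line = <$fh>) {
        chomp $line;
        # keep the whole line as the record for each key found on it
        $store->{$_} = $line for extract_keys($line);
    }
    return $store;
}

# Scan the second file and return [key, record from file 1, line from file 2]
# for every key that was stored earlier.
sub find_matches {
    my ($store, $fh) = @_;
    my @matches;
    while (my $line = <$fh>) {
        chomp $line;
        for my $key (extract_keys($line)) {
            push @matches, [$key, $store->{$key}, $line]
                if exists $store->{$key};
        }
    }
    return @matches;
}

# Small in-memory demo with invented data standing in for the two files.
my $data1 = "rs1234567 A G\nrs7654321 C T\n";
my $data2 = "sample1 rs7654321 0.42\nsample2 rs1111111 0.10\n";
open(my $fh1, '<', \$data1) or die "Cannot open in-memory file 1: $!";
open(my $fh2, '<', \$data2) or die "Cannot open in-memory file 2: $!";

my $store = build_lookup({}, $fh1);    # {} could be a DBM::Deep object instead
print join("\t", @$_), "\n" for find_matches($store, $fh2);

close($fh1);
close($fh2);
```

Keying on each ID separately means the second pass does one hash lookup per ID instead of a regex match per stored key, which matters when the first file is large.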

