hj4jc's scratchpad


	PerlMonks

by hj4jc (Beadle)

on Jan 04, 2006 at 20:09 UTC ( [id://521002]=scratchpad )

Public Scratchpad

Okay, I want to create a list of every possible non-redundant pair of names from a file called "test.txt". The file looks like this:

name 1
name 2
name 3
name 4
name 5

This is the code I have:

#!/usr/bin/perl -w

use strict;

open (INPUT, "<test.txt") or die "cannot open\n";
open (OUTPUT, ">result.txt") or die "cannot open out\n";

while (<INPUT>) {
    chomp $_;
    my $first_name = $_;
    
    while (<INPUT>) {
        chomp $_;
        my $next_name = $_;
        print OUTPUT "$first_name\t$next_name\n";
    }
}

BUT my result stops here:

name 1    name 2
name 1    name 3
name 1    name 4
name 1    name 5

Why doesn't the first while loop proceed to the next line (which contains "name 2")? Why does it stay stuck on the first line instead of moving ahead?

The problem is, I have over 300,000 names from which I need to draw every possible unique pair (so, it's 300,000 choose 2 = 300,000 × 299,999 / 2, which is about 45 billion unique pairs). So, I need a strategy where I can read in a small chunk and print out a small chunk, rather than reading the whole file into some large array first.
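A likely explanation for the stall: both loops read from the same INPUT filehandle, so the inner `while (<INPUT>)` consumes every remaining line up to end-of-file. When control returns to the outer loop, its next read finds nothing left and the loop ends after "name 1". One possible way around this, sketched below, is to remember the file position with `tell` before the inner loop and return to it with `seek` afterwards. The sub name `pair_lines` is made up for illustration, and the sketch assumes the input is a regular file that supports seeking.

```perl
use strict;
use warnings;

# Write every unordered pair of lines from $in_path to $out_fh, keeping only
# two lines in memory at a time. pair_lines is a hypothetical helper name.
sub pair_lines {
    my ($in_path, $out_fh) = @_;
    open(my $in, '<', $in_path) or die "cannot open $in_path: $!\n";

    while (defined(my $first = <$in>)) {
        chomp $first;
        my $resume = tell($in);    # position just after the outer line

        # The inner loop still reads to end-of-file, as in the original code.
        while (defined(my $next = <$in>)) {
            chomp $next;
            print {$out_fh} "$first\t$next\n";
        }

        # Go back to just after the outer line so the outer loop can continue.
        seek($in, $resume, 0) or die "cannot seek: $!\n";
    }
    close($in);
}

# Usage, with the file names from the post:
#   open(my $out, '>', 'result.txt') or die "cannot open out: $!\n";
#   pair_lines('test.txt', $out);
#   close($out);
```

This keeps memory use small, but it rereads the rest of the file once per line, so it trades memory for a lot of disk I/O on a large input.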

To explain my data a little better... each name is associated with 200 numbers, so the data actually looks like the following:

name 1 0.2 0.3 0.22 0.41 ... (200 numbers)
name 2 0.3 0.8 0.72 0.11 ... (200 numbers)
...
name 300000 0.1 0.2 0.3 0.4 ... (200 numbers)

And I need to calculate the correlation between every possible unique pair. Thanks so much for your help!
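For the per-pair calculation, here is a minimal sketch of one common choice, Pearson's correlation coefficient. The post does not say which correlation measure is wanted, so Pearson is an assumption, and the sub name `pearson` is made up for illustration. It takes two array references of equal length (the 200 numbers for each name).

```perl
use strict;
use warnings;

# Pearson correlation of two equal-length numeric lists (array references).
# Assumption: "correlation" in the post means Pearson's r.
sub pearson {
    my ($x, $y) = @_;
    my $n = @$x;
    die "lists differ in length\n" unless $n == @$y;
    die "need at least two values\n" if $n < 2;

    # Means of both lists.
    my ($sum_x, $sum_y) = (0, 0);
    $sum_x += $_ for @$x;
    $sum_y += $_ for @$y;
    my ($mean_x, $mean_y) = ($sum_x / $n, $sum_y / $n);

    # Sums of products of deviations from the means.
    my ($sxy, $sxx, $syy) = (0, 0, 0);
    for my $i (0 .. $n - 1) {
        my $dx = $x->[$i] - $mean_x;
        my $dy = $y->[$i] - $mean_y;
        $sxy += $dx * $dy;
        $sxx += $dx * $dx;
        $syy += $dy * $dy;
    }

    # A constant list has zero variance, so r is undefined there.
    return undef if $sxx == 0 || $syy == 0;
    return $sxy / sqrt($sxx * $syy);
}

# Usage, with the first four numbers of the two sample rows from the post:
#   my $r = pearson([0.2, 0.3, 0.22, 0.41], [0.3, 0.8, 0.72, 0.11]);
```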
