I must admit that I'm a little rusty with Perl. I dug up an ancient script that I'm not proud of, but hey, it is from the year 2000.
while (<FIL>) {
$row = $_;
($pnr, $enamn, $fnamn, $adress, $stad, $telnr, $ar, $lon, $timmar,
 $anr) = split("\t", $row);
for ($i = 0; $i < $cnt; $i++)
{
if($pnr_array[$i] eq $pnr)
{
$duplicate++;
$unique = 0;
print "\n$pnr is not unique!\n";
}
}
if ($unique != 0)
{
$pnr_array[$cnt] = $pnr;
$cnt++;
# file write code omitted..
}
$unique = 1;
}
Now this code is terribly inefficient: for every row it scans the whole array of previously seen $pnr values, so it takes almost 30 minutes to run on my data set just to remove the rows with a non-unique $pnr :s.
Today I looked up my old script to do some serious optimization on it, and this is the resulting code:
while (<FIL>) {
$rad = $_;
($pnr, $enamn, $fnamn, $adress, $stad, $telnr, $ar, $lon, $timmar,
 $anr) = split("\t", $rad);
# print "$pnr";
if(exists $pnr_array{$pnr})
{
$duplicate++;
$unique = 0;
print "\n$pnr is not unique!\n";
}
if ($unique != 0)
{
$pnr_array{$pnr} = 0;
}
$unique = 1;
}
Now this code uses a nasty construct that I don't find beautiful: $pnr_array{$pnr} = 0; Is there a better way to do a set in Perl?
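For comparison, here is a minimal sketch of the hash-as-set idiom I have seen used for this kind of de-duplication, where the hash keys are the set and a post-increment does the membership test and the insert in one step. The names %seen and @unique_rows and the sample rows are made up for illustration; they are not from my real script or data.

```perl
use strict;
use warnings;

# Hypothetical sample rows (tab-separated, $pnr in the first column).
my @rows = (
    "111\tAndersson\tAnna",
    "222\tBerg\tBo",
    "111\tAndersson\tAnna",
);

my %seen;           # the "set": keys are the $pnr values seen so far
my @unique_rows;    # rows whose $pnr appeared for the first time
my $duplicate = 0;  # number of rows skipped as duplicates

for my $row (@rows) {
    # Only the first field is needed to decide uniqueness.
    my ($pnr) = split /\t/, $row;

    # $seen{$pnr}++ returns the old count (false on first sight)
    # and then records the key, so no separate "= 0" assignment is needed.
    if ($seen{$pnr}++) {
        $duplicate++;
        next;
    }
    push @unique_rows, $row;
}

print scalar(@unique_rows), " unique, $duplicate duplicate\n";
```

As far as I can tell, the stored value does not matter for set membership; only the keys do. The counter form just has the side benefit of recording how many times each $pnr occurred.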