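use strict;
use warnings;

# Index each non-blank line of DATA by its lowercased first character.
# Each bucket holds [start, end] byte offsets, so that a line can later be
# fetched with seek/read instead of rescanning the whole file.
# Note: tell DATA reports offsets relative to the start of this script file,
# because the DATA handle reads from the script itself.
my %buckets;
my $lineStart = tell DATA;

while (<DATA>) {
    my $lineEnd = tell DATA;
    chomp;
    # Record the offsets only for non-blank lines.
    push @{$buckets{lc substr $_, 0, 1}}, [$lineStart, $lineEnd] if length;
    # Always advance the start offset, even past a blank line, so the next
    # record does not inherit a stale start position.
    $lineStart = $lineEnd;
}

# Print each first-character key followed by its "start end" offset pairs.
for my $key (sort keys %buckets) {
    my @pairs = map {"@$_"} @{$buckets{$key}};
    print "$key: ", (join ', ', @pairs), "\n";
}

__DATA__
Ok, let's say file A has a series of strings, one per line.
Let's say that file B has a series of strings, one per line.
The goal is, for each line in A, to return the best match from B using a subroutine named fuzzy_match, a function that takes two strings and returns a float from 0 to 1.
Now, let's assume that file B is enormous, making the prospect of applying fuzzy_match to each member infeasible.
But let's also assume that the best fuzzy_match result in B for a line in A always starts with the same first character as that line.
This means that instead of looking through all of B, you simply need to retrieve all records from B which start with the same first letter as the current record in A.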