I'll keep it simple: I'm trying to write a quick-and-dirty parser for some Project Gutenberg etexts, and I've run into a puzzle.
Each of the etexts is stored in a split directory structure that models the name of the etext. For example, the HTML version of etext 12345 exists in /1/2/3/4/12345/12345-h/12345-h.htm.
Here's what I have to split that out, when given just the 12345 as the argument to my parser:
my $etext = $ARGV[0];    # first command-line argument, e.g. 12345
my $site = 'http://pod/Gutenberg';
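# Split the number into single digits and join them with slashes: 1/2/3/4/5
my $splitguten = join('/', split(//, $etext));
# Remove the trailing "/5" in place; the last digit is not a directory level
substr($splitguten, -2, 2, '');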
my $link = "$site/$splitguten/$etext/$etext-h/$etext-h.htm";
I'm trying to find a cleaner way to do this. Any ideas or suggestions?
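For reference, here is one direction I've been considering, as a rough sketch only. It drops the last digit with substr before splitting, so nothing has to be clipped afterwards. The sub name gutenberg_link is just something I made up for illustration, and it assumes the etext number has at least two digits, since I don't know how single-digit etexts are laid out.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption, not part of any module).
# Builds the URL of the HTML version of an etext, assuming the layout
# described above: every digit except the last becomes a directory level,
# e.g. 12345 -> 1/2/3/4/12345/12345-h/12345-h.htm
sub gutenberg_link {
    my ($site, $etext) = @_;

    # Assumption: single-digit etexts are not handled here.
    die "etext number must have at least two digits\n"
        unless $etext =~ /^\d{2,}$/;

    # Take everything but the last digit, split it into single
    # characters, and join them with slashes.
    my $dirs = join '/', split //, substr($etext, 0, -1);

    return "$site/$dirs/$etext/$etext-h/$etext-h.htm";
}

print gutenberg_link('http://pod/Gutenberg', '12345'), "\n";
# prints http://pod/Gutenberg/1/2/3/4/12345/12345-h/12345-h.htm
```

Wrapping it in a sub keeps the path logic in one place, but it is still the same split-and-join idea, so I'd welcome anything tidier.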