Hashes, File sizes, Translations, OH MY!

by hacker (Priest)
on May 29, 2004 at 14:47 UTC ( [id://357479]=perlquestion )

hacker has asked for the wisdom of the Perl Monks concerning the following question:

I have a series of files that are auto-generated hourly from commits by the developers and community members who maintain them. We currently support 14 languages. Our public website has a page that links to these files and shows each file's name, size, and date, so that users of these "snapshots" can check that they are current.

To do this, I'm currently rolling through a directory where these files are placed, and fetching the information as follows:

find(
    {
        untaint_pattern => '.*',
        no_chdir        => 1,
        preprocess      => sub { sort @_ },
        wanted          => sub {
            return unless /\.(prc|pdb)\z/;
            my $v_snap_file = $File::Find::name;
            my $v_sb        = stat("$v_snap_file");
            my $v_filesize  = $v_sb->size;
            my $v_bprecise  = sprintf "%.0f", ($v_filesize);
            my $v_bsize     = insert_commas($v_bprecise);
            my $v_kprecise  = sprintf "%.0f", ($v_filesize/1024);
            my $v_ksize     = insert_commas($v_kprecise);
            my $v_filedate  = strftime "%D %r", localtime $v_sb->mtime;
            my $basename_v  = basename($v_snap_file);
            print $cgi->blockquote(
                $cgi->a({-href=>"snapshots/$basename_v"}, "$basename_v"),
                $cgi->br(),
                "File size: $v_bsize bytes ($v_ksize kb)",
                $cgi->br(),
                "File date: $v_filedate",
                $cgi->br());
            print "\n";
        }
    },
    $root);

This part works, but produces a very lengthy list of files, in a non-human ordering.

What I'd like to do is compress that output a bit, using a hash (??) that maps each language name to the suffix tacked onto its filename (so "English" matches up with strings_en.txt and "Spanish" with strings_es.txt). We know the suffixes of the languages we support (_es for Spanish, _fr for Français, etc.).

Ideally, the output would look something like this:

Chinese [<a href="foo_zh_CN.txt">txt</a>] Size: 6,016 bytes
Deutsch [<a href="foo_de.txt">txt</a>] Size: 277,175 bytes

The piece I'm struggling with is how to make the "human" names (English, Deutsch, French) match up with the filename suffixes (_en, _de, _fr respectively), then for each of those (${file}${lang}.txt) find the file size and date (File::Find here), and output them in alphabetical order by their "human" names (Catalan before Chinese, etc.).

Any hints/tips that would help?

Replies are listed 'Best First'.
Re: Hashes, File sizes, Translations, OH MY!
by thor (Priest) on May 29, 2004 at 16:12 UTC
    First, the "easy" part: mapping human language names to their ISO 639-1 language codes.
    my %language = (English => 'en', German => 'de', French => 'fr');
    Next, to output them in alphabetical order:
    foreach my $key (sort keys %language) {
        print "$key has '$language{$key}' as the language code\n";
    }
    Lastly, you could implement the output that you want in a couple of different ways. The first is to store all of the file information in a hash of arrays, indexed by language code. Then, in the foreach loop that I have above, you'd do a hash lookup and process the contents of the array that lives there. This has the disadvantage of having to store the information for all files in memory before you output anything. The second method that I can think of is to alter your find subroutine to search for files of a specific language, and call it from your foreach loop. The disadvantage here is that you walk your file tree once for every language. I'm sure that there are other ways to do it, though.
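A minimal sketch of the first approach (one walk of the tree, a hash of arrays keyed by language code, then output ordered by the human names) might look like the following. The helper names `collect_by_code` and `print_report`, the contents of `%language`, and the `_<code>.txt` naming pattern are assumptions made for illustration; adjust them to the real file layout.

```perl
use strict;
use warnings;
use File::Find;
use File::Basename qw(basename);

# Assumed mapping of display name => filename suffix code.
my %language = (
    Catalan => 'ca',
    Chinese => 'zh_CN',
    Deutsch => 'de',
    English => 'en',
);

# Walk $root once and return { code => [ [name, size, mtime], ... ] }.
sub collect_by_code {
    my ($root, @codes) = @_;
    my $alt = join '|', map { quotemeta($_) } @codes;
    my %files_for;
    find({
        no_chdir => 1,
        wanted   => sub {
            my $name = basename($File::Find::name);
            # Only files named like "<anything>_<code>.txt" are kept.
            return unless $name =~ /_($alt)\.txt\z/;
            my $code = $1;
            my ($size, $mtime) = (stat $File::Find::name)[7, 9];
            push @{ $files_for{$code} }, [ $name, $size, $mtime ];
        },
    }, $root);
    return \%files_for;
}

# Print one line per file, ordered by display name.
sub print_report {
    my ($files_for, $language) = @_;
    for my $human (sort keys %$language) {
        my $entries = $files_for->{ $language->{$human} } or next;
        for my $entry (@$entries) {
            my ($name, $size) = @$entry;
            print "$human [$name] Size: $size bytes\n";
        }
    }
}
```

Calling `print_report(collect_by_code($root, values %language), \%language)` would then walk the tree only once, at the cost of holding one small array per file in memory.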

    thor

Re: Hashes, File sizes, Translations, OH MY!
by SciDude (Friar) on May 30, 2004 at 01:39 UTC

    You are very close to a solution. Do your filenames already have the "_es", "_en", etc. attached to them? If so, a simple pattern match should allow you to organize them:

    if ($basename_v =~ m/_en\./) { @english = ($basename_v, $v_bsize, $v_filedate) }

    Repeat this for each language and print each array out in any order you wish. If you have multiple filenames per language, just keep pushing them onto the array. Only if you really must place everything in one structure should you consider a hash of arrays:

    %Files = (
        english => ["file1", "size1", "date1"],
        spanish => ["file2", "size2", "date2"],
    );

    Printing these should be straightforward:

    for my $language ( keys %Files ) {
        print "$language: @{ $Files{$language} }\n";
    }

    Hash output can also be sorted if you wish.
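As a rough illustration of sorted output without one `if` per language, the sketch below looks the suffix up in a code-to-name hash and sorts the rows by display name. The names `%name_for`, `language_of`, `commify`, and `report_lines` are hypothetical, as are the codes listed; the comma formatting stands in for the original poster's `insert_commas`, whose source is not shown in the thread.

```perl
use strict;
use warnings;

# Assumed mapping of filename suffix code => display name.
my %name_for = (
    ca    => 'Catalan',
    zh_CN => 'Chinese',
    de    => 'Deutsch',
    en    => 'English',
);

# Return the display name for a file such as "foo_de.txt",
# or nothing when the suffix is not a known code.
sub language_of {
    my ($filename) = @_;
    my ($stem) = $filename =~ /\A(.+)\.[^.]+\z/ or return;
    # Try longer codes first so "zh_CN" wins over any shorter code.
    for my $code (sort { length $b <=> length $a } keys %name_for) {
        return $name_for{$code} if $stem =~ /_\Q$code\E\z/;
    }
    return;
}

# Insert thousands separators into a non-negative integer.
sub commify {
    my ($n) = @_;
    1 while $n =~ s/^(\d+)(\d{3})/$1,$2/;
    return $n;
}

# Take a list of [ filename, size_in_bytes ] pairs and return
# formatted lines ordered by display name, then by filename.
sub report_lines {
    my (@files) = @_;
    my @rows;
    for my $file (@files) {
        my ($filename, $size) = @$file;
        my $human = language_of($filename) or next;
        push @rows, [ $human, $filename, $size ];
    }
    return map  { sprintf '%s [%s] Size: %s bytes', $_->[0], $_->[1], commify($_->[2]) }
           sort { $a->[0] cmp $b->[0] or $a->[1] cmp $b->[1] } @rows;
}
```

Files whose suffix is not in `%name_for` are skipped silently here; whether to skip or to warn is a design choice left to the caller.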

    Note: untested code

    SciDude
