PerlMonks  

Hash ref and file extensions

by rupesh (Hermit)
on Aug 16, 2003 at 11:24 UTC ( [id://284326]=perlquestion )

rupesh has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks!
Here's the picture: I have a very large file with a lot of file names in it. Like this:
cln2-test.pl
labsearch.txt
listprojects.pl
server1-test.pl
(17).SWW
(18).SWW
(19).SWW
(2).SWW
(20).SWW
(21).SWW
(22).SWW
(23).SWW
(24).SWW
(25).SWW
(26).SWW
(3).SWW
(4).SWW
(5).SWW
(6).SWW
(7).SWW
(8).SWW
(9).SWW
2689previous.gif
2690next.gif
a
a.out
bbb
book_details
c1
c2
c3
charcut
check_dupvar
club
common
cpgm_check
ctest
ctest1
ctest2
activepj.pl
autopro.exe
AutoPro.pl
getbylabel.exe
getbylabel.pl
getbylabel2.exe
getbylabel2.pl
. . . . . . ..
As you can see, most of the files end with extensions. What I want is to produce a report like this:
Extension:   No. of files
.txt         3
.pl          10
.exe         17
.            4
. . .
The file extensions are not pre-determined. They are updated in the log as they are read from the file.
Thanks for your time

we're born with our eyes closed and our mouths wide open, and we spend our entire life trying to rectify that mistake of nature. - anonymous.

Replies are listed 'Best First'.
Re: Hash ref and file extensions
by valdez (Monsignor) on Aug 16, 2003 at 11:52 UTC

    This code should do what you need:

    open(F, '<', './files.txt') or die "open: $!";
    while ($filename = <F>) {
        chomp($filename);
        $dot = rindex($filename, '.');
        if ($dot > -1) {
            $extension = substr($filename, $dot+1);
            $extensions{$extension}++;
        } else {
            $extensions{'_without_extension'}++;
        }
    }
    close(F);
    while (($key, $value) = each %extensions) {
        print "$key -> $value\n";
    }

    Ciao, Valerio

    update: thanks liz!

      One small nit. I would replace this code:
      if ($dot > -1) {
          $extension = substr($filename, $dot+1);
          $extensions{$extension}++;
      } else {
          $extensions{'_without_extension'}++;
      }
      by:
      $extensions{$dot == -1 ? '' : substr( $filename,$dot )}++;
      for two reasons:
      1. it's more compact
      2. it allows you to differentiate between filenames without extension (no . found) and with an empty extension (a . at the end).
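
To make the second point concrete, here is a minimal sketch of the keys the one-line version produces. The helper name `ext_key` and the three sample names are made up for illustration and do not come from the posts above.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the one-line replacement:
# returns '' when there is no dot, otherwise everything from the last dot on.
sub ext_key {
    my ($filename) = @_;
    my $dot = rindex($filename, '.');
    return $dot == -1 ? '' : substr($filename, $dot);
}

my %extensions;
for my $filename ('a', 'a.', 'a.out') {
    $extensions{ ext_key($filename) }++;
    printf "%-6s key=[%s]\n", $filename, ext_key($filename);
}
```

A name with no dot is counted under the empty string, while a name ending in a dot is counted under `.`, so the two cases land in different hash slots.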
      Liz
Re: Hash ref and file extensions
by submersible_toaster (Chaplain) on Aug 17, 2003 at 01:32 UTC

    Depending on how much of the work you really need to do yourself, File::Basename is an option.

    use File::Basename;
    use strict;

    my %exts;
    open FILE, '/some/big/file' or die "Screaming $!";
    while ( <FILE> ) {
        chomp;
        my $file = $_;
        my ($name, $path, $suffix) = fileparse( $file, '\..*' );
        $exts{$suffix}++;
    }
    # Dump your hash here.
    Of course that is a regex being performed by File::Basename, YMMV.
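
Since the suffix pattern is a regex, the choice of pattern changes what counts as "the extension". The sketch below compares the `\..*` pattern used above with a narrower `\.[^.]*` alternative. The helper `suffix_for` and the name `archive.tar.gz` are hypothetical, added only to show a multi-dot case.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper: the suffix fileparse() reports for a name and pattern.
sub suffix_for {
    my ($file, $pattern) = @_;
    return (fileparse($file, $pattern))[2];
}

for my $file ('getbylabel.pl', 'archive.tar.gz', 'ctest') {
    printf "%-15s from-first-dot=[%s] last-dot-only=[%s]\n", $file,
        suffix_for($file, qr/\..*/),      # everything from the first dot
        suffix_for($file, qr/\.[^.]*/);   # only the final .xxx part
}
```

With `\..*`, a name containing several dots is counted under the whole tail (for example `.tar.gz`), which may or may not be what the report should show.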

    Update: Totally untested, looks OK to me.


    I can't believe it's not psellchecked
Re: Hash ref and file extensions
by demerphq (Chancellor) on Aug 17, 2003 at 09:40 UTC

    my %ext;
    while (<>) {
        next unless /\S/;
        chomp;    # otherwise $1 would capture the trailing newline
        /(\.[^.]+)?$/;
        $ext{lc($1||"")}++;
    }
    printf "%-6s %d\n",$_||'<NONE>',$ext{$_} for sort keys %ext;

    Remove the lc() if you are on a case-sensitive file system.
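
A small sketch of what the lc() changes, assuming a hypothetical helper `count_extensions` with a flag for case folding. The name `notes.sww` is invented for the example; the other names are taken from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: count extensions in a list of names.
# When $fold_case is true, the key is lowercased first.
sub count_extensions {
    my ($names, $fold_case) = @_;
    my %count;
    for my $name (@$names) {
        my ($ext) = $name =~ /(\.[^.]+)$/;   # undef when there is no extension
        $ext = '' unless defined $ext;
        $ext = lc $ext if $fold_case;
        $count{$ext}++;
    }
    return \%count;
}

my @names  = ('(17).SWW', 'notes.sww', 'AutoPro.pl', 'activepj.pl', 'ctest');
my $folded = count_extensions(\@names, 1);   # .SWW and .sww share one key
my $exact  = count_extensions(\@names, 0);   # .SWW and .sww counted apart

printf "%-6s %d\n", $_ || '<NONE>', $folded->{$_} for sort keys %$folded;
```

With folding on, `.SWW` and `.sww` are reported as one extension; with it off they are reported separately, which matches how a case-sensitive file system treats them.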


    ---
    demerphq

    <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...
