in reply to Cultural and Bibliometric Perl
Some comments:
- use strict warnings and diagnostics or die
- The file regex may produce unwanted results with unix-like filenames such as file.txt.bak. Decide which part you want to retain; you can find the first and last dot with index and rindex, respectively.
- You slurp the contents in list context only to join the array afterwards. You can instead set $/ to undef and read the whole file into a scalar:
    { local $/ = undef; $contenido = <LIBRO>; }
  You can leave the newlines intact; they will be caught by '\s'. Even better, tr will take care of that.
- Use lc or uc to change the case.
- You can simplify the translation by complementing the list to the alphabetic range (see perlop):
    $contenido = uc $contenido;
    $contenido =~ tr/A-Z/ /cs;
- Use '\s+' rather than '\s', so you don't have to test for empty cases.
- You can get the total number without array assignment:
    $npalabras = keys %PF;
  In scalar context, keys returns the number of keys (here, the number of distinct words) directly, without building a list first.
- I would print LIBROUT in the while loop, so the system will get the chance to buffer nicely.
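The index/rindex suggestion for filenames can be sketched roughly as follows. This is only an illustration: split_name is a hypothetical helper that is not part of the original script, and which dot you cut at depends on what you want to keep.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): split a filename
# at its first dot, or at its last dot when the second argument is true.
sub split_name {
    my ($name, $at_last) = @_;
    my $pos = $at_last ? rindex($name, '.') : index($name, '.');
    return ($name, '') if $pos < 0;    # no dot at all: nothing to cut
    return (substr($name, 0, $pos), substr($name, $pos + 1));
}

my ($base, $rest) = split_name('file.txt.bak');       # cut at the first dot
my ($stem, $ext)  = split_name('file.txt.bak', 1);    # cut at the last dot
print "$base | $rest\n";
print "$stem | $ext\n";
```

Cutting at the first dot keeps "file" as the name, cutting at the last dot keeps "file.txt".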
Well, you see how the use of $_ simplifies things:
    #....
    my $contenido;
    { local $/ = undef; $contenido = <LIBRO>; }    # slurp the whole file
    $contenido = uc $contenido;
    $contenido =~ tr/A-Z/ /cs;                     # non-letters become single spaces
    my %PF;
    $PF{$_}++ for( split /\s+/, $contenido);
    open LIBROUT, ">$ar.csv" or die "Cannot open $ar.csv: $!";
    my $npalabras = keys %PF;
    for( keys %PF ){                               # for, not while: sets $_ and terminates
        my $f = $PF{$_};
        print LIBROUT join( ';', $_, $f, $f / $npalabras ), "\n";
    }
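To see what the uc / tr / split steps actually do, here is a small self-contained sketch that works on a string instead of a file. The name word_freq and the sample text are made up for the illustration. One deliberate difference: it uses split ' ' rather than split /\s+/, because the pattern form returns an empty leading field when the string starts with whitespace, which happens whenever the text begins with a non-letter.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): count word
# frequencies in a string using the uc / tr / split approach.
sub word_freq {
    my ($text) = @_;
    $text = uc $text;
    $text =~ tr/A-Z/ /cs;    # c: complement the range, s: squeeze runs to one space
    my %freq;
    $freq{$_}++ for split ' ', $text;    # ' ' also skips leading whitespace
    return \%freq;
}

my $freq   = word_freq("  El perro, el gato.\nEl pez!");
my $ntypes = keys %$freq;    # scalar context: number of distinct words
print "$_;$freq->{$_}\n" for sort keys %$freq;
print "distinct words: $ntypes\n";
```

Note that with the plain A-Z range every character outside it becomes a separator, and that includes accented letters.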
Hope this helps,
Jeroen
"We are not alone"(FZ)
Re: Re: Cultural and Bibliometric Perl
by Ignatius Monk (Novice) on Jun 29, 2001 at 15:53 UTC |