PerlMonks  

Re^6: Entity statistics

by LexPl (Sexton)
on Nov 12, 2024 at 16:50 UTC ( [id://11162661]=note )


in reply to Re^5: Entity statistics
in thread Entity statistics

Thanks to your kind assistance, I now have a working statistics tool :)

But when I apply the script listed below to another file, I get the following error, which really puzzles me:

Use of uninitialized value in concatenation (.) or string at whitespace-stat.pl line 47, <$in> line 1 (#1)
(W uninitialized) An undefined value was used as if it were already defined. It was interpreted as a "" or a 0, but maybe it was a mistake. To suppress this warning assign a defined value to your variables.

To help you figure out what was undefined, perl will try to tell you the name of the variable (if any) that was undefined. In some cases it cannot do this, so it also tells you what operation you used the undefined value in. Note, however, that perl optimizes your program and the operation displayed in the warning may not necessarily appear literally in your program. For example, "that $foo" is usually optimized into "that " . $foo, and the warning will refer to the concatenation (.) operator, even though there is no . in your program.

#!/usr/bin/perl
use warnings;
use strict;
use diagnostics;

#my personal data left out!

print "Generate statistics: Whitespace in context\n";

my $infile = $ARGV[0];

#define regexes as search target (in the array @regexes)
my @regexes = (qr/&sect;\s*[0-9]/, qr/Art\.\s*[0-9IVX]/, qr/Artikel\s*[0-9IVX]/, qr/Artikels\s*[0-9IVX]/, qr/Artikeln\s*[0-9IVX]/);

open my $in, '<', $infile or die "Cannot open $infile for reading: $!";

#read input file in variable $xml
my $xml;
{
    local $/ = undef;
    $xml = <$in>;
}

#define array for frequency values
my @tally;

#count routine for each regex
for my $i (0 .. $#regexes) {
    my $regex = $regexes[$i];
    ++$tally[$i] while $xml =~ /$regex/g;
}

#define output file
open my $out, '>', 'stats.txt' or die $!;

#output statistics
print {$out} "Statistics: Whitespace in context\n\ninput file: ";
print {$out} "$infile";
print {$out} "\n========================================================================\n\n";

for my $i (0 .. $#regexes) {
    my $regex = $regexes[$i];
    $regex =~ s/^\(\?\^://;
    $regex =~ s/\)$//;
    print {$out} "$regex:\t\t$tally[$i]\n";
}

close $in;
close $out;

Re^7: Entity statistics
by choroba (Cardinal) on Nov 12, 2024 at 16:55 UTC
    > I get the following error

    It's not an error, it's a warning. You can tell from the W in the diagnostics output: "(W uninitialized)".

    The most probable reason is that some of the regexes didn't match anything, so the corresponding elements of @tally were never incremented and are still undefined. You can print 0 instead of an undefined value using the defined-or operator //:

    print {$out} "$regex:\t\t", $tally[$i] // 0, "\n";
    or you can prepopulate the array with zeros:
    my @tally = (0) x @regexes;
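
To see both suggestions side by side, here is a minimal standalone sketch. The sample text and the two-pattern list are made up for illustration and are not taken from the poster's input file; the counting loop is the same as in the script above. The defined-or operator needs Perl 5.10 or newer.

```perl
use strict;
use warnings;

# Made-up sample input: the first pattern matches twice, the second never.
my $text    = 'Art. 5 and Art. 12';
my @regexes = (qr/Art\.\s*[0-9IVX]/, qr/Artikel\s*[0-9IVX]/);

# Same counting loop as in the script: a slot only comes into existence
# when a match increments it, so $tally[1] stays undefined here.
my @tally;
for my $i (0 .. $#regexes) {
    ++$tally[$i] while $text =~ /$regexes[$i]/g;
}

# Option 1: fall back to 0 at print time with defined-or (needs Perl 5.10+).
for my $i (0 .. $#regexes) {
    print "$regexes[$i]:\t\t", $tally[$i] // 0, "\n";
}

# Option 2: start every counter at 0, then count as before.
my @prefilled = (0) x @regexes;
for my $i (0 .. $#regexes) {
    ++$prefilled[$i] while $text =~ /$regexes[$i]/g;
}
print "$regexes[$_]:\t\t$prefilled[$_]\n" for 0 .. $#regexes;
```

Neither variant should trigger the "uninitialized" warning. Prepopulating is probably the tidier choice if the counts are used in more than one place, since every later read of the array then sees a number.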

