PerlMonks  

scoping problem?

by rocroc (Initiate)
on Dec 06, 2011 at 17:21 UTC ( [id://942065]=perlquestion )

rocroc has asked for the wisdom of the Perl Monks concerning the following question:

Here is a text file containing data I need to munge:

    "ADELMAN","John","adad","Ray"
    "AGAN","John","agag","Aditya"
    "AHMED","John","ahah","Conor"

Here is a perl script to do some munging:

    my $username;
    my $color;
    while (<>) {
        chomp;
        s/"//g;
        ($username, $color) = (split /,/, $_)[2,3];
        if ("agag" =~ m/($username)/) {
            print STDOUT "here is the username: $username\n";
        }
    }

Here is the output when I invoke perl test.pl test.txt at the command line:
here is the username:
So note what's happening here: evidently, the interpolation of the variable $username works just fine in the match "agag" =~ m/($username)/. But then $username gets treated as empty (uninitialized?) inside the call to print. I'm totally stumped. Please help.

Replies are listed 'Best First'.
Re: scoping problem?
by Corion (Patriarch) on Dec 06, 2011 at 17:27 UTC

    Are you sure you're modifying / running the same Perl script? Because for me, the following works:

        use strict;

        my $username;
        my $color;
        while (<DATA>) {
            chomp;
            s/"//g;
            ($username, $color) = (split /,/, $_)[2,3];
            if ("agag" =~ m/($username)/) {
                print STDOUT "here is the username: $username\n"
            }
        }
        __DATA__
        "ADELMAN","John","adad","Ray"
        "AGAN","John","agag","Aditya"
        "AHMED","John","ahah","Conor"

    ... and outputs:

    here is the username: agag
      And are you sure you're getting the contents of the text file to <>? Bad path to the text file? Wrong name of the text file? Failure to test the open, as in open(my $FH, '<', $text_file) or die "Can't open $text_file: $!"; or something similar?
Re: scoping problem?
by rocroc (Initiate) on Dec 06, 2011 at 20:26 UTC

    Ok, so first of all, thanks AnomolousMonk. The text file did have a blank line at the end. This means that my initial understanding of the problem was wrong: I had thought that $username was being interpolated like I expected in the matching expression but not in the print statement.

    Now, indeed, when that blank line is read, the match is successful, and so the print gets called, but the variable $username is empty. This means that $username is failing to be interpolated in BOTH the match AND the call to print.

    But that is not the whole story. Here's a slightly modified script: (the #NEW comment marks the one line I've added to the original):


        use strict;

        my $username;
        my $color;
        while (<>) {
            chomp;
            s/"//g;
            ($username, $color) = (split /,/, $_)[2,3];
            print STDOUT "$username\n"; #NEW
            if ("agag" =~ m/($username)/) {
                print STDOUT "here is the username: $username\n";
            }
        }

    when I run this with the same invocation as before (perl test.pl test.txt), here's the output I get (and by the way, the text file still ends with a blank line):

        adad
        agag
        ahah
        here is the username:

    Weird, huh? The $username comes out as expected when used in the "naked" print statement (the line marked #NEW), but it fails to be interpolated in BOTH the match AND the print statement in the following line.

    OK, now a second point here (it gets weirder!). Corion tests the script by "hard coding" the data from the text file into the script like so:

        use strict;

        my $username;
        my $color;
        while (<DATA>) {
            chomp;
            s/"//g;
            ($username, $color) = (split /,/, $_)[2,3];
            if ("agag" =~ m/($username)/) {
                print STDOUT "here is the username: $username\n"
            }
        }
        __DATA__
        "ADELMAN","John","adad","Ray"
        "AGAN","John","agag","Aditya"
        "AHMED","John","ahah","Conor"

    Now, when I run this on my machine, I get the output I'm looking for -- i.e.
    here is the username: agag
    So, I'm still stumped. Is this a scoping problem (like $username goes out of scope when we get into the "if" statement)? Is there something I don't understand about the diamond operator (that's what Corion's finding suggests to me)? Could this be some sort of file permission issue? (I'm running Linux and working in the shell as a regular user, and no, I'm not willing to run this script as root.)

      How do you know the result of split gives you (at least) four items? Add a diagnostic print to show you $_ and you'll probably see what's happening.

      It's not a scoping problem or a readline problem or a permission problem.


      Improve your skills with Modern Perl: the free book.

        Thanks chromatic. Here's the script with additional tests to see what's in $_ at each point.

            use strict;
            use warnings;

            my $username;
            my $color;

            while (<>) {
                chomp;
                s/"//g;
                ($username, $color) = (split /,/, $_)[2,3];
                print STDOUT "test of username: $username\n"; #NEW
                print STDOUT "test of dollar-underscore: $_\n"; #NEW ALSO
                if ("agag" =~ m/($username)/) {
                    print STDOUT "here is the username: $username\n";
                    print STDOUT "here is dollar-underscore: $_\n";
                }
            }

        here's the output of 'perl test.pl test.txt' (script includes "use warnings" this time, so they are included in the output)

            test of username: adad
            test of dollar-underscore: ADELMAN,John,adad,Ray
            test of username: agag
            test of dollar-underscore: AGAN,John,agag,Aditya
            test of username: ahah
            test of dollar-underscore: AHMED,John,ahah,Conor
            Use of uninitialized value $username in concatenation (.) or string at test.pl line 11, <> line 4.
            test of username:
            test of dollar-underscore:
            Use of uninitialized value $username in regexp compilation at test.pl line 13, <> line 4.
            Use of uninitialized value $username in concatenation (.) or string at test.pl line 14, <> line 4.
            here is the username:
            here is dollar-underscore:

        Note that $_ and $username are both coming out as they should (i.e. the split is working) before we get to the if statement. It just looks as though the match "agag" =~ m/($username)/ is failing. (note that the warnings are only being issued at the fourth "line" of the text file -- that is at the empty last line.)

        still stumped. Perhaps I'm missing something really basic and obvious about the match operator?

      I suspect you are still falling foul of blank lines or maybe unexpected line endings (Windows cr/lf line endings on a *nix system for example). The following code may be closer to what you need:

          use strict;
          use warnings;

          open my $tempOut, '>', 'delme.txt' or die "Can't create temp file: $!\n";
          print $tempOut <<FILE;
          "ADELMAN","John","adad","Ray"
          "AGAN","John","agag","Aditya"
          "AHMED","John","ahah","Conor"
          FILE
          close $tempOut;

          @ARGV = 'delme.txt';

          while (<>) {
              chomp;
              s/"//g;
              my ($username, $color) = (split /,/, $_)[2,3];
              next if ! defined $color;
              print "here is the username: $username\n" if "agag" =~ m/($username)/;
          }

      Prints:

      here is the username: agag

      However you seem to be parsing a CSV file so really you should be using one of the modules designed for that task such as Text::CSV.

      Oh, and you really should use warnings in addition to strict!

      True laziness is hard work
      ... thanks AnomolousMonk. The text file did have a blank line at the end.

      keszler suggested that. I just supplied a long-winded example of the effects.

Re: scoping problem?
by bliz (Acolyte) on Dec 06, 2011 at 18:57 UTC

    Not that it matters much for this example, but any reason for the matching versus just a comparison?

    if ("agag" =~ m/($username)/){
    vs
    if ("agag" eq $username){
    bliz
    
      because in the "real" script, I'm looking for filenames in a directory that contain the username, but have other stuff in them as well.
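
A minimal sketch of that kind of filename search, assuming the username should be treated as a literal substring rather than as a pattern. The helper name `files_for_user` and the sample file names are made up for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return the file names that contain $username as a
# literal substring. An undefined or empty username matches nothing.
sub files_for_user {
    my ($username, @filenames) = @_;
    return () unless defined $username && length $username;
    # \Q...\E quotes regex metacharacters such as "." or "+" in the username
    return grep { /\Q$username\E/ } @filenames;
}

# Made-up file names, for illustration only
my @names = ('report_agag_2011.txt', 'adad.log', 'notes.txt');
print "$_\n" for files_for_user('agag', @names);
```

If no regex features are needed at all, `index($name, $username) >= 0` is another way to express the same substring test.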
Re: scoping problem?
by keszler (Priest) on Dec 06, 2011 at 17:28 UTC

    Does the text file contain a blank line?

      Does the text file contain a blank line?

      rocroc:
      Here's an example of behavior when processing a blank line or a line having empty or too few fields: an empty or missing field splits to an empty string or to undef. The reason for the seemingly spurious match is that an empty string or an undef becomes the // regex, which always matches!

          >perl -wMstrict -le
          "my $s = 'foo,,';
           ;;
           my ($field0, $field1, $field2, $field3) = split /,/, $s;
           print qq{field0 '$field0' field1 '$field1' field3 '$field3'};
           ;;
           if ('bar' =~ m{ $field1 }xms) { print qq{bar matches '$field1'} }
           if ('bar' =~ m{ $field3 }xms) { print qq{bar matches '$field3'} }
          "
          Use of uninitialized value $field3 in concatenation (.) or string ...
          field0 'foo' field1 '' field3 ''
          bar matches ''
          Use of uninitialized value $field3 in regexp compilation ...
          Use of uninitialized value $field3 in concatenation (.) or string ...
          bar matches ''

      Note that without warnings (you are using warnings, aren't you?), this all proceeds quite silently:

          >perl -Mstrict -le
          "my $s = 'foo,,';
           ;;
           my ($field0, $field1, $field2, $field3) = split /,/, $s;
           print qq{field0 '$field0' field1 '$field1' field3 '$field3'};
           ;;
           if ('bar' =~ m{ $field1 }xms) { print qq{bar matches '$field1'} }
           if ('bar' =~ m{ $field3 }xms) { print qq{bar matches '$field3'} }
          "
          field0 'foo' field1 '' field3 ''
          bar matches ''
          bar matches ''
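
One way to guard against this is to reject blank or short lines before any regex is built from a field. The following is a sketch only, assuming the four-field layout of the sample data in the question. The helper name `username_from_line` is hypothetical and not part of the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return the third field of a quoted, comma-separated
# line, or nothing when the line is blank or has fewer than four fields.
sub username_from_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                 # drop an LF or CRLF line ending
    $line =~ s/"//g;                      # strip the double quotes, as in the thread
    my @fields = split /,/, $line, -1;    # -1 keeps trailing empty fields
    return if @fields < 4;
    my $username = $fields[2];
    return unless length($username);      # an empty field is not a username
    return $username;
}

for my $line (qq{"AGAN","John","agag","Aditya"\n}, "\n", qq{"foo",,\n}) {
    my $username = username_from_line($line);
    next unless defined $username;        # skip blank or malformed lines
    print "here is the username: $username\n" if 'agag' =~ /\Q$username\E/;
}
```

Only the first sample line produces output here. The blank line and the short line are skipped before the match is attempted.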
Re: scoping problem?
by rocroc (Initiate) on Dec 07, 2011 at 17:35 UTC

    SOLVED!!!!!

    The problem was indeed the encoding of the file. Opening the file via a file handle with the :encoding(UTF-16le) layer did the trick. Here is the working code:

        use strict;
        use warnings;
        use Encode;

        my $username;
        my $color;
        my $filename = shift @ARGV;
        my $fh;
        # Decode UTF-16LE on input and translate CRLF line endings
        open($fh, '<:encoding(UTF-16le):crlf', $filename)
            or die "Can't open $filename: $!";
        binmode STDOUT, ':encoding(UTF-8)';
        while (<$fh>) {
            chomp;
            s/"//g;
            ($username, $color) = (split /,/, $_)[2,3];
            if ('agag' =~ m/($username)/) {
                print STDOUT "here is the username: $username\n";
            }
        }

    Thanks so much to all of you for your help.

    Interesting side note: I first learned perl back in 2001, coded a whole lot in a job I had through 2003, and got pretty proficient at it. Then, I didn't code at all (except for numerical stuff in C) until just this week, when all of a sudden I had to deal with some text files. I plowed ahead as if nothing had changed, coding in exactly the same style as I had 10 years ago. Reading all of the stuff on Unicode (by the way, those links are great Anonymous Monk), I see that that was very much the wrong choice! This is great, because it's given me a chance to learn about Unicode.

    thanks again!

      To brush up on your perl, you should check out the free book Modern Perl, a loose description of how experienced and effective Perl 5 programmers work....You can learn this too.

Re: scoping problem?
by rocroc (Initiate) on Dec 07, 2011 at 14:55 UTC

    Update on the issue of possible non-printable characters:

    The 'file' command, run from the shell, tells me that the text file I'm reading in is encoded as follows:

    Little-endian UTF-16 Unicode
    Could this be causing my calls to the match operator to work differently than I think they are?
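
A small sketch of why the answer is likely yes. It encodes one sample line from the question as UTF-16LE (no byte-order mark is assumed here) and compares splitting the raw bytes with splitting the decoded text. The variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Simulate one line of the file as it sits on disk: UTF-16LE bytes.
# Assumption: no byte-order mark, unlike many real UTF-16 files.
my $bytes = encode('UTF-16LE', qq{"AGAN","John","agag","Aditya"\n});

# Read as raw bytes, every ASCII character is followed by a NUL byte,
# so the third field is not the four-character string "agag".
(my $raw = $bytes) =~ s/"//g;
my $raw_field = (split /,/, $raw)[2];
printf "raw field length: %d\n", length($raw_field);

# Decoded first, the same steps give the expected field.
(my $text = decode('UTF-16LE', $bytes)) =~ s/"//g;
chomp $text;
my $field = (split /,/, $text)[2];
printf "decoded field: %s (length %d)\n", $field, length($field);
```

NUL bytes usually do not show up on a terminal, which would be consistent with the diagnostic prints earlier in the thread looking normal while the match against "agag" did not behave as expected.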
Re: scoping problem?
by rocroc (Initiate) on Dec 07, 2011 at 17:38 UTC

    sorry, one more thing:

    seems like I ought to go back and change the title of this node. Obviously, "scoping" was not the problem.

    any suggestions?
