PerlMonks  

Integer regex, different results in windows and mac - I just need regex help

by hiyesthanks (Initiate)
on Oct 19, 2017 at 02:42 UTC ( #1201647=perlquestion )
hiyesthanks has asked for the wisdom of the Perl Monks concerning the following question:

When I test the code on Windows, I get the results I'm looking for (https://imgur.com/a/59SKl). But when I test it on Mac, I get different results (https://imgur.com/a/ZJMEg) for the positive and negative integers. What's wrong with my regex?

    #!/usr/bin/perl
    # The program calculates the total zeros and positive intergers from the data using regex
    use strict;
    use warnings;

    my ( $ctrP, $ctrN, $ctrZ ) = ( 0, 0, 0 );

    while( my $num = <DATA> )
    {
        chomp($num);
        ## print "num=[$num]\n";
        if ( $num =~ /^[0].{0}/ )
        {
            $ctrZ++;
        }
        elsif ( $num =~ /^\d[0-9]{1,3}$/ )
        {
            $ctrP++;
        }
        else
        {
            $ctrN++;
        }
    }

    printf("freq(Z+):%8s\n", $ctrP );
    printf("freq(Z-):%8s\n", $ctrN );
    printf("freq(0):%9s\n", $ctrZ );
    printf("Total:%11s\n", ($ctrP+$ctrN+$ctrZ) );
    exit;

    __DATA__
    19
    -22
    498
    512
    15
    -932
    0
    22
    808
    17
    -32

Replies are listed 'Best First'.
Re: Integer regex, different results in windows and mac - I just need regex help
by Marshall (Abbot) on Oct 19, 2017 at 06:49 UTC
    If you are getting different results on the Mac vs. Windows, a very likely culprit is the line endings and what chomp does with them.

    This isn't "perfect", but gives an idea. I would use better variable names so that you have a chance to understand them 5 years from now.

        #!/usr/bin/perl
        # The program calculates the total zeros and positive integers from
        # the data using regex
        use strict;
        use warnings;

        my ( $c_positive_int, $c_negative_int, $c_zero_int, $ctr_not_int) = ( 0, 0, 0, 0 );

        while( my $num = <DATA> )
        {
            $num =~ s/^\s+//;    # delete leading space(s)
            $num =~ s/\s+$//;    # delete trailing space(s)

            if ( $num =~ /^0$/ )          # integer single 0
            {
                $c_zero_int++;
            }
            elsif ( $num =~ /^\d+$/ )     # only positive digits
            {
                $c_positive_int++;
            }
            elsif ( $num =~ /^-\d+$/ )    # leading minus sign, then only digits
            {
                $c_negative_int++;
            }
            else
            {
                $ctr_not_int++;           # non integer
                print "**** Error $num is not an integer!\n";
            }
        }

        printf "freq(Z+):%8i\n", $c_positive_int;
        printf "freq(Z-):%8i\n", $c_negative_int;
        printf "freq(0): %8i\n", $c_zero_int;
        printf "freq(?): %8i\n", $ctr_not_int;
        printf "Total: %8i\n", $c_positive_int + $c_negative_int
                               + $c_zero_int + $ctr_not_int ;

        =Results
        **** Error 32.5 is not an integer!
        freq(Z+):       7
        freq(Z-):       3
        freq(0):        1
        freq(?):        1
        Total:       12
        =cut

        __DATA__
        19
        -22
        498
        512
        15
        -932
        0
        22
        808
        17
        -32
        32.5

Re: Integer regex, different results in windows and mac - I just need regex help
by kcott (Chancellor) on Oct 19, 2017 at 03:28 UTC

    G'day hiyesthanks,

    Welcome to the Monastery.

    Please post your results with your question. Guidelines for posting questions can be found in "How do I post a question effectively?". You can update your post with the results; "How do I change/delete my post?" explains how to do this.

    A screenshot of your terminal is an inappropriate way to present your results; however, when you do have a link to post, please create an actual link (not a URL as plain text) — "What shortcuts can I use for linking to other information?" has details about this.

    From the code you posted, I'd say the first thing you should do is read "perlretut - Perl regular expressions tutorial".

    Here are the regexes you could have used to match the numbers in your DATA section:

        $ perl -E '/^0$/ and say for qw{19 -22 498 512 15 -932 0 22 808 17 -32}'
        0
        $ perl -E '/^[1-9]\d*$/ and say for qw{19 -22 498 512 15 -932 0 22 808 17 -32}'
        19
        498
        512
        15
        22
        808
        17
        $ perl -E '/^-[1-9]\d*$/ and say for qw{19 -22 498 512 15 -932 0 22 808 17 -32}'
        -22
        -932
        -32

    That was run on "macOS 10.12.5" using "Perl 5.26.0". I don't have MSWin available to run a comparison; although, I'd be highly surprised if the results differed.

    If you believe you're getting different results on various platforms, you should provide details of the OS and Perl you used (as I did above).
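
    As a small illustration of the point above, here is one way to collect those details from inside Perl itself. This is only a sketch: the `$report` variable and the output format are my own choices, while `$^O` and `$^V` are Perl's built-in variables for the OS name and the interpreter version.

```perl
use strict;
use warnings;

# $^O is the name of the operating system this perl was built for;
# $^V is the version of the running perl interpreter.
my $report = sprintf "OS: %s, Perl: %vd", $^O, $^V;
print "$report\n";
```

    Pasting that one line next to your results makes it much easier for others to reproduce a platform difference.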

    — Ken

Re: Integer regex, different results in windows and mac - I just need regex help
by pryrt (Curate) on Oct 19, 2017 at 13:44 UTC

    You've already gotten good answers regarding CRLF, chomping, and alternate regular expressions. However, to me, a regex doesn't seem the best tool for the job. Testing whether a number is negative, zero, or positive seems like a job for numerical operators. Using Test::More to verify and Benchmark to compare times, with the same set of 1000 random integers:

        __END__
        1..4
        # grep => {
        #   'count_of_negatives' => 479,
        #   'count_of_others' => 0,
        #   'count_of_positives' => 506,
        #   'count_of_zeros' => 15,
        #   'total_count' => 1000
        # }
        # comparison_ops => {
        #   'count_of_negatives' => 479,
        #   'count_of_others' => 0,
        #   'count_of_positives' => 506,
        #   'count_of_zeros' => 15,
        #   'total_count' => 1000
        # }
        # spaceship_op => {
        #   'count_of_negatives' => 479,
        #   'count_of_others' => 0,
        #   'count_of_positives' => 506,
        #   'count_of_zeros' => 15,
        #   'total_count' => 1000
        # }
        # [hiyesthanks] regex => {
        #   'count_of_negatives' => 521,
        #   'count_of_others' => 0,
        #   'count_of_positives' => 464,
        #   'count_of_zeros' => 15,
        #   'total_count' => 1000
        # }
        # [Marshall] regex => {
        #   'count_of_negatives' => 479,
        #   'count_of_others' => 0,
        #   'count_of_positives' => 506,
        #   'count_of_zeros' => 15,
        #   'total_count' => 1000
        # }
        ok 1 - verify comparison operators give same results as grep
        ok 2 - verify spaceship operators give same results as grep
        not ok 3 - verify [hiyesthanks] regex give same results as grep
        #   Failed test 'verify [hiyesthanks] regex give same results as grep'
        #   at C:\usr\local\share\passthru\perl\perlmonks\1201647.pl line 22.
        #     Structures begin differing at:
        #          $got->{count_of_positives} = '464'
        #     $expected->{count_of_positives} = '506'
        ok 4 - verify [Marshall] regex give same results as grep
                                 Rate use_marshall_regex use_hiyesthanks_regex use_grep use_spaceship_op use_comparison_ops
        use_marshall_regex     1715/s                 --                  -18%     -26%             -33%               -84%
        use_hiyesthanks_regex  2088/s                22%                    --      -9%             -18%               -81%
        use_grep               2307/s                35%                   10%       --              -9%               -79%
        use_spaceship_op       2543/s                48%                   22%      10%               --               -77%
        use_comparison_ops    10857/s               533%                  420%     371%             327%                 --
        # Looks like you failed 1 test of 4.

    edit: fixed bugs, which were verifying and printing the wrong versions; benchmark was ok
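
    The benchmark script itself is not shown in this reply, so the following is not that code. It is only a minimal sketch of the numerical-operator idea, using the spaceship operator `<=>` to pick a category. The `classify` sub, the `%count` hash, and the sample input lines are all hypothetical names and data invented for the illustration.

```perl
use strict;
use warnings;

# Classify one input line as 'negative', 'zero', 'positive' or 'other'.
sub classify {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;                     # trim, including a stray "\r"
    return 'other' unless $text =~ /^-?\d+$/;    # accept integers only
    # <=> yields -1, 0 or 1; shift that to a list index of 0, 1 or 2.
    return ( 'negative', 'zero', 'positive' )[ ( $text <=> 0 ) + 1 ];
}

my %count = map { $_ => 0 } qw(negative zero positive other);
$count{ classify($_) }++ for "19\r\n", "-22\n", "0\n", "32.5\n", "808", "-32";

printf "%-8s %d\n", $_, $count{$_} for qw(positive negative zero other);
```

    The regex is still used here, but only to validate that the input is an integer; the sign test itself is a plain numeric comparison.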

Re: Integer regex, different results in windows and mac - I just need regex help
by dave_the_m (Prior) on Oct 19, 2017 at 06:45 UTC
    You've probably got DOS line endings in your data. Try adding $num =~ s/\s+$//; to the top of the loop.
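
    To make the suggestion above concrete, here is a minimal sketch of what happens to one line. It assumes the line was read on a system with LF line endings from a file saved with CRLF; the `$line` value is made up for the illustration, and `$re` is the positive-integer regex from the original post.

```perl
use strict;
use warnings;

# Hypothetical line as read on macOS/Linux from a CRLF-terminated file.
my $line = "19\r\n";

my $chomped = $line;
chomp $chomped;            # removes only the "\n"; leaves "19\r"

my $stripped = $line;
$stripped =~ s/\s+$//;     # removes all trailing whitespace; leaves "19"

my $re = qr/^\d[0-9]{1,3}$/;   # positive-integer regex from the question

printf "%-12s %s\n", 'chomp only:', $chomped  =~ $re ? 'match' : 'no match';
printf "%-12s %s\n", 's/\s+$//:',   $stripped =~ $re ? 'match' : 'no match';
```

    With chomp alone the "\r" sits between the digits and the end of the string, so `$` cannot match and the number falls through to the else branch, where the original script counts it as negative.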

    Dave.

Re: Integer regex, different results in windows and mac - I just need regex help
by haukex (Abbot) on Oct 19, 2017 at 08:16 UTC

    Just to confirm what the others have said, if you save your file with CRLF line endings on a system that uses LF line endings (*NIX, including Mac OS X) and run it, it gives the wrong results. If you then put binmode DATA, ':crlf'; just before the loop, it works again. :crlf is the default on Windows, see open and binmode, and it causes the CRLF line endings to be converted to LF, so that chomp works properly. Without :crlf, what is happening is that $num contains a trailing CRLF, but chomp only removes the LF, leaving the strings looking like "19\r", which you can see yourself if you use Data::Dump or Data::Dumper with $Data::Dumper::Useqq=1; - see also the Basic debugging checklist. The best solution IMO is to convert the file to LF line endings, for example using fromdos from the Tofrodos package, and also many text editors support selecting the line endings in the "Save As" dialog.
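
    Here is a small sketch of both points: looking at the data with `$Data::Dumper::Useqq` and reading it through the `:crlf` layer. It uses a temporary file instead of the DATA handle so that it is self-contained. The file contents and the `read_lines` helper are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;
use Data::Dumper;
use File::Temp qw(tempfile);

$Data::Dumper::Useqq = 1;   # show control characters as escapes such as \r
$Data::Dumper::Terse = 1;   # print the bare value without "$VAR1 = "

# Write a small, made-up data file with CRLF line endings.
my ( $out, $path ) = tempfile( UNLINK => 1 );
binmode $out, ':raw';
print {$out} "19\r\n-22\r\n0\r\n";
close $out or die "close: $!";

# Read all lines through the given PerlIO layer and chomp them.
sub read_lines {
    my ($layer) = @_;
    open my $in, "<$layer", $path or die "open: $!";
    chomp( my @lines = <$in> );
    close $in;
    return @lines;
}

my @raw  = read_lines(':raw');    # bytes as stored: "\r" survives chomp
my @crlf = read_lines(':crlf');   # CRLF translated to LF before chomp

print Dumper( \@raw );
print Dumper( \@crlf );
```

    In the first dump the strings still end in an escaped carriage return, while the second dump shows clean numbers.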

    Also, regarding your regexes, I just wanted to point out Regexp::Common::number, which you might find useful, and also that the regex /^[0].{0}/ looks a little suspicious: .{0} doesn't really do anything, and rewriting the regex as /^0/ shows that it will accept any string beginning with a 0, including stuff like "0 but true", is that really what you want?
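
    A quick sketch of that difference, comparing the pattern from the original post with one anchored at both ends. The test strings and the names `$loose` and `$strict` are chosen only for the illustration.

```perl
use strict;
use warnings;

my $loose  = qr/^[0].{0}/;   # the zero pattern from the original post
my $strict = qr/^0$/;        # anchored at both ends

for my $s ( '0', '007', '0 but true', "0\r" ) {
    ( my $shown = $s ) =~ s/\r/\\r/g;    # make the carriage return visible
    printf "%-14s loose=%d strict=%d\n", "'$shown'",
        ( $s =~ $loose  ? 1 : 0 ),
        ( $s =~ $strict ? 1 : 0 );
}
```

    The loose pattern accepts every one of these strings, which also explains why the zero count in the original script stays correct even when a "\r" is left on the line. The strict pattern accepts only the bare '0'.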

Re: Integer regex, different results in windows and mac - I just need regex help
by choroba (Bishop) on Oct 19, 2017 at 21:06 UTC
    Crossposted to reddit. It's considered polite to inform about crossposting to avoid duplicate work of people not attending both the sites.

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,

Node Type: perlquestion [id://1201647]
Approved by NetWallah