
Re: Regexp mystery (to me)

by jwkrahn (Abbot)
on Mar 03, 2008 at 20:19 UTC ( [id://671712] )

in reply to Regexp mystery (to me)

This code works like I want it to.

It does? Really? OK.

use strict;
use warnings;
our $list;
our @clients;
our $filedef1=$ARGV[0]; #name of client CSV file

Why are you declaring those variables here when you are only using them inside the read_clients() subroutine?

&read_clients ();

You shouldn't use & when calling subroutines, see perlsub for reasons why.
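For what it's worth, the case that usually bites is calling with & and no parentheses, which hands the caller's current @_ straight to the callee (the & form also bypasses prototypes). A minimal sketch, with sub names made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments it received.
sub count_args { return scalar @_ }

sub outer {
    my $amp_no_parens = &count_args;      # callee sees outer's @_
    my $amp_parens    = &count_args();    # callee sees an empty @_
    my $plain_call    = count_args();     # callee sees an empty @_
    return ($amp_no_parens, $amp_parens, $plain_call);
}

my @seen = outer('a', 'b', 'c');
print "@seen\n";    # prints "3 0 0"
```

So &read_clients (); with the parentheses behaves like a plain call here; it is the bare &read_clients; form that can surprise you.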

# define regex components
my $accode = qr(^"(.*)",.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*)x;

my $name = qr(^.*,"(.*)",.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*)x;

Why include the empty fields after the captured field?
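As a rough sketch of what I mean, the patterns can stop at the closing quote of the field they capture. The sample data below is invented, and this assumes the quoted fields never contain embedded quotes:

```perl
use strict;
use warnings;

# Made-up sample data: two records with quoted first and second fields.
my $list = qq{"A100","Alice",x,y\n"B200","Bob",x,y\n};

# Anchor at the start of each line (/m) and stop at the closing quote.
my $accode = qr/^"([^"]*)"/m;
my $name   = qr/^[^,\n]*,"([^"]*)"/m;

my @codes = $list =~ /$accode/g;
my @names = $list =~ /$name/g;
print "@codes | @names\n";    # prints "A100 B200 | Alice Bob"
```

Nothing after the captured field has to be spelled out, because the match simply ends there and the next one starts on the following line.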

# do regex matches
print "Extractions:\n";
my @extractions = $list =~ m{(?: $name)}mxgc;

Why are you using the /c option? It is only relevant if you are using the \G zero-width assertion in the pattern.
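For comparison, here is the kind of code where /c earns its keep: a small tokenizer that tries several \G-anchored patterns from the same position. The input string and token labels are invented for the example:

```perl
use strict;
use warnings;

my $input = 'abc123def';
my @tokens;

# Each match resumes at pos($input). With /c a failed match leaves pos()
# alone, so the next alternative is tried from the same spot.
while (1) {
    if    ($input =~ /\G([a-z]+)/gc) { push @tokens, "WORD:$1" }
    elsif ($input =~ /\G([0-9]+)/gc) { push @tokens, "NUM:$1" }
    else                             { last }
}
print "@tokens\n";    # prints "WORD:abc NUM:123 WORD:def"
```

Without /c the first failed match would reset pos() to the start of the string. In a single list-context match like yours there is no second match to resume, so the option does nothing.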

print "$extractions[$_], " for 0.. $#extractions;
print "End of Program!\n";
##Beginning of subroutine for reading the document source file.
sub read_clients
open FILEDEF1, "< $filedef1" or die "error reading $filedef1-$!";
while (<>)

The special <> readline operator will treat @ARGV as a list of file names and open and read each line from all of those files. Since $filedef1 is the first element of @ARGV the file will be opened and the first line from that file will be read into the $_ variable.

    push (@clients, <FILEDEF1>);

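You are pushing all the lines from the file onto the @clients array from inside the loop. In list context <FILEDEF1> returns every remaining line, so the first iteration slurps the whole file and every later iteration pushes nothing. The while (<>) loop is doing no useful work and can go.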
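Something along these lines would read the file once, without the package variables. This is only a sketch: it creates its own throwaway file so that it runs as-is, and the sample records are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small throwaway file so the sketch is self-contained.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
print {$tmp_fh} qq{"A100","Alice"\n"B200","Bob"\n};
close $tmp_fh or die "Cannot close $filename: $!";

# Takes the file name as an argument and returns the lines,
# instead of going through package variables.
sub read_clients {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open '$file': $!";
    my @lines = <$fh>;    # list context reads every line in one go
    close $fh;
    return @lines;
}

my @clients = read_clients($filename);
print scalar(@clients), " lines read\n";    # prints "2 lines read"
```

In your program the argument would be $ARGV[0] rather than a temporary file.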

close FILEDEF1;
$list = join(' ',@clients);

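You are joining the lines together with a single space character, which means that every line except the first will have a space at the beginning. That may confuse patterns that rely on ^ with the /m option, because ^ still matches right after each newline and the space now sits between it and the first field.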
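A quick way to see the effect, using two invented records and a pattern anchored on the opening quote:

```perl
use strict;
use warnings;

my @clients = (qq{"A100","Alice"\n}, qq{"B200","Bob"\n});

my $with_space = join ' ', @clients;    # second line now starts with a space
my $no_space   = join '',  @clients;    # lines keep their own newlines

my @hits_space = $with_space =~ /^"([^"]*)"/mg;
my @hits_plain = $no_space   =~ /^"([^"]*)"/mg;

print scalar(@hits_space), " vs ", scalar(@hits_plain), "\n";    # prints "1 vs 2"
```

Joining on the empty string is enough here, since each line already ends with its own newline.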

 print $list;
} ##End of block for reading the document source file.

Re^2: Regexp mystery (to me)
by ysth (Canon) on Mar 03, 2008 at 21:34 UTC
    You shouldn't use & when calling subroutines, see perlsub for reasons why.
    A quick read of perlsub doesn't show what you might mean, and IMO there is no problem with consistently using & and parentheses.
      With parentheses it's fine, as long as you're not using prototypes. Without the parentheses it could cause a problem.

Re^2: Regexp mystery (to me)
by barkingdoggy (Initiate) on Mar 03, 2008 at 21:00 UTC
    Thank you. That space joining the lines together is the problem. BOY, IS MY FACE RED! Mystery explained!
      That may have been the problem in this specific bug, but you should really use Text::xSV instead of parsing CSV files with a regex. Regex-based solutions cannot parse CSV files correctly in many cases.
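To illustrate the kind of input that trips up a comma-based pattern, here is a sketch with a made-up record. It does not use Text::xSV itself; as a stand-in it uses the core module Text::ParseWords so that it runs without installing anything. parse_line has quirks of its own (it also treats single quotes as quoting characters), so take it as a demonstration of the problem rather than a recommended CSV parser.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

# Made-up record whose first field contains a comma inside the quotes.
my $line = q{"Smith, John","AC1",42};

my @naive  = split /,/, $line;             # splits inside the quoted field
my @fields = parse_line(',', 0, $line);    # honours the double quotes

print scalar(@naive), " vs ", scalar(@fields), "\n";    # prints "4 vs 3"
print "$fields[0]\n";                                    # prints "Smith, John"
```

A dedicated CSV module also covers cases this sketch does not touch, such as escaped quotes and newlines inside quoted fields.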

      My criteria for good software:
      1. Does it work?
      2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?