http://www.perlmonks.org?node_id=998404


in reply to Simple Regex Question / Code Review

Rather than a detailed review, just one thing that caught my eye...

foreach (@$files) { $table{$_} = substr $_, 0, -2; }

In this loop, the
    $table{$_} = substr $_, 0, -2;
statement seems intended to snip the extension off the path/filename of a C source file and associate the truncated name with the original, full name. It assumes the extension is always two characters long, e.g., '.c'. That's not a bad assumption in this case, but it fails as soon as it meets an extension like '.cc' or '.cpp' ('foo.cc' becomes 'foo.', for instance), leaving one with a perhaps very puzzling bug. But of course, this will never happen! Whenever I encounter an assumption like this, old scars begin to throb and I feel a strong urge to program defensively.

Assuming one does not want to use well-tested and platform-independent CPAN modules for manipulating file names, one might write something like this (untested):
    for my $file_name (@$files) {
        (my $base_name = $file_name) =~ s{ [.] [^.]* \z }{}xms;
        $table{$file_name} = $base_name;
        }
which snips off the last '.whatever' extension regardless of its length, and, even though more verbose, has, I feel, a certain 'self-documentation' quality.
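For comparison, here is a minimal sketch of the module-based route mentioned above. It assumes File::Basename (a core module shipped with Perl) is acceptable; the helper name strip_extension and the sample file list are made up for illustration and are not from the original script.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper (not from the original post): return the path with
# its last '.extension' removed, leaving the directory part untouched.
sub strip_extension {
    my ($path) = @_;

    # fileparse() returns (name, directory, suffix). The suffix pattern is
    # the same "last dot to the end" idea as the regex above, but it is
    # applied only to the final path component.
    my ($name, $dir, $suffix) = fileparse($path, qr{ [.] [^.]* }xms);

    # Cut just the matched suffix off the original string, so the
    # directory portion is preserved exactly as it was written.
    return substr $path, 0, length($path) - length($suffix);
}

# Illustrative data only.
my @files = qw(see cee. foo.c foo/bar.cc foo.d/bar);
my %table = map { $_ => strip_extension($_) } @files;

print "$_ => $table{$_}\n" for sort keys %table;
```

One difference worth knowing: because fileparse looks only at the last path component, a name such as 'foo.d/bar' is left untouched, whereas the bare regex would cut it back to 'foo'.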

Update:

    `gcc $_ -o substr $_, 0, -2` foreach (@$files);
I couldn't get it to work syntactically... clarity is sacrificed... Is something like this even possible?

I agree with your concerns about clarity, but in any event, one way might be something like this (of course, you would use qx{...} instead of print qq{...}):

>perl -wMstrict -le "print qq{gcc $_->[0] -o $_->[1]} for map [ $_, s{ [.] [^.]* \z }{}xmsr ], qw(see cee. foo.c foo/bar.cc foo/bar/baz.cpp) ; "
gcc see -o see
gcc cee. -o cee
gcc foo.c -o foo
gcc foo/bar.cc -o foo/bar
gcc foo/bar/baz.cpp -o foo/bar/baz

(Before Perl 5.14 introduced the /r modifier for s///, you can use
    map { (my $o = $_) =~ s{ [.] [^.]* \z }{}xms;  [ $_, $o ] }
instead.)

Further Update: Or even:

>perl -wMstrict -le "print qq{gcc $_ -o @{[ s{ [.] [^.]* \z }{}xmsr ]}} for qw(see cee. foo.c foo/bar.cc foo/bar/baz.cpp); "
gcc see -o see
gcc cee. -o cee
gcc foo.c -o foo
gcc foo/bar.cc -o foo/bar
gcc foo/bar/baz.cpp -o foo/bar/baz
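If the commands are meant to be run rather than printed, one possible arrangement (a sketch of my own, not something from the original script) swaps qx{...} for the list form of system, which bypasses the shell and makes it easy to check each exit status. The names gcc_command and compile_all are hypothetical, and the substitution is written without /r so it also works before 5.14.

```perl
use strict;
use warnings;

# Hypothetical helper: build the gcc argument list for one source file,
# using the same "strip the last .extension" substitution as above.
sub gcc_command {
    my ($source) = @_;
    (my $output = $source) =~ s{ [.] [^.]* \z }{}xms;
    return ('gcc', $source, '-o', $output);
}

# Hypothetical driver: run gcc on each file and return the list of
# sources whose compilation did not exit with status 0.
sub compile_all {
    my @sources = @_;
    my @failed;
    for my $source (@sources) {
        # List-form system() passes the arguments straight to gcc, so
        # spaces or shell metacharacters in file names are not interpreted.
        system(gcc_command($source)) == 0
            or push @failed, $source;
    }
    return @failed;
}

# Dry run: show the commands without executing anything.
print join(' ', gcc_command($_)), "\n"
    for qw(foo.c foo/bar.cc foo/bar/baz.cpp);
```

To compile for real, one would call something like `my @failed = compile_all(@$files);` and report whatever comes back. Unlike backticks, system does not capture the compiler's output, so this is a trade-off rather than a drop-in replacement.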

Re^2: Simple Regex Question / Code Review
by marquezc329 (Scribe) on Oct 12, 2012 at 01:15 UTC

    Thank you. Your well-thought-out examples of print qq{...} / qx{...}, in conjunction with the above reference to Re^2: How can we interpolate an expression??, provide an excellent foothold for understanding a concept that had originally seemed far-fetched to me. I think my bias towards clarity will prevail in this particular situation and I might stick with assigning pairs to a hash for the associated files. Is this exceedingly Novice?

    "Whenever I encounter an assumption like this, old scars begin to throb and I feel a strong urge to program defensively."

    In hindsight, I can definitely understand the ramifications of this assumption and plan on expanding the original script to include acceptance of a larger/more realistic spectrum of file extensions, as well as generating a more concentrated focus on defensive coding in my practice and learning.

    You, sir, have hit the nail on the head with regard to the kind of input I was hoping to receive in posting this for review. This is a pristine example of the quality of this community and why I prefer it to the plethora of available forums. Your response has opened several paths of inquiry in my mind that I intend to follow to fruition. Thanks again!

      ... my bias towards clarity ... Is this exceedingly Novice?

      In a professional/production environment, clarity is a jewel above price. The person who maintains your code months or years hence (and it may even be you; remember: the sanity you save may be your own) will sing your praises to the heavens if you give him or her a clear piece of code to work with.