PerlMonks  

How to handle metacharacters in input file for Perl one-liner code

by debug (Initiate)
on Jul 16, 2015 at 12:47 UTC ( #1135012=perlquestion )

debug has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks!! This is my first post here, even though I've been using PerlMonks gems as lifesavers for a while now :)

First, my problem code:

my $file = "foo.c";

`perl -p -l -i.bak -e "s/$file/$file,=SUM(B$x:B$y)/" sample.csv`;

sample.csv has a list of C filenames in column A. Based on certain conditions, I want to put a sum of values into column B of the row whose filename matches.

My problem is that while searching for foo.c, the code also matches and replaces files like foo_con.c, foo_con_bar.c etc. I'm not sure why, but it's obvious that the . separator in my filename is somehow being ignored: text with an _ in its place, such as foo_c, is also treated as a valid match, leading to multiple invalid substitutions.

I would like to know if there is any way to make the code detect the . separator and not mistake it for an _ in other filenames.

Notice that I do not have a /g global modifier in my code string. Even then, this code does multiple matches and substitutions. I was hoping that it would find the first match in sample.csv, and stop further execution.

PS: I'm sure you'll notice the use of "" instead of ' ' for the code string. I am running Perl in a Windows environment.

Thanks for your help!!

Replies are listed 'Best First'.
Re: How to handle metacharacters in input file for Perl one-liner code
by marinersk (Priest) on Jul 16, 2015 at 12:59 UTC

      Thank You!!!

      So I did not use your suggestion of using quotemeta (I haven't used it before, so I'm gonna go look it up), but I got my answer in the sample output you linked :)

      Here's my updated code that's working now. Apparently I just needed to preemptively escape the . in my search string.

      my $file = "foo.c";

      $file =~ s/\./\\\./g;

      `perl -p -l -i.bak -e "s/$file/$file,=SUM(B$x:B$y)/" sample.csv`;

      Though probably I should look into Dean's suggestion as well. Any thoughts on which approach would have the fastest execution time out of these?

        ... which approach would have the fastest execution time out of these?

        Since you're shelling out to the OS to run another copy of Perl to do your work, fast execution is not something you ever need to worry about.


        Give a man a fish:  <%-(-(-(-<

Re: How to handle metacharacters in input file for Perl one-liner code
by duelafn (Vicar) on Jul 16, 2015 at 13:15 UTC

    Not being a Windows person, I don't know if the \Q\E quotemeta that marinersk is suggesting will have problems when passed to the Windows shell, but I'm fairly certain that there would be at least some risk that you might eventually need to include a character that causes some pain due to shell escaping. Thus, I beg you to consider doing it the long way.

    my $file = "foo.c";
    rename "sample.csv", "sample.csv.bak" or die "Error backing up sample.csv: $!";
    open my $IN,  "<", "sample.csv.bak" or die "Error reading sample.csv.bak: $!";
    open my $OUT, ">", "sample.csv"     or die "Error writing to sample.csv: $!";
    while (defined(my $line = <$IN>)) {
        $line =~ s/\Q$file\E/$file,=SUM(B$x:B$y)/;
        print $OUT $line;
    }

    Good Day,
        Dean
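
The original question also asked why several lines get substituted even without /g. With -p, the expression runs once for every input line, and /g only controls repeats within a single line, so each matching line is changed. A minimal sketch of the loop above that stops substituting after the first matching line is shown here. It works on an in-memory list to stay self-contained, and replace_first and the values 2 and 5 for $x and $y are hypothetical, not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the substitution to the first matching line only.
sub replace_first {
    my ($file, $x, $y, @lines) = @_;
    my $done = 0;
    for my $line (@lines) {
        next if $done;    # leave every later line untouched
        $done = 1 if $line =~ s/\Q$file\E/$file,=SUM(B$x:B$y)/;
    }
    return @lines;
}

my @out = replace_first("foo.c", 2, 5, "foo.c,1", "foo_con.c,2", "foo.c,3");
print "$_\n" for @out;
```

In the file-based loop, the same flag would guard the s/// while every line is still printed to $OUT.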

      Hello Dean,

      Thanks for your reply.

      I should probably have mentioned that I'm running this code inside an iterative loop, so I am a bit concerned about execution time. I did consider taking this approach instead of a system call, but honestly I have no idea which one would be the more CPU-intensive process in terms of file open operations. Would you have any thoughts on that?

        A pure Perl solution is likely to be faster than shelling out to the OS.

        Shelling out to the OS is a very expensive process. If you're really concerned about execution time, do it all in Perl.


        Give a man a fish:  <%-(-(-(-<

Re: How to handle metacharacters in input file for Perl one-liner code
by QM (Parson) on Jul 16, 2015 at 15:29 UTC
    Besides the good advice you've received above, you should also investigate regex anchors ^ and $, for example:
    s/^$file$/something goes here/

    Search for "anchor" in perlretut#Simple word matching.

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of
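
A plain ^$file$ only matches when the filename is the whole line. A hedged sketch of how the anchor idea might combine with \Q..\E for a CSV row follows, assuming the filename sits at the start of the line and is followed by a comma or the end of the string. The helper add_sum and the row values are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: anchored, literal match on the column A filename.
sub add_sum {
    my ($line, $file, $x, $y) = @_;
    # ^ pins the match to the start of the line, \Q..\E keeps "." literal,
    # and the lookahead requires the name to end at a comma or end of string.
    $line =~ s/^\Q$file\E(?=,|\z)/$file,=SUM(B$x:B$y)/;
    return $line;
}

print add_sum($_, "foo.c", 2, 5), "\n"
    for "foo.c,10", "foo_con.c,20", "bar_foo.c,30";
```

Only the first row is rewritten here; bar_foo.c is skipped because of the ^ anchor, and foo_con.c because the dot is literal.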

      Thank you everyone for your excellent suggestions above.

      I will go ahead and replace my shell calls with direct Perl code. Also, QM's anchor suggestion is great, and would probably have helped me in the first place, though I'm kinda glad I did not know about it, since that forced me to figure out the other loophole in my code. I'll go put it in my code now.

Re: How to handle metacharacters in input file for Perl one-liner code
by GotToBTru (Prior) on Jul 16, 2015 at 14:56 UTC

    The period in your file name is being interpreted as the "match-any-character-except-newline" wildcard. That's why you get those matches you aren't expecting. "foo.c" matches "foo.c", but also the "foo_c" in "foo_con.*". Putting \Q and \E around $file tells the regex engine to treat every character in it literally rather than as a metacharacter. Alternatively, quotemeta will 'escape' any characters that might be interpreted as wildcards.

    Dum Spiro Spero
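
As a small illustration of the point above, the following sketch compares an unquoted pattern, \Q..\E, and quotemeta on a few sample names. The list of names is made up for the example.

```perl
use strict;
use warnings;

my $file  = "foo.c";
my @names = ("foo.c", "foo_c", "foo_con.c", "fooXc");

# Unquoted: the "." in $file matches any character except a newline.
my @loose = grep { /$file/ } @names;

# \Q..\E makes every metacharacter in $file literal.
my @strict = grep { /\Q$file\E/ } @names;

# quotemeta() builds the same escaped pattern ahead of time.
my $escaped = quotemeta($file);
my @also_strict = grep { /$escaped/ } @names;

print "unquoted:  @loose\n";
print "\\Q..\\E:    @strict\n";
print "quotemeta: @also_strict\n";
```

The unquoted pattern accepts all four names, while both quoted forms accept only foo.c.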
