PerlMonks  

Removing trailing asterisk from lines with regex

by Tmms (Initiate)
on Apr 14, 2017 at 20:35 UTC (#1187960=perlquestion)
Tmms has asked for the wisdom of the Perl Monks concerning the following question:

Dear all

I have a text file in which some of the lines end with an asterisk (they are protein sequences, and * stands for a stop codon). I was trying to write a short Perl script to remove the asterisks (substitute them with an empty string), but I got some results I found weird. Hopefully someone can help me.

I tried this code, but it does not work. The * are still there.

while (my $line = <$in>) {
    $line =~ s/\*$//;
    print $line;
}

When I tried this code, it suddenly worked.

while (my $line = <$in>) {
    $line =~ s/\*\s$//;
    print $line;
}

I think the second version worked because every line ends with a newline character (\n). But when I replaced \s with \n, the code stopped working. That confused me and made me wonder about these three things:

  • What is going on?
  • Does Perl let you print $line as a literal string, with \n etc. shown?
  • When $ is used as an anchor, does it match after the newline character ("\n$") or before it ("$\n")?

Thanks in advance.

Replies are listed 'Best First'.
Re: Removing trailing asterisk from lines with regex
by AnomalousMonk (Chancellor) on Apr 14, 2017 at 22:16 UTC

    Just another set of examples to drive home the point that, whatever else may happen (i.e., with the newlines), the * (asterisk) is always gone:

    c:\@Work\Perl\monks>perl -wMstrict -le
    "my $s = qq{AAA*\n};
     ;;
     my $t = $s;
     print qq{A1: '$t'};
     $t =~ s/\*$//;
     print qq{A2: '$t'};
     ;;
     $t = $s;
     print qq{B1: '$t'};
     $t =~ s/\*\s$//;
     print qq{B2: '$t'};
     ;;
     $t = $s;
     print qq{C1: '$t'};
     $t =~ s/\*[\n]$//;
     print qq{C2: '$t'};
    "
    A1: 'AAA*
    '
    A2: 'AAA
    '
    B1: 'AAA*
    '
    B2: 'AAA'
    C1: 'AAA*
    '
    C2: 'AAA'

    Note that \n is a member of the \s (whitespace) set, and [\n] is the same as \n (newline) alone. Please see perlre, perlretut, and perlrequick.
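
A small sketch of the same points, assuming a line that ends in a plain \n (the sample string and the helper name `matches` are made up for illustration). It checks where $ is allowed to match and contrasts it with \z, which matches only at the very end of the string.

```perl
use strict;
use warnings;

my $line = "AAA*\n";    # hypothetical sample line with a plain "\n" ending

# Return "yes" or "no" depending on whether $line matches the given pattern.
sub matches {
    my ($re) = @_;
    return $line =~ $re ? "yes" : "no";
}

printf "%-8s %s\n", '\*$',     matches(qr/\*$/);      # $ may match just before a final "\n"
printf "%-8s %s\n", '\*\n$',   matches(qr/\*\n$/);    # $ may also match at the very end
printf "%-8s %s\n", '\*\s$',   matches(qr/\*\s$/);    # "\n" is a member of \s
printf "%-8s %s\n", '\*[\n]$', matches(qr/\*[\n]$/);  # [\n] is the same as \n
printf "%-8s %s\n", '\*\z',    matches(qr/\*\z/);     # \z never matches before the "\n"
```

Under that assumption the first four patterns match and only the \z pattern does not, so on such a line a plain s/\*$// already removes the asterisk and leaves the newline in place.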

    Update:

    Does perl let you print $line as a literal string with \n etc.?
    Some questions can be answered by simple experimentation. What happens when you print the strings "foobar", "foo\nbar", "foo\n\n\nbar", "foobar\n", "foobar\n\n\n", etc.?


    Give a man a fish:  <%-{-{-{-<

Re: Removing trailing asterisk from lines with regex
by Anonymous Monk on Apr 14, 2017 at 21:03 UTC
    $ matches at the end of the string *or* just before a \n at the end of the string. Use Data::Dumper to look at $line; it sounds like there is extra whitespace at the end.
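
A minimal sketch of that check. It assumes, purely as a guess about the original file, that the lines end in Windows-style "\r\n"; the sample string and the helper name `strip_with` are hypothetical. Setting $Data::Dumper::Useqq makes Data::Dumper print control characters as escapes, so a hidden carriage return becomes visible.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq = 1;    # show control characters as escapes such as \r and \n
$Data::Dumper::Terse = 1;    # print just the value, without the "$VAR1 = " prefix

# Hypothetical line, as it might look if the file was saved with CRLF line endings.
my $line = "AAA*\r\n";

# Apply a substitution pattern to a copy of the string and return the result.
sub strip_with {
    my ($re, $s) = @_;
    $s =~ s/$re//;
    return $s;
}

print "input: ", Dumper($line);

# Try the three patterns from the question against the same input.
for my $re (qr/\*$/, qr/\*\n$/, qr/\*\s$/) {
    print "$re gives ", Dumper(strip_with($re, $line));
}
```

If the real data does look like this, only the \s version can match, because the "\r" sits between the asterisk and the "\n"; it removes the asterisk together with the "\r" and leaves the "\n". That would be consistent with the behaviour described in the question, but the dump of an actual line from the file is what settles it.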
Re: Removing trailing asterisk from lines with regex
by Anonymous Monk on Apr 14, 2017 at 21:17 UTC
    my @lines = ("foo*\n", "foo* \n", "foo*  \n");
    for my $line (@lines) {
        $line =~ s/\*\s$//;
        print "($line)\n";
    }

    For the first input, the * and newline are removed. For the second input, the * and trailing space are removed, but the newline is left there. For the third input, nothing is removed. You might want to use this substitution instead:
    $line =~ s/\*\s*$//;
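
One thing to watch with s/\*\s*$// is that \s* also consumes the newline, so a print of the result no longer ends the line. A possible variant, offered as a sketch rather than the poster's method, captures the trailing whitespace and puts it back; the function name `strip_stop` and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Remove one trailing asterisk but keep whatever whitespace followed it,
# so the original line ending (plain "\n" or "\r\n") survives.
sub strip_stop {
    my ($line) = @_;
    $line =~ s/\*(\s*)$/$1/;
    return $line;
}

# Sample lines: plain newline, one or two trailing spaces, CRLF, and no asterisk.
for my $line ("foo*\n", "foo* \n", "foo*  \n", "foo*\r\n", "foo\n") {
    print strip_stop($line);
}
```

Lines without a trailing asterisk pass through unchanged, and any trailing spaces are kept as they were; if those should go as well, they would need a separate substitution.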

Node Type: perlquestion [id://1187960]
Front-paged by Corion