PerlMonks  

Removing trailing asterisk from lines with regex

by Tmms (Initiate)
on Apr 14, 2017 at 20:35 UTC (#1187960=perlquestion)
Tmms has asked for the wisdom of the Perl Monks concerning the following question:

Dear all

I have a text file in which some of the lines end with an asterisk (they are protein sequences, and * stands for a stop codon). I was trying to write a short Perl script to remove them (substitute them with the empty string), but I got some results I found weird. Hopefully someone can help me.

I tried this code, but it does not work. The * are still there.

    while (my $line = <$in>) {
        $line =~ s/\*$//;
        print $line;
    }

When I tried this code, it suddenly worked.

    while (my $line = <$in>) {
        $line =~ s/\*\s$//;
        print $line;
    }

I think the second version worked because every line ends with a newline character (\n), which \s matches. But when I replaced \s with \n, the code stopped working. That confused me and made me wonder about three things:

  • What is going on?
  • Does Perl let you print $line as a literal string, with \n etc. shown?
  • When $ is used as an anchor, does it match after the newline character (as in "\n$") or before it (as in "$\n")?

Thanks in advance.

Replies are listed 'Best First'.
Re: Removing trailing asterisk from lines with regex
by AnomalousMonk (Chancellor) on Apr 14, 2017 at 22:16 UTC

    Just another set of examples to drive home the point that, whatever else may happen (i.e., with the newlines), the * (asterisk) is always gone:

    c:\@Work\Perl\monks>perl -wMstrict -le
    "my $s = qq{AAA*\n};
     ;;
     my $t = $s;
     print qq{A1: '$t'};
     $t =~ s/\*$//;
     print qq{A2: '$t'};
     ;;
     $t = $s;
     print qq{B1: '$t'};
     $t =~ s/\*\s$//;
     print qq{B2: '$t'};
     ;;
     $t = $s;
     print qq{C1: '$t'};
     $t =~ s/\*[\n]$//;
     print qq{C2: '$t'};
    "
    A1: 'AAA*
    '
    A2: 'AAA
    '
    B1: 'AAA*
    '
    B2: 'AAA'
    C1: 'AAA*
    '
    C2: 'AAA'

    Note that: \n is a member of the \s (whitespace) set; [\n] is the same as \n (newline) alone. Please see perlre, perlretut, and perlrequick.

    Update:

    Does perl let you print $line as a literal string with \n etc.?
    Some questions can be answered by simple experimentation. What happens when you print the strings "foobar", "foo\nbar", "foo\n\n\nbar", "foobar\n", "foobar\n\n\n", etc?
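
One way to run that experiment so the invisible characters become visible is to dump the strings with Data::Dumper and its Useqq option, which prints control characters as escapes. This is only a minimal sketch: show() is a hypothetical helper name, and the last sample string (with a carriage return) is an assumption about what the input file might contain, not something stated in the question.

```perl
use strict;
use warnings;
use Data::Dumper;

# Useqq makes Data::Dumper emit double-quoted strings in which control
# characters appear as escapes such as \n and \r.
$Data::Dumper::Useqq  = 1;
$Data::Dumper::Terse  = 1;   # omit the "$VAR1 = " prefix
$Data::Dumper::Indent = 0;   # keep each dump on a single line

# show() is a hypothetical helper, not part of any module.
sub show {
    my ($string) = @_;
    return Dumper($string);
}

# The last string is an assumed example of a line with a Windows line ending.
print show($_), "\n" for "foobar", "foo\nbar", "foobar\n", "AAA*\r\n";
```

Calling show($line) inside the read loop would reveal whether anything sits between the asterisk and the newline.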


    Give a man a fish:  <%-{-{-{-<

Re: Removing trailing asterisk from lines with regex
by Anonymous Monk on Apr 14, 2017 at 21:03 UTC
    $ matches at the end of the string *or* just before a newline at the end of the string. Use Data::Dumper to look at $line; it sounds like there is extra whitespace at the end.
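
A sketch of how extra whitespace could produce exactly the three observations in the question. It assumes, without confirmation from the original post, that the file has Windows line endings (\r\n) and is read where they are not translated. strip_with() is a hypothetical helper name.

```perl
use strict;
use warnings;

# strip_with() is a hypothetical helper: it applies one substitution
# pattern to a copy of the line and returns the result.
sub strip_with {
    my ($line, $pattern) = @_;
    (my $copy = $line) =~ s/$pattern//;
    return $copy;
}

# The three patterns tried in the question.
my @patterns = (qr/\*$/, qr/\*\s$/, qr/\*\n$/);

# Assumed sample lines: one with a Unix ending, one with a Windows ending.
my %inputs = (unix => "AAA*\n", crlf => "AAA*\r\n");

for my $name (sort keys %inputs) {
    for my $re (@patterns) {
        my $result = strip_with($inputs{$name}, $re);
        my $status = $result ne $inputs{$name} ? 'changed' : 'unchanged';
        printf "%-5s %-12s %s\n", $name, $re, $status;
    }
}
```

With the \r\n line, only the \s version changes anything: \s matches the \r, and $ then matches just before the final \n. The other two patterns fail because the character after the asterisk is \r rather than a newline or the end of the string.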
Re: Removing trailing asterisk from lines with regex
by Anonymous Monk on Apr 14, 2017 at 21:17 UTC
    my @lines = ("foo*\n", "foo* \n", "foo*  \n");
    for my $line (@lines) {
        $line =~ s/\*\s$//;
        print "($line)\n";
    }
    For the first input, the * and newline are removed. For the second input, the * and trailing space are removed, but the newline is left there. For the third input, nothing is removed. You might want to use this substitution instead:
    $line =~ s/\*\s*$//;
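
Note that \s* also consumes the line's newline, so lines that had an asterisk lose their line ending while other lines keep theirs. One possible variation, sketched below, strips all trailing whitespace from every line and adds the newline back when printing. This is a swapped-in alternative, not the substitution suggested above; clean_line() is a hypothetical helper name and the sample sequences are made up.

```perl
use strict;
use warnings;

# clean_line() is a hypothetical helper. It removes one optional trailing
# asterisk plus any trailing whitespace (spaces, "\r", "\n"); the caller is
# expected to add the newline back when printing.
sub clean_line {
    my ($line) = @_;
    $line =~ s/\*?\s*\z//;
    return $line;
}

# Made-up sample data with mixed line endings, read through an in-memory
# filehandle so the sketch needs no external file.
my $data = "MKV*\nMKT\r\nMKA* \r\n";
open my $in, '<', \$data or die "Cannot open in-memory file: $!";

my @cleaned;
while (my $line = <$in>) {
    push @cleaned, clean_line($line);
}
close $in;

print "$_\n" for @cleaned;
```

An asterisk in the middle of a line is left alone, because the pattern is anchored to the end of the string with \z.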
