http://www.perlmonks.org?node_id=11122798

flieckster has asked for the wisdom of the Perl Monks concerning the following question:

hello, i have a script that calls an URL, and returns characters like this "654321_1111" (quotes are part of the return). what is the best way to remove the first and last character, or remove the "" from the variable? I've tried chomp but that only works on the first character, and i've thought of using substr but i can't be sure that the number above in quotes will always be the same number of characters. can someone recommend a good path to start on? thank you in advance.

Replies are listed 'Best First'.
Re: remove first and last character of string
by GrandFather (Saint) on Oct 14, 2020 at 03:12 UTC
    use strict;
    use warnings;

    my $str = q{"654321_1111"};
    $str =~ s/^"|"$//g;
    print "'$str'\n";

    Prints:

    '654321_1111'

    Note that the single quotes are added in the print to demonstrate that there are no "hidden" characters like spaces or line breaks included in the string.

    Update: fix $str quoting issue pointed out by LanX.

    Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond

        Indeed, or something like that.

        Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
Re: remove first and last character of string
by syphilis (Archbishop) on Oct 14, 2020 at 02:26 UTC
    i've thought of using substr but i can't be sure that the number above in quotes will always be the same number of characters

    You can use substr() without knowing the number of characters.
    To remove the first character of $str: substr($str,  0, 1, '')
    To remove the last character of $str : substr($str, -1, 1, '')
    Or remove the last character of $str : substr($str, length($str) - 1, 1, '')
    Or, just use chop() to remove the last character, as you've already noted.
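
    A minimal sketch of the four-argument substr() calls applied one after the other, assuming the string always starts and ends with a quote (nothing here checks that). The fourth argument is the replacement text, so the third argument (the length, 1) must be present:

```perl
use strict;
use warnings;

my $str = q{"654321_1111"};

# substr(EXPR, OFFSET, LENGTH, REPLACEMENT): replace LENGTH characters
# starting at OFFSET with REPLACEMENT, modifying $str in place.
substr($str,  0, 1, '');    # drop the first character
substr($str, -1, 1, '');    # drop the last character (negative offset counts from the end)

print "'$str'\n";
```

    Both calls remove whatever character is at that position, quote or not.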

    Cheers,
    Rob
      Thanks Rob, but is there a way to use the "Or remove the last character of $str" form, i.e. substr() with length($str) - 1, to remove the first and last characters in one line?

        Win8 Strawberry 5.8.9.5 (32) Tue 10/13/2020 23:18:23
        C:\@Work\Perl\monks
        >perl -Mstrict -Mwarnings
        my $s = '"654321_1111"';
        print "'$s' \n";
        $s = substr $s, 1, -1;
        print "'$s' \n";
        ^Z
        '"654321_1111"'
        '654321_1111'


        Give a man a fish:  <%-{-{-{-<

Re: remove first and last character of string - benchmarks
by haukex (Archbishop) on Oct 14, 2020 at 12:27 UTC

    I thought it might be interesting to benchmark the various solutions:

    #!/usr/bin/env perl
    use warnings;
    use strict;
    use Benchmark qw/cmpthese/;
    use JSON::PP ();
    use Cpanel::JSON::XS ();

    my $IN  = '"654321_1111"';
    my $EXP = '654321_1111';
    use constant TEST=>0;
    print "Testing is ", TEST?"enabled":"disabled", "\n";

    my $json_pp = JSON::PP->new->allow_nonref;
    my $json_xs = Cpanel::JSON::XS->new->allow_nonref;

    cmpthese(-2, {
        regex_strip => sub { # GrandFather
            (my $out = $IN) =~ s/^"|"$//g;
            $out eq $EXP or die $out if TEST;
        },
        regex_match => sub { # BillKSmith
            my ($out) = $IN =~ /^\"(.+)\"$/;
            $out eq $EXP or die $out if TEST;
        },
        tr_strip => sub { # kcott and hippo
            (my $out = $IN) =~ y/"//d;
            $out eq $EXP or die $out if TEST;
        },
        substr => sub { # syphilis and AnomalousMonk
            my $out = substr $IN, 1, -1;
            $out eq $EXP or die $out if TEST;
        },
        reverse => sub { # rsFalse
            my $out = $IN;
            for (1..2) { $out = reverse $out; chop $out; }
            $out eq $EXP or die $out if TEST;
        },
        json_pp => sub { # haukex
            my $out = $json_pp->decode($IN);
            $out eq $EXP or die $out if TEST;
        },
        json_xs => sub { # haukex
            my $out = $json_xs->decode($IN);
            $out eq $EXP or die $out if TEST;
        },
    });

    __END__
    Testing is disabled
                      Rate json_pp regex_strip regex_match reverse json_xs tr_strip substr
    json_pp       161266/s      --        -85%        -96%    -96%    -97%     -97%   -99%
    regex_strip  1081962/s    571%          --        -73%    -74%    -81%     -81%   -91%
    regex_match  3989148/s   2374%        269%          --     -2%    -29%     -30%   -67%
    reverse      4091430/s   2437%        278%          3%      --    -28%     -28%   -66%
    json_xs      5654082/s   3406%        423%         42%     38%      --      -0%   -53%
    tr_strip     5663601/s   3412%        423%         42%     38%      0%       --   -53%
    substr      12107824/s   7408%       1019%        204%    196%    114%     114%     --

    However, it's also very noteworthy that all of the solutions have their caveats, as already highlighted by AnomalousMonk and hippo:

    • regex_strip will also remove unbalanced quotes, as in q{"foo} becomes q{foo}
    • regex_match will also remove quotes from things that - depending on your input format - may not be syntactically valid, like q{"foo"bar"} becomes q{foo"bar}
    • tr_strip will remove all quotes in the string no matter where they appear, i.e. q{"foo""bar"} becomes q{foobar}
    • substr and reverse will remove the first and last character unconditionally, i.e. q{foobar} becomes q{ooba}
    • json_pp and json_xs only apply if the input format is actually JSON or something that can be parsed as if it were, e.g. they will refuse to parse q{"foo"bar"}
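
    The first four caveats can be reproduced with a short script. This is only a sketch: the sub names (regex_strip, regex_match, tr_strip, substr_strip) are made up here to mirror the benchmark labels, and each sub works on a copy of its argument.

```perl
use strict;
use warnings;

# Hypothetical wrappers around the approaches benchmarked above.
sub regex_strip  { my ($s) = @_; $s =~ s/^"|"$//g; return $s }
sub regex_match  { my ($s) = @_; my ($out) = $s =~ /^"(.+)"$/; return $out }
sub tr_strip     { my ($s) = @_; $s =~ tr/"//d; return $s }
sub substr_strip { my ($s) = @_; return substr($s, 1, -1) }

print regex_strip(q{"foo}),        "\n";   # unbalanced quote is removed too: foo
print regex_match(q{"foo"bar"}),   "\n";   # inner quote is kept: foo"bar
print tr_strip(q{"foo""bar"}),     "\n";   # every quote is removed: foobar
print substr_strip(q{foobar}),     "\n";   # no quotes at all, still trimmed: ooba
```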

    This underscores what others have already said: it would be better if you could better specify your input format.

      Which nicely demonstrates that getting the code correct is vastly more important than fussing over execution speed. In the context of data fetched across the interwebs, any of the proposed solutions is many times faster than it would need to be to have any impact on the efficacy of the script.

      Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
Re: remove first and last character of string (updated)
by haukex (Archbishop) on Oct 14, 2020 at 06:48 UTC
    i have a script that calls an URL, and returns characters like this "654321_1111" (quotes are part of the return).

    Many web APIs return JSON. If that is the case here, you can use decode_json from one of the JSON modules, such as Cpanel::JSON::XS or JSON::PP; the latter has been in core since Perl 5.14.

    Update: In JSON::PP versions before 4.0, that is, those distributed with Perl 5.28 and before, the allow_nonref option must be set to decode a string. That is, instead of decode_json(q{"654321_1111"}), one has to write JSON::PP->new->allow_nonref->decode(q{"654321_1111"}).
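
    A minimal sketch of the object-oriented form, assuming the response body really is a JSON string. Setting allow_nonref explicitly should behave the same on older and newer JSON::PP versions, and because decode() dies on malformed input, the second call is wrapped in eval:

```perl
use strict;
use warnings;
use JSON::PP ();

# allow_nonref lets the decoder accept a bare JSON string,
# not only an array or object.
my $json = JSON::PP->new->allow_nonref;

my $value = $json->decode(q{"654321_1111"});
print "'$value'\n";

# Input that is not valid JSON makes decode() die.
my $ok = eval { $json->decode(q{"foo"bar"}); 1 };
print $ok ? "parsed\n" : "rejected\n";
```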

      ... and if the response is JSON, then you absolutely should not try to do anything with it until you have "decoded" it. Apply your regex-magic only to members of the resulting data structure.
Re: remove first and last character of string
by kcott (Archbishop) on Oct 14, 2020 at 09:09 UTC

    G'day flieckster,

    "what is the best way to remove the first and last character, or remove the "" from the variable?" [my emphasis]

    If you only need to remove the leading and trailing quotes, and the example data you provided is representative, i.e. no embedded quotes, the easiest and most efficient way to do this would be by using transliteration:

    $string =~ y/"//d

    Here's a quick, yet complete, command line example:

    $ perl -E 'my $x = q{"654321_1111"}; say $x; $x =~ y/"//d; say $x'
    "654321_1111"
    654321_1111

    [Aside: In his book, "Perl Best Practices", Damian Conway used the term transobliteration to describe this particular usage.]

    — Ken

Re: remove first and last character of string
by BillKSmith (Monsignor) on Oct 14, 2020 at 01:40 UTC
    You can use a regex.
    use strict;
    use warnings;
    use Test::More tests => 1;

    my $string   = q/"654321_1111"/;
    my $regex    = qr/^\"(.+)\"$/;
    my $expected = q/654321_1111/;
    (my $got) = $string =~ m/$regex/;
    is( $got, $expected, q/Remove quotes/ );

    Output:

    1..1
    ok 1 - Remove quotes
    Bill

      It's not clear to me if flieckster intends to deal only with strings like '"foo"' (from which it is clear that 'foo' should be extracted), or if he or she may also be dealing with strings like 'foo' '"foo' 'foo"' 'f"o"o' etc., i.e., strings not having double-quotes at both the start and end of the string.

      In the latter case, it should be noted that
          qr/^\"(.+)\"$/
      will not match and will return an empty list, leaving $got undefined.
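
      One way to guard against that, sketched here with a hypothetical helper named strip_balanced, is to test the list assignment itself and fall back to the original string when the match fails:

```perl
use strict;
use warnings;

# Hypothetical helper: return the text between balanced outer quotes,
# or the input unchanged if the pattern does not match.
sub strip_balanced {
    my ($string) = @_;
    # A list assignment in boolean context is true only when the
    # match returned at least one capture, i.e. when it succeeded.
    if ( my ($got) = $string =~ /^"(.+)"$/ ) {
        return $got;
    }
    return $string;
}

print strip_balanced($_), "\n" for q{"654321_1111"}, q{654321_1111}, q{"foo};
```

      Whether returning the input unchanged is the right fallback depends on what the caller expects; dying or returning undef are equally plausible choices.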


      Give a man a fish:  <%-{-{-{-<

      Thanks Bill, i like the idea of using Regex, but the example provided isn't removing quotes, it seems to remove most of the string?

        ... the example provided isn't removing quotes, it seems to remove most of the string ...
        I don't understand. The example code removes balanced quotes from the ends of a string. What return do you want from '"654321_1111"'?

        Update: Note that the substr solution (substr $s, 1, -1) removes any characters from the ends of a string, whereas the qr/^\"(.+)\"$/ match removes only balanced double-quotes from the ends of a string, and the s/^"|"$//g substitution removes only double-quotes, balanced or not, from the ends of a string. It's a question of exactly what you want.


        Give a man a fish:  <%-{-{-{-<

        My test proves that my solution satisfies the only test-case you have provided. Please replace my $string and $expected values with a case that fails. Post the complete test. Perhaps then, we can understand your problem. If you really want to unconditionally remove the first and last character of the string, I like the AnomalousMonk solution of substr with a length of -1. (I never would have thought of that myself.)
        Bill
Re: remove first and last character of string
by AnomalousMonk (Archbishop) on Oct 14, 2020 at 17:17 UTC

    As with others who have commented in this thread, it's not clear to me just what flieckster wants to achieve.

    If, and it's a big if, the aim is to remove double-quotes only when they are paired at both the start and end of the string and never in any other circumstance, then
        qr{ (?| \A " (.*) " \z | (.*)) }xms  # needs 5.10+
    will do the trick. With this regex,
       '""'  '""""'  '"foo"'  '"fo"o"'  '"f"o"o"'
    become
        ''    '""'    'foo'    'fo"o'    'f"o"o'
    respectively, while strings like
        ''  '"'  '"foo'  'foo"'  'f"oo'  'f"o"o'
    are unchanged.

    Note that this regex needs Perl version 5.10+ because it uses the (?|...) branch reset regex extension. The regex can be made to work in pre-5.10 versions by removing the (?|...) and adding a grep defined, ... filter to the output of the regex match.
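
    Both variants can be sketched side by side. The sub names strip_reset and strip_plain are invented for this illustration; the second uses a block-form grep { defined } filter, which is equivalent to the grep defined, ... filter mentioned above:

```perl
use strict;
use warnings;

my $rx_reset = qr{ (?| \A " (.*) " \z | (.*)) }xms;   # branch reset, needs 5.10+
my $rx_plain = qr{     \A " (.*) " \z | (.*)  }xms;   # two separate capture groups

# With branch reset, whichever alternative matched fills $1.
sub strip_reset { my ($s) = @_; my ($out) = $s =~ $rx_reset; return $out }

# Without it, the match returns ($1, $2) with one of them undefined,
# so keep only the defined capture.
sub strip_plain { my ($s) = @_; my ($out) = grep { defined } $s =~ $rx_plain; return $out }

for my $s ( q{""}, q{"foo"}, q{"fo"o"}, q{"foo}, q{f"o"o} ) {
    print "'$s' -> '", strip_reset($s), "' / '", strip_plain($s), "'\n";
}
```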


    Give a man a fish:  <%-{-{-{-<

Re: remove first and last character of string
by rsFalse (Chaplain) on Oct 14, 2020 at 06:41 UTC
    To delete the first and the last character:
    for( 1 .. 2 ){ $string = reverse $string; chop $string; }