PerlMonks  

In this StackOverflow question, user "Learner" asked how to replace a set of certain letters in the first column of a file with numbers. In other words, given this set of replacements:

A=>1, B=>5, C=>6, D=>4, E=>7, F=>16, G=>10, H=>11, I=>12, K=>14, L=>13, M=>15, N=>3, P=>17, Q=>8, R=>2, S=>18, T=>19, V=>22, W=>20, Y=>21, Z=>9

And this input:

NDDDDTSVCLGTRQCSWFAGCTNRTWNSSA 0
VCLGTRQCSWFAGCTNRTWNSSAVPLIGLP 0
LTWSGNDTCLYSCQNQTKGLLYQLFRNLFC 0
CQNQTKGLLYQLFRNLFCSYGLTEAHGKWR 0
ITNDKGHDGHRTPTWWLTGSNLTLSVNNSG 0
GHRTPTWWLTGSNLTLSVNNSGLFFLCGNG 0
FLCGNGVYKGFPPKWSGRCGLGYLVPSLTR 0
KGFPPKWSGRCGLGYLVPSLTRYLTLNASQ 0
QSVCMECQGHGERISPKDRCKSCNGRKIVR 1

The expected output is:

3 4 4 4 4 19 18 22 6 13 10 19 2 8 6 18 20 16 1 10 6 19 3 2 19 20 3 18 18 1
22 6 13 10 19 2 8 6 18 20 16 1 10 6 19 3 2 19 20 3 18 18 1 22 17 13 12 10 13 17
13 19 20 18 10 3 4 19 6 13 21 18 6 8 3 8 19 14 10 13 13 21 8 13 16 2 3 13 16 6
6 8 3 8 19 14 10 13 13 21 8 13 16 2 3 13 16 6 18 21 10 13 19 7 1 11 10 14 20 2
12 19 3 4 14 10 11 4 10 11 2 19 17 19 20 20 13 19 10 18 3 13 19 13 18 22 3 3 18 10
10 11 2 19 17 19 20 20 13 19 10 18 3 13 19 13 18 22 3 3 18 10 13 16 16 13 6 10 3 10
16 13 6 10 3 10 22 21 14 10 16 17 17 14 20 18 10 2 6 10 13 10 21 13 22 17 18 13 19 2
14 10 16 17 17 14 20 18 10 2 6 10 13 10 21 13 22 17 18 13 19 2 21 13 19 13 3 1 18 8
8 18 22 6 15 7 6 8 10 11 10 7 2 12 18 17 14 4 2 6 14 18 6 3 10 2 14 12 22 2

Here are my two solutions:

$ perl '-M5;%h=map{$_,++$i}split//,"ARNDBCEQZGHILKMFPSTWYV"' -alpe '($_=$F[0])=~s/[A-Z]/$h{$&} /g'
$ perl -alpe '($_=$F[0])=~s/[A-Z]/(index("ARNDBCEQZGHILKMFPSTWYV",$&)+1)." "/ge'
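For anyone who wants to follow what the golfed versions do, here is an ungolfed sketch of the first (hash-based) approach. As I read it, -a splits each line into @F, -l handles the newlines, -p prints $_ after each line, and the '-M5;...' trick turns into "use 5; %h=..." so the lookup table is built once up front. The sub name encode and the variable names are my own for illustration and not from the original post.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# The 1-based position of a letter in this string is its number: A=>1, R=>2, N=>3, ...
my $order = "ARNDBCEQZGHILKMFPSTWYV";
my $i     = 0;
my %num   = map { $_ => ++$i } split //, $order;

# encode() is a hypothetical helper name, not part of the original one-liners.
sub encode {
    my ($line) = @_;
    my ($first) = split ' ', $line;    # like $F[0] under -a: first whitespace-separated field
    return '' unless defined $first;
    # Like the one-liners, this touches every capital letter; a letter that is
    # missing from %num would produce an "uninitialized value" warning here.
    (my $out = $first) =~ s/([A-Z])/$num{$1} /g;
    return $out;                       # keeps the trailing space, as the originals do
}

# Prints the first line of the expected output, plus one trailing space.
print encode("NDDDDTSVCLGTRQCSWFAGCTNRTWNSSA 0"), "\n";
```

The second one-liner computes the same number with index(...)+1 instead of a hash lookup, which is why it needs the /e modifier.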

WebPerl link

I personally consider the extra space character at the end of each line produced by my solutions acceptable (I think diff -b is probably ok too). Unfortunately, the OP didn't specify what should happen if the input strings contain letters that aren't in the set, so I guess "bonus points" for solutions that only affect [A-IK-NP-TV-WY-Z] instead of [A-Z] like my solutions do. Bonus question: Can anyone come up with a short, preferably pure Perl, solution to produce such a regex character set for any given list of letters?

$ echo "ARNDBCEQZGHILKMFPSTWYV" | perl -MSet::IntSpan -ple ' $_=Set::IntSpan->new([map{ord}split//,$_])->run_list; s/\d+/chr$&/eg;s/,//g;$_="[$_]"'
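For comparison, here is a non-golfed sketch that builds the same kind of character set without Set::IntSpan, by sorting the character codes and collapsing consecutive ones into runs. The sub name char_class is hypothetical, and the sketch assumes the input consists of plain letters or digits only: characters that are special inside a bracketed class (such as ], ^, \ or -) would need escaping, which it does not attempt.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# char_class() is a hypothetical helper name for this sketch.
sub char_class {
    my ($letters) = @_;
    my %seen;
    # Unique character codes in ascending order.
    my @codes = sort { $a <=> $b } grep { !$seen{$_}++ } map { ord } split //, $letters;

    my @runs;    # each run is [first_code, last_code]
    for my $c (@codes) {
        if (@runs && $c == $runs[-1][1] + 1) {
            $runs[-1][1] = $c;         # extend the current run
        }
        else {
            push @runs, [ $c, $c ];    # start a new run
        }
    }

    # A single character is emitted as-is, a longer run as "first-last".
    return '[' . join('', map {
        my ($lo, $hi) = @$_;
        $lo == $hi ? chr($lo) : chr($lo) . '-' . chr($hi);
    } @runs) . ']';
}

my $class = char_class("ARNDBCEQZGHILKMFPSTWYV");
print "$class\n";                      # [A-IK-NP-TV-WY-Z]

# Using the generated class leaves letters outside the set alone.
(my $demo = "AJZ") =~ s/$class/#/g;
print "$demo\n";                       # #J#
```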

Have at it ;-)

Update: Thank you for the inspired and inspiring responses so far, Discipulus, Eily, rsFalse, Veltro, and vr! I really enjoy the creativity in the solutions :-)


In reply to A little golfing challenge: Replacing letters with numbers by haukex
