haukex has asked for the wisdom of the Perl Monks concerning the following question:
In this StackOverflow question, user "Learner" asked how to replace a set of certain letters in the first column of a file with numbers. In other words, given this set of replacements:
A=>1, B=>5, C=>6, D=>4, E=>7, F=>16, G=>10, H=>11, I=>12, K=>14, L=>13, M=>15, N=>3, P=>17, Q=>8, R=>2, S=>18, T=>19, V=>22, W=>20, Y=>21, Z=>9
And this input:
NDDDDTSVCLGTRQCSWFAGCTNRTWNSSA 0
VCLGTRQCSWFAGCTNRTWNSSAVPLIGLP 0
LTWSGNDTCLYSCQNQTKGLLYQLFRNLFC 0
CQNQTKGLLYQLFRNLFCSYGLTEAHGKWR 0
ITNDKGHDGHRTPTWWLTGSNLTLSVNNSG 0
GHRTPTWWLTGSNLTLSVNNSGLFFLCGNG 0
FLCGNGVYKGFPPKWSGRCGLGYLVPSLTR 0
KGFPPKWSGRCGLGYLVPSLTRYLTLNASQ 0
QSVCMECQGHGERISPKDRCKSCNGRKIVR 1
The expected output is:
3 4 4 4 4 19 18 22 6 13 10 19 2 8 6 18 20 16 1 10 6 19 3 2 19 20 3 18 18 1
22 6 13 10 19 2 8 6 18 20 16 1 10 6 19 3 2 19 20 3 18 18 1 22 17 13 12 10 13 17
13 19 20 18 10 3 4 19 6 13 21 18 6 8 3 8 19 14 10 13 13 21 8 13 16 2 3 13 16 6
6 8 3 8 19 14 10 13 13 21 8 13 16 2 3 13 16 6 18 21 10 13 19 7 1 11 10 14 20 2
12 19 3 4 14 10 11 4 10 11 2 19 17 19 20 20 13 19 10 18 3 13 19 13 18 22 3 3 18 10
10 11 2 19 17 19 20 20 13 19 10 18 3 13 19 13 18 22 3 3 18 10 13 16 16 13 6 10 3 10
16 13 6 10 3 10 22 21 14 10 16 17 17 14 20 18 10 2 6 10 13 10 21 13 22 17 18 13 19 2
14 10 16 17 17 14 20 18 10 2 6 10 13 10 21 13 22 17 18 13 19 2 21 13 19 13 3 1 18 8
8 18 22 6 15 7 6 8 10 11 10 7 2 12 18 17 14 4 2 6 14 18 6 3 10 2 14 12 22 2
Here are my two solutions:
$ perl '-M5;%h=map{$_,++$i}split//,"ARNDBCEQZGHILKMFPSTWYV"' -alpe '($_=$F[0])=~s/[A-Z]/$h{$&} /g'
$ perl -alpe '($_=$F[0])=~s/[A-Z]/(index("ARNDBCEQZGHILKMFPSTWYV",$&)+1)." "/ge'
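For anyone who'd rather read it ungolfed, here is a rough sketch of the idea behind both one-liners: a letter's 1-based position in "ARNDBCEQZGHILKMFPSTWYV" is its number. The helper name `encode_first_column` is made up for illustration, and passing unmapped letters through unchanged is purely my assumption, since the OP didn't specify that case. Unlike the one-liners, this version joins the numbers with single spaces, so there is no trailing space.

```perl
use strict;
use warnings;

# Letter order taken from the one-liners: a letter's 1-based position
# in this string is the number it gets replaced with.
my $order = "ARNDBCEQZGHILKMFPSTWYV";
my $i     = 0;
my %num   = map { $_ => ++$i } split //, $order;

# Hypothetical helper: translate the first whitespace-separated column of a line.
sub encode_first_column {
    my ($line) = @_;
    my ($col) = split ' ', $line;
    return '' unless defined $col;
    # Assumption: letters missing from %num are passed through unchanged.
    return join ' ', map { $num{$_} // $_ } split //, $col;
}

# First line of the sample input
print encode_first_column("NDDDDTSVCLGTRQCSWFAGCTNRTWNSSA 0"), "\n";
# prints: 3 4 4 4 4 19 18 22 6 13 10 19 2 8 6 18 20 16 1 10 6 19 3 2 19 20 3 18 18 1
```

To run it over a file, something like `print encode_first_column($_), "\n" while <>;` in place of the demo line should do.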
I personally consider the extra space character at the end of each line produced by my solutions acceptable (I think diff -b is probably ok with it too). Unfortunately the OP didn't specify what should happen if the input strings contain letters that aren't in the set, so I guess "bonus points" for solutions that only affect [A-IK-NP-TV-WY-Z] instead of [A-Z] like my solutions do.
Bonus question: Can anyone come up with a short, preferably pure Perl, solution to produce such a regex character set for any given list of letters? Here is my attempt, which is not pure Perl since it uses Set::IntSpan:
$ echo "ARNDBCEQZGHILKMFPSTWYV" | perl -MSet::IntSpan -ple ' $_=Set::IntSpan->new([map{ord}split//,$_])->run_list; s/\d+/chr$&/eg;s/,//g;$_="[$_]"'
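As a starting point for the pure Perl variant, here is an ungolfed sketch that does what the Set::IntSpan version does: sort the code points, collapse consecutive ones into runs, and print each run as a range. The name `char_class` is just something I made up, and it assumes the input consists of plain letters; it makes no attempt to escape characters like `]`, `^`, `-` or `\` that are special inside a character class.

```perl
use strict;
use warnings;

# Hypothetical helper: build a regex character class such as [A-IK-NP-TV-WY-Z]
# from a string of letters, collapsing consecutive code points into ranges.
# Assumption: the input contains only plain letters (nothing is escaped).
sub char_class {
    my ($letters) = @_;
    my %seen;
    # Unique code points in ascending order
    my @codes = sort { $a <=> $b } grep { !$seen{$_}++ } map { ord } split //, $letters;
    my @runs;    # each run is [first, last]
    for my $c (@codes) {
        if (@runs && $c == $runs[-1][1] + 1) {
            $runs[-1][1] = $c;        # extend the current run
        }
        else {
            push @runs, [$c, $c];     # start a new run
        }
    }
    return '[' . join('', map {
        my ($lo, $hi) = @$_;
        $lo == $hi ? chr($lo) : chr($lo) . '-' . chr($hi);
    } @runs) . ']';
}

print char_class("ARNDBCEQZGHILKMFPSTWYV"), "\n";    # prints [A-IK-NP-TV-WY-Z]
```

It is nowhere near golfed, of course, so there should be plenty of room left to shorten it.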
Have at it ;-)
Update: Thank you for the inspired and inspiring responses so far, Discipulus, Eily, rsFalse, Veltro, and vr! I really enjoy the creativity in the solutions :-)
Replies are listed 'Best First'.
Re: A little golfing challenge: Replacing letters with numbers (edited)
by Eily (Monsignor) on Feb 21, 2019 at 09:46 UTC
by rsFalse (Chaplain) on Feb 21, 2019 at 15:54 UTC
by Eily (Monsignor) on Feb 22, 2019 at 11:06 UTC
Re: A little golfing challenge: Replacing letters with numbers -- oneliner
by Discipulus (Canon) on Feb 21, 2019 at 09:40 UTC
Re: A little golfing challenge: Replacing letters with numbers (edit)
by Veltro (Hermit) on Feb 21, 2019 at 19:37 UTC
by rsFalse (Chaplain) on Feb 21, 2019 at 21:13 UTC
Re: A little golfing challenge: Replacing letters with numbers
by rsFalse (Chaplain) on Feb 21, 2019 at 17:27 UTC
Re: A little golfing challenge: Replacing letters with numbers
by vr (Curate) on Feb 22, 2019 at 11:13 UTC