http://www.perlmonks.org?node_id=353072

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Is there a simple line of Perl code that will parse a string and get rid of any letters that appear twice? Like so:
Appear = Apear
It removed the extra p.

janitored by ybiC: Retitle from "Repeats" since one-word nodetitles be eeevil, and tweak format for legibility

Re: Remove repeated characters from a string
by japhy (Canon) on May 13, 2004 at 14:48 UTC
    If you mean "bookkeeper" should be "bokeper", then there's a really fast way to do so. This would be collapsing adjacent duplicate letters into just one, by the way. That's a more accurate description of the problem.
        $string =~ tr/a-zA-Z//s;
    Look up tr/// in the perlop documentation.
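A minimal sketch of the tr///s approach described above, assuming only ASCII letters should be squeezed. The helper name squeeze_letters is hypothetical and not from the thread; it works on a copy so the caller's string is left alone.

```perl
use strict;
use warnings;

# Squeeze each run of the same ASCII letter down to a single character.
# With an empty replacement list, tr/// maps every letter to itself,
# and the /s modifier collapses runs of the same resulting character.
sub squeeze_letters {
    my ($string) = @_;
    (my $copy = $string) =~ tr/a-zA-Z//s;
    return $copy;
}

print squeeze_letters('bookkeeper'), "\n";    # bokeper
print squeeze_letters('Appear  now'), "\n";   # Apear  now (spaces are not in the list)
```

Note that this only touches adjacent repeats, and that "A" and "a" count as different characters.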
    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a job (NYC-area)
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;
Re: Remove repeated characters from a string
by Limbic~Region (Chancellor) on May 13, 2004 at 14:44 UTC
    Anonymous Monk,
    The short answer is yes. The long answer is yes, but which simple line of code all depends on what you mean. That is why it is important not to leave ambiguity in your questions. It is frustrating to have people answer questions you did not ask.
    • Do you want to reduce only letters appearing twice that are together?
    • Do you want to reduce letters appearing twice that appear anywhere in the string?
    • Do you only want to perform the action if the letters appear exactly twice, but not greater than twice?
    So as not to leave you with all questions and no answers, take a look at this small snippet. It should help if what you are going for is the second option. It is intentionally "wordy" so that the process is more obvious.
    #!/usr/bin/perl
    use strict;
    use warnings;

    my $string = 'one bright day in the middle of the night';
    my (%seen, $new_string);
    for ( split // , $string ) {
        next if $seen{$_};     # skip any character already kept
        $seen{$_}++;
        $new_string .= $_;
    }
    print "$new_string\n";
    Cheers - L~R
      I wish to remove all repeating letters, no matter where they are. I am generating a key, and it can have no double letters. At most there can be 26 letters, all unique.
        Anonymous Monk,
        Then it sounds like the code I provided is on the right track. It would need to be modified a bit to take into account your additional requirements. Again, this is why it is important to explain your problem without assuming things are just "understood".
        #!/usr/bin/perl
        use strict;
        use warnings;

        my $string = 'One bright day in the middle of The Night';
        my (%seen, $new_string);
        for my $char ( split // , $string ) {
            $char = lc $char;                                   # treat 'T' and 't' as the same letter
            next if $seen{$char} || $char !~ /^[a-z]$/;         # skip repeats and non-letters
            $seen{$char}++;
            $new_string .= $char;
            last if length $new_string == 26;                   # all 26 letters used
        }
        print "$new_string\n";
        I hope this helps - L~R
        Interesting, ...

        Why is it that the key may not have repeating characters?

        CountZero

        "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: Remove repeated characters from a string
by BrowserUk (Patriarch) on May 13, 2004 at 15:20 UTC

    If you want to remove all duplicate letters, not just adjacent ones, this does it without any loops (external to the regex engine).

    print $s;
    abacadaeafabacadaeafabacadaeafabacadaeafabacadaeafabacadaeaf
    $s =~ s[(.)(?=.*?\1)][]g;
    print $s;
    bcdeaf

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
      Don't know if you wrote this before or after the clarification; but this removes all but the last, where it seems all but the first is the desired behavior.

        You're right, I missed the clarification. I guess you could do a scalar reverse before and after, but that kinda takes the shine off the solution.
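The "scalar reverse before and after" idea could be sketched roughly as follows. The name keep_first is hypothetical; the substitution is the lookahead regex from the parent post, applied to the reversed string so that the surviving character is the first occurrence in the original.

```perl
use strict;
use warnings;

# Keep only the first occurrence of each character.
sub keep_first {
    my ($s) = @_;
    my $r = reverse $s;            # scalar context: reverses the characters
    $r =~ s/(.)(?=.*?\1)//g;       # drop every character that appears again later
    return scalar reverse $r;      # flip back to the original order
}

print keep_first('abacadaeaf'), "\n";   # abcdef
print keep_first('Appear'), "\n";       # Apear ("A" and "a" are distinct)
```

As in the original regex, the comparison is case-sensitive and "." does not match a newline.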


Re: Remove repeated characters from a string
by Tomte (Priest) on May 13, 2004 at 14:41 UTC

    This works, though I don't know whether there are known arguments against using it...

    [1610]tom@margo pm $ perl -e '$t="Appear"; $t=~s/(.)\1/$1/g; print $t, "\n"'
    Apear

    regards,
    tomte


    An intellectual is someone whose mind watches itself.
    -- Albert Camus

Re: Remove repeated characters from a string
by EdwardG (Vicar) on May 13, 2004 at 14:44 UTC
Re: Remove repeated characters from a string
by knew (Monk) on May 13, 2004 at 14:49 UTC

    This bit of code will do the job:

    $string =~ s/(.)\1/$1/g

    Alternatively, if you want to ignore case when comparing, you could use:

    $string =~ s/(.)\1/$1/gi

    And finally, if you wanted to trim any number of duplicates down to a single letter, this would work:

    $string =~ s/(.)\1+/$1/gi
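To make the difference between the three substitutions above concrete, here is a small sketch. The wrapper names are hypothetical; each one applies a single regex from this post to a copy of its argument.

```perl
use strict;
use warnings;

# Hypothetical wrappers around the three substitutions above;
# each modifies its own copy of the argument and returns it.
sub collapse_pairs        { my ($s) = @_; $s =~ s/(.)\1/$1/g;   return $s }
sub collapse_pairs_nocase { my ($s) = @_; $s =~ s/(.)\1/$1/gi;  return $s }
sub collapse_runs         { my ($s) = @_; $s =~ s/(.)\1+/$1/gi; return $s }

print collapse_pairs('aaab'), "\n";              # aab (one pair collapsed, the third "a" survives)
print collapse_pairs('Aardvark'), "\n";          # Aardvark ("A" and "a" differ)
print collapse_pairs_nocase('Aardvark'), "\n";   # Ardvark
print collapse_runs('aaab'), "\n";               # ab
```

All three only look at characters that sit next to each other.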
      The duplicates are not necessarily beside each other:
      where = wher
      Expressions = Expresion
      Finished = Finshed
      indicated = indcate

        This will do that for you:

        1 while $string =~ s/(.)(.*)\1/$1$2/g

        If you had put more effort into your first question you would not now be explaining what you really meant.
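A quick way to check that loop against the examples given earlier in this sub-thread is sketched below. The name dedupe_first is hypothetical; the body is the one-liner above, unchanged apart from the added semicolon.

```perl
use strict;
use warnings;

# Repeatedly remove a later copy of a character until none remain,
# which keeps the first occurrence of each character.
sub dedupe_first {
    my ($string) = @_;
    1 while $string =~ s/(.)(.*)\1/$1$2/g;
    return $string;
}

for my $word (qw(where Expressions Finished indicated)) {
    printf "%s = %s\n", $word, dedupe_first($word);
}
# where = wher
# Expressions = Expresion
# Finished = Finshed
# indicated = indcate
```

The match is case-sensitive, so "E" and "e" in "Expressions" are treated as different characters, which agrees with the requested output.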