
Scrubbing a string

by princepawn (Parson)
on Aug 24, 2007 at 20:58 UTC (#634963)
princepawn has asked for the wisdom of the Perl Monks concerning the following question:

I want to "scrub" a second string of all characters which are not in the first string. Thus:
scrub 'pure', 'p!u-r-*e'; # returns pure
Here is the code I wrote. It works, but I was hoping there was a module or something which already did this.
    sub string_clean {
        my ($pure, $dirty) = @_;
        my @pure  = (split //, $pure);
        my @dirty = (split //, $dirty);
        my %pure  = map { ($_ => 1) } @pure;
        my @cleaned = grep { $pure{$_} } @dirty;
        return join '', @cleaned;
    }


Staring at the sample code, I'm thinking how slick it would look to call the function like this:
scrub 'impure' => 'pure' ; # yields 'pure'
You know... use the arrow to show the transition from dirty to clean... real English-like.
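
A minimal sketch of that calling style, assuming the argument order is flipped so the dirty string comes first and the allowed characters second. The name `scrub` comes from the question; the hash-lookup body is the same approach as `string_clean` above, just condensed. Because the sub is declared before the call, the parentheses can be dropped.

```perl
use strict;
use warnings;

# Dirty string first, allowed characters second, so the call reads left to right.
sub scrub {
    my ($dirty, $pure) = @_;
    # Build a lookup table of the allowed characters.
    my %allowed = map { $_ => 1 } split //, $pure;
    # Keep only the characters of $dirty that appear in the table.
    return join '', grep { $allowed{$_} } split //, $dirty;
}

my $clean = scrub 'impure' => 'pure';
print "$clean\n";    # pure
```

The fat comma here is only a visual cue; with two quoted strings it behaves exactly like an ordinary comma.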

Carter's compass: I know I'm on the right track when by deleting something, I'm adding functionality

Replies are listed 'Best First'.
Re: Scrubbing a string
by moritz (Cardinal) on Aug 24, 2007 at 21:03 UTC

    Use regexes with char classes:

    #!/usr/bin/perl
    use warnings;
    use strict;

    my $allowed = "pure";
    my $re      = quotemeta $allowed;
    my $str     = "p u+r-e";
    $str =~ s/[^$re]//g;
    print $str, "\n";
      I always believed that there wasn't interpolation inside a character class.


        I would have thought the same. But you could get around it by building your RE as a string and then putting that into the s///:
        my $allowed = "pure";
        my $re      = "[^\Q$allowed\E]";
        my $str     = "p uuu+tr-ed";
        $str =~ s/$re//g;
        print $str, "\n";
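
For what it's worth, variables do interpolate inside a character class, which is exactly why the escaping matters. A small sketch of the difference, using a made-up allowed set that contains a hyphen (the variable names `$naive` and `$safe` are only for illustration):

```perl
use strict;
use warnings;

my $allowed = 'a-c';      # intended as three literal characters: 'a', '-', 'c'
my $str     = 'a-b-c';

# Without escaping, the class becomes [^a-c], so the hyphen forms a range.
(my $naive = $str) =~ s/[^$allowed]//g;

# With \Q...\E the class becomes [^a\-c], so the hyphen is taken literally.
(my $safe = $str) =~ s/[^\Q$allowed\E]//g;

print "$naive\n";    # abc
print "$safe\n";     # a--c
```

The same hazard applies to `^`, `]` and `\` in the allowed string, so escaping is the safer default whenever that string is not a hard-coded literal.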

        Caution: Contents may have been coded under pressure.
Re: Scrubbing a string
by Limbic~Region (Chancellor) on Aug 24, 2007 at 22:31 UTC
    I have thought about this for 20 seconds. Since your solution disregards order and count, it seems that this can be generalized into saying "this set can't have any members that are in that set". If I were putting this into a module, I would likely use a tied interface to the 3 main data types (hashes, arrays, scalars).
    tie my $string, 'Set::Avoid', not => 'this';
    $string = "This is how you do it";
    print $string;    # "T  ow you do "
    It should be obvious how this would work for arrays and hashes but you would run into problems with scalars if they ended up being references.
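
A rough sketch of how the scalar case could be wired up with Perl's tie mechanism. `Set::Avoid` is the hypothetical module name from the post, not an existing module, and the `not` option and the internal field names are assumptions made for this example. Only plain strings are handled; references are not dealt with.

```perl
use strict;
use warnings;

package Set::Avoid;    # hypothetical module, defined inline for the example

sub TIESCALAR {
    my ($class, %opt) = @_;
    my $not = defined $opt{not} ? $opt{not} : '';
    # Every character listed under "not" is banned from stored values.
    my %banned = map { $_ => 1 } split //, $not;
    return bless { banned => \%banned, value => undef }, $class;
}

# Called on assignment: strip banned characters before storing.
sub STORE {
    my ($self, $value) = @_;
    $self->{value} = join '', grep { !$self->{banned}{$_} } split //, $value;
}

# Called on read: hand back the already cleaned value.
sub FETCH { $_[0]{value} }

package main;

tie my $string, 'Set::Avoid', not => 'this';
$string = "This is how you do it";
print "[$string]\n";    # [T  ow you do ]
```

The capital "T" survives because the comparison is case-sensitive, and the runs of spaces are what is left of the words that were removed entirely.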

    Cheers - L~R

Re: Scrubbing a string
by BrowserUk (Pope) on Aug 24, 2007 at 21:06 UTC
      Won't that fail, or even produce incorrect results, if $ctl includes tr metacharacters?

        Yes. But remember that $ctl in that snippet is the equivalent of the OP's $pure, not the variable being cleaned.

        Still, I guess they could legitimately contain meta chars. If the OP's $pure variable(s) are sourced from outside of his program or genuinely might contain meta characters, then a quotemeta would probably be appropriate.

        Indeed, it might be good if that were mentioned in the eval tr example in perlop.
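
The snippet under discussion is not reproduced in this copy of the thread, so the following is only a sketch of the general technique, assuming it built a `tr` operator at runtime with string eval. The name `scrub_tr` is made up for the example; `$ctl` plays the role of the allowed characters, as in the replies above.

```perl
use strict;
use warnings;

# Hypothetical helper: $ctl holds the allowed characters, $str the string to clean.
sub scrub_tr {
    my ($ctl, $str) = @_;
    # Escape '-', '\', ']' and other non-word characters so tr takes them literally.
    my $safe = quotemeta $ctl;
    # tr does not interpolate variables, so the operator is compiled via string eval.
    # /c complements the allowed set and /d deletes everything in that complement.
    # The trailing "1" keeps the eval true even when tr deletes nothing.
    eval "\$str =~ tr[$safe][]cd; 1" or die $@;
    return $str;
}

print scrub_tr('pure', 'p!u-r-*e'), "\n";    # pure
print scrub_tr('a-c',  'a-b-c'),    "\n";    # a--c
```

Without the quotemeta call, the second example would treat `a-c` as a range and return `abc` instead.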

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Scrubbing a string
by FunkyMonk (Canon) on Aug 24, 2007 at 21:05 UTC
    There's always the "evil" string eval:

    my ( $s1, $s2 ) = ( 'pure', 'pppp!u-r-*e' );
    my $re = eval "qr/[^$s1]/";
    $s2 =~ s/$re//g;
    print $s2;    # ppppure
