PerlMonks  

requesting small regex

by Anonymous Monk
on May 07, 2012 at 06:53 UTC ( #969190=perlquestion )
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi! Any chance someone could give me a small regex to split any non-alphanumeric characters from a string? Here is an example:
$str = "13%22%3%43"; @numbers = split(/regex/, $str);
Now I should have 13 22 3 43 in the @numbers array. Also, that percent sign is just an example; it could be !, *, $, etc. Thanks.

Re: requesting small regex
by Anonymous Monk on May 07, 2012 at 07:07 UTC

    Hi! Any chance someone could give me a small regex to split any non-alphanumeric characters from a string?

    You should be able to write it yourself after reading http://perldoc.perl.org/perlintro.html#Regular-expressions, in particular the subsection http://perldoc.perl.org/perlintro.html#More-complex-regular-expressions. It should take about 10 seconds.

    $ perl -MRegexp::English -le " print Regexp::English->new->non_word_char "
    (?^:\W)
    $ perl -MYAPE::Regex::Explain -le " print YAPE::Regex::Explain->new( q/\W/ )->explain "
    The regular expression:

    (?-imsx:\W)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      \W                       non-word characters (all but a-z, A-Z, 0-
                               9, _)
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------
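A rough sketch of how the choice of class changes the result of split. The sample string and variable names are made up for illustration. Note that \W treats the underscore as a word character, so a strictly "non-alphanumeric" split over ASCII data may want an explicit character class instead.

```perl
use strict;
use warnings;

# Hypothetical sample input, not taken from the question.
my $str = 'ab%13!x_9&z';

# \W+ : split on runs of non-word characters; letters, digits and _ survive.
my @by_nonword = split /\W+/, $str;

# [^A-Za-z0-9]+ : split on anything that is not an ASCII letter or digit,
# so the underscore also acts as a separator.
my @by_nonalnum = split /[^A-Za-z0-9]+/, $str;

# \D+ : split on runs of non-digits; only the digit runs survive.
# grep drops the empty leading field caused by the string starting
# with a separator ("ab%").
my @by_nondigit = grep { length } split /\D+/, $str;

print "@by_nonword\n";     # ab 13 x_9 z
print "@by_nonalnum\n";    # ab 13 x 9 z
print "@by_nondigit\n";    # 13 9
```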

      oh thanks, \D seems to have done the trick :)
Re: requesting small regex
by BrowserUk (Pope) on May 07, 2012 at 07:16 UTC

    As you probably know, \d matches a digit (and \d+ a run of them). You want the opposite, which, conveniently, Perl provides as \D. (Note also: \s & \S; \w & \W; etc.)

    So:

    print for split /\D+/, "13%22%3%43";;
    13 22 3 43

    Note: This applies to ASCII/ISO data; once you get into the world of Unicrap, you're on your own :(
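For reference, the same idea as a small standalone script. This is only a sketch: the helper name split_numbers is made up here, and the grep is an addition (not part of the one-liner above) to discard the empty first field that split returns when the string begins with a separator.

```perl
use strict;
use warnings;

# Hypothetical helper: return the digit runs found in a string.
sub split_numbers {
    my ($str) = @_;
    # Split on runs of non-digits, then drop any empty leading field
    # (split already discards empty trailing fields).
    return grep { length } split /\D+/, $str;
}

my @numbers = split_numbers('13%22%3%43');
print "@numbers\n";    # 13 22 3 43

# The separator can be any mix of non-digit characters.
my @mixed = split_numbers('!7*8$$9');
print "@mixed\n";      # 7 8 9
```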


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

      lol unicrap i shall remember that! :D p.s: also from uk.. nice wet country huh :(
        uk.. nice wet country huh :(

        Wet?! We're in the middle of a drought, don't'ya know :)

        (Ya gotta love the optimism of the supermarkets advertising their barBQ stuff on TV :)



      How about something like the one below? Besides digits, it also allows letters:
      perl -e '$a = "ab%13!3&z* Trial"; $a=~s/(\w+)\W+/$1/g; print $a;'
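One thing worth noting about the one-liner: the substitution deletes the separators and leaves a single joined string, whereas split returns the pieces as a list. A small sketch comparing the two on the same input (the variable names are illustrative; $input is used instead of $a because $a is also sort's special variable):

```perl
use strict;
use warnings;

my $input = 'ab%13!3&z* Trial';

# Substitution, as in the one-liner: each run of non-word characters
# that follows a word is removed, gluing the pieces together.
(my $joined = $input) =~ s/(\w+)\W+/$1/g;
print "$joined\n";                  # ab133zTrial

# split on the same separator class keeps the pieces apart.
my @pieces = split /\W+/, $input;
print join('|', @pieces), "\n";     # ab|13|3|z|Trial
```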

      If you need to work with Unicrap, er.. Unicode, use Unicode named property assertions.

      print for split /\P{Alnum}+/, '&#945;&#946;&#950;#!&#1488;&#1513;<>!&#1046;&#1048;&#1059;+sdfg.%12';
      &#945;&#946;&#950; &#1488;&#1513; &#1046;&#1048;&#1059; sdfg 12

      Sigh. Imagine that those were not automatically converted to HTML codepoints.

        Imagine that those were not automatically converted to HTML codepoints.

        Posting (small) quantities of Unicrap is the one time (I think) that <pre></pre> tags are justified.

        I'm guessing that \P is 'not the named class'; and that {Alnum} is alpha-numeric. Is there no \P{Numeric}?
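A sketch of the Unicode variants, with the test string written as \x{...} escapes so the code points survive posting. The code points are the ones from the example above (945 = U+03B1, and so on). For digits, \P{Nd} is the complement of the decimal-digit property \p{Nd}. The sketch does not try \P{Numeric} itself, so whether that exact name is accepted is left open. The /a modifier is assumed to be available (Perl 5.14 or later); it restricts \d and \D to ASCII rules.

```perl
use strict;
use warnings;

binmode STDOUT, ':encoding(UTF-8)';    # avoid "Wide character" warnings

# Greek, Hebrew, Cyrillic and ASCII runs separated by punctuation.
my $text = "\x{3b1}\x{3b2}\x{3b6}#!\x{5d0}\x{5e9}<>!"
         . "\x{416}\x{418}\x{423}+sdfg.%12";

# \P{Alnum} is the negation of \p{Alnum}: split on anything that is
# not a letter or digit in any script.
my @words = split /\P{Alnum}+/, $text;
print scalar(@words), " fields: @words\n";

# U+0663 is ARABIC-INDIC DIGIT THREE, a decimal digit (\p{Nd}).
my $digits = "12%\x{663}4";

# Unicode rules: the Arabic-Indic digit counts as a digit.
my @unicode = split /\P{Nd}+/, $digits;

# ASCII rules via /a: only 0-9 count, so U+0663 becomes a separator.
my @ascii = split /\D+/a, $digits;

print scalar(@unicode), " vs ", scalar(@ascii), " fields\n";
```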


