PerlMonks  

regex for string

by saranperl (Initiate)
on Aug 18, 2009 at 08:17 UTC ( [id://789377]=perlquestion )

saranperl has asked for the wisdom of the Perl Monks concerning the following question:

This node falls below the community's threshold of quality. You may see it by logging in.

Replies are listed 'Best First'.
Re: regex for string
by moritz (Cardinal) on Aug 18, 2009 at 08:25 UTC
    I don't know why you want a regex for that, but here you go:
    use strict;
    use warnings;

    my $str = "hello\nworld";
    my ($first, $last);
    $str =~ m/
        (?{ $first = substr($str, 0, 1); $last = substr($str, -1) })
    /x;
    print "first: '$first'; last: '$last'\n";

    However note the caveats of (?{ ... }) groups as documented in perlre.

    Perl 6 projects - links to (nearly) everything that is Perl 6.
      Why write this much code using "substr"? We could simply write
      $f = substr($str,0,1); $l = substr($str,-1);
      What I want is to get the first and last characters with a regex, something like m/(.?)..etc/
        Because substr is the way to go, but you wanted a regex.

        So I gave you a regex, but still used substr.

        The real question is, why do you insist on using a regex? It's not a good idea, and far from being an ideal solution.

Re: regex for string
by arkturuz (Curate) on Aug 18, 2009 at 08:44 UTC
    Strictly using regex you can do it like this:
    my $str = "perlmonks.org";
    if ($str =~ /
        ^(.{1})   # first character
        .*        # the rest of the string
        (.+)$     # minus the last character
        /x) {
        print $1, ' ', $2, "\n";
    }
    Normally you would do this using substr:
    my $f = substr($str, 0, 1);
    my $l = substr($str, length($str)-1, 1);
    print $f, ' ', $l;
      This will fail for a string consisting of a single character, or a string which contains newlines. The first can be fixed with a look-ahead, the second with /s.
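A minimal sketch of the two failure cases and of one possible reading of the suggested fix. The sub names `ends_naive` and `ends_lookahead` are hypothetical, and anchoring with `\A` and `\z` is an assumption on my part; the reply above only says "a look-ahead" and "/s" without giving a pattern.

```perl
use strict;
use warnings;

# Hypothetical helper names; neither sub appears in the thread.
sub ends_naive {
    my ($str) = @_;
    # The pattern from the parent reply: needs two characters, no /s.
    return $str =~ /^(.{1}).*(.+)$/ ? ($1, $2) : ();
}

sub ends_lookahead {
    my ($str) = @_;
    # The look-ahead captures the first character without consuming it,
    # so the final (.) may match that same character again.
    # /s lets . match a newline as well.
    return $str =~ /\A(?=(.)).*(.)\z/s ? ($1, $2) : ();
}

for my $str ("a", "hello\nworld", "perlmonks.org") {
    my @naive = ends_naive($str);
    my @fixed = ends_lookahead($str);
    (my $shown = $str) =~ s/\n/\\n/g;    # make the newline visible
    printf "%-18s naive: %-9s look-ahead: %s\n",
        "'$shown'",
        (@naive ? "@naive" : "no match"),
        (@fixed ? "@fixed" : "no match");
}
```

The naive pattern only matches the last of the three test strings; the look-ahead variant matches all three.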
      Yes, it's working fine, thank you. Why doesn't it work when written like this?
      $str=~m/^(.?).*(.?)$/
        It won't match what you want because the '*' quantifier is greedy, and the '?' can match 0 or 1 times. So the '*' matches all of the string it can, and the last '?' matches 0 times; the match is still successful. You have to force '*' to give something back to the rest of the regex, for example by replacing '?' with '{1}'.
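To make the greediness visible, here is a small sketch that runs both forms against the same string. The variable names are mine, not the poster's.

```perl
use strict;
use warnings;

my $str = "perlmonks.org";

# With (.?) at the end, the greedy .* swallows everything after the first
# character and the final group is satisfied by matching zero characters.
my ($f1, $l1) = $str =~ /^(.?).*(.?)$/;
print "optional groups: first '$f1', last '$l1'\n";

# Requiring exactly one character forces .* to backtrack by one position.
my ($f2, $l2) = $str =~ /^(.).*(.)$/;
print "required groups: first '$f2', last '$l2'\n";
```

The first pattern still matches, but its second capture is the empty string; the second pattern captures 'p' and 'g'.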
Re: regex for string
by rovf (Priest) on Aug 18, 2009 at 08:37 UTC

    Have a look at perlre, in particular at the zero-width assertions \A and \z (lower-case z!), and at the meta-character . (dot). But I think you can extract the characters more easily using substr.

    -- 
    Ronald Fischer <ynnor@mm.st>
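A short sketch of why the lower-case \z matters, assuming a string that ends in a newline (the test string and variable names are made up for illustration). `$` is allowed to match just before a trailing newline, while `\z` matches only at the very end of the string.

```perl
use strict;
use warnings;

my $str = "abc\n";

# \A anchors at the very start of the string.
my ($first) = $str =~ /\A(.)/s;

# $ may match just before a trailing newline, so this captures 'c'.
my ($dollar) = $str =~ /(.)$/s;

# \z matches only at the very end, so with /s this captures the newline.
my ($z) = $str =~ /(.)\z/s;

# Show the newline as \n so the difference is visible in the output.
for my $pair ([first => $first], ['last via $' => $dollar], ['last via \z' => $z]) {
    (my $shown = $pair->[1]) =~ s/\n/\\n/g;
    print "$pair->[0]: '$shown'\n";
}
```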
Re: regex for string
by abubacker (Pilgrim) on Aug 18, 2009 at 12:31 UTC

    I think this is an easy way to do it with a regular expression:

    $_ = "1and2"; /^(.).*(.)$/; print "$1 and $2 ";

    but I'm pretty sure you can achieve this in several simpler ways.

Re: regex for string
by AnomalousMonk (Archbishop) on Aug 18, 2009 at 18:16 UTC
    The reason a single, 'pure and simple' regex is not a good approach to this problem is that it has great difficulty handling the degenerate cases of zero-length and single-character strings: either some fancy footwork is needed within the regex, or some sort of post-match fixup must be done in these cases.

    For a single-character string in particular, one is asking the regex to match twice on the same character! A regex will always advance the string match point (as returned by the pos built-in) past the match or, in the case of a zero-width assertion match, by a default of one character. (The 5.10 regex 'backtracking control verbs' may offer a way around this problem, but I'm not familiar enough with them to know.)

    The following is the best I can do with a regex. It uses post-match fixup to finish the job. Note that the order of the alternatives in the ordered alternation
        \A . | . \z | \z
    is important: the lone  \z alternative must be last.

    >perl -wMstrict -le
    "for my $str (@ARGV) {
       printf qq{string '$str': };
       my ($first, $last) = $str =~ m{ \A . | . \z | \z }xmsg;
       $last = $first if not $last;
       print qq{first '$first', last '$last'};
       }
    " "" "a" "ab" "abc" "abcd"
    string '': first '', last ''
    string 'a': first 'a', last 'a'
    string 'ab': first 'a', last 'b'
    string 'abc': first 'a', last 'c'
    string 'abcd': first 'a', last 'd'
Re: regex for string
by ikegami (Patriarch) on Aug 18, 2009 at 18:39 UTC
    • Works with single-character strings. Both $f and $l are set to that character.
    • For zero-length strings, both $f and $l are undefined.
    • For zero-length strings, the whole expression returns false in scalar context.
    my ($f,$l) = /^(.).*(?<=(.))/s;

    • Works with single-character strings. Both $f and $l are set to that character.
    • For zero-length strings, both $f and $l are set to the zero-length string.
    • Readable.
    my $f = substr($_, 0, 1);
    my $l = substr($_, -1, 1);
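The behaviour listed for the look-behind pattern can be checked with a small sketch. The wrapper `first_last` is a hypothetical name, and it matches against an argument rather than `$_` purely for convenience.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the look-behind pattern above.
sub first_last {
    my ($str) = @_;
    # After .* has run to the end, the look-behind captures the character
    # just before the current position, which may be the first one again.
    my ($f, $l) = $str =~ /^(.).*(?<=(.))/s;
    return ($f, $l);
}

for my $str ("", "a", "ab", "a\nb") {
    my ($f, $l) = first_last($str);
    (my $shown = $str) =~ s/\n/\\n/g;    # make the newline visible
    if (defined $f) {
        print "'$shown': first '$f', last '$l'\n";
    }
    else {
        print "'$shown': no match, both undefined\n";
    }
}
```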
Re: regex for string
by markkawika (Monk) on Aug 18, 2009 at 18:09 UTC

    It's tough to do this for edge cases in one regex, so I would solve it in two. First, grab the first character:

    $str =~ m/\A(.)/xms;
    my $first_character = $1;

    Then, grab the last character:

    $str =~ m/(.)\z/xms;
    my $last_character = $1;

    If you could guarantee the string was at least two characters long, you could do it in one regex:

    $str =~ m/\A(.).*(.)\z/xms;
    my $first_character = $1;
    my $last_character = $2;

    If you insist on handling all edge cases in one regex, here's one way:

    $str =~ m/\A(.).*?(.?)\z/xms;
    my $first_character = $1;
    my $last_character = $2;
    $last_character = $first_character if ! $last_character;
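One caveat with the fixup step: a last character of "0" is false in Perl, so a truth test would replace it with the first character. The following is a sketch of a variant that tests the length instead and also checks that the match succeeded before reading $1 and $2. The sub name `first_last_chars` is hypothetical and not part of the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper; the name is not from the thread.
sub first_last_chars {
    my ($str) = @_;
    # Return nothing when the match fails, so stale $1/$2 are never read.
    return unless $str =~ m/\A(.).*?(.?)\z/xms;
    my ($first, $last) = ($1, $2);
    # Test the length, not the truth value: a last character of "0" is false.
    $last = $first if !length $last;
    return ($first, $last);
}

for my $str ("a", "a0", "abc") {
    my ($first, $last) = first_last_chars($str);
    print "'$str': first '$first', last '$last'\n";
}
```

For an empty string the sub returns an empty list, so callers should check the result before using it.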

Node Type: perlquestion [id://789377]
Approved by moritz