regex for string

saranperl has asked for the wisdom of the Perl Monks concerning the following question:

Re: regex for string by moritz (Cardinal) on Aug 18, 2009 at 08:25 UTC
I don't know why you want a regex for that, but here you go: `use strict; use warnings; my $str = "hello\nworld"; my ($first, $last); $str =~ m/ (?{ $first = substr($str, 0, 1); $last = substr($str, -1) }) /x; print "first: '$first'; last: '$last'\n";` However, note the caveats of `(?{ ... })` groups as documented in perlre. Perl 6 projects - links to (nearly) everything that is Perl 6.
Re^2: regex for string by saranperl (Initiate) on Aug 18, 2009 at 08:57 UTC
Why write this much code around `substr`? We can simply use `$f = substr($str, 0, 1); $l = substr($str, -1);`. What I want is to get the first and last characters with a regex, something like m/(.?)..etc/
Re^3: regex for string by moritz (Cardinal) on Aug 18, 2009 at 09:13 UTC
Because substr is the way to go, but you wanted a regex. So I gave you a regex, but still used substr. The real question is: why do you insist on using a regex? It's not a good idea, and far from an ideal solution.
Re: regex for string by arkturuz (Curate) on Aug 18, 2009 at 08:44 UTC
Strictly using a regex, you can do it like this: `my $str = "perlmonks.org"; if ($str =~ /^(.{1}).*(.+)$/x) { print $1, ' ', $2, "\n"; }` Here `^(.{1})` captures the first character, the greedy `.*` consumes the rest of the string minus the last character, and `(.+)$` captures that last character. Normally you would do this using substr: `my $f = substr($str, 0, 1); my $l = substr($str, length($str)-1, 1); print $f, ' ', $l;`
Re^2: regex for string by moritz (Cardinal) on Aug 18, 2009 at 08:48 UTC
This will fail for a string consisting of a single character, or a string which contains newlines. The first can be fixed with a look-ahead, the second with `/s`.
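A minimal sketch of the two fixes mentioned in this reply, assuming the goal is simply to get the first and last characters back as a list. The sub name `first_last_lookahead` is made up for illustration. The look-ahead captures the first character without consuming it, so a one-character string can fill both groups, and `/s` lets `.` match a newline.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (first, last), or an empty list for an empty string.
sub first_last_lookahead {
    my ($str) = @_;
    # (?=(.)) captures the first character without consuming it,
    # so .*(.)\z can still claim that same character as the last one.
    # /s lets . match "\n" as well.
    return $str =~ m/\A(?=(.)).*(.)\z/s;
}

for my $str ("a", "ab", "perlmonks.org") {
    my ($first, $last) = first_last_lookahead($str);
    print "string '$str': first '$first', last '$last'\n";
}
```

An empty string still produces no match at all, so the caller has to decide what "first" and "last" should mean in that case.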
Re^2: regex for string by saranperl (Initiate) on Aug 18, 2009 at 09:01 UTC
Yes, it's working fine, thank you. Why doesn't it work when written like this: `$str=~m/^(.?).*(.?)$/`?
Re^3: regex for string by arkturuz (Curate) on Aug 18, 2009 at 09:36 UTC
It won't match what you want because the '*' quantifier is greedy, and the '?' can match 0 or 1 times. So the '*' matches as much of the string as it can, and the last '?' matches 0 times; the match is still successful. You have to force '*' to give something back to the rest of the regex, for example by replacing '?' with '{1}'.
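To see the greedy behaviour described here, this small sketch runs both patterns against the same string (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $str = "perlmonks.org";

# Optional groups: .* takes everything after the first character,
# so the trailing (.?) is satisfied by matching nothing.
my @optional = $str =~ /^(.?).*(.?)$/;

# Mandatory groups: .* must give back one character so the last (.) can match.
my @mandatory = $str =~ /^(.).*(.)$/;

print "optional:  first '$optional[0]', last '$optional[1]'\n";
print "mandatory: first '$mandatory[0]', last '$mandatory[1]'\n";
```

The first line reports an empty last character, the second reports 'g'.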
Re^4: regex for string by saranperl (Initiate) on Aug 18, 2009 at 11:06 UTC
Re: regex for string by rovf (Priest) on Aug 18, 2009 at 08:37 UTC
Have a look at perlre, in particular at the zero-width assertions `\A` and `\z` (lower-case z!), and at the metacharacter `.` (dot). But I think that you can extract the characters more easily using substr. -- Ronald Fischer <ynnor@mm.st>
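A short sketch of why the lower-case `\z` matters, assuming a string that ends in a newline (the variable names are illustrative only):

```perl
use strict;
use warnings;

my $str = "abc\n";

# $ may match just before a trailing newline, so this captures 'c'.
my ($before_newline) = $str =~ /(.)$/;

# \z matches only at the very end; with /s the dot may match the newline itself.
my ($true_last) = $str =~ /(.)\z/s;

# \A anchors at the very start of the string, regardless of /m.
my ($true_first) = $str =~ /\A(.)/s;

print "with \$: '$before_newline'\n";
print "with \\z the last character is a newline: ",
      ($true_last eq "\n" ? "yes" : "no"), "\n";
print "first: '$true_first'\n";
```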
Re: regex for string by abubacker (Pilgrim) on Aug 18, 2009 at 12:31 UTC
I think this is an easy way in terms of regular expressions: `$_="1and2"; /^(.).*(.)$/; print "$1 and $2 ";` But I'm pretty sure that you can achieve this in several simpler ways.
Re: regex for string by AnomalousMonk (Archbishop) on Aug 18, 2009 at 18:16 UTC
The reason a single, 'pure and simple' regex is not a good approach to this problem is that it has great difficulty handling the degenerate cases of zero-length and single-character strings: either some fancy footwork is needed within the regex, or some sort of post-match fixup must be done in these cases. For a single-character string in particular, one is asking the regex to match twice on the same character! A regex will always advance the string match point (as returned by the pos built-in) past the match or, in the case of a zero-width assertion match, by a default of one character. (The 5.10 regex 'backtracking control verbs' may offer a way around this problem, but I'm not familiar enough with them to know.) The following is the best I can do with a regex. It uses post-match fixup to finish the job. Note that the order of the alternatives in the ordered alternation `\A . | . \z | \z` is important: the lone `\z` alternative must be last. `>perl -wMstrict -le "for my $str (@ARGV) { printf qq{string '$str': }; my ($first, $last) = $str =~ m{ \A . | . \z | \z }xmsg; $last = $first if not $last; print qq{first '$first', last '$last'}; } " "" "a" "ab" "abc" "abcd"` Output: `string '': first '', last ''` `string 'a': first 'a', last 'a'` `string 'ab': first 'a', last 'b'` `string 'abc': first 'a', last 'c'` `string 'abcd': first 'a', last 'd'`
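For anyone who prefers a script to a one-liner, here is a sketch of the same ordered alternation wrapped in a sub. The sub name `first_last_alternation` is hypothetical, and the fixup differs from the one-liner on purpose: it tests for "missing or empty" instead of truthiness, on the assumption that a last character of '0' should be kept.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the ordered alternation \A . | . \z | \z
sub first_last_alternation {
    my ($str) = @_;
    # In list context with /g, every successful match contributes one element.
    # For two or more characters: first char, last char, then the empty
    # match at \z. For one character: that char, then the empty match.
    my ($first, $last) = $str =~ m{ \A . | . \z | \z }xmsg;
    # Post-match fixup for the zero-length and single-character cases.
    $last = $first if !defined $last || $last eq '';
    return ($first, $last);
}

for my $str ("", "a", "ab", "a0") {
    my ($first, $last) = first_last_alternation($str);
    print "string '$str': first '$first', last '$last'\n";
}
```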
Re: regex for string by ikegami (Patriarch) on Aug 18, 2009 at 18:39 UTC
Regex version: works with single-character strings, where both $f and $l are set to that character. For zero-length strings, both $f and $l are undefined, and the whole expression returns false in scalar context. `my ($f,$l) = /^(.).*(?<=(.))/s;` substr version: works with single-character strings, where both $f and $l are set to that character. For zero-length strings, both $f and $l are set to the zero-length string. Readable. `my $f = substr($_, 0, 1); my $l = substr($_, -1, 1);`
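A small sketch exercising the look-behind regex from this reply on the edge cases it describes. The sub name `first_last_lookbehind` is invented for the example; the `if` relies on a list assignment returning the number of matched elements when evaluated in scalar (boolean) context.

```perl
use strict;
use warnings;

# Hypothetical wrapper: returns (first, last), or an empty list on no match.
sub first_last_lookbehind {
    my ($str) = @_;
    # (?<=(.)) looks back at the character just before the end of the match,
    # so it can capture the same character as the first group.
    return $str =~ /^(.).*(?<=(.))/s;
}

for my $str ("", "a", "ab") {
    # The list assignment is false in boolean context when nothing matched.
    if ( my ($f, $l) = first_last_lookbehind($str) ) {
        print "string '$str': first '$f', last '$l'\n";
    }
    else {
        print "string '$str': no match\n";
    }
}
```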
Re: regex for string by markkawika (Monk) on Aug 18, 2009 at 18:09 UTC
It's tough to do this for edge cases in one regex, so I would solve it in two. First, grab the first character: `$str =~ m/\A(.)/xms; my $first_character = $1;` Then, grab the last character: `$str =~ m/(.)\z/xms; my $last_character = $1;` If you could guarantee the string was at least two characters long, you could do it in one regex: `$str =~ m/\A(.).*(.)\z/xms; my $first_character = $1; my $last_character = $2;` If you insist on handling all edge cases in one regex, here's one way: `$str =~ m/\A(.).*?(.?)\z/xms; my $first_character = $1; my $last_character = $2; $last_character = $first_character if ! $last_character;`
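One detail worth keeping in mind with the two-regex approach: `$1` is only updated by a successful match, so after a failed match it can still hold a value from an earlier one. A sketch that sidesteps this by assigning the captures in list context (the sub name `first_and_last` is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical helper: two separate matches, captures taken in list context
# so a failed match yields undef instead of a stale $1.
sub first_and_last {
    my ($str) = @_;
    my ($first) = $str =~ m/\A(.)/xms;
    my ($last)  = $str =~ m/(.)\z/xms;
    return ($first, $last);
}

for my $str ("a", "ab", "10") {
    my ($first, $last) = first_and_last($str);
    print "string '$str': first '$first', last '$last'\n";
}

# For an empty string both values come back undefined.
my ($first, $last) = first_and_last("");
print "empty string: ", (defined $first ? "defined" : "undefined"), "\n";
```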

