Re: regex for string
by moritz (Cardinal) on Aug 18, 2009 at 08:25 UTC
|
I don't know why you want a regex for that, but here you go:
use strict;
use warnings;
my $str = "hello\nworld";
my ($first, $last);
$str =~ m/ (?{ $first = substr($str, 0, 1); $last = substr($str, -1) }) /x;
print "first: '$first'; last: '$last'\n";
However, note the caveats of (?{ ... }) groups as documented in perlre.
|
Why write this much code around substr? We can simply use:
$f = substr($str,0,1);
$l = substr($str,-1);
What I want is to get the first and last characters with a regex, something like m/(.?)..etc/.
|
Because substr is the way to go, but you wanted a regex.
So I gave you a regex, but still used substr.
The real question is: why do you insist on using a regex? It's not a good idea, and far from an ideal solution.
Re: regex for string
by arkturuz (Curate) on Aug 18, 2009 at 08:44 UTC
|
Strictly using regex you can do it like this:
my $str = "perlmonks.org";
if ($str =~ /
    ^(.{1})  # first character
    .*       # everything in between (greedy)
    (.+)$    # last character: the greedy .* leaves (.+) exactly one
/x)
{
    print $1, ' ', $2, "\n";
}
Normally you would do this using substr:
my $f = substr($str, 0, 1);
my $l = substr($str, length($str)-1, 1);
print $f, ' ', $l;
|
This will fail for a string consisting of a single character, or a string which contains newlines. The first can be fixed with a look-ahead, the second with /s.
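A minimal sketch of the two fixes mentioned above, assuming a single regex is still the goal. The sub name first_last is made up for illustration. The look-ahead captures the first character without consuming it, so a one-character string can supply both captures; /s lets . match newlines, and \z is used instead of $ so that a trailing newline counts as the last character.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: returns (first, last),
# or an empty list when the string is empty.
sub first_last {
    my ($str) = @_;
    # (?=(.)) captures the first character without consuming it;
    # /s lets . match "\n"; \z anchors at the very end of the string.
    return $str =~ /\A(?=(.)).*(.)\z/s;
}

for my $str ("a", "ab", "hello\nworld") {
    my ($first, $last) = first_last($str);
    print "first: '$first'; last: '$last'\n";
}
```

Call it in list context; in scalar context the match only reports success or failure.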
|
Yes, it's working fine, thank you.
Why does it not work when written like this?
$str=~m/^(.?).*(.?)$/
|
It won't match what you want because the '*' quantifier is greedy, and the '?' can match 0 or 1 times. So the '*' matches as much of the string as it can, and the last '?' matches 0 times; the match as a whole still succeeds. You have to force '*' to give something back to the rest of the regex, for example by replacing '?' with '{1}'.
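To see the difference, here is a small sketch that runs both patterns on the same string. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $str = "perlmonks.org";

# (.?) may match nothing, so the greedy .* keeps every remaining
# character and the second capture comes back empty.
my ($f_opt, $l_opt) = $str =~ /^(.?).*(.?)$/;

# (.) must match one character, so .* backtracks and hands over
# the last one.
my ($f_req, $l_req) = $str =~ /^(.).*(.)$/;

print "optional: first='$f_opt' last='$l_opt'\n";   # last is ''
print "required: first='$f_req' last='$l_req'\n";   # last is 'g'
```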
|
Re: regex for string
by rovf (Priest) on Aug 18, 2009 at 08:37 UTC
|
Have a look at perlre, in particular at the zero-width assertions \A and \z (lower-case z!), and at the metacharacter . (dot). But I think you can extract the characters more easily using substr.
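A short sketch of why the lower-case z matters, assuming a string that ends in a newline. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $str = "ab\n";   # note the trailing newline

# $ may match just before a trailing newline, so this captures 'b'.
my ($with_dollar) = $str =~ /(.)$/;

# \z matches only at the very end; /s lets . match the newline itself.
my ($with_z) = $str =~ /(.)\z/s;

# \A matches only at the very start of the string.
my ($first) = $str =~ /\A(.)/s;

for my $pair ([ '$' => $with_dollar ], [ '\z' => $with_z ], [ '\A' => $first ]) {
    (my $shown = $pair->[1]) =~ s/\n/\\n/g;   # make a newline visible
    print "$pair->[0] captured '$shown'\n";
}
```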
--
Ronald Fischer <ynnor@mm.st>
Re: regex for string
by abubacker (Pilgrim) on Aug 18, 2009 at 12:31 UTC
|
I think this is an easy way to do it with a regular expression:
$_="1and2" ;
/^(.).*(.)$/;
print "$1 and $2 " ;
but I'm pretty sure you can achieve this in several simpler ways.
Re: regex for string
by AnomalousMonk (Archbishop) on Aug 18, 2009 at 18:16 UTC
|
The reason a single, 'pure and simple' regex is not a good approach to this problem is that it has great difficulty handling the degenerate cases of zero-length and single-character strings: either some fancy footwork is needed within the regex, or some sort of post-match fixup must be done in these cases.
For a single-character string in particular, one is asking the regex to match twice on the same character! A regex will always advance the string match point (as returned by the pos built-in) past the match or, in the case of a zero-width assertion match, by a default of one character. (The 5.10 regex 'backtracking control verbs' may offer a way around this problem, but I'm not familiar enough with them to know.)
The following is the best I can do with a regex. It uses post-match fixup to finish the job. Note that the order of the alternatives in the ordered alternation
\A . | . \z | \z
is important: the lone \z alternative must be last.
>perl -wMstrict -le
"for my $str (@ARGV) {
printf qq{string '$str': };
my ($first, $last) = $str =~ m{ \A . | . \z | \z }xmsg;
$last = $first if !defined($last) or $last eq '';
print qq{first '$first', last '$last'};
}
" "" "a" "ab" "abc" "abcd"
string '': first '', last ''
string 'a': first 'a', last 'a'
string 'ab': first 'a', last 'b'
string 'abc': first 'a', last 'c'
string 'abcd': first 'a', last 'd'
Re: regex for string
by ikegami (Patriarch) on Aug 18, 2009 at 18:39 UTC
|
- Works with single-character strings. Both $f and $l are set to that character.
- For zero-length strings, both $f and $l are undefined.
- For zero-length strings, the whole expression returns false in scalar context.
my ($f,$l) = /^(.).*(?<=(.))/s;
- Works with single-character strings. Both $f and $l are set to that character.
- For zero-length strings, both $f and $l are set to the zero-length string.
- Readable.
my $f = substr($_, 0, 1);
my $l = substr($_, -1, 1);
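For a side-by-side look at the edge cases listed above, here is a rough sketch that wraps both snippets in subs. The names by_regex, by_substr and show are invented for this example, and the subs take an argument instead of using $_.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the regex version.
sub by_regex {
    my ($s) = @_;
    my ($f, $l) = $s =~ /^(.).*(?<=(.))/s;
    return ($f, $l);
}

# Hypothetical wrapper around the substr version.
sub by_substr {
    my ($s) = @_;
    return (substr($s, 0, 1), substr($s, -1, 1));
}

# Render undef visibly instead of triggering a warning.
sub show {
    my ($v) = @_;
    return defined $v ? "'$v'" : 'undef';
}

for my $s ("", "a", "abc") {
    my ($rf, $rl) = by_regex($s);
    my ($sf, $sl) = by_substr($s);
    print "'$s': regex ", show($rf), ' ', show($rl),
          "; substr ", show($sf), ' ', show($sl), "\n";
}
```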
Re: regex for string
by markkawika (Monk) on Aug 18, 2009 at 18:09 UTC
|
It's tough to do this for edge cases in one regex, so I would solve it in two. First, grab the first character:
$str =~ m/\A(.)/xms;
my $first_character = $1;
Then, grab the last character:
$str =~ m/(.)\z/xms;
my $last_character = $1;
If you could guarantee the string was at least two characters long, you could do it in one regex:
$str =~ m/\A(.).*(.)\z/xms;
my $first_character = $1;
my $last_character = $2;
If you insist on handling all edge cases in one regex, here's one way:
$str =~ m/\A(.).*?(.?)\z/xms;
my $first_character = $1;
my $last_character = $2;
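# Compare against '' so that a last character of "0" is not mistaken for an empty capture.
$last_character = $first_character if !defined($last_character) || $last_character eq '';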
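One thing to watch for in the snippets above: they read $1 and $2 without checking that the match succeeded, and after a failed match those variables still hold whatever an earlier successful match left in them. A possible sketch of the two-regex approach that avoids this by capturing in list context follows. The sub names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: two anchored matches, captured in list context
# so that a failed match yields undef rather than a stale $1.
sub first_and_last {
    my ($str) = @_;
    my ($first) = $str =~ m/\A(.)/xms;
    my ($last)  = $str =~ m/(.)\z/xms;
    return ($first, $last);
}

# Render undef visibly instead of triggering a warning.
sub show {
    my ($v) = @_;
    return defined $v ? "'$v'" : 'undef';
}

for my $str ("", "a", "a0", "abc") {
    my ($first, $last) = first_and_last($str);
    print "'$str': first=", show($first), " last=", show($last), "\n";
}
```

Because each match is anchored independently, no fix-up step is needed for single-character strings.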