Re: Variable assignment confusion
by fruiture (Curate) on Dec 15, 2003 at 17:48 UTC
|
Well, you're assigning either the result of a match or, if the match fails, the retid with "_001" appended. It's not true that this always assigns 1; it only assigns 1 when the match is successful, because a simple m// in scalar context returns a boolean (1 on success, the empty string on failure) and || returns its left operand if that operand is true.
What you want is, as you said:
# parens not necessary here
my $new_retid =
# if the retid has the suffix
$retid =~ /_\d{3}$/ ?
# then leave it as is
$retid :
# otherwise add _001 suffix
$retid.'_001';
See perlop for "?:" and the behaviour of m//. HTH
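The original code is not quoted in this reply, so the following is only a guess at its shape: a minimal sketch, assuming the assignment applied || directly to the match. The sub names buggy and fixed are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical reconstruction of the problem: in scalar context a
# successful m// returns 1, and || hands back that true left operand.
sub buggy {
    my ($retid) = @_;
    my $new_retid = $retid =~ /_\d{3}$/ || $retid . '_001';
    return $new_retid;
}

# The ternary returns the string itself instead of the match result.
sub fixed {
    my ($retid) = @_;
    my $new_retid = $retid =~ /_\d{3}$/ ? $retid : $retid . '_001';
    return $new_retid;
}

for my $id ('1234', '1234_002') {
    printf "%-8s buggy=%s fixed=%s\n", $id, buggy($id), fixed($id);
}
```

With an id that already carries the suffix, buggy hands back the match result rather than the id, which is the behaviour described above.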
Re: Variable assignment confusion
by delirium (Chaplain) on Dec 15, 2003 at 17:54 UTC
|
You need capturing parentheses in the regex, and list context on the left-hand side, to assign the matched text to a variable that way. For example:
my ($new_retid) = $retid =~ /(^\d{4}_\d{3})$/;
$new_retid ||= $retid.'_001';
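To see why the parentheses around the variable matter, here is a small sketch contrasting scalar and list context for the same match. The sub names are hypothetical, chosen only for this example.

```perl
use strict;
use warnings;

# Scalar context: the match yields true/false, not the captured text.
sub match_in_scalar_context {
    my ($retid) = @_;
    my $result = $retid =~ /(^\d{4}_\d{3})$/;
    return $result;
}

# List context: the match yields the list of captures (empty on failure),
# so the variable ends up undef when nothing matched and ||= fills it in.
sub capture_or_default {
    my ($retid) = @_;
    my ($new_retid) = $retid =~ /(^\d{4}_\d{3})$/;
    $new_retid ||= $retid . '_001';
    return $new_retid;
}

for my $id ('1234', '1234_002') {
    printf "%-8s scalar=[%s] list=[%s]\n",
        $id, match_in_scalar_context($id), capture_or_default($id);
}
```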
|
This is very clever. It's not what I had been thinking of, but it would work nicely. Thanks!
Re: Variable assignment confusion
by Roy Johnson (Monsignor) on Dec 15, 2003 at 18:39 UTC
|
Is there some compelling reason to make this a one-line assignment? Here's what I'd consider nicely readable:
my $new_retid = $retid;
$new_retid .= '_001' unless $retid =~ /_\d{3}$/;
Apart from the ternary operator suggested by several, here are a few bletcherous but possibly instructive ways to turn it into one assignment:
my $new_retid = $retid . '_001' x $retid !~ /_\d{3}$/;
(my $new_retid = $retid) .= do {'_001' unless $retid =~ /_\d{3}$/};
my $new_retid = $retid . ($retid !~ /_\d{3}$/ and '_001');
Update:
One more:
(my $new_retid = $retid) =~ s/(?<!_\d{3})$/_001/;
And one inspired by delirium's post above:
my ($new_retid) = grep( $_, $retid =~ /(^\d{4}_\d{3})$/, $retid.'_001' );
The PerlMonk tr/// Advocate
Re: Variable assignment confusion
by Aristotle (Chancellor) on Dec 15, 2003 at 18:42 UTC
|
I'd throw in some stricter error checking and write it like this:
$retid =~ /\A(\d{4})(_\d{3})?\z/
or die "Malformed ret. id\n";
my $new_retid = $1 . ( $2 ? $2 : '_001' );
Though unless you need the separate variable, I'd just change the original value in place:
$retid .= '_001' if not $2;
Update: s/\Q\d{4}/(\\d{4})/ of course. Thanks to Not_a_Number for catching this.
Makeshifts last the longest.
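The validate-then-build approach above could be wrapped up like this. It is a sketch only: the sub name normalize_retid and the eval-based caller are assumptions added for illustration, not part of the post.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the stricter check: dies on anything that is
# not four digits, optionally followed by an underscore and three digits.
sub normalize_retid {
    my ($retid) = @_;
    $retid =~ /\A(\d{4})(_\d{3})?\z/
        or die "Malformed ret. id\n";
    return $1 . ( $2 ? $2 : '_001' );
}

for my $id ('1234', '1234_002', '12x4') {
    my $out = eval { normalize_retid($id) };
    print defined $out ? "$id -> $out\n" : "$id -> rejected: $@";
}
```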
Re: Variable assignment confusion
by ysth (Canon) on Dec 15, 2003 at 17:56 UTC
|
The 1 is the return value from the successful match.
Try:
my $new_retid = ($retid =~ /_\d{3}\z/ ? $retid : $retid . "_001");
|
Which is (factoring out the common prefix):
my $new_retid = $retid . ($retid =~ /_\d{3}\z/ ? '' : '_001');
The PerlMonk tr/// Advocate
|
This is no doubt what I was thinking about. Thanks! One question: why a \z instead of $ at the end of the expression? Don't get me wrong, I'm sure you're correct; I just don't understand. Thanks to all!
|
Please don't assume that someone else is right just because they are more experienced than you; always try (as you have done :) to get an explanation. Cargo cultism often stems from unquestioningly assuming that unusually-written code is that way for a reason.
As to your inquiry, '\z' matches only at the very end of the string, whereas '$' matches at the end or just before a newline at the end of the string.
This doesn't make any difference in your case, so I'd use '$' to avoid confusion.
If you're curious, '\z' becomes much more useful when you switch on the '/m' (multi-line) flag on a regex to allow '$' to match before any newline in the string. Have a look at perldoc perlre or Mastering Regular Expressions for more details.
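The difference between the anchors is easiest to see side by side. This is a small sketch; the sub name anchors_matching and the test strings are made up for the example.

```perl
use strict;
use warnings;

# Reports which anchor variants accept the given string.
sub anchors_matching {
    my ($str) = @_;
    my @hits;
    push @hits, '$'    if $str =~ /_\d{3}$/;    # end, or before a final newline
    push @hits, '\z'   if $str =~ /_\d{3}\z/;   # very end of string only
    push @hits, '$ /m' if $str =~ /_\d{3}$/m;   # before any newline, or end
    return join ', ', @hits;
}

for my $str ("1234_001", "1234_001\n", "1234_001\nextra") {
    (my $shown = $str) =~ s/\n/\\n/g;   # make newlines visible in the output
    printf "%-20s => %s\n", qq{"$shown"}, anchors_matching($str);
}
```

Only the string with nothing after the suffix satisfies all three; a trailing newline is enough to make '\z' reject it.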
|
With $, it would match either "9999_999\n" or "9999_999",
which is not what your original post requested.
Re: Variable assignment confusion
by pg (Canon) on Dec 15, 2003 at 18:02 UTC
|
Use a function to wrap it, so that it can be reused:
print foo("1234"), "\n";
print foo("1234_002");
sub foo {
return ($_[0] =~ /^\d{4}$/) ? $_[0] . "_001" : $_[0];
}
Re: Variable assignment confusion
by TomDLux (Vicar) on Dec 15, 2003 at 19:48 UTC
|
If you're certain you won't get invalid data, but only either four digits or else four digits, an underscore, and three digits, then using length or index or substr may be simpler than using a regex.
$short = "1234";
$result_1 = $short;
$result_1 .= "_001" unless ( 4 < length $result_1 );
$result_2 = $short;
$result_2 .= "_001" unless ( substr( $result_2, 4 ) );
$result_3 = $short;
# index returns -1 (a true value) when nothing is found, so compare explicitly
$result_3 .= "_001" unless ( index( $result_3, '_', 4 ) == 4 );
index is probably the fastest and most direct; it simply has to look for an underscore starting at the fifth character (character number four, counting from zero), if there is one. substr returns the remainder of the string, starting after the fourth character, which is empty if the string ends there. length returns the number of characters in the string. All of these are quite direct. Personally, I would use length, or maybe index.
If it's important to you to use one line, all of these could be used as the condition in a ternary expression, but two lines is clearer, in my opinion. I like to stretch ternaries over three lines, unless they are very simple:
# the parentheses must enclose the whole ternary, since . binds tighter than ?:
$result_4 = $short . ( index( $short, '_', 4 ) == 4
? ""
: "_001" );
--
TTTATCGGTCGTTATATAGATGTTTGCA
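One pitfall worth spelling out with index: it returns -1 when the substring is absent, and -1 is a true value in Perl, so the result needs an explicit comparison. The sketch below puts the three non-regex checks side by side; the sub names are made up, and each assumes the input is either four digits or four digits, an underscore, and three digits.

```perl
use strict;
use warnings;

# Illustrative helpers. Each appends "_001" only when the suffix is absent,
# and each relies on the input already being well formed.
sub by_length { my ($s) = @_; return length($s) > 4         ? $s : $s . '_001' }
sub by_substr { my ($s) = @_; return substr($s, 4) ne ''    ? $s : $s . '_001' }
sub by_index  { my ($s) = @_; return index($s, '_', 4) == 4 ? $s : $s . '_001' }

for my $id ('1234', '1234_002') {
    print join(' ', $id, by_length($id), by_substr($id), by_index($id)), "\n";
}

# The pitfall: "not found" is reported as -1, which is true in a condition.
print "index on '1234': ", index('1234', '_', 4), "\n";
```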
Re: Variable assignment confusion
by antirice (Priest) on Dec 16, 2003 at 04:17 UTC
|
my $new_retid = $retid.("_001","")[0+$retid=~/_\d{3}/];
Yeah. Fun stuff.
antirice
The first rule of Perl club is - use Perl
The ith rule of Perl club is - follow rule i - 1 for i > 1
Re: Variable assignment confusion
by qq (Hermit) on Dec 15, 2003 at 23:24 UTC
|
~>perl -e '@id = (1234,"4323_003"); foreach ( @id ) { s/^(\d{4})$/$1_001/; print $_, "\n"; }'
1234_001
4323_003
If the input needs to be checked, I'd do it separately (and before).
qq