When only one field contains spaces, you can still split the whole line on whitespace without modifying it first, by using the third argument to split to limit the number of fields. Work from the left first, so that the field with spaces stays in the "remainder" part of the string along with the rest of the line. Then use reverse to work in from the right: split the reversed "remainder", again with a limit, to get the trailing fields. The field with spaces comes out as the last piece of that second split and is not disturbed, because the limit stops the split before it reaches the embedded spaces.
use strict;
use warnings;
use 5.010;
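my @dataLines = (
    q{>cds:AEA30293 A/Netherlands/2223b/2009 2009/11/18 HA},
    q{>cds:ADD23250 A/District of Columbia/INS17/2009 2009/10/26 HA},
    q{>cds:ADX98640 A/San Diego/INS13/2009 2009/10/19 HA},
    q{>cds:ADD97035 A/Wisconsin/629-D00036/2009 2009/09/15 HA},
);
say q{=} x 60;
foreach my $dataLine ( @dataLines )
{
    say $dataLine;
    my @elems;
    # From the left: a limit of 2 yields the first field plus everything else.
    ( $elems[ 0 ], my $remainder ) = split m{\s+}, $dataLine, 2;
    # From the right: split the reversed remainder with a limit of 3, so the
    # field containing spaces is the untouched third piece. Un-reverse each
    # piece and assign through a slice to restore the original field order.
    @elems[ 3, 2, 1 ] =
        map { scalar reverse }
        split m{\s+}, reverse( $remainder ), 3;
    say for @elems;
    say q{=} x 60;
}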
The output.
============================================================
>cds:AEA30293 A/Netherlands/2223b/2009 2009/11/18 HA
>cds:AEA30293
A/Netherlands/2223b/2009
2009/11/18
HA
============================================================
>cds:ADD23250 A/District of Columbia/INS17/2009 2009/10/26 HA
>cds:ADD23250
A/District of Columbia/INS17/2009
2009/10/26
HA
============================================================
>cds:ADX98640 A/San Diego/INS13/2009 2009/10/19 HA
>cds:ADX98640
A/San Diego/INS13/2009
2009/10/19
HA
============================================================
>cds:ADD97035 A/Wisconsin/629-D00036/2009 2009/09/15 HA
>cds:ADD97035
A/Wisconsin/629-D00036/2009
2009/09/15
HA
============================================================
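The same idea can be wrapped in a small subroutine if the number of fields on either side of the awkward one varies. This is only a sketch: the name split_around_spaced_field and its $nLeft and $nRight parameters are made up for illustration, and it assumes each line really does have at least $nLeft + $nRight + 1 fields with no leading whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the code above): split $line on
# whitespace when exactly one field may contain spaces, with $nLeft
# clean fields before it and $nRight clean fields after it.
sub split_around_spaced_field
{
    my ( $line, $nLeft, $nRight ) = @_;

    # Peel $nLeft fields off the left; the limit keeps the rest intact
    # as the final element, which becomes the remainder.
    my @left      = split m{\s+}, $line, $nLeft + 1;
    my $remainder = pop @left;

    # Peel $nRight fields off the right by splitting the reversed
    # remainder, then un-reverse each piece and put the pieces back
    # into left-to-right order.
    my @right = reverse
        map { scalar reverse }
        split m{\s+}, scalar reverse( $remainder ), $nRight + 1;

    return ( @left, @right );
}

my @fields = split_around_spaced_field(
    q{>cds:ADD23250 A/District of Columbia/INS17/2009 2009/10/26 HA},
    1, 2 );
print qq{$_\n} for @fields;
```

Called with 1 and 2 as above it should give the same four fields as the loop in the original code; other counts are untested guesses at how the technique generalises.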
I hope this is of interest.
Update: Modified the code to change the order of the array slice, thereby eliminating the final reverse.