http://www.perlmonks.org?node_id=1038011

cornelius80 has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, I have a $line like this:
A::B:123-456 hh:mm:C::D:789
I am trying to split the string such that I will obtain the following in an array:
A B 123-456 hh:mm C D 789
I have tried @array = split(/:/, $line); but the problem I am facing is that the hh:mm gets "caught in the action" too, i.e.
A B 123-456 hh mm C D 789
What is your advice, please? Kind Regards, Cornelius

Re: How to split unique patterns
by Corion (Patriarch) on Jun 10, 2013 at 08:41 UTC

    Instead of splitting, I would match the data I want to keep.

    Using a regular expression has the benefit of also somewhat validating your data, so you are notified of malformed input early:

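    my @columns = qw( name branch code timestamp info1 info2 id );

    # Match the fields to keep; die early on malformed input
    $line =~ /^(A)::(B):(123-456) ([012]\d:[0-6]\d):(C)::(D):(789)/
        or die "Malformed input [$line] in line $.";

    # Hash slice: seven captures for seven column names
    my %info;
    @info{ @columns } = ($1, $2, $3, $4, $5, $6, $7);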
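
    The pattern above hard-codes the placeholder values (A, B, 123-456, ...). A minimal sketch of the same match-instead-of-split idea with looser field patterns might look like the following. It assumes the real data carries a digit timestamp such as 12:34 where the sample shows hh:mm, and that no other field contains a colon or whitespace. The name parse_line is made up for illustration; the column names are the ones from the snippet above.

```perl
use strict;
use warnings;

# Column names as used in the snippet above.
my @columns = qw( name branch code timestamp info1 info2 id );

# parse_line is a hypothetical helper: it returns a hash ref of named
# fields, or undef (in scalar context) if the line does not fit.
sub parse_line {
    my ($line) = @_;

    # Assumption: the timestamp is two digits, a colon, two digits;
    # every other field is free of colons and whitespace.
    my @values = $line =~ /^([^:\s]+)::([^:\s]+):([^:\s]+) (\d\d:\d\d):([^:\s]+)::([^:\s]+):([^:\s]+)\s*$/
        or return;

    my %info;
    @info{@columns} = @values;    # hash slice, one value per column
    return \%info;
}

my $info = parse_line('A::B:123-456 12:34:C::D:789');
print "$info->{code} at $info->{timestamp}\n";    # prints "123-456 at 12:34"
```

    A line that does not fit the pattern yields undef, so the caller can decide whether to die, warn, or skip.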

      Didn't you mean @info{ @columns }= ($1,$2,$3,$4,$5,$6) (a hash slice) instead of $info{ @columns }= ($1,$2,$3,$4,$5,$6)?

      Well done is better than well said. -- Benjamin Franklin

        Whoops! Thank you, fixed!

      Hi Corion, Thank you for your suggestion. However, I may have left out some criteria.
      1. A, B, C, D, ... are headers, and each needs to be paired with the value that follows it, e.g. A=undef, B=123-456 hh:mm, C=undef, D=789, ...
      2. This means that the values on the RHS of the headers will change over time, while the headers on the LHS stay constant.
      How would you suggest that I overcome this, please? Thank you. Kind Regards, Cornelius

        I don't understand the additional requirements from your text. Can you maybe post some (anonymized) more relevant input data?

        ... I might have left out some criteria ...

        Oh, of course you left out critical criteria! Answering these questions would not be nearly as much fun if we actually had accurate problem statements to begin with. I'm sure Corion appreciates the opportunity to waste... er, devote his or her time to providing a useful and insightful answer to a fundamentally mis-stated question.

        Many ++ to Corion for truly humble monkish patience, forbearance and generosity in dealing with a miserable sinner.

Re: How to split unique patterns
by hdb (Monsignor) on Jun 10, 2013 at 08:45 UTC

    First, your split creates a couple of empty fields because of the double colons. Second, the "hh" is still part of "123-456 hh"; is this intentional?

    If you want to stick to split, you could re-join hh with mm afterwards:

    use strict;
    use warnings;
    use Data::Dumper;

    my $line = "A::B:123-456 hh:mm:C::D:789";
    my @array = split /:/, $line;
    print Dumper \@array;

    # Replace elements 3 and 4 ("123-456 hh" and "mm") with the re-joined string
    splice @array, 3, 2, join( ":", @array[ 3..4 ] );
    print Dumper \@array;
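
    If re-joining by fixed index feels fragile, one alternative is to keep split but make the pattern skip the time separator. The following is only a sketch: it assumes the real time field is digits around a single colon (12:34 rather than the literal hh:mm), and that two purely numeric fields never sit directly next to each other. The name split_fields is made up for illustration.

```perl
use strict;
use warnings;

# split_fields is a hypothetical helper. It splits on runs of colons,
# except a colon that has a digit on both sides, which is assumed to be
# the hh:mm separator. Runs such as "::" count as one delimiter, so no
# empty fields appear for the sample line.
sub split_fields {
    my ($line) = @_;
    return split /(?<!\d):+|:+(?!\d)/, $line;
}

my @fields = split_fields('A::B:123-456 12:34:C::D:789');
print join('|', @fields), "\n";    # prints "A|B|123-456 12:34|C|D|789"
```

    The lookbehind and lookahead only inspect the neighbouring characters, so the digits themselves stay in the fields.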