http://www.perlmonks.org?node_id=1041752

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I have a doubt. How can I split the string below on the delimiter "','", but only every 4 times it occurs?

my $to_split = 'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z';

So that output will be like,

'A','B','C','D','E'     'F','G','H','I','J'       'K','L','M','N','O'       'P','Q','R','S','T'  'U','V','W','X','Y'      'Z'

Basically, I need to split on the delimiter "','", not every time it occurs, but only every 4 times it occurs.

Replies are listed 'Best First'.
Re: split with a delimiter, every 4 time it occurs.
by davido (Cardinal) on Jul 01, 2013 at 07:20 UTC

    Don't make it so hard. Don't craft some ugly regexp or confusing split. Just keep it simple; split on commas, grab five at a time, print them, move on.

    use List::MoreUtils qw(natatime);

    my $to_split = "'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'";
    my $it = natatime 5, split /,/, $to_split;
    while( my @vals = $it->() ) {
        local $" = ',';
        print "@vals\t";
    }
    print "\n";

    ...the output...

    'A','B','C','D','E' 'F','G','H','I','J' 'K','L','M','N','O' 'P','Q','R','S','T' 'U','V','W','X','Y' 'Z'

    Dave

      Thanks davido, that worked exactly as I wanted. But I have one more question: is this okay when dealing with very big data (in terms of speed and performance)?

        There are several things that could be improved upon if memory is an issue. In fact, if memory is an issue, split probably shouldn't play any part in your solution, since it returns a list, which consumes a lot more space than the original string. natatime will be reasonably memory efficient, but expanding that nice, compact string into a list will be memory-expensive.

        As for time, or computational efficiency (which is what you were asking about), it's an O(n) solution. Where you will get into trouble is where "n" grows large enough to send you into swap memory. There may be O(n) solutions with smaller per-iteration costs, but you won't ever turn this into O(log n) or O(1); you'll always be at least O(n).

        If I were designing it with memory in mind, I would probably use index and substr together to find every fifth comma, and to extract the portion of the string that resides between every fifth comma. I would iterate one small segment at a time so that I'm never making a copy of a large string. If that approach turned out not to be fast enough, I would re-implement the same simple algorithm using Inline::C.
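
        The index/substr idea described above might look roughly like the sketch below. It is not code from the thread: the chunk_iterator helper and its by-reference argument are assumptions made for illustration, and it presumes the fields themselves never contain a comma.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns a closure that yields
# one chunk of up to $n comma-separated fields per call, without ever
# expanding the whole string into a list.
sub chunk_iterator {
    my ($str_ref, $n) = @_;
    my $pos = 0;                        # offset where the next chunk starts
    my $len = length $$str_ref;
    return sub {
        return if $pos >= $len;         # nothing left to hand out
        my $end = $pos - 1;
        for (1 .. $n) {
            $end = index($$str_ref, ',', $end + 1);
            last if $end < 0;           # fewer than $n commas remain
        }
        $end = $len if $end < 0;        # final chunk runs to end of string
        my $chunk = substr($$str_ref, $pos, $end - $pos);
        $pos = $end + 1;                # step over the separating comma
        return $chunk;
    };
}

my $to_split = join ',', map { "'$_'" } 'A' .. 'Z';
my $next = chunk_iterator(\$to_split, 5);

my @chunks;
while (defined(my $chunk = $next->())) {
    push @chunks, $chunk;
}
print join("\t", @chunks), "\n";
```

        Each call scans only as far as the next fifth comma, so the only copies made are the small chunks themselves.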


        Dave

        I would suggest that it really depends on how “very” your “big” is. You might have to read a large file in arbitrarily-sized sections, which can get messy when you are doing this sort of thing. Usually it’s best to cross such bridges if-and-when you get there. What sort of data-volume and speed requirements might you be dealing with?

Re: split with a delimiter, every 4 time it occurs.
by davido (Cardinal) on Jul 01, 2013 at 06:37 UTC

    Are you the same person who asked Seeking help with split!? The questions are close enough that the same solutions will apply, with minor modifications, which are left as an exercise for the student.

    By the way, have you taken a look at the contents of $to_split? The way you're defining it, it will contain only 'A'.


    Dave

      Yes,

      Actually I tried the same solution by modifying it but it is not working! I am not good in Perl. Can you pls suggest me a way out.

      Here is my code,

      my $to_split = 'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z';
      my @ra = split m{ (?: \',\' [^\',\']+){50} \K \',\' }xms, $to_split ;
      foreach my $piece(@ra){
          print "\n\n" . $piece . "\n\n";
      }

      Can you pls suggest me where I got wrong?

      Thanks in advance.

        "I am not good in Perl."

        None of us were, when we first studied (formally or otherwise) how to write in Perl. So what have you done to "suggest (yourself) where (you) got wrong"?

        And, again re "not good in Perl," presumably, that's why this homework was assigned... to give you a problem you can solve by applying what you've learned previously, and with judicious study of the relevant documents.

        If you didn't program your executable by toggling in binary, it wasn't really programming!

      There is an edit!!!

      my $to_split = "'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'";

      Thanks

Re: split with a delimiter, every 4 time it occurs.
by hdb (Monsignor) on Jul 01, 2013 at 07:10 UTC

    Splitting can be done in other ways than using split. Here it is far easier to describe the pieces that should result from the splitting and to capture them with a (relatively simple) regex:

    use strict;
    use warnings;

    my $to_split = "'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'";
    my @pieces = $to_split =~ /('\w'(?:,'\w'){0,4})/g;
    $"="\n";
    print "@pieces\n";

    Or, if you insist on split, use the above as a delimiter, and remove the commas:

    use strict;
    use warnings;

    my $to_split = "'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'";
    my @pieces = grep {$_} split /('\w'(?:,'\w'){0,4}),?/, $to_split;
    $"="\n";
    print "@pieces\n";

      I want ',' (not just a comma) as the delimiter. I tried changing the comma to ',' but it is not working!

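
      If the goal is to treat the three-character sequence ',' as the delimiter, one possible approach (a sketch, not something posted in the thread) is to strip the outermost quotes, split on ',', take five fields at a time with splice, and put the quotes back when joining. It assumes no field contains that sequence; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $to_split = "'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'";

# Drop the leading and trailing quote so every field boundary is exactly ','
(my $inner = $to_split) =~ s/\A'|'\z//g;

# Split on the full three-character delimiter, not on a bare comma
my @fields = split /','/, $inner;

# Take five fields at a time and restore the quoting around each group
my @pieces;
while (my @group = splice @fields, 0, 5) {
    push @pieces, q{'} . join(q{','}, @group) . q{'};
}

print "$_\n" for @pieces;
```
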
Re: split with a delimiter, every 4 time it occurs.
by kcott (Archbishop) on Jul 01, 2013 at 06:44 UTC

    It looks like one of your classmates beat you to asking this homework question earlier today: Seeking help with split!

    $ perl -E 'say "Do your own work or learn nothing!";' > deaf.ears

    -- Ken

Re: split with a delimiter, every 4 time it occurs.
by poj (Abbot) on Jul 01, 2013 at 06:46 UTC
    You need to put double quotes around the input string; otherwise $to_split is just 'A'.
    my $to_split = " 'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'";
    If you use warnings you would see messages like
    Useless use of a constant ("B") in void context at scrap.pl line 4.
    Useless use of a constant ("C") in void context at scrap.pl line 4.
    Useless use of a constant ("D") in void context at scrap.pl line 4.
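
    A small sketch of why that happens, using only core Perl: assignment binds more tightly than the comma operator, so only the first constant is assigned and the rest are evaluated and discarded. The $wrong and $right names are made up for this illustration, and the 'void' warnings are switched off only to keep the demonstration quiet.

```perl
use strict;
use warnings;

my $wrong;
{
    no warnings 'void';          # silence "Useless use of a constant" for this demo
    $wrong = 'A', 'B', 'C';      # parsed as ($wrong = 'A'), 'B', 'C'
}

# One double-quoted string that contains the single quotes and commas
my $right = "'A','B','C'";

print "wrong: $wrong\n";
print "right: $right\n";
```
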
    poj