
Arrays manipulation

by hotshot (Prior)
on May 01, 2003 at 06:47 UTC ( [id://254573]=perlquestion )

hotshot has asked for the wisdom of the Perl Monks concerning the following question:

Hi all!

I have an array whose entries are in the format:
my @array = ('nfs,7,rw', 'afp,12,rro', 'cifs,32,ro', 'dns,5,rw', );
I need the fastest way to create 3 arrays, one for each column in the original array, for example:
# working on the above array, I want to get:
arr1 = qw(nfs afp cifs dns);
arr2 = qw(7 12 32 5);
arr3 = qw(rw ro ro rw);
Anyone with a fast way?



Replies are listed 'Best First'.
Re: Arrays manipulation
by leriksen (Curate) on May 01, 2003 at 07:00 UTC
    Obvious and relatively fast: an array of arrays. The foreach could be written as a map as well, but that is not as clear.
    #!/usr/bin/perl -w
    use strict;
    use Data::Dumper;

    my @array = ('nfs,7,rw', 'afp,12,rro', 'cifs,32,ro', 'dns,5,rw', );
    my $cols = [];
    foreach my $row (0..$#array) {
        my @cols = split /,/, $array[$row];
        map {$cols->[$_]->[$row] = $cols[$_]} (0..$#cols);
    }
    print Dumper($cols);

    output is
    $VAR1 = [
              [ 'nfs', 'afp', 'cifs', 'dns' ],
              [ '7', '12', '32', '5' ],
              [ 'rw', 'rro', 'ro', 'rw' ]
            ];

      Easier to read and avoids the map in void context:

      #!/usr/bin/perl
      my @array = ( 'nfs,7,rw', 'afp,12,rro', 'cifs,32,ro', 'dns,5,rw', );
      my @cols;
      for my $row (@array) {
          my $i = 0;
          push @{$cols[$i++]}, $_ for split /,/, $row;
      }
      use Data::Dumper;
      print Dumper(\@cols);

      "My two cents aren't worth a dime.";
        And without an explicit counter variable:
        my @cols;
        for my $row (@array) {
            my @f = split /,/, $row;
            push @{$cols[$_]}, $f[$_] for 0 .. $#f;
        }
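The original question asked for three separate arrays rather than an array of arrays. A minimal sketch of that last step, assuming the column-building loop above; the names @arr1, @arr2 and @arr3 follow the question, everything else is illustrative:

```perl
use strict;
use warnings;

my @array = ('nfs,7,rw', 'afp,12,rro', 'cifs,32,ro', 'dns,5,rw');

# Build one array per column, as in the loop above.
my @cols;
for my $row (@array) {
    my @f = split /,/, $row;
    push @{ $cols[$_] }, $f[$_] for 0 .. $#f;
}

# Copy each column into its own named array (names taken from the question).
my @arr1 = @{ $cols[0] };
my @arr2 = @{ $cols[1] };
my @arr3 = @{ $cols[2] };

print "@arr1\n";   # nfs afp cifs dns
print "@arr2\n";   # 7 12 32 5
print "@arr3\n";   # rw rro ro rw
```

The copies cost a little extra time and memory; if references are acceptable, working with $cols[0], $cols[1] and $cols[2] directly avoids them.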

        Makeshifts last the longest.

Re: Arrays manipulation
by kabel (Chaplain) on May 01, 2003 at 07:59 UTC
    Instead of processing the elements of the array one after another, I create a single list out of all the elements. Then a mod-3 counter puts them in the appropriate slot. That is the (IMO) cool part distinguishing this solution from the one leriksen gave.
    use strict;
    use Data::Dumper;

    my @array = ('nfs,7,rw', 'afp,12,rro', 'cifs,32,ro', 'dns,5,rw', );
    my @splitted_up = ();
    my $cnt = 0;
    push @{ $splitted_up[($cnt ++) % 3] }, $_
        foreach (split (/,/, join (",",@array)));
    print (Dumper \@splitted_up);
    generated output is the same.
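The counter approach hard-codes three columns and quietly relies on every row having the same number of fields. A possible variation, assuming the first row is representative, derives the column count instead; $ncols and @first are names made up for this sketch:

```perl
use strict;
use warnings;

my @array = ('nfs,7,rw', 'afp,12,rro', 'cifs,32,ro', 'dns,5,rw');

# Assumption: every row has as many fields as the first one.
my @first = split /,/, $array[0];
my $ncols = @first;

# Flatten all rows into one list, then deal the fields out round-robin.
my @cols;
my $cnt = 0;
push @{ $cols[ $cnt++ % $ncols ] }, $_ for split /,/, join ',', @array;

print join(' ', @$_), "\n" for @cols;
```

If a row is short or long by even one field, every later field lands in the wrong column, so the per-row loops are the safer choice for ragged data.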
Re: Arrays manipulation
by UnderMine (Friar) on May 01, 2003 at 09:57 UTC
    You can also assign in 2d directly :-
    my $array2d = [];
    my @list = (split /,/,join (",",@array));
    $array2d->[$_%3][int($_/3)]=$list[$_] for (0..$#list);
    or even
    my @array2d = ();
    my $c=0;
    $array2d[$c%3][int($c++/3)]=$_ for (split /,/, join (",",@array));
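The index arithmetic in the direct assignment may be easier to follow when spelled out. This is only a sketch of the same idea with the two indices named; $col and $row are illustrative variable names:

```perl
use strict;
use warnings;

my @array = ('nfs,7,rw', 'afp,12,rro', 'cifs,32,ro', 'dns,5,rw');
my @list  = split /,/, join ',', @array;

my @array2d;
for my $i (0 .. $#list) {
    my $col = $i % 3;        # position of the field within its row
    my $row = int($i / 3);   # which original row the field came from
    $array2d[$col][$row] = $list[$i];
}

print "$array2d[0][1]\n";   # afp
print "$array2d[2][3]\n";   # rw
```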
    And let us compare with :-

    And the Winner is :-

    Devel::Timer Report -- Total time: 0.1663 secs
    Interval  Time    Percent
    ----------------------------------------------
    01 -> 02  0.0530  31.87%  V1 -> V2
    03 -> 04  0.0443  26.65%  V3 -> V4
    04 -> 05  0.0370  22.24%  V4 -> V4 end
    02 -> 03  0.0319  19.15%  V2 -> V3
    00 -> 01  0.0002   0.09%  INIT -> V1
    The second suggested code posted performed best on my hardware config, but always run speed tests on the target hardware, as the OS etc. can affect performance.
    push @{ $splitted_up[($cnt ++) % 3] }, $_ foreach (split (/,/, join (",",@array)));
    Hope this helps
      The second suggested code posted performed best on my hardware config, but always run speed tests on the target hardware, as the OS etc. can affect performance.

      So does the current load on the machine... which is one reason why a "benchmark" such as the one you supplied is essentially useless. Not one of the methods you tested ran for longer than 6 hundredths of a second. That's simply not adequate. You will need a much larger dataset before you'll get any performance data that is even remotely meaningful.
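For anyone who does want numbers, one way to get a longer-running comparison is the core Benchmark module on a bigger input. This is a rough sketch, not the test that was run above: the dataset is simply the four sample rows repeated, and the sub names by_row and by_counter are made up here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical larger dataset: the four sample rows repeated 2,500 times.
my @array = ('nfs,7,rw', 'afp,12,rro', 'cifs,32,ro', 'dns,5,rw') x 2_500;

# Row-by-row split, pushing each field onto its column.
sub by_row {
    my @cols;
    for my $row (@array) {
        my @f = split /,/, $row;
        push @{ $cols[$_] }, $f[$_] for 0 .. $#f;
    }
    return \@cols;
}

# Join everything, split once, and deal fields out with a mod-3 counter.
sub by_counter {
    my @cols;
    my $cnt = 0;
    push @{ $cols[ $cnt++ % 3 ] }, $_ for split /,/, join ',', @array;
    return \@cols;
}

# A negative count asks Benchmark to run each sub for at least
# that many CPU seconds instead of a fixed number of iterations.
cmpthese(-1, {
    by_row     => \&by_row,
    by_counter => \&by_counter,
});
```

The results still depend on the machine and its load, so treat the table it prints as a hint rather than a verdict.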

      Many around here are happy to give advice to those obsessed with the performance of their code. That advice will almost certainly include statements like: "consider how long it takes to write as well as how long it takes to run" and "if you were really interested in performance you probably wouldn't be using perl in the first place." The upshot is that micro-optimizations simply aren't worth it. You are usually better off saving your time (or the maintainer's) by writing clean, straightforward code that is easy to read. Often enough, that approach leads to efficient code as well. When you really need better performance, you'll know it.

      After all, consider that the code in question probably won't spend more time running in the next 5 years than you've already spent benchmarking it...

      "My two cents aren't worth a dime.";

        I agree that the period the benchmark was run for was not sufficient. For a more accurate estimate, the benchmark should take longer (say 1 minute plus) and be run at an appropriate time. In the past I have run such tests using a scheduled job (say hourly) over a week to give an indication of when the system best copes with it.

        All benchmarks are indicative and never give the whole story, and you are right to point that out. But when we have no other way of testing the code (i.e. use black box X or Y), they do give us hints as to which methods to investigate further.

        Benchmarking pseudo-random samples is also useful when the whole dataset is massive.

        Hope it Helps
