PerlMonks  

Re: Sort this data

by extremely (Priest)
on Nov 19, 2000 at 11:31 UTC ( [id://42408]=note )


in reply to Sort this data

Well, I'd probably use splice to rip 4 items at a time off the list and then push hashes onto a new list
(tested now =):

    @bigarray = ... ; # your data
    my @LoH;
    # $j soaks up the fourth (null) field of each group
    while (my ($t, $a, $l, $j) = splice(@bigarray, 0, 4)) {
        push @LoH, { Title => $t, Author => $a, Link => $l };
    }
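For anyone who wants to run the idea end to end, here is a minimal sketch. The sample titles, authors and example.com links are made up, and the final sort by Title is only a guess at what the original question wanted. It assumes a blank entry separates the records.

```perl
use strict;
use warnings;

# Hypothetical sample data: title, author, link, then a blank separator.
my @bigarray = (
    'Zebra Tales',   'Author Z', 'http://example.com/z', '',
    'Apple Stories', 'Author A', 'http://example.com/a', '',
    'Mango Notes',   'Author M', 'http://example.com/m',
);

my @LoH;
# Take four items at a time from the front; the fourth (the blank
# separator) has no variable to land in and is simply discarded.
while (my ($title, $author, $link) = splice(@bigarray, 0, 4)) {
    push @LoH, { Title => $title, Author => $author, Link => $link };
}

# Sort the list of hashes by title (assumed to be the goal).
my @sorted = sort { $a->{Title} cmp $b->{Title} } @LoH;
print "$_->{Title} by $_->{Author}\n" for @sorted;
```

The loop variables are named $title, $author and $link rather than $a and $b, so they can never get in the way of the $a and $b that sort uses.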

--
$you = new YOU;
honk() if $you->love(perl)

Replies are listed 'Best First'.
Re: Re: Sort this data
by japhy (Canon) on Nov 19, 2000 at 19:35 UTC
    I'd guard against calling splice() continually from the beginning of an array. I'd much rather call it from the end of the array. "But Jeff," you'd say, "what if the order of the hash references matters? You couldn't do:

        while (my ($a,$b,$c) = splice(@data, -3)) {
            push @hashrefs, { a => $a, b => $b, c => $c };
            pop @data; # null field
        }

    because then @hashrefs would be in reverse!" Yes, that's absolutely right. And it might be inefficient to reverse() the array when we're done with it, and it's also inefficient to keep unshift()ing to the array. So what possible efficient solution could I come up with to combine the splice() speed with the insertion speed?

    Pre-extend the array! (Dun, dun, DUN!)

        my @hashrefs;
        $#hashrefs = int(@data / 4);
        my $i = $#hashrefs;
        while (@data and my ($a,$b,$c) = splice(@data, -3)) {
            $hashrefs[$i--] = { a => $a, b => $b, c => $c };
            pop @data;
        }

    What a sneaky trick I've done.
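
A self-contained sketch of the pre-extension trick, for checking that the order really does survive. The sample values are invented, and it assumes exactly three fields per record with one blank entry between records and none after the last one.

```perl
use strict;
use warnings;

# Hypothetical data: 3 records, blank separators between them (11 items).
my @data = ('t1', 'a1', 'l1', '', 't2', 'a2', 'l2', '', 't3', 'a3', 'l3');

my @hashrefs;
$#hashrefs = int(@data / 4);    # int(11 / 4) = 2, so indices 0..2 exist
my $i = $#hashrefs;

while (@data) {
    # Take the last three items, then fill the slots from the back.
    my ($title, $author, $link) = splice(@data, -3);
    $hashrefs[$i--] = { title => $title, author => $author, link => $link };
    pop @data;                  # drop the blank separator (no-op when empty)
}

print join(', ', map { $_->{title} } @hashrefs), "\n";
```

If the array length is not of the form 4n - 1, splice(@data, -3) can be asked to reach before the start of the array and will die, so real input would need validating first.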

    Update

    Oops, used -4 above, when I meant -3. Thanks, jcwren.

    Update

    splice() will wig out when the array is empty. The while loop has been adjusted.

    japhy -- Perl and Regex Hacker
      I would have to check, but I thought that splice was only slow if you have to move the array around. But I don't know whether pulling from the beginning of the array has been optimized. And (like jcwren) I am too lazy to benchmark it at the moment. In any case shift is fast and has the benefit of avoiding tracking indices. (Note that you had a bug in your code which jcwren caught? Yes, I am talking about that.)

          while (@big_array) {
              my $href;
              @$href{'title', 'author', 'link'} = map shift(@big_array), 1..4;
              push @structs, $href;
          }

      This might be faster than your sneaky trick. It might be slower. It certainly has fewer indices.

      Also the cost of reverse is overstated. You have just walked through a list of n things in Perl. You then want to reverse a list of n/4 things. What is the relative cost of those two operations? Right.

      Pick up good material on optimization, such as this sample chapter from Code Complete, or RE: Efficient Perl Programming. You will find that experienced people understand that getting maintainable code with good algorithms can result in better overall speed wins than trying to optimize every line.

      Now, about the splice: that matters. If it isn't optimized, then that is an order n operation done n times, which is n^2 and therefore likely to be slow. But one reverse at the end is an order n operation done once. Should the body of the loop be slightly more efficient from doing the slice rather than repeated manipulation of indices (something I would have to benchmark to have a feeling for either way), then your attempt to optimize would actually lose.

      To summarize, don't worry about slow operations, worry about bad algorithms. A slow operation inside a loop may matter. A slow operation outside a loop which speeds up the loop can go either way. An order n (or worse) operation inside a loop - that is the only one which should cause you to want to care up front about optimizing the structure of the code!

      EDIT
      I had messed up the final paragraph.

      Nice solution! But a couple of questions:

      It would be interesting to benchmark if using unshift() into the $hashrefs array would be more or less efficient than messing with $i. If nothing else, it would be two lines shorter, and more Perlish.

      And perhaps this is too early in the morning, but why are you splicing by -4, and then popping @data?

      It's Sunday morning where I am, and I'm too lazy to try to write a coherent benchmark...
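
Since nobody in the thread got around to it, here is one possible benchmark sketch using the core Benchmark module. The workload size, field values and candidate names are all made up for illustration, and the timings will depend on the perl version and the data, so treat the output as a starting point rather than a verdict.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical workload: three fields per record, a blank between records.
my $records = 2_000;
my @source  = map { ("title $_", "author $_", "link $_", '') } 1 .. $records;
pop @source;    # no trailing blank after the last record

# Every candidate works on its own copy, since each one empties the array.
my %candidate = (
    front_splice => sub {
        my @data = @source;
        my @out;
        while (my ($t, $au, $l) = splice(@data, 0, 4)) {
            push @out, { title => $t, author => $au, link => $l };
        }
        return \@out;
    },
    front_shift => sub {
        my @data = @source;
        my @out;
        while (@data) {
            my %rec;
            @rec{qw(title author link)} = map shift(@data), 1 .. 4;
            push @out, \%rec;
        }
        return \@out;
    },
    back_preextend => sub {
        my @data = @source;
        my @out;
        $#out = int(@data / 4);
        my $i = $#out;
        while (@data) {
            my ($t, $au, $l) = splice(@data, -3);
            $out[$i--] = { title => $t, author => $au, link => $l };
            pop @data;
        }
        return \@out;
    },
    back_unshift => sub {
        my @data = @source;
        my @out;
        while (@data) {
            my ($t, $au, $l) = splice(@data, -3);
            unshift @out, { title => $t, author => $au, link => $l };
            pop @data;
        }
        return \@out;
    },
    back_reverse => sub {
        my @data = @source;
        my @out;
        while (@data) {
            my ($t, $au, $l) = splice(@data, -3);
            push @out, { title => $t, author => $au, link => $l };
            pop @data;
        }
        return [ reverse @out ];
    },
);

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, \%candidate);
```

All five candidates should build the same list in the same order, so any difference in the table comes from how the list is taken apart and put together, not from what is produced.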

      --Chris

      e-mail jcwren
      What if there is a blank line at the END of the list? You might want to sniff for that and pop off blank lines first.

      Also, what if fields are allowed to be null? If so, you HAVE to read from the front...
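
A small sketch of the "pop off blank lines first" idea, with invented sample data. It assumes a real field is never empty; if null fields are allowed, this would eat an empty last field, which is exactly why reading from the front is the safe route in that case.

```perl
use strict;
use warnings;

# Hypothetical data with two stray blank entries after the last record.
my @data = ('t1', 'a1', 'l1', '', 't2', 'a2', 'l2', '', '');

# Pop trailing entries that are undefined or contain only whitespace.
pop @data while @data and (!defined $data[-1] or $data[-1] !~ /\S/);

print scalar(@data), " items remain\n";
```

After this, the array ends on a real field, so splicing from the end can proceed as in the replies above.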

      --
      $you = new YOU;
      honk() if $you->love(perl)

        Yes, and we've just seen that reading from the front is not all that slow an operation. Sorry, I didn't understand how the array was implemented internally. Now that I do, I see that splice()ing at the front of an array is a damn nice operation, and makes unshift()ing later less of a headache.

        japhy -- Perl and Regex Hacker
