Iterating through an array using multiple loops and removing array elements

BiochemPhD has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Iterating through an array using multiple loops and removing array elements by frozenwithjoy (Priest) on Apr 24, 2014 at 04:20 UTC
I think the splice thing won't work unless you are going backwards through the array (or something) or using first_index from List::MoreUtils to choose what to splice, since removing entries will mess up the significance of the counter value. Why not keep it simple and just `push` on to an array of 'values to keep' rather than removing values you don't want? Totally untested example: `my @entries = ...; my $top_entry = shift @entries; my @keepers; for (@entries) { my $comparison = compare_sub( $top_entry, $_ ); if ( $comparison > $user_defined_value ) { push @keepers, $_; } }`
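A minimal sketch of the "going backwards" idea mentioned above, for anyone who does want to splice in place. The `compare_sub` here is a hypothetical stand-in (absolute difference), and `remove_similar` is a name invented for illustration; neither comes from the original question.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the thread's compare_sub(): absolute difference.
sub compare_sub {
    my ( $x, $y ) = @_;
    return abs( $x - $y );
}

# Remove, in place, every entry whose comparison against $top is less than
# or equal to $limit. Walking from the last index down to 0 means a splice
# only shifts elements that have already been visited.
sub remove_similar {
    my ( $top, $limit, $entries ) = @_;
    for my $i ( reverse 0 .. $#{$entries} ) {
        splice @{$entries}, $i, 1
            if compare_sub( $top, $entries->[$i] ) <= $limit;
    }
    return;
}

my @entries = ( 40, 12, 11, 10, 3 );
remove_similar( 13, 3, \@entries );
print "@entries\n";    # prints "40 3"
```

The index list is built once before the loop starts, so removing elements does not disturb it. The `push`-onto-keepers version is still simpler to read.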
Re^2: Iterating through an array using multiple loops and removing array elements by BiochemPhD (Novice) on Apr 24, 2014 at 04:25 UTC
I've thought about that but the problem is that after I take the zeroth element and compare it to everything else in the array (removing both the zeroth element and any that meet the comparison criteria), I want to then move on to the next remaining element and repeat the process over and over again until the entire list has been consumed.
Re^3: Iterating through an array using multiple loops and removing array elements by frozenwithjoy (Priest) on Apr 24, 2014 at 04:39 UTC
What about making a subroutine for the iterative comparison and have it recursively call itself on the kept values from the array unless some condition is true? `iterate(@entries); sub iterate { my ( $top_entry, @entries ) = @_; my @keepers; for (@entries) { my $comparison = compare_sub( $top_entry, $_ ); if ( $comparison > $user_defined_value ) { push @keepers, $_; } } iterate(@keepers) unless ...; }`
EDIT: A variation where you continue iterating or assign final result depending on some condition. It's hard to know the right approach from here without more info. `my @final_result; iterate(@entries); sub iterate { my ( $top_entry, @entries ) = @_; my @keepers; for (@entries) { my $comparison = compare_sub( $top_entry, $_ ); if ( $comparison > $user_defined_value ) { push @keepers, $_; } } if (...) { iterate(@keepers); } else { @final_result = ...; } }`
Re^4: Iterating through an array using multiple loops and removing array elements by BiochemPhD (Novice) on Apr 24, 2014 at 05:11 UTC
Re^5: Iterating through an array using multiple loops and removing array elements by frozenwithjoy (Priest) on Apr 24, 2014 at 05:18 UTC
Some notes below your chosen depth have not been shown here
Re: Iterating through an array using multiple loops and removing array elements by Tanktalus (Canon) on Apr 24, 2014 at 04:44 UTC
I have to admit, I don't get the code. As in, I don't really get what you're trying to accomplish. But some warning flags do erupt.
The first one is that when you splice an entry out, you keep checking if you get more $comparisons that are less than or equal to whatever value you're checking against. Are you intending to remove all the entries that are lower than the current one? If so, you probably intend to use grep: `@entries = grep { my $comparison = compare_sub($top_entry, $_); $comparison > $user_defined_value; } @entries; # inverse - we want to keep ones that match`
The next flag is that it doesn't look like you do anything. A compare_sub wouldn't, at least in my mind, do anything. It just compares. And your loop doesn't do anything else. I'm not sure if there's supposed to be a `do_something($???)` in there somewhere. But if you're just consuming everything without doing anything, a simple `@entries = ();` might be faster.
The other thing that comparison makes me think of is that you're sorting things somehow. In which case I'd recommend against your merge sort (you'd have to do a binary search to keep it fast), and skip straight to using sort with your compare_sub to order things. The fact you mention that the file is pre-sorted also indicates to me that this is important in some way, so I have to wonder if you're trying to keep it - but if so, there are huge pieces missing from your sample code with regards to sorting anything. Such as pushing the entries on to an output stack. Maybe that's the "print functions" that you removed? But, if so, you'd have to indicate where the print statement goes, and which element you're printing.
So, really, everything here is just a guess. More details might be required unless someone else can glean more from this.
Re^2: Iterating through an array using multiple loops and removing array elements by BiochemPhD (Novice) on Apr 24, 2014 at 05:06 UTC
My apologies. I realize there's a lot of context missing here. It was my intention to simplify the code for readability, rather than to encumber everyone with other aspects of the code that are not relevant. The script is designed to read entries from an input file that are sorted by decreasing abundance, take the top-most (most abundant) entry and group everything else in the file that is similar (but less abundant). It should then move on to the next most-abundant entry that hasn't been grouped and search the remaining ungrouped entries for similarity and group them together. This repeats until the array of entries is exhausted. I want to KEEP entries that don't meet the criteria and I want to REMOVE the ones that do, and then repeat until there's nothing left. When I call compare_sub it returns a value; if the value is less than or equal to a user-defined value, then the entry gets printed to output and spliced out of the array. (I removed the print function from the code.) I'm still quite new at this! Thanks for the help and I hope that clarified things a bit.
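The procedure described above (take the top entry, pull out everything similar to it, repeat on what is left) could be sketched roughly as follows. This is only an illustration under stated assumptions: `compare_sub` is a hypothetical absolute-difference stand-in, `group_entries` is an invented name, and the numbers are made-up sample data, not the poster's file.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real similarity measure.
sub compare_sub {
    my ( $x, $y ) = @_;
    return abs( $x - $y );
}

# Repeatedly take the first (most abundant) entry, collect every remaining
# entry within $limit of it, and carry on with the entries that were kept.
sub group_entries {
    my ( $limit, @entries ) = @_;
    my @groups;
    while (@entries) {
        my $top   = shift @entries;
        my @group = ($top);
        my @keepers;
        for my $entry (@entries) {
            if ( compare_sub( $top, $entry ) <= $limit ) {
                push @group, $entry;      # similar: goes into this group
            }
            else {
                push @keepers, $entry;    # not similar: stays for a later pass
            }
        }
        push @groups, \@group;
        @entries = @keepers;              # replace the list instead of splicing
    }
    return @groups;
}

my @groups = group_entries( 3, 42, 17, 13, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0 );
print "Group: @{$_}\n" for @groups;
```

Each pass builds a fresh `@keepers` list rather than splicing inside the `for` loop, so the array being iterated is never modified mid-loop, and the `while (@entries)` condition ends the loop once nothing is left.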
Re: Iterating through an array using multiple loops and removing array elements by kcott (Archbishop) on Apr 24, 2014 at 09:39 UTC
G'day BiochemPhD, Welcome to the monastery.
Firstly, modifying an array while you're looping through it (with `for`[`each`]) will cause problems. The documentation is quite clear on this. From "perlsyn - Foreach Loops": "If any part of LIST is an array, `foreach` will get very confused if you add or remove elements within the loop body, for example with `splice`. So don't do that."
From your various posts in this thread, I think this is fairly close to what you want: `#!/usr/bin/env perl -l` `use strict; use warnings; use List::Util qw{first}; my $min = 3; my @records = reverse(0 .. 10, 13, 17, 42); print "All records: @records"; my %deleted; for my $i (0 .. $#records) { my $top_index = first { ! $deleted{$_} } $i .. $#records; last unless defined $top_index; my $top = $records[$top_index]; my @group = ($top); for ($top_index + 1 .. $#records) { next if $deleted{$_}; if (compare_sub($top, $records[$_]) <= $min) { push @group, $records[$_]; ++$deleted{$_}; } else { last; } } ++$deleted{$top_index}; print "Group: @group"; } sub compare_sub { my ($x, $y) = @_; return abs($x - $y); }`
Output (one item per line): `All records: 42 17 13 10 9 8 7 6 5 4 3 2 1 0`, `Group: 42`, `Group: 17`, `Group: 13 10`, `Group: 9 8 7 6`, `Group: 5 4 3 2`, `Group: 1 0`
Obviously, I've had to dummy up input data and the `compare_sub()` routine; however, this does seem to match your (rather vague) description of "abundance".
As this solution doesn't actually modify `@records` at all, you may find some benefit in using the builtin module Tie::File: it doesn't load the file into memory (so that may be useful depending on the record size of your thousands of records) and it's less coding (than what I can only guess you're currently doing).
I appreciate this is your first post (and, to be honest, it's a lot better than many first posts). Please just note the various difficulties monks had and keep those in mind whenever you post next.
-- Ken
Re^2: Iterating through an array using multiple loops and removing array elements by BiochemPhD (Novice) on Apr 24, 2014 at 15:59 UTC
Modifying an array while looping with foreach = not good. What about while looping through it with while? Similarly, what about modifying a hash (using delete) while looping with for/foreach or while? From a quick skim through the doc it doesn't appear to be an issue. I'll definitely keep in mind the shortcomings of this post in the future! Context is especially important when TIMTOWTDI! Thanks for the help!
Re^3: Iterating through an array using multiple loops and removing array elements by kcott (Archbishop) on Apr 25, 2014 at 07:10 UTC
"What about while looping though it with while?" `for (@array)` iterates over the list of values in `@array`. Changing that list in mid-iteration often has problems. We actually get quite a few questions like "Why doesn't my `for` loop work?" that are due to such a problem. So, as the doco says, "don't do that". `while (@array)` involves no iteration; it's a simple condition which basically says "Enter the loop if `@array` has any elements". Unless there's some other method for exiting that loop, you would expect elements to be removed from `@array` so that the loop can eventually terminate. "Similarly, what about modifying a hash (using delete) while looping with for/foreach or while? From a quick skim through the doc it doesn't appear to be an issue." If you're writing `for (%hash)` or `while (%hash)`, that's probably a mistake; perhaps you meant something else. You'll need to provide some code to show the scenario(s) you're considering here. By the way, in case you didn't know, `for` and `foreach` are synonymous. Save yourself four keystrokes by writing `for` instead of `foreach`: the code will run the same whichever you choose. -- Ken	[reply] [d/l] [select]
Re: Iterating through an array using multiple loops and removing array elements by Laurent_R (Canon) on Apr 24, 2014 at 06:42 UTC
If I understood you correctly, you don't need nested loops to do what you want. And you also don't need to store your file in an array in the first place. You can just read your input file, store the data from your first line in an array or a hash; then you go to the next line, if it meets the criteria of what you have already stored, add to the existing hash entry, otherwise create a new hash entry, and so on. In other words, you need only one pass through your file to get all you need into the hash. At the end, print the hash content or do whatever you need with it.
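A rough sketch of this single-pass idea, under several assumptions: the input is already sorted by decreasing abundance, each line holds one numeric entry, `compare_sub` is a hypothetical absolute-difference stand-in, and the groups are kept in an array of array refs (rather than a hash) so that the first element of each group is its most abundant member. `add_to_groups` is an invented name, and an in-memory filehandle stands in for the real input file.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real similarity measure.
sub compare_sub {
    my ( $x, $y ) = @_;
    return abs( $x - $y );
}

# Add $entry to the first group whose representative (element 0) is within
# $limit of it; if no group matches, start a new group with $entry.
sub add_to_groups {
    my ( $groups, $limit, $entry ) = @_;
    for my $group ( @{$groups} ) {
        if ( compare_sub( $group->[0], $entry ) <= $limit ) {
            push @{$group}, $entry;
            return;
        }
    }
    push @{$groups}, [$entry];
    return;
}

# Made-up sample input, read line by line as a real file would be.
my $input = join "\n", 42, 17, 13, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0;
open my $fh, '<', \$input or die "Cannot open in-memory file: $!";

my @groups;
while ( my $line = <$fh> ) {
    chomp $line;
    next unless length $line;
    add_to_groups( \@groups, 3, $line );
}
close $fh;

print "Group: @{$_}\n" for @groups;
```

Only the groups are held in memory, never the whole file. The trade-off is that each new line is compared against every existing group representative, so the cost grows with the number of groups.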
Re: Iterating through an array using multiple loops and removing array elements by hdb (Monsignor) on Apr 24, 2014 at 07:36 UTC
I am not sure what you want to achieve as `@entries` will be empty after the `while` loop whatever you do inside... However, the following should do what you want (not tested as I could not think of sample data). `while( @entries ) { my $top = shift @entries; @entries = map { compare_sub( $top, $_ ) > $user_defined_value ? $_ : () } @entries; }`
UPDATE: Reading the whole thread more carefully I realize that my proposal is essentially the same (but more convoluted) as Tanktalus' grep above. Pls ignore...

