Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

A proper name for is_sorted function that can check more than just sorting order?

by Dallaylaen (Hermit)
on Dec 25, 2017 at 10:17 UTC ( #1206160=perlquestion: print w/replies, xml ) Need Help??
Dallaylaen has asked for the wisdom of the Perl Monks concerning the following question:

Hello dear esteemed monks,

I've come up with an is_sorted { $a ... $b } \@array, "Message"; testing condition. Soon enough I realized that not only total orders like $a ge $b or $a <= $b can be tested, but just anything regarding adjacent \@array elements:

is_sorted { $a->{end_date} eq $b->{start_date} }, \@reservations, "Sub +sequent reservations";

That's not even a partial order, but it is quite a meaningful testing condition.

How should is_sorted be named then? Or should I leave it as is and just add a comment about the arbitrariness of the ordering function?

  • Comment on A proper name for is_sorted function that can check more than just sorting order?
  • Download Code

Replies are listed 'Best First'.
Re: A proper name for is_sorted function that can check more than just sorting order?
by johngg (Abbot) on Dec 25, 2017 at 12:00 UTC

    Since you are testing some sort of logical order which is not necessarily sorted how about is_ordered()?

    Cheers,

    JohnGG

      The OP's issue is that it can be used for testing other things too (as they demonstrated), and the name should reflect that.

        Yes, you're quite right. I was thinking that is_ordered was better than is_sorted, because it would reflect some more general notion of order than just sorting order, but it is true that, for example, the $a->{end_date} eq $b->{start_date} example doesn't really have anything to do with order (even with a strongly customized idea of order).

        But then, it is difficult to find a general name for that. Perhaps something like match_property or something in that direction would be better.

      I was thinking also about something like that, is_ordered or some variation thereon, such as is_in_order, when I read the OP.
Re: A proper name for is_sorted function that can check more than just sorting order?
by aitap (Curate) on Dec 25, 2017 at 10:25 UTC

    I would try to place the words all and reduce in the name of the function so the users would get the semantics by references to other well-known standard library functions.

    Clearly, naming things and off-by-one errors are the three hardest things in programming.

      Or you could actually use all.

      all { $res[$_ - 1]{end_date} eq $res[$_]{start_date} } 1 .. $#res

      Or if you want to know which ones are "bad" so you can output a message for each of them:

      map { $res[$_ - 1] } grep { $res[$_ - 1]{end_date} ne $res[$_]{start_date} } 1..$#res

      Exactly right, the problem is broken down via reduce and then some characterization. Ordering can be ascending/descending, strict/non-strict, involve numeric/string compare, etc.

      However, the example given is a test for gaps or discontinuities rather than ordering. This, and the other topic relating to Testing, makes me think the concept OP is looking for might be invariant.

        In fact the "concept OP is looking for" is called dependent type theory. However, this isn't even remotely implementable in Perl. Possibly in Scala or Haskell, definitely in more advanced languages like Idris. Not sure about Perl6.

        So in the absence of "you cannot do it in a way that breaks invariants" types, a mixture of runtime/unit tests that can be fine-tuned on accuracy<------>speed scale makes for a poor but workable substitute.

        In my opinion, anyway, but I'm going to put it to the test the hard way.

Re: A proper name for is_sorted function that can check more than just sorting order?
by salva (Abbot) on Dec 26, 2017 at 10:14 UTC
    I think the proper mathematical name for that is monotonicity. So you could name your function is_monotonous or something alike.
      After having a closer look into all aspects discussed in the WP page, I'm afraid that my intuition wasn't too bad and "Monotonicity" is indeed not a good idea.

      Basically because it defines for all elements x ≤ y => f(x) ≤ f(y)

      Since the OP is only testing adjacent elements, this can only be true with a transitive relation inside the testing block.

      But this can't be guaranteed because the user is free to apply any code there.

      Cheers Rolf
      (addicted to the Perl Programming Language and ☆☆☆☆ :)
      Wikisyntax for the Monastery

      Excellent point, but from my intuition does monotonicity imply transitivity.

      That is a < b and b < c implies a < c and that's not necessarily the case with this construct.

      But since a quick look into the WP page doesn't reveal any mention of transitivity you might be right.

      Cheers Rolf
      (addicted to the Perl Programming Language and ☆☆☆☆ :)
      Wikisyntax for the Monastery

Re: A proper name for is_sorted function that can check more than just sorting order?
by 1nickt (Monsignor) on Dec 25, 2017 at 12:54 UTC

    Hi Dallaylaen,

    Well, it may be because I am grumpy as always at this time of year, but to me it seems that you are trying to make things too shiny.

    I prefer to use Lego bricks rather than modelling clay. As soon as you get your testing function just right, Murphy's Law says you will encounter data that doesn't fit. I would test your condition like this:

    use strict; use warnings; use Test::More; my @data = ( { id => 1, start => 2, end => 3 }, { id => 2, start => 3, end => 4 }, { id => 3, start => 4, end => 5 }, # fail { id => 42, start => 6, end => 7 }, { id => 666, start => 7, end => 8 }, { id => 999, start => 8, end => 9 }, ); for ( 0 .. $#data - 1 ) { is( $data[ $_ + 1 ]->{'start'}, $data[ $_ ]->{'end'}, "$data[ $_ ] +->{'id'} sequence" ); } done_testing; __END__
    Output:
    1206160.pl .. 1/? # Failed test '3 sequence' # at 1206160.pl line 18. # got: '6' # expected: '5' # Looks like you failed 1 test of 5. 1206160.pl .. Dubious, test returned 1 (wstat 256, 0x100) Failed 1/5 subtests Test Summary Report ------------------- 1206160.pl (Wstat: 256 Tests: 5 Failed: 1) Failed test: 3 Non-zero exit status: 1 Files=1, Tests=5, 0 wallclock secs ( 0.03 usr 0.00 sys + 0.04 cusr + 0.01 csys = 0.08 CPU) Result: FAIL

    Or maybe like this .... Edit: as ikegami revealed the below is buggy. (emits a warning if number of list elements is not even):

    se strict; use warnings; use Test::More; use List::Util 'pairfirst'; my @data = ( { id => 1, start => 2, end => 3 }, { id => 2, start => 3, end => 4 }, { id => 3, start => 4, end => 5 }, # fail { id => 42, start => 6, end => 7 }, { id => 666, start => 7, end => 8 }, { id => 999, start => 8, end => 9 }, ); my @failed = pairfirst { $a->{'end'} ne $b->{'start'} } @data; is( @failed, undef, 'sequencing' ) or diag sprintf 'ID %s and ID %s are not sequential', map { $_->{'id +'} } @failed; done_testing; __END__
    Output:
    prove 1206160.pl 1206160.pl .. 1/? # Failed test 'sequencing' # at 1206160.pl line 16. # got: '2' # expected: undef # ID 3 and ID 42 are not sequential # Looks like you failed 1 test of 1. 1206160.pl .. Dubious, test returned 1 (wstat 256, 0x100) Failed 1/1 subtests Test Summary Report ------------------- 1206160.pl (Wstat: 256 Tests: 1 Failed: 1) Failed test: 1 Non-zero exit status: 1 Files=1, Tests=1, 0 wallclock secs ( 0.02 usr 0.00 sys + 0.06 cusr + 0.00 csys = 0.08 CPU) Result: FAIL

    Hope this helps, and Happy Merry!


    The way forward always starts with a minimal test.

      Your second solution is buggy. It fails to find a non-matching record for the following:

      my @data = ( { id => 1, start => 2, end => 3 }, { id => 2, start => 3, end => 5 }, { id => 3, start => 4, end => 5 }, { id => 5, start => 5, end => 6 }, );

        Withdrawn, thanks.

        The way forward always starts with a minimal test.

      Good point. For complex checks (anything but simple ordering) division into smaller conditions is almost always better.

      This makes me think of another function that (1) packs all smaller tests into a subtest (so that it's "array is such-and-such" and not "ok 1, ok 2, ... ok 100500") and (2) provides access to adjacent array elements via $a and $b.

      Although possibly it should be just handed off to List::Util::foreach_adjacent (lame name off the top of my head) or smth.

Re: A proper name for is_sorted function that can check more than just sorting order?
by LanX (Bishop) on Dec 26, 2017 at 16:19 UTC
    Depends on the implementation.

    As far as I understand you are testing the condition for all pairs ( a=set[i], b=set[i+1] )

    That is you are testing all n-1 links / neighboring pairs, correct?

    This should be reflected by the name

    What about something like

    • is_chained_by
    • chain_does
    • is_linked_by
    • links_do
    • all_links
    • successors_do
    • neighbors_do ...
    • (hope_you_got_the_idea_by_now ;)
    ?

    Using the word "link" hints also to "linked lists" a data structure closely related to arrays.

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Wikisyntax for the Monastery

    updates

    Expanded list

Re: A proper name for is_sorted function that can check more than just sorting order?
by LanX (Bishop) on Dec 30, 2017 at 19:09 UTC

      I think I'm staying with is_sorted for simple checks like $a < $b, plus a separate subtest-based function for more complex conditions, for which I don't have an established name yet:

      reduce_ok { is $a->{end_date}, $b->{start_date}, "reservations are coupled"; ok !$b->{is_arrival}, "not expecting >1 arrivals"; } \@reservations;

      The current implementation of is_sorted will work like that, too, but that's a coincidence and should not be abused.

      And happy new 2018!

Re: A proper name for is_sorted function that can check more than just sorting order?
by Anonymous Monk on Dec 26, 2017 at 13:45 UTC
    There is a "stupid-simple" way to write this as a sub which returns True or False. Something like:
    $a = <file>; while ($b = <file>) { if (($a cmp $b) > 0) return false; $a = $b; } return true;
    There. Anyone can understand that in about ten seconds, and debug it and patch it forever. Go ahead and write as many "redundant" functions like this as you care to, because the day might will come when one of them has to be just a little different ... and you prepared for that, so what could be a big change in "clever" code is a small one instead, just as it should be. Your successors will thank you.

      Generally I prefer to use simple constructs, especially in tests. However, sometimes a well-defined wrapper is better. Something like Test::Warn:

      warnings_like { my_fucntion() } [ {carped => qr/deprecated.*our_function/} ], "Warning issued, alter +native suggested";

      This can be written with $SIG{__WARN__} and like, but why bother? The intention is clear from this code.

      My proposed "test that condition holds for all adjacent values in an array" looks pretty well-defined to me. And if it's insufficient, there's always subtest and a for loop...

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1206160]
Front-paged by haukex
help
Chatterbox?
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others browsing the Monastery: (6)
As of 2018-06-18 21:52 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    Should cpanminus be part of the standard Perl release?



    Results (111 votes). Check out past polls.

    Notices?