Re: Truly randomized keys() in perl 5.17 - a challenge for testing?
by tobyink (Canon) on Sep 30, 2013 at 06:50 UTC
Sort the keys in tests, or use tied hashes that preserve key order. Or test a round-trip (i.e. use your module to convert Perl data structure -> serialized -> Perl data structure, and use is_deeply to compare the input and output data structures).
use Moops; class Cow :rw { has name => (default => 'Ermintrude') }; say Cow->new->name
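The round-trip suggestion above can be sketched like this. Note that `hash_serialize` and `hash_deserialize` here are stand-ins for the module's real functions, included only so the example is self-contained:

```perl
use strict;
use warnings;
use Test::More tests => 1;

# Stand-in serializer/deserializer pair; substitute the module's real functions.
sub hash_serialize   { my $h = shift; join '&', map { "$_=$h->{$_}" } keys %$h }
sub hash_deserialize { my %h = map { split /=/, $_, 2 } split /&/, shift; \%h }

my $in  = { a => 'b', c => 'd', e => 'f' };
my $out = hash_deserialize( hash_serialize($in) );

# Key order of the serialized string is irrelevant: is_deeply compares structure.
is_deeply( $out, $in, 'round-trip preserves the data' );
```

Because `is_deeply` compares the data structures rather than the strings, the test passes regardless of what order `keys()` happened to produce.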
Re: Truly randomized keys() in perl 5.17 - a challenge for testing?
by hdb (Monsignor) on Sep 30, 2013 at 06:34 UTC
|
Sorting your keys in your tests would be one way, but it would not detect any bugs where you implicitly rely on sorted keys...
but it would not detect any bugs where you implicitly rely on sorted keys...
I hope saintmike isn't relying on implicit key ordering, especially for tests?
-QM
--
Quantum Mechanics: The dreams stuff is made of
Re: Truly randomized keys() in perl 5.17 - a challenge for testing?
by kcott (Archbishop) on Sep 30, 2013 at 10:54 UTC
Re: Truly randomized keys() in perl 5.17 - a challenge for testing?
by BrowserUk (Patriarch) on Sep 30, 2013 at 10:50 UTC
The problem with this new key randomisation code is not the non-determinacy it introduces -- that was already there long since, albeit in a lesser form.
The problem is that it is a pointless prophylactic that doesn't even come close to solving the "problem" that was used to justify its addition to the code-base.
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Actually, this change is a removal from the code base: it is a simplification of the existing mechanism. Rather than perturbing the hash only when an attack is detected, the salt is now always applied. To make that simplification safe, the salt needs to be different for each hash.
Re: Truly randomized keys() in perl 5.17 - a challenge for testing?
by marinersk (Priest) on Sep 30, 2013 at 10:21 UTC
# Generation code
my (@Keylist, %hash);
my $newkey = generateRandomHashKey();
# Store keys in their original generation order
push @Keylist, $newkey;
$hash{$newkey} = generateData();
# ...
# Testing code
foreach my $testkey (@Keylist)
{
    # Perform test against $hash{$testkey}
}
Re: Truly randomized keys() in perl 5.17 - a challenge for testing?
by ikegami (Patriarch) on Oct 01, 2013 at 13:47 UTC
which means that even within the same process, calling keys() twice on the same hash will result in a different key order.
That's not what it means at all. For a given hash, multiple calls to keys (and values) are still guaranteed to return the same order if there has been no change to the hash, and the order has always been subject to change after hash modifications.
Difference one: The order is more likely to change on hash modification.
Difference two: In a given interpreter, if you built two hashes using identical insert and delete steps, you used to get the same key orderings. This is not always the case now.
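Both points can be demonstrated directly. The first print below is guaranteed stable; whether the second reports a mismatch on any given run depends on the perl version and the per-hash seeds:

```perl
use strict;
use warnings;

my %a = map { $_ => 1 } 'aa' .. 'az';
my %b = map { $_ => 1 } 'aa' .. 'az';   # built with identical steps

# Guaranteed: repeated keys() on an unmodified hash returns the same order.
my $first  = join ',', keys %a;
my $second = join ',', keys %a;
print $first eq $second ? "same hash: stable\n" : "same hash: UNSTABLE?!\n";

# Not guaranteed since 5.17: two identically-built hashes may order differently.
print join(',', keys %a) eq join(',', keys %b)
    ? "two hashes: happened to match\n"
    : "two hashes: different order\n";
```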
You are mistaken. That file never calls keys twice on the same hash. It calls keys on two different hashes (containing the same data). That code has been buggy since 5.8.1; the bug is just more likely to manifest now.
Re: Truly randomized keys() in perl 5.17 - a challenge for testing?
by saintmike (Vicar) on Sep 30, 2013 at 16:35 UTC
Let me clarify: The module referenced in the original posting has a function hash_serialize() that takes a reference to a hash like { a => "b", c => "d" } and turns it into something like "a=b&c=d" or "c=d&a=b", depending on what keys() returns underneath:
https://github.com/mschilli/php-httpbuildquery-perl/blob/master/HTTPBuildQuery.pm#L63
Now how am I supposed to test that the outcome is what I'd expect? With two hash elements, you could argue that I could generate all permutations of possible result strings and check whether one of them matches the one I got by running the function, but with a thousand entries this becomes unwieldy.
The general problem is this: I have an unpredictable function (keys()) and its result gets processed, and the processed result needs to get checked against an expected outcome.
Unless keys() can be switched to a test mode to produce predictable results (doesn't need to be sorted, just give me something predictable, like before perl 5.17), what are the options?
By the way, some people have suggested using "sort keys" in my algorithm every time, but adding extra complexity (sort) at runtime just to be able to test perfectly fine code is just plain wrong.
Unless keys() can be switched to a test mode to produce predictable results (doesn't need to be sorted, just give me something predictable, like before perl 5.17), what are the options?
To test hash_serialize():
I would deserialize it back into memory and compare it with the original hash (using something based on "sort keys", or cmp_deeply).
However, this violates some ideas of unit testing (while it is fine for integration testing).
If that worries you: test a 1-element hash and a 2-element hash (yes, all 2 permutations), and stop. Do the rest of the testing via deserialization.
To test other code, which uses hash_serialize():
mock (fake) hash_serialize().
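The 1- and 2-element suggestion above can be spelled out as follows; `hash_serialize` is a stand-in for the module's real function so the example runs on its own:

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Stand-in for the module's serializer.
sub hash_serialize { my $h = shift; join '&', map { "$_=$h->{$_}" } keys %$h }

# One element: there is only one possible result string.
is( hash_serialize( { a => 'b' } ), 'a=b', 'one element' );

# Two elements: exactly two possible orderings, so enumerate both.
my $got = hash_serialize( { a => 'b', c => 'd' } );
ok( $got eq 'a=b&c=d' || $got eq 'c=d&a=b', 'two elements: one of two permutations' );
```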
$result = join '&', sort split '&', $result;
just for your testing. This seems simple enough that no new bugs are introduced, and it does not require a sort in your production code.
While de-serialization could be a solution as well, one has to be very careful as it adds additional complexity. For example, if your serialization function returns "a=b&c=d&a=b" because of some bug, it could easily be "fixed" by a de-serialization procedure:
my %hash = map { split '=', $_ } split '&', $result;
Clearly, there are ways around this, such as also testing the length of the string. But one has to spend extra time on each application, and the additional complexity cannot be avoided.
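Combining the sort-based normalization with the length check might look like this; `hash_serialize` again stands in for the module's real function:

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Stand-in for the module's serializer.
sub hash_serialize { my $h = shift; join '&', map { "$_=$h->{$_}" } keys %$h }

my $result   = hash_serialize( { a => 'b', c => 'd' } );
my $expected = 'a=b&c=d';

# The length check catches duplicated pairs such as "a=b&c=d&a=b" ...
is( length $result, length $expected, 'no spurious extra pairs' );

# ... and sorting the pairs makes the comparison order-independent.
is( join( '&', sort split /&/, $result ),
    join( '&', sort split /&/, $expected ),
    'same pairs regardless of key order' );
```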
I meant "reproducible", not "predictable" here.
Another possibility in your particular case is to sort the keys. The performance penalty is small, and generating different URLs each time is the bigger problem: it can cause caching issues and other trouble.
Unless keys() can be switched to a test mode to produce predictable results ...
Several hours before you wrote that, someone else told you what the relevant environment variables are.
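For reference, on 5.18 the relevant variables are `PERL_HASH_SEED` and `PERL_PERTURB_KEYS` (documented in perlrun). Setting both as below should make key order repeatable across runs; this is meant for debugging and diagnosis, not for shipping tests that depend on it:

```shell
# Disable the random seed and key-order perturbation (5.18+, debugging only).
PERL_HASH_SEED=0 PERL_PERTURB_KEYS=0 perl -e 'print join ",", keys %{{ a=>1, b=>2, c=>3 }}'
echo
# A second run with the same settings prints the keys in the same order.
PERL_HASH_SEED=0 PERL_PERTURB_KEYS=0 perl -e 'print join ",", keys %{{ a=>1, b=>2, c=>3 }}'
echo
```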