Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked
 
PerlMonks  

Re: Data visualisation.

by roboticus (Canon)
on Jan 02, 2014 at 13:08 UTC ( #1068950=note: print w/ replies, xml ) Need Help??


in reply to Data visualisation.

BrowserUk:

I've been thinking along those lines myself, and coded up a distance-matrix to coordinates program (current code in Re^2: Data visualisation. (updated)). However, the data doesn't seem to map into a planar representation. I'm thinking about (but haven't started) mixing in force-directed layout to build a planar mapping listing the stress forces along the edges and force vector from each node to approximate a 2D layout.

...roboticus

When your only tool is a hammer, all problems look like your thumb.


Comment on Re: Data visualisation.
Re^2: Data visualisation.
by LanX (Canon) on Jan 02, 2014 at 13:54 UTC
    Hi roboticus

    Did you check if the data conforms to the Triangle_inequality ?

    If not I doubt that any sane visualization is possible.

    update

    well at least with straight lines...

    ... but I fear with curved lines it might always be possible but not usable!

    Cheers Rolf

    ( addicted to the Perl Programming Language)

      LanX:

      No, I didn't think to do that. It's a trivial check, so when I get home, I'll try to remember to add that. (Unfortunately, the security restrictions at work make it extremely difficult to actually do any coding.)

      Update: I've now added the triangle inequality check in the version at Re: Data visualisation..

      Update 2: There was an error in my triangle inequality check (it didn't affect my original results, because my code didn't use any of the invalid triangles--by luck, not design). The error was that I only made one of the three checks required. The new check function is:

      sub chk_triangle_inequality { # Get the points my ($A, $B, $C) = @_; # Calc the distances my ($AB, $AC, $BC) = (dist_2($A, $B), dist_2($A, $C), dist_2($B, $ +C)); if ($AB > $AC + $BC) { print "ERROR: Triangle inequality fails for points ($A, $B, $C +) ($AB > $AC+$BC)\n"; } elsif ($AC > $AB + $BC) { print "ERROR: Triangle inequality fails for points ($A, $B, $C +) ($AC > $AB+$BC)\n"; } elsif ($BC > $AB + $AC) { print "ERROR: Triangle inequality fails for points ($A, $B, $C +) ($BC > $AB+$AC)\n"; } }

      I also added an exhaustive check at the end:

      print <<EOHDR; ##### # Exhaustive triangle inequality check ##### EOHDR for my $A (0 .. ($#LOCs-2) ) { for my $B ( ($A+1) .. ($#LOCs-1) ) { for my $C ( ($B+1) .. $#LOCs ) { chk_triangle_inequality($A, $B, $C); } } }

      The exhaustive check gives the following misses:

      ##### # Exhaustive triangle inequality check ##### ERROR: Triangle inequality fails for points (0, 5, 6) (150 > 80+63) ERROR: Triangle inequality fails for points (0, 6, 7) (134 > 80+29) ERROR: Triangle inequality fails for points (0, 6, 15) (332 > 80+246) ERROR: Triangle inequality fails for points (0, 6, 16) (121 > 80+29) ERROR: Triangle inequality fails for points (1, 2, 3) (661 > 390+228) ERROR: Triangle inequality fails for points (1, 2, 13) (466 > 390+74) ERROR: Triangle inequality fails for points (1, 3, 4) (661 > 227+383) ERROR: Triangle inequality fails for points (1, 3, 5) (661 > 488+120) ERROR: Triangle inequality fails for points (1, 3, 6) (661 > 572+77) ERROR: Triangle inequality fails for points (1, 3, 7) (661 > 530+105) ERROR: Triangle inequality fails for points (1, 3, 10) (661 > 282+324) ERROR: Triangle inequality fails for points (1, 3, 12) (661 > 567+27) ERROR: Triangle inequality fails for points (1, 3, 13) (661 > 466+182) ERROR: Triangle inequality fails for points (1, 3, 14) (661 > 420+239) ERROR: Triangle inequality fails for points (1, 3, 16) (661 > 518+84) ERROR: Triangle inequality fails for points (1, 5, 6) (572 > 488+63) ERROR: Triangle inequality fails for points (1, 5, 7) (530 > 488+34) ERROR: Triangle inequality fails for points (1, 6, 7) (572 > 530+29) ERROR: Triangle inequality fails for points (1, 6, 16) (572 > 518+29) ERROR: Triangle inequality fails for points (2, 3, 12) (228 > 191+27) ERROR: Triangle inequality fails for points (2, 3, 15) (472 > 228+237) ERROR: Triangle inequality fails for points (2, 3, 16) (228 > 142+84) ERROR: Triangle inequality fails for points (2, 5, 6) (196 > 112+63) ERROR: Triangle inequality fails for points (2, 5, 7) (154 > 112+34) ERROR: Triangle inequality fails for points (2, 6, 7) (196 > 154+29) ERROR: Triangle inequality fails for points (2, 6, 16) (196 > 142+29) ERROR: Triangle inequality fails for points (3, 4, 12) (383 > 27+346) ERROR: Triangle inequality fails for points (3, 4, 16) (383 > 84+297) ERROR: Triangle inequality fails for points (3, 5, 12) (120 > 27+83) ERROR: Triangle inequality fails for points (3, 5, 15) (364 > 120+237) ERROR: Triangle inequality fails for points (3, 5, 16) (120 > 84+35) ERROR: Triangle inequality fails for points (3, 6, 12) (77 > 27+47) ERROR: Triangle inequality fails for points (3, 6, 15) (332 > 77+237) ERROR: Triangle inequality fails for points (3, 7, 12) (105 > 27+68) ERROR: Triangle inequality fails for points (3, 7, 15) (349 > 105+237) ERROR: Triangle inequality fails for points (3, 9, 12) (476 > 27+439) ERROR: Triangle inequality fails for points (3, 9, 16) (476 > 84+390) ERROR: Triangle inequality fails for points (3, 10, 12) (324 > 27+287) ERROR: Triangle inequality fails for points (3, 10, 16) (324 > 84+238) ERROR: Triangle inequality fails for points (3, 12, 13) (182 > 27+145) ERROR: Triangle inequality fails for points (3, 12, 14) (239 > 27+202) ERROR: Triangle inequality fails for points (3, 12, 15) (289 > 27+237) ERROR: Triangle inequality fails for points (3, 12, 16) (84 > 27+55) ERROR: Triangle inequality fails for points (3, 13, 15) (426 > 182+237 +) ERROR: Triangle inequality fails for points (3, 13, 16) (182 > 84+96) ERROR: Triangle inequality fails for points (3, 14, 15) (483 > 239+237 +) ERROR: Triangle inequality fails for points (3, 14, 16) (239 > 84+153) ERROR: Triangle inequality fails for points (3, 15, 16) (336 > 237+84) ERROR: Triangle inequality fails for points (4, 5, 6) (351 > 267+63) ERROR: Triangle inequality fails for points (4, 5, 7) (309 > 267+34) ERROR: Triangle inequality fails for points (4, 6, 7) (351 > 309+29) ERROR: Triangle inequality fails for points (4, 6, 16) (351 > 297+29) ERROR: Triangle inequality fails for points (5, 6, 9) (444 > 63+360) ERROR: Triangle inequality fails for points (5, 6, 10) (292 > 63+208) ERROR: Triangle inequality fails for points (5, 6, 14) (207 > 63+123) ERROR: Triangle inequality fails for points (5, 7, 9) (402 > 34+360) ERROR: Triangle inequality fails for points (5, 7, 10) (250 > 34+208) ERROR: Triangle inequality fails for points (5, 7, 14) (165 > 34+123) ERROR: Triangle inequality fails for points (6, 7, 9) (444 > 29+402) ERROR: Triangle inequality fails for points (6, 7, 10) (292 > 29+250) ERROR: Triangle inequality fails for points (6, 7, 13) (150 > 29+108) ERROR: Triangle inequality fails for points (6, 7, 14) (207 > 29+165) ERROR: Triangle inequality fails for points (6, 9, 16) (444 > 29+390) ERROR: Triangle inequality fails for points (6, 10, 16) (292 > 29+238) ERROR: Triangle inequality fails for points (6, 13, 16) (150 > 29+96) ERROR: Triangle inequality fails for points (6, 14, 16) (207 > 29+153) ERROR: Triangle inequality fails for points (9, 13, 14) (336 > 240+57)

      ...roboticus

      When your only tool is a hammer, all problems look like your thumb.

        Hi roboticus

        > triangle inequality check

        Already a visual test reveals that projecting the data into 2D wouldn't make sense:

        M / \ 27 / \ 47 / \ D ----- G 77

        27 + 47 = 74 < 77

        A B C D E F G G: 80 572 196 *77 351 63 0 ... ... M: 70 567 191 *27 346 83 *47 ...

        Cheers Rolf

        ( addicted to the Perl Programming Language)

        updates

        • added table

        • improved wording
        Hi roboticus,

        > Update 2... The exhaustive check gives the following misses:

        wow ...indeed "a mess of bad triangles".

        TSP publications reach back to before the 1950s and IIRC the data was mostly taken from German or Swiss cities.

        17 cities is a rather small and surely old example and I think at that time they just copied the kilometers from a tourist guide or train schedule or similar, w/o taking much care.

        So my theory was that just reaching the city limits naturally causes an error for large communities like Berlin or Zürich.

        But I don't think it's worth to try to correct this now with something like a fuzzy surface in the visualization, better try starting from the beginning with real coordinates.

        At least I think questioning the quality in my first reply was justified, it's worth checking data before trying to work with it.

        Cheers Rolf

        ( addicted to the Perl Programming Language)

Re^2: Data visualisation.
by BrowserUk (Pope) on Jan 02, 2014 at 14:02 UTC
    coded up a distance-matrix to coordinates program

    Thanks. That's very helpful.

    However, the data doesn't seem to map into a planar representation.

    I assume you mean the OP data in the TSP thread. This data has nothing to do with that data. (Though it was the trigger for my current interest.)

    My first postulate: the maximum size of the plane required is: xMax = yMax = 1.23 * sqrt( max_distance2 / 2 ).


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      BrowserUk:

      I'm glad you found it useful. The problem was interesting, so I kept playing around with it.

      And yes, I was referring to the OP data in the TSP thread--I haven't had a chance to check out your distance data yet. In my initial replies, I read my browser tabs out of order and rather than noticing that this was a new thread, I thought I was working in that thread, so I probably made a few mistakes in that regard.

      Another possibility I've been thinking about is to add a Z coordinate to see if the data could be made consistent that way. (Again referring to OP TSP data.) But if the data represents miles travelled (i.e. along the road, rather than as the crow flies), adding a single degree of freedom is probably insufficient.

      ...roboticus

      When your only tool is a hammer, all problems look like your thumb.

        (Again referring to OP TSP data.) But if the data represents miles travelled (i.e. along the road, rather than as the crow flies), adding a single degree of freedom is probably insufficient.

        Indeed. Road routes are rarely straight lines. That's why I obtained a different* set of data for experimenting with.

        (*As well as smaller (quicker to process) and with a known solution.)


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1068950]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others rifling through the Monastery: (8)
As of 2014-12-26 09:07 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (169 votes), past polls