PerlMonks  

Re: Data visualisation.

by davies (Vicar)
on Jan 03, 2014 at 19:48 UTC (#1069190)


in reply to Data visualisation.

OK, I have some working code. It's merely intended to demonstrate the algorithm, so I've done it in Excel without the clever dickery I usually use there. The spreadsheet, including the resulting chart, is at https://gitorious.org/visualisation/visualisation/source/108fdc930d57ea0aeda62fe8dc741bf95e40c28e:.

What I don't have is working data. If you consider your points B, C & D, B to D is 661, while BC is 390 and CD 228. So the total from B to D via C is 618, less than the BD distance. This involves an imaginative one-way system that I can't visualise. Either that, or the crow is flying into some strong headwinds. :-)
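To make the B, C, D problem concrete, here is a minimal Python sketch of a triangle-inequality check. It is not part of the spreadsheet; the function name is made up for illustration, and only the three distances come from the data quoted above.

```python
def violates_triangle_inequality(d1, d2, d3):
    """Return True if the longest side exceeds the sum of the other two."""
    longest = max(d1, d2, d3)
    return longest > (d1 + d2 + d3) - longest

# Distances quoted above: BC = 390, CD = 228, BD = 661.
bc, cd, bd = 390, 228, 661
via_c = bc + cd                                     # 618, shorter than going direct
impossible = violates_triangle_inequality(bc, cd, bd)
print(via_c, impossible)
```

Any triple that fails this check cannot be drawn as a flat triangle, whichever way the points are arranged.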

The code isn't intended to do anything clever like check data validity. If you run it with all your data, it will produce a visualisation as I will explain below. However, if run with just A, B, C and D, it will crash as it tries to find the square root of a negative number.
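As an illustration of where the negative square root comes from, here is a small Python sketch (not the VBA in the spreadsheet). It assumes, purely for the example, that B and D form the baseline and that C is the point being placed; the function and variable names are hypothetical.

```python
import math

def place_relative_to_baseline(baseline, d_south, d_north):
    """Baseline runs from (0, 0) north to (0, baseline).

    Returns the distance along the baseline and the squared east-west
    offset of a point d_south from the south end and d_north from the
    north end (law of cosines rearranged).
    """
    along = (d_south**2 + baseline**2 - d_north**2) / (2 * baseline)
    radicand = d_south**2 - along**2
    return along, radicand

# Assumed for illustration: B at the south end, D at the north end, C placed.
along, radicand = place_relative_to_baseline(661, 390, 228)
try:
    east = math.sqrt(radicand)
except ValueError:
    east = None   # radicand is negative, so there is no real offset
print(along, radicand, east)
```

With consistent distances (a 3-4-5 triangle on a baseline of 5, say) the radicand is positive and the square root gives the offset from the line.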

The algorithm works as follows. First it finds the largest single distance, in this case BP. Then it transforms the matrix so that this value is at the top left. It assumes a north-south line between the two. Then it adds the third point, A, using triangle calculations to work out how far to the east to put it. It then loops through the rest of the lines, calculating the X and Y co-ordinates in the same way, but before placing the point, it tests whether the fit with the third point (A) would be better if it were to the left or the right of the original line. This is how it gets around the problems described above - it doesn't use all the data, just the relationships with B, P and A.
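The steps above can be sketched in Python roughly as follows. This is a loose reconstruction from the description, not a translation of the macros: the names are invented, the third point is assumed to be simply the first one left over, and clamping a negative radicand to zero is an addition of this sketch so that it doesn't crash on inconsistent data.

```python
import math

def layout(dist):
    """dist is a symmetric matrix of distances; returns {index: (x, y)}."""
    n = len(dist)
    # Longest single distance becomes the north-south baseline.
    i, j = max(((a, b) for a in range(n) for b in range(a + 1, n)),
               key=lambda p: dist[p[0]][p[1]])
    base = dist[i][j]
    coords = {i: (0.0, 0.0), j: (0.0, base)}

    def candidate(k):
        # Triangle calculation: y along the baseline, x the offset from it.
        y = (dist[i][k]**2 + base**2 - dist[j][k]**2) / (2 * base)
        x = math.sqrt(max(0.0, dist[i][k]**2 - y**2))  # clamp: assumption
        return x, y

    rest = [k for k in range(n) if k not in (i, j)]
    if not rest:
        return coords
    third = rest[0]
    coords[third] = candidate(third)      # third point always goes east
    tx, ty = coords[third]
    for k in rest[1:]:
        x, y = candidate(k)
        # Pick the side whose distance to the third point fits better.
        err_east = abs(math.hypot(x - tx, y - ty) - dist[third][k])
        err_west = abs(math.hypot(-x - tx, y - ty) - dist[third][k])
        coords[k] = (x, y) if err_east <= err_west else (-x, y)
    return coords

# Made-up planar points, used only to check the sketch recovers them.
pts = [(0, 0), (0, 10), (3, 4), (-2, 7)]
dist = [[math.hypot(ax - bx, ay - by) for bx, by in pts] for ax, ay in pts]
coords = layout(dist)
print(coords)
```

Because only the distances to the two baseline points and the third point are used, inconsistencies elsewhere in the matrix never get a chance to upset the placement.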

Then it draws the graph. Because Excel can't do that very well and Messware won't add the necessary functionality (I haven't looked at 2010 or 2013), marking the points sensibly is a complicated business.

If you want to run my macros, delete the "Chart26" and "Co-ordinates" sheets first. I haven't checked spreadsheet or data integrity in any way.

Regards,

John Davies

Update: corrected URL that was pointing to an out of date commit.


Re^2: Data visualisation.
by BrowserUk (Pope) on Jan 04, 2014 at 09:00 UTC
    I've done it in Excel

    Thanks for that. Unfortunately, I don't have anything installed at the moment that lets me open Excel files.

    What I don't have is working data. If you consider your points B, C & D, B to D is 661, while BC is 390 and CD 228. So the total from B to D via C is 618, less than the BD distance. This involves an imaginative one-way system that I can't visualise. Either that, or the crow is flying into some strong headwinds. :-)

    The dataset comes from TSPLIB (gr_17.tsp), and they make no guarantees (nor even statements) about the plot-ability of the sets. That's a big part of the motivation for wanting to try to visualise them.

    The algorithm works as follows. First it finds the largest single distance, in this case BP. Then it transforms the matrix so that this value is at the top left. It assumes a north-south line between the two. Then it adds the third point, A, using triangle calculations to work out how far to the east to put it. It then loops through the rest of the lines, calculating the X and Y co-ordinates in the same way, but before placing the point, it tests whether the fit with the third point (A) would be better if it were to the left or the right of the original line. This is how it gets around the problems described above - it doesn't use all the data, just the relationships with B, P and A.

    That sounds very similar to the approach I used, except that I put the baseline (B-P) horizontally. This is what the plotting process looks like. And the code that produces that is here.

    I hope you had as much fun with the problem as I did :)


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      LibreOffice Calc can open the spreadsheet and show the code in the macros. Gnumeric opens the spreadsheet file, shows the sheets containing the input data, a chart (XY plot) of the calculated results and a table of results (coordinates). The labelling of the charts is wrong in both Calc and Gnumeric but the data points look correctly positioned.
