PerlMonks  


The complexity of a SQL crosstab depends on the number of distinct values in the columns involved in the crosstab. The 600 lines I mentioned came from a query asking for COUNT, SUM, AVG, MIN, and MAX with row and column subtotals, which requires a UNION for each level of row header.

In a database with the same structure but with one million records, the query would not have been much longer, provided that the data is properly checked on input.
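To illustrate why the query length follows the distinct values rather than the record count, here is a minimal sketch in Python with the standard `sqlite3` module (not the DBI code from this thread). The `sales` table, its columns, and the `crosstab_sql` helper are hypothetical names made up for the example, and only SUM with a row total is generated, without the subtotal UNIONs.

```python
import sqlite3

def crosstab_sql(conn, table, row_col, col_col, value_col):
    """Return (sql, params) for a SUM crosstab: one output column per distinct col_col value."""
    # Identifiers are interpolated as-is, so this sketch assumes trusted names.
    distinct = [r[0] for r in conn.execute(
        f"SELECT DISTINCT {col_col} FROM {table} ORDER BY {col_col}")]
    items = [row_col]
    for i in range(len(distinct)):
        # One CASE expression per distinct value; the value itself is bound as a parameter.
        items.append(
            f"SUM(CASE WHEN {col_col} = ? THEN {value_col} ELSE 0 END) AS c{i}")
    items.append(f"SUM({value_col}) AS total")  # row total
    sql = ("SELECT " + ",\n  ".join(items)
           + f"\nFROM {table}\nGROUP BY {row_col}\nORDER BY {row_col}")
    return sql, distinct

# Hypothetical sample data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (dept TEXT, region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("dev", "north", 10), ("dev", "south", 20),
    ("ops", "north", 5), ("dev", "north", 1),
])

sql, params = crosstab_sql(conn, "sales", "dept", "region", "amount")
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

Adding more records with the same regions leaves the generated SQL unchanged; only a new distinct region adds a line to it.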

Of course, if you try to do a crosstab by person's name in a table of one million records, you are likely to run out of space, but OTOH crossing data by name wouldn't leave you in much better shape with any statistical tool.

About having 50-100 values in each of 4 dimensions: yes, it's true that you would get an unbearable number of combinations. But you'd get that complexity with any tool, and even if you manage to produce such a result, it is not readable. Both theoretical and practical limits need to be considered here. The main purpose of a crosstab is to give an overview of a situation, mostly something meant for human consumption. Nobody in their right mind would consider reading a document with 50,000 columns and 100,000 rows (provided that I could find the paper to print it!).
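The arithmetic behind that explosion can be sketched quickly. This example assumes the four dimensions are split two on the row header and two on the column header, and that every combination of values actually occurs, so the figures are upper bounds; `crosstab_shape` is a hypothetical helper.

```python
from math import prod

def crosstab_shape(row_cards, col_cards):
    """Upper bound on (rows, columns, cells) given the distinct-value counts per dimension."""
    rows = prod(row_cards)
    cols = prod(col_cards)
    return rows, cols, rows * cols

# 4 dimensions with 50 or 100 distinct values each, split 2 + 2 (an assumption)
for n in (50, 100):
    print(n, crosstab_shape([n, n], [n, n]))
```

With 50 values per dimension that is already 2,500 columns by 2,500 rows; with 100 values it is 10,000 by 10,000.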

Databases with statistical needs and data warehouses are designed in such a way that data can be grouped by some meaningful element. If the designers allow such an element to reach thousands of values, then it becomes useless for this kind of overview.

Anyway, consider that one side of the crosstab (the rows) can grow up to the limits of the system, so if one of your columns has a large set of distinct values you can always move it from the column header to the row header, keeping in mind that if you generate too many rows the result may not be valuable as a statistical report.
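One way to apply this rule mechanically is to count the distinct values of each candidate dimension and put the larger one on the rows. The sketch below uses Python's `sqlite3`; the `games` table, its `opening` and `result` columns, and the `choose_axes` helper are hypothetical names chosen for the example.

```python
import sqlite3

def choose_axes(conn, table, dims):
    """Return (row_dim, col_dim): the dimension with the most distinct values goes on rows."""
    # Identifiers are interpolated as-is, so this sketch assumes trusted names.
    card = {d: conn.execute(
        f"SELECT COUNT(DISTINCT {d}) FROM {table}").fetchone()[0] for d in dims}
    ordered = sorted(dims, key=lambda d: card[d])  # ascending by cardinality
    return ordered[-1], ordered[0]

# Hypothetical sample data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE games (opening TEXT, result TEXT)")
conn.executemany("INSERT INTO games VALUES (?, ?)", [
    ("B20", "1-0"), ("C50", "0-1"), ("D02", "1/2-1/2"),
    ("E60", "1-0"), ("B20", "0-1"),
])

row_dim, col_dim = choose_axes(conn, "games", ["result", "opening"])
print(row_dim, col_dim)
```

Here `opening` has four distinct values and `result` has three, so `opening` ends up as the row header (a plain GROUP BY) and `result` supplies the generated columns.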

I ran some tests on my database of chess games (2.3 million records) and I got meaningful results in decent time. I generated a few thousand columns, just for fun, but I would never want to be the one in charge of analyzing such a report!

 _  _ _  _  
(_|| | |(_|><
 _|   

In reply to Re: Re: SQL Crosstab, a hell of a DBI idiom by gmax
in thread SQL Crosstab, a hell of a DBI idiom by gmax
