Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

Re^4: Architecture design for full stack development.

by anonymized user 468275 (Curate)
on Jun 23, 2017 at 16:54 UTC ( [id://1193386]=note: print w/replies, xml ) Need Help??


in reply to Re^3: Architecture design for full stack development.
in thread Architecture design for full stack development.

I don't and won't have evidence unfortunately, only theory. The theory I am applying is that provided the values are contigious, the mapping used in the btree algorithm can use this to calculate the physical location of the page more quickly than if not and that page seeking will slow down proportionately according to how often it cannot calculate the page location based on the value and has to go walkies looking for it. OK I kind of have evidence because I have subjectively noticed a performance improvement in operating on large tables where the values of the PK are contiguous relative to those where they werent. In fact where PK values were contiguous Postgres crunched through 4 mil records in seconds but slept soundly for same record count but non contiguousPK values. Please don't ask me to submit test results -- I have a system to build!

One world, one people

  • Comment on Re^4: Architecture design for full stack development.

Replies are listed 'Best First'.
Re^5: Architecture design for full stack development.
by erix (Prior) on Jun 23, 2017 at 17:06 UTC

    CLUSTER [1] rewrites a table in the order of an index (to wit, the index you pass to the CLUSTER command). Access of singular values does not change; but access of values that are now (after CLUSTER) 'near' each other can become faster (because there is a higher chance they are already read). It has to do with /physical/ nearness (or proximity) of the table data, not of the values.

    At least, that is my theory :)

    Please don't ask me to submit test results -- I have a system to build!

    Fair enough :) But I wouldn't undertake any work on the basis of your theory - I don't think it is valid.

    I'll try to devise a test myself, and see if one of our theories can be falsified.

    [1] CLUSTER command (PostgreSQL docs)

      I am not sure about a cluster command, I was referring to the 'CLUSTERED' attribute available when creating or altering an index. A clustered index is one which maintains organisation of the underlying data in index order. Postgres allows more than one index to be clustered; I first learned about clustered indexes on Sybase where only one index per table could be clustered for obvious reasons, but how Postgres manages more than one clustered index on the same table is a separate mystery.

      One world, one people

        ... to the 'CLUSTERED' attribute available when creating or altering an index.

        Where did you find that clustered attribute? Are you perhaps using some GUI that implements the CLUSTER with the INDEX display? There /is/ a clusterable attribute for indexes, but that's obviously not the same thing (the clusterable attribute just flags whether an index can possibly be used (as the /only/ index!) with a CLUSTER command).

        ... clustered indexes on Sybase where only one index per table could be clustered

        It's no different in PostgreSQL: CLUSTER accepts only a single index as parameter.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1193386]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others having a coffee break in the Monastery: (4)
As of 2024-04-20 00:02 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found