PerlMonks  

Re^2: database and deployment questions

by SpaceCowboy (Acolyte)
on Oct 19, 2021 at 21:58 UTC


in reply to Re: database and deployment questions
in thread Newbie question

Thank you @hippo for your response. I have data from more than one database and needed to bring them over to drop, rename, perform calculations, joins and export it as flat file. That does not seem possible with SQL alone; I need some kind of "pandas" environment. I am willing to learn and try other approaches, such as storing millions of rows and hundreds of columns in 2d array if possible. Any advice on books related to data table processing?

Re^3: database and deployment questions
by hippo (Bishop) on Oct 19, 2021 at 22:12 UTC

    I would still be looking to do all that within an RDBMS.

    storing millions of rows and hundreds of columns in 2d array if possible

    Sure it's possible - if you have enough memory. I just don't see why you would do that when a database is the right tool for the job. There is nothing you have said here to suggest to me that it isn't.


    🦛

      so how would one go about joining tables from multiple data sources without database links and without temp tables...

        joining tables from multiple data sources without database links and without temp tables

        Foreign Tables via Foreign Data Wrappers from SQL/MED. SQL/MED is part of the SQL Standard. These wrappers provide access to external data sources.

        SQL/MED on Wikipedia

        and without temp tables

        What an odd and arbitrary restriction. Why should that apply?


        🦛

        so how would one go about joining tables from multiple data sources without database links

        One wouldn't!

        If you have multiple data sources which are in different databases, then you need to link to them somehow. Data cannot flow from one place to another without some form of link...

        and without temp tables...

        Why would you want to???
        The creators of RDBMSs and languages give us the facilities to solve problems. It is up to us to use them in a sensible way. Choosing not to use one or more is a bit like trying to cross a river by paddling when a perfectly good boat is ready and available.

        I know of no modern RDBMS that does not support temporary tables.

Re^3: database and deployment questions - updated for DBIx
by bliako (Monsignor) on Oct 20, 2021 at 07:41 UTC

    Perl arrays and hashtables support nesting (e.g. hash-of-hash-of-array-of-hash etc.), mixed "data types" and slices (a convenient form of indexing). There is also PDL if your data is numeric. Additionally, there is Inline::Python if you want to inline python code you already have (untested by me).
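
    A minimal sketch of the nesting and slices mentioned above, using core Perl only. The table, column names and data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical data: a hash of arrays of hashes (one list of rows per table).
my %tables = (
    person => [
        { id => 1, name => 'Alice', city => 'Oslo'  },
        { id => 2, name => 'Bob',   city => 'Turin' },
    ],
);

# Nested access: second row of the "person" table.
my $second = $tables{person}[1];

# Hash slice: pull several columns out of one row at once.
my ($name, $city) = @{$second}{qw(name city)};

# Array slice: pick rows by index (here: last, then first).
my @picked = @{ $tables{person} }[-1, 0];

# A "2D array" is just an array of array references.
my @grid   = ([1, 2, 3], [4, 5, 6]);
my @column = map { $_->[1] } @grid;    # second column of every row

print join(',', $name, $city), "\n";
print join(',', map { $_->{id} } @picked), "\n";
print join(',', @column), "\n";
```

    Every row here is a separate hash, which is convenient but memory-hungry; for millions of rows, an array of array references (or PDL for numeric data) is likely to be the leaner choice.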

    storing millions of rows and hundreds of columns in 2d array if possible

    well that's illegal in some states and some of us Perl-ers still work on Spectrums, :)

    Update: There's also DBIx::Array which offers this:

    foreach my $row ($dbx->sqlarrayhash($sql, @bind)) {
        do_something($row->{"id"}, $row->{"column"});
    }

    And there's DBIx::Class as a long-term investment.

    bw, bliako

      Thank you! This seems like a start. While I would try to get as much as possible done the SQL way, there are some things like transformations, transposes and joins that I really wish I could do in Perl. Is there a package that does cross-tabs or pivot tables in Perl? I found Data::Pivot and would love to test out others when I get a chance...

        SpaceCowboy, a bit of googling led me to DBIx::SQLCrosstab and Data::Pivoter. For the record, these are the terms I searched for: cross tabulation cpan perl. CPAN is the repository of Perl modules and where we usually install modules from. On the left tab of each module's page you will notice the date last updated and also the result of testing. Both modules are old but have no failed tests. I would definitely give them a try based on that information.
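
        For small data sets a cross-tab can also be built by hand with nested hashes. This is only a sketch in core Perl, not how either module works internally; the field names (region, quarter, sales) and the data are assumptions made up for the example.

```perl
use strict;
use warnings;

# Hypothetical long-format rows: one (region, quarter, sales) record each.
my @rows = (
    { region => 'North', quarter => 'Q1', sales => 10 },
    { region => 'North', quarter => 'Q2', sales => 15 },
    { region => 'South', quarter => 'Q1', sales =>  7 },
    { region => 'North', quarter => 'Q1', sales =>  5 },
);

# Pivot: region becomes the row key, quarter the column key, sales are summed.
my (%pivot, %seen_quarter);
for my $r (@rows) {
    $pivot{ $r->{region} }{ $r->{quarter} } += $r->{sales};
    $seen_quarter{ $r->{quarter} } = 1;
}

# Print as a tab-separated table; missing cells are shown as 0.
my @quarters = sort keys %seen_quarter;
print join("\t", 'region', @quarters), "\n";
for my $region (sort keys %pivot) {
    print join("\t", $region, map { $pivot{$region}{$_} // 0 } @quarters), "\n";
}
```

        Swapping the two hash keys in the accumulation line gives the transposed table.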

        There are a few ways to install modules from CPAN to your local machine. The most basic is to use cpan, which ships with Perl. There's a more convenient tool though: cpanm, provided by App::cpanminus. In short: run cpan App::cpanminus, bearing in mind that you may need to configure cpan the first time you use it (pressing Enter to accept the defaults usually works fine), and then run cpanm DBIx::SQLCrosstab. Just mentioning these in case you are new to Perl.

        There are two more points to mention in case you are new to Perl. First, where are modules installed? You can install them system-wide for all users if you have system-admin privileges, or you can use your own personal library when you have none (see local::lib, but that's a whole new question). Secondly and most importantly, Perl is used by many applications on our computers (by computer I mean a unix-based OS) and by the OS itself. That's why there is what we call the system perl. By installing modules system-wide you risk affecting that system perl's behaviour and breaking your system (for example, something may rely on a certain module being at a certain version, but you upgraded it with your admin privileges; rare but possible). The best way to tackle this is either to use local::lib or to install another Perl that lives in parallel with the system perl. This is much easier than it sounds thanks to https://perlbrew.pl/, and it is highly recommended when you are dealing with unix systems. The system will be using its own Perl and you will be using your own (one or more, no problem), and the two will live happily together.
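
        To see which Perl you are actually running and where it looks for modules, a short script along these lines may help. It uses only core features; the module name being checked is just an example.

```perl
use strict;
use warnings;
use Config;

# Path of the perl binary running this script (system perl vs. your own).
print "perl binary : $^X\n";
print "perl version: $Config{version}\n";

# Directories searched for modules, in order. With local::lib or perlbrew
# active, directories under your home directory usually show up here.
print "module path : $_\n" for @INC;

# Check whether a module is available without dying if it is missing.
# The module name here is only an example.
my $module = 'List::Util';
my $ok = eval "require $module; 1";
print "$module is ", ($ok ? "installed" : "not installed"), "\n";
```

        Running it once with the system perl and once with a perlbrew-installed one should show two different binaries and two different module paths.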

        reference: http://www.cpan.org/modules/INSTALL.html

        bw, bliako

Re^3: database and deployment questions
by Bod (Curate) on Oct 19, 2021 at 23:18 UTC
    I have data from more than one database and needed to bring them over to drop, rename, perform calculations, joins and export it as flat file

    That sounds like a use for a Temporary Table in the RDBMS to me...

      can you please elaborate, I'd like to learn from you
        can you please elaborate

        Sure...
        Note that I am very much self-taught so may use incorrect terms and make assumptions - others will hopefully correct any errors I make.

        A Temporary Table is much like a regular table in the RDBMS except that it is created on the fly by your code. It is automatically dropped by the RDBMS when the database session ends. So it only exists within the current running instance of your code.

        So a common use for a Temporary Table is to gather together data from various different sources (usually other tables) so that SQL operations can be performed on all the data at once. For example, in my CRM I have a reminder view. This takes birthdays from the 'Person' table, reminders from the 'Note' table, anniversaries from the 'Anniversary' table, etc. It loads all this into a Temporary Table before sorting it into date order and showing just the first 20 events.

        This code snippet is ancient legacy code and I wouldn't write it quite like this now but it should give you the idea of pulling data from different places and then working on the combined result.

        In this example, all the data sources are within the same database schema. But they could be in different schemas within the same RDBMS or in different RDBMSs. They could be on different machines in different locations, potentially accessed over ODBC. They don't even have to be data sources from databases - they could come from anywhere.
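
        The general pattern of gathering rows from separate sources into a temporary table and then querying the combined result can be sketched with DBI as follows. This is not the legacy snippet referred to above: it assumes DBD::SQLite is installed, uses in-memory SQLite databases as stand-ins for the separate data sources, and all table and column names are made up.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is installed. Each in-memory database stands in
# for a separate data source; table and column names are hypothetical.
sub connect_mem {
    return DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
        { RaiseError => 1, AutoCommit => 1 });
}

my $people_db = connect_mem();
$people_db->do('CREATE TABLE person (name TEXT, birthday TEXT)');
$people_db->do(q{INSERT INTO person VALUES ('Alice', '2021-11-02')});

my $notes_db = connect_mem();
$notes_db->do('CREATE TABLE note (subject TEXT, remind_on TEXT)');
$notes_db->do(q{INSERT INTO note VALUES ('Call supplier', '2021-10-25')});

# The working connection holds the temporary table; it is dropped
# automatically when this connection closes.
my $work = connect_mem();
$work->do('CREATE TEMPORARY TABLE reminder (event_date TEXT, label TEXT)');
my $insert = $work->prepare('INSERT INTO reminder VALUES (?, ?)');

# Copy rows from each source into the temporary table.
for my $row (@{ $people_db->selectall_arrayref('SELECT birthday, name FROM person') }) {
    $insert->execute($row->[0], "Birthday: $row->[1]");
}
for my $row (@{ $notes_db->selectall_arrayref('SELECT remind_on, subject FROM note') }) {
    $insert->execute(@$row);
}

# Now ordinary SQL works on the combined data: sort and limit.
my $events = $work->selectall_arrayref(
    'SELECT event_date, label FROM reminder ORDER BY event_date LIMIT 20',
    { Slice => {} },
);
print "$_->{event_date}  $_->{label}\n" for @$events;
```

        With real databases only the connect strings would change; the copy-then-query steps stay the same, and the final SELECT could just as well feed a flat-file export.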
