Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer

XML and Databases and Excel, Oh My!

by jedikaiti (Hermit)
on May 12, 2010 at 16:44 UTC ( #839679=perlquestion: print w/replies, xml ) Need Help??
jedikaiti has asked for the wisdom of the Perl Monks concerning the following question:

OK, Monks, I'm on the sketching-out-ideas phase of my next project, and I thought I'd see what kind of brilliant insights you can offer.

The mission: I need to pull data out of database and pop it into an Excel spreadsheet for examination/modification by the end user. Then I'll need to get the data back out of the spreadsheet to update (& add new info to) the database. I may also need to compare the changed user data to what's currently in the database, and report back to the user exactly what would be added/changed.

Generally, what I was thinking was this: Get the data (specific details on that will have to wait, there's some debate as to exactly what database we'll be using, so that will be in a holding pattern until that gets sorted out), pop it into XML files, use Excel macro(s) to import the data into an Excel spreadsheet, let the user mess with it, then use Excel macro(s) again to export the data back into XML files, and then use Perl to handle sorting the data and popping it back into the database.

So, I had been thinking that I'd likely use XML::Parser to handle dealing with the XML, but then I hit CPAN and searched for XML modules. That is one long list. Since I have a bit of time to ponder my implementation options while TPTB decide on a database (long story), I thought I'd ask you guys for your input. Is XML::Parser a good choice? Should I be looking at XML::EasySQL instead? Another favorite module I should be looking at? Other ideas out of left field?


Swiss Army Nerd

Replies are listed 'Best First'.
Re: XML and Databases and Excel, Oh My!
by JavaFan (Canon) on May 12, 2010 at 16:49 UTC
    I'd use Spreadsheet::WriteExcel and Spreadsheet::ParseExcel, saving the XML roundtrip.

      See, I never would have thought of that, which is why I ask here! Thanks!

      Swiss Army Nerd
Re: XML and Databases and Excel, Oh My!
by davies (Parson) on May 12, 2010 at 17:08 UTC
    It may not be your preferred path, but Excel has a feature - Microsoft Query - that can read from and write to databases directly. On 2000-2003 it is accessed via Data | Import External Data. You can write the SQL directly, but Excel has a ghastly quirk. At about 240 (NOT 255 or anything sensible) characters, it truncates the SQL. The trick is that VBA demands the SQL to be an array of strings. This array can have as many elements as desired, but each element must be relatively short. I've been up to over 40K characters to prove a point. You may need to install Query on your target machines - it's on the Office CDs I've used, but frequently as an optional component, which means that lazy acceptance of Bill's defaults doesn't install it.


    John Davies

      That might be an option, I'll have to check into it. The one likely issue is a matter of how the end user wants to deal with the data, but I bet I can fix that with a macro or something. Thanks!

      Swiss Army Nerd
Re: XML and Databases and Excel, Oh My!
by graff (Chancellor) on May 12, 2010 at 21:46 UTC
    One other possibility to think about: What does Excel offer you that you wouldn't be able to get via a basic web form? Another way to state the question is: How might the use of a web form simplify and/or enhance the project?

    (Or, depending on your particular situation, maybe a basic Perl/Tk, Perl/Gtk or similar GUI app would be preferable to a web app.)

    The point would be to get the end users into direct contact with the database, rather than bothering with external intermediate files. You could even add a table to the database to log all updates, so that you keep a revision history and can roll back if you want.

    Depending on lots of factors, there are good reasons for both approaches (spreadsheet files vs. direct DB access).

      Actually, the main point of this project is to have the data in an easier-to-deal-with format for the end user. (A change to data in one place frequently requires changes to data in another table or two, etc, and having a process in place that will deal with the 6-digit ID number for you, rather than having to keep them all straight.) Another benefit is NOT changing the database directly, so that the likely changes can be compared to the current data before the changes are actually uploaded to the DB.

      A web form may still work, although one nice thing about Excel is that I do have a base worksheet already that I can build on.


      Swiss Army Nerd
Re: XML and Databases and Excel, Oh My!
by scorpio17 (Abbot) on May 12, 2010 at 17:08 UTC

    I recommend avoiding XML if there's any way possible. Whenever I've had to do anything like this, I dump the data from the database into a CSV file. Excel (and its OpenOffice equivalent) can easily import/export CSV files.

    But if you *have* to do XML stuff - XML::Parser can probably do whatever you need.

      Why CSV over XML?

      Swiss Army Nerd

        In most cases, XML is overly verbose. CSV is more compact, more humanly readable, has a greater 1-to-1 correspondence to rows of data in your database, and is a more 'native format' for a spreadsheet (rows/columns of the CSV correspond to rows/columns of the spreadsheet).

        XML has it's place, but IMO it's been greatly misused/abused. Say, for example, that you needed to generate a very complex report. Generating the data for the report was going to require multiple complex, multi-table joins on tables having millions of rows each (not the kind of thing you want to have to do very frequently). Let's also suppose that you want to generate your report in several formats: maybe HTML, for a web page; CSV for a spreadsheet; RTF for a MS-WORD document; and maybe PDF, just for grins and giggles. In a scenario like that, it might make sense to hit the database ONCE, store the results in XML, since it makes a good INTERMEDIATE format, and then run a series of converters in parallel (i.e., xml-to-html, xml-to-csv, xml-to-rtf, and xml-to-pdf).

        Too many times I've seen people using XML as a kind of flat-file, text-based, database language. That's not what it was designed for, and it's not particularly good at it.

        If you only have one target output (not many) and you can more easily generate that output (i.e., CSV), there's no need to go through the extra effort of generating XML as an intermediate format. Read the docs on cpan for Parse::CSV and compare to XML::Parser and think about if you really need to use XML or not.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://839679]
Approved by Corion
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others surveying the Monastery: (7)
As of 2018-07-20 14:48 GMT
Find Nodes?
    Voting Booth?
    It has been suggested to rename Perl 6 in order to boost its marketing potential. Which name would you prefer?

    Results (435 votes). Check out past polls.