Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical
 
PerlMonks  

Xlxs Parsing Issue

by rahul_lfo (Initiate)
on Nov 07, 2013 at 11:21 UTC ( [id://1061548]=perlquestion: print w/replies, xml ) Need Help??

rahul_lfo has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks , I have run into a problem. I have an irregular xlxs file and I need to convert it to a proper row and column. The irregular excel file looks like this- I am posting the image to the xlxs Deleted the contact details. Now I need to format it and copy this data to a regular excel format. Please help on how to start with it.

Replies are listed 'Best First'.
Re: Xlxs Parsing Issue
by roboticus (Chancellor) on Nov 07, 2013 at 11:33 UTC

    rahul_lfo:

    Since .xlsx (I assume that's what you mean) is a set of zipped XML files, you can use XML::LibXML to parse the original file, and then use Spreadsheet::WriteExcel to generate the older style format. If you only care about the data, and you're not picky about the format, it can be pretty easy to go this route. Unfortunately, the more special features in your original that you want to retain, the more work you'll have to do digging out the relevant information.

    I'm assuming that you're not needing any of the unsupported features in Spreadsheet::WriteExcel, though. I didn't bother to look at your image link. Generally it's advised to put all the information here on PerlMonks because the information on other sites may go stale, and then this node will become less useful to others searching the site in the future.

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

        Corion:

        Yes, I should have mentioned that package. I tried using it a few times, but each time it was for such an overly-large excel file that I always ran out of RAM when parsing it. (Not the module's fault--more the fault of people trying to use Excel as a mass data-transport mechanism.)

        ...roboticus

        When your only tool is a hammer, all problems look like your thumb.

      By .xlxs I mean excel files.
        We humans can cope with it. For the computer, you should put the X in the end, not the S
        :-)
Re: Xlxs Parsing Issue
by marto (Cardinal) on Nov 07, 2013 at 11:47 UTC

    How is this file "irregular"? What do you mean by "regualr excel format"? Do you think these people listed in this screenshot will mind that you've published their email addresses and phone numbers? Do you think it looks like that image link worked as you expected? Start by reading the Excel file (read and understand Spreadsheet::Read), then write out your mystery format using Excel::Writer::XLSX.

Re: Xlxs Parsing Issue
by wjw (Priest) on Nov 10, 2013 at 01:15 UTC
    I also did not look at the image file.

    That being said, I have to wonder how much manipulation is needed here. If the idea is to go from .xlxs to the standard(old).xls format, why not just open it in something like LibreOffice and do a "Save As", and choose the version that you need? Last I looked there was some rather large selection of Excel formats, and I seem to recall they include the format you want. Once that is done, use Excel or LibreOffice to manipulate it as you see fit, then save.

    I could understand doing this programmaticaly if there were a large number of files that you needed to do it for, but you indicated "an irregular file", so I am assuming a single file.

    I find it hard to believe that you are going to get functionality out of Perl that you would not get out of the application(Excel) and its built in scripting without a lot more effort than is really required.

    I recognize that there may be other circumstances, but based on what you have written, I would be inclined to ask myself why I want to write a program...

    Note that I no longer use Excel, other than to receive the files and open them in an opensource application, and if I have to, save them back to the original format and return them to the sender, so the Excel side of my advice is suspect... .

    Good luck with your project!

    ...the majority is always wrong, and always the last to know about it...
    Insanity: Doing the same thing over and over again and expecting different results.
Re: Xlxs Parsing Issue
by sundialsvc4 (Abbot) on Nov 07, 2013 at 14:48 UTC

    I, for one, would seriously consider doing the work in Excel, using, say, a VB script within one or the other spreadsheet.   Let it open up a new Excel.Workbook object (perhaps a template with various named cell-ranges assigned for this purpose ...), and iterate through it to populate the data from one to the other.   It is usually a good idea to use named cell-ranges in the source data, too, and I would gamble that your source spreadsheet is already doing this.   (Code that refers to cells by their coordinates is fragile, not “future-proof.”)

    While Perl does, as already mentioned, provide support-packages for dealing with this file format, “Excel is fully-programmable, too, and Excel knows herself best-of-all.”   So, that’s how I would do it, have done it, and would frankly recommend doing it.

      On one hand, I agree with you and I personally prefer to use Win32::OLE to control Excel when trying to manipulate Excel files when on a Windows system that has Excel installed.

      However, I would urge caution about leveraging macros, which is what I think that you're referring to when you mention "VB script". To make a long story short, apparently there was a situation at my job some time prior to when I started where a macros in an Excel file got infected by a virus and was deleting "random" files on our file server whenever that macros was run. I don't have all of the details on what happened there. Also, there's a reason that Microsoft's latest versions of Office has macros disabled by default.

      Not disagreeing with you on anything, but just wanted to pass on a word of caution.

        That word of caution applies to any and every file on windows -- they're all attack vector -- every file is executable

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1061548]
Approved by sundialsvc4
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others learning in the Monastery: (3)
As of 2024-04-25 06:18 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found