PerlMonks  

Excel File Dos2Unix not working

by ZWcarp (Beadle)
on Jul 12, 2011 at 10:39 UTC ( [id://913880]=perlquestion )

ZWcarp has asked for the wisdom of the Perl Monks concerning the following question:

Glorious Monks, I come before you humbled seeking wisdom.

A coworker sent me an Excel file that I need to do some simple parsing on before I can use it. I have run into issues with Excel files before, but I have usually gotten around them with a couple of dos2unix-style commands. I'm not sure what is different about this file, but my usual methods just aren't working. First I saved the Excel file as a tab-delimited file, and I've noticed that if I use, let's say,

cut -f1 Filename.txt

I only get the very first row. I first tried the command

perl -i -pe 'tr/\015/\n/d' Filename.txt

To try to remove any carriage returns and replace them with \n. Usually this works; however, this time, for whatever reason, I'm still having issues. I've tried od -tc on the file to look for any weird characters that might be screwing up my line reads. Does anyone have any ideas about what might be causing the issue?
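One detail worth checking here: tr/\015/\n/ turns every carriage return into a newline. That is fine for a CR-only file, but a CRLF file ends up with a blank line after every row. The following is only a minimal sketch of a conversion that handles CRLF, lone CR, and LF alike, plus a counter that summarizes what od -tc would show. The helper names normalize_eol and count_eol and the sample string are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: convert CRLF, lone CR, and LF line endings to "\n".
# CRLF is listed first so that it does not turn into two newlines.
sub normalize_eol {
    my ($text) = @_;
    $text =~ s/\015\012|\015|\012/\n/g;
    return $text;
}

# Hypothetical helper: count each style of line ending in a string,
# a rough substitute for reading through `od -tc` output by eye.
sub count_eol {
    my ($text) = @_;
    my %count = ( crlf => 0, cr => 0, lf => 0 );
    $count{crlf}++ while $text =~ /\015\012/g;
    $count{cr}++   while $text =~ /\015(?!\012)/g;
    $count{lf}++   while $text =~ /(?<!\015)\012/g;
    return \%count;
}

# Made-up sample: three rows, each ending in a different style.
my $sample = "a\tb\015\012c\td\015e\tf\012";
my $counts = count_eol($sample);
print "crlf=$counts->{crlf} cr=$counts->{cr} lf=$counts->{lf}\n";
print normalize_eol($sample);
```

As an in-place one-liner, the same idea might look like perl -i -pe 's/\015\012?/\n/g' Filename.txt, which treats a CR with or without a following LF as a single line ending.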

Update: The problem was due to color markup in the Excel file, which left hidden characters behind. I solved this by just using text edit to get rid of the "rich text". I'm not sure how the same thing would be done in Perl. Thanks all for your responses.
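On the question of doing the cleanup in Perl: the thread does not say which hidden characters were involved, so the following is only a sketch under the assumption that they were non-printable control bytes. If the file had actually been saved as RTF, it would contain textual markup such as {\rtf1 instead, and stripping bytes would not be enough. The helper names find_odd_bytes and strip_control_chars and the sample string are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: list the distinct bytes that are neither printable
# ASCII nor tab/LF/CR, formatted as hex, so you can see what is hiding.
# Bytes above 0x7E (for example UTF-8 sequences) are reported too.
sub find_odd_bytes {
    my ($text) = @_;
    my %seen;
    $seen{ sprintf( '0x%02X', ord($1) ) }++
        while $text =~ /([^\011\012\015\040-\176])/g;
    my @found = sort keys %seen;
    return @found;
}

# Hypothetical helper: delete ASCII control bytes except tab (\011),
# line feed (\012), and carriage return (\015). High bytes are left alone.
sub strip_control_chars {
    my ($text) = @_;
    $text =~ tr/\000-\010\013\014\016-\037\177//d;
    return $text;
}

# Made-up sample row with an escape byte and a NUL byte hidden in it.
my $sample = "gene\x1B\tscore\x00\n";
print "odd bytes: @{[ find_odd_bytes($sample) ]}\n";
print strip_control_chars($sample);
```

Reporting the odd bytes before deleting anything is a deliberate choice: it shows whether the file really contains stray control bytes or something else entirely, such as a different text encoding.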

Replies are listed 'Best First'.
Re: Excel File Dos2Unix not working
by Tux (Canon) on Jul 12, 2011 at 10:45 UTC

    When I read your post, I guess you mean "Exported data from Excel files", as .txt is something completely different.

    Perl has an excellent module to parse native Excel files, it is called Spreadsheet::ParseExcel. If the API is too difficult (for you), you could consider the wrapper module Spreadsheet::Read, which uses Spreadsheet::ParseExcel under the hood. With both you can be the one controlling what and how you deal with the data in the spreadsheet(s).

    Excel uses a binary format that is very portable across architectures, making it relatively easy to read those files on Windows as well as on AIX, HP-UX, MacOSX, etc etc.
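The Spreadsheet::ParseExcel route mentioned above can be sketched roughly as follows. This assumes the module is installed from CPAN and that the original workbook is available in the older binary .xls format under the hypothetical name Filename.xls. The script prints every sheet as tab-separated rows, which sidesteps the export step and its line-ending problems.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Spreadsheet::ParseExcel;    # CPAN module, not part of core Perl

# Assumption: the workbook path is given as the first argument,
# falling back to the hypothetical name Filename.xls.
my $file = shift @ARGV // 'Filename.xls';

my $parser   = Spreadsheet::ParseExcel->new();
my $workbook = $parser->parse($file);
die $parser->error(), "\n" unless defined $workbook;

for my $worksheet ( $workbook->worksheets() ) {
    my ( $row_min, $row_max ) = $worksheet->row_range();
    my ( $col_min, $col_max ) = $worksheet->col_range();

    for my $row ( $row_min .. $row_max ) {
        my @fields;
        for my $col ( $col_min .. $col_max ) {
            # get_cell returns undef for cells that were never filled in.
            my $cell = $worksheet->get_cell( $row, $col );
            push @fields, defined $cell ? $cell->value() : '';
        }
        print join( "\t", @fields ), "\n";
    }
}
```

Run as perl dump_xls.pl Filename.xls > Filename.txt, where dump_xls.pl is whatever name the script is saved under. The output then uses the line endings of the platform Perl runs on, so cut -f1 should see one row per line.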


    Enjoy, Have FUN! H.Merijn
Re: Excel File Dos2Unix not working
by choroba (Cardinal) on Jul 12, 2011 at 11:42 UTC
    Can you further specify the "issues" you are having?

Node Type: perlquestion [id://913880]
Approved by Corion