Excel File Dos2Unix not working

by ZWcarp (Beadle)
on Jul 12, 2011 at 10:39 UTC ( #913880=perlquestion )
ZWcarp has asked for the wisdom of the Perl Monks concerning the following question:

Glorious Monks, I come before you humbled seeking wisdom.

A coworker sent me an Excel file that I need to do some simple parsing on before I can use it. I have run into issues with Excel files before, but I usually get around them with a couple of dos2unix-style commands. I'm not sure what is different about this file, but my usual methods just aren't working. First I saved the Excel file as a tab-delimited text file, and I've noticed that if I use, let's say,

cut -f1 Filename.txt

I only get the very first row. I first tried the command

perl -i -pe 'tr/\015/\n/d' Filename.txt

to remove any carriage returns and replace them with \n. Usually this works; however, this time, for whatever reason, I'm still having issues. I've tried od -tc on the file to look for any weird characters that might be screwing up my line reads. Does anyone have any ideas about what might be causing the issue?
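For anyone hitting the same symptom, here is a minimal sketch of how the line endings could be checked and normalized in Perl rather than guessed at. The sub names `count_line_endings` and `normalize_line_endings` are made up for this illustration, and the sketch assumes the file is small enough to read into memory. Unlike the `tr` one-liner above, the substitution treats a CRLF pair as a single line break, so a DOS-style file does not end up with doubled newlines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the three kinds of line terminator in a string of raw bytes.
# (Hypothetical helper, named for this example only.)
sub count_line_endings {
    my ($data) = @_;
    my %count;
    $count{crlf} = () = $data =~ /\015\012/g;        # DOS/Windows
    $count{cr}   = () = $data =~ /\015(?!\012)/g;    # CR not followed by LF
    $count{lf}   = () = $data =~ /(?<!\015)\012/g;   # LF not preceded by CR
    return %count;
}

# Turn every CRLF pair and every lone CR into a single LF.
sub normalize_line_endings {
    my ($data) = @_;
    $data =~ s/\015\012?/\012/g;
    return $data;
}

# Usage: perl fix_eol.pl Filename.txt > Fixed.txt
# Counts go to STDERR, the normalized text to STDOUT.
if (@ARGV) {
    open my $fh, '<:raw', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my $data = do { local $/; <$fh> };
    close $fh;

    my %n = count_line_endings($data);
    print STDERR "CRLF: $n{crlf}  lone CR: $n{cr}  lone LF: $n{lf}\n";

    binmode STDOUT;
    print normalize_line_endings($data);
}
```

If the counts show only lone CRs, the whole file is a single line as far as `cut` is concerned, which would match the symptom described above.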

Update: The problem was due to color markup in the Excel file, which left hidden characters behind. I solved this by using text edit to get rid of the "rich text". I'm not sure how the same thing would be done in Perl. Thanks, all, for your responses.
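On the open question of doing this in Perl: the post does not say which hidden characters were left behind, so the following is only a sketch under the assumption that they were ASCII control characters. It reports which control characters occur and strips them, keeping tab, LF and CR so that fields and line breaks survive. The sub names `control_char_counts` and `strip_control_chars` are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count each ASCII control character found, keyed by a printable
# label such as \x1B. Tab, LF and CR are deliberately not counted.
sub control_char_counts {
    my ($text) = @_;
    my %seen;
    while ( $text =~ /([\x00-\x08\x0B\x0C\x0E-\x1F\x7F])/g ) {
        my $label = sprintf '\\x%02X', ord $1;
        $seen{$label}++;
    }
    return %seen;
}

# Return a copy of the text with those control characters deleted.
sub strip_control_chars {
    my ($text) = @_;
    ( my $clean = $text ) =~ tr/\x00-\x08\x0B\x0C\x0E-\x1F\x7F//d;
    return $clean;
}

# Usage: perl strip_ctrl.pl Filename.txt > Clean.txt
# The report goes to STDERR, the cleaned text to STDOUT.
if (@ARGV) {
    open my $fh, '<:raw', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my $data = do { local $/; <$fh> };
    close $fh;

    my %seen = control_char_counts($data);
    print STDERR "$_ occurs $seen{$_} time(s)\n" for sort keys %seen;

    binmode STDOUT;
    print strip_control_chars($data);
}
```

If the report comes back empty, the hidden characters are something else (for example multi-byte sequences), and this approach would need adjusting to whatever `od -tc` actually shows.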

Re: Excel File Dos2Unix not working
by Tux (Monsignor) on Jul 12, 2011 at 10:45 UTC

    Reading your post, I guess you mean "data exported from Excel files", as a .txt file is something completely different from a native Excel file.

    Perl has an excellent module for parsing native Excel files: Spreadsheet::ParseExcel. If the API is too difficult (for you), you could consider the wrapper module Spreadsheet::Read, which uses Spreadsheet::ParseExcel under the hood. With both, you are the one controlling what you do with the data in the spreadsheet(s) and how.
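As a rough illustration of that suggestion, here is a minimal sketch that reads a native .xls workbook with Spreadsheet::ParseExcel and prints each row as a tab-separated line, skipping the text export step entirely. It assumes the module is installed from CPAN and that the file is in the older binary .xls format (the module does not read .xlsx). The sub names `row_to_tsv` and `dump_xls_as_tsv` are hypothetical helpers for this example, not part of the module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Join one row of cell values into a single tab-separated line.
# Undefined (empty) cells become empty strings; embedded tabs and line
# breaks are replaced by a space so they cannot split a field or a row.
sub row_to_tsv {
    my @values = @_;
    return join "\t", map {
        my $v = defined $_ ? $_ : '';
        $v =~ s/[\t\015\012]+/ /g;
        $v;
    } @values;
}

# Print every worksheet of an .xls file as tab-separated text.
sub dump_xls_as_tsv {
    my ($path) = @_;
    require Spreadsheet::ParseExcel;    # CPAN module, not core Perl

    my $parser   = Spreadsheet::ParseExcel->new();
    my $workbook = $parser->parse($path);
    die $parser->error(), ".\n" unless defined $workbook;

    for my $worksheet ( $workbook->worksheets() ) {
        my ( $row_min, $row_max ) = $worksheet->row_range();
        my ( $col_min, $col_max ) = $worksheet->col_range();
        for my $row ( $row_min .. $row_max ) {
            my @values = map {
                my $cell = $worksheet->get_cell( $row, $_ );
                $cell ? $cell->value() : undef;
            } $col_min .. $col_max;
            print row_to_tsv(@values), "\n";
        }
    }
    return;
}

# Usage: perl xls2tsv.pl Workbook.xls > Workbook.txt
dump_xls_as_tsv( $ARGV[0] ) if @ARGV;
```

Because the cell values come straight from the workbook, the output uses plain \n line endings and carries no formatting residue, so tools like `cut -f1` should see one row per line.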

    Excel uses a binary format that is very portable across architectures, making it relatively easy to read those files on Windows as well as on AIX, HP-UX, MacOSX, etc etc.

    Enjoy, Have FUN! H.Merijn
Re: Excel File Dos2Unix not working
by choroba (Canon) on Jul 12, 2011 at 11:42 UTC
    Can you further specify the "issues" you are having?
