PerlMonks
Waiting ...
by PerlingTheUK (Hermit) on May 31, 2005 at 18:54 UTC ( [id://462168]=perlmeditation )
I am one of those who have never written a meditation. I have never thought of a reason to write one. Right now, I feel like I really should ask why other people meditate.

I, for example, have spent all day messing with Excel. Someone gave me a 66 MB Excel file with almost 200 sheets, seriously asking me to look up values, copy and paste some of these tables into other tables where he had some surely clever but nevertheless completely incomprehensible formulas, then take some other XML file, preferably copy it into Excel, then manually search for some entries... okay, I guess you get the point. Maybe you are no longer wondering why I learned Perl.

So my first thought was to automate this copying and pasting. But I had not done my homework, and expecting that I could easily open a 66 MB file in Spreadsheet::ParseExcel, or just edit files rather than create new ones, proved to be a bit naive. So I found myself rushing home at lunch to get my old VBA Excel book, which until today I considered the biggest waste of money of all my university years, and got myself into OLE. I could not believe it, but it worked. Just let me mention that even now my VBA Excel book is still the biggest waste of money. I did find the answer to one or two questions about how to format strings in there, but I realized I could have googled that sort of thing much faster anyway.

So now I have mastered this little script, ripping an Excel file into something useful, say a text file. But I should have expected that if a human somewhere in some dark office of some dark company building in the middle of London has a system that gets data out of a database into Excel, which surely is wrong anyway, this person does not trust the data and therefore reviews it. Okay, okay, that sounds harmless, but it is not. Some random values, bottom line, top line, somewhere between those two, were manually reformatted.
Sure, by checking whether a time is still a time after I have converted it to text, and checking dates the same way, I can find out whether a value is right. And thank you, my dear unnamed friend, also for formatting the grid of 5 of these two hundred odd sheets; at least that does not cause me any problems. What does cause me problems is that 5 sheets do not have a title row. Did I mention that the same values were in different columns in different sheets, to make things not too simple? Such an easy assumption: match the title against a pattern, get a your_column to my_column hash and so on. But no.

Okay, this is where I have got up to: I can now identify 4 different time and 5 different date formats, and I am looking up to 70 rows deep into the worksheets to identify whether a time is an arrival or a delay, because headers and many entries are missing... And yet I still expect to find some new formats that make my files unusable.

And this is where Excel is doing the best job: getting every value out of this file takes 2 hours. So, in parallel to writing my text files, I have some functions that will alert me as soon as values seem to be wrong. Hopefully this time I can fix them afterwards in the text file.

And this is where the circle closes. I am writing this meditation because I have nothing else to do, yet a deadline to keep, and I have to make sure everything was converted right so that I can start matching tomorrow. I am curious to hear whether other people have had similar experiences. It is sometimes frustrating, but at least then there is Perl ;) and the Monks.

Cheers,
PerlingTheUK
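
For anyone who wants to picture the "match the title against a pattern" idea and the "is a time still a time" check mentioned above, here is a minimal sketch in plain Perl. It is not the script from this post. The column names, the header patterns, the three time formats, and the helper names `map_columns` and `normalise_time` are all assumptions made up for illustration; the real sheets would need their own lists.

```perl
use strict;
use warnings;

# Hypothetical header patterns: my_column name => regex for the title
# variants that might appear in a sheet. Illustrative only.
my %HEADER_PATTERN = (
    arrival => qr/^\s*arr(?:ival)?\b/i,
    delay   => qr/^\s*delay\b/i,
    date    => qr/^\s*date\b/i,
);

# Build a my_column => column-index hash from one candidate title row.
# Returns an empty list when nothing matches, e.g. a sheet without titles.
sub map_columns {
    my @titles = @_;
    my %col;
    for my $i (0 .. $#titles) {
        my $title = defined $titles[$i] ? $titles[$i] : '';
        for my $name (sort keys %HEADER_PATTERN) {
            next if exists $col{$name};          # first match wins
            if ($title =~ $HEADER_PATTERN{$name}) {
                $col{$name} = $i;
                last;
            }
        }
    }
    return %col;
}

# Assumed time formats; extend the list as new ones turn up.
my @TIME_FORMATS = (
    qr/^(\d{1,2}):(\d{2})(?::\d{2})?$/,   # 9:05 or 09:05:00
    qr/^(\d{2})(\d{2})$/,                 # 0905
    qr/^(\d{1,2})\.(\d{2})$/,             # 9.05
);

# Return the time as "HH:MM" if the text looks like a valid time,
# undef (in scalar context) otherwise.
sub normalise_time {
    my ($text) = @_;
    return unless defined $text;
    $text =~ s/^\s+|\s+$//g;              # trim surrounding whitespace
    for my $re (@TIME_FORMATS) {
        if (my ($h, $m) = $text =~ $re) {
            next if $h > 23 || $m > 59;   # right shape, impossible value
            return sprintf '%02d:%02d', $h, $m;
        }
    }
    return;
}

# Usage: map one title row, then flag cells that stopped being times.
my %col = map_columns('Date', 'Arr. Time', 'Delay (min)');
for my $cell ('9:05', '0905', '25:00', 'n/a') {
    my $time = normalise_time($cell);
    warn "suspicious time value: '$cell'\n" unless defined $time;
}
```

Keeping the patterns in a hash and the formats in a list means that a newly discovered header spelling or time format is a one-line addition. A sheet with no title row simply yields an empty mapping, which can be used as the signal to fall back to looking at the data rows themselves.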