Hello all,
I'm currently writing a program to convert CSV files exported from mail programs into a list of vCards. Since each email program exports differently, each one needs its own workaround for the exported CSV address book. I've figured out most of them myself, but I just can't figure out how to handle Outlook's malformed CSV files. Here is a sample export:
First Name,Last Name,Middle Name,Name,Nickname,E-mail Address,Home Street,Home City,Home Postal Code,Home State,Home Country/Region,Home Phone,Home Fax,Mobile Phone,Personal Web Page,Business Street,Business City,Business Postal Code,Business State,Business Country/Region,Business Web Page,Business Phone,Business Fax,Pager,Company,Job Title,Department,Office Location,Notes
Joe,Schmoe,L,Joe L Schmoe,"Joe, "the shark" Smith" ,joe@here.com,"270 E. Willbur Ave
Apt 378",Atlanta,30823,GA,USA,HOMEPHONE,FAXPHONE,CELLPHONE,http://perlmonks.com,"233 N. Ocean Drive Suite
300",Pheonix,73829,AZ,USA,http://joeswork.com,WORKPHONE,WORKFAX,WORKPAGER,Joes Company,Joes Title,IT Technologies,Office,here are some notes
"Smith,",Tony,,"Smith, Tony",,tony@smith.com,,,,,,,,,,,,,,,,,,,,,,,
Notice that the \r\n at the end of each line normally separates entries in a CSV file, but here it can also be part of an actual field value. Outlook doesn't handle quotation marks very well either, since it doesn't escape them. You can see this in Joe's nickname, which is written as "Joe, "the shark" Smith". Any other CSV generator would write it as "Joe, ""the shark"" Smith".
One would normally use Text::ParseWords::parse_line to extract the quoted values, but that doesn't work here: the embedded quotes are unescaped, and because records are broken across lines, it can't find a matching pair of quotes for some fields. I've also considered counting the fields by splitting on commas, but even then I'd end up with an inaccurate count because of the quotes and possible embedded commas within them. Right now I'm just wondering whether anyone has any insight into how to handle this, or perhaps some guidance on where I would even begin with a regex. Thanks!
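To make the problem concrete, here is a minimal sketch of one possible starting point, not a tested solution for real Outlook exports. It walks the text character by character instead of line by line. The heuristic is purely an assumption on my part: a quote opens a field only at the very start of that field, and a quote closes the field only when the next non-blank character is a comma, a line break, or the end of the text. The name parse_outlook_csv is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: split raw Outlook-style CSV text into records.
# Assumed heuristic (not a documented Outlook rule): inside a quoted
# field, a quote is literal unless it is followed by optional blanks
# and then a comma, a newline, or the end of the text.
sub parse_outlook_csv {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;    # normalise line endings to \n

    my (@records, @fields);
    my ($field, $in_quotes) = ('', 0);
    my $len = length $text;

    for (my $i = 0; $i < $len; $i++) {
        my $c = substr $text, $i, 1;
        if ($in_quotes) {
            if ($c eq '"' && substr($text, $i + 1) =~ /\A[ \t]*(?:,|\n|\z)/) {
                $in_quotes = 0;
                # skip stray blanks between the closing quote and the comma
                $i++ while $i + 1 < $len && substr($text, $i + 1, 1) =~ /[ \t]/;
            }
            else {
                $field .= $c;    # commas, newlines and inner quotes are data
            }
        }
        elsif ($c eq '"' && $field eq '') {
            $in_quotes = 1;      # opening quote at the start of a field
        }
        elsif ($c eq ',') {
            push @fields, $field;
            $field = '';
        }
        elsif ($c eq "\n") {
            push @fields, $field;
            $field = '';
            push @records, [@fields];
            @fields = ();
        }
        else {
            $field .= $c;
        }
    }

    # flush a final record that has no trailing newline
    if ($field ne '' || @fields) {
        push @fields, $field;
        push @records, [@fields];
    }
    return \@records;
}

# Cut-down sample with the same quirks as the export above.
my $sample = join "\n",
    'Name,Nickname,Street',
    'Joe,"Joe, "the shark" Smith" ,"270 E. Willbur Ave',
    'Apt 378"',
    '"Smith,",,';

my $records  = parse_outlook_csv($sample);
my $expected = scalar @{ $records->[0] };    # field count of the header row
for my $rec (@$records) {
    warn "suspicious record: @$rec\n" if @$rec != $expected;
}
```

Comparing each record's field count against the header row is a cheap sanity check for rows the heuristic gets wrong. The heuristic is still ambiguous when an embedded quote happens to sit directly before a comma inside a value, so rows like that would need to be flagged for manual review.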