in thread CSV_XS and UTF8 strings
I understand the rules for properly formatted CSV, and I know that putting double quotes around strings containing spaces is valid under those rules. My concern is that the original CSV file does not have double quotes around strings with spaces. It is an English resource file, and I'm creating a Japanese resource file from it. The program that reads these CSV files may have problems when it encounters double quotes around a Japanese string, since the original English string did not have them. I know I could tell the developer that the program should handle properly formatted CSV, but working with the developers is a hassle, so if I can create the Japanese CSV with the same formatting, then I won't have to worry about whether their program copes with double quotes around the Japanese strings.
I also do a lot of work with Unicode and get frustrated when behavior is inconsistent across languages. Characters are characters, and the language should not matter. Unfortunately, "quote_space => 0" behaves inconsistently. As my data examples demonstrate, a data file containing only English (ASCII) text comes out of my script in exactly the same format: if a string with spaces had no quotes, the new file has none either. BUT if the data file has Unicode (UTF8) strings with spaces, then the formatting changes and double quotes are added to those strings, even though the whole purpose of "quote_space => 0" is to not add these quotes.
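For anyone who wants to reproduce this, here is a minimal sketch of the setup being described. It is not my actual script: the two sample rows are made up for illustration. It also assumes a Text::CSV_XS release that supports the "quote_binary" attribute, which the module documentation describes as disabling the quoting triggered by bytes >= 0x7F. If that is what causes the quotes here, rather than the spaces, then turning it off alongside "quote_space" may give the unquoted output I am after, but I have not confirmed that it does.

```perl
use strict;
use warnings;
use utf8;                       # the source below contains literal UTF-8 text
use Text::CSV_XS;

# Assumption: the installed Text::CSV_XS knows the quote_binary attribute.
# On a release that does not, new() fails and the die below reports why.
my $csv = Text::CSV_XS->new({
    binary       => 1,
    quote_space  => 0,   # do not quote a field just because it contains a space
    quote_binary => 0,   # do not quote a field just because it contains bytes >= 0x7F
}) or die "Cannot use CSV: " . Text::CSV_XS->error_diag;

binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical sample rows: one ASCII-only, one with a Japanese string and a space.
my @rows = (
    [ 'ID_1', 'Open file' ],
    [ 'ID_1', 'ファイル を開く' ],
);

for my $row (@rows) {
    $csv->combine(@$row) or die "combine failed on: " . $csv->error_input;
    print $csv->string, "\n";
}
```

Running it once with "quote_binary => 0" and once without should show whether the extra quotes come from the spaces or from the non-ASCII bytes.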