in reply to Re^2: Parsing MS SQL CSV export with Text::CSV_XS
in thread Parsing MS SQL CSV export with Text::CSV_XS

If I remember correctly, there was a way to tell the database not to output those two bytes, but I can't remember how. I vaguely recall it had something to do with not telling it you were doing CSV but rather text, or perhaps it was just changing the extension from .csv to .txt. Unfortunately, the data came from a customer who could never be bothered to do it consistently, so I ended up writing something that tested the first two bytes and stripped them only if both had ord() > 127.
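
For anyone wanting to try the same trick, here is a rough sketch of that check. The original code is long gone, so the sub name strip_leading_high_bytes and the file name in the usage comment are made up for illustration. It assumes the data was read as raw bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: drop the first two bytes of a raw byte string,
# but only when both are above 127 (as the bytes of a UTF-16 byte
# order mark are). Anything else is returned untouched.
sub strip_leading_high_bytes {
    my ($bytes) = @_;
    return $bytes if length($bytes) < 2;

    my $first  = ord substr($bytes, 0, 1);
    my $second = ord substr($bytes, 1, 1);

    return ($first > 127 && $second > 127) ? substr($bytes, 2) : $bytes;
}

# Usage sketch (file name is an assumption); read in raw mode so that
# no I/O layer alters the bytes before the check:
#   open my $fh, '<:raw', 'export.csv' or die "export.csv: $!";
#   my $data = do { local $/; <$fh> };
#   $data = strip_leading_high_bytes($data);
```

Note that this only removes the two leading bytes. It does nothing about how the rest of the file is encoded.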

Cheers - L~R

Re^4: Parsing MS SQL CSV export with Text::CSV_XS
by andyford (Curate) on Oct 22, 2008 at 21:49 UTC

    Perfect, that's the answer. Well, part 1 anyway. I also needed to remove a NUL (^@) from between every character to get Text::CSV_XS to parse it.

    I noticed a surprising thing: vim doesn't show the extra NULs in the original file with the "funny" two leading bytes. Remove them, and vim shows the NULs like this:

    D^@A^@R^@K^@0^@1^@D^@G^@B^@B^@H^@1^@D^@,^@1^@5^@.^@5^@2^@.^@1^@3^@6^@.^@2^@3^@7^@,^@2^@0^@0^@8^@-^@1^@0^@-^@2^@0^@ ^@1^@9^@:^@0^@0^@:^@0^@8^@.^@0^@0^@0^@,^@1^@,^@1^@.^@6^@.^@6^@0^@0^@0^@,^@8^@1^@.^@2^@.^@0^@.^@2^@5^@,^@-^@W^@o^@r^@k^@s^@t^@a^@t^@i^@o^@n^@P^@a^@r^@e^@n^@t^@s^@^M^@
    I wonder if vim recognizes it as a special file format.

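      Your file appears to be in Unicode format, most likely UTF-16. The two leading bytes are the byte order mark (U+FEFF). Since your dump shows a NUL after each character, the file is little-endian, so the mark is stored on disk as FF FE.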

      You can probably save the file from SQL Server in a plain text format. If I remember correctly, output format ASCII txt will do this for some applications.

      Alternatively, you can have Perl read and decode the Unicode itself.
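
A minimal sketch of the Perl route, assuming the export really is UTF-16. It uses the core Encode module. The sub name decode_utf16_export is invented for this example, and the fallback to little-endian when there is no byte order mark is an assumption based on the vim dump above.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn raw UTF-16 bytes into a Perl character string.
# With a byte order mark, the 'UTF-16' encoding picks the byte order and
# drops the mark. Without one we assume little-endian (an assumption,
# not something the file itself tells us).
sub decode_utf16_export {
    my ($bytes) = @_;
    my $has_bom = $bytes =~ /\A(?:\xFF\xFE|\xFE\xFF)/;
    return decode($has_bom ? 'UTF-16' : 'UTF-16LE', $bytes);
}

# Usage sketch (file name is an assumption). For a file that starts with
# a byte order mark, a PerlIO layer can do the decoding while reading,
# and the handle can then go straight to Text::CSV_XS:
#   use Text::CSV_XS;
#   my $csv = Text::CSV_XS->new({ binary => 1 });
#   open my $fh, '<:raw:encoding(UTF-16)', 'export.csv'
#       or die "export.csv: $!";
#   while (my $row = $csv->getline($fh)) { print "$row->[0]\n" }
```

Decoding this way should take care of both the two leading bytes and the NULs between characters in one step, so neither needs to be stripped by hand.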