CSV file with double quotes

by rahulr (Initiate)
on Sep 14, 2011 at 12:46 UTC

rahulr has asked for the wisdom of the Perl Monks concerning the following question:

I have a csv file, which has records like this,

1,20-05-2011,23.456,"THIS is, fine, but , not",200

i.e. the text fields are in double quotes and they contain commas.

I want to replace commas within double quotes with some special letter(s), say ~z~.

What's the best way to do this? At the moment I have written a long piece of code that parses each line character by character and, while inside the double quotes, replaces each comma with ~z~. I am looking for a smarter, more compact way. Thanks in advance.

Replies are listed 'Best First'.
Re: CSV file with double quotes
by Tux (Canon) on Sep 14, 2011 at 12:51 UTC

    That is why perl has CSV parsing modules: Text::CSV and Text::CSV_XS, that deal with this by default.

    Enjoy, Have FUN! H.Merijn
Re: CSV file with double quotes
by moritz (Cardinal) on Sep 14, 2011 at 12:51 UTC
Re: CSV file with double quotes
by Tux (Canon) on Sep 14, 2011 at 13:01 UTC
    use strict;
    use warnings;
    use Text::CSV_XS;

    # binary allows any character in a field; auto_diag reports parse errors
    my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1, eol => "\n" });
    open my $fh, "<:encoding(utf-8)", "file.csv" or die "file.csv: $!";
    while (my $row = $csv->getline ($fh)) {
        # $row is an array reference holding the already-split fields
        s/,/~z~/g for @$row;
        $csv->print (*STDOUT, $row);
    }

    Simple enough?

    Enjoy, Have FUN! H.Merijn
      Yes... looks jolly simple.
      All I need is, get the module installed and I am ready to go.
      Just curious, which bit here differentiates between the commas that are delimiters and the commas that are data (i.e. within the double quotes)? I cannot try it right now because the module is not installed.

        The snippet my $row = $csv->getline ($fh) in the example reads a line from the source file and intelligently splits it up into fields. The Text::CSV library knows how to do that and takes quotes and the like into account. The name of the $row scalar that it returns is slightly misleading, as it is actually an array reference to the fields it extracted.

        The next line: s/,/~z~/g for @$row; iterates over each field in the row and performs your substitution.
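The in-place substitution can be tried without the module installed. The sketch below assumes that getline would return the OP's sample record as the array reference shown; that $row is written out by hand here for illustration and does not come from Text::CSV_XS itself.

```perl
use strict;
use warnings;

# Hand-written stand-in for what getline is assumed to return for the
# sample record: a reference to an array of fields, quotes already removed.
my $row = [ '1', '20-05-2011', '23.456', 'THIS is, fine, but , not', '200' ];

# Inside "for", $_ is an alias for each element, so s/// changes the
# array in place. Only the fourth field contains commas.
s/,/~z~/g for @$row;

print join('|', @$row), "\n";
# prints: 1|20-05-2011|23.456|THIS is~z~ fine~z~ but ~z~ not|200
```

By this point the delimiter commas are already gone, because the parser consumed them when it split the line. That is why a plain s/,/~z~/g only touches commas that were data.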

        Question to rahulr: Why do you want to replace commas with "~z~"? It looks like you are trying to escape them. If so what would you do if you encounter a real "~z~" in your input? You might want to re-think how you are processing your data so that escaping works properly.

Re: CSV file with double quotes
by MidLifeXis (Monsignor) on Sep 14, 2011 at 12:54 UTC

    Use DBD::CSV, Text::CSV, or one of their companions, and create a filter - on read, use " as the delimiter; on write, use ~z~ as the delimiter.

    Update (ENOCOFFEE): Read OP as wanting to replace the quotes, and still messed up my response.




Re: CSV file with double quotes
by Wheely (Acolyte) on Sep 15, 2011 at 20:05 UTC
    For some reason that makes no sense at all I prefer not to use modules if I don't need to. I do realise this is stupid but I can't help it. Had you considered using split on the " into three strings and then doing a split on the , for the first and third? Or is that a bit silly?

      In most cases it is silly not to use modules. If you write something yourself you are unlikely to spend more than a few hours on it, and there will be bugs. The module authors will have spent a lot of time examining and debugging the problem, and finding all the corner cases with the help of user bug reports.

      If you can't compile binary modules then for CSV there is a pure-Perl alternative that is slower, but still much better tested than anything you can write.

      If your boss won't allow you to install modules then, unless there is a very good reason, you should probably be looking for a new boss.

        We have security requirements in our industry to not install additional modules. Is there a way to do this without modules?
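For what it is worth, here is a minimal core-Perl sketch of one way to do it without modules. It is not something proposed in this thread, and it carries exactly the risks described above. It assumes each record sits on one line, that quoted fields follow the usual CSV convention of doubling embedded quotes (""), and that unquoted fields contain no stray quote characters. The helper name replace_quoted_commas is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): replace commas that sit
# inside double-quoted fields with a marker, using core Perl only.
sub replace_quoted_commas {
    my ($line, $marker) = @_;
    # Match one quoted field: an opening quote, then any run of
    # non-quote characters or doubled quotes (""), then the closing quote.
    $line =~ s{("(?:[^"]|"")*")}{
        my $field = $1;             # copy before running another regex
        $field =~ s/,/$marker/g;    # only commas inside this field
        $field;
    }ge;
    return $line;
}

my $in = '1,20-05-2011,23.456,"THIS is, fine, but , not",200';
print replace_quoted_commas($in, '~z~'), "\n";
# prints: 1,20-05-2011,23.456,"THIS is~z~ fine~z~ but ~z~ not",200
```

This leaves the delimiter commas and the quotes untouched. It will misbehave on fields that span several lines or on malformed quoting, which are the corner cases the CSV modules already handle.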

      For some reason that makes no sense at all I prefer not to use modules if I don't need to.

      using perlmonks is exactly like using modules

        > using perlmonks is exactly like using modules

        No, it's worse, as the code posted to PerlMonks hasn't been tested by CPAN testers and users of the module.

        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

Node Type: perlquestion [id://925896]
Approved by moritz
Front-paged by toolic