
Re^2: How to add quotes to comma separated values in a String (updated)

by Laurent_R (Canon)
on Feb 12, 2018 at 23:57 UTC (#1209032)

in reply to Re: How to add quotes to comma separated values in a String (updated)
in thread How to add quotes to comma separated values in a String

don't go and try to quote the strings yourself.
Come on, why not? This is being a bit dogmatic, isn't it?

While I certainly agree that using the functionalities provided by DBI or other modules such as SQL::Abstract is a good idea (and I upvoted your post), knowing how to do it in core Perl is also a good idea, IMHO. After all, Perl is supposed to be a very good language at string handling. Do you really think that doing it the Java way is better?

For example:

my $str = join ', ', map { "'$_'" } split /,/, "CAT,DOG,BIRD,COW"; # -> 'CAT', 'DOG', 'BIRD', 'COW'
or even better:
my $str = "CAT,DOG,BIRD,COW"; $str =~ s/\b/'/g; # -> 'CAT','DOG','BIRD','COW'
Both examples took me less than 10 seconds to write and test (under the debugger). Looking up the documentation for either of the two modules would probably take me at least 10 or 15 minutes, so roughly 60 to 90 times longer.

To me, using a module for something that can be done with a regex requiring less than ten keystrokes is a bit of over-engineering.

Re^3: How to add quotes to comma separated values in a String (updated)
by Your Mother (Bishop) on Feb 13, 2018 at 01:39 UTC
    Come on, why not? This is being a bit dogmatic, isn't it?

    Not really. When you know the broken or wonky data you'll get 100% of the time, sure, why not. But data in the wild is rarely predictable and the best advice to that end is suggesting the most robust solutions. The same reason we don't recommend regular expressions for parsing HTML. Most of the time it is actually fine but most of the time is a lousy way to live. :P
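To make the "data in the wild" point concrete, here is a minimal sketch of what happens when a field contains an apostrophe. The input value and the helper `quote_sql_literal` are hypothetical and not from the thread. Doubling single quotes is the standard SQL way to escape them, but rules vary between databases, so this is only an illustration, not a replacement for the driver's own quoting.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): quote one value as an SQL
# string literal by doubling embedded single quotes. Escaping rules differ
# between databases, so treat this as an illustration only.
sub quote_sql_literal {
    my ($value) = @_;
    (my $copy = $value) =~ s/'/''/g;
    return "'$copy'";
}

# Assumed example input with an apostrophe inside one field.
my $input = "CAT,O'NEIL,DOG";

# Naive wrapping leaves the embedded apostrophe unescaped.
my $naive = join ',', map { "'$_'" } split /,/, $input;

# Escaping each field first keeps every literal well formed.
my $escaped = join ',', map { quote_sql_literal($_) } split /,/, $input;

print "$naive\n";      # 'CAT','O'NEIL','DOG'
print "$escaped\n";    # 'CAT','O''NEIL','DOG'
```

The naive result contains an unbalanced quote in the second field, which is the kind of surprise a robust quoting routine is meant to absorb.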

      The same reason we don't recommend regular expressions for parsing HTML.
      To me, this is quite different.

      The OP has an internal variable containing a (CSV) string and wants to quote the fields. It is really not like processing an HTML or XML external file, it is a variable within the program. The OP presumably knows how the string was generated and should presumably be sure of its content.

      The string was probably generated within the program. And even if coming from some external source, hopefully the string has been verified and possibly untainted, maybe sanitized, whatever is needed to be reasonably sure of the content. If the string is coming from outside the program and not generated by the OP, these checks are necessary anyway.

      Please note that I did not object to using the modules mentioned by haukex, quite the contrary, but only to the advice "don't go and try to quote the strings yourself". I believe that there are many cases where you know exactly what your data is like and where you really can quote the strings yourself. Sometimes, you don't need heavy artillery when a fly-swatter will do the job.

        And I don't object to doing things directly, as you did, even with HTML. I have frequently edited huge piles of HTML with -pi -e 's///' but I would never recommend it (to a junior dev at least) because it's similar to recommending cleaning a loaded gun. I don't mind taking the risk, and even the consequences, myself now and then but I'm not going to suggest it's a good idea to anyone else.

        OPs frequently misreport or oversimplify requirements, or misunderstand the differences between the cases, and unlearning a bad habit is much harder than learning the right way. So I appreciate the dogmatic as long as it is also a legitimate best practice. It's easier to say "always do ABC" than to say "you could do XYZ as long as, provided that, but beware, also nota bene, caveats apply".

        I don't think your advice was incorrect, I was just addressing the why be dogmatic part. :P

        You're making a lot of assumptions about the data, whereas I assumed that "CAT,DOG,BIRD,COW" was just an example and not the actual input data, so we really don't know what it'll be ("be liberal in what you accept"). Knowing how to do it in plain Perl is of course useful, but personally I'd prefer the first solution people come across to be a robust one - hence the somewhat dogmatic statement, but hopefully for a good reason ;-) I also agree entirely with Your Mother's posts.

        If all your assumptions hold, then sure, it's fine to use plain Perl, but even then I would have written something like the following - just one more line of code to protect against the input changing unexpectedly:

        my $input = "CAT,DOG,BIRD,COW";
        $input =~ /\A\w*(?:,\w*)*\z/ or die "invalid input format";
        my $str = join ',', map { "'$_'" } split /,/, $input;
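As a rough sketch of how that validation step behaves on unexpected input, the same check can be wrapped in a sub and called with a good and a bad string. The sub name `quote_csv_fields` and the second input are assumptions made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the validate-then-quote idea above.
sub quote_csv_fields {
    my ($input) = @_;

    # Accept only comma-separated runs of word characters; reject the rest.
    $input =~ /\A\w*(?:,\w*)*\z/
        or die "invalid input format\n";

    return join ',', map { "'$_'" } split /,/, $input;
}

# Well-formed input is quoted field by field.
my $ok = quote_csv_fields("CAT,DOG,BIRD,COW");

# Input with an apostrophe fails the check; eval traps the die.
my $bad   = eval { quote_csv_fields("CAT,O'NEIL,DOG") };
my $error = $@;

print "$ok\n";    # 'CAT','DOG','BIRD','COW'
print defined $bad ? "$bad\n" : "rejected: $error";
```

The design choice here is to fail loudly rather than emit a malformed string, so a change in the input format shows up at the validation line instead of in the database.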

        I did assume that the OP, since they are doing work with a database, will have a $dbh lying around. Note that our two pieces of code really aren't that different - only a couple more characters for extra protection :-) Also note that using the database driver for quoting should take care of possible quoting differences between databases.

        my $str = join ',', map { $dbh->quote($_,'VARCHAR') } @values;
        my $str = join ',', map { "'$_'" } @values;
