Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

Re: Encoding issue from DBI to string

by sundialsvc4 (Monsignor)
on Jan 03, 2013 at 18:04 UTC ( #1011500=note: print w/ replies, xml ) Need Help??


in reply to Encoding issue from DBI to string

The use of placeholders is absolutely vital when using SQL, especially in a web-site where the inputs can never be fully trusted.   The placeholders, an un-quoted question mark (therefore, not a literal-string), represent places where consecutive values from a separately-supplied list will be inserted.   In this way, it is impossible for any of the data to be misinterpreted (that is, “injected”) as part of the SQL statement itself.   Nor can there ever be any question about where each value begins or ends.   The data-type of each parameter in the list is independently known and does not have to be, nor converted to, a character-string.   (At least, not by you.)   Furthermore, this might (or might not) also solve your character-encoding concern:   the SQL string is one “thing,” and each parameter is another, such that never the twain shall meet.   It’s very clean and black-box-ish:   “here’s the SQL string, and here’s a rag-tag bag of parameters I want to use with it; now, go do it.”


Comment on Re: Encoding issue from DBI to string
Re^2: Encoding issue from DBI to string
by karlgoethebier (Curate) on Jan 03, 2013 at 19:18 UTC
    "The use of placeholders is absolutely vital when using SQL..."

    Nothing but the truth. But perhaps this is yet another communication problem.

    I'm just wondering about what will happen when i try to force mysql_enable_utf8 => 1 in my DBI call when default-character-set=utf8 and default-collation=utf8_unicode_ci is not set in my.cnf...

    For the moment i don't have any mySQL at the hand to check this.

    Best regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

      Hey Karl, Upon looking in the my.cnf file, I had default-character-set = utf8 and collation-server = utf8_unicode_ci, however default-collation was not set. I made the necessary changes, but had to step into a meeting. I'll take a look at how things go in a bit.
      Even after adding in the default-collation=utf8_unicode_ci, I'm still getting all question marks for multibyte characters. What a headache.

      Not sure if anyone who have the answer here, but assuming the DB is set up appropriate with all UTF-8 encoding and data appearing valid in the tables, it wouldn't really matter how it's getting into the database to begin with, right?

      We're using a 3rd party program as a scraper, and its underlying Java is dumping the data to the DB. I haven't looked into it much just because the data appears right in the DB with all UTF-8 encoding configured, so I assumed it wasn't the issue.

        "Even after adding in the default-collation..."

        S**t!

        "...it wouldn't really matter how it's getting into the database to begin with, right?

        To be honest, i'm not shure about this.

        BTW, what happens if you fire up your query using mysql client?

        Regards, Karl

        «The Crux of the Biscuit is the Apostrophe»

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1011500]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others romping around the Monastery: (9)
As of 2014-07-31 10:31 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (248 votes), past polls