http://www.perlmonks.org?node_id=1028516

walkingthecow has asked for the wisdom of the Perl Monks concerning the following question:

When I have something like so:
my $sth = $dbh->prepare( " select t.id,t.emp_name,t.emp_email,t.req_status,t. +date_requested,t.mngr_email,t.date_submitted, ( SELECT GROUP_CONCAT( (case when req_status = 'Approved' t +hen emp_name end) separator ', ') FROM requests s WHERE s.date_requested = t.date_re +quested) AS approved FROM requests t WHERE req_status = 'Pending' AND mngr_email = '$filter' AND date_requested >= '$date' ORDER BY emp_name,date_requested " );
I am wondering what the Perl Best Practices are with regards to long fields applied to an object, and if there is any way for Perl Tidy to automatically clean those up. As it is now, I am doing 'perltidy -pbp input.pl' and it is tidying up everything else, but leaving those areas alone. I want those areas to be tidied up as well :) Another example, perltidy didn't touch anything in the parentheses:
my $sth = $dbh->prepare( " INSERT requests ( emp_name, emp_email, emp_title, mngr_name, m +ngr_email, date_requested, req_status, date_submitted, approved_by ) VALUES ( '$login_name', '$login_email', '$login_title', '$login_mn +gr_name', '$login_mngr_email', '$date', 'Pending', '$req_date', '' ) " );

Replies are listed 'Best First'.
Re: Best practices and any way to have Perl Tidy clean it up
by LanX (Saint) on Apr 13, 2013 at 14:16 UTC
    You want a sub-tidy which formats embedded strings and here-docs?

    I'm not aware of any possibility in perltidy for hooking an external parser for languages like SQL.

    And how is tidy supposed to know what kind of sub-language you are using? Heuristics?

    My best practice for embedded code is to manually change the mode in emacs and start a reformatting process on selected regions. When needed I even put this in a macro assigned to a key.

    Cheers Rolf

    ( addicted to the Perl Programming Language)

      Alright, assuming Perl Tidy cannot do this, then my next question is, what are the best practices for object parameters? For example:
      my $blah=$ldap->new( PARAM1 => $param1, PARAM2 => $param2, PARAM3 => $param3, );
      or:
      my Ssth = $dbh->prepare( "SELECT * FROM DATABASE WHERE name = '$name' AND money = '$money' AND field = '$field' ");
      I'm trying to get a feel for how deal with this formatting. Amount of spaces for what's within the parentheses; do the parentheses line up? And so on. I know it's a rather stupid question, but after all these years it's something that is still bothering me. I format everything else beautifully, but when it comes to this I have always been rather clueless.
        Since this is for the most part a personal preference there is no single correct answer, but my choice of formatting usually goes like this:
        my Ssth = $dbh->prepare(q( SELECT * FROM DATABASE WHERE name = ? AND money = ? AND field = ? )); $sth->execute($name, $money, $field);

        To be honest, I place the parantheses like you do, and leave the rest to Emacs' Perl mode.

Re: Best practices and any way to have Perl Tidy clean it up
by igelkott (Priest) on Apr 13, 2013 at 14:15 UTC

    Maybe it's just me but changing the number of spaces within a quoted string seems to go beyond simple reformating. Sounds like this should be done manually.

Re: Best practices and any way to have Perl Tidy clean it up
by kcott (Archbishop) on Apr 13, 2013 at 20:24 UTC

    G'day walkingthecow,

    "I am wondering what the Perl Best Practices are with regards to long fields applied to an object, and if there is any way for Perl Tidy to automatically clean those up."

    There's a few Perl Best Practices that could apply. Here's some that identify problems:

    Line Lengths
    Use 78-column lines. (pp.18-19) [So you're not doing that.]
    Indentation
    Use four-column indentation levels. (pp.19-20) [And you're not doing this either.]
    Tabs
    Indent with spaces, not tabs. (pp.20-22) [That could be an issue. Are you doing this one?]

    Here's some that suggest solutions:

    Multiline Strings
    Lay out multiline strings over multiple lines. (pp.60-61)
    Here Documents
    Use a heredoc when a multiline string exceeds two lines. (p.61)
    Heredoc Indentation
    Use a "theredoc" when a heredoc would compromise your indentation. (pp.61-62)
    Heredoc Terminators
    Heredoc Quoters
    Another two to read if you go with a heredoc solution. (pp.62-65)
    Automated Layout
    Enforce your chosen layout style mechanically. (pp.33-35) [Talks about perltidy and provides a .perltidyrc at the end.]

    -- Ken

      With all due respect to Damian Conway and to his absolutely excellent book (whose recommendations I do follow on a number of items, well, most of them), and I certainly don't claim to challenge his authority, I still have to disagree.

      The line length and the tab recommendations are, to say the least, debatable (at least in my view).

      I had dinner last Monday with Curtis "Ovid" Poe and brian de foy in Paris (social setting with the Perl Mongers in Paris), and we discussed briefly these matters, my feeling is that both of them agreed that using spaces rather than tabs is not the best idea.

      To brian and Curtis: if I misrepresented what you said in any way (I don't think I did, but just in case), sorry about that, I did not mean that.

        I wasn't advocating the use of any of those recommendations. This was merely a response to the OP's request: "... wondering what the Perl Best Practices are ...". These are just those I thought applicable to the situation at hand; there may be others.

        For what it's worth, I don't use Line Lengths but I do use Indentation and Tabs; I don't recall ever using Multiline Strings (but it seems sensible enough); I do use heredocs and theredocs; I don't use perltidy.

        -- Ken

        Out of curiosity, do you know what their objections to indenting with spaces was? I've always found it significantly more reliable and manageable than using tabs. I'd be curious about arguments to the contrary.

        Christopher Cashell
        my feeling is that both of them agreed that using spaces rather than tabs is not the best idea

        Out of curiosity, do you know what their objections to indenting with spaces was? Or what the advantages of tabs are?

        I've always found spaces to be more reliable and manageable than using tabs. I'd be curious about arguments to the contrary.

        Christopher Cashell
      Thank you! That's pretty much exactly what I was looking for. I actually use the -pbp (perl best practices) flag with Perl Tidy, which sets 78-column lines for my scripts, but does not touch object params for some reason. So, 99% of my script is shortened down to 78-column lines, then I have these stragglers, these long lines that are object params. Guess I will have to do it manually (no big deal) to fit within the suggestions given above. The -pbp flag I believe sets all options to that which you find in the perltidyrc file provided by the Perl Best Practices book. Seems a little known option (it's not even in --help output), but something worthwhile knowing. Again, thank you for your excellent response.

        A string (e.g. "a string") is a constant. You can't modify it directly in code. You may well have a very good reason for positioning characters within the string. It would inappropriate for perltidy (or other similar software) to modify a constant you've coded. Even without strict and warnings, attempting to modify a constant value is a fatal error:

        $ perl -e '"abc" =~ s/c$//' Can't modify constant item in substitution (s///) at -e line 1, at EOF Execution of -e aborted due to compilation errors.

        Ditto for numbers:

        $ perl -e '2++' Can't modify constant item in postincrement (++) at -e line 1, near "2 +++" Execution of -e aborted due to compilation errors.

        Chapter 2 of PBP is devoted to Code Layout. While the focus is obviously on laying out Perl code, many of the suggestions would apply equally to other languages: this might provide a few hints for your SQL.

        -- Ken

Re: Best practices and any way to have Perl Tidy clean it up
by LanX (Saint) on Apr 13, 2013 at 15:02 UTC
    If the question was "How could I add automatic sub-formatting to perl-tidy?", I'd suggest using the html-output of perltidy.

    So strings and comments are easily identified by tags and can be externally reformatted.

    Deleting all tags as final step leads to naked code again.

    Cheers Rolf

    ( addicted to the Perl Programming Language)

Re: Best practices and any way to have Perl Tidy clean it up
by talexb (Chancellor) on Apr 14, 2013 at 21:04 UTC

    I have my own preferences on how SQL should get tidied up, and what you have is close. However, you probably shouldn't expect perltidy to handle SQL as well as Perl. I'd run perltidy on the code, then go back and organize the SQL separately.

    And my preference for formatting this chunk would be

    my $sth = $dbh->prepare(" SELECT t.id, t.emp_name, t.emp_email, t.req_status, t.date_requested, t.mngr_email, t.date_submitted, ( SELECT GROUP_CONCAT ( ( CASE WHEN req_status = 'Approved' THEN emp_name END ) SEPARATOR ', ' ) FROM requests s WHERE s.date_requested = t.date_requested ) AS approved FROM requests t WHERE req_status = 'Pending' AND mngr_email = '$filter' AND date_requested >= '$date' ORDER BY emp_name, date_requested ");
    This format highlights all of the key words, leaves spaces between each of the resulting columns, indents the 'AND' clauses to clarify the filtering, and generally exposes as much of the logic behind the query as possible.

    There is no one right way to format SQL, but this approach seems fairly clear.

    Alex / talexb / Toronto

    "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

Re: Best practices and any way to have Perl Tidy clean it up
by Anonymous Monk on Apr 14, 2013 at 03:02 UTC
    Consistency is more important than any dogma. Pick one way and have everybody just get used to it. The code throughout the org should look the same no matter who wrote each piece. This makes it much easier to spot errors by eye.
Re: Best practices and any way to have Perl Tidy clean it up
by topher (Scribe) on Apr 17, 2013 at 05:20 UTC

    I think everyone has explained well why perltidy isn't touching the SQL code inside, as it is a quoted string. So, I'm going to offer some related advice regarding best practices with SQL.

    Feel free to ignore these suggestions if you aren't interested.

    If I were doing similar code, I'd do something along the lines of:

    my %sql = foo_select => qq{ SELECT t.id, t.emp_name, t.emp_email, t.req_status, t.date_req +uested, t.mngr_email, t.date_submitted, ( SELECT GROUP_CONCAT( (case when req_status = 'Approved' +then emp_name end) separator ', ') FROM requests AS s WHERE s.date_requested = t.date_requested ) AS approve +d FROM requests t WHERE req_status = 'Pending' AND mngr_email = ? AND date_requested >= ? ORDER BY emp_name, date_requested }; my $sth = $dbh->prepare($sql{foo_select}); $sth->execute($filter, $date);
    There's a couple of things I want to note. First, I'm a believer in defining all of your non-trivial SQL code together. You've got one spot to check for anything SQL, and it makes it much easier to abstract the SQL code out if you need to later (for example, to add support for a different database).

    The more important bit though, are the question marks instead of the explicit variables in the SQL string. These are placeholders (or bind variables). There are a couple very good reasons to use them.

    The first reason is to protect yourself against Bobby Tables and related SQL injection attacks. Unsanitized database inputs are a significant and serious source of problems.

    The second reason is for performance. This doesn't matter as much if you only call the statement once, for any query that gets run repeatedly, you can potentially get a performance boost (or remove a performance penalty) by using bind variables.

    When you prepare a statement, the database (in better DBs) will take that query and do their initial processing on it. This includes parsing the query, analyzing execution paths, and developing the optimized query plan. This can be a non-trivial amount of effort for the database. When you embed Perl variables directly in your SQL, each statement reaches the DB as a new and unique SQL query, forcing the DB to do all of that work each time. When you use bind variables, the database can process the query once, and then cache the results. You can then execute that query with different values repeatedly without having to repeat the pre-processing step.

    Christopher Cashell