PerlMonks  


Read this article to get an idea of how dangerous it can be to blindly accept macros in spreadsheets. Be it MS Excel or Google spreadsheets, they all suffer.

You cannot blame CSV for it. CSV is just passive data.

Once you load or open a CSV file in something as dangerous as a spreadsheet program that allows formulas to be executed on open, all bets are off. Or are they?

The upcoming Text::CSV_XS release adds a new feature to optionally take action when a field contains a leading =, which to most spreadsheet programs indicates a formula.

On both parsing and generating CSV, you will be able to specify what you want to do (where "formula" does not go beyond the fact that the field starts with a =):

  • Do nothing special (default behavior) and leave the text as-is
  • Die whenever a formula is seen
  • Croak when a formula is seen
  • Give a warning where a formula is seen
  • Replace all formulas with an empty string
  • Remove all formulas (replace with undef)
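
To make the list concrete, here is a minimal pure-Perl sketch of what each action amounts to for a single row. The helper filter_formulas is hypothetical and is not part of Text::CSV_XS; it only mirrors the rule stated above, that a "formula" is any field starting with =, and its diag message is a simplified version of the real one.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper (NOT the Text::CSV_XS implementation): apply one of
# the listed actions to every field of a row that starts with "=".
sub filter_formulas {
    my ($action, @fields) = @_;
    my $n = 0;
    for my $field (@fields) {           # $field aliases the element in @fields
        $n++;
        next unless defined $field && $field =~ m/^=/;
        if    ($action eq "none")  { next }
        elsif ($action eq "die")   { die   "Formulas are forbidden\n" }
        elsif ($action eq "croak") { croak "Formulas are forbidden" }
        elsif ($action eq "diag")  { warn  "Field $n contains formula '$field'\n" }
        elsif ($action eq "empty") { $field = "" }
        elsif ($action eq "undef") { $field = undef }
        else                       { croak "Unknown action '$action'" }
    }
    return @fields;
}

# Usage: blank out the formula in the second field
my @row = filter_formulas ("empty", 1, "=2+3", 4);
print join (",", map { defined $_ ? $_ : "" } @row), "\n";
```

The only difference between die and croak in this sketch is where the error is reported from: croak blames the caller, die the helper itself.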

Code speaks louder than words ...

$ cat formula.csv
a,b,c
1,=2+3,4
6,,7,=8+9,

Parsing

$ perl -MCSV -e'dcsv (in => "formula.csv")'
[
    [ 'a', 'b', 'c' ],
    [ '1', '=2+3', '4' ],
    [ '6', '', '7', '=8+9', '' ]
]
$ perl -MCSV -e'dcsv (in => "formula.csv", formula => "none")'
[
    [ 'a', 'b', 'c' ],
    [ '1', '=2+3', '4' ],
    [ '6', '', '7', '=8+9', '' ]
]
$ perl -MCSV -e'dcsv (in => "formula.csv", formula => "die")'
Formulas are forbidden
$ perl -MCSV -e'dcsv (in => "formula.csv", formula => "croak")'
Formulas are forbidden
$ perl -MCSV -e'dcsv (in => "formula.csv", formula => "diag")'
Field 2 in record 1 contains formula '=2+3'
Field 4 in record 2 contains formula '=8+9'
[
    [ 'a', 'b', 'c' ],
    [ '1', '=2+3', '4' ],
    [ '6', '', '7', '=8+9', '' ]
]
$ perl -MCSV -e'dcsv (in => "formula.csv", formula => "empty")'
[
    [ 'a', 'b', 'c' ],
    [ '1', '', '4' ],
    [ '6', '', '7', '', '' ]
]
$ perl -MCSV -e'dcsv (in => "formula.csv", formula => "undef")'
[
    [ 'a', 'b', 'c' ],
    [ '1', undef, '4' ],
    [ '6', '', '7', undef, '' ]
]
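
The one-liners above load a module named CSV that provides dcsv, which is not shown in this post. For anyone who wants to try the same thing with Text::CSV_XS directly, a rough equivalent might look like the sketch below. It assumes the feature ships under the attribute name formula exactly as in the examples, and that the installed Text::CSV_XS is new enough to know it; the in-memory $data simply stands in for formula.csv.

```perl
use strict;
use warnings;
use Text::CSV_XS qw(csv);

# Same content as formula.csv above, kept in memory for the example
my $data = join "", map { "$_\n" } "a,b,c", "1,=2+3,4", "6,,7,=8+9,";

# Assumption: this Text::CSV_XS supports the "formula" attribute.
# Releases that predate the feature will reject it as an unknown attribute.
my $aoa = csv (in => \$data, formula => "empty")
    or die "" . Text::CSV_XS->error_diag ();

# Show each parsed record, marking undefined fields explicitly
for my $row (@$aoa) {
    print join ("|", map { defined $_ ? "'$_'" : "undef" } @$row), "\n";
}
```

Swapping "empty" for "diag" keeps the data untouched and should report each formula field on STDERR instead.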

Generating

$ perl -MCSV -e'dcsv (in => [["a","b","c"],[1,"=2+3",4],[6,"",7,"=8+9"]], quote_empty => 1)'
a,b,c
1,=2+3,4
6,"",7,=8+9
1
$ perl -MCSV -e'dcsv (in => [["a","b","c"],[1,"=2+3",4],[6,"",7,"=8+9"]], quote_empty => 1, formula => "none")'
a,b,c
1,=2+3,4
6,"",7,=8+9
1
$ perl -MCSV -e'dcsv (in => [["a","b","c"],[1,"=2+3",4],[6,"",7,"=8+9"]], quote_empty => 1, formula => "die")'
a,b,c
Formulas are forbidden
Exit 255
$ perl -MCSV -e'dcsv (in => [["a","b","c"],[1,"=2+3",4],[6,"",7,"=8+9"]], quote_empty => 1, formula => "croak")'
a,b,c
Formulas are forbidden
Exit 255
$ perl -MCSV -e'dcsv (in => [["a","b","c"],[1,"=2+3",4],[6,"",7,"=8+9"]], quote_empty => 1, formula => "diag")'
a,b,c
Field 1 contains formula '=2+3'
1,=2+3,4
Field 3 contains formula '=8+9'
6,"",7,=8+9
1
$ perl -MCSV -e'dcsv (in => [["a","b","c"],[1,"=2+3",4],[6,"",7,"=8+9"]], quote_empty => 1, formula => "empty")'
a,b,c
1,"",4
6,"",7,""
1
$ perl -MCSV -e'dcsv (in => [["a","b","c"],[1,"=2+3",4],[6,"",7,"=8+9"]], quote_empty => 1, formula => "undef")'
a,b,c
1,,4
6,"",7,
1
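
The generating side can be sketched with the object interface as well. This again assumes a Text::CSV_XS that accepts the formula attribute, and that combine applies it the same way the one-liners above do; treat it as an illustration rather than the reference usage.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Assumption: "formula" is accepted by new () in this release;
# quote_empty makes the difference between "" and undef visible.
my $csv = Text::CSV_XS->new ({
    binary      => 1,
    quote_empty => 1,
    formula     => "empty",
}) or die "" . Text::CSV_XS->error_diag ();

my @lines;
for my $row (["a", "b", "c"], [1, "=2+3", 4], [6, "", 7, "=8+9"]) {
    # combine () builds one CSV record from the list of fields
    $csv->combine (@$row) or die "combine failed: " . $csv->error_input ();
    push @lines, $csv->string ();
}
print "$_\n" for @lines;
```

With "empty", the formula fields should come out as quoted empty strings, matching the corresponding one-liner above.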

I'm pretty pleased with the diagnostics:

$ cat formula.csv
a,b,c
1,=2+3,4
6,,7,=8+9,
$ perl -MCSV -e'$_ = dcsv (in => "formula.csv", bom => 1, formula => "diag")'
Field 2 (column: 'b') in record 1 contains formula '=2+3'
Field 4 in record 2 contains formula '=8+9'

Expect this to be available by next week.


Enjoy, Have FUN! H.Merijn

In reply to Be prepared for CSV injections in spreadsheet by Tux
