Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??

As I am in the process of porting Text::CSV_XS to perl6, I have learned not only a lot about perl6, but also about loose ends.

We all know by now that perl6 is reasonable likely to be available in September (of this year), and depending on what you hope or expect, you might be delighted, disappointed, disillusioned or ecstatic (or anything in between: YMMV).

My goal was to be able to present a module working in perl6 that would provide the user with as much functionality possible of what Text::CSV_XS offers: flexible feature-rich safe and fast CSV parsing and generation.

For now I have to drop the "fast" requirement, but I am convinced that the speed will pick up later.

Text::CSV_XS currently offers a testsuite with 50171 tests, so my idea was that if I convert the test suite to perl6 syntax, it good very well serve a point of proof for whatever I wrote in perl6.

There's a few things that you need to know about me and my attitude towards perl6 before you are able to value what has happened (at least I see this as a valueable path, you might not care at all).

I do not like the new whitespace issues that perl6 imposes on the code. It strikes *me* as ugly and illogical. That is the main reason why I dropped interest in perl6 very early in the process. In Sofia however, I had a conversation (again) with a group of perl6 developers who now proclaimed that they could meet my needs as perl6 now has a thing called "slang", where the syntax rules can be lexically changed. Not only did they tell me it was possible, but early October 2014, Slang::Tuxic was created just for me and low and behold, I could write code in perl6 without the single big annoyance that drove me away in the first place. This is NOT a post to get you to use this slang, it is (amongst other things) merely a reason to show that perl6 is flexible enough to take away annoyences.

Given that I now can write beautiful perl (against, beauty in the eyes of the beholder), I regained enjoyment again and John and I were so stupid to take the fact that perl6 actually can be used now, to promise to pick a module in the perl5 area and port it to perl5. XS being an extra hurdle, we aimed high and decided to "do" CSV_XS.

So, I have 50000+ tests that I want to PASS (ideally) but I soon found out that with perl6 having type checking, some tests are useless, as the perl6 compiler already catches those for you. Compare that to use strict in perl5, so I can just delete all tests that pass wrong arguments to the different methods.

Converting error-dealing stuff was fun too but I think that if you try to mimic what people expect in perl5, it is not too hard to get the best of both worlds: I'm quite happy already with error-handling as it stands.

So, the real reason for this post is what I found to have no answer to, as it wasn't tested for or had tests that were bogus.

  • What should be done when parsing a single line of data is valid, but there is trailing data in the line after parsing it.?
    $csv->parse (qq{"foo",,bar,1,3.14,""\r\n"more data});
    parse is documented to parse the line and enable you to get the fields with fields. In contrast with getline, which acts on IO, it will just discard all data beyond the EOL sequence. I am uncertain if that is actually what it should do. Should it "keep" the rest for the next iteration? Should it be discarded? Should it cause a warning? Or an error?
  • What is the correct way to deal with single ESCapes (given that Escape is not the default " or otherwise equal to QUOtation). Here I mean ESCape being used on a spot where it is not required or expected without the option to accept them as literal.
    $csv->escape_char ("+"); $csv->parse ("+"); $csv->parse ("+a"); $csv->parse ("+\n");
    Leave the ESCape as is? Drop it, being special? Warn or Error?

Questions like those slowed down the conversion as a whole, as I can now take my own decisions with sane defaults (like binary => True) instead of caring about backward compatibility issues.

Anyway, feel free to check out what I have done so far on my git repo and I welcome comments (other than those on style) in the issues section. Feel free to join #csv on IRC to discuss. Hope to see you (or some of you) at the next Dutch Perl Workshop 2015 in Utecht.


Enjoy, Have FUN! H.Merijn

In reply to Porting (old) code to something else by Tux

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others pondering the Monastery: (5)
As of 2024-04-19 23:22 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found