PerlMonks  


No, I/we started off by making 4 different, extremely simple versions of the CSV parsing core code, just to see how well the approaches would work.

  1. Chunks of interest
  2. State machine
  3. Grammar based
  4. Brute force

The first three are still alive, and I personally only develop in the chunks version, which is - for me - the easiest to develop.
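To give an idea of what "extremely simple" means for the state machine variant, here is a minimal sketch. It is written in Python purely for illustration and is not the code from the repo. The function name, the state names, and the doubled-quote escaping rule are my assumptions here, not what any of the four versions actually does.

```python
def parse_csv_line(line, sep=",", quote='"'):
    """Split one CSV record into fields using a small state machine.

    Hypothetical states (not taken from the original code):
      start       at the beginning of a field
      plain       inside an unquoted field
      quoted      inside a quoted field
      quote_seen  saw a quote inside a quoted field: it is either the
                  first half of an escaped quote or the closing quote
    """
    fields, buf, state = [], [], "start"
    for ch in line:
        if state == "start":
            if ch == quote:
                state = "quoted"
            elif ch == sep:
                fields.append("")          # empty field
            else:
                buf.append(ch)
                state = "plain"
        elif state == "plain":
            if ch == sep:
                fields.append("".join(buf))
                buf, state = [], "start"
            else:
                buf.append(ch)
        elif state == "quoted":
            if ch == quote:
                state = "quote_seen"
            else:
                buf.append(ch)
        else:  # quote_seen
            if ch == quote:                # doubled quote: literal quote
                buf.append(quote)
                state = "quoted"
            elif ch == sep:                # closing quote, then separator
                fields.append("".join(buf))
                buf, state = [], "start"
            else:
                raise ValueError("unexpected character after closing quote")
    if state == "quoted":
        raise ValueError("unterminated quoted field")
    fields.append("".join(buf))            # last field of the record
    return fields
```

A sketch like this has no options at all (no embedded newlines, no alternative escape character), which is about the level the first experiments were at.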

I did not want to put people off in the initial post, but speed is about the most serious drawback at the moment. Not having CPAN can be worked around with use Inline::Perl5;. Examples of how to do that are available in the git repo, including working with XS modules (even DBI)! (Passing IO arguments is a work in progress.)

When I started in October 2014, my initial version was 6700 times slower than the XS version. Meanwhile it is "just" 1010 times slower. Some of that is because I have learned to code more efficiently in perl6, but most of it is because the perl6 core keeps getting faster. We're not there yet. Here is a comparison:

  Perl5
    Text::CSV::Easy_XS    0.016  These two have no options and only parse valid CSV
    Text::CSV::Easy_PP    0.016
    Text::CSV_XS          0.039  Highly optimized XS with many options
    Text::CSV_PP          0.514  Pure perl version
    Pegex::CSV            1.356  Ingy's Pegex parser
  Perl6
    csv.pl                8.133  John's state machine
    csv-ip5xs             8.950  Text::CSV_XS with Inline::Perl5
    csv-ip5pp             9.812  Text::CSV_PP with Inline::Perl5
    csv_gram.pl          13.426  Using a grammar-based parser
    test.pl              38.733  My first attempt, no options
    test-t.pl            39.502  Almost compatible with Text::CSV_XS

The numbers shown are the time needed in seconds to parse a valid 10000 line CSV file with 5 columns.
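For those who want to see how these timings relate to the "times slower" figure, the ratios can be worked out as in the sketch below (Python, just for the arithmetic). The timings are copied from the comparison above. Using Text::CSV_XS as the baseline is my reading of "the XS version".

```python
# Timings in seconds, copied from the comparison above.
timings = {
    "Text::CSV_XS": 0.039,
    "Text::CSV_PP": 0.514,
    "Pegex::CSV":   1.356,
    "csv.pl":       8.133,
    "csv-ip5xs":    8.950,
    "csv-ip5pp":    9.812,
    "csv_gram.pl": 13.426,
    "test.pl":     38.733,
    "test-t.pl":   39.502,
}

# Assumption: the baseline for "times slower" is Text::CSV_XS.
baseline = timings["Text::CSV_XS"]
slowdown = {name: secs / baseline for name, secs in timings.items()}

for name, factor in sorted(slowdown.items(), key=lambda item: item[1]):
    print(f"{name:<14} {factor:8.1f}x")
```

With these numbers test-t.pl comes out at a little over a thousand times the Text::CSV_XS time, in line with the figure quoted above.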

Back to your question. Of course one cannot start with the test suite, but one can start with the test suite as a guide. So after building the initial core parser, feed it the tests and see what works and what does not. Then use the failing tests as a plan to alter the code until the tests pass: implement error handling, make all the attributes work, catch all exceptions, etc.
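A rough sketch of that workflow, again in Python for illustration only: run a deliberately naive core parser over a handful of cases and treat whatever fails as the to-do list. The parser, the cases, and the helper name are all made up for this example. The real test suite is of course far larger.

```python
def naive_parse(line):
    """Hypothetical first core: split on commas, no quoting support."""
    return line.split(",")

# A few made-up cases standing in for an existing test suite:
# (input line, expected list of fields)
CASES = [
    ("a,b,c",       ["a", "b", "c"]),
    ('"a,b",c',     ["a,b", "c"]),      # separator inside quotes
    ("a,,c",        ["a", "", "c"]),    # empty field
    ('"x ""y"""',   ['x "y"']),         # doubled quote as escape
]

def failing_cases(parser, cases):
    """Return (input, expected, got) for every case the parser gets wrong."""
    failures = []
    for text, expected in cases:
        try:
            got = parser(text)
        except Exception as exc:        # a crash counts as a failure too
            got = exc
        if got != expected:
            failures.append((text, expected, got))
    return failures

# The failures are the work plan for the next iteration.
for text, expected, got in failing_cases(naive_parse, CASES):
    print(f"TODO {text!r}: expected {expected!r}, got {got!r}")
```

Here the naive parser passes the plain and empty-field cases and fails the two quoting cases, so quoting would be the next thing to implement.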

Building a bridge would imho be a waste of time: it will not make you learn perl6 any faster, nor will you hit the problem areas that one needs to fix in the code eventually. As perl6 is type-checked and passes arguments by reference (all are objects), supporting array-refs to speed things up is counter-productive, so one needs to match the test suite to what is feasible and sane in perl6: don't slow down to match perl5 behavior. Things are changing anyway. I'm not trying to mimic the old CSV syntax, I'm trying to port its versatility and flexibility while keeping the complete and safe parsing rules.


Enjoy, Have FUN! H.Merijn

In reply to Re^2: Porting (old) code to something else by Tux
in thread Porting (old) code to something else by Tux
