OK, I took the idea and did basically a complete rewrite. In particular I noticed the following:
  1. I made the interfaces less magic. For instance you have this magic stuff on the filehandle. I made that a separate function. This will work with tied filehandles as well. For the same reason I stopped using $/, because the author of a tied filehandle's READLINE method may not pay attention to it.
  2. If you are a module, there is no need to do initializations in a BEGIN block.
  3. I would have moved your functions into @EXPORT_OK as Exporter suggests, but you want this for one-offs. OK, TIMTOWTDI. But if I were using it I would have made that change.
  4. I wondered if your @VERSION was meant to be $VERSION.
  5. I note that there is no equivalent to the third argument to split. I played both ways with that then left it alone. Just note that, as with a plain two-argument split, trailing empty fields will be dropped.
  6. I am doing a rewrite and didn't include any POD. You should.
  7. I made this n-dimensional because, well, because I can.
  8. You were not completely clear what the argument order was, and naming the first one $second and the second one $first is IMO confusing. I made it recursive, but still you should note the naming issue. If you wanted 2-dim I would suggest $inner and $outer as names.
  9. You are using explicit indexes. I almost never find that necessary. In this version I use map. Otherwise you could push onto the anon array. Avoiding ever thinking about the index leads to fewer opportunities to mess up, and often results in faster code as well!
  10. I am using qr// to avoid recompiling REs. Given the function call overhead this probably isn't a win. I did it mainly to mention that if you are going to do repeated uses of an RE, you can and should avoid compilation overhead.
  11. The reason for my wrappers is so that my recursion won't mess up on the defaults. :-)
  12. I considered checking wantarray, but the complication in the interface did not seem appropriate for short stuff.
  13. Note that this entire approach is going to fail miserably on formats with things like escape characters and escape sequences. For instance the CSV format is never going to be easily handled using this. Something to consider before using this for an interesting problem.
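A minimal sketch of points 5, 10 and 13 above. The sample strings and variable names are made up for illustration and are not taken from either version of the module.

```perl
use strict;
use warnings;

# Point 5: without a third argument, split discards trailing empty fields.
my $line    = "a\tb\t\t";
my @default = split /\t/, $line;        # ('a', 'b')
my @keep    = split /\t/, $line, -1;    # ('a', 'b', '', '')

printf "default: %d fields, limit -1: %d fields\n",
    scalar @default, scalar @keep;

# Point 10: compile a pattern once with qr// and reuse it for every row.
my $sep  = qr/\s*,\s*/;
my @rows = map { [ split $sep, $_ ] } ("1 , 2", "3,4");

# Point 13: a quoted CSV field that contains the separator is cut in two,
# so a three-field line comes back as four pieces.
my @csv = split /,/, 'one,"two, three",four';
print scalar(@csv), " pieces from a 3-field CSV line\n";
```

A negative limit is the usual way to keep trailing empty fields if you ever decide to expose that third argument.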
Oh right, and you want to see code? OK.
```perl
package SuperSplit;
use strict;
use Exporter;
use vars qw( @EXPORT @ISA $VERSION );
$VERSION = 0.02;
@ISA = 'Exporter';
@EXPORT = qw( superjoin supersplit supersplit_io );

# Takes a reference to an n-dim array followed by n strings.
# Joins the array on those strings (inner to outer),
# defaulting to "\t", "\n"
sub superjoin {
  my $a_ref = shift;
  push (@_, "\t") if @_ < 1;
  push (@_, "\n") if @_ < 2;
  _join($a_ref, @_);
}

sub _join {
  my $a_ref = shift;
  my $str = pop;
  # Work on a copy so the caller's nested arrays are not
  # overwritten with the joined strings.
  my @parts = @$a_ref;
  if (@_) {
    @parts = map {_join($_, @_)} @parts;
  }
  join $str, @parts;
}

# Splits the input from a filehandle
sub supersplit_io {
  my $fh = shift;
  unless (defined($fh)) {
    $fh = \*STDIN;
  }
  unshift @_, join '', <$fh>;
  supersplit(@_);
}

# n-dim split. First arg is text, rest are patterns, listed
# inner to outer. Defaults to /\t/, /\n/
sub supersplit {
  my $text = shift;
  if (@_ < 1) {
    push @_, "\t";
  }
  if (@_ < 2) {
    push @_, "\n";
  }
  _split($text, map {qr/$_/} @_);
}

sub _split {
  my $text = shift;
  my $re = pop;
  my @res = split($re, $text); # Consider the third arg?
  if (@_) {
    @res = map {_split($_, @_)} @res;
  }
  \@res;
}

1;
```
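For anyone who wants to see it in action, here is a rough usage sketch. The routines are condensed copies of the ones above, repeated only so the snippet runs without SuperSplit.pm on disk; `_join` here builds a copy instead of assigning back into the caller's array. The sample data and the names `$table`, `$three`, `$data` and `$from_fh` are made up for illustration.

```perl
use strict;
use warnings;

sub supersplit {
    my $text = shift;
    push @_, "\t" if @_ < 1;
    push @_, "\n" if @_ < 2;
    return _split($text, map { qr/$_/ } @_);
}

sub _split {
    my $text = shift;
    my $re   = pop;                      # outermost pattern comes last
    my @res  = split $re, $text;
    @res = map { _split($_, @_) } @res if @_;
    return \@res;
}

sub supersplit_io {
    my $fh = shift;
    $fh = \*STDIN unless defined $fh;
    unshift @_, join '', <$fh>;          # slurp, then split as usual
    return supersplit(@_);
}

sub superjoin {
    my $a_ref = shift;
    push @_, "\t" if @_ < 1;
    push @_, "\n" if @_ < 2;
    return _join($a_ref, @_);
}

sub _join {
    my $a_ref = shift;
    my $str   = pop;                     # outermost separator comes last
    my @parts = @_ ? map { _join($_, @_) } @$a_ref : @$a_ref;
    return join $str, @parts;
}

# Default 2-dim case: tabs inside lines, newlines between them.
my $table = supersplit("a\tb\nc\td\n");
print $table->[1][0], "\n";              # prints "c"

# 3-dim case; separators are listed inner to outer.
# The split side takes patterns, so the pipe has to be escaped.
my $three = supersplit("1,2;3,4|5,6;7,8", ",", ";", "\\|");
print $three->[1][0][1], "\n";           # prints "6"
print superjoin($three, ",", ";", "|"), "\n";

# The same split through a filehandle (in-memory here).
my $data = "x\ty\nz\tw\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my $from_fh = supersplit_io($fh);
print $from_fh->[1][1], "\n";            # prints "w"
```

Note the asymmetry: supersplit treats its extra arguments as regular expressions while superjoin takes plain strings, so a round trip only reproduces the input when the patterns match exactly those strings.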

PS Please take the quantity and detail of my response as a sign that I liked the idea enough to critique it, and not as criticism of the effort you put in...

In reply to Re (tilly) 1: Supersplit by tilly
in thread Supersplit by jeroenes
