package SuperSplit;
use strict;

=head1 NAME

SuperSplit - Provides methods to split/join in two dimensions

=head1 SYNOPSIS

 use SuperSplit;

 #first example: split on newlines and whitespace and print
 #the same data joined on tabs and newlines. The split works on STDIN
 #
 print superjoin( supersplit() );

 #second: split a table in a text file, and join it to HTML
 #
 my $array2D   = supersplit( \*INPUT );  #filehandle must be open
 my $htmltable = superjoin( '</TD><TD>', "</TD></TR>\n  <TR><TD>", 
                                 $array2D );
 $htmltable    = "<TABLE>\n  <TR><TD>" . $htmltable . "</TD></TR>\n</TABLE>";
 print $htmltable;

 #third: Perl allows a varying number of columns per row,
 # so don't stop with simple tables. To split a piece of text into 
 # paragraphs, then words, try this:
 #
 undef $/;
 $_ = <>;
 tr/.!();:?/ /; #replace punctuation with spaces
 my $array = supersplit( '\s+', '\n\s*\n', $_ );
 # now you can do something nifty, such as counting the number of words
 # in each paragraph
 my @numwords = (); my $i=0;
 for my $rowref (@$array) {
    push( @numwords, scalar(@$rowref) );  #2D-array: array of refs!
    print "Found $numwords[$i] \twords in paragraph \t$i\n";
    $i++;
 }

=head1 DESCRIPTION

SuperSplit is simply a consequence of Perl supporting 2D arrays. Because 
they are possible, one also wants a convenient way to split data into a 
2D array (at least I do), and vice versa, of course. supersplit and 
superjoin do just that. 

Because I intend to use these functions in numerous one-liners and in my 
collection of handy filters, an object interface would more often than not 
be cumbersome.  So this module exports two functions, and those are all it 
has.  If you think modules shouldn't export by default, period, use the 
object interface, SuperSplit::Obj. TIMTOWTDI.

=over 4

=item supersplit($colseparator,$rowseparator,$filehandleref || $string);

The first function, supersplit, returns a 2D array.  To do that, it needs
data and the separators to split with.  Data may be provided as a reference
to a filehandle, or as a string.  If you want to use a string for the data,
you MUST provide both separators (3-argument mode).  If you don't provide
data, supersplit reads STDIN.  If you provide a filehandle (a ref to it,
anyway), the separators are optional: by default, columns are separated by
whitespace and rows by newlines.  The separators are passed directly to
split, so they are treated as regular expressions.

supersplit returns a reference to a 2D array, or undef if an error occurred. 

=item superjoin( $colseparator, $rowseparator, $array2D );

The second and last function, superjoin, takes a 2D array and returns it as 
a string. In the string, columns (adjacent cells) are separated by the first 
argument and rows (normally lines) by the second. The row separator is also 
appended after the last row. Alternatively, you may give the 2D array as the 
only argument. In that case, superjoin joins columns with a tab ("\t") and 
rows with a newline ("\n"). 

superjoin returns undef if an error occurred, for example if you give it a 
ref to a hash. If the rows themselves are hash refs, Perl dies with a 
"Not an ARRAY reference" error.

=back

=head1 AUTHOR

J. Elassaiss-Schaap

=head1 LICENSE

Perl/Artistic license

=head1 STATUS

Alpha

=cut

BEGIN{
   use Exporter;
   use vars qw( @EXPORT @ISA $VERSION );
   $VERSION = 0.01;
   @ISA = qw( Exporter );
   @EXPORT = qw( &supersplit &superjoin );
}

sub supersplit{
        my $handleref = pop || \*STDIN;
        unless (ref($handleref) =~ /GLOB/){
           # last argument is a string: it is the data, not a filehandle
           push(@_, $handleref);
           undef $handleref;
        }
        my $second = $_[0] || '\s+';   # column separator
        my $first = $_[1] || '\n';     # row separator
        my $text;
        if ($handleref) {
            local $/;                  # slurp mode: read the whole file at once
            $text = <$handleref>;
        } else {
            $text = $_[2];
        }
        return undef unless defined $text;
        my $index = 0;
        my $arrayref = [[]] ; 
        for my $line ( split( $first, $text ) ){
            # split must run in list context: "split(...) || $_" forces
            # scalar context and stores the field count, not the fields
            my @cells = split( $second, $line );
            # fall back to the whole line if the split yields nothing
            $arrayref->[$index] = [ @cells ? @cells : $line ];
            $index++;
        }
        return $arrayref;
}

sub superjoin{
        my $array = pop || return undef;
        my $first = shift || "\t";     # column separator
        my $second = shift || "\n";    # row separator
        my $text = '';
        return undef unless( ref($array) eq 'ARRAY' );
        return undef unless( ref($array->[0]) =~ /ARRAY|HASH/ );
        for my $row (@$array) {
                $text .= join( $first, @$row );
                $text .= $second;      # appended after every row, the last too
        }
        return $text;
}

1;
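
For readers who want to try the two-dimensional split/join idea without installing the module, here is a minimal standalone sketch. The helpers split2d and join2d are hypothetical names invented for this illustration; they are not part of SuperSplit and only mimic its 3-argument string mode (no filehandle handling, no defaults, no error checks).

```perl
use strict;
use warnings;

# Hypothetical helper (not SuperSplit's API): split $text into rows on
# $rowsep, then each row into cells on $colsep. Both are regex strings.
sub split2d {
    my ($colsep, $rowsep, $text) = @_;
    return [ map { [ split /$colsep/, $_ ] } split /$rowsep/, $text ];
}

# Hypothetical helper: join cells with $colsep and terminate every row
# with $rowsep, the same way superjoin appends its row separator.
sub join2d {
    my ($colsep, $rowsep, $rows) = @_;
    return join '', map { join($colsep, @$_) . $rowsep } @$rows;
}

# Rows may have different numbers of columns.
my $table = split2d( ',', '\n', "a,b,c\n1,2\n" );
print scalar(@$table), " rows, first row has ", scalar(@{ $table->[0] }), " cells\n";
print join2d( "\t", "\n", $table );
```

The sketch converts comma-separated input into tab-separated output. Unlike a plain join, terminating each row means the result ends with a newline, which is convenient when printing filter output.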
    

In reply to Supersplit by jeroenes
