Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

Some missing textutils

by belg4mit (Prior)
on Dec 01, 2001 at 06:30 UTC ( [id://128817]=CUFP: print w/replies, xml ) Need Help??

These are my own perl versions of some C programs at http://acme.com/. lam joins multiple files using the contents of each as a column in the output. overlay lays two files on top of each other (as if they were transparencies) and the second shows through where the first contains whitespace.

I imagine overlay can be handy with ascii-art. And lam might be useful for making simple brochures...

Sample lamination:

=begin file 1
One fish, two fish.
=end file 1
=begin file 2
Red fish, blue fish.
=end file 2
=begin output
One fish, two fish.Red fish, blue fish.
=end output
Sample overlay:
=begin file 1
***hello      ***
              !!!
=end file 1
=begin file 2
...cruel world
I like bananas
=end file 2
=begin output
***hello world***
I like bananas!!!
=end output
UPDATE: danger made the astute observation that I was making the nasty assumption of \t == ' 'x8. Per his suggestion I am now using Text::Tabs, and running with -w. There was also a nasty little bug where I did length($2) instead of length($two), guess that's an argument for picking good variable names ;-). Also, jeffa got a for loop.

UPDATE 2: Bug and robustness fixes from danger's comment.

lam source:

#!/usr/bin/perl -w use strict; use Text::Tabs; use Symbol; #Symbol and my @data; #array instead of hash so the same file can be in several +columns my $sep =''; if( $ARGV[0] =~ /^-sep/ ){ (undef, $sep, @ARGV) = @ARGV; } die("usage: lam [-sep seperator] file1 file2 [...]\n") unless scalar @ +ARGV >1; for(my $i=0; $i < scalar @ARGV; $i++){ open($data[$i]->{FH}=gensym, $ARGV[$i]); $data[$i]->{w}=0; while(readline(*{$data[$i]->{FH}})){ my $length = length(expand($_)); $data[$i]->{w} = $length > $data[$i]->{w} ? $length : $data[$i]->{ +w}; } $data[$i]->{h} = $.; seek($data[$i]->{FH}, 0, 0); } my $max = (sort {$main::b <=> $main::a} map {$_->{h}} @data)[0]; for(my $j=0; $j<$max; $j++){ for(my $i=0; $i < scalar @ARGV; $i++){ if( $j > $data[$i]->{h} ){ print ' 'x$data[$i]->{w}; } else{ chomp($_ = readline(*{$data[$i]->{FH}})); $_ = expand($_); print $_, ' 'x($data[$i]->{w}-1-length); } print $sep unless $i+1 == scalar @ARGV; } print "\n"; }
overlay source:
#!/usr/bin/perl -w use strict; use Symbol; use Text::Tabs; die("usage: overlay file1 file2\n") unless scalar @ARGV == 2; my @data; for(my $i=0; $i < scalar @ARGV; $i++){ open($data[$i]->{FH}=gensym, $ARGV[$i]) || die("overlay($ARGV[$i]): +$!\n"); 1 while(readline(*{$data[$i]->{FH}})); $data[$i]->{LINE} = $.; seek($data[$i]->{FH}, 0, 0); } my $maxline = $data[0]->{LINE} > $data[1]->{LINE} ? $data[0]->{LINE} : $data[1]->{LINE}; for(my $i=0; $i < $maxline; $i++){ if( $i > $data[0]->{LINE} ){ while(readline(*{$data[1]->{FH}})){ print; } last; } elsif( $i > $data[1]->{LINE} ){ while(readline(*{$data[0]->{FH}})){ print; } last; } else{ for my $data (@data){ $data->{str} = readline(*{$data->{FH}}); chomp($data->{str}); $data->{str} = expand($data->{str}); } my $maxchar = length($data[0]->{str}) > length($data[1]->{str}) ? length($data[0]->{str}) : length($data[1]->{str}); my @onechars = split(//, $data[0]->{str}); my @twochars = split(//, $data[1]->{str}); my $str; for(my $j=0; $j<$maxchar; $j++){ if( $j > $#onechars ){ $str .= join('', splice(@twochars, $j)); } elsif($j > $#twochars){ $str .= join('', splice(@onechars, $j)); } else{ $str .= $onechars[$j] eq ' ' ? $twochars[$j] : $onechars[$j]; } } print $str, "\n"; } }

--
perl -p -e "s/(?:\w);([st])/'\$1/mg"

Replies are listed 'Best First'.
(jeffa) Re: Some missing textutils
by jeffa (Bishop) on Dec 01, 2001 at 20:56 UTC
    Neat stuff! ++! I have to comment on some style though:

    Things i don't like (sorry):

    1. hand rolling your own arg processor - not good if you are handing code out to others, use Getopt::Std or Getopt::Long instead.
    2. C style for loops - if i have said it once here i have said it one thousand times: for my $j (0..$maxchar) {} is so much nicer.
    Things i did like:
    1. use of Symbol
    2. using readline on a hash ref key to get around the interpreter mixing up angle brackets and the arrow operator (my $one = <$data[0]->{FH}>; doesn't work - i would have gimped out with a temp var)
    3. fully qualifying $a and $b to main in your sort
    Thanks!

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    F--F--F--F--F--F--F--F--
    (the triplet paradiddle)
    
      Thanks! I hadn't thought of #2 (things you didn't like). I had a foreach when I was using a hash to store data, but switched to the C-style for when I made the move to an array because, well just because that's the other idiom I'm familiar with. Although, while your suggestion is pretty safe for the chars loop isn't there a larger potential with the lines loop of instantiating a large list? As for #1, I know I know, but at least I didn't go with -s! (I was tempted but especially didn't want to force use of = to pass a value.) That and it's such a simple thing, I most certainly would have used Getopt::* if it was any more complicated; two arguments, arguments with optional values, or arguments with types. Yeah, it actually works without fully qualified $a and $b (had like that for awhile) but figured I'd go for the gusto.

      Thanks again.

      --
      perl -p -e "s/(?:\w);([st])/'\$1/mg"

        "...isn't there a larger potential ... of instantiating a large list?"

        Pardon my being behind the times, but that is definately the case with Perl versions 5.003 and older. In my old copy of Programming Perl Second Ed., there is a footnote on page 90 that mentions burning lots of memory "under the current implementation."

        So maybe you should keep that one iterative. My mistake. :)

        jeffa

        L-LL-L--L-LL-L--L-LL-L--
        -R--R-RR-R--R-RR-R--R-RR
        F--F--F--F--F--F--F--F--
        (the triplet paradiddle)
        
Re: Some missing textutils
by belg4mit (Prior) on Dec 02, 2001 at 08:19 UTC
    So there's been an update that incorporated some suggestions, quashed a few bugs, and am a bit more robust. However I have found something rather odd. These doesn't run under perl 5.005_03. And there are no meaningful messages. The only warnings that occur are for undefined values concerning the length of the files. It seems readline is bailing right off and the first while loop is never entered!!!

    --
    perl -p -e "s/(?:\w);([st])/'\$1/mg"

      There is a difference in how readline() handles references to typeglobs from 5.00503 and 5.6.1. The following works in 5.6.1:

      #!/usr/bin/perl -w use strict; use Symbol; my $sym = gensym; open($sym, 'file1') || die "Can't $!"; while(readline($sym)){ print; }

      But it doesn't work in 5.00503 (no warnings, readline simply returns undef so the loop isn't entered). Reading from <$sym> works in both version when $sym is a globref, but readline() itself didn't work with globrefs until 5.6 --- I didn't see mention of it in perldelta, but it is mentioned in the 'Changes' file for 5.6 sources:

      _________________________________________________________________ [ 3349] By: gsar on 1999/05/09 20:23:07 Log: allow readline($globref), <$globref> already works Branch: perl ! pp_hot.c

      A work around for 5.005 is to simply dereference the globref explicitly for the readline() function. The following work in 5.00503 and 5.6.1:

      readline(${$sym}) readline(*{$sym}) or in your particular case: readline(*{$data[$i]->{FH}})

      As a final note, in your last block where you pad out the string I think you want to substract the expand()'d length, not just the length():

      # change this: print $_, ' 'x($data[$i]->{w}-1-length); # to this: print $_, ' 'x($data[$i]->{w}-1-length(expand($_)));
        Damn, YASB. I had
        my $length = length($_ = expand($_));
        originally but upon later review the assignment seemed unnnecessary and was removed.

        --
        perl -p -e "s/(?:\w);([st])/'\$1/mg"

      I've run overlay through B::Deparse... And learned nothing (okay I learned that B::Deparse doesn't run clean with -w, and that the code it produces (at least in this case) is invalid ;-).

      --
      perl -p -e "s/(?:\w);([st])/'\$1/mg"

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: CUFP [id://128817]
Approved by root
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others sharing their wisdom with the Monastery: (8)
As of 2024-03-28 07:58 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found