Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid

Initial Devel::UncheckedOps, a macro for perl

by diotalevi (Canon)
on Jun 17, 2004 at 00:01 UTC ( #367478=perlmeditation: print w/replies, xml ) Need Help??

<rant>CPAN modules failing to check for write errors</rant> kicked off a quest to see if I could apply the stuff I learned in Hacking The Op Tree For Fun And.... The following code is the result of a two weeks of hacking at perl to automatically either report ( the default - a safe choice ) or fix your code that doesn't check the result of system calls. The following snippet will automatically fix your entire program so that there are no unchecked io, file, or directory calls. It even gets into print() and printf() which are both normally impossible to override.

I am posting this code because it may be a while before I release another, better version. Along the way I found that it would be easy to abstract the `compiled_program() =~ s///`-like portions of the module off and so I expect to create a Devel::Macro later so I can rewrite this as a proper perl-macro using the impending macro system. In the interim I'm thinking I may just end up really busy with other, more important stuff and instead of just letting this sit around, collecting dust I thought I'd share it and let you all at least try it out.

All the user-configurable code is handled via the import() call so where the documentation isn't clear read that bit of source. See also the %CHECK_DICT for a categorized list of opcodes that can be matched. By default the module looks for everything appropriate from io, sockets, file, directory, eval, and miscellaneous categories.

I had to submit a patch to B::Utils and to B::Generate to get this to work. I have included both patches as responses to this node (mostly so the code I actually want to post isn't cluttered up with these related patches).

use Devel::UncheckedOps ( fix => 1, functions => [ map @$_, @Devel::UncheckedOps::CHECK_DICT{ qw( io file dir +ectory ) } ] ); print "Hello world!\n"; # Becomes # print "Hello world!\n" or die $!;
package Devel::UncheckedOps; use strict; use warnings FATAL => 'all'; use B qw( OPf_WANT_VOID class ); use B::Utils qw( walkallops_filtered opgrep ); use vars qw( $REPORT_CALLBACK %CHECK_DICT @CHECK_OPS @TERMINAL_OPS $VE +RSION $OPCODE_NAME $FIX_OPCODES $O_PM @QUEUED_FIXES_TO_APPLY $D +EBUG ); # %ALLOPS %OPMAP use Carp qw( carp ); $VERSION = '0.01'; # This is a largish list of stuff I think can be validated by this mod +ule. # The default list of opcodes that will be checked is defined in @CHEC +K_OPS # immediately following and normal users specify the list of ops to va +lidate # by passing in a reference to an array to the 'check' parameter of th +e use() # call. %CHECK_DICT = ( # 'write' would be a nice op to add but I do not yet kn +ow how # it works in code. io => [ qw[open close binmode dbmclose dbmopen fcntl flock ge +tc ioctl pipe_op tie read print prtf seek send sysopen sysr +ead sysseek syswrite recv tell truncate ] ], sockets => [ qw[accept bind connect listen shutdown sockpair ] ], file => [ qw[ chdir chmod chown chroot link mkdir readlink rena +me rmdir symlink unlink utime rmdir ] ], directory => [ qw[ closedir open_dir readdir rewinddir seekdir telld +ir ] ], # process => # TODO: # Check backtick # die $?, not $! # [ qw[ exec fork kill system ] ], # I do not know how to validate the semaphore, shared m +emory, # or message passing code. Thi shared_memory => [ qw[ shmctl shmget shmread shmwrite ] ], message_passing => [ qw[ msgctl msgget msgrcv msgsnd ] ], semaphores => [ qw[ semctl semget semop ] ], eval => [ qw[ dofile ] ], miscellaneous => [ qw[ syscall ] ] ); @CHECK_OPS = ( map @$_, @CHECK_DICT{ qw( io sockets file directory eval miscellaneous ) } ); # I started with just nextstate and leavesub but while reading opcode. +pl # went "eh, what the heck. Why not?" and just included the raft of rel +ated # opcodes. @TERMINAL_OPS = ( qw[ method entersub leavesub leavesublv caller reset lineseq nextstate dbstate unstack enter leave scope enteriter iter enterloop leaveloop return last next redo dump goto exit ] ); $REPORT_CALLBACK = \ &default_report; CHECK { check(); } # Create an alias so that when fixing, a person can say fix() instead. + This # might only be interesting when the normal CHECK call wasn't called. *fix = \&check; sub check { if ( $FIX_OPCODES ) { walkallops_filtered( \ &find_unchecked_system_call, \ &queue_fix_opcode ); fix_opcode( $_ ) for @QUEUED_FIXES_TO_APPLY; } else { walkallops_filtered( \ &find_unchecked_system_call, $REPORT_CALLBACK ); } if ( $O_PM ) { eval "use O '$O_PM'"; } return 1; } sub import { my $class = shift; my %p = @_; # Ethier take a callback from the user via # use Devel::UncheckedOps ( callback => sub { ... } ); # or supply a default. $REPORT_CALLBACK = $p{'report_callback'} if $p{'report'}; # Allow both `use Devel::UncheckedOps( check => 'print' )` or # `use Devel::UncheckedOps( check => [ 'print' ] )`. This is the # parameter I most expect people to specify. if ( $p{'function'} ) { @CHECK_OPS = $p{'function'}; } elsif ( $p{'functions'} ) { @CHECK_OPS = @{$p{'functions'}}; } # This is a boolean value. $FIX_OPCODES = !! $p{'fix'}; if ( $FIX_OPCODES ) { eval q[ use B::Generate (); use Internals (); 1; ] or carp( $@ ); } # This is a boolean value. Various guts will be displayed if you p +ass # in a true value. The guts that are displayed are entirely up to +my # most recent needs. $DEBUG = !! $p{'debug'}; # This is passed to `use O '$O_PM'` so the user of this module can + say # `use Devel::UncheckedOps( O => 'Deparse' )` to see what the # code looks like after deparsing. The parameter is any module in +the B:: # namespace that has already been designed to be called by in + this # way. This includes Bblock, Bytecode, C, CC, Concise, Debug, Depa +rse, # Showlex, Stackobj, Stash, Terse, Xref or any other module you mi +ght # get from CPAN like B::Deobfuscate. $O_PM = $p{'O'}; # I seriously doubt that anyone is going to need to specify these. + I # include this solely for debugging purposes and perhaps the event +ual # need for it. @TERMINAL_OPS = @{$p{'terminal_ops'}} if $p{'terminals'}; return 1; } sub find_unchecked_system_call { # This is used by B::Utils::*_filtered to grep for opcodes that ne +ed to # be reported or fixed. my $op = shift; # I am going to fix/report this in another function immediately fo +llowing. $OPCODE_NAME = $op->oldname; # if ( $FIX_OPCODES ) # { # my $addr = $$op; # for my $m ( qw( sibling # first # last ) ) # { # my $to = eval { ${ $op->$m } }; # next unless $to; # push @{$OPMAP{ $to }}, [ $op, $m ]; # } # } # B::Utils::opgrep test to decide if this opcode is one that is de +sirable. return ( opgrep( { name => \ @CHECK_OPS, flags => OPf_WANT_VOID }, $op ) or opgrep( { name => \ @CHECK_OPS, next => { name => \ @TERMINAL_OPS } }, $op ) ); } sub default_report { # This is the default callback for reporting that something has go +ne awry. # It may be overriden by saying # use Devel::UncheckedOps( report => \ &other_sub ); carp( "Unchecked $OPCODE_NAME" . " call at $B::Utils::file line $B::Utils::line" ); } sub queue_fix_opcode { # This function puts fixes into a to-do list so that they are only # altered when the tree is not being currently walked. my $op = shift; push @QUEUED_FIXES_TO_APPLY, { op => $op, file => $B::Utils::file, line => $B::Utils::line }; return 1; } sub fix_opcode { # This function accepts a 'fix' as previously queued by queue_fix_ +opcode(). my $fix = shift; my $op = $fix->{'op'}; my $file = $fix->{'file'}; my $line = $fix->{'line'}; printf( __PACKAGE__ . " FIXING %s at %s line %s\n", op_to_text( $op ), $file, $line ) if $DEBUG; # This is the in-memory address of the opcode. It is used my $orig_next = $op->next; my $orig_sibling = $op->sibling; printf( __PACKAGE__ . " SIBLING %s\n", op_to_text( $orig_sibling ) ) if $DEBUG; my $orig_parent = $op->parent; printf( __PACKAGE__ . " PARENT %s\n", op_to_text( $orig_parent ) ) if $DEBUG; my $orig_reverse_first; $orig_reverse_first = $orig_parent if ${$orig_parent->first} == $$op; printf( __PACKAGE__ . " PARENT->FIRST %s\n", op_to_text( $orig_parent->first ) ) if $DEBUG; my $orig_reverse_last; $orig_reverse_last = $orig_parent if ${$orig_parent->last} == $$op; printf( __PACKAGE__ . " PARENT->LAST %s\n", op_to_text( $orig_parent->last ) ) if $DEBUG; # Maybe find the opcode that thinks this opcode is its sibling by +going # to this opcode's parent and walking over the list of siblings un +til this # one is reached. The previously visited opcode is the one we're a +fter. my @siblings = $orig_parent->kids; my $orig_reverse_sibling = ( grep ${$siblings[$_]->sibling} == $$o +p, 0 .. $#siblings - 1 )[0]; $orig_reverse_sibling = $siblings[ $orig_reverse_sibling ] if defined $orig_reverse_sibling; printf( __PACKAGE__ . " REVERSE SIBLING %s\n", op_to_text( $orig_reverse_sibling ) ) if ( $DEBUG and $orig_reverse_sibling and ${$orig_reverse_sibling->sibling} == $$op ); # Construct the new program fragment in reverse order so parent no +des # can point to child nodes. This alters the original node so it is # now inside the new fragment. # or # ORIGINAL # die # pushmark # gvsv use Devel::Peek; my $gvsv = B::SVOP->new( 'gvsv' => 2, '$!' ); # Now inflate the reference count for *! because this is a sneaky +way # to take a reference that doesn't inform the variable's refcnt. Internals::SetRefCount( \*!, 1 + Internals::GetRefCount( \*! ) ); my $pushmark = B::OP->new( 'pushmark' => 2 ); my $die = B::LISTOP->new( 'die' => 5, $pushmark, $gvsv ); $die->targ( 1 ); $die->private( 1 ); my $or_root = B::LOGOP->new( 'or' => 2, $op, $die ); my $or_op = $or_root->first; $or_op->private( 1 ); # Insert this fragment into the appropriate place in the tree. Eve +ry place # that the ORIGINAL node was, this new node has to replace it. # PARENT # ->first( ORIGINAL ) # ->last( ORIGINAL ) $orig_reverse_first->first( $or_root ) if $orig_reverse_first; $orig_reverse_last->last( $or_root ) if $orig_reverse_last; # PARENT # KID # ->sibling( ORIGINAL ) $orig_reverse_sibling->sibling( $or_root ) if $orig_reverse_siblin +g; # PARENT # ORIGINAL # ->sibling( KID ) $or_root->sibling( $orig_sibling ) if $orig_sibling; # Now thread the execution order. # EXT -> $op # -> OR # -> $orig_next # ... # -> $pushmark # -> gvsv # -> die # Insert the OR into the execution $op->next( $or_op ); # Continue as normal if $or succeeds $or_op->next( $orig_next ); # Otherwise detour and then reroute back to the normal place $or_op->other( $pushmark ); $pushmark->next( $gvsv ); $gvsv->next( $die ); $die->next( $orig_next ); 1; } sub op_to_text { my $op = shift; return 'undef' if not defined $op; my $class = class $op; my $name; eval { $name = $op->oldname; 1; } or do { $name = ''; }; my $addr = sprintf '(0x%07x)', $$op ; join( '=', grep length(), $class, $name, $addr ); } 1; __END__ =head1 NAME Devel::UncheckedOps - Perl extension to warp your mind =head1 SYNOPSIS use Devel::UncheckedOps ( functions => [ 'print', 'prtf' ], fix => 1 ); =head1 DESCRIPTION This module examines the compiled perl program and either reports or f +ixes unchecked system calls. =head1 USE PARAMETERS =over4 =item function => NAME This parameter specifies a single function name to search for. Do reme +mber to document %CHECK_DICT which has a big, categorized list of op codes. use Devel::UncheckedOps ( function => 'print' ); =item functions => \ @NAMES This parameter specifies a list of function names to search for. Do re +member to document %CHECK_DICT which has a big, categorized list of op codes. =item report => \ &CALLBACK If the program is not fixing then it is reporting. This allows the use +r to specify and alternate reporting function. It is passed the opcode that + is in error. =item fix => BOOLEAN A boolean value that triggers all the really cool guts so even non-ove +rridable stuff like print and printf are fixed up. =item debug => BOOLEAN A boolean to get some additional information. Generally this is useful + when debugging the operation of the fixing code. =item O => B:: backend name Put stuff like 'Deparse', 'Concise', 'Terse', 'Debug', etc. here. This + just arranges to have the program passed to the appropriate B:: backend aft +er it has been altered. It is like saying -MO=Deparse to a command-line scri +pt. =item terminal_ops => \ @NAMES Go read the source. =back =head1 MOD_PERL? Mention that everything can be called directly from check() though eve +rything normally happens during the normal CHECK routine. This may not even be + valid to bring up. =head1 SEE ALSO See... what? That request that wished for this? =head1 AUTHOR Me. =head1 COPYRIGHT AND LICENSE Same as perl, etc,. =cut

Replies are listed 'Best First'.
Re: Initial Devel::UncheckedOps, a macro for perl
by diotalevi (Canon) on Jun 17, 2004 at 00:02 UTC
Re: Initial Devel::UncheckedOps, a macro for perl
by diotalevi (Canon) on Jun 17, 2004 at 00:03 UTC
Re: Initial Devel::UncheckedOps, a macro for perl
by Mr. Muskrat (Canon) on Jun 17, 2004 at 17:13 UTC

    Just a quick question for you diotalevi. What is the reasoning behind altering every print or printf statement and not just those associated with a filehandle other than STDOUT? This question popped into my head when I saw:

    print "Hello world!\n"; # Becomes # print "Hello world!\n" or die $!;

    I have an idea as to why you did that. Rather than show ignorance and spout off some possibly totally stupid idea (I am still a novice perlguts/B spelunker after all), I think I'll just wait for a response. :)

      perl -e 'print "'ello world\n" or die $!' > file_on_fs_that_is_already_full

      I still wonder if that would error if the buffer weren't flushed though. It at least makes it more likely especially as the default also puts an `or die $!` on the close() op. There isn't anything to be done about implicit closes though.

      An update. It has been suggested that I patch perl so that failures during implicit closes throw warnings, probably in the 'io' class.

        I did not even think of that one. I was thinking that if someone used select to set the default filehandle then you might need to make a second pass through to figure out what filehandle is really being used.
Re: Initial Devel::UncheckedOps, a macro for perl
by bobtfish (Scribe) on Jun 18, 2004 at 09:52 UTC
    Sorry for the line-noise comment, however just wanted to say I'm SO impressed.

    I'll be testing this and once it gets stable/in CPAN etc we'll probably be using it as a tool for checking our code.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlmeditation [id://367478]
Approved by broquaint
Front-paged by grinder
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (10)
As of 2017-06-28 13:23 GMT
Find Nodes?
    Voting Booth?
    How many monitors do you use while coding?

    Results (638 votes). Check out past polls.