http://www.perlmonks.org?node_id=1200210

clueless newbie has asked for the wisdom of the Perl Monks concerning the following question:

Semantic diff for Perl code

Here LanX states "Colleague asked me if I knew a method to compare modules where functions blocks where moved."

#!/usr/bin/env perl # Combined Disassemble and Reassemble use if (-d 'C:/Users/Clueless'),lib=>'C:/Users/Clueless/lib'; use if (-d 'C:/Users/Clueless'),lib=>'C:/Users/Clueless/devlib'; # $ENV{DBG} -> compile time; $ENV{DEBUG} -> run time. use if ($ENV{DBG} || $ENV{DEBUG}),"Devel::UnComment","#[=]#","Keep"; use Data::Dumper; use English '-no_match_vars'; use File::Basename; use Getopt::Long; #use Getopt::Euclid; use Params::Validate(':all'); use Perl::Tidy; use Pod::Usage; use PPI; use PPI::Dumper; use Text::Diff; use Try::Tiny; use strict; use warnings; use 5.10.0; { # INTERNALS # Given a file name and a list of directories returns the path of +the file in the "right most" directory sub use_file { my ($filename_S,@directories_A)=Params::Validate::validate_pos +(@_,{ type=>SCALAR },({ type=>SCALAR,callbacks=>{ "Directory not foun +d"=> sub { -d shift() } } }) x $#_); for (my $i=$#directories_A; $i >= 0; $i--) { return "$directories_A[$i]/$filename_S" if (-r "$directories_A[$i]/$filename_S"); }; Carp::cluck "File '$filename_S' was never found!"; my ($name)=File::Basename::fileparse($filename_S, qr/\.[^.]*/) +; return \q{sub $name { ...; }}; }; # use_file; # Given a reference to a string and a string of perltidy options # Returns a reference to the tidied string. sub tidy { my ($source_SREF,$argv_S)=Params::Validate::validate_pos(@_,{ +type=>SCALARREF },{ type=>SCALAR }); my $error=Perl::Tidy::perltidy( argv => $argv_S, source => $source_SREF, destination => \my $destination, stderr => \my $stderr, errorfile => \my $errorfile, # ignored when -se f +lag is set ); # Handle errors if ($error) { # serious error in input parameters, no tidied output print {*STDERR} "\n$stderr\n"; Carp::croak "Exiting because of serious errors!"; }; print {*STDERR} "<<STDERR>>\n$stderr\n" if ($stderr); print {*STDERR} "<<.ERR file>>\n$errorfile\n" if ($errorfile); return \$destination; }; #tidy: # Given a reference to a scalar and a file name writes the scalar +to the designated file: sub write_to_file { my ($string_SREF,$filename_S)=Params::Validate::validate_pos(@ +_,{ type=>SCALARREF },{ type=>SCALAR }); try { open my $file,'>',$filename_S or die "Can NOT open '$filename_S' for write"; print {$file} $$string_SREF; close $file or die "Can NOT close '$filename_S' after write"; } catch { #=# WARN Carp::longmess $_; Carp::confess $_; }; }; # write_to_file; done. }; # INTERNALS # Get options my $option_href={}; Getopt::Long::GetOptions($option_href, 'help' ,'man' ,'from=s@' ,'to=s' ) or pod2usage(2); #=# DEBUG [ option_href=>\$option_href]; pod2usage(1) if $option_href->{help}; pod2usage( '-verbose' => 2 ) if $option_href->{man}; if ($ARGV[0] && -r $ARGV[0] && $option_href->{to}) { # Disassmble $ARG +V[0] saving its "MAIN" and its subs into $option_href->[to} as files # Either empty the directory or create it if (-d $option_href->{to}) { # Directory exists - flush it #=# DEBUG unlink glob "$option_href->{to}/*.*"; } elsif (mkdir $option_href->{to}) { # Made the directory } else { # Maybe $option_href->{to} is a file? Carp::confess "We don't have the requisite directory '$option_ +href->{to}'! $!"; }; # Load the source my $source_s=do { open my $FILE,'<',$ARGV[0] or die "Could not open '$ARGV[0]' for reading! $!"; local $/; <$FILE>; }; # Tidy it? # Disassmble it my $document=PPI::Document->new(\$source_s) or die "Oops! Could not PPI::Document '$ARGV[0]'!"; # Decompose for my $sub (@{$document->find('PPI::Statement::Sub') || []}) { unless ($sub->forward) { #=# DEBUG $sub->name; # save the sub and its content # THIS WORKS (as I can't get $sub->save(...) to work) BUT +IS IT KOSHER? write_to_file(\$sub->PPI::Document::serialize(),$option_hr +ef->{to}.'/'.$sub->name.'.sub'); # Create a stub by removing sub's block's children my @elements=$sub->block->children; for (my $i=0; $i < @elements; $i++) { $elements[$i]->remove; }; }; }; # Save the stubbed out "main/package" under MAIN. $document->save("$option_href->{to}/MAIN"); } elsif ($ARGV[0] && $option_href->{from}) { # Reassemble $ARGV[0] from +MAIN and the subs in @$option_href->{from} (scanning $# .. 0) # Load "MAIN" my $document=PPI::Document->new(use_file('MAIN',@{$option_href->{f +rom}})); #=# DEBUG ''.PPI::Dumper->new($document)->string; # Find and flesh out the stubs for my $stub (@{$document->find('PPI::Statement::Sub') || []}) { unless ($stub->forward) { # Thanks to haukex # http://www.perlmonks.org/bare/?node_id=1199996 ## 1. store the document in its own variable #my $doc=PPI::Document->new(\$sub_h{$stub->name}{content}) +; #my $sub=$doc->find_first('PPI::Statement::Sub'); # ## 2. remove the sub element from its document #$stub->insert_after($sub->remove) # or warn "Could not insert_after!"; ## 3. remove the stub #$stub->remove # or warn "Could not ->remove!"; # ... or equivalently $stub->insert_after(PPI::Document->new(use_file($stub->nam +e.q{.sub},@{$option_href->{from}}))->find_first('PPI::Statement::Sub' +)->remove) or warn "Could not ...->insert_after(...->remove)!"; #=# TRACE '',PPI::Dumper->new($document,whitespace=>0)->st +ring; $stub->remove or warn "Could not ...->remove!"; #=# DEBUG '',PPI::Dumper->new($document,whitespace=>0)->st +ring; }; }; #=# DEBUG '',PPI::Dumper->new($document,whitespace=>0)->string; # Write out the results (should we tidy it?) $document->save($ARGV[0]); } else { pod2usage( '-verbose' => 2 ) }; exit; __END__

Disassemble both source files into their own directories via "assemble.pl <file of code to be disassembled> -to <directory to hold "main" and subs>".

Compare corresponding files between the two directories.

Reassemble via "assemble.pl <file to hold reconstituted code> -from <directory that holds the updated "main" and subs>".

My question: I'd like very much to make use of PPI::Document's save method to write the extracted subs but the subs are PPI::Statement::Sub. I'm able to use PPI::Document::serialize on them (thus getting the heredocs) but this just feels wrong.

Updated code thanks to all especially haukex

#!/usr/bin/env perl # Combined Disassemble and Reassemble use if (-d 'C:/Users/clueless_newbie'),lib=>'C:/Users/clueless_newbie/ +lib'; use if (-d 'C:/Users/clueless_newbie'),lib=>'C:/Users/clueless_newbie/ +devlib'; # $ENV{DBG} -> compile time; $ENV{DEBUG} -> run time. use if ($ENV{DBG} || $ENV{DEBUG}),'Devel::UnComment','#[=]#','Keep'; use Data::Dumper; use English '-no_match_vars'; use File::Basename; use Getopt::Long; use Params::Validate(':all'); use Pod::Usage; use PPI; use PPI::Dumper; use strict; use warnings; use 5.10.0; { # INTERNALS # Given a file name and a list of directories returns the content +of the file in the "right most" directory sub read_file { my ($filename_S)=Params::Validate::validate_pos(@_,{ type=>SCA +LAR }); open my $FILE,'<',$ARGV[0] or die "Could not open '$filename_S for reading! $OS_ERROR +"; local $INPUT_RECORD_SEPARATOR; return \<$FILE>; }; # read_file: # Given a file name and a list of directories returns the path of +the file in the "right most" directory sub which_file { my ($filename_S,@directories_A)=Params::Validate::validate_pos +(@_,{ type=>SCALAR },({ type=>SCALAR,callbacks=>{ "Directory not foun +d"=> sub { -d shift } } }) x $#_); my ($name,$directory,$extension)=File::Basename::fileparse($fi +lename_S, qr/\.[^.]*/); #=# DEBUG [ name=>\$name,$directory=>\$directory,extension=>\$ +extension,directories=>\@directories_A ]; for (my $i=$#directories_A; $i >= 0; $i--) { #=# TRACE [ trying=>\"$directories_A[$i]/$name$extension" +]; return "$directories_A[$i]/$name$extension" if (-r "$directories_A[$i]/$name$extension"); }; # No such file -- return a reference to a string containing a +stub. return \qq{sub $name {\n # NOT FOUND!!! \n}}; }; # which_file: }; # INTERNALS # Get options my $option_href={}; Getopt::Long::GetOptions($option_href, 'help' ,'man' ,'from=s@' ,'to=s' ) or pod2usage(2); #=# DEBUG [ option_href=>\$option_href]; pod2usage(1) if $option_href->{help}; pod2usage( '-verbose' => 2 ) if $option_href->{man}; if ($option_href->{to} && $option_href->{from}) { # WTF: Can't have bo +th! pod2usage( '-verbose' => 2 ) } elsif ($ARGV[0] && -r $ARGV[0] && $option_href->{to}) { # Disassmble $ +ARGV[0] saving its "MAIN" and its subs into $option_href->[to} as fil +es # Either empty the directory or create it if (-d $option_href->{to}) { # Directory exists - flush it #=# TRACE "deleting files:", unlink glob "$option_href->{to}/*.*"; } elsif (mkdir $option_href->{to}) { # Made the directory } else { # Maybe $option_href->{to} is a file? Carp::confess "We do not have the requisite directory '$option +_href->{to}'! $OS_ERROR"; }; my $source_sref=read_file($ARGV[0]); # Tidy it? # Disassmble it my $document=PPI::Document->new($source_sref) or die "Oops! Could not PPI::Document '$ARGV[0]'!"; # Decompose for my $sub (@{$document->find('PPI::Statement::Sub') || []}) { unless ($sub->forward) { #=# DEBUG $sub->name; # save the sub and its content # THIS WORKS AND won't have Adam Kennedy shaking his head +in disgust! Thanks to haukex $sub->insert_after(PPI::Document::Fragment->new(\(q{sub }. +$sub->name.q{ {...}}))->find_first(q{PPI::Statement::Sub})->remove) or warn q{Could not ...->insert_after(...->remove)!}; my $subdoc=PPI::Document::Fragment->new() or die q{Oops! Could not PPI::Document::Fragment->new! +}; $subdoc->add_element($sub->remove); # Won't need to stub by removing sub's block's children $subdoc->save($option_href->{to}.q{/}.$sub->name.q{.sub}); }; }; # Save the stubbed out "main/package" under MAIN. #=# DEBUG 'MAIN'; $document->save("$option_href->{to}/MAIN"); } # Appears to work: Check MAIN & subs against ARGV[0] elsif ($ARGV[0] && $option_href->{from}) { # Reassemble $ARGV[0] from +MAIN and the subs in @$option_href->{from} (scanning $# .. 0) # Load "MAIN" my $document=PPI::Document->new(which_file('MAIN',@{$option_href-> +{from}})); #=# TRACE ''.PPI::Dumper->new($document)->string; # Buffer to hold subs required by the updated document my @sub_a; # Find and flesh out the stubs for my $stub (@{$document->find('PPI::Statement::Sub') || []}) { #=# DEBUG $stub->name; unless ($stub->forward) { push @sub_a,PPI::Document->new(which_file($stub->name.q{.s +ub},@{$option_href->{from}})); if ($stub->insert_before($sub_a[-1]->find_first('PPI::Stat +ement::Sub'))) { $stub->delete(); } else { warn q{... ->insert_before failed!}; }; }; }; #=# DEBUG '',PPI::Dumper->new($document,whitespace=>0)->string; # Write out the results (should we tidy it?) $document->save($ARGV[0]); } else { pod2usage( '-verbose' => 2 ) }; exit; __END__

Replies are listed 'Best First'.
Re^2: Semantic diff for Perl code
by haukex (Archbishop) on Sep 28, 2017 at 10:22 UTC

    The ->remove technique I showed in reply to your other post can be applied here too:

    In that thread, holli mentioned PPR, and I've been wanting to play more with that module, so I took this opportunity to do so. Here is something that will "stub/unstub" subs similarly to what your PPI code does.

Re^2: Semantic diff for Perl code
by clueless newbie (Curate) on Sep 27, 2017 at 18:23 UTC

    This works too

    # This works too - but it's certainly not KOSHER! { package PPI::Statement::Sub; local @ISA=('PPI::Document',@ISA); $sub->save($option_href->{to}.'/'.$sub->name.'.sub'); };

    but I'm quite certain that Adam Kennedy and a few others are shaking their heads in anguish!

Re^2: Semantic diff for Perl code
by Anonymous Monk on Sep 27, 2017 at 17:08 UTC

      I'm uncertain as to how one inserts "$sub" into the new PPI::Document or PPI::Document::Fragment.

        untested, but try
        my $subdoc = PPI::Document::Fragment->new(\$sub->content); write_to_file($subdoc->serialize(), ...);
A reply falls below the community's threshold of quality. You may see it by logging in.