Poor man's diff

by grinder (Bishop)
on May 23, 2005 at 14:22 UTC (snippet #459572)

I have a Windows server that I refuse to work on as much as possible, although fortunately it has a copy of Perl 5.6.1 installed. While replacing a batch process that spits out interface files for import on another system, I needed to test whether my replacement produced the same output as the original.

You cannot, as far as I am aware, tell Text::Diff to treat trailing blanks as insignificant. So I whipped up the following poor man's diff to do, well, a poor man's diff (no, I don't have Cygwin installed). It showed me that a single field in my new replacement differed from the original. Tracing things further, I found that the original program had an SQL bug in which table t1 was being joined to table t2 instead of t3. Funnily enough, the comments in the original program said that the select was supposed to be on t3.

Exit one 1314-line C program, to be replaced by a 278-line Perl program (of which 95 lines are an SQL select heredoc).

Note that this diff does not notice, nor does it care, if the files are of different lengths. Additional lines in the longer file will be silently ignored. In certain circumstances, this could be construed as a feature.

#! perl -w

# -- a poor man's diff

use strict;

my $in1 = shift || die "no first file";
my $in2 = shift || die "no second file";

open IN1, $in1 or die "input $in1: $!\n";
open IN2, $in2 or die "input $in2: $!\n";

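my $nr = 0;
while( defined( my $r1 = <IN1> ) and defined( my $r2 = <IN2> ) ) {
    ++$nr;    # line numbers are reported starting from 1
    $r1 =~ s/\s+$//;    # strip trailing whitespace, including the newline
    $r2 =~ s/\s+$//;
    print "files differ at line $nr\n" if $r1 ne $r2;
}

close IN1;
close IN2;
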
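
The script above silently ignores extra lines in the longer file. If that is not the behaviour you want, here is a minimal sketch of a variant that also reports a length mismatch. The subroutine name compare_files and the wording of the extra-lines message are assumptions made for illustration; they are not part of the original snippet.

```perl
#!/usr/bin/perl -w
use strict;

# Compare two files line by line, ignoring trailing whitespace.
# Returns a list of messages (empty if nothing differs).
sub compare_files {
    my ($path1, $path2) = @_;
    open my $fh1, '<', $path1 or die "input $path1: $!\n";
    open my $fh2, '<', $path2 or die "input $path2: $!\n";

    my @report;
    my $nr = 0;
    while (1) {
        my $r1 = <$fh1>;
        my $r2 = <$fh2>;
        last if !defined $r1 and !defined $r2;    # both files exhausted
        ++$nr;
        if (!defined $r1 or !defined $r2) {
            # Exactly one file ran out: name the one that still has data.
            my $longer = defined $r1 ? $path1 : $path2;
            push @report, "$longer has extra lines from line $nr";
            last;
        }
        s/\s+$// for $r1, $r2;    # trailing blanks are insignificant
        push @report, "files differ at line $nr" if $r1 ne $r2;
    }
    close $fh1;
    close $fh2;
    return @report;
}

# Command-line use: perl script.pl first-file second-file
if (@ARGV == 2) {
    print "$_\n" for compare_files(@ARGV);
}
```

Returning the messages as a list, rather than printing them inside the loop, keeps the comparison easy to reuse from another script.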
Replies are listed 'Best First'.
Re: Poor man's diff
by kaif (Friar) on Jun 03, 2005 at 20:28 UTC
    Good code.

    However, you said that you knew no way to "tell Text::Diff to ignore trailing blanks as being insignificant." I decided to explore this and found that the following code works:

    #!/usr/bin/env perl
    die "Usage: $0 from-file to-file\n" unless @ARGV == 2;
    use Text::Diff;
    diff shift, shift, {
        OUTPUT => \*STDOUT,
        STYLE  => "OldStyle", # or whatever pleases you
        KEYGEN => sub{ (my $line = shift) =~ s/\s*$//; return $line; },
    };

    In general, to compare something other than the lines themselves, just return that from the KEYGEN argument. For example, inserting sub{return substr shift, 0, 1} compares only the first characters. Unfortunately, the documentation for this is hidden in Algorithm::Diff.
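
The key-function idea can also be tried without any CPAN module. The following is a rough sketch that applies a caller-supplied key function to each line before comparing, in the spirit of KEYGEN. It is not how Text::Diff or Algorithm::Diff work internally: it only compares lines position by position, like the original snippet, and the name differing_lines and the sample data are hypothetical.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: compare two lists of lines position by position,
# comparing the keys returned by $keygen rather than the raw lines.
# Returns the 1-based line numbers at which the keys differ.
sub differing_lines {
    my ($keygen, $lines1, $lines2) = @_;
    # Only look at the lines both lists have in common.
    my $last = @$lines1 < @$lines2 ? $#$lines1 : $#$lines2;
    my @diffs;
    for my $i (0 .. $last) {
        push @diffs, $i + 1
            if $keygen->($lines1->[$i]) ne $keygen->($lines2->[$i]);
    }
    return @diffs;
}

# The two key functions mentioned in the reply above.
my $no_trailing_ws = sub { (my $line = shift) =~ s/\s*$//; return $line };
my $first_char     = sub { return substr shift, 0, 1 };

# Made-up sample data.
my @old = ("alpha  \n", "beta\n",  "gamma\n");
my @new = ("alpha\n",   "bravo\n", "delta\n");

my @ws_diffs = differing_lines($no_trailing_ws, \@old, \@new);
my @fc_diffs = differing_lines($first_char,     \@old, \@new);
print "ignoring trailing blanks: @ws_diffs\n";
print "first character only: @fc_diffs\n";
```

With this sample data, the first key function reports lines 2 and 3, while the second reports only line 3, because "beta" and "bravo" share a first character.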
Re: Poor man's diff
by lupey (Monk) on Jun 09, 2005 at 11:53 UTC
    I like your script for its simplicity if one wants a quick answer to what lines differ.

    Having read that you don't have Cygwin installed, I strongly recommend that you, and anybody else in the same position, install it, or else install the GNU utilities for Win32. Once you start thinking that you need to write this type of script, it is worth installing Cygwin. It has saved me countless hours of writing my own versions of tools that already exist on a *nix system.

    I write this because my hope is that it will help many others too.


    unashamed Cygwin advocate
