PerlMonks  

RTF diff

by sheriff (Sexton)
on Aug 28, 2003 at 10:52 UTC ( #287301=sourcecode )
Category: Text Processing
Description: If you're working with RTF, sometimes you'll want to compare two RTF files to see whether they differ. Traditional diff falls down here, because RTF can contain all sorts of crazy whitespace, some of it significant and some not. rtfdiff, below, dumps the token stream of each RTF file as one token per line and then diffs the two dumps, so you can easily see whether two RTF files are really the same :-) Tada!
#!/usr/bin/perl

# Compares a tokenized view of two RTF files

use strict;
use RTF::Tokenizer;
use Text::Diff;

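# Both file names are required.
die "Usage: $0 first.rtf second.rtf\n" unless @ARGV == 2;

# Dump each file as one token per line, then diff the dumps.
my $first_file = pretty_print( $ARGV[0] );
my $second_file = pretty_print( $ARGV[1] );

print diff \$first_file, \$second_file;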

sub pretty_print {

    my $filename = shift;
    my $output = '';

    my $tokenizer = RTF::Tokenizer->new( file => $filename );

    while (1) {

        my ( $type, $token, $argument ) = $tokenizer->get_token();

        last if $type eq 'eof';

        # Make whitespace visible so every token stays on one dump line.
        for ( $token, $argument ) {
            $_ = '' unless defined $_;
            s/\n/[n]/g;
            s/\t/[t]/g;
            s/\r/[r]/g;
        }

        $output .= "($type) $token $argument\n";

    }

    return $output;

}
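To see why comparing tokens beats comparing raw text, here is a rough sketch using only core Perl. The toy_tokens sub is a made-up stand-in for illustration, not RTF::Tokenizer: it understands only groups, control words with an optional numeric parameter, and plain text, and the three sample strings are invented. It assumes the usual RTF rules that a single space after a control word is just a delimiter and that bare line breaks are ignored.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy tokenizer for a tiny RTF subset (hypothetical, NOT RTF::Tokenizer).
# Returns a dump with one token per line.
sub toy_tokens {
    my $rtf = shift;
    my @tokens;
    while ( length $rtf ) {
        if ( $rtf =~ s/\A([{}])// ) {
            push @tokens, "(group) $1";
        }
        elsif ( $rtf =~ s/\A\\([a-z]+)(-?\d+)? ?// ) {
            # At most one trailing space is eaten: it only delimits the word.
            push @tokens, '(control) ' . $1 . ( defined $2 ? " $2" : '' );
        }
        elsif ( $rtf =~ s/\A[\r\n]+// ) {
            # Bare line breaks are not part of the document text; skip them.
        }
        elsif ( $rtf =~ s/\A([^{}\\\r\n]+)// ) {
            push @tokens, "(text) $1";
        }
        else {
            die "toy tokenizer cannot handle: $rtf";
        }
    }
    return join "\n", @tokens;
}

# Same document, written on one line and then wrapped across three lines.
my $compact = '{\rtf1\ansi\b Hello\b0  world}';
my $wrapped = "{\\rtf1\n\\ansi\n\\b Hello\\b0  world}";

# One space fewer before "world": this space is real text, so it matters.
my $respaced = '{\rtf1\ansi\b Hello\b0 world}';

for my $pair ( [ 'wrapped', $wrapped ], [ 'respaced', $respaced ] ) {
    my ( $name, $rtf ) = @$pair;
    my $verdict = toy_tokens($compact) eq toy_tokens($rtf) ? 'same' : 'different';
    print "compact vs $name: $verdict\n";
}
```

The wrapped copy comes out the same as the compact one even though a plain diff would flag every line, while the respaced copy comes out different because the second space after \b0 belongs to the text.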
Replies are listed 'Best First'.
Re: RTF diff
by particle (Vicar) on Aug 28, 2003 at 13:12 UTC

    a handy utility, sheriff. i've got a couple suggestions:

    • use warnings; there's no reason not to, since this code is clean under warnings.
    • specify the module version for RTF::Tokenizer, as in: use RTF::Tokenizer 1.01 qw(); the behaviour of new() has changed to allow the syntax you've used. i had to upgrade mine from 1.00 to get this to work correctly.
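A minimal sketch of the version check, done at run time so the script can print a friendlier message than a compile-time "use RTF::Tokenizer 1.01 qw();" would. The 1.01 threshold is the one mentioned above; the wording of the messages is only an assumption.

```perl
use strict;
use warnings;

# Run-time counterpart of "use RTF::Tokenizer 1.01 qw();".
my $ok = eval {
    require RTF::Tokenizer;           # dies if the module is not installed
    RTF::Tokenizer->VERSION(1.01);    # dies if the installed copy is older
    1;
};

print $ok
    ? "RTF::Tokenizer is new enough\n"
    : "need RTF::Tokenizer 1.01 or later: $@";
```

In the real script you would die instead of print in the failure case, so the diff never runs against an old tokenizer.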

    otherwise, quite a handy utility!

    ~Particle *accelerates*
