PerlMonks

### file comparison

by chennaiite (Sexton) on May 23, 2007 at 12:35 UTC
chennaiite has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I have two files, 'expected output' and 'actual output', and I need to compare them. I have already implemented an exact, line-by-line comparison.
In some cases, however, lines in the actual output appear in a different order than in the expected output, so the comparison fails.
Could you please suggest another approach that would make the comparison program handle this more robustly?
Also, can anyone guide me on how to do the comparison using patterns?

Replies are listed 'Best First'.
Re: file comparison
by liverpole (Monsignor) on May 23, 2007 at 13:08 UTC
Hi chennaiite,

That's a pretty open-ended question.

If you're interested in comparing files, I'd suggest taking a look at Text::Diff; it will compare 2 files the way the Linux/Unix utility diff does.

If you're interested in how, exactly, the algorithms for comparison work, take a look at Approximate_string_matching. It references the Levenshtein_distance, which measures how close two strings are to one another: it is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other.

Furthermore, if you Google for "Levenshtein distance", you'll find sites that provide more detailed analysis of the algorithms, including some, like this one, which present a helpful visual demonstration.
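To make the idea concrete, here is a minimal sketch of the Levenshtein distance in plain Perl, using the standard row-by-row dynamic-programming formulation. The function name `levenshtein` is only an illustrative choice, not something from a module; for real work a CPAN module would probably be a better fit.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Levenshtein distance: the minimum number of single-character
# insertions, deletions, and substitutions that turn $s into $t.
# (The name "levenshtein" is an illustrative choice.)
sub levenshtein {
    my ($s, $t) = @_;
    my @s = split //, $s;
    my @t = split //, $t;

    # @prev holds the distances from the previous prefix of $s to every
    # prefix of $t; for the empty prefix that is just the prefix length.
    my @prev = (0 .. scalar @t);

    for my $i (1 .. scalar @s) {
        my @curr = ($i);    # turning $i characters into the empty string
        for my $j (1 .. scalar @t) {
            my $cost = $s[$i - 1] eq $t[$j - 1] ? 0 : 1;
            $curr[$j] = min(
                $prev[$j] + 1,            # deletion
                $curr[$j - 1] + 1,        # insertion
                $prev[$j - 1] + $cost,    # substitution (or match)
            );
        }
        @prev = @curr;
    }
    return $prev[-1];
}

print levenshtein("kitten", "sitting"), "\n";    # prints 3
```

Applied to whole lines rather than characters, a small distance between an expected line and an actual line suggests a near match rather than a completely different line.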

s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
Re: file comparison
by educated_foo (Vicar) on May 23, 2007 at 13:42 UTC
If output order doesn't matter, then just sort the two files before diffing. If it kind of matters, i.e. only local reordering is allowed, then you'll need to be more specific, but you can probably sort consecutive groups of lines and compare them:
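```
# Assumes @expected and @actual hold the lines of each file and $k is the window size.
for my $i (0 .. $#expected - $k) {
    my $exp = join "\n", sort @expected[$i .. $i + $k];
    my $act = join "\n", sort @actual[$i .. $i + $k];
    print "Mismatch around line ", $i + 1, "\n" if $exp ne $act;
}
```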
Re: file comparison
by blazar (Canon) on May 23, 2007 at 14:00 UTC

I'm not sure if I understand your question. Do you mean that you want to compare two files and consider them identical if they have exactly the same lines, except that they may not be in the same order? If so, then just sort them before the comparison. It may even be a situation in which calling an external sort program could be sensible, although that would make your program not portable.
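A minimal sketch of the sort-then-compare idea in pure Perl, with no external sort program, might look like the following. The names `same_lines_any_order` and `read_lines`, and the file names in the usage comment, are assumptions made for illustration.

```perl
use strict;
use warnings;

# Return true if both array refs hold the same lines, ignoring order.
# Duplicates count: a line that appears twice in one list must appear
# twice in the other.
sub same_lines_any_order {
    my ($expected, $actual) = @_;
    return 0 unless @$expected == @$actual;

    my @e = sort @$expected;
    my @a = sort @$actual;
    for my $i (0 .. $#e) {
        return 0 if $e[$i] ne $a[$i];
    }
    return 1;
}

# Hypothetical helper: read a file into a list of chomped lines.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    chomp(my @lines = <$fh>);
    close $fh;
    return \@lines;
}

# Usage with hypothetical file names:
# print "match\n"
#     if same_lines_any_order(read_lines('expected.txt'), read_lines('actual.txt'));
```

Sorting in Perl keeps the program portable, at the cost of holding both files in memory, which is usually fine for test output.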

I will analyse all your suggestions and see which one meets my requirement.

Thank you all for the suggestions provided.
Re: file comparison
by duff (Vicar) on May 23, 2007 at 13:42 UTC

Also, if you have control over the programs that generate the output, you might want to see why the lines are interchanged and fix it so they are not. Barring that, if you can arrange the lines in some other, definite order before comparing them, that would work too.

Node Type: perlquestion [id://616991]