
Re: Re: File Comparison

by sunadmn (Curate)
on Aug 20, 2003 at 19:09 UTC (#285262=note)


in reply to Re: File Comparison
in thread File Comparison

That's close to what I am trying to do, but not exactly. Let me give you an example of what I want to achieve. I have a parse script that looks through a log file and builds three separate files from the output; these files hold BIND transfer stats. What I would like to do now is parse the three files and do something like an sdiff on all three, but sdiff only handles two files. In the end I want a list of the lines that do not exist in all of the files, along with which file(s) each line is missing from. Does that make any sense?


Re: Re: Re: File Comparison
by waswas-fng (Curate) on Aug 20, 2003 at 22:21 UTC
    If you already have a parse script written, why take the extra step of going to 3 files? Why not inspect the data and output the real result in one step?

    -Waswas
      I could try that, but I think the three separate files will work better for the outcome I want to achieve.
Re: Re: Re: File Comparison
by esh (Pilgrim) on Aug 21, 2003 at 07:18 UTC

    Sounds like an interesting problem, but I still can't quite picture the data and the result you want. These questions might clear things up for me:

    Are the lines guaranteed to be unique?

    Does the order of the lines matter as it does in diff?

    If I see a line "XYZ" in file 1 and "XYZ" in file 2, and "XYZ" in file 3, are these the same line no matter where they show up in the respective files?

    How big are the files? Would it be feasible to load them all into memory at the same time?

    Is it ok to sort the files before doing the comparison or does your output need to be in a specific order?

    Pretend letters are lines. What should be the output if the following are the contents of the three files?

    file 1: A B C D E G
    file 2: B A D E G H
    file 3: A B D E G I

    -- Eric Hammond
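
One possible reading of the request, sketched in Python on the three-file letter example above. It assumes the answers to the questions are "lines are unique", "order does not matter", and "everything fits in memory"; none of that has been confirmed in the thread, and the function name `missing_report` is made up for illustration.

```python
def missing_report(files):
    """Map each line to the names of the files that lack it.

    `files` maps a file label to an iterable of lines. Lines that are
    present in every file are left out of the report. File names are
    listed in the order they were given.
    """
    # Assumption: duplicates and ordering are irrelevant, so sets suffice.
    line_sets = {name: set(lines) for name, lines in files.items()}
    all_lines = set().union(*line_sets.values())

    report = {}
    for line in sorted(all_lines):
        absent = [name for name, present in line_sets.items()
                  if line not in present]
        if absent:
            report[line] = absent
    return report


# The letter example from the post, with each letter standing for a line.
example = {
    "file 1": "A B C D E G".split(),
    "file 2": "B A D E G H".split(),
    "file 3": "A B D E G I".split(),
}

for line, absent in missing_report(example).items():
    print(f"{line}: missing in {', '.join(absent)}")
```

Under those assumptions the report lists C as missing in files 2 and 3, H as missing in files 1 and 3, and I as missing in files 1 and 2. If order or duplicates do matter, a set-based approach like this is the wrong tool and a real diff algorithm would be needed.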

      OK, I will give you a sample of the files. These files are logs from the output of the namedxfer daemon within BIND 9.2.2. What I am seeing is a large lack of transfers to a single server in Atl, GA, and I cannot get my network Nazis here at work to do any more digging, as they say their switch is fine. What I want to do is take the log and split it into three files, one for each "Master Server" (ns1, ns2, and ns3.mycompany.com). I then want to compare the three files to find out when and where I have degradation on my network, so I can go back to the NetEng group with hard evidence that there is a network issue.
      Here is a small excerpt from a parsed file for a single server.
      Aug 06 15:00:36.747 xfer-out: info: client 68.168.192.17#50840: transfer of '112.23.67.in-addr.arpa/IN': AXFR started
      Aug 06 16:00:36.326 xfer-out: info: client 68.168.192.17#50963: transfer of '129.23.67.in-addr.arpa/IN': AXFR started
      Aug 06 16:00:36.829 xfer-out: info: client 68.168.192.17#50964: transfer of '131.23.67.in-addr.arpa/IN': AXFR started
      Aug 06 16:00:36.840 xfer-out: info: client 68.168.192.17#50965: transfer of '130.23.67.in-addr.arpa/IN': AXFR started
      Aug 06 16:00:37.327 xfer-out: info: client 68.168.192.17#50966: transfer of '128.23.67.in-addr.arpa/IN': AXFR started
      Aug 06 16:06:09.468 xfer-out: info: client 68.168.192.17#50978: transfer of '78.168.68.in-addr.arpa/IN': AXFR-style IXFR started
      Aug 06 16:12:06.719 xfer-out: info: client 68.168.192.17#50989: transfer of 'colememorial.com/IN': AXFR-style IXFR started
      Aug 06 16:15:44.581 xfer-out: info: client 68.168.192.17#50999: transfer of 'charlescolehospital.com/IN': AXFR-style IXFR started
      Aug 06 16:20:25.301 xfer-out: info: client 68.168.192.17#51010: transfer of 'coudersporthospital.com/IN': AXFR-style IXFR started
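
A note on comparing lines like these: each one carries a millisecond timestamp and a client port, so the same zone transfer logged on two different masters would presumably never produce byte-identical lines. A comparison probably needs a normalized key first. The Python sketch below pulls out the zone and the transfer type with a regular expression written against the excerpt above only; the pattern, the choice of `(zone, kind)` as the key, and the name `transfer_key` are assumptions, not anything BIND defines.

```python
import re

# Pattern written against the sample lines only; other BIND log formats
# (different categories, severities, or date layouts) will not match.
XFER_RE = re.compile(
    r"^(?P<stamp>\w{3} \d{2} \d{2}:\d{2}:\d{2}\.\d{3}) xfer-out: info: "
    r"client (?P<client>[\d.]+)#(?P<port>\d+): "
    r"transfer of '(?P<zone>[^']+)/IN': (?P<kind>.+?) started$"
)


def transfer_key(line):
    """Return (zone, kind) for a matching log line, or None otherwise.

    The timestamp and client port are deliberately dropped so that the
    same transfer seen on different servers yields the same key.
    """
    m = XFER_RE.match(line.strip())
    if m is None:
        return None
    return (m.group("zone"), m.group("kind"))


sample = ("Aug 06 16:12:06.719 xfer-out: info: client 68.168.192.17#50989: "
          "transfer of 'colememorial.com/IN': AXFR-style IXFR started")
print(transfer_key(sample))
```

Keys built this way could then be compared across the three per-server files instead of the raw lines. Whether the time of day should be part of the key (for example rounded to the hour) depends on what "when" is supposed to mean in the final report.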
        Given that data set, I do not see how you are going to make a case for a network issue. The most I could see that data set implying is that there may be a difference in the number of IXFR/AXFR transfers started on two different servers; the data does not say where that difference comes from. Could one of the servers be overloaded and not accepting or initiating transfers? Could named be compiled differently on one of the servers, or the config file be different? Could the kernel on the server be at a different revision or patch level, or built with different compile options? It may be a better option to look at network data instead of application data to pinpoint network issues, IMHO.

        -Waswas
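
To make the "difference in the number of transfers started" point concrete, here is a rough Python sketch that tallies transfer types in one server's file. It assumes every line of interest ends in "<type> started" as in the excerpt earlier in the thread; the helper name `count_transfer_kinds` is hypothetical. As noted above, a difference in these counts would show that the servers behave differently, not why.

```python
from collections import Counter


def count_transfer_kinds(lines):
    """Count how many transfers of each type were started.

    Only lines ending in " started" are counted; the type is taken as
    the text between the last "': " and that suffix.
    """
    suffix = " started"
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.endswith(suffix):
            continue
        _, sep, tail = line.rpartition("': ")
        if sep:
            counts[tail[:-len(suffix)]] += 1
    return counts


lines = [
    "Aug 06 16:00:36.326 xfer-out: info: client 68.168.192.17#50963: "
    "transfer of '129.23.67.in-addr.arpa/IN': AXFR started",
    "Aug 06 16:06:09.468 xfer-out: info: client 68.168.192.17#50978: "
    "transfer of '78.168.68.in-addr.arpa/IN': AXFR-style IXFR started",
]
print(count_transfer_kinds(lines))
```

Running the same tally over each of the three per-server files and putting the counts side by side would give the comparison described above.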
