PerlMonks  

Re^4: find common data in multiple files

by mao9856 (Sexton)
on Dec 31, 2017 at 06:09 UTC ( #1206466=note )


in reply to Re^3: find common data in multiple files
in thread find common data in multiple files

I am very grateful for all the useful explanations you have provided. As you know, I am a complete beginner in Perl. I tried to modify the code you provided because it didn't work for me. I can follow the logic of your code, but please let me ask you something: the following code of yours gives no output when I use it with 25 files. Please tell me how to fix it.

    #!/usr/bin/env perl

    use strict;
    use warnings;
    use autodie;

    my @files = glob 'pm_1206312_in*';

    my %uniq;

    {
        open my $fh, '<', shift @files;
        while (<$fh>) {
            my ($k, $v) = split;
            $uniq{$k} = $v;
        }
    }

    for my $file (@files) {
        my %data;
        open my $fh, '<', $file;
        while (<$fh>) {
            my ($k, $v) = split;
            $data{$k} = $v;
        }
        for (keys %uniq) {
            delete $uniq{$_}
                unless exists $data{$_} and $uniq{$_} eq $data{$_};
        }
    }

    printf "%s %s\n", $_, $uniq{$_} for sort keys %uniq;

Replies are listed 'Best First'.
Re^5: find common data in multiple files
by kcott (Chancellor) on Jan 01, 2018 at 01:01 UTC

    In my original response, I showed the test files I created with the data from your OP. You didn't say what your filenames were; I had to make up names for my files. The pm indicates it's a PerlMonks file; the 1206312 is the node ID of your OP; the in is for input. Those are fairly standard naming conventions that I use; I very much doubt you use these same conventions.

    My intention was to help you learn; not to do your school/job/whatever work for free. Spend some time understanding the techniques I've used, instead of blindly copying my code and expecting it to work as is. I probably have a different directory structure to you; names I've given to test files (as seen here) won't be the same as filenames on your system; I may have used a CPAN module which you'll first need to install; there could be differences between software versions which require you to write your code slightly differently; you may even have local coding standards that you need to follow.

    If you're genuinely interested in learning, then you'll need to put in some effort yourself and do some troubleshooting. Investigate how %uniq changes as the script runs: from initial (my) declaration to final (printf) output. Do the same with other variables: look at their values and see how those change over the life of the program. If, on the other hand, you just want your work done for free, you're in the wrong place: see "How (Not) To Ask A Question" and, in particular, its "Do Your Own Work" section.
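A minimal sketch of that kind of troubleshooting, not part of the original reply: it dumps %uniq to STDERR after every step using the core Data::Dumper module. The sample data, the in-memory filehandles, and the helper names read_pairs and common_pairs are assumptions made only so the example runs on its own; with real files you would open each filename instead.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use Data::Dumper;
$Data::Dumper::Sortkeys = 1;    # stable key order, so successive dumps are easy to compare
$Data::Dumper::Indent   = 1;    # compact indentation

# Hypothetical sample data; each string stands in for the contents of one input file.
my @inputs = (
    "a 1\nb 2\nc 3\n",
    "a 1\nb 2\nc 9\n",
    "a 1\nd 4\n",
);

# Parse "key value" lines from a string into a hash (read via an in-memory filehandle).
sub read_pairs {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory filehandle: $!";
    my %pairs;
    while (<$fh>) {
        my ($k, $v) = split;
        next unless defined $v;    # skip blank or malformed lines
        $pairs{$k} = $v;
    }
    close $fh;
    return %pairs;
}

# Keep only the key/value pairs present in every data set,
# dumping %uniq to STDERR after each step.
sub common_pairs {
    my @texts = @_;
    return unless @texts;       # nothing to compare
    my %uniq = read_pairs(shift @texts);
    print STDERR "Initial: ", Dumper(\%uniq);
    my $n = 1;
    for my $text (@texts) {
        my %data = read_pairs($text);
        for (keys %uniq) {
            delete $uniq{$_}
                unless exists $data{$_} and $uniq{$_} eq $data{$_};
        }
        $n++;
        print STDERR "After data set $n: ", Dumper(\%uniq);
    }
    return %uniq;
}

my %common = common_pairs(@inputs);
printf "%s %s\n", $_, $common{$_} for sort keys %common;    # prints "a 1"
```

Watching the dumps shrink from three pairs to one shows where each key is dropped. If the very first dump is already empty, the problem lies in how the input is found or read, not in the comparison loop.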

    All of the code that I provided is very straightforward and documented. You should be able to find information on everything I've used in http://perldoc.perl.org/perl.html: I have this link bookmarked; I recommend you do the same.

    — Ken

Re^5: find common data in multiple files
by poj (Monsignor) on Dec 31, 2017 at 09:18 UTC

    Do all your 25 filenames start pm_1206312_in and if not what are they like ?

    poj

      "Do all your 25 filenames start pm_1206312_in and if not what are they like ?" All 25 files end with .txt and start with 4_os_. The middle part is different for each file, for example 4_os_abc.txt, 4_os_def.txt, and so on.

        "Do all your 25 filenames start pm_1206312_in and if not what are they like ?"

        I think poj's original question was intended to provoke thought. If the glob pattern you're using is 'pm_1206312_in*', no file whose name matches '4_os_*.txt' will be found. Please see the documentation for glob, and the links from it, for information on constructing glob filename patterns.
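As a rough illustration of the point about patterns (a sketch, not code from the thread): the script below creates a scratch directory containing files named like the ones described, then compares what each pattern matches. The scratch directory, the extra file notes.log, and the helper name files_matching are assumptions for the demonstration; File::Glob's ':bsd_glob' tag is used so that a directory path containing spaces is not split into several patterns.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;

use File::Glob ':bsd_glob';    # glob() that does not split its pattern on whitespace
use File::Spec;
use File::Temp 'tempdir';

# Scratch directory with hypothetical files named like those described in the thread.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw{4_os_abc.txt 4_os_def.txt 4_os_ghi.txt notes.log}) {
    open my $fh, '>', File::Spec->catfile($dir, $name);
    print {$fh} "key value\n";
    close $fh;
}

# Return the sorted list of paths in $dir whose names match $pattern.
sub files_matching {
    my ($where, $pattern) = @_;
    my @found = glob(File::Spec->catfile($where, $pattern));
    @found = sort @found;
    return @found;
}

my @wrong = files_matching($dir, 'pm_1206312_in*');    # pattern from the posted script
my @right = files_matching($dir, '4_os_*.txt');        # pattern fitting the described names

printf "%s matched %d file(s)\n", 'pm_1206312_in*', scalar @wrong;
printf "%s matched %d file(s)\n", '4_os_*.txt',     scalar @right;

# Fail loudly instead of silently producing no output.
die "No input files matched; check the glob pattern\n" unless @right;
```

Here the first pattern matches nothing and the second matches the three .txt files. Checking that the list returned by glob is non-empty before reading anything turns "no output" into an explicit error message.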


        Give a man a fish:  <%-{-{-{-<
