PerlMonks  

Re: Detect common lines between two files, one liner from shell

by b (Beadle)
on Dec 13, 2000 at 22:16 UTC (#46453)

Re: Re: Detect common lines between two files, one liner from shell
by merlyn (Sage) on Dec 13, 2000 at 22:34 UTC
      Can you be less of an ass about it?
        As soon as someone asks a question about what part needs explaining, yes.

        I didn't use anything that wasn't also in Learning Perl. Regex match, hash assignment, .= operator, @ARGV array, array in scalar context.

        Nothing tricky going on here!

        -- Randal L. Schwartz, Perl hacker

      Humans have a wonderful love for "Do What I Mean, Not What I Say". This applies both to programming languages (Perl) and to general conversation. I'm pretty sure he didn't want a roll call of who knew the answer to the question, even though that's how it was phrased. Since we rely on a programming language that goes out of its way to be nice to us, perhaps we should go out of our way to be nice to other people. :)

      -Ted
Re (tilly) 2: Detect common lines between two files, one liner from shell
by tilly (Archbishop) on Dec 13, 2000 at 23:01 UTC
    I will point to the pieces of documentation from which you can figure it out. I suggest locating it with perldoc, but I will also provide links to site documentation.

    The meaning of the -n and -e switches is explained in perlrun. This also tells you what $_ is during the script. As perl scans through the files, each filename is removed from @ARGV when it is opened, so the contents of @ARGV change. The append (.=) is being done in scalar context. In that context @ARGV gives you the number of elements it has. The pattern will match when the hash value ends with "10". The two filenames are on the command line. The output is redirected to a file that you look at.

    The trick is that for the hash value to get a 1 in it, the line must appear in the first file. For it to get a 0 in it, it must appear in the second. It will only match /10$/ on the first occurrence in the second file when it already appeared in the first.

      Hey, thanks a lot!

      I couldn't understand where the files are read from. There is no <> anywhere and the @ARGV is only the file names.


      The trick with the 0 and 1 is cool.


      Sorry, I'm new to this and I don't have much time to read the tutorials, but I still don't understand $seen{$_}. I know $_ is the current stream, and .= is like adding it at the end. But what is that hash, and what does /10$/ do? Also, this only matches exact whole lines; it doesn't look for the same word. What if I want to find a word that is in both files and print it out on screen? Thanks.
        I refuse to repeat the documentation until you have at least tried to read it. That is why it is there and it is faster for both of us if you take advantage of it.

        As for your additional question, the -n option is an implicit loop over the lines in both files. If you want to do words, then within each line you would need to loop over the words as well. But the same logic would work. (OTOH the algorithm will get rather inefficient. But oh well.)
