PerlMonks  

Re^2: Deleting duplicate lines from file

by Win (Novice)
on Feb 16, 2006 at 10:47 UTC ( [id://530682] )



in reply to Re: Deleting duplicate lines from file
in thread Deleting duplicate lines from file

I am in the process of trying to get this code to work. Could someone please offer me a detailed explanation of how it works, so that I can fix the problems I am having with it?

For example, I don't understand this line:

    my (@lines, %line_md5);

or these lines:

    my $digest = md5($_);
    unless ( exists $line_md5{$digest} ) {
        $line_md5{$digest} = 1;
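For context, here is a minimal, self-contained sketch of the deduplication idea those fragments come from, assuming the digests are kept in a hash keyed on each line's binary MD5. The sample input here is invented purely for illustration; in the real code the lines would come from a file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

# Invented sample input standing in for lines read from a file.
my @input = ("foo\n", "bar\n", "foo\n", "baz\n");

my (@lines, %line_md5);            # the two variables asked about
for (@input) {
    my $digest = md5($_);          # 16-byte binary checksum of the line
    unless ( exists $line_md5{$digest} ) {
        $line_md5{$digest} = 1;    # remember we have seen this digest
        push @lines, $_;           # keep only the first occurrence
    }
}
print @lines;                      # the duplicate "foo" line is dropped
```

So `my (@lines, %line_md5);` simply declares the output array and the seen-digest hash, and the `unless` block runs only when the current line's digest has never been seen before.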

Replies are listed 'Best First'.
Re^3: Deleting duplicate lines from file
by xorl (Deacon) on Feb 16, 2006 at 12:14 UTC

    my

    unless

    It looks like perldoc is down... so try these links:

    my

    unless (BTW does anyone know where perldoc.perl.org keeps the info about unless and other control structures?)

Re^3: Deleting duplicate lines from file
by turo (Friar) on Feb 16, 2006 at 14:31 UTC
    You hurt me ...
    program file
    that's enough? ...

    I recommend you read some Perl guide
    perl -Te 'print map { chr((ord)-((10,20,2,7)[$i++])) } split //,"turo"'
Re^3: Deleting duplicate lines from file
by Anonymous Monk on Feb 17, 2006 at 07:33 UTC
    So how much are you getting paid for this one? How much do we get?
Re^3: Deleting duplicate lines from file
by blazar (Canon) on Feb 17, 2006 at 07:57 UTC

    It works in nearly the same way as the code you already asked about, except that:

    1. the flow control is syntactically (but not logically!) different;
    2. it doesn't check the actual strings, but a checksum computed from them, which gives a sufficient condition for two of them to be different. That is, if the checksums are different, then the strings are different too, while the converse does not hold: different strings may have the same checksum (but we rely on the confidence that such occurrences are rare enough), hence my remarks.
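blazar's second point can be checked directly: equal strings always yield equal MD5 digests, while distinct strings yield distinct digests in practice (collisions are possible in principle, since MD5 is only 128 bits). A small sketch, with invented sample strings:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

my $d1 = md5("same line\n");
my $d2 = md5("same line\n");
my $d3 = md5("different line\n");

# Equal strings => equal digests, always.
print $d1 eq $d2 ? "equal digests\n" : "BUG\n";

# Distinct strings => (almost always) distinct digests; a collision
# is theoretically possible but vanishingly unlikely in practice.
print $d1 ne $d3 ? "distinct digests\n" : "collision\n";
```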

    Then you wrote:

    For example, I don't understand this line:
    my (@lines, %line_md5);

    Oh my! Please tell me you're joking!! I see that you have 204 writeups as of now. That's quite surprising... maybe you're not really programming in Perl, but in some vaguely similar language. What is it precisely that you do not understand?

      The question is:

      Why use

          my (@lines, %line_md5);

      when you can use

          my @lines; my %line_md5:

      I don't really see that there is any point in grouping them together like that. I am thinking of putting a moto on my messages: Keep it simple when possible. Go complex when required.
        Both are simple. Concentrate on your problem
        [slightly edited and fixed a typo: Perl 5 doesn't make that much use of the colon!]

        The question is: why use my (@lines, %line_md5); when you can use my @lines; my %line_md5;

        Because it's simpler, cleaner, easier to type and to read, and groups together correlated variables maybe?

        I don't really see that there is any point in grouping them together like that. I am thinking of putting a moto on my messages: Keep it simple when possible. Go complex when required.

        Indeed! Just be aware that the World's concept of "complex" is not bound to match Win's authoritative one...

        PS: here's a motto: "don't put motos on your messages". Oh, or... was it all about Moto? This would explain a lot of things... but no: PM is definitely about Perl.
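For what it's worth, the declaration question in this subthread is easy to verify: a grouped `my` declares exactly the same lexicals as two separate `my` statements. A minimal sketch, using the variable names from the thread:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Grouped declaration, as in the code under discussion:
my (@lines, %line_md5);    # declares both @lines and %line_md5 at once
# Exactly equivalent to the two separate statements:
#   my @lines;
#   my %line_md5;

push @lines, "first";
$line_md5{abc} = 1;
print scalar(@lines), " ", scalar(keys %line_md5), "\n";
```

Which form to prefer is purely a matter of style; the grouped form merely saves a statement and visually ties correlated variables together.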
