PerlMonks  

Re: col-uniq -- remove lines that match on selected column(s)

by repellent (Priest)
on Nov 06, 2008 at 00:39 UTC (#721869)


in reply to col-uniq -- remove lines that match on selected column(s)

This is nice :) Some comments:

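  • Why not make the default delimiter ' ' instead of the string 'default'? After all, someone could want the literal string 'default' as a delimiter.
  • @heldlines could be a scalar $heldline instead. As written, lines keep piling up in the array for as long as ( @cols == $match ) stays true, which could exhaust memory.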


Re^2: col-uniq -- remove lines that match on selected column(s)
by graff (Chancellor) on Nov 10, 2008 at 02:42 UTC
    Thanks! I almost agreed with your first suggestion, until I remembered why I used "default" as the, um, default value for the delimiter option. It seemed a lot less likely that someone would actually need the word "default" as a column delimiter, and rather more likely that they might want a single space character -- not in the "magical" sense of split ' ' but in the literal sense of split / /. That means a delimiter given on the command line should always be treated as a regex. This way you only get the "magical" split behavior when you don't supply the "-d regex" option, and you can still split on a single space if you want to.
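
For readers who have not run into the distinction drawn above, here is a minimal sketch of the two behaviors. It is not taken from col-uniq; the sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Made-up sample line with leading and doubled spaces.
my $line = "  alpha  beta gamma";

# "Magical" split: the string ' ' skips leading whitespace and
# splits on runs of whitespace, the way awk does.
my @magic = split ' ', $line;

# Literal split: the regex / / matches exactly one space character,
# so leading and doubled spaces produce empty fields.
my @literal = split / /, $line;

print scalar(@magic),   " fields: ", join( '|', @magic ),   "\n";
print scalar(@literal), " fields: ", join( '|', @literal ), "\n";
```

The first split yields three fields, while the second yields six, three of them empty.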

    As for your second point, it's true the code as originally posted could lead to an "out of memory" condition, if it got a very long stream of repeated lines. But I wanted an array that I could "pop" or "shift" off of in order to print a duplicate line only once. So to fix the possible memory consumption problem, instead of "pushing" onto the array every time there's a duplication, I just make sure the array never contains more than one element (and this happens to be the line that the user wants to see). That made the print statements a lot simpler too, which is nice.

    Update: after making that change to how I was using the "heldline" array, I finally realized that it doesn't have to be an array at all, which is exactly what you said. So I fixed it again (and I thank you).
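
The idea of holding one line in a scalar instead of pushing onto an array can be sketched roughly as follows. This is not graff's code: collapse_runs is a hypothetical helper, and keying on the first whitespace-separated column and keeping the last line of each run are assumptions made only for the example.

```perl
use strict;
use warnings;

# Hypothetical helper, not the col-uniq source: collapse each run of
# consecutive lines that share the same first column, keeping only the
# last line of the run.
sub collapse_runs {
    my @lines = @_;
    my ( @out, $heldline, $heldkey );
    for my $line (@lines) {
        my ($key) = split ' ', $line;
        $key = '' unless defined $key;    # blank line: use an empty key
        if ( defined $heldline and $key ne $heldkey ) {
            push @out, $heldline;         # run ended: emit the one held line
        }
        # Overwrite rather than push, so only one line is ever held,
        # no matter how long the run of duplicates gets.
        ( $heldline, $heldkey ) = ( $line, $key );
    }
    push @out, $heldline if defined $heldline;
    return @out;
}

my @result = collapse_runs( "a 1", "a 2", "a 3", "b 1", "c 1", "c 2" );
print "$_\n" for @result;
```

The @out array is there only to make the sketch easy to check. A stream filter would print the held line at the point where this code pushes it.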

        ... might want to use a single space character -- not in the "magical" sense of split ' ' but rather in the literal sense of split / / ...

      Ahh, I see. Then I'll suggest using $delim = undef and defined($delim) instead, as a means towards a more thorough solution.
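
A rough sketch of that suggestion, assuming the delimiter stays undef unless the user passes -d. The split_cols helper and the sample inputs are hypothetical, not part of col-uniq.

```perl
use strict;
use warnings;

# Hypothetical helper: $delim is undef unless the user supplied -d.
sub split_cols {
    my ( $line, $delim ) = @_;
    return defined $delim
        ? split( /$delim/, $line )    # user-supplied: always used as a regex
        : split( ' ', $line );        # no -d given: "magical" whitespace split
}

my @default = split_cols("  a  b");                    # no delimiter given
my @space   = split_cols( "  a  b", ' ' );             # literal single space
my @word    = split_cols( "xdefaulty", 'default' );    # the word itself

print scalar(@default), " ", scalar(@space), " ", scalar(@word), "\n";
```

With undef as the "not supplied" marker, no string value has to be reserved, so both ' ' and 'default' remain usable as real delimiters.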
