PerlMonks
Re: Remove lines that contain matching values from csv
by fluffyvoidwarrior (Monk) on Oct 09, 2012 at 15:52 UTC ( [id://998031] )
Perhaps I'm missing the point of your question, but... if the fields are ordered as you seem to suggest and the unique id is in position 1 (so you can short-circuit for speed; otherwise you'd have to regex a whole line, which is slower than anchoring at the start or using substr), can't you just treat the csv files as text files, i.e. a bunch of arrays? Parse each one and compare position 1 (the id field) against a cumulative output array for uniqueness. So long as the output is fewer than about 100,000 not-huge lines, Perl should do this in a few seconds per input file (if you've optimised your code). For obvious reasons, simple text-file handling is a lot faster than using CSV libraries. I wouldn't know how to do this with a one-liner, but I don't see why you would have to. Maybe it's not the neatest of solutions, but it is a guaranteed, self-contained solution for half an hour's work, and one that other people will easily understand in the future.
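A minimal sketch of what I mean (the helper name and the sample data are made up for illustration): anchor a regex at the start of each line to grab field 1 cheaply, and use a cumulative hash to keep only the first line seen per id.

```perl
use strict;
use warnings;

# dedup_by_first_field: hypothetical helper showing the approach above.
# Takes raw CSV lines; keeps the first occurrence of each id in field 1.
sub dedup_by_first_field {
    my %seen;    # cumulative set of ids already kept
    return grep {
        # anchored match: grab everything up to the first comma,
        # cheaper than parsing the whole line with a CSV library
        my ($id) = /^([^,]*)/;
        !$seen{$id}++;    # true only the first time this id appears
    } @_;
}

my @lines = ( "1,foo\n", "2,bar\n", "1,baz\n" );
my @uniq  = dedup_by_first_field(@lines);
print @uniq;    # first occurrence of each id survives
```

In a real script you'd read each input file line by line and push the survivors to your output array, but the anchored-match-plus-hash idea is the whole trick. Note this only works if ids never contain embedded commas or quoting, which is the assumption that makes skipping a CSV parser safe.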
In Section: Seekers of Perl Wisdom