Hi, I have a tab-separated file that may run up to 5000 lines. The file format is something like this:

XXXXXS331632    XXXXXS331632    female  40087   a5
XXXXXS331632    XXXXXS331632    female  47735   a5
XXXXXS331681    XXXXXS331681    male    40087   e6
XXXXXS331681    XXXXXS331681    male    47735   e6
XXXXXS331856    XXXXXS331856    male    40177   d1
XXXXXS331856    XXXXXS331856    male    47737   d1

What I really want to do is delete any row that appears twice, ignoring the difference in the 4th column (40087 vs. 47735). I can drop either the first or the second entry. In the end, I would like a file with the duplicate(?) entries removed.
Something like this:

XXXXXS331632    XXXXXS331632    female  40087   a5
XXXXXS331681    XXXXXS331681    male    40087   e6
XXXXXS331856    XXXXXS331856    male    40177   d1

Any suggestions, please?
Thanks for your time.
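
One possible approach, offered as a minimal sketch rather than a definitive answer: build a key from every column except the 4th and keep only the first row seen for each key, using a hash to remember which keys have already appeared. The sub name dedupe_rows and the in-memory sample are hypothetical, chosen for illustration. The sketch assumes that "duplicate" means all fields other than the 4th match, and that the first occurrence is the one to keep.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the input rows with duplicates removed, preserving order.
# Two rows count as duplicates when every tab-separated field except
# the 4th one matches (an assumption about what "duplicate" means here).
sub dedupe_rows {
    my @lines = @_;
    my %seen;    # keys already encountered
    my @kept;    # first row for each key, in input order
    for my $line (@lines) {
        my $row = $line;
        chomp $row;
        my @fields = split /\t/, $row, -1;
        splice @fields, 3, 1 if @fields > 3;    # drop the 4th column from the key
        my $key = join "\t", @fields;
        push @kept, $row unless $seen{$key}++;
    }
    return @kept;
}

# Sample rows taken from the question, joined with real tabs.
my @sample = map { join "\t", @$_ } (
    [qw(XXXXXS331632 XXXXXS331632 female 40087 a5)],
    [qw(XXXXXS331632 XXXXXS331632 female 47735 a5)],
    [qw(XXXXXS331681 XXXXXS331681 male 40087 e6)],
    [qw(XXXXXS331681 XXXXXS331681 male 47735 e6)],
    [qw(XXXXXS331856 XXXXXS331856 male 40177 d1)],
    [qw(XXXXXS331856 XXXXXS331856 male 47737 d1)],
);

print "$_\n" for dedupe_rows(@sample);
```

To run this against a real file, the sample could be replaced by lines read from a filehandle, for example dedupe_rows(<$fh>) after opening the file for reading. At 5000 lines, holding everything in memory should not be a concern. If the last matching row should win instead of the first, storing rows in the hash under the key and overwriting on each match would be one way to do it, at the cost of tracking the key order separately.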

In reply to Remove duplicate lines in a file by Anonymous Monk



	PerlMonks