PerlMonks  

Re^5: Reading files n lines a time

by ww (Archbishop)
on Dec 07, 2012 at 19:22 UTC


in reply to Re^4: Reading files n lines a time
in thread Reading files n lines a time

SuperSearch (done already; here's the link: ?node_id=3989;BIT=FASTA) will give you a short list of recent discussions on dealing with FASTA files.

My notion that your paragraphing might be identifiable with a regex is pretty useless here. However, there's no reason you can't read 2 lines at a time and use hashes to ensure that the values of the two "neighbors" are distinct.

That too, however, breaks down if the dups appear other than adjacent to one another, given the size of your data.
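As a rough illustration of the two-lines-at-a-time idea, here is a minimal sketch. It assumes every record is exactly two lines (a header, then one sequence line), which may not hold for your files. The sub name `drop_adjacent_dups` and the sample data are made up for this example. The hash only ever remembers the previous record, so memory use stays flat however large the file is, and for the same reason only adjacent dups are caught.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread). Assumes every record is exactly
# two lines: a header line followed by a single sequence line.
sub drop_adjacent_dups {
    my ($fh) = @_;
    my %prev;    # only ever holds the previous record's sequence
    my @kept;
    while (defined(my $header = <$fh>)) {
        my $seq = <$fh>;
        last unless defined $seq;       # incomplete final record: stop
        chomp($header, $seq);
        next if exists $prev{$seq};     # same sequence as its neighbor
        push @kept, [ $header, $seq ];
        %prev = ($seq => 1);            # forget everything but this record
    }
    return @kept;
}

# Demo on an in-memory "file" with made-up data.
my $sample = ">a\nACGT\n>b\nACGT\n>c\nTTTT\n>d\nACGT\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
print "$_->[0]\n$_->[1]\n" for drop_adjacent_dups($fh);
close $fh;
```

In the demo, record `>b` is dropped because it repeats its neighbor `>a`, while `>d` is kept even though its sequence appeared earlier, because the repeat is not adjacent.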

So if none of the above helps, you may wish to read about BioPerl in both the Wikipedia article, http://en.wikipedia.org/wiki/BioPerl, and on the project page, http://www.bioperl.org/wiki/Main_Page.

Replies are listed 'Best First'.
Re^6: Reading files n lines a time
by naturalsciences (Beadle) on Dec 07, 2012 at 19:51 UTC

    Yes, it would break down, but right now I'm specifically looking for neighbouring dupes :) (There is an actual reason to suspect they are positioned that way in those files.) It wouldn't be hard for me to write push/shift code that would slide a 4-line reading frame over the whole text file.

    But I ran totally into a ditch trying to do the same so that the frame wouldn't "slide" but would be "lifted" four lines at a time.

    Then I could just

    <code> if ($frame[1] !~ m/\Q$frame[3]\E/) { print @frame } </code>

    But for some reason I mess up the populating, emptying, and moving of the frame so readily.
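For what it's worth, one way to get a "lifted" frame is to refill the frame from scratch on every pass instead of shifting it. The following is only a sketch: `next_frame` and `distinct_frames` are hypothetical names, the sample data is invented, and it compares the 2nd and 4th lines with `ne` (plain string inequality) rather than a regex match, which is a substitution on my part.

```perl
use strict;
use warnings;

# Hypothetical helper: return the next $n lines from $fh
# (fewer at end of file, an empty list when nothing is left).
sub next_frame {
    my ($fh, $n) = @_;
    my @frame;
    while (@frame < $n) {
        my $line = <$fh>;
        last unless defined $line;
        push @frame, $line;
    }
    return @frame;
}

# Hypothetical helper: collect the full 4-line frames whose 2nd and 4th
# lines differ. String inequality (ne) is used here rather than a regex.
sub distinct_frames {
    my ($fh) = @_;
    my @out;
    # The frame is refilled from scratch each pass, so it jumps four lines.
    while (my @frame = next_frame($fh, 4)) {
        last if @frame < 4;              # ignore an incomplete final frame
        push @out, [@frame] if $frame[1] ne $frame[3];
    }
    return @out;
}

# Demo on an in-memory "file" with made-up data.
my $sample = ">a\nACGT\n>b\nACGT\n>c\nTTTT\n>d\nGGGG\n";
open my $in, '<', \$sample or die "Cannot open in-memory file: $!";
print @$_ for distinct_frames($in);
close $in;
```

One thing to keep in mind with this layout: lines that straddle two frames (the 4th line of one frame and the 2nd line of the next) are never compared with each other.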

    edit: disregard all that - a sliding frame is exactly what I need. So I guess we're done here :D. Thank you all! Learned a lot of other stuff on the side also :)
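For anyone landing here later, a push/shift sliding frame might look roughly like the sketch below. `slide_frame` is a hypothetical helper, the sample data is invented, and the check that a frame starts on a `>` header line is an assumption about two-line FASTA records, not something stated in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: slide a window of $size lines over $fh, one line
# at a time, calling $callback with each full window.
sub slide_frame {
    my ($fh, $size, $callback) = @_;
    my @frame;
    while (defined(my $line = <$fh>)) {
        chomp $line;
        push @frame, $line;                # new line enters at the bottom
        shift @frame if @frame > $size;    # oldest line leaves at the top
        $callback->(@frame) if @frame == $size;
    }
    return;
}

# Demo on an in-memory "file" with made-up data.
my $sample = ">a\nACGT\n>b\nACGT\n>c\nTTTT\n";
open my $in, '<', \$sample or die "Cannot open in-memory file: $!";
my @dupes;
slide_frame($in, 4, sub {
    my @f = @_;
    # Assumption: only frames that start on a header line hold two records.
    return unless $f[0] =~ /^>/;
    push @dupes, "$f[0] and $f[2]" if $f[1] eq $f[3];
});
close $in;
print "neighbouring dupes: $_\n" for @dupes;
```

Because the frame moves one line at a time, every pair of neighbouring records ends up in some frame together, which the four-lines-at-a-time version cannot guarantee.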
