PerlMonks  

Yet another way to approach this sort of problem is to “do it the way that they did when they were using punched cards.” (For instance, how did Herman Hollerith help the US Census, all those years ago?) The answer in this case would be to first concatenate the contents of all 197 files into a single file, then to sort that file. (All of which you can probably do, so far, with no program-writing at all.)

I believe that you said that there will be about 197 * 8000 records in that file.   Well, after you have sorted it, all occurrences of any given 21mer will now be consecutive, and you can tally them all up with a trivial program that reads the file line-by-line and notices each time the 21mer-value changes.
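The tallying step described above might look something like the following. This is only a sketch, written in Python for brevity: the function name `tally_sorted` and the short sample strings (standing in for real 21mers) are made up for illustration, and it assumes one record per line in an already-sorted input.

```python
def tally_sorted(lines):
    """Yield (mer, count) pairs from an iterable of already-sorted lines.

    Only the current value and its running count are held in memory,
    so the size of the input file does not matter.
    """
    current = None
    count = 0
    for line in lines:
        mer = line.strip()
        if not mer:
            continue  # ignore blank lines
        if mer != current:
            # The value changed: report the finished run, start a new one.
            if current is not None:
                yield current, count
            current, count = mer, 0
        count += 1
    if current is not None:
        yield current, count  # report the final run


# Hypothetical sample data; real input would be the sorted, concatenated file.
sample = sorted(["ACGT", "TTGA", "ACGT", "GGCC", "ACGT"])
counts = list(tally_sorted(sample))
for mer, n in counts:
    print(f"{mer}\t{n}")
```

Passing an open file handle instead of `sample` works the same way, since the function only iterates over lines.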

Yet another approach, though, is far easier: place all of the data into a database table, then simply use an SQL query! Once the data is loaded, no programming is required to obtain the counts that you need, because the SQL engine will figure out what to do and simply do it. (Spreadsheets can also readily take input from a query-result.)

SELECT MER, COUNT(*) FROM MERS GROUP BY MER
(SQLite is an excellent engine to use for such purposes because there is no server and its database is a simple file.)
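For what it's worth, here is a minimal sketch of the SQLite route using Python's standard `sqlite3` module. The table and column names `MERS` and `MER` come from the query above; the helper name `count_mers`, the sample values, and the added `ORDER BY` (only there to make the output order predictable) are my own assumptions. It defaults to an in-memory database; passing a file path would give the "simple file" database instead.

```python
import sqlite3


def count_mers(mers, db_path=":memory:"):
    """Load the values into a MERS table and return (mer, count) rows."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS MERS (MER TEXT NOT NULL)")
        # executemany expects one parameter tuple per row.
        conn.executemany(
            "INSERT INTO MERS (MER) VALUES (?)",
            ((m,) for m in mers),
        )
        conn.commit()
        cursor = conn.execute(
            "SELECT MER, COUNT(*) FROM MERS GROUP BY MER ORDER BY MER"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Hypothetical sample data standing in for the real 21mers.
rows = count_mers(["ACGT", "TTGA", "ACGT", "GGCC", "ACGT"])
for mer, n in rows:
    print(f"{mer}\t{n}")
```

Unlike the sort-then-tally approach, the input here does not need to be sorted first; the engine does the grouping.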


In reply to Re: Out of Memory when generating large matrix by sundialsvc4
in thread Out of Memory when generating large matrix by cathyyihao
