What would be a good way to learn such tricks as indexing, etc.?

In the context of setting up a table on a reasonably mature relational database engine (MySQL, PostgreSQL, etc.), creating an index on a given field is simply a matter of telling the engine that you want that field indexed. The engine handles the rest for you; that's one of the things that makes database engines so attractive. For example:

CREATE TABLE genome (
    id     int AUTO_INCREMENT PRIMARY KEY,
    start  int,
    end    int,
    strand varchar(6),
    ...            -- include other fields as needed
    INDEX (start),
    INDEX (end)
    ...            -- index other fields that are often useful in queries
)
Once you declare that those two integer fields are to be indexed, MySQL takes care of all the back-end work so that queries like the following spend as little time as possible searching and comparing:
-- assume your current input data record has a "position" value of 8765:
select * from genome where start <= 8765 and end >= 8765
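From Perl, a lookup like this would typically be run through DBI with a prepared statement, executed once per input record. The following is only a minimal sketch under stated assumptions: it uses an in-memory SQLite database (via DBD::SQLite, which must be installed alongside DBI) as a stand-in for MySQL, so the indexes are created with separate CREATE INDEX statements. The index names, the helper rows_covering, and the three sample rows are made up for illustration.

```perl
use strict;
use warnings;
use DBI;

# Assumption: an in-memory SQLite database stands in for the MySQL server.
my $dbh = DBI->connect( 'dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 } );

# "end" is quoted because it is an SQL keyword.
$dbh->do(<<'SQL');
CREATE TABLE genome (
    id      INTEGER PRIMARY KEY,
    start   INTEGER,
    "end"   INTEGER,
    strand  VARCHAR(6)
)
SQL
$dbh->do('CREATE INDEX genome_start ON genome (start)');
$dbh->do('CREATE INDEX genome_end   ON genome ("end")');

# Hypothetical sample rows: (start, end, strand).
my $ins = $dbh->prepare(
    'INSERT INTO genome (start, "end", strand) VALUES (?, ?, ?)');
$ins->execute(@$_)
    for ( [ 100, 500, '+' ], [ 8000, 9000, '-' ], [ 8700, 8800, '+' ] );

# Prepare the query once, then execute it for each "position" value.
my $sth = $dbh->prepare(
    'SELECT id, start, "end", strand FROM genome '
  . 'WHERE start <= ? AND "end" >= ? ORDER BY id' );

# Return every row whose [start, end] range contains $position.
sub rows_covering {
    my ($position) = @_;
    $sth->execute( $position, $position );
    return @{ $sth->fetchall_arrayref };
}

for my $row ( rows_covering(8765) ) {
    print join( "\t", @$row ), "\n";
}
```

Against a real MySQL server only the connect string and the table setup would change; the prepare-once, execute-per-record pattern stays the same.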
As for the benchmark results you reported, do you consider those to be good or bad? How much longer is that than the time it takes just to read the larger input file and do little else (e.g. how long does it take to run a simple one-liner like:  perl -ne '$sz+=length(); END {print $sz,"\n"}' )?
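To get that baseline as a number that can be set directly beside the benchmark, the same read-and-count loop can be wrapped in a small timing script. This is only a sketch using core modules (Time::HiRes, File::Temp); the helper name read_baseline is hypothetical, and the temporary sample file merely stands in for the real input file.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);
use File::Temp qw(tempfile);

# Read a file line by line, doing nothing but summing line lengths.
# Returns (total characters read, elapsed wall-clock seconds).
sub read_baseline {
    my ($path) = @_;
    my $t0 = time();
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $sz = 0;
    while ( my $line = <$fh> ) {
        $sz += length $line;
    }
    close $fh;
    return ( $sz, time() - $t0 );
}

# Stand-in input: a temporary file of 1000 short lines.
my ( $tmp_fh, $tmp_path ) = tempfile( UNLINK => 1 );
print {$tmp_fh} "chr1\t8765\n" for 1 .. 1000;
close $tmp_fh;

my ( $chars, $secs ) = read_baseline($tmp_path);
printf "%d characters read in %.4f seconds\n", $chars, $secs;
```

Pointing read_baseline at the larger input file gives the floor that any approach has to pay just for I/O; whatever the full run takes beyond that is the cost of the searching and comparing.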

In reply to Re^5: Comparing and getting information from two large files and appending it in a new file by graff
in thread Comparing and getting information from two large files and appending it in a new file by perlkhan77
