PerlMonks  

glasswalk3r,
First of all, don't do this:
$ ./test < textfile | sort | uniq
Perl has more than enough tools to do the job that is done by the sort and uniq programs. System calls are expensive, and sometimes the speed of those programs doesn't pay for the cost of invoking them.

I think you are making the mistake of repeating what you have heard others say without really understanding it yourself. In this particular case, sort and uniq are likely compiled C programs optimized for a single task and are far superior to Perl. While system calls can be expensive, that is just not the case here.
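
For comparison, here is a minimal sketch of what the in-Perl equivalent of sort | uniq might look like. The name sorted_unique and the sample data are made up for illustration, and it uses plain string comparison, so it will not reproduce any locale-specific ordering of sort(1).

```perl
use strict;
use warnings;

# Return the distinct lines in sorted order, roughly like `sort | uniq`.
# sorted_unique is a hypothetical helper name, not from the original thread.
sub sorted_unique {
    my @lines = @_;
    my %seen;
    # Keep only the first occurrence of each line.
    my @unique = grep { !$seen{$_}++ } @lines;
    my @sorted = sort @unique;
    return @sorted;
}

my @lines = ("b\n", "a\n", "b\n", "c\n", "a\n");
print sorted_unique(@lines);    # prints a, b, c on separate lines
```

To filter the output of a program instead, print sorted_unique(<STDIN>) would do. Note that this holds every distinct line in memory, which is exactly the kind of trade-off to measure before assuming it beats the pipeline.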

open(IN,"<$file") or die "Cannot read $file: $!\n";
my @content = <IN>;
close(IN);
This will be faster than using a while block.

Well, it may speed things up at the expense of memory. I do not know how many lines are in the file, but if individual strings are 9 million characters long this may definitely be the wrong way to go. You still need to loop through the array, so it does not avoid the need to loop. The speed savings come from disk I/O.
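
For reference, a minimal sketch of the line-at-a-time alternative, which keeps only the current line in memory. The sub name count_lines and the counting body are placeholders for whatever per-line work is actually needed, and a lexical filehandle with three-argument open is used here rather than the bareword IN.

```perl
use strict;
use warnings;

# Process a file one line at a time instead of slurping it into an array.
# count_lines is a hypothetical name; the counter stands in for real work.
sub count_lines {
    my ($file) = @_;
    open(my $in, '<', $file) or die "Cannot read $file: $!\n";
    my $count = 0;
    while (my $line = <$in>) {
        chomp $line;
        $count++;    # replace with the real per-line processing
    }
    close($in);
    return $count;
}
```

This does not help if the whole file is a single 9 million character line, since that one line still has to be held in memory.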

Try as much as you can to avoid using next in for loops. Look up the Schwartzian Transform to see how to improve your code. Try using @sequence = split( //, $sequence ) instead of another loop.

I am not exactly sure why you think using next inside a for loop is a bad thing. If it is possible to eliminate the unwanted elements prior to entering the loop, then it is advantageous because you don't have a conditional on every iteration. That is seldom the case. The ST is used to speed up sorting routines when the comparison of 2 elements is expensive. It looks out of place in the context of the rest of what you said, so you should probably explain why what you are saying has relevance.
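
To make it clear what the ST is actually for, here is a minimal sketch. The sub expensive_key is a made-up stand-in for a costly key computation (here it is just length), and st_sort is likewise a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an expensive per-element computation.
sub expensive_key {
    my ($s) = @_;
    return length $s;
}

# Schwartzian Transform: compute each key once, sort on the cached key,
# then strip the key off again. Ties are broken by string comparison.
sub st_sort {
    my @items = @_;
    return map  { $_->[1] }
           sort { $a->[0] <=> $b->[0] or $a->[1] cmp $b->[1] }
           map  { [ expensive_key($_), $_ ] } @items;
}

print join(' ', st_sort(qw(ccc a bb dddd))), "\n";    # prints: a bb ccc dddd
```

The gain is that expensive_key runs once per element rather than twice per comparison. That only matters when there is a sort with a costly comparison in the first place.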

Finally, the real problem here is the numbers involved. Using a brute force algorithm, no matter how well it is tuned, to find the longest common substring of a 9 million digit number is going to be extremely slow. If you are interested in the math I will be happy to provide it.
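
As a rough back-of-the-envelope sketch only (this is not the full math), assume brute force means comparing every pair of distinct start positions in the string. The sub name start_position_pairs and the rate of 1e9 checks per second are assumptions picked for illustration.

```perl
use strict;
use warnings;

# Assumption: brute force examines every pair of distinct start positions.
# For a string of length n that is n * (n - 1) / 2 pairs.
sub start_position_pairs {
    my ($n) = @_;
    return $n * ($n - 1) / 2;
}

my $n     = 9_000_000;
my $pairs = start_position_pairs($n);
printf "%.3e pairs of start positions\n", $pairs;

# Assumption: 1e9 checks per second and a single check per pair (optimistic,
# since extending a match costs additional character comparisons).
printf "%.1f hours at 1e9 checks per second\n", $pairs / 1e9 / 3600;
```

That is roughly 4 x 10^13 pairs before counting the work of extending each match, which is why tuning the brute force approach is unlikely to be enough.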

Cheers - L~R


In reply to Re^2: Longest repeated string... by Limbic~Region
in thread Longest repeated string... by Yzzyx
