PerlMonks  

ozgurp,
Unfortunately, I am not a Perl guru myself, so I can only offer some hints. Typically, a better algorithm is what will make your code run faster. Sometimes you can trade memory for time by caching (see Memoize by Dominus). When you want to evaluate how a tweak has affected performance, look into Benchmark. The things to remember are to run many iterations to smooth out "flukes", to vary your data because code behaves differently depending on its input, and to test on a system at rest so the results are not influenced by other running programs. There is also Devel::DProf for profiling.
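As a rough illustration of the caching and benchmarking hints above, here is a minimal sketch. The function slow_square and the input list are hypothetical stand-ins, not anything from the original script; the only library calls used are memoize (with its INSTALL option) from Memoize and cmpthese from Benchmark.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use Memoize;

# Hypothetical stand-in for an expensive function whose result depends
# only on its argument (the precondition for caching it).
sub slow_square {
    my ($n) = @_;
    my $total = 0;
    $total += $n for 1 .. $n;    # deliberately wasteful way to get $n * $n
    return $total;
}

# Install a cached version as fast_square() and leave slow_square() alone.
memoize('slow_square', INSTALL => 'fast_square');

# Vary the input so that one lucky value does not decide the result.
my @inputs = (10, 250, 1_000, 250, 10, 1_000);

# A negative count means "run each candidate for at least that many
# CPU seconds", which gives plenty of iterations for the fast case.
cmpthese(-1, {
    plain    => sub { slow_square($_) for @inputs },
    memoized => sub { fast_square($_) for @inputs },
});
```

The table that cmpthese prints will differ from machine to machine, so treat the numbers as relative. Caching only pays off when the same arguments really do recur in your data.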

Let me point out a few things in your code that may or may not help you.

  • my @FileArray = ("c:/ultimate1_it2.f06"); - I am assuming it is written this way because you might have numerous file names in this array? If not, there is no need to make it an array.
  • &Initial_Sort(); - This is normally considered bad form. Use either the & or the (), not both - and the tendency is to lean towards ().
  • my $Size_Of_FileArray = @FileArray; - This is probably not needed and is likely to break. Using @FileArray in scalar context already gives you the number of elements whenever you need it. The problem with keeping a separate variable is that if you alter @FileArray, you have to remember to update $Size_Of_FileArray.
  • for (my $i =0; $i<= $#FileArray; $i++) { - This is usually done as for (0 .. $#FileArray), or if you don't like dealing with $_ (nested loops are also a good reason), you can use for my $index (0 .. $#FileArray).
  • The regex engine is expensive. It looks like at the beginning of parsing you are trying to throw away some lines you aren't interested in. The problem is that this check is performed on every single line of the file. It would be better to create a flag variable: test whether the flag is set and, only if it is not, check for the lines you want to avoid, setting the flag once you are past them. This way, for the rest of the file only a variable in memory is checked.
  • if ( ($in =~ m/^0\s+(.+?)\s+SUBCASE/) || ($in =~ m/^0\s+(.+?)\s+SUBCOM/) || ($in =~ m/^0\s+(.+?)\s+SYM/) || ($in =~ m/^0\s+(.+?)\s+SYMCOM/) || ($in =~ m/^0\s+(.+?)\s+REPCASE/) ) { - you could probably reduce the number of invocations of the regex engine by folding alternatives together, e.g. \s+SUB(CASE|COM) and \s+SYM(COM)?
  • You may also want to consider index if you do not care where something appears in a line but just want to know whether it is present. I would recommend benchmarking this, as the data you are checking usually dictates which will be faster.
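The flag idea and the combined regex from the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: the sub name extract_case_titles, the "END OF HEADER" marker, and the sample data are all hypothetical, since the real header test depends on what the .f06 file looks like. Note also that a single alternation picks the earliest keyword on a line, which could differ from five separate matches if a line ever contained more than one keyword.

```perl
use strict;
use warnings;

# Hypothetical marker for the last line to be thrown away; the real test
# depends on the actual file layout.
my $HEADER_END_RE = qr/END OF HEADER/;

# One pattern instead of five separate matches. SYM(?:COM)? covers both
# SYM and SYMCOM, and SUB(?:CASE|COM) covers SUBCASE and SUBCOM.
my $CASE_RE = qr/^0\s+(.+?)\s+(?:SUB(?:CASE|COM)|SYM(?:COM)?|REPCASE)/;

sub extract_case_titles {
    my ($fh) = @_;
    my $past_header = 0;    # flag: set once the unwanted lines are behind us
    my @titles;
    while (my $in = <$fh>) {
        if (!$past_header) {
            # The header regex runs only until the flag is set.
            $past_header = 1 if $in =~ $HEADER_END_RE;
            next;
        }
        push @titles, $1 if $in =~ $CASE_RE;
    }
    return @titles;
}

# Usage with an in-memory filehandle and made-up sample lines.
my $sample = join "\n",
    'some banner text',
    '0     NOT WANTED     SUBCASE 9',
    '*** END OF HEADER ***',
    '0     LOAD CASE A     SUBCASE 1',
    '      1.0   2.0   3.0',
    '0     COMBO B     SYMCOM 2',
    '0     REP C    REPCASE 3',
    '';
open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
print "$_\n" for extract_case_titles($fh);
close $fh;
```

Whether this is actually faster than the original five matches is something to confirm with Benchmark on real data.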

    Now, I am sure other monks would be able to look at the data you provided and write a very fast and elegant script to do what you are asking.

    Cheers - L~R


    In reply to Re: Re: Re: Fast reading and processing from a text file - Perl vs. FORTRAN by Limbic~Region
    in thread Fast reading and processing from a text file - Perl vs. FORTRAN by ozgurp
