Hi, I need your expert advice on tuning the Perl script below. It currently takes more than 30 minutes to extract information from a raw file that has nearly 2 million records. The script meets these requirements:

1) It scans a log file and writes the final result to a file called "test.txt".
2) It replaces some strings in the log file with more generic terms.
3) It finds the record number of the affected row in the log file and uses that number to query the main source file for that record.
4) It builds a string containing the information from the log file, followed by the affected row from the source.
5) The result is written to the output file test.txt.

=======================code ============================
#!/usr/bin/perl
$read_file = "$ARGV[0]";
$read_source = "$ARGV[1]";
open(LOGFILE,$read_file) or die "An Error Occured : $!";
open(REPORT,">/retsit/systematics/test.txt");
$str1 = 'failed all WHEN clauses';
$str2= 'CUST SEGEMENT IS EMPTY';
$str3= 'unique constraint';
$str4= 'DUPLICATE RECORD';
while(<LOGFILE>)
{
    if ($_ =~ /Record/)
    {
        $_ =~ s/$str1/$str2/g;
        $_ =~ s/$str3/$str4/g;
        $ind1 = index($_,'Record')+6;
        $len2 =index($_, ':')-6;
        $recnum = substr($_,$ind1,$len2);
        $recnum =~ s/^\s+|\s+$//g ;
        $strx = "sed -n '".$recnum."p' ".$read_source;
        $str5 = `$strx`;
        $_ .= '|'."$recnum"."$str5\n";
        print REPORT $_;
    }
}
close(LOGFILE);
close(REPORT);
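A likely cause of the run time is the backtick call: every matching log line starts a new sed process, and sed -n 'Np' reads the source file to the end each time. A minimal sketch of one alternative is to read the source file once into an array and look rows up by index. The sub names (extract_recnum, build_report_line), the assumed log layout "Record N: ...", and the choice to put the log text and the source row on one output line are assumptions for illustration, not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pull the record number out of a log line such as "Record 42: Rejected ...".
# ASSUMPTION: the number sits between "Record" and the first colon.
sub extract_recnum {
    my ($line) = @_;
    return $line =~ /Record\s+(\d+)\s*:/ ? $1 : undef;
}

# Apply the two substitutions from the original script and append the
# matching source row. $source is an array ref of source-file lines, so
# record N is $source->[N - 1] (sed line numbers are 1-based).
sub build_report_line {
    my ($line, $source) = @_;
    my $recnum = extract_recnum($line);
    return unless defined $recnum;

    chomp $line;
    $line =~ s/failed all WHEN clauses/CUST SEGEMENT IS EMPTY/g;
    $line =~ s/unique constraint/DUPLICATE RECORD/g;

    # An out-of-range record number yields an empty row, as sed would.
    my $row = ($recnum >= 1 && $recnum <= @$source) ? $source->[$recnum - 1] : '';
    chomp $row;

    # ASSUMPTION: one output line per record, "<log text>|<recnum><row>".
    return "$line|$recnum$row\n";
}

# Runs only when called as: script.pl LOGFILE SOURCEFILE
if (@ARGV == 2) {
    my ($log_file, $source_file) = @ARGV;

    # Read the source once instead of once per log record.
    open(my $src, '<', $source_file) or die "Cannot open $source_file: $!";
    my @source = <$src>;
    close($src);

    open(my $log, '<', $log_file) or die "Cannot open $log_file: $!";
    open(my $out, '>', '/retsit/systematics/test.txt')
        or die "Cannot open report file: $!";

    while (my $line = <$log>) {
        my $report = build_report_line($line, \@source);
        print {$out} $report if defined $report;
    }

    close($log);
    close($out) or die "Cannot close report file: $!";
}
```

This trades memory for speed: the whole source file is held in memory. If that is too much for 2 million records, another option is to collect the needed record numbers from the log first and then make a single pass over the source file, keeping only those rows.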

In reply to Need help to fine tune perl script to make it faster( currently taking more than 30 minutes) by anujajoseph
