PerlMonks  

Re^3: Is 100% CPU utilisation during a procees is aproblem?

by BrowserUk (Patriarch)
on Nov 25, 2014 at 17:47 UTC


in reply to Re^2: Is 100% CPU utilisation during a procees is aproblem?
in thread Is 100% CPU utilisation during a procees is aproblem?

By far the greatest amount of time is being spent evaling the data structure into existence:

my $e = eval( $p );

There's not a lot you can do about that. Writing your own parser would be complicated, error prone and almost certainly eons slower.
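To illustrate what that eval is doing, here is a minimal sketch run against a made-up record. The real input is not shown in this thread, so the sample string and its values are assumptions. It applies the same substitutions the script uses to turn the JSON-like text into Perl syntax, then evals the result and checks $@. The leading + is an addition in this sketch, there to make sure Perl parses the braces as an anonymous hash rather than a block.

```perl
use strict;
use warnings;

# Hypothetical sample record; field names follow the script, values are invented.
my $p = q({"reportingGroups":[{"subscriberGroupName":"planA","absoluteAccumulated":{"bidirVolume":"1024"}}]});

# The same rewrites the script applies before eval'ing.
$p =~ s/:\{/ => {/g;    # "key":{  becomes  "key" => {
$p =~ s/:\[/ => [/g;    # "key":[  becomes  "key" => [
$p =~ s/"/'/g;          # double quotes to single quotes (no interpolation)
$p =~ s/':/'=>/g;       # 'key':'value'  becomes  'key'=>'value'

# String eval builds the data structure; $@ is set if the text is not valid Perl.
my $e = eval "+$p";
die "could not eval record: $@" if $@;

print $e->{reportingGroups}[0]{subscriberGroupName}, "\n";
print $e->{reportingGroups}[0]{absoluteAccumulated}{bidirVolume}, "\n";
```

Bear in mind that string eval runs whatever code is in the text, so this only makes sense for input you trust, and values that themselves contain quotes or colons would confuse the substitutions.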

The second largest amount of time is being spent building and executing those two complicated SQL statements:

$stmt = qq(INSERT INTO ACCU_USAGE VALUES ($Mobil ...
...
$stmt = qq(INSERT INTO ACCU_USAGE VALUES ($MobileNumber, ...
...
$rv = $dbh->do($stmt) or die $DBI::errstr;
...

Now that you can do something about. By pre-preparing the statement using placeholders:

my $sql = $dbh->prepare( q[ INSERT INTO ACCU_USAGE VALUES( ?, ?, ?, ?, ?, ?, ?, ? ) ] )
    or die $DBI::errstr;

and then binding the vars when executing the statement:

$rv = $sql->execute(
    $MobileNumber,
    $value->{'subscriberGroupName'},
    $value->{'absoluteAccumulated'}->{'counters'}->[0]->{'bidirVolume'},
    $value->{'absoluteAccumulated'}->{'counters'}->[0]->{'name'},
    $value->{'selected'},
    $value->{'absoluteAccumulated'}->{'expiryDate'}->{'volume'},
    $value->{'absoluteAccumulated'}->{'previousExpiryDate'}->{'time'},
    $value->{'absoluteAccumulated'}->{'previousExpiryDate'}->{'volume'}
) or die $DBI::errstr;

Not only will you give the DB engine the best chance of performing the inserts optimally (the statement is compiled once rather than once per row); you also make your code a lot cleaner and simpler to maintain. It won't reduce your CPU usage, but it should result in your program finishing more quickly.

I hope you'll agree this is much nicer to look at and understand:

#!/usr/bin/perl
use Data::Dumper;
use DBI;
use File::Basename;
#use warnings;

@ARGV = qw[ NUL junk.dat CON CON ]; ## crude hack for local testing.

open( FH,  "<", $ARGV[0] );
open( AH,  "<", $ARGV[1] );
open( OUT, ">", $ARGV[2] );
open( LOG, ">", $ARGV[3] );

#Connecting to database
my $dbExt  = "db";
my $driver = "SQLite";
#my $file = basename($ARGV[1]);
#my @fileName = split(/\./, $file);
#$file = $fileName[0] . ".$dbExt";
my $dbFileName = 'junk.db'; #dirname( $ARGV[1] ) ."/" . $file;

if(-e $dbFileName) {
    unlink ($dbFileName);
}

my $database = "$dbFileName";
my $dsn      = "DBI:$driver:dbname=$database";
my $userid   = "";
my $password = "";
my $dbh = DBI->connect($dsn, $userid, $password, { RaiseError => 1, AutoCommit => 0 })
    or die $DBI::errstr;

my $stmt = <<EOS;
CREATE TABLE ACCU_USAGE (
    MOBILE VARCHAR2(50),
    PLANNAME VARCHAR2(50),
    PLANUSAGE CHAR(50),
    COUNTER CHAR(20),
    STATUS VARCHAR2(20),
    EXPIRY_DATE CHAR(20),
    PREEXP_TIME CHAR(20),
    PREEXP_VOLUME CHAR(20)
);
EOS

my $rv = $dbh->do($stmt);
if($rv < 0){
    unlink($dbFileName) if(-e $dbFileName);
    $dbh->disconnect();
    print LOG $DBI::errstr;
    exit(1);
}

## Prepare the insert once; only the bound values change per row.
my $sql = $dbh->prepare( q[ INSERT INTO ACCU_USAGE VALUES( ?, ?, ?, ?, ?, ?, ?, ? ) ] )
    or die $DBI::errstr;

my $SubsSize= -s $ARGV[0];
my $AccuSize= -s $ARGV[1];
my $SubsCount=0;
my $AccuCount=0;
my $finalCount=0;
my $rowCount = 0;

while( <AH> ) {
    chomp;
    my $line = $_;
    $AccuCount++;

    my $MobileNumber;
    if( $line =~ /subscriberId:(\w+)\(\"(\d+)\"\)/ ) {
        $MobileNumber = $2;
    }
    else {
        ## With placeholders DBI does the quoting, so bind a plain empty
        ## string; the old "''" would now be stored as two quote characters.
        $MobileNumber = '';
    }

    my $plan = $line;
    $plan =~ s/\\//g;
    my @AccVolume;

    if( $plan =~ /usageControlAccum:(\w+)\(\"(.*)\"\)/ ) {
        my $p = $2;
        $p =~ s/:\{/ => {/g;
        $p =~ s/:\[/ => [/g;
        $p =~s/\"/\'/g;
        $p =~ s/\':/\'=>/g;
        $p =~ s/\}n/\}/g;
        # print $p,"\n";

        my $e = eval( $p );
        if ( $@ ) {    ## eval errors are reported in $@ (was mistyped as @$)
            push (@AccVolume,"error");
        }
        else {
            #print Dumper($e);
            foreach my $value ( @{$e->{'reportingGroups'}} ) {
                if ( exists ( $value->{'absoluteAccumulated'}->{'counters'} ) ) {
                    $rv = $sql->execute(
                        $MobileNumber,
                        $value->{'subscriberGroupName'},
                        $value->{'absoluteAccumulated'}->{'counters'}->[0]->{'bidirVolume'},
                        $value->{'absoluteAccumulated'}->{'counters'}->[0]->{'name'},
                        $value->{'selected'},
                        $value->{'absoluteAccumulated'}->{'expiryDate'}->{'volume'},
                        $value->{'absoluteAccumulated'}->{'previousExpiryDate'}->{'time'},
                        $value->{'absoluteAccumulated'}->{'previousExpiryDate'}->{'volume'}
                    ) or die $DBI::errstr;
                }
                elsif ( exists ( $value->{'absoluteAccumulated'}->{'bidirVolume'} ) ) {
                    $rv = $sql->execute(
                        $MobileNumber,
                        $value->{'subscriberGroupName'},
                        $value->{'absoluteAccumulated'}->{'bidirVolume'},
                        $value->{'absoluteAccumulated'}->{'name'},
                        $value->{'selected'},
                        $value->{'absoluteAccumulated'}->{'expiryDate'}->{'volume'},
                        $value->{'absoluteAccumulated'}->{'previousExpiryDate'}->{'time'},
                        $value->{'absoluteAccumulated'}->{'previousExpiryDate'}->{'volume'}
                    ) or die $DBI::errstr;
                }

                if($rv < 0) {
                    print LOG "Failed to insert $stmt query. Exiting...\n";
                    unlink($dbFileName) if(-e $dbFileName);
                    print LOG $DBI::errstr;
                    $dbh->disconnect();
                    exit(1);
                }
                else {
                    ## Commit in batches of 5000 rows.
                    $rowCount++;
                    if($rowCount == 5000) {
                        $dbh->commit();
                        $rowCount = 0;
                    }
                }
            }
        }
    }
}
close(AH);
$dbh->commit();

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^4: Is 100% CPU utilisation during a procees is aproblem?
by Ankur_kuls (Sexton) on Nov 26, 2014 at 05:30 UTC

    Thanks a lot for your efforts. I will apply these changes and let you know the outcome. Meanwhile, I ran the same script on my SunOS server, and surprisingly it took only 4 to 5% of CPU there, whereas on my Linux server it takes 99 to 100%. Why is there such a large difference? It also suggests that there is not much wrong with our script. Please find my server details below.

    The Linux server where I am supposed to run the script, and where it is causing issues:

    YZP4M2:~ # uname -a
    Linux YZP4M2 2.6.5-7.308-smp #1 SMP Mon Dec 10 11:36:40 UTC 2007 x86_64 x86_64 x86_64 GNU/Linux
    YZP4M2:~ # less /proc/cpuinfo
    processor       : 0
    vendor_id       : GenuineIntel
    cpu family      : 6
    model           : 13
    model name      : Intel(R) Xeon(R) CPU E7420 @ 2.13GHz
    stepping        : 1
    cpu MHz         : 2133.414
    cache size      : 3072 KB
    physical id     : 0
    siblings        : 4
    core id         : 0
    cpu cores       : 4
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 11
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm pni monitor ds_cpl est cmpxchg16b dca lahf_lm
    bogomips        : 4227.07
    clflush size    : 64
    cache_alignment : 64
    address sizes   : 40 bits physical, 48 bits virtual
    power management:
    YZP4M2:~ # cat /proc/cpuinfo | grep processor | wc -l
    16
    YZP4M2:~ # free
                 total       used       free     shared    buffers     cached
    Mem:       8158176    8132864      25312          0       4468    1314884
    -/+ buffers/cache:    6813512    1344664
    Swap:      2104472    2104132        340

    Details of the Solaris server where the script is running fine:

    # psrinfo -v
    Status of virtual processor 0 as of: 11/26/2014 10:48:52
      on-line since 12/16/2013 19:50:50.
      The sparcv9 processor operates at 1800 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 1 as of: 11/26/2014 10:48:52
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1800 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 2 as of: 11/26/2014 10:48:52
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1800 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 3 as of: 11/26/2014 10:48:52
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1800 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 8 as of: 11/26/2014 10:48:52
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1800 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 9 as of: 11/26/2014 10:48:52
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1800 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 10 as of: 11/26/2014 10:48:52
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1800 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 11 as of: 11/26/2014 10:48:52
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1800 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 16 as of: 11/26/2014 10:48:52
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1200 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 18 as of: 11/26/2014 10:48:52
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1200 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 19 as of: 11/26/2014 10:48:52
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1200 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 512 as of: 11/26/2014 10:48:53
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1800 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 513 as of: 11/26/2014 10:48:53
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1800 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 514 as of: 11/26/2014 10:48:53
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1800 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 515 as of: 11/26/2014 10:48:53
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1800 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 520 as of: 11/26/2014 10:48:53
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1800 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 521 as of: 11/26/2014 10:48:53
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1800 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 522 as of: 11/26/2014 10:48:53
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1800 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 523 as of: 11/26/2014 10:48:53
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1800 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 528 as of: 11/26/2014 10:48:53
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1200 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 530 as of: 11/26/2014 10:48:53
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1200 MHz, and has a sparcv9 floating point processor.
    Status of virtual processor 531 as of: 11/26/2014 10:48:53
      on-line since 12/16/2013 19:50:51.
      The sparcv9 processor operates at 1200 MHz, and has a sparcv9 floating point processor.
    # prtconf | grep Memory
    Memory size: 45056 Megabytes

      The difference in the output is simply that on a Linux machine with four cores, the range for CPU utilization is from 0% to 400%, while on Solaris, it goes from 0% to 100%. Your Solaris machine has 22 CPUs, so one completely busy CPU on Solaris will show up as about 4% CPU usage.
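To make that arithmetic concrete, here is a small sketch that converts a per-core figure (where 100% means one core fully busy) into a share of the whole machine. The helper name whole_machine_pct is made up for illustration; it is not part of top or any other tool.

```perl
use strict;
use warnings;

# Hypothetical helper: per-core percentage (100 = one core fully busy)
# divided across all CPUs gives the whole-machine percentage.
sub whole_machine_pct {
    my ($per_core_pct, $cpus) = @_;
    die "need at least one CPU\n" if $cpus < 1;
    return $per_core_pct / $cpus;
}

printf "%.1f%%\n", whole_machine_pct(100, 22);   # one busy CPU out of 22: 4.5%
printf "%.1f%%\n", whole_machine_pct(100, 4);    # one busy core out of 4: 25.0%
```

So a single-threaded script that saturates one CPU is the same workload in both cases; only the scale it is reported on differs.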

Re^4: Is 100% CPU utilisation during a procees is aproblem?
by Ankur_kuls (Sexton) on Nov 26, 2014 at 09:59 UTC

    Hi all, I am not able to reply to Corion's answer, so please help me get this clarified. As per his reply, CPU usage on my Linux machine is only around 25%. Does that mean there is no problem with my script and I can go ahead with it?

      Personally, if I had a 22-core machine available to me, I'd be very dissatisfied with only being able to make use of 4% of the available power.

      I'd definitely look to multi-thread the processing so as to spread the cpu-intensive part(s) of the processing over more cores and reduce the overall time taken.
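As a rough sketch of spreading the CPU-heavy step over several cores: the version below uses fork and pipes rather than Perl threads (a deliberate swap, since not every perl is built with thread support), and parse_line is a hypothetical stand-in for the expensive per-line work. Each child handles every Nth line and sends its results back to the parent, which would remain the only process talking to the database.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical stand-in for the CPU-heavy per-line work (e.g. the eval step).
sub parse_line {
    my ($line) = @_;
    return length $line;
}

# Fork $workers children; child $w handles lines $w, $w + $workers, ...
# and writes "index<TAB>result" rows back to the parent through a pipe.
sub process_in_parallel {
    my ($workers, @lines) = @_;
    my @readers;
    for my $w (0 .. $workers - 1) {
        pipe(my $rd, my $wr) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                       # child process
            close $rd;
            for (my $i = $w; $i < @lines; $i += $workers) {
                print {$wr} join("\t", $i, parse_line($lines[$i])), "\n";
            }
            close $wr;                         # flushes buffered rows
            POSIX::_exit(0);                   # leave without running parent cleanup
        }
        close $wr;                             # parent keeps only the read end
        push @readers, [ $pid, $rd ];
    }

    # Parent: collect results in input order, then reap each child.
    my @results;
    for my $r (@readers) {
        my ($pid, $rd) = @$r;
        while (my $row = <$rd>) {
            chomp $row;
            my ($i, $value) = split /\t/, $row, 2;
            $results[$i] = $value;
        }
        close $rd;
        waitpid($pid, 0);
    }
    return @results;
}

my @out = process_in_parallel(4, 'alpha', 'be', 'gamma', 'd', 'epsilon');
print "@out\n";
```

Whether this pays off depends on how much of the run time is the parsing rather than the inserts, so it is worth measuring before committing to it.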

      Yes, your script will be using 100% of one core, so if you have multiple cores, then the other cores won't be occupied. Your definition of a "problem" is still unclear - why shouldn't your script take 100% CPU? For a script that does a lot of CPU intensive stuff on a large input file that seems pretty normal to me.
      Search the top documentation for "CPU" and try the "I" command to see the different ways CPU usage can be displayed.
