PerlMonks  

Re^3: dynamic number of threads based on CPU utilization

by mabossert (Beadle)
on Sep 26, 2012 at 16:38 UTC (#995813)


in reply to Re^2: dynamic number of threads based on CPU utilization
in thread dynamic number of threads based on CPU utilization

My apologies... I thought that since the procXml sub worked just fine, it would not be relevant to the discussion or potential solution. Within the procXml sub, I simply slurp the file into a string, convert it to a hash with XML::Hash, then operate on the hash.

I was under the impression that because I was operating on the file contents in memory (i.e. the hash), it was a mostly CPU-bound process (apart from slurping the input file and printing to the output file).

sub procXml {
    my($inFile)=@_;
    my $triples;
    my (%countries,%avDetails,%avFiles);
    my $fsize = -s $inFile;
    my $fmb=$fsize/1048576;
    print "PROCESSING: $inFile (".sprintf("%.4f",$fmb)." MB)\n";
    my $INFILE;
    open($INFILE,'<',$inFile);
    my $xmlString=read_file($inFile);
    close($INFILE);
    my $xml_converter = XML::Hash->new();
    my $xml_hash;
    eval { $xml_hash = $xml_converter->fromXMLStringtoHash($xmlString); };
    if($@) {
        print "BAD XML: $inFile\n";
        return;
    }
    $xmlString = undef;
    foreach my $outer (@{$xml_hash->{'cyveillance'}->{'inspected_url'}->{'URL'}}) {
        my $domainName=$outer->{'Domain_Name'}->{'text'};
        my $exploitType=$outer->{'Exploit_Type'}->{'text'};
        my $inspectedTime=$outer->{'InspectedTime'}->{'text'};
        my ($ss,$mm,$hh,$day,$month,$year,$zone) = strptime($inspectedTime);
        $year+=1900;
        $month+=1;
        $month=sprintf("%02d",$month);
        $hh='00' unless defined $hh;
        $mm='00' unless defined $mm;
        $ss='01' unless defined $ss;
        $inspectedTime="$year-$month-$day"."T"."$hh:$mm:$ss";
        my $ip=$outer->{'IP'}->{'text'};
        my $exploitDescription=$outer->{'Exploit_Description'}->{'text'};
        my $hostName=$outer->{'Host_Name'}->{'text'};
        my $referenceUrl=$outer->{'reference_url'};
        $ip=defined $ip?$ip eq ''?undef:$ip=~m/^-$|^unknown$/i?undef:$ip:undef;
        $exploitDescription=defined $exploitDescription?$exploitDescription eq ''?undef:$exploitDescription=~m/^-$|^unknown$/i?undef:$exploitDescription:undef;
        $hostName=defined $hostName?$hostName eq ''?undef:$hostName=~m/^-$|^unknown$/i?undef:$hostName:undef;
        $referenceUrl=defined $referenceUrl?$referenceUrl eq ''?undef:$referenceUrl=~m/^-$|^unknown$/i?undef:$referenceUrl:undef;
        if(ref($outer->{'Binary'}) eq 'ARRAY') {
            foreach my $binary (@{$outer->{'Binary'}}) {
                my $fileName=$binary->{'File_Name'}->{'text'};
                my $fileURL=$binary->{'Binary_Path'}->{'text'};
                my $pestName=$binary->{'Pest_Name'}->{'text'};
                my $md5=$binary->{'Hash'}->{'MD5'}->{'text'};
                my $fileSize=$binary->{'File_Size'}->{'text'};
                $fileName=defined $fileName?$fileName eq ''?undef:$fileName=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$fileName:undef;
                $fileURL=defined $fileURL?$fileURL eq ''?undef:$fileURL=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$fileURL:undef;
                $pestName=defined $pestName?$pestName eq ''?undef:$pestName=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$pestName:undef;
                $pestName=$1 if defined $pestName && $pestName =~ m/Found potentially unwanted program (.*)\./;
                $md5=defined $md5?$md5 eq ''?undef:$md5=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$md5:undef;
                $fileSize=defined $fileSize?$fileSize eq ''?undef:$fileSize=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$fileSize=~m/^.[0-9]+$/?$fileSize:undef:undef;
                my $server_domainName=$binary->{'Server_Properties'}->{'Domain_Name'}->{'text'};
                my $server_hostName=$binary->{'Server_Properties'}->{'Host_Name'}->{'text'};
                my $server_ip=$binary->{'Server_Properties'}->{'IP'}->{'text'};
                my $server_ISP=$binary->{'Server_Properties'}->{'ISP_Data'}->{'ISP'}->{'text'};
                my $server_numBinaries=$binary->{'Server_Properties'}->{'ISP_Data'}->{'Number_Hosted_Binaries'}->{'text'};
                my $server_zipCode=$binary->{'Server_Properties'}->{'ISP_Data'}->{'Zip_Code'}->{'text'} if exists $binary->{'Server_Properties'}->{'ISP_Data'}->{'Zip_Code'}->{'text'};
                my $server_city=$binary->{'Server_Properties'}->{'ISP_Data'}->{'City'}->{'text'} if exists $binary->{'Server_Properties'}->{'ISP_Data'}->{'City'}->{'text'};
                my $server_region=$binary->{'Server_Properties'}->{'ISP_Data'}->{'Region'}->{'text'} if exists $binary->{'Server_Properties'}->{'ISP_Data'}->{'Region'}->{'text'};
                my $server_country=$binary->{'Server_Properties'}->{'ISP_Data'}->{'Country'}->{'text'} if exists $binary->{'Server_Properties'}->{'ISP_Data'}->{'Country'}->{'text'};
                my $server_numSitesHosted=$binary->{'Server_Properties'}->{'ISP_Data'}->{'Number_Hosted_Sites'}->{'text'} if exists $binary->{'Server_Properties'}->{'ISP_Data'}->{'Number_Hosted_Sites'}->{'text'};
                my $webServer=$binary->{'Server_Properties'}->{'ISP_Data'}->{'Web_Server_info'}->{'text'};
                $server_domainName=defined $server_domainName?$server_domainName eq ''?undef:$server_domainName=~m/^-$|^unknown$/i?undef:$server_domainName:undef;
                $server_hostName=defined $server_hostName?$server_hostName eq ''?undef:$server_hostName=~m/^-$|^unknown$/i?undef:$server_hostName:undef;
                $server_ip=defined $server_ip?$server_ip eq ''?undef:$server_ip=~m/^-$|^unknown$/i?undef:$server_ip:undef;
                $server_ISP=defined $server_ISP?$server_ISP eq ''?undef:$server_ISP=~m/^-$|^unknown$/i?undef:$server_ISP:undef;
                $server_numBinaries=defined $server_numBinaries?$server_numBinaries eq ''?'1':$server_numBinaries=~m/^-$|^unknown$/i?'1':$server_numBinaries=~m/^.[0-9]+$/?$server_numBinaries:'1':'1';
                $server_zipCode=defined $server_zipCode?$server_zipCode eq ''?undef:$server_zipCode=~m/^-$|^unknown$/i?undef:$server_zipCode:undef;
                $server_city=defined $server_city?$server_city eq ''?undef:$server_city=~m/^-$|^unknown$/i?undef:$server_city:undef;
                $server_region=defined $server_region?$server_region eq ''?undef:$server_region=~m/^-$|^unknown$/i?undef:$server_region:undef;
                $server_country=defined $server_country?$server_country eq ''?undef:$server_country=~m/^-$|^unknown$/i?undef:$server_country:undef;
                $server_numSitesHosted=defined $server_numSitesHosted?$server_numSitesHosted eq ''?'1':$server_numSitesHosted=~m/^-$|^unknown$/i?'1':$server_numSitesHosted=~m/^.[0-9]+$/?$server_numSitesHosted:'1':'1';
                $webServer=defined $webServer?$webServer eq ''?'unknown':$webServer=~m/^-$|^unknown$/i?'unknown':$webServer:'unknown';
                $server_country =~ s/\s/_/g if defined $server_country;
                my (%avDetections,%threatTypes,%classes);
                next if !defined $binary->{'Class'};
                foreach(keys $binary->{'Class'}) {
                    $classes{$_}=1 if $binary->{'Class'}->{$_}->{'text'} == 1;
                }
                foreach(keys $binary->{'Anti-Virus'}) {
                    $avDetections{$_}->{'Signature_Version'}=$binary->{'Anti-Virus'}->{$_}->{'Signature_Version'} unless $binary->{'Anti-Virus'}->{$_}->{'Signature_Version'} eq '';
                    $avDetections{$_}->{'Engine_Version'}=$binary->{'Anti-Virus'}->{$_}->{'Engine_Version'} unless $binary->{'Anti-Virus'}->{$_}->{'Engine_Version'} eq '';
                    $avDetections{$_}->{'Threat_Name'}=$binary->{'Anti-Virus'}->{$_}->{'Threat_Name'} unless $binary->{'Anti-Virus'}->{$_}->{'Threat_Name'} eq '';
                }
                foreach(keys $binary->{'Type'}) {
                    $threatTypes{$_}=1 if $binary->{'Type'}->{$_}->{'text'} == 1;
                }
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasDomainName> <http://cs.org/domain#$domainName> .\n| if defined $domainName;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasExploitType> <http://cs.org/exploitAttempted#$exploitType> .\n| if defined $exploitType;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasExploitDescription> "$exploitDescription" .\n| if defined $exploitDescription;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/DTGstart> "$inspectedTime"^^<http://www.w3.org/2001/XMLSchema#dateTime> .\n| if defined $inspectedTime;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasIpAddr> <http://cs.org/ipv4#$ip> .\n| if defined $ip;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasHostName> <http://cs.org/host#$hostName> .\n| if defined $hostName;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasURL> <http://cs.org/url#$referenceUrl> .\n| if defined $referenceUrl;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasFile> <http://cs.org/file#$fileName> .\n| if defined $fileName;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasFileURL> <http://cs.org/url#$fileURL> .\n| if defined $fileURL;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasFileMD5> <http://cs.org/MD5#$md5> .\n| if defined $md5;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasFileSize> "$fileSize"^^<http://www.w3.org/2001/XMLSchema#integer> .\n| if defined $fileSize;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasPestName> <http://cs.org/pest_name#$pestName> .\n| if defined $pestName;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasWebServer> "$webServer" .\n| if defined $webServer;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerDomain> <http://cs.org/domain#$server_domainName> .\n| if defined $server_domainName;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerHostName> <http://cs.org/host#$server_hostName> .\n| if defined $server_hostName;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerIpAddr> <http://cs.org/ipv4#$server_ip> .\n| if defined $server_ip;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerNumSites> "$server_numSitesHosted"^^<http://www.w3.org/2001/XMLSchema#integer> .\n| if defined $server_numSitesHosted;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerNumBinaries> "$server_numBinaries"^^<http://www.w3.org/2001/XMLSchema#integer> .\n| if defined $server_numBinaries;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasISP> "$server_ISP" .\n| if defined $server_ISP;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerZipCode> "$server_zipCode" .\n| if defined $server_zipCode;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerCity> <http://cs.org/city#$server_city> .\n| if defined $server_city;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerRegion> <http://cs.org/city#$server_region> .\n| if defined $server_region;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerCountry> <http://cs.org/country#$server_country> .\n| if defined $server_country;
                $triples .= qq|<http://cs.org/country#$server_country> <http://cs.org/p/hasServerRegion> <http://cs.org/city#$server_region> .\n| unless (!defined $server_country || !defined $server_region) || exists $countries{$server_country}->{'regions'}->{$server_region};
                $triples .= qq|<http://cs.org/region#$server_region> <http://cs.org/p/hasServerCity> <http://cs.org/city#$server_city> .\n| unless (!defined $server_region || !defined $server_city) || exists $countries{$server_country}->{'cities'}->{$server_city};
                $triples .= qq|<http://cs.org/city#$server_city> <http://cs.org/p/hasServerZipCode> <http://cs.org/city#$server_zipCode> .\n| unless (!defined $server_city || !defined $server_zipCode) || exists $countries{$server_country}->{'zipcodes'}->{$server_zipCode};
                $countries{$server_country}->{'regions'}->{$server_region}=1 if defined $server_region && defined $server_country;
                $countries{$server_country}->{'cities'}->{$server_city}=1 if defined $server_city && defined $server_country;
                $countries{$server_country}->{'zipcodes'}->{$server_zipCode}=1 if defined $server_zipCode && defined $server_country;
                $triples .= qq|<http:cs.org/file#$fileName> <http://cs.org/p/detectedAs> <http://cs.org/pest_name#$pestName> .\n| if (defined $fileName && defined $pestName) && (!exists $avFiles{$pestName} || $avFiles{$pestName} ne $pestName);
                $avFiles{$pestName}=$fileName if defined $fileName && defined $pestName;
                foreach(keys %avDetections) {
                    my $sig=$avDetections{$_}->{'Signature_Version'};
                    my $eng=$avDetections{$_}->{'Engine_Version'};
                    my $tn=$avDetections{$_}->{'Threat_Name'};
                    $tn =~ s/\s/_/g if defined $tn;
                    $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/detectedBy> <http://cs.org/AV#$_> .\n|;
                    $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasAvEngineVersion> "$eng" .\n| if defined $eng;
                    $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasAvSigVersion> "$sig" .\n| if defined $sig;
                    $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/detectedAs> <http://cs.org/avThreat_name#$tn> .\n| if defined $tn;
                    $triples .= qq|<http://cs.org/file#$fileName> <http://cs.org/p/detectedBy> <http://cs.org/AV#$_> .\n| unless !defined $fileName || exists $avDetails{$fileName}->{'avDetection'}->{$_};
                    $triples .= qq|<http://cs.org/file#$fileName> <http://cs.org/p/detectedAs> <http://cs.org/avThreat_name#$tn> .\n| unless (!defined $tn || !defined $fileName) || exists $avDetails{$fileName}->{'avThreatName'}->{$tn};
                    $avDetails{$fileName}->{'avDetection'}->{$_}=1 if defined $fileName;
                    $avDetails{$fileName}->{'avThreatName'}->{$tn}=1 if defined $tn && defined $fileName;
                    $avFiles{$tn}=$fileName if defined $fileName && defined $tn;
                }
                foreach(keys %threatTypes) {
                    $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasThreatType> <http://cs.org/threatType#$_> .\n|;
                    $triples .= qq|<http://cs.org/file#$fileName> <http://cs.org/p/hasThreatType> <http://cs.org/threatType#$_> .\n| unless !defined $fileName || exists $avDetails{$fileName}->{'avThreatType'}->{$_};
                    $avDetails{$fileName}->{'avThreatType'}->{$_}=1 if defined $fileName;
                }
                foreach(keys %classes) {
                    $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasThreatClass> <http://cs.org/threatClass#$_> .\n|;
                    $triples .= qq|<http://cs.org/file#$fileName> <http://cs.org/p/hasThreatClass> <http://cs.org/threatClass#$_> .\n| unless !defined $fileName || exists $avDetails{$fileName}->{'avThreatClass'}->{$_};
                    $avDetails{$fileName}->{'avThreatClass'}->{$_}=1 if defined $fileName;
                }
                $similar{$domainName}='domain' if defined $domainName;
                $similar{$hostName}='host' if defined $hostName;
                $similar{$fileName}='file' if defined $fileName;
                $similar{$pestName}='pest_name' if defined $pestName;
                $similar{$server_domainName}='domain' if defined $server_domainName;
                $similar{$server_hostName}='host' if defined $server_hostName;
                $recordCount++;
            }
        }
        else {
            my $fileName=$outer->{'Binary'}->{'File_Name'}->{'text'};
            my $fileURL=$outer->{'Binary'}->{'Binary_Path'}->{'text'};
            my $pestName=$outer->{'Binary'}->{'Pest_Name'}->{'text'};
            my $md5=$outer->{'Binary'}->{'Hash'}->{'MD5'}->{'text'};
            my $fileSize=$outer->{'Binary'}->{'File_Size'}->{'text'};
            $fileName=defined $fileName?$fileName eq ''?undef:$fileName=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$fileName:undef;
            $fileURL=defined $fileURL?$fileURL eq ''?undef:$fileURL=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$fileURL:undef;
            $pestName=defined $pestName?$pestName eq ''?undef:$pestName=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$pestName:undef;
            $pestName=$1 if defined $pestName && $pestName =~ m/Found potentially unwanted program (.*)\./;
            $md5=defined $md5?$md5 eq ''?undef:$md5=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$md5:undef;
            $fileSize=defined $fileSize?$fileSize eq ''?undef:$fileSize=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$fileSize=~m/^.[0-9]+$/?$fileSize:undef:undef;
            my $server_domainName=$outer->{'Binary'}->{'Server_Properties'}->{'Domain_Name'}->{'text'};
            my $server_hostName=$outer->{'Binary'}->{'Server_Properties'}->{'Host_Name'}->{'text'};
            my $server_ip=$outer->{'Binary'}->{'Server_Properties'}->{'IP'}->{'text'};
            my $server_ISP=$outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'ISP'}->{'text'};
            my $server_numBinaries=$outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Number_Hosted_Binaries'}->{'text'};
            my $server_city=$outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'City'}->{'text'} if exists $outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'City'}->{'text'};
            my $server_country=$outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Country'}->{'text'} if exists $outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Country'}->{'text'};
            my $server_zipCode=$outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Zip_Code'}->{'text'} if exists $outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Zip_Code'}->{'text'};
            my $server_region=$outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Region'}->{'text'} if exists $outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Region'}->{'text'};
            my $server_numSitesHosted=$outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Number_Hosted_Sites'}->{'text'} if exists $outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Number_Hosted_Sites'}->{'text'};
            my $webServer=$outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Web_Server_Info'}->{'text'};
            $server_domainName=defined $server_domainName?$server_domainName eq ''?undef:$server_domainName=~m/^-$|^unknown$/i?undef:$server_domainName:undef;
            $server_hostName=defined $server_hostName?$server_hostName eq ''?undef:$server_hostName=~m/^-$|^unknown$/i?undef:$server_hostName:undef;
            $server_ip=defined $server_ip?$server_ip eq ''?undef:$server_ip=~m/^-$|^unknown$/i?undef:$server_ip:undef;
            $server_ISP=defined $server_ISP?$server_ISP eq ''?undef:$server_ISP=~m/^-$|^unknown$/i?undef:$server_ISP:undef;
            $server_numBinaries=defined $server_numBinaries?$server_numBinaries eq ''?'1':$server_numBinaries=~m/^-$|^unknown$/i?'1':$server_numBinaries=~m/^.[0-9]+$/?$server_numBinaries:'1':'1';
            $server_zipCode=defined $server_zipCode?$server_zipCode eq ''?undef:$server_zipCode=~m/^-$|^unknown$/i?undef:$server_zipCode:undef;
            $server_city=defined $server_city?$server_city eq ''?undef:$server_city=~m/^-$|^unknown$/i?undef:$server_city:undef;
            $server_region=defined $server_region?$server_region eq ''?undef:$server_region=~m/^-$|^unknown$/i?undef:$server_region:undef;
            $server_country=defined $server_country?$server_country eq ''?undef:$server_country=~m/^-$|^unknown$/i?undef:$server_country:undef;
            $server_numSitesHosted=defined $server_numSitesHosted?$server_numSitesHosted eq ''?'1':$server_numSitesHosted=~m/^-$|^unknown$/i?'1':$server_numSitesHosted=~m/^.[0-9]+$/?$server_numSitesHosted:'1':'1';
            $webServer=defined $webServer?$webServer eq ''?'unknown':$webServer=~m/^-$|^unknown$/i?'unknown':$webServer:'unknown';
            $server_country =~ s/\s/_/g if defined $server_country;
            my (%avDetections,%threatTypes,%classes);
            next if !defined $outer->{'Binary'}->{'Class'};
            foreach(keys $outer->{'Binary'}->{'Class'}) {
                $classes{$_}=1 if $outer->{'Binary'}->{'Class'}->{$_}->{'text'} == 1;
            }
            foreach(keys $outer->{'Binary'}->{'Anti-Virus'}) {
                $avDetections{$_}->{'Signature_Version'}=$outer->{'Binary'}->{'Anti-Virus'}->{$_}->{'Signature_Version'} unless $outer->{'Binary'}->{'Anti-Virus'}->{$_}->{'Signature_Version'} eq '';
                $avDetections{$_}->{'Engine_Version'}=$outer->{'Binary'}->{'Anti-Virus'}->{$_}->{'Engine_Version'} unless $outer->{'Binary'}->{'Anti-Virus'}->{$_}->{'Engine_Version'} eq '';
                $avDetections{$_}->{'Threat_Name'}=$outer->{'Binary'}->{'Anti-Virus'}->{$_}->{'Threat_Name'} unless $outer->{'Binary'}->{'Anti-Virus'}->{$_}->{'Threat_Name'} eq '';
            }
            foreach(keys $outer->{'Binary'}->{'Type'}) {
                $threatTypes{$_}=1 if $outer->{'Binary'}->{'Type'}->{$_}->{'text'} == 1;
            }
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasDomainName> <http://cs.org/domain#$domainName> .\n| if defined $domainName;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasExploitType> <http://cs.org/exploitAttempted#$exploitType> .\n| if defined $exploitType;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasExploitDescription> "$exploitDescription" .\n| if defined $exploitDescription;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/DTGstart> "$inspectedTime"^^<http://www.w3.org/2001/XMLSchema#dateTime> .\n| if defined $inspectedTime;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasIpAddr> <http://cs.org/ipv4#$ip> .\n| if defined $ip;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasHostName> <http://cs.org/host#$hostName> .\n| if defined $hostName;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasURL> <http://cs.org/url#$referenceUrl> .\n| if defined $referenceUrl;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasFile> <http://cs.org/file#$fileName> .\n| if defined $fileName;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasFileURL> <http://cs.org/url#$fileURL> .\n| if defined $fileURL;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasFileMD5> <http://cs.org/MD5#$md5> .\n| if defined $md5;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasFileSize> "$fileSize"^^<http://www.w3.org/2001/XMLSchema#integer> .\n| if defined $fileSize;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasPestName> <http://cs.org/pest_name#$pestName> .\n| if defined $pestName;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasWebServer> "$webServer" .\n| if defined $webServer;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerDomain> <http://cs.org/domain#$server_domainName> .\n| if defined $server_domainName;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerHostName> <http://cs.org/host#$server_hostName> .\n| if defined $server_hostName;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerIpAddr> <http://cs.org/ipv4#$server_ip> .\n| if defined $server_ip;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerNumSites> "$server_numSitesHosted"^^<http://www.w3.org/2001/XMLSchema#integer> .\n| if defined $server_numSitesHosted;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerNumBinaries> "$server_numBinaries"^^<http://www.w3.org/2001/XMLSchema#integer> .\n| if defined $server_numBinaries;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasISP> "$server_ISP" .\n| if defined $server_ISP;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerZipCode> "$server_zipCode" .\n| if defined $server_zipCode;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerCity> <http://cs.org/city#$server_city> .\n| if defined $server_city;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerRegion> <http://cs.org/city#$server_region> .\n| if defined $server_region;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerCountry> <http://cs.org/country#$server_country> .\n| if defined $server_country;
            $triples .= qq|<http://cs.org/country#$server_country> <http://cs.org/p/hasServerRegion> <http://cs.org/city#$server_region> .\n| unless (!defined $server_country || !defined $server_region) || exists $countries{$server_country}->{'regions'}->{$server_region};
            $triples .= qq|<http://cs.org/region#$server_region> <http://cs.org/p/hasServerCity> <http://cs.org/city#$server_city> .\n| unless (!defined $server_region || !defined $server_city) || exists $countries{$server_country}->{'cities'}->{$server_city};
            $triples .= qq|<http://cs.org/city#$server_city> <http://cs.org/p/hasServerZipCode> <http://cs.org/city#$server_zipCode> .\n| unless (!defined $server_city || !defined $server_zipCode) || exists $countries{$server_country}->{'zipcodes'}->{$server_zipCode};
            $countries{$server_country}->{'regions'}->{$server_region}=1 if defined $server_region && defined $server_country;
            $countries{$server_country}->{'cities'}->{$server_city}=1 if defined $server_city && defined $server_country;
            $countries{$server_country}->{'zipcodes'}->{$server_zipCode}=1 if defined $server_zipCode && defined $server_country;
            $triples .= qq|<http:cs.org/file#$fileName> <http://cs.org/p/detectedAs> <http://cs.org/pest_name#$pestName> .\n| if (defined $fileName && defined $pestName) && (!exists $avFiles{$pestName} || $avFiles{$pestName} ne $pestName);
            $avFiles{$pestName}=$fileName if defined $fileName && defined $pestName;
            foreach(keys %avDetections) {
                my $sig=$avDetections{$_}->{'Signature_Version'};
                my $eng=$avDetections{$_}->{'Engine_Version'};
                my $tn=$avDetections{$_}->{'Threat_Name'};
                $tn =~ s/\s/_/g if defined $tn;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/detectedBy> <http://cs.org/AV#$_> .\n|;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasAvEngineVersion> "$eng" .\n| if defined $eng;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasAvSigVersion> "$sig" .\n| if defined $sig;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/detectedAs> <http://cs.org/avThreat_name#$tn> .\n| if defined $tn;
                $triples .= qq|<http://cs.org/file#$fileName> <http://cs.org/p/detectedBy> <http://cs.org/AV#$_> .\n| unless !defined $fileName || exists $avDetails{$fileName}->{'avDetection'}->{$_};
                $triples .= qq|<http://cs.org/file#$fileName> <http://cs.org/p/detectedAs> <http://cs.org/avThreat_name#$tn> .\n| unless (!defined $tn || !defined $fileName) || exists $avDetails{$fileName}->{'avThreatName'}->{$tn};
                $avDetails{$fileName}->{'avDetection'}->{$_}=1 if defined $fileName;
                $avDetails{$fileName}->{'avThreatName'}->{$tn}=1 if defined $tn && defined $fileName;
                $avFiles{$tn}=$fileName if defined $fileName && defined $tn;
            }
            foreach(keys %threatTypes) {
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasThreatType> <http://cs.org/threatType#$_> .\n|;
                $triples .= qq|<http://cs.org/file#$fileName> <http://cs.org/p/hasThreatType> <http://cs.org/threatType#$_> .\n| unless !defined $fileName || exists $avDetails{$fileName}->{'avThreatType'}->{$_};
                $avDetails{$fileName}->{'avThreatType'}->{$_}=1 if defined $fileName;
            }
            foreach(keys %classes) {
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasThreatClass> <http://cs.org/threatClass#$_> .\n|;
                $triples .= qq|<http://cs.org/file#$fileName> <http://cs.org/p/hasThreatClass> <http://cs.org/threatClass#$_> .\n| unless !defined $fileName || exists $avDetails{$fileName}->{'avThreatClass'}->{$_};
                $avDetails{$fileName}->{'avThreatClass'}->{$_}=1 if defined $fileName;
            }
            $similar{$domainName}='domain' if defined $domainName;
            $similar{$hostName}='host' if defined $hostName;
            $similar{$fileName}='file' if defined $fileName;
            $similar{$pestName}='pest_name' if defined $pestName;
            $similar{$server_domainName}='domain' if defined $server_domainName;
            $similar{$server_hostName}='host' if defined $server_hostName;
            $recordCount++;
        }
    }
    $xml_converter = undef;
    print "FINISHED: $inFile\n";
    return($triples);
}


Re^4: dynamic number of threads based on CPU utilization
by BrowserUk (Pope) on Sep 26, 2012 at 16:42 UTC
    ...I thought that since the procXml sub worked just fine, it would not be relevant to the discussion or potential solution.

    You were mostly right. The only relevance it has is that nowhere in that code do I see any sign of locking (the keyword 'lock' does not appear), which means that multiple threads are writing to a shared hash and there is nothing to prevent them from corrupting data through collisions.

    You may 'get away with it', but I wouldn't want to be responsible for when things go wrong.
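
    For illustration only, here is a minimal sketch of what locking a shared hash looks like with threads::shared. The hash %seen and the sub record_names are hypothetical stand-ins (the posted sub does not show how %similar or $recordCount are declared), and the sketch assumes a perl built with ithreads support.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical shared hash, standing in for something like %similar.
my %seen :shared;

# Each worker bumps a counter per name. The lock makes the
# read-modify-write of $seen{$name} safe against other threads.
sub record_names {
    my @names = @_;
    for my $name (@names) {
        lock(%seen);        # released automatically at the end of this block
        $seen{$name}++;
    }
    return;
}

# Start four workers that all touch the same three keys.
my @workers;
for my $n (1 .. 4) {
    push @workers, threads->create(\&record_names, qw(alpha beta gamma));
}
$_->join for @workers;

# With the lock in place every key has been counted once per worker.
printf "%s => %d\n", $_, $seen{$_} for sort keys %seen;
```

    The lock is advisory: it only protects the hash if every thread that writes to it takes the same lock first.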


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    RIP Neil Armstrong
