
Re^3: dynamic number of threads based on CPU utilization

by mabossert (Scribe)
on Sep 26, 2012 at 16:38 UTC (#995813)


in reply to Re^2: dynamic number of threads based on CPU utilization
in thread dynamic number of threads based on CPU utilization

My apologies... I thought that since the procXml sub worked just fine, it would not be relevant to the discussion or a potential solution. Within the procXml sub, I simply slurp the file into a hash, then operate on the hash.

I was under the impression that because I was operating on the file contents in memory (i.e. the hash), it was a mostly CPU-bound process (apart from slurping the input file and printing to the output file).
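One way to check the CPU-bound assumption is to compare wall-clock time with the CPU time consumed by each stage. What follows is a minimal sketch, not code from the original script: `profile_step` is a hypothetical helper, and the two workloads are stand-ins for the file slurp and the hash processing. It relies only on the core `Time::HiRes` module and Perl's built-in `times`.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Hypothetical helper (not part of the original post): run a code block and
# report wall-clock time next to the CPU time the process consumed.
sub profile_step {
    my ($label, $code) = @_;
    my $wall_start = Time::HiRes::time();
    my ($user_start, $sys_start) = times;
    my $result = $code->();
    my ($user_end, $sys_end) = times;
    my $wall = Time::HiRes::time() - $wall_start;
    my $cpu  = ($user_end - $user_start) + ($sys_end - $sys_start);
    printf "%-8s wall=%.3fs cpu=%.3fs\n", $label, $wall, $cpu;
    return { wall => $wall, cpu => $cpu, result => $result };
}

# Stand-in workloads. In the real script these would wrap the read_file()
# call and the loop over the parsed hash inside procXml.
my $io  = profile_step('wait',   sub { select(undef, undef, undef, 0.2); 1 });
my $cpu = profile_step('crunch', sub { my $n = 0; $n += $_ % 7 for 1 .. 2_000_000; $n });
```

If a stage's CPU time is close to its wall-clock time, it is likely CPU-bound; a large gap suggests it spends most of its time waiting (on disk, for example). Note that `times` reports figures for the whole process, so the measurement is only meaningful when taken with a single worker running.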

sub procXml {
    my($inFile)=@_;
    my $triples;
    my (%countries,%avDetails,%avFiles);
    my $fsize = -s $inFile;
    my $fmb=$fsize/1048576;
    print "PROCESSING: $inFile (".sprintf("%.4f",$fmb)." MB)\n";
    my $INFILE;
    open($INFILE,'<',$inFile);
    my $xmlString=read_file($inFile);
    close($INFILE);
    my $xml_converter = XML::Hash->new();
    my $xml_hash;
    eval { $xml_hash = $xml_converter->fromXMLStringtoHash($xmlString); };
    if($@) {
        print "BAD XML: $inFile\n";
        return;
    }
    $xmlString = undef;
    foreach my $outer (@{$xml_hash->{'cyveillance'}->{'inspected_url'}->{'URL'}}) {
        my $domainName=$outer->{'Domain_Name'}->{'text'};
        my $exploitType=$outer->{'Exploit_Type'}->{'text'};
        my $inspectedTime=$outer->{'InspectedTime'}->{'text'};
        my ($ss,$mm,$hh,$day,$month,$year,$zone) = strptime($inspectedTime);
        $year+=1900;
        $month+=1;
        $month=sprintf("%02d",$month);
        $hh='00' unless defined $hh;
        $mm='00' unless defined $mm;
        $ss='01' unless defined $ss;
        $inspectedTime="$year-$month-$day"."T"."$hh:$mm:$ss";
        my $ip=$outer->{'IP'}->{'text'};
        my $exploitDescription=$outer->{'Exploit_Description'}->{'text'};
        my $hostName=$outer->{'Host_Name'}->{'text'};
        my $referenceUrl=$outer->{'reference_url'};
        $ip=defined $ip?$ip eq ''?undef:$ip=~m/^-$|^unknown$/i?undef:$ip:undef;
        $exploitDescription=defined $exploitDescription?$exploitDescription eq ''?undef:$exploitDescription=~m/^-$|^unknown$/i?undef:$exploitDescription:undef;
        $hostName=defined $hostName?$hostName eq ''?undef:$hostName=~m/^-$|^unknown$/i?undef:$hostName:undef;
        $referenceUrl=defined $referenceUrl?$referenceUrl eq ''?undef:$referenceUrl=~m/^-$|^unknown$/i?undef:$referenceUrl:undef;
        if(ref($outer->{'Binary'}) eq 'ARRAY') {
            foreach my $binary (@{$outer->{'Binary'}}) {
                my $fileName=$binary->{'File_Name'}->{'text'};
                my $fileURL=$binary->{'Binary_Path'}->{'text'};
                my $pestName=$binary->{'Pest_Name'}->{'text'};
                my $md5=$binary->{'Hash'}->{'MD5'}->{'text'};
                my $fileSize=$binary->{'File_Size'}->{'text'};
                $fileName=defined $fileName?$fileName eq ''?undef:$fileName=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$fileName:undef;
                $fileURL=defined $fileURL?$fileURL eq ''?undef:$fileURL=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$fileURL:undef;
                $pestName=defined $pestName?$pestName eq ''?undef:$pestName=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$pestName:undef;
                $pestName=$1 if defined $pestName && $pestName =~ m/Found potentially unwanted program (.*)\./;
                $md5=defined $md5?$md5 eq ''?undef:$md5=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$md5:undef;
                $fileSize=defined $fileSize?$fileSize eq ''?undef:$fileSize=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$fileSize=~m/^.[0-9]+$/?$fileSize:undef:undef;
                my $server_domainName=$binary->{'Server_Properties'}->{'Domain_Name'}->{'text'};
                my $server_hostName=$binary->{'Server_Properties'}->{'Host_Name'}->{'text'};
                my $server_ip=$binary->{'Server_Properties'}->{'IP'}->{'text'};
                my $server_ISP=$binary->{'Server_Properties'}->{'ISP_Data'}->{'ISP'}->{'text'};
                my $server_numBinaries=$binary->{'Server_Properties'}->{'ISP_Data'}->{'Number_Hosted_Binaries'}->{'text'};
                my $server_zipCode=$binary->{'Server_Properties'}->{'ISP_Data'}->{'Zip_Code'}->{'text'} if exists $binary->{'Server_Properties'}->{'ISP_Data'}->{'Zip_Code'}->{'text'};
                my $server_city=$binary->{'Server_Properties'}->{'ISP_Data'}->{'City'}->{'text'} if exists $binary->{'Server_Properties'}->{'ISP_Data'}->{'City'}->{'text'};
                my $server_region=$binary->{'Server_Properties'}->{'ISP_Data'}->{'Region'}->{'text'} if exists $binary->{'Server_Properties'}->{'ISP_Data'}->{'Region'}->{'text'};
                my $server_country=$binary->{'Server_Properties'}->{'ISP_Data'}->{'Country'}->{'text'} if exists $binary->{'Server_Properties'}->{'ISP_Data'}->{'Country'}->{'text'};
                my $server_numSitesHosted=$binary->{'Server_Properties'}->{'ISP_Data'}->{'Number_Hosted_Sites'}->{'text'} if exists $binary->{'Server_Properties'}->{'ISP_Data'}->{'Number_Hosted_Sites'}->{'text'};
                my $webServer=$binary->{'Server_Properties'}->{'ISP_Data'}->{'Web_Server_info'}->{'text'};
                $server_domainName=defined $server_domainName?$server_domainName eq ''?undef:$server_domainName=~m/^-$|^unknown$/i?undef:$server_domainName:undef;
                $server_hostName=defined $server_hostName?$server_hostName eq ''?undef:$server_hostName=~m/^-$|^unknown$/i?undef:$server_hostName:undef;
                $server_ip=defined $server_ip?$server_ip eq ''?undef:$server_ip=~m/^-$|^unknown$/i?undef:$server_ip:undef;
                $server_ISP=defined $server_ISP?$server_ISP eq ''?undef:$server_ISP=~m/^-$|^unknown$/i?undef:$server_ISP:undef;
                $server_numBinaries=defined $server_numBinaries?$server_numBinaries eq ''?'1':$server_numBinaries=~m/^-$|^unknown$/i?'1':$server_numBinaries=~m/^.[0-9]+$/?$server_numBinaries:'1':'1';
                $server_zipCode=defined $server_zipCode?$server_zipCode eq ''?undef:$server_zipCode=~m/^-$|^unknown$/i?undef:$server_zipCode:undef;
                $server_city=defined $server_city?$server_city eq ''?undef:$server_city=~m/^-$|^unknown$/i?undef:$server_city:undef;
                $server_region=defined $server_region?$server_region eq ''?undef:$server_region=~m/^-$|^unknown$/i?undef:$server_region:undef;
                $server_country=defined $server_country?$server_country eq ''?undef:$server_country=~m/^-$|^unknown$/i?undef:$server_country:undef;
                $server_numSitesHosted=defined $server_numSitesHosted?$server_numSitesHosted eq ''?'1':$server_numSitesHosted=~m/^-$|^unknown$/i?'1':$server_numSitesHosted=~m/^.[0-9]+$/?$server_numSitesHosted:'1':'1';
                $webServer=defined $webServer?$webServer eq ''?'unknown':$webServer=~m/^-$|^unknown$/i?'unknown':$webServer:'unknown';
                $server_country =~ s/\s/_/g if defined $server_country;
                my (%avDetections,%threatTypes,%classes);
                next if !defined $binary->{'Class'};
                foreach(keys $binary->{'Class'}) {
                    $classes{$_}=1 if $binary->{'Class'}->{$_}->{'text'} == 1;
                }
                foreach(keys $binary->{'Anti-Virus'}) {
                    $avDetections{$_}->{'Signature_Version'}=$binary->{'Anti-Virus'}->{$_}->{'Signature_Version'} unless $binary->{'Anti-Virus'}->{$_}->{'Signature_Version'} eq '';
                    $avDetections{$_}->{'Engine_Version'}=$binary->{'Anti-Virus'}->{$_}->{'Engine_Version'} unless $binary->{'Anti-Virus'}->{$_}->{'Engine_Version'} eq '';
                    $avDetections{$_}->{'Threat_Name'}=$binary->{'Anti-Virus'}->{$_}->{'Threat_Name'} unless $binary->{'Anti-Virus'}->{$_}->{'Threat_Name'} eq '';
                }
                foreach(keys $binary->{'Type'}) {
                    $threatTypes{$_}=1 if $binary->{'Type'}->{$_}->{'text'} == 1;
                }
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasDomainName> <http://cs.org/domain#$domainName> .\n| if defined $domainName;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasExploitType> <http://cs.org/exploitAttempted#$exploitType> .\n| if defined $exploitType;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasExploitDescription> "$exploitDescription" .\n| if defined $exploitDescription;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/DTGstart> "$inspectedTime"^^<http://www.w3.org/2001/XMLSchema#dateTime> .\n| if defined $inspectedTime;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasIpAddr> <http://cs.org/ipv4#$ip> .\n| if defined $ip;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasHostName> <http://cs.org/host#$hostName> .\n| if defined $hostName;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasURL> <http://cs.org/url#$referenceUrl> .\n| if defined $referenceUrl;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasFile> <http://cs.org/file#$fileName> .\n| if defined $fileName;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasFileURL> <http://cs.org/url#$fileURL> .\n| if defined $fileURL;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasFileMD5> <http://cs.org/MD5#$md5> .\n| if defined $md5;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasFileSize> "$fileSize"^^<http://www.w3.org/2001/XMLSchema#integer> .\n| if defined $fileSize;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasPestName> <http://cs.org/pest_name#$pestName> .\n| if defined $pestName;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasWebServer> "$webServer" .\n| if defined $webServer;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerDomain> <http://cs.org/domain#$server_domainName> .\n| if defined $server_domainName;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerHostName> <http://cs.org/host#$server_hostName> .\n| if defined $server_hostName;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerIpAddr> <http://cs.org/ipv4#$server_ip> .\n| if defined $server_ip;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerNumSites> "$server_numSitesHosted"^^<http://www.w3.org/2001/XMLSchema#integer> .\n| if defined $server_numSitesHosted;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerNumBinaries> "$server_numBinaries"^^<http://www.w3.org/2001/XMLSchema#integer> .\n| if defined $server_numBinaries;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasISP> "$server_ISP" .\n| if defined $server_ISP;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerZipCode> "$server_zipCode" .\n| if defined $server_zipCode;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerCity> <http://cs.org/city#$server_city> .\n| if defined $server_city;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerRegion> <http://cs.org/city#$server_region> .\n| if defined $server_region;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerCountry> <http://cs.org/country#$server_country> .\n| if defined $server_country;
                $triples .= qq|<http://cs.org/country#$server_country> <http://cs.org/p/hasServerRegion> <http://cs.org/city#$server_region> .\n| unless (!defined $server_country || !defined $server_region) || exists $countries{$server_country}->{'regions'}->{$server_region};
                $triples .= qq|<http://cs.org/region#$server_region> <http://cs.org/p/hasServerCity> <http://cs.org/city#$server_city> .\n| unless (!defined $server_region || !defined $server_city) || exists $countries{$server_country}->{'cities'}->{$server_city};
                $triples .= qq|<http://cs.org/city#$server_city> <http://cs.org/p/hasServerZipCode> <http://cs.org/city#$server_zipCode> .\n| unless (!defined $server_city || !defined $server_zipCode) || exists $countries{$server_country}->{'zipcodes'}->{$server_zipCode};
                $countries{$server_country}->{'regions'}->{$server_region}=1 if defined $server_region && defined $server_country;
                $countries{$server_country}->{'cities'}->{$server_city}=1 if defined $server_city && defined $server_country;
                $countries{$server_country}->{'zipcodes'}->{$server_zipCode}=1 if defined $server_zipCode && defined $server_country;
                $triples .= qq|<http:cs.org/file#$fileName> <http://cs.org/p/detectedAs> <http://cs.org/pest_name#$pestName> .\n| if (defined $fileName && defined $pestName) && (!exists $avFiles{$pestName} || $avFiles{$pestName} ne $pestName);
                $avFiles{$pestName}=$fileName if defined $fileName && defined $pestName;
                foreach(keys %avDetections) {
                    my $sig=$avDetections{$_}->{'Signature_Version'};
                    my $eng=$avDetections{$_}->{'Engine_Version'};
                    my $tn=$avDetections{$_}->{'Threat_Name'};
                    $tn =~ s/\s/_/g if defined $tn;
                    $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/detectedBy> <http://cs.org/AV#$_> .\n|;
                    $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasAvEngineVersion> "$eng" .\n| if defined $eng;
                    $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasAvSigVersion> "$sig" .\n| if defined $sig;
                    $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/detectedAs> <http://cs.org/avThreat_name#$tn> .\n| if defined $tn;
                    $triples .= qq|<http://cs.org/file#$fileName> <http://cs.org/p/detectedBy> <http://cs.org/AV#$_> .\n| unless !defined $fileName || exists $avDetails{$fileName}->{'avDetection'}->{$_};
                    $triples .= qq|<http://cs.org/file#$fileName> <http://cs.org/p/detectedAs> <http://cs.org/avThreat_name#$tn> .\n| unless (!defined $tn || !defined $fileName) || exists $avDetails{$fileName}->{'avThreatName'}->{$tn};
                    $avDetails{$fileName}->{'avDetection'}->{$_}=1 if defined $fileName;
                    $avDetails{$fileName}->{'avThreatName'}->{$tn}=1 if defined $tn && defined $fileName;
                    $avFiles{$tn}=$fileName if defined $fileName && defined $tn;
                }
                foreach(keys %threatTypes) {
                    $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasThreatType> <http://cs.org/threatType#$_> .\n|;
                    $triples .= qq|<http://cs.org/file#$fileName> <http://cs.org/p/hasThreatType> <http://cs.org/threatType#$_> .\n| unless !defined $fileName || exists $avDetails{$fileName}->{'avThreatType'}->{$_};
                    $avDetails{$fileName}->{'avThreatType'}->{$_}=1 if defined $fileName;
                }
                foreach(keys %classes) {
                    $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasThreatClass> <http://cs.org/threatClass#$_> .\n|;
                    $triples .= qq|<http://cs.org/file#$fileName> <http://cs.org/p/hasThreatClass> <http://cs.org/threatClass#$_> .\n| unless !defined $fileName || exists $avDetails{$fileName}->{'avThreatClass'}->{$_};
                    $avDetails{$fileName}->{'avThreatClass'}->{$_}=1 if defined $fileName;
                }
                $similar{$domainName}='domain' if defined $domainName;
                $similar{$hostName}='host' if defined $hostName;
                $similar{$fileName}='file' if defined $fileName;
                $similar{$pestName}='pest_name' if defined $pestName;
                $similar{$server_domainName}='domain' if defined $server_domainName;
                $similar{$server_hostName}='host' if defined $server_hostName;
                $recordCount++;
            }
        }
        else {
            my $fileName=$outer->{'Binary'}->{'File_Name'}->{'text'};
            my $fileURL=$outer->{'Binary'}->{'Binary_Path'}->{'text'};
            my $pestName=$outer->{'Binary'}->{'Pest_Name'}->{'text'};
            my $md5=$outer->{'Binary'}->{'Hash'}->{'MD5'}->{'text'};
            my $fileSize=$outer->{'Binary'}->{'File_Size'}->{'text'};
            $fileName=defined $fileName?$fileName eq ''?undef:$fileName=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$fileName:undef;
            $fileURL=defined $fileURL?$fileURL eq ''?undef:$fileURL=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$fileURL:undef;
            $pestName=defined $pestName?$pestName eq ''?undef:$pestName=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$pestName:undef;
            $pestName=$1 if defined $pestName && $pestName =~ m/Found potentially unwanted program (.*)\./;
            $md5=defined $md5?$md5 eq ''?undef:$md5=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$md5:undef;
            $fileSize=defined $fileSize?$fileSize eq ''?undef:$fileSize=~m/^-$|^unknown$|^Unidentified Threat$/i?undef:$fileSize=~m/^.[0-9]+$/?$fileSize:undef:undef;
            my $server_domainName=$outer->{'Binary'}->{'Server_Properties'}->{'Domain_Name'}->{'text'};
            my $server_hostName=$outer->{'Binary'}->{'Server_Properties'}->{'Host_Name'}->{'text'};
            my $server_ip=$outer->{'Binary'}->{'Server_Properties'}->{'IP'}->{'text'};
            my $server_ISP=$outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'ISP'}->{'text'};
            my $server_numBinaries=$outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Number_Hosted_Binaries'}->{'text'};
            my $server_city=$outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'City'}->{'text'} if exists $outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'City'}->{'text'};
            my $server_country=$outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Country'}->{'text'} if exists $outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Country'}->{'text'};
            my $server_zipCode=$outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Zip_Code'}->{'text'} if exists $outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Zip_Code'}->{'text'};
            my $server_region=$outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Region'}->{'text'} if exists $outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Region'}->{'text'};
            my $server_numSitesHosted=$outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Number_Hosted_Sites'}->{'text'} if exists $outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Number_Hosted_Sites'}->{'text'};
            my $webServer=$outer->{'Binary'}->{'Server_Properties'}->{'ISP_Data'}->{'Web_Server_Info'}->{'text'};
            $server_domainName=defined $server_domainName?$server_domainName eq ''?undef:$server_domainName=~m/^-$|^unknown$/i?undef:$server_domainName:undef;
            $server_hostName=defined $server_hostName?$server_hostName eq ''?undef:$server_hostName=~m/^-$|^unknown$/i?undef:$server_hostName:undef;
            $server_ip=defined $server_ip?$server_ip eq ''?undef:$server_ip=~m/^-$|^unknown$/i?undef:$server_ip:undef;
            $server_ISP=defined $server_ISP?$server_ISP eq ''?undef:$server_ISP=~m/^-$|^unknown$/i?undef:$server_ISP:undef;
            $server_numBinaries=defined $server_numBinaries?$server_numBinaries eq ''?'1':$server_numBinaries=~m/^-$|^unknown$/i?'1':$server_numBinaries=~m/^.[0-9]+$/?$server_numBinaries:'1':'1';
            $server_zipCode=defined $server_zipCode?$server_zipCode eq ''?undef:$server_zipCode=~m/^-$|^unknown$/i?undef:$server_zipCode:undef;
            $server_city=defined $server_city?$server_city eq ''?undef:$server_city=~m/^-$|^unknown$/i?undef:$server_city:undef;
            $server_region=defined $server_region?$server_region eq ''?undef:$server_region=~m/^-$|^unknown$/i?undef:$server_region:undef;
            $server_country=defined $server_country?$server_country eq ''?undef:$server_country=~m/^-$|^unknown$/i?undef:$server_country:undef;
            $server_numSitesHosted=defined $server_numSitesHosted?$server_numSitesHosted eq ''?'1':$server_numSitesHosted=~m/^-$|^unknown$/i?'1':$server_numSitesHosted=~m/^.[0-9]+$/?$server_numSitesHosted:'1':'1';
            $webServer=defined $webServer?$webServer eq ''?'unknown':$webServer=~m/^-$|^unknown$/i?'unknown':$webServer:'unknown';
            $server_country =~ s/\s/_/g if defined $server_country;
            my (%avDetections,%threatTypes,%classes);
            next if !defined $outer->{'Binary'}->{'Class'};
            foreach(keys $outer->{'Binary'}->{'Class'}) {
                $classes{$_}=1 if $outer->{'Binary'}->{'Class'}->{$_}->{'text'} == 1;
            }
            foreach(keys $outer->{'Binary'}->{'Anti-Virus'}) {
                $avDetections{$_}->{'Signature_Version'}=$outer->{'Binary'}->{'Anti-Virus'}->{$_}->{'Signature_Version'} unless $outer->{'Binary'}->{'Anti-Virus'}->{$_}->{'Signature_Version'} eq '';
                $avDetections{$_}->{'Engine_Version'}=$outer->{'Binary'}->{'Anti-Virus'}->{$_}->{'Engine_Version'} unless $outer->{'Binary'}->{'Anti-Virus'}->{$_}->{'Engine_Version'} eq '';
                $avDetections{$_}->{'Threat_Name'}=$outer->{'Binary'}->{'Anti-Virus'}->{$_}->{'Threat_Name'} unless $outer->{'Binary'}->{'Anti-Virus'}->{$_}->{'Threat_Name'} eq '';
            }
            foreach(keys $outer->{'Binary'}->{'Type'}) {
                $threatTypes{$_}=1 if $outer->{'Binary'}->{'Type'}->{$_}->{'text'} == 1;
            }
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasDomainName> <http://cs.org/domain#$domainName> .\n| if defined $domainName;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasExploitType> <http://cs.org/exploitAttempted#$exploitType> .\n| if defined $exploitType;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasExploitDescription> "$exploitDescription" .\n| if defined $exploitDescription;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/DTGstart> "$inspectedTime"^^<http://www.w3.org/2001/XMLSchema#dateTime> .\n| if defined $inspectedTime;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasIpAddr> <http://cs.org/ipv4#$ip> .\n| if defined $ip;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasHostName> <http://cs.org/host#$hostName> .\n| if defined $hostName;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasURL> <http://cs.org/url#$referenceUrl> .\n| if defined $referenceUrl;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasFile> <http://cs.org/file#$fileName> .\n| if defined $fileName;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasFileURL> <http://cs.org/url#$fileURL> .\n| if defined $fileURL;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasFileMD5> <http://cs.org/MD5#$md5> .\n| if defined $md5;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasFileSize> "$fileSize"^^<http://www.w3.org/2001/XMLSchema#integer> .\n| if defined $fileSize;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasPestName> <http://cs.org/pest_name#$pestName> .\n| if defined $pestName;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasWebServer> "$webServer" .\n| if defined $webServer;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerDomain> <http://cs.org/domain#$server_domainName> .\n| if defined $server_domainName;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerHostName> <http://cs.org/host#$server_hostName> .\n| if defined $server_hostName;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerIpAddr> <http://cs.org/ipv4#$server_ip> .\n| if defined $server_ip;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerNumSites> "$server_numSitesHosted"^^<http://www.w3.org/2001/XMLSchema#integer> .\n| if defined $server_numSitesHosted;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerNumBinaries> "$server_numBinaries"^^<http://www.w3.org/2001/XMLSchema#integer> .\n| if defined $server_numBinaries;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasISP> "$server_ISP" .\n| if defined $server_ISP;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerZipCode> "$server_zipCode" .\n| if defined $server_zipCode;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerCity> <http://cs.org/city#$server_city> .\n| if defined $server_city;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerRegion> <http://cs.org/city#$server_region> .\n| if defined $server_region;
            $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasServerCountry> <http://cs.org/country#$server_country> .\n| if defined $server_country;
            $triples .= qq|<http://cs.org/country#$server_country> <http://cs.org/p/hasServerRegion> <http://cs.org/city#$server_region> .\n| unless (!defined $server_country || !defined $server_region) || exists $countries{$server_country}->{'regions'}->{$server_region};
            $triples .= qq|<http://cs.org/region#$server_region> <http://cs.org/p/hasServerCity> <http://cs.org/city#$server_city> .\n| unless (!defined $server_region || !defined $server_city) || exists $countries{$server_country}->{'cities'}->{$server_city};
            $triples .= qq|<http://cs.org/city#$server_city> <http://cs.org/p/hasServerZipCode> <http://cs.org/city#$server_zipCode> .\n| unless (!defined $server_city || !defined $server_zipCode) || exists $countries{$server_country}->{'zipcodes'}->{$server_zipCode};
            $countries{$server_country}->{'regions'}->{$server_region}=1 if defined $server_region && defined $server_country;
            $countries{$server_country}->{'cities'}->{$server_city}=1 if defined $server_city && defined $server_country;
            $countries{$server_country}->{'zipcodes'}->{$server_zipCode}=1 if defined $server_zipCode && defined $server_country;
            $triples .= qq|<http:cs.org/file#$fileName> <http://cs.org/p/detectedAs> <http://cs.org/pest_name#$pestName> .\n| if (defined $fileName && defined $pestName) && (!exists $avFiles{$pestName} || $avFiles{$pestName} ne $pestName);
            $avFiles{$pestName}=$fileName if defined $fileName && defined $pestName;
            foreach(keys %avDetections) {
                my $sig=$avDetections{$_}->{'Signature_Version'};
                my $eng=$avDetections{$_}->{'Engine_Version'};
                my $tn=$avDetections{$_}->{'Threat_Name'};
                $tn =~ s/\s/_/g if defined $tn;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/detectedBy> <http://cs.org/AV#$_> .\n|;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasAvEngineVersion> "$eng" .\n| if defined $eng;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasAvSigVersion> "$sig" .\n| if defined $sig;
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/detectedAs> <http://cs.org/avThreat_name#$tn> .\n| if defined $tn;
                $triples .= qq|<http://cs.org/file#$fileName> <http://cs.org/p/detectedBy> <http://cs.org/AV#$_> .\n| unless !defined $fileName || exists $avDetails{$fileName}->{'avDetection'}->{$_};
                $triples .= qq|<http://cs.org/file#$fileName> <http://cs.org/p/detectedAs> <http://cs.org/avThreat_name#$tn> .\n| unless (!defined $tn || !defined $fileName) || exists $avDetails{$fileName}->{'avThreatName'}->{$tn};
                $avDetails{$fileName}->{'avDetection'}->{$_}=1 if defined $fileName;
                $avDetails{$fileName}->{'avThreatName'}->{$tn}=1 if defined $tn && defined $fileName;
                $avFiles{$tn}=$fileName if defined $fileName && defined $tn;
            }
            foreach(keys %threatTypes) {
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasThreatType> <http://cs.org/threatType#$_> .\n|;
                $triples .= qq|<http://cs.org/file#$fileName> <http://cs.org/p/hasThreatType> <http://cs.org/threatType#$_> .\n| unless !defined $fileName || exists $avDetails{$fileName}->{'avThreatType'}->{$_};
                $avDetails{$fileName}->{'avThreatType'}->{$_}=1 if defined $fileName;
            }
            foreach(keys %classes) {
                $triples .= qq|<http://cs.org/record#$recordCount> <http://cs.org/p/hasThreatClass> <http://cs.org/threatClass#$_> .\n|;
                $triples .= qq|<http://cs.org/file#$fileName> <http://cs.org/p/hasThreatClass> <http://cs.org/threatClass#$_> .\n| unless !defined $fileName || exists $avDetails{$fileName}->{'avThreatClass'}->{$_};
                $avDetails{$fileName}->{'avThreatClass'}->{$_}=1 if defined $fileName;
            }
            $similar{$domainName}='domain' if defined $domainName;
            $similar{$hostName}='host' if defined $hostName;
            $similar{$fileName}='file' if defined $fileName;
            $similar{$pestName}='pest_name' if defined $pestName;
            $similar{$server_domainName}='domain' if defined $server_domainName;
            $similar{$server_hostName}='host' if defined $server_hostName;
            $recordCount++;
        }
    }
    $xml_converter = undef;
    print "FINISHED: $inFile\n";
    return($triples);
}
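The nested ternaries that normalise each field in procXml all follow the same pattern: fall back to a default when the value is undefined, empty, "-" or "unknown", and optionally apply one more pattern check. A minimal sketch of how that could be factored out follows; `clean_value` and its `default`, `reject` and `must_match` options are hypothetical names, not part of the posted code.

```perl
use strict;
use warnings;

# Hypothetical helper: return $value unless it is missing or a placeholder,
# in which case return the default (undef unless one is supplied).
sub clean_value {
    my ($value, %opt) = @_;
    my $default = $opt{default};
    return $default unless defined $value && $value ne '';
    return $default if $value =~ m/^-$|^unknown$/i;
    # Optional extra placeholder pattern, e.g. qr/^Unidentified Threat$/i
    return $default if $opt{reject} && $value =~ $opt{reject};
    # Optional format requirement, e.g. digits only
    return $default if $opt{must_match} && $value !~ $opt{must_match};
    return $value;
}

my $ip       = clean_value('unknown');                        # undef
my $fileName = clean_value('Unidentified Threat',
                           reject => qr/^Unidentified Threat$/i);  # undef
my $numSites = clean_value('12', default => '1',
                           must_match => qr/^[0-9]+$/);       # '12'
```

The posted code checks numeric fields with `m/^.[0-9]+$/`, which needs at least two characters to match; whether that is intentional is not clear from the post, so the sketch leaves the pattern to the caller.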

Replies are listed 'Best First'.
Re^4: dynamic number of threads based on CPU utilization
by BrowserUk (Pope) on Sep 26, 2012 at 16:42 UTC
    "...I thought that since the procXml sub worked just fine, it would not be relevant to the discussion or potential solution."

    You were mostly right. The only relevance it has is that nowhere in that code do I see any sign of locking (the keyword 'lock' does not appear), which means that multiple threads are writing to a shared hash and there is nothing to prevent them from corrupting data through collisions.

    You may 'get away with it', but I wouldn't want to be responsible for when things go wrong.
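    The locking described above can be sketched as follows. This is an illustration under assumptions, not the original script: it assumes `%similar` and `$recordCount` are the `threads::shared` variables that procXml writes to, the helper names `next_record_id` and `remember` are hypothetical, and it needs a perl built with thread support.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Names taken from the posted code; declaring them :shared is an assumption.
my %similar     :shared;
my $recordCount :shared = 0;

# Hypothetical helper: hand out the next record number under a lock so two
# threads can never read and increment the counter at the same time.
sub next_record_id {
    lock($recordCount);
    return $recordCount++;
}

# Hypothetical helper: update the shared hash while holding its lock.
sub remember {
    my ($key, $kind) = @_;
    lock(%similar);
    $similar{$key} = $kind;
    return;
}

my @workers = map {
    my $id = $_;
    threads->create(sub {
        for my $i (1 .. 1000) {
            my $record = next_record_id();
            remember("host-$id-$i", 'host');
        }
    });
} 1 .. 4;
$_->join for @workers;

printf "records=%d keys=%d\n", $recordCount, scalar keys %similar;
# prints "records=4000 keys=4000"
```

    Each lock is released automatically when the enclosing sub returns, so the critical sections stay short and the threads spend most of their time outside them.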


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    RIP Neil Armstrong
