I have an array which Dumps to this
What you show is the contents of a hashref, not an array. Perhaps you're referring to the anonymous array under 'hits'?
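If you're ever unsure what kind of reference you're holding, ref() will tell you. Here is a minimal sketch; the tiny structure below is a made-up stand-in for your real response, not your actual data:

```perl
use strict; use warnings; use feature 'say';

# Hypothetical stand-in with the same nesting as the Elasticsearch response
my $results = { hits => { total => 1, hits => [ { _source => {} } ] } };

say ref $results;                           # HASH
say ref $results->{hits};                   # HASH
say ref $results->{hits}{hits};             # ARRAY
say scalar @{ $results->{hits}{hits} };     # 1 (number of elements)
```

So the top level and 'hits' are hashrefs, and only the inner 'hits' is an array reference.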
Let's say your hashref that you get back from Elasticsearch is called $results. Let's assume you want to know which IPs requested which resources, and you'd like a count. The following parses your hashref and produces another hash containing your desired data.
use strict; use warnings; use feature 'say';
use Data::Dumper; $Data::Dumper::Indent = $Data::Dumper::Sortkeys = 1;
my $results = load_results();
my %interesting;
# extract interesting data, record count of hits per IP per request
for my $hit ( map { $_->{'_source'} } @{ $results->{'hits'}->{'hits'} } ) {
    $interesting{ $hit->{'request'} }->{ $hit->{'clientip'} }++;
}
say Dumper \%interesting;
sub load_results {
    return {
        '_shards' => {
            'skipped' => 0,
            'successful' => 5,
            'total' => 5,
            'failed' => 0
        },
        'hits' => {
            'hits' => [
                {
                    '_id' => 'AV6SrwuTv7sBjjRqMiW1',
                    '_source' => {
                        'request' => '/index.php',
                        'clientip' => '192.168.1.1'
                    },
                    '_type' => 'nginx',
                    '_index' => 'nginx-2017.09.18',
                    '_score' => '4.238926'
                },
                {
                    '_id' => 'AV6SrwuTv7sBjjRqMiW1',
                    '_source' => {
                        'request' => '/index.php',
                        'clientip' => '192.168.1.1'
                    },
                    '_type' => 'nginx',
                    '_index' => 'nginx-2017.09.18',
                    '_score' => '4.238926'
                },
                {
                    '_id' => 'AV6UL-DOv7sBjjRqMidb',
                    '_source' => {
                        'clientip' => '192.168.1.1',
                        'request' => '/'
                    },
                    '_score' => '4.189655',
                    '_type' => 'nginx',
                    '_index' => 'nginx-2017.09.18'
                },
                {
                    '_id' => 'AV6SrwuTv7sBjjRqMiW1',
                    '_source' => {
                        'request' => '/',
                        'clientip' => '192.168.1.2'
                    },
                    '_type' => 'nginx',
                    '_index' => 'nginx-2017.09.18',
                    '_score' => '4.238926'
                },
            ],
            'total' => 2,
            'max_score' => '4.238926'
        },
        'took' => 0,
        'timed_out' => undef
    };
} # end sub
__END__
The key lines are:
for my $hit ( map { $_->{'_source'} } @{ $results->{'hits'}->{'hits'} } ) {
    $interesting{ $hit->{'request'} }->{ $hit->{'clientip'} }++;
}
To understand a complex expression involving a loop, it's often helpful to read it from right to left.
- Here you start with the value of $results->{'hits'}->{'hits'}, which is a reference to the anonymous array you're interested in.
- This is dereferenced with @{ ... }, which yields the array's elements as a list we can work with.
- map then loops through that list, aliasing each element to $_ and returning the result of the expression $_->{'_source'}, which is the hashref containing the two k-v pairs you want.
- Then on the next line you populate the hash %interesting with a key for each unique value of 'request', and a value of a hashref, with its own subkeys for each 'clientip' that requested the resource.
- The value for each of those subkeys is incremented by one each time the record is found. The inner hashref is autovivified by Perl on first use, and ++ treats the not-yet-existing value as 0.
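If the one-liner is still hard to follow, the same steps can be spelled out with intermediate variables. This is only a sketch: the names $hits_ref, @hits and @sources are mine, and the two sample hits are made up to match the shape of your data.

```perl
use strict; use warnings; use feature 'say';

# Hypothetical sample data with the same shape as the response
my $results = { hits => { hits => [
    { _source => { request => '/', clientip => '192.168.1.1' } },
    { _source => { request => '/', clientip => '192.168.1.1' } },
] } };

my $hits_ref = $results->{'hits'}->{'hits'};     # reference to the anonymous array
my @hits     = @{ $hits_ref };                   # dereferenced: a list of hashrefs
my @sources  = map { $_->{'_source'} } @hits;    # keep only the '_source' hashrefs

my %interesting;
for my $hit (@sources) {
    # The inner hashref and the counter both spring into existence on first use
    $interesting{ $hit->{'request'} }->{ $hit->{'clientip'} }++;
}

say $interesting{'/'}->{'192.168.1.1'};          # 2
```

The loop over @sources does exactly what the original loop does; the one-line version just skips the named temporaries.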
The loop produces a data structure like this (I added a couple of hits to flesh out the results):
$VAR1 = {
'/' => {
'192.168.1.1' => 1,
'192.168.1.2' => 1
},
'/index.php' => {
'192.168.1.1' => 2
}
};
You can then loop through that structure and print a report, or whatever else you need:
[ ... ]
for my $hit ( map { $_->{'_source'} } @{ $results->{'hits'}->{'hits'} } ) {
    $interesting{ $hit->{'request'} }->{ $hit->{'clientip'} }++;
}
# sort the keys so the report comes out in a predictable order
for my $resource ( sort keys %interesting ) {
    say "Resource: $resource";
    for my $ip ( sort keys %{ $interesting{ $resource } } ) {
        say "\t$ip made $interesting{ $resource }->{ $ip } requests";
    }
}
Output:
Resource: /
	192.168.1.1 made 1 requests
	192.168.1.2 made 1 requests
Resource: /index.php
	192.168.1.1 made 2 requests
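If you'd rather see the busiest clients first, you can sort the inner keys by their counts. Again just a sketch: the hash below is copied from the Dumper output above, and the tie-break on the IP string, the @report array and the singular/plural handling are my own additions.

```perl
use strict; use warnings; use feature 'say';

# Same shape as the Dumper output above
my %interesting = (
    '/'          => { '192.168.1.1' => 1, '192.168.1.2' => 1 },
    '/index.php' => { '192.168.1.1' => 2 },
);

my @report;
for my $resource ( sort keys %interesting ) {
    my $by_ip = $interesting{$resource};
    push @report, "Resource: $resource";
    # Highest count first; ties broken by IP string so the order is stable
    for my $ip ( sort { $by_ip->{$b} <=> $by_ip->{$a} or $a cmp $b } keys %$by_ip ) {
        my $n = $by_ip->{$ip};
        push @report, sprintf "\t%s made %d %s", $ip, $n, $n == 1 ? 'request' : 'requests';
    }
}

say for @report;
```

Collecting the lines in @report before printing also makes it easy to write them to a file instead.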
Hope this helps!