json data: dereferencing arrays

by hulketa (Initiate)
on May 14, 2013 at 10:52 UTC (#1033446)
hulketa has asked for the wisdom of the Perl Monks concerning the following question:

Hi perlmonkers! I'm in trouble again accessing JSON data (dereferencing arrays of arrays). I have this JSON data:

    [
      {
        "PDBS": [
          {
            "BLOCKS": [
              { "PSTART": "1571", "PEND": "1589" }
            ],
            "PDB_CHAIN": "B",
            "PDB_TITLE": "Solution Structure of the Complex of the PTB Domain of SNT-2 and 19-Residue Peptide (Aa 1571-1589) of HALK",
            "PDB_ID": "2KUP"
          },
          {
            "BLOCKS": [
              { "PSTART": "1095", "PEND": "1136" },
              { "PSTART": "1144", "PEND": "1274" },
              { "PSTART": "1289", "PEND": "1401" }
            ],
            "PDB_CHAIN": "A",
            "PDB_TITLE": "Structure of Human Anaplastic Lymphoma Kinase in Complex with Nvp-tae684",
            "PDB_ID": "2XB7"
          }
        ],
        "PROT_NAME": "ALK"
      }
    ]

and I would like to parse each block and, if my position (in this case 1200) falls inside the block (that is, the position has to be < PEND and > PSTART), print the PDB_ID. In this case the script should print 2XB7. It would be something like this, but it doesn't work, and I think the problem is in dereferencing the arrays.

    my $json;
    {
        local $/;
        open my $fh, "<", "struc_cover_edu.json";
        $json = <$fh>;
        close $fh;
    }
    my $data = JSON->new->decode($json);
    my $position = 1200;
    for my $s (@$data) {
        next unless $s->{PROT_NAME} eq 'ALK';
        foreach my $p (0 .. $#{$s->{PDBS}}) {
            foreach my $b (0 .. $#{$s->{BLOCKS}}) {
                my @blocks = @{ $s->{BLOCKS}};
                if ($position >= $blocks[$p][$b]{PSTART} && $position <= $blocks[$p][$b]{PEND} ){
                    print ($b->{PDB_ID},"\n");
                }
            }
        }
    }

Thank you so much!!

Replies are listed 'Best First'.
Re: json data: dereferencing arrays
by hdb (Prior) on May 14, 2013 at 11:01 UTC

    You got the levels wrong. Change $s->{BLOCKS} to $p->{BLOCKS} and $b->{PDB_ID} to $p->{PDB_ID} and $blocks[$p][$b]{PSTART} to $blocks[$b]{PSTART}.

    UPDATE: I got them wrong as well. Here is code that works:

        my $position = 1200;
        for my $s (@$data) {
            next unless $s->{PROT_NAME} eq 'ALK';
            foreach my $p (0 .. $#{$s->{PDBS}}) {
                foreach my $b (0 .. $#{$s->{PDBS}[$p]->{BLOCKS}}) {
                    my @blocks = @{ $s->{PDBS}[$p]->{BLOCKS}};
                    if ($position >= $blocks[$b]{PSTART} && $position <= $blocks[$b]{PEND} ){
                        print ($s->{PDBS}[$p]->{PDB_ID},"\n");
                    }
                }
            }
        }

      Ooh, thank you so much! I can see my errors now. You saved my life! Thanks, thanks!

        No reason to get excited. Here is how I would usually do it by iterating directly over the structure.

            my $position = 1200;
            for my $s (@$data) {
                next unless $s->{PROT_NAME} eq 'ALK';
                foreach my $p ( @{ $s->{PDBS} } ) {
                    foreach my $b ( @{ $p->{BLOCKS} } ) {
                        if ($position >= $b->{PSTART} && $position <= $b->{PEND} ) {
                            print ($p->{PDB_ID},"\n");
                        }
                    }
                }
            }
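
For anyone who wants to run the direct-iteration approach end to end, here is a minimal self-contained sketch. It makes a few assumptions that are not in the thread: the JSON is embedded in a heredoc (with the PDB_TITLE fields left out) instead of being read from struc_cover_edu.json, the core module JSON::PP stands in for JSON, and the helper name `pdb_ids_covering` is made up for illustration. It also reports each PDB_ID at most once, which matches the loops above as long as the blocks of one entry do not overlap.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);    # core module; swapped in for JSON as an assumption

# Trimmed copy of the data from the question (PDB_TITLE omitted for brevity).
my $json = <<'JSON';
[ { "PROT_NAME": "ALK",
    "PDBS": [
      { "PDB_ID": "2KUP", "PDB_CHAIN": "B",
        "BLOCKS": [ { "PSTART": "1571", "PEND": "1589" } ] },
      { "PDB_ID": "2XB7", "PDB_CHAIN": "A",
        "BLOCKS": [ { "PSTART": "1095", "PEND": "1136" },
                    { "PSTART": "1144", "PEND": "1274" },
                    { "PSTART": "1289", "PEND": "1401" } ] }
    ] } ]
JSON

# Hypothetical helper: return the PDB_IDs of all entries for $prot_name
# that have at least one block containing $position (bounds inclusive).
sub pdb_ids_covering {
    my ($data, $prot_name, $position) = @_;
    my @ids;
    for my $s (@$data) {
        next unless $s->{PROT_NAME} eq $prot_name;
        for my $p (@{ $s->{PDBS} }) {
            # grep in boolean context: true if any block matches
            my $covered = grep {
                $position >= $_->{PSTART} && $position <= $_->{PEND}
            } @{ $p->{BLOCKS} };
            push @ids, $p->{PDB_ID} if $covered;
        }
    }
    return @ids;
}

my $data = decode_json($json);
print "$_\n" for pdb_ids_covering($data, 'ALK', 1200);    # prints 2XB7
```

Returning a list rather than printing inside the loops keeps the lookup reusable for other positions or protein names.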
Re: json data: dereferencing arrays
by sundialsvc4 (Abbot) on May 14, 2013 at 13:31 UTC

    As an aside, in situations such as this one I like to use "data walkers" whenever possible ... search for the keyword walker on CPAN and you will see what I mean. (Say, Data::Leaf::Walker ... there are 110 packages today to choose from.) This lets me "let the Walker do the walking" through the now-arbitrary data structure that I have been given, and call my subroutines only for the elements of interest to me, no matter where exactly they happen to be. If the data structure changes, my code still works. Some of these are simple; others more or less emulate the approach of "XPath expressions" in XML. This is by no means something that "you should always do", but it is a useful technique to keep in the back of your pocket protector.

    I used this technique very successfully to winnow through parse trees that were produced from a nasty collection of SAS® programs, Korn shell scripts, and Tivoli® workload schedules. Truly an ugly project to design and build, but it worked splendidly, thanks in very large part to this technique.
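
To make the walker idea concrete, here is a small hand-rolled sketch in core Perl. It is not the API of Data::Leaf::Walker or any other CPAN module; `walk_hashes` is a hypothetical helper written for this illustration, and the data is a trimmed copy of the structure from the question. The walker visits every hash, wherever it sits, and hands the callback the list of enclosing hashes so that a matching block can find the PDB entry it belongs to.

```perl
use strict;
use warnings;

# Hypothetical helper (not a CPAN API): visit every hash in a nested
# structure and call $callback->($hash, \@ancestor_hashes) for each one.
sub walk_hashes {
    my ($node, $callback, $ancestors) = @_;
    $ancestors ||= [];
    if (ref($node) eq 'HASH') {
        $callback->($node, $ancestors);
        # Descend into the values, recording this hash as an ancestor.
        walk_hashes($node->{$_}, $callback, [ @$ancestors, $node ])
            for sort keys %$node;
    }
    elsif (ref($node) eq 'ARRAY') {
        # Arrays add no ancestor of their own; just visit the elements.
        walk_hashes($_, $callback, $ancestors) for @$node;
    }
    return;
}

# Trimmed copy of the decoded data from the question.
my $data = [ {
    PROT_NAME => 'ALK',
    PDBS      => [
        { PDB_ID => '2KUP', BLOCKS => [ { PSTART => 1571, PEND => 1589 } ] },
        { PDB_ID => '2XB7', BLOCKS => [ { PSTART => 1095, PEND => 1136 },
                                        { PSTART => 1144, PEND => 1274 },
                                        { PSTART => 1289, PEND => 1401 } ] },
    ],
} ];

my $position = 1200;
my @hits;
walk_hashes($data, sub {
    my ($hash, $ancestors) = @_;
    # Only hashes that look like blocks are of interest.
    return unless exists $hash->{PSTART} && exists $hash->{PEND};
    return unless $position >= $hash->{PSTART} && $position <= $hash->{PEND};
    # Nearest enclosing hash that carries a PDB_ID.
    my ($pdb) = grep { exists $_->{PDB_ID} } reverse @$ancestors;
    push @hits, $pdb->{PDB_ID} if $pdb;
});

print "$_\n" for @hits;    # prints 2XB7
```

The callback never mentions PDBS or BLOCKS by name, so it would keep working if the blocks were moved to a different depth, which is the robustness the reply describes. Unlike the loops earlier in the thread, this sketch does not filter on PROT_NAME.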