Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

calculate distance between atoms and a target atom

by mishimakaz (Initiate)
on Dec 02, 2013 at 21:51 UTC ( #1065326=perlquestion: print w/ replies, xml ) Need Help??
mishimakaz has asked for the wisdom of the Perl Monks concerning the following question:

Hey all, I am a PERL newbie trying to create a script to calculate the distance between all HIS/ASP residues and a single Zinc atom embedded into a protein.

#!/usr/bin/perl -W use strict; use warnings; #subroutines for calculations sub distanceAB { my $distance = 0; my $Ax = substr($_[0], 30, 8) + 0; my $Ay = substr($_[0], 38, 8) + 0; my $Az = substr($_[0], 46, 8) + 0; my $Bx = substr($_[1], 30, 8) + 0; my $By = substr($_[1], 38, 8) + 0; my $Bz = substr($_[1], 46, 8) + 0; $distance = sqrt(($Ax - $Bx)**2 + ($Ay - $By)**2 + ($Az - $Bz) +**2); return sprintf("%4.2f", $distance); } #open files for calculations and modify distance cutoff for target res +idues my $input=$ARGV[0]; my $num = 0; my $i = 0; open IN, "<$input" or die "can't open .pdb"; my @pdblines = (); while (<IN>) { for ( @pdblines) { #chomp $pdbline; if (my $pdbline =~ /ZN1 LG1 X/) { my $ZNline = $pdbline; next; } #find xyz coordinates for other atoms and store in array if (my $pdbline =~ /^ATOM.*(OD2 ASP|NE2 HIS)/) { my $Atomline = $pdbline; my $resname = substr($pdbline, 16, 3); my $resnumber = substr($pdbline, 22, 3); #calculate Zn to each atom distance my $Zndistance=distanceAB(my $ZNline, $Atomline); if (my $Zndistance < 2.5) { print "$Zndistance \n"; print "$resname $resnumber \n"; print "Coordinator $num \n"; } } ++$num; } }

When I run the code, I get a lot of "Use of initialized value" errors, such as

Use of uninitialized value $Znx in array element at ./getRESZNdistance.pl line 58, <IN> line 2863.

Use of uninitialized value $Znz in array element at ./getRESZNdistance.pl line 58, <IN> line 2863.

Basically when the script encounters lines that do not contain /OD2 ASP/ or /NE2 HIS/ it runs into an error from what I can see.

Any suggestions?

Comment on calculate distance between atoms and a target atom
Download Code
Re: calculate distance between atoms and a target atom
by PerlSufi (Friar) on Dec 02, 2013 at 22:04 UTC
    Hi there, it doesn't appear that you have formally declared the $pdbline variable. maybe trying
    my $pdbline = <IN>;
    would work? But basically, you need declare the new variables with 'my'
    or better yet, use a 3 pointed opening method like this:
    my $filename = $ARGV[0]; open(my $fh, '<:encoding(UTF-8)', $filename)); while (my $pdbline = <$fh>) {
Re: calculate distance between atoms and a target atom
by toolic (Chancellor) on Dec 02, 2013 at 22:09 UTC
    I get a lot more warnings. For example
    Name "main::ZNx" used only once: possible typo at .... Name "main::Znz" used only once: possible typo at ....
    $ZNx is a different variable from $Znx because Perl is case-sensitive. You really want to start using strict (Tip #1 from the Basic debugging checklist).

    Also, you probably want to change:

    if ($pdbline =~ /OD2 ASP/ or /NE2 HIS/)

    to:

    if ($pdbline =~ /OD2 ASP/ or $pdbline =~ /NE2 HIS/)
      yeah, toolic gave a better recommendation :)
Re: calculate distance between atoms and a target atom
by hippo (Curate) on Dec 02, 2013 at 22:15 UTC

    Don't use this sort of construct:

    if ($pdbline =~ /OD2 ASP/ or /NE2 HIS/)

    You are testing 2 possibilities here. The first one is $pdbline =~ /OD2 ASP/ which may be fine, but the second one is /NE2 HIS/ which by default will test $_ which is probably not what you want. Either combine the 2 regular expressions into 1 (left as an exercise) or test them individually like this:

    if ($pdbline =~ /OD2 ASP/ or $pdbline =~ /NE2 HIS/)

    The "Use of uninitialized value" reports you see are warnings, not errors. However, the fact that they are presented to you indicates that you are attempting to do something with an uninitialised value which is usually not what you intend.

    In this specific case, your references to items like $x1[$Znx] on line 58 will give that if the $Znx is uninitialised. You will need to trace through the logic to see how that might happen and correct it.

    Best advice is:

    • use strict; # Always
    • Lose the prototypes unless you have a very, very good reason for keeping them.
    • Don't use "or" like you were.

    Best of luck with the rest of it.

Re: calculate distance between atoms and a target atom
by educated_foo (Vicar) on Dec 02, 2013 at 22:34 UTC
    Your code does something like
    if (X) { $Znx = ...; } elsif (Y) { $Znx = ...; }
    If neither X nor Y is true, $Znx does not get assigned a value, so it remains undefined. Perl treats that as zero, which may or may not be what you want.
Re: calculate distance between atoms and a target atom
by Anonymous Monk on Dec 02, 2013 at 23:35 UTC
    Do atoms really have (X,Y,Z) "coordinates" and trigonometric "distances?!"
      they do in a PDB file, its a snapshot of a protein structure, not an accurate representation but the best we have in most cases


      This is not a Signature...
Re: calculate distance between atoms and a target atom
by Laurent_R (Parson) on Dec 03, 2013 at 07:48 UTC
    Don't use function prototypes unless you know damn well what you are doing and why you are doing it. It is obviously not useful in your code.
Re: calculate distance between atoms and a target atom
by oiskuu (Friar) on Dec 03, 2013 at 10:54 UTC
    Some further suggestions. It would be fitting to hold point coordinates in an array type variable. Instead of ($Znx, $Zny, $Znz), a single @Zn variable might be used.

    Secondly, regular expressions offer a convenient method to capture specific parts of input. This allows one to write, e.g. @Zn = ($1, $2, $3); — but you must include (...) groups in the pattern. If the data is of rigidly fixed format, pack() and unpack() are often quite useful:

    my $xyz_fmt = '@30a7 @38a6 @46a6'; while ($pdbline = <IN>) { if ($pdbline =~ m/ZN1 LG1 X/) { @Zn = unpack $xyz_fmt, $pdbline; next; } if ($pdbline =~ /^ATOM.*(OD2 ASP|NE2 HIS)/) { push @Atoms, [unpack $xyz_fmt, $pdbline]; ... } } # ... do the calculations once we have all data.
    Note that the above example looks a bit strange because of the mixed regex and unpack usage. The second regex here combines three patterns from original code. @Atoms is an array of arrays (AoA). You can check the number of points with int(@Atoms).

    Update: for questions regarding the handling of AoA, please see perllol or perldsc.

      I am trying to use your suggestion, but also at the same time, create an array of "Atoms" such that I can calculate the distance between the Zinc atom and all other OD2 and NE2 atoms. So I am trying to use ++$num to tally up the number of OD2s and NE2s It would appear that the problem is in the this line

      push $Atoms$num, unpack $xyz_fmt, $pdbline;

      #!/usr/bin/perl -W #subroutines for calculations use strict; sub distanceAB($$$$$$) { my ($x1,$y1,$z1,$x2,$y2,$z2) = @_; $distance = sqrt (($x2-$x1)**2 + ($y2 -$y1)**2 + ($z2-$z1)**2); return sprintf("%4.2f", $distance); } #open files for calculations and modify distance cutoff for target res +idues $input=$ARGV[0]; open(IN, "$input"); $num = 0; $count = 0; my $xyz_fmt = '@30a8 @38a8 @46a8'; while ($pdbline = <IN>) { if ($pdbline =~ m/ZN1 LG1 X/) { @ZN = unpack $xyz_fmt, $pdbline; next; } #find xyz coordinates for other atoms and store in array if ($pdbline =~ /^ATOM.*(OD2 ASP|NE2 HIS)/) { push $Atoms[$num], [unpack $xyz_fmt, $pdbline]; $resname[$num] = substr($pdbline, 16, 3); $resnumber[$num] = substr($pdbline, 22, 3); ++$num; } } #count number of potential coordinating atoms if ($pdbline =~ /OD2 ASP/ or $pdbline =~ /NE2 HIS/) { ++$count; } #calculate Zn to each atom distance foreach $i (0..$count) { $Zndistance=distanceAB($x1[$ZN[0]],$y1[$ZN[1]],$z1[$ZN[2]],$x2[$At +oms[$i][0]],$y2[$Atoms[$i][1]],$z2[$Atoms[$i][2]]); if ($Zndistance < 2.5) { $distanceAB=$Zndistance; print "$distanceAB \n"; print "$resname[$i] $resnumber[$i] \n"; } }
        Regarding my new code, how would I implement my subroutine calculation?
Re: calculate distance between atoms and a target atom
by mishimakaz (Initiate) on Dec 03, 2013 at 17:07 UTC
    Thank you all for helping this newbie. I am working on patching it with yous suggestions
Re: calculate distance between atoms and a target atom
by mishimakaz (Initiate) on Dec 04, 2013 at 18:09 UTC
    Bump. Using suggestions from some lab mates I have updated the code. The code runs with no errors, but there is no output...as in nothing is printed. Anyone know why?

      You'd need to post the updated code if you are to get help with it.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        My mistake. I updated my original post instead of pasting the code onto another post.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1065326]
Approved by davido
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others examining the Monastery: (7)
As of 2014-12-25 18:06 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (161 votes), past polls