### Calc distance between atoms in pdb file

by stellaparallax (Initiate)
on Apr 29, 2012 at 12:34 UTC
stellaparallax has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am creating a program that calculates the distance between the x, y, z coordinates of atoms listed in a pdb file. So far I have this:
```
#!/usr/bin/perl -w

$num = 0;
$count = 0;

while (<>) {

    # Find x, y, z coordinates and store in separate arrays

    if ($_ =~ /^ATOM/) {
        @line = $_ =~ m/^(.....).(.....).(....).(...)..(....)....(........)(........)(........)/;

        $x = $line[5];
        $arrayx[$num] = $x;

        $y = $line[6];
        $arrayy[$num] = $y;

        $z = $line[7];
        $arrayz[$num] = $z;

        ++$num;
    }

    # Count number of atoms

    if ($_ =~ /^ATOM/) {
        ++$count;
    }
}

# Calculate distance between all atom coordinates

foreach $i (0..$count) {

    foreach $j ($i + 1..$count) {

        $dist = sqrt(
            ($arrayx[$i] - $arrayx[$j])**2 +
            ($arrayy[$i] - $arrayy[$j])**2 +
            ($arrayz[$i] - $arrayz[$j])**2
        );

        print "$dist\n";

    }
}
```
When I run the program I get this message for some of the lines, and I don't know what to do to fix it: "Use of uninitialized value in subtraction (-) at ./gas.pl line 42, <> line 14368". The line that it states is the last line of the pdb file, however I don't see why this line is involved in my calculations as this is not present in any of my arrays. The pdb file I'm using is 3PBL.pdb (sorry, I wasn't able to attach it or post a link, but it is easy to find if you put that name into Google). Any help would be much appreciated as I am VERY new to Perl. Thanks

Re: Calc distance between atoms in pdb file
by toolic (Chancellor) on Apr 29, 2012 at 12:51 UTC
Since I can not easily reproduce your problem, my best guess is that you either have an off-by-one error in your nested foreach loops or your arrays don't have as many elements as you think they do.

Before your foreach loops, you can check the number of elements in the arrays:

```
print scalar(@arrayx), "\n";
print scalar(@arrayy), "\n";
print scalar(@arrayz), "\n";
```

If that doesn't solve it, add print statements inside your foreach loops.

Another good practice is to check if your regex matches:

```
if (@line = $_ =~ m/^(.....).(.....).(....).(...)..(....)....(........)(........)(........)/) {

    $x = $line[5];
    $arrayx[$num] = $x;

    # more code

}
```

Thanks to everyone for taking the time to help a newbie out :). It makes sense now and it works, yay!
Re: Calc distance between atoms in pdb file
by BrowserUk (Pope) on Apr 29, 2012 at 13:10 UTC
"Use of uninitialized value in subtraction (-) at ./gas.pl line 42, <> line 14368" The line that it states is the last line of the pdb file, however I don't see why this line is involved in my calculations as this is not present in any of my arrays.

This part, "<> line 14368", is not relevant to the actual error. It simply means that this was the last line read from the last file that is still open. Perl appends it to the error message because the file is still open, and it might therefore be relevant. In this case, it isn't.

You can suppress that part of the error messages by closing open files when you are done with them. In this case, the open file is the implicit file handle opened by the use of the diamond operator, while (<>). To close it, follow the loop with:

```
    # Count number of atoms

    if ($_ =~ /^ATOM/) {
        ++$count;
    }
}
close *ARGV; # close the implicit file handle used by <>
```

The reason why you are getting the "uninitialised value" warnings is because your for loops are running off the end of your arrays.

The variable $count counts the number of elements in the arrays, but the indexes run from 0, so the last index will be one less than the number of elements!

Ie. An array that contains the 10 values 1 .. 10, will have indexes 0 .. 9:

```
@a = 1 .. 10;
$a[0] = 1;
$a[1] = 2;
$a[2] = 3;
...
$a[8] = 9;
$a[9] = 10;
```
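A minimal sketch of the same point, using an illustrative array rather than anything from the original script: it compares the element count with the highest valid index and shows that reading one past the end gives undef, which is what produces the warning once that value reaches arithmetic.

```perl
use strict;
use warnings;

# Illustrative array; not taken from the original script.
my @a = 1 .. 10;

my $count      = scalar @a;   # number of elements
my $last_index = $#a;         # highest valid index, always $count - 1

print "elements: $count, last index: $last_index\n";

# Index $count is one past the end, so the element there is undef.
print defined( $a[$count] )
    ? "index $count is defined\n"
    : "index $count is undef\n";
```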

So, to prevent the error detected by the warnings, your for loops should run from 0 to $count - 1, i.e.:

```
# Calculate distance between all atom coordinates

foreach $i ( 0 .. $count-1 ) {

    foreach $j ( $i + 1 .. $count-1 ) {

        $dist = sqrt(
            ($arrayx[$i] - $arrayx[$j])**2 +
            ($arrayy[$i] - $arrayy[$j])**2 +
            ($arrayz[$i] - $arrayz[$j])**2
        );

        print "$dist\n";

    }
}
```

That should get you going. There are several other changes that would make your life easier, but I'll leave that for other posts.

One thing that intrigues me though. How did @line get transmuted into (at)line in your post?

Re: Calc distance between atoms in pdb file
by BrowserUk (Pope) on Apr 29, 2012 at 13:19 UTC

```
#!/usr/bin/perl -w
use strict;

my( @arrayx, @arrayy, @arrayz );

while (<>) {
    # Find x, y, z coordinates and store in separate arrays

    if ($_ =~ /^ATOM/) {
        my @line = $_ =~ m/^(.....).(.....).(....).(...)..(....)....(........)(........)(........)/;

        ## using push means you don't have to count because ...
        push @arrayx, $line[5];
        push @arrayy, $line[6];
        push @arrayz, $line[7];
    }

}
close *ARGV; ## prevent confusing error message suffixes

# Calculate distance between all atom coordinates
## ... $#xxx gives you the highest index in array @xxx

foreach my $i ( 0 .. $#arrayx ) {

    foreach my $j ( $i + 1 .. $#arrayx ) {

        my $dist = sqrt(
            ($arrayx[$i] - $arrayx[$j])**2 +
            ($arrayy[$i] - $arrayy[$j])**2 +
            ($arrayz[$i] - $arrayz[$j])**2
        );

        ## Adding $i and $j to your output will let you know what that output is.
        print "$i <> $j : $dist\n";

    }
}
```

Re: Calc distance between atoms in pdb file
by brx (Pilgrim) on Apr 29, 2012 at 18:53 UTC

Post some real input data: answers will be better, with a better regex/method.

As you read each line, you can calculate the distance to all previous points, so there is no need for a double loop after reading the whole file.
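One way this single-pass idea might look is sketched below. The sub names atom_xyz, distance and pairwise_distances are hypothetical, and the sketch swaps the thread's regex for substr on the fixed PDB columns (x in 31-38, y in 39-46, z in 47-54), which are the same character positions the regex captures. The demo records are made up and are not taken from 3PBL.pdb.

```perl
use strict;
use warnings;

# Hypothetical helper: pull x, y, z out of an ATOM record by column.
# Assumes the fixed-width PDB layout; returns an empty list otherwise.
sub atom_xyz {
    my ($line) = @_;
    return unless $line =~ /^ATOM/ && length($line) >= 54;
    return map { substr($line, $_, 8) + 0 } (30, 38, 46);
}

# Euclidean distance between two [x, y, z] array references.
sub distance {
    my ($p, $q) = @_;
    return sqrt( ($p->[0] - $q->[0])**2
               + ($p->[1] - $q->[1])**2
               + ($p->[2] - $q->[2])**2 );
}

# Single pass: each new atom is compared with every atom read before it.
# Returns a list of [ earlier index, new index, distance ] triples.
sub pairwise_distances {
    my ($fh) = @_;
    my (@seen, @pairs);
    while ( my $line = <$fh> ) {
        my @xyz = atom_xyz($line);
        next unless @xyz;
        for my $i ( 0 .. $#seen ) {
            push @pairs, [ $i, scalar(@seen), distance( $seen[$i], \@xyz ) ];
        }
        push @seen, \@xyz;
    }
    return @pairs;
}

# Demo on three made-up records read from an in-memory file handle.
my $demo = join '',
    map { sprintf("%-30s%8.3f%8.3f%8.3f\n", 'ATOM', @$_) }
    ( [ 0, 0, 0 ], [ 3, 4, 0 ], [ 0, 0, 12 ] );
open my $fh, '<', \$demo or die "Cannot open in-memory file: $!";
printf "%d <> %d : %.3f\n", @$_ for pairwise_distances($fh);
close $fh;
```

The coordinates still have to be kept in memory, since every later atom needs them. What goes away is the separate counting step and the second pass over the arrays.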

Update: I have struck the first part (sorry, OP): "Post some real input data: answers will be better,"

I found the real input data via the OP's reference.

