Beefy Boxes and Bandwidth Generously Provided by pair Networks vroom
more useful options
 
PerlMonks  

Re: Hello im new to programming i need help

by Kenosis (Priest)
on Feb 10, 2013 at 05:51 UTC ( #1018026=note: print w/ replies, xml ) Need Help??


in reply to Hello im new to programming i need help

You've been given some excellent suggestions. One major issue is that your PDB records are fixed-width (see Coordinate File Description (PDB Format), so splitting shouldn't be used to obtain the fields.

The code below has a subroutine that uses unpack to get the fields you want. You'll also see use strict; use warnings;;always have these in your programs. Lexically-scoped variables (my) are also used:

#!/usr/bin/perl -w use strict; use warnings; my ($xC4, $yC4, $zC4, $xC1, $yC1, $zC1, $xC13, $yC13, $zC13); my ($xS, $yS, $zS, $xY, $yY, $zY, $xG, $yG, $zG); #$dir = "/net/klab2/u2/home/fmohammad/GFP2/run1"; #opendir (DIR, $dir) or die $!; #@one = readdir (DIR); #foreach $file (@one) #{ my $initialfile2 = "UM_802_R203E112_1GFL_clean_gfp_5.pdb"; open my $FILETWO, '<', $initialfile2 or die "Cannot open $initialfile2 for read: $!\n"; while ( my $line = <$FILETWO> ) { next unless $line =~ m/^HETATM/; my ( $serial, $atom, $xCoord, $yCoord, $zCoord ) = getRecData($lin +e); if ( $atom eq "C4" ) { $xC4 = $xCoord; $yC4 = $yCoord; $zC4 = $zCoord; } if ( $atom eq "C1" ) { $xC1 = $xCoord; $yC1 = $yCoord; $zC1 = $zCoord; } if ( $atom eq "C13" ) { $xC13 = $xCoord; $yC13 = $yCoord; $zC13 = $zCoord; } } close $FILETWO; my $initialfile = "1GFL.pdb"; open my $FILEONE, '<', $initialfile or die "Cannot open $initialfile for read: $!\n"; while ( my $line = <$FILEONE> ) { next unless $line =~ m/^ATOM/; my ( $serial, $atom, $xCoord, $yCoord, $zCoord ) = getRecData($lin +e); if ( $serial == 479 ) { $xS = $xCoord; $yS = $yCoord; $zS = $zCoord; } if ( $serial == 484 ) { $xY = $xCoord; $yY = $yCoord; $zY = $zCoord; } if ( $serial == 496 ) { $xG = $xCoord; $yG = $yCoord; $zG = $zCoord; } } close $FILEONE; my $part1 = ( ( ( $xC1 - $xS )**2 ) + ( ( $yC1 - $yS )**2 ) + ( ( $zC1 - $zS )** +2 ) ); my $part2 = ( ( ( $xC4 - $xY )**2 ) + ( ( $yC4 - $yY )**2 ) + ( ( $zC4 - $zY )** +2 ) ); my $part3 = ( ( ( $xC13 - $xG )**2 ) + ( ( $yC13 - $yG )**2 ) + ( ( $zC13 - $zG +)**2 ) ); my $sum = $part1 + $part2 + $part3; my $sum1 = $sum / 3; my $rmsd = sqrt($sum1); print "$rmsd\n"; #} sub getRecData { return map { s/\s//g; $_ } ( unpack 'a6 a5 a1 a4 a14 a8 a8 a8', $_[0] )[ 1, 3, 5, 6, 7 ]; }

I don't have the data sets your working with, so this hasn't been tested with files. However, the subroutine has been tested on both ATOM and HETATM records, and does return the fields you want.

I hope this gets you closer to a fully working script.

Update: Bio::PDB::Structure is a module you can use to parse the pdb files. However, when using it, field values were returned that needed to be cleaned--as if it just returns an entire field at its fixed-width. Not a lot 'out there' yet on using this module (it's still young). Given this, I don't think the module provides an advantage, in your case.


Comment on Re: Hello im new to programming i need help
Select or Download Code

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1018026]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others scrutinizing the Monastery: (7)
As of 2014-04-19 00:09 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    April first is:







    Results (473 votes), past polls