http://www.perlmonks.org?node_id=890866


in reply to Re^6: incrementing already existing file
in thread incrementing already existing file

With your program Newfile is replaced by B set of values. How do I make it to take A set first and go to B set.
So what you have is two collections of numbers in the file you open as MYFILE. One set is labeled 'A' in column 3 and the other is labeled 'B' in column 3. The values in column 5 are an index, and the values in the last column are the ones you need to collect and add to NEWF.

What I said before is still mostly what you need to do. Here's a revised outline of the code you need to write:
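A minimal sketch of the collection step, reconstructed from the description above rather than taken from any posted code. The helper name `collect_values` is hypothetical, the sample lines come from the MYFILE excerpt further down the thread, and the field positions (2, 4 and -1) assume lines with no leading whitespace:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: sort the last-column values of MYFILE-style lines
# into one array per label, indexed by the number in column 5.
sub collect_values {
    my @lines = @_;
    my ( @a_vals, @b_vals );
    for my $line (@lines) {
        next unless $line =~ /\S/;    # skip blank lines
        # split ' ' ignores leading whitespace, so column 3 is element 2
        my ( $label, $index, $value ) = ( split ' ', $line )[ 2, 4, -1 ];
        if    ( $label eq 'A' ) { $a_vals[$index] = $value }
        elsif ( $label eq 'B' ) { $b_vals[$index] = $value }
        else                    { warn "unexpected label '$label'\n" }
    }
    return ( \@a_vals, \@b_vals );
}

my ( $a_ref, $b_ref ) = collect_values(
    'IN CHAIN A RESIDUE 1 HAS ENERGY 28.8216',
    'IN CHAIN B RESIDUE 1 HAS ENERGY 29.2508',
);
print "A[1] = $a_ref->[1], B[1] = $b_ref->[1]\n";
```

The point of the sketch is only that each value lands in the array that matches its own label, so the 'A' set and the 'B' set stay separate.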

You didn't show a sample of what the output should look like, so this is as far as I can go with what you've posted. If the output should have both 'A' and 'B' values at the end of a line, you do one thing. If you need two lines (one for each of the 'A' and 'B' values), you do something different. But we can't know unless you tell us. Which leads me to the following advice about how to get along well on PerlMonks:

I think you have enough additional information now to "fix" my code to do what you want. Give it a shot and come back with questions if you run into trouble.

Re^8: incrementing already existing file
by wanttoprogram (Novice) on Mar 02, 2011 at 21:12 UTC
    Sorry about not being clear. I could not figure it out. Thank you.
    My MYFILE looks like: It has two sets of values. Column 3 has A and B

        IN CHAIN A RESIDUE 1 HAS ENERGY 28.8216
        IN CHAIN A RESIDUE 2 HAS ENERGY 24.9274
        IN CHAIN A RESIDUE 3 HAS ENERGY 19.0884
        IN CHAIN B RESIDUE 1 HAS ENERGY 29.2508
        IN CHAIN B RESIDUE 2 HAS ENERGY 28.0677
        IN CHAIN B RESIDUE 3 HAS ENERGY 18.4152

    My NEWF looks like: It has two sets of values. Last column has A and B

        ATOM     1 N    MET 1   4.751  25.841 -14.267 1.00  0.00 A
        ATOM     2 HT1  MET 1   4.805  25.474 -15.239 1.00  0.00 A
        ATOM     3 HT2  MET 1   4.342  26.797 -14.278 1.00  0.00 A
        ATOM    21 HN   ALA 2   7.312  24.620 -11.539 1.00  0.00 A
        ATOM    22 CA   ALA 2   6.858  25.683  -9.839 1.00  0.00 A
        ATOM    23 HA   ALA 2   6.844  26.753  -9.694 1.00  0.00 A
        ATOM    24 CB   ALA 2   5.754  24.998  -9.007 1.00  0.00 A
        ATOM    36 OG1  THR 3  11.141  27.554  -8.457 1.00 93.20 A
        ATOM    37 HG1  THR 3  10.623  27.711  -9.249 1.00  0.00 A
        ATOM    38 CG2  THR 3  11.584  25.728  -9.911 1.00 91.24 A
        ATOM    39 HG21 THR 3  12.484  26.243 -10.310 1.00  0.00 A
        ATOM  8104 N    MET 1  -6.496 -26.244 -13.291 1.00  0.00 B
        ATOM  8105 HT1  MET 1  -6.452 -26.200 -14.329 1.00  0.00 B
        ATOM  8106 HT2  MET 1  -5.949 -27.077 -12.992 1.00  0.00 B
        ATOM  8123 N    ALA 2  -8.788 -26.130 -10.521 1.00  0.00 B
        ATOM  8124 HN   ALA 2  -9.227 -25.264 -10.749 1.00  0.00 B
        ATOM  8125 CA   ALA 2  -9.147 -26.653  -9.220 1.00  0.00 B
        ATOM  8126 HA   ALA 2  -9.475 -27.676  -9.335 1.00  0.00 B
        ATOM  8127 CB   ALA 2  -7.979 -26.598  -8.218 1.00  0.00 B
        ATOM  8133 N    THR 3 -11.093 -26.367  -7.736 1.00 86.23 B
        ATOM  8134 HN   THR 3 -10.807 -27.207  -7.281 1.00  0.00 B
        ATOM  8135 CA   THR 3 -12.402 -25.826  -7.360 1.00 88.87 B
        ATOM  8136 HA   THR 3 -12.432 -24.774  -7.602 1.00  0.00 B

    The last column values of MYFILE should be copied and repeated in last but one column of my output. The two sets must be repeated.
    My output should look like:

        ATOM     1 N    MET 1   4.751  25.841 -14.267 1.00 28.8216 A
        ATOM     2 HT1  MET 1   4.805  25.474 -15.239 1.00 28.8216 A
        ATOM     3 HT2  MET 1   4.342  26.797 -14.278 1.00 28.8216 A
        ATOM    21 HN   ALA 2   7.312  24.620 -11.539 1.00 24.9274 A
        ATOM    22 CA   ALA 2   6.858  25.683  -9.839 1.00 24.9274 A
        ATOM    23 HA   ALA 2   6.844  26.753  -9.694 1.00 24.9274 A
        ATOM    24 CB   ALA 2   5.754  24.998  -9.007 1.00 24.9274 A
        ATOM    36 OG1  THR 3  11.141  27.554  -8.457 1.00 19.0884 A
        ATOM    37 HG1  THR 3  10.623  27.711  -9.249 1.00 19.0884 A
        ATOM    38 CG2  THR 3  11.584  25.728  -9.911 1.00 19.0884 A
        ATOM    39 HG21 THR 3  12.484  26.243 -10.310 1.00 19.0884 A
        ATOM  8104 N    MET 1  -6.496 -26.244 -13.291 1.00 29.2508 B
        ATOM  8105 HT1  MET 1  -6.452 -26.200 -14.329 1.00 29.2508 B
        ATOM  8106 HT2  MET 1  -5.949 -27.077 -12.992 1.00 29.2508 B
        ATOM  8123 N    ALA 2  -8.788 -26.130 -10.521 1.00 28.0677 B
        ATOM  8124 HN   ALA 2  -9.227 -25.264 -10.749 1.00 28.0677 B
        ATOM  8125 CA   ALA 2  -9.147 -26.653  -9.220 1.00 28.0677 B
        ATOM  8126 HA   ALA 2  -9.475 -27.676  -9.335 1.00 28.0677 B
        ATOM  8127 CB   ALA 2  -7.979 -26.598  -8.218 1.00 28.0677 B
        ATOM  8133 N    THR 3 -11.093 -26.367  -7.736 1.00 18.4152 B
        ATOM  8134 HN   THR 3 -10.807 -27.207  -7.281 1.00 18.4152 B
        ATOM  8135 CA   THR 3 -12.402 -25.826  -7.360 1.00 18.4152 B
        ATOM  8136 HA   THR 3 -12.432 -24.774  -7.602 1.00 18.4152 B
      Show us the code you tried. For that desired output, the modifications to the code I posted are not that complicated, especially with the suggestions that I already made. What I already said about collecting values from MYFILE will still work. Here are some additional hints to create the output you want:
      • For each line from NEWF, you need to check the index in column 5 and the A/B label in the last column. You get your hands on each of those with something like this:
        my @fields = ( split /\s+/ );            # need to hold on to all fields for later
        my( $index, $label ) = @fields[4, -1];   # this is an array slice
      • Now that you know the index and the label for a line, you use them to find the proper value from one of the arrays you created while reading from MYFILE (e.g., @a_vals or @b_vals if you followed my example). As I mentioned before, you can use if/else to do that, but a nicer way is with given/when (as long as your Perl is 5.10.0 or newer). Here's a suggestion, but note that it will not work as written here: $x and $y are not declared. You need to replace them with the correct value from @a_vals or @b_vals (HINT: use $index from above).
        use 5.010;
        given( $label ) {
            when ( 'A' ) { $fields[ -2 ] = $x; }
            when ( 'B' ) { $fields[ -2 ] = $y; }
            default      { say "OOPS! bad label"; }
        }
      • Now you can print the result:
        say join "\t", @fields; # separates fields with TAB

      That's all there is to it. Give it another go and post back with your code if you need more help.
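The hints above can be put together roughly as follows. This is only a sketch under stated assumptions: the helper name `fill_energy` is hypothetical, the two lookup arrays are filled by hand with the sample values instead of being read from MYFILE, and a plain if/elsif chain stands in for given/when, which Perl 5.18 and later flag as experimental.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical lookup tables, as they might look after reading MYFILE.
# Element 0 is unused because the residue index starts at 1.
my @a_vals = ( undef, '28.8216', '24.9274', '19.0884' );
my @b_vals = ( undef, '29.2508', '28.0677', '18.4152' );

# Hypothetical helper: replace the last-but-one field of a NEWF line with
# the value selected by its label (last field) and its index (column 5).
sub fill_energy {
    my ($line) = @_;
    my @fields = split ' ', $line;    # keep all fields for the output
    my ( $index, $label ) = @fields[ 4, -1 ];    # array slice

    if    ( $label eq 'A' ) { $fields[-2] = $a_vals[$index] }
    elsif ( $label eq 'B' ) { $fields[-2] = $b_vals[$index] }
    else                    { warn "OOPS! bad label '$label'\n" }

    return join "\t", @fields;        # separates fields with TAB
}

print fill_energy('ATOM 2 HT1 MET 1 4.805 25.474 -15.239 1.00 0.00 A'), "\n";
print fill_energy('ATOM 8123 N ALA 2 -8.788 -26.130 -10.521 1.00 0.00 B'), "\n";
```

Joining with TAB loses the fixed-width layout of the input lines; if the column positions matter for whatever reads the output, a printf with explicit widths would be the place to restore them.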

        #!/usr/bin/env perl
        use strict;
        use warnings;

        my $file1 = "2hgs_d00_internal_nrg_e.dat";
        my $file2 = "2HGS_bio_conv-min_p.pdb";
        open( MYFILE, '<', $file1 ) or die "cannot open $file1: $!";
        open( NEWF, '<', $file2 ) or die "cannot open $file2: $!";

        my (@a_vals, @b_vals);
        while ( <MYFILE> ) {
            chomp;
            my( $label, $index, $value ) = ( split /\s+/ )[3, 5, -1];
            $a_vals[ $index ] = $value;
            $b_vals[ $index ] = $value;
        }
        close MYFILE;

        while ( <NEWF> ) {
            chomp;
            my @fields = ( split /\s+/ );
            my $index = $fields[4];
            my $label = $fields[-1];
            if ($label eq "A"){
                $fields[-2] = $a_vals[ $index ]
            }
            if ($label eq "B"){
                $fields[-2] = $b_vals[ $index ]
            }
            my $output = join "\t", @fields;
            print "$output\n";
        }
        close NEWF;
        I am still getting values from B from MYFILE
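One likely cause, judging only from the code as posted: the first loop stores every value in both @a_vals and @b_vals without looking at $label, so each 'B' line overwrites the 'A' value saved earlier at the same index. A minimal sketch of a label-aware alternative follows. The helper name `read_energies` is hypothetical, and the field positions 2 and 4 assume lines with no leading whitespace, as in the sample shown earlier; positions 3 and 5 with split /\s+/ would fit lines that begin with whitespace.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: build one lookup table per label from
# MYFILE-style lines, using a hash of arrays keyed by the label.
sub read_energies {
    my @lines = @_;
    my %vals;    # $vals{A}[1] is the value for label A, index 1
    for my $line (@lines) {
        next unless $line =~ /\S/;    # skip blank lines
        my ( $label, $index, $value ) = ( split ' ', $line )[ 2, 4, -1 ];
        $vals{$label}[$index] = $value;    # stored under its own label only
    }
    return \%vals;
}

my $vals = read_energies(
    'IN CHAIN A RESIDUE 1 HAS ENERGY 28.8216',
    'IN CHAIN B RESIDUE 1 HAS ENERGY 29.2508',
);
print "A: $vals->{A}[1]  B: $vals->{B}[1]\n";
```

With a table like this, the second loop could look up `$vals->{$label}[$index]` directly, which also avoids the separate if blocks for 'A' and 'B'.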