PerlMonks  

Re: Help with pushing into a hash

by Kenosis (Priest)
on Aug 29, 2012 at 18:13 UTC ( #990523=note )


in reply to Help with pushing into a hash

Here's an option to consider:

    use Modern::Perl;
    use File::Slurp qw/read_file write_file/;

    my $test     = 'test.txt';
    my $test2    = 'test2.txt';
    my $activout = 'ActivACNPF.txt';
    my @lines;

    my %data = map { /(.+)\s+\|\s+(.+)/; $1 => $2 } read_file $test2;

    for ( read_file $test ) {
        /(.+)\s+.+=([^\s]+)/;
        push @lines, "$1 $2 $data{$1}\n" if $data{$1};
    }

    write_file $activout, @lines;

Output to file:

    Q197F8 IIV3-002R PF04947.9
    Q91G88 IIV6-006L PF01486.12 PF00319.13

%data is initialized using the captured data from test2.txt as key/value pairs. Next, the 'keys' and associated 'values' are captured from test.txt, and the completed line is pushed onto @lines if a matching key is found. Finally, @lines is written to ActivACNPF.txt.
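
The same flow can be sketched with in-memory data and core Perl only, so it runs without Modern::Perl or File::Slurp. The sample lines below are assumptions modelled on the output above (the real contents of test.txt and test2.txt are not shown in this thread), and the anchored patterns are a slightly stricter variant of the ones in the code, not a replacement for them.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for test2.txt and test.txt;
# the real file contents are not shown in the thread.
my @test2_lines = (
    "Q197F8 | PF04947.9\n",
    "Q91G88 | PF01486.12 PF00319.13\n",
);
my @test_lines = (
    "Q197F8 GN=IIV3-002R\n",
    "Q91G88 GN=IIV6-006L\n",
    "X00000 GN=EXAMPLE-1\n",    # made-up key with no entry in %data
);

# Build the lookup hash: key => value(s) taken from the test2 lines.
my %data;
for my $line (@test2_lines) {
    # Store the pair only when the line really matches.
    if ( $line =~ /^(\S+)\s+\|\s+(.+)$/ ) {
        $data{$1} = $2;
    }
}

# Join each test line with its hash entry, if one exists.
my @lines;
for my $line (@test_lines) {
    next unless $line =~ /^(\S+)\s+.*=(\S+)/;
    my ( $key, $name ) = ( $1, $2 );
    push @lines, "$key $name $data{$key}\n" if exists $data{$key};
}

print @lines;
```

With these sample lines, the two keys present in %data are printed and the made-up third key is skipped.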

Hope this helps!

Update: Replaced a single-line map with a multi-line for to improve readability.


Re^2: Help with pushing into a hash
by jemswira (Novice) on Aug 30, 2012 at 11:32 UTC
    Thanks so much! It works like a charm. On a side note, how do I remove the decimal place? I tried using (PF.{5}) instead of the (.+), but it would only return the last PF value and nothing else.

      Ask yourself: what do you want to match?

      In a regular expression, a dot means any character (except a newline, by default).

        Well, I want to match all the PF.{5}, but I don't want the decimal point at the end or the numbers after the decimal point. So essentially I want to match all the PF.{5} only.

      You're welcome, jemswira!

      To remove the decimal values in the test2.txt data, try changing the following:

      my %data = map { /(.+)\s+\|\s+(.+)/; $1 => $2 } read_file $test2;

      to:

      my %data = map { s/\.\d+//g; /(.+)\s+\|\s+(.+)/; $1 => $2 } read_file $test2;

      New output to file:

      Q197F8 IIV3-002R PF04947
      Q91G88 IIV6-006L PF01486 PF00319

      The substitution at the beginning of the map block will globally remove a decimal point followed by one or more digits. Since only the test2.txt values (not keys) contain decimal points, this should work.
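
To see the substitution in isolation, and to compare it with capturing the PF identifiers directly (which is closer to what was asked), here is a minimal sketch in core Perl. The sample line is an assumption modelled on the output in this thread, and `PF\d{5}` assumes every identifier is "PF" followed by exactly five digits.

```perl
use strict;
use warnings;

# Hypothetical test2.txt line, modelled on the output shown in the thread.
my $line = "Q91G88 | PF01486.12 PF00319.13";

# Option 1: strip every ".digits" suffix, as the s/\.\d+//g substitution does.
( my $stripped = $line ) =~ s/\.\d+//g;
print "$stripped\n";

# Option 2: pull out only the PF identifiers (PF plus five digits).
# A global match in list context returns every capture, not just the last.
my @pf_ids = $line =~ /(PF\d{5})/g;
print "@pf_ids\n";
```

A single non-global match can only return one capture per group, which may be why (PF.{5}) appeared to return only one PF value; the /g match in list context collects all of them.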

        Thanks again, Kenosis (bows)

        Well, I used the code you gave me and tweaked it a bit so it could process multiple files at one time, so I didn't have to load the hash every time I wanted to do multiple files. But it returns the errors:

        Use of uninitialized value in list assignment at C:\Users\Jems\Desktop\Perl\test\test2script.plx line 20.
        Use of uninitialized value in list assignment at C:\Users\Jems\Desktop\Perl\test\test2script.plx line 22.
        Use of uninitialized value in list assignment at C:\Users\Jems\Desktop\Perl\test\test2script.plx line 26.
        Use of uninitialized value in list assignment at C:\Users\Jems\Desktop\Perl\test\test2script.plx line 27.
        Use of uninitialized value in list assignment at C:\Users\Jems\Desktop\Perl\test\test2script.plx line 28.

        Also, when I run it in Padre, it gives the popup message:

        line 39: Substitute(s///) doesnt return the changed value even if map.  Continue? Y/N.

        What is wrong with my code?

        #!/usr/bin/perl
        use Modern::Perl;
        use File::Slurp qw/read_file write_file/;

        my $uniprot  = 'uniprot-sfinal.txt';
        my $activin  = 'Activator-PFAM.txt';
        my $antioxin = 'AntiOxidant-PFAM.txt';
        my $toxinin  = 'Toxin-PFAM.txt';

        my $activout  = 'ActivACNPF.txt';
        my $antioxout = 'AntioxACNPF.txt';
        my $toxinout  = 'ToxinACNPF.txt';

        my @activline;
        my @antioxline;
        my @toxinline;

        my %activ  = map { s/\.\d+//g; /(.+)\s+\|\s+(.+)/; $1 => $2 } read_file $activin;
        my %antiox = map { s/\.\d+//g; /(.+)\s+\|\s+(.+)/; $1 => $2 } read_file $antioxin;
        my %toxin  = map { s/\.\d+//g; /(.+)\s+\|\s+(.+)/; $1 => $2 } read_file $toxinin;

        for ( read_file $uniprot ) {
            /(.{6})\s+.+=([^\s]+)/;
            push @activline,  "$1 | $2 | $activ{$1}\n"  if $activ{$1};
            push @antioxline, "$1 | $2 | $antiox{$1}\n" if $antiox{$1};
            push @toxinline,  "$1 | $2 | $toxin{$1}\n"  if $toxin{$1};
        }

        write_file $activout,  @activline;
        write_file $antioxout, @antioxline;
        write_file $toxinout,  @toxinline;

        The input format is still the same as before, but just more input.
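
One possible cause, offered as a guess since the input files are not shown: if any line (a blank line, a header, a line without " | ") fails to match, $1 and $2 may be undefined, and the unguarded map still returns them into the hash assignment. The sketch below checks the match before storing anything. `load_pfam` is a hypothetical helper name and the input lines are made up for illustration. The Padre popup looks like a general caution about s/// inside map (whose return value is a count, not the changed string); since the block here returns $1 => $2, it may be harmless.

```perl
use strict;
use warnings;

# Hypothetical input lines; the blank line and the line without a
# " | " separator do not match, which is the case being guarded against.
my @input = (
    "Q197F8 | PF04947.9\n",
    "\n",
    "no separator here\n",
    "Q91G88 | PF01486.12 PF00319.13\n",
);

# Hypothetical helper: build a key => value table from "KEY | VALUE" lines,
# silently skipping anything that does not match.
sub load_pfam {
    my @lines = @_;    # copies, so s/// does not touch the caller's data
    my %table;
    for my $line (@lines) {
        $line =~ s/\.\d+//g;                           # drop ".digits" suffixes
        next unless $line =~ /^(\S+)\s+\|\s+(.+)$/;    # skip unmatched lines
        $table{$1} = $2;
    }
    return %table;
}

my %activ = load_pfam(@input);
print "$_ => $activ{$_}\n" for sort keys %activ;
```

The same guard (`next unless ...`) could be applied inside the loop over the uniprot file before $1 and $2 are used in the push statements.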

[untitled node, ID 990972]
by jemswira (Novice) on Aug 31, 2012 at 10:30 UTC
