PerlMonks  

Re: Help with pushing into a hash

by Kenosis (Priest)
on Aug 29, 2012 at 18:13 UTC (#990523)


in reply to Help with pushing into a hash

Here's an option to consider:

use Modern::Perl;
use File::Slurp qw/read_file write_file/;

my $test     = 'test.txt';
my $test2    = 'test2.txt';
my $activout = 'ActivACNPF.txt';
my @lines;

my %data = map { /(.+)\s+\|\s+(.+)/; $1 => $2 } read_file $test2;

for ( read_file $test ) {
    /(.+)\s+.+=([^\s]+)/;
    push @lines, "$1 $2 $data{$1}\n" if $data{$1};
}

write_file $activout, @lines;

Output to file:

Q197F8 IIV3-002R PF04947.9
Q91G88 IIV6-006L PF01486.12 PF00319.13

%data is initialized using the captured data from test2.txt as key/value pairs. Next, the 'keys' and associated 'values' are captured from test.txt, and the completed line is pushed onto @lines if a matching key is found. Finally, @lines is written to ActivACNPF.txt.
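The two steps described above can be tried without any files. This is a minimal sketch using only core Perl; the sample lines (including the `ORFNames=` field name and the `X00000` key) are assumptions inferred from the regexes and the output shown, not the real contents of test.txt or test2.txt. It also guards each match, so a line that does not fit the pattern is skipped instead of reusing stale captures.

```perl
use strict;
use warnings;

# Assumed sample input, inferred from the regexes and the output above;
# the real test.txt and test2.txt are not reproduced in this post.
my @test2_lines = (
    "Q197F8 | PF04947.9\n",
    "Q91G88 | PF01486.12 PF00319.13\n",
);
my @test_lines = (
    "Q197F8 ORFNames=IIV3-002R\n",
    "Q91G88 ORFNames=IIV6-006L\n",
    "X00000 ORFNames=NONE-000\n",    # hypothetical key with no entry in %data
);

# Step 1: key => value pairs from the "test2.txt" lines.
my %data;
for my $line (@test2_lines) {
    $data{$1} = $2 if $line =~ /(.+)\s+\|\s+(.+)/;
}

# Step 2: look up each "test.txt" key and keep the line only on a hit.
my @lines;
for my $line (@test_lines) {
    next unless $line =~ /(.+)\s+.+=([^\s]+)/;
    push @lines, "$1 $2 $data{$1}\n" if $data{$1};
}

print @lines;
```

The third sample line is dropped because its key never appears in %data.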

Hope this helps!

Update: Replaced a single-line map with a multi-line for to improve readability.


Re^2: Help with pushing into a hash
by jemswira (Novice) on Aug 30, 2012 at 11:32 UTC
    Thanks so much! It works like a charm. On a side note, how do I remove the decimal places? I tried using (PF.{5}) instead of the (.+), but it only returned the last PF value and nothing else.

      Ask yourself: what do you want to match?

      In a regular expression, a dot matches any character (except a newline, by default).

        Well, I want to match all the PF.{5}, but I don't want the decimal point at the end or the numbers after it. So essentially I want to match only the PF.{5} parts.
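A single capture group yields one value per match, which may be why only one PF value came back. One way to collect every PF identifier without its decimal suffix is a global match in list context. This is only a sketch: it assumes each identifier is "PF" followed by exactly five digits, as in the sample output, and the value string is a made-up example in that style.

```perl
use strict;
use warnings;

# Hypothetical value string in the style of the test2.txt data.
my $values = 'PF01486.12 PF00319.13';

# In list context, a /g match returns every capture, not just one.
my @ids = $values =~ /(PF\d{5})/g;

print "@ids\n";    # PF01486 PF00319
```

Using \d instead of a dot also keeps the pattern from matching arbitrary characters after "PF".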

      You're welcome, jemswira!

      To remove the decimal values in the test2.txt data, try changing the following:

      my %data = map { /(.+)\s+\|\s+(.+)/; $1 => $2 } read_file $test2;

      to:

      my %data = map { s/\.\d+//g; /(.+)\s+\|\s+(.+)/; $1 => $2 } read_file $test2;

      New output to file:

      Q197F8 IIV3-002R PF04947
      Q91G88 IIV6-006L PF01486 PF00319

      The substitution at the beginning of the map block will globally remove a decimal point followed by one or more digits. Since only the test2.txt values (not keys) contain decimal points, this should work.
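To see what "globally remove" means in practice, here is a small sketch comparing that substitution with a narrower variant. The narrower pattern is an assumption of mine, not part of the suggestion above, and the input line (with a decimal point in the key) is a hypothetical case chosen to show where the two differ.

```perl
use strict;
use warnings;

# Hypothetical line whose key also contains a decimal point.
my $line = "KEY1.5 | PF01486.12 PF00319.13\n";

# Broad form from the post: removes every ".digits" run anywhere in the line.
(my $broad = $line) =~ s/\.\d+//g;

# Narrower variant (an assumption, not from the post): only strips the suffix
# directly after a PF identifier, leaving other decimals untouched.
(my $narrow = $line) =~ s/(PF\d{5})\.\d+/$1/g;

print $broad;     # KEY1 | PF01486 PF00319
print $narrow;    # KEY1.5 | PF01486 PF00319
```

With data like the sample shown in this thread, where the keys contain no decimal points, both forms give the same result.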

        Thanks again, Kenosis (bows)

        Well, I used the code you gave me and tweaked it a bit so it could process multiple files at once, so I didn't have to reload the hash every time I wanted to handle another file. But it returns these errors:

        Use of uninitialized value in list assignment at C:\Users\Jems\Desktop\Perl\test\test2script.plx line 20.
        Use of uninitialized value in list assignment at C:\Users\Jems\Desktop\Perl\test\test2script.plx line 22.
        Use of uninitialized value in list assignment at C:\Users\Jems\Desktop\Perl\test\test2script.plx line 26.
        Use of uninitialized value in list assignment at C:\Users\Jems\Desktop\Perl\test\test2script.plx line 27.
        Use of uninitialized value in list assignment at C:\Users\Jems\Desktop\Perl\test\test2script.plx line 28.

        Also, when I run it in Padre, it gives this popup message:

        line 39: Substitute(s///) doesnt return the changed value even if map.  Continue? Y/N.

        What is wrong with my code?

        #!/usr/bin/perl
        use Modern::Perl;
        use File::Slurp qw/read_file write_file/;

        my $uniprot   = 'uniprot-sfinal.txt';
        my $activin   = 'Activator-PFAM.txt';
        my $antioxin  = 'AntiOxidant-PFAM.txt';
        my $toxinin   = 'Toxin-PFAM.txt';
        my $activout  = 'ActivACNPF.txt';
        my $antioxout = 'AntioxACNPF.txt';
        my $toxinout  = 'ToxinACNPF.txt';
        my @activline;
        my @antioxline;
        my @toxinline;

        my %activ  = map { s/\.\d+//g; /(.+)\s+\|\s+(.+)/; $1 => $2 } read_file $activin;
        my %antiox = map { s/\.\d+//g; /(.+)\s+\|\s+(.+)/; $1 => $2 } read_file $antioxin;
        my %toxin  = map { s/\.\d+//g; /(.+)\s+\|\s+(.+)/; $1 => $2 } read_file $toxinin;

        for ( read_file $uniprot ) {
            /(.{6})\s+.+=([^\s]+)/;
            push @activline,  "$1 | $2 | $activ{$1}\n"  if $activ{$1};
            push @antioxline, "$1 | $2 | $antiox{$1}\n" if $antiox{$1};
            push @toxinline,  "$1 | $2 | $toxin{$1}\n"  if $toxin{$1};
        }

        write_file $activout,  @activline;
        write_file $antioxout, @antioxline;
        write_file $toxinout,  @toxinline;

        The input format is still the same as before, but just more input.
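No reply to this question appears in the thread, so what follows is only a guess. Warnings like these can arise when some input line (a blank line, say) does not match the pattern, which leaves $1 and $2 undefined while they are still added to the hash. Padre's popup looks like a general caution about using s/// inside map. The sketch below, in core Perl, avoids both by checking each match and substituting on a copy of the line. The helper name `load_pfam` and the sample lines are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: build a key => value hash from "KEY | VALUES" lines,
# skipping any line that does not fit the pattern.
sub load_pfam {
    my %hash;
    for my $line (@_) {
        # Substitute on a copy so the caller's data is left unchanged.
        (my $copy = $line) =~ s/\.\d+//g;
        next unless $copy =~ /(.+)\s+\|\s+(.+)/;
        $hash{$1} = $2;
    }
    return %hash;
}

# Assumed sample lines standing in for read_file $activin.
my %activ = load_pfam(
    "Q197F8 | PF04947.9\n",
    "\n",                                 # blank line: no match, skipped
    "Q91G88 | PF01486.12 PF00319.13\n",
);

# Assumed sample lines standing in for read_file $uniprot.
my @activline;
for my $line ( "Q197F8 ORFNames=IIV3-002R\n", "no match here\n" ) {
    next unless $line =~ /(.{6})\s+.+=([^\s]+)/;
    push @activline, "$1 | $2 | $activ{$1}\n" if $activ{$1};
}

print @activline;
```

The same helper could be called once per input file, for example `my %toxin = load_pfam( read_file $toxinin );`, assuming File::Slurp is loaded as in the script above.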

[untitled node, ID 990972]
by jemswira (Novice) on Aug 31, 2012 at 10:30 UTC
