PerlMonks  

Print entire line

by pabla23 (Novice)
on Nov 06, 2014 at 10:18 UTC ( [id://1106345]=perlquestion )

pabla23 has asked for the wisdom of the Perl Monks concerning the following question:

Good morning all, I have this kind of file:

Charcot-Marie-Tooth disease DOID:10595 KIF20A MTMR2 MTM1 LMNA HOXD10 PRX NEFL EGR2 LITAF GARS NDRG1 ERBB3 HSPB1 EMP2 MPZ ERBB2 PMP22 MFN2 GJB1

Post-traumatic stress disorder DOID:2055 APOE FKBP5 CRH IL2 SLC6A3 MAOB DBH IL8

I want to do this:

- as input I have DOID:2055

- I must search the file for this ID

- then I must print all the associated "genes" (APOE, FKBP5, ...)

At first I split the file this way:

use strict;
use warnings;

open (FILE, "/Users/Pabli/Desktop/do_human_mapping.gmt");
my @array_with_all_fields=();
my @mio=();
while(<FILE>){
    @array_with_all_fields=split(/\t/);
    if ($array_with_all_fields [1] eq "DOID:2055"){
        print "".$array_with_all_fields[]."\n";
    }
}
close FILE;

I don't know how to print all the associated genes that are on the same line... this code is in fact incomplete! Can someone help me? Thanks a lot.

Replies are listed 'Best First'.
Re: Print entire line
by Loops (Curate) on Nov 06, 2014 at 11:01 UTC

    Hi,

    Do you actually have data whose fields are separated by tab characters, as your code suggests? If so, are there tabs between each and every gene? Specifying the input file format exactly will determine the solution.
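
    One way to answer that question is to look at a line with its tabs made visible. The following is only a sketch: `count_fields` and `show_tabs` are hypothetical helpers, and the sample line is assumed to be tab-separated the way the question's code implies.

```perl
use strict;
use warnings;

# Hypothetical helper: number of tab-separated fields in one line.
sub count_fields {
    my ($line) = @_;
    chomp $line;
    my @fields = split /\t/, $line, -1;   # -1 keeps trailing empty fields
    return scalar @fields;
}

# Hypothetical helper: make tabs visible by rewriting them as a literal \t.
sub show_tabs {
    my ($line) = @_;
    chomp $line;
    (my $visible = $line) =~ s/\t/\\t/g;
    return $visible;
}

# Assumed sample line; in practice, read a few lines from the real file.
my $line = "Post-traumatic stress disorder\tDOID:2055\tAPOE\tFKBP5\n";
print count_fields($line), " fields: ", show_tabs($line), "\n";
```

    If the field count is 1 for every line, the file is probably separated by spaces rather than tabs, and splitting on /\t/ will not work.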

    If the input is actually just space characters but you can count on the id you're searching for to be the only field with a colon in it... this works:

    my $match = "DOID:2055";
    while (<DATA>){
        my ($name,$id,$genes) = m/(.*?)\s+(\S+?:\S+?)\s+(.*)/;
        print "$genes\n" if $id eq $match;
    }
    __DATA__
    Charcot-Marie-Tooth disease DOID:10595 KIF20A MTMR2 MTM1 LMNA HOXD10 PRX NEFL EGR2 LITAF GARS NDRG1 ERBB3 HSPB1 EMP2 MPZ ERBB2 PMP22 MFN2 GJB1
    Post-traumatic stress disorder DOID:2055 APOE FKBP5 CRH IL2 SLC6A3 MAOB DBH IL8

      OK, there is a tab between each field (disease name, DOID, APOE, FKBP5, ...), and they are all on the same line:

      post-traumatic stress disorder DOID:2055 APOE FKBP5 CRH IL2 SLC6A3 MAOB DBH IL8

      My input is "DOID:2055" and my output should be:

      APOE

      FKBP5

      CRH

      IL2

      and the other genes. Sorry for my explanation; is it clear now? Thanks, Paola

        ok,

        my $filename = '/Users/Pabli/Desktop/do_human_mapping.gmt';
        my $match = 'DOID:2055';
        open(my $file, '<', $filename) or die "open: $!";
        while (<$file>){
            my ($name,$id,@genes) = split /\t/;
            print join("\n",@genes) if $id eq $match;
        }

        The answer to your question, then, is to use the assignment idiom above: name the first two fields, and let an array slurp up all the genes that follow on the line. Because the name and id never get lumped into the @genes array, you don't have to go through contortions when it comes time to print.
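
        The same idiom can be wrapped in a small subroutine. This is a sketch, not the code from the reply: `genes_for_id` is a hypothetical name, the sample data is abridged from the question with tabs written explicitly, and an in-memory filehandle stands in for the real .gmt file. It also adds a chomp so the newline does not stay attached to the last gene.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the genes listed for one ID.
# Assumes tab-separated lines: name <TAB> id <TAB> gene <TAB> gene ...
sub genes_for_id {
    my ($fh, $match) = @_;
    while (my $line = <$fh>) {
        chomp $line;                                  # keep the newline off the last gene
        my ($name, $id, @genes) = split /\t/, $line;  # name two fields, slurp the rest
        next unless defined $id && $id eq $match;     # skip blank, short, or non-matching lines
        return @genes;
    }
    return;                                           # empty list when the ID is absent
}

# Sample data abridged from the question, with tabs written explicitly.
my $sample = join "\n",
    join("\t", 'Charcot-Marie-Tooth disease', 'DOID:10595', qw(KIF20A MTMR2 MTM1)),
    join("\t", 'Post-traumatic stress disorder', 'DOID:2055',
         qw(APOE FKBP5 CRH IL2 SLC6A3 MAOB DBH IL8)),
    '';

# In-memory filehandle standing in for the real file.
open my $fh, '<', \$sample or die "open: $!";
my @genes = genes_for_id($fh, 'DOID:2055');
close $fh;

print "$_\n" for @genes;   # one gene per line
```

        Returning the list instead of printing inside the loop keeps the lookup reusable: the caller decides whether to print one gene per line, join them with commas, or count them.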

Re: Print entire line
by CountZero (Bishop) on Nov 07, 2014 at 10:19 UTC
    Loops has given you some good examples of programs.

    But you can also see that each type of query needs a different program, and every time you will have to go through your whole file again.

    To avoid that, databases have been invented! Once you insert the data into a database, you can query it to answer all these (and more!) kinds of questions easily, using the standard SQL language.
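
    As a rough illustration of that idea, here is a sketch using DBI with an in-memory SQLite database. It assumes the DBI and DBD::SQLite modules are installed from CPAN; the table name, column names, and abridged sample rows are made up for the example and do not come from the thread.

```perl
use strict;
use warnings;
use DBI;   # assumption: DBI and DBD::SQLite are installed from CPAN

# In-memory SQLite database; a real setup would use a file or a server.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, AutoCommit => 1 });

# Hypothetical schema: one row per (disease, gene) pair.
$dbh->do('CREATE TABLE disease_gene (doid TEXT, disease TEXT, gene TEXT)');

# Sample rows abridged from the question.
my %sample = (
    'DOID:10595' => ['Charcot-Marie-Tooth disease',    [qw(KIF20A MTMR2 ERBB3)]],
    'DOID:2055'  => ['Post-traumatic stress disorder', [qw(APOE FKBP5 CRH IL2)]],
);

my $insert = $dbh->prepare(
    'INSERT INTO disease_gene (doid, disease, gene) VALUES (?, ?, ?)');
for my $doid (sort keys %sample) {
    my ($disease, $gene_list) = @{ $sample{$doid} };
    $insert->execute($doid, $disease, $_) for @$gene_list;
}

# Query 1: all genes for one ID.
my $genes = $dbh->selectcol_arrayref(
    'SELECT gene FROM disease_gene WHERE doid = ? ORDER BY gene',
    undef, 'DOID:2055');

# Query 2: the reverse question, which diseases list a given gene.
my $diseases = $dbh->selectcol_arrayref(
    'SELECT disease FROM disease_gene WHERE gene = ?',
    undef, 'ERBB3');

print "$_\n" for @$genes;
print "$_\n" for @$diseases;

$dbh->disconnect;
```

    The data is loaded once, and each new kind of question becomes a different SELECT statement rather than a different program.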

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics

Node Type: perlquestion [id://1106345]
Approved by Corion