
spliting a table into individual columns

by C_elegance (Initiate)
on Dec 20, 2007 at 22:42 UTC ( [id://658309]=perlquestion )

C_elegance has asked for the wisdom of the Perl Monks concerning the following question:

Hi!
it's my first message, and i'm a total beginner. I'm trying to split a table into individual columns (and i think i've done it), but how do i extract the data from an individual column and/or search it for certain data, like type?
    use warnings;
    use strict;

    sub read_TABLE{
        my($file_name)=@_;
        open (OPEN_TABLE,$file_name);
        if(open (OPEN_TABLE, $file_name)){
            print "Can open file \"$file_name\" \n";
        }
        unless(open(OPEN_TABLE, $file_name)){
            print STDERR "Can't open file \"$file_name\" \n";
        }
        my@table;
        while(my$line=<OPEN_TABLE>){
            chomp$line;
            my@data=split(/\t/,$line,5);
            push@table,\@data;
        }
        close OPEN_TABLE;
        return @table;
    }

    my@s=&read_table("TABLE.txt");
    foreach my$line(@s) {
        my$start=[0];
        my$end=[1];
        my$name=[2];
        my$strand=[3];
        my$type=[4];
        print "$type\n";
    }
one of the columns tells me whether it's a promoter (DNA data). i want to fish out all "promoters" and match them to the name and the sequence start and end in the other columns, so i can fish out the sequence from a different file.
can you help?
also when i run it i get:
    Can open file "TABLE.txt"
    ARRAY(0x181ea88)
    ARRAY(0x181ea88)
    ARRAY(0x181ea88)
    ARRAY(0x181ea88)
    ARRAY(0x181ea88)
    .....
    ARRAY(0x181ea88)
what does it mean?? it looks like some sort of reference??

actually i've just tested whether it's a reference with

print "@$type\n";
and yes, i get
    Can open file "TABLE.txt"
    4
    4
    4
    4
    etc
which means that i created a column called "type" full of 4s rather than an array of data from the 5th column. arrrrrrr

Since this morning i've tried other ways but still nothing. Please help, i'm losing the will to live or brewing a computer rage, i don't know which is worse; one thing's sure, i feel a right idiot!!

Thanks a million
x

Replies are listed 'Best First'.
Re: spliting a table into individual columns
by NetWallah (Canon) on Dec 20, 2007 at 23:31 UTC
    Welcome to the Monastery!

    • Please format your node based on the Writeup Formatting Tips
    • Your code makes THREE attempts at opening the file. The first attempt probably succeeds, but your code does not bother to verify that. The second attempt probably fails, giving you the error message. Please see the sample code for the "open" function.
    • The statement: "push @table,\@data;" adds ARRAY-REFERENCES as elements of @table.
      When you print an array reference, you get "ARRAY(0x<address>)" that you are seeing
    • Square brackets around a number, as in "my $type=[4];", make $type a reference to a new single-element (anonymous) array. That element's value is 4; it has nothing to do with the fifth field of $line.
    • It appears that you need a very simple parser - please post a small sample of the data, and how you would like the output to look, so monks can help write that for you, perhaps with a one-liner.
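To make the two reference points above concrete, here is a minimal sketch. The sample row and the variable names $row, $wrong and $right are hypothetical, made up for illustration; they are not taken from the poster's data.

```perl
use strict;
use warnings;

# Hypothetical row, standing in for one line of the table after split().
my $row = [ 100, 250, 'geneA', '+', 'promoter' ];

my $wrong = [4];          # a brand-new anonymous array holding the number 4
my $right = $row->[4];    # the fifth element of the array that $row refers to

print "wrong: @$wrong\n";    # dereferences the one-element array
print "right: $right\n";
print "a bare reference prints as ARRAY(0x...)\n"
    if "$row" =~ /^ARRAY\(0x[0-9a-f]+\)$/;
```

The arrow in $row->[4] is what reaches into the row; without a variable in front of the brackets, Perl builds a new array instead.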

    Your (incorrect) code should look like this, when posted to this site. This is produced using "code" tags:

    use warnings;
    use strict;

    sub read_TABLE{
        my($file_name)=@_;
        open (OPEN_TABLE,$file_name);
        if(open (OPEN_TABLE, $file_name)){
            print "Can open file \"$file_name\" \n";
        }
        unless(open(OPEN_TABLE, $file_name)){
            print STDERR "Can't open file \"$file_name\" \n";
        }
        my@table;
        while(my$line=<OPEN_TABLE>){
            chomp $line;
            my @data=split(/\t/,$line,5);
            push @table,\@data;
        }
        close OPEN_TABLE;
        return @table;
    }

    my @s=read_table("TABLE.txt");
    foreach my $line(@s) {
        my$start=[0];
        my$end=[1];
        my$name=[2];
        my$strand=[3];
        my$type=[4];
        print "$type\n";
    }

         "As you get older three things happen. The first is your memory goes, and I can't remember the other two... " - Sir Norman Wisdom

      Your code makes THREE attempts at opening the file. The first attempt probably succeeds, but your code does not bother to verify that. The second attempt probably fails, giving you the error message. Please see the sample code for the "open" function.

      If the first attempt succeeds, then the second attempt will close the filehandle before opening it again, and the same will happen for the third attempt.

      If the first attempt fails, then the second and third attempts will also fail.
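As a rough sketch of the single, checked open that this sub-thread is asking for: the version below uses the three-argument form of open with a lexical filehandle, which is a substitution on my part rather than anything posted in the thread. The temporary file and its one line of content are made up so the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small throwaway file so the sketch is self-contained.
my ( $tmp_fh, $file_name ) = tempfile( UNLINK => 1 );
print {$tmp_fh} "100\t250\tgeneA\t+\tpromoter\n";
close $tmp_fh;

# One open call, tested once; $! carries the system's reason on failure.
open( my $fh, '<', $file_name )
    or die "Can't open file \"$file_name\": $!\n";
my @lines = <$fh>;
close $fh;

print "read ", scalar(@lines), " line(s)\n";
```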

Re: spliting a table into individual columns
by graff (Chancellor) on Dec 21, 2007 at 01:54 UTC
    I don't know what your input data file looks like (presumably it's plain text containing 5 tab-delimited values per line), but your code could be simplified to the following, in order to do what I think you want:
    #!/usr/bin/perl
    use warnings;
    use strict;

    sub read_table {
        my ( $filename ) = @_;
        # open once, and stop with the system error if it fails
        open( TBL, $filename ) or die "cannot open $filename: $!\n";
        my @table;
        while ( <TBL> ) {
            chomp;
            # one anonymous array (one row of up to 5 fields) per input line
            push @table, [ split( /\t/, $_ , 5 ) ];
        }
        close TBL;
        return @table;
    }

    for my $row ( read_table( "TABLE.txt" )) {
        my ( $start, $end, $name, $strand, $type ) = @$row;
        print "$type\n";
    }
    Some notes:
    • As mentioned in the first reply, you should only call "open()" once before reading a given file, and test the return from that one call.

    • You don't need the "temporary" arrays to store lists of things; when a sub or function call returns a list, you can use the call directly as input to a "for" loop, or as the content of an anonymous array creation (e.g. putting "split()" between square brackets).

    • The "read_table" function is returning a list of array references, so in the main part of the code, the "for" loop treats each value returned by "read_table" as an array reference (@$row).
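Building on the notes above, here is a minimal sketch of picking out only the "promoter" rows once the table is a list of array references. The sample data, the column order (start, end, name, strand, type) and the in-memory filehandle are assumptions made for illustration; the real file layout has not been shown in this thread.

```perl
use strict;
use warnings;

# Hypothetical tab-delimited sample: start, end, name, strand, type.
my $sample = join '',
    "100\t250\tgeneA\t+\tpromoter\n",
    "300\t900\tgeneA\t+\texon\n",
    "950\t990\tgeneB\t-\tpromoter\n";

# Read from the string as if it were a file (in-memory filehandle).
open( my $fh, '<', \$sample ) or die "cannot open in-memory sample: $!\n";
my @table;
while ( my $line = <$fh> ) {
    chomp $line;
    push @table, [ split /\t/, $line, 5 ];
}
close $fh;

# Keep only the rows whose fifth column is exactly "promoter".
my @promoters = grep { $_->[4] eq 'promoter' } @table;

for my $row (@promoters) {
    my ( $start, $end, $name, $strand, $type ) = @$row;
    print "$name: $start-$end ($strand)\n";
}
```

Each element of @promoters is still a whole row, so the name and coordinates stay attached to the type they were selected on.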
      firstly i'm sorry for not formatting my message properly, i didn't think anyone would bother with me (i've tried other fora!). i decided to try this lifeline with the last ounces of my sanity just before collapsing to sleep. I WON'T DO IT AGAIN. i'll try the advice and will be back with the next problem.
      when i put your suggestion into action i get: "Use of uninitialized value in concatenation (.) or string at 8.pl line 22" repeating itself. ok, now i'm back to write a proper answer to your first mail.
      what i'm trying to do is extract data from a file that is not a table but is arranged in 8 columns separated by spaces (which i can separate with [split(' ',$_, 8)]). the next step is to search the "type" column for certain information and select only those rows. from there, using the id (a different column) and the coordinates (also a different column), i want to fish that part of the data out of another file, which contains many ids with their data, including the ones i need. so i want to find the type and select on it; then i have to select on other parameters in other columns in the same way; then match the id to the coordinates. maybe a hash (with start and end joined into one value by a join subroutine that returns the parameters as a regular expression that can be matched)? then search the other file for the same ids, select only those, fish out the part of the data i need, and select it before i go any further....

        As you are a beginner, I just wanted to point you to a valuable tool bundled right into perl, Data::Dumper. My pointing this out won't necessarily _solve_ your problem, but it will go a long way towards helping you _visualize_ it.

        All you need to do is put use Data::Dumper; in your script. Then you'll be able to use it to display the contents of your variables with a line like:

        print Dumper( $foo );

        So, you might use it in your script to display your @s array, or within your foreach loop to show the current value of your $line variable.

        Keep Data::Dumper in mind. It's helped me learn all sorts of things in perl because it makes it so easy to actually see what's happening in a script.
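A minimal sketch of that suggestion, assuming @s holds array references like the ones built in the original post; the two sample rows are made up for illustration.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical contents of @s: one array reference per table row.
my @s = (
    [ 100, 250, 'geneA', '+', 'promoter' ],
    [ 300, 900, 'geneA', '+', 'exon' ],
);

# Pass a reference to the array so the whole structure is dumped as one value.
my $whole = Dumper( \@s );
print $whole;

# Inside a loop, dump the current row to see what $line really contains.
for my $line (@s) {
    print Dumper($line);
}
```

Dumper shows the nesting explicitly, which makes it obvious when a variable holds a reference rather than a plain string.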

        --
        naChoZ

        Therapy is expensive. Popping bubble wrap is cheap. You choose.

        i get: "Use of uninitialized value in concatenation (.) or string at 8.pl line 22"

        ... the file ... is not a table but arranged in 8 columns separated by space...

        So, if you used the code as I posted it on the file that you describe, then yes, you would get that message about line 22 (the "print" statement) and no other output, because the split was based on /\t/ (as you had it in your original post), and with no tabs in the file, nothing would be assigned to $type. I gather you've gotten past that problem, and are moving on to the real job.
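To illustrate the difference, here is a small sketch using a made-up line with 8 space-separated columns and no tabs; the column values are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical line: 8 columns separated by single spaces, no tabs.
my $line = "chr1 100 250 geneA + promoter 0.9 note";

my @by_tab   = split /\t/, $line, 5;    # no tabs, so the whole line is one field
my @by_space = split ' ',  $line, 8;    # ' ' splits on runs of whitespace

print scalar(@by_tab),   " field(s) when splitting on tabs\n";
print scalar(@by_space), " field(s) when splitting on whitespace\n";
print defined $by_tab[4] ? "fifth field is set\n" : "fifth field is undefined\n";
```

With only one field coming back from the tab split, the fifth variable in the list assignment is left undefined, which is what triggers the warning at the print statement.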

        But the rest of your description of the job is hard to follow. If you are having trouble working it out, you'll need to show us some sample data that will make it clear what you are trying to do. If you are using two or more input files, and trying to relate information across the files to get specific outputs, show us a few relevant lines from each input, and what the corresponding output should look like. Then show us the code you've tried so far.

        (Now that you've gotten past the issue of "splitting a table into individual columns", you might want this new information and question, about joining data from different inputs, to be the start of a new thread.)

Re: spliting a table into individual columns
by ysth (Canon) on Dec 21, 2007 at 01:02 UTC
