http://www.perlmonks.org?node_id=1042496

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello dear Monks,
I have this file:
METHOD    0TM  1TM  2TM  >2TM
method1   3    68   18   10
method2   3    80   10   6
method3   3    87   9    1
method4   5    83   9    3
method5   6    75   15   4
method6   14   77   6    3
method7   12   82   2    3
method8   9    84   3    3
method9   13   68   16   3
method10  5    65   20   9
method11  9    64   20   7
method12  5    68   18   9
method13  6    82   9    2
method14  11   79   8    2
method15  6    69   20   5
method16  5    80   11   3
method17  1    68   17   14
and I want to end up with 4 arrays, which will be the following ones (but created on-the-fly).
@x = qw(method1 method2 method3 method4 method5 method6 method7 method8 method9
        method10 method11 method12 method13 method14 method15 method16 method17);
@y1 = (3, 3, 3, 5, 6, 14, 12, 9, 13, 5, 9, 5, 6, 11, 6, 5, 1);
@y2 = (68, 80, 87, 83, 75, 77, 82, 84, 68, 65, 64, 68, 82, 79, 69, 80, 68);
@y3 = (18, 10, 9, 9, 15, 6, 2, 3, 16, 20, 20, 18, 9, 8, 20, 11, 17);
@y4 = (10, 6, 1, 3, 4, 3, 3, 3, 3, 9, 7, 9, 2, 2, 5, 3, 14);

Is this correct code?
open IN, "$infile" or die $!;
@lines = <IN>;
close IN;

#get first line which will be the header line and keep only the titles we need (the column names)
$header_line = shift (@lines);
@split_header = split(/\t/, $header_line);
shift (@split_header);

#work with the rest of the input file
for ($i=0; $i<=$#lines; $i++)
{
    $method_line = $_; #lines with the methods and the results for each one
    @split_method = split(/\t/, $method_line);
    $method_name= shift (@split_method); #get only the name of the method
    push @methods, $method_name;
    push @AoA, [ @split_method ]; #keep the data
}
unshift @AoA, [ @methods ];

Re: Is this the correct structure?
by moritz (Cardinal) on Jul 04, 2013 at 18:15 UTC
Re: Is this the correct structure?
by 2teez (Vicar) on Jul 04, 2013 at 20:28 UTC

    Hi,
    I think you can achieve what you want by combining what you already have, except that I would not favour the C-style "for" loop. See a heads-up below:

    use warnings;
    use strict;
    use Data::Dumper;

    <DATA>;    #remove the heading if not needed
    my @data;
    my $counter;
    while(<DATA>){
        chomp;
        $counter = 0;
        for (split/\s+/,$_){
            push @{$data[$counter++]},$_;
        }
    }
    print Dumper \@data;

    __DATA__
    METHOD 0TM 1TM 2TM >2TM
    method1 3 68 18 10
    method2 3 80 10 6
    method3 3 87 9 1
    method4 5 83 9 3
    method5 6 75 15 4
    method6 14 77 6 3
    method7 12 82 2 3
    method8 9 84 3 3
    method9 13 68 16 3
    method10 5 65 20 9
    method11 9 64 20 7
    method12 5 68 18 9
    method13 6 82 9 2
    method14 11 79 8 2
    method15 6 69 20 5
    method16 5 80 11 3
    method17 1 68 17 14

    You get an array of arrays (one inner array per column of the file), and then you can print out what you want.
    For more info, please check perldsc.
    Update:
    The code above produces ...
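
As a rough illustration of how the column-wise array of arrays maps back onto the lists the original question asked for, here is a minimal sketch. The small `@data` literal is a hypothetical stand-in (first three methods only) for what the reading loop would build; the names `$x` and `@ys` are chosen here for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical stand-in for @data: one inner array per column of the file
# (first three methods only, header row already skipped).
my @data = (
    [qw(method1 method2 method3)],
    [3, 3, 3],
    [68, 80, 87],
    [18, 10, 9],
    [10, 6, 1],
);

# The first inner array holds the x labels; the remaining ones are the
# y series, however many columns the file happens to have.
my ($x, @ys) = @data;

print "x:  @$x\n";
printf "y%d: %s\n", $_, join(' ', @{ $ys[$_ - 1] }) for 1 .. @ys;
```

Keeping the y series in one array (`@ys`) rather than in separately named arrays means the code does not need to know the number of columns in advance.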

    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me
      Thank you all!
      Here is where I am at right now:
      #!/usr/bin/perl
      use Chart::Gnuplot;

      #one must specify FILENAME to get data from (tab-delimited), a name for the output file,
      #a type for output file (e.g. pdf, png, eps) and a plot title
      ($infile, $title, $type, $plot_title) = @ARGV;
      $outfile = $title.'.'.$type;

      #add or remove colours accordingly
      @colour_array = ("lblue", "brown", "blue", "lred", "lgreen", "yellow", "green", "red", "purple", "orange", "pink", "dyellow");

      @data;
      $counter;
      open IN, "$infile" or die $!;
      $label_line = <IN>;
      chomp $label_line; #find the labels data (for naming the colours in the stacked bars)
      @split_labels = split(/\t/, $label_line);
      while(<IN>)
      {
          chomp;
          $counter = 0;
          for (split/\t/,$_)
          {
              push @{$data[$counter++]}, $_;
          }
      }
      close IN;

      #x-axis labels is the row #0 in the AoA @data
      @x = @{$data[0]};

      $total_y = @data-1; #find how many y rows we have => how many colours we need (-1 because #0 is x-axis)
      @pick_colours = @colour_array[ 0 .. $total_y - 1 ];

      # Initiate the chart object
      $chart = Chart::Gnuplot->new(
          output => $outfile,
          title  => $title,
          xlabel => {
              text   => 'Method',
              color  => 'black',
              offset => "-1",
              font   => 'Times-Roman, 20'
          },
          ylabel => {
              text   => $plot_title,
              color  => 'black',
              offset => -1,
              font   => 'Times-Roman, 20'
          },
          yrange => [0, '*'],
          xtics => {
              rotate => '-270',
              font   => 'Times-Roman, 15'
          },
          ytics => {
              font => 'Times-Roman, 15'
          },
          bg => {
              color   => 'white',
              density => 0.2,
          },
          plotbg => 'white',
          legend => {position => 'outside bottom'},
          border => {
              sides    => 'bottom, left',
              linetype => 3,
              width    => 2,
              color    => 'black',
          },
          "style histogram" => 'rowstacked'
      );

      #all the rest rows in @data are used for creating the histogram bars
      #create object foreach series of data (column in excel)
      $counter_objects=0;
      for $y ( 1 .. $#data )
      {
          $legend_name = $split_labels[$y]; #get the name for the legend, e.g 1TM, 2TM, >2TM
          @tmp_y = @{$data[$y]}; #create temp array
          $counter_objects++;
          ${y."$counter_objects"} = Chart::Gnuplot::DataSet->new
          (
              xdata => \@x,
              ydata => \@tmp_y,
              title => "$legend_name",
              fill  => {density => 0.2},
              style => "histograms",
          );
      }
      #$chart->plot2d($h1, $h2, $h3, $h4);

      My main problem is that I want this script to be a general one, so I cannot know beforehand how many rows and columns I will have. I based my script on an example found on the net, but in that one the user had 4 hardcoded arrays (columns), so it was easy to say $chart->plot2d($h1, $h2, $h3, $h4). How can I do it in my case? Also, is the way I am trying to create the objects dynamically in the above for loop correct? I think not, but I cannot figure out the correct syntax in this case. My thought was to somehow create the objects for the chart (in this case I need 4, i.e. 0TM, 1TM, 2TM, >2TM).

        It looks like you are trying to create a stacked bar histogram. Do you want horizontal or vertical bars ?

        poj
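
One common way to handle an unknown number of series, sketched here under stated assumptions, is to collect one entry per column in an ordinary array instead of generating variable names like `$y1`, `$y2` on the fly. To keep the sketch runnable without Chart::Gnuplot installed, each entry below is a plain hash of the arguments the question passes to `Chart::Gnuplot::DataSet->new`; the names `@series` and `@labels` and the shortened data are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the header labels and the transposed data
# that the question's reading loop builds (first three methods only).
my @labels = ('METHOD', '0TM', '1TM', '2TM', '>2TM');
my @data = (
    [qw(method1 method2 method3)],
    [3, 3, 3],
    [68, 80, 87],
    [18, 10, 9],
    [10, 6, 1],
);

my @x = @{ $data[0] };
my @series;    # grows to however many y columns the file contains

for my $y (1 .. $#data) {
    # Assumption: with Chart::Gnuplot loaded, these key/value pairs would be
    # passed to Chart::Gnuplot::DataSet->new(...) and the resulting object
    # pushed instead of the hash.
    push @series, {
        xdata => \@x,
        ydata => $data[$y],
        title => $labels[$y],
        style => 'histograms',
    };
}

printf "%d series: %s\n", scalar @series, join(', ', map { $_->{title} } @series);
# prints "4 series: 0TM, 1TM, 2TM, >2TM"
```

With real DataSet objects stored in `@series`, the final call would presumably be `$chart->plot2d(@series);`, since an array passed to a method is flattened into its argument list. Using `$data[$y]` directly also avoids the shared `@tmp_y` array, whose reference would otherwise point at the same storage on every iteration.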
Re: Is this the correct structure?
by hermes1908 (Novice) on Jul 04, 2013 at 18:44 UTC
    Your code has quite a few problems, and it seems that you are not grasping some fundamental concepts. If you are new to the language, I would recommend you pick up a copy of "Learning Perl". I'm too lazy to comment each line of your code, but I'm sure some kind monk will. One thing you have to be careful about with this particular file is that it uses CRLF (Windows-style) newlines. As I understand it, Perl automatically accounts for this on Windows, but if you're on Unix you need to explicitly open the file with "<:crlf".
    Some important concepts you should familiarize yourself with:
    • Variable references vs Normal variables (perldata)
    • Scalar vs List context (perldata)
    • Looping constructs in perl (perlsyn)
    • Default variables (perlvar)

    The list is by no means complete, and as mentioned I would recommend getting a good Perl book (Learning Perl or Programming Perl).
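
To make two of the listed concepts a little more concrete, here is a small sketch of scalar versus list context and of references versus plain copies. The variable names and the sample row are chosen for illustration only.

```perl
use strict;
use warnings;

my @row = ('method1', 3, 68, 18, 10);

my $count   = @row;     # scalar context: the number of elements
my ($first) = @row;     # list context: the first element
my $ref     = \@row;    # a reference to the same array, not a copy
my @copy    = @row;     # an independent copy of the elements

$ref->[1] = 99;         # modifies @row through the reference; @copy is unaffected

print "count=$count first=$first row[1]=$row[1] copy[1]=$copy[1]\n";
# prints "count=5 first=method1 row[1]=99 copy[1]=3"
```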

    Below is some code which does what you want using what I believe are reasonably good conventions.


    Good Luck
    Hermes
    #!/usr/bin/env perl

    #Good practice (catches undeclared variables and symbolic references)
    use strict;

    my (@x, @y1, @y2, @y3, @y4);

    open FILE, '<:crlf', "test.txt" or die "Cannot open test.txt: $!";

    #Get and store header data
    chomp(my $headerline=<FILE>);
    my @headerdata=split /\t/, $headerline;

    #Process each subsequent line of the file, storing the current line in $_
    while(<FILE>)
    {
        #Remove newline from $_ (var holding current line)
        chomp;

        #Split $_ using tab as delimiter and store contents in @rowdata
        my @rowdata=split /\t/;

        #Store first column of the line in @x, etc..
        push @x, shift @rowdata;
        push @y1, shift @rowdata;
        push @y2, shift @rowdata;
        push @y3, shift @rowdata;
        push @y4, shift @rowdata;
    }

    close FILE;

    print "Array headerdata: @headerdata\n";
    print "Array x: @x\n";
    print "Array y1: @y1\n";
    print "Array y2: @y2\n";
    print "Array y3: @y3\n";
    print "Array y4: @y4\n";
Re: Is this the correct structure?
by poj (Abbot) on Jul 04, 2013 at 18:45 UTC
    You'll be close if you change this line
    #$method_line = $_; #lines with the methods/results for each one
    $method_line = $lines[$i];
    chomp($method_line);
    and this
    #unshift @AoA, [ @methods ];
    unshift @AoA, [ @split_header ];
    poj
Re: Is this the correct structure?
by code-ninja (Scribe) on Jul 05, 2013 at 12:22 UTC
    The simplest way to do this can be:

    use strict;
    use warnings;
    use Data::Dumper;

    my @AoA;
    while(<DATA>){
        chomp;
        next if (/METHOD/);    #skip the header line
        push @AoA, [split /\s+/];
    }
    print Dumper \@AoA;

    __DATA__
    METHOD 0TM 1TM 2TM >2TM
    method1 3 68 18 10
    method2 3 80 10 6
    method3 3 87 9 1
    method4 5 83 9 3
    method5 6 75 15 4
    method6 14 77 6 3
    method7 12 82 2 3
    method8 9 84 3 3
    method9 13 68 16 3
    method10 5 65 20 9
    method11 9 64 20 7
    method12 5 68 18 9
    method13 6 82 9 2
    method14 11 79 8 2
    method15 6 69 20 5
    method16 5 80 11 3
    method17 1 68 17 14

    What happens here is that split "tokenizes" each line, treating the whitespace between the words as the delimiter. The square brackets around the split create a reference to an anonymous array, which is pushed onto the array-of-arrays data structure. Note that this stores one inner array per row of the file, not one per column.
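
The bracket-around-split idiom described above can be seen in isolation in the following sketch, which uses two hard-coded sample lines rather than a filehandle; the loop variable `$line` is introduced here for illustration.

```perl
use strict;
use warnings;

my @AoA;
for my $line ("method1 3 68 18 10", "method2 3 80 10 6") {
    # split returns a list of fields; the square brackets copy that list
    # into a new anonymous array and yield a reference to it.
    push @AoA, [ split /\s+/, $line ];
}

print "rows: ", scalar(@AoA), "\n";     # prints "rows: 2"
print "row 0, col 0: $AoA[0][0]\n";     # prints "row 0, col 0: method1"
print "row 1, col 2: $AoA[1][2]\n";     # prints "row 1, col 2: 80"
```

Each element of `@AoA` is a reference, so individual cells are reached with two subscripts, as in `$AoA[1][2]`.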

    UPDATE

    Umm 2teez already said what I wrote above... I did not check the replies before posting... mesa bad.