
Selecting, matching and counting column elements, using randomly generated numbers

by $new_guy (Acolyte)
on Sep 29, 2010 at 09:05 UTC ( #862568=perlquestion )
$new_guy has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

Hello. I have written a script that randomly generates numbers that are grouped together (groups are separated by a blank line when printed on the screen!). The size of the group increases until a maximum is reached (the maximum is given on the command line). For the purposes of this question the maximum is 96.

The script takes a file containing the columns (of z's) which I want to compare and count. The command for running it is:

perl script.pl <filename> <number, 96 in this case>

After randomly generating the column numbers, what I would like to do is go to each of those columns and compare it to the other selected columns. I would like to find where they have z's in the same row. If they all have a "z" in the same row (i.e. the same position) then the count increases; if none of them has a "z", or if even one of them lacks a "z", that row is not counted.

My script is:

#!/usr/bin/perl
use strict;
use warnings;

#exit if there's more or less than two arguments
if(scalar(@ARGV)!= 2) {
    print "\nUsage script.pl <file name> <number of columns>\n";
    exit();
}

##you will print results but first remove any previous files
my $remove_random = "random.txt";
if (unlink($remove_random) == 1) {
    print "Existing \"random.txt\" file was removed\n";
}

## proceed by opening the file
my $ro = $ARGV[0];
open(DATA3, $ro);
while ($ro = <DATA3>) {

    #now make a file for the output
    my $output_r = "random.txt";
    if (! open(POS, ">>$output_r") ) {
        print "Cannot open file \"$output_r\" to write to!!\n\n";
        exit;
    }

    # now randomly generate the columns to count z's
    # but first declare variables
    my $randomize = $ARGV[1]; # the number of columns entered at command-line
    my $range = $randomize;   # the maximum number of columns
    my $minimum = 1;          # the minimum number of columns
    my $y;                    # the increasing number of columns
    my $x;                    # the random genome selected
    my $count;                # count the number of randomisations done
    my @uniform = ();
    my @data = ();
    my $n = 0;

    #loop through the selection process
    for($y = 1; $y < $range +1; $y++){ # make selection from 2 columns to 96 columns
        print "\n";                    # separate each random selection by a space
        for($x = 1; $x < $y; $x++){    # do the random column selection

            #randomly select columns
            my $random_number = int(rand($range)) + $minimum;

            #print the columns selected at random
            print $random_number . "\n";
            $count++;

            ## random columns for selection have been created
            ## now compare the elements of each of the groups selected and count only the number of z's common to all columns for each group!
            ## i.e. count only those times that have z's in all of them (i.e. the group)

            ##this bit of the script is not working ###
            # @uniform = $random_number;
            # my @temp = map { [ $_[1], $_[0], $_ ] } # step 1
            #            map { $_->[2] }              # step 2
            #            @uniform;

            #Count array elements that match a pattern
            #In a scalar context, grep returns a count of the selected elements.
            #foreach my $num_genes(@temp){
                #print POS "@temp\n";
            #}
        }
    }

    #evaluate the number of random columns/columns selections used for this analysis
    print POS "\n". $count*30 ." random columns selections were used!!\n";
    print "\n". $count*30 ." random columns selections were used!!\n";
}
# the end #

my $count2;
open (FILE, "random.txt") or die"can't count clusters\n";
$count2++ while <FILE>;
print "\n$count2 round(s) done\n";

My data file is:

0 z z z z z z z z z z z z z z z z z z z z z z z z z - z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z
0 z z z z z z - z - z z z z z z - z z z z z - z z - - z - z z z z - z z z z z z z z - z z z - z z z - - z - z z z z z z z z z z z z z z z z z z z z z - z z z z z - z z - z z z z z z z z z z z z
0 z z z z z z - z - z z z z - z - - z z z z - z - - - z - - - z - - z z z z - z z z - z z z - - - z - - z - - z z z z - z - - z z z z z - z - z z z - - - z z - - - z z - z z z z z z z - z z z -
0 z z z z z z - z - z z z z - - - - z - z z - z - - - z - - - z - - z z z z - z - z - z z - - - - - - - z - - z - - z - z - - z - z - - - - - - z z - - - z z - - - z z - z z z z - z - - - - z -
0 z z - z - z - z - z z - - - - - - z - - - - - - - - - - - - z - - - z z - - - - - - - - - - - - - - - - - - z - - - - z - - z - - - - - - - - - - - - - z - - - - z - - z - z z - z - - - - - -
0 - z - z - - - z - z - - - - - - - - - - - - - - - - - - - - - - - - - - z z - - - - - - - - - - - - - - - - - - - - - - - z - - - - - - - - - - - - - - - - z - - - - z - - z - - z - z - - - - - -
0 - z - z - - - - - z - - - - - - - - - - - - - - - - - - - - - - - - - z - - - - - - - - - - - - - - - - - - - - - - - z - - - - - - - - - - - - - - - - - - - - - z - - z - - - - z - - - - - -
0 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - z - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
0 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - z - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
1 z z z z z z z z z z z z z z z z z z z z - z z z z z z z z z z z z z z z z - z z z z z z z z z z z - z z z - z z - z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z z - z z z
1 z z z z - z - z z z z - z z z z - z z z - - z z z z z z - - - z - - z z z - z z z z z z z - z z z - - z - - z z - z z z z z z z z z - - z - z z z z - - z - z z z z z z z z z z z z z z - - z z
1 z z z z - z - z z z z - - z - z - z z - - - z - - - z z - - - - - - - z z - z z - - z z - - z - z - - - - - z z - - - - - - z - z z - - z - z z z - - - z - - z - z z z z z z z - z - - - - - -
1 z z z z - z - z z z z - - z - - - z - - - - - - - - - z - - - - - - - z z - z z - - - - - - z - z - - - - - - z - - - - - - z - - z - - - - - - z - - - z - - - - z - - z z z z - z - - - - - -
1 z z z z - z - z z z z - - - - - - - - - - - - - - - - - - - - - - - - - - - z - - - - - - - z - z - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - z - z - - z - - - - - -
1 z z z z - z - z z z z - - - - - - - - - - - - - - - - - - - - - - - - - - - z - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
1 z z z z - - - z z z z - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Other queries with the script are:-

- The number of iterations seems to increase every time the file size changes! I would like to keep this constant at, say, 200, so that each result has 200 rounds/iterations done.

- I would like to have the counts of z's printed out for each iteration, and an average of all counts at the end, possibly displayed as columns, one per round, with the last being the average.
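
In the script as posted, the selection loops sit inside `while ($ro = <DATA3>)`, so they run once per input line, which would explain why the number of iterations follows the file size. A minimal sketch of a fixed-size run with per-round counts and a final average, assuming the per-round count comes from elsewhere (the `rand` call below is only a placeholder, and `$rounds` and `@counts` are illustrative names):

```perl
use strict;
use warnings;
use List::Util qw(sum);

my $rounds = 200;    # fixed number of rounds, independent of the file size
my @counts;          # one count per round

for my $round (1 .. $rounds) {
    # Placeholder: the real script would count the z's shared by the
    # randomly selected columns here.
    my $count = int(rand(10));
    push @counts, $count;
}

# Average over all rounds; an array in numeric context gives its length.
my $average = sum(@counts) / @counts;

# One column per round, the last column being the average.
print join("\t", @counts, sprintf("%.2f", $average)), "\n";
```

Reading the data file once, before this loop, keeps the number of rounds at 200 however many lines the file has.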

Re: Selecting, matching and counting column elements, using randomly generated numbers
by moritz (Cardinal) on Sep 29, 2010 at 09:30 UTC
    Just because you're using random numbers doesn't mean you should use random indentation for your code; it makes reading very hard.

    A good way to format code is to start in the first column, and indent a fixed amount of spaces (for example 4) for each level of unclosed curly brace. See also: perlstyle. This is not just a question of aesthetics, it's a necessity for any nontrivial program.

    I don't quite understand your code, and what you want to achieve; one comment says random columns for selection have been created, but I don't see any created columns; you just print some numbers to standard output, but never record them in a data structure, so they are essentially lost to the program.

    ##this bit of the script is not working ###
    # @uniform = $random_number;
    # my @temp = map { [ $_[1], $_[0], $_ ] } # step 1
    #            map { $_->[2] }              # step 2
    #            @uniform;

    It's not working because after the line @uniform = $random_number; the array @uniform contains a single number, whereas the map accesses the array elements as if array references were stored in the array.

    My general advice is not to use map until you understand what your variables contain, and the basic control flow. Data::Dumper can help you with the former.
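
    A small sketch of that kind of inspection, assuming a stand-in value of 42 for $random_number (the value is illustrative only):

```perl
use strict;
use warnings;
use Data::Dumper;

my $random_number = 42;          # stand-in value, for illustration only
my @uniform = $random_number;    # list assignment: @uniform now holds one plain number

# Dump the structure to see what the array really contains.
print Dumper(\@uniform);

# ref() returns an empty string for a plain scalar and 'ARRAY' for an
# array reference, so this shows why $_->[2] cannot work on the element.
my $kind = ref $uniform[0] ? 'a reference' : 'a plain scalar';
print "first element is $kind\n";
```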

    Perl 6 - links to (nearly) everything that is Perl 6.
      Dear Moritz,

      The numbers are generated at random and I just print them to the screen to show what they are. But, yes you could store them in a file! So what I intend to do is to use the random numbers generated to do the counts! For instance, if my group has 38, 39, 40, then what I intend to do is to compare the z's in columns 38, 39, 40 (that have been randomly generated) of my file! So if they are 44, 45, 99, ...., 123, I would like to count and find the average of all the z's that are common to all these columns!

      The bit that doesn't work is where I ran out of ideas and got stuck!

        But, yes you could store them in a file!

        But your program doesn't do that. And later on you want to compare those values to some other values, and it doesn't work... because you don't have access to them anymore.

        So, let's summarize: You generate values, but you don't store them. You read data from a file, but you don't do anything with it. So there are at least two steps missing: extracting the columns from the data you read, and making the generated data available to the program itself.

        When you've done these two steps, maybe you'll get unstuck.
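
        The two steps might be sketched like this, assuming whitespace-separated cells with a leading label on each line; the helper count_common_z, the three sample lines and all variable names are hypothetical and only there for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: count the rows in which every selected column holds 'z'.
# $rows is a reference to an array of rows (each an array ref of cells),
# $cols is a reference to an array of 1-based column numbers.
sub count_common_z {
    my ($rows, $cols) = @_;
    my $count = 0;
    for my $row (@$rows) {
        my $hits = grep {
            my $cell = $row->[ $_ - 1 ];
            defined $cell && $cell eq 'z';
        } @$cols;
        $count++ if $hits == @$cols;    # all selected columns matched
    }
    return $count;
}

# Step 1: extract the columns from the data that was read.
# Three short sample lines stand in for the real file.
my @lines = ("0 z z - z", "0 z - - z", "1 z z z z");
my @rows;
for my $line (@lines) {
    my ($label, @cells) = split ' ', $line;    # drop the leading 0/1 label
    push @rows, \@cells;
}

# Step 2: keep the random column numbers in an array instead of only
# printing them, so the rest of the program can use them.
my $n_cols   = 4;
my @selected = map { int(rand($n_cols)) + 1 } 1 .. 3;

print "columns @selected: ", count_common_z(\@rows, \@selected),
      " row(s) with z in all of them\n";
```

        Note that rand can pick the same column more than once; whether duplicates should be allowed is a decision the sketch leaves open.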

        Also please notice that your description of what you want to do is incomplete: you write you want to compare values, but you never mentioned what you want to do with the result of the comparison. Store it? count it? make funny bit masks? destroy the world?

        Perl 6 - links to (nearly) everything that is Perl 6.
Re: Selecting, matching and counting column elements, using randomly generated numbers
by perlpie (Beadle) on Sep 29, 2010 at 10:37 UTC
    ##this bit of the script is not working ###
    # @uniform = $random_number;
    # my @temp = map { [ $_[1], $_[0], $_ ] } # step 1
    #            map { $_->[2] }              # step 2
    #            @uniform;

    From the "step 1" and "step 2" comments, I think that you may just be able to reverse those parts.

    my @temp =                       # third, the final results are stored in @temp
        map { $_->[2] }              # second this applies to the results of the following:
        map { [ $_[1], $_[0], $_ ] } # first this applies to @uniform
        @uniform;

    In many cases where folks stack map or grep, they could combine them. The above is the same as

    my @temp = map { ($_[1], $_[0], $_ )[2] } @uniform;

    which is the same as

    my @temp = @uniform;

    Did you intend for that portion of code to do something else?
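
    A quick check of the three forms above, assuming three sample column numbers in @uniform (the values are illustrative only). Note that $_[1] and $_[0] are elements of @_, not of $_, so they do not affect the result:

```perl
use strict;
use warnings;

my @uniform = (38, 39, 40);    # sample values, for illustration only

# Reordered version: the lower map runs first and wraps each element in
# an array reference; the upper map then pulls element [2] back out.
my @stacked =
    map { $_->[2] }
    map { [ $_[1], $_[0], $_ ] }
    @uniform;

# Combined version: a list slice picks the third item, which is $_ itself.
my @combined = map { ($_[1], $_[0], $_)[2] } @uniform;

# Plain copy.
my @copy = @uniform;

print "stacked:  @stacked\n";
print "combined: @combined\n";
print "copy:     @copy\n";
```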
