http://www.perlmonks.org?node_id=989913

supriyoch_2008 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Perlmonks,

I am a beginner in Perl programming. I have tried several times to generate a set of random DNA molecules using the code from "Example 7-3: Generate random DNA" in the book "Beginning Perl for Bioinformatics" by James Tisdall (8th Indian Reprint 2011), pages 138-141. I have kept the code almost the same, except that I replaced the variable name my $size_of_set with my $number and read <STDIN> input three times, at lines 6, 8 and 10 of the script randomdna.pl. But it does not show the desired result. Instead it warns: Use of uninitialized value $dna in concatenation (.) or string at C:\Users\x \Desktop\randomdna.pl line 20, <STDIN> line 3. I tried to fix the problem by adding the initializing code $dna=(); at line 40, but it does not work. I don't understand where I am going wrong. Can any Perl monk help me correct the mistake in my code randomdna.pl? Here is the code:

#!/usr/bin/perl
# Program to generate Random DNA set:
use strict;
use warnings;
print"\n\n Enter No. of DNA Molecules required: ";# Line 5
my $number= <STDIN>;
print"\n Enter Maximum length of DNA (bases): ";
my $maxl= <STDIN>;
print"\n Enter Minimum length of DNA (bases): ";
my $minl= <STDIN>; # Line 10
# An array initialized to the empty list, to store the DNA in:
my @random_DNA= ();
srand(time|$$);
# Call the subroutine to do the real job:
@random_DNA=make_random_DNA_set($minl,$maxl,$number);
# print the results, one per line: Line 16
print"\n The DNA set containing $number DNA molecules, varying in length from $minl to $maxl bases, are:\n\n";
my $dna=();
foreach my $dna (@random_DNA) { # Line 19
    print"$dna\n";}
print"\n";# one per line Line 21
my $output="RandDNA .txt";
unless (open(RESULT,">my $output")){
    print"Cannot open file\"my $output\".\n\n";
    exit;
} # Line 26
print RESULT"\n Randomly Generated dna:\n
The DNA set containing $number molecules varying in length from $minl to $maxl bases are:\n\n
my $dna\n";
close(RESULT);# Line 30
exit;
# Line 32 Subroutines (five):
# Calling the subroutine make_random_DNA_set: # Line 33
sub make_random_DNA_set {
    # Collect arguments, declare variables
    my ($minl,$maxl,$number)=@_;
    # Length of each DNA fragment # Line 37
    my $length;
    # DNA fragment:
    my $dna; # Line 40
    # Set of DNA fragments
    my @set;
    # Create set of random DNA:
    for (my $i=0;$i<$number;++$i) {
        # find a random length between min & max Line 45:
        $length= randomlength ($minl,$maxl);
        # add $dna fragment to @set Line 47:
        push (@set,$dna);}
    return @set;} # Line 49
# Write the Subroutine randomlength:
sub randomlength {
    # Collect arguments, declare variables Line 52:
    my ($minl,$maxl)=@_;
    # Get the random number in correct interval:
    return (int(rand($maxl-$minl+1))+$minl);}
# Now write the subroutine make_random_DNA:
sub make_random_DNA { # Line 57
    # Collect arguments, declare variables:
    my ($length)=@_; # Line 59
    my $dna;
    for (my $i=0;$i<$length;++$i) {
        $dna.=randomnucleotide();} # Line 62
    return $dna;}
# Now write the subroutine randomnucleotide Line 64:
sub randomnucleotide {
    my (@nucleotides)=('A','T','G','C');
    return randomelement(@nucleotides); # Line 67
    # Now write the subroutine randomelement:
    sub randomelement {
        my(@array)=@_;
        return $array[rand @array];} # Line 71
}

I got the following incorrect output in the cmd window:

Microsoft Windows [Version 6.1.7600]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.

C:\Users\x>cd desktop

C:\Users\x\Desktop>randomdna.pl

 Enter No. of DNA Molecules required: 5
 Enter Maximum length of DNA (bases): 10
 Enter Minimum length of DNA (bases): 7

 The DNA set containing 5 DNA molecules, varying in length from 7 to 10 bases, are:

Use of uninitialized value $dna in concatenation (.) or string at C:\Users\x \Desktop\randomdna.pl line 20, <STDIN> line 3.
Use of uninitialized value $dna in concatenation (.) or string at C:\Users\x \Desktop\randomdna.pl line 20, <STDIN> line 3.
Use of uninitialized value $dna in concatenation (.) or string at C:\Users\x \Desktop\randomdna.pl line 20, <STDIN> line 3.
Use of uninitialized value $dna in concatenation (.) or string at C:\Users\x \Desktop\randomdna.pl line 20, <STDIN> line 3.
Use of uninitialized value $dna in concatenation (.) or string at C:\Users\x \Desktop\randomdna.pl line 20, <STDIN> line 3.
Use of uninitialized value $dna in concatenation (.) or string at C:\Users\DR-SU PRIYO\Desktop\randomdna.pl line 27, <STDIN> line 3.

C:\Users\x\Desktop>

Replies are listed 'Best First'.
Re: How can I generate random DNA using the code given by James Tisdall?
by jwkrahn (Abbot) on Aug 27, 2012 at 09:32 UTC
    my $dna=();
    foreach my $dna (@random_DNA) { # Line 19
        print"$dna\n";}
    print"\n";# one per line Line 21
    my $output="RandDNA .txt";
    unless (open(RESULT,">my $output")){
        print"Cannot open file\"my $output\".\n\n";
        exit;
    } # Line 26
    print RESULT"\n Randomly Generated dna:\n
    The DNA set containing $number molecules varying in length from $minl to $maxl bases are:\n\n
    my $dna\n";

    The first line creates the variable $dna with nothing in it (the undef value), and the print at the end tries to print this value, resulting in a warning message.

    The second line creates a different variable named $dna that is local to the foreach loop, so the print inside the loop will only produce a warning message if @random_DNA contains an undef value.
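The scoping point can be seen in isolation with a minimal sketch. The two sequences in @random_DNA here are hypothetical stand-in data, not output from the script in the question:

```perl
use strict;
use warnings;

# Hypothetical stand-in data; the real script fills this from make_random_DNA_set.
my @random_DNA = ( 'ATGC', 'GGCTA' );

my $dna = ();    # outer $dna: assigning an empty list leaves it undef

my @seen;
foreach my $dna (@random_DNA) {    # inner $dna: a new variable, visible only in the loop
    push @seen, $dna;
}

# After the loop, the outer $dna is untouched and still undef.
my $outer_state = defined $dna ? 'defined' : 'undef';
print "inside loop: @seen\n";
print "outer \$dna after loop: $outer_state\n";
```

The loop variable never writes to the outer $dna, so any later print of the outer variable still sees undef.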



    my $number= <STDIN>;
    my $maxl= <STDIN>;
    my $minl= <STDIN>; # Line 10

    You should chomp values you receive via the readline operator.
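A minimal sketch of what chomp changes. To keep it runnable without a keyboard, it reads from an in-memory filehandle that stands in for STDIN; the handle $in and the string $typed are illustrative names, not part of the original script:

```perl
use strict;
use warnings;

# Simulate three lines typed at the prompts with an in-memory filehandle,
# so the sketch runs without waiting on a real STDIN.
my $typed = "5\n10\n7\n";
open my $in, '<', \$typed or die "Cannot open in-memory handle: $!";

my $raw = <$in>;                # "5\n": the trailing newline is still attached
chomp( my $maxl = <$in> );      # "10"
chomp( my $minl = <$in> );      # "7"
close $in;

# The raw value is two characters long because of the newline.
print "raw input length: ", length($raw), "\n";

# Chomped values interpolate cleanly into a single line of text.
my $message = "from $minl to $maxl";
print "$message\n";
```

Without chomp, every interpolated value would carry its newline into the printed sentence and into any file name built from it.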



    srand(time|$$);

    Unless you have a really really REALLY old version of perl you shouldn't use the srand function.
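For context: since Perl 5.004, rand calls srand implicitly on first use, so an explicit call is mainly useful when a repeatable sequence is wanted, for example in a test. A minimal sketch of both cases follows; the helper draw_bases and the seed 42 are arbitrary choices for illustration:

```perl
use strict;
use warnings;

my @bases = qw( A T G C );

# No srand call needed: rand seeds itself the first time it is used.
my $pick = $bases[ rand @bases ];

# Hypothetical helper: an explicit seed gives a repeatable sequence.
sub draw_bases {
    my ( $seed, $count ) = @_;
    srand $seed;
    return join '', map { $bases[ rand @bases ] } 1 .. $count;
}

my $first  = draw_bases( 42, 10 );
my $second = draw_bases( 42, 10 );

print "unseeded pick: $pick\n";
print "same seed, same sequence: ", ( $first eq $second ? 'yes' : 'no' ), "\n";
```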



    my $dna; # Line 40
    # Set of DNA fragments
    my @set;
    # Create set of random DNA:
    for (my $i=0;$i<$number;++$i) {
        # find a random length between min & max Line 45:
        $length= randomlength ($minl,$maxl);
        # add $dna fragment to @set Line 47:
        push (@set,$dna);}
    return @set;} # Line 49

    On the first line you create the variable $dna with nothing in it (the undef value) and then push that value into @set many times and then return a list of those undef values.
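The effect is easy to reproduce on its own. In this sketch, broken_set is a hypothetical name for a stripped-down copy of the loop in the question:

```perl
use strict;
use warnings;

# Mirrors the loop in the question: $dna is declared but never assigned.
sub broken_set {
    my ($number) = @_;
    my $dna;    # undef, and nothing below changes that
    my @set;
    push @set, $dna for 1 .. $number;
    return @set;
}

my @set       = broken_set(5);
my $undefined = grep { !defined } @set;    # count the undef elements

print scalar(@set), " elements, $undefined of them undef\n";
```

The array has the requested number of elements, which is why the loop in the main program runs five times, but each element is undef, which is why each iteration warns.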



    # Now write the subroutine make_random_DNA:
    sub make_random_DNA { # Line 57
        # Collect arguments, declare variables:
        my ($length)=@_; # Line 59
        my $dna;
        for (my $i=0;$i<$length;++$i) {
            $dna.=randomnucleotide();} # Line 62
        return $dna;}

    This subroutine never gets used anywhere so no random DNA sequences are created.



    This may work better; at least it's easier to read (UNTESTED):

    #!/usr/bin/perl
    # Program to generate Random DNA set:
    use strict;
    use warnings;

    print "\n\n Enter No. of DNA Molecules required: ";
    chomp( my $number = <STDIN> );
    print "\n Enter Maximum length of DNA (bases): ";
    chomp( my $maxl = <STDIN> );
    print "\n Enter Minimum length of DNA (bases): ";
    chomp( my $minl = <STDIN> );

    # An array initialized to the empty list, to store the DNA in:
    # Call the subroutine to do the real job:
    my @random_DNA = make_random_DNA_set( $minl, $maxl, $number );

    # print the results, one per line
    print "\n The DNA set containing $number DNA molecules, varying in length from $minl to $maxl bases, are:\n\n";
    print map( "$_\n", @random_DNA ), "\n";

    my $output = 'my RandDNA .txt';
    open my $RESULT, '>', $output or die qq[Cannot open file "$output" because: $!];

    print $RESULT <<RESULT;
    Randomly Generated dna:
    The DNA set containing $number molecules varying in length from $minl to $maxl bases are:
    @random_DNA
    RESULT

    exit 0;

    # Subroutines
    # Calling the subroutine make_random_DNA_set
    sub make_random_DNA_set {
        # Collect arguments, declare variables
        my ( $minl, $maxl, $number ) = @_;
        my @set;
        # Create set of random DNA:
        for ( 1 .. $number ) {
            # find a random length between min & max
            my $length = $minl + int rand $maxl - $minl + 1;
            # add $dna fragment to @set
            push @set, make_random_DNA( $length );
        }
        return @set;
    }

    # Now write the subroutine make_random_DNA:
    sub make_random_DNA {
        my ( $length ) = @_;
        my @nucleotides = qw( A T G C );
        return join '', map $nucleotides[ rand @nucleotides ], 1 .. $length;
    }
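Since the rewrite is marked untested, one expression worth a second look is `$minl + int rand $maxl - $minl + 1`. Named unary operators such as rand and int bind more loosely than + and -, so it should parse as `$minl + int( rand( $maxl - $minl + 1 ) )`. The sketch below checks this empirically; random_length is a hypothetical wrapper around the unchanged expression, and 7 and 10 are the bounds from the question's sample run:

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the expression from the reply, unchanged.
sub random_length {
    my ( $minl, $maxl ) = @_;
    return $minl + int rand $maxl - $minl + 1;
}

# Draw many lengths and record the smallest and largest seen.
my ( $lo, $hi ) = ( 7, 10 );
my ( $min_seen, $max_seen ) = ( $hi, $lo );
for ( 1 .. 10_000 ) {
    my $length = random_length( $lo, $hi );
    $min_seen = $length if $length < $min_seen;
    $max_seen = $length if $length > $max_seen;
}

print "lengths ranged from $min_seen to $max_seen\n";
```

Every drawn length should fall within the requested bounds; if the expression parsed differently, values outside 7 to 10 would show up quickly.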
Re: How can I generate random DNA using the code given by James Tisdall?
by Anonymous Monk on Aug 27, 2012 at 06:17 UTC

    Hire James Tisdall?

    Maybe just find the errata for his book?