Finding error: uninitialized value in concatenation?

bwelch has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks,
In a function that reads several million rows from a database, I'm finding it is taking a very long time to complete. I recently added use warnings; to the script and am now seeing this message many times in the log:

"Use of uninitialized value in concatenation (.) or string at cf.pl line xxx"

I'm not seeing what is not initialized here. Do you?

Since this is taking a long time to complete, could you recommend any optimizations? thanks, Bryan

use strict;
use warnings; # Added this line in last run

sub build4PartFastaFile
{
  &addDateStamp;
  my ( $fileToMake, $dbCommand ) = @_;

  open( FILE, ">$fileToMake" ) or die print "Can't open fileToMake: $!";

  my $sth2 = $dbh->prepare( $dbCommand ) or die print LOG "Can't prepare: $! OR $DBI::errstr";
  $sth2->execute or die print LOG "Can't execute: $! OR $DBI::errstr";

  while ( ( my $giName, my $gssName, my $definition, my $sequence ) = $sth2->fetchrow_array )
  {
    $sequence = uc( $sequence );
    $sequence =~ s/(\S{1,80})/$1\n/g;
    print FILE ">$giName $gssName $definition\n$sequence";
  }

  close( FILE );
}

Replies are listed 'Best First'.
Re: Finding error: uninitialized value in concatenation? by Fletch (Bishop) on Mar 02, 2005 at 15:20 UTC
Just a guess, but if you're receiving any NULL values back from your DB then those will be translated to `undef` in Perl.
Re: Finding error: uninitialized value in concatenation? by gellyfish (Monsignor) on Mar 02, 2005 at 15:23 UTC
It is probable that one of `$giName`, `$gssName`, `$definition` or `$sequence` is not defined (NULL) in the line `print FILE ">$giName $gssName $definition\n$sequence";` so I guess that the data contains some NULL columns. Also, you might be better off writing `( my $giName, my $gssName, my $definition, my $sequence )` as `my ( $giName, $gssName, $definition, $sequence )`. /J\
Re: Finding error: uninitialized value in concatenation? by Roy Johnson (Monsignor) on Mar 02, 2005 at 15:27 UTC
Just to be clear: is the line it refers to the print within the while in your example code? If one of the fetched columns is NULL, the variable it is assigned to will be undef. You can group your `my` declarations, and you can use map to replace undefs with empty strings: `while ( my ( $giName, $gssName, $definition, $sequence ) = map { defined($_) ? $_ : '' } $sth2->fetchrow_array )` Or you can turn off uninitialized warnings within the while loop: `no warnings 'uninitialized';` Caution: Contents may have been coded under pressure.
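A minimal sketch of the map suggestion, using a hand-built list of rows in place of `$sth2->fetchrow_array` so that it runs without a database. The `@rows` data and the `clean_row` helper are made up for illustration and are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical stand-in for rows returned by fetchrow_array;
# undef plays the role of a NULL column.
my @rows = (
    [ 'gi1', 'gss1', 'first definition',  'acgt' ],
    [ 'gi2', undef,  'second definition', 'ttga' ],
);

# Illustrative helper: replace undefined fields with empty strings.
sub clean_row {
    return map { defined($_) ? $_ : '' } @_;
}

my @headers;
for my $row (@rows) {
    my ( $giName, $gssName, $definition, $sequence ) = clean_row(@$row);

    # No "uninitialized" warning here, even for the NULL column.
    push @headers, ">$giName $gssName $definition";
}
print "$_\n" for @headers;
```

In the real loop the same map can sit directly in the `while` condition: a list assignment in scalar context yields the number of values on the right-hand side, so the loop still ends when `fetchrow_array` returns an empty list.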
Re: Finding error: uninitialized value in concatenation? by tall_man (Parson) on Mar 02, 2005 at 15:48 UTC
I believe Fletch is right about the NULLs. The "concatenation" in your error message comes from interpolating the results into the string in this line: `print FILE ">$giName $gssName $definition\n$sequence";` One easy way to turn undefs into reasonable values is this idiom: `$giName ||= " ";`

I don't see a lot to optimize, but here are my suggestions:

  # makes array interpolate with newlines
  local $" = "\n";

  # Pull my variable declarations outside the inner loop.
  my ($giName, $gssName, $definition, $sequence, @seq);
  while (($giName, $gssName, $definition, $sequence) = $sth2->fetchrow_array ) {
    $sequence = uc( $sequence );

    # If I'm right about your data, you just want to
    # turn the whitespace into newlines.
    # The following should be faster.
    @seq = split /\s+/,$sequence;

    # For any fields that can be null:
    $giName ||= " ";

    # Less interpolation will be faster, but we need it
    # for the array.
    print FILE ">",$giName," ",$gssName," ",$definition,"\n","@seq","\n";
  }
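One caveat on the `||=` idiom: it replaces any false value, so a column that legitimately holds `0` or an empty string is overwritten too. On Perl 5.10 or later, the defined-or assignment `//=` replaces only `undef`. A small sketch with made-up values:

```perl
use strict;
use warnings;

my $with_or  = 0;    # a legitimate, defined value
my $with_dor = 0;
my $missing;         # undef, as a NULL column would be

$with_or  ||= " ";   # 0 is false, so it is replaced
$with_dor //= " ";   # 0 is defined, so it is kept
$missing  //= " ";   # undef is replaced

print "[$with_or] [$with_dor] [$missing]\n";
```

Whether this matters depends on the data; for identifier columns that can never be `0`, the two forms behave the same.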
Re^2: Finding error: uninitialized value in concatenation? by bwelch (Curate) on Mar 02, 2005 at 18:59 UTC
Thanks much. The problem was with an occasionally NULL value. To clean up the sequence data and make it suitable for other tools, I needed to convert it to upper case and break up the sequence and associated info into lines of 80 characters or less. I've made several of the optimizations and will try another run soon.
Re^3: Finding error: uninitialized value in concatenation? by tall_man (Parson) on Mar 02, 2005 at 23:36 UTC
If you need to reformat blocks of text with wrapping, I suggest you look at Text::Wrap. The regular expression you have now will over-split short lines:

  use strict;
  my $sequence = "This is a nice line ";
  # Note: '$1' is the right form in the replacement;
  # '\1' there draws a warning under 'use warnings'.
  $sequence =~ s/(\S{1,80})/$1\n/g;
  print "",$sequence,"\n";

This prints:

  This
   is
   a
   nice
   line

If `$sequence` contains nothing but solid blocks of non-space characters (genome sequences, for example), or you don't care about splitting short words, then unpack would be faster.

  use strict;
  my $sequence = "12345678901234567890123456789012345678901234567890123456789012345678901234567890123456";
  my $len = int(length($sequence)/80);
  # 'A*' picks up whatever remains after the full 80-character chunks.
  my @seq = ($len > 0) ? unpack("(A80)$len A*",$sequence) : $sequence;
  print "",join("\n",@seq),"*\n";
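For solid blocks of sequence data, a commonly used form of the unpack approach is the repeated group `(A80)*`, which needs no length arithmetic and also returns the final short chunk. A sketch, assuming the sequence contains no whitespace; the `wrap_sequence` helper is hypothetical, and the example uses a smaller width to keep it readable:

```perl
use strict;
use warnings;

# Hypothetical helper: upper-case a sequence and split it into
# fixed-width lines. Assumes the sequence contains no whitespace
# (the A format strips trailing spaces from each chunk).
sub wrap_sequence {
    my ( $sequence, $width ) = @_;
    $width ||= 80;    # default line length
    return join( "\n", unpack( "(A$width)*", uc $sequence ) ) . "\n";
}

print wrap_sequence( "acgtacgtac", 4 );
```

Called without a width, the helper falls back to 80-character lines, matching the layout described earlier in the thread.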

