http://www.perlmonks.org?node_id=810040


in reply to Re^3: Building a UNIX path from Irritating data
in thread Building a UNIX path from Irritating data

Hi kyle --

I've been working on your code for a while, and I cannot yet see how it builds a path, and I cannot get it to work for me. I copied your code above exactly as you said and used the data file that I provided, but here's the output I get:

100 => 100 101 => 101 1013 => 1013 1014 => 1014 1015 => 1015 1053 => 1053 1057 => 1057 1059 => 1059 1065 => 1065 119 => 119 1198 => 1198 1227 => 1227 1228 => 1228 1229 => 1229 1230 => 1230 1231 => 1231 1232 => 1232 1233 => 1233 1234 => 1234 1238 => 1238 1239 => 1239 1241 => 1241 1298 => 1298 1299 => 1299 1300 => 1300 1311 => 1311 1317 => 1317 1318 => 1318 1320 => 1320 1321 => 1321 1322 => 1322 1323 => 1323 1324 => 1324 1325 => 1325 1327 => 1327 1328 => 1328 1329 => 1329 1347 => 1347 1348 => 1348 1349 => 1349 1350 => 1350 1351 => 1351 1352 => 1352 1374 => 1374 1375 => 1375 1376 => 1376 1377 => 1377 1390 => 1390 1393 => 1393 1394 => 1394 1395 => 1395 1396 => 1396 1397 => 1397 1398 => 1398 ...
I think the line that may be wrong in my code is this one:
return @parents ? [ map { map { "$_/$folderid" } @{ build_path( $_ ) } } @parents ] : [ $folderid ];
I've never seen a "return" command used in a conditional this way. It would seem to me that in order for the conditional to be true, the focus would have already returned to the line calling build_path originally. Does this actually mean if "return @parents" would succeed then do X, if not, Y?

I really appreciate your help. I think this is very close to what I need, but I'm missing some keystone. To make sure I got your code correct, here's the exact script I'm running (in cygwin on a windows system, and on an HPUX system):

#!/usr/bin/perl -w use strict; my $foldername_file = shift @ARGV; my $folderfolder_file = shift @ARGV; my %folders = (); my %folderpaths = (); open(FNAMES,"$foldername_file") or die "Can't open $foldername_file: $ +!\n"; foreach my $folderline (<FNAMES>) { chomp $folderline; if ($folderline =~ /^Folder (\d{3,5})\s+\-\s+(.*)\.$/) { $folders{$1} = $2; } } close(FNAMES); open(FFINFO,"$folderfolder_file") or die "Can't open $folderfolder_fil +e: $!\n"; my @subfolders = <FFINFO>; my %parents_of; my $parentid; while ( my $line = <FFINFO> ) { if ($line =~ /^Folder (\d{3,5})\s+\-\s+.*\./) { $parentid = $1; } elsif ($line =~ /\s+subfolder\s+(\S+)\s+\-\s+.*\./) { push @{ $parents_of{ $1 } }, $parentid; } } sub build_path { my $folderid = shift @_; my @parents = @{ $parents_of{ $folderid } || [] }; return @parents ? [ map { map { "$_/$folderid" } @{ build_path( +$_ ) } } @parents ] : [ $folderid ]; } foreach my $k (sort (keys (%folders))) { $folderpaths{$k} = build_path($k); print "$k => $_\n" for @{ $folderpaths{$k} }; }

Replies are listed 'Best First'.
Re^5: Building a UNIX path from Irritating data
by kyle (Abbot) on Nov 30, 2009 at 08:24 UTC

    I think the problem is that what you have still reads everything into @subfolders. Here's the snippet I'm talking about.

    open(FFINFO,"$folderfolder_file") or die "Can't open $folderfolder_fil +e: $!\n"; # This reads in every line from FFINFO my @subfolders = <FFINFO>; my %parents_of; my $parentid; # This loop also wants to read every line, # but the handle is already at EOF, # so it gets nothing. while ( my $line = <FFINFO> ) {

    If you need @subfolders for some purpose outside the code we're talking about, you could still have it. Just declare my @subfolders; at the same scope as $parentid and %parents_of, without initializing it. Then, right inside the FFINFO read loop, push every $line in. It winds up looking like this:

    open(FFINFO,"$folderfolder_file") or die "Can't open $folderfolder_fil +e: $!\n"; my @subfolders; my %parents_of; my $parentid; while ( my $line = <FFINFO> ) { push @subfolders, $line;

    You also had a question about this:

    return @parents ? [ map { map { "$_/$folderid" } @{ build_path( +$_ ) } } @parents ] : [ $folderid ];

    What this means is, "if @parents is non-empty, return first expression (with nested maps), else return the second expression ($folderid alone in an array reference)." Your description leads me to believe that you have the precedence mixed up. What you describe is this:

    ( return @parents ) ? [ X ] : [ Y ];

    What's happening is this:

    return ( @parents ? [ X ] : [ Y ] );

    You can see this kind of thing yourself using B::Deparse. I ran this command line:

    perl -MO=Deparse,-p input-file.pl

    With the -p, it puts parentheses in to clarify how expressions are interpreted. As a result, it showed me this (along with all the other code):

    return((@parents ? [map({map({"$_/$folderid";} @{build_path($_);});} @ +parents)] : [$folderid]));

    That's not pretty, but it does show that the return is "outside" the whole rest of the line.

    I hope this helps. I feel I've written in some haste, so I wouldn't be surprised if I've been unclear. If I have left you with any other questions, feel free to ask.

      Holy smokes! I had to look at this code for about 2 hours to understand it. I see now that build_path always returns a list (array), but usually only with 1 element. I also see that you're using several anonymous arrays for each iteration. It also took me a while to see how you're switching between lists and scalars back and forth. This code is very efficient. It works brilliantly for me now, and I thank you very much for the enlightenment!