Re: Perl code for finding shortest path not working on large files (chomp)

by Athanasius (Archbishop)
on Jul 25, 2014 at 12:56 UTC ( [id://1095040]=note )


in reply to Perl code for finding shortest path not working on large files

Hello zing,

A little debugging reveals that the difference in behaviour has nothing to do with the length of the two input files, and everything to do with the fact that in the file which works, each line has a comma immediately before the newline, whereas in the file which fails (p17226.csv), the comma is missing. When this terminal comma is omitted, split leaves the newline in the final field. So the call to rank() accesses, e.g., $rank{"E\n"}, which is different to $rank{"E"}, and since the former key does not exist in the hash, the comparison generates a warning and fails to work as desired.
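The effect can be reproduced in isolation. The following is a minimal sketch, not taken from the original script: the `%rank` values and the `fields` helper are made up for illustration, and only the `split /,\s*/` pattern comes from the code under discussion.

```perl
use strict;
use warnings;

my %rank = (E => 1, P => 2);    # hypothetical ranks, for illustration only

# Hypothetical helper: split a line on commas, optionally chomping it first.
sub fields {
    my ($line, $do_chomp) = @_;
    chomp $line if $do_chomp;
    return split /,\s*/, $line;
}

my @no_comma = fields("M1,Q,E\n",  0);    # last field is "E\n"
my @comma    = fields("M1,Q,E,\n", 0);    # ",\n" is consumed as a separator; last field is "E"
my @chomped  = fields("M1,Q,E\n",  1);    # newline removed before split; last field is "E"

printf "no trailing comma, no chomp: key %s\n", exists $rank{ $no_comma[-1] } ? 'found' : 'missing';
printf "trailing comma, no chomp:    key %s\n", exists $rank{ $comma[-1] }    ? 'found' : 'missing';
printf "no trailing comma, chomp:    key %s\n", exists $rank{ $chomped[-1] }  ? 'found' : 'missing';
```

Only the first case fails the lookup, because the `\s*` in the pattern swallows the newline when a comma precedes it.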

Like QM, I’m largely in the dark as to what this code is doing and how it is supposed to work. However, by changing this:

ins split /,\s*/ for <DATA>;

to this:

for (<DATA>) { chomp; ins split /,\s*/; }

I managed to run the script on file p17226.csv (2794 lines) with apparent success.

Hope that helps,

Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Replies are listed 'Best First'.
Re^2: Perl code for finding shortest path not working on large files (chomp)
by zing (Beadle) on Jul 26, 2014 at 07:54 UTC
    ======The much needed explanation of the code====

    I'm sorry for the delay, but here it is; I hope I'm clear enough.

    Consider the input file:
    Eve,BigDaddy,Father
    John,Eve,Son
    John,Chang,Son
    Chang,Eve,Mother
    So in this case, considering the first column, we have 3 children: Eve, John, Chang.

    The third column is the relation of second column to first column.

    For each of them we need to find their shortest link to the BigDaddy (or "Q", as I have shown in the sample input file; "Q" is the BigDaddy in the sample input case). So for Eve we already have the shortest path to BigDaddy, which is

    Eve :    Eve<-BigDaddy,  Father , Male_relations_1

    While for John, John is related to Eve as (Son) and Eve is related to BigDaddy as (Father). Thus the third column will have their relationships concatenated:

    John:    John<-Eve<-BigDaddy, Son.Father , Male_relations_5
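One way to picture this search is a breadth-first walk from each child towards BigDaddy. The sketch below is an assumption about the approach, not the script from the original post: `shortest_link` and `%edges` are hypothetical names, and each input line is treated as a link from the first column to the second.

```perl
use strict;
use warnings;

# Hypothetical helper: breadth-first search from $start to $root.
# %$edges maps each name to a list of [neighbour, relation] pairs.
# Returns (path, concatenated relations), or an empty list if there is no link.
sub shortest_link {
    my ($edges, $start, $root) = @_;
    my %seen  = ($start => 1);
    my @queue = ([ [$start], [] ]);    # each item: [names so far, relations so far]
    while (my $item = shift @queue) {
        my ($names, $rels) = @$item;
        my $here = $names->[-1];
        return (join('<-', @$names), join('.', @$rels)) if $here eq $root;
        for my $edge (@{ $edges->{$here} || [] }) {
            my ($next, $rel) = @$edge;
            next if $seen{$next}++;    # visit each name only once
            push @queue, [ [@$names, $next], [@$rels, $rel] ];
        }
    }
    return;
}

# Build the links from the four sample lines quoted in the post.
my %edges;
for my $line ("Eve,BigDaddy,Father", "John,Eve,Son", "John,Chang,Son", "Chang,Eve,Mother") {
    my ($child, $parent, $rel) = split /,\s*/, $line;
    push @{ $edges{$child} }, [$parent, $rel];
}

for my $who (qw(Eve John Chang)) {
    my ($path, $rels) = shortest_link(\%edges, $who, 'BigDaddy');
    print "$who: $path, $rels\n";
}
```

Because breadth-first search explores links level by level, the first time it reaches BigDaddy it has found a path with the fewest hops.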

    The fourth column is the relation id for these concatenated relations. These sets of relation ids will already be given (as you can see in the code, they are inside the hash %DEF). E.g.

    %DEF = ( Male_relations => [qw(Father Father.Son Son Brother Son.Father .....)], Female_relations => [qw(Mother Mother.Son Aunt Aunt.Father ....)] )

    The concatenated relation between John and BigDaddy is Son.Father, which is number 5 in Male_relations; hence we denote it Male_relations_5.
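The numbering rule can be sketched as a lookup over the lists in %DEF. This is an illustration only: `relation_id` is a hypothetical helper, and the lists below contain just the entries quoted above (the full lists are elided in the post).

```perl
use strict;
use warnings;

# Only the entries quoted in the post; the real lists are longer.
my %DEF = (
    Male_relations   => [qw(Father Father.Son Son Brother Son.Father)],
    Female_relations => [qw(Mother Mother.Son Aunt Aunt.Father)],
);

# Hypothetical helper: return e.g. "Male_relations_5" for "Son.Father",
# or nothing when the relation appears in no list. Positions count from 1.
sub relation_id {
    my ($relation) = @_;
    for my $group (sort keys %DEF) {
        my $list = $DEF{$group};
        for my $i (0 .. $#$list) {
            return $group . '_' . ($i + 1) if $list->[$i] eq $relation;
        }
    }
    return;
}

print relation_id('Father'),     "\n";
print relation_id('Son.Father'), "\n";
```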

Re^2: Perl code for finding shortest path not working on large files (chomp)
by zing (Beadle) on Jul 28, 2014 at 06:21 UTC
    Hi Athanasius, I tried your suggestion, but I'm getting a partial output of only 25 lines (while the input file P17226.csv as provided is 2794 lines long).

    This is what I'm getting after trying your suggestion.

    Please check and suggest.
    M3<-Q, Pl
    M5<-Q, Pl
    M22<-Q, Pl
    M24<-Q, Pl
    M12<-Q, Pl
    M23<-Q, Pl
    M14<-Q, Pl
    M15<-Q, Pl
    M31<-Q, Pl
    M11<-Q, Pl
    M25<-Q, Pl
    M27<-Q, Pl
    M30<-Q, Pl
    M33<-Q, E
    M34<-Q, E
    M10<-Q, E
    M32<-Q, E
    M29<-Q, E
    M18<-Q, E
    M7<-Q, E
    M1<-Q, E
    M6<-Q, E
    M28<-Q, E
    M9<-Q, E
    M19<-Q, P

      Hello zing,

      When I take your original script, make the change I described, and run it on the file “p17226.csv” (which I downloaded from http://qfs.mobi/f1508588), I get 848 lines of output which begins and ends as follows:

      16:31 >perl orig.pl
      Enter file name: p17226.csv
      M19 : M19<-Q, P I_1
      M15 : M15<-Q, Pl I_2
      M23 : M23<-Q, Pl I_2
      M11 : M11<-Q, Pl I_2
      M22 : M22<-Q, Pl I_2
      M31 : M31<-Q, Pl I_2
      M27 : M27<-Q, Pl I_2
      M24 : M24<-Q, Pl I_2
      M5 : M5<-Q, Pl I_2
      M12 : M12<-Q, Pl I_2
      ...
      M567: M567<-M73<-M1<-Q, E.E.E IV_39
      M560: M560<-M73<-M1<-Q, E.E.E IV_39
      M559: M559<-M73<-M1<-Q, E.E.E IV_39
      M561: M561<-M73<-M1<-Q, E.E.E IV_39
      M550: M550<-M73<-M1<-Q, E.E.E IV_39
      M549: M549<-M73<-M1<-Q, E.E.E IV_39
      M569: M569<-M73<-M1<-Q, E.E.E IV_39
      M563: M563<-M73<-M1<-Q, E.E.E IV_39
      M568: M568<-M73<-M1<-Q, E.E.E IV_39
      M562: M562<-M73<-M1<-Q, E.E.E IV_39

      Your output format has changed. Have you altered the script since first posting? If not, it looks as though you’re going to have to debug the script yourself, on your own machine, to find out exactly what’s going on. The tutorial perldebtut will help you to get started.

      Hope that helps,

      Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

        Thanks, Athan. Another thing I need help with: I want all this output dumped into an output file. But I can't use this:
        perl program.pl >> output.txt
        Because, as you can see, the program asks for user input, so I can't redirect the output. Can you help me with how to redirect the output of a program like this, which asks for user input on the command line?

        Some code needs to be added around the last printf, which is what actually prints to the console, but exactly how to do it is what I want to know. Thanks!!
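One possible approach, sketched under assumptions since the script itself is not shown here: open an output filehandle and pass it to printf, so the results go to the file while the prompt stays on the console. `write_results` is a hypothetical helper, and the temporary file stands in for whatever output name is wanted.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write each result line to the named file
# instead of printing it to the console.
sub write_results {
    my ($out_name, @lines) = @_;
    open my $out, '>', $out_name or die "Cannot open '$out_name': $!";
    printf {$out} "%s\n", $_ for @lines;    # filehandle goes in a block before the format
    close $out or die "Cannot close '$out_name': $!";
    return scalar @lines;
}

# Demonstration with a temporary file and two result lines from the thread.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
close $tmp_fh;
my $written = write_results($tmp_name, 'M19 : M19<-Q, P I_1', 'M15 : M15<-Q, Pl I_2');
print STDERR "wrote $written lines to $tmp_name\n";
```

Shell redirection with `>>` should also still work with a prompting program, since input is still read from the keyboard; the catch is that a prompt printed to STDOUT ends up in the file instead of on screen. Printing the prompt with `print STDERR` would keep it visible.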
