Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister
 
PerlMonks  

CPAN Namespace Navigator

by Discipulus (Monsignor)
on Nov 25, 2014 at 11:23 UTC ( #1108341=CUFP: print w/replies, xml ) Need Help??

CPAN Namespace Navigator is an interactive program that let you to navigate all namespaces on CPAN.
The idea born when i read that before upload something to CPAN is better to explore existing modules, but when i asked here in the chat how to browse it i discovered that ther is not a real exploration program to do it.

So the challenge was to hack directly the fomous file 02packages.details.txt that we receive (gzipped) when we search some module with some CPAN client. I used Term::ReadLine not without some headache.

I decided (unwisely) to eval directly the data received to build up a big HoH with the whole hierarchy of CPAN modules and reletad infos. As suggested (wisely) by ambrus and yitzchak i looked at tye's Data::Diver and on my own at an ancient and unmaintained Data::Walker one.

I was not able to bind Data::Diver at my will to add to the structure others infos like parent namespace or version, so i reinvented that wheel evaluating everything by myself.

Surprisingly it worked.

This is the usage and the navigation commands available during the navigation:
USAGE: cpannn.pl [02packages.details.txt] NAVIGATION: . simple list of contained namespaces .. move one level up + detailed list of contained namespaces * read the readme file of current namespace ** download the current namespace's package ? print this help TAB completion enabled on all sub namespaces cpannn.pl by Discipulus as found at perlmonks.org
And here you have the code, finally crafted after 37 steps of development.

#!perl use strict; use warnings; # CPANnn would be impossible without a big ABuse o +f Data::Dump::Streamer my ($ua,$cpanfh); # ugly again? no! UserAgent. need to be here befor +e BEGIN block,the file handle for cpan data too BEGIN{ local $@; $ENV{PERL_RL}="Perl"; $ENV{TERM} = 'not dumb' if $^O eq 'MSWin32';# TAB completion made po +ssible on win32 via Term::Readline with TERM= eval{ require LWP::UserAgent; }; if ($@){print "WARNING: no LWP::UserAgent support!"} if ($@ and !$ARGV[0]){die "FATAL: no filename as argument nor LWP::U +serAgent support!\n"} $ua = LWP::UserAgent->new; my $filename = defined $ARGV[0] ? $ARGV[0] : '02packages.details.txt +'; # this must go inside or assignment is not run if (!$ARGV[0]){ print "Downloading $filename, please wait..\n"; $ua->get('http://www.cpan.org/modules/'.$filename,': +content_file'=>$filename) } open $cpanfh,'<',$filename or die "FATAL: unable to open '$filename' + for reading!\n"; } use Term::ReadLine; my $term = Term::ReadLine->new('CPAN namespace navigator'); my $cpan = {'.'=>'CPAN'}; # the main cpan hasref, container of a +ll namespaces my $skiprx = qr/^[\.\+]{1,2}$/; # regex used to skip secret hash keys: + . .. + ++ (last not used really) my $pagination = 20; # used to divide in screenfulls the re +adme files my @infos = "\nINFO:\n\n"; # infos about the file and help too # now feed @infos with headers from fi +le while (<$cpanfh>){print "Processing data, please wait..\n" and last if + /^$/;push @infos, $_} push @infos, $_ for "\n\n","USAGE: $0 [02packages.details.txt]\n\nNAVI +GATION:\n\n", ". simple list of contained namespaces\n",".. move one level up +\n","+ detailed list of contained namespaces\n", "* read the readme file of current namespace\n", "** download t +he current namespace's package\n", "? print this help\n","\nTAB completion enabled on all sub name +spaces\n","$0 by Discipulus as found at perlmonks.org\n\n"; while (<$cpanfh>){ # main extrapolation loop chomp; # AA::BB::CC 0.01 D/DI/DISCIPULUS/AA +-BB-CC-0.001.tar.gz my @fields = split /\s+/;# split namespaces, version, partial +path my @names = split /::/, $fields[0];# split namespace in AA BB +CC my @ancestors = @names; pop @ancestors; # @ancestors are @names less last ele +ment eval '$cpan->{\''. # start of cpan containe +r; it ends before next = sign (join '\'}{\'', @names).'\'} ='.# expand names and vivif +ies BECAUSE there is an assignment '{'. # hasref start '"."=>$names[-1],'. # hasref . is name and . +. is a ref to father '".."=> \%{$cpan'.(defined $ancestors[0] ?'->{\''.(j +oin '\'}{\'', @ancestors ).'\'}':'').'},'. '"+"=> [$fields[1],$fields[2]],'. # hashref + is use +d for version and author path array '}; '; # hashref end } my $current = \%$cpan; # the current hashref namespace starts at top l +evel of the hash &header($current); # first time header my @cur_names; # take track of namespaces and, if empty, tell +us we are at top level # the line below is the first time initalization for autocompletion $term->Attribs->{completion_function} = sub {my $txt=shift;return grep + { /^$txt/i } grep $_ !~ $skiprx,sort keys %$current}; while ( defined ( $_ = $term->readline( (join '::',@cur_names).'>') ) +) { /^$/ ? next : chomp; s/\s+//g; if (exists $$current{$_} and $_ !~ $skiprx) { $current = \%{$$current{$_}}; push @cur_names, $_; next; } elsif($_ eq '.'){ # . -> ls print "$_\n" for grep $_ !~ $skiprx, sort keys %$current; } elsif($_ eq '+'){ # + -> ls -l foreach my $k(grep $_ !~ $skiprx, sort keys %$current) { print "$k\t", ${$current->{$k}{'+'}}[0] ? join "\t" +, @{$current->{$k}{'+'}} : "--CONTAINER NAMESPACE--","\n"; } } elsif($_ eq '..'){# .. -> cd .. pop @cur_names; $current = \%{ eval '$cpan->{\''.(join '\'}{\'', @cur_names). +'\'}' || $cpan } ; } elsif($_ eq '*'){ # * -> dump the readme unless ($ua){print "WARNING: no LWP::UserAgent support!\n"; +next;} if (defined $$current{'+'}->[0]) { (my $url = 'http://www.cpan.org/authors/id/'.$$current +{'+'}->[1]) =~s/\.tar\.gz/\.readme/ ; my $line_count; my $resp = $ua->get($url); if ($resp->is_error){print "WARNING: ",$resp->status_l +ine," for $url\n";next;} foreach my $line (split "\n",$resp->content()) { ++$line_count; print "$line_count:".$line."\n" ; if ($line_count % $pagination == 0){print "-- pres +s Enter to continue..";while (<STDIN>){last if $_ }} } } } elsif($_ eq '**'){# ** -> download the package unless ($ua){print "WARNING: no LWP::UserAgent support!\n"; +next;} if (defined $$current{'+'}->[0]) { (my $gzfile = 'http://www.cpan.org/authors/id/'.$$curr +ent{'+'}->[1]) =~s{.+/}{} ; my $resp = $ua->get('http://www.cpan.org/authors/id/' +.$$current{'+'}->[1],':content_file'=>$gzfile); print $resp->is_success ? "OK: download of '$gzfile' +succesfull\n" : "WARNING: ",$resp->status_line,"!\n"; } } elsif($_ eq '?'){ print for @infos }# * -> shows infos and help else{print "WARNING: '$_' command not found!\n"; next} } continue{ &header($current); } sub header { my $hr = shift; my $num = scalar@{[grep $_ !~ $skiprx, keys %$hr]}; print "\n",(join '::',@cur_names or 'CPAN'),($$hr{'+'}->[0] ? "\t$ +$hr{'+'}->[0]\t$$hr{'+'}->[1]" : "")," contains ",$num," namespace".( +$num>1?'s':'')."\n\n"; }


HtH
L*

update: take a look also at Re: Autocomplete in perl console application
There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

Replies are listed 'Best First'.
Re: CPAN Namespace Navigator
by afoken (Abbot) on Nov 27, 2014 at 19:35 UTC
    And here you have the code, finally crafted after 37 steps of development.

    Looks more like 37 hours sleepless, uninterrupted hacking.

    Is this a joke? If so, I didn't get it.

    I think this is unmaintainable, "write-only code". Here's why:

    • Chaotic indent. 2 spaces, 4, 6, 10, 14, 18, with no obvious rules when and how indenting happens.
    • Random amount of whitespace around operators
    • Extra long lines (33% have more than 80 chars) full of commands - Perl ain't MUMPS, stuffing as much code as possible into a single line does not make the code faster.
    • perl 4 function calls with & prefix - my favorite. The last Perl 4 release was about 20 years ago (4.036 released 1993-Feb-05).
    • Comments all over the place, but rarely useful. Hardly readable even with syntax highlighting. What is the relation of "use warnings" and "CPANnn" or "a big ABuse of Data::Dump::Streamer"?
    • String evals. Evaluating unverified data read from the network. Twice.

    This really looks like code copied from about 3 or 4 different examples, hastily glued together after too much coffee and severe sleep deprivation.

    perltidy with default options expands this mess from 100 to 160 lines, slightly more readable, but still ugly.

    perlcritic --brutal emits 130 warnings, 1.3 warnings per line, recommends refactoring. perlcritic --harsh still emits 28 warnings, still recommends refactoring. Even perlcritic --gentle does not remain silent.

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
      new feature added: a tree view of the current namespace:
      USAGE: cpannn38.pl [02packages.details.txt | or other valid file] NAVIGATION: . simple list of contained namespaces .. move one level up + detailed list of namespaces directly contained in the current one ++ dump a simple recursive tree of contained namespaces * read the readme file of current namespace; needs LWP::UserAgent ** download the current namespace's package; needs LWP::UserAgent ? print this help TAB completion, case insensitive, enabled on all sub namespaces

      .. but i need some answer:

      Thanks afoken for your crude review.


      Is this a joke? If so, I didn't get it.

      No was not a joke, was, as always a challenge and the result seems fully usable.

      • Chaotic indent. 2 spaces, 4, 6, 10, 14, 18, with no obvious rules when and how indenting happens.
      • Random amount of whitespace around operators
      • Extra long lines (33% have more than 80 chars) full of commands - Perl ain't MUMPS, stuffing as much code as possible into a single line does not make the code faster.

      Here you are completely right. i sanitized cpannn a lot and added, i hope, usefull, comments. the 2 semi ironic comments are removed. humor seems not usefull for every programmer.
      You can see CPANnn 2.68 times longer at the end of my answer.

      • perl 4 function calls with & prefix - my favorite. The last Perl 4 release was about 20 years ago (4.036 released 1993-Feb-05).


      About this i have another opinion: the & prefix is still a valid Perl 5 syntax as you can read in official documentation. As i understand & is 'optional' sometimes, and needed other times:

      A subroutine may be called using an explicit & prefix. The & is optional in modern Perl, as are parentheses if the subroutine has been predeclared. The & is not optional when just naming the subroutine, such as when it's used as an argument to defined() or undef(). Nor is it optional when you want to do an indirect subroutine call with a subroutine name or reference using the &$subref() or &{$subref}() constructs, although the $subref->() notation solves that problem. See perlref for more about all that.
      That said i 'prefer' to use it (because i know what it implies) to visualize my own subs in respect to other functions. If you dont like dont use it, but let me write my Perl 5 code as i prefer.

      • String evals. Evaluating unverified data read from the network. Twice.


      Here you are right somehow: i added a warning at the begin of the program and some blande checks on data received. If you pass a file full of evil strings is not a program's problem. I do not simply like idiot-proof tech. Please let idiots do what they want.

      I have many reserves about perlcritc: even if can be used as a guide to write better code, i do not want to code inside such rails. I want to be free while coding Perl. It is my opinion, of course.
      • I think this is unmaintainable, "write-only code".


      Well i just maintained and added a nice feature: a tree dumping of the current package. We can speak of 'Write-Twice-Code' ?


      L*
      #!perl use strict; use warnings; use Data::Dump::Streamer; # if you wont to modify CPANnn take in consideration using Data::Dump: +:Streamer on the $cpan hasref # # UserAgent and cpan file handle. need to be here before BEGIN block,t +he file handle for cpan data too my ( $ua, $cpanfh ); # BEGIN block needed to set some ENV variables # and to evaluate LWP::UserAgent support # Also check some contions and set the file handle $cpanfh # and, eventually the LWP::UserAgent object $ua BEGIN { # WARNING !! string eval in action!! # let people to quit print "\n\nWARNING: $0 uses string eval!\n" ."Use at your own risk!\nENTER to continue or CTRL-C to termin +ate.\n"; while (<STDIN>){last if $_ } local $@; # force Term::ReadLine to load the Term::ReadLine::Perl if present $ENV{PERL_RL} = "Perl"; # TAB completion made possible on win32 via Term::Readline with TE +RM= $ENV{TERM} = 'not dumb' if $^O eq 'MSWin32'; # evaluate optional LWP::UserAgent support eval { require LWP::UserAgent; }; if ($@) { print "WARNING: no LWP::UserAgent support!" } # die if no LWP::UA nor filename given as arg if ( $@ and !$ARGV[0] ) { die "FATAL: no filename as argument nor LWP::UserAgent support +!\n"; } # let's proceed $ua = LWP::UserAgent->new; # this must go inside BEGIN or assignment is not run my $filename = defined $ARGV[0] ? $ARGV[0] : '02packages.details.txt'; # if we are here we have LWP support # so if no filename was given as arg we download it if ( !$ARGV[0] ) { print "Downloading $filename, please wait..\n"; $ua->get( 'http://www.cpan.org/modules/' . $filename, ':content_file' => $filename ); } # open the file (given or downloaded) # and set the filehandle open $cpanfh, '<', $filename or die "FATAL: unable to open '$filename' for reading!\n"; } use Term::ReadLine; my $term = Term::ReadLine->new('CPAN namespace navigator'); # the main cpan hasref, container of all namespaces my $cpan ={ '.' => 'CPAN' }; # regex used to skip secret hash keys: . .. + ++ my $skiprx = qr/^[\.\+]{1,2}$/; # used to divide in screenfulls the readme files my $pagination = 20; # infos about the file and help too my @infos = "\nINFO:\n\n"; # now feed @infos with headers from file 02packages.details.txt # fetching the cpan file until we reach an empty line # because after that strat namespaces enumeration while (<$cpanfh>) { print "Processing data, please wait..\n" and last if /^$/; push @infos, $_; } push @infos, $_ for "\n\n", "USAGE: $0 [02packages.details.txt | or other valid file]\n\nNAVIGAT +ION:\n\n", ". simple list of contained namespaces\n", ".. move one level up\n" +, "+ detailed list of namespaces directly contained in the current on +e\n", "++ dump a simple recursive tree of contained namespaces\n", "* read the readme file of current namespace; needs LWP::UserAgent\ +n", "** download the current namespace's package; needs LWP::UserAgent\n +", "? print this help\n", "\nTAB completion, case insensitive, enabled on all sub namespaces\n +", "$0 by Discipulus as found at perlmonks.org\n\n"; # main extrapolation loop # we go on fetchin the cpan file # because now there are only namespaces while (<$cpanfh>) { # AA::BB::CC 0.01 D/DI/DISCIPULUS/AA-BB-CC-0.001.tar.gz chomp; # split namespaces, version, partial path my @fields = split /\s+/; # split namespace in AA BB CC my @names = split /::/, $fields[0]; # die if received invalid data # or is better /\.gz|z +ip|tgz|bz2$/ ? unless (defined $names[0] and $fields[2]=~ /^[A-Z]{1}\/[A-Z]{2}\/[ +A-Z]+/ ) { die "FATAL: no valid data in the file?\nReceived: $_" . join ' ',@fields ."\n"; } # sanitize names containing ' that seems to valid map {s/'/\\'/} @names; # @ancestors are @names less last element my @ancestors = @names; pop @ancestors; local $@; # # evaluate the namespaces in order to build # a big hash structure where a namespaces has many key # as contained namespaces. # additional keys are created to store the name, # the parent, and an array with version and partial path # # start of cpan container; it ends before next = sign # AA::BB::CC was splitted in the @names array as: # AA BB CC the evaluation transfoms entries in # $cpan->{'AA'}{'BB'}{'CC'} # but eval autovivifies only BECAUSE there is an assignment: ie: # $cpan->{'AA'}{'BB'}{'CC'} = --hasref with data-- eval '$cpan->{\'' . ( join '\'}{\'', @names ) . '\'} =' # hasref start . '{' # hasref . is name and . '"."=>$names[-1],' . # .. is a ref to father # if there is at least one parent # now evaluate the path to parent # else main cpan hasref is the parent '".."=> \%{$cpan' . ( defined $ancestors[0] ? '->{\'' . ( join '\'}{\'', @ancestors ) . '\'}' : '' ) . '},' # + key is used to store in an array # with version and partial path . '"+"=> [$fields[1],$fields[2]],' . # hashref containted in the current key ends here '}; '; print "WARNING: $@\n\t@fields\n" if $@; } # the current hashref namespace starts at top level of the hash my $current = \%$cpan; # first time header &header($current); # take track of namespaces and, if empty, tell us we are at top level my @cur_names; # lines below is the first time initalization for autocompletion $term->Attribs->{completion_function} = sub { my $txt = shift; return grep { /^$txt/i } grep $_ !~ $skiprx, sort keys %$current; }; # # interactive part of the program while ( defined( $_ = $term->readline( ( join '::', @cur_names ) . '>' + ) ) ) { # next on empty lines, chomp input otherwise /^$/ ? next : chomp; # remove eventual spaces on input s/\s+//g; # if exists the given (input) key (not matching the skip regex) in # the current hashref we set current and cur_names and next cycle if ( exists $$current{$_} and $_ !~ $skiprx ) { $current = \%{ $$current{$_} }; push @cur_names, $_; } # . -> ls # print current keys (not matching the skip regex) elsif ( $_ eq '.' ) { print "$_\n" for grep $_ !~ $skiprx, sort keys %$current; } # + -> ls -l # print current keys (not matching the skip regex) # with additional infos: version and partial author's path # if such infos are not there, the namespace is a container only o +ne elsif ( $_ eq '+' ) { foreach my $k ( grep $_ !~ $skiprx, sort keys %$current ) { print "$k\t", ${ $current->{$k}{'+'} }[0] ? join "\t", @{ $current->{$k}{'+'} } : "--CONTAINER NAMESPACE--", "\n"; } } # ++ -> tree # print current keys (not matching the skip regex) # with additional infos: version and partial author's path # if such infos are not there, the namespace is a container only o +ne elsif ( $_ eq '++' ) { &header($current); tree_dump($current); } # .. -> cd .. # go up one level in the datastructure elsif ( $_ eq '..' ) { pop @cur_names; $current = \%{ eval '$cpan->{\'' . ( join '\'}{\'', @cur_names ) . '\'} +' || $cpan }; } # * -> dump the readme # if LWP::UserAgent is present we fetch the readme file # of the current distribution we are navigating. # silently skip container only namespaces elsif ( $_ eq '*' ) { unless ($ua) { print "WARNING: no LWP::UserAgent support!\n"; +next; } if ( defined $$current{'+'}->[0] ) { ( my $url = 'http://www.cpan.org/authors/id/' . $$current{'+'}-> +[1] ) =~ s/\.tar\.gz/\.readme/; my $line_count; my $resp = $ua->get($url); if ( $resp->is_error ) { print "WARNING: ", $resp->status_line, " for $url\n"; next; } # rough pagination la more # prints chunks of 20 ($pagination) lines foreach my $line ( split "\n", $resp->content() ) { ++$line_count; print "$line_count:" . $line . "\n"; if ( $line_count % $pagination == 0 ) { print "-- press Enter to continue.."; while (<STDIN>) { last if $_ } } } } } # ** -> download the package # if LWP::UA is present download the current package in the curren +t dir elsif ( $_ eq '**' ) { unless ($ua) { print "WARNING: no LWP::UserAgent support!\n"; +next; } if ( defined $$current{'+'}->[0] ) { ( my $gzfile = 'http://www.cpan.org/authors/id/' . $$current{'+'}-> +[1] ) =~ s{.+/}{}; my $resp = $ua->get( 'http://www.cpan.org/authors/id/' . $$current{ +'+'}->[1], ':content_file' => $gzfile ); print $resp->is_success ? "OK: download of '$gzfile' succesfull\n" : "WARNING: ", $resp->status_line, "!\n"; } } # ? -> shows infos and help # show the content of @infos array # id est: headers of the cpan file and usage of the program elsif ( $_ eq '?' ) { print for @infos } # unknown command else { print "WARNING: '$_' command not found!\n"; +next } } # in the continue block print the header of current namespace continue { &header($current); } sub header { my $hr = shift; my $num = scalar @{ [ grep $_ !~ $skiprx, keys %$hr ] }; print "\n", ( join '::', @cur_names or 'CPAN' ), ( $$hr{'+'}->[0] ? "\t$$hr{'+'}->[0]\t$$hr{'+'}->[1]" : "" ), " contains ", $num, " namespace" . ( $num > 1 ? 's' : '' ) . "\n +\n"; } sub tree_dump { my $ref = shift; my $deep = shift || 1; foreach my $k (grep $_ !~ $skiprx, sort keys %{$ref}) { print "\t" x $deep . "$k\n"; if (ref( ${$ref}{$k}) eq 'HASH') {&tree_dump (${$ref}{$k}, ( +$deep+1))} } }
      There are no rules, there are no thumbs..
      Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
        honestly it's not only hard to read it's using some suboptimal techniques.

        about readability:
        • please consider using perltidy to reformat your code, it solves most problems afoken mentioned
        • use blank lines between logical steps, see commenting in chunks in Perl Best Practices
        • take care about naming conventions and clarity
          hasref seems to mean hash_ref but reads like a boolean has_a_reference
        • avoid deeply nested code!
          e.g. using a dispatcher like $cmd{$line}->() with %cmd=('**'=>\&dump_readme,'*'=>...)
        • prefer self commenting code, like moving code chunks into well named subs

        about commenting
        • be sure which audience you are targeting whith your comments
        • you seem to mix POD stuff (i.e. for the user) and dev-comments and (sorry) banalities,
        • line quantity doesn't equal quality
        about techniques:
        • your begin block is huge and I'm puzzled why (?)
        • your repeatedly looping with $_ over most of your lines, that's very vulnerable to bugs
        • &sub() in Perl5 is usable in the rare cases where you need to ignore prototypes
        All these mentioned problems keep me away to read more and to try it out.

        In general I'm sure you would love to have a look into Damian's PBP book.

        This book helped me a lot understanding the traps in Perl and I hope it'll help you too! :)

        Cheers Rolf

        (addicted to the Perl Programming Language and ☆☆☆☆ :)

        Updates

        ) if you need to check UserAgent within BEGIN, I'd consider using a second BEGIN block

        2) I.e. the top-down structure is hidden

      I think this could have been said in a friendlier or at least more polite way.

      For sure more constructive then.

      Cheers Rolf

      (addicted to the Perl Programming Language and ☆☆☆☆ :)

        Thanks LanX for your precious suggestions.
        I know my code (that above) is a quick hack and lacks of readability and (maybe) can be bogus prone (to demonstrate). I also know i have a lot to learn and to practice and my code (in general) need to improve. Exactly because of this i take from other good suggestions and examples and i dont mind the remnant.

        I put that book in queue. thanks.

        L*


        There are no rules, there are no thumbs..
        Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: CUFP [id://1108341]
Approved by Athanasius
help
Chatterbox?
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others lurking in the Monastery: (10)
As of 2017-11-17 18:10 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    In order to be able to say "I know Perl", you must have:













    Results (270 votes). Check out past polls.

    Notices?