
Merlyn's secret uncovered!

by boo_radley (Parson)
on Mar 03, 2001 at 03:24 UTC ( #61933=note )


in reply to Re: Re: Server getting and utilizing cookie
in thread Server getting and utilizing cookie

All in good fun, I promise. This script scans the newest nodes, looks for a similar WT column for each one, then prints out a merlyn-like text.
Update
I guess I could've expected the reply I got... *sigh*
I wanted to search only the code listings to maximize the possibility of really getting a relevant article. If I searched through the entire site, or through the articles, I got results that made even less sense than they do now. (see this for example) :)
So, while I do plead ignorance of the column listed, I think my solution is better tailored for the situation.
Here's some sample output:



See my WT column on 'file' ideas. This smacks of cargo cult code. If I tried to run this past a customer, he'd shoot me. Are you sure you understand the problem?
For a more full featured exploration of files concepts, see my WT Column.
See my WT column on 'error' ideas.
I've already covered this topic. Check out my Web Techniques column on the subject.
I've already covered this topic. Check out my Web Techniques column on the subject.
I don't see why people insist on trying to write partially implemented solutions for this type of thing, especially when they can reference this'mail' WT column.
Why reinvent the wheel? Check mirod before you put serious time into this.
I've already covered this topic. Check out my Web Techniques column on the subject.
Oddly enough, this is precisely demonstrated in an upcoming WT column. Sadly, I can't republish the column until it has appeared in print, so wait a month or two and you'll see the whole thing.
Oddly enough, this is precisely demonstrated in an upcoming WT column. Sadly, I can't republish the column until it has appeared in print, so wait a month or two and you'll see the whole thing.
See my WT column on 'print' ideas.
I don't see why people insist on trying to write partially implemented solutions for this type of thing, especially when they can reference this'x' WT column.
I've written a lot of things here on the topic of printf. Try searching for them before asking questions like this.
This smacks of cargo cult code. If I tried to run this past a customer, he'd shoot me. Are you sure you understand the problem?
Oddly enough, this is precisely demonstrated in an upcoming WT column. Sadly, I can't republish the column until it has appeared in print, so wait a month or two and you'll see the whole thing.
Why reinvent the wheel? Check chmodded before you put serious time into this.
This smacks of cargo cult code. If I tried to run this past a customer, he'd shoot me. Are you sure you understand the problem?

#!/usr/bin/perl -w
use LWP::Simple;
use strict;    # originally "use strict qw(like a dominatrix);", which current perls reject
$|++;

#step 1 - get list of new nodes.
#         get list of answered nodes
#         remove all nodes answered.
my %newestnodes = &GetNodes();
my @merlyn_preface;
# "or", not "||": "||" binds to the filename string, so the die could never fire
open ANSWERED, "<c:\\answered_nodes.txt" or die "No answers! $!";
while (<ANSWERED>) {
    chomp;
    delete $newestnodes{$_} if exists $newestnodes{$_};
}
close ANSWERED;

#step 3 - for each unreplied node, compile "best" words.
foreach (sort keys %newestnodes){
    my $keyword = GetNode ($_);
    my $column  = GetAnswer($keyword);
    if (defined $column) {
        @merlyn_preface = (
            "I've already covered this topic. Check out <A HREF='$column'>my Web Techniques</A> column on the subject.",
            "See my <A HREF='$column'>WT column on '$keyword' ideas.</A>",
            "For a more full featured exploration of $keyword concepts, see my <A HREF='$column'>WT Column</A>.",
            "I don't see why people insist on trying to write partially implemented solutions for this type of thing, especially when they can reference this<A HREF='$column'>'$keyword' WT column.</A>"
        );
    } else {
        @merlyn_preface = (
            "I don't really understand your request, and I'm not sure you know what you want to do. Nonetheless, I suggest you browse my <A HREF='http://www.stonehenge.com/merlyn/WebTechniques/'>Web Techniques Perl columns</A>, and see if something helps you there.",
            "This smacks of cargo cult code. If I tried to run this past a customer, he'd shoot me. Are you sure you understand the problem?",
            "Why reinvent the wheel? Check [CPAN://$keyword] before you put serious time into this.",
            "I've written a lot of things here on the topic of $keyword. Try searching for them before asking questions like this.",
            "Oddly enough, this is precisely demonstrated in an upcoming WT column. Sadly, I can't republish the column until it has appeared in print, so wait a month or two and you'll see the whole thing.",
            "I smell homework!"
        );
    }
    # pick one canned reply at random
    print $merlyn_preface[rand @merlyn_preface], "\n";
}

open ANSWERED, ">>c:\\answered_nodes.txt" or die "No answers! $!";
foreach (sort keys %newestnodes){ print ANSWERED "$_\n";}
close ANSWERED;

#step 4 - search for appropriate articles, return the url for one.
#         if there are no appropriate articles, return "upcoming" or
#         "cargo cult!"
#step 5 - add replied nodes to flat file

# Scrape the Newest Nodes page and return a hash keyed by question node id.
sub GetNodes {
    my $newnodes = get('http://perlmonks.org/index.pl?node_id=3628');
    my @newsopw = ($newnodes =~/New Questions\<\/a\>\<\/H3\>\<TABLE\>(.*?)\<\/TABLE\>/i);
    $newsopw[0] =~s/ (\<\/TR\>)/\n/ig;
    my %checknodes;
    while ($newsopw[0]=~/\?node_id=(\d*)\&.*?\?node_id=(\d*)\&/ig){
        $checknodes{$1}=1;
    }
    return %checknodes;
}

# Fetch one node and return its most frequent word that is not in common.txt.
sub GetNode{
    my $node = shift;
    my $url= "http://perlmonks.org/index.pl?node_id=$node";
    my $nodetext = get ($url);
    if ($nodetext=~/<INPUT TYPE="hidden" NAME="node_id" VALUE="$node"><INPUT type=hidden name=op value=vote>(.*?)<BR><BR>.*?<CENTER.*?TABLE/sig) {
        my $text=$1;
        $text =~s/0X240/ /g;
        $text =~s/<.*?>/ /g;
        $text =~s/[^a-zA-Z0-9 ]//ig;
        my @words = split /\s+/, $text;
        my %freq;
        my %common;
        open COMMON, "common.txt" or die "no common words";
        while (<COMMON>) {chomp;my $tempwd= uc ($_) ;$common{"$tempwd"}=1;}
        close COMMON;
        foreach (@words) {
            my $tempwd = uc($_);
            if ($common{"$tempwd"}) {;next}
            $freq{$_}++ ;
        }
        my $maxval = 0;    # initialised so the first comparison does not warn under -w
        foreach (sort {$freq{$b} <=>$freq{$a}} keys %freq) {
            if ($freq{$_}>=$maxval) {
                next if !/[a-zA-Z0-9]/;
                $maxval=$freq{$_};
                return $_;
            } else {last}
        };
    }
}

# Search merlyn's code listings for the keyword; return a column URL or undef.
sub GetAnswer{
    my $keyword = shift;
    my $merlyn = get "http://web.stonehenge.com/cgi/wtsearch?search=$keyword";
    # this had a die clause on it, but I think merlyn's got a throttle on the page...
    # dying isn't sexy anyway.
    #
    return undef unless defined $merlyn;    # get() failed: treat it as "no column found"
    if ($merlyn =~/<PRE>(.*?)<\/PRE>/gis ){
        my $columns = $1;
        my %uniquecolumns;
        while ($columns =~m|http://www.stonehenge.com/merlyn/WebTechniques/col(\d+).listing.txt|gi) {
            $uniquecolumns{$1}=1;
        }
        # return the first column after a random reordering
        foreach (sort {rand(1) <=>rand(1)} keys %uniquecolumns) {
            return "http://www.stonehenge.com/merlyn/WebTechniques/col$_.html"
        }
    }
    return undef;
}
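
The keyword-picking step inside GetNode can be tried offline, without fetching anything. What follows is only a minimal sketch of that idea, not the script's actual code path: the sub name top_keyword, the inline stop-word list (standing in for common.txt) and the sample text are all made up for illustration, and ties are broken alphabetically, which the original does not do.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stop-word list; the script above reads these from common.txt.
my %common = map { ( uc($_) => 1 ) } qw(the a to i my is and of in it this how do);

# Return the most frequent non-common word in a chunk of HTML-ish text,
# or undef when no candidate word is left.
sub top_keyword {
    my ($text) = @_;
    $text =~ s/<.*?>/ /gs;          # replace tags with a space
    $text =~ s/[^a-zA-Z0-9 ]//g;    # drop punctuation, as GetNode does
    my %freq;
    for my $word (grep { length } split /\s+/, $text) {
        next if $common{ uc($word) };
        $freq{$word}++;
    }
    # Highest count first; alphabetical order makes ties deterministic.
    my ($best) = sort { $freq{$b} <=> $freq{$a} or $a cmp $b } keys %freq;
    return $best;
}

# Made-up sample question, not taken from a real node.
my $sample = '<P>How do I open a file and print the file to a socket?</P>';
print top_keyword($sample), "\n";
```

With the sample above, "file" is the only word left that appears twice, so that is what would be handed to the column search.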


Re: Merlyn's secret uncovered!
by merlyn (Sage) on Mar 03, 2001 at 03:39 UTC
Re: Merlyn's secret uncovered!
by merlyn (Sage) on Mar 03, 2001 at 04:00 UTC
    With regard to:
    I wanted to search only the code listings to maximize the possibility of really getting a relevant article. If I searched through the entire site, or through the articles, I got results that made even less sense than they do now. (see this for example) :) So, while I do plead ignorance of the column listed, I think my solution is better tailored for the situation.
    you can just have WWW::Search::Google look for:
    "This text has appeared in an edited form" site:stonehenge.com YOUR_KEYWORD_HERE
    and you'll get just the columns. I just tried it, works fine.

    -- Randal L. Schwartz, Perl hacker


    update: Thanks for that idea... I've implemented the column-only search on my site now!
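
The query merlyn describes is easy to assemble per keyword. A rough sketch, assuming nothing about how WWW::Search::Google wants to be set up or called: column_query and escape_query are hypothetical helper names, and the escaping is a simple byte-wise percent-encoding meant for ASCII keywords only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the search phrase from the reply above for one keyword.
sub column_query {
    my ($keyword) = @_;
    return qq{"This text has appeared in an edited form" site:stonehenge.com $keyword};
}

# Percent-encode everything except unreserved URL characters.
# Simple sketch: assumes the query is plain ASCII.
sub escape_query {
    my ($query) = @_;
    $query =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $query;
}

my $query = column_query('cookie');
print "$query\n";
print escape_query($query), "\n";
```

The first line printed is the phrase to hand to whatever search front end is in use; the second is the same phrase in a form that can be dropped into a URL.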
