Yes, I know, I know - the subject has long since been beaten thoroughly to death, or at least some of you will think so.

Indeed it has been mentioned before, in places such as printer friendly... A few Google searches didn't turn up much more, but as usual that is probably just my queries that don't rock. :)

In certain places on this site, you can also add &displaytype=raw to the URL, as described, among other places, here and here. In the interesting places though, such as normal nodes, it usually bombs out with a "500 internal error".

Well, anyhow, I once again had reason to print something from this site, to study somewhere else. Then I started to wonder... exactly how hard would it be to simply strip out the unwanted parts of the page? As it turned out, it was a 10-minute job, since I cheated and used regexps instead of a real parser - but barring changes to the site's HTML (which of course are bound to happen sooner or later), it seems to work just fine. I'm just waiting to hear it break badly. :)
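To illustrate the regexp approach in miniature, here is a sketch that strips a marked region and the voting inputs from a page fragment. The sample HTML is made up and the sub name strip_fragment is hypothetical; real PerlMonks markup may well differ.

```perl
use strict;
use warnings;

# Hypothetical helper: strips unwanted regions from a page held in a scalar ref.
sub strip_fragment {
    my ($html_ref) = @_;

    # Drop everything between two marker comments.
    # Non-greedy match; /s lets "." cross newlines.
    $$html_ref =~ s{<!-- Begin nodelets -->.*?<!-- End nodelets -->}{}si;

    # Drop form inputs, plus a trailing ++ or -- vote label if present.
    $$html_ref =~ s{<input[^>]*>(?:\s*(?:\+\+|--))?}{}gi;

    return;
}

# Made-up sample, not actual PerlMonks output.
my $page = join "\n",
    '<p>Node text</p>',
    '<input type="radio" name="vote"> ++',
    '<!-- Begin nodelets -->',
    '<div>Chatterbox</div>',
    '<!-- End nodelets -->',
    '<p>Footer</p>';

strip_fragment(\$page);
print $page, "\n";
```

The non-greedy .*? stops at the first end marker. A greedy .+ would instead swallow everything up to the last one, which is one of the ways this kind of stripping can break when the HTML changes.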

To demonstrate this, I put together a little CGI that fetches a node and keeps only the parts of the page that I consider worth printing; the rest is stripped out. You can try it here, and see the code below. As an extra, I added the possibility to paste some CSS into the page too, if you want nice boxes around code blocks or something like that. Another option would have been to forward username/password combos to retain your own personal CSS if any, but that would quickly become quite a question of trust... :)
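The CSS option amounts to splicing a user-supplied string in just before the closing head tag. A minimal sketch, assuming a made-up page and a hypothetical sub named insert_css:

```perl
use strict;
use warnings;

# Hypothetical helper: inserts a caller-supplied <style> block just before </head>.
sub insert_css {
    my ($html_ref, $css) = @_;
    return unless defined $css && length $css;

    # Without /g only the first </head> is touched; the CSS goes in verbatim.
    $$html_ref =~ s{</head>}{$css</head>}i;
    return;
}

# Made-up sample page and style block.
my $page = '<html><head><title>Node</title></head><body><pre>code</pre></body></html>';
my $css  = '<style>pre { border: 1px solid #999; padding: 4px; }</style>';

insert_css(\$page, $css);
print $page, "\n";
```

Since the string is inserted unescaped, this is only reasonable when the person supplying the CSS is also the one viewing the result.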

Another option that is available is of course http://perlmonks.thepen.com, which already features a pretty stripped version of the site, although I do not like the huge part at the top of it (for printing, that is).

Well, so what was the point? I'm not sure. :) For one thing, I still want the feature where I could click a link to get a printer-friendly version of any page. If nothing else, with this I can demonstrate which parts I think should be taken out of that page, and which should stay. If this were implemented on perlmonks, it would most likely be wiser not to produce those parts at all, rather than to remove them after producing them, so this code is probably worthless. On the other hand, I got to spend some time coding on something pointless, which is one of my top three hobbies (the other two are most probably beer), and I got something to post here. Fun is important. :)

I am, however, gonna leave the code up (and you can get it from here as well), so as long as it doesn't break, anyone can solve the problem for themselves with it. But of course it would be really cool if anyone actually could use it for something. :) For now, I will use it to get my printouts when needed...

#!/usr/bin/perl -wT
use strict;
use CGI;
#use CGI::Carp 'fatalsToBrowser';
use LWP::Simple qw(get);
use vars qw($error $html);

# Good ole monks. Nothing beats monks.
my $pm = 'http://www.perlmonks.org/index.pl';

my $q = CGI->new();

print $q->header;

if($q->param) {
    my $node_id = $q->param('node_id') || '';
    my $css = $q->param('css') || '';

    if($node_id !~ m{^\d+$}) {
        $error = "Not a valid node id: '$node_id'";
    }
    else {
        $html = get "$pm?node_id=$node_id"
            or $error = "Failed to fetch node with id: '$node_id'";
        &strip_for_print(\$html, $css) if $html;
    }
}

# If first time, or something went wrong:
if($error || !$html) {
    print $q->start_html(-title=>'Printerfriendly PM');
    print $q->h2('Printerfriendly perlmonks');

    # Error reporting, if any:
    print $q->h4($error) if $error;

    print $q->start_form(-method => 'post', -action => $q->self_url);
    print $q->table(
        $q->Tr($q->td('Input node id:')),
        $q->Tr($q->td($q->textfield(-name => 'node_id'))),
        $q->Tr($q->td('Input CSS (optional):')),
        $q->Tr($q->td($q->textarea(-name => 'css', -value => ''))),
        $q->Tr($q->td($q->submit(-name => 'go', -value => 'Go go gadget copter')))
    );
    print $q->end_form;

    # Some blah blah here
    print $q->h3('Instructions:');
    print $q->p(<<'INSTRUCTIONS');
Enter the ID of a node on perlmonks in the textfield.
INSTRUCTIONS
    print $q->p(<<'INSTRUCTIONS');
You can also supply your own CSS to get a better printout, if you like.
It can be any external CSS, or just some normal CSS. You will have to
supply the &lt;style&gt; tags yourself. (This field will simply be
inserted last in the &lt;head&gt; section).
INSTRUCTIONS

    print $q->end_html;
}
# Display the printable page
else {
    print $html;
}

# Lots of assumptions here, will probably break like
# dry twigs as soon as any PM developer sneezes. :)
#
# Anyhow, here is the important part, that strips all
# the stuff we don't want in our printable page.
sub strip_for_print {
    my $html_ref = shift;
    my $css = shift;

    # Remove radio boxes, buttons and ++/-- etc
    $$html_ref =~ s{<input[^>]+>(?:\s*(\+\+|--|\+=0))?}{}gi;

    # Remove ads and the search bar
    $$html_ref =~ s{<body>.+<!-- Begin title bar -->}{}si;

    # Strip out the top links, logged in version
    $$html_ref =~ s{<a HREF="/index\.pl\?op=logout&node_id=131">.+Need Help\?\?</a>}{}si;

    # ... and logged out, which is the one used in this example
    $$html_ref =~ s{<a HREF="/index\.pl\?node=login">.+Need Help\?\?</a>}{}si;

    # Remove nodelets
    $$html_ref =~ s{<!-- Begin nodelets -->.+<!-- End nodelets -->}{}si;

    # Back to.... link.
    $$html_ref =~ s{Back\s+to\s+<a href[^<]+</a>}{}si;

    # Remove comment on...
    $$html_ref =~ s{<tr><th.+</th></tr>}{}si;

    # Insert CSS
    $$html_ref =~ s{</head>}{$css</head>}i if $css;

    # Keep the bottom notice on the page :)
}

You have moved into a dark place.
It is pitch black. You are likely to be eaten by a grue.

In reply to Printer friendly pages on perlmonks. by Dog and Pony
