PerlMonks  

Re: Scrapping web site - printing - save to file

by davido (Archbishop)
on Jul 30, 2012 at 15:51 UTC (#984467)


in reply to Scrapping web site - printing - save to file

Use a modern web framework such as Mojolicious or Dancer. Use a database such as SQLite to save the press releases, so that your users don't experience the Schlemiel the Painter effect (re-reading everything from the start) whenever they want to get to the last press release in an ever-growing file. Manage your DB connection with Mojolicious::Plugin::Database. Use Mojolicious::Plugin::Authentication along with a trusted CPAN digest module to handle authentication of your administrative user. (I use Class::User::DBI, but I'm highly biased, and it may be bigger than you need. Minimally, Authen::Passphrase is a nice starting point.)
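To make the SQLite suggestion concrete, here is a minimal sketch using plain DBI with DBD::SQLite (both from CPAN). The table name, column names, and the two helper subs are made up for illustration, and an in-memory database stands in for a real file. Inside a Mojolicious app you would presumably get the handle from Mojolicious::Plugin::Database instead of connecting by hand.

```perl
use strict;
use warnings;
use DBI;

# In-memory database for illustration only; a real application would
# point dbname at a file on disk.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });

# Hypothetical schema: one row per press release.
$dbh->do(<<'SQL');
CREATE TABLE press_releases (
    id     INTEGER PRIMARY KEY AUTOINCREMENT,
    title  TEXT NOT NULL,
    body   TEXT NOT NULL,
    posted TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
SQL

# Store one press release (placeholders keep the values safely quoted).
sub add_release {
    my ($dbh, $title, $body) = @_;
    $dbh->do('INSERT INTO press_releases (title, body) VALUES (?, ?)',
        undef, $title, $body);
    return;
}

# Fetch the newest release. Ordering by the integer primary key lets
# SQLite go straight to the last row instead of reading every row.
sub latest_release {
    my ($dbh) = @_;
    return $dbh->selectrow_hashref(
        'SELECT id, title, body, posted FROM press_releases
         ORDER BY id DESC LIMIT 1');
}

add_release($dbh, 'First release',  'Body one.');
add_release($dbh, 'Second release', 'Body two.');

my $latest = latest_release($dbh);
print "$latest->{title}\n";
```

The point is that getting to the last release costs about the same whether the table holds ten rows or ten thousand, which is exactly what a flat, append-only file can't give you.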

The whole thing would probably fit nicely into a Mojolicious::Lite application, but if it grows to the point that you need a stronger separation of concerns, you can easily inflate a Mojolicious::Lite application into a full app, where the templates move into their own files, the controllers into their own classes, and the router makes up the bulk of the application class. See Mojolicious::Guides::Growing.

Update: I failed to mention earlier that if your application also needs to do some scraping (did you mean scraping rather than scrapping?), then Mojolicious really is a good choice of web framework, because it comes bundled with Mojo::UserAgent, a "Non-blocking I/O HTTP and WebSocket user agent"; Mojo::DOM, a "Minimalistic HTML5/XML DOM parser with CSS3 selectors"; and Mojo::JSON, a "Minimalistic JSON" parser and generator. Those are many of the important tools used in effective scraping.
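As a rough illustration of the scraping side, here is a small sketch that uses Mojo::DOM's CSS selectors to pull titles and bodies out of some markup. The HTML, the `div.release` class, and the hash keys are invented for the example; your target site's structure will differ. It parses an inline string so that it runs without network access.

```perl
use strict;
use warnings;
use Mojo::DOM;

# Hypothetical markup standing in for a fetched page. Against a live
# site, the equivalent DOM object would come from something like
#   Mojo::UserAgent->new->get($url)->res->dom
my $html = <<'HTML';
<div class="release"><h2>First release</h2><p>Body one.</p></div>
<div class="release"><h2>Second release</h2><p>Body two.</p></div>
HTML

my $dom = Mojo::DOM->new($html);

# Collect one hash per press release found by the CSS selector.
my @releases;
for my $node ($dom->find('div.release')->each) {
    push @releases, {
        title => $node->at('h2')->text,
        body  => $node->at('p')->text,
    };
}

print "$_->{title}: $_->{body}\n" for @releases;
```

From there, each hash could be handed to whatever code stores the releases in your database.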


Dave

