PerlMonks  

I agree with the previous posts that this is an ambitious project for a beginner, but don't let that stop you.
Maybe start by trying to read some data from one of your already downloaded web pages with Perl, without using any additional modules, just to get a grip on the language.
For example, just read up on how to read a file and how to use pattern matching (regular expressions) to fetch certain data from a file according to textual context. There are plenty of examples for that which you can use as a starting point. You would then write the results to a simple text file. Then maybe try to modify that so that your output is a proper CSV file that can already be opened in Excel. This can be done simply by printing your data with commas in between and quoting text. No need for an external module in most cases (although there are modules like Text::CSV that help you with the more complex cases).
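To make that first step concrete, here is a minimal sketch in plain Perl with no extra modules. The HTML pattern (table cells with class "name" and "price"), the column names and the file names are all made up for illustration, so adapt them to whatever your saved page actually contains.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Quote one text field for CSV: double any embedded quotes, then wrap in quotes.
sub csv_field {
    my ($text) = @_;
    $text =~ s/"/""/g;
    return qq{"$text"};
}

# Pull (name, price) pairs out of a chunk of HTML.
# The markup matched here is hypothetical; change the regex to fit the real page.
sub extract_rows {
    my ($html) = @_;
    my @rows;
    while ($html =~ m{<td class="name">(.*?)</td>\s*<td class="price">([\d.]+)</td>}gs) {
        push @rows, [ $1, $2 ];
    }
    return @rows;
}

# Read a saved page and write every match as one CSV line.
sub page_to_csv {
    my ($in_path, $out_path) = @_;
    open my $in, '<', $in_path or die "Cannot read $in_path: $!";
    my $html = do { local $/; <$in> };    # slurp the whole file
    close $in;

    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    print {$out} "name,price\n";
    for my $row (extract_rows($html)) {
        my ($name, $price) = @$row;
        print {$out} csv_field($name), ",", $price, "\n";
    }
    close $out or die "Cannot close $out_path: $!";
}

# Usage: perl parse_page.pl page.html prices.csv
page_to_csv(@ARGV) if @ARGV == 2;
```

Doubling embedded quotes is what keeps a value like Tea "green" in a single spreadsheet cell. Regular expressions are fine for a regular, predictable page like this, but they get fragile quickly on messy HTML.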
Once you can do that, try to fetch the data directly from the web with LWP::Simple instead of reading from a file. First write a script that uses LWP::Simple just to download the whole page and print everything to a local file. Then try to combine that with your parser and you are almost done.
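The download-only script could look roughly like this. It is a sketch, assuming LWP::Simple is installed; the URL and file name in the usage comment are placeholders, not the page from the original question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw(getstore is_success);

# Download $url and save the response body to $path.
# getstore() returns the HTTP status code of the request.
sub download_page {
    my ($url, $path) = @_;
    my $status = getstore($url, $path);
    die "Download of $url failed with status $status\n"
        unless is_success($status);
    return $status;
}

# Usage: perl fetch_page.pl http://www.example.com/ page.html
download_page(@ARGV) if @ARGV == 2;
```

Saving to a file first means the parser you already wrote can be pointed at the result without any changes.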
If you really want the data in a proper database, you should learn basic SQL (the database query language) and the Perl way of interacting with a database (DBI, or DBIx::Class, often abbreviated DBIC - too much to go into in detail here), but be prepared: that's not going to be done in one day.
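Just to give a taste of what the DBI route looks like, here is a small sketch. It assumes the DBD::SQLite driver is installed and uses an in-memory SQLite database only because that needs no server; the table and column names are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Insert a list of [name, price] rows into a (hypothetical) prices table.
sub store_rows {
    my ($dbh, @rows) = @_;
    $dbh->do('CREATE TABLE IF NOT EXISTS prices (name TEXT, price REAL)');
    # Placeholders (?) let DBI handle the quoting of values.
    my $sth = $dbh->prepare('INSERT INTO prices (name, price) VALUES (?, ?)');
    $sth->execute(@$_) for @rows;
    return scalar @rows;
}

# Assumption: SQLite in memory; a real setup would use its own DSN and login.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '', { RaiseError => 1 });
store_rows($dbh, [ 'Tea', 3.50 ], [ 'Coffee', 4.25 ]);

my ($count) = $dbh->selectrow_array('SELECT COUNT(*) FROM prices');
print "Stored $count rows\n";
```

The same connect, prepare and execute pattern applies to other databases; mostly the connection string changes.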
Keep going and good luck!!

In reply to Re: fetching and storing data from web. by tospo
in thread fetching and storing data from web. by nicolethomson
