Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw

Comment on

( #3333=superdoc: print w/replies, xml ) Need Help??
I agree with the previous posts that this is an ambitious project for a beginner but don't let that stop you.
Maybe start with trying to read some data from one of your already downloaded web page with Perl without using any additional modules, just to get a grip on the language.
For example, just read up on how to read a file and how to use pattern matching (regular expressions) to fetch certain data from a file according to textual context. There are plenty of examples for that which you can use as a starting point. You would then write the results to a simple text file. Then maybe try to modify that so that your output is a proper CSV file that can already be opened in Excel. This can be done simply by printing your data with commas in between and quoting text. No need for an external module in most cases (although there are modules like Text::CSV that help you with the more complex cases).
Once you can do that. Try to fetch the data directly from the web with LWP::Simple instead of reading from a file. First write a script that uses LWP::simple just to download the whole page and print everything to a local file. Then try to combine that with your parser and you are almost done.
If you really want the data in a proper database you should learn basic SQL (database query language) and the Perl way of interacting with a database (the DBI or DBIc - too much to get into details here), but be prepared that that's not going to be done in one day.
Keep going and good luck!!

In reply to Re: fetching and storing data from web. by tospo
in thread fetching and storing data from web. by nicolethomson

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    and all is quiet...

    How do I use this? | Other CB clients
    Other Users?
    Others perusing the Monastery: (3)
    As of 2018-06-23 11:23 GMT
    Find Nodes?
      Voting Booth?
      Should cpanminus be part of the standard Perl release?

      Results (125 votes). Check out past polls.