
Re: fetching and storing data from web.

by tospo (Hermit)
on Jan 27, 2012 at 09:25 UTC (#950296)

in reply to fetching and storing data from web.

I agree with the previous posts that this is an ambitious project for a beginner, but don't let that stop you.
Maybe start by trying to read some data from one of your already downloaded web pages with Perl, without using any additional modules, just to get a grip on the language.
For example, read up on how to read a file and how to use pattern matching (regular expressions) to extract specific data from it based on the surrounding text. There are plenty of examples that you can use as a starting point. You would then write the results to a simple text file. Next, try modifying the script so that its output is a proper CSV file that can be opened directly in Excel. This can be done simply by printing your data with commas in between and quoting text fields. There is no need for an external module in most cases (although modules like Text::CSV help with the more complex cases).
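A rough sketch of that first step might look like the following. It assumes, purely for illustration, that the interesting lines have the form "Label : value"; the names parse_line and csv_field and the file names are made up here, so adapt the pattern to whatever your own files contain.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a "Label : value" line into its two parts (hypothetical input format).
# Returns an empty list when the line does not match.
sub parse_line {
    my ($line) = @_;
    return $line =~ /^\s*([^:]+?)\s*:\s*(.+?)\s*$/;
}

# Quote one value for CSV: double any embedded quotes, then wrap in quotes.
sub csv_field {
    my ($text) = @_;
    $text =~ s/"/""/g;
    return qq{"$text"};
}

# Run as: perl extract.pl input.txt output.csv
if (@ARGV == 2) {
    my ($in_file, $out_file) = @ARGV;
    open my $in,  '<', $in_file  or die "Cannot read $in_file: $!";
    open my $out, '>', $out_file or die "Cannot write $out_file: $!";
    while (my $line = <$in>) {
        chomp $line;
        my @fields = parse_line($line) or next;    # skip lines that do not match
        print {$out} join(',', map { csv_field($_) } @fields), "\n";
    }
    close $out or die "Cannot close $out_file: $!";
}
```

Quoting every field is the simplest rule that keeps commas and quotes inside a value from breaking the file; Text::CSV becomes worthwhile once values can also contain line breaks.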
Once you can do that, try to fetch the data directly from the web with LWP::Simple instead of reading from a file. First write a script that uses LWP::Simple just to download the whole page and print everything to a local file. Then combine that with your parser and you are almost done.
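The download step could be as small as this sketch, which uses getstore and is_success from LWP::Simple. The sub name save_page and the command-line handling are only illustrative, and LWP::Simple has to be installed from CPAN first.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;    # exports getstore() and is_success() by default

# Download $url and write the response body to $file.
# Dies if the server (or the network) reports a failure.
sub save_page {
    my ($url, $file) = @_;
    my $status = getstore($url, $file);
    die "Download of $url failed with status $status\n"
        unless is_success($status);
    return $status;
}

# Run as: perl fetch.pl <url> page.html
if (@ARGV == 2) {
    my ($url, $file) = @ARGV;
    save_page($url, $file);
    print "Saved $url to $file\n";
}
```

Once the page is on disk, the file-reading script you already have can be pointed at it unchanged, which keeps downloading and parsing separate while you are still debugging each part.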
If you really want the data in a proper database, you should learn basic SQL (the database query language) and the Perl way of interacting with a database (DBI, or DBIx::Class, often abbreviated DBIC; too much to go into in detail here), but be prepared: that's not going to be done in one day.
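To give a first impression of DBI, here is a minimal sketch of inserting rows with a prepared statement. The table name speeds and the sub name insert_rows are invented for the example, the column names are guesses, and the DSN comes from the command line. For MySQL you would need the DBD::mysql driver installed alongside DBI.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Insert rows of [state, district, date, speed] through one prepared statement.
# $dbh is any connected DBI database handle; "speeds" is a hypothetical table.
sub insert_rows {
    my ($dbh, $rows) = @_;
    # The ? placeholders let DBI quote the values, so no manual escaping is needed.
    my $sth = $dbh->prepare(
        'INSERT INTO speeds (state, district, date, speed) VALUES (?, ?, ?, ?)'
    );
    $sth->execute(@$_) for @$rows;
    return scalar @$rows;
}

# Run as: perl load.pl "dbi:mysql:database=test" user password
if (@ARGV) {
    require DBI;    # loaded only when we actually connect
    my ($dsn, $user, $pass) = @ARGV;
    my $dbh = DBI->connect($dsn, $user, $pass, { RaiseError => 1 });
    my $count = insert_rows($dbh, [
        [ 'ABC', 'ZUNHEBOTO', '02/02', '004' ],
        [ 'ABC', 'ZUNHEBOTO', '03/02', '002' ],
    ]);
    print "Inserted $count rows\n";
    $dbh->disconnect;
}
```

With RaiseError switched on, a failed prepare or execute dies with the database's error message, so the script does not need to check every call by hand.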
Keep going and good luck!!

Re^2: fetching and storing data from web.
by nicolethomson (Initiate) on Feb 01, 2012 at 09:45 UTC

    Thanks everyone

    I am partly doing things with awk/sed and bash commands; of course, Google did help. First I ran:
        sed -n -e 's/^[ ]*//g'  -e  's/\([0-9a-zA-Z\.]*\)  */\1 /g' -e 10p -e 15p -e 23p nic.htm > nic.txt
    Then I tried:
        perl -ne 'print;' *.txt > all.csv
    but the result was not comfortable to read, so I tried:
        for file in *.txt; do   cat "$file";   echo; done > newfile.csv
    Now the .csv or .txt file gives me the result in a readable format. From this text file, I need to send the data to a database.
    DISTRICT : ZUNHEBOTO STATE : ABC 02/02 03/02 04/02 05/02 06/02 speed 004 002 004 004 004
    (the next line is the next paragraph)
    DISTRICT : YUNHEBOOT STATE : EFG 02/02 03/02 04/02 05/02 06/02 speed 004 002 004 004 004
    That is the result when I cat the file. In the database I have created a table with the fields STATE, District, date, and speed. How do I import the data into the database? I ran perl -MCPAN -e shell and installed HTML::Parser; what else do I need to install for MySQL?
      What exactly are you trying to achieve with
      perl -ne 'print;' *.txt > all.csv
      for file in *.txt; do cat "$file"; echo; done > newfile.csv
      ??? Those are a bit pointless: they just concatenate files into another file (the loop merely adds a blank line after each one), which is much the same as doing
      cat *.txt > newfile.txt
      And these are not magically creating CSV format for you; it's just the same text as in the input files. No idea what you mean by "how to import it to db perl -MCPAN -e shell"???
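To actually get CSV out of lines like the ones quoted above, the text has to be taken apart first. The sketch below is one way to do that, under the assumption (not confirmed in the post) that the n-th number after "speed" belongs to the n-th date; record_to_rows is a made-up name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn one "DISTRICT : ... STATE : ... <dates> speed <values>" line into
# rows of [state, district, date, speed], one row per date.
# Assumption: the n-th speed value belongs to the n-th date.
sub record_to_rows {
    my ($line) = @_;
    my ($district, $state, $rest) =
        $line =~ /DISTRICT\s*:\s*(\S+)\s+STATE\s*:\s*(\S+)\s+(.*)/
        or return;    # not a record line
    my ($date_part, $speed_part) = split /\s*speed\s*/, $rest, 2;
    my @dates  = $date_part =~ m{(\d{2}/\d{2})}g;
    my @speeds = defined $speed_part ? $speed_part =~ /(\d+)/g : ();
    # Pair dates and speeds, stopping at the shorter of the two lists.
    my $n = @dates < @speeds ? @dates : @speeds;
    return map { [ $state, $district, $dates[$_], $speeds[$_] ] } 0 .. $n - 1;
}

# Run as: perl records.pl nic.txt > nic.csv
if (@ARGV) {
    while (my $line = <>) {
        chomp $line;
        print join(',', @$_), "\n" for record_to_rows($line);
    }
}
```

Each output line then has the same four fields as the table described in the question, so the rows can be opened in Excel or handed to a database insert as they are.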
        True, tospo. I'm taking time to learn things; what I wrote might be kid's-level code.
