Beefy Boxes and Bandwidth Generously Provided by pair Networks
PerlMonks  

Re^6: Scrape Yahoo Financial Historical- Process Dataset - format and create dynamic page

by tbone654 (Sexton)
on Sep 24, 2012 at 05:37 UTC (#995293)


in reply to Re^5: Scrape Yahoo Financial Historical- Process Dataset - format and create dynamic page
in thread Scrape Yahoo Financial Historical- Process Dataset - format and create dynamic page

So, what I think I'm hearing you suggest is this: rather than trying to get Yahoo to install a Perl module into its /usr/bin/perl structure on all their web servers, it's possible to build a stripped-down perl with all the required modules built right into it, upload that package and all the related files somewhere in the path I own, and then somehow point to that build rather than the default for the Yahoo server?

Now to be more specific, I build something like citrusperl on my laptop or wherever, add the CPAN perl modules I need to that, then use cavapackager to build a binary package that I can upload to somewhere in the tree to my site? Then I guess I change the path to the new perl instance in my script to use my version of perl somehow? I anticipate that I would need some symbolic links at worst...

Is that the idea? -- thank you for sticking with me on this by the way, I do appreciate it...

It seems like I could install cygwin somewhere without perl, then install citrusperl, cava package it, upload it to Yahoo, test it, and keep the cygwin environment because I will probably wish to add functionality over time. Then I just refresh the upload whenever I add a non-standard module, and I'm back in business. Correct?


Re^7: Scrape Yahoo Financial Historical- Process Dataset - format and create dynamic page
by Anonymous Monk on Sep 24, 2012 at 06:13 UTC

    Um, kinda, the general idea, yes :)

    First idea I would try is to read the docs for your hosting plan :) they explain how to install things, what you're allowed to install, and so on, so you don't waste your time or piss them off :) -- look above, pair.com offers hosting plans that allow you to install any perl you want :)

    Then, if you have shell access, use cpan/cpanm ... with INSTALL_BASE to install prerequisites

    Or install a modern perl (like 5.14/5.16) with all prerequisites

    If no shell access, use http://cgipan.sf.net/ to do the same ( linked via Installing modules without root and shell via Yes, even you can use CPAN )

    cpanminus is a zero-config cpan replacement, and might make the installing go smoother

    If and only if the above directions fail, go through the hassle of installing a virtual machine, with same/equivalent operating system as what your webhost has,

    Then you can install whatever binary perl distribution you want (strawberry/activestate/citrusperl ), install all the modules you need,

    Then either pack a single executable using perlapp, par/pp, cavapackager

    Or pack a complete distribution , either DIY whole directory tree with activeperl/Strawberry, or use citrusperl, it comes with a nice GUI for creating distributions

    perlapp supposedly can target other platforms, doesn't require a virtual machine, but still requires all the prereqs to be available for that platform via ppm , and there might be some licensing issue -- I've no experience with this

      AM,

      I downloaded the modules from CPAN... Figured out how to debug from a browser... Uploaded the modules to Yahoo into a directory I created /lib... put a statement into my perl script...

      use lib "./lib"

      added additional module dependencies... and now it works, I can get historicals directly from Yahoo, and print them right to the browser.

      2012/10/01,144.5200,145.6900,144.0100,144.3500,135911200 
      2012/10/02,144.9200,145.1500,143.8300,144.5000,113422200 
      2012/10/03,144.8900,145.4300,144.1300,145.0900,121283100 
      2012/10/04,145.6400,146.3400,145.4400,146.1300,124311600 
      2012/10/05,146.9100,147.1600,145.7000,146.1400,124842100 
      2012/10/08,145.6000,146.1200,145.3100,145.6400,78415400 
      2012/10/09,145.5300,145.6500,144.1500,144.2000,148872900 
      2012/10/10,144.1800,144.3200,143.0900,143.2800,123992700 
      

      Awesome, the power of figuring this out. Thank you so much for the great hints.

      My next task is to perform some math on the data and then print the original data plus calculated columns back to the browser. There are modules out there ("Stockmonkey", Math::Business::xxx) that do some common financial-type things to the data. But it uses a database behind the calcs, and I think I must avoid that. The way this is going to look in the end is something like this:

      Date Open High Low Close Volume    op rg cl rg 89 163 str Reaction up trend dn trend Reaction DPDH RPDL APDH BPDL 
      10/10/2012 144.18 144.32 143.09 143.28 124,247,408  4 B .89 .15 -1  .29 .00 .00 .00 147.24 1.23 1.23 .00 .00 
      2012/10/10 144.18 144.32 143.09 143.28 123,992,700  3 ss .89 .15 1  .31 .00 .00 .00 147.24 2.56 .17 .00 1.06 
      2012/10/09 145.53 145.65 144.15 144.20 148,610,100  2 S .92 .03   .30 .00 .00 .00 147.24 1.97 .34 .00 1.16 
      2012/10/08 145.60 146.12 145.31 145.64 78,415,400 B 1 B .36 .41   .32 .00 .00 .00 147.24 1.85 .42 .00 .39 
      2012/10/05 146.91 147.16 145.70 146.14 124,842,100  6 ss .83 .30   .30 .00 .00 .00 147.24 .64 1.72 .82 .00 
      2012/10/04 145.64 146.34 145.44 146.13 124,311,600  5 S .22 .77   .28 .00 .00 .00 147.24 -.01 2.21 .91 .00 
      

      So my question is something like this: I believe I need to use an array of arrays (sorted by first element, Date) and append my calcs to the end of each array. (One line/array is a list of elements that I have to store in memory.) Then I print it sorted (reverse) to the browser. Am I on the right track in that thinking? I'm worried about building a large data structure in memory on a server I don't own. I'm wondering if a module already exists that may do some of this already? And I don't know what I don't know, so I was hoping to run this by you first. Any thoughts?

      I can see it in c code like this...
      
      for (i = 0; i < 250; i++) {        /* one row per trading day */
          for (j = 0; j < 20; j++) {     /* one column per value */
              array[i][j] = 0;           /* placeholder: do some math here */
          }
      }
      
      see what I mean?  make sense?
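For comparison, here is a minimal Perl sketch of the same array-of-arrays idea. It assumes the comma-separated Date,Open,High,Low,Close,Volume layout printed earlier in the thread; the high-minus-low range column is only a placeholder for the real calculations, and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Sample rows in the Date,Open,High,Low,Close,Volume layout shown earlier.
my @lines = (
    '2012/10/08,145.6000,146.1200,145.3100,145.6400,78415400',
    '2012/10/09,145.5300,145.6500,144.1500,144.2000,148872900',
    '2012/10/10,144.1800,144.3200,143.0900,143.2800,123992700',
);

# Build the array of arrays: one array reference per trading day.
my @rows = map { [ split /,/ ] } @lines;

# Append a calculated column to each row. The high-low range is only a
# placeholder for "do some math".
for my $row (@rows) {
    my ( $date, $open, $high, $low, $close, $volume ) = @$row;
    push @$row, sprintf '%.2f', $high - $low;
}

# YYYY/MM/DD strings sort chronologically as plain text, so a reversed
# string comparison puts the newest day first.
my @newest_first = sort { $b->[0] cmp $a->[0] } @rows;

print join( ' ', @$_ ), "\n" for @newest_first;
```

At roughly 250 rows by 20 columns this is only a few thousand scalars, which is unlikely to be a memory concern, though measuring on the actual host is the only way to be sure.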
      
      

        use lib "./lib"

        Relative paths are fragile, use a full path :) it can be dynamic ( see File::FindLib, FindBin, use lib $lib_dir not working )
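A minimal sketch of the FindBin approach, assuming the uploaded modules live in a lib directory next to the script (that layout is an assumption taken from the earlier post):

```perl
use strict;
use warnings;

# FindBin sets $Bin to the directory containing the running script,
# so the library path no longer depends on the current working directory.
use FindBin qw($Bin);
use lib "$Bin/lib";

# Confirm the directory is now on the module search path.
print "lib dir on search path: $Bin/lib\n"
    if grep { $_ eq "$Bin/lib" } @INC;
```

With this in place, the script finds its modules no matter which directory the web server happens to start it from.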

        But it uses a database behind the calcs, and I think I must avoid that.

        The FAQ says to use a database to store results, so you don't need one, but if you do, you can always use DBD::SQLite

        Am I on the right track in that thinking? I'm worried about building a large data structure in memory on a server I don't own. I'm wondering if a module already exists that may do some of this already?

        I feet not :)

        A casual look at Math::Business::HMA/ Math::Business::StockMonkey::FAQ ... shoes hints it doesn't use a lot of memory (it doesn't build a giant array), so I'd just use stockmonkey, but I would do local testing to check for memory growth ( Devel::NYTProf, Devel::Leak Devel::LeakTrace, WeakRef )
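To illustrate why a streaming indicator needs little memory, here is a hand-rolled sketch of a simple moving average that keeps only the last few values. It is not StockMonkey's code or API; the name make_moving_average is hypothetical and exists only for this example.

```perl
use strict;
use warnings;

# Returns a closure that accepts one closing price at a time and returns
# the average of the last $days prices, or undef until enough have arrived.
sub make_moving_average {
    my ($days) = @_;
    my @window;      # never holds more than $days values after each call
    my $sum = 0;
    return sub {
        my ($close) = @_;
        push @window, $close;
        $sum += $close;
        $sum -= shift @window if @window > $days;
        return @window == $days ? $sum / $days : undef;
    };
}

# Feed three closing prices through a 3-day average.
my $sma3 = make_moving_average(3);
for my $close ( 145.64, 144.20, 143.28 ) {
    my $avg = $sma3->($close);
    printf "%s\n", defined $avg ? sprintf( '%.2f', $avg ) : 'n/a';
}
```

Memory use here depends on the window size rather than on how many rows of history are fed in, which is the property worth checking for in any module you pick.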

        *hint* the memory question would make a good addition to Math::Business::StockMonkey::FAQ

        I feel my shoes with my feet , wanna take good care , remain elite
