PerlMonks
Re^2: Unwritten Perl Books: 2007 version

by lin0 (Curate)
on May 17, 2007 at 19:23 UTC ( #616089 )


in reply to Re: Unwritten Perl Books: 2007 version
in thread Unwritten Perl Books: 2007 version

Hi samizdat,

I think I'd like to see one on building web systems for data analysis, with a discussion of graphing and visualizing on the fly.

me too :-)

And since we are talking about web systems, I would love to read more about how Yahoo Pipes was built.

By the way samizdat, can you tell us a little bit more about the systems you are developing at Sandia Labs? Something like which modules you use and why, how you deal with large amounts of data, what the response time of your system is like, etc.

Cheers,

lin0


Re^3: Unwritten Perl Books: 2007 version
by samizdat (Vicar) on May 17, 2007 at 20:11 UTC
    Sure, lin0 -

    We had a boatload of data parsed from ASCII log files of chip testers. Each test had its own syntax for responses and some were pass-fail, others were binned. I had a bunch of log parsers (straight Perl scripts, although with the parsing driven by files of regexen and hints for each test), and they stuffed a big MySQL database.
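    The regex-driven parsing step described above might have looked something like the stripped-down sketch below. The test names, patterns, and sample log lines are invented for illustration, and the MySQL INSERT is replaced by an in-memory array so the sketch runs standalone; the real parsers loaded their regexes and hints from per-test config files and stuffed the rows into the database via DBI.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One pattern per tester response syntax (invented examples):
# a pass/fail test and a binned test.
my %test_pattern = (
    VDD_LEAK  => qr/^VDD_LEAK:\s+(PASS|FAIL)$/,   # e.g. "VDD_LEAK: PASS"
    SPEED_BIN => qr/^SPEED_BIN:\s+bin=(\d+)$/,    # e.g. "SPEED_BIN: bin=3"
);

# Try each known test's regex against a log line; return a record
# hashref on a match, or nothing if the line is unrecognized.
sub parse_log_line {
    my ($line) = @_;
    for my $test (sort keys %test_pattern) {
        if ($line =~ $test_pattern{$test}) {
            return { test => $test, result => $1 };
        }
    }
    return;
}

# Stand-in for reading an ASCII tester log file.
my @log = (
    'VDD_LEAK: PASS',
    'SPEED_BIN: bin=3',
    'GARBAGE LINE',
);

# In the real system each record would be INSERTed into MySQL via DBI;
# here we just collect the parsed records.
my @records = grep { defined } map { parse_log_line($_) } @log;
printf "%s => %s\n", $_->{test}, $_->{result} for @records;
```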

    The overall app/tool structure was that I used cron jobs to seek out new log files and parse them into the (MySQL 4) DB. All the tables were very simple (read, FLAT), no special joins or exotic relationalism. My (Apache 1.3.33) web pages were built with Embperl (1.6, not 2), which did the data presentation. GD-based modules were used (GD::Graph::bars3d, IIRC) to generate pix on the fly. These were dumped into a temporary directory which was housecleaned of all files older than 3 hours. Another option was to build spreadsheets (Spreadsheet::WriteExcel::Big) or formatted RTFs (RTF::Writer, IIRC) from the data. Likewise, these were disposable.
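    The 3-hour housecleaning of that temporary image directory can be done with a few lines of core Perl along these lines. The directory and file ages are fabricated here so the sketch is self-contained; the real job would run from cron against the actual temp dir.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Fake the temp directory of generated charts with a scratch dir.
my $dir = tempdir(CLEANUP => 1);

# Fake two generated files: one fresh, one four hours old.
for my $name (qw(fresh.png stale.png)) {
    open my $fh, '>', "$dir/$name" or die "create: $!";
    close $fh;
}
my $four_hours_ago = time - 4 * 3600;
utime $four_hours_ago, $four_hours_ago, "$dir/stale.png" or die "utime: $!";

# -M gives the file age in days relative to script start,
# so "older than 3 hours" is an age greater than 3/24 days.
my $max_age_days = 3 / 24;
for my $file (glob "$dir/*") {
    unlink $file if -M $file > $max_age_days;
}

my @left = map { s{.*/}{}r } glob "$dir/*";
print "remaining: @left\n";    # only fresh.png should survive
```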

    All the web pages were dynamic Embperl constructs, and they used the directory structure to allow me to use the same pages for many different test sets. Embperl's really good at that, because the environment persists as you go deeper into the directory hierarchy unless you overwrite it locally. Very trick. Embperl also has a lot of automatic HTML table-generation functionality which just blows PHP away. Dunno about the other web-ready Perl templating systems, but I was able to do a lot with a tiny bit of Embperl coding.
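    For readers who haven't seen it, Embperl's automatic table expansion works roughly like this (HTML::Embperl 1.x syntax; the data here is invented). A single `<td>` cell indexed by the magic $row/$col variables is all you write:

```
[- @data = ( ['DUT', 'Result'],
             ['A1',  'PASS'  ],
             ['A2',  'FAIL'  ] ) -]
<table border="1">
  <tr>
    <td>[+ $data[$row][$col] +]</td>
  </tr>
</table>
```

    Embperl repeats the row and cell automatically while $data[$row][$col] is defined, so one template line renders the whole table; that is the table-generation feature contrasted with PHP above.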

    We didn't have a lot of traffic, but we did have a pretty hefty storehouse of data. Even so, the queries returned full pages (with graphics and color-coded data tables) in just a few seconds. I could have split out the data into separate data servers, but there was no real need. Truth is, Open Source tools are plenty fast enough for most applications without needing threading, multi-processing, or multiple machines.

    Don Wilde
    "There's more than one level to any answer."
