PerlMonks  

This is a difficult issue. It sounds as if you are going to have a handful of data collection schemes for your 100 data sources. That, combined with the desire to create a single, unified output format, leads me to think hierarchical. You will have significant overlap in how you access the raw data for each data source. All static web sites would have a URL that you access. All FTP sites would have a remote server, login, password, and file path. All RDBMS sources will have similar credentials. In all cases, you will want to do the following:
  • fetch_raw_data
  • parse_raw_data
  • write_parsed_data

This would make me go with something akin to:

    DataCollector
    DataCollector::Mechanized
    DataCollector::Mechanized::WalMart
    DataCollector::Mechanized::GeneralElectric
    DataCollector::RDBMS
    DataCollector::RDBMS::ExxonMobil
    DataCollector::FTP
    DataCollector::FTP::GeneralMotors
    DataCollector::Scrape
    DataCollector::Scrape::FordMotorCompany
    DataCollector::Scrape::CiscoSystemsInc
    . . . etc.

    Your driver program would then, unfortunately, need to either know all of the DataCollector leaf classes or have some way to load and run them dynamically. For each of these classes, though, you could call the three methods listed above. Those methods would make private method calls on down until they reach the ugly details in the individual implementation classes. Each implementation class would only need to know where to go for its data and how to pull the real data out of the raw data source. One level up is the knowledge of how to talk to that type of data source, based on information supplied by the implementation class. The top level holds the detail of how to write out the data.
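
    To make the layering concrete, here is a minimal sketch of one branch of that hierarchy plus a small driver. It is only an illustration under assumptions, not a finished design: the run, connection_info, and collect_all names, the "source|key|value" output format, the ftp.example.com details, and the canned FTP payload are all hypothetical. A real DataCollector::FTP would presumably do the transfer with something like Net::FTP instead of returning fixed text.

```perl
use strict;
use warnings;

# Top level: owns the public workflow and the unified output format.
package DataCollector;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

# Hypothetical convenience method: run the three steps in order.
sub run {
    my ($self) = @_;
    my $raw     = $self->fetch_raw_data;
    my $records = $self->parse_raw_data($raw);
    return $self->write_parsed_data($records);
}

# Subclasses are expected to override these two.
sub fetch_raw_data { my ($self) = @_; die ref($self) . " must implement fetch_raw_data\n" }
sub parse_raw_data { my ($self) = @_; die ref($self) . " must implement parse_raw_data\n" }

# Assumed unified format: one "source|key|value" line per record.
sub write_parsed_data {
    my ($self, $records) = @_;
    my $name = ref $self;
    my $out  = '';
    $out .= join('|', $name, $_->{key}, $_->{value}) . "\n" for @$records;
    return $out;
}

# Middle level: knows how to talk to one type of data source.
package DataCollector::FTP;
our @ISA = ('DataCollector');

sub fetch_raw_data {
    my ($self) = @_;
    my $conn = $self->connection_info;    # supplied by the leaf class
    # Stub: a real version would log in to $conn->{host} and fetch
    # $conn->{path}. Canned text keeps the sketch runnable offline.
    return "sales=42\nunits=7\n";
}

# Leaf level: knows where the data lives and how to pick it apart.
package DataCollector::FTP::GeneralMotors;
our @ISA = ('DataCollector::FTP');

sub connection_info {
    return { host => 'ftp.example.com', user => 'anonymous', path => '/reports/latest.txt' };
}

sub parse_raw_data {
    my ($self, $raw) = @_;
    my @records;
    for my $line (split /\n/, $raw) {
        my ($key, $value) = split /=/, $line, 2;
        push @records, { key => $key, value => $value };
    }
    return \@records;
}

package main;

# Driver: given leaf class names, load each one if needed and run it.
sub collect_all {
    my @classes = @_;
    my $out = '';
    for my $class (@classes) {
        # Classes defined in this file are already loaded; otherwise
        # turn Foo::Bar into Foo/Bar.pm and require it from @INC.
        unless ($class->can('new')) {
            (my $file = "$class.pm") =~ s{::}{/}g;
            require $file;
        }
        $out .= $class->new->run;
    }
    return $out;
}

print collect_all('DataCollector::FTP::GeneralMotors');
```

    Run as written, this prints one line per parsed record, each prefixed with the leaf class name. The driver only needs a list of class names, which could come from a config file or a directory scan, so adding a new source means adding one leaf class and one entry in that list.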

    I hope this makes sense, isn't too vague, etc. Good luck.

    Ivan Heffner
    Sr. Software Engineer, DAS Lead
    WhitePages.com, Inc.

    In reply to Re: Thorny design problem by Codon
    in thread Thorny design problem by tlm
