PerlMonks  

Creating a Directory Site

by CodeJunkie (Monk)
on Aug 12, 2003 at 14:16 UTC (#283191)

CodeJunkie has asked for the wisdom of the Perl Monks concerning the following question:

Hi,
I apologise for the incredibly general nature of this question, but I was wondering if anyone knows any good ways to create a directory site, i.e. something a bit like directory.google.com.

I have been asked to help develop a student website. At the moment the site comprises hundreds of individual HTML pages full of links to other sites. It's basically a massive directory site built from individual HTML pages.

As you can imagine this is A COMPLETE NIGHTMARE to maintain, so they want me to provide a better way to maintain the site.

I was thinking about putting everything in a MySQL database and storing everything in different categories etc., but having looked at Google, it seems they have static HTML pages too (although I am guessing these must be dynamically generated...?)

Just wondering if anyone could advise me on suitable ways of developing directory sites... maybe there are CPAN modules out there that could help me...? Any help and advice appreciated.

Cheers,
Tom

Replies are listed 'Best First'.
Re: Creating a Directory Site
by ctilmes (Vicar) on Aug 12, 2003 at 14:29 UTC
    As you said, there are many ways to do this, and I'm sure you'll get lots of advice.

    My advice is this: See PostgreSQL and HTML::Mason.

    Mason has nice built-in caching that allows you to build your pages dynamically from your database, then keep the page around for better performance.
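
    To make the "build it once, then keep it around" idea concrete, here is a minimal sketch in Python rather than Perl. It is not Mason's caching API; the PageCache class, its max_age parameter and the render_category function are all invented for illustration, and the "database query" is just a placeholder.

```python
import time

class PageCache:
    """Keep generated pages for max_age seconds (hypothetical helper, not Mason's API)."""

    def __init__(self, build, max_age=300.0, clock=time.monotonic):
        self._build = build      # function that generates a page for a key
        self._max_age = max_age  # seconds a cached page stays valid
        self._clock = clock      # injectable clock, handy for testing
        self._pages = {}         # key -> (time stored, page)

    def get(self, key):
        now = self._clock()
        hit = self._pages.get(key)
        if hit is not None and now - hit[0] < self._max_age:
            return hit[1]        # still fresh: skip the expensive rebuild
        page = self._build(key)
        self._pages[key] = (now, page)
        return page

build_count = 0

def render_category(name):
    # Placeholder for "query the database and fill in a template".
    global build_count
    build_count += 1
    return f"<h1>{name}</h1>"

demo_cache = PageCache(render_category, max_age=300.0)
demo_cache.get("Perl")
demo_cache.get("Perl")  # second request is served from the cache
print(build_count)
```

    The second request does not call render_category again, which is the whole point: the database is only hit when a page is missing or older than max_age.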

      Thanks for the comments. I was thinking about Mason; the trouble is I haven't really done any work with it before, and doesn't it require mod_perl? Also the thought of OO Perl is a little worrying... but I'll definitely think about it. Caching is a good idea even if I don't go for Mason.

      cheers,
      Tom

        If you are afraid of using OO Perl, an unknown module or mod_perl, perhaps you shouldn't take on the task of creating a directory site. Or at least not in Perl; perhaps you are more fluent in other languages. But your Perl skills don't seem to be evolved enough to handle a task as big as the one you described.

        Abigail

Re: Creating a Directory Site
by Ryszard (Priest) on Aug 12, 2003 at 14:48 UTC
    I think the best place to start is "Requirements Gathering".

    Although Perl can most definitely do what you want, it may not be the best tool for you.

    What are the goals of the redesign? Just saying improved maintenance is not really enough... what you need to do is start at the top, get agreed goals, and break down the project piece (pun intended) by piece into smaller chunks.

    For each small chunk, you can say it will take me 2-3 weeks to complete a production quality application or piece. From this data you can quantify to management how long it will take, and you can "Manage their expectations (tm)", which is always a good thing.

    So, in short

    Just wondering if anyone could advise me on suitable ways of developing...

    Planning.

Re: Creating a Directory Site
by cfreak (Chaplain) on Aug 12, 2003 at 14:35 UTC

    CodeJunkie,
    I kind of need this too, and I've been working on it in my spare time to update a site for work... unfortunately there doesn't seem to be anything out there that is much help for this specific problem.

    I'm going to stay away from code that generates the pages. The problem with that approach is that it still creates a mess, especially in my experience working with people who thought they could simply update the HTML pages, which of course got everything out of sync. :) YMMV, however, if you are working on it on your own or with people who understand they need to use the update tool.

    I'm planning on using MySQL and building it like a normal web app, using CGI, HTML::Template and possibly CGI::Application, etc. My suggestion is to use a templating system and make it modular. I'll share my stuff, but at the moment there isn't a whole lot to share (and actually I may have to write it in PHP since my boss thinks it's the best ever .. or something ...gotta love those PHBs!)

    Lobster Aliens Are attacking the world!

      Hi,
      Thanks for the advice, I was originally planning to use HTML::Template and MySQL and build like a normal web app as you describe. You make some good points about the code that generates static pages and I'll probably stay away from that option now.

      Let me know if you do go for Perl with your project and I'll share my ideas and explain the kind of system I end up with.

      Cheers,
      Tom

Re: Creating a Directory Site
by jdtoronto (Prior) on Aug 12, 2003 at 14:40 UTC
    I am not sure that Mason will be much help. Aren't the pages likely to be generated by individuals or groups on campus?

    Essentially you are going to maintain a directory or index. A database would store the various aspects of the category information and so on and be searchable in a variety of ways, but ultimately it gives LINKS to the documents themselves.

    You could add to the directory system a hosting management portion which allows the completed pages to be uploaded, placed in the hosting directories and have the links added to the database.

    Think of the problem as a document management problem. The Google directory system only maintains a set of classifications, descriptions and links - all displayed dynamically - it does not maintain or manage any of the actual documents.

    Good luck! Sounds like a fun project - but I don't think it is quite as daunting as you think.

    ...john

      I am not sure that Mason will be much help. Aren't the pages likely to be generated by individuals or groups on campus?

      It depends on exactly what the OP is really trying to do.

      Imagine a database schema storing links, classifications, descriptions, etc. The actual pages might be elsewhere.

      Then imagine two sets of web pages: one that allows the "individuals or groups on campus" to add/edit the information in the database, and another that searches the database and constructs the pages for the end user.
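
      A rough sketch of such a schema, assuming nothing about the real site: the table names, column names and sample rows below are all made up, and it uses Python's bundled sqlite3 module instead of the MySQL or PostgreSQL servers mentioned in this thread, purely so that it runs without any setup.

```python
import sqlite3

# Hypothetical schema: every table and column name here is illustrative only.
SCHEMA = """
CREATE TABLE category (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES category(id),  -- NULL for top-level categories
    name      TEXT NOT NULL
);
CREATE TABLE link (
    id          INTEGER PRIMARY KEY,
    category_id INTEGER NOT NULL REFERENCES category(id),
    url         TEXT NOT NULL,
    title       TEXT NOT NULL,
    description TEXT
);
"""

def open_directory(path=":memory:"):
    """Open (or create) the directory database and install the schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def add_category(conn, name, parent_id=None):
    cur = conn.execute(
        "INSERT INTO category (name, parent_id) VALUES (?, ?)", (name, parent_id))
    return cur.lastrowid

def add_link(conn, category_id, url, title, description=""):
    cur = conn.execute(
        "INSERT INTO link (category_id, url, title, description) VALUES (?, ?, ?, ?)",
        (category_id, url, title, description))
    return cur.lastrowid

def links_in(conn, category_id):
    """Return (title, url, description) rows for one category, sorted by title."""
    rows = conn.execute(
        "SELECT title, url, description FROM link WHERE category_id = ? ORDER BY title",
        (category_id,))
    return rows.fetchall()

# The "add/edit" pages would call add_category/add_link;
# the end-user pages would call links_in and feed the rows to a template.
conn = open_directory()
computing = add_category(conn, "Computing")
perl = add_category(conn, "Perl", parent_id=computing)
add_link(conn, perl, "https://www.perlmonks.org/", "PerlMonks", "Perl community site")
add_link(conn, perl, "https://www.cpan.org/", "CPAN", "Perl module archive")
for title, url, description in links_in(conn, perl):
    print(f"{title}: {url} ({description})")
```

      The actual pages never live in the database; only the links, descriptions and the category they belong to do.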

Re: Creating a Directory Site
by blue_cowdawg (Monsignor) on Aug 12, 2003 at 15:16 UTC

    To amplify a bit on what Ryszard said: you need to look at what you are trying to do with a project management type of methodology. The steps you need to take are:

    1. Gather requirements
    2. Measure what you have
    3. Formulate a transition plan
    4. Get signoff from the stakeholders
    5. Fine tune the plan based on feedback from step 4
    6. Iterate #4 & #5 as needed
    7. Set a schedule with milestones
    8. Execute the plan. Changes to the plan at this point need to be re-negotiated with a setting of expectations (see #4 and #5) as needed.
    9. DOCUMENT EVERYTHING
    10. Wrap up the project with a "what went well and what went wrong" session

    A consequence of all this should be a set of publication standards for those who are contributing pages to the site so they can be properly indexed.

    Now: implementation wise the kinds of technologies you should be looking at would include some sort of spider that looks at meta tags within the pages to build a database of those tags indexed against titles and authors. The meta tags and how they are used would be part of the standards you establish and those pages that don't conform to the standards don't get indexed. The consequence of not being indexed should be the "management's" call and part of the expectation setting that needs to take place.
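
    As a sketch of the indexing half of such a spider (fetching the pages is left out), the following uses Python's standard html.parser module. The "standard" it enforces, namely that every page must carry author and keywords meta tags, is an assumption made up for this example, as are the sample pages.

```python
from html.parser import HTMLParser

class MetaExtractor(HTMLParser):
    """Collect the <title> text and <meta name=... content=...> pairs of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            self.meta[attrs["name"].lower()] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

# Hypothetical publication standard: pages lacking these tags are not indexed.
REQUIRED = ("author", "keywords")

def index_entry(html):
    """Return an index record for a conforming page, or None if it breaks the standard."""
    parser = MetaExtractor()
    parser.feed(html)
    parser.close()
    if any(name not in parser.meta for name in REQUIRED):
        return None
    keywords = [k.strip() for k in parser.meta["keywords"].split(",") if k.strip()]
    return {"title": parser.title.strip(),
            "author": parser.meta["author"],
            "keywords": keywords}

GOOD_PAGE = """<html><head><title> Chess Club </title>
<meta name="author" content="A. Student">
<meta name="keywords" content="clubs, games, chess">
</head><body>...</body></html>"""

BAD_PAGE = "<html><head><title>Untagged page</title></head><body></body></html>"

print(index_entry(GOOD_PAGE))
print(index_entry(BAD_PAGE))
```

    A daily job would run something like index_entry over every page it fetches and store the records that come back; pages that return None are the ones that fall outside the standard.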

    Generating the index page (or pages) can be done several different ways. Run a job once a day that spiders the site and compiles the indices. That much stays pretty constant. The part that is subject to implementation preference is whether you generate static HTML pages as a result of the spidering OR use HTML::Mason, PHP, or some other dynamic web page technology. That choice is in part a matter of how big a load you expect the web server to have to deal with, how "hefty" the machine is, personal preference and possibly political considerations.

    By political considerations I mean this: I worked in a shop for a while that had "Core Engineering Standards" that in a draconian fashion dictated what technologies were approved for use on company machines, and you were not allowed to even suggest a technology not in the CES document. Something else to consider before making big plans.

    Humph... wish I wasn't under NDA or I'd give you the link of the site I did all this work for... :-)


    Peter @ Berghold . Net

    Sieze the cow! Bite the day!

    Nobody expects the Perl inquisition!

    Test the code? We don't need to test no stinkin' code!
    All code posted here is as is where is unless otherwise stated.

    Brewer of Belgian style Ales

      Go back and read the question again. Google Directory is not the same as Google Search.

      And if the question was "How can I rewrite Google?" the answer would be "just buy it."

      Thanks for all that, very useful to see the stages planned out like that. I will take it all into consideration.

      Cheers,
      Tom

Re: Creating a Directory Site
by jmanning2k (Pilgrim) on Aug 12, 2003 at 16:17 UTC
    Do you have to create it from scratch, or can you use an existing one?

    directory.google.com is based on dmoz.org - the Open Directory Project. Unfortunately, the source to their engine is not available (but all the content is).

    A quick search for dmoz on freshmeat.net showed several compatible projects, including one in perl.

    I've never used any of them, so I can't vouch for them, but at least it's a pointer in the right direction.

    You can also check the ODP Forums for ideas

Re: Creating a Directory Site
by dash2 (Hermit) on Aug 12, 2003 at 15:17 UTC
    Is writing your own code the way to go for this? Link directories are pretty common, and not that difficult. You will almost certainly find an open source version (PHP, Perl, or whatever) somewhere that will do the job fine for you. Search around the usual places: sourceforge, CPAN, maybe even the commercial script directories like hotscripts and CGI resources.

    This is assuming that you don't actively want to code this, in order to further your skills. If so, then go ahead.

    Dave
    A massive flamewar beneath your chosen depth has not been shown here

      One of the commercial ones with which I have had good experiences so far is hyperseek. It does a good job, it's in Perl, it's templatable, and it goes for $499, which isn't that much at all. Might be interesting.

Re: Creating a Directory Site
by CountZero (Bishop) on Aug 12, 2003 at 19:21 UTC

    I'd say, that this is an ideal project for an XML-centric approach.

    You obtain the data you need from your database and "export" the data in XML format, which you then transform into HTML through XSLT (which can be done server-side or client-side; most browsers now support XSLT).

    No need to use a templating system as XSLT will take care of that.

    The downside is of course that you need to learn XML and XSLT!

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: Creating a Directory Site
by wufnik (Friar) on Aug 13, 2003 at 08:33 UTC
    hola: just briefly:

    the debate as to how to manage the site technically is interesting enough, but i am not sure the question as to how to categorize your html/documents has been debated.

    in other words, you have nodes that are to be added automatically to your hierarchy, how do you go about this classification?

    imho, one of the better ways of doing this is to use Ken Williams' AI::Categorizer module. This gives you a number of ways of classifying your docs, and what is more is relatively fun to use, the only drawback being a largish number of dependent perl modules, which install pretty easily.

    best of luck,

    wufnik

    -- in the world of the mules there are no rules --

Re: Creating a Directory Site
by aquarium (Curate) on Aug 13, 2003 at 05:28 UTC
    You just need to represent a tree structure (surely there are some modules for this) and store it in a DB. Once you have that, the bottom nodes are URLs, whilst branches are links to either further branches or end nodes. Failing that, find yourself a hierarchical DB with a Perl DBI module.
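
    One common way to hold such a tree is an adjacency list, where every node records only its parent. The sketch below is in Python with the rows kept in a plain dict standing in for a database table; all ids, names and URLs are invented for illustration.

```python
# Adjacency-list tree: each node stores only its parent's id.
# Nodes with a "url" are the bottom nodes; the rest are branches.
NODES = {
    1: {"parent": None, "name": "Top"},
    2: {"parent": 1, "name": "Societies"},
    3: {"parent": 1, "name": "Study"},
    4: {"parent": 2, "name": "Sports"},
    5: {"parent": 4, "name": "Rowing Club", "url": "http://example.org/rowing"},
    6: {"parent": 3, "name": "Library", "url": "http://example.org/library"},
}

def children(node_id):
    """Ids of the direct children of a node, sorted by name."""
    found = [n for n, node in NODES.items() if node["parent"] == node_id]
    return sorted(found, key=lambda n: NODES[n]["name"])

def breadcrumb(node_id):
    """Walk up the parent pointers to build a 'Top > ... > node' trail."""
    names = []
    while node_id is not None:
        names.append(NODES[node_id]["name"])
        node_id = NODES[node_id]["parent"]
    return " > ".join(reversed(names))

def render(node_id, depth=0):
    """Return the subtree below node_id as indented text lines."""
    node = NODES[node_id]
    label = f'{node["name"]} -> {node["url"]}' if "url" in node else node["name"]
    lines = ["  " * depth + label]
    for child in children(node_id):
        lines.extend(render(child, depth + 1))
    return lines

print(breadcrumb(5))
print("\n".join(render(1)))
```

    In a real database the children lookup would be a query on the parent column, and each directory page would show one node's breadcrumb plus its children.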

Re: Creating a Directory Site
by mattr (Curate) on Aug 13, 2003 at 11:10 UTC
    I remember a thing I played with a little way back.. perlhoo (perl based yahoo-style directory). Though I don't specifically recommend using it.

    Actually I've made a couple of these, a relatively simple one which I added to a mod_perl app and one now which is a bit bigger but unfortunately am forced to do in PHP with a preset structure - acckkk! as bill the cat would say. Also PHP is evil. Thank you.

    If you use mod_perl you are probably doing it for speed. But this may be overkill and you will need to use strict and get used to watching the apache error_log. Getting into Apache is useful though, for example you could use the mod_rewrite apache module to perhaps make it look like the pages are indeed static.

    Perhaps it would be useful to first get realistic page mockups done of each representative "screen" in html or as a bitmap image so you can get the full feature list through discussions with your client. Then as far as technology goes why not try and abstract out the storage mechanism and focus on the logical (algorithmic) design. It could be useful also to make a "web service" (mod_perl might be good..) which answers queries about the directory because then the directory could be more useable for other things in the future (i.e. maybe you would publish new items in an rdf feed). Of course you may not want to get into this yet but you can plan for it.

    Sure, MySQL is probably a great idea, but I wouldn't spend too much time worrying about the storage method now. Instead you need to know the whole feature set and also work out the logical design you will use for representing your data. I would recommend making every item an object that, for example, knows its url, description, creator, parent, category, etc. Recently I had good luck with MySQL and Class::DBI in another project; you might enjoy that, and it makes it relatively easy to use an OO interface to your database.

Re: Creating a Directory Site
by tjh (Curate) on Aug 13, 2003 at 13:48 UTC
    My take on what you're after is link management, hopefully in some administratable data store, with a directory-like, dynamic display (if it's easy and has a management interface of some kind).

    Unless you feel you just have to recreate it, I thought some of these links might help. AFAIK, each of these is written in Perl. Also, there'll be enough coding to go around in implementing anything like this into a web site, so maybe it's a good thing to use something like these since they're complete (more or less)...

    • Links, shareware, free to non-commercial use, Here.

    • Many others are listed here. However, look closely and make careful choices.

    • Some of these have free versions that often have limited functionality (which may be all you need), and provide paid upgrades. Here's an example. (I don't know this product, just an example)

    • Open source things like this at sourceforge (do your own testing.)
    Just because something shows a purchase price doesn't mean that's what you'll pay. Plus, they may have free versions, or the product may be free for not-for-profit use. Don't be shy; ask.
