PerlMonks  

XML Resume Module design

by rattusillegitimus (Friar)
on Jul 26, 2002 at 04:36 UTC (#185424)
rattusillegitimus has asked for the wisdom of the Perl Monks concerning the following question:

Due to recent job search needs in my family, I've been working on a Perl module that will take a resume in XML format and translate it to a variety of formats. I did some looking around the Internet, but none of the wheels I found were quite the right shape or size for my needs, and I thought it'd be a fun learning project. Right now I have a single object with the following output functions:

  • output_raw - simply uses the toString method to return the raw XML
  • output_html - uses XSLT to transform the XML resume to (X)HTML
  • output_text - uses XSLT to transform the XML resume to reasonably laid out text, then uses Text::Wrap to wrap long lines to 80 columns
  • output_pdf - the most ambitious, uses XSLT to transform the XML resume into a series of sections and lines that are then fed to PDFLib to generate a PDF file.
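
The wrapping step in output_text could be sketched roughly as below. This is only an illustration: the helper name wrap_resume_text and the sample text are hypothetical, and the XSLT stage is assumed to have already produced plain text. One detail worth knowing is that Text::Wrap keeps lines to at most $Text::Wrap::columns - 1 characters, so a setting of 80 fits an 80-column display.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

# Hypothetical helper: wrap already-transformed plain text for output_text.
# Text::Wrap emits lines of at most ($columns - 1) characters.
sub wrap_resume_text {
    my ($text, $columns) = @_;
    local $Text::Wrap::columns = $columns || 80;
    # Empty strings mean: no indent on the first line or on later lines.
    return wrap('', '', $text);
}

# Made-up sample text standing in for the XSLT output.
my $sample = join ' ', ('Designed and maintained Perl modules') x 6;
print wrap_resume_text($sample), "\n";
```

Localising $Text::Wrap::columns keeps the setting from leaking into any other code that also uses Text::Wrap.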

My question is this: should I keep this as a single object, or sub-class each output style into a child object? I.e., something like this:

  • Resume - contains the basic functions for importing the XML resume and has an output function that returns the raw XML
  • Resume::HTML - child of Resume that overrides the output function with one that transforms the XML to HTML and returns that
  • Resume::Text - child of Resume that overrides the output function with one that transforms the XML to ASCII text
  • Resume::PDF - contains all of the PDF functions and overrides the output function with one that generates the PDF file

I'm kind of leaning toward this new setup, which I believe leaves me open for future enhancements, but I thought I'd see what the experts have to say. I'd also love to hear any additional features or output formats you think I should add. Once I get more of the code put together, I'll be happy to post it for review.

-rattus

__________
He seemed like such a nice guy to his neighbors / Kept to himself and never bothered them with favors
- Jefferson Airplane, "Assassin"

(jeffa) Re: XML Resume Module design
by jeffa (Chancellor) on Jul 26, 2002 at 04:47 UTC
    Polymorphism is your friend - definitely subclass! This way you can define an output method and just call it on whichever object you instantiated. You could even set up a factory object that instantiates the object for you. Here is a very simple example:
    package Factory;

    sub create {
        my $class = shift;
        my $obj   = 'Base::' . shift;
        return $obj->new();
    }

    package Base;

    sub output {
        my $self = shift;
        return $self->{thing};
    }

    package Base::Foo;
    use base 'Base';

    sub new {
        my $class = shift;
        my $self  = { thing => 'foo' };
        return bless $self, $class;
    }

    package Base::Bar;
    use base 'Base';

    sub new {
        my $class = shift;
        my $self  = { thing => 'bar' };
        return bless $self, $class;
    }

    package main;

    my @things = (
        Factory->create('Foo'),
        Factory->create('Bar'),
    );

    print $_->output, $/ for @things;
    Just replace Base with Resume and Foo and Bar with XML, HTML, PDF or whatever. Check out http://www.patternsinperl.com/designpatterns/factorymethod for a good read also.

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    
      Polymorphism is your friend - definitely subclass! This way you can define an output method and just call it on which ever object you instantiated.

      Why? That way suggests that an ‘HTML resumé’ is an inherently different kind of thing to a ‘PDF resumé’, that they are distinct classes of object in the same way that camels and llamas are distinct. (Well, OK, not in quite the same way.)

      I reckon the opposite is true: there isn't such a thing as an ‘HTML resumé’ or a ‘PDF resumé’ in terms of what it models. You can have a resumé which contains data representing your personal details and work history. You can do several things with that data, including emitting it in particular output formats.

      But it seems daft to have an object which in effect means “the data in this instance represents my life achievements but it can only be displayed as a PDF”, forcing you to copy the data to another instance just because you want it to be emitted differently.
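
A single-object design along those lines might look something like the sketch below. The class name Resume, the output method and the output_* helpers follow the names used in the question, but the bodies are stand-ins made up for illustration (the real module would run its XSLT transforms there), and the name field is hypothetical.

```perl
use strict;
use warnings;

package Resume;

# One object holds the data; the output format is chosen per call.
sub new {
    my ($class, %data) = @_;
    my $self = {%data};
    return bless $self, $class;
}

# Dispatch to output_<format> if such a method exists on this object.
sub output {
    my ($self, $format) = @_;
    my $method = $self->can("output_$format")
        or die "Unknown output format '$format'\n";
    return $self->$method();
}

# Placeholder formatters, not the real transforms.
sub output_raw  { my $self = shift; return "<name>$self->{name}</name>" }
sub output_html { my $self = shift; return "<h1>$self->{name}</h1>" }
sub output_text { my $self = shift; return uc $self->{name} }

package main;

my $resume = Resume->new(name => 'A. Monk');
print $resume->output($_), "\n" for qw(raw html text);
```

The same instance serves every format, so nothing has to be copied when a different output is wanted.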

      Smylers

        Ooops. Didn't notice usernames were case-sensitive, so failed to log in before posting the above. Apologies.

        Smylers

        "But it seems daft to [force] you to copy the data to another instance"

        I should have placed $self inside Base instead, and allowed the subclasses to format it in some way - but I was trying to keep it simple. The idea is that the base class contains the data in a Perl data structure and the subclasses are responsible for filtering that data structure into a particular format. Let me try again:
        package Factory;

        sub create {
            my $class = shift;
            my $obj   = 'Base::' . shift;
            return $obj->new();
        }

        package Base;

        sub new {
            my $class = shift;
            my $self  = { thing => 'foo' };
            return bless $self, $class;
        }

        package Base::Bold;
        use base 'Base';

        sub output {
            my $self = shift;
            return '<b>' . $self->{thing} . '</b>';
        }

        package Base::Italic;
        use base 'Base';

        sub output {
            my $self = shift;
            return '<i>' . $self->{thing} . '</i>';
        }

        package main;

        my @things = (
            Factory->create('Bold'),
            Factory->create('Italic'),
        );

        print $_->output, $/ for @things;

        And an HTML resume is different than a PDF resume: HTML and PDF are adjectives to the noun resume - an adjective qualifies, distinguishes, and specifies - hence, an HTML resume is one kind of resume (Resume::HTML), and a PDF resume is another kind (Resume::PDF). If you don't like this, then how about naming them Resume::AsHTML and Resume::AsPDF instead? ;)

        jeffa

        
        ...you know, I was with jeffa on this until I read your post, Smylers, and now I think I partially agree with you.

        Purists will argue that there are good reasons to subclass and there are lame reasons. I think this question falls into the grey area where I believe that it's "six of one, half-dozen of the other."

        From the standpoint of extensibility and maintenance, there's little difference in my mind between supplying a new method in one class, or creating a new subclass. The effect of polymorphism makes it simple to select the method name if you use subclasses. But I don't think that is a compelling reason for the approach.

        I did find it instructive to consider the implications on memory use (and processing time) required to convert an XML resumé into an HTML resumé when each is implemented as a different type of object. If you simply supply different output methods, there's only one object ever put into memory. If you have different object types, then the conversion does require a copy operation. (I don't know if it's a big deal at this scale, but it is a consideration.)

        I don't think it matters in this case overall, which approach you take. *BUT* I don't think it's "daft" (or daft enough) to go with the subclass approach. And if one of your objectives is to study and learn more about object design, then it's a good thing to do.

        ---v

Re: XML Resume Module design
by DamnDirtyApe (Curate) on Jul 26, 2002 at 07:08 UTC

    Just to kick in my $0.02, I'm currently working on something quite similar, i.e. an XML resume which can be used to generate various formats. I'm using XML::Simple to parse my XML, followed by Template Toolkit to render my resume in plain text, HTML, and LaTeX (for PDF and PS creation). So far, I'm quite pleased with the results; if you come across issues of abstracting your document layouts, I highly recommend checking out Template Toolkit.


    _______________
    D a m n D i r t y A p e
Re: XML Resume Module design
by bronto (Priest) on Jul 26, 2002 at 08:53 UTC

    I have the same project, and I was planning with gmax a complete rewrite of it.

    The current version is simply an XML format, named XCV, and several XPathScript stylesheets that do the translation from XCV to HTML (two versions) and LaTeX. In turn, from LaTeX I get DVI, PostScript and PDF versions. All of this takes place in an AxKit-enabled Apache server.

    With gmax we were planning a completely new format, a database backend and language support.

    Should we join forces?

    Ciao!
    --bronto

    # Another Perl edition of a song:
    # The End, by The Beatles
    END {
      $you->take($love) eq $you->made($love) ;
    }

Another output method -- XML
by cebrown (Pilgrim) on Jul 26, 2002 at 12:03 UTC
    This is not really apropos your question, but what the heck... you may also want to support XML-to-XML conversion via XSLT.

    In a previous role I helped develop the HR-XML Staffing Exchange Protocol, which, in part, provides a standard schema for resumes. I'm not sure what kind of support the protocol has in real life, but based on the HR-XML membership (Monster, HotJobs, DICE, as well as PeopleSoft, SAP, Oracle) I assume there will be some employers that will accept an uploaded resume in this schema.

    Now, convincing your HR person/interviewer that you really don't need to send a resume as a .pdf or .htm (let alone .doc)... good luck with that!

Re: XML Resume Module design
by tomhukins (Curate) on Jul 26, 2002 at 13:30 UTC
    This doesn't answer your question, but are you aware of the XML resumé library? Studying this library might help you understand the problems you're facing.
Re: XML Resume Module design
by hsmyers (Canon) on Jul 26, 2002 at 20:05 UTC

    I'd advise adding an 'output_rtf' to appease all of those word happy (not) '.doc' folks...

    --hsm

    "Never try to teach a pig to sing...it wastes your time and it annoys the pig."
Re: XML Resume Module design
by rattusillegitimus (Friar) on Jul 27, 2002 at 04:31 UTC

    Thanks for the great responses. Y'all have definitely given me plenty upon which to meditate. Here are some of my thoughts on what I've read thus far:

    On Sub-Classing...
    This generated the nice debate I was looking for. ;) I had considered the issue of having to create separate objects for each of the different output methods, which was the main reason I considered proceeding with the single-class system. But on the other hand, I would like to make it relatively easy to pick and choose the functionality and implementation you want to use on a given system. For example, if you don't have and/or don't want to install the PDFLib module and its baggage, you simply don't have to create any Resume::PDF objects. And if someone wanted to write a different PDF output system, it feels cleaner to me to just replace the existing Resume/PDF.pm module than to have to edit the Resume.pm module itself. I do like and intend to use the Factory object idea as well.
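
One way to get that pick-and-choose behaviour is for the factory to load each output class only when it is asked for. The sketch below is an assumption about how that could be arranged, not code from the actual module: the Resume::Factory package and the Resume::<Format> naming scheme are hypothetical, and Resume::Text is defined inline with a placeholder body only so the example runs on its own.

```perl
use strict;
use warnings;

package Resume::Text;    # stand-in subclass so the example is self-contained

sub new    { my ($class, %args) = @_; return bless {%args}, $class }
sub output { my $self = shift; return "Resume of $self->{name}" }

package Resume::Factory;

# Load the requested output class on demand, so a backend that is not
# installed (the PDF one, say) only fails if somebody actually asks for it.
sub create {
    my ($class, $format, %args) = @_;
    die "Bad format name '$format'\n" unless $format =~ /\A\w+\z/;
    my $target = "Resume::$format";
    unless ($target->can('new')) {
        # Turn Resume::Foo into Resume/Foo.pm for require.
        (my $file = "$target.pm") =~ s{::}{/}g;
        eval { require $file; 1 }
            or die "Output format '$format' is not available: $@";
    }
    return $target->new(%args);
}

package main;

print Resume::Factory->create('Text', name => 'A. Monk')->output, "\n";

# A missing backend surfaces as an exception the caller can trap.
my $pdf = eval { Resume::Factory->create('PDF', name => 'A. Monk') };
print "PDF output unavailable\n" unless $pdf;
```

With this arrangement, swapping in a different PDF implementation would mean replacing Resume/PDF.pm only; the factory and the other classes stay untouched.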
    Template Toolkit
    I've checked out many of the templating modules out there and liked what I've seen. So far, for the XML, HTML, and text outputs, they all seem overkill when I've already got my favorite new hammer on the workbench generating the outputs, but the jury's still out on the best way to handle PDF. I figured I'd get the PDFLib version completed, then see what I can improve if I change modules.
    Joining forces...
    I'm certainly not averse to joining up with a similar (or identical) project. ;) I'm planning to get the code cleaned up a bit more, incorporate the ideas I've gotten thus far, and post it for review Real Soon Now. When I do, let me know if it looks like you'd like my assistance.
    Other Outputs
    How could I have forgotten RTF? It's an obvious choice, especially considering how many times I've provided an MS Word version of my own in the past. There seem to be several RTF modules available for the task. I've also considered (for far off in the future when I get time to bang my head on it) learning Win32::OLE to try and directly generate Word documents.
    I also like the idea of XML to XML transforms using XSLT. That will be simple to add; on the XML output, I can allow an optional XSL file to be passed in for transformations.
    Related Projects (XML resumé library, etc)
    I should have mentioned XML resumé library explicitly as one of the other solutions I've looked at. It is a fine resource, and actually one of the projects that inspired me to try this. It was quite a bit more than I need at this point, and I'm trying to avoid Java for now, but it has given me several ideas. While I'm at it, I should also mention Andrew Ho Résumé Formats. I considered using it for my own purposes, but I prefer to use XSLT for my transforms rather than a dispatch table.

    Thanks again for all the comments and for giving me plenty of food for thought.

    -rattus

