Automatically creating data validation module from XSD
by brian_d_foy (Abbot) on Aug 09, 2007 at 07:10 UTC
brian_d_foy has asked for the wisdom of the Perl Monks concerning the following question:
I know someone out there has already done this and is just hiding it somewhere in CPAN or on their local disk. I'm willing to do it myself if I must, but that will take a long time. Java, C#, and VB apparently already have this. I want it in Perl. I'm willing to put up a Stonehenge Rock Star grant for someone who can deliver. Heck, maybe I should make this an X-Perl Prize :) (And for Randal, doesn't this sound like a really, really cool column idea? Can you whip this up in McMenamin's tomorrow night? :)
I'm doing a lot of geocoding stuff right now, and taking data from various places so it eventually winds up in a GPX file. I'm not starting with XML, but I'm ending up there. Before I get to the XML, I want to validate the values according to the XSD (in the case of GPX, that's http://www.topografix.com/GPX/1/1/gpx.xsd) before I put them into the Perl data structure, but I don't want to work too hard doing it.
So, in the stuff I'm working on for Geo::Gpx, I want to have a bit that validates values, but without pulling in a boatload of XML modules to parse the XSD every time. I'd really like to have a code generation tool that takes the XSD and outputs a module with the right methods ready-to-go. That way, the mere user of Geo::Gpx isn't stuck in dependency hell for something that doesn't need to be dynamic, that I can generate ahead of time, and that isn't directly related to the task of creating the GPX format. I would generate the module as the developer and simply distribute the result.
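To make the idea concrete, here is a minimal sketch of the generation step only. It is not an existing tool: it does not parse XSD at all. The `%types` hash, the `generate_module` function, and the `is_<type>` naming are all hypothetical stand-ins for whatever a real tool would extract from the schema, and the numeric bounds are assumed from the GPX 1.1 schema.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, hand-written description of two simple types. A real tool
# would extract these facets from the XSD; here they are hard-coded
# (bounds assumed from the GPX 1.1 schema).
my %types = (
    latitudeType  => { min => -90,  max => 90,  max_exclusive => 0 },
    longitudeType => { min => -180, max => 180, max_exclusive => 1 },
);

# Template for one generated validator. The single-quoted heredoc keeps
# the generated Perl readable; sprintf fills in the %s placeholders.
my $template = <<'PERL';
sub is_%s {
    my( $value ) = @_;
    return 0 unless defined $value;
    return 0 unless $value =~ /\A[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)\z/;
    return( $value >= %s && $value %s %s ? 1 : 0 );
}

PERL

# Return the source of a module with one is_<type> sub per described type.
# The generated code needs nothing beyond core Perl.
sub generate_module {
    my( $package, $types ) = @_;

    my $source = "package $package;\nuse strict;\nuse warnings;\n\n";

    foreach my $name ( sort keys %$types ) {
        my $facets = $types->{$name};
        my $upper  = $facets->{max_exclusive} ? '<' : '<=';
        $source .= sprintf $template,
            $name, $facets->{min}, $upper, $facets->{max};
    }

    return $source . "1;\n";
}

print generate_module( 'Foo', \%types );
```

Redirecting the output to Foo.pm would give a file that loads without any XML modules. The hard part, turning an arbitrary XSD into something like `%types`, is exactly the piece this sketch leaves out.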
The toolchain starts with (and I'd be satisfied with):
Foo.pm should be completely self-contained and without dependencies, and contain all the methods I need to validate the data that will end up as the values in the XML. Once I have that tool, it's easy to automatically generate a separate distro if I wanted:
Of course, I can do this by hand. GPX isn't that big and isn't that hard. Indeed, I've done it by hand already. My code doesn't really care, because all that stuff hides behind an interface. Maybe I'll have to rename some functions, but that's not hard.
I looked at Sam Tregar's XML::Validator::Schema. It's a bit old and has a lot of fail reports, but down in the guts somewhere I think it has most of the pieces. It knows about the basic data types, so those methods are there, and it has a way to derive types. There might be some useful stuff in SOAP::WSDL. The trick is dumping just the parts I need into a new module, including the derived types special to the XSD. I didn't find anything else though.
However, after I get done with the GPX stuff, I have other formats I have to generate, and those get trickier. I don't want to keep doing this by hand.
So, pretty please with sugar on top, tell me someone has already done this. :)
Update: Moron, you missed the entire point about dependencies. I don't want to have to create tens of thousands of XML files just so I can use an XML parser to see if a value is a number and within range. The point is that there isn't a generic module to do this in Perl. I'm not looking for a hack, and I'm not having trouble. I'm looking for the work that somebody has already done before I do it myself for the general case.
Update: here's a sample. In the linked XSD, there's a user-defined type called longitudeType:
By hand, I turned that into a method that returns true if the scalar I pass to it fits that description:
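A minimal sketch of such a hand-written check, assuming the GPX 1.1 schema defines longitudeType as an xsd:decimal restricted to at least -180.0 and strictly less than 180.0. The name `is_longitude` is hypothetical and not necessarily what the original code used.

```perl
use strict;
use warnings;

# Sketch of a hand-written check for GPX 1.1 longitudeType:
# an xsd:decimal with -180.0 <= value < 180.0 (assumed facets).
sub is_longitude {
    my( $value ) = @_;

    return 0 unless defined $value;

    # xsd:decimal lexical form: optional sign, digits, optional fraction,
    # no exponent notation. This sketch also rejects surrounding whitespace.
    return 0 unless $value =~ /\A[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)\z/;

    return( $value >= -180 && $value < 180 ? 1 : 0 );
}

# Try a few candidate values, including both boundaries.
print "$_: ", is_longitude( $_ ) ? 'valid' : 'invalid', "\n"
    for qw( -180 0 179.999 180 1e2 abc );
```

The lexical check comes first so that strings such as `1e2` or `abc`, which Perl would happily treat as numbers (with or without a warning), never reach the range comparison.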
brian d foy <firstname.lastname@example.org>