Re: SGML FAQ to HTML (or XML? or SQL?)

by mdillon (Priest)
on Aug 17, 2002 at 18:31 UTC ( #190900=note )


in reply to SGML FAQ to HTML (or XML? or SQL?)

If you are going to use SGML (or XML as Zaxo suggests), you need to make sure that your document is well-formed. The version currently in CVS has a bunch of problems that can be found either with "nsgmls -s faq.sgml" from the Jade package, or by changing to an XML DTD and using xmllint from libxml2.
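
As a rough illustration of the kind of check involved, here is a minimal sketch in Python using only the standard library. The function name `first_wellformedness_error` and the sample strings are made up for this example. It only tests XML well-formedness (mismatched or missing close tags and the like); it does not validate against the DocBook DTD, so it is no substitute for running nsgmls or xmllint on the real file.

```python
import xml.etree.ElementTree as ET


def first_wellformedness_error(xml_text):
    """Return None if xml_text parses as XML, else (line, column, message).

    Hypothetical helper for illustration: it reports only the first
    well-formedness error and performs no DTD validation.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None


# Illustrative fragments, not taken from the actual FAQ file.
good = "<sect2><para>text</para></sect2>"
bad = "<sect2><para>text<para></sect2>"  # "<para>" where "</para>" belongs

print(first_wellformedness_error(good))
print(first_wellformedness_error(bad))
```

A check like this catches structural slips early, but a document can be well-formed and still invalid against its DTD, so the DTD-aware tools remain the real test.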

Since you aren't actually using any SGML syntax that is not also part of XML, I would recommend switching to XML, because the tool landscape is currently much richer for XML than for SGML. If you go with XSLT, I would probably combine it with mod_xslt, which transforms XML into whatever output format you need on the fly and also has caching capabilities. For generating HTML output, the latest version of the DocBook XSLT stylesheets should be fine.

If you want to do it in Perl, have a look at XML::LibXML and XML::LibXSLT in conjunction with libxml2. You might want to come up with your own caching mechanism, in that case.
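
For the caching part, one simple approach is to regenerate the output only when the source file is newer than the cached copy. Below is a minimal sketch of that idea, written in Python with the standard library so it runs without extra modules; the same logic carries over to Perl. The names `cached_transform` and `to_html_stub`, and the file names, are assumptions for illustration. The stub stands in for the real XSLT step, which is not shown here.

```python
import html
import os
import tempfile


def cached_transform(source_path, cache_path, transform):
    """Return the transformed text, regenerating the cache only when stale.

    The cache counts as stale if it is missing or older than the source
    (compared by modification time). `transform` is any callable that
    maps the source text to the output text.
    """
    stale = (not os.path.exists(cache_path)
             or os.path.getmtime(cache_path) < os.path.getmtime(source_path))
    if stale:
        with open(source_path, encoding="utf-8") as src:
            output = transform(src.read())
        with open(cache_path, "w", encoding="utf-8") as dst:
            dst.write(output)
    with open(cache_path, encoding="utf-8") as cached:
        return cached.read()


def to_html_stub(text):
    # Hypothetical stand-in for the real XML-to-HTML transformation.
    return "<pre>" + html.escape(text) + "</pre>"


with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "faq.xml")
    cache = os.path.join(workdir, "faq.html")
    with open(source, "w", encoding="utf-8") as handle:
        handle.write("<article/>")
    print(cached_transform(source, cache, to_html_stub))
```

Comparing modification times is cheap but coarse; if the stylesheet can also change, its timestamp would need to be part of the staleness check too.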


Re: SGML FAQ to HTML (or XML? or SQL?)
by hacker (Priest) on Aug 17, 2002 at 19:07 UTC
    The version currently in CVS has a bunch of problems that can be found either with "nsgmls -s faq.sgml" from the Jade package, or by changing to an XML DTD and using xmllint from libxml2.

    The version in CVS that I linked to in the original post definitely validates against 'nsgmls -s FAQ.sgml' on my machines here. Perhaps your version of nsgmls is older? I'm using the latest from the 'sp' Debian package, version 1.3.4-1.2.1-28. I made sure it validated before I committed the first revision to CVS.

    I'll look into your other suggestions, thanks for the reply.

      I hate to play the "no it didn't" game, but, no, it didn't. There were something like 10 errors, which I verified by actually looking at the file. Things like missing close tags (not optional ones, but on stuff like <sect2>), a "<para>" where there was supposed to be a "</para>"... I got largely the same errors with OpenJade's nsgmls and xmllint (after changing the DTD to DocBook XML 4.1.2). Regardless, if you know you *should* validate the document, that is the most important thing, since using structured markup would be a waste otherwise.

      Update: the file was changed after I posted my responses so that the HEAD revision no longer has the errors I mentioned (ok, I guess it was 5 errors, not 10, but same difference). I really think it is obnoxious of hacker to do this and claim I'm the one who is wrong... Did he not just update the CVS after I pointed out the errors, as verified by the RCS tag?

      --- faq.sgml Sat Aug 17 12:49:47 2002
      +++ faq.new.sgml Sat Aug 17 16:50:55 2002
      @@ -1,6 +1,6 @@
       <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
      -<!-- Last modified: $Date: 2002/06/29 00:12:02 $: by $Author: desrod $ -->
      +<!-- Last modified: $Date: 2002/08/17 23:04:00 $: by $Author: desrod $ -->
       <!-- To convert this with DocBook, use the following syntax:
      @@ -93,6 +93,7 @@
       where AvantGo would be far more appropriate than Plucker (online browsing, for example), however, you need an SDK to work with AvantGo.
       </para>
      +</sect2>
       <!-- ################################### -->
       <!-- Section 1.2: Different from AvantGo -->
      @@ -245,6 +246,7 @@
       cvs -z9 -d :pserver:anonymous@cvs.plkr.org:/cvs/plucker co plucker
       </para>
       </blockquote>
      + </sect2>
       <!-- ############################# -->
       <!-- Section 1.6 Status of Plucker -->
      @@ -259,7 +261,7 @@
       different from that installed on ours. You might lose all the data on your PalmPilot, you might need to reboot your machine, it might never be the same again.
      - <para>
      + </para>
       <para>
       You have been warned.
      @@ -302,8 +304,6 @@
       <!-- #################################### -->
      -
      -
       <!-- ####################### -->
       <!-- Section 2: Installation -->
       <!-- ####################### -->
      @@ -329,8 +329,6 @@
       <!-- #################################### -->
      -
      -
       <!-- ################ -->
       <!-- Section 3: Usage -->
       <!-- ################ -->
      @@ -415,7 +413,6 @@
       <title>I have all the necessary tools, but I still can't see any pictures?</title>
       <para>
      -
       Make sure that you don't set a too high value for MAXWIDTH. Anything above 150 pixels (default value) for embedded images will give the result that no images are shown.
      @@ -446,7 +443,6 @@
       <title>The protocol is missing when the parser attempts to download a page?</title>
       <para>
      -
       Even if you don't tell us what OS you are running we would be very surprised if it's not Red Hat 7.0, 7.1, or 7.2.
       </para>
      @@ -521,8 +517,6 @@
       <!-- ############################# -->
      -
      -
       <!-- ######################## -->
       <!-- Section 4: Configuration -->
       <!-- ######################## -->
      @@ -551,8 +545,6 @@
       <!-- ############################# -->
      -
      -
       <!-- ###################### -->
       <!-- Section 5: Development -->
       <!-- ###################### -->
      @@ -668,12 +660,10 @@
       </sect2>
       </sect1>
      - <!-- ########################### -->
      - <!-- Section 5: Development: END -->
      - <!-- ########################### -->
      -
      -
      -
      +<!-- ########################### -->
      +<!-- Section 5: Development: END -->
      +<!-- ########################### -->
      +
       <!-- ################################## -->
       <!-- Section 6: Miscellaneous Questions -->
      @@ -808,6 +798,8 @@
       Frames (it might be possible to implement this in a links kind of way by rendering and converting the frames to tables, much like the links browser).
      + </para>
      + </listitem>
       <listitem>
       <para>

        Yes I did, and like I said in public and in private /msg, the tool you suggested I use (nsgmls -s FAQ.sgml) reported zero errors on every machine I had at my disposal to test it on, Linux and BSD flavors. If every machine I run the validation on tells me there are zero errors, why would I be skeptical and believe otherwise?

        So, taking your advice, I used xmllint, a completely different tool, which exposed the errors you mentioned, and I corrected them. The FAQ is in CVS, and now you're going to lead people to believe I've done something wrong by updating it? I'm supposed to keep the file intact, so you can try vainly to prove yourself "right"? Wrong.

        The file had errors, exposed by xmllint and ignored by nsgmls, which I corrected so the thousands of users of the project could benefit from a properly validated file. I've also thanked you for pointing them out and for mentioning that second validation tool. Now get off my back.

        You've already pushed this in the CB, and now in here. I don't care what your system reports. If a handful of stock Linux machines report no errors, and a completely different set of BSD boxes also report no errors using nsgmls, I maintain that the validator I was using was either right or broken on every machine I tested it on (12 machines total now).

        Downvote me into oblivion if you wish. Just because it fails validation on your machine doesn't mean my machines are "wrong" because it doesn't fail here. Take your attitude back to #perl on Efnet, where it belongs, that is, if your ego isn't so large it can't fit out the door.
