PerlMonks  

Re: Re (tilly) 2: Why is Perl so bad with XML?

by Anonymous Monk
on Jan 31, 2002 at 17:53 UTC (#142467)


in reply to Re (tilly) 2: Why is Perl so bad with XML?
in thread Why is Perl so bad with XML?

The problem I have with your attitude ("Just because everyone and his dog is using it has no weight with me") is that this continues to push Perl into history. It shows the world that hard core Perlers care not for this newfangled technology; they'd rather not think about it or learn about it. And so people go elsewhere. I'm talking about marketing of course, and most hackers don't give a hoot about marketing, but it's an issue Perl faces (why do you think the Perl 6 project exists?).

The world changes, and we have to handle new data formats. You're lucky if you're in an all Perl shop and don't deal with outside customers or other people in your department handing you data, but most of us don't live in that world. We do live in a world of strange tabular data formats, and Perl handles those just fine, but one day your customer will have to send their data to an MS shop, and they'll ask for XML, and then they'll ask you if they can send you their XML, because that's easier for them. It's just how the world works - we can't stop it turning even if we want to.

I've never suggested that everyone should use XML for everything just because. It's a tool, and a damn useful one - being able to describe data structures in a language-independent way is a powerful paradigm. But I'm not here to persuade you to use a technology. Simply to convince you that you can use it if you want to or need to, and that perl is damn good at the job.

Re: Re: Re (tilly) 2: Why is Perl so bad with XML?
by mirod (Canon) on Jan 31, 2002 at 19:08 UTC

    Hey, hey, calm down everybody!

    I do agree that the attitudes of the XML community and of the Perl community are really different, to the point of being antagonistic, as proven by the 2 posts above. The Perl way is "whatever works is fine": convenience tends to be valued above completeness and formalism. The XML way is nearly the opposite: one format fits all, forcing "the right way" onto unsuspecting coders (Unicode is a perfect example, it annoys 99% of people to no end while just slightly improving the condition of the last 1%), and elaborate all-encompassing constructs (W3C Schemas).

    Plus of course XML is verbose, and monsters like XSLT and W3C schemas are way worse, while Perl favors economy of strokes and conciseness as a way to get elegance and maintainability. At least you could do SGML golf!

    I might be wrong, of course no one has any figure about module usage, but I believe the people who write Perl modules "the XML way", like Matt, Robin Berjon, Ilya Sterin etc... are in a way off-target. In a word they look a bit like XML nationals lost in a foreign Perl country (not that Perl people look terribly good when we go out and attend XML conferences, believe me! I tend to get laughed at when I describe XML::Twig to Real XML Gurus ;--). The modules they write are not what the Perl community wants. Not that their work has no value, I think it is really important for Perl to be used in other context and to spread past its current niche, but it might be something to think about when we try to understand why some very nice modules don't seem to be used much. In any case there is a reason why the current crowd of Perl hackers loves XML::Simple and refuses to use XML::SAX::PurePerl. Java drones want SAX, Perl hackers want XML::Simple (and don't get me wrong, I would be the first to say that I don't think XML::Simple is a generic XML tool, but it is darn convenient!).

    I know that XML is annoying as it often brings no improvement over home-brewed formats. It is usually imposed by management in order to be buzzword compliant, but for Perl hackers it brings very little to the table, except an additional risk. But let's face it, we will all have to deal with it. XML is a bastard, verbose and actually quite tricky, format that's being used not because it is the best one for any particular purpose but because it is a standard, which allows Java projects to re-use standard SUN or IBM libraries where Perl hackers would use a CPAN module or 2 lines of custom code.

    That said I see a couple of reasons why Perl hackers would want to use XML: first as a data format it is actually quite powerful. It makes it easy to stick HTML inside data, it makes it easy to mark data within HTML text. It makes it possible to change the structure of the data without upgrading the code right away (I guess that would be between "loosely coupled systems"), really, you should try it ;--)

      I might be wrong, of course no one has any figure about module usage, but I believe the people who write Perl modules "the XML way", like Matt, Robin Berjon, Ilya Sterin etc... are in a way off-target. ... The modules they write are not what the Perl community wants.

      I have to disagree with you there. Kobesearch.cpan.org keeps some stats, and it shows that beyond the base XML::Parser, XML::DOM is the most popular, followed by libxml-perl (containing the PerlSAX1 code). Granted XML::SAX or other modules like it aren't on the map yet, but the above two modules show that Perl XML users do want standards based tools.

      But I do love XML::Simple - it hits the 80/20 sweet spot most of the time. However I'm much happier now I can use it via SAX not just with XML::Parser, because sometimes (e.g. with mod_perl, or when you can't compile XS), XML::Parser isn't the right tool for the job.

      Maybe you've missed the point about XML::SAX::PurePerl though. Nobody should be using XML::SAX::PurePerl directly (except perhaps in SAX module writers' test suites). It's simply there as a backup - to try and be a lowest common denominator. That's all. Plus it stops people complaining about there being no pure Perl XML parsers ;-)

        I have to disagree with you there. Kobesearch.cpan.org keeps some stats, and it shows that beyond the base XML::Parser, XML::DOM is the most popular, followed by libxml-perl (containing the PerlSAX1 code). Granted XML::SAX or other modules like it aren't on the map yet, but the above two modules show that Perl XML users do want standards based tools.

        Actually I find those statistics quite sad! XML::DOM, in fact the DOM itself, at least level 1, is a really bad tool for most XML transformations, and I would think that libxml is queried just because of the name. Most questions I read are about straight XML::Parser or XML::Simple.

Re (tilly) 4: Why is Perl so bad with XML?
by tilly (Archbishop) on Jan 31, 2002 at 20:43 UTC
    Have you noticed that you are trying to convince me of something I am clearly already convinced of, and then expecting me to do something unrelated?

    Here is what you are trying to convince me of: "But I'm not here to persuade you to use a technology. Simply to convince you that you can use it if you want to or need to, and that perl is damn good at the job."

    But I never disagreed with that. In fact I already am convinced that I can use XML. Furthermore I know I can use Perl for that. Remember my opinion that XML solves problems I don't have, doesn't solve problems I do have, and creates problems I didn't have? Do you think I had that opinion in a vacuum?

    Of course not! I and some co-workers did several small XML projects. We asked whether this tool was useful for us. And our conclusion was that it wasn't right now, so we didn't push it farther. This is the opposite of your unfounded insult that I would rather not think about it or learn about it.

    You may not like my conclusions, but that is no reason to insult me.

    Now the only point that you offered that would possibly matter to me is what would happen if I ran across a customer who only will accept XML. My attitude is that I will burn that bridge when I come to it. As it stands, said customer would be shooting themselves in the foot since they would be refusing to exchange files in the standard formats for the industry they are in. But even so, when we need to we can produce XML. I know that because we did some trial products and had no problem doing so.

    In the meantime I will worry about the real customers I do have, and not the hypothetical ones I don't. History shows that businesses that pay more attention to hypothetical customers than existing ones tend to go out of business, and I would rather not face that. (As I write this a sales guy just cheered over a verbal commitment for another long-term contract.)

    Which leads back to the fact that what you apparently want out of me is very different from what you claim to want to convince me of. You claimed I should believe that Perl is fine for XML. What it seems you really want is for me to drink the XML Kool-Aid, then go out, use it and tell the world that Perl is great for XML. You aren't addressing the fact that I see costs and little to no benefit in doing so. You give me no reason I agree with to do so. But you want me to do it.

    You are claiming that it is important from a marketing point of view for Perl to push XML because XML is a de facto standard. There is a difference of belief here. My belief, based on actual customers, is that you don't need to market yourself as XML everything to get business. My further belief is that being productive, and my assisting others to be productive, is a more effective way to market Perl in the long run. (It certainly has been a more effective way to market it in the short run for me.) My final belief is that working in a swamp of buzzword soup is generally painful and no fun. Even if you convinced me that it was absolutely essential to The Future Of Perl for me to do so (which you have not), then you would have only convinced me that I want to find something other than Perl to do in the long run.

    In short, I understand perfectly well that you can do XML in Perl. I don't think it is worthwhile to do it, and won't spend any energy beyond basic exploration until I am convinced otherwise.

    Now if you will excuse me, I have some useful work to do with a tool (relational databases) that we have found productive, so that next week our sales people will have another reason to call prospective clients and make more sales...

      OK, I'm going to try and be a little less inflammatory now :-)

      The reason I think XML should be given more time of day is because of the tools. Period. Those tools exist and are well thought out and work because a lot of thought has been put into them, not just in the Perl world, but in the entire development community. There's a lot of chopping and munging you can do with those tools really easily, and sometimes the design of them and the re-use enabled by them (I'm talking about SAX filters here in particular) makes it a cost effective weapon for your business. I do find Perl hackers very quick to dismiss the research and study that has gone on for years in more academic areas of computer science, often with good reason, but one case in point for XML is XSLT, which is an incredibly mature technology, but Perl hackers (including myself at one time) dismiss it because it's verbose and fugly, rather than look at some of the reason behind the madness.
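For readers who have not met SAX filters, here is a minimal sketch of the re-use being described. It is written in Python's standard `xml.sax` package rather than Perl's XML::SAX, purely as an illustration; the `RenameElement` class, the `transform` function and the element names are hypothetical. Each filter does one small job and can be stacked on top of another filter or a parser.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class RenameElement(XMLFilterBase):
    """Hypothetical filter: renames one element, passes all other events through."""

    def __init__(self, parent, old, new):
        super().__init__(parent)
        self._old, self._new = old, new

    def _map(self, name):
        return self._new if name == self._old else name

    def startElement(self, name, attrs):
        super().startElement(self._map(name), attrs)

    def endElement(self, name):
        super().endElement(self._map(name))


def transform(xml_text):
    """Run the text through two stacked filters and return the re-serialised XML."""
    out = io.StringIO()
    # parser -> stage1 -> stage2 -> XMLGenerator (which writes XML text)
    stage1 = RenameElement(xml.sax.make_parser(), "qty", "quantity")
    stage2 = RenameElement(stage1, "item", "product")
    stage2.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    stage2.parse(io.StringIO(xml_text))
    return out.getvalue()


result = transform("<order><item><qty>2</qty></item></order>")
```

Because every stage speaks the same event interface, the same filter class is reused twice here with different arguments, and either stage could be dropped or replaced without touching the parser or the writer.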

      But aside from all that, I really don't want to try and convince you that you must use Perl and XML. I've obviously written something completely wrong to make you think that, and for that I apologise. Let me clarify exactly what I want to happen: I want people to stop asking why Perl sucks at XML (and that includes you mirod!). Because people like mirod and myself have worked really damned hard in the past few years to make sure that it doesn't. And really - it doesn't. Perl absolutely rocks at XML, as people who have actually used it compared to the competition will testify.

      So I guess what I'd love to see is those people who have been successful at using it should make sure that other people know their success story. Because the XML world believes that we all use regexps to process our XML, and that Perl really sucks for XML, and that annoys me, because it's a myth. And it annoys me because I put in a lot of grunt work to make it not true.

Re: Re: Re (tilly) 2: Why is Perl so bad with XML?
by Anonymous Monk on Feb 01, 2002 at 06:29 UTC
    but one day your customer will have to send their data to an MS shop, and they'll ask for XML, and then they'll ask you if they can send you their XML, because that's easier for them. It's just how the world works - we can't stop it turning even if we want to.

    I'm not in the industry myself, so I have no experience of my own with this. But your statement has a touch of self-fulfillment. If everyone thinks everyone else uses it, everyone will start to use it.

    Or is it de facto so that the majority out there uses and requires XML?

      You'll have to admit that XML is a very good exchange standard: it allows you to specify the encoding of information, it gets rid of line ending problems, it is reasonably self-documented and quite human legible. If all else fails you can always fire up vi and figure out what's in the file you just received.

      I know CSV is somewhat self-documented if you include column headers, but it is still easier to have the name of the field right around the field, even in the middle of 2 MB of data, than to have to go back to the beginning of the file, read the headers and figure out in which column you are. Plus some CSV files are pretty ugly. I just had to process the export of an Access DB that included multi-line fields, which was a real pain to parse: there was no special end of record marker so Text::CSV_XS could not read records properly. An XML export would have solved this problem.
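A small sketch of the multi-line field problem, in Python for illustration only. The sample data and function names are made up, and it assumes the export quotes the multi-line field correctly, which the Access dump described above may not have done. Reading line by line miscounts the records, a quote-aware CSV reader copes as long as the quoting is right, and the XML version marks the end of each record explicitly.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical export: record 1 has a note field containing a line break.
CSV_EXPORT = 'id,note\n1,"first line\nsecond line"\n2,plain\n'

XML_EXPORT = """<rows>
  <row><id>1</id><note>first line
second line</note></row>
  <row><id>2</id><note>plain</note></row>
</rows>"""


def naive_records(text):
    """Treat every physical line after the header as one record."""
    return [line.split(",") for line in text.splitlines()[1:]]


def quoted_records(text):
    """Use a quote-aware reader, which keeps the embedded newline in the field."""
    reader = csv.reader(io.StringIO(text, newline=""))
    next(reader)  # skip the header row
    return list(reader)


def xml_records(text):
    """Each <row> element is one record, however many lines it spans."""
    return [[row.findtext("id"), row.findtext("note")]
            for row in ET.fromstring(text).iter("row")]


print(len(naive_records(CSV_EXPORT)))   # 3: the multi-line field is split in two
print(len(quoted_records(CSV_EXPORT)))  # 2
print(len(xml_records(XML_EXPORT)))     # 2
```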

      Plus of course the hype factor makes it way kewler for management to say they're sending XML than CSV. Maybe tilly should start a W3C working group on CSV ;--)

        Did you try using DBD::CSV or Text::xSV to pull data from the Access CSV dump? (Which I assume was dumped using Microsoft's standard save as CSV.) Anyways nobody keeps data in that format for working, if it is tabular data then it is destined for life in a relational database where it is far more easily manipulated than it is in CSV. (Unless it is going to a financial analyst in which case it is destined for life in an Excel spreadsheet.)

        As for W3C, why waste my time? I don't think that CSV is an appropriate solution for their problems. It is a good one for interchanging a lot of the data that exists within the bond world, which is why there are standard formats there which have been used for years specifying CSV formatted data. They are here. They work. People use them.

        And when it comes to management and hype, sometimes that is life. Personally I prefer it when developers are free to choose the most cost-effective tools for what they are doing. (For one thing PHB heavy companies don't do so well. And companies that always seek to follow the herd tend to find a lot of nasty cliffs. Big companies have enough inertia to survive most such cliffs, small ones do not. Either way, they aren't fun for the developers who go over them.) If that means XML, that means XML. If it means CSV, then that means CSV.

        When it comes to tabular data I prefer CSV for several reasons. The first is that it is easier to see that it is tabular data at a glance. The second is that I prefer getting a 2 MB file to a 10 MB file. The third is that it takes a lot less work to set up a CSV format than it does to set up a DTD, etc. The fourth is that most people who work with tabular data already have more tools for CSV than XML. (Try any spreadsheet application.)
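The size point is easy to check for yourself. The sketch below, in Python with made-up rows and hypothetical function names, serialises the same table both ways and compares lengths. The exact ratio depends entirely on the field names and values, so it only shows the direction of the difference, not the 2 MB versus 10 MB figure.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up tabular data: a header and a few rows.
HEADER = ["isin", "coupon", "maturity"]
ROWS = [
    ["XS0000000001", "5.25", "2010-06-30"],
    ["XS0000000002", "4.75", "2012-01-15"],
    ["XS0000000003", "6.00", "2008-09-01"],
]


def as_csv(header, rows):
    """Field names appear once, in the header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def as_xml(header, rows):
    """Field names are repeated as open and close tags around every value."""
    root = ET.Element("rows")
    for values in rows:
        row = ET.SubElement(root, "row")
        for name, value in zip(header, values):
            ET.SubElement(row, name).text = value
    return ET.tostring(root, encoding="unicode")


csv_text = as_csv(HEADER, ROWS)
xml_text = as_xml(HEADER, ROWS)
print(len(csv_text), len(xml_text))
```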

        For non-tabular data, well CSV is not appropriate for that. Use the right tool for the job. As for when XML makes sense for that, I don't have much of an opinion. I haven't had to solve that problem extensively. Certainly it was originally designed for that problem space, and the sheer effort that has gone into XML undoubtedly has resulted in some effective tools for some problems. The hype has also definitely resulted in it being used for problems where it doesn't make sense. I just don't know where the boundaries are.

        You'll have to admit that XML is a very good exchange standard: it allows you to specify the encoding of information, it gets rid of line ending problems, it is reasonably self-documented and quite human legible. If all else fails you can always fire up vi and figure out what's in the file you just received.

        I quite agree. I'm no standards expert, but I think one of the clearest benefits of XML lies in a field wider than just code. With more and more dynamic community-based sites popping up by the day, and your average Joe wanting to make the stuff he writes on them look cool, the general public is becoming much more familiar with HTML. And, "XML looks just like HTML".

        The angle brackets of HTML looked very odd to me when I first saw them, and I'm sure they still do to anyone new to that stuff. But as the new generation takes up computers, HTML will begin to be understood by people, and I think eventually XML or a derivative of it will become the language for all code-related human-maintained files. By this, I mean that configuration files for your Perl programs will probably be written in XML, as that way everyone will understand them. None of all this silly var: value syntax that has to be relearned for every program that works slightly differently. Feel free to disagree, but I think that's a good thing. (/me shudders when he thinks of Esperanto, and hopes the same doesn't apply here...)

        I speak not as a devotee of XML; I'm still deciding on my module of choice, my familiarity with XML::Parser amounts to a brief look over the tutorial here, and reading the standard gave me a brain-ache. I certainly don't force people into using it. But I am excited about getting an opportunity to use it. I admit, tilly, that none have come up yet, but if/when one does I'll jump at the chance.



        --
        my one true love
        Actually, I was not even making a remark on how XML is. I was just wondering about your opinion -- if it's backed up by reality or if it's just an attitude.
      Remember that XML is a de jure standard, not a de facto one (well, we can pick nits about whether a "recommendation" from the W3C is a "standard" or not, but most Corps see it that way, and there are plans afoot to put XML through the ISO). That's very important to PHBs. And if you prefer de facto standards, well there's always MS Excel ;-)

      But yes, if everyone thinks everyone else uses it, everyone does start to use it. That's what happened with Java - Sun spent billions (yes, literally) of dollars selling it via articles and advertising and so on, and they convinced the world that it was the next big thing, and lo and behold, it was the next big thing. The beauty of XML is that everyone else is spending billions promoting it, yet we as Perl hackers get to play in the same ballpark as the MegaCorps, which is nice for a change.

Re: Re: Re (tilly) 2: Why is Perl so bad with XML?
by sfritz (Novice) on Feb 01, 2002 at 22:38 UTC
    ugh that's the third time today I've forgotten markup..

    /shrug
