framboise has asked for the wisdom of the Perl Monks concerning the following question:

Hello again. I posted to the PerlMonks forum a few months ago, asking about the resources available for natural language processing in Perl. Well, the project has come a long way since then. . . (reminiscence) But now I've run into a problem!

I'm using XML-RPC to query the ConceptNet database. For the most part it works flawlessly, but on the behemoths (queries with very large result sets) I get this error: "not well-formed (invalid token) [. . . (blah blah) . . .] at C:/Perl/lib/XML/ line 187". And I think to myself: oh no! For a while I've been working around the error, but I'm at the point where I have no alternative but to tackle the problem. So I've been thinking. . .

Is there a better alternative to Frontier::Client? I've looked and couldn't find such a thing. So I thought that perhaps I could edit Frontier::Client and replace XML::Parser with some other module that could handle larger files. Is there a way that I could implement XML::Twig inside Frontier? Or is there a better option?

All help will be greatly appreciated.
- Justin

Replies are listed 'Best First'.
Re: XML-RPC troubles. . .
by stiller (Friar) on Feb 02, 2008 at 07:03 UTC
    The error message that you cite says that the input XML is not well-formed. Why are you looking for a solution that can handle bigger XML files?
Re: XML-RPC troubles. . .
by Joost (Canon) on Feb 02, 2008 at 10:16 UTC
    I suspect the real problem is hinted at in the (invalid token) [. . . (blah blah) . . .] line. More specifically, in the (blah blah) part, which should contain some indication of where in the XML the error occurs.

    update: or just save the XML somewhere and run it through some other conforming parser to see what's up.
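    Short of running the saved XML through a second parser, a quick first check is to scan it for control bytes that XML 1.0 does not allow, since those are one frequent cause of an "invalid token" error. The following is only a minimal sketch: `find_illegal_xml_bytes` is a hypothetical helper name, not part of any module, and the sample string is made up.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): return the byte offsets
# of control characters that XML 1.0 does not allow in a document.
# Below 0x20, only tab (0x09), LF (0x0A), and CR (0x0D) are legal.
sub find_illegal_xml_bytes {
    my ($xml) = @_;
    my @offsets;
    while ($xml =~ /[\x00-\x08\x0B\x0C\x0E-\x1F]/g) {
        # pos() points just past the match, so step back one byte.
        push @offsets, pos($xml) - 1;
    }
    return @offsets;
}

# Made-up response fragment containing a stray 0x0B byte.
my $sample = "<value><string>live\x0Bwire</string></value>";
printf "illegal byte at offset %d\n", $_
    for find_illegal_xml_bytes($sample);
```

    An empty result does not prove the document is well-formed (bad encoding or unescaped markup would not be caught), so a real parser is still the final word.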

Re: XML-RPC troubles. . .
by framboise (Novice) on Feb 02, 2008 at 21:44 UTC
    Solved the problem. Used XML::RPC module instead of Frontier::Client. No troubles after that.

    Thanks for the help.

      ...and thus saving me the trouble of recommending my own module :-). Glad it works for you. If you have any questions while using RPC::XML, feel free to drop me a line by email.


Re: XML-RPC troubles. . .
by framboise (Novice) on Feb 02, 2008 at 18:42 UTC
    I assumed that the error was related to memory limitations because of the pattern it seemed to follow. Although there also seems to be a problem with quick, successive queries, the error is given after very general queries with large result sets, such as "person" and "live".

    Concerning Joost's comment, the line, column, and byte triplet of the error depends on the query. For example, the error returned for "person" is ". . . (invalid token) at line 1449, column 38, byte 61239 at C:/Perl/lib/XML/ line 187", while a query for "live" returns ". . . at line 516, column 31, byte 22317 at C:/Perl/lib/XML/ line 187". What may be more useful is that the error always occurs on line 187 of the XML::Parser module.

    Is it more likely that what I took to be a memory error actually has to do with the data sent by the XML-RPC server itself? If so, would using XML::Smart within Frontier::Client solve the problem?
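
    One way to tell whether the server's data is at fault is to look at the bytes around the offset reported in the error message. The sketch below assumes the raw response body has already been captured into a string; `context_at` is a hypothetical helper, and `response.xml` in the usage comment is an assumed file name. Whether the parser counts the offset from 0 or 1 is not checked here, which is why a window of bytes is shown rather than a single byte.

```perl
use strict;
use warnings;

# Hypothetical helper: return the bytes within $radius of $offset,
# with everything outside printable ASCII rendered as \xNN so that
# stray control or high-bit bytes become visible.
sub context_at {
    my ($raw, $offset, $radius) = @_;
    $radius = 20 unless defined $radius;
    my $start = $offset - $radius;
    $start = 0 if $start < 0;
    return '' if $start > length $raw;
    my $chunk = substr($raw, $start, $offset - $start + $radius + 1);
    $chunk =~ s/([^\x20-\x7E])/sprintf('\\x%02X', ord $1)/ge;
    return $chunk;
}

# Made-up data standing in for a saved response.
my $raw = "<string>per\x00son</string>";
print context_at($raw, 11, 5), "\n";

# Usage sketch for a saved response (file name is an assumption):
#   open my $fh, '<:raw', 'response.xml' or die $!;
#   my $data = do { local $/; <$fh> };
#   print context_at($data, 61239), "\n";
```

    If the window shows an escaped byte or a bare "&" or "<" inside a value, the response itself is malformed, and swapping the parser inside Frontier::Client is unlikely to help on its own.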