X-prize software challenge?

by BrowserUk (Patriarch)
on Oct 15, 2004 at 14:05 UTC ( [id://399508] : perlmeditation )

Update: Please post entries to the challenge subordinate to Re: X-prize Suggestions here please! and leave the main thread for discussion arising from this node itself. Thanks--Buk.

The following was sparked by reading this X-Prize for world's 'Holy Grails' article on the BBC.

Having read it, including the bit about Skype having recently won a (non-X-prize) award for its voice-over-IP service, I got to wondering what I would consider a suitable challenge for the new X-prizes. Especially with respect to software (and to keep this on-topic: no matter what 'it' is, there would always be at least some scope for it having a Perl component or two).

The usual suspects crossed my mind: artificial intelligence; natural language processing; a generic way to optimise algorithms for 'solving' NP-complete problems. But each of these seems to miss the mark, namely that:

X-prize problem proposals must have a good chance of succeeding within a reasonable timescale.

Then I got to thinking about why I felt that the above problems failed to satisfy this criterion. The conclusion I reached is that they are too generic; too open in their specification. The recently won Ansari X-Prize (nice one, Burt) had very clear judging criteria: two flights by the same vehicle, carrying three persons (or equivalents), to a height of 100 km or greater, within a timeframe of 2 weeks. Clear, concise and unambiguous*.

*Unless you're a Ferengi lawyer who might consider the definition of "2 weeks" ambiguous :)

More importantly, it didn't carry with it any unnecessary additional criteria. Whilst most of those attempting to pursue the prize settled into emulating the Saturn V model, this was in no way encapsulated in the specification. Unlike, say, the world land speed record specification, which applies an additional criterion beyond the "average of two runs within 1 hour" by specifying that the vehicle must have 4 wheels--the undoing of at least one very good attempt in the last century.

The restriction of the rules to the necessary criteria allowed Burt Rutan to "think outside the box" and use his two-stage, air-launched approach. Not that this was unique: 1 2 3.

Of course, there is the Turing test, which is often manifested as the Eliza challenge. I googled for a link and found this. Hmm. Maybe not the best available:

Eliza: Hello. I am ELIZA. How can I help you?
Me: Maybe.
Eliza: Why the uncertain tone?
Me: When was your program devised and by whom?
Eliza: Does that question interest you?
Me: Yes. That is why I asked it.
Eliza: You are sure?
Me: Yes. I am sure.
Eliza: You are sure?
Me: Yep!
Eliza: Please go on.
Me: How can I when you haven't answered the question?
Eliza: Does that question interest you?
...

but if you look at the "judgement criteria", it comes down to this phrase:

...it will be possible to programme computers, [...] to make them play the imitation game so well that an average interrogator will not have more than 70 percent chance of making the right identification after five minutes of questioning...

The problem with this as a judgement criterion (as opposed to an argument in discussion--no, I'm not criticising Alan Turing :) is that it is much too vague to allow for unequivocal verification.

  • What constitutes an "average interrogator"?
  • How do we measure "70 percent chance"?
  • What constitutes "questioning"?

That last criticism may seem esoteric, but if I consistently and invariably ask "Why?", is that allowed?

How long would a "real human being" tolerate that conversation before replying with something akin to "P*** off you moron!"?
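
As an aside, the evasiveness in the transcript above takes remarkably little machinery to produce. A minimal, hypothetical Eliza-style responder in Perl (the patterns and canned replies below are invented to mimic the transcript, not taken from any real implementation) might be no more than:

    use strict;
    use warnings;

    # Keyword patterns tried in order; a catch-all reply when nothing matches.
    my @rules = (
        [ qr/\bmaybe\b/i                => "Why the uncertain tone?"          ],
        [ qr/\b(?:yes|yep)\b/i          => "You are sure?"                    ],
        [ qr/\b(?:why|who|when|how)\b/i => "Does that question interest you?" ],
        [ qr/\bI am (.+)/i              => "How long have you been %s?"       ],
    );

    print "Eliza: Hello. I am ELIZA. How can I help you?\n";
    while ( defined( my $line = <STDIN> ) ) {
        chomp $line;
        my $reply = "Please go on.";
        for my $rule (@rules) {
            my ( $pattern, $response ) = @$rule;
            if ( $line =~ $pattern ) {
                # Only the "I am ..." rule captures, and only its reply has a %s slot.
                $reply = $response =~ /%s/ ? sprintf( $response, $1 ) : $response;
                last;
            }
        }
        print "Eliza: $reply\n";
    }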

So, for an X-prize challenge to be viable, it must be:

  • Predictably achievable.
  • Have fairly widespread applicability to the world. [399516|Thanks Tilly]

    A program to allow you to hum your great opus and have it translated into a full orchestral score would be way cool, but unlikely to grab the b...hearts of too many hackers:)

  • Achievable within a reasonable timeframe. That's still a little too loose for my tastes, so let's say 10 years.
  • Criteria for completion should be clear, concise and unambiguous. Easily verifiable.
  • It should have no inherent commercial value.

    I've um'd and ah'd about including this but finally decided that I would. The point is that if simply meeting the criteria of the challenge itself would result in something that has inherent commercial value, then it places a barrier on cooperation in developing an open, free-as-in-beer solution.

    The Ansari X-prize may lead, with investment and commitment and a good tail wind, to a commercial venture that may result in profits--as with Sir Richard Branson's initiative--but in and of itself, completing the challenge has no commercial value (beyond being a pretty good advert).

So, the point of this ramble. What do you consider would make a good candidate for a Software X-prize?

Ideally, you should supply both the specification of the challenge and the judgement criteria.

It would also be nice, if the denizens of this place permit it, if:

  1. Each main (1st-level) response to this thread was one such unique 'entry'.
  2. Any responses, challenges or suggested modifications to that entry would be posted subordinate to that entry.
  3. The 'creator' of that entry would take on the task of updating (without maintaining a laborious update history) that entry in the light of good (in the collectively applauded sense?) suggestions for additions/modifications/improvements to that challenge.

The idea being to refine the original idea in the light of the collective experience to produce a better definition. Each challenge posted would be a separate entity, without there (necessarily) being an overall judgement on which is the overall 'best'. To this end, I will post my favorite idea (if I ever get around to posting this post) as a reply to it.

It may be that this post will itself become subject to criticism, discussion or censure, in which case it might be better to start a separate thread for the posting of actual challenges? Anyway, I will need time to refine my idea, so I'll post this first and see what the consensus is before posting it.


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon

Replies are listed 'Best First'.
Re: X-prize software challenge?
by tilly (Archbishop) on Oct 15, 2004 at 14:54 UTC
    You missed a key criterion. Prizes like these are intended to motivate people to do something that the creator wants done, which is unlikely to happen in a timely manner otherwise. For instance, the Ansari prize was meant to jump-start privately-funded space travel.

    My feeling is that the dynamics of open source software are such that if you have an itch, it is more effective to scratch it yourself (or hire someone to do that) than to try to motivate people with an X-prize.

      I think the bigger difference is that software is relatively cheap to create, even complex software like operating systems, web browsers, graphics rendering, general-purpose servers, and even languages themselves. The cost for SpaceShipOne to win the M$10 Ansari prize has been estimated at over M$20. That doesn't count the amount the other 70-odd entrants spent. It's arguable that nearly M$500 was spent on the Ansari prize. It's doubtful that this much has been spent on any major open-source software offering, even Linux.

      Part of the other problem is that privately-funded space travel is fungible. Software, intrinsically, is not, notwithstanding the excellent efforts from Redmond. And, given the efforts of Google and others, it's rapidly becoming less fungible.

      My feeling is that a good software X-Prize would be something along the lines of true generic natural-language processing or a program that plays Go at the master level. Feasible ... just "Really Hard"™.

      There used to be a prize for a Go program, but the person offering it passed away in 1997. Maybe, someone much richer than I should take it back up. A master-level Go program on today's hardware would be a quantum advance in certain algorithms.

      Being right, does not endow the right to be rude; politeness costs nothing.
      Being unknowing, is not the same as being stupid.
      Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
      Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

        Software is relatively expensive to produce. It takes a lot of time from skilled workers. Open source looks cheap because it is being done by volunteers, or by companies where the cost is hidden.

        There was a recent email on the Linux kernel mailing list about the cost to reproduce Linux 2.6 from scratch. It used a standard estimation model. With over 4 million lines of code, it would take over 1,300 person-years of effort; finishing in 8 years would require over 150 developers, at about $175 million in salaries.
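        (A rough sanity check of those figures: 1,300 person-years spread over 8 years is about 163 developers, and $175 million over 1,300 person-years works out to roughly $135,000 per developer-year, which is plausible as a fully-loaded salary.)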

        I bet if you added up the research and development expenses of open source companies, you would end up with a significant chunk of change.

Re: X-prize Suggestions here please!
by BrowserUk (Patriarch) on Oct 15, 2004 at 14:57 UTC

    Hardburn suggested that a specific subthread for entries would be a good idea, and I agree.

    Post your suggestion as a reply to this node, and leave the parent thread for discussion arising from the parent node itself.


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
      Goal
      A program that, on off-the-shelf hardware, plays Go at a master level (1 dan, for instance).

      Criteria

      • Achieve 1 dan in the standard fashion prescribed by the International Go Association, modified solely by the fact that the person making the moves would not be the person playing the game, a la Deep Blue vs. Kasparov.

      Being right, does not endow the right to be rude; politeness costs nothing.
      Being unknowing, is not the same as being stupid.
      Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
      Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

        Suggestions:

        Split this node into two, leaving one idea here, and another at the same level.

        Clarify the "goal" of each challenge.

        Add a section denoting the judgement criteria for each.

        Update: Change the title of each post to something like: "Xprize: Natural language processing" and "Xprize: Master level Go program".

        Thanks.


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail
        "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
        Even doing it on custom hardware (à la Deep Blue) would be pretty damned impressive.
      Goal
      To create a program (with any needed hardware) that can translate any arbitrary piece of text from any human language to any other human language.

      Criteria
      The criteria here are going to be a little vague, but hopefully we can expand on them.

      • Translations must take no longer than 1 second per word.
      • An arbitrarily chosen native speaker of the language translated into must not be able to discern that it was a computer-generated translation.

      Being right, does not endow the right to be rude; politeness costs nothing.
      Being unknowing, is not the same as being stupid.
      Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
      Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

        I think that this is not a good software X-prize contender because:

        1. What constitutes "any human language"?
          • LA street slang?
          • Egyptian hieroglyphics?
          • Chaucer's English?
          • Pidgin?
          • Bill & Ted speak?
          • Clockwork Orange "newspeak"?
          • WW2 Navajo code?
        2. Is this an "arbitrary piece of text"?

          platelayers paratrooper spumoni subversive bala womenfolk zealot wangling gym clout proxemic abravanel entryway assimilates faucets dialup's lamellate apparent propositioning olefin froude.

        3. Neither the goal nor the criteria specify anything about meaning.

          Input: "Mich interessiert das Thema, weil ich fachlich/ beruflich mit Betroffenen zu tun habe."

          Output: "Engaged computing means computational processes that are engaged with the world—think embedded systems."

          Both sentences are (probably) pretty well-formed in their respective languages. The cause of my indecision is that:

          1. I don't speak German, so I can't comment on the first.
          2. My native English skills are far from brilliant. The second seems to make sense, and was probably written by a human being, but whether an English language teacher would find it so is a different matter.

          However, the two sentences (probably) have very little common meaning, as I picked them at random off the net.

        The problem with every definition I've seen of "Natural Language Processing" is that it assumes it is possible to encode not only the syntactic and semantic information contained in a piece of meaningful, correct* text in such a way that all of that information can be embodied in some other language, but also all the meta-information that the human brain divines from auxiliary clues: context; previous awareness of the writer's style; attitudes and prejudices; and a whole lot more besides.

        *How do we deal with almost correct input?

        Even a single word can have many meanings which a human being can, in many cases, divine through context. E.g.

        Fire.

        In the absence of any meta-clues, there are at least 3 or 4 possible interpretations of that single word. Chances are that English is the only language in which those 3 or 4 meanings use the same word.

        Then there are phrases like "Oh really". Without context, that can be a genuine enquiry, or pure sarcasm. In many cases, even native English speakers are hard pushed to discern the intended meaning, even with the benefit of hearing the spoken inflection and being party to the context.

        Indeed, whenever the use of language moves beyond the simplest of purely descriptive use, the meaning heard by the listener (or read by the reader) is as much a function of the listener's or reader's experiences, biases and knowledge as it is of the speaker's or writer's.

        How often, even in this place with its fairly constrained focus, do half a dozen readers come away with different interpretations of a writer's words?

        If you translate a document from one language to another word by word, you usually end up with garbage. If you translate phrase by phrase, you need a huge lookup table of all the billions of possible phrases, and you might end up with something that reads more fluently, but there are two problems.

        1. The mappings of phrase to phrase in each language would need to be done by a human being fluent in both languages (or a pair of native speakers of the two languages who could contrast and compare possible meanings until they arrived at a consensus). This would be a huge undertaking for any single pair of languages; but for all human languages?
        2. Even then, it's not too hard to sit and construct a phrase in English, and a translation of it in a second language, that would be correct in some contexts but utterly wrong in others.

          How do you encapsulate the variability between what the writer intended to write and what the reader read? And more so, the differences in meaning perceived between two or more readers reading the same words? Or the same reader, reading the same words in two or more different contexts?

        Using the "huge lookup table" method, the magnitude of the problem is not the hardware problem of storing and fast retreival of the translation databases. The problem is of constructing them in the first place.

        The 'other' method of achieving the goal, that of translating all of the syntactic, semantic, contextual, environmental and every other "...al" meaning embodied within natural language into some machine-encodable intermediate language, so that once encoded the translation to other languages can be done by applying a set of language-specific "construction rules", is even less feasible.

        I think the problem with Natural Language Processing is that, as yet, even the cleverest and most fluent speakers (in any language) have not found a way to use natural language to convey exact meanings, even to others fluent in the same language.

        Until humans can achieve this with a reasonable degree of accuracy and precision, writing a computer program to do it is a non-starter.


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail
        "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon

        Suggestion: Give a more rigorous testing style for the "arbitrarily chosen native speaker". Something like:

        The tester sits in one room with a computer connected to an IRC server in a private room. Two other users are allowed in the IRC room (but only one of them is in it at once), one of which is the program and the other is a second arbitrarily chosen native speaker. After an hour of questioning, the tester will make a guess as to which user is a program and which is a human. The test is repeated with other native speakers (up to some TBD number of tests). To win, the testers must guess incorrectly at least 50% of the time.

        This will probably need to be modified further, but should be a good start. It also adds the requirement that the program can talk over IRC, but I doubt that would be a challenge for anyone implementing a natural-language processor :)

        "There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

        I cut and pasted from A. Cottrell's research on Indian-Western couples living in India, which is an excerpt from the book "The Use and Misuse of Language", specifically the article by Edmund S. Glenn entitled "Semantic Difficulties in International Communication". Good and cheap book, worth the read.


        Glenn, in "Semantic Difficulties in International Communication" (also in Hayakawa), argues that the difficulty of transmitting ideas from one national or cultural group to another is not merely a problem of language, but is more a matter of the philosophy of the individual(s) communicating, which determines how they see things and how they express their ideas. Philosophies or ideas, he feels, are what distinguish one culture group from another. "...what is meant by (national character) is in reality the embodiment of a philosophy or the habitual use of a method of judging and thinking." (p. 48) "The determination of the relationship between the patterns of thought of the cultural or national group whose ideas are to be communicated, to the patterns of thought of the cultural or national group which is to receive the communication, is an integral part of international communication. Failure to determine such relationships and to act in accordance with such determinations, will almost unavoidably lead to misunderstandings." Glenn gives examples of differences of philosophy in communication misunderstandings among nations, based on UN debates, as well as some examples which might be experienced by cross-cultural couples. For example: to the English, No means No; to an Arab, No means yes-but-let's-negotiate-or-discuss-further (a "real" no has added emphasis); Indians say no when they mean yes regarding food or hospitality offered.
      Goal
      Create a program that can learn like a baby/child, with only plain text in one natural language (like English) as input and as output. The program will also respond as a baby would, making more sense as it matures.

      Criteria
      The program must store associative information based on the input given. If the program gets input like "A tree is green.", it must (learn to) store a connection between some entity "tree" and some other entity "green". The program may not have any hard-coded knowledge of verbs, nouns, adjectives, etc.
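
      A minimal, hypothetical sketch of the kind of associative store this describes; the "X is Y" pairing heuristic below is invented for illustration and is obviously far too crude to count as learning.

        use strict;
        use warnings;

        my %assoc;    # entity => { associated entity => how often seen together }

        # Record a crude association from sentences of the form "A <thing> is <property>."
        sub learn {
            my ($sentence) = @_;
            if ( $sentence =~ /^(?:a|an|the)?\s*(\w+)\s+is\s+(\w+)/i ) {
                my ( $thing, $property ) = ( lc $1, lc $2 );
                $assoc{$thing}{$property}++;
                $assoc{$property}{$thing}++;    # associations are symmetric here
            }
        }

        # Babble back the strongest association it has for an entity.
        sub respond {
            my ($entity) = @_;
            my $links = $assoc{ lc $entity } or return "?";
            my ($best) = sort { $links->{$b} <=> $links->{$a} } keys %$links;
            return "$entity is $best";
        }

        learn("A tree is green.");
        learn("The tree is tall.");
        learn("A tree is green.");
        print respond("tree"), "\n";    # "tree is green" -- seen twice, so it wins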
        This kind of rule-based learning is the basis for intermediate level programs in languages such as Prolog.

        Being right, does not endow the right to be rude; politeness costs nothing.
        Being unknowing, is not the same as being stupid.
        Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
        Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

      A Knowledge Protocol.

      Goal

      TBA

      Criteria

      When I (and anyone connected to the internet) can supply the following query to the protocol and receive back a few relevant, accurate, location-specific answers:

        Item:     Microwave oven
        Location: UK
        Desired:  Price, make, model, size, power.

      Description/Justification

      That's an ambitious, maybe even arrogant, title, but I'll try to clarify my idea and maybe someone will suggest a better one.

      Have you ever gone out on the web looking to find information about a particular item you are considering purchasing?

      I recently needed to replace my 20-year-old microwave oven. So, I started out hitting the web sites of one or two of the national chains of white-goods suppliers here in the UK. The result was an intensely frustrating experience.

      • All of them wanted to set cookies even before they had shown me anything that I wanted to see.

        No thanks. If I see what I want and choose to make a purchase from you, I may allow a session cookie that is only valid within the current domain, for the duration of the transaction.

        but otherwise: Go F...er.. No thank you very much.

        And no amount of "It allows us to give you a better shopping experience" or any of the other lame excuses that I have received as replies to my complaints to sites will change my mind on this.

      • Most of them either didn't work at all, or didn't work correctly without javascript enabled.

        Bad luck, guys! You lost my custom before we even got started.

      • Some, fewer these days thank goodness, won't work unless you are using Internet Explorer.

        Why the hell would anyone use IE (of any flavour)? It is single-handedly responsible for the transmission of something like 90% of all the viruses, trojans, and other forms of internet nasties (probably 95% if you add Outlook Express). Why force me to use it? The upshot is, if you try, you lost me. (I just wish more people would take my lead and refuse to use websites that do these things--then 'they' would get the message.)

      • Flash

        Just say no. No, say "No way". Better yet, send sites that insist on using a 300k flash animation to say "Welcome! Click here to continue", a 10 MB flash animation in an email saying "No! No flash! No how, no way. NO! Bye, sucker."

      • Have you ever looked at how much of a typical retail site page (in terms of screen real-estate) is actually devoted to the thing that the page purports to be about (the XYV model-pqr microwave, for instance), and how much is a randomly distributed mish-mash of irrelevant information that is either needlessly displayed on every page of the site, or thrown up at random regardless of what it is that you are actually doing?

        Why is Google so successful? Because of its superior search engine technology? Maybe, now, but my original criterion for going there was the total lack of crap. With half of the search engines around, you have (or had) to wait 20 minutes for 300 KB of crap to arrive and be formatted before you even got to type in your query, and as for the volumes of crap that is (was) presented after the query...

        Displayed by dictionary.reference.com after you search for how to spell "excretia":

        Get the Most Popular Sites for "Excretia"   #### Yeah, right!
        Suggestions:
          Exc retia   #### No such word. Why offer it?
          Exc-retia   #### Ditto
          Excreta
          Excretion
          Excrete
          Excreter
          Excreation
          Excreate

        Any of the increasing number of Google wannabes that wants to attract my patronage will need to have learned that lesson.

      Next, I tried Google to locate some information on "microwaves 900W UK price" and a whole slew of variations. Half the sites that turn up are US sites. Half of the rest are "Comparison shopping" sites that seemingly catch everything. Of those left, actually extracting the knowledge* that I was after, from amongst the noise, was just too painful (and probably unnecessary) to relate.

      So, what I am looking for is a "Knowledge Protocol".

      There is an adage whose provenance I am not sure of, nor could I locate it, but it says that:

      Anything (literally anything; words, sounds, numbers, blades of grass, fossilised feces, ash on the carpet, or the absence thereof; anything) is data.

      Once collated (in some fashion), data can become information. Whether said information is useful to any particular viewer is dependent upon a variety of things.

      But what I am seeking is not information. Visiting the Ferrari website to look for data about the fuel consumption of their vehicles, I might be presented with a banner informing me that

      Michael Schumacher's wife's sister has a friend who markets deodorant products for porcines.

      This may well be "information" (of the FYI or FWIW kind), but it certainly isn't what I went there seeking.

      It isn't knowledge.

      So what would a knowledge protocol allow me to do?

      Scenario. I send a query of the form:

        Item:     Microwave oven
        Location: UK
        Desired:  Price, make, model, size, power.

      to some anonymous email resender* (controversial: but why not use the distributive power of spam for good rather than bad?)

      The resender forwards the query to anyone who has registered as a respondent to enquiries concerning "Microwave ovens" in "UK". For the registration process, think along the lines of subscription to newsgroups and mailing lists.

      The resender forwards the request, devoid of identifying information, to a Knowledge Protocol port.

      The daemon responds with:

      1. The requested information, as defined by the "Desired" card.
      2. To make it commercially interesting, a single URL that should lead to a page that expands upon the requested information. And specifically, the requested information.
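
      Purely as an illustration of how lightweight the wire format could be, here is a toy responder in Perl. The field names, the sample stock record and the URL are all invented; the goal above is deliberately still open, so none of this is a proposed standard.

        use strict;
        use warnings;

        # A toy in-memory "catalogue" for one hypothetical responder.
        my @stock = (
            { make => 'Acme', model => 'MW-900', price => '49.99 GBP',
              size => '23 litre', power => '900 W' },
        );

        # Parse "Field: value" lines into a hash, keyed by lower-cased field name.
        sub parse_query {
            my ($text) = @_;
            my %q;
            for my $line ( split /\n/, $text ) {
                $q{ lc $1 } = $2 if $line =~ /^(\w+):\s*(.+?)\s*$/;
            }
            return \%q;
        }

        # Answer with only the fields the enquirer asked for, plus one URL.
        sub respond {
            my ($q) = @_;
            my @wanted = map { lc } split /\s*,\s*/, ( $q->{desired} || '' );
            my @lines;
            for my $item (@stock) {
                push @lines, join '; ', map {
                    my $field = $_;
                    $field . ': ' . ( defined $item->{$field} ? $item->{$field} : '?' );
                } @wanted;
            }
            push @lines, 'URL: http://example.com/microwaves/mw-900';  # expands on the answer
            return join "\n", @lines;
        }

        my $query = "Item: Microwave oven\nLocation: UK\nDesired: Price, make, model, size, power\n";
        print respond( parse_query($query) ), "\n";

      The interesting part, of course, is not the parsing but the registration, forwarding and kill-list machinery discussed below.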

      Of course, there will be those who will either just link to their standard home page, or to a page that carries a redirect to their standard home page, or otherwise try to subvert the rules of the protocol. But here the mailing-list analogy extends to the provision of kill-lists: some way of arranging things so that if enough* people place a particular responder on their cheaters-list, then that responder gets de-registered, as a mechanism for keeping responders honest.

      This may sound a little like various other things around, say Froogle, but it's not. First, I've read sincere and reasoned discussion that worries whether Google isn't becoming rather too powerful. I'm also not sure, but doesn't Froogle take money to place your goods/services on the index?

      The whole idea of there being a central registry, or a for-money service negates the purpose of the protocol. Whilst I would want the protocol to cater for the distribution of commercial information, it should not be limited to, nor dominated by it.

      So, rather than a central server that would require hardware on which to run, and maintenance staff, and salaries, and benefits packages et al., why not utilise the power of Kazaa-style distributed file-sharing protocols? With a suitably defined and simple protocol, leveraging existing experience with things like ftp/html/smtp etc., it should be easy to produce simple clients that would distribute the database in such a way that there is no need for centralisation and all the overheads that brings with it. Every client becomes a part of the distributed service.

      Help needed

      That pretty much concludes the inspiration and justification for the idea. However, I am having considerable difficulty trying to whittle it down to a single goal. Part of the idea of the parent post is to allow collective thinking to be brought to bear on such problems, so I am going to leave the definition of the goal open for now and settle for a loose set of judgement criteria as the starting point. Maybe, if this thread and this post grab enough mindshare to interest people, both of these will be refined over time to better reflect my aspirations for the idea.


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "Think for yourself!" - Abigail
      "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
        The semantic web folks are trying to do what you want. I have not followed what they are doing though, so I can't give you a sense of how viable it is or isn't. (But it certainly has attracted some interest.)
        Ignoring your suggestions for implementation for the moment, it sounds like what you want is a search engine that returns actual data, not meaningless web pages. Which sounds nice and all, but if it's a third-party program, wouldn't returning just the data be a copyright violation?

        Addressing your implementation ideas, at first read it sounds like you want all of this to be done manually?! You send an email to the list and everyone reads it and possibly responds?