Choosing a data structure for AI applications

by Ovid (Cardinal)
on Jul 16, 2002 at 01:46 UTC

Ovid has asked for the wisdom of the Perl Monks concerning the following question:

Why have I posted this?

As I mentioned in the thread AI in Perl - proof of concept, I would like to create a Prolog-like Perl module (perlog?). This would be useful for people who would like to learn about AI but leverage their Perl knowledge. I want to design the syntax to be similar to Prolog because Prolog is fairly easy to learn and tends to naturally model the underlying problem.

This is actually a pretty long post and covers some fairly tricky things so it might be more appropriate to have some sort of collaborative effort started to work on this (this is open source, after all :). However, past collaborative efforts that were started here seem to have fizzled, yet many people enjoy responding to isolated posts, so I'm considering making this a "collaborative" effort in the sense that I make weird, long-winded posts and monks try to help me out -- nothing new there, huh? If you like this idea, keep reading. If not, move along folks; nothing to see here.

Before reading the rest of this post, it might be helpful for you to read a Prolog tutorial, though this is by no means necessary.

Duplicate efforts

For those who may be concerned about duplication of effort, I have already checked out AI::Proplog (data structures are too simple), Language::Prolog::Interpreter (same problem), and Prolog::Alpha. The last module looks very interesting, but some of my objections about Prolog apply. There are other issues that I have with the last one, but they're beyond the scope of this post. I may return to the latter module and look at it more closely.

The basics of what needs to be implemented

Simple facts

Let's say I have a set of facts about the things that Ovid gives to others. These facts may resemble the following:

gives(ovid,money,irs).
gives(ovid,book,grep).
gives(ovid,book,kudra).

Note that everything starts with a lower-case letter. In Prolog, if an argument starts with an upper-case letter, it's a variable. This artificial constraint is part of the reason why I don't want to model Prolog too closely.

With the above set of facts (called a 'database', in Prolog), I can then ask questions. Let's say that I want to know if Ovid gives a book to kudra. I can ask this:

?- gives(ovid,book,kudra).

Prolog would respond "yes". However, if I ask the following:

?- gives(ovid,book,irs).

Prolog would respond "no". This is another reason why I would prefer to make something similar to Prolog while still being Perl. A Perl programmer would expect a true or false response, but the string "no" evaluates as true in Perl.

Now, let's say that I want to know to whom I would give books. I could ask this (note that the word "Who" is capitalized, making it a variable):

?- gives(ovid,book,Who).
Who=grep
Who=kudra

If you know Prolog, you realize that I skipped some stuff there. Deal with it :)

With the data structure that I used in the Proof of Concept post, asking questions like this (particularly with facts that reference other facts) results in my using a lot of map and grep statements. With only a few facts, this is not an issue. However, true AI systems can have huge databases and this solution is not scalable. While I realize that some might scream "premature optimization", I submit that I have a known systemic performance issue and since I will be eventually auto-generating much of the code based upon the underlying data structures (trust me for now, it's mandatory), I need to work hard up front on this one piece of optimization. I have other ideas for improving performance, but I don't intend to implement them yet, as they can be put in place after I know I have working code.

One idea that I've considered is simply using HoH and not using the values:

my %facts = (
    'gives' => {
        'Ovid' => {
            'money' => { 'IRS' => undef },
            'book'  => { 'grep' => undef, 'kudra' => undef }
        },
        'kudra' => {
            'book' => { 'ovid' => undef }
        }
    }
);

With that, to find out who Ovid gives books to, the code is simplified - ignoring for the moment that we need to check for the existence of all hash keys to avoid auto-vivifying data.

my @recipients = keys %{$facts{gives}{Ovid}{book}};

Now, that's not much of a win, but it is when we're looking up a single key. If we want to know if Ovid gives books to merlyn:

if ( exists $facts{gives}{Ovid}{book}{merlyn} ) { ... }

With my previous data structure, I would have had to iterate over all of the recipients. Clearly this is not scalable. If we have a database of tens of thousands of facts (large AI systems often have millions of facts) then the ability to quickly isolate a single fact is crucial.
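The auto-vivification caveat can be handled by walking the hash one level at a time. The following is only a minimal sketch: fact_exists is a made-up helper name, not part of any module, and the data is a cut-down copy of the structure above.

```perl
use strict;
use warnings;

my %facts = (
    gives => {
        Ovid => {
            money => { IRS => undef },
            book  => { grep => undef, kudra => undef },
        },
    },
);

# fact_exists() is a hypothetical helper, not part of any module.
# It checks one level at a time, so a failed lookup never
# auto-vivifies the intermediate keys.
sub fact_exists {
    my ( $node, @path ) = @_;
    for my $key (@path) {
        return 0 unless ref $node eq 'HASH' && exists $node->{$key};
        $node = $node->{$key};
    }
    return 1;
}

my $known   = fact_exists( \%facts, qw(gives Ovid book kudra) );
my $unknown = fact_exists( \%facts, qw(gives merlyn book Ovid) );

print "Ovid gives a book to kudra: ",  ( $known   ? "true" : "false" ), "\n";
print "merlyn gives a book to Ovid: ", ( $unknown ? "true" : "false" ), "\n";
```

A plain `exists $facts{gives}{merlyn}{book}{Ovid}` would have created `$facts{gives}{merlyn}` as a side effect; the helper leaves the database untouched.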

This still doesn't quite work, though. For example, what if I want to have a list of everyone who gives books to others, regardless of the recipient? In Prolog I would ask this:

?- gives(Who,book,_).
Who=ovid
Who=kudra
yes

Note: The final underscore says "I don't care who gets the book". Also, associating the results with a variable ("Who") is known as "Unification".

In Perl, I'm back to an iterative solution.

my @givers;
foreach my $person ( keys %{$facts{gives}} ) {
    push @givers => $person
        if exists $facts{gives}{$person}{book};
}

Hmm... this is still better, but not good enough. Unification is one of the thorniest problems with trying to implement Prolog in Perl. It's not difficult to do. It's difficult to do quickly.
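To make the "not difficult, just slow" point concrete, here is a rough sketch of matching a flat query against a list of facts. It assumes facts are stored as plain tuples and borrows Prolog's convention that a capitalized term is a variable; the query helper is hypothetical. Note that it scans every fact, which is exactly the cost a better data structure would need to avoid.

```perl
use strict;
use warnings;

# Facts stored as flat tuples: [ predicate, arg1, arg2, ... ].
my @facts = (
    [qw(gives ovid money irs)],
    [qw(gives ovid book grep)],
    [qw(gives ovid book kudra)],
    [qw(gives kudra book ovid)],
);

# query() is a hypothetical helper. A term starting with an upper-case
# letter is a variable and "_" matches anything. It returns one hash of
# variable bindings per matching fact.
sub query {
    my @pattern = @_;
    my @results;
    FACT: for my $fact (@facts) {
        next unless @$fact == @pattern;
        my %binding;
        for my $i ( 0 .. $#pattern ) {
            my ( $want, $have ) = ( $pattern[$i], $fact->[$i] );
            next if $want eq '_';
            if ( $want =~ /^[A-Z]/ ) {
                # A variable bound earlier in this pattern must agree.
                next FACT
                    if exists $binding{$want} && $binding{$want} ne $have;
                $binding{$want} = $have;
            }
            elsif ( $want ne $have ) {
                next FACT;
            }
        }
        push @results, \%binding;
    }
    return @results;
}

# gives(Who, book, _), with duplicate givers removed.
my %seen;
my @givers = grep { !$seen{$_}++ }
             map  { $_->{Who} } query(qw(gives Who book _));
print "Who=$_\n" for @givers;
```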

This solution also has the side effect of "randomly" ordering the facts in the database. I'm not yet sure if this is an issue. Some of the designs I have considered involve binary searches or weighted searches, so clearly a hash structure would have to go.

Compound structures

So, it appears that I haven't solved my problem, but it gets worse. What if I want to know what the title of the book is? In Prolog, I might embed another fact in the first fact.

gives(ovid,book(learning_perl), merlyn).

Now that I've given merlyn a book, I can find out what book I gave him.

?- gives(ovid, book(Title), merlyn). Title=learning_perl

Essentially, I can embed facts in other facts. This, to me, suggests "references". What if kudra, anxious about merlyn's faltering ability with Perl, also wants to loan him the llama?

gives(kudra,book(learning_perl),merlyn).


As far as Prolog is concerned, both kudra and Ovid have given merlyn the same book, so I can store a reference to this book in one spot and have both facts refer to it (if I want to ensure that Prolog knows these are distinct books, I have to add more information to the book fact).
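One way this could look in Perl, as a sketch only: each term is a hash with functor and args fields (those field names are assumptions for illustration), and both facts hold a reference to the same book term.

```perl
use strict;
use warnings;

# One anonymous hash stands for the term book(learning_perl).
# The "functor" and "args" field names are illustrative only.
my $llama = { functor => 'book', args => ['learning_perl'] };

# Both facts point at the same structure instead of copying it.
my @facts = (
    { functor => 'gives', args => [ 'ovid',  $llama, 'merlyn' ] },
    { functor => 'gives', args => [ 'kudra', $llama, 'merlyn' ] },
);

# Comparing references numerically tests whether they are the same object.
print "same book\n" if $facts[0]{args}[1] == $facts[1]{args}[1];
print "Title=$facts[0]{args}[1]{args}[0]\n";
```

To record two distinct copies of the book, each fact would need its own term (for example with an extra argument identifying the copy), mirroring the extra information Prolog would need.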


If you're beginning to suspect that there are some Lispish aspects of all of this, let me further confirm your suspicions. We can also manipulate lists in Prolog. What if I want to specify that three different items are in a kitchen drawer?

location( kitchen_drawer, [ knife, fork, kitten ] ).

Without going into all of the semantics of lists in Prolog (confession: I'm relearning some of this as I go -- not a good thing to be doing when I'm trying to reimplement this in Perl :), you can see how that also suggests a reference. And, naturally, the elements of a list can be constants, variables, facts, or other lists.

What data structure(s) do I use?

Note: I have no illusions that I can design a system suitable for large AI applications. Perl is simply not going to be fast enough to handle databases of millions of facts with thousands of rules, particularly when backtracking and recursion cut in. But I think that this might be practical as a learning tool.

As for data structures, I would kill to have Perl6 pairs, as they would solve my problems. Since Perl6 is not an option at this time, I would be willing to consider some sort of Inline::C solution. I haven't tested it, but if I could make a hash key bi-modal so that it represents both a string and an integer, I could use the integer as an array index to pick up additional properties about a given key.

Unfortunately, this has problems with mathematical functions.

positive(Number) :- Number > 0.

Another possibility would be to use two data structures. One would contain the original facts and the second would contain meta-data about those facts. The issue with this solution is figuring out how to relate the two data structures. Perhaps I could embed a "key" in the facts hash and parse out the key every time I grab data, but that seems like an ugly solution.

I know that other solutions are available. My main concern is that these other solutions be fast and relatively straight-forward. By straight-forward, I mean they can be complex, but if they have a bunch of special cases, they will be much more likely to be buggy. If I have to test if something is a scalar, a hash ref, an array ref, a sub ref, etc, then it's probably not the best solution. On the other hand, if I have to gather data from more than one source, but the data acquisition remains fairly constant regardless of the data that I'm accessing, then this is workable. Slow, perhaps, but workable.

One final note: there needs to be some persistence mechanism for this to be truly useful. I don't think anyone is going to be terribly interested in AI programming which requires that the database or rules be recreated every time the program is run.


Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

Replies are listed 'Best First'.
Re: Choosing a data structure for AI applications
by Anonymous Monk on Jul 16, 2002 at 06:23 UTC
    You don't really need to map from prolog to perl, it's pure semantics. You need to map from prolog to SQL via perl. Once you have achieved an effective mapping that way (possibly via a database schema compiler followed by a runtime interpreter or something), then you have achieved two things:

    1. Gained all the benefits of advanced database optimisation in your fact storage and lookup

    2. Achieved in one fell swoop persistence, load balancing, and external non-perlog access to the facts.

    That said, it still isn't gonna be easy. Good luck :)

      Gah! I hate it when people post intelligently but do so anonymously. It suggests to me that they're not going to hang around the monastery. I certainly hope you do.

      Actually, last night I was doing quite a bit of thinking (and reading) about this problem and using a database seemed to be one of two solutions. Actually, a multi-value database like UniData would be awesome (it would handle lists very easily), but I don't know of any open-source versions.

      The other thought was to use directed graphs with adjacency lists. I think that would be workable and I could use DB_File to handle persistence. However, I am much more familiar with database theory so I think that would be better (and probably allow more people to contribute). The nice thing about databases is that I think they would solve the Unification problem very easily, if I can come up with an easy scheme to create them on the fly (I'm seeing more deviation from Prolog already).



        I post here all the time but I never remember to log in and I keep killing all my cookies :)

        The problem thus far with the discussion is that everyone is talking like using a database costs you something, as if it's slower than a perl hash or some more complex in-ram data structure.

        The truth is that, for a dataset that can fit reasonably in ram, perl is quicker. But once you talk about anything running into the millions of facts, a real DBMS will be unbelievably helpful. Indexing is a huge win, for one, and they do their own ram-based caching as well.

        So, let us consider that, short of re-inventing a specialist database engine with many of the features that a DBMS already provides, a DBMS would be the most effective fact store for any reasonably sized factset.

        Your next problem is how to effectively generate database schema in such a way that you get the most out of indexes, ordering and joins within the database without having to replicate data all over the place in different forms (an option, but an unhappy one).

        The first thing is to understand that you want a powerful RDBMS. The object databases, PostgreSQL included, have many benefits, and it's possible that one could be appropriate, but in my opinion the area of relational systems has been the most examined and they have the best combination of performance and flexibility (postgresql isn't shabby though, so as a free backend it may well be worth it).

        From this you would then want to read up on reasonably advanced SQL, notably views, indexes, subselects and joins, and the performance parameters inherent in them. Many features of SQL will map directly to features of prolog with the relevant schema, for example a table of facts could represent, in two columns, the IS A relationship and retrieving the facts is a set of trivial SQL statements.

        Unfortunately, I don't know enough prolog to take the analysis further, but it would seem to me that, if you're serious about doing big performance-intensive stuff, which is where expert-system AIs become really valuable, that the RDBMS backend is the only real option.

Re: Choosing a data structure for AI applications
by rjray (Chaplain) on Jul 16, 2002 at 06:20 UTC

    This isn't a fully-thought-out, comprehensive solution, but rather something that occurred to me while reading your examples. So there may be better arguments for or against this approach...

    What strikes me most about the predicates you describe is that they seem to lend themselves to a matrix-like structure. Granted, in your examples you are using names/strings rather than numbers, but that is a minor detail, comparatively speaking. If you can handle converting queries to and from matrix indices (or vector extractions, or vector operations, etc.) then you may find that PDL may be a great boon to your efforts. As an added plus, it handles sparse matrices, which I seem to remember being a key implementation component of matrix-based AI.

    But it's been 13 years since I took an AI class, or even really read up on the topic. This is just a thought.


      I don't claim to understand matrices very well, but they seem like a bad solution for this. My understanding is that a matrix will require two bits per connection between two vertices (O(|V|^2)). That's for all possible connections. If I have a modest size database of 20,000 facts with an average of 5 vertices per fact, that's 100,000 bits squared. That works out to around 1192 megs of data to store the matrix. If the number of edges (|E|) is extremely large, then matrices start being more attractive, but I suspect that graph representations of AI systems are going to be rather sparse (someone feel free to contradict me on this), which suggests that adjacency lists are preferable.

      If I go with adjacency lists, the memory consumption is O(|V|+|E|). This will scale much better.
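The arithmetic behind those two estimates can be checked with a few lines. The fact and vertex counts come from the post; the average of four edges per vertex is purely an assumed figure to give the adjacency-list side something to count.

```perl
use strict;
use warnings;

my $facts             = 20_000;
my $vertices_per_fact = 5;
my $vertices          = $facts * $vertices_per_fact;    # 100,000

# Adjacency matrix: one bit for every possible pair of vertices.
my $matrix_bits = $vertices**2;
my $matrix_mib  = $matrix_bits / 8 / 1024 / 1024;
printf "matrix: %.0f MiB\n", $matrix_mib;

# Adjacency lists: O(|V| + |E|) entries.
# ASSUMPTION: four edges per vertex, chosen only for illustration.
my $edges        = 4 * $vertices;
my $list_entries = $vertices + $edges;
printf "adjacency lists: %d entries\n", $list_entries;
```

The matrix grows with the square of the vertex count regardless of how many edges exist, while the list representation grows only with what is actually stored.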



        This is why I made reference to PDL's support of sparse matrices. The in-memory storage is much more manageable, while still being able to do all the other operations you need.


Re: Choosing a data structure for AI applications
by shotgunefx (Parson) on Jul 16, 2002 at 09:33 UTC
    Damn you Ovid, damn you! You've gotten me hooked on this and I'm not getting any of my "real" work done.

    One of the things really bugging me is how to implement isa hasa taxonomy in an efficient manner. I think it is rather important to have to avoid redundant specification of facts and improve generalization.
    (terrier is=>canine | canine has=>hair is=>mammal | mammal is=>warmblooded)
    is('terrier', 'warmblooded') && has('terrier','hair');
    I think a centralized database of facts is the way to go as far as storage as everything is an instance of something. In my original reply to your post I took that tack.
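A minimal sketch of how such an is/has taxonomy might be walked, assuming each thing stores only its direct parents and direct properties (the %taxonomy layout and the ancestors helper are illustrative, not from any existing module):

```perl
use strict;
use warnings;

# Each thing lists its direct parents ("is") and direct properties ("has").
my %taxonomy = (
    terrier => { is => ['canine'] },
    canine  => { is => ['mammal'], has => ['hair'] },
    mammal  => { is => ['warmblooded'] },
);

# ancestors() is a hypothetical helper: it returns the thing itself plus
# everything reachable through "is" links, guarding against cycles.
sub ancestors {
    my ($thing) = @_;
    my %seen;
    my @queue = ($thing);
    while ( defined( my $current = shift @queue ) ) {
        next if $seen{$current}++;
        my $node = $taxonomy{$current} or next;    # no auto-vivification
        push @queue, @{ $node->{is} || [] };
    }
    return keys %seen;
}

sub is {
    my ( $thing, $class ) = @_;
    my $count = grep { $_ eq $class } ancestors($thing);
    return $count ? 1 : 0;
}

sub has {
    my ( $thing, $property ) = @_;
    for my $ancestor ( ancestors($thing) ) {
        my $node = $taxonomy{$ancestor} or next;
        return 1 if grep { $_ eq $property } @{ $node->{has} || [] };
    }
    return 0;
}

print "yes\n" if is( 'terrier', 'warmblooded' ) && has( 'terrier', 'hair' );
```

Facts are stated once at the most general level and inherited on lookup, at the cost of a graph walk per query; caching the ancestor sets would be one way to trade memory for speed.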

    I don't know prolog but I would think that location and contents of the kitchen drawer would best be described as properties of the "kitchen drawer" and not of location. It appears to me that HoA or HoH would be the natural choice for the X does Y relationship of the data. Personally I would probably set up index hashes that map the relationships the other way to speed things up. You could just use SQL but I would think generating all the relevant tables would be a big pain (which might be worthwhile depending on your applications) with all the flavors of DBI. I would probably stick to Berkeley DB but that's just me.
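The index-hash idea could be sketched like this, using the gives facts from the root post. The add_gives helper and the %gives_by_item name are invented for the example; the point is only that both hashes are updated together.

```perl
use strict;
use warnings;

my ( %gives, %gives_by_item );

# add_gives() is a hypothetical helper that keeps two hashes in step:
# one keyed giver -> item -> recipient, the other item -> giver.
sub add_gives {
    my ( $giver, $item, $recipient ) = @_;
    $gives{$giver}{$item}{$recipient} = undef;
    $gives_by_item{$item}{$giver}     = undef;
}

add_gives(qw(ovid money irs));
add_gives(qw(ovid book grep));
add_gives(qw(ovid book kudra));
add_gives(qw(kudra book ovid));

# gives(Who, book, _) becomes a single hash lookup instead of a loop
# over every giver.
my @givers = sort keys %{ $gives_by_item{book} || {} };
print "Who=$_\n" for @givers;
```

Each extra query shape needs its own index, so memory use and insert cost grow with the number of shapes supported.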


    "To be civilized is to deny one's nature."
Re: Choosing a data structure for AI applications
by samtregar (Abbot) on Jul 16, 2002 at 07:50 UTC
    Want pairs in Perl 5? Use Guile.

    Ok, that was silly. But probably no sillier than building a Prolog workalike in Perl, in the grand scheme of things.

    Get it, SCHEME of things?

    I slay me.


Re: Choosing a data structure for AI applications
by talexb (Chancellor) on Jul 16, 2002 at 13:25 UTC
    Very interesting.

    So far it looks like there are two possibilities -- store your 'database' in a database, or store it in some kind of HoH & meta-data store. You have the additional complication of 'duplication', re: the book you gave merlyn is later given to kudra.

    I'm not that familiar with Prolog but it seems that you would need to specify that the book has the attribute 'real -- cannot be given to more than one person at once' (which reminds me of a software company's license -- but I digress) as opposed to 'imaginary -- can be given to as many people as you like' (directions, blessing, a kiss).

    Getting back to the database/data & metadata discussion, I'd probably lean towards the second choice simply because you could model it with OO to enforce a certain way of storing the data .. among other things, this would allow you to enforce the 'can only be given once' rule mentioned in the previous paragraph.

    Doing this in some kind of matrix arrangement also seems intuitively obvious .. your example suggests you'd want to check a three-dimensional matrix or take a two dimensional slice of a 3D matrix. Naturally in the implementation you'd want a few more dimensions than that.

    Finally, as far as persistence is concerned, it should be possible to dump the 'workspace' (to use an APL term) then be able to load it back up when the next session starts. I haven't used Data::Dumper but it sounds like it would be right for the job.
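A rough sketch of that dump-and-reload cycle with Data::Dumper follows. The file is a temporary one created with File::Temp so the example does not clobber anything; the fact data is a cut-down stand-in for a real workspace.

```perl
use strict;
use warnings;
use Data::Dumper;
use File::Temp qw(tempfile);

my %facts = (
    gives => { ovid => { book => { grep => undef, kudra => undef } } },
);

# Dump the workspace as Perl source text ("$VAR1 = { ... };").
local $Data::Dumper::Sortkeys = 1;
my ( $fh, $filename ) = tempfile();
print {$fh} Dumper( \%facts );
close $fh or die "close failed: $!";

# Next session: "do FILE" runs the dumped code and returns its last
# value, which is the hash reference that was assigned to $VAR1.
my $restored = do $filename;
die "reload failed: " . ( $@ || $! ) unless ref $restored eq 'HASH';

print "restored\n" if exists $restored->{gives}{ovid}{book}{kudra};
unlink $filename;
```

Reloading this way executes the file as Perl code, so it is only suitable for dumps you trust. The core Storable module (store and retrieve) is an alternative that avoids evaluating code.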

    --t. alex

    "Mud, mud, glorious mud. Nothing quite like it for cooling the blood!"
    --Michael Flanders and Donald Swann

Re: Choosing a data structure for AI applications
by Hanamaki (Chaplain) on Jul 16, 2002 at 10:10 UTC
    I would like to create a Prolog-like Perl module (perlog?).

    I am looking forward to this module. BTW, probably the first time the word perlog came up in this context was on the perl-ai mailing list. Feel free to use the name perlog because I do not claim any copyright on this brand.

    ...Hanamaki eq riechert

      Shoot! I should have done a search on the 'Net before I tossed that out. Thanks for use of the name. If this ever gets anywhere, I'll be sure to credit you.

      In reading your post, I see that you were discussing issues with Terrence Brannon. His was one of the modules that I was considering but ultimately had to discard as "not there yet". It would have been nice to see more work done on that.



Re: Choosing a data structure for AI applications
by dragonchild (Archbishop) on Jul 16, 2002 at 15:02 UTC
    Anon has it right when s/he suggests a DB to build an internal "database". If you're looking for schemas, I'd look at doing a traits-type thing. Essentially, you'd have a table of things ("giving", "ovid", "book", "merlyn") and you'd have another table to associate them together as attributes of each other.

    You'd also have to have subordinate relationships, ala "This book has this title" ... "All mammals have hair" ... etc. Those statements already have easy translations to SQL.

    This reminds me of Aristotelian logic and those circle diagrams. "All mammals have hair." "All mammals give birth live." "Some fish give birth live." "Do some fish have hair?" and be able to solve that.

    (Yes, I'm rambling, but this is a wonderful topic.)

    Does Prolog have "and" or "or" capabilities? Those could be useful. However, I would be very careful about working "if-then" of any sort in just yet. If you're curious why, we can get a nice OT meditation going, but it's beyond the scope of this thread.

    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

      While driving into work I realized the problem with a pure database solution: backtracking. If I have to do much backtracking across several tables, disk access times are going to completely kill the performance. Imagine trying to issue a new SQL call every time I need to backtrack. Yuck. Of course, I can cache the SQL statements, but the only way to get around this problem (that occurs to me right now) is to preload the data. For the issues with that, see the root post of this thread :)

      I think for small to medium size datasets, the database solution may be okay, though. I'll have to play with it.



        Can you give an example of backtracking issues that might arise?

        We are the carpenters and bricklayers of the Information Age.

        Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

        DBI can cache the results of db queries as well. There is also a RAM DB driver; if I were you, I would stay compatible with it so that people can use your system from scripts without dealing with an RDBMS.
Re: Choosing a data structure for AI applications
by paulbort (Hermit) on Jul 16, 2002 at 17:11 UTC
    A database solution can be had without backtracking if it's designed properly. I think the right combination of references and indirection would do it. If the first table listed the object's name, a unique ID# of some sort, and bits to indicate if it was an object, operator, or both, the second table would simply be tuples of object, operator, object. Index all three columns separately on the second table, and any of the sample queries you describe should be straightforward.
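The two-table layout described here can be prototyped in memory before committing to a database. The sketch below is an assumption-laden stand-in: plain Perl hashes and arrays play the role of the tables and their three indexes, the object/operator flag bits are left out, and all sub names are invented.

```perl
use strict;
use warnings;

# "First table": every name gets a numeric ID.
my ( %id_of, @name_of );

sub id_for {
    my ($name) = @_;
    unless ( exists $id_of{$name} ) {
        push @name_of, $name;
        $id_of{$name} = $#name_of;
    }
    return $id_of{$name};
}

# "Second table": rows of [ object, operator, object ] IDs, plus one
# index per column mapping an ID to the row numbers that contain it.
my @tuples;
my @index = ( {}, {}, {} );

sub assert_tuple {
    my @ids = map { id_for($_) } @_;
    push @tuples, \@ids;
    push @{ $index[$_]{ $ids[$_] } }, $#tuples for 0 .. 2;
}

# select_tuples() takes one value per column; undef means "any".
sub select_tuples {
    my @pattern = @_;
    my @want;
    for my $col ( 0 .. 2 ) {
        next unless defined $pattern[$col];
        return unless exists $id_of{ $pattern[$col] };    # unknown name
        $want[$col] = $id_of{ $pattern[$col] };
    }

    # Use the index of the first bound column to narrow the candidates.
    my ($first) = grep { defined $want[$_] } 0 .. 2;
    my @rows = defined($first)
        ? @{ $index[$first]{ $want[$first] } || [] }
        : ( 0 .. $#tuples );

    my @found;
    ROW: for my $row (@rows) {
        for my $col ( 0 .. 2 ) {
            next ROW
                if defined $want[$col] && $tuples[$row][$col] != $want[$col];
        }
        push @found, [ map { $name_of[$_] } @{ $tuples[$row] } ];
    }
    return @found;
}

assert_tuple(qw(terrier is canine));
assert_tuple(qw(canine has hair));
assert_tuple(qw(canine is mammal));

print join( ' ', @$_ ), "\n" for select_tuples( 'canine', undef, undef );
```

In an actual RDBMS the same shape would be a names table and a three-column tuple table with an index on each column, with the query planner choosing which index to use.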

    To address the problem of 'real' objects only being in one place at a time, that's a programming issue: The object ID# in question has records that indicate it is real, and with Kudra. The definition for real things could specify that they can only be with one person at a time. Before adding a new assertion that the book is with someone, the code would check whether it can be with more than one person at once.

    The data structure isn't the hard part. Figuring out the tuples to put in the database is. I'm not sure what format OpenCYC is in, but it might be a good start, depending on what you want your AI to do.

    Spring: Forces, Coiled Again!
Re: Choosing a data structure for AI applications
by John M. Dlugosz (Monsignor) on Jul 16, 2002 at 17:38 UTC
    My impression when you talked about a fully-populated hash was of a lazy hash made using a tie, instead.

    I think book vs. book(learning_perl) is fundamental. A fact is not a simple symbol, but a nested structure that is to be matched, possibly with a pattern. Furthermore, this is the same structure as a rule such as gives.
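Matching a nested structure against a pattern might be sketched as below. Terms are assumed to be array references of the form [ functor, args... ], capitalized strings in the pattern are treated as variables, and match is a hypothetical one-way matcher (variables may appear in the pattern only), not full unification.

```perl
use strict;
use warnings;

# match() is a hypothetical one-way matcher. On failure the bindings
# hash may be left partially filled, so callers pass a fresh one per fact.
sub match {
    my ( $pattern, $term, $bindings ) = @_;

    # Variable: bind it, or check it against an earlier binding.
    if ( !ref $pattern && $pattern =~ /^[A-Z]/ ) {
        return match( $bindings->{$pattern}, $term, $bindings )
            if exists $bindings->{$pattern};
        $bindings->{$pattern} = $term;
        return 1;
    }

    # Compound term: same length, and every element must match.
    if ( ref $pattern eq 'ARRAY' ) {
        return 0 unless ref $term eq 'ARRAY' && @$pattern == @$term;
        for my $i ( 0 .. $#$pattern ) {
            return 0 unless match( $pattern->[$i], $term->[$i], $bindings );
        }
        return 1;
    }

    # Two atoms.
    return !ref $term && $pattern eq $term;
}

my @facts = (
    [ 'gives', 'ovid',  [ 'book', 'learning_perl' ], 'merlyn' ],
    [ 'gives', 'kudra', [ 'book', 'learning_perl' ], 'merlyn' ],
);

# gives(ovid, book(Title), merlyn)
my $query = [ 'gives', 'ovid', [ 'book', 'Title' ], 'merlyn' ];

for my $fact (@facts) {
    my %bindings;
    print "Title=$bindings{Title}\n" if match( $query, $fact, \%bindings );
}
```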

    Have you looked at CLIPS? It's been decades since I read the manual, but forgetting the forward-vs-backward chaining issues, I'm thinking that the way it represents knowledge and gets this into a C program would be worth looking at. It's easier to build up ad-hoc data structures in Perl...

    I think that knowledge representation is key. It doesn't matter whether it's persisted and how, but simply decide how everything is hooked up, to represent arbitrary rules and attributed facts and variables.

Re: Choosing a data structure for AI applications
by mattr (Curate) on Jul 17, 2002 at 13:45 UTC
    Dear Ovid,

    I spent a good number of hours experimenting after reading your last post about thieves etc. Unfortunately since I had two deadlines I didn't have time to finish or to post my work in time. (and also it didn't work completely). This is great stuff, though everyone seems to be talking about databases all of a sudden.

    I've looked around at reasoning code for use with Perl several times, though you probably know about the AI tool from NASA (CLIPS, thanks for the link John) and also CYC (also someone linked before I read the thread..). Perhaps you were prodded by a recent news story about Dr. Howard Cross' Problem Knowledge Coupler for medical diagnoses.

    Anyway, some quick thoughts on what you've mentioned so far.

    # each actor starts as an empty hash so properties such as "owns" can be attached
    %actor = ( merlyn=>{}, kudra=>{}, ovid=>{}, badguy=>{} );
    %thing = ( gold=>50, pocketLint=>0.01, cheapWhiskey=>-20, dominationFund=>10E10 );
    $actor{merlyn}{owns} = {gold=>1};
    $actor{kudra}{owns} = {dominationFund=>1};
    $actor{ovid}{owns} = {pocketLint=>1, cheapWhiskey=>1};
    or maybe one day, &addfact qw(merlyn owns (gold of value 50));

    What I first tried to do was implement the Who's a Thief problem using Damian Conway's Quantum Superpositions module, since it sounded like his any() and all() might be useful. Good for consciousness jokes too :) actually I had tried to apply it with an ingenious but only 80% implemented (too small to fit in this margin) solution to the question about filling a table with categories. The neat thing in both cases is that after setting up the data, the final answer usually is only one to three lines of what looks like very intuitive English-like phrasing.

    I wanted to be able to say,

    $x = any($actor);
    $y = any($thing);
    &rich($x) if (&owns($x,$y) and &isvaluable($y));
    I had difficulty using it, for example I don't think any(actor)->owns{item} syntax worked. But it was seductive because it looked just like your prolog statements, except for all that punctuation. I played around with these kinds of statements:
    if ($x{owns}{$y} && $thing{$y}{value}>20) {
        print "The badguy will steal $x\'s $y.\n";
    }
    if ($item{value} > 20) {
        print "The badguy will steal from $potentialvictim\n";
    }
    if ( $(any(@{$x->{owns}})){value} > 20) {
        print "badguy will steal from $x\n";
    }
    # true when the value is over 20
    $isvaluable = sub ($) { my $x=shift; $x->{value} > 20 ? 1 : 0; };
    Here $item and $potentialvictim were supposed to be simultaneously multivalued. But (unless I misunderstood) some of this did not work because you actually have to superimpose two multivalued objects to get the superimposition, I think. Otherwise a lot of what you think should happen doesn't.

    Then I realized Quantum::Entanglement would be far more useful. This was also pretty difficult conceptually but I found it fascinating. In particular you might want to look at Entanglement's p_func which might be useful as a way to set rules operating on things. Also, the quantumlogic method looks extremely interesting and I have barely looked in its general direction yet.

    Well, I tried to say

    $x = wave(keys %actor);
    $y = wave(keys %thing);
    $item = wave($actor{$x}{owns}->{$y});
    where &wave would entangle its parameters like any(). (maybe this is how any() works, didn't check).
    sub wave {
        # given an array, return entanglement of elements
        my @k = @_;
        my @n;
        foreach (@k) { push(@n,(1,$_)); } # every element is 100% probable
        return entangle @n;
    }
    This would seem to let you easily change the probability of certain events happening. This module is extremely cool and brain bending, I recommend it.

    However unless I am wrong (quite possible), since all possibilities need to be well-defined you end up with just a big matrix solver instead of an elegant logic engine. But the need for intersection, union, and probably a "such that" phrase (a pipe in math or haskell, like x|y=1) seem useful. Maybe for/foreach does it but there seems to be quite a lot more cool stuff in Haskell.

    So I'd recommend checking out Set::Bag and Set::Scalar/Set::Scalar::Valued which seems pretty powerful and even overrides some symbols so you can take the intersection of sets which would help solve the thief problem. You can say things in Set::Bag like: $rich = owns(x,y) * valuable(y); # intersection

    I think a Perl AI engine would be very useful in making interesting services, for example I did some looking around for one when I was planning a product recommendation engine that would help a customer find the right product given certain needs. I'd be much more interested now in being able to ask intelligent questions and get back statistically sorted answers, than to save a hundred thousand facts in a database just yet. Or to have just a little bit of intelligence added to my coding to save me time. Maybe something that knows all the functions I've written and could whip something up in response to a well-typed phrase?

    By the way, about Cyc, if you haven't it might be interesting to read some of the messages in the forum in the sourceforge project page. A couple months ago I remember some very interesting questions.

    And, have you looked at Haskell and what do you think?

    Also, what if we had a project wiki for pm projects?

    Finally a few thoughts about your questions, at the risk of stating the obvious..

    - making a statement like you did about kudra and merlyn creates possible counterstatements (how many books does merlyn have) for example, which would seem to mean either doing a database search each time, or updating a hash of facts about each object every time you do anything. Perhaps you would like to simulate a universe of objects? It might make queries faster. Hence your pairs note, I'm sure.

    - For some reason I think we operate more like the Entanglement routine above and less like SQL. Not subatomically speaking, but that some facts stand out, or your interest is piqued by certain words. Perhaps simulating a person listening to those facts and recording a story about them in a graph would be a viable approach. Plus you could graph it with DataViz! :)

    - I like the ability to state questions in perl-friendly phrasing, like $kudra gives $book{learningperl} to $merlyn. And I'd like to have "such that" added to Perl. Looking back at Haskell and even it would seem that a hard mathematical approach is one way to do an interface, while another might be a more linguistic approach.

    - I'd like to suggest a general purpose reasoning module that would emphasize useful interface and queries and not speed or volume of facts.

    - Also as mentioned above, something which does some of the work for you in Perl development would be extremely useful.
    For example, a repository of logic about Perl programming and other development issues, with knowledge about glueing modules together might be a killer. It could chat with you about what you want to build, and.. well you can finish this one yourself. But if a semantic repository is available, TMTOWTDI will reign supreme!

    Anyway, something that greatly magnifies the strength of a Perl programmer, and makes perl itself more powerful even in a small way, would have a lot of applications. What are the one or two words you would like to see added to the language?

    Hope this has helped stir the coals and not put you to sleep. Much luck!

    Matt R.

Re: Choosing a data structure for AI applications
by raptor (Sexton) on Jul 16, 2002 at 17:06 UTC
    I'm just opening this node... I always wanted something like this :")... after reading the article I will comment more..
    keep working :")
Re: Choosing a data structure for AI applications
by Anonymous Monk on Nov 29, 2007 at 11:18 UTC
    I also have thought some on data structures for AI applications, which is why I'm here, reading your post.

    I think that humans are incredibly self-referential. For instance, we frequently use terms like 'My car', 'My house', etc... even when we are newly born, it's 'this is how I get nourishment'.

    Personally, I'm looking for a data structure that allows n links to a datum that has n links to other data, where n is an arbitrarily large number. Think of a dictionary, in which every word of every definition is linked to the definition of that word.

    Oh, I know some would say that's an excess that can't be sustained, but for true intelligence, it's a MUST. Consider, for a moment, the FedEx box on the corner of my desk. It's a cube (what's a cube?), its interior and exterior dimensions have size (what's interior, exterior?), there's an unknown object inside it (does interior have the same meaning, in this context, as 'inside'?). And I haven't even looked at surface properties, printed text, closure methods or dimensions.

    And this proposed data structure should be able to handle contextual linkages, as well. For instance the phrase 'I feel boxed in' is certainly not referring to the above mentioned FedEx container.

    I also know, that such a data structure, once populated would be arbitrarily large, and the linkages between data might occupy more storage space than the data that's being linked.

    I can justify the data structure mentioned above any number of ways. Yet, I feel that such a data structure is required to make a mere program truly intelligent.