SIR Chris Mellish This TEACH file tells you how to construct a simple program which can converse in English and also "learn" new information from what it is told. It is based loosely on the following program: Bertram Raphael SIR: A COMPUTER PROGRAM FOR SEMANTIC INFORMATION RETRIEVAL in SEMANTIC INFORMATION PROCESSING (ed M. Minsky, MIT Press, 1967) INDEX FOR THIS FILE: -- INTRODUCTION -- SENTENCE PATTERNS -- USING LISTS TO REPRESENT SENTENCES -- SENTENCE PATTERNS AND MATCHING LISTS -- REPRESENTING PROPERTIES OF OBJECTS -- DEFINING A RESPONSE PREDICATE -- ADDING EXTRA CONDITIONS -- MAKING THIS INTO A COMPLETE PROGRAM -- ADDING NEW FACTS -- ADDING NEW RULES -- SEEING WHAT SIR KNOWS -- DOES SIR REALLY UNDERSTAND? -- POSSIBLE EXTENSIONS -- PROBLEMS -- INTRODUCTION ------------------------------------------------------------ In this file, we will use the name SIR rather loosely to refer to any program based on the same principles as Raphael's SIR. Not everything that we say will necessary be true of the original SIR program. In order to construct this program, we will have to cover the following new features of Prolog: - Using LISTS to represent sentences and patterns - Using ASSERT to add facts to the database Here is an example of the sort of interaction that SIR can produce. The program accepts simple English statements and answers simple questions, making some inferences as necessary. In the following, lines starting with a question mark ("?") are typed by a person, and the other lines are typed by the program: ?- sir. ? Fred is a minister ok ? every minister is a man ok ? every man is a person ok ? Mary is a woman ok ? is Fred a person yes ? is Mary a person no ? every woman is a person ok ? is Mary a person yes ? bye yes Before you read on, think for a while about the above dialogue and try to answer the following questions: - What kinds of sentences must the program understand? - What kinds of information must the program be able to represent? - What kinds of inferences must it be able to make? (could you represent these by Prolog rules?) -- SENTENCE PATTERNS -------------------------------------------------------- The SIR program cannot respond intelligently to all possible English sentences. Rather there are particular sentence PATTERNS which it recogses and which evoke particular kinds of response. For instance, if we type: ... is a ... where the "..."s are any single words, then it assumes that we are stating a fact that the object named in the first "..." is of the kind named in the second "..." (eg "Fred is a fireman"). In response to this, the program must say "OK" and also store away somehow the information that has been given (eg the information that Fred is a fireman). On the other hand, in response to a sentence of the form: is ... a ... the program assumes that it is being asked whether the person named in the first "..." is in fact of the kind mentioned in the second "..." (eg "is Fred a fireman"). The correct response is now to say "yes" or "no" (according to whether Fred is a fireman or not). This involves somehow being able to see if some fact is true on the basis of what has been told to it so far. -- USING LISTS TO REPRESENT SENTENCES --------------------------------------- In order to write a SIR program, we must have some way of talking about sentences in Prolog. We will do this by means of LISTS. A list in Prolog is a collection of objects in a particular order, just as a shopping list is a collection of things to be bought written down in some particular sequence. We write lists by enclosing them in square brackets ("[" and "]") and separating the individual items by commas. Thus in Prolog we might write: [apples,bread,butter,cheese] to represent the shopping list: apples bread butter cheese In fact, the items that appear within lists can be Prolog atoms, numbers, variables, or even other lists! or now, we will concentrate on lists of atoms, such as the one above. How do we use lists in Prolog? Basically, we can use a list just like any other kind of object we are talking about (atom, number, ...), except that a list cannot be a predicate (the name of a relationship). So, we can have a list as an argument to a predicate. For nstance, we could represent the fact that we have a monthly shopping list with particular contents byfact like: monthly_shopping_list([beer,oil,jam,tomatoes]). We can then ask: ?- monthly_shopping_list(X). and get back: X = [beer,oil,jam,tomatoes] A sentence is a collection of words written down in a particular order, and so we can represent a sentence in Prolog quite straightforwardly as a list of atoms, where the atoms have the same letters as the corresponding English words. We must bear in mind, however, that Prolog's rules about what sequences of characters make up atoms are not the same as the rules for words in English. As long as we stick to words consisting solely of small letters, there won't be any problems with this. Exercise: Write down the Prolog lists representing the following sentences: mary had a little lamb the man came to see me again What would you have to type to represent the following sentence as a Prolog list? Mary didn't have chicken-pox How might you represent the sequence of jobs that people have had using lists in Prolog? (eg. John has been a carpenter, then a joiner, then a plumber; Mary has been a teacher, then a plumber, then a programmer). -- SENTENCE PATTERNS AND MATCHING LISTS ------------------------------------- In the SIR program, we want to be able to represent, not just particular sentences, but also sentence PATTERNS (such as "... is a ..."). We can do this by making lists with a mixture of atoms and VARIABLES. Recall that in Prolog a variable appearing in a fact or rule stands for some object that will be known when the program is run; in different uses of the fact or rule, it may stand for different things, however. Likewise, each time we say that a pattern like "... is a ..." applies to a sentence the "..."s will correspond to specific words; on different osions, the words will be different though. We represent a sentence pattern in Prolog just like an ordinary sentence, except that where the pattern expresses a "..." we put a variable. For instance, the pattern: ... is a ... can be represented by [X,is,a,Y] Note that we will normally specify different variables for different "..."s. If we specify the same variable, then we are saying that the "..."s must be the same word. For instance, the pattern represented by: [X,is,a,X] would apply to the sentence "john is a john" but not to the sentence "john is a man". Exercise: Write down the Prolog representation of the following patterns: is ... a ... who is a ... every ... is a ... Now that we are talking about putting variables in lists, we must be rather precise about how Prolog goes about MATCHING ls together. Remember that when Prolog has a goal to be satisfied, it has to choose a fact or a rule for the correct predicate and check that its "head" MATCHES the goal. This means that each argument of the goal must MATCH the corresponding argument in the head of the fact or rule. What happens if both of these are lists? In order to try to match two lists together, Prolog compares the corresponding elements of the lists one by one: - If the two elements are the same atom, then that is OK. - If one of the elements is an uninstantiated variale (a variable that doesn't stand for anything yet) and the other is an atom, then that is also OK, and the variable is made to stand for that atom. - If Prolog comes across two different atoms, or if it comes to the end of one list before the other, then the matching fails. If Prolog comes to the end of both lists at once without encountering a failure then the match is OK. For example, if the system is trying to match together the two lists: [a,b,c,d] [a,X,c,d,e] then the matching will successfully compare: - "a" in the top list with "a" in the bottom list - "b" in the top list with "X" in the bottom list (OK if X doesn't stand for anything yet. X will n stand for "b") - "c" in the top list with "c" in the bottom list - "d" in the top list with "d" in the bottom list but then it will find that one list goes on further than the other, and so the whole match will fail (and X will no longer stand for "b"). Exercise: For the following Prolog representations of sentences and patterns, write down whether the sentence matches the pattern and, if so, what values the variables in the pattern have afterwards. You can assume that all the variables start off uninstantid. Sentence Pattern [john,is,a,man] [X,is,a,Y] [is,john,a,man,?] [is,X,a,Y] [john,has,an,hotel] [X,has,a,Y] [john,is,a,bee,keeper] [X,is,a,Y] -- REPRESENTING PROPERTIES OF OBJECTS ---------------------------------------- Now that we can represent sentences and sentence patterns in Prolog, there is only one more thing to decide on before we can start writing a SIR program - how the program is to represent the information that we tell it in English. A lot of the information povided is about PROPERTIES that OBJECTS have. For instance, "John is a fireman" is telling us that John (the object) has the property of being a fireman; "John is fat" tells us that John (the object) has the property of being fat. We can represent properties of objects by a predicate 'has_property' with two arguments. Here are some facts that we might want the program to know initially about what objects have what properties: has_property(john,fireman). has_property(john,fat). -- DEFINING A RESPONSE PREDICATE --------------------------------------------- We are now going to start working on the central part of the SIR program - the part that decides what response is to be generated, according to what pattern the sentence has. For this, we provide clauses for a predicate 'response' with two arguments: response(Sentence,Reply) says that the response to the sentence 'Sentence' should be 'Reply'. The idea is that the program will invoke the 'response' predicate when it has read in a sentence (represented as a list of atoms) and needs to know the response. Each clause for 'response' will specify a PATTERN that the sentence should match and an appropriate response to that pattern. Prolog will search through the clauses in order until it finds one that matches the 'response' goal with the appropriate sentence. Here are some very simple clauses for 'response': response([can,computers,think],[yes]). response([what,is,the,weather,like],[quite,fine]). response(X,[i,dont,understand]). These say the following. If the sentence is "can computers think" then the response is "yes" (a sentence with only one word); if the sentence is "what is the weather like" then the response is "quite fine"; (otherwise) if the sentence is anything at all (if it matches an uninstantiated "X") then the response is "i dont understand". Exercise: Type these (or similar) clauses for 'response' into a file 'response.pl', and then try them out. That is, try asking Prolog questions like: ?- response([what,is,the,weather,like],X). Try asking for alternative solutions - you will see that we really only want Prolog to consider the first one What happens if the order of the above facts is reversed? Why does the last one have to come last? These clauses for 'response' specify quite simple sentence patterns. We can actually write more generally useful rules. For instance: response([what,is,a,X],[i,dont,know,the,word,X]). response([are,computers,X],[yes]). The first of these will answer the whole class of questions "what is a ..." by simply saying "i dont know the word ...". To see how it works, let's assume that we have asked the question: ?- response([what,is,a,computer],R). and let's look at what Prolog does in slow motion. To answer this question, it must look through the clauses for the 'response' predicate in turn until it finds one that "matches". Eventually it will come to the fact: response([what,is,a,X],[i,dont,know,the,word,X]). Now it has to see whether this fact matches the question. Working from left to right, Prolog sees that the two predicates are the same (so OK so far). Now do the two first arguments match? Since they are lists, Prolog must look at each element in turn. Here's what happens: "what" matches "what" (OK) "is" matches "is" (OK) "a" matches "a" (OK) "computer" matches "X" (OK) and "X" stands for "computer" both lists end together (OK) So the matching succeeds, and now X in this use of the fact stands for "computer". Now do the second arguments match? The second arguments are: R in the question [i,dont,know,the,word,X] in the fact Since R is uninstantiated (doesn't stand for anything yet), it will match anything, and so the match succeeds, with R standing for [i,dont,know,the,word,X] where X is the X in this use of the rule (which stands for "computer"). Having succeeded in matching both arguments, Prolog has successfully used the fact and comes back with the answer: R = [i,dont,know,the,word,computer] So we can explain why the program comes to the correct answer in terms of how Prolog does searching and matching. Now that we understand the basic principles, we don't have to go through this for every 'response' clause we write! Instead, we can usually just think about what fact or rule it is stating in English: response([what,is,a,X],[i,dont,know,the,word,X]). means roughly "if the sentence is of the form 'what is a ...' then the response should be 'i dont know the word ...'". Exercise: Add these (or similar) extra facts to your existing facts in 'response.pl', and have the complete set compiled. Try them out as before. -- ADDING EXTRA CONDITIONS ------------------------------------------------- If we ask a question "is john a fireman", there unfortunately isn't always the same answer - it depends on whether john is a fireman or not. To capture this in the program, we need to spy that the answer is "yes" only if john is a fireman. Remember that to do this the program must const its facts about 'has_property'. The clauses for this look as follows: response([is,Obj,a,Prop],[yes]) :- has_property(Obj,Prop). response([is,Obj,a,Prop],[no]). The idea is that, for the pattern "is ... a ..." the response "yes" is appropriate PROVIDED THAT the object (Obj) does indeed have the property (Prop). If the sentence has this pattern but the 'has_property' goal CANNOT be satisfied, Prolog will try the second clause and give the response "no". Note that these clauses MUST be in this order. What happens if they are in the other order, or if we ask for alternatives after the first solution? Exercise: Type in these two clauses (possibly in addition to those you already have for 'response'). Give some initial 'has_property' facts as well. Then try out the program by asking questions about 'response', eg. ?- response([is,john,a,fireman],X). -- MAKING THIS INTO A COMPLETE PROGRAM --------------------------------------- By now you will be getting fed up with typing lists to try out your program, and by having to try out responses one at a time. Having the program accept sentences without the brackets and commas and having it carry out a continuin dialogue are really "cosmetic" alterations to the program - they do not add to its basic power to understand and respond to sentences. So we have provided the necessary "cosmetic" parts of the program for you. The important part of the program is the 'response' predicate, and so the program we provide calls your definition of 'response' at the crucial stage. To make use of these extra bits of program, type ?- library(sir). to Prolog. Now load your program. Don't load the library program after your own. Then to run the complete program, type ?- sir. The program will continue to ask you for a sentence and give a response (according to r definition of 'response') until you type simply: bye to it. The program will then finish, and you can give another command to Prolog or VED. Try running your existing 'response' rules with the rest of the SIR program now. -- ADDING NEW FACTS ---------------------------------------------------------- Our program can now answer about facts that it already knows, but it cannot absorb new information. In response to a pattern like "... is a ..." we want it to add a new fact to its store of information about properties. To this, we use a special predicate called 'assert'. 'assert' is a special predicate in various ways: First of all, it is already defined when we run Prolog, and so we can use it without providing a definition ourself. Secondly, a goal involving 'assert' (with one argument) ALWAYS SUCCEEDS (except in some conditions where it might cause a "mishap"). Finally, the important thing about 'assert' is not WHEN IT IS TRUE, but is WHAT IT DOES. For whenever we satisfy 'assert' with one argument SOMETHING GETS DONE - the argument is added as a fact to the database. Exercise: Give Prolog some example questions using 'assert' and appropriate facts. Note that you don't have to type a dot iediately after a fact serving as an argument to 'assert'. Eg: ?- assert(has_property(fred,fireman)). Do the facts go at the beginning or the end of the database? What happens if you give 'assert' strange things as arguments? Des it ever fail? Using 'assert', we can provide a 'response' rule for what to do if the sentence has the pattern "... is a ...". The response is to be simply "ok" and a fact is to be added to the database: response([X,is,a,Y],[ok]) :- assert(has_property(X,Y)). There are several features of note about this rule. First of all, the 'assert' goal always succeeds, and so the response is always "ok" if the sentence matches this pattern. By the time the 'assert' goal is tackled, the variables X and Y will have become instantiated (through the matching of the pattern with the sentence). So it will add a fact about a SPECIFIC X and Y (the ones mentioned in the sentence). Secondly, in order to understand this rule, we have to know enough about how Prolog works to ee that a fact will be added to the database, and that this will happen AFTER X and Y get their values. Exercise: This rule will handle sentences like "... is a ...", but won't handle "... is an ...". Add a new rule to cover this, and test it on some examples. -- ADDING NEW RULES ---------------------------------------------------------- One of the most interesting features of SIR is that we can tell it new rules about the World (such as: "Every minister is a mn") and that it can use these rules to help it answer questions. You ought by now to have ideas about how these sorts of rules could be represented by Prolog rules. For simplicity, however, and as they are rules of a very restricted kind, we will represent them as Prolog FACTS. We will represent the rule "Every minister is a man" by the fact: implies(minister,man). 'implies' is a predicate with 2 arguments. We will take it to mean that the property in the first argument position "implies" that in the second: ie. everything that has the first property has the second. Given this decision, we can add the following rule to our program: response([every,X,is,a,Y],[ok]) :- assert(implies(X,Y)). We now have a problem, however. SIR can't use its information about 'implies' to help it answer questions. We need to be able to tell Prolog how to infer consequences from 'implies' facts and 'has_property' facts. That is, we have to tell it what properties it should be able to infer. The following short program is one way to do this: inferred_property(Person,Property) :- has_property(Person,Property). inferred_perty(Person,Property) :- implies(Other,Property), inferred_property(Person,Other). You don't have to understand this program in great detail, or even type it in, as this program is automatically added to the database when you do '?- library(sir).'. What it says can be roughly translated into English as follows: Property can be inferred to apply to some Person if we know explicitly that Property applies to Person (ie if it is a 'has_property' fact) Property can be inferred to apply to some Person if some other property Other "implies" Property and Other can be inferred to apply to Person For instance, the property "man" can be inferred to apply to "john" if we know explicitly that it applies to him. It can also be inferred to apply to him if we know that "minister" implies "man" and we can infer that "john" is a "minister". Given this definition, we now have a choice of how to tell SIR to answer questions. Should we tell it to answer by looking only at facts explicitly told to it (ie using 'has_property', as at present), or should we allow it to make inferences as well (ie use 'inferred_property')? If we decide on the latter, we must change some of our rules. For instance, one ofr rules should now look like this: response([is,Obj,a,Prop],[yes]) :- inferred_property(Obj,Prop). We have now got a program that can be told "rules" and can make inferences from them. Try it out osome examples to test this capability. -- SEEING WHAT SIR KNOWS ----------------------------------------------------- After a reasonably long session with SIR, the program will have built up a fair number of 'has_property' and 'implies' facts. You might want to see what these are and have them in the database right from the start in a later session. The SIR library progrprovides a special predicate for doing this. If you ask: ?- list. then the current facts about 'has_property' and 'implies' will be displayed at the end of a file called 'property.pl'. You can then send them to Prolog at some later stage by marking the range and typinCTRL-d. -- DOES SIR REALLY UNDERSTAND? ----------------------------------------------- We can use English to talk to SIR and, in the right circumstances, it produce sensible answers to questions. SIR is able, in a limited way, to learn new facts and rules, to make inferences and to carry out a conversation in English. So would it be appropriate to say that SIR "understands English" to some extent? Let us consider what understanding SIght have of the word "man". It's just a few facts saying that some objects have the property of being a "man" and that "man" is perhaps implied by or implies various other properties. SIR doesn't know what it is like to be a man, to see a man or to decide whether something is a man or not. As far as SIR is concerned, "man" might be a word referring to a kind of Martian or a kind of screwdriver. It is just manipulating it as a symbol according to the rules we have written. We could have typed the word "nam" everywhere instead of "man" and got the same results (try it!). The last paragraph is really suggesting that SIR does not understand because it doesn't have any sophisticated notion of what a word means and because there is nothing to link what it is doing to the real world. Of course, SIR was only designed to be able to process a particular subset of English, and it turned out that it could do this with very little machinery. We could imagine puttin a lot more Prolog facts and rules into the program, so that all sorts of information about "man" was given and so that the symbols manipulated were linked in some way to possible perceptual inputs (from a television camera?). At what stage could we then say that SIR "understood"? If SIR was developed s that it could engage in as good a conversation as you or I, could we then say that it "understood" (even if it had no extra perceptual inputs?), or would we have to look at the structure of the program to decide? -- POSSIBLE EXTENSIONS ------------------------------------------------------- If you are ambitious, you might think about extending the SIR program to "understand" things like the following: who is a man - ie. answer with the name of a man (who is a ...) john is happy - ie. some properties are expressed by adjectives (... is ...) every man has a head - ie. parts and possessions, as well as properties (every ... has a ...) the capital of france is paris - ie. attributes and values as well (the ... of ... is ...) -- PROBLEMS -------------------------------------------------------------- In Prolog, we can't express a pattern that will allow a variable number of words. This means that we can't write a single rule to cater for any of every mollusc is a snail every large mollusc is a snail every large shelled mollusc is a snail every large shelled marine mollusc is a snail ..... If we had some way of specifying that the input sentence had to be of this form, we would still have to have a way of spying the action to be performed by SIR, when the number of words was not known in advance. Another problem that we encounter if we wish to handle sentences of any length is that we really need to be more specific about the words that are matched against variables in patterns. If in some improved version of Prolog we could specify a pattern that would match all the above sentences, ie something like: "every ... is a ..." ^---------------- any number of words the pattern would match sentences like: every man smokes and john is a moron ................... every day i see a boy who is a coward ................... which clearly cannot be handled in the same way as the "mollusc" sentences, where the first "..." just matches a sequence of adjectives and nouns. So there must be a way of telling SIR what sequences of words are acceptable in this position. In looking at these examples, we have found that a really good SIR is going to have to know something about the STRUCTURE of English utterances - about what kinds of phrases there are and how a phrase divides up into subphrases. When we talk about utterances in this way, we are discussing the SYNTAX of the language concerned, here English. Traditionally, the syntax of a language is expressed by means of a GRAMMAR, which is a formal description saying what kinds of sequences of words are grammatical and what kinds of sequences are not. We could imagine building a program that built some kind of description of the structure of a phrase as it found it. Such a program would be called a PARSER. The question arises - would such a description be of any use? The answer that many people would give is: Yes, a description of the syntactic structure of an utterance is of fundamental importance for determining the meaning of the utterance in a principled way. The existence of computer programs capable of building such descriptions leaves the way open for a generation of language understanding programs that go significantly beyond SIR in their generality and usefulness...... --- File: local/plog/teach/sir --- Distribution: all --- University of Sussex Poplog LOCAL File ------------------------------