RSD-TR-5-87 A GEOMETRIC APPROACH TO COUNTERFACTUAL REASONING AND ITS APPLICATION TO MODELING MANUFACTURING SYSTEMS by Arch W. Naylor Mohammad B. Shadmehr The Department of Electrical Engineering and Computer Science The University of Michigan Ann Arbor, Michigan 48109-1109 March 1987 CENTER FOR RESEARCH ON INTEGRATED MANUFACTURING Robot Systems Division College of Engineering The University of Michigan Ann Arbor, Michigan 48109-1109

RSD-TR-5-87 TABLE OF CONTENTS 1. Introduction.......................................................................................... 2. Overview of Counterfactual Reasoning........................................ 3 3. Review of the Modeling Formalism................................................. 4 4. Updating Problem and Its Relation to Counterfactual Reasoning........ 8 5. A Concept of Relative Closeness.......................................................... 8 6. Geometric Analysis of Configuration Space..............................8......... 8 7. Conditions for Unambiguous Updating................................................ 15 8 Syntactic Characterizations of Conditions........................................... 20 9. The Connection to Situation-Based Planning, The Frame Problem, and Rule-Based Systems....................................................................... 22 10. Summary............................................................................................. 23 11. References...................................................................................... 23 I

A Geometric Approach to Counterfactual Reasoning and Its Application to Modeling Manufacturing Systems Arch W. Naylor Mohammad B. Shadmehr University of Michigan University of Michigan February 28, 1987 Abstract A geometric approach to logic-based formal models of manufacturing systems [11 is introduced and is used to solve the updating problem that arises in these logic-based models. The problem arises because it is relatively easy to formulate a model in which the next state is ambiguous. Conditions are presented which identify such undesirable models. The approach is first to show that the updating problem is very similar to the world selection problem in counterfactual reasoning and then to use functional analytic methods to develop conditions which can be viewed as geometric tests. Furthermore, in models which pass these tests, updating becomes an orthogonal projection. Although the application is to manufacturing systems, the results presented are applicable to many other kinds of systems. For example, they are applicable to situation-based planning problems and the frame problem in Artificial Intelligence, discrete-event systems, and rule-based systems. Furthermore, the approach used here may be of some use in the philosophical problem of counterfactuals. Keywords: Manufacturing, Logic, Systems, Models, Discrete-Events, Counterfactual, Rule-Based, Expert, Planning, Situations, Frames 1 Introduction If one is concerned, as we are, with software for the real-time control of the flow of parts, tools, material, and information on the factory floor, then one must also be concerned, in one way or another, with modeling what we will call the "logical view" of a manufacturing system. That is, one needs to keep track of logical conditions such as "part no. 28 is on machine 3", "vehicle 7 is at the loading dock", and "part no. 28 has finished the fourth step in its process plan". One also needs to follow changes in these conditions and model how long it takes to make these changes. For example, one should be able to model moving vehicle 7 from the loading dock to machine 3. In particular, one should merely have to say that is takes about 18 seconds to make the move, ignoring lower level details. The importance of such logical-view models is that they give precisely the needed perspective for the kind of real-time control of interest here: they present just enough detail, and no more, to describe what is essentially a real-time scheduling problem. Needless to say, other 1

kinds of models-for instance, robot kinematic models-are also important for other kinds of control, but from the logical viewpoint one merely needs to know that it takes the robot 10 seconds to move a part from the machine to the vehicle, not the equations of the path followed. One approach to modeling the logical view of integrated manufacturing systems, the one we concentrate on in this article, is given in [1], where, among other things, a number of examples of its application to manufacturing systems are presented. Although it is shown there that the modeling approach is expressive enough to capture the real-time controlsoftware issues, it is also pointed out that there is a fundamental problem-the updating problem-with the modeling system. In this article we present a complete analysis and solution of the problem. These results are important for at least two reasons. First, the logic-based models of [1] are the "natural" way to describe manufacturing systems for purposes of real-time control. By "natural" we do not mean that the first-order logic notation used in [1 is requireed, that is just a very convenient standardized language. Other modes of description-for example, graphical-are clearly possible, but each will merely be another way of describing essentially the same perspective of the system being modeled, that is, the logical view; and each must deal with the updating problem. In other words, updating is a fundamental problem for modeling the logical view of a manufacturing system; and, therefore, its solution is important for real-time control software. Second, a side effect of the solution to the updating problem presented here is to show that the geometric insights of functional analysis and linear spaces can be used to describe the dynamics of manufacturing systems and their control. In particular, planes, orthogonal projections, and linear transformations are applicable. This promises a new approach to the analysis of these systems, and subsequent articles will exploit this approach. Next let us informally introduce the updating problem. Imagine a snapshot of a factory floor at some instant.'It shows the locations of parts, tools, and material as well as other things. Further, imagine giving a command that will result in a new configuration of the factory floor. If the command is something like: "Load the part on the material transport system onto machine no. 9," it is fairly clear what is intended. However, if the command is "Load the part on the material transport system onto machine no. 9 or take it to the loading dock," what is intended is not clear. Presumably, there are two possibilities. One might argue that this simply reveals a nondeterministic system. Perhaps, but suppose that we are attempting to design or model a system that is supposed to be deterministic. How do we recognize and avoid this kind of ambiguity? Doing so, turns out to be relatively easy, but not quite so easy as our simple example might suggest. For instance, suppose that the command is "Unload all machines that can be unloaded." Is it clear what the next factory floor configuration will be? Not necessarily. There may be various ways to unload as many machines as possible. In other words, there are cases that are hard to call. Moreover, in a formal model the commands are probably not in English, they are sentences in a firstorder language or something equivalent to that, and which ones lead to ambiguity is not immediately obvious. Our problem, then, is to avoid this ambiguity. This problem does not, as we have said, arise because we have chosen to model manufacturing systems using first-order logic. The problem would be present even if we chose a different modeling method, and the advantage of our logic-based approach is that it allows us to formulate and treat the problem precisely. The rest of the paper is organized as follows: First, we say just enough about counterfactual reasoning to present the spirit of this area. Then we review the logic-based modeling formalism developed in [1]. Next we mention how our work relates to counterfactuals. After 2

that we present a concept of relative closeness. Everything in this article hinges on this concept of closeness. In particular, updating comes down to picking, say, the next factory floor configuration to be the one ( from among all the possibilities ) that is tht te closest, and recognizing that sometimes there is no closest one. The next section shows that a functional analytic geometry follows naturally from our concept of closeness, and the remainder of the article exploits this geometry to get practical tests for unambiguous updating. Finally there is a section which describes, for the reader interested in situation-based planning or rule-based systems, how our work relates to these areas. 2 Overview of Counterfactual Reasoning Counterfactual reasoning is concerned with explaining what is meant when someone says, for example, "If kangaroos had no tails, they would topple over." Such statements are counterfactual in the sense that the antecedent, in this case, kangaroos have no tails, is false e in the world that we live in. Such statements ask us to imagine a different world where kangaroos do not have tails and to contemplate the consequences of tailless kangaroos. If kangaroos would indeed topple over in such a different world, then we accept the statement "If kangaroos had no tails, they would topple over" as true in our world. Of course, specification of what the different world is becomes a key problem, and the usual approach is to imagine making as few changes as possible in the real world. That is, one imagines a world with tailless kangaroos that is "near" the real world. In fact, one imagines the set of all worlds with tailless kangaroos and picks the world in that set which is "nearest" to the real world. Not too surprisingly, this general approach has raised considerable controversy, as anyone familiar with metrics, topology, and minimization will appreciate. Simply stated, it is just not clear that there will always be a unique, nearest world to the real world. It is not even clear what we should mean by "near". The collection [21 is an excellent overview of counterfactuals. It contains reprints of classic seminal papers in the field together with relatively recent developments, and it presents the various approaches to the "nearness" problem. We are interested in counterfactuals because they arise naturally in the logic-based models we employ. Consider the command "Unload the part from machine no. 4 and put it on the material transport system." After this command is executed the following statement will be true:"The part is not on machine no. 4 and the part is on the material transport system." Presumably this statement was previously false. Thus we went from a world (i.e., manufacturing system configuration) where the statement was false to a world where the statement is true. The problem is to determine what else has to change. That is, what counterfactuals of the form "If the part were not on machine no. 4 and the part were on the material transport system, then..." are true. The consequents of these counterfactuals would characterize the additional changes that have to be made. The goal of this article is to say what these should be. However, it is important to appreciate that we are saying what the changes "should" be, not what they "must" be. In other words, one may undertake any well defined scheme for updating. Presumably one could arrange things in our manufacturing system so that just about anything could happen, and we could somewhat arbitrarily require our models to follow suit. Instead, what we are doing here is to develop a simple, orderly, reasonable,'The antecedent does not have to be false; however, true antecedents just are not interesting when one is studying counterfactuals. 3

even "logical" interpretation of how these logic-based models should work. So our updating scheme is not the only one possible, but rather one that works and makes sense. This parallels what those working on counterfactuals do: They try to formulate reasonable and consistent interpretations for counterfactuals and ignore peculiar, strained ones. Stalnaker [2, page 87] argues that some counterfactual statements are neither true nor false. The classic example that he uses is contained in the following two statements: If Bizet and Verdi had been compatriots, Bizet would have been Italian and If Bizet and Verdi had been compatriots, Verdi would have been French. He argues that each of these counterfactual statements is neither true nor false. Basically, his argument is that there is no convincing way to choose between a world in which Bizet and Verdi are both Italians and one in which they are both French. We will use this point of view to argue that certain models are inappropriate because they raise this kind of ambiguity. 3 Review of the Modeling Formalism We review just enough of [11 to allow the formulation of the needed framework. Intuitively, the models are made up of entities that are things such as parts, tools, machines, vehicles, and so forth. A part being on a machine would typically be modeled by putting these two entities-part and machine-in contact; and if the part were unloaded from the machine, this contact relationship would be broken in the model. Most of the formal details of the modeling approach follow in a fairly straightforward manner. The state of the model at any given time is characterized by a structure [3, page 79] for a first-order language. This language always has a one-place predicate symbol AE and two two-place predicate symbols CT and ST. In addition, it has a number of other, usually oneplace, predicate symbols Ve, Vp,..., and it has constant symbols. Variable symbols are near the end of the alphabet: x, y, z, and constant symbols are in the alphabetic neighborhood of k. A structure characterizing the state is referred to here as a configuration and, as usual for such a structure, it has * A universe set, call it, Eu * A subset of Eu associated with predicate symbol AE. We denote this set also by AE and rely on context to clarify whether the predicate symbol AE or the set AE is meant. We will treat all predicate symbols and their associated relations in the same manner. * Subsets CT and ST of Eu x Eu associated with the predicate symbols CT and ST, respectively. * Subsets V,,Vp,... of the appropriate cartesian product of Eu with itself associated with the predicate symbols V, V,.... ~ Assignments of each constant symbol to an element of Eu. The elements of Eu are the entities, and, as we have said, they are usually things such as parts, assemblies, tools, machines, vehicles and messages. That is, they usually have a direct and obvious connection to the real world. Eu is meant to be the set of all entities that will ever be of interest in the model. Those entities that are currently active are in the set AE. For example, a part may become inactive by leaving the factory floor. Usually active entities are the ones that are known explicitly, and the set of inactive ones, that is, 4

E,- AE, is not described completely. 2 The sets V,,,Vp,... are value sets. For example x E VP,,rti or, equivalently, Vp,,,rt(x) designates that the entity x is a part of type i. In other words the value sets say what kind of entity x is. The entities that are active may change with time; however, the values of an entity are fixed forever. That is, a part is alway a part. Similarly, the element of E,, assigned to a particular constant symbol never changes. That is if an entity has a name, identifier, or serial number it never changes. CT c E,,x E, is the contact relation, and, again, it is used to model such things as the part being on the machine, that is, "the part and the machine in contact." The following sentences 3 are true at all times: (Vx)(-CT(x, x)) that is, CT is irreflexive, 4 (Vx) (Vy)(CT(x, y) -. CT(y, x)) that is, CT is symmetric, (Vy) (Vx)(CT(x, y) -* AE(x) A AE(y)) that is, an inactive entity is never in contact with another entity. The foregoing sentences as well as others that are always true are collected into a set which we will designate by 8. ST c E, x Eu is the substructure relation, and it is used to model such things as the part being a substructure of the assembly. The following well-formed formulas are true at all times, that is, in @: (Vx)ST(x, x) that is, ST is reflexive, (Vx, wi,..., ws, zl,...,, y)(ST(x, w)A...AST(w,, y)AST(y, zl)A...AST(zt, x) - x = y) for s=0,l,...and t=O,,.... This set of well-formed formulas means that the transitive closure of ST is antisymmetric. In other words, we do not want non-trivial "substructure loops." 8 may also contain well-formed formulas that are peculiar to a particular system. For example, it might be that at most one part can be on machine m, that is, Vprt (x) A Vprt (y) A CT(x, m) A CT(y, m) -_ x = y As we will see below, the existence of 0, significantly complicates the updating problem. A configuration is, as we have said, a structure of the form just described, and, as we have also said, it is the state of the system. However, we know that some pieces of the structure never change- the values, the assignments of constants, and the universal set E,- so these can be suppressed in the description of the state, and we can simply keep track of the ordered-triple (AE, CT, ST), where, of course, the interpretation of these symbols as relations is intended. It is possible to make the time at which the system enters 2Usually this lack of explicitness regarding E, - AE is the only reason that the model, i.e., structure, does not technically have a known finite size. 3Recall that a sentence is a well-formed formula with no free variables. 4Irreflexiveness is not really needed here, but it is convenient for CT to be that or reflexive. 5

configuration (AE, CT, ST) part of the state, that is, ((AE, CT, ST),r); however, this will not be needed in this article. 5 The operation of the system, then, becomes a sequence (AEk, CTk, STk) of configurations or states. We now describe how the state changes come about. The basic mechanism is a so-called control which is a collection of input-logical variable pairs, where the pairs are of the form (q, I) with q an input and I a logical variable. The control is used at time t by calling for one of its inputs. For example, if control C is the set (ql, I ),..., (qn, In), then input ql is called for at time t by making Ii true at time t. One can view the control as a set of buttons labeled 11, through In from which one selection can be made. A system can have more than one control. An input, in turn, is a collection of changes, and the purpose of a change is to transform some part of the current configuration. For example, (ki, k2) E CT may be transformed to (kl, k2)~CT. An input may be made up of more than one change because several parts of the configuration may have to be transformed, and these transformations may take different amounts of time. For example, moving a vehicle from A to B first separates the vehicle from A and then later puts it in contact with B, that is, the input Move contains two changes. Eventually inputs and their changes should result in mappings of configurations into configurations, but trying to characterize them directly as mappings is impractical. The systems are highly parallel; consequently, many changes may be occurring simultaneously, and one would, in effect, need a mapping for each possible combination of changes. Further, the number of possible configurations is probably enormous. The alternative is to characterize these mappings indirectly by representing changes in a rather obvious way. A change is as follows: L(t) - R(tt) where L and R are sets of well-formed formulas. The formulas in L refer to the configuration at the start of the change, and the formulas in R refer to the configuration after the change is finished. If all the formulas in L are satisfied, and the change is called for, then R, which is usually not satisfied by the current configuration, becomes satisfied by the new configuration. The effect of the change is to separate the entities kI and k2. It is important not to confuse the symbol "-h" with the first-order logic symbol "-n. That is, L 4 R is not a well-formed formula. In other words, the modeling systems uses first-order logic, but it is not contained in it. If the change starts at time t, then the response R becomes satisfied at time tl. For example, t1 = t + 10 shows that the response will occur 10 units of time after the change started. Although changes are usually quite simple, we do have to go into a bit more detail about how changes work. This is required as a foundation for later results. In particular, we are developing a general theory here, and we have to start with a precise understanding of changes. The remainder of this section can be skimmed on first reading specially by a reader familiar with Skolemization in first order logic. First we assume, with no loss in generality, that the formulas in L and R are in prenex normal form [3, page 150J. For example, L = Qixi ~ Qnxn, 5In fact, it is relatively easy to use entities to encode historical information into the current configuration. Thus, if we wanted to remember when we entered the current configuration, we could include an entity whose value was that time. This is important because it allows the state to contain, as it should, all we need to know about the past. 6

where the Qi's are quantifiers (i.e., V or 3), the xi's are variable symbols, and Y is a wellformed formula without any quantifiers. Occurrences of xi, 1 < i < n, in L are, of course, bound. If other variables symbols occur in L, they are free occurrences. As long as there are no existential quantifiers in L or free variables, things are straightforward. Otherwise, we have to be a bit fussy. In particular, we have to state carefully how information is carried from the left to the right side of a change. First-order logic cannot help us here. We treat existential quantifiers first. Our approach is to imagine 6 that we eliminate existential quantifiers by Skolemizing [3, page 274] L. We explain what this means through examples. If L i~ R is (3x)(CT( x, k)) - CT(x, k) we replaced it by CT(k', k) ~-,CT(k',k) where k' is a constant symbol that is used nowhere else. The intuitive idea is that if L is satisfied by a configuration, there is at least one entity that can be associated with the quantified variable x. We simply assign an arbitrary one of these entities to the new constant symbol k'. Carefully note that if there is indeed more than one choice, the arbitrary selection introduces an element of randomness. The constant k' replaces x on the right side also, where, by convention, it occurs free. Usually Skolem constants such as k' above handle all cases; however, if L is (Vy) (3x)(CT(x, y)) we replaced it by (Vy)(CT(F(y), y) where F is a function symbol that is used nowhere else. This new symbol is assigned to a function, and if there is more than one choice, the selection is made arbitrarily, and it replaces x on the right side as well. Similarly, if L is (3u)(Vv)(3w) (Vx) (Vty)(3z)(u, v, w, x, y, z) we replaced it by (Vv)(Vx)(Vty)y(kV, v, F,w(v), x, y, F(v,,, y)) where k' is a new constant symbol, F, and Fn are new function symbols, and y is a wellformed formula without quantifiers. As before k', Fw, and F, replace u, w, and z on the right side as well. Thus, a structure A satisfies, say, the sentence (Vv) (Vx)(Vy)yF(, v, Fw(v) x, y, F (v, x, y)) if we can assign k' to some entity in the universe of A and associate Fu, and F with functions such that the formula is, then, satisfied in the usual sense by A. We do not consider k', Fw, and F, part of the structure A. Rather we view them as augmenting A to yield a structure for the language with the Skolem symbols added. Again, if there is more than one choice GWe use the word "imagine" because to use the modeling formalism it is not really necessary to make the substitutions that we will discuss. We are merely using Skolemizing as a vehicle for clarifying the operation of the model. 7

possible for k', F,,,, and/or F,, one is selected arbitrarily. This corresponds to selecting, say, any part on a pallet when there is at least one. Note that any constants and functions introduced by Skolemizing refer to the same things in both L and R. Further, variables in L and R refer, by convention, to the same thing. This finishes the discussion of the existential quantifiers. For the free variables that occur in L or R we have to use a different method, but since free variables rarely occur we omit their discussion here for the sake of brevity. Intuitively, when the free variables are used the change is to be instantiated at various places in the manufacturing system and take place at all these places simultaneously. 4 Updating Problem and Its Relation to Counterfactual Reasoning The similarity of our updating problem to counterfactual reasoning is clear. Suppose that the currently active configuration is A and that the left side L of some change is satisfied. Suppose further that the corresponding right side R becomes true at time t' = t + A. Further, suppose that a set O of sentences is always true. The updating problem is, as we have said, to determine what the next configuration A' should be. It is a problem because there may be no obvious and well-defined way to do it. The candidates for A' are in the class P of all configurations that satisfy R and 9. The configuration A satisfies e but presumably not R. Thus, from a counterfactual reasoning point of view we are assuming that we are only interested in O-worlds. The counterfactual statements of interest are of the form "If R were true and 9 remained true, then... " The difference from counterfactual reasoning is that our worlds are far simpler and quite explicit. Still counterfactual reasoning raises exactly the same closeness problem, that is, which world in P, if any, is closest to the real world A, and much of the philosophical literature [21 on counterfactual reasoning is concerned with this problem. 5 A Concept of Relative Closeness Our approach is quite simple. First we require that the configurations all have the same universe, and that constants symbols, value predicate symbols, and Skolem symbols be interpreted in the same way. Then we say that A' is closer to A than A, is if A' 3 A, and wherever A' disagrees with A so does As. That is, A, disagrees with A in all the ways that A' does and in some additional ones. (We formalize this definition in the next section.) Arguably, any concept of closeness must at least have this property. The difficulty comes when one tries to incorporate more. For example, one might compare factory floor configurations by means of metrics, topologies, neighborhood systems, or other more elaborate constructs. However, we would find this rather arbitrary and of questionable meaningfulness. 6 Geometric Analysis of Configuration Space A geometric interpretation of the updating problem flows almost immediately out of our concept of relative closeness. This geometric viewpoint simplifies things immensely. The basic approach is to consider configurations to be vectors in a linear vector space, where 8

the space is usually infinite dimensional. We will show that selecting the next configuration can be interpreted as an orthogonal projection of the present configuration onto the set of all candidate next configurations. Let C be a set of all configurations with a universe E,, which have a common interpretation for constant symbols, value predicates, and Skolemn symbols. We focus our attention on the relations AE, CT, and ST, and build the geometric viewpoint based on them. C, then, is the set of all triples (AE, CT, ST), where AE C E,, CT C E x E,,, ST C EU x E,. Let P be a subset of C. P is meant to be the set of all configurations satisfying the right hand side, but for now we take it to be arbitrary. Given an arbitrary configuration A ( which is meant to be the current configuration ), our problem is to find a unique configuration A, in P that is closest to A. The formalization of our concept of relative closeness is as follows: Definition 1 Let A, A,, A be configurations in C. We write A - A0 < A - A,, meaning that Ao is closer to A than Ag is to A, or Ao = A,, when AAEAE, C AEAAE, CTACT,, C CTACT, STAST,, C STAST, where A denotes symmetric set difference (that is, the points in one of the sets but not both). Now our problem is, for a given P and an arbitrary A, to find A, E P such that (VA,)(A, e P -. A - A, < A - A,) This involves two stages, one to show that such an A, exists, and another to characterize it in terms of A and P. The first problem is, of course, that this may not be possible for arbitrary P's. Therefore, the problem becomes to characterize the ones for which it is, that is, to characterize P's such that (VA)(3A,)(VA,)(AO E P A (A, E P, A - A, < A - A,.)) (1) Notice that if such an A, exists, it is necessarily unique. To see this let A,, and A', be two configurations in P that are closest to A. By equation ( 1) we have: (A - Ao < A - A') A (A - A'o < A - Ao) == A - A'o = A - Ao ==> A'o = A, Now that our goal is formally stated, we assume that we have a set P which satisfies equation ( 1) and derive some necessary conditions that P must satisfy. Next we will show that these conditions are also sufficient. Finally, for all such P's, we characterize A,,. Also for the purpose of simplicity we assume that a configuration consists of a single unary relation. Because we are going to treat configurations as points in a space this simplification is without loss of generality. It is quite easy to see how the arguments we present for the simple case of a single unary relation applies to the general case, where we consider AE, CT, and ST all at the same time, and, in fact, it is easy to generalize to other applications. This simplification means that the set of configurations becomes the power set of the set of entities. So from here on we will be concerned with the subsets of a given arbitrary set rather than configurations. 9

Lemma I Given an arbitrary nonempty set S, the power set of S (denoted by 2') is a vector space over GF(2) (that is, the binary field of 0 and 1, see [4, page 81 where vector addition and scalar multiplication are defined as follows: ~r~^? E ^ ~ def Alef dl(f xl, x2 E 2' xl +x2 = XlAx2; 1.xl = x1 Oxl = 0 Proof: The proof is straightforward. It only involves the verification of the axioms characterizing a linear vector space. Note 1: In this vector space x1 + xi = X Axl = 0 => X1 = -X-i => Xi + X2 = X1 - X2 Thus, from here on'-' and'+' will be used interchangeably. The reader must be careful not to confuse'-' which is symmetric set difference here, with the usual definition of'-' for sets which is asymmetric set difference. The'-' is never used for the latter purpose in this article unless specifically mentioned. Note 2: C defines a partial order in 2`. We will be using < and C interchangeably. Furthermore, we will be using the fact that (2s, n, U, S, 0) is a boolean algebra implicitly. Note 3: Note the consistency between Definition 1 and this lemma. Note 4: For the sake of clarity, in all subsequent equations n and U are given precedence over +. That is: (x Uy) + (y n z) = xUy+ ynz The following are two equalities that are very useful in proofs. For x,y, zE2s: (x+y)n z=xnz+ynz (2) For x,y,zE2: x+y=xUy+xny (3) Definition 2 A Plane is a subset P in a vector space U characterized by a generating subspace V and a vector d in the following way: (xz)(3y)(z E P 4' (y e V) A (x = d + y)) It can be shown that the following simpler definition of a plane is equivalent to the one above when the underlying vector space is 2'. We will be using this new one more often. Definition 3 A subset P in the vector space of 2' is a Plane if it contains the sum of any S of its (not necessarily distinct) vectors. This finishes the preliminaries. Now we turn to the necessary conditions. Lemma 2 If P C 2s satisfies equation ( 1), then P is a plane. Proof: Let x, y, z be any three (not necessarily distinct) elements of P. Let A = x + y + z. From equation ( 1) there exists A, E P such that for any p E P: A - A, < A - p But x, y, z are in P, hence: A- AoA - x A - Ao<A -y y == - A < (A - x) (A - y)(A - z) = (y +z)n (z + x)n( + y) A-A,<A-z z =(yn znx++y z+y x+yx + z x+z y y+zx+z n xn y) =0 = A - A = 0 - A = AO == A E P 10

It follows from Definition 3 that P is a plane. K> Lemma 3 If P C 2' satisfies equation (1), then P is Cubic 7 that is (Vx) (Vx)(Vx2)(xL E P A^2 E P x ~ x < x< -o x E P) Proof: Take xl,x2 E P; xl < x < X2. From equation ( 1) there exists x,, E P such that for any p P: x-x, < x-p. But xl, x2 are in P hence: - <z - - oz < (x - xi) n (x - 2) = (x + x n x2 + X n z1 + x, n 2) X -- Xf<X ~- X2 = X + X + X1 + X1 = 0 -- - === 0 = X = Xo, X E P K> Lemma 4 If P C 2^ satisfies equation ( 1),then P is Closed, that is the limit of any monotonically increasing (decreasing) sequence of elements of P is in P. The limit of a monotonically increasing (decreasing) sequence is defined in the usual way, that is, as the union (intersection) of the elements in the sequence. Proof: a) monotonically increasing sequences. The statement of the lemma is: 00 (xi E P i = 1,2,...) A (x < i < j i, j = 1, 2,...) = lm = = zi P i=l Note that x = Ui~ xi is well defined because xi's are in the power set of S;therefore, x C S. ZFrom equation ( 1) there exists x, E P such that for any p E P: x - x < x3- p. But xi i = 1,2,... are in P hence: (Vxi)(x - x, < x - xi = (x n x) u (x' n xi) = x n ) x= oo oo oo X- 37n0 I07 37 = 37 n Q 7 = 3n (u xi)' = x n'= = 3 = 3x = =x 3x E P i=l i=l i=l b) monotonically decreasing sequences. The statement of the lemma is: 00 (xi E P i =1,2,...) A (xi > i < j i,j = 1,2,...)= x = lim xi = xi P i=l The proof of this part is analogous to that of part a) and hence omitted. Lemma 5 If P C 2 satisfies equation ( 1), then every translation of P is a Closed Cubic Plane, where a translation of P is any set Pd = {y I ( = x + d) A (x E P)} where d is a fixed vector. 7There are a number of terms that might be used here, for example, ideal, model complete, interval, or free. We borrowed "cubic" from switching theory. 11

Proof: a) P, is a plane Let yl, Y2, Y3 be any three vectors in P,1. By definition we have yi = xi + d, xi E P; i = 1, 2, 3 We have already shown that P must be a plane, hence xl + x2 + x3 E P. YI +Y2+Y3 = (xi -x2 +x3)+d == Y+ Y2+3 E P b) P, is cubic Let Yi < y < y2, where y is any vector and Y1, Y2 are vectors in P1. By definition we have yi = xi + d,xi E P,i = 1,2. Let x = y + d. From equation ( 1) there exists x, E P such that for any p E P: x-x,) < x -p. But xi; i = 1,2 are in P, hence: x-X, < (x-xl)n(x-x2) = (y-yl)n(y-y2) = y+y+y+yl = 0 =, x = a, =s x E P == y Pd, c) Pt is closed Proof of this part is analogous to proof of part b) and hence omitted. Lemma 6 If P C 2` and all its translations are Closed Cubic Planes, then P has a least element e and a greatest element g. Proof: Only the proof for the existence of g is given. The other case is similar. Being closed means that every monotonically increasing chain in P has a limit in P. Zorn's lemma [5, Page 556] states that: If every chain in a partially ordered set has an upperbound, then the set has a maximal element. Therefore, P must have a maximal element g. To show that g is the greatest element assume to the contrary that there exists x E P, x ~ g. This means that x U g > g, which is a contradiction once we show that x U g E P. To show x U g E P consider the set P + g which is a translation of P and hence is cubic by hypothesis. (a; n g+ g)n(z+g) = xng+ xn gnx+ g = xng+ g ==> 0< (x n g + g) < (x + g) Since 0 E P + g and x + g E P + g the cubic property implies that x n g + g E P + g which results in x n g E P. Using equation xUg=x+g+xng == xUgE P The previous lemmas have provided all the necessary and sufficient conditions. Now we are ready to characterize the sets having a unique approximation for every point in the space. Lemma 7 P C 2s and all its translations are Closed Cubic Planes if and only if (vx)(3,,)(Vp)(zo E PA (p E P - x-x, < x-p)) Proof: The if part is already done by previous lemmas. For the only if part take any x E 2"' and set x, = (xU ) n g, where i and g are the least and greatest elements of P (lemma 6). 12

To complete the proof x,, must be shown to be in P and it must be shown to be the closest element of P to x. x,,n g- =,, x,,< g x,,nti= t =>e < I XQ e P ( P) A (g E P) To show that x, is the closest element of P to x take any p G P (x- xo) n (x-p) = x+ p +, n x+x,,np = x + xp+xp n g+ (xU e) np Now apply equation 3 to the last two terms to get x+ xnp+xnp+(xng)u((xue)np) = x+(xut)n((xng)up) = x+(xue)n((xup)ng) = x + (x ( n p)) n g == - - x-Xn(x-p) = - x = X- X, X-p The lemma just proved characterizes our desired sets as sets all of whose translations are closed cubic planes. Let's briefly investigate the independence of these four conditions. It can be shown that being a plane is implied by staying cubic under translations but the other three conditions are independent. The following examples show what goes wrong if any of the three properties of closedness, cubic-ness, and invariance under translations is omitted. Example 1 Let S have an infinite number of elements. Let P E 2' be the set of all finite subsets of S. P and all its translations are obviously planes. Furthermore, P is obviously cubic. Translations of P are also cubic; however, this may not be so obvious. Consider P + d, where d is a vector (reminder: vectors are sets) of infinite cardinality (if d is of finite cardinality things are obvious). Let x1,x2 E P + d and xl < x < x2. xi,x2 are different from d at a finite number of places. This implies that x can be different from d at only a finite number of places. Hence x + d is finite which means that x E P. So we see that P and all its translations are cubic planes but they are obviously not closed. Moreover, it is not difficult to see that P does not have a closest approximation for any infinite vector. Example 2 Let S = {a, b, c}, P = {0, S}. It is easy to verify that P and all its translations are closed planes but not cubic. Moreover, P does not have a closest approximation for any point outside of it. For example consider {a}, neither point in P is a closest approximation for this point because {a}- 0 and {a} - S are not even comparable. Example 3 Let S = {1,2, 3,4,5}, P = {{1,3,5}, {2, 3,5}, {1, 4, 5}, {2,4,5}}. It is easy to verify that P is a closed cubic plane. However, not all its translations are. Consider the translation of P under the vector d = {1, 3, 5} P+d= {0, {1, 2}, {3, 4}, {1, 2, 3,4}} Obviously this translation is not cubic, and it is easy to see that P has no closest approximation for many points, for example the point {5}. The following theorem is the major result of this section. So far we have shown that the sets possessing the desired property of having a closest approximation for every point in the space are sets all of whose translations are closed cubic planes. The next theorem shows that these translation invariant closed cubic planes can be characterized as the set of all points that contain as a subset (remember points in the space are subsets of S) a given 13

point and have no intersection with another given point. In terms of configurations, it says that the set of all configurations such that every one of the three relations AE, CT, ST of every configuration is true on a given portion of the space, false on another given portion of the space, and arbitrary elsewhere; characterizes a desired right hand side. This was conjectured in [11. Theorem 8 For P C 2' the following three statements are equivalent: 1) P and all its translations are Closed Cubic Planes 2) (Vx)(3x))(Vp) (x, E P A (p E P - - xz o < - p)) 3) There exist sets PT, PF E 22, PT n PF = 0 such that (Vx)( E P -> (PT C ) A (X n PF = 0)) Proof: Previous lemma showed that 1) if and only if 2). Here we will show that 1) -- 3) and 3) -, 2) to finish the proof. 1)- 3) P has a least element e and a greatest element g as shown by Lemma 6. Set PT = e, PF = g' (where g' is complement of g ). To show that 3) holds, consider any x E P. Because P is cubic and because of the way i, g, PT, PF are defined we have: xE P = e= < z < x g (e c x) A (x C g) =- (PT C x) A (x n PF = 0) 3) - 2) To show that 2) holds set,, = (x U PT) - PF, where'-' denotes the asymmetric set difference here. To complete the proof, x, must be shown to be in P, and it must be shown to be the closest element of P to x. To show these simply replace PT for e and PF for g in the proof of the previous lemma. So far all the attention has been centered on the characteristics of sets that possess a unique approximation for every point in the space. Now we move on to show that this unique approximation is an orthogonal projection. Definition 4 Two sets A, B E 28 are called orthogonal if A n B = 0. Theorem 9 Let P and all its translations be closed cubic planes. Lemma 6 shows that P has a least element e and a greatest element g. Define:.: 2" ~_ P, 1r(x) = So = (x e)ng The mapping (r - e) is an orthogonal projection on the subspace P - e The statement of the theorem needs a word of explanation. Because e E P, the plane P - e is the translated version of P that is a subspace. Recall Definition 2. It should be obvious that P - e is the generating subspace of P. It can be shown that the generating subspace of a plane is always unique and does not depend on the choice of the vector (here i). With the above in mind it should not be difficult to see what the statement of the theorem is saying. One question remains however: why are we talking about (r - i) and P - e and not directly about Ir and P? The answer is that -r fails to be linear whereas (Ir - t) is linear and this is anything but unusual. In the Cartesian plane only lines through the origin are subspaces and only a projection on these lines is linear. Projection on a line that does not go through the origin can be in a sense orthogonal but not linear. This is exactly the case 14

here. Proof of Theorem 9: Three things need to be established: r - e is linear, r - e is a projection on P - e, and ir - e is an orthogonal projection. But first an elementary result needed in all three subproofs: x E 2': (ir-)(x) = (xui)ng+eng = (xUt+t)ng = (((xue)ni')u((zui)'nt))ng = xnt'ng 1) Since 0 and 1 are the only scalars in this vector space the following two conditions suffice for linearity. (r - e)(0) = (O u e) n g + e = 0 x,yE2s: (7r-e)(x+y)= (x+y) nt'ng=xnt'ng+yn'ng= (r-e)(x)+ (r-e)(y) 2) The previous theorem showed that,, = 7r(x) E P. Immediate consequence of this is that (Tr - t)(x) E (P - e) which implies that P - = Range(r - e). For Xr- e to be a projection it must also be shown that (r - )2 = (Tr- ). xE 2: (r - e)2(x) = (n - )(x ne'ng) = (x n e' n g n' n ) = (xn' n g) = (r - e)(x) 3) To show that (~r - ) is an orthogonal projection on P - it must be shown that x - (r - e)()I(7 - e)(:). x E 2: (x- (r-e)(z))n(r-e)(x) = (x+xni'ng)n(xne'ng) = xne'ng+xni'ng = 0 7 Conditions for Unambiguous Updating By now the necessary and sufficient conditions for unambiguous updating would appear to be obvious: the set of candidate next configurations must form a translation invariant closed cubic plane. However, this is not quite the case. In Section 6 we ignored the role of 8, that is, the sentences that are always true, and the role of L, that, is the sentences in the left hand side of the change. We allowed the current configuration to be any configuration in C, and we know that only those configurations which satisfy all the sentences in O and L need be considered. Furthermore we did not require the resultant configuration to satisfy 8, and we should have. This section is devoted to these two further considerations and how they affect our scheme for updating. Needless to say, if R, the sentences in the right hand side, together with O characterize a translation invariant closed cubic plane, there is nothing new to say. The problem occurs when they do not, and this can happen in realistic cases. Let PR, Pe, PL, PL.e, and PR.e be, respectively, the set of configurations that satisfy R, 8, L, L U 8, and R U 8. Clearly, PR.^ = PR n Pe, and PL.6 = PL n Pe. Presumably, we are interested in the case where PR,e, and PL.e are not empty, that is, R and L are both consistent with 8. Let us consider an example in which we restrict our attention to just CT. If R = CT(kl,k2) A CT(k2, k), we know that PR is a translation invariant closed cubic plane (characterized by PRT = {(k, k2),(k2,kl)},PPR = 0, where PRT and PRF are the sets introduced in the third part of Thereom 8. ). If e says that CT is symmetric, that is, (Vx)(Vy)(CT(x,y) -- CT(y,x)), then Pc( is a closed subspace, but it is not cubic. PRe 15

is all symmetric configurations satisfying CT(k1, k^) A CT(k2, ki), and it is a closed plane but not a cubic one. However, in this case updating is still unambiguous. If CT is any symmetric CT-relation, its orthogonal projection onto the translation invariant closed cubic plane PR is also symmetric, and, therefore, in Pn.i. On the other hand, if R = CT(kl, k2), PR is still a translation invariant closed cubic plane. However, now the orthogonal projection of a symmetric CT onto PR will not be symmetric. Thus, for a given e, there are certain translation invariant closed cubic planes that lead to unambiguous updating and ones that do not. Our task is, therefore, to go from a current configuration A E PL.e to the next configuration A, E PR.e such that A, is the closest element of PR.(- to A. Our problems are: 1) How to make sure there exists such an A, and that it is unique 2) How to find such an A, Of course we can get around these problems by requiring PR.e to be a translation invariant closed cubic plane, but that is a stronger than necessary requirement which is not satisfied in many realistic cases like the examples above. So it might appear that we have to discard all the results of the last section regarding translation invariant closed cubic planes and start over again. But, although we have to start over again, the results of the last section turn into very useful tools that simplify our problems immensely. In fact, updating is still carried out by an orthogonal projection onto a translation invariant closed cubic plane. Again, for the sake of simplicity, we are going to consider the subsets of a given set S instead of configurations. We keep on using PR, P(,-) PR.e, PL, and PLe, but interpret them in the new context. Next some definitions and lemmas are presented which culminate in a test for verification of the existence of A,,. Specification of A,, when it exists, is taken care of next. Finally we draw some general conclusions using illustrative examples. Definition 5 Consider an arbitrary set Q C 2S. The Kin of q, q E Q, with respect to Q is the set of all points in the space (that is 2S) that q is closer to than any point in Q - {q} is. That is Kin(q) = {x I x - q < x - q'; q' E Q} The following is a set-theoretic identity that will be used later. Lemma 10 For R C 28, xE 2` let x + R = {x + rJr E R}, then f(x + R) = x + n (n R+ UR)+ R Proof: Using the fact that A n B' = A + A n B we get x+xn(UR+nR)+fR = (z+xnUR)+~(fR+xn R) = (xn(U R)')+(nRnx') Now apply equation ( 3) of last section = (x n (U R)') n (n R n x') + (x n (U R)') u ( R n x') = 0+ (x n (U R)') u ( R n x') u (a n x') u (n R n (U R)') = (x' u (U R)') n (n R u x) = (x' U n{r'lr E R}) n (n{rlr E R} u x) = n({' U r'Ir E R} n({x u rlr e R} = n{(x u r') n (x U r)lr E R} = n{x + rJr E R} = n(x + R) Lemma 11 The kins of two different points are disjoint. 16

Proof: Let k be a point that is in kins of both ql, and q2. (k E kin(ql)Ak E kin(q2)) ==. (k-ql < k-q2)A(k-q2 < IC-qi) == k-q1 = l-q2 == q = q2 Lemma 12 Consider an arbitrary set Q C 2'. The kin of q, q E Q, with respect to Q is a translation invariant closed cubic plane. Furthermore, the kins of q1 E Q, and q2 E Q with respect to Q are simply shifted versions of each other. Proof: a) Let K = kin of q = {klk E 2S,k - q < k - q', q' E Q}. Use Theorem 8 to show that K is a translation invariant closed cubic plane. Let KT=q+nQ KF=q+UQ First we have to verify that KT n KF = 0 KT n KF = (q + nQ) n (q + U Q) = q + + q Q + nQ = 0 What remains to be shown is that the following well-formed formula is true: (Vx)(x E K 4 - (KT C x) A ( n KF = 0)) By definition of K we have (x - q) = n(x - Q) for x E K. Apply Lemma 10 to this equality to get q = x n UQ + xn nQ + n Q. For the right implication let x E K. xnKT = xn(q+fnlQ) = xn (x n U Q+ xn Q) = (x n U Q+ xn nQ) =q + n Q= KT =. KTC Z xnKF= xn (q + UQ) = n (x n U Q +xn nQ+ nQ + UQ) = xn (UQ + nQ + nQ + u Q) == KF n = 0 For the left implication let KT C x, x n KF = 0 (x n (q +nQ) = q+ nQ) A(xn(q+UQ) = ) = ( xnq+xnn = q+nQ)A (xnq=xnU Q) => xnUQ+xnnQ=q+nQ => q=xnUQ+xnnQ+nQ =. x-q=n(x-Q) => xeK b) Let K1 and K2 be the kins of ql and q2 respectively. We claim that K2 = K1 + ql + q2. Recall that by definition: Ki= {klk E 2S, k - q < k-q',q' EQ} i=1, 2 To prove the claim let xi E K1, and q' be any element in Q then: ((1 + q, + q2) + q') n ((x, + q, + q2) + q2) = ((xi + ql) + (x1 + q2) + (x + q')) n (x, + qi) = (Xi + q)) + (xi + qi) + (1x + q,) = (1x + qq) = ((x + q, + q2) + q2) = (1x + q, + q2) E K2 Similarly it can be shown that: (x2 + ql + q2) E K1 which proves the claim. Now we are ready for the test for the verification of the existence of A, Definition 6 PR, PL, and P( are called compatible when for any point in PL.(C there is a unique point in PR,e that is the closest point of PR.e to that point. 17

Lemma 13 PR, PL, and P() are compatible if and only if PL.(-) is a subset of the union of the kins of the points of P.(.) (with respect to P.,()). Proof: If PI, PL, and P(- are compatible, then the claim of the lemma is an immediate consequence of the definition of the kin. If PL.(-) is a subset of the union of the kins of the points of PR(-), then take any point A E PL.(-). A is in the kin of one (and only one) point of PR.(-) and hence that unique point is the closest element of PR.(?) to A. Therefore compatibility holds. Now that we are assured of the existence and the uniqueness of A, we turn to the second problem of how to find A,. Had PR,e been a translation invariant closed cubic plane we would have found A,) by taking the orthogonal projection of A onto PRI(?. Here PR.. is not, but we claim that A, is the orthogonal projection of A onto the smallest translation invariant closed cubic plane containing PR.e as a subset. Lemma 14 Let e =nPR.(, 9g=UPR., P = {pl P g} P is the smallest translation invariant closed cubic plane containing P.eo. Proof: let PT = e, PF = g' = 2 - g, then we have: P = {pI(PT C P) A (p n PF = 0)} Hence by Theorem 8 of the previous section P is a translation invariant closed cubic plane. The fact that P is the smallest one is obvious by the construction of P and the choices for e and g. The following theorem sums up the results of this section. In simple words, it is saying that unambiguous updating is possible when PR, PL, and P(e are compatible and updating is done by taking an orthogonal projection. Theorem 15 When PR, PL, and Pe are compatible, the closest element of PR,. to a given A E PL,,g is the orthogonal projection of A onto the smallest translation invariant closed cubic plane containing PR.e. Proof: A E PL.e plus the fact that PR, PL, and Pe are compatible implies that there exists A, E PR.@ such that A is in the kin of A,, hence Ao is the closest element of PR.( to A. What remains to be shown is that A,) is the orthogonal projection of A onto plane P defined below, or to be precise (A, - f) is an orthogonal projection onto P - t (see the discussion of Theorem P= {pe < pg 9, e= nPRe, g= UPRe} Using Theorem 9 all we need to do is to show that the following relation between A and Ao holds: A, = (A U i) n g To show this we use Lemma 12. Let K = kin(A,) with respect to PR.e. Since A E K we have: (A n KT = KT) A (A n KF0 = 0) (A n (A, + e) = A, + t) A (A n (A, + g) = 0) (AnAo+Ant=A,+) A(AnA,=Ang) = (Ang+Ant=Ao+) == ((Ang)+(Angn ) +=Ao) == ((A ng)u=A) == ((Au e) n g=A A) 18

Now that we have characterized the condition for unambiguous updating (namely requiring Pn, PL, and P() to be compatible) and showed how to carry out the updating (by taking an orthogonal projection), let us go back to the two examples we discussed at the beginning of the section and see how each can be treated. Let R = CT(kl, k^)ACT(k9, kl), and let O say that CT is symmetric, that is, (Vx)(Ydy)(CT(x, y) - CT(y, x)). We ignore L in this example. Let us see whether P() and PI are compatible. PRan is all symmetric configurations satisfying CT(l, k) A CT(k, k). Hence the kin of any point in PRf consists of four points which agree everywhere except on the ordered pairs (kI k2) and (k2, ki) (where they are arbitrary). The union of all the kins is the set of all configurations whose CT is symmetric everywhere with the possible exception of locations (kl, k2) and (k2, kl). This union obviously includes Pt as a subset. Hence Pw( and PR are compatible. To update a current configuration A here, we take its orthogonal projection on the smallest translation invariant closed cubic plane through PR.e. The result is A,=(Aue)ng where: e=nPR.(., 9=UPR.e Which is equivalent to saying: The CT of the next configuration is true on locations (ka, k2) and (k2, k), but everywhere else it is exactly like the CT of the current configuration. A result that is in perfect agreement with our intuition of what the above right hand side is trying to do. The next example is more interesting. Let R = CT(ki, k2), e be as before (all symmetric configurations), and as before ignore L. Let us see whether Pe and PR are compatible. PRE is, as befor,, the set of all symmetric configurations satisfying CT(ki,k2) A CT(k2, k). Notice that although R only specifies CT(kl, k2), because of 8 we are forced to include CT(k2, ki) as well. From here on this example is just like the previous one and updating under the right hand sides of either of these examples yields the same "next configuration". The above examples illustrate a very important point. They show that the modeling system developed here is powerful enough to keep 8 always satisfied without being explicitly told (in the right hand sides of the changes) how to do so. This means that the designer of the changes need not worry about making sure that his right hand sides are in harmony with the things that are always true (Of course this presupposes compatibility). This may be an insignificant advantage when O is simple but it becomes a very significant one with a complex 8. The next example shows another important point. It demonstrates the capability of the modeling system to detect bad combinations of L, R, and 8. Let R = ST(ki, k2), and let 8 be the following well-formed formulas: (Vx)(ST(x, x)) (Vxi)... (Vx X)(ST(x1, X2) A... A ( Xn) ST(x,, x) A ST(x,, xl) - xi = Xn) In other words, 8 is saying that ST is reflexive and its transitive closure is antisymmetric, both of which make eminently good sense given the intuitive notion of a substructure relationship because we do not want to allow non-trivial substructure loops. Assume that the current configuration is A and the following are true in A: ST(k2, k3) A ST(k3, kI) kIcl I2 $ k3 19

Let L be arbitrary otherwise, as long as A E PL.(- (for example L can be CT(k1, k^)). Let us see whether L, R, and 0 are compatible. Let q be an arbitrary element of P?.,)(. Kin(q) is a translation invariant closed cubic plane characterized by KT, and KF where: KT = q + P.() = q + {x I x is a configuration in which ST(k1, k2) is true, and everything else is false} KF = q + PR.. = q + {x i x is a configuration in which ST(k2, k1) is false, and everything else is true} Since for any y E Kin(q) we must have (KT C y A y n KF = 0), we conclude that Kin(q) consists of four points (configurations) which are all exactly like q everywhere except on the ordered pairs (kl, k^), and (k2, kI) (where they are arbitrary). The union of these kins is the set of all configurations that are arbitrary on the ordered pairs (k, k2), and (k2, ), and at other locations are exactly like a point (configuration) in PRn e Now consider the current configuration A which is true on the ordered pairs (k2, k3) and (k3, kl). If A is in the union ofe in the kpoins of the points of P there must be a point of PR.( which is exactly like A at all locations except for ordered pairs (k,,k2), and (k2, k). Thus at this point both ST(k, k3), and ST(k3,k1) are true. Moreover, at this point ST(k1,k2) is also true because this point belongs to PR. But the truth of the above three substructures at this point is in clear contradiction with E. Hence we have a contradiction, which implies that A is not in the union of the kins of the points of PR.e which means that L, R, and O are not compatible. When a situation like that in the last example happens, it warns the system designer of an ambiguity in the system. In the above example ambiguity arises because the specified action of the right hand side creates a non-trivial loop of length three of substructure relations. Since such loops are prohibited (by 0) the loop must be broken somewhere, but where? It can be broken at any of the three places and that is ambiguous. The course of action that the system designer should take when encountering such a situation is to expand L to prevent this change from taking place when there is a possibility of trouble (here creating a loop); or to expand R to remedy the situation should the trouble occur (here by specifying how to break the loop). We conclude by saying that the modeling system provided here not only answers some theoretical questions regarding the problem of updating, but it also is powerful enough to help the designer of the system with his design. In fact for complex systems a powerful tool like this is indispensable. 8 Syntactic Characterizations of Conditions We see now that it is important that we be able to recognize translation invariant closed cubic planes, and we have given a definition in semantic (i.e., set-theoretic) terms. Now we have to connect this semantic definition to the syntax; otherwise, we will not be able to recognize sentences that yield translation invariant closed cubic planes. Perhaps it is well to repeat what the situation is. We are not concerned with all configurations which satisfy a set of sentences. At least we are not concerned with them all at once. Rather we are concerned with special classes of configurations. In particular, any one of these classes is made up of configurations that have the same universe and that interpret constant symbols, value predicate symbols, and Skolem symbols in the same way. That is, 20

they differ only in the ways that they interpret AE, CT, and ST. Within one of these classes we are interested in the configurations that satisfy a set of sentences, and we want to know if they form a translation invariant closed cubic plane. We also know from Theorem 8 that a translation invariant closed cubic plane is characterized by sets AET, AEF, CTT, CTF, STT, and STE. Thus, we need sentences that will define these sets, and, of course, these sets have to be defined either explicitly or implicitly using symbols from our first-order language that are not AE, CT, or ST. We define these sets explicitly 8 as follows: (VX)(AET(x) +- >AET(Z)) (VX)(AEF(X) +AEF(X)) where 4AET(X), and /AEF(x) are well-formed formulas with x as the only free variable and which do not contain AE, CT, or ST, and, of course, the two sets must be disjoint which means that (VX)(-'(^AET(x) A qAEF(x))) The sets CTT, CTE, STT, and STF are taken care of similarly. We then describe translation invariant closed cubic planes with three sentences as follows: (VX)((kAET(X) A AE(x)) V (kAEF(x) A -AE(x)) V -(~AET(x) V AEF(X)) Any AE which satisfies this sentence must be true on AET, false on AEF, and arbitrary elsewhere. The other two sentences are (V) (Vy)(((OCTT(X, y) A CT(x, y)) V (ObcTF(x, y) A -'CT(x, y)) V -(0(kTT(x, y) V'OkTF(X, y)) and (Vx)(Vy)(((sTT(, y) A ST(x, y)) V (OSTF(X, ) A -ST(x, y)) V -(qSTT(x, Y) V 4STF(x, Y)) Example 4 Suppose that the right side is AE(k1) A AE(k2) A -CT(kl, k2) A -CT(k2, ki) In the above canonical form this becomes OAET = ((x = ki) V (x = k2)) ~AEF = False (:,TT = False qCTF = ((x = kIi) A (y = k2)) V ((x = k2) A (y = ki)) Of course, the original form of R in the foregoing example is preferable to the canonical form, and, in fact, many translation invariant closed cubic planes are characterized by such conjunctions. The canonical form just provides us with a convenient test. Finally, a technical point, there are sets that cannot be defined using our first-order language: therefore, there are translation invariant closed cubic planes that cannot be defined using our first-order language. However, they are of no practical interest. 8Beth's Theorem [6, page 871 guarantees that an implicit definition of AET etc is not necessary. 21

9 The Connection to Situation-based Planning, the Frame Problem, and Rule-based Systems Robot planning problems are discussed in most introductory texts on Artificial Intelligence (for example, see [7, page 2521). In brief the problem is that given the current status of the world(that is the current'situation'), represented in the context of first-order logic, how do we go about changing the world to attain a goal status. One popular method for going about solving this problem is by using automatic theorem proving. But since the facts change from situation to situation a new kind of variable called a situation variable is introduced and added to the arguments of all predicates (This idea is due to Green [81). The axioms of the theorem proving system are the operations or actions that can be performed by the robot (plus the facts that are true in the initial ituation). A typical operation looks like: L(s) ~ R(s') where L and R are well-formed formulas and s and s' are situation variables. The above operation says how the status of the world change as the transition from situation s to situation s' is made. However, that is all it says; more specifically it does not say what happens to other things in the world, and this is where the frame problem arises. The problem is to specify what the next situation will be unambiguously. One way to do this is to add more axioms for every operation. These added axioms, called frame axioms, simply assert what happens to each each part of a situation. It then becomes possible to use automatic theorem proving to find a sequence of operations (axioms) that change the current situation to a situation in which a desired goal statement is true. However, there are usually so many axioms in the system now that this method becomes impractical for all but small problems. That is why some robot problem-solving systems such as STRIPS [9J abandon the idea of situation variables and explicit frame axioms and solve the frame problem implicitly by requiring every operation to have a'delete' list and an'add' list on the right hand side rather than simply a statement that is to be true in the next situation. The status of the situation at any time is represented by a global database. When any action is performed the database is updated by making true the predicates in the'add' list and making false the ones in the'delete' list and leaving everything else intact (this is the implicit frame axiom). We are interested in'situations' and the frame problem because situations and our configurations are essentially the same kinds of semantic objects. However, a situation is usually viewed as part of a larger semantic object that is a collection of situations. Furthermore, our modeling system does not explicitly allow deduction to be used as a tool for planning. Rather than one logical system, we have many. At any given time deduction is applicable to the current configuration only. The frame problem can obviously be viewed as a kind of updating problem. In particular, one wants to specify a sufficient number of first-order sentences to characterize the next situation uniquely. Obviously, this is what we are interested in too. In fact, we analyze the problem in a context that goes beyond first-order logic, but, given our results, we can put everything back into a first-order system as is done in situation-based planning. It is an easy matter to write down a well-formed formula that characterizes an orthogonal projection which, we now know, is the solution to the updating (and frame) problem. This well-formed formula is the required frame axiom. Thus, there is not an explosion of frame axioms, we just need a few. Further, since we know that each orthogonal projection is characterized by an add and delete list (see, part 3 of 8), we see that the STRIPS approach fits into our theoretical framework also. 22

Anybody familiar with rule-based systems (see the Handbook of Artificial Intelligence for a good overview [101) can see that our changes are similar to the rules of a rule-based system and that our configuration is reminiscent of the'working memory'. Rule-based systems being more a programming paradigm than a modeling tool, however, never give a systematic treatment to the updating problem, and solve it implicitly by also restricting the right-hand side to an'add,delete' list of the elements to be added to or deleted from the working memory. Our analysis confirms, then, the intuitive idea of the'add, delete' list, but it also shows that more can be achieved within our system (see discussion at the end of section 7). Another relevant, difference between our system and Rule-based systems is the presence of O (the set of sentences that must always be true) in our system. 10 Summary We have presented a practical test for determining which models have unambiguous updating and which do not. Presumably, ones that do not are inappropriate models. In addition and perhaps more importantly, we have developed a geometric approach to the analysis of our logic-based model. It is fairly clear that this approach can be used to address a number of other questions regarding these models, and it is our intent to address them in subsequent articles. References [1] Naylor, A.W. and Maletz, M.C.,"The Manufacturing Game: A Formal Approach to Manufacturing Software," IEEE Trans. on Systems, Man, and Cybernetics, May-June, 1986 [2] Harper, W.L., Stalnaker, R., and Pearce, G. (eds.), "IFS Conditionals, Belief, Decision, Chance, and Time," D. Reidel, Dordrecht, Holland, 1981 [3] Enderton, H.B.,"A Mathematical Introduction to Logic," Academic Press, 1972 [4] Gill, A., Linear Sequential Circuits," McGraw-Hill, 1966 [5] Naylor, A. W. and Sell, G. R., "Linear Operator Theory in Engineering and Science," (2nd Edition), Springer-Verlag, 1982 [6] Chang, C.C. and Keisler, H.J., "Model Theory," North Holland Publishing Co., 1973 [7] Rich, E., "Artificial Intelligence", New York: McGraw-Hill, 1983 [8] Green, C., "Theorem Proving by Resolution as a basis for Question-Answering Systems" in "Machine Intelligence 4" editted by Meltzer and Michie, 1969 [9] Fikes,R.E. and Nilsson,N.J. "STRIPS: A new approach to the application of Theorem Proving to problem solving," Artificial Intelligence, Vol. 2, 1971 [10] Barr,A. and Feigenbaum,E.A. "The Handbook of Artificial intelligence," (Vol. 1), Kaufman, Los Altos, California, 1981 23

It^coe to CO