THE UNIVERSITY OF MICHIGAN Technical Report 6 FEASIBILITY OF A SET-THEORETIC DATA STRUCTURE A General Structure Based on a Reconstituted Definition of Relation David L. Childs CONCOMP: Research in Conversational Use of Computers F.H. Westervelt, Project Director ORA Project 07449..... ',. supported by: ADVANCED*RESEARCH PROJECTS AGENCY DEPARTMENT OF DEFENSE WASHINGTON, D.C. CONTRACT NOo DA-49-083 OSA-3050 ARPA ORDER NO. 716 administered through: OFFICE OF RESEARCH ADMINISTRATION ANN ARBOR August 1968

7.,,^ tFt I. 4 VI. — CF- C:

This paper was prepared for the IFIP Congress 1968 for presentation at Edinburgh, Scotland on August 6, 1968. iii * * 11 1

ABSTRACT This paper is motivated by an assumption that many problems dealing with arbitrarily related data can be expedited on a digital computer by a storage structure which allows rapid execution of operations within and between sets of datum names. In order for such a structure to be feasible, two problems must be considered: 1. the structure should be general enough that the sets involved may be unrestricted, thus allowing sets of sets of sets...; sets of ordered pairs, ordered triples...; sets of variable length n-tuples, n-tuples of arbitrary sets; etc.; 2. the set-operations should be general in nature, allowing any of the usual set theory operations between sets as described above, with the assurance that these operations will be executed rapidly. A sufficient condition for the latter is the existence of a well-ordering relation on the union of the participating sets. These problems are resolved in this paper with the introduction of the concept of a 'complex,' which has an additional feature of allowing a natural extension of properties of binary relations to properties of general relations. v

TABLE OF CONTENTS Page ABSTRACT............................................ I. INTRODUCTION.................................. II. COMPLEXES.................................... 5 III. EXTENSION OF SET OPERATIONS TO COMPLEXES...... 9 IV. ORDERED PAIR DEFINED BY A COMPLEX............. 11 V. FUNCTIONS..................................... 13 VI. DEVELOPMENT OF AN N-TUPLE..................... 14 VII. RECONSTITUTED DEFINITION OF RELATION.......... 18 VIII. TAU-ORDERING............................ 21 IX. EXTENDED OPERATIONS FOR RELATIONS............. 28 X. EXAMPLES.................................. 31 XI. CONCLUSION............................... 33 DEFINED SYMBOLS..................................... 35 REFERENCES.......................................... 37 vii

I. INTRODUCTION The overall goal, of which this paper is a part, is the development of a machine-independent data structure allowing rapid processing of data related by arbitrary assignment such as: the contents of a telephone book, library files, census reports, family lineage, networks, etc. Data which are non-intrinisically related have to be expressed (stored) in such a way as to define the way in which they are related before any data structure is applicable. Since any relation can be expressed in set theory as a set of ordered pairs and since set theory provides a wealth of operations for dealing with relations, a set-theoretic data structure appears worth investigation. Let a Set-Theoretic Data Structure (STDS) consist of a set of data and a corresponding set B of datum names or pointers. A datum can be a single item or block of storage. Either way, one datum name (pointer) locates one datum. Many concepts, or groupings of data, can be represented by a set "defined over X." A formal definition of "defined over" will be presented later. Intuitively, A "defined over" g is defined recursively by: any element of A, containing no element, is an element of 3; and any element of A, containing an element, is itself "defined over" 3. Let the rest of an STDS consist of: a. independent sets defined over B (groups of contiguous virtual storage locations without interconnecting -1 -

-2 - pointers between groups) having just one pointer for each set designating the head of that set; and b. set-theoretic operations (data structure operations) which can extract any subset of a given set defined over 1, or give any possible set resulting from a finite number of set-theoretic operations, nested to any depth, on sets defined over. In order to realize such a structure there are two main problems: 1o The first problem concerns the scope of what is to be included as allowable independent sets: sets of sets, sets of n-tuples of sets of n-tuples, etc. 2. The second problem is to assure, for any two allowable sets, an abundance of set operations which can be executed rapidly. If any two sets can be well-ordered such that their union preserves this well-ordering, then subroutines needed for set operations just involve a form of merge or, at worst, a binary search of just one of the sets. Some running times are presented at the end of this paper for some set operations using a merge technique. To emphasize that a well-ordering on two sets does not necessitate a well-ordering on their union let A = {1,2,3} with < defined in the usual way, and let B = {a,b,c} with a<b, b<c, and a<c o Then these two sets are well-ordered, since every subset has a first element; but their union is not

-3 - well-ordered since not every subset has a first element. Since this paper is to show that an STDS is feasible, in the sense that there exists at least one instance, we will consider only one problem by incorporating in the genre of allowable sets that: allowable sets must be such that their union is wellordered. A further consideration, however, is that all sets should be represented using the individual elements of B instead of relying on hierarchical names assigned to particular sets. The latter would prevent the complications involved in keeping track of arbitrary names and the associated sets during nested operations on sets. A final consideration is the "burdensome" definitions of "ordered pairs" used in set theory - burdensome in the sense that they are difficult to express since most of the definitions rely on a redundancy of the elements for expressing order. For example, <a,b> = {{a},{a,b}}. Further, these definitions do not allow a natural extension to ordered triples, ordered quadruples, or general n-tuples, except by establishing a canonical form of nested ordered pairs. For example, <a,b,c,d> could be any of the following: <<a,b>, <c,d>>, <a,<b,<c,d>>>, <a<<b,>,d>>,<<<a,b>,c>,d>, <<a,<b,c>>,d> For any of these, a storage representation using redundancy to express order would waste storage. Another failing with these definitions is, though an ordered pair is defined to be a set, no utilization of set operations between n-tuples is feasible. Another alternative is to define an n-tuple in terms of a function from N(n) = {l,...,n} to a set A by:

-4 - <al,a,..o,a > = {<i,a.>: ieN(n) & a.eA} n 1 1 A problem here is justifying the ambiguity of ordered pairs and 2-tuples under set operations; for ordered pairs <a,b> & <x,b>, <ab>on<x,b> = 0, but for 2-tuples <a,b> & 0 0 0 <x,b>2, <a,b>2<x,b> = {<2,b>} The purpose of this paper then is to establish a broad scope of allowable sets for an STDS, such that: 1. sets of n-tuples are included, 2. set operations between n-tuples are meaningful, 3. all allowable sets can be represented in terms of elements of P, and 4. to insure rapid execution of the set operations on a collection of sets, the union of these sets must preserve well-ordering. The presentation is based on the definition of a "complex." The development of set theory up to, but not including the introduction of, ordered pairs is assumed (see [2]). Let N represent the natural numbers, and let N(n) represent the natural numbers up to and including n. The natural numbers are not being assumed for they can be developed* in a set-theoretic framework without the use of ordered pairs. The first definition is that of a complex. * In order to avoid ambiguities when the empty set is an element of a complex, the natural numbers could be redefined as: O={0}, 1={0,{0}}, 2={0,{0},{0,{0}}}, etc. (see Suppes, p. 134).

II. COMPLEXES DEFINITION 1. Any two sets A and B form a complex (A;B) iff (IX) (IY) (Xe{A,B}) (Ye{A,B}) [(VxeX) (lieN) ({{x},i}eY) & (VyeY) ( j eN) (IxeX) ({ {x}, j }=y) ] This definition is stated in such a way as not to presuppose any ordering in (A;B) of A before B, insuring that a complex be an unordered coupling of two sets, each bearing a mutual dependence on the other. The definition states that for every element x of one of the two sets X the other set Y contains an element containing a natural number and a set whose only element is x; and that Y is such that every element of Y contains only a natural number and a singleton set containing an element of X (either X=A and Y=B, or X=B and Y=A, but not both). Let A = {a,b,c B = {{{a},l}, {{b},3},{{c},963},{{b},6}} and let C = {a,b,{{b},3},{{a},1}, {{d},6}} then (A;B),(B;A) and (AnC;BBC) are complexes, while (A;A), (A;C), (A;BrC) and (AcC;B) are not complexes. From the definition it is clear that if (A;B) is a complex then (B;A) is the same complex and A # B. Since for any complex Q = (A;B), it can be determined whether (VxeA) (i) ({{x},i}eB) or (VxeB) (li)({{x},i}eA) is true, and since one of these has to be true, we can define sets Q and -5 -

-6 - DEFINITION 2. If A and B form a complex Q then there exist sets Q and Q such that ({QQ}={A,B})(VaeQ) (IieN)({{a},i}eQ) For any set A, there is one set of particular interest. DEFINITION 3. For any set A, IA {{y,l}: (xeA > y=0) [(3xeA)(y={x})] e This gives rise to the following theorem. THEOREM 1. For any set A, (A;IA) is a complex. The proof follows directly from Definitions 1 and 3. It will be shown that any such complex formed is also a set, and any set must be a complex of this form. In order to establish the relationship between complexes and sets, and to extend set operations to complexes, we must first define the conditions for containment in a complex. DEFINITION 4. Conditions for containment in a complex Q: i xeQ ++ (xeQ) (Q=IQ) 1 1 ii. xiQ ++ ({{x},i}eQ)(VaeQ) j)({{a}j}eQ) iii. x4iQ ++ (x iQ). For a given complex (A;B) part (i) of the above states that for x to be an element of (A;B) then either x has to be an element of A and B=I v or x has to be an element of A'

-7 - B and A=IB. For an example of (ii) and (iii): let A = {a,b,c} and B = {{{a},l},{{a},2},{{b},3},{{c},7}} and let Q=(A;B) then Q is a complex and aE2Q, b&3Q, and b7 Q are true statements. THEOREM 2. For any set A, A=(A;IA) PROOF. 1. Show xeA + xe(A;IA) xeA -+ {{x},l}eI A xe(A; IA) 2. Show xe(A;IA) + xeA xe(A;IA) + (xeA)({{x},l}eIA) - xeA A A (1)&(2) + A=(A;IA) Q E. Q.E.D. Since for any set A, by Theorem 2, (A;IA) is a set equal to A, it follows that (((A;IA);IA);IA) = A THEOREM 3. For any complex Q, xeQ -+ xl Q PROOF. xeQ + (xeQ)({{x},l}eQ) + xe Q. Q.E.D. Notice that the converse of Theorem 3, xe Q + xeQ, is not generally true. As a counter example let Q = ({a,b};{{{a},}, {{b},2}}) then ae1Q but aOQ since QJIQ since {{b},2}eQ DEFINITION 5. For any two complexes A and B A=B -+ (A=BO0 + A=B) From this definition of equality between complexes, we can prove the following theorem which is analogous to the theorem for equality between sets.

-8 - THEOREM 4. For any two complexes A and B A=B -+ (Vi) (Vx) (x.iA -+ xiB). PROOF. 1. First show A=B - (Vi)(Vx)(xe.A -+ xE.B) 1 1 From Definition 5 A=B + A=B & A=B for A&B ~ 0 then xEoA + ({{x},i}eA)(VaeA) (j) ({{a}j}eA) by Definition 4(ii), and by substitution, xE.A + ({{x},i}eB) (VaeB)(3j)({{a},j}eB) which gives from Definition 4(ii) xc. A + xe B 1 1 similarly xE.B + x>.A:*A=B + (Vi)(Vx) (xe.A t+ xe.B) 1 1 2. Secondly show (Vi)(Vx)(x. A -- xe.B) + A=B By Definition 4 (Vi)(Vx)(x.iA — + xe.B) (Vi) (Vx) [({{x},i}eA) (VaeA) (Ij) ({{a},j}eA) ++ ({{x},i}eB) (VbeB) (1k) ({{b},k}eB)] with Y (A) = (VaeA) (3j)({{a},j}eA) this gives (Vi)(Vx)[({{x},i}eX)T(A) ++ ({{x},i}e'B)T(B)] and since Y(A) & Y(B) are both independent of x & i, if either xS.A or XE.B is ever true, then 1(A) and T(B) are always true. (If neither xe.A nor xeiB is ever true A=B=0). Then for A & B $ 0

-9 - (Vi) (Vx) [({{xi}eA) (- ({{x},i}eB)] which makes A = B by set equivalence. Then since T(A) -++ (B), (VaeA) (j ) ({{a},j}eA) +-A (VaeB) (j) ({{a},j} eA) by substitution of A for B, which reduces to (Va)(aeA — + aeB) and gives A=B by set equivalence. Since A=B & A=B implies A=B we have (Vi) (Vx) (x~.A -++ x~.B) -A=B. From (1) and (2) A=B -+ (Vi) (Vx)(x~iA ++ x~.B) Q.E.D. III. EXTENSION OF SET OPERATIONS TO COMPLEXES With Theorem 4 it is now possible to extend the usual set operations to include complexes. It should be noted that the following are defined for all complexes, and when the complexes are sets the usual interpretation is preserved. For any complexes A and B, define A is a subset of B as: DEFINITION 6. AcB -+ (Vi) (Vx) (xi.A -+ x~.B) and for A a proper subset of B: DEFINITION 7. A jB -+ Ac B & AB. We now introduce a Definition Schema to make subsequent definitions less cumbersome. DEFINITION SCHEMA 1. {x:T(x,i)} = A ++ [(Vx) (VieN)(x~.A ++ Y(x,i)) & A is a complex] or [A=0 & %(3B)(Vx) (VieN) (x~.B -- (xi))].

-10 - The symbol x has no meaning apart from being enclosed by brackets. A = {a6,b8} means that ae6A and be8A not a6eA. Instead of having to write AO B = C ++ (Vi)(Vx) (xE.C -+ xe.A & xeiB) & C is a complex, this schema allows us to define the intersection of A and B as: DEFINITION 8. A B = {x:xeoA & X~oB} The union of A and B is defined as: i DEFINITION 9. AuB = {x:xoA v xe B} The symmetric difference of A and B: DEFINITION 10. A B = {x: (xe Au B) (x. An B)} The relative complement of A and B: i DEFINITION 11o AB = {x:xe.A ^ xg B} The next two definitions are unary union and unary intersection defined on any complex A. However, the result is non-empty only if A contains a non-empty complex. DEFINITION 12. UA = {x:(3j)(3B)(Be.A & xeiB)} DEFINITION 13. nA = {x: (j)(VB)(Be.A + xeoB). This completes the extension of principal set operations to complexes

IV. ORDERED PAIR DEFINED BY A COMPLEX The next two definitions have no previous equivalent in set theory. For a complex A they are respectively the maximum dimension of A and the minimum dimension of A DEFINITION 14. 3 (A) = i ++ (ieN) [ (Ix) (xiA) (Vj) (Vy) (ye.A + j~i)]. DEFINITION 15. a (A) = i ++ (ieN)[(3x)(x i A)(Vj)(Vy) (yc.A - jzi)]. The maximum (minimum) dimension of a complex A, if defined, is the largest (smallest) integer contained in any element of. The maximum (minimum) dimension is not defined for A if a (A)=O (9 (A)=O) which occurs when A=0 or when there are infinitely many different integers in elements of A g as when A=N and A={xi:(xeN)(i=x)}. Notice that a non-empty complex A is a set iff +(A)=l. We are now in a position to define ordered pair. DEFINITION 16. <x,y> = x,y,y};{{{x,l},{{y},2}}) An ordered pair <a,b> is a complex defined by the sets {a,b} and {{{a},l}, {{b},2}}. Using the Definition Schema 1, the complex <a,b> can be written as {alb2}. Certain differences result between this definition of ordered pair and ordered pair due to Kuratowski (see [2], p. 32) which is defined: <a,b>k = {{a},{a,b}}. The intersection of <a,b>k -11 -

-12 - & <x,y>k is non-empty iff a=x, while <a,b>f<x,y> is nonempty iff a=x or b=y. U<x,y>k = {x,y} while U<a,b> = 0 if a and b are not complexes. C<x,y>k = {x} while n<a,b> = 0 if a and b are not complexes. The most interesting difference between the two definitions is demonstrated by the following: (<a,b>U <x,y>) n <x,b> = <x,b> while (<a,b>kU <xy>'k) <x,b>k = 0 For any definition of ordered pair the following theorem must be trueO THEOREM 5. <x,y> = <a,b> - x=a & y=b PROOF. This follows immediately from Theorem 4 and the fact that <x,y> & <a,b> are the complexes ({x,y};{{{x},l},{{y},2}}) and ({a,b};{{{a},l}, {{b},2}}) respectivelyo The following definition is the usual definition of a binary relation as a set of ordered pairs. DEFINITION 17. A is a binary relation ++ (Vx)[xeA + (3y) (1z) (x=<y, z>)] For notational convenience let xAy be equivalent to <x,y>eA then the usual definitions of domain, range, and converse of a binary relation are:

-13 - DEFINITION 18. D(A) = {x: (y)(xAy)}. DEFINITION 19. R(A) = {y: (Ix)(xAy)}. DEFINITION 20. A = {<y,x>:xAy}. Given any binary relation A, A is a subset of {<x,y>:xe P(A) & ye R(A)}. In other words, a binary relation is, by definition, a relation between sets. This presents a problem, if a function f is defined in the usual way as a binary relation such that (Vx,y,z)( xfy & xfz + y=z), then this definition only allows functions between sets, and not functions between complexes. A more general definition for function will be given to allow functions between complexes. V. FUNCTIONS DEFINITION 21. f is a function ++ (3A) ( B)(fc{{x,y}: xecA & yEj B}) and (Vx,y,z) (Vi,j,k) [({x, y }ef) i k ({xz ef) (i<j) (i<k) + (k=j)(yz)]. This definition incorporates the usual definition of function, since when A and B are sets f is a binary relation; but unlike the usual definition of a function, a function will not always be a relation. This will be clearer after a general definition of relation has been given (see Definition 33). An example of a function defined between two complexes is given by: A {a6,b3,c7} and B = {x3,y4,z6} with f = {{a6,x9},

-14 - {b3,y7, {c7,x }} An important type of function is a function that is one-to-one. DEFINITION 22. f is one-to-one ++ f is a function and ii h k (Vx,y,z) (Vi,j,k,h)[({x,ye f) ({x,z ef)(j<i) (k<h) (i-j = h-k) -+ (i=h)(y=z)] Notice that in the last example f is not one-to-one since f maps xE3B to a~6A and to c~ A In the following, if f is one-to-one it will be written as fl1 VI. DEVELOPMENT OF AN N-TUPLE Though a general definition of domain and range will be given later for an arbitrary complex, temporary definitions are needed now which will be equivalent to the general definitions when the complex is a function: D(f) = {x':(3j)(3y)(i<j)({x,y3}ef)} and R(f) = {x: (ij) (y) (3h) (i=h-j) (j<h) ({x,y}ef)} Using these temporary definitions we can define a finite complex A and its cardinality #A, (let Z=N{0}) DEFINITION 23. A complex A is finite and #A=n -+ [(n=O & A=0)v(3fl -) (p(f)=A) (R(f)=N(n) ) ]

-15 - The cardinality of a finite complex is just the number of elements contained in the complex. If a non-empty complex A cannot be put in a one-to-one correspondence with a finite subset of N then #AOZ. Therefore, for any A, #AeZ asserts that A is a finite complex and there exists an n such that #A=n. The degree of a complex A is the number of different integers that appear in the elements of A DEFINITION 24. 6(A) = #(uAON) For any set A the degree of A is 1, as is stated in the following theorem. THEOREM 6. If A is a set then 6(A)=l. PROOF. A is a set - A={{{x},l}:xeA} by Theorem 2, then 6 (A)=# (u XN)=# (UxeA{ {x}, l}nN)=#{1}=l Q.E.D. This holds even if A is the empty set since 0=(0;{{0,1}}) then 6(0)=#(U{{0,1}}1N)=#{l}=l. Four components associated with all complexes, compact, uniform, simple, and normal are so closely related that they will be defined en maSte. DEFINITION 25. For any complex A 1. A is compact ++ 6(A) =a(A)+1-a (A) 2. A is uniform ++ (Vx)(Vy)(Vi)(xe.A & yE.A + x=y) 3. A is simple ++ 6(A)=1. 4. A is normal ++ 6(A)=3 (A)

-16 - The following theorems, though important, are trivially proved. THEOREM 7. A is simple -+ A is compact. THEOREM 8. A is simple and normal -+ A is a set. THEOREM 9. A is normal + A is compact. THEOREM 10. A is uniform ++ #(A)=6(A). With these definitions it is possible to embark on part 1 and 2 of our objective, namely to define the concept of n-tuple and to establish that with this definition set operations between n-tuples are meaningful. A preview of this was presented while comparing definitions of an ordered pair, for an ordered pair will turn out to be an n-tuple of degree 2. DEFINITION 26. A is an n-tuple ++ A is a uniform, normal complex with finite-degree n The following theorem is a direct consequence of the previous definition. THEOREM 11. A is an n-tuple -+ (A) =6(A)=#A PROOF. A is normal ++ 6(A)=+ (A) by Definition 25-4 A is uniform ++ #A=6(A) by Theorem 10 #A=6(A) +<- 6(A)eN by Definition 23; then by Definition 26. A is an n-tuple -+ a (A)=6(A)=#A.

-17 - Let X={x,...,x } represent an indexed set of n elements x., where ieN(n), then: DEFINITION 27. <xl,..,x> = (X;{{{xi},i}: (Vi)(xieX) ' ' n ' 'i 1 (ieN(n)}) This gives a definition for the usual form of an n-tuple. It remains to be shown, however, that this does in fact represent an n-tuple as defined by Definition 26. THEOREM 12. <x,x,o..,x > is an n-tuple. PROOF. Let A = <xl,x2,...,x > then by Definition 27 A = (X;{{{x.},i}:(Vi)(xieX)(ieN(n))}) where X = {x.:ieN(n)} then a (A)=n by Definition 14 and 6(A)=n by Definition 24 and #A=n by Definition 23 therefore 3 (A) = 6(A) = #A and by Theorem 11, A is an n-tuple. Q.E.D. An analogous theorem to Theorem 4 can now be proved for any two n-tuples. THEOREM 13. <a,...,a > = <b,..,b > (VieN(n)) n I.1 P n (a.=b.) PROOF. Let A = <al,...,a > and B = <bl,...,b > 1 2 then A = {a,a2,...,a } and B = bl,b2,...,bn} 1' 2 n

-18 - by definition of A, (VieN(n))(x~.A -+ x=a ) by definition of B, (VieN(n))(xe~B ++ x=bi) and since A=B + (VieN(n))(xe.A -++- xeB) then A=B + (VieN(n))(x~.A -++ x~.B)(x~oA ++ x=ai) (x~ B ++ x=b ) 1 i i which reduces to A=B - (VieN(n))(x=a.o + x=b ) which gives A=B + (VieN(n))(ai=bi). Q.E.D. At this point we can demonstrate that union, intersection, and symmetric difference are meaningful between n-tuples. The following can be easily verified by the reader. 1 <a,b,c> n <x,b,y> = {b2} 2. <a,b,c> <x,y> = {a,x1,b2,y2c 3} 3. {a,b,c} <a,x,y> = <a> = {a1} = {a} 4. ULa' b 2 x' c3}3 {y2, d4}4} = {x1c3}U{y2,d4} <x,y,c,d> 5S <a,b,z>&<a,y,c>A<x,b,c> = <x,y,z> 6, <a,b,c,d> % <x,y,c,d> = <a,b> VII. RECONSTITUTED DEFINITION OF RELATION Two familiar concepts of set theory have yet to be established under the definition of a complex. These are the Cartesian product and relation. To develop these definitions we need to introduce the expansion of a complex A DEFINITION 28. A = {{x }: X~oA} 1

-19 - For any complex A, A is a set and #A = #A. Notice that UA=A. The following definition of concatenation between any two complexes will be most useful. DEFINITION 29. A*B = AU{xJ:(3i) (j=i+9 (A))(x~iB)}. This amounts to the union of A with B, after every element in B has been "shifted right" by a (A). For example, {a3,b6}*{c2d3} = {a3,b6,cd9}. When all the elements of two complexes are mutually concatenated it is called the product of A and B DEFINITION 30. A.B = {x:(ae.A) (3b~.B) (a0#b)(x=a*b)} 1 1 The previous definitions now allow the definition of Cartesian product between two complexes A and B DEFINITION 31. A x B = A ~ B Notice that for three non-empty sets A,B,C we can define any of three canonical forms of sets of triples: (AxB)xC = {<<a,b>,c>: (aeA) (beB) (ceC)} Ax(BxC) {<a,<b,c>>:(aeA)(beB)(ceC)} A.Boe = {<a,b,c>:(aeA)(beB) (ceC)} Notice, Cartesian product is not associative while product is associative. There is a strong similarity between product and relative product, which will be defined subsequently. The next step on the way to defining relation is the definition of the product space of a complex A

-20 - DEFINITION 32. Let C (A) = C2(A) = C (A) A 2 1 (VneN) C n 1(A) = Cn(A)A then I(A) UieNCi(A) II(A) is the product space of A and contains all n-tuples that can be constructed from A. Using the concept of product space we can now define any relation R DEFINITION 33. R is a relation ++ (IA) (neN) (Rc11(A)) (Vx) (xeR + 6(x) n) Since, for any complex A, A is a set and since an infinite union of sets is a set, then for any complex A, II(A) is a set; and since any subset of a set is a set we have that any relation is just a set of finite degree n-tuples;. If the ntuples all have the same degree then the relation is called homogeneous DEFINITION 34. If R is a relation, then R is homogeneous in degree n e+ (Vx) (xeR + 6(x)=n) THEOREM 14. R is a binary relation ++ R is a homogeneous relation in degree 2. PROOF. This is a direct consequence of Definitions 17 and 34, since for all a and b 6(<a,b>) = 2

-21 - With a general concept of relation defined we can expand some definitions, that were previously restricted to binary relations, to apply for general relations. For a relation A the i-th domain of A,.i(A), and the i-th range of A, Ri(A), will be respectively: DEFINITION 35. A is a relation - Do(A) = {x:(3aeA)(xE.a)} and Ri(A) = {x: (aaeA) (j=6(a)+l-i)(xeja) } A convention that will be used in this paper is that D(A) and R(A) will be used in place of D1(A) and R1(A). It should be noted that when A is a binary relation, Definition 35 is compatible with Definitions 18 and 19. Further exploration into expanded binary-relation properties to general-relation properties will be given at the end of this paper. VIII. TAU-ORDERING In order to complete our goal and establish the allowable-sets*, in an STDS, as those complexes which can be defined over 3 and whose union is well-ordered, we must introduce some concepts peculiar to the new definition of n-tuple. Repeated unary union on a complex A will be represented by UnA where U A=A and Un+lA(UnA) for neN. The rank of a complex can then be defined by: * It will turn out that more than just sets are allowable, namely all totally finite complexes defined over.

-22 - DEFINITION 36. L(A)=n ++ (neN) (U nlA0) (UnA=0) The rank of a volved in the example, let A let B1=A, defined since Two of a complex B, p (B). n complex A is the greatest depth of nesting incomplex, if the complex has finite depth. For A = {a6,{b2,{a2}5}7} then I(A)=3; and for any B2=B, B1 =B then for C=U B, (C) is unt(C)QN. closely related concepts are the n-th projection A, P (A), and the n-th component of an n-tuple n DEFINITION 37. P (A) = {x:x~ A} DEFINITION 38. p (B) = x -+ P (B) = {x} n n For example: p3(<a,b,c,d>) = c and P6({a6,b3,{a1}4,c6}) = {a,c}. The concept of defined over has been alluded to but has not yet been formally defined. For any complex A let: 1 (A) = {x: (li) (xA) (y)(Vj)(y4jx)) and An (A) = Xn(A)uX(CUn' A) n+l n then: DEFINITION 39. A complex A is defined over the set B ++- (3neN) (t(A)=n) (X (A)c) n This says that every element of A that is not a complex is an element of g, and that every element of A that is a

-23 - complex is itself defined over B. Sometimes the word "generated by" may be used instead of "defined over." What we want to show is that for a given set B, any complex defined over X will be an "allowable-set." This means that for any two complexes defined over B the union of these two complexes must be well-ordered. This in turn necessitates a relation to establish the well-ordering by. To do this we introduce the definition of, where < is the usual less than relation, and we refer to Q being induced on a set A by < o DEFINITION 40. For any set A and function f such that fl- l,(f)=A and R(f)cN then f induces ~ on A ++ (Va,beA)(a b -+ f(a) < f(b)) Then Q well-orders A since < well-orders any subset of N and since f is one-to-one and R(f) is a subset of N Let us now define < for n-tuples a & w defined "directly" over Z (ioeO, (VieN(6()))(p (a)eZ)). DEFINITION 41. For any two n-tuples a & w such that aell(Z) & well(Z), then a < O *- (3ieN)[(p (a) < Pi (3))v(Pi(ag)=0Pi())] (VjeN)(j < i + Pi(a) =Pi (c)) For example: <1,3,2,4> < <1,4,0,3> < <2,1,6> < <3>. Notice that < is defined on Z and on I1(Z) but not on ZUEl(Z) We can extend < further to subsets of 11(Z)

-24 - DEFINITION 42. If B,CCI (Z) then B < C ++ (CbeBrC) (VaeBuC) (a < b -+ aeC) This establishes a well-ordering on the set of all subsets (the power set) of II(Z),P(II(Z)). Now if a one-to-one mapping can be found from the set of all complexes defined over B to P(I(Z)), then the set of all complexes defined over 3 would be well-ordered, which would assure the feasibility of an STDS under the conditions previously stated. In order to establish this mapping let us define the reduction of a complex by i. i -< k DEFINITION 43. A {x:(j=k+i)(x~.A)} For example, if A = {a3,{a2,b 2}6x8 then A = {{a2,b2}3,x5} We can now define the collected cardinality of a complex A, i(A), recursively by: DEFINITION 44. t (A) = #x:(xE A)(j ) (7y x)} i(A) = 0 (Vj) (7y~jA) i(A) = t (A) +. j ac.A (a) A complex A is totally finite if C(A)eZ, which means that for any complex A,A has no infinite constituents. As a notational convenience, for any complex A let NA = N(n)-+ n=#A o Then NA is defined only if #AeN. We will now define, for any totally finite complex Q defined over 3, the tau-ordering of Q by o

-25 - DEFINITION 45. For any totally finite complex Q defined over., and f such that fl-1,(f)=8, and R(f)=NB then Q is tau-ordered by 6 if there i exists a set T(Q)={TQ(x ):xeiQ}, such that: Q a. (VieN)(VaE:Q) (keNQ) (VOl4 Q) 1. ae TQ (a') = {<l,k,i,O,f(a)>} 2. aO3 + T (a ) = {x: (b) (j)(be a)(3yeT (bJ)) Q J! a (x = <pl(y)+l,k,i>*y)} 3. TQ() 0 b. (Va) (Vb)(Vx)(Vy)[xeTQ(a ) & yeTQ(bJ) + (p (x) = 2(y) ++ {{a},i} = {{b},j}) For any totally finite complex Q defined on B, each element of T(Q) is a set of n-tuples defined over Z. For each ntuple <x.0.,x > x is always zero, x is an integer i'..n- n n-l n corresponding to an element in S, and xl is the depth of nesting of this element with n = 2x1+3. For all even i such that 2c i n - 3 x. is the relative order extablished by the remainder of the n-tuple. For i odd and 3s-i=n - 2 x. is the position in the complex of the element represented by the remainder of the n-tuple. T is a one-to-one mapping of Q into P(n(Z)), defined by T(Q)={TQ(x ):xciQ}, and the unary union of tau UT maps the constituents of Q onto K where K is a subset of II(Z) such that #K=S(Q), and (VxeK)(6(x)_ 2L(Q)+3). What we would like to show is that T well-orders a complex, and for any two complexes T

-26 -preserves the well-ordering on their union. T does not quite do this under <, for we have included in the first two components of every element of K, the depth of nesting and the relative order of the B constituent in Q. For example: let Q = {a3,b2,{a6,b 2,c}3, abe7}}2} where a,b,c,e are in g, then ~(Q)=8 and /(Q)=3. For this example leave a,b,c, and e in place of f(a),f(b), f(c), and f(e) for ease of tracing the example: then T(Q)={T (xi):xiEQ} and UT(Q) = { <l,l,2,0,b>, <3,2,2,1,1,1,1,0,a>, <3,2,2,1,1,2,1,0,b>, <3,2,2,1,1,3,7,0,e>, <1,3,3,0,a>, <2,4,3, 1, 1,0,>, <2,4,3,21,,0,b>, <2,4,3,3,6,0,a> } For all xeUT(Q),p (x) is the depth of nesting of the element of 3 and for i=(2pl(x)+3) then f(pi(x))e3 (in the case of this example p.(x)e1). In all cases pi-l(x)=0, this is so that the order on B will not influence the tau-ordering, unless all the previous components are the same. It remains to show that the tau-ordering for all totally finite complexes defined on B does, in fact, establish the appropriate well-ordering conditions under some relation.

-27 - Since 0 is defined only for B we can extend Q to apply to subsets of H(Z), as < was extended. DEFINITION 46. For a & w eH(Z) such that 6(a))5 & 6(w)_5, then: 2 2 aC -W -a C — < W Notice in the previous example the elements of UT(Q) are well-ordered by Q. The following definition is analogous to Definition 42. DEFINITION 47. For A,BclI(Z) such that (Vx e A,B) (6(x) t 5) then: A ~ B ++ — (IbeAB) (VaeAuB) (a ~ b -+ aeB) For any totally finite complexes A and B defined over g the elements of T(A) and r(B) satisfy conditions of Definition 47, since T is a one-to-one mapping and since AuB is defined over P, our conditions of well-ordering are established by the following theorem. THEOREM 15. For all totally finite complexes defined over 5 and for Q defined on S then, (Vi,j) (Vx,y) (xiA) (yEjB)TA(x ) TB(y ) TATB(X) 0 TAuB (y) PROOF. This follows from the fact that ~ is independent of pl(w) and p2(w) for all eTD(z ), for any zEkD: TA(xi) @ T(yj) +

-28 - 2 2 {w: (JaeTA(x )) ( w=a)} < {w: (laeTB (y) (o=a) by Definition 47, 46, and 42- Then for all 2 2 weTA (x), 3peTAB (x) such that w=p; and for all weT A B(x) (IpeTA(x ) such that 22 A s-p. Therefore (VX)(TA(x ) OX + TAUB(x ) X) Similarly, (VX)(X O TB(yj) + X O TA(yJ )) Therefore, TA (x) TB(yj) - TAuB (x) TA (y ).Q.E.D. The relation 0 defined on B permits all totally finite complexes defined over B as allowable sets in an STDS. This allows any finite nesting of imaginable combinations of finite complexes in an STDS, as long as the storage capacity is not violatedo In particular, sets of variable length n-tuples are allowed where the components may be: complexes, elements of, or themselves n-tuple;is IX. EXTENDED OPERATIONS FOR RELATIONS Sets of n-tuples are a natural means for expressing arbitrarily related data. Full utilization of this feature can not be realized by relying on properties defined solely for sets of order-pairs as: domain, range, image, relative product, restriction, etc. Therefore, these properties need to be extended to sets of n-tuples in such a way as: (i) to preserve their original meaning for binary relations, and

-29 - (ii) when applied to sets of n-tuples to give directly the same result usually achieved through chain applications of binary-oriented operations to nested ordered pairs. Two such definitions have already been given, for i-th domain and i-th range (see Definition 35). First we can define weakly restricted and strongly restricted respectively: DEFINITION 48. For any relation R and any complex B RIB = {x:(xeR)(l (xnB)- 6(B)) } and R||B = {x:(xeR)(6(XAB) = 6(B))} For example, let R={<a,b,c>,<a,x>,<b,y,x>} and let B = {a, 3}, then RIB = {<a,b,c>,<a,x>} and R|IB = {<a,b,c>} Notice that for a binary relation R and a set B: RIB = IJB = Rr(BxR(R)), given the usual definition of restriction for binary relation. For any relation R define the converse of R, R: DEFINITION 49. R is a relation { R = {x: (yeR)(VieN(6(y))) (j=6(y)+l-i)(pi(x)=pj (y))} Then for the image of B and the converse image of B under the relation R, respectively:

-30 - DEFINITION 50. R[B] = {x: (fj,y,w)(weR) (y. wr\B) (xe jw)} and [B]R = {x: ([j,y,w) (weR) (yej.wB) (xe w)} If R = {<a,b,c,d>,<x,y,z>}, B = {b2,c2,y2},then R[B] = {c,z}, R[B] ={b,x}, [B]R = {a,x}, [B]R = {d,z}. Again these propertie are consistent with the usual definitions when R is a binary relation and B is a set. The next extended definition will be that of the i-th relative product of two relations A and B. DEFINITION 51. If A and B are relations then, A/iB = {x:( IaeA) (IbeB) (k=6(a)-i)(0<k) (0<6(b)-i) (h=6(a)+6(b)-2i)(VjeN(i))[(pk+ (a)=pj(b)) + (VmeN(h))(m= k -+ p(x) = p (a))(k<m - pm(x) = Pmk(b))] As an example of the last definition let A = {<a,b,c>,<,x,y,z>, <a,d,b,c>} and let B = {<b,c,f>,<b,c,p,q>,<y,z,m,n>} then A/2B = {<a,f>,<a,p,q>,<o,x,m,n>,<a,d,f>,<a,d,p,q>}. When A and B are binary relations and i=l then A/1B gives the usual interpretation of relative product. For any complex A the i-th non-empty position of A is pi(A), for li-i6(A) DEFINITION 52. For any complex A pi(A) = n + (i=l & n=- (A)) or (l<i)(ieN(6(A))) (j=pi_(A)) (x=A + n=j+3 (x))

-31 - With this definition we can define the i-th domain and the i-th range and the converse of any arbitrary complex. DEFINITION 53. The i-th domain of A is D (A) = x:y) (3j) (yEA i 6(y))(k=i (y)) ki())(x ky) (h=k-i _ (y))} DEFINITION 54. The converse of A is A = {y:(3x) (xeiA) ((x)eN) (j=3 (x)+a (x)) (VkeN) (3 (x)))(Pk(y) = Pk-j))} Where Pk(y) is given by Definition 37. The i-th range of A is just the i-th domain of the converse of A:Ri(A)=Di(A). These are just a few of the definitions possible for complexes; more imaginative constructions can be developed as the need arises. X. EXAMPLES Since this paper is a claim for the feasibility of an STDS, an implemented example is in order. Two examples were run on an IBM 7090. The times may or may not be characteristic of the potential speeds in an STDS. With just two examples no claims are made other than that two examples were run with the following results: PROBLEM 1o Given a population of 24,000 people and a file F containing a ten-tuple for each person such that each ten-tuple is of the form <age, sex, marital

-32 - status, race, political affiliation, mother tongue, employment status, family size, highest school grade completed, type of dwelling>, the following four questions were asked: ao Find the number of married females: #(Fll{f2,m3}) Answer: 6,015 Time: 050 seconds b. Find the number of people of Spanish race whose mother tongue is not Spanish. Answer: 1,352 Time: 0o48 seconds c. Find the number of people aged 93 or 94. Answer: 46 Time: 0.73 seconds do Find the number of males and unmarried femaleso Answer: 17,985 Time: 0,55 seconds eo Find the number of males between the ages of 20 and 40. Answer: 588 Time: 0o62 seconds PROBLEM 2, Given a population of 1000 people and given two collections, A and B, of subsets from this

-33 - population such that: A contains 20 sets of 500 people, and B contains 500 sets of 20 people. Find the set of people belonging to some set in A, to all sets in A, and to an odd number of sets in A; and similarly for B Results a. people in some set b. people in all sets c. people in odd number of sets A-Times B-Times a. 0.73 sec 0.76 sec bo 0.48 sec 0.05 sec c. 0.76 sec 0.78 sec A point to notice is that where every element has to be accessed, as in (a) and (c), the times are dependent on the total number of elements included (i(A) = S(B) = 10,000) and not the number of sets involved (20 for A and 500 for B). XI. CONCLUSION In conclusion, the existence of a well-ordering relation satisfying previously stated criteria establishes that an STDS can be implemented having reasonably fast times for a given collection of subroutines. However, this does not mean

-34 -that the well-ordering relation developed here is the only one that can be used in every case, but other orderings may be more suitable for a particular given problem.

DEFINED SYMBOLS N (A;B) Q IA 1 A =B ~ {x': V(x,i)} AnB AuB AAB AB A O A + (A) a (A) <x,y> fl-1 #A Z 6(A) <X... A A*B A- B AXB 1H(A) (A) P (A) P (A) Page 4 Page 5 Page 5 Page 5 Page 6 Page 6 Page 9 Page 9 Page 9 Page 10 Page 10 Page 10 Page 10 Page 10 Page 10 Page 11 Page 11 Page 11 Page 14 Page 14 Page 14 Page 15 Page 17 Page 18 Page 19 Page 19 Page 19 Page 20 Page 22 Page 22 Page 22 Natural Numbers Complex First Set of a Complex Second Set of a Complex Identity Set Complex Containment Subcomplex Proper Subcomplex Schema for a Complex Complex Intersection Complex Union Complex Symmetric Difference Relative Complement Unary Union Unary Intersection Maximum Dimension Minimum Dimension Ordered Pair One-to-One Function Cardinality Non-Negative Integers Degree of a Complex N-Tuple Expansion of a Complex Concatenation Product Cartesian Product Product Space Rank N-th Projection N-th Component -35 -

DEFINED SYMBOLS (cont'd) i A Page 24 Reduction by i i(A) Page 24 Collected Cardinality RIB Page 29 Weak Restriction R||B Page 29 Strong Restriction R Page 29 Converse R[B] Page 30 Image A/iB Page 30 i-th Relative Product U (A) Page 30 i-th Non-Empty Position Di(A) Page 31 i-th Domain R (A) Page 31 i-th Range 1 -36 -

REFERENCES 1. Childs, D.L., Description of a Set-Theoretic Data Structure, Technical Report 3, Concomp Project, University of Michigan, March 1968. 2. Suppes, P., Axiomatic Set Thoery, Van Nostrand, Princeton, 1960. -37 -

Incl asci fi ed -39 - Security Classification,, I, 1 -... DOCUMENT CONTROL DATA - R&D (Security classification of title, body of abstract and indexing annotation must be entered when the overall report is classified) I 1. ORIGINATIN G ACTIVITY (Corporate author) g-a. REPORT SECURITY C LASSIFICATION THE UNIVERSITY OF MICHIGAN Unclassified CONCOMP PROJECT 2b GROUP 3. REPORT TITLE FEASIBILITY OF A SET-THEORETIC DATA STRUCTURE A General Structure Based on a Reconstituted Definition of Relation 4. DESCRIPTIVE NOTES (Type of report and inclusive dates) TPhni iral Report 6.. 5. AUTHOR(S) (Last name, first name, initial) Childs, David L. 6q REPORT DATE 7f. TOTAL NO. OF PAGES 7b. NO. OF REFS -August 1968 13 2 8a. CONTRACT OR GRANT NO. 9a. ORIGINATOR'S REPORT NUMBER(S) DA-49-083 OSA-3050 b. PROJECT NO. Technical Report 6 c. 9b. OTHER REPORT NO(S) (Any other numbers that may be assiSned this report) d. 10. A VA IL ABILITY/LIMITATION NOTICES Qualified requesters may obtain copies of this report from DDC. 11. SUPPLEMENTARY NOTES 12. SPONSORING MILITARY ACTIVITY Advanced Research Projects Agency I 13. ABSTRACT. STRACThis aper is motivated by an assumption that many problems dealing with arbitrarily related data can be expedited on a digital computer by a storage structure which allows rapid execution of operations within and between sets of datum names. In order for such a structure to be feasible, two problems must be considered: (1) the structure should be general enough that the sets involved may be unrestricted, thus allowing sets of sets of sets...; sets of ordered pairs, ordered triples...; sets of variable length n-tuples, n-tuples of arbitrary sets; etc.; (2) the set-operations should be general in nature, allowing any of the usual set theory operations between sets as described above, with the assurance that these operations will be executed rapidly. A sufficient condition for the latter is the existence of a well-ordering relation on the union of the participating sets. These problems are resolved in this paper with the introduction of the concept of a 'complex' which has an additional feature of allowing a natural extension of properties of binary relations to properties of general relations. DD JAN 64 1473 Unclassified Security Classification

Unclassified -40 - Security Classification 14. LINK A LINK B LINK C KEY WORDS ____________________.________________WORDS__ ROLE WT ROLE WT ROLE WT Set-Theoretic Data Structure Arbitrarily Related Data Sets Set Theory Information Retrieval N-tuples Complexes Relations Tau-ordering Extended Set Operations INSTRUCTIONS * 1. ORIGINATING ACTIVITY: Enter the name and address of the contractor, subcontractor, grantee, Department of Defense activity or other organization (corporate author) issuing the report. 2a. REPORT SECURITY CLASSIFICATION: Enter the overall security classification of the report. Indicate whether "Restricted Data" is included. Marking is to be in accordance with appropriate security regulations. 2b. GROUP: Automatic downgrading is specified in DoD Directive 5200.10 and Armed Forces Industrial Manual. Enter the group number. Also, when applicable, show that optional markings have been used for Group 3 and Group 4 as authorized. 3. REPORT TITLE: Enter the complete report title in all capital letters. Titles in all cases should be unclassified. If a meaningful title cannot be selected without classification, show title classification in all capitals in parenthesis immediately following the title. 4. DESCRIPTIVE NOTES: If appropriate, enter the type of report, e.g., interim, progress, summary, annual, or final. Give the inclusive dates when a specific reporting period is covered. 5. AUTHOR(S): Enter the name(s) of author(s) as shown on or in the report. Enter last name, first name, middle initial. If military, show rank and branch of service. The name of the principal author is an absolute minimum requirement. 6. REPORT DATE: Enter the date of the report as day, month, year; or month, year. If more than one date appears on the report, use date of publication. 7a. TOTAL NUMBER OF PAGES: The total page count should follow normal pagination procedures, i.e., enter the number of pages containing information 7b. NUMBER OF REFERENCES: Enter the total number of references cited in the report. 8a. CONTRACT OR GRANT NUMBER: If appropriate, enter the applicable number of the contract or grant under which the report was written. 8b, 8c, & 8d. PROJECT NUMBER: Enter the appropriate military department identification, such as project number, subproject number, system numbers, task number, etc. 9a. ORIGINATOR'S REPORT NUMBER(S): Enter the official report number by which the document will be identified and controlled by the originating activity. This number must be unique to this report. 9b. OTHER REPORT NUMBER(S): If the report has been assigned any other report numbers (either by the originator or by the sponsor), also enter this number(s). 10. AVAILABILITY/LIMITATION NOTICES: Enter any limitations on further dissemination of the report, other than those imposed by security classification, using standard statements such as: (1) "Qualified requesters may obtain copies of this report from DDC." (2) "Foreign announcement and dissemination of this report by DDC is not authorized." (3) "U. S. Government agencies may obtain copies of this report directly from DDC. Other qualified DDC users shall request through (4) "U. S. military agencies may obtain copies of this report directly from DDC. Other qualified users shall request through i, (5) "All distribution of this report is controlled. Qualified DDC users shall request through If the report has been furnished to the Office of Technical Services, Department of Commerce, for sale to the public, indicate this fact and enter the price, if known. 1L SUPPLEMENTARY NOTES: Use for additional explanatory notes. 12. SPONSORING MILITARY ACTIVITY: Enter the name of the departmental project office or laboratory sponsoring (paying for) the research and development. Include address. 13. ABSTRACT: Enter an abstract giving a brief and factual summary of the document indicative of the report, even though it may also appear elsewhere in the body of the technical report. If additional space is required, a continuation sheet shall' be attached. It is highly desirable that the abstract of classified reports be unclassified. Each paragraph of the abstract shall end with an indication of the military security classification of the information in the paragraph, represented as (TS), (S), (C), or (U). There is no limitation cn the length of the abstract. However, the suggested length is from 150 to 225 Words. 14. KEY WORDS: Key words are technically meaningful terms or short phrases that characterize a report and may be used as index entries for cataloging the report. Key words must be selected so that no security classification is required. Identifiers, such as equipment model designation, trade name, military project code name, geographic location, may be used as key words but will be followed by an indication of technical context. The assignment of links, rules, and weights is optional. I I i GPO 886-551 11n r 1 cc -C24 ~, a A

UNIVERSITY OF MICHIGAN 3 9015 02829 5247