TH E U N I V E R S I T Y OF M I C H I GAN Technical Report 3 DESCRIPTION OF A SET-THEORETIC DATA STRUCTURE David L. Childs CONCOMP: Research in Conversational Use of Computers ORA Project 074449 F.H. Westervelt, Director supported by: DEPARTMENT OF DEFENSE ADVANCED RESEARCH PROJECTS AGENCY WASHINGTON, D.C. CONTRACT NO. DA-49-083 OSA-3050 ARPA ORDER NO. 716 administered through: OFFICE OF RESEARCH ADMINISTRATION ANN ARBOR March 1968

: '? r n UII-, I L

ABSTRACT A set-theoretic data structure (STDS) is virtually a 'floating' or pointer-free structure allowing quicker access, less storage, and greater flexibility than fixed or rigid structures that rely heavily on internal pointers or hashcoding, such as 'associative or relational structures,' 'list structures,' 'ring structures,' etc. An STDS relies on settheoretic operations to do the work usually allocated to internal pointers. A question in an STDS will be a set-theoretic expression. Each set in an STDS is completely independent of every other set, allowing modification of any set without perturbation of the rest of the structure; while fixed structures resist creation, destruction, or changes in data. An STDS is essentially a meta-structure, allowing a question to ldictateo the structure or data-flow. A question establishes which sets are to be accessed and which operations are to be performed within and between these sets. In an STDS there are as many 'structures' as there are combinations of set-theoretic operations; and the addition, deletion, or change of data has no effect on set-theoretic operations,hence no effect on the 'dictated structures.' Thus in a floating structure like an STDS the question directs the structure, instead of being subservient to it.

-1 -A set-theoretic data structure (STDS) is comprised of two parts: a collection of sets Q and a collection of setoperations S. The collection Q consists of two special sets, n and 8, plus a finite number of other sets. The sets of Q are represented by blocks of contiguous storage locations with the set n containing names of all the sets, while the set B is the set of all 'datum-names.' 8 is represented by a contiguous block of storage locations; the address of a location in the $-block is a datum-name and an element of B. The content of a location in the b-block is the address of a stored description of that datum (see Fig. 1). The contents of the n-block and the $-block are the only pointers needed for the operation of an STDS. The storage representations of the remaining sets do not contain pointers, but contain datum-names. An STDS is a 'floating' structure or a meta-structure in the sense that the set-operations S act as the structural ties instead of using internal pointers or hash-coding. The set-operations are dependent only on the set n, the set containing the names of each set. Thus for any collection 2 the set-operations are independent of: 1) the deletion or addition of datum-names, 2) any changes in datum-names, 3) the order in which the datumnames are stored, 4) the size of any set, or 5) any other modification, including the creation or deletion of sets, as long as n is kept current. Furthermore, each set in Q is completely independent of any other set in Q (Q need not be disjointed)

STORAGE SCHEMATIC "_- L..\ e. / ' Z r * * K I —, --- / BLOCKS OF DATA SET-REPRESENTATIONS DETAIL OF s-BLOCK DETAIL OF a-BLOCK b b b + 1 + 2 1n n n + 1 + 2 b + i b+#B- 1 b+# #; ontains address of lata block with datumlame ' b + i' n+ j tains address set with e 'n + jg n- # - ] n+#q B = {b,b+l,b+2,...,b+# } n = {n,n+l,n+2,...,n+#n} (b & n are the addresses of the initial locations of B & n respectively) FIGURE 1

-3 - Since each set is an entity unto itself, completely free of internal pointers, and since the set-operations S are dependent only on n, the names of these sets, an STDS is relieved from the serious rigidity and excess storage encountered in fixed structures, such as 'associative or relational structures,' 'list structures,' 'ring structures,' or any other structure relying heavily on internal pointers or hash-coding. The viability of an STDS rests on the speed and scope of the set-operations in S. The algorithms for these operations will be presented in a forthcoming paper [4]; the feasibility of the operations' being extended to sets of arbitrary length n-tuples is expressed in another paper [3] which was submitted to IFIP Congress '68. The present paper presents the available operations along with some times experienced on an IBM 7090 (see Table 1). The set-theoretic definitions appear in Appendix I, for those who are not familiar with the definitions or are not accustomed to the notation preferred in this monograph. The following tableau presents the available set operations for constructing questions in any way compatible with the parent language.

-4 - S: THE COLLECTION OF AVAILABLE SET-OPERATIONS 1) UNION D= UN.(A,B,C) D= UN.(1,A,C) 2) INTERSECTION D= IN.(A,B,C) D= IN.(1,A,C) 3) SYMMETRIC DIFFERENCE D= SD.(A,B,C) D= SD.(1,A,C) 4) RELATIVE COMPLEMENT D= RL.(A,B,C) 5) EXACTLY N elements of A D= EX.(N,A,C) 6) DOMAIN of A D= DM.(A,C) D = {C},t It,, It It C = AUB C = UA C = A B C = OA C = A B C = AA C = A % B C = E A n C = D(A),, 7) RANGE of A D= RG.(A,C) 8) IMAGE of B under A D= IM.(A,B,C) 9) CONVERSE IMAGE of B under A D= CM.(A,B,C) 10) CONVERSE of A D= CV.(A,C) 11) RESTRICTION of A to B D= RS.(A,B,C) C = R(A) It It C = A[B] C = [B]A C = A C = A|B It

-5 - 12) RELATIVE PRODUCT of A and B D= RP.(A,B,C) D = 13) CARTESIAN PRODUCT of A and B D= XP.(A,B,C) 14) DOMAIN CONCURRENCE of A relative to B D= DC.(A,B,C) 15) RANGE CONCURRENCE of A relative to B D= RC.(A,B,C) 16) SET CONCURRENCE of A relative to B D= SC.(A,B,C) 17) CARDINALITY of A N= C.(A) N is a nun {c} C = A/B C = A x B C = 2(A:B) C = 2(A:B) C = '(A:B) N = #A nber 18) 19) 20) 21) 22) BOOLEAN OPERATIONS: Ie{0,1} I= SBS.(A,B) I = 1 iff I= EQL.(A,B) I= DSJ.(A,B) I= ELM.(A,B) I= EQV.(A,B) A is a subset of B A is equal to B A and B are disjoint A is an element of B #A is equal to #B SPECIAL CONTROL OPERATIONS....... - ~~~ ~~~ ~~~ ~~~ ~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 23) SET CONSTRUCTION X= S.(A,B,C,D,....) X = {A} 24) MODE of A N= M.(A) 25) INITIAL SETTING of A ISET.(A) A = {B,C,D,...} (see text) N e{l,2,...,8} sets A to be empty or the universe depending on the function which uses it first, see Appendix II.

-6 - 26) ACCESS DATA in A by FORMAT n D= ACC.(n,A,C) n e{1,2,3,...} D={C} C = may be a set of datum-names or a set of data; these two may be distinguished by the mode of C The operations are presented in a format compatible with MAD, and with FORTRAN if the periods are removed. In general the last parameter can be deleted from any function. This default case assigns a temporary storage block, the name of which is returned by the subroutine. For example: D=UN.(A,B) gives a name in D for the temporary storage block containing the union of A and B. Since all functions operate on just the name of a storage block representing a set, and since all functions return a name, any degree of nesting of operations within operations is allowable. Two exceptions to the above are (17) and (24) which are numbers and not storage locations. In the case of (23), if only one set is given, the set is unchanged, but the name of the set is put in X. The MODE of a set is covered in depth in an aforementioned paper [4]. It will suffice here to explain that 'mode' represents one of eight different storage configurations, each tailored to special sets and operations. The functions do not require participating sets to be of the same mode. Notice that all the operations are defined for any set though the result in some cases may always be empty as in the case of DM.(A) where A is the set of the first 10,000 integers. A forthcoming paper [2] will show that there is a

-7 - meaningful definition for relations covering arbitrary sets of variable length n-tuples without couching these relations as sets of ordered pairs. Also, the binary-relation properties (e.g., domain, image, relative product, restriction, etc.) are extended in a meaningful way to cover this extended concept of relation. These extended operations can also be implemented in an STDS [3]. Table 1 gives the results of implementing some of these operations on the IBM 7090. The four operations considered here are: unary union, unary intersection, unary symmetric difference, and 'exactly n' for n e{l,...,#G} where G is the family of ses being operated on. The number of elements in G is given by #G. All the elements of G contain the same number of elements, #A, and the size of the population which the elements of each A were chosen from is #P. It should be noted that the times in Table 1 are dependent on the total number of elements contained in the elements in G, and not the number of elements in G. In (d) through (i) the total number of elements contained in the elements of G is 10,000. While #G varies from 20 to 500,the times for UN. and SD. remain the same.

-8 - OPERATION SECONDS a) 500 b) 300 2 200 4 200 UN. (1,G) IN. (1,G) SD. (1,G) EX. (1,G) UN. (1,G) IN. (1,G) SD. (1,G) EX. (1,G).03.05.03.16 to EX.(2,G) to EX.(4,G).06.12.06.42 c) 50 10 20 UN.(C1,G) IN.(1,G) SD. (1,G) EX. (1,G).01.10.03.37 to EX. (10,G) d) 1200 e) 1000 f) 1000 10 1000 20 500 50 200 UN. (1,G) IN. (1,G) SD. (1,G) EX. (1,G) UN. (1,G) IN. (1,G) SD. (1,G) EX. (1,G) UN. (1,G) IN. (1,G) SD. (1,G) EX. (1,G) to EX. (10,G) to EX.(18,G) to EX.(20,G).73.90.76 7.89.73.48.76 10.96.75.16. 76 11.00 g) 1000 h) 1000 i) 1000 100 200 500 Table 1. 100 50 UN. (1,G) IN. (1, G) SD, (1,G) EX. (1,G) UN. (1,G) IN. (1,G) SD. (1,G) EX. (1,G) to EX.(23,G) to EX.(24,G) 1 1:.75.15.76 1.88.75.06.78 2.36.76.05.78 2.50 20 UN.(1,G) IN. (1,G) SD. (1,G) EX. (1,G) 1: EXECUTION TIMES FOR SET OPERATIONS ON THE IBM 7090

-9 - The rest of this paper will be devoted to examples demonstrating the applicability of an STDS. EXAMPLE 1 Let there be six sets: A,B,C,D,E,F,the membership lists o f six country clubs. For each male resident of Ann Arbor, let there be a datum-name in B for a data-block containing: person's name, address, phone number, credit rating, age, golf handicap, wife's name (if any), political affiliation, religious preference, and salary. The set n will contain the names of the sets, namely: A(O), B(O), C(O), D(O), E(O), F(O). This along with the collection S of set operations allows answering the following questions. 1) How many members belong to club A or B but not C? 2) Find the phone numbers of members in an odd number of clubs. 3) Get addresses of members belonging to one and only one club. 4) Get addresses and phone numbers of people not in any club. 5) Find members of A that are not also in B but who may be in C only if they are not in D, or in E if they are not in F. 6) Get the average credit rating of members belonging to exactly three clubs.

-10 - The possible questions may become ridiculously involved and may interact with any spontaneously constructed sets. For example of the latter, let X be the set of Ann Arbor males born in Ann Arbor. 7) Find the average age of members born in Ann Arbor and compare with average age of members not born in Ann Arbor. The answers to (1) through (7) formulated in an STDS are expressed below, with N and M representing real numbers, and with BB for B and NN for r. 1) N = C.(RL.(UN.(A,B),C)) ans: N 2) ACC.(1,SD.(1,NN),Q) ans: Q Format 1 gives phone numbers 3) ACC.(2,EX.(1,NN),Q) ans: Q Format 2 gives addresses 4) ACC.(3,RL.(BB,UN.(1,NN)),Q) ans: Q Format 3 gives phone numbers and addresses 5) RL.(RL.(A,B),UN.(RL.(D,C),RL.(F,E)),Q) ans: Q 6) ACC.(4,EX.(3,NN),Q) N = 0 THROUGH LOOP, FOR I = 1,1,I.G.C.(Q) LOOP N = N + Q(I) N = N/C.(Q) ans: N Format 4 gives credit rating

-11 - 7) N = M= 0 ACC.(5,X,T) THROUGH LOOP1,FOR I=1,l,I.G.C.(T) LOOP1 N = N + T(I) ACC.(5,RL.(BB,X),P) THROUGH LOOP2, FOR I=1,1,I.G.C.(P) LOOP2 M = M + P(I) N = N/C.(T) M = M/C.(P) ans: N and M are the respective average ages Format 5 gives ages EXAMPLE 2 Family lineage is easily expressed in an STDS. With just five initial relations defined over a population U, all questions concerning family ties may be expressed. Let U'be a population of people and let M = {<x,y>: y is the mother of x} F = {<x,y>: y is the father of x} S = {<x,y>: y is a sister of x} B = {<x,y>: y is a brother of x} H = {<x,y>: y is a husband of x} Let X be any subset of the population U, find 1) the set G of Grandfathers of X. G = F[(FUM)[X]] set notation IM.(F,IM.(UN. (F,M),X),G) in an STDS 2) the set GF of Grandfathers of X on the father's side. GF = F[F[X]] set notation IM.(F,IM,(F,X),GF) STDS

-12 - 3) the set GM of Grandfathers of X on the mother's side. GM = G % GF set notation RL.(G,GF,GM) STDS 4) the set GR: the grandfather relations over U. GR = (Fu M)/F set notation RP.(UN.(F,M),F,GR) STDS 5) the general relation: P = {<x,y>: y is a parent of x} P = Fu M set notation UN.(F,M,P) STDS 6) the general relation: Sibling, L. L = So B set notation UN.(S,B,L) STDS 7) the general relation: Children, C. C = M F = P set notation CV.(P,C) STDS 8) the general relation: Aunt, A. A = (P/S) u (P/B/H) set notation UN. (RP. (P,S),RP.(P,RP. (B,CV.(H))),A) STDS 9) the general relation: Wife, W. W =H set notation CV.(H,W) STDS 10) the general relation: Cousin, K. K = P/L/C set notation RP.(P,RP.(L,C),K) STDS

-13 - 11) the general relation: Half-sibling, HS. HS = P/C X (M/M nF/F) set notation RL.(RP.(CV.(C),C),IN.(RP.(M,CV.(M)), RP.(F,CV.(F))),HS) STDS 12) people in X with no brothers or sisters Q = X X V(L) set notation RL.(X,DM.(L),Q) STDS 13) find all relations of X to a set Y such that Y is equal to the image of X. Q ={A:(Aen)(Y = A[X])) set notation ISET.(Q) STDS DC.(X,NN,T) THROUGH LOOP, FOR I=l,1,I.G.C.(T) B = IM.(T(I),X) LOOP WHENEVER EQL. (Y,B).E.1, UN.(Q,S.(T(I)),Q) Many more possibilities are available and might be tried by the reader. An example of quantified questions will be found in Appendix II, which may also be of help to the reader who is familiar with associative data structures. Also of interest is a recently completed implementation of an associative data structure [1], which, while not as general as an STDS, is more general than other known implemented data structures.

APPENDIX I SET-THEORETIC DEFINITIONS Conventions The logical connectives 'and,' 'or, 'exclusive-or' are represented by 'A,' 'For all x,' 'for some x, 'for exactly n x' will be represented by 'Vx,' '3x,' 'E(n)!x.' Parentheses are used for separation, and as usual the concatenation of parentheses will represent conjunction. 'A' will be a set if and only if (a) it can be represented formally by abstraction (i.e., A={x:6(x)} where e(x) is a predicate condition specifying the allowable elements 'x5; (b) 'A' can be represented by {,} enclosing the specific elements of 'A.' Definitions The symbol 'e' means 'is an element of'; xeA reads: "x is an element of A." 1) UNION a) binary union of two sets A and B Au B = {x:(xeA)v(xeB)} b) unary union of a family G of sets UG = {x: (3AeG) (xeA) } c) indexed union of a set f(A) over the family G UAeGf(A) = {x: (AeG)(xef(A))}

I-2 2) INTERSECTION a) binary intersection of A and B An B = {x:(xeA)(xeB)} b) unary intersection of a family G /fG = {x:(VAeG)(xeA)} c) indexed intersection of f(A) over the family G AGf(A) = {x:(VAeG)(xef(A))} AeG 3) SYMMETRIC DIFFERENCE a) binary symmetric difference of A and B A A B = {x:(xeA)A(xeB)}* *even though the symbol 'a' has two different meanings, no confusion is likely b) unary symmetric difference of G AG = {x:(for an odd number of AeG)(xeA)} c) indexed symmetric difference of f(A) over G AAeGf(A) = {x:(for odd no. of AeG)(xef(A))} 4) RELATIVE COMPLEMENT A %B = {x:(xeA)(x0B)} 5) EXACTLY N! the set of elements common to exactly 'n' elements of a given set G is represented by: E G = {x:(E(n)!AeG)(xeA)} 6) DOMAIN of a set A D(A) = {x:(3y)(<x,y>eA)}* *<x,y> represents an ordered pair 7) RANGE of a set A R(A) = {y:(3x)(<x,y>eA)}

I-3 8) IMAGE of B under A A[B] = {y:(3xeB)(<x,y>eA)} 9) CONVERSE IMAGE of B under A [B]A = {x:(3yeB)(<x,y>eA)} 10) CONVERSE of A A = {<y,x>: <x,y> eA} 11) RESTRICTION of A to B AIB = {<x,y>:(<x,y>eA)(xeB)} 12) RELATIVE PRODUCT of A and B A/B = {<x,y>:(3z)(<x,z>eA)(<z,y>eB)} 13) CARTESIAN PRODUCT of A and B A x B = {<x,y>:(xeA)(yeB)} 14) DOMAIN CONCURRENCE of X relative to A D (X:A) = {B:(BeA)(XcP(B))} 15) RANGE CONCURRENCE of X relative to A (X:A) = {B:(BeA)(XcR(B))} 16) SET CONCURRENCE of X relative to A G (X:A) = {B:(BeA)(XcB)} 17) CARDINALITY of A #A = n iff there are exactly n elements in A 18) A is a SUBSET of B iff every element of A is an element of B 19) A is EQUAL to B iff A is a subset of B, and B is a subset of A 20) A and B are DISJOINT iff the intersection of A and B is empty 21) A is EQUIVALENT to B iff A and B contain the same number of elements

APPENDIX II TRANSFORMULATION OF AN ASSOCIATIVE DATA STRUCTURE If, in J.A. Feldman's paper [5], an 'attribute' represents a relation, then since any relation can be represented by a set of ordered-pairs, the formulation involving ordered triples may be abandoned in favor of sets of ordered-pairs. A correspondence may then be made between the expression A(o)=v and a set-theoretic interpretation. In Feldman's paper six questions are represented by: A(o)=?, A(?)=v,?(o)=v, A(?)=?,?(o)=?, and?(?)=v. As presented in the paper the expressions are ambiguous concerning whether 'o' and 'v' represent sets, or elements, or both. The general formulation is to assume that they are sets, and to replace 'o' and 'v' by the sets 'X' and 'Y' and to replace A by a set of relations R. If the original intention was for 'o' and 'v' to be elements, then X and Y will just be singleton sets. 'R(X)=Y' is now the general form, and generation of questions is accomplished by asserting one or two of the three sets and pondering the remaining. Just deleting one or two sets, however, does not yield a well-formed question; many interpretations may be possible. In an STDS all interpretations may be made explicit. For a sampling, each of the six questions is formulated in the most general way and then in some less general interpretations.

II-2 1) R(X) = Q Given a set of relations R and a set of elements X, find Q. The most general interpretation for Q is: Find the set of elements 'v' such that <o,v>eA a) for some AeR and some oeX Q = AeRJoexA[{o}] = {v:(3AeR)(3oeX)(oAv) Less general interpretations may be given Q by replacing quantifiers or changing their order: b) for all AeR and exactly one oeX Q = /AeRE(l)oeXA[{o}] = {v:(VAeR)(E(1)!oeX)(oAv)} c) for some oeX and all AeR Q = UoeX AeRA[ {} = {v: (3oeX)(VAeR)(oAv)} d) for all oeX and for an odd number of AeR Q = oXAAeRA[{o}] = {v: (VoeX) (AeR) (oAv) } e) for some AeR and all oeX Q = UAeRoA[{o}] = {v:(3AeR)(VoeX)(oAv)} Expressed in an STDS, these questions become: a) ISET.(Q) THROUGH LOOP, FOR I=1,1,I.G.C.(R) LOOP UN. (Q,IM.(R(I),X),Q) b) ISET.(Q) THROUGH LOOP, FOR I=1,1,I.G.C.(R) ISET.(T) THROUGH LOOP, FOR J=1,1,J.G.C.(X) LOOP IN. (Q,EX. (1,IM, (R(I),X(J)),T),Q) c) ISET.(Q) THROUGH LOOP, FOR I=1,1,I.G.C.(X) ISET.(T) THROUGH LOOP, FOR J=1,1,J.G.C.(R) LOOP UN. (,IN.(T,IM.(R(J),X(T)),T),Q)

11-3 d) ISET. (Q) THROUGH LOOP, FOR I=1,1,I.G.C.(X) ISET.(T) THROUGH LOOP, FOR J=1,1,J.G.C.(R) LOOP IN. (Q,SD.(T,IM.(R(J),X(I)),T),Q) e) ISET.(Q) THROUGH LOOP, FOR I=l,l,I.G.C.(R) ISET.(T) THROUGH LOOP, FOR J=1,1,J.G.C.(X) LOOP UN.(Q,IN.(T,IM.(R(I),X(J)),T),Q) 2) R(Q) = Y Given a set of relations R and a set of elements Y, find Q. Just the most general interpretation will be given since quantifier manipulation was demonstrated by (1). Find the set of elements 'o' such that <o,v>eA for any AeR and any veY Q = UAeR [{v}]A = {o:(3AeR)(3veY) (oAv)} gives ISET.(Q) THROUGH LOOP, FOR I=1,1,I.G.C.(R) ISET.(T) THROUGH LOOP, FOR J=, 1,J.G.C. (Y) LOOP UN.(Q,UN (T,CM.(R(I),Y(J)),T),Q) 3) Q(X) = Y Given two sets X and Y find the set of relations A such that <o,v>eA for some oeX and some veY Q = UoeX e ({<o,v>}) = {A:(3oeX) (3veY) (oAv)} gives ISET.(Q) THROUGH LOOP, FOR I=1,1,I.G.C.(X) ISET.(T) THROUGH LOOP, FOR J=1,1,J.G.C.(Y) LOOP UN.(Q,UN.(T,SC.(XP.(X(I),Y(J))),T),Q)

11-4 4) R(Qo) = Qv Given a set of relations R there is no obvious delineation of sets Q or Q, three generically different questions may be phrased, each one of which may be expressed in different degrees of generality. i) Find Q independent of Q ii) Find Qv independent of Qo iii) Find Q x Q For (i) find the set of 'o' such that for some A in R there exists a 'v' such that <o,v>eA Qo = UAeR(A) = {o:(3AeR)(oeD(A))}= {o:(3AeR)(3ve) (oAv)} =O AeRD(A) For (ii) find the set of 'v' such that for some A in R there exists an 'o' such that <o,v>eA Q = UA RR(A)= {v:(3AeR)(veR(A))}={v:(3AeR) (3oe) (oAv)} For (iii) find the set of <o,v> such that for some A in R <o,v>eA Q = U A = {<o,v>:(3AeR)(oAv)} AeR These are represented in an STDS by: i) ISET.(Q) THROUGH LOOP, FOR I=l,l,I.G.C.(R) LOOP UN.(Q,DM.(R(I)),Q) ii) ISET.(Q) THROUGH LOOP, FOR I=l,l,I.G.C.(R) LOOP UN.(Q,RG.(R(I)),Q) iii) Q = UN.(1,R,Q)

11-5 5) QA(X) = Q Given a single set X requires, as in (4), three separated formulations: i) Find QA independent of Qv ii) Find Q independent of QA iii) Find QA x Qv For (i) find set of 'A' such that for some oeX there exists a 'v' such that <o,v>eA QA = UoeXC ({o}:n) = {A:(3oeX)(oeP(A))} = {A:(3oeX)(3veS)(oAv)} For (ii) find set of 'v' such that for some oeX there exists an 'A' such that <o,v>eA Q = e Un A[{o}] = {v:(3oeX) (3Aen) (oAv)} = {v: (3oeX) (3AeC(X':r)) (veA[{o}])} For (iii) find the set of <A,v> such that for some oeX <o,v>eA Q = U U n{A}xA[{o}] = {<A,v>:(3oeX)(oAv)} oeX Ae l = {<A,v>: 3oeX) (Ae(X:n) ) (veA[{o}]) } These are expressed in an STDS as: i) DC.(X,NN,Q) ii) ISET.(Q) THROUGH LOOP, FOR I=1,1,I.G.C.(X) DC. (X(I),NN,A) (see note) THROUGH LOOP, FOR J=l,1,J.G.C.(A) LOOP UN. (Q,IM.(A(J),X(I)),Q) iii) ISET.(Q) THROUGH LOOP, FOR I=1,1,I.G.C.(X) DC. (X(I),NN,A) (see note) THROUGH LOOP, FOR J=1,1,J.G.C.(A) LOOP UN.(Q,XP.(S.(A(J)),IM.(A(J),X(I))),Q)

II-6 NOTE: Execution is minimized since C.(A) " C.(NN) and the substitution of UO Ae({} ) for eX Ueis eX e 1O({o}:n) oeX e T justified by a trivial theorem [3] which states: given X and n then (VoeX)(VAen) (Ae({o}:'n) -+A[{o}] # 0) 6) QA(Q) = Y is similar to (5).

GLOSSARY OF SYMBOLS Symbol iff Symbol Definition A v A Vx 3x E!x Ox E(n)!x 0 C An B A u B A& B A X B {x:e(x)} xAy if and only if Identity Conjunction Disjunction Exclusive or Implication (if... then) Equivalence Universal quantifier (for all) Existential quantifier (for some) Uniqueness quantifier (for exactly one) Odd quantifier (for an odd number of) Exact number quantifier Set membership Empty set Non-membership Set inclusion Intersection Union Symmetric difference Relative complement Ordered pair Definition by abstraction Ordered pair <x,y> contained in A

GLOSSARY OF SYMBOLS (cont.) Symbol Symbol Definition JG Union or sum of G nG Intersection of G AG Symmetric difference of G E G Elements contained in exactly n elements of G n AxB Cartesian product D(A) Domain of A R(A) Range of A A Converse of A A/B Relative product of A and B AIX A restricted to X A[X] Image of X under A [X]A Converse-image of X under A ZD(X:A) Domain-concurrence of X relative to A %(X:A) Range-concurrence of X relative to A e5(X:A) Set-concurrence of X relative to A

REFERENCES 1. Ash, W.L. and E.H. Sibley, TRAMP: A Relational Memory with an Associative Base, Concomp Technical Report (in preparation). 2. Childs, D.L., Reconstituted Set-Theoretic Definition for Relations, Concomp Technical Report (in preparation) 3. Childs, D.L., Feasibility of a Set-Theoretic Data Structure Based on a Reconstituted Set-Theoretic Definition for Relations, Concomp Technical Report (in preparation) 4. Childs, D.L., Implementing Set-Theoretic Operations for a Set-Theoretic Data Structure, Concomp Technical Report (in preparation) 5. Feldman, J.A., Aspects of Associative Processing, Technical Note 1965-13, M.I.T. Lincoln Laboratories, April 1965.

Security Classification I [ DOCUMENT CONTROL DATA R&O (Security clasrification of title, body of abstract and indexing annotation must be entered when the overall report is claesified) 1. ORIGINATING ACTIVITY (Corporate author) ra. REPORT SECURITY C LASSIFICATION The University of Michigan Unclassified CONCOMP Project 2b GROUP 3. REPORT TITLE DESCRIPTION OF A SET-THEORETIC DATA STRUCTURE 4. DESCRIPTIVE NOTES (Type of report and inctusive dates) Technical Report S. AUTHOR(S) (Last name, first name, initial) CHILDS, David L. 6. REPO RT DATE 7P. TOTAL NO. OF PAGES 7b. NO OF REFS March 1968 27 5 8a.. CONTRACT OR GRANT NO. 9a. ORIGINATOR'S REPORT NUMBER(SJ DA-49-083 OSA-3050 b. PROJECT NO. ~b. PROJECT NO. | Technical Report 3 c. |9b. OTHER REPORT NO(S) (Any other number that may be asigned this reportJ d. 10. AVA ILABILITY/LIMITATION NOTICES Qualified requesters may obtain copies of this report from DDC. 11. SUPPLEMENTARY NOTES 12. SPONSORING MILITARY ACTIVITY m 13. ABSTRACT A set-theoretic data structure (STDS) is virtually a 'floating' or pointer-free structure allowing quicker access, less storage, and greater flexibility than fixed or rigid structures that rely heavily on internal pointers or hash-coding, such as 'associative or relational structures,' 'list structures,' 'ring structures,' etc. An STDS relies on set-theoretic operations to do the work usually allocated to internal pointers. A question in an STDS will be a set-theoretic expression. Each set in an STDS is completely independent of every other set, allowing modification of any set without perturbation of the rest of the structure; while fixed structures resist creation, destruction, or change: in data. An STDS is essentially a meta-structure, allowing a question to 'dictate' the structure or data-flow. A question establishe: which sets are to be accessed and which operations are to be performed within and between these sets. In an STDS there are.as many 'structures' as there are combinations of set-theoretic operations; and the addition, deletion, or change of data has no effect on set-theoretic operations, hence no effect on the 'dictated structures.' Thus in a floating structure like an STDS the question directs the structure, instead of being subservient to it. ___ _ DD JAN 64 1473 Security Classification

Security Classification 14. LINK A LINK B LINK C KEY WORDS ROLE WT ROLE WT ROLE WT Associative data structure Data modification Datum-names Floating structure Information retrieval Meta-structure Pointer- free structure Quantified questions Set Set operations Set theoretic data structure _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ ~ _ _ _ _,II _ _ _ _l I INSTRUCTIONS 1. ORIGINATING ACTIVITY: Enter the name and address of the contractor, subcontractor, grantee, Department of Defense activity or other organization (corporate author) issuing the report. 2a. REPORT SECUIITY CLASSIFICATION: Enter the overall security classification of the report. Indicate whether "Restricted Data" is included. Marking is to be in accordance with appropriate security regulations. 2b. GROUP: Automatic downgrading is specified in DoD Directive 5200.10 and Armed Forces Industrial Manual. Enter the group number. Also, when applicable, show that optional markings have been used for Group 3 and Group 4 as authorized. 3. REPORT TITLE: Enter the complete report title in all capital letters. Titles in all cases should be unclassified. If a meaningful title annot be selected without classification, show title classification in all capitals in parenthesis immediately following the title. 4. DESCRIPTIVE NOTES: If appropriate, enter the type of report, e.g., interim, progress, summary, annual, or final. Give the inclusive dates when a specific reporting period is covered. 5. AUTHOR(S): Enter the name(s) of author(s) as shown on or in the report. Enter last name, first name, middle initial. If military, show rank and branch of service. The name of the principal.athor is an absolute minimum requirement. 6. REPORT DATT Enter the date of the report as day, month, year; or month, year. If more than one date appears on the report, use date of publication. 7a. TOTAL NUMBER OF PAGES: The total page count should follov normal pagination procedures, i.e., enter the number of pages containing information. 7b. NUMBER OF REFERENCES: Enter the total number of references cited in the report. 8a. CONTRACT OR GRANT NUMBER: If appropriate, enter the applicable number of the contract or grant under which the report was written. 8b, 8c, & 8d. PROJECT NUMBER: Enter the appropriate military department identification, such as project number, subproject number, system numbers, task number, etc. 9a. ORIGINATOR'S REPORT NUMBER(S): Enter the official report number by which the document will be identified and controlled by the originating activity. This number must be unique to this report. 9b. OTHER REPORT NUMBER(S): If the report has been assigned any other report numbers (either by the originator or by the sponsor), also enter this number(s). 10. AVAILABILITY/LIMITATION NOTICES: Enter any limitations on further dissemination of the report, other than those imposed by security classification, using standard statements such as: (1) "Qualified requesters may obtain copies of this report from DDC." (2) "Foreign announcement and dissemination of this report by DDC is not authorized." (3) "U. S. Government agencies may obtain copies of this report directly from DDC. Other qualified DDC users shall request through (4) "U. S. military agencies may obtain copies of this report directly from DDC. Other qualified users shall request through (5) "All distribution of this report is controlled. Qualified DDC users shall request through,t If the report has been furnished tc the Office of Technical Services, Department of Commerce, for sale to the public, indicate this fact and enter the price, if known. 11. SUPPLEMENTARY NOTES: Use for additional explanatory notes. 12. SPONSORING MILITARY ACTIVITY: Enter the name of the departmental project office or laboratory sponsoring (paying for) the research and development. Include address. 13. ABSTRACT: Enter an abstract giving a brief and factual summary of the document indicative of the report, even though it may also appear elsewhere in the body of the technical report. If additional space is required, a continuation sheet shall be attached. It is highly desirable that the abstract of classified reports be unclassified. Each paragraph of the abstract shall end with an indication of the military security classification of the information in the paragraph, represented as (TS), (S), (C), or (U). There is no limitation cn the length of the abstract. However, the suggested length is from 150 to 225 words. 14. KEY WORDS: Key words are technically meaningful terms or short phrases that characterize a report and may be used as index entries for cataloging the report. Key words must be selected so that no security classification is required. Identifiers, such as equipment model designation, trade name, military project code name, geographic location, may be used as key words but will be followed by an indication of technical context. The assignment of links, rules, and weights is optional. I j 6. GPO 886-551 Security Classification