Bureau of Business Research
Graduate School of Business Administration
University of Michigan
March, 1971

A DEDUCTIVE APPROACH TO LINEAR OPTIMIZATION

WORKING PAPER NO. 30

by
W. Allen Spivey
Professor of Statistics
University of Michigan

FOR DISCUSSION PURPOSES ONLY
None of this material is to be quoted or reproduced without the express permission of the Bureau of Business Research.

BACKGROUND OF THIS PAPER

This is a draft of two chapters of a monograph on linear and nonlinear optimization. It is intended for students in optimization courses having interests in the mathematics of optimization.

Chapter 1. Mathematical Background

1.1. Introduction. Linear optimization theory can be developed either constructively or deductively. In the former approach, mathematical concepts are introduced which do double duty: they are used to prove the existence of optimal solutions and to calculate (and characterize) such solutions. Thus the constructive approach is an economical one and largely for this reason has become standard (Dantzig [4], Spivey and Thrall [12], and others). Deductive approaches, used first in the development of the theory and still earlier in the development of the closely related theory of games, have disadvantages. They utilize mathematical concepts (such as convexity and separating hyperplanes) that are not normally found in a linear algebra course. Moreover, these concepts are needed to prove the important existence theorems, but once this task is completed, an entirely different set of mathematical concepts is needed to develop an algorithm for calculating the solutions whose existence has been assured. Thus these approaches, despite their mathematical elegance, are regarded as uneconomical. A deductive approach, however, has superior features if one wishes to generalize to nonlinear optimization, since the tools of convexity, separating hyperplanes, etc., are then essential. To put it another way, the mathematical tools used to prove existence theorems in the linear case (and which are not used again) move to the center of the stage when one wishes to consider nonlinear optimization.

This monograph is concerned with a deductive approach which is largely algebraic and which minimizes the use of concepts from analysis. It is based upon the work of Tucker, some of which is summarized in a paper by Good [5], and assumes as a prerequisite a course in linear algebra.* Since such courses vary considerably in content, the discussion begins with a brief review of relevant concepts from the theory of bilinear forms and from orthogonality, and introduces the necessary concepts from analysis. Concepts that are found in the typical linear algebra course are either briefly mentioned or stated formally without proof (references to standard linear algebra texts are frequently indicated). Other concepts which are less likely to be covered in such a course are stated formally and sometimes proved, with the proofs of the remainder indicated as exercises for the reader.

1.2. Bilinear Forms. We begin in the spirit of linear algebra and introduce concepts for vectors that are coordinate free. Vectors will be denoted initially by Greek letters and their coordinates, relative to indicated bases, will be denoted by lower case Latin letters. Later, when only one basis is used for the discussion, we will identify a vector in terms of its coordinates, since it has only one such representation in this case.

*A deductive approach that is exclusively algebraic can be found in paper 4 of Kuhn and Tucker [7]; a deductive approach that uses analysis can be found in Ben-Israel [1].

All vector spaces will be understood to be finite dimensional.

Definition. Let U and V be vector spaces over a field F and let f be a function which maps a pair of vectors α and β, where α ∈ U and β ∈ V, into F in such a way that f is a linear function of α and β separately. Such a function f is called a bilinear form.

Now suppose f(α, β) is a bilinear form; then for α_1, α_2 ∈ U, β_1, β_2 ∈ V and a_1, a_2, b_1, b_2 ∈ F we consider f(a_1 α_1 + a_2 α_2, b_1 β_1 + b_2 β_2). For the moment let δ = b_1 β_1 + b_2 β_2; since f is linear in its first argument we have

    f(a_1 α_1 + a_2 α_2, δ) = a_1 f(α_1, δ) + a_2 f(α_2, δ)
                            = a_1 f(α_1, b_1 β_1 + b_2 β_2) + a_2 f(α_2, b_1 β_1 + b_2 β_2).

Now consider f(α_1, b_1 β_1 + b_2 β_2); this is linear in the second argument, so

    f(α_1, b_1 β_1 + b_2 β_2) = b_1 f(α_1, β_1) + b_2 f(α_1, β_2),

and an analogous result holds for f(α_2, b_1 β_1 + b_2 β_2). Hence

    f(a_1 α_1 + a_2 α_2, b_1 β_1 + b_2 β_2) = a_1 b_1 f(α_1, β_1) + a_1 b_2 f(α_1, β_2) + a_2 b_1 f(α_2, β_1) + a_2 b_2 f(α_2, β_2).

As a slight generalization, let U and V be vector spaces over a field F having, respectively, bases {α_1, ..., α_p} and {β_1, ..., β_n}, and let

    α = Σ_{i=1}^p x_i α_i   and   β = Σ_{j=1}^n y_j β_j.

Then for a bilinear form we can write

    f(α, β) = Σ_{i=1}^p x_i f(α_i, β),

and for each i we have

    f(α_i, β) = f(α_i, Σ_{j=1}^n y_j β_j) = Σ_{j=1}^n y_j f(α_i, β_j),

so that

    f(α, β) = Σ_{i=1}^p Σ_{j=1}^n x_i y_j f(α_i, β_j).

Therefore, for any α ∈ U and β ∈ V, the values of the bilinear form f(α, β) are known when the pn values f(α_i, β_j) are known -- given that we have bases of U and V. Now denote f(α_i, β_j) = b_ij ∈ F and define B = [b_ij] to be the matrix representing the bilinear form with respect to the bases above. Relative to the same bases we can use the coordinate vectors

    X = (x_1, ..., x_p)^T   and   Y = (y_1, ..., y_n)^T

to represent the vectors α and β respectively; then

    f(α, β) = Σ_{i=1}^p Σ_{j=1}^n x_i b_ij y_j = X^T B Y.

Since any p by n matrix can be regarded as the matrix of a bilinear form in p + n variables, many theorems about matrices have counterparts in the theory of bilinear forms and conversely. We will see some examples of this below. In all the following we confine ourselves to the real field R and all bilinear forms will be understood to be defined for vector spaces over R.
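The computation f(α, β) = X^T B Y is easy to carry out numerically. The following is a minimal sketch in Python, assuming the numpy library is available; the matrix B and the coordinate vectors are arbitrary illustrative values, not data from the text.

    import numpy as np

    # Matrix of a bilinear form relative to fixed bases: b_ij = f(alpha_i, beta_j).
    B = np.array([[1.0,  2.0],
                  [0.0, -1.0],
                  [3.0,  4.0]])

    X = np.array([1.0, 2.0, -1.0])   # coordinates of alpha (p = 3)
    Y = np.array([2.0, 1.0])         # coordinates of beta  (n = 2)

    # f(alpha, beta) = X^T B Y
    value = X @ B @ Y
    print(value)                     # -8.0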

Definition. Let f be a bilinear form; if f(α, β) = f(β, α) for every α, β ∈ V(R), we say that f is symmetric.*

Theorem 1. A bilinear form f(α, β) is symmetric if and only if any matrix B representing f(α, β) is symmetric (i.e., has the property that B^T = B).

Proof. The matrix B = [b_ij] is determined by f(α_i, α_j) and B^T = [b_ji] is determined by f(α_j, α_i), where the vectors α_i are basis vectors for the vector space. Now

    b_ij = f(α_i, α_j) = f(α_j, α_i) = b_ji,

so B = B^T.

Among other things, this theorem assures us that if a matrix of a bilinear form, expressed in terms of a given basis, equals its own transpose, then the matrix of the form in terms of any other basis will also be equal to its transpose.

Definition. If f(α, α) = 0 for every α ∈ V(R) then the bilinear form f is said to be skew-symmetric.

Theorem 2. A bilinear form f(α, β) is skew-symmetric if and only if any matrix B representing f is skew-symmetric (i.e., has the property that B^T = -B).

Proof. For any α, β ∈ V(R) consider α + β; we have from skew-symmetry f(α+β, α+β) = 0. Also

    0 = f(α+β, α+β) = f(α, α) + f(α, β) + f(β, α) + f(β, β) = f(α, β) + f(β, α).

Hence f(α, β) = -f(β, α) and B = -B^T. On the other hand, if B = -B^T, we have f(α, β) = -f(β, α) for all α, β ∈ V(R). Then f(α, α) = -f(α, α), so that f(α, α) + f(α, α) = 0, or (1+1) f(α, α) = 0. Since 1 + 1 ≠ 0 in R, this means that f(α, α) = 0, so that f is skew-symmetric.

*For this definition to have meaning the bilinear form f must be defined on pairs of vectors from the same vector space.

Both Theorems 1 and 2 can be generalized to vector spaces over any field F of characteristic ≠ 2. We define this concept as follows. For any positive integer n, denote by n1 the sum 1 + ... + 1 (n terms). If n1 ≠ 0 for every positive integer n, then F is said to have characteristic 0. If n1 = 0 for some positive integer, let p be the smallest positive integer such that p1 = 0. Then p is called the characteristic of F. The field {0, 1} has characteristic 2 and the field {0, 1, 2} has characteristic 3.

Returning to the concepts of symmetry and skew-symmetry, if x is any element of a field with characteristic 2, then x + x = 0 and x = -x. Thus the properties of symmetry and skew-symmetry in this case coincide. The characteristic p of a field, incidentally, is a prime number. A characteristic can be defined for other algebraic systems as well (for example, rings and integral domains). A discussion of the characteristic of a ring and domain appears in Birkhoff and MacLane [2], Chap. XIII; another good reference is Nomizu [10], Chap. 5.

Theorem 3. Any bilinear form defined on V(R) can be represented uniquely as the sum of a symmetric bilinear form and a skew-symmetric bilinear form.

Proof. Exercise; hint: write the symmetric form as

    f_s(α, β) = (1/2)[f(α, β) + f(β, α)]

and the skew-symmetric form as

    f_ss(α, β) = (1/2)[f(α, β) - f(β, α)].

Theorem 3 can also be stated as a theorem on matrices: if B is any square matrix, it can be uniquely represented as the sum of a symmetric and a skew-symmetric matrix.

Definition. Let f(α, β) be a bilinear form defined on V(R) such that (i) f(α, β) is symmetric, and (ii) f(α, β) is positive definite (i.e., f(α, α) ≥ 0 for every α ∈ V(R) and f(α, α) = 0 if and only if α = 0); then f(α, β) is called an inner product function.

Definition. A vector space over R with an inner product function is called an Euclidean space. Integer subscripts on a symbol indicating a vector space will denote the dimension of the space; thus V_4(R) denotes a four-dimensional vector space over the reals and E_n denotes an Euclidean space of dimension n.

Examples. (1) Let f(α, β) = x_1 y_2 - x_2 y_1 in E_2, where α and β have the coordinate vectors (x_1, x_2)^T and (y_1, y_2)^T; then

    f(α, β) = (x_1, x_2) [ 0  1 ; -1  0 ] (y_1, y_2)^T,

and f is skew-symmetric.
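The matrix version of Theorem 3 is easy to check numerically. Below is a small sketch in Python, assuming numpy is available; the matrix B is an arbitrary illustrative example. The symmetric and skew-symmetric parts are (B + B^T)/2 and (B - B^T)/2.

    import numpy as np

    B = np.array([[1.0,  4.0, -2.0],
                  [0.0,  3.0,  5.0],
                  [6.0, -1.0,  2.0]])

    B_sym  = (B + B.T) / 2.0    # matrix of the symmetric part f_s
    B_skew = (B - B.T) / 2.0    # matrix of the skew-symmetric part f_ss

    print(np.allclose(B, B_sym + B_skew))      # True: the parts sum to B
    print(np.allclose(B_sym, B_sym.T))         # True: symmetric
    print(np.allclose(B_skew, -B_skew.T))      # True: skew-symmetric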

(2) Let α, β ∈ E_n, let E_n have the unit vectors U_i as basis vectors (i.e., U_i has 1 for its i-th coordinate and zeros elsewhere), and let α and β have, respectively, the coordinate vectors X = (x_1, ..., x_n)^T and Y = (y_1, ..., y_n)^T. Then

    f(α, β) = f(Σ x_i U_i, Σ y_j U_j) = X^T I Y = X^T Y = Σ x_i y_i,

the conventional inner product of two vectors. Note that for the dual pair of linear programming problems

    min z = C^T X            max g = U^T B
    subject to AX ≥ B        subject to U^T A ≤ C^T
               X ≥ 0                    U ≥ 0

we get the inequality (for all feasible vectors X and U)

    U^T B ≤ U^T AX ≤ C^T X.

Exercises

1. Write the following bilinear forms as a matrix product:
(a) g(α, β) = 2x_1y_1 + 3x_2y_1 + x_2y_2
(b) f(α, β) = x_1y_1 + 2x_1y_2 - x_1y_3 - 5x_2y_1 + 7x_2y_2 + 4x_2y_3
(c) h(α, β) = x_1y_1 - x_1y_2 - 3x_1y_3 + 2x_2y_2 - x_2y_3.

2. Construct an example of a symmetric and a skew-symmetric bilinear form defined on V_4(R).

3. Given the bilinear form in E_3

    f(α, β) = x_1y_1 + 2x_2y_1 - 3x_3y_1 + x_1y_2 + 2x_2y_2 - 2x_3y_2 + x_1y_3 - 2x_2y_3 + x_3y_3,

write it as the sum of a symmetric and a skew-symmetric bilinear form.

4. Prove that if A is any p by n matrix, then A^T A and A A^T are symmetric.

5. Prove that if A is skew-symmetric, then A^2 is symmetric.

6. Let A and B be symmetric matrices of the same size. Is AB symmetric?

7. If p is the characteristic of a field F, prove that p is a prime number.

8. Give two examples of a positive definite bilinear form and two examples of a form that is not positive definite.

*9. Given a bilinear form f(α, β) defined on a vector space V(R) and a basis for the space, suppose the matrix B is associated with the form and basis. If the basis of the space is changed, then f will have a new matrix, say B', relative to the new basis. It can be shown that the rank of B is equal to the rank of B', and more generally, that the rank of the matrix of a bilinear form is invariant under change of basis. Thus one defines the rank of the form to be the rank of a matrix of the form relative to some basis.

Now prove the following theorem: let f be a bilinear form defined on a vector space V of dimension n over the complex field F. There exists a basis {α_1, ..., α_n} of V such that: (i) f(α_i, α_j) = 0 for i ≠ j; (ii) f(α_i, α_i) = 1 for 1 ≤ i ≤ r, where r = rank of f; (iii) f(α_j, α_j) = 0 for r + 1 ≤ j ≤ n; the associated matrix thus has an r by r identity block in the upper left corner and zeros elsewhere. See Halmos [6], Chap. 1.

10. Define two different inner product functions in E_n and prove that the sum of two inner product functions is an inner product function.

11. Is the difference of two inner product functions in E_n an inner product function in E_n? Is any positive scalar multiple of an inner product function in E_n an inner product function? If the answer is yes, prove it; if not, produce a counterexample.

1.3. Length, Distance, Orthogonality. We have various notations for inner product functions:

    f(α, β) = (α, β) = α·β = αβ,

and if we are considering vectors which are expressed in terms of a fixed basis {α_1, ..., α_n}, say α = Σ_{i=1}^n x_i α_i and β = Σ_{j=1}^n y_j α_j,

then the vectors can be unambiguously denoted by the coordinate vectors X and Y, so we also have in this case the notation

    f(α, β) = f(X, Y) = X^T Y   and   f(α, α) = (α, α) = α·α = α^2 = f(X, X) = X^T X.

From this point on we will deal with coordinate vectors.

Definition. Let f(X, Y) be an inner product function defined on E_n; if X is any vector in E_n, then the length of X is

    ||X|| = √f(X, X) = √(X^T X).

Note that since f(X, X) ≥ 0 this definition does not take us out of R, and also that for any scalar k,

    ||kX|| = √f(kX, kX) = √(k^2 f(X, X)) = |k| √f(X, X) = |k| ||X||.

Definition. Let X be any nonzero vector in E_n; then the vector X* = X/||X|| is called the normalized vector associated with X (note that ||X*|| = 1).

Definition. Let X and Y be any vectors in E_n and let d(X, Y) be a function defined on pairs of vectors X, Y such that d(X, Y) = ||Y - X||. Then d(X, Y) is a distance function in E_n and for a given pair of vectors X, Y we call d(X, Y) the distance from X to Y.
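These three definitions translate directly into computations. A minimal sketch in Python, assuming numpy (the vectors are arbitrary illustrative values):

    import numpy as np

    X = np.array([3.0, 0.0, 4.0])
    Y = np.array([1.0, 1.0, 1.0])

    length_X = np.sqrt(X @ X)                  # ||X|| = sqrt(X^T X) = 5
    X_star   = X / length_X                    # normalized vector, ||X*|| = 1
    dist_XY  = np.sqrt((Y - X) @ (Y - X))      # d(X, Y) = ||Y - X||

    print(length_X, np.sqrt(X_star @ X_star), dist_XY)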

Various notations for a distance function are available; for example,

    d(X, Y) = √f(Y - X, Y - X) = √((Y - X)^T (Y - X)).

Lemma. If X and Y are any vectors in E_n having length 1, then |X^T Y| ≤ 1.

Proof. Consider the vector X - Y; we have for an inner product function

    f(X - Y, X - Y) = f(X, X) - f(X, Y) - f(Y, X) + f(Y, Y) = ||X||^2 - 2X^T Y + ||Y||^2.

Since ||X|| = ||Y|| = 1 we have from the positive definiteness of f

    2 - 2X^T Y ≥ 0,   X^T Y ≤ 1.

Similarly, from an examination of f(X + Y, X + Y) we get -X^T Y ≤ 1, so we can write |X^T Y| ≤ 1.

Theorem 4 (Schwartz inequality). If X and Y are any vectors in E_n, then

    |X^T Y| ≤ ||X|| ||Y||.

Corollary (the triangle inequality). If X and Y are any vectors in E_n, then

    ||X + Y|| ≤ ||X|| + ||Y||.

Theorem 5. If d(X, Y) is any distance function in E_n, then (i) d(X, Y) = d(Y, X); (ii) d(X, Y) ≥ 0 and d(X, Y) = 0 if and only if X = Y; (iii) d(X, Y) ≤ d(X, Z) + d(Z, Y).

Definition. The vectors X and Y in E_n are said to be orthogonal if X^T Y = 0.

The zero vector in E_n is thus orthogonal to every vector in E_n.

Definition. Let {X_1, ..., X_n} be a set of vectors in E_n; if X_i^T X_j = 0 for i, j = 1, ..., n, i ≠ j, then the set is orthogonal. A set is said to be orthonormal if it is orthogonal and every vector in the set has length 1.

Theorem 6. Any orthogonal set of nonzero vectors in E_n is linearly independent.

We now introduce the concept of an orthogonal projection of a vector on another vector by examining some geometrical concepts in E_2. Consider two vectors in E_2 as in Figure 1.

Figure 1. The vectors X and Y, the vector kY for some scalar k, and the difference X - kY.

We want to call the vector kY (k not yet specified) the orthogonal projection of X on Y. What is the value of the scalar k? We first observe that if a scalar k exists such that X - kY and Y are orthogonal, then

    (X - kY)^T Y = 0,

from which we easily obtain

    X^T Y - k Y^T Y = 0   and   k = X^T Y / Y^T Y = X^T Y / ||Y||^2.

Conversely, if k is defined as above, then it is easy to show that X - kY and Y are orthogonal. With these geometrical concepts in mind we introduce the following

Definition. If X and Y are any nonzero vectors in E_n, then the orthogonal projection of X on Y is the vector

    (X^T Y / ||Y||^2) Y.

Suppose we return to E_2 and that {X, Y} is a basis for the space as shown in Figure 2. Can we determine an orthonormal basis for E_2? The answer is clearly yes, since we can make use of the orthogonal projection of X on Y by setting

    Z_1 = Y,   Z_2 = X - (X^T Y / ||Y||^2) Y.

Figure 2. The basis {X, Y} of E_2 and the orthogonal pair Z_1 = Y, Z_2.

The set {Z_1, Z_2} is orthogonal and, since from the geometry they are each nonzero, they can each be normalized and the result is what we seek. These ideas are generalized in the following result, which is referred to as the Gram-Schmidt orthogonalization process.

Theorem 7. Every finite dimensional Euclidean vector space contains an orthonormal basis.

Sketch of the proof. Every finite dimensional vector space contains a basis (from linear algebra -- see, for example, Nering [9], Chap. 1). Suppose this basis is the set {X_1, ..., X_n}. Let

    Z_1 = X_1,

    Z_2 = X_2 - (X_2^T Z_1 / ||Z_1||^2) Z_1,

and in general

    Z_k = X_k - Σ_{j=1}^{k-1} (X_k^T Z_j / ||Z_j||^2) Z_j.

Perform an induction to get an orthogonal set and then normalize the vectors in the resulting set.
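As a computational sketch of the process in Theorem 7 (Python, assuming numpy; the basis below is an arbitrary illustrative one):

    import numpy as np

    def gram_schmidt(basis):
        """Orthonormalize a list of linearly independent vectors (Theorem 7)."""
        Z = []
        for X_k in basis:
            # subtract the orthogonal projections of X_k on the Z_j found so far
            Z_k = X_k - sum((X_k @ Z_j) / (Z_j @ Z_j) * Z_j for Z_j in Z)
            Z.append(Z_k)
        return [Z_k / np.sqrt(Z_k @ Z_k) for Z_k in Z]   # normalize each Z_k

    basis = [np.array([1.0, 1.0, 0.0]),
             np.array([1.0, 0.0, 1.0]),
             np.array([0.0, 1.0, 1.0])]
    Q = gram_schmidt(basis)
    # The Gram matrix of the result is the identity, so the set is orthonormal.
    print(np.round([[u @ v for v in Q] for u in Q], 10))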

It is oftentimes helpful to use the concept of angle between vectors in E_n. Angle is defined in terms of its cosine and is a straightforward generalization of familiar concepts from E_2.

Definition. If X and Y are any two nonzero vectors in E_n, then the (unique) angle θ, 0 ≤ θ ≤ π, such that

    cos θ = X^T Y / (||X|| ||Y||)

is called the angle between X and Y.

From the Schwartz inequality we have

    |cos θ| = |X^T Y| / (||X|| ||Y||) ≤ 1,

so the angle θ so defined retains features that are associated with angle in E_2.

We conclude this section with several useful theorems relating to orthogonality.

Definition. Let X be any vector in E_n and A = {X_1, ..., X_n} be any set of vectors in E_n. If X is orthogonal to every vector in A we say that X is orthogonal to the set A. Moreover, if A = {X_1, ..., X_n} and B = {Y_1, ..., Y_n}, then we say the sets A and B are orthogonal if for each X_i ∈ A and Y_j ∈ B we have X_i^T Y_j = 0.

Theorem 8. If X is orthogonal to the set A = {X_1, ..., X_n}, then X is orthogonal to the space spanned by A.

"o spanned by A. - 18 - Theorem 9. If A ={X, '''*X } is a set of vectors from E n then the set of all vectors orthogonal to A is a subspace of E. Definition. Let S be any set of vectors from E; then the set of all vectors orthogonal to S, denoted S~, is called the orthogonal complement of S in E. Exercises. 1. Prove Theorem 4 using the preceding lemma. 2. Prove that if x and y are any nonnegative real numbers and if x2 y, then x < y. 3. Use Theorem 4 and the preceding exercise to prove the corollary to Theorem 4. Prove that if the Schwartz inequality reduces to an equality for vectors X, YSEn, then {X,Y} is linearly dependent. 4. Prove Theorem 5; in what case does (iii) of Theorem 5 hold as an equality? 5. Prove Theorem 6. -1 -2! 6. Given X = 1,' Y =1 Z = 0 -3 2 calculate the following. (a) xTX; XTy; yZ; Y*; Z*; y*T*; the orthogonal projection of X on Y and of X on Z. (b) Calculate the orthogonal projection of Z on X. (c) Construct a basis for E3 that is not orthogonal and from this basis construct an orthonormal basis...

(d) Verify that X^T Y = Y^T X; why do we have this symmetry?
7. Prove Theorem 7.
8. Prove Theorem 8.
9. Prove Theorem 9.
10. Apply the last definition in this section to S^⊥ and develop the set (S^⊥)^⊥. Verify that S ⊆ (S^⊥)^⊥.

1.4. Orthogonality and Double Description of Vector Subspaces. We want to show how a vector space can be defined either in terms of a spanning set or as the solution space of a system of homogeneous linear equations. When both definitions can be specifically applied to a vector space we say that we have a double description for the space.

Theorem 10. If R = {X_1, ..., X_p} is an orthonormal set in E_n and if Y is any vector in E_n, then the vector

    Z = Y - Σ_{i=1}^p (Y^T X_i) X_i

is orthogonal to R (and hence to the space spanned by R).

Proof. Consider

    Z^T X_j = Y^T X_j - Σ_{i=1}^p (Y^T X_i) X_i^T X_j.

Since R is orthonormal we have X_i^T X_j = 0 for i ≠ j and X_i^T X_j = 1 for i = j, so that

    Z^T X_j = Y^T X_j - Y^T X_j = 0;

hence Z is orthogonal to R.

Theorem 11. Let S be a subspace of E_n having dimension m < n and let X ∈ E_n. Then X can be expressed as the sum Y_1 + Y_2, where Y_1 ∈ S and Y_2 ∈ S^⊥.

Proof. S has an orthonormal basis, say {X_1, ..., X_m}. If X ∈ S, the theorem is obvious. Suppose X ∉ S; let

    Y_1 = Σ_{i=1}^m (X^T X_i) X_i   and   Y_2 = X - Y_1.

Now Y_1 ∈ S and Y_2 = X - Σ_{i=1}^m (X^T X_i) X_i. By Theorem 10, Y_2 is orthogonal to the set {X_1, ..., X_m} and hence Y_2 is orthogonal to S.

Theorem 12. The decomposition X = Y_1 + Y_2 in Theorem 11 is unique.

Theorem 13. If S is any subspace of E_n, then (i) S ∩ S^⊥ = {0}; (ii) E_n = S ⊕ S^⊥; (iii) S^⊥⊥ = S.

Proof. (i) If X ∈ S ∩ S^⊥, then X ∈ S and X ∈ S^⊥. This implies X^T X = 0 and X = 0.

(ii) The symbol ⊕ denotes the direct sum of the vector subspaces indicated. Let A = {X_1, ..., X_k} be an orthonormal basis for S and let Y be any vector in E_n. Then

    Z = Y - Σ_{i=1}^k (Y^T X_i) X_i

is orthogonal to A and hence to S. Thus Z ∈ S^⊥. Also,

    Y = Z + Σ_{i=1}^k (Y^T X_i) X_i

is the sum of a vector Z ∈ S^⊥ and a vector Σ (Y^T X_i) X_i ∈ S. Since S ∩ S^⊥ = {0}, we have expressed Y as the (unique) sum of a vector in S and a vector in S^⊥. We now make use of results on sums and direct sums of vector subspaces [see Nering [9], p. 21-22; Halmos [6], p. 28-30]; in particular the theorem that if U and V are any subspaces of a vector space W, the following conditions are equivalent: (a) W = U ⊕ V; (b) U ∩ V = {0} and U + V = W (i.e., U and V are complementary subspaces); (c) any vector Y ∈ W can be written in the form X_1 + X_2, where X_1 ∈ U and X_2 ∈ V, in one and only one way. We have shown that any Y ∈ E_n can be expressed uniquely as the sum of a vector in S and a vector in S^⊥. Since (c) implies (a) we have S ⊕ S^⊥ = E_n.

(iii) Consider the vector Y = Z + Σ (Y^T X_i) X_i, where Z ∈ S^⊥ and Σ (Y^T X_i) X_i ∈ S; we have

    Z^T Y = Z^T Z + Z^T [Σ (Y^T X_i) X_i] = Z^T Z = ||Z||^2.

Thus if Y ∈ S^⊥⊥, then Z^T Y = 0 implies ||Z||^2 = 0, which means that Y = Σ (Y^T X_i) X_i ∈ S. This also implies that S^⊥⊥ ⊆ S, and since we observed earlier that S ⊆ S^⊥⊥ we conclude that S = S^⊥⊥.

Finally, it should be clear that if S is a subspace of E_n with dimension r, then S^⊥ has dimension n - r. We can now establish the double description theorem for vector subspaces:

Theorem 14. Every r-dimensional subspace of E_n is the solution space of a system of n - r independent homogeneous linear equations in the variables x_1, ..., x_n.

Proof. Let S be any subspace of E_n having dimension r and let T be the orthogonal complement of S. T has dimension n - r; suppose we have a basis for T,

    T_1 = (b_11, b_12, ..., b_1n)
    . . .
    T_{n-r} = (b_{n-r,1}, ..., b_{n-r,n}).

Since T is the orthogonal complement of S, we know that S is the orthogonal complement of T and the set of all vectors orthogonal to S is T. Since the set of all vectors orthogonal to T is the set of all vectors orthogonal to a basis of T, the set of all vectors X satisfying

    T_1 X = 0
    . . .
    T_{n-r} X = 0

is S. These equations can be written more fully as

    b_11 x_1 + ... + b_1n x_n = 0
    . . .
    b_{n-r,1} x_1 + ... + b_{n-r,n} x_n = 0,

and these are independent since the coefficient vectors are a basis for T.
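A small numerical illustration of Theorem 14, sketched in Python under the assumption that numpy and scipy are available (scipy.linalg.null_space computes an orthonormal basis for the solution space of a homogeneous system): starting from a basis of a subspace S of E_4, a basis of T = S^⊥ supplies the homogeneous equations whose solution space is S.

    import numpy as np
    from scipy.linalg import null_space

    # Rows of A form a basis for a 2-dimensional subspace S of E_4.
    A = np.array([[1.0, 0.0,  1.0, 2.0],
                  [0.0, 1.0, -1.0, 3.0]])

    # Columns of null_space(A) span T = S-perp; transposed, its rows give the
    # n - r = 2 independent homogeneous equations N X = 0 whose solution space is S.
    N = null_space(A).T

    # Check: every basis vector of S satisfies the system N X = 0.
    print(np.allclose(N @ A.T, 0.0))            # True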

Thus every subspace of E_n can be described in two ways: by providing a basis for the space or by providing a system of homogeneous linear equations whose solution set is the subspace. Figure 3 displays a schematic diagram which illustrates the situation in detail.

Exercises

1. Let {X_1, ..., X_p} be an orthonormal set in E_n, where p ≤ n. Prove that if Y is any vector in E_n, then Σ_{i=1}^p |Y^T X_i|^2 ≤ ||Y||^2.

2. Choose three vectors X_1, X_2, X_3 in E_3, verify that {X_1, X_2, X_3} is a basis for E_3, and then use the Gram-Schmidt orthogonalization process to construct an orthonormal basis for E_3.

3. Let S be a subspace of E_4 spanned by two vectors of your choice; describe S^⊥ in two ways.

4. Find a basis for the solution space of the linear

Figure 3. Double Description for Vector Subspaces. (Schematic: given a basis A_1 = (a_11, ..., a_1n), ..., A_r = (a_r1, ..., a_rn) for a subspace S, write down the system A_1 X = 0, ..., A_r X = 0; the solution space of this system is the orthogonal complement of S, call it T. Reduce this system to an echelon form and obtain a basis for T, say T_1 = (b_11, ..., b_1n), ..., T_{n-r} = (b_{n-r,1}, ..., b_{n-r,n}). Write down the system T_1 Y = 0, ..., T_{n-r} Y = 0; the solution space of this system is S. Thus S has a double description.)

system

    x_1 + 2x_2 - x_3 + 4x_4 = 0
    3x_1 - x_2 + x_3 + 4x_4 = 0.

Also find a basis for the orthogonal complement of this space.

5. Suppose S is a vector space spanned by two given vectors in E_4; find a system of homogeneous linear equations whose solution set is S.

6. Prove Theorem 12.

7. Let W be the subspace of E_5 spanned by a given set of five vectors {X_1, ..., X_5}. Find a basis for W and find a system of equations whose solution set is W.

1.5. Linear Manifolds. We will take the point of view that a linear manifold is a "displaced" vector subspace. This will enable us to utilize directly the previous work on subspaces.

Definition. A linear manifold is a set L = X + S, where X is a fixed vector in E_n and S is a subspace of E_n (note that the symbol "+" means the algebraic sum of the set consisting of the vector X and the set S).

An example from E_2 (Figure 4) displays the displacement aspect of this definition, the fixed vector X being a key element in the displacement.

Figure 4. The linear manifold L = X + S in E_2 as a displacement of the subspace S by the vector X.

Definition. The manifold L is said to have dimension r if S has dimension r.

Definition. The linear manifolds L_1 = X_1 + S_1 and L_2 = X_2 + S_2 are said to be parallel if and only if either

S_1 ⊆ S_2 or S_2 ⊆ S_1. If S_1 and S_2 have the same dimension, they are parallel if and only if S_1 = S_2.

Now suppose that L = X + S is a linear manifold and that {Y_1, ..., Y_r} is a basis for S. Then every vector Y ∈ L can be written in the form

    Y = X + t_1 Y_1 + ... + t_r Y_r,   t_i ∈ R.

The expression above is sometimes called a parametric representation for L; such a representation is not unique, since X and a basis for S can be chosen in many ways.

Definition. Let X_0, X_1, ..., X_p be any vectors in E_n; the smallest linear manifold containing these vectors is the linear manifold containing them which has smallest dimension.

Given any points or vectors X_0, X_1, ..., X_p in E_n we know that they are contained in some manifold -- the space E_n. The question is, how can we determine the smallest manifold containing these points? Suppose we begin with the vectors X_0, X_1, ..., X_p; then we can define a manifold containing them as L = X + S for some fixed X ∈ L and some subspace S. Now X can be any vector whatever in L, so we choose X = X_0. Then L = X_0 + S and it is clear that

    L + (-X_0) = S,

so that S contains the vectors X_i - X_0 for i = 1, ..., p. On the other hand, if S contains {X_1 - X_0, ..., X_p - X_0}, then since L = X_0 + S, L must contain

    X_0 + (X_1 - X_0) = X_1, ..., X_0 + (X_p - X_0) = X_p.

Finally, the span (see Nering [9], p. 20) of the vectors X_1 - X_0, ..., X_p - X_0 is the smallest subspace containing these vectors. If we use the notation < > to denote the span of a set of vectors, we can write the smallest manifold containing the vectors X_0, X_1, ..., X_p as

    L = X_0 + <X_1 - X_0, ..., X_p - X_0>.

When we refer to "the manifold containing a given set" we will understand that we mean the smallest manifold containing the set.

If X_0, X_1, ..., X_p is any set of vectors in E_n, then the set {X_1 - X_0, ..., X_p - X_0} can be either dependent or independent. If this set is independent, then L has dimension p; otherwise L has dimension k < p.

An arbitrary vector X is in a linear manifold L if and only if X can be written

    X = X_0 + t_1(X_1 - X_0) + ... + t_r(X_r - X_0)

for t_i ∈ R (i = 1, ..., r) and X_0, X_1, ..., X_r in L. Then

    X = X_0 + t_1 X_1 - t_1 X_0 + ... + t_r X_r - t_r X_0
      = (X_0 - t_1 X_0 - ... - t_r X_0) + t_1 X_1 + ... + t_r X_r.

Now let t_0 = 1 - Σ_{i=1}^r t_i; then Σ_{i=0}^r t_i = 1 and we can write

    X = Σ_{i=0}^r t_i X_i,   t_i ∈ R.

Thus we see that the condition Σ_{i=0}^r t_i = 1 is a necessary and sufficient condition for X ∈ L.

Definition. If X = k_1 X_1 + ... + k_p X_p, k_i ∈ R, Σ_{i=1}^p k_i = 1, X_i ∈ E_n, then X is called a weighted average of the vectors X_1, ..., X_p.
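The dimension of the smallest manifold containing given points, and the fact that weighted averages stay in it, can be checked numerically. A short sketch in Python, assuming numpy (the points are arbitrary illustrative values):

    import numpy as np

    # Points X_0, ..., X_p in E_3; the smallest manifold containing them is
    # L = X_0 + <X_1 - X_0, ..., X_p - X_0>, whose dimension is the rank below.
    points = [np.array([1.0, 1.0, 1.0]),
              np.array([2.0, 1.0, 0.0]),
              np.array([3.0, 1.0, -1.0])]
    X0 = points[0]
    D = np.array([X - X0 for X in points[1:]])            # rows X_i - X_0
    print("dim L =", np.linalg.matrix_rank(D))             # 1: the points are collinear

    # A weighted average (coefficients summing to 1) of the points stays in L.
    t = np.array([0.2, 0.3, 0.5])
    X = sum(ti * Xi for ti, Xi in zip(t, points))
    print(np.linalg.matrix_rank(np.vstack([D, X - X0])))   # still 1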

Definition. If every vector Y in a linear manifold L can be written

    Y = t_0 X_0 + t_1 X_1 + ... + t_r X_r,   Σ_{i=0}^r t_i = 1,   X_i ∈ L,

then L is said to be spanned by X_0, X_1, ..., X_r.

We observe that the set of all linear combinations of a set of vectors (unrestricted combinations) is a vector subspace, whereas the set of all weighted averages (restricted combinations) of a set of vectors is a linear manifold.

Theorem 15 (Double description theorem for linear manifolds). If L is a manifold of dimension r in E_n spanned by the vectors X_0, X_1, ..., X_r, then L is the solution set of a system of linear equations AX = B. Conversely, let AX = B be a solvable system of linear equations, where A has rank n - r; then the solution set of this system is a linear manifold of dimension r.

Definition. A point in E_n is a linear manifold of dimension 0; a line in E_n is a linear manifold of dimension 1; a plane in E_n is a linear manifold of dimension 2; a hyperplane in E_n is a linear manifold of dimension n - 1.

As a special case of Theorem 15 we see that a hyperplane in E_n can be described as the solution set of a linear equation

    a_1 x_1 + ... + a_n x_n = b   (not all a_i = 0).

Definition. Let X = (x_1, ..., x_n)^T and Y = (y_1, ..., y_n)^T be vectors in E_n.
(i) X ≥ Y if and only if x_i ≥ y_i for every i.
(ii) X ≤ Y if and only if x_i ≤ y_i for every i.
(iii) X > Y if and only if X ≥ Y and X ≠ Y.
(iv) X is strictly positive if x_i > 0 for every i.
(v) X is positive if X > 0.
(vi) If X ≥ 0 and Σ_{i=1}^n x_i = 1, then X is a probability vector.
(vii) If X > 0 and k = 1/Σ_{i=1}^n x_i, then X* = kX is a probability vector and X = (Σ_{i=1}^n x_i) X*.

Exercises

1. Prove Theorem 15. Hint: use Theorem 14 on the vector subspace S.

2. Given the system of linear equations defined by

    [1  0  0  1   4] [x_1]   [ 7]
    [1  1  0  3   9] [ . ] = [15]
    [2  0  1  5  14] [x_5]   [23]

(a) Express the linear manifold of solution vectors X as the sum of a vector in the manifold and the span of basis vectors for some subspace S.
(b) Describe the subspace S in two ways.
(c) Describe the subspace S^⊥ in two ways.

3. For each of the following linear manifolds write down a parametric representation and a determining set of linear

equations.
(a) L_1 = (3, 1, 1) + <(-2, 2, 1), (1, 2, -1)>
(b) L_2 = (1, 0, 2) + <(2, 3, 0), (0, 1, 4)>

4. Prove that a vector X will be contained in a linear manifold L if and only if X can be expressed as

    X = X_0 + t_1(X_1 - X_0) + ... + t_r(X_r - X_0).

5. In E_3 find the smallest linear manifold L containing three given vectors.

1.6. Hyperplanes and Orthogonality. Consider the hyperplane defined by the equation 2x_1 + 7x_2 = 0. This can be written

    C^T X = [2  7] (x_1, x_2)^T = 0.

If we graph the situation as in Figure 5 we see that the solution set S of this equation consists of all vectors X orthogonal to the fixed vector C. What is the situation for the equation 2x_1 + 7x_2 = 19? Each vector X ∈ H = {X | C^T X = 19} has the same orthogonal projection on the fixed vector C. This is illustrated in Figure 6. From an earlier section the orthogonal projection of any vector X ∈ H is given by

    (C^T X / ||C||^2) C.

Figure 5. The solution set S of 2x_1 + 7x_2 = 0 consists of all vectors orthogonal to the normal vector C = (2, 7)^T.

Figure 6. The hyperplane H = {X | C^T X = 19}; every X in H has the same orthogonal projection on C.

The length of this vector is

    ||(C^T X / ||C||^2) C|| = |C^T X| / ||C||.

These comments are easily generalized; if H is a hyperplane in E_n defined by C^T X = z, then the length of the orthogonal projection is

    |C^T X| / ||C|| = |z| / ||C||,

and this length is sometimes referred to as the normal distance of the hyperplane H from the origin. The fixed vector C is called a normal vector to H and it is clear that given a normal vector one can then unambiguously specify a hyperplane. In some of the discussions below it will be natural to speak of normals or normal vectors when we wish to refer to corresponding hyperplanes. Indeed, we will encounter situations in which we will locate normal vectors in certain sets so that corresponding hyperplanes will be located in desired ways.

Definition. If H = {X | C^T X = z, C ≠ 0, X ∈ E_n}, then the vectors ±C/||C|| are called unit normals to the hyperplane H.

The geometrical concept of "moving a hyperplane parallel to itself" occurs often in optimization mathematics. Figure 7 illustrates the situation; H_0 = {X | C^T X = z_0} and

Figure 7. Parallel hyperplanes H_0 and H_1 with common normal vector C; C* and C** are the orthogonal projections of points of H_0 and H_1 on C.

H_1 = {X | C^T X = z_1}, so that C* is the orthogonal projection of X ∈ H_0 on C and C** is the orthogonal projection of X ∈ H_1 on C. Since C** ∈ H_1, we have C^T C** = z_1, and for some scalar k, C** = C* + kC, so

    C^T C** = C^T(C* + kC) = C^T C* + k C^T C = z_0 + k||C||^2 = z_1.

Then k||C||^2 = z_1 - z_0; since ||C||^2 > 0, k > 0 implies that z_1 - z_0 > 0, or that z_1 > z_0. Thus when a hyperplane is "moved up" the value of z increases, z_1 > z_0. Likewise, k < 0 would mean that H_1 was "moved down" and z_1 < z_0.
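The unit normals, the normal distance |z|/||C||, and the fact that every point of H projects onto the same vector on C can all be checked directly. A minimal sketch in Python, assuming numpy, using the hyperplane 2x_1 + 7x_2 = 19 from the text:

    import numpy as np

    C = np.array([2.0, 7.0])
    z = 19.0

    norm_C = np.sqrt(C @ C)
    unit_normals = (C / norm_C, -C / norm_C)
    normal_distance = abs(z) / norm_C                 # |z| / ||C||

    # Any X in H = {X | C^T X = z} projects onto the same vector (C^T X / ||C||^2) C.
    X = np.array([-1.0, 3.0])                         # 2*(-1) + 7*3 = 19
    projection = (C @ X) / (C @ C) * C
    print(normal_distance, np.sqrt(projection @ projection))   # equal lengths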

Exercises

1. Given the equation 2x_1 - x_2 + 7x_3 = 13, find the following.
(a) The unit normal vectors to the hyperplane.
(b) If X is any vector in the hyperplane defined by the equation above, what is the orthogonal projection of X on the normal vector to the hyperplane?
(c) Write X in terms of a parametric representation in which the basis for an appropriate subspace appears in the parametric representation.

1.7. Convexity. Suppose the vectors X_1 and X_2 in E_n span a line L. Then if Y ∈ L we know that

    Y = t_1 X_1 + t_2 X_2,   t_1 + t_2 = 1,

or

    Y = t X_1 + (1 - t) X_2,   t ∈ R.

If t = 0 we have Y = X_2 and if t = 1, Y = X_1. We say that Y is between X_1 and X_2 if and only if 0 < t < 1. The line segment joining X_1 and X_2 can be written in the form

    Y = t X_1 + (1 - t) X_2,   0 ≤ t ≤ 1.

Definition. A subset C of E_n is convex if, whenever two points X_1 and X_2 are in C, the line segment joining X_1 and X_2 is also in C. That is, C is convex if, for any X_1 and X_2 in C, the vector Y ∈ C where Y = t_1 X_1 + t_2 X_2, t_1 + t_2 = 1, t_1 ≥ 0, t_2 ≥ 0.

Definition. Let X_1, ..., X_p be any vectors in E_n; the vector

    Y = a_1 X_1 + ... + a_p X_p,   Σ a_i = 1,   a_i ≥ 0,

is called a convex combination of the vectors X_1, ..., X_p.

A set consisting of exactly one point and the null set are regarded as convex, since in these cases the definition of convexity is satisfied "vacuously."

Theorem 16. The intersection of any collection of convex sets is convex.

The definition of convex set is geometrically appealing but it can be difficult to apply directly if we have a set and wish to determine if it is convex. The following theorem is often useful.

Theorem 17. A set C is convex if and only if every convex combination of vectors in C is in C.

Proof. If every convex combination of vectors in C is in C, then every line segment joining a pair of vectors in C must be in C, so C is convex.

Suppose C is convex; we want to show that every convex combination of vectors in C is in C. Let

    Y = Σ_{i=1}^r t_i X_i,   Σ t_i = 1,   t_i ≥ 0,

be a convex combination of vectors in C. For r = 1 and r = 2 we have Y ∈ C; assume that a convex combination of fewer than r vectors of C is in C. The following is an identity:

    t_1 X_1 + ... + t_{r-1} X_{r-1} = (1 - t_r) [ (t_1/(1 - t_r)) X_1 + ... + (t_{r-1}/(1 - t_r)) X_{r-1} ],

since Σ_{j=1}^r t_j = 1 implies Σ_{j=1}^{r-1} t_j = 1 - t_r. We then use this to write

    Y = (1 - t_r) [ (t_1/(1 - t_r)) X_1 + ... + (t_{r-1}/(1 - t_r)) X_{r-1} ] + t_r X_r.

Without loss of generality we can assume that t_r ≠ 1, otherwise we have Y = X_r. Now for t_r ≠ 1 each coefficient of each vector in the sum in the brackets immediately above has the property that

    t_j / (1 - t_r) ≥ 0   (j = 1, ..., r-1)

and

    Σ_{j=1}^{r-1} t_j / (1 - t_r) = (Σ_{j=1}^{r-1} t_j) / (1 - t_r) = 1.

This means that the linear combination of vectors within the brackets above is a convex combination. By our induction hypothesis this convex combination is a vector in C. Then the equation above gives a convex combination of two vectors in C, so Y ∈ C and the theorem is proved.

Definition. Let S be any subset of E_n; then the convex hull of S is the smallest convex set containing S. It is denoted H(S).

Since S is a subset of E_n and E_n is convex, there is a convex set containing S. The intersection of all convex sets containing S is convex by Theorem 16, so such a smallest convex set exists.

Theorem 18. The convex hull of a set S is the set of all convex combinations of the vectors of S.

Proof. See Nering [9], Chap. 6.

Definition. If C is a nonempty set of vectors in E_n, then C is a cone if it is closed under multiplication by nonnegative scalars.

Definition. If C is a cone in E_n and is also closed under vector addition, then C is a convex set and is called a convex cone.

If C is a convex cone and if there is a set of vectors {X_1, ..., X_p} of C such that every vector in C can be represented as a nonnegative linear combination of the vectors X_1, ..., X_p, then the set is called a spanning set (or a set of generators) for C and C is also called a finite cone (or a finitely generated cone). The cone generated or spanned by a single nonzero vector is called a half line, and the convex hull of a finite set of vectors is called a convex polyhedron.

Definition. If S is a convex set, an extreme point of S is a point X ∈ S such that there are no other points X_1, X_2 ∈ S, X_1 ≠ X_2, such that X = kX_1 + (1 - k)X_2, 0 < k < 1.
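By Theorem 18, asking whether a point lies in the convex hull of finitely many vectors is asking whether it is a convex combination of them, which is a linear feasibility problem. A sketch in Python, assuming numpy and scipy are available (scipy.optimize.linprog is used here only as a feasibility checker; the points are illustrative):

    import numpy as np
    from scipy.optimize import linprog

    def in_convex_hull(Y, points):
        """Is Y a convex combination of the given points (Theorem 18)?"""
        P = np.array(points, dtype=float).T              # columns are the X_i
        A_eq = np.vstack([P, np.ones((1, P.shape[1]))])  # weights reproduce Y and sum to 1
        b_eq = np.append(np.asarray(Y, dtype=float), 1.0)
        res = linprog(c=np.zeros(P.shape[1]), A_eq=A_eq, b_eq=b_eq,
                      bounds=[(0, None)] * P.shape[1])   # weights >= 0
        return res.success

    pts = [np.array([0.0, 0.0]), np.array([4.0, 0.0]), np.array([0.0, 4.0])]
    print(in_convex_hull([1.0, 1.0], pts))   # True: inside the triangle
    print(in_convex_hull([3.0, 3.0], pts))   # False: outside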

Definition. Let f(X) be a function defined on a convex set C (i.e., the domain of f is the set C); then f is said to be a convex function if for any points X_1, X_2 ∈ C,

    f[kX_1 + (1 - k)X_2] ≤ k f(X_1) + (1 - k) f(X_2),   0 ≤ k ≤ 1.

Definition. Let f(X) be defined over a convex set S; then f(X) is a concave function if for any X_1, X_2 ∈ S,

    f[kX_1 + (1 - k)X_2] ≥ k f(X_1) + (1 - k) f(X_2),   0 ≤ k ≤ 1.

Note: if f(X) is concave, then -f(X) is convex, and conversely.

Theorem 19. If f(X) is both convex and concave over a convex set S, then f(X) is a linear function over S.

Exercises

1. Let X_1 = (6, 2)^T, X_2 = (6, 7)^T, X_3 = (8, 4)^T, X_4 = (4, 5)^T.
(a) What is the convex hull of the set {X_1, X_2, X_3, X_4}?
(b) Graph the set in 1(a); what is the cone spanned by X_1 and X_3?
(c) Is X_1 an extreme point of the convex hull of {X_1, ..., X_4}?

2. Prove Theorem 16.

3. Prove Theorem 18.

4. Let X_1, X_2, X_3 be three given vectors in E_2. Sketch a graph of the set of all nonnegative linear combinations of the vectors X_1, X_2, X_3. Call this set N; can N be expressed as the solution set of a system of homogeneous linear inequalities? If

so, write down such a system.

5. Return to the convex set in 1(a). Can you write out a set of linear inequalities whose solution set is equal to the convex hull in 1(a)?

6. Given the functions defined over all of the reals (unless indicated otherwise) as follows:
(a) f(x) = x^2 - x + 3,
(b) g(x) = 2x,
(c) h(x) = e^x, -2 ≤ x ≤ 3,
(d) G(x) = log_e x, x > 0,
(e) F(x) = 1/x, x ≥ 1,
(g) H(x) = -4x^2 + 3,
(h) Q(x) = 3x^4 - 4x^3 - 2x^2 + 1,
(i) R(x) = 3x + 1.
Which are convex? Which are concave?

1.8. More on Double Description. We have shown in earlier sections that we have a double description for vector subspaces and for linear manifolds in E_n. We also have a double description for convex cones and for bounded convex sets (convex polyhedra). Given a convex cone, it can be regarded as the set of all nonnegative linear combinations of some spanning set or as the set of all solutions of a system of homogeneous linear inequalities AX ≥ 0. A convex polyhedron can be described as the set of all convex combinations of some finite set of vectors or as the solution set of a system of linear inequalities AX ≥ B.*

*The last two statements are not proved here since proofs are lengthy and complicated. See various papers in Kuhn and Tucker [7] for proofs and related discussions. See also Spivey and Thrall [12], p. 474-490.

Double description is important because it gives us conditions under which we can relate systems of linear equalities or inequalities to spanning sets. From an applied point of view, problems frequently take the form of the former, whereas many theorems and operational results are stated in terms of the latter. Having double description theorems means that we can pass back and forth with ease. Figure 8 displays in brief form the full range of double description theorems that are available.

1.9. Concepts from Point Set Theory. In the next chapter we will speak of open and closed sets and of the boundary of a set. We now introduce concepts which will make these terms clear.

Definition. Let S be any set in E_n and x_0 ∈ S. Then a neighborhood N of x_0 is the set N(x_0, r) = {x | d(x_0, x) < r}, r > 0.

Definition. If S is any set in E_n and x ∈ S, then x is an interior point of S if some neighborhood of x, perhaps a small one, is a subset of S.

Definition. If S is a set in E_n, then the interior of S is the set of all interior points of S.

Examples

(1) If x_0 ∈ E_1, a neighborhood is the interior of an interval centered at x_0.

Figure 8

    Set                  Set of Linear Combinations or "Span"                           System of Linear Equalities or Inequalities
    Vector subspace      S is the set of all linear combinations of a set of vectors.   S = {X | AX = 0 for some A}.
    Linear manifold      L is the set of all weighted averages of a set of vectors.     L = {X | AX = B for some A}.
    Convex cone          C is the set of all nonnegative linear combinations of a       C = {X | AX ≥ 0 for some A}.
                         set of vectors.
    Convex polyhedron    K is the set of all convex combinations of a set of vectors.   K = {X | AX ≥ B for some A}.

(2) If x_0 ∈ E_2, a neighborhood is the interior of a circular disc with center at x_0 and radius r as in Figure 9.

Figure 9. A neighborhood N(x_0, r) in E_2: the interior of a disc with center x_0 and radius r.

(3) If x_0 ∈ E_3, a neighborhood is the interior of a "solid ball" centered at x_0 and having radius r.

The concept of interior can correspond, as in the above examples, somewhat closely to the intuitive notion of interior, but the definition applies to cases which do not conform to the intuition. For example, consider the set of all rational points in E_1 (i.e., the set of all points on the real line that correspond to the rational numbers). This set has no interior at all. For another example, consider E_n and the distance function defined as follows: if x, y are any points in E_n, then d(x, y) = 1 if x ≠ y, and d(x, x) = 0. Then every point x is an interior point of every set containing x.

Definition. A set is said to be open if each of its points is an interior point.

Theorem 20. A neighborhood is an open set.

Proof. Let N(x_0, r) be a neighborhood in E_n,

    N(x_0, r) = {x | d(x_0, x) < r},

and let x be any point in N(x_0, r). We want to show that x is an interior point. Since x ∈ N(x_0, r), d(x_0, x) < r, or d(x_0, x) = r - h for some h > 0. Let y be any point in N(x, h); then

    d(x_0, y) ≤ d(x_0, x) + d(x, y) < (r - h) + h = r,

or d(x_0, y) < r. This implies y ∈ N(x_0, r), so N(x, h) ⊆ N(x_0, r).

Definition. A point x_0 ∈ E_n is a limit point of the set S if every N(x_0, r) contains a point of S different from x_0.

Examples

(1) Let S = {(x, y) | x^2 + y^2 ≤ 1}; every point on the circumference is a limit point of S. Suppose x* is an interior point of S; is x* a limit point?

(2) Let T = {x | 0 < x ≤ 1}. The point x = 0 is a limit point of T, but x = 0 ∉ T. The point x = 1 is also a limit point of T.

Definition. A set is closed if it contains each of its limit points.

Theorem 21. If x_0 is a limit point of S, then every N(x_0, r) contains infinitely many points of S.

Proof. Suppose some N(x_0, r) contains a finite number of points of S different from x_0, say p_1, ..., p_n. Then there is some point p_i closest to x_0; for this point let d(x_0, p_i) = r*. Then construct N(x_0, r*); this neighborhood contains none of the points p_1, ..., p_n, but this is a contradiction since x_0 is a limit point of S.

Definition. Let S ⊆ E_n and let x_0 ∈ E_n; then if every N(x_0, r) contains a point of S and a point not in S, x_0 is a boundary point of S.

Note that a closed set contains all its boundary points, because a boundary point either belongs to the set or is a limit point of the set (why?).

Definition. The boundary of a set is the set of all boundary points of the set.

Definition. A set in E_n is said to be bounded if it is a subset of some N(0, r) for r > 0. A set is said to be unbounded if it is not bounded.

Theorem 22. The complement of an open set is closed. If a set is closed, its complement is open.

Theorem 23. The union of any collection of open sets is open.

Theorem 24. The intersection of any collection of closed sets is closed.

Discussions of these and related concepts can be found in many introductory books on analysis; two good sources are Boas [3] and Natanson [8].

We noted earlier that the set L = {X | AX = b}, where A is a nonzero vector and b is a scalar, is a hyperplane. The set Q = {X | AX ≥ b} is called a half space in E_n. It is indicated in the exercises at the end of this section that Q* = {X | AX > b} is an open set, or open half space, and Q is a closed set, or closed half space.

Exercises

1. The unit ball in E_n is defined by

    B = {(x_1, ..., x_n) | Σ_{i=1}^n x_i^2 ≤ 1}.

Prove that B is a convex set.

2. Prove Theorem 21.
3. Prove Theorem 22.
4. Prove Theorem 23.
5. Prove Theorem 24.
6. Prove that the set H = {X | AX ≥ b}, A a nonzero vector in E_n and b a scalar, is closed.
7. Prove that the set H* = {X | AX > b} is open.
8. Prove that a hyperplane in E_n is a closed set.
9. Note distinctions between the concept of bounded set and boundary of a set. Give an illustration of an unbounded set which has a finite set of boundary points. Construct an example of an unbounded set that does not contain its boundary points.
10. Can a set be open and closed? If so, produce two examples.

1.10. Brief Review of Linear Transformations. Suppose we consider E_n, a basis for E_n consisting of the unit vectors {U_1, ..., U_n}, E_p, and a unit basis for E_p, {U_1', ..., U_p'}. From linear algebra we know that if T is a linear transformation from E_n into E_p, then T is fully specified if we indicate the images of the basis vectors of E_n under T (see Nering [9], Chap. II). In colloquial language, we know where every vector in E_n goes (under T) if we know where the basis vectors in E_n

go. Suppose then for each basis vector U_j ∈ E_n we have

    T(U_j) = a_1j U_1' + ... + a_pj U_p' = Σ_{i=1}^p a_ij U_i'.

Now if X is any vector in E_n, consider its image vector under T,

    T(X) = y_1 U_1' + ... + y_p U_p'.

We know that the representation of the vectors T(U_j) and T(X) in terms of the basis vectors U_i' is unique. Since T is a linear transformation we have

    T(X) = T(x_1 U_1 + ... + x_n U_n) = x_1 T(U_1) + ... + x_n T(U_n)
         = x_1 (Σ_{i=1}^p a_i1 U_i') + ... + x_n (Σ_{i=1}^p a_in U_i')
         = Σ_{i=1}^p (Σ_{j=1}^n a_ij x_j) U_i'.

But since a vector has a unique representation in terms of a basis, this means that

    y_i = Σ_{j=1}^n a_ij x_j,

or, in matrix form,

    Y = AX.

Under these conditions it can be shown that the matrix A = [a_ij] represents the linear transformation T unambiguously (relative to the chosen bases). The matrix equation above indicates that the coordinate vector of the image of X under T is the product of the matrix of the transformation T and the coordinate vector X. Moreover, note that the columns of the matrix A are the coordinates of the images under T of the basis vectors in E_n: the j-th column (a_1j, a_2j, ..., a_pj)^T of A is the coordinate vector of T(U_j).

Definition. The set of all vectors X ∈ E_n that are mapped by T on the zero vector in E_p is called the kernel of the linear transformation T.

Definition. If X is any vector in E_n, then the set of all image vectors T(X) is called the range of T.

Theorem 25. Let T be a linear transformation from E_n into E_p; then (i) the kernel of T is a subspace of E_n; (ii) the range of T is a subspace of E_p; (iii) dim (kernel) + dim (range) = n = dim (E_n).
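Theorem 25(iii) is easy to check on a concrete matrix. A brief sketch in Python, assuming numpy and scipy (the matrix is an arbitrary illustrative one with a dependent row):

    import numpy as np
    from scipy.linalg import null_space

    # Matrix of a linear transformation T from E_4 into E_3 relative to chosen bases.
    A = np.array([[1.0, 2.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0, 2.0],
                  [1.0, 3.0, 1.0, 3.0]])    # third row = first row + second row

    dim_range  = np.linalg.matrix_rank(A)        # dimension of the range of T
    dim_kernel = null_space(A).shape[1]          # dimension of the kernel of T
    print(dim_range, dim_kernel, dim_range + dim_kernel)   # 2, 2, 4 = dim(E_4)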

Note that one can use a system of homogeneous linear equations to define the kernel of a linear transformation T. For example, suppose A is the matrix of a linear transformation from E_n into E_p relative to a chosen pair of bases. Then

    kernel of T = {X | AX = 0}.

A system of homogeneous linear equations always has a solution (the zero vector). Another way of saying this is: the kernel of a linear transformation always contains the zero vector (of E_n). Note that this is part of what is implied by Theorem 25(i), since a subspace must contain the zero vector. It may be the case that the kernel of T contains only the zero vector; then we know that T is a one-to-one transformation. On the other hand, if the kernel is a subspace other than the zero vector, there will be nonzero vectors in E_n which are mapped by T on the zero vector in E_p, and the corresponding system of linear equations AX = 0 has a nontrivial solution.

We now consider a special case of the foregoing.

Question 1. Are there vectors in E_n such that AX = 0 and X > 0? That is, does the kernel of T contain a positive vector X?

To provide an answer, we normalize the vectors X > 0 (this is possible, if such vectors exist, since X > 0), and note that such a normalized vector is a probability vector. We consider

Question 1*. Does T map probability vectors in E_n on the zero vector in E_p?

Definition. Let {U_1, ..., U_n} be the set of unit vectors

in E_n; then the convex hull of these vectors, denoted H_n(U_1, ..., U_n), is called the fundamental probability simplex (F.P.S.) in E_n.

We can now state Question 1 in yet another way:

Question 1**. Does the image of the F.P.S. under T contain the origin of E_p?

We will see in the next chapter that answers to these questions will lead us to theorems that enable us to prove the existence and duality theorems of linear programming as well as other important theorems in the theory of linear inequalities and in the theory of games.

Theorem 26. A linear transformation T maps subspaces into subspaces.

Theorem 27. A linear transformation T maps linear manifolds into linear manifolds.

Theorem 28. A linear transformation maps convex sets into convex sets. In particular, the image of the F.P.S. in E_n is a convex set in E_p.

Exercises

1. Prove Theorem 25.
2. Prove Theorem 26.
3. Prove that if the kernel of T consists only of the zero vector, then T is a one-to-one linear transformation.
4. Prove Theorem 27.
5. Prove Theorem 28.

Chapter 2. Linear Optimization

2.1. Proof of a Key Theorem. We return to a consideration of Question 1** of Section 1.10 and recall that the j-th column vector of the matrix A is the image of the vector U_j ∈ E_n. Therefore, the image of H_n(U_1, ..., U_n) -- the fundamental probability simplex in E_n -- is the set H_p(A_1, ..., A_n), where H_p(A_1, ..., A_n) is the convex hull of the image vectors A_1, ..., A_n in E_p. To put it another way, if we consider the set of all convex combinations of the unit vectors in E_n and then consider the image of this set in E_p, then the latter will consist of exactly the same convex combinations of the images of the unit vectors (Theorem 28).

Now if the zero vector is in the set H_p(A_1, ..., A_n), there exists a probability vector in E_n satisfying AX = 0, X > 0, and we answer Question 1 (and Questions 1* and 1**) in the affirmative. On the other hand, if the zero vector in E_p is not contained in the set H_p, there is no probability vector in E_n satisfying AX = 0, X > 0.

Case I. Suppose 0 ∉ H_p; what happens in this case? In Figure 10 we show the set H_p(A_1, ..., A_n) as the indicated convex hull of the vectors A_1, ..., A_n, and the origin (of E_p) is shown contained in a hyperplane L (0 ∉ H_p). Now since H_p is a bounded and closed set, there exists a point P ∈ H_p that is closest to the origin (a basic result from the calculus). Let U be the vector having the origin as initial point and P as end point (we know that U ≠ 0 because

Figure 10. The convex hull H_p(A_1, ..., A_n) of the vectors A_1, ..., A_n and a hyperplane L through the origin, with 0 ∉ H_p.

0 ∉ H_p). Let U be a normal vector to a hyperplane L containing the origin (see Figure 10).

We first show that L does not intersect H_p for any point P chosen as above. Suppose Q ∈ H_p is also in L and consider the line segment PQ in Figure 11. Since Q ∈ L, the vector OQ

Figure 11. The right triangle OPQ; the altitude from O meets the hypotenuse PQ at R.

is orthogonal to OP, so the angles OPQ and OQP are acute and PQ is the hypotenuse of a right triangle. Construct an altitude of the triangle from O and let R be the point of intersection of this altitude with PQ. Since P, Q ∈ H_p and H_p is convex, the line segment joining P and Q is in H_p, so R ∈ H_p. This means that R ∈ H_p is closer to the origin than P. Hence we have a

contradiction, since we assumed that P is the point of H_p closest to the origin. Thus L does not intersect H_p and H_p is in the interior of the half space bounded by the hyperplane L. Furthermore, given H_p in the interior of the half space and the normal vector U, every vector in H_p has a positive inner product with U; in particular, UA_i > 0 (i = 1, ..., n). This establishes the following result: if 0 ∉ H_p (i.e., there is no vector X > 0 in E_n such that AX = 0), then there exists a vector U ∈ E_p such that

    UA_i > 0   (i = 1, ..., n),

where A_i is the i-th column of the matrix A.

Conversely, if there exists a vector U ∈ E_p such that UA_i > 0 for i = 1, ..., n, then 0 ∉ H_p(A_1, ..., A_n). To prove this, suppose that UA_i > 0 for i = 1, ..., n and 0 ∈ H_p; then

    0 = c_1 A_1 + ... + c_n A_n   for c_i ≥ 0, Σ_{i=1}^n c_i = 1.

Moreover, some c_i > 0, since Σ_{i=1}^n c_i = 1. Now

    U0 = U(c_1 A_1 + ... + c_n A_n) = c_1 UA_1 + ... + c_n UA_n.

Since UA_i > 0 for every i and c_i > 0 for at least one i, we have

    c_1 UA_1 + ... + c_n UA_n > 0.

But this is a contradiction, since U0 = 0 for every U. The situation 0 ∉ H_p is characterized by the theorem of Gordan (of which the above remarks constitute a proof): the system X > 0, AX = 0 has a solution if and only if there is no vector U ∈ E_p such that UA > 0.

Case II. If 0 ∈ H_p, then 0 is either on the boundary of H_p or in the interior of H_p. Suppose 0 is on the boundary of H_p and we have the situation shown in Figure 12. Here we have UA_1 = 0, UA_n = 0 and UA_i > 0 for i = 2, ..., n-1.

Figure 12. The convex hull of A_1, ..., A_n with the origin on its boundary, lying on the line segment joining A_1 and A_n; U is normal to a supporting hyperplane at the origin.

Now since 0 ∈ H_p we have

    0 = x_1 A_1 + x_2 A_2 + ... + x_{n-1} A_{n-1} + x_n A_n,

where x_i ≥ 0, Σ_{i=1}^n x_i = 1. Also from the case in Figure 12 we see that, in particular, x_1 > 0, x_n > 0 and x_i = 0 for i = 2, ..., n-1 (since 0 is on the line segment joining A_1 and A_n). The preimage X of 0 ∈ E_p is the same linear combination of the basis vectors of E_n as 0 is of the vectors A_1, ..., A_n in E_p, so

    X = (x_1, x_2, ..., x_n)^T,   AX = x_1 A_1 + ... + x_n A_n = 0,   x_1 > 0, x_n > 0,   x_j = 0 for j = 2, ..., n-1.

Consider the vectors UA = [UA_1, ..., UA_n] and X^T = [x_1, ..., x_n]; whenever UA_i = 0, the corresponding x_i > 0, and whenever UA_i > 0, the corresponding x_i = 0. Thus in this case we have

    X^T + UA > 0.

This is a proof for the special case in Figure 12 and contains the essential features of a proof which must be established for whatever "edge" the origin is contained in (see Good [5], p. 9-10).

Case III. If 0 ∈ H_p and is an interior point of H_p, then there is a convex combination

    0 = x_1 A_1 + ... + x_n A_n,   x_i > 0 (i = 1, ..., n).

This means that if X is a preimage of 0 ∈ E_p, then X is strictly positive. Now let U = 0 and again we have X^T + UA > 0.

A concise statement of all three possibilities is contained in the following theorem (called the key theorem by Tucker, see Good [5], p. 7).

Theorem 29. The system of linear inequalities

    AX = 0,   X ≥ 0,   UA ≥ 0

has a solution X* and U* such that

    X*^T + U*A > 0.

Proof. Except for details relating to Case II, this is proved in the foregoing discussion.

It is perhaps more useful to state this theorem in the following way. Let A_1, ..., A_n be any finite set of vectors in E_p; there exist scalars x_1*, ..., x_n* and a vector U* ∈ E_p such that

    A_1 x_1* + ... + A_n x_n* = 0,
    x_j* ≥ 0   (j = 1, ..., n),
    U*A_j ≥ 0   (j = 1, ..., n),

and

    x_j* + U*A_j > 0   (j = 1, ..., n).

Corollary to Key Theorem. Let B, A_1, ..., A_n be n+1 vectors in E_p. If UB ≥ 0 for every U satisfying the inequalities

    UA_i ≥ 0,   i = 1, ..., n,

then B is a nonnegative linear combination of the vectors A_1, ..., A_n.

Proof. Consider the vectors -B, A_1, ..., A_n; from the key theorem we know that there exist x_i* ≥ 0, i = 0, 1, ..., n, satisfying

(1)   -Bx_0* + A_1 x_1* + ... + A_n x_n* = 0,

and a vector U* satisfying U*A_i ≥ 0, i = 1, ..., n, and U*(-B) ≥ 0, for which

(2)   x_0* + U*(-B) > 0,
(3)   x_i* + U*A_i > 0   (i = 1, ..., n).

Write (1) as

(4)   A_1 x_1* + ... + A_n x_n* = x_0* B.

If we can divide both sides of (4) by x_0* > 0, the conclusion of the theorem immediately results. Suppose x_0* = 0; then (2) would require that U*B < 0, while U* satisfies U*A_i ≥ 0 (i = 1, ..., n). This contradicts the hypothesis, so x_0* > 0. Dividing (4) on both sides by x_0* gives the desired result.

This corollary is known as the Farkas theorem or lemma in the literature on linear optimization and the theory of games.

Theorem 30 (Theorem of the Alternative for Matrices). The dual systems

    MX ≤ 0        M^T U ≥ 0
    X ≥ 0         U ≥ 0

have solutions U* and X* such that

    U* - MX* > 0   and   M^T U* + X* > 0.

Proof. Suppose the system MX ≤ 0, when written out more fully, appears as

    a_11 x_1 + ... + a_1n x_n ≤ 0
    . . .
    a_p1 x_1 + ... + a_pn x_n ≤ 0.

Insert slack variables w_1, ..., w_p and get a linear equation system whose matrix is A = [M  I]. Let

    U* = (u_1*, ..., u_p*)^T   and   Y* = (x_1*, ..., x_n*, w_1*, ..., w_p*)^T;

applying the key theorem to the matrix A we know that there exist vectors U* and Y* satisfying

    AY* = 0,   Y* ≥ 0,   A^T U* ≥ 0,   Y* + A^T U* > 0.

Now consider A^T U* ≥ 0; we get

    A^T U* = [M  I]^T U* = (M^T U*, I U*)^T ≥ 0,

or

    M^T U* ≥ 0   and   I U* = U* ≥ 0.

Also from

the conditions AY* = 0, Y* ≥ 0, and Y* + A^T U* > 0, written in the partitioned form Y* = (X*, W*)^T, we get

    MX* + IW* = 0   =>   MX* = -W* ≤ 0,

and

    X* + M^T U* > 0,   W* + U* > 0.

Since -MX* = W*, from the latter inequality we get U* - MX* > 0. Collecting results, we have

    MX* ≤ 0,   X* ≥ 0,   X* + M^T U* > 0,
    M^T U* ≥ 0,   U* ≥ 0,   U* - MX* > 0,

which was to be shown.

Theorem 31 (The Skew-Symmetric Matrix Theorem). Let K be a skew-symmetric matrix; the system

    KW ≥ 0,   W ≥ 0

has a solution W* such that

    W* + KW* > 0

(i.e., W* has at least one positive component, since W* + KW* > 0 implies that W* ≠ 0).

Proof. Let M be a skew-symmetric matrix. Applying Theorem 30 to M we get the dual systems

(5)    MX ≤ 0,    X ≥ 0        (6)    M^T U ≥ 0,    U ≥ 0,

and since M^T = -M, the system (6) can be written

(7)    MU ≤ 0,    U ≥ 0.

These systems have solutions X* and U* such that

(8)    U* - MX* > 0        and        (9)    X* - MU* > 0.

Combining the systems (5) and (7) gives

(10)    M(X* + U*) ≤ 0,    (X* + U*) ≥ 0,

and combining (8) and (9) gives

(11)    (X* + U*) - M(X* + U*) > 0.

Now let K = -M and W* = X* + U*; then (10) and (11) can be written

(12)    KW* ≥ 0,    W* ≥ 0,    W* + KW* > 0,

as was to be shown.

Suppose, on the other hand, we wanted to find a W* satisfying (12) for a given skew-symmetric matrix K. We take its negative, call it C (C = -K), and apply Theorem 30 to the matrix C.

Since C is skew-symmetric, the argument just given yields a vector W* = X* + U* such that

W* ≥ 0,    CW* ≤ 0,    W* - CW* > 0,

and because C = -K this W* satisfies (12) for the given matrix K.

This result can also be stated as follows. For a skew-symmetric matrix K there exists a probability vector P (normalize W* so that its components sum to one) such that KP ≥ 0 and P + KP > 0. Geometrically this means that the convex hull of the column vectors of K intersects the closed, nonnegative orthant.

Moreover, W* has at least one component greater than zero, since W* + KW* > 0 implies that W* ≠ 0. Finally, the vectors W* and KW* have the property that for each j = 1, ..., n the jth component of one of the vectors W*, KW* is positive if and only if the jth component of the other is zero. The reason is clear: each component of the sum W* + KW* is positive, but no component of W* and the corresponding component of KW* can both be positive, because W*^T(KW*) = 0 and every term in this inner product is nonnegative; indeed (W*^T K W*)^T = W*^T K^T W* = -(W*^T K W*) by the skew-symmetry of K, and a scalar equals its own transpose.
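The probability-vector form of the theorem can likewise be seen in a small case. In the sketch below the 3 × 3 skew-symmetric matrix and the vector P are chosen only for illustration.

```python
# Illustrative check of the probability-vector restatement (data chosen arbitrarily).
import numpy as np

K = np.array([[ 0.0,  1.0, -1.0],
              [-1.0,  0.0,  1.0],
              [ 1.0, -1.0,  0.0]])     # skew-symmetric

P = np.full(3, 1.0 / 3.0)              # a probability vector
KP = K @ P                             # a convex combination of the columns of K

assert np.allclose(K.T, -K)
assert np.isclose(P.sum(), 1.0) and np.all(P >= 0)
assert np.all(KP >= -1e-12)            # KP >= 0 (up to rounding): the combination
                                       # lies in the closed nonnegative orthant
assert np.all(P + KP > 0)              # P + KP > 0
assert abs(P @ KP) < 1e-12             # P^T K P = 0, the complementarity identity
# Here every component of P is positive and every component of KP is zero,
# one instance of the complementary pattern described above.
print("P =", P, " KP =", KP)
```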

".* - 63 - and minimize g = blU + * + b u p p subject to all + *- + aplp > c1 alnul + + apnUp = an Ui > 0 (i=l, -, p). Lemma 1. If X and U are feasible programs, respectively, then CT _ UTB. Lemma 2. If X* and U* are feasible programs respectively, and if C^X* = U*TB, then X* and U* are optimal vectors. Proof, Suppose X* and U* are feasible; from Lemma 1 we have for all feasible programs CTX U B. In particular, cTx* < UB. But cTx* = U*TB so u*TB B for any feasible U. This means that U* is an optimal program for the minization problem. A similar argument establishes that X* is optimal for the maximization problem. Now define a skew-symmetric matrix K to be. 0 -A B K = AT 0 -C BT CT a U the vector W = X, t a scalar. From Theorem 31 we know that there exists a vector there exis ts a vector

From Theorem 31 we know that there exists a vector

W* = [U* ; X* ; t*]^T

such that

(13)    KW* ≥ 0,    W* ≥ 0,

and

(14)    KW* + W* > 0.

The inequalities (13) give, when written out,

(i)     AX* ≤ Bt*
(ii)    A^T U* ≥ Ct*
(iii)   C^T X* ≥ B^T U*,

and the inequality (14) gives

(iv)    AX* < Bt* + U*
(v)     A^T U* + X* > Ct*
(vi)    C^T X* + t* > B^T U*.

Lemma 3. Suppose t* > 0; then there are optimal vectors X^0 and U^0 such that

C^T X^0 = (U^0)^T B,    U^0 + B > AX^0,    (U^0)^T A + (X^0)^T > C^T.

Proof. Multiply W* by the scalar 1/t*; then

W^0 = (1/t*) W* = [U^0 ; X^0 ; 1]^T,

and X^0 and U^0 then satisfy (i) and (ii) with t = 1 and so are feasible. From (iii) and Lemma 1 we get C^T X^0 = (U^0)^T B, and Lemma 2 then shows that X^0 and U^0 are optimal. The inequalities in the statement of Lemma 3 follow from (iv) and (v) with t^0 = 1.

Lemma 4. Let t* = 0; then

(a) at least one of the LP problems has no feasible vector;
(b) if the maximization problem has a feasible vector, then the solution set is unbounded and C^T X is unbounded over this solution set (a corresponding statement holds for the dual problem);
(c) neither problem has an optimal solution.

Proof. Suppose X is a feasible vector for the maximization problem. Using (ii), t* = 0, and the nonnegativity of X we get U*^T AX ≥ 0. This, together with (vi), t* = 0, and the feasibility of X, yields

(15)    0 ≤ U*^T AX ≤ U*^T B < C^T X*.

To prove (a) we note that if the minimization problem also had a feasible vector U, then from (i) with t* = 0 and the feasibility of U we would get

(16)    0 ≥ U^T AX* ≥ C^T X*.

Now (16) contradicts (15), proving (a).
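The situation t* = 0 can be seen in a concrete pair of problems. In the sketch below, whose data are chosen only for illustration, the maximization problem is feasible but its objective is unbounded, and the corresponding minimization problem has no feasible vector, as parts (a) and (b) assert; scipy.optimize.linprog reports these outcomes through its status field (3 for an unbounded problem, 2 for an infeasible one).

```python
# Illustrative instance of Lemma 4: unbounded primal, infeasible dual (arbitrary data).
import numpy as np
from scipy.optimize import linprog

# Maximize x_1 subject to -x_1 + x_2 <= 1, x >= 0: feasible but unbounded.
A = np.array([[-1.0, 1.0]])
B = np.array([1.0])
C = np.array([1.0, 0.0])

primal = linprog(c=-C, A_ub=A, b_ub=B, bounds=[(0, None)] * 2)  # linprog minimizes
print("primal status:", primal.status)   # 3: the objective is unbounded

# Dual: minimize u_1 subject to -u_1 >= 1, u_1 >= 0: no feasible vector.
dual = linprog(c=B, A_ub=-A.T, b_ub=-C, bounds=[(0, None)] * 1)
print("dual status:", dual.status)       # 2: the problem is infeasible
```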

To prove (b), consider the vector X + λX* for λ ≥ 0. Clearly X + λX* ≥ 0, and from (i) with t* = 0 we have

A(X + λX*) ≤ AX ≤ B.

Therefore X + λX* is feasible for every λ ≥ 0. This proves the first assertion of (b). Moreover, from (15) we have C^T X* > 0, so

C^T(X + λX*) = C^T X + λ C^T X*

can be made arbitrarily large by choosing λ sufficiently large. This proves the second assertion of (b). Finally, (c) is an immediate consequence of (b) for the maximization problem and for the minimization problem.

Lemma 5. Either both the maximization problem and the minimization problem have optimal vectors or neither does. In the first case the maximum and minimum values are equal; their common value is called the optimal value of the dual LP problems.

Proof. If one of the problems has optimal vectors, then (c) of Lemma 4 shows that t* > 0. Then Lemma 3 shows that both problems have optimal vectors X^0 and U^0 such that max C^T X = C^T X^0 = (U^0)^T B = min U^T B.

Theorem 32 (The Duality Theorem). A feasible X^0 is optimal if and only if there is a feasible vector U^0 with C^T X^0 = (U^0)^T B. A feasible U^0 is optimal if and only if there is a feasible vector X^0 with C^T X^0 = (U^0)^T B.

Proof. We prove only the first statement, since the second has an analogous proof. Lemma 2 is a proof of the

sufficiency of this theorem. To prove necessity, suppose that X^0 is optimal. By Lemma 4(c) we must have t* > 0. By Lemma 3 the minimization problem also has an optimal vector U^0, and by Lemma 5, max C^T X = min U^T B.

Theorem 33 (Existence Theorem of Linear Programming). A necessary and sufficient condition that one (and therefore both) of the linear programming problems have optimal vectors is that both have feasible vectors.

Proof. The necessity is obvious. To prove sufficiency, suppose that both problems have feasible vectors. By Lemma 4(a) we have t* > 0, and by Lemma 3 both problems have optimal vectors.

Theorem 34 (A Complementary Slackness Theorem). If both problems have feasible vectors, then they have optimal vectors X^0 and U^0 such that

(1) if A_i X^0 = b_i, A_i being the ith row vector of the matrix A, then u_i^0 > 0 (the ith dual variable is positive);

(2) if (U^0)^T A_j = c_j, A_j being the jth column vector of A, then x_j^0 > 0 (the jth primal variable is positive).

Proof. By the existence theorem both problems have optimal vectors, so Lemma 4(c) implies that t* > 0. By Lemma 3 there are optimal vectors such that

(17)    U^0 + B > AX^0,    (U^0)^T A + (X^0)^T > C^T.

The theorem is clearly proved, for consider the inequality on the left above. It can be written as

u_i^0 + b_i - A_i X^0 > 0    (i = 1, ..., p).

Then if A_i X^0 - b_i = 0, we must have u_i^0 > 0. Now consider the inequality on the right in (17). It can be written as

[(U^0)^T A_1 - c_1, ..., (U^0)^T A_n - c_n] + [x_1^0, ..., x_n^0] > [0, ..., 0].

Obviously if (U^0)^T A_j = c_j, the strict inequality requires x_j^0 > 0.
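These results can be observed numerically on the small problem used earlier (the data are illustrative only). The sketch below solves the problem and its dual with scipy.optimize.linprog, checks that the two optimal values agree as Theorems 32 and 33 require, and checks the complementary slackness pattern of Theorem 34 for the solutions returned; for this nondegenerate example the strict inequalities hold.

```python
# Illustrative check of duality and complementary slackness (data chosen arbitrarily).
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 1.0],
              [1.0, 3.0]])
B = np.array([4.0, 6.0])
C = np.array([3.0, 2.0])

# Primal: maximize C^T X subject to AX <= B, X >= 0 (linprog minimizes, so negate C).
primal = linprog(c=-C, A_ub=A, b_ub=B, bounds=[(0, None)] * 2)
# Dual: minimize B^T U subject to A^T U >= C, U >= 0.
dual = linprog(c=B, A_ub=-A.T, b_ub=-C, bounds=[(0, None)] * 2)

X0, U0 = primal.x, dual.x
print("max C^T X =", C @ X0, "  min U^T B =", B @ U0)   # both equal 12

# Complementary slackness in the sense of Theorem 34:
for i in range(len(B)):
    if np.isclose(A[i] @ X0, B[i]):          # i-th primal constraint holds with equality
        print("row", i, "tight, dual variable u =", U0[i])    # positive here
for j in range(len(C)):
    if np.isclose(U0 @ A[:, j], C[j]):       # j-th dual constraint holds with equality
        print("col", j, "tight, primal variable x =", X0[j])  # positive here
```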

Bibliography

1. Ben-Israel, A., Linear Equations and Inequalities on Finite Dimensional Real or Complex Vector Spaces: A Unified Theory, Systems Research Memorandum No. 224, Systems Research Group, Northwestern University, November 1968.

2. Birkhoff, G., and S. MacLane, A Survey of Modern Algebra, 3rd edition, Macmillan, 1965.

3. Boas, R. P., A Primer of Real Functions, Mathematical Association of America, Carus Mathematical Monograph No. 13, 1960.

4. Dantzig, G., Linear Programming and Extensions, Princeton University Press, 1963.

5. Good, R. A., "Systems of Linear Relations," SIAM Review, vol. 1, no. 1, January 1959, pp. 1-31.

6. Halmos, P., Finite-Dimensional Vector Spaces, 2nd edition, Van Nostrand, 1958.

7. Kuhn, H. W., and A. W. Tucker, eds., Linear Inequalities and Related Systems, Annals of Mathematics Studies No. 38, Princeton University Press, 1956.

8. Natanson, I. P., Theory of Functions of a Real Variable, Ungar Publishing Co., 1955.

9. Nering, E. D., Linear Algebra and Matrix Theory, John Wiley & Sons, 1963.

10. Nomizu, K., Fundamentals of Linear Algebra, McGraw-Hill Book Co., 1966.

11. Shilov, G. E., An Introduction to the Theory of Linear Spaces, Prentice-Hall, Inc., 1961.

12. Spivey, W. A., and R. M. Thrall, Linear Optimization, Holt, Rinehart and Winston, 1970.
