AN EFFICIENT TRANSFORMATION OF THE GENERALIZED TRAVELING SALESMAN PROBLEM Charles E. Noon Management Science Program The University of Tennessee Knoxville, TN 37996 James C. Bean Department of Industrial & Operations Eng. The University of Michigan Ann Arbor, MI 48109-2117 Technical Report 89-36 December 1989

An Efficient Transformation of the Generalized Traveling Salesman Problem Charles E. Noon Management Science Program The University of Tennessee, Knoxville 37996. James C. Bean Department of Industrial and Operations Engineering The University of Michigan, Ann Arbor 48109. December 17, 1989 ABSTRACT This paper discusses the Generalized Traveling Salesman Problem (GTSP). The GTSP is defined on a directed graph in which the nodes are members of predefined nodesets. Each node is a member of some but not all nodesets. The GTSP seeks a minimum cost cycle which passes through each nodeset exactly once. Many discrete decisions which complicate routing problems can be modeled in this form. In this paper, we describe a class of problems that can be formulated as GTSP's and show how any problem in this class can be efficiently transformed into a standard traveling salesman problem. Keywords: Traveling salesman problem, vehicle routing, integer programming. 1

The Traveling Salesman Problem (TSP) is one of the oldest and most widely studied optimization problems in the field of operations research (see [8, 131). For the standard TSP, we assume that a single salesman, based at a home city, must visit a number of cities (or customers). The problem is to find the salesman's minimum cost cycle (or tour) which leaves the home city, passes through all the customer cities and returns to the home city. Real routing or sequencing problems often require additional decisions or considerations. Often, the standard TSP is used only after a number of discrete decisions have been made. For example, in vehicle routing, the standard TSP is often applied after the customer to vehicle assignments have been made. Preliminary decisions may also include choice of fleet vs. contract carrier, depot locations, backhauling, and alternative vehicles. This paper provides a framework for including many of these discrete complications into a single, structured model, the Generalized Traveling Salesman Problem (GTSP). The GTSP is defined on a directed graph in which the nodes are members of predefined nodesets. Each node is a member of some, but not all, nodesets. For the GTSP, we seek a minimum cost cycle which passes through each nodeset exactly once. Since a feasible cycle need not include every node, the decision regarding which nodes to pass through is imbedded within the GTSP. The model requires that we consider both the sequence and selection decisions simultaneously. Early applications using GTSP formulations are given for sequencing computer files [3] and for routing welfare clients through governmental agencies [14]. More recently, Laporte, Nobert and Mercure [71 discuss GTSP applications in the routing of mail vans. Several potential applications are given in [9] including warehouse order-picking with multiple stock locations, airport selection/routing for courier planes, and certain types of flexible manufacturing scheduling. Recent GTSP formulations for the Multi-Depot Multiple Traveling Salesman Problem (MTSP) and the Heterogeneous MTSP can be found in [11]. Certain other discrete decision/routing problems from the literature may be reformulated as GTSP's. One recent example is the Covering Salesman Problem (CSP) studied by Current and Schilling [2]. For the CSP, we are given a directed graph and a "covering radius", S. Unlike the standard TSP, the CSP does not require that each city be visited. A feasible tour for the CSP need only include a visit within radius S of each city. In this respect the CSP seeks a 2

minimum cost tour that "covers" all cities. The CSP can be modeled as a GTSP by replacing arc costs with shortest path costs (as discussed in [2]) and then defining nodesets. For each city, we define a nodeset with members consisting of the city and all surrounding cities within radius S. In this paper, we show how any problem formulated as a GTSP can be transformed into a standard TSP. One benefit of this type of transformation is that it allows existing TSP approaches, both heuristic and optimal, to be applied to the transformed GTSP problems. Such an approach may not necessarily outperform specialized algorithms for the GTSP type problems, however, it would provide researchers with a tool for pursuing or verifying optimality on smaller problems. Another benefit is that the transformation may allow us to relate results for the well-studied TSP to the more realistic complex routing problems. As an intermediate step, the GTSP is transformed to a canonical GTSP. An alternative solution approach is to employ specialized algorithms available for this problem, e.g. [7, 10]. 1.0 Problem Statement The asymmetric GTSP is defined on a directed graph with nodes XA and connecting arcs A. Each arc, (i, j) E A, has a corresponding cost, cij, with Icj | < +oo. The nodes of the GTSP belong to one or more predefined nodesets, Si, i = 1,2..., m. Any feasible solution must pass through at least one member of each nodeset. We assume that each node is a member of some nodesets, but not all. A further assumption is that there does not exist a nodeset which overlaps with all other nodesets, that is, j some nodeset Si such that for each j, Si n Sj S 0. A feasible GTSP tour is defined as a directed cycle (not necessarily simple) which enters and exits each nodeset exactly once. A subtour is a directed cycle which passes through each nodeset at most once and is not a tour. Figures l(a) and 1(b) display example feasible GTSP tours. Figure l(a) displays a feasible tour which passes through a node twice. Figure 1(b) displays a feasible tour which passes through an arc more than once. Both are acceptable as long as the tour is uninterrupted and each nodeset is entered and left exactly once. The GTSP can be stated as the problem of finding a minimum cost feasible GTSP tour. The standard TSP is a special case of the GTSP with mutually exclusive nodesets of cardinality one. 3

We can formulate the GTSP as an integer program with variable xij equal to the number of times arc (i, j) is traversed in a solution. The GTSP can be stated: Minimize E cijzij (i,j)EA Subject to, E E Xij = 1 iESk jeSk (i,j)EA E E. ij = 1 isSk jESk (i,j)EA I E xij- X jk = iEX / kE.n (i,j)EA (j,k)E.A z1 xi - E >E xkl < (i,j)EAT k7T f-V l 7T (k,l)EA 0 lA-rl- 1 for all nodesets, Sk, k = 1,..., m. (i) for all nodes j E /. (ii) for all possible subtours T, where As represents the arcs of T (iii) and A/ represents the nodes of T. for all (i,j) E A. xij > 0, integer Group (i) contains two constraints for each nodeset to ensure that exactly one arc entering a nodeset and one arc leaving a nodeset are included in a solution tour. Constraint group (ii) contains one constraint for each node to ensure a solution tour is uninterrupted. Constraint group (iii) prevents a disjoint subtour from appearing. As seen in Figures l(a) and 1(b), small cycles are allowed if they are part of a complete tour. The GTSP form presented in this paper requires that a feasible solution pass through each nodeset exactly once. In [9], a more general form is presented which allows nodesets to be specified as requiring either exactly one visit or at least one visit. It is shown that any problem in the more general form can be transformed into the form presented in this paper. 4

2. Problem Transformation We now want to show that any problem formulated as a GTSP can be transformed to a standard TSP. The fact that this is possible is not surprising since complexity theory assures us that any NP-hard problem can be polynomially transformed to any other NP-hard problem. Such transformations, however, are often accompanied by a significant increase in the size of the problem instance. For the transformation we present, the number of nodes in the resulting TSP is given by Ei=l ISi l. That is, the number of nodes is only increased according to nodes appearing in multiple nodesets. The entire transformation is divided into three main stages. In the first stage, the general problem is transformed to a GTSP having mutually exclusive nodesets. In the second stage, the intraset arcs are eliminated leading to the canonical form. In the third stage, the canonical form of the problem is transformed to a clustered TSP and then finally into a standard TSP. We define the following notation. Let M(i) be the collection of nodesets of which node i is a member, i.e, if i E SK, then K E M(i). Let (PO) be a restatement of the general problem. See Figure 2 for an example. Definition: Problem (PO) is a GTSP defined over nodes Af0 and arcset A~ with corresponding cost vector c~. For a given node i, let M(i) index the nodesets containing i. 2.1 Mutually Exclusive Nodesets Given that we have a problem in the form of (PO), this section describes a transformation to make the nodesets mutually exclusive. We begin by defining a problem (P1) which has nodes identical to (PO) but a different arcset. Define an arc in (P1) from i to j only if node j belongs to a nodeset to which i does not. This ensures that each arc crosses into at least one new nodeset. This characteristic will allow us to change all arc costs without disturbing the order of solutions. Definition: Problem (Pl) is a GTSP defined over nodes M/ = KO. The nodes are the same with respect to nodeset membership as well. The arc set Al is defined as follows. 5

Let i, j be two nodes of AC~ such that M(i) X M(j). Calculate the shortest path cost from i to j over the arcset Aij C A~, where arc (k, I) E Aij if the following hold, M(k) C M(i) U M(j), M(l) C M(i) UM(j), I does not belong to a nodeset Sp, P E M(i), such that k $ Sp, k does not belong to a set SQ, Q E M(j), such that I $ SQ. If the shortest path has finite cost, create arc (i',j') E Al with cost cl,j, equal to the shortest path cost. If no feasible path exists, then no arc (i', j') is created for A1. Each shortest path is calculated over a subnetwork which allows nodeset boundaries to be crossed only once. Each arc (i',j') E Al corresponds to the shortest path from i to j which does not pass through any other nodesets. Figure 3 shows an example. If the shortest path contains a negative cycle, the negative cycle can only occur over intraset arcs without crossing nodeset boundaries. In this case, a corresponding arc is created and given a large negative cost. If there exists a feasible solution to (P1) which contains the arc with the large negative cost, then the original problem is unbounded below. Lemma 1: Given an optimal solution, y*, to (P1), we can construct an optimal solution x* to (PO) with cx* = cly*. Proof: See Appendix A. Now that we have redefined the arcs and arc costs of the problem so that each arc enters at least one new nodeset, we can then add a constant to the cost of all arcs entering a nodeset without changing the order of solutions. Let us create a problem (P2) which differs from (P1) only by arc costs. Definition: Problem (P2) is a GTSP over the nodeset V2 = A' with A2 = Al. The nodes are the same with respect to nodeset membership as well. The arc costs for (P2) are computed below. First, however, we must make all costs of (P1) greater than or equal to zero. Let Cmin = Minimum{clj over all (i, j) E Al, O. For each nodeset, subtract cmin from all arcs entering that nodeset. Since each arc of (P1) enters at least 6

one new nodeset, all arc costs will become nonnegative. Assuming now c1 > 0, let a be defined as a positive number strictly greater than the sum of all arc costs, i.e., +0oo > > ci. Given an arc (i,j) E Al, let (i,j)EA' the cost of arc (i', j) E A2 be defined as c2j, = ( M(j) - { M(i)nM(j)}j)(a) + clj. This is equivalent to adding a to the arcs entering each nodeset. Since each arc crosses into at least one new nodeset, I M(j) - { M(i) n M(j)} I > 1. Lemma 2: Problem (P2) is equivalent to Problem (P1). Proof: Both cost adjusting operations are permitted since each nodeset must be entered exactly once in a solution tour. ~ Figure 4 shows an example. After the transformations to (P1) and (P2), every arc enters at least one new nodeset and has a cost of at least a. Any feasible solution to (P2) will have a total cost strictly less than (m + 1)a and greater than or equal to (m)a. These characteristics ensure equivalence is preserved throughout the next step of the transformation. We now create a problem (P3) which has mutually exclusive nodesets and is equivalent to (P2). Definition: Problem (P3) is a GTSP over N3, A3, c3, defined below. For every node i in (P2), we create a group of IM(i)I nodes for (P3), each with membership in only one nodeset. More formally, for i E A/2, define iP E A3 for all P E M(i) with iP belonging to only one nodeset, Sp. Hence, IM(iP)I = 1. Let the arcset A3 be defined as follows. - For an arc (i,j) E A2, create (iP,jQ) E A3 with cost c3pjQ = c for all P E M(i), Q E M(j). - For a node i E ^2 such that IM(i)l > 1, create arcs (iP, iQ) E A3 with cost cpQ = 0 for all P E M(i), Q E M(i), P $ Q. Problem (P3) now has mutually exclusive nodesets. Each arc cost of c3 is either greater than or equal to some nonzero multiple of a or is equal to zero. Figure 5 shows an example. 7

Lemma 3: Given an optimal solution, y*, to (P3) such that c3y* < (m + l)a, we can construct an optimal solution, x*, to (P2) with c2* = c3y*. Proof: See Appendix B. The importance of the cost adjusting (P2) transformation towards ensuring equivalence can best be illustrated by an example. Figure 6 shows a problem without the (P2) transformation and the same problem with the (P2) transformation. Both problems are then put in the form of (P3). The first problem allows a feasible solution with cost zero when, in fact, we cannot construct such a solution for the original problem. The problem that was transformed through (P2) allows only feasible solutions with cost greater than (m+ l)a. Thus, it correctly recognizes that the original problem is infeasible. In general, a represents a large penalty for visiting some, rather than all, of the (P3) nodes spawned from a particular multiple membership (P2) node. It forces the spawned nodes to be visited contiguously if visited at all. A feasible solution to (P3) passes through each nodeset exactly once. The arc costs ensure that if a nodeset is entered, a cost of at least a corresponding to that nodeset is incurred in the preceding arcs. The only way the a cost for entering a nodeset could not be incurred is if the nodes of a solution to (P3) all belonged to some particular nodeset. However, in the original problem, we assumed there existed no nodeset which overlapped with all other nodesets. Therefore, any feasible solution to (P3) must have cost greater than or equal to (m)a. In this section, we began with a problem with overlapping nodesets and transformed it to a problem in which the nodesets are mutually exclusive. We may now summarize the results of this section. Theorem 4: Given a GTSP in the form of (PO), we can transform the problem to a problem of the form of (P3). Given an optimal solution to (P3) with cost less than (m + l)a, we can construct an optimal solution to (PO). If an optimal solution to (P3) has cost greater than or equal to (m + 1)a, problem (PO) is infeasible. The next stage begins with a problem of the form (P3) and transforms it to a problem which contains no intraset arcs. 8

2.2 Intraset Arcs The second stage of the problem transformation eliminates intraset arcs. This procedure is applied after the problem has mutually exclusive nodesets. Given a problem of the form (P3), we can define a problem (P4) which differs only in the arcs and arc costs. Definition: Problem (P4) is a GTSP defined over nodes NA4 = Af3. The nodes are the same with respect to nodeset membership as well. The arc set A4 is defined as follows. Let i, j be any two nodes of X.3 such that M(i) $ M(j). Calculate the shortest path cost from i to j over the arcset Aij C A3, where arc (k, I) E Aij if the following hold, M(k) C M(i) U M(j), M(l) C M(i) U M(j), if M(l) = M(i) then M(k) must also equal M(i), if M(k) = M(j) then M(1) must also equal M(j). If the shortest path has finite cost, create arc (i',j') E A4 with cost c~ij, equal to the shortest path cost. If no feasible shortest path exists, then no arc (i', j') is created for A4. Theorem 5: Given an optimal solution, y*, to (P4), we can construct an optimal solution, x*, to (P3). Proof: See Appendix C. Problem (P4) represents the canonical form of the GTSP. The form has mutually exclusive nodesets each requiring exactly one visit and containing no intraset arcs (see Figure 7). The canonical GTSP offers considerable problem structure. Various approaches for solving canonical form GTSP's can be found in [3, 5, 6, 7, 10, 14, and 151. 9

2.3 Transformation to the Standard TSP The last stage transforms a canonical form GTSP to a clustered TSP. The clustered TSP is similar to the standard TSP except the nodes are grouped into predefined mutually exclusive and exhaustive node clusters. A feasible tour for the clustered TSP must visit all nodes of a cluster before visiting the nodes of any other cluster. Definition: Problem (P5) is a clustered TSP defined over nodes nA5 = A/4. A set of nodes in N'4, Si, corresponds to a cluster of nodes in Af5, Ci. For each nodeset (or cluster), we assume its member nodes have a given ordering. Let i1, i2,..., ir be the nodes of nodeset Si (or cluster Ci) with r = Sil, = iCI. The arcset A5 is defined as follows. Within each custer i with r = ICjI > 1, create intracluster arcs forming a single directed cycle according to its given ordering. Hence, create arcs (i', i2), (i2, i3), (i3, i4),..., (ir-1, it), (ir, il) in A5. Let each of these intracluster arcs have cost equal to zero, that is, cfi2 =... = c5,i = 0. For each interset arc (ij, kl) E A4, with j > 1, create an arc (i-1l, kl) E A5 with.5 4 4 identical cost, hence, c5ij-lkL = cikjL. For each interset arc (il, k') E A4 with r = ICi, create an arc (ir, kl) E A5 with cost CirkL = C4kl. Problem (P5) is a clustered TSP with a special structure. Since the only intracluster arcs are the zero cost arcs which form a directed cycle, it is easy to see that any feasible clustered TSP solution will traverse ICi - 1 of the intracluster arcs of Ci. See Figure 8. Lemma 6: Problem (P5) is equivalent to problem (P4). Proof: See Appendix D. Once the problem is in the form of (P5), it can be solved as a clustered TSP as in [41 or we can use an approach equivalent to that of Chisman [1] to transform a clustered TSP into a standard TSP by simply adding a large cost to all the intercluster arcs, as follows. See Figure 9. Definition: Problem (P6) is a standard TSP over nodes A6 = N/5 and arcs A6 = A5. The arc costs for (P6) are computed as follows. Let 10

6 c6i = c^j + p if i and j belong to different clusters c6j = ci5 if i and j belong to the same duster. where +oo >> c5j. (i,j)EA5 Our main result for transforming the canonical GTSP to a standard TSP easily follows. Theorem 7: Given a canonical GTSP in the form of (P4) with m nodesets, we can transform the problem into a standard TSP in the form of(P6). Given an optimal solution y* to (P6) with c6y* < (m + 1)P3, we can construct an optimal solution x* to (P4). Combining Theorems 4, 5 and 7, we can transform a GTSP into a TSP, solve, and transform the TSP solution to a solution for the original GTSP. 3.0 Summary and Conclusion The results in Section 2 show how to transform any problem modeled as a GTSP into a standard TSP. The number of digraph nodes is increased according to the number of multiple membership nodes in the original problem. Given a problem with m nodesets defined over an n-node digraph, the transformation can yield a TSP with at most mn nodes. Along with the increased number of nodes, there is a corresponding increase in the number of arcs. The effort required throughout the transformation is worst-case polynomially bounded by O(m2n4). But since the worst case bound is based on complete nodeset membership overlap, the transformation effort for real problems will likely be much less. One practical complication with these transformations is that arc costs can become very large. This is of no theoretical importance, but may cause stability problems in some solution techniques. The resulting TSP has an arc and arc cost structure that may cause difficulties for traditional assignment based methods for solving asymmetric TSP's. Specifically, the directed zero cost intracluster cycle will cause a simple subtour elimination algorithm to branch m levels before the first non-zero lower bound might be reached. This, along with the large arc cost values and the potential for degeneracy, suggests that approaches for the TSP based on integer linear programming [121 may be more appropriate for optimally solving the transformed problem. 11

An area for further research is to understand the extent to which the facet results for the TSP polytopes ([8], chapter 8) can be related to the canonical GTSP or even to the general form of the problem. This may allow us to define valid inequalities for complex routing problems which involve both sequencing and selection decisions. 12

APPENDIX A The proof of Lemma 1 follows the development of Lemmas 8, 9, and 10. Lemma 8: Given a feasible solution, y, to problem (P1), there exists a feasible solution to (PO), x, with cost c~x = cly. Proof: We know that for an arc (i', j') in y, there exists a path with cost cij, between i and j of A' which passes through only the nodesets to which i and j belong. Therefore, given y, we can construct a feasible solution x as a collection of paths corresponding to the arcs in y. n Lemma 9: Given a feasible solution, x, to (PO) expressed as a sequence of nodes from i to i as {i,...,,..., k,..., i}, there exists a corresponding subsequence, y = {j', k',...,j'}, such that for any two consecutive nodes, j' and k', M(j') 2 M(k'). Proof: The subsequence can be constructed as follows. Start checking x in sequence until a node j is found such that M(i) 2 M(j). We are guaranteed that such a node will be found since node i cannot belong to all nodesets. Designate j' to begin the subsequence y. From j, check in a similar manner until some node k (or i) is found such that M(j) 2 M(k). Again, we are guaranteed that such a node will be found since j cannot belong to all nodesets. Let k' follow j' in the subsequence. Continue checking x in sequence. Whenever a node is checked which belongs to at least one new nodeset, add its equivalent node to y. Continue until finally node i is checked. Since the nodeset memberships of A/~ and NA1 are the same, M(j) 2 M(k) implies M(j') 2 M(k'). I Lemma 10: Given a feasible solution, x, to (PO), there exists a feasible solution, y, to (P1) with cost c1y < c~x. Proof: We can write out the solution x as a sequence of nodes. From Lemma 9, we know there exists a subsequence y such that for any two consecutive nodes, i' and j', M(i') 2 M(j') and, by definition, there exists an arc (i',j') E Al with cost c3ij1. If nodes i and j are not consecutive within the sequence x, then the nodes between them 13

belong to, at most, the nodesets defined by M(i). The cost of the path from i to j over the sequence x will be greater than or equal to cl,,. * Proof of Lemma 1: From Lemma 8, we know we can construct x* which is feasible and cly* = c~x*. Suppose x* was not optimal and there was another tour x such that c~x < c~x*. From Lemma 10 we know there would exist a (P1) tour y such that cly < c0x. This would imply cy < cx < cx* = cy* and, hence, h, y* would not be optimal. By contradiction, x* is optimal for (PO). * 14

APPENDIX B The proof of Lemma 3 follows the development of Lemmas 11 and 12. Lemma 11: Given a feasible solution to (P2), x, there exists a feasible solution to (P3), y, with c2x > c3y. Proof: We can write out the solution x as a sequence of nodes from node i to itself as {i, j,..., k, l,...,p,..., q,..., i}. Construct a sequence y as follows. Choose some P E M(i), and start y at iP, i.e. y = {iP,...}. Continue building the sequence y by adding nodes iQ, for all Q E M(i), Q $ P, i.e. {iP, iQ,..., iR,...}, Q E M(i), R E M(i). By definition M(i) X M(j) so there exists some nodesets to which j belongs but i does not. Continue the sequence by adding jT, for all nodesets T such that T E {[M(j)U M(i)] -M(i)}, i.e. {iP, iQ,..., iR, jT,..,jU,...}, T,..., U E {[M(j) U M(i)] - M(i)}. Continue building the sequence, in general, by adding nodes Iv for all V such that V E {[M(l) U M(i) U M(j)... U M(k)] - M(i) - M(j)... - M(k)}. The construction will end at some node pw such that W E M(p), and W is the last nodeset not already visited, i.e., {iP, iQ,..., iR, T,... j...,,..., pW, iP}. The sequence cannot include more than m nodes since each node belongs to only one set and the construction prevents visiting nodesets already visited. All arcs of the sequence y of the form (iP, iQ), exist and have a cost of zero. The arcs between nodes created from different nodes, such as (iR, jT), exist and have cost CRjT = cj. Hence, the arc existence and arc costs are the same as that for all arcs of x until we reach the string of nodes {p,..., i}. If p immediately precedes i, the equivalence holds since c3wip = c2i. If p does not immediately precede i, then a node q between p and i can, at most, belong to the union of the nodesets of p and i, i.e., M(q) C {M(p) U M(i)}. If q belonged to a nodeset to which neither p nor i belonged, then the nodeset would be separately visited twice in the sequence x and would contradict our assumption that x is feasible. Since all nodes between p and i belong to the nodesets of p and i, there will exist an arc (pW, iP) E A3 with cost CWpP equal to the shortest path over nodes which will include those of the sequence between p and i. The shortest 15

path cost will be less than or equal to the path taken between p and i of the sequence x. Hence, c2x > c3y. I Lemma 12 For any feasible solution, y, to (P3) with c3y < (m + l)ca, there exists a feasible solution, x, to (P2) with c2x = c3y. Proof: The solution y enters each nodeset exactly once and consists of arcs with cost zero or some amount greater than or equal to a. Since c3y is less than (m + l)ct, any nodes of the form iP and iQ, with P, Q E M(i), must be adjacent in the tour y, otherwise an extra cost of ca would be incurred and cause c3y to be greater than or equal to (m + l)ca. We can construct a solution x by including arcs corresponding only to those arcs of y having the costs greater than zero (exactly the reverse of the construction in the proof of Lemma 11). ~ Proof of Lemma 3: From Lemma 12, we know that we can construct x* which is feasible and c3y* = c2x*. Suppose x* was not optimal and there was another tour x such that c2x < c2x*. From Lemma 11, we know there would exist a (P3) tour y such that c3y < c2x. This would imply c3y < c2x < c2x = c3y* and y* would not be optimal. By contradiction, x* is optimal for (P2). * 16

APPENDIX C The proof of Theorem 5 follows the development of Lemmas 13, 14, 15, and 16. Lemma 13: Given a feasible solution, x, to (P3) expressed as a finite sequence of nodes {i,..., j,..., k, k..., i}, there exists a corresponding subsequence y = {i,...j',., i' in (P4) which visits exactly one node in each nodeset. Proof: We can construct the subsequence with the following rule. For every nodeset, choose one member node from x and include it in y. m Lemma 14: Given two consecutive nodes of the subsequence y, j' E Sj and k' E SK, if the same nodes j and k are not consecutive in the sequence x, then any node between j and k must belong to either SJ or SK. Proof: Suppose the path between j and k includes some node I E SL. Since I' was not between j' and k' in the subsequence, then SL is visited in some other part of the sequence x. This would imply that SL is visited twice in the sequence x. By contradiction, I must belong to either SJ or SK. * Lemma 15: Given a feasible solution to (P3), x, we can construct a feasible solution to (P4), y, such that c4y < c3x. Proof: From Lemma 13, we know there exists a subsequence y which visits exactly one node in each nodeset. Given two consecutive nodes of y, j' and k', Lemma 14 establishes that there exists a feasible path between j and k which passes through only the nodesets of j and k. Since cpI., is equal to the shortest path through the nodes between j and k, then the total costs incurred on the sequence x from j to k must be greater than or equal to c4,k,. Hence, in total, c4y < c3x. m Lemma 16: Given a feasible solution to (P4), y, we can construct a feasible solution, x, to (P3) with c4y = c3x. 17

Proof: We can write out the feasible tour y as a finite sequence of nodes {i',..., j', k',..., i'). Since y is feasible, each arc (j', k') exists and has cost CI4,k. Given arc (j', k'), we know that there exists a feasible shortest path from j to k which may pass though only the nodesets of j and k, and the cost of this path is C4Ak. Simply construct x as the collection of these shortest paths defined by consecutive nodes of y. Since the cost of each arc of y is equal to the cost of the corresponding path in x, c4y = c3x. ~ Proof of Theorem 5: From Lemma 16, we can construct x* which is feasible and c4y* = C3X*. Suppose x* was not optimal and there was another tour x such that c3x < C3*. From Lemma 15 we know there would exist a (P4) tour y such that c4y < C3X. This would imply c4y < c3X < C3X* = c4y* and, hence, y* would not be optimal. By contradiction, x* is optimal for (P3). * 18

APPENDIX D The proof of Lemma 6 follows the development of Lemma 17. Lemma 17: A feasible solution to (P5) which enters cluster Ci through node i3 with j 1, must leave the cluster from node ij-1. A feasible solution to (P5) which enters cluster Ci through node i1, must leave the cluster from node ir where r = ICl. Proof: Omitted. Proof of Lemma 6: Given a feasible solution to (P4), x, expressed as a sequence of nodes {ij,k1,...,pq,, we can construct a solution y to (P5) as y = {i, i+l,... ij-, kl, kl+,...., kl-1,...p, pq+1,....,pq-1, ij}. The solution y is feasible for (P5) by definition of a clustered TSP and by construction of A5. Each intercluster arc of y has a cost equal to a corresponding interset arc of x. Hence, c5y = c4x. A feasible solution to (P5), y, must pass through the clusters as specified in Lemma 17. This means if intercluster arc (ij, k') is used in y, y must include an intercluster arc into node ij+1 and an intercluster arc out of node k'-1. We can construct a feasible solution to (P4), x, as follows. For every intercluster arc (ij, k1) in y, include arc (ij+l, kl) in x. By definition, c4+1k = cjk, so c4 = c5y. Cij+lkl'- CiJkl, S $0 X:'- y. 19

REFERENCES [1] J. Chisman, "The Clustered Traveling Salesman Problem," Computers and Operations Research 2 (1975) 115-119. [2] J. Current and D. Schilling, "The Covering Salesman Problem," Transportation Science 23(3) (1989) 208-213. [3] A.L. Henry-Labordere, "The Record Balancing Problem: A Dynamic Programming Solution of a Generalized Travelling Salesman Problem," RIRO B-2 (1969) 43-49. [4] K. Jongens and T. Volgenant,'The Symmetric Clustered Traveling Salesman Problem," European Journal of Operational Research 19 (1985) 68-75. [5] G. Laporte and Y. Nobert, "Generalized Travelling Salesman Problem Through n Sets of Nodes: An Integer Programming Approach," INFOR 21(1) (1983) 60-75. [6] G. Laporte, H. Mercure and Y. Nobert, "Finding the Shortest Hamiltonian Circuit Through n Clusters: A Lagrangean Approach," Congressus Numerantium 48 (1985) 277-290. [7] G. Laporte, H. Mercure and Y. Nobert, "Generalized Travelling Salesman Problem Through n Sets of Nodes: The Asymmetrical Case," Discrete Applied Mathematics 18 (1987) 185197. [8] E.L. Lawler, J.K. Lenstra, A.H. Rinnooy Kan and D.B. Shmoys, The Traveling Salesman Problem (Wiley, New York, 1985). [9] C.E. Noon,'The Generalized Traveling Salesman Problem," unpublished dissertation, Department of Industrial and Operations Engineering, The University of Michigan (Ann Arbor, 1988). [10] C.E. Noon and J.C. Bean, "A Lagrangian Based Approach for the Asymmetric Generalized Traveling Salesman Problem," College of Business Administration working paper No. 855, University of Tennessee (Knoxville, 1988). 20

[11] C.E. Noon and L.A. Kaplan, "A TSP Transformation for the Multi-Depot MTSP," working paper, Department of Management, University of Tennessee (Knoxville, 1989). [12] M.W. Padberg and G. Rinaldi, "Optimization of a 532-City Symmetric Traveling Salesman Problem by Branch and Cut," Operations Research Letters 6 (1987) 1-8. [13] R.G. Parker and R.L. Rardin, "The Traveling Salesman Problem: An Update of Research," Naval Research Logistics Quarterly 30 (1983) 69-96. [14] J.P. Saksena, "Mathematical Model of Scheduling Clients Through Welfare Agencies," CORS Journal 8 (1970) 185-200. [15] S. Srivastava, S. Kumar, R.C. Garg and P. Sen, "Generalized Traveling Salesman Problem Through n Sets of Nodes," CORS Journal 7 (1969) 97-101. 21

1 (a) Feasible GTSP tour which passes through a node twice. 1 (b) Feasible GTSP tour which passes through an arc twice. Figure 1: Examples of feasible GTSP tours. 22

A feasible solution is given by the sequence 1-3-5-6-1 with cost of 26 Figure 2: Example problem (PO). 23

Each arc must enter at least one new nodeset. Figure 3: Example problem (P1). 24

With a=1 00, costs are adjusted according to the number of new nodesets an arc enters. Figure 4: Example problem (P2). 25

111 For a node with multiple nodeset membership, we create duplicate nodes interconnected with zero cost arcs (dashed). Figure 5: Example problem (P3). 26

In form of Problem (P1), transformed to Problem (P3). In form of Problem (P2), transformed to Problem (P3). 27 Figure 6: Example showing necessity of (P2) transformation.

111 Intraset arcs are eliminated by restricted shortest path computations. Figure 7: Example problem (P4). 28

I I I I I I I I I I With =I000,thc e clustered TS ctrafoed to a standardTSP. A feasible solution is given by the sequence IA-2A.3B.04B.5B5C-6C 4c I with total cost - m(m+o) + 26.- 3326. Figure 9: Example problem (P6). 30

UNIVERSITY OF MICHIGAN I III 11115047327005 3 9015 04732 7005