A DECENTRALIZED APPROACH TO SYSTEM OPTIMAL ROUTINGS IN DYNAMIC TRAFFIC NETWORKS ALFRED GARCIA Department of Industrial and Operations engineering University of Michigan, Ann Arbor, MI 48109, USA DANIEL REAUME Department of Industrial and Operations engineering University of Michigan, Ann Arbor, MI 48109, USA ROBERT L. SMITH Department of Industrial and Operations engineering University of Michigan, Ann Arbor, MI 48109, USA Technical Report 97-15 November 24, 1997

A Decentralized Approach to System Optimal Routings in Dynamic Traffic Networks* Alfredo Garcia, Daniel Reaume and Robert L. Smith Department of Industrial and Operations Engineering University of Michigan, Ann Arbor, MI 48109 November 23, 1997 Abstract We introduce a novel procedure to compute system optimal routings in a dynamic traffic network. Fictitious play is utilized within a game of identical interests wherein vehicles are treated as players with the common payoff of "average trip time experienced" in the network. Encouraging results from a large scale computational test on a real network are presented. 1 Introduction \We consider the problem of finding a system-optimal routing vector in a dynamic traffic twork n other words, we seek a set of routes that minimizes the average trip time experienced by the vehicles in the network A closely related problem is that of finding (i u.cer-optimal routing vector. that is, one such that no single driver may reduce her/his (ext)ected travel time, by unilaterally deviating from their assigned route. This latter problem can also be reformulated to be the determination of a Nash equilibriumn solution for the dynamic traffic network game. In such a game, players are identified with vehicles and payoffs are computed through an assignment mapping that, given routing decisions for all players, calculates the resulting dvnamic travel times. A routing mapping assigns to each vehicle a time-dependent shortest path from origin to destination given the route choices of all other vehicles. In other words, the routing mapping is a best reply to the rolltilng decisions of the other vehicles. FzctIztous play is an iterative procedure in which at each step players compute best replies Lssluningf that opponents decisions are distributed according to the historical frequency of t lhir previous decisions (Brown [1951]). In the dynamic traffic network context, this pro(ellure can )( interpreted as an iterative routing-assignment algorithm, in which at each * his work was supported in part by the Intelligent Transportation Systems Research Center of Excellence alt Iliversitv of Mlichigan. 1

step, for each player, the routing mapping computes time-dependent shortest paths given that other players decisions are distributed according to the historical frequency of routing decisions. This algorithm has been implemented and widely tested (see Kaufman, Smith and Wunderlich[1992]). Experimental results for this procedure have been favorable but difficult to justify since a proof of convergence has yet to be developed. Recent theoretical advances in game theory (see Monderer and Shapley [1996]), yielded conditions that may be applied to the dynamic vehicle routing problem. Monderer and Shapley demonstrate that when players (in this case, vehicles) share a common objective function, such as minimizing the average trip times of all vehicles in the network, then fictitious play converges in to an equilibrium in beliefs, i.e historical frequencies of routing decisions. Although, this convergence result requires interpretation in the dynamic traffic network context, it motivates a potentially attractive algorithm. Since an equilibrium solution to an artificial dynamic traffic game in which players share the above mentioned common objective is probably a good system optimal routing. 2 Preliminaries. 2.1 Notation. We introduce the dynamic traffic network game where: * A' = {1.2...., n} is the index set of vehicles. * Every vehicle has a finite number of routing choices, that is to say: for every i E N there is a set Y = {rl2... r,,,} of possible routes to take. Let us denote by y = n IZ E: A *.: - R' is the asszqnment mapping. For any y E Y, i.e a routing policy vector,.1.(y! is the sum over the path determined by y, of the resulting dynamic travel times (er link. In other words..,(y) is the expected total travel time for vehicle i, given th}at the other vehicles take routes y,. j $ i. We will denote by Ai the set of mixed rolltilng decisions i.e: A = {f:;-. )11 such that y fi(y,)= 1} Yi E )Y ThI1 extreme points of A' are just the elements of }. ILet A = n A'. W\e extend the (donainl of the assignment mapping so that for f E A \\'e }11've:'.l(f)= E.-,(!/) f'l(l). f2(2).. f'n(y) *~ l.(i A- = AJ. the cartesian product of the sets of mixed routing decisions for all Vclh lesf other thanI i. 2

Ri: A - Ai is the routing mapping or "best response" for vehicle i to a mixed routing decision followed by the other vehicles. Given f E A, Ri(f) is the set of routes for vehicle i that yield the least expected travel time assuming the other vehicles j $ i follow the routing decisions as prescribed by f. Formally: Ri(f) = arg min [Z Ai(y). f'(yi).fi(yi).... fn(y,)] f, EAt yEY 3 Nash Equilibrium. We say that a mixed routing vector f* is user-optimal if for every vehicle i E N the probabilities assigned to routes for vehicle i by f* yield the minimum expected total travel time, provided that f*i, the mixed routing choice of all other vehicles is held fixed, i.e: f* E arg min Ai(fi, fi) Now, recalling our definition of R,(.). the routing mapping, we note the previous equation can also be expressed as: f: E R(f*) Finally, in vector notation the last equation can be rewritten as: f = R(f*) which is the very familiar definition of a fixed point. It is worth pointing out that we only differ from Rosenthal's [1973] formulation in that to take into account time varying link t ravel times. we do not make any analytical assumptions on the assignment mapping. Moreover, as opposed in the static setting (see Haurie and Marcotte[1985]) in this dymlllllic settilng. there is not a straightforward relationship between Wardrops' user optimality.111( tlhe Nash equilibrium concept. 3.1 Existence of Nash Equilibrium in Mixed Decisions. I\e(all tliat a mixed routing decision f E A is a "lottery" combination of routing decisions ill t lie set )} A str;aightforward conclusion from the classical Nash equilibrium existence theorem is: Theorem 1:A dynamic traffic network ihas a Nash equilibrium in mixed routing deciProof: It simply follows Nash![1950>. i(cen the finiteness of the set of routes for each!i, ivr 1lld( t lh way payoffs are defined for mixe. d strategies. * 4 Fictitious Play Convergence. \\i llnox riefl!v review Mlonderer and Shaple 's result 3

4.1 Notation. Let us denote by K C A the equilibrium set for the artificial traffic game above presented and 11.11 any fixed euclidean norm on A. For 6 > 0 let Ba(K) the open ball with radii 6, i.e: Ba(K) = {g E A: min ig - fli < 6} fEK A path is a sequence y = {y(t) }i 1 of elements of Y A belief path is a sequence f = {f(t))}1 in A Definition 1: We say that a belief path converges to equilibrium (or converges in beliefs) if each limit point is an equilibrium point. Formally, for every 6 > 0 there exists T such that f (t) B6(K) for all t > T To every path y we can associate a belief path fy by simply computing the historical frequency of the various routing decisions, for given r E Yi: #{1 < s < t: yi(s) = r} f).t (r) - #{1^5<^t Note that if we define I4yt(r) to be the indicator function of the route r in the path then: yt+ (r) = f.(r) + (I, - (r)) t+l \vi(TF et:~ 1 ifyi(t)=r Iy(r) = 0 otherwise Definition 2: A path y is a fictitious play process if for every i E N and every t: y, (t + 1) E arg min[Ai(y, y;,)] In1 words. at each t the route prescribed for player i.is the pure best response to the mixed st rat e! for all other vehicles consisting of the historical frequency of routes they have chosen. Definition 3:A Game has the Fzctztiouu. Play Property if every fictitious play process (ol (Trr/c, zn beliefs. \\e now state the important Monderer and Shapley's result. Theorem 2: If all players have the same payoff function, then the game has the Ficti\t j1ill. II 'rop)erty. Proof:Se M.londerer and Shapley [1996]. e \\( io\ introduce the artificial dnramic traffic game that will, by construction, posess thle fictit iols pla property. 4

5 The Artificial Dynamic Traffic Game. In order to apply Monderer and Shapley's convergence result, we redefine the traffic game by artificially imposing the same payoff function to every player in the game. Specifically, we will use in this artificial traffic game (which we shall refer to as ATDG) the "average trip time" as the commmon payoff function for all vehicles, U: A - R: U(f) = A(f) iEN Let us now examine the meaning of an equilibrium with respect to this "artificial" game. f* E arg min U(fi, f-i) f' EAg Intuitively, given that all other vehicles j: i follow f,*, vehicle i can not reduce any further the "average trip time" experienced by the vehicles in the network by deviating from the prescribed routing f*. In other words. for the optimization problem: (P) minfA U(f) the mixed routing f* is some sort of a "local" optimal solution. However, there may be optimal solutions to problem (P) that are not neccesarily equilibrium routings for the ATDG gIame. In other words; Optimal Solution Set(P) C Equilibrium Set(ADTG) 5.1 An Algorithm. \\( foriimally present the algorithm motivated by Ponderer and Shapley's result. It presents, il(eve ar. major difficulty; Theorem 2 only asserts that for a converging sequence of mizxed /I(If( (/l(:< gIenerated by fictitious plaN the limit is a Nash equilibrium of the original game. It I1 (1.1sense(. a "limsup " set convergence result. and for computational purposes we need a "lill if" type of result. However. it is worth pointing out that whenever the equilibrium set of or artificial game is a singleton. the algorithm is guaranteed to converge. In any other '.t>(. tile algorithlm will compute routings that will be arbitrarily close to the equilibrium sel. t hen continuity of the asszgnment mapping. a reasonable assumption, will ensure a good Algorithm! P'k aIl iniitial "pure" routilng vector f,. ( omlpl te })est reply:?y,( + 1) E arg min[U((y, ft-)] y, EY I' ' )(lll Iii storical frequencies of route choices. ft 1 I. f,,I - ftl < ~ then Stop. othlerwise go to 2. 5

5.2 Implementation. We have implemented the above algorithm in a software package called Alliance. To ensure tractability, we made several simplifications to the original algorithm. First, it is extremely difficult to, for each vehicle i, analytically compute a best response to the historical frequencies of the routings of the other vehicles. Instead we simulate the passage of these other vehicles through the network where each vehicle chooses its route with probability in accordance with the historical frequencies of its routings. Using the time dependent link travel time profile produced by this simulation, vehicle i may then be assigned to a path minimizing the increase to total system travel time. To further simplify matters. since a single vehicle will have little effect on congestion, we also simulate the the travel of vehicle i in this simulation, thus avoiding the need to run a separate simulation for each vehicle. It is hoped that for a heavily congested network with many vehicles leaving at the same time with the same origin and destination this approach will allow to adequately approximate true best responses. The second simplification we perform is to discretize time into a sequence of slices. Within each slice, vehicles are routed by the simulation according to the routing tables. These tables assign, for each slice s and node n. the probability distribution with which a vehicle arriving at node n with destination d at time s. will choose its next link. Note that this simplification allows vehicles to follow routes they may have historically never taken, contradicting the algorithm's scheme. Here again. we appeal to the large number of vehicles flowing through the net"work to justify this simplification since the effective congestion should remain relatively unchanged while greatly reducing storage and computation requirements. 5.3 Computational Tests. 1,, validate 4lliance we applied the algorithm to the Troy, Michigan traffic network. Appr).xilnately 16500 vehicles were allowed to flow into the network in 24 minutes according It) t ravell patterns approximating those actually observed in Troy. After 24 minutes the flow 1 (o th Ile net work was halted and thie vehicles were allowed to to travel for further 36 minutes, tliis1 allowingi the network to clear. Fo account for the impact of different market penetration levels of ITS technologies we lefinled thiree types of vehicles classes: Class 1. consisted of those vehicles following the free fow\ shiortest patlh. Class 2. consisted of those vehicles that perform a periodic update of the fre(e flow shortest paths. and finally. Cla.ss.Y velhicles were guided by the Alliance algorithm. I'Th1 initial routing given to all classes corresponded to shortest path under free flow. 1I1 the first test. with high market penetration (i.e. Class 3 vehicles account for 25% of tlh( total number of vehicles) we observe that Alliance computes routings which are as good ai> th i ose computed with S4 VaA'T in terms of system average trip time, in considerable less ii rl)(T (f it ('rations and c.p.u time. Test 1 6

C1(50%) C2(25%) C3(25%) # Iterations Alliance 8.85988 8.85677 8.71779 14 SAVaNT 8.87245 8.84866 8.68266 34 In the second text, with low market penetration (i.e. Class 3 vehicles account for 5% of the total number of vehicles) we can observe reductions in travel time for intelligent vehicles, here again at a substantially lower computational effort when compared to SAVaNT. Test 2 C1(95%) C3(5%) Average # Iterations Alliance 17.30430 15.59930 17.21905 20 SAVaNT 17.49250 15.49160 17.39240 68 6 Conclusions. Through the application of recent results in the theory of learning in games and the extension of Rosenthal's[1973] framework to the formulation of the dynamic congestion game, we have implemented a decentralized iterative procedure to compute system optimal routings. By focusing on discrete routing decisions and with the help of a dynamic travel time simulator we avoided the technicalities of a more thorough analytical development. First large scale results are encouraging, mostly by the substantial reductions in computational requirements. 7

References [1] Brown G.W, "Iterative solution of games by fictitious play" Activity Analysis of production and Allocation John Wiley (1955) [2] Haurie A. and Marcotte P. "On the relationship between Nash-Cournot and Wardrop Equilibria" Networks 15: (1985) pp. 295-308. [3] Kaufman D., Smith R.L and Wunderlich K. "Dynamic User-Equilibrium properties of Fixed-Points in Iterative Routing-Assignment Methods" IVHS Technical Report 92-12 (1992) University of Michigan (submitted to Transportation Research Part C) [4] Monderer D. and Shapley L. "Fictitious Play Property for games with Identical Interests" Journal of Economic Theory, 68, (1996) pp 258-265 [5] Nash John "Equilibrium points in n-person games" Proceedings of the National Academy of Sciences 36: (1950) pp. 48-49. [6] Rosenthal R.W. "The network equilibrium problem in integers" Networks 3 (1973) pp 53-59 8