Traveling Salesman Problem

Traveling Salesman Problem

traveling salesman problem

The Traveling Salesman Problem consists in solving the following:

Given a complete graph, with n nodes, and the cost of traveling between each and every pair of nodes, what is the shortest possible route that visit each node only once and returns to the initial node?

The following elements come into play:

Formulation of the TSP

There exist several formulations for the TSP.

Basic TSP formulation

A basic formulation of the [[#Traveling Salesman Problem]] is the following:

basic traveling salesman problem formulation

minxij(i,j)Acijxij
subject to:
Degree constraints:{jA+(i)xij=1,iVjA(i)xij=1,iV
where:

  • xij{0,1},(i,j)A
    • xij=1 if link (i,j) is used in the solution, 0 otherwise

DEGREE CONSTRAINTS:
The degree constraints are needed to ensure that every node has only one link entering it and one exiting it.

More specifically:

This formulation is not sufficient to define the problem. In fact, a possible solution with this formulation could give something like this:

Schermata 2024-12-27 alle 17.25.03.png

In the solution shown above, we can see that every link is only visited once, and the cost was minimized. We can easily see thoug that many subtours were made in place of a single tour.

TSP formulation with subtours breaking constraints

In order to avoid subtours like the one originating from the [[#Basic TSP formulation]], a new constraint is added to the formulation:

traveling salesman problem formulation with subtours breaking constraints

minxij(i,j)Acijxij
subject to:
$
\begin{align}
\text{Degree constraints:} &\begin{cases}
\sum\limits_{j \in A^{+}(i)} x_{ij} = 1, \qquad \forall i \in V \
\sum\limits_{j \in A^{-}(i)} x_{ij} = 1, \qquad \forall i \in V \
x_{ij} \in {0,1}, \qquad \forall (i,j) \in V
\end{cases} \ \

\text{Subtour breaking constraint:} &\sum\limits_{i\in S, j \in S} x_{ij} \le |S| - 1, \qquad \forall S \ne 0, \quad S\subset V
\end{align}
$
where:

  • xij{0,1},(i,j)A
    • xij=1 if link (i,j) is used in the solution, 0 otherwise
  • S: is a subset of all the nodes (V)

Remember:

SUBTOUR BREAKING COSTRAINT:
The subset breaking constraint does the following: for any subset of nodes, the number of links connecting these nodes has to be lower or equal the number of nodes in the subset minus 1. With this constraint, if SV, than a tour connecting all nodes will necessarily have |V|1 links. If, for example, I was to take a subset, S, with only 3 nodes, and these 3 nodes where connected to each other in a cycle, the cycle would contain 3 links, but this would break the constraint. So, if we check this condition for every possible subset of V, we will be sure that no subtour is created.

Despite this constraint being theoretically acturate, and it would allow to come to the right solution to the problem, given a network with n nodes, the constraints needed would amount to
2n1
conditions. These, is way too many for any computer when n approaches values of any significance.

TSP formulation with weak subtour breaking constraints

Since, as we've seen in [[#TSP formulation with subtours breaking constraints]], the formulation of the breaking constrains is highly impractical when it comes to solving the problem with a computer, we hereby propose an alternative formulation.

traveling salesman problem formulation with weak subtours breaking constraints

minxij(i,j)Acijxij
subject to:
Degree constraints:{jA+(i)xij=1,iVjA(i)xij=1,iVxij{0,1},(i,j)VWeak Subtour breaking constraint:{ujuin(1xij)+1,ij,i1u1=11uin,iV
where:

  • xij=1 if link (i,j) is used in the solution, 0 otherwise

Remember:

Explanation of the WEAK SUBTOUR BREAKING CONSTRAINT
ui is a variable that allows to avoid subtours in the solution. Notice how uj must be greater than a certain quantity, specifically ujui+1 for links in the solution, except for when i=1. i=1 is the depot. This means that, the only occasion where uj can be lower than ui+1 is when we go back to the depot. It is therefore impossible for the problem to generate subtours. In fact, that would mean that a cycle is created but in order for that to happen, uj would be greater than ui+1 when i is not 1. Here is an attempt of a visual representation of how this constraint works:

Traveling Salesman Problem 2024-12-27 19.37.16.excalidraw.png
%%Edit in excalidraw%%

The figure above shoes the same network with 2 different solutions to the TSP. Keep in mind only the links in the solution are shown, the network is supposed to be complete.

On the left, the solution breaks the the weak subtour breaking constraints. In fact, u14 and u1=1 is suppsed to be a contraddiction: in fact a number cannot be greater than 4 and equal to 1 at the same time (the prime notation is just to show that it is obtained following the cycle). This is allowed since for i=1 the constraint is not valid. Notice though how the same happens in the green cycle at node 4. This time though this disrepancy is not allowed, therefore, this solution is not permitted.

On the write, a unique cycle where the only occasion where the constraint is broken is when it is actually allowed, at i=1.

Solving the TSP

Solving the [[Traveling Salesman Problem]] is no easy task. That is why the problem is most often approached through [[Heuristic]] methods.

These are methods that are not proven to be mathematically optimised (as a matter of fact, they are not), but rather use an intuitive approach to reach a solution that is, at the very least, good enough. They won't guarantee to approach the exact solution, but should be able to provide a solution that is not far from it.

We will see the following:

For any of the following method, a cost matrix is going to be needed. This is a matrix that, for each link, gives the cost of traveling on that link.

Nearest neighbour heuristic

The nearest neighbour heuristic is considered a [[greedy heuristic]].

nearest neighbour heuristic

To approach a solution this is how we proceed:

  1. Select a node iN in the network. Label this node as to remember that has already been visited;
  2. Look at all the links leaving node i and select the one with the lowest cost;
  3. Follow the selected link to j and label it too;
  4. Keep going like this until all nodes have been labeled.

We can think of labelling a node as to add it to a set T. Whenever we look for a new link, we will only check links the go to a node that is not yet in T.

When all nodes are in T, we stop as we will have obtained a [[Hamiltonian circuit]].

Nearest neighbour insertion heuristic

The nearest neighbour insertion heuristic is considered a [[greedy heuristic]].

nearest neighbour insertion heuristic
  1. Select any initial subtour. The nodes in the subtour will go into a set C
  2. For every ith node not in the subtour, create a set, d(i,C) containing all costs from and to that node i to every node in the subtour (C)
    • we will have something like: d(i,C)={ci1,ci2,...,cin} for each node iC
  3. Select the minimum cost of each set d(i,C): minjC(d(i,C))
  4. Select the minimum of all the costs found in step 3
  5. Add the selected link to the subtour and repeat the steps

Spanning tree heuristic

To apply the spanning tree heuristic, a complete graph is needed: Kn=(V,A) (where: V-set of links and A-set of nodes).

spanning tree heuristic
  1. Find a minimum spanning tree of K --> ST
  2. Double all arcs in ST, generating an auxiliary graph: (V,ST)
  3. Since each node has even degree, an Eulerian cycle exists
    1. Determine an Eulerian cycle: C
    2. Assign an orientation to C
    3. Select a starting node i
  4. Make a Hamiltonian cycle following the nodes in the order of the Eulerian cycle. whenever a node already visited should be selected, jump to the first available node in the Eulerian cycle
  5. Stop when the Hamiltonian cycle is complete

Finding a Minimum Spanning Tree - Prim's Algorithm

prim's algorithm
  1. Select an arbitrary node in the network
  2. Select the arc with the minimum cost among those that connect any node already in the ST to any node not yet in the spanning tree
  3. Repeat until the Spanning Tree contains all nodes in the network

Christofides heuristic

christofides heuristic
  1. Determine a Minimum Spanning Tree ST of Kn
  2. Find all nodes with odd degree and find a Minimum Perfect Matching, M
  3. Make a new graph, H, out of all the edges in ST and M
  4. Form an Eulerian cycle in H and orient it arbitrarily
  5. Make a Hamiltonian cycle following the nodes in the order of the Eulerian cycle. whenever a node already visited should be selected, jump to the first available node in the Eulerian cycle
    Stop when all nodes are part of the Hamiltonian cycle.

Improvement heuristics

In the heuristics described above, it is rare to find an optimal solution. They are, in general, of moderate quality. Therefore, we pose the question of how we can improve a Hamiltonian cycle found in one of the previous heuristic.

We will see the [[#Exchange Heuristic]]. This is quite intuitive to understand if we're looking at an Euclidean problems. In a Euclidean problem, the distance between 2 nodes is given simply by the Euclidean distance between the nodes (basically, the real distance in any geometric space).

In this case, whenever a Hamiltonian cycle crosses itself, there must exist a better path, without crossings. What we will do is to unfold the path where it crosses itself in order to have a lower cost.

In a general problem (not euclidean), we can't really think in terms of unfolding a crossing, but the concept stays the same. We'll try to exchange some links in order to bring the cost between 2 nodes down. This is what we do in the [[#Exchange Heuristic]]

Exchange Heuristic

Suppose we have a Hamiltonian cycle like the following:

Traveling Salesman Problem 2024-12-28 15.41.04.excalidraw.png
%%🖋 Edit in Excalidraw%%

We are interested in links: (ip,ip+1) and (i1,iq+1).

The heuristic goes as follow:

  1. Choose any vertex i
  2. Pick a pair of links such that they fall into the set Z:
Z:={(ip,ip+1),(iq,iq+1)|p+1q,pq,q+1p,1p,qn}

which means any pair of links such that they have at least one other link in between them.
3. For each pair in Z check if

cp,p+1+cq,q+1>cp,q+cp+1,q+1

then remove links (ip,ip+1) and (i1,iq+1) from the solution and add (ip,iq) and (ip+1,iq+1).
Repeat for every node in the network.

Traveling Salesman Problem 2024-12-28 15.54.03.excalidraw.png
%%🖋 Edit in Excalidraw%%

In the example we compare the cost of having the green links in place of the red ones. Since the cost is lower for the green pair, then we exchange them. Notice how we also need to switch the direction of the dashed links in order to keep the directed hamiltonian cycle.

asymmetric graphs

In case of asymmetric graphs, we also need to check and compare the costs of the links we do not intend to exchange but that we need to reverse. It would become:
cp,p+1+cq,q+1+cp+1,q>cp,q+cp+1,q+1+cq,p+1
where the orange costs are the total costs to go from node p+1 to node q (with all the intermediate links taken into account) and viceversa.