Home | Archive | About | Other

Shortest Path problem

Find shortest path and distance in a directed/undirected graph {vertex, edge}

Dijkstra’s Algorithm

Compute shortest distance between a starting vertex and all other vertexes in a graph. You can also track shortest path simultaneously.

The idea is:

How does this make sure distance to each vertex is minimum? The key is to explore vertices in order of distance. (this is implemented by heap)

Time Complexity: $O(mlogn)$, m = # of edges, n = # of vertexes

Pros:

  1. can deal with directed graph, undirected graph, non-negatives paths

Cons:

  1. can’t be applied to negative edge lengths (can be improved?. See below code implementation)
  2. not very distributed (relevant to Internet routing)

Code Implementation

class Graph:
	def __init__(self, adjacencyList=None):
		if not adjacencyList:
			self.G = defaultdict(list)
			self.G_rev = defaultdict(list)
		else:
			self.G = adjacencyList

	def addEdge(self, u, v):
		self.G[u].append(v)
		self.G_rev[v].append(u)
		return

	def addEdgeLen(self, u, v, l):
		self.G[u].append((v, l)) 
		return

	def minDist(self, start):
		maxdist = float('inf')
		dists = {v: maxdist for v in self.G.keys()}
		seen = set()
		pq = [(0, start)]
		while pq:
			cur_dist, cur_v = heapq.heappop(pq)
			if cur_v in seen:
				continue
			dists[cur_v] = cur_dist
			seen.add(cur_v)
			for neighbor, weight in self.G[cur_v]:
				if neighbor in seen:
					continue
				dist = cur_dist + weight
				heapq.heappush(pq, (dist, neighbor))
		return dists

Bellman Ford Algorithm

Either compute a shortest cycle-free path $s-v$ or output a negative cycle, for all paths starting from $s$. This is a Dynamic Programming solution.

Identify optimal sub-structures & base cases

Since subpath of a shortest path is by itself a shortest path, for a shortest path s-v, it can be broken into shortest of all shortest paths w-v (w is any adjacent vertex of v’s) plus its last edge cost C(w,v). Is this enough? When the structure is broken into sub-structures, we notice number of edges in the subpath decrease by 1. And this information needs to be reflected in the program. Therefore, we assign a budget (number of edges needed at maximum) for any structure of shortest path s-v. Easily known, this budget is no bigger than vertex # - 1. And we have our base case $A[x, s] = 0$, which means shortest distance s-s with any number of edges is 0. Remaining elements in $A$ will be initialized to $+\infty$. Also notice with budget introduced, there’s also a case when $A[v, i] = A[v,i-1]$ when $i$ number of edges is enough for optimal path $s-v$

Pseudo code & analysis

Why only one more iteration is sufficient to capture one negative cycle in shortest path?

Why it is not the intuitive answer $O(n^2)$? total work is $O(n\sum_{v\in{V}}{\text{in-degree}(v)})$ = $O(nm)$

code

def BellmanFord(G, n):
	"""
	Bellman Ford algorithm to compute shortest paths of all pairs, given Graph and number of vertexes
	time complexity: O(n^2m), m is # of edge, n is # of vertex. (slow. 23200s on (V, E)~(1k, 50k))
	space compelxity: O(n)

	assignment is to check either existence of negative cycle or smallest shortest distances of all pairs
	note: input graph is 1-indexed, but the algorithm is 0-indexed
	"""
	global_d_min = float("inf")

	for s in range(n): # tail vertex
		# start of one BellmanFord.
		L = [[float("inf") if i!=s else 0 for i in range(n)] for j in range(2)] # L[head][max edge #] = shortest distance
		# B = [[None for i in range(n)] for j in range(2)] # B[u][i] = shortest-path (s,v)'s 2nd-last vertex with edge # <= i for path construction 
		for i in range(1, n+1): # budeget of edge #. 0 for base case; n for negative cycle case.
			curi, prei = i%2, 1-i%2 # flip index 
			for v in range(n): # head vertex
				L[curi][v] = L[prei][v]
				# B[curi][v] = B[prei][v]
				for w in G.G_rev[v+1]:
					if L[prei][w-1] + G.C[(w, v+1)] < L[curi][v]:
						L[curi][v] = L[prei][w-1] + G.C[(w, v+1)]
						# B[curi][v] = w-1
				if i==n:
					if L[curi][v] < L[prei][v]: return 0, None # check negative cycle
					if s != v and L[curi][v] < global_d_min: global_d_min = L[curi][v] # assignment: track global min of all shortest distances
		# end of one BellmanFord.
		print("progress: {0:.2f}%".format((s + 1) * 100 / n), end="\r")
	return 1, global_d_min

Floyd-Warshall Algorithm

Compute all pairs of shortest paths (APSP) and report relevant negative cycles if any. This is a [[Dynamic Programming]] solution.

Identify optimal sub-structures

we know a subpath of a shortest path is in itself a shortest path. This time we don’t break down an optimal path into a sub-optimal path and last edge, but two shortest paths (with a shared intermediary internal vertex of course). Notice also during this breakdown, for internal vertices of a path, the shared vertex is out. If we pre-index the vertexes 1, 2, …, n, then during the optimal sub-structure breakdown, we can search for $f(K)$ which denotes vertexes 1 to K used for an optimal structure. If vertex $K$ is an internal vertex for the optimal path A[s, v, K] (denoting optimal distance between s and v, with 1,2,…,K vertex as internal vertices), then its two optimal sub-paths will be exactly $A[s,k,k-1]$ and $A[k,v,k-1]$.

Base cases & initialization:

Also notice with budget introduced, there’s also a case when $A[s,v,k]=A[s,v,k-1]$ when $1,2,…k-1$ internal vertexes are enough for optimal path $s-v$

pseudo code & analysis

preindex vertices as 1,2,...,n
let A[s,v,K] denote shortest distance s-v, with internal vertices 1,2,...K. 
let B[s,v] denote max index in its internal vertices.

for k=1 to n:
	for i=1 to n:
		for j=1 to n:
			A[i,j,k] = min{A[i,j,k-1], A[i,k,k-1] + A[k,j,k-1]}
			B[i,j] = k if A[i,j,k] gets updated // it means k is an internal vertex to path i-j
code
def FloydWarshall(G, n):
	"""
	Floyd Warshall algorithm to compute shortest paths of all pairs, given Graph and number of vertexes
	time complexity: O(n^3), m is # of edge, n is # of vertex (okay. 600s on (V, E)~(1k, 50k))
	space compelxity: O(n^2)

	assignment is to check either existence of negative cycle or smallest shortest distances of all pairs
	note: input graph is 1-indexed, but the algorithm is 0-indexed
	"""
	global_d_min = float("inf")
	A = [[[G.C[(u+1,v+1)] if ((u+1,v+1) in G.C and k==0) else 0 if (u==v and k==0) else float("inf") for v in range(n)] for u in range(n)] for k in range(2)] # A[internal max][tail][head] = shortest distance
	# B[u][v] = max internal vertex for path construction 

	for k in range(n): # internal max vertex
		curi, prei = 1-k%2, k%2 # flip predecessor
		for u in range(n): # tail
			for v in range(n): # head
					A[curi][u][v] = min(A[prei][u][v], A[prei][u][k] + A[prei][k][v])
					if k == n-1 and u != v and A[curi][u][v] < global_d_min: global_d_min = A[curi][u][v] # assgiment: track global min of shortest distances of all pairs
					if u==v and A[curi][u][v] < 0: return 0, None # check negative cycle
		print("progress: {0:.2f}%".format(k * 100 / n), end="\r")
	return 1, global_d_min

Johnson’s Algorithm

Compute all pairs of shortest paths (APSP). This is a [[Dynamic Programming]] solution.

Motivation

running time wise, n*Dijkstra is better than n*Bellman Ford, but Dijkstra can not deal with negative edges. And only in dense graph ($n=O(m^2)$), Floyd-Warshall is better than n*Dijkstra. So how to apply fast Dijkstra with negative edges in graph?

Core idea

Reweight edge costs in a way that makes:

but how?

yet how to assign such weights?

why new edges are guaranteed to be non-negative?

Pseudo code & analysis

  1. add new vertex $s$, outgoing arc to each original vertex with distance 0.
  2. run 1 Bellman Ford on $s$.
  3. assign weights to each vertex $v$ = shortest distance $D(s,v)$. edit new edge cost = old edge cost + tail weight - head weight
  4. run N Dijkstra for the graph on new edge costs. then you get all pairs of shortest paths. reverse edit shortest path distance by actual distance = new distance + head weight - tail weight

code

def Johnson(G, n):
	"""
	Johnson's algorithm to compute shortest paths of all pairs, given Graph and number of vertexes
	time complexity: O(mnlogn). m is # of edge, n is # of vertex. (fastest. 173s on (V, E)~(1k, 50k))
	space complexity: O(m)

	assignment is to check either existence of negative cycle or smallest shortest distances of all pairs
	note: input graph is 1-indexed same as the array in the algorithm with the new vertex indexed at 0
	"""
	heapq = MinHeap()
	global_d_min = float("inf")

	# do 1 BellmanFord on newly added vertex "0" and reweight all edge costs
	for v in range(1, n+1):
		G.addEdge(0, v, 0)

	L = [[float("inf") if i!=0 else 0 for i in range(n+1)] for j in range(2)] # vertex weights. L[head][edge #]
	for i in range(1, n+1): # edge #. 0 for base case; n for negative cycle case.
		curi, prei = i%2, 1-i%2
		for v in range(n+1): # head
			L[curi][v] = L[prei][v]
			for w in G.G_rev[v]:
				if L[prei][w] + G.C[(w, v)] < L[curi][v]: L[curi][v] = L[prei][w] + G.C[(w, v)]
			if i==n and L[curi][v] < L[prei][v]: return 0, None # check negative cycle
	
	for (u, v) in G.C:
		G.C[(u, v)] += L[curi][u] - L[curi][v] # new cost = old cost + tail weight - head weight

	# do n Dikistra on original vertices
	for s in range(1, n+1): # tail
		# start of 1 Dikistra
	    dists = {v: float('inf') if v != s else 0 for v in range(1, n+1)}
	    seen = set()
	    pq = [(0, s)]
	    while pq:
	        cur_dist, cur_v = heapq.pop(pq) # heap pops min edge
	        if cur_v in seen: continue # ignore explored vertex
	        dists[cur_v] = cur_dist  # update min distance for this vertex
	        if s != cur_v and cur_dist + L[curi][cur_v] - L[curi][s] < global_d_min: # assignment: track global min of shortest distances of all pairs
	        	global_d_min = cur_dist + L[curi][cur_v] - L[curi][s]
	        seen.add(cur_v)  # mark this vertex as explored
	        for neighbor in G.G[cur_v]:
	            if neighbor in seen: continue
	            dist = cur_dist + G.C[(cur_v, neighbor)]
	            heapq.add(pq, (dist, neighbor)) # push new (possible distance, head) to heap
	    # end of 1 Dikistra
	    print("progress: {0:.2f}%".format((s+1) * 100 / n), end="\r")
	return 1, global_d_min

Summary

For sparse graph $m=O(n)$ -> Johnson is best For dense graph $m=O(n^2)$ -> Floyd-Warshall is best