Find shortest path and distance in a directed/undirected graph {vertex, edge}
Compute the shortest distance between a starting vertex and all other vertices in a graph. The shortest paths themselves can be tracked simultaneously.
The idea is: maintain a set of explored vertices whose shortest distances are final; repeatedly pop the unexplored vertex with the smallest tentative distance, finalize it, and push its neighbors with updated tentative distances.
How does this make sure the distance to each vertex is minimum? The key is to explore vertices in order of increasing distance (this is what the min-heap implements): with non-negative edge weights, no later path can undercut an already-popped vertex.
Time Complexity: $O(m \log n)$, $m$ = # of edges, $n$ = # of vertices
Pros: fast; a single source runs in $O(m \log n)$ with a heap.
Cons: cannot handle negative edge weights (the greedy exploration order breaks).
```python
import heapq
from collections import defaultdict

class Graph:
    def __init__(self, adjacencyList=None):
        self.G = adjacencyList if adjacencyList else defaultdict(list)  # adjacency list
        self.G_rev = defaultdict(list)  # reversed adjacency list (used by Bellman-Ford below)
        self.C = {}                     # edge costs C[(u, v)] (used by Bellman-Ford / Floyd-Warshall / Johnson below)

    def addEdge(self, u, v, l=0):
        self.G[u].append(v)
        self.G_rev[v].append(u)
        self.C[(u, v)] = l

    def addEdgeLen(self, u, v, l):
        # (neighbor, length) pairs, the representation minDist expects
        self.G[u].append((v, l))

    def minDist(self, start):
        # Dijkstra: shortest distances from start to all reachable vertices (non-negative weights)
        dists = {v: float('inf') for v in self.G}
        seen = set()
        pq = [(0, start)]  # min-heap of (tentative distance, vertex)
        while pq:
            cur_dist, cur_v = heapq.heappop(pq)
            if cur_v in seen:
                continue
            dists[cur_v] = cur_dist  # first pop of a vertex finalizes its distance
            seen.add(cur_v)
            for neighbor, weight in self.G[cur_v]:
                if neighbor in seen:
                    continue
                heapq.heappush(pq, (cur_dist + weight, neighbor))
        return dists
```
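A minimal usage sketch (hypothetical graph, not from the notes; edges go in via `addEdgeLen`, since `minDist` expects `(neighbor, length)` pairs):

```python
# Hypothetical 4-vertex example.
g = Graph()
g.addEdgeLen(1, 2, 1)
g.addEdgeLen(1, 3, 4)
g.addEdgeLen(2, 3, 2)
g.addEdgeLen(3, 4, 1)
print(g.minDist(1))  # {1: 0, 2: 1, 3: 3, 4: 4}
```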
Either compute shortest cycle-free paths $s$-$v$, or report a negative cycle, for all paths starting from $s$. This is a [[Dynamic Programming]] solution.
Since a subpath of a shortest path is itself a shortest path, a shortest path $s$-$v$ can be broken into the best of the shortest paths $s$-$w$ ($w$ ranging over $v$'s in-neighbors) plus the last edge cost $c_{wv}$. Is this enough? When the structure is broken into sub-structures, we notice the number of edges in the subpath decreases by 1, and this information needs to be reflected in the program. Therefore, we assign a budget (maximum number of edges allowed) to each subproblem for the shortest path $s$-$v$; easily seen, this budget never needs to exceed $n-1$ (vertex # $-$ 1) for a cycle-free path. The base case is $A[s, i] = 0$ for every budget $i$ (the shortest distance $s$-$s$ is 0 with any number of edges); the remaining entries of $A$ are initialized to $+\infty$. Also notice that with the budget introduced there's the case $A[v, i] = A[v, i-1]$, when $i-1$ edges already suffice for the optimal path $s$-$v$. Together, the recurrence is

$A[v, i] = \min\Big( A[v, i-1],\ \min_{(w,v) \in E} \big( A[w, i-1] + c_{wv} \big) \Big)$
Why is only one more iteration sufficient to capture a negative cycle on a shortest path?
- If there's a negative cycle on the shortest path $s$-$v$, one vertex on the path must be visited twice, so the edge budget increases by at least one: $n-1 \Rightarrow n$. An improvement at budget $n$ therefore signals a negative cycle.
- Extension: I think $K$ additional iterations on top of the original $0 \to n-1$ would be able to capture $K$ negative cycles on the shortest path $s$-$v$, since each negative cycle makes one vertex visited one more time, increasing the edge budget by 1.
Why is the running time not the intuitive $O(n^2)$ ($n$ budgets $\times$ $n$ vertices)? Because each budget iteration also scans every vertex's incoming edges, so the total work is $O\big(n\sum_{v\in V}{\text{in-degree}(v)}\big) = O(nm)$.
```python
def BellmanFord(G, n):
    """
    Bellman-Ford, run from every source, to compute shortest paths of all pairs,
    given a Graph and the number of vertices.
    time complexity: O(n^2 m), m = # of edges, n = # of vertices (slow: 23200s on (V, E) ~ (1k, 50k))
    space complexity: O(n)
    assignment is to check either existence of a negative cycle or the smallest of all
    pairwise shortest distances
    note: the input graph is 1-indexed, but the algorithm is 0-indexed
    """
    global_d_min = float("inf")
    for s in range(n):  # tail vertex
        # start of one BellmanFord.
        # L[max edge # % 2][head] = shortest distance; only two budget rows are kept
        L = [[float("inf") if i != s else 0 for i in range(n)] for j in range(2)]
        # B = [[None for i in range(n)] for j in range(2)]  # B[i%2][v] = 2nd-last vertex of shortest path (s,v) with edge # <= i, for path construction
        for i in range(1, n + 1):  # budget of edge #. 0 is the base case; i == n detects negative cycles.
            curi, prei = i % 2, 1 - i % 2  # flip rows
            for v in range(n):  # head vertex
                L[curi][v] = L[prei][v]  # case 1: budget i-1 already suffices
                # B[curi][v] = B[prei][v]
                for w in G.G_rev[v + 1]:  # case 2: shortest path s-w plus last edge (w, v)
                    if L[prei][w - 1] + G.C[(w, v + 1)] < L[curi][v]:
                        L[curi][v] = L[prei][w - 1] + G.C[(w, v + 1)]
                        # B[curi][v] = w - 1
                if i == n:
                    if L[curi][v] < L[prei][v]: return 0, None  # still improving at budget n => negative cycle
                    if s != v and L[curi][v] < global_d_min: global_d_min = L[curi][v]  # assignment: track global min of all shortest distances
        # end of one BellmanFord.
        print("progress: {0:.2f}%".format((s + 1) * 100 / n), end="\r")
    return 1, global_d_min
```
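A quick hedged sanity check (made-up 3-vertex graph, not from the notes; assumes the `Graph` class above with costs stored via `addEdge`):

```python
# Hypothetical graph containing a negative cycle.
g = Graph()
g.addEdge(1, 2, 1)
g.addEdge(2, 3, 1)
g.addEdge(3, 1, -3)       # cycle 1 -> 2 -> 3 -> 1 has total cost -1
print(BellmanFord(g, 3))  # expected (0, None): negative cycle detected
```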
Compute all pairs of shortest paths (APSP) and report a negative cycle if one exists. This is a [[Dynamic Programming]] solution.
We know a subpath of a shortest path is itself a shortest path. This time we don't break an optimal path into a sub-optimal path plus its last edge, but into two shortest paths (sharing one internal vertex, of course). Notice also that during this breakdown the shared vertex stops being internal to the sub-paths. If we pre-index the vertices $1, 2, \dots, n$, then each subproblem is parameterized by a prefix $\{1, \dots, k\}$ of allowed internal vertices: let $A[s, v, k]$ denote the optimal distance between $s$ and $v$ with $1, 2, \dots, k$ as candidate internal vertices. If vertex $k$ is an internal vertex of the optimal path for $A[s, v, k]$, then its two optimal sub-paths are exactly $A[s, k, k-1]$ and $A[k, v, k-1]$.
Base cases & initialization: $A[s, v, 0] = 0$ if $s = v$; $c_{sv}$ if edge $(s, v) \in E$; $+\infty$ otherwise.
Also notice that, with the prefix index introduced, there's the case $A[s,v,k] = A[s,v,k-1]$, when internal vertices $1, 2, \dots, k-1$ already suffice for the optimal path $s$-$v$.
```
preindex vertices as 1, 2, ..., n
let A[s,v,k] denote the shortest distance s-v with internal vertices drawn from 1, 2, ..., k
let B[s,v] denote the max index among the path's internal vertices (for path construction)
for k = 1 to n:
    for i = 1 to n:
        for j = 1 to n:
            A[i,j,k] = min{A[i,j,k-1], A[i,k,k-1] + A[k,j,k-1]}
            B[i,j] = k if A[i,j,k] gets updated  // it means k is an internal vertex of path i-j
```
```python
def FloydWarshall(G, n):
    """
    Floyd-Warshall algorithm to compute shortest paths of all pairs,
    given a Graph and the number of vertices.
    time complexity: O(n^3), n = # of vertices (okay: 600s on (V, E) ~ (1k, 50k))
    space complexity: O(n^2), keeping only two "internal max" layers
    assignment is to check either existence of a negative cycle or the smallest of all
    pairwise shortest distances
    note: the input graph is 1-indexed, but the algorithm is 0-indexed
    """
    global_d_min = float("inf")
    # A[internal max % 2][tail][head] = shortest distance; base layer k == 0:
    # edge cost if the edge exists, 0 on the diagonal, +inf elsewhere
    A = [[[G.C[(u + 1, v + 1)] if ((u + 1, v + 1) in G.C and k == 0)
           else 0 if (u == v and k == 0)
           else float("inf")
           for v in range(n)] for u in range(n)] for k in range(2)]
    # B[u][v] = max internal vertex, for path construction
    for k in range(n):  # max internal vertex (0-indexed)
        curi, prei = 1 - k % 2, k % 2  # flip layers
        for u in range(n):  # tail
            for v in range(n):  # head
                # either vertex k+1 is internal to the optimal path, or it is not
                A[curi][u][v] = min(A[prei][u][v], A[prei][u][k] + A[prei][k][v])
                if k == n - 1 and u != v and A[curi][u][v] < global_d_min: global_d_min = A[curi][u][v]  # assignment: track global min of shortest distances of all pairs
                if u == v and A[curi][u][v] < 0: return 0, None  # negative diagonal entry => negative cycle
        print("progress: {0:.2f}%".format((k + 1) * 100 / n), end="\r")
    return 1, global_d_min
```
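A quick hedged check on a tiny made-up graph (not from the notes; assumes the `Graph` class above):

```python
# Hypothetical graph with a negative edge but no negative cycle.
g = Graph()
g.addEdge(1, 2, 2)
g.addEdge(2, 3, -1)
g.addEdge(1, 3, 5)
print(FloydWarshall(g, 3))  # expected (1, -1): the smallest pairwise distance is d(2,3) = -1
```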
Compute all pairs of shortest paths (APSP). Johnson's algorithm combines one Bellman-Ford run (the [[Dynamic Programming]] piece) with $n$ Dijkstra runs.
Running-time-wise, $n\times$Dijkstra is better than $n\times$Bellman-Ford, but Dijkstra cannot deal with negative edges. And only on dense graphs ($m = \Theta(n^2)$) is Floyd-Warshall better than $n\times$Dijkstra. So how can the fast Dijkstra be applied to a graph with negative edges?
Reweight edge costs in a way that makes: (1) every new edge cost non-negative, and (2) every shortest path unchanged.
But how? Give each vertex $v$ a weight $p_v$ and set $c'_{uv} = c_{uv} + p_u - p_v$ (new cost = old cost + tail weight $-$ head weight). The vertex weights telescope along any path, so every $s$-$v$ path shifts by the same constant $p_s - p_v$ and the shortest path is preserved.
Yet how to assign such weights? Add a new vertex 0 with a zero-cost edge to every vertex, run one Bellman-Ford from 0, and set $p_v$ = shortest distance $0$-$v$ (this run also detects negative cycles).
Why are the new edge costs guaranteed to be non-negative? By the shortest-path triangle inequality, $p_v \le p_u + c_{uv}$ for every edge $(u, v)$, hence $c'_{uv} = c_{uv} + p_u - p_v \ge 0$.
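As a quick check of the telescoping claim (a derivation sketch with the symbols above, not from the notes): for a two-edge path $s \to u \to v$,

$$(c_{su} + p_s - p_u) + (c_{uv} + p_u - p_v) = (c_{su} + c_{uv}) + p_s - p_v,$$

i.e., the reweighted length is the original length plus a constant that depends only on the endpoints, so the argmin over $s$-$v$ paths is unchanged.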
```python
def Johnson(G, n):
    """
    Johnson's algorithm to compute shortest paths of all pairs,
    given a Graph and the number of vertices.
    time complexity: O(mn log n), m = # of edges, n = # of vertices (fastest: 173s on (V, E) ~ (1k, 50k))
    space complexity: O(m)
    assignment is to check either existence of a negative cycle or the smallest of all
    pairwise shortest distances
    note: the input graph is 1-indexed, same as the arrays in the algorithm, with the new vertex indexed 0
    """
    global_d_min = float("inf")
    # do 1 BellmanFord from a newly added vertex "0" and reweight all edge costs
    for v in range(1, n + 1):
        G.addEdge(0, v, 0)  # zero-cost edge from 0 to every vertex
    L = [[float("inf") if i != 0 else 0 for i in range(n + 1)] for j in range(2)]  # L[edge # % 2][head]
    for i in range(1, n + 2):  # edge budget; the graph now has n+1 vertices, so i == n+1 detects negative cycles
        curi, prei = i % 2, 1 - i % 2
        for v in range(n + 1):  # head
            L[curi][v] = L[prei][v]
            for w in G.G_rev[v]:
                if L[prei][w] + G.C[(w, v)] < L[curi][v]: L[curi][v] = L[prei][w] + G.C[(w, v)]
            if i == n + 1 and L[curi][v] < L[prei][v]: return 0, None  # check negative cycle
    p = L[curi]  # vertex weights: p[v] = shortest distance 0-v
    for (u, v) in G.C:
        G.C[(u, v)] += p[u] - p[v]  # new cost = old cost + tail weight - head weight
    # do n Dijkstras on the original vertices
    for s in range(1, n + 1):  # tail
        # start of 1 Dijkstra
        dists = {v: float('inf') if v != s else 0 for v in range(1, n + 1)}
        seen = set()
        pq = [(0, s)]
        while pq:
            cur_dist, cur_v = heapq.heappop(pq)  # heap pops min tentative distance
            if cur_v in seen: continue  # ignore explored vertex
            dists[cur_v] = cur_dist  # finalize min (reweighted) distance for this vertex
            if s != cur_v and cur_dist + p[cur_v] - p[s] < global_d_min:  # assignment: track global min, converted back to original costs
                global_d_min = cur_dist + p[cur_v] - p[s]
            seen.add(cur_v)  # mark this vertex as explored
            for neighbor in G.G[cur_v]:
                if neighbor in seen: continue
                heapq.heappush(pq, (cur_dist + G.C[(cur_v, neighbor)], neighbor))  # push (tentative distance, head)
        # end of 1 Dijkstra
        print("progress: {0:.2f}%".format(s * 100 / n), end="\r")
    return 1, global_d_min
```
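A hedged end-to-end check on the same made-up graph as the Floyd-Warshall example (built fresh, since `Johnson` reweights `G.C` in place):

```python
# Hypothetical graph, identical to the Floyd-Warshall example above.
g = Graph()
g.addEdge(1, 2, 2)
g.addEdge(2, 3, -1)
g.addEdge(1, 3, 5)
print(Johnson(g, 3))  # expected (1, -1), agreeing with FloydWarshall
```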
For sparse graphs ($m=O(n)$) -> Johnson is best.
For dense graphs ($m=\Theta(n^2)$) -> Floyd-Warshall is best.