Number of paths from source to destination in a directed acyclic graph
Given a directed acyclic graph (DAG) with n vertices and m edges, the task is to find the number of different paths from a source vertex to a destination vertex.
Input: source = 0, destination = 4
Output: 3
Explanation: The three paths are:
0 -> 2 -> 3 -> 4
0 -> 3 -> 4
0 -> 4
Input: source = 0, destination = 1
Output: 1
Explanation: There exists only one path, 0 -> 1.
Approach: Let f(u) be the number of ways to travel from node u to the destination vertex, so f(source) is the required answer. The base case is f(destination) = 1, because there is exactly one path from the destination to itself (the empty path). For any other node u, f(u) depends only on the f values of the nodes directly reachable from u: f(u) = f(v1) + f(v2) + … + f(vk), where v1 to vk are all the vertices that have a direct edge from u. Evaluated as a plain recursion, however, this is too slow to be useful. Each function call branches out into further calls, which branch again, until every path has been explored once, and the number of paths can grow exponentially with the number of vertices.
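The recurrence above can be written directly as a recursive function. The following is a minimal sketch, not the article's original code: the name countPathsNaive and the adjacency-list layout (adj[u] holds every v with an edge u -> v) are assumptions made for illustration, and the graph is assumed to be acyclic.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: counts paths from u to dest by plain recursion.
// adj[u] lists the vertices reachable from u by a single edge.
// Assumes the graph is acyclic; on a cyclic graph this would not terminate.
long long countPathsNaive(const std::vector<std::vector<int>>& adj, int u, int dest) {
    assert(u >= 0 && u < static_cast<int>(adj.size()));
    if (u == dest) {
        return 1;  // base case: f(destination) = 1
    }
    long long total = 0;
    for (int v : adj[u]) {
        // f(u) is the sum of f(v) over every edge u -> v
        total += countPathsNaive(adj, v, dest);
    }
    return total;
}
```

Because nothing is cached, a vertex that lies on many paths is re-evaluated once per path, which is the cost the next paragraph addresses.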
The problem with this approach is that f(u) is recalculated every time the function is called with argument u. Since the problem exhibits both overlapping subproblems and optimal substructure, dynamic programming is applicable. To evaluate f(u) just once for each u, evaluate f(v) for every v that can be reached from u before evaluating f(u). Processing the nodes in reverse topological order satisfies this condition.
Below is the implementation of the above approach:
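A minimal sketch of the reverse-topological-order approach follows. It is an illustration under stated assumptions rather than the original listing: the function name countPathsTopo is hypothetical, the graph is passed as an adjacency list, and Kahn's algorithm is used to obtain the topological order (a DFS-based sort would serve equally well).

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical helper: counts paths from src to dest in a DAG given as an
// adjacency list, filling f in reverse topological order.
long long countPathsTopo(const std::vector<std::vector<int>>& adj, int src, int dest) {
    const int n = static_cast<int>(adj.size());
    assert(src >= 0 && src < n && dest >= 0 && dest < n);

    // Kahn's algorithm: repeatedly remove vertices whose in-degree is 0.
    std::vector<int> indeg(n, 0);
    for (int u = 0; u < n; ++u) {
        for (int v : adj[u]) ++indeg[v];
    }
    std::queue<int> ready;
    for (int u = 0; u < n; ++u) {
        if (indeg[u] == 0) ready.push(u);
    }
    std::vector<int> order;
    order.reserve(n);
    while (!ready.empty()) {
        int u = ready.front();
        ready.pop();
        order.push_back(u);
        for (int v : adj[u]) {
            if (--indeg[v] == 0) ready.push(v);
        }
    }
    assert(static_cast<int>(order.size()) == n);  // fails if the graph has a cycle

    // Walk the order backwards so every successor v of u is final before u.
    std::vector<long long> f(n, 0);
    f[dest] = 1;
    for (auto it = order.rbegin(); it != order.rend(); ++it) {
        int u = *it;
        if (u == dest) continue;  // keep the base case f(dest) = 1
        for (int v : adj[u]) f[u] += f[v];
    }
    return f[src];
}
```

Each vertex and each edge is visited a constant number of times, so the sketch runs in O(V + E) time. Note that assert is compiled out when NDEBUG is defined, so production code would want an explicit cycle check.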
Method 2: Top-down DP
Let us consider the graph below
Take the source as 0 and the destination as 4. We need to find the number of ways to reach 4 from 0. The basic intuition is that if we are already at the destination, we have found one valid path. When the source and destination differ, we do not yet know how many ways lead from one to the other. But if a neighbour of the source can reach the destination via some path, then prepending the source to each of those paths gives a path from the source to the destination.
To visualise it: the number of ways to reach the destination (4) from the source (0) is obtained from the neighbours of the source. For every path by which a neighbour of 0 reaches 4, prepending 0 gives one path from the source to the destination.
The neighbours of the source (0) are 4, 3 and 2.
4 is the destination, so we have found one valid path. Prepending the source gives the path 0 -> 4.
3 can reach 4 in only one way, 3 -> 4. Prepending the source gives one more path, via 3: 0 -> 3 -> 4.
2 can reach 4 in only one way, 2 -> 3 -> 4. Prepending the source gives the path 0 -> 2 -> 3 -> 4.
We have found 3 possible ways to reach the destination from the source. But there are overlapping subproblems: when computing the answer for 2, we explore the paths from 3 again, even though we have already computed them. To avoid this, we store the result for every vertex once it has been computed, so that the same subproblem is never solved again. This is the intuition behind dynamic programming.
- If we are already at the destination, we have found one valid path.
- If we have not reached the destination, the number of ways to reach it from the current vertex is the sum of the number of ways its neighbours can reach it. We store this sum in the dp array.
- If we have already computed the result for a vertex, we return it directly. To tell which vertices have not been computed yet, we initialise the dp array with -1 (meaning the answer for that vertex has not been computed).
- If none of the neighbours of a vertex can reach the destination, we return -1 to indicate that there is no path. A caller must skip such a result rather than add it to its sum.
- If the number of ways is very large, we can take it modulo 10^9 + 7 and store the result.
Below is the C++ implementation
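The steps above can be sketched as follows. This is an illustrative version, not the original listing: the names countPathsMemo, countPathsTopDown and kMod are hypothetical, and the graph is assumed to be an adjacency list. One deliberate deviation is labelled in the comments: a vertex that cannot reach the destination stores 0 rather than -1, so that -1 stays reserved for "not yet computed" and dead ends are never re-explored.

```cpp
#include <cassert>
#include <vector>

constexpr long long kMod = 1000000007LL;  // 10^9 + 7

// Hypothetical helper: returns the number of paths from u to dest, modulo kMod.
// dp[u] == -1 means "not computed yet"; any other value is the cached answer.
long long countPathsMemo(const std::vector<std::vector<int>>& adj, int u, int dest,
                         std::vector<long long>& dp) {
    if (u == dest) {
        return 1;  // already at the destination: one valid path
    }
    if (dp[u] != -1) {
        return dp[u];  // answer for u was computed earlier
    }
    long long total = 0;
    for (int v : adj[u]) {
        total = (total + countPathsMemo(adj, v, dest, dp)) % kMod;
    }
    // Assumption of this sketch: "no path" is stored as 0, not -1.
    dp[u] = total;
    return total;
}

// Hypothetical wrapper that allocates the dp array and starts the recursion.
long long countPathsTopDown(const std::vector<std::vector<int>>& adj, int src, int dest) {
    const int n = static_cast<int>(adj.size());
    assert(src >= 0 && src < n && dest >= 0 && dest < n);
    std::vector<long long> dp(n, -1);
    return countPathsMemo(adj, src, dest, dp);
}
```

With this layout a result of 0 from countPathsTopDown means that no path was found (or, in principle, that the true count is a multiple of 10^9 + 7); a caller that needs the -1 convention can translate it at the top level.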
Time complexity: O(V + E), where V is the number of vertices and E is the number of edges.
Space complexity: O(V + E), made up of O(V + E) for the adjacency list and O(V) for the dp array; the recursion stack adds at most O(V) more.