
SciPy CSGraph – Compressed Sparse Graph

Last Updated : 11 Jul, 2023

Graphs are mathematical structures that represent relationships between entities, and they appear in computer science, social networks, transportation systems, and many other fields. Analyzing large graphs can be challenging, especially when their connectivity is sparse. The scipy.sparse.csgraph subpackage in SciPy provides a set of algorithms designed for efficient graph analysis on sparse matrix representations. A sparse matrix is a matrix in which most elements are zero, which makes it well suited to storing and manipulating large graphs in which each node connects to only a few others.

Note: Before going further, it is strongly recommended that you know how to create a sparse matrix in Python (refer to the article How to Create a Sparse Matrix in Python).

Key Functionalities of SciPy CSGraph

The scipy.sparse.csgraph subpackage offers a wide range of functionalities and algorithms for efficient graph analysis. Let’s delve into its key features:

  • Shortest Path Algorithms
    • shortest_path: Compute shortest path lengths between nodes. Related routines include dijkstra, floyd_warshall, bellman_ford, and johnson.
  • Connected Components
    • connected_components: Identify the connected components in a graph, returning the number of components and a label for each node.
  • Minimum Spanning Tree
    • minimum_spanning_tree: Calculate the minimum spanning tree of an undirected graph, that is, the subset of edges that connects all connected nodes with the minimum total weight. The result is returned as a Compressed Sparse Row (CSR) matrix.
  • Strongly Connected Components
    • connected_components with connection='strong': Identify the strongly connected components of a directed graph. The default, connection='weak', ignores edge direction.
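
As a minimal sketch of the connectivity routines listed above, the snippet below calls connected_components twice on a small, made-up directed graph (the 5-node adjacency matrix is an assumption chosen for illustration). It compares weak connectivity, which ignores edge direction, with strong connectivity, which requires a path in both directions.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

# Hypothetical graph: 0 -> 1 -> 2 -> 0 forms a cycle, 3 -> 4 is a separate edge
adjacency = np.array([
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
])
graph = csr_matrix(adjacency)

# Weak connectivity: edge direction is ignored
n_weak, weak_labels = connected_components(
    graph, directed=True, connection="weak")

# Strong connectivity: nodes must be mutually reachable
n_strong, strong_labels = connected_components(
    graph, directed=True, connection="strong")

print("Weak components:", n_weak, weak_labels)
print("Strong components:", n_strong, strong_labels)
```

Here the cycle 0, 1, 2 is one component under both definitions, while nodes 3 and 4 share a weak component but form two separate strong components because there is no edge back from 4 to 3.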

Creating CSGraph From Adjacency Matrix

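  • Define an adjacency matrix that represents the connectivity of the graph.
  • Pass the matrix to csgraph_from_dense as a dense array. The function expects dense input, so a matrix that is already stored in a sparse format (e.g., CSR, CSC) has to be converted with toarray() first.
  • csgraph_from_dense returns the graph as a CSR sparse matrix in which zero entries of the input are treated as missing edges.
  • The graph is directed: entry [i, j] describes the edge from node i to node j.

In this example, we use NumPy and SciPy to create an empty 3 x 3 sparse matrix and then convert it into a graph. Because the matrix has no nonzero entries, the resulting graph has no edges.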

Python3




import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import csgraph_from_dense
 
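# Create an empty 3 x 3 CSR matrix and convert it to a dense array,
# because csgraph_from_dense expects dense input
denseMatrix = csr_matrix((3, 3),
                         dtype=np.int8).toarray()

# Convert the dense matrix to a graph (returned as a CSR matrix)
graph = csgraph_from_dense(denseMatrix)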
 
print(graph.toarray())


Output:

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]

In this example, we first define the adjacency matrix of a directed graph and then convert it into a graph. Printing the result lists each edge as (source, destination) followed by its weight.

Python3




from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import csgraph_from_dense
 
# Define the adjacency matrix for a directed graph
adjacency_matrix = [[0, 1, 0, 1],
                    [0, 0, 1, 0],
                    [0, 0, 0, 1],
                    [0, 0, 0, 0]]
 
# Round-trip through CSR format; toarray() gives back the dense
# array that csgraph_from_dense expects
graph_dense = csr_matrix(adjacency_matrix).toarray()

# Convert the dense array to a graph (returned as a CSR matrix)
graph = csgraph_from_dense(graph_dense)
# Prints each edge as (source, destination) edge-weight
print(graph)


Output:

  (0, 1)    1.0
  (0, 3)    1.0
  (1, 2)    1.0
  (2, 3)    1.0
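
By default csgraph_from_dense treats zeros as missing edges, so a genuine zero-weight edge would be lost. The sketch below, which uses a small made-up matrix, shows how the null_value parameter can mark missing edges with np.inf instead, so that zero-weight edges are kept.

```python
import numpy as np
from scipy.sparse.csgraph import csgraph_from_dense

# Hypothetical 3-node graph in which np.inf, not 0, marks a missing edge.
# Nodes 0 and 2 are joined by an edge of weight 0.
dense = np.array([[np.inf, 2.0, 0.0],
                  [2.0, np.inf, np.inf],
                  [0.0, np.inf, np.inf]])

graph = csgraph_from_dense(dense, null_value=np.inf)

# The zero-weight edges are stored explicitly instead of being dropped
print(graph.data)
print("Stored edges:", graph.nnz)
```

This choice matters for algorithms such as shortest_path, where an edge of weight 0 and a missing edge mean very different things.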

Creating CSGraph from Edge List

  • Define an edge list that represents the connectivity of the graph.
  • Store the edges in a sparse matrix (e.g., COO format).
  • Pass a dense version of that matrix to csgraph_from_dense. The csgraph routines also accept the sparse matrix directly, so this conversion is optional.
  • The graph is directed.

Python3




import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import csgraph_from_dense
 
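# An empty 3 x 3 COO matrix (no edges yet), converted to a dense
# array because csgraph_from_dense expects dense input
edgeMatrix = coo_matrix((3, 3),
                        dtype=np.int8).toarray()

# Convert the dense matrix to a graph
graph = csgraph_from_dense(edgeMatrix)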
 
print(graph.toarray())


Output:

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
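
The example above starts from an empty matrix, so it contains no edges. As a sketch of the edge-list workflow described in the steps, the snippet below builds a COO matrix from an explicit list of (source, destination, weight) triples. The edge list and node count are made up for illustration.

```python
from scipy.sparse import coo_matrix

# Hypothetical edge list: (source, destination, weight)
edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 5)]
n_nodes = 4

# Split the triples into parallel sequences for the COO constructor
sources, destinations, weights = zip(*edges)

# coo_matrix((data, (row, col))) places weights[k] at
# position (sources[k], destinations[k])
graph = coo_matrix((weights, (sources, destinations)),
                   shape=(n_nodes, n_nodes)).tocsr()

print(graph.toarray())
```

The resulting CSR matrix can be passed straight to routines such as shortest_path or connected_components without going through csgraph_from_dense.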

Creating an Undirected Graph

To create an undirected graph with scipy.sparse.csgraph, use a symmetric adjacency matrix.

Symmetric Matrix: A square matrix is symmetric when it is equal to its transpose, that is, when the element at row i and column j equals the element at row j and column i for every i and j.

Python3




from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import csgraph_from_dense
 
# Define the adjacency matrix for an undirected graph
# Here 1 is the weight of the edge between source and destination
adjacency_matrix = [[0, 1, 0, 1],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [1, 0, 1, 0]]

# Enforce symmetry: keep the larger of entry [i][j] and entry [j][i]
# (this matrix is already symmetric, so the step leaves it unchanged)
adjacency_matrix = [[max(adjacency_matrix[i][j],
                         adjacency_matrix[j][i])
                     for j in range(len(adjacency_matrix))]
                    for i in range(len(adjacency_matrix))]

# Round-trip through CSR format; toarray() gives back the dense
# array that csgraph_from_dense expects
graph_dense = csr_matrix(adjacency_matrix).toarray()

# Convert the dense array to a graph (returned as a CSR matrix)
graph = csgraph_from_dense(graph_dense)
 
print(graph)


Output:

  (0, 1)    1.0
  (0, 3)    1.0
  (1, 0)    1.0
  (1, 2)    1.0
  (2, 1)    1.0
  (2, 3)    1.0
  (3, 0)    1.0
  (3, 2)    1.0
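
The symmetry condition can also be checked and enforced with NumPy array operations. The following sketch starts from the directed adjacency matrix used earlier in this article, mirrors every edge with np.maximum, and verifies that the result equals its transpose. The variable names are chosen for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Directed adjacency matrix: edges 0->1, 0->3, 1->2, 2->3
directed = np.array([[0, 1, 0, 1],
                     [0, 0, 1, 0],
                     [0, 0, 0, 1],
                     [0, 0, 0, 0]])

# Mirror every edge: entry [i, j] becomes max(A[i, j], A[j, i])
undirected = np.maximum(directed, directed.T)

# A matrix is symmetric when it equals its transpose
is_symmetric = bool((undirected == undirected.T).all())

graph = csr_matrix(undirected)
print("Symmetric:", is_symmetric)
print("Stored edges:", graph.nnz)
```

Each undirected edge is stored twice, once in each direction, which is why four directed edges become eight stored entries.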

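Breadth-First Traversal

Syntax:

breadth_first_order(csgraph, i_start, directed=True, return_predecessors=True)

Parameters

  • csgraph : The N x N array or sparse matrix representing the input graph.
  • i_start : (int) The index of the starting node.
  • directed : (bool, optional) If True (default), follow edges only from node i to node j along csgraph[i, j]. If False, treat the graph as undirected.
  • return_predecessors : (bool, optional) If True (default), also return the predecessor array.

Return:

  • node_array : (ndarray, one dimension) The breadth-first list of nodes, starting with the specified node. The length of node_array is the number of nodes reachable from the specified node.
  • predecessors : (ndarray, one dimension) Returned only if return_predecessors is True. predecessors[i] is the parent of node i in the breadth-first tree; nodes without a parent are marked with -9999.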

Python3




from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import breadth_first_order
 
adjMat = [
    [0, 1, 2, 0],
    [0, 0, 0, 1],
    [2, 0, 0, 3],
    [0, 0, 0, 0]
]
 
graph = csr_matrix(adjMat)
print(graph) 
 
# bfs start from Node 0
bfs = breadth_first_order(graph, 0,
                          return_predecessors=False)
 
print("Breadth-first travelling order:", bfs)


Output:

  (0, 1)    1
  (0, 2)    2
  (1, 3)    1
  (2, 0)    2
  (2, 3)    3
Breadth-first travelling order: [0 1 2 3]
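
breadth_first_order can also return the predecessor of each node in the breadth-first tree. The sketch below rebuilds the same graph and uses the predecessor array to recover the path from the start node to a target node. The helper path_to is a hypothetical function written for this illustration, not part of SciPy.

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import breadth_first_order

graph = csr_matrix([[0, 1, 2, 0],
                    [0, 0, 0, 1],
                    [2, 0, 0, 3],
                    [0, 0, 0, 0]])

start = 0
order, predecessors = breadth_first_order(
    graph, i_start=start, directed=True, return_predecessors=True)


def path_to(target, predecessors, start):
    """Hypothetical helper: walk the predecessor array back to start."""
    path = [target]
    while path[-1] != start:
        parent = int(predecessors[path[-1]])
        if parent < 0:  # -9999 means the node has no parent in the tree
            return []
        path.append(parent)
    return path[::-1]


path_to_3 = path_to(3, predecessors, start)
print("Order:", order)
print("Predecessors:", predecessors)
print("Path from 0 to 3:", path_to_3)
```

The path found this way uses the fewest edges, because breadth-first search ignores edge weights.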

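Depth-First Traversal

depth_first_order(csgraph, i_start, directed=True, return_predecessors=True): Return a depth-first ordering starting with the specified node. The parameters and return values mirror those of breadth_first_order.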

Python3




from scipy.sparse.csgraph import depth_first_order
 
# dfs Travel Start from Node 1
dfs = depth_first_order(graph, i_start=1,
                        return_predecessors=False)
print("Depth First Travelling order:", dfs)


Output:

Depth First Travelling order: [1 3]
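
Only nodes 1 and 3 appear above because node 1 has a single outgoing edge. As a sketch of how the directed parameter changes the result, the snippet below repeats the traversal on the same graph with directed=False, which lets the search follow edges in either direction.

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import depth_first_order

graph = csr_matrix([[0, 1, 2, 0],
                    [0, 0, 0, 1],
                    [2, 0, 0, 3],
                    [0, 0, 0, 0]])

# Directed: only nodes reachable along outgoing edges are visited
directed_order = depth_first_order(
    graph, i_start=1, directed=True, return_predecessors=False)

# Undirected: every edge can be traversed in both directions
undirected_order = depth_first_order(
    graph, i_start=1, directed=False, return_predecessors=False)

print("Directed:", directed_order)
print("Undirected:", undirected_order)
```

With directed=False all four nodes are reached from node 1, although the exact visiting order depends on how the implementation picks the next neighbour.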

Shortest Path

Syntax:

shortest_path(csgraph, method='auto', directed=True, return_predecessors=False, unweighted=False, overwrite=False, indices=None)

Parameters:

  • csgraph : The N x N array of distances representing the input graph.
  • method : (string ['auto'|'FW'|'D'|'BF'|'J'], optional) Algorithm to use for shortest paths. Options are:
    • 'auto' – (default) select the best among 'FW', 'D', 'BF', or 'J' based on the input data.
    • 'FW' – Floyd-Warshall algorithm. Computational cost is approximately O[N^3].
    • 'D' – Dijkstra's algorithm with Fibonacci heaps.
    • 'BF' – Bellman-Ford algorithm. This algorithm can be used when edge weights are negative.
    • 'J' – Johnson's algorithm. Like the Bellman-Ford algorithm, it is designed for graphs with negative edge weights.
  • directed : (bool, optional)
    • If True (default), find the shortest path on a directed graph.
    • If False, find the shortest path on an undirected graph.
  • return_predecessors : (bool, optional) If True, also return the N x N predecessor matrix.
  • indices : (array or int, optional) If specified, only compute the paths from the points at the given indices.

Returns:

  • dist_matrix : (ndarray) The N x N matrix of distances between graph nodes. dist_matrix[i, j] gives the shortest distance from point i to point j along the graph.

Python3




from scipy.sparse.csgraph import shortest_path
 
# Shortest path distances from Node 1
# to the remaining nodes
source = 1
dist1 = shortest_path(csgraph=graph,
                      method="auto",
                      directed=False,
                      indices=source)
print(f"Distance from Node {source} to remaining Nodes",
      dist1)

# Shortest path distances between all pairs of nodes
dist_matrix = shortest_path(csgraph=graph,
                            method='FW',
                            directed=False)
print("Distance between all the Nodes\n",
      dist_matrix)


Output:

Distance from Node 1 to remaining Nodes [1. 0. 3. 1.]
Distance between all the Nodes
 [[0. 1. 2. 2.]
 [1. 0. 3. 1.]
 [2. 3. 0. 3.]
 [2. 1. 3. 0.]]
  • output1: dist1[j] is the shortest distance from the source node (Node 1 in this example) to Node j.
  • output2: dist_matrix[i, j] is the shortest distance from Node i to Node j.
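
The distances alone do not say which route was taken. As a sketch, the snippet below asks shortest_path for the predecessor matrix on the same graph, this time treated as directed, and rebuilds one route from it. The helper reconstruct_path is a hypothetical function written for this illustration, not part of SciPy.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path

graph = csr_matrix([[0, 1, 2, 0],
                    [0, 0, 0, 1],
                    [2, 0, 0, 3],
                    [0, 0, 0, 0]])

# predecessors[i, j] is the node visited just before j
# on the shortest path from i to j
dist, predecessors = shortest_path(
    graph, method="D", directed=True, return_predecessors=True)


def reconstruct_path(predecessors, source, target):
    """Hypothetical helper: follow predecessors from target back to source."""
    if source != target and predecessors[source, target] < 0:
        return []  # target is not reachable from source
    path = [target]
    while path[-1] != source:
        path.append(int(predecessors[source, path[-1]]))
    return path[::-1]


route = reconstruct_path(predecessors, 0, 3)
print("Distance 0 -> 3:", dist[0, 3])
print("Route 0 -> 3:", route)
print("Node 0 reachable from node 1:", not np.isinf(dist[1, 0]))
```

Unreachable pairs are reported with a distance of inf, which is why the reachability check uses np.isinf.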

Minimum Spanning Tree

Syntax:

minimum_spanning_tree(csgraph, overwrite=False)

  • A minimum spanning tree is a graph consisting of the subset of edges that together connect all connected nodes while minimizing the total sum of the edge weights. It is computed using Kruskal's algorithm.

Parameters:

  • csgraph : The N x N array or sparse matrix representing an undirected graph over N nodes.
  • overwrite : (bool, optional) If True, parts of the input graph will be overwritten for efficiency. Default is False.

Return:

  • span_tree : (csr_matrix) The N x N compressed-sparse representation of the undirected minimum spanning tree over the input.

Python3




from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree
 
X = csr_matrix([[0, 8, 0, 3],
                [0, 0, 2, 5],
                [0, 0, 0, 6],
                [0, 0, 0, 0]])
 
# Find the minimum spanning tree
Tcsr = minimum_spanning_tree(X)

# Print the minimum spanning tree as a dense array
print(Tcsr.toarray())


Output:

[[0. 0. 0. 3.]
 [0. 0. 2. 5.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
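
The returned matrix can be inspected further. The sketch below recomputes the same tree, lists its edges, and sums their weights. A spanning tree of a connected graph with N nodes has N - 1 edges, which the edge count confirms.

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree

X = csr_matrix([[0, 8, 0, 3],
                [0, 0, 2, 5],
                [0, 0, 0, 6],
                [0, 0, 0, 0]])

mst = minimum_spanning_tree(X)

# COO format exposes parallel row, column, and data arrays
coo = mst.tocoo()
edges = [(int(i), int(j), float(w))
         for i, j, w in zip(coo.row, coo.col, coo.data)
         if w != 0]  # skip any explicitly stored zeros

total_weight = sum(w for _, _, w in edges)

print("Edges:", sorted(edges))
print("Total weight:", total_weight)
```

The heaviest edges, 0 to 1 with weight 8 and 2 to 3 with weight 6, are left out because the three lighter edges already connect all four nodes.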

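Maximum Flow

Syntax:

maximum_flow(csgraph, source, sink)

Parameters:

  • csgraph : The input graph as a square CSR matrix whose entries are integer edge capacities.
  • source : (int) The source node from which the flow originates.
  • sink : (int) The sink node at which the flow arrives.

Return:

  • An instance of the MaximumFlowResult class.
    • Its attributes are flow_value (the value of the maximum flow) and flow (a sparse matrix holding the flow along each edge).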

Python3




from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import maximum_flow
 
# Define the adjacency matrix for a directed graph
adjacency_matrix = [[0, 16, 13, 0, 0, 0],
                    [0, 0, 0, 12, 0, 0],
                    [0, 4, 0, 0, 14, 0],
                    [0, 0, 9, 0, 0, 20],
                    [0, 0, 0, 7, 0, 4],
                    [0, 0, 0, 0, 0, 0]]
 
# Convert the adjacency matrix to CSR format
graph_sparse = csr_matrix(adjacency_matrix)
 
# Compute the maximum flow in the graph;
# the result is a MaximumFlowResult object
result = maximum_flow(graph_sparse, 0, 5)

# Retrieve the maximum flow value
max_flow_value = result.flow_value

# Retrieve the flow distribution along the edges
flow_matrix = result.flow
 
print("Maximum Flow Value:", max_flow_value)
print("Flow Distribution:")
print(flow_matrix.toarray())


Output:

Maximum Flow Value: 23
Flow Distribution:
[[  0  12  11   0   0   0]
 [-12   0   0  12   0   0]
 [-11   0   0   0  11   0]
 [  0 -12   0   0  -7  19]
 [  0   0 -11   7   0   4]
 [  0   0   0 -19  -4   0]]
  • The maximum flow value is 23, indicating that a maximum of 23 units of flow can be sent from the source node to the sink node
  • The flow distribution matrix shows the flow along each edge. For example, the element flow_matrix[0, 1] represents the flow from node 0 to node 1, which is 12.
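
The negative entries in the flow matrix mirror the positive ones: a flow of 12 from node 0 to node 1 appears as -12 from node 1 to node 0. As a sketch, assuming a SciPy version in which MaximumFlowResult exposes the flow attribute used above, the snippet below sums each row to get the net flow leaving each node and checks that no edge exceeds its capacity.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import maximum_flow

capacities = np.array([[0, 16, 13, 0, 0, 0],
                       [0, 0, 0, 12, 0, 0],
                       [0, 4, 0, 0, 14, 0],
                       [0, 0, 9, 0, 0, 20],
                       [0, 0, 0, 7, 0, 4],
                       [0, 0, 0, 0, 0, 0]], dtype=np.int32)

result = maximum_flow(csr_matrix(capacities), 0, 5)
flow = result.flow.toarray()

# Row sums give the net flow leaving each node
net_flow = flow.sum(axis=1)

# No edge may carry more than its capacity
within_capacity = bool((flow <= capacities).all())

print("Flow value:", result.flow_value)
print("Net flow per node:", net_flow)
print("Within capacity:", within_capacity)
```

The net flow equals the flow value at the source, its negative at the sink, and zero at every intermediate node, which is the flow conservation property.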

Directed v/s Undirected Graph

Edge Representation

  • Directed Graph: Edges have a specific direction between nodes. For example, in the output of the directed graph example (the second example under Creating CSGraph From Adjacency Matrix), there is a directed edge from 0 to 1. It signifies that we can move from 0 to 1, but not from 1 to 0.
  • Undirected Graph: Edges do not have any specific direction. For example, in the output of the undirected graph example, there is an edge from 0 to 1 and also an edge from 1 to 0. It signifies that we can move from 0 to 1 and also from 1 to 0.

Symmetry

  • Directed Graph: The adjacency matrix, and with it the relationship between vertices, is generally asymmetric. The adjacency matrix of the directed graph example is asymmetric.
  • Undirected Graph: The adjacency matrix, and with it the relationship between vertices, is symmetric. The adjacency matrix of the undirected graph example is symmetric.

Edge Notation

  • Directed Graph: Represented as an ordered pair (source vertex, target vertex).
  • Undirected Graph: Represented as an unordered pair {vertex A, vertex B}.

Examples

  • Directed Graph: Flow charts, one-way streets.
  • Undirected Graph: Bidirectional streets.

Conclusion

In this article, we explored the key features of scipy.sparse.csgraph. We discussed how to create a graph from a dense adjacency matrix and from sparse formats such as CSR and COO. We then covered breadth-first and depth-first traversal, shortest paths (for example with Dijkstra's algorithm), minimum spanning trees, and the maximum flow algorithm for network flow problems.

As you continue exploring scipy.sparse.csgraph, you will find further algorithms that apply to a wide range of graph problems, from graph traversal and connectivity analysis to bipartite matching and network flow optimization.


