# Clustering Coefficient in Graph Theory

In graph theory, a clustering coefficient is a measure of the degree to which nodes in a graph tend to cluster together. Evidence suggests that in most real-world networks, and in particular social networks, nodes tend to create tightly knit groups characterized by a relatively high density of ties; this likelihood tends to be greater than the average probability of a tie randomly established between two nodes (Holland and Leinhardt, 1971; Watts and Strogatz, 1998).

Two versions of this measure exist: the global and the local. The global version was designed to give an overall indication of the clustering in the network, whereas the local gives an indication of the embeddedness of single nodes.

*Global clustering coefficient*

The global clustering coefficient is based on triplets of nodes. A triplet consists of three nodes connected by either two ties (an open triplet) or three ties (a closed triplet). A triangle therefore includes three closed triplets, one centered on each of its nodes (n.b. this means the three triplets in a triangle come from overlapping selections of nodes). The global clustering coefficient is the number of closed triplets (or 3 × the number of triangles) divided by the total number of triplets (both open and closed). The first attempt to measure it was made by Luce and Perry (1949). This measure gives an indication of the clustering in the whole network (global), and can be applied to both undirected and directed networks.
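The triplet count described above can be sketched in plain Python for an undirected graph. The adjacency-dictionary representation, the function name `global_clustering`, and the small example graph are assumptions made for illustration only; they are not part of networkx.

```python
from itertools import combinations


def global_clustering(adj):
    """Closed triplets divided by all triplets.

    adj is assumed to map each node to the set of its neighbours
    (undirected graph, no self-loops).
    """
    closed = 0
    total = 0
    for center, nbrs in adj.items():
        # Every unordered pair of neighbours forms one triplet centred here.
        for u, v in combinations(nbrs, 2):
            total += 1
            if v in adj[u]:  # the pair is itself linked, so the triplet is closed
                closed += 1
    return closed / total if total else 0.0


# Hypothetical example: a triangle 0-1-2 with an extra node 3 attached to 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(global_clustering(adj))  # 3 closed triplets out of 5
```

The single triangle contributes three closed triplets, and node 0 adds two open ones through node 3, which gives 3/5.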

**Local clustering coefficient**

A graph $G = (V, E)$ formally consists of a set of vertices $V$ and a set of edges $E$ between them. An edge $e_{ij}$ connects vertex $v_i$ with vertex $v_j$.

The neighborhood $N_i$ for a vertex $v_i$ is defined as its immediately connected neighbors as follows:

$$N_i = \{v_j : e_{ij} \in E \lor e_{ji} \in E\}.$$

We define $k_i$ as the number of vertices, $|N_i|$, in the neighborhood, $N_i$, of a vertex.

The local clustering coefficient $C_i$ for a vertex $v_i$ is then given by the number of links between the vertices within its neighborhood divided by the number of links that could possibly exist between them. For a directed graph, $e_{ij}$ is distinct from $e_{ji}$, and therefore for each neighborhood $N_i$ there are $k_i(k_i - 1)$ links that could exist among the vertices within the neighborhood ($k_i$ is the number of neighbors of a vertex). Thus, the local clustering coefficient for directed graphs is given as [2]

$$C_i = \frac{|\{e_{jk} : v_j, v_k \in N_i,\ e_{jk} \in E\}|}{k_i(k_i - 1)}.$$
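A minimal sketch of the directed case, assuming the graph is stored as a dictionary from each node to the set of its successors. The function name `local_clustering_directed` and the example graph are hypothetical. The neighborhood collects nodes linked to the vertex in either direction, and every ordered pair of neighbors counts as one possible link.

```python
def local_clustering_directed(succ, v):
    """Local clustering coefficient of v in a directed graph.

    succ is assumed to map every node to the set of its successors
    (no self-loops).
    """
    # Neighbourhood: nodes joined to v by an edge in either direction.
    nbrs = set(succ[v]) | {u for u, out in succ.items() if v in out}
    nbrs.discard(v)
    k = len(nbrs)
    if k < 2:
        return 0.0
    # Count directed links a -> b with both endpoints inside the neighbourhood.
    links = sum(1 for a in nbrs for b in succ[a] if b in nbrs and b != a)
    return links / (k * (k - 1))


# Hypothetical example graph.
succ = {0: {1, 2}, 1: {2}, 2: {1}, 3: {0}}
print(local_clustering_directed(succ, 0))  # 2 links out of 3 * 2 possible
```

Node 0 has neighbors 1, 2 and 3; only the links 1→2 and 2→1 lie inside that neighborhood, so the coefficient is 2/6.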

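An undirected graph has the property that $e_{ij}$ and $e_{ji}$ are considered identical. Therefore, if a vertex $v_i$ has $k_i$ neighbors, $\frac{k_i(k_i - 1)}{2}$ edges could exist among the vertices within the neighborhood. Thus, the local clustering coefficient for undirected graphs can be defined as

$$C_i = \frac{2\,|\{e_{jk} : v_j, v_k \in N_i,\ e_{jk} \in E\}|}{k_i(k_i - 1)}.$$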
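The undirected definition translates almost directly into code. The sketch below assumes an adjacency dictionary mapping each node to the set of its neighbors; the name `local_clustering` and the example graph are illustrative, not library API.

```python
from itertools import combinations


def local_clustering(adj, v):
    """Local clustering coefficient of v in an undirected graph.

    adj is assumed to map each node to the set of its neighbours.
    """
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pair could be linked
    # Edges that actually exist between pairs of neighbours of v.
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


# Hypothetical example: a triangle 0-1-2 with an extra node 3 attached to 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
for node in adj:
    print(node, local_clustering(adj, node))
```

Returning 0 for vertices with fewer than two neighbors is a convention chosen here; it matches what `nx.clustering` reports for such nodes.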

Let $\lambda_G(v)$ be the number of triangles on $v \in V(G)$ for undirected graph $G$. That is, $\lambda_G(v)$ is the number of subgraphs of $G$ with 3 edges and 3 vertices, one of which is $v$. Let $\tau_G(v)$ be the number of triples on $v \in G$. That is, $\tau_G(v)$ is the number of subgraphs (not necessarily induced) with 2 edges and 3 vertices, one of which is $v$ and such that $v$ is incident to both edges. Then we can also define the clustering coefficient as

$$C_i = \frac{\lambda_G(v)}{\tau_G(v)}.$$

It is simple to show that the two preceding definitions are the same, since

$$\tau_G(v) = \binom{k_i}{2} = \frac{1}{2} k_i (k_i - 1).$$
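To spell out the equivalence, write $\lambda_G(v_i)$ for the number of triangles through $v_i$ and $\tau_G(v_i)$ for the number of triples centered on it. Each triangle through $v_i$ corresponds to exactly one edge between two of its neighbors, and each triple to one unordered pair of neighbors, so:

```latex
\lambda_G(v_i) = \left|\{e_{jk} : v_j, v_k \in N_i,\ e_{jk} \in E\}\right|,
\qquad
\tau_G(v_i) = \binom{k_i}{2} = \frac{k_i (k_i - 1)}{2}
\quad\Longrightarrow\quad
\frac{\lambda_G(v_i)}{\tau_G(v_i)}
  = \frac{2\,\left|\{e_{jk} : v_j, v_k \in N_i,\ e_{jk} \in E\}\right|}{k_i (k_i - 1)}
  = C_i .
```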

These measures are 1 if every neighbor connected to $v_i$ is also connected to every other vertex within the neighborhood, and 0 if no vertex that is connected to $v_i$ connects to any other vertex that is connected to $v_i$.

Example local clustering coefficient on an undirected graph. The local clustering coefficient of the green node is computed as the proportion of connections among its neighbours.

Below is code that estimates the average clustering coefficient of a graph by random sampling. It is part of the networkx library, where it is available as `networkx.algorithms.approximation.average_clustering`.

```python
import random


def average_clustering(G, trials=1000):
    """Estimates the average clustering coefficient of G.

    The local clustering of each node in `G` is the
    fraction of triangles that actually exist over
    all possible triangles in its neighborhood.
    The average clustering coefficient of a graph
    `G` is the mean of local clusterings.

    This function finds an approximate average
    clustering coefficient for G by repeating `n`
    times (defined in `trials`) the following
    experiment: choose a node at random, choose
    two of its neighbors at random, and check if
    they are connected. The approximate coefficient
    is the fraction of triangles found over the
    number of trials [1]_.

    Parameters
    ----------
    G : NetworkX graph

    trials : integer
        Number of trials to perform (default 1000).

    Returns
    -------
    c : float
        Approximated average clustering coefficient.

    """
    n = len(G)
    triangles = 0
    # Materialise the nodes as a list so they can be indexed by position.
    nodes = list(G)
    for i in [int(random.random() * n) for _ in range(trials)]:
        nbrs = list(G[nodes[i]])
        if len(nbrs) < 2:
            continue
        u, v = random.sample(nbrs, 2)
        if u in G[v]:
            triangles += 1
    return triangles / float(trials)
```


Note: The above code is valid for undirected networks only, not for directed networks.
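As a quick sanity check of the sampling idea, the estimate can be compared with the exact value on a complete graph, where any two neighbors of any node are always connected. This is a sketch that assumes networkx is installed; the graph size, the trial count and the seed are arbitrary choices.

```python
import networkx as nx
from networkx.algorithms import approximation

# In a complete graph every sampled pair of neighbours is linked,
# so each trial finds a triangle.
G = nx.complete_graph(5)

estimate = approximation.average_clustering(G, trials=200, seed=1)
exact = nx.average_clustering(G)
print(estimate, exact)
```

On sparser graphs the estimate fluctuates from run to run and only approaches the exact average as `trials` grows.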

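The code below was run in IDLE (the Python IDE that ships with the Windows installer). You need to install the networkx library before running it. The part inside the curly braces is the output of the local clustering coefficients. The session looks almost the same in IPython (for Ubuntu users). Because `erdos_renyi_graph` generates a random graph, the values you get will differ from run to run.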

```python
>>> import networkx as nx
>>> G = nx.erdos_renyi_graph(10, 0.4)
>>> cc = nx.average_clustering(G)
>>> cc  # Output: average clustering coefficient
0.08333333333333333
>>> c = nx.clustering(G)
>>> c  # Output: local clustering coefficient of each node
{0: 0.0, 1: 0.3333333333333333, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0,
 7: 0.3333333333333333, 8: 0.0, 9: 0.16666666666666666}
```


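The two values above are the average clustering coefficient of the network (the mean of the local values) and the local clustering coefficient of each node. Note that `nx.average_clustering` does not return the global, triplet-based clustering coefficient defined earlier; networkx exposes that quantity as `nx.transitivity`.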
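The difference between the averaged local coefficient and the triplet-based global coefficient shows up even on a very small graph. The sketch below assumes networkx is installed and uses a hypothetical four-node example: a triangle 0-1-2 with one extra node attached to node 0.

```python
import networkx as nx

# Hypothetical example graph: triangle 0-1-2 plus a pendant node 3 on node 0.
G = nx.Graph([(0, 1), (0, 2), (1, 2), (0, 3)])

local = nx.clustering(G)              # per-node local coefficients
average = nx.average_clustering(G)    # mean of the local coefficients
global_cc = nx.transitivity(G)        # closed triplets / all triplets

print(local)
print(average, global_cc)
```

Here the local values are 1/3, 1, 1 and 0, so their mean is 7/12, while the graph has 3 closed triplets out of 5 and a global coefficient of 3/5.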

Next in this series, we will talk about another centrality measure for any given network.


This article is contributed by **Jayant Bisht**.
