Python – Bigrams Frequency in String

• Last Updated : 13 Mar, 2023

When working with text data in Python, we sometimes need to extract bigrams (pairs of adjacent characters) from a string, a common task in NLP. Often we also need the frequency of each unique bigram. Let's discuss several ways to perform this task.

Method #1 : Using Counter() + generator expression

In this method, a generator expression builds each bigram by slicing two characters from the string at every index, and Counter() computes the frequency of the resulting bigrams.

Python3

```python
# Python3 code to demonstrate working of
# Bigrams Frequency in String
# Using Counter() + generator expression
from collections import Counter

# initializing string
test_str = 'geeksforgeeks'

# printing original string
print("The original string is : " + str(test_str))

# Bigrams Frequency in String
# Using Counter() + generator expression
res = Counter(test_str[idx : idx + 2] for idx in range(len(test_str) - 1))

# printing result
print("The Bigrams Frequency is : " + str(dict(res)))
```

Output :

```
The original string is : geeksforgeeks
The Bigrams Frequency is : {'ge': 2, 'ee': 2, 'ek': 2, 'ks': 2, 'sf': 1, 'fo': 1, 'or': 1, 'rg': 1}
```

Method #2 : Using Counter() + zip() + map() + join

This combination of functions also solves the problem. Here, zip() pairs each character with the one that follows it, map() with ''.join turns each pair into a two-character bigram, and Counter() computes the frequencies.

Python3

```python
# Python3 code to demonstrate working of
# Bigrams Frequency in String
# Using Counter() + zip() + map() + join
from collections import Counter

# initializing string
test_str = 'geeksforgeeks'

# printing original string
print("The original string is : " + str(test_str))

# Bigrams Frequency in String
# Using Counter() + zip() + map() + join
res = Counter(map(''.join, zip(test_str, test_str[1:])))

# printing result
print("The Bigrams Frequency is : " + str(dict(res)))
```

Output :

```
The original string is : geeksforgeeks
The Bigrams Frequency is : {'ge': 2, 'ee': 2, 'ek': 2, 'ks': 2, 'sf': 1, 'fo': 1, 'or': 1, 'rg': 1}
```

Time Complexity: O(n)
Auxiliary Space: O(n)
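To make the zip() + map() + join step easier to follow, here is a small sketch of its intermediate results. The string `sample` and the names `pairs` and `bigrams` are illustrative only and do not appear in the code above.

```python
# Illustrative walk-through on a short, made-up string.
sample = 'abca'

# zip() pairs each character with its successor and stops
# when the shorter input (sample[1:]) is exhausted.
pairs = list(zip(sample, sample[1:]))
print(pairs)    # [('a', 'b'), ('b', 'c'), ('c', 'a')]

# ''.join concatenates each pair into a two-character string.
bigrams = list(map(''.join, pairs))
print(bigrams)  # ['ab', 'bc', 'ca']
```

Passing such a sequence of bigrams to Counter() then gives the frequency of each one.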

Method #3 : Using a loop and a dictionary

This method uses a plain loop and a dictionary to keep track of the bigram frequencies.

• Initialize an empty dictionary to keep track of the bigram frequencies.
• Loop through the characters in the input string, starting from the second character.
• For each character, get the previous character and concatenate them to form a bigram.
• Check if the bigram is already in the dictionary.
• If the bigram is not in the dictionary, add it with a frequency of 1.
• If the bigram is already in the dictionary, increment its frequency by 1.
• Print the bigram frequencies.

Python3

```python
# Python3 code to demonstrate working of
# Bigrams Frequency in String
# Using a loop and dictionary

# initializing string
test_str = 'geeksforgeeks'

# printing original string
print("The original string is : " + str(test_str))

# Bigrams Frequency in String
# Using a loop and dictionary
freq_dict = {}
for i in range(1, len(test_str)):
    bigram = test_str[i-1:i+1]
    if bigram in freq_dict:
        freq_dict[bigram] += 1
    else:
        freq_dict[bigram] = 1

# printing result
print("The Bigrams Frequency is : " + str(freq_dict))
```

Output

```The original string is : geeksforgeeks
The Bigrams Frequency is : {'ge': 2, 'ee': 2, 'ek': 2, 'ks': 2, 'sf': 1, 'fo': 1, 'or': 1, 'rg': 1}```

Time complexity: O(n), where n is the length of the input string.
Auxiliary space: O(k), where k is the number of unique bigrams in the input string.
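As a quick sanity check, the three approaches can be wrapped in functions and compared. This is a minimal sketch: the function names `bigrams_slicing`, `bigrams_zip`, and `bigrams_loop` are hypothetical and introduced here only for illustration, and the loop version uses `dict.get()` as a compact alternative to the if/else check.

```python
from collections import Counter


def bigrams_slicing(text):
    """Count bigrams with Counter() and a generator expression over slices."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def bigrams_zip(text):
    """Count bigrams with Counter(), zip(), map() and join."""
    return Counter(map(''.join, zip(text, text[1:])))


def bigrams_loop(text):
    """Count bigrams with a plain loop and a dictionary."""
    freq = {}
    for i in range(1, len(text)):
        bigram = text[i - 1:i + 1]
        # dict.get() returns 0 for a bigram that has not been seen yet
        freq[bigram] = freq.get(bigram, 0) + 1
    return freq


test_str = 'geeksforgeeks'
by_slicing = dict(bigrams_slicing(test_str))
by_zip = dict(bigrams_zip(test_str))
by_loop = bigrams_loop(test_str)

# All three approaches should produce the same counts.
print(by_slicing == by_zip == by_loop)  # True
```

For strings shorter than two characters, all three functions return an empty result, since no bigram can be formed.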
