Python – Extract hashtags from text

A hashtag is a keyword or phrase preceded by the hash symbol (#), written within a post or comment to highlight it and make it easier to search for. Some examples are: #like, #gfg, #selfie.

Given a string containing hashtags, we have to extract these hashtags into a list and print them.



Examples:  

Input : GeeksforGeeks is a wonderful #website for #ComputerScience
Output : website, ComputerScience
Input : This day is beautiful! #instagood #photooftheday #cute
Output : instagood, photooftheday, cute



Method 1 : Using split() and indexing




# function to print all the hashtags in a text
def extract_hashtags(text):
 
    # initializing hashtag_list variable
    hashtag_list = []
 
    # splitting the text into words
    for word in text.split():
 
        # checking the first character of every word
        if word[0] == '#':
 
            # adding the word to the hashtag_list
            hashtag_list.append(word[1:])
 
    # printing the hashtag_list
    print("The hashtags in \"" + text + "\" are :")
    for hashtag in hashtag_list:
        print(hashtag)
 
 
if __name__ == "__main__":
    text1 = "GeeksforGeeks is a wonderful #website for #ComputerScience"
    text2 = "This day is beautiful ! #instagood #photooftheday #cute"
    extract_hashtags(text1)
    extract_hashtags(text2)

Output
The hashtags in "GeeksforGeeks is a wonderful #website for #ComputerScience" are :
website
ComputerScience
The hashtags in "This day is beautiful ! #instagood #photooftheday #cute" are :
instagood
photooftheday
cute

Time complexity: O(n), where n is the length of the text, since split() has to scan every character.

Auxiliary space: O(n), for the list of words created by split() and the list of hashtags.
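The function above prints its results, while the problem statement asks for the hashtags as a list. A minimal sketch of the same split-based idea that returns the list instead is shown here; the name `get_hashtags` is a hypothetical helper, not part of the original program, and skipping a lone "#" is an assumption about the desired behaviour.

```python
def get_hashtags(text):
    """Return the hashtags in text as a list, without the leading '#'."""
    # keep only whitespace-separated words that begin with '#';
    # len(word) > 1 skips a lone '#' that has no tag after it (assumed behaviour)
    return [word[1:] for word in text.split()
            if word.startswith("#") and len(word) > 1]


if __name__ == "__main__":
    text = "This day is beautiful! #instagood #photooftheday #cute"
    print(get_hashtags(text))  # ['instagood', 'photooftheday', 'cute']
```

Returning the list keeps the extraction separate from the printing, so the caller can decide how to display or store the hashtags.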

Method 2 : Using regular expressions.




# import the regex module
import re
 
# function to print all the hashtags in a text
 
 
def extract_hashtags(text):
 
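    # the regular expression: '#' followed by one or more word characters,
    # written as a raw string so that \w is not treated as a string escape
    regex = r"#(\w+)"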
 
    # extracting the hashtags
    hashtag_list = re.findall(regex, text)
 
    # printing the hashtag_list
    print("The hashtags in \"" + text + "\" are :")
    for hashtag in hashtag_list:
        print(hashtag)
 
 
if __name__ == "__main__":
    text1 = "GeeksforGeeks is a wonderful #website for #ComputerScience"
    text2 = "This day is beautiful ! #instagood #photooftheday #cute"
    extract_hashtags(text1)
    extract_hashtags(text2)

Output
The hashtags in "GeeksforGeeks is a wonderful #website for #ComputerScience" are :
website
ComputerScience
The hashtags in "This day is beautiful ! #instagood #photooftheday #cute" are :
instagood
photooftheday
cute
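The regular expression also behaves differently from the split-based approach of Method 1 when a hashtag touches punctuation. The sketch below compares the two on a made-up sentence; the names `HASHTAG_PATTERN`, `find_hashtags` and `split_hashtags` are hypothetical and used only for illustration.

```python
import re

# '#' followed by one or more word characters (letters, digits, underscore)
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def find_hashtags(text):
    """Regex-based extraction: returns the captured group for every match."""
    return HASHTAG_PATTERN.findall(text)


def split_hashtags(text):
    """Split-based extraction, as in Method 1, for comparison."""
    return [word[1:] for word in text.split() if word.startswith("#")]


if __name__ == "__main__":
    text = "Loved it!#sunny, (#fun) and #cute."
    print(find_hashtags(text))   # ['sunny', 'fun', 'cute']
    print(split_hashtags(text))  # ['cute.']
```

The regex version finds hashtags that are glued to other characters and drops trailing punctuation, but for the same reason it would also match the part after "#" in text such as a URL fragment, so which behaviour is preferable depends on the input.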

Method 3 : Using startswith() and replace()




# program to print all the hashtags in a text
 
text1 = "GeeksforGeeks is a wonderful #website for #ComputerScience"
textList = text1.split()
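for i in textList:
    # keep only the words that begin with '#'
    if i.startswith("#"):
        # remove just the leading '#'; a count of 1 leaves any later '#' intact
        x = i.replace("#", "", 1)
        print(x)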

Output
website
ComputerScience
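Keep in mind that str.replace() removes every occurrence of the substring unless a count is passed as its third argument. A small sketch of the difference follows; the word "#c#sharp" and the two helper names are made up for illustration.

```python
def strip_all_hashes(word):
    # no count: every '#' in the word is removed
    return word.replace("#", "")


def strip_leading_hash(word):
    # count of 1: only the first '#' is removed
    return word.replace("#", "", 1)


if __name__ == "__main__":
    word = "#c#sharp"
    print(strip_all_hashes(word))    # csharp
    print(strip_leading_hash(word))  # c#sharp
    print(word[1:])                  # c#sharp (slicing off the first character)
```

For words that contain a single "#" at the start, as in the examples of this article, all three variants give the same result.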

Method 4 : Using indexing and replace()




# program to print all the hashtags in a text
 
text1 = "GeeksforGeeks is a wonderful #website for #ComputerScience"
textList = text1.split()
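for i in textList:
    # check whether the first character of the word is '#'
    if i[0] == "#":
        # remove just the leading '#'; a count of 1 leaves any later '#' intact
        x = i.replace("#", "", 1)
        print(x)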

Output
website
ComputerScience

Method 5 : Using find() and replace() methods




# program to print all the hashtags in a text
 
text1 = "GeeksforGeeks is a wonderful #website for #ComputerScience"
textList = text1.split()
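for i in textList:
    # find() returns 0 when the word starts with '#'
    if i.find("#") == 0:
        # remove just the leading '#'; a count of 1 leaves any later '#' intact
        x = i.replace("#", "", 1)
        print(x)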

Output
website
ComputerScience
