Python – Extract hashtags from text

  • Difficulty Level : Easy
  • Last Updated : 16 Aug, 2022

A hashtag is a keyword or phrase preceded by the hash symbol (#), written within a post or comment to highlight it and make it easier to search for. Some examples are #like, #gfg and #selfie.

We are given a string containing hashtags. The task is to extract these hashtags into a list and print them.

Examples:  

Input : GeeksforGeeks is a wonderful #website for #ComputerScience
Output : website, ComputerScience
Input : This day is beautiful! #instagood #photooftheday #cute
Output : instagood, photooftheday, cute

Method 1 : Using split()

  • Split the text into words using the split() method.
  • For every word, check whether the first character is a hash symbol (#).
  • If it is, add the word to the list of hashtags without the hash symbol.
  • Print the list of hashtags.

Python3




# function to print all the hashtags in a text
def extract_hashtags(text):
     
    # initializing hashtag_list variable
    hashtag_list = []
     
    # splitting the text into words
    for word in text.split():
         
        # checking the first character of every word
        if word[0] == '#':
             
            # adding the word to the hashtag_list
            hashtag_list.append(word[1:])
     
    # printing the hashtag_list
    print("The hashtags in \"" + text + "\" are :")
    for hashtag in hashtag_list:
        print(hashtag)
 
if __name__=="__main__":
    text1 = "GeeksforGeeks is a wonderful #website for #ComputerScience"
    text2 = "This day is beautiful ! #instagood #photooftheday #cute"
    extract_hashtags(text1)
    extract_hashtags(text2)

Output

The hashtags in "GeeksforGeeks is a wonderful #website for #ComputerScience" are :
website
ComputerScience
The hashtags in "This day is beautiful ! #instagood #photooftheday #cute" are :
instagood
photooftheday
cute
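
The problem statement asks for the hashtags as a list, while the function above only prints them. As a minimal sketch, the same logic can return the list so that the caller decides what to do with it. The name get_hashtags is ours and is not part of the original code.

```python
def get_hashtags(text):
    """Return the hashtags found in text, without the leading '#'."""
    # split() with no arguments never yields empty strings, so
    # startswith() is safe on every word; the length check skips
    # a lone '#' that has no tag text after it
    return [
        word[1:]
        for word in text.split()
        if word.startswith("#") and len(word) > 1
    ]


print(get_hashtags("GeeksforGeeks is a wonderful #website for #ComputerScience"))
# ['website', 'ComputerScience']
```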

Method 2 : Using regular expressions.

Python3




# import the regex module
import re
 
# function to print all the hashtags in a text
def extract_hashtags(text):
     
    # the regular expression; the raw string (r"...") keeps \w from
    # being treated as a string escape sequence
    regex = r"#(\w+)"
     
    # extracting the hashtags
    hashtag_list = re.findall(regex, text)
     
    # printing the hashtag_list
    print("The hashtags in \"" + text + "\" are :")
    for hashtag in hashtag_list:
        print(hashtag)
 
if __name__=="__main__":
    text1 = "GeeksforGeeks is a wonderful #website for #ComputerScience"
    text2 = "This day is beautiful ! #instagood #photooftheday #cute"
    extract_hashtags(text1)
    extract_hashtags(text2)

Output

The hashtags in "GeeksforGeeks is a wonderful #website for #ComputerScience" are :
website
ComputerScience
The hashtags in "This day is beautiful ! #instagood #photooftheday #cute" are :
instagood
photooftheday
cute
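
The split-based and regex-based methods agree on the sample inputs, but they can differ on other text. As a rough illustration (the sample sentence and the helper names are made up for this comparison), splitting on whitespace keeps trailing punctuation and misses a hashtag attached to the previous characters, while the regular expression stops at the first non-word character:

```python
import re


def hashtags_by_split(text):
    # Method 1 logic, returning the list instead of printing it
    return [word[1:] for word in text.split() if word.startswith("#")]


def hashtags_by_regex(text):
    # Method 2 logic; \w+ matches letters, digits and underscores
    return re.findall(r"#(\w+)", text)


sample = "Loved it! #fun, #sun. see(#sea)"
print(hashtags_by_split(sample))  # ['fun,', 'sun.']
print(hashtags_by_regex(sample))  # ['fun', 'sun', 'sea']
```

Which behaviour is preferable depends on the data. The regex also matches a '#' in the middle of a word, which may or may not be wanted.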

Method 3 : Using startswith() and replace()

Python3




# program to print all the hashtags in a text
 
text1 = "GeeksforGeeks is a wonderful #website for #ComputerScience"
textList = text1.split()
for i in textList:
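    if i.startswith("#"):
        # the third argument (1) limits the replacement to the
        # leading hash symbol, so any later '#' in the word is kept
        x = i.replace("#", "", 1)
        print(x)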

Output

website
ComputerScience

Method 4 : Using indexing and replace()

Python3




# program to print all the hashtags in a text
 
text1 = "GeeksforGeeks is a wonderful #website for #ComputerScience"
textList = text1.split()
for i in textList:
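    if i[0] == "#":
        # the third argument (1) limits the replacement to the
        # leading hash symbol, so any later '#' in the word is kept
        x = i.replace("#", "", 1)
        print(x)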

Output

website
ComputerScience
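
One thing to keep in mind with replace(): when called with only two arguments, it removes every '#' in the word, not just the leading one. Here is a short sketch of the difference, using a made-up word that contains a second hash symbol:

```python
word = "#c#sharp"

all_removed = word.replace("#", "")       # every '#' removed
first_removed = word.replace("#", "", 1)  # only the first '#' removed
sliced = word[1:]                         # first character sliced off

print(all_removed)    # csharp
print(first_removed)  # c#sharp
print(sliced)         # c#sharp
```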
