How to validate HTML tag using Regular Expression
  • Last Updated : 16 Oct, 2020

Given a string str, the task is to check whether it is a valid HTML tag by using a regular expression.
For this problem, a string is treated as a valid HTML tag if it satisfies the following conditions:

  1. It should start with an opening angle bracket (<).
  2. After the opening bracket, it may contain any number of double-quoted strings, single-quoted strings, and other characters.
  3. It should not contain an unmatched double quote, an unmatched single quote, or a closing angle bracket (>) outside a quoted string.
  4. It should end with a closing angle bracket (>).
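
To make the conditions concrete before turning to regular expressions, here is a minimal sketch that checks them with a plain character scan. The function name is_valid_tag_by_scan is hypothetical and not part of the article's solution; it only illustrates what the rules require.

```python
def is_valid_tag_by_scan(text):
    """Check the tag conditions with an explicit scan (no regex).

    Hypothetical helper for illustration only.
    """
    if text is None or len(text) < 2:
        return False
    # The string must start with '<' and end with '>'.
    if text[0] != "<" or text[-1] != ">":
        return False

    quote = None  # the quote character that is currently open, if any
    for ch in text[1:-1]:
        if quote is not None:
            # Inside a quoted string: only the matching quote closes it.
            if ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
        elif ch == ">":
            # A '>' outside quotes before the final character is not allowed.
            return False
    # Every quote that was opened must also have been closed.
    return quote is None


for sample in ["<input value = '>'>", "<br/>", "br/>", "<'br/>", "<input value => >"]:
    print(sample, is_valid_tag_by_scan(sample))
```

The scan tracks a single piece of state, the currently open quote character, which is the same distinction the regular expression makes between quoted strings and other characters.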

Examples: 

Input: str = "<input value = '>'>"
Output: true
Explanation: The only > before the final one is inside a single-quoted string, so the string satisfies all the above conditions.

Input: str = "<br/>"
Output: true
Explanation: The given string satisfies all the above conditions.

Input: str = "br/>"
Output: false
Explanation: The given string doesn't start with an opening angle bracket (<). Therefore, it is not a valid HTML tag.

Input: str = "<'br/>"
Output: false
Explanation: The given string contains a single quote that is never closed, which is not allowed. Therefore, it is not a valid HTML tag.

Input: str = "<input value => >"
Output: false
Explanation: The given string contains a closing angle bracket (>) that is not enclosed in single or double quotes, which is not allowed. Therefore, it is not a valid HTML tag.

Approach: The idea is to use a regular expression to solve this problem. The following steps compute the answer.

  • Get the String.
  • Create a regular expression to check for a valid HTML tag, as shown below (written as a string literal, so the inner double quotes are escaped):

regex = "<(\"[^\"]*\"|'[^']*'|[^'\">])*>";

  • Where:
    • < matches the opening angle bracket (<) at the start of the string.
    • ( starts a group.
    • "[^"]*" matches a double-quoted string, which may contain any character except a double quote.
    • | means "or".
    • '[^']*' matches a single-quoted string, which may contain any character except a single quote.
    • | means "or".
    • [^'">] matches any single character other than a single quote, a double quote, or >.
    • ) ends the group.
    • * repeats the group zero or more times.
    • > matches the closing angle bracket (>) at the end of the string.
  • Match the entire string against the regular expression. In Java, this can be done by calling matches() on the Matcher returned by Pattern.matcher().
  • Return true if the whole string matches the regular expression; otherwise return false.
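
The breakdown above can also be written into the pattern itself. The sketch below uses Python's re.VERBOSE flag to annotate each part, and then shows why the whole string has to be matched. The name TAG_PATTERN and the non-capturing group (?: ... ) are choices made for this illustration; the article's pattern uses a plain capturing group, which accepts the same strings.

```python
import re

# Annotated version of the pattern (illustrative; the name TAG_PATTERN is an assumption).
TAG_PATTERN = re.compile(
    r"""
    <                # literal opening angle bracket
    (?:              # one unit inside the tag is either ...
        "[^"]*"      #   a double-quoted string
      | '[^']*'      #   a single-quoted string
      | [^'">]       #   any single character except ', " and >
    )*               # ... repeated zero or more times
    >                # literal closing angle bracket
    """,
    re.VERBOSE,
)

sample = "<input value => >"

# search() looks for a match anywhere in the string, so it finds the
# prefix "<input value =>" and would wrongly report the string as valid.
print(TAG_PATTERN.search(sample).group(0))

# fullmatch() requires the pattern to cover the entire string,
# which corresponds to Matcher.matches() in Java.
print(TAG_PATTERN.fullmatch(sample) is not None)
```

Here search() reports a match while fullmatch() does not, so a validator built on this pattern should use whole-string matching.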

Below is the implementation of the above approach: 
 

Java

// Java program to validate
// HTML tag using regex.
 
import java.util.regex.*;
 
class GFG {
 
    // Function to validate
    // HTML tag using regex.
    public static boolean
    isValidHTMLTag(String str)
    {
        // Regex to check valid HTML tag.
        String regex
            = "<(\"[^\"]*\"|'[^']*'|[^'\">])*>";
 
        // Compile the regex
        Pattern p = Pattern.compile(regex);
 
        // If the string is null,
        // return false
        if (str == null) {
            return false;
        }
 
        // Create a matcher for the given string
        // using Pattern.matcher()
        Matcher m = p.matcher(str);
 
        // Return true only if the entire
        // string matches the regex
        return m.matches();
    }
 
    // Driver Code.
    public static void main(String args[])
    {
 
        // Test Case 1:
        String str1 = "<input value = '>'>";
        System.out.println(isValidHTMLTag(str1));
 
        // Test Case 2:
        String str2 = "<br/>";
        System.out.println(isValidHTMLTag(str2));
 
        // Test Case 3:
        String str3 = "br/>";
        System.out.println(isValidHTMLTag(str3));
 
        // Test Case 4:
        String str4 = "<'br/>";
        System.out.println(isValidHTMLTag(str4));
 
        // Test Case 5:
        String str5 = "<input value => >";
        System.out.println(isValidHTMLTag(str5));
    }
}



Python3

# Python3 program to validate
# HTML tag using regex.
import re
 
# Function to validate
# HTML tag using regex.
def isValidHTMLTag(text):
 
    # Regex to check valid HTML tag.
    regex = "<(\"[^\"]*\"|'[^']*'|[^'\">])*>"
     
    # Compile the regex
    p = re.compile(regex)
 
    # If the string is None,
    # return False
    if text is None:
        return False
 
    # Return True only if the entire string
    # matches the regex. re.search() would also
    # accept a tag found anywhere inside the string.
    return p.fullmatch(text) is not None
 
# Driver code
 
# Test Case 1:
str1 = "<input value = '>'>"
print(isValidHTMLTag(str1))
 
# Test Case 2:
str2 = "<br/>"
print(isValidHTMLTag(str2))
 
# Test Case 3:
str3 = "br/>"
print(isValidHTMLTag(str3))
 
# Test Case 4:
str4 = "<'br/>"
print(isValidHTMLTag(str4))
 
# Test Case 5:
str5 = "<input value => >"
print(isValidHTMLTag(str5))
 
# This code is contributed by avanitrachhadiya2155



Output (from the Java program; Python prints its booleans as True and False):

true
true
false
false
false
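
Note that the pattern checks only the bracket and quote structure described in the conditions. It does not check tag names or attribute syntax. The short sketch below, with candidate strings chosen purely for illustration, shows a few inputs that the pattern accepts even though they are not meaningful HTML.

```python
import re

# Same pattern as in the article, written as a raw triple-quoted string.
TAG_RE = re.compile(r"""<("[^"]*"|'[^']*'|[^'">])*>""")

# Illustrative inputs: none has a stray '>' or an unclosed quote,
# so each one satisfies the four conditions.
candidates = ["<>", "<123>", "< = / >", "<a href=x y=>"]

for candidate in candidates:
    print(repr(candidate), TAG_RE.fullmatch(candidate) is not None)
```

Each of these strings is accepted, so a check of real tag names or attributes would have to be added on top of this pattern.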

 
