Open In App

How to Validate SQL Query With Regular Expression?

Last Updated : 14 Mar, 2022
Improve
Improve
Like Article
Like
Save
Share
Report

Generally, a row of data will consist of email address, phone numbers, alpha/alphanumeric/numeric values, etc., Usually, front end validations are there to validate an email address/phone number, etc., In this article, we will see how it can be validated by using Regular expressions while writing SQL queries. Because always it is good to pass relevant data in the backend. Whenever there is an error in the data, Regular expression quickly validates that and informs the user whether is it a success or failure.

Step 1: Create a database

Query:

SELECT * FROM sys.databases WHERE name = 'GEEKSFORGEEKS'
    BEGIN
    CREATE DATABASE [GEEKSFORGEEKS]
    END

Step 2: Create  “GeekAuthors” table under the “GEEKSFORGEEKS” database

Query:

use GEEKSFORGEEKS
CREATE TABLE GeekAuthors (
    ID INT IDENTITY NOT NULL PRIMARY KEY,
    AuthorName VARCHAR(255) NOT NULL,
    AuthorEmail VARCHAR(255) NOT NULL,
    Age INT,
    Skillsets VARCHAR(255),
    NumberOfPosts INT
);
GO

As AuthorEmail is a varchar column, a wrongly invalid email address got entered into the table “GeekAuthors”. Our task is to find out the rows that got an invalid email address.

Step 3: To achieve this, let us put sample data in each row

INSERT INTO GeekAuthors (AuthorName,AuthorEmail,Age,Skillsets,NumberOfPosts) 
VALUES ('Rachel','rachel@gmail.com',25,'Java,Python,.Net',5);
INSERT INTO GeekAuthors (AuthorName,AuthorEmail,Age,Skillsets,NumberOfPosts) 
VALUES ('Phoebe','phoebegmailcom',22,'Android,Python,.Net',15);
INSERT INTO GeekAuthors (AuthorName,AuthorEmail,Age,Skillsets,NumberOfPosts) 
VALUES ('Monica','monica@gmailcom',23,'IOS,GO,R',10);
INSERT INTO GeekAuthors (AuthorName,AuthorEmail,Age,Skillsets,NumberOfPosts) 
VALUES ('Joey','joey@.com',24,'Java,Python,GO',5);
INSERT INTO GeekAuthors (AuthorName,AuthorEmail,Age,Skillsets,NumberOfPosts) 
VALUES ('Chandler','chandler@gmail',23,'IOS,GO,R',10);
INSERT INTO GeekAuthors (AuthorName,AuthorEmail,Age,Skillsets,NumberOfPosts) 
VALUES ('Ross','ross@gmail.com',24,'Java,Python,GO',5);

Step 4: Now we validate the email address.

We need to validate whether the entered email address is correct or not. For that, we can easily achieve that

Query:

-- Get all email address
SELECT AuthorEmail FROM GeekAuthors
-- Get ValidEmail Address
SELECT AuthorEmail AS ValidEmail
FROM GeekAuthors
WHERE AuthorEmail LIKE '%_@__%.__%'
        AND PATINDEX('%[^a-z,0-9,@,.,_,\-]%', AuthorEmail) = 0
GO
-- use NOT condition in the WHERE clause and select all the invalid emails as well.
SELECT AuthorEmail AS NotValidEmail
FROM GeekAuthors
WHERE NOT AuthorEmail LIKE '%_@__%.__%'
        AND PATINDEX('%[^a-z,0-9,@,.,_,\-]%', AuthorEmail) = 0
GO

Output:

 

Step 5: For locating numeric value from the given value using Regular expression. It can be found out by using two ways

Our input expression may consist of alphanumeric values. In order to find out the position of the numeric value in the data, we can use the below format.

Query:

--pattern to check is: Not of  A-Z (both A-Z and a-z(search pattern))
-- in the input string and 
--finding the position of the non-matching pattern
-- As we are checking numeric values, it checks for numeric value position 
--and it is displaying position of the character
Way 1 :
SELECT 'GeekPremierLeague2022' as ContestName,
PATINDEX('%[^A-Za-z]%', 'GeekPremierLeague2022') as
NumericCharacterPosition;
Way 2 :
SELECT 'GeekPremierLeague2022' as ContestName, 
PATINDEX('%[0-9]%', 'GeekPremierLeague2022') as
NumericCharacterPosition;

Output:

In case numeric is not available, we will get 0 as the answer

Output:

Step 6: In a few scenarios, we may need to extract only alphabets (a-z) from the input string. 

It can be possible by combining 2 functions namely PATINDEX and STUFF. 

  • A regular expression has to be applied on PATINDEX. Using the above query we will get the numeric position
  • By using the STUFF function, we can replace all the numeric values.

Query:

-- At the 18th position, one character is replaced. 
We need to remove numeric value. 
-- Hence STUFF function is used
1st Param -- Input string
 2nd Param -- Start location . As numeric is present
  at 18th location, here it is given
 3rd Param -- Number of characters to be replaced 
 4th Param - Replacing value
SELECT STUFF('GeekPremierLeague2022', 18, 1, '' ); 
-- One character is replaced at the index 18
-- If we want to remove all the numeric above
SELECT STUFF('GeekPremierLeague2022', 18, 4, '' ); 
-- Four characters are replaced starting from  the index 18

Output:

Hence by applying both PATINDEX and STUFF, we can remove the numeric value in the given string:

Query:

-- Below code will produce only alpha text only
-- Scenario: A column is created in a table to accept
   only alphabets. As there is no direct way available, 
-- we can use this approach and extract the alphabets alone
DECLARE @inputData NVARCHAR(MAX) = 'GEEKS PREMIER LEAGUE 2022'       
--input string
DECLARE @intPosition INT    
 --get the position of the integer from the input string   
SET @intPosition = PATINDEX('%[^A-Za-z]%', @inputData) 
-- We can use PATINDEX('%[0-9]%') also
print @intPosition
--run loop until no integer is found in the input string
WHILE @intPosition > 0                                     
  BEGIN  
     --remove the integer from that position
    SET @inputData = STUFF(@inputData, @intPosition, 1, '' )   
    --PRINT @inputData 
    SET @intPosition = PATINDEX('%[^A-Za-z]%', @inputData )
    --Again get the position of the next integer in the input string
    --PRINT @intPosition 
END  
SELECT 'GEEKS PREMIER LEAGUE 2022' as InputString, @inputData AS AlphabetsOnly

Output:

Step 7: To get the numeric values alone suppose when the data feed is occurring that consists of employee name, phone number, address, and email id, we can use the below approach to retrieve the phone number (i.e. numeric value) in the below way. 

Query:

DECLARE @inputString  VARCHAR(MAX)
DECLARE @validchars VARCHAR(MAX)

SET @inputString  = 'GeekAuthor1 123456789 Address1 geek@gmail.com'
--We are going to take continuous set of numbers
 and it should not have any spaces also in between
SET @validchars = '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
DECLARE @idx INT
SET @idx = PATINDEX('%'+ @validchars +'%',@inputString )
IF @idx > 0 AND 
    (@idx = LEN(@inputString )-8
    OR PATINDEX(SUBSTRING(@inputString ,@idx+9,1),'[0-9]')=0)
    SET @inputString =SUBSTRING(@inputString,
    PATINDEX('%'+ @validchars +'%',@inputString ), 9)
ELSE SET @inputString  = ''
SELECT @inputString as NumericAlone

Hence Regular expressions are used in many places for validating an email, address, validating alpha alone, or validating numeric alone.


Like Article
Suggest improvement
Share your thoughts in the comments

Similar Reads