Generally, a row of data will consist of email address, phone numbers, alpha/alphanumeric/numeric values, etc., Usually, front end validations are there to validate an email address/phone number, etc., In this article, we will see how it can be validated by using Regular expressions while writing SQL queries. Because always it is good to pass relevant data in the backend. Whenever there is an error in the data, Regular expression quickly validates that and informs the user whether is it a success or failure.
Step 1: Create a database
Query:
SELECT * FROM sys.databases WHERE name = 'GEEKSFORGEEKS' BEGIN CREATE DATABASE [GEEKSFORGEEKS] END
Step 2: Create “GeekAuthors” table under the “GEEKSFORGEEKS” database
Query:
use GEEKSFORGEEKS CREATE TABLE GeekAuthors ( ID INT IDENTITY NOT NULL PRIMARY KEY, AuthorName VARCHAR(255) NOT NULL, AuthorEmail VARCHAR(255) NOT NULL, Age INT, Skillsets VARCHAR(255), NumberOfPosts INT ); GO
As AuthorEmail is a varchar column, a wrongly invalid email address got entered into the table “GeekAuthors”. Our task is to find out the rows that got an invalid email address.
Step 3: To achieve this, let us put sample data in each row
INSERT INTO GeekAuthors (AuthorName,AuthorEmail,Age,Skillsets,NumberOfPosts) VALUES ('Rachel','rachel@gmail.com',25,'Java,Python,.Net',5); INSERT INTO GeekAuthors (AuthorName,AuthorEmail,Age,Skillsets,NumberOfPosts) VALUES ('Phoebe','phoebegmailcom',22,'Android,Python,.Net',15); INSERT INTO GeekAuthors (AuthorName,AuthorEmail,Age,Skillsets,NumberOfPosts) VALUES ('Monica','monica@gmailcom',23,'IOS,GO,R',10); INSERT INTO GeekAuthors (AuthorName,AuthorEmail,Age,Skillsets,NumberOfPosts) VALUES ('Joey','joey@.com',24,'Java,Python,GO',5); INSERT INTO GeekAuthors (AuthorName,AuthorEmail,Age,Skillsets,NumberOfPosts) VALUES ('Chandler','chandler@gmail',23,'IOS,GO,R',10); INSERT INTO GeekAuthors (AuthorName,AuthorEmail,Age,Skillsets,NumberOfPosts) VALUES ('Ross','ross@gmail.com',24,'Java,Python,GO',5);
Step 4: Now we validate the email address.
We need to validate whether the entered email address is correct or not. For that, we can easily achieve that
Query:
-- Get all email address SELECT AuthorEmail FROM GeekAuthors -- Get ValidEmail Address SELECT AuthorEmail AS ValidEmail FROM GeekAuthors WHERE AuthorEmail LIKE '%_@__%.__%' AND PATINDEX('%[^a-z,0-9,@,.,_,\-]%', AuthorEmail) = 0 GO -- use NOT condition in the WHERE clause and select all the invalid emails as well. SELECT AuthorEmail AS NotValidEmail FROM GeekAuthors WHERE NOT AuthorEmail LIKE '%_@__%.__%' AND PATINDEX('%[^a-z,0-9,@,.,_,\-]%', AuthorEmail) = 0 GO
Output:
Step 5: For locating numeric value from the given value using Regular expression. It can be found out by using two ways
Our input expression may consist of alphanumeric values. In order to find out the position of the numeric value in the data, we can use the below format.
Query:
--pattern to check is: Not of A-Z (both A-Z and a-z(search pattern)) -- in the input string and --finding the position of the non-matching pattern -- As we are checking numeric values, it checks for numeric value position --and it is displaying position of the character Way 1 : SELECT 'GeekPremierLeague2022' as ContestName, PATINDEX('%[^A-Za-z]%', 'GeekPremierLeague2022') as NumericCharacterPosition; Way 2 : SELECT 'GeekPremierLeague2022' as ContestName, PATINDEX('%[0-9]%', 'GeekPremierLeague2022') as NumericCharacterPosition;
Output:
In case numeric is not available, we will get 0 as the answer
Output:
Step 6: In a few scenarios, we may need to extract only alphabets (a-z) from the input string.
It can be possible by combining 2 functions namely PATINDEX and STUFF.
- A regular expression has to be applied on PATINDEX. Using the above query we will get the numeric position
- By using the STUFF function, we can replace all the numeric values.
Query:
-- At the 18th position, one character is replaced. We need to remove numeric value. -- Hence STUFF function is used 1st Param -- Input string 2nd Param -- Start location . As numeric is present at 18th location, here it is given 3rd Param -- Number of characters to be replaced 4th Param - Replacing value SELECT STUFF('GeekPremierLeague2022', 18, 1, '' ); -- One character is replaced at the index 18 -- If we want to remove all the numeric above SELECT STUFF('GeekPremierLeague2022', 18, 4, '' ); -- Four characters are replaced starting from the index 18
Output:
Hence by applying both PATINDEX and STUFF, we can remove the numeric value in the given string:
Query:
-- Below code will produce only alpha text only -- Scenario: A column is created in a table to accept only alphabets. As there is no direct way available, -- we can use this approach and extract the alphabets alone DECLARE @inputData NVARCHAR(MAX) = 'GEEKS PREMIER LEAGUE 2022' --input string DECLARE @intPosition INT --get the position of the integer from the input string SET @intPosition = PATINDEX('%[^A-Za-z]%', @inputData) -- We can use PATINDEX('%[0-9]%') also print @intPosition --run loop until no integer is found in the input string WHILE @intPosition > 0 BEGIN --remove the integer from that position SET @inputData = STUFF(@inputData, @intPosition, 1, '' ) --PRINT @inputData SET @intPosition = PATINDEX('%[^A-Za-z]%', @inputData ) --Again get the position of the next integer in the input string --PRINT @intPosition END SELECT 'GEEKS PREMIER LEAGUE 2022' as InputString, @inputData AS AlphabetsOnly
Output:
Step 7: To get the numeric values alone suppose when the data feed is occurring that consists of employee name, phone number, address, and email id, we can use the below approach to retrieve the phone number (i.e. numeric value) in the below way.
Query:
DECLARE @inputString VARCHAR(MAX) DECLARE @validchars VARCHAR(MAX) SET @inputString = 'GeekAuthor1 123456789 Address1 geek@gmail.com' --We are going to take continuous set of numbers and it should not have any spaces also in between SET @validchars = '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]' DECLARE @idx INT SET @idx = PATINDEX('%'+ @validchars +'%',@inputString ) IF @idx > 0 AND (@idx = LEN(@inputString )-8 OR PATINDEX(SUBSTRING(@inputString ,@idx+9,1),'[0-9]')=0) SET @inputString =SUBSTRING(@inputString, PATINDEX('%'+ @validchars +'%',@inputString ), 9) ELSE SET @inputString = '' SELECT @inputString as NumericAlone
Hence Regular expressions are used in many places for validating an email, address, validating alpha alone, or validating numeric alone.