Regular Expression to Extract SQL Query
Regular Expressions are the easier mechanism to search the data that matches the complex criteria. For example, from an alphanumeric value, extract only the alpha value or numeric value or check for the specific patterns of character matching and retrieve the records, etc.
Let us see them one by one by taking some sample scenarios;
Step 1: Creating Database
Query:
SELECT * FROM sys.databases WHERE name = 'GEEKSFORGEEKS' BEGIN CREATE DATABASE [GEEKSFORGEEKS] END
Step 2: Using the database
Query:
USE GEEKSFORGEEKS
Step 3: CREATING TABLE Country under GEEKSFORGEEKS and insert few records
Query:
INSERT INTO Country(CountryID,CountryName) VALUES (1,'United States'); INSERT INTO Country(CountryID,CountryName) VALUES (2,'United States'); INSERT INTO Country(CountryID,CountryName) VALUES (3,'United Kingdom'); INSERT INTO Country(CountryID,CountryName) VALUES (4,'Canada'); INSERT INTO Country(CountryID,CountryName) VALUES (5,'United Kingdom'); INSERT INTO Country(CountryID,CountryName) VALUES (6,'Canada'); INSERT INTO Country(CountryID,CountryName) VALUES (7,'United States'); INSERT INTO Country(CountryID,CountryName) VALUES (8,'Australia'); INSERT INTO Country(CountryID,CountryName) VALUES (9,'Canada'); INSERT INTO Country(CountryID,CountryName) VALUES (10,'United States'); SELECT * FROM Country
Output :
Example 1:
To get the records starting between A – D and the second letter between U to Z and the rest of the letters can be anything.
Query:
--Find Country Names having: --First character should be A and D alphabets. --The second character should be from U and Z alphabet SELECT * FROM Country WHERE CountryName like '[A-D][U-Z]%' --regular expression is used here
Output :
If we need to specifically check for the first character alone and the rest of the characters can be anything, then
Query:
--Find Country Names having: --First character should be A and D alphabets. --Rest letters can be any character SELECT * FROM Country WHERE CountryName like '[A-D]%' --regular expression
Output :
By seeing the above two outputs, we can understand that just by giving different regular expressions, we are getting different outputs.
Suppose if we want to find country names starting with ‘U’ alone then the query will be as follows:
Query:
--Find country names starting with 'U' alone SELECT * FROM Country WHERE CountryName like 'U%' --regular expression
Output :
Suppose if we want to find country names starting with ‘U’ and additional information then the query will be as follows:
Query:
--Find country names starting with -- U and additional condition is given SELECT * FROM Country WHERE CountryName like 'U% [S]%' --regular expression
Output :

‘
While using with Like operator, we need to understand the below factors also
Wildcard character | Description |
% | A string of 0 or more characters will be retrieved |
[ ] | Within the specified range, any single character alone will be retrieved |
[^] | Within the specified range, none of the characters will be retrieved |
We can use the regular expression in other functions also.
Example 2:
Let us check out PATINDEX function first. It is a function that accepts the search pattern and input string and returns the starting position of the character not matching the pattern.
--pattern to check is A-Z or a-z(search pattern) -- in the input string and --position of the non-matching pattern -- It checks for numeric value position --and it is displaying position of the character SELECT 'GFGVersion1' as InputString, PATINDEX('%[^A-Za-z]%', 'GFGVersion1') as NumericCharacterPosition;
Output :
To get numbers only from an input string, we can use in below way also
SELECT 'GFGVersion1' as InputString, PATINDEX('%[0-9]%', 'GFGVersion1') as NumericCharacterPosition;
i.e. instead of using [^A-Za-z], used [0-9] and getting the same result as above
-- 0 will indicate no numeric value present SELECT 'VERSION' as InputString, PATINDEX('%[^A-Za-z]%', 'VERSION') as NumericPosition;
Output :

If there are no numeric, 0 will be displayed for NumericPosition
We can use this functionality of using a regular expression with functions like PATINDEX, we can solve to get only the characters alone from input string/numbers alone from input string etc.,
For that, let us see STUFF function also
STUFF function
--remove the integer from -- position 3 in the input string. /* As only one character need to be removed, we need to send params like this 1st Param -- Input string 2nd Param -- Start location 3rd Param -- Number of characters to be replaced 4th Param - Replacing value */ SELECT STUFF('GE098EKS9VER1', 3, 1, '' );
Output :
Example 3 :
By having PATINDEX and STUFF functions, we can get only the character values from an input string.
- We need to use a regular expression to get applied on PATINDEX
- Find out the numeric position and remove the numbers by using STUFF function
- Step 2 has to be repeated until there is no numeric value
Query:
DECLARE @inputData NVARCHAR(MAX) = 'GE098EKS9VER1' --input string DECLARE @intPosition INT --get the position of the integer from the input string SET @intPosition = PATINDEX('%[^A-Za-z]%', @inputData) print @intPosition --run loop until no integer is found in the input string WHILE @intPosition > 0 BEGIN --remove the integer from that position SET @inputData = STUFF(@inputData, @intPosition, 1, '' ) --PRINT @inputData SET @intPosition = PATINDEX('%[^A-Za-z]%', @inputData ) --Again get the position of the next integer in the input string --PRINT @intPosition END SELECT 'GE098EKS9VER1' as InputString, @inputData AS AlphabetsOnly
Output :
Similarly, we can use regular expression as the best search pattern practice.
Throughout SQL, we can use a regular expression to extract different outputs to our needs. It can be used with other functions also and it will help to get only alphabets from an input string/numerals from an input string.
Please Login to comment...