
How to Split a Delimited String to Access Individual Items in SQL?

In SQL, dealing with delimited strings is a common task, especially when the data does not arrive in a traditional tabular format. Whether the values are separated by commas or by some other delimiter, splitting the string into individual items is a prerequisite for many data manipulation tasks.

This article shows how to take such a string and break it into pieces that are easy to work with. The examples use MySQL string functions and recursive CTEs, and the last section covers SQL Server's STRING_SPLIT function.



Splitting Delimited Strings in SQL

A delimited string is a single string containing multiple values separated by a specific character or sequence of characters. Common delimiters include commas (‘,’), semicolons (‘;’), tabs (‘\t’), or any custom character. For example, imagine having a list of fruits like “apple,banana,cherry” and wanting to look at each fruit one by one.

This is where splitting that long string into smaller pieces, or individual items, becomes useful. Here’s how you can accomplish this using three methods:



1. String Functions

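SUBSTRING and LOCATE: This method uses SQL string manipulation functions to extract individual items from a delimited string. LOCATE finds the position of each delimiter, and SUBSTRING cuts out the text between two positions. The approach is versatile, but every additional item needs another nested LOCATE call, so the expressions become hard to read and maintain when a string holds many items or a varying number of them. LOCATE is a MySQL function; SQL Server offers CHARINDEX for the same purpose.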

Example: Extracting Color from ProductDescription in SQL

-- Create the Product table
CREATE TABLE Product (
    ProductDescription VARCHAR(100)
);

-- Insert some sample data
INSERT INTO Product (ProductDescription) VALUES
    ('Shirt,Blue,Large'),
    ('Pants,Black,Medium'),
    ('Dress,Red,Small');

-- Query to extract the color from the ProductDescription column
SELECT
    ProductDescription,
    SUBSTRING(
        ProductDescription,
        -- Start one character after the first comma
        LOCATE(',', ProductDescription) + 1,
        -- Length: distance between the first and second commas, minus the comma itself
        LOCATE(',', ProductDescription, LOCATE(',', ProductDescription) + 1) - LOCATE(',', ProductDescription) - 1
    ) AS Color
FROM
    Product;

Output (using the LOCATE function):

ProductDescription     Color
Shirt,Blue,Large       Blue
Pants,Black,Medium     Black
Dress,Red,Small        Red
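LOCATE is specific to MySQL. As a minimal sketch, assuming a SQL Server environment and the same Product table, the query above could be written with CHARINDEX, which takes the search string first and an optional start position last. It also assumes every row contains at least two commas; otherwise the computed length would be negative and SUBSTRING would raise an error.

```sql
-- Sketch for SQL Server (T-SQL), assuming the Product table defined above
SELECT
    ProductDescription,
    SUBSTRING(
        ProductDescription,
        -- Start one character after the first comma
        CHARINDEX(',', ProductDescription) + 1,
        -- Length: distance between the first and second commas, minus the comma itself
        CHARINDEX(',', ProductDescription, CHARINDEX(',', ProductDescription) + 1)
            - CHARINDEX(',', ProductDescription) - 1
    ) AS Color
FROM
    Product;
```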

LEFT and RIGHT: This approach also relies on delimiter positions, but it uses the LEFT and RIGHT functions to extract the beginning and end of the string, respectively. It is most useful when you need the part before the first delimiter or the part after the last delimiter; items in the middle still require SUBSTRING.

Example 2: Extracting Color, Product, and Size from ProductDescription in SQL

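-- Query to extract the product, color, and size from the ProductDescription column
-- using LEFT, RIGHT, SUBSTRING, and LOCATE
SELECT
    ProductDescription,
    -- Middle item: the text between the first and second commas
    SUBSTRING(
        ProductDescription,
        LOCATE(',', ProductDescription) + 1,
        LOCATE(',', ProductDescription, LOCATE(',', ProductDescription) + 1) - LOCATE(',', ProductDescription) - 1
    ) AS Color,
    -- First item: everything before the first comma
    LEFT(ProductDescription, LOCATE(',', ProductDescription) - 1) AS Product,
    -- Last item: everything after the second comma
    -- (CHAR_LENGTH counts characters, whereas LENGTH counts bytes in MySQL)
    RIGHT(ProductDescription, CHAR_LENGTH(ProductDescription) - LOCATE(',', ProductDescription, LOCATE(',', ProductDescription) + 1)) AS Size
FROM
    Product;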

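Output (using the LEFT and RIGHT functions):

ProductDescription     Color   Product   Size
Shirt,Blue,Large       Blue    Shirt     Large
Pants,Black,Medium     Black   Pants     Medium
Dress,Red,Small        Red     Dress     Small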
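In MySQL, the same three parts can usually be obtained with less position arithmetic by using SUBSTRING_INDEX, which returns everything before the Nth delimiter (or, with a negative count, everything after the Nth delimiter from the right). The following is a sketch that assumes the Product table from the examples above and exactly three items per row.

```sql
-- Sketch for MySQL, reusing the Product table from the examples above
SELECT
    ProductDescription,
    -- Everything before the first comma
    SUBSTRING_INDEX(ProductDescription, ',', 1) AS Product,
    -- Take the first two items, then keep only the last of those two
    SUBSTRING_INDEX(SUBSTRING_INDEX(ProductDescription, ',', 2), ',', -1) AS Color,
    -- Everything after the last comma
    SUBSTRING_INDEX(ProductDescription, ',', -1) AS Size
FROM
    Product;
```

Nesting two SUBSTRING_INDEX calls in this way isolates the item at any position, which is the pattern the later sections build on.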

2. Recursive CTE (Common Table Expression)

A recursive CTE is a more general solution: it extracts one item per step and refers back to itself until every item in the string has been produced, so it works for any number of items. It is not supported in every SQL environment (MySQL, for example, requires version 8.0 or later), and it can be resource-intensive on large datasets because it produces one row per item for every source row.

Example: Extracting Color from ProductDescription using Recursive CTE in SQL

-- Create the ProductDetails table
CREATE TABLE ProductDetails (
    ProductID INT AUTO_INCREMENT PRIMARY KEY,
    ProductDescription VARCHAR(100)
);

-- Insert some sample data
INSERT INTO ProductDetails (ProductDescription) VALUES
    ('Shirt,Blue,Large'),
    ('Pants,Black,Medium'),
    ('Dress,Red,Small');

-- Query to extract the color from the ProductDescription column using Recursive CTE
WITH RECURSIVE ProductCTE AS (
    -- Anchor member: the first item of every description
    SELECT
        ProductID,
        ProductDescription,
        1 AS ItemIndex,
        -- CAST fixes the column width so later, longer items are not truncated
        CAST(SUBSTRING_INDEX(ProductDescription, ',', 1) AS CHAR(100)) AS Item
    FROM
        ProductDetails
    UNION ALL
    -- Recursive member: the next item, until the last item has been produced
    SELECT
        ProductID,
        ProductDescription,
        ItemIndex + 1,
        SUBSTRING_INDEX(SUBSTRING_INDEX(ProductDescription, ',', ItemIndex + 1), ',', -1)
    FROM
        ProductCTE
    WHERE
        -- Item count = number of commas + 1
        ItemIndex < CHAR_LENGTH(ProductDescription) - CHAR_LENGTH(REPLACE(ProductDescription, ',', '')) + 1
)
-- The CTE holds one row per item; the color is the second item
SELECT
    ProductDescription,
    Item AS Color
FROM
    ProductCTE
WHERE
    ItemIndex = 2
ORDER BY
    ProductID;

Output (using the recursive CTE):

ProductDescription     Color
Shirt,Blue,Large       Blue
Pants,Black,Medium     Black
Dress,Red,Small        Red
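A recursive CTE can also be used only to generate item positions, which keeps the splitting expression in one place. The sketch below assumes MySQL 8.0 or later and uses a hypothetical user variable, @fruits, holding the fruit list from the introduction. The item count is derived by comparing the string's length with and without its commas: the difference is the number of commas, and adding 1 gives the number of items.

```sql
-- Hypothetical input: the fruit list from the introduction
SET @fruits = 'apple,banana,cherry';

WITH RECURSIVE Positions AS (
    -- Anchor member: start at the first item
    SELECT 1 AS n
    UNION ALL
    -- Recursive member: advance until n equals the item count (commas + 1)
    SELECT n + 1
    FROM Positions
    WHERE n < CHAR_LENGTH(@fruits) - CHAR_LENGTH(REPLACE(@fruits, ',', '')) + 1
)
SELECT
    n AS ItemIndex,
    -- Keep the first n items, then take the last of them
    SUBSTRING_INDEX(SUBSTRING_INDEX(@fruits, ',', n), ',', -1) AS Item
FROM Positions
ORDER BY n;
```

For this input the query returns three rows: apple, banana, and cherry, at positions 1 to 3. MySQL limits recursion depth through the cte_max_recursion_depth system variable (1000 by default), so very long lists may need that limit raised.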

3. STRING_SPLIT Function

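SQL Server 2016 introduced the STRING_SPLIT function, which splits a delimited string directly into a table of values. It takes the input string and a single-character delimiter as parameters and returns one row per item in a column named value. This method is efficient and straightforward, ideal for modern SQL Server environments. MySQL has no STRING_SPLIT function, so the MySQL example in this section emulates it with SUBSTRING_INDEX and a small list of numbers.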

Example: Splitting Delimited Strings into Individual Values in MySQL

-- Create a sample table with a column containing delimited strings
CREATE TABLE SampleData (
    ID INT,
    Data VARCHAR(100)
);

-- Insert some sample data
INSERT INTO SampleData (ID, Data)
VALUES
    (1, 'apple,banana,orange'),
    (2, 'carrot,potato,tomato');

-- Use SUBSTRING_INDEX and a list of item positions to split the delimited strings
SELECT
    ID,
    SUBSTRING_INDEX(SUBSTRING_INDEX(Data, ',', numbers.n), ',', -1) AS SplitData
FROM SampleData
JOIN (
    SELECT 1 AS n UNION ALL
    SELECT 2 UNION ALL
    SELECT 3 -- Add more if needed based on maximum elements in the list
) AS numbers
    -- Keep position n only if the string has at least n items (commas + 1)
    ON CHAR_LENGTH(Data) - CHAR_LENGTH(REPLACE(Data, ',', '')) >= numbers.n - 1
-- Order by position so items keep their original order within each string
ORDER BY ID, numbers.n;

Output (emulating STRING_SPLIT with SUBSTRING_INDEX):

ID   SplitData
1    apple
1    banana
1    orange
2    carrot
2    potato
2    tomato
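For comparison, here is a minimal sketch of STRING_SPLIT itself. It assumes SQL Server 2016 or later (database compatibility level 130 or higher) and the SampleData table defined above; the aliases s and f are chosen only for illustration.

```sql
-- Sketch for SQL Server (T-SQL), assuming the SampleData table defined above
SELECT
    s.ID,
    f.value AS SplitData   -- STRING_SPLIT exposes each item in a column named value
FROM SampleData AS s
-- CROSS APPLY runs STRING_SPLIT once for every row of SampleData
CROSS APPLY STRING_SPLIT(s.Data, ',') AS f;
```

STRING_SPLIT does not guarantee that rows come back in their original order. SQL Server 2022 adds an optional third argument, enable_ordinal, which returns an ordinal column that can be used for sorting.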

Conclusion

Splitting delimited strings in SQL is a fundamental task in data manipulation and analysis. Knowing several methods, from position-based string functions to recursive CTEs and built-in functions like STRING_SPLIT, lets SQL developers access individual items within delimited strings in whichever environment they work. SQL Server 2016 and later offer a dedicated function for this purpose, while MySQL and legacy systems require alternative approaches such as custom functions or string manipulation techniques.

