How to Split a Delimited String to Access Individual Items in SQL?

Last Updated : 26 Mar, 2024

In SQL, dealing with delimited strings is a common task, especially when the data does not arrive in a traditional tabular format. Whether the values are separated by commas or by some other delimiter, splitting these strings into individual items is a prerequisite for many data manipulation tasks.

This article shows how to take such a string and break it into pieces that are easy to work with.

Splitting Delimited Strings in SQL

A delimited string is a single string containing multiple values separated by a specific character or sequence of characters. Common delimiters include commas (‘,’), semicolons (‘;’), tabs (‘\t’), or any custom character. For example, imagine having a list of fruits like “apple,banana,cherry” and wanting to look at each fruit one by one.

This is where splitting that long string into smaller pieces, or individual items, becomes super handy. Here’s how you can accomplish this using three methods:

  • String Functions
  • Recursive CTE (Common Table Expression)
  • STRING_SPLIT Function

1. String Functions

SUBSTRING and LOCATE: This method uses string manipulation functions to find the delimiter positions and cut out the item between them. The examples use MySQL syntax. The approach is versatile, but every additional delimiter needs another nested LOCATE call, so the expressions become hard to read and maintain when an item sits deep in a long string.

Example: Extracting Color from ProductDescription in SQL

-- Create the Product table
CREATE TABLE Product (
ProductDescription VARCHAR(100)
);

-- Insert some sample data
INSERT INTO Product (ProductDescription) VALUES
('Shirt,Blue,Large'),
('Pants,Black,Medium'),
('Dress,Red,Small');

-- Query to extract the color from the ProductDescription column
SELECT
ProductDescription,
SUBSTRING(
ProductDescription,
LOCATE(',', ProductDescription) + 1,
LOCATE(',', ProductDescription, LOCATE(',', ProductDescription) + 1) - LOCATE(',', ProductDescription) - 1
) AS Color
FROM
Product;

Output:

Using “LOCATE” Function

  • LOCATE(',', ProductDescription) finds the position of the first comma in the ProductDescription string.
  • LOCATE(',', ProductDescription, LOCATE(',', ProductDescription) + 1) finds the position of the second comma in the ProductDescription string, starting the search after the position of the first comma.
  • SUBSTRING(ProductDescription, start_position, length) extracts a substring from ProductDescription. The start_position is the position of the first comma plus 1, and the length is calculated as the difference between the positions of the second and first commas minus 1.
  • The AS Color alias gives a meaningful name to the extracted substring, representing the color of the product.
  • The SELECT statement retrieves the ProductDescription column along with the extracted color.
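
To make the position arithmetic concrete, the following sketch evaluates each step on the literal 'Shirt,Blue,Large' instead of the table column. It assumes MySQL, like the query above; the column aliases FirstComma, SecondComma, and Color are chosen only for illustration.

```sql
-- Trace of the LOCATE/SUBSTRING steps on a single literal (MySQL).
-- Character positions are 1-based: the commas sit at positions 6 and 11.
SELECT
    LOCATE(',', 'Shirt,Blue,Large')     AS FirstComma,   -- 6
    LOCATE(',', 'Shirt,Blue,Large', 7)  AS SecondComma,  -- 11 (search starts after the first comma)
    SUBSTRING('Shirt,Blue,Large', 7, 4) AS Color;        -- start = 6 + 1, length = 11 - 6 - 1, giving 'Blue'
```

If a value contains fewer than two commas, the second LOCATE returns 0, the computed length becomes negative, and MySQL's SUBSTRING returns an empty string rather than raising an error.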

LEFT and RIGHT: Another approach also splits the string based on delimiter positions but uses LEFT and RIGHT functions to extract the beginning and end parts of the string, respectively. This method can be useful when you need the parts of the string before the first delimiter or after the last delimiter.

Example 2: Extracting Color, Product, and Size from ProductDescription in SQL

-- Query to extract the product, color, and size from the ProductDescription column using LEFT, RIGHT, SUBSTRING, and LOCATE
SELECT
ProductDescription,
SUBSTRING(
ProductDescription,
LOCATE(',', ProductDescription) + 1,
LOCATE(',', ProductDescription, LOCATE(',', ProductDescription) + 1) - LOCATE(',', ProductDescription) - 1
) AS Color,
LEFT(ProductDescription, LOCATE(',', ProductDescription) - 1) AS Product,
RIGHT(ProductDescription, CHAR_LENGTH(ProductDescription) - LOCATE(',', ProductDescription, LOCATE(',', ProductDescription) + 1)) AS Size -- CHAR_LENGTH counts characters, matching LOCATE and RIGHT; LENGTH counts bytes
FROM
Product;

Output:

Using “LEFT & RIGHT” Function

  • The LEFT function extracts the substring from the beginning of the ProductDescription column up to the first comma, representing the product name.
  • The RIGHT function extracts the substring after the second comma up to the end of the ProductDescription column, representing the size.
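
In MySQL, the first and last items can also be read without any position arithmetic by using SUBSTRING_INDEX, the function that the later examples in this article rely on. A minimal sketch, assuming the Product table from the example above:

```sql
-- SUBSTRING_INDEX(str, delim, count) returns everything before the count-th
-- delimiter; a negative count works from the right-hand end instead (MySQL).
SELECT
    ProductDescription,
    SUBSTRING_INDEX(ProductDescription, ',', 1)  AS Product,  -- text before the first comma
    SUBSTRING_INDEX(ProductDescription, ',', -1) AS Size      -- text after the last comma
FROM
    Product;
```

One difference worth knowing: when the string contains no comma at all, SUBSTRING_INDEX returns the whole string, whereas LEFT combined with LOCATE returns an empty string.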

2. Recursive CTE (Common Table Expression)

A recursive CTE is a more general solution: it extracts one item per step and repeats until the whole string has been consumed, so it works for any number of items. It requires a database that supports recursive CTEs (MySQL 8.0 or later, for example) and can be resource-intensive for large datasets.
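
Before applying the idea to a table, here is a minimal sketch that splits the fruit list from the introduction into one row per item. It assumes MySQL 8.0 or later; the CTE names Input and Split and the columns List, ItemIndex, and Item are illustrative choices.

```sql
WITH RECURSIVE
Input AS (
    SELECT 'apple,banana,cherry' AS List
),
Split AS (
    -- Anchor: the first item
    SELECT
        1 AS ItemIndex,
        SUBSTRING_INDEX(List, ',', 1) AS Item,
        List
    FROM Input
    UNION ALL
    -- Recursive step: keep the first (ItemIndex + 1) items, then take the last of them
    SELECT
        ItemIndex + 1,
        SUBSTRING_INDEX(SUBSTRING_INDEX(List, ',', ItemIndex + 1), ',', -1),
        List
    FROM Split
    -- Stop once ItemIndex reaches the item count (number of commas + 1)
    WHERE ItemIndex < CHAR_LENGTH(List) - CHAR_LENGTH(REPLACE(List, ',', '')) + 1
)
SELECT ItemIndex, Item
FROM Split
ORDER BY ItemIndex;
```

The query returns three rows: (1, apple), (2, banana), (3, cherry). Keeping ItemIndex in the result makes it possible to pick an item by position afterwards.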

Example: Extracting Color from ProductDescription using Recursive CTE in SQL

-- Create the ProductDetails table
CREATE TABLE ProductDetails (
ProductID INT AUTO_INCREMENT PRIMARY KEY,
ProductDescription VARCHAR(100)
);

-- Insert some sample data
INSERT INTO ProductDetails (ProductDescription) VALUES
('Shirt,Blue,Large'),
('Pants,Black,Medium'),
('Dress,Red,Small');

-- Query to split ProductDescription into items using a Recursive CTE,
-- then keep the second item (the color)
WITH RECURSIVE ProductCTE AS (
-- Anchor member: the first item of every row
SELECT
ProductID,
ProductDescription,
1 AS ItemIndex,
SUBSTRING_INDEX(ProductDescription, ',', 1) AS Item
FROM
ProductDetails
UNION ALL
-- Recursive member: the next item, until every item has been emitted
SELECT
ProductID,
ProductDescription,
ItemIndex + 1,
SUBSTRING_INDEX(SUBSTRING_INDEX(ProductDescription, ',', ItemIndex + 1), ',', -1)
FROM
ProductCTE
WHERE
ItemIndex < CHAR_LENGTH(ProductDescription) - CHAR_LENGTH(REPLACE(ProductDescription, ',', '')) + 1
)
SELECT
ProductDescription,
Item AS Color
FROM
ProductCTE
WHERE
ItemIndex = 2
ORDER BY
ProductID;

Output:

Using “Recursive CTE” Function

  • A table named ProductDetails is created with an auto-incrementing ProductID and a ProductDescription column that stores the product details.
  • Sample data is inserted into the ProductDetails table.
  • A recursive CTE (Common Table Expression) splits the ProductDescription column into its components (name, color, size) on the comma delimiter.
  • The final SELECT reads the color for each product from the CTE result.

3. STRING_SPLIT Function

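SQL Server introduced the STRING_SPLIT function in SQL Server 2016 to split delimited strings directly into a table of values. It takes the input string and the delimiter as parameters and returns a table with one row per item. This method is efficient and straightforward, ideal for modern SQL Server environments. MySQL has no STRING_SPLIT function, so the MySQL example in this section emulates it with SUBSTRING_INDEX and a small list of numbers.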
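
As a minimal sketch of the function itself, assuming SQL Server 2016 or later with database compatibility level 130 or higher, the queries below split a literal and then a column. The inline VALUES list and the aliases d and f are placeholders for illustration; STRING_SPLIT returns a single column named value.

```sql
-- SQL Server (T-SQL): split a literal into one row per item.
SELECT value AS Fruit
FROM STRING_SPLIT('apple,banana,cherry', ',');

-- Split a column: CROSS APPLY calls STRING_SPLIT once per input row.
SELECT d.ID, f.value AS SplitData
FROM (VALUES (1, 'apple,banana,orange'),
             (2, 'carrot,potato,tomato')) AS d(ID, Data)
CROSS APPLY STRING_SPLIT(d.Data, ',') AS f
ORDER BY d.ID, f.value;
```

STRING_SPLIT does not promise to return items in their original order, which is why an explicit ORDER BY is used here. Newer releases (SQL Server 2022 and Azure SQL Database) accept a third argument, enable_ordinal, that adds an ordinal column holding each item's position.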

Example: Splitting Delimited Strings into Individual Values in SQL

-- Create a sample table with a column containing delimited strings
CREATE TABLE SampleData (
ID INT,
Data VARCHAR(100)
);

-- Insert some sample data
INSERT INTO SampleData (ID, Data)
VALUES
(1, 'apple,banana,orange'),
(2, 'carrot,potato,tomato');

-- Use SUBSTRING_INDEX to split the delimited strings into individual values
SELECT ID,
SUBSTRING_INDEX(SUBSTRING_INDEX(Data, ',', n.n), ',', -1) AS SplitData
FROM SampleData
JOIN (
SELECT 1 n UNION ALL
SELECT 2 UNION ALL
SELECT 3 -- Add more if needed based on maximum elements in the list
) n ON LENGTH(Data) - LENGTH(REPLACE(Data, ',', '')) >= n.n - 1
ORDER BY ID, SplitData;

Output:

Using “STRING_SPLIT”

  • The query emulates STRING_SPLIT in MySQL: SUBSTRING_INDEX(Data, ',', n) keeps the first n items, and the outer SUBSTRING_INDEX(..., ',', -1) keeps the last of those, which is the n-th item.
  • The join against the numbers 1 to 3 produces one row per item; the ON condition skips numbers larger than the number of items in the row.
  • Each row includes the original ID value and the corresponding split data (SplitData).
  • The output is sorted by ID and SplitData for easier interpretation.
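
The hard-coded list of numbers limits the query above to three items per row. One way to lift that limit, sketched here under the assumption of MySQL 8.0 or later, is to generate the numbers with a recursive CTE. The CTE name Numbers and the upper bound of 50 are arbitrary choices for illustration, not part of the original example.

```sql
-- Generate 1..50 once, then join each row to as many numbers as it has items.
WITH RECURSIVE Numbers AS (
    SELECT 1 AS n
    UNION ALL
    SELECT n + 1 FROM Numbers WHERE n < 50
)
SELECT
    s.ID,
    SUBSTRING_INDEX(SUBSTRING_INDEX(s.Data, ',', Numbers.n), ',', -1) AS SplitData
FROM SampleData AS s
JOIN Numbers
    -- item count = number of commas + 1
    ON Numbers.n <= CHAR_LENGTH(s.Data) - CHAR_LENGTH(REPLACE(s.Data, ',', '')) + 1
ORDER BY s.ID, Numbers.n;
```

Ordering by Numbers.n instead of SplitData keeps the items in the order in which they appear in the original string.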

Conclusion

Splitting delimited strings in SQL is a fundamental task in data manipulation and analysis. Understanding various methods, including built-in functions like STRING_SPLIT and recursive CTEs, empowers SQL developers to efficiently access individual items within delimited strings. While newer SQL versions offer dedicated functions for this purpose, legacy systems may require alternative approaches such as custom functions or string manipulation techniques.


