
Python | Delimited String List to String Matrix

Last Updated : 22 Apr, 2023

Sometimes, while working with Python strings, we need to convert a list of delimiter-joined strings into a string matrix (a list of lists) by splitting each string on the delimiter. Let's discuss some ways in which this task can be performed.

Method #1 : Using loop + split() 
This is one way to perform this task. We iterate over each string and split it on the delimiter using split().
 

Python3




# Python3 code to demonstrate working of
# Delimited String List to String Matrix
# Using loop + split()
 
# initializing list
test_list = ['gfg:is:best', 'for:all', 'geeks:and:CS']
 
# printing original list
print("The original list is : " + str(test_list))
 
# Delimited String List to String Matrix
# Using loop + split()
res = []
for sub in test_list:
    res.append(sub.split(':'))
 
# printing result
print("The list after conversion : " + str(res))


Output : 

The original list is : ['gfg:is:best', 'for:all', 'geeks:and:CS']
The list after conversion : [['gfg', 'is', 'best'], ['for', 'all'], ['geeks', 'and', 'CS']]
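Real data is often messier than the sample list, so it helps to know how split() treats edge cases. The following is a minimal sketch; the `samples` list and the variable names are made up for illustration and are not part of the original example.

```python
# Edge cases of str.split() with an explicit delimiter
samples = ['a::b', ':a', 'a:', '', 'a:b:c:d']

# Consecutive, leading, or trailing delimiters produce empty strings,
# and an empty input produces a list holding one empty string
edge_res = []
for sub in samples:
    edge_res.append(sub.split(':'))
print(edge_res)

# The optional maxsplit argument caps the number of splits;
# the unsplit remainder stays in the last element
limited = 'a:b:c:d'.split(':', 1)
print(limited)
```

If empty fields are not wanted in the matrix, they can be filtered out of each row after splitting.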

 

 
Method #2 : Using list comprehension + split() 
This is a modification of the above method. Here, a list comprehension serves as a one-line shorthand for the loop.
 

Python3




# Python3 code to demonstrate working of
# Delimited String List to String Matrix
# Using list comprehension + split()
 
# initializing list
test_list = ['gfg:is:best', 'for:all', 'geeks:and:CS']
 
# printing original list
print("The original list is : " + str(test_list))
 
# Delimited String List to String Matrix
# Using list comprehension + split()
res = [sub.split(':') for sub in test_list]
 
# printing result
print("The list after conversion : " + str(res))


Output : 

The original list is : ['gfg:is:best', 'for:all', 'geeks:and:CS']
The list after conversion : [['gfg', 'is', 'best'], ['for', 'all'], ['geeks', 'and', 'CS']]
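The rows produced by split() can have different lengths, as the second row above shows. If a rectangular matrix is needed, the shorter rows can be padded. This is only a sketch: the empty-string fill value and the names `rows`, `width`, `fill`, and `padded` are assumptions chosen for illustration.

```python
test_list = ['gfg:is:best', 'for:all', 'geeks:and:CS']

# Split each string with a list comprehension, as in Method #2
rows = [sub.split(':') for sub in test_list]

# Width of the widest row; default=0 covers an empty input list
width = max((len(row) for row in rows), default=0)

# Pad shorter rows on the right (the fill value '' is an assumption)
fill = ''
padded = [row + [fill] * (width - len(row)) for row in rows]
print(padded)
```

A different fill value, such as None, may suit data where an empty string is itself a meaningful field.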

 

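The time and space complexity are the same for Methods #1 and #2:

Time Complexity: O(n), where n is the total number of characters across all strings in the list.

Auxiliary Space: O(n), for the resulting list of lists.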

Method #3 : Using csv.reader and StringIO
This method uses the reader function from the csv module to split each string on a specified delimiter. Each string is wrapped in a StringIO object so that it can be read like a file. Because every string is parsed independently, rows may have different numbers of columns.

Python3




import csv
from io import StringIO
 
test_list = ['gfg:is:best', 'for:all', 'geeks:and:CS']
 
print("The original list is:", test_list)
 
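# Using csv.reader and StringIO to convert delimited string list to string matrix
res = []
for sub in test_list:
    # wrap the string in a file-like object so csv.reader can read it
    s = StringIO(sub)
    reader = csv.reader(s, delimiter=':')
    # each string holds a single row; fall back to [] for an empty string
    res.append(next(reader, []))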
 
print("The list after conversion:", res)


Output

The original list is: ['gfg:is:best', 'for:all', 'geeks:and:CS']
The list after conversion: [['gfg', 'is', 'best'], ['for', 'all'], ['geeks', 'and', 'CS']]

Time Complexity: O(n)
Auxiliary Space: O(n)
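One practical difference between csv.reader and a plain split() is quoting: by default, csv.reader treats a delimiter inside double quotes as part of the field. The sketch below also passes the list itself to csv.reader, which accepts any iterable of strings, so no StringIO wrapper is required. The `quoted_list` data is invented for illustration.

```python
import csv

# Hypothetical input: the second string contains a quoted field
quoted_list = ['gfg:is:best', '"for:all":geeks', 'and:CS']

# csv.reader accepts any iterable of strings; each string is one row
csv_res = list(csv.reader(quoted_list, delimiter=':'))
print(csv_res)

# For comparison, str.split() ignores quoting and splits inside the quotes
plain_res = [sub.split(':') for sub in quoted_list]
print(plain_res)
```

Whether this quoting behaviour is desirable depends on the data. If quote characters carry no special meaning in the input, split() is the simpler choice.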

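Method #4 : Using re.split() 
This method splits each string with re.split() from the re module. Because re.split() interprets its first argument as a regular expression, a literal delimiter should be escaped with re.escape(). The regex form is mainly useful when the separator is a pattern, such as several alternative characters.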

Python3




import re
# Function to convert a delimited string list to a string matrix
def delimited_list_to_matrix(string_list, delimiter):
    # Initialize an empty matrix
    matrix = []
    # Iterate over each string in the string_list
    for string in string_list:
        # Split on the literal delimiter; re.escape() guards against regex
        # metacharacters such as '|' or '.'
        split_string = re.split(re.escape(delimiter), string)
        # Append the split string to the matrix
        matrix.append(split_string)
    # Return the matrix
    return matrix
# Test the function with an example input
test_list = ['gfg:is:best', 'for:all', 'geeks:and:CS']
# printing original list
print("The original list is : " + str(test_list))
delimiter = ':'
print(delimited_list_to_matrix(test_list, delimiter))
#This code is contributed by Jyothi pinjala


Output

The original list is : ['gfg:is:best', 'for:all', 'geeks:and:CS']
[['gfg', 'is', 'best'], ['for', 'all'], ['geeks', 'and', 'CS']]

Time Complexity: O(n*m), where n is the number of strings in the input list and m is the average length of each string. The code iterates over each string in the list and calls re.split() on it, which takes O(m) time.

Auxiliary Space: O(n*m), with n and m as above. The code creates a new list of substrings for each input string, and the combined size of these lists is proportional to the total number of characters in the input.
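re.split() is most useful over str.split() when the separator is a pattern rather than one fixed string. As a small sketch, assume the input mixes ':', ';' and ',' as separators. The `mixed_list` data and the names `pattern` and `mixed_res` are hypothetical.

```python
import re

# Hypothetical input that mixes three different separators
mixed_list = ['gfg:is;best', 'for,all', 'geeks;and:CS']

# Character class matching any single ':', ';' or ','
pattern = re.compile(r'[:;,]')

# A compiled pattern exposes the same split() operation as re.split()
mixed_res = [pattern.split(sub) for sub in mixed_list]
print(mixed_res)
```

Compiling the pattern once avoids re-parsing it on every iteration, although the re module also caches recently used patterns internally.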

Method #5 : Using pandas

Step-by-step approach:

  1. Import the pandas library using the import pandas as pd statement. This will allow you to use the pandas library under the shorthand pd.
  2. Define a function called delimited_list_to_matrix that takes in two parameters: a list of strings called string_list, and a delimiter character called delimiter.
  3. Inside the function, create a pandas DataFrame from the string_list using the pd.DataFrame function. The columns parameter specifies the column name of the DataFrame.
  4. Split each string in the DataFrame into multiple columns using the specified delimiter character. This is done by calling the str.split method on the ‘string’ column of the DataFrame, with the expand=True parameter to create a new column for each split element. Rows that split into fewer elements are padded with a missing value so that every row has the same number of columns.
  5. Convert the resulting DataFrame to a matrix using the to_numpy method. This returns a two-dimensional NumPy array of the DataFrame data.
  6. Return the resulting matrix.
  7. Call the delimited_list_to_matrix function with a test input consisting of a list of three strings and the ‘:’ delimiter.
  8. Print the resulting matrix.

Python3




import pandas as pd
 
# Function to convert a delimited string list to a string matrix
def delimited_list_to_matrix(string_list, delimiter):
    # Create a pandas DataFrame from the string_list
    df = pd.DataFrame(string_list, columns=['string'])
    # Split the strings into multiple columns using the specified delimiter
    df = df['string'].str.split(delimiter, expand=True)
    # Convert the DataFrame to a matrix
    matrix = df.to_numpy()
    # Return the matrix
    return matrix
 
# Test the function with an example input
test_list = ['gfg:is:best', 'for:all', 'geeks:and:CS']
delimiter = ':'
print(delimited_list_to_matrix(test_list, delimiter))


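Output

[['gfg' 'is' 'best']
 ['for' 'all' None]
 ['geeks' 'and' 'CS']]

Note that expand=True pads the shorter row with a missing value (shown as None here; the exact representation may vary with the pandas version), so the result is a rectangular 3 x 3 array rather than a ragged list of lists.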

Time Complexity: O(n*m), where n is the number of strings in the list and m is the maximum number of substrings in any string.

Auxiliary Space: O(n*m).


