How to split a string in C/C++, Python and Java?

Splitting a string by some delimiter is a very common task. For example, we may have a comma-separated list of items read from a file and want the individual items in an array.
Almost all programming languages provide a function to split a string by some delimiter.

In C:  

// Splits str[] according to the given delimiters
// and returns the next token. It needs to be called
// in a loop to get all tokens. It returns NULL
// when there are no more tokens. Note that strtok()
// modifies str[] by overwriting delimiters with '\0'.
char * strtok(char str[], const char *delims);

C




// A C/C++ program for splitting a string
// using strtok()
#include <stdio.h>
#include <string.h>
 
int main()
{
    char str[] = "Geeks-for-Geeks";
 
    // Returns first token
    char *token = strtok(str, "-");
   
    // Keep printing tokens until strtok() returns NULL,
    // i.e. until no tokens are left in str[].
    while (token != NULL)
    {
        printf("%s\n", token);
        token = strtok(NULL, "-");
    }
 
    return 0;
}


Output:

Geeks
for
Geeks

Time complexity : O(n)

Auxiliary Space: O(n)

In C++

Note:  The main disadvantage of strtok() is that it only works on C-style strings,
       and it modifies the buffer it is given. Therefore we need to explicitly copy
       a C++ string into a writable char array before using it.
       Many programmers are unaware that C++ has two additional APIs which are more elegant
       and work directly with C++ strings.
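To make the conversion mentioned in the note concrete, here is a minimal sketch of calling strtok() on a C++ string. The helper name split_with_strtok is hypothetical (it is not part of any library), and the sketch assumes it is acceptable to copy the input into a temporary writable buffer, since strtok() overwrites delimiters in place.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not a standard function): splits a std::string
// with strtok() by first copying it into a writable buffer.
std::vector<std::string> split_with_strtok(const std::string& s,
                                           const char* delims)
{
    // strtok() needs a mutable, null-terminated char array.
    std::vector<char> buf(s.begin(), s.end());
    buf.push_back('\0');

    std::vector<std::string> tokens;
    for (char* tok = std::strtok(buf.data(), delims); tok != nullptr;
         tok = std::strtok(nullptr, delims)) {
        tokens.emplace_back(tok); // copy each token out of the buffer
    }
    return tokens;
}
```

Calling split_with_strtok("Geeks-for-Geeks", "-") yields the three tokens Geeks, for and Geeks. Because strtok() skips runs of consecutive delimiters, empty tokens are never produced with this approach.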

Method 1: Using  stringstream API of C++

Prerequisite:  stringstream API 

A stringstream object can be initialized from a string object, and extracting from it with operator >> automatically tokenizes the string on whitespace. Just like the “cin” stream, a stringstream lets you read a string as a stream of words. Alternatively, we can use the getline() function to tokenize the string on any single-character delimiter.

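Some of the most commonly used functions of stringstream:
clear() : resets the stream's error state flags (it does not empty the stream's contents).
str() : returns the stream's contents as a C++ string object, or replaces them when called with a string argument.
operator << : inserts a string object (or other formatted data) into the stream.
operator >> : extracts the next whitespace-delimited word from the stream.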
 The code below demonstrates it. 

C++




#include <iostream>
#include <sstream>
#include <string>
using namespace std;
 
// A quick way to split strings separated via spaces.
void simple_tokenizer(string s)
{
    stringstream ss(s);
    string word;
    while (ss >> word) {
        cout << word << endl;
    }
}
 
// A quick way to split strings separated via any character
// delimiter.
void adv_tokenizer(string s, char del)
{
    stringstream ss(s);
    string word;
    // Test the result of getline() itself rather than eof(),
    // so that a failed read never prints a stale or empty word.
    while (getline(ss, word, del)) {
        cout << word << endl;
    }
}
 
int main(int argc, char const* argv[])
{
    string a = "How do you do!";
    string b = "How$do$you$do!";
    // Takes only space separated C++ strings.
    simple_tokenizer(a);
    cout << endl;
    adv_tokenizer(b, '$');
    cout << endl;
    return 0;
}


Output:

How
do
you
do!

How
do
you
do!

Time Complexity: O(n)

Auxiliary Space:O(n)

Where n is the length of the input string.
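In practice the tokens are usually collected rather than printed. The following is a minimal sketch of that, using only the stream functions listed above; the helper names split_on_char and words_from_lines are hypothetical and chosen for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits s on a single-character delimiter
// using getline() and returns the tokens.
std::vector<std::string> split_on_char(const std::string& s, char del)
{
    std::istringstream in(s);
    std::vector<std::string> tokens;
    std::string token;
    // The loop ends as soon as getline() fails to read anything.
    while (std::getline(in, token, del)) {
        tokens.push_back(token);
    }
    return tokens;
}

// Hypothetical helper: reuses one stringstream for several lines.
std::vector<std::string> words_from_lines(const std::vector<std::string>& lines)
{
    std::stringstream ss;
    std::vector<std::string> words;
    std::string word;
    for (const std::string& line : lines) {
        ss.clear();   // reset the eof/fail flags left by the previous line
        ss.str(line); // replace the stream's contents with the new line
        while (ss >> word) {
            words.push_back(word);
        }
    }
    return words;
}
```

With this sketch, split_on_char("How$do$you$do!", '$') returns four tokens, and an input such as "a$$b" produces an empty token between the two adjacent delimiters. Note the call to clear() before reusing the stream: without it, the eof flag set by the previous extraction loop would make every later read fail.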

Method 2: Using C++ find() and substr() APIs.

Prerequisite: find function and substr().

This method is more general: the delimiter can be any string, including one longer than a single character. (In the code below, the delimiter defaults to a single space.) The logic is simple to follow from the code.

C++




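#include <iostream>
#include <string>
using namespace std;
 
void tokenize(const string& s, const string& del = " ")
{
    // find() returns a size_t and reports "not found" as string::npos,
    // so use size_t positions instead of comparing an int with -1.
    size_t start = 0;
    size_t end = s.find(del);
    while (end != string::npos) {
        cout << s.substr(start, end - start) << endl;
        start = end + del.size();
        end = s.find(del, start);
    }
    // Print the last token (everything after the final delimiter).
    cout << s.substr(start) << endl;
}
int main(int argc, char const* argv[])
{
    // Takes C++ string with any separator
    string a = "How$%do$%you$%do$%!";
    tokenize(a, "$%");
    cout << endl;
 
    return 0;
}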


Output:

How
do
you
do
!

Time Complexity: O(n)

Auxiliary Space:O(1)

Where n is the length of the input string.
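As a variation on Method 2, the sketch below returns the tokens in a vector instead of printing them. The function name split_by is hypothetical, and treating an empty delimiter as "do not split" is an assumption made here to avoid an endless loop, not behavior defined by find() or substr().

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits s on a (possibly multi-character)
// delimiter using find() and substr().
std::vector<std::string> split_by(const std::string& s, const std::string& del)
{
    std::vector<std::string> tokens;
    if (del.empty()) {
        // Assumption: an empty delimiter means the string is not split.
        tokens.push_back(s);
        return tokens;
    }
    std::size_t start = 0;
    std::size_t end;
    while ((end = s.find(del, start)) != std::string::npos) {
        tokens.push_back(s.substr(start, end - start));
        start = end + del.size(); // continue after the delimiter
    }
    tokens.push_back(s.substr(start)); // remainder after the last delimiter
    return tokens;
}
```

For the input used above, split_by("How$%do$%you$%do$%!", "$%") returns the five tokens How, do, you, do and !. Adjacent delimiters yield empty tokens, which is often what is wanted for formats such as CSV.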

Method 3: Using  temporary string

If the delimiter is known to be a single character, you can simply build each token in a temporary string while scanning the input once. This avoids the repeated find() and substr() calls of Method 2.

C++




#include <iostream>
#include <string>
using namespace std;
 
void split(const string& str, char del)
{
    // Temporary string that accumulates the current "word" up to del
    string temp = "";
 
    for (int i = 0; i < (int)str.size(); i++) {
        // If the current char is not del, append it to the current
        // word; otherwise the word is complete, so print it and
        // start a new one.
        if (str[i] != del) {
            temp += str[i];
        }
        else {
            cout << temp << " ";
            temp = "";
        }
    }
 
    // Print the last word
    cout << temp;
}
 
int main()
{
    string str = "geeks_for_geeks"; // string to be split
    char del = '_'; // delimiter around which the string is split
 
    split(str, del);
 
    return 0;
}


Output

geeks for geeks

Time complexity : O(n)

Auxiliary Space: O(n)

In Java : 
In Java, split() is a method in String class. 

// regexp is the delimiting regular expression;
// a positive limit is the maximum number of strings returned
// (the last one holds the unsplit remainder)
public String[] split(String regexp, int limit);

// We can also call split() without a limit
public String[] split(String regexp);

Java




// A Java program for splitting a string
// using split()
public class Test
{
    public static void main(String[] args)
    {
        String str = "Geeks-for-Geeks";
 
        // Split the string into at most two strings
        for (String val : str.split("-", 2))
            System.out.println(val);
 
        System.out.println("");
   
        // Split str into all possible tokens
        for (String val : str.split("-"))
            System.out.println(val);
    }
}


Output: 

Geeks
for-Geeks

Geeks
for
Geeks

Time complexity : O(n)
Auxiliary Space: O(n), since the returned array holds copies of the tokens.

In Python: 
The split() method in Python returns a list of strings after breaking the given string by the specified separator.  

 
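  # sep is the delimiter string (a plain string, not a regular expression);
  # when it is omitted, runs of whitespace act as the separator.
  # maxsplit limits the number of splits to be made;
  # the default of -1 means no limit.
  str.split(sep=None, maxsplit=-1)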

Python3




line = "Geek1 \nGeek2 \nGeek3"
print(line.split())
print(line.split(' ', 1))


Output: 

['Geek1', 'Geek2', 'Geek3']
['Geek1', '\nGeek2 \nGeek3'] 

Time Complexity : O(N), since it traverses the string once to find every separator.

Auxiliary Space : O(N), since the returned list holds copies of the substrings.

This article is contributed by Aarti_Rathi and Aditya Chatterjee.
 



Last Updated : 18 Apr, 2023