
Java substring() method memory leak issue and fix


String is a special class in Java, and substring() is one of its most widely used methods. It extracts part of a string and has two overloaded variants:

1. substring(int beginIndex):

This method extracts the portion of the string that starts at beginIndex and runs to the end of the string.

Example: 

Java




class GFG {
    public static void main(String args[]) {
        String s = "geeksforgeeks";
        String subString = s.substring(4);
        System.out.print(subString);
    }
}


Output

sforgeeks

The beginIndex parameter must be between 0 and the length of the source string (inclusive); otherwise the following exception is thrown:

java.lang.StringIndexOutOfBoundsException: String index out of range:
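The range rule above can be checked with a small sketch. The class name SubstringBoundsDemo and the helper safeSubstring are made up for illustration; only String.substring and StringIndexOutOfBoundsException come from the JDK.

```java
class SubstringBoundsDemo {
    // Hypothetical helper: returns the substring, or the text "out of range"
    // when beginIndex is negative or greater than the string length.
    static String safeSubstring(String s, int beginIndex) {
        try {
            return s.substring(beginIndex);
        } catch (StringIndexOutOfBoundsException e) {
            return "out of range";
        }
    }

    public static void main(String[] args) {
        String s = "geeksforgeeks"; // length 13
        System.out.println(safeSubstring(s, 4));            // sforgeeks
        System.out.println(safeSubstring(s, 13).isEmpty()); // true: beginIndex == length gives ""
        System.out.println(safeSubstring(s, 14));           // out of range
        System.out.println(safeSubstring(s, -1));           // out of range
    }
}
```

Note that beginIndex equal to the string length is still valid and simply yields an empty string.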

2. substring(int beginIndex, int endIndex):

This variant accepts two parameters, beginIndex and endIndex. It returns the characters from beginIndex up to endIndex - 1; in other words, beginIndex is inclusive and endIndex is exclusive.

Example: 

Java




class GFG {
    public static void main(String args[]) {
        String s = "geeksforgeeks";
        String subString = s.substring(5, 13);
        System.out.print(subString);
    }
}


Output

forgeeks
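Because endIndex is exclusive, the length of the result is always endIndex - beginIndex. The following sketch illustrates this; the class name SubstringRangeDemo and the helper rangeLength are assumptions made for the example.

```java
class SubstringRangeDemo {
    // Hypothetical helper: length of s.substring(begin, end),
    // which equals end - begin for any valid range.
    static int rangeLength(String s, int begin, int end) {
        return s.substring(begin, end).length();
    }

    public static void main(String[] args) {
        String s = "geeksforgeeks";
        System.out.println(s.substring(5, 8));           // for
        System.out.println(rangeLength(s, 5, 8));        // 3
        System.out.println(s.substring(5, 5).isEmpty()); // true: an empty range is allowed
        // The one-argument variant behaves like substring(beginIndex, length()).
        System.out.println(s.substring(5).equals(s.substring(5, s.length()))); // true
    }
}
```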

How substring() works internally 
We all know that a String in Java is a sequence of characters. In JDK 6 and earlier, a String is internally represented by an array of characters, and every String object has the following fields.

  • char value[] – Array of characters
  • int count – Total characters in the String
  • int offset – Starting index offset in character array

String s = "geeksforgeeks";
value[] = {'g', 'e', 'e', 'k', 's', 'f', 'o', 'r', 'g', 'e', 'e', 'k', 's'}
count = 13
offset = 0

When we take a substring of the original string in JDK 6, a new String object is created on the heap. The value[] char array is shared between the two String objects, but the count and offset fields of the new object are set according to the substring's length and starting index.
 

String s = "geeksforgeeks";
String substr = s.substring(5, 8);
For substr:
value[] = {'g', 'e', 'e', 'k', 's', 'f', 'o', 'r', 'g', 'e', 'e', 'k', 's'}
count = 3
offset = 5
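To make the sharing concrete, here is a minimal model of the JDK 6 layout. SharedString is a hypothetical class written for this illustration, not the real java.lang.String source; it only mimics the three fields listed above.

```java
class SharedString {
    final char[] value; // backing array, possibly shared with other instances
    final int offset;   // index of the first character in value
    final int count;    // number of characters

    SharedString(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private SharedString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // JDK 6 style: reuse the same array, only offset and count change.
    SharedString substring(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > count || beginIndex > endIndex) {
            throw new StringIndexOutOfBoundsException();
        }
        return new SharedString(value, offset + beginIndex, endIndex - beginIndex);
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        SharedString s = new SharedString("geeksforgeeks");
        SharedString sub = s.substring(5, 8);
        System.out.println(sub);                          // for
        System.out.println(sub.value == s.value);         // true: same array object
        System.out.println(sub.offset + " " + sub.count); // 5 3
    }
}
```

Creating a substring in this model is cheap because nothing is copied, but the substring keeps the whole backing array reachable for as long as it lives.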

Problem caused by substring() in JDK 6 
This design works well with small strings. But when a short substring is taken from a very large string, it can lead to serious memory problems on JDK 6 or earlier.

Example: 

String bigString = new String(new byte[100000]);

The above String already occupies a lot of memory on the heap. Now consider the scenario where we need only the first 2 characters of bigString.

String substr = bigString.substring(0, 2);

Now we don't need the original String.

bigString = null;

We might think that everything bigString held will be garbage collected once we set it to null, but that assumption is wrong. When we call substring(), a new String object is created in memory, but it still refers to the char[] array of the original String. The array therefore cannot be garbage collected, and we keep 100,000 characters in memory (roughly 200 KB, since each char takes two bytes) just for 2 characters. This behaviour was reported as a bug against the JDK.

Handling substring() in JDK 6 
On JDK 6, this issue has to be handled by developers. One option is to create a new String object from the String returned by substring().

String substr = new String(bigString.substring(0, 2));

Now a new String object is created on the Java heap with its own char[] array, so the large array of the original bigString eventually becomes eligible for garbage collection.
Another option is to call the intern() method on the substring, which returns the existing string from the pool or adds it if necessary. Keep in mind that on JDK 6 the string pool lives in the permanent generation, which is limited in size, so this suits only a small number of distinct values.

String substr = bigString.substring(0, 2).intern();
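Both workarounds can be wrapped in small helpers, as in the sketch below. The class name SubstringCopyDemo and the methods detachedPrefix and internedPrefix are hypothetical names chosen for this example. On JDK 7 update 6 and later the extra copy is redundant but harmless.

```java
class SubstringCopyDemo {
    // Workaround 1: copy the characters into a fresh String so the result
    // does not depend on the storage of the big source string.
    static String detachedPrefix(String big, int n) {
        return new String(big.substring(0, n));
    }

    // Workaround 2: return the canonical pooled instance of the prefix.
    static String internedPrefix(String big, int n) {
        return big.substring(0, n).intern();
    }

    public static void main(String[] args) {
        String bigString = "geeksforgeeks";
        System.out.println(detachedPrefix(bigString, 2));         // ge
        // String literals are interned, so the pooled prefix is the same object as "ge".
        System.out.println(internedPrefix(bigString, 2) == "ge"); // true
    }
}
```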

Fix for substring() in JDK 7 
The implementation of substring() was changed in JDK 7 (starting with update 6). When we invoke substring() on these versions, instead of sharing the char[] array of the original String, the JVM creates a new String object with its own char[] array that holds only the requested characters.

Java




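// JDK 7 (simplified from java.lang.String)
public String(char value[], int offset, int count) {
    // bounds checks on offset and count go here
    // copy only the requested range into a fresh array
    this.value = Arrays.copyOfRange(value, offset, offset + count);
}
  
public String substring(int beginIndex, int endIndex) {
    // bounds checks on beginIndex and endIndex go here
    int subLen = endIndex - beginIndex;
    // the constructor copies, so the result does not share this.value
    return new String(value, beginIndex, subLen);
}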


It is worth noting that the String returned by substring() in JDK 7 holds its own copy of the characters and no longer references the original string's array, so the original string becomes eligible for garbage collection as soon as nothing else refers to it.
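The effect of the copy can be seen in a small model. CopyingString is a hypothetical class for illustration only, not the real java.lang.String; it mirrors the copying constructor of the simplified JDK 7 code.

```java
import java.util.Arrays;

class CopyingString {
    final char[] value; // always a private copy, never shared

    CopyingString(char[] source, int offset, int count) {
        // Copy only the requested range, as the JDK 7 constructor does.
        this.value = Arrays.copyOfRange(source, offset, offset + count);
    }

    CopyingString substring(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > value.length || beginIndex > endIndex) {
            throw new StringIndexOutOfBoundsException();
        }
        return new CopyingString(value, beginIndex, endIndex - beginIndex);
    }

    public static void main(String[] args) {
        CopyingString big = new CopyingString(new char[100000], 0, 100000);
        CopyingString sub = big.substring(0, 2);
        System.out.println(sub.value.length);       // 2
        System.out.println(sub.value == big.value); // false: no shared array
    }
}
```

The trade-off is that substring() now takes time and memory proportional to the length of the result, in exchange for never pinning a large array.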



Last Updated : 12 Sep, 2022