Java substring() method memory leak issue and fix

String is a special class in Java, and substring() is one of the most widely used methods of the String class. It extracts part of a string and has two overloaded variants:

1. substring(int beginIndex)
This variant returns the part of the string that starts at beginIndex (inclusive) and runs to the end of the string.

Example:




class GFG {
    public static void main(String args[]) {
        String s = "geeksforgeeks";
        String subString = s.substring(4);
        System.out.print(subString);
    }
}



The beginIndex parameter must be between 0 and the length of the source string (inclusive); otherwise the call throws the following exception:

java.lang.StringIndexOutOfBoundsException: String index out of range:
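The boundary rule can be checked with a small sketch. The class name SubstringBounds and the helper safeSuffix are hypothetical names introduced only for this illustration; the helper simply returns null when substring() rejects the index.

```java
class SubstringBounds {
    // Returns the suffix starting at beginIndex, or null if the index is out of range.
    static String safeSuffix(String s, int beginIndex) {
        try {
            return s.substring(beginIndex);
        } catch (StringIndexOutOfBoundsException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String s = "geeksforgeeks";                        // length is 13
        System.out.println(safeSuffix(s, 4));              // sforgeeks
        System.out.println(safeSuffix(s, 13).isEmpty());   // true: beginIndex == length() is allowed
        System.out.println(safeSuffix(s, 14));             // null: beginIndex > length()
        System.out.println(safeSuffix(s, -1));             // null: negative beginIndex
    }
}
```

Note that beginIndex equal to the string length is still valid and yields an empty string; only values below 0 or above the length are rejected.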

2. substring(int beginIndex, int endIndex)
This variant accepts two parameters, beginIndex and endIndex. It returns the characters from beginIndex (inclusive) up to endIndex - 1, so the character at endIndex itself is not included.

Example:


class GFG {
    public static void main(String args[]) {
        String s = "geeksforgeeks";
        String subString = s.substring(5, 13);
        System.out.print(subString);
    }
}



How substring() works internally
A String in Java is a sequence of characters. In JDK 6 and earlier, a String is internally represented by an array of characters, and every String object has the following fields:

  • char value[] – Array of characters
  • int count – Total characters in the String
  • int offset – Starting index offset in character array

String s = "geeksforgeeks";
value[] = {'g', 'e', 'e', 'k', 's', 'f', 'o', 'r', 'g', 'e', 'e', 'k', 's'}
count = 13
offset = 0

When we take a substring of the original string in JDK 6, a new String object is created on the heap. The value[] char array is shared between the two String objects; only the count and offset fields differ, according to the substring's length and starting index.

String s = "geeksforgeeks";
String substr = s.substring(4, 7);

For substr:
value[] = {'g', 'e', 'e', 'k', 's', 'f', 'o', 'r', 'g', 'e', 'e', 'k', 's'}
count = 3
offset = 4
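The sharing described above can be sketched with a small model class. SharedString is a hypothetical class written only for this illustration; it mimics the three fields listed earlier and is not the real java.lang.String source.

```java
// Hypothetical model of the JDK 6 String layout; not the real java.lang.String.
final class SharedString {
    final char[] value;  // backing array, possibly shared with other instances
    final int offset;    // index of the first character inside value
    final int count;     // number of characters in this string

    SharedString(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private SharedString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // JDK 6 style: reuse the same array; only offset and count change.
    SharedString substring(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > count || beginIndex > endIndex) {
            throw new StringIndexOutOfBoundsException();
        }
        return new SharedString(value, offset + beginIndex, endIndex - beginIndex);
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        SharedString s = new SharedString("geeksforgeeks");
        SharedString sub = s.substring(4, 7);
        System.out.println(sub);                   // sfo
        System.out.println(sub.value == s.value);  // true: both point at the same array
        System.out.println(sub.value.length);      // 13, although sub has only 3 characters
    }
}
```

Because sub holds a reference to the full 13-element array, that array stays reachable for as long as sub does, regardless of what happens to s.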


Problem caused by substring() in JDK 6
This approach works well with small Strings. But taking a substring() of a very long String can lead to serious memory problems if you are using JDK 6 or below.

Example:

String bigString = new String(new byte[100000]);

The above String already occupies a lot of memory on the heap. Now consider the scenario where we need only the first 2 characters of bigString.

String substr = bigString.substring(0, 2);

Now we don’t need the original String.

bigString = null;

We might think that the bigString object will be garbage collected because we set the reference to null, but that assumption is wrong. When we call substring(), a new String object is created in memory, but it still refers to the char[] value array of the original String. This prevents the array from being garbage collected, so we are unnecessarily keeping a 100000-character array in memory (about 200 KB, since each char takes 2 bytes) just for 2 characters. The bug details can be found here.

Handling substring() in JDK 6
In JDK 6 this issue has to be handled by the developer. One option is to create a new String object from the String returned by substring().

String substr = new String(bigString.substring(0, 2));


Now a new String object is created on the Java heap with its own char[] array that holds only the 2 characters, so the original bigString and its large array become eligible for garbage collection.

The other option is to call the intern() method on the substring, which fetches an existing string from the string pool or adds it to the pool if necessary. In JDK 6 the pool lives in the permanent generation, which is limited in size, so this option is only suitable for a small number of short strings.

String substr = bigString.substring(0, 2).intern();

Fix for substring() in JDK 7
Oracle changed the implementation of substring() in JDK 7 (starting with update 6). When we invoke substring() in JDK 7, instead of referring to the char[] array of the original String, the method creates a new String object with its own char[] array.


//JDK 7
public String(char value[], int offset, int count) {
    //check boundary
    this.value = Arrays.copyOfRange(value, offset, offset + count);
}
   
public String substring(int beginIndex, int endIndex) {
    //check boundary
    int subLen = endIndex - beginIndex;
    return new String(value, beginIndex, subLen);
}



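It is worth noting that in JDK 7 the String returned by substring() no longer holds any reference to the original character array, so the original string becomes eligible for garbage collection as soon as nothing else refers to it. The trade-off is that every substring() call now copies the selected characters, so it takes time and memory proportional to the length of the substring.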
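The difference in retained memory between the two strategies can be sketched as follows. The class SubstringRetention and its two helper methods are hypothetical names used only for this comparison; each helper returns the backing array that a substring would keep alive under the respective strategy.

```java
import java.util.Arrays;

class SubstringRetention {
    // JDK 6 style: the substring keeps a reference to the whole original array.
    static char[] sharedBacking(char[] value, int offset, int count) {
        return value;
    }

    // JDK 7 style: the substring copies only the characters it needs.
    static char[] copiedBacking(char[] value, int offset, int count) {
        return Arrays.copyOfRange(value, offset, offset + count);
    }

    public static void main(String[] args) {
        char[] big = new char[100000];
        System.out.println(sharedBacking(big, 0, 2).length);  // 100000
        System.out.println(copiedBacking(big, 0, 2).length);  // 2
    }
}
```

With the copying strategy the 2-character result retains only a 2-element array, at the cost of the copy itself.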


