
Escaping XML Special Characters in Java String


When we read an XML file and write its contents to another XML file, we must take care of special characters. XML reserves a few characters for markup, so when they appear in text or attribute values they must be escaped, that is, replaced with entity references, to be treated as literal characters. If we don't escape them, parsers such as the DOM or SAX parsers in Java interpret them as markup: < starts a tag and & starts an entity reference, so the document is rejected as not well-formed. An XSLT transformation fails for the same reason, because it has to parse its input first. Hence, we need to escape these special characters before placing a Java String into XML.

Special characters in XML

XML has five predefined special characters that need to be escaped when a Java String is written into XML:

  • & (ampersand): &amp;
  • < (less-than sign): &lt;
  • > (greater-than sign): &gt;
  • " (double quote): &quot;
  • ' (apostrophe): &apos;

These special characters are also referred to as XML metacharacters. Escaping replaces each of them with its entity reference, which an XML parser reads back as the literal character.

Example:

<GeeksForGeeks> Data Structures & Java </GeeksForGeeks>

// is not well-formed XML because '&' is a reserved character
// in XML that starts an entity reference. To make it
// well-formed, we need to write &amp; instead of & here.

<GeeksForGeeks> Data Structures &amp; Java </GeeksForGeeks>

// is now well-formed XML.
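
To see how a parser reacts to the two strings above, the following minimal sketch feeds each of them to the DOM parser that ships with the JDK. The class name WellFormedCheck and the helper isWellFormed are made up for this illustration and are not part of any library.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, used only for this illustration
class WellFormedCheck {

    // Returns true if the JDK's DOM parser accepts the text as well-formed XML
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default error handler so that parse errors
            // are not printed to standard error
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "<GeeksForGeeks> Data Structures & Java </GeeksForGeeks>";
        String escaped = "<GeeksForGeeks> Data Structures &amp; Java </GeeksForGeeks>";

        System.out.println(isWellFormed(raw));     // prints false
        System.out.println(isWellFormed(escaped)); // prints true
    }
}
```

The unescaped ampersand is a fatal error for the parser, so no document is built at all; the escaped version parses, and the text node contains a plain & again.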

In Java, we could always write our own function to replace the XML special characters with their entity references, but we could also use the StringEscapeUtils class provided by the Apache Commons Lang library, whose escapeXml method does the XML escaping for us. The program below uses Commons Lang 2.x (package org.apache.commons.lang); in Commons Lang 3, escapeXml is deprecated in favour of escapeXml10 and escapeXml11.
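
For the "write our own function" route, a minimal sketch could look like the following. The class name XmlEscaper and its escape method are hypothetical names chosen for this example; the sketch handles only the five characters listed above and does not deal with control characters that are illegal in XML.

```java
// Hypothetical hand-written escaper, shown only as an illustration
class XmlEscaper {

    // Replaces the five XML special characters with their entity references
    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: Data Structures &amp; Java
        System.out.println(escape("Data Structures & Java"));
    }
}
```

Walking the string once, character by character, avoids a pitfall of chained String.replace calls: if & is not replaced first, the ampersands introduced by earlier replacements get escaped a second time.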

Code:

// Java program to escape all five characters
// mentioned above using the StringEscapeUtils class
// (requires Apache Commons Lang 2.x on the classpath)

import org.apache.commons.lang.StringEscapeUtils;
  
class GeeksForGeeks {
    public static void main (String[] args) {
        
      System.out.println("Program to escape XML Special Characters !!");
          
      // Escape & character in XML String 
      String unescapedXMLString = "DataStructures & Java";
        
      System.out.println("Unescaped String: " + unescapedXMLString);
        
      // using StringEscapeUtils
      System.out.println("Escaped String: " 
                         + StringEscapeUtils.escapeXml(unescapedXMLString));
        
      // Escape > character in XML String 
      unescapedXMLString = "DataStructures > Java";
        
      System.out.println("Unescaped String: " + unescapedXMLString);
        
      // using StringEscapeUtils
      System.out.println("Escaped String: " 
                         + StringEscapeUtils.escapeXml(unescapedXMLString));
        
      // Escape < character in XML String 
      unescapedXMLString = "DataStructures < Java";
        
      System.out.println("Unescaped String: " + unescapedXMLString);
        
      // using StringEscapeUtils
      System.out.println("Escaped String: " 
                         + StringEscapeUtils.escapeXml(unescapedXMLString));
        
      // Escape " character in XML String 
      unescapedXMLString = "DataStructures \" Java";
        
      System.out.println("Unescaped String: " + unescapedXMLString);
        
      // using StringEscapeUtils
      System.out.println("Escaped String: " 
                         + StringEscapeUtils.escapeXml(unescapedXMLString));
        
      // Escape ' character in XML String 
      unescapedXMLString = "DataStructures ' Java";
        
      System.out.println("Unescaped String: " + unescapedXMLString);
        
      // using StringEscapeUtils
      System.out.println("Escaped String: " 
                         + StringEscapeUtils.escapeXml(unescapedXMLString));
             
    }
}


Output:

Program to escape XML Special Characters !!
Unescaped String: DataStructures & Java
Escaped String: DataStructures &amp; Java
Unescaped String: DataStructures > Java
Escaped String: DataStructures &gt; Java
Unescaped String: DataStructures < Java
Escaped String: DataStructures &lt; Java
Unescaped String: DataStructures " Java
Escaped String: DataStructures &quot; Java
Unescaped String: DataStructures ' Java
Escaped String: DataStructures &apos; Java


Last Updated : 22 Feb, 2021