Escaping XML Special Characters in Java String
When we read data from one XML file and write it to another, we have to take care of special characters. XML reserves certain characters as markup, so when they appear in text or attribute values they must be escaped, that is, replaced with entity references, to be treated as literal data. If we don't escape them, DOM or SAX parsers in Java will interpret them as markup (< and > in particular delimit tags), and the document will fail to parse. An XSLT transformation fails in the same way, because it has to parse its input first. Hence, any Java String that ends up as XML content must have these characters escaped before it is written out.
Special characters in XML
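XML has five special characters that need to be escaped when a Java String is used as XML content. Each one has a predefined entity:
- & is written as &amp;
- < is written as &lt;
- > is written as &gt;
- " is written as &quot;
- ' is written as &apos;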
These special characters are also referred to as XML metacharacters. Escaping replaces each of them with its entity reference, which a parser then reads back as the literal character.
<GeeksForGeeks> Data Structures & Java </GeeksForGeeks>
// is not well-formed XML because '&' is reserved in XML:
// it marks the start of an entity reference. To make the
// content valid we need to write &amp; instead of & here.
<GeeksForGeeks> Data Structures &amp; Java </GeeksForGeeks>
// is now well-formed XML.
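To see the difference in practice, both versions of the element can be handed to the DOM parser that ships with the JDK. The sketch below is only an illustration: the class name WellFormedCheck and the helper isWellFormed are made up for this example, and it assumes the default JAXP parser configuration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class for illustration; not part of any library.
class WellFormedCheck {

    // Returns true if the JDK's DOM parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Thrown for malformed input, e.g. a bare '&' in element content.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "<GeeksForGeeks> Data Structures & Java </GeeksForGeeks>";
        String escaped = "<GeeksForGeeks> Data Structures &amp; Java </GeeksForGeeks>";
        System.out.println(isWellFormed(raw));     // false
        System.out.println(isWellFormed(escaped)); // true
    }
}
```

The parser rejects the first string at the bare '&' and accepts the second, decoding &amp; back to a literal '&' in the element's text.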
In Java, we can always write our own function that replaces each XML special character with its entity, but we can also use the StringEscapeUtils class provided by Apache Commons (available in Commons Lang and, in newer releases, Commons Text). It offers ready-made methods that do the XML escaping for us.
Program to escape XML Special Characters !!
Unescaped String: DataStructures & Java
Escaped String: DataStructures &amp; Java
Unescaped String: DataStructures > Java
Escaped String: DataStructures &gt; Java
Unescaped String: DataStructures < Java
Escaped String: DataStructures &lt; Java
Unescaped String: DataStructures " Java
Escaped String: DataStructures &quot; Java
Unescaped String: DataStructures ' Java
Escaped String: DataStructures &apos; Java
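
As an alternative to the library, the hand-written approach mentioned earlier could look roughly like the sketch below. It uses only the JDK; the class name XmlEscapeSketch and the method escapeXml are hypothetical names chosen for this example. It covers only the five metacharacters and does not filter control characters that XML 1.0 disallows, so it is a starting point rather than a full replacement for a library.

```java
// Hypothetical class and method names, chosen for this sketch only.
class XmlEscapeSketch {

    // Replaces the five XML metacharacters with their predefined entities.
    static String escapeXml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] samples = {
            "DataStructures & Java",
            "DataStructures > Java",
            "DataStructures < Java",
            "DataStructures \" Java",
            "DataStructures ' Java"
        };
        for (String s : samples) {
            System.out.println("Unescaped String: " + s);
            System.out.println("Escaped String: " + escapeXml(s));
        }
    }
}
```

Walking the input once, character by character, avoids a pitfall of chained String.replace calls: if '&' were replaced after the other characters, the ampersands inside entities that had already been inserted would be escaped a second time.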