How to Get and Set Default Character Encoding or Charset in Java?
The default character encoding, or charset, is what the Java Virtual Machine (JVM) uses to convert bytes into characters, and back, whenever no charset is specified explicitly. At JVM start-up, Java determines it from the file.encoding system property, in effect calling System.getProperty("file.encoding", "UTF-8"). If file.encoding is not set, the fallback depends on the Java version: JDK 18 and later use UTF-8, while older versions typically derive the default from the operating system's locale.
A character encoding defines how a sequence of bytes is interpreted as a string of characters. The same combination of bytes can denote different characters in different encodings, so specifying the right one matters. Java caches the default character encoding in most of the core classes that need it. Therefore, calling System.setProperty("file.encoding", "UTF-16") at run time may not have the desired effect on InputStreamReader and other Java classes.
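The point that identical bytes can denote different characters is easy to check. The following is a minimal sketch, not part of the original article: the class name DecodeDemo and its helper methods are made up for illustration. It decodes the same two bytes once as UTF-8 and once as ISO-8859-1, and prints code points rather than the characters themselves so that the result does not depend on the console's encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: decodes the same bytes with two different charsets.
class DecodeDemo {

    // Interprets the bytes as UTF-8.
    static String decodeAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Interprets the bytes as ISO-8859-1 (Latin-1), where every byte is one character.
    static String decodeAsLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Formats each char of the string as a U+XXXX code unit.
    static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(String.format("U+%04X ", (int) c));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is the UTF-8 encoding of U+00E9 (e with acute accent).
        byte[] bytes = { (byte) 0xC3, (byte) 0xA9 };

        System.out.println("As UTF-8:      " + codeUnits(decodeAsUtf8(bytes)));   // U+00E9
        System.out.println("As ISO-8859-1: " + codeUnits(decodeAsLatin1(bytes))); // U+00C3 U+00A9
    }
}
```

Decoded as UTF-8 the two bytes form a single character, while ISO-8859-1 treats each byte as its own character, which is how mojibake such as "Ã©" arises.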
Getting default character encoding or Charset
There are several ways of retrieving the default charset in Java:
- Using the “file.encoding” system property
- Using java.nio.charset.Charset.defaultCharset()
- Using InputStreamReader.getEncoding()
Each of these is described briefly below, followed by an example that uses all three.
Method 1: “file.encoding” system property
System.getProperty("file.encoding") returns the value of the file.encoding system property. This value reflects the default charset when the JVM was started with the -Dfile.encoding option, or when the Java program has not overridden the property by calling System.setProperty("file.encoding", encoding), where encoding is the name of a charset.
Method 2: java.nio.charset.Charset
The java.nio.charset.Charset class provides a static method to retrieve the default character encoding used for translating between bytes and Unicode characters. Charset.defaultCharset() returns the default charset that is in use.
Method 3: InputStreamReader.getEncoding()
The InputStreamReader class in Java provides the method getEncoding(), which returns the name of the character encoding used by that stream.
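Note that getEncoding() may report a historical name, such as UTF8, rather than the canonical charset name. The sketch below is an illustration only, and the class name ReaderEncodingDemo and the helper reportedEncoding are assumptions rather than anything from the original article. It creates a reader with an explicit charset and compares the reported name with Charset.name().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: compares getEncoding() with the canonical charset name.
class ReaderEncodingDemo {

    // Returns the name that an InputStreamReader reports for the given charset.
    static String reportedEncoding(Charset charset) {
        byte[] data = { 'w' };
        try (InputStreamReader reader =
                 new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            // Must be read before the reader is closed; afterwards getEncoding() returns null.
            return reader.getEncoding();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String reported = reportedEncoding(StandardCharsets.UTF_8);

        System.out.println("getEncoding():  " + reported);
        System.out.println("Charset.name(): " + StandardCharsets.UTF_8.name());

        // Charset.forName() accepts aliases, so the reported name
        // resolves back to the same charset object.
        System.out.println("Same charset:   "
            + Charset.forName(reported).equals(StandardCharsets.UTF_8));
    }
}
```

When comparing encodings in code, it is usually safer to compare Charset objects than name strings, since the two names can differ in spelling.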
Example:
Java
// Java program to get the default character encoding (charset)

// Importing input/output classes
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
// Importing the Charset class, which defines charsets and
// the translation between bytes and Unicode characters
import java.nio.charset.Charset;

public class GFG {

    // Returns the encoding name reported by an InputStreamReader
    // that was created without an explicit charset
    public static String getCharacterEncoding()
    {
        // Creating a byte array holding one arbitrary letter, here 'w'
        byte[] byte_array = { 'w' };

        // Wrapping the byte array in an InputStream
        InputStream instream = new ByteArrayInputStream(byte_array);

        // Opening a reader on the stream; it uses the default charset
        InputStreamReader streamreader = new InputStreamReader(instream);

        // Returning the name of the default character encoding
        return streamreader.getEncoding();
    }

    // Main driver method
    public static void main(String[] args)
    {
        // Getting the character encoding from the system property
        String defaultencoding = System.getProperty("file.encoding");
        System.out.println("Default Charset: " + defaultencoding);

        // Getting the character encoding via InputStreamReader
        System.out.println("Default Charset by InputStreamReader: "
                           + getCharacterEncoding());

        // Getting the character encoding via java.nio.charset.Charset
        System.out.println("Default Charset: " + Charset.defaultCharset());
    }
}
Output:
Setting default character encoding or Charset
Methods: There are two common ways of specifying the default charset in Java.
- Using the file.encoding Java system property
- Using the JAVA_TOOL_OPTIONS environment variable
Each is described briefly below, followed by an example of what happens when the property is changed at run time instead.
Method 1: Using the Java System property “file.encoding”
The file.encoding system property can be supplied when starting the Java Virtual Machine. For example, the following command specifies the UTF-8 charset:
java -Dfile.encoding="UTF-8" HelloWorld
Method 2: Specifying the environment variable “JAVA_TOOL_OPTIONS”
If the JVM is started by scripts or tools, so that its command line cannot easily be changed, the default charset can be set by assigning -Dfile.encoding="UTF-16" (or any other charset) to the environment variable JAVA_TOOL_OPTIONS. The JVM then picks up this option every time it starts on that machine. When the variable is in use, the console shows a message such as:
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF16
The snippet below shows a default character encoding being set through JAVA_TOOL_OPTIONS:
test@system:~/java java HelloWorld
þÿExecuting HelloWorld
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF16
Example:
Java
// Java program that sets file.encoding at run time and then
// reads back the default character encoding (charset)

// Importing input/output classes
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
// Importing the Charset class, which defines charsets and
// the translation between bytes and Unicode characters
import java.nio.charset.Charset;

public class GFG {

    // Method 1
    // Returns the encoding name reported by an InputStreamReader
    // that was created without an explicit charset
    public static String getCharacterEncoding()
    {
        // Creating a byte array holding one arbitrary letter, here 'w'
        byte[] byte_array = { 'w' };

        // Wrapping the byte array in an InputStream
        InputStream instream = new ByteArrayInputStream(byte_array);

        // Opening a reader on the stream; it uses the default charset
        InputStreamReader streamreader = new InputStreamReader(instream);

        // Returning the name of the default character encoding
        return streamreader.getEncoding();
    }

    // Method 2
    // Main driver method
    public static void main(String[] args)
    {
        // Setting the file.encoding property explicitly to a new value
        System.setProperty("file.encoding", "UTF-16");

        // Reading the property back with getProperty()
        String defaultencoding = System.getProperty("file.encoding");
        System.out.println("Default Charset: " + defaultencoding);

        // Getting the character encoding via InputStreamReader
        System.out.println("Default Charset by InputStreamReader: "
                           + getCharacterEncoding());

        // Getting the character encoding via Charset.defaultCharset()
        System.out.println("Default Charset: " + Charset.defaultCharset());
    }
}
Output:
Default Charset: UTF-16
Default Charset by InputStreamReader: UTF8
Default Charset: UTF-8
The default charset, UTF-8, is cached by the JVM at start-up and is therefore not replaced when file.encoding is set to UTF-16 through System.setProperty(). Only the value of the system property itself changes, which is why the first line reports UTF-16 while the other two still report UTF-8.
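Because the cached default cannot be changed reliably at run time, one common alternative is to pass the charset explicitly to the reader or writer that needs it. The following is a minimal sketch of that idea, not something prescribed by the original article. The class name ExplicitCharsetDemo and the helper readFirstLine are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: reads bytes with an explicitly chosen charset,
// so the result does not depend on the JVM's default charset.
class ExplicitCharsetDemo {

    // Decodes the data with the given charset and returns its first line.
    static String readFirstLine(byte[] data, Charset charset) {
        try (BufferedReader reader = new BufferedReader(
                 new InputStreamReader(new ByteArrayInputStream(data), charset))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Encoding a string as UTF-16 bytes (this includes a byte order mark)
        byte[] utf16Bytes = "Hello".getBytes(StandardCharsets.UTF_16);

        // Decoding with the same charset, passed explicitly
        String text = readFirstLine(utf16Bytes, StandardCharsets.UTF_16);

        System.out.println("Decoded text: " + text); // Decoded text: Hello
        System.out.println("Default charset is still: " + Charset.defaultCharset());
    }
}
```

With this approach the default charset reported on the last line stays whatever the JVM chose at start-up, but it no longer affects how the UTF-16 data is decoded.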