Convert String to Byte Array in Java Using getBytes(encoding) Method
In Java, any sequence of characters within double quotes is treated as a String literal. The String class represents character strings and is part of the java.lang package. All strings in Java are immutable, i.e. their value cannot be changed once created. Internally, a string holds a sequence of sixteen-bit UTF-16 code units. A byte array is an array of bytes, and we can use it to store a collection of binary data.
In order to convert a string into a byte array, we first have to convert the sequence of characters into a sequence of bytes, and for this conversion we can use an instance of Charset.
Charset is an abstract class in the java.nio.charset package. It defines a mapping between sequences of sixteen-bit UTF-16 code units (that is, sequences of characters) and sequences of bytes, and it is used for encoding and decoding text. The conversion discussed above, from a string into a byte array, is called encoding: each character of the string is encoded into one or more bytes.
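The mapping that Charset defines can be sketched in both directions. This is a minimal illustration, not part of the original article: the class name CharsetRoundTrip and its helper methods are hypothetical, and UTF-8 is chosen only as an example charset.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: encodes a string to bytes and decodes it back.
class CharsetRoundTrip {

    // Encoding: characters -> bytes, using the given charset.
    static byte[] encode(String text, Charset charset) {
        ByteBuffer buffer = charset.encode(text);
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes); // copy only the bytes that were actually written
        return bytes;
    }

    // Decoding: bytes -> characters, using the same charset.
    static String decode(byte[] bytes, Charset charset) {
        return charset.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) {
        byte[] encoded = encode("Geeks", StandardCharsets.UTF_8);
        System.out.println(Arrays.toString(encoded)); // [71, 101, 101, 107, 115]
        System.out.println(decode(encoded, StandardCharsets.UTF_8)); // Geeks
    }
}
```

Decoding with a different charset than the one used for encoding would generally not give back the original string.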
public byte[] getBytes(String charsetName) throws UnsupportedEncodingException
This method encodes the string into a sequence of bytes using the named charset and returns the resulting byte array. It throws an UnsupportedEncodingException if the named charset is not supported. Because this is a checked exception, the call has to be wrapped in a try-catch block (or the exception declared by the calling method).
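The try-catch handling described above might look like the following sketch. The class name GetBytesByName, the helper encode, and the charset name "NO-SUCH-CHARSET" are hypothetical and exist only to show both the success and the failure path; returning an Optional is one possible design choice, not something the article prescribes.

```java
import java.io.UnsupportedEncodingException;
import java.util.Arrays;
import java.util.Optional;

// Hypothetical demo class: wraps getBytes(String charsetName) in a try-catch.
class GetBytesByName {

    // Returns the encoded bytes, or an empty Optional if the charset is unknown.
    static Optional<byte[]> encode(String text, String charsetName) {
        try {
            return Optional.of(text.getBytes(charsetName));
        } catch (UnsupportedEncodingException e) {
            // The named charset is not available on this JVM.
            return Optional.empty();
        }
    }

    // Formats the result for printing.
    static String describe(Optional<byte[]> result) {
        return result.map(bytes -> Arrays.toString(bytes)).orElse("unsupported charset");
    }

    public static void main(String[] args) {
        System.out.println(describe(encode("Geeks", "UTF-8")));
        // Assumed to be a name no JVM recognizes, so the catch branch runs.
        System.out.println(describe(encode("Geeks", "NO-SUCH-CHARSET")));
    }
}
```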
- In the program below, the getBytes() method converts the string into bytes using the UTF-16 encoding (16 is the size of one code unit in bits).
- UTF stands for Unicode Transformation Format and is used to encode characters. There are several variants: UTF-8 uses one to four bytes per character, UTF-16 uses two or four bytes, and UTF-32 always uses four bytes.
- The program uses UTF-16, which takes at least 2 bytes to encode a character, and that is why the length of the resulting byte array differs from the length of the given string. Java's UTF-16 encoder also writes a 2-byte byte-order mark before the text (the leading -2, -1 in the output, i.e. 0xFE 0xFF), so the 13-character string yields 2 + 13 * 2 = 28 bytes. With UTF-8, the length of the resulting array equals the length of the input string only when every character is ASCII, because UTF-8 encodes ASCII characters in one byte and all other characters in two to four bytes.
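A minimal sketch of such a program follows. It assumes the input string is "GeeksForGeeks", which is inferred from the byte values and the string length of 13 in the output; the class name StringToByteArrayUtf16 and the helper encodeUtf16 are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.util.Arrays;

// Hypothetical demo class: converts a string to a byte array with UTF-16.
class StringToByteArrayUtf16 {

    // Encodes the text using the charset named "UTF-16".
    static byte[] encodeUtf16(String text) {
        try {
            return text.getBytes("UTF-16");
        } catch (UnsupportedEncodingException e) {
            // UTF-16 is a standard charset that every Java platform must
            // support, so this branch is not expected to run.
            throw new IllegalStateException("UTF-16 is not supported", e);
        }
    }

    public static void main(String[] args) {
        String str = "GeeksForGeeks"; // assumed input
        byte[] byteArr = encodeUtf16(str);

        System.out.println(Arrays.toString(byteArr));
        System.out.println("Length of String " + str.length()
                + " Length of byte Array " + byteArr.length);
    }
}
```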
Output:
[-2, -1, 0, 71, 0, 101, 0, 101, 0, 107, 0, 115, 0, 70, 0, 111, 0, 114, 0, 71, 0, 101, 0, 101, 0, 107, 0, 115]
Length of String 13 Length of byte Array 28
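How the array length depends on the encoding can be checked with a short comparison. This sketch uses the getBytes(Charset) overload with the StandardCharsets constants instead of a charset name, which avoids the checked exception; the class name EncodedLengths and the sample strings are assumptions made for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: compares encoded lengths across charsets.
class EncodedLengths {

    // Number of bytes the text occupies in the given charset.
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String ascii = "GeeksForGeeks"; // 13 ASCII characters
        String accented = "\u00e9";     // one non-ASCII character (e with acute accent)

        // ASCII text: one byte per character in UTF-8.
        System.out.println(byteLength(ascii, StandardCharsets.UTF_8));    // 13
        // UTF-16 adds a 2-byte byte-order mark: 2 + 13 * 2.
        System.out.println(byteLength(ascii, StandardCharsets.UTF_16));   // 28
        // UTF-16BE fixes the byte order, so no byte-order mark is written.
        System.out.println(byteLength(ascii, StandardCharsets.UTF_16BE)); // 26
        // Non-ASCII text: UTF-8 needs more than one byte per character.
        System.out.println(byteLength(accented, StandardCharsets.UTF_8)); // 2
    }
}
```

For ASCII-only input the UTF-8 length matches the string length, but a single accented character already takes two bytes.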