Open In App

Convert Unicode to Bytes in Python

Unicode, often known as the Universal Character Set, is a standard for text encoding. The primary objective of Unicode is to create a universal character set that can represent text in any language or writing system. Text characters from various writing systems are given distinctive representations by it.

Convert Unicode to Byte in Python

Below, are the ways to convert a Unicode String to a Byte String In Python.



Convert A Unicode to Byte Using encode() with UTF-8

In this example, the Unicode string “Hello, GeeksforGeeks” is encoded into bytes using the UTF-8 encoding with the `encode()` method. The resulting `bytes_representation` is a sequence of bytes representing the UTF-8 encoded version of the original string.




unicode_string = "Hello, GeeksforGeeks"
 
bytes_representation = unicode_string.encode('utf-8')
 
print(bytes_representation)

Output

b'Hello, GeeksforGeeks'


Unicode To Byte Using encode() with a Different Encoding

In this example, the Unicode string is encoded into a byte string using the UTF-16 encoding with the `encode()` method. The resulting `byte_string_utf16` contains the UTF-16 encoded representation of the original string, which is then printed to the console.




unicode_string = "Hello, Noida"
 
byte_string_utf16 = unicode_string.encode('utf-16')
 
# Displaying the byte string
print(byte_string_utf16)

Output
b'\xff\xfeH\x00e\x00l\x00l\x00o\x00,\x00 \x00N\x00o\x00i\x00d\x00a\x00'


Convert A Unicode String To Byte String Using bytes() Constructor

In this example, the Unicode string is converted to a byte string using the bytes() constructor with UTF-8 encoding. The resulting `byte_string_bytes` represents the UTF-8 encoded byte sequence of the mixed-language Unicode string.




unicode_string = "Hello, 你好"
 
byte_string_bytes = bytes(unicode_string, 'utf-8')
 
# Displaying the byte string
print(byte_string_bytes)

Output
b'Hello, \xe4\xbd\xa0\xe5\xa5\xbd'


Python Unicode String to Byte String Using str.encode() Method

In this example, the Unicode string is transformed into a byte string using the str.encode() method with UTF-8 encoding. The resulting `byte_string_str_encode` represents the UTF-8 encoded byte sequence of the mixed-language Unicode string.




unicode_string = "Hello, Shivang"
 
byte_string_str_encode = str.encode(unicode_string, 'utf-8')
 
# Displaying the byte string
print(byte_string_str_encode)

Output
b'Hello, Shivang'



Article Tags :