Convert Unicode String to a Byte String in Python
Last Updated :
30 Jan, 2024
Python is a versatile programming language known for its simplicity and readability. Unicode support is a crucial aspect of Python, allowing developers to handle characters from various scripts and languages. However, there are instances where you might need to convert a Unicode string to a regular string. In this article, we will explore five different methods to achieve this in Python.
Convert A Unicode String to a Byte String In Python
Below, are the ways to convert a Unicode String to a Byte String In Python.
- Using
encode()
with UTF-8
- Using
encode()
with a Different Encoding
- Using
bytes()
Constructor
- Using
str.encode()
Method
Convert A Unicode to a Byte String Using encode()
with UTF-8
In this example, a Unicode string containing English and Chinese characters is encoded to a byte string using UTF-8 encoding. The resulting `bytes_representation` is printed, demonstrating the transformation of the mixed-language Unicode string into its byte representation suitable for storage or transmission in a UTF-8 encoded format.
Python3
unicode_string = "Hello, ä½ å¥½"
bytes_representation = unicode_string.encode( 'utf-8' )
print (bytes_representation)
|
Output
b'Hello, \xe4\xbd\xa0\xe5\xa5\xbd'
Unicode String To Byte String Using encode()
with a Different Encoding
In this example, the Unicode string is encoded into a byte string using UTF-16 encoding, resulting in a sequence of bytes that represents the mixed-language string. The byte string is then printed to demonstrate the UTF-16 encoded representation of the Unicode characters.
Python3
unicode_string = "Hello, ä½ å¥½"
byte_string_utf16 = unicode_string.encode( 'utf-16' )
print (byte_string_utf16)
|
Output
b'\xff\xfeH\x00e\x00l\x00l\x00o\x00,\x00 \x00`O}Y'
Convert A Unicode String To Byte String Using bytes()
Constructor
In this example, the Unicode string is converted to a byte string using the bytes() constructor with UTF-8 encoding. The resulting `byte_string_bytes` represents the UTF-8 encoded byte sequence of the mixed-language Unicode string.
Python3
unicode_string = "Hello, ä½ å¥½"
byte_string_bytes = bytes(unicode_string, 'utf-8' )
print (byte_string_bytes)
|
Output
b'Hello, \xe4\xbd\xa0\xe5\xa5\xbd'
Python Unicode String To Byte String Using str.encode()
Method
In this example, the Unicode string is transformed into a byte string using the str.encode() method with UTF-8 encoding. The resulting `byte_string_str_encode` represents the UTF-8 encoded byte sequence of the mixed-language Unicode string.
Python3
unicode_string = "Hello, ä½ å¥½"
byte_string_str_encode = str .encode(unicode_string, 'utf-8' )
print (byte_string_str_encode)
|
Output
b'Hello, \xe4\xbd\xa0\xe5\xa5\xbd'
Share your thoughts in the comments
Please Login to comment...