For C strings represented as a char *, int pair, you must decide whether the string should be presented as a raw byte string or as a Unicode string.
Bytes objects can be built from such a pair with PyBytes_FromStringAndSize(), or with Py_BuildValue() and the "y#" format code.
To create a Unicode string when s is known to point to data encoded as UTF-8, use PyUnicode_DecodeUTF8(), or Py_BuildValue() with the "s#" format code.
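If s is encoded in some other known encoding, a string can be made with PyUnicode_Decode(), which takes the data, its length, the encoding name, and an error-handling policy such as "strict" or "ignore".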
If a wide string is represented as a wchar_t *, len pair, there are a few options, such as PyUnicode_FromWideChar().
- The data from C must be explicitly decoded into a string according to some codec.
- Common encodings include ASCII, Latin-1, and UTF-8.
- If the encoding is not known, it is best to represent the data as bytes instead.
- Python always copies the string data it is given when making an object.
- Also, for better reliability, strings should be created using both a pointer and a size rather than relying on NULL-terminated data.