For a C string represented as a char *, int pair, you must first decide whether the data should be presented as a raw byte string or as a Unicode string.
A bytes object can be built directly from the pointer and length, for example with Py_BuildValue() and the y# format code.
To create a Unicode string when it is known that s points to data encoded as UTF-8, decode it as UTF-8, for example with Py_BuildValue() and the s# format code.
If s is encoded in some other known encoding, a string can be made using PyUnicode_Decode(), which takes the encoding name and an error-handling mode.
If a wide string is represented as a wchar_t *, len pair, there are a few options, such as PyUnicode_FromWideChar().
Some points to keep in mind:
- The data from C must be explicitly decoded into a string according to some codec.
- Common encodings include ASCII, Latin-1, and UTF-8.
- If the encoding is not known, it is best to pass the data to Python as bytes instead.
- Python always copies the string data it is given when making an object, so the C code remains responsible for releasing the original buffer.
- Also, for better reliability, strings should be created using both a pointer and a size rather than relying on NULL-terminated data.