C strings conversion to Python

When a C string is represented as a char * pointer plus a length, the first thing to decide is whether it should be presented to Python as a raw byte string or as a Unicode string.

Bytes objects can be built using Py_BuildValue() as follows:

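// Pointer to C string data
char *s;
// Length of data in bytes; "#" formats expect a Py_ssize_t, not an int
// (define PY_SSIZE_T_CLEAN before including Python.h)
Py_ssize_t len;
// Make a bytes object
PyObject *obj = Py_BuildValue("y#", s, len);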

To create a Unicode string when s is known to point to UTF-8 encoded data, the code given below can be used:

PyObject *obj = Py_BuildValue("s#", s, len);

If s is encoded in some other known encoding, a string can be made using PyUnicode_Decode():

// General form: the codec name, then the error-handling policy
PyObject *obj = PyUnicode_Decode(s, len, "encoding", "errors");
// Examples
obj = PyUnicode_Decode(s, len, "latin-1", "strict");
obj = PyUnicode_Decode(s, len, "ascii", "ignore");

If the data is a wide string represented as a wchar_t *, len pair, there are a few options, as shown below:

// Wide character string
wchar_t *w;
// Length, counted in wide characters rather than bytes
Py_ssize_t len;
// Option 1 - use Py_BuildValue()
PyObject *obj = Py_BuildValue("u#", w, len);
// Option 2 - use PyUnicode_FromWideChar()
PyObject *obj2 = PyUnicode_FromWideChar(w, len);

  • Data coming from C must be explicitly decoded into a string according to some codec.
  • Common encodings include ASCII, Latin-1, and UTF-8.
  • If the encoding is not known, it is best to pass the data to Python as bytes instead.
  • Python always copies the string data it is given when making an object, so the C code stays responsible for the original buffer.
  • For better reliability, strings should be created using both a pointer and a size rather than relying on NUL-terminated data.
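To illustrate the last point, here is a small sketch in plain C (no Python headers needed) of what goes missing when a buffer with an embedded NUL byte is treated as a NUL-terminated string. The function name bytes_lost_to_nul is made up for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (illustration only): given a buffer that is really
 * `len` bytes long, return how many trailing bytes would be ignored by
 * code that stops at the first NUL byte. */
size_t bytes_lost_to_nul(const char *data, size_t len)
{
    /* memchr never reads past `len`, so it is safe even when the
     * buffer has no terminator at all. */
    const char *nul = memchr(data, '\0', len);
    size_t visible = nul ? (size_t)(nul - data) : len;
    return len - visible;
}
```

For the seven bytes 'a', 'b', 'c', '\0', 'd', 'e', 'f' the function returns 4: a NUL-terminated format such as "s" would see only the first three bytes, whereas passing the explicit length with "y#" or "s#" keeps all seven.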

Last Updated : 02 Apr, 2019