
C strings conversion to Python

When a C string is represented as a char *, int pair, you must decide whether to present it to Python as a raw byte string or as a Unicode string.

Bytes objects can be built using Py_BuildValue() as follows:






// Pointer to C string data
char *s; 
  
// Length of data 
int len; 
  
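// Make a bytes object ("#" formats expect a Py_ssize_t length;
// define PY_SSIZE_T_CLEAN before including Python.h)
PyObject *obj = Py_BuildValue("y#", s, (Py_ssize_t)len);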

 
To create a Unicode string when it is known that s points to data encoded as UTF-8, the following code can be used:




PyObject *obj = Py_BuildValue("s#", s, (Py_ssize_t)len);

 
If s is encoded in some other known encoding, a string can be created with PyUnicode_Decode():






PyObject *obj = PyUnicode_Decode(s, len, "encoding", "errors");
  
// Examples (each call returns a new reference, or NULL on error)
obj = PyUnicode_Decode(s, len, "latin-1", "strict");
obj = PyUnicode_Decode(s, len, "ascii", "ignore");

 
If a wide string is represented as a wchar_t *, len pair, there are a few options, as shown below:




// Wide character string
wchar_t *w;
  
// Length
int len; 
  
// Option 1 - use Py_BuildValue()
PyObject *obj = Py_BuildValue("u#", w, (Py_ssize_t)len);
  
// Option 2 - use PyUnicode_FromWideChar()
obj = PyUnicode_FromWideChar(w, len);

