mbrtowc() function in C/C++

Last Updated : 10 May, 2019

The mbrtowc() function in C/C++ converts a multibyte sequence to a wide character and returns the length in bytes of the multibyte character. The multibyte character pointed to by pmb is converted to a value of type wchar_t and stored at the location pointed to by pwc. If pmb points to a null character, the function resets the shift state and returns zero after storing the wide null character at pwc.

Syntax:

size_t mbrtowc (wchar_t* pwc, const char* pmb, size_t max, mbstate_t* ps)

Parameters: The function accepts four parameters, as described below:

  • pwc : pointer to the location where the resulting wide character will be written
  • pmb : pointer to the multibyte character string used as input
  • max : limit on the number of bytes in pmb that can be examined
  • ps : pointer to the conversion state used when interpreting the multibyte string
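
To show what the conversion state in ps is for, here is a minimal sketch that hands mbrtowc() a two-byte UTF-8 character one byte per call. The helper name decode_in_two_calls and the list of locale names are assumptions made for this illustration, and the returned value is a Unicode code point only on platforms where wchar_t stores Unicode values (glibc, for example).

```cpp
#include <clocale>
#include <cwchar>

// Hypothetical helper: decodes U+00DF (UTF-8 bytes C3 9F) by passing
// mbrtowc() one byte per call. Returns the decoded wide character value,
// or -1 if no UTF-8 locale could be selected or a call did not return
// the expected result.
long decode_in_two_calls()
{
    // Locale names differ between systems; these are common guesses.
    const char* names[] = { "C.UTF-8", "en_US.UTF-8", "en_US.utf8" };
    bool have_locale = false;
    for (const char* name : names) {
        if (std::setlocale(LC_ALL, name) != nullptr) {
            have_locale = true;
            break;
        }
    }
    if (!have_locale)
        return -1;

    const char bytes[] = "\xC3\x9F";
    std::mbstate_t ps = std::mbstate_t(); // initial conversion state
    wchar_t pwc = 0;

    // First call sees only the lead byte (max = 1): the character is
    // incomplete, and the partial result is remembered in ps.
    std::size_t r = std::mbrtowc(&pwc, bytes, 1, &ps);
    if (r != static_cast<std::size_t>(-2))
        return -1;

    // Second call supplies the remaining byte and reuses the same ps.
    r = std::mbrtowc(&pwc, bytes + 1, 1, &ps);
    if (r != 1)
        return -1;

    return static_cast<long>(pwc);
}
```

If the second call were given a freshly initialized state instead of the same ps, the lone continuation byte would have nothing to attach to, so the same object has to be carried across both calls.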

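Return value: The function returns one of four values:

  1. If the converted character is the null wide character, or if pmb is a null pointer, the function returns 0
  2. The number of bytes [1…max] of the multibyte character successfully converted from pmb
  3. If the first max bytes of pmb form an incomplete (but potentially valid) multibyte character, the function returns (size_t)-2
  4. Otherwise, the bytes do not form a valid multibyte character; the function returns (size_t)-1 and sets errno to EILSEQ

Note: size_t is an unsigned type, so none of the possible return values is less than zero. (size_t)-1 and (size_t)-2 are the two largest values of size_t and must be compared against explicitly.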
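
The four return cases can be told apart with explicit comparisons, as in the following minimal sketch. The helper name classify and the strings it returns are hypothetical and exist only for this illustration.

```cpp
#include <cwchar>
#include <string>

// Hypothetical helper: converts at most max bytes starting at pmb and
// names the kind of result that mbrtowc() reported.
std::string classify(const char* pmb, std::size_t max)
{
    std::mbstate_t ps = std::mbstate_t(); // fresh state for each query
    wchar_t pwc;
    const std::size_t r = std::mbrtowc(&pwc, pmb, max, &ps);

    if (r == static_cast<std::size_t>(-1))
        return "invalid";     // errno has been set to EILSEQ
    if (r == static_cast<std::size_t>(-2))
        return "incomplete";  // more bytes are needed
    if (r == 0)
        return "null";        // the null character was converted
    return "complete";        // r bytes formed one character
}
```

For example, classify("a", 1) yields "complete" and classify("", 1) yields "null". With max set to 0 there are no bytes to examine, which common implementations such as glibc report as "incomplete".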

The programs below illustrate the function:
Program 1:




// C++ program to illustrate
// mbrtowc() function
#include <clocale>
#include <cstring>
#include <cwchar>
#include <iostream>
using namespace std;
  
// Function to convert multibyte
// sequence to wide character
void print_(const char* s)
{
    // initial state
    mbstate_t ps = mbstate_t();
  
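    // one past the last byte of the string
    const char* end = s + strlen(s);
    size_t len;
    wchar_t pwc;
  
    // convert one character per iteration; stop at the end of
    // the string or on an error ((size_t)-1 or (size_t)-2)
    while ((len = mbrtowc(&pwc, s, end - s, &ps)) > 0
           && len != (size_t)-1 && len != (size_t)-2) {
        wcout << "Next " << len
              << " bytes are the character " << pwc << '\n';
        s += len;
    }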
}
  
// Driver code
int main()
{
    setlocale(LC_ALL, "en_US.utf8");
  
    // UTF-8 narrow multibyte encoding
    // (since C++20, u8 literals have type const char8_t[] and need a cast)
    const char* str = u8"z\u00df\u6c34\U0001d10b";
  
    print_(str);
}


Output:

Next 1 bytes are the character z
Next 2 bytes are the character ß
Next 3 bytes are the character 水
Next 4 bytes are the character 𝄋

Program 2:




// C++ program to illustrate
// mbrtowc() function
// with different UTF-8 characters
#include <clocale>
#include <cstring>
#include <cwchar>
#include <iostream>
using namespace std;
  
// Function to convert multibyte
// sequence to wide character
void print_(const char* s)
{
    // initial state
    mbstate_t ps = mbstate_t();
  
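    // one past the last byte of the string
    const char* end = s + strlen(s);
    size_t len;
    wchar_t pwc;
  
    // convert one character per iteration; stop at the end of
    // the string or on an error ((size_t)-1 or (size_t)-2)
    while ((len = mbrtowc(&pwc, s, end - s, &ps)) > 0
           && len != (size_t)-1 && len != (size_t)-2) {
        wcout << "Next " << len
              << " bytes are the character " << pwc << '\n';
        s += len;
    }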
}
  
// Driver code
int main()
{
    setlocale(LC_ALL, "en_US.utf8");
  
    // UTF-8 narrow multibyte encoding, spelled out byte by byte
    // (a plain literal keeps the type const char[] in every C++ standard)
    const char* str = "\xE2\x88\x83y\xE2\x88\x80x\xC2\xAC";
  
    print_(str);
}


Output:

Next 3 bytes are the character ∃
Next 1 bytes are the character y
Next 3 bytes are the character ∀
Next 1 bytes are the character x
Next 2 bytes are the character ¬

