mbrtowc() function in C/C++

The mbrtowc() function in C/C++ converts a multibyte sequence to a wide character. It is declared in <cwchar> (<wchar.h> in C) and returns the length in bytes of the multibyte character it converted. The multibyte character pointed to by pmb is converted to a value of type wchar_t and stored at the location pointed to by pwc. If pmb points to a null character, the function resets the shift state and returns zero after storing the wide null character at pwc.

Syntax:

size_t mbrtowc (wchar_t* pwc, const char* pmb, size_t max, mbstate_t* ps)

Parameters: The function accepts four parameters, as described below:

  • pwc : pointer to the location where the resulting wide character will be written
  • pmb : pointer to the multibyte character string used as input
  • max : limit on the number of bytes in pmb that can be examined
  • ps : pointer to the conversion state used when interpreting the multibyte string
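The role of the conversion state is easiest to see when one character arrives split across two buffers. The following is a minimal sketch, not a definitive implementation: use_utf8_locale() and feed() are hypothetical helpers introduced here for illustration, and the list of locale names is an assumption that varies by platform.

```cpp
#include <cassert>
#include <clocale>
#include <cstddef>
#include <cwchar>

// Try a few common names for a UTF-8 locale. The names are an
// assumption: they differ between platforms and may not be installed.
bool use_utf8_locale()
{
    const char* names[] = { "en_US.utf8", "en_US.UTF-8", "C.UTF-8" };
    for (const char* name : names)
        if (std::setlocale(LC_ALL, name) != nullptr)
            return true;
    return false;
}

// Hypothetical helper: pass one chunk of bytes to mbrtowc(), reusing *st.
// Returns true once a complete character has been stored in *out.
bool feed(const char* chunk, std::size_t n, std::mbstate_t* st, wchar_t* out)
{
    assert(chunk && st && out);
    std::size_t rc = std::mbrtowc(out, chunk, n, st);
    // (size_t)-2: the bytes were consumed and remembered in *st, more are needed.
    // (size_t)-1: invalid sequence, treated here simply as "no character".
    return rc != static_cast<std::size_t>(-1)
        && rc != static_cast<std::size_t>(-2);
}
```

With a UTF-8 locale active, calling feed("\xE6", 1, &st, &wc) on a zero-initialized state should return false, and a following feed("\xB0\xB4", 2, &st, &wc) with the same st should return true and store U+6C34. The partial character is carried between the two calls only because the same mbstate_t object is passed both times.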

Return value: The function returns one of four values:

  1. If the converted character is the null wide character, or if pmb is a null pointer, the function returns 0
  2. If a valid character was converted, the function returns the number of bytes (between 1 and max) that the multibyte character occupies in pmb
  3. If the first max bytes of pmb form an incomplete (but potentially valid) multibyte character, the function returns (size_t)-2
  4. Otherwise, an encoding error has occurred: the function returns (size_t)-1 and sets errno to EILSEQ

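Note: size_t is an unsigned type, so none of the returned values is less than zero. (size_t)-1 and (size_t)-2 are the two largest values of size_t, so a test such as "result > 0" does not detect an error unless the result is first stored in a signed type.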
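One way to keep the four outcomes apart is to compare the result against the two cast constants before anything else. The sketch below assumes nothing beyond the standard library; MbResult, classify() and decode_one() are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical names for the four outcomes of mbrtowc().
enum class MbResult { Null, Converted, Incomplete, Invalid };

MbResult classify(std::size_t rc)
{
    // Check the error constants first: as unsigned values they are
    // very large, so "rc > 0" would be true for both of them.
    if (rc == static_cast<std::size_t>(-1)) return MbResult::Invalid;
    if (rc == static_cast<std::size_t>(-2)) return MbResult::Incomplete;
    if (rc == 0) return MbResult::Null;
    return MbResult::Converted;
}

// Decode one character from at most n bytes of s, starting from
// a fresh (initial) conversion state.
MbResult decode_one(const char* s, std::size_t n, wchar_t* out)
{
    assert(s != nullptr && out != nullptr);
    std::mbstate_t st = std::mbstate_t();
    return classify(std::mbrtowc(out, s, n, &st));
}
```

For plain ASCII input, decode_one("A", 1, &wc) is expected to yield MbResult::Converted and decode_one("", 1, &wc) to yield MbResult::Null. A limit of 0 bytes leaves the character incomplete, which common implementations report as (size_t)-2.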

The programs below illustrate the function:
Program 1:

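// C++ program to illustrate
// mbrtowc() function
#include <clocale>
#include <cstring>
#include <cwchar>
#include <iostream>
using namespace std;

// Function to convert a multibyte
// sequence to wide characters
void print_(const char* s)
{
    // initial conversion state
    mbstate_t ps = mbstate_t();

    // one past the last byte of the string
    const char* end = s + strlen(s);
    size_t len;
    wchar_t pwc;

    // convert and print one character at a time;
    // stop at the null character (0) or on an
    // error, (size_t)-1 or (size_t)-2
    while ((len = mbrtowc(&pwc, s, end - s, &ps)) != 0
           && len != (size_t)-1 && len != (size_t)-2) {
        wcout << "Next " << len
              << " bytes are the character " << pwc << '\n';
        s += len;
    }
}

// Driver code
int main()
{
    // select a UTF-8 locale so that mbrtowc() decodes
    // UTF-8 (the locale name is platform-dependent)
    setlocale(LC_ALL, "en_US.utf8");

    // UTF-8 bytes of "z", U+00DF, U+6C34 and U+1D10B
    // (written as bytes so that it also compiles as C++20)
    const char* str = "z\xc3\x9f\xe6\xb0\xb4\xf0\x9d\x84\x8b";

    print_(str);
}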

Output:

Next 1 bytes are the character z
Next 2 bytes are the character ß
Next 3 bytes are the character 水
Next 4 bytes are the character 𝄋

Program 2:

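// C++ program to illustrate
// mbrtowc() function
// with different UTF-8 characters
#include <clocale>
#include <cstring>
#include <cwchar>
#include <iostream>
using namespace std;

// Function to convert a multibyte
// sequence to wide characters
void print_(const char* s)
{
    // initial conversion state
    mbstate_t ps = mbstate_t();

    // one past the last byte of the string
    const char* end = s + strlen(s);
    size_t len;
    wchar_t pwc;

    // convert and print one character at a time;
    // stop at the null character (0) or on an
    // error, (size_t)-1 or (size_t)-2
    while ((len = mbrtowc(&pwc, s, end - s, &ps)) != 0
           && len != (size_t)-1 && len != (size_t)-2) {
        wcout << "Next " << len
              << " bytes are the character " << pwc << '\n';
        s += len;
    }
}

// Driver code
int main()
{
    // select a UTF-8 locale so that mbrtowc() decodes
    // UTF-8 (the locale name is platform-dependent)
    setlocale(LC_ALL, "en_US.utf8");

    // UTF-8 bytes of U+2203, "y", U+2200, "x" and U+00AC
    // (no u8 prefix, so that it also compiles as C++20)
    const char* str = "\xE2\x88\x83y\xE2\x88\x80x\xC2\xAC";

    print_(str);
}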

Output:

Next 3 bytes are the character ∃
Next 1 bytes are the character y
Next 3 bytes are the character ∀
Next 1 bytes are the character x
Next 2 bytes are the character ¬
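Both programs simply stop printing when mbrtowc() reports an error. A variation that collects the characters and tells the caller whether the whole string was decoded might look like the sketch below. to_wide() is a hypothetical helper, not part of the standard library, and it decodes according to whatever C locale is currently active.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>
#include <string>

// Hypothetical helper: decode a whole multibyte string in the current
// C locale. Returns false on an invalid or truncated sequence.
bool to_wide(const char* s, std::wstring& out)
{
    assert(s != nullptr);
    std::mbstate_t st = std::mbstate_t();
    const char* end = s + std::strlen(s);
    out.clear();
    while (s < end) {
        wchar_t wc;
        std::size_t rc =
            std::mbrtowc(&wc, s, static_cast<std::size_t>(end - s), &st);
        if (rc == static_cast<std::size_t>(-1) ||
            rc == static_cast<std::size_t>(-2))
            return false;  // invalid bytes, or the string ends mid-character
        if (rc == 0)
            break;         // defensive: a null character ends the input
        out.push_back(wc);
        s += rc;
    }
    return true;
}
```

In the default "C" locale, to_wide("abc", w) should return true and leave w equal to L"abc". Decoding the UTF-8 strings from the two programs additionally requires a UTF-8 locale to have been selected with setlocale().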


Improved By : shubham_singh


