How to Detect Character Encoding using mb_detect_encoding() function in PHP ?

Character encoding is an essential aspect of handling text in various programming languages including PHP. Different character encodings such as UTF-8, ISO-8859-1, and ASCII represent characters differently. Finding the correct character encoding of text is important to ensure proper data processing and display.

In PHP, the mb_detect_encoding() function allows you to detect the character encoding of a given string. This function is part of the multibyte string extension (mbstring) which must be enabled in your PHP configuration.
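Before calling any mb_* function, it can help to confirm that the extension is actually loaded. The following is a minimal sketch using PHP's built-in extension_loaded(); the variable name and the messages are illustrative only, and how mbstring gets enabled depends on your installation.

```php
<?php

// Check whether the mbstring extension is loaded in this PHP runtime
$mbstringLoaded = extension_loaded('mbstring');

if ($mbstringLoaded) {
    echo "mbstring is available.\n";
} else {
    // Without mbstring, mb_detect_encoding() is undefined and calling it is a fatal error
    echo "mbstring is not enabled.\n";
}
```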

Syntax:

mb_detect_encoding(
string $string,
array|string|null $encodings = null,
bool $strict = false
): string|false

Parameters:

  • $string: The input string whose encoding you want to detect.
  • $encodings: A list of candidate character encodings to try, in order of priority. It can be an array of encoding names or a comma-separated string. If it is omitted or null, the detection order set by mb_detect_order() is used.
  • $strict: Controls what happens when the string is not valid in any of the listed encodings. If it is false (the default), the closest matching encoding is returned; if it is true, the function returns false.

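Return Values: The mb_detect_encoding() function returns the name of the detected character encoding, or false if the encoding cannot be detected from the list of candidates. Automatic detection is a best guess and can never be entirely reliable, so supply a narrow candidate list whenever you know which encodings are plausible.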
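To make the effect of $strict concrete, the sketch below passes a byte string that is valid ISO-8859-1 but not valid ASCII or UTF-8. The byte values, variable names, and candidate lists are chosen purely for illustration.

```php
<?php

// "áéóú" encoded as ISO-8859-1: four single bytes, not valid ASCII or UTF-8
$bytes = "\xE1\xE9\xF3\xFA";

// Non-strict: the closest match among the candidates is returned
$loose = mb_detect_encoding($bytes, ['ASCII', 'UTF-8'], false);

// Strict: no candidate is valid for these bytes, so the result is false
$strict = mb_detect_encoding($bytes, ['ASCII', 'UTF-8'], true);

// With a valid candidate in the list, strict mode still reports it
$withLatin1 = mb_detect_encoding($bytes, ['ASCII', 'UTF-8', 'ISO-8859-1'], true);

var_dump($loose, $strict, $withLatin1);
```

Because false is a possible result, compare the return value with === false before using it as an encoding name.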

Approach 1: Using Default Detection Order

This approach calls mb_detect_encoding() with only the input string, so the default detection order (the one reported by mb_detect_order()) is used.

PHP

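<?php

// Sample text mixing Latin, Japanese, Chinese, and Cyrillic characters
$text = "Hi, こんにちは, 你好, привет!";

// No encoding list is passed, so the default detection order applies
$encoding = mb_detect_encoding($text);
echo "The Detected Encoding : " . $encoding;

?>
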
Output:

The Detected Encoding : UTF-8

Approach 2: Specifying Custom Encoding List

This approach passes a custom list of candidate encodings as the second argument, so only those encodings are considered during detection.

PHP

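<?php

// Sample text mixing Latin, Japanese, Chinese, and Cyrillic characters
$text = "Hi, こんにちは, 你好, привет!";

// Only these candidates are considered, in order of priority
$encoding_list = ["UTF-8", "EUC-JP", "GBK"];
$encoding = mb_detect_encoding($text, $encoding_list);
echo "The Detected Encoding : " . $encoding;

?>
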

Output:

The Detected Encoding : UTF-8
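A common next step after detection is to normalize the text to UTF-8. The sketch below is one possible way to do that, assuming a hypothetical helper named toUtf8() (not part of PHP) that combines mb_detect_encoding() in strict mode with mb_convert_encoding() and throws when no candidate fits.

```php
<?php

/**
 * Hypothetical helper: detect the encoding from a candidate list
 * and convert the text to UTF-8, or throw if detection fails.
 */
function toUtf8(string $text, array $candidates): string
{
    // Strict mode, so a false result means no candidate is valid for the input
    $encoding = mb_detect_encoding($text, $candidates, true);

    if ($encoding === false) {
        throw new RuntimeException('Could not detect the encoding.');
    }

    return mb_convert_encoding($text, 'UTF-8', $encoding);
}

// "café" encoded as ISO-8859-1 (the final byte is not valid UTF-8)
$latin1 = "caf\xE9";
$utf8 = toUtf8($latin1, ['UTF-8', 'ISO-8859-1']);

echo $utf8;
```

The order of the candidate list matters here: placing UTF-8 first means input that is already valid UTF-8 is recognized as such before any single-byte encoding is considered.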


Last Updated : 22 Sep, 2023