How to Detect Character Encoding Using the mb_detect_encoding() Function in PHP?

Last Updated : 22 Sep, 2023
Character encoding is an essential aspect of handling text in any programming language, including PHP. Encodings such as UTF-8, ISO-8859-1, and ASCII map characters to bytes differently, so the same bytes can decode to different text. Knowing the correct encoding of a string is therefore necessary for processing and displaying it properly.

In PHP, the mb_detect_encoding() function allows you to detect the character encoding of a given string. This function is part of the multibyte string extension (mbstring) which must be enabled in your PHP configuration.
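
Before relying on mb_detect_encoding(), it can help to confirm that mbstring is actually loaded. The following is a minimal sketch, not a required step; the variable names $mbstringAvailable and $order are chosen only for illustration.

```php
<?php

// extension_loaded() is a core PHP function; it reports whether mbstring is available.
$mbstringAvailable = extension_loaded('mbstring');

if ($mbstringAvailable) {
    // Called with no arguments, mb_detect_order() returns the current detection order as an array.
    $order = mb_detect_order();
    echo "Detection order: " . implode(", ", $order) . "\n";
} else {
    $order = [];
    echo "mbstring is not enabled; enable it in php.ini or install the extension.\n";
}
```

The printed detection order is the candidate list that mb_detect_encoding() falls back to when no encodings are passed explicitly.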

Syntax:

mb_detect_encoding(
string $string,
array|string|null $encodings = null,
bool $strict = false
): string|false

Parameters:

  • $string: The input string whose encoding you want to detect.
  • $encodings: The candidate character encodings to consider, in order. It can be an array of encoding names or a comma-separated string. If it is omitted or null, the detection order set by mb_detect_order() is used.
  • $strict: A boolean flag that controls what happens when the string is not valid in any of the candidate encodings. If $strict is false, the closest matching encoding is returned; if $strict is true, the function returns false.
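
The effect of $encodings and $strict can be sketched with a string that contains a byte that is invalid in UTF-8. This is an illustrative example only: the sample text and the variable names $strictResult and $widened are assumptions, not part of the original article.

```php
<?php

// "\xE9" is "é" in ISO-8859-1; followed by a space it is not a valid UTF-8 sequence.
$text = "caf\xE9 au lait";

// Candidates passed as an array. The string is valid in neither ASCII nor UTF-8,
// so strict mode reports failure instead of guessing.
$strictResult = mb_detect_encoding($text, ['ASCII', 'UTF-8'], true);
var_dump($strictResult); // bool(false)

// Candidates passed as a comma-separated string. ISO-8859-1 accepts every byte,
// so it is the only candidate left in which the string is valid.
$widened = mb_detect_encoding($text, 'ASCII, UTF-8, ISO-8859-1', true);
var_dump($widened); // string(10) "ISO-8859-1"
```

Because single-byte encodings such as ISO-8859-1 accept any byte sequence, placing them in the list means a match is always found; it does not prove that the text was really written in that encoding.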

Return Values: The mb_detect_encoding() function returns the name of the detected character encoding as a string. If the string is not valid in any of the candidate encodings, it returns false.
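
Since the function can return either a string or false, the result is best checked with a strict comparison before it is used. Below is a small sketch of one way to do that; the helper detectEncodingOrFallback() and its 'unknown' label are hypothetical and not part of mbstring.

```php
<?php

// Hypothetical helper: returns the detected encoding name, or a fallback label
// when the string is not valid in any candidate encoding.
function detectEncodingOrFallback(string $text, array $candidates, string $fallback = 'unknown'): string
{
    // Strict mode makes the function return false rather than a closest guess.
    $encoding = mb_detect_encoding($text, $candidates, true);

    // Use === so that false is not confused with any other value.
    return $encoding === false ? $fallback : $encoding;
}

$valid = detectEncodingOrFallback("Hi, こんにちは", ['ASCII', 'UTF-8']);
$invalid = detectEncodingOrFallback("caf\xE9 au lait", ['ASCII', 'UTF-8']);

echo $valid . "\n";   // UTF-8 (assuming this file is saved as UTF-8)
echo $invalid . "\n"; // unknown
```

Returning a fixed label keeps the example short; in real code you might instead throw an exception or log the failure.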

Approach 1: Using Default Detection Order

Call mb_detect_encoding() with only the input string. The candidate encodings are then taken from the default detection order, which you can inspect with mb_detect_order().

PHP




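<?php

// Mixed-script sample text; this source file is assumed to be saved as UTF-8.
$text = "Hi, こんにちは, 你好, привет!";

// No candidate list is passed, so the order from mb_detect_order() is used.
$encoding = mb_detect_encoding($text);
echo "The Detected Encoding : " . $encoding;

?>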


Output:

The Detected Encoding : UTF-8

Approach 2: Specifying Custom Encoding List

Pass a custom list of character encodings as the second argument. Only the encodings in that list are considered during detection.

PHP




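<?php

// Mixed-script sample text; this source file is assumed to be saved as UTF-8.
$text = "Hi, こんにちは, 你好, привет!";

// Only these candidate encodings are considered.
$encoding_list = ["UTF-8", "EUC-JP", "GBK"];
$encoding = mb_detect_encoding($text, $encoding_list);
echo "The Detected Encoding : " . $encoding;

?>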


Output:

The Detected Encoding : UTF-8