
What is Collation and Character Set in MySQL?

Last Updated : 07 Sep, 2020

MySQL collation is often a confusing topic for beginners. A collation is a well-defined set of rules for comparing the characters of a particular character set, based on their encodings. Each character set in MySQL has at least one collation, may have several, and has exactly one default collation. A collation belongs to exactly one character set, so two character sets cannot share a collation.

  • A character set is a set of symbols together with their encodings. A collation is a set of rules for comparing the characters in a character set. MySQL lets us store data in a variety of character sets and compare it according to a variety of collations. Character sets can be specified at the server, database, table, and column levels.
  • Suppose we have an alphabet with the letters A, B, C, D, a, b, c, d, and we assign a number to each letter: A=1, B=2, C=3, D=4, a=5, b=6, c=7, d=8. The letter A is a symbol and the number 1 is its encoding; all the letters together with their encodings form the character set. To compare the strings A, B, a, and b, the simplest approach is to compare their encodings: 1 for A, 2 for B, 5 for a, and 6 for b. Because 1 is less than 2, A sorts before B. The rule we just applied (compare the corresponding encodings) is a collation for our character set. Since it looks only at the encodings, it is the simplest kind of collation, a binary collation.
  • The character set affects not only data storage but also the communication between client programs and the MySQL server. If you want a client program to communicate with the server using a character set other than the default, you need to specify which one. For example, to use the utf8 Unicode character set, issue this statement after connecting to the server:
SET NAMES 'utf8'; 
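To see what SET NAMES changed, the session's character-set variables can be inspected. The following is a minimal sketch, assuming an open client session; utf8mb4 is used here purely as an illustrative choice:

```sql
-- Switch the connection character set (utf8mb4 is an illustrative choice).
SET NAMES 'utf8mb4';

-- List the character-set variables of the current session.
-- SET NAMES sets character_set_client, character_set_connection
-- and character_set_results together.
SHOW VARIABLES LIKE 'character_set%';
```

The other variables in the output, such as character_set_server and character_set_database, are not affected by SET NAMES.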

The following MySQL statement lists the available character sets together with their default collations:

SHOW CHARACTER SET;

                                                             

Figure: Character sets in MySQL

  • By default, the SHOW CHARACTER SET statement displays all available character sets. To list only some of them, add a LIKE or WHERE clause that selects the matching character set names. The following example shows the Unicode character sets whose names match the pattern 'utf%':
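As a sketch of the filtering just described, the two statements below should return the same rows. `Charset` is the name of the first column in the SHOW CHARACTER SET output:

```sql
-- Character sets whose name starts with "utf", filtered with LIKE.
SHOW CHARACTER SET LIKE 'utf%';

-- The same filter written with WHERE, which can test any output column.
SHOW CHARACTER SET WHERE Charset LIKE 'utf%';
```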

                                                                    

Figure: Fetching character sets using a LIKE clause

  • To list all the collations of a given character set, MySQL provides the SHOW COLLATION statement. Because collation names begin with the name of their character set, a LIKE pattern on that name works:
SHOW COLLATION LIKE 'character_set_name%';
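SHOW COLLATION also accepts a WHERE clause on its output columns. A short sketch, with latin1 and utf8mb4 chosen only as example character sets:

```sql
-- Collations whose name starts with "latin1", filtered with LIKE.
SHOW COLLATION LIKE 'latin1%';

-- All collations that belong to the utf8mb4 character set,
-- filtered on the Charset column of the output.
SHOW COLLATION WHERE Charset = 'utf8mb4';
```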


                                                                                  

Figure: Fetching collations using a WHERE clause

  • In the tables above, collation names ending with _ci are case-insensitive (ci stands for case-insensitive). Other collation suffixes include:
1.  _cs (case-sensitive)
2.  _bin (binary)
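The effect of these suffixes can be seen by comparing the same two strings under different collations. This is a minimal sketch, assuming a server that provides utf8mb4; the `_utf8mb4` introducers make the statements independent of the connection character set:

```sql
-- Case-insensitive collation: 'a' and 'A' compare as equal, so this returns 1.
SELECT _utf8mb4'a' = _utf8mb4'A' COLLATE utf8mb4_general_ci AS ci_equal;

-- Binary collation: the encoded values are compared, so this returns 0.
SELECT _utf8mb4'a' = _utf8mb4'A' COLLATE utf8mb4_bin AS bin_equal;
```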
  • Setting the character set and collation at the database level: If you don't specify a character set when creating a database, the database uses the server's default character set. To assign a specific character set and collation, state them explicitly in the CREATE DATABASE statement:
CREATE DATABASE name_of_database
CHARACTER SET character_set_name
COLLATE collation_name;

Example:

CREATE DATABASE my_database
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;

You can also change the character set and collation of an existing database with the ALTER DATABASE statement:

ALTER DATABASE database_name
CHARACTER SET character_set_name
COLLATE collation_name; 

Example:

ALTER DATABASE my_database
CHARACTER SET utf8mb4
COLLATE  utf8mb4_0900_ai_ci;
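To confirm which defaults a database ended up with, the information schema can be queried. A small sketch, assuming the my_database example above exists on the server:

```sql
-- Show the default character set and collation recorded for the database.
SELECT SCHEMA_NAME, DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = 'my_database';
```

Note that the utf8mb4_0900_ai_ci collation is available only in MySQL 8.0 and later.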

  • Setting the character set and collation at the table level: You can explicitly specify the character set and collation when creating a table. If you don't, the table uses the default character set and collation of its database.

CREATE TABLE table_name (
    ID      INT AUTO_INCREMENT NOT NULL,
    NAME    VARCHAR(20) NOT NULL,
    ADDRESS CHAR(25),
    SALARY  DECIMAL(18, 2),
    PRIMARY KEY (ID)
) CHARACTER SET character_set_name
  COLLATE collation_name;
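A concrete version of the template might look like the following sketch. The table name employees_demo and the choice of utf8mb4 with utf8mb4_unicode_ci are illustrative assumptions, not part of the original example:

```sql
-- Hypothetical table that sets its own default character set and collation.
CREATE TABLE employees_demo (
    ID      INT AUTO_INCREMENT NOT NULL,
    NAME    VARCHAR(20) NOT NULL,
    ADDRESS CHAR(25),
    SALARY  DECIMAL(18, 2),
    PRIMARY KEY (ID)
) CHARACTER SET utf8mb4
  COLLATE utf8mb4_unicode_ci;

-- Print the stored definition, including the table's character set and collation.
SHOW CREATE TABLE employees_demo;
```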
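  • If you did not set the character set and collation when the table was created, you can set them afterwards with the ALTER TABLE statement. This form changes only the table's defaults; to also convert the existing character columns, use ALTER TABLE ... CONVERT TO CHARACTER SET instead:
-- Change the table's default character set and collation.
ALTER TABLE table_name
CHARACTER SET character_set_name
COLLATE collation_name;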
  • Setting the character set and collation at the column level: A table may contain columns of many data types (VARCHAR, INT, FLOAT, and so on), but a character set and collation apply only to character string columns such as CHAR, VARCHAR, and TEXT. For each such column you can explicitly specify its own character set and collation:
ALTER TABLE table_name
MODIFY column_name   VARCHAR(25)
CHARACTER SET latin1;
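Column-level settings can also be given when the table is created. In the sketch below, the table name notes_demo and its columns are hypothetical; TITLE overrides the table default while BODY inherits it:

```sql
-- Hypothetical table: one column overrides the table-level character set.
CREATE TABLE notes_demo (
    ID    INT AUTO_INCREMENT PRIMARY KEY,
    TITLE VARCHAR(25) CHARACTER SET latin1 COLLATE latin1_general_ci,
    BODY  VARCHAR(200)
) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

-- The Collation column of the output shows the per-column result:
-- latin1_general_ci for TITLE, utf8mb4_unicode_ci for BODY, NULL for ID.
SHOW FULL COLUMNS FROM notes_demo;
```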

