What is Collation and Character Set in MySQL?
MySQL collation is often a confusing topic for people who are new to MySQL. A collation is a well-defined set of rules used to compare the characters of a particular character set by means of their encodings. Each character set in MySQL has at least one collation, one of which is its default, and may have several. Two character sets cannot share the same collation.
- A character set is a set of symbols together with their encodings. A collation is a set of rules for comparing the characters in a character set. MySQL lets us store data using a variety of character sets and perform comparisons according to a variety of collations. We can specify character sets at the server, database, table, and column level.
- Suppose we have an alphabet with the letters A, B, C, D, a, b, c, d, and we assign a number to each letter: A=1, B=2, C=3, D=4, a=5, b=6, c=7, d=8. Each letter is a symbol and its number is its encoding, so the encoding of A is 1, the encoding of B is 2, the encoding of C is 3, and so on. Together, the letters and their encodings form a character set. Now suppose we want to compare the strings A, B, a, and b. The simplest way is to compare their encodings: 1 is less than 2, so A is less than B, and likewise a (5) is less than b (6). The rule we just applied, "compare the corresponding encodings", is a collation. A collation with only this rule is the simplest kind, a binary collation. A case-insensitive collation would add a rule that treats a and A as equal.
- The character set affects not only how data is stored but also how client programs communicate with the MySQL server. If you want a client program to communicate with the server using a character set different from the default, you need to indicate which character set you are using. For example, to use the utf8 Unicode character set (in current MySQL versions utf8 is an alias for the 3-byte utf8mb3, while utf8mb4 covers all of Unicode), run this statement after connecting to the server:
SET NAMES 'utf8';
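To confirm what SET NAMES changed, one option is to inspect the session's character set variables. The sketch below assumes an open MySQL session and uses utf8mb4 purely as an illustrative choice. The values you see depend on your server and client configuration.

```sql
-- Tell the server which character set this client sends and expects back.
-- utf8mb4 is an assumption made for this example, not a requirement.
SET NAMES 'utf8mb4';

-- SET NAMES sets three session variables together:
-- character_set_client, character_set_connection, character_set_results.
SHOW VARIABLES LIKE 'character_set_%';

-- The collation used when comparing literal strings in this session.
SHOW VARIABLES LIKE 'collation_connection';
```

The other character_set_% variables in the output (for example character_set_server and character_set_database) are not changed by SET NAMES.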
To list the available character sets together with their default collations, use the following statement:
SHOW CHARACTER SET;
- By default, the SHOW CHARACTER SET statement displays all available character sets. To list only some of them, add a LIKE or WHERE clause that indicates which character set names to match. The following example shows the Unicode character sets whose names match the pattern 'utf%':
SHOW CHARACTER SET LIKE 'utf%';
- To list all collations of a given character set, MySQL provides the SHOW COLLATION statement. Collation names generally begin with the name of their character set, so a LIKE pattern on that name selects them:
SHOW COLLATION LIKE 'character_set_name%';
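As a concrete instance of the template above, the following sketch lists the collations of utf8mb4 (chosen only as an example) and then narrows the list to that character set's default collation. The WHERE form filters on the columns of the SHOW COLLATION result.

```sql
-- All collations whose names start with utf8mb4.
SHOW COLLATION LIKE 'utf8mb4%';

-- Only the default collation of utf8mb4.
-- `Default` is quoted with backticks because DEFAULT is a reserved word.
SHOW COLLATION WHERE Charset = 'utf8mb4' AND `Default` = 'Yes';
```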
- In the output of these statements, many collation names end with _ci, which stands for case-insensitive. Other common suffixes are:
1. _cs (case-sensitive), 2. _bin (binary)
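To see what these suffixes mean in practice, the sketch below compares the same strings under a binary and a case-insensitive collation. It assumes a server that provides utf8mb4_bin and utf8mb4_general_ci. The _utf8mb4 introducer marks each literal as utf8mb4 so that the COLLATE clause is valid whatever the connection character set is.

```sql
-- Code values of the two letters: 'B' is 66 and 'a' is 97.
SELECT ORD('B') AS code_B, ORD('a') AS code_a;

-- A binary collation orders by code value, so 'B' sorts before 'a'.
SELECT _utf8mb4'B' < _utf8mb4'a' COLLATE utf8mb4_bin AS b_before_a_bin;         -- 1

-- A case-insensitive collation orders 'a' with 'A', so 'B' sorts after 'a'.
SELECT _utf8mb4'B' < _utf8mb4'a' COLLATE utf8mb4_general_ci AS b_before_a_ci;   -- 0

-- Equality follows the same rules.
SELECT _utf8mb4'a' = _utf8mb4'A' COLLATE utf8mb4_general_ci AS equal_ci;        -- 1
SELECT _utf8mb4'a' = _utf8mb4'A' COLLATE utf8mb4_bin AS equal_bin;              -- 0
```

The same difference shows up in WHERE, ORDER BY, and unique indexes on columns that use these collations.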
- Setting the character set and collation at the database level: if you don’t specify a character set when you create a database, the database uses the server’s default character set. To assign a specific character set and collation, state them explicitly in the CREATE DATABASE statement:
CREATE DATABASE name_of_database CHARACTER SET character_set_name COLLATE collation_name;
CREATE DATABASE my_database CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
You can also change the character set and collation of an existing database with the ALTER DATABASE statement:
ALTER DATABASE database_name CHARACTER SET character_set_name COLLATE collation_name;
ALTER DATABASE my_database CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
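One way to check which defaults a database ended up with is to query information_schema. The sketch below uses a hypothetical database named demo_db, created only for this illustration. If a database with that name already exists, IF NOT EXISTS leaves it untouched and the query reports its existing settings.

```sql
-- demo_db is a hypothetical name used only for this example.
CREATE DATABASE IF NOT EXISTS demo_db
    CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

-- Read the database-level defaults back from the data dictionary.
SELECT SCHEMA_NAME, DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = 'demo_db';

-- The same defaults as seen from inside the database.
USE demo_db;
SELECT @@character_set_database, @@collation_database;
```

These database-level values act as defaults for tables created later. ALTER DATABASE does not rewrite the tables that already exist in the database.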
Setting the character set and collation at the table level: you can also specify the character set and collation explicitly when you create a table. If you don’t, the database’s default character set and collation are applied.
CREATE TABLE table_name ( ID INT AUTO_INCREMENT NOT NULL, NAME VARCHAR(20) NOT NULL, ADDRESS CHAR(25), SALARY DECIMAL(18, 2), PRIMARY KEY (ID) ) CHARACTER SET character_set_name COLLATE collation_name;
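With concrete values filled in, the template might look like the sketch below. The table name employees_demo and the choice of utf8mb4 with utf8mb4_unicode_ci are assumptions made for illustration. The last two statements are one way to confirm what the table actually uses.

```sql
-- employees_demo is a hypothetical table used only for this example.
CREATE TABLE employees_demo (
    id      INT AUTO_INCREMENT NOT NULL,
    name    VARCHAR(20) NOT NULL,
    address CHAR(25),
    salary  DECIMAL(18, 2),
    PRIMARY KEY (id)
) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

-- Show the full definition, including the table's default character set.
SHOW CREATE TABLE employees_demo;

-- Or read just the table collation from the data dictionary.
SELECT TABLE_NAME, TABLE_COLLATION
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'employees_demo';
```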
- If you didn’t set them when the table was created, you can change a table’s character set and collation afterwards with the ALTER TABLE statement:
ALTER TABLE table_name CHARACTER SET character_set_name COLLATE collation_name;
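Changing a table's character set with ALTER TABLE ... CHARACTER SET changes the default used for columns added later, while existing string columns keep the character set they already have. MySQL's CONVERT TO CHARACTER SET clause converts the existing columns as well. The sketch below contrasts the two on a hypothetical table named notes_demo, assuming a session with a default database selected.

```sql
-- notes_demo is a hypothetical table used only for this example.
CREATE TABLE notes_demo (
    body VARCHAR(100)
) CHARACTER SET latin1;

-- Change only the table default; the body column is expected to stay latin1.
ALTER TABLE notes_demo CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
SHOW FULL COLUMNS FROM notes_demo;

-- Convert the existing string columns (and the table default) to utf8mb4.
ALTER TABLE notes_demo CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
SHOW FULL COLUMNS FROM notes_demo;
```

The Collation column of SHOW FULL COLUMNS makes the difference visible: it should report a latin1 collation after the first ALTER and utf8mb4_unicode_ci after the second.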
- Setting the character set and collation at the column level: a table may contain columns of different data types (VARCHAR, INT, FLOAT, and so on), but a character set and collation apply only to string columns such as CHAR, VARCHAR, and TEXT. You can specify them explicitly for each such column:
ALTER TABLE table_name MODIFY column_name VARCHAR(25) CHARACTER SET latin1;
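To check the result of a column-level change, one option is to read the column metadata from information_schema. The sketch below uses a hypothetical table named people_demo and assumes the latin1_bin collation is available on the server.

```sql
-- people_demo is a hypothetical table used only for this example.
CREATE TABLE people_demo (
    id   INT PRIMARY KEY,
    name VARCHAR(25),
    code VARCHAR(25)
) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

-- Give one column its own character set and collation.
ALTER TABLE people_demo
    MODIFY code VARCHAR(25) CHARACTER SET latin1 COLLATE latin1_bin;

-- name should still follow the table default and code should show latin1;
-- id is not a string column, so both of its values are NULL.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'people_demo';
```

Note that MODIFY redefines the whole column, so any attributes the column had (such as NOT NULL or a default value) need to be repeated in the statement.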