Explain the different kinds of character sets available in HTML

Last Updated : 18 May, 2022

Before looking at the different kinds of character sets available in HTML, let us first see what a character set in HTML actually is.

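HTML Character Sets: Have you ever wondered how the browser displays digits, letters, and other symbols correctly? A file is stored as a sequence of bytes, and a character set (also called a character encoding) is the rule that maps those bytes to characters. The browser needs to know which character set a page uses in order to display its text correctly.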

The character set is declared with the charset attribute of a <meta> tag, placed inside the <head> of the document:

<meta charset="UTF-8">
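To see why the declared character set matters, the short Python sketch below decodes the same bytes under two different encodings. Python is used here only for illustration, and the sample text is an assumption. A browser performs a comparable decoding step based on the charset it is given.

```python
# Sample text (an assumption for illustration) containing a non-ASCII letter.
text = "café"

# A file saved as UTF-8 stores "é" as two bytes.
raw = text.encode("utf-8")

# Decoding with the matching encoding restores the original text.
decoded_ok = raw.decode("utf-8")

# Decoding with a different encoding produces garbled characters.
decoded_wrong = raw.decode("windows-1252")

print(raw)            # b'caf\xc3\xa9'
print(decoded_ok)     # café
print(decoded_wrong)  # cafÃ©
```

The declared charset should therefore match the encoding in which the file was actually saved.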

Different kinds of character sets available in HTML

Several character sets have been used on the web over time. Let’s look at the most common ones.

ASCII: ASCII (American Standard Code for Information Interchange) was the first widely adopted character encoding. It defines 128 characters: the digits (0-9), the lowercase (a-z) and uppercase (A-Z) English letters, punctuation and special characters such as + - $ ( ) @, and a set of non-printing control characters. It is limited to 128 characters because each character is stored in only 7 bits. The disadvantage of ASCII is that it excludes non-English letters.

Syntax:

<meta charset="ASCII">

The table below shows some of the 128 ASCII characters and their equivalent numbers:

Char      Number    Description
(space)   32        Space
!         33        Exclamation mark
"         34        Quotation mark
#         35        Hash sign
$         36        Dollar sign
%         37        Percent sign
&         38        Ampersand
'         39        Apostrophe
(         40        Left parenthesis
)         41        Right parenthesis
*         42        Asterisk
2         50        Number 2
3         51        Number 3
4         52        Number 4
A         65        Uppercase A
B         66        Uppercase B
K         75        Uppercase K
Y         89        Uppercase Y
Z         90        Uppercase Z
a         97        Lowercase a
b         98        Lowercase b
k         107       Lowercase k
y         121       Lowercase y
z         122       Lowercase z
~         126       Tilde
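The mapping between characters and numbers can be checked with a few lines of Python. This is a minimal sketch; the helper name is_ascii_encodable is hypothetical and chosen only for this illustration.

```python
# ord() gives the number of a character; chr() goes the other way.
print(ord("A"))   # 65
print(chr(97))    # a
print(ord("~"))   # 126

def is_ascii_encodable(text):
    """Return True if every character in text has an ASCII code (0-127)."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

print(is_ascii_encodable("GeeksforGeeks"))  # True
print(is_ascii_encodable("Ë"))              # False
```

The last line illustrates the limitation described above: a letter such as Ë has no ASCII code.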

Example: This example declares the ASCII character set and displays a few ASCII characters.

HTML




<!DOCTYPE html>
<html>

<head>
    <!-- Declare the character set of this document -->
    <meta charset="ASCII">
    <title>ASCII character set</title>
    <link rel="stylesheet" href="style.css">
</head>

<body>
    <div>
        <p>GeeksforGeeks</p>
        <p>ASCII character set</p>
        <p>! , [ , A</p>
    </div>
</body>

</html>



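ISO-8859-1: ISO-8859-1 (also known as Latin-1) is generally described as the default character set of HTML 4. It uses 8 bits per character and so supports 256 different character codes. It belongs to a family of standard character sets for different languages and alphabets defined by the ISO (International Organization for Standardization). It is an extension of ASCII with some additional international characters. For values 0 to 127, ISO-8859-1 is identical to ASCII, and for values 160 to 255 its characters have the same numbers as the corresponding Unicode code points (U+00A0 to U+00FF). The stored bytes are not the same as in UTF-8, however, because UTF-8 uses two bytes for each of these characters.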

Note: ISO-8859-1 does not assign printable characters to the values 128 to 159.
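The relationship between ISO-8859-1 byte values and character numbers can be illustrated with a short Python sketch. The sample characters are taken from this article; the variable names are assumptions made for the illustration.

```python
# In ISO-8859-1 each character is one byte whose value equals its number.
samples = "¢©Ëþ"
for ch in samples:
    encoded = ch.encode("iso-8859-1")
    print(ch, ord(ch), list(encoded))

# The same character stored as UTF-8 takes two bytes instead of one.
copyright_latin1 = list("©".encode("iso-8859-1"))
copyright_utf8 = list("©".encode("utf-8"))
print(copyright_latin1)  # [169]
print(copyright_utf8)    # [194, 169]
```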

Syntax: 

<meta charset="ISO-8859-1">

The table below shows some of the ISO-8859-1 characters with their entity names and entity numbers:

Character   Entity Name   Entity Number   Description
¢           &cent;        &#162;          Cent sign
¦           &brvbar;      &#166;          Broken vertical bar
©           &copy;        &#169;          Copyright sign
®           &reg;         &#174;          Registered trademark sign
¼           &frac14;      &#188;          Fraction one quarter
Ë           &Euml;        &#203;          Capital E with umlaut mark
à           &agrave;      &#224;          Small a with grave accent
þ           &thorn;       &#254;          Small thorn (Icelandic)
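Entity names and entity numbers can be cross-checked with Python's standard html module. The sketch below is only an illustration, and the helper name entity_forms is hypothetical.

```python
import html
from html.entities import name2codepoint

# Named, decimal, and hexadecimal references resolve to the same character.
print(html.unescape("&copy; &#169; &#xA9;"))  # © © ©

def entity_forms(name):
    """Return the named and the numeric reference for a named entity."""
    number = name2codepoint[name]
    return f"&{name};", f"&#{number};"

for name in ["cent", "brvbar", "frac14", "Euml", "agrave", "thorn"]:
    named, numeric = entity_forms(name)
    print(named, numeric, html.unescape(named))
```

Entity references consist of ASCII characters only, so they work regardless of the charset the page declares.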

Example: This example declares the ISO-8859-1 character set and displays a few characters from it.

HTML




<!DOCTYPE html>
<html>

<head>
    <!-- Declare the character set of this document -->
    <meta charset="ISO-8859-1">
    <title>ISO-8859-1 character set</title>
    <link rel="stylesheet" href="style.css">
</head>

<body>
    <div>
        <p>GeeksforGeeks</p>
        <p>ISO-8859-1 character set</p>
        <p>Ë , ¦ , þ</p>
    </div>
</body>

</html>



ANSI (Windows-1252): Windows-1252, informally called ANSI, was the default character set in Windows up to Windows 95 and the most popular character set in Windows from about 1985 to 1990. It is an extension of the ASCII character set and almost identical to ISO-8859-1. The difference lies in the values 128 to 159, most of which Windows-1252 assigns to printable characters such as the euro sign and curly quotation marks. It uses 8 bits per character, which allows 256 different values. This character set is supported by almost all browsers.

Syntax:

<meta charset="windows-1252">

The table below shows some of the ANSI (Windows-1252) characters and their equivalent numbers:

Character   Number   Entity Name   Description
!           33                     Exclamation mark
&           38       &amp;         Ampersand
0           48                     Digit zero
G           71                     Latin uppercase letter G
¼           188      &frac14;      Vulgar fraction one quarter
©           169      &copy;        Copyright sign
þ           254      &thorn;       Latin small letter thorn
ø           248      &oslash;      Latin small letter o with stroke
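Windows-1252 and ISO-8859-1 agree on most values, and a short Python sketch can locate where they differ. The codec names below are the ones Python uses; the variable names are assumptions made for this illustration.

```python
differences = []   # byte values that decode differently in the two sets
unassigned = []    # byte values that Windows-1252 leaves without a character

for value in range(256):
    byte = bytes([value])
    latin1_char = byte.decode("iso-8859-1")
    try:
        cp1252_char = byte.decode("windows-1252")
    except UnicodeDecodeError:
        cp1252_char = None
        unassigned.append(value)
    if latin1_char != cp1252_char:
        differences.append(value)

print(min(differences), max(differences))    # 128 159
print(len(differences), len(unassigned))     # 32 5
print(bytes([128]).decode("windows-1252"))   # €
```

Outside the range 128 to 159 the two character sets decode every byte to the same character.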

Example: This example declares the ANSI (Windows-1252) character set and displays a few characters from it.

HTML




<!DOCTYPE html>
<html>

<head>
    <!-- "windows-1252" is the label browsers recognize; "ANSI" is only an informal name -->
    <meta charset="windows-1252">
    <title>ANSI (Windows-1252) character set</title>
    <link rel="stylesheet" href="style.css">
</head>

<body>
    <div>
        <p>GeeksforGeeks</p>
        <p>ANSI (Windows-1252) character set</p>
        <p>ø , ¼ , þ</p>
    </div>
</body>

</html>



UTF-8: The issue with the other character sets is that they are limited in size and are not compatible in a multilingual environment. The Unicode Standard, developed by the Unicode Consortium, covers almost all characters, punctuation, and symbols. Its most widely used encodings are UTF-8 and UTF-16. UTF-8 is a variable-width encoding that stores each character in one to four bytes and represents the ASCII characters with the same single bytes as ASCII. The HTML5 specification encourages developers to use the UTF-8 character set.

Syntax: 

<meta charset="UTF-8">

The table below lists some of the Unicode blocks whose characters can be used in an HTML5 document encoded as UTF-8:

Block              Hexadecimal   Decimal
Latin Extended-A   0100-017F     256-383
Greek and Coptic   0370-03FF     880-1023
Arrows             2190-21FF     8592-8703
Block Elements     2580-259F     9600-9631
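As a rough illustration of how these ranges relate to UTF-8, the Python sketch below looks up the block of a character and counts the bytes UTF-8 needs for it. The BLOCKS dictionary and the block_of helper are hypothetical names that simply restate the ranges from the table.

```python
# Block ranges restated from the table (inclusive code point bounds).
BLOCKS = {
    "Latin Extended-A": (0x0100, 0x017F),
    "Greek and Coptic": (0x0370, 0x03FF),
    "Arrows": (0x2190, 0x21FF),
    "Block Elements": (0x2580, 0x259F),
}

def block_of(ch):
    """Return the name of the listed block containing ch, or None."""
    code_point = ord(ch)
    for name, (first, last) in BLOCKS.items():
        if first <= code_point <= last:
            return name
    return None

# Print each character, its number, its UTF-8 byte count, and its block.
for ch in ["A", "Ā", "Ͷ", "←", "█"]:
    encoded = ch.encode("utf-8")
    print(ch, ord(ch), len(encoded), block_of(ch))
```

The letter A takes one byte, Ā and Ͷ take two bytes each, and ← and █ take three bytes each.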

Example: This example declares the UTF-8 character set and displays a few characters from the Unicode blocks listed above.

HTML




<!DOCTYPE html>
<html>

<head>
    <!-- Declare the character set of this document -->
    <meta charset="UTF-8">
    <title>UTF-8 character set</title>
    <link rel="stylesheet" href="style.css">
</head>

<body>
    <div>
        <p>GeeksforGeeks</p>
        <p>UTF-8 character set</p>
        <p>Ͷ , ← , Ā</p>
    </div>
</body>

</html>

