HTML | URL Encoding

A Uniform Resource Locator (URL) is the address used to reach content on the web, such as www.geeksforgeeks.org. Only certain characters are allowed in a URL: the letters A-Z and a-z, the digits 0-9, and a few special characters. These can be used as they are; any other character must first be encoded into a suitable form.
URL encoding is the process of converting a URL into a valid format that web browsers and servers accept. It replaces each character that is not allowed with a % sign followed by two hexadecimal digits. These two digits represent the numeric value of the character in the ASCII character set. For example, a space is not allowed in a URL and is replaced by '%20' (or by a '+' sign in form data and query strings). Similarly, a $ sign is replaced by '%24'.
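
As a rough illustration of the rule above, the following Python sketch percent-encodes a string by hand and compares the result with the standard library's `urllib.parse.quote`. The helper name `percent_encode` and the sample strings are made up for this example, and the set of characters left untouched (letters, digits and `-`, `_`, `.`, `~`) is a simplifying assumption, not the full rule set a browser applies.

```python
from urllib.parse import quote, quote_plus

# Characters left as-is in this sketch (an assumption for illustration).
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_.~"
)

def percent_encode(text: str) -> str:
    """Replace every character outside UNRESERVED with %XX, one per byte."""
    out = []
    for ch in text:
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # Encode to bytes first so non-ASCII input also works.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)

print(percent_encode("a b"))       # a%20b
print(percent_encode("price=$5"))  # price%3D%245
print(quote("a b", safe=""))       # a%20b (standard library equivalent)
print(quote_plus("a b"))           # a+b (form-style encoding of a space)
```

In practice a library function such as `quote` is usually the better choice; the hand-written version is only meant to make the "% plus two hex digits" rule visible.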

Reserved Characters: Some characters can have a special meaning in a URL, so they can be used in two ways. For example, the '/' character is reserved: written as it is, it acts as a delimiter that separates the path segments of a URL. When it is meant as ordinary data rather than as a delimiter, it must be encoded as '%2F'.
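
The difference between a delimiter and data can be seen in a short Python sketch. The path segments and the helper name `build_path` are invented for this example; `urllib.parse.quote` is called with `safe=""` so that a '/' inside a segment is encoded, while the '/' that joins the segments is left alone.

```python
from urllib.parse import quote, unquote

def build_path(segments):
    """Join path segments with '/', encoding any '/' that is part of the data."""
    return "/" + "/".join(quote(seg, safe="") for seg in segments)

path = build_path(["docs", "AC/DC", "intro"])
print(path)  # /docs/AC%2FDC/intro

# Splitting on the delimiter and then decoding recovers the original segments.
print([unquote(part) for part in path.split("/")[1:]])  # ['docs', 'AC/DC', 'intro']
```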

The reserved characters and their encoded forms are listed below:

Character Encoded Form
! %21
* %2A
' %27
( %28
) %29
; %3B
: %3A
@ %40
& %26
= %3D
+ %2B
$ %24
, %2C
/ %2F
? %3F
# %23
[ %5B
] %5D

Some characters need to be encoded and some do not. The following classification shows which groups of characters need to be encoded.

  • Safe Characters: Alphanumeric characters (0-9, a-z and A-Z), the special characters $, -, _, ., +, !, *, ', (, ), and reserved characters used for their reserved purposes. These characters do not need to be encoded.
  • ASCII Control characters: These include the characters in the range 00-1F hex (0-31 decimal) and 7F (127 decimal). These characters need to be encoded.
  • Non-ASCII characters: These include the range 80-FF hex (128-255 decimal). These characters need to be encoded.
  • Reserved characters: These characters are used for special purposes and require encoding when they are not used for those purposes.
  • Unsafe characters: These characters can be misunderstood within URLs for various reasons, so they require encoding. For example, the characters < and > are unsafe because they are used as delimiters around URLs in free text, and the quote mark (") is unsafe because it is used to delimit URLs in some systems.
    The unsafe characters are listed below:

    Character Encoded Form
    space %20
    " %22
    < %3C
    > %3E
    # %23
    % %25
    { %7B
    } %7D
    | %7C
    \ %5C
    ^ %5E
    ~ %7E
    [ %5B
    ] %5D
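
The classification above can be sketched in Python as a small encoder that only touches ASCII control bytes, bytes in the 80-FF range, and the unsafe characters from the table. The names `UNSAFE`, `is_ascii_control`, `is_non_ascii` and `encode_unsafe` are hypothetical, and reserved characters are deliberately left alone here, which is a simplification. The encoding is done by hand because `urllib.parse.quote` in recent Python versions leaves `~` unencoded, whereas the table above lists it as %7E.

```python
# Unsafe characters as listed in the table above.
UNSAFE = ' "<>#%{}|\\^~[]'

def is_ascii_control(code: int) -> bool:
    """True for 00-1F hex and 7F hex."""
    return code <= 0x1F or code == 0x7F

def is_non_ascii(code: int) -> bool:
    """True for 80-FF hex."""
    return 0x80 <= code <= 0xFF

def encode_unsafe(text: str) -> str:
    """Percent-encode control, non-ASCII and unsafe bytes; keep the rest."""
    out = []
    # Work on UTF-8 bytes (an assumption), so every value is in 00-FF.
    for byte in text.encode("utf-8"):
        if is_ascii_control(byte) or is_non_ascii(byte) or chr(byte) in UNSAFE:
            out.append(f"%{byte:02X}")
        else:
            out.append(chr(byte))
    return "".join(out)

print(encode_unsafe('<a href="x y">'))  # %3Ca%20href=%22x%20y%22%3E
print(encode_unsafe("50% off"))         # 50%25%20off
print(encode_unsafe("tab\there"))       # tab%09here
```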

A complete list of URL-encoded characters is given below:

CHARACTER ENCODED FORM
backspace %08
tab %09
linefeed %0A
carriage return %0D
space %20
! %21
" %22
# %23
$ %24
% %25
& %26
' %27
( %28
) %29
* %2A
+ %2B
, %2C
- %2D
. %2E
/ %2F
0 %30
1 %31
2 %32
3 %33
4 %34
5 %35
6 %36
7 %37
8 %38
9 %39
: %3A
; %3B
< %3C
= %3D
> %3E
? %3F
@ %40
A %41
B %42
C %43
D %44
E %45
F %46
G %47
H %48
I %49
J %4A
K %4B
L %4C
M %4D
N %4E
O %4F
P %50
Q %51
R %52
S %53
T %54
U %55
V %56
W %57
X %58
Y %59
Z %5A
[ %5B
\ %5C
] %5D
^ %5E
_ %5F
` %60
a %61
b %62
c %63
d %64
e %65
f %66
g %67
h %68
i %69
j %6A
k %6B
l %6C
m %6D
n %6E
o %6F
p %70
q %71
r %72
s %73
t %74
u %75
v %76
w %77
x %78
y %79
z %7A
{ %7B
| %7C
} %7D
~ %7E
delete %7F
€ %E2%82%AC
 %81
‚ %E2%80%9A
ƒ %C6%92
„ %E2%80%9E
… %E2%80%A6
† %E2%80%A0
‡ %E2%80%A1
ˆ %CB%86
‰ %E2%80%B0
Š %C5%A0
‹ %E2%80%B9
Œ %C5%92
 %C5%8D
Ž %C5%BD
 %8F
 %C2%90
‘ %E2%80%98
’ %E2%80%99
“ %E2%80%9C
” %E2%80%9D
• %E2%80%A2
– %E2%80%93
— %E2%80%94
˜ %CB%9C
™ %E2%84%A2
š %C5%A1
› %E2%80%BA
œ %C5%93
 %9D
ž %C5%BE
Ÿ %C5%B8
non-breaking space %C2%A0
¡ %C2%A1
¢ %C2%A2
£ %C2%A3
¤ %C2%A4
¥ %C2%A5
¦ %C2%A6
§ %C2%A7
¨ %C2%A8
© %C2%A9
ª %C2%AA
« %C2%AB
¬ %C2%AC
­ %C2%AD
® %C2%AE
¯ %C2%AF
° %C2%B0
± %C2%B1
² %C2%B2
³ %C2%B3
´ %C2%B4
µ %C2%B5
¶ %C2%B6
· %C2%B7
¸ %C2%B8
¹ %C2%B9
º %C2%BA
» %C2%BB
¼ %C2%BC
½ %C2%BD
¾ %C2%BE
¿ %C2%BF
À %C3%80
Á %C3%81
Â %C3%82
à %C3%83
Ä %C3%84
Å %C3%85
Æ %C3%86
Ç %C3%87
È %C3%88
É %C3%89
Ê %C3%8A
Ë %C3%8B
Ì %C3%8C
Í %C3%8D
Î %C3%8E
Ï %C3%8F
Ð %C3%90
Ñ %C3%91
Ò %C3%92
Ó %C3%93
Ô %C3%94
Õ %C3%95
Ö %C3%96
× %C3%97
Ø %C3%98
Ù %C3%99
Ú %C3%9A
Û %C3%9B
Ü %C3%9C
Ý %C3%9D
Þ %C3%9E
ß %C3%9F
à %C3%A0
á %C3%A1
â %C3%A2
ã %C3%A3
ä %C3%A4
å %C3%A5
æ %C3%A6
ç %C3%A7
è %C3%A8
é %C3%A9
ê %C3%AA
ë %C3%AB
ì %C3%AC
í %C3%AD
î %C3%AE
ï %C3%AF
ð %C3%B0
ñ %C3%B1
ò %C3%B2
ó %C3%B3
ô %C3%B4
õ %C3%B5
ö %C3%B6
÷ %C3%B7
ø %C3%B8
ù %C3%B9
ú %C3%BA
û %C3%BB
ü %C3%BC
ý %C3%BD
þ %C3%BE
ÿ %C3%BF
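
The multi-part entries near the end of the table come from encoding the character as UTF-8 and then percent-encoding each resulting byte. A small Python sketch, assuming UTF-8 (the default for `urllib.parse.quote`), shows this for a few sample characters picked from the table.

```python
from urllib.parse import quote, unquote

for ch in ["€", "é", "ÿ", "©"]:
    utf8_bytes = ch.encode("utf-8")
    # Show the character, its UTF-8 bytes in hex, and the encoded form.
    print(ch, [f"{b:02X}" for b in utf8_bytes], quote(ch))
# € ['E2', '82', 'AC'] %E2%82%AC
# é ['C3', 'A9'] %C3%A9
# ÿ ['C3', 'BF'] %C3%BF
# © ['C2', 'A9'] %C2%A9

# Decoding reverses the process.
print(unquote("%E2%82%AC"))  # €
```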

