Unicode

In this tutorial, we are going to discuss Unicode. As we all know, a computer only understands numbers, so it assigns a number to every character. Unfortunately, early encoding methods such as ASCII covered only the English alphabet. When text contained letters or symbols from other languages, the computer could not interpret them and printed a block or square symbol instead, which means nothing.

Unicode Characters

Unicode made this easier. It is a standard that assigns a unique number, called a code point, to every symbol, special character, emoji, and letter of every language, so that any of them can be stored and displayed.

As of version 14.0, Unicode has 144,697 characters, and they are available on computers, supercomputers, tablets, and mobile phones. Some of these characters are listed below.

Unicode Characters Table

Some Unicode characters are blank spaces of different widths, and we can use them in place of the ordinary space key. Such a character is a hidden space in your text, and you can use it to send or type an invisible letter.

A few characters and their Unicode code points are listed below.

Name                       Character  Unicode
Space                      [ ]        U+0020
Exclamation mark           !          U+0021
Quotation mark             "          U+0022
Hash, Sharp                #          U+0023
Dollar sign                $          U+0024
Percent sign               %          U+0025
Ampersand                  &          U+0026
Left parenthesis           (          U+0028
Right parenthesis          )          U+0029
Asterisk                   *          U+002A
Hair Space                 [ ]        U+200A
Narrow no-break Space      [ ]        U+202F
Medium Mathematical Space  [ ]        U+205F
At sign                    @          U+0040
Semicolon                  ;          U+003B
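
To check a code point yourself, here is a small Python sketch. It relies only on the built-in `ord` function and the standard `unicodedata` module; the helper name `describe` is made up for this example and is not part of any library.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    code_point = ord(ch)  # the number Unicode assigns to the character
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{code_point:04X} {name}"


# A few characters from the table, including the invisible spaces.
for ch in [" ", "!", "@", ";", "\u200a", "\u202f", "\u205f"]:
    print(describe(ch))
```

The invisible spaces are written as escape sequences such as `"\u200a"` so that they stay visible in the source code.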

UTF & Its Types

Unicode is called the universal standard for character encoding.

Unicode uses UTF, which stands for “Unicode Transformation Format”.

A UTF defines how Unicode code points are represented as bytes. There are 4 types of UTF, which are explained below.

UTF-7

UTF-7 was designed to represent Unicode text using only ASCII characters. As we know, ASCII is a 7-bit encoding, so UTF-7 was mainly used in emails and texts that pass through systems able to handle only 7-bit data. It is rarely used today.
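
As a rough illustration, the sketch below uses Python's built-in `utf-7` codec to encode a string that contains a non-ASCII character (the euro sign). The helper name `to_utf7` and the sample message are made up for this example. It only checks that every output byte fits in 7 bits and that decoding restores the original text.

```python
def to_utf7(text: str) -> bytes:
    """Encode text with Python's built-in UTF-7 codec."""
    return text.encode("utf-7")


message = "Price: 5 \u20ac"  # \u20ac is the euro sign
encoded = to_utf7(message)

print(encoded)                              # the raw UTF-7 bytes
print(all(byte < 128 for byte in encoded))  # every byte fits in 7 bits
print(encoded.decode("utf-7") == message)   # decoding restores the text
```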

UTF-8

UTF-8 is the most widely used encoding scheme. It is a variable-length encoding, which means it can use from 1 to 4 bytes to represent a character.

Let’s see how it works:

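  • It takes 1 byte for English letters, digits, and basic symbols (the ASCII range).
  • 2 bytes for accented Latin letters and for alphabets such as Greek, Cyrillic, Hebrew, and Arabic.
  • 3 bytes for most Asian scripts, such as Chinese, Japanese, and Korean.
  • And 4 bytes for other special symbols, such as emoji.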
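
The byte counts above can be checked with a short Python sketch. The helper name `utf8_length` and the sample characters are chosen only for illustration; the characters are written as escape sequences so the code does not depend on the editor's own encoding.

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 needs for one character."""
    return len(ch.encode("utf-8"))


samples = {
    "A": "English letter",
    "\u00e9": "Latin letter e with acute accent",
    "\u0639": "Arabic letter ain",
    "\u4e2d": "Chinese character",
    "\U0001F600": "emoji",
}

for ch, label in samples.items():
    print(f"U+{ord(ch):04X} {label}: {utf8_length(ch)} byte(s)")
```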

UTF-16

UTF-16 is a variable-length encoding built on 16-bit units. It uses 2 bytes for each of the first 65,536 code points, and 4 bytes (two 16-bit units, called a surrogate pair) for the remaining special characters, such as emoji.
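
A small Python sketch can show the 16-bit units that UTF-16 produces. The helper name `utf16_units` is made up for this example, and the big-endian variant `utf-16-be` is used so that no byte order mark is added to the output.

```python
def utf16_units(ch: str) -> list:
    """Split the UTF-16 encoding of a character into 16-bit units."""
    data = ch.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


for ch in ["A", "\u4e2d", "\U0001F600"]:
    units = utf16_units(ch)
    shown = " ".join(f"{unit:04X}" for unit in units)
    print(f"U+{ord(ch):04X}: {len(units) * 2} bytes ({shown})")
```

The letter and the Chinese character fit in one unit each, while the emoji needs two units.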


UTF-32

UTF-32 is a fixed-length encoding scheme. It uses 4 bytes for every character, including special characters. This makes it simple to process, but for most text it takes more space than UTF-8 or UTF-16.
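
To see this in practice, the Python sketch below encodes a few characters with the built-in `utf-32-be` codec (big-endian, so no byte order mark is added). The helper name `utf32_bytes` is made up for this example.

```python
def utf32_bytes(ch: str) -> bytes:
    """Encode one character as UTF-32 (big-endian, no byte order mark)."""
    return ch.encode("utf-32-be")


for ch in ["A", "\u00e9", "\u4e2d", "\U0001F600"]:
    data = utf32_bytes(ch)
    # The 4 bytes are simply the code point written as a 32-bit number.
    print(f"U+{ord(ch):04X}: {len(data)} bytes ({data.hex()})")
```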

Conclusion

In this tutorial, we talked about Unicode. We discussed what Unicode and UTF are, and then we discussed the types of UTF.
