Characters & their encoding
Computers communicate in binary, a language of 0s and 1s called bits. Since the alphabet contains more than two letters, a single bit cannot represent a letter. A byte is a sequence of eight bits. A word is the number of bits that a particular CPU manipulates as a unit; today most CPUs have a word size of 32 or 64 bits.
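As an illustrative sketch in Python, the numeric code of a single character can be written out as the eight bits of one byte:

```python
# The letter 'A' fits in one byte; inspect its underlying bit pattern.
value = ord("A")              # numeric code of 'A' (65)
bits = format(value, "08b")   # the same value written as 8 bits
print(value, bits)            # 65 01000001
```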
An alphanumeric code is a scheme for representing letters and numbers (hence the name) in a format that can be processed by a computer. The most widely used alphanumeric code is the American Standard Code for Information Interchange (ASCII), a 7-bit code that assigns a standard numeric code to each character, covering upper- and lower-case English letters, digits, and punctuation symbols. Another such code is the Extended Binary Coded Decimal Interchange Code (EBCDIC), an 8-bit code capable of representing up to 256 alphanumeric characters. A third, more comprehensive character representation system is Unicode, which assigns a code point to characters from virtually all of the world's writing systems.
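A brief sketch of the ASCII mapping, using Python's built-in `ord` and `chr` functions to move between characters and their numeric codes:

```python
# Each ASCII character has a numeric code in the range 0-127 (7 bits).
for ch in ["A", "a", "0", "!"]:
    print(ch, ord(ch))        # A 65, a 97, 0 48, ! 33

# chr() performs the reverse mapping, from code back to character.
print(chr(72), chr(105))      # H i
```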
UTF-16 (16-bit Unicode Transformation Format) is a standard method of encoding Unicode character data. It maps each Unicode code point to a sequence of one or two 16-bit units: characters in the Basic Multilingual Plane are encoded as a single two-byte unit, while characters outside it are encoded as a surrogate pair of two units (four bytes).
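This one-or-two-unit behaviour can be seen directly by encoding characters with Python's `utf-16-be` codec (big-endian, so the byte order is easy to read):

```python
# 'A' (U+0041) lies in the Basic Multilingual Plane: one 16-bit unit.
print("A".encode("utf-16-be").hex())           # 0041 (2 bytes)

# U+1F600 lies outside the BMP: a surrogate pair of two 16-bit units.
print("\U0001F600".encode("utf-16-be").hex())  # d83dde00 (4 bytes)
```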