Java Character Set and Tokens
A program is written using a set of characters that the computer's hardware and software can recognize. The character set of a language defines the basic elements that programs written in that language may contain. ASCII and Unicode are two widely used character encoding standards. ASCII uses seven bits per character, which gives 128 characters; later eight-bit extensions added another 128. Unicode is a universal character set that aims to represent almost all the characters of almost all written languages. It was originally designed as a two-byte (16-bit) character set, but it has since grown beyond 65,536 characters, so a Java char holds one 16-bit UTF-16 code unit and some characters need two of them.
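
The difference between a 16-bit char and a full Unicode character can be seen in a short sketch. The class name UnicodeDemo and its helper methods are illustrative assumptions, not part of the original text; the calls used (String.length, String.codePointCount) are standard library methods.

```java
public class UnicodeDemo {
    // Number of 16-bit UTF-16 code units (Java chars) in the string.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points (whole characters) in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String ascii = "A";            // U+0041, the same value as in ASCII
        String hindi = "\u0905";       // DEVANAGARI LETTER A, fits in one char
        String emoji = "\uD83D\uDE00"; // U+1F600, needs two chars (a surrogate pair)

        System.out.println((int) ascii.charAt(0));                       // 65
        System.out.println(codeUnits(hindi) + " " + codePoints(hindi));  // 1 1
        System.out.println(codeUnits(emoji) + " " + codePoints(emoji));  // 2 1
    }
}
```

Characters in the original 16-bit range occupy a single char, while characters added later take two, which is why length() and the code point count can differ.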
Java uses the Unicode character set because Unicode is intended to represent the writing systems of all of the world's major languages. In a Java program, characters are grouped into symbols called tokens. Java has five types of tokens: reserved keywords, identifiers, literals, operators and separators. Some words have a predefined meaning in the language and cannot be used as names for variables, classes or methods; these are known as “reserved” words (for example class, int and while). Identifiers are the names programmers give to things in Java, such as variables, methods, fields, classes, interfaces, exceptions and packages. Literals are sequences of characters that represent constant values, such as 42, 'A' or "hello", which can be stored in variables. Operators are tokens made of special symbols with a predefined meaning, such as + and ==. Separators, such as parentheses, braces, semicolons and commas, help define the structure of a program.
A data type in a programming language is a set of values with predefined characteristics. Java supports two kinds of data types: primitive data types and reference data types.
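
The five token types and the two kinds of data types can be illustrated with a minimal sketch. The class TokenDemo and its method names are hypothetical, chosen only for this example; the comments label each token in one declaration and then contrast how primitive and reference variables behave on assignment.

```java
public class TokenDemo {
    // Primitive type: the variable holds the value itself,
    // so changing the copy leaves the original untouched.
    static int primitiveCopy() {
        // int   -> reserved keyword
        // count -> identifier
        // =     -> operator
        // 10    -> literal
        // ;     -> separator
        int count = 10;
        int copy = count;
        copy = copy + 1;
        return count;
    }

    // Reference type: the variable holds a reference to an object,
    // so two variables can refer to the same array.
    static int sharedReference() {
        int[] first = {1, 2, 3};
        int[] second = first;   // no new array is created here
        second[0] = 99;         // the change is visible through first
        return first[0];
    }

    public static void main(String[] args) {
        System.out.println(primitiveCopy());   // 10
        System.out.println(sharedReference()); // 99
    }
}
```

Arrays are used here as the reference type because they need no extra class definitions; objects of any class behave the same way on assignment.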