Next: Type conversions. Up: Primitive types. Previous: Floating-point primitive types.


The character type

Textbook: Section 9.4

Unicode

Java has only one primitive character type: char. In older languages, the character type is usually an 8-bit type holding ASCII values. (Or values from another standard, on some odd computers that few people use anymore.) But in Java, char is a 16-bit type, using the Unicode representation.

Unicode is a newer standard that was originally designed around 16-bit values. Going from 8 bits to 16 raises the number of representable values from 256 to 65,536. This permits the inclusion of many alphabets, enabling a Java program to work across many languages. Unicode keeps ASCII for its first 128 values, so the familiar letters, digits, and punctuation have the same codes as before. Beyond those, it includes many other characters, including the Cyrillic, Greek, Hebrew, Arabic, and Cherokee alphabets and many others, as well as many odd symbols. (Unicode has since grown past 16 bits. A Java char still holds a single 16-bit value, and a character outside that range takes two chars.)
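To make the numbers concrete, here is a small sketch that prints the size of a char and the Unicode values of a few characters. The class name CharValues and the helper codeOf are made up for this illustration, and it uses System.out rather than the IO class so that it compiles on its own.

```java
public class CharValues {
    // Returns the numeric Unicode value stored in a char.
    public static int codeOf(char ch) {
        return ch; // a char widens to an int without a cast
    }

    public static void main(String[] args) {
        System.out.println(Character.SIZE);   // bits in a char: prints 16
        System.out.println(codeOf('A'));      // same value as in ASCII: prints 65
        System.out.println(codeOf('\u03A9')); // Greek capital omega: prints 937
        System.out.println(codeOf('\u0416')); // Cyrillic capital zhe: prints 1046
    }
}
```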

Characters as integers

In Java, the character type is actually another integer type. This permits you to do arithmetic on letters. For example, you could do the following to print out the alphabet.

for(char ch = 'A'; ch <= 'Z'; ch++) IO.println(ch);

Or you could have the following to convert a hexadecimal digit into its value.

public static int convertHexadecimalDigit(char ch) {
    // Note that using Character.digit(ch, 16) is superior to this.
    // Assumes ch is '0'-'9' or an upper-case 'A'-'F'.
    if(ch >= '0' && ch <= '9') return ch - '0';
    else return ch - 'A' + 10; // 'A' is 10, 'B' is 11, and so on
}

You can even multiply or divide characters, though there's never a reason to actually do this.
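One detail worth seeing in code: arithmetic on a char produces an int, so the result needs a cast if you want to store it back into a char. (The ++ operator does this conversion for you.) The following is only a sketch; the class CharArithmetic and its methods are hypothetical names, and it prints with System.out so it stands alone.

```java
public class CharArithmetic {
    // Returns the character that follows ch.
    // The sum ch + 1 is an int, so it must be cast back to char.
    public static char next(char ch) {
        return (char) (ch + 1);
    }

    // Builds the upper-case alphabet by counting from 'A' to 'Z'.
    public static String alphabet() {
        StringBuilder sb = new StringBuilder();
        for (char ch = 'A'; ch <= 'Z'; ch++) {
            sb.append(ch);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(next('A'));      // prints B
        System.out.println('Z' - 'A' + 1);  // an int: prints 26
        System.out.println(alphabet());
    }
}
```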

Character class methods

Actually, you need arithmetic on characters less often than you might expect, because Java provides a number of built-in methods for working with characters. These are preferable to doing arithmetic, both because they make the code easier to read and because they keep your code working for text written in other languages.

For example, Character.isLetter(ch) returns true if ch represents a letter of any language. There are many such methods in Java: isDigit, isWhitespace, and isLowerCase, for example.
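As a rough illustration of these tests, the sketch below counts the letters in a string and tries a few of the other methods. The class CharClassify and the method countLetters are names invented for the example. The second call uses Greek letters to show that isLetter is not limited to A through Z.

```java
public class CharClassify {
    // Counts the characters in s that Character.isLetter reports as letters.
    public static int countLetters(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (Character.isLetter(s.charAt(i))) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countLetters("ab 12"));          // prints 2
        System.out.println(countLetters("\u03B1\u03B2 7")); // Greek alpha, beta: prints 2
        System.out.println(Character.isDigit('7'));         // prints true
        System.out.println(Character.isWhitespace(' '));    // prints true
        System.out.println(Character.isLowerCase('Q'));     // prints false
    }
}
```

Compare this with a test like ch >= 'A' && ch <= 'Z', which would miss the lower-case letters and every non-English alphabet.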

There are also useful conversion routines: digit(ch, radix) converts from a digit to its integer representation (radix represents the base), and forDigit(i, radix) goes the other way. You should use digit(ch, 16) instead of the convertHexadecimalDigit() I defined earlier. Also, the methods Character.toLowerCase(ch) and Character.toUpperCase(ch) convert characters between their capital and lower-case equivalents.
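Here is a short sketch of the conversion routines in use. The class CharConvert and the method parseHex are hypothetical, written only for this example. The choice to return -1 for bad input is mine, and the method does not guard against overflow on long strings.

```java
public class CharConvert {
    // Converts a string of hexadecimal digits to its value.
    // Returns -1 if any character is not a hexadecimal digit.
    public static int parseHex(String s) {
        int value = 0;
        for (int i = 0; i < s.length(); i++) {
            int d = Character.digit(s.charAt(i), 16); // -1 when not a valid digit
            if (d < 0) return -1;
            value = value * 16 + d;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parseHex("1F"));              // prints 31
        System.out.println(parseHex("ff"));              // lower case works too: prints 255
        System.out.println(Character.forDigit(11, 16));  // prints b
        System.out.println(Character.toUpperCase('q'));  // prints Q
        System.out.println(Character.toLowerCase('Q'));  // prints q
    }
}
```

Note that Character.digit accepts both upper-case and lower-case hexadecimal digits, which the hand-written convertHexadecimalDigit() does not.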

