Lexical Structure

String gaps and the \& escape are allowed in Char constants.

The string ``{-#'' starts annotations. Unknown or badly placed annotations may cause an error.

Character and string literals may contain Unicode characters. In hbc, Unicode is treated as a 16-bit character encoding that covers most of the world's alphabets. The first 256 Unicode characters are the ISO 8859-1 characters (i.e. the normal Haskell character set).

A Unicode character is written as \uXXXX, where XXXX is exactly 4 hexadecimal digits (the same notation that Java uses).

Although hbc handles 16-bit characters correctly within a program, the I/O libraries are still 8-bit oriented. The Unicode module contains functions to encode and decode various Unicode formats.
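To illustrate what encoding a 16-bit character for 8-bit I/O can involve, here is a minimal sketch in standard Haskell. It is not the API of the Unicode module: the function names encodeUtf8Char and encodeUtf8 are hypothetical, and UTF-8 is used only as a familiar example of a byte-oriented format. This page does not say which formats the module actually supports.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical helper (not part of hbc's Unicode module):
-- encode a single character as a list of UTF-8 bytes.
encodeUtf8Char :: Char -> [Word8]
encodeUtf8Char c
  | n < 0x80 =
      -- 7-bit characters are written unchanged as one byte.
      [fromIntegral n]
  | n < 0x800 =
      -- Two bytes: 110xxxxx 10xxxxxx
      [ fromIntegral (0xC0 .|. (n `shiftR` 6))
      , continuation n
      ]
  | n < 0x10000 =
      -- Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
      -- This covers the rest of the 16-bit range.
      [ fromIntegral (0xE0 .|. (n `shiftR` 12))
      , continuation (n `shiftR` 6)
      , continuation n
      ]
  | otherwise =
      -- Four bytes, for code points beyond 16 bits.
      [ fromIntegral (0xF0 .|. (n `shiftR` 18))
      , continuation (n `shiftR` 12)
      , continuation (n `shiftR` 6)
      , continuation n
      ]
  where
    n = ord c
    -- A continuation byte carries the low six bits: 10xxxxxx
    continuation :: Int -> Word8
    continuation x = fromIntegral (0x80 .|. (x .&. 0x3F))

-- Hypothetical helper: encode a whole string.
encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap encodeUtf8Char
```

With this sketch, the Hebrew letter alef (code 05d0) becomes the two bytes D7 and 90, which an 8-bit I/O routine can write out directly.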

Example: "\u05d0\u05d1\u05d2" is a string with the first three characters (alef, bet, and gimel) of the Hebrew alphabet.
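The \uXXXX notation is an hbc extension, so other Haskell implementations will not necessarily accept it in literals. As a rough sketch of what the notation means, the following standard Haskell function expands \uXXXX escapes found in raw text into the corresponding characters. The name decodeUnicodeEscapes is hypothetical and is not part of hbc. As a simplification, it does not treat a doubled backslash specially.

```haskell
import Data.Char (chr, digitToInt, isHexDigit)

-- Hypothetical helper (not part of hbc): replace each \uXXXX escape,
-- where XXXX is exactly 4 hexadecimal digits, by the character it names.
-- Anything that does not match that shape is copied through unchanged.
decodeUnicodeEscapes :: String -> String
decodeUnicodeEscapes ('\\' : 'u' : rest)
  | length digits == 4 && all isHexDigit digits =
      chr (foldl (\acc d -> acc * 16 + digitToInt d) 0 digits)
        : decodeUnicodeEscapes remainder
  where
    (digits, remainder) = splitAt 4 rest
decodeUnicodeEscapes (c : cs) = c : decodeUnicodeEscapes cs
decodeUnicodeEscapes [] = []
```

Applied to the raw text of the example above (a backslash, a u, and four digits, three times over), the function returns a string of three characters with codes 05d0, 05d1, and 05d2.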

The Unicode character extension is always turned on.

Last modified: Sun Jul 21 23:54:14 MET DST 1996