9.4.1 Unicode escape sequences

A Unicode escape sequence represents a Unicode character. Unicode escape sequences are processed in identifiers (§9.4.2), regular string literals (§9.4.4.5), and character literals (§9.4.4.4). A Unicode character escape is not processed in any other location (for example, to form an operator, punctuator, or keyword).

unicode-escape-sequence::
    \u hex-digit hex-digit hex-digit hex-digit
    \U hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit

A Unicode escape sequence represents the single Unicode character formed by the hexadecimal number following the "\u" or "\U" characters. Since C# uses a 16-bit encoding of Unicode code points in character and string values, a Unicode code point in the range U+10000 to U+10FFFF is represented using two Unicode surrogate code units. Unicode code points above U+10FFFF are invalid and are not supported.
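The surrogate representation described above can be observed with a short program. This is an illustrative sketch only: the class name SurrogateDemo and the choice of U+1F600 as the sample code point are assumptions made for the example.

```csharp
using System;

class SurrogateDemo
{
    static void Main()
    {
        // U+1F600 lies above U+FFFF, so the string holds two 16-bit code units.
        string s = "\U0001F600";
        Console.WriteLine(s.Length);                                 // 2
        Console.WriteLine(char.IsHighSurrogate(s[0]));               // True
        Console.WriteLine(char.IsLowSurrogate(s[1]));                // True

        // Recombine the surrogate pair into the original code point.
        Console.WriteLine(char.ConvertToUtf32(s, 0).ToString("X"));  // 1F600

        // A code point at or below U+FFFF needs only one code unit.
        Console.WriteLine("\u0066".Length);                          // 1
    }
}
```

The length of a string counts code units, not code points, which is why the first line prints 2 for a single character.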

Multiple translations are not performed. For instance, the string literal "\u005Cu005C" is equivalent to "\u005C" rather than "\". [Note: The Unicode value \u005C is the character "\". end note]
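The single-translation rule can be checked directly. The sketch below is illustrative; the class name SingleTranslationDemo is an assumption, and the comparison string "\\u005C" uses the simple escape sequence for a backslash to spell out the six characters the rule is expected to produce.

```csharp
using System;

class SingleTranslationDemo
{
    static void Main()
    {
        // The escape \u005C becomes a backslash; the trailing "u005C" is
        // ordinary text and is not translated a second time.
        string s = "\u005Cu005C";
        Console.WriteLine(s.Length);          // 6
        Console.WriteLine(s == "\\u005C");    // True
        Console.WriteLine(s == "\\");         // False
    }
}
```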

[Example: The example

class Class1
{
    static void Test(bool \u0066) {
        char c = '\u0066';
        if (\u0066)
            System.Console.WriteLine(c.ToString());
    }
}

shows several uses of \u0066, which is the escape sequence for the letter "f". The program is equivalent to

class Class1
{
    static void Test(bool f) {
        char c = 'f';
        if (f)
            System.Console.WriteLine(c.ToString());
    }
}

end example]