9.4.1 Unicode escape sequences
A Unicode escape sequence represents a Unicode character. Unicode escape
sequences are processed in
identifiers (§9.4.2), regular string literals (§9.4.4.5), and character
literals (§9.4.4.4). A Unicode character escape
is not processed in any other location (for example, to form an operator,
punctuator, or keyword).
unicode-escape-sequence::
    \u hex-digit hex-digit hex-digit hex-digit
    \U hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit
A Unicode escape sequence represents the single Unicode character formed by
the hexadecimal number
following the "\u" or "\U" characters. Since C# uses a 16-bit encoding of
Unicode code points in character and
string values, a Unicode code point in the range U+10000 to U+10FFFF is
represented using two Unicode
surrogate code units. Unicode code points above U+10FFFF are invalid and
are not supported.
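The surrogate representation described above can be observed directly. The following is a minimal sketch, not part of the normative text; the class name SurrogateDemo and its members are hypothetical, and it assumes a runtime that provides the System.Char helper methods shown.

```csharp
using System;

static class SurrogateDemo
{
    // \u takes exactly four hex digits; \U takes exactly eight.
    public const string Bmp = "\u00E9";         // U+00E9, fits in one 16-bit code unit
    public const string Astral = "\U0001F600";  // U+1F600, above U+FFFF

    static void Main()
    {
        // Length counts 16-bit code units, not code points.
        Console.WriteLine(Bmp.Length);     // 1
        Console.WriteLine(Astral.Length);  // 2

        // The two code units are a high surrogate followed by a low surrogate.
        Console.WriteLine(char.IsHighSurrogate(Astral[0]));  // True
        Console.WriteLine(char.IsLowSurrogate(Astral[1]));   // True

        // Writing the surrogate code units explicitly yields the same string.
        Console.WriteLine(Astral == "\uD83D\uDE00");         // True

        // Recombining the pair recovers the original code point.
        Console.WriteLine(char.ConvertToUtf32(Astral, 0).ToString("X"));  // 1F600
    }
}
```

The surrogate values follow from the UTF-16 rule: subtracting 0x10000 from 0x1F600 leaves 0xF600, whose upper ten bits (0x3D) are added to 0xD800 and whose lower ten bits (0x200) are added to 0xDC00.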
Multiple translations are not performed. For instance, the string literal
"\u005Cu005C" is equivalent to
"\u005C" rather than "\". [Note: The Unicode value \u005C is the character
"\". end note]
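The single-pass rule can be checked with a short program. This is an illustrative sketch only; the class name SingleTranslationDemo and the constant S are hypothetical. It compares against a verbatim string literal, in which Unicode escape sequences are not processed.

```csharp
using System;

static class SingleTranslationDemo
{
    // \u005C becomes one backslash; the trailing "u005C" is left as ordinary text.
    public const string S = "\u005Cu005C";

    static void Main()
    {
        // Six characters remain: \ u 0 0 5 C
        Console.WriteLine(S.Length);        // 6

        // The verbatim literal holds the same six characters.
        Console.WriteLine(S == @"\u005C");  // True

        // A second translation pass would have produced a single backslash.
        Console.WriteLine(S == "\\");       // False
    }
}
```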
[Example: The example
class Class1
{
    static void Test(bool \u0066) {
        char c = '\u0066';
        if (\u0066)
            System.Console.WriteLine(c.ToString());
    }
}
shows several uses of \u0066, which is the escape sequence for the letter
"f". The program is equivalent to
class Class1
{
    static void Test(bool f) {
        char c = 'f';
        if (f)
            System.Console.WriteLine(c.ToString());
    }
}
end example]