分享
 
 
 

RFC2237 - Japanese Character Encoding for Internet Messages

王朝other·作者佚名  2008-05-31
窄屏简体版  字體: |||超大  

Network Working Group K. Tamaru

Request for Comments: 2237 Microsoft Corporation

Category: Informational November 1997

Japanese Character Encoding for Internet Messages

Status of this Memo

This memo provides information for the Internet community. It does

not specify an Internet standard of any kind. Distribution of this

memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (1997). All Rights Reserved.

1. Abstract

This memo defines an encoding scheme for the Japanese Characters,

describes "ISO-2022-JP-1", which is used in electronic mail [RFC-

822], and network news [RFC1036]. Also this memo provides a listing

of the Japanese Character Set that can be used in this encoding

scheme.

2. Requirements Notation

This document uses terms that appear in capital letters to indicate

particular requirements of this specification. Those terms are

"MUST", "SHOULD", "MUST NOT", "SHOULD NOT", and "MAY". The meaning of

each term are found in [RFC-2119]

3. IntrodUCtion

RFC1468 defines the way Japanese Characters are encoded, likewise

what this memo defines. It defines the use of JIS X 0208 as the

double-byte character set in ISO-2022-JP text.

Today, many operating systems support proprietary extended Japanese

characters or JIS X 0212, This includes the Unicode character set,

which does not conform to JIS X 0201 nor JIS X 0208. Therefore, this

limits the ability to communicate and correspond precise information

because of the limited availability of Kanji characters. Fortunately

JIS (Japanese Industry Standard) defines JIS X 0212 as "code of the

supplementary Japanese graphic character set for information

interchange". Most Japanese characters which are used in regular

electronic mail in most cases can be accommodated in JIS X 0201, JIS

X 0208 and JIS X 0212.

Also it is recognized that there is a tendency to use Unicode,

however, Unicode is not yet widely used and there is a certain

limitation with old electronic mail system. Furthermore, the purpose

of this comment is to add the capability of writing out JIS X 0212.

This comment does not describe any representation of iso-2022-jp-1

version information in addition to JIS X 0212 support.

4. Description

In "ISO-2022-JP-1" text, the initial character code of the message is

in ASCII. The "double-byte-seq"(see "Format Syntax" section) (ESC "$"

"B" / ESC "$" "@" / ESC "$" "(" "D") is the only designator that

indicates that the following character is double-byte, and it is

valid until another escape sequence appears. It is very discouraged

to use (ESC "$" "@") for double byte character encoding, new

implementation SHOULD use only (ESC "$" "B") for double byte encoding

instead.

The end of "ISO-2022-JP-1" text MUST be in ASCII. Also it is strongly

recommended to back up to the ASCII at the end of each line rather

than JIS X 0201-Roman if there is any none ASCII character in middle

of a line.

Since "ISO-2022-JP-1" is designed to add the capability of writing

out JIS X 0212, if the message does not contain none of JIS X 0212

characters. "ISO-2022-JP" text MUST BE used.

JIS X 0201-Roman is not identical to the ASCII with two different

characters.

The following list are the escape sequences and character sets that

can be used in "ISO-2022-JP-1" text. The registered number in the ISO

2375 Register which allow double-byte ideographic scripts to be

encoded within ISO/IEC 2022 code structure is indicated as reg#

below.

reg# character set ESC sequence designated to

6 ASCII ESC 2/8 4/2 ESC ( B G0

42 JIS X 0208-1978 ESC 2/4 4/0 ESC $ @ G0

87 JIS X 0208-1983 ESC 2/4 4/2 ESC $ B G0

14 JIS X 0201-Roman ESC 2/8 4/10 ESC ( J G0

159 JIS X 0212-1990 ESC 2/4 2/8 4/4 ESC $ ( D G0

Other restrictions are given in the Formal Syntax below.

5. Formal Syntax

The notational conventions used here are identical to those used in

STD 11, RFC822 [RFC822].

The * (asterisk) convention is as follows:

l*m something

meaning at least l and at most m something, with l and m taking

default values of 0 and infinity, respectively.

iso-2022-jp-1-text = *( line CRLF ) [line]

line = (*single-byte-char *segment

single-byte-seq *single-byte-char) /

*single-byte-char

segment = single-byte-segment / double-byte-segment

single-byte-segment = single-byte-seq *single-byte-char

double-byte-segment = double-byte-seq *(one-of-94 one-of-94)

reset-seq = ESC "(" ( "B" / "J" )

single-byte-seq = ESC "(" ( "B" / "J" )

double-byte-seq = (ESC "$" ( "@" / "B" )) /

(ESC "$" "(" "D" )

CRLF = CR LF;( Octal, Decimal.)

ESC = <ISO 2022 ESC, escape>;( 33,27.)

SI = <ISO 2022 SI, shift-in>;( 17,15.)

SO = <ISO 2022 SO, shift-out>;( 16,14.)

CR = <ASCII CR, carriage return>;( 15,13.)

LF = <ASCII LF, linefeed>;( 12,10.)

one-of-94 = <any one of 94 values>;(41-176,33.-126.)

one-of-96 = <any one of 96 values>;(40-177,32.-127.)

7BIT = <any 7-bit value>;(0-177,0.-127.)

single-byte-char = <any 7BIT, including bare CR & bare LF,

but NOT including CRLF, and not including

ESC, SI, SO>

6. Security Considerations

This memo raises no known security issues.

7. MIME Considerations

The name to be used for the Japanese encoding scheme in content is

"ISO-2022-JP-1". When this name is used in the MIME message form, it

would be:

Content-Type: text/plain; charset=iso-2022-jp-1

Since the "ISO-2022-JP-1" is 7bit encoding, it will be unnecessary to

encode in another format by specifying the "Content-Transfer-

Encoding" header. Also applying Based64 or Quoted-Printable encoding

MAY cause today's software to fail to decode the message.

"ISO-2022-JP-1" can be used in MIME headers. Also "ISO-2022-JP-1"

text can be used with Base64 or Quoted-Printable encoding.

8. Additional Information

As long as mail systems are capable of writing out Unicode, it is

recommended to also write out Unicode text in addition to "ISO-

2022-JP-1" text. Also writing out "ISO-2022-JP" text in addition to

"ISO-2022-JP-1" is strongly encouraged for backward compatibility

reasons.

Some mail systems write out 8bits characters in 'parameter' and

'value' defined in [RFC822] and [RFC1521]. All 8bit characters MUST

NOT be used in those fields. The implementation of future mail

systems SHOULD support those only for interoperability reasons.

9. References

[ISO2022]

International Organization for Standardization (ISO),

"Information processing -- ISO 7-bit and 8-bit coded

character sets -- Code extension techniques",

International Standard, Ref. No. ISO 2022-1986 (E).

[ISOREG]

International Organization for Standardization (ISO),

"International Register of Coded Character Sets To Be Used

With Escape Sequences".

[RFC-822]

Crocker, D., "Standard for the Format of ARPA Internet

Text Messages", STD 11, RFC822, August 1982.

[RFC-1468]

Murai, J., Crispin, M., and E. van der Poel, "Japanese

Character Encoding for Internet Messages", RFC1468, June

1993.

[RFC-1766]

Alvestrand, H., "Tags for the Identification of

Languages", RFC1766, March 1995.

[RFC-2045]

Freed, N., and N. Borenstein, "Multipurpose Internet Mail

Extensions (MIME) Part One: Format of Internet Message

Bodies", RFC2045, December 1996.

[RFC-2046]

Freed, N., and N. Borenstein, "Multipurpose Internet Mail

Extensions (MIME) Part Two: Media Types", RFC2046,

December 1996.

[RFC-2047]

Moore, K., "Multipurpose Internet Mail Extensions (MIME)

Part Three: Representation of Non-ASCII Text in Internet

Message Headers", RFC2047, December 1996.

[RFC-2048]

Freed, N., Klensin, J. and J. Postel, "Multipurpose

Internet Mail Extensions (MIME) Part Four: MIME

Registration Procedures", RFC2048, December 1996.

[RFC-2049]

Freed, N., and N. Borenstein, "Multipurpose Internet Mail

Extensions (MIME) Part Five: Conformance Criteria and

Examples", RFC2049, December 1996.

[RFC-2119]

Bradner, S., "Key Words for use in RFCs to Indicate

Requirement Levels", RFC2119, March 1997.

Author's Address

Kenzaburo Tamaru

Microsoft Corporation

One Microsoft Way

Redmond, WA 98052-6399

EMail: kenzat@microsoft.com

Full Copyright Statement

Copyright (C) The Internet Society (1997). All Rights Reserved.

This document and translations of it may be copied and furnished to

others, and derivative works that comment on or otherwise eXPlain it

or assist in its implementation may be prepared, copied, published

and distributed, in whole or in part, without restriction of any

kind, provided that the above copyright notice and this paragraph are

included on all such copies and derivative works. However, this

document itself may not be modified in any way, such as by removing

the copyright notice or references to the Internet Society or other

Internet organizations, except as needed for the purpose of

developing Internet standards in which case the procedures for

copyrights defined in the Internet Standards process must be

followed, or as required to translate it into languages other than

English.

The limited permissions granted above are perpetual and will not be

revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an

"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING

TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING

BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION

HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF

MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

 
 
 
免责声明:本文为网络用户发布,其观点仅代表作者个人观点,与本站无关,本站仅提供信息存储服务。文中陈述内容未经本站证实,其真实性、完整性、及时性本站不作任何保证或承诺,请读者仅作参考,并请自行核实相关内容。
2023年上半年GDP全球前十五强
 百态   2023-10-24
美众议院议长启动对拜登的弹劾调查
 百态   2023-09-13
上海、济南、武汉等多地出现不明坠落物
 探索   2023-09-06
印度或要将国名改为“巴拉特”
 百态   2023-09-06
男子为女友送行,买票不登机被捕
 百态   2023-08-20
手机地震预警功能怎么开?
 干货   2023-08-06
女子4年卖2套房花700多万做美容:不但没变美脸,面部还出现变形
 百态   2023-08-04
住户一楼被水淹 还冲来8头猪
 百态   2023-07-31
女子体内爬出大量瓜子状活虫
 百态   2023-07-25
地球连续35年收到神秘规律性信号,网友:不要回答!
 探索   2023-07-21
全球镓价格本周大涨27%
 探索   2023-07-09
钱都流向了那些不缺钱的人,苦都留给了能吃苦的人
 探索   2023-07-02
倩女手游刀客魅者强控制(强混乱强眩晕强睡眠)和对应控制抗性的关系
 百态   2020-08-20
美国5月9日最新疫情:美国确诊人数突破131万
 百态   2020-05-09
荷兰政府宣布将集体辞职
 干货   2020-04-30
倩女幽魂手游师徒任务情义春秋猜成语答案逍遥观:鹏程万里
 干货   2019-11-12
倩女幽魂手游师徒任务情义春秋猜成语答案神机营:射石饮羽
 干货   2019-11-12
倩女幽魂手游师徒任务情义春秋猜成语答案昆仑山:拔刀相助
 干货   2019-11-12
倩女幽魂手游师徒任务情义春秋猜成语答案天工阁:鬼斧神工
 干货   2019-11-12
倩女幽魂手游师徒任务情义春秋猜成语答案丝路古道:单枪匹马
 干货   2019-11-12
倩女幽魂手游师徒任务情义春秋猜成语答案镇郊荒野:与虎谋皮
 干货   2019-11-12
倩女幽魂手游师徒任务情义春秋猜成语答案镇郊荒野:李代桃僵
 干货   2019-11-12
倩女幽魂手游师徒任务情义春秋猜成语答案镇郊荒野:指鹿为马
 干货   2019-11-12
倩女幽魂手游师徒任务情义春秋猜成语答案金陵:小鸟依人
 干货   2019-11-12
倩女幽魂手游师徒任务情义春秋猜成语答案金陵:千金买邻
 干货   2019-11-12
 
推荐阅读
 
 
 
>>返回首頁<<
 
靜靜地坐在廢墟上,四周的荒凉一望無際,忽然覺得,淒涼也很美
© 2005- 王朝網路 版權所有