RFC 373 - Arbitrary Character Sets


NWG/RFC#373 14 July 1972

NIC 11058 SU-AI

ARBITRARY CHARACTER SETS

by John McCarthy

It would be nice to be able to have documents stored in computers that

could include arbitrary characters and to be able to display them on

any CRT screen, edit them using any keyboard, and print them on any

printer. The object of this memorandum is to suggest how to get there

from here with special reference to the ARPA network.

Where are we now?

(1) At present, there is 96 character ASCII, and everyone agrees that

it should be included in any larger set.

(2) Many installations are dependent on 64 character sets which do not

even include the lower case latin alphabet.

(3) At the Stanford Artificial Intelligence Laboratory, we have a 114

character set that includes 96 character ASCII and which is

implemented in our keyboards, displays, and line printer.

(4) Printers are becoming available that get their character designs

out of memory, for example, the Xerox XGP printer, one of which we are

getting.

(5) The IMLAC type display has the character designs in main memory so

that changing the displayed set is just a matter of reloading the

memory.

(6) Many display systems share the character generator among many

display units. In some of these, e.g. the Datadisc, arbitrary sets

are probably feasible (using kludgery to be described later), but in

other systems, e.g. our III's, arbitrary sets are not feasible.

One possible approach to communication in expanded character sets is

to produce an expanded standard set of characters, perhaps using 8 or

9 bits and expect new equipment to implement this set. This approach

has the disadvantage that it will be very hard to get agreement on

what the next step should be, and even if formal agreement is

realized, many groups will find it in their interest to ignore the

standard.

NWG/RFC# 373 JMC 14-JUL-72 12:41 11058


Therefore, I would like to suggest that the next step be to arbitrary

character sets. I suggest implementing this in the following way:

(1) A registry of characters should be established. Anyone can

register a new character. Each character has a unique number; 17 bits

should be enough even to include Chinese. Besides this, each

character has a name in ASCII, usually mnemonic. Finally, the

character has a design, which is a picture on a 50 by 50 dot matrix.

(2) Besides the registry of characters, there is a registry of

character sets, which different groups are using for different

classes of documents. A registered character set has a registry

number and a table giving the correspondence between the character

codes as bit sequences and the registered character numbers.

(3) Associated with a document is a statement of the character code

used therein. This may be one of the registered codes or it may

contain in addition modifications described by an auxiliary table

giving the code correspondence with registered character numbers. A

character code may have an escape character that says that the next

character is described by its registry number. The statement of the

character code may be a header on the document or the receiver may

have to learn it by some other means, e.g. because its library

catalog entry contains this information.

(4) Devices such as printers and displays draw characters in different

ways and standardization doesn't seem feasible at present. Therefore,

it is necessary to provide a way of going from the standard

description of a character using a 50 by 50 dot matrix to whatever

method the device uses. This is up to the programmers who are

supporting the device. Some may choose to manually create files

describing how registered characters are implemented. They may find

it too much work to provide for all the characters and to update their

files when new characters are registered. Others will provide

programs for going from the registered descriptions to descriptions

compatible with their implementations. Perhaps most will hand tailor

the characters most used and provide a program for the others.


(5) The easiest device to handle is the line printer because it is

slow. At the beginning of the print job, the SPOOL program will look

up the character set and load the printer's memory with the character

designs used in the particular document. Sometimes, it may have to go

through the network to one of the computers that stores the registry

in order to find out what to do.

(6) Display systems that have a character memory for each display unit

can be handled in about the same way. Users will occasionally

experience delays when the display programs are surprised by

unfamiliar characters.

(7) Display systems that share character memories require more

complicated treatment. The object is to keep the memory large enough

to keep all the characters that the current set of users is using and

to handle the required table lookups from the different character

codes in a nice way. There will be limitations on the diversity of

character sets that can be in use simultaneously. Systems like the

Datadisc that only look up the character when it is first written can

be extended to work with large sets. Systems that have to look up

each character code 30 times per second in order to maintain the

display won't work so well.
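
The registry, character set table, and escape mechanism of points (1) through (3), together with the spooler lookup of point (5), can be sketched as follows. This is only an illustration under stated assumptions, not a format defined by this memorandum: the names RegisteredCharacter, decode, and designs_needed are hypothetical, the character code is assumed to be 8 bits wide, and an escape is assumed to be followed by the registry number in three big-endian bytes (enough to hold 17 bits). The registry numbers and names in the example are invented.

```python
from dataclasses import dataclass

REGISTRY_BITS = 17                      # size suggested in the memo
MAX_NUMBER = (1 << REGISTRY_BITS) - 1   # largest 17-bit registry number
MATRIX_SIZE = 50                        # designs are 50 by 50 dot matrices


@dataclass(frozen=True)
class RegisteredCharacter:
    """One registry entry: number, ASCII name, and dot-matrix design."""
    number: int
    name: str
    design: tuple  # MATRIX_SIZE rows, each a tuple of MATRIX_SIZE dots (0 or 1)

    def __post_init__(self):
        if not 0 <= self.number <= MAX_NUMBER:
            raise ValueError("registry number does not fit in 17 bits")
        if not self.name.isascii():
            raise ValueError("name must be ASCII")
        if len(self.design) != MATRIX_SIZE or any(
            len(row) != MATRIX_SIZE for row in self.design
        ):
            raise ValueError("design must be a 50 by 50 matrix")


def decode(data, table, escape):
    """Translate document bytes into registry numbers.

    table maps each ordinary code to a registry number. The escape code
    is followed by the registry number itself (assumed: 3 bytes, big-endian).
    """
    numbers = []
    i = 0
    while i < len(data):
        code = data[i]
        if code == escape:
            raw = data[i + 1:i + 4]
            if len(raw) != 3:
                raise ValueError("truncated escape sequence")
            number = int.from_bytes(raw, "big")
            if number > MAX_NUMBER:
                raise ValueError("escaped number does not fit in 17 bits")
            numbers.append(number)
            i += 4
        else:
            numbers.append(table[code])
            i += 1
    return numbers


def designs_needed(numbers, local_registry):
    """Split a document's characters into designs known locally and
    registry numbers that would have to be fetched from another site."""
    known, missing = {}, []
    for number in sorted(set(numbers)):
        entry = local_registry.get(number)
        if entry is None:
            missing.append(number)
        else:
            known[number] = entry.design
    return known, missing


# Hypothetical 8-bit character set: two ordinary codes plus an escape code.
blank = tuple((0,) * MATRIX_SIZE for _ in range(MATRIX_SIZE))
local_registry = {
    65: RegisteredCharacter(65, "LATIN-CAPITAL-A", blank),
    66: RegisteredCharacter(66, "LATIN-CAPITAL-B", blank),
}
table = {0x41: 65, 0x42: 66}
document = bytes([0x41, 0x1B, 0x00, 0x30, 0x39, 0x42, 0x41])

numbers = decode(document, table, escape=0x1B)
known, missing = designs_needed(numbers, local_registry)
print(numbers)        # [65, 12345, 66, 65]
print(sorted(known))  # [65, 66]
print(missing)        # [12345]
```

In this sketch the missing list corresponds to the case where the SPOOL program would have to go through the network to a computer that stores the registry.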

I have no special ideas about how to make keyboards adaptable to

arbitrary sets. Each user may have to fend for himself.

In this memorandum so far, I have ignored typography, i.e. the fact

that in printed documents the same letter may be printed in many

fonts. Perhaps, each character in each font will require a separate

registered description, but with a constant difference between the

numbers of the same character in different fonts. Installations will

again have to decide what font distinctions they will implement.
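
One way to read the "constant difference" idea is as simple arithmetic on registry numbers. The sketch below assumes a hypothetical stride of 4096 between fonts and hypothetical function names; the memorandum does not fix any of these values. With a 17-bit number and this stride, 32 fonts of 4096 characters each would fit.

```python
FONT_STRIDE = 4096  # hypothetical constant difference between fonts


def number_in_font(base_number, font_index, stride=FONT_STRIDE):
    """Registry number of a character in a given font."""
    if not 0 <= base_number < stride:
        raise ValueError("base number must be smaller than the stride")
    if font_index < 0:
        raise ValueError("font index must be non-negative")
    return font_index * stride + base_number


def split_font(number, stride=FONT_STRIDE):
    """Recover (base_number, font_index) from a registry number."""
    font_index, base_number = divmod(number, stride)
    return base_number, font_index


print(number_in_font(65, 2))  # 8257
print(split_font(8257))       # (65, 2)
```

An installation that chooses not to implement a font distinction could, under this scheme, fall back to the base number returned by split_font and draw the character in its default font.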

Some other issues that might be considered are whether means can be

provided to adapt texts automatically to the line and page lengths of

the different devices.
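
As a rough illustration of such adaptation, the sketch below refills paragraphs to a device's line length and then cuts the result into pages. It assumes paragraphs are separated by blank lines and uses Python's standard textwrap module; the function names reflow and paginate are hypothetical.

```python
import textwrap


def reflow(text, line_length):
    """Refill each blank-line-separated paragraph to the given line length."""
    paragraphs = [" ".join(p.split()) for p in text.split("\n\n")]
    return "\n\n".join(
        textwrap.fill(p, width=line_length) for p in paragraphs if p
    )


def paginate(text, page_length):
    """Cut text into pages of at most page_length lines each."""
    lines = text.split("\n")
    return [
        "\n".join(lines[i:i + page_length])
        for i in range(0, len(lines), page_length)
    ]


sample = (
    "It would be nice to be able to have documents stored in computers.\n\n"
    "A second paragraph follows."
)
for page in paginate(reflow(sample, 30), 4):
    print(page)
    print("-" * 30)
```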

It seems to me most likely that the typographical problems cannot be

solved at this time, and it would be best to adopt conventions for

registering character designs now and leave typography for

later.


In my opinion, there is no real obstacle to establishing the registry

in the ARPA network now, getting the standards organization to work,

and being able to exchange documents in extended character sets as

soon as the various installations can acquire the printers and display

devices.

It is the present policy of the Stanford Artificial Intelligence

Laboratory to acquire no more devices that are wedded to fixed

character sets.

[ This RFC was put into machine readable form for entry ]

[ into the online RFC archives by BBN Corp. under the ]

[ direction of Alex McKenzie. 1/97 ]

 
 
 