RFC2854 - The text/html Media Type

Wangchao html/css/js · Author unknown · 2008-05-31

Network Working Group D. Connolly

Request for Comments: 2854 World Wide Web Consortium (W3C)

Obsoletes: 2070, 1980, 1942, 1867, 1866 L. Masinter

Category: Informational AT&T

June 2000

The 'text/html' Media Type

Status of this Memo

This memo provides information for the Internet community. It does

not specify an Internet standard of any kind. Distribution of this

memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2000). All Rights Reserved.

Abstract

This document summarizes the history of HTML development, and defines

the "text/html" MIME type by pointing to the relevant W3C

recommendations; it is intended to obsolete the previous IETF

documents defining HTML, including RFC1866, RFC1867, RFC1980, RFC

1942 and RFC2070, and to remove HTML from IETF Standards Track.

This document was prepared at the request of the W3C HTML working

group. Please send comments to www-html@w3.org, a public mailing list

with archive at <http://lists.w3.org/Archives/Public/www-html/>.

1. Introduction and background

HTML has been in use in the World Wide Web information infrastructure

since 1990, and specified in various informal documents. The

text/html media type was first officially defined by the IETF HTML

working group in 1995 in [HTML20]. Extensions to HTML were proposed

in [HTML30], [UPLOAD], [TABLES], [CLIMAPS], and [I18N].

The IETF HTML working group closed Sep 1996, and work on defining

HTML moved to the World Wide Web Consortium (W3C). The proposed

extensions were incorporated to some extent in [HTML32], and to a

larger extent in [HTML40]. The definition of multipart/form-data from

[UPLOAD] was described in [FORMDATA]. In addition, a reformulation of

HTML 4.0 in XML 1.0 [XHTML1] was developed.

[HTML32] notes "This specification defines HTML version 3.2. HTML 3.2

aims to capture recommended practice as of early '96 and as such to

be used as a replacement for HTML 2.0 (RFC1866)." Subsequent

specifications for HTML describe the differences in each version.

In addition to the development of standards, a wide variety of

additional extensions, restrictions, and modifications to HTML were

popularized by NCSA's Mosaic system and subsequently by the

competitive implementations of Netscape Navigator and Microsoft

Internet Explorer; these extensions are documented in numerous books

and online guides.

2. Registration of MIME media type text/html

MIME media type name: text

MIME subtype name: html

Required parameters: none

Optional parameters:

charset

The optional parameter "charset" refers to the character

encoding used to represent the HTML document as a sequence of

bytes. Any registered IANA charset may be used, but UTF-8 is

preferred. Although this parameter is optional, it is strongly

recommended that it always be present. See Section 6 below for

a discussion of charset default rules.

Note that [HTML20] included an optional "level" parameter; in

practice, this parameter was never used and has been removed from

this specification. [HTML30] also suggested a "version"

parameter; in practice, this parameter also was never used and has

been removed from this specification.

Encoding considerations:

See Section 4 of this document.

Security considerations:

See Section 7 of this document.

Interoperability considerations:

HTML is designed to be interoperable across the widest possible

range of platforms and devices of varying capabilities. However,

there are contexts (platforms of limited display capability, for

example) where not all of the capabilities of the full HTML

definition are feasible. There is ongoing work to develop both a

modularization of HTML and a set of profiling capabilities to

identify and negotiate restricted (and extended) capabilities.

Due to the long and distributed development of HTML, current

practice on the Internet includes a wide variety of HTML variants.

Implementors of text/html interpreters must be prepared to be

"bug-compatible" with popular browsers in order to work with many

HTML documents available on the Internet.

Typically, different versions are distinguishable by the DOCTYPE

declaration contained within them, although the DOCTYPE

declaration itself is sometimes omitted or incorrect.

Published specification:

The text/html media type is now defined by W3C Recommendations;

the latest published version is [HTML401]. In addition, [XHTML1]

defines a profile of use of XHTML which is compatible with HTML

4.01 and which may also be labeled as text/html.

Applications which use this media type:

The first and most common application of HTML is the World Wide

Web; commonly, HTML documents contain URI references [URI] to

other documents and media to be retrieved using the HTTP protocol

[HTTP]. Many gateway applications provide HTML-based interfaces to

other underlying complex services. Numerous other applications now

also use HTML as a convenient platform-independent multimedia

document representation.

Additional information:

Magic number:

There is no single initial string that is always present for

HTML files. However, Section 5 below gives some guidelines for

recognizing HTML files.

File extension:

The file extensions 'html' or 'htm' are commonly used, but

other extensions denoting file formats for preprocessing are

also common.

Macintosh File Type code: TEXT

Person & email address to contact for further information:

Dan Connolly <connolly@w3.org>

Larry Masinter <lmm@acm.org>

Intended usage: COMMON

Author/Change controller:

The HTML specification is a work product of the World Wide Web

Consortium's HTML Working Group. The W3C has change control over

the HTML specification.

Further information:

HTML has a means of including, by reference via URI, additional

resources (image, video clip, applet) within the base document. In

order to transfer a complete HTML object and the included

resources in a single MIME object, the mechanisms of [MHTML] may

be used.
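
As an illustration of the registration above, the following is a minimal sketch of labeling an HTML body as text/html with an explicit charset parameter. It assumes Python's standard email package; the helper name make_html_part and the sample markup are hypothetical and not part of this memo.

```python
from email.mime.text import MIMEText


def make_html_part(html: str, charset: str = "utf-8") -> MIMEText:
    """Wrap an HTML string in a MIME part labeled text/html.

    The charset is always stated explicitly, as the registration
    strongly recommends; UTF-8 is used as the default here.
    """
    return MIMEText(html, "html", charset)


part = make_html_part("<html><body><p>caf\u00e9</p></body></html>")
print(part.get_content_type())     # media type and subtype
print(part.get_content_charset())  # value of the charset parameter
print(part.as_string())            # full headers plus encoded body
```

The library chooses a content transfer encoding for the body on its own; Section 4 discusses when such an encoding is needed.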

3. Fragment Identifiers

The URI specification [URI] notes that the semantics of a fragment

identifier (part of a URI after a "#") is a property of the data

resulting from a retrieval action, and that the format and

interpretation of fragment identifiers is dependent on the media type

of the retrieval result.

For documents labeled as text/html, the fragment identifier

designates the correspondingly named element; any element may be

named with the "id" attribute, and A, APPLET, FRAME, IFRAME, IMG and

MAP elements may be named with a "name" attribute. This is described

in detail in [HTML40] section 12.
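
The naming rule above can be sketched as a small lookup routine. This is an illustrative sketch only, built on Python's standard html.parser module; the class and function names are hypothetical, and it returns the tag name of the first matching element rather than modeling a full document tree.

```python
from html.parser import HTMLParser
from typing import Optional

# Elements that may be named with a "name" attribute, per the text above.
NAMEABLE = {"a", "applet", "frame", "iframe", "img", "map"}


class FragmentFinder(HTMLParser):
    """Record the first element designated by a fragment identifier."""

    def __init__(self, fragment: str) -> None:
        super().__init__()
        self.fragment = fragment
        self.match: Optional[str] = None  # tag name of the first match

    def handle_starttag(self, tag, attrs):
        if self.match is not None:
            return  # keep only the first match
        attributes = dict(attrs)
        if attributes.get("id") == self.fragment:
            # Any element may be named with "id".
            self.match = tag
        elif tag in NAMEABLE and attributes.get("name") == self.fragment:
            # Only the listed elements may be named with "name".
            self.match = tag


def find_fragment_target(html: str, fragment: str) -> Optional[str]:
    """Return the tag name that '#fragment' designates, or None."""
    finder = FragmentFinder(fragment)
    finder.feed(html)
    finder.close()
    return finder.match
```

For a URI ending in "#intro", a caller would pass "intro" as the fragment argument.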

4. Encoding considerations

Because of the availability within HTML itself for using character

entity references, documents that use a wide repertoire of characters

may still be represented using the US-ASCII charset and transported

without encoding. However, transport of text/html using a charset

other than US-ASCII may require base64 or quoted-printable encoding

for 7-bit channels.
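
A minimal sketch of the first point, assuming Python: text outside the US-ASCII repertoire is rewritten as numeric character references, so the result can be labeled charset=US-ASCII. The function name is hypothetical, and numeric references are used here in place of named entity references because the standard codec error handler produces them directly.

```python
import html


def to_ascii_html_text(text: str) -> bytes:
    """Encode text as US-ASCII, replacing every other character with a
    numeric character reference such as &#233;.

    This handles character repertoire only; it does not escape markup
    characters such as "<" or "&".
    """
    return text.encode("ascii", errors="xmlcharrefreplace")


encoded = to_ascii_html_text("caf\u00e9")
print(encoded)                                # ASCII-only bytes
print(html.unescape(encoded.decode("ascii")))  # references resolved again
```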

As with all MIME text subtypes, the canonical form of "text/html"

must always represent a line break as a sequence of a CR byte value

(0x0D) followed by an LF (0x0A) byte value. Similarly, any

occurrence of such a CRLF sequence in "text/html" must represent a

line break. Use of CR byte values and LF byte values outside of line

break sequences is also forbidden. This rule applies regardless of

the character encoding ('charset') involved.

Note, however, that the HTTP protocol allows the transport of data

not in canonical form, and, in particular, with other end-of-line

conventions; see [HTTP] section 3.7.1. This exception is commonly

used for HTML.

HTML sent via email is still subject to the MIME restrictions; this

is discussed fully in [MHTML] Section 10.
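
The canonical line-break rule can be sketched as follows, for example when preparing HTML received over HTTP for transport as a MIME body. This is a sketch under one stated assumption: that a bare CR or a bare LF in the input was meant as a line break under some local convention. The function names are hypothetical.

```python
import re

# CRLF is listed first so that an existing CRLF pair is matched as a
# whole and not as a CR followed by a separate LF.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")

# A CR not followed by LF, or an LF not preceded by CR.
_BARE_CR_OR_LF = re.compile(r"\r(?!\n)|(?<!\r)\n")


def to_canonical_line_breaks(text: str) -> str:
    """Rewrite every line break as the CRLF sequence."""
    return _LINE_BREAK.sub("\r\n", text)


def is_canonical(text: str) -> bool:
    """True if CR and LF occur only as CRLF pairs."""
    return _BARE_CR_OR_LF.search(text) is None
```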

5. Recognizing HTML files

Almost all HTML files have the string "<html" or "<HTML" near the

beginning of the file.

Documents conformant to HTML 2.0, HTML 3.2 and HTML 4.0 will start

with a DOCTYPE declaration "<!DOCTYPE HTML" near the beginning,

before the "<html". These dialects are case insensitive. Files may

start with white space, comments (introduced by "<!--" ), or

processing instructions (introduced by "<?") prior to the DOCTYPE

declaration.
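
These guidelines can be sketched as a heuristic check. This is a sketch only, not a normative algorithm: the function name and the 1024-byte window are assumptions, and the check presumes an ASCII-compatible character encoding.

```python
import re

# Material that may precede the DOCTYPE: white space, comments
# ("<!-- ... -->") and processing instructions ("<? ... >").
_LEADING = re.compile(rb"(?:\s+|<!--.*?-->|<\?.*?>)*", re.DOTALL)


def looks_like_html(data: bytes, window: int = 1024) -> bool:
    """Guess whether a byte string is the start of an HTML file."""
    head = data[:window]
    # Skip the permitted leading material, then compare without regard
    # to case, since these dialects are case insensitive.
    rest = head[_LEADING.match(head).end():].lower()
    if rest.startswith((b"<!doctype html", b"<html")):
        return True
    # Fall back to the weaker rule: "<html" somewhere near the beginning.
    return b"<html" in head.lower()
```

A negative result does not prove that a file is not HTML, since the DOCTYPE declaration and the "<html" string may both be missing from real documents.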

XHTML documents (optionally) start with an XML declaration which

begins with "<?xml" and are required to have a DOCTYPE declaration

"<!DOCTYPE html".

6. Charset default rules

The use of an explicit charset parameter is strongly recommended.

While [MIME] specifies "The default character set, which must be

assumed in the absence of a charset parameter, is US-ASCII." [HTTP]

Section 3.7.1, defines that "media subtypes of the 'text' type are

defined to have a default charset value of 'ISO-8859-1'". Section

19.3 of [HTTP] gives additional guidelines. Using an explicit

charset parameter will help avoid confusion.

Using an explicit charset parameter also takes into account that the

overwhelming majority of deployed browsers are set to use something

other than 'ISO-8859-1' as the default; the actual default is either a

corporate character encoding or character encodings widely deployed

in a certain national or regional community. For further

considerations, please also see Section 5.2 of [HTML40].
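
The two conflicting defaults quoted above can be made concrete with a short sketch. It assumes Python's standard email.message module for parsing the Content-Type value; the function name and the via_http flag are hypothetical, and, as noted above, deployed browsers often apply neither default.

```python
from email.message import Message

MIME_DEFAULT = "us-ascii"    # default stated by [MIME]
HTTP_DEFAULT = "iso-8859-1"  # default stated by [HTTP] Section 3.7.1


def effective_charset(content_type: str, via_http: bool) -> str:
    """Return the explicit charset if present, else the protocol default."""
    msg = Message()
    msg["Content-Type"] = content_type
    explicit = msg.get_content_charset()  # lower-cased, or None if absent
    if explicit is not None:
        return explicit
    return HTTP_DEFAULT if via_http else MIME_DEFAULT


print(effective_charset("text/html; charset=UTF-8", via_http=True))
print(effective_charset("text/html", via_http=True))
print(effective_charset("text/html", via_http=False))
```

Only the first call is unambiguous, which is the reason for recommending an explicit charset parameter.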

7. Security Considerations

[HTML401], section B.10, notes various security issues with

interpreting anchors and forms in HTML documents.

In addition, the introduction of scripting languages and interactive

capabilities in HTML 4.0 introduced a number of security risks

associated with the automatic execution of programs written by the

sender but interpreted by the recipient. User agents executing such

scripts or programs must be extremely careful to ensure that

untrusted software is executed in a protected environment.

8. Authors' Addresses

Daniel W. Connolly

World Wide Web Consortium (W3C)

MIT Laboratory for Computer Science

545 Technology Square

Cambridge, MA 02139, U.S.A.

EMail: connolly@w3.org

http://www.w3.org/People/Connolly/

Larry Masinter

AT&T

75 Willow Road

Menlo Park, CA 94025

EMail: LM@att.com

http://larry.masinter.net

9. References

[CLIMAPS] Seidman, J., "A Proposed Extension to HTML: Client-Side

Image Maps", RFC1980, August 1996.

[FORMDATA] Masinter, L., "Returning Values from Forms:

multipart/form-data", RFC2388, August 1998.

[HTML20] Berners-Lee, T. and D. Connolly, "Hypertext Markup

Language - 2.0", RFC1866, November 1995.

[HTML30] Raggett, D., "HyperText Markup Language Specification

Version 3.0", September 1995. (Available at

<http://www.w3.org/MarkUp/html3/CoverPage>).

[HTML32] Raggett, D., "HTML 3.2 Reference Specification", W3C

Recommendation, January 1997.

Available at <http://www.w3.org/TR/REC-html32>.

[HTML40] Raggett, D., et al., "HTML 4.0 Specification", W3C

Recommendation, December 1997.

Available at <http://www.w3.org/TR/1998/REC-html40-19980424>.

[HTML401] Raggett, D., et al., "HTML 4.01 Specification", W3C

Recommendation, December 1999.

Available at <http://www.w3.org/TR/html401>.

[HTTP] Gettys, J., Fielding, R., Mogul, J., Frystyk, H.,

Masinter, L., Leach, P. and T. Berners-Lee, "Hypertext

Transfer Protocol -- HTTP/1.1", RFC2616, June 1999.

[I18N] Yergeau, F., Nicol, G. and M. Duerst,

"Internationalization of the Hypertext Markup Language",

RFC2070, January 1997.

[MHTML] Palme, J., Hotmann, A. and N. Shelness, "MIME

Encapsulation of Aggregate Documents, such as HTML

(MHTML)", RFC2557, March 1999.

[MIME] Freed, N. and N. Borenstein, "Multipurpose Internet Mail

Extensions (MIME) Part Two: Media Types", RFC2046,

November 1996.

[TABLES] Raggett, D., "HTML Tables", RFC1942, May 1996.

[UPLOAD] Nebel, E. and L. Masinter, "Form-based File Upload in

HTML", RFC1867, November 1995.

[URI] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform

Resource Identifiers (URI): Generic Syntax", RFC2396,

August 1998.

[XHTML1] "XHTML 1.0: The Extensible HyperText Markup Language: A

Reformulation of HTML 4 in XML 1.0", W3C Recommendation,

January 2000. Available at <http://www.w3.org/TR/xhtml1>.

10. Full Copyright Statement

Copyright (C) The Internet Society (2000). All Rights Reserved.

This document and translations of it may be copied and furnished to

others, and derivative works that comment on or otherwise explain it

or assist in its implementation may be prepared, copied, published

and distributed, in whole or in part, without restriction of any

kind, provided that the above copyright notice and this paragraph are

included on all such copies and derivative works. However, this

document itself may not be modified in any way, such as by removing

the copyright notice or references to the Internet Society or other

Internet organizations, except as needed for the purpose of

developing Internet standards in which case the procedures for

copyrights defined in the Internet Standards process must be

followed, or as required to translate it into languages other than

English.

The limited permissions granted above are perpetual and will not be

revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an

"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING

TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING

BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION

HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF

MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

Funding for the RFC Editor function is currently provided by the

Internet Society.

 
 
 