BitTorrent Protocol Specification, Version 1.0

王朝system · Author unknown · 2006-12-17

This text is excerpted from http://wiki.theory.org/BitTorrentSpecification, where it is titled "Bittorrent Protocol Specification v1.0". It is considerably more detailed than the description at http://www.bitconjurer.org/BitTorrent/protocol.html.

Bittorrent Protocol Specification v1.0

Identification

http://www.bitconjurer.org. BitTorrent is designed to facilitate file transfers among multiple peers across unreliable networks.

Purpose

http://www.bitconjurer.org/BitTorrent/protocol.html outlines the protocol in somewhat general terms, and lacks behavioral detail in some areas. The hope is that this document will become a formal specification, written in clear, unambiguous terms, which can be used as a basis for discussion and implementation in the future.

Scope

not here.

Related Documents

http://www.bitconjurer.org/BitTorrent/protocol.html - The official protocol specification.

BitTorrentWishList - A wish list for developers and end users alike.

BitTorrentTrackerExtensions - Describes the various extensions of the Tracker protocol that are in use.

Conventions

In this document, a number of conventions are used in an attempt to present information in a concise and unambiguous fashion.

peer v/s client: In this document, a peer is any BitTorrent client participating in a download. The client is also a peer, however it is the BitTorrent client that is running on the local machine. Readers of this specification may choose to think of themselves as the client which connects to numerous peers.

piece v/s block: In this document, a piece refers to a portion of the downloaded data that is described in the metainfo file, which can be verified by a SHA1 hash. A block is a portion of data that a client may request from a peer. Two or more blocks make up a whole piece, which may then be verified.

de facto standard: Large blocks of text in italics indicate a practice so common in various client implementations of BitTorrent that it is considered a de facto standard.

bencoding

byte strings

<string length encoded in base ten ASCII>:<string data>

Note that there is no constant beginning delimiter, and no ending delimiter.

Example: 4:spam represents the string "spam"

integers

i<integer encoded in base ten ASCII>e

The initial i and trailing e are beginning and ending delimiters.

You can have negative numbers such as i-3e. You cannot prefix the number with a zero such as i04e. However, i0e is valid.

Example: i3e represents the integer 3

lists

l<bencoded values>e

The initial l and trailing e are beginning and ending delimiters.

Example: l4:spam4:eggse represents the list of two strings: ["spam", "eggs"]

dictionaries

d<bencoded string><bencoded element>e

The initial d and trailing e are the beginning and ending delimiters.

Example: d3:cow3:moo4:spam4:eggse represents the dictionary { "cow" => "moo", "spam" => "eggs" }

Example: d4:spaml1:a1:bee represents the dictionary { "spam" => ["a", "b"] }
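The four bencoding rules above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names bencode and bdecode are chosen here, text strings are assumed to be UTF-8, and dictionary keys are written in sorted order (the official specification requires this, although the text above does not say so).

```python
def bencode(value):
    """Encode an int, bytes/str, list, or dict using the rules above."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")  # assumption: text is stored as UTF-8
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])  # keys go out in sorted order
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def bdecode(data, pos=0):
    """Decode one value starting at data[pos]; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        text = data[pos + 1:end]
        # i0e is valid, but leading zeros (i04e) are not.
        if text != b"0" and (text.startswith(b"0") or text.startswith(b"-0")):
            raise ValueError("invalid integer")
        return int(text.decode("ascii")), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            result[key], pos = bdecode(data, pos)
        return result, pos + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        length = int(data[pos:colon].decode("ascii"))
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"unexpected data at offset {pos}")
```

Decoded strings are returned as bytes rather than text, since a byte string such as the pieces value in a metainfo file is binary data.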

Metainfo File Structure

The specification for bencoding is defined above.

required fields:

info: a dictionary that describes the file(s) of the torrent. There are two possible forms: one for the case of a 'single-file' torrent with no directory structure, and one for the case of a 'multi-file' torrent, which can contain subdirectory trees.

For the case of the single-file mode, the info dictionary contains the following structure:

length: length of the file in bytes (integer)
md5sum: (optional) a 32-character hexadecimal string corresponding to the MD5 sum of the file. This is not used by BitTorrent at all, but it is included by some programs for greater compatibility.
name: the filename of the file. This is purely advisory. (string)
piece length: number of bytes in each piece (integer)
pieces: string consisting of the concatenation of all 20-byte SHA1 hash values, one per piece (byte string)

For the case of the multi-file mode, the info dictionary contains the following structure:

files: a list of dictionaries, one for each file. Each dictionary in this list contains the following keys:
    length: length of the file in bytes (integer)
    md5sum: (optional) a 32-character hexadecimal string corresponding to the MD5 sum of the file. This is not used by BitTorrent at all, but it is included by some programs for greater compatibility.
    path: a list containing one or more string elements that together represent the path and filename. Each element in the list corresponds to either a directory name or (in the case of the final element) the filename. For example, the file "dir1/dir2/file.ext" would consist of three string elements: "dir1", "dir2", and "file.ext". This is encoded as a bencoded list of strings such as l4:dir14:dir28:file.exte
name: the name of the top-most directory in the structure -- the directory which contains all of the files listed in the above files list. (character string)
piece length: number of bytes in each piece (integer)
pieces: string consisting of the concatenation of all 20-byte SHA1 hash values, one per piece (byte string)

announce: The announce URL of the tracker (string)
announce-list: (optional) this is an extension to the official specification, which is also backwards compatible. This key is used to implement lists of backup trackers. The full specification can be found at http://home.elp.rr.com/tur/multitracker-spec.txt
creation date: (optional) the creation time of the torrent, in standard Unix epoch format (integer seconds since 1-Jan-1970 00:00:00 UTC)
comment: (optional) free-form textual comments of the author (string)
created by: (optional) name and version of the program used to create the .torrent (string)

Notes

The piece length specifies the nominal piece size, and is usually a power of 2. The piece size is typically chosen based on the total amount of file data in the torrent, constrained by the fact that too small a piece size will result in a large .torrent metadata file, and piece sizes too large cause inefficiency. The general rule of thumb seems to be to pick the smallest piece size that results in a .torrent file no greater than approx. 50 - 75 kB.

The most common sizes are 256 kB, 512 kB, and 1 MB.

Every piece is of equal length except for the final piece, which is irregular. The number of pieces is thus determined by 'ceil( total length / piece size )'.

For the purposes of piece boundaries in the multi-file case, consider the file data as one long continuous stream, composed of the concatenation of each file in the order listed in the files list. The number of pieces and their boundaries are then determined in the same manner as the case of a single file. Pieces may overlap file boundaries.

Each piece has a corresponding SHA1 hash of the data contained within that piece. These hashes are concatenated to form the pieces value in the above info dictionary. Note that this is not a list but rather a single string. The length of the string must be a multiple of 20 bytes.

Tracker HTTP/HTTPS Protocol

info_hash: 20-byte SHA1 hash of the value of the info key from the Metainfo file. Note that the value will be a bencoded dictionary, given the definition of the info key above. Note: This string is always urlencoded, as opposed to peer_id, which needs to be unencoded.
peer_id: 20-byte string used as a unique ID for the client, generated by the client at startup. This is allowed to be any value, and may be binary data. There are currently no guidelines for generating this peer ID. However, one may rightly presume that it must at least be unique for your local machine, thus should probably incorporate things like process ID and perhaps a timestamp recorded at startup. See peer_id below for common client encodings of this field.
port: The port number that the client is listening on. Ports reserved for BitTorrent are typically 6881-6889. A client may choose to give up if it cannot establish a port within this range.
uploaded: The total amount uploaded (since the client sent the 'started' event to the tracker) in base ten ASCII. While not explicitly stated in the official specification, the consensus is that this should be the total number of bytes uploaded.
downloaded: The total amount downloaded (since the client sent the 'started' event to the tracker) in base ten ASCII. While not explicitly stated in the official specification, the consensus is that this should be the total number of bytes downloaded.
left: The number of bytes this client still has to download, encoded in base ten ASCII.
no_peer_id: Seems to be used by the Experimental BitTorrent client and clients such as ABC to tell the tracker if the client has no peer_id. 1 means that the client has a peer id and 0 means unknown peer id. Trackers should not allow this to be 0.
compact: Used in some BitTorrent trackers to check if the tracker is in compact mode.
event: If specified, must be one of started, completed, stopped (or empty, which is the same as not being specified). If not specified, then this request is one performed at regular intervals.
    started: The first request to the tracker must include the event key with the started value.
    stopped: Must be sent to the tracker if the client is shutting down gracefully.
    completed: Must be sent to the tracker when the download completes. However, must not be sent if the download was already 100% complete when the client started. Presumably, this is to allow the tracker to increment the "completed downloads" metric based solely on this event.
ip: Optional. The true IP address of the client machine, in dotted quad format or rfc3513 defined hexed IPv6 address.
    Notes: In general this parameter is not necessary as the address of the client can be determined from the IP address from which the HTTP request came. The parameter is only needed in the case where the IP address that the request came in on is not the IP address of the client. This happens if the client is communicating to the tracker through a proxy (or a transparent web proxy/cache.) It also is necessary when both the client and the tracker are on the same local side of a NAT gateway. The reason for this is that otherwise the tracker would give out the internal (RFC1918) address of the client, which is not routable. Therefore the client must explicitly state its (external, routable) IP address to be given out to external peers. Various trackers treat this parameter differently. Some honor it only if the IP address that the request came in on is in RFC1918 space. Others honor it unconditionally, while others ignore it completely. In the case of an IPv6 address (e.g. 2001:db8:1:2::100), it indicates only that the client can communicate via IPv6.
numwant: Optional. Number of peers that the client would like to receive from the tracker. This value is permitted to be zero. If omitted, typically defaults to 50 peers.
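As a rough illustration of how the request parameters above fit together, the following Python sketch builds an announce URL. The names make_peer_id and announce_url are hypothetical, the tracker URL in the usage note is a placeholder, and the announce URL is assumed to carry no query string of its own. The sketch percent-encodes every parameter (including peer_id) the way a generic HTTP query builder would, which is an assumption of this example, and it takes the already-bencoded info dictionary as input.

```python
import hashlib
import os
from urllib.parse import urlencode

VALID_EVENTS = {"started", "completed", "stopped"}


def make_peer_id():
    # Assumption: 20 random bytes. The text only asks for an ID that is
    # unique on the local machine.
    return os.urandom(20)


def announce_url(announce, info_bencoded, peer_id, port,
                 uploaded, downloaded, left, event=None, numwant=None):
    """Build the GET URL for one tracker request."""
    if event is not None and event not in VALID_EVENTS:
        raise ValueError(f"unknown event: {event!r}")
    # info_hash is the SHA1 of the bencoded value of the info key.
    info_hash = hashlib.sha1(info_bencoded).digest()
    params = {
        "info_hash": info_hash,
        "peer_id": peer_id,
        "port": port,
        "uploaded": uploaded,      # base ten ASCII byte counts
        "downloaded": downloaded,
        "left": left,
    }
    if event is not None:
        params["event"] = event
    if numwant is not None:
        params["numwant"] = numwant
    # urlencode percent-encodes bytes values and str()s the integers.
    return announce + "?" + urlencode(params)
```

A first request for a fresh download might look like announce_url("http://tracker.example/announce", info_bytes, make_peer_id(), 6881, 0, 0, total_size, event="started"), where info_bytes and total_size come from the metainfo file.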

failure reason: If present, then no other keys may be present. The value is a human-readable error message as to why the request failed (string).
interval: Interval in seconds that the client should wait between sending regular requests to the tracker
complete: number of peers with the entire file, i.e. seeders (integer)
incomplete: number of non-seeder peers, aka "leechers" (integer)
peers: The value is a list of dictionaries, each with the following keys:
    peer id: peer's self-selected ID, as described above for the tracker request (string)
    ip: peer's IP address (either IPv6 or IPv4) or DNS name (string)
    port: peer's port number (integer)

The tracker may choose to implement a more intelligent mechanism for peer selection when responding to a request. For instance, reporting seeds to other seeders could be avoided.

However, it is considered bad practice to "hammer" on a tracker to get multiple peers. If a client wants a large peer list in the response, then it should specify the numwant parameter.
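The response keys listed above could be read as follows. This is a sketch under stated assumptions: the response has already been bdecoded into a Python dictionary with bytes keys, and the names TrackerFailure and parse_tracker_response are made up for this example.

```python
class TrackerFailure(Exception):
    """Raised when the tracker response carries a failure reason."""


def parse_tracker_response(response):
    """Return (interval, [(ip, port, peer_id), ...]) from a decoded response."""
    if b"failure reason" in response:
        # When this key is present, no other keys may be present.
        reason = response[b"failure reason"].decode("utf-8", "replace")
        raise TrackerFailure(reason)
    peers = [
        (peer[b"ip"].decode("ascii"), peer[b"port"], peer.get(b"peer id"))
        for peer in response[b"peers"]
    ]
    return response[b"interval"], peers
```

The interval value is the number of seconds to wait before the next regular request, so a client would normally schedule its next announce from it instead of polling more often.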

Tracker 'scrape' Convention

not to be done. This standard is documented by Bram in the BitTorrent development list archive:

http://groups.yahoo.com/group/BitTorrent/message/3275

info_hash, a 20-byte value as described above. This restricts the tracker's report to that particular torrent. Otherwise stats for all torrents that the tracker is managing are returned. Software authors are strongly encouraged to use the info_hash parameter when at all possible, to reduce the load and bandwidth of the tracker.

files: a dictionary containing one key/value pair for each torrent for which there are stats. If info_hash was supplied and was valid, this dictionary will contain a single key/value. Each key consists of a 20-byte binary info_hash value. The value of that key is yet another nested dictionary containing the following:

complete: number of peers with the entire file, i.e. seeders (integer)
downloaded: total number of times the tracker has registered a completion ("event=completed", i.e. a client finished downloading the torrent)
incomplete: number of non-seeder peers, aka "leechers" (integer)
name: (optional) the torrent's internal name, as specified by the "name" key in the info section of the .torrent file

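Example: in the response d5:filesd20:....................d8:completei5e10:downloadedi50e10:incompletei10eeee, .................... is the 20-byte info_hash and there are 5 seeders, 10 leechers, and 50 complete downloads.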

Peer wire protocol (TCP)

Overview

metainfo file.

A client must maintain state information for each connection that it has with a remote peer:

choked: Whether or not the remote peer has choked this client. When a peer chokes the client, it is a notification that no requests will be answered until the client is unchoked. The client should not attempt to send requests for blocks, and it should consider all pending (unanswered) requests to be discarded by the remote peer.
interested: Whether or not the remote peer is interested in something this client has to offer. This is a notification that the remote peer will begin requesting blocks when the client unchokes them.

Note that this also implies that the client will also need to keep track of whether or not it is interested in the remote peer, and if it has the remote peer choked or unchoked. So, the real list looks something like this:

am_choking: this client is choking the peer
am_interested: this client is interested in the peer
peer_choking: peer is choking this client
peer_interested: peer is interested in this client

Client connections start out as "choked" and "not interested". In other words:

am_choking = 1
am_interested = 0
peer_choking = 1
peer_interested = 0
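The four flags and their initial values map naturally onto a small per-connection record. The sketch below is illustrative only: the class name ConnectionState and the two helper methods are not part of the protocol, and the helpers simply restate the choke and interest rules described above.

```python
from dataclasses import dataclass


@dataclass
class ConnectionState:
    """State kept for one connection to a remote peer."""
    am_choking: bool = True        # this client is choking the peer
    am_interested: bool = False    # this client is interested in the peer
    peer_choking: bool = True      # peer is choking this client
    peer_interested: bool = False  # peer is interested in this client

    def can_request(self):
        """Requests are only answered while the peer is not choking us."""
        return self.am_interested and not self.peer_choking

    def will_serve(self):
        """The peer's requests are served only while we are not choking it."""
        return self.peer_interested and not self.am_choking
```

A client would flip these flags as choke, unchoke, interested, and not interested messages arrive or are sent.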

Data Types

Message flow

Handshake

handshake: <pstrlen><pstr><reserved><info_hash><peer_id>

pstrlen: string length of <pstr>, as a single raw byte
pstr: string identifier of the protocol
reserved: eight (8) reserved bytes. All current implementations use all zeroes. Each bit in these bytes can be used to change the behavior of the protocol. An email from Bram suggests that trailing bits should be used first, so that leading bits may be used to change the meaning of trailing bits.
info_hash: 20-byte SHA1 hash of the info key in the metainfo file. This is the same info_hash that is transmitted in tracker requests.
peer_id: 20-byte string used as a unique ID for the client. This is the same peer_id that is transmitted in tracker requests.

Note that the initiator presumably received the peer information from the tracker, which includes the peer_id that was registered by the peer. The peer_id from the tracker and in the handshake are expected to match.
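The handshake layout can be sketched in Python as below. The function names are hypothetical. The protocol identifier "BitTorrent protocol" (so pstrlen = 19) is the value used by version 1.0 of the protocol; it is not stated in the text above and is filled in here from the official specification.

```python
PSTR = b"BitTorrent protocol"  # version 1.0 identifier, 19 bytes


def build_handshake(info_hash, peer_id, reserved=b"\x00" * 8):
    """Return <pstrlen><pstr><reserved><info_hash><peer_id> as bytes."""
    if len(info_hash) != 20 or len(peer_id) != 20 or len(reserved) != 8:
        raise ValueError("info_hash and peer_id are 20 bytes, reserved is 8")
    return bytes([len(PSTR)]) + PSTR + reserved + info_hash + peer_id


def parse_handshake(data):
    """Split a received handshake into (pstr, reserved, info_hash, peer_id)."""
    if not data:
        raise ValueError("empty handshake")
    pstrlen = data[0]                  # single raw byte
    total = 1 + pstrlen + 8 + 20 + 20
    if len(data) < total:
        raise ValueError("truncated handshake")
    pstr = data[1:1 + pstrlen]
    rest = data[1 + pstrlen:total]
    return pstr, rest[:8], rest[8:28], rest[28:48]
```

After parsing, a client would typically compare the received info_hash with the torrent it is serving, and the received peer_id with the one the tracker reported for that peer.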

peer_id

Known clients that use this encoding style are:

'AZ' - Azureus
'BB' - BitBuddy
'CT' - CTorrent
'MT' - MoonlightTorrent
'LT' - libtorrent
'BX' - Bittorrent X
'TS' - Torrentstorm
'TN' - TorrentDotNET
'SS' - SwarmScope
'XT' - XanTorrent
'BS' - BTSlave
'ZT' - ZipTorrent

Known clients that use this encoding style are:

'S' - Shadow's client
'U' - UPnP NAT Bit Torrent
'T' - BitTornado
'A' - ABC

BitComet does something different still. Its peer_id consists of four ASCII characters 'exbc', followed by a null byte, followed by a single ASCII numeric digit, followed by random characters. The digit seems to denote the version of the software, though it appears to have no connection with the real version number. The digit is incremented with each new BitComet release.

Bram's client).

Messages

keep-alive: <len=0000>

The keep-alive message is a message with zero bytes, specified with the length prefix set to zero. There is no message ID and no payload. Peers may close a connection if they receive no messages for a certain period of time, so a keep-alive message can be sent to maintain the connection. A keep-alive message is generally sent once every two minutes.

choke: <len=0001><id=0>

The choke message is fixed-length and has no payload.

unchoke: <len=0001><id=1>

The unchoke message is fixed-length and has no payload.

interested: <len=0001><id=2>

The interested message is fixed-length and has no payload.

not interested: <len=0001><id=3>

The not interested message is fixed-length and has no payload.

have: <len=0005><id=4><piece index>

The have message is fixed length. The payload is the zero-based index of a piece that has just been successfully downloaded and verified via the hash.

bitfield: <len=0001+X><id=5><bitfield>

The bitfield message may only be sent immediately after the handshaking sequence is completed, and before any other messages are sent. It is optional, and need not be sent if a client has no pieces.

The bitfield message is variable length, where X is the length of the bitfield. The payload is a bitfield representing the pieces that have been successfully downloaded. The high bit in the first byte corresponds to piece index 0. Bits that are cleared indicate a missing piece, and set bits indicate a valid and available piece. Spare bits at the end are set to zero.
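The bit ordering of the bitfield payload is easy to get backwards, so a short sketch may help. The helper names below are hypothetical; the only rule taken from the text is that the high bit of the first byte is piece index 0 and that spare bits stay zero.

```python
def make_bitfield(num_pieces, have=()):
    """Return a bytearray with one bit per piece, spare bits left at zero."""
    bitfield = bytearray((num_pieces + 7) // 8)
    for index in have:
        set_piece(bitfield, index)
    return bitfield


def set_piece(bitfield, index):
    """Mark piece `index` as downloaded and verified."""
    byte, offset = divmod(index, 8)
    bitfield[byte] |= 0x80 >> offset   # offset 0 is the high bit


def has_piece(bitfield, index):
    """Report whether the bit for piece `index` is set."""
    byte, offset = divmod(index, 8)
    return bool(bitfield[byte] & (0x80 >> offset))
```

For a 10-piece torrent where pieces 0 and 9 are present, make_bitfield(10, [0, 9]) yields the two bytes 0x80 0x40, with the last six bits of the second byte unused.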

request: <len=0013><id=6><index><begin><length>

The request message is fixed length, and is used to request a block. The payload contains the following information:

index: integer specifying the zero-based piece index
begin: integer specifying the zero-based byte offset within the piece
length: integer specifying the requested length. This value should normally be 2^14 (16384) bytes. Smaller values may be used but are usually not needed except in rare cases like a piece length not divisible by 16384.

piece: <len=0009+X><id=7><index><begin><block>

The piece message is variable length, where X is the length of the block. The payload contains the following information:

index: integer specifying the zero-based piece index
begin: integer specifying the zero-based byte offset within the piece
block: block of data, which is a subset of the piece specified by index.

cancel: <len=0013><id=8><index><begin><length>

The cancel message is fixed length, and is used to cancel block requests. The payload is identical to that of the "request" message. It is typically used during "End Game" (see the Algorithms section below).
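The message formats above share one framing: a length prefix, an optional one-byte message ID, and a payload. The Python sketch below assumes the length prefix and all integer fields are four-byte big-endian values, which is what the official specification uses (the Data Types section of this copy is empty). The function names are made up for the example.

```python
import struct


def encode_message(msg_id=None, payload=b""):
    """Frame one message; msg_id=None produces a keep-alive."""
    if msg_id is None:
        return struct.pack(">I", 0)
    return struct.pack(">IB", 1 + len(payload), msg_id) + payload


def encode_have(index):
    return encode_message(4, struct.pack(">I", index))


def encode_request(index, begin, length=16384):
    return encode_message(6, struct.pack(">III", index, begin, length))


def encode_cancel(index, begin, length=16384):
    # Same payload as request, different message ID.
    return encode_message(8, struct.pack(">III", index, begin, length))


def decode_message(buffer):
    """Return (msg_id, payload, remaining), or None if more data is needed.

    A keep-alive is reported with msg_id None and an empty payload.
    """
    if len(buffer) < 4:
        return None
    (length,) = struct.unpack(">I", buffer[:4])
    if len(buffer) < 4 + length:
        return None
    if length == 0:
        return None, b"", buffer[4:]
    return buffer[4], buffer[5:4 + length], buffer[4 + length:]
```

Because TCP delivers a byte stream, a receiver would append incoming data to a buffer and call decode_message repeatedly until it returns None.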

Algorithms

Super Seeding

NOT recommended for general use. While it does assist in the wider distribution of rare data, because it limits the selection of pieces a client can download, it also limits the ability of those clients to download data for pieces they have already partially retrieved. Therefore, super-seed mode is only recommended for initial seeding servers.

Why not rename it to e.g. "Initial Seeding Mode" or "Releaser Mode" then?

Piece downloading strategy

rarest first order. The client can determine this by keeping the initial bitfield from each peer, and updating it with every have message. Then, the client can download the pieces that appear least frequently in these peer bitfields.
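The rarest-first bookkeeping described above can be sketched as a simple count over peer bitfields. For brevity this example represents each peer's bitfield as a set of piece indices, which is an assumption of the sketch, and rarest_first is a hypothetical name. Ties are broken by piece index here; a real client would more likely break them at random.

```python
from collections import Counter


def rarest_first(peer_pieces, have, num_pieces):
    """Order the pieces we still need from rarest to most common.

    peer_pieces: one set of piece indices per connected peer, built from
                 the initial bitfield and updated on every have message.
    have:        set of piece indices this client already holds.
    """
    counts = Counter()
    for pieces in peer_pieces:
        counts.update(pieces)
    # Only pieces that at least one peer can supply are worth requesting.
    wanted = [i for i in range(num_pieces)
              if i not in have and counts[i] > 0]
    return sorted(wanted, key=lambda i: (counts[i], i))
```

With three peers holding {0, 1, 2}, {0, 1}, and {0, 3}, and piece 3 already downloaded, the order is piece 2 (one peer), then piece 1 (two peers), then piece 0 (three peers).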

End Game

Choking and Optimistic Unchoking

downloaders, because they are interested in downloading from the client.

downloaders) but aren't interested get unchoked. If they become interested, the downloader with the worst upload rate gets choked. If a client has a complete file, it uses its upload rate rather than its download rate to decide which peers to unchoke.

downloaders). Which peer is optimistically unchoked rotates every 30 seconds. Newly connected peers are three times as likely to start as the current optimistic unchoke as anywhere else in the rotation. This gives them a decent chance of getting a complete piece to upload.
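The weighting of newly connected peers in the optimistic unchoke rotation could be sketched as a weighted random choice. This is only one possible reading of the rule: the function name is hypothetical, the candidate list (peers that are currently choked) is an assumption, and the caller is expected to invoke the function once every 30 seconds.

```python
import random


def pick_optimistic_unchoke(candidates, newly_connected, rng=random):
    """Pick the next optimistic unchoke, or None if nobody is eligible.

    candidates:      peers eligible for the optimistic unchoke slot.
    newly_connected: subset of peers that connected recently; each one is
                     given three times the weight of any other candidate.
    rng:             the random module or a random.Random instance.
    """
    candidates = list(candidates)
    if not candidates:
        return None
    weights = [3 if peer in newly_connected else 1 for peer in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

Passing a seeded random.Random instance as rng makes the choice reproducible, which is convenient when testing a choking implementation.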

Anti-snubbing (extension not in the official protocol)

Occasionally a BitTorrent peer will be choked by all peers which it was formerly downloading from. In such cases it will usually continue to get poor download rates until the optimistic unchoke finds better peers. To mitigate this problem, when over a minute goes by without getting a single piece from a particular peer, BitTorrent assumes it is "snubbed" by that peer and doesn't upload to it except as an optimistic unchoke. This frequently results in more than one concurrent optimistic unchoke (an exception to the exactly one optimistic unchoke rule mentioned above), which causes download rates to recover much more quickly when they falter.
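The one-minute snubbing rule amounts to remembering when each peer last delivered a piece. The sketch below uses hypothetical names (SnubTracker and its methods) and assumes the timer starts when the peer connects; the 60-second threshold comes from the "over a minute" wording above.

```python
import time

SNUB_TIMEOUT = 60.0  # seconds without a piece before a peer counts as snubbing


class SnubTracker:
    """Track, per peer, how long it has been since a piece arrived."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so tests can fake time
        self._last_piece = {}

    def peer_connected(self, peer):
        # Assumption: the timer starts at connection time.
        self._last_piece[peer] = self._clock()

    def piece_received(self, peer):
        self._last_piece[peer] = self._clock()

    def is_snubbed(self, peer):
        """True when more than a minute has passed without a piece."""
        return self._clock() - self._last_piece[peer] > SNUB_TIMEOUT
```

A choking routine would skip peers for which is_snubbed returns True when choosing regular unchokes, leaving them eligible only for the optimistic unchoke.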
