Parent Directory
|
Revision Log
Clean up the formatting of the apt-dht protocol page.
Apt-DHT's Distributed Hash Table is very similar to BitTorrent's.
BitTorrent uses a "distributed sloppy hash table" (DHT) for storing peer
contact information for "trackerless" torrents. In effect, each peer
becomes a tracker. The protocol is based on [Kademlia][10] and is
implemented over UDP.
[10]: http://www.cs.rice.edu/Conferences/IPTPS02/109.pdf
Please note the terminology used in this document to avoid confusion. A
"peer" is a client/server listening on a TCP port that implements the
BitTorrent protocol. A "node" is a client/server listening on a UDP port
implementing the distributed hash table protocol. The DHT is composed of
nodes and stores the location of peers. BitTorrent clients include a DHT
node, which is used to contact other nodes in the DHT to get the
location of peers to download from using the BitTorrent protocol.
## Overview
Each node has a globally unique identifier known as the "node ID." Node
IDs are chosen at random from the same 160-bit space as file hashes (use
SHA1 and plenty of entropy to ensure a unique ID). A "distance metric"
is used to compare two node IDs or a node ID and an infohash for
"closeness." Nodes must maintain a routing table containing the contact
information for a small number of other nodes. The routing table becomes
more detailed as IDs get closer to the node's own ID. Nodes know about
many other nodes in the DHT that have IDs that are "close" to their own
but have only a handful of contacts with IDs that are very far away from
their own.
In Kademlia, the distance metric is XOR and the result is interpreted as
an unsigned integer. distance(A,B) = |A xor B| Smaller values are
closer.
When a node wants to find peers for a file download, it uses the
distance metric to compare the hash of the file with the IDs of the
nodes in its own routing table. It then contacts the nodes it knows
about with IDs closest to the file hash and asks them for the contact
information of peers that currently have the file. If a contacted node
knows about peers for the file, the peer contact information is returned
with the response. Otherwise, the contacted node must respond with the
contact information of the nodes in its routing table that are closest
to the hash of the file. The original node iteratively queries nodes
that are closer to the target hash until it cannot find any closer
nodes. After the download is completed, the client then inserts the peer
contact information for itself onto the responding nodes with IDs
closest to the hash of the file.
## Routing Table
Every node maintains a routing table of known good nodes. The nodes in
the routing table are used as starting points for queries in the DHT.
Nodes from the routing table are returned in response to queries from
other nodes.
Not all nodes that we learn about are equal. Some are "good" and some
are not. Many nodes using the DHT are able to send queries and receive
responses, but are not able to respond to queries from other nodes. It
is important that each node's routing table must contain only known good
nodes. A good node is a node that has responded to one of our queries
within the last 15 minutes. A node is also good if it has ever responded
to one of our queries and has sent us a query within the last 15
minutes. After 15 minutes of inactivity, a node becomes questionable.
Nodes become bad when they fail to respond to multiple queries in a row.
Nodes that we know are good are given priority over nodes with unknown
status.
The routing table covers the entire node ID space from 0 to
2<sup>160</sup>. The routing table is subdivided into "buckets" that
each cover a portion of the space. An empty table has one bucket with an
ID space range of min=0, max=2<sup>160</sup>. When a node with ID "N" is
inserted into the table, it is placed within the bucket that has min
<= N < max. An empty table has only one bucket so any node must
fit within it. Each bucket can only hold K nodes, currently eight,
before becoming "full." When a bucket is full of known good nodes, no
more nodes may be added unless our own node ID falls within the range of
the bucket. In that case, the bucket is replaced by two new buckets each
with half the range of the old bucket and the nodes from the old bucket
are distributed among the two new ones. For a new table with only one
bucket, the full bucket is always split into two new buckets covering
the ranges 0..2<sup>159</sup> and 2<sup>159</sup>..2<sup>160</sup>.
When the bucket is full of good nodes, the new node is simply discarded.
If any nodes in the bucket are known to have become bad, then one is
replaced by the new node. If there are any questionable nodes in the
bucket have not been seen in the last 15 minutes, the least recently
seen node is pinged. If the pinged node responds then the next least
recently seen questionable node is pinged until one fails to respond or
all of the nodes in the bucket are known to be good. If a node in the
bucket fails to respond to a ping, it is suggested to try once more
before discarding the node and replacing it with a new good node. In
this way, the table fills with stable long running nodes.
Each bucket should maintain a "last changed" property to indicate how
"fresh" the contents are. When a node in a bucket is pinged and it
responds, or a node is added to a bucket, or a node in a bucket is
replaced with another node, the bucket's last changed property should be
updated. Buckets that have not been changed in 15 minutes should be
"refreshed." This is done by picking a random ID in the range of the
bucket and performing a find_nodes search on it. Nodes that are able to
receive queries from other nodes usually do not need to refresh buckets
often. Nodes that are not able to receive queries from other nodes
usually will need to refresh all buckets periodically to ensure there
are good nodes in their table when the DHT is needed.
Upon inserting the first node into its routing table and when starting
up thereafter, the node should attempt to find the closest nodes in the
DHT to itself. It does this by issuing find_node messages to closer and
closer nodes until it cannot find any closer. The routing table should
be saved between invocations of the client software.
## KRPC Protocol
The KRPC protocol is a simple RPC mechanism consisting of bencoded
dictionaries sent over UDP. A single query packet is sent out and a
single packet is sent in response. There is no retry. There are three
message types: query, response, and error. For the DHT protocol, there
are currently four queries: ping, join, find_node, get_value, and
store_value.
A KRPC message is a single dictionary with two keys common to every
message and additional keys depending on the type of message. Every
message has a key "t" with a string value representing a transaction ID.
This transaction ID is generated by the querying node and is echoed in
the response, so responses may be correlated with multiple queries to
the same node. The transaction ID should be encoded as a random binary
string, typically 20 bytes in length corresponding to the SHA1 hash.
The other key contained in every KRPC message is "y" with a single
character value describing the type of message. The value of the "y" key
is one of "q" for query, "r" for response, or "e" for error.
### Queries
Queries, or KRPC message dictionaries with a "y" value of "q", contain
two additional keys; "q" and "a". Key "q" has a string value containing
the method name of the query. Key "a" has a dictionary value containing
named arguments to the query.
### Responses
Responses, or KRPC message dictionaries with a "y" value of "r", contain
one additional key "r". The value of "r" is a dictionary containing
named return values. Response messages are sent upon successful
completion of a query.
### Errors
Errors, or KRPC message dictionaries with a "y" value of "e", contain
one additional key "e". The value of "e" is a list. The first element is
an integer representing the error code. The second element is a string
containing the error message. Errors are sent when a query cannot be
fulfilled. The following table describes the possible error codes:
<table>
<th align="left"><td>Code</td><td>Description</td></th>
<tr><td>201</td><td>Generic Error</td></tr>
<tr><td>202</td><td>Server Error</td></tr>
<tr><td>203</td><td>Protocol Error, such as a malformed packet, invalid arguments, or bad token</td></tr>
<tr><td>204</td><td>Method Unknown</td></tr>
</table>
#### Example Error Packets
generic error = {"t":"12345678901234567890", "y":"e", "e":[201, "A Generic Error Ocurred"]}
bencoded = d1:eli201e23:A Generic Error Ocurrede1:t20:123456789012345678901:y1:ee
### Contact Encoding
Contact information for peers is encoded as a 6-byte string. Also known
as "Compact IP-address/port info" the 4-byte IP address is in network
byte order with the 2 byte port in network byte order concatenated onto
the end.
Contact information for nodes is encoded as a 26-byte string. Also known
as "Compact node info" the 20-byte Node ID in network byte order has the
compact IP-address/port info concatenated to the end.
## DHT Queries
All queries have an "id" key and value containing the node ID of the
querying node. All responses have an "id" key and value containing the
node ID of the responding node.
### ping
The most basic query is a ping. "q" = "ping" A ping query has a single
argument, "id" the value is a 20-byte string containing the senders node
ID in network byte order. The appropriate response to a ping has a
single key "id" containing the node ID of the responding node.
arguments: {"id": "<querying nodes id>"}
response: {"id": "<queried nodes id>"}
#### Example Packets
ping Query = {"t":"12345678901234567890", "y":"q", "q":"ping", "a":{"id":"abcdefghij0123456789"}}
bencoded = d1:ad2:id20:abcdefghij0123456789e1:q4:ping1:t20:123456789012345678901:y1:qe
Response = {"t":"12345678901234567890", "y":"r", "r": {"id":"mnopqrstuvwxyz123456"}}
bencoded = d1:rd2:id20:mnopqrstuvwxyz123456e1:t20:123456789012345678901:y1:re
### join
join is identical to ping, except in the response provided. It is
intended to be sent only to bootstrap nodes during the process of a new
peer joining the system. The reponse to a "q" = "join" query has two
additional responses, a key "ip_addr" whose value is the bootstrap
node's understanding of the peer's IP address, and a key "port" whose
value is the port number that the bootstrap node received the request
from.
arguments: {"id": "<querying nodes id>"}
response: {"id": "<queried nodes id>", "ip_addr": "<querying nodes IP>", "port": "<querying nodes port>"}
#### Example Packets
join Query = {"t":"12345678901234567890", "y":"q", "q":"join", "a":{"id":"abcdefghij0123456789"}}
bencoded = d1:ad2:id20:abcdefghij0123456789e1:q4:join1:t20:123456789012345678901:y1:qe
Response = {"t":"12345678901234567890", "y":"r", "r": {"id":"mnopqrstuvwxyz123456", "ip_addr":"123.123.123.123", "port":12345}}
bencoded = d1:rd2:id20:mnopqrstuvwxyz1234567:ip_addr15:123.123.123.1234:porti12345ee1:t20:123456789012345678901:y1:re
### find_node
Find node is used to find the contact information for a node given its
ID. "q" == "find_node" A find_node query has two arguments, "id"
containing the node ID of the querying node, and "target" containing the
ID of the node sought by the queryer. When a node receives a find_node
query, it should respond with a key "nodes" and value of a string
containing the compact node info for the target node or the K (8)
closest good nodes in its own routing table.
arguments: {"id": "<querying nodes id>", "target": "<id of target node>"}
response: {"id": "<queried nodes id>", "nodes": "<compact node info>"}
#### Example Packets
find_node Query = {"t":"12345678901234567890", "y":"q", "q":"find_node", "a": {"id":"abcdefghij0123456789", "target":"mnopqrstuvwxyz123456"}}
bencoded = d1:ad2:id20:abcdefghij01234567896:target20:mnopqrstuvwxyz123456e1:q9:find_node1:t20:123456789012345678901:y1:qe
Response = {"t":"12345678901234567890", "y":"r", "r": {"id":"0123456789abcdefghij", "nodes": "def456..."}}
bencoded = d1:rd2:id20:0123456789abcdefghij5:nodes9:def456...e1:t20:123456789012345678901:y1:re
### get_value
Get peers associated with a file hash. "q" = "get_value" A get_value
query has two arguments, "id" containing the node ID of the querying
node, and "key" containing the hash of the file. If the queried node has
peers for the hash, they are returned in a key "values" as a list of
strings. Each string containing "compact" format peer information for a
single peer. If the queried node has no peers for the hash, a key
"nodes" is returned containing the K nodes in the queried nodes routing
table closest to the hash supplied in the query.
arguments: {"id": "<querying nodes id>", "key": "<20-byte hash of target file>"}
response: {"id": "<queried nodes id>", "values": ["<peer 1 info string>", "<peer 2 info string>"]}
or: {"id": "<queried nodes id>", "nodes": "<compact node info>"}
#### Example Packets
get_value Query = {"t":"12345678901234567890", "y":"q", "q":"get_value", "a": {"id":"abcdefghij0123456789", "key":"mnopqrstuvwxyz123456"}}
bencoded = d1:ad2:id20:abcdefghij01234567893:key20:mnopqrstuvwxyz123456e1:q9:get_value1:t20:123456789012345678901:y1:qe
Response with peers = {"t":"12345678901234567890", "y":"r", "r": {"id":"abcdefghij0123456789", "values": ["axje.u", "idhtnm"]}}
bencoded = d1:rd2:id20:abcdefghij01234567896:valuesl6:axje.u6:idhtnmee1:t20:123456789012345678901:y1:re
Response with nodes = {"t":"12345678901234567890", "y":"r", "r": {"id":"abcdefghij0123456789", "nodes": "def456..."}}
bencoded = d1:rd2:id20:abcdefghij01234567895:nodes9:def456...e1:t20:123456789012345678901:y1:re
### store_value
Announce that the peer, controlling the querying node, has the file
available for download. store_value has two arguments: "id" containing
the node ID of the querying node and "key" containing the hash of the
file. The queried node should store the contact information of the
querying node under the file hash in its store of peer contact
information.
arguments: {"id" : "<querying nodes id>", "key" : "<20-byte hash of target file>"}
response: {"id" : "<queried nodes id>"}
#### Example Packets
store_value Query = {"t":"12345678901234567890", "y":"q", "q":"store_value", "a": {"id":"abcdefghij0123456789", "key":"mnopqrstuvwxyz123456"}}
bencoded = d1:ad2:id20:abcdefghij01234567893:key20:mnopqrstuvwxyz123456e1:q13:store_value1:t20:123456789012345678901:y1:qe
Response = {"t":"12345678901234567890", "y":"r", "r": {"id":"mnopqrstuvwxyz123456"}}
bencoded = d1:rd2:id20:mnopqrstuvwxyz123456e1:t20:123456789012345678901:y1:re
| CVS Admin | ViewVC Help |
| Powered by ViewVC 1.0.5 |