Parent Directory
|
Revision Log
Revision 123 -
(view)
(download)
Original Path: trunk/apt-dht/protocol.mdwn
| 1 : | camrdale | 118 | Apt-DHT's Distributed Hash Table is very similar to BitTorrent's. |
| 2 : | BitTorrent uses a "distributed sloppy hash table" (DHT) for storing peer | ||
| 3 : | contact information for "trackerless" torrents. In effect, each peer | ||
| 4 : | camrdale | 119 | becomes a tracker. The protocol is based on [Kademlia][10] and is |
| 5 : | camrdale | 118 | implemented over UDP. |
| 6 : | |||
| 7 : | [10]: http://www.cs.rice.edu/Conferences/IPTPS02/109.pdf | ||
| 8 : | |||
| 9 : | Please note the terminology used in this document to avoid confusion. A | ||
| 10 : | "peer" is a client/server listening on a TCP port that implements the | ||
| 11 : | BitTorrent protocol. A "node" is a client/server listening on a UDP port | ||
| 12 : | implementing the distributed hash table protocol. The DHT is composed of | ||
| 13 : | nodes and stores the location of peers. BitTorrent clients include a DHT | ||
| 14 : | node, which is used to contact other nodes in the DHT to get the | ||
| 15 : | location of peers to download from using the BitTorrent protocol. | ||
| 16 : | |||
| 17 : | ## Overview | ||
| 18 : | |||
| 19 : | Each node has a globally unique identifier known as the "node ID." Node | ||
| 20 : | IDs are chosen at random from the same 160-bit space as file hashes (use | ||
| 21 : | SHA1 and plenty of entropy to ensure a unique ID). A "distance metric" | ||
| 22 : | is used to compare two node IDs or a node ID and an infohash for | ||
| 23 : | "closeness." Nodes must maintain a routing table containing the contact | ||
| 24 : | information for a small number of other nodes. The routing table becomes | ||
| 25 : | more detailed as IDs get closer to the node's own ID. Nodes know about | ||
| 26 : | many other nodes in the DHT that have IDs that are "close" to their own | ||
| 27 : | but have only a handful of contacts with IDs that are very far away from | ||
| 28 : | their own. | ||
| 29 : | |||
| 30 : | In Kademlia, the distance metric is XOR and the result is interpreted as | ||
| 31 : | camrdale | 121 | an unsigned integer. distance(A,B) = |A XOR B| Smaller values are |
| 32 : | camrdale | 118 | closer. |
| 33 : | |||
| 34 : | When a node wants to find peers for a file download, it uses the | ||
| 35 : | distance metric to compare the hash of the file with the IDs of the | ||
| 36 : | nodes in its own routing table. It then contacts the nodes it knows | ||
| 37 : | about with IDs closest to the file hash and asks them for the contact | ||
| 38 : | information of peers that currently have the file. If a contacted node | ||
| 39 : | knows about peers for the file, the peer contact information is returned | ||
| 40 : | with the response. Otherwise, the contacted node must respond with the | ||
| 41 : | contact information of the nodes in its routing table that are closest | ||
| 42 : | to the hash of the file. The original node iteratively queries nodes | ||
| 43 : | that are closer to the target hash until it cannot find any closer | ||
| 44 : | nodes. After the download is completed, the client then inserts the peer | ||
| 45 : | contact information for itself onto the responding nodes with IDs | ||
| 46 : | closest to the hash of the file. | ||
| 47 : | |||
| 48 : | camrdale | 121 | The return value for a query for peers includes an opaque value known as |
| 49 : | the "token." For a node to announce that its controlling peer is | ||
| 50 : | downloading a torrent, it must present the token received from the same | ||
| 51 : | queried node in a recent query for nodes. When a node attempts to | ||
| 52 : | "announce" a torrent, the queried node checks the token against the | ||
| 53 : | querying node's IP address. This is to prevent malicious hosts from | ||
| 54 : | signing up other hosts for downloads. Since the token is merely returned | ||
| 55 : | by the querying node to the same node it received the token from, the | ||
| 56 : | implementation is not defined. Tokens must be accepted for a reasonable | ||
| 57 : | amount of time after they have been distributed. The current | ||
| 58 : | implementation uses the SHA1 hash of the IP address concatenated onto a | ||
| 59 : | secret that changes every 15 minutes and tokens up to one hour old are | ||
| 60 : | accepted. | ||
| 61 : | |||
| 62 : | camrdale | 118 | ## Routing Table |
| 63 : | |||
| 64 : | Every node maintains a routing table of known good nodes. The nodes in | ||
| 65 : | the routing table are used as starting points for queries in the DHT. | ||
| 66 : | Nodes from the routing table are returned in response to queries from | ||
| 67 : | other nodes. | ||
| 68 : | |||
| 69 : | Not all nodes that we learn about are equal. Some are "good" and some | ||
| 70 : | are not. Many nodes using the DHT are able to send queries and receive | ||
| 71 : | responses, but are not able to respond to queries from other nodes. It | ||
| 72 : | is important that each node's routing table must contain only known good | ||
| 73 : | nodes. A good node is a node that has responded to one of our queries | ||
| 74 : | within the last 15 minutes. A node is also good if it has ever responded | ||
| 75 : | to one of our queries and has sent us a query within the last 15 | ||
| 76 : | minutes. After 15 minutes of inactivity, a node becomes questionable. | ||
| 77 : | Nodes become bad when they fail to respond to multiple queries in a row. | ||
| 78 : | Nodes that we know are good are given priority over nodes with unknown | ||
| 79 : | status. | ||
| 80 : | |||
| 81 : | The routing table covers the entire node ID space from 0 to | ||
| 82 : | 2<sup>160</sup>. The routing table is subdivided into "buckets" that | ||
| 83 : | each cover a portion of the space. An empty table has one bucket with an | ||
| 84 : | ID space range of min=0, max=2<sup>160</sup>. When a node with ID "N" is | ||
| 85 : | inserted into the table, it is placed within the bucket that has min | ||
| 86 : | <= N < max. An empty table has only one bucket so any node must | ||
| 87 : | fit within it. Each bucket can only hold K nodes, currently eight, | ||
| 88 : | before becoming "full." When a bucket is full of known good nodes, no | ||
| 89 : | more nodes may be added unless our own node ID falls within the range of | ||
| 90 : | the bucket. In that case, the bucket is replaced by two new buckets each | ||
| 91 : | with half the range of the old bucket and the nodes from the old bucket | ||
| 92 : | are distributed among the two new ones. For a new table with only one | ||
| 93 : | bucket, the full bucket is always split into two new buckets covering | ||
| 94 : | the ranges 0..2<sup>159</sup> and 2<sup>159</sup>..2<sup>160</sup>. | ||
| 95 : | |||
| 96 : | When the bucket is full of good nodes, the new node is simply discarded. | ||
| 97 : | If any nodes in the bucket are known to have become bad, then one is | ||
| 98 : | replaced by the new node. If there are any questionable nodes in the | ||
| 99 : | bucket have not been seen in the last 15 minutes, the least recently | ||
| 100 : | seen node is pinged. If the pinged node responds then the next least | ||
| 101 : | recently seen questionable node is pinged until one fails to respond or | ||
| 102 : | all of the nodes in the bucket are known to be good. If a node in the | ||
| 103 : | bucket fails to respond to a ping, it is suggested to try once more | ||
| 104 : | before discarding the node and replacing it with a new good node. In | ||
| 105 : | this way, the table fills with stable long running nodes. | ||
| 106 : | |||
| 107 : | Each bucket should maintain a "last changed" property to indicate how | ||
| 108 : | "fresh" the contents are. When a node in a bucket is pinged and it | ||
| 109 : | responds, or a node is added to a bucket, or a node in a bucket is | ||
| 110 : | replaced with another node, the bucket's last changed property should be | ||
| 111 : | updated. Buckets that have not been changed in 15 minutes should be | ||
| 112 : | "refreshed." This is done by picking a random ID in the range of the | ||
| 113 : | bucket and performing a find_nodes search on it. Nodes that are able to | ||
| 114 : | receive queries from other nodes usually do not need to refresh buckets | ||
| 115 : | often. Nodes that are not able to receive queries from other nodes | ||
| 116 : | usually will need to refresh all buckets periodically to ensure there | ||
| 117 : | are good nodes in their table when the DHT is needed. | ||
| 118 : | |||
| 119 : | Upon inserting the first node into its routing table and when starting | ||
| 120 : | up thereafter, the node should attempt to find the closest nodes in the | ||
| 121 : | DHT to itself. It does this by issuing find_node messages to closer and | ||
| 122 : | closer nodes until it cannot find any closer. The routing table should | ||
| 123 : | be saved between invocations of the client software. | ||
| 124 : | |||
| 125 : | ## KRPC Protocol | ||
| 126 : | |||
| 127 : | The KRPC protocol is a simple RPC mechanism consisting of bencoded | ||
| 128 : | dictionaries sent over UDP. A single query packet is sent out and a | ||
| 129 : | single packet is sent in response. There is no retry. There are three | ||
| 130 : | message types: query, response, and error. For the DHT protocol, there | ||
| 131 : | camrdale | 123 | are currently five queries: ping, join, find\_node, get\_value, and |
| 132 : | store\_value. | ||
| 133 : | camrdale | 118 | |
| 134 : | A KRPC message is a single dictionary with two keys common to every | ||
| 135 : | message and additional keys depending on the type of message. Every | ||
| 136 : | message has a key "t" with a string value representing a transaction ID. | ||
| 137 : | This transaction ID is generated by the querying node and is echoed in | ||
| 138 : | the response, so responses may be correlated with multiple queries to | ||
| 139 : | the same node. The transaction ID should be encoded as a random binary | ||
| 140 : | string, typically 20 bytes in length corresponding to the SHA1 hash. | ||
| 141 : | |||
| 142 : | The other key contained in every KRPC message is "y" with a single | ||
| 143 : | character value describing the type of message. The value of the "y" key | ||
| 144 : | is one of "q" for query, "r" for response, or "e" for error. | ||
| 145 : | |||
| 146 : | camrdale | 119 | ### Queries |
| 147 : | camrdale | 118 | |
| 148 : | Queries, or KRPC message dictionaries with a "y" value of "q", contain | ||
| 149 : | two additional keys; "q" and "a". Key "q" has a string value containing | ||
| 150 : | the method name of the query. Key "a" has a dictionary value containing | ||
| 151 : | named arguments to the query. | ||
| 152 : | |||
| 153 : | camrdale | 119 | ### Responses |
| 154 : | camrdale | 118 | |
| 155 : | Responses, or KRPC message dictionaries with a "y" value of "r", contain | ||
| 156 : | one additional key "r". The value of "r" is a dictionary containing | ||
| 157 : | named return values. Response messages are sent upon successful | ||
| 158 : | completion of a query. | ||
| 159 : | |||
| 160 : | camrdale | 119 | ### Errors |
| 161 : | camrdale | 118 | |
| 162 : | Errors, or KRPC message dictionaries with a "y" value of "e", contain | ||
| 163 : | one additional key "e". The value of "e" is a list. The first element is | ||
| 164 : | an integer representing the error code. The second element is a string | ||
| 165 : | containing the error message. Errors are sent when a query cannot be | ||
| 166 : | fulfilled. The following table describes the possible error codes: | ||
| 167 : | |||
| 168 : | <table> | ||
| 169 : | camrdale | 120 | <tr><th>Code</th><th>Description</th></tr> |
| 170 : | camrdale | 118 | <tr><td>201</td><td>Generic Error</td></tr> |
| 171 : | <tr><td>202</td><td>Server Error</td></tr> | ||
| 172 : | <tr><td>203</td><td>Protocol Error, such as a malformed packet, invalid arguments, or bad token</td></tr> | ||
| 173 : | <tr><td>204</td><td>Method Unknown</td></tr> | ||
| 174 : | </table> | ||
| 175 : | |||
| 176 : | camrdale | 119 | #### Example Error Packets |
| 177 : | camrdale | 118 | |
| 178 : | camrdale | 121 | generic error = {"t":"12345678901234567890", "y":"e", "e":[201, "A Generic Error Occurred"]} |
| 179 : | bencoded = d1:eli201e24:A Generic Error Occurrede1:t20:123456789012345678901:y1:ee | ||
| 180 : | camrdale | 118 | |
| 181 : | camrdale | 119 | ### Contact Encoding |
| 182 : | camrdale | 118 | |
| 183 : | Contact information for peers is encoded as a 6-byte string. Also known | ||
| 184 : | as "Compact IP-address/port info" the 4-byte IP address is in network | ||
| 185 : | byte order with the 2 byte port in network byte order concatenated onto | ||
| 186 : | the end. | ||
| 187 : | |||
| 188 : | Contact information for nodes is encoded as a 26-byte string. Also known | ||
| 189 : | as "Compact node info" the 20-byte Node ID in network byte order has the | ||
| 190 : | compact IP-address/port info concatenated to the end. | ||
| 191 : | |||
| 192 : | ## DHT Queries | ||
| 193 : | |||
| 194 : | All queries have an "id" key and value containing the node ID of the | ||
| 195 : | querying node. All responses have an "id" key and value containing the | ||
| 196 : | node ID of the responding node. | ||
| 197 : | |||
| 198 : | camrdale | 119 | ### ping |
| 199 : | camrdale | 118 | |
| 200 : | The most basic query is a ping. "q" = "ping" A ping query has a single | ||
| 201 : | argument, "id" the value is a 20-byte string containing the senders node | ||
| 202 : | ID in network byte order. The appropriate response to a ping has a | ||
| 203 : | single key "id" containing the node ID of the responding node. | ||
| 204 : | |||
| 205 : | camrdale | 119 | arguments: {"id": "<querying nodes id>"} |
| 206 : | response: {"id": "<queried nodes id>"} | ||
| 207 : | camrdale | 118 | |
| 208 : | camrdale | 119 | #### Example Packets |
| 209 : | camrdale | 118 | |
| 210 : | ping Query = {"t":"12345678901234567890", "y":"q", "q":"ping", "a":{"id":"abcdefghij0123456789"}} | ||
| 211 : | camrdale | 119 | bencoded = d1:ad2:id20:abcdefghij0123456789e1:q4:ping1:t20:123456789012345678901:y1:qe |
| 212 : | camrdale | 118 | |
| 213 : | Response = {"t":"12345678901234567890", "y":"r", "r": {"id":"mnopqrstuvwxyz123456"}} | ||
| 214 : | bencoded = d1:rd2:id20:mnopqrstuvwxyz123456e1:t20:123456789012345678901:y1:re | ||
| 215 : | |||
| 216 : | camrdale | 119 | ### join |
| 217 : | camrdale | 118 | |
| 218 : | join is identical to ping, except in the response provided. It is | ||
| 219 : | intended to be sent only to bootstrap nodes during the process of a new | ||
| 220 : | camrdale | 121 | peer joining the system. The response to a "q" = "join" query has two |
| 221 : | camrdale | 123 | additional responses, a key "ip\_addr" whose value is the bootstrap |
| 222 : | camrdale | 118 | node's understanding of the peer's IP address, and a key "port" whose |
| 223 : | value is the port number that the bootstrap node received the request | ||
| 224 : | from. | ||
| 225 : | |||
| 226 : | camrdale | 119 | arguments: {"id": "<querying nodes id>"} |
| 227 : | response: {"id": "<queried nodes id>", "ip_addr": "<querying nodes IP>", "port": "<querying nodes port>"} | ||
| 228 : | camrdale | 118 | |
| 229 : | camrdale | 119 | #### Example Packets |
| 230 : | camrdale | 118 | |
| 231 : | join Query = {"t":"12345678901234567890", "y":"q", "q":"join", "a":{"id":"abcdefghij0123456789"}} | ||
| 232 : | camrdale | 119 | bencoded = d1:ad2:id20:abcdefghij0123456789e1:q4:join1:t20:123456789012345678901:y1:qe |
| 233 : | camrdale | 118 | |
| 234 : | Response = {"t":"12345678901234567890", "y":"r", "r": {"id":"mnopqrstuvwxyz123456", "ip_addr":"123.123.123.123", "port":12345}} | ||
| 235 : | bencoded = d1:rd2:id20:mnopqrstuvwxyz1234567:ip_addr15:123.123.123.1234:porti12345ee1:t20:123456789012345678901:y1:re | ||
| 236 : | |||
| 237 : | camrdale | 123 | ### find\_node |
| 238 : | camrdale | 118 | |
| 239 : | Find node is used to find the contact information for a node given its | ||
| 240 : | camrdale | 123 | ID. "q" = "find\_node" A find_node query has two arguments, "id" |
| 241 : | camrdale | 118 | containing the node ID of the querying node, and "target" containing the |
| 242 : | camrdale | 123 | ID of the node sought by the querier. When a node receives a find\_node |
| 243 : | camrdale | 118 | query, it should respond with a key "nodes" and value of a string |
| 244 : | containing the compact node info for the target node or the K (8) | ||
| 245 : | camrdale | 121 | closest good nodes in its own routing table. A "token" key is also |
| 246 : | included in the return value. The token value is a required argument for | ||
| 247 : | camrdale | 123 | a future store\_value query. The token value should be a short binary |
| 248 : | camrdale | 121 | string. |
| 249 : | camrdale | 118 | |
| 250 : | camrdale | 119 | arguments: {"id": "<querying nodes id>", "target": "<id of target node>"} |
| 251 : | camrdale | 121 | response: {"id": "<queried nodes id>", "nodes": "<compact node info>", "token": "<opaque write token>"} |
| 252 : | camrdale | 118 | |
| 253 : | camrdale | 119 | #### Example Packets |
| 254 : | camrdale | 118 | |
| 255 : | find_node Query = {"t":"12345678901234567890", "y":"q", "q":"find_node", "a": {"id":"abcdefghij0123456789", "target":"mnopqrstuvwxyz123456"}} | ||
| 256 : | camrdale | 119 | bencoded = d1:ad2:id20:abcdefghij01234567896:target20:mnopqrstuvwxyz123456e1:q9:find_node1:t20:123456789012345678901:y1:qe |
| 257 : | camrdale | 118 | |
| 258 : | camrdale | 121 | Response = {"t":"12345678901234567890", "y":"r", "r": {"id":"0123456789abcdefghij", "nodes": "def456...", "token":"aoeusnth"}} |
| 259 : | bencoded = d1:rd2:id20:0123456789abcdefghij5:nodes9:def456...5:token8:aoeusnthe1:t20:123456789012345678901:y1:re | ||
| 260 : | camrdale | 118 | |
| 261 : | camrdale | 123 | ### get\_value |
| 262 : | camrdale | 118 | |
| 263 : | camrdale | 123 | Get values associated with a file hash. "q" = "get\_value" A get\_value |
| 264 : | camrdale | 118 | query has two arguments, "id" containing the node ID of the querying |
| 265 : | node, and "key" containing the hash of the file. If the queried node has | ||
| 266 : | camrdale | 121 | values for the hash, they are returned in a key "values" as a list of |
| 267 : | strings. See the next section for a description of the possible stored | ||
| 268 : | values. If the queried node has no values for the hash, a key "nodes" is | ||
| 269 : | returned containing the K nodes in the queried nodes routing table | ||
| 270 : | closest to the hash supplied in the query. | ||
| 271 : | camrdale | 118 | |
| 272 : | camrdale | 119 | arguments: {"id": "<querying nodes id>", "key": "<20-byte hash of target file>"} |
| 273 : | camrdale | 121 | response: {"id": "<queried nodes id>", "values": ["<peer 1 value string>", "<peer 2 value string>"]} |
| 274 : | camrdale | 119 | or: {"id": "<queried nodes id>", "nodes": "<compact node info>"} |
| 275 : | camrdale | 118 | |
| 276 : | camrdale | 119 | #### Example Packets |
| 277 : | camrdale | 118 | |
| 278 : | get_value Query = {"t":"12345678901234567890", "y":"q", "q":"get_value", "a": {"id":"abcdefghij0123456789", "key":"mnopqrstuvwxyz123456"}} | ||
| 279 : | camrdale | 119 | bencoded = d1:ad2:id20:abcdefghij01234567893:key20:mnopqrstuvwxyz123456e1:q9:get_value1:t20:123456789012345678901:y1:qe |
| 280 : | camrdale | 118 | |
| 281 : | camrdale | 121 | Response with values = {"t":"12345678901234567890", "y":"r", "r": {"id":"abcdefghij0123456789", "values": ["d1:c6:456abc", "d1:c6:def456e"]}} |
| 282 : | bencoded = d1:rd2:id20:abcdefghij01234567896:valuesl6:axje.u6:idhtnmee1:t20:123456789012345678901:y1:re | ||
| 283 : | camrdale | 118 | |
| 284 : | camrdale | 119 | Response with nodes = {"t":"12345678901234567890", "y":"r", "r": {"id":"abcdefghij0123456789", "nodes": "def456..."}} |
| 285 : | bencoded = d1:rd2:id20:abcdefghij01234567895:nodes9:def456...e1:t20:123456789012345678901:y1:re | ||
| 286 : | camrdale | 118 | |
| 287 : | camrdale | 123 | ### store\_value |
| 288 : | camrdale | 118 | |
| 289 : | camrdale | 123 | Store a value in the DHT. store\_value has four arguments: "id" |
| 290 : | camrdale | 121 | containing the node ID of the querying node, "key" containing the hash |
| 291 : | of the file, "value" is a string containing the value to be stored, and | ||
| 292 : | the "token" received in response to a previous find_node query. The | ||
| 293 : | queried node must verify that the token was previously sent to the same | ||
| 294 : | IP address as the querying node. See the next section for a description | ||
| 295 : | of the possible stored strings. The queried node should store the | ||
| 296 : | contact information of the querying node under the file hash in its | ||
| 297 : | store of peer contact information. | ||
| 298 : | camrdale | 118 | |
| 299 : | camrdale | 121 | arguments: {"id" : "<querying nodes id>", "key" : "<20-byte hash of target file>", "value" : "<string value>", "token" : "<opaque token>"} |
| 300 : | camrdale | 119 | response: {"id" : "<queried nodes id>"} |
| 301 : | camrdale | 118 | |
| 302 : | camrdale | 119 | #### Example Packets |
| 303 : | camrdale | 118 | |
| 304 : | camrdale | 121 | store_value Query = {"t":"12345678901234567890", "y":"q", "q":"store_value", "a": {"id":"abcdefghij0123456789", "key":"mnopqrstuvwxyz123456", "value":"d1:c6:def456e", "token":"aoeusnth"}} |
| 305 : | bencoded = d1:ad2:id20:abcdefghij01234567893:key20:mnopqrstuvwxyz1234565:token8:aoeusnth5:value13:d1:c6:def456ee1:q13:store_value1:t20:123456789012345678901:y1:qe | ||
| 306 : | camrdale | 118 | |
| 307 : | Response = {"t":"12345678901234567890", "y":"r", "r": {"id":"mnopqrstuvwxyz123456"}} | ||
| 308 : | bencoded = d1:rd2:id20:mnopqrstuvwxyz123456e1:t20:123456789012345678901:y1:re | ||
| 309 : | |||
| 310 : | camrdale | 121 | ## Stored Values |
| 311 : | |||
| 312 : | All of the values stored and retrieved in the previous section's | ||
| 313 : | camrdale | 123 | description of the "get\_value" and "store\_value" DHT queries are |
| 314 : | camrdale | 121 | strings. Currently, all stored values are bencoded strings of a |
| 315 : | dictionary containing many possibilities. In the simplest possible | ||
| 316 : | implementation, the dictionary will have a single key "c", whose value | ||
| 317 : | is the 6-byte compact contact information for the peer providing the | ||
| 318 : | file. All other possibilities build on this simple model. | ||
| 319 : | |||
| 320 : | stored value = {"c":"def456"} | ||
| 321 : | bencoded = d1:c6:def456e | ||
| 322 : | |||
| 323 : | ### Piece Hashes | ||
| 324 : | |||
| 325 : | In order to support the efficient downloading of large files from | ||
| 326 : | multiple peers, hashes of smaller pieces of the large files are | ||
| 327 : | necessary. | ||
| 328 : | |||
| 329 : | #### Piece Size | ||
| 330 : | |||
| 331 : | To accomplish this, a piece size must be chosen that is consistent | ||
| 332 : | across all clients, and is no larger than some acceptable maximum piece | ||
| 333 : | size. An optimal piece size would have all the pieces be the same size, | ||
| 334 : | which would be less than or equal to the maximum piece size, to prevent | ||
| 335 : | the last piece from being very small. However, downloading occurs by | ||
| 336 : | requesting smaller *chunks* of a piece, so choosing a piece size that is | ||
| 337 : | a multiple of the chunk size will prevent having to request very small | ||
| 338 : | chunks. Therefore, the following formula, in which `size` is the size of | ||
| 339 : | the file, `MAX_SIZE` is the maximum piece size, and `CHUNK_SIZE` is the | ||
| 340 : | chunk size, will give the number of pieces and optimal piece size: | ||
| 341 : | |||
| 342 : | n = 1 + floor((size - 1) / MAX_SIZE) | ||
| 343 : | camrdale | 122 | optimal = max(MAX_SIZE/2, ceil((size/n)/CHUNK_SIZE)*CHUNK_SIZE) |
| 344 : | camrdale | 121 | |
| 345 : | There will then be `n-1` pieces of size `optimal`, and 1 piece of size | ||
| 346 : | `size - (n-1)*optimal`. The maximum piece size is 512 KiB (524288 | ||
| 347 : | bytes), and the chunk size is 16 KiB (16384 bytes). So, for example, for | ||
| 348 : | a file of size 1234567, there will be 2 pieces of size 425984 followed | ||
| 349 : | by a single piece of size 382599. | ||
| 350 : | |||
| 351 : | #### Piece Hash Storage | ||
| 352 : | |||
| 353 : | Files larger than the maximum piece size will require multiple pieces. | ||
| 354 : | Each piece of the file will have to be hashed to get these piece hashes. | ||
| 355 : | Once hashed, the piece hashes will be concatenated and stored in a | ||
| 356 : | binary string whose length is a multiple of 20 (the length of the SHA1 | ||
| 357 : | hash). | ||
| 358 : | |||
| 359 : | #### Storage in the DHT | ||
| 360 : | |||
| 361 : | Storage of the piece hashes so that peers accessing the DHT can retrieve | ||
| 362 : | them will depend on the number of pieces. This is due to the limitation | ||
| 363 : | on the length of a UDP packet to prevent fragmentation, which prevents | ||
| 364 : | returning very long strings, especially when there are multiple values | ||
| 365 : | for a single key. This leads to 4 scenarios: | ||
| 366 : | |||
| 367 : | 1. The file requires only a single piece. In this case, no | ||
| 368 : | piece-hashing is needed, so store only the basic dictionary with | ||
| 369 : | the "c" key. | ||
| 370 : | |||
| 371 : | camrdale | 122 | stored value = {"c":"def456"} |
| 372 : | bencoded = d1:c6:def456e | ||
| 373 : | camrdale | 121 | |
| 374 : | 2. The file is 2 to 4 pieces in length. The piece hash strings are | ||
| 375 : | short enough to store directly with the contact information. A key | ||
| 376 : | "t" is added to the stored dictionary which is a dictionary | ||
| 377 : | containing a key "t" which has as a value the concatenated | ||
| 378 : | string of piece hashes. | ||
| 379 : | |||
| 380 : | camrdale | 122 | stored value = {"c":"def456", "t":{"t":"abcdefghij0123456789..."}} |
| 381 : | bencoded = d1:c6:def4561:td1:t23:abcdefghij0123456789...ee | ||
| 382 : | camrdale | 121 | |
| 383 : | 3. The file is 5 to 70 pieces in length. The piece hash strings are too | ||
| 384 : | long to store with the contact information, as that would limit the | ||
| 385 : | node to returning very few values. Instead, hash the piece hash string | ||
| 386 : | using SHA1, and store that string with a key "h" in the stored | ||
| 387 : | dictionary. Then, store the concatenated piece hash string in the DHT | ||
| 388 : | using the hash of the string as the key. There, the stored dictionary | ||
| 389 : | will contain only a single key "t", whose value will be the | ||
| 390 : | concatenated string of piece hashes. | ||
| 391 : | |||
| 392 : | camrdale | 122 | stored value = {"c":"def456", "h":"12345678901234567890"} |
| 393 : | bencoded = d1:c6:def4561:h20:12345678901234567890e | ||
| 394 : | camrdale | 121 | |
| 395 : | camrdale | 122 | stored value at key 12345678901234567890 = {"t":"abcdefghij0123456789..."} |
| 396 : | bencoded = d1:t23:abcdefghij0123456789...e | ||
| 397 : | camrdale | 121 | |
| 398 : | 4. The file is more than 70 pieces in length. The piece hash string is | ||
| 399 : | now too long for a single UDP packet, and so must be stored outside | ||
| 400 : | the DHT. Again, hash the piece hash string using SHA1, and store that | ||
| 401 : | string with a key "l" in the stored dictionary. Then, respond to any | ||
| 402 : | peer download requests for that hash with a bencoded dictionary | ||
| 403 : | containing a key "t" whose value is the concatenated string of piece | ||
| 404 : | hashes. | ||
| 405 : | |||
| 406 : | camrdale | 122 | stored value = {"c":"def456", "l":"12345678901234567890"} |
| 407 : | bencoded = d1:c6:def4561:l20:12345678901234567890e |
| CVS Admin | ViewVC Help |
| Powered by ViewVC 1.0.5 |