Christian Huitema's blog

Cloudy sky, waves on the sea, the sun is
shining

Distributed Hash Tables are really hard to secure

07 Dec 2010

One of the designs proposed for a P2P name service is to “use a Distributed Hash Table” to distribute the resolution of the names in a P2P zone. I don’t believe that’s a very good idea, because DHT are really hard to secure, and conversely really easy to attack.

DHT works by assigning a random index number to each node, another random index number to each record being stored, and by storing a copy of the record being stored on the node with the closest index. The node index is typically the hash of the node’s name, and the record index the hash of the record identifier – for the DNS application that would be name and record type. Using hashing functions ensures that the numbers are randomly distributed, and that property can be used to design look up protocols that guarantee access time of the order O(log N), where N is the number of nodes. That makes for very fast responses in large scale databases, but that basically suppose that all nodes are equal.

A “rogue” node in a classic DHT can mount many interesting attacks. It can lie about the presence or absence of records, it can reply with fake copies, and it can mount various kinds of attacks against the DHT routing mechanisms. Rogue participants can also attempt to game the hashing function to target placement of their records at specific nodes, effectively flooding these nodes. They can create fake nodes with strategically chosen indices to cut the DHT “cloud” in little pieces. It does not take many such attackers to bring the reliability of the system close to zero.

We built some strong defenses against such attacks in PNRP. These included replication, randomization, and a routing system that used “distributed caches” rather than “indexed stores.” We remove the use of node indices from the protocol, so that queries are only routed based on a record’s key. Each node can store cached copies of records, but the master copy is always accessible at the publishing node. The master copy is signed by its publisher, which allows for verification. In fact, we spent much more time devising these protections than designing the actual algorithm!

Bottom line, building a reliable P2P name server requires a lot more work than just taking a zone file and distributing it with a standard DHT.