When a node wants to join the opennet, it needs to connect to a seed node. A list of these nodes is provided by the main Freenet Project server for convenience (see here). Seed nodes are required to be publicly accessible from the internet, i.e. not behind any NAT more restrictive than a full cone. The joining node establishes an encrypted connection with the seed node, without initially being routable, and sends it an AnnouncementRequest. This includes the joining node's reference and an indication of what key it wants to end up serving. The AnnouncementRequest is forwarded like any other request: it continues until HTL hops after the nearest node to the key found so far. All nodes on the chain that want more connections return their reference down the chain in an AnnouncementPeer, and add the joining node to their routing tables. When the announcement reaches the end of the chain, the last node returns an AnnouncementReply, which completes the announcement. The joining node then adds all the returned peers and connects to them.
FIXME how do we ensure a 1/d distribution? Maybe just by accepting one connection from each node on the chain? Note that all involved nodes must be opennet nodes. Ideally an announcement would require authorization through a thinkcash/CAPTCHA puzzle. Having discussed this at length, it's not as easy as it sounds.
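The forwarding rule above can be sketched roughly as follows. This is only an illustration, not the node's actual code: the `Node` class, the `announce` function, the greedy choice of next hop, and the circular [0, 1) keyspace distance are all assumptions made for the example. Each node that wants peers records the newcomer and has its name returned (standing in for an AnnouncementPeer), and the walk stops `max_htl` hops after the closest node found.

```python
from dataclasses import dataclass, field
from typing import List, Set


def ring_distance(a: float, b: float) -> float:
    """Distance between two locations on a circular [0, 1) keyspace (assumed)."""
    d = abs(a - b)
    return min(d, 1.0 - d)


@dataclass
class Node:
    # Hypothetical stand-in for an opennet node.
    name: str
    location: float
    wants_peers: bool = True
    peers: List["Node"] = field(default_factory=list)
    routing_table: Set[str] = field(default_factory=set)


def announce(seed: Node, newbie: str, target: float, max_htl: int = 3) -> List[str]:
    """Walk the chain from the seed; return the names of nodes that sent back a reference."""
    returned = []
    visited = set()
    best = float("inf")
    htl = max_htl
    current = seed
    while current is not None:
        visited.add(current.name)
        dist = ring_distance(current.location, target)
        if dist < best:
            best = dist        # new nearest node: reset the hop counter
            htl = max_htl
        else:
            htl -= 1           # one more hop past the nearest node found
        if current.wants_peers:
            current.routing_table.add(newbie)  # node adds the newcomer
            returned.append(current.name)      # stands in for AnnouncementPeer
        if htl <= 0:
            break              # end of chain: AnnouncementReply would be sent here
        candidates = [p for p in current.peers if p.name not in visited]
        current = min(candidates,
                      key=lambda p: ring_distance(p.location, target),
                      default=None)
    return returned
```

The joining node would then add and connect to every peer in the returned list. The sketch does nothing about the 1/d distribution question raised in the FIXME.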
*Security: In 0.5, announcements chose a key at random through a distributed random number generation algorithm; here we are going for functionality over security. 0.5's announcements did not provide very much security, as an attacker would just create more nodes and do more announcements until he got the key he wanted. We do make life easier for an attacker by giving him exactly the key he wants, but this is quite possible to do on any opennet by manipulating your connections. We could do something like we did in 0.5, some sort of collaborative random destination generation; the problem is that the attacker would just create a new node, and a random destination would not be an effective introduction, whereas the announcement protocol described above should get the node more or less the connections it needs to work. Finally, announcement requests should be limited to one per interval X on each link, possibly with queueing and a token passing style load limiting scheme.
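A minimal sketch of the per-link limit mentioned at the end of the note above, assuming the simplest possible policy: excess announcements are rejected rather than queued. The class name `AnnouncementLimiter` and the explicit `now` argument are hypothetical, and the interval X is passed in as `min_interval`.

```python
class AnnouncementLimiter:
    """Accept at most one announcement per link per `min_interval` seconds."""

    def __init__(self, min_interval: float):
        self.min_interval = min_interval
        self._last_accepted = {}  # link id -> time of last accepted announcement

    def allow(self, link_id: str, now: float) -> bool:
        last = self._last_accepted.get(link_id)
        if last is not None and now - last < self.min_interval:
            return False          # too soon on this link: reject
        self._last_accepted[link_id] = now
        return True
```

Each link is tracked independently, so a flood on one link does not block announcements arriving over another. Queueing or token passing would replace the plain rejection branch.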
Once a node is established, "path folding" or "destination sampling" ensures that it has the right kind of connections. When a request completes successfully on opennet, the node must send a message back along the request chain. This is either a CompletedAck (indicating that the request is over and can be forgotten) or a ConnectDestination, a message which includes the sender's own reference. Any node along the request chain may reply with a ConnectReply, which includes its reference (in which case the ConnectDestination stops being forwarded). This results in the two nodes adding each other's references and therefore being able to connect.
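The ConnectDestination exchange might look something like the sketch below. The function `fold_path`, the `accepts` callback, and the `references` map are hypothetical names for illustration. The `chain` argument is assumed to list the nodes in the order the message reaches them.

```python
from typing import Callable, Dict, List, Optional, Set


def fold_path(sender: str,
              chain: List[str],
              accepts: Callable[[str], bool],
              references: Dict[str, Set[str]]) -> Optional[str]:
    """Forward a ConnectDestination along `chain` until some node replies.

    Returns the name of the node that sent a ConnectReply, or None if the
    message reached the end of the chain without a reply.
    """
    for node in chain:
        if accepts(node):
            # ConnectReply: both sides now hold each other's reference,
            # and the ConnectDestination is not forwarded any further.
            references.setdefault(sender, set()).add(node)
            references.setdefault(node, set()).add(sender)
            return node
    return None
```

If nobody replies, no references are exchanged and the request is simply forgotten, which has the same effect as sending a CompletedAck.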
Path folding obviously causes connection churn. We can limit it at various points. A node which has fewer than MinimumOpennetConnections will always send a ConnectDestination message, and will usually reply to a ConnectDestination. All other nodes will aim to get one new connection every OpennetConnectionChurnInterval. If the node has MaximumOpennetConnections connections or more, it will drop the node which least recently completed a request successfully.
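One way to read the churn limits above is sketched here. The `OpennetPeers` class and its method names are made up for the example, and several details are assumptions: a node below the minimum always wants a new peer (the text says "usually" for replies), the drop happens when a new peer is added at the maximum, and a new peer's timestamp starts at the time it was added.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class OpennetPeers:
    minimum: int            # MinimumOpennetConnections
    maximum: int            # MaximumOpennetConnections
    churn_interval: float   # OpennetConnectionChurnInterval, in seconds
    # peer name -> time of its last successfully completed request
    last_success: Dict[str, float] = field(default_factory=dict)
    last_added: float = float("-inf")

    def wants_new_peer(self, now: float) -> bool:
        """Decide whether to send (or reply to) a ConnectDestination."""
        if len(self.last_success) < self.minimum:
            return True
        return now - self.last_added >= self.churn_interval

    def record_success(self, peer: str, now: float) -> None:
        if peer in self.last_success:
            self.last_success[peer] = now

    def add_peer(self, peer: str, now: float) -> Optional[str]:
        """Add a peer; at the maximum, drop the least recently successful one."""
        dropped = None
        if len(self.last_success) >= self.maximum:
            dropped = min(self.last_success, key=self.last_success.get)
            del self.last_success[dropped]
        self.last_success[peer] = now
        self.last_added = now
        return dropped
```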
At what point do we decide who will send the ConnectReply? I don't know. In 0.5 we decided it on the way back down from the data source, i.e. we set the data source in the first place, and each node then decided whether to reset it. However, the data source may have rejected the connection later on. Connecting close to the data source is probably a good thing, so it may make sense to include a hash of the data source on the DataReply/InsertReply, with that being reset or not along the chain.
*Security: Path folding has always been suspect from the point of view of both protecting the data source and protecting the request source. However the above is no more insecure with respect to the request source than the procedure in 0.5; intermediate nodes can see the request source's node reference, but they could as easily have done so by always resetting the noderef to themselves.
*Security: Somebody has suggested that we exploit the scarcity of IP addresses by only allowing one connection to/from any given IP address on any opennet node. This would reduce connectivity where we have multiple nodes behind a single IP address, on a NAT for example, but this wouldn't necessarily be a problem; the other nodes behind that IP would just have to connect to other nodes outside of it. Likewise we might want to try to ensure that no more than N (or X%) of our connections are in the same IP range. Both of these measures increase the cost of routing table takeover attacks, but it is debatable whether they increase it significantly. Neither would be any protection against an attacker able to compromise lots of easily compromised machines and use them for his bogus nodes.
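As a rough illustration of the two IP-based checks suggested above, here is a sketch using Python's standard `ipaddress` module. The function `may_accept` is hypothetical. Treating an "IP range" as an IPv4 /24 and using an absolute cap `max_per_range` (the N in the text, with the percentage variant left out) are both assumptions.

```python
import ipaddress
from typing import Iterable


def may_accept(candidate: str,
               current: Iterable[str],
               max_per_range: int = 2,
               prefix: int = 24) -> bool:
    """Return True if a connection to/from `candidate` passes both checks."""
    cand = ipaddress.ip_address(candidate)
    peers = [ipaddress.ip_address(ip) for ip in current]
    # Check 1: only one connection per IP address.
    if cand in peers:
        return False
    # Check 2: at most `max_per_range` connections in the candidate's range.
    net = ipaddress.ip_network(f"{candidate}/{prefix}", strict=False)
    in_range = sum(1 for p in peers if p in net)
    return in_range < max_per_range
```

For example, with existing connections to 10.0.0.1 and 10.0.0.2, a third peer at 10.0.0.3 would be refused under the defaults, while one at 10.0.1.3 would be accepted.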