Data Persistence
Currently Freenet's data persistence is very poor. We have been working on it for some time and have some ideas about how to improve it:
- For security reasons, we avoid caching data for several hops after the originator. This meant we were skipping over nodes that might be important sinks for the data. This is fixed in 1241 and needs testing. See this bug.
- Triple insertion of a block seems to boost its one-week retrievability from ~70% to ~90%. Why? The block should be stored in the same places even if it is cached in more places. The suspicion is that this is related to the first item; we will test this soon.
- Bloom filter sharing may help significantly.
- Ultimately, long-term requests may help, but only if you are prepared to wait.
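The caching rule described in the first item can be sketched as follows. The three-hop cutoff and the function are illustrative, not Freenet's actual constants or code; the point is that a sink node falling inside the security window is skipped entirely:

```python
# Sketch of the pre-1241 caching rule: for security, a block is not
# cached on the first few nodes after the originator. If the best
# "sink" for the key sits inside that window, it never caches the
# block at all.
NO_CACHE_HOPS = 3  # illustrative cutoff, not Freenet's actual constant

def cached_along_path(sink_hop: int, path_length: int) -> bool:
    """True if the sink node for the key ends up caching the block."""
    return NO_CACHE_HOPS <= sink_hop < path_length

print(cached_along_path(sink_hop=1, path_length=10))  # False: sink skipped
print(cached_along_path(sink_hop=5, path_length=10))  # True
```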
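Back-of-the-envelope arithmetic for the triple-insertion puzzle in the second item: if all three inserts landed on exactly the same nodes, retrievability would not change, while fully independent placements would give about 97%. The observed ~90% sits in between, consistent with the placements being only partially correlated. The 70% figure is from the text above; independence is an assumption:

```python
p_single = 0.70  # observed one-week retrievability of a single insert

# All three copies on exactly the same nodes: no improvement.
same_nodes = p_single

# Fully independent placements: a fetch fails only if all three
# copies are lost.
independent = 1 - (1 - p_single) ** 3

print(f"same nodes:  {same_nodes:.3f}")
print(f"independent: {independent:.3f}")
```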
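A minimal sketch of what Bloom filter sharing involves: each node could advertise a compact filter of the keys in its datastore, letting peers route requests toward nodes that probably hold the block. The class, filter size, hash count, and key format here are all illustrative, not Freenet's implementation:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter sketch (illustrative parameters)."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 4):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, key: bytes):
        # Derive k bit positions from salted SHA-256 hashes of the key.
        for i in range(self.k):
            h = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(h[:4], "big") % self.size

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        # False positives are possible; false negatives are not.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))

# A node shares the filter of its store; a peer checks it before
# routing a request there.
store = BloomFilter()
store.add(b"CHK@example-block-1")
print(store.might_contain(b"CHK@example-block-1"))  # True
```

The appeal for persistence is that a request can find a cached copy on a nearby peer that plain location-based routing would never have visited.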
Analysis suggests that there is usually enough capacity; the problem is that data is stored, as opposed to cached (see: Two level datastore), on nodes that then go offline, either because they are Low uptime nodes or because they leave the network.
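The store-versus-cache distinction can be illustrated with a toy sink rule, assuming node locations on a circular [0, 1) keyspace: a block is stored long-term only on a node that is a "sink" (no connected peer is closer to the key), and merely cached elsewhere. This is a sketch of the idea, not Freenet's exact condition; the persistence problem arises when the sink happens to be a low-uptime node, so the stored copy goes offline with it:

```python
def should_store(node_location: float, key_location: float,
                 peer_locations: list) -> bool:
    """Toy sink rule on a circular [0, 1) keyspace (illustrative)."""
    def dist(a: float, b: float) -> float:
        d = abs(a - b)
        return min(d, 1 - d)  # wrap-around distance on the circle

    my_d = dist(node_location, key_location)
    # We are a sink if no connected peer is closer to the key.
    return all(dist(p, key_location) >= my_d for p in peer_locations)

# This node is closest to the key, so it stores the block:
print(should_store(0.30, 0.31, [0.10, 0.55, 0.90]))   # True
# A peer at 0.315 is closer, so this node would only cache it:
print(should_store(0.30, 0.31, [0.315, 0.55, 0.90]))  # False
```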
Practically speaking, data persistence can also be boosted in the client layer:
- From 1241, we triple-insert the top block above a splitfile. The blocks in the splitfile are redundant, but the block above them isn't. We could make it redundant (see: https://bugs.freenetproject.org/view.php?id=3358), but tests so far suggest triple insertion gives equal or better performance.
- Large files have poor retrievability because of the segmented redundancy structure. This is being worked on: from 1255, files over 80MB are inserted with two-level segmentation (see: https://bugs.freenetproject.org/view.php?id=3370). LDPC codes (see: https://bugs.freenetproject.org/view.php?id=3950) might perform somewhat better and may be implemented later.