Google Summer of Code/2010


Past experience

We have been involved in 2006, 2007, 2008, 2009, 2010, 2011, 2012 and 2013. In 2006, 2007, 2009 and 2010 we had lots of students, most of whom achieved something of value, and a few of whom became volunteer developers for Freenet. In 2008 we had only one student, who failed because of external issues. Having only one student may have been partly due to a poor ideas list that year: many ideas were not feasible, were too big, and so on.

Students are much more likely to be accepted if they demonstrate an ability to contribute, e.g. by contributing a bug fix or minor feature; judging students purely on their applications is not viable. Having said that, your application is important, because it specifies what you are building and therefore how we can determine whether you have succeeded. The initial application is a starting point, and we will help to fill in the detail and ensure that it is feasible by the time you are accepted.

Students will be required to communicate publicly on the mailing lists, and where appropriate via IRC, the wiki and so on. Your mentor is there for you, but talking only to your mentor is inefficient and ultimately bad for the project. We want to treat you as a developer (which means a valued volunteer), while giving whatever help is possible and needed. Also note that almost all of Freenet is written in Java.

Example Proposal Ideas

Please do not be limited by the list below. Students' own proposals will certainly be considered. You might also want to look at the uservoice page. In many cases, detailed proposals (which may be wrong but will be useful references) can be found on the bug tracker or the mailing lists. Remember that you can make up to 20 applications to us with different proposals; we encourage you to make as many applications as you want!

Client layer

Improve the web interface generally
There is much to do to make it more user-friendly. We will hopefully have a set of mock-up designs soon, but there are also many small things linked from this bug report. Many others are not linked, so please have a look around and check the mailing list archives. A recent, professional but not very detailed mockup focusing on the homepage is here: [1]. An older detailed suggestion might also be an inspiration.
Better installers and/or packages for non-Windows platforms
A .dmg file for OS X, with a tool to generate it from Linux; packages for all the major Linux distros; and scripts to build these (from *nix) and to maintain the necessary package repositories. There are issues to solve with updating, and generally we can't be part of a distro which is frozen for years on end, so we will probably maintain our own repositories for Linux. Robust scripts to automate package generation are *essential*. Also useful: a system tray icon for Linux, and fixing the OS X system tray and memory autodetection to actually work. Possibly make the Java-based installer work as root on Linux and create a user etc.
More ways to manage your darknet connections (friends)
There are various proposals for easier exchange of node references, such as shorter references, possibly with out-of-band password-based verification (ideas have been posted on the bug tracker or the mailing lists). And when you have added your friends, you should be able to (non-anonymously) chat with them, transfer files easily (and reliably), share bookmarks, share file indexes and so on. Even hamachi-style virtual LANs with peers have been suggested and might be useful, although there are performance/security tradeoffs. Social networking style features, on an opt-in basis, have also been suggested: a friend of your friend is likely to be someone you know too, so the ability to see your friend's friends, if they want you to, might be useful and result in adding them as a friend (we have to be careful here though!).
More content filters
We have to "filter" HTML, images, etc. to ensure that they are safe for the web browser and won't give away the user's IP address via inline images, scripting etc. Finishing the SVG filter written for 2009, implementing support for SVG embedded in XHTML embedded in ATOM (we have an ATOM filter but it is not integrated yet), and maybe an RSS filter would be useful. Filters for audio and video formats would be very helpful, and together with HTML5-based video playback support could make embedded video almost viable. Making it really viable would require deeper changes related to fetching data in order, access to partially downloaded content, and possibly an applet to show which parts have been downloaded and maybe to display those formats that we support (likely ogg) in browsers that don't support them. See here for more on embedded video playback: [2]. PDF would be very valuable, but the spec is huge; however, it is believed that the minimal sufficient functionality is not *so* huge. ODF is similarly a possibility but again is gigantic. Javascript is an option for JS geniuses (create a safe API and then force the JS sent to the browser to only use that API; please talk to us in detail as there are many side-issues with any sort of "safe scripting"!).

Node layer

A documented, limited, published, external plugin API
Plugin dependencies with versioning, maybe using OSGi or similar (although automatically loading old, known-security-broken versions may not be a good idea). Support for untrusted or semi-trusted plugins would be especially awesome. This would require classloader tricks to ensure that they can only call the API, and various checks in the API layer itself. Then you get to the interesting stuff: if a client can request and insert data, it can probe the datastore/client-cache, figure out what you've browsed recently and where you are on the network, and report that data back. Ways to limit this include restricting access to parallel requests and to accurate timing data (which requires overriding System.currentTimeMillis()), and using a separate client cache for the plugin.
Transport plugins
Currently Freenet only supports UDP. Make it able to use TCP, HTTP and various steganographic transports (e.g. VoIP). Freenet should provide all the heavy lifting (crypto etc.); it should be *EASY* to write a transport plugin: just register it with the appropriate type, give block size and so on, and Freenet will do the rest.
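As a rough illustration of how small the plugin side could be, here is a minimal sketch in Java. Everything in it is an assumption made for illustration: Freenet defines no TransportPlugin or TransportRegistry, and the method names are hypothetical. The plugin only declares a type and a block size and moves opaque bytes; the node is assumed to have already encrypted the payload.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical plugin contract; no such interface exists in Freenet. */
interface TransportPlugin {
    /** Transport type used at registration, e.g. "tcp" or "http". */
    String type();

    /** Largest block, in bytes, this transport can carry in one send. */
    int maxBlockSize();

    /** Sends one block that the node has already encrypted. */
    void send(byte[] encryptedBlock);
}

/** Hypothetical node-side registry that does the rest of the work. */
final class TransportRegistry {
    private final Map<String, TransportPlugin> byType = new LinkedHashMap<>();

    void register(TransportPlugin plugin) {
        if (plugin.maxBlockSize() <= 0) {
            throw new IllegalArgumentException("block size must be positive");
        }
        if (byType.putIfAbsent(plugin.type(), plugin) != null) {
            throw new IllegalStateException("duplicate transport: " + plugin.type());
        }
    }

    /** Splits the payload into blocks the transport accepts; returns the block count. */
    int sendVia(String type, byte[] encryptedPayload) {
        TransportPlugin plugin = byType.get(type);
        if (plugin == null) {
            throw new IllegalArgumentException("unknown transport: " + type);
        }
        int blocks = 0;
        for (int off = 0; off < encryptedPayload.length; off += plugin.maxBlockSize()) {
            int len = Math.min(plugin.maxBlockSize(), encryptedPayload.length - off);
            plugin.send(Arrays.copyOfRange(encryptedPayload, off, off + len));
            blocks++;
        }
        return blocks;
    }
}
```

With a plugin whose block size is 1000 bytes, a 2500-byte payload would be handed over as three blocks of 1000, 1000 and 500 bytes. A real design would also need receiving, connection setup and error reporting, none of which is shown here.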
Good FCP client libraries in more languages
-
Bandwidth scheduler
At certain times of day/days of the week, set the bwlimit to X. Should include a "pause" capability where the node would not exchange any network traffic, or possibly keep connections open but not route any requests, and tell peers that we are paused. There has been some work on how to implement that efficiently - the hard part is getting back onto the network fast.
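The rule lookup at the heart of such a scheduler could be sketched as follows. This is a minimal sketch under stated assumptions: the class and method names are hypothetical and not part of Freenet, a limit of 0 is used to mean "paused", and time windows that cross midnight are not handled. It does not address the hard part mentioned above, getting back onto the network quickly.

```java
import java.time.DayOfWeek;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; none of these names exist in Freenet. */
final class BandwidthSchedule {
    /** Assumed convention: a limit of 0 means the node is paused. */
    static final int PAUSED = 0;

    /** On the given days, from start (inclusive) to end (exclusive), use this limit. */
    static final class Rule {
        final Set<DayOfWeek> days;
        final LocalTime start;
        final LocalTime end;
        final int bytesPerSecond;

        Rule(Set<DayOfWeek> days, LocalTime start, LocalTime end, int bytesPerSecond) {
            this.days = days;
            this.start = start;
            this.end = end;
            this.bytesPerSecond = bytesPerSecond;
        }

        boolean matches(LocalDateTime when) {
            LocalTime t = when.toLocalTime();
            return days.contains(when.getDayOfWeek()) && !t.isBefore(start) && t.isBefore(end);
        }
    }

    private final List<Rule> rules = new ArrayList<>();
    private final int defaultBytesPerSecond;

    BandwidthSchedule(int defaultBytesPerSecond) {
        this.defaultBytesPerSecond = defaultBytesPerSecond;
    }

    BandwidthSchedule add(Rule rule) {
        rules.add(rule);
        return this;
    }

    /** The first matching rule wins; otherwise the default limit applies. */
    int limitAt(LocalDateTime when) {
        for (Rule rule : rules) {
            if (rule.matches(when)) {
                return rule.bytesPerSecond;
            }
        }
        return defaultBytesPerSecond;
    }

    boolean isPausedAt(LocalDateTime when) {
        return limitAt(when) == PAUSED;
    }
}
```

A node could call limitAt(LocalDateTime.now()) periodically and apply the result as its bwlimit, telling peers that it is paused whenever isPausedAt returns true.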
Improved simulations
We do not have any simulations capable of telling us what the impact of different load management schemes would be, and we do not want to deploy any new load management without simulating it first. New load limiting/balancing could potentially improve performance significantly. We do have a couple of existing simulators, but they don't simulate load.
Low-level protocol changes
The current low-level protocol is home-grown and not sufficiently TCP-like. It is limited to 256 packets in flight at a time, which can be a problem on connections with high latency and high bandwidth. It tries to be TCP-like but has explicit retransmits and lots of messiness. It is also bad in other ways. Also, because we have lots of different sized messages, we pad packets with random data; it would be much more efficient to pad with data transfers where possible.
More low level stuff
Packet size problems. We do not detect path MTU; this is difficult from Java, although possible with some native code. Also, detecting when packets over a certain size never arrive, or that they are statistically highly unlikely to arrive above a certain size (in which case we should fragment the packet), would help us to establish the maximum safe packet size. Once we know this, streams, aka padding with data transfers (also mentioned in the previous item), would allow us to adapt efficiently. This would help Freenet to work on weird connections (VPNs etc).
The two low-level issues above are probably a single GSoC project. You can see some work on this here: User:Evanbd/New Packet Format Proposal, or here: Node_protocol#New_Protocol.
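The statistical detection of a maximum safe packet size described above could start from something like the following Java sketch. It is only an illustration under stated assumptions: the class name, the thresholds and the idea of counting acknowledgements per packet size are ours, not taken from the existing Freenet code or from the linked proposals.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: estimates the largest packet size that reliably arrives. */
final class SafePacketSizeEstimator {
    private static final class Stats {
        int sent;
        int acked;
    }

    private final TreeMap<Integer, Stats> bySize = new TreeMap<>();
    private final int fallbackSize;
    private final int minSamples;
    private final double minDeliveryRatio;

    SafePacketSizeEstimator(int fallbackSize, int minSamples, double minDeliveryRatio) {
        this.fallbackSize = fallbackSize;
        this.minSamples = minSamples;
        this.minDeliveryRatio = minDeliveryRatio;
    }

    void recordSent(int size) {
        bySize.computeIfAbsent(size, k -> new Stats()).sent++;
    }

    void recordAcked(int size) {
        bySize.computeIfAbsent(size, k -> new Stats()).acked++;
    }

    /**
     * Walks the observed sizes in ascending order and returns the largest one whose
     * delivery ratio meets the threshold, stopping at the first size that fails it.
     * Sizes with too few samples are skipped as inconclusive.
     */
    int maxSafeSize() {
        int best = fallbackSize;
        for (Map.Entry<Integer, Stats> e : bySize.entrySet()) {
            Stats s = e.getValue();
            if (s.sent < minSamples) {
                continue;
            }
            if ((double) s.acked / s.sent >= minDeliveryRatio) {
                best = Math.max(best, e.getKey());
            } else {
                break;
            }
        }
        return best;
    }
}
```

Packets larger than maxSafeSize() would then be fragmented. A real implementation would also need to age out old samples, since the path can change over time.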
Improve Build Process Integrity
At a minimum it would be useful to publish a procedure that third parties can use to build bit-for-bit identical versions of Freenet Project build products (i.e. jars) to verify that they are buildable from the released source. You can find the outline for a more ambitious proposal to increase build integrity here: https://bugs.freenetproject.org/view.php?id=409
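The final comparison step of such a procedure, checking that a locally built jar matches a published digest, could look like the Java sketch below. The BuildVerifier class and the choice of SHA-256 are assumptions for illustration. The difficult part, making the build itself deterministic, is not shown.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper; not part of the Freenet build. */
final class BuildVerifier {
    /** Returns the lowercase hex SHA-256 digest of the given bytes. */
    static String sha256Hex(byte[] data) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest(data)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Returns the lowercase hex SHA-256 digest of a file such as a built jar. */
    static String sha256Hex(Path file) {
        try {
            return sha256Hex(Files.readAllBytes(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if the locally built file is bit-for-bit what the published digest describes. */
    static boolean matchesPublished(Path localJar, String publishedHex) {
        return sha256Hex(localJar).equalsIgnoreCase(publishedHex.trim());
    }
}
```

A third party would build from the released source, run matchesPublished on the resulting jar with the digest published alongside the release, and report any mismatch.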
Related to the above, migrate Freenet to Maven or another similar build management system.

Application layer

Rewrite Freemail to integrate well with Freetalk and Web of Trust
The rewrite should integrate with them (e.g. reuse identities, ability to send private replies to public threads etc.), have a good webmail GUI, work well (the current version seems very buggy), and resist spam (currently it doesn't use CAPTCHAs or anything similar). It should have a single logon for WoT and the Freemail accounts. Backward compatibility with existing Freemail probably isn't very important. The backend will be/is quite different to Freetalk's backend: the key thing to prevent is an attacker getting traffic analysis data (time/sender/recipient), so private channels between each pair of recipients are essential, and if introductions can be hidden too that's even better. However this is already coded in the existing Freemail, albeit buggily.
A good filesharing/file search system
This should tie in with the Web of Trust, allowing users to publish indexes and search those of their anonymous friends, rate others' indexes, merge them into their own, set up long-term file searches, preload indexes for faster searches, and so on. It might also integrate with Freetalk to help with discussions on labelling or rating. The problems of spam/deliberately corrupt content are very similar on Freenet to on traditional p2p, although the solutions may be different, especially as it isn't possible to trace spammers; trusted community maintained indexes have developed as a working means of solving these problems on web-based filesharing. Note that we already have a scalable forkable on-freenet btree search system to use as a backend, but it is not yet used for anything, and it is not distributed or WoT-compatible.
Another interesting area for filesharing is a distributed, WoT-based way to download data by conventional hashes rather than CHKs, which could tie in with other networks; this is also related to the backup idea under "Unconventional stuff" at the bottom.
Secure reinsert-on-demand filesharing, to improve the volume of content that is available. This is a lot harder than it sounds, but in any case we need searching first.
A microblogging and/or real-time chat system
Both of these things would actually be implemented in a fairly similar way. Evan has done a fair amount of work on how to efficiently implement microblogging over Freenet.
Easy-to-use tools for inserting freesites (freenet-hosted web sites) and files
We already have a blogging tool, but it needs more work, and tools to make it easy to insert existing content etc would also be useful. This should support uploading files of any size, should avoid re-uploading larger files on every update, but should be configurable to do so on a schedule, should work from within the freenet web interface as a plugin, and may support WebDAV uploads direct from authoring software. The ability to mirror stuff from the web would also be useful.
Scalable fork-and-merge distributed revision control over Freenet
This would integrate the new scalable on-Freenet b-trees from the new Library format by infinity0, in order to scale up to at least Wikipedia scales (to implement a wiki over Freenet using a fork-and-merge model). It would tie in closely with the Web of Trust (the trust network backing Freetalk), integrating with its identities and announcing forks, and allowing users to easily see changes in other forks and integrate them. The most obvious use for this is a wiki-over-freenet (note that because of spam and denial of service attacks, instant anonymous editing of a wiki on freenet is not possible). It might also be useful for distributed spidering of Freenet, for source code (e.g. if we want to deploy a new build only after a certain number of people we trust have signed it, and then build it from source), or for anything that needs a forkable database over Freenet. You might also need to optimise the btrees' data persistence by e.g. including the top level metadata for each chunk in the next layer up.
Better freesite searching
Lots of work has been done on this, but more could be done: Using the new library format, rewriting the indexes on the fly after gathering a few hours' data rather than writing it from the database over a week, support for long-term searches, web of trust integration, better support for stop-words (maybe aggregating them with common before/after words), tokenisation for tricky languages (Chinese, Japanese), distributing spidering across multiple users (as scaling is getting to be a serious problem now), etc.
Wiki over Freenet
A wiki over Freenet would be really awesome. In fact it could be a killer app. But it is not easy to implement, as there are several challenges. You can learn more there.

Unconventional stuff

Various people have suggested a p2p "backup" system over Freenet. Clearly Freenet cannot provide reliable data storage, so this is something of a contradiction in terms. However, a means to provide full system snapshots would be feasible: It would be unreliable for "unpopular" files (files not on many systems), but it might be possible to identify such files automatically and back them up on limited paid-for external backup space and/or hardcopy (using e.g. random routed requests to probe for whether the blocks in question are retrievable). For "popular" files, which many people have on their systems, it could be useful and relatively reliable.
