JavaScript filtering
It should be possible to implement a JavaScript filter. This would involve rewriting a page's script code and grafting in filtering functions where necessary. Here are some of the major issues:
- Generated HTML: We can feed generated HTML back to the node for re-filtering.
- Generated JavaScript (eval): We can feed generated JavaScript back to the node for re-filtering too.
- URLs: We can either filter URLs locally (in the browser), or feed them back to the node.
- Data flow: A JavaScript program can easily read a dangerous property and store it in a variable, then later read it back out of the variable and do something bad with it. This can be obfuscated in many ways that are impossible to predict at filtering time. The solution: when a dangerous property is assigned to a variable (or to an element of an array), create a shadow variable (or shadow array) which tracks exactly what is in it. We can then do the type analysis at run time rather than at filtering time, which avoids a lot of headaches.
- Inserts: Any kind of scripting has issues with inserts. If a page can, for example, time fetches of various pages, it can work out whether they are in your datastore, and then report this back to the bad guy by inserting it. First, we should only allow inserts on pages the user has approved for them: sites which don't even pretend to be a forum, for example, shouldn't do inserts. Second, there are a few ways to address the timing problem: we could make requests accessible to scripts always take the same time (or some function of size); we could have a separate client-cache for the client (assuming that we've implemented premix routing or at least tunnels, so that our locally requested data isn't in our datastore); and so on. The client could probably still determine our location, so we probably need several of these tactics.
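
The shadow-variable idea from the data flow item could look roughly like the sketch below. It is only an illustration under stated assumptions: the names ShadowScope, DANGEROUS_PROPERTIES, assignFromProperty, assignDerived and readForSink are hypothetical, the list of dangerous properties is a made-up example, and a real filter would have to rewrite the page's script so that its assignments and reads go through helpers like these.

```javascript
// Hypothetical example list; which properties count as dangerous is an assumption.
const DANGEROUS_PROPERTIES = new Set(["cookie", "location"]);

class ShadowScope {
  constructor() {
    this.values = new Map();  // real contents of each variable or array slot
    this.tainted = new Map(); // shadow: true if the slot derives from a dangerous property
  }

  // Stands in for the rewritten form of `name = obj[prop]`.
  // An array element can be tracked under a key such as "arr[0]".
  assignFromProperty(name, obj, prop) {
    this.values.set(name, obj[prop]);
    this.tainted.set(name, DANGEROUS_PROPERTIES.has(prop));
  }

  // Stands in for the rewritten form of `dst = f(src1, src2, ...)`:
  // the result is tainted if any of its sources is tainted.
  assignDerived(dst, value, sources) {
    this.values.set(dst, value);
    this.tainted.set(dst, sources.some((s) => this.tainted.get(s) === true));
  }

  // Stands in for handing a variable to a sink (a URL fetch, an insert, ...).
  readForSink(name) {
    if (this.tainted.get(name)) {
      throw new Error("blocked: " + name + " holds filtered data");
    }
    return this.values.get(name);
  }
}

// Usage: the script copies a dangerous property, obfuscates it, then tries to use it.
const scope = new ShadowScope();
const fakeDocument = { cookie: "secret=1", title: "hello" };

scope.assignFromProperty("a", fakeDocument, "cookie");
scope.assignDerived("b", "x" + scope.values.get("a"), ["a"]);
scope.assignFromProperty("c", fakeDocument, "title");

console.log(scope.readForSink("c")); // prints "hello"
try {
  scope.readForSink("b");
} catch (e) {
  console.log(e.message); // prints "blocked: b holds filtered data"
}
```

Because the check happens when the value reaches a sink, it does not matter how many intermediate variables or string operations the script uses to disguise the value, as long as every assignment passes through the shadow bookkeeping.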