Tahoe-LAFS Restrictive Proxy Gateway
lafs-rpg (Tahoe-LAFS Restrictive Proxy Gateway) uses nginx to serve as a publicly exposed, read-only gateway to a Tahoe-LAFS grid. The design is intended to protect the gateway operator from malicious remote users, where the Tahoe-LAFS gateway provides insufficient protection.
After reading this document, see ./INSTALL.html. Remote users should not be able to learn the Capabilities of other users reading through a lafs-rpg process. This use case goal is called content compartmentalization. (Note: the operator, as a person-in-the-middle, can learn all Capabilities passed through the system, but remote users should not be able to.)
Neither producers nor consumers should be able to burden the lafs-rpg host resources, nor negatively impact any accounting of the operator, except as necessary for directly reading content. This use case goal is called operator cost protection.
The users' and operator's general security liability (which may also have legal implications) should be minimized. See Known Security Issues.
An explicit non-goal is an access control policy for content. LAFS Capabilities are designed from the ground up to allow fine-grained control over content and lafs-rpg does not provide any additional guarantees beyond the underlying LAFS network. This is consistent with the fact that producers may not (and therefore need not) be aware of lafs-rpg and therefore cannot rely on it.
A final minor use case goal is that lafs-rpg should rely on well tested software that is secure and scales well. The stack's security may be seen as a subgoal of both operator cost protection and content compartmentalization. The scalability may be seen as a subgoal of cost protection.
The lafs-rpg host provides an https interface which proxies some requests to a LAFS gateway. Some requests are denied, and some result in http redirect responses without interacting with the LAFS gateway.
The use of https helps protect content compartmentalization by preventing network eavesdroppers from learning Capabilities. Security failures in https usability, authentication, or encryption are one attack surface which could compromise content compartmentalization (by leaking Capabilities). As a convenience for consumers and browsers with bad habits, the lafs-rpg configuration will redirect all non-TLS http pages to the https front page.
The denial of some http requests is called the access control policy. The goal of this policy is to protect operator costs. It is not to protect content access; that is the job of Capabilities.
An overarching philosophy of the implementation is to be conservative wherever I notice ambiguity. Where I do, this document uses "may" to flag my ignorance of the design, implementation, or usage, and the implementation takes the more conservative of the available alternatives.
If you notice any implementation flaws, please file a ticket.
The nginx configuration listens on port 80 and port 443 on all interfaces. All http requests to port 80 redirect to: https://$PUBLIC_HOST/
The TLS/SSL port implements the http access control policies and passes any non-blocked or non-redirected requests to the LAFS gateway (which may be a remote host). Requests for / will be redirected to $FRONT_PAGE, so that a LAFS Capability can serve as the "front page".
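The routing described above can be sketched in a few lines of Python. This is only an illustration of the decision order, not the nginx configuration itself: the host name and front page value are hypothetical placeholders, I assume $FRONT_PAGE is a path on the same host, and the access control policy (which runs before anything is proxied) is left out.

```python
from urllib.parse import urlsplit

# Hypothetical values; in lafs-rpg these come from the operator's configuration.
PUBLIC_HOST = "gateway.example.org"
FRONT_PAGE = "/uri/URI:DIR2-RO:example/index.html"  # placeholder, not a real Capability


def route(port, request_path):
    """Return an (action, target) pair for one request."""
    if port == 80:
        # Every plain-http request is sent to the https front page,
        # regardless of which path was asked for.
        return ("redirect", "https://%s/" % PUBLIC_HOST)
    if urlsplit(request_path).path == "/":
        # The root of the https site redirects to the front page Capability.
        return ("redirect", FRONT_PAGE)
    # Anything else is handed to the LAFS gateway (after the access
    # control policy, which this sketch omits).
    return ("proxy", request_path)
```

Note that a request for http://$PUBLIC_HOST/some/path lands on the front page rather than on the https version of /some/path, which matches the behaviour described above.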
To understand the explicit access control policy, please read the ./templates/etc/nginx/sites-available/lafs-rpg file (or the generated output under ./build after running ./configure.py).
One of the "big picture" policy considerations is content availability. Because the operator is unaware of what content is readable through the lafs-rpg process, this may imply technical, social, or legal liabilities. Note that LAFS has a blacklist feature, separate from lafs-rpg entirely, which may influence decision making about this design goal. If you are interested in completely operator controlled content, lafs-rpg may provide a starting point. If you do use lafs-rpg for a more restrictive use case, feedback to the development community is welcome.
Note that for producers or consumers, the best way to limit liability is by using a private LAFS gateway.
The http access control policies are intended to prevent remote users from invoking any "ambient authority" or learning any "semi-secrets" of the LAFS gateway. Specifically:
- Without blocking the http methods PUT, POST, or DELETE, remote users could upload or modify content, which may incur accounting burdens on the operator and introduce computational overhead, violating the operator cost protection goal. Conservatively, the policy is to allow only the GET and HEAD methods.
- Without blocking some URL paths to the gateway, the remote user can learn information about the LAFS network which may violate the operator cost protection or content compartmentalization goals, so by conservative implementation philosophy, the configuration has a white-list of URLs (or URL prefixes). Examples:
- By viewing the status of current operations, a remote user may learn Capabilities of other users, violating content compartmentalization. (I have not verified this, but again take a conservative approach.)
- Accounting may be based on total network storage and access to the introducer furl grants the ability to upload to a given network's storage. This secret is leaked by the LAFS gateway on some URL paths. (An example of this kind of accounting and access control is the Least Authority Enterprises product Tahoe-LAFS-on-S3. See: Disclaimers).
- With access to only a read or verify Capability, a remote user can initiate expensive check and repair operations which would provide a denial of service vector, violating operator cost protection.
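The two rules above (method restriction plus a white-list of URL prefixes) can be sketched as a single predicate. This is a minimal illustration in Python, not the real policy: the prefixes below are assumptions chosen for the example, and the authoritative list is the one in the nginx template. The path normalization step is my own conservative addition, so that a path such as /uri/../status cannot pass the prefix check.

```python
import posixpath

ALLOWED_METHODS = {"GET", "HEAD"}

# Hypothetical white-list for illustration only; the real prefixes are
# defined in the lafs-rpg nginx template.
ALLOWED_PREFIXES = ("/uri/", "/file/")


def is_allowed(method, path):
    """Return True only if both the method and the path pass the policy."""
    if method not in ALLOWED_METHODS:
        return False
    # Collapse "." and ".." segments before matching prefixes.
    normalized = posixpath.normpath(path)
    if path.endswith("/") and normalized != "/":
        normalized += "/"  # normpath drops the trailing slash; restore it
    return normalized.startswith(ALLOWED_PREFIXES)
```

Everything not explicitly allowed is denied, so a new gateway URL that I have not considered is blocked by default rather than exposed by default.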
This script does not verify Capabilities in any way. The $FRONT_PAGE should point to a read-only (or literal/immutable) Capability. The access control policy is meant to protect the operator and not the content. Producers are responsible for protecting their content, and this is achieved by guarding their write Capabilities, not by lafs-rpg.
nginx may be configured to log all requests, as is the standard default. Logging requests will leak Capabilities and the requesting remote IP address into syslog.
This undermines forward secrecy: an attacker who reads these logs after an http session has terminated can still recover the Capabilities and can determine which remote IPs requested which Capabilities. This is a vulnerability to both the user and the lafs-rpg operator and violates the goal of minimal general security liability.
This logging should be made clear to the users of a lafs-rpg proxy.
One potential use case motivating logging is for the operator to gather statistics about the traffic to particular Capabilities. An alternative feature would be a means for producers (not operators) to request that this request-per-time information be gathered and published on a per-Capability basis. IP addresses could be irreversibly obfuscated to serve as a distinct-user proxy metric. [FIXME: Add this as a feature wishlist item.]
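As a rough idea of what the obfuscation could look like, here is a minimal Python sketch. It is not part of lafs-rpg, and all names in it are hypothetical. A plain hash of an IP address can be reversed by trying every address, so the sketch assumes a keyed hash (HMAC-SHA256) with a random key that is held only in memory and discarded at the end of the reporting period.

```python
import hashlib
import hmac
import secrets

# Random key kept only in memory. Once it is discarded, the tokens can no
# longer be recomputed from candidate IP addresses.
_KEY = secrets.token_bytes(32)


def obfuscate_ip(ip, key=_KEY):
    """Map an IP address to an opaque token that is stable for one key."""
    return hmac.new(key, ip.encode("ascii"), hashlib.sha256).hexdigest()[:16]


def distinct_users(requests, key=_KEY):
    """Count distinct obfuscated IPs per Capability.

    requests is an iterable of (ip, capability) pairs.
    """
    seen = {}
    for ip, cap in requests:
        seen.setdefault(cap, set()).add(obfuscate_ip(ip, key))
    return {cap: len(tokens) for cap, tokens in seen.items()}
```

Only the per-Capability counts would be published. The Capabilities themselves would still need to be handled carefully, since the sketch does nothing to hide them.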
I have contracted with Least Authority Enterprises on the Tahoe-LAFS-on-S3 product, I am a user of that product, and I am motivated for that product to succeed.
I use lafs-rpg with a Tahoe-LAFS-on-S3 account. That was a motivating factor to work on lafs-rpg.
Also, I am of the opinion that Tahoe-LAFS is ossm and may be thusly biased.
- zancas in #tahoe-lafs on freenode irc for editing this document.