Encrypted Filesystem
From Entuura
In two contexts where we might deploy the Entuura system, data protection can be a life-or-death matter. The most extreme example is human rights monitoring NGOs, which collect data that, if it falls into the hands of human rights abusers, could be a blueprint for retribution. The second example is health data, especially in a context where patients with certain diseases (usually HIV) are stigmatized. For real-life examples of these risks and the way encryption has mitigated them, see this.
Because we are targeting the health industry as an initial market, encryption should be built in from the start, so that it is simply there and does not need to be added later.
Threat Model
The threat is that the data storage device will be lost, stolen, or seized. The adversary is a government or a counter-government group, not an opportunistic thief (who would have no interest in the data itself, just the hardware). They will be organized enough to be able to read, analyze and act on any data on the device. It is this action (threatening harm to people) that we aim to prevent, by preventing access to the data.
Our adversary will be able to get physical access to the device. They will not be able to do advanced reverse engineering (e.g. desoldering chips, using in-circuit emulators). While attempting to steal the data, they cannot monitor the telecommunications network any more than they normally can (see our threat model for the payload transfer encryption system). Power to the device will be interrupted sometime during the process of the adversary taking control of it. This is an assumption, but it is likely to be true because a confiscation is likely to be a hit-and-run operation. Analyzing the data in the office where it is kept is not the likely modus operandi of the adversary. One can find real-life accounts of data theft from MSF medical centers and from Latin American human rights organizations to verify this assertion. Finally, the assumption that power will be interrupted is valid because data owners can be trained that the only sure way to protect their data is to unplug the device themselves in case of emergency (and perhaps each night when locking up the office). Military systems often depend on their operators to destroy key material when the device is in danger of being captured by the enemy. This is the same principle.
Our data owners will be able to send a message to the outside world indicating that their data has been compromised, and requesting aid to protect the data. In a very hostile situation, where the data owners are detained or under surveillance, they might be unable to send a message positively indicating compromise. In this case, the message could take the form of an "All is OK" SMS message that has to be sent out of the area once a day or once a week. If the "All is OK" message fails to leave the area of operations, then the data is assumed to be compromised. (For another implementation of a related idea, see this: [1]).
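The "All is OK" rule amounts to a dead man's switch: silence past the agreed interval is treated as compromise. A minimal sketch of that check follows. The function name, the `grace` parameter, and the idea that the receiving side records the time of the last message are assumptions for illustration, not part of any existing Entuura code.

```python
from datetime import datetime, timedelta


def is_presumed_compromised(
    last_ok: datetime,
    now: datetime,
    interval: timedelta,
    grace: timedelta = timedelta(0),
) -> bool:
    """Return True when the expected "All is OK" message is overdue.

    last_ok  -- time the most recent "All is OK" message was received
    interval -- agreed reporting period (for example one day or one week)
    grace    -- hypothetical extra allowance for delayed SMS delivery
    """
    return now - last_ok > interval + grace


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0)
    daily = timedelta(days=1)
    # Last message arrived twelve hours ago: still within the daily window.
    print(is_presumed_compromised(datetime(2024, 1, 10, 0, 0), now, daily))
    # Last message arrived more than two days ago: treat as compromised.
    print(is_presumed_compromised(datetime(2024, 1, 8, 0, 0), now, daily))
```

The grace period is a design choice: too short and a delayed SMS withholds the key from a healthy device, too long and a seized device keeps its access for longer.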
An authorized LAN user (someone with physical access to the Ethernet network, or someone with access to the WiFi network) should have access to the files. If our attacker can get access to the LAN, he can get access to the data when it is in flight, or just ask for it directly. The data owner must protect the operational security of his system. Once the system is confiscated, the system should defend itself. This means that the easiest attack is not to confiscate the data, but to compromise the LAN (by tapping the network, sending a spy to work at the organization, or breaking the WiFi encryption). This system cannot defend against that.
Requirements
- Able to defend against data compromise in the context of our threat model (above)
- Best practices for secure data storage
- Crypto filesystem with no known attacks, currently maintained by the open source community
- A cipher as strong as AES-256 (AES itself is a block cipher, not a stream cipher)
- A symmetric cipher is acceptable. There is no need for public key cryptography for this application.
- Key is the full length the cipher supports and is not derived from a "user remembered password", which would likely reduce entropy and lead to compromise when the user writes the key down (also, see next requirement)
- No user interface for encryption: as long as files are stored in the right place they are automatically protected
- No user interface for decryption: as long as the device is booting normally and the data owner has not indicated that the data has been compromised, the data will be decrypted with no user interaction.
- Unencrypted storage space is still available on the USB key to allow SneakerNet.
- The feature is optional, so that in a context where using this feature would be considered suspicious, spying, or would threaten the project's authorization to operate, it can be turned off (and removed from the UI so that it cannot arouse suspicion)
These requirements seem to demand that the device will have external assistance to manage the key for the encrypted volume. That means this application depends on network access, though PPP would be enough.
Implementation
The Secure Folder
The encrypted volume will be a small disk image on the USB stick, sized according to the projected needs of the project, and according to their ability to back it up safely. Generally, a 500 MB file would be about right, so that there is plenty of room to store files, but the image comfortably fits onto a single CD.
Encrypted filesystems are implemented by encrypting the blocks. The filesystem itself still has to exist. Unless there is a compelling reason, we should keep it simple and stick to VFAT. Other filesystems might handle volume growth more gracefully, but growing a VFAT volume is possible with open source tools (see gparted).
The folder should be offered up via Samba with the same security level as all of the other shares. According to our threat model, all LAN users can have access to the data.
The Crypto Itself
There are several choices for encrypted file systems under Linux. The oldest is cryptoloop. It has been discredited, and should no longer be used. The remaining choices are dm-crypt, Loop-AES, and TrueCrypt. It's unclear which of these is the right choice.
Generally, they will all have the same characteristics. You make a big file on the disk, you give the crypto implementation a string of secret data, you keep the string of data secret, and as long as the secret is separate from the big file, the bad guys can't see what's inside.
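The "string of secret data" does not have to come from a person. As a minimal sketch, assuming a 256-bit key to match the AES-256 strength named in the requirements, it can be drawn from the operating system's random source. The function name and the constant are hypothetical, and how the key is handed to dm-crypt, Loop-AES, or TrueCrypt is deliberately left out.

```python
import secrets

# Assumption: 32 bytes (256 bits), the full key length for AES-256.
KEY_BYTES = 32


def generate_volume_key(n_bytes: int = KEY_BYTES) -> bytes:
    """Return a fresh random key taken from the OS CSPRNG.

    No password is involved, so the key carries its full entropy and there
    is nothing for a user to remember or write down.
    """
    return secrets.token_bytes(n_bytes)


if __name__ == "__main__":
    key = generate_volume_key()
    print(f"{len(key) * 8}-bit key generated")
```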
Key Management
Our strict "no user interface" requirements mean that the device has to take responsibility for managing the decryption key for the encrypted volume with no help from the user whatsoever.
One possibility is to use the central server as a key escrow service. But this would put an unacceptable dependency on the central server's reliability. The device has a host key unknown to everyone else in the system. Using it alone as the key for the encrypted filesystem would fail to defend the data in the case where the storage and the device are stolen together (which is the assumed case).
Because we assume the device will lose power during the seizure, the only safe place for the decryption key for the encrypted volume is in RAM. When the device reboots, the key will be gone. Only extremely advanced reverse engineering could recover the key, and even then only if power was interrupted for no more than a matter of milliseconds. This is good enough defense for our threat model.
So, how does the key get into RAM? It has to come from outside the device, thus over the network. Because we already have an infrastructure for using RSA-1024 public keys and AES-256 symmetric encryption to protect payloads, we can use that to protect the key in flight. We need to add a new service to the central server which allows the device to fetch the encrypted key. The most general purpose implementation would be a data-agnostic "get/set" API. The device uses SSL Client Authentication to prove to the server it has the right to use the "get/set" API. The server needs to know who is asking to avoid letting one node see another node's data. (This is actually optional: there are ways to set it up so that the central server doesn't care who is using the get/set interface, but it's slightly higher risk to do it that way.)
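The server side of such a data-agnostic "get/set" API can be sketched as an in-memory store that namespaces every entry by the requesting node, so one node can never read another node's data. The class and method names are hypothetical. The sketch assumes `node_id` has already been established by SSL Client Authentication before these methods are called, and that the blob arrives already encrypted.

```python
from typing import Dict, Optional, Tuple


class KeyBlobStore:
    """Hypothetical data-agnostic get/set store for the central server.

    The store never interprets a blob. It only files it under the
    (node_id, name) pair, where node_id is assumed to come from the
    client certificate presented during SSL Client Authentication.
    """

    def __init__(self) -> None:
        self._blobs: Dict[Tuple[str, str], bytes] = {}

    def set(self, node_id: str, name: str, blob: bytes) -> None:
        """Store or overwrite an opaque blob for this node."""
        self._blobs[(node_id, name)] = blob

    def get(self, node_id: str, name: str) -> Optional[bytes]:
        """Return this node's blob, or None if it has none by that name."""
        return self._blobs.get((node_id, name))


if __name__ == "__main__":
    store = KeyBlobStore()
    store.set("node-a", "volume-key", b"<encrypted key material>")
    print(store.get("node-a", "volume-key"))  # node-a sees its own blob
    print(store.get("node-b", "volume-key"))  # node-b sees nothing
```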
The key reaches RAM as a result of the first phone home after boot, and the device can hold it there. It needs to be kept around if the encrypted volume has to be re-mounted whenever the USB key is removed and reinserted. If re-mounting is not necessary, the key can just be used to mount the volume and then immediately discarded.
Key Management on the Central Server
The central server cannot see the key, because it is encrypted using public key encryption, so that only the holder of the matching private key (the device itself) can decrypt it. However, the central server can choose whether or not to distribute the encrypted payload holding the key material. Because our threat model assumes that the data owner can notify the central server administrator that the data has been compromised, we can move the key aside and prevent it from being handed out. The central server should keep the key instead of deleting it, so that if the data owner gets their data back and still has the device's private key, they will be able to decrypt the volume encryption key, then once again decrypt their volume.
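The "move the key aside" behaviour can be sketched as a record that keeps the encrypted blob but refuses to hand it out once a compromise has been reported. All names here are hypothetical, and the sketch assumes the administrator has a trusted way to call the report and reinstate operations.

```python
from typing import Optional


class EscrowedKey:
    """One device's encrypted volume key as held by the central server."""

    def __init__(self, encrypted_blob: bytes) -> None:
        self._blob = encrypted_blob
        self._withheld = False

    def report_compromise(self) -> None:
        """Move the key aside: keep it on file, but stop distributing it."""
        self._withheld = True

    def reinstate(self) -> None:
        """Resume distribution, e.g. after the data owner recovers the device."""
        self._withheld = False

    def fetch(self) -> Optional[bytes]:
        """Return the encrypted blob, or None while the key is withheld."""
        return None if self._withheld else self._blob


if __name__ == "__main__":
    entry = EscrowedKey(b"<encrypted key material>")
    entry.report_compromise()
    print(entry.fetch())  # withheld, so a seized device cannot fetch it
    entry.reinstate()
    print(entry.fetch())  # available again, nothing was deleted
```

Because the blob is withheld, not deleted, reinstating is a flag change and needs no key regeneration.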
Stretch Goals
- Multiple encrypted volumes, not just one.
- An offline backup system of some sort, able to survive in the case of the same threat model as above
- A user interface to guide the user through unmounting the volume and burning it to DVD? How to handle key management?
- Idea: An "n of m" system, where any n of the m pieces that were made can reconstruct the volume in decrypted form, but if you have fewer than n of the pieces, you only have line noise. So backups are done by putting 3 pieces onto a local DVD (or two copies of the same DVD), and 2 of them go onto two USB drives stored by two people, physically separate from the DVDs (i.e. hidden in their houses). If the system requires 4 of 5 pieces, then at least one of the USB keys has to be present with the DVD to decrypt the backup.
- Filesystem automatically grows as needed
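One well-known way to build the "n of m" idea above is Shamir's secret sharing. The sketch below is an illustration, not a chosen design. It assumes the backup is encrypted under a random key and that only this key, treated as an integer, is split into pieces. The function names and the choice of prime field are assumptions.

```python
import secrets
from typing import List, Tuple

# Assumption: a Mersenne prime comfortably larger than any 256-bit key.
PRIME = 2**521 - 1

Share = Tuple[int, int]


def split_secret(secret: int, n: int, m: int) -> List[Share]:
    """Split `secret` into m shares such that any n of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range for the field")
    if not 1 <= n <= m:
        raise ValueError("need 1 <= n <= m")
    # Random polynomial of degree n-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(n - 1)]
    shares = []
    for x in range(1, m + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: List[Share]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 from the given shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    key = secrets.randbits(256)
    pieces = split_secret(key, n=4, m=5)
    dvd, usb = pieces[:3], pieces[3:]            # 3 on the DVD, 2 on USB drives
    print(recover_secret(dvd + usb[:1]) == key)  # DVD plus one USB drive
    print(recover_secret(dvd) == key)            # the DVD alone is not enough
```

With the 4 of 5 split described in the idea, the three pieces on the DVD interpolate to an unrelated value, so at least one USB drive is needed to recover the key.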
