It seems to me that, back in the day, all the companies we worked with shared files over FTP. Remember FTP? A surprising number of enterprise integration patterns depended on FTP and, eventually, SFTP.
Nowadays, it seems like many companies have moved to Amazon S3 to share information. This post is about using S3 securely and introduces a tool we’re working on to make it as easy as possible.
S3 is like a file system that you can access via an API and HTTPS. Because there are numerous clients and Amazon provides a number of features out of the box, it is an extremely robust solution. Consider the features:
With S3, we typically talk about buckets as discrete file systems. They are almost like different mapped drives. You can use a GUI client like this one from AWS:
We can also use the command line or build programs against the SDK. This lists our buckets:
aws s3 ls
Generally, S3 provides all the features I think most clients need. It is not always easy to use correctly, though. In particular, access controls and encryption tend to be tricky for people to get right.
Consider a use case where a company, AwesomeAI, is working on machine learning models that depend on very large CSV data sets that they receive and process for customers. Now assume that they want their partners, such as BigDataRider, to drop those files in S3 so that the data stays protected.
We can address this use case 100% with native S3. AwesomeAI can create a bucket with default encryption that only they can read from. They can give BigDataRider access to write to the bucket. BigDataRider can write objects to the bucket and even set the KMS key they want to use. In theory, BigDataRider can even bring their own KMS key. That is, even though it is AwesomeAI’s bucket, AwesomeAI can make it policy that BigDataRider can use their own KMS key.
aws s3 cp /filepath s3://mybucket/filename \
--sse aws:kms \
--sse-kms-key-id <key id>
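To make the bucket side of this a little more concrete, here is a minimal sketch of the kind of bucket policy AwesomeAI might attach. It is not taken from our setup: the account ID `111122223333` and the file name `partner-write-policy.json` are hypothetical placeholders, and `mybucket` is just the bucket name from the command above. The script only writes the policy to a local file; the `aws s3api` calls that would apply it are left as comments so nothing changes until you have reviewed it.

```shell
#!/bin/sh
# Sketch only: the account ID and file name below are hypothetical placeholders.
BUCKET="mybucket"
PARTNER_ACCOUNT_ID="111122223333"   # stand-in for BigDataRider's AWS account
POLICY_FILE="partner-write-policy.json"

# Statement 1 lets the partner account upload objects and grants nothing else.
# Statement 2 rejects any upload that does not request SSE-KMS encryption.
cat > "$POLICY_FILE" <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowPartnerWrite",
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::${PARTNER_ACCOUNT_ID}:root"},
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::${BUCKET}/*"
    },
    {
      "Sid": "DenyUploadsWithoutKms",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::${BUCKET}/*",
      "Condition": {
        "StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}
      }
    }
  ]
}
EOF

echo "Wrote $POLICY_FILE"

# To apply (run as the bucket owner, after reviewing the file):
#   aws s3api put-bucket-encryption --bucket "$BUCKET" \
#     --server-side-encryption-configuration \
#     '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"aws:kms"}}]}'
#   aws s3api put-bucket-policy --bucket "$BUCKET" --policy "file://$POLICY_FILE"
```

Because the policy never grants `s3:GetObject` or `s3:ListBucket`, the partner can write but cannot read back or list what is in the bucket. The deny statement also means the partner has to pass `--sse aws:kms` on upload, as in the `aws s3 cp` command above.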
Confused yet? Now add that there might be lots of additional customers, and that CustomDataSurfer shouldn’t be able to see or access any of the data BigDataRider pushes, or anything else AwesomeAI can see.
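One way to keep customers apart, sketched here as an assumption rather than as what AwesomeAI necessarily does, is to give each customer its own key prefix in the bucket and scope each write-only statement to that prefix. The account IDs, the prefix-per-customer layout, and the file name `per-partner-policy.json` are all hypothetical. As before, the script only generates a local file.

```shell
#!/bin/sh
# Sketch only: account IDs, prefixes, and the output file name are hypothetical.
BUCKET="mybucket"

# Print one write-only statement limited to a single customer's prefix.
#   $1 = customer name (also used as the key prefix)
#   $2 = customer's AWS account ID
partner_statement() {
  cat <<EOF
    {
      "Sid": "Write$1",
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::$2:root"},
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::$BUCKET/$1/*"
    }
EOF
}

# Assemble the statements into a single policy document.
{
  printf '{\n  "Version": "2012-10-17",\n  "Statement": [\n'
  partner_statement BigDataRider 111122223333
  printf '    ,\n'
  partner_statement CustomDataSurfer 444455556666
  printf '  ]\n}\n'
} > per-partner-policy.json

cat per-partner-policy.json
```

Each customer can then only put objects under its own prefix, and since no statement grants read or list access, neither customer can see what the other uploaded. For cross-account access, each customer would still need matching permissions on their own IAM side.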
So again, the things we care about:
aws-vault exec jemurai -- aws s3api get-bucket-policy --bucket amiexposed
So while we think S3 is awesome and feature-rich, we also observe that it can be challenging to use fully and securely. For the use case above, we built a simple open source tool, S3S2, that does a couple of things when you use it:
s3s2 share /directory/to/share
Or you can specify options at the command line:
s3s2 share --bucket the-bucket \
--pubKey the-public-key-of-the-receiver \
--awsKey the-name-of-the-kms-key \
/directory/to/share
This accomplishes the following goals:
Our new open source tool:
Tools and other pertinent references: