Access a confidential dataset
In this tutorial, you will learn how to use an encrypted dataset in your application through the IEXEC_DATASET_FILENAME environment variable.
Prerequisites:
Familiarity with the basic concepts of Intel® SGX and SCONE framework.
Trusted Execution Environments offer a major security advantage. They guarantee that the behavior of an execution does not change, even when it is launched on an untrusted remote machine. The data inside this type of environment is also protected, which allows it to be monetized without being leaked.
With iExec, you can authorize only the applications you trust to use your datasets, and get paid for that usage. Data is encrypted using standard encryption mechanisms, and the plain version never leaves your machine. The encrypted version is made available for use, and the encryption key is pushed to the Secret Management Service (SMS). After you deploy the dataset on iExec, you and only you decide which applications are allowed to get the secret to decrypt it.
Datasets are only decrypted inside authorized enclaves and never leave them. The same thing applies to secrets.
Your secrets are transferred with the SDK from your machine to the SMS over a TLS channel.
Let's see how to do all of that!
Encrypt the dataset
Before starting, make sure you are inside the ~/iexec-projects folder created during the quick start tutorial.
Make sure your chain.json content is the same as the one described here.
Init the dataset configuration.
This command creates the datasets/encrypted, datasets/original and .secrets/datasets folders. A new dataset section is also added to the iexec.json file.
We will create a dummy file containing "Hello, world!" inside datasets/original. Alternatively, you can put your own dataset file there.
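As an illustration, the dummy file can be created with a few lines of Python. This is only a sketch: the function name create_dummy_dataset is hypothetical, and the file name my-first-dataset.txt is assumed from the encrypted file name used later in this tutorial.

```python
from pathlib import Path


def create_dummy_dataset(project_dir: Path,
                         filename: str = "my-first-dataset.txt") -> Path:
    """Write a small plain-text dataset into <project_dir>/datasets/original."""
    original_dir = project_dir / "datasets" / "original"
    # The folder normally exists after the dataset init step; create it if not.
    original_dir.mkdir(parents=True, exist_ok=True)
    dataset_path = original_dir / filename
    dataset_path.write_text("Hello, world!", encoding="utf-8")
    return dataset_path
```

Calling create_dummy_dataset(Path.home() / "iexec-projects") would place the file where the encryption step expects to find it.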
Now run the iexec dataset encrypt command to encrypt the file.
The command outputs a checksum; keep this value for later use.
As you can see, the command generated the file datasets/encrypted/my-first-dataset.txt.enc. That file is the encrypted version of your dataset. Push it somewhere accessible, because the worker will download it during execution. You will enter this file's URI in the iexec.json file (multiaddr attribute) when you deploy your dataset. Make sure that the URI is a DIRECT download link (not a link to a web page, for example).
You can use GitHub, for example, to publish the file, but you must add /raw/ to the URI like this: https://github.com/<username>/<repo>/raw/master/my-first-dataset.txt.enc
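The URI pattern above can be captured in a small helper. This is a minimal sketch: github_raw_url is a hypothetical name, and the master default simply mirrors the example URI; adapt it to your own repository.

```python
def github_raw_url(username: str, repo: str, path: str,
                   branch: str = "master") -> str:
    """Build a direct-download GitHub URI following the /raw/ pattern."""
    # Strip a leading slash so the path joins cleanly after the branch name.
    return f"https://github.com/{username}/{repo}/raw/{branch}/{path.lstrip('/')}"
```

The returned string is the value to put in the multiaddr attribute once the encrypted file has been pushed to that repository.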
The file .secrets/datasets/my-first-dataset.txt.key is the encryption key; make sure to back it up securely. The file .secrets/datasets/dataset.key is just an "alias": it holds the key of the last encrypted dataset.
Deploy the dataset
Fill in the fields of the iexec.json file. Choose a name for your dataset, put the encrypted file's URI in multiaddr (the URI you got after publishing the file) and fill in the checksum field. The checksum of the dataset consists of a 0x prefix followed by the sha256sum of the dataset. This checksum is printed when running the iexec dataset encrypt command. If you missed it, you can retrieve the sha256sum of the dataset by running sha256sum datasets/encrypted/my-first-dataset.txt.enc.
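If sha256sum is not available on your machine, the same value can be computed in Python. The sketch below follows the rule stated above (a 0x prefix followed by the SHA-256 digest of the encrypted file); the function name dataset_checksum is hypothetical.

```python
import hashlib
from pathlib import Path


def dataset_checksum(encrypted_path: Path, chunk_size: int = 65536) -> str:
    """Return '0x' followed by the hex SHA-256 digest of the given file."""
    digest = hashlib.sha256()
    with encrypted_path.open("rb") as f:
        # Read in chunks so large datasets do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return "0x" + digest.hexdigest()
```

For example, dataset_checksum(Path("datasets/encrypted/my-first-dataset.txt.enc")) should match the checksum printed by the encryption step.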
To deploy your dataset, run the iexec dataset deploy command.
You will get a hexadecimal address for your deployed dataset. Use that address to push the encryption key to the SMS so it is available for authorized applications.
For simplicity, we will use the dataset with a TEE-debug app on a debug workerpool. The debug workerpool is connected to a debug Secret Management Service, so we will send the dataset encryption key to this SMS. This is fine for debugging, but do not use it to store production secrets.
Push the dataset secret to the SMS
Check secret availability on the SMS
We saw in this section how to encrypt a dataset and deploy it on iExec. In addition, we learned how to push the encryption secret to the SMS. Now we need to build the application that is going to consume this dataset.
Prepare your application
For demo purposes, we omitted some development best practices in these examples.
Make sure to check your field's best practices before going to production.
Let's create a directory tree for this app in ~/iexec-projects/.
In the src/ folder, create the file app.js or app.py and copy the application code into it.
The application reads the content of the dataset and writes it into the result folder.
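A minimal Python sketch of such an application is shown below. IEXEC_DATASET_FILENAME comes from this tutorial; the IEXEC_IN and IEXEC_OUT variables, the result.txt file name and the computed.json file with its deterministic-output-path key are assumptions based on the usual iExec application conventions, so check them against Build your first application before relying on them.

```python
import json
import os
from pathlib import Path
from typing import Mapping


def run_app(env: Mapping[str, str] = os.environ) -> Path:
    """Copy the decrypted dataset content into the result folder."""
    # Assumption: IEXEC_IN and IEXEC_OUT hold the input and output folders.
    iexec_in = Path(env["IEXEC_IN"])
    iexec_out = Path(env["IEXEC_OUT"])
    # The dataset file name is provided through IEXEC_DATASET_FILENAME.
    dataset_path = iexec_in / env["IEXEC_DATASET_FILENAME"]

    text = dataset_path.read_text(encoding="utf-8")

    iexec_out.mkdir(parents=True, exist_ok=True)
    result_path = iexec_out / "result.txt"  # hypothetical result file name
    result_path.write_text(text, encoding="utf-8")

    # Assumption: computed.json declares which file is the task result.
    computed = {"deterministic-output-path": str(result_path)}
    (iexec_out / "computed.json").write_text(json.dumps(computed),
                                             encoding="utf-8")
    return result_path
```

In app.py, the script would simply end with a call to run_app(). Passing the environment as a parameter keeps the function easy to test outside an enclave with a plain dictionary.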
Build the TEE docker image
Create the Dockerfile as described in Build your first application.
Build the Docker image:
Follow the steps described in Build Scone app > Build the TEE docker image.
Update the sconify.sh script with the following variables:
Run the sconify.sh script to build the Scone TEE application:
Test your app on iExec
At this stage, your application is ready to be tested on iExec.
Deploy the TEE app on iExec
Deploy the application as described in Build Scone app.
Run the TEE app
To run a TEE app with a dataset, pass the tag (--tag tee,scone) and the dataset to use (--dataset <datasetAddress>) to the iexec app run command.
One last thing: to run a TEE-debug app, you also need to select a debug workerpool. Use the debug workerpool debug-v8-learn.main.pools.iexec.eth.
You are now ready to run the app
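To show how these options fit together, the sketch below assembles the command line as an argument list without executing it. The --tag and --dataset options and the workerpool address come from this tutorial; passing the workerpool through a --workerpool option is an assumption about the iExec CLI, and build_app_run_command is a hypothetical helper name.

```python
import shlex
from typing import List

DEBUG_WORKERPOOL = "debug-v8-learn.main.pools.iexec.eth"


def build_app_run_command(dataset_address: str,
                          workerpool: str = DEBUG_WORKERPOOL) -> List[str]:
    """Assemble the arguments for running a TEE app with a dataset."""
    return [
        "iexec", "app", "run",
        "--tag", "tee,scone",
        "--dataset", dataset_address,
        # Assumption: the workerpool is selected with --workerpool.
        "--workerpool", workerpool,
    ]


def as_shell_line(args: List[str]) -> str:
    """Render the argument list as a copy-pastable shell command."""
    return shlex.join(args)
```

Replace the dataset address with the hexadecimal address returned when you deployed the dataset. The resulting list can be passed to subprocess.run, or printed with as_shell_line and pasted into a terminal.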
Next step?
With the confidential computing workflow described above, you now know how to use an encrypted dataset in a Confidential Computing application.
To go further, check out how to: