[PATCH] Check mm_users!=0 when pinning enclave's MM
by Sean Christopherson
Abort sgx_pin_mm if mm_users==0 after acquiring mmap_sem to avoid
racing with do_exit during the allocation and eviction flows. Remove
the same check from sgx_process_add_page_req as it is now redundant.
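The ordering the patch enforces — take mmap_sem first, only then check mm_users — can be sketched in a small userspace model. The struct and helper below are hypothetical stand-ins for mm_struct and sgx_pin_mm, with the rwsem itself elided:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant mm_struct/sgx_encl state. */
struct mm_model {
	atomic_int mm_users;	/* drops to 0 once do_exit() begins */
	int vma_cnt;		/* enclave VMAs still mapped */
};

/* Model of the check sgx_pin_mm performs *after* taking mmap_sem:
 * once mm_users hits 0, exit_mmap() may be tearing down VMAs and
 * PTEs without holding any MM locks, so the eviction flow must
 * back off instead of touching the PTEs. */
static bool pin_mm_check(struct mm_model *mm)
{
	/* caller holds mmap_sem for read here */
	if (!mm->vma_cnt || atomic_load(&mm->mm_users) == 0)
		return false;	/* owner is exiting: abort */
	return true;
}
```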
Signed-off-by: Sean Christopherson <sean.j.christopherson(a)intel.com>
---
drivers/platform/x86/intel_sgx_ioctl.c | 6 ------
drivers/platform/x86/intel_sgx_util.c | 11 ++++++++++-
2 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/drivers/platform/x86/intel_sgx_ioctl.c b/drivers/platform/x86/intel_sgx_ioctl.c
index 6c8962f..43aaaad 100644
--- a/drivers/platform/x86/intel_sgx_ioctl.c
+++ b/drivers/platform/x86/intel_sgx_ioctl.c
@@ -262,12 +262,6 @@ static bool sgx_process_add_page_req(struct sgx_add_page_req *req)
if (IS_ERR(backing))
goto out;
- /* Do not race with do_exit() */
- if (!atomic_read(&encl->mm->mm_users)) {
- sgx_put_backing(backing, 0);
- goto out;
- }
-
ret = vm_insert_pfn(vma, encl_page->addr, PFN_DOWN(epc_page->pa));
if (ret)
goto out;
diff --git a/drivers/platform/x86/intel_sgx_util.c b/drivers/platform/x86/intel_sgx_util.c
index d1c4c71..d076d77 100644
--- a/drivers/platform/x86/intel_sgx_util.c
+++ b/drivers/platform/x86/intel_sgx_util.c
@@ -149,7 +149,16 @@ bool sgx_pin_mm(struct sgx_encl *encl)
down_read(&encl->mm->mmap_sem);
- if (!encl->vma_cnt) {
+ /* Check both vma_cnt and mm_users after acquiring mmap_sem
+ * to avoid racing with the owning process exiting. mm_users
+ * needs to be checked as do_exit->exit_mmap tears down VMAs
+ * and PTEs without holding any MM locks (once mm_users==0).
+ * mm_count only guarantees the MM's kernel objects will not
+ * be freed, it doesn't protect the VMAs or PTEs. Allowing
+ * EPC page eviction to race with the PTEs being dismantled
+ * can result in PTEs being left in use when the MM is freed.
+ */
+ if (!encl->vma_cnt || !atomic_read(&encl->mm->mm_users)) {
sgx_unpin_mm(encl);
return false;
}
--
2.7.4
[PATCH 0/3] PCMD backing storage
by Jarkko Sakkinen
Move the PCMDs to a shmem file in order to have better control over
memory consumption.
Jarkko Sakkinen (2):
intel_sgx: pin the backing page inside sgx_eldu
intel_sgx: backing storage file for PCMD
Sean Christopherson (1):
intel_sgx: check the result of do_eldu
drivers/platform/x86/intel_sgx.h | 5 ++-
drivers/platform/x86/intel_sgx_ioctl.c | 12 +++++
drivers/platform/x86/intel_sgx_page_cache.c | 19 +++++++-
drivers/platform/x86/intel_sgx_util.c | 30 +++++++++++++
drivers/platform/x86/intel_sgx_vma.c | 69 +++++++++++++++++------------
5 files changed, 104 insertions(+), 31 deletions(-)
--
2.9.3
[PATCH 0/6] intel_sgx: sgx_encl_page memory optimization
by Sean Christopherson
I will be on vacation until January 3rd and will not be checking
email, so there is no rush in evaluating this series.
The end goal of this patch series is to eliminate memory allocation
for eviction-specific structures for enclave pages that are resident
in the EPC. This is accomplished by moving the objects used to track
an evicted page, e.g. VA page, VA offset and PCMD, to a new struct,
sgx_evicted_page, that is allocated/freed on-demand. To completely
eliminate memory consumption for sgx_evicted_page when an encl page
is resident in EPC, sgx_epc_page and sgx_evicted_page are combined
into an anonymous union. The two pointers are mutually exclusive,
as an encl page cannot be simultaneously resident in EPC and evicted
from EPC.
The first two patches fix pre-existing bugs that are exposed by
allocating VA pages/offsets on-demand (immediately prior to EWB);
patches 3-5 prepare for on-demand allocation of eviction structures;
and the final patch implements sgx_evicted_page and its unionization
with sgx_epc_page.
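The mutually-exclusive pointers described above can be sketched as an anonymous union; the field names here are illustrative, not the series' final layout:

```c
#include <stddef.h>

struct sgx_epc_page;		/* page resident in EPC */
struct sgx_evicted_page;	/* on-demand: VA page, VA offset, PCMD */

/* Sketch of the proposed struct: because an enclave page is either
 * resident in EPC or evicted, never both, the two pointers can share
 * storage, and a resident page pays nothing for the eviction
 * bookkeeping. */
struct encl_page_sketch {
	unsigned long addr;
	unsigned long flags;
	union {
		struct sgx_epc_page *epc_page;		/* resident */
		struct sgx_evicted_page *evicted_page;	/* swapped out */
	};
};
```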
Sean Christopherson (6):
intel_sgx: Abort sgx_vma_do_fault if do_eldu fails
intel_sgx: Lock the enclave when updating va_pages
intel_sgx: Track SECS eviction using its epc_page
intel_sgx: Clean-up page freeing in sgx_ewb flow
intel_sgx: Delay VA slot allocation until EWB
intel_sgx: Combine epc/eviction pages via union
drivers/platform/x86/intel_sgx.h | 33 ++++++++-------
drivers/platform/x86/intel_sgx_ioctl.c | 108 ++++++++++++++++++++++++-------------------------
drivers/platform/x86/intel_sgx_page_cache.c | 115 +++++++++++++++++++++++++++++++++++------------------
drivers/platform/x86/intel_sgx_util.c | 12 +++---
drivers/platform/x86/intel_sgx_vma.c | 23 ++++++-----
5 files changed, 164 insertions(+), 127 deletions(-)
[PATCH] Backing storage file for PCMD
by Jarkko Sakkinen
Move the PCMDs to a backing storage file in order to give more control
(mainly swapping and discarding) and also to help pack struct
sgx_encl_page tighter in the heap.
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen(a)linux.intel.com>
---
drivers/platform/x86/intel_sgx.h | 5 ++++-
drivers/platform/x86/intel_sgx_ioctl.c | 12 ++++++++++++
drivers/platform/x86/intel_sgx_page_cache.c | 16 ++++++++++++++-
drivers/platform/x86/intel_sgx_util.c | 30 +++++++++++++++++++++++++++++
4 files changed, 61 insertions(+), 2 deletions(-)
diff --git a/drivers/platform/x86/intel_sgx.h b/drivers/platform/x86/intel_sgx.h
index ed9e8e6..1d03606 100644
--- a/drivers/platform/x86/intel_sgx.h
+++ b/drivers/platform/x86/intel_sgx.h
@@ -115,7 +115,6 @@ struct sgx_encl_page {
struct list_head load_list;
struct sgx_va_page *va_page;
unsigned int va_offset;
- struct sgx_pcmd pcmd;
};
struct sgx_tgid_ctx {
@@ -141,6 +140,7 @@ struct sgx_encl {
struct task_struct *owner;
struct mm_struct *mm;
struct file *backing;
+ struct file *pcmd;
struct list_head load_list;
struct kref refcount;
unsigned long base;
@@ -198,6 +198,9 @@ void sgx_put_epc_page(void *epc_page_vaddr);
struct page *sgx_get_backing(struct sgx_encl *encl,
struct sgx_encl_page *entry);
void sgx_put_backing(struct page *backing, bool write);
+struct page *sgx_get_pcmd(struct sgx_encl *encl,
+ struct sgx_encl_page *entry);
+void sgx_put_pcmd(struct page *pcmd_page, bool write);
void sgx_insert_pte(struct sgx_encl *encl,
struct sgx_encl_page *encl_page,
struct sgx_epc_page *epc_page,
diff --git a/drivers/platform/x86/intel_sgx_ioctl.c b/drivers/platform/x86/intel_sgx_ioctl.c
index 3a4a8fa..dec136a 100644
--- a/drivers/platform/x86/intel_sgx_ioctl.c
+++ b/drivers/platform/x86/intel_sgx_ioctl.c
@@ -477,6 +477,7 @@ static long sgx_ioc_enclave_create(struct file *filep, unsigned int cmd,
struct vm_area_struct *vma;
void *secs_vaddr = NULL;
struct file *backing;
+ struct file *pcmd;
long ret;
secs = kzalloc(sizeof(*secs), GFP_KERNEL);
@@ -501,9 +502,19 @@ static long sgx_ioc_enclave_create(struct file *filep, unsigned int cmd,
goto out;
}
+ pcmd = shmem_file_setup("dev/sgx",
+ ((secs->size >> PAGE_SHIFT) + 1) * 128,
+ VM_NORESERVE);
+ if (IS_ERR(pcmd)) {
+ fput(backing);
+ ret = PTR_ERR(pcmd);
+ goto out;
+ }
+
encl = kzalloc(sizeof(*encl), GFP_KERNEL);
if (!encl) {
fput(backing);
+ fput(pcmd);
ret = -ENOMEM;
goto out;
}
@@ -522,6 +533,7 @@ static long sgx_ioc_enclave_create(struct file *filep, unsigned int cmd,
encl->base = secs->base;
encl->size = secs->size;
encl->backing = backing;
+ encl->pcmd = pcmd;
secs_epc = sgx_alloc_page(encl->tgid_ctx, 0);
if (IS_ERR(secs_epc)) {
diff --git a/drivers/platform/x86/intel_sgx_page_cache.c b/drivers/platform/x86/intel_sgx_page_cache.c
index d073057..b264d20 100644
--- a/drivers/platform/x86/intel_sgx_page_cache.c
+++ b/drivers/platform/x86/intel_sgx_page_cache.c
@@ -237,10 +237,14 @@ static int __sgx_ewb(struct sgx_encl *encl,
{
struct sgx_page_info pginfo;
struct page *backing;
+ struct page *pcmd;
+ unsigned long pcmd_offset;
void *epc;
void *va;
int ret;
+ pcmd_offset = ((encl_page->addr >> PAGE_SHIFT) & 31) * 128;
+
backing = sgx_get_backing(encl, encl_page);
if (IS_ERR(backing)) {
ret = PTR_ERR(backing);
@@ -249,19 +253,29 @@ static int __sgx_ewb(struct sgx_encl *encl,
return ret;
}
+ pcmd = sgx_get_pcmd(encl, encl_page);
+ if (IS_ERR(pcmd)) {
+ ret = PTR_ERR(pcmd);
+ sgx_warn(encl, "pinning the pcmd page for EWB failed with %d\n",
+ ret);
+ return ret;
+ }
+
epc = sgx_get_epc_page(encl_page->epc_page);
va = sgx_get_epc_page(encl_page->va_page->epc_page);
pginfo.srcpge = (unsigned long)kmap_atomic(backing);
- pginfo.pcmd = (unsigned long)&encl_page->pcmd;
+ pginfo.pcmd = (unsigned long)kmap_atomic(pcmd) + pcmd_offset;
pginfo.linaddr = 0;
pginfo.secs = 0;
ret = __ewb(&pginfo, epc,
(void *)((unsigned long)va + encl_page->va_offset));
+ kunmap_atomic((void *)(unsigned long)(pginfo.pcmd - pcmd_offset));
kunmap_atomic((void *)(unsigned long)pginfo.srcpge);
sgx_put_epc_page(va);
sgx_put_epc_page(epc);
+ sgx_put_pcmd(pcmd, true);
sgx_put_backing(backing, true);
return ret;
diff --git a/drivers/platform/x86/intel_sgx_util.c b/drivers/platform/x86/intel_sgx_util.c
index 2c390c5..40f5839 100644
--- a/drivers/platform/x86/intel_sgx_util.c
+++ b/drivers/platform/x86/intel_sgx_util.c
@@ -105,6 +105,33 @@ void sgx_put_backing(struct page *backing_page, bool write)
put_page(backing_page);
}
+struct page *sgx_get_pcmd(struct sgx_encl *encl,
+ struct sgx_encl_page *entry)
+{
+ struct page *pcmd;
+ struct inode *inode;
+ struct address_space *mapping;
+ gfp_t gfpmask;
+ pgoff_t index;
+
+ inode = encl->pcmd->f_path.dentry->d_inode;
+ mapping = inode->i_mapping;
+ gfpmask = mapping_gfp_mask(mapping);
+ /* 32 PCMD's per page */
+ index = (entry->addr - encl->base) >> (PAGE_SHIFT + 5);
+ pcmd = shmem_read_mapping_page_gfp(mapping, index, gfpmask);
+
+ return pcmd;
+}
+
+void sgx_put_pcmd(struct page *pcmd_page, bool write)
+{
+ if (write)
+ set_page_dirty(pcmd_page);
+
+ put_page(pcmd_page);
+}
+
struct vm_area_struct *sgx_find_vma(struct sgx_encl *encl, unsigned long addr)
{
struct vm_area_struct *vma;
@@ -245,5 +272,8 @@ void sgx_encl_release(struct kref *ref)
if (encl->backing)
fput(encl->backing);
+ if (encl->pcmd)
+ fput(encl->pcmd);
+
kfree(encl);
}
--
2.9.3
[PATCH] intel_sgx: simplify sgx_write_pages()
by Jarkko Sakkinen
Now that the sgx_ewb flow has sane error recovery we can simplify
sgx_write_pages() significantly by moving the pinning of the backing
page into sgx_ewb(). This was not possible before, as in some
situations pinning could legally fail.
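The control flow sgx_ewb implements — try EWB, and on SGX_NOT_TRACKED force an IPI-based TLB flush and retry once — can be modeled without hardware. The stub below stands in for __ewb plus smp_call_function, and the error value is illustrative only (the real SGX_NOT_TRACKED code is defined by the SGX architecture):

```c
#define SGX_NOT_TRACKED 11	/* illustrative value, not the real one */

/* Stub "hardware": EWB fails with SGX_NOT_TRACKED until *flushed is
 * set, mimicking a stale TLB entry on another logical processor. */
static int stub_ewb(int *flushed)
{
	return *flushed ? 0 : SGX_NOT_TRACKED;
}

/* Retry pattern from sgx_ewb: a first EWB can fail if some CPU still
 * references the page; after the IPI flush the retry should succeed. */
static int ewb_with_retry(int *flushed)
{
	int ret = stub_ewb(flushed);

	if (ret == SGX_NOT_TRACKED) {
		*flushed = 1;	/* models smp_call_function(sgx_ipi_cb, ...) */
		ret = stub_ewb(flushed);
	}
	return ret;
}
```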
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen(a)linux.intel.com>
---
drivers/platform/x86/intel_sgx_page_cache.c | 63 ++++++++++++-----------------
1 file changed, 25 insertions(+), 38 deletions(-)
diff --git a/drivers/platform/x86/intel_sgx_page_cache.c b/drivers/platform/x86/intel_sgx_page_cache.c
index 36d4d54..d073057 100644
--- a/drivers/platform/x86/intel_sgx_page_cache.c
+++ b/drivers/platform/x86/intel_sgx_page_cache.c
@@ -233,48 +233,57 @@ static void sgx_etrack(struct sgx_epc_page *epc_page)
}
static int __sgx_ewb(struct sgx_encl *encl,
- struct sgx_encl_page *encl_page,
- struct page *backing)
+ struct sgx_encl_page *encl_page)
{
struct sgx_page_info pginfo;
+ struct page *backing;
void *epc;
void *va;
int ret;
- pginfo.srcpge = (unsigned long)kmap_atomic(backing);
+ backing = sgx_get_backing(encl, encl_page);
+ if (IS_ERR(backing)) {
+ ret = PTR_ERR(backing);
+ sgx_warn(encl, "pinning the backing page for EWB failed with %d\n",
+ ret);
+ return ret;
+ }
+
epc = sgx_get_epc_page(encl_page->epc_page);
va = sgx_get_epc_page(encl_page->va_page->epc_page);
+ pginfo.srcpge = (unsigned long)kmap_atomic(backing);
pginfo.pcmd = (unsigned long)&encl_page->pcmd;
pginfo.linaddr = 0;
pginfo.secs = 0;
ret = __ewb(&pginfo, epc,
(void *)((unsigned long)va + encl_page->va_offset));
+ kunmap_atomic((void *)(unsigned long)pginfo.srcpge);
sgx_put_epc_page(va);
sgx_put_epc_page(epc);
- kunmap_atomic((void *)(unsigned long)pginfo.srcpge);
+ sgx_put_backing(backing, true);
return ret;
}
static bool sgx_ewb(struct sgx_encl *encl,
- struct sgx_encl_page *entry,
- struct page *backing)
+ struct sgx_encl_page *entry)
{
- int ret = __sgx_ewb(encl, entry, backing);
+ int ret = __sgx_ewb(encl, entry);
if (ret == SGX_NOT_TRACKED) {
/* slow path, IPI needed */
smp_call_function(sgx_ipi_cb, NULL, 1);
- ret = __sgx_ewb(encl, entry, backing);
+ ret = __sgx_ewb(encl, entry);
}
if (ret) {
/* make enclave inaccessible */
sgx_invalidate(encl);
smp_call_function(sgx_ipi_cb, NULL, 1);
- sgx_err(encl, "EWB returned %d, enclave killed\n", ret);
+ if (ret > 0)
+ sgx_err(encl, "EWB returned %d, enclave killed\n", ret);
return false;
}
@@ -294,11 +303,8 @@ static void sgx_write_pages(struct sgx_encl *encl, struct list_head *src)
{
struct sgx_encl_page *entry;
struct sgx_encl_page *tmp;
- struct page *pages[SGX_NR_SWAP_CLUSTER_MAX + 1];
struct vm_area_struct *evma;
unsigned int free_flags;
- int cnt = 0;
- int i = 0;
if (list_empty(src))
return;
@@ -316,25 +322,14 @@ static void sgx_write_pages(struct sgx_encl *encl, struct list_head *src)
continue;
}
- pages[cnt] = sgx_get_backing(encl, entry);
- if (IS_ERR(pages[cnt])) {
- list_del(&entry->load_list);
- list_add_tail(&entry->load_list, &encl->load_list);
- entry->flags &= ~SGX_ENCL_PAGE_RESERVED;
- continue;
- }
-
zap_vma_ptes(evma, entry->addr, PAGE_SIZE);
sgx_eblock(entry->epc_page);
- cnt++;
}
/* ETRACK */
sgx_etrack(encl->secs_page.epc_page);
/* EWB */
- i = 0;
-
while (!list_empty(src)) {
entry = list_first_entry(src, struct sgx_encl_page,
load_list);
@@ -344,29 +339,21 @@ static void sgx_write_pages(struct sgx_encl *encl, struct list_head *src)
evma = sgx_find_vma(encl, entry->addr);
if (evma) {
- if (sgx_ewb(encl, entry, pages[i]))
+ if (sgx_ewb(encl, entry))
free_flags = SGX_FREE_SKIP_EREMOVE;
encl->secs_child_cnt--;
}
sgx_free_encl_page(entry, encl, free_flags);
- sgx_put_backing(pages[i++], evma);
}
- /* Allow SECS page eviction only when the encl is initialized. */
- if (!encl->secs_child_cnt &&
- (encl->flags & SGX_ENCL_INITIALIZED)) {
- pages[cnt] = sgx_get_backing(encl, &encl->secs_page);
- if (!IS_ERR(pages[cnt])) {
- free_flags = 0;
- if (sgx_ewb(encl, &encl->secs_page, pages[cnt]))
- free_flags = SGX_FREE_SKIP_EREMOVE;
-
- encl->flags |= SGX_ENCL_SECS_EVICTED;
+ if (!encl->secs_child_cnt && (encl->flags & SGX_ENCL_INITIALIZED)) {
+ free_flags = 0;
+ if (sgx_ewb(encl, &encl->secs_page))
+ free_flags = SGX_FREE_SKIP_EREMOVE;
- sgx_free_encl_page(&encl->secs_page, encl, free_flags);
- sgx_put_backing(pages[cnt], true);
- }
+ encl->flags |= SGX_ENCL_SECS_EVICTED;
+ sgx_free_encl_page(&encl->secs_page, encl, free_flags);
}
mutex_unlock(&encl->lock);
--
2.7.4
[PATCH v9 00/10] Fixes and performance improvements
by Jarkko Sakkinen
Jarkko Sakkinen (7):
intel_sgx: fallback more gracefully from EWB failure
intel_sgx: fix deadlock in sgx_ioc_enclave_create()
intel_sgx: fix null pointer deref in sgx_invalidate()
intel_sgx: kill the enclave when any of its VMAs are closed
intel_sgx: remove redundant code from sgx_vma_do_fault
intel_sgx: invalidate enclave when the user threads cease to exist
intel_sgx: migrate to radix tree for addressing enclave pages
Sean Christopherson (3):
intel_sgx: fix error resolution in SGX_IOC_ENCLAVE_INIT
intel_sgx: lock the enclave for the entire EPC eviction flow
intel_sgx: add LRU algorithm to page swapping
v5:
* in LRU pin mm_struct before isolating the pages
v6:
* set mmu notifier ops before registering
* clean up sgx_vma_open/close handling
* fixed checkpatch.pl errors in the LRU patch
* properly tested that this patch set does not break suspend
v7:
* Use sgx_invalidate in VMA open/close callbacks so that new HW
threads cannot enter. You cannot just zap PTE entries for a single
VMA.
v8:
* Renamed SGX_ENCL_INVALIDATED simply as SGX_ENCL_DEAD to underline
what it means.
* Harden only vma_close callback.
v9:
* exit_mmap() does not remove entries from mm_rb. Thus it is unsafe
to call find_vma() inside the close callback. The best we can do
is to remove PTEs from the closed VMA.
* fixed radix_tree_insert() call
* iterate through radix tree when deleting entries
drivers/platform/x86/Kconfig | 1 +
drivers/platform/x86/intel_sgx.h | 12 +--
drivers/platform/x86/intel_sgx_ioctl.c | 92 ++++++++++++--------
drivers/platform/x86/intel_sgx_page_cache.c | 127 ++++++++++++++++++----------
drivers/platform/x86/intel_sgx_util.c | 63 +++++---------
drivers/platform/x86/intel_sgx_vma.c | 52 +++++-------
6 files changed, 185 insertions(+), 162 deletions(-)
--
2.9.3
[PATCH RFC] intel_sgx: simplify sgx_write_pages()
by Jarkko Sakkinen
Now that the sgx_ewb flow has sane error recovery we can simplify
sgx_write_pages() significantly by moving the pinning of the backing
page into sgx_ewb(). This was not possible before, as in some
situations pinning could legally fail.
[Marked as RFC because it requires a pending patch set. I implemented
this to show off the benefits of the introduced recovery flow.]
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen(a)linux.intel.com>
---
drivers/platform/x86/intel_sgx_page_cache.c | 63 ++++++++++++-----------------
1 file changed, 26 insertions(+), 37 deletions(-)
diff --git a/drivers/platform/x86/intel_sgx_page_cache.c b/drivers/platform/x86/intel_sgx_page_cache.c
index 36d4d54..f62e5e7 100644
--- a/drivers/platform/x86/intel_sgx_page_cache.c
+++ b/drivers/platform/x86/intel_sgx_page_cache.c
@@ -233,41 +233,52 @@ static void sgx_etrack(struct sgx_epc_page *epc_page)
}
static int __sgx_ewb(struct sgx_encl *encl,
- struct sgx_encl_page *encl_page,
- struct page *backing)
+ struct sgx_encl_page *encl_page)
{
struct sgx_page_info pginfo;
+ struct page *backing;
void *epc;
void *va;
int ret;
- pginfo.srcpge = (unsigned long)kmap_atomic(backing);
+ backing = sgx_get_backing(encl, encl_page);
+ if (IS_ERR(backing)) {
+ ret = PTR_ERR(backing);
+ sgx_warn(encl, "pinning the backing page for EWB failed with %d\n",
+ ret);
+ return ret;
+ }
+
epc = sgx_get_epc_page(encl_page->epc_page);
va = sgx_get_epc_page(encl_page->va_page->epc_page);
+ pginfo.srcpge = (unsigned long)kmap_atomic(backing);
pginfo.pcmd = (unsigned long)&encl_page->pcmd;
pginfo.linaddr = 0;
pginfo.secs = 0;
ret = __ewb(&pginfo, epc,
(void *)((unsigned long)va + encl_page->va_offset));
+ kunmap_atomic((void *)(unsigned long)pginfo.srcpge);
sgx_put_epc_page(va);
sgx_put_epc_page(epc);
- kunmap_atomic((void *)(unsigned long)pginfo.srcpge);
+ sgx_put_backing(backing, true);
return ret;
}
static bool sgx_ewb(struct sgx_encl *encl,
- struct sgx_encl_page *entry,
- struct page *backing)
+ struct sgx_encl_page *entry)
{
- int ret = __sgx_ewb(encl, entry, backing);
+ int ret = __sgx_ewb(encl, entry);
+
+ if (ret < 0)
+ return false;
if (ret == SGX_NOT_TRACKED) {
/* slow path, IPI needed */
smp_call_function(sgx_ipi_cb, NULL, 1);
- ret = __sgx_ewb(encl, entry, backing);
+ ret = __sgx_ewb(encl, entry);
}
if (ret) {
@@ -294,11 +305,8 @@ static void sgx_write_pages(struct sgx_encl *encl, struct list_head *src)
{
struct sgx_encl_page *entry;
struct sgx_encl_page *tmp;
- struct page *pages[SGX_NR_SWAP_CLUSTER_MAX + 1];
struct vm_area_struct *evma;
unsigned int free_flags;
- int cnt = 0;
- int i = 0;
if (list_empty(src))
return;
@@ -316,25 +324,14 @@ static void sgx_write_pages(struct sgx_encl *encl, struct list_head *src)
continue;
}
- pages[cnt] = sgx_get_backing(encl, entry);
- if (IS_ERR(pages[cnt])) {
- list_del(&entry->load_list);
- list_add_tail(&entry->load_list, &encl->load_list);
- entry->flags &= ~SGX_ENCL_PAGE_RESERVED;
- continue;
- }
-
zap_vma_ptes(evma, entry->addr, PAGE_SIZE);
sgx_eblock(entry->epc_page);
- cnt++;
}
/* ETRACK */
sgx_etrack(encl->secs_page.epc_page);
/* EWB */
- i = 0;
-
while (!list_empty(src)) {
entry = list_first_entry(src, struct sgx_encl_page,
load_list);
@@ -344,29 +341,21 @@ static void sgx_write_pages(struct sgx_encl *encl, struct list_head *src)
evma = sgx_find_vma(encl, entry->addr);
if (evma) {
- if (sgx_ewb(encl, entry, pages[i]))
+ if (sgx_ewb(encl, entry))
free_flags = SGX_FREE_SKIP_EREMOVE;
encl->secs_child_cnt--;
}
sgx_free_encl_page(entry, encl, free_flags);
- sgx_put_backing(pages[i++], evma);
}
- /* Allow SECS page eviction only when the encl is initialized. */
- if (!encl->secs_child_cnt &&
- (encl->flags & SGX_ENCL_INITIALIZED)) {
- pages[cnt] = sgx_get_backing(encl, &encl->secs_page);
- if (!IS_ERR(pages[cnt])) {
- free_flags = 0;
- if (sgx_ewb(encl, &encl->secs_page, pages[cnt]))
- free_flags = SGX_FREE_SKIP_EREMOVE;
-
- encl->flags |= SGX_ENCL_SECS_EVICTED;
+ if (!encl->secs_child_cnt && (encl->flags & SGX_ENCL_INITIALIZED)) {
+ free_flags = 0;
+ if (sgx_ewb(encl, &encl->secs_page))
+ free_flags = SGX_FREE_SKIP_EREMOVE;
- sgx_free_encl_page(&encl->secs_page, encl, free_flags);
- sgx_put_backing(pages[cnt], true);
- }
+ encl->flags |= SGX_ENCL_SECS_EVICTED;
+ sgx_free_encl_page(&encl->secs_page, encl, free_flags);
}
mutex_unlock(&encl->lock);
--
2.9.3
[PATCH v9 RESEND] intel_sgx: migrate to radix tree for addressing enclave pages
by Jarkko Sakkinen
A radix tree is faster than an RB tree for address-indexed lookups,
so it makes sense to replace the RB tree with a radix tree.
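The migration replaces pointer-chasing RB-tree walks with a lookup keyed by page index; since enclave pages are page-aligned, `addr >> PAGE_SHIFT` is a natural key. A toy array-backed model of the tree (the kernel code uses radix_tree_insert/radix_tree_lookup with the same key; the capacity here is purely for the model):

```c
#define PAGE_SHIFT  12
#define MODEL_SLOTS 64	/* toy capacity, not a kernel limit */

/* Userspace stand-in for the radix tree: what matters is the key. */
struct page_tree_model { void *slots[MODEL_SLOTS]; };

static int model_insert(struct page_tree_model *t, unsigned long addr,
			void *entry)
{
	unsigned long key = addr >> PAGE_SHIFT;

	if (key >= MODEL_SLOTS || t->slots[key])
		return -1;	/* the real code would see -EEXIST */
	t->slots[key] = entry;
	return 0;
}

static void *model_lookup(struct page_tree_model *t, unsigned long addr)
{
	unsigned long key = addr >> PAGE_SHIFT;

	return key < MODEL_SLOTS ? t->slots[key] : 0;
}
```

Any address within a page maps to the same slot, which is exactly what the fault handler needs.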
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen(a)linux.intel.com>
---
Ugh, a couple of fixes to sgx_vma_do_fault were not squashed :( Sorry.
Here's a fixed version.
drivers/platform/x86/intel_sgx.h | 6 ++---
drivers/platform/x86/intel_sgx_ioctl.c | 40 +++++++------------------------
drivers/platform/x86/intel_sgx_util.c | 44 +++++++---------------------------
drivers/platform/x86/intel_sgx_vma.c | 20 +++++++++-------
4 files changed, 31 insertions(+), 79 deletions(-)
diff --git a/drivers/platform/x86/intel_sgx.h b/drivers/platform/x86/intel_sgx.h
index c8b65fe..ed9e8e6 100644
--- a/drivers/platform/x86/intel_sgx.h
+++ b/drivers/platform/x86/intel_sgx.h
@@ -68,6 +68,7 @@
#include <linux/sched.h>
#include <linux/workqueue.h>
#include <linux/mmu_notifier.h>
+#include <linux/radix-tree.h>
#define SGX_EINIT_SPIN_COUNT 20
#define SGX_EINIT_SLEEP_COUNT 50
@@ -115,7 +116,6 @@ struct sgx_encl_page {
struct sgx_va_page *va_page;
unsigned int va_offset;
struct sgx_pcmd pcmd;
- struct rb_node node;
};
struct sgx_tgid_ctx {
@@ -146,7 +146,7 @@ struct sgx_encl {
unsigned long base;
unsigned long size;
struct list_head va_pages;
- struct rb_root encl_rb;
+ struct radix_tree_root page_tree;
struct list_head add_page_reqs;
struct work_struct add_page_work;
struct sgx_encl_page secs_page;
@@ -211,8 +211,6 @@ void sgx_unpin_mm(struct sgx_encl *encl);
void sgx_invalidate(struct sgx_encl *encl);
int sgx_find_encl(struct mm_struct *mm, unsigned long addr,
struct vm_area_struct **vma);
-struct sgx_encl_page *sgx_encl_find_page(struct sgx_encl *encl,
- unsigned long addr);
void sgx_encl_release(struct kref *ref);
void sgx_tgid_ctx_release(struct kref *ref);
diff --git a/drivers/platform/x86/intel_sgx_ioctl.c b/drivers/platform/x86/intel_sgx_ioctl.c
index 8543373..3a4a8fa 100644
--- a/drivers/platform/x86/intel_sgx_ioctl.c
+++ b/drivers/platform/x86/intel_sgx_ioctl.c
@@ -138,33 +138,6 @@ void sgx_tgid_ctx_release(struct kref *ref)
kfree(pe);
}
-static int encl_rb_insert(struct rb_root *root,
- struct sgx_encl_page *data)
-{
- struct rb_node **new = &root->rb_node;
- struct rb_node *parent = NULL;
-
- /* Figure out where to put new node */
- while (*new) {
- struct sgx_encl_page *this =
- container_of(*new, struct sgx_encl_page, node);
-
- parent = *new;
- if (data->addr < this->addr)
- new = &((*new)->rb_left);
- else if (data->addr > this->addr)
- new = &((*new)->rb_right);
- else
- return -EFAULT;
- }
-
- /* Add new node and rebalance tree. */
- rb_link_node(&data->node, parent, new);
- rb_insert_color(&data->node, root);
-
- return 0;
-}
-
static int sgx_find_and_get_encl(unsigned long addr, struct sgx_encl **encl)
{
struct mm_struct *mm = current->mm;
@@ -538,6 +511,7 @@ static long sgx_ioc_enclave_create(struct file *filep, unsigned int cmd,
kref_init(&encl->refcount);
INIT_LIST_HEAD(&encl->add_page_reqs);
INIT_LIST_HEAD(&encl->va_pages);
+ INIT_RADIX_TREE(&encl->page_tree, GFP_KERNEL);
INIT_LIST_HEAD(&encl->load_list);
INIT_LIST_HEAD(&encl->encl_list);
mutex_init(&encl->lock);
@@ -713,7 +687,7 @@ static int __encl_add_page(struct sgx_encl *encl,
goto out;
}
- if (sgx_encl_find_page(encl, addp->addr)) {
+ if (radix_tree_lookup(&encl->page_tree, addp->addr >> PAGE_SHIFT)) {
ret = -EEXIST;
goto out;
}
@@ -730,6 +704,13 @@ static int __encl_add_page(struct sgx_encl *encl,
goto out;
}
+ ret = radix_tree_insert(&encl->page_tree, encl_page->addr >> PAGE_SHIFT,
+ encl_page);
+ if (ret) {
+ sgx_put_backing(backing, false /* write */);
+ goto out;
+ }
+
user_vaddr = kmap(backing);
tmp_vaddr = kmap(tmp_page);
memcpy(user_vaddr, tmp_vaddr, PAGE_SIZE);
@@ -757,9 +738,6 @@ static int __encl_add_page(struct sgx_encl *encl,
kfree(req);
sgx_free_va_slot(encl_page->va_page,
encl_page->va_offset);
- } else {
- ret = encl_rb_insert(&encl->encl_rb, encl_page);
- WARN_ON(ret);
}
mutex_unlock(&encl->lock);
diff --git a/drivers/platform/x86/intel_sgx_util.c b/drivers/platform/x86/intel_sgx_util.c
index f6f7dde0..2c390c5 100644
--- a/drivers/platform/x86/intel_sgx_util.c
+++ b/drivers/platform/x86/intel_sgx_util.c
@@ -120,13 +120,9 @@ struct vm_area_struct *sgx_find_vma(struct sgx_encl *encl, unsigned long addr)
void sgx_zap_tcs_ptes(struct sgx_encl *encl, struct vm_area_struct *vma)
{
struct sgx_encl_page *entry;
- struct rb_node *rb;
- rb = rb_first(&encl->encl_rb);
- while (rb) {
- entry = container_of(rb, struct sgx_encl_page, node);
- rb = rb_next(rb);
- if (entry->epc_page && (entry->flags & SGX_ENCL_PAGE_TCS) &&
+ list_for_each_entry(entry, &encl->load_list, load_list) {
+ if ((entry->flags & SGX_ENCL_PAGE_TCS) &&
entry->addr >= vma->vm_start &&
entry->addr < vma->vm_end)
zap_vma_ptes(vma, entry->addr, PAGE_SIZE);
@@ -203,55 +199,31 @@ int sgx_find_encl(struct mm_struct *mm, unsigned long addr,
return 0;
}
-struct sgx_encl_page *sgx_encl_find_page(struct sgx_encl *encl,
- unsigned long addr)
-{
- struct rb_node *node = encl->encl_rb.rb_node;
-
- while (node) {
- struct sgx_encl_page *data =
- container_of(node, struct sgx_encl_page, node);
-
- if (data->addr > addr)
- node = node->rb_left;
- else if (data->addr < addr)
- node = node->rb_right;
- else
- return data;
- }
-
- return NULL;
-}
-
void sgx_encl_release(struct kref *ref)
{
- struct rb_node *rb1, *rb2;
struct sgx_encl_page *entry;
struct sgx_va_page *va_page;
- struct sgx_encl *encl =
- container_of(ref, struct sgx_encl, refcount);
+ struct sgx_encl *encl = container_of(ref, struct sgx_encl, refcount);
+ struct radix_tree_iter iter;
+ void **slot;
mutex_lock(&sgx_tgid_ctx_mutex);
if (!list_empty(&encl->encl_list))
list_del(&encl->encl_list);
-
mutex_unlock(&sgx_tgid_ctx_mutex);
if (encl->mmu_notifier.ops)
mmu_notifier_unregister_no_release(&encl->mmu_notifier,
encl->mm);
- rb1 = rb_first(&encl->encl_rb);
- while (rb1) {
- entry = container_of(rb1, struct sgx_encl_page, node);
- rb2 = rb_next(rb1);
- rb_erase(rb1, &encl->encl_rb);
+ radix_tree_for_each_slot(slot, &encl->page_tree, &iter, 0) {
+ entry = *slot;
if (entry->epc_page) {
list_del(&entry->load_list);
sgx_free_page(entry->epc_page, encl, 0);
}
+ radix_tree_delete(&encl->page_tree, entry->addr >> PAGE_SHIFT);
kfree(entry);
- rb1 = rb2;
}
while (!list_empty(&encl->va_pages)) {
diff --git a/drivers/platform/x86/intel_sgx_vma.c b/drivers/platform/x86/intel_sgx_vma.c
index d588932..1ff55c1 100644
--- a/drivers/platform/x86/intel_sgx_vma.c
+++ b/drivers/platform/x86/intel_sgx_vma.c
@@ -169,16 +169,20 @@ static struct sgx_encl_page *sgx_vma_do_fault(struct vm_area_struct *vma,
if (!encl)
return ERR_PTR(-EFAULT);
- entry = sgx_encl_find_page(encl, addr);
- if (!entry)
- return ERR_PTR(-EFAULT);
+ mutex_lock(&encl->lock);
- epc_page = sgx_alloc_page(encl->tgid_ctx, SGX_ALLOC_ATOMIC);
- if (IS_ERR(epc_page))
- /* reinterpret the type as we return an error */
- return (struct sgx_encl_page *)epc_page;
+ entry = radix_tree_lookup(&encl->page_tree, addr >> PAGE_SHIFT);
+ if (!entry) {
+ entry = ERR_PTR(-EFAULT);
+ goto out;
+ }
- mutex_lock(&encl->lock);
+ epc_page = sgx_alloc_page(encl->tgid_ctx, SGX_ALLOC_ATOMIC);
+ if (IS_ERR(epc_page)) {
+ entry = (struct sgx_encl_page *)epc_page;
+ epc_page = NULL;
+ goto out;
+ }
if (encl->flags & SGX_ENCL_DEAD) {
entry = ERR_PTR(-EFAULT);
--
2.9.3
[PATCH] intel_sgx: fix SECS page eviction
by Jarkko Sakkinen
The SECS page was evicted to the first virtual address of ELRANGE. If
there was an EPC page there, evicting the SECS page would overwrite its
swapped copy.
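The one-line fix places the SECS page's backing slot at `base + size`, one page past ELRANGE, so it can never alias a regular page's slot. A quick sanity check of the index math — note the patch does not show sgx_get_backing, so the `(addr - base) >> PAGE_SHIFT` indexing below is an assumption about that helper:

```c
#define PAGE_SHIFT 12

/* Assumed backing-file index for an enclave page: its page offset
 * within ELRANGE. */
static unsigned long backing_index(unsigned long addr, unsigned long base)
{
	return (addr - base) >> PAGE_SHIFT;
}

/* With the fix, the SECS slot is one past the last regular page;
 * before it, the SECS was written to ELRANGE's first address and so
 * shared index 0 with the first enclave page. */
static unsigned long secs_index(unsigned long base, unsigned long size)
{
	return backing_index(base + size, base);
}
```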
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen(a)linux.intel.com>
---
drivers/platform/x86/intel_sgx_ioctl.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/platform/x86/intel_sgx_ioctl.c b/drivers/platform/x86/intel_sgx_ioctl.c
index 3a4a8fa..186d789 100644
--- a/drivers/platform/x86/intel_sgx_ioctl.c
+++ b/drivers/platform/x86/intel_sgx_ioctl.c
@@ -556,6 +556,7 @@ static long sgx_ioc_enclave_create(struct file *filep, unsigned int cmd,
goto out;
}
+ encl->secs_page.addr = encl->base + encl->size;
encl->secs_page.epc_page = secs_epc;
createp->src = (unsigned long)encl->base;
--
2.9.3
[PATCH v8 00/10] Fixes and performance improvements
by Jarkko Sakkinen
v5:
* in LRU pin mm_struct before isolating the pages
v6:
* set mmu notifier ops before registering
* clean up sgx_vma_open/close handling
* fixed checkpatch.pl errors in the LRU patch
* properly tested that this patch set does not break suspend
v7:
* Use sgx_invalidate in VMA open/close callbacks so that new HW
threads cannot enter. You cannot just zap PTE entries for a single
VMA.
v8:
* Renamed SGX_ENCL_INVALIDATED simply as SGX_ENCL_DEAD to underline
what it means.
* Harden only vma_close callback.
Jarkko Sakkinen (7):
intel_sgx: fallback more gracefully from EWB failure
intel_sgx: fix deadlock in sgx_ioc_enclave_create()
intel_sgx: fix null pointer deref in sgx_invalidate()
intel_sgx: kill the enclave when any of its VMAs are closed
intel_sgx: remove redundant code from sgx_vma_do_fault
intel_sgx: invalidate enclave when the user threads cease to exist
intel_sgx: migrate to radix tree for addressing enclave pages
Sean Christopherson (3):
intel_sgx: fix error resolution in SGX_IOC_ENCLAVE_INIT
intel_sgx: lock the enclave for the entire EPC eviction flow
intel_sgx: add LRU algorithm to page swapping
drivers/platform/x86/Kconfig | 1 +
drivers/platform/x86/intel_sgx.h | 18 ++--
drivers/platform/x86/intel_sgx_ioctl.c | 94 ++++++++++++--------
drivers/platform/x86/intel_sgx_page_cache.c | 127 ++++++++++++++++++----------
drivers/platform/x86/intel_sgx_util.c | 77 ++++++-----------
drivers/platform/x86/intel_sgx_vma.c | 50 +++++------
6 files changed, 196 insertions(+), 171 deletions(-)
--
2.9.3