On 2017-01-05 08:21, Christopherson, Sean J wrote:
Are there any active threads in the enclave when munmap is called?
Does the slowdown occur if munmap is called immediately after
creating the enclave, i.e. before executing EENTER on any thread?
I assume more than just the enclave layout is causing the slowdown.
I wrote a simple test case that sets up only empty RW pages and then
fails EINIT. Even then, munmap is slow. I can make this test case
publicly available, but see below.
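For comparison, a baseline for ordinary munmap cost can be measured without SGX at all. The sketch below is not the test case mentioned above: it assumes a plain anonymous mapping as a stand-in for the enclave range, and the function name `time_unmap` is hypothetical.

```python
import mmap
import time


def time_unmap(num_pages, page_size=mmap.PAGESIZE):
    """Map num_pages anonymous RW pages, touch them, and time the unmap.

    Assumption: an anonymous mapping stands in for the enclave range;
    no SGX driver is involved here.
    """
    size = num_pages * page_size
    region = mmap.mmap(-1, size)  # anonymous read-write mapping
    # Touch every page so the kernel actually populates it.
    for offset in range(0, size, page_size):
        region[offset] = 0
    start = time.perf_counter()
    region.close()  # releases the mapping (munmap on Linux)
    return time.perf_counter() - start


if __name__ == "__main__":
    for pages in (256, 4096, 65536):
        print(f"{pages:6d} pages: {time_unmap(pages) * 1e3:.3f} ms")
```

If unmapping the enclave range takes orders of magnitude longer than these numbers for the same page count, that would point at the driver rather than at generic mm overhead.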
Can you try to reproduce the issue using the most recent version of
the driver from
http://git.infradead.org/users/jjs/linux-isgx.git?
I think this particular issue may have been fixed. However, this driver
is very unstable with my actual workload:
[ 1885.331180] INFO: rcu_sched self-detected stall on CPU
[ 1885.331189] 3-...: (5250 ticks this GP) idle=bfd/140000000000001/0 softirq=33251/33251 fqs=2555
[ 1885.331192] (t=5250 jiffies g=25305 c=25304 q=3743)
[ 1885.331196] Task dump for CPU 3:
[ 1885.331198] enclave-runner R running task 0 3199 1 0x0000000c
[ 1885.331203] ffffffff9e861480 000000002b7723e0 ffffffff9dd883b4 ffff8bf6a14d9c80
[ 1885.331207] ffffffff9e861480 0000000000000000 ffffffff9e97b8c0 00000000ffffffff
[ 1885.331211] ffffffff9dce8950 0000000089ac6218 ffff8bf5ad53d700 0000000000000e9f
[ 1885.331214] Call Trace:
[ 1885.331216] <IRQ> [<ffffffff9dd883b4>] ? rcu_dump_cpu_stacks+0x91/0xaa
[ 1885.331227] [<ffffffff9dce8950>] ? rcu_check_callbacks+0x7f0/0x940
[ 1885.331231] [<ffffffff9dcffae0>] ? tick_sched_handle.isra.13+0x50/0x50
[ 1885.331234] [<ffffffff9dcef9c8>] ? update_process_times+0x28/0x50
[ 1885.331237] [<ffffffff9dcffab0>] ? tick_sched_handle.isra.13+0x20/0x50
[ 1885.331239] [<ffffffff9dcffb18>] ? tick_sched_timer+0x38/0x70
[ 1885.331243] [<ffffffff9dcf033a>] ? __hrtimer_run_queues+0xea/0x330
[ 1885.331246] [<ffffffff9dcf7abd>] ? ktime_get_update_offsets_now+0x4d/0x100
[ 1885.331250] [<ffffffff9dcf0d1c>] ? hrtimer_interrupt+0x9c/0x1a0
[ 1885.331254] [<ffffffff9e342ae9>] ? smp_apic_timer_interrupt+0x39/0x50
[ 1885.331257] [<ffffffff9e341e02>] ? apic_timer_interrupt+0x82/0x90
[ 1885.331258] <EOI> [<ffffffffc0c620a1>] ? sgx_zap_tcs_ptes+0x21/0x60 [intel_sgx]
[ 1885.331270] [<ffffffffc0c621b7>] ? sgx_invalidate+0x27/0x50 [intel_sgx]
[ 1885.331274] [<ffffffffc0c62a86>] ? sgx_vma_open+0x36/0xb0 [intel_sgx]
[ 1885.331278] [<ffffffff9dc7db38>] ? copy_process.part.31+0xce8/0x1b30
[ 1885.331282] [<ffffffff9dc7eb5d>] ? _do_fork+0xed/0x3d0
[ 1885.331285] [<ffffffff9dc03b99>] ? do_syscall_64+0x59/0xb0
[ 1885.331288] [<ffffffff9e3402a5>] ? entry_SYSCALL64_slow_path+0x25/0x25
[ 1916.297912] INFO: rcu_sched self-detected stall on CPU
[ 1916.297925] 0-...: (5250 ticks this GP) idle=bd9/140000000000001/0 softirq=40635/40635 fqs=2538
[ 1916.297928] (t=5251 jiffies g=25310 c=25309 q=377)
[ 1916.297932] Task dump for CPU 0:
[ 1916.297934] enclave-runner R running task 0 3199 1 0x0000000c
[ 1916.297939] ffffffff9e861480 000000002b7723e0 ffffffff9dd883b4 ffff8bf6a1419c80
[ 1916.297943] ffffffff9e861480 0000000000000000 ffffffff9e97b8c0 00000000ffffffff
[ 1916.297947] ffffffff9dce8950 0000000089ac69d8 ffff8bf5ad53d700 0000000000000179
[ 1916.297951] Call Trace:
[ 1916.297952] <IRQ> [<ffffffff9dd883b4>] ? rcu_dump_cpu_stacks+0x91/0xaa
[ 1916.297964] [<ffffffff9dce8950>] ? rcu_check_callbacks+0x7f0/0x940
[ 1916.297968] [<ffffffff9dcffae0>] ? tick_sched_handle.isra.13+0x50/0x50
[ 1916.297972] [<ffffffff9dcef9c8>] ? update_process_times+0x28/0x50
[ 1916.297975] [<ffffffff9dcffab0>] ? tick_sched_handle.isra.13+0x20/0x50
[ 1916.297977] [<ffffffff9dcffb18>] ? tick_sched_timer+0x38/0x70
[ 1916.297981] [<ffffffff9dcf033a>] ? __hrtimer_run_queues+0xea/0x330
[ 1916.297988] [<ffffffff9dcf7abd>] ? ktime_get_update_offsets_now+0x4d/0x100
[ 1916.297992] [<ffffffff9dcf0d1c>] ? hrtimer_interrupt+0x9c/0x1a0
[ 1916.297996] [<ffffffff9e342ae9>] ? smp_apic_timer_interrupt+0x39/0x50
[ 1916.297998] [<ffffffff9e341e02>] ? apic_timer_interrupt+0x82/0x90
[ 1916.297999] <EOI> [<ffffffffc0c620a1>] ? sgx_zap_tcs_ptes+0x21/0x60 [intel_sgx]
[ 1916.298012] [<ffffffffc0c621b7>] ? sgx_invalidate+0x27/0x50 [intel_sgx]
[ 1916.298015] [<ffffffffc0c62a86>] ? sgx_vma_open+0x36/0xb0 [intel_sgx]
[ 1916.298019] [<ffffffff9dc7db38>] ? copy_process.part.31+0xce8/0x1b30
[ 1916.298023] [<ffffffff9dc7eb5d>] ? _do_fork+0xed/0x3d0
[ 1916.298027] [<ffffffff9dc03b99>] ? do_syscall_64+0x59/0xb0
[ 1916.298029] [<ffffffff9e3402a5>] ? entry_SYSCALL64_slow_path+0x25/0x25
Within the next few minutes, this is followed by further messages from
iwlwifi and i915, and then a system hang. Unfortunately, I have not been
able to produce a simple test case that reproduces this.
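To compare stall reports like the two above, the interrupted code path can be extracted mechanically. The following is a minimal sketch, assuming the dmesg format shown in this message; the names `summarize_stalls` and `SAMPLE` are hypothetical, and the sample is abbreviated from the first trace.

```python
import re

# Patterns assume the dmesg layout quoted in this message.
STALL_RE = re.compile(
    r"\[\s*(\d+\.\d+)\] INFO: rcu_sched self-detected stall on CPU")
CPU_RE = re.compile(r"Task dump for CPU (\d+):")
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] \? (\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+")

SAMPLE = """\
[ 1885.331180] INFO: rcu_sched self-detected stall on CPU
[ 1885.331196] Task dump for CPU 3:
[ 1885.331216] <IRQ> [<ffffffff9dd883b4>] ? rcu_dump_cpu_stacks+0x91/0xaa
[ 1885.331258] <EOI> [<ffffffffc0c620a1>] ? sgx_zap_tcs_ptes+0x21/0x60
[ 1885.331270] [<ffffffffc0c621b7>] ? sgx_invalidate+0x27/0x50 [intel_sgx]
"""


def summarize_stalls(log_text):
    """Return one dict per stall report: timestamp, CPU, and the first
    frame after <EOI>, i.e. the function the timer interrupt landed in."""
    stalls = []
    current = None
    for line in log_text.splitlines():
        match = STALL_RE.search(line)
        if match:
            current = {"time": float(match.group(1)),
                       "cpu": None,
                       "interrupted": None}
            stalls.append(current)
            continue
        if current is None:
            continue  # ignore anything before the first stall header
        match = CPU_RE.search(line)
        if match:
            current["cpu"] = int(match.group(1))
            continue
        if "<EOI>" in line and current["interrupted"] is None:
            # Only look at the part of the line after the <EOI> marker.
            match = FRAME_RE.search(line.split("<EOI>", 1)[1])
            if match:
                current["interrupted"] = match.group(1)
    return stalls


if __name__ == "__main__":
    for stall in summarize_stalls(SAMPLE):
        print(stall)
```

Run over the full log, both reports resolve to the same interrupted function, sgx_zap_tcs_ptes, reached from sgx_vma_open during fork.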
Is there any way you can provide the files needed to reproduce this
issue? Reproducing this issue from scratch is likely going to be
quite difficult.
As I said, I can make the test case for the slow munmap with
sgx_driver_1.7 available. However, that may no longer be necessary. See
my off-list message regarding the test case for the stalls with the
latest driver.
Jethro Beekman | Fortanix