Re: [intel-sgx-kernel-dev] [PATCH 08/10] kvm: vmx: add guest's IA32_SGXLEPUBKEYHASHn runtime switch support
by Huang, Kai
Hi Paolo/Radim,
I'd like to start a discussion about IA32_SGXLEPUBKEYHASHn handling
here. I have also copied the SGX driver mailing list (which I should
have done when sending out this series, sorry) and Sean, Haim and Haitao
from Intel so that we can have a better discussion.
Basically IA32_SGXLEPUBKEYHASHn (or more generally speaking, SGX Launch
Control) allows us to run different Launch Enclaves (LEs) signed with
different RSA keys. An LE can be initialized successfully (specifically,
by EINIT) only when the value of IA32_SGXLEPUBKEYHASHn matches the hash
of the key used to sign it. So before calling EINIT for an LE, we have
to make sure IA32_SGXLEPUBKEYHASHn contains the matching value. Note
that only EINIT uses IA32_SGXLEPUBKEYHASHn; after EINIT, other
ENCLS/ENCLU leaves (e.g. EGETKEY) run correctly even if the MSRs are
changed to other values.
To let KVM guests run their own LEs, KVM traps guest writes to the
IA32_SGXLEPUBKEYHASHn MSRs and caches the values in the vcpu, and KVM
needs to write the cached values to the real MSRs before the guest runs
EINIT. The problem is that the host also runs an LE, probably multiple
LEs (it seems the SGX driver currently plans to run a single in-kernel
LE, but I am not familiar with the details, and IMO we should not assume
the host will only run one LE). Therefore, if KVM changes the physical
MSRs for a guest, the host may not be able to run its LE, as it may not
write the right values back to the MSRs. There are two approaches to
make the host and KVM guests work together:
1. Anyone who wants to run an LE is responsible for writing the correct
value to IA32_SGXLEPUBKEYHASHn.
My current patch is based on this assumption. For a KVM guest,
naturally, we write the cached value to the real MSRs when the vcpu is
scheduled in. For the host, the SGX driver should write its own value to
the MSRs when it performs EINIT for an LE.
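For illustration only (this is not part of the patch): a minimal
user-space sketch of approach 1. The array phys_lepubkeyhash stands in
for the physical MSRs, and struct le_user, le_user_load_hash(),
sim_einit() and le_user_einit() are hypothetical names made up for this
sketch. The hash comparison models only the "MSRs must match" part of
EINIT, nothing else.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_LEPUBKEYHASH 4

/* Stand-in for the physical IA32_SGXLEPUBKEYHASH0..3 MSRs (simulation). */
static uint64_t phys_lepubkeyhash[NR_LEPUBKEYHASH];

/* Hypothetical LE owner: the host SGX driver or one guest vcpu. */
struct le_user {
	uint64_t hash[NR_LEPUBKEYHASH];	/* hash of the key that signed its LE */
};

/* Approach 1: each user writes its own hash; nobody restores anything. */
static void le_user_load_hash(const struct le_user *u)
{
	memcpy(phys_lepubkeyhash, u->hash, sizeof(phys_lepubkeyhash));
}

/* Models only the EINIT precondition: the MSRs must match the LE's hash. */
static bool sim_einit(const struct le_user *u)
{
	return memcmp(phys_lepubkeyhash, u->hash,
		      sizeof(phys_lepubkeyhash)) == 0;
}

/* The rule every user follows under approach 1: load, then EINIT. */
static bool le_user_einit(const struct le_user *u)
{
	le_user_load_hash(u);
	return sim_einit(u);
}
```

In this model, a host that calls sim_einit() directly after a vcpu has
been scheduled in fails, while one that goes through le_user_einit()
succeeds, which is exactly the obligation approach 1 puts on the host.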
One argument against this approach is that a KVM guest should never have
an impact on the host side, meaning the host should not need to be aware
of such MSR changes. Otherwise, if the host does some performance
optimization and does not actively update the MSRs, the physical MSRs
may contain an incorrect value when the host runs EINIT. Instead, KVM
should be responsible for restoring the original MSRs, which brings us
to approach 2 below.
2. KVM should restore the MSRs after changing them for the guest.
To do this, the simplest way for KVM is to: 1) save the original
physical MSRs and load the guest's values before VMENTRY; 2) on VMEXIT,
write the original values back to the physical MSRs.
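Again for illustration only: a user-space sketch of the save/restore
pattern in approach 2. phys_lepubkeyhash stands in for the physical
MSRs, and struct sim_vcpu, sim_vmentry() and sim_vmexit() are
hypothetical names for this sketch, not KVM functions.

```c
#include <stdint.h>
#include <string.h>

#define NR_LEPUBKEYHASH 4

/* Stand-in for the physical IA32_SGXLEPUBKEYHASH0..3 MSRs (simulation). */
static uint64_t phys_lepubkeyhash[NR_LEPUBKEYHASH];

struct sim_vcpu {
	uint64_t guest_hash[NR_LEPUBKEYHASH];	/* value cached from guest writes */
	uint64_t host_hash[NR_LEPUBKEYHASH];	/* host value saved at VMENTRY */
};

/* Step 1: save the host's values, then load the guest's values. */
static void sim_vmentry(struct sim_vcpu *v)
{
	memcpy(v->host_hash, phys_lepubkeyhash, sizeof(v->host_hash));
	memcpy(phys_lepubkeyhash, v->guest_hash, sizeof(phys_lepubkeyhash));
}

/* Step 2: write the saved host values back. */
static void sim_vmexit(struct sim_vcpu *v)
{
	memcpy(phys_lepubkeyhash, v->host_hash, sizeof(phys_lepubkeyhash));
}
```

Each entry/exit pair here costs four reads and eight writes of the
simulated MSRs, which is the overhead the comparison with approach 1
refers to.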
To me this approach is also arguable, as a KVM guest is actually just a
normal process (OK, maybe not that normal) and should be treated the
same as other processes which run LEs, which means approach 1 is also
reasonable.
Approach 2 will also have more performance impact than approach 1 for
KVM, as it reads/writes IA32_SGXLEPUBKEYHASHn on each VMEXIT/VMENTRY,
while approach 1 only writes the MSRs when the vcpu is scheduled in,
which is less frequent.
I'd like to hear all your comments and hopefully we can have some
agreement on this.
Another thing, not really related to which approach we select: whether
we choose approach 1 or approach 2, KVM still suffers the performance
cost of writing (and/or reading) the IA32_SGXLEPUBKEYHASHn MSRs, either
when the vcpu is scheduled in or on each VMEXIT/VMENTRY. Given that
IA32_SGXLEPUBKEYHASHn is only used by EINIT, we can actually optimize
this by trapping EINIT from the guest and only updating the MSRs in the
EINIT VMEXIT. This works for approach 1, but for approach 2 we have to
do something tricky during VMEXIT/VMENTRY to check whether the MSRs have
been changed by an EINIT VMEXIT, and only restore the original values if
an EINIT VMEXIT has happened. The guest's LE continues to run even after
the physical MSRs are changed back to the original values.
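One way to picture the "tricky thing" for approach 2 is a dirty flag
that is set in the EINIT exit handler and checked on the following exit.
This is only a user-space sketch under that assumption; struct sim_vcpu,
hash_dirty, sim_handle_einit_exit() and sim_vmexit() are hypothetical
names, and msr_write_batches just counts simulated four-MSR updates.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_LEPUBKEYHASH 4

/* Stand-in for the physical IA32_SGXLEPUBKEYHASH0..3 MSRs (simulation). */
static uint64_t phys_lepubkeyhash[NR_LEPUBKEYHASH];
static unsigned int msr_write_batches;	/* simulated 4-MSR updates */

struct sim_vcpu {
	uint64_t guest_hash[NR_LEPUBKEYHASH];
	uint64_t host_hash[NR_LEPUBKEYHASH];
	bool hash_dirty;	/* physical MSRs currently hold guest values */
};

static void sim_write_hash(const uint64_t *hash)
{
	memcpy(phys_lepubkeyhash, hash, sizeof(phys_lepubkeyhash));
	msr_write_batches++;
}

/* EINIT exit: save host values once, load guest values, mark dirty. */
static void sim_handle_einit_exit(struct sim_vcpu *v)
{
	if (!v->hash_dirty) {
		memcpy(v->host_hash, phys_lepubkeyhash, sizeof(v->host_hash));
		v->hash_dirty = true;
	}
	sim_write_hash(v->guest_hash);
}

/* Any later exit (after the guest has run EINIT): restore only if dirty. */
static void sim_vmexit(struct sim_vcpu *v)
{
	if (!v->hash_dirty)
		return;
	sim_write_hash(v->host_hash);
	v->hash_dirty = false;
}
```

Exits that are not preceded by an EINIT exit touch no MSRs in this
model, so the cost is paid only around EINIT.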
But trapping ENCLS requires either 1) KVM to run ENCLS on behalf of the
guest, in which case we have to reconstruct and remap the guest's ENCLS
parameters and skip the ENCLS for the guest; or 2) using MTF to let the
guest run ENCLS again, while still trapping ENCLS. Either case would
introduce more complicated code and potentially be more buggy, and I
don't think we should do this just to save some time writing MSRs. If we
need to turn on ENCLS VMEXIT anyway, we can do this optimization.
Thank you in advance.
Thanks,
-Kai
On 5/8/2017 5:24 PM, Kai Huang wrote:
> If SGX runtime launch control is enabled on host (IA32_FEATURE_CONTROL[17]
> is set), KVM can support running multiple guests, each running an LE signed
> with a different RSA pubkey. KVM traps IA32_SGXLEPUBKEYHASHn MSR writes from
> the guest and keeps the values in the vcpu, and when the vcpu is scheduled in,
> KVM writes those values to the real IA32_SGXLEPUBKEYHASHn MSRs.
>
> Signed-off-by: Kai Huang <kai.huang(a)linux.intel.com>
> ---
> arch/x86/include/asm/msr-index.h | 5 ++
> arch/x86/kvm/vmx.c | 123 +++++++++++++++++++++++++++++++++++++++
> 2 files changed, 128 insertions(+)
>
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index e3770f570bb9..70482b951b0f 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -417,6 +417,11 @@
> #define MSR_IA32_TSC_ADJUST 0x0000003b
> #define MSR_IA32_BNDCFGS 0x00000d90
>
> +#define MSR_IA32_SGXLEPUBKEYHASH0 0x0000008c
> +#define MSR_IA32_SGXLEPUBKEYHASH1 0x0000008d
> +#define MSR_IA32_SGXLEPUBKEYHASH2 0x0000008e
> +#define MSR_IA32_SGXLEPUBKEYHASH3 0x0000008f
> +
> #define MSR_IA32_XSS 0x00000da0
>
> #define FEATURE_CONTROL_LOCKED (1<<0)
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index a16539594a99..c96332b9dd44 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -656,6 +656,9 @@ struct vcpu_vmx {
> */
> u64 msr_ia32_feature_control;
> u64 msr_ia32_feature_control_valid_bits;
> +
> + /* SGX Launch Control public key hash */
> + u64 msr_ia32_sgxlepubkeyhash[4];
> };
>
> enum segment_cache_field {
> @@ -2244,6 +2247,70 @@ static void decache_tsc_multiplier(struct vcpu_vmx *vmx)
> vmcs_write64(TSC_MULTIPLIER, vmx->current_tsc_ratio);
> }
>
> +static bool cpu_sgx_lepubkeyhash_writable(void)
> +{
> + u64 val, sgx_lc_enabled_mask = (FEATURE_CONTROL_LOCKED |
> + FEATURE_CONTROL_SGX_LAUNCH_CONTROL_ENABLE);
> +
> + rdmsrl(MSR_IA32_FEATURE_CONTROL, val);
> +
> + return ((val & sgx_lc_enabled_mask) == sgx_lc_enabled_mask);
> +}
> +
> +static bool vmx_sgx_lc_disabled_in_bios(struct kvm_vcpu *vcpu)
> +{
> + return (to_vmx(vcpu)->msr_ia32_feature_control & FEATURE_CONTROL_LOCKED)
> + && (!(to_vmx(vcpu)->msr_ia32_feature_control &
> + FEATURE_CONTROL_SGX_LAUNCH_CONTROL_ENABLE));
> +}
> +
> +#define SGX_INTEL_DEFAULT_LEPUBKEYHASH0 0xa6053e051270b7ac
> +#define SGX_INTEL_DEFAULT_LEPUBKEYHASH1 0x6cfbe8ba8b3b413d
> +#define SGX_INTEL_DEFAULT_LEPUBKEYHASH2 0xc4916d99f2b3735d
> +#define SGX_INTEL_DEFAULT_LEPUBKEYHASH3 0xd4f8c05909f9bb3b
> +
> +static void vmx_sgx_init_lepubkeyhash(struct kvm_vcpu *vcpu)
> +{
> + u64 h0, h1, h2, h3;
> +
> + /*
> + * If runtime launch control is enabled (IA32_SGXLEPUBKEYHASHn is
> + * writable), we set guest's default value to be Intel's default
> + * hash (which is fixed value and can be hard-coded). Otherwise,
> + * guest can only use machine's IA32_SGXLEPUBKEYHASHn so set guest's
> + * default to that.
> + */
> + if (cpu_sgx_lepubkeyhash_writable()) {
> + h0 = SGX_INTEL_DEFAULT_LEPUBKEYHASH0;
> + h1 = SGX_INTEL_DEFAULT_LEPUBKEYHASH1;
> + h2 = SGX_INTEL_DEFAULT_LEPUBKEYHASH2;
> + h3 = SGX_INTEL_DEFAULT_LEPUBKEYHASH3;
> + }
> + else {
> + rdmsrl(MSR_IA32_SGXLEPUBKEYHASH0, h0);
> + rdmsrl(MSR_IA32_SGXLEPUBKEYHASH1, h1);
> + rdmsrl(MSR_IA32_SGXLEPUBKEYHASH2, h2);
> + rdmsrl(MSR_IA32_SGXLEPUBKEYHASH3, h3);
> + }
> +
> + to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash[0] = h0;
> + to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash[1] = h1;
> + to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash[2] = h2;
> + to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash[3] = h3;
> +}
> +
> +static void vmx_sgx_lepubkeyhash_load(struct kvm_vcpu *vcpu)
> +{
> + wrmsrl(MSR_IA32_SGXLEPUBKEYHASH0,
> + to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash[0]);
> + wrmsrl(MSR_IA32_SGXLEPUBKEYHASH1,
> + to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash[1]);
> + wrmsrl(MSR_IA32_SGXLEPUBKEYHASH2,
> + to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash[2]);
> + wrmsrl(MSR_IA32_SGXLEPUBKEYHASH3,
> + to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash[3]);
> +}
> +
> /*
> * Switches to specified vcpu, until a matching vcpu_put(), but assumes
> * vcpu mutex is already taken.
> @@ -2316,6 +2383,14 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>
> vmx_vcpu_pi_load(vcpu, cpu);
> vmx->host_pkru = read_pkru();
> +
> + /*
> + * Load guest's SGX LE pubkey hash if runtime launch control is
> + * enabled.
> + */
> + if (guest_cpuid_has_sgx_launch_control(vcpu) &&
> + cpu_sgx_lepubkeyhash_writable())
> + vmx_sgx_lepubkeyhash_load(vcpu);
> }
>
> static void vmx_vcpu_pi_put(struct kvm_vcpu *vcpu)
> @@ -3225,6 +3300,19 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> case MSR_IA32_FEATURE_CONTROL:
> msr_info->data = to_vmx(vcpu)->msr_ia32_feature_control;
> break;
> + case MSR_IA32_SGXLEPUBKEYHASH0 ... MSR_IA32_SGXLEPUBKEYHASH3:
> + /*
> + * SDM 35.1 Model-Specific Registers, table 35-2.
> + * Read permitted if CPUID.0x12.0:EAX[0] = 1. (We have
> + * guaranteed this will be true if guest_cpuid_has_sgx
> + * is true.)
> + */
> + if (!guest_cpuid_has_sgx(vcpu))
> + return 1;
> + msr_info->data =
> + to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash[msr_info->index -
> + MSR_IA32_SGXLEPUBKEYHASH0];
> + break;
> case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
> if (!nested_vmx_allowed(vcpu))
> return 1;
> @@ -3344,6 +3432,37 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> * SGX has been enabled in BIOS before using SGX.
> */
> break;
> + case MSR_IA32_SGXLEPUBKEYHASH0 ... MSR_IA32_SGXLEPUBKEYHASH3:
> + /*
> + * SDM 35.1 Model-Specific Registers, table 35-2.
> + * - If CPUID.0x7.0:ECX[30] = 1, FEATURE_CONTROL[17] is
> + * available.
> + * - Write permitted if CPUID.0x12.0:EAX[0] = 1 &&
> + * FEATURE_CONTROL[17] = 1 && FEATURE_CONTROL[0] = 1.
> + */
> + if (!guest_cpuid_has_sgx(vcpu) ||
> + !guest_cpuid_has_sgx_launch_control(vcpu))
> + return 1;
> + /*
> + * Don't let userspace set guest's IA32_SGXLEPUBKEYHASHn,
> + * if machine's IA32_SGXLEPUBKEYHASHn cannot be changed at
> + * runtime. Note to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash are
> + * set to default in vmx_create_vcpu therefore guest is able
> + * to get the machine's IA32_SGXLEPUBKEYHASHn by rdmsr in
> + * guest.
> + */
> + if (!cpu_sgx_lepubkeyhash_writable())
> + return 1;
> + /*
> + * If guest's FEATURE_CONTROL[17] is not set, guest's
> + * IA32_SGXLEPUBKEYHASHn are not writeable from guest.
> + */
> + if (vmx_sgx_lc_disabled_in_bios(vcpu) &&
> + !msr_info->host_initiated)
> + return 1;
> + to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash[msr_index -
> + MSR_IA32_SGXLEPUBKEYHASH0] = data;
> + break;
> case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
> if (!msr_info->host_initiated)
> return 1; /* they are read-only */
> @@ -9305,6 +9424,10 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
> vmx->nested.vpid02 = allocate_vpid();
> }
>
> + /* Set vcpu's default IA32_SGXLEPUBKEYHASHn */
> + if (enable_sgx && boot_cpu_has(X86_FEATURE_SGX_LAUNCH_CONTROL))
> + vmx_sgx_init_lepubkeyhash(&vmx->vcpu);
> +
> vmx->nested.posted_intr_nv = -1;
> vmx->nested.current_vmptr = -1ull;
> vmx->nested.current_vmcs12 = NULL;
>
Debugging interface
by Jethro Beekman
In order to provide the ability to debug enclaves independently of their
programming model, while supporting live attach (meaning the debugger
might not have seen enclave creation), a debugger needs the following
capabilities for a debugged process:
1) Identify enclaves
a) Given a page, identify it as an enclave page
b) Given an enclave page, find that enclave's base address
c) Given a particular enclave, find the TCSes
d) Given a particular enclave, get the attributes/miscselect
2) Read/write enclave memory
3) Find debugging symbols for enclaves
I think (1)(a) and (1)(b) can be done using /proc/[pid]/maps, but I'd
like confirmation. I think the following is true: If a page is in a
memory range that is listed as mapped from the sgx device, it is an
enclave page. From a memory range that is listed as mapped from the sgx
device, you can compute the enclave base address as (start-offset). (2)
is supported already by the driver using the ptrace interface.
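To make the (start-offset) rule concrete, here is a small sketch that
applies it to one line of /proc/[pid]/maps. It assumes the rule above
holds, which is exactly what I'd like confirmed. The function name
enclave_base_from_maps_line() is made up for this example, and the
device node path is passed in by the caller rather than hard-coded.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one /proc/[pid]/maps line of the form
 *   start-end perms offset dev inode pathname
 * If the mapping is backed by dev_path, store (start - offset) in *base
 * and return true. Anonymous mappings and other files return false.
 */
static bool enclave_base_from_maps_line(const char *line,
					const char *dev_path,
					uint64_t *base)
{
	uint64_t start, end, offset;
	char perms[5], path[256];

	path[0] = '\0';
	/* dev and inode are skipped; pathname is optional on the line. */
	if (sscanf(line,
		   "%" SCNx64 "-%" SCNx64 " %4s %" SCNx64 " %*s %*s %255s",
		   &start, &end, perms, &offset, path) < 4)
		return false;
	if (strcmp(path, dev_path) != 0)
		return false;

	*base = start - offset;
	return true;
}
```

A debugger would run this over every line of the maps file and group
ranges by the resulting base address to enumerate enclaves.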
(1)(c) is necessary to find the SSAs and (1)(d) is necessary to
determine SSAFRAMESIZE and make sense of the data in the SSAs. What
would be the best way to expose this information? Best I can think of
currently is a dedicated file in /proc/[pid] where the driver exposes
this information which was originally provided at enclave creation time.
I don't think (3) can universally be done without help from the user
program. Yes, it might be possible to learn something from ELF headers
but Linux does not generally have a requirement that user processes must
have valid ELF information in their memory mappings. Most current
userspace debug symbol loading is based on filenames. SGX does not
provide any record of where enclave memory came from. Perhaps the create
ioctl could include some user data that is later exposed again in the
same proc file mentioned above. Frameworks and toolchains could
establish a convention regarding this user data to find symbols.
--
Jethro Beekman | Fortanix
Re: [intel-sgx-kernel-dev] [PATCH 08/10] kvm: vmx: add guest's IA32_SGXLEPUBKEYHASHn runtime switch support
by Huang, Kai
On 5/24/2017 4:43 AM, Paolo Bonzini wrote:
>
>
> On 23/05/2017 18:34, Andy Lutomirski wrote:
>>
>>> Using MTF is also a little bit tricky, as when we turn on MTF VMEXIT upon
>>> ENCLS VMEXIT, the MTF won't be absolutely pending at end of that ENCLS. For
>>> example, MTF may be pending at end of interrupt (cannot recall exactly) if
>>> event is pending during VMENTRY from ENCLS VMEXIT. Therefore we have to do
>>> additional thing to check whether this MTF VMEXIT really happens after ENCLS
>>> run (step 3 above). And depending on what we need to do, we may need to
>>> check whether ENCLS succeeded or not in guest, which is also tricky, as
>>> ENCLS can fail in either setting error code in RAX, or generating #GP or #UD
>>> (step 4 above). We may still need to do gva->gpa->hpa, ex, in order to
>>> locate EPC/SECS page and update status, depending on the purpose of trapping
>>> ENCLS.
>> I think there are some issues here.
>>
>> First, you're making a big assumption that, when you resume the guest
>> with MTF set, the instruction that gets executed is still
>> ENCLS[EINIT]. That's not guaranteed as is -- you could race against
>> another vCPU that changes the instruction, the instruction could be in
>> IO space, host userspace could be messing with you, etc. Second, I
>> don't think there's any precedent at all in KVM for doing this.
>> Third, you still need to make sure that the MSRs retain the value you
>> want them to have by the time ENCLS happens. I think that, by the
>> time you resolve all of these issues, it'll look a lot like the
>> pseudocode I emailed out, and MTF won't be necessary any more.
>
> Agreed. Emulation in the host is better.
Hi Andy/Paolo,
Thanks for the comments. I'll follow your suggestion in v2.
Thanks,
-Kai
>
> Paolo
>
Re: [intel-sgx-kernel-dev] [PATCH 08/10] kvm: vmx: add guest's IA32_SGXLEPUBKEYHASHn runtime switch support
by Huang, Kai
On 5/17/2017 2:21 AM, Paolo Bonzini wrote:
>
>
> On 16/05/2017 02:48, Huang, Kai wrote:
>>
>>
>> If the host only allows a single LE to run, KVM can add a restriction
>> that only allows creating KVM guests with runtime changes to
>> IA32_SGXLEPUBKEYHASHn disabled, so that only the (single) host-allowed
>> hash can be used by the guest. From the guest's view, it simply has
>> IA32_FEATURE_CONTROL[bit17] cleared and has IA32_SGXLEPUBKEYHASHn
>> defaulting to the (single) host-allowed hash.
>>
>> If the host allows several LEs (but not everything), and if we create a
>> guest with 'lewr', then the behavior is not consistent with HW behavior,
>> as from the guest's hardware's point of view, we can actually run any LE
>> but we have to tell the guest that it is only allowed to change
>> IA32_SGXLEPUBKEYHASHn to some specific values. One compromise solution
>> is that we don't allow creating a guest with 'lewr' specified, and in
>> the meantime only allow creating a guest with host-approved hashes
>> specified in 'lehash'. This will make the guest's behavior consistent
>> with HW behavior but only allows the guest to run one LE (which is
>> specified by 'lehash' when the guest is created).
>>
>> I'd like to hear comments from you guys.
>>
>> Paolo, do you also have comments here from KVM's side?
>
> I would start with read-only LE hash (same as the host), which is a
> valid configuration anyway. Then later we can trap EINIT to emulate
> IA32_SGXLEPUBKEYHASHn.
You mean we can start by creating guests without QEMU 'lewr' parameter
support, and always disallow the guest from changing
IA32_SGXLEPUBKEYHASHn? Even so, KVM still needs to emulate
IA32_SGXLEPUBKEYHASHn (allowing MSR reads but not writes) and write the
guest's value to the physical MSRs when running the guest (trapping
EINIT and writing the MSRs during EINIT is really just a performance
optimization), because the host can run multiple LEs and change the
MSRs. Your suggestion only works when runtime changes to
IA32_SGXLEPUBKEYHASHn are disabled on the host (meaning the physical
machine).
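As a rough user-space model of the MSR access side of that read-only
emulation: reads return the cached per-vcpu value, and writes are
refused (return 1, following the patch's convention for injecting #GP)
unless launch control is writable for the guest. struct sim_vcpu,
lc_writable, sim_get_msr() and sim_set_msr() are hypothetical names for
this sketch. It does not cover loading the guest's value into the
physical MSRs, which is still needed as described above.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_IA32_SGXLEPUBKEYHASH0	0x0000008c
#define MSR_IA32_SGXLEPUBKEYHASH3	0x0000008f

struct sim_vcpu {
	uint64_t lepubkeyhash[4];	/* value the guest sees on rdmsr */
	bool lc_writable;		/* hypothetical: guest may change the hash */
};

static bool is_lepubkeyhash_msr(uint32_t index)
{
	return index >= MSR_IA32_SGXLEPUBKEYHASH0 &&
	       index <= MSR_IA32_SGXLEPUBKEYHASH3;
}

/* Guest rdmsr: always allowed for the four hash MSRs. Returns 0 on success. */
static int sim_get_msr(const struct sim_vcpu *v, uint32_t index,
		       uint64_t *data)
{
	if (!is_lepubkeyhash_msr(index))
		return 1;
	*data = v->lepubkeyhash[index - MSR_IA32_SGXLEPUBKEYHASH0];
	return 0;
}

/* Guest wrmsr: refused while the guest is in read-only mode. */
static int sim_set_msr(struct sim_vcpu *v, uint32_t index, uint64_t data)
{
	if (!is_lepubkeyhash_msr(index) || !v->lc_writable)
		return 1;
	v->lepubkeyhash[index - MSR_IA32_SGXLEPUBKEYHASH0] = data;
	return 0;
}
```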
Thanks,
-Kai
>
> Paolo
>
[PATCH 0/2] intel_sgx: fix ksgxswapd_tsk init bugs
by Sean Christopherson
Sean Christopherson (2):
intel_sgx: separate EPC bank addition from page cache init
intel_sgx: check the return value of kthread_run
drivers/platform/x86/intel_sgx/sgx.h | 3 ++-
drivers/platform/x86/intel_sgx/sgx_main.c | 6 +++++-
drivers/platform/x86/intel_sgx/sgx_page_cache.c | 12 ++++++++----
3 files changed, 15 insertions(+), 6 deletions(-)