[edk2] Question about hotplugging NIC devices to an empty pci-bridge

Zhoujian (jay) jianjay.zhou at huawei.com
Tue Dec 25 22:04:41 PST 2018


Hi Laszlo,

Thank you for your detailed explanation and your patience; it is really helpful!

Regards,
Jay Zhou

> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek at redhat.com]
> Sent: Tuesday, December 25, 2018 6:18 PM
> To: Zhoujian (jay) <jianjay.zhou at huawei.com>
> Cc: Yao, Jiewen <jiewen.yao at intel.com>; edk2-devel at lists.01.org; Huangweidong
> (C) <weidong.huang at huawei.com>; liujunjie (A) <liujunjie23 at huawei.com>;
> wangxin (U) <wangxinxin.wang at huawei.com>; wujing (O) <wujing42 at huawei.com>;
> dengkai (A) <dengkai1 at huawei.com>
> Subject: Re: Question about hotplugging NIC devices to an empty pci-bridge
> 
> Brief answer while I'm on PTO.
> 
> (It's difficult to reply to this thread in any sensible manner, because of
> the brain-damaged top-posting that outlook and gmail perpetuate. I'll try my
> best anyway, but you might have to reverse the order of my answers for
> getting a good logical explanation. Again, the damage is self-inflicted here;
> use a better MUA please.)
> 
> On 12/21/18 14:50, Zhoujian (jay) wrote:
> >> -----Original Message-----
> >> From: Yao, Jiewen [mailto:jiewen.yao at intel.com]
> >> Sent: Friday, December 21, 2018 1:28 PM
> >> To: Zhoujian (jay) <jianjay.zhou at huawei.com>;
> >> edk2-devel at lists.01.org; lersek at redhat.com
> >> Cc: Huangweidong (C) <weidong.huang at huawei.com>; liujunjie (A)
> >> <liujunjie23 at huawei.com>; wangxin (U) <wangxinxin.wang at huawei.com>;
> >> wujing (O) <wujing42 at huawei.com>; dengkai (A) <dengkai1 at huawei.com>
> >> Subject: RE: Question about hotplugging NIC devices to an empty
> >> pci-bridge
> 
> When you hotplug a traditional PCI, or PCI Express, device, at OS runtime,
> the OS can generally only satisfy the resource requirements of the device
> from reserved (pre-allocated) resources. This means that hotplug plans have
> to be considered in advance when the initial PCI enumeration and resource
> assignment occurs, in the firmware. The reservations should be considered /
> propagated upstream (to the root complex(es)) from the leaf bridge(s) where
> the hotplug actions are expected.
> PciBusDxe covers the propagation, but the "leaves" have to expose the
> reservations ("paddings").
> 
> The default reservation sizes may be both wasteful and insufficient. One
> example for waste is when you have many traditional PCI bridges, each
> requiring 4KB IO space, but the platform doesn't have much IO space in total
> (the theoretical maximum is 64KB anyway), and so you run out of IO space
> during enumeration.
> 
> More below:
> 
> >>
> >> You need to have a PciHotPlug driver to produce the
> >> EFI_PCI_HOT_PLUG_INIT_PROTOCOL.
> >>
> >> One example:
> >> https://github.com/tianocore/edk2/tree/master/OvmfPkg/PciHotPlugInitDxe
> >> Laszlo added it. He may provide comments on how to use it.
> >>
> >> Another example:
> >> https://github.com/tianocore/edk2-platforms/tree/devel-MinPlatform/Platform/Intel/KabylakeOpenBoardPkg/Features/PciHotPlug
> >> This is to add Thunderbolt support on the Kabylake platform.
> >
> > I've checked the dsc, and confirmed that the OVMF.fd already had the
> > PciHotPlug driver.
> > Then I found the resource info through the debug log like below:
> >
> > InitRootBridge: populated root bus 0, with room for 255 subordinate bus(es)
> > RootBridge: PciRoot(0x0)
> >   Support/Attr: 70069 / 70069
> >     DmaAbove4G: No
> > NoExtConfSpace: Yes
> >      AllocAttr: 3 (CombineMemPMem Mem64Decode)
> >            Bus: 0 - FF Translation=0
> >             Io: C000 - FFFF Translation=0
> >            Mem: C0000000 - FBFFFFFF Translation=0
> >     MemAbove4G: 41800000000 - 41FFFFFFFFF Translation=0
> >           PMem: FFFFFFFFFFFFFFFF - 0 Translation=0
> >    PMemAbove4G: FFFFFFFFFFFFFFFF - 0 Translation=0
> >
> > In OvmfPkg/PlatformPei/Platform.c, the function MemMapInitialization
> > sets PciIoBase=0xC000 and PciIoSize=0x4000 (on Q35, PciIoBase=0x6000 and
> > PciIoSize=0xA000).
> >
> > So my questions are:
> > 1) Why is the default value of PciIoBase 0xC000? Each pci-bridge needs a
> > 0x1000 (4KB) IO window, which means IO windows can be reserved for only 4
> > pci-bridges?
> 
> The IO space aperture sizes that you see on i440fx and Q35 in
> OvmfPkg/PlatformPei emerge like that simply because those are the largest
> contiguous IO space ranges that fit between IO ports that belong to platform
> devices.
> 
> If you run
> 
>   git blame -- OvmfPkg/PlatformPei/Platform.c
> 
> you soon end up with a pointer to commit bba734ab4c7c
> ("OvmfPkg/PlatformPei: provide 10 * 4KB of PCI IO Port space on Q35",
> 2016-05-17). The commit message of that commit should help, and it also
> mentions
> 
>   https://bugzilla.redhat.com/show_bug.cgi?id=1333238
> 
> which is where I had investigated the IO space sizes that were
> *practically* available on i440fx and Q35.
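
A minimal sketch of the arithmetic behind the aperture sizes discussed above, assuming each traditional PCI bridge needs one 4KB-aligned, 4KB-sized IO window. The function name and the dictionary are illustrative only; the base and size values are the ones quoted in this thread.

```python
# Sketch: count the 4 KB bridge IO windows that fit in a PCI IO aperture.
BRIDGE_IO_WINDOW = 0x1000  # assumed: one 4 KB, 4 KB-aligned window per bridge


def bridge_io_windows(base: int, size: int) -> int:
    """Return how many aligned 4 KB windows fit in [base, base + size)."""
    # Round the start of the aperture up to the next 4 KB boundary.
    first = (base + BRIDGE_IO_WINDOW - 1) // BRIDGE_IO_WINDOW * BRIDGE_IO_WINDOW
    end = base + size
    return max(0, (end - first) // BRIDGE_IO_WINDOW)


# Aperture values quoted earlier in this thread (PciIoBase, PciIoSize).
APERTURES = {"i440fx": (0xC000, 0x4000), "q35": (0x6000, 0xA000)}

for board, (base, size) in APERTURES.items():
    print(f"{board}: {bridge_io_windows(base, size)} bridge IO windows")
```

With these inputs the count is 4 on i440fx and 10 on Q35, which matches the "10 * 4KB" wording of the commit subject and explains why 8 bridges cannot all get IO windows on i440fx.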
> 
> > 2) If I set PciIoBase=0x1000 and PciIoSize=0xA000, start a vm with 8 empty
> > pci-bridges, and hotplug a virtual nic into a pci-bridge, the problem
> > disappears.
> >   But will this cause any side effects?
> 
> Yes, it could; if you override PciIoBase like this, then PciBusDxe may easily
> allocate IO BARs of devices such that they overlap IO ports of other
> (built-in, platform) devices.
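
A minimal sketch of the overlap risk described above, assuming we know which IO port ranges the platform devices occupy. The port ranges in PLATFORM_PORTS are hypothetical placeholders, not the real QEMU port map; the real list would have to come from the commit message and bugzilla entry referenced earlier.

```python
# Sketch: check a candidate PCI IO aperture against platform IO port ranges.

def overlaps(a_base: int, a_size: int, b_base: int, b_size: int) -> bool:
    """True if the half-open ranges [a_base, a_base+a_size) and
    [b_base, b_base+b_size) share at least one port."""
    return a_base < b_base + b_size and b_base < a_base + a_size


# HYPOTHETICAL platform port ranges (base, size), for illustration only.
PLATFORM_PORTS = {
    "example-device-a": (0x0600, 0x80),
    "example-device-b": (0x1800, 0x40),
}


def conflicts(aperture_base: int, aperture_size: int, platform_ports: dict) -> list:
    """Return the names of platform devices whose ports fall in the aperture."""
    return sorted(
        name
        for name, (base, size) in platform_ports.items()
        if overlaps(aperture_base, aperture_size, base, size)
    )


# The override proposed in the question versus the i440fx default.
print(conflicts(0x1000, 0xA000, PLATFORM_PORTS))
print(conflicts(0xC000, 0x4000, PLATFORM_PORTS))
```

Any non-empty result means PciBusDxe could place an IO BAR on top of a port that another device already decodes.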
> 
> The solution to the IO space shortage is to use Q35 with a PCI Express (that
> is, not traditional PCI) hierarchy. PCI Express devices are required to
> function without IO BARs, and you can use PCI Express Root Ports, and
> Switches (consisting of Upstream Ports and a number of Downstream Ports)
> without consuming IO space at all.
> 
> This is documented in great detail in the following two documents in the QEMU
> source tree:
> 
> [1] docs/pcie.txt
> [2] docs/pcie_pci_bridge.txt
> 
> Now, if you switch to Q35 / PCIE, then you likely won't run out of IO space;
> however, the other issue may still arise, where not enough MMIO is reserved
> for hot-plugging devices with large MMIO demands.
> 
> For that, OvmfPkg/PciHotPlugInitDxe implements the firmware side for QEMU's
> "PCI resource reservation capability". This is a vendor-specific PCI
> capability (in traditional config space) that can be added to the generic PCI
> Express Root Port device model of QEMU, using the appropriate command line
> switches (see again [1] and [2]). When you do that, PciHotPlugInitDxe
> instructs PciBusDxe to reserve the given sizes from the given resource types
> on the given root port, and then you can hotplug a large device at OS runtime
> into that root port.
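
As an illustration of those command line switches, here is a minimal sketch that assembles a "-device pcie-root-port" argument carrying reservation hints. The property names (bus-reserve, io-reserve, mem-reserve, pref32-reserve, pref64-reserve) are my understanding of the generic root port's properties and should be checked against [1] and [2]; the id, chassis, slot, and size values are made-up examples.

```python
# Sketch: build a QEMU "-device pcie-root-port,..." argument with
# resource reservation hints. Verify property names against docs/pcie.txt.

# Assumed names of the reservation properties on the generic root port.
RESERVE_PROPS = (
    "bus-reserve",
    "io-reserve",
    "mem-reserve",
    "pref32-reserve",
    "pref64-reserve",
)


def root_port_device_arg(port_id: str, chassis: int, slot: int,
                         reservations: dict) -> str:
    """Return the value for one "-device" option describing a root port."""
    unknown = set(reservations) - set(RESERVE_PROPS)
    if unknown:
        raise ValueError(f"unknown reservation properties: {sorted(unknown)}")
    parts = [
        "pcie-root-port",
        f"id={port_id}",
        "bus=pcie.0",
        f"chassis={chassis}",
        f"slot={slot}",
    ]
    # Emit the hints in a fixed order so the output is reproducible.
    for prop in RESERVE_PROPS:
        if prop in reservations:
            parts.append(f"{prop}={reservations[prop]}")
    return ",".join(parts)


# Example values only: reserve 8M of 32-bit MMIO and 32M of 64-bit
# prefetchable MMIO on a root port called "rp1".
print("-device", root_port_device_arg(
    "rp1", 1, 1, {"mem-reserve": "8M", "pref64-reserve": "32M"}))
```

The reservation sizes should be chosen from the BARs of the largest device that is expected to be hotplugged into that port.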
> 
> For more details (beyond the two documents above), please refer to
> 
> [3] git log -- OvmfPkg/PciHotPlugInitDxe
> [4] https://bugzilla.redhat.com/show_bug.cgi?id=1434740#c5
> [5] https://lists.01.org/pipermail/edk2-devel/2017-September/015296.html
> 
> More below:
> 
> >>> -----Original Message-----
> >>> From: Zhoujian (jay) [mailto:jianjay.zhou at huawei.com]
> >>> Sent: Friday, December 21, 2018 11:04 AM
> >>> To: Yao, Jiewen <jiewen.yao at intel.com>; edk2-devel at lists.01.org;
> >>> lersek at redhat.com
> >>> Cc: Huangweidong (C) <weidong.huang at huawei.com>; liujunjie (A)
> >>> <liujunjie23 at huawei.com>; wangxin (U) <wangxinxin.wang at huawei.com>;
> >>> wujing (O) <wujing42 at huawei.com>; dengkai (A) <dengkai1 at huawei.com>
> >>> Subject: RE: Question about hotplugging NIC devices to an empty
> >>> pci-bridge
> >>>
> >>> I've tried setting PcdPciBusHotplugDeviceSupport to TRUE in
> >>> MdeModulePkg.dec, like below:
> >>> gEfiMdeModulePkgTokenSpaceGuid.PcdPciBusHotplugDeviceSupport|TRUE|BOOLEAN|0x0001003d
> >>> But the problem still exists. Are there any steps I missed? Or does
> >>> QEMU need to pass some information to OVMF?
> >>>
> >>> Could you give me more information?
> >>>
> >>> Thanks,
> >>> Jay Zhou
> >>>
> >>>> -----Original Message-----
> >>>> From: Yao, Jiewen [mailto:jiewen.yao at intel.com]
> >>>> Sent: Thursday, December 20, 2018 8:09 PM
> >>>> To: Zhoujian (jay) <jianjay.zhou at huawei.com>;
> >>>> edk2-devel at lists.01.org
> >>>> Cc: Huangweidong (C) <weidong.huang at huawei.com>; liujunjie (A)
> >>>> <liujunjie23 at huawei.com>; wangxin (U) <wangxinxin.wang at huawei.com>;
> >>>> wujing (O) <wujing42 at huawei.com>; dengkai (A) <dengkai1 at huawei.com>
> >>>> Subject: RE: Question about hotplugging NIC devices to an empty
> >>>> pci-bridge
> >>>>
> >>>> Maybe you can use EFI_PCI_HOT_PLUG_INIT_PROTOCOL to reserve some
> >>>> resources.
> >>>>
> >>>> See MdePkg\Include\Protocol\PciHotPlugInit.h
> >>>>
> >>>> Thank you
> >>>> Yao Jiewen
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: edk2-devel [mailto:edk2-devel-bounces at lists.01.org] On Behalf
> >>>>> Of Zhoujian (jay)
> >>>>> Sent: Thursday, December 20, 2018 7:34 PM
> >>>>> To: edk2-devel at lists.01.org
> >>>>> Cc: Huangweidong (C) <weidong.huang at huawei.com>; liujunjie (A)
> >>>>> <liujunjie23 at huawei.com>; wangxin (U) <wangxinxin.wang at huawei.com>;
> >>>>> wujing (O) <wujing42 at huawei.com>; dengkai (A) <dengkai1 at huawei.com>
> >>>>> Subject: [edk2] Question about hotplugging NIC devices to an empty
> >>>>> pci-bridge
> >>>>>
> >>>>> Hi all,
> >>>>>
> >>>>> The issue occurs when I start a virtual machine in UEFI mode with
> >>>>> libvirt on the qemu-kvm platform; the vm is configured with 8
> >>>>> pci-bridges on root bus 0. I hotplug a device such as a virtual nic
> >>>>> into an empty pci-bridge, which has no device connected. After logging
> >>>>> in to the vm, I can see the device with "lspci", but it is not shown by
> >>>>> "ifconfig -a". Dmesg shows the following:
> >>>>> pci 0000:04:01.0: BAR 0: no space for [mem size 0x00010000 64bit pref]
> >>>>> pci 0000:04:01.0: BAR 0: failed to assign [mem size 0x00010000 64bit pref]
> >>>>> pci 0000:04:01.0: BAR 3: no space for [mem size 0x00004000 64bit pref]
> >>>>> pci 0000:04:01.0: BAR 3: failed to assign [mem size 0x00004000 64bit pref]
> >>>>>
> >>>>> After rebooting the vm, everything is back to normal and I can see
> >>>>> the newly hotplugged nic with "ifconfig -a".
> >>>>>
> >>>>> With OVMF built from the latest edk2 source code, the same problem
> >>>>> arises.
> >>>>>
> >>>>> So, my questions are:
> >>>>> 1) Currently, the generic PCI bus driver in edk2 does not allocate IO
> >>>>> and/or MMIO for a bridge if there is no device behind the bridge that
> >>>>> consumes that kind of resource?
> >>>>> 2) What's the purpose of this strategy?
> >>>>> 3) Why not allocate resources to all bridges, like SeaBIOS does?
> >>>>> 4) Is there any switch for me to turn off this constraint, so that
> >>>>> every pci-bridge, including empty ones, can be assigned IO and memory
> >>>>> windows?
> >>>>> Otherwise, each time I hotplug a device into an empty pci-bridge, a
> >>>>> reboot is needed to use the device?
> >>>>>
> >>>>> Any help will be appreciated, Thanks!
> 
> Currently, the resource reservation capability is implemented on the Generic
> PCI Express Root Port device model, which is only usable on Q35.
> If you really want to hotplug a traditional PCI device, *while* sizing the
> reservation appropriately, I believe you'll have to:
> - size the reservation on a Root Port as needed,
> - cold-plug a PCIE-PCI bridge first into the Root Port,
> - hotplug the traditional PCI device into the PCIE-PCI bridge.
> 
> (You can also *hot*plug the PCIE-PCI bridge itself, because
> <https://bugzilla.tianocore.org/show_bug.cgi?id=656> has been fixed, but then
> remember to reserve bus numbers as well, at the Root Port level.)
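
The three steps above can be sketched as QEMU arguments plus one monitor command. This is only an illustration under assumptions: the ids (rp1, br1, nic1), the chassis/slot numbers, the reservation string, and the choice of virtio-net-pci are made-up examples, and the exact options should be checked against docs/pcie_pci_bridge.txt.

```python
# Sketch: cold-plug a PCIE-PCI bridge into a root port that carries a
# reservation, then hotplug a traditional PCI device into the bridge.

def cold_plug_args(port_id: str, bridge_id: str, reservation: str) -> list:
    """Return QEMU arguments for the root port and the bridge behind it."""
    root_port = (
        f"pcie-root-port,id={port_id},bus=pcie.0,chassis=1,slot=1,{reservation}"
    )
    bridge = f"pcie-pci-bridge,id={bridge_id},bus={port_id}"
    return ["-device", root_port, "-device", bridge]


def hotplug_command(driver: str, dev_id: str, bridge_id: str, addr: int) -> str:
    """Return a monitor "device_add" command targeting the bridge."""
    return f"device_add {driver},id={dev_id},bus={bridge_id},addr={addr:#x}"


# Example values only.
print(" ".join(cold_plug_args("rp1", "br1", "pref64-reserve=32M")))
print(hotplug_command("virtio-net-pci", "nic1", "br1", 1))
```

If the bridge itself is hotplugged instead, the root port would also need a bus number reservation, as noted above.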
> 
> We worked out this exact scenario with another developer earlier, on the
> SeaBIOS mailing list. Please read through the thread below:
> 
>   [SeaBIOS] hotplug failure issue on pci-bridge
>   http://mid.mail-archive.com/da8e8d1c-ab1e-c790-0c34-ef094a438a77 at linux.intel.com
>
>   https://mail.coreboot.org/hyperkitty/list/seabios@seabios.org/thread/WKHZ6LVPOAXRPPT4M5HZKUPON2Z7EZWB/
> 
> Hope this helps,
> Laszlo

