SPDK + user space appliance
by Shahar Salzman
Hi experts,
We have been integrating SPDK into our system using a blockdev module; currently this is only a POC version.
Our use case is a user space appliance processing IOs, with an SPDK frontend to do the NVMeF.
Currently all of the user bdevs are created via the configuration file, but we are working on adding functions and RPCs that allow creation and deletion of these namespaces.
IO is sent to user space via a callback. The implementation is up to user space, but obviously the longer an IO lingers there, the lower the performance, so we use a set of rings with threads processing them to keep the time spent in the appliance minimal.
Going back from user space we use a single ring (multiple producers, single consumer) onto which the completions are inserted, and the ring poll function is registered with the SPDK core (spdk_poller_register).
Does this seem like a sane design? We'd really like your feedback and, if this can be useful to others, to push the code into SPDK.
Obviously we are willing to go through any review/testing process that is required, and to share performance results and issues.
Cheers,
Shahar
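For readers following the thread, the completion path described above (many appliance threads pushing completions, one SPDK-side poller draining them) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the POC: the names completion_ring, ring_enqueue, ring_poll and sum_cb are hypothetical, the ring is a generic bounded queue with per-slot sequence numbers written in C11 atomics, and in SPDK the ring_poll call would presumably sit inside the function registered via spdk_poller_register.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8u /* hypothetical capacity */
static_assert((RING_SIZE & (RING_SIZE - 1)) == 0, "RING_SIZE must be a power of two");

struct slot {
    atomic_size_t seq;  /* tells producers/consumer whose turn the slot is */
    void *completion;   /* opaque completion handed back to SPDK */
};

struct completion_ring {
    struct slot slots[RING_SIZE];
    atomic_size_t head; /* next position a producer will claim */
    size_t tail;        /* touched only by the single consumer */
};

static void ring_init(struct completion_ring *r)
{
    for (size_t i = 0; i < RING_SIZE; i++) {
        atomic_init(&r->slots[i].seq, i);
        r->slots[i].completion = NULL;
    }
    atomic_init(&r->head, 0);
    r->tail = 0;
}

/* Producer side: callable from any appliance thread. Returns false when full. */
static bool ring_enqueue(struct completion_ring *r, void *completion)
{
    size_t pos = atomic_load_explicit(&r->head, memory_order_relaxed);

    for (;;) {
        struct slot *s = &r->slots[pos & (RING_SIZE - 1)];
        size_t seq = atomic_load_explicit(&s->seq, memory_order_acquire);
        ptrdiff_t diff = (ptrdiff_t)seq - (ptrdiff_t)pos;

        if (diff == 0) {
            /* Slot is free for this position: try to claim it. */
            if (atomic_compare_exchange_weak_explicit(&r->head, &pos, pos + 1,
                    memory_order_relaxed, memory_order_relaxed)) {
                s->completion = completion;
                /* Publish the slot to the consumer. */
                atomic_store_explicit(&s->seq, pos + 1, memory_order_release);
                return true;
            }
            /* CAS failure reloaded pos; retry with the new value. */
        } else if (diff < 0) {
            return false; /* consumer has not freed this slot yet: ring is full */
        } else {
            pos = atomic_load_explicit(&r->head, memory_order_relaxed);
        }
    }
}

/* Consumer side: drains everything currently published, returns the count. */
static int ring_poll(struct completion_ring *r,
                     void (*cb)(void *completion, void *ctx), void *ctx)
{
    int count = 0;

    for (;;) {
        struct slot *s = &r->slots[r->tail & (RING_SIZE - 1)];
        size_t seq = atomic_load_explicit(&s->seq, memory_order_acquire);

        if (seq != r->tail + 1) {
            break; /* nothing published at the tail position */
        }
        void *completion = s->completion;
        /* Hand the slot back to producers for the next lap. */
        atomic_store_explicit(&s->seq, r->tail + RING_SIZE, memory_order_release);
        r->tail++;
        cb(completion, ctx);
        count++;
    }
    return count;
}

/* Example callback for illustration: treats each completion as an int and sums it. */
static void sum_cb(void *completion, void *ctx)
{
    *(int *)ctx += *(int *)completion;
}
```

Returning the number of completions from ring_poll lets the caller tell an idle poll from a busy one. A full ring makes ring_enqueue fail instead of blocking, so the appliance threads need their own retry or back-pressure policy.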
4 years, 4 months
SPDK support for NVMe CMBs and PMRs with WDS/RDS
by Stephen Bates
Hi All
I just uploaded a patchset to Gerrit that adds support for NVMe controllers with CMBs/PMRs that support WDS and RDS (my full branch of the code is at [1]). This allows NVMe controllers to move/copy data from a namespace on one controller to a namespace on a different controller without requiring a system memory buffer. There are lots of interesting use cases for such data-movement.
The biggest issue with using a CMB or PMR for data copies is getting a vtophys translation. This series adds a new vtophys method for physical memory regions that fail via the existing methods (e.g. a PCIe BAR). This new method uses a linked list of physical regions that can be added/deleted via spdk_reg_memory calls. We can then allocate/free memory from these regions, and the current maps can handle both the reference counting and storing the vtophys translations.
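To make the idea concrete, here is a minimal sketch of a linked list of physical regions with a lookup that falls back to an error value when no region matches. It is not the code from the patch series: phys_region, region_add, region_remove, region_vtophys and REGION_VTOPHYS_ERROR are hypothetical names chosen for illustration, and the sketch leaves out the reference counting and the integration with the existing maps mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define REGION_VTOPHYS_ERROR UINT64_MAX /* hypothetical "no translation" value */

/* One registered region, e.g. a mapped PCIe BAR. */
struct phys_region {
    uintptr_t vaddr;          /* start of the mapping in the process */
    uint64_t paddr;           /* matching physical (bus) address */
    size_t len;               /* length of the region in bytes */
    struct phys_region *next;
};

static struct phys_region *g_regions; /* head of the region list */

/* Register a region. Returns 0 on success, -1 on allocation failure. */
static int region_add(void *vaddr, uint64_t paddr, size_t len)
{
    struct phys_region *r;

    assert(vaddr != NULL && len > 0);
    r = malloc(sizeof(*r));
    if (r == NULL) {
        return -1;
    }
    r->vaddr = (uintptr_t)vaddr;
    r->paddr = paddr;
    r->len = len;
    r->next = g_regions;
    g_regions = r;
    return 0;
}

/* Unregister the region that starts at vaddr. Returns -1 if it is unknown. */
static int region_remove(void *vaddr)
{
    struct phys_region **link = &g_regions;

    while (*link != NULL) {
        struct phys_region *r = *link;

        if (r->vaddr == (uintptr_t)vaddr) {
            *link = r->next;
            free(r);
            return 0;
        }
        link = &r->next;
    }
    return -1;
}

/* Translate a virtual address that lies inside a registered region. */
static uint64_t region_vtophys(const void *buf)
{
    uintptr_t addr = (uintptr_t)buf;

    for (const struct phys_region *r = g_regions; r != NULL; r = r->next) {
        if (addr >= r->vaddr && addr - r->vaddr < r->len) {
            return r->paddr + (addr - r->vaddr);
        }
    }
    return REGION_VTOPHYS_ERROR;
}
```

A linear list is probably fine when only a handful of BARs are registered, since this lookup only runs for addresses the existing translation methods cannot resolve.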
The hello_world example is updated to utilize the CMB WDS/RDS capability if the associated controller supports it. In addition, a new example application called cmb_copy is included that performs the aforementioned offloaded copy when a CMB is available.
We have confirmed that both cmb_copy and the new hello_world work as expected on hardware from both Eideticom and Everspin. We plan to do more testing as more drives with WDS/RDS capable CMBs become available. We used PCIe packet counters in the Microsemi PCIe switch to confirm traffic is moving directly between the two NVMe SSDs and not being routed to the root complex on the CPU.
Feedback on the patches is gratefully received!
Cheers
Stephen
[1] https://github.com/Eideticom/spdk/tree/cmb-copy-v3
4 years, 4 months
Hello World example failed to run
by micki b
Hi
I tried to run the nvme/hello_world example and it failed. See the output below.
EAL: Detected 4 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
Initializing NVMe Controllers
EAL: PCI device 0000:01:00.0 on NUMA socket 0
EAL: probe driver: 15b7:5001 spdk_nvme
EAL: *Cannot write command to PCI config space!*
EAL: Cannot set up bus mastering!
EAL: Requested device 0000:01:00.0 cannot be used
*NVME Device*
Node          SN            Model                   Namespace  Usage                  Format       FW Rev
------------  ------------  ----------------------  ---------  ---------------------  -----------  --------
/dev/nvme0n1  172438424494  WDC WDS256G1X0C-00ENX0  1          256.06 GB / 256.06 GB  512 B + 0 B  B35500WD
*PCI Output*
01:00.0 Non-Volatile memory controller: Sandisk Corp Device 5001 (prog-if
02 [NVM Express])
Subsystem: Marvell Technology Group Ltd. Device 1093
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 16
Region 0: Memory at f7000000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
unlimited, L1 unlimited
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
Unsupported+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
FLReset-
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr-
TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit
Latency L0s <512ns, L1 <64us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk+
DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis+,
LTR+, OBFF Via message
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-,
LTR+, OBFF Disabled
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance-
SpeedDis-
Transmit Margin: Normal Operating Range,
EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB,
EqualizationComplete+, EqualizationPhase1+
EqualizationPhase2+, EqualizationPhase3+,
LinkEqualizationRequest-
Capabilities: [b0] MSI-X: Enable+ Count=19 Masked-
Vector table: BAR=0 offset=00002000
PBA: BAR=0 offset=00003000
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn+ ChkCap+
ChkEn+
Capabilities: [148 v1] Device Serial Number 03-4d-ff-7a-99-88-77-66
Capabilities: [158 v1] Power Budgeting <?>
Capabilities: [168 v1] Alternative Routing-ID Interpretation (ARI)
ARICap: MFVC- ACS-, Next Function: 0
ARICtl: MFVC- ACS-, Function Group: 0
Capabilities: [178 v1] #19
Capabilities: [2b8 v1] Latency Tolerance Reporting
Max snoop latency: 3145728ns
Max no snoop latency: 3145728ns
Capabilities: [2c0 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
L1_PM_Substates+
PortCommonModeRestoreTime=10us
PortTPowerOnTime=10us
Kernel driver in use: *nvme*
Kernel modules: nvme
When I set up hugepages, the kernel driver for the NVMe device changes from nvme to uio_pci_generic.
How can I solve this problem?
Regards
Micki
4 years, 4 months
Re: [SPDK] NVMe CMB WDS/RDS support patches
by Stephen Bates
Team
Belay my last request. It was the tag-ids. This patch series has now been uploaded to Gerrit for test and review. Cover email coming soon.
Cheers
Stephen
On 2018-01-29, 8:16 PM, "SPDK on behalf of Stephen Bates" <spdk-bounces(a)lists.01.org on behalf of sbates(a)raithlin.com> wrote:
Hi All
I have been working on adding support for NVMe CMBs with WDS and RDS capabilities. I have been doing some work with Daniel and Jim to prepare for this based on some earlier patches from Daniel [1]. I now have something I want to submit for review and have a git branch for it here [2].
However when I tried to push this code for review I got the following rather obtuse error. Anyone any ideas what is going on here? My patches seem to pass the check_format.sh script and I rebased on the latest gerrit/master. Any help appreciated! I am wondering if this is something to do with outdated Id tags from Daniel's original patches?
Once I get the submission sorted I will send a cover email explaining what the patches do and the testing we have done on our NVMe hardware to validate that the patches work.
batesste@dionysus:~/spdk/examples/nvme/cmb_copy$ git push review
Password for 'https://sbates130272@review.gerrithub.io':
Counting objects: 82, done.
Delta compression using up to 16 threads.
Compressing objects: 100% (78/78), done.
Writing objects: 100% (82/82), 14.34 KiB | 0 bytes/s, done.
Total 82 (delta 66), reused 4 (delta 3)
remote: Resolving deltas: 100% (66/66)
remote: Processing changes: refs: 1, done
To https://sbates130272@review.gerrithub.io/spdk/spdk
! [remote rejected] HEAD -> refs/for/master (change https://review.gerrithub.io/375202 closed)
error: failed to push some refs to 'https://sbates130272@review.gerrithub.io/spdk/spdk'
Cheers
Stephen
[1] https://review.gerrithub.io/#/c/375201/
[2] https://github.com/Eideticom/spdk/tree/cmb-copy-v3
_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk
4 years, 4 months
NVMe CMB WDS/RDS support patches
by Stephen Bates
Hi All
I have been working on adding support for NVMe CMBs with WDS and RDS capabilities. I have been doing some work with Daniel and Jim to prepare for this based on some earlier patches from Daniel [1]. I now have something I want to submit for review and have a git branch for it here [2].
However when I tried to push this code for review I got the following rather obtuse error. Anyone any ideas what is going on here? My patches seem to pass the check_format.sh script and I rebased on the latest gerrit/master. Any help appreciated! I am wondering if this is something to do with outdated Id tags from Daniel's original patches?
Once I get the submission sorted I will send a cover email explaining what the patches do and the testing we have done on our NVMe hardware to validate that the patches work.
batesste@dionysus:~/spdk/examples/nvme/cmb_copy$ git push review
Password for 'https://sbates130272@review.gerrithub.io':
Counting objects: 82, done.
Delta compression using up to 16 threads.
Compressing objects: 100% (78/78), done.
Writing objects: 100% (82/82), 14.34 KiB | 0 bytes/s, done.
Total 82 (delta 66), reused 4 (delta 3)
remote: Resolving deltas: 100% (66/66)
remote: Processing changes: refs: 1, done
To https://sbates130272@review.gerrithub.io/spdk/spdk
! [remote rejected] HEAD -> refs/for/master (change https://review.gerrithub.io/375202 closed)
error: failed to push some refs to 'https://sbates130272@review.gerrithub.io/spdk/spdk'
Cheers
Stephen
[1] https://review.gerrithub.io/#/c/375201/
[2] https://github.com/Eideticom/spdk/tree/cmb-copy-v3
4 years, 4 months
Re: [SPDK] strcpy forbidden
by Stephen Bates
Paul
Thanks! OK but strdup() won't work if the destination buffer is already pre-allocated? I assume in that case we should use strncpy() or memcpy()?
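For the pre-allocated buffer case, one possible shape is a small helper that checks the length first and then uses memcpy(). This is a sketch only: copy_string is a hypothetical name, not an SPDK function. One known pitfall with plain strncpy() is that it leaves the destination without a terminating NUL when the source is as long as or longer than the size passed in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into a caller-provided buffer of dst_size bytes.
 * Returns 0 on success, or -1 (leaving dst untouched) when src
 * plus its terminating NUL does not fit.
 */
static int copy_string(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);
    if (len >= dst_size) {
        return -1; /* would overflow: refuse instead of truncating */
    }
    memcpy(dst, src, len + 1); /* +1 copies the terminating NUL */
    return 0;
}
```

Refusing the copy rather than silently truncating is a design choice here; a caller that prefers truncation could use snprintf(dst, dst_size, "%s", src) and compare the return value against dst_size.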
Cheers
Stephen Bates, PhD
www.raithlin.com
+1 403 609 1784
On 2018-01-29, 2:50 PM, "SPDK on behalf of Luse, Paul E" <spdk-bounces(a)lists.01.org on behalf of paul.e.luse(a)intel.com> wrote:
LOL, it got me too Stephen. To avoid buffer overflow attacks. Use strdup instead....
-----Original Message-----
From: SPDK [mailto:spdk-bounces@lists.01.org] On Behalf Of Stephen Bates
Sent: Monday, January 29, 2018 2:48 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: [SPDK] strcpy forbidden
Hi All
I am sure there is a really good reason why strcpy is forbidden by check_format.sh but I cannot find it documented anywhere. Can someone enlighten me and point to the documentation if it exists?
Cheers
Stephen
4 years, 4 months
strcpy forbidden
by Stephen Bates
Hi All
I am sure there is a really good reason why strcpy is forbidden by check_format.sh but I cannot find it documented anywhere. Can someone enlighten me and point to the documentation if it exists?
Cheers
Stephen
4 years, 4 months
Welcome to SPDK China Summit 2018
by Yang, Ziye
Hi All,
Storage is well known as an integral part of the data center infrastructure. The hardware and software architectures of a storage system jointly determine its performance, security, and manageability. Storage media is rapidly evolving, from mechanical hard drives to SATA NAND SSDs and now to NAND and 3D XPoint SSDs based on the NVMe protocol. Storage capacities also continue to grow, and I/O latencies have improved from milliseconds to microseconds. To take advantage of these hardware and media improvements, storage software needs similar improvements. Intel started the Storage Performance Development Kit (SPDK) open source software project (https://spdk.io) to help optimize the performance of storage systems.
To further promote the development of SPDK technology and its community and to provide a platform for exchange and sharing, the SPDK China Summit 2018 will be held at the Crowne Plaza Hotel Sun Palace Beijing on March 23rd, 2018. We sincerely invite you to attend this summit and discuss the status of SPDK and its future development. At the summit, Intel and SPDK users (e.g., Alibaba, Huawei, Hitachi, and FusionStack) will present topics related to the SPDK program and community development. You are welcome to join; the conference registration details are at the following links.
https://www.bagevent.com/event/1177792 (Chinese website)
https://www.bagevent.com/event/1177885 (English website)
Best Regards
Ziye Yang
4 years, 5 months
Re: [SPDK] Compile time error in lib/scsi/scsi_bdev.o?
by Meneghini, John
The following compiles cleanly, w/no errors. So it looks like CONFIG_COVERAGE?=y is broken.
ssan-rx2560-03:spdk(master) > cat CONFIG.local
CONFIG_WERROR?=y
CONFIG_DPDK_DIR?=/home/johnm/SPDK/spdk/dpdk/build
CONFIG_RDMA?=y
From: SPDK <spdk-bounces(a)lists.01.org> on behalf of John Meneghini <John.Meneghini(a)netapp.com>
Reply-To: Storage Performance Development Kit <spdk(a)lists.01.org>
Date: Wednesday, January 24, 2018 at 2:50 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: [SPDK] Compile time error in lib/scsi/scsi_bdev.o?
Is anyone else seeing this error?
/John
CC lib/scsi/scsi_bdev.o
scsi_bdev.c: In function ‘spdk_bdev_scsi_execute’:
scsi_bdev.c:1884:21: error: ‘bdlen’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
pllen - (md + bdlen));
~~~~^~~~~~~~
scsi_bdev.c:1784:6: note: ‘bdlen’ was declared here
int bdlen, llba;
^~~~~
cc1: all warnings being treated as errors
make[2]: *** [/home/johnm/SPDK/spdk/mk/spdk.common.mk:216: scsi_bdev.o] Error 1
make[1]: *** [/home/johnm/SPDK/spdk/mk/spdk.subdirs.mk:35: scsi] Error 2
make: *** [/home/johnm/SPDK/spdk/mk/spdk.subdirs.mk:35: lib] Error 2
ssan-rx2560-03:spdk(master) > cat CONFIG.local
CONFIG_WERROR?=y
CONFIG_COVERAGE?=y
CONFIG_DPDK_DIR?=/home/johnm/SPDK/spdk/dpdk/build
CONFIG_RDMA?=y
ssan-rx2560-03:spdk(master) > git logg -10
* f570aa65 2018-01-24 (HEAD -> master, origin/master) vhost: only split on 2MB boundaries when necessary [ Jim Harris / james.r.harris(a)intel.com ]
* e489ca69 2018-01-23 setup.sh: add virtio device names to status output [ Jim Harris / daniel.verkamp(a)intel.com ]
* 3779dda4 2018-01-23 setup.sh: change NVME_WHITELIST to PCI_WHITELIST [ Jim Harris / dariuszx.stojaczyk(a)intel.com ]
* f8c1c71c 2018-01-23 setup.sh: support multiple hugetlb mountpoints [ Jim Harris / dariuszx.stojaczyk(a)intel.com ]
* 4b428979 2018-01-23 setup.sh: fix chown [ Jim Harris / dariuszx.stojaczyk(a)intel.com ]
* 68b8237c 2018-01-23 blob: assign iovcnt value in spdk_bs_user_op_alloc function [ Jim Harris / maciej.szwed(a)intel.com ]
* fb12bbec 2018-01-23 virtio: move vdev->name allocation to generic virtio [ Jim Harris / dariuszx.stojaczyk(a)intel.com ]
* e6da08c2 2018-01-23 virtio/pci: detach pci device on virtio-pci destroy [ Jim Harris / dariuszx.stojaczyk(a)intel.com ]
* 0d6a37c7 2018-01-23 bdev/virtio/rpc: add RPC to attach virtio-pci device [ Jim Harris / dariuszx.stojaczyk(a)intel.com ]
* 2bbc59fa 2018-01-23 nvmf: Fix bug when accessing realloc'd pointer [ Daniel Verkamp / benjamin.walker(a)intel.com ]
4 years, 5 months
Re: [SPDK] Compile time error in lib/scsi/scsi_bdev.o?
by Harris, James R
Hi John,
I’m not seeing this error. I don’t see how we could get to line 1884 with bdlen uninitialized, though: bdlen gets set at line 1863 or 1869 when rc >= 0. If rc < 0, we will break at line 1874.
This code hasn’t changed in more than a year though – any idea why you might only be seeing this issue now? I’m OK with initializing bdlen to 0 at the beginning of that case statement – just curious what is making this pop up now.
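For anyone reading along, the pattern being discussed looks roughly like the sketch below. It is not the scsi_bdev.c code: parse_header and remaining_after_descriptors are hypothetical names, loosely modelled on a mode parameter header parse. The variable is only written on the success path, and the caller only reads it on the success path, but the compiler has to prove that itself. GCC's -Wmaybe-uninitialized depends on its flow analysis, which can change with optimization and instrumentation flags, so the same source may warn in one build configuration and not in another. Initializing the variable at its declaration removes the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical header parser: returns the header length, or -1 when the
 * buffer is too short. *bdlen is written only on the success paths.
 */
static int parse_header(const uint8_t *buf, size_t len, int long_form, int *bdlen)
{
    assert(buf != NULL && bdlen != NULL);
    if (long_form) {
        if (len < 8) {
            return -1;
        }
        *bdlen = (buf[6] << 8) | buf[7]; /* 16-bit descriptor length */
        return 8;
    }
    if (len < 4) {
        return -1;
    }
    *bdlen = buf[3]; /* 8-bit descriptor length */
    return 4;
}

/* Returns the bytes left after header and descriptors, or -1 on error. */
static int remaining_after_descriptors(const uint8_t *buf, size_t len, int long_form)
{
    int bdlen = 0; /* initialized up front, so no path can read garbage */
    int md = parse_header(buf, len, long_form, &bdlen);

    if (md < 0) {
        return -1; /* bdlen was never set by the parser on this path */
    }
    if ((size_t)(md + bdlen) > len) {
        return -1; /* descriptors would run past the buffer */
    }
    return (int)len - (md + bdlen);
}
```

The explicit initialization costs nothing at runtime in practice and keeps builds with warnings treated as errors independent of which analysis a given compiler configuration performs.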
Thanks,
-Jim
From: SPDK <spdk-bounces(a)lists.01.org> on behalf of "Meneghini, John" <John.Meneghini(a)netapp.com>
Reply-To: Storage Performance Development Kit <spdk(a)lists.01.org>
Date: Wednesday, January 24, 2018 at 12:49 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: [SPDK] Compile time error in lib/scsi/scsi_bdev.o?
Is anyone else seeing this error?
/John
CC lib/scsi/scsi_bdev.o
scsi_bdev.c: In function ‘spdk_bdev_scsi_execute’:
scsi_bdev.c:1884:21: error: ‘bdlen’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
pllen - (md + bdlen));
~~~~^~~~~~~~
scsi_bdev.c:1784:6: note: ‘bdlen’ was declared here
int bdlen, llba;
^~~~~
cc1: all warnings being treated as errors
make[2]: *** [/home/johnm/SPDK/spdk/mk/spdk.common.mk:216: scsi_bdev.o] Error 1
make[1]: *** [/home/johnm/SPDK/spdk/mk/spdk.subdirs.mk:35: scsi] Error 2
make: *** [/home/johnm/SPDK/spdk/mk/spdk.subdirs.mk:35: lib] Error 2
ssan-rx2560-03:spdk(master) > cat CONFIG.local
CONFIG_WERROR?=y
CONFIG_COVERAGE?=y
CONFIG_DPDK_DIR?=/home/johnm/SPDK/spdk/dpdk/build
CONFIG_RDMA?=y
ssan-rx2560-03:spdk(master) > git logg -10
* f570aa65 2018-01-24 (HEAD -> master, origin/master) vhost: only split on 2MB boundaries when necessary [ Jim Harris / james.r.harris(a)intel.com ]
* e489ca69 2018-01-23 setup.sh: add virtio device names to status output [ Jim Harris / daniel.verkamp(a)intel.com ]
* 3779dda4 2018-01-23 setup.sh: change NVME_WHITELIST to PCI_WHITELIST [ Jim Harris / dariuszx.stojaczyk(a)intel.com ]
* f8c1c71c 2018-01-23 setup.sh: support multiple hugetlb mountpoints [ Jim Harris / dariuszx.stojaczyk(a)intel.com ]
* 4b428979 2018-01-23 setup.sh: fix chown [ Jim Harris / dariuszx.stojaczyk(a)intel.com ]
* 68b8237c 2018-01-23 blob: assign iovcnt value in spdk_bs_user_op_alloc function [ Jim Harris / maciej.szwed(a)intel.com ]
* fb12bbec 2018-01-23 virtio: move vdev->name allocation to generic virtio [ Jim Harris / dariuszx.stojaczyk(a)intel.com ]
* e6da08c2 2018-01-23 virtio/pci: detach pci device on virtio-pci destroy [ Jim Harris / dariuszx.stojaczyk(a)intel.com ]
* 0d6a37c7 2018-01-23 bdev/virtio/rpc: add RPC to attach virtio-pci device [ Jim Harris / dariuszx.stojaczyk(a)intel.com ]
* 2bbc59fa 2018-01-23 nvmf: Fix bug when accessing realloc'd pointer [ Daniel Verkamp / benjamin.walker(a)intel.com ]
4 years, 5 months