Compile time error in lib/scsi/scsi_bdev.o?
by Meneghini, John
Is anyone else seeing this error?
/John
CC lib/scsi/scsi_bdev.o
scsi_bdev.c: In function ‘spdk_bdev_scsi_execute’:
scsi_bdev.c:1884:21: error: ‘bdlen’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
pllen - (md + bdlen));
~~~~^~~~~~~~
scsi_bdev.c:1784:6: note: ‘bdlen’ was declared here
int bdlen, llba;
^~~~~
cc1: all warnings being treated as errors
make[2]: *** [/home/johnm/SPDK/spdk/mk/spdk.common.mk:216: scsi_bdev.o] Error 1
make[1]: *** [/home/johnm/SPDK/spdk/mk/spdk.subdirs.mk:35: scsi] Error 2
make: *** [/home/johnm/SPDK/spdk/mk/spdk.subdirs.mk:35: lib] Error 2
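FWIW, the usual way to quiet -Werror=maybe-uninitialized in a case like this is to initialize the variables at declaration (just the generic pattern, not necessarily the right fix for scsi_bdev.c):

/* gcc cannot prove bdlen is assigned on every path before the use at
 * line 1884, so give it a defined value up front. */
int bdlen = 0, llba = 0;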
ssan-rx2560-03:spdk(master) > cat CONFIG.local
CONFIG_WERROR?=y
CONFIG_COVERAGE?=y
CONFIG_DPDK_DIR?=/home/johnm/SPDK/spdk/dpdk/build
CONFIG_RDMA?=y
ssan-rx2560-03:spdk(master) > git logg -10
* f570aa65 2018-01-24 (HEAD -> master, origin/master) vhost: only split on 2MB boundaries when necessary [ Jim Harris / james.r.harris(a)intel.com ]
* e489ca69 2018-01-23 setup.sh: add virtio device names to status output [ Jim Harris / daniel.verkamp(a)intel.com ]
* 3779dda4 2018-01-23 setup.sh: change NVME_WHITELIST to PCI_WHITELIST [ Jim Harris / dariuszx.stojaczyk(a)intel.com ]
* f8c1c71c 2018-01-23 setup.sh: support multiple hugetlb mountpoints [ Jim Harris / dariuszx.stojaczyk(a)intel.com ]
* 4b428979 2018-01-23 setup.sh: fix chown [ Jim Harris / dariuszx.stojaczyk(a)intel.com ]
* 68b8237c 2018-01-23 blob: assign iovcnt value in spdk_bs_user_op_alloc function [ Jim Harris / maciej.szwed(a)intel.com ]
* fb12bbec 2018-01-23 virtio: move vdev->name allocation to generic virtio [ Jim Harris / dariuszx.stojaczyk(a)intel.com ]
* e6da08c2 2018-01-23 virtio/pci: detach pci device on virtio-pci destroy [ Jim Harris / dariuszx.stojaczyk(a)intel.com ]
* 0d6a37c7 2018-01-23 bdev/virtio/rpc: add RPC to attach virtio-pci device [ Jim Harris / dariuszx.stojaczyk(a)intel.com ]
* 2bbc59fa 2018-01-23 nvmf: Fix bug when accessing realloc'd pointer [ Daniel Verkamp / benjamin.walker(a)intel.com ]
I/O getting lost on retry?
by Andrey Kuzmin
Just looking at the code below, doesn't the command being retried get lost
after the break statement if the block device is still in the nomem
condition?
Regards,
Andrey
diff --git a/lib/bdev/bdev.c b/lib/bdev/bdev.c
index ba64994..77328ae 100644
--- a/lib/bdev/bdev.c
+++ b/lib/bdev/bdev.c
@@ -1567,10 +1567,11 @@
 		TAILQ_REMOVE(&bdev_ch->nomem_io, bdev_io, link);
 		bdev_ch->io_outstanding++;
 		bdev_io->status = SPDK_BDEV_IO_STATUS_PENDING;
 		bdev->fn_table->submit_request(bdev_ch->channel, bdev_io);
 		if (bdev_io->status == SPDK_BDEV_IO_STATUS_NOMEM) {
+			/** Doesn't the io get lost here? */
 			break;
 		}
 	}
 }
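Or is the request expected to be put back by the completion path? I.e., assuming (a guess on my part, not checked against the tree) that spdk_bdev_io_complete() does something like:

/* Guess: a NOMEM completion re-queues the I/O at the head of the
 * channel's nomem_io list instead of finishing it, so the break above
 * would only defer the retry rather than drop the request. */
void spdk_bdev_io_complete(struct spdk_bdev_io *bdev_io,
			   enum spdk_bdev_io_status status)
{
	struct spdk_bdev_channel *bdev_ch = bdev_io->ch;

	if (status == SPDK_BDEV_IO_STATUS_NOMEM) {
		bdev_ch->io_outstanding--;
		TAILQ_INSERT_HEAD(&bdev_ch->nomem_io, bdev_io, link);
		return;
	}
	/* ... normal completion handling ... */
}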
Re: [SPDK] bdev: Added latency to channel statistics patch testing
by Isaac Otsiabah
Hi Jim, how are you, and happy new year? We investigated the cached tsc value suggestion and found the following.
1. To retrieve the cached tsc value from the spdk_thread structure, spdk_thread_get_ticks() would be implemented like:

uint64_t spdk_thread_get_ticks(void)
{
	struct spdk_thread *thread = spdk_get_thread();

	if (thread) {
		return thread->tsc;
	}
	return 0;
}

The issue is that spdk_get_thread() acquires a lock each time it is called.
2. Saving the tsc value from the _spdk_reactor_run() function requires either a) acquiring the lock at least once to get the address of the tsc variable from spdk_thread and saving that address into a pointer variable (sketched after this list), or b) using a function that would repeatedly acquire the lock whenever it needs to save the value.
3. It is also possible that a single cached value is used to process more than one I/O command, which can result in erroneously lower latencies at higher queue depths.
4. I compared the spdk_thread_get_ticks() implementation (cached tsc) against simply calling spdk_get_ticks() (no caching) using spdk/test/lib/bdev/bdevperf/bdevperf. The cached implementation showed lower latency at higher queue depths.
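For 2a, the idea would look roughly like this (a sketch only, assuming a tsc field added to struct spdk_thread; not actual SPDK code):

/* Resolve the spdk_thread pointer (and pay the spdk_get_thread() lock)
 * once before the loop, then refresh the cached tsc through the saved
 * pointer on every pass. */
static int _spdk_reactor_run(void *arg)
{
	struct spdk_thread *thread = spdk_get_thread(); /* lock taken once */
	uint64_t *cached_tsc = &thread->tsc;            /* assumed field */

	while (1) {
		*cached_tsc = spdk_get_ticks();         /* refresh each pass */
		/* ... run pollers and process events; spdk_thread_get_ticks()
		 * on this thread now just reads the cached value ... */
	}
	return 0;
}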
Num of core = 1, Devices = 1, runtime = 60 sec, operation = write

                   Cached tsc via               Old implementation
                   spdk_thread_get_ticks()      (uncached spdk_get_ticks())
Test#  Qdepth      Num of ops   Latency (us)    Num of ops   Latency (us)
1      1           6383242      9.399           6638771      8.972
2      2           10309024     11.640          10260596     11.630
3      4           21594909     11.113          22122622     10.783
4      8           22333247     21.492          22187344     21.569
5      16          22584995     42.506          21974123     43.622
6      32          23407821     82.024          23418909     81.919
7      64          23745706     161.713         23039526     166.604
8      128         25408188     302.265         23890885     321.395
Conclusion: Given these issues, what if we keep the latency measurement implementation we submitted as is, and add an RPC command that enables/disables the latency measurement logic? That way, applications running production code will not pay for the time taken to execute the latency measurement logic. On the other hand, if a user needs to quickly measure latency, they can enable measurement with the RPC command, take the measurement, and then disable it when done. What do you think?
Isaac
From: Harris, James R [mailto:james.r.harris@intel.com]
Sent: Tuesday, December 19, 2017 3:47 PM
To: Isaac Otsiabah <IOtsiabah(a)us.fujitsu.com>; Verkamp, Daniel <daniel.verkamp(a)intel.com>
Cc: Paul Von-Stamwitz <PVonStamwitz(a)us.fujitsu.com>; Edward Yang <eyang(a)us.fujitsu.com>
Subject: Re: bdev: Added latency to channel statistics patch testing
Hi Isaac,
Sorry about the delay. Since this failed the test pool it didn’t get through my review filter. We do have some intermittent failures in the test pool that we are debugging – if one of these ever hits you, please do not hesitate to send an e-mail and we will be happy to re-run it for you.
One concern I have is that this adds two get_ticks calls on each I/O. And for cases like SPDK logical volumes – where the NVMe namespace is a bdev, and then there is a logical volume bdev on top of it – it means four get_ticks calls for each I/O submitted to the logical volume.
The get_ticks() calls are relatively expensive – on my E5 v4 Xeon system, each get_ticks() takes about 11 ns. So if we do 1M IO/s on one core with SPDK logical volumes on my system, this would be 4 * 1M * 11 ns = 44 ms of CPU time per second (or 4.4% of the core) spent just on these get_ticks() calls.
Ben and I were talking about this in the lab and have an idea. In the main reactor loop, we already need to do a get_ticks() call each time through the loop. We could easily save this value in the spdk_thread structure. Then when the bdev layer needs it for something like I/O statistics, it could use that value, instead of calling get_ticks() again. We would call this something like spdk_thread_get_ticks(). The downside is that value would not be 100% precise – it would resolve to when the poller function was called, which may be 100-200ns off.
We could also provide an option with an RPC/conf file parameter so that a call to spdk_thread_get_ticks() would return a precise tick count instead of the cached value. Then Fujitsu could enable this to get precise values while the default would use the cached values.
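Roughly like this (names are hypothetical, just to illustrate the shape):

/* Return the reactor-cached tsc by default; return a fresh reading when
 * precise timestamps were requested via RPC/conf file. */
static bool g_precise_ticks = false;	/* toggled by the RPC/conf option */

uint64_t spdk_thread_get_ticks(void)
{
	struct spdk_thread *thread = spdk_get_thread();

	if (!thread) {
		return 0;
	}
	if (g_precise_ticks) {
		return spdk_get_ticks();	/* precise, ~11 ns per call */
	}
	return thread->tsc;			/* cached once per reactor pass */
}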
What do you think? If this would suit Fujitsu’s needs, I would be happy to put a patch out for review.
-Jim
From: Isaac Otsiabah <IOtsiabah(a)us.fujitsu.com>
Date: Tuesday, December 19, 2017 at 2:51 PM
To: James Harris <james.r.harris(a)intel.com>, Daniel Verkamp <daniel.verkamp(a)intel.com>
Cc: Paul Von-Stamwitz <PVonStamwitz(a)us.fujitsu.com>, Isaac Otsiabah <IOtsiabah(a)us.fujitsu.com>, Edward Yang <eyang(a)us.fujitsu.com>
Subject: FW: bdev: Added latency to channel statistics patch testing
Hello Jim, how are you? Please, can we get some movement on this patch? The changes were very small, in lib/bdev/bdev.c. I tested it with the bdev fio_plugin and the latency results were comparable to fio's results. One thing: the build and test on machine fedora-06 kept failing but passed on all the other machines. I do not see a reason for it to fail on fedora-06, since the changes were not platform related. Please, can you help us move this forward? Thank you.
Isaac
From: Isaac Otsiabah
Sent: Tuesday, December 12, 2017 12:59 PM
To: 'daniel.verkamp(a)intel.com' <daniel.verkamp(a)intel.com>; 'Harris, James R' <james.r.harris(a)intel.com>
Cc: Paul Von-Stamwitz <PVonStamwitz(a)us.fujitsu.com>; Isaac Otsiabah <IOtsiabah(a)us.fujitsu.com>; Edward Yang <eyang(a)us.fujitsu.com>
Subject: bdev: Added latency to channel statistics patch testing
Hi Daniel, this is a bdev fio_plugin test that compared fio's latency measurements against the latency measurements from the https://review.gerrithub.io/#/c/390654 (bdev: Added latency to channel statistics) patch. As expected, the latency measurements from the patch are comparable to the fio measurements.
Test run#  Qdepth  Op     fio clat (us)  fio avg latency (us)  bdev latency (us) from spdk_bdev_get_io_stat(..)
1          2       write  7.80           8.52                  8.56
                   read   95.36          96.06                 97.138
2          4       write  7.98           8.70                  8.32
                   read   133.88         134.59                128.85
3          8       write  8.83           9.85                  10.87
                   read   175.61         176.48                180.66
4          16      write  9.79           10.81                 10.282
                   read   240.71         241.6                 236.913
5          32      write  11.87          12.88                 12.384
                   read   329.8          330.67                327.648
6          64      write  20.64          21                    20.707
                   read   471.02         471.91                467.118
7          128     write  187.53         188.57                182.92
                   read   704.93         705.81                697.49
Isaac
how to use blobstore with pthread
by Zhengyu Zhang
Hi list,
Could someone tell me how to integrate the blobstore into multithreaded
programs? I saw some examples using the SPDK event framework to do
multithreading, but how to do that with the pthread library or C++
std::thread is still mysterious to me...
My workflow is shown below. It gives me a segmentation fault when it
inits/loads the blobstore:
[main]: spdk_app_start -> get bdev -> pthread_create -> ...
[new threads]: spdk_allocate_thread -> spdk_bdev_create_bs_dev ->
spdk_bs_init/load[SEG FAULT] -> create/open/write/ blobs
Thanks!
Zhengyu
---
My thread routine:
#define THREAD_NUM (1)

static void _spdk_send_msg(spdk_thread_fn fn, void *ctx, void *thread_ctx)
{
	/* this thread never expects messages from other threads */
	assert(false);
}

int thread_routine(struct hello_context_t *hello_context)
{
	spdk_allocate_thread(_spdk_send_msg, NULL, NULL, NULL, NULL);

	struct spdk_bs_dev *bs_dev = NULL;
	bs_dev = spdk_bdev_create_bs_dev(hello_context->bdev, NULL, NULL);
	if (bs_dev == NULL) {
		SPDK_ERRLOG("Could not create blob bdev!!\n");
		spdk_app_stop(-1);
		return -1;
	}

	spdk_bs_init(bs_dev, NULL, bs_init_complete, hello_context); /* SEG FAULT */
	return 0;
}
Enabling SPDK_DEBUGLOG(SPDK_LOG_NVME
by Stephen Bates
Hi All
I am looking to add some more CMB support to nvme_pci.c. I'd like to include some debug statements to help track the location of the CMB memory mapping into virtual address space. I am having issues enabling the SPDK_DEBUGLOG macro. Can someone give me some pointers as to how we turn this logging mechanism on?
Cheers
Stephen
The IO handling when disk hot removed
by 張安男
Hello all,
Recently I have been trying out the disk hot remove function.
I found that the I/Os in flight do not all come back (i.e. execute their
callback functions) when I hot remove the disk.
Are there any settings that make all I/Os execute their callbacks even when
the disk is hot removed?
If the I/Os will not execute their callback functions when the disk is removed,
I must record the I/Os before sending them out, and abort them all myself
when I receive the hotplug event. That's a little troublesome.
Any suggestion is appreciated.
Thank you so much
--
Vincent chang
SPDK FIO poor 4K read performance
by Mayank
Hi All,
I have performed fio benchmark tests with the SPDK fio plugin, including a
throughput test with iodepth=32 and numjobs=4.
Latency results are as below:

            IOPS    Throughput  Latency (usec)
rand read   11.2k   45.9MB/s    88.83
rand write  126k    518MB/s     7.56

Throughput results are as below:

Throughput (io:32 thr:4)  IOPS  Throughput  Latency (usec)
seq read                  406k  1665MB/s    314.71
seq write                 455k  1865MB/s    280.87
rand read                 478k  1959MB/s    267.29
rand write                442k  1812MB/s    289.03
In these results, I'm getting lower sequential read IOPS than sequential
write IOPS.
I am using NVMe model number: INTEL SSDPEDMD020T4.
kernel : 4.13.13
fio version : 2.21
can anybody tell me what could be the issue?
Thanks,
Mayank
Re: [SPDK] Selecting specific NVMe controllers via setup.sh
by Harris, James R
Hi Stephen,
There is no way to do that currently with setup.sh. No objections from me for adding it in a way like you describe – I think that would be generally useful.
-Jim
On 1/8/18, 3:50 PM, "SPDK on behalf of Stephen Bates" <spdk-bounces(a)lists.01.org on behalf of sbates(a)raithlin.com> wrote:
Hi All
[This is my first SPDK email so please forgive if this is already covered somewhere]
Is there a way to use the setup.sh script to bind only one (or a subset) of the NVMe controllers in a system to vfio or uio? I have a mix of NVMe devices and I only want some of them allocated to SPDK, keeping the rest under kernel driver control.
If the answer to this is NO, are there any objections to me adding this (albeit in a way that keeps setup.sh backward compatible)? I'd probably use an environment variable list like how we use SKIP_PCI already...
Cheers
Stephen
Problem with Blobstore when write 65MB continously
by Zhengyu Zhang
Hi list!
I want to write an app with Blobstore in SPDK. I have been playing with
example/blob/hello_world/hello_blob.c for a while. I modified hello_blob
to make it write more pages than its original single page:
for (i = 0; i < SOMEVAL; i++) {
	spdk_bs_io_write_blob(hello_context->blob, hello_context->channel,
			      hello_context->write_buff, offset, 32,
			      write_complete, hello_context);
	offset += 32;
}
I meant to issue SOMEVAL writes to the blob, 32 pages per write. When the
total amount of data written is below 64M (SOMEVAL <= 512), it works
fine. However, when the total size is over 64M, e.g. 65M, it breaks:
hello_blob.c: 388:blob_create_complete: *NOTICE*: new blob id 4294967296
hello_blob.c: 327:open_complete: *NOTICE*: entry
hello_blob.c: 338:open_complete: *NOTICE*: blobstore has FREE clusters of
380063
hello_blob.c: 358:open_complete: *NOTICE*: resized blob now has USED
clusters of 65
hello_blob.c: 295:sync_complete: *NOTICE*: entry
hello_blob.c: 253:blob_write: *NOTICE*: entry
hello_blob.c: 232:write_complete: *NOTICE*: entry
hello_blob.c: 115:unload_bs: *ERROR*: Error in write completion (err -12)
blobstore.c:2563:spdk_bs_unload: *ERROR*: Blobstore still has open blobs
hello_blob.c: 99:unload_complete: *NOTICE*: entry
hello_blob.c: 101:unload_complete: *ERROR*: Error -16 unloading the bobstore
I have no idea what is going on ... can anyone help?
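One thing I noticed: err -12 looks like -ENOMEM, so maybe the loop above queues all SOMEVAL writes at once and exhausts the channel's request pool (64M would then just be where the pool runs out). If that is the cause, chaining the writes from the completion callback should keep only one write outstanding at a time. A sketch, with a hypothetical writes_done counter added to hello_context_t:

/* Issue one 32-page write at a time, starting the next write from the
 * completion callback so the channel's request pool is never exhausted.
 * "writes_done" is a hypothetical counter added to hello_context_t. */
static void write_complete(void *arg, int bserrno)
{
	struct hello_context_t *hello_context = arg;

	if (bserrno) {
		unload_bs(hello_context, "Error in write completion", bserrno);
		return;
	}

	hello_context->writes_done++;
	if (hello_context->writes_done < SOMEVAL) {
		/* offset is in pages; write the next 32-page chunk */
		spdk_bs_io_write_blob(hello_context->blob, hello_context->channel,
				      hello_context->write_buff,
				      hello_context->writes_done * 32, 32,
				      write_complete, hello_context);
		return;
	}

	/* all SOMEVAL writes finished; read back / unload as before */
}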
Thanks
Zhengyu
Selecting specific NVMe controllers via setup.sh
by Stephen Bates
Hi All
[This is my first SPDK email so please forgive if this is already covered somewhere]
Is there a way to use the setup.sh script to bind only one (or a subset) of the NVMe controllers in a system to vfio or uio? I have a mix of NVMe devices and I only want some of them allocated to SPDK, keeping the rest under kernel driver control.
If the answer to this is NO, are there any objections to me adding this (albeit in a way that keeps setup.sh backward compatible)? I'd probably use an environment variable list like how we use SKIP_PCI already...
Cheers
Stephen