aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-07-13nvme: avoid crashes when node 0 is memoryless node.Masayoshi Mizuma
When CONFIG_NUMA is enabled and node 0 is memoryless, the system crashes because nvme_probe() sets the device->numa_node to 0 by set_dev_node(&pdev->dev, 0), so it tries to allocate memory from node 0. To avoid the crash, we should change the 0 to first_memory_node. Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-12nvme: Limit command retriesKeith Busch
Many controller implementations will return errors to commands that will not succeed, but without the DNR bit set. The driver previously retried these commands an unlimited number of times until the command timeout has exceeded, which takes an unnecessarilly long period of time. This patch limits the number of retries a command can have, defaulting to 5, but is user tunable at load or runtime. The struct request's 'retries' field is used to track the number of retries attempted. This is in contrast with scsi's use of this field, which indicates how many retries are allowed. Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-12loop: Make user notify for adding loop device failedMinfei Huang
There is no error number returned if loop driver fails in function alloc_disk to add new loop device. Add a correct error number to make user notify in this case. Signed-off-by: Minfei Huang <mnghuan@gmail.com> Reviewed-by: Ming Lei <tom.leiming@gmail.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-12nvme-loop: fix nvme-loop Kconfig dependenciesArnd Bergmann
I ran into the same problem on NVME_TARGET_RDMA now, which otherwise needs dependencies on both CONFIG_BLOCK and CONFIGFS_FS: warning: (NVME_TARGET_LOOP && NVME_TARGET_RDMA) selects NVME_TARGET which has unmet direct dependencies (BLOCK && CONFIGFS_FS) 0xA002B368 Mon Jul 11 18:00:45 CEST 2016 failed In file included from ../drivers/nvme/target/core.c:16:0: drivers/nvme/target/nvmet.h:222:14: error: field 'inline_bio' has incomplete type struct bio inline_bio; ^~~~~~~~~~ drivers/nvme/target/core.c: In function 'nvmet_async_event_work': drivers/nvme/target/core.c:98:3: error: implicit declaration of function 'kfree' [-Werror=implicit-function-declaration] kfree(aen); ^~~~~ ../drivers/nvme/target/core.c: In function 'nvmet_ns_enable': ../drivers/nvme/target/core.c:269:13: error: implicit declaration of function 'blkdev_get_by_path' [-Werror=implicit-function-declaration] ns->bdev = blkdev_get_by_path(ns->device_path, FMODE_READ | FMODE_WRITE, Folding in my patch below should address that too. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-12nvmet: fix return value check in nvmet_subsys_alloc()Wei Yongjun
In case of error, the function kstrndup() returns NULL pointer not ERR_PTR(). The IS_ERR() test in the return value check should be replaced with NULL test. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Reviewed-by: Jay Freyensee <james_p_freyensee@linux.intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-12nvme-fabrics: add-remove ctrl repeat fixMing Lin
Repeatedly adding then removing the same NVMe-over-Fabrics controller over and over again (shown below) can cause a kernel crash (also shown below). This patch fixes that. [nvmf]# ./setup_nvme_connections.sh traddr=192.168.1.100,transport=rdma,trsvcid=4420,nqn=darkside -nqn,hostnqn=evil-wins-nqn,nr_io_queues=16 > /dev/nvme-fabrics traddr=192.168.1.100,transport=rdma,trsvcid=4420,nqn=lightside -nqn,hostnqn=good-wins-nqn > /dev/nvme-fabrics [nvmf]# ./remove_nvme_connections.sh 2 echo 1 > /sys/class/nvme/nvme0/delete_controller echo 1 > /sys/class/nvme/nvme1/delete_controller [nvmf]# ./setup_nvme_connections.sh traddr=192.168.1.100,transport=rdma,trsvcid=4420,nqn=darkside -nqn,hostnqn=evil-wins-nqn,nr_io_queues=16 > /dev/nvme-fabrics Killed [nvmf]# dmesg [ 313.416908] nvme nvme0: creating 16 I/O queues. [ 313.523908] nvme nvme0: new ctrl: NQN "darkside-nqn", addr 192.168.1.100:4420 [ 313.524857] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 [ 313.525262] IP: [<ffffffff8136c60e>] strcmp+0xe/0x30 [ 313.525490] PGD 0 [ 313.525726] Oops: 0000 [#1] SMP [ 313.525900] Modules linked in: nvme_rdma nvme_fabrics nvme_core ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_en mlx4_ib ib_core mlx4_core [ 313.527085] CPU: 15 PID: 5856 Comm: setup_nvme_conn Not tainted 4.7.0-rc2+ #2 [ 313.527259] Hardware name: Supermicro X9DRT-F/IBQF/IBFF/X9DRT -F/IBQF/IBFF, BIOS 1.0a 10/09/2012 [ 313.527551] task: ffff88027646cd40 ti: ffff88025b980000 task.ti: ffff88025b980000 [ 313.527879] RIP: 0010:[<ffffffff8136c60e>] [<ffffffff8136c60e>] strcmp+0xe/0x30 [ 313.528232] RSP: 0018:ffff88025b983db0 EFLAGS: 00010206 [ 313.528403] RAX: 0000000000000000 RBX: ffff880471879880 RCX: fffffffffffffff1 [ 313.528594] RDX: 0000000000000000 RSI: ffff880474afa860 RDI: 0000000000000011 [ 313.528778] RBP: ffff88025b983db0 R08: ffff880474afa860 R09: ffff880471879058 [ 313.528956] R10: 000000000000002c R11: ffff88047f415000 R12: ffff880471879800 [ 313.529129] R13: ffff880471879000 R14: ffff880474afa860 R15: fffffffffffffff8 [ 313.529303] FS: 00007f778f510700(0000) GS:ffff88047fbc0000(0000) knlGS:0000000000000000 [ 313.529629] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 313.529817] CR2: 0000000000000010 CR3: 0000000274174000 CR4: 00000000000406e0 [ 313.529989] Stack: [ 313.530154] ffff88025b983e48 ffffffffa0171c74 0000000000000001 0000000000000059 [ 313.530621] ffff880476f32400 ffff88047e8add80 0000010074b33aa0 ffff880471879059 [ 313.531162] ffff88047187904b ffff880471879058 0000000000000000 ffff88047736e000 [ 313.531629] Call Trace: [ 313.531797] [<ffffffffa0171c74>] nvmf_dev_write+0x674/0x840 [nvme_fabrics] [ 313.531974] [<ffffffff81180b53>] __vfs_write+0x23/0x120 [ 313.532146] [<ffffffff8119daff>] ? __fd_install+0x1f/0xc0 [ 313.532316] [<ffffffff8119d97a>] ? __alloc_fd+0x3a/0x170 [ 313.532487] [<ffffffff811811f3>] vfs_write+0xb3/0x1b0 [ 313.532658] [<ffffffff8117e321>] ? filp_close+0x51/0x70 [ 313.532845] [<ffffffff811824e1>] SyS_write+0x41/0xa0 [ 313.533016] [<ffffffff8183055b>] entry_SYSCALL_64_fastpath+0x13/0x8f [ 313.533188] Code: 80 3a 00 75 f7 48 83 c6 01 0f b6 4e ff 48 83 c2 01 84 c9 88 4a ff 75 ed 5d c3 0f 1f 00 55 48 89 e5 eb 04 84 c0 74 18 48 83 c7 01 <0f> b6 47 ff 48 83 c6 01 3a 46 ff 74 eb 19 c0 83 c8 01 5d c3 31 [ 313.536563] RIP [<ffffffff8136c60e>] strcmp+0xe/0x30 [ 313.536815] RSP <ffff88025b983db0> [ 313.536981] CR2: 0000000000000010 [ 313.537151] ---[ end trace 3d952e590e7bc2d5 ]--- Reported-and-tested-by: Jay Freyensee <james.p.freyensee@intel.com> Signed-off-by: Ming Lin <mlin@kernel.org> Signed-off-by: Jay Freyensee <james.p.freyensee@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-12nvme-fabrics: Remove tl_retry_countSagi Grimberg
The timeout before error recovery logic kicks in is dictated by the nvme keep-alive, so we don't really need a transport layer retry count. transports can retry for as much as they like. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-12nvme-rdma: Don't use tl_retry_countSagi Grimberg
Always use the maximum qp retry count as the error recovery timeout is dictated from the nvme keep-alive. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-12nvme-rdma: fix the return value of nvme_rdma_reinit_request()Wei Yongjun
PTR_ERR should be applied before its argument is reassigned, otherwise the return value will be set to 0, not error code. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Reviewed-by: Jay Freyensee <james_p_freyensee@linux.intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-12cdrom: support read sub-channel command in LBA formatvchannaiah
userspace application can send READ_SUB_CHANNEL command with time bit enabled and disabled. The time bit allows selection of address reporting format. If the time bit is disabled the response is in logical block address(CDROM_LBA) format, represented as a 32-bit integer with ms-byte first. If the time bit is enabled the response is in time format i.e., minutes, second, frame (CDROM_MSF) format. Signed-off-by: vchannaiah <vanitha.channaiah@in.bosch.com> Signed-off-by: Mahendran Kuppusamy <mahendran.kuppusamy@in.bosch.com> [veeraiyan.chidambaram@in.bosch.com: updated Documentation/ioctl/cdrom.txt] Signed-off-by: Veeraiyan Chidambaram <veeraiyan.chidambaram@in.bosch.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-12nvme/quirk: Add a delay before checking for adapter readinessGuilherme G. Piccoli
When disabling the controller, the specification says the register NVME_REG_CC should be written and then driver needs to wait the adapter to be ready, which is checked by reading another register bit (NVME_CSTS_RDY). There's a timeout validation in this checking, so in case this timeout is reached the driver gives up and removes the adapter from the system. After a firmware activation procedure, the PCI_DEVICE(0x1c58, 0x0003) (HGST adapter) end up being removed if we issue a reset_controller, because driver keeps verifying the NVME_REG_CSTS until the timeout is reached. This patch adds a necessary quirk for this adapter, by introducing a delay before nvme_wait_ready(), so the reset procedure is able to be completed. This quirk is needed because just increasing the timeout is not enough in case of this adapter - the driver must wait before start reading NVME_REG_CSTS register on this specific device. Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-08Merge branch 'for-4.8/block' of ↵Jens Axboe
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm into for-4.8/drivers Dan writes: "The removal of ->driverfs_dev in favor of just passing the parent device in as a parameter to add_disk(). See below, it has received a "Reviewed-by" from Christoph, Bart, and Johannes. It is also a pre-requisite for Fam Zheng's work to cleanup gendisk uevents vs attribute visibility [1]. We would extend device_add_disk() to take an attribute_group list. This is based off a branch of block.git/for-4.8/drivers and has received a positive build success notification from the kbuild robot across several configs. [1]: "gendisk: Generate uevent after attribute available" http://marc.info/?l=linux-virtualization&m=146725201522201&w=2"
2016-07-08nvme-rdma: add a NVMe over Fabrics RDMA host driverChristoph Hellwig
This patch implements the RDMA host (initiator in SCSI speak) driver. It can be used to connect to remote NVMe over Fabrics controllers over Infiniband, RoCE or iWarp, and uses the existing NVMe core driver as well a the new fabrics library. To connect to all NVMe over Fabrics controller reachable on a given taget port using RDMA/CM use the following command: nvme connect-all -t rdma -a $IPADDR This requires the latest version of nvme-cli with Fabrics support. Signed-off-by: Jay Freyensee <james.p.freyensee@intel.com> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-08nvmet-rdma: add a NVMe over Fabrics RDMA target driverChristoph Hellwig
This patch implements the RDMA transport for the NVMe over Fabrics target, which allows exporting NVMe over Fabrics functionality over RDMA fabrics (Infiniband, RoCE, iWARP). All NVMe logic is in the generic target and this module just provides a small glue between it and the generic code in the RDMA subsystem. Signed-off-by: Armen Baloyan <armenx.baloyan@intel.com>, Signed-off-by: Jay Freyensee <james.p.freyensee@intel.com> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-08nvme-rdma.h: Add includes for nvme rdma_cm negotiationSagi Grimberg
NVMe over Fabrics RDMA transport defines a connection establishment protocol over the RDMA connection manager. This header will be used by both the host and target drivers to negotiate the connection establishment parameters. Signed-off-by: Jay Freyensee <james.p.freyensee@intel.com> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-08nvme: add new reconnecting controller stateChristoph Hellwig
The nvme fabric (RDMA, FC, etc...) can introduce port, link or node failures that may require a reconnect to re-establish the connection. Add a new reconnecting state that will initially be used by the RDMA driver. Reviewed-by: Jay Freyensee <james.p.freyensee@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-08blk-mq: Introduce blk_mq_reinit_tagsetSagi Grimberg
The new nvme-rdma driver will need to reinitialize all the tags as part of the error recovery procedure (realloc the tag memory region). Add a helper in blk-mq for it that can iterate over all requests in a tagset to make this easier. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Ming Lin <ming.l@ssi.samsung.com> Reviewed-by: Stephen Bates <Stephen.Bates@pmcs.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07lightnvm: make __nvm_submit_ppa staticMatias Bjørling
The __nvm_submit_ppa() function is not used outside lightnvm core. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07lightnvm: make ppa_list const in nvm_set_rqd_listMatias Bjørling
The passed by reference ppa list in nvm_set_rqd_list() is updated when multiple planes are available. In that case, each PPA plane is incremented when the device side PPA list is created. This prevents the caller to rely on the PPA list to be unmodified after a call. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07lightnvm: fix lun offset calculation for mark blkMatias Bjørling
The gen_mark_blk_bad function marks the wrong block when a block is on a different channel. Fix the index calculation, so that it updates the correct block. Reported-by: Javier Gonzalez <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07lightnvm: make rrpc_map_page call nvm_get_blk outside locksMatias Bjørling
The nvm_get_blk() function is called with rlun->lock held. This is ok when the media manager implementation doesn't go out of its atomic context. However, if a media manager persists its metadata, and guarantees that the block is given to the target, this is no longer a viable approach. Therefore, clean up the flow of rrpc_map_page, and make sure that nvm_get_blk() is called without any locks acquired. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07lightnvm: remove _unlocked variant of [get/put]_blkMatias Bjørling
The [get/put]_blk API enables targets to get ownership of blocks at runtime. This information is currently not recorded on disk, and the information is therefore lost on power failure. To restore the metadata, the [get/put]_blk must persist its metadata. In that case, we need to control the outer lock, so that we can disable them while updating the on-disk metadata. Fortunately, the _unlocked versions can be removed, which allows us to move the lock into the [get/put]_blk functions. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07lightnvm: remove unused lists from struct rrpc_blockMatias Bjørling
The ->list, ->open_list, and ->closed_list lists were previously used for statistics. However, their usage have been removed, and thus these can safely be removed. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07lightnvm: remove nested lock conflict with mmMatias Bjørling
If a media manager tries to initialize it targets upon media manager initialization, the media manager will need to know which target types are available in LightNVM. The lists of which managers and target types are available shares the same lock. Therefore, on initialization, the nvm_lock is taken by LightNVM core, which later leads to a deadlock when target types are enumerated by the media manager. Add an exclusive lock for target types to resolve this conflict. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07lightnvm: move target mgmt into media mgrMatias Bjørling
To enable persistent block management to easily control creation and removal of targets, we move target management into the media manager. The LightNVM core continues to maintain which target types are registered, while the media manager now keeps track of its initialized targets. Two new callbacks for the media manager are introduced. create_tgt and remove_tgt. Note that remove_tgt returns 0 on successfully removing a target, and returns 1 if the target was not found. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07lightnvm: rename gennvm and update descriptionMatias Bjørling
The generic manager should be called the general media manager, and instead of using the rather long name of "gennvm" in front of each data structures, use "gen" instead to shorten it. Update the description of the media manager as well to make the media manager purpose clearer. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07lightnvm: remove open/close statistics for gennvmMatias Bjørling
The responsibility of the media manager is not to keep track of open/closed blocks. This is better maintained within a target, that already manages this information on writes. Remove the statistics and merge the states NVM_BLK_ST_OPEN and NVM_BLK_ST_CLOSED. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07lightnvm: fix checkpatch terse errorsMatias Bjørling
A couple of small checkpatch fixups to stop it from complaining. ./drivers/lightnvm/core.c:360: WARNING: line over 80 characters ./drivers/lightnvm/core.c:360: ERROR: trailing statements should be on next line ./drivers/lightnvm/core.c:503: WARNING: Block comments use a trailing */ on a separate line Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07lightnvm: remove checkpatch warning for unsigned intsMatias Bjørling
Checkpatch found two incidents where the type was preferred to be written out in full. ./drivers/lightnvm/rrpc.h:184: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' ./drivers/lightnvm/rrpc.h:209: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' ./drivers/lightnvm/rrpc.c:51: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07lightnvm: Make functions not used by ouside staticJohannes Thumshirn
Mark functions not used by ouside of thier implementing file as static. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07nvme: lightnvm: make MLC num_pairs little endianJohannes Thumshirn
According to the OpenChannel SSD interface specification the NAND flash MLC page pairing information's number of page page pairings field is the first two bytes in the MLC Page Pairing data structure. The hardware's data structure itself is little endian so annotate it as such, like the rest of lighnvm's data structures. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07lightnvm: initialize ppa_addr in dev_to_generic_addr()Javier González
The ->reserved bit is not initialized when allocated on stack. This may lead targets to misinterpret the PPA as cached. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07lightnvm: add media manager mark_blk helperJavier González
Expose media manager mark_blk() to targets, as done for the rest of the media manager callback functions. Signed-off-by: Javier González <javier@cnexlabs.com> Updated description Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07lightnvm: break the loop when rqd is not nullWenwei Tao
Break the loop when rqd is not null to reduce an unnecessary schedule. Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07nvmet: fix an error codeDan Carpenter
We accidentally return zero here when ERR_PTR(-ENOMEM) is intended. Fixes: a07b4970f464 ('nvmet: add a generic NVMe target') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07nvme-loop: add configfs dependencyArnd Bergmann
CONFIG_NVME_TARGET has a correct CONFIG_CONFIGFS_FS dependency, but the newly added NVME_TARGET_LOOP is missing this, resulting in a link failure: drivers/nvme/built-in.o: In function `nvmet_init_configfs': loop.c:(.init.text+0x2a0): undefined reference to `config_group_init' loop.c:(.init.text+0x2c0): undefined reference to `config_group_init_type_name' loop.c:(.init.text+0x318): undefined reference to `configfs_register_subsystem' drivers/nvme/built-in.o: In function `nvmet_exit_configfs': loop.c:(.exit.text+0x9c): undefined reference to `configfs_unregister_subsystem' This adds the same dependency here. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 3a85a5de29ea ("nvme-loop: add a NVMe loopback host driver") Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-05bcache: Remove redundant block_size assignmentYijing Wang
We have assigned sb->block_size before the switch, so remove the redundant one. Reviewed-by: Coly Li <colyli@suse.de> Signed-off-by: Yijing Wang <wangyijing@huawei.com> Acked-by: Eric Wheeler <bcache@lists.ewheeler.net> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-05bcache: update document infoYijing Wang
There is no return in continue_at(), update the documentation. Signed-off-by: Yijing Wang <wangyijing@huawei.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-05bcache: Remove redundant parameter for cache_alloc()Yijing Wang
Cache_sb is not used in cache_alloc, and we have copied sb info to cache->sb already, remove it. Reviewed-by: Coly Li <colyli@suse.de> Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-05nvme-loop: add a NVMe loopback host driverChristoph Hellwig
This patch implements adds nvme-loop which allows to access local devices exported as NVMe over Fabrics namespaces. This module can be useful for easy evaluation, testing and also feature experimentation. To createa nvme-loop device you need to configure the NVMe target to export a loop port (see the nvmetcli documentaton for that) and then connect to it using nvme connect-all -t loop which requires the very latest nvme-cli version with Fabrics support. Signed-off-by: Jay Freyensee <james.p.freyensee@intel.com> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-05nvmet: add a generic NVMe targetChristoph Hellwig
This patch introduces a implementation of NVMe subsystems, controllers and discovery service which allows to export NVMe namespaces across fabrics such as Ethernet, FC etc. The implementation conforms to the NVMe 1.2.1 specification and interoperates with NVMe over fabrics host implementations. Configuration works using configfs, and is best performed using the nvmetcli tool from http://git.infradead.org/users/hch/nvmetcli.git, which also has a detailed explanation of the required steps in the README file. Signed-off-by: Armen Baloyan <armenx.baloyan@intel.com> Signed-off-by: Anthony Knapp <anthony.j.knapp@intel.com> Signed-off-by: Jay Freyensee <james.p.freyensee@intel.com> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-05block: Export blk_pollSagi Grimberg
The new NVMe over fabrics target will make use of this outside from a module. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-05nvme: add keep-alive supportSagi Grimberg
Periodic keep-alive is a mandatory feature in NVMe over Fabrics, and optional in NVMe 1.2.1 for PCIe. This patch adds periodic keep-alive sent from the host to verify that the controller is still responsive and vice-versa. The keep-alive timeout is user-defined (with keep_alive_tmo connection parameter) and defaults to 5 seconds. In order to avoid a race condition where the host sends a keep-alive competing with the target side keep-alive timeout expiration, the host adds a grace period of 10 seconds when publishing the keep-alive timeout to the target. In case a keep-alive failed (or timed out), a transport specific error recovery kicks in. For now only NVMe over Fabrics is wired up to support keep alive, but we can add PCIe support easily once controllers actually supporting it become available. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Steve Wise <swise@chelsio.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-05nvme.h: Add keep-alive opcode and identify controller attributeSagi Grimberg
KAS: keep-alive support and granularity of kato in units of 100 ms nvme_admin_keep_alive opcode: 0x18 Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-05nvme-fabrics: add a generic NVMe over Fabrics libraryChristoph Hellwig
The NVMe over Fabrics library provides an interface for both transports and the nvme core to handle fabrics specific commands and attributes independent of the underlying transport. In addition, the fabrics library adds a misc device interface that allow actually creating a fabrics controller, as we can't just autodiscover it like in the PCI case. The nvme-cli utility has been enhanced to use this interface to support fabric connect and discovery. Signed-off-by: Armen Baloyan <armenx.baloyan@intel.com>, Signed-off-by: Jay Freyensee <james.p.freyensee@intel.com>, Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-05nvme.h: add NVMe over Fabrics definitionsChristoph Hellwig
The NVMe over Fabrics specification defines a protocol interface and related extensions to NVMe that enable operation over network protocols. The NVMe over Fabrics specification has an NVMe Transport binding for each NVMe Transport. This patch adds the fabrics related definitions: - fabric specific command set and error codes - transport addressing and binding definitions - fabrics sgl extensions - controller identification fabrics enhancements - discovery log page definition Signed-off-by: Armen Baloyan <armenx.baloyan@intel.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Jay Freyensee <james.p.freyensee@intel.com> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-05nvme: add fabrics sysfs attributesMing Lin
- delete_controller: This attribute allows to delete a controller. A driver is not obligated to support it (pci doesn't) so it is created only if the driver supports it. The new fabrics drivers will support it (essentialy a disconnect operation). Usage: echo > /sys/class/nvme/nvme0/delete_controller - subsysnqn: This attribute shows the subsystem nqn of the configured device. If a driver does not implement the get_subsysnqn method, the file will not appear in sysfs. - transport: This attribute shows the transport name. Added a "name" field to struct nvme_ctrl_ops. For loop, cat /sys/class/nvme/nvme0/transport loop For RDMA, cat /sys/class/nvme/nvme0/transport rdma For PCIe, cat /sys/class/nvme/nvme0/transport pcie - address: This attributes shows the controller address. The fabrics drivers that will implement get_address can show the address of the connected controller. example: cat /sys/class/nvme/nvme0/address traddr=192.168.2.2,trsvcid=1023 Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Reviewed-by: Jay Freyensee <james.p.freyensee@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-05nvme: Modify and export sync command submission for fabricsChristoph Hellwig
NVMe over fabrics will use __nvme_submit_sync_cmd in the the transport and require a few tweaks to it. For that we export it and add a few more paramters: 1. allow passing a queue ID to the block layer For the NVMe over Fabrics connect command we need to able to specify a queue ID that we want to send the command on. Add a qid parameter to the relevant functions to enable this behavior. 2. allow submitting at_head commands In cases where we want to (re)connect to a controller where we have inflight queued commands we want to first connect and only then allow the other queued commands to be kicked. This will prevents failures in controller resets and reconnects. 3. allow passing flags to blk_mq_allocate_request Both for Fabrics connect the the keep-alive feature in NVMe 1.2.1 we want to be able to use reserved requests. Reviewed-by: Jay Freyensee <james.p.freyensee@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-05nvme: allow transitioning from NEW to LIVE stateChristoph Hellwig
For Fabrics we're not going through an intermediate reset state (at least for now). Reviewed-by: Jay Freyensee <james.p.freyensee@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-05blk-mq: add blk_mq_alloc_request_hctxMing Lin
For some protocols like NVMe over Fabrics we need to be able to send initialization commands to a specific queue. Based on an earlier patch from Christoph Hellwig <hch@lst.de>. Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> [hch: disallow sleeping allocation, req_op fixes] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>