aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-10-18mtip32xx: fully switch to the generic DMA APIChristoph Hellwig
The mtip32xx used an odd mix of the old PCI and the generic DMA API, so switch it over to the generic API entirely. Note that this also removes a weird fallback to just a 32-bit coherent dma mask if the 64-bit dma mask doesn't work, as that can't even happen. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-18rsxx: switch to the generic DMA APIChristoph Hellwig
The PCI DMA API is deprecated, switch to the generic DMA API instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-18umem: switch to the generic DMA APIChristoph Hellwig
The PCI DMA API is deprecated, switch to the generic DMA API instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-18sx8: switch to the generic DMA APIChristoph Hellwig
The PCI DMA API is deprecated, switch to the generic DMA API instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-18sx8: remove dead IF_64BIT_DMA_IS_POSSIBLE codeChristoph Hellwig
This code has effectively been commented out since the first commit, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-18skd: switch to the generic DMA APIChristoph Hellwig
The PCI DMA API is deprecated, switch to the generic DMA API instead. Also make use of the dma_set_mask_and_coherent helper to easily set the streaming an coherent DMA masks together. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-18ubd: remove use of blk_rq_map_sgChristoph Hellwig
There is no good reason to create a scatterlist in the ubd driver, it can just iterate the request directly. Signed-off-by: Christoph Hellwig <hch@lst.de> [rw: Folded in improvements as discussed with hch and jens] Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-17drivers/block: Remove DAC960 driverHannes Reinecke
The DAC960 driver has been obsoleted by the myrb/myrs drivers, so it can be dropped. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16sx8: convert to blk-mqJens Axboe
Convert from the old request_fn style driver to blk-mq. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16z2ram: convert to blk-mqJens Axboe
Straight forward conversion to blk-mq, nothing special about this driver. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16gdrom: convert to blk-mqJens Axboe
Ditch the deffered list, lock, and workqueue handling. Just mark the set as being blocking, so we are invoked from a workqueue already. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16floppy: convert to blk-mqOmar Sandoval
This driver likes to fetch requests from all over the place, so make queue_rq put requests on a list so that the logic stays the same. Tested with QEMU. Signed-off-by: Omar Sandoval <osandov@fb.com> Converted to blk_mq_init_sq_queue() and fixed a few spots where the tag_set leaked on cleanup. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16ataflop: convert to blk-mqOmar Sandoval
This driver is already pretty broken, in that it has two wait_events() (one in stdma_lock()) in request_fn. Get rid of the first one by freezing/quiescing the queue on format, and the second one by replacing it with stdma_try_lock(). The rest is straightforward. Compile-tested only and probably incorrect. Cc: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Converted to blk_mq_init_sq_queue() Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16ataflop: fix error handling during setupOmar Sandoval
Move queue allocation next to disk allocation to fix a couple of issues: - If add_disk() hasn't been called, we should clear disk->queue before calling put_disk(). - If we fail to allocate a request queue, we still need to put all of the disks, not just the ones that we allocated queues for. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16ataflop: fold headers into C fileOmar Sandoval
atafd.h and atafdreg.h are only used from ataflop.c, so merge them in there. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16amiflop: convert to blk-mqOmar Sandoval
Straightforward conversion, just use the existing amiflop_lock to serialize access to the controller. Compile-tested only. Cc: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Converted to blk_mq_init_sq_queue() Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16amiflop: clean up on errors during setupOmar Sandoval
The error handling in fd_probe_drives() doesn't clean up at all. Fix it up in preparation for converting to blk-mq. While we're here, get rid of the commented out amiga_floppy_remove(). Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16amiflop: fold headers into C fileOmar Sandoval
amifd.h and amifdreg.h are only used from amiflop.c, and they're pretty small, so move the contents to amiflop.c and get rid of the .h files. This is preparation for adding a struct blk_mq_tag_set to struct amiga_floppy_struct. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16swim3: convert to blk-mqOmar Sandoval
Pretty simple conversion. grab_drive() could probably be replaced by some freeze/quiesce incantation, but I left it alone, and just used freeze/quiesce for eject. Compile-tested only. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Omar Sandoval <osandov@fb.com> Converted to blk_mq_init_sq_queue(). Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16swim3: add real error handling in setupOmar Sandoval
The driver doesn't have support for removing a device that has already been configured, but with more careful ordering we can avoid the need for that and make sure that we don't leak generic resources. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16swim: convert to blk-mqOmar Sandoval
The only interesting thing here is that there may be two floppies (i.e., request queues) sharing the same controller, so we use the global struct swim_priv->lock to check whether the controller is busy. Compile-tested only. Tested-by: Finn Thain <fthain@telegraphics.com.au> Acked-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Converted to blk_mq_init_sq_queue() Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16swim: fix cleanup on setup errorOmar Sandoval
If we fail to allocate the request queue for a disk, we still need to free that disk, not just the previous ones. Additionally, we need to cleanup the previous request queues. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16mtd_blkdevs: convert to blk-mqJens Axboe
Straight forward conversion, using an internal list to enable the driver to pull requests at will. Dynamically allocate the tag set to avoid having to pull in the block headers for blktrans.h, since various mtd drivers use block conflicting names for defines and functions. Cc: David Woodhouse <dwmw2@infradead.org> Cc: linux-mtd@lists.infradead.org Tested-by: Richard Weinberger <richard@nod.at> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-15xsysace: convert to blk-mqJens Axboe
Straight forward conversion, using an internal list to enable the driver to pull requests at will. Acked-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-15paride: convert pf to blk-mqJens Axboe
Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-15paride: convert pd to blk-mqJens Axboe
Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-15paride: convert pcd to blk-mqJens Axboe
Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-15ps3disk: convert to blk-mqJens Axboe
Convert from the old request_fn style driver to blk-mq. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-15blk-mq: provide helper for setting up an SQ queue and tag setJens Axboe
This pattern is repeated throughout all the blk-mq conversions. Provide a basic helper to get it done. Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-15null_blk: remove set but not used variable 'q'YueHaibing
Fixes gcc '-Wunused-but-set-variable' warning: drivers/block/null_blk_main.c: In function 'end_cmd': drivers/block/null_blk_main.c:609:24: warning: variable 'q' set but not used [-Wunused-but-set-variable] It not used any more after commit e50b1e327aeb ("null_blk: remove legacy IO path") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-14cdrom: don't attempt to fiddle with cdo->capabilityJens Axboe
We can't modify cdo->capability as it is defined as a const. Change the modification hack to just WARN_ON_ONCE() if we hit any of the invalid combinations. This fixes a regression for pcd, which doesn't work after the constify patch. Fixes: 853fe1bf7554 ("cdrom: Make device operations read-only") Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-14block: remove bogus check for queue_lock assignmentJens Axboe
We just allocated the queue and haven't even set it up yet, hence we know that checking if ->mq_ops is NULL is always going to be true. In fact we do need to assign a lock to ->queue_lock always, as we need it for the queue flags modifications. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-14null_blk: remove legacy IO pathJens Axboe
We're planning on removing this code completely, kill the old path. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-14um: Convert ubd driver to blk-mqRichard Weinberger
Convert the driver to the modern blk-mq framework. As byproduct we get rid of our open coded restart logic and let blk-mq handle it. Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-14skd: fixup usage of legacy IO APIJens Axboe
We need to be using the mq variant of request requeue here. Fixes: ca33dd92968b ("skd: Convert to blk-mq") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-14aoe: convert aoeblk to blk-mqJens Axboe
Straight forward conversion - instead of rewriting the internal buffer retrieval logic, just replace the previous elevator peeking with an internal list of requests. Reviewed-by: "Ed L. Cashin" <ed.cashin@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-13blk-mq: fallback to previous nr_hw_queues when updating failsJianchao Wang
When we try to increate the nr_hw_queues, we may fail due to shortage of memory or other reason, then blk_mq_realloc_hw_ctxs stops and some entries in q->queue_hw_ctx are left with NULL. However, because queue map has been updated with new nr_hw_queues, some cpus have been mapped to hw queue which just encounters allocation failure, thus blk_mq_map_queue could return NULL. This will cause panic in following blk_mq_map_swqueue. To fix it, when increase nr_hw_queues fails, fallback to previous nr_hw_queues and post warning. At the same time, driver's .map_queues usually use completion irq affinity to map hw and cpu, fallback nr_hw_queues will cause lack of some cpu's map to hw, so use default blk_mq_map_queues to do that. Reported-by: syzbot+83e8cbe702263932d9d4@syzkaller.appspotmail.com Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-13blk-mq: realloc hctx when hw queue is mapped to another nodeJianchao Wang
When the hw queues and mq_map are updated, a hctx could be mapped to a different numa node. At this moment, we need to realloc the hctx. If fail to do that, go on using previous hctx. Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-13blk-mq: change gfp flags to GFP_NOIO in blk_mq_realloc_hw_ctxsJianchao Wang
blk_mq_realloc_hw_ctxs could be invoked during update hw queues. At the momemt, IO is blocked. Change the gfp flags from GFP_KERNEL to GFP_NOIO to avoid forever hang during memory allocation in blk_mq_realloc_hw_ctxs. Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-13blk-mq: adjust debugfs and sysfs register when updating nr_hw_queuesJianchao Wang
blk-mq debugfs and sysfs entries need to be removed before updating queue map, otherwise, we get get wrong result there. This patch fixes it and remove the redundant debugfs and sysfs register/unregister operations during __blk_mq_update_nr_hw_queues. Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-13block, bfq: improve asymmetric scenarios detectionFederico Motta
bfq defines as asymmetric a scenario where an active entity, say E (representing either a single bfq_queue or a group of other entities), has a higher weight than some other entities. If the entity E does sync I/O in such a scenario, then bfq plugs the dispatch of the I/O of the other entities in the following situation: E is in service but temporarily has no pending I/O request. In fact, without this plugging, all the times that E stops being temporarily idle, it may find the internal queues of the storage device already filled with an out-of-control number of extra requests, from other entities. So E may have to wait for the service of these extra requests, before finally having its own requests served. This may easily break service guarantees, with E getting less than its fair share of the device throughput. Usually, the end result is that E gets the same fraction of the throughput as the other entities, instead of getting more, according to its higher weight. Yet there are two other more subtle cases where E, even if its weight is actually equal to or even lower than the weight of any other active entities, may get less than its fair share of the throughput in case the above I/O plugging is not performed: 1. other entities issue larger requests than E; 2. other entities contain more active child entities than E (or in general tend to have more backlog than E). In the first case, other entities may get more service than E because they get larger requests, than those of E, served during the temporary idle periods of E. In the second case, other entities get more service because, by having many child entities, they have many requests ready for dispatching while E is temporarily idle. This commit addresses this issue by extending the definition of asymmetric scenario: a scenario is asymmetric when - active entities representing bfq_queues have differentiated weights, as in the original definition or (inclusive) - one or more entities representing groups of entities are active. This broader definition makes sure that I/O plugging will be performed in all the above cases, provided that there is at least one active group. Of course, this definition is very coarse, so it will trigger I/O plugging also in cases where it is not needed, such as, e.g., multiple active entities with just one child each, and all with the same I/O-request size. The reason for this coarse definition is just that a finer-grained definition would be rather heavy to compute. On the opposite end, even this new definition does not trigger I/O plugging in all cases where there is no active group, and all bfq_queues have the same weight. So, in these cases some unfairness may occur if there are asymmetries in I/O-request sizes. We made this choice because I/O plugging may lower throughput, and probably a user that has not created any group cares more about throughput than about perfect fairness. At any rate, as for possible applications that may care about service guarantees, bfq already guarantees a high responsiveness and a low latency to soft real-time applications automatically. Signed-off-by: Federico Motta <federico@willer.it> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-11cfq: clear queue pointers from cfqg after unpinning them in cfq_pd_offlineMaciej S. Szmigiero
BFQ is already doing a similar thing in its .pd_offline_fn() method implementation. While it seems that after commit 4c6994806f70 ("blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()") was reverted leaving these pointers intact no longer causes crashes clearing them is still a sensible thing to do to make the code more robust. Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-11block: describe difference between flags IO_STAT and STATSKonstantin Khlebnikov
This adds reasonable comments, but they definitely needs better names. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-10drivers/block: remove redundant 'default n' from Kconfig-sBartlomiej Zolnierkiewicz
'default n' is the default value for any bool or tristate Kconfig setting so there is no need to write it explicitly. Also since commit f467c5640c29 ("kconfig: only write '# CONFIG_FOO is not set' for visible symbols") the Kconfig behavior is the same regardless of 'default n' being present or not: ... One side effect of (and the main motivation for) this change is making the following two definitions behave exactly the same: config FOO bool config FOO bool default n With this change, neither of these will generate a '# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied). That might make it clearer to people that a bare 'default n' is redundant. ... Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-10block: remove redundant 'default n' from Kconfig-sBartlomiej Zolnierkiewicz
'default n' is the default value for any bool or tristate Kconfig setting so there is no need to write it explicitly. Also since commit f467c5640c29 ("kconfig: only write '# CONFIG_FOO is not set' for visible symbols") the Kconfig behavior is the same regardless of 'default n' being present or not: ... One side effect of (and the main motivation for) this change is making the following two definitions behave exactly the same: config FOO bool config FOO bool default n With this change, neither of these will generate a '# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied). That might make it clearer to people that a bare 'default n' is redundant. ... Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-09lightnvm: pblk: guarantee that backpointer is respected on writer stallJavier González
pblk's write buffer must guarantee that it respects the device's constrains for reads (i.e., mw_cunits). This is done by maintaining a backpointer that updates the L2P table as entries wrap up, making them point to the media instead of pointing to the write buffer. This mechanism can race in case that the write thread stalls, as the write pointer will protect the last written entry, thus disregarding the read constrains. This patch adds an extra check on wrap up, making sure that the threshold is respected at all times, preventing new entries to overwrite committed data, also in case of write thread stall. Reported-by: Heiner Litz <hlitz@ucsc.edu> Signed-off-by: Javier González <javier@cnexlabs.com> Reviewed-by: Heiner Litz <hlitz@ucsc.edu> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-09lightnvm: pblk: consider max hw sectors supported for max_write_pgsZhoujie Wu
When do GC, the number of read/write sectors are determined by max_write_pgs(see gc_rq preparation in pblk_gc_line_prepare_ws). Due to max_write_pgs doesn't consider max hw sectors supported by nvme controller(128K), which leads to GC tries to read 64 * 4K in one command, and see below error caused by pblk_bio_map_addr in function pblk_submit_read_gc. [ 2923.005376] pblk: could not add page to bio [ 2923.005377] pblk: could not allocate GC bio (18446744073709551604) Signed-off-by: Zhoujie Wu <zjwu@marvell.com> Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-09lightnvm: pblk: fix error handling of pblk_lines_init()Wei Yongjun
In the too many bad blocks error handling case, we should release all the allocated resources, otherwise it will cause memory leak. Fixes: 2deeefc02dff ("lightnvm: pblk: fail gracefully on line alloc. failure") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-09lightnvm: do no update csecs and sos on 1.2Javier González
1.2 devices exposes their data and metadata size through the separate identify command. Make sure that the NVMe LBA format does not override these values. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-09lightnvm: pblk: guarantee mw_cunits on read bufferJavier González
OCSSD 2.0 defines the amount of data that the host must buffer per chunk to guarantee reads through the geometry field mw_cunits. This value is the base that pblk uses to determine the size of its read buffer. Currently, this size is set to be the closes power-of-2 to mw_cunits times the number of parallel units available to the pblk instance for each open line (currently one). When an entry (4KB) is put in the buffer, the L2P table points to it. As the buffer wraps up, the L2P is updated to point to addresses on the device, thus guaranteeing mw_cunits at a chunk level. However, given that pblk cannot write to the device under ws_min (normally ws_opt), there might be a window in which the buffer starts wrapping up and updating L2P entries before the mw_cunits value in a chunk has been surpassed. In order not to violate the mw_cunits constrain in this case, account for ws_opt on the read buffer creation. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>