- 11 Sep, 2019 1 commit
-
-
Michael S. Tsirkin authored
iovec addresses coming from vhost are assumed to be pre-validated, but in fact can be speculated to a value out of range. Userspace address are later validated with array_index_nospec so we can be sure kernel info does not leak through these addresses, but vhost must also not leak userspace info outside the allowed memory table to guests. Following the defence in depth principle, make sure the address is not validated out of node range. Signed-off-by:
Michael S. Tsirkin <mst@redhat.com> Cc: stable@vger.kernel.org Acked-by:
Jason Wang <jasowang@redhat.com> Tested-by:
Jason Wang <jasowang@redhat.com>
-
- 04 Sep, 2019 2 commits
-
-
Michael S. Tsirkin authored
This reverts commit 7f466032 ("vhost: access vq metadata through kernel virtual address"). The commit caused a bunch of issues, and while commit 73f628ec ("vhost: disable metadata prefetch optimization") disabled the optimization it's not nice to keep lots of dead code around. Signed-off-by:
Michael S. Tsirkin <mst@redhat.com>
-
Yunsheng Lin authored
It is unnecessary to use ret variable to return the error code, just return the error code directly. Signed-off-by:
Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com>
-
- 19 Jun, 2019 1 commit
-
-
Thomas Gleixner authored
Based on 1 normalized pattern(s): this work is licensed under the terms of the gnu gpl version 2 extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 48 file(s). Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Reviewed-by:
Allison Randal <allison@lohutok.net> Reviewed-by:
Enrico Weigelt <info@metux.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081204.624030236@linutronix.de Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 08 Jun, 2019 1 commit
-
-
Mauro Carvalho Chehab authored
Mostly due to x86 and acpi conversion, several documentation links are still pointing to the old file. Fix them. Signed-off-by:
Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Reviewed-by:
Wolfram Sang <wsa@the-dreams.de> Reviewed-by:
Sven Van Asbroeck <TheSven73@gmail.com> Reviewed-by:
Bhupesh Sharma <bhsharma@redhat.com> Acked-by:
Mark Brown <broonie@kernel.org> Signed-off-by:
Jonathan Corbet <corbet@lwn.net>
-
- 06 Jun, 2019 1 commit
-
-
Jason Wang authored
It was noticed that the copy_to/from_user() friends that was used to access virtqueue metdata tends to be very expensive for dataplane implementation like vhost since it involves lots of software checks, speculation barriers, hardware feature toggling (e.g SMAP). The extra cost will be more obvious when transferring small packets since the time spent on metadata accessing become more significant. This patch tries to eliminate those overheads by accessing them through direct mapping of those pages. Invalidation callbacks is implemented for co-operation with general VM management (swap, KSM, THP or NUMA balancing). We will try to get the direct mapping of vq metadata before each round of packet processing if it doesn't exist. If we fail, we will simplely fallback to copy_to/from_user() friends. This invalidation and direct mapping access are synchronized through spinlock and RCU. All matedata accessing through direct map is protected by RCU, and the setup or invalidation are done under spinlock. This method might does not work for high mem page which requires temporary mapping so we just fallback to normal copy_to/from_user() and may not for arch that has virtual tagged cache since extra cache flushing is needed to eliminate the alias. This will result complex logic and bad performance. For those archs, this patch simply go for copy_to/from_user() friends. This is done by ruling out kernel mapping codes through ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE. Note that this is only done when device IOTLB is not enabled. We could use similar method to optimize IOTLB in the future. Tests shows at most about 23% improvement on TX PPS when using virtio-user + vhost_net + xdp1 + TAP on 2.6GHz Broadwell: SMAP on | SMAP off Before: 5.2Mpps | 7.1Mpps After: 6.4Mpps | 8.2Mpps Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: David Miller <davem@davemloft.net> Cc: Jerome Glisse <jglisse@redhat.com> Cc: linux-mm@kvack.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-parisc@vger.kernel.org Signed-off-by:
Jason Wang <jasowang@redhat.com> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com>
-
- 05 Jun, 2019 5 commits
-
-
Jason Wang authored
Factoring vring address and num setting which needs special care for accelerating vq metadata accessing. Signed-off-by:
Jason Wang <jasowang@redhat.com> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com>
-
Jason Wang authored
To avoid code duplication since it will be used by kernel VA prefetching. Signed-off-by:
Jason Wang <jasowang@redhat.com> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com>
-
Jason Wang authored
Rename the function to be more accurate since it actually tries to prefetch vq metadata address in IOTLB. And this will be used by following patch to prefetch metadata virtual addresses. Signed-off-by:
Jason Wang <jasowang@redhat.com> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com>
-
Jason Wang authored
This is used to hide the metadata address from virtqueue helpers. This will allow to implement a vmap based fast accessing to metadata. Signed-off-by:
Jason Wang <jasowang@redhat.com> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com>
-
Jason Wang authored
Use one generic vhost_copy_to_user() instead of two dedicated accessor. This will simplify the conversion to fine grain accessors. About 2% improvement of PPS were seen during vitio-user txonly test. Signed-off-by:
Jason Wang <jasowang@redhat.com> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com>
-
- 27 May, 2019 1 commit
-
-
Jason Wang authored
We used to have vhost_exceeds_weight() for vhost-net to: - prevent vhost kthread from hogging the cpu - balance the time spent between TX and RX This function could be useful for vsock and scsi as well. So move it to vhost.c. Device must specify a weight which counts the number of requests, or it can also specific a byte_weight which counts the number of bytes that has been processed. Signed-off-by:
Jason Wang <jasowang@redhat.com> Reviewed-by:
Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com>
-
- 14 May, 2019 1 commit
-
-
Ira Weiny authored
To facilitate additional options to get_user_pages_fast() change the singular write parameter to be gup_flags. This patch does not change any functionality. New functionality will follow in subsequent patches. Some of the get_user_pages_fast() call sites were unchanged because they already passed FOLL_WRITE or 0 for the write parameter. NOTE: It was suggested to change the ordering of the get_user_pages_fast() arguments to ensure that callers were converted. This breaks the current GUP call site convention of having the returned pages be the final parameter. So the suggestion was rejected. Link: http://lkml.kernel.org/r/20190328084422.29911-4-ira.weiny@intel.com Link: http://lkml.kernel.org/r/20190317183438.2057-4-ira.weiny@intel.com Signed-off-by:
Ira Weiny <ira.weiny@intel.com> Reviewed-by:
Mike Marshall <hubcap@omnibond.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Hogan <jhogan@kernel.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 11 Apr, 2019 1 commit
-
-
Jason Wang authored
We used to accept zero size iova range which will lead a infinite loop in translate_desc(). Fixing this by failing the request in this case. Reported-by:
<syzbot+d21e6e297322a900c128@syzkaller.appspotmail.com> Fixes: 6b1e6cc7 ("vhost: new device IOTLB API") Signed-off-by:
Jason Wang <jasowang@redhat.com> Acked-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 08 Mar, 2019 1 commit
-
-
Arnd Bergmann authored
On some architectures, the MMU can be disabled, leading to access_ok() becoming an empty macro that does not evaluate its size argument, which in turn produces an unused-variable warning: drivers/vhost/vhost.c:1191:9: error: unused variable 's' [-Werror,-Wunused-variable] size_t s = vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0; Mark the variable as __maybe_unused to shut up that warning. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Acked-by:
Jason Wang <jasowang@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 06 Mar, 2019 1 commit
-
-
Arnd Bergmann authored
On some architectures, the MMU can be disabled, leading to access_ok() becoming an empty macro that does not evaluate its size argument, which in turn produces an unused-variable warning: drivers/vhost/vhost.c:1191:9: error: unused variable 's' [-Werror,-Wunused-variable] size_t s = vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0; Mark the variable as __maybe_unused to shut up that warning. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com>
-
- 19 Feb, 2019 1 commit
-
-
Jason Wang authored
When fail, translate_desc() returns negative value, otherwise the number of iovs. So we should fail when the return value is negative instead of a blindly check against zero. Detected by CoverityScan, CID# 1442593: Control flow issues (DEADCODE) Fixes: cc5e7107 ("vhost: log dirty page correctly") Acked-by:
Michael S. Tsirkin <mst@redhat.com> Reported-by:
Stephen Hemminger <stephen@networkplumber.org> Signed-off-by:
Jason Wang <jasowang@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 29 Jan, 2019 1 commit
-
-
Jason Wang authored
After batched used ring updating was introduced in commit e2b3b35e ("vhost_net: batch used ring update in rx"). We tend to batch heads in vq->heads for more than one packet. But the quota passed to get_rx_bufs() was not correctly limited, which can result a OOB write in vq->heads. headcount = get_rx_bufs(vq, vq->heads + nvq->done_idx, vhost_len, &in, vq_log, &log, likely(mergeable) ? UIO_MAXIOV : 1); UIO_MAXIOV was still used which is wrong since we could have batched used in vq->heads, this will cause OOB if the next buffer needs more than 960 (1024 (UIO_MAXIOV) - 64 (VHOST_NET_BATCH)) heads after we've batched 64 (VHOST_NET_BATCH) heads: Acked-by:
Stefan Hajnoczi <stefanha@redhat.com> ============================================================================= BUG kmalloc-8k (Tainted: G B ): Redzone overwritten ----------------------------------------------------------------------------- INFO: 0x00000000fd93b7a2-0x00000000f0713384. First byte 0xa9 instead of 0xcc INFO: Allocated in alloc_pd+0x22/0x60 age=3933677 cpu=2 pid=2674 kmem_cache_alloc_trace+0xbb/0x140 alloc_pd+0x22/0x60 gen8_ppgtt_create+0x11d/0x5f0 i915_ppgtt_create+0x16/0x80 i915_gem_create_context+0x248/0x390 i915_gem_context_create_ioctl+0x4b/0xe0 drm_ioctl_kernel+0xa5/0xf0 drm_ioctl+0x2ed/0x3a0 do_vfs_ioctl+0x9f/0x620 ksys_ioctl+0x6b/0x80 __x64_sys_ioctl+0x11/0x20 do_syscall_64+0x43/0xf0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 INFO: Slab 0x00000000d13e87af objects=3 used=3 fp=0x (null) flags=0x200000000010201 INFO: Object 0x0000000003278802 @offset=17064 fp=0x00000000e2e6652b Fixing this by allocating UIO_MAXIOV + VHOST_NET_BATCH iovs for vhost-net. This is done through set the limitation through vhost_dev_init(), then set_owner can allocate the number of iov in a per device manner. This fixes CVE-2018-16880. Fixes: e2b3b35e ("vhost_net: batch used ring update in rx") Signed-off-by:
Jason Wang <jasowang@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 18 Jan, 2019 1 commit
-
-
Jason Wang authored
Vhost dirty page logging API is designed to sync through GPA. But we try to log GIOVA when device IOTLB is enabled. This is wrong and may lead to missing data after migration. To solve this issue, when logging with device IOTLB enabled, we will: 1) reuse the device IOTLB translation result of GIOVA->HVA mapping to get HVA, for writable descriptor, get HVA through iovec. For used ring update, translate its GIOVA to HVA 2) traverse the GPA->HVA mapping to get the possible GPA and log through GPA. Pay attention this reverse mapping is not guaranteed to be unique, so we should log each possible GPA in this case. This fix the failure of scp to guest during migration. In -next, we will probably support passing GIOVA->GPA instead of GIOVA->HVA. Fixes: 6b1e6cc7 ("vhost: new device IOTLB API") Reported-by:
Jintack Lim <jintack@cs.columbia.edu> Cc: Jintack Lim <jintack@cs.columbia.edu> Signed-off-by:
Jason Wang <jasowang@redhat.com> Acked-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 15 Jan, 2019 1 commit
-
-
Pavel Tikhomirov authored
We've failed to copy and process vhost_iotlb_msg so let userspace at least know about it. For instance before these patch the code below runs without any error: int main() { struct vhost_msg msg; struct iovec iov; int fd; fd = open("/dev/vhost-net", O_RDWR); if (fd == -1) { perror("open"); return 1; } iov.iov_base = &msg; iov.iov_len = sizeof(msg)-4; if (writev(fd, &iov,1) == -1) { perror("writev"); return 1; } return 0; } Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com>
-
- 04 Jan, 2019 1 commit
-
-
Linus Torvalds authored
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument of the user address range verification function since we got rid of the old racy i386-only code to walk page tables by hand. It existed because the original 80386 would not honor the write protect bit when in kernel mode, so you had to do COW by hand before doing any user access. But we haven't supported that in a long time, and these days the 'type' argument is a purely historical artifact. A discussion about extending 'user_access_begin()' to do the range checking resulted this patch, because there is no way we're going to move the old VERIFY_xyz interface to that model. And it's best done at the end of the merge window when I've done most of my merges, so let's just get this done once and for all. This patch was mostly done with a sed-script, with manual fix-ups for the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form. There were a couple of notable cases: - csky still had the old "verify_area()" name as an alias. - the iter_iov code had magical hardcoded knowledge of the actual values of VERIFY_{READ,WRITE} (not that they mattered, since nothing really used it) - microblaze used the type argument for a debug printout but other than those oddities this should be a total no-op patch. I tried to fix up all architectures, did fairly extensive grepping for access_ok() uses, and the changes are trivial, but I may have missed something. Any missed conversion should be trivially fixable, though. Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 13 Dec, 2018 2 commits
-
-
Jason Wang authored
This reverts commit 78139c94. We don't protect device IOTLB with vq mutex, which will lead e.g use after free for device IOTLB entries. And since we've switched to use mutex_trylock() in previous patch, it's safe to revert it without having deadlock. Fixes: commit 78139c94 ("net: vhost: lock the vqs one by one") Cc: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
Jason Wang <jasowang@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Jason Wang authored
We miss a write barrier that guarantees used idx is updated and seen before log. This will let userspace sync and copy used ring before used idx is update. Fix this by adding a barrier before log_write(). Fixes: 8dd014ad ("vhost-net: mergeable buffers support") Acked-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
Jason Wang <jasowang@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 03 Dec, 2018 1 commit
-
-
Jean-Philippe Brucker authored
Commit 78139c94 ("net: vhost: lock the vqs one by one") moved the vq lock to improve scalability, but introduced a possible deadlock in vhost-iotlb. vhost_iotlb_notify_vq() now takes vq->mutex while holding the device's IOTLB spinlock. And on the vhost_iotlb_miss() path, the spinlock is taken while holding vq->mutex. Since calling vhost_poll_queue() doesn't require any lock, avoid the deadlock by not taking vq->mutex. Fixes: 78139c94 ("net: vhost: lock the vqs one by one") Acked-by:
Jason Wang <jasowang@redhat.com> Acked-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 31 Oct, 2018 1 commit
-
-
Jason Wang authored
The idx in vhost_vring_ioctl() was controlled by userspace, hence a potential exploitation of the Spectre variant 1 vulnerability. Fixing this by sanitizing idx before using it to index d->vqs. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by:
Jason Wang <jasowang@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 27 Sep, 2018 1 commit
-
-
Tonghao Zhang authored
This patch changes the way that lock all vqs at the same, to lock them one by one. It will be used for next patch to avoid the deadlock. Signed-off-by:
Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by:
Jason Wang <jasowang@redhat.com> Signed-off-by:
Jason Wang <jasowang@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 26 Aug, 2018 1 commit
-
-
Jason Wang authored
We don't wakeup the virtqueue if the first byte of pending iova range is the last byte of the range we just got updated. This will lead a virtqueue to wait for IOTLB updating forever. Fixing by correct the check and wake up the virtqueue in this case. Fixes: 6b1e6cc7 ("vhost: new device IOTLB API") Reported-by:
Peter Xu <peterx@redhat.com> Signed-off-by:
Jason Wang <jasowang@redhat.com> Reviewed-by:
Peter Xu <peterx@redhat.com> Tested-by:
Peter Xu <peterx@redhat.com> Acked-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 08 Aug, 2018 1 commit
-
-
Jason Wang authored
We need to reset metadata cache during new IOTLB initialization, otherwise the stale pointers to previous IOTLB may be still accessed which will lead a use after free. Reported-by:
<syzbot+c51e6736a1bf614b3272@syzkaller.appspotmail.com> Fixes: f8894913 ("vhost: introduce O(1) vq metadata cache") Signed-off-by:
Jason Wang <jasowang@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 06 Aug, 2018 1 commit
-
-
Jason Wang authored
We use to have message like: struct vhost_msg { int type; union { struct vhost_iotlb_msg iotlb; __u8 padding[64]; }; }; Unfortunately, there will be a hole of 32bit in 64bit machine because of the alignment. This leads a different formats between 32bit API and 64bit API. What's more it will break 32bit program running on 64bit machine. So fixing this by introducing a new message type with an explicit 32bit reserved field after type like: struct vhost_msg_v2 { __u32 type; __u32 reserved; union { struct vhost_iotlb_msg iotlb; __u8 padding[64]; }; }; We will have a consistent ABI after switching to use this. To enable this capability, introduce a new ioctl (VHOST_SET_BAKCEND_FEATURE) for userspace to enable this feature (VHOST_BACKEND_F_IOTLB_V2). Fixes: 6b1e6cc7 ("vhost: new device IOTLB API") Signed-off-by:
Jason Wang <jasowang@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 12 Jun, 2018 3 commits
-
-
Kees Cook authored
The kmalloc() function has a 2-factor argument form, kmalloc_array(). This patch replaces cases of: kmalloc(a * b, gfp) with: kmalloc_array(a * b, gfp) as well as handling cases of: kmalloc(a * b * c, gfp) with: kmalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kmalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kmalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The tools/ directory was manually excluded, since it has its own implementation of kmalloc(). The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kmalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kmalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kmalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(u8) * COUNT + COUNT , ...) | kmalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kmalloc( - sizeof(char) * COUNT + COUNT , ...) | kmalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kmalloc + kmalloc_array ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kmalloc + kmalloc_array ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kmalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kmalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kmalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kmalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kmalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kmalloc(C1 * C2 * C3, ...) | kmalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kmalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kmalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kmalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kmalloc(sizeof(THING) * C2, ...) | kmalloc(sizeof(TYPE) * C2, ...) | kmalloc(C1 * C2 * C3, ...) | kmalloc(C1 * C2, ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - (E1) * E2 + E1, E2 , ...) | - kmalloc + kmalloc_array ( - (E1) * (E2) + E1, E2 , ...) | - kmalloc + kmalloc_array ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by:
Kees Cook <keescook@chromium.org>
-
Matthew Wilcox authored
Signed-off-by:
Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by:
Kees Cook <keescook@chromium.org>
-
Michael S. Tsirkin authored
struct vhost_msg within struct vhost_msg_node is copied to userspace. Unfortunately it turns out on 64 bit systems vhost_msg has padding after type which gcc doesn't initialize, leaking 4 uninitialized bytes to userspace. This padding also unfortunately means 32 bit users of this interface are broken on a 64 bit kernel which will need to be fixed separately. Fixes: CVE-2018-1118 Cc: stable@vger.kernel.org Reported-by:
Kevin Easton <kevin@guarana.org> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com> Reported-by:
<syzbot+87cfa083e727a224754b@syzkaller.appspotmail.com> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com>
-
- 26 May, 2018 1 commit
-
-
Christoph Hellwig authored
These abstract out calls to the poll method in preparation for changes in how we poll. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
- 25 May, 2018 1 commit
-
-
Jason Wang authored
DaeRyong Jeong reports a race between vhost_dev_cleanup() and vhost_process_iotlb_msg(): Thread interleaving: CPU0 (vhost_process_iotlb_msg) CPU1 (vhost_dev_cleanup) (In the case of both VHOST_IOTLB_UPDATE and VHOST_IOTLB_INVALIDATE) ===== ===== vhost_umem_clean(dev->iotlb); if (!dev->iotlb) { ret = -EFAULT; break; } dev->iotlb = NULL; The reason is we don't synchronize between them, fixing by protecting vhost_process_iotlb_msg() with dev mutex. Reported-by:
DaeRyong Jeong <threeearcat@gmail.com> Fixes: 6b1e6cc7 ("vhost: new device IOTLB API") Signed-off-by:
Jason Wang <jasowang@redhat.com> Acked-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 11 Apr, 2018 3 commits
-
-
Stefan Hajnoczi authored
Currently vhost *_access_ok() functions return int. This is error-prone because there are two popular conventions: 1. 0 means failure, 1 means success 2. -errno means failure, 0 means success Although vhost mostly uses #1, it does not do so consistently. umem_access_ok() uses #2. This patch changes the return type from int to bool so that false means failure and true means success. This eliminates a potential source of errors. Suggested-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Stefan Hajnoczi <stefanha@redhat.com> Acked-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Stefan Hajnoczi authored
Commit d65026c6 ("vhost: validate log when IOTLB is enabled") introduced a regression. The logic was originally: if (vq->iotlb) return 1; return A && B; After the patch the short-circuit logic for A was inverted: if (A || vq->iotlb) return A; return B; This patch fixes the regression by rewriting the checks in the obvious way, no longer returning A when vq->iotlb is non-NULL (which is hard to understand). Reported-by:
<syzbot+65a84dde0214b0387ccd@syzkaller.appspotmail.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by:
Stefan Hajnoczi <stefanha@redhat.com> Acked-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eric Auger authored
vhost_copy_to_user is used to copy vring used elements to userspace. We should use VHOST_ADDR_USED instead of VHOST_ADDR_DESC. Fixes: f8894913 ("vhost: introduce O(1) vq metadata cache") Signed-off-by:
Eric Auger <eric.auger@redhat.com> Acked-by:
Jason Wang <jasowang@redhat.com> Acked-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 29 Mar, 2018 1 commit
-
-
Jason Wang authored
Vq log_base is the userspace address of bitmap which has nothing to do with IOTLB. So it needs to be validated unconditionally otherwise we may try use 0 as log_base which may lead to pin pages that will lead unexpected result (e.g trigger BUG_ON() in set_bit_to_user()). Fixes: 6b1e6cc7 ("vhost: new device IOTLB API") Reported-by:
<syzbot+6304bf97ef436580fede@syzkaller.appspotmail.com> Signed-off-by:
Jason Wang <jasowang@redhat.com> Acked-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 27 Mar, 2018 1 commit
-
-
Jason Wang authored
We tried to remove vq poll from wait queue, but do not check whether or not it was in a list before. This will lead double free. Fixing this by switching to use vhost_poll_stop() which zeros poll->wqh after removing poll from waitqueue to make sure it won't be freed twice. Cc: Darren Kenny <darren.kenny@oracle.com> Reported-by:
<syzbot+c0272972b01b872e604a@syzkaller.appspotmail.com> Fixes: 2b8b328b ("vhost_net: handle polling errors when setting backend") Signed-off-by:
Jason Wang <jasowang@redhat.com> Reviewed-by:
Darren Kenny <darren.kenny@oracle.com> Acked-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 20 Mar, 2018 1 commit
-
-
Sonny Rao authored
Clang is particularly anal about signed vs unsigned comparisons and doesn't like the fact that some ioctl numbers set the MSB, so we get this error when trying to build vhost on aarch64: drivers/vhost/vhost.c:1400:7: error: overflow converting case value to switch condition type (3221794578 to 18446744072636378898) [-Werror, -Wswitch] case VHOST_GET_VRING_BASE: 3221794578 is 0xC008AF12 in hex 18446744072636378898 is 0xFFFFFFFFC008AF12 in hex Fix this by using unsigned ints in the function signature for vhost_vring_ioctl(). Signed-off-by:
Sonny Rao <sonnyrao@chromium.org> Reviewed-by:
Darren Kenny <darren.kenny@oracle.com> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com>
-