From 951fe1073f69d1fceaf2e02386ac17770999c0d0 Mon Sep 17 00:00:00 2001
From: Chen Qun
Date: Wed, 12 May 2021 21:54:37 +0800
Subject: [PATCH 1/3] blockjob: Fix crash with IOthread when block commit after
 snapshot

Currently, if the guest has workloads, the IO thread acquires the
aio_context lock before calling io_submit, which leads to a segmentation
fault when a block commit is done after a snapshot. For example:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f7c7d91f700 (LWP 99907)]
0x00005576d0f65aab in bdrv_mirror_top_pwritev at ../block/mirror.c:1437
1437 ../block/mirror.c: No such file or directory.
(gdb) p s->job
$17 = (MirrorBlockJob *) 0x0
(gdb) p s->stop
$18 = false

Call trace of IO thread:
0 0x00005576d0f65aab in bdrv_mirror_top_pwritev at ../block/mirror.c:1437
1 0x00005576d0f7f3ab in bdrv_driver_pwritev at ../block/io.c:1174
2 0x00005576d0f8139d in bdrv_aligned_pwritev at ../block/io.c:1988
3 0x00005576d0f81b65 in bdrv_co_pwritev_part at ../block/io.c:2156
4 0x00005576d0f8e6b7 in blk_do_pwritev_part at ../block/block-backend.c:1260
5 0x00005576d0f8e84d in blk_aio_write_entry at ../block/block-backend.c:1476
...

Switch to qemu main thread:
0 0x00007f903be704ed in __lll_lock_wait at /lib/../lib64/libpthread.so.0
1 0x00007f903be6bde6 in _L_lock_941 at /lib/../lib64/libpthread.so.0
2 0x00007f903be6bcdf in pthread_mutex_lock at /lib/../lib64/libpthread.so.0
3 0x0000564b21456889 in qemu_mutex_lock_impl at ../util/qemu-thread-posix.c:79
4 0x0000564b213af8a5 in block_job_add_bdrv at ../blockjob.c:224
5 0x0000564b213b00ad in block_job_create at ../blockjob.c:440
6 0x0000564b21357c0a in mirror_start_job at ../block/mirror.c:1622
7 0x0000564b2135a9af in commit_active_start at ../block/mirror.c:1867
8 0x0000564b2133d132 in qmp_block_commit at ../blockdev.c:2768
9 0x0000564b2141fef3 in qmp_marshal_block_commit at qapi/qapi-commands-block-core.c:346
10 0x0000564b214503c9 in do_qmp_dispatch_bh at ../qapi/qmp-dispatch.c:110
11 0x0000564b21451996 in aio_bh_poll at ../util/async.c:164
12 0x0000564b2146018e in aio_dispatch at ../util/aio-posix.c:381
13 0x0000564b2145187e in aio_ctx_dispatch at ../util/async.c:306
14 0x00007f9040239049 in g_main_context_dispatch at /lib/../lib64/libglib-2.0.so.0
15 0x0000564b21447368 in main_loop_wait at ../util/main-loop.c:232
16 0x0000564b21447368 in main_loop_wait at ../util/main-loop.c:255
17 0x0000564b21447368 in main_loop_wait at ../util/main-loop.c:531
18 0x0000564b212304e1 in qemu_main_loop at ../softmmu/runstate.c:721
19 0x0000564b20f7975e in main at ../softmmu/main.c:50

In the IO thread, when bdrv_mirror_top_pwritev() runs, the job is NULL and
the stop field is false. This means the MirrorBDSOpaque "s" object has not
been initialized yet. The object is initialized by block_job_create(), but
the initialization is stuck acquiring the lock.

In this situation, the IO thread reaches bdrv_mirror_top_pwritev(), which
means that the mirror-top node has already been inserted into the block
graph, but its bs->opaque->job is not initialized.

The root cause is that the qemu main thread releases and re-acquires the
lock while holding it. The IO thread takes the lock in the window after the
release, and the crash occurs.

Actually, in this situation, job->job.aio_context is not equal to
qemu_get_aio_context() and is the same as bs->aio_context, so there is no
need to release the lock, because bdrv_root_attach_child() will not change
the context.

This patch fixes this issue.

Fixes: 132ada80 "block: Adjust AioContexts when attaching nodes"

Signed-off-by: Michael Qiu
Message-Id: <20210203024059.52683-1-08005325@163.com>
Signed-off-by: Kevin Wolf
---
 ...sh-with-IOthread-when-block-commit-a.patch | 114 ++++++++++++++++++
 1 file changed, 114 insertions(+)
 create mode 100644 blockjob-Fix-crash-with-IOthread-when-block-commit-a.patch

diff --git a/blockjob-Fix-crash-with-IOthread-when-block-commit-a.patch b/blockjob-Fix-crash-with-IOthread-when-block-commit-a.patch
new file mode 100644
index 0000000..2efef72
--- /dev/null
+++ b/blockjob-Fix-crash-with-IOthread-when-block-commit-a.patch
@@ -0,0 +1,114 @@
+From e37cda3452309d147f1f7aec3c74249001e3db0c Mon Sep 17 00:00:00 2001
+From: Michael Qiu
+Date: Wed, 12 May 2021 21:54:37 +0800
+Subject: [PATCH] blockjob: Fix crash with IOthread when block commit after
+ snapshot
+
+Currently, if guest has workloads, IO thread will acquire aio_context
+lock before do io_submit, it leads to segmentfault when do block commit
+after snapshot. Just like below:
+
+Program received signal SIGSEGV, Segmentation fault.
+
+[Switching to Thread 0x7f7c7d91f700 (LWP 99907)]
+0x00005576d0f65aab in bdrv_mirror_top_pwritev at ../block/mirror.c:1437
+1437 ../block/mirror.c: No such file or directory.
+(gdb) p s->job
+$17 = (MirrorBlockJob *) 0x0
+(gdb) p s->stop
+$18 = false
+
+Call trace of IO thread:
+0 0x00005576d0f65aab in bdrv_mirror_top_pwritev at ../block/mirror.c:1437
+1 0x00005576d0f7f3ab in bdrv_driver_pwritev at ../block/io.c:1174
+2 0x00005576d0f8139d in bdrv_aligned_pwritev at ../block/io.c:1988
+3 0x00005576d0f81b65 in bdrv_co_pwritev_part at ../block/io.c:2156
+4 0x00005576d0f8e6b7 in blk_do_pwritev_part at ../block/block-backend.c:1260
+5 0x00005576d0f8e84d in blk_aio_write_entry at ../block/block-backend.c:1476
+...
+
+Switch to qemu main thread:
+0 0x00007f903be704ed in __lll_lock_wait at
+/lib/../lib64/libpthread.so.0
+1 0x00007f903be6bde6 in _L_lock_941 at /lib/../lib64/libpthread.so.0
+2 0x00007f903be6bcdf in pthread_mutex_lock at
+/lib/../lib64/libpthread.so.0
+3 0x0000564b21456889 in qemu_mutex_lock_impl at
+../util/qemu-thread-posix.c:79
+4 0x0000564b213af8a5 in block_job_add_bdrv at ../blockjob.c:224
+5 0x0000564b213b00ad in block_job_create at ../blockjob.c:440
+6 0x0000564b21357c0a in mirror_start_job at ../block/mirror.c:1622
+7 0x0000564b2135a9af in commit_active_start at ../block/mirror.c:1867
+8 0x0000564b2133d132 in qmp_block_commit at ../blockdev.c:2768
+9 0x0000564b2141fef3 in qmp_marshal_block_commit at
+qapi/qapi-commands-block-core.c:346
+10 0x0000564b214503c9 in do_qmp_dispatch_bh at
+../qapi/qmp-dispatch.c:110
+11 0x0000564b21451996 in aio_bh_poll at ../util/async.c:164
+12 0x0000564b2146018e in aio_dispatch at ../util/aio-posix.c:381
+13 0x0000564b2145187e in aio_ctx_dispatch at ../util/async.c:306
+14 0x00007f9040239049 in g_main_context_dispatch at
+/lib/../lib64/libglib-2.0.so.0
+15 0x0000564b21447368 in main_loop_wait at ../util/main-loop.c:232
+16 0x0000564b21447368 in main_loop_wait at ../util/main-loop.c:255
+17 0x0000564b21447368 in main_loop_wait at ../util/main-loop.c:531
+18 0x0000564b212304e1 in qemu_main_loop at ../softmmu/runstate.c:721
+19 0x0000564b20f7975e in main at ../softmmu/main.c:50
+
+In IO thread when do bdrv_mirror_top_pwritev, the job is NULL, and stop field
+is false, this means the MirrorBDSOpaque "s" object has not been initialized
+yet, and this object is initialized by block_job_create(), but the initialize
+process is stuck in acquiring the lock.
+
+In this situation, IO thread come to bdrv_mirror_top_pwritev(),which means that
+mirror-top node is already inserted into block graph, but its bs->opaque->job
+is not initialized.
+
+The root cause is that qemu main thread do release/acquire when hold the lock,
+at the same time, IO thread get the lock after release stage, and the crash
+occured.
+
+Actually, in this situation, job->job.aio_context will not equal to
+qemu_get_aio_context(), and will be the same as bs->aio_context,
+thus, no need to release the lock, becasue bdrv_root_attach_child()
+will not change the context.
+
+This patch fix this issue.
+
+Fixes: 132ada80 "block: Adjust AioContexts when attaching nodes"
+
+Signed-off-by: Michael Qiu
+Message-Id: <20210203024059.52683-1-08005325@163.com>
+Signed-off-by: Kevin Wolf
+---
+ blockjob.c | 8 ++++++--
+ 1 file changed, 6 insertions(+), 2 deletions(-)
+
+diff --git a/blockjob.c b/blockjob.c
+index 74abb97bfd..72865a4a6e 100644
+--- a/blockjob.c
++++ b/blockjob.c
+@@ -223,14 +223,18 @@ int block_job_add_bdrv(BlockJob *job, const char *name, BlockDriverState *bs,
+                        uint64_t perm, uint64_t shared_perm, Error **errp)
+ {
+     BdrvChild *c;
++    bool need_context_ops;
+ 
+     bdrv_ref(bs);
+-    if (job->job.aio_context != qemu_get_aio_context()) {
++
++    need_context_ops = bdrv_get_aio_context(bs) != job->job.aio_context;
++
++    if (need_context_ops && job->job.aio_context != qemu_get_aio_context()) {
+         aio_context_release(job->job.aio_context);
+     }
+     c = bdrv_root_attach_child(bs, name, &child_job, job->job.aio_context,
+                                perm, shared_perm, job, errp);
+-    if (job->job.aio_context != qemu_get_aio_context()) {
++    if (need_context_ops && job->job.aio_context != qemu_get_aio_context()) {
+         aio_context_acquire(job->job.aio_context);
+     }
+     if (c == NULL) {
+-- 
+2.27.0
+
-- 
Gitee


From 6d6cd0fe82589b86cfbc1079d342bfd02ee229a2 Mon Sep 17 00:00:00 2001
From: Chen Qun
Date: Fri, 28 May 2021 16:27:22 +0800
Subject: [PATCH 2/3] spec: Update patch and changelog with !118 blockjob: Fix
 crash with IOthread when block commit after snapshot

!118 blockjob: Fix crash with IOthread when block commit after snapshot

Signed-off-by: Chen Qun
---
 qemu.spec | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/qemu.spec b/qemu.spec
index db2e33f..9a7c37b 100644
--- a/qemu.spec
+++ b/qemu.spec
@@ -326,6 +326,7 @@ Patch0313: tz-ppc-add-dummy-read-write-methods.patch
 Patch0314: imx7-ccm-add-digprog-mmio-write-method.patch
 Patch0315: util-cacheinfo-fix-crash-when-compiling-with-uClibc.patch
 Patch0316: arm-cpu-Fixed-function-undefined-error-at-compile-ti.patch
+Patch0317: blockjob-Fix-crash-with-IOthread-when-block-commit-a.patch
 
 BuildRequires: flex
 BuildRequires: bison
@@ -719,6 +720,9 @@ getent passwd qemu >/dev/null || \
 %endif
 
 %changelog
+* Fri May 28 2021 Chen Qun
+- blockjob: Fix crash with IOthread when block commit after snapshot
+
 * Thu 20 May 2021 zhouli57
 - arm/cpu: Fixed function undefined error at compile time under arm
 
-- 
Gitee


From 36455e017ba20e17e1e1bad3e241b8830b198d35 Mon Sep 17 00:00:00 2001
From: Chen Qun
Date: Fri, 28 May 2021 16:27:23 +0800
Subject: [PATCH 3/3] spec: Update release version with !118

increase release version by one

Signed-off-by: Chen Qun
---
 qemu.spec | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu.spec b/qemu.spec
index 9a7c37b..52fb795 100644
--- a/qemu.spec
+++ b/qemu.spec
@@ -1,6 +1,6 @@
 Name: qemu
 Version: 4.1.0
-Release: 57
+Release: 58
 Epoch: 2
 Summary: QEMU is a generic and open source machine emulator and virtualizer
 License: GPLv2 and BSD and MIT and CC-BY-SA-4.0
-- 
Gitee