From 1f91d4756cf341eb163589e0ac47232044632ffe Mon Sep 17 00:00:00 2001
From: lin-xiaoran
Date: Fri, 2 Jul 2021 15:09:10 +0800
Subject: [PATCH] patch 0016 submission, fix blktrace exit directly when nthreads running patch

(cherry picked from commit 3ca248d7ce8094c7d402646d6ab06b40f5863c4e)
---
 ...-exit-directly-when-nthreads-running.patch | 65 +++++++++++++++++++
 blktrace.spec                                 |  6 +-
 2 files changed, 70 insertions(+), 1 deletion(-)
 create mode 100644 0016-blktrace-fix-exit-directly-when-nthreads-running.patch

diff --git a/0016-blktrace-fix-exit-directly-when-nthreads-running.patch b/0016-blktrace-fix-exit-directly-when-nthreads-running.patch
new file mode 100644
index 0000000..ffd5090
--- /dev/null
+++ b/0016-blktrace-fix-exit-directly-when-nthreads-running.patch
@@ -0,0 +1,65 @@
+From 3a1b1366d30375cdb0f5b299df4edda0c8ba3bcc Mon Sep 17 00:00:00 2001
+From: lijinlin
+Date: Mon, 28 Jun 2021 13:41:32 -0600
+Subject: blktrace: exit directly when nthreads_running != ncpus in
+ run_tracers()
+
+We found blktrace got stuck when cgroup restricts blktrace to use cpu,
+the messages and stack is:
+[root@localhost ~]# blktrace -w 10 -o- /dev/sda
+FAILED to start thread on CPU 1: 22/Invalid argument
+FAILED to start thread on CPU 2: 22/Invalid argument
+[root@localhost ~]# cat /proc/1385110/stack
+[<0>] __switch_to+0xe8/0x150
+[<0>] futex_wait_queue_me+0xd4/0x158
+[<0>] futex_wait+0xf4/0x230
+[<0>] do_futex+0x470/0x900
+[<0>] __arm64_sys_futex+0x13c/0x188
+[<0>] el0_svc_common+0x80/0x200
+[<0>] el0_svc_handler+0x78/0xe0
+[<0>] el0_svc+0x10/0x260
+[<0>] 0xffffffffffffffff
+
+Blktrace failed to start thread is caused by thread can't lock on the
+Restricted cpu. In this case, blktrace would't schedule an alarm after
+defined time to set variable 'done' as 1.
+We debug the code and found the call trace as bellow:
+main()
+ ==>run_tracers()
+ ==>wait_tracers()
+ ==>process_trace_bufs()
+ ==>wait_empty_entries()
+ ==>t_pthread_cond_wait()
+Blktrace was set to piped output, so the process is stuck in
+wait_empty_entries() for wait variable 'done' have been set as 1.
+
+We set variable 'done' as 1 when 'nthreads_running' is not equal to
+'ncpus' in run_tracers() to fix the problem.
+
+Signed-off-by: lijinlin
+Signed-off-by: Zhiqiang Liu
+Signed-off-by: Lixiaokeng
+Signed-off-by: Jens Axboe
+---
+ blktrace.c | 4 +++-
+ 1 file changed, 3 insertions(+), 1 deletion(-)
+
+diff --git a/blktrace.c b/blktrace.c
+index 82a6aad..3444fbb 100644
+--- a/blktrace.c
++++ b/blktrace.c
+@@ -2705,8 +2705,10 @@ static int run_tracers(void)
+ 			printf("blktrace: connected!\n");
+ 		if (stop_watch)
+ 			alarm(stop_watch);
+-	} else
++	} else {
+ 		stop_tracers();
++		done = 1;
++	}
+ 
+ 	wait_tracers();
+ 	if (nthreads_running == ncpus)
+-- 
+cgit 1.2.3-1.el7
+
diff --git a/blktrace.spec b/blktrace.spec
index 35af94c..c5a9e78 100644
--- a/blktrace.spec
+++ b/blktrace.spec
@@ -1,6 +1,6 @@
 Name: blktrace
 Version: 1.2.0
-Release: 16
+Release: 17
 Summary: Block IO tracer in the Linux kernel
 License: GPLv2+
 Source: http://brick.kernel.dk/snaps/blktrace-%{version}.tar.bz2
@@ -24,6 +24,7 @@ Patch12: 0012-btt_plot.py-Use-with-open-as-.-context-manager.patch
 Patch13: 0013-blkparse-Fix-device-in-event-tracking-error-messages.patch
 Patch14: 0014-blkparse-Allow-request-tracking-on-non-md-dm-devices.patch
 Patch15: 0015-blkparse-Initialize-and-test-for-undefined-request-t.patch
+Patch16: 0016-blktrace-fix-exit-directly-when-nthreads-running.patch
 
 %description
 blktrace is a block layer IO tracing mechanism which provides detailed
@@ -88,6 +89,9 @@ compare the differences between different benchmark runs.
 %{_mandir}/man1/iowatcher.*
 
 %changelog
+* Fri Jul 02 2021 linxiaoran - 1.2.0-17
+- Fix blktrace exit patch
+
 * Thu Sep 10 2020 lihaotian - 1.2.0-16
 - create iowatcher rpm sub-package
 
-- 
Gitee