diff --git a/articles/20230714-porting-riscv-ukl-translate-part1.md b/articles/20230714-porting-riscv-ukl-translate-part1.md new file mode 100644 index 0000000000000000000000000000000000000000..9cda6993c72d78251eb845e10dd9a94dcbf18729 --- /dev/null +++ b/articles/20230714-porting-riscv-ukl-translate-part1.md @@ -0,0 +1,132 @@ +> Corrector: [TinyCorrect](https://gitee.com/tinylab/tinycorrect) v0.2-rc1 - [refs pangu]
+> Title: [Integrating Unikernel Optimizations in a General Purpose OS](https://arxiv.org/pdf/2206.00789.pdf)
+> Author: Ali Raza
+> Translator: Gege-Wang <2891067867@qq.com>
+> Date: 2023/07/14
+> Revisor: Falcon
+> Project: [RISC-V Linux 内核剖析](https://gitee.com/tinylab/riscv-linux)
+> Sponsor: PLCT Lab, ISCAS + +# 在通用式操作系统中集成 Unikernel 优化 - part1 + +## 前言 + +> There is growing evidence that the structure of today’s general purpose operating systems is problematic for a number of key use cases. For example, applications that require high-performance I/O use frameworks like DPDK and SPDK to bypass the kernel and gain unimpeded access to hardware devices. In the cloud, client workloads are typically run inside dedicated virtual machines, and a kernel designed to multiplex the resources of many users and processes is instead being replicated across many single-user, often single process, environments. + +越来越多的证据表明,当今通用操作系统的结构在许多关键用例中存在问题。例如,需要高性能 I/O 的应用程序使用像 DPDK 和 SPDK 这样的框架来绕过内核并获得对硬件设备的无障碍访问。在云中,客户端工作负载通常运行在专用虚拟机中,而一个旨在为众多用户和进程复用资源的内核却被复制到多个单用户(通常是单进程)环境中。 + +> In response, there has been a resurgence of research systems exploring the idea of a libraryOS, or a unikernel, where an application is linked with a specialized kernel and deployed directly on virtual hardware. Compared with Linux, unikernels have demonstrated significant advantages in boot time, security, resource utilization, and I/O performance. + +作为回应,一些研究系统重新开始探索库操作系统(libraryOS)或单内核(unikernel)的概念,其中应用程序与专门的内核链接在一起,并直接部署在虚拟硬件上。和 Linux 相比,unikernels 在启动时间、安全性、资源利用率和 I/O 性能方面表现出了显著的优势。 + +> As with any operating system, widespread adoption of a unikernel will require enormous and ongoing investment by a large community. Justifying this investment is difficult since unikernels target only niche portions of the broad use-cases of general-purpose OSes. In addition to their intrinsic limitation as single application environments, with few exceptions, existing unikernels support only virtualized environments and, in many cases, only run on a single processor core. Moreover, they do not support accelerators (e.g., GPUs and FPGAs) that are increasingly critical to achieving high performance in a post Dennard scaling world. 
+ +与任何操作系统一样,广泛采用单内核将需要大型社区的大量持续投资。证明这种投资的合理性是困难的,因为 unikernels 只针对通用操作系统的广泛用例中的一小部分。除了作为单一应用程序环境的固有限制外,现有的 unikernels 几乎都只支持虚拟化环境,而且在许多情况下,只在单个处理器核心上运行。此外,它们不支持加速器(例如 GPU 和 FPGA),而在 Dennard 缩放定律逐渐失效的情况下,加速器对实现高性能越来越重要。 + +> Some systems have demonstrated that it is possible to create a unikernel that re-uses much of the battle-tested code of a general-purpose OS and supports a wide range of applications. Examples include NetBSD based Rump Kernel, Windows based Drawbridge and Linux based Linux Kernel Library (LKL). These systems, however, require significant changes to the general-purpose OS, resulting in a fork of the codebase and community. As a result, ongoing investments in the base operating system are not necessarily applicable to the forked unikernel. + +一些系统已经证明,重用通用操作系统中大量经过实战测试的代码来创建一个支持广泛应用程序的 unikernel 是可能的,例如,基于 NetBSD 的 Rump Kernel、基于 Windows 的 Drawbridge 和基于 Linux 的 Linux 内核库 (LKL)。然而,这些系统需要对通用操作系统进行重大更改,从而导致代码库和社区的分叉。因此,对基础操作系统的持续投资不一定适用于分叉出来的 unikernel。 + +> To avoid the investment required for a different OS, the recent Lupine and X-Containers projects explore exploiting Linux’s innate configurability to enable application specific customizations. These projects avoid the hardware overhead of system calls between user and kernel mode, but to avoid code changes to Linux, they do not explore deeper optimizations. Essentially these systems preserve the boundary between the application and the underlying kernel, giving up on any unikernel performance advantages that depend on linking the application and kernel code together. + +为了避免不同操作系统所需的投资,最近的 Lupine 和 X-Containers 项目探索利用 Linux 固有的可配置性来实现特定于应用程序的自定义。这些项目避免了用户模式和内核模式之间系统调用的硬件开销,但是为了避免对 Linux 进行代码更改,它们没有探索更深入的优化。从本质上讲,这些系统保留了应用程序和底层内核之间的边界,放弃了所有依赖于将应用程序和内核代码链接在一起的单内核性能优势。 + +> The Unikernel Linux (UKL) project started as an effort to exploit Linux’s configurability to try to create a new unikernel in a fashion that would avoid forking the kernel. 
If this is possible, we hypothesized that we could create a unikernel that would support a wide range of Linux’s applications and hardware, while becoming a standard part of the ongoing investment by the Linux community. Our experience has led us to a different, more powerful goal: enabling a kernel that can be configured to span a spectrum between a general-purpose operating system and a pure unikernel. + +Unikernel Linux (UKL) 项目最初是为了利用 Linux 的可配置性,以一种避免分叉内核的方式创建一个新的 Unikernel。如果这是可能的,我们假设我们可以创建一个单内核,它将支持广泛的 Linux 应用程序和硬件,同时成为 Linux 社区持续投资的标准部分。我们的经验把我们引向了一个不同的、更强大的目标:实现一个可以通过配置覆盖通用操作系统和纯单内核之间整个范围的内核。 + +> At the general-purpose end of the spectrum, if all UKL configurations are disabled, a standard Linux kernel is generated. The simplest base model configuration of UKL supports many applications, albeit with only modest performance advantages. Like unikernels, a single application is statically linked into the kernel and executed in privileged mode. However, the base model of UKL preserves most of the invariants and design of Linux, including a separate page-able application portion of the address space and a pinned kernel portion, distinct execution modes for application and kernel code, and the ability to run multiple processes. The changes to Linux to support the UKL base model are modest (~550 LoC), and the resulting kernel supports all hardware and applications of the original kernel as well as the entire Linux ecosystem of tools for deployment and performance tuning. UKL base model shows a modest 5% improvement in syscall latency. 
+ +在这一范围的通用端,如果禁用所有 UKL 配置,将生成一个标准的 Linux 内核。最简单的 UKL 基本模型配置支持许多应用程序,尽管只有适度的性能优势。与 unikernels 类似,单个应用程序被静态地链接到内核中,并以特权模式执行。但是,UKL 的基本模型保留了 Linux 的大多数不变量和设计,包括地址空间中单独的可分页应用程序部分和固定的内核部分、应用程序和内核代码的不同执行模式,以及运行多个进程的能力。为支持 UKL 基本模型而对 Linux 进行的更改是适度的 (~550 LoC),生成的内核支持原始内核的所有硬件和应用程序,并且支持用于部署和性能调优的整个 Linux 生态系统工具。UKL 基本模型显示系统调用延迟有 5% 的适度改善。 + +> Once an application is running in the UKL base model, a developer can move along the spectrum towards a unikernel by 1) adapting additional configuration options that may improve performance but will not work for all applications, and/or 2) modifying the applications to directly invoke kernel functionality. Example configuration options we have explored avoid costly transition checks between application and kernel code, use simple return (rather than iret) from page faults and interrupts, and use shared stacks for application and kernel execution etc. Application modifications can, for example, avoid scheduling and exploit application knowledge to reduce the overhead of synchronization and polymorphism. Experiments show up to 83% improvement in syscall latency and substantial performance advantages for real workloads, e.g., 26% improvement in Redis throughput while improving tail latency by 22%. Latency-sensitive workloads show a 100 times improvement. The full UKL patch to Linux, including the base model and all configurations, is 1250 LoC. + +一旦应用程序在 UKL 基本模型中运行,开发人员就可以通过以下方式向单内核方向发展:1) 采用可能提高性能但不适用于所有应用程序的附加配置选项,和/或 2) 修改应用程序以直接调用内核功能。我们探讨的示例配置选项避免了应用程序和内核代码之间代价高昂的转换检查,在从缺页异常和中断返回时使用简单的 return(而不是 iret),以及让应用程序和内核执行共享堆栈等。例如,对应用程序的修改可以避免调度,并利用应用程序知识来减少同步和多态性的开销。实验表明,系统调用延迟最多可改善 83%,在实际工作负载中具有显著的性能优势,例如,Redis 吞吐量提高了 26%,同时尾延迟改善了 22%。对延迟敏感的工作负载性能提高了 100 倍。Linux 的完整 UKL 补丁,包括基本模型和所有配置,共 1250 LoC。 + +> Contributions of this work include: +> 1. An existence proof that unikernel techniques can be integrated into a general-purpose OS in a fashion that does not need to fragment/fork it. +> 2. 
A demonstration that a single kernel can be adopted across a spectrum between a unikernel and a general purpose OS. +> 3. A demonstration that performance advantages are possible; applications achieve modest gains with no changes, and incremental effort can achieve more significant gains. + +这项工作的贡献包括: +1. 证明单内核技术可以以一种不需要碎片化/分叉的方式集成到通用操作系统中。 +2. 演示单个内核可以在单内核和通用操作系统之间的范围内被采用。 +3. 证明性能优势是可能的;应用程序在不进行更改的情况下可以获得适度的收益,而增量的工作可以获得更显著的收益。 + +> We discuss our motivations and goals for this project in Section 2, and our overall approach to bring unikernel techniques to Linux in Section 3. Section 4 describes key implementation details. In Section 5, we evaluate and discuss the implications of the current design and implementation. Finally, Sections 6 and 7 contrast UKL to previous work and describe research directions that this work enables. + +我们将在第 2 节中讨论这个项目的动机和目标,并在第 3 节中讨论将单内核技术引入 Linux 的总体方法。第 4 节描述了关键的实现细节。在第 5 节中,我们将评估和讨论当前设计和实现的含义。最后,第 6 节和第 7 节将 UKL 与以前的工作进行了对比,并描述了这项工作能够实现的研究方向。 + +## 动机 & 目标 + +> UKL seeks to explore a spectrum between a general-purpose operating system and a unikernel in order to: (1) enable unikernel optimizations demonstrated by earlier systems while preserving a general-purpose operating system’s (2) broad application support, (3) broad hardware support, and (4) the ecosystem of developers, tools and operators. We motivate and describe each of these four goals. + +UKL 试图探索通用操作系统和单内核之间的范围,以便: +1. 启用早期系统已经展示的单内核优化,同时保留通用操作系统的以下优势: +2. 广泛的应用程序支持; +3. 广泛的硬件支持; +4. 由开发人员、工具和运维人员构成的生态系统。 +下面我们逐一阐述这四个目标及其动机。 + +### Unikernel 优化 + +> Unikernels fundamentally enable optimizations that rely on linking the application and kernel together in the same address space. Example optimizations that previous systems have adopted include +> 1. avoiding ring transition overheads; +> 2. exploiting the shared address space to pass pointers rather than copying data; +> 3. 
exploiting fine-grained control over scheduling decisions, e.g., deferring preemption in latency-sensitive routines; +> 4. enabling interrupts to be efficiently dispatched to application code; +> 5. exploiting knowledge of the application to remove code that is never used; +> 6. employing kernel-level mechanisms to optimize locking and memory management, for instance, by using Read-Copy-Update (RCU), per-processor memory, and DMA-aided data movement; and +> 7. enabling compiler, link-time, and profile-driven optimizations between the application and kernel code. + +Unikernels 从根本上实现了那些依赖于将应用程序和内核链接在同一地址空间中的优化。以前系统采用的优化示例包括: +1. 避免特权模式转换开销; +2. 利用共享地址空间传递指针,而不是复制数据; +3. 利用对调度决策的细粒度控制,例如,在延迟敏感的例程中推迟抢占; +4. 使中断能够高效地分发到应用程序代码; +5. 利用应用程序的知识来删除从未使用过的代码; +6. 采用内核级机制来优化锁和内存管理,例如,通过使用读-复制-更新 (RCU)、per-processor 内存和 DMA 辅助的数据移动; +7. 在应用程序和内核代码之间,启用编译器、链接时和以性能分析驱动的优化。 + +> Ultimately our goal with UKL is to explore the full spectrum between general purpose and highly specialized unikernels. For this paper, our goal is to enable applications to be linked into the Linux kernel, and explore what, if any, improvements can be achieved by modest changes to the application and general-purpose system. + +我们使用 UKL 的最终目标是探索通用操作系统和高度专门化 unikernels 之间的全部范围。 +在本文中,我们的目标是使应用程序能够链接到 Linux 内核中,并探索通过对应用程序和通用系统进行适度更改可以实现哪些改进(如果有的话)。 + +### 应用程序支持 + +> One of the fundamental problems with unikernels is the limited set of applications that they support. By their nature, unikernels only enable a single process, excluding any application that requires helper processes, scripts, etc. Moreover, the limited set of interfaces typically requires substantial porting effort for any application and any library that the application uses. + +unikernels 的一个基本问题是它们支持的应用程序有限。就其本质而言,unikernels 只启用单个进程,不支持任何需要辅助进程、脚本等的应用程序。此外,有限的接口集意味着通常需要为所有应用程序和应用程序使用的库进行大量的移植工作。 + +> UKL seeks to enable unikernel optimizations to be broadly applicable. 
Our goal is to enable any unmodified Linux application and library to use UKL, with a re-compilation, as long as only one application needs to be linked into the kernel. Once the application is functional, the developer can incrementally enable unikernel optimizations/configurations. A large set of applications should be able to achieve some gain on the general-purpose end of the spectrum, while a much smaller set of applications will be able to achieve more substantial +gains as we move toward the unikernel end. + +UKL 力求让单内核优化具有广泛的适用性。我们的目标是允许任何未经修改的 Linux 应用程序和库使用 UKL,只需重新编译即可,前提是只需要将一个应用程序链接到内核中。一旦应用程序运行正常,开发人员就可以增量地启用单内核优化/配置。大量应用程序应该能够在范围的通用端获得一定收益,而随着我们向单内核端移动,数量少得多的一部分应用程序将能够获得更大的收益。 + +### 硬件支持 + +> Another fundamental problem with unikernels is the lack of support for physical machines and devices. While recent unikernel research has mostly focused on virtual systems, some recent and previous systems have demonstrated the value of per-application specialized +operating systems on physical machines. Unfortunately, even these systems were limited to very specific hardware platforms with a restricted set of device drivers. This precludes a wide range of infrastructure applications (e.g., storage systems, schedulers, networking toolkits) that are typically deployed bare-metal. Moreover, the lack of hardware support is an increasing problem in a post-Dennard scaling world, where performance depends on taking advantage of the revolution of heterogeneous computing. + +unikernels 的另一个基本问题是缺乏对物理机器和设备的支持。虽然最近的 unikernel 研究主要集中在虚拟系统上,但一些最近的和以前的系统已经证明了在物理机上为每个应用程序定制专用操作系统的价值。不幸的是,即使是这些系统也仅限于非常特定的硬件平台和有限的设备驱动程序集。这就排除了大量通常部署在裸机上的基础设施应用程序(例如,存储系统、调度程序、网络工具包)。此外,在 Dennard 缩放定律逐渐失效的情况下,缺乏硬件支持是一个日益严重的问题,因为性能的提升依赖于充分利用异构计算革命。 + +> Our goal with UKL is to provide a unikernel environment capable of supporting the complete HCL of Linux, allowing applications to exploit any hardware (e.g. GPUs, TPUs, FPGAs) enabled in Linux. 
Our near term goal, while supporting all Linux devices, is to focus on x86-64 systems. Much like KVM became a feature of Linux on x86 and was then ported to other platforms, we expect that, if UKL is accepted upstream, communities interested in non-x86 architectures will take on the task of porting and optimizing UKL for their platforms. + +我们使用 UKL 的目标是提供一个能够支持 Linux 的完整 HCL(硬件兼容性列表)的单内核环境,允许应用程序利用 Linux 中启用的任何硬件(例如 GPU、TPU、FPGA)。在支持所有 Linux 设备的同时,我们的近期目标是专注于 x86-64 系统。就像 KVM 先在 x86 上成为 Linux 的一个特性,然后被移植到其他平台一样,我们期望,如果 UKL 在上游被接受,对非 x86 架构感兴趣的社区将承担为他们的平台移植和优化 UKL 的任务。 + +### 生态系统 + +> While application and hardware support are normally thought of as the fundamental barriers for unikernel adoption, the problem is much larger. Linux has a huge developer community, operators that know how to configure and administer it, a massive body of battle-tested code, and a rich set of tools to support functional and performance debugging and configuration. + +虽然应用程序和硬件支持通常被认为是采用单内核的基本障碍,但问题要大得多。Linux 有一个庞大的开发人员社区、有知道如何配置和管理它的运维人员、有大量经过实战测试的代码,以及一组丰富的工具来支持功能和性能的调试和配置。 + +> Our goal with UKL is, while enabling developers to adopt extreme optimizations that are inconsistent with the broader ecosystem, the entire ecosystem should be preserved on the general-purpose end of the spectrum. This means operational as well as functional and performance debugging tools should just work. Standard application and library testing systems should, similarly, just work. Most of all, the base changes needed to enable UKL need to be of a nature that they don’t +break assumptions of the battle tested Linux code, can be accepted by the community, and can be tested and maintained as development on the system progresses. + +我们使用 UKL 的目标是,在使开发人员能够采用与更广泛的生态系统不一致的极端优化的同时,整个生态系统在范围的通用端得以保留。这意味着运维工具以及功能和性能调试工具都应该正常工作。类似地,标准应用程序和库测试系统应该能够正常工作。最重要的是,启用 UKL 所需的基础改动需要具有这样的性质:它们不会打破经过实战测试的 Linux 代码的假设,可以被社区接受,并且可以随着系统开发的进展进行测试和维护。 + +## 参考文献 + +[1] DPDK - Data Plane Development Kit. https://www.dpdk.org/. 
Accessed on 2021-10-7. +[2] Storage Performance Development Kit. https://spdk.io/, 2018.(Accessed on 01/16/2019). diff --git a/articles/images/porting-riscv-ukl/translate-figure-1.PNG b/articles/images/porting-riscv-ukl/translate-figure-1.PNG new file mode 100644 index 0000000000000000000000000000000000000000..a246fb3b0971084cb46dc72d6259d0595501e0cc Binary files /dev/null and b/articles/images/porting-riscv-ukl/translate-figure-1.PNG differ diff --git a/articles/images/porting-riscv-ukl/translate-figure-2.PNG b/articles/images/porting-riscv-ukl/translate-figure-2.PNG new file mode 100644 index 0000000000000000000000000000000000000000..ed43ebe2389025e23323efdf0166d31894a598db Binary files /dev/null and b/articles/images/porting-riscv-ukl/translate-figure-2.PNG differ diff --git a/articles/images/porting-riscv-ukl/translate-figure-3.PNG b/articles/images/porting-riscv-ukl/translate-figure-3.PNG new file mode 100644 index 0000000000000000000000000000000000000000..dc50ff5a811f48f1c10e117110b82edaa6a91fcc Binary files /dev/null and b/articles/images/porting-riscv-ukl/translate-figure-3.PNG differ diff --git a/articles/images/porting-riscv-ukl/translate-figure-4.PNG b/articles/images/porting-riscv-ukl/translate-figure-4.PNG new file mode 100644 index 0000000000000000000000000000000000000000..b8c128aa6474a4f2d2941213cf355414649dac6d Binary files /dev/null and b/articles/images/porting-riscv-ukl/translate-figure-4.PNG differ diff --git a/articles/images/porting-riscv-ukl/translate-figure-5.PNG b/articles/images/porting-riscv-ukl/translate-figure-5.PNG new file mode 100644 index 0000000000000000000000000000000000000000..3107b62c505dce5addd0a03cc94117bb9d7d482b Binary files /dev/null and b/articles/images/porting-riscv-ukl/translate-figure-5.PNG differ diff --git a/articles/images/porting-riscv-ukl/translate-figure-6.PNG b/articles/images/porting-riscv-ukl/translate-figure-6.PNG new file mode 100644 index 
0000000000000000000000000000000000000000..5ba82d3af54013da56b3ba6cee9569177aa2c7e7 Binary files /dev/null and b/articles/images/porting-riscv-ukl/translate-figure-6.PNG differ diff --git a/articles/images/porting-riscv-ukl/translate-figure-7.PNG b/articles/images/porting-riscv-ukl/translate-figure-7.PNG new file mode 100644 index 0000000000000000000000000000000000000000..8c991556baf9adea8a33af226390bf92bb4bc4cd Binary files /dev/null and b/articles/images/porting-riscv-ukl/translate-figure-7.PNG differ diff --git a/articles/images/porting-riscv-ukl/translate-table-1.PNG b/articles/images/porting-riscv-ukl/translate-table-1.PNG new file mode 100644 index 0000000000000000000000000000000000000000..bd0ae093ae6c59548c8c23e491d950480ddf3e2e Binary files /dev/null and b/articles/images/porting-riscv-ukl/translate-table-1.PNG differ diff --git a/articles/images/porting-riscv-ukl/translate-table-2.PNG b/articles/images/porting-riscv-ukl/translate-table-2.PNG new file mode 100644 index 0000000000000000000000000000000000000000..cbbd05ed2863e31ff06abec983da320ba5a05bc7 Binary files /dev/null and b/articles/images/porting-riscv-ukl/translate-table-2.PNG differ diff --git a/articles/images/porting-riscv-ukl/translate-table-3.PNG b/articles/images/porting-riscv-ukl/translate-table-3.PNG new file mode 100644 index 0000000000000000000000000000000000000000..0a4b4e28d330534185d88312b76cfa500dfa1d1b Binary files /dev/null and b/articles/images/porting-riscv-ukl/translate-table-3.PNG differ