diff --git a/news/README.md b/news/README.md index ed9f931816d4aa1b53a916d0741941aabdf91982..5a17b6df4353e18889acab19404d692c1348bda0 100644 --- a/news/README.md +++ b/news/README.md @@ -6,6 +6,1016 @@ * [2023 - First Half](2023-1st-half.md) * [2023 - Second Half](2023-2nd-half.md) + +## 20240512: Issue 91 + +### Kernel Updates + +#### RISC-V Architecture Support + +**[v1: riscv: do not select MODULE_SECTIONS by default](http://lore.kernel.org/linux-riscv/20240511015725.1162-1-dqfext@gmail.com/)** + +> Handling of those relocations is unnecessary. +> Only select MODULE_SECTIONS when +> RELOCATABLE. +> + +**[v4: Define _GNU_SOURCE for sources using](http://lore.kernel.org/linux-riscv/20240510000842.410729-1-edliaw@google.com/)** + +> Centralizes the definition of _GNU_SOURCE into KHDR_INCLUDES and removes +> redefinitions of _GNU_SOURCE from source code. +> asprintf into kselftest_harness.h. +> + +**[v5: Support Zve32[xf] and Zve64[xfd] Vector subextensions](http://lore.kernel.org/linux-riscv/20240510-zve-detection-v5-0-0711bdd26c12@sifive.com/)** + +> The series was tested on QEMU, verifying that booting, Vector +> program context switching, and the signal, ptrace, and prctl interfaces work when we +> only report partial V from the ISA. +> This patch should apply to the risc-v for-next branch on top of +> commit 0a16a1728790 + + +**[v4: of: property: Add fw_devlink support for interrupt-map property](http://lore.kernel.org/linux-riscv/20240509120820.1430587-1-apatel@ventanamicro.com/)** + +> Supplier (interrupt controller) based on "interrupt-map" DT property. +> + +**[GIT PULL: RISC-V Devicetrees for v6.10 Take 2](http://lore.kernel.org/linux-riscv/20240508-crafter-cement-4f54e4182270@spud/)** + +> Microchip: +> A simple addition of a power-monitor on the Icicle dev board, as the +> binding for it is now in mainline. +> Support for the Milk-V Mars. This board is incredibly similar to the +> VisionFive v2 that is already supported, with really only the ethernet +> configuration being slightly different. 
+> Re-ordering of some nodes to match the DTS coding style on the th1520. +> + +**[v1: riscv: change XIP's kernel_map.size to be size of the entire kernel](http://lore.kernel.org/linux-riscv/20240508191917.2892064-1-namcao@linutronix.de/)** + +> Change XIP's kernel_map.size to be the size of the entire kernel. +> +> + +**[v1: riscv: Don't use hugepage mappings for vmemmap if it's not supported](http://lore.kernel.org/linux-riscv/20240508173116.2866192-1-namcao@linutronix.de/)** + +> Only use hugepage mapping if it is supported. +> + +**[v1: riscv: dts: starfive: Enable Bluetooth on JH7100 boards](http://lore.kernel.org/linux-riscv/20240508111604.887466-1-emil.renner.berthing@canonical.com/)** + +> This series enables the in-kernel Bluetooth driver to work with the +> Broadcom Wi-Fi/Bluetooth module on the BeagleV Starlight and StarFive VisionFive V1 boards. + +**[v3: riscv: set trap vector earlier](http://lore.kernel.org/linux-riscv/20240508022445.6131-1-gaoshanliukou@163.com/)** + +> So fix that by setting the exception vector earlier. +> + +**[v2: riscv: Support compiling the kernel with more extensions](http://lore.kernel.org/linux-riscv/20240507-compile_kernel_with_extensions-v2-0-722c21c328c6@rivosinc.com/)** + +> This series introduces Kconfig options that allow the kernel to be +> compiled with additional extensions. +> The motivation for this patch is the performance improvements that come +> along with compiling the kernel with these extra instructions. +> Additionally, alternatives that check if an extension is supported can be eliminated when the Kconfig options to assume hardware support are enabled. 
+> + +#### Process Scheduling + +**[v4: time/tick-sched: idle load balancing when nohz_full cpu becomes idle.](http://lore.kernel.org/lkml/20240509092931.35209-2-ppbuk5246@gmail.com/)** + +> Change tick_nohz_idle_stop_tick() to call nohz_balance_enter_idle() +> without checking !was_stopped so that a nohz_full cpu can be chosen to +> perform idle load balancing when it enters idle state. +> + +**[[net PATCH] net/sched: Get stab before calling ops->change()](http://lore.kernel.org/lkml/20240509024043.3532677-1-xiaolei.wang@windriver.com/)** + +> ops->change() depends on stab. Consider this situation: +> when no parameters are passed in the first time, stab +> is omitted, as in configuration 1 below. At this point, the +> warning "Warning: sch_taprio: Size table not specified, frame +> length estimates may be inaccurate" is issued. When +> stab is added the second time, as in configuration +> 2 below, stab is still empty when ops->change() +> runs, so the same warning is issued again. +> + + +**[v1: sched/fair: prevent unbounded task iteration in load balance](http://lore.kernel.org/lkml/20240508223456.4189689-1-joshdon@google.com/)** + +> This patch now separates the number of tasks +> we migrate from the number of tasks we can search. Now, the search limit +> can be raised while keeping the nr_migrate fixed. +> + +**[v3: net/sched: adjust device watchdog timer to detect stopped queue at right time](http://lore.kernel.org/lkml/20240508133617.4424-1-praveen.kannoju@oracle.com/)** + +> Modify the next watchdog timeout to be shorter than the device-specified one. +> Compute the next timeout as the device watchdog timeout minus +> the time elapsed since the queue was stopped. At the next watchdog timeout, the tx +> timeout handler is called if the queue is still stopped. Whether or not it is called, +> restore the watchdog timeout to the device-specified value.
+> + +**[v1: perf sched: Introduce schedstat tool](http://lore.kernel.org/lkml/20240508060427.417-1-ravi.bangoria@amd.com/)** + +> Existing `perf sched` is quite exhaustive and provides a lot of insight +> into scheduler behavior, but it quickly becomes impractical to use for +> long-running or scheduler-intensive workloads. Overall, `perf sched schedstat record` is much more +> lightweight compared to `perf sched record`. +> It is very useful for analysing the impact +> of any scheduler code changes. +> +> + +**[v5: sched/fair: allow disabling sched_balance_newidle with sched_relax_domain_level](http://lore.kernel.org/lkml/cover.1715083479.git.vitaly@bursov.com/)** + +**[v1: sched: Clear user_cpus_ptr only when no intersection with the new mask](http://lore.kernel.org/lkml/20240507072242.585-1-xuewen.yan@unisoc.com/)** + +> Commit 851a723e45d1c ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()") +> causes CPU online/offline to produce different results +> for the !top-cpuset task. +> So add a check for whether there is an intersection between them. +> Clear user_cpus_ptr only when there is no intersection with the new mask. +> + +**[v1: time/tick-sched: enable idle load balancing when nohz_full cpu becomes idle.](http://lore.kernel.org/lkml/20240506213150.13608-1-ppbuk5246@gmail.com/)** + +> So, nohz_balance_enter_idle() could be called safely without the !was_stopped +> check. +> + +**[v1: sched: Introduce task_struct::latency_sensi_flag.](http://lore.kernel.org/lkml/20240505030615.GA5131@didi-ThinkCentre-M920t-N000/)** + +> So latency_sensi_flag is introduced in +> task_struct; when it is set to 1, the task only wakes up the softirq daemon in +> __local_bh_enable_ip(). +> + +#### Memory Management + +**[v12: mm: report per-page metadata information](http://lore.kernel.org/linux-mm/20240512010611.290464-1-souravpanda@google.com/)** + +> Today, we do not have any observability of per-page metadata +> and how much it takes away from the machine capacity. 
Thus, +> we want to describe the amount of memory that is going towards +> per-page metadata, which can vary depending on build +> configuration, machine architecture, and system use. +> +> This patch adds 2 fields to /proc/vmstat. +> + +**[v1: linux-next: mm/huge_memory: mark racy access on huge_anon_orders_always](http://lore.kernel.org/linux-mm/20240511144436754EiKfJM4xjMSTyCbEExwcL@zte.com.cn/)** + +> huge_anon_orders_always and huge_anon_orders_always are accessed +> locklessly, so it is better to use the READ_ONCE() wrapper. +> This does not fix any visible bug, but hopefully it can silence some +> KCSAN complaints in the future. +> Also do that for huge_anon_orders_madvise. +> + +**[v1: -rc7: mm/huge_memory: mark huge_zero_page reserved](http://lore.kernel.org/linux-mm/20240511035435.1477004-1-linmiaohe@huawei.com/)** + +> When I ran memory failure tests recently, the panic below occurred: +> + +**[v1: mm/huge_memory: mark huge_zero_folio reserved](http://lore.kernel.org/linux-mm/20240511032801.1295023-1-linmiaohe@huawei.com/)** + +> When I ran memory failure tests recently, the panic below occurred: +> + +**[v2: arch/fault: don't print logs for simulated poison errors](http://lore.kernel.org/linux-mm/20240510182926.763131-1-axelrasmussen@google.com/)** + +> This patch is based on mm-unstable as of 2024-05-10. In particular it +> needs this somewhat related fix to apply cleanly. + +**[v10: LUF(Lazy Unmap Flush) reducing tlb numbers over 90%](http://lore.kernel.org/linux-mm/20240510065206.76078-1-byungchul@sk.com/)** + +> +> I'm suggesting a new mechanism, LUF (Lazy Unmap Flush), which defers the tlb flush +> until folios that have been unmapped and freed eventually get allocated +> again. +> The tlb flush can be deferred when folios get unmapped, as long as the +> needed tlb flush is guaranteed to be performed before the folios actually +> become used and, of course, only if none of the corresponding ptes have +> write permission. Otherwise, the system will get messed up.
+> + +**[v2: Enhance soft hwpoison handling and injection](http://lore.kernel.org/linux-mm/20240510062602.901510-1-jane.chu@oracle.com/)** + +> This series aims at the following enhancement: +> - Let one hwpoison injector, namely madvise(MADV_HWPOISON), behave +> more as if a real UE occurred. + +**[v1: selftests/mm: hugetlb_madv_vs_map: Avoid test skipping by querying hugepage size at runtime](http://lore.kernel.org/linux-mm/20240509095447.3791573-1-dev.jain@arm.com/)** + +> Since we +> are using a simple mmap() with MAP_HUGETLB, make the test fail +> instead of skipping it. +> + +**[v1: rfc: mm: memcg: separate legacy cgroup v1 code and put under config option](http://lore.kernel.org/linux-mm/20240509034138.2207186-1-roman.gushchin@linux.dev/)** + + +> Cgroup v1-specific code in memcontrol.c is close to 4k lines in size and it's +> intertwined with generic and cgroup v2-specific code. It's a burden on +> developers and maintainers. +> +> This is an RFC version, which is not 100% polished yet, but it would be great +> to discuss and agree on the overall approach. +> + +**[v1: introduce budgt control in readahead](http://lore.kernel.org/linux-mm/20240509023937.1090421-1-zhaoyang.huang@unisoc.com/)** + +> This patch series introduces a helper +> function that provides a byte limit and applies it to readahead. + +**[v4: large folios swap-in: handle refault cases first](http://lore.kernel.org/linux-mm/20240508224040.190469-1-21cnbao@gmail.com/)** + +> This patch primarily addresses +> the handling of scenarios involving large folios in the swap cache. Currently, it is +> particularly focused on addressing the refaulting of mTHP, which is still undergoing +> reclamation. This approach aims to streamline code review and expedite the integration +> of this segment into the MM tree. +> +> It relies on Ryan's swap-out series, leveraging the helper function +> swap_pte_batch() introduced by that series. 
+> +> + +**[v1: Make riscv use THP contpte support for arm64](http://lore.kernel.org/linux-mm/20240508191931.46060-1-alexghiti@rivosinc.com/)** + +> This allows riscv to support napot (riscv equivalent to contpte) THPs by +> moving arm64 contpte support into mm, the previous series only merging +> riscv and arm64 implementations of hugetlbfs contpte. +> + +**[v1: binfmt_elf: Honor PT_LOAD alignment for static PIE](http://lore.kernel.org/linux-mm/20240508172848.work.131-kees@kernel.org/)** + +> This attempts to implement PT_LOAD p_align support for static PIE builds. + + +**[v5: selftests: cgroup: add tests to verify the zswap writeback path](http://lore.kernel.org/linux-mm/20240508171359.1545744-1-usamaarif642@gmail.com/)** + +> Attempt writeback with the below steps and check using +> memory.stat.zswpwb if zswap writeback occurred. + +**[v1: mm/ksm: optimize unstable_tree_search_insert()](http://lore.kernel.org/linux-mm/20240508-b4-ksm-unstable-insert-v1-0-631cdbc2b77f@linux.dev/)** + +> We use unstable_tree_search_insert() to find a matching page, or to insert our +> rmap_item into the unstable tree if no match is found. +> + +**[v2: RESEND: Merge arm64/riscv hugetlbfs contpte support](http://lore.kernel.org/linux-mm/20240508113419.18620-1-alexghiti@rivosinc.com/)** + +> This patchset intends to merge the contiguous ptes hugetlbfs implementation +> of arm64 and riscv. +> + +**[v2: Merge arm64/riscv hugetlbfs contpte support](http://lore.kernel.org/linux-mm/20240508111829.16891-1-alexghiti@rivosinc.com/)** + +> This patchset intends to merge the contiguous ptes hugetlbfs implementation +> of arm64 and riscv. + +**[v1: tools/mm: allow filtering and culling by module in page_owner_sort](http://lore.kernel.org/linux-mm/20240508094507.685475-1-oss@malat.biz/)** + +> Extend page_owner_sort filtering and culling features to work with module +> names as well. The topmost module is used. +> Fix regex error handling; failure labels were shifted by one step. 
+> + +**[v1: iomap: use huge zero folio in iomap_dio_zero](http://lore.kernel.org/linux-mm/20240507145811.52987-1-kernel@pankajraghav.com/)** + +> Instead of looping with ZERO_PAGE, use a huge zero folio to zero pad the +> block. Fall back to ZERO_PAGE if mm_get_huge_zero_folio() fails. +> + +**[v3: -next: mm: memcg: make alloc_mem_cgroup_per_node_info() return bool](http://lore.kernel.org/linux-mm/20240507132324.1158510-1-xiujianfeng@huawei.com/)** + +> So change the function to return bool (true on success) because +> this is slightly less confusing and more consistent with the other code. +> + +**[v2: Add XSAVE layout description to Core files for debuggers to support varying XSAVE layouts](http://lore.kernel.org/linux-mm/20240507095330.2674-1-vigbalas@amd.com/)** + +> This patch proposes to add an extra .note section in the corefile to dump the CPUID information of a machine. + +**[v1: x86/fault: speed up uffd-unit-test by 10x: rate-limit "MCE: Killing" logs](http://lore.kernel.org/linux-mm/20240507022939.236896-1-jhubbard@nvidia.com/)** + +> If a system experiences a lot of memory failures, then any associated +> printk() output really needs to be rate-limited. +> With this patch, all but 10 lines are suppressed, thus speeding up that +> particular selftest by 90% (runtime drops from 107 seconds to 10.6 +> seconds). + +**[v1: mm-unstable: mm: rmap: abstract updating per-node and per-memcg stats](http://lore.kernel.org/linux-mm/20240506211333.346605-1-yosryahmed@google.com/)** + +> The folio struct should already be in the cache at this point, so it shouldn't +> cause any noticeable overhead. 
+ + +#### File Systems + +**[v2: vfs: move dentry shrinking outside the inode lock in 'rmdir()'](http://lore.kernel.org/linux-fsdevel/20240511200240.6354-2-torvalds@linux-foundation.org/)** + +> There seems to be no actual reason for holding the inode lock any more +> by the time we get rid of the now uninteresting negative dentries, and +> it's an effect of the calling convention. + +**[v4: ext4: support adding multi-delalloc blocks](http://lore.kernel.org/linux-fsdevel/20240511112619.3656450-1-yi.zhang@huaweicloud.com/)** + +**[v1: -next: fs: fsconfig: intercept for non-new mount API in advance for FSCONFIG_CMD_CREATE_EXCL](http://lore.kernel.org/linux-fsdevel/20240511062147.3312801-1-lihongbo22@huawei.com/)** + +> fsconfig with the FSCONFIG_CMD_CREATE_EXCL command requires the new mount API, +> so we should return -EOPNOTSUPP early to avoid extra work. +> + +**[v1: -next: fsconfig: intercept for non-new mount API in advance for FSCONFIG_CMD_CREATE_EXCL](http://lore.kernel.org/linux-fsdevel/20240511040249.2141380-1-lihongbo22@huawei.com/)** + +> fsconfig with the FSCONFIG_CMD_CREATE_EXCL command requires the new mount API, +> so we should return -EOPNOTSUPP early to avoid extra work. +> + +**[v1: fsnotify: clear PARENT_WATCHED flags lazily](http://lore.kernel.org/linux-fsdevel/20240510221901.520546-1-stephen.s.brennan@oracle.com/)** + +> the underlying issue I was trying to resolve was when +> directories have many dentries (frequently, a ton of negative dentries), the +> __fsnotify_update_child_dentry_flags() operation can take a while, and it +> happens under spinlock. + +**[v1: fuse: add simple request tracepoints](http://lore.kernel.org/linux-fsdevel/4b11700964fdcdb67eb63a72c133423a5a876332.1715376944.git.josef@toxicpanda.com/)** + +> I've been timing various fuse operations and it's quite annoying to do +> with kprobes. Add two tracepoints for sending and ending fuse requests +> to make it easier to debug and time various operations. 
+> + +**[GIT PULL: vfs rw](http://lore.kernel.org/linux-fsdevel/20240510-vfs-rw-332f4a8e1772@brauner/)** + +> The core fs signalfd, userfaultfd, and timerfd subsystems did still use +> f_op->read() instead of f_op->read_iter(). Convert them over since we +> should aim to get rid of f_op->read() at some point. +> +> Aside from that io_uring and others want to mark files as FMODE_NOWAIT +> so it can make use of per-IO nonblocking hints to enable more efficient +> IO. Converting those users to f_op->read_iter() allows them to be marked +> with FMODE_NOWAIT. +> + +**[v3: ext4: Don't reduce symlink i_mode by umask if no ACL support](http://lore.kernel.org/linux-fsdevel/1586868.1715341641@warthog.procyon.org.uk/)** + +> If CONFIG_EXT4_FS_POSIX_ACL=n then the fallback version of ext4_init_acl() +> will mask off the umask bits from the new inode's i_mode. This should not +> be done if the inode is a symlink. If CONFIG_EXT4_FS_POSIX_ACL=y, then we +> go through posix_acl_create() instead which does the right thing with +> symlinks. + + +**[v2: fscrypt: try to avoid refing parent dentry in fscrypt_file_open](http://lore.kernel.org/linux-fsdevel/20240508081400.422212-1-mjguzik@gmail.com/)** + +> Merely checking if the directory is encrypted happens for every open +> when using ext4, at the moment refing and unrefing the parent, costing 2 +> atomics and serializing opens of different files. +> +> The most common case of encryption not being used can be checked for +> with RCU instead. +> + +**[v4: fs/coredump: Enable dynamic configuration of max file note size](http://lore.kernel.org/linux-fsdevel/20240506193700.7884-1-apais@linux.microsoft.com/)** + +> Introduce the capability to dynamically configure the maximum file +> note size for ELF core dumps via sysctl. +> This enhancement removes the previous static limit of 4MB, allowing +> system administrators to adjust the size based on system-specific +> requirements or constraints. 
+> + +**[v2: virtiofs: use string format specifier for sysfs tag](http://lore.kernel.org/linux-fsdevel/20240506185713.58678-1-bfoster@redhat.com/)** + +> The existing emit call is a vector for format string injection. Use +> the string format specifier to avoid this problem. +> + +**[v2: epoll: be better about file lifetimes](http://lore.kernel.org/linux-fsdevel/20240505175556.1213266-2-torvalds@linux-foundation.org/)** + +> epoll can call out to vfs_poll() with a file pointer that may race with +> the last 'fput()'. That would make f_count go down to zero, and while +> the ep->mtx locking means that the resulting file pointer tear-down will +> be blocked until the poll returns, it means that f_count is already +> dead, and any use of it won't actually get a reference to the file any +> more: it's dead regardless. +> +> + +**[v1: blk: optimization for classic polling](http://lore.kernel.org/linux-fsdevel/3578876466-3733-1-git-send-email-nj.shetty@samsung.com/)** + +> This removes the dependency on interrupts to wake up the task. Set the task +> state to TASK_RUNNING if need_resched() returns true +> while polling for IO completion. +> Earlier, the polling task used to sleep, relying on an interrupt to wake it up. +> This made some IO take very long when interrupt coalescing is enabled in +> NVMe. +> + +#### Network Devices + +**[v2: net-next: ENA driver changes May 2024](http://lore.kernel.org/netdev/20240512134637.25299-1-darinzon@amazon.com/)** + +> This patchset contains several misc and minor +> changes to the ENA driver. +> + +**[v1: net-next: mlx5 misc patches](http://lore.kernel.org/netdev/20240512124306.740898-1-tariqt@nvidia.com/)** + +> This series includes patches for the mlx5 driver. 
+> + +**[v2: tty: rfcomm: prefer struct_size over open coded arithmetic](http://lore.kernel.org/netdev/AS8PR02MB7237262C62B054FABD7229168BE12@AS8PR02MB7237.eurprd02.prod.outlook.com/)** + +> This is an effort to get rid of all multiplications from allocation +> functions in order to prevent integer overflows. +> +> + +**[v6: net-next: add ethernet driver for Tehuti Networks TN40xx chips](http://lore.kernel.org/netdev/20240512085611.79747-1-fujita.tomonori@gmail.com/)** + +> This patchset adds a new 10G ethernet driver for Tehuti Networks +> TN40xx chips. Note that in mainline, there is a driver for Tehuti Networks +> (drivers/net/ethernet/tehuti/tehuti.[hc]), which supports TN30xx +> chips. +> To make reviewing easier, this patchset has only basic functions. Once +> merged, I'll submit features like ethtool support. +> + +**[v1: bpf, sockmap: defer sk_psock_free_link() using RCU](http://lore.kernel.org/netdev/838e7959-a360-4ac1-b36a-a3469236129b@I-love.SAKURA.ne.jp/)** + +> Defer kfree() using RCU so that the attached BPF program runs +> without holding psock->link_lock. +> + +**[v1: lib80211: Constify struct lib80211_crypto_ops](http://lore.kernel.org/netdev/cover.1715443223.git.christophe.jaillet@wanadoo.fr/)** + +> This series constifies struct lib80211_crypto_ops. This structure is +> mostly function pointers, so having it in a read-only section +> when possible is safer. +> + +**[[PATCH net RFC] net: ethernet: mtk_eth_soc: ppe: add source port comparison](http://lore.kernel.org/netdev/20240511124230.13991-1-eladwf@gmail.com/)** + +> Resolve a packet loss issue under the following conditions: +> - utilizing multiple GMACs +> - device has more than 4GB DRAM +> - using PPE +> + +**[v4: net-next: net: ethernet: mtk_eth_soc: ppe: add support for multiple PPEs](http://lore.kernel.org/netdev/20240511122659.13838-1-eladwf@gmail.com/)** + +> Add the missing pieces to allow multiple PPE units, one for each GMAC. 
+> mtk_gdm_config has been modified to work on a targeted MAC ID, +> with the inner loop moved outside of the function to allow unrelated +> operations like setting the MAC's PPE index. +> + +**[v2: net-next: net: qede: flower: validate control flags](http://lore.kernel.org/netdev/20240511073705.230507-1-ast@fiberby.net/)** + +> Use flow_rule_match_has_control_flags() to check for control flags, +> such as those that can be set through `tc flower ... ip_flags frag`. +> +> In case any control flags are masked, flow_rule_match_has_control_flags() +> sets an NL extended error message, and we return -EOPNOTSUPP. +> + +**[v1: net-next: selftests: netfilter: nft_flowtable.sh: bump socat timeout to 1m](http://lore.kernel.org/netdev/20240511064814.561525-1-fw@strlen.de/)** + +> Looks like socat gets zapped too quickly, so increase timeout to 1m. +> +> Could also reduce tx file size for KSFT_MACHINE_SLOW, but it's preferable +> to have the same test for both debug and nondebug. +> + +**[v5: net-next: virtio_net: rx enable premapped mode by default](http://lore.kernel.org/netdev/20240511031404.30903-1-xuanzhuo@linux.alibaba.com/)** + +> This patch set makes the big mode of virtio-net support premapped mode +> and enables premapped mode for rx by default. +> + +> + +**[v2: net-next: net: fec: Convert fec driver to use lock guards](http://lore.kernel.org/netdev/20240511030229.628287-1-wei.fang@nxp.com/)** + +> Convert the fec driver to use guard() and scoped_guard() +> defined in linux/cleanup.h to automate lock lifetime control in the +> fec driver. +> + +**[v2: net-next: selftests: net: local_termination: annotate the expected failures](http://lore.kernel.org/netdev/20240511013236.383368-1-kuba@kernel.org/)** + +> The bridge driver fares particularly badly [...] mainly because +> it does not implement IFF_UNICAST_FLT. +> +> We don't want to hide the known gaps, but having a test which +> always fails prevents us from catching regressions. Report +> the cases we know may fail as XFAIL. 
+> + +**[v1: ynl: ensure exact-len value is resolved](http://lore.kernel.org/netdev/20240510232202.24051-1-a@unstable.cc/)** + +> For type String and Binary we are currently using the exact-len +> limit value as is without attempting any name resolution. +> However, the spec may specify the name of a constant rather than an +> actual value, which would result in using the constant name as is +> and thus break the policy. +> +> Ensure the limit value is passed to get_limit(), which will always +> attempt resolving the name before printing the policy rule. +> + +**[v9: net-next: Device Memory TCP](http://lore.kernel.org/netdev/20240510232128.1105145-1-almasrymina@google.com/)** + +**[v2: net-next: net: ethernet: cortina: TSO and pause param](http://lore.kernel.org/netdev/20240511-gemini-ethernet-fix-tso-v2-0-2ed841574624@linaro.org/)** + +> This restores the TSO support as we put it on the back +> burner a while back. + +**[v1: pci: Add ACS quirk for Broadcom BCM5760X NIC](http://lore.kernel.org/netdev/20240510204228.73435-1-ajit.khaparde@broadcom.com/)** + +> Add an ACS quirk for this device so the functions can be in independent +> IOMMU groups and attached individually to userspace applications using +> VFIO. +> + +**[v1: net-next: Add TX stop/wake counters](http://lore.kernel.org/netdev/20240510201927.1821109-1-danielj@nvidia.com/)** + +> Several drivers provide TX stop and wake counters via ethtool stats. Add +> those to the netdev queue stats, and use them in virtio_net. +> + +**[v8: bpf qdisc](http://lore.kernel.org/netdev/20240510192412.3297104-1-amery.hung@bytedance.com/)** + +> This is the v8 of bpf qdisc patchset. While I would like to do more +> testing and performance evaluation, I think posting it now may help +> discussions in the upcoming LSF/MM/BPF. 
+> +> + +**[[PATCH net-next 14/15 v2] net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.](http://lore.kernel.org/netdev/20240510162121.f-tvqcyf@linutronix.de/)** + +> The XDP redirect process has two stages: +> - bpf_prog_run_xdp() is invoked to run an eBPF program which inspects the +> packet and makes decisions. While doing that, the per-CPU variable +> bpf_redirect_info is used. +> +> - Afterwards xdp_do_redirect() is invoked and accesses bpf_redirect_info +> and it may also access other per-CPU variables like xskmap_flush_list. + + +**[v3: net-next: net: A lightweight zero-copy notification](http://lore.kernel.org/netdev/20240510155900.1825946-1-zijianzhang@bytedance.com/)** + +> While making maximum reuse of the existing MSG_ZEROCOPY related code, +> this patch set introduces a new zerocopy socket notification mechanism. +> Users of sendmsg pass a control message as a placeholder for the incoming +> notifications. Upon returning, the kernel embeds notifications directly into +> the user arguments passed in. By doing so, we can significantly reduce the +> complexity and overhead of managing notifications. In an ideal pattern, +> the user will keep calling sendmsg with SCM_ZC_NOTIFICATION msg_control, +> and the notification will be delivered as soon as possible. +> +> + +**[v1: iwl-next: idpf: XDP chapter I: convert Rx to libeth](http://lore.kernel.org/netdev/20240510152620.2227312-1-aleksander.lobakin@intel.com/)** + +> Applies on top of "idpf: don't enable NAPI and interrupts prior to +> allocating Rx buffers" from Tony's tree. +> Sent as RFC as we're at the end of the development cycle and several +> kdocs are messed up. I'll fix them when sending non-RFC after the window +> opens. 
+> + +**[[net-next PATCH] test: hsr: Extend the hsr_redbox.sh to have more SAN devices connected](http://lore.kernel.org/netdev/20240510143710.3916631-1-lukma@denx.de/)** + +> After this change the single SAN device (ns3eth1) is now replaced with +> two SAN devices, ns4eth1 and ns5eth1 respectively. +> +> It is possible to extend this script to have more SAN devices connected +> by adding them to the ns3br1 bridge. +> + +**[v1: bpf-next: netfilter: Add the capability to offload flowtable in XDP layer](http://lore.kernel.org/netdev/cover.1715348200.git.lorenzo@kernel.org/)** + +> Introduce the bpf_xdp_flow_offload_lookup kfunc in order to perform the +> lookup of a given flowtable entry based on the fib tuple of incoming +> traffic. +> bpf_xdp_flow_offload_lookup can be used as a building block to offload +> the sw flowtable processing in XDP when hw support is not available. + +> + +**[v1: dt-bindings: mfd: syscon: Add more simple compatibles](http://lore.kernel.org/netdev/20240510123018.3902184-1-robh@kernel.org/)** + +> Add another batch of various "simple" syscon compatibles which were +> undocumented or still documented with old text bindings. Remove the old +> text binding docs for the ones which were documented. +> + +**[v2: net-next: tcp: support rstreasons in the passive logic](http://lore.kernel.org/netdev/20240510122502.27850-1-kerneljasonxing@gmail.com/)** + +> In this series, I split all kinds of reasons into five parts which, I +> think, can be easily reviewed. I implement the corresponding +> rstreasons in each of those functions. After this, we can trace the whole tcp +> passive reset with clear reasons. +> + +**[v1: net-next: selftests: net: use upstream mtools](http://lore.kernel.org/netdev/20240510112856.1262901-1-vladimir.oltean@nxp.com/)** + + +> Check that the deployed mtools version is 3.0 or above. Note that the +> version check breaks compatibility with my fork where I didn't bump the +> version, but I assume that won't be a problem. 
+> + +**[v2: net: ptp: ocp: adjust serial port symlink creation](http://lore.kernel.org/netdev/20240510110405.15115-1-vadim.fedorenko@linux.dev/)** + +> The commit b286f4e87e32 ("serial: core: Move tty and serdev to be children +> of serial core port device") changed the hierarchy of serial port devices +> and device_find_child_by_name cannot find ttyS* devices because they are +> no longer directly attached. Add some logic to restore symlinks creation +> to the driver for OCP TimeCard. +> + +**[v1: net-next: netconsole: Do not shutdown dynamic configuration if cmdline is invalid](http://lore.kernel.org/netdev/20240510103005.3001545-1-leitao@debian.org/)** + +> If a user provides an invalid netconsole configuration during boot time +> (e.g., specifying an invalid ethX interface), netconsole will be +> entirely disabled. Consequently, the user won't be able to create new +> entries in /sys/kernel/config/netconsole/ as that directory does not +> exist. +> Create /sys/kernel/config/netconsole/ even if the command line arguments +> are invalid, so, users can create dynamic entries in netconsole. +> + +**[v3: Add Bananapi R3 Mini](http://lore.kernel.org/netdev/20240510095707.6895-1-linux@fw-web.de/)** + +> Add mt7986 based BananaPi R3 Mini SBC. +> + +**[v6: net-next: net: stmmac: Add support for RZN1 GMAC devices](http://lore.kernel.org/netdev/20240510-rzn1-gmac1-v6-0-b63942be334c@bootlin.com/)** + +> This series consists of a devicetree binding describing the RZN1 GMAC +> controller IP, a node for the GMAC1 device in the r9a06g032 SoC device +> tree, and the GMAC driver itself which is a glue layer in stmmac. + + +**[v2: iwl-next: ice:Support to dump PHY config, FEC](http://lore.kernel.org/netdev/20240510065243.906877-1-anil.samal@intel.com/)** + +> Implementation to dump PHY configuration and FEC statistics to +> facilitate link level debugging of customer issues. 
+> Implementation has
+> two parts
+>
+>
+
+**[v2: net-next: mlx5: Add netdev-genl queue stats](http://lore.kernel.org/netdev/20240510041705.96453-1-jdamato@fastly.com/)**
+
+> This change adds support for the per queue netdev-genl API to mlx5,
+> which seems to output stats
+>
+
+**[v1: net-next: Introduce IPPROTO_SMC](http://lore.kernel.org/netdev/1715314333-107290-1-git-send-email-alibuda@linux.alibaba.com/)**
+
+> This patch allows creating an SMC socket via AF_INET,
+> similar to the following code.
+
+**[v1: net-next: add basic PSP encryption for TCP connections](http://lore.kernel.org/netdev/20240510030435.120935-1-kuba@kernel.org/)**
+
+>
+> Add support for PSP encryption of TCP connections.
+>
+
+#### Security Hardening
+
+**[v1: perf/x86/amd/uncore: Add flex array to struct amd_uncore_ctx](http://lore.kernel.org/linux-hardening/AS8PR02MB7237E4848B44A5226BD3551C8BE02@AS8PR02MB7237.eurprd02.prod.outlook.com/)**
+
+> This is an effort to get rid of all multiplications from allocation
+> functions in order to prevent integer overflows.
+>
+
+**[v3: perf/ring_buffer: Prefer struct_size over open coded arithmetic](http://lore.kernel.org/linux-hardening/AS8PR02MB72379B4807F3951A1B926BA58BE02@AS8PR02MB7237.eurprd02.prod.outlook.com/)**
+
+> This is an effort to get rid of all multiplications from allocation
+> functions in order to prevent integer overflows.
+>
+>
+
+**[v1: seccomp: Constify sysctl subhelpers](http://lore.kernel.org/linux-hardening/20240508171337.work.861-kees@kernel.org/)**
+
+> The read_actions_logged() and write_actions_logged() helpers called by the
+> sysctl proc handler seccomp_actions_logged_handler() are already expecting
+> their sysctl table argument to be read-only. Actually mark the argument
+> as const in preparation[1] for global constification of the sysctl tables.
+>
+
+**[v1: Mitigating unexpected arithmetic overflow](http://lore.kernel.org/linux-hardening/202404291502.612E0A10@keescook/)**
+
+> Over the last decade or so, our work hardening against weaknesses
+> in various kernel APIs and eliminating the ambiguities in C language
+> semantics have traditionally been somewhat off in one corner or another
+> of the Linux codebase. This topic is going to be much different as
+> it is ultimately about the C type system, which is rather front and
+> center. So, hold on to your hats while I try to explain what's desired
+> here. Please try to reserve judgement until the end; as we've explored
+> the topic we've found a lot of nuances, which I've tried to touch on
+> below.
+>
+
+**[v2: uapi: stddef.h: Provide UAPI macros for __counted_by_{le, be}](http://lore.kernel.org/linux-hardening/AS8PR02MB72372E45071E8821C07236F78BE42@AS8PR02MB7237.eurprd02.prod.outlook.com/)**
+
+> This commit only provides UAPI macros for UAPI structs that will
+> gain annotations for __counted_by_{le, be} attributes. And it is the
+> previous step to be able to use these attributes in UAPI.
+>
+
+**[v2: Introduce STM32 DMA3 support](http://lore.kernel.org/linux-hardening/20240507125442.3989284-1-amelie.delaunay@foss.st.com/)**
+
+> STM32 DMA3 is a direct memory access controller with different features
+> depending on its hardware configuration. It is either called LPDMA (Low
+> Power), GPDMA (General Purpose) or HPDMA (High Performance), and it can
+> be found in new STM32 MCUs and MPUs.
+>
+
+**[v1: net-next: netdevice: define and allocate &net_device _properly_](http://lore.kernel.org/linux-hardening/20240507123937.15364-1-aleksander.lobakin@intel.com/)**
+
+> In fact, this structure contains a flexible array at the end, but
+> historically its size, alignment etc., is calculated manually.
+> There are several instances of the structure embedded into other
+> structures, but also there's ongoing effort to remove them and we
+> could in the meantime declare &net_device properly.
+> Declare the array explicitly, use struct_size() and store the array
+> size inside the structure, so that __counted_by() can be applied.
+> Don't use PTR_ALIGN(), as SLUB itself tries its best to ensure the
+> allocated buffer is aligned to what the user expects.
+> Also, change its alignment from %NETDEV_ALIGN to the cacheline size
+> as per several suggestions on the netdev ML.
+>
+
+**[v1: cdrom: rearrange last_media_change check to avoid unintentional overflow](http://lore.kernel.org/linux-hardening/20240507-b4-sio-ata1-v1-1-810ffac6080a@google.com/)**
+
+> When running syzkaller with the newly reintroduced signed integer wrap
+> sanitizer we encounter this splat.
+>
+
+#### Asynchronous IO
+
+**[v2: io_uring/rsrc: coalescing multi-hugepage registered buffers](http://lore.kernel.org/io-uring/20240511055229.352481-1-cliang01.li@samsung.com/)**
+
+
+> This patch series enables coalescing registered buffers with more than
+> one hugepage. It optimizes the DMA-mapping time and saves memory for
+> these kinds of buffers.
+
+
+**[v3: io_uring: support sqe group and provide group kbuf](http://lore.kernel.org/io-uring/20240511001214.173711-1-ming.lei@redhat.com/)**
+
+> When running 64KB block size test on ublk-loop('ublk add -t loop --buffered_io -f $backing'),
+> it is observed that performance is doubled.
+>
+
+**[v2: io_uring: support to inject result for NOP](http://lore.kernel.org/io-uring/20240510035031.78874-1-ming.lei@redhat.com/)**
+
+> The two patches add nop_flags to support injecting a result on NOP.
+>
+
+**[v1: Propagate back queue status on accept](http://lore.kernel.org/io-uring/20240509180627.204155-1-axboe@kernel.dk/)**
+
+> This series starts by changing the proto/proto_ops accept prototypes
+> to eliminate flags/errp/kern and replace them with a structure that
+> encompasses all of them.
+>
+
+**[v1: io_uring: add IORING_OP_NOP_FAIL](http://lore.kernel.org/io-uring/20240509023413.4124075-1-ming.lei@redhat.com/)**
+
+> Add IORING_OP_NOP_FAIL so that it is easy to inject failure from
+> userspace.
+>
+> Like IORING_OP_NOP, the main use case is testing, and it is very helpful
+> for covering failure handling code in io_uring core change.
+>
+
+**[v1: io_uring/filetable: don't unnecessarily clear/reset bitmap](http://lore.kernel.org/io-uring/0dbe5c36-b2b0-4f56-8c80-f56e09213285@kernel.dk/)**
+
+> If we're updating an existing slot, we clear the slot bitmap only to
+> set it again right after. Just leave the bit set rather than toggle
+> it off and on, and move the unused slot setting into the branch of
+> not already having a file occupy this slot.
+>
+
+**[v3: io_uring/io-wq: Use set_bit() and test_bit() at worker->flags](http://lore.kernel.org/io-uring/20240507170002.2269003-1-leitao@debian.org/)**
+
+> Utilize set_bit() and test_bit() on worker->flags within io_uring/io-wq
+> to address potential data races.
+> These races involve writes and reads to the same memory location by
+>
+
+**[v1: io_uring/rsrc: Add support for multi-folio buffer coalescing](http://lore.kernel.org/io-uring/20240506075303.25630-1-cliang01.li@samsung.com/)**
+
+> Currently fixed buffers consisting of pages in the same folio (huge page)
+> can be coalesced into a single bvec entry at registration.
+> This patch expands it to support coalescing fixed buffers
+> with multiple folios.
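
The coalescing idea in the last entry can be pictured with a small user-space model. This is only a sketch, not the kernel's implementation: `coalesce_by_folio`, the page frame numbers and the `folio_pages` parameter are all hypothetical, and it assumes folios are naturally aligned to their size. Pages that are physically contiguous and belong to the same folio collapse into one segment, so a buffer spanning several folios yields one segment per folio instead of one per page.

```python
from typing import List, Tuple


def coalesce_by_folio(pfns: List[int], folio_pages: int) -> List[Tuple[int, int]]:
    """Merge pages into (start_pfn, page_count) segments, one per folio.

    Illustrative model only: a page joins the previous segment when it is
    physically contiguous with it and falls inside the same folio. Folios
    are assumed to be aligned to `folio_pages`.
    """
    segments: List[Tuple[int, int]] = []
    for pfn in pfns:
        if segments:
            start, count = segments[-1]
            contiguous = pfn == start + count
            same_folio = pfn // folio_pages == start // folio_pages
            if contiguous and same_folio:
                # Extend the current segment instead of opening a new one.
                segments[-1] = (start, count + 1)
                continue
        segments.append((pfn, 1))
    return segments


if __name__ == "__main__":
    # Two 4-page folios backing one 8-page buffer: 8 entries shrink to 2.
    print(coalesce_by_folio([8, 9, 10, 11, 20, 21, 22, 23], folio_pages=4))
```

In this model the saving comes from tracking one entry per folio rather than one per page, which is the kind of reduction the patch description refers to.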
+
+#### Rust For Linux
+
+**[v2: kbuild: rust: split up helpers.c](http://lore.kernel.org/rust-for-linux/20240507210818.672517-1-ojeda@kernel.org/)**
+
+> Each helper file is listed explicitly and thus conflicts in the file
+> list are still likely. However, they should be simpler to resolve than
+> the conflicts usually seen in helpers.c.
+
+**[v1: rust: alloc: use `if` instead of `match` in VecExt::reserve()](http://lore.kernel.org/rust-for-linux/20240507201709.105693-1-dakr@redhat.com/)**
+
+> In commit 1161057f53f6 ("rust: alloc: fix dangling pointer in
+> VecExt::reserve()") the check for zero of a vector's capacity has
+> been implemented using a `match` statement. Using an `if` statement
+> is the preferred style, hence change that.
+>
+
+#### BPF
+
+**[v2: bpf-next: bpf: make list_for_each_entry portable](http://lore.kernel.org/bpf/20240511212243.23477-1-jose.marchesi@oracle.com/)**
+
+> This patch adds a new macro can_loop to bpf_experimental, that
+> implements the same logic as cond_break but evaluates to a boolean
+> expression. The patch also changes all the current instances of usage
+> of cond_break within loop headers accordingly.
+>
+
+**[v1: bpf-next: bpf: disable strict aliasing in test_global_func9.c](http://lore.kernel.org/bpf/20240511212213.23418-1-jose.marchesi@oracle.com/)**
+
+> The BPF selftest test_global_func9.c performs type punning and breaks
+> strict-aliasing rules.
+> In this case the warning is not emitted, because s-> is initialized.
+>
+> This patch disables strict aliasing in this test when building with
+> GCC. clang seems to not optimize this particular code even when
+> strict aliasing is enabled.
+>
+
+**[v1: bpf-next: selftests/bpf: Free strdup memory in xdp_hw_metadata](http://lore.kernel.org/bpf/af9bcccb96655e82de5ce2b4510b88c9c8ed5ed0.1715417367.git.tanggeliang@kylinos.cn/)**
+
+> This patch adds the missing "free(saved_hwtstamp_ifname)" in cleanup()
+> to avoid a potential memory leak in xdp_hw_metadata.c.
+>
+
+**[v1: bpf-next: use network helpers, part 5](http://lore.kernel.org/bpf/cover.1715396405.git.tanggeliang@kylinos.cn/)**
+
+> This patchset uses post_socket_cb and post_connect_cb callbacks of struct
+> network_helper_opts to refactor do_test() in bpf_tcp_ca.c to move dctcp
+> test dedicated code out of do_test() into test_dctcp().
+
+
+**[v2: riscv, bpf: Optimize zextw insn with Zba extension](http://lore.kernel.org/bpf/20240511023436.3282285-1-xiao.w.wang@intel.com/)**
+
+> The Zba extension provides add.uw insn which can be used to implement
+> zext.w with rs2 set as ZERO.
+>
+
+**[v4: perf/core: Check sample_type in sample data saving helper functions](http://lore.kernel.org/bpf/20240510191423.2297538-1-yabinc@google.com/)**
+
+> We use helper functions to save raw data, callchain and branch stack in
+> perf_sample_data. These functions update perf_sample_data->dyn_size without
+> checking event->attr.sample_type, which may result in unused space allocated in
+> sample records. To prevent this from happening, this patchset enforces checking
+> sample_type of an event in these helper functions.
+>
+
+**[v1: bpf-next: Retire progs/test_sock_addr.c](http://lore.kernel.org/bpf/20240510190246.3247730-1-jrife@google.com/)**
+
+> This patch series migrates remaining tests from bpf/test_sock_addr.c to
+> prog_tests/sock_addr.c and progs/verifier_sock_addr.c in order to fully
+> retire the old-style test program and expands test coverage to test
+> previously untested scenarios related to sockaddr hooks.
+>
+
+**[[PATCH net-next 14/15 v2] net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.](http://lore.kernel.org/bpf/20240510162121.f-tvqcyf@linutronix.de/)**
+
+> At the very end of the NAPI callback, xdp_do_flush() is invoked which
+> does not access bpf_redirect_info but will touch the individual per-CPU
+> lists.
+>
+> On PREEMPT_RT the pointer to bpf_net_context is saved in the task's
+> task_struct.
+> On non-PREEMPT_RT builds the pointer is saved in a per-CPU
+> variable (which is always NODE-local memory). Always using the
+> bpf_net_context approach has the advantage that there is almost zero
+>
+
+**[v2: bpf-next: bpf: make trusted args nullable](http://lore.kernel.org/bpf/20240510122823.1530682-1-vadfed@meta.com/)**
+
+> Current verifier checks for the arg to be nullable after checking for
+> certain pointer types. It prevents programs from passing NULL to kfunc args
+> even if they are marked as nullable. This patchset adjusts verifier and
+> changes bpf crypto kfuncs to allow null for IV parameter which is
+> optional for some ciphers. Benchmark shows
+> 4% improvements when there
+> is no need to initialise 0-sized dynptr.
+>
+
+**[v1: dwarves: btf_encoder: add "distilled_base" BTF feature to split BTF generation](http://lore.kernel.org/bpf/20240510104847.858922-1-alan.maguire@oracle.com/)**
+
+> Adding "distilled_base" to --btf_features when generating split BTF will
+> create split and .BTF.base BTF - the latter allows us to map references
+> from split BTF to base BTF, even if that base BTF has changed. It does
+> this by providing just enough information about the base types in the
+> .BTF.base section.
+>
+> Patch is applicable on the "next" branch of dwarves, and requires the
+> libbpf from the series in
+>
+>
+
+**[v3: bpf-next: bpf: support resilient split BTF](http://lore.kernel.org/bpf/20240510103052.850012-1-alan.maguire@oracle.com/)**
+
+> For a STRUCT sk_buff, a module that
+> uses that structure (or a pointer to it) simply needs to refer to the
+> core kernel type id, saving the need to define the structure and its many
+> dependents. This cuts down on duplication and makes BTF as compact
+> as possible.
+>
+
+
+**[v5: bpf-next: Enable BPF programs to declare arrays of kptr, bpf_rb_root, and bpf_list_head.](http://lore.kernel.org/bpf/20240510011312.1488046-1-thinker.li@gmail.com/)**
+
+
+> The patch set aims to enable the use of these specific types in arrays
+> and struct fields, providing flexibility. It examines the types of
+> global variables or the value types of maps, such as arrays and struct
+> types, recursively to identify these special types and generate field
+> information for them.
+
+**[v3: bpf-next: Notify user space when a struct_ops object is detached/unregistered](http://lore.kernel.org/bpf/20240510002942.1253354-1-thinker.li@gmail.com/)**
+
+> This patch set enables the detach feature for struct_ops links and
+> sends an event to epoll when a link is detached. Subsystems could call
+> link->ops->detach() to detach a link and notify user space programs
+> through epoll.
+>
+
+**[v3: perf:core: Save raw sample data](http://lore.kernel.org/bpf/20240510002424.1277314-1-yabinc@google.com/)**
+
+**[v8: bpf-next: Replace mono_delivery_time with tstamp_type](http://lore.kernel.org/bpf/20240509211834.3235191-1-quic_abchauha@quicinc.com/)**
+
+
+### Related Technology Updates
+
+#### Qemu
+
+**[v1: target/riscv: Support RISC-V privilege 1.13 spec](http://lore.kernel.org/qemu-devel/20240510065856.2436870-1-fea.wang@sifive.com/)**
+
+> Based on the change log for the RISC-V privilege 1.13 spec, add the
+> support for ss1p13.
+>
+>
+
+**[v5: target/riscv: Implement dynamic establishment of custom decoder](http://lore.kernel.org/qemu-devel/20240506023607.29544-1-eric.huang@linux.alibaba.com/)**
+
+> In this patch, we modify the decoder to be a freely composable data
+> structure instead of a hardcoded one. It can be dynamically built up
+> according to the extensions.
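
As a rough picture of what a "freely composable" decoder means in the entry above, the sketch below builds the decoder list at run time from the set of enabled extensions instead of hardcoding the call order. It is a simplified model and not QEMU code: `decode_base`, `decode_vendor`, `AVAILABLE_DECODERS`, `build_decoders` and the extension names are hypothetical, and only the low seven opcode bits are inspected (0x13 is the RISC-V OP-IMM major opcode, 0x0B is custom-0).

```python
from typing import Callable, List, Optional, Set, Tuple

# A decoder returns a description when it recognises the instruction,
# or None so that the next decoder in the list can try.
Decoder = Callable[[int], Optional[str]]


def decode_base(insn: int) -> Optional[str]:
    # Hypothetical base decoder: only recognises the OP-IMM major opcode.
    return "base-op-imm" if insn & 0x7F == 0x13 else None


def decode_vendor(insn: int) -> Optional[str]:
    # Hypothetical vendor decoder living in the custom-0 opcode space.
    return "vendor-custom0" if insn & 0x7F == 0x0B else None


# Vendor decoders come first so they can claim their opcodes before the base one.
AVAILABLE_DECODERS: List[Tuple[str, Decoder]] = [
    ("vendor_ext", decode_vendor),
    ("base", decode_base),
]


def build_decoders(enabled: Set[str]) -> List[Decoder]:
    """Compose the decoder list from the extensions enabled on this CPU."""
    return [fn for name, fn in AVAILABLE_DECODERS if name in enabled]


def decode(insn: int, decoders: List[Decoder]) -> Optional[str]:
    """Try each composed decoder in order; None means illegal instruction."""
    for decoder in decoders:
        result = decoder(insn)
        if result is not None:
            return result
    return None


if __name__ == "__main__":
    cpu_decoders = build_decoders({"base", "vendor_ext"})
    print(decode(0x00000013, cpu_decoders), decode(0x0000000B, cpu_decoders))
```

A CPU model without the vendor extension simply gets a shorter list, so no per-instruction extension check is needed in the hot path of this model.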
+
+
+#### Buildroot
+
+**[arch: allow riscv32 noMMU configuration](http://lore.kernel.org/buildroot/20240512102052.B5C2087099@busybox.osuosl.org/)**
+
+
+**[v1: configs/qemu_riscv32_nommu_virt_defconfig: New defconfig](http://lore.kernel.org/buildroot/Zj8iP0Dx8Q%2FTWeNM@waldemar-brodkorb.de/)**
+
+> Add new defconfig for Qemu RISCV32 w/o MMU.
+>
+
+**[package/bpftool: enable on riscv](http://lore.kernel.org/buildroot/20240509164200.65F6286CD0@busybox.osuosl.org/)**
+
+
+> bpftool supports RISC-V, including rv64 and rv32, so let's enable the
+> bpftool package on RISC-V.
+>
+
+#### U-Boot
+
+**[v4: board: starfive: add Milk-V Mars CM support](http://lore.kernel.org/u-boot/20240512042528.7766-1-heinrich.schuchardt@canonical.com/)**
+
+> With this series the Milk-V Mars CM board can be booted.
+>
+>
+
+**[v1: Add Starfive JH7110 Cadence USB driver](http://lore.kernel.org/u-boot/20240504150358.19600-1-minda.chen@starfivetech.com/)**
+
+> Add Starfive JH7110 Cadence USB driver and related PHY driver.
+> So the code can be used on the visionfive2 and milkv 7110 boards.
+>
+
+
 ## 20240505: Issue 90

 ### Kernel Updates