diff --git a/news/README.md b/news/README.md index c42be1990ace65d3aae0fb0f1952a9b53c978efb..6bec72c88a66c1b6c03398bfdf27371282dc243f 100644 --- a/news/README.md +++ b/news/README.md @@ -7,6 +7,733 @@ * [2023 年 - 下半年](2023-2nd-half.md) +## 20240602:第 94 期 + +### 内核动态 + +#### RISC-V 架构支持 + +**[v6: RISC-V: ACPI: Add external interrupt controller support](http://lore.kernel.org/linux-riscv/20240601150411.1929783-1-sunilvl@ventanamicro.com/)** + +> This series adds support for the below ECR approved by ASWG. +> The series primarily enables irqchip drivers for RISC-V ACPI based +> platforms. + +**[v0: RISC-V: Use Zkr to seed KASLR base address](http://lore.kernel.org/linux-riscv/20240531162327.2436962-1-jesse@rivosinc.com/)** + +> Dectect the Zkr extension and use it to seed the kernel base address. + +**[v1: RISC-V: Implement ioremap_wc/wt](http://lore.kernel.org/linux-riscv/20240531100407.282-1-dqfext@gmail.com/)** + +> To improve performance, map the memory as weakly-ordered non-cacheable +> normal memory. + +**[v1: riscv: stacktrace: Add USER_STACKTRACE support](http://lore.kernel.org/linux-riscv/20240531083258.386709-1-ruanjinjie@huawei.com/)** + +> So use the +> perf_callchain_user() code as blueprint to implement the +> arch_stack_walk_user() which add userstacktrace support on riscv. + + +**[v1: external ulpi vbus control](http://lore.kernel.org/linux-riscv/20240531-citable-copier-188d32c108ff@wendy/)** + +> A customer sent me a patch adding a dt property to enable external vbus +> control as their phy didn't support it*. I was surprised to see that none +> of the other musb drivers made any use of this, but there is handling +> in the musb core for it - made me feel like I was missing something as +> to why it was not used by other drivers. + +**[v1: Revert "riscv: mm: accelerate pagefault when badaccess"](http://lore.kernel.org/linux-riscv/20240530164451.21336-1-palmer@rivosinc.com/)** + +> I accidentally picked up an earlier version of this patch, which had +> already landed via mm. The patch I picked up contains a bug, which I +> kept as I thought it was a fix. So let's just revert it. + +**[v1: riscv: sophgo: add thermal sensor support for cv180x/sg200x SoCs](http://lore.kernel.org/linux-riscv/SEYPR01MB422119B40F4CF05B823F93DCD7F32@SEYPR01MB4221.apcprd01.prod.exchangelabs.com/)** + +> This series implements driver for Sophgo cv180x/sg200x on-chip thermal +> sensor and adds common thermal zones for these SoCs. + +**[v1: riscv: dts: thead: th1520: Add PMU event node](http://lore.kernel.org/linux-riscv/IA1PR20MB4953BA3638A0839FCB0EF86BBBF32@IA1PR20MB4953.namprd20.prod.outlook.com/)** + +> T-HEAD th1520 uses standard C910 chip and its pmu is already supported +> by OpenSBI. + +**[v1: irqchip/sifive-plic: Chain to parent IRQ after handlers are ready](http://lore.kernel.org/linux-riscv/20240529215458.937817-1-samuel.holland@sifive.com/)** + +> Now that the PLIC uses a platform driver, the driver probed later in the +> boot process, where interrupts from peripherals might already be +> pending. + +**[v1: riscv: perf: Add support for Control Transfer Records Ext.](http://lore.kernel.org/linux-riscv/20240529185337.182722-1-rkanwal@rivosinc.com/)** + +> This series enables Control Transfer Records extension support on riscv +> platform. + +**[v1: RISC-V: hwprobe: Add MISALIGNED_PERF key](http://lore.kernel.org/linux-riscv/20240529182649.2635123-1-evan@rivosinc.com/)** + +> This +> causes problems when used in conjunction with RISCV_HWPROBE_WHICH_CPUS, +> since SLOW, FAST, and EMULATED have values whose bits overlap with +> each other. + +**[v4: mm: multi-gen LRU: Walk secondary MMU page tables while aging](http://lore.kernel.org/linux-riscv/20240529180510.2295118-1-jthoughton@google.com/)** + +> This patchset makes it possible for MGLRU to consult secondary MMUs +> while doing aging, not just during eviction. + +**[v1: Zacas/Zabha support and qspinlocks](http://lore.kernel.org/linux-riscv/20240528151052.313031-1-alexghiti@rivosinc.com/)** + +> This implements [cmp]xchgXX() macros using Zacas and Zabha extensions +> and finally uses those newly introduced macros to add support for +> qspinlocks: note that this implementation of qspinlocks satisfies the +> forward progress guarantee. + +**[v1: clk: clkdev: don't fail clkdev_alloc() if over-sized](http://lore.kernel.org/linux-riscv/E1sBrzn-00E8GK-Ue@rmk-PC.armlinux.org.uk/)** + +> Don't fail clkdev_alloc() if the strings are over-sized. In this case, +> the entry will not match during lookup, so its useless. + +**[v1: clk: sifive: Do not register clkdevs for PRCI clocks](http://lore.kernel.org/linux-riscv/20240528001432.1200403-1-samuel.holland@sifive.com/)** + +> These clkdevs were unnecessary, because systems using this driver always +> look up clocks using the devicetree. + +**[v1: irqchip/riscv-aplic: Simplify the to_of_node code](http://lore.kernel.org/linux-riscv/20240527125000.3502935-1-ruanjinjie@huawei.com/)** + +> The to_of_node has is_of_node check, so there is no need to repeat the +> is_of_node and to_of_node. And if is_of_node is false, the to_of_node will +> return NULL, the of_property_present will also return NULL, so remove the +> redundant check. + +**[v1: Add board support for Sipeed LicheeRV Nano](http://lore.kernel.org/linux-riscv/20240527-sg2002-v1-0-1b6cb38ce8f4@bootlin.com/)** + +> The LicheeRV Nano is a RISC-V SBC based on the Sophgo SG2002 chip. Adds +> minimal device tree files for this board to make it boot to a basic +> shell. + +**[v1: PCI: microchip: support using either instance 1 or 2](http://lore.kernel.org/linux-riscv/20240527-slather-backfire-db4605ae7cd7@wendy/)** + +> The current driver and binding for PolarFire SoC's PCI controller assume +> that the root port instance in use is instance 1. + +**[v2: riscv: lib: relax assembly constraints in hweight](http://lore.kernel.org/linux-riscv/20240527092405.134967-1-dqfext@gmail.com/)** + +> rd and rs don't have to be the same. In some cases where rs needs to be +> saved for later usage, this will save us some mv instructions. + +**[v1: RISC-V: io: Don't have a void* PCI_IOBASE](http://lore.kernel.org/linux-riscv/20240526213617.12890-2-palmer@rivosinc.com/)** + + +**[v1: riscv: enable HAVE_ARCH_HUGE_VMAP for XIP kernel](http://lore.kernel.org/linux-riscv/20240526110104.470429-1-namcao@linutronix.de/)** + +> This also fixes a boot problem for XIP kernel introduced by the commit in +> "Fixes:". This commit used huge page mapping for vmemmap, but huge page +> vmap was not enabled for XIP kernel. + +#### LoongArch 架构支持 + +**[v3: LoongArch: KVM: Add Binary Translation extension support](http://lore.kernel.org/loongarch/20240527074644.836699-1-maobibo@loongson.cn/)** + +> Like FPU extension, here late enabling method is used for LBT. LBT context +> is saved/restored on vcpu context switch path. +> Also this patch set BT capability detection, and BT register get/set +> interface for userspace vmm, so that vm supports migration with BT +> extension. + +#### 进程调度 + +**[v1: sched,x86: export percpu arch_freq_scale](http://lore.kernel.org/lkml/20240530181548.2039216-1-pauld@redhat.com/)** + +> Export the +> underlying percpu symbol on x86 so that external trace point helper +> modules can be made to work again. + +**[v2: sched/fair: Reschedule the cfs_rq when current is ineligible](http://lore.kernel.org/lkml/20240529141806.16029-1-spring.cxz@gmail.com/)** + +> I found that some tasks have been running for a long enough time and +> have become illegal, but they are still not releasing the CPU. This +> will increase the scheduling delay of other processes. Therefore, I +> tried checking the current process in wakeup_preempt and entity_tick, +> and if it is illegal, reschedule that cfs queue. + +**[v1: sched: core: quota and parent_quota can be uninitialized and assigned values](http://lore.kernel.org/lkml/20240528115350.1927-1-zeming@nfschina.com/)** + +> quota and parent_quota are first assigned values, so their use is not +> affected. + +#### 内存管理 + +**[v1: mm: increase totalram_pages on freeing to buddy system](http://lore.kernel.org/linux-mm/20240601133402.2675-1-richard.weiyang@gmail.com/)** + +> Total memory represents pages managed by buddy system. After the +> introduction of DEFERRED_STRUCT_PAGE_INIT, it may count the pages before +> being managed. + +**[v1: maple_tree: add mas_node_count() before going to slow_path in mas_wr_modify()](http://lore.kernel.org/linux-mm/20240601025536.25682-1-rgbi3307@naver.com/)** + +> If there are not enough nodes, mas_node_count() set an error state via mas_set_err() +> and return control flow to the beginning. +> In the return flow, mas_nomem() checks the error status, allocates new nodes, +> and resumes execution again. + +**[v4: slab: Introduce dedicated bucket allocator](http://lore.kernel.org/linux-mm/20240531191304.it.853-kees@kernel.org/)** + +**[v1: mm: read page_type using READ_ONCE](http://lore.kernel.org/linux-mm/20240531125616.2850153-1-david@redhat.com/)** + +> Let's use READ_ONCE to avoid load tearing (shouldn't make a difference) +> and to make KCSAN happy. +> Likely, we might also want to use WRITE_ONCE for the writer side of +> page_type, if KCSAN ever complains about that. But we'll not mess with +> that for now. + +**[v1: mm: sparse: Consistently use _nr](http://lore.kernel.org/linux-mm/20240531124144.240399-1-dev.jain@arm.com/)** + +> Consistenly name the return variable with an _nr suffix, whenever calling +> pfn_to_section_nr(), to avoid confusion with a (struct mem_section *). + +**[v1: mm: Reduce the number of slab->folio casts](http://lore.kernel.org/linux-mm/20240531122904.2790052-1-willy@infradead.org/)** + +> Mark a few more folio functions as taking a const folio pointer, which +> allows us to remove a few places in slab which cast away the const. + +**[v2: DAMON multiple contexts support](http://lore.kernel.org/linux-mm/20240531122320.909060-1-yorha.op@gmail.com/)** + +> This patch-set implements support for multiple contexts +> per kdamond. + +**[v11: LUF(Lazy Unmap Flush) reducing tlb numbers over 90%](http://lore.kernel.org/linux-mm/20240531092001.30428-1-byungchul@sk.com/)** + +> While I'm working with a tiered memory system e.g. CXL memory, I have +> been facing migration overhead esp. tlb shootdown on promotion or +> demotion between different tiers. + +**[v1: fs: sys_ringbuffer() (WIP)](http://lore.kernel.org/linux-mm/ytprj7mx37dna3n3kbiskgvris4nfvv63u3v7wogdrlzbikkmt@chgq5hw3ny3r/)** + +> Add new syscalls for generic ringbuffers that can be attached to +> arbitrary (supporting) file descriptors. + +**[v1: mm/memory-failure: Stop setting the folio error flag](http://lore.kernel.org/linux-mm/20240531032938.2712870-1-willy@infradead.org/)** + +> Nobody checks the error flag any more, so setting it accomplishes +> nothing. Remove the obsolete parts of this comment; it hasn't +> been true since errseq_t was used to track writeback errors in 2017. + +**[v3: vmstat: Kernel stack usage histogram](http://lore.kernel.org/linux-mm/20240530170259.852088-1-pasha.tatashin@soleen.com/)** + +> Provide a kernel stack usage histogram to aid in optimizing kernel stack +> sizes and minimizing memory waste in large-scale environments. The +> histogram divides stack usage into power-of-two buckets and reports the +> results in /proc/vmstat. This information is especially valuable in +> environments with millions of machines, where even small optimizations +> can have a significant impact. + +**[v1: mm: store zero pages to be swapped out in a bitmap](http://lore.kernel.org/linux-mm/20240530102126.357438-1-usamaarif642@gmail.com/)** + +> As shown in the patchseries that introduced the zswap same-filled +> optimization [1], 10-20% of the pages stored in zswap are same-filled. +> This is also observed across Meta's server fleet. +> By using VM counters in swap_writepage (not included in this +> patchseries) it was found that less than 1% of the same-filled +> pages to be swapped out are non-zero pages. + +**[v3: add mTHP support for anonymous shmem](http://lore.kernel.org/linux-mm/cover.1717033868.git.baolin.wang@linux.alibaba.com/)** + +> Anonymous pages have already been supported for multi-size (mTHP) allocation +> through commit 19eaf44954df, that can allow THP to be configured through the +> sysfs interface located at '/sys/kernel/mm/transparent_hugepage/hugepage-XXkb/enabled'. + +**[v1: mm: vmscan: reset sc->priority on retry](http://lore.kernel.org/linux-mm/20240529154911.3008025-1-shakeel.butt@linux.dev/)** + +> The commit 6be5e186fd65 ("mm: vmscan: restore incremental cgroup +> iteration") added a retry reclaim heuristic to iterate all the cgroups +> before returning an unsuccessful reclaim but missed to reset the +> sc->priority. Let's fix it. + +**[v6: enable bs > ps in XFS](http://lore.kernel.org/linux-mm/20240529134509.120826-1-kernel@pankajraghav.com/)** + +> This is the sixth version of the series that enables block size > page size +> (Large Block Size) in XFS targetted for inclusion in 6.11. + +**[v2: mm: page_type, zsmalloc and page_mapcount_reset()](http://lore.kernel.org/linux-mm/20240529111904.2069608-1-david@redhat.com/)** + +> Wanting to remove the remaining abuser of _mapcount/page_type along with +> page_mapcount_reset(), I stumbled over zsmalloc, which is yet to be +> converted away from "struct page". + +**[v5: large folios swap-in: handle refault cases first](http://lore.kernel.org/linux-mm/20240529082824.150954-1-21cnbao@gmail.com/)** + +> This patch is extracted from the large folio swapin series[1], primarily addressing +> the handling of scenarios involving large folios in the swap cache. + +**[v1: mm/hugetlb: Do not call vma_add_reservation upon ENOMEM](http://lore.kernel.org/linux-mm/20240528205323.20439-1-osalvador@suse.de/)** + +> sysbot reported a splat [1] on __unmap_hugepage_range(). +> Check for that and do not call vma_add_reservation() if that is the case, +> otherwise region_abort() and region_del() will see that we do not have any +> file_regions. + + +**[v4: percpu_counter: add a cmpxchg-based _add_batch variant](http://lore.kernel.org/linux-mm/20240528204257.434817-1-mjguzik@gmail.com/)** + +> Interrupt disable/enable trips are quite expensive on x86-64 compared to +> a mere cmpxchg (note: no lock prefix!) and percpu counters are used +> quite often. + +**[v2: memcg: rearrange fields of mem_cgroup_per_node](http://lore.kernel.org/linux-mm/20240528164050.2625718-1-shakeel.butt@linux.dev/)** + +> Kernel test robot reported [1] performance regression for will-it-scale +> test suite's page_fault2 test case for the commit 70a64b7919cb ("memcg: +> dynamically allocate lruvec_stats"). After inspection it seems like the +> commit has unintentionally introduced false cache sharing. + +**[v7: Memory management patches needed by Rust Binder](http://lore.kernel.org/linux-mm/20240528-alice-mm-v7-0-78222c31b8f4@google.com/)** + +> This patchset contains some abstractions needed by the Rust +> implementation of the Binder driver for passing data between userspace, +> kernelspace, and directly into other processes. + +**[v3: mm: migrate: support poison recover from migrate folio](http://lore.kernel.org/linux-mm/20240528134513.2283548-1-wangkefeng.wang@huawei.com/)** + +> This series of patches provide the recovery mechanism from folio copy for +> the widely used folio migration. + +#### 文件系统 + +**[v1: readdir: Add missing quote in macro comment](http://lore.kernel.org/linux-fsdevel/20240602004729.229634-2-thorsten.blum@toblux.com/)** + +> Add a missing double quote in the unsafe_copy_dirent_name() macro +> comment. + +**[v1: ext4: simplify the counting and management of delalloc reserved blocks](http://lore.kernel.org/linux-fsdevel/20240601034149.2169771-1-yi.zhang@huaweicloud.com/)** + + +> This patch series is the part 3 prepartory changes of the buffered IO +> iomap conversion, it simplify the counting and updating logic of delalloc +> reserved blocks. +> This series has passed through kvm-xfstests in auto mode many times, +> please take a look at it. + +**[v1: fs: don't block i_writecount during exec](http://lore.kernel.org/linux-fsdevel/20240531-vfs-i_writecount-v1-1-a17bea7ee36b@kernel.org/)** + +> Back in 2021 we already discussed removing deny_write_access() for +> executables. Back then I was hesistant because I thought that this might +> cause issues in userspace. +> It's not +> completely out of the realm of possibility but let's find out if that's +> actually the case and not guess. +> + +**[v1: struct fd situation](http://lore.kernel.org/linux-fsdevel/20240531031802.GA1629371@ZenIV/)** + +> I've done another round of review of users. + +**[v1: kernel/sysctl-test: add MODULE_DESCRIPTION()](http://lore.kernel.org/linux-fsdevel/20240529-md-kernel-sysctl-test-v1-1-32597f712634@quicinc.com/)** + +> Fix the 'make W=1' warning: +> WARNING: modpost: missing MODULE_DESCRIPTION() in kernel/sysctl-test.o + +**[v1: Start moving write_begin/write_end out of aops](http://lore.kernel.org/linux-fsdevel/20240528164829.2105447-1-willy@infradead.org/)** + +> Christoph wants to remove write_begin/write_end from aops and pass them +> to filemap as callback functions. Here's one possible route to do this. +> I combined it with the folio conversion (because why touch the same code +> twice?) and tweaked some of the other things (support for ridiculously +> large folios with size_t lengths, remove the need to initialise fsdata +> by passing only a pointer to the fsdata pointer). + +**[v1: fs/netfs/fscache_cookie: add missing "n_accesses" check](http://lore.kernel.org/linux-fsdevel/20240528144445.3268304-1-max.kellermann@ionos.com/)** + +> This fixes a NULL pointer dereference bug due to a data race which +> looks like this: + +**[v1: KTEST: add test to exercise the new mount API for bcachefs](http://lore.kernel.org/linux-fsdevel/20240528043612.812644-5-tahbertschinger@gmail.com/)** + +**[v1: v5.1: fs: Allow fine-grained control of folio sizes](http://lore.kernel.org/linux-fsdevel/20240527210125.1905586-1-willy@infradead.org/)** + +> We need filesystems to be able to communicate acceptable folio sizes +> to the pagecache for a variety of uses (e.g. large block sizes). +> Support a range of folio sizes between order-0 and order-31 + +**[v1: netfs: Fault in smaller chunks for non-large folio mappings](http://lore.kernel.org/linux-fsdevel/20240527201735.1898381-1-willy@infradead.org/)** + +> As in commit 4e527d5841e2 ("iomap: fault in smaller chunks for non-large +> folio mappings"), we can see a performance loss for filesystems +> which have not yet been converted to large folios. + +**[v1: fs: autofs: add MODULE_DESCRIPTION()](http://lore.kernel.org/linux-fsdevel/20240527-md-fs-autofs-v1-1-e06db1951bd1@quicinc.com/)** + +> Fix the 'make W=1' warning: +> WARNING: modpost: missing MODULE_DESCRIPTION() in fs/autofs/autofs4.o + +**[v1: enhance the path resolution capability in fs_parser](http://lore.kernel.org/linux-fsdevel/20240527014717.690140-1-lihongbo22@huawei.com/)** + + +> The following is a brief overview of the patches, see the patches for +> more details. + +**[v1: isofs: add missing MODULE_DESCRIPTION()](http://lore.kernel.org/linux-fsdevel/20240526-md-fs-isofs-v1-1-60e2e36a3d46@quicinc.com/)** + +> Fix the 'make W=1' warning: +> WARNING: modpost: missing MODULE_DESCRIPTION() in fs/isofs/isofs.o + +#### 网络设备 + +**[v5: ext4: check hash version and filesystem casefolded consistent](http://lore.kernel.org/netdev/20240601113749.473058-1-lizhi.xu@windriver.com/)** + +> When mounting the ext4 filesystem, if the hash version and casefolded are not +> consistent, exit the mounting. + +**[v2: PCIe TPH and cache direct injection support](http://lore.kernel.org/netdev/20240531213841.3246055-1-wei.huang2@amd.com/)** + + +> This series introduces generic TPH support in Linux, allowing STs to be +> retrieved from ACPI _DSM (as defined by ACPI) and used by PCIe endpoint +> drivers as needed. + +**[v2: net-next: vmxnet3: upgrade to version 9](http://lore.kernel.org/netdev/20240531193050.4132-1-ronak.doshi@broadcom.com/)** + +> This patch series extends vmxnet3 driver +> to leverage these new feature. + +**[v3: net-next: net: mana: Allow variable size indirection table](http://lore.kernel.org/netdev/1717169861-15825-1-git-send-email-shradhagupta@linux.microsoft.com/)** + +> Allow variable size indirection table allocation in MANA instead +> of using a constant value MANA_INDIRECT_TABLE_SIZE. +> The size is now derived from the MANA_QUERY_VPORT_CONFIG and the +> indirection table is allocated dynamically. + +**[v4: Add Microchip KSZ 9897 Switch CPU PHY + Errata](http://lore.kernel.org/netdev/20240531142430.678198-1-enguerrand.de-ribaucourt@savoirfairelinux.com/)** + +> Back in 2022, I had posted a series of patches to support the KSZ9897 +> switch's CPU PHY ports but some discussions had not been concluded with +> Microchip. I've been maintaining the patches since and I'm now +> resubmitting them with some improvements to handle new KSZ9897 errata +> sheets (also concerning the whole KSZ9477 family). + +**[v2: net-next: net: smc91x: Refactor SMC_* macros](http://lore.kernel.org/netdev/20240531120103.565490-2-thorsten.blum@toblux.com/)** + +> Use the macro parameter lp directly instead of relying on ioaddr being +> defined in the surrounding scope. + +**[v2: vmxnet3: disable rx data ring on dma allocation failure](http://lore.kernel.org/netdev/20240531103711.101961-1-mstocker@barracuda.com/)** + + +> To fix this bug, rq->data_ring.desc_size needs to be set to 0 to tell +> the hypervisor to disable this feature. + +**[v1: net-next: Introduce EN7581 ethernet support](http://lore.kernel.org/netdev/cover.1717150593.git.lorenzo@kernel.org/)** + +> Add airoha_eth driver in order to introduce ethernet support for +> Airoha EN7581 SoC available on EN7581 development board. + +**[v3: iwl-net: ice: implement AQ download pkg retry](http://lore.kernel.org/netdev/20240531093206.714632-1-wojciech.drewek@intel.com/)** + +> ice_aqc_opc_download_pkg (0x0C40) AQ sporadically returns error due +> to FW issue. Fix this by retrying five times before moving to +> Safe Mode. + +**[v4: net: tcp/mptcp: count CLOSE-WAIT for CurrEstab](http://lore.kernel.org/netdev/20240531091753.75930-1-kerneljasonxing@gmail.com/)** + +> Taking CLOSE-WAIT sockets into CurrEstab counters is in accordance with RFC + +**[v2: ext4: add casefolded feature check before setup encrypted info](http://lore.kernel.org/netdev/20240531030740.1024475-1-lizhi.xu@windriver.com/)** + +> Due to the current file system not supporting the casefolded feature, only +> i_crypt_info was initialized when creating encrypted information, without actually +> setting the sighash. Therefore, when creating an inode, if the system does not +> support the casefolded feature, encrypted information will not be created. + +**[v1: net-next: tcp: refactor skb_cmp_decrypted() checks](http://lore.kernel.org/netdev/20240530233616.85897-1-kuba@kernel.org/)** + +> Refactor the input patch coalescing checks and wrap "EOR forcing" +> logic into a helper. This will hopefully make the code easier to +> follow. While at it throw some DEBUG_NET checks into skb_shift(). + +**[v2: net-next: net: visibility of memory limits in netns](http://lore.kernel.org/netdev/20240530232722.45255-1-technoboy85@gmail.com/)** + +> Some programs need to know the size of the network buffers to operate +> correctly, export the following sysctls read-only in network namespaces. + +**[v1: net-next: ionic: advertise 52-bit addressing limitation for MSI-X](http://lore.kernel.org/netdev/20240530214026.774256-1-drc@linux.ibm.com/)** + +> Current ionic devices only support 52 internal physical address +> lines. This is sufficient for x86_64 systems which have similar +> limitations but does not apply to all other architectures, +> notably IBM POWER (ppc64). + +**[v2: net-next: bnxt_en: add timestamping statistics support](http://lore.kernel.org/netdev/20240530204751.99636-1-vadfed@meta.com/)** + +> The ethtool_ts_stats structure was introduced earlier this year. Now +> it's time to support this group of counters in more drivers. +> This patch adds support to bnxt driver. + +**[v10: net-next: Device Memory TCP](http://lore.kernel.org/netdev/20240530201616.1316526-1-almasrymina@google.com/)** + +**[v4: net-next: net: allow dissecting/matching tunnel control flags](http://lore.kernel.org/netdev/cover.1717088241.git.dcaratti@redhat.com/)** + +> Ilya says: "for correct matching on decapsulated packets, we should match +> on not only tunnel id and headers, but also on tunnel configuration flags +> like TUNNEL_NO_CSUM and TUNNEL_DONT_FRAGMENT. + +**[v1: net-next: af_unix: Don't check last_len in unix_stream_data_wait().](http://lore.kernel.org/netdev/20240530164256.40223-1-kuniyu@amazon.com/)** + +> When commit 869e7c62486e ("net: af_unix: implement stream sendpage +> support") added sendpage() support, data could be appended to the last +> skb in the receiver's queue. + +**[v2: net-next: tcp: add sysctl_tcp_rto_min_us](http://lore.kernel.org/netdev/20240530153436.2202800-1-yyd@google.com/)** + +> Adding a sysctl knob to allow user to specify a default +> rto_min at socket init time. + +**[GIT PULL: Networking for v6.10-rc2](http://lore.kernel.org/netdev/20240530132944.37714-1-pabeni@redhat.com/)** + + +**[[net-next PATCH] octeontx2: Improve mailbox tracepoints for debugging](http://lore.kernel.org/netdev/1717070038-18381-1-git-send-email-sbhatta@marvell.com/)** + +> The tracepoints present currently wrt mailbox do not +> provide enough information to debug mailbox activity. + + +#### 安全增强 + +**[v4: Hardening perf subsystem](http://lore.kernel.org/linux-hardening/AS8PR02MB7237F5BFDAA793E15692B3998BFD2@AS8PR02MB7237.eurprd02.prod.outlook.com/)** + +> This is an effort to get rid of all multiplications from allocation +> functions in order to prevent integer overflows . + +**[v1: ubsan: add missing MODULE_DESCRIPTION() macro](http://lore.kernel.org/linux-hardening/20240531-md-lib-test_ubsan-v1-1-c2a80d258842@quicinc.com/)** + +> Add the missing invocation of the MODULE_DESCRIPTION() macro. + +**[v4: Introduce STM32 DMA3 support](http://lore.kernel.org/linux-hardening/20240531150712.2503554-1-amelie.delaunay@foss.st.com/)** + +> In STM32MP25 SoC [1], 3 HPDMAs and 1 LPDMA are embedded. Only HPDMAs are +> used by Linux. + +**[v1: x86/boot: add prototype for __fortify_panic()](http://lore.kernel.org/linux-hardening/20240529-fortify_panic-v1-1-9923d5c77657@quicinc.com/)** + +> As discussed in [1] add a prototype for __fortify_panic() to fix the +> 'make W=1 C=1' warning: + + +**[v1: x86/hpet: Read HPET directly if panic in progress](http://lore.kernel.org/linux-hardening/20240528063836.5248-1-TonyWWang-oc@zhaoxin.com/)** + +> To avoid this dead loops, read HPET directly if panic in progress. + +**[v2: dma-buf/fence-array: Add flex array to struct dma_fence_array](http://lore.kernel.org/linux-hardening/8b4e556e07b5dd78bb8a39b67ea0a43b199083c8.1716652811.git.christophe.jaillet@wanadoo.fr/)** + +> This is an effort to get rid of all multiplications from allocation +> functions in order to prevent integer overflows . + +#### 异步 IO + +**[v3: liburing: test: add test cases for hugepage registered buffers](http://lore.kernel.org/io-uring/20240531052023.1446914-1-cliang01.li@samsung.com/)** + +> Add a test file for hugepage registered buffers, to make sure the +> fixed buffer coalescing feature works safe and soundly. + +**[v1: io_uring/net: assign kmsg inq/flags before buffer selection](http://lore.kernel.org/io-uring/c52d9b19-7fd7-4fb1-b396-632b9f0f612d@kernel.dk/)** + +> syzbot reports that recv is using an uninitialized value: + +#### Rust For Linux + +**[v3: Rust block device driver API and null block driver](http://lore.kernel.org/rust-for-linux/20240601081806.531954-1-nmi@metaspace.dk/)** + +> Rebased on v6.10-rc1 and implemented a ton of improvements suggested by Benno. v2 is here [2] + +**[v2: net::phy support for C45](http://lore.kernel.org/rust-for-linux/20240601043535.53545-1-fujita.tomonori@gmail.com/)** + +> Adds support for reading/writing C45 registers and genphy helper +> functions executed via C45 registers. + +**[v1: Makefile: rust-analyzer target: better error handling and comments](http://lore.kernel.org/rust-for-linux/20240601004856.206682-1-jhubbard@nvidia.com/)** + +> This is confusing at first, because there is, in fact, a rust-analyzer +> build target. It's just not set up to handle errors gracefully. + +**[v1: kbuild: rust: provide an option to inline C helpers into Rust](http://lore.kernel.org/rust-for-linux/20240529202817.3641974-1-gary@garyguo.net/)** + +> This RFC presents an option \`RUST_LTO_HELPERS\` to inline C helpers into +> Rust. This is similar to LTO, but we perform the extra inlining and +> optimisation per Rust crate (compilation unit) instead of at final +> linking time, thus has better compilation speed. It also means that this +> presented approach work for loadable modules as well. + +**[v1: rust: net::phy support to C45 registers access](http://lore.kernel.org/rust-for-linux/20240527.104650.353359058235482782.fujita.tomonori@gmail.com/)** + +> Adds support for C45 registers access. C45 registers can be accessed +> in two ways: either C45 bus protocol or C45 over C22. Normally, a PHY +> driver shouldn't care how to access. PHYLIB chooses the appropriate +> one. But there is an exception; PHY hardware supporting only C45 bus +> protocol. + +#### BPF + +**[v1: bpf-next: libbpf: implement BTF field iterator](http://lore.kernel.org/bpf/20240601014505.3443241-1-andrii@kernel.org/)** + +> Switch from callback-based iteration over BTF type ID and string offset +> fields to an iterator-based approach. +> Switch all existing internal use cases to this new iterator. + +**[v1: bpf: Make session kfuncs global](http://lore.kernel.org/bpf/20240531101550.2768801-1-jolsa@kernel.org/)** + +> The bpf_session_cookie is unavailable for !CONFIG_FPROBE as reported +> by Sebastian . +> Instead of adding more ifdefs, making the session kfuncs globally +> available as suggested by Alexei. It's still allowed only for +> session programs, but it won't fail the build. + +**[v2: net-next: virtnet_net: prepare for af-xdp](http://lore.kernel.org/bpf/20240530112406.94452-1-xuanzhuo@linux.alibaba.com/)** + +> This patch set prepares for supporting af-xdp zerocopy. +> There is no feature change in this patch set. +> I just want to reduce the patch num of the final patch set, +> so I split the patch set. + +**[v1: bpf-next: use network helpers, part 6](http://lore.kernel.org/bpf/cover.1717054461.git.tanggeliang@kylinos.cn/)** + +> For moving dctcp test dedicated code out of do_test() into test_dctcp(). +> This patchset adds a new helper start_test() in bpf_tcp_ca.c to refactor +> do_test(). +> Address Martin's comments for the previous series. + +**[v7: bpf-next: Notify user space when a struct_ops object is detached/unregistered](http://lore.kernel.org/bpf/20240530065946.979330-1-thinker.li@gmail.com/)** + +> This patch set enables the detach feature for struct_ops links and +> send an event to epoll when a link is detached. Subsystems could call +> link->ops->detach() to detach a link and notify user space programs +> through epoll. + +**[v1: net: tap: validate metadata and length for XDP buff before building up skb](http://lore.kernel.org/bpf/1717026141-25716-1-git-send-email-si-wei.liu@oracle.com/)** + +> The cited commit missed to check against the validity of the length +> and various pointers on the XDP buff metadata in the tap_get_user_xdp() +> path, which could cause a corrupted skb to be sent downstack. For +> instance, tap_get_user() prohibits short frame which has the length +> less than Ethernet header size from being transmitted, while the +> skb_set_network_header() in tap_get_user_xdp() would set skb's +> network_header regardless of the actual XDP buff data size. This +> could either cause out-of-bound access beyond the actual length, or +> confuse the underlayer with incorrect or inconsistent header length +> in the skb metadata. + + +**[v1: bpf: libbpf: don't close(-1) in multi-uprobe feature detector](http://lore.kernel.org/bpf/20240529231212.768828-1-andrii@kernel.org/)** + +> Guard close(link_fd) with extra link_fd >= 0 check to prevent close(-1). +> Detected by Coverity static analysis. + +**[v1: bpf-next: libbpf: keep FD_CLOEXEC flag when dup()'ing FD](http://lore.kernel.org/bpf/20240529223239.504241-1-andrii@kernel.org/)** + +> Make sure to preserve and/or enforce FD_CLOEXEC flag on duped FDs. +> Use dup3() with O_CLOEXEC flag for that. + +**[v2: net-next: net: validate SO_TXTIME clockid coming from userspace](http://lore.kernel.org/bpf/20240529183130.1717083-1-quic_abchauha@quicinc.com/)** + +> Add validation in setsockopt to support only CLOCK_REALTIME, +> CLOCK_MONOTONIC and CLOCK_TAI to be set from userspace. + +**[v1: bpftool: Query only cgroup-related attach types](http://lore.kernel.org/bpf/20240529131028.41200-1-tadakentaso@gmail.com/)** + +**[v4: bpf-next: netfilter: Add the capability to offload flowtable in XDP layer](http://lore.kernel.org/bpf/cover.1716987534.git.lorenzo@kernel.org/)** + +> This series has been tested running the xdp_flowtable_offload eBPF program +> on an ixgbe 10Gbps NIC (eno2) in order to XDP_REDIRECT the TCP traffic to +> a veth pair (veth0-veth1) based on the content of the nf_flowtable as soon +> as the TCP connection is in the established state. + +**[v2: bpf: Allocate bpf_event_entry with node info](http://lore.kernel.org/bpf/20240529065311.1218230-1-namhyung@kernel.org/)** + +> It was reported that accessing perf_event map entry caused pretty high +> LLC misses in get_map_perf_counter(). As reading perf_event is allowed +> for the local CPU only, I think we can use the target CPU of the event +> as hint for the allocation like in perf_event_alloc() so that the event +> and the entry can be in the same node at least. + +**[v1: net: validate SO_TXTIME clockid coming from userspace](http://lore.kernel.org/bpf/20240528224935.1020828-1-quic_abchauha@quicinc.com/)** + +> Add validation in setsockopt to support only CLOCK_REALTIME, +> CLOCK_MONOTONIC and CLOCK_TAI to be set from userspace. + +**[v5: bpf-next: bpf: support resilient split BTF](http://lore.kernel.org/bpf/20240528122408.3154936-1-alan.maguire@oracle.com/)** + +> The series first focuses on generating split BTF with distilled base +> BTF; then relocation support is added to allow split BTF with +> an associated distlled base to be relocated with a new base BTF. + +### 周边技术动态 + +#### Qemu + +**[v2: Improve the performance of RISC-V vector unit-stride/whole register ld/st instructions](http://lore.kernel.org/qemu-devel/20240531174504.281461-1-max.chou@sifive.com/)** + + +> In this new version, we added patches that try to load/store more data +> at a time in part of vector continuous load/store (unit-stride/whole +> register) instructions with some assumptions (e.g. no masking, no tail +> agnostic, perform virtual address resolution once for the entire vector, +> etc.) as suggested by Richard Henderson. + + +**[v1: hw/riscv/virt.c: add address-cells in create_fdt_one_aplic()](http://lore.kernel.org/qemu-devel/20240530084949.761034-1-dbarboza@ventanamicro.com/)** + +> We need #address-cells properties in all interrupt controllers that are +> referred by an interrupt-map [1]. For the RISC-V machine, both PLIC and +> APLIC controllers must have this property. + + +**[v1: target/riscv: Add support for Control Transfer Records Ext.](http://lore.kernel.org/qemu-devel/20240529160950.132754-1-rkanwal@rivosinc.com/)** + +> This series enables Control Transfer Records extension support on riscv +> platform. This extension is similar to Arch LBR in x86 and BRBE in ARM. + +**[v2: target/riscv: zvbb implies zvkb](http://lore.kernel.org/qemu-devel/20240528130349.20193-1-jerry.zhangjian@sifive.com/)** + +> - According to RISC-V crypto spec, Zvkb extension is a proper subset of the Zvbb extension. + +**[v2: RESEND: target/riscv/kvm: QEMU support for KVM Guest Debug on RISC-V](http://lore.kernel.org/qemu-devel/20240528080759.26439-1-duchao@eswincomputing.com/)** + +> This series implements QEMU KVM Guest Debug on RISC-V, with which we +> could debug RISC-V KVM guest from the host side, using software +> breakpoints. + + +**[v2: target/riscv/kvm: QEMU support for KVM Guest Debug on RISC-V](http://lore.kernel.org/qemu-devel/20240528072048.25529-1-duchao@eswincomputing.com/)** + +> This series implements QEMU KVM Guest Debug on RISC-V, with which we +> could debug RISC-V KVM guest from the host side, using software +> breakpoints. + +**[v1: targer/riscv: Implement Zabha extension](http://lore.kernel.org/qemu-devel/20240528054535.290953-1-alexghiti@rivosinc.com/)** + +> Add Zabha implementation. + +**[v1: riscv-to-apply queue](http://lore.kernel.org/qemu-devel/20240528024328.246965-1-alistair.francis@wdc.com/)** + + +**[v7: target/riscv/kvm/kvm-cpu.c: kvm_riscv_handle_sbi() fail with vendor-specific SBI](http://lore.kernel.org/qemu-devel/20240527134811.342027-1-alexei.filippov@syntacore.com/)** + + +> Add new error path to provide proper error in case of +> qemu_chr_fe_read_all() may not return sizeof(ch), because exactly zero +> just means we failed to read input, which can happen, so +> telling the SBI caller we failed to read, but telling the caller of this +> function that we successfully emulated the SBI call, is correct. However, +> anything else, other than sizeof(ch), means something unexpected happened, +> so we should return an error. +> Added SBI related return code's defines. + +#### U-Boot + +**[v1: doc: Add UEFI supplement document](http://lore.kernel.org/u-boot/20240529-efi-spec-v1-1-0668e4d19683@flygoat.com/)** + +> Add UEFI supplement document to define some behaviours +> on architectures not covered by the original specification. + + ## 20240526:第 93 期 ### 内核动态