OSDI 23 BWoS: Better Multithreaded Work-Stealing Through Blocks

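An interesting paper that builds a new multi-producer, multi-consumer queue. My guess is that it came from their colleagues who work on quantitative trading, hahaha. PowerPoint Presentation OSDI 2023 Paper Review Day3-Session11: Verify Your Bits - Zhihu stdexec/include/exec/detail/bwos_lifo_queue.hpp at main · NVIDIA/stdexec BW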

SC Paper Summary 5

Main link: https://haibin-blog.vercel.app/index.php/sc-2024-passage/ Matrix Computations on Tensor Cores AmgT: Algebraic Multigrid Solver on Tensor Cores (Best Paper Finalist) Super Scientific Software Labor

SC 24 Brief Summary 4

Main link: https://haibin-blog.vercel.app/index.php/sc-2024-passage/ Parallel Program Analysis and Code Optimization MCFuser: High-performance and Rapid-fusion of Memory-bound Compute-intensive Operators Aut

SC Paper Reading 3

Main link: https://haibin-blog.vercel.app/index.php/sc-2024-passage/ Paper Computational Efficiency and Learning Techniques Murali Emani B311 Accelerators, Applications and Application Frameworks, Artificial Int

SC 24 Brief Summary 1

SC 24 Passage My summary and understanding of the papers presented at the SC24 conference. Main link: https://haibin-blog.vercel.app/index.php/sc-2024-passage/ Jensen Huang NVIDIA speech Topic: the history of NVIDIA GPUs, current

APAC 2024 Optimization Summary: HPC

HPC Groups: ZuDong Li (leader), Haibin Lai, Benxiang Xiao, Zixu Wang, Wenhan Tan, Wenbo An. AI Groups: Yukun Yang, Honglie Li, Junyu Su. Abstract: In this report, we detail the optimization efforts conducted on