大师兄 (daidai258)

[NeurIPS'24 Spotlight, ICLR'25, ICML'25] Speeds up long-context LLM inference by computing attention with approximate, dynamic sparse methods, reducing pre-filling latency by up to 10x on an A100 while maintaining accuracy.

1 0 0

[ICML2025, NeurIPS2025 Spotlight] Sparse VideoGen 1 & 2: Accelerating Video Diffusion Transformers with Sparse Attention

1 0 0

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

1 0 0

CUDA Templates and Python DSLs for High-Performance Linear Algebra

1 0 0


Contributions in the last year: 53

Longest contribution streak: 5 days

Current contribution streak: 1 day

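Contribution statistics include code commits, created issues / Pull Requests, and merged Pull Requests. Code commits are counted only if the locally configured git email is one that has been confirmed and bound to the Gitee account.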

