Pinned
-
xlite-dev/LeetCUDA
📚 LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners 🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA. 🎉
-
xlite-dev/lite.ai.toolkit
🛠 A lite C++ AI toolkit: 100+ models with MNN, ORT and TRT, including Det, Seg, Stable-Diffusion, Face-Fusion, etc. 🎉
-
PaddlePaddle/FastDeploy
High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle
-
sgl-project/sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
-
vipshop/cache-dit
🤗 A PyTorch-native and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for DiTs.
-
xlite-dev/ffpa-attn
🤖 FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑ 🎉 vs SDPA EA.





