chrispion

Popular repositories

fast_topk_batched (Public)

🚀 Accelerate CPU inference with Fast TopK for high-performance batched Top-K selection, optimized for efficient LLM sampling workloads.

Language: C++

chrispion.github.io (Public)

⚡ Accelerate batched Top-K selection for CPU inference, optimizing LLM sampling workloads with performance up to 80x faster than PyTorch.