Official PyTorch implementation of "GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance" (ICML 2025)
[CAAI AIR'24] Minimize Quantization Output Error with Bias Compensation
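The bias-compensation idea named in the entry above can be sketched as follows. This is an illustrative toy, not the paper's exact algorithm: after quantizing a weight matrix W to W_q, the expected layer-output error (W - W_q) E[x] is folded into the layer bias, so the quantized layer matches the full-precision layer in expectation. The uniform rounding quantizer and the calibration mean are assumptions for the example.

```python
# Hedged sketch of bias compensation for quantization output error
# (illustrative only; not the CAAI AIR'24 paper's exact method).

def quantize(w, step=0.5):
    """Naive uniform rounding quantizer (assumed for illustration)."""
    return round(w / step) * step

def matvec(W, x):
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in W]

# Toy linear layer: y = W x + b
W = [[0.27, -0.61], [0.83, 0.14]]
b = [0.1, -0.2]
W_q = [[quantize(w) for w in row] for row in W]

# Mean input E[x], assumed estimated from calibration data.
mean_x = [1.0, 2.0]

# Compensated bias: b' = b + (W - W_q) @ E[x]
err = [[w - wq for w, wq in zip(rw, rq)] for rw, rq in zip(W, W_q)]
b_comp = [bi + ei for bi, ei in zip(b, matvec(err, mean_x))]

# On the mean input, the compensated quantized layer reproduces the
# full-precision output exactly; on nearby inputs, the mean error is gone.
y_fp = [yi + bi for yi, bi in zip(matvec(W, mean_x), b)]
y_q = [yi + bi for yi, bi in zip(matvec(W_q, mean_x), b_comp)]
print(y_fp, y_q)
```

The same correction is cheap at deployment time because it only rewrites the bias vector; the quantized weights are untouched.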
A high-performance, memory-efficient healthcare framework that deploys fine-tuned Large Language Models (LLMs) on edge devices. A multi-agent system provides personalized diagnostic reasoning, health education, and dietary planning.
On-device AI system for processing and analyzing civil complaints | LLM compression & fine-tuning | Field-mirroring industry-linked project - building hands-on practical skills based on industry demand
Let me make GGUF files quickly
Local & lightweight LLM inference runtime in C++ with support for GGUF & quantization
Ternary Quantization for LLMs: implements balanced ternary (T3_K) weights for 2.63-bit quantization, the first working solution for modern large language models.
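Balanced ternary quantization, as in the T3_K entry above, maps each weight to {-1, 0, +1} times a shared scale; the information content is log2(3) ≈ 1.58 bits per weight, so the quoted 2.63-bit figure presumably includes scale and packing overhead. A minimal sketch, where the zero-threshold rule (0.7 × mean |w|) follows common ternary-quantization practice and is not necessarily the repo's exact scheme:

```python
# Minimal sketch of balanced ternary weight quantization.
# The threshold rule (delta = 0.7 * mean|w|) is a common heuristic,
# assumed here; not necessarily the T3_K repo's exact scheme.

def ternarize(weights):
    """Map a group of weights to {-1, 0, +1} plus one scale factor."""
    n = len(weights)
    delta = 0.7 * sum(abs(w) for w in weights) / n  # zero threshold
    ternary = [0 if abs(w) < delta else (1 if w > 0 else -1)
               for w in weights]
    # Scale = mean magnitude of the weights that stay non-zero.
    nz = [abs(w) for w, t in zip(weights, ternary) if t != 0]
    scale = sum(nz) / len(nz) if nz else 0.0
    return ternary, scale

w = [0.9, -0.05, 0.4, -0.8, 0.02, -0.3]
t, s = ternarize(w)
dequant = [s * ti for ti in t]  # reconstructed weights
print(t, s)
```

In a real kernel the ternary digits would be bit-packed (e.g. several trits per byte) and the scales stored per block, which is where the extra fraction of a bit per weight comes from.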
Enable expert-level, multi-step diagnostic reasoning in Claude Code with an easy-to-use skill for clear and explainable AI diagnosis.
Implementation of advanced Natural Language Processing architectures and optimization techniques, built from scratch. The projects focus on understanding the internal mechanics of Transformers, LLM efficiency through quantization, and scaling via Mixture-of-Experts (MoE).
Implemented and fine-tuned BERT for a custom sequence classification task, leveraging LoRA adapters for efficient parameter updates and 4-bit quantization to optimize performance and resource utilization.
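The LoRA technique referenced in the entry above replaces a full weight update with a low-rank one, y = W x + (alpha/r) * B A x, training only the small matrices A (r × k) and B (d × r) while W stays frozen (and can itself be 4-bit quantized, as in QLoRA-style setups). A dependency-free sketch with toy sizes; a real fine-tuning run would go through a library such as PEFT rather than this hand-rolled math:

```python
# Minimal sketch of a LoRA-style forward pass (pure Python, toy sizes).
# Only A and B would receive gradient updates; W is frozen.
# Shapes: W is d x k, B is d x r, A is r x k, with rank r << min(d, k).

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col))
             for col in zip(*Y)] for row in X]

def lora_forward(W, A, B, x, alpha=2.0, r=1):
    """y = W x + (alpha / r) * B (A x)."""
    delta = matmul(B, A)  # low-rank update B @ A
    Wx = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    Dx = [sum(d * xi for d, xi in zip(row, x)) for row in delta]
    return [wx + (alpha / r) * dx for wx, dx in zip(Wx, Dx)]

# Frozen base weight (d=2, k=3) and a rank-1 adapter.
W = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
B = [[1.0], [0.5]]     # d x r
A = [[0.0, 0.1, 0.0]]  # r x k
x = [1.0, 2.0, 3.0]
print(lora_forward(W, A, B, x))
```

The memory win comes from the parameter count: for d = k = 4096 and r = 8, the adapter holds 2 * 4096 * 8 weights versus 4096 * 4096 for a full update, roughly 0.4% of the original.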