# CallCenterAI - Intelligent Customer Ticket Classification

Production-Ready MLOps System • Dual NLP Architecture • Real-Time Intelligence
## Table of Contents

- [Features](#features)
- [Architecture Overview](#architecture-overview)
- [Quick Deployment](#quick-deployment)
- [API Documentation](#api-documentation)
- [Performance Dashboard](#performance-dashboard)
- [Development Guide](#development-guide)
- [Testing Suite](#testing-suite)
- [Monitoring & Observability](#monitoring--observability)
- [Model Comparison](#model-comparison)
- [AI Agent Intelligence](#ai-agent-intelligence)
- [Project Structure](#project-structure)
- [Docker Services](#docker-services)
- [CI/CD Pipeline](#cicd-pipeline)
- [FAQ & Troubleshooting](#faq--troubleshooting)
- [System Screenshots](#system-screenshots)
- [Contributing](#contributing)
- [License](#license)
## Features

### Core Capabilities

- **Dual-Model Intelligence**: Combines TF-IDF+SVM (94.8% accuracy) with a Transformer (92.3% accuracy)
- **Smart Routing Agent**: Real-time model selection based on text complexity, language, and confidence
- **PII Protection**: Automatic scrubbing of emails, phone numbers, CIN numbers, and names
- **Multilingual Support**: French, English, and Arabic with distilbert-base-multilingual-cased

### Production Ready

- **Full MLOps Pipeline**: MLflow tracking, DVC versioning, model registry
- **Containerized Microservices**: Docker + Docker Compose orchestration
- **Enterprise Monitoring**: Prometheus metrics + Grafana dashboards
- **CI/CD Integration**: GitHub Actions with security scanning

### Performance Highlights

- **10 ms Inference**: TF-IDF model for simple English queries
- **95%+ Accuracy**: On the IT Service Ticket Classification dataset
- **Horizontal Scalability**: Stateless microservices architecture
- **Real-Time Analytics**: Live metrics and performance tracking
## Architecture Overview

### System Architecture

![System Architecture](https://deepseek_mermaid_20251215_c6cd46.svg)

### Service Communication

| Component | Port | Protocol | Responsibility |
|---|---|---|---|
| AI Agent | 8000 | HTTP/JSON | Intelligent routing & PII scrubbing |
| TF-IDF Service | 8010 | HTTP/JSON | Fast classification (simple cases) |
| Transformer Service | 8020 | HTTP/JSON | Advanced multilingual classification |
| MLflow Server | 5000 | HTTP | Experiment tracking & model registry |
| Prometheus | 9090 | HTTP | Metrics collection & storage |
| Grafana | 3000 | HTTP | Visualization & dashboards |

## Quick Deployment

### One-Command Setup

```bash
git clone https://github.com/Elyes-2/CallCenterAI.git
cd CallCenterAI
docker-compose up -d --build
```

### Verification

```bash
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
```

Expected output:

```text
NAMES                        STATUS      PORTS
callcenterai-grafana-1       Up 6 days   0.0.0.0:3000->3000/tcp
callcenterai-agent-1         Up 6 days   0.0.0.0:8000->8000/tcp
callcenterai-prometheus-1    Up 6 days   0.0.0.0:9090->9090/tcp
callcenterai-tfidf-1         Up 6 days   0.0.0.0:8010->8010/tcp
callcenterai-transformer-1   Up 6 days   0.0.0.0:8020->8020/tcp
callcenterai-mlflow-1        Up 7 days   0.0.0.0:5000->5000/tcp
```

### Health Check
```bash
./scripts/health_check.sh

curl -s http://localhost:8000/health | jq
curl -s http://localhost:8010/health | jq
curl -s http://localhost:8020/health | jq
curl -s http://localhost:5000 | head -5
```
## API Documentation

### 1. Main Classification Endpoint

```http
POST /classify
Host: localhost:8000
Content-Type: application/json
```

**Request:**

```json
{
  "text": "My laptop screen is broken. Need replacement ASAP! Contact: john.doe@company.com or 123-456-7890."
}
```

**Response:**
```json
{
  "success": true,
  "category": "Hardware",
  "confidence": 0.9945622647141518,
  "model_used": "TF-IDF + SVM",
  "model_choice_reason": "Short English (9 words)",
  "pii_scrubbed": true,
  "pii_details": {
    "emails": ["john.doe@company.com"],
    "phones": ["1234567890"],
    "cins": [],
    "names": []
  },
  "detected_language": "en",
  "processing_time_ms": 65.2,
  "timestamp": "2025-12-15T18:13:36.123Z"
}
```
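Client code typically branches on the `success` and `confidence` fields of this response. A minimal sketch using the example values above (the `CONFIDENCE_FLOOR` threshold and `needs_review` helper are illustrative, not part of the project):

```python
# Values copied from the example /classify response above.
response = {
    "success": True,
    "category": "Hardware",
    "confidence": 0.9945622647141518,
    "model_used": "TF-IDF + SVM",
    "pii_scrubbed": True,
    "detected_language": "en",
}

# Client-side guard: route low-confidence results to human review.
CONFIDENCE_FLOOR = 0.80  # illustrative threshold, not from the project config

def needs_review(resp: dict) -> bool:
    """True when the ticket should be escalated instead of auto-routed."""
    return not resp["success"] or resp["confidence"] < CONFIDENCE_FLOOR

print(needs_review(response))  # False for the high-confidence example above
```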
### 2. Direct Model Access

**TF-IDF Service (:8010)**

```bash
curl -X POST http://localhost:8010/predict \
  -H "Content-Type: application/json" \
  -d '{"text": "Printer not working, need urgent help"}'
```

**Transformer Service (:8020)**

```bash
curl -X POST http://localhost:8020/predict \
  -H "Content-Type: application/json" \
  -d "{\"text\": \"J'ai des problèmes d'accès au système de facturation en ligne\"}"
```
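Shell quoting is fragile here because the French text contains apostrophes. Building the body with Python's `json` module sidesteps quoting entirely; a sketch, independent of the running services:

```python
import json

# Apostrophes in the French text break naive single-quoted shell strings;
# json.dumps always emits a valid, correctly escaped JSON body.
payload = json.dumps(
    {"text": "J'ai des problèmes d'accès au système de facturation en ligne"}
)
print(payload)

# Round-trip to confirm the payload parses back unchanged.
assert json.loads(payload)["text"].startswith("J'ai")
```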
### 3. Metrics & Monitoring

```bash
curl http://localhost:8000/metrics | grep "http_requests_total"
curl http://localhost:8000/health
curl http://localhost:8000/info
```

### 4. Batch Processing

```bash
curl -X POST http://localhost:8000/batch_classify \
  -H "Content-Type: application/json" \
  -d '{
    "tickets": [
      {"id": "1", "text": "Monitor not working"},
      {"id": "2", "text": "Need password reset"},
      {"id": "3", "text": "Problème avec le serveur"}
    ]
  }'
```
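A small helper can assemble the batch body programmatically (hypothetical helper name; the field names mirror the example request above):

```python
import json

def build_batch_payload(texts):
    """Build the /batch_classify request body from a list of ticket texts.

    Illustrative helper: ids are assigned sequentially, matching the
    example request shown above.
    """
    tickets = [{"id": str(i + 1), "text": t} for i, t in enumerate(texts)]
    return json.dumps({"tickets": tickets})

body = build_batch_payload(["Monitor not working", "Need password reset"])
parsed = json.loads(body)
print(parsed["tickets"])
```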
## Performance Dashboard

### Live Metrics Access

| Dashboard | URL | Credentials | Purpose |
|---|---|---|---|
| Grafana | http://localhost:3000 | admin/admin | Real-time monitoring |
| Prometheus | http://localhost:9090 | None | Metrics querying |
| MLflow UI | http://localhost:5000 | None | Experiment tracking |

### Key Dashboards

- **System Overview** - Request rates, error percentages, latency
- **Model Performance** - Accuracy, F1-score, confusion matrix
- **Language Analytics** - Distribution by language and category
- **Inference Metrics** - P95/P99 latency, throughput per service
- **PII Detection** - Statistics on detected sensitive information
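For reference, the P95/P99 figures shown in Grafana are percentile cut points over raw latency samples. A toy computation with Python's `statistics` module (the sample values are made up, not real service data):

```python
from statistics import quantiles

# Hypothetical per-request latencies in milliseconds.
latencies_ms = [10, 12, 11, 13, 45, 50, 14, 12, 11, 55,
                13, 12, 48, 11, 12, 13, 10, 52, 11, 12]

# quantiles(..., n=100) yields the 1st..99th percentile cut points;
# "inclusive" treats the samples as the whole population.
cuts = quantiles(latencies_ms, n=100, method="inclusive")
p95, p99 = cuts[94], cuts[98]
print(f"p95={p95:.2f} ms, p99={p99:.2f} ms")
```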
## Development Guide

### Prerequisites

- Docker 20.10+ & Docker Compose 2.0+
- Python 3.11+ (for local development)
- Git
- VS Code (recommended) or any IDE
- NVIDIA Docker (for GPU acceleration)
- kubectl (for Kubernetes deployment)
- Helm (for chart management)

### Local Development Setup

```bash
git clone https://github.com/Elyes-2/CallCenterAI.git
cd CallCenterAI

python -m venv .venv
source .venv/bin/activate  # Linux/Mac

pip install -r requirements.txt
pip install -r requirements-dev.txt

dvc pull  # Download dataset

uvicorn src.agent.api:app --reload --port 8000
uvicorn src.tfidf_service.api:app --reload --port 8010
uvicorn src.transformer_service.api:app --reload --port 8020
```

### Training Models
```bash
python src/training/train_tfidf.py \
  --data-path data/processed/train.csv \
  --model-save-path models/tfidf_svm.pkl \
  --mlflow-tracking-uri http://localhost:5000

python src/training/train_transformer.py \
  --model-name distilbert-base-multilingual-cased \
  --epochs 3 \
  --batch-size 16 \
  --learning-rate 2e-5
```
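`train_tfidf.py` trains a TF-IDF + linear-SVM pipeline. A minimal self-contained sketch of that pairing on toy data (not the project's actual training code or dataset):

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Toy tickets standing in for data/processed/train.csv.
texts = [
    "laptop screen broken need replacement",
    "printer not working urgent help",
    "cannot login password reset needed",
    "account locked after failed login",
]
labels = ["Hardware", "Hardware", "Access", "Access"]

# Vectorizer + linear SVM, the same pairing the project trains.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("svm", LinearSVC()),
])
pipeline.fit(texts, labels)

prediction = pipeline.predict(["monitor display is broken"])[0]
print(prediction)
```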
```bash
python train_and_log.py
```

## Testing Suite

### Test Coverage
```bash
pytest tests/ -v --cov=src --cov-report=html
```

```text
=================================================================
23 tests passed
92% code coverage
All integration tests passing
Security tests cleared
=================================================================
```

### Test Categories

| Test File | Tests | Purpose | Status |
|---|---|---|---|
| test_agent.py | 5 | AI agent logic & routing | Passed |
| test_tfidf.py | 4 | TF-IDF service functionality | Passed |
| test_transformer.py | 3 | Transformer service | Passed |
| test_integration.py | 2 | Service communication | Passed |
| test_training.py | 6 | Model training pipeline | Passed |
| test_data.py | 3 | Data validation | Passed |

### Running Tests
```bash
pytest tests/test_agent.py -v
pytest tests/test_integration.py -v

pytest -m "not slow"      # Skip slow tests
pytest -m "integration"   # Run only integration tests

pytest --cov=src --cov-report=term-missing
```

## Monitoring & Observability

### Prometheus Metrics

Each service exposes Prometheus metrics at `/metrics`:
- `http_requests_total` - Total request count
- `http_request_duration_seconds` - Request latency histogram
- `model_predictions_total` - Prediction count per model
- `prediction_confidence` - Confidence score distribution
- `pii_detection_total` - PII items detected by type
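Metrics like these are typically registered with the `prometheus_client` library; a self-contained sketch using a private registry (metric names from the list above, labels and values are illustrative):

```python
from prometheus_client import CollectorRegistry, Counter, Histogram, generate_latest

# A private registry keeps the example isolated from any global state.
registry = CollectorRegistry()

requests_total = Counter(
    "http_requests_total", "Total request count",
    ["method", "endpoint", "status"], registry=registry,
)
request_latency = Histogram(
    "http_request_duration_seconds", "Request latency histogram",
    ["endpoint"], registry=registry,
)

# Record one hypothetical /classify request.
requests_total.labels(method="POST", endpoint="/classify", status="200").inc()
request_latency.labels(endpoint="/classify").observe(0.065)

# This is the text a scrape of /metrics would return.
exposition = generate_latest(registry).decode()
print(exposition.splitlines()[0])
```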
### Alert Rules

```yaml
groups:
  - name: callcenterai_alerts
    rules:
      - alert: HighErrorRate
        expr: rate(http_requests_total{status="500"}[5m]) > 0.05
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "High error rate detected"

      - alert: ModelLatencyHigh
        expr: histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m])) > 1
        for: 5m
        labels:
          severity: warning
```

### Grafana Dashboards

Pre-configured dashboards include:
- **System Health** - Uptime, resource usage, service status
- **Business Metrics** - Tickets by category, language distribution
- **Performance** - Latency percentiles, throughput, error rates
- **Model Analytics** - Confidence scores, model selection patterns
## Model Comparison

### Performance Matrix

| Metric | TF-IDF + SVM | DistilBERT Transformer | Best Use Case |
|---|---|---|---|
| Accuracy | 94.8% | 92.3% | TF-IDF for English |
| F1-Score | 94.7% | 91.8% | TF-IDF for balance |
| Inference Speed | 10-15 ms | 45-55 ms | TF-IDF for speed |
| Memory Usage | 120 MB | 890 MB | TF-IDF for efficiency |
| Multilingual | Limited | Excellent | Transformer for i18n |
| Complex Context | Limited | Excellent | Transformer for context |

## AI Agent Intelligence

### Smart Routing Logic

The agent evaluates ordered rules per request; the first matching rule selects the model:
```python
def select_model(self, text: str, language: str) -> str:
    """Ordered routing rules; the first matching rule wins."""
    word_count = len(text.split())
    rules = [
        # Rule 1: Short English text -> fast TF-IDF path
        (lambda: language == "en" and word_count <= 10, "TF-IDF"),
        # Rule 2: Low estimated complexity
        (lambda: self.estimate_complexity(text) < 0.3, "TF-IDF"),
        # Rule 3: Non-English languages
        (lambda: language in ["fr", "ar", "es"], "Transformer"),
        # Rule 4: Long or complex text
        (lambda: word_count > 15 or self.has_technical_terms(text), "Transformer"),
        # Default: Transformer for safety
        (lambda: True, "Transformer"),
    ]
    for condition, model in rules:
        if condition():
            return model
```
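A self-contained version of the same rule cascade, with stubbed `estimate_complexity` and `has_technical_terms` helpers (the real implementations live in `src/agent/router.py` and are not shown in this README; the stubs below are illustrative guesses):

```python
def estimate_complexity(text: str) -> float:
    # Crude stand-in: longer average word length ~ higher complexity.
    words = text.split()
    return min(1.0, sum(len(w) for w in words) / (len(words) * 12)) if words else 0.0

def has_technical_terms(text: str) -> bool:
    # Hypothetical keyword list, for illustration only.
    return any(term in text.lower() for term in ("kubernetes", "vpn", "dns", "ssl"))

def select_model(text: str, language: str) -> str:
    word_count = len(text.split())
    rules = [
        (lambda: language == "en" and word_count <= 10, "TF-IDF"),
        (lambda: estimate_complexity(text) < 0.3, "TF-IDF"),
        (lambda: language in ["fr", "ar", "es"], "Transformer"),
        (lambda: word_count > 15 or has_technical_terms(text), "Transformer"),
        (lambda: True, "Transformer"),
    ]
    for condition, model in rules:
        if condition():
            return model

print(select_model("Printer not working", "en"))       # TF-IDF (short English)
print(select_model("Problème avec le serveur", "fr"))  # Transformer (French)
```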
### PII Scrubbing Pipeline

```python
import re
from typing import Dict, Tuple

class PIIScrubber:
    def scrub(self, text: str) -> Tuple[str, Dict]:
        """Remove personally identifiable information."""
        patterns = {
            'email': r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b',
            'phone': r'\b(?:\+?\d{1,3}[-.]?)?\(?\d{3}\)?[-.]?\d{3}[-.]?\d{4}\b',
            'cin': r'\b\d{8}\b',  # Tunisian CIN
            'name': r'\b(?:Mr|Ms|Mrs|Dr)\.?\s+[A-Z][a-z]+\s+[A-Z][a-z]+\b',
        }
        detected = {}
        scrubbed_text = text
        for pii_type, pattern in patterns.items():
            matches = re.findall(pattern, text)
            if matches:
                detected[pii_type] = matches
                scrubbed_text = re.sub(pattern, f'[{pii_type.upper()}]', scrubbed_text)
        return scrubbed_text, detected
```
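A quick demonstration of the email and phone patterns above, applied to the sample ticket from the API section:

```python
import re

# Same patterns as in the scrubber above.
EMAIL = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
PHONE = r'\b(?:\+?\d{1,3}[-.]?)?\(?\d{3}\)?[-.]?\d{3}[-.]?\d{4}\b'

ticket = "My laptop screen is broken. Contact: john.doe@company.com or 123-456-7890."
scrubbed = re.sub(PHONE, '[PHONE]', re.sub(EMAIL, '[EMAIL]', ticket))
print(scrubbed)
# My laptop screen is broken. Contact: [EMAIL] or [PHONE].
```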
## Project Structure

```text
CallCenterAI/
├── .github/workflows/            # CI/CD Pipelines
│   ├── ci-cd.yml                 # Main workflow
│   ├── security-scan.yml         # Security checks
│   └── docker-publish.yml        # Docker image builds
│
├── docker/                       # Docker configurations
│   ├── Dockerfile.agent          # AI Agent service
│   ├── Dockerfile.tfidf          # TF-IDF service
│   ├── Dockerfile.transformer    # Transformer service
│   └── prometheus.yml            # Monitoring config
│
├── src/                          # Source code
│   ├── agent/                    # Intelligent routing agent
│   │   ├── api.py                # FastAPI application
│   │   ├── router.py             # Model selection logic
│   │   ├── pii_scrubber.py       # PII detection & removal
│   │   └── language_detector.py  # Language identification
│   │
│   ├── tfidf_service/            # Traditional ML service
│   │   ├── api.py                # REST API endpoints
│   │   ├── model.py              # TF-IDF+SVM model wrapper
│   │   └── metrics.py            # Prometheus metrics
│   │
│   ├── transformer_service/      # Deep learning service
│   │   ├── api.py                # FastAPI application
│   │   ├── model.py              # HuggingFace model wrapper
│   │   └── tokenizer.py          # Text preprocessing
│   │
│   ├── training/                 # Model training scripts
│   │   ├── train_tfidf.py        # TF-IDF pipeline training
│   │   ├── train_transformer.py  # Transformer fine-tuning
│   │   └── data_preprocessor.py  # Data cleaning & preparation
│   │
│   └── utils/                    # Shared utilities
│       ├── mlflow_client.py      # MLflow integration
│       ├── dvc_utils.py          # Data version control
│       └── config.py             # Configuration management
│
├── tests/                        # Comprehensive test suite
│   ├── test_agent.py             # Agent functionality tests
│   ├── test_tfidf.py             # TF-IDF service tests
│   ├── test_transformer.py       # Transformer service tests
│   ├── test_integration.py       # Integration tests
│   ├── test_training.py          # Training pipeline tests
│   └── conftest.py               # pytest fixtures
│
├── data/                         # Datasets
│   ├── raw/                      # Original Kaggle dataset
│   ├── processed/                # Cleaned & preprocessed data
│   └── external/                 # External resources
│
├── models/                       # Trained models
│   ├── tfidf_svm.pkl             # TF-IDF+SVM pipeline
│   ├── label_encoder.pkl         # Category mappings
│   └── transformer/              # Fine-tuned DistilBERT
│       ├── config.json
│       ├── pytorch_model.bin
│       └── tokenizer.json
│
├── monitoring/                   # Observability configs
│   ├── grafana/                  # Dashboard definitions
│   │   └── dashboard.json        # Main Grafana dashboard
│   └── prometheus/               # Alert rules & configs
│       ├── alerts.yml            # Alerting rules
│       └── recording_rules.yml   # Recording rules
│
├── scripts/                      # Utility scripts
│   ├── deploy.sh                 # Deployment script
│   ├── health_check.sh           # Service health verification
│   └── benchmark.py              # Performance benchmarking
│
├── docker-compose.yaml           # Full stack orchestration
├── dvc.yaml                      # Data pipeline definition
├── requirements.txt              # Production dependencies
├── requirements-dev.txt          # Development dependencies
├── pyproject.toml                # Python project configuration
├── .pre-commit-config.yaml       # Git hooks
└── README.md                     # This file
```
## Docker Services

### Service Configuration

```yaml
version: '3.8'

services:
  agent:
    build:
      context: .
      dockerfile: docker/Dockerfile.agent
    ports: ["8000:8000"]
    environment:
      - TFIDF_SERVICE_URL=http://tfidf:8010
      - TRANSFORMER_SERVICE_URL=http://transformer:8020
      - MLFLOW_TRACKING_URI=http://mlflow:5000
    depends_on:
      - tfidf
      - transformer
    deploy:
      resources:
        limits:
          memory: 512M
        reservations:
          memory: 256M

  tfidf:
    build:
      context: .
      dockerfile: docker/Dockerfile.tfidf
    ports: ["8010:8010"]
    environment:
      - MODEL_PATH=/app/models/tfidf_svm.pkl
    volumes:
      - ./models:/app/models

  transformer:
    build:
      context: .
      dockerfile: docker/Dockerfile.transformer
    ports: ["8020:8020"]
    environment:
      - MODEL_NAME=distilbert-base-multilingual-cased
      - MODEL_PATH=/app/models/transformer
    volumes:
      - ./models:/app/models
    deploy:
      resources:
        limits:
          memory: 2G
        reservations:
          memory: 1G

  mlflow:
    image: ghcr.io/mlflow/mlflow:latest
    ports: ["5000:5000"]
    volumes:
      - ./mlflow_data:/mlflow
    command: >
      mlflow server
      --host 0.0.0.0
      --port 5000
      --backend-store-uri sqlite:///mlflow/mlflow.db
      --default-artifact-root /mlflow/artifacts

  prometheus:
    image: prom/prometheus:latest
    ports: ["9090:9090"]
    volumes:
      - ./docker/prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--storage.tsdb.retention.time=200h'
      - '--web.enable-lifecycle'

  grafana:
    image: grafana/grafana:latest
    ports: ["3000:3000"]
    volumes:
      - grafana_data:/var/lib/grafana
      - ./monitoring/grafana/dashboards:/etc/grafana/provisioning/dashboards
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
      - GF_INSTALL_PLUGINS=grafana-piechart-panel
    depends_on:
      - prometheus

volumes:
  mlflow_data:
  prometheus_data:
  grafana_data:
```

## CI/CD Pipeline

### GitHub Actions Workflow

```yaml
name: CI/CD Pipeline

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Setup Python
        uses: actions/setup-python@v4
        with: {python-version: '3.11'}

      - name: Install dependencies
        run: pip install -r requirements-dev.txt

      - name: Lint code
        run: |
          black --check src tests
          flake8 src tests
          isort --check-only src tests

      - name: Run tests
        run: pytest tests/ -v --cov=src --cov-report=xml

      - name: Security scan
        run: |
          bandit -r src
          trivy fs --severity HIGH,CRITICAL .

  build:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - name: Build Docker images
        run: |
          docker build -f docker/Dockerfile.agent -t callcenterai-agent .
          docker build -f docker/Dockerfile.tfidf -t callcenterai-tfidf .
          docker build -f docker/Dockerfile.transformer -t callcenterai-transformer .

      - name: Scan images
        run: |
          trivy image callcenterai-agent
          trivy image callcenterai-tfidf
          trivy image callcenterai-transformer

      - name: Push to Registry
        if: github.ref == 'refs/heads/main'
        run: |
          echo "${{ secrets.DOCKER_PASSWORD }}" | docker login -u "${{ secrets.DOCKER_USERNAME }}" --password-stdin
          docker push callcenterai/agent:latest
          docker push callcenterai/tfidf:latest
          docker push callcenterai/transformer:latest
```
## FAQ & Troubleshooting

### Common Issues & Solutions

#### Port Conflicts

```bash
sudo lsof -i :8000              # Linux/Mac
netstat -ano | findstr :8000    # Windows
sudo kill -9 <PID>
```

#### Docker Build Failures

```bash
docker system prune -a
docker builder prune --all
docker-compose build --no-cache
docker system df
```

#### Model Loading Warnings

```bash
python train_and_log.py --retrain
pip install scikit-learn==1.6.1
export PYTHONWARNINGS="ignore"
```

#### Service Communication Issues

```bash
docker exec callcenterai-agent-1 curl -s http://tfidf:8010/health
docker exec callcenterai-agent-1 curl -s http://transformer:8020/health
docker exec callcenterai-agent-1 nslookup tfidf
```

### Performance Optimization

**For Development:**

```bash
docker-compose up -d --scale transformer=1
docker update --memory 1G --cpus 1 callcenterai-transformer-1
```

**For Production:**

```bash
docker-compose -f docker-compose.yml -f docker-compose.gpu.yml up -d
docker-compose up -d --scale tfidf=3 --scale transformer=2
docker-compose -f docker-compose.yml -f docker-compose.traefik.yml up -d
```

### Logging & Debugging

```bash
docker-compose logs -f --tail=50 agent
docker logs callcenterai-agent-1 --since 10m
docker exec -it callcenterai-agent-1 /bin/bash
curl http://localhost:9090/api/v1/query?query=http_requests_total
```
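The Prometheus HTTP API wraps query results in a JSON envelope. Extracting a metric value in Python (the response body below is a hypothetical, abridged sample, not output from a live deployment):

```python
import json

# Hypothetical abridged response from /api/v1/query?query=http_requests_total
raw = '''
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {"__name__": "http_requests_total", "endpoint": "/classify"},
       "value": [1734300000.0, "1278"]}
    ]
  }
}
'''

response = json.loads(raw)
for series in response["data"]["result"]:
    labels = series["metric"]
    timestamp, value = series["value"]  # Prometheus encodes sample values as strings
    print(labels.get("endpoint"), float(value))
```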
## System Screenshots

- **AI Agent Service (Port 8000)** - Real-time classification interface showing intelligent routing decisions
- **CI/CD Pipeline (GitHub Actions)** - Automated testing, security scanning, and deployment pipeline
- **Prometheus Monitoring (Port 9090)** - Real-time metrics collection and alerting system
- **MLflow Experiment Tracking (Port 5000)** - Model versioning, experiment comparison, and artifact storage
- **Docker Container Management** - All services running in isolated Docker containers
- **Grafana Analytics Dashboard (Port 3000)** - Business intelligence and performance monitoring dashboard
## Contributing

We welcome contributions! Here's how to get started:

1. Fork the repository
2. Create a feature branch:
   ```bash
   git checkout -b feature/amazing-feature
   ```
3. Make your changes
4. Run tests and linting:
   ```bash
   pre-commit run --all-files
   pytest tests/ -v
   ```
5. Commit your changes:
   ```bash
   git commit -m "Add amazing feature"
   ```
6. Push to your branch:
   ```bash
   git push origin feature/amazing-feature
   ```
7. Open a Pull Request
### Development Guidelines

- Follow the PEP 8 style guide
- Write unit tests for new functionality
- Update documentation for API changes
- Use type hints for better code clarity
- Keep commit messages descriptive

### Project Roadmap

- Add support for more languages
- Implement A/B testing framework
- Add real-time streaming support
- Develop mobile application
- Create admin dashboard
- Add advanced analytics features
## License

This project is licensed under the MIT License - see the LICENSE file for details.

### Acknowledgments

- **Dataset**: IT Service Ticket Classification from Kaggle
- **Models**: Hugging Face Transformers library
- **MLOps Tools**: MLflow, DVC, Prometheus, Grafana teams
- **Web Framework**: FastAPI for excellent performance
- **Containerization**: Docker and Docker Compose

### Citation

If you use this project in your research or work, please cite:

```bibtex
@software{CallCenterAI2025,
  author = {Elyes-2},
  title  = {CallCenterAI: Intelligent Customer Ticket Classification},
  year   = {2025},
  url    = {https://github.com/Elyes-2/CallCenterAI}
}
```
Star this repository if you found it useful!