I'm a final-year engineering student at IMT Nord Europe specializing in Data Science & Artificial Intelligence (CPGE MPSI/MP* background). I'm passionate about building robust ML pipelines, exploring LLM architectures, and turning messy data into actionable insight.
I'm currently preparing for a Generative AI Engineer internship at IFPI (Malta) and actively seeking a work-study position (alternance) in AI/Data roles starting September 2026.
Languages
Python · SQL · Java
Data & ML
PyTorch · Scikit-learn · XGBoost · Sentence-Transformers · LangChain · Pandas · NumPy · mlxtend
Data Engineering & Infra
SQLite · PostgreSQL · Dagster · Docker · Kafka
Visualization & Web
Streamlit · Django · NetworkX · Plotly · Matplotlib
Human Languages
🇫🇷 French (Native) · 🇬🇧 English (B2)
Python · Pandas · Scikit-learn · XGBoost · Sentence-BERT · NetworkX · SQLite
A research-grade data pipeline analyzing 126,000+ Steam games to transform noisy, user-generated tags (a folksonomy) into reliable ludological metadata. The project spans 8 stages, including FP-Growth association rule mining, co-occurrence network analysis (lift), multi-algorithm clustering (Louvain, OPTICS, K-Medoids), ML classification, and Sentence-BERT for deep semantic analysis and diachronic drift measurement (PMI).
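As a rough illustration of the co-occurrence stage (not the project's actual code), lift and PMI for tag pairs might be computed as below. The function name `tag_pair_scores` and the toy game list are assumptions made for this sketch.

```python
import math
from collections import Counter
from itertools import combinations


def tag_pair_scores(games):
    """Return {(tag_a, tag_b): (lift, pmi)} for every co-occurring tag pair.

    `games` is a list of tag lists, one per game (toy input, assumed here).
    """
    n = len(games)
    tag_counts = Counter()
    pair_counts = Counter()
    for tags in games:
        unique = sorted(set(tags))  # dedupe and fix the pair ordering
        tag_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    scores = {}
    for (a, b), n_ab in pair_counts.items():
        # lift = P(a, b) / (P(a) * P(b)); PMI is its base-2 logarithm
        lift = (n_ab / n) / ((tag_counts[a] / n) * (tag_counts[b] / n))
        scores[(a, b)] = (lift, math.log2(lift))
    return scores


# Hypothetical sample, not taken from the real dataset
games = [
    ["Roguelike", "Pixel Graphics", "Indie"],
    ["Roguelike", "Pixel Graphics"],
    ["Indie", "Puzzle"],
    ["Puzzle", "Indie", "Pixel Graphics"],
]
scores = tag_pair_scores(games)
```

A lift above 1 (positive PMI) suggests two tags appear together more often than chance would predict, which is the kind of edge weight a co-occurrence network could use.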
PyTorch · Transformers · Contrastive Learning (InfoNCE)
Developed a joint embedding space that aligns natural language descriptions with 3D human motion sequences, with a focus on complex human-human interactions (hugging, handshaking). Trained with contrastive objectives (InfoNCE) and evaluated retrieval quality via Recall@K. Achieved a top-3 result in a competitive challenge.
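To make the two named ingredients concrete, here is a minimal, framework-free sketch of a symmetric InfoNCE loss and of Recall@K over a text-to-motion similarity matrix. It is written in plain Python for readability rather than PyTorch; the temperature of 0.07 and the toy matrix are assumptions, not values from the project.

```python
import math


def info_nce(sim, temperature=0.07):
    """Symmetric InfoNCE over a square similarity matrix.

    sim[i][j] is the similarity between text i and motion j;
    matching pairs are assumed to sit on the diagonal.
    """
    n = len(sim)

    def one_direction(rows):
        total = 0.0
        for i, row in enumerate(rows):
            logits = [s / temperature for s in row]
            m = max(logits)  # subtract the max for numerical stability
            log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
            total += log_sum - logits[i]  # cross-entropy with target i
        return total / n

    columns = [list(col) for col in zip(*sim)]  # motion-to-text direction
    return 0.5 * (one_direction(sim) + one_direction(columns))


def recall_at_k(sim, k):
    """Fraction of texts whose matching motion ranks in the top k."""
    n = len(sim)
    hits = 0
    for i, row in enumerate(sim):
        ranked = sorted(range(n), key=lambda j: row[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / n


# Toy similarity matrix (assumed values for illustration)
sim = [
    [0.9, 0.1, 0.2],
    [0.3, 0.4, 0.8],
    [0.1, 0.2, 0.7],
]
loss = info_nce(sim)
r1 = recall_at_k(sim, 1)
r2 = recall_at_k(sim, 2)
```

In this toy matrix the second text is retrieved correctly only within the top 2, so Recall@1 is lower than Recall@2.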
Django · LangChain · Gemma 4-31B · Gemini 1.5 · Sentence-Transformers
An interactive web platform featuring four AI-powered game modes around anime & manga. Highlights include: cosine similarity-based guessing, Chain-of-Thought story archetype generation, Undercover Anime (Explainable AI deduction game), hybrid LLM stack (local Gemma 4 with automatic Gemini fallback), PydanticOutputParser for structured JSON outputs, and RAG with pre-computed NumPy vectors.
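The local-first, hosted-fallback idea can be sketched with plain callables. This is only an assumed shape of the pattern: `generate_with_fallback`, `local_model_down`, and `hosted_model` are hypothetical stand-ins, and no real Gemma, Gemini, or LangChain call is made.

```python
def generate_with_fallback(prompt, primary, fallback):
    """Try the primary model first; on any error, use the fallback.

    Returns the generated text and a label saying which model answered.
    """
    try:
        return primary(prompt), "primary"
    except Exception:
        return fallback(prompt), "fallback"


def local_model_down(prompt):
    # Hypothetical stand-in for a local model that is unreachable
    raise ConnectionError("local model unreachable")


def hosted_model(prompt):
    # Hypothetical stand-in for a hosted API call
    return f"answer to: {prompt}"


text, source = generate_with_fallback(
    "Name a classic shonen archetype.", local_model_down, hosted_model
)
```

LangChain runnables also expose a `with_fallbacks` method that may cover the same need without a hand-written wrapper.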
Scikit-learn · XGBoost · Conformal Prediction · Streamlit · Librosa
A dual-purpose platform combining ML-based genre classification (with conformal prediction for uncertainty quantification) and a Shazam-like audio fingerprinting engine using spectrogram peak analysis. Features a cosine-similarity recommendation engine with adjustable feature weighting.
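One common way to attach uncertainty to a classifier is split conformal prediction, sketched below. The description above does not say which variant the project uses, so the nonconformity score (1 minus the probability of the true class), the miscoverage level `alpha`, and the toy calibration data are all assumptions.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Calibrate a score threshold on held-out data.

    Nonconformity score = 1 - predicted probability of the true class.
    """
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-indexed conformal quantile
    if rank > n:
        return 1.0  # too few calibration points: keep every class
    return scores[rank - 1]


def prediction_set(probs, threshold):
    """Return every class index whose score is within the threshold."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= threshold]


# Toy calibration set: class probabilities and true labels (assumed)
cal_probs = [
    [0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.1, 0.3, 0.6], [0.5, 0.4, 0.1],
    [0.3, 0.6, 0.1], [0.1, 0.1, 0.8], [0.4, 0.5, 0.1],
]
cal_labels = [0, 1, 2, 0, 1, 2, 0]

threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.25)
confident = prediction_set([0.55, 0.40, 0.05], threshold)
ambiguous = prediction_set([0.50, 0.50, 0.00], threshold)
```

A confident prediction yields a single genre, while an ambiguous track yields a set of several candidate genres instead of one possibly wrong label.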
Python · PostgreSQL · Dagster · Scikit-learn · Streamlit · AniList GraphQL API
End-to-end data platform extracting 25,000+ anime from the AniList API, computing TF-IDF weighted recommendations (70% metadata / 30% synopsis), and serving them via a Streamlit interface. Orchestrated with Dagster (weekly scheduling) on a serverless Neon PostgreSQL backend. Includes a Higher or Lower anime score guessing game.
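The 70/30 weighting could be read as a blend of two cosine similarities, one over metadata vectors and one over synopsis vectors. The sketch below assumes that reading and takes already-computed sparse TF-IDF weights as input; the titles `anime_a`, `anime_b`, `anime_c` and their weights are placeholders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def blended_score(a, b, w_meta=0.7, w_synopsis=0.3):
    """Weighted sum of metadata and synopsis similarity."""
    return (w_meta * cosine(a["meta"], b["meta"])
            + w_synopsis * cosine(a["synopsis"], b["synopsis"]))


def recommend(title, catalog, top_n=2):
    """Rank every other title by blended similarity to `title`."""
    query = catalog[title]
    scored = [
        (other, blended_score(query, vectors))
        for other, vectors in catalog.items()
        if other != title
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]


# Placeholder catalog with made-up TF-IDF weights
catalog = {
    "anime_a": {"meta": {"action": 1.0, "shonen": 1.0}, "synopsis": {"ninja": 1.0}},
    "anime_b": {"meta": {"action": 1.0, "shonen": 1.0}, "synopsis": {"pirate": 1.0}},
    "anime_c": {"meta": {"romance": 1.0}, "synopsis": {"ninja": 1.0}},
}
recommendations = recommend("anime_a", catalog)
```

With these weights, a title sharing genres but not plot vocabulary outranks one sharing only plot vocabulary, which matches the intent of favoring metadata.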
Python · NEAT · Pygame
Trained an AI agent to master Flappy Bird using NeuroEvolution of Augmenting Topologies (NEAT). A population of genomes evolves across generations to maximize survival score. Includes network visualization, genealogy tracking, and a replay mode for the best-performing agent.
Python · Kivy · JsonStore
A real-time sports performance data collection app designed for PE teachers and coaches. Features configurable variables, live score tracking, an integrated timer, and automatic .txt archiving. Deployed in a school context with a 4th/5th grade volleyball pilot session.
🏃 Sports & Fitness · 🌍 Sustainability · 🤖 LLMs & GenAI · 📐 Mathematics · 📚 Cross-cultural reading
"Turning noise into signal — whether in data, tags, or tokens."