Kael Valen Banner

Mehmet Arda Hakbilen (Kael Valen)

ML Architecture Researcher · Non-Transformer Models · Systematic Learning


Typing SVG


Research badge Focus badge Profile views counter


Researching non-transformer architectures through systematic implementation.
Not just reading papers — building Mamba, RWKV, Flash Attention from scratch.


Email LinkedIn GitHub


GitHub Analytics & Activity

Tracking systematic learning through daily commits


Kael Valen's Metrics

Snake Animation


Research & Core Focus

Beyond Transformer: Alternative Sequence Architectures

Questioning the assumption that transformers are the only viable solution

  • State-Space Models → Mamba & RWKV implementations

    • Linear-time inference vs quadratic attention complexity
    • Selective state mechanisms for efficient memory
    • Comparing trade-offs: speed vs expressiveness
  • Hybrid Architectures → RNN + Attention combinations

    • Exploring best of both worlds: recurrence + selectivity
    • Custom memory systems for long-context tasks
    • Implementation-first approach to understanding
  • Flash Attention v2 → From-scratch CUDA optimization

    • Understanding memory-efficient attention at kernel level
    • Production inference optimization
    • 10x speedup through proper memory access patterns
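To make the linear-time claim concrete, here is a minimal diagonal state-space recurrence in NumPy. This is a toy sketch of my own, not Mamba itself: the input-dependent `B` and `C` stand in for the selective mechanism, and each step does constant work, so the whole scan is O(L·N) rather than the O(L²) of full attention.

```python
import numpy as np

def selective_scan(x, A, B, C):
    """Toy diagonal state-space recurrence (illustrative, not Mamba).

    x: (L,)    scalar input sequence
    A: (N,)    per-channel decay, 0 <= A <= 1
    B: (L, N)  input-dependent input projection (the "selective" part)
    C: (L, N)  input-dependent output projection
    Returns y: (L,). Each step is O(N) work, so the scan is linear in L.
    """
    h = np.zeros(A.shape[0])       # hidden state, one value per channel
    y = np.empty(len(x))
    for t, xt in enumerate(x):
        h = A * h + B[t] * xt      # constant work per step
        y[t] = C[t] @ h            # readout through the t-th projection
    return y
```

With `A = 0` the state forgets everything and the output depends only on the current input; with `A = 1` the state accumulates the full history. Real selective SSMs sit in between, with `A` (and its discretization) also input-dependent.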

Featured Projects

Beyond Transformer

Open-source alternative architecture research

Implementation-first approach to understanding non-transformer models. Building Mamba, RWKV, and hybrid systems from scratch to compare trade-offs.

PyTorch JAX CUDA ONNX

Current Phase:
• Flash Attention v2 from scratch
• Mamba state-space model analysis
• RWKV architecture comparison

SentinelFS

Distributed P2P File Sync (Archived)

Autonomous peer-to-peer synchronization with ML-based anomaly detection, delta-sync algorithms, and genetic topology remeshing for fault tolerance.

C++17 Threading P2P ML

Key Features:
• ML anomaly detection pipeline
• Self-healing network topology
• Zero-copy delta sync protocol
• Byzantine fault tolerance
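The delta-sync idea above rests on a weak rolling checksum, which makes scanning a file for matching blocks cheap. A generic rsync-style sketch (my own illustration; SentinelFS's actual protocol may differ):

```python
def rolling_checksums(data, block):
    """Adler-style weak rolling checksum over every `block`-byte window.

    Sliding the window costs O(1) instead of O(block), which is what
    makes rsync-style delta detection practical: the receiver hashes its
    blocks once, then the sender slides this checksum over its copy to
    find blocks it does not need to resend.
    """
    a = sum(data[:block])                               # sum of window bytes
    b = sum((block - i) * data[i] for i in range(block))  # position-weighted sum
    out = [(a, b)]
    for i in range(len(data) - block):
        a += data[i + block] - data[i]  # drop left byte, add right byte
        b += a - block * data[i]        # update weighted sum from new a
        out.append((a, b))
    return out
```

In a full protocol, windows whose weak checksum matches a known block are confirmed with a strong hash before a "copy" instruction replaces the literal bytes.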

Research Blog (Planned)

Implementation Notes & Architecture Analysis

Documenting the learning journey: from-scratch implementations, architecture comparisons, and trade-off analysis. Focus on practical insights over theory.


Markdown GitHub Pages Technical Writing

Planned Topics:
• Mamba vs Transformer trade-offs
• Flash Attention internals
• CUDA kernel optimization
• JAX vs PyTorch for research
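The "Flash Attention internals" topic largely comes down to the online-softmax rescaling trick. A single-query NumPy sketch of that trick (my own illustration; the real kernel adds query tiling, SRAM-aware blocking, and the backward pass, which is where the speedup comes from):

```python
import numpy as np

def attention_reference(q, K, V):
    """Standard softmax attention for one query: materializes all scores."""
    s = K @ q
    p = np.exp(s - s.max())
    return (p / p.sum()) @ V

def attention_online(q, K, V, block=4):
    """Streaming attention over key/value blocks via online softmax."""
    m = -np.inf                     # running max of scores seen so far
    l = 0.0                         # running softmax normalizer
    acc = np.zeros(V.shape[1])      # unnormalized weighted sum of values
    for i in range(0, len(K), block):
        s = K[i:i + block] @ q
        m_new = max(m, s.max())
        scale = np.exp(m - m_new)   # rescale old stats to the new max
        p = np.exp(s - m_new)
        l = l * scale + p.sum()
        acc = acc * scale + p @ V[i:i + block]
        m = m_new
    return acc / l
```

Because earlier statistics are rescaled by `exp(m - m_new)` whenever a larger score appears, the blocks can be processed in one pass without ever materializing the full score matrix, which is the memory saving Flash Attention exploits.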


Tech Stack & Tools

ML Frameworks
Research Tools
Core Languages
Infrastructure
Web Stack (I'm not happy about this, ngl)

Collaboration & Contact

Open to research collaboration & technical discussions
Interested in alternative architectures, efficient inference, or systematic ML learning?

What I'm looking for:
• Co-researchers on non-transformer architectures
• Code review & implementation feedback
• Trade-off discussions: speed vs accuracy vs memory

What I'm not interested in:
❌ Wrapper apps without novel architecture
❌ "Just use ChatGPT API" projects
❌ Hype-driven development


"Implementation over theory. Trade-offs over hype. Systematic learning over vibe coding."


Star the repos if you like them. Building in private — launching into space soon.

Pinned repositories

  1. latch-lang (Public): Latch Language. Rust, 1 star.

  2. synthe (Public): Python.

  3. beyond_transformer (Public): Parallel Unified Linear State Engine (PULSE). Exploring a new AI paradigm beyond Transformers. Includes literature review, proposed architectures, and experiment plans. Python, 1 star.