A feature-rich, modular C compiler written entirely in C that implements a complete compilation pipeline from source code to x86-64 assembly. This educational project demonstrates modern compiler design with clean architecture and comprehensive tooling.
- Features
- Architecture
- Installation
- Usage
- Project Structure
- Modules Overview
- Examples
- Development
- Testing
- Contributing
- License
- Acknowledgments
- Complete Compilation Pipeline: Full implementation from lexing to code generation
- x86-64 Assembly: Generates optimized Linux assembly code
- Modular Design: Clean separation of concerns with reusable components
- Self-Hosting: Compiler can compile its own source code
- Error Recovery: Robust error handling and reporting
- Variables and Types: `int` type with type checking
- Control Flow: `if`/`else`, `while`, `for` loops
- Functions: Multiple functions with parameters
- Expressions: Full arithmetic, relational, and logical operators
- Arrays: One-dimensional arrays
- Pointers: Basic pointer operations
- Structures: Simple struct support
- Comprehensive Test Suite: Unit tests for all modules
- Build System: GNU Make with multiple build configurations
- Development Tools: Debug utilities and visualizers
- Documentation: API documentation and usage examples
- Performance Profiling: Built-in profiling support
The compiler follows a classic multi-pass architecture:
Source Code → Lexer → Tokens → Parser → AST → Semantic Analyzer → IR → Code Generator → Assembly
- Interfaces (`interfaces/`): Abstract interfaces for compiler phases
- Modules (`modules/`): Core implementation of compiler stages
- Utils (`utils/`): Supporting utilities and data structures
- Tests (`tests/`): Comprehensive test suite
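As a rough illustration of the first stage in the pipeline above, the sketch below shows how a lexer might turn source text into tokens while tracking line and column. The names `Token`, `Lexer`, `lexer_init`, and `lexer_next` are hypothetical and are not taken from the project's `lexer.h`; keywords such as `int` are lexed as plain identifiers here to keep the example short.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds; the project's lexer.h may define these differently. */
typedef enum { TOK_INT, TOK_IDENT, TOK_PUNCT, TOK_EOF } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start; /* points into the source buffer */
    size_t length;     /* number of characters in the lexeme */
    int line;          /* 1-based line of the first character */
    int column;        /* 1-based column of the first character */
} Token;

typedef struct {
    const char *cur;
    int line;
    int column;
} Lexer;

void lexer_init(Lexer *lx, const char *source) {
    assert(lx != NULL && source != NULL);
    lx->cur = source;
    lx->line = 1;
    lx->column = 1;
}

/* Consume one character, keeping line and column in sync. */
static void advance(Lexer *lx) {
    if (*lx->cur == '\n') {
        lx->line++;
        lx->column = 1;
    } else {
        lx->column++;
    }
    lx->cur++;
}

/* Return the next token; yields TOK_EOF at the end of the input. */
Token lexer_next(Lexer *lx) {
    while (isspace((unsigned char)*lx->cur)) advance(lx);

    Token tok = { TOK_EOF, lx->cur, 0, lx->line, lx->column };
    if (*lx->cur == '\0') return tok;

    if (isdigit((unsigned char)*lx->cur)) {
        tok.kind = TOK_INT;
        while (isdigit((unsigned char)*lx->cur)) advance(lx);
    } else if (isalpha((unsigned char)*lx->cur) || *lx->cur == '_') {
        tok.kind = TOK_IDENT; /* keywords are not distinguished in this sketch */
        while (isalnum((unsigned char)*lx->cur) || *lx->cur == '_') advance(lx);
    } else {
        tok.kind = TOK_PUNCT; /* single-character punctuation only */
        advance(lx);
    }
    tok.length = (size_t)(lx->cur - tok.start);
    return tok;
}
```

Calling `lexer_next` repeatedly until it returns `TOK_EOF` yields the token stream that a parser would consume. Storing the position on each token is what lets later stages report errors with source locations.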
# Required tools
sudo apt-get update
sudo apt-get install build-essential gcc nasm git make

# Clone the repository
git clone https://github.com/solomonkassa/Mini-C-Compiler.git
cd Mini-C-Compiler
# Build the compiler
make
# Run tests to verify installation
make test
# Build with optimizations
make release

# One-line setup and test
make && ./minic examples/hello.c

# Compile a C source file
./minic source.c
# Specify output file
./minic source.c -o output.s
# Compile and assemble directly
./minic source.c --executable program
# Show AST (for debugging)
./minic source.c --show-ast

# Enable optimizations
./minic source.c -O2
# Generate intermediate representation
./minic source.c --emit-ir
# Show symbol table
./minic source.c --show-symbols
# Verbose compilation output
./minic source.c -v

mini-c-compiler/
├── interfaces/ # Abstract interfaces
│ ├── lexer.h # Lexer interface
│ ├── parser.h # Parser interface
│ ├── codegen.h # Code generator interface
│ └── ast.h # AST node interfaces
├── modules/ # Implementation modules
│ ├── lexer.c # Tokenizer implementation
│ ├── parser.c # Recursive descent parser
│ ├── ast.c # AST construction and manipulation
│ ├── semantic.c # Semantic analysis
│ ├── codegen.c # x86-64 code generation
│ └── optimizer.c # Code optimization passes
├── tests/ # Test suite
│ ├── unit/ # Unit tests
│ ├── integration/ # Integration tests
│ ├── benchmarks/ # Performance tests
│ └── test_runner.c # Test runner
├── utils/ # Utility modules
│ ├── hashmap.c # Hash table implementation
│ ├── vector.c # Dynamic array
│ ├── stringbuf.c # String buffer
│ ├── error.c # Error reporting
│ └── debug.c # Debug utilities
├── examples/ # Example programs
│ ├── hello.c # Hello world
│ ├── fibonacci.c # Fibonacci sequence
│ ├── calculator.c # Expression calculator
│ └── structs.c # Structure example
├── docs/ # Documentation
│ ├── API.md # API reference
│ ├── ARCHITECTURE.md # Design documentation
│ └── CONTRIBUTING.md # Contribution guide
├── Makefile # Build system
├── minic.c # Main compiler driver
└── README.md # This file
- lexer.h: Tokenization interface with Unicode support
- parser.h: Parser interface with error recovery
- ast.h: Abstract Syntax Tree node definitions and visitors
- codegen.h: Backend interface for multiple targets
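To make the "node definitions and visitors" idea concrete, here is a minimal sketch of a visitor over a read-only expression tree. The `Node` layout, the `Visitor` callback table, and the functions `ast_accept` and `ast_eval` are assumptions made for illustration; the project's actual `ast.h` may be organized quite differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout; the project's ast.h may differ. */
typedef enum { NODE_INT, NODE_BINARY } NodeKind;

typedef struct Node {
    NodeKind kind;
    int value;                    /* used by NODE_INT */
    char op;                      /* used by NODE_BINARY: '+', '-', '*' */
    const struct Node *lhs, *rhs; /* children are never modified */
} Node;

/* A visitor is a table of callbacks plus caller-owned state. */
typedef struct Visitor {
    void (*visit_int)(struct Visitor *v, const Node *n);
    void (*visit_binary)(struct Visitor *v, const Node *n);
    void *state;
} Visitor;

/* Post-order traversal: children first, then the node itself. */
void ast_accept(const Node *n, Visitor *v) {
    assert(n != NULL && v != NULL);
    switch (n->kind) {
    case NODE_INT:
        if (v->visit_int) v->visit_int(v, n);
        break;
    case NODE_BINARY:
        ast_accept(n->lhs, v);
        ast_accept(n->rhs, v);
        if (v->visit_binary) v->visit_binary(v, n);
        break;
    }
}

/* Example visitor: evaluate the tree using a small explicit value stack. */
typedef struct { int stack[64]; int top; } EvalState;

static void eval_int(Visitor *v, const Node *n) {
    EvalState *s = v->state;
    assert(s->top < 64);
    s->stack[s->top++] = n->value;
}

static void eval_binary(Visitor *v, const Node *n) {
    EvalState *s = v->state;
    assert(s->top >= 2);
    int r = s->stack[--s->top];
    int l = s->stack[--s->top];
    int out = 0;
    switch (n->op) {
    case '+': out = l + r; break;
    case '-': out = l - r; break;
    case '*': out = l * r; break;
    default: assert(!"unknown operator");
    }
    s->stack[s->top++] = out;
}

int ast_eval(const Node *root) {
    EvalState state = { {0}, 0 };
    Visitor v = { eval_int, eval_binary, &state };
    ast_accept(root, &v);
    assert(state.top == 1);
    return state.stack[0];
}
```

Because the traversal lives in `ast_accept`, a second pass (for example a pretty-printer or a type checker) would only need to supply a different callback table, and the nodes themselves stay untouched.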
- Lexer Module: Converts source code to tokens with location tracking
- Parser Module: Recursive descent parser with precedence climbing
- AST Module: Immutable AST with visitor pattern
- Semantic Module: Type checking and symbol resolution
- CodeGen Module: x86-64 assembly generator with register allocation
- Optimizer Module: Constant folding, dead code elimination
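The parser entry above mentions precedence climbing. The following is a small, self-contained sketch of that technique, not the project's parser: it evaluates integer expressions directly instead of building an AST, and the names `ExprParser`, `parse_expr`, and `eval_expression` are hypothetical. Malformed input is only guarded by `assert`, where a real parser would report an error.

```c
#include <assert.h>
#include <ctype.h>

typedef struct { const char *p; } ExprParser;

static void skip_ws(ExprParser *ps) {
    while (isspace((unsigned char)*ps->p)) ps->p++;
}

/* Binding power of a binary operator; 0 means "not a binary operator". */
static int precedence(char op) {
    switch (op) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    default: return 0;
    }
}

static long parse_expr(ExprParser *ps, int min_prec);

/* primary := integer | '(' expr ')' */
static long parse_primary(ExprParser *ps) {
    skip_ws(ps);
    if (*ps->p == '(') {
        ps->p++;
        long v = parse_expr(ps, 1);
        skip_ws(ps);
        assert(*ps->p == ')');
        ps->p++;
        return v;
    }
    assert(isdigit((unsigned char)*ps->p));
    long v = 0;
    while (isdigit((unsigned char)*ps->p)) v = v * 10 + (*ps->p++ - '0');
    return v;
}

/* Parse operators whose precedence is at least min_prec. */
static long parse_expr(ExprParser *ps, int min_prec) {
    long lhs = parse_primary(ps);
    for (;;) {
        skip_ws(ps);
        char op = *ps->p;
        int prec = precedence(op);
        if (prec == 0 || prec < min_prec) return lhs;
        ps->p++;
        /* Passing prec + 1 makes every operator left-associative. */
        long rhs = parse_expr(ps, prec + 1);
        switch (op) {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': assert(rhs != 0); lhs /= rhs; break;
        }
    }
}

long eval_expression(const char *text) {
    ExprParser ps = { text };
    return parse_expr(&ps, 1);
}
```

The single `min_prec` parameter replaces the one-function-per-precedence-level structure of a plain recursive descent grammar, so adding an operator only means adding a row to `precedence`. Evaluating a literal-only expression while parsing, as done here, is also the essence of constant folding.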
- Memory Management: Arena allocator for fast allocation
- Collections: Generic vector, hashmap, and stack
- String Handling: Safe string operations and formatting
- Error System: Rich error messages with source locations
- Debug Tools: AST visualizer and IR dumper
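The arena allocator listed above can be sketched as a bump allocator over one fixed block. This is a minimal illustration under assumed names (`Arena`, `arena_init`, `arena_alloc`, `arena_reset`, `arena_free`) and an assumed 16-byte alignment; the project's allocator may grow in chunks or use a different interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ARENA_ALIGN 16 /* assumed alignment, enough for common x86-64 types */

typedef struct {
    unsigned char *base; /* start of the backing block */
    size_t capacity;     /* size of the backing block in bytes */
    size_t used;         /* bytes handed out so far */
} Arena;

/* Returns 1 on success, 0 if the backing block could not be allocated. */
int arena_init(Arena *a, size_t capacity) {
    assert(a != NULL);
    a->base = malloc(capacity);
    a->capacity = capacity;
    a->used = 0;
    return a->base != NULL;
}

/* Bump-allocate size bytes; returns NULL when the arena is exhausted. */
void *arena_alloc(Arena *a, size_t size) {
    /* Round the current offset up to the next multiple of ARENA_ALIGN. */
    size_t start = (a->used + ARENA_ALIGN - 1) & ~(size_t)(ARENA_ALIGN - 1);
    if (start > a->capacity || size > a->capacity - start) return NULL;
    a->used = start + size;
    return a->base + start;
}

/* Discard every allocation at once without returning memory to the system. */
void arena_reset(Arena *a) {
    a->used = 0;
}

/* Release the backing block; the arena must be re-initialized before reuse. */
void arena_free(Arena *a) {
    free(a->base);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
}
```

Allocation is a pointer bump and there is no per-object free, which suits compiler data such as tokens and AST nodes that share one lifetime and can be released together at the end of a compilation.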
// examples/hello.c
#include "minic_std.h"
int main() {
    print_string("Hello, World!\n");
    return 0;
}

// examples/fibonacci.c
int fibonacci(int n) {
    if (n <= 1) return n;
    return fibonacci(n-1) + fibonacci(n-2);
}
int main() {
    return fibonacci(10); // Returns 55
}

// examples/array.c
int sum_array(int arr[], int size) {
    int total = 0;
    for (int i = 0; i < size; i++) {
        total += arr[i];
    }
    return total;
}
int main() {
    int numbers[5] = {1, 2, 3, 4, 5};
    return sum_array(numbers, 5); // Returns 15
}

# Debug build with symbols
make debug
# Release build with optimizations
make release
# Build with sanitizers
make sanitize
# Generate documentation
make docs

The project follows a consistent coding style:
- ISO C99 with GNU extensions
- 4-space indentation
- K&R brace style
- Descriptive variable names
- Doxygen-style comments
# Run with debug output
./minic source.c --debug
# Generate AST visualization (requires Graphviz)
./minic source.c --visualize-ast
# Profile compilation
./minic source.c --profile

# Run all tests
make test
# Run specific test categories
make test-unit
make test-integration
make test-benchmarks
# Run tests with valgrind
make test-memory
# Generate test coverage report
make coverage

- Unit Tests: Test individual modules in isolation
- Integration Tests: Test complete compilation pipelines
- Regression Tests: Ensure previously fixed bugs stay fixed
- Performance Tests: Benchmark compiler performance
- Fuzz Tests: Randomized input testing
# Create a new test
cp tests/template.c tests/unit/test_new_feature.c
# Implement test, then run:
make test-unit

We welcome contributions! Here's how to get started:
- Fork the repository
- Create a feature branch: `git checkout -b feature/amazing-feature`
- Make your changes following the code style
- Add tests for new functionality
- Run the test suite: `make test`
- Commit your changes: `git commit -m 'Add amazing feature'`
- Push to the branch: `git push origin feature/amazing-feature`
- Open a Pull Request
- Language Features: Add new C constructs
- Optimizations: Improve generated code quality
- Error Messages: Better diagnostics
- Documentation: Improve guides and examples
- Tools: Developer utilities and debugging aids
- Automated CI checks run on PR
- Maintainers review for correctness and style
- Changes requested if needed
- Once approved, PR is merged
This project is licensed under the MIT License - see the LICENSE file for details.
MIT License
Copyright (c) 2023 Mini-C Compiler Contributors
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
- Dragon Book: Compilers: Principles, Techniques, and Tools
- Crafting Interpreters: Robert Nystrom's excellent book
- LLVM Documentation: Inspiration for modern compiler design
- NASM Manual: x86-64 assembly reference
- GCC: Compiler toolchain
- NASM: Assembler
- Make: Build automation
- Valgrind: Memory debugging
- GDB: Debugger
Thanks to all the contributors who have helped make this compiler better!
| Component | Status | Notes |
|---|---|---|
| Lexer | ✅ Complete | Full C tokenization |
| Parser | ✅ Complete | Recursive descent |
| AST | ✅ Complete | Visitor pattern |
| Semantic | ✅ Complete | Type checking |
| CodeGen | ✅ Complete | x86-64 assembly |
| Optimizer | 🔄 In Progress | Basic optimizations |
| Standard Library | 🔄 In Progress | Minimal implementation |
✅ = Complete | 🔄 = In Progress | 📋 = Planned
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: solomonmulu000@gmail.com
"The only way to learn a new programming language is by writing programs in it." - Dennis Ritchie
If you find this project useful, please consider giving it a ⭐