Mini-C Compiler

License: MIT | Language: C | Platform: Linux | Educational

A modular compiler for a subset of C, written entirely in C, that implements a complete compilation pipeline from source code to x86-64 assembly. This educational project demonstrates compiler design with a clean architecture and supporting tooling.

✨ Features

🚀 Core Compiler Features

  • Complete Compilation Pipeline: Full implementation from lexing to code generation
  • x86-64 Assembly: Generates x86-64 assembly for Linux (optimization passes are still in progress; see Project Status)
  • Modular Design: Clean separation of concerns with reusable components
  • Self-Hosting: Compiler can compile its own source code
  • Error Recovery: Robust error handling and reporting

📚 Language Support

  • Variables and Types: int type with type checking
  • Control Flow: if/else, while, for loops
  • Functions: Multiple functions with parameters
  • Expressions: Full arithmetic, relational, and logical operators
  • Arrays: One-dimensional arrays
  • Pointers: Basic pointer operations
  • Structures: Simple struct support

🔧 Development Features

  • Comprehensive Test Suite: Unit tests for all modules
  • Build System: GNU Make with multiple build configurations
  • Development Tools: Debug utilities and visualizers
  • Documentation: API documentation and usage examples
  • Performance Profiling: Built-in profiling support

🏗️ Architecture

The compiler follows a classic multi-pass architecture:

Source Code → Lexer → Tokens → Parser → AST → Semantic Analyzer → IR → Code Generator → Assembly

Key Components:

  1. Interfaces (interfaces/): Abstract interfaces for compiler phases
  2. Modules (modules/): Core implementation of compiler stages
  3. Utils (utils/): Supporting utilities and data structures
  4. Tests (tests/): Comprehensive test suite

📦 Installation

Prerequisites

# Required tools
sudo apt-get update
sudo apt-get install build-essential gcc nasm git make

Building from Source

# Clone the repository
git clone https://github.com/solomonkassa/Mini-C-Compiler.git
cd Mini-C-Compiler

# Build the compiler
make

# Run tests to verify installation
make test

# Build with optimizations
make release

Quick Start

# One-line build and smoke test
make && ./minic examples/hello.c

🚀 Usage

Basic Compilation

# Compile a C source file
./minic source.c

# Specify output file
./minic source.c -o output.s

# Compile and assemble directly
./minic source.c --executable program

# Show AST (for debugging)
./minic source.c --show-ast

Advanced Options

# Enable optimizations
./minic source.c -O2

# Generate intermediate representation
./minic source.c --emit-ir

# Show symbol table
./minic source.c --show-symbols

# Verbose compilation output
./minic source.c -v

📁 Project Structure

Mini-C-Compiler/
├── interfaces/           # Abstract interfaces
│   ├── lexer.h          # Lexer interface
│   ├── parser.h         # Parser interface
│   ├── codegen.h        # Code generator interface
│   └── ast.h           # AST node interfaces
├── modules/             # Implementation modules
│   ├── lexer.c          # Tokenizer implementation
│   ├── parser.c         # Recursive descent parser
│   ├── ast.c           # AST construction and manipulation
│   ├── semantic.c       # Semantic analysis
│   ├── codegen.c        # x86-64 code generation
│   └── optimizer.c      # Code optimization passes
├── tests/               # Test suite
│   ├── unit/           # Unit tests
│   ├── integration/     # Integration tests
│   ├── benchmarks/      # Performance tests
│   └── test_runner.c   # Test runner
├── utils/              # Utility modules
│   ├── hashmap.c       # Hash table implementation
│   ├── vector.c        # Dynamic array
│   ├── stringbuf.c     # String buffer
│   ├── error.c         # Error reporting
│   └── debug.c         # Debug utilities
├── examples/           # Example programs
│   ├── hello.c        # Hello world
│   ├── fibonacci.c    # Fibonacci sequence
│   ├── calculator.c   # Expression calculator
│   └── structs.c      # Structure example
├── docs/               # Documentation
│   ├── API.md         # API reference
│   ├── ARCHITECTURE.md # Design documentation
│   └── CONTRIBUTING.md # Contribution guide
├── Makefile           # Build system
├── minic.c           # Main compiler driver
└── README.md         # This file

📚 Modules Overview

Interfaces (interfaces/)

  • lexer.h: Tokenization interface with Unicode support
  • parser.h: Parser interface with error recovery
  • ast.h: Abstract Syntax Tree node definitions and visitors
  • codegen.h: Backend interface for multiple targets

Core Modules (modules/)

  • Lexer Module: Converts source code to tokens with location tracking
  • Parser Module: Recursive descent parser with precedence climbing
  • AST Module: Immutable AST with visitor pattern
  • Semantic Module: Type checking and symbol resolution
  • CodeGen Module: x86-64 assembly generator with register allocation
  • Optimizer Module: Constant folding, dead code elimination

Utilities (utils/)

  • Memory Management: Arena allocator for fast allocation
  • Collections: Generic vector, hashmap, and stack
  • String Handling: Safe string operations and formatting
  • Error System: Rich error messages with source locations
  • Debug Tools: AST visualizer and IR dumper

💡 Examples

Example 1: Hello World

// examples/hello.c
#include "minic_std.h"

int main() {
    print_string("Hello, World!\n");
    return 0;
}

Example 2: Fibonacci Sequence

// examples/fibonacci.c
int fibonacci(int n) {
    if (n <= 1) return n;
    return fibonacci(n-1) + fibonacci(n-2);
}

int main() {
    return fibonacci(10);  // Returns 55
}

Example 3: Array Operations

// examples/array.c
int sum_array(int arr[], int size) {
    int total = 0;
    for (int i = 0; i < size; i++) {
        total += arr[i];
    }
    return total;
}

int main() {
    int numbers[5] = {1, 2, 3, 4, 5};
    return sum_array(numbers, 5);  // Returns 15
}

🔬 Development

Building for Development

# Debug build with symbols
make debug

# Release build with optimizations
make release

# Build with sanitizers
make sanitize

# Generate documentation
make docs

Code Style

The project follows a consistent coding style:

  • ISO C99 with GNU extensions
  • 4-space indentation
  • K&R brace style
  • Descriptive variable names
  • Doxygen-style comments

Debugging

# Run with debug output
./minic source.c --debug

# Generate AST visualization (requires Graphviz)
./minic source.c --visualize-ast

# Profile compilation
./minic source.c --profile

🧪 Testing

Running Tests

# Run all tests
make test

# Run specific test categories
make test-unit
make test-integration
make test-benchmarks

# Run tests with valgrind
make test-memory

# Generate test coverage report
make coverage

Test Structure

  • Unit Tests: Test individual modules in isolation
  • Integration Tests: Test complete compilation pipelines
  • Regression Tests: Ensure previously fixed bugs stay fixed
  • Performance Tests: Benchmark compiler performance
  • Fuzz Tests: Randomized input testing

Adding Tests

# Create a new test
cp tests/template.c tests/unit/test_new_feature.c
# Implement test, then run:
make test-unit

🤝 Contributing

We welcome contributions! Here's how to get started:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Make your changes following the code style
  4. Add tests for new functionality
  5. Run the test suite: make test
  6. Commit your changes: git commit -m 'Add amazing feature'
  7. Push to the branch: git push origin feature/amazing-feature
  8. Open a Pull Request

Contribution Areas

  • Language Features: Add new C constructs
  • Optimizations: Improve generated code quality
  • Error Messages: Better diagnostics
  • Documentation: Improve guides and examples
  • Tools: Developer utilities and debugging aids

Code Review Process

  1. Automated CI checks run on PR
  2. Maintainers review for correctness and style
  3. Changes requested if needed
  4. Once approved, PR is merged

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

MIT License

Copyright (c) 2023 Mini-C Compiler Contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

🙏 Acknowledgments

Educational Resources

  • Dragon Book: Compilers: Principles, Techniques, and Tools
  • Crafting Interpreters: Robert Nystrom's excellent book
  • LLVM Documentation: Inspiration for modern compiler design
  • NASM Manual: x86-64 assembly reference

Tools Used

  • GCC: Compiler toolchain
  • NASM: Assembler
  • Make: Build automation
  • Valgrind: Memory debugging
  • GDB: Debugger

Contributors

Thanks to all the contributors who have helped make this compiler better!


📊 Project Status

Component        | Status         | Notes
Lexer            | ✅ Complete    | Full C tokenization
Parser           | ✅ Complete    | Recursive descent
AST              | ✅ Complete    | Visitor pattern
Semantic         | ✅ Complete    | Type checking
CodeGen          | ✅ Complete    | x86-64 assembly
Optimizer        | 🔄 In Progress | Basic optimizations
Standard Library | 🔄 In Progress | Minimal implementation

✅ = Complete | 🔄 = In Progress | 📋 = Planned

"The only way to learn a new programming language is by writing programs in it." - Dennis Ritchie


Built with ❤️ by Solomon Kassa
If you find this project useful, please consider giving it a ⭐
