Header-only, fixed-size ring buffer in C++20 with:
- fast bitmask wrap-around (
Nmust be power of two) - STL-style iterators (forward/reverse, const/non-const)
- throwing APIs and non-throwing
try_*APIs - tests, micro-benchmarks, and Doxygen docs generation
The main implementations are in:
include/RingBuffer.hpp(default-constructible storage path)include/RingBufferNonDefaultConstructible.hpp(supports non-default-constructible types)
- Why Two Headers
- Strong points
- Requirements
- Build
- Compile on Linux / macOS / Windows
- Run tests
- Run benchmark
- Benchmark methodology
- Benchmark configuration
- Non-default benchmark configuration
- Benchmark environment
- Latest benchmark results
- Latest benchmark results (RingBufferNonDefaultConstructible)
- Generate documentation
- API overview
- Memory layout
- Throwing operations
- Non-throwing operations
- Iterators
- Minimal usage
- STL algorithm examples
- Project layout
The codebase intentionally keeps two implementations instead of a single unified abstraction:
include/RingBuffer.hpp: optimized path for default-constructible/trivial-friendly scenarios.include/RingBufferNonDefaultConstructible.hpp: compatibility path for non-default-constructible types.
This duplication is a deliberate performance choice. In benchmark and profiling experiments, the default-constructor-oriented implementation is consistently faster, and attempts to merge both code paths behind additional abstraction layers resulted in worse performance.
So the project favors explicit specialization over unification to keep the fast path as efficient as possible while still providing support for non-default-constructible element types.
- Header-only integration: no library build/link step, just include
RingBuffer.hpp. - Deterministic memory usage: fixed-size storage with no container-internal heap allocation.
- Fast wrap-around indexing: power-of-two storage enables bitmask-based index wrap.
- Predictable O(1) core ops: push/pop/front/back are constant-time operations.
- Dual error-handling style: throwing API and explicit non-throwing
try_*API. - STL-friendly iteration: bidirectional iterators, const iterators, and reverse iterators.
- Good for real-time-ish paths: bounded capacity and explicit behavior under full/empty conditions.
- Easy to benchmark and audit: small implementation surface and transparent semantics.
- C++20 compiler (GCC/Clang)
- CMake 3.31+
make- Optional:
doxygen(for docs)
makeEquivalent manual commands:
cmake -S . -B build
cmake --build buildcmake -S . -B build
cmake --build build -j
ctest --test-dir build --output-on-failurecmake -S . -B build
cmake --build build -j
ctest --test-dir build --output-on-failureUse generator-agnostic CMake commands (no make required):
cmake -S . -B build -G Ninja
cmake --build build
ctest --test-dir build --output-on-failureIf Ninja is not installed, omit -G Ninja and use your default Visual Studio generator.
To build and run benchmark executable:
cmake --build build --target bench_basic
./build/benchmarks/bench_basicmake testThis runs all test targets via CTest from tests/:
basic_tests.cppiterator_tests.cppnon_default_comparison_tests.cppstl_tests.cpp
make run-benchBenchmark source: benchmarks/basic.cpp.
Run non-default benchmark:
cmake --build build --target bench_non_default_compare
./build/benchmarks/bench_non_default_compareBenchmark source: benchmarks/non_default_compare.cpp.
The benchmark compares RingBuffer against:
std::deque(queue-like push_back/pop_front)std::vectorused as a stack (push_back/pop_back)
For each data type (int, std::string, LargeObject), it runs two push/pop modes plus one iteration test:
-
Discard pop mode
- Fill the container with
fill_countelements - Pop all elements without consuming their payload
- Measures structural push/pop overhead
- Fill the container with
-
Consume pop mode
- Fill the container with
fill_countelements - Pop each element and feed it to
consume_value(...) - Measures push/pop + payload move/copy/access cost
- Fill the container with
-
Iteration mode (RingBuffer only)
- Fill once, then iterate repeatedly over logical elements
- Measures iterator traversal cost
RingBuffer uses effective logical capacity N-1, so benchmark loops use buffer.max_size() for fair fill/drain counts across containers.
From benchmarks/basic.cpp:
BUFFER_SIZE = 1024- Effective ring
fill_count = buffer.max_size() = 1023 ROUNDS = 1'000'000for batch push/pop benchmarksLOOPS = 200'000for iterator benchmark- Clock source:
std::chrono::steady_clock - Data types tested:
intstd::stringLargeObject(int data[64])
The benchmark includes two push/pop modes:
- Discard pop: pops without payload consumption (container overhead focus)
- Consume pop: pops and passes values to
consume_value(...)to keep payload work observable
From benchmarks/non_default_compare.cpp:
BUFFER_SIZE = 1024- Effective ring
fill_count = buffer.max_size() = 1023 ROUNDS = 1'000'000for batch push/pop benchmarksLOOPS = 200'000for iterator benchmark- Clock source:
std::chrono::steady_clock - Data types tested:
NonDefaultBlob<16>NonDefaultBlob<64>NonDefaultBlob<256>
This benchmark compares RingBufferNonDefaultConstructible against:
std::deque(queue-like push_back/pop_front)std::vectorused as a stack (push_back/pop_back)
Measured on this machine:
| Item | Value |
|---|---|
| OS | Linux 6.14.0-37-generic x86_64 |
| CPU | 13th Gen Intel(R) Core(TM) i9-13900H |
| Cores / Threads | 14 cores, 20 threads |
| Max CPU freq | 5400 MHz |
| RAM | 30 GiB |
| Compiler | c++ (Ubuntu 14.2.0-19ubuntu2) 14.2.0 |
Because this is a micro-benchmark, results can vary with thermal/power state, background load, and CPU scaling.
| Mode | RingBuffer (s) | std::deque (s) | std::vector stack (s) |
|---|---|---|---|
| Batch discard pop | 0.434571 | 0.902812 | 0.281144 |
| Batch consume pop | 1.42992 | 1.10755 | 0.684198 |
| Iteration (RingBuffer only) | 0.0758931 | - | - |
| Mode | RingBuffer (s) | std::deque (s) | std::vector stack (s) |
|---|---|---|---|
| Batch discard pop | 1.98054 | 3.74455 | 1.71825 |
| Batch consume pop | 3.45806 | 5.32584 | 2.4731 |
| Iteration (RingBuffer only) | 0.0795547 | - | - |
| Mode | RingBuffer (s) | std::deque (s) | std::vector stack (s) |
|---|---|---|---|
| Batch discard pop | 6.76861 | 18.3302 | 9.25165 |
| Batch consume pop | 9.66389 | 21.1863 | 11.2808 |
| Iteration (RingBuffer only) | 0.0585418 | - | - |
| Mode | RingBufferNonDefaultConstructible (s) | std::deque (s) | std::vector stack (s) |
|---|---|---|---|
| Batch discard pop | 8.34699 | 8.93915 | 7.60881 |
| Batch consume pop | 8.9921 | 8.87878 | 7.58946 |
| Iteration (RingBuffer only) | 0.101687 | - | - |
| Mode | RingBufferNonDefaultConstructible (s) | std::deque (s) | std::vector stack (s) |
|---|---|---|---|
| Batch discard pop | 3.94942 | 6.86327 | 3.83647 |
| Batch consume pop | 4.59373 | 7.11657 | 3.26814 |
| Iteration (RingBuffer only) | 0.146314 | - | - |
| Mode | RingBufferNonDefaultConstructible (s) | std::deque (s) | std::vector stack (s) |
|---|---|---|---|
| Batch discard pop | 10.017 | 29.3278 | 9.32272 |
| Batch consume pop | 10.7661 | 29.9914 | 10.0828 |
| Iteration (RingBuffer only) | 0.147358 | - | - |
make docsWhat this does:
- Generates a
Doxyfile - Configures output to
doc/ - Runs Doxygen
Open generated HTML:
xdg-open doc/html/index.htmlTemplate:
RingBuffer<T, N>Constraints:
N > 0Nmust be power of two- effective logical capacity is
N - 1
Ring buffer storage is fixed and contiguous (size N).
Storage indices:
[0][1][2][3][4][5][6][7]
headpoints to the first logical element.tailpoints to the next insertion position.- Logical active range is
[head ... tail)(with wrap-around).
Storage:
[0][1][2][3][4][5][6][7]
^ ^
head tail
Logical elements: [1, 2, 3]
Storage:
[0][1][2][3][4][5][6][7]
^ ^
tail head
Logical elements: [6, 7, 0, 1]
In wrapped state, iteration still follows logical order starting at head and continuing circularly until tail.
push(...)throws if fullpop()throws if emptypop_discard()throws if emptyemplace_back(...)/emplace_front(...)throw if fullfront()/back()/peek()throw if emptyat(i)throws if out of range
try_push(...)try_pop(T&)try_emplace_back(...)try_pop_discard()try_front(...),try_back(...),try_peek(...)(pointer outputs)
These return bool to indicate success/failure.
Supported:
begin()/end()cbegin()/cend()rbegin()/rend()crbegin()/crend()
Iterator category is bidirectional.
#include "RingBuffer.hpp"
int main() {
RingBuffer<int, 8> rb; // power-of-two storage size
rb.push(1);
rb.push(2);
int value = rb.pop(); // 1
int out{};
if (rb.try_pop(out)) {
// out == 2
}
return 0;
}RingBuffer iterators are compatible with common STL algorithms.
#include <algorithm>
#include <numeric>
#include <vector>
#include "RingBuffer.hpp"
int main() {
RingBuffer<int, 8> rb;
rb.push(10);
rb.push(42);
rb.push(5);
rb.push(7);
// std::find
auto it = std::find(rb.begin(), rb.end(), 42);
if (it != rb.end()) {
// found 42
}
// std::accumulate
int sum = std::accumulate(rb.begin(), rb.end(), 0);
// std::sort on a copied range
std::vector<int> sorted(rb.begin(), rb.end());
std::sort(sorted.begin(), sorted.end());
return sum;
}include/
RingBuffer.hpp
RingBufferNonDefaultConstructible.hpp
tests/
basic_tests.cpp
iterator_tests.cpp
non_default_comparison_tests.cpp
stl_tests.cpp
benchmarks/
basic.cpp
non_default_compare.cpp