Understanding the volatile Keyword and Compiler Optimizations
In C++, writing a loop that counts from 1 to a billion feels like a sure way to consume time and CPU cycles. But what if your loop - even with all its iterations - doesn’t run at all? Or what if your carefully written benchmark gets replaced by a single line of code?
Welcome to the world of modern compiler optimizations, where the compiler often knows better than you - and isn't afraid to prove it.
This article introduces a critical concept in C++: the volatile keyword, and why your code may not behave the way you think it will - especially when performance is involved.
The Problem: A Loop That “Should” Take Time
Let’s take a look at this innocent-looking C++ code:
#include <iostream>
#include <chrono>

int main() {
    const long long iterations = 1'000'000'000;
    long long sink = 0;

    auto start = std::chrono::high_resolution_clock::now();
    for (long long i = 0; i < iterations; ++i) {
        sink += 1;
    }
    auto end = std::chrono::high_resolution_clock::now();

    long long nanoseconds =
        std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();

    std::cout << "Result: " << sink << std::endl;
    std::cout << "Time: " << nanoseconds << " nanoseconds" << std::endl;
    return 0;
}
This code should, at a glance, take a few hundred milliseconds, maybe even a second, to run a billion additions.
And yet, when compiled in Release mode, you might see output like this:
Result: 1000000000
Time: 150 nanoseconds
Wait - what?
What Happened: The Compiler Outsmarted You
The compiler looked at this loop and thought:
“You're just adding 1 a billion times? I know how this ends.”
And instead of generating machine code for a billion iterations, it optimized the loop into this:
sink = 1'000'000'000;
No loop. No wasted CPU cycles. Just the final result, instantly.
The output is correct - sink is 1,000,000,000 - but the loop never actually ran.
This is the combined effect of two standard techniques in modern compilers like MSVC, Clang, and GCC: induction variable analysis, which lets the compiler compute the loop's closed-form result, and dead code elimination, which then removes the now-pointless loop entirely.
Introducing volatile: Your Code’s Bodyguard
To stop the compiler from doing this, we can use the volatile keyword:
volatile long long sink = 0;
This keyword tells the compiler:
“Do not optimize reads or writes to this variable. Every access is important, even if it seems pointless to you.”
With volatile, the compiler is forced to generate real instructions for every addition, and now the loop takes real time to run. You might see:
Result: 1000000000
Time: 450000000 nanoseconds
That’s more like it.
What Exactly Does volatile Do?
In C++, volatile is a type qualifier that instructs the compiler:
- Do not cache this variable in a register.
- Do not remove or merge reads/writes.
- Assume this variable might be changed by something outside your current code.
It’s commonly used in embedded systems and multithreaded contexts where variables might change due to hardware or background processes.
In our case, it's used as a benchmarking hack - we don’t want the compiler to optimize our code away while we’re trying to measure it.
When Not to Use volatile
It’s tempting to use volatile everywhere once you discover it, but resist that urge.
- It does not make your code thread-safe.
- It can reduce performance by disabling useful optimizations.
- It should be reserved for rare, specific scenarios like memory-mapped hardware or precise timing measurements.
Use it sparingly, and only when you truly need to prevent the compiler from reordering or eliminating reads/writes.
Key Takeaways
- Modern C++ compilers are extremely aggressive about optimizing your code - even to the point of deleting entire loops.
- The keyword volatile tells the compiler: “Hands off. Don’t optimize this.”
- If you're benchmarking or dealing with low-level systems, knowing when and how to use volatile is essential.
- Don't assume your code is running just because you wrote it - always measure, and always verify.
“The best performance improvement is the one your compiler gives you for free - unless it gives you zero.”
— A C++ developer, watching their loop vanish into thin air
If you’re new to low-level benchmarking, you might also like this article on build modes.