Notes on Implementing a Variable Time Step Main Game Loop in Rust

For my small billiards game I implemented a variable time step simulation loop, so I'd like to share how it's done and a surprising bug about it.

First off, what is a "variable time step loop"? Normally, games that require physics simulations such as moving objects at their velocity do so by repeatedly running a simulation code in a loop that is triggered every T seconds. By "seconds" I mean milliseconds. For example, a game running at 50 FPS has T = 1/50 = 0.02. This is also called the "delta" time. Delta is a the Greek letter Δ, and it usually refers to a difference between two points. In this case, delta is the difference in time from the start of the previous loop to the start of current loop, which translates into "how much time has passed since we last applied the velocity to the objects." If a simulation is running consistently at 50 frames per second, or 0.02 seconds each loop, then that is a fixed time step loop. A variable time step loop calculates how much time it should spend on an iteration for each loop. This means that one iteration of the loop can take several milliseconds while the other can take a fraction of millisecond.

Of course, we are talking about how much time passes "inside" the game, inside the simulation. It's possible that the calculated delta says the simulation should run for 1 microsecond, and doing the simulation in the CPU actually takes longer than 1 microsecond. That would be kind of a problem. However, for most games, doing the physics simulations doesn't actually take a long time compared to rendering the game on the screen.

Anyway, the first thing we need is to know how much time we want to simulate.

pub fn simulate(&mut self, time_to_simulate: f64) {
    // simulation goes here
}

Since I'm making the game in Javascript, what I did was use requestAnimationFrame that triggers every time we need to refresh the screen. The requestAnimationFrame API takes a callback which is called with how much time has elapsed since the last frame. What is important to note is that this can be an exceedingly long time. This would be true even if we were doing the whole thing in Rust with SDL, or in Python.

For example, assuming that requestAnimationFrame only triggers if the tab is active, if the user goes to a different tab, or minimizes the window, it can take several minutes for the next requestAnimationFrame to trigger. On the other hand, if we were doing this all in SDL, it's possible that it takes longer than 0.02 seconds to simulate 0.02 seconds. In which case the time from one frame to the next is going to keep increasing indefinitely, e.g. we start by simulating 0.02 seconds, that takes 0.04 seconds, so the time between two frames next iteration will be 0.04, in which case we simulate 0.04 seconds, and that takes 0.08 seconds to simulate, so next iteration we calculate that it has been 0.08 seconds since the last simulation, so we need to simulate for 0.08 seconds, which will take 0.16 seconds to finish, ad infinitum.

This means that in order to call our simulate function it's important to cap time_to_simulate before calling it. In my case, I'm not capping it inside the function, because for my use case I actually need to run it for several seconds sometimes. I cap it to 1/60 seconds before calling the function if I just want to computer the current frame.

let mut time_elapsed = 0.0;
while time_elapsed < time_to_simulate {
    let max_step_time = self.calculate_max_step_time();
    let time_left = time_to_simulate - time_elapsed;

    if time_elapsed + time_left <= time_elapsed {
        break;
    }

    let step_delta = time_left.min(max_step_time);
    self.simulate_iteration(step_delta);
    time_elapsed += step_delta;
}

Above is the main code of the simulate function. Let's understand what it does step by step.

let mut time_elapsed = 0.0;
while time_elapsed < time_to_simulate {
    let step_delta = /* something */;
    self.simulate_iteration(step_delta);    
    time_elapsed += step_delta;
}

In this code, we start counting up from 0.0 to time_to_simulate. We could also count down from time_to_simulate to zero, but I thought counting up would be slightly more accurate since doing several operations on the same floating point variable will accumulate error.

The code that actually performs the simulation is in simulate_iteration. We don't need to worry about that since we're only learning about the main loop. What it contains is moving objects around and doing things like applying friction to slow objects down, or gravity to speed them up.

let max_step_time = self.calculate_max_step_time(0.25);
let time_left = time_to_simulate - time_elapsed;
let step_delta = time_left.min(max_step_time);

This part is what sets the variable time step inside the loop. max_step_time tells us what is the maximum amount of time that should be elapsed in one iteration. In other words, this is the function that calculates the variable step time. time_left is just how much time we have left to simulate. If time_left is smaller than max_step_time, we simply simulate the whole time_left, otherwise it's capped to max_step_time. In case you don't know, the min function in Rust (and other programming languages) gives you the smallest of two numbers.

For example, if we are simulating 1 second, and max_step_time is consistently 0.3, the loop will run for 4 iterations with 0.3, 0.3, 0.3, and 0.1 deltas.

It's worth noting that if you want to use this sort of code on your project, you'll probably want a way to cap the minimum delta. That's because if calculate_max_step_time returns an extremely low value thousands of times, you will end up having to perform too many iterations. In my case, that's not a problem because calculate_max_step_time is based on the velocity of the balls, and the velocity is decreased on every iteration due to friction.

Before I explain calculate_max_step_time, let's take a look at this weird thing:

if time_elapsed + time_left <= time_elapsed {
    break;
}

If you have understood what the loop is doing, then this should make absolutely no sense to you. After all, let's follow the logic:

while time_elapsed < time_to_simulate, so time_elapsed has to be smaller than time_to_simulate.
let time_left = time_to_simulate - time_elapsed, since time_elapsed is smaller than time_to_simulate, time_left must be greater than zero.
time_elapsed + time_left <= time_elapsed, since time_left is greater than zero, there should be no universe in which adding it to time_elapsed gives you a number SMALLER than time_elapsed?!

It really doesn't make sense, but it actually happens, and it was the source of a bug in which, at the end of the simulation, when the velocities of the balls were slow, the game started lagging.

Credit where it's due: I didn't figure this out on my own. I actually just asked Claude about it and I was extremely skeptical about the solution and in awe it actually worked.

The reason for this "gotcha" is floating point arithmetic. The way floating point values work is that they hold a precise value for a given magnitude. The larger the magnitude, the worse the precision becomes. If you have a high-magnitude number (e.g. in seconds) and you add a low-magnitude number to it (e.g. in microseconds), it's possible that the number is non-zero, but it's smaller than the distance between one number and the next representable in the target magnitude, so that number won't change.

For example, imagine if the floating point value could represent 1.0, 1.1, 1.2, etc. but not 1.01. However, if the magnitude was lower, it would be able to represent 0.11, 0.12, 0.13, etc., but not 0.101. In this scenario, if we had a number at very low magnitude like 0.005 and we added it to 1.0, the number 1.0 wouldn't change, because the next number it can "become" is 1.1. Of course, the precision of actual floating point numbers is much better than this, but fundamentally the concept is the same.

pub fn calculate_max_step_time(&self, max_movement_per_step: f64) -> f64 {
    let greatest_speed_radius_ratio = self.get_greatest_speed_radius_ratio();
    let oversteps = greatest_speed_radius_ratio / max_movement_per_step;

    if oversteps <= 1.0 {
        1.0
    } else {
        1.0 / oversteps
    }
}

pub fn get_greatest_speed_radius_ratio(&self) -> f64 {
    let greatest_speed_radius_ratio_sq = self.balls
        .iter()
        .map(|b| {
            b.get_velocity().length_squared() / b.radius.powi(2)
        }).fold(0.0, Coord::max);
    
    greatest_speed_radius_ratio_sq.sqrt()
}

Finally this is the code I used to calculate the maximum amount of time for a single time step. It's very simple, since we are only dealing with circles moving around.

The idea is that, for a given iteration, we don't want any single ball to move more than 1/4th of its radius. That's because if a ball moves too fast, it will just phase through the other balls. More specifically, it's because I'm doing collision detection by just checking if a circle overlaps with another circle. There is probably some fancier, smarter way to do it that uses geometric math that I don't know about. Since I don't know about it, I can't program it, so I'm doing things the way I know how to do. And in the way I'm doing things, if things move too fast, they just kind of pass through walls.

In order to prevent a single ball from moving more than 1/4th of its radius in one iteration, we need to cap its velocity multiplied by the delta to 1/4th of its radius. The problem is that we can't do that just for a single ball, since that means we're just capping actual velocity of the ball. We need to scale down the velocity of the entire simulation just so that the fastest ball goes slow enough to avoid phasing through everything.

The speed of a ball as a single floating value is the length of its velocity vector. If we divide the length by the radius, we get the ratio of the velocity by the radius. For example, if the velocity vector is {x: 32.0, y: 0.0}, its length is 32.0. If the radius is 16.0, then the ratio is 32.0/16.0 = 2.0. That means if we applied the WHOLE velocity, we would move 2 whole radii. As you might guess, the velocity vector is "how much a ball moves in 1 second." In order to apply the WHOLE velocity, a WHOLE second needs to pass.

What this function is calculating, then, is the INVERSE of "how many seconds need to pass to for a ball to move a length equal to its radius." We can un-inverse this inverse with 1.0 / inverse. And then we know how many seconds need to pass for a ball to move a length equal to its radius.

That's what we are doing. But we also have a max_movement_per_step variable that lets us adjust this further. If the argument is 1.0, then the result of the function is just how many times the fastest ball will move its radius in 1 second. However, if the argument is 0.5, then the comparison is with "half" of its radius instead of the whole radius. This is called the number of "oversteps."

If the number of "oversteps" is smaller than 1, that means no ball has moved faster than we want. Returning 1.0 here essentially caps the delta to 1 second, but we could return a larger number than 1.0 if we wanted. If the number of oversteps is greater than 1.0, then we cap the delta in proportion, assuring the fastest ball can't move faster than we want in a single iteration.

But what if your project isn't just a bunch of circles? You can just use a different way to calculate the maximum speed. For example, if we want to ensure that nothing can move faster than N pixels, we would use this code instead:

pub fn get_greatest_speed_pixel_ratio(&self, pixels: f64) -> f64 {
    let greatest_speed_radius_ratio_sq = self.balls
        .iter()
        .map(|b| {
            b.get_velocity().length_squared() / pixels.powi(2)
        }).fold(0.0, Coord::max);
    
    greatest_speed_radius_ratio_sq.sqrt()
}