logo

Effective Determinism

profile photo
Brian Hackett
In the last post we discussed the sources of non-determinism which Replay’s recorder captures and controls for in order to replay the original recorded program’s behavior. If these are completely accounted for then the replayed program will behave in exactly the same manner that it did while recording. This isn’t what Replay’s recorder does, however. When replaying, the program’s behavior is allowed to vary in non-deterministic ways, as long as the behaviors we’re interested in — JavaScript state, the contents of the DOM, etc. — are the same as what happened while recording. Aiming for this idea of effective determinism, rather than complete determinism, is crucial for the recorder’s versatility and efficiency.
Consider the following program:
c++
#include <stdlib.h> #include <stdio.h>int main(int argc, char** argv) { fprintf(stderr, “Hello %d\n”, 42); return 0; }
When this program runs, it will always do the same thing, printing “Hello 42”. Now consider the following similar program:
c++
#include <stdlib.h> #include <stdio.h>int main(int argc, char** argv) { void* p = malloc(10); fprintf(stderr, “Hello %p\n”, p); return 0; }
When this program runs repeatedly, it prints out a variety of strings non-deterministically:
c++
Hello 0x7f9faf4057e0 Hello 0x7fc0124057e0 Hello 0x7fb506c057e0 Hello 0x7f86d2c057e0
This happens because malloc is non-deterministic: it can return different pointers in different runs, depending on the system’s state and the behavior of other threads. Now, malloc is defined by system libraries, and Replay’s recorder could treat the pointers it returns as non-deterministic inputs read from the environment, in the same way as e.g. data read from a file. Those pointers could be replayed and the exact same values would be printed when replaying.
Doing this isn’t necessary, however. Replay’s recorder is designed to work with runtimes for interpreted languages like JavaScript. These runtimes include a virtual machine for interpreting that language, along with various other components the language is used to interact with (like the rest of a web browser). When replaying we aren’t interested in the fine details of these components (what you would see with a C-level debugger like gdb) but rather higher level information about their state, such as that provided by a browser’s regular developer tools. This state has the same structure and contents regardless of the raw pointer values in the fine details. Likewise, the raw pointer values don’t have an effect on the behavior of the JavaScript — there is no way in JS and most other interpreted languages to determine the pointer associated with an object or other value.
If we allow pointer values to vary while replaying, we unlock some powerful benefits:
  • We can run analyses while replaying to get information about the runtime’s behavior, without having had to run those same analyses while originally recording. Running analyses generally requires reconfiguring the runtime in ways that affect the allocations it performs internally; if pointer values have to match while replaying, there can’t be any changes to the allocations the runtime makes. We use these analyses to gather the information needed to view a replay, such as the graphics a browser rendered and the points at which every JavaScript statement executes. We’ll be exploring this idea more in the next blog post.
  • We don’t need to bloat recordings with information about these pointer values, and we don’t need to incur overhead while keeping track of them. Most programs use malloc far more than other system interactions, so trying to record those pointers will produce an avalanche of information that can greatly increase the size of the recording.
  • Not having to record what malloc and related functions are doing keeps things simple.
  • Allowing pointer values to vary allows other parts of the runtime to behave non-deterministically as well, extending the benefits above. We’ll be discussing this more in the next section.
Allowing pointers to vary has drawbacks as well. State within the runtime occasionally depends on the raw pointer values, and changes to those values can affect the high level behavior and state which needs to be consistent with what happened while recording. This usually manifests when iterating through the contents of a hashtable: if the table’s keys depend on pointer values then they will be hashed differently while replaying, and the iteration order will differ as well. In cases like this we need to update the runtime to ensure the iteration order will be the same if the table’s contents are the same modulo pointer values.
Adding mitigations like this to the runtime isn’t needed often, and the benefits of allowing pointers to vary far outweigh the cost of having to identify places where these are needed.

Runtime Non-Determinism

The largest and most complicated component of an interpreted language runtime will usually be the virtual machine used for interpreting the language. Within that virtual machine, the most complicated sub-components will be the garbage collector (GC) and any just-in-time (JIT) compilers. These are both crucial for performance and have numerous optimizations to reduce GC overhead, avoid user-visible GC pauses, and speed up the generated machine code.
Fortunately, with few exceptions these do not have any effect on the behavior of the code being interpreted or on other parts of the runtime. The GC and JIT are not part of a language’s specification, and by design the interpreted code should do the same thing regardless of GC and JIT activity. When replaying that code, the GC and JIT are in turn free to behave differently from what happened while recording. We want to allow these differences in behavior, for the same reasons we want to allow pointer values to vary while replaying: this allows analyses to run while replaying that affect the GC or JIT behavior, improves recording efficiency, and keeps things simple.
In doing so, we do need to deal with exceptional behavior where the GC or JIT affect the code that’s running or the runtime’s state. For the JIT these are minimal: minor optimizations can do things like short-circuiting eval() calls in JavaScript so that the eval’ed code doesn’t run, and these optimizations can just be disabled. The GC is trickier, as a non-deterministic GC means that finalization of GC’ed objects happens at different points when recording vs. replaying. If those finalizers do complicated things like posting events to other threads, behavior can start to diverge enough from the recording that replaying can’t continue. We currently deal with these cases by simply leaking the associated objects (adding a reference to them so that they are never collected). This works fine for short recordings of a few minutes or less, but we plan to add mechanisms for running finalizers at deterministic points when recording vs. replaying, to avoid leaking memory and allow creating recordings of any length.
Related posts
We describe a new technique to simplify testing patches
We use Replay to help AI developers reliably make a frontend improvement
We share a demo of a prototype we’ve developed for debugging an application’s full stack.
Powered by Notaku