Debugging a Time-Travel Debugger is a lot like playing Wordle!

One of the great ironies of building a Time-Travel Debugger is that when it fails, the best way to figure out why is with an elaborate game of Wordle.

It’s painful to type these words, so let me explain how we record and why diagnosing issues feels like an elaborate game of wordle on a massive two dimensional board.

How Replay records and replays

The key insight behind Replay is that most software is deterministic. This means that given the same inputs, the program with return the same outputs. You do not need to record a fibonacci program to be able to replay it.

The primary thing that Replay records are library calls. Take this helloWorld example which reads the first line of a file.

plain text
#include <stdio.h>int main(int argc, char** argv) {
  FILE* fp = fopen(“hello.txt”, “r”);  char buf[256];
  fgets(buf, sizeof(buf), fp);
  puts(buf);  fclose(fp);
  return 0;
}

When Replay records helloWorld the only thing it needs to capture are the calls to fopen and fgets. Everything else is deterministic!

Where Wordle comes in

Okay, so helloWorld was fine. Once you capture the calls to fopen and fgets you can replay the program a million times and everything will play the same.

The problem is that runtimes are not single threaded. They are heavily multi-threaded. Take this example which creates several threads that fill up a buffer and then prints that buffer.

plain text
#include <pthread.h>
#include <stdio.h>pthread_mutex_t mutex;
char buffer[51];
int bufferIndex;void* thread_main(void* ptr) {
  for (int i = 0; i < 5; i++) {
    pthread_mutex_lock(&mutex);
    buffer[bufferIndex++] = ‘a’ + (int)ptr;
    pthread_mutex_unlock(&mutex);
  }
  return NULL;
}int main(int argc, char** argv) {
  for (int i = 0; i < 10; i++) {
    pthread_t thr;
    pthread_create(&thr, NULL, thread_main, (void*)i);
  }
  pthread_mutex_lock(&mutex);
  puts(buffer);
  pthread_mutex_unlock(&mutex);
  return 0;
}

Compiling and running this program repeatedly prints different strings. Uh oh!

plain text
aaaaabbbbbcccccdddddfffffeeeeehhhhhgggggiiiii
aaaaabbbbbccccceeeeedddddfffffhhhhhgggggiiiiijjjjj
abdcccecbeebbaaaaddddfffffggggghhhhhiiiiijjjjj
abcaaaabbbbccccdddddeeeeehhhhhgggggfffffiiiii
aaaaabbbbbcccccdddddfeefefggggghhhhhiiiii

It turns out that inter-thread non-determinism is just as important as intra-thread non-determinism. The primary way we record inter-thread non-determism is by ordering some of the locks which works like any other lock or mutex, except that when replaying it will be acquired by different threads in the same order that it was acquired when recording.

This is great, except it that in practice if we record all of the locks it would bloat the recordings. It also turns out that most of the locks are not needed to maintain the same behavior.

This is where Wordle comes in. When a recording diverges while replaying the question is what unordered lock is actually important for preserving the original behavior.

This is a search game where the search space is the runtime itself (Chromium, Firefox, Node) and our clues are the recording and replaying stack traces. So where do we start? The first step is to add an assert (1, 2, 3). The next step is to wait for one of the asserts to hit. This narrows the search space down. Sometimes that is enough to find the right lock.

Most of the time, we place new asserts and wait for the next hit. We’re guessing a word, not with the expecation that we’re right, but with the hope that we learn enough to be able to guess the answer next time!

If you’d like to learn more about how Replay Works, we wrote a great deep dive with more examples here. Also if you’re really into Wordle or the inne mechanics of runtimes and Time-Travel Debugging, say hello in replay.io/discord.

We believe in the future is replayable runtimes will be a given and ultimately that is a community effort!