Fixing a Memory Leak in Go: Understanding time.After

Recently, we investigated why ArangoSync, our application for synchronizing two ArangoDB clusters across data centers, used a lot of memory: around 2GB in certain cases. The environment contained ~1500 shards and 5000 goroutines. Thanks to tools like pprof (which profiles CPU and memory usage), the issue was easy to identify. The Go profiler showed that memory allocated in the function `time.After()` had accumulated to nearly 1GB. That memory was not being released, so it was clear that we had a memory leak. Below, we explain how `time.After()` can cause memory leaks, using three examples.
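For readers who want to look at their own service the same way, here is a minimal sketch of capturing a heap profile with the standard `runtime/pprof` package. It is not taken from ArangoSync; the function names `writeHeapProfile` and `heapProfileBytes` and the in-memory buffer are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"runtime"
	"runtime/pprof"
)

// writeHeapProfile is a hypothetical helper that writes a heap profile to w.
func writeHeapProfile(w io.Writer) error {
	runtime.GC() // run a collection first so the profile reflects live memory
	return pprof.WriteHeapProfile(w)
}

// heapProfileBytes captures a profile in memory and returns its size in bytes.
// It returns -1 if the profile could not be written.
func heapProfileBytes() int {
	var buf bytes.Buffer
	if err := writeHeapProfile(&buf); err != nil {
		return -1
	}
	return buf.Len()
}

func main() {
	fmt.Println("heap profile size in bytes:", heapProfileBytes())
}
```

In a real service one would more likely write the profile to a file and inspect it with `go tool pprof`, which lists the functions responsible for the live allocations.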

Valid usage of the time.After() function

select {
  case <-time.After(time.Second):
     // do something after 1 second.
}

Nothing is wrong with the code above, because the `select` statement can only finish one way: the timer fires. Once it has fired, the timer created internally by `time.After()` is done and its resources can be freed.

Invalid usage of the time.After() function

It is very tempting to write the following code:

select {
  case <-time.After(time.Second):
     // do something after 1 second.
  case <-ctx.Done():
     // do something when context is finished.
     // the timer created by time.After() keeps running and is not
     // released until it fires.
}

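In the `select` statement above, if the timer fires first, everything works as in the first example. But if `ctx.Done()` is ready first, the timer created by `time.After()` is not stopped, and its resources are not released until it fires. When this happens often, or with long timeouts, the pending timers pile up and behave like a memory leak (see the documentation for `time.After`). This describes Go versions before 1.23; from Go 1.23 on, the garbage collector can reclaim an unreferenced timer even if it has not fired or been stopped.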
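To make the accumulation concrete, the following sketch runs that same `select` repeatedly with a context that is already cancelled and counts how many timers are left pending. The names `abandonTimers` and `abandonedInDemo`, the one-hour duration, and the loop itself are assumptions for illustration; the original code may look quite different.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// abandonTimers is a hypothetical function that runs the select from the text
// n times with a context that is already cancelled. It returns how many times
// the select left via ctx.Done(), i.e. how many timers were left pending for d.
func abandonTimers(n int, d time.Duration) int {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // ctx.Done() is ready before any timer can fire

	abandoned := 0
	for i := 0; i < n; i++ {
		select {
		case <-time.After(d):
			// not expected here: d is far longer than the loop takes
		case <-ctx.Done():
			abandoned++ // the timer from time.After(d) was never stopped
		}
	}
	return abandoned
}

// abandonedInDemo fixes the duration to one hour for the demonstration.
func abandonedInDemo(n int) int {
	return abandonTimers(n, time.Hour)
}

func main() {
	fmt.Println("timers left pending:", abandonedInDemo(1000))
}
```

Every iteration allocates a fresh timer that nobody stops, so the number of pending timers grows with the number of early exits rather than with the number of goroutines.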

Improved usage of the time.After() function

In production code, one should use `time.NewTimer()` instead of `time.After()`, in the following way:

  delay := time.NewTimer(time.Second)

  select {
  case <-delay.C:
     // do something after one second.
  case <-ctx.Done():
     // do something when context is finished and stop the timer.
     if !delay.Stop() {
        // Stop returns false if the timer has already expired (or been
        // stopped); drain the channel so no stale value is left behind.
        <-delay.C
     }
  }

Here, a new timer is created explicitly. If it fires first, the resources created by `time.NewTimer()` are released. If `ctx.Done()` is ready first, the resources are released by calling `delay.Stop()`. It can happen that `ctx.Done()` becomes ready and the timer expires immediately afterwards, before `Stop()` is called. In that case `Stop()` returns false, which is why there is an additional check that drains the timer's channel.
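One way to package this pattern is a small helper that stops the timer with `defer`. This is a sketch under stated assumptions, not the ArangoSync code: `waitOrDone`, `cancelledEarly` and `timerFires` are hypothetical names. Because the timer is local and never reused, the sketch relies on `Stop()` alone and skips draining the channel.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitOrDone is a hypothetical helper. It blocks until d has elapsed or ctx
// is done, whichever comes first, and returns ctx.Err() in the latter case.
func waitOrDone(ctx context.Context, d time.Duration) error {
	delay := time.NewTimer(d)
	// Stop releases the timer on every return path. The timer is not reused,
	// so a value left in delay.C is harmless and is not drained.
	defer delay.Stop()

	select {
	case <-delay.C:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// cancelledEarly reports whether waitOrDone returns context.Canceled when the
// context is already cancelled and the delay is long.
func cancelledEarly() bool {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return errors.Is(waitOrDone(ctx, time.Hour), context.Canceled)
}

// timerFires reports whether waitOrDone returns nil when the delay elapses
// before the context is done.
func timerFires() bool {
	return waitOrDone(context.Background(), 10*time.Millisecond) == nil
}

func main() {
	fmt.Println("cancelled early:", cancelledEarly())
	fmt.Println("timer fires:", timerFires())
}
```

Callers can then replace each `select` on `time.After()` and `ctx.Done()` with a single call and branch on the returned error.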

I hope this finding is useful for others; it at least solved our problem immediately. Feel free to leave comments below or ping me on the ArangoDB Community Slack (@tomasz.arangodb).
