Fix Docker Container Exits with Code 137 (OOMKilled: true)

The Error

Your container stops unexpectedly. Run docker ps and the container is gone; docker ps -a still lists it as exited. Inspecting it reveals:

Container exited with code 137 (OOMKilled: true)

Exit code 137 means the process received signal 9 (SIGKILL): 128 + 9 = 137. The OOMKilled: true flag tells you the kill was not an ordinary crash or a manual docker kill. The Linux kernel's OOM killer deliberately ended a process in your container because memory ran out.
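The 128 + N convention can be checked without Docker. The sketch below uses a hypothetical helper name, signal_from_exit_code, and has a child shell send itself SIGKILL so the parent sees the same status a killed container reports:

```shell
# Hypothetical helper: map an exit status above 128 back to a signal number.
signal_from_exit_code() {
  code=$1
  if [ "$code" -gt 128 ]; then
    echo $((code - 128))
  else
    echo "none"
  fi
}

# Reproduce the status locally: a child shell sends itself SIGKILL.
status=0
sh -c 'kill -9 $$' 2>/dev/null || status=$?
echo "exit status: $status, signal: $(signal_from_exit_code "$status")"
```

A status of 137 alone does not prove an OOM kill, since any SIGKILL produces it. That is why the OOMKilled flag matters.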

Why This Happens

The Linux kernel tracks memory usage per cgroup (control group). When a container exceeds its memory limit — or when the host itself runs out of memory — the OOM killer picks a process to terminate. Containers are frequent targets.

Two scenarios trigger this:

Container hit a hard memory limit: You set --memory and the container exceeded it.
Host ran out of memory: No limit was set. The container kept growing until the host had nothing left to give.

Either way, the path forward is the same: confirm the kill, check usage, set or raise limits, and hunt down what's eating memory.

Step 1: Confirm It Was an OOMKill

Inspect the container state directly:

docker inspect <container_name_or_id> --format='{{.State.OOMKilled}}'

Returns true? The OOM killer was responsible. Now confirm the exit code:

docker inspect <container_name_or_id> --format='{{.State.ExitCode}}'

Should return 137. Check kernel logs on the host to see exactly which process was targeted:

dmesg | grep -iE 'oom|killed'
# or on systemd systems:
journalctl -k | grep -i oom

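Look for a line like: Out of memory: Kill process 1234 (node) score 900 or sacrifice child. Newer kernels word it as Killed process, and a kill triggered by a container's own limit is prefixed with Memory cgroup out of memory. The PID and process name tell you which process was hit, and the OOM score shows why the kernel chose it.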
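On a busy host the kernel log can be long, so it may help to reduce it to the victims only. A minimal sketch, assuming the message wording shown above; oom_victims is a hypothetical helper name and the sample line is the one from the text:

```shell
# Hypothetical helper: read kernel log lines on stdin and print "PID NAME"
# for each OOM kill. Matches both "Kill process" and "Killed process" wording.
oom_victims() {
  sed -nE 's/.*Kill(ed)? process ([0-9]+) \(([^)]*)\).*/\2 \3/p'
}

# Sample line from the text; on a real host, pipe in dmesg or journalctl -k.
echo 'Out of memory: Kill process 1234 (node) score 900 or sacrifice child' | oom_victims
```

On a real host this would be used as dmesg | oom_victims. The PID is a host PID, so it will not match what top shows inside the container.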

Step 2: Check Current Memory Usage

See how much memory your running containers are using:

docker stats --no-stream

This shows current usage versus each container's configured limit (or the host's total memory when no limit is set). Any container sitting above 80% of its limit is one traffic spike away from getting killed.
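The 80% check can be scripted rather than eyeballed. A small sketch, assuming input lines of the form "NAME PERCENT%"; flag_high_mem is a hypothetical helper name and the three sample containers are made up for illustration:

```shell
# Hypothetical helper: read "NAME PERCENT%" lines on stdin and print the names
# whose memory percentage is at or above the threshold (default 80).
flag_high_mem() {
  awk -v t="${1:-80}" '{ p = $2; sub(/%/, "", p); if (p + 0 >= t + 0) print $1 }'
}

# Illustrative input in the shape produced by:
#   docker stats --no-stream --format '{{.Name}} {{.MemPerc}}'
printf 'web 85.20%%\ndb 12.04%%\ncache 80.00%%\n' | flag_high_mem 80
```

With live data the call would be docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | flag_high_mem 80.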

Check what limit was set on the killed container:

docker inspect <container_name_or_id> --format='{{.HostConfig.Memory}}'

A value of 0 means no limit was set: the container could consume all available host memory until the kernel had no choice but to act. Any other value is the limit in bytes.
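Because the value is reported in bytes, a quick conversion makes it easier to compare with the flags you passed. A minimal sketch with a hypothetical helper name, describe_limit:

```shell
# Hypothetical helper: render a HostConfig.Memory value (bytes) for humans.
describe_limit() {
  bytes=$1
  if [ "$bytes" -eq 0 ]; then
    echo "no limit"
  else
    echo "$((bytes / 1048576)) MiB"
  fi
}

describe_limit 536870912   # 512 * 1024 * 1024, i.e. --memory="512m"
describe_limit 0
```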

Step 3: Set or Increase the Memory Limit

For containers started with docker run, add memory flags:

docker run -d \
  --memory="512m" \
  --memory-swap="512m" \
  your-image

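--memory-swap is the total of memory plus swap, not the swap amount on its own. Setting it equal to --memory therefore disables swap for the container. Set it higher to allow some swap, or omit it to default to twice the memory limit (that is, swap equal to --memory, provided the host has swap configured).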
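Since --memory-swap counts memory and swap together, the swap a container actually gets is the difference between the two flags. A small sketch of that arithmetic, working in MiB; swap_allowance is a hypothetical helper name:

```shell
# Hypothetical helper: swap (MiB) a container may use, given --memory and
# --memory-swap in MiB. Omit the second argument when --memory-swap was not set.
swap_allowance() {
  mem=$1
  memswap=${2:-unset}
  case "$memswap" in
    unset) echo "$mem" ;;             # default: swap equal to --memory
    -1)    echo "unlimited" ;;        # -1 removes the swap cap
    *)     echo $((memswap - mem)) ;; # total minus memory
  esac
}

swap_allowance 512 512    # 0, swap disabled
swap_allowance 512 1024   # 512
swap_allowance 512        # 512, if the host has swap configured
```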

For Compose files that use the deploy key (v3 syntax; honored by Swarm, by current docker compose releases, and by legacy docker-compose with the --compatibility flag):

services:
  app:
    image: your-image
    deploy:
      resources:
        limits:
          memory: 512M
        reservations:
          memory: 256M

For standard Docker Compose using service-level keys (v2 syntax, no Swarm):

services:
  app:
    image: your-image
    mem_limit: 512m
    memswap_limit: 512m

After updating, recreate the container:

docker compose up -d --force-recreate app

Step 4: Find What's Consuming Memory

Raising the limit buys time. It won't fix a memory leak. If usage keeps growing, the container will OOMKill again at the new, higher limit — you'll just wait longer between restarts.

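Check memory stats inside the running container. /proc/meminfo inside a container usually reports host-wide figures, so read the container's cgroup accounting instead (the first path is cgroup v2, the fallback is cgroup v1):

docker exec <container_name> sh -c 'cat /sys/fs/cgroup/memory.current 2>/dev/null || cat /sys/fs/cgroup/memory/memory.usage_in_bytes'
docker exec -it <container_name> top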
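To see usage next to the limit, the container's cgroup files can be read together. A minimal sketch, assuming the usual mount point /sys/fs/cgroup; cgroup_mem is a hypothetical helper name, and it would need to run inside the container (for example through docker exec) to report that container's numbers:

```shell
# Hypothetical helper: print "USED LIMIT" (bytes) from a cgroup directory.
# Tries the cgroup v2 file names first, then the cgroup v1 names.
# On cgroup v2 the limit reads "max" when no limit is set.
cgroup_mem() {
  root=${1:-/sys/fs/cgroup}
  if [ -r "$root/memory.current" ]; then
    echo "$(cat "$root/memory.current") $(cat "$root/memory.max")"
  elif [ -r "$root/memory/memory.usage_in_bytes" ]; then
    echo "$(cat "$root/memory/memory.usage_in_bytes") $(cat "$root/memory/memory.limit_in_bytes")"
  else
    echo "no memory cgroup files under $root" >&2
    return 1
  fi
}
```

The directory argument exists mainly so the function can be pointed at a test fixture; inside a container, calling cgroup_mem with no argument is the intended use.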

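For Java apps: older JVMs, and any JVM with container support turned off, size the heap from host RAM rather than the container limit. The default max heap is a quarter of physical memory, so on a 16GB host inside a 512MB container the JVM may target a 4GB max heap. When the running app pushes past 512MB, the kernel kills it. Fix this by setting heap size explicitly (JAVA_OPTS is a convention that only takes effect if the image's start script passes it to java; JAVA_TOOL_OPTIONS is read by the JVM itself):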

docker run -d \
  -e JAVA_OPTS="-Xmx256m -Xms128m" \
  your-java-image
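The mismatch is easy to put in numbers. A rough sketch, assuming the JVM's default of one quarter of physical memory for the max heap; default_heap_mib is a hypothetical helper name and the host and limit sizes are the ones from the example above:

```shell
# Hypothetical helper: default max heap (MiB) for a JVM that only sees host RAM,
# assuming the one-quarter-of-physical-memory default.
default_heap_mib() {
  echo $(($1 / 4))
}

host_mib=16384    # 16 GB host
limit_mib=512     # container memory limit
heap=$(default_heap_mib "$host_mib")
echo "default max heap: ${heap} MiB, container limit: ${limit_mib} MiB"
if [ "$heap" -gt "$limit_mib" ]; then
  echo "heap target exceeds the limit by $((heap - limit_mib)) MiB"
fi
```

The JVM does not allocate the full heap up front, which is why such a container can run for a while before being killed.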

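On Java 8u191+ or Java 10+, container support (-XX:+UseContainerSupport) is enabled by default, so the JVM reads cgroup limits directly. Instead of a fixed -Xmx, size the heap as a percentage of the container limit with -XX:MaxRAMPercentage: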

docker run -d \
  -e JAVA_OPTS="-XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0" \
  your-java-image
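To see what a percentage setting means for a given limit, the arithmetic can be sketched as below. heap_from_percentage is a hypothetical helper name, and it takes a whole-number percentage for simplicity:

```shell
# Hypothetical helper: max heap (MiB) implied by a memory limit (MiB) and a
# whole-number percentage, e.g. 75 for -XX:MaxRAMPercentage=75.0.
heap_from_percentage() {
  limit_mib=$1
  percent=$2
  echo $((limit_mib * percent / 100))
}

heap_from_percentage 512 75    # 384
heap_from_percentage 1024 75   # 768
```

With a 512 MiB limit, 75% leaves 128 MiB for everything outside the heap, such as metaspace, thread stacks, and native buffers. If the container is still killed at this setting, non-heap memory is a likely suspect.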

For Node.js apps: V8's default heap limit is not guaranteed to match the container limit, particularly on older Node.js versions. Cap the old-generation heap explicitly (the value is in MiB):

docker run -d your-node-image node --max-old-space-size=256 app.js

For general memory growth: take heap dumps at intervals under realistic load and compare them with your runtime's profiling tools. Objects whose count keeps rising from one dump to the next point to the leak.
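Before reaching for a profiler, a few usage samples can show whether memory is climbing at all. A rough sketch, assuming the samples are already collected in MiB, one per line, oldest first; mem_trend is a hypothetical helper name, the sample numbers are made up, and "every sample higher than the last" is only a crude heuristic:

```shell
# Hypothetical helper: read memory samples (MiB, one per line, oldest first)
# and report whether usage rose between every pair of consecutive samples.
mem_trend() {
  awk '
    NR == 1 { first = $1 }
    NR > 1 && $1 <= prev { flat = 1 }
    { prev = $1 }
    END {
      if (NR < 2) print "not enough samples"
      else if (!flat) print "growing: " first " -> " prev " MiB"
      else print "not strictly growing"
    }'
}

printf '210\n245\n290\n340\n' | mem_trend
printf '210\n260\n220\n230\n' | mem_trend
```

A steady climb under constant load suggests a leak. Usage that rises and falls is more likely normal allocation and garbage collection.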

Step 5: Add a Restart Policy While You Fix the Root Cause

This restarts the container after an OOMKill (at most five times with on-failure:5) while you work on the real fix:

docker run -d \
  --restart=on-failure:5 \
  --memory="512m" \
  your-image

Or in Compose:

services:
  app:
    image: your-image
    restart: on-failure
    mem_limit: 512m

This won't prevent OOMKills. It just keeps the service up while you track down the root cause.

Verify the Fix

Restart the container and watch its memory in real time:

docker stats <container_name>

Memory should stabilize under 70–80% of the configured limit. After 10–15 minutes under normal load, confirm the OOMKilled flag is cleared:

docker inspect <container_name> --format='{{.State.OOMKilled}}'
# Expected output: false

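Also check the container isn't quietly restarting in the background. docker ps has no restart-count field, so read it from docker inspect:

docker inspect <container_name> --format='{{.Name}} restarts={{.RestartCount}} status={{.State.Status}}'
# A rising RestartCount means it's still getting killed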

Tips to Avoid OOMKilled in Production

Always set memory limits. Unbounded containers on a shared host can take down everything running on it. Make limits a deploy requirement, not an afterthought.
Set reservations alongside limits in Compose so the Swarm scheduler knows the minimum resources needed before placing the container.
Alert before the limit is hit. Wire Docker metrics to Prometheus + cAdvisor and alert at 80% memory usage — so you're fixing the problem before the kernel does it for you.
Test with production limits locally. Run your container with the same --memory value you use in production, then load test it. Catch memory problems before they ship.
Watch log buffer sizes. Apps writing large volumes of logs to in-memory buffers can silently balloon memory usage. Check your logging driver configuration.
Consider swap carefully. Allowing swap (--memory-swap > --memory) can delay OOMKills but tanks performance under pressure. It's a band-aid, not a fix.