The Error
Your container stops unexpectedly. Run docker ps and the container is gone. Inspecting it reveals:
Container exited with code 137 (OOMKilled: true)
Exit code 137 means the process received signal 9 (SIGKILL). The OOMKilled: true flag tells you it wasn't a crash or a bug: the Linux kernel's OOM killer deliberately ended your container because it ran out of memory.
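The 137 follows the usual convention that a process killed by a signal reports 128 plus the signal number. A minimal sketch of that arithmetic; the helper name signal_from_exit_code is made up for illustration:

```shell
# Hypothetical helper: recover the signal number from a container exit code.
# Exit codes above 128 conventionally mean "killed by signal (code - 128)".
signal_from_exit_code() {
  code=$1
  if [ "$code" -gt 128 ]; then
    echo $((code - 128))
  else
    echo 0   # not a signal exit
  fi
}

signal_from_exit_code 137   # prints 9 (SIGKILL)
```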
Why This Happens
The Linux kernel tracks memory usage per cgroup (control group). When a container exceeds its memory limit, or when the host itself runs out of memory, the OOM killer picks a process to terminate. Containers are frequent targets.
Two scenarios trigger this:
- Container hit a hard memory limit: You set --memory and the container exceeded it.
- Host ran out of memory: No limit was set. The container kept growing until the host had nothing left to give.
Either way, the path forward is the same: confirm the kill, check usage, set or raise limits, and hunt down what's eating memory.
Step 1: Confirm It Was an OOMKill
Inspect the container state directly:
docker inspect <container_name_or_id> --format='{{.State.OOMKilled}}'
Returns true? The OOM killer was responsible. Now confirm the exit code:
docker inspect <container_name_or_id> --format='{{.State.ExitCode}}'
Should return 137. Check kernel logs on the host to see exactly which process was targeted:
dmesg | grep -i 'oom\|killed'
# or on systemd systems:
journalctl -k | grep -i oom
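Look for a line like: Out of memory: Kill process 1234 (node) score 900 or sacrifice child. Newer kernels print "Killed process" instead, and a kill triggered by a container limit is prefixed with "Memory cgroup out of memory". The process name and OOM score tell you which process was hit and why the kernel chose it.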
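If the kernel log is noisy, the victim's PID and process name can be pulled out with a small filter. This is only a sketch: oom_victim is a hypothetical helper, and the pattern assumes the log wording quoted above (with or without the "ed" in "Killed").

```shell
# Hypothetical helper: print "PID NAME" for each OOM-killer line read on stdin.
# Accepts both "Kill process" and "Killed process" wording.
oom_victim() {
  sed -n 's/.*Kill\(ed\)\{0,1\} process \([0-9][0-9]*\) (\([^)]*\)).*/\2 \3/p'
}

# Usage on the host (needs permission to read the kernel log):
#   dmesg | oom_victim
echo 'Out of memory: Kill process 1234 (node) score 900 or sacrifice child' | oom_victim
# prints: 1234 node
```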
Step 2: Check Current Memory Usage
See how much memory your running containers are using:
docker stats --no-stream
This shows current usage versus each container's configured limit. Any container sitting above 80% of its limit is one traffic spike away from getting killed.
Check what limit was set on the killed container:
docker inspect <container_name_or_id> --format='{{.HostConfig.Memory}}'
A value of 0 means no limit was set: the container could consume all available host memory until the kernel had no choice but to act.
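To spot containers near their limit without reading the table by eye, the stats output can be filtered. A rough sketch, assuming the "NAME PERCENT%" layout produced by the --format string in the usage comment; over_threshold is a hypothetical helper name:

```shell
# Hypothetical helper: read "NAME PERCENT%" lines on stdin and print the names
# whose memory percentage is at or above the threshold (default 80).
over_threshold() {
  awk -v limit="${1:-80}" '{ pct = $2; sub(/%/, "", pct); if (pct + 0 >= limit + 0) print $1 }'
}

# Usage against live containers:
#   docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | over_threshold 80
printf 'web 91.20%%\ndb 42.05%%\n' | over_threshold 80
# prints: web
```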
Step 3: Set or Increase the Memory Limit
For containers started with docker run, add memory flags:
docker run -d \
--memory="512m" \
--memory-swap="512m" \
your-image
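--memory-swap is the combined total of memory plus swap, so setting it equal to --memory disables swap for the container. Set it higher to allow some swap, or omit it and the container may use swap equal to --memory, for a total of twice the memory limit.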
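Because --memory-swap counts memory and swap together, the swap a container can actually use is the difference between the two flags. The sketch below spells that out in MB; swap_allowance_mb is a hypothetical helper, and the -1 case reflects Docker's "unlimited swap" setting. It assumes the host has swap configured at all.

```shell
# Hypothetical helper: swap (in MB) a container may use, given --memory and
# an optional --memory-swap, both in MB. --memory-swap is memory PLUS swap.
swap_allowance_mb() {
  mem=$1
  memswap=${2:-}
  if [ -z "$memswap" ]; then
    echo "$mem"                 # flag omitted: swap equal to --memory
  elif [ "$memswap" -eq -1 ]; then
    echo unlimited              # -1 means unlimited swap
  else
    echo $((memswap - mem))
  fi
}

swap_allowance_mb 512 512    # prints 0   (swap disabled)
swap_allowance_mb 512 1024   # prints 512
swap_allowance_mb 512        # prints 512 (default when the flag is omitted)
```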
For Docker Compose (v3 with deploy, used in Swarm or with the --compatibility flag):
services:
  app:
    image: your-image
    deploy:
      resources:
        limits:
          memory: 512M
        reservations:
          memory: 256M
For standard Docker Compose (v2 syntax, no Swarm):
services:
  app:
    image: your-image
    mem_limit: 512m
    memswap_limit: 512m
After updating, recreate the container:
docker compose up -d --force-recreate app
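After recreating, it is worth checking that the new limit actually landed. docker inspect reports it in bytes, so a small conversion makes it easier to compare with the 512m you configured. A sketch only; limit_mib is a hypothetical helper:

```shell
# Hypothetical helper: render the byte count from .HostConfig.Memory as MiB.
limit_mib() {
  bytes=$1
  if [ "$bytes" -eq 0 ]; then
    echo unlimited              # 0 means no limit was applied
  else
    echo "$((bytes / 1024 / 1024))MiB"
  fi
}

# Usage:
#   limit_mib "$(docker inspect <container_name_or_id> --format '{{.HostConfig.Memory}}')"
limit_mib 536870912   # prints 512MiB
```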
Step 4: Find What's Consuming Memory
Raising the limit buys time. It won't fix a memory leak. If usage keeps growing, the container will OOMKill again at the new, higher limit; you'll just wait longer between restarts.
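Check memory stats inside the running container. In most setups /proc/meminfo reports host-wide figures, so read the container's own cgroup counter instead:
# cgroup v2 path first, then the cgroup v1 fallback
docker exec <container_name> sh -c 'cat /sys/fs/cgroup/memory.current 2>/dev/null || cat /sys/fs/cgroup/memory/memory.usage_in_bytes'
docker exec -it <container_name> top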
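A single reading says little about a leak; what matters is whether usage keeps climbing. One rough way to check is to collect a few samples and test whether each is larger than the last. The sketch below assumes one byte count per line; memory_trend is a hypothetical helper, and the cgroup v2 path in the usage comment is an assumption about your host.

```shell
# Hypothetical helper: read one memory sample (bytes) per line on stdin and
# print "growing" if every sample is larger than the one before it.
memory_trend() {
  awk 'NR > 1 && $1 <= prev { flat = 1 }
       { prev = $1 }
       END { result = (NR > 1 && !flat) ? "growing" : "not growing"; print result }'
}

# Usage: five samples, one minute apart (cgroup v2 path assumed):
#   for i in 1 2 3 4 5; do
#     docker exec <container_name> cat /sys/fs/cgroup/memory.current; sleep 60
#   done | memory_trend
printf '100\n200\n300\n' | memory_trend   # prints growing
```

A strictly rising series over a short window is only a hint, since caches and warm-up also grow memory; treat it as a cue to start profiling.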
For Java apps: without container-aware settings, the JVM sizes its heap based on host RAM, not the container limit. On a 16GB host inside a 512MB container, the JVM may target a 4GB max heap. When the running app pushes past 512MB, the kernel kills it. Fix this by setting heap size explicitly:
docker run -d \
-e JAVA_OPTS="-Xmx256m -Xms128m" \
your-java-image
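On Java 8u191+ or Java 10+, container support (-XX:+UseContainerSupport) is enabled by default, so the JVM reads cgroup limits directly and sizes itself accordingly. Tune the share of the limit given to the heap with -XX:MaxRAMPercentage: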
docker run -d \
-e JAVA_OPTS="-XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0" \
your-java-image
For Node.js apps: V8 doesn't respect container memory limits either. Cap the heap explicitly:
docker run -d your-node-image node --max-old-space-size=256 app.js
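Rather than hard-coding 256, an entrypoint script could derive the value from the container's own limit. This is a sketch under stated assumptions: node_heap_from_limit is a hypothetical helper, the file is expected to hold a byte count or the word "max" (as cgroup v2's memory.max does), and giving the heap half of the limit is an illustrative choice, not a Node.js rule.

```shell
# Hypothetical helper: suggest a --max-old-space-size value (MB) from a cgroup
# limit file. Prints nothing when the file says "max" (no limit set).
node_heap_from_limit() {
  limit=$(cat "$1")
  if [ "$limit" != "max" ]; then
    echo $((limit / 1024 / 1024 / 2))   # assumption: half the limit for the heap
  fi
}

# Usage inside a container entrypoint (cgroup v2 path assumed):
#   heap=$(node_heap_from_limit /sys/fs/cgroup/memory.max)
#   exec node ${heap:+--max-old-space-size=$heap} app.js
```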
For general memory growth: take heap dumps and use your runtime's profiling tools to find the leak. The sooner you profile under realistic load, the faster you'll find it.
Step 5: Add a Restart Policy While You Fix the Root Cause
This keeps the container alive after an OOMKill while you work on the real fix:
docker run -d \
--restart=on-failure:5 \
--memory="512m" \
your-image
Or in Compose:
services:
  app:
    image: your-image
    restart: on-failure
    mem_limit: 512m
This won't prevent OOMKills. It just keeps the service up while you track down the root cause.
Verify the Fix
Restart the container and watch its memory in real time:
docker stats <container_name>
Memory should stabilize under 70-80% of the configured limit. After 10-15 minutes under normal load, confirm the OOMKilled flag is cleared:
docker inspect <container_name> --format='{{.State.OOMKilled}}'
# Expected output: false
Also check the container isn't quietly restarting in the background:
docker inspect <container_name> --format='{{.Name}} {{.State.Status}} restarts={{.RestartCount}}'
# A rising RestartCount means it's still getting killed
Tips to Avoid OOMKilled in Production
- Always set memory limits. Unbounded containers on a shared host can take down everything running on it. Make limits a deploy requirement, not an afterthought.
- Set reservations alongside limits in Compose so the scheduler knows the minimum resources needed before placing the container.
- Alert before the limit is hit. Wire Docker metrics to Prometheus + cAdvisor and alert at 80% memory usage, so you're fixing the problem before the kernel does it for you.
- Test with production limits locally. Run your container with the same --memory value you use in production, then load test it. Catch memory problems before they ship.
- Watch log buffer sizes. Apps writing large volumes of logs to in-memory buffers can silently balloon memory usage. Check your logging driver configuration.
- Consider swap carefully. Allowing swap (--memory-swap > --memory) can prevent OOMKills but tanks performance under pressure. It's a band-aid, not a fix.

