Fix 'service is unhealthy' Error in Docker Compose

intermediate | Docker | 2026-03-17 | Docker Desktop / Docker Engine 20.x+, Docker Compose v2.x, Linux/macOS/Windows

Error Message

service "myservice" is unhealthy
#docker #docker-compose #healthcheck #container

The Error

You run docker compose up and the whole thing grinds to a halt:

Error response from daemon: service "myservice" is unhealthy

Or a dependent container refuses to start because its upstream service never turned green. Either way, you're stuck.

What's happening: Docker ran the healthcheck command inside your container and it kept failing, either by returning a non-zero exit code or by timing out. After the configured number of consecutive failures (retries), Docker stamped the container unhealthy. Docker Engine does not stop an unhealthy container on its own; what breaks is everything that waits on it, such as any service that lists it under depends_on with condition: service_healthy.

Root Causes

  • The healthcheck command itself is wrong: wrong path, wrong tool, wrong port
  • The service takes longer to initialize than start_period allows
  • The service is genuinely broken inside the container (bad config, crash loop)
  • A dependency the healthcheck pings (a DB or external API) isn't ready yet
  • The healthcheck tool isn't installed in the image (e.g., curl or wget missing)

Step 1: See What Docker Is Actually Seeing

Don't guess. Pull the raw healthcheck output first:

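# Show health status and last check output
# (docker inspect takes a container name or ID; if Compose generated the name,
# substitute "$(docker compose ps -q myservice)" for myservice)
docker inspect --format='{{json .State.Health}}' myservice | jq .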

The Log array is the key part. Each entry has an ExitCode and Output: that's the literal stdout/stderr from your healthcheck command. Nine times out of ten, this tells you exactly what's broken.
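If jq isn't installed, the same Log entries can be read with a Go template passed to docker inspect. This is a minimal sketch: the function names health_log and failed_probes are made up for illustration, and the argument is assumed to be a container name or ID with a healthcheck defined.

```shell
# Print one line per recorded probe: exit code, a tab, then the quoted output.
# Assumes $1 is a container name or ID that `docker inspect` accepts.
health_log() {
  docker inspect \
    --format '{{range .State.Health.Log}}{{.ExitCode}}{{"\t"}}{{printf "%q" .Output}}{{"\n"}}{{end}}' \
    "$1"
}

# Count how many of the recorded probes failed (non-zero exit code).
failed_probes() {
  health_log "$1" | awk -F'\t' 'NF && $1 != 0 { n++ } END { print n + 0 }'
}
```

Call it as health_log myservice. Quoting the output with %q keeps a multi-line error message on a single line, so each probe stays easy to scan.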

# Check if the container is even running
docker compose ps
docker compose logs myservice

Fix 1: Correct the Healthcheck Command

The most common culprit: the tool you're calling isn't in the image. Or the endpoint is wrong. Here's a working healthcheck for a Node.js API on port 3000:

services:
  api:
    image: my-node-app
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:3000/health"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 30s

Not sure which tool is available in your image? Test directly:

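# docker compose exec takes the service name, so it works even when
# Compose generated the container name
docker compose exec myservice wget -qO- http://localhost:3000/health
docker compose exec myservice curl -f http://localhost:3000/health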
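To check several tools in one pass, something like the following may help. It is a sketch that assumes the image ships a POSIX sh (distroless and scratch images do not); probe_tools is a hypothetical helper name and the tool list is only an example.

```shell
# List which common probe tools exist inside a running container.
# Assumes $1 is a container name or ID and that the image contains `sh`.
probe_tools() {
  docker exec "$1" sh -c \
    'for t in curl wget nc; do command -v "$t" >/dev/null 2>&1 && echo "$t"; done; true'
}
```

Running probe_tools myservice prints one tool name per line. Empty output means none of the three is installed, so the healthcheck needs a different command or the image needs the tool added.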

For PostgreSQL, use the built-in pg_isready; no extra tools needed:

healthcheck:
  test: ["CMD-SHELL", "pg_isready -U postgres"]
  interval: 5s
  timeout: 3s
  retries: 5

Redis is even simpler:

healthcheck:
  test: ["CMD", "redis-cli", "ping"]
  interval: 5s
  timeout: 3s
  retries: 5

Fix 2: Give Slow Services More Time with start_period

Java apps and Spring Boot are notorious for this. A Spring Boot service can take 30-90 seconds to fully start. Without a generous start_period, Docker will mark it unhealthy before it even finishes loading.

The key detail: failed checks during start_period don't count toward retries. It's a grace window, not a death timer.

healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8080/actuator/health"]
  interval: 15s
  timeout: 10s
  retries: 5
  start_period: 60s   # Give it 60s before retries count

Check your container logs for the actual startup time, then set start_period to that value plus 10-15 seconds of buffer.
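The grace-window rule is easier to see with a toy model. The function below is a simplified simulation, not Docker's actual code: it assumes whole seconds, assumes probe N finishes at N times the interval, and ignores how long each probe takes. The name simulate_health is made up for this sketch.

```shell
# Simplified model of how Docker counts probe results (not Docker's real code).
# Usage: simulate_health START_PERIOD INTERVAL RETRIES EXIT_CODE...
# Times are whole seconds; probe N is assumed to finish at N * INTERVAL.
simulate_health() {
  start_period=$1; interval=$2; retries=$3
  shift 3
  elapsed=0; failures=0; status=starting; in_grace=1
  for code in "$@"; do
    elapsed=$((elapsed + interval))
    if [ "$elapsed" -gt "$start_period" ]; then in_grace=0; fi
    if [ "$code" -eq 0 ]; then
      # Any success resets the streak and ends the grace window early.
      failures=0; status=healthy; in_grace=0
    elif [ "$in_grace" -eq 0 ]; then
      failures=$((failures + 1))
      if [ "$failures" -ge "$retries" ]; then status=unhealthy; fi
    fi
  done
  echo "$status"
}

simulate_health 60 15 5 1 1 1 1 1 1   # prints: starting  (four failures fall inside the grace window)
simulate_health 0 15 5 1 1 1 1 1 1    # prints: unhealthy (every failure counts)
```

The same six failed probes give different results because, with start_period: 60s, the first four land inside the grace window and only two count toward retries.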

Fix 3: Use condition: service_healthy in depends_on

By default, depends_on only waits for a container to start โ€” not to be ready. A database container can be "started" for 5 seconds before Postgres actually accepts connections. Your app hits it too early and fails.

The fix is one line per dependency:

services:
  app:
    image: my-app
    depends_on:
      db:
        condition: service_healthy   # Wait until db passes health check
      redis:
        condition: service_healthy

  db:
    image: postgres:15
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 5
      start_period: 10s

  redis:
    image: redis:7
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 3

This guarantees app won't launch until both db and redis are confirmed healthy.

Fix 4: Temporarily Disable the Healthcheck

Useful when you need to isolate the problem. Is the healthcheck config broken, or is the service itself crashing? Disabling it answers that question fast:

services:
  myservice:
    image: my-image
    healthcheck:
      disable: true

If the container runs fine with healthcheck disabled but fails with it enabled, the healthcheck config is your problem. Don't leave this in production.

Fix 5: The Service Itself Is Broken

Sometimes the healthcheck is perfectly fine. The container genuinely fails. Dig into the logs:

docker compose logs --tail=100 myservice

# Bring it up and watch for errors in real time
docker compose up 2>&1 | grep -E "(error|Error|fatal|Fatal|unhealthy)"

Usual suspects: missing environment variables, wrong database credentials, a port collision on the host, or a volume mount with permissions Docker can't write to.
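Before reading through the logs, a quick look at the container's state can narrow things down. A minimal sketch, assuming the argument is a container name or ID; crash_summary is a hypothetical helper name.

```shell
# One-line crash summary: state, last exit code, OOM flag, restart count.
# Assumes $1 is a container name or ID that `docker inspect` accepts.
crash_summary() {
  docker inspect \
    --format 'status={{.State.Status}} exit={{.State.ExitCode}} oom={{.State.OOMKilled}} restarts={{.RestartCount}}' \
    "$1"
}
```

A restart count that keeps climbing points to a crash loop under a restart policy. An exit code of 137 means the process was killed with SIGKILL, and oom=true says it was the kernel's out-of-memory killer.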

Verification

Once you've applied a fix, watch the container's status in real time:

watch -n2 'docker compose ps'

The STATUS column should move from starting → healthy within your configured start_period + (interval × retries) window. Double-check with inspect:

docker inspect --format='{{.State.Health.Status}}' myservice
# Expected output: healthy
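For scripts, the same check can be put in a polling loop with a deadline taken from the window above. This is a sketch under assumptions: health_window and wait_healthy are hypothetical names, POLL_SECONDS is an invented knob, all values are whole seconds, and the time each probe takes to run (up to timeout) is not included in the window.

```shell
# Rough upper bound, in seconds, for a container to leave "starting".
health_window() {
  echo $(( $1 + $2 * $3 ))   # start_period + interval * retries
}

# Poll until the container reports healthy or unhealthy, or the deadline passes.
# Returns 0 for healthy, 1 for unhealthy, 2 when the deadline is exceeded.
wait_healthy() {
  container=$1; deadline=$2; waited=0; status=unknown
  while [ "$waited" -le "$deadline" ]; do
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)
    case $status in
      healthy)   return 0 ;;
      unhealthy) echo "$container is unhealthy" >&2; return 1 ;;
    esac
    sleep "${POLL_SECONDS:-2}"
    waited=$((waited + ${POLL_SECONDS:-2}))
  done
  echo "$container still '$status' after ${deadline}s" >&2
  return 2
}
```

For the Spring Boot settings shown earlier, wait_healthy myservice "$(health_window 60 15 5)" would wait up to 135 seconds.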

Prevention

  • Test the command with docker exec first: run your healthcheck command manually inside the container before writing it into compose.yml. Saves a lot of trial and error.
  • Always set start_period explicitly: the default is 0s, so failed checks count toward retries from the very first probe. Almost every non-trivial service needs at least 10-30s here.
  • Prefer specific checks over generic TCP probes: pg_isready and redis-cli ping give meaningful output on failure; a raw TCP check just tells you a port is open, not that the service is working.
  • Add healthchecks to every stateful dependency (databases, caches, message queues) and pair them with condition: service_healthy in depends_on.
  • Capture healthcheck output in CI logs: flaky timing issues are much easier to catch in a pipeline than in a production incident at 2am.
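One way to capture healthcheck output in a pipeline is to dump each container's health record whenever the stack fails to come up. This is a sketch, not a drop-in CI step: ci_up is a hypothetical name, and it assumes the installed Compose v2 release supports the --wait flag on docker compose up.

```shell
# Start the stack, wait for health, and dump probe logs if anything fails.
# Assumes it runs in the directory that holds the Compose file.
ci_up() {
  if docker compose up -d --wait; then
    return 0
  fi
  for id in $(docker compose ps -a -q); do
    echo "--- health record for $id ---"
    docker inspect --format '{{json .State.Health}}' "$id"
  done
  return 1
}
```

Containers without a healthcheck print null, which is harmless in a log. The non-zero return lets the CI job fail while still leaving the probe output behind for debugging.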
