The Scenario
It's 2 AM. Your cron job just died. The logs show:
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
char 0 means Python didn't make it past the very first character. The string it tried to parse was either completely empty or something that isn't JSON at all: an HTML error page, a plain-text error message, a BOM character, a 500 response from an API you thought was reliable.
What's Actually Happening
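Both json.loads() and json.load() raise JSONDecodeError the instant they see anything that isn't valid JSON. The line 1 column 1 (char 0) variant means the parser failed on the very first character, so the input was most likely one of these:
- Empty string ("" or b"")
- Just whitespace (" "), although the reported position then points just past the whitespace (char 1 for a single space) rather than char 0
- An HTML page, such as a 404 or 503 response from an API
- A plain-text error message from a server
- A file that got truncated mid-write, or never written at all
- A response with a UTF-8 BOM (\xef\xbb\xbf) prepended (for a decoded string, Python words this one as "Unexpected UTF-8 BOM" at the same position)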
The fix depends on which of these you're dealing with. Start by printing exactly what you're trying to parse.
Diagnose Before You Fix
Drop this one line before your json.loads() call:
print(repr(response_text)) # or repr(raw_string)
json.loads(response_text)
repr() strips away any illusions. You'll immediately see if the string is empty, starts with <!DOCTYPE, or has a BOM hiding at position zero.
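If you would rather have a label in your logs than eyeball the repr() output, the same checks can be sketched as a small function. The name describe_payload and its labels are made up for this illustration, and the "plausibly JSON" test only looks at the first character, so treat it as a triage aid rather than a validator.

```python
def describe_payload(raw):
    """Return a short label for what a would-be JSON payload looks like."""
    # Accept bytes too, since response bodies and file reads are often bytes.
    if isinstance(raw, bytes):
        text = raw.decode("utf-8", errors="replace")
    else:
        text = raw

    if text == "":
        return "empty"
    if text.startswith("\ufeff"):
        return "starts with a BOM"
    stripped = text.strip()
    if not stripped:
        return "whitespace only"
    if stripped[0] == "<":
        return "looks like HTML or XML"
    # Every JSON value starts with one of these characters.
    if stripped[0] in '{["-0123456789tfn':
        return "plausibly JSON"
    return "plain text"


for sample in ["", "  \n", b"\xef\xbb\xbf{}", "<!DOCTYPE html>", '{"a": 1}']:
    print(repr(sample), "->", describe_payload(sample))
```

Logging this label next to the repr() preview makes it quicker to jump to the matching fix in the next section.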
Quick Fixes by Cause
1. Empty String from an API Response
This is the most common production trigger. An API returned 204 No Content, hit a timeout, or responded with a 5xx error page instead of JSON.
import requests
import json

response = requests.get("https://api.example.com/data")

# Bad: crashes on an empty body
# data = json.loads(response.text)

# Good: check before parsing
if response.status_code == 200 and response.text.strip():
    data = response.json()  # requests handles the decode
else:
    print(f"Unexpected response: {response.status_code}, body: {response.text[:200]!r}")
Prefer response.json() from the requests library when you can. In requests 2.27 and later it raises requests.exceptions.JSONDecodeError, which is also a RequestException, so a single except clause can cover both network failures and bad JSON.
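The status and empty-body checks can also be pulled into a function that takes plain values, which keeps it testable without a network call. This is a minimal sketch: parse_api_body is a hypothetical name, it only treats status 200 as usable (mirroring the check above), and it returns None on failure, which means a legitimate JSON null body is indistinguishable from an error. Raise instead if that distinction matters to you.

```python
import json
import logging

logger = logging.getLogger(__name__)


def parse_api_body(status_code, text, source="unknown"):
    """Return parsed JSON, or None when the response is not usable JSON."""
    if status_code != 200:
        logger.warning("%s returned status %s: %r", source, status_code, text[:200])
        return None
    if not text.strip():
        logger.warning("%s returned an empty body", source)
        return None
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # Log a preview of the raw body so the failure can be debugged later.
        logger.warning("%s returned invalid JSON (%s): %r", source, exc, text[:200])
        return None
```

With requests, a call would look like parse_api_body(response.status_code, response.text, source="GET /data").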
2. Reading a File That's Empty or Truncated
import json
import os

path = "data.json"

if not os.path.exists(path) or os.path.getsize(path) == 0:
    raise FileNotFoundError(f"JSON file missing or empty: {path}")

with open(path, "r", encoding="utf-8") as f:
    data = json.load(f)
Writing and reading the file from different processes? The writer may not have flushed yet. Call f.flush() followed by os.fsync(f.fileno()) after writing to guarantee the data hits disk.
3. BOM at the Start of the File
Files saved by Windows tools or Excel often sneak in a UTF-8 BOM (\xef\xbb\xbf). Python's json module does not skip it: given a decoded string that starts with a BOM, it fails at char 0 with "Unexpected UTF-8 BOM (decode using utf-8-sig)".
# Fix: utf-8-sig silently strips the BOM before parsing
with open("data.json", "r", encoding="utf-8-sig") as f:
    data = json.load(f)
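To see the difference for yourself, the following sketch writes a throwaway file with a BOM in front and reads it back both ways. The temporary file and the variable names are only for illustration.

```python
import json
import os
import tempfile

# Write a small JSON file with a UTF-8 BOM in front, the way some editors do.
fd, path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "wb") as f:
    f.write(b"\xef\xbb\xbf" + b'{"key": "value"}')

try:
    # Plain utf-8 keeps the BOM as a leading U+FEFF character, so parsing fails.
    with open(path, "r", encoding="utf-8") as f:
        try:
            json.load(f)
            plain_utf8_ok = True
        except json.JSONDecodeError:
            plain_utf8_ok = False

    # utf-8-sig drops the BOM if present and is harmless if it is absent.
    with open(path, "r", encoding="utf-8-sig") as f:
        data = json.load(f)
finally:
    os.remove(path)

print(plain_utf8_ok, data)
```

Because utf-8-sig also reads BOM-free files correctly, it is a reasonable default whenever you do not control who writes the file.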
4. API Returns HTML Instead of JSON
Rate limiting, expired auth tokens, and misconfigured proxies all love to return an HTML error page. Check the Content-Type header before parsing, since it usually gives the game away:
import requests

url = "https://api.example.com/data"  # same example endpoint as above

response = requests.get(url, headers={"Accept": "application/json"})
content_type = response.headers.get("Content-Type", "")

if "application/json" not in content_type:
    raise ValueError(f"Expected JSON, got {content_type}: {response.text[:300]}")

data = response.json()
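The substring test above handles the common cases. If you want something a little stricter, one option is to compare the media type itself and ignore parameters such as charset. The helper below is a sketch with a made-up name, is_json_content_type, and accepting "+json" suffixes like application/problem+json is my assumption about what you would want, not something the API in question necessarily sends.

```python
def is_json_content_type(header_value):
    """Return True if a Content-Type header value names a JSON media type."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = header_value.split(";", 1)[0].strip().lower()
    # Accept plain JSON plus structured-syntax suffixes like application/problem+json.
    return media_type == "application/json" or media_type.endswith("+json")


print(is_json_content_type("application/json; charset=utf-8"))
print(is_json_content_type("text/html; charset=utf-8"))
```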
5. Empty Environment Variable
Environment variables are always strings. If one is unset or blank, you're handing an empty string straight to json.loads():
import json
import os

raw = os.environ.get("MY_CONFIG", "")

if not raw:
    raise EnvironmentError("MY_CONFIG environment variable is not set")

config = json.loads(raw)
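When a missing variable should fall back to a default instead of stopping the program, the same idea can be wrapped in a function. This is a sketch under that assumption; load_json_env and its default parameter are hypothetical names, not part of any library.

```python
import json
import os


def load_json_env(name, default=None):
    """Parse a JSON environment variable, falling back to default when unset or blank."""
    raw = os.environ.get(name, "")
    if not raw.strip():
        return default
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        # Name the variable in the error so the traceback points at the real culprit.
        raise ValueError(f"{name} does not contain valid JSON: {exc}") from exc


config = load_json_env("MY_CONFIG", default={})
```

A set-but-malformed value still raises, because silently swapping in the default there would hide a real configuration mistake.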
A Reusable Helper Worth Keeping
Once you've tracked down the root cause, wrap your JSON parsing in a helper. Cryptic tracebacks at 2 AM are no fun, and this gives you something actionable instead:
import json
from typing import Any, Union


def safe_parse_json(raw: Union[str, bytes], source: str = "unknown") -> Any:
    """Parse JSON with a useful error message on failure."""
    if not isinstance(raw, str):
        raw = raw.decode("utf-8", errors="replace")  # handle bytes
    stripped = raw.strip()
    if not stripped:
        raise ValueError(f"Empty JSON input from {source}")
    try:
        return json.loads(stripped)
    except json.JSONDecodeError as e:
        # Include a short preview of the input so the log shows what actually arrived
        preview = repr(stripped[:200])
        raise ValueError(f"Invalid JSON from {source}: {e}; got: {preview}") from e
Call it like:
data = safe_parse_json(response.text, source=f"GET {url}")
Next time this blows up at 2 AM, your log will show exactly where the bad input came from and what the first 200 characters looked like.
Verify the Fix
Run this quick sanity check before deploying:
import json

test_cases = [
    ("", "empty string"),
    (" ", "whitespace only"),
    ('{"key": "value"}', "valid JSON"),
    ("null", "JSON null"),
]

for raw, label in test_cases:
    try:
        result = json.loads(raw) if raw.strip() else None
        print(f"{label}: OK -> {result}")
    except json.JSONDecodeError as e:
        print(f"{label}: FAIL -> {e}")
Expected output:
empty string: OK -> None
whitespace only: OK -> None
valid JSON: OK -> {'key': 'value'}
JSON null: OK -> None
Prevention Tips
Staring at a 3,000-character response blob and can't tell where the syntax breaks? Paste it into the JSON Formatter & Validator at ToolCraft. It highlights exactly which token is invalid, and nothing gets uploaded: it all runs in your browser.
For the long term:
- Always check the API response status code before touching the body.
- Log the raw response whenever JSON parsing fails β you'll want it for debugging later.
- If you own the API, return Content-Type: application/json consistently, even for error responses.
- For file-based pipelines, write atomically: output to a .tmp file first, then os.replace() it into place (os.rename() does the same on POSIX but fails on Windows if the target already exists). This stops partial reads cold.
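The atomic-write tip can be sketched as follows. The function name write_json_atomic is invented for this example, and it uses os.replace(), which, unlike os.rename(), also overwrites an existing target on Windows. It assumes the temporary file and the destination live in the same directory, since a rename across filesystems is not atomic.

```python
import json
import os
import tempfile


def write_json_atomic(path, obj):
    """Write obj as JSON to path so readers see the old file or the new one, never half."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the final swap stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(obj, f)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before the swap
        os.replace(tmp_path, path)  # atomic on POSIX; overwrites an existing target
    except BaseException:
        # Clean up the temp file if anything went wrong before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A reader that opens data.json while this runs gets either the previous complete file or the new complete file, which removes the truncated-file cause from the list at the top.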

