The Error
You call AutoTokenizer.from_pretrained() and get hit with this:
OSError: Can't load tokenizer for 'bert-base-uncased'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'bert-base-uncased' is the correct identifier of a pretrained model listed on 'https://huggingface.co/models'.
Or the short version:
OSError: Can't load tokenizer for 'your-model-name'.
Looks scary. Usually isn't. The culprit is almost always one of three things: a wrong model name, a blocked network, or a corrupted local cache. A few rarer causes are covered below as well.
Why It Happens
- Typo or wrong model name: the identifier doesn't exist on the Hub.
- No internet access: firewall, corporate proxy, or air-gapped machine.
- Corrupt cache: a partial download left broken files in ~/.cache/huggingface/.
- Outdated transformers version: some tokenizer classes were introduced after 4.20; older installs don't know about them.
- Private or gated model: Llama 2, Mistral, and similar repos require an accepted license and a valid HF token.
- Local path confusion: you passed a directory path but the expected files aren't inside it.
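The error message itself points at one more flavor of local path confusion: a folder in your working directory that happens to share the model's name. Here is a minimal sketch for spotting that with the standard library only. The helper name `shadowing_directory` is made up for this example and is not part of any Hugging Face package.

```python
import os

def shadowing_directory(model_id, base_dir="."):
    """Return the path of a local directory named like the model ID, or None.

    Hypothetical helper: it only checks whether base_dir/model_id is a directory.
    """
    candidate = os.path.join(base_dir, model_id)
    return candidate if os.path.isdir(candidate) else None

hit = shadowing_directory("bert-base-uncased")
if hit:
    print(f"Local directory {hit!r} may be picked up instead of the Hub model")
else:
    print("No local directory shadows this model ID")
```

If this reports a hit and you meant to load from the Hub, rename the folder or run your script from a different directory.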
Step-by-Step Fix
Step 1: Verify the Model Name
Model names are case-sensitive. Bert-Base-Uncased fails; bert-base-uncased works.
from transformers import AutoTokenizer
# Wrong
tokenizer = AutoTokenizer.from_pretrained("Bert-Base-Uncased")
# Correct
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
Community models also need the username prefix; copy the ID directly from the model card URL:
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
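If you keep a short list of the model IDs your project actually uses, a typo check is easy to sketch with `difflib` from the standard library. Both `KNOWN_IDS` and `suggest_model_id` are illustrative names invented for this example; the list is something you would maintain yourself, not something fetched from the Hub.

```python
import difflib

# Hypothetical, hand-maintained list of IDs this project relies on.
KNOWN_IDS = [
    "bert-base-uncased",
    "bert-base-cased",
    "mistralai/Mistral-7B-v0.1",
]

def suggest_model_id(candidate, known_ids=KNOWN_IDS):
    """Return the closest known ID (compared case-insensitively), or None."""
    by_lowercase = {known.lower(): known for known in known_ids}
    matches = difflib.get_close_matches(
        candidate.lower(), list(by_lowercase), n=1, cutoff=0.8
    )
    return by_lowercase[matches[0]] if matches else None

print(suggest_model_id("Bert-Base-Uncased"))
```

This only catches near misses against your own list. It says nothing about whether an ID exists on the Hub.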
Step 2: Check Network and Authentication
Before touching anything else, confirm you can actually reach the Hub:
python -c "from huggingface_hub import HfApi; print(HfApi().model_info('bert-base-uncased'))"
A connection error here means the problem is network, not your code.
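If `huggingface_hub` itself won't import or you want a lower-level check, a plain TCP probe is a rough substitute. This is a sketch using only the standard library; `can_reach` is a made-up helper name, and the default of port 443 on huggingface.co is an assumption about where downloads go.

```python
import socket

def can_reach(host="huggingface.co", port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Call `can_reach()` from the machine that fails. A `False` points at DNS, firewall, or routing. A `True` only proves the port opens; a proxy can still interfere with the HTTPS traffic itself.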
Gated models like Llama 2 need an accepted license on the model page plus a valid token:
# Option A: CLI login (persists across sessions)
huggingface-cli login
# Option B: pass token inline
tokenizer = AutoTokenizer.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    token="hf_your_token_here"
)
Generate a token at huggingface.co/settings/tokens. Read access is enough for downloading.
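Hardcoding a token in source code is easy to leak. One alternative, sketched below, is to read it from an environment variable and pass it along. `token_from_env` is a hypothetical helper; `HF_TOKEN` is the variable name `huggingface_hub` documents, and recent versions may pick it up on their own, so treat the explicit pass-through as a way of making the dependency visible rather than a requirement.

```python
import os

def token_from_env(env=None, var_name="HF_TOKEN"):
    """Return the token stored in var_name, or None if it is unset or blank."""
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    return token or None
```

Usage would look like `AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf", token=token_from_env())`. A `None` result means no token was found, and the call falls back to whatever login is already stored.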
Step 3: Clear the Corrupted Cache
Interrupted downloads leave partial files that block future loads. Delete the specific model folder and let it re-download fresh:
# Find the cache location first
python -c "from huggingface_hub import constants; print(constants.HF_HUB_CACHE)"
# Delete just the broken model (Linux/macOS)
rm -rf ~/.cache/huggingface/hub/models--bert-base-uncased
# Nuclear option: wipe everything
rm -rf ~/.cache/huggingface/hub/
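Note: tokenizer files themselves are small, usually a few megabytes at most. Wiping the whole cache also deletes any model weights stored there, though: bert-base-uncased is around 440 MB, and larger models (Llama 2 7B) hit 13+ GB. Make sure you have the disk space and bandwidth before re-downloading.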
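The same cleanup can be done from Python, which helps on Windows or inside a notebook. The sketch below assumes the cache layout implied by the rm command above, where each model lives in a folder named `models--` plus the ID with slashes turned into `--`. The function names are invented for this example, and you pass in the cache root yourself (for instance the path printed by the `constants.HF_HUB_CACHE` command).

```python
import shutil
from pathlib import Path

def cache_folder_for(model_id, cache_root):
    """Map a Hub model ID to its folder under the hub cache root."""
    return Path(cache_root) / ("models--" + model_id.replace("/", "--"))

def remove_cached_model(model_id, cache_root):
    """Delete one model's cache folder; return True if something was removed."""
    folder = cache_folder_for(model_id, cache_root)
    if folder.is_dir():
        shutil.rmtree(folder)
        return True
    return False
```

For example, `remove_cached_model("bert-base-uncased", Path.home() / ".cache/huggingface/hub")` targets the same folder as the single-model rm command, assuming the default cache location.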
Step 4: Upgrade Transformers
Fast tokenizer backends and newer model types often require a recent release. Upgrade the whole stack at once:
pip install --upgrade transformers tokenizers huggingface_hub
Check what you're now running:
python -c "import transformers; print(transformers.__version__)"
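Anything below 4.30 will struggle with models released in 2023 or later. Most modern models need something newer still; for example, Mistral support landed in 4.34 and Gemma in 4.38. When in doubt, check the model card for a minimum version.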
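To turn that version check into something a script can act on, here is a rough comparison sketch using only the standard library. `version_tuple` and `meets_minimum` are hypothetical helpers; they look at the leading integers only, so a pre-release such as 4.35.0rc1 is treated like 4.35.0.

```python
import re

def version_tuple(version):
    """Turn '4.35.2' or '4.40.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break  # stop at the first non-numeric piece, e.g. 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)

def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so '4.30' equals '4.30.0'
    b += (0,) * (width - len(b))
    return a >= b
```

In practice you would call `meets_minimum(transformers.__version__, "4.30")` and print an upgrade hint when it returns `False`.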
Step 5: Load from a Local Directory
Already have the model downloaded? Skip the network entirely by pointing directly at the local path:
from transformers import AutoTokenizer
# Save it once
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
tokenizer.save_pretrained("./local_tokenizer/")
# Load offline from that point on
tokenizer = AutoTokenizer.from_pretrained("./local_tokenizer/")
The directory must contain tokenizer_config.json. Depending on the tokenizer type, you'll also need vocab.txt (BERT-style), vocab.json (GPT-2-style), or tokenizer.json (fast tokenizers). Check with ls ./local_tokenizer/ if something feels missing.
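That file check is easy to automate. The sketch below covers only the files named above; `check_tokenizer_dir` and `VOCAB_CANDIDATES` are names invented for this example. Some tokenizers ship other vocabulary formats (SentencePiece-based ones typically use a tokenizer.model file), so a warning here is a hint rather than a verdict.

```python
from pathlib import Path

# Vocabulary files mentioned in this guide; other tokenizer types may use different ones.
VOCAB_CANDIDATES = ("vocab.txt", "vocab.json", "tokenizer.json")

def check_tokenizer_dir(path):
    """Return a list of problems found; an empty list means the basics are present."""
    folder = Path(path)
    if not folder.is_dir():
        return [f"{folder} is not a directory"]
    problems = []
    if not (folder / "tokenizer_config.json").is_file():
        problems.append("missing tokenizer_config.json")
    if not any((folder / name).is_file() for name in VOCAB_CANDIDATES):
        problems.append(
            "missing a vocabulary file (one of: " + ", ".join(VOCAB_CANDIDATES) + ")"
        )
    return problems
```

Run `check_tokenizer_dir("./local_tokenizer/")` before calling `from_pretrained` on the folder and print whatever it returns.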
Step 6: Enable Offline Mode Explicitly
On air-gapped machines, set TRANSFORMERS_OFFLINE=1 before running anything:
# Shell
export TRANSFORMERS_OFFLINE=1
python your_script.py
# Or inside Python
import os
os.environ["TRANSFORMERS_OFFLINE"] = "1"
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
With this flag set, Transformers skips all network calls and loads only from cache. If the cache is missing, you get a clear error immediately, with no hanging while trying to reach huggingface.co.
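Order matters in the Python variant: the flag is set before the import, and as far as I know the library reads it when it is first imported. The sketch below wraps that in a helper that also reports whether it ran too late. `enable_offline_mode` is a hypothetical name, and setting `HF_HUB_OFFLINE` alongside it is an addition not covered above (it is the equivalent switch documented by `huggingface_hub`).

```python
import os
import sys

def enable_offline_mode():
    """Set the offline flags; return False if transformers was already imported."""
    os.environ["TRANSFORMERS_OFFLINE"] = "1"
    os.environ["HF_HUB_OFFLINE"] = "1"  # assumption: useful extra, not required by this guide
    return "transformers" not in sys.modules

if not enable_offline_mode():
    print("transformers was imported first; restart the process for the flag to apply")
```

Call it at the very top of your entry script, before any Hugging Face import.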
Step 7: Fall Back to the Slow Tokenizer
Occasionally the fast (Rust-based) tokenizer has a version conflict with the tokenizers library. Switch to the pure-Python implementation as a diagnostic step:
tokenizer = AutoTokenizer.from_pretrained(
    "bert-base-uncased",
    use_fast=False
)
Slower on large batches, but sidesteps any Rust library ABI issues entirely.
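If you want the fallback to happen automatically while you investigate, a small wrapper can try the fast path first and retry with `use_fast=False`. This is a sketch, not a recommended permanent setup: `load_with_slow_fallback` is a made-up name, and the loader is passed in as an argument so the wrapper itself has no dependency on transformers.

```python
def load_with_slow_fallback(loader, model_id, **kwargs):
    """Try the fast tokenizer first; on any failure, retry with use_fast=False.

    Returns a (tokenizer, used_fast) pair. Do not pass use_fast in kwargs.
    """
    try:
        return loader(model_id, use_fast=True, **kwargs), True
    except Exception as fast_error:
        print(f"Fast tokenizer failed ({fast_error!r}); retrying with use_fast=False")
        return loader(model_id, use_fast=False, **kwargs), False
```

With transformers installed, usage would be `tokenizer, used_fast = load_with_slow_fallback(AutoTokenizer.from_pretrained, "bert-base-uncased")`. If `used_fast` comes back `False`, the problem sits in the fast backend rather than in the model files.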
Verify the Fix
Run this quick sanity check after applying your fix:
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
output = tokenizer("Hello, HuggingFace!", return_tensors="pt")
print(output)
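# Expected: {'input_ids': tensor([[101, 7592, 1010, ...]]), 'token_type_ids': tensor([[0, 0, ...]]), 'attention_mask': tensor([[1, 1, ...]])}
Tensor output means the tokenizer loaded. Note that return_tensors="pt" needs PyTorch installed; drop the argument to get plain Python lists instead. If you still get the OSError, the model ID or the files are still the issue.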
Quick Diagnosis Checklist
- Copy the model name directly from the Hub URL; don't retype it.
- Run huggingface-cli whoami to confirm you're authenticated.
- Check the model page for an "Access restricted" or "Gated model" notice.
- Check disk space: tokenizer files are small, but full model downloads range from about 440 MB to 13+ GB.
- When loading locally, confirm tokenizer_config.json exists in the target folder.
- On corporate networks, set HTTPS_PROXY: many HF download failures are silent proxy blocks.
Proxy and Firewall Fix
Corporate proxies often block HuggingFace downloads without a clear error. Set the proxy before running your script:
export HTTPS_PROXY=http://proxy.company.com:8080
export HTTP_PROXY=http://proxy.company.com:8080
python your_script.py
Or set it inside Python if you can't modify the shell environment:
import os
os.environ["HTTPS_PROXY"] = "http://proxy.company.com:8080"
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
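To confirm the proxy variables are visible to your Python process, you can ask the standard library what it detects. This is only an approximation: `detected_proxy` is a hypothetical helper, and it reports what `urllib` sees, which I am assuming matches what the Hugging Face download code picks up from the same environment variables.

```python
import urllib.request

def detected_proxy(scheme="https"):
    """Return the proxy URL Python detects for the given scheme, or None."""
    return urllib.request.getproxies().get(scheme)

print("HTTPS proxy seen by Python:", detected_proxy("https"))
```

A `None` here after you exported HTTPS_PROXY usually means the variable was set in a different shell or after the process started.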

