Fix MongoServerError: document is larger than the maximum size 16777216

What happened

You tried to insert or update a document and MongoDB threw this:

MongoServerError: document is larger than the maximum size 16777216

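That number, 16,777,216 bytes, is exactly 16 MiB (16 × 1024 × 1024), usually written as 16 MB. It's MongoDB's hard limit on the size of a single BSON document. Replica set, Atlas, local dev machine: it doesn't matter. The ceiling is enforced by the MongoDB server (it reports the value as maxBsonObjectSize in the output of the hello command) rather than by the BSON format itself, and there's no config knob to raise it.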

The usual suspects:

Storing a Base64-encoded image or PDF directly in a document field — Base64 inflates binary data by ~33%, so a 12 MB PDF hits the limit before you've added a single metadata field
An unbounded array (logs, events, history) that grows with every $push until it finally tips over
Serializing a large object graph from your ORM and inserting the whole thing at once
Months of accumulated data that nobody tracked until production broke
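
The Base64 arithmetic in the first item is easy to check. Here is a minimal Python sketch, using a zero-filled buffer as a stand-in for a real PDF (the payload is an assumption; only its length matters):

```python
import base64

MAX_BSON_SIZE = 16 * 1024 * 1024  # 16,777,216 bytes

# Stand-in for a 12 MiB binary file; only the length matters for this check.
raw = bytes(12 * 1024 * 1024)
encoded = base64.b64encode(raw)

# Base64 emits 4 output bytes for every 3 input bytes.
print(f"raw: {len(raw)} bytes, encoded: {len(encoded)} bytes")
print("encoded payload alone reaches the limit:", len(encoded) >= MAX_BSON_SIZE)
```

The encoded string alone already equals the limit, so any field name, _id, or BSON overhead pushes the document over.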

Measure before you fix

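Don't guess which field is bloated: measure it. A document that is already stored is always within the limit, so measure either a stored document that is approaching the ceiling or the in-memory object your write was about to send. In mongosh:

// In mongosh
const doc = db.mycollection.findOne({ _id: ObjectId("...") });
Object.bsonsize(doc);
// e.g. 15900000  ← close to the 16,777,216-byte limit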

In Node.js with the native driver:

const { BSON } = require('bson');
const size = BSON.calculateObjectSize(doc);
console.log(`Document size: ${(size / 1024 / 1024).toFixed(2)} MB`);

In Python (PyMongo):

import bson
size = len(bson.encode(doc))
print(f"Document size: {size / 1024 / 1024:.2f} MB")

Once you've pinpointed the bloated field — typically a binary blob or a runaway array — pick the fix below that matches your situation.
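
If the whole-document size doesn't make the culprit obvious, you can size each top-level field on its own. The sketch below is one possible approach, not part of any driver: rank_fields_by_size and json_size are hypothetical helper names, and the default measure is JSON length, which is only a rough proxy for BSON size.

```python
import json
from typing import Any, Callable, Dict, List, Tuple


def json_size(fragment: Dict[str, Any]) -> int:
    """Rough stand-in for BSON size: length of the UTF-8 JSON encoding."""
    return len(json.dumps(fragment, default=str).encode("utf-8"))


def rank_fields_by_size(
    doc: Dict[str, Any],
    size_of: Callable[[Dict[str, Any]], int] = json_size,
) -> List[Tuple[str, int]]:
    """Return (field, size) pairs for the top-level fields, largest first."""
    sizes = [(key, size_of({key: value})) for key, value in doc.items()]
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)


# Hypothetical document: a short name plus a runaway array.
sample = {
    "name": "Widget",
    "events": [{"type": "click", "n": i} for i in range(1000)],
}
for field, size in rank_fields_by_size(sample):
    print(f"{field}: {size} bytes")
```

With PyMongo installed, passing size_of=lambda d: len(bson.encode(d)) should give real BSON sizes instead of the JSON approximation.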

Fix 1 — Use GridFS for binary/file data

Storing files, images, or PDFs directly in a document field is the wrong approach. GridFS was built for exactly this. It splits the file into 255 KB chunks and stores metadata separately, bypassing the 16 MB ceiling entirely.

Node.js example (native driver):

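const { MongoClient, GridFSBucket } = require('mongodb');
const fs = require('fs');
const { pipeline } = require('stream/promises');

// CommonJS modules have no top-level await, so run everything inside an async function.
async function main() {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  try {
    const bucket = new GridFSBucket(client.db('mydb'));
    const uploadStream = bucket.openUploadStream('report.pdf');

    // pipeline() resolves when the upload has finished and rejects on any stream error.
    await pipeline(fs.createReadStream('/tmp/large-report.pdf'), uploadStream);
    console.log('Uploaded file id:', uploadStream.id);
  } finally {
    await client.close();
  }
}

main().catch(console.error);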

Python example (PyMongo):

from pymongo import MongoClient
import gridfs

client = MongoClient('mongodb://localhost:27017')
db = client['mydb']
fs = gridfs.GridFS(db)

with open('/tmp/large-report.pdf', 'rb') as f:
    file_id = fs.put(f, filename='report.pdf')
    print(f'Stored file id: {file_id}')

Store only the returned file_id in your main document. Retrieve later with bucket.openDownloadStream(file_id) or fs.get(file_id).
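
To make the chunking concrete, here is a rough in-memory sketch of how a payload maps onto chunk documents. It only imitates the layout (the files_id, n, and data fields of fs.chunks, with the default 261,120-byte chunk size). The helper names and the file id are hypothetical, and the real drivers do all of this for you.

```python
from typing import Any, Dict, List

DEFAULT_CHUNK_SIZE = 255 * 1024  # 261,120 bytes, the GridFS default


def split_into_chunks(file_id: str, data: bytes,
                      chunk_size: int = DEFAULT_CHUNK_SIZE) -> List[Dict[str, Any]]:
    """Split data into chunk-like dicts numbered from 0."""
    return [
        {"files_id": file_id, "n": index, "data": data[start:start + chunk_size]}
        for index, start in enumerate(range(0, len(data), chunk_size))
    ]


def reassemble(chunks: List[Dict[str, Any]]) -> bytes:
    """Join chunks back together in order of their sequence number n."""
    return b"".join(c["data"] for c in sorted(chunks, key=lambda c: c["n"]))


payload = bytes(600_000)  # stand-in for a file's contents
chunks = split_into_chunks("example-file-id", payload)
print(len(chunks), [len(c["data"]) for c in chunks])
```

No single chunk comes anywhere near 16 MB, which is why the total file size stops mattering.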

Fix 2 — Split the document (bucket pattern)

Runaway arrays are a classic culprit. The bucket pattern caps each document at N items, then starts a new one — a well-established approach for event logs, telemetry, and time-series data.

// Instead of one document with 100k log entries:
// { _id, userId, events: [ ...100000 items... ] }

// Use bucketed documents:
// { _id, userId, bucket: 1, count: 200, events: [ ...200 items... ] }
// { _id, userId, bucket: 2, count: 200, events: [ ...200 items... ] }

const MAX_BUCKET_SIZE = 200;

await db.collection('user_events').updateOne(
  { userId: userId, count: { $lt: MAX_BUCKET_SIZE } },
  {
    $push: { events: newEvent },
    $inc: { count: 1 },
    $setOnInsert: { bucket: Date.now() }
  },
  { upsert: true }
);

Each bucket holds at most MAX_BUCKET_SIZE events, so it stays far below 16 MB as long as individual events are small. Reads that only need a slice of the history also touch smaller, bounded documents instead of one giant one.
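
The rollover behaviour of that upsert can be simulated in memory. The following Python sketch is an illustration only: append_event is a hypothetical helper, a plain list stands in for the collection, and the bucket number is a simple counter rather than a timestamp.

```python
from typing import Any, Dict, List

MAX_BUCKET_SIZE = 200


def append_event(buckets: List[Dict[str, Any]], user_id: str, event: Any,
                 max_size: int = MAX_BUCKET_SIZE) -> Dict[str, Any]:
    """Add event to the user's first non-full bucket, creating one if needed."""
    for bucket in buckets:
        if bucket["userId"] == user_id and bucket["count"] < max_size:
            bucket["events"].append(event)
            bucket["count"] += 1
            return bucket
    # No open bucket matched the filter: this mirrors the upsert's insert branch.
    new_bucket = {"userId": user_id, "bucket": len(buckets) + 1,
                  "count": 1, "events": [event]}
    buckets.append(new_bucket)
    return new_bucket


buckets: List[Dict[str, Any]] = []
for i in range(450):
    append_event(buckets, "user-1", {"seq": i})
print([b["count"] for b in buckets])  # [200, 200, 50]
```

Once a bucket reaches the cap it no longer matches the filter, so the next write starts a fresh document.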

Fix 3 — Reference instead of embed

Embedding sub-documents works great for small, stable data. For data that grows over time — reviews, comments, audit logs — embedding turns into a liability. Move the growing data to a separate collection and store a reference:

// Before (bloated): product with all reviews embedded
{
  _id: ObjectId("..."),
  name: "Widget",
  reviews: [ /* 5000 reviews */ ]
}

// After: separate collection
// products: { _id, name }
// reviews:  { _id, productId, text, rating, date }

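Use $lookup when you need to join, or query the reviews collection directly (ideally with an index on productId) when rendering. Keep in mind that a $lookup which pulls every review back into one product document returns a result that is subject to the same limit, so page or limit the joined side. Two queries is a small price for never hitting the 16 MB wall.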
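
A one-off migration from the embedded shape to the referenced shape could look roughly like this. It is a sketch on plain dictionaries: split_reviews is a hypothetical helper, and the string _id stands in for a real ObjectId.

```python
from typing import Any, Dict, List, Tuple


def split_reviews(
    product: Dict[str, Any],
) -> Tuple[Dict[str, Any], List[Dict[str, Any]]]:
    """Return (slim product, review docs that reference it by productId)."""
    slim = {key: value for key, value in product.items() if key != "reviews"}
    reviews = [
        {**review, "productId": product["_id"]}
        for review in product.get("reviews", [])
    ]
    return slim, reviews


embedded = {
    "_id": "widget-1",  # hypothetical id; a real document would use an ObjectId
    "name": "Widget",
    "reviews": [{"text": "Great", "rating": 5}, {"text": "Okay", "rating": 3}],
}
product_doc, review_docs = split_reviews(embedded)
print(product_doc)
print(review_docs)
```

The slim product would go into products and each review into reviews, where the number of reviews no longer affects the size of any single document.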

Fix 4 — Compress before storing

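Sometimes the data genuinely belongs together: a snapshot, a serialized report. Compression is a reasonable last resort. Repetitive JSON often compresses several-fold with gzip, though the ratio depends entirely on the data; at 5x, a 50 MB payload shrinks to about 10 MB. The trade-off is that the compressed field is opaque to MongoDB, so you can't query or index its contents: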

const zlib = require('zlib');

// Compress
const raw = JSON.stringify(bigObject);
const compressed = zlib.gzipSync(raw);  // returns Buffer

await db.collection('snapshots').insertOne({
  _id: snapshotId,
  data: compressed,  // stored as BinData
  compressedAt: new Date()
});

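// Decompress on read. By default the driver returns the field as a BSON Binary,
// so hand zlib its underlying bytes (doc.data.buffer), not the wrapper object.
const doc = await db.collection('snapshots').findOne({ _id: snapshotId });
const original = JSON.parse(zlib.gunzipSync(doc.data.buffer).toString());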

One thing to be aware of: MongoDB's WiredTiger engine compresses data at the storage layer, but that's transparent and has no effect on BSON document size. You need application-level compression — like the gzip above — to reduce the document size before insertion.
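
The same round trip in Python needs only the standard library. This is a minimal sketch with hypothetical helper names and a deliberately repetitive sample payload, so the ratio it prints says nothing about your own data.

```python
import gzip
import json
from typing import Any


def compress_payload(obj: Any) -> bytes:
    """Serialize obj to JSON and gzip the UTF-8 bytes."""
    return gzip.compress(json.dumps(obj).encode("utf-8"))


def decompress_payload(blob: bytes) -> Any:
    """Reverse compress_payload: gunzip, then parse the JSON."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


# Hypothetical, highly repetitive payload; real data will compress differently.
big_object = {"rows": [{"status": "ok", "value": i % 10} for i in range(10_000)]}
raw_size = len(json.dumps(big_object).encode("utf-8"))
blob = compress_payload(big_object)

print(f"raw: {raw_size} bytes, compressed: {len(blob)} bytes")
assert decompress_payload(blob) == big_object
```

PyMongo stores a Python bytes value as BSON binary data, so blob can go straight into a document field. Measure the compressed size before relying on it, since incompressible data (already-compressed images, encrypted blobs) barely shrinks.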

Verify the fix

After applying your chosen fix, confirm the new document is actually within bounds:

// Check size of the new document
const newDoc = await db.collection('mycollection').findOne({ _id: newId });
const { BSON } = require('bson');
console.log('Size (bytes):', BSON.calculateObjectSize(newDoc));
// Should be well under 16777216

For GridFS, verify the file landed correctly:

// In mongosh
db.fs.files.findOne({ filename: 'report.pdf' });
// Should show { length: ..., chunkSize: 261120, ... }

Lessons from the field

Never store files as Base64 strings in documents. A 12 MB PDF becomes ~16 MB encoded — and you haven't stored a single metadata field yet. Use GridFS or an object store (S3, Cloudflare R2).
Add a size alert to write-heavy collections. Check Object.bsonsize() in staging on documents that grow over time. A 10 MB document caught early is far cheaper than a 3 AM incident.
Enforce array length at the schema level. Mongoose validators, application-layer checks — either works. Don't wait for MongoDB to tell you the document is too big.
The limit is per document, not per collection. Splitting large data across multiple documents in the same collection is always safe.
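
The size alert and the array cap from the list above can both be small application-level guards. The sketch below is one way to write them; the function names, the 50% warning threshold, and the 200-item cap are assumptions to tune for your own workload.

```python
MAX_BSON_SIZE = 16 * 1024 * 1024  # 16,777,216 bytes


def classify_size(size_bytes: int, warn_ratio: float = 0.5) -> str:
    """Return 'ok', 'warn', or 'over' for a measured document size."""
    if size_bytes > MAX_BSON_SIZE:
        return "over"
    if size_bytes >= MAX_BSON_SIZE * warn_ratio:
        return "warn"
    return "ok"


def check_array_length(items: list, max_items: int = 200) -> None:
    """Raise before the write if an array has outgrown its cap."""
    if len(items) > max_items:
        raise ValueError(f"array has {len(items)} items, cap is {max_items}")


print(classify_size(10 * 1024 * 1024))  # a 10 MB document -> "warn"
check_array_length(list(range(150)))    # within the cap, no error
```

Feed classify_size the number you get from your driver's BSON size call and alert on anything that is not "ok".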

