Back to Blog
Engineering Deep-Dive7 min read

The Hidden Cost of Missing Indexes: A Production Post-Mortem

The API route had worked fine in development and staging. It worked fine in production for the first two weeks. Then the data grew, and a route that should take 40ms started taking 4 seconds. Here's the full post-mortem.

The symptom

The route in question returned a paginated list of shipments filtered by status and sorted by createdAt — a perfectly ordinary query. In development against a few hundred documents, it was instant. In production, once the collection crossed 2 million documents, response times ballooned from ~40ms to 3.8–4.2 seconds.

The frontend team noticed first. The dashboard was unusable. We opened the investigation.

Diagnosing with the MongoDB profiler

MongoDB has a database profiler that logs slow operations. Enable it for operations above a threshold:

db.setProfilingLevel(1, { slowms: 100 })

Then query the system.profile collection to find slow operations:

db.system.profile.find({ millis: { $gt: 100 } }).sort({ ts: -1 }).limit(10)

The output showed our shipments query with a millis value of 3847. The explain plan was the tell.

What the query plan revealed

Running .explain("executionStats") on the query showed:

stage: COLLSCAN
docsExamined: 2,041,883
docsReturned: 20
totalKeysExamined: 0

COLLSCAN means collection scan — MongoDB had no index to use and was reading every single document in the collection to find the 20 that matched the filter. 2 million documents examined to return 20. That's the problem.

The query filtered on `{ status: "pending", tenantId: "..." }` and sorted on `{ createdAt: -1 }`. There was a single-field index on `tenantId` from the original schema design, but no compound index that covered the full query pattern.

The fix and why the field order matters

The compound index:

db.shipments.createIndex(
  { tenantId: 1, status: 1, createdAt: -1 },
  { name: "tenant_status_date" }
)

Field order in a compound index matters. The general rule: equality fields first (fields you filter with exact match), then range or sort fields. Since we filter by tenantId and status with exact values, they go first. Since we sort by createdAt descending, it goes last — and the direction in the index must match the sort direction in the query.

After the index was created (it took ~4 minutes to build against 2M documents):

stage: IXSCAN
docsExamined: 20
docsReturned: 20
totalKeysExamined: 20

Response time: 38ms. From 3.8 seconds to 38 milliseconds.

The indexing strategy we now enforce

Since this incident we've added three rules to our project checklist:

1. Every query that will run against a large collection must have a covering index. We review query patterns during the design phase, not after performance complaints arrive.

2. Index analysis is part of the pre-launch checklist. We run .explain() on every significant query against a dataset that approximates production volume. Not 100 documents — at least 100,000.

3. We use MongoDB Atlas's Performance Advisor in production. Atlas automatically analyses slow queries and suggests indexes. It's not a replacement for thinking, but it catches things that slip through.

The lesson isn't that indexes are complicated. It's that a missing index on a small collection is invisible — and on a large collection it's a production incident. The time to add the index is before the data grows, not after.

Ready to take action?

Working with MongoDB at scale?

Get in Touch
← Read more articles