The route in question returned a paginated list of shipments filtered by status and sorted by createdAt — a perfectly ordinary query. In development against a few hundred documents, it was instant. In production, once the collection crossed 2 million documents, response times ballooned from ~40ms to 3.8–4.2 seconds.
The frontend team noticed first. The dashboard was unusable. We opened the investigation.
MongoDB has a database profiler that logs slow operations. Enable it for operations above a threshold:
db.setProfilingLevel(1, { slowms: 100 })Then query the system.profile collection to find slow operations:
db.system.profile.find({ millis: { $gt: 100 } }).sort({ ts: -1 }).limit(10)The output showed our shipments query with a millis value of 3847. The explain plan was the tell.
Running .explain("executionStats") on the query showed:
stage: COLLSCAN
docsExamined: 2,041,883
docsReturned: 20
totalKeysExamined: 0COLLSCAN means collection scan — MongoDB had no index to use and was reading every single document in the collection to find the 20 that matched the filter. 2 million documents examined to return 20. That's the problem.
The query filtered on `{ status: "pending", tenantId: "..." }` and sorted on `{ createdAt: -1 }`. There was a single-field index on `tenantId` from the original schema design, but no compound index that covered the full query pattern.
The compound index:
db.shipments.createIndex(
{ tenantId: 1, status: 1, createdAt: -1 },
{ name: "tenant_status_date" }
)Field order in a compound index matters. The general rule: equality fields first (fields you filter with exact match), then range or sort fields. Since we filter by tenantId and status with exact values, they go first. Since we sort by createdAt descending, it goes last — and the direction in the index must match the sort direction in the query.
After the index was created (it took ~4 minutes to build against 2M documents):
stage: IXSCAN
docsExamined: 20
docsReturned: 20
totalKeysExamined: 20Response time: 38ms. From 3.8 seconds to 38 milliseconds.
Since this incident we've added three rules to our project checklist:
1. Every query that will run against a large collection must have a covering index. We review query patterns during the design phase, not after performance complaints arrive.
2. Index analysis is part of the pre-launch checklist. We run .explain() on every significant query against a dataset that approximates production volume. Not 100 documents — at least 100,000.
3. We use MongoDB Atlas's Performance Advisor in production. Atlas automatically analyses slow queries and suggests indexes. It's not a replacement for thinking, but it catches things that slip through.
The lesson isn't that indexes are complicated. It's that a missing index on a small collection is invisible — and on a large collection it's a production incident. The time to add the index is before the data grows, not after.
Ready to take action?