MongoDB Performance Optimization: Tips and Best Practices
MongoDB is powerful, but a poorly tuned deployment quickly runs into slow queries and wasted resources. Here’s how to optimize your MongoDB database for peak performance.
Indexing Strategies
Single Field Index
// Create index on email field
db.users.createIndex({ email: 1 });
// Query using index
db.users.find({ email: "user@example.com" });
Compound Index
Order matters! Follow the ESR rule: equality fields first, then the sort field, then range fields:
// Good - equality fields (status, customerId) first, sort field (createdAt) last
db.orders.createIndex({ status: 1, customerId: 1, createdAt: -1 });
// Query that uses index
db.orders.find({ status: "pending", customerId: "123" })
.sort({ createdAt: -1 });
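The ESR ordering also holds when a range filter is involved: the range field goes last, after the sort field. A minimal sketch on the same orders collection (the total filter is illustrative):
// Equality (status) → Sort (createdAt) → Range (total)
db.orders.createIndex({ status: 1, createdAt: -1, total: 1 });
db.orders.find({ status: "pending", total: { $gt: 100 } })
.sort({ createdAt: -1 });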
Index Direction
// Ascending index
db.products.createIndex({ price: 1 });
// Descending index
db.products.createIndex({ createdAt: -1 });
// For sorting, direction matters
db.products.find().sort({ price: 1 }); // Uses index
db.products.find().sort({ price: -1 }); // Can use same index (traversed in reverse)
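Direction only becomes restrictive with compound indexes: a multi-field sort must match the index directions exactly, or their exact inverse. A short sketch:
db.products.createIndex({ category: 1, price: -1 });
db.products.find().sort({ category: 1, price: -1 }); // Uses index
db.products.find().sort({ category: -1, price: 1 }); // Uses index (exact inverse)
db.products.find().sort({ category: 1, price: 1 }); // Cannot use index (mixed directions)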
Partial Indexes
Index only subset of documents:
// Index only active users
db.users.createIndex(
{ email: 1 },
{ partialFilterExpression: { status: "active" } }
);
// To use the partial index, the query must include the filter expression
db.users.find({ email: "user@example.com", status: "active" });
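The flip side: a query that could match documents outside the filtered subset will not use the partial index.
// ❌ Won't use the partial index - could match inactive users too
db.users.find({ email: "user@example.com" });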
Text Indexes
// Create text index
db.articles.createIndex({ title: "text", content: "text" });
// Search
db.articles.find({ $text: { $search: "mongodb performance" } });
// With relevance score
db.articles.find(
{ $text: { $search: "mongodb" } },
{ score: { $meta: "textScore" } }
).sort({ score: { $meta: "textScore" } });
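When matches in one field should outrank matches in another, weights can be assigned at index creation (this replaces the index above; unweighted fields default to 1):
// Rank title matches 10x higher than content matches
db.articles.createIndex(
{ title: "text", content: "text" },
{ weights: { title: 10, content: 1 } }
);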
Index Best Practices
// ✅ Good - supports queries on status alone and on status + customerId
db.orders.createIndex({ status: 1, customerId: 1 });
// ❌ Bad - a query filtering on status alone can't use this (status isn't a prefix)
db.orders.createIndex({ customerId: 1, status: 1 });
// ✅ Good - include sort fields
db.products.createIndex({ category: 1, price: -1 });
// ✅ Good - covered query (index contains every filtered and projected field)
db.users.createIndex({ email: 1, name: 1 });
db.users.find(
{ email: "user@example.com" },
{ email: 1, name: 1, _id: 0 }
);
Query Optimization
Use Explain
Always analyze query performance:
// Basic explain
db.users.find({ email: "user@example.com" }).explain("executionStats");
// Check if index is used
db.users.find({ email: "user@example.com" })
.explain("executionStats")
.executionStats.totalDocsExamined; // Should be close to nReturned
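The winning plan shows how the query actually ran: IXSCAN means an index was used, COLLSCAN means a full collection scan. A quick check in mongosh:
const plan = db.users.find({ email: "user@example.com" })
.explain("executionStats");
print(plan.queryPlanner.winningPlan.stage); // e.g. "FETCH" wrapping an "IXSCAN"
print(plan.executionStats.totalDocsExamined, plan.executionStats.nReturned);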
Projection
Only return fields you need:
// ❌ Bad - returns all fields
db.users.find({ status: "active" });
// ✅ Good - only return needed fields
db.users.find(
{ status: "active" },
{ name: 1, email: 1, _id: 0 }
);
Limit Results
// ❌ Bad - returns all matching documents
db.products.find({ category: "electronics" });
// ✅ Good - limit results
db.products.find({ category: "electronics" }).limit(10);
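For pagination, large skip() values still walk every skipped document. Range-based pagination on an indexed field scales better; a sketch where lastSeenId is a placeholder for the last _id of the previous page:
// ❌ Bad for deep pages - skip still examines the skipped documents
db.products.find({ category: "electronics" }).skip(10000).limit(10);
// ✅ Good - resume from where the previous page ended (lastSeenId is a placeholder)
db.products.find({ category: "electronics", _id: { $gt: lastSeenId } })
.sort({ _id: 1 })
.limit(10);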
Avoid $where and Unanchored $regex
// ❌ Bad - slow, can't use an index
db.users.find({ $where: "this.name.length > 5" });
db.users.find({ name: /john/i }); // unanchored and case-insensitive
// ✅ Good - anchored, case-sensitive regex can use the index
// (adding $options: "i" would prevent efficient index use)
db.users.find({ name: { $regex: "^John" } });
// ✅ Better - exact match with index
db.users.find({ name: "John" });
Aggregation Pipeline Optimization
// ✅ Good order: match early, project late
db.orders.aggregate([
{ $match: { status: "pending" } }, // Filter early
{ $sort: { createdAt: -1 } }, // Sort indexed field
{ $limit: 100 }, // Limit early
{ $lookup: { // Join after filtering
from: "customers",
localField: "customerId",
foreignField: "_id",
as: "customer"
}},
{ $project: { // Project late
orderId: 1,
total: 1,
"customer.name": 1
}}
]);
// ❌ Bad - lookup before filtering
db.orders.aggregate([
{ $lookup: { /* ... */ }}, // Expensive join first
{ $match: { status: "pending" } } // Filter after join
]);
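Aggregation pipelines accept explain as well, which makes it easy to verify that the early $match is actually hitting an index:
db.orders.explain("executionStats").aggregate([
{ $match: { status: "pending" } },
{ $sort: { createdAt: -1 } },
{ $limit: 100 }
]);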
Schema Design
Embedding vs Referencing
Embed when:
- Data is frequently accessed together
- One-to-few relationships
- Data doesn’t change often
// Embedded documents
{
_id: ObjectId("..."),
name: "John Doe",
addresses: [
{ street: "123 Main St", city: "NYC" },
{ street: "456 Oak Ave", city: "LA" }
]
}
Reference when:
- Data is large
- Data is frequently updated
- Many-to-many relationships
// Referenced documents (ids are illustrative - real ObjectIds are 24-char hex)
// User
{
_id: ObjectId("user1"),
name: "John Doe",
orderIds: [ObjectId("order1"), ObjectId("order2")]
}
// Orders
{
_id: ObjectId("order1"),
userId: ObjectId("user1"),
total: 99.99
}
Avoid Large Arrays
// ❌ Bad - unbounded array growth (documents creep toward the 16 MB limit)
{
_id: ObjectId("..."),
productId: "123",
reviews: [/* potentially thousands of reviews */]
}
// ✅ Good - separate collection
// Product
{
_id: ObjectId("..."),
productId: "123",
reviewCount: 1250
}
// Reviews (separate collection)
{
_id: ObjectId("..."),
productId: "123",
rating: 5,
comment: "Great!"
}
Use Appropriate Data Types
// ❌ Bad
{
price: "19.99", // String instead of number
createdAt: "2024-01-01" // String instead of date
}
// ✅ Good
{
price: 19.99, // Number
createdAt: ISODate("2024-01-01T00:00:00Z") // Date
}
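On MongoDB 4.2+, existing string fields can be converted in place with an update pipeline. A sketch for the products collection above (test on a copy first):
// Convert string fields to proper types across the collection
db.products.updateMany(
{ price: { $type: "string" } },
[{ $set: {
price: { $toDouble: "$price" },
createdAt: { $toDate: "$createdAt" }
}}]
);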
Connection Management
Connection Pooling
const { MongoClient } = require('mongodb');
const client = new MongoClient(uri, {
maxPoolSize: 50, // Maximum connections
minPoolSize: 10, // Minimum connections
maxIdleTimeMS: 30000, // Close idle connections
waitQueueTimeoutMS: 5000 // Max wait for a free connection from the pool
});
Reuse Connections
// ❌ Bad - new connection per request
app.get('/users', async (req, res) => {
const client = await MongoClient.connect(uri);
const users = await client.db().collection('users').find().toArray();
await client.close();
res.json(users);
});
// ✅ Good - connect once at startup, share the pool across requests
const client = new MongoClient(uri);
app.get('/users', async (req, res) => {
const users = await client.db().collection('users').find().toArray();
res.json(users);
});
async function start() {
await client.connect(); // one connection pool for the whole process
app.listen(3000);
}
start();
Bulk Operations
// ❌ Bad - individual operations
for (const user of users) {
await db.users.insertOne(user);
}
// ✅ Good - bulk insert
await db.users.insertMany(users, { ordered: false });
// Bulk write operations
const bulkOps = users.map(user => ({
insertOne: { document: user }
}));
await db.users.bulkWrite(bulkOps, { ordered: false });
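bulkWrite can also mix operation types in a single round trip, which is useful for upsert-style syncs; a small sketch:
await db.users.bulkWrite([
{ insertOne: { document: { name: "New User" } } },
{ updateOne: {
filter: { email: "user@example.com" },
update: { $set: { status: "active" } },
upsert: true
}},
{ deleteOne: { filter: { status: "banned" } } }
], { ordered: false });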
Read Preference
// Primary (default) - all reads from primary
db.users.find().readPref('primary');
// Secondary - read from secondaries (may be slightly stale; fine for analytics)
db.analytics.find().readPref('secondary');
// Primary Preferred - primary if available
db.users.find().readPref('primaryPreferred');
Write Concern
// Wait for acknowledgment from primary only (fast)
db.logs.insertOne(doc, { writeConcern: { w: 1 } });
// Wait for majority (safer)
db.orders.insertOne(doc, { writeConcern: { w: 'majority' } });
// Wait with timeout
db.critical.insertOne(doc, {
writeConcern: { w: 'majority', wtimeout: 5000 }
});
Monitoring
Enable Profiler
// Enable profiler for slow queries (> 100ms)
db.setProfilingLevel(1, { slowms: 100 });
// View slow queries
db.system.profile.find().sort({ ts: -1 }).limit(10);
// Disable profiler
db.setProfilingLevel(0);
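Profiler output lands in a normal collection, so it can be queried like any other data. Two useful filters (millis and planSummary are standard profiler fields):
// Slowest operations first
db.system.profile.find({ millis: { $gt: 100 } }).sort({ millis: -1 }).limit(10);
// Operations that scanned the whole collection
db.system.profile.find({ planSummary: "COLLSCAN" });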
Monitor Index Usage
// Get index stats
db.users.aggregate([{ $indexStats: {} }]);
// Find unused indexes
db.users.aggregate([
{ $indexStats: {} },
{ $match: { "accesses.ops": 0 } }
]);
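Once an index is confirmed unused over a representative traffic window (note that $indexStats counters reset on server restart), drop it by name:
// List index names, then drop the unused one (name is illustrative)
db.users.getIndexes();
db.users.dropIndex("status_1_customerId_1");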
Sharding Considerations
Choose Good Shard Key
// ❌ Bad - monotonically increasing
{ _id: 1 } // All new documents go to same shard
// ✅ Good - evenly distributed
{ userId: 1, timestamp: 1 }
// ✅ Good - hashed
{ _id: "hashed" }
Target Queries to Single Shard
// ✅ Good - includes shard key
db.orders.find({ userId: "123", status: "pending" });
// ❌ Bad - scatter-gather across all shards
db.orders.find({ status: "pending" });
Best Practices Checklist
✅ Create indexes for frequently queried fields
✅ Use compound indexes in the correct order
✅ Use projection to limit returned fields
✅ Limit query results
✅ Use the aggregation pipeline efficiently
✅ Design schema appropriately (embed vs. reference)
✅ Reuse database connections
✅ Use bulk operations for multiple writes
✅ Monitor slow queries with the profiler
✅ Remove unused indexes
✅ Use appropriate read/write concerns
✅ Choose good shard keys
Performance Testing
// Measure query time
const start = Date.now();
const result = await db.users.find({ email: "user@example.com" }).toArray();
console.log(`Query took ${Date.now() - start}ms`);
// Verify the query plan with explain
db.users.find({ email: "user@example.com" })
.explain("executionStats");
Conclusion
MongoDB performance optimization is crucial for scalable applications. Focus on proper indexing, efficient queries, appropriate schema design, and regular monitoring. Always test and measure before optimizing, and use the explain plan to understand query execution.