
Database Indexing Optimization: Complete Guide

Learn database indexing strategies, optimization techniques, query performance tuning, and best practices for improving database performance.

The Importance of Database Indexing

Database indexes are critical for query performance. Proper indexing can improve query speed by orders of magnitude, while poor indexing can lead to slow queries, high resource consumption, and scalability issues. Understanding indexing strategies is essential for any developer working with databases.

Indexing Analogy: Think of a database index like a book's index. Without an index, you'd have to scan every page to find information. With an index, you can go directly to the relevant pages.
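
The analogy can be observed directly in a query plan. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for a production database (an assumption made so the example runs anywhere); the users table, its columns, and the query_plan helper are hypothetical. The exact plan wording differs between SQLite versions and between database engines.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's EXPLAIN QUERY PLAN output as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the last column of each row.
    return " | ".join(row[-1] for row in rows)

# Hypothetical table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

lookup = "SELECT name FROM users WHERE email = ?"
args = ("user42@example.com",)

# Without an index the engine has to scan every row ("every page of the book").
plan_before = query_plan(conn, lookup, args)

# With an index it can jump straight to the matching entries.
conn.execute("CREATE INDEX idx_users_email ON users(email)")
plan_after = query_plan(conn, lookup, args)

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan should report a scan of the table and the second a search using idx_users_email.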

Types of Database Indexes

B-Tree Indexes

The most common index type, built on a balanced tree structure:

  • Default index type in most databases
  • Efficient for equality and range queries
  • Supports sorting operations
  • Works well with most data types

-- PostgreSQL/MySQL
CREATE INDEX idx_users_email ON users(email);
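
To see why an ordered structure serves both equality and range queries, here is a simplified sketch in Python. A sorted list searched with bisect stands in for the leaf level of a B-tree; this is an assumption for illustration, not how any database implements its pages, and the keys and function names are hypothetical.

```python
import bisect

# Sorted (key, row_id) pairs stand in for the leaf level of a B-tree.
entries = sorted((f"user{i:03d}@example.com", i) for i in range(200))
keys = [key for key, _ in entries]

def find_equal(key):
    """Row ids whose key equals `key`, located by binary search."""
    lo = bisect.bisect_left(keys, key)
    hi = bisect.bisect_right(keys, key)
    return [row_id for _, row_id in entries[lo:hi]]

def find_range(low, high):
    """Row ids with low <= key < high, returned in key order."""
    lo = bisect.bisect_left(keys, low)
    hi = bisect.bisect_left(keys, high)
    return [row_id for _, row_id in entries[lo:hi]]

equal_hits = find_equal("user042@example.com")
range_hits = find_range("user010", "user013")
print(equal_hits, range_hits)
```

Because the entries are kept in key order, a range query is one binary search followed by a sequential read, and the results come back already sorted.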

Hash Indexes

Fast for equality lookups only:

  • Very fast for exact-match (=) lookups
  • Doesn't support range queries or sorting
  • In PostgreSQL, limited to a single column
  • Cannot enforce uniqueness: in PostgreSQL, only B-tree indexes can be declared UNIQUE

-- PostgreSQL
CREATE INDEX idx_users_id_hash ON users USING hash (id);
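
The equality-only limitation follows from how hashing places keys. The sketch below is a toy bucket table in Python, not PostgreSQL's actual hash index layout; the bucket count, the use of CRC32, and the function names are assumptions chosen for illustration.

```python
import zlib

NUM_BUCKETS = 16  # hypothetical bucket count

def bucket_of(key: str) -> int:
    """Deterministic hash of the key, reduced to a bucket number."""
    return zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS

buckets = [[] for _ in range(NUM_BUCKETS)]
for row_id in range(200):
    key = f"user{row_id:03d}"
    buckets[bucket_of(key)].append((key, row_id))

def find_equal(key):
    """Equality lookup: inspect exactly one bucket."""
    return [rid for k, rid in buckets[bucket_of(key)] if k == key]

def find_range(low, high):
    """Range lookup: hashing discards key order, so every bucket is visited."""
    visited = 0
    hits = []
    for bucket in buckets:
        visited += 1
        hits.extend(rid for k, rid in bucket if low <= k < high)
    return sorted(hits), visited

equal_hits = find_equal("user042")
range_hits, buckets_visited = find_range("user010", "user013")
print(equal_hits, range_hits, buckets_visited)
```

Neighbouring keys land in unrelated buckets, so a range predicate gains nothing from the structure and degenerates into a full pass over it.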

GIN (Generalized Inverted) Indexes

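For array, JSONB, and full-text search:

  • Optimized for array containment and overlap operations
  • Excellent for full-text search
  • Supports JSONB queries (the plain JSON type has no built-in GIN operator class)
  • Often larger and slower to update than B-tree indexes

-- PostgreSQL: GIN index on a JSONB or array column
CREATE INDEX idx_users_tags_gin ON users USING gin (tags);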
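
The idea behind an inverted index can be sketched in a few lines of Python. This is a conceptual model only, assuming a hypothetical tags column holding a list per row; it does not reflect GIN's on-disk format.

```python
from collections import defaultdict

# Hypothetical rows: row id -> list of tags stored in that row.
rows = {
    1: ["python", "sql"],
    2: ["sql", "postgres"],
    3: ["python", "postgres", "sql"],
    4: ["go"],
}

# Inverted index: each distinct element maps to the set of rows containing it.
inverted = defaultdict(set)
for row_id, tags in rows.items():
    for tag in tags:
        inverted[tag].add(row_id)

def rows_containing_all(*tags):
    """Rows whose tag list contains every given tag (a containment query)."""
    postings = [inverted.get(tag, set()) for tag in tags]
    return sorted(set.intersection(*postings)) if postings else []

both = rows_containing_all("python", "sql")
none = rows_containing_all("go", "sql")
print(both, none)
```

Each row contributes one index entry per element rather than one entry per row, which is one reason such indexes tend to be larger and costlier to update than a B-tree on a scalar column.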

BRIN (Block Range) Indexes

For large, naturally ordered data:

  • Very small storage footprint
  • Efficient for time-series data
  • Works only when column values correlate with physical row order
  • Less precise than B-tree: it narrows the search to block ranges, whose rows are then rechecked

-- PostgreSQL: time-series table
CREATE INDEX idx_logs_timestamp_brin ON logs USING brin (timestamp);
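
A block-range summary is simple enough to model directly. The Python sketch below assumes an append-only log whose timestamps increase with row position; the block size and names are hypothetical, and a real BRIN index summarizes ranges of disk pages rather than fixed row counts.

```python
BLOCK_SIZE = 100  # rows per block range (hypothetical)

# Append-only log: timestamps increase with row position.
timestamps = list(range(1_000, 11_000))  # 10,000 rows

# The whole "index" is one (min, max) pair per block of rows.
summaries = []
for start in range(0, len(timestamps), BLOCK_SIZE):
    block = timestamps[start:start + BLOCK_SIZE]
    summaries.append((min(block), max(block)))

def find_range(low, high):
    """Rows with low <= ts <= high; only blocks whose summary overlaps are read."""
    hits, blocks_read = [], 0
    for block_no, (bmin, bmax) in enumerate(summaries):
        if bmax < low or bmin > high:
            continue  # the summary proves this block cannot match
        blocks_read += 1
        start = block_no * BLOCK_SIZE
        # The summary is lossy, so each row in a candidate block is rechecked.
        hits.extend(
            ts for ts in timestamps[start:start + BLOCK_SIZE] if low <= ts <= high
        )
    return hits, blocks_read

hits, blocks_read = find_range(2_050, 2_249)
print(len(summaries), len(hits), blocks_read)
```

Here 100 summary pairs cover 10,000 rows, and the query reads only the three overlapping blocks. If the timestamps were shuffled, nearly every block's range would overlap the query and the index would filter out almost nothing.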

Index Design Strategies

1. Composite Indexes

Index multiple columns together for specific query patterns:

-- Query: WHERE status = 'active' AND created_at > '2024-01-01'
-- Good: composite index on (status, created_at)
CREATE INDEX idx_users_status_created ON users(status, created_at);

-- Query: WHERE status = 'active' ORDER BY created_at DESC
-- Also works well with the same index

-- Query: WHERE created_at > '2024-01-01' AND status = 'active'
-- Still works (the order of predicates in the WHERE clause doesn't matter)

-- Query: WHERE created_at > '2024-01-01'
-- Does NOT use the index efficiently (leading column missing)
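
The leading-column rule can be checked against a real planner. This sketch uses Python's sqlite3 module as a convenient stand-in (an assumption; PostgreSQL and MySQL planners differ in detail, and some engines can skip-scan a composite index when statistics favor it). The table layout and the query_plan helper are hypothetical.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's EXPLAIN QUERY PLAN output as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, name TEXT, status TEXT, created_at TEXT)"
)
conn.execute("CREATE INDEX idx_users_status_created ON users(status, created_at)")

# Both columns constrained: the index can be searched on its full key.
plan_both = query_plan(
    conn,
    "SELECT name FROM users WHERE status = ? AND created_at > ?",
    ("active", "2024-01-01"),
)

# Leading column missing: there is no starting point within the index.
plan_trailing_only = query_plan(
    conn,
    "SELECT name FROM users WHERE created_at > ?",
    ("2024-01-01",),
)

print("status + created_at:", plan_both)
print("created_at only:    ", plan_trailing_only)
```

The first plan should name idx_users_status_created; the second should fall back to scanning the table.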

2. Covering Indexes

Include all columns needed by the query in the index:

-- Query: SELECT id, name, email FROM users WHERE status = 'active'

-- Without covering index: index lookup + table access
CREATE INDEX idx_users_status ON users(status);

-- With covering index: index-only scan
-- INCLUDE is available in PostgreSQL 11+ and SQL Server
CREATE INDEX idx_users_status_covering ON users(status) INCLUDE (id, name, email);

-- MySQL: no INCLUDE clause; add the extra columns to the index key instead
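
The difference between the two plans can be seen with a small experiment. The sketch below uses Python's sqlite3 module as a stand-in (an assumption). SQLite has no INCLUDE clause, so the extra columns are appended to the index key, which is the same workaround described for MySQL. The schema and the query_plan helper are hypothetical.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's EXPLAIN QUERY PLAN output as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, name TEXT, email TEXT, status TEXT, bio TEXT)"
)
query = "SELECT id, name, email FROM users WHERE status = ?"

# Plain index: matching entries are found in the index, then each row is fetched.
conn.execute("CREATE INDEX idx_users_status ON users(status)")
plan_plain = query_plan(conn, query, ("active",))

# Covering index: every selected column lives in the index itself.
# The plain index is dropped first so the planner has only one choice.
conn.execute("DROP INDEX idx_users_status")
conn.execute("CREATE INDEX idx_users_status_covering ON users(status, name, email)")
plan_covering = query_plan(conn, query, ("active",))

print("plain:   ", plan_plain)
print("covering:", plan_covering)
```

SQLite marks the second plan with the words COVERING INDEX. The id column needs no explicit entry here because it is the table's row identifier, which SQLite stores in every index.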

3. Partial/Filtered Indexes

Index only a subset of rows:

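-- Index only active users (smaller, faster)
-- The predicate already fixes status, so index a column the queries filter or sort on
CREATE INDEX idx_users_active ON users(created_at) WHERE status = 'active';

-- Index only recent data
-- PostgreSQL requires an immutable predicate, so CURRENT_DATE is not allowed here;
-- use a fixed cutoff and recreate the index periodically
CREATE INDEX idx_orders_recent ON orders(created_at) WHERE created_at > '2024-01-01';

-- Index only non-deleted records
CREATE INDEX idx_comments_active ON comments(post_id) WHERE deleted_at IS NULL;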
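
A partial index is only considered when the query's own WHERE clause implies the index predicate. The sketch below shows this with Python's sqlite3 module, used as a stand-in (an assumption; SQLite supports partial indexes with similar syntax). The comments table and the query_plan helper are hypothetical.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's EXPLAIN QUERY PLAN output as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT, deleted_at TEXT)"
)
conn.execute(
    "CREATE INDEX idx_comments_active ON comments(post_id) WHERE deleted_at IS NULL"
)

# The query repeats the index predicate, so the partial index qualifies.
plan_match = query_plan(
    conn,
    "SELECT body FROM comments WHERE post_id = ? AND deleted_at IS NULL",
    (7,),
)

# Without the predicate the index might miss rows, so it cannot be used.
plan_no_predicate = query_plan(
    conn,
    "SELECT body FROM comments WHERE post_id = ?",
    (7,),
)

print("with predicate:   ", plan_match)
print("without predicate:", plan_no_predicate)
```

In practice this means application queries must spell out the same condition as the index, or the planner will ignore it.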

Query Optimization with Indexes

1. Index Selection Rules

How databases choose which index to use:

Index Selection Factors:
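  • Selectivity: Put equality-filtered, more selective columns first in composite indexes
  • Cardinality: Columns with many distinct values make better index candidates
  • Query Patterns: Match the columns used in WHERE, JOIN, and ORDER BY
  • Index Size: Smaller indexes are cheaper to scan and keep in memory
  • Maintenance Cost: Every index adds work to inserts, updates, and deletes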
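
Selectivity and cardinality are easy to estimate before creating an index. The following Python sketch computes the ratio of distinct values to total rows for two hypothetical columns; the sample data and the selectivity helper are illustrative assumptions, and real planners rely on their own sampled statistics.

```python
def selectivity(values):
    """Distinct values divided by total rows; 1.0 means every value is unique."""
    values = list(values)
    return len(set(values)) / len(values) if values else 0.0

# Hypothetical column samples.
emails = [f"user{i}@example.com" for i in range(1000)]
statuses = ["active" if i % 10 else "inactive" for i in range(1000)]

email_sel = selectivity(emails)     # every value distinct
status_sel = selectivity(statuses)  # only two distinct values

print(f"email:  {email_sel:.3f}")
print(f"status: {status_sel:.3f}")
```

A ratio near 1 suggests an index will narrow a lookup to very few rows. A ratio near 0 suggests an index on that column alone will rarely pay for its maintenance cost, though it may still be useful as part of a composite or partial index.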

2. Common Anti-Patterns

Indexing mistakes to avoid:

-- 1. Over-indexing (too many indexes)
-- Each index adds write overhead (t is a placeholder table name)
CREATE INDEX idx_a ON t(a);
CREATE INDEX idx_b ON t(b);
CREATE INDEX idx_c ON t(c);
CREATE INDEX idx_a_b ON t(a, b);  -- Makes idx_a largely redundant

-- 2. Indexing low-cardinality columns
CREATE INDEX idx_gender ON users(gender);  -- Only 2-3 values

-- 3. Indexing frequently updated columns
CREATE INDEX idx_last_login ON users(last_login);  -- Updates on every login

-- 4. Wrong column order in composite indexes
-- Query: WHERE status = 'active' AND created_at > '2024-01-01'
CREATE INDEX idx_wrong_order ON users(created_at, status);    -- Wrong!
CREATE INDEX idx_correct_order ON users(status, created_at);  -- Correct!
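
Redundant indexes from the first anti-pattern can be spotted mechanically: an index is a candidate for removal when another index on the same table begins with the same columns. The Python sketch below assumes the index definitions have already been read into a dictionary; the function name and input shape are hypothetical.

```python
def redundant_prefix_indexes(indexes):
    """Map each index to a wider index on the same table that starts with its columns.

    `indexes` maps index name -> (table name, tuple of column names in key order).
    """
    redundant = {}
    for name, (table, cols) in indexes.items():
        for other, (other_table, other_cols) in indexes.items():
            if (
                other != name
                and other_table == table
                and len(other_cols) > len(cols)
                and other_cols[:len(cols)] == cols
            ):
                redundant[name] = other
                break
    return redundant

# Hypothetical definitions matching the over-indexing example.
indexes = {
    "idx_a": ("t", ("a",)),
    "idx_b": ("t", ("b",)),
    "idx_c": ("t", ("c",)),
    "idx_a_b": ("t", ("a", "b")),
}

result = redundant_prefix_indexes(indexes)
print(result)
```

The check is deliberately conservative about column order: idx_b is not flagged, because (a, b) cannot serve a lookup on b alone. Unique indexes should be excluded from any such cleanup, since they enforce a constraint regardless of how they are queried.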

Index Monitoring and Maintenance

1. Identifying Unused Indexes

Find and remove indexes that aren't being used:

-- PostgreSQL: Check index usage (least-used first)
SELECT schemaname, relname AS table_name, indexrelname AS index_name,
       idx_scan AS index_scans
FROM pg_stat_user_indexes
ORDER BY idx_scan;

-- MySQL: Check index usage (least-used first)
SELECT object_schema, object_name, index_name, count_read, count_fetch
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE index_name IS NOT NULL
ORDER BY count_read;

-- SQL Server: Check index usage in the current database
SELECT object_name(s.object_id) AS table_name, i.name AS index_name,
       user_seeks, user_scans, user_lookups, user_updates
FROM sys.dm_db_index_usage_stats s
INNER JOIN sys.indexes i
  ON s.object_id = i.object_id AND s.index_id = i.index_id
WHERE s.database_id = DB_ID();
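
Once usage statistics have been fetched, the rows still need filtering before anything is dropped. The Python sketch below assumes each row is a dictionary with indexrelname and idx_scan keys, as in PostgreSQL's pg_stat_user_indexes view. The enforces_constraint flag is a hypothetical addition that the caller would have to derive, for example from pg_index.indisunique and pg_index.indisprimary.

```python
def unused_index_candidates(stats, min_scans=1):
    """Names of indexes scanned fewer than `min_scans` times.

    Indexes flagged with the hypothetical `enforces_constraint` key are skipped:
    they can show zero scans and still be required.
    """
    return [
        row["indexrelname"]
        for row in stats
        if row["idx_scan"] < min_scans and not row.get("enforces_constraint", False)
    ]

# Hypothetical statistics rows.
stats = [
    {"indexrelname": "users_pkey", "idx_scan": 0, "enforces_constraint": True},
    {"indexrelname": "idx_users_email", "idx_scan": 15230},
    {"indexrelname": "idx_users_legacy_flag", "idx_scan": 0},
]

candidates = unused_index_candidates(stats)
print(candidates)
```

Usage counters are cumulative since the last statistics reset, so a zero only means something if the observation window covers infrequent workloads such as monthly reports.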

2. Index Fragmentation and Rebuilding

Maintain index performance over time:

-- PostgreSQL: REINDEX
REINDEX INDEX CONCURRENTLY idx_users_email;  -- PostgreSQL 12+; avoids blocking writes
REINDEX TABLE users;                         -- Blocks writes to the table while it runs

-- MySQL: OPTIMIZE TABLE
OPTIMIZE TABLE users;

-- SQL Server: Rebuild/Reorganize
-- Rebuild (more thorough)
ALTER INDEX idx_users_email ON users REBUILD;
-- Reorganize (online, less disruptive)
ALTER INDEX idx_users_email ON users REORGANIZE;

-- Check fragmentation
SELECT object_name(ips.object_id) AS table_name, i.name AS index_name,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ips
INNER JOIN sys.indexes i
  ON ips.object_id = i.object_id AND ips.index_id = i.index_id
WHERE ips.avg_fragmentation_in_percent > 30;
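
The choice between rebuilding and reorganizing can be turned into a simple rule. In the Python sketch below, the 30% rebuild threshold echoes the filter in the fragmentation query above, while the 5% reorganize threshold is an assumption based on a commonly cited rule of thumb; both should be tuned to the workload. The function name is hypothetical.

```python
def maintenance_action(fragmentation_pct, reorganize_at=5.0, rebuild_at=30.0):
    """Suggest a maintenance action for an index from its fragmentation percentage.

    The thresholds are illustrative defaults, not fixed recommendations.
    """
    if fragmentation_pct > rebuild_at:
        return "REBUILD"
    if fragmentation_pct > reorganize_at:
        return "REORGANIZE"
    return "NONE"

# Hypothetical fragmentation readings per index.
readings = {
    "idx_users_email": 42.5,
    "idx_orders_created": 12.0,
    "idx_logs_id": 1.3,
}

plan = {name: maintenance_action(pct) for name, pct in readings.items()}
print(plan)
```

Keeping the thresholds as parameters makes it easy to be more aggressive on read-heavy tables and more lenient on tables where a rebuild would compete with peak write traffic.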

Database-Specific Optimization

PostgreSQL

  • Use partial indexes
  • Consider BRIN for time-series
  • GIN for JSONB/arrays
  • Run ANALYZE regularly to keep planner statistics current

MySQL

  • Use covering indexes
  • Consider prefix indexes
  • Monitor index usage
  • Use EXPLAIN ANALYZE (MySQL 8.0.18+)

SQL Server

  • Use filtered indexes
  • Consider columnstore
  • Monitor fragmentation
  • Use included columns

Conclusion and Best Practices

Indexing Best Practices Checklist

  • ✓ Index columns used in WHERE, JOIN, ORDER BY
  • ✓ Create composite indexes for common query patterns
  • ✓ Use covering indexes to avoid table access
  • ✓ Consider partial indexes for filtered data
  • ✓ Monitor and remove unused indexes
  • ✓ Rebuild fragmented indexes regularly
  • ✓ Test index changes with realistic data
  • ✓ Use database-specific optimization features

Effective database indexing requires understanding your data, query patterns, and database capabilities. Start with the most critical queries, monitor performance, and iterate. Remember that indexing is a balance between read performance and write overhead. Regular monitoring and adjustment are key to maintaining optimal database performance.
