Table of Contents
- Overview
- Core Concepts
- Vacuum Implementation
- Autovacuum Implementation
- Key Data Structures
- Vacuum Process Flow
- Autovacuum Process Flow
- Configuration Parameters
- Performance Considerations
- Troubleshooting
Overview
PostgreSQL's vacuum system is responsible for maintaining database health by:
- Reclaiming storage space from deleted tuples
- Preventing transaction ID wraparound
- Updating table statistics
- Maintaining visibility maps and free space maps
- Freezing old tuples to prevent XID wraparound
The system consists of two main components:
- Manual VACUUM - User-initiated vacuum operations
- Autovacuum - Automatic background vacuum processes
Core Concepts
Transaction ID (XID) Management
- PostgreSQL uses 32-bit transaction IDs that wrap around
- Old tuples must be "frozen" to prevent XID wraparound
- relfrozenxid tracks the oldest unfrozen XID in each table
- datfrozenxid tracks the oldest unfrozen XID in each database
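Because XIDs live on a 32-bit circle, "older than" cannot be a plain integer comparison. The following is a minimal sketch of a modulo-2^32 comparison, assuming both inputs are normal (non-special) XIDs. The names Xid, xid_precedes and xid_age are illustrative and are not PostgreSQL's own identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for a 32-bit transaction ID. */
typedef uint32_t Xid;

/* XIDs below 3 are reserved for special purposes. */
#define FIRST_NORMAL_XID 3u

/*
 * Returns true if a is logically older than b on the 32-bit XID circle.
 * The unsigned difference is reinterpreted as signed, so every XID sees
 * about 2^31 XIDs "in the past" and 2^31 "in the future".
 */
static bool
xid_precedes(Xid a, Xid b)
{
    int32_t diff;

    assert(a >= FIRST_NORMAL_XID && b >= FIRST_NORMAL_XID);
    diff = (int32_t) (a - b);
    return diff < 0;
}

/* Number of XIDs consumed between xid and next_xid (hypothetical helper). */
static uint32_t
xid_age(Xid next_xid, Xid xid)
{
    /* Unsigned subtraction wraps modulo 2^32, which is what we want. */
    return next_xid - xid;
}
```

A tuple whose XID falls more than about 2^31 behind the current XID would flip from "past" to "future" under this comparison, which is why old tuples have to be frozen before that point.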
Multi-Transaction ID (MXID) Management
- Used for row-level locking with multiple transactions
- Similar wraparound concerns as XIDs
- relminmxid and datminmxid track the oldest MXIDs
Visibility and Free Space Maps
- Visibility Map (VM): Tracks which pages are all-visible and all-frozen
- Free Space Map (FSM): Tracks available free space on pages
- Used to optimize vacuum operations by skipping unnecessary pages
Vacuum Types
- Normal Vacuum: Reclaims space and updates statistics
- Aggressive Vacuum: Must advance relfrozenxid/relminmxid
- Full Vacuum: Rewrites entire table (like CLUSTER)
Vacuum Implementation
Main Entry Points
ExecVacuum() - Command Processing
Located in src/backend/commands/vacuum.c
void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
Responsibilities:
- Parse VACUUM command options
- Validate parameters
- Set up vacuum parameters structure
- Create memory context for vacuum operations
- Call the main vacuum() function
vacuum() - Core Vacuum Logic
void vacuum(List *relations, const VacuumParams params,
BufferAccessStrategy bstrategy, MemoryContext vac_context,
bool isTopLevel)
Responsibilities:
- Expand relation list (handle inheritance)
- Determine transaction strategy
- Process each relation via vacuum_rel()
- Update database-wide statistics
Heap Vacuum Implementation
heap_vacuum_rel() - Table-Specific Vacuum
Located in src/backend/access/heap/vacuumlazy.c
Main phases:
- Setup Phase: Initialize vacuum state, determine aggressiveness
- Scan Phase: Scan heap pages, prune tuples, collect dead items
- Index Vacuum Phase: Remove dead index entries
- Heap Vacuum Phase: Mark dead items as unused
- Cleanup Phase: Update statistics, truncate if possible
Key Functions
lazy_scan_heap() - Main Scanning Logic
- Scans relation pages using read streams
- Calls lazy_scan_prune() or lazy_scan_noprune() for each page
- Manages dead item collection and index vacuuming cycles
- Updates visibility and free space maps
lazy_scan_prune() - Page Pruning and Freezing
- Requires cleanup lock on buffer
- Prunes HOT chains and freezes tuples
- Updates visibility map bits
- Collects LP_DEAD items for index vacuuming
lazy_scan_noprune() - Lightweight Page Processing
- Only requires shared lock
- Counts tuples and collects existing LP_DEAD items
- Used when cleanup lock unavailable
- May return false to force full processing
Vacuum Phases
Phase I: Heap Scanning
- Scan relation pages sequentially
- Skip pages based on visibility map
- Prune dead tuples and freeze old tuples
- Collect TIDs of dead items in TidStore
- Update page-level visibility information
Phase II: Index Vacuuming
- Process collected dead TIDs
- Call ambulkdelete() for each index
- Remove index entries pointing to dead tuples
- May run in parallel for multiple indexes
Phase III: Heap Vacuuming
- Mark LP_DEAD items as LP_UNUSED
- Truncate line pointer arrays where possible
- Update free space map
- Set visibility map bits for newly all-visible pages
Eager Scanning Algorithm
Normal (non-aggressive) vacuums use eager scanning to freeze pages proactively:
Success Limiting
- Cap successful eager freezes to MAX_EAGER_FREEZE_SUCCESS_RATE (20%) of all-visible but not all-frozen pages
- Prevents excessive work in a single vacuum cycle
- Amortizes freezing cost across multiple vacuum operations
Failure Limiting
- Use regional failure caps based on EAGER_SCAN_REGION_SIZE (4096 blocks)
- Suspend eager scanning in a region after too many failures
- Configurable via vacuum_max_eager_freeze_failure_rate
Implementation Details
static void heap_vacuum_eager_scan_setup(LVRelState *vacrel, const VacuumParams params)
- Initializes eager scan state
- Calculates success and failure limits
- Sets random starting region to avoid patterns
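The success and failure limits described above can be sketched as two small calculations. This is a hedged illustration and not the body of heap_vacuum_eager_scan_setup(): the EagerScanLimits struct and the eager_scan_limits() function are hypothetical names, and the page counts are assumed to come from the visibility map, with all-frozen pages also counted as all-visible.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_EAGER_FREEZE_SUCCESS_RATE 0.2   /* 20%, as stated in the text */
#define EAGER_SCAN_REGION_SIZE        4096  /* blocks per region, as stated */

typedef uint32_t BlockNumber;

/* Hypothetical container for the two limits. */
typedef struct EagerScanLimits
{
    BlockNumber max_successes;        /* cap for the whole vacuum */
    BlockNumber max_fails_per_region; /* cap for each 4096-block region */
} EagerScanLimits;

/*
 * allvisible_pages is assumed to include the all-frozen pages, so the
 * difference is the "all-visible but not all-frozen" population.
 * max_failure_rate plays the role of vacuum_max_eager_freeze_failure_rate.
 */
static EagerScanLimits
eager_scan_limits(BlockNumber allvisible_pages,
                  BlockNumber allfrozen_pages,
                  double max_failure_rate)
{
    EagerScanLimits limits;

    assert(allfrozen_pages <= allvisible_pages);
    assert(max_failure_rate >= 0.0 && max_failure_rate <= 1.0);

    limits.max_successes = (BlockNumber)
        (MAX_EAGER_FREEZE_SUCCESS_RATE * (allvisible_pages - allfrozen_pages));
    limits.max_fails_per_region = (BlockNumber)
        (max_failure_rate * EAGER_SCAN_REGION_SIZE);
    return limits;
}
```

With 1500 all-visible pages of which 500 are already frozen, at most 200 pages may be eagerly frozen in this vacuum. A failure rate of 0.03 (an example value) allows 122 failed attempts per region before eager scanning pauses until the next region.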
Autovacuum Implementation
Architecture
Autovacuum uses a launcher/worker model:
- Autovacuum Launcher: Schedules and manages workers
- Autovacuum Workers: Perform actual vacuum operations
Autovacuum Launcher
AutoVacLauncherMain() - Main Launcher Process
Located in src/backend/postmaster/autovacuum.c
Responsibilities:
- Maintain database list with scheduling information
- Determine when to launch workers
- Handle worker lifecycle management
- Rebalance cost limits across workers
Database Scheduling
static void rebuild_database_list(Oid newdb)
- Builds prioritized list of databases
- Distributes vacuum times across the autovacuum_naptime interval
- Considers database age and last vacuum time
- Handles wraparound emergencies with priority
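To illustrate how start times might be spread across the autovacuum_naptime interval, here is a simplified model. It assumes the interval is divided evenly among the databases in the list and that each database is scheduled one increment after the previous one. The function name next_worker_offset_ms is hypothetical, and details such as a minimum sleep time are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Offset, in milliseconds from "now", at which the database in slot
 * `position` (0-based) of the prioritized list is next due for a worker.
 * Assumption: the naptime is split into equal increments, one per database.
 */
static int64_t
next_worker_offset_ms(int naptime_seconds, int ndatabases, int position)
{
    int64_t increment_ms;

    assert(naptime_seconds > 0 && ndatabases > 0);
    assert(position >= 0 && position < ndatabases);

    increment_ms = (int64_t) naptime_seconds * 1000 / ndatabases;
    return increment_ms * (position + 1);
}
```

With a 60-second naptime and three databases, this model schedules workers 20, 40 and 60 seconds from now, so each database is visited about once per naptime without all workers starting at once.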
Worker Management
static Oid do_start_worker(void)
- Selects database needing vacuum
- Prioritizes wraparound prevention
- Considers recent vacuum activity
- Signals postmaster to fork worker
Autovacuum Workers
AutoVacWorkerMain() - Worker Process Entry
- Connects to assigned database
- Scans pg_class for tables needing vacuum/analyze
- Applies autovacuum thresholds and settings
- Performs vacuum/analyze operations
Table Selection Logic
static void relation_needs_vacanalyze(Oid relid, AutoVacOpts *relopts, ...)
Vacuum thresholds:
vacuum_threshold = base_threshold + scale_factor * reltuples
insert_vacuum_threshold = base_threshold + scale_factor * reltuples * unfrozen_ratio
Analyze thresholds:
analyze_threshold = base_threshold + scale_factor * reltuples
Wraparound prevention:
- Force vacuum when relfrozenxid age exceeds autovacuum_freeze_max_age
- Force vacuum when relminmxid age exceeds autovacuum_multixact_freeze_max_age
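The threshold formulas and the wraparound override can be combined into a small decision helper. This is a sketch under stated assumptions and not the code of relation_needs_vacanalyze(): the function names are hypothetical, tuple counts are passed in directly instead of being read from the statistics system, and only the XID (not the MXID) override is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* base_threshold + scale_factor * reltuples, as in the formulas above. */
static double
av_threshold(double base_threshold, double scale_factor, double reltuples)
{
    assert(reltuples >= 0.0);
    return base_threshold + scale_factor * reltuples;
}

/* Insert-driven variant: only the unfrozen share of the table counts. */
static double
av_insert_threshold(double base_threshold, double scale_factor,
                    double reltuples, double unfrozen_ratio)
{
    assert(unfrozen_ratio >= 0.0 && unfrozen_ratio <= 1.0);
    return base_threshold + scale_factor * reltuples * unfrozen_ratio;
}

/*
 * A vacuum is due when dead tuples exceed the threshold, or when the
 * age of relfrozenxid exceeds the freeze-max-age limit (forced vacuum).
 */
static bool
needs_vacuum(double dead_tuples, double threshold,
             uint32_t relfrozenxid_age, uint32_t freeze_max_age)
{
    bool force = relfrozenxid_age > freeze_max_age;

    return force || dead_tuples > threshold;
}
```

For a table with 10,000 tuples, a base threshold of 50 and a scale factor of 0.2, the vacuum threshold works out to 50 + 0.2 * 10000 = 2050 dead tuples. A table far below that count is still vacuumed once its relfrozenxid age passes the limit.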
Key Data Structures
VacuumParams
typedef struct VacuumParams {
bits32 options; // VACUUM options bitmask
int freeze_min_age; // Minimum age for freezing
int freeze_table_age; // Age for aggressive vacuum
int multixact_freeze_min_age; // MXID freeze minimum age
int multixact_freeze_table_age; // MXID freeze table age
bool is_wraparound; // Wraparound prevention vacuum
int log_min_duration; // Logging threshold
VacOptValue index_cleanup; // Index cleanup setting
VacOptValue truncate; // Table truncation setting
int nworkers; // Parallel workers
} VacuumParams;
VacuumCutoffs
struct VacuumCutoffs {
TransactionId relfrozenxid; // Current table frozen XID
MultiXactId relminmxid; // Current table min MXID
TransactionId OldestXmin; // Oldest visible XID
MultiXactId OldestMxact; // Oldest visible MXID
TransactionId FreezeLimit; // XID freeze threshold
MultiXactId MultiXactCutoff; // MXID freeze threshold
};
LVRelState
typedef struct LVRelState {
Relation rel; // Target relation
Relation *indrels; // Index relations
int nindexes; // Number of indexes
bool aggressive; // Aggressive vacuum?
bool skipwithvm; // Use visibility map?
struct VacuumCutoffs cutoffs; // Freeze/prune cutoffs
TidStore *dead_items; // Dead tuple TIDs
// Statistics and counters
BlockNumber rel_pages; // Total pages
BlockNumber scanned_pages; // Pages examined
double new_rel_tuples; // Estimated tuple count
int64 tuples_deleted; // Deleted tuples
int64 tuples_frozen; // Frozen tuples
// Eager scanning state
BlockNumber eager_scan_remaining_successes;
BlockNumber eager_scan_remaining_fails;
BlockNumber next_eager_scan_region_start;
} LVRelState;
AutoVacuumShmemStruct
typedef struct {
sig_atomic_t av_signal[AutoVacNumSignals]; // IPC signals
pid_t av_launcherpid; // Launcher PID
dclist_head av_freeWorkers; // Available workers
dlist_head av_runningWorkers; // Active workers
WorkerInfo av_startingWorker; // Worker being started
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS]; // Work queue
pg_atomic_uint32 av_nworkersForBalance; // Workers for cost balancing
} AutoVacuumShmemStruct;
Vacuum Process Flow
Manual VACUUM Flow
Command Parsing (ExecVacuum)
- Parse options and parameters
- Validate settings
- Create memory context
Relation Processing (vacuum)
- Expand relation list
- Start transactions as needed
- Process each relation
Table Vacuum (vacuum_rel)
- Open and lock relation
- Check permissions
- Call table AM vacuum function
Heap Vacuum (heap_vacuum_rel)
- Determine aggressiveness
- Set up parallel workers if enabled
- Scan heap pages
- Vacuum indexes and heap
- Update statistics
Heap Scanning Flow
Page Selection (heap_vac_scan_next_block)
- Use visibility map to skip pages
- Apply eager scanning logic
- Return next block to process
Page Processing
- Try to get cleanup lock
- Call lazy_scan_prune or lazy_scan_noprune
- Update visibility and free space maps
Dead Item Management
- Collect dead TIDs in TidStore
- Trigger index vacuum when full
- Reset dead items after processing
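The collect, vacuum, reset cycle above can be modeled with a bounded store. The sketch below is a simplification: it bounds the store by item count, whereas the real TidStore is bounded by memory, and the function name count_index_vacuum_cycles and its inputs are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Counts how many index-vacuum cycles a heap scan would need when the
 * dead-item store holds at most `capacity` items.
 * dead_per_page[i] is the number of dead items found on page i.
 */
static int
count_index_vacuum_cycles(const int *dead_per_page, size_t npages, int capacity)
{
    int stored = 0;
    int cycles = 0;

    assert(capacity > 0);

    for (size_t i = 0; i < npages; i++)
    {
        assert(dead_per_page[i] >= 0 && dead_per_page[i] <= capacity);

        /* Store would overflow: vacuum indexes and heap, then reset. */
        if (stored + dead_per_page[i] > capacity)
        {
            cycles++;
            stored = 0;
        }
        stored += dead_per_page[i];
    }

    /* One final cycle for whatever remains after the last page. */
    if (stored > 0)
        cycles++;

    return cycles;
}
```

Each extra cycle means another full pass over every index, which is why giving vacuum more memory for dead items tends to reduce total work on tables with many dead tuples.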
Autovacuum Process Flow
Launcher Flow
Initialization
- Set up signal handlers
- Create database list
- Enter main loop
Scheduling Loop
- Determine sleep time
- Wait for events or timeout
- Check for worker completion
- Launch new workers as needed
Worker Launch
- Select database needing vacuum
- Find available worker slot
- Signal postmaster to start worker
Worker Flow
Startup
- Connect to assigned database
- Set up vacuum parameters
- Scan system catalogs
Table Selection
- Check each table against thresholds
- Prioritize wraparound prevention
- Apply per-table settings
Vacuum Execution
- Call standard vacuum functions
- Use autovacuum-specific parameters
- Report progress and statistics
Configuration Parameters
Core Vacuum Parameters
- vacuum_freeze_min_age (50M): Minimum age for tuple freezing
- vacuum_freeze_table_age (150M): Age for aggressive vacuum
- vacuum_multixact_freeze_min_age (5M): MXID freeze minimum
- vacuum_multixact_freeze_table_age (150M): MXID aggressive threshold
- vacuum_failsafe_age (1.6B): Emergency failsafe trigger
- vacuum_cost_delay (0): Delay between operations
- vacuum_cost_limit (200): Cost accumulation limit
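As an illustration of how vacuum_freeze_table_age interacts with autovacuum_freeze_max_age, the sketch below decides whether a vacuum should be aggressive. It works with ages (XIDs consumed since relfrozenxid) rather than raw XIDs, and it assumes the effective table age is capped at 95% of autovacuum_freeze_max_age, as the PostgreSQL documentation describes for vacuum_freeze_table_age. The function name vacuum_is_aggressive is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when the table's relfrozenxid is old enough that the
 * vacuum must scan every page that is not all-frozen.
 */
static bool
vacuum_is_aggressive(uint32_t relfrozenxid_age,
                     uint32_t freeze_table_age,
                     uint32_t freeze_max_age)
{
    uint32_t clamp;
    uint32_t effective;

    assert(freeze_max_age > 0);

    /* Assumed 95% cap; integer math avoids floating-point rounding. */
    clamp = (uint32_t) ((uint64_t) freeze_max_age * 95 / 100);
    effective = freeze_table_age < clamp ? freeze_table_age : clamp;

    return relfrozenxid_age >= effective;
}
```

With the defaults listed above (150M and 200M), the cap of 190M has no effect and a vacuum turns aggressive once the table's age reaches 150M. Setting vacuum_freeze_table_age above 190M would be silently limited by the cap in this model.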
Autovacuum Parameters
- autovacuum (on): Enable autovacuum
- autovacuum_naptime (1min): Time between launcher runs
- autovacuum_max_workers (3): Maximum worker processes
- autovacuum_work_mem (-1): Memory per worker
- autovacuum_vacuum_threshold (50): Base vacuum threshold
- autovacuum_vacuum_scale_factor (0.2): Vacuum scale factor
- autovacuum_analyze_threshold (50): Base analyze threshold
- autovacuum_analyze_scale_factor (0.1): Analyze scale factor
- autovacuum_freeze_max_age (200M): XID wraparound threshold
- autovacuum_multixact_freeze_max_age (400M): MXID wraparound threshold
Per-Table Settings
Tables can override autovacuum settings via storage parameters:
- autovacuum_enabled
- autovacuum_vacuum_threshold
- autovacuum_vacuum_scale_factor
- autovacuum_analyze_threshold
- autovacuum_analyze_scale_factor
- autovacuum_vacuum_cost_delay
- autovacuum_vacuum_cost_limit
- autovacuum_freeze_min_age
- autovacuum_freeze_max_age
- autovacuum_freeze_table_age
Performance Considerations
Cost-Based Delay
- Prevents vacuum from overwhelming I/O system
- Balances cost across multiple workers
- Configurable delay and limit parameters
- Disabled during failsafe mode
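The throttling idea can be sketched as a running cost balance that triggers a sleep whenever it reaches the limit. This is a simplified model with hypothetical function names: every page is charged the same cost, the balance resets to zero after each sleep, and the limit is assumed to be split evenly across the workers that take part in balancing.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Number of cost-based sleeps taken while processing npages pages, each
 * charged cost_per_page units. In failsafe mode no delay is applied.
 */
static int
count_cost_delays(int npages, int cost_per_page, int cost_limit, bool failsafe)
{
    int balance = 0;
    int sleeps = 0;

    assert(npages >= 0 && cost_per_page >= 0 && cost_limit > 0);

    if (failsafe)
        return 0;

    for (int i = 0; i < npages; i++)
    {
        balance += cost_per_page;
        if (balance >= cost_limit)
        {
            sleeps++;       /* the caller would sleep for the cost delay here */
            balance = 0;
        }
    }
    return sleeps;
}

/* Per-worker share of the limit, assuming an even split, never below 1. */
static int
balanced_cost_limit(int cost_limit, int nworkers)
{
    int share;

    assert(cost_limit > 0 && nworkers > 0);
    share = cost_limit / nworkers;
    return share > 0 ? share : 1;
}
```

In this model, dirtying 100 pages at 20 units each with a limit of 200 produces 10 sleeps. With three workers sharing the same limit of 200, each gets 66 units and therefore sleeps about three times as often.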
Parallel Vacuum
- Supports parallel index vacuuming
- Requires multiple indexes
- Shares dead item storage in DSM
- Coordinates via shared memory
Buffer Access Strategy
- Uses ring buffer to limit cache pollution
- Configurable via
vacuum_buffer_usage_limit
- Bypassed during failsafe mode
Visibility Map Optimization
- Skips all-visible pages during normal vacuum
- Tracks all-frozen pages for aggressive vacuum
- Reduces I/O for large, stable tables
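The skipping rule can be written as a check on the two visibility map bits kept per heap page. This sketch covers only the basic rule stated above. The function name page_is_skippable is hypothetical, and the real scan applies further conditions (for example eager scanning, and skipping only when enough consecutive pages qualify).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values mirror PostgreSQL's visibility map flags. */
#define VISIBILITYMAP_ALL_VISIBLE 0x01
#define VISIBILITYMAP_ALL_FROZEN  0x02

/*
 * Normal vacuum may skip any all-visible page; aggressive vacuum may skip
 * only all-frozen pages, because it has to advance relfrozenxid.
 */
static bool
page_is_skippable(uint8_t vmbits, bool aggressive)
{
    /* Only the two known bits may be set. */
    assert((vmbits & ~(VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN)) == 0);

    if (vmbits & VISIBILITYMAP_ALL_FROZEN)
        return true;
    if (aggressive)
        return false;
    return (vmbits & VISIBILITYMAP_ALL_VISIBLE) != 0;
}
```

A page that is all-visible but not all-frozen is the interesting case: a normal vacuum skips it, while an aggressive vacuum must read it to freeze its tuples.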
Troubleshooting
Common Issues
Transaction ID Wraparound
- Monitor age(relfrozenxid) and age(datfrozenxid)
- Increase autovacuum_freeze_max_age if forced anti-wraparound vacuums run too often (this defers them; it does not lower wraparound risk)
- Check for long-running transactions blocking vacuum
Autovacuum Not Running
- Verify autovacuum = on
- Check track_counts = on
- Monitor pg_stat_user_tables for last vacuum times
- Review autovacuum thresholds
Poor Vacuum Performance
- Increase maintenance_work_mem or autovacuum_work_mem
- Adjust cost-based delay parameters
- Consider parallel vacuum for large tables
- Monitor I/O and lock contention
Monitoring Queries
Check Table Vacuum Status
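-- pg_stat_user_tables exposes relname (not tablename) and has no
-- relfrozenxid column, so join to pg_class to read it.
SELECT s.schemaname, s.relname,
       s.last_vacuum, s.last_autovacuum,
       s.n_dead_tup, s.n_live_tup,
       age(c.relfrozenxid) AS xid_age
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
ORDER BY xid_age DESC;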
Monitor Autovacuum Activity
-- Filter on backend_type so the monitoring query does not match itself.
SELECT pid, state, query_start, query
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker';
Check Wraparound Status
SELECT datname, age(datfrozenxid),
2^31 - age(datfrozenxid) as xids_remaining
FROM pg_database
ORDER BY age(datfrozenxid) DESC;
Debug Settings
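- log_autovacuum_min_duration: Log autovacuum operations that run at least this long; setting it to 0 logs every autovacuum action, which serves as verbose autovacuum logging
- VACUUM (VERBOSE): Detailed output for a manual vacuum (a command option, not a configuration parameter)
- track_cost_delay_timing: Monitor cost-based delays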
Implementation Files
Core Files
- src/backend/commands/vacuum.c - Main vacuum command processing
- src/backend/access/heap/vacuumlazy.c - Heap vacuum implementation
- src/backend/postmaster/autovacuum.c - Autovacuum launcher and workers
- src/backend/commands/vacuumparallel.c - Parallel vacuum coordination
Header Files
- src/include/commands/vacuum.h - Vacuum function declarations
- src/include/postmaster/autovacuum.h - Autovacuum declarations
Related Files
- src/backend/access/heap/pruneheap.c - Heap pruning and freezing
- src/backend/storage/freespace/freespace.c - Free space map
- src/backend/access/common/reloptions.c - Storage parameters
- src/backend/utils/adt/pgstatfuncs.c - Statistics functions
This documentation provides a comprehensive overview of PostgreSQL's vacuum and autovacuum implementation, covering both the high-level concepts and detailed implementation specifics.