Add snapshot backup support for PostgreSQL by x4m · Pull Request #2101 · wal-g/wal-g

x4m · 2025-11-01T17:48:49Z

Implement snapshot backups that leverage filesystem-level or cloud disk snapshots (e.g., AWS EBS, Azure Managed Disks, GCP Persistent Disks, ZFS, LVM) for creating PostgreSQL backups while maintaining proper database consistency and point-in-time recovery capabilities.

Snapshot backups store only metadata in WAL-G storage while the actual database files remain in externally managed snapshots. This approach provides near-instantaneous backups regardless of database size and significantly reduces storage costs in WAL-G's object storage.

Key components:

1. snapshot-push command: Coordinates backup creation by calling pg_start_backup() (renamed pg_backup_start() in PostgreSQL 15), executing a user-defined snapshot command, calling pg_stop_backup() (pg_backup_stop() in PostgreSQL 15), and uploading metadata. The snapshot command receives environment variables (WALG_SNAPSHOT_NAME, WALG_PG_DATA, WALG_SNAPSHOT_START_LSN, WALG_SNAPSHOT_START_WAL_FILE) for snapshot tagging and identification.
2. snapshot-fetch command: Prepares a restored snapshot for PostgreSQL recovery by creating backup_label and tablespace_map files from stored metadata. Supports automatic recovery configuration for both PostgreSQL 12+ (recovery.signal) and earlier versions (recovery.conf), with an optional point-in-time recovery target.
3. Automatic WAL protection: Safety feature that prevents deletion of WAL segments required by snapshot backups. During any delete operation, WAL-G identifies all snapshot backups and protects their required WAL range (start LSN to finish LSN) from deletion, so snapshot backups remain recoverable even with aggressive retention policies.
4. Exact backup_label preservation: Stores the exact content returned by pg_stop_backup() for the backup_label and tablespace_map files rather than reconstructing them. This avoids depending on the file format of any particular PostgreSQL version, including future format changes, because the files are written back exactly as PostgreSQL produced them.
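
To illustrate the snapshot-fetch step described above, here is a minimal sketch, not the PR's actual code. The `snapshotMetadata` type, the `prepareDataDir` and `recoveryMarker` helpers, and the `restoreCommand` parameter are hypothetical names introduced for illustration. It writes the stored backup_label and tablespace_map content back verbatim and creates the recovery marker file appropriate for the PostgreSQL major version.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// snapshotMetadata is a hypothetical stand-in for the sentinel fields
// mentioned in the description (BackupLabel, TablespaceMap).
type snapshotMetadata struct {
	BackupLabel   string
	TablespaceMap string // empty when the cluster has no extra tablespaces
}

// recoveryMarker returns the file that puts PostgreSQL into archive recovery:
// recovery.signal on PostgreSQL 12+, recovery.conf on older versions.
func recoveryMarker(pgMajor int) string {
	if pgMajor >= 12 {
		return "recovery.signal"
	}
	return "recovery.conf"
}

// prepareDataDir writes the stored files verbatim into a restored data directory.
func prepareDataDir(pgData string, meta snapshotMetadata, pgMajor int, restoreCommand string) error {
	if meta.BackupLabel == "" {
		return fmt.Errorf("snapshot metadata has no backup_label content")
	}
	files := map[string]string{"backup_label": meta.BackupLabel}
	if meta.TablespaceMap != "" {
		files["tablespace_map"] = meta.TablespaceMap
	}
	// On 12+ the marker is an empty file and restore_command must be set in
	// the server configuration; before 12 it lives in recovery.conf itself.
	marker := ""
	if pgMajor < 12 {
		escaped := strings.ReplaceAll(restoreCommand, "'", "''")
		marker = fmt.Sprintf("restore_command = '%s'\n", escaped)
	}
	files[recoveryMarker(pgMajor)] = marker
	for name, content := range files {
		path := filepath.Join(pgData, name)
		if err := os.WriteFile(path, []byte(content), 0o600); err != nil {
			return fmt.Errorf("write %s: %w", name, err)
		}
	}
	return nil
}

func main() {
	dir, err := os.MkdirTemp("", "pgdata")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)

	// Placeholder content; real content comes from pg_stop_backup().
	meta := snapshotMetadata{BackupLabel: "placeholder backup_label content\n"}
	if err := prepareDataDir(dir, meta, 16, ""); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("prepared", dir, "with", recoveryMarker(16))
}
```

Writing the label back byte-for-byte, rather than rebuilding it from parsed fields, is the design choice item 4 argues for.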

Implementation stores snapshot metadata in BackupSentinelDto with new BackupLabel and TablespaceMap fields. Snapshot backups are identified by FilesMetadataDisabled=true, CompressedSize=0, and presence of BackupLabel content. Delete operations check IsSnapshotBackup() and protect required WAL segments through GetPermanentBackupsAndWals() modifications.
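
The identification rule and the protected WAL range can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the `sentinel` struct is a trimmed-down hypothetical stand-in for BackupSentinelDto, `protectedSegments` is an invented helper, and the default 16 MiB WAL segment size is assumed.

```go
package main

import "fmt"

// Assumption: default 16 MiB WAL segments (no custom --wal-segsize).
const walSegmentSize = 16 * 1024 * 1024

// sentinel is a hypothetical, simplified stand-in for BackupSentinelDto.
type sentinel struct {
	FilesMetadataDisabled bool
	CompressedSize        int64
	BackupLabel           string
	Timeline              uint32
	StartLSN              uint64
	FinishLSN             uint64
}

// isSnapshotBackup applies the three conditions named in the description.
func isSnapshotBackup(s sentinel) bool {
	return s.FilesMetadataDisabled && s.CompressedSize == 0 && s.BackupLabel != ""
}

// walSegmentName formats a segment number the way PostgreSQL names WAL files:
// timeline, high part, low part, each as 8 hex digits.
func walSegmentName(timeline uint32, segNo uint64) string {
	segmentsPerXLogID := uint64(0x100000000) / walSegmentSize
	return fmt.Sprintf("%08X%08X%08X", timeline, segNo/segmentsPerXLogID, segNo%segmentsPerXLogID)
}

// protectedSegments lists every segment from the one containing StartLSN
// through the one containing FinishLSN. Non-snapshot backups yield nothing.
func protectedSegments(s sentinel) []string {
	if !isSnapshotBackup(s) || s.FinishLSN < s.StartLSN {
		return nil
	}
	first := s.StartLSN / walSegmentSize
	last := s.FinishLSN / walSegmentSize
	names := make([]string, 0, last-first+1)
	for seg := first; seg <= last; seg++ {
		names = append(names, walSegmentName(s.Timeline, seg))
	}
	return names
}

func main() {
	s := sentinel{
		FilesMetadataDisabled: true,
		BackupLabel:           "placeholder backup_label content",
		Timeline:              1,
		StartLSN:              0x2000028, // illustrative LSN 0/2000028
		FinishLSN:             0x4000100, // illustrative LSN 0/4000100
	}
	for _, name := range protectedSegments(s) {
		fmt.Println("keep", name)
	}
}
```

A delete pass would then skip any WAL object whose name appears in the union of these lists across all snapshot backups.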

Configuration uses WALG_SNAPSHOT_COMMAND (required) for snapshot creation and WALG_SNAPSHOT_DELETE_COMMAND (optional) for cleanup. Commands execute in shell context with environment variables for maximum flexibility across different infrastructure providers.
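
A minimal sketch of running a user-defined command in a shell with extra environment variables might look like this. The helper name `runSnapshotCommand` is hypothetical and the variable values in `main` are illustrative only; the sketch assumes `sh` is available on the host.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// runSnapshotCommand runs command via `sh -c`, adding vars on top of the
// current environment, and returns the combined stdout/stderr.
func runSnapshotCommand(command string, vars map[string]string) (string, error) {
	if command == "" {
		return "", fmt.Errorf("snapshot command is not set")
	}
	cmd := exec.Command("sh", "-c", command)
	cmd.Env = os.Environ()
	for key, value := range vars {
		// Later entries override earlier ones with the same key.
		cmd.Env = append(cmd.Env, key+"="+value)
	}
	out, err := cmd.CombinedOutput()
	if err != nil {
		return string(out), fmt.Errorf("snapshot command failed: %w", err)
	}
	return string(out), nil
}

func main() {
	vars := map[string]string{
		"WALG_SNAPSHOT_NAME":           "example_snapshot", // illustrative value
		"WALG_PG_DATA":                 "/var/lib/postgresql/data",
		"WALG_SNAPSHOT_START_LSN":      "0/2000028",
		"WALG_SNAPSHOT_START_WAL_FILE": "000000010000000000000002",
	}
	out, err := runSnapshotCommand(`echo "snapshot $WALG_SNAPSHOT_NAME of $WALG_PG_DATA"`, vars)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Passing values through the environment rather than interpolating them into the command string keeps quoting problems out of the user's script.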

Testing includes comprehensive snapshot_test.sh with 10 test cases covering backup creation, restoration, PITR, deletion, retention policies, and WAL protection verification. Tests use cp -al (hardlinks) to emulate filesystem snapshots without requiring cloud infrastructure.
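
For readers unfamiliar with the hardlink trick, the following sketch shows roughly what `cp -al` does, written in Go. It is not taken from the test script; `hardlinkTree` and `selfCheck` are hypothetical helpers, and symlinks (such as those under pg_tblspc) are skipped for brevity.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// hardlinkTree mirrors src into dst: directories are created, regular files
// are hard-linked, so no file data is copied.
func hardlinkTree(src, dst string) error {
	return filepath.WalkDir(src, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(src, path)
		if err != nil {
			return err
		}
		target := filepath.Join(dst, rel)
		if d.IsDir() {
			return os.MkdirAll(target, 0o700)
		}
		if !d.Type().IsRegular() {
			return nil // symlinks and special files are skipped in this sketch
		}
		return os.Link(path, target)
	})
}

// selfCheck builds a tiny tree, links it, and verifies both paths refer to
// the same underlying file.
func selfCheck() error {
	root, err := os.MkdirTemp("", "snapshot-emulation")
	if err != nil {
		return err
	}
	defer os.RemoveAll(root)

	src := filepath.Join(root, "pgdata")
	if err := os.MkdirAll(filepath.Join(src, "base"), 0o700); err != nil {
		return err
	}
	if err := os.WriteFile(filepath.Join(src, "base", "1"), []byte("page"), 0o600); err != nil {
		return err
	}
	dst := filepath.Join(root, "snapshot")
	if err := hardlinkTree(src, dst); err != nil {
		return err
	}
	a, err := os.Stat(filepath.Join(src, "base", "1"))
	if err != nil {
		return err
	}
	b, err := os.Stat(filepath.Join(dst, "base", "1"))
	if err != nil {
		return err
	}
	if !os.SameFile(a, b) {
		return fmt.Errorf("expected a hard link, got a separate file")
	}
	return nil
}

func main() {
	if err := selfCheck(); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("hardlink snapshot emulation ok")
}
```

Because hard links share data with the original, a file modified in place changes in both trees; that makes this suitable for tests, not as a real snapshot mechanism.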

Documentation in docs/PostgreSQL_Snapshot.md provides complete usage examples for major cloud providers (AWS, Azure, GCP) and on-premises solutions (ZFS, LVM), along with technical implementation details, security considerations, and best practices.

Snapshot backups integrate seamlessly with existing WAL-G features including backup-list, delete commands, permanent backup flag, encryption, compression (for WAL files), and multiple storage backends.

Author: Cursor, Sonnet 4.5, some whacking by me
Discussion: #1781

boosterKRD · 2025-11-02T09:03:59Z

+	}
+
+	if inRecovery {
+		return errors.New("Cannot perform snapshot backup on a standby server")


Possible issue: when pg_start_backup() runs successfully but returns inRecovery = true, the function immediately returns an error without calling pg_stop_backup().
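
One way to rule out this class of bug is to register the stop call with `defer` as soon as the start call succeeds. The sketch below is hypothetical and not the PR's code: `backupSession`, `runSnapshotBackup`, `allowStandby`, and `fakeSession` are invented names used only to show the pattern.

```go
package main

import (
	"errors"
	"fmt"
)

// backupSession is a hypothetical interface over the start/stop backup calls.
type backupSession interface {
	StartBackup(label string) (inRecovery bool, err error)
	StopBackup() error
}

// runSnapshotBackup guarantees StopBackup runs on every path after a
// successful StartBackup, including early returns.
func runSnapshotBackup(s backupSession, label string, allowStandby bool, takeSnapshot func() error) (err error) {
	inRecovery, err := s.StartBackup(label)
	if err != nil {
		return fmt.Errorf("start backup: %w", err)
	}
	defer func() {
		if stopErr := s.StopBackup(); stopErr != nil {
			err = errors.Join(err, fmt.Errorf("stop backup: %w", stopErr))
		}
	}()
	if inRecovery && !allowStandby {
		return errors.New("cannot perform snapshot backup on a standby server")
	}
	if err := takeSnapshot(); err != nil {
		return fmt.Errorf("snapshot command: %w", err)
	}
	return nil
}

// fakeSession records how often StopBackup was called.
type fakeSession struct {
	inRecovery bool
	stopCalls  int
}

func (f *fakeSession) StartBackup(string) (bool, error) { return f.inRecovery, nil }
func (f *fakeSession) StopBackup() error                { f.stopCalls++; return nil }

func main() {
	s := &fakeSession{inRecovery: true}
	err := runSnapshotBackup(s, "snapshot", false, func() error { return nil })
	fmt.Println("error:", err)
	fmt.Println("stop calls:", s.stopCalls)
}
```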

thanks for reviewing! Well, this code was written by Cursor and it is...well... I see no reason to disallow snapshot backup on standby.

Hi!
I thought for a while about why standby is forbidden, but in the end I accepted the "religion of the code." Probably for the best: a snapshot puts no real load on the primary, so using a replica doesn't add much value.

If you run pg_stop_backup() on a replica, it finishes immediately without waiting for the WAL segments (the range mentioned in backup_label) to be archived, especially since archive_mode is usually disabled on replicas.
So you get a "successful" backup, but at restore time some required WAL segments might be missing.

On the primary, pg_stop_backup() by default waits until all required WAL is archived, ensuring consistency.

From my view: if backups on replicas are ever allowed, the docs must clearly state that users are fully responsible for controlling WAL archiving.

We have supported backups on standby since 2017, so there's no going back to forbidding them anywhere :)

ostinru · 2025-11-03T18:51:07Z

 	errorGroup, _ := errgroup.WithContext(ctx)
 	errorGroup.Go(func() error {
-		err := json2.MarshalWrite(writer, data)
+		err := json.NewEncoder(writer).Encode(data)


Is there any reason to roll back #2056?
If it breaks compilation, upgrade to Go 1.25 and add GOEXPERIMENT=jsonv2 to your env (this part is not obvious).

Yup, this is a bogus change; I'll keep json2.

ostinru · 2025-11-03T19:04:11Z

+
+	sbh.QueryRunner, err = NewPgQueryRunner(conn)
+	if err != nil {
+		return errors.Wrap(err, "failed to build query runner")


NIT: it seems that github.com/pkg/errors is a public archive, and we can use fmt.Errorf() instead.

I think at some point we should make this consistent across the codebase...
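
For reference, a standard-library equivalent of the `errors.Wrap` call in this hunk could look like the sketch below. The `newQueryRunner`, `initHandler`, and `isConnClosed` functions and the `errConnClosed` sentinel are hypothetical, used only to make the example self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// errConnClosed is a hypothetical underlying failure.
var errConnClosed = errors.New("connection closed")

// newQueryRunner stands in for the constructor in the hunk above.
func newQueryRunner(connOK bool) error {
	if !connOK {
		return errConnClosed
	}
	return nil
}

// initHandler wraps the error with %w so callers can still inspect the cause.
func initHandler(connOK bool) error {
	if err := newQueryRunner(connOK); err != nil {
		return fmt.Errorf("failed to build query runner: %w", err)
	}
	return nil
}

// isConnClosed shows that the wrapped cause remains reachable via errors.Is.
func isConnClosed(err error) bool {
	return errors.Is(err, errConnClosed)
}

func main() {
	err := initHandler(false)
	fmt.Println(err)
	fmt.Println("caused by closed connection:", isConnClosed(err))
}
```

One behavioral difference to keep in mind: `errors.Wrap` from github.com/pkg/errors also records a stack trace, while `fmt.Errorf` with `%w` does not.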

ostinru · 2025-11-03T19:07:03Z

+	var lsnString string
+	var inRecovery bool
+	err = sbh.QueryRunner.Connection.QueryRow(context.TODO(), startBackupQuery, backupLabel).Scan(
+		&walFileName, &lsnString, &inRecovery)


Why not use sbh.QueryRunner.StartBackup()?

x4m requested a review from a team as a code owner November 1, 2025 17:48

boosterKRD reviewed Nov 2, 2025

x4m marked this pull request as draft November 2, 2025 17:27

Fix tests

252365b

ostinru reviewed Nov 3, 2025

Fix tests

2f89aad

ostinru added the postgres PostgreSQL issue label Nov 18, 2025

ostinru mentioned this pull request Feb 28, 2026

[QUESTION] PostgreSQL backup-push without running server #2186

Open

x4m wants to merge 3 commits into wal-g:master from x4m:snapshot

3 participants
