Implementing PostgreSQL with Barman

elephant

Read our post-implementation learnings from 2 months later here.

This past weekend we at 4Degrees completed a database transition from Microsoft Azure’s managed PostgreSQL service to a self-hosted PostgreSQL solution with Barman for backups.

We had adopted Azure’s solution early on in our company history because 1) it came with robust automated backup and recovery capabilities and 2) we were a part of Microsoft’s generous startup sponsorship package, so the additional cost of the service was covered by credits. But over the past year or so we’ve had a number of issues with the service; most notably, it was our single largest source of unplanned site downtime (it had a habit of triggering dozens of outage emails to us in the middle of the night).

Our Azure Postgres issues came to a head in the past month or two: as we’ve scaled, we’ve run into a number of performance issues. We needed to implement a trigram-based index to speed up our searches on text fields, but a configuration issue with Azure’s service prevented us from doing so. The impact to our customers on top of the uptime issues were enough to push us back on to our own service.

Architecture

There are a few different ways to set up PostgreSQL with Barman. We opted for continuously streaming backups with WALs (Write Ahead Logs), as this gave us the same RPO (Recovery Point Objective — the amount of information you lose in a recovery) of 0 that we had with Azure’s service. Missing from the diagram is the ad-hoc SSH connection back from the Barman server to the PostgreSQL server used during recoveries.

We run the streaming backup process every night, essentially creating a checkpoint of the database at that point in time, with a 7 day retention policy. The WAL streaming process runs continuously, retaining a record of all changes that have occurred since the last checkpoint.

Virtual Machines

For the PostgreSQL server, we opted for Azure’s DS3 v2 VM, with 4 vCPUs and 14GB of memory. This gave us slightly more performance than we had with the managed service. We included a 1.1TB managed disk; although we don’t need nearly that much storage today, the read/write speed of Azure’s managed disks goes up along with the disk size.

For the Barman server, we opted for a more basic D2s v3, with 2 vCPUs and 8GB of memory. In comparison to the DS series, the D series has less read/write performance (and is cheaper). We weren’t sure that the Barman server needed this much CPU or memory, but figured we didn’t want to run into any hanging issues while trying to execute a recovery. We chose a 250GB managed disk, though we may need to increase that depending on how space the 7 day retention policy ends up using.

Both VMs were set up with Ubuntu 16.04.

Setting Up PostgreSQL

If you already know how to set up a PostgreSQL server, you can skip this section. The instructions for setting up Barman are self-contained after this.

1/ Mount managed disk

We followed the steps here to set up the 1.1TB disk at /data. The official directions provided by Microsoft did not work for us, and based on some Googling have not worked for others.

2/ Install PostgreSQL

3/ Configuration

  • Copy the PostgreSQL data folder structure over to the storage location on the managed disk: $ rsync -av /var/lib/postgresql /data
  • Open up the PostgreSQL configuration file at /etc/postgresql/9.6/main/postgresql.conf and make the following changes:
  • Open up the PostgreSQL connection configuration file at/etc/postgresql/9.6/main/pg_hba.conf and change the IPv4 local connections to:
  • Restart PostgreSQL with systemctl restart postgresql
  • Make sure your data directory was switched correctly:

4/ Create database

5/ Load data from remote

This step is only if you’re migrating a database, as we were. It took ~5–10 minutes per GB of data.

At this point, your PostgreSQL server should be set up. Feel free to test the connection / functionality however you’d like. The rest is dedicated to setting up Barman for automatic backup / recovery.

Setting Up Barman

1/ Mount managed disk

We followed the steps here to set up the 250GB disk at /data. The official directions provided by Microsoft did not work for us, and based on some Googling have not worked for others.

2/ Install PostgreSQL

3/ Install Barman

4/ Configure Barman

Open up /etc/barman.conf and change/confirm:

5/ Configure server connection

Create a new file at /etc/barman.d/pg.conf and add:

6/ Set up storage directory

7/ Create Barman user for PostgreSQL

  • On the PostgreSQL server, run $ createuser -s -P barman. You’ll be prompted for a password. Make sure to write down whatever you enter.
  • Back on the Barman server, create a new file at ~barman/.pgpass and add:

.pgpass should have 0600 permissions, which you can set with chmod 0600 ~barman/.pgpass

  • Test the psql connection:

8/ Configure PostgreSQL for streaming (for backup)

  • Head back to the PostgreSQL server and open /etc/postgresql/9.6/main/postgresql.conf, change/confirm:
  • Open /etc/postgresql/9.6/main/pg_hba.conf, add:
  • Restart PostgreSQL: $ systemctl restart postgresql
  • Back on the Barman server, test the replication connection:

7/ Set up SSH (for restores)

  • On the Barman server, run:
  • Save the key to ~/.ssh/pg
  • Copy the value of the public key in ~/.ssh/pg.pub
  • On the PostgreSQL server, open or create ~postgres/.ssh/authorized_keys and add the public key

8/ Create replication slot

  • On the Barman server run $ barman receive-wal — create-slot pg

9/ Test

  • Test WAL streaming with $ barman switch-wal --force --archive pg
  • Test everything with $ barman check pg

10/ Set up cron jobs

  • As user barman, open up cron jobs with $ crontab -e and add:
  • You can also manually create backup at any time with $ barman backup pg. This took us about 30 seconds per GB of data.

Recovery With Barman

1/ Stop the database

  • Stop the PostgreSQL service on the PostgreSQL server with $ systemctl stop postgresql

2/ Run the recovery

  • Get a list of possible backup spots with $ barman list-backup pg
  • Get details for the backup you choose with $ barman show-backup pg <backup_id>. Note that <backup_id> is the timestamp of the backup.
  • Run $ barman recover --remote-ssh-command "ssh -i ~/.ssh/pg -o StrictHostKeyChecking=no postgres@<database_host>" --target-time="<YYYY-MM-DD HH:MM:SS.SSSSSS+00:00>" pg <backup_id> /data/postgresql/9.6/main. This took us about 30 seconds per GB of data.

3/ Restart the database

  • Back on the PostgreSQL server, run $ systemctl start postgresql

Comments are closed.

Related Content

Share this post:

Subscribe to the 4D newsletter

Helpful updates and content from 4Degrees.