PostgreSQL backup with pg_dump and pg_restore
The pg_dump and pg_restore command line tools are used to export and import PostgreSQL databases. They create PostgreSQL backups and migrate PostgreSQL databases between servers.
What is a PostgreSQL dump?
A PostgreSQL dump is the output file which is created when exporting a PostgreSQL database. PostgreSQL is a sophisticated database management system which stores data in optimised data structures. Therefore, extracting structured data from a PostgreSQL database requires a special procedure.
What is a backup?. We answer this basic question in our dedicated guide.
The pg_dump tool creates a text file with SQL commands, similar to MySQL backup with MySQL dump. Running the commands will restore the database to the time of the dump. According to the official PostgreSQL documentation:
“The idea behind this dump method is to generate a text file with SQL commands that, when fed back to the server, will recreate the database in the same state as it was at the time of the dump.”
It is important to understand what `PostgreSQL database´ means. The term is often confused with PostgreSQL server. In fact, it is not uncommon for a single PostgreSQL server to contain multiple databases. Below is an overview of the hierarchy of objects in a PostgreSQL installation:
PostgreSQL Object | Contains |
---|---|
Server | Databases |
Database | Tables |
Table | Records |
Record | Fields |
How do pg_dump and pg_restore work?
The command line tools pg_dump and pg_restore are usually installed with the PostgreSQL client applications, as well as the PostgreSQL command line interface psql. The tools follow the Unix philosophy and use text streams for input and output. This allows them to be linked to other programs through pipes. Input and output can also be read from files or output to files using redirections.
We show the general pattern when using pg_dump to create a PostgreSQL dump:
pg_dump dbname > db.dump
We use a redirection of the output (“>”) to a file. The PostgreSQL dump generated from this contains SQL commands. These can be run with the psql tool. We’ll show the general pattern when using psql to read a PostgreSQL dump:
psql dbname < db.dump
As you can see, the command is similar to the command for pg_dump. However, the input (“<”) from the PostgreSQL dump file must be redirected.
The pg_dump tool can do a lot more. It is also possible to output databases in special dump formats. However, the pg_restore or psql tool is needed to restore them depending on the format of the generated PostgreSQL dump.
Below you will find an overview of the command line tools used for creating and restoring PostgreSQL backups:
Tool | Explanation |
---|---|
pg_dump | Command line tool to create a PostgreSQL dump |
pg_restore | Command line tool to restore a PostgreSQL database from a PostgreSQL dump; allows special operations like partial imports, re-sorting of import data, parallel import of multiple tables, etc. |
psql | PostgreSQL command line interface; accepts SQL commands from the command line or from a PostgreSQL dump file and runs them |
Let’s take a look at the pattern when using pg_restore for a PostgreSQL dump:
pg_restore --dbname=dbname db.dump
The PostgreSQL dump must be created in a special format for pg_restore. Below are the possible output formats for pg_dump:
pg_dump output format | Explanation | Import via |
---|---|---|
Plain | Plain text file with SQL commands; compressing requires an additional tool, such as Gzip | psql |
Custom | Compressed dump format; import can be controlled in detail | pg_restore |
Directory | Creates directory with one file per table/blob; table of contents; can be edited with standard Unix tools; allows parallel export of multiple tables | |
Tar | Archiving format ‘Tape Archive’; can be converted to directory format; compressing requires additional tool, such as Gzip; not possible to control sequence of imports | pg_restore |
Finally, the pattern for pg_dump used when creating a PostgreSQL dump in a `custom´ dump format is:
pg_dump --format=custom dbname > db.dump
If your PostgreSQL installation is running in a Docker container, you can use pg_dump inside the container to create a PostgreSQL backup. You can also save the entire container as Docker backup. We explain how to do this in detail in our article
Step by step guide: Create and restore a PostgreSQL backup
There are several different ways to create and restore PostgreSQL backups. The methods offer different advantages and disadvantages, so it depends on the deployment scenario and requirements. The procedure used when creating the PostgreSQL dump will determine which method should be used during import.
A quick note before we get into the specific methods of creating and restoring PostgreSQL dumps; The examples below simply reference the name of the database used. However, they do not include database usernames or passwords. These are stored in the .pgpass password file following the PostgreSQL convention. This file can be found in the user’s home directory and contains PostgreSQL connection data. The following format is used:
hostname:port:database:username:password
The information contained in the password file is automatically used when the command line tools are enabled. This eliminates the risk of entering any sensitive data on the command line.
Check if the tools are available and install them if necessary
Firstly, you must check if pg_dump and pg_restore are installed. Try to enable the tools to display their version. If this fails, then the tool is not on your system and must be installed.
- Verify that pg_dump is installed:
pg_dump --version
- Verify that pg_restore is installed:
pg_restore --version
- Furthermore, check if the PostgreSQL command line interface psql is installed:
psql --version
The tools can be easily installed if they are not found.
- Install the PostgreSQL client applications using Homebrew on Mac:
brew install libpq
brew link --force libpq
- Use the built-in package management under Ubuntu-Linux:
sudo apt-get install postgresql-client
- Follow our instructions to install PostgreSQL on Windows Server 2016 on Windows.
Create and restore PostgreSQL backup
Firstly, let’s look at the simplest way to create a PostgreSQL backup. We extract a single database from a PostgreSQL server. The structure and contents of the database are written on a file in SQL commands. Enabling the pg_dump tool will look like this:
pg_dump dbname > db.dump
But what if you want to restore the PostgreSQL dump to another server? The PostgreSQL command line interface psql is used. The command is very simple:
psql dbname < db.dump
The pg_dump tool allows users to create specialised PostgreSQL dump formats, other than the output of a PostgreSQL dump as a text file with SQL commands. These are controlled through options when they are enabled. Below is an overview of the two most useful dump formats:
PostgreSQL dump format | Detailed options syntax | Short options syntax |
---|---|---|
Custom | pg_dump --format=custom | pg_dump -Fc |
Directory | pg_dump --format=directory | pg_dump -Fd |
Now, PostgreSQL dump in a custom format must be created:
pg_dump --format=custom dbname > db.dump
Use the pg_restore tool to restore to another server:
pg_restore --dbname=dbname db.dump
An additional step is required to restore the PostgreSQL dump to the same server, as the database and tables already exist on the server and must be removed before importing. This is similar to '--add-drop-tables' for MySQL dump. PostgreSQL conveniently allows you to add the functionality during import:
pg_restore --clean --create --dbname=dbname db.dump
The '—clean' option removes the existing database before the import. The '--create' option creates the database under the specified name. This allows the import to run without problems.
Migrate PostgreSQL database between remote servers
Export and import of a PostgreSQL dump can be connected with a pipe (‘|’). This exports the data directly into the import. It is possible to import the export from one server directly to another server, since pg_dump, pg_restore, and psql operate on a remote host when needed. Let’s take a look at the command used for this. Use the '—host' option to specify the host names:
pg_dump --host=export_host dbname | psql --host=import_host dbname
Create PostgreSQL backup of large databases
PostgreSQL is a professional database management system. A special procedure is needed to create backups of large databases, as PostgreSQL dumps can be very large. Firstly, compression is advised. PostgreSQL dumps exported as text files usually contain large amounts of redundant SQL commands which can be easily compressed.
Pipe the output of pg_dump to the Gzip compression tool and write it as a compressed .gz file:
pg_dump dbname | gzip > db.dump.gz
Reverse the process to recover from a compressed PostgreSQL dump. The gunzip tool unpacks the compressed data and outputs it to standard output. Pipe the output to the psql tool and use the '-c' option to ensure the input file is unaffected when unpacking:
gunzip -c db.dump.gz | psql dbname
The 3 2 1 backup rule requires at least one backup in the cloud. Uploading very large files can be problematic under certain circumstances. It might make sense to split the PostgreSQL dump into multiple files using the split tool. A pipe should be used again to forward the output of pg_dump to split. We split the PostgreSQL dump into individual files with a maximum size of 1 GB in the example below:
pg_dump dbname | split -b 1G - db.dump
How can the partial files created by split during import be merged again? No special software is needed for this, as the standard cat tool can be used. We forward a list of partial PostgreSQL dumps, which have been combined by the cat tool into a coherent data stream. Pipe this to psql as usual:
cat db.dump* | psql dbname
It is possible to dump multiple tables in parallel when using the directory dump format. This is faster, but it also leads to a higher load on the database server. We control the number of tables exported in parallel using the '—jobs' option. In our example, we export three tables in parallel. It is not possible to redirect the output to a file as we are writing a directory. We use the '—file' option instead with specification for the directory name:
pg_dump --jobs=3 --format=directory --file=dump.dir dbname