How to optimise data synchronisation with rsync
rsync is a versatile tool that simplifies file transfer over network connections and speeds up the synchronisation of local directories. The high flexibility makes the synchronisation tool an excellent option for a variety of file-level operations.
What is rsync?
rsync, short for ‘remote synchronisation’, is a flexible and network-compatible synchronisation tool under Linux. The open-source program can be used to synchronise files and directories between local systems or across networks. The tool uses a differential data transfer technique, whereby only those sections of data that have actually been changed are transferred. This minimises the amount of data exchange and considerably speeds up the synchronisation process. Thanks to a variety of options, rsync allows precise control of synchronisation behaviour. The flexible syntax makes both simple local copies and complex network synchronisations possible.
What is the syntax for rsync?
The command syntax of rsync has a simple structure and is similar to that of ssh, scp and cp. The basic structure is as follows:
rsync [OPTION] source destination
The source path from which the data should be synchronised is entered in source, while the destination path is specified as destination. rsync offers a variety of options which users can use to adapt the synchronisation process to their requirements. The most frequently used options are:
-a (archive): Synchronises recursively and preserves permissions, timestamps, groups, owners and special file properties
-v (verbose): Displays detailed information about the synchronisation process
-r (recursive): Synchronises directories and their contents recursively
-u (update): Only transfers files that are newer than those already in the target directory
-z (compress): Compresses data during transfer to reduce network traffic
-i (--itemize-changes): Displays a list of the changes to be made
--delete: Deletes files in the target directory that no longer exist in the source
--exclude: Excludes certain files or directories from synchronisation
-n (--dry-run): Simulates the synchronisation process without actually transferring files
--progress: Shows the progress of the file transfer
--partial: Keeps partially transferred files in the target directory if the transfer is interrupted. When the transfer is resumed, the file is continued from its last state
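To see how several of these options combine, the following sketch previews a synchronisation without changing anything. The scratch directory created with mktemp and the file names are placeholders chosen for illustration only.

```shell
# Scratch area for the demonstration (hypothetical names, safe to delete).
work=$(mktemp -d)
mkdir "$work/src" "$work/dst"
touch "$work/src/a.txt" "$work/src/b.txt"

# -a: archive mode, -i: itemise changes, -n: dry run (nothing is written).
preview=$(rsync -ain "$work/src/" "$work/dst/")
echo "$preview"

# The destination is still empty because -n only simulates the transfer.
ls -A "$work/dst"
```

Dropping the -n from the command would perform the transfer that the preview lists.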
Examples of rsync syntax
The following examples of rsync syntax should make it easier to understand how the command is used. The following code example creates the directory dir1 containing 100 empty test files and a second, empty directory dir2:
$ cd ~
$ mkdir dir1
$ mkdir dir2
$ touch dir1/file{1..100}
The contents of dir1 can be synchronised on the same system with dir2 using the -r option:
$ rsync -r dir1/ dir2
Alternatively, the -a option can be used, which synchronises recursively and preserves symbolic links, special device files, modification times, groups, owners and permissions:
$ rsync -a dir1/ dir2
Note: The slash (/) at the end of the source directory in an rsync command is important because it indicates that the contents of the directory should be synchronised, not the directory itself. Adding the -v option lists the files that are transferred:
$ rsync -av dir1/ dir2
Here’s an example of the output:
sending incremental file list
./
file1
file10
file100
file11
file12
file13
file14
file15
file16
file17
file18
. . .
If the source directory doesn’t have a trailing slash, the source directory itself will be copied into the target directory:
$ rsync -av dir1 dir2
Here’s the output:
sending incremental file list
dir1/
dir1/file1
dir1/file10
dir1/file100
dir1/file11
dir1/file12
dir1/file13
dir1/file14
dir1/file15
dir1/file16
dir1/file17
dir1/file18
. . .
Using the slash at the end of the source directory ensures that the synchronisation process runs as expected and that the contents of the source directory end up in the correct target directory.
How to synchronise with a remote system using rsync
Synchronising with a remote system using rsync is usually not difficult, provided you have SSH access to the remote computer and the necessary authentication information. rsync typically uses SSH (Secure Shell) for secure communication with remote systems. rsync itself has to be installed on both sides.
Once SSH access between the two computers has been verified, the dir1 folder can be synchronised to a remote computer. In this case, the actual directory needs to be transferred, which is why the trailing slash has been omitted in the following command:
$ rsync -a ~/dir1 username@remote_host:destination_directory
If a directory is transferred from a local system to a remote system, this is referred to as a push operation. In contrast, when a remote directory is synchronised to a local system, this is referred to as a pull operation. The syntax for this is as follows:
$ rsync -a username@remote_host:/home/username/dir1 place_to_sync_on_local_machine
What other options are there in rsync?
The standard behaviour of rsync can be further adapted using the options below.
Transferring non-compressed files with rsync
The network load when transferring non-compressed files can be reduced using the -z option:
$ rsync -az source destination
Displaying progress and resuming interrupted transmissions
With -P you can combine the options --progress and --partial. This gives you an overview of the progress of transmissions and also allows you to resume interrupted transmissions:
$ rsync -azP source destination
Here’s the output:
sending incremental file list
./
file1
0 100% 0.00kB/s 0:00:00 (xfer#1, to-check=99/101)
file10
0 100% 0.00kB/s 0:00:00 (xfer#2, to-check=98/101)
file100
0 100% 0.00kB/s 0:00:00 (xfer#3, to-check=97/101)
file11
0 100% 0.00kB/s 0:00:00 (xfer#4, to-check=96/101)
. . .
Execute the command again to obtain a shorter output. Because rsync compares modification times and file sizes to determine whether files have changed, it finds nothing new to transfer:
$ rsync -azP source destination
Here’s the output:
sending incremental file list
sent 818 bytes received 12 bytes 1660.00 bytes/sec
total size is 0 speedup is 0.00
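The effect of a repeated run can also be checked on a small scale. The sketch below, which assumes a throwaway directory from mktemp and made-up file names, synchronises three files, changes one of them and uses -i to list what a second run actually updates.

```shell
# Scratch area for the demonstration (hypothetical names, safe to delete).
work=$(mktemp -d)
mkdir "$work/src" "$work/dst"
touch "$work/src/file1" "$work/src/file2" "$work/src/file3"

# First run: everything is copied.
rsync -a "$work/src/" "$work/dst/"

# Change the content (and therefore the size) of a single file.
echo "new content" > "$work/src/file2"

# Second run: -i itemises only the entries rsync actually updates.
changes=$(rsync -ai "$work/src/" "$work/dst/")
echo "$changes"
```

Only file2 should appear in the itemised list, since the other two files are unchanged.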
Keep directories synchronised with rsync
To ensure that two directories are actually kept in sync, files that have been removed from the source directory also have to be deleted in the target directory. By default, rsync doesn’t remove files from the target directory. This behaviour can be changed with the --delete option. However, it’s important to use this option with caution since it deletes files in the target directory that no longer exist in the source.
Before using this option, you should first run the command with the --dry-run option. This performs a simulation of the synchronisation process without deleting any actual files, so you can ensure that only the desired changes are made without accidentally losing important data. Once the simulation shows the expected result, run the actual command:
$ rsync -a --delete source destination
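The two-step approach of simulating first and deleting afterwards can be sketched as follows. The scratch directory and the file names keep.txt and stale.txt are hypothetical and only serve the demonstration.

```shell
# Scratch area for the demonstration (hypothetical names, safe to delete).
work=$(mktemp -d)
mkdir "$work/src" "$work/dst"
touch "$work/src/keep.txt"
touch "$work/dst/keep.txt" "$work/dst/stale.txt"   # stale.txt exists only in the target

# Step 1: simulate. -v lists what would be deleted, -n prevents any change.
rsync -avn --delete "$work/src/" "$work/dst/"
after_dry_run=$(ls "$work/dst")    # still contains stale.txt

# Step 2: run for real once the preview looks right.
rsync -a --delete "$work/src/" "$work/dst/"
after_real_run=$(ls "$work/dst")   # stale.txt has been removed

echo "after dry run:  $after_dry_run"
echo "after real run: $after_real_run"
```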
Exclude files and directories from synchronisation
In rsync, you can use the --exclude option to exclude certain files and directories from synchronisation. This is useful if, for example, you don’t want to synchronise temporary files, log files or other content.
$ rsync -a --exclude=pattern_to_exclude source destination
If you’ve specified a pattern for excluding files, you can use the --include= option to override this exclusion for certain files that match a different pattern. Because rsync applies the first rule that matches, the --include option has to come before the --exclude option:
$ rsync -a --include=pattern_to_include --exclude=pattern_to_exclude source destination
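As a concrete illustration, the sketch below excludes all log files except one. It relies on rsync checking include and exclude rules in the order given and using the first match; the scratch directory and file names are invented for the example.

```shell
# Scratch area for the demonstration (hypothetical names, safe to delete).
work=$(mktemp -d)
mkdir "$work/src" "$work/dst"
touch "$work/src/notes.txt" "$work/src/debug.log" "$work/src/important.log"

# Rules are checked in order and the first match wins, so the --include
# for important.log has to appear before the broader --exclude.
rsync -a --include='important.log' --exclude='*.log' "$work/src/" "$work/dst/"

# Lists what arrived in the target directory.
ls "$work/dst"
```

With the two options swapped, important.log would match the *.log exclusion first and be skipped.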
Save backups with rsync
The --backup option allows you to save backups of important files. It can be used in conjunction with the --backup-dir option to specify the directory where the backup files should be saved:
$ rsync -a --delete --backup --backup-dir=/path/to/backups /path/to/source destination
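What ends up in the backup directory can be tried out with a small sketch. The scratch directory and the files report.txt and old.txt are hypothetical; the example assumes that one file is modified and another is deleted in the source between two runs.

```shell
# Scratch area for the demonstration (hypothetical names, safe to delete).
work=$(mktemp -d)
mkdir "$work/src" "$work/dst" "$work/backups"
echo "v1" > "$work/src/report.txt"
touch "$work/src/old.txt"
rsync -a "$work/src/" "$work/dst/"

# Change one file and remove another in the source.
echo "version 2" > "$work/src/report.txt"
rm "$work/src/old.txt"

# Files that would be overwritten or deleted in the target are moved
# to the backup directory instead of being lost.
rsync -a --delete --backup --backup-dir="$work/backups" "$work/src/" "$work/dst/"

# Lists the previous versions that were preserved.
ls "$work/backups"
```

The backup directory should then hold the earlier report.txt and the removed old.txt, while the target matches the source.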
You can find a detailed overview of the various backup scenarios in our article about server backups with rsync.