Archive

Articles taggués ‘rsync’

Do-It-Yourself Backup System Using Rsync

03/12/2023 Comments off

What is rsync?

rsync-terminalRsync is a program for synchronizing two directory trees across different file systems even if they are on different computers. It can run its host to host communications over ssh to keep things secure and to provide key based authentication. If a file is already present in the target and is the same as on the source the file will not be transmitted. If the file on the target is different than the one on the source then only the parts of it that are different are transferred. These features greatly increase the performance of rsync over a network.

Lire la suite…

Categories: Logiciel, Système Tags: , , ,

How to enable incremental file sync for many users on Linux

02/12/2023 Comments off

Suppose you as a software developer has set up daily builds of your software for testing purposes. Every day you make a new build, users have to re-download the updated build to evaluate it. In this case you may want to enable differential downloads, so that users can download only difference between two builds, thereby saving on the server’s bandwidth. Users will also be happy as they don’t have to wait to re-download the whole thing. Similar situations are encountered when you want to set up a download archive which allows incremental sync for users.

In these cases, how would you distribute incrementally updated files efficiently for multiple users? In fact, there are open-source storage solutions that come with « delta sync » capability built-in, such as ownCloud or Syncthing. These kinds of full-blown solutions with built-in GUI require users to install a dedicated client, and thus may be an overkill for simple file distribution that you are looking for.

Barring full-blown third-party software like these, perhaps rsync may come to mind, which can do bandwidth-efficient file sync. The problem with rsync, however, is that all the heavy duty computations is done at the server side. That is, when a client requests for a file sync, the server needs to perform block-by-block checksum computation and search for blocks not available on the client. Obviously this procedure can place a significant strain on the server’s resources if the server needs to handle many users, and thus is not scalable.

What is Zsync?

This is when a command-line tool called zsync comes in handy. While zsync uses the same delta-encoding based sync algorithm as rsync does, it moves the heavy duty computation away from the server and onto the clients. What do I mean by that?

Well, in zsync, the server maintains a separate .zsync metadata file for a file to distribute, which contains a list of « precomputed » checksums for individual blocks of the file. When zsync client requests for a file sync, the client downloads .zsync metadata file first, and performs block-by-block checksum calculation to find missing blocks on its own. The client then requests for missing blocks using HTTP range requests. As you can see, the server is totally out of the loop from the sync algorithm, and simply serves requested file blocks over HTTP, which makes it ideal when incremental file sync is needed for many users.

Here is a quick rundown on the server-side overhead difference between rsync and zsync. In the plot below, I compare rsync and zsync in terms of the server’s CPU usage when 200 users are downloading a tarball file with 2.5% discrepancy of a previous version. For fair comparison, SSH is not used for rsync.

With zsync, since all checksum computation overhead has shifted from the server to individual clients, the server overhead is reduced dramatically. A small neat idea makes zsync a real winner!

In the rest of the tutorial, I will show how to distribute a file incrementally using zsync under the Linux environment.

Zsync: Client Side Setup

On the client side, you need to install zsync to initiate file transfer from a remote web server. zsync is extremely lightweight, and is included in the most Linux distros. Here is how to install zsync on various distros.

Debian, Ubuntu or Linux Mint:

$ sudo apt-get install zsync

Fedora:

$ sudo yum install zsync

CentOS or RHEL:

First, enable Repoforge repository, and then run:

$ sudo yum install zsync

Arch Linux:

$ sudo pacman -S zsync

Lire la suite…

Categories: Logiciel Tags: , ,

How to Backup Linux? 15 rsync Command Examples

30/10/2023 Comments off

rsync stands for remote sync.

rsync is used to perform the backup operation in UNIX / Linux.

rsync utility is used to synchronize the files and directories from one location to another in an effective way. Backup location could be on local server or on remote server.

Important features of rsync

  • Speed: First time, rsync replicates the whole content between the source and destination directories. Next time, rsync transfers only the changed blocks or bytes to the destination location, which makes the transfer really fast.
  • Security: rsync allows encryption of data using ssh protocol during transfer.
  • Less Bandwidth: rsync uses compression and decompression of data block by block at the sending and receiving end respectively. So the bandwidth used by rsync will be always less compared to other file transfer protocols.
  • Privileges: No special privileges are required to install and execute rsync

Syntax

$ rsync options source destination

Source and destination could be either local or remote. In case of remote, specify the login name, remote server name and location.

Example 1. Synchronize Two Directories in a Local Server

To sync two directories in a local computer, use the following rsync -zvr command.

$ rsync -zvr /var/opt/installation/inventory/ /root/temp
building file list ... done
sva.xml
svB.xml
.
sent 26385 bytes  received 1098 bytes  54966.00 bytes/sec
total size is 44867  speedup is 1.63
$

In the above rsync example:

  • -z is to enable compression
  • -v verbose
  • -r indicates recursive

Now let us see the timestamp on one of the files that was copied from source to destination. As you see below, rsync didn’t preserve timestamps during sync.

$ ls -l /var/opt/installation/inventory/sva.xml /root/temp/sva.xml
-r--r--r-- 1 bin  bin  949 Jun 18  2009 /var/opt/installation/inventory/sva.xml
-r--r--r-- 1 root bin  949 Sep  2  2009 /root/temp/sva.xml

Example 2. Preserve timestamps during Sync using rsync -a

rsync option -a indicates archive mode. -a option does the following,

  • Recursive mode
  • Preserves symbolic links
  • Preserves permissions
  • Preserves timestamp
  • Preserves owner and group

Now, executing the same command provided in example 1 (But with the rsync option -a) as shown below:

$ rsync -azv /var/opt/installation/inventory/ /root/temp/
building file list ... done
./
sva.xml
svB.xml
.
sent 26499 bytes  received 1104 bytes  55206.00 bytes/sec
total size is 44867  speedup is 1.63
$

As you see below, rsync preserved timestamps during sync.

$ ls -l /var/opt/installation/inventory/sva.xml /root/temp/sva.xml
-r--r--r-- 1 root  bin  949 Jun 18  2009 /var/opt/installation/inventory/sva.xml
-r--r--r-- 1 root  bin  949 Jun 18  2009 /root/temp/sva.xml

Lire la suite…

Categories: Réseau, Système Tags: ,

Simple versioned TimeMachine-like backup using rsync

18/10/2023 Comments off

time machine rsyncUsing TimeMachine with rsync: Over many years, I have dealt with scripts that do backup versioning, i.e., maintain multiple backups. Due to their flexibility, they have been complex to understand and configure. Here is a simple rsync-based tool with a different focus: The experienced systems administrator who wants to keep his system’s complexity down.

Backup in action: TimeMachine and rsync

It consists of a simple script, which you can call rsync-backup.sh and store wherever you like, e.g., in /usr/local/sbin. I will use these names and paths in the examples.

#!/bin/sh
# Usage: rsync-backup.sh <src> <dst> <label>
if [ "$#" -ne 3 ]; then
    echo "$0: Expected 3 arguments, received $#: $@" >&2
    exit 1
fi
if [ -d "$2/__prev/" ]; then
    rsync -a --delete --link-dest="$2/__prev/" "$1" "$2/$3"
else
    rsync -a                                   "$1" "$2/$3"
fi
rm -f "$2/__prev"
ln -s "$3" "$2/__prev"

During normal operation, it boils down to three simple statements:

  1. rsync with --link-dest: Copying the contents of <src> to <dst>/<label>, reusing the files from the previous backup with hard links ((The non---link-dest rsync does not use --delete to reduce the risk of accidentally deleting files when called with wrong parameters))
  2. rm and ln: Remember this backup location for the next incremental backup.

Voilà – it doesn’t get much easier than that!

Of course, there is something missing: The actual backup policy. It is separated into cron, which I consider an advantage. Using this separation of duties, many policies can be implemented very easily and composed in a modular way:

Create daily backups  for every weekday

You might know this from automysqlbackup or autopostgresqlbackup: A backup is created every day and overwritten after 7 days. This is achieved by adding the following file to /etc/cron.daily/:

#!/bin/sh
/usr/local/bin/rsync-backup.sh /home /data/backup `date +%A`

All your user’s files are copied daily to /data/backup, named after the current day, overwritten weekly.

Daily backups for a month

Sure, this is easy as well, by putting this with a descriptive name into/etc/cron.daily/:

#!/bin/sh
/usr/local/bin/rsync-backup.sh /home /data/backup `date +Day-%d`

Lire la suite…

Categories: Système Tags: , , , ,