Do-It-Yourself Backup System Using Rsync

08/07/2016

What is rsync?

Rsync is a program for synchronizing two directory trees across different file systems, even if they are on different computers. It can run its host-to-host communications over ssh to keep things secure and to provide key-based authentication. If a file is already present on the target and is the same as on the source (by default rsync compares size and modification time), the file is not transmitted. If the file on the target differs from the one on the source, only the parts that differ are transferred. These features greatly improve the performance of rsync over a network.

What are hard links?

Hard links are similar to symlinks. They are normally created with the ln command, but without the -s switch. A hard link exists when two directory entries point to the same inode and disk blocks. Unlike a symlink, there isn’t a file and a pointer to the file, but rather two links to the same file. If you delete either entry, the other remains and still contains the data. Here is an example of both:

  ------------- Symbolic Link Demo -------
  % echo foo > x
  % ln -s x y
  % ls -li ?
  38062 -rw-r--r--  1 kmk users 4 Jul 25 14:28 x
  38066 lrwxrwxrwx  1 kmk users 1 Jul 25 14:28 y -> x
  -- As you can see, y is only a pointer to x.
  % grep . ?
  x:foo
  y:foo
  -- They contain the same data.
  % rm x
  % ls -li ?
  38066 lrwxrwxrwx  1 kmk users 1 Jul 25 14:28 y -> x
  % grep . ?
  grep: y: No such file or directory
  -- Now that x is gone y is simply broken.
  ------------ Hard Link Demo ------------
  % echo foo > x
  % ln x y
  % ls -li ?
  38062 -rw-r--r--  2 kmk users 4 Jul 25 14:28 x
  38062 -rw-r--r--  2 kmk users 4 Jul 25 14:28 y
  -- They are the same file occupying the same disk space.
  % grep . ?
  x:foo
  y:foo
  -- They contain the same data.
  % rm x
  % ls -li ?
  38062 -rw-r--r--  1 kmk users 4 Jul 25 14:28 y
  % grep . ?
  y:foo
  -- Now y is simply an ordinary file.
  ---------- Breaking a Hard Link ----------
  % echo foo > x
  % ln x y
  % ls -li ?
  38062 -rw-r--r--  2 kmk users 4 Jul 25 14:34 x
  38062 -rw-r--r--  2 kmk users 4 Jul 25 14:34 y
  % grep . ?
  x:foo
  y:foo
  % rm y ; echo bar > y
  % ls -li ?
  38062 -rw-r--r--  1 kmk users 4 Jul 25 14:34 x
  38066 -rw-r--r--  1 kmk users 4 Jul 25 14:34 y
  % grep . ?
  x:foo
  y:bar
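
The rm in the last demo matters. As a small sketch of why (reusing the x and y names from the demo, in a scratch directory created only for this example), overwriting a hard-linked file in place changes the data seen through every link, while removing it first and writing a new file leaves the other link alone:

```shell
# Scratch directory; x and y mirror the names used in the demo above.
work=$(mktemp -d)
echo foo > "$work/x"
ln "$work/x" "$work/y"

# Overwriting y in place (no rm first) truncates and rewrites the shared inode,
# so the change is visible through x as well.
echo bar > "$work/y"
x_after_overwrite=$(cat "$work/x")

# Removing y first deletes only that directory entry. The new y gets a fresh
# inode, and x keeps the old data.
rm "$work/y"
echo baz > "$work/y"
x_after_replace=$(cat "$work/x")
y_after_replace=$(cat "$work/y")

echo "$x_after_overwrite $x_after_replace $y_after_replace"   # bar bar baz
ls -li "$work"
```

This is why hard-linked backups are only safe with tools that replace files instead of rewriting them in place. By default rsync writes a new copy and renames it into place; its --inplace option changes that behavior.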

Why back up with rsync instead of something else?

  • Disk based: Rsync is a disk-based backup system. It doesn’t use tapes, which are too slow to back up (and, more importantly, restore) modern systems with large hard drives. Also, disk-based backup solutions are much cheaper than equivalently sized tape backup systems.
  • Fast: Rsync only backs up what has changed since the last backup. It NEVER has to repeat the full backup, unlike most other systems that use monthly/weekly/daily differential configurations.
  • Less work for the backup client: Most of the work in rsync backups, including the rotation process, is done on the backup server, which is usually dedicated to doing backups. This means that the client system being backed up is not hit with as much load as with some other backup programs. The load can also be tailored to your particular needs through several rsync options and backup system design decisions.
  • Fastest restores possible: If you just need to restore a single file or set of files, it is as simple as a cp or scp command. Restoring an entire file system is just the reverse of the backup procedure. Restoring an entire system takes a bit longer, but it is less work than with backup systems that require you to reinstall your OS first, and about the same as with other manual backup systems like dump or tar.
  • Only one restore needed: Even though each backup is incremental, every one of them is accessible as a full backup. This means you only restore the backup you want, instead of restoring a full and an incremental, or a monthly followed by a weekly followed by a daily.
  • Cross Platform: Rsync can back up and recover anything that can run rsync. I have used it to back up Linux, Windows, DOS, OpenBSD, Solaris, and even ancient SunOS 4 systems. The only limitation is that the file system the backups are stored on must support all of the file metadata that the source file systems support. In other words, if you were to use a vfat file system for your backups, you would not be able to preserve file ownership when backing up an ext3 file system. If this is a problem for you, try looking into rdiff-backup.
  • Cheap: It doesn’t seem like it would be cheap to have enough disk space for 2 copies of everything and then some, but it is. With tape drives you have to choose between a cheap drive with expensive tapes or an expensive drive with cheap tapes. In a hard-drive-based system you just buy cheap hard drives and use RAID to tie them together. My current backup server uses two 500GB IDE drives in a software RAID-0 configuration for a total of 1TB. That cost about $100, which is about 1/6th of what I paid for the DDS3 tape drive I used to use, and that doesn’t even include the tapes, which cost about $10 per 12GB.
  • Internet: Since rsync can run over ssh and only transfers what has changed, it is perfect for backing up things across the internet, such as backing up and updating a web site at a web hosting company or even a co-located server. Internet-based backup services are also becoming more and more popular, and rsync is a good tool for backing up to them.
  • Do-it-yourself: There are FOSS backup packages out now that use rsync as their back end, but the nice thing here is that you are using standard command line tools (rsync, ssh, rm), so you can engineer your own backup system that does EXACTLY what you want, and you don’t need a special tool to restore.
