TKLBAM

TurnKey GNU/Linux Backup and Migration

Author: Liraz Siri <liraz@turnkeylinux.org>
Date: 2013-11-11
Manual section: 8
Manual group: backup


SYNOPSIS

tklbam <command> [arguments]

ENVIRONMENT VARIABLES

TKLBAM_CONF: Path to TKLBAM configurations dir (default: /etc/tklbam)
TKLBAM_REGISTRY: Path to TKLBAM registry (default: /var/lib/tklbam)
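
As a rough illustration of how these overrides behave, the sketch below resolves both directories the way a wrapper script might, falling back to the documented defaults when a variable is unset. The function name tklbam_dirs is hypothetical and is not part of TKLBAM.

```shell
# Hypothetical helper (not part of TKLBAM): print the directories that
# would be used, falling back to the documented defaults when unset.
tklbam_dirs() {
    echo "conf=${TKLBAM_CONF:-/etc/tklbam}"
    echo "registry=${TKLBAM_REGISTRY:-/var/lib/tklbam}"
}

tklbam_dirs
```

Exporting TKLBAM_CONF or TKLBAM_REGISTRY before running a tklbam command should point it at an alternate location in the same way, for example to keep a separate set of settings for testing.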

DESCRIPTION

TKLBAM (TurnKey GNU/Linux Backup and Migration) is a smart, automated backup and restore facility for the TurnKey GNU/Linux Virtual Appliance Library.

Goals

TKLBAM is designed to provide an efficient system-level backup of changed files, users, databases and package management state. This system-level backup can be restored automatically on any installation of the same type of virtual appliance, regardless of the underlying hardware or location. The intended result is a functionally equivalent copy of the original system.

It is also designed to assist in migrating data and system configurations between different versions of the same type of virtual appliance. For some applications, additional manual steps, such as a database schema update, may be required to complete migration between versions.

Features

  • Automatically figures out which files, databases and packages need to be backed up by detecting changes since installation
  • Lets you simulate a backup beforehand to review if you need to include or exclude any file paths or databases
  • Restoring automatically adds missing users and groups and fixes file ownership and permission issues
  • Makes it easy to create lightweight backups that can be migrated across operating system versions
  • Supports a wide variety of storage targets (e.g., Amazon S3, local filesystem, rsync, ftp, SSH, etc.)
  • Leverages Duplicity to create PGP-encrypted, incremental backup archives
  • Easy to embed in an existing backup solution

Key elements

TurnKey Hub: a web service which provides the front-end for backup management. The user links an appliance to a specific Hub account identified by an API KEY.

Backup profile: describes the installation state for a specific type and version of appliance. An appropriate profile is downloaded from the Hub the first time you back up, or as required if there is a profile update (e.g., bugfix).

Delta: a set of changes since installation to files, users, databases and package management state. This is calculated at backup time by comparing the current system state to the installation state described by the backup profile.

Encryption key: generated locally on your server and used to directly encrypt your backup volumes. By default, key management is handled transparently by the Hub. For extra security, the encryption key may be protected with a passphrase. An escrow key can be created to protect against data loss in case the passphrase is forgotten.

Duplicity: back-end primitive that the backup and restore operations invoke to encode, transfer and decode encrypted backup volumes which contain the delta. It communicates directly with the storage target (e.g., Amazon S3). In normal usage the storage target is auto-configured by the Hub. Duplicity uses the rsync algorithm to support efficient incremental backups. It uses GnuPG for symmetric encryption (AES).

Amazon S3: a highly durable cloud storage service to which encrypted backup volumes are uploaded by default. To improve network performance, backups are routed to the closest datacenter, based on a GeoIP lookup table.

Any storage target supported by Duplicity can be forced, but this complicates usage because the Hub can only work with S3. In that case backups, encryption keys and authentication credentials need to be managed by hand.

Principle of operation

Every TKLBAM-supported TurnKey appliance has a corresponding backup profile that describes installation state and includes an appliance-specific list of files and directories to check for changes. This list does not include any files or directories maintained by the package management system.

A delta (i.e., changeset) is calculated by comparing the current system state to the installation state. Only this delta is backed up and only this delta is re-applied on restore.

An exception is made for database contents. These are backed up and restored whole, unless otherwise configured by the user.
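
The file-level comparison described above can be sketched roughly as follows. This is a minimal illustration, not TKLBAM's actual implementation: the manifest format (one "path checksum" pair per line), the sample paths and checksums, and the function name compute_delta are all assumptions, as is treating a path missing from the current state as a deletion.

```shell
# Hypothetical manifests: one "path checksum" pair per line.
profile_state='/etc/hosts aaa
/var/www/index.html bbb'

current_state='/etc/hosts ccc
/var/www/app.php ddd'

compute_delta() {
    # $1: installation-state manifest, $2: current-state manifest
    printf '%s\n---\n%s\n' "$1" "$2" | awk '
        $0 == "---" { current = 1; next }          # separator between manifests
        !current    { base[$1] = $2; next }        # remember installation state
        {
            seen[$1] = 1
            if (!($1 in base))       print "added " $1
            else if (base[$1] != $2) print "changed " $1
        }
        END {
            for (p in base) if (!(p in seen)) print "deleted " p
        }' | LC_ALL=C sort
}

compute_delta "$profile_state" "$current_state"
```

Only the paths reported here would need to go into the backup; unchanged files are already described by the profile.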

In addition to direct filesystem changes to user-writable directories (e.g., /etc, /var/www, /home), the backup delta is calculated to include a list of any new packages not originally in the appliance's installation manifest. During restore, the package management system is leveraged to install these new packages from the configured software repositories.
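
The package part of the delta amounts to a set difference between the installation manifest and the packages currently installed. The sketch below shows one way to compute it with standard tools; the package lists and the function name new_packages are illustrative assumptions, not TKLBAM internals.

```shell
# Illustrative package lists, one name per line (hypothetical data).
manifest='apache2
mysql-server'

installed='apache2
mysql-server
xcache'

new_packages() {
    # $1: packages in the installation manifest
    # $2: packages installed on the current system
    manifest_file="$(mktemp)"
    current_file="$(mktemp)"
    printf '%s\n' "$1" | LC_ALL=C sort -u > "$manifest_file"
    printf '%s\n' "$2" | LC_ALL=C sort -u > "$current_file"
    # comm -13 prints only the lines unique to the second file.
    LC_ALL=C comm -13 "$manifest_file" "$current_file"
    rm -f "$manifest_file" "$current_file"
}

new_packages "$manifest" "$installed"
```

On a Debian-based system the current list could be produced with dpkg-query -W -f='${Package}\n', and the resulting names passed to apt-get install on the restore target. Whether TKLBAM does exactly this internally is not stated here.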

Users and groups from the backed up system are merged on restore. If necessary, uids / gids of restored files and directories are remapped to maintain correct ownership.

Similarly, permissions for files and directories are adjusted as necessary to match permissions on the backed up system.
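
To make the uid remapping concrete, here is a minimal sketch of how a merge plan could be derived from "name:uid" pairs. The rules it applies (reuse the uid of a user that already exists on the target, allocate a fresh uid when the number is taken by a different user, otherwise keep the original) are assumptions chosen for illustration, and plan_uid_map is a hypothetical name; TKLBAM's real merge logic may differ.

```shell
plan_uid_map() {
    # $1: "name:uid" lines from the backup
    # $2: "name:uid" lines already present on the restore target
    # Prints "name old_uid new_uid" for every user in the backup.
    printf '%s\n---\n%s\n' "$2" "$1" | awk -F: '
        $0 == "---" { backup = 1; next }
        !backup {
            uid_of[$1] = $2; taken[$2] = 1
            if ($2 >= next_uid) next_uid = $2 + 1
            next
        }
        {
            name = $1; old = $2
            if (name in uid_of)    mapped = uid_of[name]   # user exists: reuse its uid
            else if (old in taken) mapped = next_uid++     # uid clash: allocate a free one
            else                   mapped = old            # no conflict: keep the uid
            taken[mapped] = 1
            if (mapped >= next_uid) next_uid = mapped + 1
            print name, old, mapped
        }'
}

# uid 1000 is already used by another (hypothetical) user on the target.
plan_uid_map "alon:1000" "admin:1000"
```

When the old and new uid differ, ownership of the restored files would then have to be updated to the new uid; when they match, as in the example scenario later in this page, nothing needs to change.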

COMMANDS

init: Initialization (links TKLBAM to Hub account)
passphrase: Change passphrase of backup encryption key
escrow: Create a backup escrow key (Save this somewhere safe)
backup: Backup the current system
list: List backup records
restore: Restore a backup
restore-rollback: Rollback last restore

EXAMPLE USAGE SCENARIO

Alon is developing a new web site. He starts by deploying TurnKey LAMP to a virtual machine running on his laptop. This will serve as his local development server. He names it DevBox.

He customizes DevBox by:

After a few days of hacking on the web application, Alon is ready to show off a prototype of his creation to some friends from out of town.

He logs into the TurnKey Hub and launches a new TurnKey LAMP server in the Amazon EC2 cloud. He names it CloudBox.

On both DevBox and CloudBox Alon installs and initializes TKLBAM with the following commands:

apt-get update
apt-get install tklbam

# The API Key is needed to link tklbam to Alon's Hub account
tklbam-init QPINK3GD7HHT3A

On DevBox Alon runs a backup:

root@DevBox:~# tklbam-backup

Behind the scenes, TKLBAM downloads a profile from the Hub for the version of TurnKey LAMP Alon is using. The profile describes the state of DevBox right after installation, before Alon customized it. This allows TKLBAM to detect all the files and directories that Alon has added or edited since. Any new packages Alon installed are similarly detected.

As for his MySQL databases, it's all taken care of transparently, but if Alon dug deeper he would discover that their full contents are being serialized and encoded into a special file structure optimized for efficiency on subsequent incremental backups. Between backups Alon usually only updates a handful of tables and rows, so the following incremental backups are very small, just a few KB!

When TKLBAM is done calculating the delta and serializing database contents, it invokes Duplicity to encode backup contents into a chain of encrypted backup volumes which are uploaded to Amazon S3.

When Alon's first backup is complete, a new record shows up in the Backups section of his TurnKey Hub account.

Now to restore the DevBox backup on CloudBox:

root@CloudBox:~# tklbam-list
# ID  SKPP  Created     Updated     Size (GB)  Label
   1  No    2010-09-01  2010-09-01  0.02       TurnKey LAMP

root@CloudBox:~# tklbam-restore 1

When the restore is done Alon points his browser to CloudBox's IP address and is delighted to see his web application running there, exactly the same as it does on DevBox.

Alon, a tinkerer at heart, is curious to learn more about how the backup and restore process works. By default, the restore process reports what it's doing verbosely to the screen. But Alon had a hard time following the output in real time, because everything happened so fast! Thankfully, all the output is also saved to a log file at /var/log/tklbam-restore.

Alon consults the log file and can see that only the files he added or changed on DevBox were restored to CloudBox. Database state was unserialized. The xcache package was installed via the package manager. User alon was recreated. Its uid didn't conflict with any other existing user on CloudBox, so the restore process didn't need to remap it to another uid and fix ownership of Alon's files. Not that it would matter to Alon either way. It's all automagic.

FILES

SEE ALSO

tklbam-faq (7)