Backups

Special Cases


This is part of an exploration into using Linux as a central Backup server : a server that creates backups of (data on) several networked hosts and allows you to centrally manage all backup jobs.

File backups are easy : it's usually just a matter of creating and maintaining one or more up-to-date copies of your files, possibly with compression to save disk or tape space and a retention scheme so you can roll back to one of several older versions of your file(s) (see also here).
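As a rough sketch of the "copy, compress, retain" idea, the function below writes one dated, compressed archive per run and removes archives that have become too old. The function name, its arguments and the files-DATE.tar.gz naming are assumptions made for this illustration only.

```shell
#!/bin/sh
# Sketch only: names and layout are illustrative assumptions.

# backup_files SOURCE_DIR BACKUP_DIR KEEP_DAYS
# Prints the path of the archive it created.
backup_files() {
    source_dir="$1"
    backup_dir="$2"
    keep_days="$3"
    archive="$backup_dir/files-$(date +%F).tar.gz"

    mkdir -p "$backup_dir" || return 1

    # Compressed copy of the directory, stored relative to its parent
    tar -czf "$archive" \
        -C "$(dirname "$source_dir")" "$(basename "$source_dir")" || return 1

    # Retention: remove archives older than KEEP_DAYS days
    # (find counts age in whole days)
    find "$backup_dir" -name 'files-*.tar.gz' -mtime +"$keep_days" \
        -exec rm -f {} +

    echo "$archive"
}
```

Calling backup_files /home /mnt/backups/files 14 from a scheduled job would then keep roughly two weeks of daily archives (both paths are examples).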

Things get more complicated when the information you want to back up is stored in a database, a content management system, or some other application. Sure, you can copy these files, but that's not the point : you want a backup that allows you to reproduce the information or restore the application. A simple file copy will be insufficient in most of these cases. So we look into some special backup mechanisms to solve those special cases.

Backup of MySQL databases

When you back up a database, you don't only want a backup of your data, you also need a backup of the database structure so that the data can be restored in a useful form : organized in the same tables, records, fields, ... as in the original database. If your MySQL rdbms contains multiple databases, you want backups of all of them. You'll also want backups of the system databases (that contain user accounts, privileges, ...).

Databases can be built entirely from SQL statements, and SQL statements can also be used to insert data into tables, so the most generic way to reproduce or rebuild a database is to run SQL scripts that create the database, create tables, views, stored procedures, ..., and insert data into the tables. The most generic way to create a database backup is therefore to create those SQL scripts.

The makers of MySQL understand this very well, so they've created the 'mysqldump' tool that can be used to generate all required SQL scripts. From the mysqldump man page :

	The mysqldump client is a backup program originally written by Igor
       Romanenko. It can be used to dump a database or a collection of
       databases for backup or transfer to another SQL server (not necessarily
       a MySQL server). The dump typically contains SQL statements to create
       the table, populate it, or both. However, mysqldump can also be used to
       generate files in CSV, other delimited text, or XML format.

       If you are doing a backup on the server and your tables all are MyISAM
       tables, consider using the mysqlhotcopy instead because it can
       accomplish faster backups and faster restores. See mysqlhotcopy(1).

       There are three general ways to invoke mysqldump:

          shell> mysqldump [options] db_name [tables]
          shell> mysqldump [options] --databases db_name1 [db_name2 db_name3...]
          shell> mysqldump [options] --all-databases
	

Your MySQL backup mechanism will thus consist of a scheduled job with the appropriate mysqldump statements. The resulting SQL scripts are plain text files that can be copied or moved to a network server and included in your file backup mechanism. The restore procedure will consist of setting up a MySQL server (see further for system restore methods) and running the SQL scripts in the appropriate order. You may want to test and document this in your restore procedures, and/or create a script that executes the scripts in the required order.
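A minimal sketch of such a job and of the matching restore step could look like the following. Only the --all-databases option comes from the man page quoted above; the function names, the directory arguments and the file naming are assumptions made for this example. It also assumes that the mysqldump and mysql clients find their credentials in an option file such as ~/.my.cnf.

```shell
#!/bin/sh
# Sketch only: function names, directory layout and file naming are
# illustrative assumptions, not part of MySQL.

# dump_all_databases DIR
# Writes one dated SQL script containing every database and prints its path.
dump_all_databases() {
    target_dir="$1"
    outfile="$target_dir/all-databases-$(date +%F).sql"

    mkdir -p "$target_dir" || return 1

    # Dump to a temporary name first, so that a failed run never leaves a
    # truncated file that looks like a good backup.
    if mysqldump --all-databases > "$outfile.tmp"; then
        mv "$outfile.tmp" "$outfile" && echo "$outfile"
    else
        rm -f "$outfile.tmp"
        return 1
    fi
}

# restore_from_dir DIR
# Feeds every .sql script in DIR to the mysql client, in the sorted order
# of the file names, and stops at the first failure.
restore_from_dir() {
    for script in "$1"/*.sql; do
        [ -f "$script" ] || continue
        mysql < "$script" || return 1
    done
}
```

A scheduled job could source this file and call dump_all_databases /var/backups/mysql once a night (the path is only an example). If you dump databases to separate files, naming them with a numeric prefix (01-..., 02-...) is one way to let restore_from_dir run them in the required order.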

The technique of dumping a database to SQL text can be used with any relational database. Most rdbms have tools to do this, but in a worst-case scenario you could just build the required create table, insert into and other SQL statements from the output of interactive or scripted SQL commands.
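To illustrate that worst-case approach, here is a rough sketch that turns tab-separated query output into insert statements. The function name is hypothetical, every value is written as a quoted string, and NULLs, column types and binary data are not handled, so treat it as an illustration rather than a replacement for a real dump tool.

```shell
#!/bin/sh
# Sketch only: rows_to_inserts is a hypothetical helper.

# rows_to_inserts TABLE
# Reads tab-separated rows on stdin and writes one INSERT statement per row.
rows_to_inserts() {
    awk -F '\t' -v table="$1" -v q="'" '{
        line = ""
        for (i = 1; i <= NF; i++) {
            value = $i
            gsub(q, q q, value)      # double any single quote in the value
            line = line (i > 1 ? ", " : "") q value q
        }
        printf "INSERT INTO %s VALUES (%s);\n", table, line
    }'
}
```

With the mysql client, tab-separated rows without a header line can be obtained with the --batch and --skip-column-names options, so the output of a select on a table could be piped straight into rows_to_inserts with that table's name as argument.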

Apart from generic SQL dumps, most rdbms also let you write database backups to a (binary) file, which you can include in a file backup and use to restore your databases. See for instance the mysqlhotcopy tool, MS SQL Server's maintenance and backup jobs, etc.


Backup of a wiki or other dynamic websites

Dynamic websites such as a wiki or other content management systems usually run on top of a database and a web server, so your backup will consist of a backup of the web server's configuration, the database server's configuration, and a backup of the data in the database. These backups are discussed elsewhere on this page.

A special case is when you use a wiki or another web application as a repository for system documentation or as online help for your sysadmin work. This creates a catch-22 : if your wiki (or the underlying infrastructure) fails, you have no access to the information and documentation you need to restore it.

A possible workaround is to export all web pages in your wiki to static html pages, ideally with all internal links still working. These can be browsed from a hard disk or an emergency web server, so you'll have access to the information contained in the wiki, even when the wiki is down. And obviously, you can include those static pages in an ordinary file backup as well.

This is of course no substitute for a proper database and system backup. You'll still need those as well if you want to restore your wiki / cms / web application / ..., but at least you'll still have access to the information you need, and it can serve as a temporary 'read only' solution until your web server, rdbms, and web applications are back up.

Some wiki systems have built-in tools to export their contents to static html files. If your wiki doesn't, you can use a web harvesting or http download tool to accomplish the same. In the following example, we'll use wget to create a 'mirror for off-line browsing'.

	wget --mirror -p --html-extension --convert-links -P /mnt/backups/wiki/$(date +%F) -nH http://wiki.intranet.xx
	

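man wget and wget --help will show more options.
The options we used here :

	--mirror : turn on recursion and time-stamping, with unlimited recursion depth
	-p (--page-requisites) : also download the images, stylesheets and other files needed to display each page
	--html-extension : save html pages with an .html suffix (newer wget versions call this --adjust-extension)
	--convert-links : rewrite the links in the downloaded pages so they work for local viewing
	-P (--directory-prefix) : save everything under the given directory, here one directory per date
	-nH (--no-host-directories) : do not create a top-level directory named after the host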

Keep in mind that you cannot use this download to restore your wiki or website, because wget will have modified the pages so they're suitable for off-line browsing. Links, paths and file names will have been modified. This mirror is only meant to provide basic off-line access to the information contained in the wiki or website.


Operating systems and system configuration

In case of a major problem, you may need to restore an entire server. You can do this by hand, and hopefully you have adequate documentation of all relevant configuration. Or maybe you've made a copy of the entire system, or made disk images, or something similar, but if you need to reproduce the system on different hardware, you'll still have to tweak and modify the new system to get it all up and running again. Hopefully you've documented the required tweaks and modifications in your restore procedures.

It is possible to automate all (or a lot) of the above, e.g. by using scripts : you create scripts to document your systems and their configuration, and you use this 'documentation' as input to other scripts that you run to set up a new system. Obviously, when creating the documentation, you format the output so that it can be read again by a script in order to reproduce that system on a different host. This requires some effort, but it may well be worth it. The scripts themselves, and the output they produce, are all ordinary files that will be included in your file backup scheme. Here are some notes on automated system documentation and automating server and workstation configuration:
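As a minimal sketch of the idea of documentation that a script can read back, the first function below records a few system facts as key=value lines and the second one retrieves a single value from such a file. The function names, the key names and the choice of facts are assumptions made for this example; a real setup would record far more (packages, network settings, services, ...).

```shell
#!/bin/sh
# Sketch only: function and key names are illustrative assumptions.

# document_system FILE
# Records a few facts about this host as key=value lines.
document_system() {
    {
        printf 'hostname=%s\n' "$(uname -n)"
        printf 'kernel=%s\n' "$(uname -r)"
        printf 'architecture=%s\n' "$(uname -m)"
    } > "$1"
}

# read_setting FILE KEY
# Prints the value recorded for KEY (first match only).
read_setting() {
    sed -n "s/^$2=//p" "$1" | head -n 1
}
```

A setup script on the replacement host could then call read_setting on the backed-up file, for example to look up the original host name before configuring the new system.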


Koen Noens
February 2009