Backup and Restore

Introduction

Backing up ReSeeD essentially means copying its Docker volumes to a different location; restoring means copying them back as needed.

The following data volumes are used by ReSeeD:

rdms_app
rdms_cache
rdms_db
rdms_db-fcrepo
rdms_derivatives
rdms_fcrepo
rdms_file_uploads
rdms_redis
rdms_solr

ReSeeD (Hyrax) typically writes to more than one volume in a single operation, so the volumes must be backed up and restored as a complete set. For the same reason, the volumes should not be copied while ReSeeD is writing to them.
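Because the volumes only make sense as a complete set, it can help to confirm that every volume is present before starting a backup. The following is a minimal sketch, not part of the ReSeeD tooling: the function name missing_volumes is hypothetical, and the volume names are simply those listed above.

```shell
#!/usr/bin/env bash
# Volume names as listed on this page.
RESEED_VOLUMES="rdms_app rdms_cache rdms_db rdms_db-fcrepo rdms_derivatives rdms_fcrepo rdms_file_uploads rdms_redis rdms_solr"

# Hypothetical helper: print each expected volume that Docker does not know.
# Returns 0 when the set is complete, 1 when something is missing,
# and 2 when the Docker CLI call itself fails.
missing_volumes() {
  local existing v missing=0
  existing="$(docker volume ls -q)" || return 2
  for v in $RESEED_VOLUMES; do
    # -x matches whole lines only, so rdms_db does not match rdms_db-fcrepo.
    if ! printf '%s\n' "$existing" | grep -qxF -- "$v"; then
      echo "$v"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling missing_volumes before a backup and aborting on a non-zero exit status avoids producing an incomplete set of copies.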

General process

There are two options for ensuring that ReSeeD is not writing to the volumes at a particular moment in time:

Option 1: Stop the containers

This option disrupts the service. However, since RUB intends to reboot the Docker host system on a regular basis, it may still be a viable approach if the backup is combined with the reboot:

Stop all containers
Back up all volumes
Reboot Docker host
Start all containers
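The sequence above could be scripted roughly as follows. This is a sketch under assumptions: the function names and the state file path are hypothetical, the backup step is passed in as an arbitrary command, and the reboot itself is left to the operator. It records which containers were running so that exactly those can be started again after the reboot.

```shell
#!/usr/bin/env bash
# Hypothetical location for remembering which containers were running.
STATE_FILE="${STATE_FILE:-/var/tmp/reseed-stopped-containers}"

# Record the IDs of all running containers, then stop them.
stop_all_containers() {
  docker ps -q > "$STATE_FILE" || return 1
  [ -s "$STATE_FILE" ] || return 0   # nothing was running
  # Word splitting is intended: one ID per line becomes one argument each.
  docker stop $(cat "$STATE_FILE")
}

# Start the containers recorded by stop_all_containers.
start_stopped_containers() {
  [ -s "$STATE_FILE" ] || return 0
  docker start $(cat "$STATE_FILE")
}

# Stop everything, then run the backup command given as arguments.
# On success the containers stay stopped, ready for the host reboot.
# On failure they are started again so the service is not left down.
offline_backup() {
  stop_all_containers || return 1
  if ! "$@"; then
    echo "backup failed; starting containers again" >&2
    start_stopped_containers
    return 1
  fi
}
```

A possible use is to call offline_backup with the site's own backup command, reboot the host if it succeeds, and call start_stopped_containers once the host is back up. Containers configured with a Docker restart policy may come back on their own after the reboot.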

Option 2: Place Hyrax in "read-only" mode

This option is less disruptive to the service. The system remains available to users (and to remote processes using the APIs) in a "read-only" state. Since nothing is written in this state, the data volumes can be copied safely and the resulting set of copies is internally consistent. The process would be:

Place Hyrax into "read-only" mode. This can be done via the Admin UI or programmatically.
Back up all volumes
Place Hyrax back into "read-write" mode. This can be done via the Admin UI or programmatically.

Placing Hyrax into read-only mode programmatically:

Flipflop::FeatureSet.current.replace do
  Flipflop.configure do
    feature :read_only, default: true
  end
end
puts "Read only mode: #{Flipflop.read_only?}"
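However the mode is toggled, the important property of the Option 2 sequence is that Hyrax is returned to read-write mode even when the backup step fails. The sketch below illustrates only that control flow. The two hook functions are hypothetical placeholders: how they actually reach the Admin UI or a Rails console depends on the deployment and is not specified on this page.

```shell
#!/usr/bin/env bash
# Hypothetical hooks. Replace the bodies with whatever toggles the Hyrax
# read_only feature in your deployment; these placeholders only print a note.
enter_read_only() { echo "TODO: switch Hyrax to read-only mode"; }
leave_read_only() { echo "TODO: switch Hyrax back to read-write mode"; }

# Run the backup command given as arguments while Hyrax is read-only.
# Read-write mode is restored whether or not the backup succeeds, and the
# exit status of the backup command is passed back to the caller.
backup_in_read_only() {
  local status=0
  enter_read_only || return 1
  "$@" || status=$?
  if ! leave_read_only; then
    echo "WARNING: Hyrax may still be in read-only mode" >&2
    return 1
  fi
  return "$status"
}
```

The backup command is passed as arguments, for example a script that archives all volumes, so the same wrapper can be reused with any backup method.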

Copying Docker volumes for backup

The supported way to back up and restore Docker volumes is to run a temporary Docker container for this purpose. Note that "run" has a specific meaning here: running a container means that it is created, started, executes the given command and stops when that command exits, all in one operation. This approach suits operations which require no interaction beyond the parameters passed at "runtime".
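Applied to the named volumes used by ReSeeD, such a one-off container could be run once per volume. The following is a sketch under assumptions rather than a tested procedure: the function names, the one-tar-file-per-volume layout and the choice of the ubuntu image are illustrative, and the backup directory must be given as an absolute path so that Docker treats it as a host directory and not as a volume name.

```shell
#!/usr/bin/env bash
# Volume names as listed on this page.
RESEED_VOLUMES="rdms_app rdms_cache rdms_db rdms_db-fcrepo rdms_derivatives rdms_fcrepo rdms_file_uploads rdms_redis rdms_solr"

# Archive one named volume into <backup_dir>/<volume>.tar.
# The volume is mounted read-only; --rm removes the container afterwards.
backup_volume() {
  local volume="$1" backup_dir="$2"
  docker run --rm \
    -v "$volume":/volume:ro \
    -v "$backup_dir":/backup \
    ubuntu tar cf "/backup/$volume.tar" -C /volume .
}

# Archive every ReSeeD volume, stopping at the first failure so that an
# incomplete set is not mistaken for a good backup.
backup_all_volumes() {
  local backup_dir="$1" v
  mkdir -p "$backup_dir" || return 1
  for v in $RESEED_VOLUMES; do
    if ! backup_volume "$v" "$backup_dir"; then
      echo "backup of $v failed" >&2
      return 1
    fi
  done
}
```

Writing each volume to its own archive keeps the set easy to inspect, and using tar with -C stores paths relative to the volume root so that the archive can later be unpacked into any mount point.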

(The following is copied verbatim from the Docker documentation)

Back up a volume

For example, create a new container named dbstore:

 docker run -v /dbdata --name dbstore ubuntu /bin/bash

In the next command:

Launch a new container and mount the volume from the dbstore container
Mount a local host directory as /backup
Pass a command that tars the contents of the dbdata volume to a backup.tar file inside our /backup directory.

 docker run --rm --volumes-from dbstore -v $(pwd):/backup ubuntu tar cvf /backup/backup.tar /dbdata

When the command completes and the container stops, it creates a backup of the dbdata volume.

Restore volume from a backup

With the backup just created, you can restore it to the same container, or to another container that you created elsewhere.

For example, create a new container named dbstore2:

 docker run -v /dbdata --name dbstore2 ubuntu /bin/bash

Then, un-tar the backup file in the new container’s data volume:

 docker run --rm --volumes-from dbstore2 -v $(pwd):/backup ubuntu bash -c "cd /dbdata && tar xvf /backup/backup.tar --strip 1"

You can use the techniques above to automate backup, migration, and restore testing using your preferred tools.
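For ReSeeD, a restore along the same lines might look like the sketch below. It rests on several assumptions: one archive per volume named <volume>.tar, each created with paths relative to the volume root (so no --strip option is needed), an absolute path for the backup directory, and hypothetical function names. It first checks that the full set of archives is present, since the volumes should only be restored as a complete set.

```shell
#!/usr/bin/env bash
# Volume names as listed on this page.
RESEED_VOLUMES="rdms_app rdms_cache rdms_db rdms_db-fcrepo rdms_derivatives rdms_fcrepo rdms_file_uploads rdms_redis rdms_solr"

# Unpack <backup_dir>/<volume>.tar into the named volume.
# Docker creates the named volume if it does not exist yet.
restore_volume() {
  local volume="$1" backup_dir="$2"
  docker run --rm \
    -v "$volume":/volume \
    -v "$backup_dir":/backup:ro \
    ubuntu tar xf "/backup/$volume.tar" -C /volume
}

# Restore every ReSeeD volume, but only if an archive exists for each one.
restore_all_volumes() {
  local backup_dir="$1" v
  for v in $RESEED_VOLUMES; do
    if [ ! -f "$backup_dir/$v.tar" ]; then
      echo "missing archive: $v.tar" >&2
      return 1
    fi
  done
  for v in $RESEED_VOLUMES; do
    restore_volume "$v" "$backup_dir" || return 1
  done
}
```

Unpacking does not delete files that already exist in a volume, so restoring into empty volumes while the ReSeeD containers are stopped is the safer choice.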
