Difference between revisions of "Backup/Restore procedure"

From ScarletDME

Revision as of 11:36, 10 December 2008

Current Strategy

Backup

A backup script currently runs daily at 06:00 UTC.

The VMware virtual machine is snapshotted, and its file system is then rsync'd to the backup location while the VM keeps running.
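
The snapshot-then-rsync sequence could be sketched roughly as below. This is only an illustration, not the actual backup script: the function names, the directory arguments and the `snapshot_cmd` parameter are hypothetical, and the snapshot command itself is left to the caller because the page does not say how the snapshot is taken.

```python
import subprocess
from typing import List, Sequence


def build_rsync_command(source_dir: str, backup_dir: str) -> List[str]:
    """Build an rsync command that mirrors source_dir into backup_dir."""
    # A trailing slash on the source makes rsync copy the directory's
    # contents rather than the directory itself.
    src = source_dir.rstrip("/") + "/"
    dst = backup_dir.rstrip("/") + "/"
    # -a is archive mode (recursive, keeps permissions, timestamps, symlinks);
    # --delete removes files from the backup that are gone from the source.
    return ["rsync", "-a", "--delete", src, dst]


def run_backup(snapshot_cmd: Sequence[str], source_dir: str, backup_dir: str,
               runner=subprocess.run) -> None:
    """Take the snapshot first, then copy the files; stop on any failure."""
    # check=True raises CalledProcessError on a non-zero exit status, so a
    # failed snapshot prevents the rsync step from running at all.
    runner(list(snapshot_cmd), check=True)
    runner(build_rsync_command(source_dir, backup_dir), check=True)
```

Raising on the first failed step means a broken snapshot cannot be followed by a copy that looks successful.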

Restore to original host

Copy the rsync'd backup back to its original location.

Remove the snapshot, then restart the VM.

Restore to another host

As for restoring to the original host, plus:

VMware Server 2.0 must be installed on the new host.

The gpl.openqm.com DNS record must be changed to point to the new server.
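
After a restore to another host, one way to confirm the DNS change has taken effect is a small lookup check like the sketch below. The function name and the injectable `resolve` parameter are hypothetical, and the address in the usage line is a documentation placeholder, not the real server address.

```python
import socket
from typing import Callable


def dns_points_to(hostname: str, expected_ip: str,
                  resolve: Callable[[str], str] = socket.gethostbyname) -> bool:
    """Return True if hostname currently resolves to expected_ip (IPv4)."""
    try:
        return resolve(hostname) == expected_ip
    except OSError:
        # Lookup failures (socket.gaierror is a subclass of OSError)
        # are treated as "not pointing at the new server".
        return False
```

Usage would look like `dns_points_to("gpl.openqm.com", "192.0.2.10")`, with the placeholder replaced by the new server's address. Cached DNS records may make the old address appear for a while after the change.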

Problems with current strategy

  1. Only SteveB has an overview of, and responsibility for, the procedure. This is a single point of failure.
  2. There is only one physical backup copy, overwritten daily, so if a corrupt system is backed up there is no way to go back more than one day.
  3. There is no regular verification that the backup copy is usable, so the backup procedure could fail silently, e.g. if there is anything wrong with the script logic.
  4. VMSnapshot/Rsync is not a fully proven method, but it results in minimal downtime, whereas VMSuspend/Rsync would result in perhaps 30 minutes of downtime.
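
As an illustration of what a regular verification step (problem 3) might look like, the sketch below compares SHA-256 digests of every file in a source tree against the backup tree. It is one possible approach rather than part of the current procedure, and the function names are hypothetical. Files that the running VM is still writing to will legitimately differ, so a check like this is only meaningful for files that are not changing, for example with the VM suspended.

```python
import hashlib
import os
from typing import Dict, List, Tuple


def tree_digests(root: str) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            h = hashlib.sha256()
            with open(full, "rb") as f:
                # Read in 1 MiB chunks so large disk images fit in memory.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[os.path.relpath(full, root)] = h.hexdigest()
    return digests


def compare_trees(source: str, backup: str) -> Tuple[List[str], List[str]]:
    """Return (missing, changed): source files absent from or differing in backup."""
    src = tree_digests(source)
    dst = tree_digests(backup)
    missing = sorted(p for p in src if p not in dst)
    changed = sorted(p for p in src if p in dst and src[p] != dst[p])
    return missing, changed
```

If both returned lists are empty, every source file has an identical copy in the backup; anything else is worth reporting to more than one person.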