Homemade internet database replication/backup...

kolonuk

Member
Hi all expert Progress users!

Sorry, this is a bit long, but I hope that this taxes your grey matter a little in a good way!

I have a requirement to do a daily replication and backup. I'm not sure exactly how to go about this, so here are my results so far.

Basically, we would like a server at a datacenter, with massive HDs and a fast internet connection, to handle the data replication.

Simply transferring a copy of the database is out of the question. It's over 4 GB (about 1.5 GB with tar/gzip or probkup -com), and even with a fast network connection at the server end, the customer may only have a 2-4 Mb ADSL line or worse! That's less of a problem out of hours, but there are still possible bandwidth limitations from the ISP (more bandwidth, more cost), and if we have 10 customers per server, that is expensive bandwidth on our part!

Also, as I need to keep a history (say, 14 days), I really only want to store the changes, plus a full copy to apply them to.

I know about various Progress offerings, but like any company, we want to save OUR money! Although, if we don't find a viable solution, then it'll have to be Progress!

I have tried various methods in testing, such as:

rsync - promising, but heavy data transfer (over 750 MB of "changes" a night would be transferred, at a rough estimate). I have tried this on a standard "probkup" on its own (750 MB transfer) and on "probkup -com" (over 1 GB transfer), and tar/gzipped transfers are out of the question (nearly 2.5 GB). I can understand the -com and tar/gzip sizes, as the file structure may be significantly different if only one piece of data is modified. I also tried running rsync against a database that was not running (just the various files that make up a database) and the transfer came down to 350 MB a go. Better by far, but I really don't think our customers have that much change overnight. The databases I tested with were from two consecutive random days...
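For reference, the sort of thing I was testing looks roughly like this (the database name "sports", the paths and the host are made up for the example, not our real setup):

# nightly: take an online backup of the live database, then rsync only the changed blocks offsite
probkup online sports /backups/sports.bak
rsync -av --compress --partial /backups/sports.bak backupuser@offsite:/backups/customer1/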

xdelta - produces binary deltas of two files. Good, but you need a full copy of the database on both machines (server and client). The delta is about 1.1 GB, which is almost the same as the rsync data transfer of the database backed up with "-com". Also, this is a two-step process - create the delta, then scp the delta over to the server. This is fine except that (assuming we have a good copy of the database from 14 days ago) to restore back to yesterday's copy, we have to apply 14 separate deltas. And if things go wrong...
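The two steps look something like this (xdelta3 syntax shown for illustration; the file names and host are made up):

# on the customer's machine: create a delta between yesterday's and today's backup, then ship it
xdelta3 -e -s sports.bak.yesterday sports.bak.today sports.delta
scp sports.delta backupuser@offsite:/backups/customer1/incoming/
# on the server: apply the delta to its copy to produce today's file
xdelta3 -d -s sports.bak.yesterday sports.delta sports.bak.today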

rdiff-backup - like xdelta, but remote-controllable like rsync (server and client talk intelligently to each other) and it creates reverse deltas! I couldn't get it to compile on my server, but basically it is supposed to create a delta, send it to the server and apply it to get today's copy of the database, then (the good part!) it goes off and creates a delta between today's and yesterday's files - a reverse delta! This way, if I want today's copy, I have it straight away, and if I then want the file from 14 days ago (very very very unlikely!!), only then do I have to apply 14 reverse deltas. The bonus is that this is all incorporated in one process/program, so there is only one point of failure for debugging etc.
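If I ever get it to build, the usage is supposed to be roughly this (paths and host are examples only):

# push the backup directory to the offsite server; the latest copy is kept ready to use
rdiff-backup /backups/ backupuser@offsite::/backups/customer1/
# pulling the version from 14 days ago applies the stored reverse deltas
rdiff-backup --restore-as-of 14D backupuser@offsite::/backups/customer1/sports.bak /tmp/sports.bak.14d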

Other than that, I am a bit lost... I could implement the xdelta method, but signalling the server that the delta is available for it to play with is a bit flaky. Because it is a separate process, it could be a point of failure - for example, if the transfer didn't happen, the server would apply yesterday's delta instead of today's and create a fraudulent database claiming to be something it is not. That could be solved by moving the delta into a directory named with today's date and only using a delta if one is there (rough sketch below). I know there are other checking methods and whatnot, but it means anticipating everything else that could go wrong.
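Something along these lines on the server side (hypothetical paths and alert address, just to show the idea):

# only apply a delta if one actually landed in today's directory
TODAY=$(date +%Y-%m-%d)
if [ -f /backups/customer1/$TODAY/sports.delta ]; then
    xdelta3 -d -s /backups/customer1/sports.bak /backups/customer1/$TODAY/sports.delta /backups/customer1/sports.bak.new
else
    echo "No delta received for $TODAY" | mail -s "backup warning" admin@example.com
fi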

I have heard about ai (after-imaging), but I'm not sure how this would work for a historical setup. I'm having a look now. Also, triggers on each and every table would be very messy, so that would be a very VERY last resort!!

If anyone has any bright ideas, please let me know, and when I have a working solution, I'll be sure to share it!

Cheers,
 

TomBascom

Curmudgeon
Understanding why you are doing this and what use you have for the replicated databases would go a long way towards advising you on a strategy.
 

kolonuk

Member
Essentially, it is a failover scheme. If the customer's server fails for some reason (and the hardware support cannot recover the data on the disks), it is just peace of mind. We encourage them to use the backup process we provide (database, emails, user files etc.), but many of them forget, don't call support if it breaks, don't know who does it, the person goes on holiday, don't take the backup offsite, etc. (typical users!). At least this way, the powers that be are assured that a backup is being taken off site by the people who provide the system.

As this is just in the planning stages, I'm not sure what the recovery procedure would be. Probably switch to a different backup server (not the fileserver) at the same datacenter and copy the database over there for the customer to use temporarily. Then get a replacement server sorted at the customer's site and transfer it all back (via a portable USB drive or similar). Maximum downtime, say 1 hour until every user is working from the backup database. Again, this is just at the planning stage, and a far cry from implementation.

In fact, as with all backups, we probably won't know whether it is working properly until something goes wrong!
 

TomBascom

Curmudgeon
That changes things substantially.

First the good news -- since no data is going to be changed at the recovery site, the best way to do this is via some form of replication. You could use OpenEdge Replication (formerly known as Fathom Replication) or you can roll your own log-based replication strategy. ("Log-based replication" is a fancy way of saying "ai files".)

Now the bad news -- this almost certainly means that someone needs to be paying for a DR license. A DR license is priced at 50% of the cost of the production license.

You must start with a backup of the target database. You should be prepared to occasionally "reseed" your recovery site because various things will require you to restart after-imaging. If you do it right and manage it well that probably won't happen more than a couple of times per year but it's a big pain when it does happen.

You then enable after-imaging on the source and transfer the ai logs as they are filled. Compress them first to save bandwidth. On the receiving end uncompress them and roll them forward.
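In outline, the cycle looks something like this (the database name "sports" and the paths are purely illustrative, and the exact rfutil options vary a bit by version):

# one-time seed: back up the source and restore that backup at the recovery site
probkup sports /backups/sports.bak
prorest sports /backups/sports.bak                       # run at the recovery site
rfutil sports -C aimage begin                            # enable after-imaging on the source
# repeated as each ai extent fills on the source:
gzip < /db/ai/sports.a1 > /tmp/sports.a1.gz
scp /tmp/sports.a1.gz backupuser@offsite:/backups/customer1/ai/
rfutil sports -C aimage extent empty                     # free the archived extent
# at the recovery site: uncompress and roll forward, strictly in order
gunzip /backups/customer1/ai/sports.a1.gz
rfutil sports -C roll forward -a /backups/customer1/ai/sports.a1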
 

kolonuk

Member
Sounds interesting... How would this work with historical backups? I suppose, after you do the roll forward, you could take a regular backup... Could you store the ai files, then apply them when needed?
 

TomBascom

Curmudgeon
Sounds interesting... How would this work with historical backups? I suppose, after you do the roll forward, you could take a regular backup... Could you store the ai files, then apply them when needed?

Yes.

There are, of course, a lot of details and caveats but every Progress DBA should become intimately familiar with the after-imaging process. It is a DBA's best friend.
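In outline, keeping history that way looks something like this (the database name and file names are illustrative only; check the roll forward options for your version):

# keep the 14-day-old backup as the seed, plus every archived ai file since then
prorest sports /archive/seed/sports.bak
# roll forward only as far as the day you want to get back to
rfutil sports -C roll forward -a /archive/ai/sports-day01.a1
rfutil sports -C roll forward -a /archive/ai/sports-day02.a1
# then take a regular backup of that point in time if you want to keep it around
probkup sports /archive/sports-asof-day02.bak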
 

kolonuk

Member
Well, we now have a solution that seems to work (although only in beta!)

We are using rsync over ssh. This runs every night to an off-site location. The process also transmits a file containing the date of the backup. That date file is then interrogated, and if it is later than the last date file it saw (yesterday's), the off-site copy is gzipped to an archive location. On a customer's live database and related user files, 5.5 GB in total, it takes about 80-90 MB of transfer data and compresses to about 1.6 GB. The rsync is run against a live database, so you only need to do a prorest and it seems fine. This is also run against our development database and our own live one.

There seem to be no failures during transfer once it has everything at the backup site (we had problems with the connection dropping on files over 1 GB the first time round, but with rsync's "partial" option it carries on where it left off the next time it is run), and on a rolling 7-day backup the data storage seems fairly good.
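Very roughly, the shape of the nightly job is something like this (hosts, paths and the database directory are made up; this isn't our actual script):

DATESTAMP=$(date +%Y-%m-%d)
echo "$DATESTAMP" > /db/live/backup-date.txt             # the date file that gets shipped
rsync -az --partial -e ssh /db/live/ backupuser@offsite:/backups/customer1/current/
# off site, a cron job compares backup-date.txt with the last date it saw; if it is
# newer, it gzips the current copy into the rolling 7-day archive
tar czf /archive/customer1/$DATESTAMP.tar.gz /backups/customer1/current/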

Have yet to test the integrity, but it seems a viable option...

will post an update later...
 