Hi all expert progress users!
Sorry, this is a bit long, but I hope that this taxes your grey matter a little in a good way!
I have a requirement to do a daily replication and backup. Not sure exactly how to go about this, so here are my results so far.
Basically, we would like a server at a datacenter, with massive hard drives and a fast internet connection, to handle the data replication.
Simply transferring a full copy of the database is out of the question: it is over 4 GB (1.5 GB with tar/gzip or -com), and even with a fast network connection at the server end, the customer may only have 2-4 Mb ADSL or worse! Not too much of a problem out of hours, except for possible bandwidth limitations at the ISP (more bandwidth, more cost), and if we have 10 customers per server, that is expensive bandwidth on our part!
Also, as I need to keep a history (say, 14 days), really I only want the changes stored, with a full copy to apply them to.
I know about various Progress offerings, but like any company, we want to save OUR money! Although, if we don't find a viable solution, then it'll have to be Progress!
I have tried various methods in testing, such as:
rsync - promising, but heavy data transfer (over 750 MB of "changes" a night at a rough estimate). I tried it against a standard probkup on its own (750 MB transferred) and against probkup -com (over 1 GB), and tar/gzipped transfers are out of the question (nearly 2.5 GB). The -com and tar/gzip sizes make sense: if even one piece of data changes, the compressed file structure can differ significantly. I also ran rsync against the raw files of a database that was not running (just the various files that make up a database), and the transfer came down to 350 MB a go. Better by far, but I really don't think our customers have that much change overnight. The databases I tested with were from two consecutive random days...
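For what it's worth, the rsync run against the shut-down database looks roughly like this (the host, user and paths are made up for the example):

```shell
#!/bin/sh
# Hypothetical nightly rsync push -- host, user and paths are examples only.
DB_DIR=/db/live          # the raw .db/.b1/.d1 files (database shut down first!)
DEST=backup@datacenter.example.com:/backups/customer1
STAMP=$(date +%Y-%m-%d)  # one dated directory per night

push_backup() {
    # -a preserves attributes, -z compresses in transit,
    # --partial keeps a half-sent file so an interrupted run can resume
    rsync -az --partial "$DB_DIR/" "$DEST/$STAMP/"
}

# push_backup    # uncomment on the live box
echo "would sync $DB_DIR to $DEST/$STAMP"
```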
xdelta - produces a binary delta of two files. Good, but you need a full copy of the database on both machines (server and client). The delta came out at about 1.1 GB, almost the same as the rsync transfer of the database backed up with -com. It is also a two-step process: create the delta, then scp it over to the server. That works, except that (assuming we keep a good copy of the database from 14 days ago) restoring yesterday's copy means applying 14 separate deltas. And if things go wrong...
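The two-step xdelta flow, sketched with xdelta3 syntax (filenames and host are examples, not our real layout):

```shell
#!/bin/sh
# Two-step xdelta flow (xdelta3 syntax) -- filenames and host are examples.
OLD=/backups/db-yesterday.bak
NEW=/backups/db-today.bak
DELTA=/backups/deltas/$(date +%Y-%m-%d).xd

make_and_ship() {
    xdelta3 -e -s "$OLD" "$NEW" "$DELTA"                   # -e encode, -s source file
    scp "$DELTA" backup@datacenter.example.com:/deltas/
}

# Server side: rebuild today's copy from yesterday's copy plus the delta.
apply_delta() {
    xdelta3 -d -s "$OLD" "$DELTA" /backups/db-rebuilt.bak  # -d decode
}
```

Restoring from the 14-day-old copy means chaining apply_delta once per day's delta, which is exactly the weak point above.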
rdiff-backup - like xdelta, but remote-controllable like rsync (server and client talk intelligently to each other), and it creates reverse deltas! I couldn't get it to compile on my server, but it is supposed to create a delta, send it to the server, and apply it there to get today's copy of the database; then (the good part!) it creates a delta between today's and yesterday's files - a reverse delta! That way today's copy is available straight away, and only if I want the file from 14 days ago (very very very unlikely!!) do I have to apply 14 reverse deltas. The bonus is that this is all one process/program, so there is only one point of failure for debugging etc.
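For anyone who does get it compiling, the rdiff-backup calls should look something like this (host, paths and the 14-day window are examples; this is from the docs, not a tested setup):

```shell
#!/bin/sh
# rdiff-backup keeps the newest copy as plain files, plus reverse deltas
# under rdiff-backup-data/ -- host and paths are examples.
SRC=/backups/current      # directory holding tonight's probkup file
DEST=backup@datacenter.example.com::/backups/customer1

nightly() {
    rdiff-backup "$SRC" "$DEST"                    # ships a delta, stores a reverse diff
    rdiff-backup --remove-older-than 14D "$DEST"   # prune history beyond 14 days
}

# Only needed in the "very very very unlikely" case:
restore_14_days_ago() {
    rdiff-backup -r 14D "$DEST" /restore/db-14-days-ago
}
```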
Other than that, I am a bit lost... I could implement the xdelta method, but signalling the server that a delta is ready to play with is a bit flaky, as it is a separate process and another point of failure. For example, if the transfer didn't happen, the server might apply yesterday's delta instead of today's and create a duplicate fraudulent database claiming to be what it is not. That could be solved by dropping the delta into a directory named for today's date - if the delta is there, use it. I know there are other checking methods and what-not; it is a matter of anticipating what else can go wrong.
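The dated-directory check could be as small as this (the /incoming layout is hypothetical):

```shell
#!/bin/sh
# Server side: only apply a delta that sits in a directory named for today,
# so a missed transfer can never be mistaken for tonight's data.
TODAY=$(date +%Y-%m-%d)
DELTA=/incoming/$TODAY/db.xd

if [ -f "$DELTA" ]; then
    echo "delta for $TODAY present - safe to apply"
else
    echo "no delta for $TODAY - skipping, not falling back to yesterday's"
fi
```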
I have heard about AI (after-imaging), but I'm not sure how it would work for a historical setup - I'm having a look now. Also, triggers on each and every table would be very messy, so that would be a very VERY last resort!!
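For anyone else reading along, the after-imaging commands I'm starting to look at are roughly these ("sports" is a placeholder database name, and this is just my current understanding, not a working setup):

```shell
#!/bin/sh
# After-imaging sketch: enable AI, switch extents nightly, roll forward on the
# server.  "sports" is a placeholder database name.

enable_ai() {
    # AI has to be enabled against a freshly backed-up database
    rfutil sports -C aimage begin
}

nightly_switch() {
    rfutil sports -C aimage new    # start filling the next AI extent
    # then ship the just-filled extent (only the day's changes) to the datacenter
}

roll_forward() {
    # server side: restore the old probkup, then replay each day's AI file in order
    rfutil sports -C roll forward -a sports.a1
}
```

The appeal is that the AI extents are exactly the "changes only" I'm after, at a fraction of the delta sizes above.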
If anyone has any bright ideas, please let me know, and when I have a working solution, I'll be sure to share it!
Cheers,