"roll forward" takes for ever!

ron

Member
Our details are:

Progress 9.1D09 on Solaris 8.

Ten separate production databases ranging in size from about 60 Mbytes up to 260 Gbytes.

Database block size is 8K, AI and BI blocksizes are both 16K.

BI cluster size = 16384K.
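(In case it helps anyone reproduce the setup: block and cluster sizes like these are set with the standard offline utilities, roughly as in the sketch below. The DB path is invented.)

```python
#!/usr/bin/env python
# Sketch only: how block and cluster sizes like the above are normally set.
# The DB path is invented; both commands need the database shut down, and (as
# far as I recall) after-imaging must be disabled for the second one.
import subprocess

DB = "/prod/db/sales"   # hypothetical DB path

# 16K BI blocks and a 16384K (16 MB) BI cluster size:
subprocess.check_call(["proutil", DB, "-C", "truncate", "bi",
                       "-biblocksize", "16", "-bi", "16384"])

# 16K AI blocks:
subprocess.check_call(["rfutil", DB, "-C", "aimage", "truncate",
                       "-aiblocksize", "16"])
```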

For each of the ten production DBs we swap the AI to the next extent every ten minutes, package up the AI data and send it to the warm spare machine, and apply the notes to the corresponding warm spare DB.
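In outline, one cycle for a single DB looks something like the sketch below (just an illustration: the DB names and paths are invented, and the way the extent name is read out of rfutil's output is an assumption):

```python
#!/usr/bin/env python
# Sketch of one switch/ship/apply cycle for a single DB.  Paths and DB names
# are invented, and the handling of rfutil's output (step 2) is an assumption.
import os
import shutil
import subprocess

PROD_DB = "/prod/db/sales"      # hypothetical production DB
SPARE_DB = "/spare/db/sales"    # hypothetical warm-spare DB (on the other box)
SHIP_DIR = "/spare/incoming"    # hypothetical transfer area

def run(*args):
    """Run a Progress utility and return its output as text."""
    return subprocess.check_output(args).decode("utf-8", "replace")

# 1) Production: switch to the next AI extent; the current one becomes FULL.
run("rfutil", PROD_DB, "-C", "aimage", "new")

# 2) Find the oldest FULL extent.  'aimage extent full' reports its pathname
#    (output format assumed here -- verify on your own release).
full_extent = run("rfutil", PROD_DB, "-C", "aimage", "extent", "full").strip()

# 3) Ship it to the warm spare (in reality: compress and copy over the network).
shipped = os.path.join(SHIP_DIR, os.path.basename(full_extent))
shutil.copy(full_extent, shipped)

# 4) Warm spare: apply the notes to the spare DB.
run("rfutil", SPARE_DB, "-C", "roll", "forward", "-a", shipped)

# 5) Production: mark the shipped extent EMPTY so it can be reused.
run("rfutil", PROD_DB, "-C", "aimage", "extent", "empty", full_extent)
```

The ordering is the important part: the extent is only marked EMPTY after the warm spare has applied it.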

But we have a problem with the extraordinarily long time it takes to roll forward. In particular, the time taken to deal with an AI file that contains almost no notes at all varies from 2 seconds up to 800 seconds. The longest times seem to be for the databases with the greatest number of active clusters. (But it is hard to know how many clusters are in use on the warm spare databases.)

Recently we've had to swap less often -- every 30 minutes rather than every 10 minutes -- because it was taking "for ever" for rfutil to perform the roll forward, even when every one of the ten AI files contained negligible data. In fact, the logs show that it would take a total of about 25 minutes to "do nothing". Obviously, when the swaps were only ten minutes apart, the warm spare would very quickly fall behind. Now, at 30 minutes between swaps, we keep up to date.

Progress KB P65028 looks like exactly the problem we have -- and it suggests that our problem will be solved by upgrading to 9.1E. However, KB P118636 says that we will exchange one problem for another!

It appears that we may be able to reduce the magnitude of the problem (at least temporarily) by truncating the BI on the warm spare for each of the ten DBs -- but we would have to be very careful to do this only when there are nil in-flight transactions, otherwise we will kill the synchronisation between the production and warm spare databases.
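What we have in mind is something like the sketch below (only a sketch: the path names are invented, and the way the roll-forward output is parsed for in-flight transactions is a guess that would need checking against the real rfutil messages):

```python
#!/usr/bin/env python
# Sketch only: truncate the warm spare's BI when the last roll forward reported
# zero in-flight transactions.  The log parsing is an assumption -- rfutil's
# exact wording should be checked against the real 9.1D output first.
import re
import subprocess

SPARE_DB = "/spare/db/sales"                   # hypothetical path
APPLY_LOG = "/spare/logs/sales.rollfwd.out"    # hypothetical captured rfutil output

def in_flight_count(log_path):
    """Pull the 'transactions still active' count out of the roll-forward
    output.  The regular expression is a guess at the wording."""
    text = open(log_path).read()
    m = re.search(r"(\d+)\s+\S*\s*transactions.*(active|in.flight)", text, re.I)
    return int(m.group(1)) if m else None

count = in_flight_count(APPLY_LOG)
if count == 0:
    # Quiet point: nothing in flight, so truncating the BI here should not
    # break synchronisation with production.
    subprocess.check_call(["proutil", SPARE_DB, "-C", "truncate", "bi"])
else:
    print("Not truncating BI: open transactions = %s" % count)
```

The truncate itself is quick; the whole point is only ever running it at a moment when nothing is in flight.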

Is anyone familiar with this problem?

Do you have any ideas about how we can improve on the situation?

Any help / suggestions are most welcome!
 

TomBascom

Curmudgeon
I've never, personally, run into this but it sounds a lot like the issue with the redo phase taking forever that George brings up on PEG every couple of months.

SFAIK it has not yet been fixed in any release, in spite of a few hopeful-sounding kbase entries. I strongly advise getting in touch with tech support and a) getting on record with the problem and b) letting them know that you have a reproducible example. (If TS acts like they've never heard of the problem, try escalating it...)
 

ron

Member
Thanks, Tom.

(It's good to know someone else has the problem -- i.e. George.)

I have lodged a Support call with Progress. If anything interesting eventuates I'll post it here.

Ron.
 

curly

New Member
Hi Ron,
We have the same problem, with AI roll-forward taking up to 30 minutes -- quite a big problem considering that we create and ship our AI files every 5 minutes. :mad:

Comment on this topic from Gus:
“This has become a common problem. It didn't use to be one.

It is mostly caused by the way that the redo phase handles allocation and use of empty blocks above the highwater mark. That mechanism was designed a long time ago, way back in early version 7. At that time the design didn't cause any significant problems.

But the world has changed and the usage patterns have changed: databases are bigger and updated more, after-image journalling is used in ways it wasn't back then (switching extents frequently, rolling them into a hot spare, ai replication, etc.).

The Engine Crew will have to assess the issue and decide what to do about it.
-gus”

You can find the complete discussion at http://www.peg.com/lists/dba/history/200609/msg00185.html

Regarding your comment about truncating the BI on the warm spare: Progress does not recommend it, but it works for us -- just keep an eye on in-flight transactions.

Regards,
Marian
 

garfield_ruhr

New Member
Hello group,

I also had this problem on a higher version of Progress and found that it is important to have a small BI file on the database where the AI files will be applied. So I do the following to create a database that will take the AI files (sketched as a script just after the list):
1) Stop the source database.
2) Truncate the BI on the source.
3) Copy the source database to the target.
4) Enable AI.
5) Continue with the normal AI handling.
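Roughly, as a script (only a sketch: it assumes an offline backup/restore is used for the "copy" step, the paths and backup device are invented, and the target structure is assumed to exist already on the spare machine):

```python
#!/usr/bin/env python
# Sketch of the five steps above, assuming an offline backup/restore for the
# "copy" step.  Paths and the backup device are invented; in reality the
# restore would run on the spare box.
import subprocess

SOURCE_DB = "/prod/db/sales"      # hypothetical source (production) DB
TARGET_DB = "/spare/db/sales"     # hypothetical target (warm spare) DB
BACKUP = "/backup/sales.bck"      # hypothetical backup file / device

def run(*args):
    subprocess.check_call(args)

run("proshut", SOURCE_DB, "-by")                    # 1) stop the source database
run("proutil", SOURCE_DB, "-C", "truncate", "bi")   # 2) truncate the BI on the source
run("probkup", SOURCE_DB, BACKUP)                   # 3) copy: offline backup of the source...
run("prorest", TARGET_DB, BACKUP)                   #    ...restored as the target
run("rfutil", SOURCE_DB, "-C", "aimage", "begin")   # 4) enable AI on the source
# 5) restart the source broker and resume the normal AI switch/ship/apply cycle
```

The key point is that the copy is taken immediately after the BI truncate, so the target starts life with a minimal BI.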

At first I had big BI files and each AI file took 40 minutes to apply; most of that time went on the initial checking of the environment. After starting from a truncated BI file, each AI file takes 1 to 2 seconds to apply.

Be careful with truncating the BI file on the target database: it can only be done when no transaction is open after applying the last AI file. Only then can the target BI be truncated.

I hope this helps a bit.

Bye,

Garfield
 

ron

Member
Thanks very much, Garfield.

It helps a lot to know that others have experienced the same problem!

Re-initiating the warm spare database from the source database is extremely "painful" for us -- because the DB is 260GB and we still use DLT8000 drives. That means four tapes and about 20 hours -- plus someone needing to stick around all night changing tapes!

One or more of our DBs will enlarge the BI file quite a lot at least once a month, and we can't realistically keep copying over a new copy of the source DB. Therefore, we will improve our scripts so that, for each DB, once each day (during the night) we'll check for a time when the logs show nil in-flight transactions -- and then truncate the BI on the warm spare, roughly along the lines of the sketch below. We have also logged the problem with Progress, to see if they can come up with a solution.
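This is the shape of the nightly job we have in mind (a sketch only: one copy per DB, paths invented, and the check for in-flight transactions relies on the same assumed log wording as the sketch earlier in the thread):

```python
#!/usr/bin/env python
# Sketch of the nightly per-DB job: during the quiet hours keep re-checking the
# latest roll-forward output, and truncate the warm spare's BI the first time it
# reports no in-flight transactions.  Paths are invented; the wording the check
# looks for is a guess and must be verified against the real rfutil output.
import re
import subprocess
import time

SPARE_DB = "/spare/db/sales"                   # hypothetical path; one job per DB
APPLY_LOG = "/spare/logs/sales.rollfwd.out"    # hypothetical captured rfutil output
DEADLINE = time.time() + 6 * 3600              # give up after six hours

def quiet(log_path):
    """True if the last roll forward reported zero active transactions
    (the pattern is a guess at the wording)."""
    m = re.search(r"(\d+)\s+\S*\s*transactions.*(active|in.flight)",
                  open(log_path).read(), re.I)
    return m is not None and int(m.group(1)) == 0

while time.time() < DEADLINE:
    if quiet(APPLY_LOG):
        subprocess.check_call(["proutil", SPARE_DB, "-C", "truncate", "bi"])
        break
    time.sleep(300)   # check again in five minutes
```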

Regards,
Ron.
 

garfield_ruhr

New Member
Hi Ron,

The BI file of the target database must be small, so the source database does not matter. In my experience you can apply AI files up to the point where no additional transaction is open; then you can truncate the target BI file and the next apply will be much faster. But really be careful that there is no open transaction after applying the last AI file.

Bye,

Garfield
 