[Progress Communities] [Progress OpenEdge ABL] Forum Post: RE: How to survive after the errors like 819 or 10566?

Status
Not open for further replies.

George Potemkin

Guest
I will criticize my own idea of using a chain of corrupted blocks to keep a database online after critical errors. It will not work, for example, for error 1124: “Wrong dbkey in block”. In most cases nowadays this error is caused by a wrong mapping of a block in the system cache to a block on disk. By the way, Progress could automatically empty the system cache to try to fix the 1124.

After error 1124 we can’t mark the block as corrupted with a flag inside the block itself. Otherwise, sooner or later we would have to write the modified block to disk, and if the mapping error is not yet fixed, we would overwrite the wrong block on disk.

Instead, Progress can store the list of corrupted blocks inside some special blocks. To minimize the impact on performance, Progress can check against the list only when blocks are retrieved from disk. When we get a critical error, the block may be in the database buffer pool. In that case it should be deleted from the buffer pool. When another session tries to read a block that was previously reported in message 1124, the session can issue a new message: "You are trying to read a block that was marked as corrupted". At that moment the session will not hold a buffer lock, so there is no reason to crash the database.

Apart from the implementation effort, are there any negative effects of keeping a database online after critical errors?
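To make the proposal concrete, here is a minimal Python sketch of the idea. It is not how Progress works internally: `BufferPool`, `quarantine`, `read`, and `QuarantinedBlockError` are hypothetical names, the disk is modeled as a plain dictionary, and an in-memory set stands in for the "special blocks" that would hold the list.

```python
class QuarantinedBlockError(Exception):
    """Raised instead of crashing when a session asks for a quarantined block."""


class BufferPool:
    def __init__(self, disk):
        self.disk = disk          # hypothetical stand-in for disk: dbkey -> block
        self.buffers = {}         # blocks currently held in the buffer pool
        self.quarantined = set()  # stands in for the special blocks holding the list

    def quarantine(self, dbkey):
        # Called after a critical error such as 1124. The dbkey is recorded
        # outside the block itself, and any cached copy is dropped so the
        # suspect buffer is never written back to disk.
        self.quarantined.add(dbkey)
        self.buffers.pop(dbkey, None)

    def read(self, dbkey):
        # Buffer pool hits skip the list check to keep the common path cheap.
        if dbkey in self.buffers:
            return self.buffers[dbkey]
        # The list is consulted only when the block must come from disk,
        # before any buffer lock would be taken.
        if dbkey in self.quarantined:
            raise QuarantinedBlockError(
                f"You are trying to read a block that was marked as corrupted: {dbkey}"
            )
        block = self.disk[dbkey]
        self.buffers[dbkey] = block
        return block
```

In this sketch a session that touches a quarantined block gets an ordinary error it can handle, while reads of other blocks continue as before.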
