What is the quickiest & safest way to blow up bad/cuorrupted records fron a DB?

rrojo7229

Member
Hi,


We have a database with bad/corrupted records like this:

Dep Code Prod
──────── ───────
}èM37│
ØÉ"l╠
*ièXdÓ┬
100008
100206
100222
100347

I am wondering which is the quickiest and safest way to remove these records? Shall we go for ASCII Dump Load and remove the bad lines from files .d by vi editor? I am a bit afraid by that way.

Did someone have this experience before?

Kind regards,
Rrojo7229
 

TomBascom

Curmudgeon
A lot depends on how they became "corrupt". If it was an application error or administrative mistake (like starting a session with the wrong code page) that has simply put inappropriate values into fields then no amount of database tool scanning, repairing, rebuilding, dumping & loading and so forth is going to "blow them up". You would have to write a custom program to seek them out and fix them.

If it is something more along the lines of a disk or memory error that corrupted the data I would expect to see Progress throwing errors. Depending on what Progress errors are thrown you may have several options open. A dprpr scan is usually a good start on figuring out how bad that sort of problem is.

It would be helpful to share the Progress version, OS and any Progress errors that are being reported.
 

rrojo7229

Member
Hi Tom,


Thanks for your quickly reply.

It is not Code Page wrong issue, this problem has started after some Energy Failure at the branch where is this Database.

The lg file has been shown this ERROR messages:

SYSTEM ERROR: bfposto: position timestamp - 233013 - doesn't match the list timestamp - 270270464. (493)
SYSTEM ERROR: Memory violation. (49)
SYSTEM ERROR: lkrels record 19322151 not locked. (435)

I will go for dbpr utility after a good backup and I will post it here the results.


Progress 9.1E
Red Hat Linux 3


Regards,
Ricardo Olguin
 

rrojo7229

Member
Hi Tom,

It is posted here the results for dbrpr and idxbuild but neither did not show any problem.

proutil package -C idxbuild ALL -TB 24 -TM 32 -B 1000 -SG 64

1940 indexes rebuilt
Index rebuild complete. 0 error(s) encountered. (1644)

proutil package -C dbrpr < dbrpr.inp > dbrpr.log

Scan @ dbkey 3932160
Scan @ dbkey 3964928
Scan @ dbkey 3997696
Scan @ dbkey 4030464
Scan @ dbkey 4063232
Total Records: 10308626
Continues: 19273
# Dumped: 0

I am wondering to make a ASCII Dump Load just to be sure any dodge record will be loaded like those in the beggining of this post.


What is the best way to blow up these dogge records?

And it is not Code Page problem because nobody has been change this thing - this DB is in Live so every time when it is Up or Down the DB it is made by scripts.

Regards,
 

TomBascom

Curmudgeon
I'm guessing that by "blow up" you mean "identify and remove"?

The fact that idxbuild & dbrpr don't find anything to object to is a case of good news and bad news.... The good news is that nothing is wrong at that level. The bad news is that finding the bogus data is going to be hard. The other bad news is that dumping and loading isn't going to change anything about that data.

Since there doesn't seem to be any actual corruption (from a dbrpr POV) it seems more likely that somehow bad data was actually entered.

Tangent:
In my misspent youth (before Progress) I once worked with a system (I didn't write it! Honest.) that didn't filter input very well. At one point I had to cleanup the data because a new user got the idea that she could change data in fields by using the arrow keys rather than tabbing & back tabbing to get to the desired fields. This resulted in some spectacularly messed up data... Not all that dissimilar from what you're seeing.

I mentioned code pages as one particular example of a class of problems. It is not the only way that inappropriate data can get into a field. However... the characters that you showed sure look like someone has the wrong codepage set. Code pages are not just server parameters -- they can also be set for the session and it is possible that your terminal emulator is using something incompatible as well. It may just be a single user who has been messing around with settings that they don't really understand.

Or it could just be someone using copy & paste in a peculiar way.

In the original post you show the "Dep Code" field as having bad data. It looks like a character field (even though it seems to only hold a numeric account number, for now I won't lecture on that topic...) If that is correct then you could, perhaps, identify the data with something like this:

Code:
for each department no-lock where dep-code > "~~":
  display dep-code.
end.

(I'm assuming that the table name is "department" and the field is "dep-code".)

The assumption here is that whatever caused the bad data to be entered only acted on these "department" records. I don't know if that is the case or not.

(Also, the error messages that you shared would not seem to have anything to do with this problem.)
 
Top