Data Delete/archive Tool

Rob Fitzpatrick

ProgressTalk.com Sponsor
Does anybody know of any good Database delete/archive tool. This is for QAD environment.
If I were you I'd ask QAD. It's a good idea to consult your vendor on a question like this.

You may want to articulate what "good" means to you. Some questions to help you think about this problem space:
  • Do you want the tool to dump the current schema, in addition to archiving the data? Ask yourself how you'll later load the archived data back into a DB if the schema changes in the meantime. Will you need data-fix utilities to populate new fields?
  • What format should the archived data be in?
  • Should the program purge any arbitrary tables the user chooses, or just certain specific ones?
  • Do you need the program to understand the relational model of the database, so it can descend into child tables and archive/delete records there too, preserving relational integrity and preventing orphaned records? Do you know the relational model?
  • Should the tool also be able to load archived data back into a database, or will that be a manual task?
  • What level of detail should the tool's logging provide? Do you just want totals per table of records evaluated/archived/deleted or do you also want to see details of the unique keys of each record?
  • Do you want the archiving step to be optional?
  • Do you want or need to do the entire range of data deletion in a single uninterrupted run or do you want to do it in batches, say for an hour (or whatever time is left over) at the end of your batch window each night until the desired outcome is achieved?
  • Do you have performance objectives for the tool? E.g. it must delete/archive X records/minute; it must add no more than Y% to the run time of jobs that run during the execution of the tool, etc.
  • For a given table, do you have an appropriate index to bracket on the data to be archived or deleted, based on the selection criteria, or will it require a table scan? If the latter, will your performance objectives be achievable or will you need to add an index to certain tables to facilitate efficient archiving/deletion? Or will you need to schedule a downtime window to do the work?
  • What security model should the tool incorporate? How should authentication and authorization work? Should anyone be able to run it? Only a user with a given application role or AD group? A purge tool in the wrong hands (either incompetent or malicious) is bad news.
  • Should the tool employ sanity checks on input parameters to prevent, say, purges of very recent data?
One other point: think about why you want to purge. What are the expected results or effects? Some people believe "if I purge old data then my database will be smaller" or "if I purge old data then my application will perform better". While either of these outcomes can be achieved, the act of deleting records, in and of itself, guarantees neither.

If you want a smaller database then in the case of affected index areas (I assume you're on OE 10+ with a decent structure) then you will have to compact or rebuild indexes. For affected table areas you will have to relocate the tables to new areas via tablemove (likely not viable for large tables) or dump & load. After relocating the table(s) you can reclaim space by removing the old area, assuming you can carefully prove to yourself that it is now logically empty.

If you want better application performance after purging a table then you should compact or rebuild indexes, particularly unique indexes as they may still be logically quite large due to leftover delete placeholders in the index blocks. You may also want to dump and load the data as well. After deletion, the remaining record blocks may be quite sparsely populated, meaning that the caching efficiency when reading the remaining data might be not much better than prior to the deletion. The dump and load will pack the data more densely on disk and will, albeit temporarily, eliminate logical scatter (in the order of the dump index, at least). Both effects, in theory, may improve application performance. YMMV.
 
Top