understanding bi

longhair

Member
afternoon all,
we have a tiny db (about 3mb) that is comprised of 3 tables.
daily we delete the records out of 2 of the tables and reload new data.
when this is done the bi file grows quite large (as in more than 2 times the size of the db):confused:
previously the tables were in another db and we did not see this bi file growth.
i reworked the programs that delete and add the records, limiting the transaction scope and adding pauses. this limits the bi file growth (though i still think there should be none), but it increases the time the script takes to run.
i also find it odd that if i run the programs manually they take just over 1 minute, but when run from a script they take 7 to 9 minutes. although, bi growth is smaller when run from a script.
any ideas or suggestions?
thanks.
regards,
longhair
 

TomBascom

Curmudgeon
1) I can hardly imagine worrying about something that uses about 10MB of disk space ;)

2) The bi file logs changes so that they can be undone - that includes undoing deletes by putting the record back and undoing inserts by removing the record... You're deleting every record in 2 tables (out of 3) and then adding them again. So I would *expect* to see bi space used that is roughly 2x to 3x the size of the tables in question (all of the data is handled at least twice, so it will have to be in the bi notes at least twice, and there is overhead to account for as well).
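(With a ~3MB db that works out to roughly 6MB to 9MB of bi notes, plus overhead.)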

3) You say that you rewrote this and that as part of that you added some PAUSE statements. That's probably got something to do with things taking longer under some conditions.

4) The performance of batch deletions, updates and data loads can usually be improved by expanding transaction scope to "chunk" the operations into groups. Usually a group size of 100 or so is sufficient. Something like this:

Code:
define variable i as integer no-undo.

/* commit a transaction every 100 records instead of one per record */
outer: do for customer transaction while true:
   inner: do while true:
     i = i + 1.
     find next customer exclusive-lock no-error.
     if not available customer then leave outer.   /* no more records - done */
     customer.discount = 0.
     if i modulo 100 = 0 then next outer.          /* close this chunk's transaction */
   end.
end.
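
The same pattern works for the delete pass. A minimal sketch, assuming one of your two tables is named "mytable" (just a placeholder):

Code:
define variable i as integer no-undo.

/* delete every record, committing a transaction every 100 deletes */
outer: do for mytable transaction while true:
   inner: do while true:
     find next mytable exclusive-lock no-error.
     if not available mytable then leave outer.
     delete mytable.
     i = i + 1.
     if i modulo 100 = 0 then next outer.
   end.
end.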

5) You could also put the tables that you are going to delete into a dedicated storage area and use "proutil dbname -C truncate area" to quickly delete them. You'll still have some bi growth when you load them but not so much as before.
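For example (a sketch - "reload_area" is a placeholder for whatever you name the storage area, and the db has to be offline for proutil):

Code:
proutil dbname -C truncate area reload_area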
 

longhair

Member
tom,
thanks for the info, helpful as always.
i'm already chunking based on your previous posts.
if i chunk at every 100 records with a 1 second pause at each chunk the bi grows to about double the db and takes about 8 minutes to run via script.
if i remove the pause and still chunk at every 100 records the bi grows to more than 3 times the db size and takes about 45 seconds to run via script.
if i chunk at every 5000 records with a 3 second pause at each chunk the bi grows to the exact same size as every 100 records with no pause.
if i remove the pause and still chunk at every 5000 records the bi grows to the exact same size as every 100 records with no pause.
is this a case where the bi buffers are not being written out to disk quickly enough, so progress is constantly allocating additional bi space?:confused:
regards,
longhair
 

TomBascom

Curmudgeon
A cluster won't be reused if it is still active or if it was closed too recently. If I recall correctly "recently" is 60 seconds. But I could be wrong about that -- in a more active system that is rarely an issue...

Have you tried using -i? With a tiny system like this and given what you're doing that would seem like a safe thing to do (just make a quick backup first).
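
For example, something like this on unix (a sketch - "reload.p" is a placeholder for your load program; with -i a crash mid-run leaves the db unrecoverable, which is why you want the backup first):

Code:
probkup dbname dbname.bak
bpro dbname -i -p reload.p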

Have you tried increasing the bi block and bi cluster sizes?
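
Those are set with proutil against the offline db - a sketch, with example sizes only (-biblocksize is the bi block size in KB, -bi is the bi cluster size in KB):

Code:
proutil dbname -C truncate bi -biblocksize 16 -bi 16384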

As Casper hinted... you might also want to run it multi-user with APWs and a BIW. That will move the writes to the page-writers and give control back to the user (or the batch process) sooner.
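
Roughly like this (a sketch - the broker parameters are placeholders):

Code:
proserve dbname -S 12345 -n 20
proapw dbname
probiw dbname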
 

longhair

Member
tom,
thanks for the suggestions.
i'm not the dba so changing bi block size, cluster size or changing to multi user is out of my control (although all have been suggested).
from your post it sounds like the issue is that the writes happen too quickly, so recently closed clusters can't be reused yet.
there are only 2 tables, with 3 fields in one and 4 in the other, so the work finishes fast and progress needs to keep creating new clusters. i'll try chunking at a higher record count and see what that does.
thanks, again.
regards,
longhair
 

TomBascom

Curmudgeon
Aside from an intellectual curiosity angle I cannot imagine how anyone could justify spending more than about 10 minutes on this project ;)
 

longhair

Member
tom,
agreed. like i said i will try chunking at a higher rate and just truncate the bi afterwards. thank you for your time and insight.
regards,
longhair
 