Question: BI growth

Hi All,

We are performing a bulk load and the BI file is growing in an uncontrolled way.
Is there any way to restrict the BI growth?

Thanks and Regards,
Surya
 

Cringer

ProgressTalk.com Moderator
Staff member
What method are you using for the load? You must be doing it online. The BI is a log of notes for all running transactions etc. so that in the event of a crash they can be undone systematically to ensure that DB integrity is maintained. You don't want to prevent this from happening in my opinion. On the other hand, if you load the data offline (probably more sensible anyway IMO) then the BI won't grow IIRC.
Alternatively, examine your load code to make sure it is only performing small transactions rather than one big transaction that stays outstanding for the life of the load. That way the BI should not grow out of control.
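To illustrate, here is a minimal ABL sketch of the small-transaction approach (the dump file name and the Customer table are hypothetical, and the file is assumed to match the table's field order):
Code:
/* Hypothetical load procedure: one record per transaction, so no
   transaction stays open for the life of the load. */
INPUT FROM "customer.d".   /* hypothetical dump file */
REPEAT TRANSACTION:        /* each iteration is its own transaction */
    CREATE Customer.
    IMPORT Customer.       /* ENDKEY at end of file ends the REPEAT */
END.
INPUT CLOSE.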
 

RealHeavyDude

Well-Known Member
The BI growth is determined by two factors:
  1. Transaction load
  2. Long-running transactions
In almost all cases I have seen, the second factor - long-running transactions - was the root cause of the growth. Long-running transactions which prevent BI clusters from being re-used are always caused by the software utilizing transaction scopes which are much larger than they need to be. Most of the time I've seen this it was not on purpose but by accident.

The only thing you can do is to fix the transaction scope so that the transactions are as small as possible, allowing BI clusters to be re-used much faster.

You should inspect the software you are using to do the load and compile it using the listing option to see the transaction scopes.
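For example (the procedure name is hypothetical), a compile listing can be generated like this; the listing shows where each transaction block begins and ends, which makes an unintentionally large transaction scope easy to spot:
Code:
/* Produce a compile listing for a hypothetical load procedure. */
COMPILE load_customers.p LISTING "load_customers.lis".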

There is nothing you can do on the database to prevent such transactions. All you can do to prevent the database from shutting down abnormally is to use the -bistall parameter (possibly in conjunction with the -bithold parameter).

Heavy Regards, RealHeavyDude.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
If it is possible to take a maintenance window where other users cannot access the database then you should do so. Then you can restart the database with startup parameters and BI configuration tuned for a bulk load (and for preventing access by business users). If the load is very large and time-consuming, I use the maximum BI block size and cluster size during the load to improve performance (although that does nothing to change the volume of BI notes).

If it is a very large and time-consuming load I may also start the database with no-integrity (-i) which also improves performance and does decrease the volume of BI notes. Warning: do not use -i unless you clearly understand its implications and the very limited cases where it is acceptable for use. This parameter prevents crash recovery, so it is only acceptable for use when there is no normal forward processing (business activity) happening in your database. You must have a good backup of the database taken just before the database was started. If you crash, your only means of recovery is restoring that backup. Remember to shut down the DB again and back up again immediately after the load finishes. Never use this parameter in production.

As RHD said, your problem may well be a combination of two factors: an old active transaction (not necessarily related to your bulk load session) that has pinned the earliest BI cluster and prevented its re-use, combined with the high volume of BI notes caused by the bulk load that causes additional clusters to be added to the BI file. But without more information from you we can only speculate. I suggest you review your bulk-loading code to see if its transaction scope is too large. It is possible that the developer tried to use a "transaction batching" technique to purposely increase the transaction scope as a means of improving throughput. This can work but the trade-off is BI file growth. You can review manually and also run COMPILE LISTING to see your transaction scope in each procedure.
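For reference, here is a hedged sketch of what such a "transaction batching" pattern often looks like in ABL (the file name, batch size, and Customer table are hypothetical); the larger the batch, the longer each transaction stays open and the more BI space it can pin:
Code:
/* Hypothetical batched load: up to 100 records per transaction.
   Larger batches may improve throughput, but an open batch keeps
   its BI cluster(s) from being reused until it commits. */
DEFINE VARIABLE iCount AS INTEGER NO-UNDO.
DEFINE VARIABLE lDone  AS LOGICAL NO-UNDO.

INPUT FROM "customer.d".        /* hypothetical dump file */
DO WHILE NOT lDone:
    DO TRANSACTION:             /* one transaction per batch */
        DO iCount = 1 TO 100 ON ENDKEY UNDO, LEAVE:
            CREATE Customer.
            lDone = TRUE.       /* assume end of file ...              */
            IMPORT Customer.    /* ... ENDKEY here leaves lDone = TRUE */
            lDone = FALSE.      /* a record was read, so not done yet  */
        END.
    END.                        /* the batch commits here */
END.
INPUT CLOSE.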

I also suggest you run your bulk-loading code on a test database where you are the only user. Monitor the database in promon or ProTop. How many transactions (commit in promon) do you see? How often does your transaction ID change? Are the numbers what you expect? Do you see BI growth?
 
Hi Cringer/RealHeavyDude/Fitz,

Thanks a lot for your information.

We are indeed performing huge load transactions. We ran a series of tests and the conclusion is that the BI clusters were not being reused because the very first cluster was still active. It was interesting to see that after that first transaction was killed, the BI stopped growing because the clusters were being reused.

BTW, is there any version in which any of the clusters can be reused, rather than waiting for the first one to be free?

Thanks and Regards,

Surya
 

Cringer

ProgressTalk.com Moderator
Staff member
I suspect your understanding of how the BI works is a little off. The basic answer to your question is that the workings of the BI have not changed in years and are unlikely to, because it works very well. Your issue isn't the BI, it's the code you're using.
For large data loads, binary dump and load is far more efficient.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
Any active transaction for which before-image notes have been written will prevent re-use of the BI cluster(s) that contain those notes. The clusters cannot be reused because that transaction might be rolled back, either because of the client's action or because the database or system crashed and the subsequent crash recovery at startup rolled it back. Reusing the BI cluster would prevent transaction processing from working properly.

So when doing large data loads, you don't want any transaction activity from clients other than the one(s) doing the load, regardless of how you're loading (ABL, SQL, dictionary, or proutil load). The presence of a long-lived transaction during a load that creates many BI notes will prevent BI cluster re-use and force new BI clusters to be added, i.e. it will result in BI growth.
 
I suspect your understanding of how the BI works is a little off. The basic answer to your question is that the workings of the BI have not changed in years and are unlikely to, because it works very well. Your issue isn't the BI, it's the code you're using.
For large data loads, binary dump and load is far more efficient.
Hi Cringer,

You are correct. I have just started learning more about Progress. I am an Oracle DBA and am taking up a Progress DBA role as well. This is the first time I have ever gone into detail about how BI clusters are assigned and reused.

BTW, we are using 10.2A3 on RHEL 5.8. Recently we performed a delete/archive activity. This time we tried to perform the Transaction History loading after the delete/archive; the application team changed the approach to save time.

Just a question: suppose I have an active transaction in BI cluster 1. Then I perform a mass load in one transaction and the next 499 clusters get created, bringing the total to 500. Now, if the transaction holding cluster 1 is still active and I start the next load, will a new BI cluster 501 be created, or will cluster 2, cluster 3, and so on be reused?

Thanks and Regards,
Surya
 

Cringer

ProgressTalk.com Moderator
Staff member
I think it would probably be worth your while going on a DBA course to be honest. I did the DBA Bootcamp last week and it was invaluable.
To those who know more about this than me, please forgive my probable use of incorrect terminology!
Think of the BI as a ring of clusters. As one gets filled, the next one comes into action. The currently used one is marked as busy, and the previous one is marked as full. It will remain full until all active transactions in that cluster are complete, at which point it will be marked as empty again. If Progress has filled up all the current clusters, though, and all of them still have active transactional information in them, then Progress will create a new cluster and slip it into the list. This will continue until one of the full clusters can be emptied because its transactions are complete. Each time a new cluster is added to the ring, the size of the BI file is increased. This has the impact you are seeing, and it also has the impact of slowing things down as that space has to be allocated to the BI.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
Just a question: suppose I have an active transaction in BI cluster 1. Then I perform a mass load in one transaction and the next 499 clusters get created, bringing the total to 500. Now, if the transaction holding cluster 1 is still active and I start the next load, will a new BI cluster 501 be created, or will cluster 2, cluster 3, and so on be reused?

Think of the BI clusters as logically being in a ring (linked list):
Code:
--> cluster 0 --> cluster 1 --> cluster 2 --> cluster 3 --> cluster 4 --> ... --> cluster n --
|                                                                                            |
----------------------------------------------------------------------------------------------

Notes are only added, in chronological order, to the current cluster. When it fills, the current cluster is closed and new writes will go to the "next" cluster in the ring. Which one is next? Imagine that cluster 2 is the currently-open BI cluster. BI notes are written to it until it fills. Then a decision is required: can we write new notes to cluster 3, the next one in the ring? If cluster 3 contains no notes that are related to active transactions or to changes that have not yet been committed to disk in the database then the notes in cluster 3 are no longer needed and its space can be re-used. New BI notes are written to cluster 3. But if cluster 3 cannot be re-used, for the reasons given above, then a new cluster must be formatted and inserted into the ring (logically, not physically) between 2 and 3 and new BI notes are written into it. When it fills, the process above repeats itself. I have simplified somewhat but that is the basic process. Gus does a very good job of explaining the process (and a bunch of other interesting DB internals stuff) in this video.
 

Cringer

ProgressTalk.com Moderator
Staff member
Think of the BI clusters as logically being in a ring (linked list):
Code:
--> cluster 0 --> cluster 1 --> cluster 2 --> cluster 3 --> cluster 4 --> ... --> cluster n --
|                                                                                            |
----------------------------------------------------------------------------------------------

Notes are only added, in chronological order, to the current cluster. When it fills, the current cluster is closed and new writes will go to the "next" cluster in the ring. Which one is next? Imagine that cluster 2 is the currently-open BI cluster. BI notes are written to it until it fills. Then a decision is required: can we write new notes to cluster 3, the next one in the ring? If cluster 3 contains no notes that are related to active transactions or to changes that have not yet been committed to disk in the database then the notes in cluster 3 are no longer needed and its space can be re-used. New BI notes are written to cluster 3. But if cluster 3 cannot be re-used, for the reasons given above, then a new cluster must be formatted and inserted into the ring (logically, not physically) between 2 and 3 and new BI notes are written into it. When it fills, the process above repeats itself. I have simplified somewhat but that is the basic process. Gus does a very good job of explaining the process (and a bunch of other interesting DB internals stuff) in this video.
You see, this is why I shouldn't post much on here because that's exactly what I was trying to say, but in a much better way. lol
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
I think we basically said the same thing. :) I recommend the video; Gus has a knack for providing simple explanations for complicated things.
 
I think it would probably be worth your while going on a DBA course to be honest. I did the DBA Bootcamp last week and it was invaluable.
To those who know more about this than me, please forgive my probable use of incorrect terminology!
Think of the BI as a ring of clusters. As one gets filled, the next one comes into action. The currently used one is marked as busy, and the previous one is marked as full. It will remain full until all active transactions in that cluster are complete, at which point it will be marked as empty again. If Progress has filled up all the current clusters, though, and all of them still have active transactional information in them, then Progress will create a new cluster and slip it into the list. This will continue until one of the full clusters can be emptied because its transactions are complete. Each time a new cluster is added to the ring, the size of the BI file is increased. This has the impact you are seeing, and it also has the impact of slowing things down as that space has to be allocated to the BI.
Hi Cringer,

I am already in the process of arranging Progress DBA training, but there is no facility in Asia....

Thanks a lot for your explanation...

Thanks and Regards,
Surya
 

TheMadDBA

Active Member
Not to threadjack this too much but... I can understand your frustration/confusion coming from an Oracle background. When I started using Oracle after using Progress for 20+ years I was amazed at how many things just worked differently (and imo better) in Oracle than in Progress.

On topic: The documentation is actually pretty decent (in modern releases of Progress) at explaining a lot of the core DBA concepts. Dan Foreman at BravePoint has a book (Database Administration Guide) that would probably be very helpful for you. It covers a broad range of topics.
 
Not to threadjack this too much but... I can understand your frustration/confusion coming from an Oracle background. When I started using Oracle after using Progress for 20+ years I was amazed at how many things just worked differently (and imo better) in Oracle than in Progress.

On topic: The documentation is actually pretty decent (in modern releases of Progress) at explaining a lot of the core DBA concepts. Dan Foreman at BravePoint has a book (Database Administration Guide) that would probably be very helpful for you. It covers a broad range of topics.
Hi TheMadDBA,

Thanks a lot. I will see if I can acquire the book. I am also watching the video now and have gone through Gus's explanation. I am doing better.

Surya
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
I am already in the process of arranging Progress DBA training, but there is no facility in Asia....
There is also the option of Progress' web-based training. You get access to that for free if you have a PSDN subscription.
https://wbt.progress.com

And I second the recommendation for Dan Foreman's books. I have the DBA Guide, Performance Tuning Guide, DBA Jumpstart, and VST book. Great stuff. Available here:
http://bravepoint.com/products-publications.shtml
 