Question Any disadvantage if all RPB = 256?

ron

Member
RH, OE 10.2B and 11.3.3

I have done a few database migrations from Type 1 storage areas to Type 2, using mean record size to separate tables into different areas accordingly (i.e., "by the book").

I now have a different situation - numerous DBs using the same application, but very different usage and very different DB sizes (from about 4GB to about 400GB).

What do I "lose" if I make every storage area have 256 RPB? Of course I know that it lowers the maximum size for each storage area. But is there any other factor that needs to be considered?
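(For context, I set RPB and cluster size per area in the structure file. The area name, number, and paths below are made up, but the lines look roughly like this:)

  # Type II data area: 256 records per block, 64 blocks per cluster
  d "Order_Data":20,256;64 /db/prod/prod_20.d1 f 1024000
  d "Order_Data":20,256;64 /db/prod/prod_20.d2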

Ron.
 

TomBascom

Curmudgeon
Lowering the maximum size is unlikely to be a real disadvantage with OE 10.2+ and 64-bit rowids.

One possible issue is that you waste rowids -- and thus any sort of "scan by rowid" will have a lot of false positives. Such scans are not a normal part of any application that I have ever seen. Personally I use them in two cases:

1) to randomly sample a large storage area for statistical purposes (notably to investigate "logical scatter" and average row size distributions) -- there is a rough sketch of what I mean below

2) to dump around corruption
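
If it helps, here is a very rough sketch of the kind of recid sampling I'm talking about. The table name, sample size, and recid upper bound are all placeholders -- in real life you derive the upper bound from the area's high water mark:

  /* rough sketch only -- "customer", the sample size, and the recid
     upper bound are placeholders; derive the upper bound from the
     area's high water mark in real life */
  define variable i       as integer no-undo.
  define variable samples as integer no-undo initial 1000.
  define variable hits    as integer no-undo.
  define variable r       as int64   no-undo.

  repeat i = 1 to samples:
    r = random( 1, 100000000 ).   /* pick a candidate recid at random */
    find customer where recid( customer ) = r no-lock no-error.
    if available customer then hits = hits + 1.   /* a rowid that really exists */
  end.

  display samples hits.

In an area where the RPB is much higher than the real number of records per block, most of the candidate recids don't map to a record, so most of those finds come back empty -- those are the false positives I mean.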

Another potential issue has to do with the default values of the CREATE and TOSS limits, although that is more tied to the actual row size and the unlikelihood of growth than to the RPB per se. Genuinely small records that never grow (which is what justifies a high RPB in the first place) would probably benefit from smaller-than-default create and toss limits. Long RM chains are an indication that you might be experiencing this sort of issue.
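
If you do find yourself there, the limits can be changed per table with proutil. Something along these lines -- the db name, table name, and values are placeholders, and you should check the exact qualifiers and defaults for your release:

  proutil proddb -C settablecreatelimit ordersummary 75
  proutil proddb -C settabletosslimit ordersummary 150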
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
You could potentially have issues with record fragmentation, but that depends on the behaviour of your application.

If it creates many small records in a given table (thus filling RM blocks) and later updates them such that they grow in size, that would cause record fragmentation. Fragmentation reduces your read efficiency: the storage engine has to read two or more blocks into the buffer pool, rather than one, to construct a single record, so both logical and physical reads go up for the same workload.

I have heard of the "self-healing" nature of more modern releases reducing fragmentation over time but I have no hard data on how that works, or how well. Maybe someone else can provide insight.
 

TomBascom

Curmudgeon
Either the self-healing works very well or the frequency of the problem is very low.

I keep looking for noticeable fragmentation without finding it.
 

cj_brandt

Active Member
We had problems with fragmentation on Type II storage areas when we added UUID fields to many heavily read tables that didn't have a unique identifier. Our reads noticeably increased.

We don't go above 128 RPB and we were using 1 RPB for index areas, but I think 8 RPB was recommended for storage areas that hold indexes. Someone with a better memory can comment.
 

TomBascom

Curmudgeon
Disk reads increased? Or logical reads? Adding a UUID would make small records a whole lot larger, so I would expect fewer of them to fit into a block, which would certainly lead to more disk reads for the same number of records.

Did fragmentation show up in dbanalys?
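
If you still have the output handy, something like this (db name is a placeholder) will show it:

  proutil proddb -C tabanalys > tabanalys.out

and then, if I remember the layout right, the Fragments Count and Factor columns in the RECORD BLOCK SUMMARY for that table are the interesting bit.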
 

cj_brandt

Active Member
Reads from disk increased, as did logical reads. The reads from disk were what was impacting our performance, as the disk I/O was not stellar.
Table analysis showed that the number of fragments increased while the number of records remained the same.

The UUID fields were implemented as a way to identify changed records in the OE database to transfer to a DB2 system, so they were read each evening when the extract ran.
 

TomBascom

Curmudgeon
I'm not questioning the need for UUID fields -- that's application stuff. You gotta do what you gotta do :)

Were the original records on the small side?

You say that the field was added -- so there were existing records without it? And they needed to expand?

If that was the case, and there was limited space left in existing blocks, that would be another reason for fragmentation. Growing records in place will tend to result in that happening. If you suddenly need to grow a lot of records all at once that were pretty efficiently packed to start with, that would be a sort of "worst case" scenario.

Do you recall what the fragmentation factor was?

Has there been a D&L since, and did that have an impact?
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
We don't go above 128 RPB and we were using 1 RPB for index areas, but I think 8 RPB was recommended for storage areas that hold indexes. Someone with a better memory can comment.
In old Progress releases with 32-bit rowids (pre-10.1B), people often set RPB to 1 for index areas to conserve rowids. Other than that, RPB has no effect for index areas, since index blocks don't hold records. And that tactic is unnecessary with 64-bit rowids.

The current advice from Progress is to set index areas to 64 blocks per cluster to minimize the cost of a worst-case block split that cascades all the way up the b-tree. But they don't have a recommendation for RPB for index areas as far as I know.

My advice is to set RPB to 128 for index areas, for the sole reason that someone may screw up when making a schema change and put a table in that area. If you put a fast-growing table in an RPB 1 area, the variable extent will grow very very quickly and it will torpedo your caching efficiency.
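
In structure-file terms that works out to something like this (area name, number, and paths are invented for illustration):

  # index area: 128 RPB as insurance, 64 blocks per cluster
  d "Index_Area":30,128;64 /db/prod/prod_30.d1 f 512000
  d "Index_Area":30,128;64 /db/prod/prod_30.d2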

It would be very nice if Progress would let us specify allowable object types per storage area so that could become a concern of the past.
 

cj_brandt

Active Member
The UUID fields were added to existing tables.
Yes, the UUID field caused the records to fragment.
I don't recall the fragmentation factor - I know that in a table with 100 million records, I had 15 million or so additional fragments.
Yes, a D&L resolved the issue.

Thanks Rob - so it was the cluster size they recommended. I was using a cluster of 8.

Surprisingly, nobody has deployed a table to an index area -- however, the schema area is still a favorite target. So thankful for the online tablemove.
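
For anyone who hasn't used it, it's roughly this (db, table, and area names made up); the optional last argument moves the table's indexes at the same time:

  proutil proddb -C tablemove misplacedtable "Order_Data" "Index_Area"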
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
Sorry, off-topic:

I think BPC of 8 for an index area is probably fine. If I recall, the justification for 64 was:
  • On average, half of an object's last cluster is empty. For an area with 8 BPC, that's 4 free blocks.
  • In the worst case, a block split at the leaf level could cause a block split in the block above, which could cause one in the block above that, etc., all the way up to the root block.
  • The limit on index levels is 16.
  • So if this worst case happens and you don't have enough free blocks to absorb these already-expensive block splits -- which your code is waiting on synchronously to complete -- you now also have to wait for the allocation of a whole new cluster to get enough blocks to finish the operation.
  • Therefore you want to use 64 BPC for indexes because on average, an object's last cluster will contain 32 free blocks.
In practice that's not going to happen. Recently RichB polled some people online about index levels on large tables and if I recall correctly the most anyone found, even for multi-terabyte tables, was 6.

That means the current architectural limit of 16 index levels isn't close to being an issue for anyone anytime soon. And in practice, 8 BPC is probably fine for almost everyone, for all or almost all index areas.

Another thing to consider about 64 BPC is that it gives every index a minimum cost on disk of 512 KB (assuming 8 KB blocks), regardless of how small it is. With a large schema that starts to add up: a schema with 2,000 indexes would burn roughly 1 GB just on minimally filled clusters. Storage may be relatively cheap these days, but I/O bandwidth is not.
 

Cringer

ProgressTalk.com Moderator
Staff member
In practice that's not going to happen. Recently RichB polled some people online about index levels on large tables and if I recall correctly the most anyone found, even for multi-terabyte tables, was 6.

And this was in a huge table of a huge database that George Potemkin pulled out of his vast repository of information. So yeah, we're not going to hit that limit any time soon :D
 