Question: What Data Does _IndexStat Provide?

BigSlick

Member
Hi All,

We're looking at doing some spring cleaning on our database and want to look at removing some indexes. We plan to monitor _IndexStat over the next month or so to see which indexes are being used and which aren't.

I've run a few tests and wanted to check what the fields actually mean. This is the output:

Code:
"Index Num" "Table Name"     "Index Name"         "Reads"     "Splits"     "Creates"     "Deletes"     "Total"
619         "Customer"         "idxCustomerCode"     0             91             90865         4             90960
620         "Customer"         "idxLastPurchase"     38840         712         91484         4             131040
621         "Customer"         "idxPrevPurchase"     136601         490         91261         4             228356
622         "Customer"         "idxItemcode"         58884138     42             90817         4             58975001
623         "Customer"         "idxCost"             0             248         98129         7113         105490
624         "Customer"         "idxNumber"           0             119         90891         4             91014
625         "Customer"         "idxDate"             0             14             90789         4             90807
626         "Customer"         "idxType"             0             589         91362         4             91955
618         "Customer"         "idxOrder"             8772742     511         91278         4             8864535
627         "Customer"         "idxName"             0             525         91301         5             91831
628         "Customer"         "idxSeq"             0             252         1026         4             91282
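
For reference, the figures above came from a query along these lines; this is a rough sketch rather than the exact code, and stats only appear for indexes covered by the -baseindex and -indexrangesize startup parameters:

Code:
/* Rough sketch: report _IndexStat with readable table/index names. */
OUTPUT TO "indexstat.txt".
FOR EACH _IndexStat NO-LOCK,
    FIRST _Index NO-LOCK
          WHERE _Index._Idx-num = _IndexStat._IndexStat-id,
    FIRST _File NO-LOCK
          WHERE RECID(_File) = _Index._File-recid:
    DISPLAY _IndexStat._IndexStat-id     LABEL "Index Num"
            _File._File-Name             LABEL "Table Name"  FORMAT "x(12)"
            _Index._Index-Name           LABEL "Index Name"  FORMAT "x(18)"
            _IndexStat._IndexStat-read   LABEL "Reads"
            _IndexStat._IndexStat-split  LABEL "Splits"
            _IndexStat._IndexStat-create LABEL "Creates"
            _IndexStat._IndexStat-delete LABEL "Deletes"
            (_IndexStat._IndexStat-read   + _IndexStat._IndexStat-split +
             _IndexStat._IndexStat-create + _IndexStat._IndexStat-delete)
                                         LABEL "Total"
        WITH WIDTH 160.
END.
OUTPUT CLOSE.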

Progress does provide the following:
OpenEdge 11.6 Documentation

But it doesn't quite explain it!

Essentially, a read is a read, and a create must occur for each new record, so a high create count can be discarded as a reason not to delete an index. But what about deletes and splits?

I wouldn't have been too bothered about deletes had idxCost not come in with over 7,000 hits.

Any details on this would be greatly appreciated.

thanks
 

TomBascom

Curmudgeon
What is unclear about "delete"?

A split is what happens when room needs to be made in a block for a new entry. Most new entries occur in "leaf nodes" but every now and then a block higher up will need to be split. That's why splits are much smaller than creates.

If I were considering removing indexes I would look at three things:

1) Redundant indexes -- there are various tools to scan your schema and find these, or you can do it by hand. Basically you are looking for indexes that share the same components in pointless ways.

2) Index selection as shown by "compile xref" (see the sketch after this list) -- if the compiler is never choosing an index, and the index is not enforcing something like a uniqueness constraint, you may not need it. Keep in mind that the compiler has no knowledge of dynamic queries -- so if your application uses those you need to lean heavily on rule #3.

3) Run-time usage -- if an index is never used at runtime (reads) then it probably isn't serving a purpose (except perhaps the aforementioned uniqueness constraint).
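
For #2, a quick sketch -- the program and file names are made up:

Code:
/* Compile with XREF, then look at the SEARCH lines to see which
   index the compiler chose for each static query.              */
COMPILE custreport.p XREF "custreport.xref".

/* custreport.xref will then contain lines along these lines:
     custreport.p custreport.p 12 SEARCH mydb.Customer idxCustomerCode
   SEARCH entries ending in WHOLE-INDEX mean no index bracket
   could be built -- i.e. a table scan.                         */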
 

BigSlick

Member
Hi Tom,

What's unclear about 'Delete' is the numbers involved: why has idxCost racked up over 7,000 hits while the rest have only 4 or 5? Does this mean the idxCost index was used for the delete, and that removing it would break the delete process?

Thanks for the info on splits :)

I have been through the xref route; with 4 indexes being reported, this was a way of verifying that information. We also run SQL queries and were unsure which indexes they use, hence _IndexStat; plus, as you've mentioned, the dynamic queries!

Thanks for the info so far. Much appreciated :)
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
We're looking at doing some spring cleaning on our database and want to look at removing some indexes.
I wouldn't consider index removal a part of "spring cleaning". Why do you feel you need to delete indexes? Do you measure record creates as being too costly? Are your users reporting performance problems?

Proving that an index is unnecessary is non-trivial, even if you have source code. Do you know that a "month or so" is enough time to ensure that any code that might need to run does run? Are there end-of-quarter or end-of-year programs you need to consider? What about canned SQL reports or ad hoc queries in BI tools? Maybe they're run on a schedule, or maybe once in a blue moon. But do you want them to do table scans because the indexes they would otherwise have used were deleted in the name of efficiency?
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
What's unclear about 'Delete' is the numbers involved: why has idxCost racked up over 7,000 hits while the rest have only 4 or 5? Does this mean the idxCost index was used for the delete, and that removing it would break the delete process?
An index key delete could correspond to a record delete, but it doesn't have to. It could also correspond to a record update where one or more of the key field values is changed. Such an update (ABL or SQL) causes an index delete and an index create. That is why there is no _IndexStat-update field; keys don't get updated when a record is updated. So not all indexes on a table will have an equal number of deletes.
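
For example, a hypothetical sketch -- the field names are guesses based on your index names, and it assumes Cost is a component of idxCost and of no other index:

Code:
FIND FIRST Customer EXCLUSIVE-LOCK
     WHERE Customer.CustomerCode = "C000001".
Customer.Cost = Customer.Cost + 1.  /* an update, not a record delete */
/* The old idxCost key is deleted and a new one is created, so both
   _IndexStat-delete and _IndexStat-create increment for idxCost;
   indexes whose key fields did not change see neither.              */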
 

TomBascom

Curmudgeon
Along with what Rob said, a lot also depends on the structure of those indexes.

A .df or other description of the indexes would help. It would also be interesting to see the table and index analysis portions of dbanalysis that relate to those indexes.
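
If you don't have a current .df handy, the Data Dictionary will dump one, or from ABL something like this (the output file name is arbitrary):

Code:
/* Dump table, field, and index definitions to a .df.
   "ALL" dumps every table; a table-name list works too. */
RUN prodict/dump_df.p ("Customer", "customer.df", "").

/* The .df then contains index entries of this shape:
     ADD INDEX "idxCost" ON "Customer"
       AREA "Index Area"
       INDEX-FIELD "Cost" ASCENDING                      */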
 

BigSlick

Member
Hi Both,

This is a recent dbanalysis file:

Code:
-------------------------------------------------------
                                                           -Record Size (B)-           ---Fragments--- Scatter
Table                                    Records    Size   Min   Max  Mean                Count Factor  Factor
PUB.Customer                         2646221400  562.2G   196   856   228           2646352284    1.0     1.0


Table                      Index  Fields Levels         Blocks    Size  % Util  Factor
PUB.Customer
  idxCustomerCode            619       1      4        3084296    6.7G    57.4     1.9
  idxLastPurchase            620       4      5       20446317   50.8G    65.5     1.7
  idxPrevPurchase            621       3      5       12798814   32.3G    66.6     1.7
  idxItemcode                622       1      4        1313338    3.0G    59.8     1.8
  idxCost                    623       3      5        5562824   13.7G    65.0     1.7
  idxNumber                  624       3      5        4101014    9.1G    58.8     1.8
  idxDate                    625       1      4         382458    1.4G    97.0     1.1
  idxType                    626       3      5       12761118   35.7G    73.9     1.5
  idxOrder                   618       3      5       14512493   34.7G    63.1     1.7
  idxName                    627       4      5       12689389   31.9G    66.3     1.7
  idxSeq                     628       2      5        7170141   18.2G    66.9     1.7

The table is our main table and takes up around 600 GB, which is 3/4 of all our table data; its indexes, at around 250 GB, are 5/7 of all our index data.

I used 'spring cleaning' in the loose sense :) although it would help to remove any 'dead wood'.

Management are happy that one month is sufficient, although I do agree that we may run other code every so often. I do think we are months and months away from doing anything major here.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
I do think we are months and months away from doing anything major here.
In that case I would collect months and months of _IndexStat data. :) Removing dead wood is great, as long as it's actually dead.

Most of your indexes are below 70% utilization and a few are below 60%, so depending on your access stats you might well get an application performance benefit from compacting or rebuilding those indexes.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
why has idxCost racked up over 7,000 hits while the rest have only 4 or 5
To clarify my prior point, the application may be updating the fields in idxCost much more frequently than it is updating other indexed fields in Customer.

I know you didn't ask about refactoring, but I am led to wonder why things like "Cost" and "Itemcode" and "LastPurchase" are attributes of a Customer. I would expect Cost and Itemcode to be attributes of Items. LastPurchase could be read from the most recent record for a given customer in a Purchase table. So your application might benefit from a normalization exercise, though it might take a lot of conversion work to get there with such a large table. But you could gain long-term benefits from that exercise, including tables and indexes that are smaller and easier to maintain, and greater concurrency. Also, small heavily-read tables and their indexes can be further optimized by assigning them to the Alternate Buffer Pool, which is not usually feasible for large tables and indexes.
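
For reference, Alternate Buffer Pool assignment is cheap to do once the objects live in their own Type II area; a sketch with made-up names:

Code:
# Assign a Type II storage area to the Alternate Buffer Pool:
proutil proddb -C enableB2 "Lookup Area"
# Then size the pool at broker startup, e.g.:
proserve proddb -B 500000 -B2 100000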
 

TomBascom

Curmudgeon
We might be looking at fake table and index names.

But if they are real then yes, the names seem to suggest some odd design choices.

There are also definitely some opportunities to compact those indexes!

I'm suspicious that some of those might be as big as they are as a result of delete placeholders left behind from old purges. I'd do an idxcompact first, then re-run the dbanalys, and then think about candidates for removal.
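
Something along these lines (idxcompact runs online; 80 is the target utilization percent, which is also the default):

Code:
proutil testdb -C idxcompact PUB.Customer.idxOrder 80
proutil testdb -C dbanalys > customer-after.dba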
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
618 (idxOrder) is primary unique
626 (idxType) is unique.
So sparse blocks caused by deletes could be responsible for large index size on those two. It would be interesting to see the outcome of a compact or rebuild of those indexes on a test copy of the DB.
 

BigSlick

Member
Just an update as index compacting on the test box is taking an age:

This is the most up to date dbanalys before any index compacting:

Code:
PUB.Customer
  idxCustomerCode            619       1      4        3149652    6.8G    57.3     1.9
  idxRecordModified          976       1      4        2949829    7.1G    63.8     1.7
  idxLastPurchase            620       4      5       20936416   51.9G    65.5     1.7
  idxPrevPurchase            621       3      5       13118838   33.1G    66.6     1.7
  idxItemcode                622       1      4        1349984    3.0G    59.5     1.8
  idxCost                    623       3      5        5689316   14.0G    65.1     1.7
  idxNumber                  624       3      5        4190198    9.3G    58.7     1.8
  idxDate                    625       1      4         391631    1.4G    97.0     1.1
  idxType                    626       3      5       13107070   36.6G    73.7     1.5
  idxOrder                   618       3      5       14873941   35.6G    63.1     1.7
  idxName                    627       4      5       12980483   32.6G    66.3     1.7
  idxSeq                     628       2      5        7329432   18.6G    66.9     1.7

And this is the first part afterwards (the idxcompact was still running, hence why I can't get all the indexes yet!)

Code:
idxCustomerCode                619       1      4        2186622    6.8G    82.3     1.4
idxRecordModified              976       1      4        2126238    7.1G    88.3     1.2

The number of blocks has reduced and util% has improved, but the size is the same.
 

BigSlick

Member
I mean the size in GB of the indexes: although the blocks have reduced and utilization has improved, the size has stayed the same.

This will require a dump and reload, right?
 

TomBascom

Curmudgeon
4k blocks?

2126238 * 4096 * .883 = 7690109558
7690109558 / (1024 * 1024 * 1024) = 7.16, i.e. the reported 7.1G

The "size" of the index is the amount of data that makes up the index entries -- that shouldn't change much on compaction (unless placeholders are being counted and they get removed -- offhand I don't recall if they are part of that size number or not). So dumping and loading shouldn't appreciably change the "size" of a recently compacted index.

Changing the number of blocks would change the size of the disk footprint if you dump & load and if the storage areas are adjusted (or variable).
 

BigSlick

Member
Yes, 4k.

I'm hoping to move us to 8k in the future. The stumbling block is the size of the database, though.

Thanks for the info!
 