Forum Post: RE: Can the different worlds hear each other?


George Potemkin

Hi TheMadDBA,

"what version of OE was this?"

10.2B08 on AIX, IBM P570, POWER6, 4 CPUs, 4704 MHz, 32 GB RAM.

"what options did you use for all of the new idxbuild parameters?"

I have only the information from the db log:

(13942) Idxbuild is running with threads, the maximum number of threads is 9.
(11443) Temporary sort file at: /test/tmp/ will use the available disk space.
(-----) Process memory 72698912 of 32883343360 in use
(-----) Max merge pool entries estimated at: 250325 w/memory usage: 16405299200
(-----) TMB value is 64, TM value is 32, SG value is 64, packing factor is 80.

A year or so ago the customer tested the new idxbuild parameters and SSDs for the temp files. In the current case I don't care about optimizing the idxbuild run. Maybe it could complete in a day instead of 6 days, but one day out of business is not an option for the customer just to repair 1-2 records with corrupted indexes. Idxbuild can be used during planned downtime, but I would not expect that to happen more often than one day per year.

I am not entirely surprised by the idxfix numbers, though. Now I would say we could have predicted such bad results. For "good" indexes (the ones with a low logical scatter factor) the scan time is directly proportional to the number of records. For "bad" indexes the scan time grows as the number of records raised to a power X, where X is slightly less than 2: idxfix is forced to jump from data blocks to index blocks and vice versa, and due to the high logical scatter factor it's unlikely that the next blocks idxfix reads will be the same as the ones recently used. This causes waits on disk IO. The more records in the table, the larger the indexes, the bigger the jumps between disk IOs and the longer the waits. If one minute is enough to scan 10 million records, then the time to scan almost 10 billion records (as in our case) would be a thousand times longer for a "good" index and almost a million times longer for a "bad" one. A thousand minutes is 17 hours; a million minutes is 694 days, or almost two years. Those are almost exactly the values we got in the tests (see the quick extrapolation below).

A multi-threaded idxfix could cut the scan times by a factor of ten. It's not a perfect solution for large "bad" indexes, but it's one that Progress could implement rather easily: just take the multi-threaded binary dump (and dumpspecified) and replace the final operation - instead of dumping a record, check its indexes. It's as easy as changing the nozzle on an electric tool.

But if Progress could entirely change the algorithm of idxfix, we would get a solution even for "bad" indexes. A possible solution: idxfix can read the index blocks as idxblockreport does and put them into the buffer pool. Simultaneously it should build a temporary index of the recids found in the index keys. We know this index as the "default" index. It's much smaller than any normal index of the same table, so idxfix can use additional memory (separate from -B) to build the "default" index in memory. When the buffer pool is entirely filled by index blocks, idxfix should stop reading them. Now it should read the records using the "default" index and check their indexes. One block in memory would be enough for reading the data blocks. In this phase idxfix reads the data blocks sequentially, without the jumps during disk IO - the corresponding index blocks are already in memory. When the phase is completed, idxfix should return to the first phase to fill the buffer pool with a new chunk of index blocks, and then again read the records using a new instance of the "default" index. A sketch of this two-phase loop follows below.
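First, the timing extrapolation above as a few lines of Python. Only the baseline (1 minute per 10 million records) and the exponents come from the reasoning; everything else is illustration:

```python
# Back-of-envelope extrapolation: scan time grows as (N / N0) ** x, with
# x = 1 for a "good" index and x close to 2 for a "bad" one. The baseline
# of 1 minute per 10 million records is the figure used above.

def scan_minutes(records, base_records=10e6, base_minutes=1.0, x=1.0):
    return base_minutes * (records / base_records) ** x

n = 10e9  # almost 10 billion records, as in our case

good = scan_minutes(n, x=1.0)
bad = scan_minutes(n, x=2.0)

print(f"good index: {good:,.0f} min = {good / 60:.0f} hours")
print(f"bad index: {bad:,.0f} min = {bad / 1440:.0f} days = {bad / 525600:.1f} years")
# good index: 1,000 min = 17 hours
# bad index: 1,000,000 min = 694 days = 1.9 years
```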
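And here is a minimal sketch of the proposed two-phase scan. This is a toy model, not Progress internals: IndexBlock, fetch_record, check_indexes and pool_capacity are all hypothetical stand-ins, and the only point is the control flow described above.

```python
class IndexBlock:
    """Toy index block: holds only the recids its keys point to."""
    def __init__(self, recids):
        self.recids = recids

def two_phase_scan(index_blocks, pool_capacity, fetch_record, check_indexes):
    it = iter(index_blocks)
    while True:
        # Phase 1: read index blocks sequentially (as idxblockreport does)
        # until the "buffer pool" is full, collecting recids into the
        # in-memory "default" index along the way.
        chunk = []
        default_index = set()
        for block in it:
            chunk.append(block)
            default_index.update(block.recids)
            if len(chunk) >= pool_capacity:
                break
        if not chunk:
            return  # all index blocks processed

        # Phase 2: visit the records in recid order, so the data blocks are
        # read sequentially; the matching index blocks are already in memory.
        for recid in sorted(default_index):
            check_indexes(fetch_record(recid), chunk)

# Toy run: 4 index blocks, a pool that holds 2, so the data area is scanned
# in 2 passes, each driven by its own instance of the "default" index.
records = {r: f"record-{r}" for r in range(12)}
blocks = [IndexBlock([11, 2, 7]), IndexBlock([0, 9, 4]),
          IndexBlock([5, 1, 10]), IndexBlock([8, 3, 6])]
two_phase_scan(blocks, 2, records.__getitem__,
               lambda rec, chunk: print("checked", rec))
```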
A new phase of record reads could re-read from disk the data blocks that were already used in previous phases. That is, of course, an overhead. But the current version of idxfix does the same for each record in a data block when it scans a "bad" index, and it does so after a jump from an index block.

Let's take, for example, the worst index from our test. Its size is 200 GB. The box has 32 GB of RAM. We can start the db with a -B of 20 GB, so the whole index will be divided into 10 chunks and we will re-read the area with the records 10 times, using a different instance of the "default" index each time. In the case of a "good" index each pass would re-read only one-tenth of the area, but that's not true for "bad" indexes. On the other hand, it's unlikely that every pass would re-read the whole area. Let's say we re-read each block of the data area 5 times. Then the time to check our index will equal the time of an idxblockreport plus the time of a tabanalys multiplied by 5; the small cost model below puts numbers on this. No doubt that time would be much smaller than 2 years. A couple of our indexes are smaller than 20 GB; they can be processed in just one iteration.

The suggested algorithm could be multi-threaded as well. But I am afraid that if the idxfix enhancements are implemented only in 2 or 3 years, by that time we will no longer have the large customers who need such options.
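As a tiny cost model: the 10 chunks and the 5x re-read factor come from the figures above, while the idxblockreport and tabanalys times are placeholder values, not measurements.

```python
# Rough cost model for the suggested algorithm. Index size, buffer pool
# size and re-read factor are from the example above; the two phase times
# are hypothetical placeholders.

index_size_gb = 200
buffer_pool_gb = 20
chunks = index_size_gb // buffer_pool_gb
print(f"{chunks} chunks of index blocks")  # -> 10 chunks

reread_factor = 5          # assumed average re-reads of each data block
idxblockreport_hours = 1   # placeholder: sequential index scan time
tabanalys_hours = 2        # placeholder: sequential data scan time
total_hours = idxblockreport_hours + reread_factor * tabanalys_hours
print(f"estimated check time: ~{total_hours} hours instead of ~2 years")
```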
