Program Taking Too Much Time For Execution

Naveen Sharma

New Member
Hi Guys,

I am writing a program that updates a table A containing a customer-number field. For every customer number greater than 500000 I have to subtract 300000, so that the number moves into the 200000 range. I have collected the affected customer numbers in a temp-table.
For each record in this temp-table I have to change the corresponding customer number in table A. Table A's primary unique index is not the customer number, and a customer number may be repeated multiple times in the table. My code is taking way too much time to execute. I need suggestions for faster execution, as there are lakhs (hundreds of thousands) of records. The following is an example code snippet:

OUTPUT STREAM sCsv TO VALUE(lv_csvfile).
EXPORT STREAM sCsv DELIMITER "," "Old id" "New id".
FOR EACH tt-customer WHERE tt-customer.tt-cust-old <> "",
    EACH A EXCLUSIVE-LOCK
    WHERE A.customer-number = tt-customer.tt-cust-old:

    ASSIGN A.customer-number = tt-customer.tt-cust-new NO-ERROR.
    IF ERROR-STATUS:ERROR THEN
    DO:
        /* capture the error and write it to a log, e.g.: */
        MESSAGE ERROR-STATUS:GET-MESSAGE(1).
    END. /* IF ERROR-STATUS:ERROR */
    EXPORT STREAM sCsv DELIMITER "," tt-customer.tt-cust-old A.customer-number.
END. /* FOR EACH tt-customer */
OUTPUT STREAM sCsv CLOSE.
 

Cringer

ProgressTalk.com Moderator
Staff member
Sounds like you're pretty much having to do a full table scan on A for each tt-customer record. If that's the case then this will be better:
Code:
define buffer bufferA for A.

for each A no-lock (add table-scan if you're in 11 or higher):
  find tt-customer where tt-customer.tt-cust-old eq A.customer-number no-error.
  if available tt-customer then do for bufferA transaction:
    find bufferA exclusive-lock where rowid(bufferA) eq rowid(A) no-error.
    if available bufferA then bufferA.customer-number = tt-customer.tt-cust-new.
  end.
end.

It's untested pseudo code but you should get the idea.
 

Naveen Sharma

New Member
Sounds like you're pretty much having to do a full table scan on A for each tt-customer record. If that's the case then this will be better:
Code:
define buffer bufferA for A.

for each A no-lock (add table-scan if you're in 11.6 or higher):
  find tt-customer where tt-customer.tt-cust-old eq A.customer-number no-error.
  if available tt-customer then do for bufferA transaction:
    find bufferA exclusive-lock where rowid(bufferA) eq rowid(A) no-error.
    if available bufferA then bufferA.customer-number = tt-customer.tt-cust-new.
  end.
end.

It's untested pseudo code but you should get the idea.


Hi Cringer,

Thanks for the reply. I will definitely try this out. I am using an older version of Progress, 9.1D. I just noticed "table scan" in your reply. Since I don't know what that is, and it seems to be a feature of the newer versions, could you kindly explain what exactly a table scan is and what it is used for?
 

TomBascom

Curmudgeon
A table scan is what happens when you write an inefficient query.

Instead of directly reading only the records you actually want the DB engine must read the entire table in order to figure out which records it needs.

Usually this will be because the WHERE clause does not line up with an index on the table.

When this occurs in a loop or as part of a join performance is awful.

Cringer's solution moves the table scan to an outer loop, so it only happens once rather than over and over again.

Adding a proper index would be ideal but you cannot do that purely in code. You will need to get the DBA involved for that.
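As a sketch of the difference (assuming, as the original post says, that table A has no index on customer-number; the indexed field cust-name is hypothetical, just for illustration):
Code:
/* Bracketed query: the WHERE clause matches the hypothetical
   cust-name index, so only the matching records are read.     */
FOR EACH A NO-LOCK WHERE A.cust-name = "Smith":
END.

/* Table scan: customer-number is not indexed, so the engine must
   read every record in A to evaluate the WHERE clause. Done once
   per tt-customer record, as in the original code, this dominates
   the run time.                                                   */
FOR EACH A NO-LOCK WHERE A.customer-number = "512345":
END.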
 

Cringer

ProgressTalk.com Moderator
Staff member
In addition to Tom's reply: the table-scan keyword on a query tells Progress you're aware it's going to be a table scan, so it shouldn't bother trying to use an index. Just find the records in whatever order they come. It only works from 11 onwards, and only if your data is in Type II storage areas. In most cases it will be quicker to do this than to retrieve the records via the primary index.
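A minimal sketch of the syntax (assumes OpenEdge 11+ and a table in a Type II storage area, so it would not apply to the 9.1D version in this thread):
Code:
FOR EACH A NO-LOCK TABLE-SCAN:
    /* Records arrive in physical (cluster-chain) order,
       not in any index order. */
END.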
 

TomBascom

Curmudgeon
Regarding using the TABLE-SCAN keyword...

This is not something that everyone should do willy nilly. It can make sense if you need to look at the whole table (or a good chunk of it) AND you do not care about the ordering of the data. It is not some sort of magical feature that will turn crappy code into good code. DO NOT USE THIS SOLELY BECAUSE SOMEONE ON THE INTERNET SAID "in most cases it will be quicker". That is only part of what he said. If you use it you should be able to explain why you did so with something a bit more specific than that.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
Another caveat about using the TABLE-SCAN keyword: depending on the state of the table, you might actually do more block reads with it than without it.

The intent of TABLE-SCAN, as Tom said, is that in cases where you know you have to read the entire table and you don't care about record order, you can leverage the object's cluster chain (only in a Type II area) to read all of the table's RM blocks without having to access an index b-tree. Depending on the physical size of the index you might otherwise have used, those index block reads you saved might be, say, 10% to 20% of the RM blocks you read.

But the catch is that you can't assume you would do the same number of RM block reads in either case. Reading the table with an index reads just the RM blocks that are pointed to by index entries; empty RM blocks are not accessed. But a TABLE-SCAN by its nature has no prior information of RM block contents. It just follows the table's cluster chain from beginning to end, reading all of the RM blocks whether they contain records or not.

Let's say the table in question is one you purge occasionally; maybe debug data or non-financial transaction history. After a mass deletion of data, you can reclaim IX blocks with an idxcompact or idxbuild and shrink the index b-trees. But RM blocks belonging to a table remain as RM blocks, even if they have all their records deleted, until you perform a maintenance activity like a dump/drop/add/load of the table. If the table was previously 2 GB and after a purge it is 1 GB (logically, not physically), a FOR EACH with an index might read 1 GB worth of RM blocks, best case. (Realistically it will read more than that, as a purge of half the table's data is very unlikely to delete records only from half of the blocks in the table.) But a FOR EACH... TABLE-SCAN will read all 2 GB of the table's RM blocks. This extra amount of reads could easily be more than the number of IX block reads that the TABLE-SCAN avoids.

Another complicating factor is logical scatter. One advantage of TABLE-SCAN is that although it may read more unique RM blocks than a table scan driven by an index, it visits each RM block only once. With an index-driven read, in the worst case (e.g. a table that is very large relative to the size of -B, with very high logical scatter), you could potentially read each RM block that contains records n times, where n is the table's average actual records per block (as opposed to the area RPB). Dmitri Levin and George Potemkin delivered a very enlightening presentation a couple of years ago that delved into the effects of logical scatter on query efficiency (among many other topics):
DBAnalys To The Rescue!
http://pugchallenge.org/downloads2015/246_DBAnalys.pptx

So in summary... it's complicated. :) I wouldn't say that TABLE-SCAN isn't useful, but I would say that a programmer using it on a given table should have a very good understanding of the CRUD activity on that table. Also, if an assumption is to be made that using TABLE-SCAN will reduce logical I/O, then test the code with realistic data to ensure that the theoretical gain is actually realized in practice.
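One simple way to run such a test (a minimal timing sketch using ETIME; elapsed time is only a proxy for logical I/O, so for actual block-read counts check promon or the table-statistics VSTs, and run each variant more than once so both see a similarly warm buffer pool):
Code:
ETIME(TRUE).                       /* reset the millisecond timer */
FOR EACH A NO-LOCK TABLE-SCAN:
END.
MESSAGE "TABLE-SCAN read:" ETIME "ms".

ETIME(TRUE).
FOR EACH A NO-LOCK:                /* index-driven read */
END.
MESSAGE "Index read:" ETIME "ms".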
 