Question Query Performance Part 2

Hello,

While doing a new project I was testing something on performance, there is two way I tried:
Code:
// Way 1
FOR EACH CUSTOMER NO-LOCK:
    DISPLAY CUSTOMER.custnum CUSTOMER.custlib SKIP.
END.

// Way 2
DEFINE QUERY cust FOR CUSTOMER FIELD (custnum custlib).
OPEN QUERY cust FOR EACH CUSTOMER NO-LOCK.
GET FIRST cust.
DO WHILE AVAILABLE CUSTOMER:
    DISPLAY CUSTOMER.custnum CUSTOMER.custlib SKIP.
    GET NEXT cust.
END.
I used CUSTOMER table as exemple, but for my case I have a table with 30 fields and only need 6 of them. Also I'm reading around 6K records.
It appears that the traditionnal FOR EACH block was much faster (around 3 seconds less) than the QUERY strucutre.
I don't understand why?
Did someone as an explanation ?
For me the QUERY should be faster because I ask 2 less fields, so less data to be pulled. But maybe I mistake things.

Thanks in advance :)

BobyIsProgress
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
When inquiring about query performance, it is relevant to know:
  • OpenEdge release
  • Client connection type (remote, remote localhost, self-service)
  • Database broker client/server parameters
  • Client startup parameters
Together, they can have a significant impact on query performance.
 

TomBascom

Curmudgeon
Aside from the other points, which are all excellent, you are not comparing "apples to apples". Your two examples are not equivalent, the first FOR EACH is not using a FIELD() list so comparing it to the OPEN QUERY that does do so is not an equivalent comparison. Your sample code also lacks anything for timing and if you are timing it manually the DISPLAY side of things could be a chaotic factor.

The following test code is "apples to apples":

Code:
define query q for customer fields ( custnum name ).

etime( yes ).
for each customer fields( custNum name ) no-lock:
end.
display etime.
pause.

query q:forward-only = yes.

etime( yes ).
open query q for each customer fields no-lock.
get first q no-lock.
do while query q:query-off-end = false:
  get next q no-lock.
end.
display etime.

As Rob pointed out your session connection type and startup parameters have a large impact. Especially if it is a client-server connection you will find that the -prefetchNumRecs and -Mm are particularly influential, see this: https://pugchallenge.org/downloads2018/Bascom_Coding.pptx

Somewhat counter-intuitively using a FIELDS() list with a shared memory connection can actually HURT performance. I'm just guessing as to why but I suspect that the AVM is doing extra work to marshall the data even though doing so is kind of pointless if the connection is shared memory.

Having said all of that... my experience is that, all other things being equal, static FOR EACH queries will generally be faster than dynamic queries.
 
Thank you @TomBascom I just did a comparaison with what you explained.
In fact I was missing the syntax to select fields for the for each query and try it.
On the main request I have an improvement of 9 times. That is a huge one.
Best Regards.
 

dimitri.p

Member
Depending on number of records returned, hardware issues or lack thereof, buffer hit rates, other database activity and which runs first (for each vs open-query ), you might end up with the one that runs first, taking longer.
 
Top