Substitute Function Consumes More Time

Code Snippet:
DEFINE VARIABLE vFileName    AS CHARACTER NO-UNDO.
DEFINE VARIABLE v-data       AS CHARACTER NO-UNDO.
DEFINE VARIABLE v-tmpdata    AS CHARACTER NO-UNDO.
DEFINE VARIABLE i            AS INTEGER   NO-UNDO.
/* pattern assumed; the original post does not show lcBaseString */
DEFINE VARIABLE lcBaseString AS CHARACTER NO-UNDO INITIAL "&1&2&3".

ASSIGN
    vFileName = 'mytest_' + REPLACE(STRING(MTIME), ',', '')
    v-data    = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'.

RUN profiler/on.p (vFileName).
DO i = 1 TO 9999:
    v-tmpdata = SUBSTITUTE(lcBaseString, v-tmpdata, "~n", v-data).
    /* v-tmpdata = v-tmpdata + "~n" + v-data. */
END.
RUN profiler/off.p (vFileName).

Results:
SUBSTITUTE function - 3.064793 seconds
'+' (concatenation operator) - 0.883670 seconds
 

tamhas

ProgressTalk.com Sponsor
And, your question is? Substitute() is a lot more powerful and general purpose, having special functionality like nice handling of unknown values and facilitating translation to multiple languages. + is very simple. You are surprised that the complex one takes longer to do an equivalent function? How often do you expect to use substitute() in a tight loop with 10,000 iterations?
 
I understand that not many will be using a tight loop with 10k iterations, but we do have a few scenarios of this kind (and even in a case with a smaller number of iterations, the function consumes 3+ times more time than the operator). [In our application, the concatenated string in this scenario can grow up to 9.2 MB.]

I was surprised to see a built-in function consuming 3+ times more time than the '+' operator, but as you have highlighted, it does a lot more than concatenation. In our case, the values we use for concatenation are in a DB field that gets populated through a couple of UI screens where we have tight validations for '?' (the unknown value); given this, I guess the '+' operator will suit us.
 

tamhas

Allow me to be skeptical. If your usage is something related to a file, as your sample code suggests, then doing *anything* to a file is going to consume so much time that the time required to compute the file name is going to be irrelevant. The same is true if you do anything to the DB.
 
Not sure if I understood completely - sorry for that.

The time that I mentioned is not for running the entire code snippet; it's only for the part of the code below:
DO i = 1 TO 9999:
    v-tmpdata = SUBSTITUTE(lcBaseString, v-tmpdata, "~n", v-data).
    /* v-tmpdata = v-tmpdata + "~n" + v-data. */
END.
I checked both with the profiler and by adding ETIME.
 

TomBascom

Curmudgeon
You are not comparing "apples to apples". Your non-SUBSTITUTE() test should, at least, test for and handle ? since that is what you say is a key part of your validation and the reason for using SUBSTITUTE().
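An unknown-safe '+' version might look something like the sketch below; the exact text SUBSTITUTE() produces for an unknown argument should be checked against the docs, so the empty-string fallback here is an assumption:

```abl
/* Sketch: '+' concatenation with explicit unknown-value checks,
   approximating part of what SUBSTITUTE() handles internally.
   Without these checks, ? + anything yields ?. */
v-tmpdata = (IF v-tmpdata = ? THEN "" ELSE v-tmpdata)
    + "~n"
    + (IF v-data = ? THEN "" ELSE v-data).
```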
 

tamhas

My point is that if the real loop that you run 10,000 times is actually something like this:
compute file name
open file
put something in file
close file
then the compute-file-name time will be a tiny fraction of the total time spent in each pass, small enough that you won't even reliably be able to measure it. Moreover, you suggest that part of the actual values used in the file name comes from the screen, which makes it sound like there is UI in the loop. That is a whole lot of time-consuming stuff, such that worrying about the difference between substitute() and + is pointless.
 

tamhas

Yes, but again, what else is happening in each iteration of the loop? If the time consumed in the other parts is substantial, as it will be if you are doing anything with a file or the database, then the speed difference between these two options is meaningless. And if you have UI between each iteration, then the loop itself is meaningless, because what you really have is a bunch of single executions. If I were experiencing slowness with a WebSpeed function, the speed of a single character operation is not the place I would start looking.
 

TheMadDBA

Active Member
Crazy thought here... you seem to be able to run the profiler, so why not actually profile the entire code and see where you are actually spending your time?
 
The overall response time is ~8 seconds, of which this concatenation operation consumes 3 to 5 seconds (depending on the amount of data that gets concatenated), flushing the data through the webstream consumes 1.5 to 2 seconds, and the rest is the logic (I mean fetching data from tables). I did investigate the rest of the code, both manually and by running the profiler, and it looks good to me.

so why not actually profile the entire code and see where you are actually spending your time.
I did profile the entire code, but I will not be able to post the profiler data for the entire code (company policy). That's why I extracted the code snippet above and replaced the real data with dummy data.
 

tamhas

Something doesn't make sense. I can't see a single substitute command taking 3-5 seconds to complete.
 
Just to confirm: it's not a single SUBSTITUTE call, it's 10k iterations of it, but the DO loop has no other logic or validation in it. Hope we are on the same page.

I ran this test first through the WebSpeed program. Then, to double-check, I extracted the code and ran it separately with dummy data on CHUI, and both gave me similar results. I ran it once again just now in the Procedure Editor in my application, and it took 3 seconds for this loop (which has just the SUBSTITUTE function); with live data it consumes ~5 seconds.
 

tamhas

Well, I'm pretty confused about what you are doing... why is there a 10K-iteration loop which contains only some form of concatenation? Do you really start the loop with 10K values available?
 

TheMadDBA

yeah... exactly where is all of this data to concatenate coming from?

If you are really doing the loop as shown above... the string is getting longer and longer on each iteration as you pass it to the SUBSTITUTE function... where the concat is not doing that.

Exactly how much data (length of the string) do you end up with after the do loop completes?
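One quick way to see the growth effect being described here is to time the two halves of the loop separately. If each pass copies the whole accumulated string, the second half should cost noticeably more than the first. A sketch, using dummy data like the snippet above (variable names are made up):

```abl
/* Sketch: show that copy cost grows with string length.
   Each append copies the whole accumulated string, so the
   second 5000 iterations should take longer than the first. */
DEFINE VARIABLE v-big  AS LONGCHAR  NO-UNDO.
DEFINE VARIABLE v-line AS CHARACTER NO-UNDO INITIAL "XXXXXXXXXX".
DEFINE VARIABLE i      AS INTEGER   NO-UNDO.

ETIME(TRUE).
DO i = 1 TO 5000:
    v-big = v-big + "~n" + v-line.
END.
MESSAGE "first 5000:" ETIME "ms".

ETIME(TRUE).
DO i = 1 TO 5000:
    v-big = v-big + "~n" + v-line.
END.
MESSAGE "second 5000:" ETIME "ms".
```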
 

tamhas

I gather from the ~n in the commands that this is intended to be a multi-line output?
Is each line different?
 
why is there a 10K iteration which contains only some form of concatenation? Do you really start the loop with 10K values available?
In the live application, instead of a DO loop you will see a FOR loop; loops may vary from ~1K to ~15K iterations. The data comes from a field in a DB table. All it does is iterate the table for that particular user, concatenate the data from one of the fields in that table (history records), and flush it to the webstream.

Exactly how much data (length of the string) do you end up with after the do loop completes?
It may vary but the maximum is 9.2 MB of data.
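The production pattern described above would look roughly like this; the table and field names here are made up for illustration:

```abl
/* Sketch of the production pattern described above: iterate the
   history records for one user and accumulate a single field.
   LONGCHAR is used because the result can grow to several MB. */
DEFINE VARIABLE v-out AS LONGCHAR NO-UNDO.

FOR EACH history NO-LOCK
    WHERE history.userId = "someUser":
    v-out = v-out + "~n" + history.histData.
END.
/* v-out is then flushed to the webstream */
```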
 
One quick question: did anyone get a chance to run the code snippet in the post? Can someone run the code snippet in your environment and let me know if it consumes 3 seconds?
 

tamhas

Have you considered other mechanisms altogether? Like populating a temp-table and then exporting it as JSON?

So, are you now saying that there is DB access in this loop? And suggesting that the concatenation is slower than the DB access?
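A minimal sketch of the temp-table idea, assuming a hypothetical ttHistory temp-table (all names here are made up):

```abl
/* Sketch: collect lines in a temp-table, then serialize once.
   WRITE-JSON builds the output in a single pass instead of
   re-copying a growing string on every iteration. */
DEFINE TEMP-TABLE ttHistory NO-UNDO
    FIELD histLine AS CHARACTER.

DEFINE VARIABLE lcJson AS LONGCHAR NO-UNDO.
DEFINE VARIABLE i      AS INTEGER  NO-UNDO.

DO i = 1 TO 9999:
    CREATE ttHistory.
    ttHistory.histLine = "XXXXXXXXXX".
END.

TEMP-TABLE ttHistory:WRITE-JSON ("LONGCHAR", lcJson).
```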
 