Question TCP vs. Shared Memory

Cecil

19+ years of Progress programming and still learning.
Does anyone have any test results of connecting to a multi-user database via the TCP vs. Shared Memory?

24 years ago, on 8.3C, we found there was a massive performance gain from accessing the database via shared memory vs. the TCP stack.
I think it was between a 20% and a 33% performance gain.

I was hoping for some more up-to-date published stats.
 

Cringer

ProgressTalk.com Moderator
Staff member
Does anyone have any test results of connecting to a multi-user database via the TCP vs. Shared Memory?

24 years ago, on 8.3C, we found there was a massive performance gain from accessing the database via shared memory vs. the TCP stack.
I think it was between a 20% and a 33% performance gain.

I was hoping for some more up-to-date published stats.
There is a big "it depends" part of this answer. There are certain activities (I can't remember which!!) that are quicker via TCP.
 

tamhas

ProgressTalk.com Sponsor
I was going to note that it is architecture specific. An application that fetches lots of records ... whether necessary or not ... will be significantly faster with shared memory than TCP, whereas one that fetches minimal records and minimal fields may not contrast nearly as much.
 

TomBascom

Curmudgeon
Plus ça change, plus c'est la même chose... (the more things change, the more they stay the same)

There have been a couple of significant enhancements to "queries" (the FOR EACH NO-LOCK statement and its dynamic cousins):
  • Since 10.2B06 the -prefetch* startup parameters allow network messages to contain *much* more data: rather than a maximum of 16 records, you can send thousands, and the first message can be full instead of containing only one record
  • Since 11.6 -Mm only needs to be set on the server - no more need to track down *every* client startup to increase the message size
  • TABLE-SCAN was added (11.?) to FOR EACH for cases where you are reading all (or most) of a table and don't need to pay attention to indexed order
  • Server side joins roll out in oe12 with incremental improvements from release to release
  • Multi-threaded servers with oe12 mean that you no longer need lots and lots of "server" processes
So, to the degree that your application performance is sensitive to large NO-LOCK result sets being returned from the server to the client, these changes can be hugely beneficial.
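
To make that concrete, here is a minimal sketch of the kind of streaming NO-LOCK query that benefits from these features (the table and field names, "order" and "order-total", are hypothetical). The -prefetch* and -Mm settings belong on the server/broker startup rather than in the 4GL code, so they only appear here as a comment.

Code:
/* sketch only - "order" and "order-total" are hypothetical names.
 * A large NO-LOCK sweep like this is what the -prefetch* parameters
 * and the server-side -Mm setting help with: each network message
 * gets packed with many records instead of a handful.
 * TABLE-SCAN skips index order entirely, which is fine when you are
 * reading the whole table anyway.
 */
define variable grand-total as decimal no-undo.

for each order table-scan no-lock:
  grand-total = grand-total + order.order-total.
end.

display grand-total.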

In fact, as Cringer alludes, it is possible that queries which take advantage of these features are actually faster over TCP/IP than they are in shared memory. That is because you are streaming a big hunk of data and there are two CPUs simultaneously working on the job - one on each end of the stream. The startup latency isn't notable in a case like that.

However...

The mechanics of a FIND statement have not changed, and 4GL applications use a LOT of FIND statements. At best, a FIND statement costs three network messages: the request, the response, and releasing the cursor. At worst it can be a LOT worse - five, six, seven messages going back and forth for one record.
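
As a hedged illustration (again with hypothetical table and field names), this is the sort of record-at-a-time FIND pattern that racks up those messages - every pass through the loop pays at least one full network round trip before any useful work gets done:

Code:
/* sketch only - "order" and "customer" are placeholder tables.
 * Over TCP each FIND below is at least a request, a response, and a
 * cursor release; at a 35ms ping the best case already spends tens of
 * milliseconds per record just on the wire.
 */
for each order no-lock:
  find first customer where customer.custnum = order.custnum no-lock no-error.
  if available customer then
    display order.ordernum customer.name.
end.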

Some benchmarking:

[attached chart: benchmark timings for FIND and FOR EACH JOIN TABLE-SCAN over shared memory, "LAN default", and "LAN alternate" connections]
Shared memory is obviously a lot faster for FIND statements.

The FOR EACH JOIN TABLE-SCAN (FEJTS) is an example of streaming a large result set. In that particular case the TCP/IP connection handily beat shared memory. (In real life you probably don't do that very often.)

"LAN default" and "LAN alternate" were two different ping times (0.058ms and 0.069ms). I tried to benchmark a third option with a WAN whose ping times were on the order to 35ms. I killed the benchmark after 18 hours,

In my benchmarking, PING time is the most critical factor in FIND performance. You want to do whatever you can to minimize it. "Good" ping time is less than 0.1ms. If your pings are greater than 1ms, that is not good. Tens of milliseconds is horrific.
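
If you want to see what your own connection costs, a rough sketch like this times a batch of repeated FINDs from a client session ("customer" is just a placeholder table; a more realistic test would fetch a different record each time). Over a remote connection the average will track your ping time closely; in shared memory it will be far smaller.

Code:
/* rough sketch: time 1,000 repeated FINDs and report the average. */
define variable i       as integer no-undo.
define variable msStart as integer no-undo.

msStart = etime(yes).    /* reset the millisecond timer */
do i = 1 to 1000:
  find first customer no-lock no-error.
end.
display etime / 1000.0 format ">>9.999" label "avg ms per FIND".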

Remember - the speed of light is roughly one foot per nanosecond, and signals travel through fiber at only about two-thirds light speed. The more times that signal gets handed off through routers, switches, NICs, firewalls, and such, the more latency you are going to endure.

That means that you really want your clients and your servers to be physically close. If you are running in the cloud try to use Azure "proximity placement groups" or AWS "local zones".
 

TomBascom

Curmudgeon
I was going to note that it is architecture specific. An application that fetches lots of records ... whether necessary or not ... will be significantly faster with shared memory than TCP, whereas one that fetches minimal records and minimal fields may not contrast nearly as much.

This brings up an interesting point.

In a lot of cases applications reference certain records over and over and over with a FIND statement.

These records are frequently some sort of control or "reference" record that either never changes, or changes so infrequently as to be effectively read-only.

I have seen many cases where well over *half* of the database workload is coming from such tables with as few as a dozen records in them. When the application was created this was no big deal: there were not many users and the workload wasn't significant enough for this to be a problem. Now, at the vendor's unexpectedly large customer sites, it is a different story...

You can make *huge* performance improvements by simply caching that data on the client.

And, in some cases, that is an incredibly simple thing to do. If you just happen to have an include file that every program includes (and many applications do just happen to have such an include file) then all you need to do is this:

Code:
/* common.i -- an include file that every program includes at the top
 */

define new global shared temp-table whatever no-undo like dbname.whatever.

find first whatever no-error.
if not available whatever then
  for each dbname.whatever no-lock:
    create whatever.
    buffer-copy dbname.whatever to whatever.
  end.

/* all existing references to "whatever" will now read from the temp-table!
 */

You can, of course, get a lot fancier. You might, for instance, only want to test for an empty temp-table in your -p startup procedure. Or you might want to add some sort of "time to live" parameter and occasionally refresh the cache if the data does change from time to time.
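
For example, a "time to live" variant might look something like this rough sketch (the 5-minute TTL and the cacheLoaded variable name are made up):

Code:
/* sketch of a TTL-based refresh: re-load the cached temp-table when it
 * has never been loaded or is older than 5 minutes.
 */
define new global shared variable cacheLoaded as datetime-tz no-undo.

if cacheLoaded = ? or interval(now, cacheLoaded, "seconds") > 300 then
do:
  empty temp-table whatever.
  for each dbname.whatever no-lock:
    create whatever.
    buffer-copy dbname.whatever to whatever.
  end.
  cacheLoaded = now.
end.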

You could even "bite the bullet" and encapsulate the cache into some handy OO code. But that's a topic for another thread ;)
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
Since 10.2B06 -prefetch* startup parameters
Added in 11.1 and back-ported to 10.2B06. So if you're on 11.0, that's one more reason you're in unfortunate circumstances.

TABLE-SCAN was added (11.?) to FOR EACH
11.0.

Server side joins roll out in oe12 with incremental improvements from release to release
  • 12.0: FOR EACH... NO-LOCK
  • 12.1: Some dynamic queries (with FORWARD-ONLY and NO-LOCK)
  • 12.2: Static or dynamic queries, except those with INDEXED-REPOSITION or an outer join
    You can also obtain a record with a lock with a GET statement (static query) or GET method (dynamic query), e.g. GET-NEXT().
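
As a hedged illustration of the 12.1-era case - a dynamic query that is both NO-LOCK and FORWARD-ONLY - something along these lines is the shape of query that can qualify (customer/order and their fields are placeholder names):

Code:
/* sketch only - "customer" and "order" are placeholder tables. */
define variable hq as handle no-undo.

create query hq.
hq:set-buffers(buffer customer:handle, buffer order:handle).
hq:forward-only = true.      /* per the 12.1 note above: FORWARD-ONLY + NO-LOCK */
hq:query-prepare("for each customer no-lock, each order of customer no-lock").
hq:query-open().

do while hq:get-next(no-lock):
  display customer.name order.ordernum.
end.

hq:query-close().
delete object hq.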
I killed the benchmark after 18 hours,
Was there more to that sentence? ;)
 

Cringer

ProgressTalk.com Moderator
Staff member
This brings up an interesting point.

In a lot of cases applications reference certain records over and over and over with a FIND statement.

These records are frequently some sort of control or "reference" record that either never changes, or changes so infrequently as to be effectively read-only.

I have seen many cases where well over *half* of the database workload is coming from such tables with as few as a dozen records in them. When the application was created this was no big deal: there were not many users and the workload wasn't significant enough for this to be a problem. Now, at the vendor's unexpectedly large customer sites, it is a different story...

We have a case in point. This table has one record. Note the scale on the left.

[attached chart: database read activity against the one-record table]
 