Question Import Data volume increase to 300,000 causing DB updates to almost halt after updating 38000 records - Progress 11.7 Windows client - Linux db server

Shail

New Member
Progress 11.7 Windows client - Linux db server
-B 3,000,000, -B2 500,000, -L 1,000,000

We have recently increased the Import Data volume to 300,000 lines causing DB updates to almost halt after updating 38000 records.
I have cut the Import Data File to contain 100,000 lines - as a result the dead slowness is observed after the update of 44000 records.
Previously the data volumes were 50,000 lines and there weren't any issues.

-Mm 32000 is set for the application/client connection

There is enough space (64Gb) for BI files on the mount point on the Linux machine. The DB size is 300 Gb with enough disk space for data extents.

What would cause the slowness ? We have checked the data - no issues with it.
Would setting any other startup parameter, tuning etc may help ?

Any ideas please post,

Kind regards
Shail
 

TomBascom

Curmudgeon
Is "Import Data" an application specific function?

You are apparently importing something with a Windows client and then inserting or updating records via a client/server connection to a Linux host.

On the Windows side a great deal depends on the design of the code that is performing this "import" function. Does it attempt to find the record on the db to determine if it is an insert or an update? If so, are the relevant queries efficiently written? Is it batching groups of records together or doing them one at a time? Are there client side bottlenecks visible? (Large DBI files or SRT files for instance. Or bottlenecks reading the data from disk? Or updating temporary files?)

The network layer between client and server is an obvious potential bottleneck. You mention -Mm 32000. That is useful for large NO-LOCK queries, which might perhaps be related to checking if data is already in the target db, but pretty much useless for creates and updates (CREATE and UPDATE messages only ever contain a single record.) If you are increasing -Mm then you should, of course, also add:

Code:
-prefetchDelay                  # do not immediately send the first record -- wait to see if the query returns more than one
-prefetchFactor 100             # attempt to fill messages 100% or until NumRecs is reached
-prefetchNumRecs 4000           # attempt to put X records n a message, use a large number -- FIELDS phrases can leverage that; max value is 32766
-prefetchPriority 100           # aka "pollskip", check for messages (poll) from client every X records
-Nmsgwait 1                     # the number of seconds a server waits for a remote network message before checking for other events such as a database shutdown or forced disconnect of a remote client

The db server itself has many, many more parameters and configuration options than just -B, -B2 and -L.

To see what your database performance bottlenecks are you want a good monitoring tool like ProTop

Just for starters - for update heavy workloads you would want to be especially certain that the bi and ai blocksizes are 16k and the bi cluster size is at least 64MB and that AIW, BIW and APW "helper" processes are running. Your underlying hardware needs to be pretty good too if you're expecting to upload large amounts of data quickly. Remember - there is no such things as "a high performance SAN". And more cores are not better. A modest number of FAST cores are what you really want. Just as important is the structure of the database - both the configuration of storage areas and the assignments of tables and indexes to those storage areas. You can gain a lot of benefits from understanding the technical characteristics of your data and arranging for proper storage areas.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
Set reasonable values for -aibufs and -bibufs on the primary broker, say 128.

It would also be helpful to know the current values of your other broker configuration parameters apart from the few you have mentioned.

What is the OE service pack on the database server? It is found in $DLC/version. It could be relevant.

On the Windows side a great deal depends on the design of the code that is performing this "import" function.
Seconded. This is hugely important. No amount of database tuning can make up for code that is written badly enough. And given that you already have a handicap by doing mass updates across a network, this code needs to be carefully written with efficiency in mind, including optimizing network traffic.
 

Shail

New Member
Thanks for that, appreciate

Other startup parameters on the db are currently as below:
-Mn 15, -Mm 32600, -maxport 10,000 -n 75, -Mqb 15, -pinshm, -lruskips 100, -lru2skips 100, -bistall, -bithold 6144, -bibufs 50, -pica 8192
Linux OS has 4096 blocksize - db blocksize is 8192
AI is not enabled on this db.

64-bit Windows Server 2012R2, 8Gb RAM with 2 cores each 2.7 GHz
I haven't seen any srt files on the -T (progress11\temp folder or anywhere else)

OE Service Pack 11.7.4.0

Unfortunately all the processing is on the Windows box.
Client parameters: -inp 32000, -s 100, -rand 2, -Mm 32600 -D 300, -mmax 5500, -cpstream ISO8859-1, -T c:\progress11\temp, -noincwarn

The data from the data file is imported into temp-tables and gets filled into prodataset/ datasource - this part completes without issues, displaying the temp-table data for chosen columns in a browse.

Update of the Main-record with an Ex-lock and its subsidiary records in a transaction, from within a For-Each of ttMain-record Block where field Found=Yes on the ttMain record.

Next, Insert of those Main-records and their subsidiary-records from within a For-each ttMain-record, where field Found=No. These get a Batch record against them, so that the entire batch could be deleted, if not needed, via a separate Submission process.

Just attempted an import of 50,000 lines of data - it got stuck on Line 46454, while creating a new record - no errors, just not moving any forward from there- connection is still present.

Structure file (similar dos version) is as below, BI Files are located in a separate mount point
#
b c:\db\folder-name\bi\ldb_name.b1 f 2000000
b c:\db\folder-name\bi\ldb_name.b2
#
d "Schema Area":6,64;1 c:\db\folder-name\SchemaArea\ldb_name.d1 f 50048
d "Schema Area":6,64;1 c:\db\folder-name\SchemaArea\ldb_name.d2
#
d "S128-512C":7,128;512 c:\db\folder-name\S128-512C\ldb_name_7.d1 f 2097024
d "S128-512C":7,128;512 c:\db\folder-name\S128-512C\ldb_name_7.d2
#
d "S128-8N":8,128;8 c:\db\folder-name\S128-8N\ldb_name_8.d1 f 100096
d "S128-8N":8,128;8 c:\db\folder-name\S128-8N\ldb_name_8.d2 f 100096
d "S128-8N":8,128;8 c:\db\folder-name\S128-8N\ldb_name_8.d3 f 100096
d "S128-8N":8,128;8 c:\db\folder-name\S128-8N\ldb_name_8.d4
#
d "S16-512C":9,16;512 c:\db\folder-name\S16-512C\ldb_name_9.d1 f 2097024
d "S16-512C":9,16;512 c:\db\folder-name\S16-512C\ldb_name_9.d2
#
d "S16-8N":10,16;8 c:\db\folder-name\S16-8N\ldb_name_10.d1 f 50048
d "S16-8N":10,16;8 c:\db\folder-name\S16-8N\ldb_name_10.d2
#
d "S256-512C":11,256;512 c:\db\folder-name\S256-512C\ldb_name_11.d1 f 2097024
d "S256-512C":11,256;512 c:\db\folder-name\S256-512C\ldb_name_11.d2
#
d "S256-8N":12,256;8 c:\db\folder-name\S256-8N\ldb_name_12.d1 f 50048
d "S256-8N":12,256;8 c:\db\folder-name\S256-8N\ldb_name_12.d2
#
d "S32-512C":13,32;512 c:\db\folder-name\S32-512C\ldb_name_13.d1 f 2097024
d "S32-512C":13,32;512 c:\db\folder-name\S32-512C\ldb_name_13.d2
#
d "S32-8N":14,32;8 c:\db\folder-name\S32-8N\ldb_name_14.d1 f 50048
d "S32-8N":14,32;8 c:\db\folder-name\S32-8N\ldb_name_14.d2
#
d "S64-512C":15,64;512 c:\db\folder-name\S64-512C\ldb_name_15.d1 f 2097024
d "S64-512C":15,64;512 c:\db\folder-name\S64-512C\ldb_name_15.d2
#
d "S64-8N":16,64;8 c:\db\folder-name\S64-8N\ldb_name_16.d1 f 50048
d "S64-8N":16,64;8 c:\db\folder-name\S64-8N\ldb_name_16.d2 f 50048
d "S64-8N":16,64;8 c:\db\folder-name\S64-8N\ldb_name_16.d3
#
d "SmallIndexes":17,1;8 c:\db\folder-name\SmallIndexes\ldb_name_17.d1 f 50048
d "SmallIndexes":17,1;8 c:\db\folder-name\SmallIndexes\ldb_name_17.d2 f 50048
d "SmallIndexes":17,1;8 c:\db\folder-name\SmallIndexes\ldb_name_17.d3 f 50048
d "SmallIndexes":17,1;8 c:\db\folder-name\SmallIndexes\ldb_name_17.d4
#
d "LargeIndexes":18,1;64 c:\db\folder-name\LargeIndexes\ldb_name_18.d1 f 2097024
d "LargeIndexes":18,1;64 c:\db\folder-name\LargeIndexes\ldb_name_18.d2


Kind regards
Shail
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
The structure of the Linux database is the same as the Windows example you have shown?

Database:
  • -Mqb 15: I assume you meant -Mpb?
  • -Ma?
  • -Mi?
  • -ServerType? There are no SQL users?
  • -bibufs 50: This is on the low side for mass updates. Check promon R&D, 2, 5 for empty buffer waits.
  • No AI? I assume this database is in a test environment, rather than production.
  • -pica 8192: This is an OE Replication-specific parameter, which you are not using.
  • -Mm 32600: You are at the maximum. Have you experimented with lower values for this process?
  • OE 11.7.4: note that you are 11 updates behind current (11.7.15). 11.7.4 was released on October 23, 2018.
  • How many remote users are connected to the server that the update client is connected to? What are they doing at the time of the update?
  • What other database I/O is happening in the database at the time of the update?
  • For the table(s) you are writing to, check where the area high water mark is in each of those areas: in a fixed extent or in the variable? Given that you said the database is 300 GB but the sum of your fixed extent sizes is 12.7 GB, I assume you are writing in the variable extents. Do you have throughput challenges with other periods of mass create/update in the database, or just this one in particular?
  • Are there any long-running database transactions during the update, either for the update client or any others? Check promon R&D, 1, 4, 3.
  • Are there any errors or warnings in the database log? You have set -bistall, -bithold 6144. Is the database in a BI stall? If so, that would produce symptoms like you are seeing: a connected client that is trying to process updates and seems "stuck". You will see messages (9239) and (6560) in the database log, e.g.:
    (9239) BI File Threshold of 6144 MBytes has been reached.
    (6560) Forward processing stalled until database administrator intervention.

Client:
  • -Bt?
  • -tmpbsize?
  • How much data are you loading into temp-tables during this upload operation? What is the ultimate size of the temp-table database (DBI file)? Does it grow at all or does it stay at the intial size?
  • Use -clientlog and check the log for warnings or errors.
  • -q?
  • Are you running r-code or doing session compiles with source?
  • Are there .pl files?
  • Is the propath unusually long?

Just attempted an import of 50,000 lines of data - it got stuck on Line 46454, while creating a new record - no errors, just not moving any forward from there- connection is still present.
  • Check promon 5 at this time, while the client is "stuck"; what is happening in the database? What is the rate of record creates or updates leading up to this? How does it vary over time, from line 1 to 46454?
  • Run a proGetStack on the client (proGetStack <client PID>) once the creates stop. You will see the ABL call stack in the file protrace.<PID> in the working or temp directory. Run it a few times, several seconds apart, to get multiple stack traces in the same file. Compare the top frame in each stack, to see if they show the same module/line number or different ones. You can find that line in the source by running a compile debug-list.
  • What is the OS resource utilization of the client process over time? Monitor with Process Explorer, look at process disk I/O, network I/O, CPU utilization. In particular, what if anything is it doing once it is "stuck"?
 

Shail

New Member
Thanks for all help and showing me process explorer.
Please find attached the actual .st from the Linux box, which is a lot longer with actual extent sizes.
The db Server is started for Both - 4gl/sql
Only one user is connected to this Test db.
no other I/O at the time of Update. [DB Startup Log attached]
-shmsegsize 32768 Mb
-Ma 5
-Bt is not currently applied, neither is -tmpbsize
Before-Image Cluster Size: 33554432.
Before-Image Block Size: 16384.
Maximum Number of Clients Per Server (-Ma): 5
Memory overwrite check (-MemCheck): Not Enabled
Delay of Before-Image Flush (-Mf): 3
Minimum Clients Per Server (-Mi): 1
Message Buffer Size (-Mm): 32600
Maximum Number of Servers (-Mn): 16
Servers per Protocol (-Mp): 0
Maximum Servers Per Broker (-Mpb): 15
Excess Shared Memory Size (-Mxs): 51
Broker server group support (-ServerType): BOTH
SQL Server Max Open Cursors (-SQLCursors): 50
Number of seconds for a SQL client to wait for a record lock(-SQLLockWaitTimeout): 5.
SQL Server Stack Size (-SQLStack): 1000

Many thanks
Shail
 

Attachments

  • dbname-st.txt
    27.1 KB · Views: 1
  • dbStartupLog.txt
    18.5 KB · Views: 1

Rob Fitzpatrick

ProgressTalk.com Sponsor
Have you audited your code, or looked at a compile listing, to determine your transaction scope?

Did your test database stall during the test? That scenario is the best fit for the symptoms you described.
 

Shail

New Member
Hi Rob, Tom - thanks , please herewith some more detail

-cpinternal ISO8859-1,-cpstream ISO8859-1,-cpcoll Basic,-cpcase Basic,-d dmy,-numsep 44,-numdec 46,(end .pf),-p mainprogram.w,-pf \\WinServer\folder-name\dbname\dbname.pf,-inp 32000,-s 100,-rand 2,-Mm 32600,-D 300,-mmax 5500,-cpstream ISO8859-1,-db dbname,-ld dbname,-S 5959,-H linux-machine,-N tcp,-T c:\progress11\temp,-noincrwarn,(end .pf),-ininame \\WinServer\folder-name\dbname\application_dbname.ini,-debugalert

No errors have shown up in .lg file regarding bi or anything else

Data file volume effects- the length of every line is 1320 characters and the least number of lines (I have tried) that it gets stuck with is 50,000 (at Line 46454) . It goes straight through with no issues when a 40,000 lines file is imported. But when tried loading a 320,000 line file, it got stuck at line 38,411.

procedure library - yes, .pl file is used as well.

ProGetStack - he Lines that it says are from a procedure inside the program, aren't actually there in that procedure in the source program (.p)
Exception code: C0000005 ACCESS_VIOLATION
Fault address: 71C2A80A 01:001E980A C:\Progress11\OpenEdge\bin\prow32.dll

Registers:
EAX:00000000
EBX:036F60A4
ECX:0039EA38
EDX:00000436
ESI:72327E70
EDI:0039EA38
CS:EIP:0023:71C2A80A
SS:ESP:002B:0039E9F8 EBP:71C2E300
DS:002B ES:002B FS:0053 GS:002B
Flags:00210202


Xref - a compile with xref shows three whole-index that are only small codes' tables so shouldn't be a problem
Listing - still working on this - the scoping of transactions for update appears correct - still checking

Thanks so much
Shail
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
The access violation would be from a crash rather than a freeze. I suggest you open a case with tech support.

You aren't using -q. That's a quick win for performance. Note that you will need to restart your client to pick up new code.

ProGetStack - he Lines that it says are from a procedure inside the program, aren't actually there in that procedure in the source program (.p)
Do you mean that the top of the ABL stack shows line x in module.p but the module.p source file has fewer than x lines of code? This is possible when you have include files. This is why you need debug listings of your code. The compile debug-list output of a procedure contains all of the source in the module and any of its includes, even if there are nested includes. Note that this is different from the output of a compile listing. Read the entry in the docs for the compile statement for more info.
 

Shail

New Member
1. Next applying -q to the client .pf and restart it.
2. Have opened a case with tech support.
3. Just noticed that a 50,000 Lines data update was started 7 hours ago - had reached up to 46,454 lines in 12 minutes, but then had stopped moving , so had abandoned it because it hadn't for over half-an-hour - just noticed it has moved to Line 48,488 as below, and is 97% complete - still updating, very slowly.

1659039935552.png
 

Shail

New Member
1. Next will be do a debug-list on the compile to read protrace - thanks for that
2. Indeed, -q has made a difference - within 12 minutes it has got to where it did after 7 hours without -q, but has stopped moving further now.
I have noticed that the moment it gets to this point of "no movement", the window flickers - also there is no PROCESS EVENTS anywhere in the entire program
1659043713936.png
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
Possible memory leak? Watch the Private Bytes graph on the Performance Graph tab in Process Explorer for your prowin32.exe process.

Also, did you check DBI file size for changes during the load?
 

Shail

New Member
Can't find dbi or srt files in ( -T c:\progress11\temp folder) - possibly because it's a TDE database
1659078561036.png
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
Can't find dbi or srt files in ( -T c:\progress11\temp folder) - possibly because it's a TDE database
Your temp files still exist. The DBI may be encrypted and hidden but it is still there. Configure File Explorer to show hidden files.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
You can use Ctrl-h on a Process Explorer line item to show a lower pane with a list of the process' file handles. Sort the list on the Name column and you will find the location of the DBI*, srt*, etc.
 

Shail

New Member
great, thanks - yes the files are there - haven't been able to access them yet
1659104919073.png
Have put some Logs in the code as below - is there a way to have Log-manager to write them UNBUFFERED because I only get to see a zero-sized Log file from the below.
IF NOT lAppliedLog and ttLineNo GT 46000 THEN
DO:
lAppliedLog = YES.
LOG-MANAGER:LOGFILE-NAME = "c:\temp\test.log".
LOG-MANAGER:LOGGING-LEVEL = 3. // Verbose
LOG-MANAGER:LOG-ENTRY-TYPES = "DB.Connects,4GLTRACE,4GLMESSAGES,FileID,ProEvents.Other:4".
END.
 
Top