Mike
Moderator
We got an alert from our monitoring system this morning that the Qxtend inbound port was not responding to queries. We were also receiving related transaction failures from Boomi. Upon investigation, I discovered that all of the available appservers for topro_as were busy. This was confirmed by running a debug check from the monitoring script on delop:
root@delop:/opt/wi# ./nagios-rt_debug
LicenceException: No Licensed Agents available at com.qad.qxtend.adapters.ServiceInterAdapter.configureAdapter
mfg@nait:~/wrk$ asbman -i topro_as -q
OpenEdge Release 11.7.9 as of Fri Jan 8 11:16:01 EST 2020
Connecting to Progress AdminServer using rmi://localhost:20931/Chimera (8280)
Searching for topro_as (8288)
Connecting to topro_as (8276)
Broker Name : topro_as
Operating Mode : Stateless
Broker Status : ACTIVE
Broker Port : 3093
Broker PID : 4605
Active Servers : 4
Busy Servers : 4
Locked Servers : 0
Available Servers : 0
Active Clients (now, peak) : (5, 5)
Client Queue Depth (cur, max) : (0, 4)
Total Requests : 114042
Rq Wait (max, avg) : (3638 ms, 5 ms)
Rq Duration (max, avg) : (1809472 ms, 167 ms)
PID State Port nRq nRcvd nSent Started Last Change
16765 RUNNING 02002 000459 000461 000458 Feb 8, 2024 03:40 Feb 8, 2024 10:16
12575 RUNNING 02003 000461 000461 000460 Feb 8, 2024 03:40 Feb 8, 2024 10:19
16262 RUNNING 02006 000005 000005 000004 Feb 8, 2024 10:19 Feb 8, 2024 10:22
16265 RUNNING 02007 000012 000012 000011 Feb 8, 2024 10:19 Feb 8, 2024 10:25
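For reference, the two numbers that stand out in that status output are Available Servers = 0 and a maximum request duration of 1809472 ms (roughly 30 minutes). Below is a minimal sketch of how those checks could be automated against the text printed by asbman -q. It is not part of our monitoring script; the function names are hypothetical, and it assumes the summary lines keep the "Key : value" layout shown above.

```python
import re


def parse_broker_status(text):
    """Collect the 'Key : value' summary lines of asbman -q output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Summary lines separate key and value with ' : '; other lines
        # (banner, rmi:// URL, server table rows) do not contain it.
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def is_saturated(fields):
    """True when every running appserver agent is busy and none is available."""
    return int(fields["Available Servers"]) == 0 and int(fields["Busy Servers"]) > 0


def max_request_duration_ms(fields):
    """Return the 'max' part of 'Rq Duration (max, avg)' in milliseconds."""
    match = re.match(r"\((\d+) ms, (\d+) ms\)", fields["Rq Duration (max, avg)"])
    if match is None:
        raise ValueError("unexpected Rq Duration format")
    return int(match.group(1))
```

Feeding it the captured output above would report the broker as saturated, which matches what the nagios debug check was indirectly telling us.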
Performing some additional investigation on these appserver processes showed that they were stuck:
root@nait: /etc/cron.d# strace -fp 16765
strace: Process 16765 attached
semop(4544, [{16, -1, 0}], 1^Cstrace: Process 16765 detached
<detached ...>
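To spell out how I read that strace line: semop takes a semaphore set id, a list of operations and a count, and each operation is {sem_num, sem_op, sem_flg}. So the process is trying to decrement semaphore 16 in set 4544 by 1 with no flags, which blocks until that semaphore's value allows it. The sketch below just decodes such a line; the helper names are hypothetical, the pattern assumes the positional format strace printed here, and the IPC_NOWAIT value is the Linux one.

```python
import re

# Linux value of IPC_NOWAIT (octal 04000); assumed here, not read from the system.
IPC_NOWAIT = 0o4000

# Matches the positional form shown above: semop(semid, [{num, op, flg}], nsops
SEMOP_RE = re.compile(r"semop\((\d+), \[\{(\d+), (-?\d+), (\d+)\}\], (\d+)")


def parse_semop(line):
    """Decode a single-operation semop() line from strace, or return None."""
    match = SEMOP_RE.search(line)
    if match is None:
        return None
    semid, sem_num, sem_op, sem_flg, nsops = map(int, match.groups())
    return {
        "semid": semid,
        "sem_num": sem_num,
        "sem_op": sem_op,
        "sem_flg": sem_flg,
        "nsops": nsops,
    }


def is_blocking_wait(op):
    """A negative sem_op without IPC_NOWAIT waits until the semaphore is high enough."""
    return op["sem_op"] < 0 and not (op["sem_flg"] & IPC_NOWAIT)
```

This only tells us the process is waiting on a semaphore, not who holds it or why.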
I restarted the appserver (as user mfg) to resolve the issue:
mfg@nait:~/wrk$ asbman -i topro_as -x
OpenEdge Release 11.7.9 as of Fri Jan 9 11:16:01 EST 2020
Connecting to Progress AdminServer using rmi://localhost:20931/Chimera (8281)
Searching for topro_as (8288)
Connecting to topro_as (8276)
Starting siprod_as. Check status. (8296)
mfg@nait:~/wrk$ asbman -i topro_as -q
OpenEdge Release 11.7.9 as of Fri Dec 8 11:16:01 EST 2020
Connecting to Progress AdminServer using rmi://localhost:20931/Chimera (8280)
Searching for topro_as (8288)
Connecting to topro_as (8276)
Broker Name : topro_as
Operating Mode : Stateless
Broker Status : ACTIVE
Broker Port : 3093
Broker PID : 22744
Active Servers : 3
Busy Servers : 1
Locked Servers : 0
Available Servers : 2
Active Clients (now, peak) : (5, 5)
Client Queue Depth (cur, max) : (0, 2)
Total Requests : 26
Rq Wait (max, avg) : (3558 ms, 272 ms)
Rq Duration (max, avg) : (3602 ms, 287 ms)
PID State Port nRq nRcvd nSent Started Last Change
22782 RUNNING 02008 000019 000019 000018 Feb 8, 2024 10:43 Feb 8, 2024 10:44
22836 AVAILABLE 02009 000005 000005 000005 Feb 8, 2024 10:44 Feb 8, 2024 10:44
22837 AVAILABLE 02010 000003 000003 000003 Feb 8, 2024 10:44 Feb 8, 2024 10:44
Shortly after I did this, all of the available appservers became occupied again. At this point, it is probably best to wait it out, as something appears to be sending a large number of requests to Qxtend.
Can anybody tell me why this happened? What was the cause? What investigation steps should we take, and how can we fix it?
I need the RCA, please.
Thanks and Regards
Mike