Hanging AppServer

MattKnowles

New Member
We currently have a customer whose application hangs most mornings, preventing users from logging in. Stopping and restarting the AppServer seems to fix this. I've added debug messages to the offending routine, writing them to the AppServer log.

Unfortunately the debug output is not much help: it suggests the system is hanging on a FIND...NO-LOCK NO-ERROR statement, because the message before that statement appears in the log but the message after it does not.

Two questions:
1) Are AppServer messages buffered? Is it possible that later messages were reached but had not yet been flushed from the buffer to the log?
2) Is there a way to determine which program and line number the AppServer is currently running or hanging on?

This is for 10.2B running on a Windows server.

Many thanks
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
Do you have multiple AppServers running the same code? Is it possible multiple sessions are contending for the same record lock, causing the "hang" you see? Has this "hang" ever reached half an hour in length? If so you might see a lock wait timeout error in the log; the default value is 1800 seconds. If the AppServer(s) connect to a database, check the lock table in promon and see if your AppServer is waiting on a lock.

You could also try adding some log-entry-types on the AppServers so you have more information in the log about what they're doing.
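As a rough illustration of the log-entry-types suggestion, here is a minimal sketch that raises logging detail from inside the AppServer session using the LOG-MANAGER system handle. It assumes the code runs early in the session (for example in a startup or activate procedure), and the choice of entry types and level is only an example, not a recommendation for this particular site. The same settings can normally be made in the broker configuration instead (srvrLogEntryTypes and srvrLoggingLevel in ubroker.properties, as far as I know).

```abl
/* Sketch only: raise AppServer logging detail from inside the session.    */
/* Assumption: this runs early, e.g. in a startup or activate procedure.   */
/* Assigning LOG-ENTRY-TYPES replaces the current list, so ASPlumbing and  */
/* DB.Connects are repeated here to keep them alongside 4GLTrace.          */
ASSIGN
    LOG-MANAGER:LOG-ENTRY-TYPES = "ASPlumbing,DB.Connects,4GLTrace"
    LOG-MANAGER:LOGGING-LEVEL   = 3.
```

With 4GLTrace enabled, the server log should show procedure and function calls as they execute, which narrows down where the agent was when it stopped responding. Expect the log to grow quickly at this level.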

1) I don't know.
2) You can open a proenv session and run "proGetStack pid" where pid is the process ID of the AppServer in question. That will create a file called protrace.pid in the working directory. It will contain the ABL call stack of the AppServer, so you will know which program it's running and which (debug-list) line number it's on. If this AppServer is a database client you can also get similar information on the server side by using Client Database Request Statement Caching in promon (R&D | 1 | 18 if I remember correctly). This feature writes the call stack to the client's _Connect record whenever it does DB I/O.
 

GregTomkins

Active Member
If it's really NO-LOCK, why is it hanging on a record lock?

Maybe it's actually stalled searching a huge unindexed table or in some kind of ABL loop. You could verify this by looking at the 'ps' output for the AppServer PID (or rather the Windows equivalent, whatever that is). If it really is a record lock, the CPU time will not move; if it's actually a loop or table scan, it will be advancing rapidly. This is one of the first things I check when researching a hung AppServer.

A couple of random other things to try: (a) look at the broker.log (as opposed to the server.log), though in all honesty, I find that 95% of the time, broker.log is not useful; (b) are there possibly some triggers on the table you are reading that are causing other work to happen that's not obvious from reading the source? (c) look in ASBMAN and see what the status of the agent shows, e.g. it should generally be SENDING; if it's not, there may be some other piece of the puzzle missing.
 

GregTomkins

Active Member
One other suggestion: add NO-WAIT to the FIND, and deal with it if it fails due to a lock.
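A minimal sketch of the NO-WAIT suggestion above. It assumes the FIND actually requests a SHARE-LOCK or EXCLUSIVE-LOCK, because NO-WAIT has nothing to act on when the read is NO-LOCK. The table Customer and field CustNum are hypothetical stand-ins for the real table and key.

```abl
/* Hypothetical names: Customer and CustNum stand in for the real table.  */
/* NO-WAIT makes the FIND return immediately instead of queuing for the   */
/* lock; NO-ERROR suppresses the error so the outcome can be tested.      */
FIND Customer WHERE Customer.CustNum = 1
    EXCLUSIVE-LOCK NO-WAIT NO-ERROR.

IF LOCKED Customer THEN
    /* Another session holds a conflicting lock on the record. */
    MESSAGE "Customer 1 is locked by another session; not waiting.".
ELSE IF NOT AVAILABLE Customer THEN
    /* No record matched the WHERE clause. */
    MESSAGE "Customer 1 not found.".
ELSE
    MESSAGE "Customer 1 read and locked.".
```

On an AppServer the MESSAGE output goes to the server log, so a locked record shows up as a log line rather than as a silent wait.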

Also, is this a remote AppServer (running on a separate machine from the database)? There is something quirky about lock releases across remote AppServers in some versions. I forget the details, but if it is remote (or otherwise accessing the DB through non-shared memory), that would be a big red flag to me.
 

alperd

New Member
Is the problem still occurring? I had the same problem. Now I can see which program is running on the AppServer.
 