Resolved _trans._trans-duration > Process Life Time

RealHeavyDude

Well-Known Member
OpenEdge 11.3.1 64Bit Solaris SPARC

Lately we have run into trouble with long-running transactions causing the before-image file of our production database to grow unexpectedly. Bad code ...

Therefore I have introduced a transaction monitor that reports long-running transactions and, if necessary, disconnects the offending processes from the database when the database is in before-image panic (utilization has reached 80% of the 6 GB threshold).

I use this query to get hold of all active transactions:
Code:
for each _Trans no-lock where _Trans._Trans-State = 'ACTIVE',
first _Connect no-lock where _Connect._Connect-Usr = _Trans._Trans-UsrNum:
end.

The _Trans._Trans-Duration field holds the duration in seconds.

During testing I encountered some weirdness where _Trans._Trans-Duration was insanely high and exceeded the process lifetime (calculated from _Connect._Connect-Time) by far. To me this does not make any sense at all.

I found out that this only happens with self-service clients, though.

I am inclined to think that this is a bug.

Has anybody else seen something like this?

Thanks in Advance, RealHeavyDude.
 

RealHeavyDude

Well-Known Member
In the meantime I've logged a support case. It seems possible that this is caused by an integer overflow in the _Trans._Trans-Duration field for some reason. As an alternative, one could calculate the transaction duration by subtracting _Trans._Trans-Txtime from the current time ( now ).
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
This rings a bell. It seems to me that Paul Koufalis posted about this to PEG a year or so ago. I think the conclusion was that it was a bug. I'll see if I can look it up when I get to work.

Also, your query on _Connect by _Connect-usr is a table scan. You should query by _Connect-id = _Trans._Trans-UsrNum + 1.
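
As a minimal sketch of that suggestion, the original join could be rewritten as follows. It assumes, as stated above, that _Connect-Id is always the user number plus one; the displayed fields are only there for illustration.

```
/* Sketch: join each active transaction to its connection by _Connect-Id
   instead of _Connect-Usr (assumes _Connect-Id = user number + 1). */
for each _Trans no-lock
    where _Trans._Trans-State = "ACTIVE",
    first _Connect no-lock
    where _Connect._Connect-Id = _Trans._Trans-UsrNum + 1:

    display
        _Trans._Trans-UsrNum
        _Connect._Connect-Name
        _Trans._Trans-Duration.
end.
```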
 

RealHeavyDude

Well-Known Member
Hello Rob, Tom,

Thanks for the information. I was already aware of the table scan - but I didn't know that the _Connect-Id is always _Connect-Usr + 1 ...

I will change that.

Thanks, RealHeavyDude.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
From dba@peg.com, 08/21/2014; subject: Weird _Trans-Duration Behaviour:

Paul Koufalis:
10.2B03

I have a long tx script that checks active transaction durations and sends an alert if the duration is greater than some threshold. The other night I got an alert with a duration of 1 408 592 009. That's 44 years for the curious. I have been using this kind of alert for years at multiple customers and this is the first time I see something bizarre like this.

The only thing that I can think of is that in the microsecond of my query the state changed to ACTIVE but the duration was still not set so it was random garbage.

Does this ring a bell for anyone?

Paul

Mike Furgal:
It happens to us quite frequently. I have never tracked it down, it's not high on our priority list, but we see this probably once a month with one of our customers. This is across various OE versions.

...MikeF
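
A quick sanity check of the 1 408 592 009 second figure quoted above, as a sketch that needs no database connection. The value comes to roughly 44.6 years. Read as seconds since 1 January 1970, it lands on 21 August 2014 at 03:33:29, the date of the mailing-list post. That may hint that the start time was still unset when the duration was computed, but this is speculation and not something the thread confirms.

```
/* Pure arithmetic on the reported duration; variable names are illustrative. */
define variable iDuration as integer  no-undo initial 1408592009.
define variable dtEpoch   as datetime no-undo.

/* Seconds to years, using 365.25 days per year. */
message "Years:" round(iDuration / (365.25 * 86400), 1).

/* The same number read as seconds since 1970-01-01 00:00:00. */
dtEpoch = datetime(1, 1, 1970, 0, 0, 0).
message "As epoch offset:" add-interval(dtEpoch, iDuration, "seconds").
```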
 

RealHeavyDude

Well-Known Member
That is exactly the same behavior I am seeing with 11.3.1, except that I am seeing it up to 3 times a night during our day-end batch processing. We did not notice this until we changed our backup strategy to online backups exclusively, instead of online backups during the week and an offline backup on Sunday. There seems to be a half-official "proutil -C zerostats" that is not documented in 11.3 but could be used to reset the stats. However, it is not granular and may have negative side effects on other logic that works with the VSTs.

Progress Tech Support confirmed it implicitly, between the lines. My guess is that they would need to change the data type of _Trans._Trans-Duration to int64, but they are not positive about that.

I will change my logic to calculate the duration by converting _Trans._Trans-Txtime to a datetime and using the interval function to get the duration in seconds.
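
A rough sketch of that approach, under one loud assumption: that _Trans-Txtime is a character field holding a ctime-style string such as "Thu Aug 21 03:33:29 2014". Check the actual format on your release before relying on it. The function name ctimeToDatetime is hypothetical and not part of OpenEdge.

```
/* Hypothetical helper: parse "Thu Aug 21 03:33:29 2014" into a datetime.
   Returns ? if the string does not match the assumed layout. */
function ctimeToDatetime returns datetime (input pcStamp as character):
    define variable cMonths as character no-undo
        initial "Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec".
    define variable cStamp  as character no-undo.
    define variable cTime   as character no-undo.
    define variable iMonth  as integer   no-undo.

    /* ctime pads single-digit days with an extra space; collapse it. */
    cStamp = trim(pcStamp).
    do while index(cStamp, "  ") > 0:
        cStamp = replace(cStamp, "  ", " ").
    end.

    if num-entries(cStamp, " ") <> 5 then return ?.

    iMonth = lookup(entry(2, cStamp, " "), cMonths).
    if iMonth = 0 then return ?.

    cTime = entry(4, cStamp, " ").

    return datetime(iMonth,
                    integer(entry(3, cStamp, " ")),
                    integer(entry(5, cStamp, " ")),
                    integer(entry(1, cTime, ":")),
                    integer(entry(2, cTime, ":")),
                    integer(entry(3, cTime, ":"))).
end function.

define variable iSeconds as int64 no-undo.

/* Compute the duration from the start time instead of _Trans-Duration. */
for each _Trans no-lock
    where _Trans._Trans-State = "ACTIVE":

    iSeconds = interval(now, ctimeToDatetime(_Trans._Trans-Txtime), "seconds").

    display
        _Trans._Trans-UsrNum
        _Trans._Trans-Txtime format "x(24)"
        iSeconds.
end.
```

Holding the result in an int64 sidesteps any overflow in the calculation itself, and an unknown value ( ? ) for iSeconds flags a start time that could not be parsed.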

Thanks, RealHeavyDude.
 