Performance slowdown in LANSA applications caused by Mutex wait issue

Date: 17 May 2017
Product/Release: LANSA for i - All versions
Abstract: IBM PTF available for Mutex wait (MTXW) issue that can affect performance of LANSA applications
Submitted By: LANSA Technical Support

Description:

LANSA customers running large systems with many thousands of LANSA runtime jobs have sometimes reported degraded performance in those jobs. In extreme cases, performance degrades to the point where the LANSA runtime jobs stop responding.

Symptom:
One of the most obvious symptoms of this issue is that all jobs sit in a MTXW (mutex wait) condition, so very little work is processed.

Background:
The problem can occur when a very large number of LANSA runtime jobs start at about the same time – in one particular case, possibly about 6,000 requests (the number depends on the machine/subsystem configuration). The job call stacks showed that the jobs were in the job initialisation stage (not yet in LANSA code) and were waiting on a mutex from IBM program QTCP/QTMSUTL72, all for the same IFS file.
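The contention pattern described above can be illustrated with a small, generic simulation. This is only a sketch and is not IBM or LANSA code: Python threads stand in for the runtime jobs, a barrier stands in for many jobs starting together, and the function name simulate_startup, the lock shared_mutex, and the hold time are all hypothetical. It assumes each job must briefly hold one shared mutex during initialisation.

```python
import threading
import time


def simulate_startup(job_count, hold_seconds):
    """Return how long each simulated job waited for the shared mutex."""
    shared_mutex = threading.Lock()  # stands in for the mutex on the shared resource
    waits = []                       # one wait time per job, in acquisition order
    released_at = []                 # set once, at the moment all jobs are released

    def mark_release():
        released_at.append(time.monotonic())

    # The barrier holds every job back until all have arrived, then
    # releases them together, like many jobs starting at the same time.
    start_gate = threading.Barrier(job_count, action=mark_release)

    def job():
        start_gate.wait()
        with shared_mutex:
            waits.append(time.monotonic() - released_at[0])
            time.sleep(hold_seconds)  # hypothetical initialisation work under the mutex

    threads = [threading.Thread(target=job) for _ in range(job_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return waits


if __name__ == "__main__":
    for count in (10, 50, 100):
        waits = simulate_startup(count, hold_seconds=0.002)
        print(f"{count:4d} jobs: longest wait {max(waits):.3f}s, "
              f"total wait {sum(waits):.3f}s")
```

Because only one job can hold the mutex at a time, each job waits behind every job that acquired it earlier. With a fixed hold time, the longest wait grows roughly in proportion to the number of jobs, and the total wait across all jobs grows roughly with its square. This is consistent with the symptom becoming visible only when thousands of jobs start together.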

Solution:

IBM have acknowledged this mutex contention issue and have produced a PTF to resolve it. The APAR is available at:
http://www-01.ibm.com/support/docview.wss?uid=nas2SE66961

The PTF to apply is SI64349.

Note: This issue affects IBM i 7.2 and later only, because QTMSUTL72 is an IBM service program used for sending email and is new to IBM i 7.2.

Customer testimony:
We have applied the PTF and have seen very positive improvements. Before applying it to production, we tested the PTF by starting thousands of LWEB_JOBS into a held queue and then releasing that queue. Before the PTF, we saw large amounts of mutex wait contention. After the PTF, the mutex contention all but disappeared and the jobs started almost immediately, even when we started 10,000 jobs at once. Analysis since the PTF was applied shows MTXW contention time reduced to almost nothing on LWEB_JOBS, so the issue has been resolved.