Even more on QMUX configurations
The jPOS QMUX feature forms the backbone of OLS.Switch’s remote authorization infrastructure. I talk about QMUX configuration models in my On-Boarding Guide. I’ve also blogged about how tight disconnect models can lead you to consider a sub-species of the QMUX model called the MUX Pool. We make use of that in our connectivity to Stored Value Systems (‘SVS’), a good provider of branded gift card services and a very reliable authorizer. They use a wicked tight disconnect model: 3 mins 30 secs or so of no traffic raises a peer disconnect on their end. It’s a good, proactive approach. What I liked about the conversation with SVS is that they could clearly articulate their approach. By comparison, we’ve had some frustrations with organizations that can’t describe, or can only hazily describe, what the connection model will be like in production, especially with our replicated application node strategy in play at our client locations.
QMUX has proven to be extraordinarily resilient and efficient in the face of large authorization transaction volumes. Lines go up, down, up, down…QMUX does the channel management with great skill. However, we did see a recent situation where an SVS line hung for a number of hours. We had the line marked as connected, and QMUX kept the channel in the mix (only one of the two active connections was affected). Yet transaction after transaction timed out, because neither we nor SVS saw the line as disconnected or in any state that called for a programmatic reset.
I reviewed the scenario with Alejandro. He suggested adding ‘timeout’ and ‘keep-alive’ properties to the SVS channel definitions. The timeout value sets a socket-level timeout. The ‘receive’ function in the multiplexer’s ‘Channel Adaptor’ will fail (with log event ‘<io_timeout>’) if nothing is received within the specified timeout period. The channel will then disconnect and attempt a reconnection.
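To make the idea of a socket-level timeout concrete, here is a minimal sketch using only the JDK’s java.net classes. It is not jPOS code: the class and method names are made up for illustration, and I am assuming that the channel’s ‘timeout’ and ‘keep-alive’ properties correspond to the standard SO_TIMEOUT and SO_KEEPALIVE socket options. A local peer accepts the connection but never sends anything, which is roughly what our hung line looked like from our side.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical illustration class; nothing here comes from jPOS.
class SocketTimeoutSketch {

    /**
     * Connects to a loopback peer that accepts the connection but stays silent,
     * then reports whether a blocking read gave up after timeoutMillis.
     */
    static boolean readTimesOut(int timeoutMillis) {
        // Port 0 asks the OS for any free port on the loopback interface.
        try (ServerSocket silentPeer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentPeer.getInetAddress(), silentPeer.getLocalPort());
             Socket accepted = silentPeer.accept()) {
            client.setSoTimeout(timeoutMillis); // bound every blocking read on this socket
            client.setKeepAlive(true);          // ask the TCP stack to send keep-alive probes
            InputStream in = client.getInputStream();
            try {
                in.read();    // blocks: the peer is connected but never writes
                return false; // data or end-of-stream arrived before the timeout
            } catch (SocketTimeoutException e) {
                return true;  // connected, yet nothing received in time
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

Without the read timeout, the read in this sketch would simply block for as long as the peer stays connected and silent, which is the behavior we were trying to get away from.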
The channel’s ‘timeout’ value should be greater than the ‘echo-interval’ specified in the related logon manager. SVS’s tight disconnect model (3 minutes 30 seconds) is what led us to implement the ‘MUX pool’ approach with aggressive echo intervals (180000 ms, or three minutes). The timeout needs to leave enough breathing space for the echoes to operate as intended; an appropriate value is 300000 ms (five minutes). Since the MUX pool forces an echo on each channel in the SVS MUX every three minutes, seeing no ‘receive’ activity for five minutes raises the possibility of a ‘hung’ line. Accordingly, we put the channel through a disconnect/reconnect cycle as a proactive measure.
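The arithmetic behind those numbers can be written down as a small sanity check. This is only a sketch: the class, constants, and method are hypothetical names of my own, and the 210000 ms figure is just the "3 minutes 30 seconds or so" from above expressed in milliseconds.

```java
// Hypothetical helper for checking the timing relationships described above.
class SvsTimingSketch {
    static final long PEER_IDLE_DISCONNECT_MS = 210_000; // about 3 min 30 s of silence, per SVS
    static final long ECHO_INTERVAL_MS = 180_000;        // logon manager 'echo-interval'
    static final long CHANNEL_TIMEOUT_MS = 300_000;      // channel 'timeout'

    /**
     * True when echoes fire before the peer's idle disconnect, and the read
     * timeout is long enough that at least one echo is due before it expires.
     */
    static boolean isConsistent(long echoIntervalMs, long channelTimeoutMs, long peerIdleDisconnectMs) {
        boolean echoBeatsPeerDisconnect = echoIntervalMs < peerIdleDisconnectMs;
        boolean timeoutLeavesRoomForEcho = channelTimeoutMs > echoIntervalMs;
        return echoBeatsPeerDisconnect && timeoutLeavesRoomForEcho;
    }

    public static void main(String[] args) {
        System.out.println("consistent: "
                + isConsistent(ECHO_INTERVAL_MS, CHANNEL_TIMEOUT_MS, PEER_IDLE_DISCONNECT_MS));
        System.out.println("slack after echo (ms): " + (CHANNEL_TIMEOUT_MS - ECHO_INTERVAL_MS));
    }
}
```

With these values the echo has a two-minute window to produce a response before the read timeout fires. A timeout shorter than the echo interval would recycle a healthy but quiet channel before its next echo was even due.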
What we end up with is a Logon Manager (one per defined channel in the Mux) that looks like this…
<svs-logon-mgr class="org.jpos.svs.LogonManager" logger="Q2">
  <property name="persistent-space" value="jdbm:svslogon:log/svslogon" />
  <property name="mux" value="svs-mux-0" />
  <property name="channel-ready" value="svs.ready" />
  <property name="timeout" value="900000" />
  <property name="echo-interval" value="180000" />
  <property name="logon-interval" value="43200000" />
</svs-logon-mgr>
…and a Channel Adaptor (one per defined channel in the Mux) that looks like this (“host” and “port” values are examples only):
<channel-adaptor name='svs'
    class="org.jpos.q2.iso.ChannelAdaptor" logger="Q2">
  <channel class="org.jpos.iso.channel.NACChannel" logger="Q2"
      realm="svs-channel"
      packager="org.jpos.iso.packager.GenericPackager">
    <property name="packager-config" value="cfg/svs.xml" />
    <property name="host" value="127.0.0.1" />
    <property name="port" value="36000" />
    <property name="timeout" value="300000" />
    <property name="keep-alive" value="true" />
  </channel>
  <in>svs-send</in>
  <out>svs-receive</out>
  <reconnect-delay>10000</reconnect-delay>
</channel-adaptor>
Great post Andy, sorry I'm 8 months late. We could all benefit if the back-end providers would adopt a standard connection model {:
Posted by: Mark Carl | Saturday, May 09, 2009 at 22:18