Debunking Low Latency Myths

Wednesday, 31 October 2007 18:00

An introduction to the industry's latest bandwagon is barely necessary. As an eternal cynic, I loathe seeing technology inappropriately sold. So, amongst all the verbiage promoting low latency concepts and solutions, I feel the need to contribute some balance to the debate and make some observations: a reality check.

Myth 1 - Expensive technology vs. co-location

Many institutions have expensive low latency market data infrastructures hosted at their own data centres, hanging on the end of exchange feeds delivered down leased lines with proportionately huge propagation latency. A typical low latency market data system should be able to transfer an update message from a feed to an application in under 1 ms to deserve the label "low latency". However, exchange feeds are delivered with an exchange-to-customer latency in the range of 10 ms (same city) to 100 ms (transatlantic). Arguably, it is more effective to stick with existing, clunky and mature technology solutions, but locate them closer to trading venues.

The inevitable trend towards co-location will throw up some interesting architectural challenges. Traditionally, trading systems have facilitated consolidation of market data from multiple sources. With trading systems moving back to the source, all that will be left to consolidate is trade reporting. How should cross-market low latency trading strategies be implemented in a world of distributed, co-located trading systems? And for those that remain centralised rather than co-located, what does the MiFID mantra of best execution mean in the context of differential latency across multiple trading venues?

Myth 2 - Low latency tuning doesn’t impact capacity

Market data systems exhibit latency for a variety of reasons. Significant amongst these are design features that aim to optimise the capacity and reliability of the system. Tuning a system for low latency compromises its performance in both of these respects. Notably, one particular vendor proudly announced to the world that their market data system could be tuned to deliver 1 ms latency. What they failed to report was the impact that their tuning strategy had on the capacity of the system. Perhaps they would care to complete their white paper with a capacity impact analysis?
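
A capacity impact analysis of this kind can at least be roughed out on the back of an envelope. The sketch below is not any vendor's model: it simply assumes that every network packet carries a fixed framing overhead, and compares the packet rate and wire load when updates are sent one per packet against grouping several per packet. The message size and update rate are illustrative assumptions, and the function name `wire_load` is hypothetical.

```python
# Back-of-envelope model only; every constant here is an illustrative assumption.
HEADER_BYTES = 42          # Ethernet (14) + IPv4 (20) + UDP (8) headers per packet
MESSAGE_BYTES = 50         # assumed size of one update message
UPDATES_PER_SEC = 100_000  # assumed feed rate


def wire_load(updates_per_packet: int) -> tuple[float, float]:
    """Return (packets per second, bytes per second on the wire)."""
    packets = UPDATES_PER_SEC / updates_per_packet
    packet_bytes = HEADER_BYTES + updates_per_packet * MESSAGE_BYTES
    return packets, packets * packet_bytes


# Compare one update per packet with two assumed batch sizes.
for batch in (1, 10, 20):
    pps, bytes_per_sec = wire_load(batch)
    mbits = bytes_per_sec * 8 / 1e6
    print(f"{batch:>2} updates/packet: {pps:>9,.0f} packets/s, {mbits:6.1f} Mbit/s")
```

Under these assumptions, sending each update in its own packet multiplies the packet rate by the batch size that was given up, and the per-packet overhead becomes a much larger share of the bytes on the wire. Real systems will differ in the detail, but the direction of the effect is the point.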

Market data systems batch multiple updates into the same network packet in order to make optimal use of network resources. The consequence is that some update messages are delayed. Low latency tuning removes this delay, but in doing so it uses network resources less efficiently. The tuned system therefore has a lower capacity, and more hardware spend will be required to restore the original system capacity.

Market data systems buffer previously transmitted messages and are able to retransmit occasionally lost messages, thus maintaining the integrity of the message stream. Low latency tuning impacts reliability by seeking to remove retransmission latency, on the basis that retransmitted data is already too old.

Myth 3 - Lab based tuning is sufficient

When tuning a market data system for low latency, the normal approach is to set up a representative test system in the lab, tune that system, then apply the results to a production system. The reality, however, is that this will tell you little about how a system will perform in a real world production environment and, more importantly, how its performance will evolve over time as the characteristics of the data it is handling change. If you are serious about low latency, you should put in place a 24/7 operational latency monitoring solution. The system should store historical latency data, facilitating latency trend analysis and forensic latency event analysis.

Myth 4 - Sample based latency monitoring is sufficient

There are two popular techniques for latency monitoring, neither of which is any good. The first technique involves the injection of additional messages, each carrying a timestamp, into the system. These messages are then observed later in the system, and the latency between the two points in the system is calculated.
Whilst this approach is great at monitoring the latency of the injected messages, it actually tells you rather less about your business-critical data.

The second technique involves tagging messages with timestamps so that each message accumulates a timed audit trail through a system. Because this process carries with it a significant overhead, it is usual to apply this technique only to a small subset of messages. The result is that you end up knowing rather less than you would like about the true statistical behaviour of latency in your system.

If you are serious about low latency, you should put in place a passive latency monitoring solution that observes messages transiting various points in the system. It is only through the capture and analysis of every message as it passes through your system that you will, with no performance impact, gain a full picture of the latency of your business-critical traffic, with comprehensive coverage of every latency event.
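
As a rough illustration of the passive approach, the sketch below pairs every message observed at two capture points and summarises the resulting latencies. It is a minimal sketch under stated assumptions, not a description of any particular monitoring product: it assumes each message can be identified by a unique key (such as a sequence number) visible at both points, that the two capture clocks are synchronised, and that timestamps are recorded in microseconds. The function names `match_latencies` and `summarise` are hypothetical.

```python
from statistics import quantiles


def match_latencies(upstream, downstream):
    """Pair every message seen at both taps; return {msg_id: latency}.

    Each capture is an iterable of (msg_id, timestamp) tuples as a passive
    tap might record them. Nothing is injected into or added to the traffic.
    Assumes unique message ids and synchronised capture clocks.
    """
    first_seen = {}
    for msg_id, ts in upstream:
        first_seen.setdefault(msg_id, ts)  # keep the earliest upstream sighting
    latencies = {}
    for msg_id, ts in downstream:
        if msg_id in first_seen and msg_id not in latencies:
            latencies[msg_id] = ts - first_seen[msg_id]
    return latencies


def summarise(latencies):
    """Summarise the full latency distribution (needs at least two messages)."""
    values = sorted(latencies.values())
    cuts = quantiles(values, n=100, method="inclusive")  # 99 percentile cut points
    return {
        "count": len(values),
        "median": cuts[49],
        "p99": cuts[98],
        "max": values[-1],
    }


# Made-up captures, timestamps in microseconds (illustrative data only).
upstream = [("A1", 1000), ("A2", 1010), ("A3", 1020), ("A4", 1030)]
downstream = [("A1", 1400), ("A2", 1450), ("A3", 9020), ("A4", 1500)]

per_message = match_latencies(upstream, downstream)
print(per_message)
print(summarise(per_message))
```

Because every message is matched, the single slow message in the made-up data (A3) shows up in the maximum and can be traced by its identifier, which is exactly the kind of event that sampling a subset of traffic is liable to miss.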
