Overview
High availability and failover (HA/FO) refers to the capability of the Marketcetera Automated Trading Platform to detect and recover from transient datacenter failures. Failures may be related to software (failure of a Marketcetera node), hardware (failure of a Marketcetera host), or the datacenter (loss of connectivity to or from a datacenter).
The goal of Marketcetera is to guarantee that any message it receives is delivered to an appropriate recipient. In most cases, this will be the intended recipient, e.g. the broker or the client’s OMS. However, under certain circumstances, such as when a broker connection is unavailable, Marketcetera will reject the message back to the OMS. In all cases, the goal is that no message is lost or left undelivered.
High availability is more than just message delivery. In order to achieve HA/FO, Marketcetera as a conceptual application needs to be able to survive the transient loss of one or more nodes. There are five essential components that need to be considered.
Let’s consider them one at a time to understand how HA/FO is to be achieved.
Marketcetera FIX Gateway (from OMS)
The Marketcetera FIX Gateway accepts incoming FIX messages from an OMS and maintains the FIX sessions for that connection. One of the key components of the FIX Gateway is the network socket over which incoming orders are received. The loss of a datacenter or hardware host (which includes a software failure, of course) means that the socket is lost, too. Since we cannot guarantee that the OMS is capable of managing multiple socket addresses (primary, secondary, etc.), Marketcetera implements a routing-based failover technique. Using this technique, the IP address supplied to the OMS is a virtual address that initially maps to host 1, then fails over to map to host 2 (and then host 3, etc.).
The session details are kept in a common database. Any Marketcetera host is capable of managing the sessions; however, only one can be “hot” at once, since only one host can actively manage the real socket connection. Marketcetera will keep “warm” FIX Gateway components and a shared, in-memory collection that indicates which is the hot host. Upon failure of a host, the designated next warm host will become hot. The priority is configured in Marketcetera.xml and must match the network mappings established by our client’s IT team.
Any Marketcetera FIX Gateway can process messages and move them to the next destination. Incoming messages are stored to an in-memory queue shared by all FIX Gateways and written to the database. Upon receipt, a message tracker is started, which is also shared by all Marketcetera nodes. It is the responsibility of each tracker to verify that the message was handled or, if it was not, to take action. Action involves rejecting the message back to the OMS, sending a notification, or both. Upon restart, all FIX Gateway components will seek each other out and sync up.
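The hot/warm selection described above can be illustrated with a minimal Python sketch. It is not Marketcetera's implementation: the `GatewayCluster` class, its method names, and the host names are hypothetical, and the priority list stands in for whatever ordering is configured in Marketcetera.xml. Recovery of a failed host is deliberately left out, since the text does not say whether a restarted host reclaims the hot role.

```python
from typing import Iterable, Optional


class GatewayCluster:
    """Hypothetical model of the shared record of which FIX Gateway host is hot."""

    def __init__(self, priority: Iterable[str]) -> None:
        # Failover order; assumed to mirror the configured host priority.
        self.priority = list(priority)
        # Hosts currently believed to be alive (stand-in for the shared collection).
        self.alive = set(self.priority)

    def hot_host(self) -> Optional[str]:
        """Return the highest-priority live host, or None if every host is down."""
        for host in self.priority:
            if host in self.alive:
                return host
        return None

    def mark_failed(self, host: str) -> Optional[str]:
        """Record a host failure and return the host that is hot afterwards."""
        self.alive.discard(host)
        return self.hot_host()


cluster = GatewayCluster(["host1", "host2", "host3"])
print(cluster.hot_host())            # host1 is hot, host2 and host3 are warm
print(cluster.mark_failed("host1"))  # host2 takes over
```

Because the OMS only ever sees the virtual address, the order in this list has to agree with the order in which the network remaps that address, which is why the text requires the two to match.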
Marketcetera Message/Report Handlers
The Marketcetera Message and Report handlers are responsible for taking an incoming message, applying business logic to it, and shepherding it to its final destination, up to but excluding the actual FIX session and socket. The handlers do not maintain a specific socket on which they’re connected, so they can be distributed across nodes more freely. All handlers share an in-memory, trans-node queue of work to be done and tracking tasks. The queue is backed by the database in case all nodes go down at the same time. Each node knows about every other node and shares information about tasks to be done.
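As a rough illustration of an in-memory queue backed by a database, the sketch below writes each task through to a table before exposing it in memory and rebuilds the queue from that table on restart. Everything here is an assumption made for the example: the `DurableWorkQueue` class, the `pending_work` table, and the use of SQLite as a stand-in for the client-administered database. It also models a single node only, not the trans-node sharing described above.

```python
import sqlite3
from collections import deque


class DurableWorkQueue:
    """Hypothetical in-memory work queue with a write-through copy in a database."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS pending_work ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
        )
        self.conn.commit()
        self.items = deque()  # (row id, payload) pairs in arrival order
        self.recover()

    def recover(self) -> None:
        """Rebuild the in-memory queue from rows that were never completed."""
        rows = self.conn.execute(
            "SELECT id, payload FROM pending_work ORDER BY id"
        ).fetchall()
        self.items = deque(rows)

    def enqueue(self, payload: str) -> int:
        """Persist the task first, then make it visible to in-memory consumers."""
        cur = self.conn.execute(
            "INSERT INTO pending_work (payload) VALUES (?)", (payload,)
        )
        self.conn.commit()
        self.items.append((cur.lastrowid, payload))
        return cur.lastrowid

    def take(self):
        """Hand out the oldest task; its row stays until complete() is called."""
        return self.items.popleft() if self.items else None

    def complete(self, task_id: int) -> None:
        """Delete the row once the task has been fully handled."""
        self.conn.execute("DELETE FROM pending_work WHERE id = ?", (task_id,))
        self.conn.commit()


conn = sqlite3.connect(":memory:")
queue = DurableWorkQueue(conn)
first = queue.enqueue("message from OMS")
queue.enqueue("report from broker")
queue.complete(queue.take()[0])

# Simulate every node restarting: only the unfinished task is reloaded.
print(list(DurableWorkQueue(conn).items))
```

Keeping the row until `complete()` is called means a task that was taken but not finished before a total shutdown is picked up again after restart.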
Each task, which is either a message from an OMS or a message from a broker, has an order ID associated with it. Nominally, any Marketcetera node can process any message; however, one of the business requirements is that messages for a particular order be processed sequentially. To maximize overall order throughput, the optimal case would be to spread the messages to be processed across all nodes. To guarantee sequential processing for messages of the same order, however, it is necessary to affine an order chain to a node, that is, to bind it to a single node.
In this context, an order chain refers to all orders that are conceptually or logically grouped. An order chain may consist of a new order single, followed by a cancel/replace for that order, followed by a cancel for the cancel/replace, for example. In practice, these messages do not all share a common field. In concept, they are related and are all members of the same order chain. For convenience, Marketcetera refers to all members of the same order chain as having the same root order ID, which is the order ID of the original order. All members of the same order chain will be handled by the same Marketcetera node. If a Marketcetera node dies, its affined order chains will be orphaned and assigned to a new node, which handles the message that was in process, if any, and any subsequent messages of that chain. The new node inherits the order chain from the terminated node.
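The root order ID and node affinity described above can be sketched in a few lines of Python. The `ChainRouter` class and its methods are hypothetical. The `orig_order_id` parameter assumes each follow-up message names the order it amends or cancels (loosely in the spirit of FIX ClOrdID and OrigClOrdID), and the least-loaded assignment policy is an assumption, since the text only says that an orphaned chain is assigned to a new node.

```python
from typing import Dict, Iterable, List, Optional


class ChainRouter:
    """Hypothetical router that pins every order chain to a single node."""

    def __init__(self, nodes: Iterable[str]) -> None:
        self.nodes: List[str] = list(nodes)
        self.root_of: Dict[str, str] = {}  # order ID -> root order ID
        self.owner: Dict[str, str] = {}    # root order ID -> owning node

    def _least_loaded(self) -> str:
        """Pick the live node that owns the fewest chains (assumed policy)."""
        if not self.nodes:
            raise RuntimeError("no live nodes available")
        load = {node: 0 for node in self.nodes}
        for node in self.owner.values():
            load[node] += 1
        return min(self.nodes, key=lambda node: load[node])

    def route(self, order_id: str, orig_order_id: Optional[str] = None) -> str:
        """Return the node that must process this message."""
        # A follow-up inherits the root of the order it refers to;
        # a message with no known predecessor starts a new chain.
        root = self.root_of.get(orig_order_id, order_id)
        self.root_of[order_id] = root
        if root not in self.owner:
            self.owner[root] = self._least_loaded()
        return self.owner[root]

    def node_failed(self, node: str) -> List[str]:
        """Reassign the failed node's chains; return the orphaned root order IDs."""
        self.nodes.remove(node)
        orphaned = [root for root, owner in self.owner.items() if owner == node]
        for root in orphaned:
            del self.owner[root]
        for root in orphaned:
            self.owner[root] = self._least_loaded()
        return orphaned


router = ChainRouter(["node-a", "node-b"])
print(router.route("ord-1"))                         # new order single
print(router.route("ord-2", orig_order_id="ord-1"))  # cancel/replace, same node
print(router.route("ord-3", orig_order_id="ord-2"))  # cancel of the replace, same node
```

Different chains still spread across nodes, so throughput scales with the number of chains while each individual chain is processed in order by one node.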
Marketcetera FIX Sender
The Marketcetera FIX Sender sends outgoing FIX messages to brokers and maintains the FIX sessions. Like the Marketcetera FIX Gateway, the Sender maintains a number of physical socket connections. Therefore, there cannot be multiple hot FIX Sender nodes. In a fashion very similar to the Marketcetera FIX Gateway components, there will be one hot node and multiple warm nodes. When the hot node goes offline, one of the warm nodes is nominated to be the new hot node and starts up. Unlike the FIX Gateway nodes, it does not matter which warm node starts up, because the FIX Sender initiates socket connections rather than receiving them. Like the other nodes, the FIX Sender maintains an in-memory, trans-node queue backed by the database. FIX session information is also stored in the database.
Database
The database Marketcetera uses is administered by our clients. In the absence of a viable database service, the Marketcetera nodes will not start. If database service is lost while iRouter nodes are running, message delivery will continue, as will the in-memory, trans-node shared queues, but the nodes will not be able to recover from a total shutdown (all nodes go offline). In the total shutdown scenario, upon restart, Marketcetera will connect with the external FIX connections and negotiate last known messages. If messages are determined to be missing by either party, they will be resent. Since session information is stored in the database in order to facilitate loss-of-node recovery, it is very important that Marketcetera works closely with our clients to maintain suitable, standard maintenance and clustering practices for the application database.
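The negotiation of last known messages can be pictured as sequence-number gap detection, which is how FIX sessions generally recover: each side compares the sequence number it receives with the one it expected and asks for the missing range. The sketch below is an illustration only, not Marketcetera code. The function names and the `sent_log` dictionary, which stands in for session state persisted in the database, are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def find_gap(next_expected: int, received: int) -> Optional[Tuple[int, int]]:
    """Return the inclusive (begin, end) range of missing sequence numbers.

    Returns None when the received number is exactly the one expected.
    """
    if received < next_expected:
        # A number lower than expected is a session-level error, not a gap;
        # this sketch simply raises.
        raise ValueError("sequence number lower than expected")
    if received == next_expected:
        return None
    return (next_expected, received - 1)


def messages_to_resend(sent_log: Dict[int, str], begin: int, end: int) -> List[str]:
    """Collect previously sent messages in the requested range, in order."""
    return [sent_log[seq] for seq in range(begin, end + 1) if seq in sent_log]


# After restart, the counterparty's first message carries sequence number 8,
# but the persisted session state says 5 was expected next.
gap = find_gap(next_expected=5, received=8)
print(gap)

# The other direction: the counterparty asks us to resend 5 through 7.
sent_log = {4: "m4", 5: "m5", 6: "m6", 7: "m7", 8: "m8"}
print(messages_to_resend(sent_log, 5, 7))
```

Both functions depend on sequence numbers and sent messages that survived the shutdown, which is why losing the database removes the ability to recover from a total shutdown.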