Low-delay FAST market decoding device and method based on a pipeline architecture
Technical Field
The invention belongs to the field of information technology, and particularly relates to a low-delay FAST market data decoding device and method based on a pipeline architecture.
Background
Financial exchanges release the latest market data to market participants in real time by multicast. The market data includes the latest bid and ask quotations and demand, the transaction records of financial products (such as opening price, highest price, lowest price, current price and transaction amount), order status and the like. Market participants analyze the incoming market data in real time through software such as tape-reading or algorithmic trading programs, and make trading decisions (such as buying and selling financial products) according to the latest market conditions. Timely interpretation of market data is therefore critical to market participants, especially algorithmic and high-frequency traders. A participant who obtains the market data first can capture momentary market profits ahead of other participants.
Algorithmic trading refers to the automated, fast and low-cost execution of orders by computer programs; more specifically, a program determines the optimal execution path, execution time, execution price and execution quantity of an order. Algorithmic trading is widely used by pension funds, mutual funds, hedge funds and other buy-side institutional investors. With algorithmic trading, a large order can be divided into many small orders to manage market risk and market impact, while also providing liquidity to the market.
High-frequency trading (HFT) refers to computerized trading that uses high-performance computing platforms to seek arbitrage from price differences in very brief market changes that humans cannot exploit. The response delay of high-frequency trading to market data is at the microsecond level, each position is held for an extremely short time, and positions are essentially flat at the market close. High-frequency trading profits by accumulating many tiny gains. It is widely used in market making and provides liquidity to the market. High-frequency trading has accounted for 70% of the total volume of the U.S. stock market, 45% in Europe and 40% in Japan. It is beginning to rise in emerging markets, and in China it is applied to commodity futures, ETFs (exchange-traded funds), warrants and the like.
In the domestic financial market, financial messages, including order data and market data, are transmitted between trading institutions and market participants mainly through the Securities Trading Data Exchange Protocol (STEP). STEP messages are fully compatible with the Financial Information eXchange protocol (FIX) used in foreign financial markets. A STEP/FIX message consists strictly of a sequence of basic "tag=value" structures, each of which is called a field. Fields are separated by the field separator "SOH", a non-printable character whose ASCII code is 1. Here "tag" is the field ID or field name; it identifies a specific field and implies information such as the type and value range of the value, and "value" is the value of the field. For example, the field "270=342" indicates that the field numbered 270 (in STEP, the price of a market data entry) has the value 342.
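The "tag=value" structure described above can be illustrated with a short sketch. The function name `parse_fix` and the sample message are hypothetical and chosen only for illustration; the sketch ignores checksums, repeating groups and every other detail of the STEP/FIX specifications.

```python
SOH = "\x01"  # field separator: the non-printable character with ASCII code 1


def parse_fix(message: str) -> list[tuple[int, str]]:
    """Split a STEP/FIX message into (tag, value) pairs, keeping field order."""
    fields = []
    for chunk in message.split(SOH):
        if not chunk:  # skip the empty piece after the trailing SOH
            continue
        tag, _, value = chunk.partition("=")
        fields.append((int(tag), value))
    return fields


# Illustrative message only; the tag values are not taken from the source.
sample = SOH.join(["35=W", "55=600000", "270=342"]) + SOH
parsed = parse_fix(sample)
```

Here `parsed` keeps the fields in stream order, so the last entry is the field 270 with the value "342" used as the example in the text.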
To reduce the bandwidth requirement and transmission delay of market data, market data publishers use the FAST protocol (FIX Adapted for STreaming) to stream-encode and compress STEP/FIX market messages. FAST encoding reduces the size of the data stream at two levels. First, "field operators" exploit the dependencies between data in the stream to eliminate redundant data. Second, the remaining data is serialized in a binary encoding that uses self-describing field lengths (the stop bit encoding mechanism) and a field presence bitmap (PMap) that indicates whether each field is present. Encoding is driven by a control structure called a "template", which controls the encoding of a portion of the stream by specifying the order and structure of the fields, their field operators and the binary encoding representation they use. For more details, see the FIX/STEP and FAST protocol specifications. For convenience of description, the binary data obtained by FAST-encoding a FIX/STEP field is referred to as the encoded value of the field.
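As a rough illustration of the stop bit mechanism and the PMap, the sketch below assumes the usual FAST layout: each byte carries 7 data bits, and the most significant bit is set only on the last byte of a field. The function names are hypothetical, and signed integers, nullable fields, strings and decimals are left out.

```python
def read_stop_bit_field(buf: bytes, pos: int = 0) -> tuple[bytes, int]:
    """Return the bytes of one stop-bit-delimited field and the next position.

    Assumes the field is complete; a truncated buffer raises IndexError.
    """
    end = pos
    while not buf[end] & 0x80:  # stop bit (MSB) marks the last byte
        end += 1
    return buf[pos:end + 1], end + 1


def decode_uint(field: bytes) -> int:
    """Concatenate the 7 data bits of each byte into an unsigned integer."""
    value = 0
    for byte in field:
        value = (value << 7) | (byte & 0x7F)
    return value


def decode_pmap(field: bytes) -> list[int]:
    """Expand a PMap field into presence bits, most significant data bit first."""
    return [(byte >> shift) & 1 for byte in field for shift in range(6, -1, -1)]


stream = bytes([0xC0, 0x02, 0xD6])                # one PMap byte, then a 2-byte integer
pmap_field, pos = read_stop_bit_field(stream, 0)
value_field, pos = read_stop_bit_field(stream, pos)
```

The position of the second field is known only after the first field has been scanned to its stop bit, which is the sequential dependency the following paragraphs refer to.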
With the increasing openness of the Chinese financial market, financial applications such as high-frequency trading and algorithmic trading are bound to take a larger market share in the future. Timely analysis of market data is a necessary precondition for applications such as algorithmic trading, high-frequency trading and financial risk monitoring. As competition in financial markets intensifies, financial risks have a greater impact than before on the institutions themselves, on the entire financial market and even on society. Research on low-latency FAST market decoding technology is therefore urgently needed to support applications that require low-latency market data processing, such as financial risk monitoring and algorithmic trading, and it can also promote the liquidity of the financial market.
Current FAST market decoding is implemented mainly in software, for example OpenFAST (source/project/OpenFAST), QuickFAST (www.ociweb.com/products/QuickFAST), or market decoding systems developed in-house by enterprises. Software-based decoding introduces additional processing delay. On the one hand, parsing market data packets incurs the delay of the software network protocol stack: two memory copies and the wait for system interrupt handling. On the other hand, the operating system introduces jitter, including contention among processes for system resources and interrupt waits. The market data processing delay of software is typically on the order of milliseconds, which can hardly satisfy real-time applications such as high-frequency trading and market risk monitoring.
In addition, FAST messages have strong data dependencies. Encoded FAST market messages form a binary data stream in which fields and messages are delimited sequentially through the PMap and the stop bit encoding mechanism. That is, the position of the next field in the input stream can be determined only after the encoded value of the current field has been read; likewise, the next FAST message can be read only after the current one has been read completely. These data dependencies limit parallel decoding of FAST market data. No published work has reported decoding FAST market messages in parallel.
Disclosure of Invention
The invention aims to provide a device and a method for accelerating FAST market data decoding on dedicated hardware, which effectively speed up market decoding and support financial applications such as algorithmic trading, high-frequency trading and market risk monitoring.
The technical scheme adopted by the invention is as follows:
A low-delay FAST market decoding device based on a pipeline architecture comprises an internal bus, a controller and field decoding operators for the fields in FAST market data. The field decoding operators are each connected to the internal bus, and the fields in the FAST market data are decoded in sequence under the control of the controller. The internal bus is divided into a data bus and a control bus: the data bus carries data among the FAST market data input stream buffer, the field decoding operators and the FIX message buffer, and the control bus is responsible for controlling the decoding operation of each field.
Furthermore, each field decoding operator is a three-section decoding operator, and pipelined FAST market decoding is realized by connecting the operators through buses. A three-section decoding operator comprises three components, data reading, field decoding and decoding result output, which pass intermediate results to one another through buffers. The field decoding component performs the actual decoding according to the rules of the field operator, and the result output component outputs the decoded field value to the output FIX message buffer. The pipelined FAST market decoding comprises three independent pipelines: the data reading components of all field decoding operators are each connected to an internal bus and to the data reading controller to form a data reading pipeline; the decoding components of all field decoding operators are each connected to another internal bus and to a decoding controller to form a decoding pipeline; and the result output components of all field decoding operators are each connected to a further internal bus and to the output controller to form an output pipeline.
Furthermore, the controller is implemented as a finite state machine with a data path. Internally the controller is divided into a control path and a data path, each implemented as a finite state machine. The control path is the top-level field decoding task scheduler and issues, in sequence, the instruction that starts the decoding operation of each field decoding operator; the data path contains the specific control logic of each decoding operator. When the decoding operator of a field completes decoding, the data path returns a decoding end signal to the control path.
A low-delay FAST market decoding method based on a pipeline architecture uses the above device, divides each field decoding operator into three components (data reading, field decoding and decoding result output) and realizes pipelined FAST market decoding through bus connections. The method comprises the following steps:
1) The data reading components of all field decoding operators are each connected to an internal bus and to the data reading controller to form a data reading pipeline; each data reading component reads the encoded value of its field from the FAST market data input stream buffer;
2) The decoding components of all field decoding operators are each connected to another internal bus and to a decoding controller to form a decoding pipeline; each field decoding component performs the actual decoding according to the rules of the field operator;
3) The result output components of all field decoding operators are each connected to a further internal bus and to the output controller to form an output pipeline; each result output component outputs the decoded field value to the output FIX message buffer.
The device and the method for processing FAST market data decoding provided by the invention have the following advantages:
1. Extremely low-latency FAST market data processing is achieved. The delay for decoding one FAST market message is 0.1-1 microsecond. Because FAST market decoding is implemented on dedicated hardware, the extra processing delay introduced by a software system is avoided.
2. The bus-based FAST market data decoding processor is highly extensible and supports flexible field updates. Adding an operator only requires mounting it on the bus, and deleting an operator only requires removing it from the bus. This is important for financial applications, since FAST market templates may be updated frequently.
3. The pipeline-based FAST market decoding processor overcomes the data dependencies that prevent FAST market data from being decoded in parallel, and realizes parallel decoding across multiple fields and multiple messages. Intermediate results between pipeline stages are uniformly represented as "value presence flag + value". Compared with a FAST market processor that does not use the pipeline architecture, the theoretical speedup is 3 times; measurements show a performance improvement of 1.8 times.
4. The controller is implemented based on a finite state machine with a data path, simplifying control and update logic.
5. The invention is also applicable to other differential encoding protocols whose structure is similar to that of the FAST protocol.
The measurement environment for points 1 and 3 is as follows: the FPGA chip is a Xilinx Zynq-7000 (XC7Z020-3CLG484), and the FAST template contains 12 fields.
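The theoretical factor of 3 in point 3 matches the textbook speedup bound of an ideal three-stage pipeline. The sketch below is only that idealized model, under assumptions that are not stated in the source: every stage takes one time unit per item and no stage ever stalls. The function name is hypothetical.

```python
def ideal_pipeline_speedup(n_items: int, n_stages: int = 3) -> float:
    """Speedup of an ideal pipeline over running all stages back to back.

    Assumes every stage takes one time unit per item and never stalls.
    """
    serial_time = n_items * n_stages            # each item passes all stages alone
    pipelined_time = n_stages + (n_items - 1)   # fill the pipe, then one item per unit
    return serial_time / pipelined_time
```

Under this model 12 items give 36/14, about 2.57, and the value approaches 3 only as the number of items grows; it is an upper bound and is not meant to explain the measured figure.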
Drawings
Fig. 1 is a schematic diagram of a FAST market decoding processor based on a bus architecture.
FIG. 2 is a schematic diagram of a field decoder for a three-stage pipeline.
FIG. 3 is a schematic diagram of a three-stage pipelined FAST market decoding processor.
FIG. 4 is a schematic diagram of a controller principle based on FSMD.
Fig. 5 is a schematic diagram of a FAST market message decoding processor implemented based on an FPGA.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
In the present invention, dedicated hardware such as an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit) is used to accelerate FAST market data decoding.
First, a FAST market decoding processor based on a bus architecture is designed. For a specific FAST market template, the decoders (also called decoding operators) of the fields defined in the template are each connected to the internal bus of the decoding processor. The internal bus is divided into a data bus and a control bus. The fields are then decoded in sequence under the control of a decoding controller.
Next, the bus-based FAST market decoding processor is optimized into a three-stage pipelined FAST market decoding processor. The pipeline design realizes serial data reading, parallel decoding and serial data output across FAST fields and FAST messages, which greatly improves the speed of FAST market decoding. Each field decoding operator of the pipeline-based processor is designed as a three-section decoding operator, and the operators are then connected through buses to form the complete pipelined FAST market decoder.
1. FAST market decoding processor based on a bus architecture
Fig. 1 shows the bus-based FAST market decoding processor. The FAST market data input stream buffer (FAST-MSG FIFO), the field decoders and the FIX message buffer (FIX-MSG FIFO) are connected to the internal bus of the decoding processor. The data bus carries data among the input stream buffer, the field decoders and the FIX message buffer, and the control bus is responsible for controlling the decoding operation of each field. When the FAST market data input stream buffer is not empty, the controller first activates the PMap field decoder (PMap Decoder), which reads the encoded value of the PMap field from the input stream and obtains the PMap vector. The controller then starts the decoding of the template ID and afterwards activates the decoders of the subsequent fields in turn. When the FAST template changes, only the affected field decoders need to be mounted on or removed from the bus and the corresponding control logic of the controller modified, which suits financial applications in which templates are updated frequently.
2. FAST market decoding processor based on a pipeline
FIG. 2 depicts the field decoding operator of the three-stage pipeline architecture. The operator is divided into three components: data reading (reader), field decoding (decoder) and decoding result output (writer). Intermediate results are passed between the components through buffers (FIFOs). The data reading component reads the encoded value of the field from the input stream (FAST-MSG FIFO); the field decoding component performs the actual decoding according to the rules of the field operator; and the result output component outputs the decoded field value to the output FIX message buffer (FIX-MSG FIFO).
Fig. 3 depicts the architecture of the parallel FAST market decoding processor, which is divided into three separate pipelines. Each field has a three-section field decoding operator, divided into three components: data reading, field decoding and decoding result output. The data reading components of all field decoding operators are each connected to an internal bus (itself divided into a data bus and a control bus) and to a Read Controller (RC) to form the read data pipeline of the FAST market processor; the read controller controls the specific read operation of each decoding operator. Similarly, the decoding components of all decoding operators are each connected to another internal bus and to a Decoding Controller (DC) to form the decoding pipeline, and the result output components of all decoding operators are each connected to a further internal bus and to an output controller (Writing Controller, WC) to form the output pipeline. The operation of each pipeline is described in detail below.
a) Read data pipeline
The read components of the fields are activated one after another to read the encoded values of the fields in sequence. When the input FAST message buffer (FAST-MSG FIFO) is not empty, the read controller RC sends a read enable signal to the data reading component of the PMap decoding operator (PMap reader). After this component has read its data, the data reading component of the next field (the template ID) is activated. The read controller continues in this way, activating the data reading components of the subsequent fields in turn according to the PMap state and the field order defined by the FAST template. When the last field has been read, the RC is notified that one FAST message is complete. If the input FAST data stream buffer is not empty at this time, the RC starts reading a new FAST message; otherwise it enters an idle waiting state.
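The read data pipeline can be modeled in software roughly as follows. This is a behavioral sketch, not the hardware design: it assumes, for simplicity, that every field in the template occupies exactly one PMap bit and that missing trailing PMap bits count as 0. The function names and the per-field deques standing in for the field buffers are hypothetical.

```python
from collections import deque


def read_field(stream: deque) -> bytes:
    """Pop bytes from the input FIFO up to and including the stop-bit byte."""
    out = bytearray()
    while True:
        byte = stream.popleft()
        out.append(byte)
        if byte & 0x80:
            return bytes(out)


def run_read_pipeline(stream: deque, field_names: list[str]) -> dict[str, deque]:
    """Model of the read controller: PMap first, then each field in template order."""
    in_fifos = {name: deque() for name in field_names}
    while stream:                                   # input FIFO not empty
        pmap = read_field(stream)                   # the PMap reader goes first
        bits = [(b >> s) & 1 for b in pmap for s in range(6, -1, -1)]
        bits += [0] * len(field_names)              # assumption: absent bits are 0
        for name, bit in zip(field_names, bits):
            value = read_field(stream) if bit else b""
            in_fifos[name].append((bit, value))     # entry layout: (pmap_bit, value)
    return in_fifos


# Two illustrative messages for a hypothetical three-field template.
fast_msg_fifo = deque(bytes([0xE0, 0x81, 0x02, 0xD6, 0xC0, 0x82]))
fifos = run_read_pipeline(fast_msg_fifo, ["tid", "price", "size"])
```

Each reader only moves raw bytes into its own buffer, so the strictly serial part of the work is limited to locating field boundaries, and the actual decoding is left to the next stage.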
The reading component of each operator writes the encoded value it has read into a field encoded value buffer xxx_in_fifo (throughout this description, "xxx" stands for a specific field name in the figures; for example, the encoded value buffer of the first field in the FAST template is "fld1_in_fifo", see Fig. 3). The basic storage unit of this buffer has the format "[pmap_bit][value]". Here pmap_bit is the PMap flag bit of the field, which the field decoding component uses to decide how to decode. That is, the decoder needs to know whether the current field has no encoded value in the input stream, in which case it must recover the field value from the previous value or the initial value, for example when the PMap bit of a field with the COPY operator is 0.
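How a decoding component might use pmap_bit can be sketched for the COPY operator. This is a simplified model that assumes a mandatory integer field whose encoded value has already been converted to an integer; optional (nullable) fields and the remaining FAST operators are omitted, and the class name is hypothetical.

```python
from typing import Optional


class CopyDecoder:
    """Simplified COPY operator: reuse the previous value when none was sent."""

    def __init__(self, initial: Optional[int] = None):
        self.initial = initial                    # initial value from the template
        self.previous: Optional[int] = None       # last decoded value, if any

    def decode(self, pmap_bit: int, value: Optional[int]) -> Optional[int]:
        if pmap_bit == 1:
            self.previous = value                 # value was present in the stream
        elif self.previous is None:
            self.previous = self.initial          # nothing seen yet: use the initial value
        return self.previous                      # pmap_bit 0 otherwise repeats the previous value


price = CopyDecoder(initial=100)
first = price.decode(0, None)     # no encoded value and no history
second = price.decode(1, 342)     # encoded value present
third = price.decode(0, None)     # no encoded value, history available
```

The decoder keeps state between messages, which is why the buffer entry has to carry pmap_bit even when no encoded value was read.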
b) Decoding pipeline
The decoding components of the fields perform decoding in parallel. Each component reads the encoded value of its field from the field encoded value buffer, or reads no encoded value at all, as determined by the rules of the field operator and the state of the field's PMap bit. It then decodes the field according to the rules of the field operator and finally writes the decoded result to the field value buffer (xxx_out_fifo).
The basic storage unit of the field value buffer (xxx_out_fifo) has the structure "[presence_flag][value]". When the field has a value, presence_flag is 1 and "value" holds the field value; when the field has no value, presence_flag is 0 and "value" is empty. The flag is needed because, when the current field of the message is empty, the field output component must know that the field has no value and that it need not write any data to the output buffer. A variable-length string field value is terminated by '\0'.
c) Output pipeline
The output pipeline is activated in the same way as the read data pipeline: the output components of the fields (xxx_writer) are activated in turn. After an output component receives the output start signal from the output controller, it reads the entry from the corresponding field value buffer (xxx_out_fifo). If the presence flag (presence_flag) it reads is 0, no field value needs to be output and the output work for this field is finished; otherwise the component reads the field value from the field value buffer and writes it byte by byte into the FIX message buffer (FIX-MSG FIFO). When the output is finished, the output controller notifies the next output component to continue outputting its field value.
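A behavioral sketch of the output pipeline follows. The source does not spell out the byte layout written to the FIX message buffer, so the "tag=value" plus SOH layout used here is an assumption, as are the function name, the sample tags and the dictionary of per-field deques standing in for the xxx_out_fifo buffers.

```python
from collections import deque

SOH = b"\x01"


def run_output_pipeline(field_order: list[int], out_fifos: dict, fix_msg_fifo: deque) -> None:
    """Activate each field writer in template order for one message."""
    for tag in field_order:
        presence_flag, value = out_fifos[tag].popleft()
        if presence_flag == 0:
            continue                                # nothing to output for this field
        encoded = (b"%d=%s" % (tag, value)) + SOH   # assumed output layout
        for byte in encoded:                        # written byte by byte
            fix_msg_fifo.append(byte)


# Illustrative decoded results for one message; field 271 has no value.
out_fifos = {
    55: deque([(1, b"600000")]),
    270: deque([(1, b"342")]),
    271: deque([(0, b"")]),
}
fix_msg_fifo = deque()
run_output_pipeline([55, 270, 271], out_fifos, fix_msg_fifo)
```

Because the writers run strictly in template order, the fields reach the FIX message buffer in the order the template defines, even though the decoders in the middle stage may finish in any order.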
3. Controller
The controller in the present invention is implemented as a finite state machine with a data path (FSMD), as shown in Fig. 4. Internally the controller is divided into a control path and a data path, each implemented as a finite state machine (FSM). The control path is the top-level field decoding task scheduler and issues, in sequence, the instruction (xxx_dec_begin) that starts the decoding operation of each field decoding operator; the data path contains the specific control logic of each decoding operator. When the decoding operator of a field completes decoding, the data path returns a decoding end signal (xxx_dec_done) to the control path.
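The handshake between the control path and the data path can be sketched as a small state machine. The state names, the `DatapathStub` class and the cycle counts are hypothetical; only the xxx_dec_begin and xxx_dec_done signal names come from the description above.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    BEGIN = auto()
    WAIT = auto()


class DatapathStub:
    """Stands in for the control logic of one field decoding operator."""

    def __init__(self, name: str, cycles: int):
        self.name, self.cycles, self.remaining = name, cycles, 0

    def start(self) -> None:
        self.remaining = self.cycles

    def clock(self) -> bool:
        self.remaining -= 1
        return self.remaining <= 0               # True models xxx_dec_done


def run_controller(operators: list) -> list[str]:
    """Control path: schedule the operators one by one and record the signals."""
    state, index, trace = State.IDLE, 0, []
    while True:
        if state is State.IDLE:
            if index == len(operators):
                return trace                     # all fields decoded
            state = State.BEGIN
        elif state is State.BEGIN:
            operators[index].start()
            trace.append(operators[index].name + "_dec_begin")
            state = State.WAIT
        else:                                    # State.WAIT
            if operators[index].clock():
                trace.append(operators[index].name + "_dec_done")
                index += 1
                state = State.BEGIN if index < len(operators) else State.IDLE


trace = run_controller([DatapathStub("pmap", 1), DatapathStub("tid", 2)])
```

Keeping the scheduling in the control path and the per-field logic in the data path means that a template change only touches the list of operators, not the scheduler itself.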
4. Low-delay market decoding processor based on FPGA
The market decoding architecture proposed by the invention can be implemented on dedicated hardware such as an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit). This section gives a specific implementation, taking an FPGA platform as an example.
Fig. 5 shows a block diagram of the FAST market message decoding processor implemented on an FPGA. The financial market processor is the pipeline-based FAST market decoding processor described above. A data exchange module inside the FPGA chip is responsible for fast data exchange between the market processor and external systems. On the one hand, the data exchange module receives the real-time market data stream published by the exchange, unpacks and parses the received data packets, extracts the FAST market data and stores it in the FAST market data buffer (FAST-MSG FIFO). On the other hand, it transmits the decoded FIX messages to the financial application system (including any necessary packetization). Because the exchange publishes market data in real time by multicast, the data exchange module receives the market data through a high-speed network interface, such as a 10/40G PHY + QSFP+ + GMAC core. The decoded FIX market messages may be sent to the financial application system through a high-speed network interface, or quickly through a PCI Express interface with DMA.
The above embodiments are only intended to illustrate the technical solution of the present invention and not to limit the same, and a person skilled in the art can modify the technical solution of the present invention or substitute the same without departing from the spirit and scope of the present invention, and the scope of the present invention should be determined by the claims.