CN105959151A - High availability stream processing system and method - Google Patents


Info

Publication number
CN105959151A
Authority
CN
China
Prior art keywords
data
component
working
information
log
Prior art date
Legal status
Granted
Application number
CN201610458184.5A
Other languages
Chinese (zh)
Other versions
CN105959151B (en)
Inventor
袁一
沈贇
陶玮
张学舟
Current Assignee
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN201610458184.5A
Publication of CN105959151A
Application granted
Publication of CN105959151B
Legal status: Active


Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/06 - Management of faults, events, alarms or notifications
    • H04L 41/069 - Management of faults, events, alarms or notifications using logs of notifications; post-processing of notifications
    • H04L 41/0654 - Management of faults, events, alarms or notifications using network fault recovery
    • H04L 43/00 - Arrangements for monitoring or testing data switching networks
    • H04L 43/10 - Active monitoring, e.g. heartbeat, ping or trace-route

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Cardiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a high availability stream processing system and method. The system comprises an upstream application system, a receiving device, a message-oriented middleware device, a stream processing topology device and a downstream application system. The upstream application system generates original data in real time according to the progress of an upstream business application and sends the records one by one to the receiving device; the receiving device forwards the original data to a message queue of the message-oriented middleware device; the message-oriented middleware device caches the data for the stream processing topology device in the form of a message queue; the stream processing topology device processes the data stream of the original data and generates a data result; and the downstream application system, which corresponds to a business scenario, carries out follow-up processing on the data result.

Description

High-availability streaming processing system and method
Technical Field
The present invention relates to the field of information processing, and in particular, to a high availability streaming processing system and method.
Background
The term "streaming data" first appeared in the communication field, and its definition and range of application have been expanding continuously with the development of information network technology. In fields such as data analysis, risk monitoring and network security, the data sequences generated by various service systems arrive rapidly, are produced continuously and grow without bound; such data is also called stream data. Conventional centralized IT architectures face several challenges when processing it. First, enterprise users now require real-time analysis and processing of streaming data, meaning that the total time from the generation of a business event in the upstream application system to the completion of processing in the downstream application system should not exceed one second. Second, a conventional host, being a single-node centralized processor, has limited storage and processing capability and struggles to sustain high throughput during traffic peaks. Third, because of the high price and high operation and maintenance cost of high-performance hosts, enterprises bear a heavy economic burden of up-front investment and later maintenance. For these reasons, enterprise IT architectures are shifting toward inexpensive distributed server processing technologies.
Distributed streaming data processing technology combines distributed server cluster deployment with real-time stream processing, and offers notable advantages such as low cost, flexible cluster expansion, high data processing throughput and strong timeliness. However, compared with a centralized IT architecture, the higher failure rate of distributed nodes poses great challenges in load balancing, failure recovery and data recovery. In addition, traditional data processing systems are too tightly coupled to the upper-layer applications, which makes them hard to adapt to diverse data analysis scenarios: modifying or adding a service module often requires modifying the underlying data processing framework as well. Enterprises therefore urgently need an improved high-availability streaming data processing system that, on the premise of stable operation, not only processes data quickly but also recovers failed nodes and lost data in a timely manner, with an underlying processing framework that supports a variety of service scenarios.
Disclosure of Invention
To address the problems that arise in large-scale processing environments, the present invention provides a highly available streaming processing system and method. The invention designs and implements a topological distributed stream computation framework that avoids the low I/O efficiency caused by disk access during data transmission and processing, using local memory at each stream processing node to guarantee real-time performance. It addresses the difficulty traditional data processing methods have in supporting diverse data analysis scenarios by loosely coupling the basic framework modules from the specific service logic modules, providing solid underlying support for complex and varied service models. It avoids the excessive single-point load of traditional data processing systems by adopting a decentralized architecture in which multiple nodes perform the same service processing concurrently, making reasonable use of cluster computing resources. Finally, it overcomes the cluster instability and data loss caused by failures of distributed working nodes: an automatic fault recovery mechanism is implemented with ZooKeeper, and lost data is located in near real time using fast Spark in-memory computation.
Specifically, the highly available streaming processing system proposed by the present invention includes: an upstream application system, used for generating original data in real time according to the progress of the upstream business application and sending it to the receiving device record by record; a receiving device, used for forwarding the raw data to a message queue of the message middleware device; a message middleware device, used for caching data for the streaming processing topology device in the form of message queues; a streaming processing topology device, used for processing the data stream of the original data and generating data results; and a downstream application system, corresponding to a service scenario and used for carrying out subsequent processing on the data results.
Further, the streaming topology device includes: an acquisition component, model components and a forwarding component, where each component corresponds to a plurality of working nodes; the acquisition component is used for reading original data records of the upstream application system from a message queue of the message middleware device, converting the original data into the MessageObject message object format that the working nodes of the internal components can operate on, and forwarding the converted data to downstream model components according to the setting of the routing table; the model components are serially arranged in a data processing link, each model component is used for processing the converted data according to the service logic specified by the upper-layer application system, generating a data result and transmitting it to a downstream model component or the forwarding component, and if a data processing link has a plurality of model components, the converted data is processed by them in sequence; and the forwarding component is used for forwarding the data result to the corresponding downstream application system.
Further, the system further comprises: a heartbeat detection device, used for periodically receiving heartbeat information from the working nodes of the streaming processing topology device and forwarding the list of working nodes that sent heartbeats to the system guard device; and a system guard device, used for receiving the list information sent by the heartbeat detection device and monitoring the running state of the streaming processing topology device, for restarting failed working nodes and restoring them to normal operation, and for recovering, in a fine-grained manner, data lost during system operation.
Further, the system further comprises: the system information persistence device is used for managing and maintaining information of related components and working nodes in the system, recording the running state of each working node and caching a data record to be recovered.
Further, the system further comprises: the mass log management device is used for collecting and storing the frame logs for subsequent mining; and the log reconciliation device is used for finding the data records lost in the processing process, regularly reading the frame logs in the mass log management device through a mining algorithm based on a reconciliation mechanism, and finally writing the lost data record information into the system information persistence device for recovery processing.
Furthermore, the system guard device comprises a monitoring management unit, a fault recovery unit and a data recovery unit. The monitoring management unit is used for monitoring the running state of the working nodes of each component in the system, periodically receiving messages from the heartbeat detection device, where the message content comprises the list of working nodes that successfully sent heartbeats in the period, and writing this working node list into the system information persistence device. The fault recovery unit is used for periodically comparing the list of working nodes recorded in the system information persistence device as having successfully sent heartbeats with all working nodes of the streaming processing topology device, searching for working nodes that did not send a heartbeat, judging that a working node has failed if it has not sent a heartbeat within a preset time interval, writing the information of the failed working node into the system information persistence device, finding the server IP and port information and the configuration information of the failed node from the system information persistence device, and remotely restarting the destination server. The data recovery unit is used for periodically reading the data records lost during system operation from the system information persistence device and recovering them one by one in a fine-grained manner.
Further, the data recovery unit is configured to read, from the system information persistence apparatus, the component ID recorded when the data was lost, determine, according to a predetermined policy, the working node ID of that component to which the data is to be retransmitted, and find the server IP and port of that working node;
and, when the data recovery unit reads a lost data record, to re-encapsulate the record text into a message object and retransmit it to the corresponding component working node of the streaming processing topology device according to the IP and port of the destination server.
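For illustration only, a minimal Java sketch of this recovery path follows: a cached record text is re-encapsulated into a HashMap-based message object and resent over a socket to the target working node. All class, field and method names here are assumptions, not the patent's actual code.

```java
// Minimal sketch (assumed names): re-encapsulate a cached record and resend it to a worker node.
import java.io.OutputStream;
import java.net.Socket;
import java.util.HashMap;

public class LostRecordResender {

    /** Rebuilds a MessageObject-style map from the cached text and pushes it to the target node. */
    public void resend(String recordKey, String recordText, String nodeIp, int nodePort) throws Exception {
        HashMap<String, Object> messageObject = new HashMap<>();   // the patent's MessageObject is HashMap-based
        messageObject.put("primaryKey", recordKey);
        messageObject.put("payload", recordText);

        try (Socket socket = new Socket(nodeIp, nodePort);
             OutputStream out = socket.getOutputStream()) {
            out.write(serialize(messageObject));                   // wire format is an assumption
        }
    }

    private byte[] serialize(HashMap<String, Object> mo) {
        return mo.toString().getBytes();                           // placeholder serialization
    }
}
```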
The invention also provides a high-availability streaming processing method, which comprises the following steps: step S1, the upstream application system generates original data in real time and sends it to the receiving device record by record; step S2, the receiving device forwards the original data to a message queue of the message middleware device, and the message middleware device caches the data for the streaming processing topology device in the form of the message queue; step S3, the streaming processing topology device is started, the working nodes of the acquisition component, model components and forwarding component are initialized, and the working-component forwarding routing information and data mapping information are loaded into memory as common data; step S4, the heartbeat detection device and the system guard device guard the normal operation of the system in real time: the heartbeat detection device monitors all working nodes of the system, and when the system guard device detects that a working node has failed it remotely restarts it; step S5, the acquisition component of the streaming processing topology device reads the original data from the message queue of the message middleware device, converts it into a message object, and transmits the message object to one or more downstream model components in the streaming processing topology device according to the working-component forwarding routing information and data mapping information; step S6, the model component of each data processing link branch processes the data record with the service logic specified by the upper-layer application system and forwards the result to the downstream model component or forwarding component according to the working-component forwarding routing information and data mapping information; each data processing link branch has one or more model components, and the last model component forwards the data to a forwarding component; step S7, the forwarding component forwards the data to the downstream application system; in steps S5, S6 and S7, the working nodes of all components record their current processing state in the working log in real time so that the subsequent data recovery component can find lost data records; step S8, the log reconciliation device periodically searches for the data records whose forwarding failed in steps S5, S6 and S7, and the system guard device recovers and retransmits these data records.
Further, in step S4, the method further includes: step S401, when the working nodes of the streaming processing topology device are working normally, they periodically send heartbeats to the heartbeat detection device, and the heartbeat detection device periodically collects the working nodes from which heartbeats were received into a list and forwards it to the monitoring management unit; step S402, the monitoring management unit collects the list of working nodes that successfully sent heartbeats and writes it into the system information persistence device; step S403, the failure recovery unit searches for failed nodes and remotely restarts them.
Further, in step S403, the method further includes: the fault recovery unit compares the list of working nodes recorded in the system information persistence device as having successfully sent heartbeats with all working nodes, searches for working nodes that did not send a heartbeat, and judges that a working node has failed if it has not sent a heartbeat within a preset time interval; the fault recovery unit then writes the failed working node information into the system information persistence device, finds the server IP and port information and the configuration information of the failed node from the system information persistence device, and remotely restarts the destination server using the SSH protocol.
Further, in step S8, the method further includes: step S801, each link working node records the intermediate results of processing the data records in a log in real time; step S802, the log reconciliation device periodically searches for lost data; step S803, the data recovery unit periodically recovers the missing data records.
Further, in step S802, the method further includes: step S8021, collecting the log entries recorded for the same primary key into one set; step S8022, analyzing the set of log entries for each primary key, looking for entries carrying the H identifier and the T identifier, and checking whether the data was lost; step S8023, locating the forwarding path on which the lost data was being transmitted; step S8024, locating the component at which the loss occurred on that failed forwarding path.
The high-availability streaming processing system and method provided by the invention can be applied to a high-performance data processing platform, process data generated by an upstream application system within seconds, and adopt various measures to improve the stability of the system. The advantages include the following:
1. The basic framework modules and the specific service logic modules are loosely coupled, so the system can simultaneously support multiple high-speed real-time data analysis scenarios, provide solid underlying support for complex and varied service models, and ensure that data is processed quickly and effectively.
2. The log reconciliation method ensures that every piece of data generated by the upstream application system is processed, and data lost due to anomalies is accurately located to a specific position and service model.
3. The working nodes of the distributed cluster are monitored in real time, failed nodes are responded to promptly and recovered quickly, and the system continues to serve external parties normally.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention. In the drawings:
fig. 1 is a schematic diagram of a highly available streaming processing system according to an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of a streaming processing topology apparatus according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a relationship between a collection, model, and forwarding component and a node according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of the acquisition, model and forwarding modes of operation according to an embodiment of the present invention.
Fig. 5 is a schematic structural diagram of a streaming processing topology apparatus according to another embodiment of the present invention.
Fig. 6 is a schematic structural diagram of a system guard device according to an embodiment of the present invention.
Fig. 7 is a flow chart of a highly available streaming processing method according to an embodiment of the invention.
Fig. 8 is a schematic diagram of the system guarding process of step S4 according to an embodiment of the present invention.
Fig. 9 is a schematic diagram of a data forwarding flow of the collection component and the model component work node in steps S5 and S6 according to an embodiment of the present invention.
Fig. 10 is a schematic diagram of the data recovery process in step S8 according to an embodiment of the invention.
Fig. 11 is a log record rule diagram of a collection, model and forwarding working node in normal operation according to an embodiment of the present invention.
Fig. 12 is a log record rule diagram of an abnormal operation of a collection, model and forwarding work node according to an embodiment of the present invention.
Fig. 13 is a flowchart of an algorithm for searching and locating the missing data record in step S802 according to an embodiment of the present invention.
Fig. 14 is a flow chart of log collection for a high availability streaming system according to an embodiment of the invention.
Detailed Description
The technical means adopted by the invention to achieve its intended objectives are further described below with reference to the drawings and the preferred embodiments of the invention.
Fig. 1 is a schematic diagram of a highly available streaming processing system according to an embodiment of the present invention. As shown in fig. 1, the system includes: the system comprises an upstream application system 101, a receiving device 102, a message middleware device 103, a streaming processing topology device 104, a downstream application system 105, a heartbeat detection device 106, a system guard device 107, a system information persistence device 108, a mass log management device 109 and a log reconciliation device 110. Wherein, the upstream application 101 and the downstream application 105 are external devices that are in communication with the system. In order to ensure that the system core device, i.e. the streaming topology device 104, can provide services to the upstream and downstream application systems permanently and stably, various measures can be taken to ensure the normal operation of the system, and timely response and quick recovery to the failed node are achieved.
Specifically, the upstream application system 101 is configured to generate raw data in real time according to a process of an upstream service application, and send the raw data to the receiving device one by one. The data generated by the upstream application system 101 is basically a single data record, and each data record includes a unique data primary key and application-related detail information. Inside the system, data is transmitted and processed in a fine-grained mode with a single data record.
Receiving means 102 for forwarding the original data to a message queue of a message middleware device; the device is provided with a plurality of receiving ends, each of which processes data asynchronously and concurrently, ensuring that the data is efficiently and in time transferred to the message middleware device 103.
Message middleware device 103 is used for buffering data for the streaming topology device in the form of message queues. It manages data as message queues, which facilitates secure and reliable delivery. At the same time, the combination of the message middleware device 103 and the receiving device 102 removes the direct contact between the streaming processing topology device 104 and the upstream application system 101, achieving structural loose coupling. Specifically, during service peaks, when the data generation speed of the upstream application system 101 exceeds the processing speed of the streaming processing, the message middleware device 103 temporarily buffers the data that cannot be processed in time, and this backlog is digested during low-traffic periods. Further, if the streaming topology device 104 encounters an abnormal condition such as a system interruption, the message middleware device 103 suspends the supply of data to the streaming topology device 104 while continuing to receive data normally, ensuring that no data is lost during the suspension. The system can support a variety of existing message middleware products; if the middleware product needs to be replaced to meet application requirements, only the way the downstream components access the message middleware has to be updated, and the rest of the system does not change significantly.
The streaming processing topology device 104 is used for processing the data stream of the original data and generating data results. It is the core device of the system, is deployed on a distributed cluster composed of multiple servers, and is responsible for the actual processing of the data streams. The device comprises an acquisition component, model components and a forwarding component. To support different data analysis and processing scenarios, the basic framework modules and the specific service logic modules are loosely coupled: the device focuses on data receiving, caching, forwarding and component availability, does not itself contain data processing service logic, and only exposes a service processing interface to the upper-layer application, which specifies and injects the service processing logic through that interface.
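As an illustration of this loose coupling, a hypothetical Java service-processing interface that an upper-layer application might implement and inject is sketched below; the interface name, signature and the example handler are assumptions, not part of the patent.

```java
import java.util.HashMap;

/**
 * Hypothetical service-processing interface: the framework only moves MessageObject data
 * between components, while the upper-layer application injects its business logic through
 * an interface like this one.
 */
public interface BusinessLogicHandler {
    /** Transforms one message object and returns the result passed to the next component. */
    HashMap<String, Object> process(HashMap<String, Object> messageObject);
}

/** Example injection for a risk-monitoring model component (illustrative only). */
class RiskScoringHandler implements BusinessLogicHandler {
    @Override
    public HashMap<String, Object> process(HashMap<String, Object> messageObject) {
        double amount = Double.parseDouble(String.valueOf(messageObject.getOrDefault("amount", "0")));
        messageObject.put("riskScore", amount > 10000 ? "HIGH" : "LOW");
        return messageObject;
    }
}
```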
Specifically, one data record is transmitted to the corresponding downstream application system through the acquisition component, the model component and the forwarding component in sequence. In order to improve the concurrent processing capacity of the system, each component corresponds to a plurality of working nodes, and each working node is responsible for the same task and can be deployed and operated in different servers. If one working node fails, other nodes of the same component bear the workload of the failed node until the failed node recovers to work; if the existing working nodes cannot bear the suddenly increased workload during the service peak period, the working nodes can be dynamically increased on the premise of not influencing the current operating environment. After the new component working node joins in the work, the system will inform all working nodes corresponding to the upstream component to update the forwarding routing information, and when forwarding data, the system can forward the data to the newly added working node.
The downstream application system 105 corresponds to a service scenario and is configured to perform subsequent processing on the data result. The system can simultaneously support a plurality of service scenes, so a plurality of downstream application systems can be arranged.
The heartbeat detection device 106 is used for periodically receiving heartbeat information from the working nodes of the streaming processing topology device and forwarding the list of working nodes that sent heartbeats to the system guard device. The heartbeat detection device 106 employs a distributed service framework, specifically ZooKeeper. ZooKeeper is an open-source distributed coordination service framework that provides state synchronization in distributed application scenarios and greatly reduces the development cost of distributed coordination services.
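A hedged sketch of how the heartbeat detection could be realized with the ZooKeeper Java client follows. The patent states only that ZooKeeper is used; the ephemeral-node scheme, znode paths, timeouts and the assumption that the parent znodes already exist are all illustrative choices.

```java
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class HeartbeatSketch {

    /** A worker node registers an ephemeral znode; it disappears automatically if the node dies. */
    public static void registerWorker(ZooKeeper zk, String componentId, String nodeId) throws Exception {
        // Parent path "/stream-topology/heartbeat" is assumed to exist already.
        String path = "/stream-topology/heartbeat/" + componentId + "-" + nodeId;
        zk.create(path, new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
    }

    /** The heartbeat detection side periodically lists live workers and forwards the list onwards. */
    public static List<String> collectLiveWorkers(ZooKeeper zk) throws Exception {
        return zk.getChildren("/stream-topology/heartbeat", false);
    }

    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zk-host:2181", 15000, event -> { });
        registerWorker(zk, "2", "1");
        System.out.println("Live workers: " + collectLiveWorkers(zk));
    }
}
```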
The system guard device 107 is used for receiving the list information sent by the heartbeat detection device and monitoring the running state of the streaming processing topology device; it is also used for restarting failed working nodes and restoring them to normal operation, and for recovering, in a fine-grained manner, data lost during system operation.
And the system information persistence device 108 is used for managing and maintaining information of related components and working nodes in the system, recording the running state of each working node, and caching a data record to be recovered. The system information persistence device 108 is an indispensable intermediate link for high availability of the system, and is also an essential interface for a system administrator to observe the operation condition of the system in real time. Specifically, the device adopts an Oracle database system, but the selection of the database by the device is not limited, and other general relational database systems can be adopted.
And the mass log management device 109 is used for collecting and storing the frame logs for subsequent mining. The mass log management apparatus 109 includes: a log collector and a log memory. When the streaming topology device 104 runs, each component working node generates a frame log in real time and records an intermediate result of data processing in the component working node.
Specifically, each component working node sends the content of the frame log to a log collector, and the log collector writes the content into a log file stored in the log storage. To enhance the utility of this device, the system makes the following improvements: 1. so as not to affect the normal work of the component working node, non-core log content is sent to the log collector asynchronously; 2. to ensure the integrity of log collection, two log collectors located on different servers collect and back up the logs separately, and the two sets of log content are merged and deduplicated before subsequent mining; 3. because the generated log files are very large, the logs are split into files by minute, which makes them easier to organize and manage; 4. the log storage uses the Hadoop distributed file system to store the massive logs. The Hadoop file system is a highly fault-tolerant and reliable distributed file system that can be deployed on clusters of inexpensive servers and is widely used for large-scale data storage management.
The log reconciliation device 110 is used for finding the data records lost during processing: based on a reconciliation mechanism, it periodically reads the frame logs in the mass log management device 109 with a mining algorithm and finally writes the information of the lost data records into the system information persistence device 108 for recovery processing. The mining algorithm based on the reconciliation mechanism is explained in detail in the embodiments of fig. 10 and fig. 13 described later. In short, as a data record flows through the working nodes of the streaming processing topology device 104, each working node records its processing state in the working log using the identifiers H, S, R, T, OS and OR, where H and T are the head and tail identifiers respectively. When the working logs are finally aggregated, the device analyzes whether each data record has the corresponding head and tail identifiers; if not, the data record was lost during processing, and the loss position can then be found. In this embodiment the mining algorithm is implemented with Spark; Spark is a general-purpose parallel computing framework that performs in-memory computation on distributed memory, combining the computing model of MapReduce with fast data processing. For data recovery, the device can locate the intermediate result at the moment the data was lost and the component where it was located, so that during subsequent recovery the intermediate result is sent directly to a working node of that component, truly achieving fine-grained recovery.
Fig. 2 is a schematic structural diagram of a streaming processing topology apparatus according to an embodiment of the present invention. As shown in fig. 2, the streaming topology apparatus 104 includes: the system comprises a collection component 1041, a model component 1042 and a forwarding component 1043, wherein each component corresponds to a plurality of working nodes. Wherein,
a component is a type of virtual organization with specific processing logic, and a worker node is a component-specific executor. As shown in fig. 3, one component corresponds to multiple working nodes, the working nodes of the same component have the same set of processing logic, and each working node can run in different servers and can work in parallel. It should be noted that the internal structures of the collection component 1041, the model component 1042, and the forwarding component 1043 are the same, but the types of the work tasks that they undertake are different, and a specific service processing logic is specified and injected by an upper application through a specific service processing interface. In the present system, each component has a specific ID in order to distinguish the components. Similarly, different working nodes inside each component are provided with IDs for distinguishing and identifying. The system information persistence device 108 records the route forwarding information between the components and the contents of the IP and port of the server where the working node is located, etc. by ID.
As shown in fig. 4, the operation mode of the collection, model and forwarding work nodes is schematically illustrated, and three work threads and two message queues are maintained in each work node according to the producer-consumer mode. The producer consumer mode refers to writing and fetching operations on a shared memory, wherein a writing thread is a producer (production data) and a fetching thread is a consumer (consumption data). The three working threads are respectively a receiving thread, a calculating thread and a sending thread, and the three working threads respectively perform their own functions and work asynchronously. The computing thread is responsible for analyzing and processing the data according to own processing logic; and the receiving thread and the sending thread are respectively responsible for data interaction with the upper component working node and the lower component working node. The message queues are a receiving message queue and a sending message queue.
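The producer-consumer arrangement described above can be sketched in Java roughly as follows; queue sizes, thread names and the stubbed transport and business methods are assumptions, not the patent's code.

```java
import java.util.HashMap;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/**
 * Sketch of one worker node: a receiving thread, a computing thread and a sending thread,
 * decoupled by a receive queue and a send queue.
 */
public class WorkerNodeSketch {
    private final BlockingQueue<HashMap<String, Object>> receiveQueue = new ArrayBlockingQueue<>(1024);
    private final BlockingQueue<HashMap<String, Object>> sendQueue = new ArrayBlockingQueue<>(1024);

    public void start() {
        Thread receiver = new Thread(() -> {
            while (true) {
                HashMap<String, Object> msg = readFromUpstream();    // stub: network receive / middleware read
                try { receiveQueue.put(msg); } catch (InterruptedException e) { return; }
            }
        }, "receiver");

        Thread computer = new Thread(() -> {
            while (true) {
                try {
                    HashMap<String, Object> msg = receiveQueue.take();
                    sendQueue.put(process(msg));                     // injected business logic
                } catch (InterruptedException e) { return; }
            }
        }, "computer");

        Thread sender = new Thread(() -> {
            while (true) {
                try { forwardDownstream(sendQueue.take()); } catch (InterruptedException e) { return; }
            }
        }, "sender");

        receiver.start(); computer.start(); sender.start();
    }

    // Stubs standing in for the real framework code.
    private HashMap<String, Object> readFromUpstream() { return new HashMap<>(); }
    private HashMap<String, Object> process(HashMap<String, Object> m) { return m; }
    private void forwardDownstream(HashMap<String, Object> m) { }
}
```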
Specifically, after the streaming topology apparatus 104 is started, each component working node first reads the routing information and the downstream component working node information from the system information persistence apparatus 108 during initialization. The working-component forwarding routes stored in the system information persistence device 108 are shown in table 1 below. Each component has exactly one upstream component. From the table, component 1 has no upstream component, so it is inferred to be the acquisition component; the upstream component of component 2 is component 1, so component 2 is inferred to be a model component, and the remaining model components follow by analogy; a component with no downstream component is inferred to be the forwarding component. This yields the data forwarding path: acquisition component 1 → model component 2 → model component 3 → forwarding component 4. The component node information table shown in table 2 records the specific deployment information of all component nodes in the system, including component ID, node ID, deployment server IP and port. In addition, the common memory of each working node records the list of downstream components of the component it belongs to and the IP and port of every working node contained in each of those components. When a working node needs the node information of a downstream component, its internal threads access this common data directly, which improves access efficiency.
Table 1: working component Forwarding routing Table example one
Location component ID Upstream component ID
1 Air conditioner
2 1
3 2
4 3
Table 2: component node information representation
Component ID Node ID Deploying server IP Deploying server ports
1 1 109.251.13.179 6431
1 2 109.251.13.179 6432
1 3 109.251.13.180 6431
2 1 109.251.13.180 6432
2 2 109.251.13.181 6431
……
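The inference described above (no upstream entry means a collection component, and following the chain of upstream references yields the path 1 → 2 → 3 → 4) can be sketched in Java as follows; this is an illustration only, not the patent's code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Derives the linear forwarding path of example one from the upstream-component column of Table 1. */
public class ForwardingPathDerivation {

    public static List<Integer> derivePath(Map<Integer, Integer> upstreamOf) {
        Integer current = null;
        for (Map.Entry<Integer, Integer> e : upstreamOf.entrySet()) {
            if (e.getValue() == null) { current = e.getKey(); break; }   // no upstream -> collection component
        }
        List<Integer> path = new ArrayList<>();
        path.add(current);
        boolean extended = true;
        while (extended) {
            extended = false;
            for (Map.Entry<Integer, Integer> e : upstreamOf.entrySet()) {
                if (current.equals(e.getValue())) {                      // e is directly downstream of current
                    current = e.getKey();
                    path.add(current);
                    extended = true;
                    break;
                }
            }
        }
        return path;                                                     // the last element is the forwarding component
    }

    public static void main(String[] args) {
        Map<Integer, Integer> upstreamOf = new HashMap<>();
        upstreamOf.put(1, null);   // Table 1, row by row
        upstreamOf.put(2, 1);
        upstreamOf.put(3, 2);
        upstreamOf.put(4, 3);
        System.out.println(derivePath(upstreamOf));   // prints [1, 2, 3, 4]
    }
}
```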
The collection component 1041 is configured to read a raw data record of the upstream application system 101 from a message queue of the message middleware device 103, convert the data into a MessageObject message object format that is operable by a working node of an internal component of the system, and forward the data to a downstream model component according to a routing table setting. The MessageObject employs a HashMap data structure for storing information of the original data record. One collection component corresponds to a plurality of collection working nodes, and each collection working node concurrently works according to the same processing logic. In the collection working node, the receiving thread, the computing thread and the sending thread have different processing flows respectively.
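A minimal sketch of such a HashMap-based MessageObject is shown below; the field names are assumptions.

```java
import java.util.HashMap;

/** Sketch of a MessageObject: a HashMap-backed wrapper around one original data record. */
public class MessageObject extends HashMap<String, Object> {
    public MessageObject(String primaryKey, String dataType, String rawRecord) {
        put("primaryKey", primaryKey);   // unique data primary key from the upstream record
        put("dataType", dataType);       // later compared against the data mapping table
        put("raw", rawRecord);           // original record content, parsed by model components
    }
}
```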
Specifically, the receiving thread is responsible for accessing the message middleware device 103, reading data from the message queue and writing it into the receive queue. Since the message middleware device 103 may use different middleware products, the receiving thread uses the connection mode appropriate to the product in use. When the middleware product is replaced, only the access, read and write code in this thread needs to change, so the rest of the streaming processing topology device remains loosely coupled to the message middleware device.
Specifically, the computing thread is responsible for reading and parsing the original data from the receive queue, converting it into a message object that the system can recognize, and writing the message object into the send queue. At the same time, the computing thread finds all downstream model components according to the routing table and records the current working state in the system frame log. The streaming processing topology device 104 processes data uniformly in the MessageObject format, shielding the different data formats of the upstream application systems 101. For an upstream application system 101 with a different data format, only the relevant parsing code in this thread needs to change, so the rest of the streaming processing topology device 104 remains loosely coupled from these format differences.
Specifically, the sending thread is responsible for selecting a certain model working node of the downstream component according to a selection policy (random mode, polling mode or hash mode) specified by a user, forwarding the data record to the model working node, and recording the current working state in a system framework log.
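The three selection policies mentioned above can be sketched in Java as follows; the class and method names are illustrative.

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;

/** Sketch of the three forwarding strategies: random, polling (round-robin) and hash. */
public class NodeSelector {
    private final AtomicLong counter = new AtomicLong();

    public String random(List<String> nodes) {
        return nodes.get(ThreadLocalRandom.current().nextInt(nodes.size()));
    }

    public String roundRobin(List<String> nodes) {
        return nodes.get((int) (counter.getAndIncrement() % nodes.size()));
    }

    /** Hash mode: the same record primary key always maps to the same downstream node. */
    public String hash(List<String> nodes, String recordPrimaryKey) {
        return nodes.get(Math.floorMod(recordPrimaryKey.hashCode(), nodes.size()));
    }
}
```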
The model component 1042: in the system, one or more model components can serially exist in one data processing link, and each model component is responsible for processing data records according to business logic specified by an upper layer application and forwarding the data records to a downstream model component or a forwarding component. Thus, if a plurality of model assemblies exist in one link, the data are sequentially processed. In the first embodiment, the model component includes multiple types, for example, model component 2 → model component 3 in the above path, and model component 2 and model component 3 have a set of different data processing logic inside, and the data processing logic is specified and injected by the upper layer application through the service processing interface. After the working node of the model component 2 finishes data processing, the intermediate data is forwarded to the working node of the model component 3, and after the working node of the model component 3 finishes data processing, the final result is forwarded to a downstream forwarding component.
Similar to the collection component, one model component corresponds to a plurality of model working nodes. In each model working node, a receiving thread is responsible for receiving data records in a MessageObject message object format sent by an upstream component and recording a frame log; the calculation thread is responsible for processing the data record according to the established data processing logic, and after the data record is processed, the content of the data record changes; and the sending thread is responsible for forwarding the new data record to a working node of the downstream component according to a set strategy and recording the current operation information in the frame log.
And a forwarding component 1043, configured to forward the processed data to the corresponding downstream application system 105. The forwarding component has a similar working mode to that of the model component, but serves as the last ring of the streaming processing topology device, and the sending thread forwards the processing result of the data record to the downstream application system 105 outside the system according to the IP port of the downstream application system 105, and records the current running information in the frame log.
Fig. 5 is a schematic structural diagram of a streaming processing topology apparatus according to another embodiment of the present invention. As shown in fig. 5, the difference from the previous embodiment is that in the present embodiment, the collection component 1041 is connected to a plurality of model components 1042 at the same time, that is, one piece of data can match one or more calculation models. Under the default condition, the acquisition component can send the data to all the associated downstream model components, and the model component A and the model component B can process the same data; in certain cases, the collection component, via the data mapping table, can send data to the associated downstream model component according to the data type. The system has the practical significance that the system supports different business processing on the same data.
In order to allow a piece of data to match an arbitrary computation model, the working-component forwarding routing table of the system information persistence apparatus 108 may define multiple forwarding paths. Taking table 3 as an example, two forwarding paths can be derived:
forwarding path 1: acquisition Module 1 → model Module 2 → Forwarding Module 4
Forwarding path 2: acquisition Module 1 → model Module 3 → Forwarding Module 5
Table 3: working component Forwarding routing Table example two
In this embodiment, the upstream application 101 needs to add a field of the data record type when generating the data record. Accordingly, a mapping table must be established in the system information persistence device 108 of the present system to supplement the matching information of the data type and the downstream model component ID. The data mapping table is shown in table 4:
table 4: data mapping table
Data type Downstream model component ID
A 2
A 3
B 2
C 3
When the collection working node forwards data, it selects the downstream components by extracting the type of the received data record, comparing it against the data mapping table, and finding the list of downstream component IDs that match that type. For example, a data record of data type A matches downstream model components 2 and 3, and therefore matches forwarding path 1 and forwarding path 2; a data record of data type B matches only downstream model component 2, so after receiving data of this type the collection component sends it along forwarding path 1 only.
It should be noted that for multiple forwarding paths carrying the same data type, only the collection component is shared; the subsequent model components and forwarding components differ, so the ID of the forwarding component can be used to identify the unique forwarding path of a specific data type.
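A small sketch of the data-mapping lookup, using the contents of Table 4, is shown below for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Maps a record's data type to the matching downstream model components (contents of Table 4). */
public class DataMappingLookup {
    public static void main(String[] args) {
        Map<String, List<Integer>> mapping = new HashMap<>();
        mapping.put("A", List.of(2, 3));
        mapping.put("B", List.of(2));
        mapping.put("C", List.of(3));

        String dataType = "A";
        List<Integer> targets = mapping.getOrDefault(dataType, new ArrayList<>());
        System.out.println("Forward type " + dataType + " to model components " + targets);  // [2, 3]
    }
}
```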
Fig. 6 is a schematic structural diagram of a system guard device according to an embodiment of the present invention. As shown in fig. 6, the system guard device 107 includes: a monitoring management unit 1071, a failure recovery unit 1072, and a data recovery unit 1073. Wherein,
a monitoring management unit 1071, configured to monitor an operating state of a working node of each component in the system, and periodically receive a message from the heartbeat detection device, where the message content includes a working node list that has successfully sent heartbeats in the period, and the monitoring management unit writes the working node list into the system information persistence device;
a failure recovery unit 1072, configured to periodically compare a list of working nodes that successfully send heartbeats and are recorded in the system information persistence device with all working nodes of the streaming processing topology device, find a working node that does not send a heartbeat, and for a working node that does not send a heartbeat in a preset time interval, determine that the working node fails, write information of the failed working node into the system information persistence device, find server IP port information and configuration information of the failed node from the system information persistence device, and remotely restart the destination server;
The data recovery unit 1073 is configured to periodically read the data records lost during system operation from the system information persistence device and recover them one by one in a fine-grained manner. Specifically, it first reads the component ID recorded when the data was lost from the system information persistence device, determines, according to the configured policy, the working node ID of that component to which the data should be retransmitted, and finds the server IP and port of that working node; then, when the data recovery unit reads the lost data record, it re-encapsulates the record text into a message object and retransmits it to the corresponding component working node of the streaming topology device according to the IP and port of the destination server.
Fig. 7 is a flow chart of a highly available streaming processing method according to an embodiment of the invention. As shown in fig. 7, the method includes:
step S1, the upstream application system generates the original data in real time and sends the data to the receiving device one by one;
step S2, the receiving device forwards the original data to the message queue of the message middleware device, and the message middleware device caches the data for the streaming processing topology device in the form of the message queue;
step S3, the streaming processing topology device is started, the working nodes of the acquisition component, model components and forwarding component are initialized, and the working-component forwarding routing information and data mapping information are loaded into memory as common data;
step S4, the heartbeat detection device and the system guard device guard the normal operation of the system in real time: the heartbeat detection device monitors all working nodes of the system, and when the system guard device detects that a working node has failed it remotely restarts it;
step S5, the collection component of the stream processing topology device reads and processes the original data from the message queue of the message middleware device and converts the original data into a message object, and the message object is transmitted to one or more downstream model components in the stream processing topology device according to the forwarding routing information and the data mapping information of the working component;
step S6, the model component of each data processing link branch processes the data record with the service logic specified by the upper-layer application system and forwards the result to the downstream model component or forwarding component according to the working-component forwarding routing information and data mapping information; each data processing link branch has one or more model components, and the last model component forwards the data to a forwarding component;
step S7, the forwarding component forwards the data to the downstream application system;
in steps S5, S6, and S7, the working nodes of all the components will record the current processing state in the working log in real time for the subsequent data recovery component to find the lost data record;
in step S8, the log reconciliation device periodically searches for the data records whose forwarding failed in steps S5, S6 and S7, and the system guard device recovers and retransmits these data records.
Compared with the embodiment shown in fig. 5, the single data link of the embodiment shown in fig. 2 is simpler in design, and the characteristic of multi-branch data transmission cannot be highlighted, so the above steps are described in the case of the embodiment shown in fig. 5.
Fig. 8 is a schematic diagram of the system guarding process of step S4 according to an embodiment of the present invention. This flow is the failure recovery part of the highly available streaming method. Because of the structural complexity of a distributed system, faults in the distributed working nodes are difficult to avoid during operation. To keep the system running stably and durably, it is particularly important to respond to failed working nodes promptly and recover them quickly. Specifically, the system implements the heartbeat detection device, the monitoring management unit and the fault recovery unit, which perform their own functions and cooperate to monitor and collect the heartbeats of the working nodes of the streaming topology device, find failed nodes and restore their operation.
As shown in fig. 8, step S4 further includes:
Step S401, when the working nodes of the streaming processing topology device are working normally, they periodically send heartbeats to the heartbeat detection device, and the heartbeat detection device periodically collects the working nodes from which heartbeats were received into a list and forwards it to the monitoring management unit.
Step S402, the monitoring management unit collects a working node list that successfully sends a heartbeat, and writes the working node list into the system information persistence apparatus.
Specifically, the monitoring management unit receives a message from the heartbeat detection device, and the content of the message includes a list of working nodes that have successfully sent heartbeats in the period. The unit writes the list to the system information persistence means 108.
In step S403, the failure recovery unit searches for a failed node and remotely restarts the failed node.
Specifically, the fault recovery unit compares the list of working nodes recorded in the system information persistence device as having successfully sent heartbeats with all working nodes, searches for working nodes that did not send a heartbeat, and judges that a working node has failed if it has not sent a heartbeat within a preset time interval;
the fault recovery unit then writes the failed working node information into the system information persistence device, finds the server IP and port information and the configuration information of the failed node from the system information persistence device, and remotely restarts the destination server using the SSH protocol.
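A hedged Java sketch of this fault-detection and restart logic follows. The ssh command line, user name, start-script path and helper methods are assumptions; the patent states only that the comparison is periodic and that the remote restart uses the SSH protocol.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Sketch: compare the nodes that reported heartbeats against all registered nodes
 * and restart those that have been silent for longer than the preset interval.
 */
public class FailureRecoverySketch {

    public void checkAndRecover(Set<String> allNodes, Set<String> heartbeatNodes) throws Exception {
        Set<String> suspected = new HashSet<>(allNodes);
        suspected.removeAll(heartbeatNodes);                 // nodes that did not send a heartbeat this period
        for (String nodeId : suspected) {
            if (missedForLongerThanThreshold(nodeId)) {      // preset time interval check
                String ip = lookupServerIp(nodeId);          // from the system information persistence device
                String startCommand = lookupStartCommand(nodeId);
                new ProcessBuilder("ssh", "ops@" + ip, startCommand).start();   // remote restart over SSH
            }
        }
    }

    // Placeholders standing in for persistence-device lookups and bookkeeping.
    private boolean missedForLongerThanThreshold(String nodeId) { return true; }
    private String lookupServerIp(String nodeId) { return "109.251.13.179"; }
    private String lookupStartCommand(String nodeId) { return "/opt/stream/bin/start-node.sh " + nodeId; }
}
```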
Fig. 9 is a schematic diagram of a data forwarding flow of the collection component and the model component work node in steps S5 and S6 according to an embodiment of the present invention. When the working node of the upstream component forwards the data to the downstream component, a certain strategy is adopted to select the working node which can normally work in the downstream component, and the downstream working node with the fault is isolated in time.
As shown in fig. 9, the specific steps include:
in step S501, the upstream working node maintains common information related to data forwarding.
The memory of each upstream working node maintains the routing forwarding information, the downstream component node IP and port information, and the data mapping information of that node, all of which are read from the system information persistence device 108 when the working node is initialized. The routing forwarding information records the set of downstream components to which the working node's component can forward, and the data mapping information records the downstream components matched by different data types; this information is held as common information that the internal threads can access.
Step S502, the working node sends data to the downstream working node according to the established strategy based on the public information.
When data is forwarded, in order to keep the load on downstream working nodes balanced, the upstream working node selects a node from the set of normal downstream nodes according to a configured strategy; the supported forwarding strategies are the random mode, the polling mode and the hash mode. The random mode selects a downstream working node at random; the polling mode selects one according to a Round-Robin rule; the hash mode determines the working node from the hash value of the current data record. After the working node forwards the data, two situations may arise: if the data is sent successfully, the next piece of data is processed; if feedback indicates that sending failed, the process goes to step S503.
In step S503, the upstream working node identifies the failed node and excludes it from the downstream normal node set.
If a downstream working node fails, the upstream working node receives a send-failure feedback when it next sends data to that node and thereby learns of the downstream failure. The upstream node marks the faulty node and excludes it from the set of normal downstream nodes used by subsequent strategy selection.
Step S504, the upstream working node periodically detects whether the downstream fault node recovers the operation.
The upstream node starts a thread that periodically obtains heartbeat information from the isolated node to confirm whether it has recovered. If at some moment the isolated node feeds back a heartbeat to the upstream node, the upstream node considers the node reconnected and adds it back into the set of normal downstream nodes available to strategy selection.
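A Java sketch of this isolation and reconnection behaviour (steps S503 and S504) is given below; the probe interval, data structures and the stubbed heartbeat check are assumptions.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/**
 * On a send failure the downstream node is moved to an isolated set; a periodic task
 * probes isolated nodes and re-admits those that answer again.
 */
public class DownstreamNodeIsolation {
    private final List<String> normalNodes = new CopyOnWriteArrayList<>();
    private final Set<String> isolatedNodes = ConcurrentHashMap.newKeySet();
    private final ScheduledExecutorService probe = Executors.newSingleThreadScheduledExecutor();

    public DownstreamNodeIsolation(List<String> initialNodes) {
        normalNodes.addAll(initialNodes);
        probe.scheduleAtFixedRate(this::recheckIsolatedNodes, 10, 10, TimeUnit.SECONDS);
    }

    /** Called when forwarding to a node fails (step S503). */
    public void markFailed(String node) {
        normalNodes.remove(node);
        isolatedNodes.add(node);
    }

    /** Periodically probes isolated nodes and restores those that respond (step S504). */
    private void recheckIsolatedNodes() {
        for (String node : isolatedNodes) {
            if (respondsToHeartbeat(node)) {
                isolatedNodes.remove(node);
                normalNodes.add(node);
            }
        }
    }

    private boolean respondsToHeartbeat(String node) { return false; }   // stub: real probe goes here
}
```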
Fig. 10 is a schematic diagram of the data recovery process in step S8 according to an embodiment of the invention. Data may be lost due to network transmission failures, working node failures, and the like. For data lost during transmission, the system introduces mechanisms such as persistence of intermediate data results and log reconciliation to accurately locate, find, and recover it.
As shown in fig. 10, the specific steps include:
Step S801, the working node of each link records intermediate results in a log in real time while processing data records.
Each working node in the system records the intermediate results of data processing in real time, divides the log into frame logs of one-minute length, and stores the frame logs in the massive log management device. To further prevent log data from becoming irretrievable through accidental loss, the system also stores multiple backups of each frame log, which will be described in detail with reference to fig. 14 below. In addition, the invention designs a set of rules for recording the intermediate results of data within components in real time. As shown in the normal-operation log recording rule diagram of the working node in fig. 11, the invention uses the identifiers H (head), S (sender), R (receiver), and T (tail) to record the real-time status of data as a working node processes it normally.
H denotes the log information recorded when the computing thread of a collection working node parses a binary-stream data record. The H log information includes: the identifier H, the primary key of the data record, the ID of the collection component where it is located, the ID of the collection working node where it is located, the list of IDs of the model components to which it is forwarded downstream, the data record content, and the current processing timestamp;
S denotes the log information recorded when the sending thread of a collection working node or model working node forwards a data record. The S log information includes: the identifier S, the primary key of the data record, the ID of the node where it is located, the ID of the component to which it is forwarded downstream, the data record content, and the current processing timestamp;
R denotes the log information recorded when a model working node or forwarding working node receives a data record. The R log information includes: the identifier R, the primary key of the data record, the ID of the component where it is located, the ID of the node where it is located, and the current processing timestamp;
T denotes the log information recorded when the sending thread of a forwarding working node finally forwards a data record to a downstream application system. The T log information includes: the identifier T, the primary key of the data record, the ID of the forwarding component, the ID of the forwarding working node, and the current processing timestamp.
It should be noted that after each data record has successfully flowed through the streaming data topology apparatus 4, the log records generated for that data record must include an entry with the H identifier and an entry with the T identifier. Thus, in subsequent log mining analysis, once both the H and T identifiers of a data record are found, it can be concluded that the data record was successfully forwarded to a downstream application system. This is one of the core functions of the log reconciliation device.
In addition, a component may also record certain abnormal conditions at runtime. As shown in the abnormal-operation log recording rule diagram of the working node in fig. 12, the invention uses the abnormal identifiers OR (overflow receiver) and OS (overflow sender) to record the abnormal working states of a working node. A working node contains a receiving queue and a sending queue that buffer data records; as shown in the internal diagram of the working node in fig. 4, the receiving thread and the computing thread respectively write data into these queues. If a queue is full, continuing to write data into it may cause a memory overflow. To avoid overflow, a working thread stops inserting data into a queue once it finds the queue full, writes the intermediate result state of the data directly into the frame log, and does not process the data further.
OR indicates that the receiving thread of a working node found the receiving queue full when writing data into it. The OR log information includes: the identifier OR, the primary key of the data record, the ID of the forwarding component where it is located, the ID of the forwarding working node where it is located, and the current processing timestamp.
OS indicates that the computing thread of a working node found the sending queue full when writing data into it. The OS log information includes: the identifier OS, the primary key of the data record, the ID of the downstream component to which it is forwarded, the ID of the forwarding working node where it is located, and the current processing timestamp.
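Purely for illustration, the H, S, R, T, OR, and OS entries described above might be serialized as pipe-separated text as in the Python sketch below; the field order follows the description, but the text encoding and the helper names are assumptions rather than a format fixed by the patent.

```python
import time

def frame_clock_now():
    # Placeholder for the frame unified clock described later; here simply the wall clock in ms.
    return int(time.time() * 1000)

def h_log(record_key, collect_comp, collect_node, downstream_model_ids, content):
    return "|".join(["H", record_key, collect_comp, collect_node,
                     ",".join(downstream_model_ids), content, str(frame_clock_now())])

def s_log(record_key, node_id, downstream_comp, content):
    return "|".join(["S", record_key, node_id, downstream_comp, content, str(frame_clock_now())])

def r_log(record_key, comp_id, node_id):
    return "|".join(["R", record_key, comp_id, node_id, str(frame_clock_now())])

def t_log(record_key, fwd_comp, fwd_node):
    return "|".join(["T", record_key, fwd_comp, fwd_node, str(frame_clock_now())])

# OR / OS entries for the overflow cases:
def or_log(record_key, comp_id, node_id):
    return "|".join(["OR", record_key, comp_id, node_id, str(frame_clock_now())])

def os_log(record_key, downstream_comp, node_id):
    return "|".join(["OS", record_key, downstream_comp, node_id, str(frame_clock_now())])
```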
In step S802, the log reconciliation device periodically searches for lost data.
The log rules designed by the invention record processing states and intermediate results in detail, so the transmission path of a data record through the streaming processing topology device can be reconstructed from the identifier, component ID, node ID, and timestamp of each log record. For a lost data record, the transmission path is interrupted at some working node. To handle this, the system borrows the idea of reconciliation from accounting and designs a log pairing algorithm to find the data content to be recovered and the ID of the component where it was lost. The specific algorithm flow is developed in the embodiment of fig. 13 described later and is not repeated here. After finding the lost data, the log reconciliation device converts the data record information at the time of loss into text content and writes it into the system information persistence device together with information such as the ID of the component where the data was lost.
Furthermore, the log reconciliation device implements the log pairing algorithm with Spark. Spark, as a general-purpose parallel computing framework, performs its work in memory on top of a distributed memory abstraction, and combines the computing characteristics of MapReduce with fast data processing. After finding the lost data information, the log reconciliation device writes the content of the lost data record and the ID of the component where it was lost into the system information persistence device.
Further, for lost data found in a given one-minute time batch, the log reconciliation device can be configured to keep searching for the corresponding T identifier information across several subsequent batches (the number of batches can be preset). If the T identifier information is found, the data is considered not lost and the data recovery operation is cancelled; if not, the process goes to step S803.
In step S803, the data recovery unit periodically recovers the missing data records.
The data recovery unit periodically accesses the system information persistence device; if the persistence device stores lost data record information, the data recovery unit recovers the records one by one. When recovering data, the data recovery unit sends the lost data to the component where it was lost and lets it continue downstream from that component, which shortens the chain of components the data must traverse and improves efficiency. The procedure is as follows: the data recovery unit first obtains the ID of the component where the data was lost, selects a working node ID of that component according to a configured strategy (random, polling, or hash), and has it reprocess the data to be recovered. From the working node ID, the data recovery unit obtains the IP address and port of that working node. The recovery unit then reads the content of the lost data, packages it into a message object, sends it to the destination working node, and records the related sending information in the frame log. After a successful transmission, the data recovery unit deletes the backup of that data record from the information persistence device. After receiving the data, the destination working node processes it and forwards it to downstream nodes according to the normal workflow, so the lost data re-enters the streaming topology device 4 for streaming processing.
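The following Python sketch outlines the recovery loop described above. The `persistence`, `selector_for`, `node_address`, and `frame_log` objects are assumed interfaces introduced only for the example, and the raw socket send stands in for whatever transport the system actually uses.

```python
import json
import socket

def recover_lost_records(persistence, selector_for, node_address, frame_log):
    """Sketch of the data recovery unit: read lost records from the persistence
    store, pick a working node of the component where the loss occurred, resend
    the record as a message object, then delete the backup."""
    for entry in persistence.list_lost_records():
        comp_id = entry["component_id"]              # component where the data was lost
        node_id = selector_for(comp_id).pick("polling")
        ip, port = node_address(node_id)

        message = json.dumps({"key": entry["record_key"],
                              "payload": entry["content"]}).encode("utf-8")
        with socket.create_connection((ip, port), timeout=5) as conn:
            conn.sendall(message)                    # repackaged message object

        # Record the resend in the frame log, then drop the backup after success.
        frame_log.write(f"RECOVERY|{entry['record_key']}|{comp_id}|{node_id}")
        persistence.delete_lost_record(entry["record_key"])
```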
Fig. 13 is a flowchart of an algorithm for searching and locating the missing data record in step S802 according to an embodiment of the present invention. As shown in fig. 13, the specific steps include:
Step S8021, the log information of data records sharing the same primary key in the frame log is gathered into a set; in subsequent mining each set can be used to draw the forwarding path of that primary key's data record, and it serves as the data analysis source for this round.
Step S8022, analyze the log information set of each primary key, find the log information bearing the H identifier and the T identifier in the set, and check for data loss. According to the embodiment shown in fig. 5, the several forwarding paths of the data through the system can be obtained from the data mapping table and the routing table based on the data record type; the complete set of forwarding paths for the data is denoted PathSet. The H log information records the processing state of the data at the head node, including the list of model component IDs to which it was forwarded downstream, i.e. the number of forwarding paths, and each forwarding path corresponds to one piece of T log information. Two cases are possible: if the number of IDs in the list equals the number of T logs, all records of this primary key were sent successfully during transmission; if they are unequal, records of this primary key are considered lost during transmission, and the lost paths and the component positions must be located further.
Step S8023, locate the paths on which the lost data was lost during transmission. This step uses an elimination strategy: the data is first assumed to have been lost on all paths, so the set of failed forwarding paths initially contains every path involving the data. If T log information is then found for a given path, that path in fact lost nothing, and the path corresponding to that T log is removed from the set of failed forwarding paths. The procedure is as follows: a failed forwarding path set FailedPathSet is created, whose paths are copied from the PathSet generated in step S8022 at initialization. Because data forwarding paths and T logs are in one-to-one correspondence, the ID of the forwarding component can be obtained from each T log, and any path in FailedPathSet in which that ID appears is a successful forwarding path and is deleted from FailedPathSet. After every known T log has been analyzed in this way, the paths remaining in FailedPathSet are all failed forwarding paths.
Step S8024, locate the component at which the data loss occurred on each failed forwarding path. Given a failed forwarding path, extract from the frame log the set of S or OS identifier logs related to the components on that path, and analyze the component nodes one by one in the reverse direction of the path (forwarding component → model component → … → collection component) to find the last component, in the forward direction of the path, that appears in the set. The purpose of step S8024 is to locate the first failed component on the failed forwarding path: first examine the model component immediately upstream of the last forwarding component and check whether it generated an S or OS log; if it did not, that component is a failed component and its own upstream component is examined next in the same way; once a component with an S or OS log is found, the search stops. The component located by this method is the last normal component on the failed forwarding path, and the component immediately downstream of it is the first failed component, i.e. the sending destination of the lost data in step S803 (the lost data recovery step). The S or OS log generated by the last normal component contains the lost data record, i.e. the data to be recovered.
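A simplified Python sketch of steps S8021 to S8024 follows. The parsed-log field names ('tag', 'key', 'component', and 'downstream_ids' for H entries) and the `path_set_for` helper are assumptions for the example; the actual implementation runs as a Spark job over the frame logs.

```python
from collections import defaultdict

def find_lost_data(frame_log_entries, path_set_for):
    """Simplified sketch of the log pairing algorithm (steps S8021-S8024).
    `frame_log_entries` is an iterable of parsed log dicts; `path_set_for(key)`
    returns all forwarding paths (ordered component-ID lists, collection first)
    for that record. Returns the records to recover and where to resend them."""
    # S8021: group log entries by the data record's primary key.
    by_key = defaultdict(list)
    for entry in frame_log_entries:
        by_key[entry["key"]].append(entry)

    lost = []
    for key, entries in by_key.items():
        h_logs = [e for e in entries if e["tag"] == "H"]
        t_logs = [e for e in entries if e["tag"] == "T"]
        if not h_logs:
            continue
        # S8022: equal counts of downstream IDs and T logs means nothing was lost.
        if len(h_logs[0]["downstream_ids"]) == len(t_logs):
            continue

        # S8023: assume all paths failed, then remove those confirmed by a T log
        # (paths and T logs are assumed one-to-one via the terminal forwarding component).
        confirmed = {t["component"] for t in t_logs}
        failed_paths = [list(p) for p in path_set_for(key) if p[-1] not in confirmed]

        # S8024: on each failed path, find the last component (in forward order)
        # that produced an S or OS log; the next one is the first failed component.
        sent_from = {e["component"] for e in entries if e["tag"] in ("S", "OS")}
        for path in failed_paths:
            last_normal = None
            for comp in path:
                if comp in sent_from:
                    last_normal = comp
            if last_normal is not None:
                idx = path.index(last_normal)
                first_failed = path[idx + 1] if idx + 1 < len(path) else last_normal
                lost.append({"key": key, "resend_to": first_failed})
    return lost
```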
FIG. 14 is a flow chart of log collection for the high availability streaming system according to an embodiment of the invention. As can be seen from the foregoing embodiments, the system takes several measures to make log collection practical. So as not to affect the normal work of component working nodes, a working node sends non-core log content to the log collector asynchronously. Further, to guard against data loss caused by asynchronous transmission, a multiple-backup logging mechanism can be adopted: log collectors are deployed on several different servers to receive the log data generated by the component working nodes, and before subsequent mining the multiple copies of the logs are merged and deduplicated, ensuring data integrity to the greatest extent. The logs are divided into one-minute batches, and each log collector writes the log data it has collected into a log file every 60 seconds. When one minute switches to the next, the local physical clocks of different servers may disagree, so different log collectors may assign the same log record to different time batches while processing it. Consequently, when the multiple log copies are later merged, the same log record can appear in several time batches, causing inconsistent file contents.
To avoid these problems, the invention further introduces an odd-even queue for managing log collection within a time batch: the current minute is designated as odd or even, and the frame logs generated at the current time are placed into the corresponding odd or even queue.
As shown in fig. 14, the specific steps include:
Step S1401: when a component working node generates a frame log record, it sends the log content together with its odd/even identifier to the two log collectors.
When a component working node generates log record content, it obtains the current frame unified clock, determines the parity of the log record from the parity of the minute, and tags the record with a parity identifier. The component working node sends the parity identifier together with the log record content. When the log collector receives logs, core and non-core logs are collected differently. The core log record is the H log; an H log indicates that a data record has entered the streaming topology device and is recorded synchronously, i.e. the working node must wait for the log collector's receipt acknowledgement before continuing with the next operation, which costs extra working time while waiting. The non-core logs are all logs other than the H log and are handled asynchronously, i.e. the component working node continues its next step without waiting for the log collector's acknowledgement.
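For illustration, the sender-side behaviour of step S1401 (synchronous H logs, asynchronous non-core logs, parity tagging) might look like the Python sketch below; the `collector.send(record, parity)` interface and the retry loop are assumed details, not part of the patent.

```python
import queue
import threading
import time

class LogSender:
    """Sketch of the working-node side of log collection: H logs are sent
    synchronously (wait for the collector's acknowledgement), all other logs
    asynchronously via a background thread. `collector.send(record, parity)`
    is an assumed interface returning True on acknowledgement."""

    def __init__(self, collector):
        self.collector = collector
        self._async_q = queue.Queue()
        threading.Thread(target=self._drain, daemon=True).start()

    @staticmethod
    def parity_of(frame_clock_ms):
        minute = frame_clock_ms // 60000
        return "odd" if minute % 2 else "even"

    def emit(self, record, frame_clock_ms):
        parity = self.parity_of(frame_clock_ms)
        if record.startswith("H|"):
            # Core log: block until the collector acknowledges receipt.
            while not self.collector.send(record, parity):
                time.sleep(0.1)
        else:
            # Non-core log: enqueue and return immediately.
            self._async_q.put((record, parity))

    def _drain(self):
        while True:
            record, parity = self._async_q.get()
            self.collector.send(record, parity)
```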
Step S1402: the log collector places log records into the corresponding queues according to their parity identifiers.
As seen in step S1401, the parity of a log record is determined when the working node records it. When the log collector receives a log record, it reads the record's parity identifier and buffers the record in the queue of the corresponding parity. Each log collector internally maintains an odd queue and an even queue: one is the 'active queue' of the current time batch, which receives the log records of the current batch during normal work, and the other is the 'stalled queue', whose log record data from the previous time batch is written into the frame log. Whenever the log collector's local physical time completes another 60 seconds, the collector swaps the active and stalled roles of the odd and even queues.
Step S1403: and setting buffer time, and writing the queue data into a frame log file by the log collector after the current time is switched for a period of time.
The system sets a buffer time of 4 seconds to wait for log records sent in the previous time batch. A working node may have sent log data within the last few seconds of the previous batch, yet by the time the log collector receives it, the active queue has already switched to the other queue because of network delay or similar causes. After the queue corresponding to a batch becomes the stalled queue, its data is about to be written into the frame log of that batch. The log collector therefore holds the stalled queue for 4 seconds to wait for 'late' log records of the corresponding batch. During the first 4 seconds after each switch, the stalled queue receives late log records while the active queue receives the normally arriving log records of the current batch. After the 4 seconds, the log collector begins writing the stalled queue data to the frame log file.
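A minimal Python sketch of the collector side (steps S1402 and S1403) is shown below, with the 60-second batch switch and the 4-second buffer for late records. The queue representation, file naming, and thread layout are assumptions made for the example.

```python
import threading
import time

class ParityQueueCollector:
    """Sketch of a log collector with odd/even queues. Every 60 seconds the
    active and stalled roles swap; the stalled queue is held for a buffer time
    to catch late records before its contents are written to the frame log
    file of the finished batch."""

    def __init__(self, buffer_s=4):
        self.queues = {"odd": [], "even": []}
        self.buffer_s = buffer_s
        self._lock = threading.Lock()
        threading.Thread(target=self._switch_loop, daemon=True).start()

    def receive(self, record, parity):
        # Records go to the queue named by the parity tag set at the sender.
        with self._lock:
            self.queues[parity].append(record)

    def _switch_loop(self):
        while True:
            # Sleep until the next full minute of local physical time.
            time.sleep(60 - time.time() % 60)
            finished_minute = int(time.time() // 60) - 1
            stalled = "odd" if finished_minute % 2 else "even"
            # Hold the stalled queue for the buffer time to accept late records.
            time.sleep(self.buffer_s)
            with self._lock:
                batch, self.queues[stalled] = self.queues[stalled], []
            self._write_frame_log(finished_minute, batch)

    def _write_frame_log(self, minute, records):
        with open(f"frame-{minute}.log", "a", encoding="utf-8") as f:
            f.write("\n".join(records) + "\n")
```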
In summary, the parity queues ensure that the same log record appears in the same batch across the multiple backup log files. Several log files are obtained by the above method, and ideally each of them records the same content, but it is difficult to rule out accidents in which part of the log data is lost during collection because of network failures or the like. The multiple log copies are therefore gathered, merged, and deduplicated, ensuring data integrity to the greatest extent.
Furthermore, to keep the clocks of the server cluster on which the working nodes are deployed consistent, the system adopts an innovative frame unified clock. Specifically, as described above, each node in the streaming processing topology device records a timestamp when a data record passes through it, and these timestamps are later used to determine data loss. Since the local physical clocks of the cluster servers cannot be guaranteed to be perfectly aligned, the H identifier log of a data record may carry a later time than its T identifier log. If the physical clock error between servers exceeds the buffer time, the T identifier log of the data record is placed into the log file of the previous time batch because its time appears early, while the H identifier log is placed into the next batch; the inaccurate log division then makes the subsequent search for lost data inaccurate. The invention therefore provides a method for approximately synchronizing a clock with the physical clock across the communication network, called the frame unified clock in this system. The method is as follows:
In the first step, the first working node records the local server physical times t1 and t2 at the start and end of processing a data record respectively, obtains the time offset T1 = t2 - t1, and thereby determines its two frame clocks as t1 and t1 + T1.
In the second step, the second working node records the local server physical times t3 and t4 at the start and end of processing the same data record respectively, obtaining the time offset T2 = t4 - t3; assuming the time spent transmitting the data from the first working node to the second node is 0, the second node obtains its two frame clocks as t1 + T1 and t1 + T1 + T2.
By analogy, the frame clock obtained by a later node is always later than that obtained by the preceding node. Regardless of whether the servers' local physical clocks carry errors, the frame unified clock can be used to approximate a common physical clock. The timestamps recorded in the log files are frame clock values.
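As a worked illustration of the frame unified clock, the short Python sketch below chains the per-node offsets under the zero-transmission-time assumption stated above; the function name and the millisecond example values are illustrative only.

```python
def frame_clocks(start_frame_clock, processing_spans):
    """Compute frame unified clocks along a chain of working nodes.
    `processing_spans` is a list of (t_start, t_end) local physical times,
    one pair per node in processing order; transmission time between nodes
    is assumed to be 0. Returns the (begin, end) frame clock pair of each node."""
    clocks = []
    current = start_frame_clock          # t1 of the first node
    for t_start, t_end in processing_spans:
        offset = t_end - t_start         # Ti = end - start on that node's local clock
        clocks.append((current, current + offset))
        current += offset                # the next node starts where this one ended
    return clocks

# Example with illustrative physical times in milliseconds: node 2's local clock
# lags node 1's, yet the frame clocks still grow monotonically along the chain.
spans = [(100000, 100020), (95500, 95530)]
print(frame_clocks(100000, spans))
# -> [(100000, 100020), (100020, 100050)]
```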
The high availability streaming processing system and method provided by the invention can be applied to a high-performance data processing platform, can process data generated by an upstream application system at the second level, and adopt multiple measures to improve the stability of the system. The advantages include:
1. The system and method apply a loosely coupled design to the basic framework modules and the specific business logic modules, can simultaneously support multiple high-speed real-time data analysis and processing scenarios, provide solid underlying support for complex and varied business models, and ensure that data is processed quickly and effectively.
2. The log reconciliation method provided by the system and method ensures that every piece of data generated by an upstream application system is processed, and that data lost due to anomalies is accurately located to a specific position and business model.
3. The system and method monitor the working nodes of the distributed cluster in real time, respond to failed nodes promptly, and recover them quickly, ensuring that the system continues to serve external parties normally.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (12)

1. A high availability streaming processing system, the system comprising:
the upstream application system is used for generating original data in real time according to the process of the upstream business application and sending the original data to the receiving device one by one;
receiving means for forwarding the raw data to a message queue of the message middleware apparatus;
the message middleware device is used for caching data for the streaming processing topology device in a message queue form;
the stream processing topology device is used for processing the data stream of the original data and generating a data result;
and the downstream application system corresponds to the service scene and is used for carrying out subsequent processing on the data result.
2. The high availability streaming processing system of claim 1, wherein the streaming processing topology device comprises: a collection component, model components, and a forwarding component, wherein each component corresponds to a plurality of working nodes; wherein,
the collection component is used for reading the original data records of the upstream application system from the message queue of the message middleware device, converting the original data into the MessageObject message object format on which the working nodes of the internal components can operate, and forwarding the converted data to the downstream model components according to the setting of the routing table;
the model components are serially arranged within a data processing link, each model component being used for processing the converted data according to the business logic specified by the upper-layer application system to generate a data result and transmitting the data result to a downstream model component or the forwarding component; if a data processing link has a plurality of model components, the converted data is processed by them in sequence;
and the forwarding component is used for forwarding the data result to the corresponding downstream application system.
3. The high availability streaming system in accordance with claim 2, further comprising:
the heartbeat detection device is used for periodically receiving heartbeat information from the working nodes of the streaming processing topology device and forwarding the list of working nodes that sent heartbeats to the system guard device;
the system guard device is used for receiving the list information sent by the heartbeat detection device and monitoring the running state of the streaming processing topology device; it is also used for restarting failed working nodes to restore their normal work, and for recovering, at fine granularity, data lost during system operation.
4. The high availability streaming system in accordance with claim 3, further comprising:
the system information persistence device is used for managing and maintaining information of related components and working nodes in the system, recording the running state of each working node and caching a data record to be recovered.
5. The high availability streaming system in accordance with claim 4, further comprising:
the massive log management device is used for collecting and storing the frame logs for subsequent mining;
and the log reconciliation device is used for finding data records lost during processing by periodically reading the frame logs in the massive log management device with a mining algorithm based on a reconciliation mechanism, and finally writing the lost data record information into the system information persistence device for recovery processing.
6. The high availability streaming system according to claim 4, wherein the system guard comprises a monitoring management unit, a failure recovery unit, a data recovery unit; wherein,
the monitoring management unit is used for monitoring the running state of the working nodes of each component in the system and periodically receiving messages from the heartbeat detection device, the message content comprising the list of working nodes that successfully sent heartbeats in that period, and for writing that working node list into the system information persistence device;
the fault recovery unit is used for periodically comparing the list of working nodes recorded in the system information persistence device as having successfully sent heartbeats with all working nodes of the streaming processing topology device, searching for working nodes that did not send heartbeats, judging that a working node which fails to send a heartbeat within the preset time interval has failed, writing the failed working node's information into the system information persistence device, finding the server IP and port information and the configuration information of the failed node from the system information persistence device, and remotely restarting the destination server;
and the data recovery unit is used for regularly reading the data records lost in the system operation process from the system information persistence device and performing fine-grained recovery one by one.
7. The high availability streaming system according to claim 6, wherein the data recovery unit is configured to read the ID of the component where data was lost from the system information persistence device, determine, according to a predetermined policy, a working node ID of that component to which the data will be resent, and find the server IP and port of that working node;
and when the data recovery unit reads the lost data record, it repackages the text of the data record into a message object and resends the message object to the corresponding component working node of the streaming processing topology device according to the IP and port of the destination server.
8. A high availability streaming processing method, the method comprising:
step S1, the upstream application system generates the original data in real time and sends the data to the receiving device one by one;
step S2, the receiving device forwards the original data to the message queue of the message middleware device, and the message middleware device caches the data for the streaming processing topology device in the form of the message queue;
step S3, the streaming processing topology device is started, the working nodes of the collection component, the model components, and the forwarding component are initialized, and the public data comprising the forwarding routing information and the data mapping information of the working components is loaded into memory;
step S4, the heartbeat detection device and the system guard device guard the normal operation of the system in real time; the heartbeat detection device monitors all working nodes of the system, and when the system guard device detects that a working node has failed, it performs a remote restart;
step S5, the collection component of the streaming processing topology device reads the original data from the message queue of the message middleware device, processes it and converts it into message objects, and transmits them to one or more downstream model components in the streaming processing topology device according to the forwarding routing information and the data mapping information of the working component;
step S6, the model components of each data processing link branch process the data records with the business logic specified by the upper-layer application system, and forward them to the downstream model component or the forwarding component according to the forwarding routing information and the data mapping information of the working component; each data processing link branch has one or more model components, and the last model component forwards the data to the forwarding component;
step S7, the forwarding component forwards the data to the downstream application system;
in steps S5, S6, and S7, the working nodes of all the components will record the current processing state in the working log in real time for the subsequent data recovery component to find the lost data record;
in step S8, the log reconciliation device periodically searches for the data record information that failed to be forwarded in steps S5, S6, and S7, and the system guard device recovers and resends the data records.
9. The high availability streaming processing method according to claim 8, further comprising, in step S4:
step S401, when the working nodes of the streaming processing topology device work normally, they periodically send heartbeats to the heartbeat detection device, and the heartbeat detection device periodically collects the working nodes from which it received heartbeats into a list and forwards the list to the monitoring management unit;
step S402, the monitoring management unit collects a working node list which successfully sends the heartbeat, and writes the working node list into the system information persistence device;
in step S403, the failure recovery unit searches for a failed node and remotely restarts the failed node.
10. The high availability streaming processing method according to claim 9, further comprising, in step S403: the fault recovery unit compares the list of working nodes recorded in the system information persistence device as having successfully sent heartbeats with all working nodes, searches for working nodes that did not send heartbeats, and judges that any working node which fails to send a heartbeat within the preset time interval has failed;
and the fault recovery unit writes the information of the failed working node into the system information persistence device, finds the server IP and port information and the configuration information of the failed node from the system information persistence device, and remotely restarts the destination server over the SSH protocol.
11. The high availability streaming processing method according to claim 9, further comprising, in step S8:
step S801, recording intermediate results of processing data records in a log in real time by each link working node;
step S802, the log reconciliation device periodically searches for lost data;
in step S803, the data recovery unit periodically recovers the missing data records.
12. The high availability streaming processing method according to claim 11, further comprising, in step S802:
step S8021, collecting the log information recorded by the same main key data in the frame log into a set;
step S8022, analyzing a log information set recorded by the same main key data, finding the log information of the H mark and the T mark from the set, and checking the condition of data loss;
step S8023, positioning the lost path of the lost data in the transmission process;
step S8024, locate the component position where the lost data is located when the data loss occurs on the failed forwarding path.
CN201610458184.5A 2016-06-22 2016-06-22 A kind of Stream Processing system and method for High Availabitity Active CN105959151B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610458184.5A CN105959151B (en) 2016-06-22 2016-06-22 A kind of Stream Processing system and method for High Availabitity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610458184.5A CN105959151B (en) 2016-06-22 2016-06-22 A kind of Stream Processing system and method for High Availabitity

Publications (2)

Publication Number Publication Date
CN105959151A true CN105959151A (en) 2016-09-21
CN105959151B CN105959151B (en) 2019-05-07

Family

ID=56904602

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610458184.5A Active CN105959151B (en) 2016-06-22 2016-06-22 A kind of Stream Processing system and method for High Availabitity

Country Status (1)

Country Link
CN (1) CN105959151B (en)

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106446296A (en) * 2016-11-28 2017-02-22 泰康保险集团股份有限公司 Method for processing trading messages and trading system
CN106789741A (en) * 2016-12-26 2017-05-31 北京奇虎科技有限公司 The consuming method and device of message queue
CN106874133A (en) * 2017-01-17 2017-06-20 北京百度网讯科技有限公司 The troubleshooting of calculate node in streaming computing system
CN107046510A (en) * 2017-01-13 2017-08-15 广西电网有限责任公司电力科学研究院 A kind of node and its system of composition suitable for distributed computing system
CN107070976A (en) * 2017-01-13 2017-08-18 广西电网有限责任公司电力科学研究院 A kind of data transmission method
CN107465574A (en) * 2017-08-07 2017-12-12 南京华盾电力信息安全测评有限公司 Internet site group plateform system and its parallel isolation streaming computational methods
CN107506482A (en) * 2017-06-26 2017-12-22 湖南星汉数智科技有限公司 A kind of large-scale data processing unit and method based on Stream Processing framework
CN107870982A (en) * 2017-10-02 2018-04-03 深圳前海微众银行股份有限公司 Data processing method, system and computer-readable recording medium
CN108038019A (en) * 2017-12-25 2018-05-15 曙光信息产业(北京)有限公司 A kind of automatically restoring fault method and system of baseboard management controller
CN108614820A (en) * 2016-12-09 2018-10-02 腾讯科技(深圳)有限公司 The method and apparatus for realizing the parsing of streaming source data
CN109064338A (en) * 2018-07-10 2018-12-21 泰康保险集团股份有限公司 Investment risk management method, apparatus, storage medium and electronic equipment
CN109145023A (en) * 2018-08-30 2019-01-04 北京百度网讯科技有限公司 Method and apparatus for handling data
CN109639795A (en) * 2018-12-11 2019-04-16 广东亿迅科技有限公司 A kind of service management and device based on AcitveMQ message queue
CN109783251A (en) * 2018-12-21 2019-05-21 招银云创(深圳)信息技术有限公司 Data processing system based on Hadoop big data platform
CN109800129A (en) * 2019-01-17 2019-05-24 青岛特锐德电气股份有限公司 A kind of real-time stream calculation monitoring system and method for processing monitoring big data
CN109815256A (en) * 2018-12-21 2019-05-28 聚好看科技股份有限公司 A kind of data processing method, device, electronic equipment and storage medium
CN110011872A (en) * 2019-04-10 2019-07-12 海南航空控股股份有限公司 A kind of streaming computing platform status monitoring method and device based on diagnostic message
CN110019445A (en) * 2017-09-08 2019-07-16 北京京东尚科信息技术有限公司 Method of data synchronization and device calculate equipment and storage medium
CN110213128A (en) * 2019-05-28 2019-09-06 掌阅科技股份有限公司 Serve port detection method, electronic equipment and computer storage medium
CN110765091A (en) * 2019-09-09 2020-02-07 上海陆家嘴国际金融资产交易市场股份有限公司 Account checking method and system
CN111414345A (en) * 2020-03-30 2020-07-14 杭州华望系统科技有限公司 Model data recovery method based on operation log
CN111459986A (en) * 2020-04-07 2020-07-28 中国建设银行股份有限公司 Data computing system and method
CN111813063A (en) * 2020-06-29 2020-10-23 南昌欧菲光电技术有限公司 Method and device for monitoring production equipment
CN112087373A (en) * 2020-09-21 2020-12-15 全通金信控股(广东)有限公司 Message sending method and service device
CN112953757A (en) * 2021-01-26 2021-06-11 北京明略软件系统有限公司 Data distribution method, system and computer equipment
CN113760274A (en) * 2020-09-04 2021-12-07 北京京东振世信息技术有限公司 Front-end component logic injection method and device
CN113760987A (en) * 2021-02-04 2021-12-07 北京沃东天骏信息技术有限公司 Data processing method and data processing platform

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103731298A (en) * 2013-11-15 2014-04-16 中国航天科工集团第二研究院七〇六所 Large-scale distributed network safety data acquisition method and system
CN103812697A (en) * 2014-01-28 2014-05-21 大唐移动通信设备有限公司 Remote disaster recovery method and remote disaster recovery system of distributed communication network
CN105471714A (en) * 2015-12-09 2016-04-06 百度在线网络技术(北京)有限公司 Message processing method and device
CN105468735A (en) * 2015-11-23 2016-04-06 武汉虹旭信息技术有限责任公司 Stream preprocessing system and method based on mass information of mobile internet

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103731298A (en) * 2013-11-15 2014-04-16 中国航天科工集团第二研究院七〇六所 Large-scale distributed network safety data acquisition method and system
CN103812697A (en) * 2014-01-28 2014-05-21 大唐移动通信设备有限公司 Remote disaster recovery method and remote disaster recovery system of distributed communication network
CN105468735A (en) * 2015-11-23 2016-04-06 武汉虹旭信息技术有限责任公司 Stream preprocessing system and method based on mass information of mobile internet
CN105471714A (en) * 2015-12-09 2016-04-06 百度在线网络技术(北京)有限公司 Message processing method and device

Cited By (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106446296B (en) * 2016-11-28 2019-11-15 泰康保险集团股份有限公司 For handling the method and transaction system of transaction message
CN106446296A (en) * 2016-11-28 2017-02-22 泰康保险集团股份有限公司 Method for processing trading messages and trading system
CN108614820B (en) * 2016-12-09 2021-01-15 腾讯科技(深圳)有限公司 Method and device for realizing streaming source data analysis
CN108614820A (en) * 2016-12-09 2018-10-02 腾讯科技(深圳)有限公司 The method and apparatus for realizing the parsing of streaming source data
CN106789741A (en) * 2016-12-26 2017-05-31 北京奇虎科技有限公司 The consuming method and device of message queue
CN107070976A (en) * 2017-01-13 2017-08-18 广西电网有限责任公司电力科学研究院 A kind of data transmission method
CN107046510B (en) * 2017-01-13 2020-06-16 广西电网有限责任公司电力科学研究院 Node suitable for distributed computing system and system composed of nodes
CN107046510A (en) * 2017-01-13 2017-08-15 广西电网有限责任公司电力科学研究院 A kind of node and its system of composition suitable for distributed computing system
US11368506B2 (en) 2017-01-17 2022-06-21 Beijing Baidu Netcom Science And Technology Co., Ltd. Fault handling for computer nodes in stream computing system
CN106874133A (en) * 2017-01-17 2017-06-20 北京百度网讯科技有限公司 The troubleshooting of calculate node in streaming computing system
CN106874133B (en) * 2017-01-17 2020-06-23 北京百度网讯科技有限公司 Failure handling for compute nodes in a streaming computing system
CN107506482A (en) * 2017-06-26 2017-12-22 湖南星汉数智科技有限公司 A kind of large-scale data processing unit and method based on Stream Processing framework
CN107465574A (en) * 2017-08-07 2017-12-12 南京华盾电力信息安全测评有限公司 Internet site group plateform system and its parallel isolation streaming computational methods
CN107465574B (en) * 2017-08-07 2020-11-10 南京华盾电力信息安全测评有限公司 Internet website group platform system and parallel isolation streaming computing method thereof
CN110019445A (en) * 2017-09-08 2019-07-16 北京京东尚科信息技术有限公司 Method of data synchronization and device calculate equipment and storage medium
CN107870982A (en) * 2017-10-02 2018-04-03 深圳前海微众银行股份有限公司 Data processing method, system and computer-readable recording medium
CN108038019A (en) * 2017-12-25 2018-05-15 曙光信息产业(北京)有限公司 A kind of automatically restoring fault method and system of baseboard management controller
CN108038019B (en) * 2017-12-25 2021-06-11 曙光信息产业(北京)有限公司 Automatic fault recovery method and system for substrate management controller
CN109064338A (en) * 2018-07-10 2018-12-21 泰康保险集团股份有限公司 Investment risk management method, apparatus, storage medium and electronic equipment
CN109145023A (en) * 2018-08-30 2019-01-04 北京百度网讯科技有限公司 Method and apparatus for handling data
CN109639795B (en) * 2018-12-11 2021-12-24 广东亿迅科技有限公司 Service management method and device based on AcitveMQ message queue
CN109639795A (en) * 2018-12-11 2019-04-16 广东亿迅科技有限公司 A kind of service management and device based on AcitveMQ message queue
CN109815256A (en) * 2018-12-21 2019-05-28 聚好看科技股份有限公司 A kind of data processing method, device, electronic equipment and storage medium
CN109783251A (en) * 2018-12-21 2019-05-21 招银云创(深圳)信息技术有限公司 Data processing system based on Hadoop big data platform
CN109800129A (en) * 2019-01-17 2019-05-24 青岛特锐德电气股份有限公司 A kind of real-time stream calculation monitoring system and method for processing monitoring big data
CN110011872A (en) * 2019-04-10 2019-07-12 海南航空控股股份有限公司 A kind of streaming computing platform status monitoring method and device based on diagnostic message
CN110011872B (en) * 2019-04-10 2020-12-01 海南航空控股股份有限公司 Method and device for monitoring state of streaming computing platform based on diagnostic message
CN110213128B (en) * 2019-05-28 2020-06-05 掌阅科技股份有限公司 Service port detection method, electronic device and computer storage medium
CN110213128A (en) * 2019-05-28 2019-09-06 掌阅科技股份有限公司 Serve port detection method, electronic equipment and computer storage medium
CN110765091A (en) * 2019-09-09 2020-02-07 上海陆家嘴国际金融资产交易市场股份有限公司 Account checking method and system
CN111414345A (en) * 2020-03-30 2020-07-14 杭州华望系统科技有限公司 Model data recovery method based on operation log
CN111414345B (en) * 2020-03-30 2022-03-25 杭州华望系统科技有限公司 Model data recovery method based on operation log
CN111459986A (en) * 2020-04-07 2020-07-28 中国建设银行股份有限公司 Data computing system and method
CN111813063A (en) * 2020-06-29 2020-10-23 南昌欧菲光电技术有限公司 Method and device for monitoring production equipment
CN111813063B (en) * 2020-06-29 2021-11-19 南昌欧菲光电技术有限公司 Method and device for monitoring production equipment
CN113760274A (en) * 2020-09-04 2021-12-07 北京京东振世信息技术有限公司 Front-end component logic injection method and device
CN113760274B (en) * 2020-09-04 2023-11-03 北京京东振世信息技术有限公司 Front-end assembly logic injection method and device
CN112087373B (en) * 2020-09-21 2022-05-13 全通金信控股(广东)有限公司 Message sending method and service device
CN112087373A (en) * 2020-09-21 2020-12-15 全通金信控股(广东)有限公司 Message sending method and service device
CN112953757A (en) * 2021-01-26 2021-06-11 北京明略软件系统有限公司 Data distribution method, system and computer equipment
CN112953757B (en) * 2021-01-26 2023-12-29 北京明略软件系统有限公司 Data distribution method, system and computer equipment
CN113760987A (en) * 2021-02-04 2021-12-07 北京沃东天骏信息技术有限公司 Data processing method and data processing platform

Also Published As

Publication number Publication date
CN105959151B (en) 2019-05-07

Similar Documents

Publication Publication Date Title
CN105959151B (en) A kind of Stream Processing system and method for High Availabitity
EP3754514B1 (en) Distributed database cluster system, data synchronization method and storage medium
CN112313916B (en) Method and system for pseudo-storage of anti-tampering logs by fusing block chain technology
CN106294357B (en) Data processing method and stream calculation system
Kamburugamuve et al. Survey of distributed stream processing for large stream sources
WO2015090245A1 (en) File transmission method, apparatus, and distributed cluster file system
US9367261B2 (en) Computer system, data management method and data management program
CN111800354B (en) Message processing method and device, message processing equipment and storage medium
CN105306585B (en) A kind of method of data synchronization of multiple data centers
CN101388844A (en) Data flow processing method and system
CN104965850A (en) Database high-available implementation method based on open source technology
US10756947B2 (en) Batch logging in a distributed memory
CN106537347B (en) System and method for distributing and processing streams
CN105493474A (en) System and method for supporting partition level journaling for synchronizing data in a distributed data grid
CN110071873A (en) A kind of method, apparatus and relevant device sending data
CN108512753B (en) Method and device for transmitting messages in cluster file system
US10116736B2 (en) System for dynamically varying traffic routing modes in a distributed cluster and method therefor
CN104573428B (en) A kind of method and system for improving server cluster resource availability
WO2017181430A1 (en) Method and device for duplicating database in distributed system
CN112579552A (en) Log storage and calling method, device and system
US10169138B2 (en) System and method for self-healing a database server in a cluster
US20240020297A1 (en) Metrics and events infrastructure
US20170083525A1 (en) System and method for implementing a database in a heterogeneous cluster
CN113672452A (en) Method and system for monitoring operation of data acquisition task
US20170083562A1 (en) System for maintaining consistency across a decentralized database cluster and method therefor

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant