WO2017032212A1 - Data stream processing method and apparatus - Google Patents

Data stream processing method and apparatus

Info

Publication number
WO2017032212A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
data flow
processing
updated
real
Prior art date
Application number
PCT/CN2016/093588
Other languages
French (fr)
Chinese (zh)
Inventor
李旭良
李嘉
刘杰
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2017032212A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/02 Topology update or discovery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16Z INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS, NOT OTHERWISE PROVIDED FOR
    • G16Z99/00 Subject matter not provided for in other main groups of this subclass
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass

Definitions

  • the present invention relates to the field of communications technologies, and in particular, to a data stream processing method and apparatus.
  • Real-time data is highly time-sensitive, large in volume, and unbounded.
  • Real-time data is mainly processed by stream computing in a real-time computing system, for example a Storm application.
  • In such a system the code is written against the Java application programming interface (API) and tasks are submitted as packages, so the business logic of a real-time computing system that is already running is immutable; that is, the topology of the real-time computing system cannot be changed at runtime.
  • the embodiment of the invention provides a data stream processing method and device, which can dynamically adjust the topology structure of the real-time computing system.
  • an embodiment of the present invention provides a data stream processing method, including:
  • a processing node of the real-time computing system receives a control flow, sent by a management node of the real-time computing system, for adjusting a topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update;
  • the processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated;
  • when the processing node receives the data stream, the processing node performs service processing on the data stream and sends the service-processed data stream according to the data flow circulation path included in the updated data flow circulation table.
  • the method further includes:
  • when the processing node performs failure recovery, the processing node acquires the updated data flow circulation table from the shared storage node and transmits the data stream according to the updated data flow circulation table.
  • the method further includes:
  • the processing node feeds back, to the source node of the real-time computing system, an update result of the data flow circulation table update, so that when the update result indicates that the update failed, the source node sends the control flow to the management node;
  • the processing node receives the control flow sent by the management node and updates the data flow circulation table according to the topology described by the control flow.
  • the method further includes:
  • the processing node feeds back an update result of the data flow circulation table update to an output node of the real-time computing system, and the output node summarizes the update results fed back by all processing nodes of the real-time computing system and outputs the summary result.
  • an embodiment of the present invention provides a data stream processing method, including:
  • a management node of the real-time computing system receives a control flow, sent by a source node of the real-time computing system, for adjusting a topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update;
  • the management node sends the control flow to a processing node that performs service processing on the data stream, so that the processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated; when the processing node receives the data stream, the processing node performs service processing on the data stream and sends the service-processed data stream according to the data flow circulation path included in the updated data flow circulation table.
  • the method further includes:
  • the management node sends the control flow to a processing node for performing service processing on the data stream, including:
  • the management node transmits the control flow to each of the processing nodes of the real-time computing system in a broadcast manner.
  • an embodiment of the present invention provides a data stream processing method, including:
  • a source node of the real-time computing system acquires a control flow for adjusting a topology of the real-time computing system, wherein the control flow is used to describe a topology that the real-time computing system currently needs to update;
  • the source node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated;
  • the source node sends the control flow to a management node of the real-time computing system, so that the management node sends the control flow to a processing node that performs service processing on the data stream, and the processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated;
  • the method further includes:
  • when the source node performs failure recovery, the source node acquires the updated data flow circulation table from the shared storage node and transmits the data stream according to the updated data flow circulation table.
  • the method further includes:
  • the source node sends the control flow to the management node, so that the management node sends the control flow to the processing node, and the processing node updates the data flow circulation table according to the topology described by the control flow.
  • an embodiment of the present invention provides a data stream processing apparatus, where the apparatus is applied to a processing node of a real-time computing system, including: a receiving unit, a first updating unit, and a first sending unit, where:
  • the receiving unit is configured to receive a control flow, sent by a management node of a real-time computing system, for adjusting a topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update;
  • the first updating unit is configured to update a data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated;
  • the first sending unit is configured to: when the processing node receives the data stream, perform service processing on the data stream, and send the service-processed data stream according to the data flow circulation path included in the updated data flow circulation table.
  • the device further includes:
  • a second sending unit configured to send the updated data flow distribution table to a shared storage node of the real-time computing system
  • a recovery unit configured to: when the processing node performs failure recovery, acquire the updated data flow distribution table from the shared storage node, and send the data flow according to the updated data flow distribution table.
  • the device further includes:
  • a first feedback unit configured to feed back, to the source node of the real-time computing system, an update result of the data flow circulation table update, so that when the update result indicates that the update failed, the source node sends the control flow to the management node;
  • a second updating unit configured to: when the update result indicates that the first updating unit failed to update, receive the control flow sent by the management node, and update the data flow circulation table according to the topology described by the control flow.
  • the device further includes:
  • a second feedback unit configured to feed back an update result of the data flow circulation table update to an output node of the real-time computing system, where the output node summarizes the update results fed back by all processing nodes of the real-time computing system and outputs the summary result.
  • an embodiment of the present invention provides a data stream processing apparatus, including: a receiving unit and a sending unit, where:
  • the receiving unit is configured to receive a control flow, sent by a source node of a real-time computing system, for adjusting a topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update;
  • the sending unit is configured to send the control flow to a processing node that performs service processing on the data stream, so that the processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated; when the processing node receives the data stream, the processing node performs service processing on the data stream and sends the service-processed data stream according to the data flow circulation path included in the updated data flow circulation table.
  • the device further includes:
  • a holding unit configured to establish and maintain a connection with each of the processing nodes in the real-time computing system
  • the sending unit is configured to send the control flow to each of the processing nodes of the real-time computing system in a broadcast manner.
  • an embodiment of the present invention provides a data stream processing apparatus, where the apparatus is applied to a source node of a real-time computing system, including: an acquiring unit, an updating unit, a first sending unit, and a second sending unit, where:
  • the acquiring unit is configured to acquire a control flow for adjusting a topology of the real-time computing system, where the control flow is used to describe a topology that the real-time computing system needs to update at present;
  • the updating unit is configured to update a data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated;
  • the first sending unit is configured to send the control flow to a management node of the real-time computing system, so that the management node sends the control flow to a processing node that performs service processing on the data stream, and the processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated;
  • the second sending unit is configured to receive a data stream and send the data stream to the processing node according to the data flow circulation path included in the data flow circulation table updated by the source node, where the processing node performs service processing on the data stream and sends the service-processed data stream according to the data flow circulation path included in the data flow circulation table updated by the processing node.
  • the device further includes:
  • a third sending unit configured to send the updated data flow distribution table to a shared storage node of the real-time computing system
  • a recovery unit configured to: when the source node performs failure recovery, acquire the updated data flow distribution table from the shared storage node, and send the data flow according to the updated data flow distribution table.
  • the device further includes:
  • a receiving unit configured to receive an update result, fed back by the processing node, of the processing node updating the data flow circulation table
  • a fourth sending unit configured to: when the update result indicates that the update failed, send the control flow to the management node, so that the management node sends the control flow to the processing node, and the processing node updates the data flow circulation table according to the topology described by the control flow.
  • the processing node of the real-time computing system receives a control flow, sent by a management node of the real-time computing system, for adjusting a topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update; the processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated; when the processing node receives the data stream, the processing node performs service processing on the data stream and sends the service-processed data stream according to the data flow circulation path included in the updated data flow circulation table.
  • the processing logic of the real-time computing system can be updated by updating the data flow table in real time, that is, the topology of the real-time computing system can be dynamically adjusted.
  • FIG. 1 is a structural diagram of a real-time computing system to which a data stream processing method according to an embodiment of the present invention is applicable;
  • FIG. 2 is a schematic flowchart of a data stream processing method according to an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of another data stream processing method according to an embodiment of the present invention.
  • FIG. 4 is a schematic flowchart diagram of another data stream processing method according to an embodiment of the present invention.
  • FIG. 5 is a schematic flowchart diagram of another data stream processing method according to an embodiment of the present disclosure.
  • FIG. 6 is a schematic diagram of control flow transmission using a Storm application as an example according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of data flow transmission using a Storm application as an example according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a data stream processing apparatus according to an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention.
  • FIG. 11 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention.
  • FIG. 12 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention.
  • FIG. 13 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention.
  • FIG. 14 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention.
  • FIG. 15 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention.
  • FIG. 16 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention.
  • FIG. 17 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention.
  • FIG. 18 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention.
  • FIG. 19 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention.
  • FIG. 1 is a structural diagram of a real-time computing system applicable to a data stream processing method according to an embodiment of the present invention.
  • The system includes a supervisory body 11, a source node 12, one or more processing nodes 13, and a management module 14. The management module 14 is connected to the source node 12 and to each of the processing nodes 13, the source node 12 is connected to at least one of the processing nodes 13, and the processing nodes 13 are connected in sequence.
  • the management module 14 includes a management node 141 and an output node 142, wherein the management node 141 is connected to the source node and the respective processing nodes 13, and the output node 142 is connected to the respective processing nodes 13.
  • the connection relationship between the nodes in the real-time computing system is expressed as a topology of the real-time computing system or as a topology model.
  • connection between the nodes described in this embodiment may be understood as a connection on a logical connection, and a data flow or a control flow may be transmitted between the connected nodes.
  • The data stream enters at the source node 12. The source node 12 transmits the data stream to the corresponding processing node 13 according to its stored data flow circulation table, and the processing node 13 performs service processing on the received data stream.
  • The service processing may include real-time computation on the data stream, which may also be understood as stream computing.
  • The processing node 13 then transmits the service-processed data stream to another processing node 13 according to its stored data flow circulation table, and that processing node 13 in turn transmits the service-processed data stream to another node, or outputs the result, according to its stored data flow circulation table.
  • The data flow circulation table includes a data flow circulation path that matches the topology of the real-time computing system; that is, in this embodiment the data flow circulation table controls the transmission path, or transmission structure, of the data stream, so that both the source node 12 and the processing nodes 13 can send the data stream to the corresponding node by consulting their own stored data flow circulation tables.
  • For example, suppose the topology of the real-time computing system is as follows: the source node 12 is connected to processing node A, processing node A is connected to processing node B, processing node B is connected to processing node C, and processing node C is the end node.
  • The data flow circulation table stored at the source node 12 can then include a mapping of the data stream sent by the source node 12 to processing node A; the table stored by processing node A may include a mapping of the data stream sent by processing node A to processing node B; the table stored by processing node B may include a mapping of the data stream sent by processing node B to processing node C; and the table stored by processing node C may include a mapping of processing node C outputting the service processing result.
  • the data flow distribution table stored by each node may include all mappings or only include mappings related to itself.
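The source -> A -> B -> C example above can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionary layout, the function names, and the `"OUTPUT"` marker for "outputs the service processing result" are hypothetical and are not defined by the patent text.

```python
# Hypothetical encoding of the example topology: source -> A -> B -> C,
# where C is the end node and outputs the service processing result.
TOPOLOGY_EDGES = [("source", "A"), ("A", "B"), ("B", "C"), ("C", "OUTPUT")]


def build_full_table(edges):
    """Build a table holding all mappings (sender -> list of receivers)."""
    table = {}
    for sender, receiver in edges:
        table.setdefault(sender, []).append(receiver)
    return table


def build_local_table(edges, node):
    """Build a table holding only the mappings related to `node` itself."""
    return {node: [receiver for sender, receiver in edges if sender == node]}


full_table = build_full_table(TOPOLOGY_EDGES)
table_at_b = build_local_table(TOPOLOGY_EDGES, "B")
print(full_table)
print(table_at_b)
```

The two builders correspond to the two storage options mentioned above: a node may keep every mapping, or only the mappings that concern it.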
  • control flow can be transmitted from the source node 12 to the management node 141, wherein the control flow is used to describe the topology that the real-time computing system currently needs to update.
  • the management node 141 then transmits the control flow to each processing node 13, so that the processing nodes 13 can update the stored data flow flow table according to the control flow, thereby realizing the dynamic adjustment of the topology of the real-time computing system.
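The path a control flow takes (source node, then management node, then every processing node) might look like the sketch below. The class names, the `on_control_flow` and `submit` methods, and the dictionary shape of the control flow are assumptions made for illustration; a real system would send these messages over the network rather than through direct method calls.

```python
class ProcessingNode:
    def __init__(self, name):
        self.name = name
        self.table = {}  # data flow circulation table: node -> downstream nodes

    def on_control_flow(self, control_flow):
        # Replace the stored table with the topology described by the control flow.
        self.table = {n: list(t) for n, t in control_flow["topology"].items()}


class ManagementNode:
    def __init__(self, processing_nodes):
        self.processing_nodes = processing_nodes

    def on_control_flow(self, control_flow):
        # Forward the same control flow to each processing node.
        for node in self.processing_nodes:
            node.on_control_flow(control_flow)


class SourceNode:
    def __init__(self, management_node):
        self.management_node = management_node
        self.table = {}

    def submit(self, control_flow):
        # The source node updates its own table, then passes the control flow on.
        self.table = {n: list(t) for n, t in control_flow["topology"].items()}
        self.management_node.on_control_flow(control_flow)


nodes = [ProcessingNode(n) for n in ("A", "B", "C")]
source = SourceNode(ManagementNode(nodes))
source.submit({"topology": {"source": ["A"], "A": ["C"], "C": ["B"], "B": []}})
print([node.table["A"] for node in nodes])
```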
  • The real-time computing system may be a distributed system; that is, the foregoing nodes may be distributed across, and run on, different machines. Some of the nodes may of course run on the same machine; for example, the management node 141 and the output node 142 can run on the same machine or on different machines. In addition, the embodiments of the present invention place no limitation on these machines; for example, they may be computers or servers.
  • FIG. 2 is a schematic flowchart of a data stream processing method according to an embodiment of the present invention. As shown in FIG. 2, the method includes the following steps:
  • 201. The processing node of the real-time computing system receives a control flow, sent by a management node of the real-time computing system, for adjusting a topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update.
  • The topology described by the control flow may be the complete topology that the real-time computing system needs to update.
  • For example, the control flow may describe the source node, the processing nodes, and the management node shown in FIG. 1.
  • Alternatively, the topology described by the control flow may be only the part of the topology that currently needs to be adjusted. For example, in the original topology processing node A is connected to processing node B and processing node B is connected to processing node C; that is, the data stream circulates from processing node A to processing node B to processing node C. Suppose this now needs to be adjusted so that processing node A is connected to processing node C and processing node C is connected to processing node B. The control flow may then describe only the topology in which processing node A is connected to processing node C and processing node C is connected to processing node B; that is, after the update, the data stream circulates from processing node A to processing node C to processing node B.
  • The control flow may be control flow information; that is, the control flow can be understood as a piece of information.
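A partial control flow of the kind described above (reordering A, B, C into A, C, B) could be applied as in this sketch. The function names and the dictionary encoding are hypothetical. One added assumption is labeled in the code: the partial description also clears B's old link to C, since otherwise B and C would point at each other.

```python
def apply_control_flow(table, control_flow):
    """Return a new table in which the entries named by the control flow are replaced."""
    updated = {node: list(targets) for node, targets in table.items()}
    for node, targets in control_flow.items():
        updated[node] = list(targets)
    return updated


def circulation_order(table, start):
    """Follow single-successor links from `start` to list the circulation order."""
    order, node = [start], start
    while table.get(node):
        node = table[node][0]
        order.append(node)
    return order


original = {"A": ["B"], "B": ["C"], "C": []}

# Partial description: only the connections that change.
# Assumption: B's old link to C is cleared explicitly to avoid a B <-> C loop.
partial = {"A": ["C"], "C": ["B"], "B": []}

updated = apply_control_flow(original, partial)
print(circulation_order(original, "A"))
print(circulation_order(updated, "A"))
```

A complete topology description fits the same function: the control flow then simply carries an entry for every node.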
  • 202. The processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated.
  • the processing node stores a data flow circulation table that matches the original topology structure.
  • the data flow distribution table may be updated according to the topology structure currently required to be updated.
  • the updated data flow circulation table includes a data flow distribution path that matches the topology that needs to be updated, so that the processing node sends the data flow according to the updated topology when transmitting the data flow.
  • the above-mentioned data flow path matching the topology that needs to be updated may be understood as a flow path or a circulation structure of the data flow in the topology that is currently required to be updated.
  • For example, if the topology that currently needs to be updated is that processing node A is connected to processing node C, the updated data flow circulation table includes the data stream being transmitted from processing node A to processing node C; that is, when the processing node performing step 202 is processing node A, the updated data flow circulation table may include a data flow circulation path for transmitting the data stream to processing node C.
  • 203. When the processing node receives the data stream, the processing node performs service processing on the data stream and sends the service-processed data stream according to the data flow circulation path included in the updated data flow circulation table.
  • In this way, the processing node sends the data stream according to the updated topology, so that the topology of the real-time computing system can be dynamically adjusted.
  • the above-mentioned steps dynamically adjust the topology of the real-time computing system without causing interference to the data stream being processed.
  • When the real-time computing system is a distributed system and each node runs on a different machine, dynamically adjusting the topology of the real-time computing system through the above steps avoids problems caused by one machine lagging in applying the modification.
  • the processing node may be any processing node in the real-time computing system.
  • In this embodiment, the processing node of the real-time computing system receives a control flow, sent by the management node of the real-time computing system, for adjusting a topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update; the processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated; when the processing node receives the data stream, the processing node performs service processing on the data stream and sends the service-processed data stream according to the data flow circulation path included in the updated data flow circulation table. In this way, the processing logic of the real-time computing system can be updated by updating the data flow circulation table in real time; that is, the topology of the real-time computing system can be dynamically adjusted.
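The data path of a single processing node (perform service processing, then send according to whatever the circulation table currently says) can be sketched as below. The function name `handle_record`, the use of `str.upper` as a stand-in for service processing, and the table layout are all assumptions for illustration.

```python
def handle_record(node, table, record, process):
    """Process one record, then pair the result with each downstream target in the table."""
    result = process(record)
    return [(target, result) for target in table.get(node, [])]


# Table held by processing node A before any control flow arrives.
table_at_a = {"A": ["B"]}
before = handle_record("A", table_at_a, "x", str.upper)

# A control flow arrives and node A replaces its table; no code is redeployed.
table_at_a = {"A": ["C"]}
after = handle_record("A", table_at_a, "y", str.upper)

print(before)
print(after)
```

The point of the design is visible here: the processing function stays the same, and only the table lookup decides where the result goes next.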
  • FIG. 3 is a schematic flowchart of another data stream processing method according to an embodiment of the present invention. As shown in FIG. 3, the method includes the following steps:
  • 301. A processing node of a real-time computing system receives a control flow, sent by a management node of the real-time computing system, for adjusting a topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update.
  • 302. The processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated.
  • While updating the data flow circulation table, the processing node may suspend data stream transmission but continue to perform service processing on the data stream. Alternatively, the processing node may suspend both service processing and data stream transmission while updating the data flow circulation table. Once the data flow circulation table has been updated, the suspended data stream transmission and/or service processing is resumed. In this way, erroneous processing of the data stream and blocking of the system can be avoided while the topology of the real-time computing system is dynamically adjusted.
  • 303. When the processing node receives the data stream, the processing node performs service processing on the data stream and sends the service-processed data stream according to the data flow circulation path included in the updated data flow circulation table.
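The option of suspending transmission while the table is being updated, and resuming afterwards, might be sketched as follows. The class `PausableSender`, its method names, and the choice to hold processed records in an in-memory list are assumptions; the patent text does not say how suspended records are kept.

```python
class PausableSender:
    """Sends processed records, holding them back while the table is being updated."""

    def __init__(self, targets):
        self.targets = targets  # downstream nodes taken from the circulation table
        self.updating = False
        self.pending = []       # processed records held back during an update
        self.sent = []          # (target, record) pairs actually sent

    def begin_update(self):
        self.updating = True

    def finish_update(self, new_targets):
        # Install the new path, then flush what was held back along the new path.
        self.targets = new_targets
        self.updating = False
        held, self.pending = self.pending, []
        for record in held:
            self.send(record)

    def send(self, record):
        if self.updating:
            self.pending.append(record)
            return
        for target in self.targets:
            self.sent.append((target, record))


sender = PausableSender(["B"])
sender.send("r1")
sender.begin_update()
sender.send("r2")          # held back, not sent along the stale path
sender.finish_update(["C"])
print(sender.sent)
```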
  • the above method may further comprise the following steps:
  • the processing node sends the updated data flow circulation table to a shared storage node of the real-time computing system.
  • When the processing node performs failure recovery, the processing node acquires the updated data flow circulation table from the shared storage node and transmits the data stream according to the updated data flow circulation table.
  • In this way, when recovering from a failure, the processing node can obtain the data flow circulation table updated in step 302 directly from the shared storage node; that is, after a failure the processing node reads the data flow circulation table from the shared storage node and initializes itself with it, which provides a high availability (HA) mechanism for dynamic topology adjustment.
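The persist-then-recover pattern described above can be sketched with a plain dictionary standing in for the shared storage node. The names `shared_storage`, `persist_table`, and `recover_table`, and the use of JSON as the stored format, are assumptions for illustration only.

```python
import json

# Stand-in for the shared storage node (hypothetical in-memory key-value store).
shared_storage = {}


def persist_table(node, table):
    """After a successful update, store the node's table in shared storage."""
    shared_storage[node] = json.dumps(table)


def recover_table(node, default=None):
    """On failure recovery, reload the last stored table for the node."""
    raw = shared_storage.get(node)
    if raw is None:
        return default if default is not None else {}
    return json.loads(raw)


persist_table("A", {"A": ["C"]})

local_table = None                 # simulate the node losing its in-memory state
local_table = recover_table("A")   # re-initialize from shared storage
print(local_table)
```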
  • the foregoing method may further include the following steps:
  • The processing node feeds back, to the source node of the real-time computing system, an update result of the data flow circulation table update, so that when the update result indicates that the update failed, the source node sends the control flow to the management node;
  • The processing node receives the control flow sent by the management node and updates the data flow circulation table according to the topology described by the control flow.
  • the update result may be the update result of step 302.
  • the source node may be notified of the update of the processing node by the update result.
  • If the update fails, the update result notifies the source node that the processing node failed to update, so that the source node sends the control flow to the management node again, and the management node sends the control flow to the processing node again so that the processing node updates again. Of course, when the source node sends the control flow to the management node again, it may also carry identification information of the processing node that failed to update, so that the management node sends the control flow only to the processing node that failed to update and does not resend it to processing nodes that updated successfully, which saves transmission resources.
  • This embodiment can implement the correct feedback of the update result, and if the update fails, the update task can be started again.
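  • The feedback-and-retry loop can be sketched as follows. Each processing node is modeled as a callable that returns `True` on a successful update; `distribute`, `update_with_retry`, and the `max_rounds` bound are hypothetical (the source does not state a retry limit).

```python
def distribute(control_flow, nodes, targets=None):
    """Management-node role: send the control flow to `targets` (all nodes if None).

    `nodes` maps a node name to a callable that applies the control flow and
    returns True on success. Returns a dict of node name -> update result.
    """
    names = list(nodes) if targets is None else list(targets)
    return {name: nodes[name](control_flow) for name in names}


def update_with_retry(control_flow, nodes, max_rounds=3):
    """Source-node role: resend only to nodes whose update result was a failure."""
    results = distribute(control_flow, nodes)
    rounds = 1
    while rounds < max_rounds:
        failed = [name for name, ok in results.items() if not ok]
        if not failed:
            break
        # Carry the identifiers of the failed nodes so that nodes which
        # already updated successfully are not contacted again.
        results.update(distribute(control_flow, nodes, targets=failed))
        rounds += 1
    return results
```

  • Passing the list of failed nodes as `targets` corresponds to carrying the identification information of the failed processing nodes, so that successful nodes receive the control flow only once.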
  • the foregoing method may further include the following steps:
  • Processing node feeds back an update result of the data flow table update to an output node of the real-time computing system, and the output node summarizes the update result fed back by all processing nodes of the real-time computing system, and outputs the summary result.
  • the output node can obtain the update result fed back by each processing node, so that the output node can summarize it, so that the summary result can be output, for example, sending the summary result to the presentation device, or printing the summary result, etc. This allows the user to know the state of the topology adjustment of the real-time computing system.
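  • A minimal sketch of the summarizing step at the output node is given below. The function name `summarize_updates` and the shape of the summary are assumptions for illustration only.

```python
from collections import Counter


def summarize_updates(results):
    """Summarize per-node update results (node name -> True/False)."""
    counts = Counter("ok" if ok else "failed" for ok in results.values())
    return {
        "total": len(results),
        "ok": counts["ok"],
        "failed": counts["failed"],
        "failed_nodes": sorted(name for name, ok in results.items() if not ok),
    }
```

  • The resulting summary could then be sent to a presentation device or printed, as described above.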
  • a plurality of optional implementation manners are added on the basis of the embodiment shown in FIG. 2, and the topology of the real-time computing system can be dynamically adjusted.
  • FIG. 4 is a schematic flowchart of another data stream processing method according to an embodiment of the present invention. As shown in FIG. 4, the method includes the following steps:
  • the management node of the real-time computing system receives a control flow sent by a source node of the real-time computing system for adjusting a topology of the real-time computing system, where the control flow is used to describe that the real-time computing system needs to be updated currently. Topology.
  • control flow may include structural information of a topology that needs to be updated, and may also include a control flow decomposition structure identifier of the management node or the management module internal processing logic, that is, the control flow may be decomposed by the structure identifier and The structure information identifies the topology that needs to be updated currently.
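  • To make the idea of decomposing a control flow concrete, the following sketch assumes a purely hypothetical message format: a JSON document with a structure identifier `struct_id` and a list of `edges`. Neither the format nor the identifier value `edges-v1` comes from the source.

```python
import json


def decompose(message):
    """Decompose a control flow message into a node -> next-hops mapping.

    Assumed (hypothetical) format:
    {"struct_id": "edges-v1", "edges": [["A", "C"], ["C", "B"]]}
    """
    doc = json.loads(message)
    if doc.get("struct_id") != "edges-v1":
        raise ValueError("unknown control flow structure identifier")
    table = {}
    for src, dst in doc["edges"]:
        table.setdefault(src, []).append(dst)
        table.setdefault(dst, [])  # make sure terminal nodes also appear
    return table
```

  • Here the structure identifier tells the receiver how to interpret the structural information, and the result is the topology that currently needs to be updated.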
  • the management node sends the control flow to a processing node for performing service processing on the data stream, so that the processing node updates the data flow circulation table according to the topology that is currently updated according to the control flow.
  • The updated data flow distribution table includes a data flow circulation path that matches the topology that currently needs to be updated; when the processing node receives the data flow, the processing node performs service processing on the data flow, and sends the data stream after the service processing according to the data flow distribution path included in the updated data flow distribution table.
  • the management node may send the control flow to all processing nodes in the real-time computing system, so that all processing nodes update the stored data flow table.
  • the above method may further include the following steps:
  • the step of the foregoing management node sending the control flow to the processing node for performing service processing on the data stream may include:
  • the management node transmits the control flow to each of the processing nodes of the real-time computing system in a broadcast manner.
  • Alternatively, the management node may send the control flow to only some of the processing nodes.
  • For example, the management node may send the control flow only to the processing nodes involved in the update, so that the involved processing nodes update the data flow circulation table, while processing nodes that are not involved need not update the data flow circulation table.
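  • The two distribution modes can be sketched as below. The sketch assumes that the control flow is a mapping from node name to new next hops and that a node is "involved" when it has an entry in that mapping; both assumptions, and the class `ManagementNode` itself, are illustrative only.

```python
class ManagementNode:
    """Hypothetical management node; `connections` maps node name -> inbox list."""

    def __init__(self, connections):
        self.connections = connections

    def broadcast(self, control_flow):
        # Send the control flow to every connected processing node.
        for inbox in self.connections.values():
            inbox.append(control_flow)
        return sorted(self.connections)

    def send_to_involved(self, control_flow):
        # Send only to nodes that have an entry in the control flow.
        involved = sorted(n for n in control_flow if n in self.connections)
        for name in involved:
            self.connections[name].append(control_flow)
        return involved
```

  • Broadcasting is simpler, while targeted sending avoids unnecessary traffic to nodes whose tables do not change.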
  • In this embodiment, the management node of the real-time computing system receives a control flow sent by a source node of the real-time computing system for adjusting a topology of the real-time computing system, where the control flow is used to describe the topology that the real-time computing system currently needs to update; the management node sends the control flow to a processing node used for performing service processing on the data stream, so that the processing node updates the data flow circulation table according to the topology described by the control flow, wherein the updated data flow circulation table includes a data flow circulation path that matches the topology that needs to be updated currently; when the processing node receives the data flow, the processing node performs service processing on the data stream, and sends the data stream after the service processing according to the data flow distribution path included in the updated data flow distribution table. This allows dynamic adjustment of the topology of the real-time computing system.
  • FIG. 5 is a schematic flowchart of another data stream processing method according to an embodiment of the present invention. As shown in FIG. 5, the method includes the following steps:
  • a source node of a real-time computing system acquires a control flow for adjusting a topology of the real-time computing system, where the control flow is used to describe a topology that the real-time computing system currently needs to update.
  • control flow may be a control flow in which the source node receives user input.
  • the control flow here can refer to the control flow described in the embodiment shown in FIG. 1-4, and will not be repeatedly described herein.
  • the source node updates the data flow distribution table according to the topology that is currently updated according to the control flow, where the updated data flow distribution table includes a data flow distribution path that matches the topology that needs to be updated currently.
  • the source node After receiving the above control flow, the source node can decompose the control flow, and update the data flow circulation table according to the topology structure that needs to be updated according to the decomposition. In addition, the source node stores or caches the updated data flow table. Of course, the above control flow can also be stored or cached.
  • the source node sends the control flow to a management node of the real-time computing system, so that the management node sends the control flow to a processing node used for performing service processing on the data flow, so that the processing node updates the data flow circulation table according to the topology that currently needs to be updated as described by the control flow, wherein the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated.
  • In this way, the management node can send the control flow to the processing node, and the processing node updates the stored data flow table.
  • the source node receives the data stream, and sends the data stream to the processing node according to a data flow path included in the data flow circulation table updated by the source node, where the processing node performs a service on the data stream. Processing, and transmitting the data stream after the service processing according to a data flow distribution path included in the data flow distribution table updated by the processing node.
  • the data stream may be data that the real-time computing system currently needs to compute, such as data input by the user or data transmitted by a collection device.
  • Because step 504 sends the data flow according to the updated data flow circulation table, the source node can send the data flow according to the updated topology, which dynamically adjusts the topology of the real-time computing system.
  • the foregoing method may further include the following steps:
  • When the source node performs fault recovery, the source node acquires the updated data flow circulation table from the shared storage node, and performs data flow transmission according to the updated data flow circulation table.
  • In this way, the source node can directly obtain the data flow circulation table updated in step 502 from the shared storage node. After a failure occurs, the source node can therefore read the data flow circulation table from the shared storage node and initialize itself with it; that is, the HA mechanism for dynamically adjusting the topology is implemented.
  • the foregoing method may further include the following steps:
  • the source node sends the control flow to the management node, so that the management node sends the control flow to the processing node, and the processing node updates the data flow circulation table according to the topology that currently needs to be updated as described by the control flow.
  • In this way, the update result of each processing node's current update of the data flow circulation table can be obtained in time.
  • If an update fails, the source node can trigger the management node to send the control flow to the processing node that failed to update, so that this processing node updates the data flow circulation table again.
  • the source node of the real-time computing system acquires a control flow for adjusting a topology of the real-time computing system, where the control flow is used to describe a topology that the real-time computing system needs to update at present; the source And updating, by the node, the data flow circulation table according to the topology that is currently updated according to the control flow, where the updated data flow distribution table includes a data flow circulation path that matches the topology that needs to be updated currently; Sending, by the source node, the control flow to a management node of the real-time computing system, so that the management node sends the control flow to a processing node for performing service processing on the data stream, so that the processing And updating, by the node, the data flow circulation table according to the topology that is currently updated according to the control flow, where the updated data flow distribution table includes a data flow circulation path that matches the topology that needs to be updated currently; Receiving, by the source node, a data stream, and including the data stream according to a data flow flow table updated
  • FIG. 6 is a schematic diagram of control flow transmission using a Storm application as an example. As shown in FIG. 6, the transmission includes:
  • the alarm node receives the control flow message, parses the control flow message to obtain the control flow, and updates the data flow flow table according to the control flow, and sends the control flow message to the police station node.
  • the Storm application is a real-time computing system.
  • the above alarm node can be understood as the source node in the embodiment shown in Figures 1-5.
  • the alarm node is a Spout function class node of the Storm real-time computing system.
  • the Spout function class refers to a data source class in the Storm application, which is used for receiving and sending an external data stream, or constructing a data stream for transmission by itself.
  • the police station node can be understood as the management node in the embodiments shown in FIG. 1 to FIG. 5. The police station node is a Bolt function class node in the Storm application. With the support of the police module body (the management module body in the embodiment shown in FIG. 1), it is automatically integrated into the Storm application topology, and is connected to all nodes except the alarm node that sends data to the Storm application.
  • the Bolt function class node is a data processing node in the Storm application, and each Bolt implements different business logics. Multiple Bolt combinations complete complex business logic processing.
  • the police station sends a control flow message to the police node A, the police node B, and the police node C.
  • the police node can be understood as a processing node in the embodiments shown in FIG. 1 to FIG. 5, and is a Bolt function class node.
  • the police node A decomposes the control flow message, and updates the data flow flow table according to the control flow obtained by the decomposition.
  • the police node A feeds back the control flow processing message to the alarm node, and feeds back the feedback message of the processing process to the alarm node.
  • the police node A may acquire the control flow message from the alarm node to implement the update again, in which case the control flow processing message fed back to the alarm node may be the updated result of the update again.
  • the police node B decomposes the control flow message, and updates the data flow flow table according to the control flow obtained by the decomposition.
  • the police node B feeds back the control flow processing message to the alarm node, and feeds back the feedback message of the processing process to the alarm node.
  • the police node C decomposes the control flow message, and updates the data flow flow table according to the control flow obtained by the decomposition.
  • the police node C feeds back the control flow processing message to the alarm node, and feeds back the feedback message of the processing process to the alarm node.
  • the alarm node summarizes the update results that were fed back, and sends or prints the control flow processing result according to predetermined logic.
  • FIG. 7 is a schematic diagram of data flow transmission using a Storm application as an example, as shown in FIG. 7, including:
  • the alarm node receives the data stream, and sends the data stream to the police node A according to the data flow distribution table.
  • the police node A performs service processing on the received data stream, and sends the service processed data stream to the police node B according to the data flow distribution table.
  • the police node B performs service processing on the received data stream, and sends the service processed data stream to the police node C according to the data flow distribution table.
  • the police node C performs service processing on the received data stream, and sends the service processed data stream to the next node or outputs the result according to the data flow distribution table.
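  • The hop-by-hop forwarding in FIG. 7 can be simulated with a small table-driven sketch. This is plain Python rather than Storm code; the function `run`, the `handlers` mapping, and the arithmetic used as "service processing" are illustrative assumptions, and the sketch assumes the table contains no cycles.

```python
def run(flow_table, handlers, source, item):
    """Forward `item` hop by hop and return a log of (node, value) pairs.

    `flow_table` maps node name -> list of next hops (the combined data flow
    circulation tables); `handlers` maps node name -> processing function.
    Assumes the table is acyclic.
    """
    log = []
    frontier = [(source, item)]
    while frontier:
        node, value = frontier.pop(0)
        value = handlers.get(node, lambda v: v)(value)  # service processing
        log.append((node, value))
        for nxt in flow_table.get(node, []):            # forward per the table
            frontier.append((nxt, value))
    return log


table = {"alarm": ["A"], "A": ["B"], "B": ["C"], "C": []}
handlers = {"A": lambda v: v + 1, "B": lambda v: v * 2, "C": lambda v: v - 3}
log = run(table, handlers, "alarm", 5)
# log == [("alarm", 5), ("A", 6), ("B", 12), ("C", 9)]
```

  • Changing only the entries of `table` changes the route taken by later items, which is the effect that updating the data flow distribution table is intended to achieve.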
  • FIG. 8 is a schematic structural diagram of a data stream processing apparatus according to an embodiment of the present invention. As shown in FIG. 8, the apparatus includes: a receiving unit 81, a first updating unit 82, and a first sending unit 83, where:
  • the receiving unit 81 is configured to receive a control flow sent by a management node of the real-time computing system for adjusting a topology of the real-time computing system, where the control flow is used to describe a topology that the real-time computing system currently needs to update.
  • the data stream processing apparatus in this embodiment may be applied to a processing node of a real-time computing system, for example, a processing node as shown in FIG. 1.
  • the topology that needs to be updated in the foregoing control flow may be a complete topology that the real-time computing system needs to update.
  • the foregoing control flow may describe the source node, the processing node, and the management node as shown in FIG. 1 .
  • Alternatively, the topology that needs to be updated, as described by the foregoing control flow, may be only the part of the topology that currently needs to be adjusted. For example, in the original topology, processing node A is connected to processing node B, and processing node B is connected to processing node C; that is, the data stream circulates from processing node A to processing node B and then to processing node C. Suppose the topology currently needs to be adjusted so that processing node A connects to processing node C, and processing node C connects to processing node B. Then the above control flow may describe only the topology in which processing node A connects to processing node C and processing node C connects to processing node B; that is, after the update, the data stream circulates from processing node A to processing node C and then to processing node B.
  • The control flow may be control flow information; that is, the above control flow can be understood as one piece of information.
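  • The partial-update example can be worked through in a short sketch. The functions `apply_control_flow` and `trace` are hypothetical. The sketch additionally assumes that the control flow clears the outgoing link of processing node B, which the source does not spell out; without that entry, B would keep its old link to C and the path would loop.

```python
def apply_control_flow(tables, control_flow):
    """Return new tables in which only the nodes named in the control flow change."""
    updated = {node: list(hops) for node, hops in tables.items()}
    for node, hops in control_flow.items():
        updated[node] = list(hops)
    return updated


def trace(tables, start):
    """Follow the first next hop from `start`; stop at a dead end or a repeat."""
    path, seen, node = [], set(), start
    while node is not None and node not in seen:
        path.append(node)
        seen.add(node)
        hops = tables.get(node, [])
        node = hops[0] if hops else None
    return path


original = {"A": ["B"], "B": ["C"], "C": []}
# Assumed control flow: A -> C, C -> B, and B no longer forwards to C.
updated = apply_control_flow(original, {"A": ["C"], "C": ["B"], "B": []})
# trace(original, "A") == ["A", "B", "C"]
# trace(updated, "A")  == ["A", "C", "B"]
```

  • Nodes that are not named in the control flow keep their existing entries, which is what allows the control flow to describe only the adjusted part of the topology.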
  • the first update unit 82 is configured to update the data flow circulation table according to the topology that currently needs to be updated as described by the control flow, where the updated data flow distribution table includes a data flow circulation path that matches the topology that currently needs to be updated.
  • the processing node stores a data flow circulation table that matches the original topology structure.
  • the data flow distribution table may be updated according to the topology structure currently required to be updated.
  • the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated, so that the processing node sends the data stream according to the updated topology when sending the data stream.
  • the above-mentioned data flow path matching the topology that needs to be updated may be understood as a flow path or a circulation structure of the data flow in the topology that is currently required to be updated.
  • For example, if the topology that needs to be updated is that processing node A connects to processing node C, the updated data flow distribution table includes transmission of the data stream from processing node A to processing node C; that is, when the apparatus is applied to processing node A, the updated data flow circulation table may include a data flow circulation path for transmitting the data stream to processing node C.
  • the first sending unit 83 is configured to: when the processing node receives the data stream, perform service processing on the data stream, and send the data stream after the service processing according to the data flow distribution path included in the updated data flow distribution table.
  • In this way, the processing node can send the data stream according to the updated topology, so that the topology of the real-time computing system can be dynamically adjusted.
  • the above-mentioned steps dynamically adjust the topology of the real-time computing system without causing interference to the data stream being processed.
  • Because the real-time computing system is a distributed system in which each node runs on a different machine, dynamically adjusting the topology of the real-time computing system through the above steps can avoid problems caused by a lagging modification on a particular machine.
  • the processing node may be any processing node in the real-time computing system.
  • In this embodiment, the processing node of the real-time computing system receives a control flow sent by a management node of the real-time computing system for adjusting a topology of the real-time computing system, where the control flow is used to describe the topology that the real-time computing system currently needs to update; the processing node updates the data flow circulation table according to the topology described by the control flow, wherein the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated; when the processing node receives the data stream, the processing node performs service processing on the data stream, and sends the data stream after the service processing according to the data flow circulation path included in the updated data flow circulation table.
  • the processing logic of the real-time computing system can be updated by updating the data flow table in real time, that is, the topology of the real-time computing system can be dynamically adjusted.
  • FIG. 9 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention. As shown in FIG. 9, the apparatus includes: a receiving unit 91, a first updating unit 92, and a first sending unit 93, where:
  • the receiving unit 91 is configured to receive a control flow sent by a management node of the real-time computing system for adjusting a topology of the real-time computing system, where the control flow is used to describe a topology that the real-time computing system currently needs to update.
  • the data stream processing apparatus in this embodiment may be applied to a processing node of a real-time computing system, for example, a processing node as shown in FIG. 1.
  • the first update unit 92 is configured to update the data flow circulation table according to the topology that currently needs to be updated as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated.
  • When the processing node updates the data flow distribution table, the processing node may suspend data stream transmission while continuing to perform service processing on the data stream. Alternatively, the processing node may suspend both service processing and transmission of the data stream while updating the data flow distribution table. When the update of the data flow circulation table is complete, the suspended data stream transmission and/or service processing of the data stream is resumed. In this way, erroneous handling of the data stream and blocking of the system can be avoided when the topology of the real-time computing system is dynamically adjusted.
  • the first sending unit 93 is configured to: when the processing node receives the data stream, perform service processing on the data stream, and send the data stream after the service processing according to the data flow distribution path included in the updated data flow distribution table.
  • the device may further include:
  • a second sending unit 94 configured to send the updated data flow distribution table to a shared storage node of the real-time computing system
  • the recovery unit 95 is configured to: when the processing node performs failure recovery, acquire the updated data flow distribution table from the shared storage node, and send the data flow according to the updated data flow distribution table.
  • In this way, during fault recovery the processing node can obtain the data flow circulation table updated by the first update unit 92 directly from the shared storage node. After a failure occurs, the processing node can therefore read the data flow circulation table from the shared storage node and initialize itself with it; that is, the HA mechanism for dynamically adjusting the topology is implemented.
  • the foregoing apparatus may further include:
  • a first feedback unit 96, configured to feed back, to a source node of the real-time computing system, an update result of the data flow table update, so that when the update result indicates that the update fails, the source node sends the control flow to the management node;
  • a second updating unit 97, configured to: when the update result indicates that the first update unit fails to update, receive the control flow sent by the management node, and update the data flow circulation table according to the topology that currently needs to be updated as described by the control flow.
  • the update result may be an update result of the first update unit 92.
  • the source node may be notified of the update of the processing node by the update result.
  • If the update fails, the update result notifies the source node that the processing node failed to update, so that the source node sends the control flow to the management node again, and the management node sends the control flow to the processing node again so that the processing node updates again. Of course, when the source node sends the control flow to the management node again, it may also carry identification information of the processing node that failed to update, so that the management node sends the control flow only to the processing node that failed to update and does not resend it to processing nodes that updated successfully, which saves transmission resources.
  • This embodiment can implement the correct feedback of the update result, and if the update fails, the update task can be started again.
  • the foregoing apparatus may further include:
  • a second feedback unit 98, configured to feed back an update result of the data flow table update to an output node of the real-time computing system, so that the output node summarizes the update results fed back by all processing nodes of the real-time computing system and outputs the summary result.
  • the output node can obtain the update result fed back by each processing node, so that the output node can summarize it, so that the summary result can be output, for example, sending the summary result to the presentation device, or printing the summary result, etc. This allows the user to know the state of the topology adjustment of the real-time computing system.
  • a plurality of optional implementation manners are added on the basis of the embodiment shown in FIG. 8, and the topology of the real-time computing system can be dynamically adjusted.
  • FIG. 12 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention. As shown in FIG. 12, the apparatus includes: a receiving unit 121 and a sending unit 122, where:
  • the receiving unit 121 is configured to receive a control flow sent by a source node of the real-time computing system for adjusting a topology of the real-time computing system, where the control flow is used to describe the topology that the real-time computing system currently needs to update.
  • the data stream processing apparatus may be applied to a management node of a real-time computing system, for example, the management node shown in FIG. 1.
  • control flow may include structural information of a topology that needs to be updated, and may also include a control flow decomposition structure identifier of the management node or the management module internal processing logic, that is, the control flow may be decomposed by the structure identifier and The structure information identifies the topology that needs to be updated currently.
  • the sending unit 122 is configured to send the control flow to a processing node for performing service processing on the data stream, so that the processing node updates the data flow circulation table according to the topology that currently needs to be updated as described by the control flow, wherein the updated data flow circulation table includes a data flow circulation path that matches the topology that needs to be updated currently; when the processing node receives the data flow, the processing node performs service processing on the data stream, and transmits the data stream after the service processing according to the data flow distribution path included in the updated data flow distribution table.
  • the management node may send the control flow to all processing nodes in the real-time computing system, so that all processing nodes update the stored data flow table.
  • the foregoing apparatus may further include:
  • a holding unit 123 configured to establish and maintain a connection with each of the processing nodes in the real-time computing system
  • the sending unit 122 is configured to send the control flow to each of the processing nodes of the real-time computing system in a broadcast manner.
  • Alternatively, the management node may send the control flow to only some of the processing nodes.
  • For example, the management node may send the control flow only to the processing nodes involved in the update, so that the involved processing nodes update the data flow circulation table, while processing nodes that are not involved need not update the data flow circulation table.
  • In this embodiment, the management node of the real-time computing system receives a control flow sent by a source node of the real-time computing system for adjusting a topology of the real-time computing system, where the control flow is used to describe the topology that the real-time computing system currently needs to update; the management node sends the control flow to a processing node used for performing service processing on the data stream, so that the processing node updates the data flow circulation table according to the topology described by the control flow, where the updated data flow distribution table includes a data flow circulation path that matches the topology that needs to be updated currently; when the processing node receives the data flow, the processing node performs service processing on the data flow, and transmits the data stream after the service processing according to a data flow distribution path included in the updated data flow distribution table. This allows dynamic adjustment of the topology of the real-time computing system.
  • FIG. 14 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention. As shown in FIG. 14, the apparatus includes: an obtaining unit 141, an updating unit 142, a first sending unit 143, and a second sending unit 144, wherein:
  • the obtaining unit 141 is configured to acquire a control flow for adjusting a topology of the real-time computing system, where the control flow is used to describe a topology that the real-time computing system currently needs to update;
  • the data stream processing apparatus may be applied to a source node of a real-time computing system, for example, a source node as shown in FIG. 1.
  • control flow may be a control flow in which the source node receives user input.
  • the control flow here can refer to the control flow described in the embodiment shown in FIG. 1-4, and will not be repeatedly described herein.
  • the updating unit 142 is configured to update the data flow distribution table according to the topology that is currently updated according to the control flow, where the updated data flow distribution table includes a topology that matches the current need to be updated. Data flow path.
  • the source node After receiving the above control flow, the source node can decompose the control flow, and update the data flow circulation table according to the topology structure that needs to be updated according to the decomposition. In addition, the source node stores or caches the updated data flow table. Of course, the above control flow can also be stored or cached.
  • the first sending unit 143 is configured to send the control flow to a management node of the real-time computing system, so that the management node sends the control flow to a processing node used for performing service processing on the data stream, and the processing node updates the data flow circulation table according to the topology that currently needs to be updated as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that needs to be updated currently.
  • In this way, the management node can send the control flow to the processing node, and the processing node updates its stored data flow circulation table.
  • the second sending unit 144 is configured to receive a data stream and send the data stream to the processing node according to the data flow circulation path included in the data flow circulation table updated by the source node, where the processing node performs service processing on the data stream and sends the service-processed data stream according to the data flow circulation path included in the data flow circulation table updated by the processing node.
  • The data stream may be data that the real-time computing system currently needs to compute, such as data input by a user or data transmitted by a collection device.
  • Because the updating unit 142 updates the data flow circulation table and the second sending unit 144 sends the data stream according to the updated data flow circulation table, the source node can send the data stream according to the updated topology, which implements dynamic adjustment of the topology of the real-time computing system.
  • the foregoing apparatus may further include:
  • a third sending unit 145 configured to send the updated data flow circulation table to a shared storage node of the real-time computing system
  • the recovery unit 146 is configured to: when the source node performs failure recovery, acquire the updated data flow distribution table from the shared storage node, and send the data flow according to the updated data flow distribution table.
  • In this implementation, when recovering from a failure, the source node may obtain the data flow circulation table updated by the updating unit 142 directly from the shared storage node. Because the source node can read the data flow circulation table from the shared storage node and initialize it after a failure occurs, a high-availability (HA) mechanism for the dynamically adjusted topology is achieved.
  • the device may further include:
  • the receiving unit 147 is configured to receive an update result, fed back by the processing node, of the processing node updating the data flow circulation table;
  • the fourth sending unit 148 is configured to: when the update result indicates that the update fails, send the control flow to the management node, so that the management node sends the control flow to the processing node, and the processing node updates the data flow circulation table according to the topology currently to be updated as described by the control flow.
  • In this implementation, the result of each processing node's current update of the data flow circulation table can be obtained in time.
  • When an update fails, the source node may trigger the management node to send the control flow to the processing node whose update failed, so that this processing node updates the data flow circulation table again.
  • The source node of the real-time computing system acquires a control flow for adjusting a topology of the real-time computing system, where the control flow is used to describe the topology that the real-time computing system currently needs to update;
  • the source node updates a data flow circulation table according to the topology currently to be updated as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology currently to be updated;
  • the source node sends the control flow to a management node of the real-time computing system, so that the management node sends the control flow to a processing node used for performing service processing on a data stream, and the processing node updates its data flow circulation table according to the topology currently to be updated as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology currently to be updated;
  • the source node receives the data stream, and sends the data stream to the processing node according to a data flow circulation path included in the
  • FIG. 17 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention.
  • The apparatus is applied to a processing node of a real-time computing system and, as shown in FIG. 17, includes a processor 171, a network interface 172, a memory 173, and a communication bus 174, where the communication bus 174 is configured to implement connection and communication between the processor 171, the network interface 172, and the memory 173, and the processor 171 executes a program stored in the memory 173 to implement the following method:
  • the updated data flow circulation table includes a data flow circulation path that matches the topology that needs to be updated currently;
  • when the processing node receives a data stream, performing service processing on the data stream, and sending the service-processed data stream according to the data flow circulation path included in the updated data flow circulation table.
  • the processor 171 can also execute the following procedure:
  • the updated data flow distribution table is obtained from the shared storage node, and the data flow is transmitted according to the updated data flow distribution table.
  • the processor 171 can also execute the following procedure:
  • the processing node receives the control flow sent by the management node, and updates the data flow circulation table according to the topology that is currently updated according to the control flow.
  • the processor 171 can also execute the following procedure:
  • The processing node of the real-time computing system receives a control flow that is sent by a management node of the real-time computing system and used for adjusting a topology of the real-time computing system, where the control flow is used to describe the topology that the real-time computing system currently needs to update; the processing node updates the data flow circulation table according to the topology currently to be updated as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology currently to be updated; and when the processing node receives a data stream, the processing node performs service processing on the data stream and sends the service-processed data stream according to the data flow circulation path included in the updated data flow circulation table.
  • In this way, the processing logic of the real-time computing system can be updated in real time by updating the data flow circulation table, that is, the topology of the real-time computing system can be dynamically adjusted.
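The behaviour of the processing node described above can be illustrated with a minimal Python sketch. The control flow format (a mapping from node name to per-stream next hops), the class name `ProcessingNode`, and the upper-casing used as stand-in service processing are all assumptions made for illustration; the source does not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessingNode:
    """Hypothetical processing node; every name here is illustrative only."""
    name: str
    # Data flow circulation table: stream id -> list of downstream node names.
    flow_table: dict = field(default_factory=dict)

    def apply_control_flow(self, control_flow: dict) -> bool:
        """Replace this node's routes with those described by the control flow."""
        routes = control_flow.get(self.name)
        if routes is None:
            return False  # the control flow does not mention this node
        self.flow_table = {s: list(targets) for s, targets in routes.items()}
        return True

    def handle(self, stream: str, record: str) -> list:
        """Process a record and return (next_node, processed_record) pairs."""
        processed = record.upper()  # stand-in for real service processing
        return [(target, processed) for target in self.flow_table.get(stream, [])]


node = ProcessingNode("filter")
node.apply_control_flow({"filter": {"clicks": ["count"]}})
before = node.handle("clicks", "a")
# A second control flow changes the downstream path while the node keeps running.
node.apply_control_flow({"filter": {"clicks": ["count", "audit"]}})
after = node.handle("clicks", "a")
```

In this sketch the same running node forwards to a different set of downstream nodes after the second control flow, which is the sense in which the topology is adjusted without resubmitting the task.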
  • FIG. 18 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention.
  • The apparatus is applied to a management node of a real-time computing system and, as shown in FIG. 18, includes a processor 181, a network interface 182, a memory 183, and a communication bus 184, where the communication bus 184 is configured to implement connection and communication between the processor 181, the network interface 182, and the memory 183, and the processor 181 executes a program stored in the memory 183 to implement the following method:
  • sending the control flow to a processing node used for performing service processing on a data stream, so that the processing node updates a data flow circulation table according to the topology currently to be updated as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology currently to be updated; and when the processing node receives a data stream, the processing node performs service processing on the data stream and sends the service-processed data stream according to the data flow circulation path included in the updated data flow circulation table.
  • the processor 181 can also execute the following procedure:
  • the program executed by the processor 181 to send the control flow to the processing node for performing service processing on the data stream may include:
  • the management node transmits the control flow to each of the processing nodes of the real-time computing system in a broadcast manner.
  • The management node of the real-time computing system receives a control flow that is sent by a source node of the real-time computing system and used for adjusting a topology of the real-time computing system, where the control flow is used to describe the topology that the real-time computing system currently needs to update; the management node sends the control flow to a processing node, so that the processing node updates a data flow circulation table according to the topology currently to be updated as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology currently to be updated; and when the processing node receives a data stream, the processing node performs service processing on the data stream and sends the service-processed data stream according to the data flow circulation path included in the updated data flow circulation table. This allows dynamic adjustment of the topology of the real-time computing system.
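The management node's role of keeping a connection to each processing node and broadcasting the control flow can be sketched as follows. This is an illustrative Python sketch only: the class `ManagementNode`, its methods, and the use of plain callables in place of network connections are assumptions, not details given in the source.

```python
class ManagementNode:
    """Hypothetical management node that relays control flows to processing nodes."""

    def __init__(self):
        # node name -> callable that delivers a control flow to that node
        self.connections = {}

    def connect(self, name, deliver):
        """Register (and keep) a connection to a processing node."""
        self.connections[name] = deliver

    def broadcast(self, control_flow):
        """Send the same control flow to every connected processing node."""
        return {name: deliver(control_flow)
                for name, deliver in self.connections.items()}


received = {"filter": [], "count": []}


def make_receiver(name):
    # Each receiver records the control flow and reports that the update succeeded.
    def receive(control_flow):
        received[name].append(control_flow)
        return True
    return receive


manager = ManagementNode()
for node_name in received:
    manager.connect(node_name, make_receiver(node_name))

results = manager.broadcast({"filter": {"clicks": ["count"]}})
```

Broadcasting keeps the management node simple, since each processing node decides for itself which part of the control flow applies to it.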
  • FIG. 19 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention.
  • The apparatus is applied to a source node of a real-time computing system and, as shown in FIG. 19, includes a processor 191, a network interface 192, a memory 193, and a communication bus 194, where the communication bus 194 is configured to implement connection and communication between the processor 191, the network interface 192, and the memory 193, and the processor 191 executes a program stored in the memory 193 to implement the following method:
  • acquiring a control flow for adjusting a topology of the real-time computing system, where the control flow is used to describe the topology that the real-time computing system currently needs to update;
  • the updated data flow circulation table includes a data flow circulation path that matches the topology that needs to be updated currently;
  • the program executed by the processor 191 may further include:
  • the updated data flow distribution table is obtained from the shared storage node, and the data flow is transmitted according to the updated data flow distribution table.
  • the program executed by the processor 191 may further include:
  • the topology that currently needs to be updated updates the data flow flow table.
  • The source node of the real-time computing system acquires a control flow for adjusting a topology of the real-time computing system, where the control flow is used to describe the topology that the real-time computing system currently needs to update; the source node updates the data flow circulation table according to the topology currently to be updated as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology currently to be updated; the source node sends the control flow to a management node of the real-time computing system, so that the management node sends the control flow to a processing node used for performing service processing on the data stream, and the processing node updates its data flow circulation table according to the topology currently to be updated as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology currently to be updated.
  • The source node receives the data stream and sends the data stream to the processing node according to the data flow circulation path included in the data flow circulation table updated by the source node; the processing node performs service processing on the data stream and sends the service-processed data stream according to the data flow circulation path included in the data flow circulation table updated by the processing node.
  • In this way, the topology of the real-time computing system can be dynamically adjusted.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).

Abstract

Disclosed are a data stream processing method and apparatus. The method can comprise: a processing node of a real-time calculation system receiving a control stream which is sent by a management node of the real-time calculation system to adjust a topology structure of the real-time calculation system, wherein the control stream is used for describing a current topology structure needing to be updated of the real-time calculation system; the processing node updating a data stream flow table according to the current topology structure needing to be updated described by the control stream, wherein the updated data stream flow table comprises a data stream flow path matching the current topology structure needing to be updated; and when the processing node receives a data stream, the processing node performing service processing on the data stream, and sending the data stream having been subjected to the service processing according to the data stream flow path comprised in the updated data stream flow table. The embodiments of the present invention can dynamically adjust a topology structure of a real-time calculation system.

Description

Data stream processing method and apparatus
Technical field
The present invention relates to the field of communications technologies, and in particular, to a data stream processing method and apparatus.
Background
With advances in technology and the growth of user needs, a large amount of real-time data now exists. Real-time data is highly time-sensitive, large in volume, and unbounded. At present, real-time data is mainly computed by real-time computing systems in a stream computing manner, for example, by Storm applications. However, current real-time computing systems all rely on code written against a Java application programming interface (API), which is packaged and submitted as a task. As a result, the service logic of a running real-time computing system cannot be changed; that is, the topology of the system is immutable while the system is running. Many scenarios, however, require the processing logic of the real-time computing system to be adjusted dynamically, that is, the topology of the real-time computing system to be adjusted dynamically, so as to adapt to service requirements. How to dynamically adjust the topology of a real-time computing system is therefore a technical problem that urgently needs to be solved.
Summary of the invention
Embodiments of the present invention provide a data stream processing method and apparatus that can dynamically adjust the topology of a real-time computing system.
According to a first aspect, an embodiment of the present invention provides a data stream processing method, including:
receiving, by a processing node of a real-time computing system, a control flow that is sent by a management node of the real-time computing system and used for adjusting a topology of the real-time computing system, where the control flow is used to describe the topology that the real-time computing system currently needs to update;
updating, by the processing node, a data flow circulation table according to the topology currently to be updated as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology currently to be updated; and
when the processing node receives a data stream, performing, by the processing node, service processing on the data stream, and sending the service-processed data stream according to the data flow circulation path included in the updated data flow circulation table.
In a first possible implementation of the first aspect, the method further includes:
sending, by the processing node, the updated data flow circulation table to a shared storage node of the real-time computing system; and
when the processing node performs failure recovery, obtaining, by the processing node, the updated data flow circulation table from the shared storage node, and sending data streams according to the updated data flow circulation table.
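The failure-recovery implementation above can be illustrated with a small Python sketch in which an in-memory object stands in for the shared storage node. The classes `SharedStorage` and `RecoverableNode` and their methods are hypothetical names introduced for illustration; the source does not say how the shared storage node is implemented.

```python
import copy


class SharedStorage:
    """In-memory stand-in for the shared storage node (an assumption)."""

    def __init__(self):
        self._tables = {}

    def save(self, node_name, table):
        self._tables[node_name] = copy.deepcopy(table)

    def load(self, node_name):
        return copy.deepcopy(self._tables.get(node_name, {}))


class RecoverableNode:
    """Hypothetical node that persists its data flow circulation table."""

    def __init__(self, name, storage):
        self.name = name
        self.storage = storage
        # On start or restart, initialise the table from shared storage.
        self.flow_table = storage.load(name)

    def update_table(self, new_table):
        self.flow_table = copy.deepcopy(new_table)
        # Send the updated table to shared storage so it survives a failure.
        self.storage.save(self.name, self.flow_table)


storage = SharedStorage()
node = RecoverableNode("filter", storage)
node.update_table({"clicks": ["count"]})

# Simulate a failure followed by recovery: a fresh instance reads the saved table.
restarted = RecoverableNode("filter", storage)
```

After the simulated restart, the new instance routes according to the most recently updated table rather than the topology that was originally submitted.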
With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the method further includes:
feeding back, by the processing node to a source node of the real-time computing system, an update result of updating the data flow circulation table, so that when the update result indicates that the update fails, the source node sends the control flow to the management node; and
when the update result indicates that the update fails, receiving, by the processing node, the control flow sent by the management node, and updating the data flow circulation table according to the topology currently to be updated as described by the control flow.
With reference to the first aspect or the first possible implementation of the first aspect, in a third possible implementation of the first aspect, the method further includes:
feeding back, by the processing node to an output node of the real-time computing system, the update result of updating the data flow circulation table, so that the output node summarizes the update results fed back by all processing nodes of the real-time computing system and outputs a summary result.
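As an illustration of the summarizing step performed by the output node, the following Python sketch aggregates per-node update results. The function name `summarize_update_results` and the shape of the summary are assumptions; the source only states that the results are summarized and output.

```python
def summarize_update_results(results: dict) -> dict:
    """Summarize update results reported as {node name: True on success}."""
    failed = sorted(name for name, ok in results.items() if not ok)
    return {
        "total": len(results),
        "succeeded": len(results) - len(failed),
        "failed": failed,
    }


summary = summarize_update_results({"filter": True, "count": False, "audit": True})
```

A summary of this kind lets an operator see at a glance whether every processing node has switched to the new topology.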
According to a second aspect, an embodiment of the present invention provides a data stream processing method, including:
receiving, by a management node of a real-time computing system, a control flow that is sent by a source node of the real-time computing system and used for adjusting a topology of the real-time computing system, where the control flow is used to describe the topology that the real-time computing system currently needs to update; and
sending, by the management node, the control flow to a processing node used for performing service processing on a data stream, so that the processing node updates a data flow circulation table according to the topology currently to be updated as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology currently to be updated, and when the processing node receives a data stream, the processing node performs service processing on the data stream and sends the service-processed data stream according to the data flow circulation path included in the updated data flow circulation table.
In a first possible implementation of the second aspect, the method further includes:
establishing and maintaining, by the management node, a connection to each processing node in the real-time computing system; and
the sending, by the management node, the control flow to a processing node used for performing service processing on a data stream includes:
sending, by the management node, the control flow to each processing node of the real-time computing system in a broadcast manner.
According to a third aspect, an embodiment of the present invention provides a data stream processing method, including:
acquiring, by a source node of a real-time computing system, a control flow for adjusting a topology of the real-time computing system, where the control flow is used to describe the topology that the real-time computing system currently needs to update;
updating, by the source node, a data flow circulation table according to the topology currently to be updated as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology currently to be updated;
sending, by the source node, the control flow to a management node of the real-time computing system, so that the management node sends the control flow to a processing node used for performing service processing on a data stream, and the processing node updates its data flow circulation table according to the topology currently to be updated as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology currently to be updated; and
receiving, by the source node, a data stream, and sending the data stream to the processing node according to the data flow circulation path included in the data flow circulation table updated by the source node, where the processing node performs service processing on the data stream and sends the service-processed data stream according to the data flow circulation path included in the data flow circulation table updated by that processing node.
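The source node steps above (update the local table, forward the control flow to the management node, then route incoming data by the updated table) can be sketched in Python as follows. The class `SourceNode`, the `"source"` key in the control flow, and the callback used to reach the management node are assumptions introduced for illustration.

```python
class SourceNode:
    """Hypothetical source node; the control flow layout is an assumption."""

    def __init__(self, forward_to_manager):
        self.flow_table = {}  # stream id -> list of first-hop processing nodes
        self.forward_to_manager = forward_to_manager

    def on_control_flow(self, control_flow):
        # Step 1: update the local data flow circulation table.
        routes = control_flow.get("source", {})
        self.flow_table = {s: list(targets) for s, targets in routes.items()}
        # Step 2: pass the control flow on to the management node.
        self.forward_to_manager(control_flow)

    def route(self, stream, record):
        # Step 3: send incoming data along the paths in the updated table.
        return [(target, record) for target in self.flow_table.get(stream, [])]


forwarded = []  # stands in for the management node's inbox
source = SourceNode(forwarded.append)
control_flow = {
    "source": {"clicks": ["filter"]},
    "filter": {"clicks": ["count"]},
}
source.on_control_flow(control_flow)
routed = source.route("clicks", "r1")
```

Here the source node applies only its own part of the control flow and leaves distribution of the rest to the management node.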
In a first possible implementation of the third aspect, the method further includes:
sending, by the source node, the updated data flow circulation table to a shared storage node of the real-time computing system; and
when the source node performs failure recovery, obtaining, by the source node, the updated data flow circulation table from the shared storage node, and sending data streams according to the updated data flow circulation table.
With reference to the third aspect or the first possible implementation of the third aspect, in a second possible implementation of the third aspect, the method further includes:
receiving, by the source node, an update result, fed back by the processing node, of the processing node updating the data flow circulation table; and
when the update result indicates that the update fails, sending, by the source node, the control flow to the management node, so that the management node sends the control flow to the processing node, and the processing node updates the data flow circulation table according to the topology currently to be updated as described by the control flow.
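The resend-on-failure behaviour above can be sketched as a small retry loop in Python. The function `push_until_updated`, the `send` callable (which stands for the whole path from source node through management node to processing node), and the `max_rounds` limit are all assumptions; the source does not specify a retry limit.

```python
def push_until_updated(control_flow, nodes, send, max_rounds=3):
    """Resend the control flow to nodes whose update failed.

    `send(node, control_flow)` returns True when the node reports a
    successful update. Returns the nodes still failing after max_rounds.
    """
    pending = list(nodes)
    for _ in range(max_rounds):
        pending = [node for node in pending if not send(node, control_flow)]
        if not pending:
            break
    return pending


attempts = {}


def flaky_send(node, control_flow):
    # Simulated feedback: the "count" node fails on its first attempt only.
    attempts[node] = attempts.get(node, 0) + 1
    return not (node == "count" and attempts[node] == 1)


still_failed = push_until_updated({"version": 2}, ["filter", "count"], flaky_send)
```

Only the node that reported a failure receives the control flow a second time, which matches the idea of re-triggering the update for failed nodes rather than for the whole system.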
According to a fourth aspect, an embodiment of the present invention provides a data stream processing apparatus, where the apparatus is applied to a processing node of a real-time computing system and includes a receiving unit, a first updating unit, and a first sending unit, where:
the receiving unit is configured to receive a control flow that is sent by a management node of the real-time computing system and used for adjusting a topology of the real-time computing system, where the control flow is used to describe the topology that the real-time computing system currently needs to update;
the first updating unit is configured to update a data flow circulation table according to the topology currently to be updated as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology currently to be updated; and
the first sending unit is configured to: when the processing node receives a data stream, perform service processing on the data stream, and send the service-processed data stream according to the data flow circulation path included in the updated data flow circulation table.
In a first possible implementation of the fourth aspect, the apparatus further includes:
a second sending unit, configured to send the updated data flow circulation table to a shared storage node of the real-time computing system; and
a recovery unit, configured to: when the processing node performs failure recovery, obtain the updated data flow circulation table from the shared storage node, and send data streams according to the updated data flow circulation table.
With reference to the fourth aspect or the first possible implementation of the fourth aspect, in a second possible implementation of the fourth aspect, the apparatus further includes:
a first feedback unit, configured to feed back, to a source node of the real-time computing system, an update result of updating the data flow circulation table, so that when the update result indicates that the update fails, the source node sends the control flow to the management node; and
a second updating unit, configured to: when the update result indicates that the update by the first updating unit fails, receive the control flow sent by the management node, and update the data flow circulation table according to the topology currently to be updated as described by the control flow.
With reference to the fourth aspect or the first possible implementation of the fourth aspect, in a third possible implementation of the fourth aspect, the apparatus further includes:
a second feedback unit, configured to feed back, to an output node of the real-time computing system, the update result of updating the data flow circulation table, so that the output node summarizes the update results fed back by all processing nodes of the real-time computing system and outputs a summary result.
According to a fifth aspect, an embodiment of the present invention provides a data stream processing apparatus, including a receiving unit and a sending unit, where:
the receiving unit is configured to receive a control flow that is sent by a source node of a real-time computing system and used for adjusting a topology of the real-time computing system, where the control flow is used to describe the topology that the real-time computing system currently needs to update; and
the sending unit is configured to send the control flow to a processing node used for performing service processing on a data stream, so that the processing node updates a data flow circulation table according to the topology currently to be updated as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology currently to be updated, and when the processing node receives a data stream, the processing node performs service processing on the data stream and sends the service-processed data stream according to the data flow circulation path included in the updated data flow circulation table.
In a first possible implementation of the fifth aspect, the apparatus further includes:
a holding unit, configured to establish and maintain a connection to each processing node in the real-time computing system; and
the sending unit is configured to send the control flow to each processing node of the real-time computing system in a broadcast manner.
According to a sixth aspect, an embodiment of the present invention provides a data stream processing apparatus applied to a source node of a real-time computing system. The apparatus includes an acquiring unit, an updating unit, a first sending unit, and a second sending unit, where:
the acquiring unit is configured to acquire a control flow for adjusting the topology of the real-time computing system, where the control flow describes the topology to which the real-time computing system currently needs to be updated;
the updating unit is configured to update a data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches that topology;
the first sending unit is configured to send the control flow to a management node of the real-time computing system, so that the management node sends the control flow to a processing node that performs service processing on data streams, and the processing node updates its own data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches that topology; and
the second sending unit is configured to receive a data stream and send it to the processing node along the data flow circulation path included in the data flow circulation table updated by the source node, so that the processing node performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the data flow circulation table updated by that processing node.
In a first possible implementation of the sixth aspect, the apparatus further includes:
a third sending unit, configured to send the updated data flow circulation table to a shared storage node of the real-time computing system; and
a recovery unit, configured to, when the source node recovers from a failure, obtain the updated data flow circulation table from the shared storage node and send data streams according to the updated data flow circulation table.
With reference to the sixth aspect or the first possible implementation of the sixth aspect, in a second possible implementation of the sixth aspect, the apparatus further includes:
a receiving unit, configured to receive, from the processing node, an update result indicating whether the processing node updated its data flow circulation table successfully; and
a fourth sending unit, configured to, when the update result indicates that the update failed, send the control flow to the management node, so that the management node sends the control flow to the processing node and the processing node updates its data flow circulation table according to the topology described by the control flow.
In the foregoing technical solutions, a processing node of a real-time computing system receives, from a management node of the real-time computing system, a control flow for adjusting the topology of the real-time computing system, where the control flow describes the topology to which the real-time computing system currently needs to be updated. The processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches that topology. When the processing node receives a data stream, it performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the updated data flow circulation table. In this way, the processing logic of the real-time computing system is updated by updating the data flow circulation table in real time; that is, the topology of the real-time computing system can be adjusted dynamically.
BRIEF DESCRIPTION OF THE DRAWINGS
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Apparently, the drawings described below show merely some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is an architecture diagram of a real-time computing system to which a data stream processing method according to an embodiment of the present invention is applicable;
FIG. 2 is a schematic flowchart of a data stream processing method according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of another data stream processing method according to an embodiment of the present invention;
FIG. 4 is a schematic flowchart of another data stream processing method according to an embodiment of the present invention;
FIG. 5 is a schematic flowchart of another data stream processing method according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of control flow transmission, using a Storm application as an example, according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of data stream transmission, using a Storm application as an example, according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a data stream processing apparatus according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention;
FIG. 11 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention;
FIG. 12 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention;
FIG. 13 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention;
FIG. 14 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention;
FIG. 15 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention;
FIG. 16 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention;
FIG. 17 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention;
FIG. 18 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention; and
FIG. 19 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention.
DETAILED DESCRIPTION
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments. Apparently, the described embodiments are merely some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Referring to FIG. 1, FIG. 1 is an architecture diagram of a real-time computing system to which a data stream processing method according to an embodiment of the present invention is applicable. As shown in FIG. 1, the system includes a supervisor 11, a source node 12, one or more processing nodes 13, and a management module 14. The supervisor 11 is connected to the source node 12 and to each processing node 13; the source node 12 is connected to at least one of the processing nodes 13; and the processing nodes 13 are connected in sequence. The management module 14 includes a management node 141 and an output node 142, where the management node 141 is connected to the source node 12 and to each processing node 13, and the output node 142 is connected to each processing node 13. The connection relationships among the nodes constitute the topology of the real-time computing system, which may also be understood as a topology model.
It should be noted that a connection between nodes in this embodiment is a logical connection: connected nodes can exchange data streams or control flows.
When the real-time computing system is running, a data stream enters at the source node 12. The source node 12 sends the data stream to the corresponding processing node 13 according to its stored data flow circulation table, and that processing node 13 performs service processing on the received data stream, where the service processing may include real-time computation (also called stream computation) on the data stream. The processing node 13 then sends the processed data stream to another processing node 13 according to its own stored data flow circulation table, and that processing node 13 in turn either sends its processed data stream to a further node or outputs the result, again according to its stored data flow circulation table. In the embodiments of the present invention, the data flow circulation table includes the data flow circulation paths that match the topology of the real-time computing system. That is, the data flow circulation table controls the transmission path, or transmission structure, of data streams, so that both the source node 12 and the processing nodes 13 can use their stored tables to send each outgoing data stream to the correct node.
For example, suppose the topology of the real-time computing system is as follows: the source node 12 is connected to processing node A, processing node A is connected to processing node B, processing node B is connected to processing node C, and processing node C is the end node. The data flow circulation table stored by the source node 12 may then include a mapping that sends data streams from the source node 12 to processing node A; the table stored by processing node A may include a mapping from processing node A to processing node B; the table stored by processing node B may include a mapping from processing node B to processing node C; and the table stored by processing node C may include a mapping under which processing node C outputs the service processing result. In this embodiment, the data flow circulation table stored by each node may include all of the mappings or only the mappings relevant to that node.
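The source → A → B → C example above can be sketched as a set of per-node tables. This is only an illustration under stated assumptions: the patent does not define a concrete table format, so the edge-list representation, the function name `build_flow_tables`, and the `"OUTPUT"` marker for the end node are all hypothetical.

```python
# Hypothetical representation: the topology as (upstream, downstream) edges.
# "source" stands for source node 12; A, B and C are the processing nodes.
TOPOLOGY = [("source", "A"), ("A", "B"), ("B", "C")]


def build_flow_tables(edges, end_action="OUTPUT"):
    """Build one data flow circulation table per node.

    Each table holds only the mappings relevant to its own node: the list
    of next hops. A node with no downstream node is treated as the end
    node and is mapped to `end_action` (it outputs its result).
    """
    next_hops = {}
    for upstream, downstream in edges:
        next_hops.setdefault(upstream, []).append(downstream)
        next_hops.setdefault(downstream, [])
    return {node: (hops if hops else [end_action])
            for node, hops in next_hops.items()}


flow_tables = build_flow_tables(TOPOLOGY)
```

Here each node stores only its own mappings; the alternative mentioned in the text, where every node stores all mappings, would simply give every node the full dictionary.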
When the real-time computing system is running, a control flow may be transmitted from the source node 12 to the management node 141, where the control flow describes the topology to which the real-time computing system currently needs to be updated. The management node 141 then transmits the control flow to each processing node 13, so that the processing nodes 13 can update their stored data flow circulation tables according to the control flow, thereby dynamically adjusting the topology of the real-time computing system.
In addition, in the embodiments of the present invention, the real-time computing system may be a distributed system; that is, the nodes may run on different machines. Some of the nodes may also run on the same machine. For example, the management node 141 and the output node 142 may run on the same machine or on different machines. The machines are not limited in the embodiments of the present invention; for example, they may be computers or servers.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of a data stream processing method according to an embodiment of the present invention. As shown in FIG. 2, the method includes the following steps:
201. A processing node of a real-time computing system receives, from a management node of the real-time computing system, a control flow for adjusting the topology of the real-time computing system, where the control flow describes the topology to which the real-time computing system currently needs to be updated.
In this embodiment, the topology described by the control flow may be the complete topology to which the real-time computing system currently needs to be updated. For example, the control flow may describe the topology among the source node, processing nodes, management node, and output node shown in FIG. 1. Alternatively, the control flow may describe only the part of the topology that currently needs to be adjusted. For example, in the original topology, processing node A is connected to processing node B, and processing node B is connected to processing node C; that is, data streams flow from processing node A to processing node B and then to processing node C. Suppose the topology now needs to be adjusted so that processing node A is connected to processing node C, and processing node C is connected to processing node B. The control flow may then describe only the topology in which processing node A is connected to processing node C and processing node C is connected to processing node B; after the update, data streams flow from processing node A to processing node C and then to processing node B.
In addition, in this embodiment, the control flow may be understood as control flow information; that is, the control flow may be understood as a message.
202. The processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches that topology.
In this embodiment, the processing node stores a data flow circulation table that matches the original topology. When the processing node receives the control flow, that is, when the current topology needs to be updated, it updates the data flow circulation table according to the topology described by the control flow. Because the updated table includes the data flow circulation paths that match the new topology, the processing node sends data streams according to the updated topology.
A data flow circulation path that matches the topology described by the control flow may be understood as the path, or structure, along which data streams flow in that topology. For example, if the topology described by the control flow is that processing node A is connected to processing node C, the updated data flow circulation table includes the transmission of data streams from processing node A to processing node C. That is, if the processing node performing step 202 is processing node A, the updated data flow circulation table may include a data flow circulation path that transmits data streams to processing node C.
203. When the processing node receives a data stream, the processing node performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the updated data flow circulation table.
Through the foregoing steps, the processing node sends data streams according to the updated topology, so that the topology of the real-time computing system is adjusted dynamically. In addition, when the real-time computing system is running a very large volume of services, dynamically adjusting the topology through the foregoing steps does not interfere with data streams that are being processed. Furthermore, when the real-time computing system is a distributed system whose nodes run on different machines, dynamically adjusting the topology through the foregoing steps avoids problems caused by a modification lagging on one of the machines.
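Steps 201 to 203 can be sketched from the point of view of a single processing node. The sketch below is a minimal illustration, not the patented implementation: it assumes the control flow is a list of (upstream, downstream) edges, that a node mentioned in the control flow without any downstream edge becomes the end node, and that a node not mentioned in a partial description keeps its table. The class name, method names, and the `"OUTPUT"` marker are hypothetical.

```python
class ProcessingNode:
    """Hypothetical processing node holding its own data flow circulation table."""

    def __init__(self, name, next_hops):
        self.name = name
        self.flow_table = list(next_hops)  # next hops for processed data

    def on_control_flow(self, edges):
        """Steps 201-202: receive a control flow and update the table.

        Returns True if the table was rewritten, False if this node is not
        mentioned in the (possibly partial) topology description.
        """
        if not any(self.name in edge for edge in edges):
            return False
        downstream = [dst for src, dst in edges if src == self.name]
        # A mentioned node without downstream edges is assumed to be the end node.
        self.flow_table = downstream or ["OUTPUT"]
        return True

    def on_data(self, record, process):
        """Step 203: process a record and route it by the current table."""
        result = process(record)
        return [(hop, result) for hop in self.flow_table]


# Original topology A -> B -> C, adjusted to A -> C -> B.
a = ProcessingNode("A", ["B"])
b = ProcessingNode("B", ["C"])
c = ProcessingNode("C", ["OUTPUT"])
control_flow = [("A", "C"), ("C", "B")]
for node in (a, b, c):
    node.on_control_flow(control_flow)

routed = a.on_data(2, lambda x: x * 10)
```

After the update, node A routes to C, node C routes to B, and node B becomes the end node, which matches the A → C → B adjustment used as the example in step 201.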
In this embodiment, the processing node may be any processing node in the real-time computing system.
In this embodiment, a processing node of a real-time computing system receives, from a management node of the real-time computing system, a control flow for adjusting the topology of the real-time computing system, where the control flow describes the topology to which the real-time computing system currently needs to be updated. The processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches that topology. When the processing node receives a data stream, it performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the updated data flow circulation table. In this way, the processing logic of the real-time computing system is updated by updating the data flow circulation table in real time; that is, the topology of the real-time computing system can be adjusted dynamically.
Referring to FIG. 3, FIG. 3 is a schematic flowchart of another data stream processing method according to an embodiment of the present invention. As shown in FIG. 3, the method includes the following steps:
301. A processing node of a real-time computing system receives, from a management node of the real-time computing system, a control flow for adjusting the topology of the real-time computing system, where the control flow describes the topology to which the real-time computing system currently needs to be updated.
302. The processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches that topology.
In this embodiment, while updating the data flow circulation table, the processing node may suspend the sending of data streams but continue to perform service processing on them. Alternatively, while updating the table, the processing node may suspend both the service processing and the sending of data streams. Once the table has been updated, the processing node resumes the suspended sending and/or service processing. In this way, dynamically adjusting the topology of the real-time computing system does not cause data streams to be handled incorrectly, and does not cause the system to block and degrade performance.
303. When the processing node receives a data stream, the processing node performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the updated data flow circulation table.
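The first option described for step 302 (keep processing, but hold back sending until the table has been updated) can be sketched as follows. This is an assumption-laden illustration: the buffering queue, the class name `PausableSender`, and the `sent` list that stands in for the network are hypothetical and are not described in the patent.

```python
from collections import deque


class PausableSender:
    """Holds processed results while the data flow circulation table is updated."""

    def __init__(self, next_hops):
        self.flow_table = list(next_hops)
        self.updating = False
        self.pending = deque()  # processed results waiting for the new table
        self.sent = []          # (next_hop, result) pairs; stands in for the network

    def emit(self, result):
        """Send a processed result, or buffer it if an update is in progress."""
        if self.updating:
            self.pending.append(result)
            return
        for hop in self.flow_table:
            self.sent.append((hop, result))

    def begin_update(self):
        """Suspend sending; service processing may continue and call emit()."""
        self.updating = True

    def finish_update(self, new_hops):
        """Install the new table, resume sending, and flush buffered results."""
        self.flow_table = list(new_hops)
        self.updating = False
        while self.pending:
            self.emit(self.pending.popleft())


sender = PausableSender(["B"])
sender.emit("r1")            # sent along the old path
sender.begin_update()
sender.emit("r2")            # buffered during the update
sender.finish_update(["C"])  # flushed along the new path
```

Buffered results are delivered along the new path only, so no record is routed with a half-updated table. A real node would also need to bound the buffer, which is outside the scope of this sketch.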
The method may further include the following steps:
304. The processing node sends the updated data flow circulation table to a shared storage node of the real-time computing system.
305. When the processing node recovers from a failure, the processing node obtains the updated data flow circulation table from the shared storage node and sends data streams according to the updated data flow circulation table.
In this implementation, when the processing node recovers after a failure, it can obtain the data flow circulation table updated in step 302 directly from the shared storage node; that is, it reads the table from the shared storage node and initializes itself with it. This provides a high availability (HA) mechanism for dynamically adjusting the topology.
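Steps 304 and 305 can be sketched as a persist-then-reload cycle. The sketch assumes an in-memory key-value object in place of the shared storage node and JSON as the serialization format; neither is specified by the patent, and the class names are hypothetical.

```python
import json


class SharedStorage:
    """In-memory stand-in for the shared storage node (an assumption)."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, value):
        self._blobs[key] = json.dumps(value)

    def get(self, key):
        blob = self._blobs.get(key)
        return None if blob is None else json.loads(blob)


class RecoverableNode:
    """Processing node that persists its table and reloads it on restart."""

    def __init__(self, name, storage, default_hops):
        self.name = name
        self.storage = storage
        # Step 305: on (re)start, prefer the table saved in shared storage.
        saved = storage.get(name)
        self.flow_table = saved if saved is not None else list(default_hops)

    def update_flow_table(self, new_hops):
        self.flow_table = list(new_hops)
        # Step 304: publish the updated table to shared storage.
        self.storage.put(self.name, self.flow_table)


storage = SharedStorage()
node = RecoverableNode("A", storage, default_hops=["B"])
node.update_flow_table(["C"])

# Simulate a crash and restart: the new instance starts with the original
# default but picks up the dynamically updated table from shared storage.
restarted = RecoverableNode("A", storage, default_hops=["B"])
```

Without step 304, the restarted node would fall back to its original table and silently undo the dynamic adjustment.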
In this embodiment, the method may further include the following steps:
The processing node feeds back, to the source node of the real-time computing system, an update result of updating the data flow circulation table, so that when the update result indicates that the update failed, the source node sends the control flow to the management node.
When the update result indicates that the update failed, the processing node receives the control flow sent by the management node and updates its data flow circulation table according to the topology described by the control flow.
In this implementation, the update result may be the result of step 302. If the update in step 302 succeeds, the update result tells the source node that the processing node was updated successfully. If the update fails, the update result tells the source node that the processing node failed to update; the source node then sends the control flow to the management node again, and the management node sends the control flow to the processing node again so that the processing node repeats the update. When the source node resends the control flow to the management node, it may also include identification information of the processing nodes whose update failed. The management node can then resend the control flow only to the processing nodes that failed, and not to those that succeeded, which saves transmission resources.
This implementation allows update results to be fed back correctly and, if an update fails, allows the update task to be started again.
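The feedback-and-resend loop can be sketched as below. The patent does not state how many times the control flow may be resent, so the `max_rounds` bound is an assumption, as are the function names and the boolean form of the update result. The source node and management node are collapsed into one function for brevity.

```python
def distribute_with_retry(control_flow, node_ids, apply_update, max_rounds=3):
    """Send a control flow and resend it only to nodes whose update failed.

    `apply_update(node_id, control_flow)` stands for "management node sends
    the control flow to the node and the node feeds back its update result";
    it returns True on success. Returns the ids still failing afterwards.
    """
    pending = list(node_ids)
    for _ in range(max_rounds):
        failed = [n for n in pending if not apply_update(n, control_flow)]
        if not failed:
            return []
        # Carry only the ids of failed nodes into the next round.
        pending = failed
    return pending


# Demo: node B fails on its first attempt and succeeds on the second.
attempts = {}


def flaky_update(node_id, control_flow):
    attempts[node_id] = attempts.get(node_id, 0) + 1
    return not (node_id == "B" and attempts[node_id] == 1)


still_failed = distribute_with_retry(
    [("A", "C"), ("C", "B")], ["A", "B", "C"], flaky_update
)
```

Nodes A and C receive the control flow once, while node B receives it twice, which reflects the saving in transmission resources that the text attributes to carrying the failed nodes' identification information.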
In this embodiment, the method may further include the following step:
The processing node feeds back, to the output node of the real-time computing system, the update result of updating the data flow circulation table; the output node summarizes the update results fed back by all processing nodes of the real-time computing system and outputs the summary.
The output node can obtain the update result fed back by each processing node, summarize the results, and output the summary, for example by sending it to a display device or printing it, so that the user knows the status of the topology adjustment of the real-time computing system.
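The summarizing step at the output node can be sketched as a simple aggregation. The shape of the summary (counts plus a list of failed nodes) is an assumption; the patent only states that the results are summarized and output.

```python
from collections import Counter


def summarize_updates(results):
    """Summarize per-node update results.

    `results` maps a processing node id to True (update succeeded) or
    False (update failed), as fed back to the output node.
    """
    counts = Counter("success" if ok else "failure" for ok in results.values())
    return {
        "total": len(results),
        "success": counts["success"],
        "failure": counts["failure"],
        "failed_nodes": sorted(n for n, ok in results.items() if not ok),
    }


summary = summarize_updates({"A": True, "B": False, "C": True})
```

The resulting dictionary could then be sent to a display device or printed, as the text describes.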
This embodiment adds several optional implementations to the embodiment shown in FIG. 2, each of which also allows the topology of the real-time computing system to be adjusted dynamically.
Referring to FIG. 4, FIG. 4 is a schematic flowchart of another data stream processing method according to an embodiment of the present invention. As shown in FIG. 4, the method includes the following steps:
401. A management node of a real-time computing system receives, from a source node of the real-time computing system, a control flow for adjusting the topology of the real-time computing system, where the control flow describes the topology to which the real-time computing system currently needs to be updated.
In this embodiment, the control flow may include structure information of the topology to be applied, and may further include a control flow decomposition structure identifier agreed upon by the internal processing logic of the management node or management module. That is, the topology to be applied can be identified from the control flow decomposition structure identifier together with the structure information.
402. The management node sends the control flow to a processing node that performs service processing on data streams, so that the processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches that topology; when the processing node receives a data stream, the processing node performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the updated data flow circulation table.
In this embodiment, the management node may send the control flow to all processing nodes in the real-time computing system, so that every processing node updates its stored data flow circulation table. For example, the method may further include the following step:
The management node establishes and maintains a connection with each of the processing nodes in the real-time computing system.
The step in which the management node sends the control flow to the processing node that performs service processing on data streams may include:
The management node broadcasts the control flow to each of the processing nodes of the real-time computing system.
In this way, the control flow is sent to all processing nodes by broadcast.
In this embodiment, the management node may alternatively send the control flow to only some of the processing nodes. For example, when the topology to be applied involves only some of the processing nodes, the management node may send the control flow only to those nodes, so that they update their data flow circulation tables; processing nodes that are not involved need not update theirs.
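The choice between broadcasting and sending only to the involved processing nodes can be sketched as a target-selection function. As before, the edge-list form of the control flow and the function name are assumptions made for illustration.

```python
def select_targets(control_flow, all_nodes, broadcast=True):
    """Choose which processing nodes receive the control flow.

    With broadcast=True every connected processing node is a target.
    Otherwise only nodes that appear in the described topology are.
    """
    if broadcast:
        return sorted(all_nodes)
    involved = {node for edge in control_flow for node in edge}
    return sorted(involved & set(all_nodes))


connected = {"A", "B", "C", "D"}
change = [("A", "C"), ("C", "B")]

broadcast_targets = select_targets(change, connected)
partial_targets = select_targets(change, connected, broadcast=False)
```

In this example node D is untouched by the change, so the targeted variant skips it, while the broadcast variant still delivers the control flow to D.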
In this embodiment, a management node of a real-time computing system receives, from a source node of the real-time computing system, a control flow for adjusting the topology of the real-time computing system, where the control flow describes the topology to which the real-time computing system currently needs to be updated. The management node sends the control flow to a processing node that performs service processing on data streams, so that the processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches that topology. When the processing node receives a data stream, it performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the updated data flow circulation table. In this way, the topology of the real-time computing system can be adjusted dynamically.
Referring to FIG. 5, FIG. 5 is a schematic flowchart of another data stream processing method according to an embodiment of the present invention. As shown in FIG. 5, the method includes the following steps:
501. A source node of a real-time computing system obtains a control flow for adjusting the topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update.
The control flow may be a control flow input by a user and received by the source node. For details of the control flow, refer to the control flow described in the embodiments shown in FIG. 1 to FIG. 4; they are not repeated here.
502. The source node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated.
After receiving the control flow, the source node may decompose it and update the data flow circulation table according to the topology obtained from the decomposition. In addition, the source node stores or caches the updated data flow circulation table. The control flow itself may also be stored or cached.
503. The source node sends the control flow to a management node of the real-time computing system, so that the management node sends the control flow to a processing node that performs service processing on data streams, and the processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated.
Through this step, the management node sends the control flow to the processing node, and the processing node then updates the data flow circulation table that it stores.
504. The source node receives a data stream and sends it to the processing node along the data flow circulation path included in the data flow circulation table updated by the source node. The processing node performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the data flow circulation table updated by that processing node.
In this embodiment, the data stream may be any data that the real-time computing system currently needs to compute, such as data input by a user or data transmitted by a collection device.
Because step 502 updates the data flow circulation table and step 504 sends the data stream according to the updated table, the source node sends the data stream according to the updated topology, which achieves dynamic adjustment of the topology of the real-time computing system.
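Steps 501 to 504 can be sketched from the source node's point of view as follows. This is a minimal illustration, not the patented implementation: the class `SourceNode`, the `edges` key, and the representation of the data flow circulation table as a dictionary from node name to next hops are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SourceNode:
    """Hypothetical source node; every name here is illustrative."""
    name: str
    flow_table: dict = field(default_factory=dict)  # node name -> list of next hops
    forwarded: list = field(default_factory=list)   # control flows handed to the management node
    sent: list = field(default_factory=list)        # (next hop, record) pairs emitted

    def on_control_flow(self, control_flow: dict) -> None:
        # Step 502: update the local table from the described topology.
        self.flow_table.update(control_flow["edges"])
        # Step 503: pass the same control flow on to the management node.
        self.forwarded.append(control_flow)

    def on_data(self, record: str) -> None:
        # Step 504: send along the path recorded in the updated table.
        for next_hop in self.flow_table.get(self.name, []):
            self.sent.append((next_hop, record))


source = SourceNode("source")
source.on_control_flow({"edges": {"source": ["A"], "A": ["C"], "C": ["B"]}})
source.on_data("r1")
print(source.sent)  # [('A', 'r1')]
```

The management node is modeled only as a list of forwarded control flows, which keeps the sketch focused on the order of operations at the source node.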
In this embodiment, the method may further include the following steps:
The source node sends the updated data flow circulation table to a shared storage node of the real-time computing system.
When the source node performs fault recovery, the source node obtains the updated data flow circulation table from the shared storage node and sends data streams according to it.
In this implementation, when the source node recovers from a fault, it can obtain the data flow circulation table updated in step 502 directly from the shared storage node, that is, it reads the table from the shared storage node and initializes it. This provides the high availability (HA) mechanism for dynamic topology adjustment.
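The recovery path can be illustrated with a small sketch. It assumes the shared storage node can be modeled as a key-value store holding a JSON copy of each node's table; `SharedStore` and `RecoverableNode` are hypothetical names, and a real deployment would use an external storage service rather than an in-memory dictionary.

```python
import json


class SharedStore:
    """Stand-in for the shared storage node (an in-memory dict here)."""

    def __init__(self):
        self._blobs = {}

    def put(self, key: str, table: dict) -> None:
        self._blobs[key] = json.dumps(table)

    def get(self, key: str) -> dict:
        return json.loads(self._blobs[key])


class RecoverableNode:
    def __init__(self, name: str, store: SharedStore):
        self.name = name
        self.store = store
        self.flow_table = {}

    def update_table(self, table: dict) -> None:
        self.flow_table = dict(table)
        self.store.put(self.name, self.flow_table)  # persist after every update

    def recover(self) -> None:
        # After a failure the in-memory table is gone; reload the last saved one.
        self.flow_table = self.store.get(self.name)


store = SharedStore()
node = RecoverableNode("source", store)
node.update_table({"source": ["A"], "A": ["C"]})
node.flow_table = {}  # simulate a crash wiping local state
node.recover()
```

After `recover()` the node again holds the table that was in force before the failure, so it resumes sending along the adjusted topology rather than the original one.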
In this embodiment, the method may further include the following steps:
The source node receives, from the processing node, the update result of the processing node's update of the data flow circulation table.
When the update result indicates that the update failed, the source node sends the control flow to the management node, so that the management node sends the control flow to the processing node, and the processing node updates the data flow circulation table according to the topology described by the control flow.
In this implementation, the update result of each processing node's current update of its data flow circulation table can be obtained in time. When one or more processing nodes fail to update, the source node can trigger the management node to send the control flow to those processing nodes, so that they update their data flow circulation tables again.
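A minimal sketch of this feedback loop follows. The functions `distribute` and `update_with_retry`, the boolean update results, and the `max_rounds` limit are assumptions for illustration; the text does not specify how many times a failed update is retried.

```python
def distribute(control_flow, nodes, targets=None):
    """Management-node stand-in: deliver to all nodes, or only to `targets`."""
    results = {}
    for name, apply_update in nodes.items():
        if targets is None or name in targets:
            results[name] = apply_update(control_flow)
    return results


def update_with_retry(control_flow, nodes, max_rounds=3):
    """Source-node stand-in: resend until every node reports success."""
    results = distribute(control_flow, nodes)
    rounds = 1
    while rounds < max_rounds:
        failed = {name for name, ok in results.items() if not ok}
        if not failed:
            break
        results.update(distribute(control_flow, nodes, targets=failed))
        rounds += 1
    return results, rounds


def make_flaky(failures):
    """Build a node whose update fails `failures` times before succeeding."""
    state = {"left": failures}

    def apply_update(_control_flow):
        if state["left"] > 0:
            state["left"] -= 1
            return False
        return True

    return apply_update


nodes = {"A": make_flaky(0), "B": make_flaky(1), "C": make_flaky(0)}
results, rounds = update_with_retry({"edges": {"A": ["C"]}}, nodes)
```

Here node B fails once, so the second round reaches only B; nodes A and C are not contacted again.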
In this embodiment, the source node of the real-time computing system obtains a control flow for adjusting the topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update. The source node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches that topology. The source node sends the control flow to the management node of the real-time computing system, so that the management node sends the control flow to a processing node that performs service processing on data streams, and the processing node likewise updates its data flow circulation table according to the topology described by the control flow. The source node receives a data stream and sends it to the processing node along the data flow circulation path included in the data flow circulation table updated by the source node; the processing node performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the data flow circulation table updated by that processing node. In this way, the topology of the real-time computing system can be adjusted dynamically.
Referring to FIG. 6, FIG. 6 is a schematic diagram of control flow transmission according to an embodiment of the present invention, using a Storm application as an example. As shown in FIG. 6, the transmission includes:
601. The alarm node receives a control flow message, parses the message to obtain the control flow, updates its data flow circulation table according to the control flow, and sends the control flow message to the police station node.
Here, the Storm application is the real-time computing system. The alarm node can be understood as the source node in the embodiments shown in FIG. 1 to FIG. 5. The alarm node is a Spout function class node of the Storm real-time computing system, where the Spout function class is the data source class in a Storm application, which either receives an external data stream and sends it on or constructs a data stream itself and sends it.
The police station node can be understood as the management node in the embodiments shown in FIG. 1 to FIG. 5. The police station node is a Bolt function class node in the Storm application. It relies on the support of the police module body (the management module body in the embodiment shown in FIG. 1), is automatically integrated into the topology of the Storm application, and stays connected to all nodes in the Storm application other than the alarm node that sends data to it. A Bolt function class node is a data processing node in a Storm application; each Bolt implements different service logic, and multiple Bolts are combined to complete complex service logic processing.
602. The police station node sends the control flow message to police node A, police node B, and police node C.
A police node can be understood as a processing node in the embodiments shown in FIG. 1 to FIG. 5, and is a Bolt function class node.
603. Police node A decomposes the control flow message and updates its data flow circulation table according to the control flow obtained from the decomposition.
604. Police node A feeds back a control flow processing message to the alert node, and feeds back a feedback message about the processing to the alarm node.
When the update in step 603 fails, police node A may obtain the control flow message from the alarm node in order to update again. In that case, the control flow processing message fed back to the alert node may be the update result of the repeated update.
605. Police node B decomposes the control flow message and updates its data flow circulation table according to the control flow obtained from the decomposition.
606. Police node B feeds back a control flow processing message to the alert node, and feeds back a feedback message about the processing to the alarm node.
607. Police node C decomposes the control flow message and updates its data flow circulation table according to the control flow obtained from the decomposition.
608. Police node C feeds back a control flow processing message to the alert node, and feeds back a feedback message about the processing to the alarm node.
609. The alert node summarizes the update results fed back, and sends or prints the control flow processing result according to predetermined logic.
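The message flow of steps 602 to 609 can be mimicked in plain Python as below. This is not Storm code and uses none of Storm's API; `PoliceNode`, `AlertNode`, `police_station`, and the message layout are hypothetical stand-ins that only show the fan-out of the control flow message and the collection of feedback.

```python
from collections import Counter


class PoliceNode:
    """Processing-node stand-in (a Bolt in the Storm example)."""

    def __init__(self, name):
        self.name = name
        self.flow_table = {}

    def handle(self, message):
        edges = message["edges"]  # "decompose" the control flow message
        self.flow_table = {self.name: edges.get(self.name, [])}
        return (self.name, "ok")  # control flow processing message


class AlertNode:
    """Collects feedback and summarizes it (step 609)."""

    def __init__(self):
        self.feedback = []

    def summarize(self):
        return dict(Counter(status for _, status in self.feedback))


def police_station(message, police_nodes, alert_node):
    # Step 602: forward the message to every police node.
    # Steps 604, 606, 608: each node's feedback goes to the alert node.
    for node in police_nodes:
        alert_node.feedback.append(node.handle(message))


nodes = [PoliceNode(n) for n in "ABC"]
alert = AlertNode()
police_station({"edges": {"A": ["C"], "C": ["B"], "B": []}}, nodes, alert)
summary = alert.summarize()
```

The summary counts results per status, which is one possible reading of "summarizes the update results"; the feedback sent to the alarm node is omitted to keep the sketch short.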
Referring to FIG. 7, FIG. 7 is a schematic diagram of data stream transmission according to an embodiment of the present invention, using a Storm application as an example. As shown in FIG. 7, the transmission includes:
701. The alarm node receives a data stream and sends it to police node A (a processing node) according to the data flow circulation table.
702. Police node A performs service processing on the received data stream and sends the processed data stream to police node B according to the data flow circulation table.
703. Police node B performs service processing on the received data stream and sends the processed data stream to police node C according to the data flow circulation table.
704. Police node C performs service processing on the received data stream and, according to the data flow circulation table, sends the processed data stream to the next node or outputs the result.
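Steps 702 to 704 amount to walking a record through the nodes in the order given by the data flow circulation table. The sketch below assumes each node has a single successor and that the table contains no cycle; `run_pipeline` and the three placeholder handlers are hypothetical.

```python
def run_pipeline(record, start, flow_table, handlers):
    """Walk a record through the nodes named by the flow table."""
    trace = []
    current = start
    while current is not None:
        record = handlers[current](record)  # service processing at this node
        trace.append(current)
        next_hops = flow_table.get(current, [])
        current = next_hops[0] if next_hops else None  # single successor assumed
    return record, trace


flow_table = {"A": ["B"], "B": ["C"], "C": []}
handlers = {
    "A": lambda r: r.strip(),
    "B": lambda r: r.upper(),
    "C": lambda r: r + "!",
}
result, trace = run_pipeline("  hello ", "A", flow_table, handlers)
print(result, trace)  # HELLO! ['A', 'B', 'C']
```

Changing only `flow_table`, for example to `{"A": ["C"], "C": ["B"], "B": []}`, changes the order in which the same handlers run, which is the effect the control flow is meant to achieve.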
Referring to FIG. 8, FIG. 8 is a schematic structural diagram of a data stream processing apparatus according to an embodiment of the present invention. As shown in FIG. 8, the apparatus includes a receiving unit 81, a first updating unit 82, and a first sending unit 83, where:
The receiving unit 81 is configured to receive a control flow that is sent by a management node of a real-time computing system and used to adjust the topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update.
It should be noted that the data stream processing apparatus in this embodiment may be applied to a processing node of the real-time computing system, for example, the processing node shown in FIG. 1.
In this embodiment, the topology that currently needs to be updated, as described by the control flow, may be the complete topology of the real-time computing system after the update. For example, the control flow may describe the topology among the source node, the processing nodes, the management node, and the output node shown in FIG. 1. Alternatively, the topology described by the control flow may be only the part of the topology that currently needs to be adjusted. For example, in the original topology, processing node A connects to processing node B, and processing node B connects to processing node C, that is, the data stream flows from processing node A to processing node B and then to processing node C. The topology now needs to be adjusted so that processing node A connects to processing node C, and processing node C connects to processing node B. In that case, the control flow may describe only the topology in which processing node A connects to processing node C and processing node C connects to processing node B, that is, after the update, the data stream flows from processing node A to processing node C and then to processing node B.
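The difference between a complete description and a partial one can be shown with a short sketch. The `kind` field, the dictionary layout, and the next hop chosen for node B after the change are assumptions for illustration; the text states only that A connects to C and C connects to B.

```python
def apply_control_flow(table, control_flow):
    """Return a new table after applying a full or partial topology description."""
    if control_flow["kind"] == "full":
        # A full description replaces the table outright.
        return {node: list(hops) for node, hops in control_flow["edges"].items()}
    updated = {node: list(hops) for node, hops in table.items()}
    for node, hops in control_flow["edges"].items():
        updated[node] = list(hops)  # overwrite only the nodes mentioned
    return updated


original = {"source": ["A"], "A": ["B"], "B": ["C"], "C": ["output"]}
partial = {
    "kind": "partial",
    # B -> output is an assumed detail so that the chain still terminates.
    "edges": {"A": ["C"], "C": ["B"], "B": ["output"]},
}
updated = apply_control_flow(original, partial)
```

With the partial description the entry for `source` is left untouched, while the entries for A, B, and C now encode the order A, C, B.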
In addition, in this embodiment, the control flow can be understood as control flow information, that is, as a single piece of information.
The first updating unit 82 is configured to update a data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated.
In this embodiment, the processing node stores a data flow circulation table that matches the original topology. When the control flow is received, that is, when the current topology needs to be updated, the processing node can update the data flow circulation table according to the topology that currently needs to be updated. The updated table includes a data flow circulation path that matches that topology, so that the processing node sends data streams according to the updated topology.
In addition, the data flow circulation path that matches the topology that currently needs to be updated can be understood as the circulation path or circulation structure of the data stream in that topology. For example, if the topology that currently needs to be updated is that processing node A connects to processing node C, the updated data flow circulation table includes the transmission of the data stream from processing node A to processing node C. That is, when the apparatus is applied to processing node A, the updated data flow circulation table may include a data flow circulation path for transmitting the data stream to processing node C.
The first sending unit 83 is configured to: when the processing node receives a data stream, perform service processing on the data stream, and send the processed data stream along the data flow circulation path included in the updated data flow circulation table.
With the above apparatus, the processing node sends data streams according to the updated topology, so the topology of the real-time computing system can be adjusted dynamically. In addition, when the real-time computing system is running a very large volume of services, dynamically adjusting its topology in this way does not interfere with the data streams being processed. Furthermore, when the real-time computing system is a distributed system whose nodes run on different machines, dynamically adjusting the topology in this way avoids problems caused by a lagging modification on any one machine.
In this embodiment, the processing node may be any processing node in the real-time computing system.
In this embodiment, the processing node of the real-time computing system receives, from the management node of the real-time computing system, a control flow for adjusting the topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update. The processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches that topology. When the processing node receives a data stream, it performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the updated data flow circulation table. In this way, the processing logic of the real-time computing system is updated by updating the data flow circulation table in real time, that is, the topology of the real-time computing system can be adjusted dynamically.
Referring to FIG. 9, FIG. 9 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention. As shown in FIG. 9, the apparatus includes a receiving unit 91, a first updating unit 92, and a first sending unit 93, where:
The receiving unit 91 is configured to receive a control flow that is sent by a management node of a real-time computing system and used to adjust the topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update.
It should be noted that the data stream processing apparatus in this embodiment may be applied to a processing node of the real-time computing system, for example, the processing node shown in FIG. 1.
The first updating unit 92 is configured to update a data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated.
In this embodiment, while updating the data flow circulation table, the processing node may suspend the sending of data streams but continue to perform service processing on them. Alternatively, while updating the data flow circulation table, the processing node may suspend both the service processing and the sending of data streams. Once the data flow circulation table has been updated, the suspended sending and/or service processing is resumed. As a result, when the topology of the real-time computing system is adjusted dynamically, data streams are not processed incorrectly and the system does not block in a way that degrades performance.
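The first option, continuing to process while holding back sends, can be sketched as follows. The class `PausableNode`, the `updating` flag, and the in-memory queue are assumptions made for the example; `upper()` stands in for the node's real service processing.

```python
from collections import deque


class PausableNode:
    def __init__(self, name, flow_table):
        self.name = name
        self.flow_table = flow_table
        self.updating = False
        self.pending = deque()  # processed records held back during an update
        self.sent = []

    def on_data(self, record):
        processed = record.upper()  # placeholder service processing
        if self.updating:
            self.pending.append(processed)  # process now, send later
        else:
            self._send(processed)

    def begin_update(self):
        self.updating = True

    def finish_update(self, new_table):
        self.flow_table = new_table
        self.updating = False
        while self.pending:  # resume the suspended sends in arrival order
            self._send(self.pending.popleft())

    def _send(self, processed):
        for next_hop in self.flow_table.get(self.name, []):
            self.sent.append((next_hop, processed))


node = PausableNode("A", {"A": ["B"]})
node.on_data("r1")
node.begin_update()
node.on_data("r2")
node.finish_update({"A": ["C"]})
node.on_data("r3")
print(node.sent)  # [('B', 'R1'), ('C', 'R2'), ('C', 'R3')]
```

Record `r2` arrives during the update and is delivered only afterwards, along the new path, so no record is routed with a half-updated table.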
The first sending unit 93 is configured to: when the processing node receives a data stream, perform service processing on the data stream, and send the processed data stream along the data flow circulation path included in the updated data flow circulation table.
In this embodiment, the apparatus may further include:
a second sending unit 94, configured to send the updated data flow circulation table to a shared storage node of the real-time computing system; and
a recovery unit 95, configured to: when the processing node performs fault recovery, obtain the updated data flow circulation table from the shared storage node, and send data streams according to the updated data flow circulation table.
In this implementation, when the processing node recovers from a fault, it can obtain the data flow circulation table updated by the first updating unit 92 directly from the shared storage node, that is, it reads the table from the shared storage node and initializes it. This provides the high availability (HA) mechanism for dynamic topology adjustment.
As shown in FIG. 10, the apparatus may further include:
a first feedback unit 96, configured to feed back the update result of the data flow circulation table update to the source node of the real-time computing system, so that when the update result indicates that the update failed, the source node sends the control flow to the management node; and
a second updating unit 97, configured to: when the update result indicates that the update by the first updating unit failed, receive the control flow sent by the management node, and update the data flow circulation table according to the topology described by the control flow.
In this implementation, the update result may be the update result of the first updating unit 92. When the first updating unit 92 updates successfully, the update result informs the source node that the processing node updated successfully. When the update fails, the update result informs the source node that the processing node failed to update, so the source node sends the control flow to the management node again, and the management node sends the control flow to the processing node again so that the processing node performs the update again. When the source node sends the control flow to the management node again, it may also carry identification information of the processing nodes that failed to update. The management node can then resend the control flow only to the processing nodes that failed to update, and not to those that updated successfully, which saves transmission resources.
This implementation allows update results to be fed back correctly and, if an update fails, allows the update task to be started again.
As shown in FIG. 11, the apparatus may further include:
a second feedback unit 98, configured to feed back the update result of the data flow circulation table update to an output node of the real-time computing system, where the output node summarizes the update results fed back by all processing nodes of the real-time computing system and outputs the summary result.
The output node can obtain the update result fed back by every processing node, summarize these results, and output the summary result, for example, by sending it to a presentation device or printing it, so that the user knows the status of the topology adjustment of the real-time computing system.
This embodiment adds several optional implementations on the basis of the embodiment shown in FIG. 8, all of which can dynamically adjust the topology of the real-time computing system.
Referring to FIG. 12, FIG. 12 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention. As shown in FIG. 12, the apparatus includes a receiving unit 121 and a sending unit 122, where:
The receiving unit 121 is configured to receive a control flow that is sent by a source node of a real-time computing system and used to adjust the topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update.
It should be noted that the data stream processing apparatus provided in this embodiment may be applied to a management node of the real-time computing system, for example, the management node shown in FIG. 1.
In this embodiment, the control flow may include structure information of the topology that currently needs to be updated, and may further include a control flow decomposition structure identifier agreed on by the internal processing logic of the management node or the management module. That is, the topology that currently needs to be updated can be identified from the control flow decomposition structure identifier and the structure information.
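One way to picture the pairing of a decomposition structure identifier with structure information is a tagged message whose tag selects the parser. The wire format below (`ARROWS|A>C;C>B`), the identifier `ARROWS`, and the functions `decompose` and `parse_arrow_list` are invented for this sketch; the text does not define a concrete encoding.

```python
def parse_arrow_list(body):
    """Parse 'A>C;C>B' into {'A': ['C'], 'C': ['B']}."""
    edges = {}
    for pair in body.split(";"):
        src, dst = pair.split(">")
        edges.setdefault(src.strip(), []).append(dst.strip())
    return edges


# Agreed identifiers mapped to the parser that understands that layout.
PARSERS = {"ARROWS": parse_arrow_list}


def decompose(control_flow):
    """Split a control flow into (identifier, topology edges)."""
    identifier, _, body = control_flow.partition("|")
    if identifier not in PARSERS:
        raise ValueError(f"unknown decomposition identifier: {identifier!r}")
    return identifier, PARSERS[identifier](body)


identifier, edges = decompose("ARROWS|A>C;C>B")
print(edges)  # {'A': ['C'], 'C': ['B']}
```

Keeping the identifier separate from the body means a new layout can be supported by registering another parser, without touching nodes that only understand the existing one.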
The sending unit 122 is configured to send the control flow to a processing node that performs service processing on data streams, so that the processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated; when the processing node receives a data stream, the processing node performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the updated data flow circulation table.
In this embodiment, the management node may send the control flow to all processing nodes in the real-time computing system, so that all processing nodes update the data flow circulation tables they store. For example, as shown in FIG. 13, the apparatus may further include:
a holding unit 123, configured to establish and maintain a connection with each processing node in the real-time computing system.
The sending unit 122 is configured to send the control flow to each processing node of the real-time computing system by broadcast.
In this way, the control flow is sent to all processing nodes by broadcast.
Of course, in this embodiment, the management node may also send the control flow to only some of the processing nodes. For example, when the topology that currently needs to be updated involves only some of the processing nodes, the management node may send the control flow only to those nodes, so that they update their data flow circulation tables, while the processing nodes that are not involved need not update theirs.
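The two delivery modes can be compared in a short sketch. It assumes that a node counts as "involved" when it appears as either a sender or a receiver in the described topology, which is one possible reading; `send_control_flow`, `involved_nodes`, and the per-node inbox lists are hypothetical.

```python
def involved_nodes(control_flow_edges):
    """Nodes named as a sender or receiver in the described topology."""
    names = set(control_flow_edges)
    for hops in control_flow_edges.values():
        names.update(hops)
    return names


def send_control_flow(control_flow_edges, connections, broadcast=True):
    """Deliver to every connected node, or only to the nodes involved."""
    if broadcast:
        targets = set(connections)
    else:
        targets = involved_nodes(control_flow_edges) & set(connections)
    for name in sorted(targets):
        connections[name].append(control_flow_edges)  # inbox of that node
    return sorted(targets)


connections = {name: [] for name in ["A", "B", "C", "D"]}
edges = {"A": ["C"], "C": ["B"]}
targeted = send_control_flow(edges, connections, broadcast=False)
everyone = send_control_flow(edges, connections, broadcast=True)
print(targeted, everyone)  # ['A', 'B', 'C'] ['A', 'B', 'C', 'D']
```

Node D is not part of the change, so the targeted mode skips it; broadcast reaches it anyway, trading extra messages for simpler logic at the management node.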
In this embodiment, the management node of the real-time computing system receives, from the source node of the real-time computing system, a control flow for adjusting the topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update. The management node sends the control flow to a processing node that performs service processing on data streams, so that the processing node updates its data flow circulation table according to the topology described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches that topology. When the processing node receives a data stream, it performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the updated data flow circulation table. In this way, the topology of the real-time computing system can be adjusted dynamically.
Referring to FIG. 14, FIG. 14 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention. As shown in FIG. 14, the apparatus includes an obtaining unit 141, an updating unit 142, a first sending unit 143, and a second sending unit 144, where:
The obtaining unit 141 is configured to obtain a control flow used to adjust the topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update;
It should be noted that the data stream processing apparatus provided in this embodiment may be applied to a source node of a real-time computing system, for example, the source node shown in FIG. 1.
The control flow may be a control flow that is input by a user and received by the source node. For details about the control flow, refer to the control flow described in the embodiments shown in FIG. 1 to FIG. 4; the details are not repeated here.
The updating unit 142 is configured to update a data flow circulation table according to the topology that currently needs to be updated, as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated.
After receiving the control flow, the source node may parse it and update the data flow circulation table according to the topology that currently needs to be updated, as obtained from the parsing. In addition, the source node stores or caches the updated data flow circulation table. Of course, the control flow itself may also be stored or cached.
The first sending unit 143 is configured to send the control flow to the management node of the real-time computing system, so that the management node sends the control flow to a processing node that performs service processing on data streams, and the processing node updates its data flow circulation table according to the topology that currently needs to be updated, as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated;
With this unit, the management node sends the control flow to the processing node, and the processing node then updates the data flow circulation table that it stores.
The second sending unit 144 is configured to receive a data stream and send the data stream to the processing node along the data flow circulation path included in the data flow circulation table updated by the source node, so that the processing node performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the data flow circulation table updated by that processing node.
In this embodiment, the data stream may be any data that the real-time computing system currently needs to compute, such as data input by a user or data transmitted by a collection device.
Because the updating unit 142 updates the data flow circulation table and the second sending unit 144 sends the data stream according to the updated table, the source node sends the data stream according to the updated topology, which dynamically adjusts the topology of the real-time computing system.
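The source-node behaviour described above (parse the control flow, update and cache the table, forward the control flow to the management node, then route incoming data by the updated table) can be sketched as follows. The sketch is illustrative only: the `SourceNode` class, the `source_routes` key, and the idea of keying the table by stream type are assumptions, since the text does not fix a concrete table layout.

```python
class SourceNode:
    """Hypothetical source node; names and data shapes are illustrative only."""

    def __init__(self, flow_table, forward_to_manager):
        # Assumed table layout: stream type -> list of first-hop processing nodes.
        self.flow_table = dict(flow_table)
        self.forward_to_manager = forward_to_manager
        self.cached_control_flow = None

    def on_control_flow(self, control_flow):
        # Cache the control flow, update the local table, then pass it on.
        self.cached_control_flow = control_flow
        self.flow_table.update(control_flow["source_routes"])
        self.forward_to_manager(control_flow)

    def send(self, stream_type, record):
        # Route the record along the paths in the (possibly updated) table.
        return [(hop, record) for hop in self.flow_table.get(stream_type, [])]


sent_to_manager = []
source = SourceNode({"clicks": ["A"]}, sent_to_manager.append)
control = {"source_routes": {"clicks": ["B"]}}
source.on_control_flow(control)
deliveries = source.send("clicks", {"user": 1})
```

After the control flow is applied, records of type `clicks` are delivered to node B instead of node A, without restarting the source node.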
In this embodiment, as shown in FIG. 15, the apparatus may further include:
a third sending unit 145, configured to send the updated data flow circulation table to a shared storage node of the real-time computing system; and
a recovery unit 146, configured to: when the source node performs failure recovery, obtain the updated data flow circulation table from the shared storage node, and send data streams according to the updated data flow circulation table.
In this implementation, when the source node recovers after a failure, it can obtain the data flow circulation table updated by the updating unit 142 directly from the shared storage node, and then read and initialize that table. This provides a high-availability (HA) mechanism for the dynamically adjusted topology.
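The recovery path described above can be sketched as follows. This is a minimal example under stated assumptions: `SharedStorage` is an in-memory stand-in for the shared storage node, JSON is an arbitrary choice of serialization, and the class and method names are hypothetical.

```python
import json


class SharedStorage:
    """In-memory stand-in for the shared storage node (an assumption for this sketch)."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, value):
        self._blobs[key] = json.dumps(value)

    def get(self, key):
        blob = self._blobs.get(key)
        return None if blob is None else json.loads(blob)


class SourceNode:
    """Hypothetical source node that persists its table after every update."""

    def __init__(self, name, storage):
        self.name = name
        self.storage = storage
        self.flow_table = {}

    def update_flow_table(self, new_table):
        self.flow_table = dict(new_table)
        # Persist the updated table so that a restarted node can reload it.
        self.storage.put(self.name, self.flow_table)

    def recover(self):
        # On failure recovery, initialize the table from shared storage if present.
        saved = self.storage.get(self.name)
        if saved is not None:
            self.flow_table = saved
        return self.flow_table


storage = SharedStorage()
node = SourceNode("source", storage)
node.update_flow_table({"clicks": ["B"]})

restarted = SourceNode("source", storage)  # simulates the node after a crash
recovered_table = restarted.recover()
```

The restarted instance begins with an empty table and ends with the most recently persisted one, so it resumes sending along the adjusted topology rather than the original one.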
In this embodiment, as shown in FIG. 16, the apparatus may further include:
a receiving unit 147, configured to receive an update result that is fed back by the processing node and indicates the outcome of that processing node's update of the data flow circulation table; and
a fourth sending unit 148, configured to: when the update result indicates that the update failed, send the control flow to the management node, so that the management node sends the control flow to the processing node and the processing node updates the data flow circulation table according to the topology that currently needs to be updated, as described by the control flow.
In this implementation, the result of each processing node's current update of its data flow circulation table can be obtained promptly. When the update fails on one or more processing nodes, the source node may trigger the management node to send the control flow to those processing nodes, so that they update their data flow circulation tables again.
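The retry behaviour described above can be sketched as follows. The function name, the boolean result format, and the `max_rounds` bound are assumptions introduced for the example; the text does not state how many times a failed update is retried.

```python
def resend_to_failed(update_results, control_flow, send_via_manager, max_rounds=3):
    """Re-send the control flow to nodes whose table update failed.

    update_results: node name -> True (updated) or False (failed).
    send_via_manager: callable(node, control_flow) -> bool, standing in for the
        source node asking the management node to deliver the control flow again.
    max_rounds: retry bound (an assumption; the text gives no limit).
    Returns the nodes that are still failing after the retries.
    """
    failed = [node for node, ok in update_results.items() if not ok]
    for _ in range(max_rounds):
        if not failed:
            break
        failed = [node for node in failed if not send_via_manager(node, control_flow)]
    return failed


attempts = {}


def flaky_send(node, control_flow):
    # Test double: the first re-send fails, the second succeeds.
    attempts[node] = attempts.get(node, 0) + 1
    return attempts[node] >= 2


still_failed = resend_to_failed({"A": True, "B": False}, {"B": ["C"]}, flaky_send)
```

Only node B, which reported a failure, is contacted again; node A is left alone.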
In this embodiment, the source node of the real-time computing system obtains a control flow used to adjust the topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update. The source node updates its data flow circulation table according to the topology that currently needs to be updated, as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches that topology. The source node sends the control flow to the management node of the real-time computing system, so that the management node sends the control flow to a processing node that performs service processing on data streams, and the processing node updates its own data flow circulation table in the same way. The source node receives a data stream and sends it to the processing node along the data flow circulation path included in the data flow circulation table updated by the source node; the processing node performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the data flow circulation table updated by that processing node. In this way, the topology of the real-time computing system can be adjusted dynamically.
Referring to FIG. 17, FIG. 17 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention. The apparatus is applied to a processing node of a real-time computing system. As shown in FIG. 17, the apparatus includes a processor 171, a network interface 172, a memory 173, and a communication bus 174, where the communication bus 174 is configured to implement connection and communication among the processor 171, the network interface 172, and the memory 173, and the processor 171 executes a program stored in the memory 173 to implement the following method:
receiving a control flow that is sent by the management node of the real-time computing system and is used to adjust the topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update;
updating a data flow circulation table according to the topology that currently needs to be updated, as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated; and
when the processing node receives a data stream, performing service processing on the data stream, and sending the processed data stream along the data flow circulation path included in the updated data flow circulation table.
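The three steps above (receive the control flow, update the table, then process and forward data by the updated table) can be sketched for a single processing node as follows. The class, its method names, and the control flow format are hypothetical, and `str.upper` merely stands in for the node's service processing.

```python
class ProcessingNode:
    """Hypothetical processing node: applies a function, then forwards by its table."""

    def __init__(self, name, process, next_hops):
        self.name = name
        self.process = process            # the node's service processing
        self.next_hops = list(next_hops)  # data flow circulation paths for this node

    def on_control_flow(self, control_flow):
        # Assumed control flow format: node name -> new list of downstream nodes.
        if self.name in control_flow:
            self.next_hops = list(control_flow[self.name])

    def on_data(self, record):
        # Process the record, then emit one copy per downstream node in the table.
        result = self.process(record)
        return [(hop, result) for hop in self.next_hops]


node = ProcessingNode("A", str.upper, ["B"])
before = node.on_data("x")
node.on_control_flow({"A": ["C", "D"]})
after = node.on_data("y")
```

The processing function is unchanged between the two calls; only the destinations differ, which is how a table update alone changes the effective topology.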
In this embodiment, the processor 171 may further execute the following program:
sending the updated data flow circulation table to a shared storage node of the real-time computing system; and
when the processing node performs failure recovery, obtaining the updated data flow circulation table from the shared storage node, and sending data streams according to the updated data flow circulation table.
In this embodiment, the processor 171 may further execute the following program:
feeding back, to the source node of the real-time computing system, the update result of updating the data flow circulation table, so that when the update result indicates that the update failed, the source node sends the control flow to the management node; and
when the update result indicates that the update failed, receiving, by the processing node, the control flow sent by the management node, and updating the data flow circulation table according to the topology that currently needs to be updated, as described by the control flow.
In this embodiment, the processor 171 may further execute the following program:
feeding back, to an output node of the real-time computing system, the update result of updating the data flow circulation table, so that the output node aggregates the update results fed back by all processing nodes of the real-time computing system and outputs an aggregated result.
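The aggregation performed by the output node can be sketched as follows. The function name, the boolean result format, and the shape of the aggregated result are assumptions; the text only says that the results from all processing nodes are aggregated and output.

```python
def summarize_updates(update_results, expected_nodes):
    """Aggregate per-node update results into one report.

    update_results: node name -> True (updated) or False (failed).
    expected_nodes: every processing node that should have reported.
    """
    succeeded = sorted(n for n in expected_nodes if update_results.get(n) is True)
    failed = sorted(n for n in expected_nodes if update_results.get(n) is False)
    missing = sorted(n for n in expected_nodes if n not in update_results)
    return {
        "succeeded": succeeded,
        "failed": failed,
        "missing": missing,  # nodes that have not reported yet
        "all_updated": not failed and not missing,
    }


summary = summarize_updates({"A": True, "B": False}, ["A", "B", "C"])
```

Tracking nodes that have not reported separately from nodes that reported a failure is a design choice of this sketch, made so that a silent node is not mistaken for a successful one.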
In this embodiment, the processing node of the real-time computing system receives a control flow that is sent by the management node of the real-time computing system and is used to adjust the topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update. The processing node updates its data flow circulation table according to the topology that currently needs to be updated, as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches that topology. When the processing node receives a data stream, it performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the updated data flow circulation table. In this way, the processing logic of the real-time computing system is updated by updating the data flow circulation table in real time; that is, the topology of the real-time computing system can be adjusted dynamically.
Referring to FIG. 18, FIG. 18 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention. The apparatus is applied to a management node of a real-time computing system. As shown in FIG. 18, the apparatus includes a processor 181, a network interface 182, a memory 183, and a communication bus 184, where the communication bus 184 is configured to implement connection and communication among the processor 181, the network interface 182, and the memory 183, and the processor 181 executes a program stored in the memory 183 to implement the following method:
receiving a control flow that is sent by the source node of the real-time computing system and is used to adjust the topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update; and
sending the control flow to a processing node that performs service processing on data streams, so that the processing node updates its data flow circulation table according to the topology that currently needs to be updated, as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated; when the processing node receives a data stream, the processing node performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the updated data flow circulation table.
In this embodiment, the processor 181 may further execute the following program:
establishing and maintaining a connection with each of the processing nodes in the real-time computing system.
The program executed by the processor 181 for sending the control flow to the processing node that performs service processing on data streams may include:
sending, by the management node, the control flow to each of the processing nodes of the real-time computing system by broadcast.
In this embodiment, the management node of the real-time computing system receives a control flow that is sent by the source node of the real-time computing system and is used to adjust the topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update. The management node sends the control flow to a processing node that performs service processing on data streams, so that the processing node updates its data flow circulation table according to the topology that currently needs to be updated, as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches that topology. When the processing node receives a data stream, the processing node performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the updated data flow circulation table. In this way, the topology of the real-time computing system can be adjusted dynamically.
Referring to FIG. 19, FIG. 19 is a schematic structural diagram of another data stream processing apparatus according to an embodiment of the present invention. The apparatus is applied to a source node of a real-time computing system. As shown in FIG. 19, the apparatus includes a processor 191, a network interface 192, a memory 193, and a communication bus 194, where the communication bus 194 is configured to implement connection and communication among the processor 191, the network interface 192, and the memory 193, and the processor 191 executes a program stored in the memory 193 to implement the following method:
obtaining a control flow used to adjust the topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update;
updating a data flow circulation table according to the topology that currently needs to be updated, as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated;
sending the control flow to the management node of the real-time computing system, so that the management node sends the control flow to a processing node that performs service processing on data streams, and the processing node updates its data flow circulation table according to the topology that currently needs to be updated, as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches the topology that currently needs to be updated; and
receiving a data stream, and sending the data stream to the processing node along the data flow circulation path included in the data flow circulation table updated by the source node, so that the processing node performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the data flow circulation table updated by that processing node.
In this embodiment, the program executed by the processor 191 may further include:
sending the updated data flow circulation table to a shared storage node of the real-time computing system; and
when the source node performs failure recovery, obtaining the updated data flow circulation table from the shared storage node, and sending data streams according to the updated data flow circulation table.
In this embodiment, the program executed by the processor 191 may further include:
receiving an update result that is fed back by the processing node and indicates the outcome of that processing node's update of the data flow circulation table; and
when the update result indicates that the update failed, sending the control flow to the management node, so that the management node sends the control flow to the processing node and the processing node updates the data flow circulation table according to the topology that currently needs to be updated, as described by the control flow.
In this embodiment, the source node of the real-time computing system obtains a control flow used to adjust the topology of the real-time computing system, where the control flow describes the topology that the real-time computing system currently needs to update. The source node updates its data flow circulation table according to the topology that currently needs to be updated, as described by the control flow, where the updated data flow circulation table includes a data flow circulation path that matches that topology. The source node sends the control flow to the management node of the real-time computing system, so that the management node sends the control flow to a processing node that performs service processing on data streams, and the processing node updates its own data flow circulation table in the same way. The source node receives a data stream and sends it to the processing node along the data flow circulation path included in the data flow circulation table updated by the source node; the processing node performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the data flow circulation table updated by that processing node. In this way, the topology of the real-time computing system can be adjusted dynamically.
A person of ordinary skill in the art may understand that all or some of the processes of the methods in the foregoing embodiments may be implemented by a computer program instructing relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the foregoing method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The foregoing discloses merely preferred embodiments of the present invention and certainly cannot be used to limit the scope of rights of the present invention. Therefore, equivalent variations made in accordance with the claims of the present invention still fall within the scope of the present invention.

Claims (18)

  1. A data stream processing method, comprising:
    receiving, by a processing node of a real-time computing system, a control flow that is sent by a management node of the real-time computing system and is used to adjust a topology of the real-time computing system, wherein the control flow describes the topology that the real-time computing system currently needs to update;
    updating, by the processing node, a data flow circulation table according to the topology that currently needs to be updated, as described by the control flow, wherein the updated data flow circulation table comprises a data flow circulation path that matches the topology that currently needs to be updated; and
    when the processing node receives a data stream, performing, by the processing node, service processing on the data stream, and sending the processed data stream along the data flow circulation path comprised in the updated data flow circulation table.
  2. The method according to claim 1, wherein the method further comprises:
    sending, by the processing node, the updated data flow circulation table to a shared storage node of the real-time computing system; and
    when the processing node performs failure recovery, obtaining, by the processing node, the updated data flow circulation table from the shared storage node, and sending data streams according to the updated data flow circulation table.
  3. The method according to claim 1 or 2, wherein the method further comprises:
    feeding back, by the processing node to a source node of the real-time computing system, an update result of updating the data flow circulation table, so that when the update result indicates that the update failed, the source node sends the control flow to the management node; and
    when the update result indicates that the update failed, receiving, by the processing node, the control flow sent by the management node, and updating the data flow circulation table according to the topology that currently needs to be updated, as described by the control flow.
  4. The method according to claim 1 or 2, wherein the method further comprises:
    feeding back, by the processing node to an output node of the real-time computing system, an update result of updating the data flow circulation table, so that the output node aggregates the update results fed back by all processing nodes of the real-time computing system and outputs an aggregated result.
  5. A data stream processing method, comprising:
    receiving, by a management node of a real-time computing system, a control flow that is sent by a source node of the real-time computing system and is used to adjust a topology of the real-time computing system, wherein the control flow describes the topology that the real-time computing system currently needs to update; and
    sending, by the management node, the control flow to a processing node that performs service processing on data streams, so that the processing node updates a data flow circulation table according to the topology that currently needs to be updated, as described by the control flow, wherein the updated data flow circulation table comprises a data flow circulation path that matches the topology that currently needs to be updated; and when the processing node receives a data stream, the processing node performs service processing on the data stream and sends the processed data stream along the data flow circulation path comprised in the updated data flow circulation table.
  6. The method according to claim 5, wherein the method further comprises:
    establishing and maintaining, by the management node, a connection with each of the processing nodes in the real-time computing system;
    wherein the sending, by the management node, the control flow to a processing node that performs service processing on data streams comprises:
    sending, by the management node, the control flow to each of the processing nodes of the real-time computing system by broadcast.
  7. A data stream processing method, comprising:
    obtaining, by a source node of a real-time computing system, a control flow used to adjust a topology of the real-time computing system, wherein the control flow describes the topology that the real-time computing system currently needs to update;
    updating, by the source node, a data flow circulation table according to the topology that currently needs to be updated, as described by the control flow, wherein the updated data flow circulation table comprises a data flow circulation path that matches the topology that currently needs to be updated;
    sending, by the source node, the control flow to a management node of the real-time computing system, so that the management node sends the control flow to a processing node that performs service processing on data streams, and the processing node updates a data flow circulation table according to the topology that currently needs to be updated, as described by the control flow, wherein the updated data flow circulation table comprises a data flow circulation path that matches the topology that currently needs to be updated; and
    receiving, by the source node, a data stream, and sending the data stream to the processing node along the data flow circulation path comprised in the data flow circulation table updated by the source node, so that the processing node performs service processing on the data stream and sends the processed data stream along the data flow circulation path comprised in the data flow circulation table updated by the processing node.
  8. The method according to claim 7, wherein the method further comprises:
    sending, by the source node, the updated data flow circulation table to a shared storage node of the real-time computing system; and
    when the source node performs failure recovery, obtaining, by the source node, the updated data flow circulation table from the shared storage node, and sending data streams according to the updated data flow circulation table.
  9. The method according to claim 7 or 8, wherein the method further comprises:
    receiving, by the source node, an update result, fed back by the processing node, of the processing node updating the data flow circulation table;
    when the update result indicates that the update failed, sending, by the source node, the control flow to the management node, so that the management node sends the control flow to the processing node, and the processing node updates the data flow circulation table according to the topology that currently needs to be updated, as described by the control flow.
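The feedback-and-resend behaviour of this claim can be sketched as a small retry loop. The callable `send_via_manager` is hypothetical: it stands for "send the control flow to the management node and return the processing node's reported update result". The attempt limit is also an assumption; the claim itself does not bound the number of resends.

```python
def push_until_updated(send_via_manager, control_flow, max_attempts=3):
    """Resend control_flow until the processing node reports a successful update.

    send_via_manager is a hypothetical callable that relays the control flow
    through the management node and returns True (updated) or False (failed).
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        if send_via_manager(control_flow):
            return attempt
    raise RuntimeError(f"update still failing after {max_attempts} attempts")


# Simulated processing node that fails once and then succeeds.
results = iter([False, True])
attempts = push_until_updated(lambda cf: next(results), {"p1": {"s": "p2"}})
print(attempts)
```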
  10. A data stream processing apparatus, applied to a processing node of a real-time computing system, comprising a receiving unit, a first updating unit, and a first sending unit, wherein:
    the receiving unit is configured to receive a control flow that is sent by a management node of the real-time computing system and is used to adjust the topology of the real-time computing system, wherein the control flow describes the topology of the real-time computing system that currently needs to be updated;
    the first updating unit is configured to update a data flow circulation table according to the topology that currently needs to be updated, as described by the control flow, wherein the updated data flow circulation table includes a data flow circulation path matching the topology that currently needs to be updated;
    the first sending unit is configured to, when the processing node receives a data stream, perform service processing on the data stream and send the processed data stream along the data flow circulation path included in the updated data flow circulation table.
  11. The apparatus according to claim 10, wherein the apparatus further comprises:
    a second sending unit, configured to send the updated data flow circulation table to a shared storage node of the real-time computing system;
    a recovery unit, configured to, when the processing node performs fault recovery, obtain the updated data flow circulation table from the shared storage node and send data streams according to the updated data flow circulation table.
  12. The apparatus according to claim 10 or 11, wherein the apparatus further comprises:
    a first feedback unit, configured to feed back, to a source node of the real-time computing system, an update result of updating the data flow circulation table, so that the source node sends the control flow to the management node when the update result indicates that the update failed;
    a second updating unit, configured to, when the update result indicates that the update by the first updating unit failed, receive the control flow sent by the management node and update the data flow circulation table according to the topology that currently needs to be updated, as described by the control flow.
  13. The apparatus according to claim 10 or 11, wherein the apparatus further comprises:
    a second feedback unit, configured to feed back, to an output node of the real-time computing system, an update result of updating the data flow circulation table, wherein the output node aggregates the update results fed back by all processing nodes of the real-time computing system and outputs the aggregated result.
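The aggregation performed by the output node could be sketched as follows. The shape of the summary (a count of successful updates, the names of failed nodes, and an overall flag) is an assumption; the claim only says that the results from all processing nodes are aggregated and output.

```python
def summarize(update_results: dict) -> dict:
    """Aggregate per-node update results at a hypothetical output node.

    update_results maps processing-node name -> True (updated) or False (failed).
    """
    failed = sorted(name for name, ok in update_results.items() if not ok)
    return {
        "updated": len(update_results) - len(failed),
        "failed": failed,
        "all_updated": not failed,
    }


print(summarize({"p1": True, "p2": False, "p3": True}))
```

A single summary of this kind lets an operator confirm that the whole topology switched over without querying each processing node separately.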
  14. A data stream processing apparatus, comprising a receiving unit and a sending unit, wherein:
    the receiving unit is configured to receive a control flow that is sent by a source node of a real-time computing system and is used to adjust the topology of the real-time computing system, wherein the control flow describes the topology of the real-time computing system that currently needs to be updated;
    the sending unit is configured to send the control flow to a processing node used for performing service processing on data streams, so that the processing node updates a data flow circulation table according to the topology that currently needs to be updated, as described by the control flow, wherein the updated data flow circulation table includes a data flow circulation path matching the topology that currently needs to be updated; and when the processing node receives a data stream, the processing node performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the updated data flow circulation table.
  15. The apparatus according to claim 14, wherein the apparatus further comprises:
    a holding unit, configured to establish and maintain a connection with each of the processing nodes in the real-time computing system;
    wherein the sending unit is configured to broadcast the control flow to each of the processing nodes of the real-time computing system.
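The combination of maintained connections and broadcast delivery can be sketched as below. Connections are modelled as plain callables held in a dictionary; the class name `ControlFlowBroadcaster` and its methods are hypothetical, and a real system would hold network connections instead.

```python
class ControlFlowBroadcaster:
    """Hypothetical management-node helper that keeps one connection per processing node."""

    def __init__(self):
        self._connections = {}  # node name -> send callable, kept open between broadcasts

    def connect(self, name, send) -> None:
        # Establish (or replace) the maintained connection to one processing node.
        self._connections[name] = send

    def broadcast(self, control_flow) -> int:
        # Deliver the same control flow to every connected processing node.
        for send in self._connections.values():
            send(control_flow)
        return len(self._connections)


inboxes = {"p1": [], "p2": []}
manager = ControlFlowBroadcaster()
for name, inbox in inboxes.items():
    manager.connect(name, inbox.append)

print(manager.broadcast({"version": 1}), inboxes)
```

Keeping the connections open means a topology change does not pay a connection-setup cost per node at the moment the control flow is issued.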
  16. A data stream processing apparatus, applied to a source node of a real-time computing system, comprising an obtaining unit, an updating unit, a first sending unit, and a second sending unit, wherein:
    the obtaining unit is configured to obtain a control flow used to adjust the topology of the real-time computing system, wherein the control flow describes the topology of the real-time computing system that currently needs to be updated;
    the updating unit is configured to update a data flow circulation table according to the topology that currently needs to be updated, as described by the control flow, wherein the updated data flow circulation table includes a data flow circulation path matching the topology that currently needs to be updated;
    the first sending unit is configured to send the control flow to a management node of the real-time computing system, so that the management node sends the control flow to a processing node used for performing service processing on data streams, and the processing node updates its data flow circulation table according to the topology that currently needs to be updated, as described by the control flow, wherein the updated data flow circulation table includes a data flow circulation path matching the topology that currently needs to be updated;
    the second sending unit is configured to receive a data stream and send the data stream to the processing node along the data flow circulation path included in the data flow circulation table updated by the source node, wherein the processing node performs service processing on the data stream and sends the processed data stream along the data flow circulation path included in the data flow circulation table updated by that processing node.
  17. The apparatus according to claim 16, wherein the apparatus further comprises:
    a third sending unit, configured to send the updated data flow circulation table to a shared storage node of the real-time computing system;
    a recovery unit, configured to, when the source node performs fault recovery, obtain the updated data flow circulation table from the shared storage node and send data streams according to the updated data flow circulation table.
  18. The apparatus according to claim 16 or 17, wherein the apparatus further comprises:
    a receiving unit, configured to receive an update result, fed back by the processing node, of the processing node updating the data flow circulation table;
    a fourth sending unit, configured to, when the update result indicates that the update failed, send the control flow to the management node, so that the management node sends the control flow to the processing node, and the processing node updates the data flow circulation table according to the topology that currently needs to be updated, as described by the control flow.
PCT/CN2016/093588 2015-08-27 2016-08-05 Data stream processing method and apparatus WO2017032212A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510534143.5 2015-08-27
CN201510534143.5A CN106487694B (en) 2015-08-27 2015-08-27 Data stream processing method and device

Publications (1)

Publication Number Publication Date
WO2017032212A1 (en) 2017-03-02

Family

ID=58099527

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/093588 WO2017032212A1 (en) 2015-08-27 2016-08-05 Data stream processing method and apparatus

Country Status (2)

Country Link
CN (1) CN106487694B (en)
WO (1) WO2017032212A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11095522B2 (en) * 2019-08-21 2021-08-17 Microsoft Technology Licensing, Llc Dynamic scaling for data processing streaming system

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019369B (en) * 2017-12-31 2022-06-07 中国移动通信集团福建有限公司 Method, apparatus, device and medium for sharing data stream processing topology

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130290554A1 (en) * 2012-04-26 2013-10-31 Qiming Chen Open station as a stream analysis operator container
CN103716182A (en) * 2013-12-12 2014-04-09 中国科学院信息工程研究所 Failure detection and fault tolerance method and failure detection and fault tolerance system for real-time cloud platform
CN104008007A (en) * 2014-06-12 2014-08-27 深圳先进技术研究院 Interoperability data processing system and method based on streaming calculation and batch processing calculation
CN104090886A (en) * 2013-12-09 2014-10-08 深圳市腾讯计算机系统有限公司 Method and device for constructing real-time portrayal of user
WO2014194251A2 (en) * 2013-05-30 2014-12-04 Vaibhav Nivargi Apparatus and method for collaboratively analyzing data from disparate data sources
CN104683445A (en) * 2015-01-26 2015-06-03 北京邮电大学 Distributed real-time data fusion system

Also Published As

Publication number Publication date
CN106487694B (en) 2020-03-27
CN106487694A (en) 2017-03-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16838472

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16838472

Country of ref document: EP

Kind code of ref document: A1