WO2015131721A1 - Data processing method in a stream computing system, control node, and stream computing system - Google Patents
Data processing method in a stream computing system, control node, and stream computing system
- Publication number
- WO2015131721A1 (PCT/CN2015/071645)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- working node
- node
- working
- concurrency
- data
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/40—Support for services or applications
- H04L65/401—Support for services or applications wherein the services involve a main real-time session and one or more additional parallel real-time or time sensitive sessions, e.g. white board sharing or spawning of a subconference
- H04L65/4015—Support for services or applications wherein the services involve a main real-time session and one or more additional parallel real-time or time sensitive sessions, e.g. white board sharing or spawning of a subconference where at least one of the additional parallel sessions is real time or time sensitive, e.g. white board sharing, collaboration or spawning of a subconference
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/04—Processing captured monitoring data, e.g. for logfile generation
- H04L43/045—Processing captured monitoring data, e.g. for logfile generation for graphical visualisation of monitoring data
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/70—Admission control; Resource allocation
- H04L47/82—Miscellaneous aspects
- H04L47/828—Allocation of resources per group of connections, e.g. per group of users
Definitions
- the present invention relates to the field of computer technologies, and in particular, to a data processing method in a stream computing system, a control node, and a stream computing system.
- Data-intensive services have been widely used. Typical data-intensive services include financial services, network monitoring, telecom data management, and web applications.
- such data is characterized by large volume, high speed, and time variability. Therefore, the data should not be modeled with persistent, stable relationships but rather with transient data streams, which has led to the study of data stream computing.
- Data stream computing is a pipeline-like data processing model. It is based on the idea that the value of data decreases over time, so data must be processed as soon as possible after it is generated by an event. Data is processed as soon as it is generated, that is, an event is processed the moment it occurs, rather than being buffered for batch processing.
- data stream computing is based on a streaming data processing model.
- in a stream computing system, the service data processing logic usually needs to be converted into a Directed Acyclic Graph (DAG, also called a flow graph).
- in an existing scheme, the streaming data processing model for data stream computing comprises physical execution units (PEs) and logical units (in the DAG, generally labeled as operators, also called working nodes). The scheme supports statically configured concurrency: according to the concurrency statically configured by the user, each operator invokes the corresponding number of execution units during service execution.
- the execution unit processes the data stream generated by the service.
- a stream computing system is usually a distributed real-time stream processing system in which the processing conditions of the various tasks change in real time.
- the concurrency initially set by the user is therefore often not optimal, and a streaming data processing model generated from that initial concurrency cannot adapt to real-time changes in the system, leading to wasted resources and limited data processing capability in the stream computing system.
- the invention provides a data processing method in a stream computing system, a control node, and a stream computing system, which can adjust the concurrency of working nodes in the stream computing system in real time according to the service processing situation, thereby improving the data processing capability and resource utilization of the stream computing system.
- the present invention provides a data processing method in a stream computing system, the stream computing system including a control node and a plurality of working nodes, the method comprising:
- the control node invokes one or more working nodes of the plurality of working nodes to process the data stream according to the configured concurrency of the working nodes;
- the control node collects data traffic information between each of the one or more working nodes and other working nodes and processing speed information of each of the one or more working nodes;
- the control node determines an optimized concurrency of each of the one or more working nodes according to the collected data traffic information and processing speed information, determines whether the optimized concurrency of each working node is the same as the current concurrency of that working node, and if not, adjusts the concurrency of the working node according to its optimized concurrency.
- each working node includes one or more execution units, and when the working node is invoked to process the data stream, the data stream is specifically processed by the execution units included in the working node; the concurrency of the working node indicates the number of execution units included in the working node; the control node then adjusts the concurrency of the working node according to the optimized concurrency of the working node, including:
- the control node adds at least one new execution unit to the working node, or deletes at least one execution unit of the working node, according to the optimized concurrency of the working node, such that the concurrency represented by the total number of execution units currently included in the working node is the same as the optimized concurrency of the working node.
- the control node adds at least one execution unit to the working node according to the optimized concurrency of the working node, or deletes at least one execution unit of the working node, including:
- when the optimized concurrency of the working node is greater than the concurrency of the working node: the control node generates a first control instruction for creating a new execution unit and sends it to the working node, so that the working node, after receiving the first control instruction, creates at least one new execution unit and creates data channels between the new execution unit and other execution units, wherein the concurrency represented by the total number of execution units currently included in the working node is the same as the optimized concurrency of the working node;
- when the optimized concurrency of the working node is less than the concurrency of the working node: the control node generates a second control instruction for deleting an execution unit of the working node and sends it to the working node, so that the working node, after receiving the second control instruction, deletes at least one execution unit of the working node and deletes the data channels connected to the deleted execution unit, wherein the concurrency represented by the total number of execution units currently included in the working node is the same as the optimized concurrency of the working node.
- after at least one execution unit is added to the working node or at least one execution unit of the working node is deleted according to the optimized concurrency of the working node, the method further includes:
- the control node adjusts a data distribution policy of the upstream working node corresponding to the working node according to the at least one added or deleted execution unit, where the data distribution policy is used to indicate, when the working node distributes data, the device that receives the data and the amount of data the device receives;
- the method further includes:
- the control node adjusts a data distribution policy of the upstream working node corresponding to the working node according to the at least one execution unit added or deleted;
- the control node sends the adjusted data distribution policy to the upstream working node, so that the upstream working node determines, according to the adjusted data distribution policy, the working node group to which the target working node belongs, where the working node group includes at least one working node; determines a downstream target working node from the working node group; and, after determining the target execution unit corresponding to the target working node, dispatches the data packet correspondingly to the target execution unit.
- the present invention provides a control node in a stream computing system, the stream computing system including the control node and a plurality of working nodes, the control node comprising:
- a calling unit configured to invoke one or more working nodes of the plurality of working nodes to process the data stream according to the configured concurrency of the working nodes;
- An information collecting unit configured to collect data traffic information between each of the one or more working nodes and other working nodes, and processing speed information of each of the one or more working nodes;
- a calculating unit configured to determine, according to the data traffic information and the processing speed information collected by the information collecting unit, an optimized concurrency of each of the one or more working nodes;
- an adjusting unit configured to determine, respectively, whether the optimized concurrency of each of the one or more working nodes is the same as the concurrency of that working node, and if not, adjust the concurrency of the working node according to the optimized concurrency of the working node.
- each working node includes one or more execution units, and each working node processes the data stream by invoking its own execution units; the concurrency of the working node represents the number of execution units included in the working node; the adjusting unit is specifically configured to adjust the concurrency of the working node according to the optimized concurrency of the working node.
- the adjustment unit includes:
- a first adjustment module configured to, when the optimized concurrency of the working node is greater than the concurrency of the working node: generate a first control instruction for adding a new execution unit and send it to the working node, so that the working node, after receiving the first control instruction, creates at least one new execution unit and creates data channels between the new execution unit and other execution units; wherein the concurrency represented by the total number of execution units currently included in the working node is the same as the optimized concurrency of the working node;
- a second adjustment module configured to, when the optimized concurrency of the working node is less than the concurrency of the working node: generate a second control instruction for deleting an execution unit of the working node and send it to the working node, so that the working node, after receiving the second control instruction, deletes at least one execution unit of the working node and deletes the data channels connected to the deleted execution unit; wherein the concurrency represented by the total number of execution units currently included in the working node is the same as the optimized concurrency of the working node.
- the control node further includes:
- a first dispatching policy adjusting unit configured to adjust a data distribution policy of the upstream working node corresponding to the working node according to the at least one added or deleted execution unit, and send the adjusted data distribution policy to the upstream working node, so that the upstream working node, after determining the target execution unit corresponding to the downstream target working node according to the adjusted data distribution policy, dispatches the data packet correspondingly to the target execution unit, where the data distribution policy is used to indicate, when the working node distributes data, the device that receives the data and the amount of data the device receives.
- the control node further includes:
- a second dispatching policy adjusting unit configured to adjust the data distribution policy of the upstream working node corresponding to the working node according to the at least one added or deleted execution unit, and send the adjusted data distribution policy to the upstream working node, so that the upstream working node determines, according to the adjusted data distribution policy, the working node group to which the target working node belongs, where the working node group includes at least one working node; determines a downstream target working node from the working node group; and, after determining the target execution unit corresponding to the target working node, dispatches the data packet correspondingly to the target execution unit.
- the present invention provides a stream computing system, where the stream computing system includes a control node and a plurality of working nodes;
- the control node is configured to invoke one or more working nodes of the multiple working nodes to process the data stream according to the concurrency of each working node configured in the flow computing system;
- the working node is configured to process the data stream when invoked by the control node;
- the control node is further configured to collect data traffic information between each of the one or more working nodes and other working nodes, and processing speed information of each of the one or more working nodes; determine, according to the collected data traffic information and processing speed information, an optimized concurrency of each of the one or more working nodes; and determine, respectively, whether the optimized concurrency of each of the one or more working nodes is the same as the concurrency of that working node, and if not, adjust the concurrency of the working node according to the optimized concurrency of the working node.
- the working node includes one or more execution units, and when the working node is invoked to process the data stream, the data stream is specifically processed by the execution units included in the working node; the concurrency of the working node indicates the number of execution units included in the working node; and in the aspect of adjusting the concurrency of the working node according to the optimized concurrency of the working node, the control node is specifically configured to add at least one new execution unit to the working node or delete at least one execution unit of the working node, such that the concurrency represented by the total number of execution units currently included in the working node is the same as the optimized concurrency of the working node.
- the control node is further configured to adjust a data distribution policy of the upstream working node corresponding to the working node according to the at least one added or deleted execution unit, and send the adjusted data distribution policy to the upstream working node, so that the upstream working node, after determining the target execution unit corresponding to the downstream target working node according to the adjusted data distribution policy, dispatches the data packet correspondingly to the target execution unit, wherein the data distribution policy is used to indicate, when the working node distributes data, the device that receives the data and the amount of data the device receives.
- the control node is further configured to adjust the data distribution policy of the upstream working node corresponding to the working node according to the at least one added or deleted execution unit, and send the adjusted data distribution policy to the upstream working node, so that the upstream working node determines, according to the adjusted data distribution policy, the working node group to which the target working node belongs, where the working node group includes at least one working node; determines a downstream target working node from the working node group; and, after determining the target execution unit corresponding to the target working node, dispatches the data packet correspondingly to the target execution unit.
- the technical solution provided by the embodiments of the present invention collects, in real time during system operation, the processing speed information of each working node and the traffic information between working nodes, and adjusts the concurrency of the working nodes according to the information collected in real time, so that the processing capability of each working node can meet the real-time requirements of service processing, thereby dynamically improving the data processing capability and resource utilization of the stream computing system.
- FIG. 1 is a schematic diagram of a DAG in the prior art;
- FIG. 2 is a flowchart of a data processing method in a stream computing system according to an embodiment of the present invention;
- FIG. 3 is a schematic diagram of the correspondence between tuple processing time and tuple arrival time according to an embodiment of the present invention;
- FIG. 4 is a schematic diagram of calculating an optimized concurrency in an embodiment of the present invention;
- FIG. 5 is a schematic diagram of a DAG segment according to an embodiment of the present invention;
- FIG. 6 is a flowchart of a data processing method in another stream computing system according to an embodiment of the present invention;
- FIG. 7 is a schematic structural diagram of a control node in a stream computing system according to an embodiment of the present invention;
- FIG. 8 is a schematic diagram of a stream computing system according to an embodiment of the present invention;
- FIG. 9 is a schematic structural diagram of a control node in another stream computing system according to an embodiment of the present invention.
- an embodiment of the present invention provides a data processing method in a flow computing system.
- the method provided by the embodiment of the present invention can be used in a flow computing system.
- the stream computing system includes a control node and multiple working nodes (also called operators); the control node can send corresponding control instructions to its subordinate working nodes, so that the working nodes invoke execution units to process the data stream generated by the service according to the control instructions.
- Step 201 The control node invokes one or more working nodes of the multiple working nodes to process the data stream according to the concurrency of the configured working nodes.
- the working node in the present invention is also generally referred to as an operator in the stream computing system, and the embodiment of the present invention does not make a special distinction between the two;
- the concurrency of a node is configured in a flow graph (also referred to as a directed acyclic graph) that describes the service processing logic.
- the control node invokes one or more working nodes to process the data stream generated by the service according to the configured concurrency of the working nodes, wherein the flow graph is the general representation of service data processing logic in a stream computing system. For a detailed description of the flow graph, refer to the background section; details are not repeated here.
- each working node includes one or more execution units.
- the data stream is processed by the execution units included in the working node; an execution unit may be a thread or a process; the concurrency of the working node is used to represent the correspondence between the working node and its execution units.
- the concurrency of the working node indicates the number of execution units included in the working node; for example, if the concurrency of working node A is 5, the working node can invoke 5 execution units to process the data stream.
- the concurrency of the working node in this step refers to the concurrency of the initial configuration of the working node.
- Step 202 The control node collects data traffic information between each of the one or more working nodes and other working nodes, and processing speed information of each of the one or more working nodes.
- the traffic information between the working nodes refers to data traffic information between working nodes that have a logical upstream-downstream relationship in the flow graph; the speed information of a working node represents the speed at which the working node processes data, which is determined by the working node's concurrency, data traffic, and other factors.
- Step 203 The control node determines, by using the collected data traffic information and processing speed information, an optimized concurrency of each of the one or more working nodes.
- the optimized concurrency of a working node is the concurrency that matches the current load of the working node.
- in the prior art, execution units are invoked to process the data stream according to the initially configured concurrency, but the actual data processing situation cannot be estimated before processing, so the initial concurrency often does not achieve the optimal effect. For this problem, the method provided by the embodiment of the present invention collects the actual load of each working node in the stream computing system (that is, the collected traffic information and processing speed information), and then calculates a matching optimized concurrency according to the collected load conditions.
- the optimized concurrency matches the data processing situation of the working node, thereby avoiding both wasted resources and the problem of execution units failing to meet data processing requirements.
- Step 204: The control node determines whether the optimized concurrency of each of the one or more working nodes is the same as the concurrency of that working node, and if not, adjusts the concurrency of the working node according to the optimized concurrency of the working node.
- if the optimized concurrency of a working node is the same as its current concurrency, no adjustment is necessary and the current concurrency is maintained.
- the specific algorithm for determining the optimized concurrency of each working node from the collected real-time data traffic information and processing speed information may take various forms. The following is a specific example, which should not be understood as the only implementation of the invention.
- when the time at which a tuple (or data packet) in the data stream arrives at a working node matches the time at which the execution unit processes the tuple, the resource utilization of the system is optimal, which facilitates full use of the system's processing power. If the tuple arrival time is less than the tuple processing time, the execution unit is overloaded and tuples will accumulate in the system.
- if the tuple arrival time 1 at an execution unit is greater than the tuple service processing time, the execution unit is relatively idle (shown in FIG. 3 as the relationship between tuple arrival time 1 and the tuple processing time); if the tuple arrival time 2 is less than the tuple service processing time, the execution unit is overloaded (shown in FIG. 3 as the relationship between tuple arrival time 2 and the tuple processing time).
- the tuple arrival time is the average time interval at which tuples arrive at the execution unit; the tuple service processing time is the average time required for the execution unit to process one tuple.
- the tuple arrival time and the tuple service processing time are calculated from the collected traffic information of the working nodes and the processing speed data of the execution units.
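The idle/overloaded judgment described above can be expressed as a comparison of the two averages. The following is an illustrative sketch, not code from the patent; the function name, labels, and millisecond units are assumptions.

```python
def load_state(arrival_interval_ms, processing_time_ms):
    """Compare the average tuple arrival interval with the average
    per-tuple service processing time of an execution unit."""
    if arrival_interval_ms > processing_time_ms:
        return "idle"        # tuples arrive more slowly than they are processed
    if arrival_interval_ms < processing_time_ms:
        return "overloaded"  # tuples accumulate faster than they drain
    return "balanced"        # resource utilization is optimal
```

For example, an average arrival interval of 20 ms against a 10 ms processing time indicates a relatively idle execution unit, matching the FIG. 3 relationship described above.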
- when the two do not match, the concurrency of the working node needs to be adjusted so that the tuple service processing time is consistent with the tuple arrival time.
- the operational relationship between the tuple arrival time and the concurrency of the working node is: the greater the concurrency, the greater the tuple arrival time at each execution unit, because the incoming tuples are spread over more execution units.
- the concurrency of each operator is calculated layer by layer, and the calculation order is A, B, C, D, E, F.
- Dop represents the concurrency value
- the concurrency is calculated by the formula dop ≥ tuple processing time / tuple arrival time.
- the tuple processing time and the tuple issuing time are obtained according to the statistical information reported by the actual service processing.
- the tuple arrival time is obtained according to the tuple issuing time of the upstream node.
- the working node A (or operator A) in FIG. 4 is the source node, and the tuple arrival time is obtained according to the throughput, that is, 1s/67 ⁇ 15ms.
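A minimal sketch of the formula above, under the assumption that dop is the smallest integer satisfying dop ≥ tuple processing time / tuple arrival time. The numbers reuse the source-node example (1 s / 67 ≈ 15 ms); the processing times are made up for illustration.

```python
import math

def optimized_dop(processing_time_ms, arrival_interval_ms):
    # Smallest integer dop with dop >= processing time / arrival time,
    # and at least one execution unit.
    return max(1, math.ceil(processing_time_ms / arrival_interval_ms))

# Source operator A: tuple arrival time derived from throughput,
# 67 tuples per second -> 1000 ms / 67 ~= 15 ms between tuples.
arrival_a_ms = round(1000 / 67)            # ~= 15 ms
dop_a = optimized_dop(45, arrival_a_ms)    # a 45 ms processing time -> dop = 3

# A downstream operator's arrival time would be derived from the tuple
# issuing time of its upstream node and fed into the same formula,
# proceeding layer by layer in the order A, B, C, D, E, F.
```

Rounding up reflects the inequality in the formula: any fractional requirement still needs one whole additional execution unit.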
- the above method of calculating the optimized concurrency is an optional implementation of the embodiment of the present invention; the calculation of the optimized concurrency in the present invention is not limited to the above manner.
- the method provided by the embodiment of the present invention is applicable to any scenario in which the optimized concurrency is calculated according to the real-time situation and the processing logic in the stream computing system is then adjusted according to the optimized concurrency; in specific application environments, the requirements of different stream computing systems differ and device performance differs, so the manner of calculating the optimized concurrency may also differ.
- each working node includes one or more execution units
- the working node is called to process the data stream
- based on the calculated optimized concurrency, it can be determined whether the initially set concurrency of each working node is consistent with the current processing situation; if not, the concurrency of the working node can be adjusted. Adjusting the concurrency of the working node according to the optimized concurrency includes:
- adjusting to exactly the optimized concurrency is the ideal implementation, but in a specific implementation process, due to other objective conditions, the concurrency may in practice be adjusted so that the adjusted concurrency is equal to or merely close to the optimized concurrency.
- the specific effect of the adjustment is that the data processing capability of the working node better suits the current data processing needs of the system.
- the control node generates a first control instruction for creating a new execution unit and sends it to the working node, so that the working node creates at least one new execution unit after receiving the first control instruction and creates data channels between the new execution unit and other execution units; after adjustment, the concurrency represented by the total number of execution units currently included in the working node is the same as the optimized concurrency of the working node.
- the data channel between the new execution unit and the downstream execution unit is generally established first, and then the upstream data channel is established correspondingly.
- the control node generates a second control instruction for deleting an execution unit of the working node and sends it to the working node, so that the working node deletes at least one execution unit after receiving the second control instruction, along with the data channels connected to the deleted execution unit; after adjustment, the concurrency represented by the total number of execution units currently included in the working node is the same as the optimized concurrency of the working node.
- the specific implementation steps of deleting the execution unit by the working node may be:
- the second upstream working node corresponds to at least one second upstream execution unit, and the second downstream working node corresponds to at least one second downstream execution unit;
- the execution unit to be deleted is deleted.
- one or some execution units need to be deleted.
- the specific operations when deleting the execution unit may be:
- first, the data channel between the execution unit to be deleted and its upstream execution unit is disconnected; then, the execution unit to be deleted processes its remaining unprocessed data; after the data is processed, the data channel between the execution unit to be deleted and the downstream execution unit is deleted; finally, the execution unit itself is deleted.
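The four deletion steps above can be sketched as follows. `ExecutionUnit` and the method names are illustrative assumptions, not an API from the patent; the point is that draining before removal loses no tuples.

```python
class ExecutionUnit:
    """Minimal stand-in for an execution unit holding in-flight tuples."""
    def __init__(self, name):
        self.name = name
        self.upstream_connected = True    # channel from upstream units
        self.downstream_connected = True  # channel to downstream units
        self.pending = []                 # tuples received but not yet processed
        self.processed = []

    def drain(self):
        # Process every tuple already received before shutdown.
        while self.pending:
            self.processed.append(self.pending.pop(0))

def delete_execution_unit(unit):
    unit.upstream_connected = False    # 1. stop new tuples from arriving
    unit.drain()                       # 2. finish the unprocessed data
    unit.downstream_connected = False  # 3. remove the downstream channel
    return unit                        # 4. the unit can now be discarded

pe = ExecutionUnit("pe_to_delete")
pe.pending = ["t1", "t2"]
delete_execution_unit(pe)  # both tuples end up processed, none dropped
```

Disconnecting the upstream channel first is what makes the drain step finite: no new work can arrive while the unit empties its backlog.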
- since execution units are added or deleted while the upstream working node of the adjusted working node is distributing data, the data distribution policy needs to be adjusted correspondingly. For example, if an execution unit is newly added, data needs to be dispatched to the newly added execution unit for processing. Therefore, after step 204 is performed, the method provided by the embodiment of the present invention further includes:
- the data distribution policy of the upstream working node corresponding to the working node is correspondingly adjusted.
- the concurrency of the working node is adjusted, that is, a certain number of execution units need to be newly added or deleted with respect to the original working node.
- after the execution units are adjusted, if the data distribution policy of the upstream working node is not adjusted, problems will occur in data processing.
- the data distribution policy needs to be generated according to the number of downstream execution units and the processing capability of each execution unit, so a specific data distribution policy includes the path of data distribution and the specific execution units to which data is distributed.
- the control node adjusts a data distribution policy of the upstream working node corresponding to the working node according to the at least one added or deleted execution unit, where the data distribution policy indicates, when the working node distributes data, the devices that receive the data and the amount of data each receiving device receives.
- n0 represents the upstream operator
- n1, n2, and n3 respectively represent the downstream operator of n0
- n0 transmits two streams s1 and s2, where n1 and n2 subscribe to the s1 stream, and n3 subscribes to s2.
- the concurrency of n1 is 1, so n1 is performed by one PE (pe1); the concurrency of n2 is 2, so n2 is performed by two PEs (pe2 and pe3); the concurrency of n3 is 3, so n3 is performed by three PEs.
- the first-level dispatch is performed, that is, the target operator is selected.
- the target operators selectable in the example are n1 and n2, and then respectively for n1 and n2.
- the second-level dispatch is then performed: when the second-level dispatch is performed for n1, since the concurrency of n1 is 1, it is directly determined that tuple0 is dispatched to pe1; when the second-level dispatch is performed for n2, since the concurrency of n2 is 2, data needs to be distributed according to the distribution policy configured for n2.
- it can be configured as a hash distribution, that is, first hashing some attribute fields of tuple0 to obtain a corresponding hash value, then taking that hash value modulo the concurrency, and using the result as an index to select the PE corresponding to n2.
- the corresponding data distribution policy needs to be set, and the distribution policy may be extended in a specific embodiment. For example, embodiments may support random distribution (the stream is randomly sent to one PE corresponding to the downstream operator), full distribution (the stream is sent to all PEs corresponding to the downstream operator), and hash distribution (the target PE is determined according to a hash-modulo method).
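The two-level dispatch in the n0/n1/n2/n3 example can be sketched as below. This is a hedged illustration: the PE names for n3 (pe4 through pe6) and the use of CRC32 as the attribute-field hash are assumptions made for the sketch, not details fixed by the embodiment.

```python
import zlib

# Stream subscriptions and PE assignments from the example:
# n1 and n2 subscribe to s1, n3 subscribes to s2;
# n1 has concurrency 1, n2 has concurrency 2, n3 has concurrency 3.
subscriptions = {"s1": ["n1", "n2"], "s2": ["n3"]}
pes = {"n1": ["pe1"], "n2": ["pe2", "pe3"], "n3": ["pe4", "pe5", "pe6"]}

def dispatch(stream, key_field):
    targets = []
    # First level: select the downstream operators subscribed to the stream.
    for op in subscriptions[stream]:
        units = pes[op]
        if len(units) == 1:
            # Concurrency 1: the tuple goes straight to the only PE.
            targets.append(units[0])
        else:
            # Hash distribution: hash the attribute field, take the value
            # modulo the operator's concurrency, use the result as an index.
            index = zlib.crc32(str(key_field).encode()) % len(units)
            targets.append(units[index])
    return targets
```

A random or full distribution policy would replace only the `else` branch; the first-level operator selection stays the same.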
- the control node adjusts a data distribution policy of the upstream working node corresponding to the working node according to the at least one execution unit added or deleted;
- the control node sends the adjusted data distribution policy to the upstream working node, so that the upstream working node determines, according to the adjusted data distribution policy, the working node group to which the target working node belongs, where the working node group includes at least one working node; determines a downstream target working node from the working node group; and, after determining the target execution unit corresponding to the target working node, dispatches the data packet to the target execution unit.
- the solution provided by this embodiment of the present invention further provides a multi-level data dispatching scheme on the basis of adjusting the concurrency of an operator, so that the concurrency of the operator can be adjusted while correct data distribution is ensured.
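A minimal sketch of the group-based multi-level dispatch described above, assuming a static mapping from groups to working nodes and from nodes to execution units, with a simple modulo selection at each level; all names (node_groups, w1, peA, and so on) are illustrative, not from the patent.

```python
# Hypothetical topology: one working node group containing two working
# nodes, each node owning its own execution units (PEs).
node_groups = {"g1": ["w1", "w2"]}                   # group -> working nodes
node_units = {"w1": ["peA"], "w2": ["peB", "peC"]}   # node -> execution units

def multilevel_dispatch(group, key):
    # Level 1 (given): the adjusted policy names the working node group.
    nodes = node_groups[group]
    # Level 2: determine the downstream target working node in the group.
    node = nodes[key % len(nodes)]
    # Level 3: determine the target execution unit on that node.
    units = node_units[node]
    unit = units[key % len(units)]
    return node, unit
```

When the control node adds or deletes a PE, only `node_units` changes; the upstream node keeps dispatching through the same group lookup.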
- the control node of the flow computing system instructs one or more working nodes to process the data flow generated by the service according to the configured flow graph, where the flow graph includes initial concurrency set in advance for each working node, and the specific steps include:
- Step 601 Each working node collects its processing speed information for the data stream and the data traffic information between itself and other working nodes, generates processing capability description information from the collected information, and sends it to the corresponding control node;
- Step 602 The control node collects processing capability description information of each working node that is invoked;
- Step 603 The control node determines the optimized concurrency of each working node by using the collected real-time processing capability description information and the flow graph.
- Step 604 The control node determines whether the optimized concurrency of each working node is the same as the initial concurrency of the working node in the flow graph, and if not, generates a control instruction according to the optimized concurrency and sends the control instruction to the working node;
- Step 605 After receiving the control instruction, the working node adjusts its own concurrency according to the control instruction.
- the method provided by the embodiment of the present invention collects the processing status of each working node in real time during the running of the system, and then according to the real-time processing situation.
- the concurrency of the working node is adjusted so that the processing capability of the working node can meet the real-time requirements of the business processing, thereby achieving the effect of dynamically improving the data processing capability and resource utilization of the streaming computing system.
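Steps 601 through 605 can be condensed into a control-loop sketch like the following. The model that derives optimized concurrency from traffic divided by per-unit processing speed is an assumption made for illustration only; the patent does not fix a particular formula, and all names here are hypothetical.

```python
import math

def optimized_concurrency(traffic_per_sec, speed_per_unit):
    # Step 603 (illustrative model): enough execution units to keep up
    # with the measured input traffic, with a floor of one unit.
    return max(1, math.ceil(traffic_per_sec / speed_per_unit))

def control_step(nodes):
    """Steps 602-604: nodes maps a node name to its collected processing
    capability description (traffic, per-unit speed, current concurrency).
    Returns the control instructions to send (node -> target concurrency)."""
    instructions = {}
    for name, info in nodes.items():
        target = optimized_concurrency(info["traffic"], info["speed"])
        # Step 604: issue an instruction only when concurrency must change;
        # step 605 is the node applying the received target.
        if target != info["concurrency"]:
            instructions[name] = target
    return instructions
```

Running this periodically against freshly collected descriptions gives the real-time adjustment behavior the embodiment describes.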
- the present invention further provides a control node 700 in a flow computing system, where the flow computing system includes a control node and a plurality of working nodes, and the control node includes:
- the calling unit 701 is configured to invoke one or more working nodes of the plurality of working nodes to process the data stream according to the concurrency of the configured working nodes;
- the information collection unit 702 is configured to collect data traffic information between each of the one or more working nodes and other working nodes, and processing speed information of each of the one or more working nodes;
- the calculating unit 703 is configured to determine, according to the data traffic information and the processing speed information collected by the information collecting unit 702, an optimized concurrency of each of the one or more working nodes;
- the adjusting unit 704 is configured to determine, for each of the one or more working nodes, whether the optimized concurrency of the working node is the same as the initial concurrency of the working node, and if not, adjust the concurrency of the working node according to the optimized concurrency of the working node.
- each working node includes one or more execution units, and each working node processes the data flow by invoking its own execution units; the concurrency of a working node indicates the number of execution units included in the working node. In the aspect of adjusting the concurrency of the working node according to the optimized concurrency of the working node, the adjusting unit 704 includes:
- a first adjustment module, configured to, when the optimized concurrency of the working node is greater than the initial concurrency of the working node, generate a first control instruction for creating a new execution unit and send it to the working node, so that after receiving the first control instruction, the working node creates at least one new execution unit and creates data channels between the new execution unit and other execution units;
- a second adjustment module, configured to, when the optimized concurrency of the working node is less than the initial concurrency of the working node, generate a second control instruction for deleting an execution unit of the working node and send it to the working node, so that after receiving the second control instruction, the working node deletes at least one execution unit of the working node and deletes the data channels connected to the deleted execution unit.
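The behavior of the first and second adjustment modules can be sketched as a single reconciliation step, under the assumption that execution units can be modeled as list entries and that channel setup is folded into unit creation; `adjust_concurrency` and `make_unit` are hypothetical names, not from the patent.

```python
def adjust_concurrency(units, optimized, make_unit):
    """Reconcile the current execution units with the optimized concurrency.
    units: current list of execution units; optimized: target concurrency;
    make_unit: factory that creates a unit (and, in a real system, its
    data channels) given its index."""
    if optimized > len(units):
        # First control instruction: create new units until counts match.
        while len(units) < optimized:
            units.append(make_unit(len(units)))
    elif optimized < len(units):
        # Second control instruction: delete surplus units (and, in a real
        # system, drain them and remove their data channels first).
        del units[optimized:]
    return units
```

After this step, the concurrency represented by the number of units equals the optimized concurrency, matching the invariant stated in the claims.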
- the device further includes:
- the first dispatching policy adjustment unit 705 is configured to adjust, according to the at least one added or deleted execution unit, the data dispatching policy of the upstream working node corresponding to the working node, and send the adjusted data distribution policy to the upstream working node, so that the upstream working node, according to the adjusted data distribution policy, determines the target execution unit corresponding to the downstream target working node and then dispatches the data packet to the target execution unit, where the data distribution policy indicates, when the working node distributes data, the devices that receive the data and the amount of data each receiving device receives.
- the second dispatching policy adjusting unit 706 is configured to adjust, according to the at least one added or deleted execution unit, the data dispatching policy of the upstream working node corresponding to the working node, and send the adjusted data distribution policy to the upstream working node, so that the upstream working node determines, according to the adjusted data distribution policy, the working node group to which the target working node belongs, where the working node group includes at least one working node; determines a downstream target working node from the working node group; and, after determining the target execution unit corresponding to the target working node, dispatches the data packet to the target execution unit.
- the control node 700 in the flow computing system provided by this embodiment of the present invention is used to implement the data processing methods described in the foregoing method embodiments.
- the control node in the flow computing system collects the processing status of each working node in real time during the running of the flow computing system, and then adjusts the concurrency of the working node according to the real-time processing situation, so that the working node
- the processing capability can meet the real-time requirements of business processing, thereby achieving the effect of dynamically improving the data processing capability and resource utilization of the stream computing system.
- the embodiment of the present invention further provides a flow computing system 800, the flow computing system 800 includes: a control node 801 and a plurality of working nodes 802;
- the control node 801 is configured to, according to the concurrency of each working node 802 configured in the flow computing system, invoke one or more working nodes of the multiple working nodes to process a data flow generated by the service;
- a working node 802, configured to process, when invoked by the control node 801, the data stream generated by the service;
- the control node 801 is further configured to: collect data traffic information between each of the one or more working nodes and other working nodes, and processing speed information of each of the one or more working nodes; determine, according to the collected data traffic information and processing speed information, an optimized concurrency of each of the one or more working nodes; and determine, for each of the one or more working nodes, whether the optimized concurrency of the working node is the same as the concurrency of the working node, and if not, adjust the concurrency of the working node according to the optimized concurrency of the working node.
- the working node includes one or more execution units, and when the working node is invoked to process the data stream, the data stream is processed by the execution units included in the working node; the concurrency of the working node indicates the number of execution units of the working node. In the aspect of adjusting the concurrency of the working node according to the optimized concurrency of the working node, the control node 801 is specifically configured to generate a control instruction according to the optimized concurrency of the working node and send the control instruction to the working node 802;
- the working node 802 is further configured to: add at least one execution unit according to the control instruction, or delete at least one execution unit of the working node 802, so that the concurrency of the working node 802, as represented by the number of execution units it currently includes, is the same as the optimized concurrency of the working node 802.
- control node 801 is further configured to: adjust, according to the at least one execution unit added or deleted, a data dispatching policy of the upstream working node corresponding to the working node 802, and send the adjusted data distribution policy to the upstream working node.
- the upstream working node determines the target execution unit corresponding to the downstream target working node according to the adjusted data distribution policy
- and then dispatches the data packet to the target execution unit, where the data distribution policy indicates, when the working node distributes data, the devices that receive the data and the amount of data each receiving device receives.
- the control node 801 is further configured to: adjust, according to the at least one added or deleted execution unit, the data distribution policy of the upstream working node corresponding to the working node, and send the adjusted data distribution policy to the upstream working node, so that the upstream working node determines, according to the adjusted data distribution policy, the working node group to which the target working node belongs, where the working node group includes at least one working node; determines a downstream target working node from the working node group; and, after determining the target execution unit corresponding to the target working node, dispatches the data packet to the target execution unit.
- the flow computing system collects the processing speed information of each working node and the traffic information between working nodes in real time while the system is running, and adjusts the concurrency of the working nodes according to the information collected in real time, so that the processing capability of the working nodes can meet the real-time requirements of the business processing, thereby dynamically improving the data processing capability and resource utilization of the flow computing system.
- the present invention further provides a control node for performing the data processing method in the foregoing embodiments. The control node includes at least one processor 901 (e.g., a CPU), at least one network interface 902 or other communication interface, a memory 903, and at least one communication bus 904 used to implement connection and communication between these components.
- the processor 901 is configured to execute an executable module, such as a computer program, stored in the memory 903.
- the memory 903 may include a high-speed random access memory (RAM: Random Access Memory), and may also include a non-volatile memory, such as at least one disk storage.
- the communication connection between the system gateway and at least one other network element is implemented through the at least one network interface 902 (which may be wired or wireless), and may use the Internet, a wide area network, a local area network, a metropolitan area network, or the like.
- the memory stores a program 9031, which may be executed by the processor to: invoke one or more of the plurality of working nodes to process the data stream according to the configured concurrency of each working node; collect data traffic information between each of the one or more working nodes and other working nodes, and processing speed information of each of the one or more working nodes; determine an optimized concurrency of each of the one or more working nodes according to the collected data traffic information and processing speed information; and determine, for each of the one or more working nodes, whether its optimized concurrency is the same as its current concurrency, and if not, adjust the concurrency of the working node according to its optimized concurrency.
- the method provided by the embodiment of the present invention collects the processing status of each working node in real time during the running of the system, and then according to the real-time processing situation.
- the concurrency of the working node is adjusted so that the processing capability of the working node can meet the real-time requirements of the business processing, thereby achieving the effect of dynamically improving the data processing capability and resource utilization of the streaming computing system.
Claims (14)
- A data processing method in a stream computing system, the stream computing system comprising a control node and a plurality of working nodes, wherein the method comprises: invoking, by the control node according to a configured concurrency of each working node, one or more of the plurality of working nodes to process a data stream; collecting, by the control node, data traffic information between each of the one or more working nodes and other working nodes, and processing speed information of each of the one or more working nodes; determining, by the control node according to the collected data traffic information and processing speed information, an optimized concurrency of each of the one or more working nodes; and determining, by the control node for each of the one or more working nodes, whether the optimized concurrency of the working node is the same as the concurrency of the working node, and if not, adjusting the concurrency of the working node according to the optimized concurrency of the working node.
- The method according to claim 1, wherein each working node comprises one or more execution units, and when a working node is invoked to process the data stream, the data stream is processed by the execution units comprised in the working node; the concurrency of a working node indicates the number of execution units comprised in the working node; and the adjusting, by the control node, the concurrency of the working node according to the optimized concurrency of the working node comprises: adding, by the control node according to the optimized concurrency of the working node, at least one execution unit to the working node, or deleting at least one execution unit of the working node, so that the concurrency of the working node represented by the number of execution units currently comprised in the working node is the same as the optimized concurrency of the working node.
- The method according to claim 2, wherein the adding, by the control node according to the optimized concurrency of the working node, at least one execution unit to the working node, or deleting at least one execution unit of the working node comprises: when the optimized concurrency of the working node is greater than the concurrency of the working node, generating, by the control node, a first control instruction for creating a new execution unit and sending it to the working node, so that after receiving the first control instruction, the working node creates at least one new execution unit and creates data channels between the new execution unit and other execution units, wherein the concurrency of the working node represented by the total number of execution units currently comprised in the working node is the same as the optimized concurrency of the working node; and when the optimized concurrency of the working node is less than the concurrency of the working node, generating, by the control node, a second control instruction for deleting an execution unit of the working node and sending it to the working node, so that after receiving the second control instruction, the working node deletes at least one execution unit of the working node and deletes the data channels connected to the deleted execution unit, wherein the concurrency of the working node represented by the total number of execution units currently comprised in the working node is the same as the optimized concurrency of the working node.
- The method according to claim 2 or 3, wherein after the adding at least one execution unit to the working node or deleting at least one execution unit of the working node according to the optimized concurrency of the working node, the method further comprises: adjusting, by the control node according to the at least one added or deleted execution unit, a data distribution policy of an upstream working node corresponding to the working node, wherein the data distribution policy indicates, when a working node distributes data, the devices that receive the data and the amount of data each receiving device receives; and sending, by the control node, the adjusted data distribution policy to the upstream working node, so that the upstream working node, according to the adjusted data distribution policy, determines a target execution unit corresponding to a downstream target working node and then dispatches a data packet to the target execution unit.
- The method according to claim 2 or 3, wherein after the adding at least one execution unit to the working node or deleting at least one execution unit of the working node according to the optimized concurrency of the working node, the method further comprises: adjusting, by the control node according to the at least one added or deleted execution unit, the data distribution policy of the upstream working node corresponding to the working node; and sending, by the control node, the adjusted data distribution policy to the upstream working node, so that the upstream working node, according to the adjusted data distribution policy, determines the working node group to which the target working node belongs, wherein the working node group comprises at least one working node; determines a downstream target working node from the working node group; and, after determining the target execution unit corresponding to the target working node, dispatches the data packet to the target execution unit.
- A control node in a stream computing system, the stream computing system comprising the control node and a plurality of working nodes, wherein the control node comprises: an invoking unit, configured to invoke, according to a configured concurrency of each working node, one or more of the plurality of working nodes to process a data stream; an information collecting unit, configured to collect data traffic information between each of the one or more working nodes and other working nodes, and processing speed information of each of the one or more working nodes; a calculating unit, configured to determine, according to the data traffic information and the processing speed information collected by the information collecting unit, an optimized concurrency of each of the one or more working nodes; and an adjusting unit, configured to determine, for each of the one or more working nodes, whether the optimized concurrency of the working node is the same as the concurrency of the working node, and if not, adjust the concurrency of the working node according to the optimized concurrency of the working node.
- The control node according to claim 6, wherein each working node comprises one or more execution units, and when a working node is invoked to process the data stream, the data stream is processed by the execution units comprised in the working node; the concurrency of a working node indicates the number of execution units comprised in the working node; and in the aspect of adjusting the concurrency of the working node according to the optimized concurrency of the working node, the adjusting unit is specifically configured to add, according to the optimized concurrency of the working node, at least one execution unit to the working node, or delete at least one execution unit of the working node, so that the concurrency of the working node represented by the number of execution units currently comprised in the working node is the same as the optimized concurrency of the working node.
- The control node according to claim 7, wherein in the aspect of adding or deleting at least one execution unit for the working node according to the optimized concurrency of the working node, the adjusting unit comprises: a first adjustment module, configured to, when the optimized concurrency of the working node is greater than the concurrency of the working node, generate a first control instruction for adding an execution unit and send it to the working node, so that after receiving the first control instruction, the working node creates at least one new execution unit and creates data channels between the new execution unit and other execution units, wherein the concurrency of the working node represented by the total number of execution units currently comprised in the working node is the same as the optimized concurrency of the working node; and a second adjustment module, configured to, when the optimized concurrency of the working node is less than the concurrency of the working node, generate a second control instruction for deleting an execution unit of the working node and send it to the working node, so that after receiving the second control instruction, the working node deletes at least one execution unit of the working node and deletes the data channels connected to the deleted execution unit, wherein the concurrency of the working node represented by the total number of execution units currently comprised in the working node is the same as the optimized concurrency of the working node.
- The control node according to claim 7 or 8, wherein the control node further comprises: a first distribution policy adjusting unit, configured to adjust, according to the at least one added or deleted execution unit, the data distribution policy of the upstream working node corresponding to the working node, and send the adjusted data distribution policy to the upstream working node, so that the upstream working node, according to the adjusted data distribution policy, determines the target execution unit corresponding to the downstream target working node and then dispatches the data packet to the target execution unit, wherein the data distribution policy indicates, when a working node distributes data, the devices that receive the data and the amount of data each receiving device receives.
- The control node according to claim 7 or 8, wherein the control node further comprises: a second distribution policy adjusting unit, configured to adjust, according to the at least one added or deleted execution unit, the data distribution policy of the upstream working node corresponding to the working node, and send the adjusted data distribution policy to the upstream working node, so that the upstream working node, according to the adjusted data distribution policy, determines the working node group to which the target working node belongs, wherein the working node group comprises at least one working node; determines a downstream target working node from the working node group; and, after determining the target execution unit corresponding to the target working node, dispatches the data packet to the target execution unit.
- A stream computing system, comprising a control node and a plurality of working nodes, wherein: the control node is configured to invoke, according to a concurrency of each working node configured in the stream computing system, one or more of the plurality of working nodes to process a data stream; the working node is configured to process the data stream when invoked by the control node; and the control node is further configured to: collect data traffic information between each of the one or more working nodes and other working nodes, and processing speed information of each of the one or more working nodes; determine, according to the collected data traffic information and processing speed information, an optimized concurrency of each of the one or more working nodes; and determine, for each of the one or more working nodes, whether the optimized concurrency of the working node is the same as the concurrency of the working node, and if not, adjust the concurrency of the working node according to the optimized concurrency of the working node.
- The stream computing system according to claim 11, wherein the working node comprises one or more execution units, and when the working node is invoked to process the data stream, the data stream is processed by the execution units comprised in the working node; the concurrency of the working node indicates the number of execution units comprised in the working node; and in the aspect of adjusting the concurrency of the working node according to the optimized concurrency of the working node, the control node is specifically configured to add, according to the optimized concurrency of the working node, at least one execution unit to the working node, or delete at least one execution unit of the working node, so that the concurrency of the working node represented by the number of execution units currently comprised in the working node is the same as the optimized concurrency of the working node.
- The stream computing system according to claim 12, wherein the control node is further configured to adjust, according to the at least one added or deleted execution unit, the data distribution policy of the upstream working node corresponding to the working node, and send the adjusted data distribution policy to the upstream working node, so that the upstream working node, according to the adjusted data distribution policy, determines the target execution unit corresponding to the downstream target working node and then dispatches the data packet to the target execution unit, wherein the data distribution policy indicates, when a working node distributes data, the devices that receive the data and the amount of data each receiving device receives.
- The stream computing system according to claim 12, wherein the control node is further configured to adjust, according to the at least one added or deleted execution unit, the data distribution policy of the upstream working node corresponding to the working node, and send the adjusted data distribution policy to the upstream working node, so that the upstream working node, according to the adjusted data distribution policy, determines the working node group to which the target working node belongs, wherein the working node group comprises at least one working node; determines a downstream target working node from the working node group; and, after determining the target execution unit corresponding to the target working node, dispatches the data packet to the target execution unit.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020167027194A KR101858041B1 (ko) | 2014-03-06 | 2015-01-27 | 스트림 컴퓨팅 시스템의 데이터 처리 방법, 제어 노드, 그리고 스트림 컴퓨팅 시스템 |
EP15759274.2A EP3115896A4 (en) | 2014-03-06 | 2015-01-27 | Data processing method in stream computing system, control node and stream computing system |
JP2016555667A JP6436594B2 (ja) | 2014-03-06 | 2015-01-27 | ストリーム計算システムにおけるデータ処理方法、制御ノードおよびストリーム計算システム |
US15/257,722 US10097595B2 (en) | 2014-03-06 | 2016-09-06 | Data processing method in stream computing system, control node, and stream computing system |
US16/112,236 US10630737B2 (en) | 2014-03-06 | 2018-08-24 | Data processing method in stream computing system, control node, and stream computing system |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410082041.XA CN103870340B (zh) | 2014-03-06 | 2014-03-06 | 流计算系统中的数据处理方法、控制节点及流计算系统 |
CN201410082041.X | 2014-03-06 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/257,722 Continuation US10097595B2 (en) | 2014-03-06 | 2016-09-06 | Data processing method in stream computing system, control node, and stream computing system |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015131721A1 true WO2015131721A1 (zh) | 2015-09-11 |
Family
ID=50908902
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2015/071645 WO2015131721A1 (zh) | 2014-03-06 | 2015-01-27 | 流计算系统中的数据处理方法、控制节点及流计算系统 |
Country Status (6)
Country | Link |
---|---|
US (2) | US10097595B2 (zh) |
EP (1) | EP3115896A4 (zh) |
JP (1) | JP6436594B2 (zh) |
KR (1) | KR101858041B1 (zh) |
CN (2) | CN103870340B (zh) |
WO (1) | WO2015131721A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112596895A (zh) * | 2020-12-02 | 2021-04-02 | 中国科学院计算技术研究所 | 一种sql语义感知的弹性倾斜处理方法及系统 |
Families Citing this family (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103870340B (zh) | 2014-03-06 | 2017-11-07 | 华为技术有限公司 | 流计算系统中的数据处理方法、控制节点及流计算系统 |
CN104063293B (zh) * | 2014-07-04 | 2017-06-27 | 华为技术有限公司 | 一种数据备份方法及流计算系统 |
CN104216766B (zh) * | 2014-08-26 | 2017-08-29 | 华为技术有限公司 | 对流数据进行处理的方法及装置 |
CN104320382B (zh) * | 2014-09-30 | 2018-04-20 | 华为技术有限公司 | 分布式的实时流处理装置、方法和单元 |
CN104317556B (zh) * | 2014-10-22 | 2018-03-16 | 华为技术有限公司 | 一种流式应用升级方法、主控节点及流计算系统 |
CN104572182B (zh) * | 2014-12-23 | 2018-07-13 | 杭州华为数字技术有限公司 | 一种流应用的配置方法、节点及流计算系统 |
CN106339252B (zh) * | 2015-07-08 | 2020-06-23 | 阿里巴巴集团控股有限公司 | 分布式dag系统的自适应优化方法和装置 |
CN105224805B (zh) * | 2015-10-10 | 2018-03-16 | 百度在线网络技术(北京)有限公司 | 基于流式计算的资源管理方法及装置 |
GB2544049A (en) * | 2015-11-03 | 2017-05-10 | Barco Nv | Method and system for optimized routing of data streams in telecommunication networks |
CN105930203B (zh) * | 2015-12-29 | 2019-08-13 | 中国银联股份有限公司 | 一种控制消息分发的方法及装置 |
CN105976242A (zh) * | 2016-04-21 | 2016-09-28 | 中国农业银行股份有限公司 | 一种基于实时流数据分析的交易欺诈检测方法及系统 |
CN107678790B (zh) * | 2016-07-29 | 2020-05-08 | 华为技术有限公司 | 流计算方法、装置及系统 |
US10572276B2 (en) | 2016-09-12 | 2020-02-25 | International Business Machines Corporation | Window management based on a set of computing resources in a stream computing environment |
CN108241525A (zh) * | 2016-12-23 | 2018-07-03 | 航天星图科技(北京)有限公司 | 一种多节点任务动态控制方法 |
CN109408219B (zh) * | 2017-08-16 | 2021-04-02 | 中国电信股份有限公司 | 分布式数据接收方法、系统和用于分布式数据接收的装置 |
CN107943579B (zh) * | 2017-11-08 | 2022-01-11 | 深圳前海微众银行股份有限公司 | 资源瓶颈预测方法、设备、系统及可读存储介质 |
US10496383B2 (en) * | 2017-12-20 | 2019-12-03 | Intel Corporation | Methods and apparatus to convert a non-series-parallel control flow graph to data flow |
CN108628605A (zh) * | 2018-04-28 | 2018-10-09 | 百度在线网络技术(北京)有限公司 | 流式数据处理方法、装置、服务器和介质 |
CN108984770A (zh) * | 2018-07-23 | 2018-12-11 | 北京百度网讯科技有限公司 | 用于处理数据的方法和装置 |
CN109117355A (zh) * | 2018-08-31 | 2019-01-01 | 百度在线网络技术(北京)有限公司 | 用于分析信息流系统性能的方法和装置 |
CN110297640B (zh) * | 2019-06-12 | 2020-10-16 | 北京三快在线科技有限公司 | 模型部署的方法、装置、存储介质及电子设备 |
CN112256444B (zh) * | 2019-07-22 | 2023-08-01 | 腾讯科技(深圳)有限公司 | 基于dag的业务处理方法、装置、服务器及存储介质 |
CN112561051A (zh) * | 2019-09-26 | 2021-03-26 | 中兴通讯股份有限公司 | 一种对深度学习模型进行并行处理的方法及装置 |
CN110795151A (zh) * | 2019-10-08 | 2020-02-14 | 支付宝(杭州)信息技术有限公司 | 算子并发度调整方法、装置和设备 |
CN111400008B (zh) * | 2020-03-13 | 2023-06-02 | 北京旷视科技有限公司 | 计算资源调度方法、装置及电子设备 |
CN112202692A (zh) * | 2020-09-30 | 2021-01-08 | 北京百度网讯科技有限公司 | 数据分发方法、装置、设备以及存储介质 |
CN115996228B (zh) * | 2023-03-22 | 2023-05-30 | 睿至科技集团有限公司 | 一种基于物联网的能源数据的处理方法及其系统 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101661406A (zh) * | 2008-08-28 | 2010-03-03 | 国际商业机器公司 | 处理单元调度装置和方法 |
US20100306005A1 (en) * | 2009-05-29 | 2010-12-02 | Perceptive Software, Inc. | Workflow Management System and Method |
CN103246570A (zh) * | 2013-05-20 | 2013-08-14 | 百度在线网络技术(北京)有限公司 | Hadoop的调度方法、系统及管理节点 |
CN103870340A (zh) * | 2014-03-06 | 2014-06-18 | 华为技术有限公司 | 流计算系统中的数据处理方法、控制节点及流计算系统 |
Family Cites Families (50)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5548724A (en) * | 1993-03-22 | 1996-08-20 | Hitachi, Ltd. | File server system and file access control method of the same |
US5742806A (en) * | 1994-01-31 | 1998-04-21 | Sun Microsystems, Inc. | Apparatus and method for decomposing database queries for database management system including multiprocessor digital data processing system |
US5673407A (en) * | 1994-03-08 | 1997-09-30 | Texas Instruments Incorporated | Data processor having capability to perform both floating point operations and memory access in response to a single instruction |
US6088452A (en) * | 1996-03-07 | 2000-07-11 | Northern Telecom Limited | Encoding technique for software and hardware |
US5764905A (en) * | 1996-09-09 | 1998-06-09 | Ncr Corporation | Method, system and computer program product for synchronizing the flushing of parallel nodes database segments through shared disk tokens |
US6230313B1 (en) * | 1998-12-23 | 2001-05-08 | Cray Inc. | Parallelism performance analysis based on execution trace information |
US6820262B1 (en) * | 1999-07-22 | 2004-11-16 | Oracle International Corporation | Method for computing the degree of parallelism in a multi-user environment |
US6757291B1 (en) * | 2000-02-10 | 2004-06-29 | Simpletech, Inc. | System for bypassing a server to achieve higher throughput between data network and data storage system |
US7418470B2 (en) * | 2000-06-26 | 2008-08-26 | Massively Parallel Technologies, Inc. | Parallel processing systems and method |
US6671686B2 (en) * | 2000-11-02 | 2003-12-30 | Guy Pardon | Decentralized, distributed internet data management |
US6954776B1 (en) * | 2001-05-07 | 2005-10-11 | Oracle International Corporation | Enabling intra-partition parallelism for partition-based operations |
US7069268B1 (en) * | 2003-01-13 | 2006-06-27 | Cisco Technology, Inc. | System and method for identifying data using parallel hashing |
US7457261B2 (en) * | 2003-07-30 | 2008-11-25 | Cisco Technology, Inc. | Wireless network self-adaptive load balancer |
JP2005108086A (ja) * | 2003-10-01 | 2005-04-21 | Handotai Rikougaku Kenkyu Center:Kk | データ処理装置 |
JP4638250B2 (ja) * | 2005-02-03 | 2011-02-23 | 三菱電機株式会社 | プログラムコード生成支援装置及び方法並びにプログラムコード生成支援方法のプログラム |
US8289965B2 (en) * | 2006-10-19 | 2012-10-16 | Embarq Holdings Company, Llc | System and method for establishing a communications session with an end-user based on the state of a network connection |
US9094257B2 (en) * | 2006-06-30 | 2015-07-28 | Centurylink Intellectual Property Llc | System and method for selecting a content delivery network |
US8743703B2 (en) * | 2006-08-22 | 2014-06-03 | Centurylink Intellectual Property Llc | System and method for tracking application resource usage |
US8238253B2 (en) * | 2006-08-22 | 2012-08-07 | Embarq Holdings Company, Llc | System and method for monitoring interlayer devices and optimizing network performance |
JP2008097280A (ja) * | 2006-10-11 | 2008-04-24 | Denso Corp | 移動体用マルチコアcpuの制御装置、移動体用マイクロコンピュータ及び移動体操縦支援装置 |
US20090135944A1 (en) * | 2006-10-23 | 2009-05-28 | Dyer Justin S | Cooperative-MIMO Communications |
US8209703B2 (en) * | 2006-12-08 | 2012-06-26 | SAP France S.A. | Apparatus and method for dataflow execution in a distributed environment using directed acyclic graph and prioritization of sub-dataflow tasks |
WO2009078428A1 (ja) * | 2007-12-18 | 2009-06-25 | Nec Corporation | データストリーム処理システム、方法及びプログラム |
JP5149840B2 (ja) * | 2009-03-03 | 2013-02-20 | 株式会社日立製作所 | ストリームデータ処理方法、ストリームデータ処理プログラム、および、ストリームデータ処理装置 |
US20100306006A1 (en) * | 2009-05-29 | 2010-12-02 | Elan Pavlov | Truthful Optimal Welfare Keyword Auctions |
US8880524B2 (en) * | 2009-07-17 | 2014-11-04 | Apple Inc. | Scalable real time event stream processing |
JP2011034137A (ja) * | 2009-07-29 | 2011-02-17 | Toshiba Corp | 分散処理装置及び分散処理方法 |
US8656396B2 (en) * | 2009-08-11 | 2014-02-18 | International Business Machines Corporation | Performance optimization based on threshold performance measure by resuming suspended threads if present or by creating threads within elastic and data parallel operators |
JP5395565B2 (ja) | 2009-08-12 | 2014-01-22 | 株式会社日立製作所 | ストリームデータ処理方法及び装置 |
CN101702176B (zh) * | 2009-11-25 | 2011-08-31 | 南开大学 | 一种基于局部路径锁的xml数据并发控制方法 |
JP4967014B2 (ja) | 2009-12-16 | 2012-07-04 | 株式会社日立製作所 | ストリームデータ処理装置及び方法 |
EP2461511A4 (en) * | 2010-01-04 | 2014-01-22 | Zte Corp | SERIAL PROCESSING METHOD, BIT RATE MATCHING PARALLEL PROCESSING METHOD, AND DEVICE THEREOF |
JP2011243162A (ja) * | 2010-05-21 | 2011-12-01 | Mitsubishi Electric Corp | 台数制御装置、台数制御方法及び台数制御プログラム |
US8699344B2 (en) * | 2010-12-15 | 2014-04-15 | At&T Intellectual Property I, L.P. | Method and apparatus for managing a degree of parallelism of streams |
CN102082692B (zh) * | 2011-01-24 | 2012-10-17 | 华为技术有限公司 | 基于网络数据流向的虚拟机迁移方法、设备和集群系统 |
US8695008B2 (en) * | 2011-04-05 | 2014-04-08 | Qualcomm Incorporated | Method and system for dynamically controlling power to multiple cores in a multicore processor of a portable computing device |
CN102200906B (zh) | 2011-05-25 | 2013-12-25 | 上海理工大学 | 大规模并发数据流处理系统及其处理方法 |
US8997107B2 (en) * | 2011-06-28 | 2015-03-31 | Microsoft Technology Licensing, Llc | Elastic scaling for cloud-hosted batch applications |
US8694486B2 (en) * | 2011-09-27 | 2014-04-08 | International Business Machines Corporation | Deadline-driven parallel execution of queries |
CN103164261B (zh) * | 2011-12-15 | 2016-04-27 | China Mobile Communications Group Co., Ltd. | Multi-center data task processing method, device, and system |
JP2013225204A (ja) * | 2012-04-20 | 2013-10-31 | Fujitsu Frontech Ltd | Load balancing method and device for automatically optimizing the number of active servers based on traffic volume prediction |
US9002822B2 (en) * | 2012-06-21 | 2015-04-07 | Sap Se | Cost monitoring and cost-driven optimization of complex event processing system |
US9235446B2 (en) * | 2012-06-22 | 2016-01-12 | Microsoft Technology Licensing, Llc | Parallel computing execution plan optimization |
US9063788B2 (en) * | 2012-08-27 | 2015-06-23 | International Business Machines Corporation | Stream processing with runtime adaptation |
CA2883159C (en) * | 2012-09-21 | 2018-09-04 | Nyse Group, Inc. | High performance data streaming |
US9081870B2 (en) * | 2012-12-05 | 2015-07-14 | Hewlett-Packard Development Company, L.P. | Streaming system performance optimization |
US10051024B2 (en) * | 2013-03-14 | 2018-08-14 | Charter Communications Operating, Llc | System and method for adapting content delivery |
US9106391B2 (en) | 2013-05-28 | 2015-08-11 | International Business Machines Corporation | Elastic auto-parallelization for stream processing applications based on a measured throughput and congestion |
US20150039555A1 (en) * | 2013-08-02 | 2015-02-05 | International Business Machines Corporation | Heuristically modifying dbms environments using performance analytics |
US10069683B2 (en) * | 2013-09-27 | 2018-09-04 | Nxp Usa, Inc. | Apparatus for optimising a configuration of a communications network device |
- 2014
  - 2014-03-06 CN CN201410082041.XA patent/CN103870340B/zh active Active
  - 2014-03-06 CN CN201710947537.2A patent/CN107729147B/zh active Active
- 2015
  - 2015-01-27 JP JP2016555667A patent/JP6436594B2/ja active Active
  - 2015-01-27 EP EP15759274.2A patent/EP3115896A4/en not_active Ceased
  - 2015-01-27 KR KR1020167027194A patent/KR101858041B1/ko active IP Right Grant
  - 2015-01-27 WO PCT/CN2015/071645 patent/WO2015131721A1/zh active Application Filing
- 2016
  - 2016-09-06 US US15/257,722 patent/US10097595B2/en active Active
- 2018
  - 2018-08-24 US US16/112,236 patent/US10630737B2/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101661406A (zh) * | 2008-08-28 | 2010-03-03 | International Business Machines Corporation | Processing unit scheduling device and method |
US20100306005A1 (en) * | 2009-05-29 | 2010-12-02 | Perceptive Software, Inc. | Workflow Management System and Method |
CN103246570A (zh) * | 2013-05-20 | 2013-08-14 | Baidu Online Network Technology (Beijing) Co., Ltd. | Hadoop scheduling method, system, and management node |
CN103870340A (zh) * | 2014-03-06 | 2014-06-18 | Huawei Technologies Co., Ltd. | Data processing method in a stream computing system, control node, and stream computing system |
Non-Patent Citations (1)
Title |
---|
See also references of EP3115896A4 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112596895A (zh) * | 2020-12-02 | 2021-04-02 | Institute of Computing Technology, Chinese Academy of Sciences | SQL semantics-aware elastic skew handling method and system |
CN112596895B (zh) * | 2020-12-02 | 2023-09-12 | Institute of Computing Technology, Chinese Academy of Sciences | SQL semantics-aware elastic skew handling method and system |
Also Published As
Publication number | Publication date |
---|---|
US20180367584A1 (en) | 2018-12-20 |
CN103870340B (zh) | 2017-11-07 |
KR101858041B1 (ko) | 2018-06-27 |
JP6436594B2 (ja) | 2018-12-12 |
US20160373494A1 (en) | 2016-12-22 |
KR20160127814A (ko) | 2016-11-04 |
CN103870340A (zh) | 2014-06-18 |
US10097595B2 (en) | 2018-10-09 |
JP2017509075A (ja) | 2017-03-30 |
US10630737B2 (en) | 2020-04-21 |
EP3115896A1 (en) | 2017-01-11 |
CN107729147A (zh) | 2018-02-23 |
EP3115896A4 (en) | 2017-04-05 |
CN107729147B (zh) | 2021-09-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2015131721A1 (zh) | Data processing method in a stream computing system, control node, and stream computing system | |
WO2021213004A1 (zh) | Microservice management system, deployment method, and related device | |
CN103812949B (zh) | Task scheduling and resource allocation method and system for real-time cloud platforms | |
KR101781063B1 (ko) | Two-stage resource management method and apparatus for dynamic resource management | |
US20160269247A1 (en) | Accelerating stream processing by dynamic network aware topology re-optimization | |
WO2019001092A1 (zh) | Load balancing engine, client, distributed computing system, and load balancing method | |
US20160188376A1 (en) | Push/Pull Parallelization for Elasticity and Load Balance in Distributed Stream Processing Engines | |
WO2021012663A1 (zh) | Access log processing method and apparatus | |
WO2013104217A1 (zh) | Cloud infrastructure-based management system and method for application system maintenance and deployment | |
CN102427475A (zh) | Load balancing scheduling system in a cloud computing environment | |
CN108092895A (zh) | Joint routing and network function deployment method for software-defined networks | |
Liu et al. | Service resource management in edge computing based on microservices | |
WO2015123974A1 (zh) | Method, apparatus, and system for adjusting a data distribution policy | |
Kettimuthu et al. | An elegant sufficiency: load-aware differentiated scheduling of data transfers | |
Xia et al. | A QoE-aware service-enhancement strategy for edge artificial intelligence applications | |
Wang et al. | Task scheduling for MapReduce in heterogeneous networks | |
WO2015196940A1 (zh) | Stream processing method, apparatus, and system | |
CN103176850A (zh) | Load balancing-based task allocation method for power system network clusters | |
US20190065555A1 (en) | System, method of real-time processing under resource constraint at edge | |
CN112995241B (zh) | Service scheduling method and apparatus | |
CN104753751B (zh) | Method and system for dynamically determining a virtual network | |
CN111782354A (zh) | Reinforcement learning-based centralized data processing time optimization method | |
CN111966497B (zh) | Computing task allocation method for distributed systems in a wide area network environment | |
KR20140076956A (ko) | Node control system for a cloud network | |
Li et al. | Research on Traffic Modeling Based on Integrated In-Queue Shaping |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 15759274 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2016555667 Country of ref document: JP Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
REEP | Request for entry into the european phase |
Ref document number: 2015759274 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2015759274 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 1020167027194 Country of ref document: KR |