CN117807272A - Adaptive data flow method, system, equipment and storage medium - Google Patents

Info

Publication number
CN117807272A
Authority
CN
China
Prior art keywords
data
operator node
node set
processed
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311748976.2A
Other languages
Chinese (zh)
Inventor
斯奇能
文江
王亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd
Priority to CN202311748976.2A
Publication of CN117807272A

Abstract

The application discloses an adaptive data flow method, system, device, and storage medium. The adaptive data flow method comprises: performing analysis based on the data amount of received data to be processed and the total processable data amount of a current operator node set to obtain a current analysis result, wherein the current operator node set comprises at least one current operator node; if the current analysis result indicates that the current operator node set does not meet the data processing requirement of the data to be processed, applying to the data flow system to adjust the current operator nodes in the current operator node set to obtain a target operator node set, wherein the target operator nodes in the target operator node set are used to process the data to be processed; and outputting the data processed by the target operator nodes. With this scheme, node resources can be adjusted adaptively and efficient data flow is achieved.

Description

Adaptive data flow method, system, equipment and storage medium
Technical Field
The present application relates to the field of cloud-native technologies, and in particular to an adaptive data flow method, system, device, and storage medium.
Background
In recent years, data processing platforms have been widely developed and used. For example, big data computing platforms (MaxCompute) provide distributed processing power for TB/PB level data and are applied in the fields of data analysis, data mining, and the like.
Therefore, for a data processing platform to be fully effective, its ETL (Extract-Transform-Load) service should efficiently process the various kinds of data flowing through it, so that even processes belonging to a computing system or computing engine other than the ETL service can use the resources of the data processing platform.
However, current data flow methods have low adaptability, and it is difficult for them to achieve efficient data flow.
Disclosure of Invention
The application provides at least one adaptive data flow method, device, equipment and computer readable storage medium.
The first aspect of the present application provides an adaptive data streaming method, where the method is applied to a current operator node set in a data streaming system, and the method includes: analyzing based on the received data amount of the data to be processed and the total processable data amount of the current operator node set to obtain a current analysis result, wherein the current operator node set comprises at least one current operator node; if the current analysis result represents that the current operator node set does not meet the data processing requirement of the data to be processed, applying for the data flow system to adjust current operator nodes in the current operator node set to obtain a target operator node set, wherein the target operator nodes in the target operator node set are used for processing the data to be processed; and outputting the data processed by the target operator node.
In an embodiment, the step of applying to the data flow system to adjust the current operator nodes in the current operator node set to obtain a target operator node set, if the current analysis result indicates that the current operator node set does not meet the data processing requirement of the data to be processed, includes: if the current analysis result is that the total processable data amount is smaller than the data amount of the data to be processed, determining that the current operator node set does not meet the data processing requirement of the data to be processed; and applying to the data flow system to perform node addition processing on the current operator node set to obtain the target operator node set.
In an embodiment, the step of applying to the data flow system to adjust the current operator nodes in the current operator node set to obtain a target operator node set, if the current analysis result indicates that the current operator node set does not meet the data processing requirement of the data to be processed, includes: if the current analysis result is that the total processable data amount is larger than the data amount of the data to be processed, determining that the current operator node set does not meet the data processing requirement of the data to be processed; and applying to the data flow system to perform node recovery processing on the current operator node set to obtain the target operator node set.
In an embodiment, each current operator node in the current operator node set is provided with an input data channel thread and an output data channel thread, which are independent, respectively, and the input data channel thread is used for receiving the data to be processed, and the output data channel thread is used for outputting the processed data.
In an embodiment, after the step of outputting the data processed by the target operator node, the method further includes: determining remaining data to be processed based on the data to be processed and the processed data output by the target operator node; continuously analyzing based on the data quantity of the remaining data to be processed and the total processable data quantity of the target operator node set to obtain a continuous analysis result; and if the continuous analysis result represents that the target operator node set does not meet the data processing requirement of the residual data to be processed, applying for the data flow system to adjust the target operator nodes in the target operator node set.
In an embodiment, the step of outputting the data processed by the target operator node includes: outputting the processed data to a subsequent operator node set in the data streaming system for data processing, wherein the subsequent operator node set comprises at least one subsequent operator node, the subsequent operator node is serial with the target operator node, and the data processing flow of the subsequent operator node is behind the data processing flow of the target operator node; or outputting the processed data to a target device.
In an embodiment, the step of outputting the processed data to a subsequent operator node set in the data streaming system for data processing includes: outputting the processed data to the subsequent operator node set, so that the subsequent operator node set analyzes based on the received data amount of the processed data and the total processable data amount of the subsequent operator node set to obtain a subsequent analysis result; and if the subsequent analysis result represents that the subsequent operator node set does not meet the data processing requirement of the processed data, applying for the data flow system to adjust the subsequent operator nodes in the subsequent operator node set.
A second aspect of the present application provides an adaptive data flow system, the system comprising: an operator node set, which is used for analyzing based on the received data quantity of the data to be processed and the total processable data quantity of the operator node set to obtain an analysis result, wherein the operator node set comprises at least one operator node; if the analysis result represents that the operator node set does not meet the data processing requirement, applying for the data flow system to adjust operator nodes in the operator node set, and obtaining an adjusted operator node set; outputting the data processed by the operator nodes in the adjusted operator node set; and the management node is used for responding to the request of the operator node set and adjusting the operator node set.
A third aspect of the present application provides an adaptive data transfer device, including: the analysis module is used for analyzing the data quantity of the received data to be processed and the total processable data quantity of the current operator node set to obtain a current analysis result, wherein the current operator node set comprises at least one current operator node; the adjustment module is used for applying the data flow system to adjust the current operator nodes in the current operator node set if the current analysis result represents that the current operator node set does not meet the data processing requirement of the data to be processed, so as to obtain a target operator node set, wherein the target operator nodes in the target operator node set are used for processing the data to be processed; and the output module is used for outputting the data processed by the target operator node.
A fourth aspect of the present application provides an electronic device including a memory and a processor, the processor being configured to execute program instructions stored in the memory to implement the adaptive data flow method described above.
A fifth aspect of the present application provides a computer readable storage medium having stored thereon program instructions which, when executed by a processor, implement the above-described adaptive data streaming method.
According to this scheme, the data amount of the received data to be processed is compared with the total processable data amount of the current operator node set, so that the current analysis result shows whether the total processable data amount of the current operator node set matches the data amount of the data to be processed. If the current operator node set does not meet the data processing requirement of the data to be processed, a request is made to the data flow system to adjust the current operator nodes in the current operator node set, yielding a target operator node set that processes the data to be processed, and the processed data is then output. Node resources can therefore be adjusted adaptively according to the data amount of the data to be processed, and efficient data flow is achieved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and, together with the description, serve to explain the technical aspects of the application.
FIG. 1 is a schematic view of an application environment of an exemplary embodiment of an adaptive data flow method of the present application;
FIG. 2 is a flow chart of an exemplary embodiment of an adaptive data flow method of the present application;
FIG. 3 is an exemplary node adjustment schematic of the adaptive data flow method of the present application;
FIG. 4 is another exemplary node adjustment schematic of the adaptive data flow method of the present application;
FIG. 5 is a block diagram of an adaptive data flow device according to an exemplary embodiment of the present application;
FIG. 6 is a schematic structural diagram of an embodiment of an electronic device of the present application;
FIG. 7 is a schematic diagram of an embodiment of a computer-readable storage medium of the present application.
Detailed Description
The following describes the embodiments of the present application in detail with reference to the drawings.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, interfaces, techniques, etc., in order to provide a thorough understanding of the present application.
The term "and/or" is herein merely an association relationship describing an associated object, meaning that there may be three relationships, e.g., a and/or B, may represent: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship. Further, "a plurality" herein means two or more than two. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality, for example, including at least one of A, B, C, and may mean including any one or more elements selected from the group consisting of A, B and C.
For ease of understanding, refer to FIG. 1, which is a schematic diagram of an application environment of an exemplary embodiment of the adaptive data flow method of the present application. In this application environment, a data receiving module feeds data to be processed into an operator node set of the data flow system, which performs the ETL service; specifically, the management node of the data flow system may be asked, through an HTTP request, to pull up the data receiving module. The management node of the data flow system may be a node developed on K8S (Kubernetes) technology that provides cloud-native services; specifically, it may call K8S native services through HTTP requests to manage node resources on the cloud, so that resources are used efficiently and data flows efficiently. The operator node set includes one or more operator nodes and is mainly used to process the data to be processed received from a data source. The data sending module sends the data processed by the operator nodes to a target device (such as a target computer or a target server); specifically, the management node of the data flow system may be asked, through an HTTP request, to pull up the data sending module.
Across the whole data processing link shown in FIG. 1, data to be processed flows in from the data receiving module, is processed by the operator nodes, and flows out from the data sending module. The method provided by the application can improve the efficiency with which data flows through the ETL service: relying on the ability of cloud-native technology to use resources adaptively, the nodes in the ETL service are adjusted dynamically on the cloud, so that operators obtain exclusive use of resources.
Referring to fig. 2, fig. 2 is a flow chart illustrating an exemplary embodiment of an adaptive data flow method according to the present application. Specifically, the method is applied to a current operator node set in a data flow system, wherein the current operator node set can have one operator node or a plurality of operator nodes, and the method can comprise the following steps:
Step S110, analyzing based on the data amount of the received data to be processed and the total processable data amount of the current operator node set to obtain a current analysis result, wherein the current operator node set comprises at least one current operator node.
For ease of understanding, this embodiment takes a current operator node set that includes one current operator node as an example. The total processable data amount of the current operator node set is the sum of the processable data amounts of its current operator nodes; when the set contains a single current operator node, the total equals the processable data amount of that node, which represents the node's processing performance or computing power.
Specifically, based on the data amount of the data to be processed and the total processable data amount of the current operator node set, the analysis method may be to compare the data amount of the data to be processed received by the current operator node with the processable data amount of the current operator node, so as to obtain a current analysis result; if the data volume difference between the data volume of the data to be processed and the processable data volume of the current operator node is larger than a preset data volume difference value threshold, the current analysis result represents that the current operator node does not meet the data processing requirement of the data to be processed.
It should be noted that the current operator node meets the data processing requirement of the data to be processed when its processable data amount matches the data amount of the data to be processed, that is, when neither a performance deficiency nor a performance excess occurs. Performance deficiency means that the processable data amount of the current operator node is smaller than the data amount of the data to be processed, and the difference between the two is larger than a preset first data amount difference threshold. Performance excess means that the processable data amount of the current operator node is larger than the data amount of the data to be processed, and the difference between the two is larger than a preset second data amount difference threshold. The first and second data amount difference thresholds may have the same or different values, and either may be set to 0 or another value, without limitation. Both performance deficiency and performance excess indicate that the current operator node set does not meet the data processing requirement of the data to be processed.
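The three possible outcomes of the analysis can be sketched as follows. This is a minimal illustration, not the application's implementation: the names `Analysis` and `analyze`, and the default thresholds of 0, are assumptions made for the example.

```python
from enum import Enum


class Analysis(Enum):
    MATCHED = "matched"
    DEFICIENT = "performance deficiency"
    EXCESS = "performance excess"


def analyze(pending_amount, processable_total,
            deficiency_threshold=0, excess_threshold=0):
    """Compare the pending data amount with the total processable amount.

    deficiency_threshold and excess_threshold stand in for the first and
    second data amount difference thresholds (hypothetical parameter names).
    """
    if pending_amount - processable_total > deficiency_threshold:
        return Analysis.DEFICIENT
    if processable_total - pending_amount > excess_threshold:
        return Analysis.EXCESS
    return Analysis.MATCHED
```

With both thresholds set to 20, a set that can process 100 units is treated as matched for any pending amount from 80 to 120 units.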
Step S120, if the current analysis result indicates that the current operator node set does not meet the data processing requirement of the data to be processed, applying to the data flow system to adjust the current operator nodes in the current operator node set to obtain a target operator node set, wherein the target operator nodes in the target operator node set are used to process the data to be processed.
Following on from the preceding step, if the current analysis result indicates that the current operator node set does not meet the data processing requirement of the data to be processed, a request must be made to the data flow system to adjust the number of current operator nodes in the current operator node set, so that the adjusted operator node set (i.e., the target operator node set) can meet the data processing requirement of the data to be processed. It will be appreciated that the target operator node set differs from the current operator node set mainly in the number of nodes.
For example, refer to FIG. 3, which is an exemplary node adjustment schematic diagram of the adaptive data flow method of the present application. If the current analysis result indicates that the current operator node set does not meet the data processing requirement of the data to be processed, for example because a performance deficiency occurs and the processable data amount of the current operator node is smaller than the data amount of the data to be processed, the current operator node initiates a resource management request to the management node in the data flow system to apply for resources, so that the management node creates one or more operator nodes with the requested resources and the target operator node set is obtained.
It should be noted that the management node may create operator nodes by replicating the current operator node that initiated the resource request. That is, it creates one or more new operator nodes (target operator nodes) identical to the current operator node, and the new operator nodes together with the existing operator nodes form an operator node cluster, i.e., the target operator node set. This multiplies the processing performance available at the current operator node while keeping the data consistent, and thereby removes the performance bottleneck of the current operator node.
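The replication step can be pictured with the following sketch. The `OperatorNode` fields (`name`, `capacity`, `config`) and the replica naming scheme are hypothetical; in a K8S-based deployment the management node would more plausibly raise a replica count than copy objects in memory.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class OperatorNode:
    name: str
    capacity: int                       # processable data amount of this node
    config: dict = field(default_factory=dict)


def replicate(template, count, existing):
    """Return the target node set: existing nodes plus `count` copies of `template`."""
    replicas = []
    for i in range(count):
        replica = copy.deepcopy(template)   # same operator logic and settings
        replica.name = f"{template.name}-replica-{len(existing) + i}"
        replicas.append(replica)
    return existing + replicas


current_set = [OperatorNode("parse", 100, {"rule": "json"})]
target_set = replicate(current_set[0], 2, current_set)
total_capacity = sum(node.capacity for node in target_set)
```

Because every replica carries the same configuration as the node that requested resources, the total processable data amount of the set scales with the number of nodes.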
In addition, referring to fig. 4, fig. 4 is another exemplary node adjustment schematic diagram of the adaptive data flow method of the present application, if a performance excess problem occurs, a current operator node in a current operator node set initiates a resource management request to a management node in a data flow system, so that the management node performs node recovery processing on the current operator node in the current operator node set, and reduces the number of nodes of the current operator node in the current operator node set to obtain a target operator node set.
It should be noted that, when the current operator node set contains multiple current operator nodes and a performance excess problem occurs, the current operator nodes in a working state may first be distinguished from those in an idle state. The working state means that the operator node is performing data processing; the idle state means that the operator node is not performing data processing, or has not performed data processing for a period of time (a timer may be set in the operator node to measure how long it has gone without processing data; if that time exceeds a preset idle time, the operator node is considered idle). A current operator node in the working state or in the idle state then actively initiates a resource management request to the management node, so that the management node recovers the current operator nodes in the idle state. The management node stops recovering nodes when it has recovered all current operator nodes in the idle state, when the number of idle current operator nodes in the current operator node set falls below the idle quantity threshold, or when the number of current operator nodes in the current operator node set falls below a preset node quantity threshold.
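The timer-based idle check can be sketched as below. The class name `IdleTimer` and the injectable `clock` parameter are assumptions for illustration; the clock is injectable only so that the behaviour can be demonstrated without waiting.

```python
import time


class IdleTimer:
    """Tracks how long an operator node has gone without processing data."""

    def __init__(self, idle_after_seconds, clock=time.monotonic):
        self._idle_after = idle_after_seconds   # the preset idle time
        self._clock = clock
        self._last_processed = clock()

    def mark_processed(self):
        # Call this each time the node finishes processing a piece of data.
        self._last_processed = self._clock()

    def is_idle(self):
        return self._clock() - self._last_processed > self._idle_after


# Usage with a fake clock so the example runs instantly.
now = [0.0]
timer = IdleTimer(30, clock=lambda: now[0])
now[0] = 10.0
idle_at_10s = timer.is_idle()
now[0] = 45.0
idle_at_45s = timer.is_idle()
timer.mark_processed()
idle_after_work = timer.is_idle()
```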
The idle quantity threshold value can be preset or calculated according to actual data processing conditions, for example, if the number of nodes of the current operator nodes in the working state in the current operator node set is N, and N is a positive integer, the idle quantity threshold value can be obtained by calculating the product between N and a preset idle proportion M and rounding up, so that a certain number of operator nodes in the idle state in the current operator node set can timely cope with the suddenly-increased data quantity of the data to be processed, and overload of the operator nodes is prevented.
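As a worked example of this calculation, the sketch below computes the idle quantity threshold as the product of N and M rounded up, and derives how many idle nodes could be recovered. The function names, the `min_nodes` parameter, and the choice to keep the idle reserve at or above the threshold are assumptions of this example rather than details given in the application.

```python
import math


def idle_threshold(working_count, idle_ratio):
    """ceil(N * M): number of idle nodes to keep in reserve."""
    return math.ceil(working_count * idle_ratio)


def nodes_to_recycle(working_count, idle_count, idle_ratio, min_nodes=1):
    """How many idle nodes can be recovered without breaching either limit."""
    keep_idle = idle_threshold(working_count, idle_ratio)
    by_idle_reserve = max(0, idle_count - keep_idle)
    by_min_nodes = max(0, working_count + idle_count - min_nodes)
    return min(by_idle_reserve, by_min_nodes)
```

For N = 7 working nodes and M = 0.25, the product is 1.75, so two idle nodes are kept; with five idle nodes in the set, three can be recovered.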
Step S130, outputting the data processed by the target operator node.
Whenever any target operator node finishes processing data to be processed, the processed data is output.
It can be understood that the ETL service may include a plurality of data processing flows, that is, operator nodes corresponding to the plurality of data processing flows are correspondingly set, that is, after the operator node 1 in the ETL service processes the data to be processed, the processed data is transmitted to the operator node 2 in the ETL service for data processing. Therefore, if the target operator node is not the last operator node in the ETL service, the processed data may be output to the next operator node of the target operator node; if the target operator node is the last operator node in the ETL service, the processed data may be output to the data sending module and sent to the target device.
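The serial hand-off between operator nodes can be sketched as follows. The stage functions `trim` and `to_int` and the `send_to_target` callback are hypothetical stand-ins for operator node 1, operator node 2, and the data sending module.

```python
def run_serial_stages(records, stages, send_to_target):
    """Pass each record through the stages in order, then hand it to the sender."""
    for record in records:
        for stage in stages:            # output of one stage is input of the next
            record = stage(record)
        send_to_target(record)          # after the last stage, data leaves the ETL service


def trim(record):
    # Hypothetical operator 1: strip surrounding whitespace.
    return {"value": record["value"].strip()}


def to_int(record):
    # Hypothetical operator 2: convert the cleaned text to an integer.
    return {"value": int(record["value"])}


sent = []
run_serial_stages([{"value": " 3 "}, {"value": "14"}], [trim, to_int], sent.append)
```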
It can be seen that, in the present application, the data amount of the received data to be processed is compared with the total processable data amount of the current operator node set, so that the current analysis result shows whether the total processable data amount of the current operator node set matches the data amount of the data to be processed. If the current operator node set does not meet the data processing requirement of the data to be processed, a request is made to the data flow system to adjust the current operator nodes in the current operator node set, yielding a target operator node set that processes the data to be processed, and the processed data is then output. Node resources can therefore be adjusted adaptively according to the data amount of the data to be processed, and efficient data flow is achieved.
On the basis of the above embodiment, this embodiment of the present application describes the step of applying to the data flow system to adjust the current operator nodes in the current operator node set to obtain the target operator node set if the current analysis result indicates that the current operator node set does not meet the data processing requirement of the data to be processed. Specifically, the method of this embodiment comprises the following steps:
if the current analysis result is that the total processable data amount is smaller than the data amount of the data to be processed, determining that the current operator node set does not meet the data processing requirement of the data to be processed; and applying to the data flow system to perform node addition processing on the current operator node set to obtain the target operator node set.
Continuing from the foregoing embodiments, if the current analysis result is that the total processable data amount is smaller than the data amount of the data to be processed, and the difference between the two is greater than the first data amount difference threshold, the current operator node set suffers a performance deficiency when processing this volume of data and is determined not to meet the data processing requirement of the data to be processed. A request is then made to the management node in the data flow system to adjust the number of current operator nodes in the current operator node set. Specifically, the current operator node may initiate a resource management request to the management node to apply for resources, so that the management node allocates available resources according to the received request and creates one or more operator nodes identical to the current operator node. The number of nodes to create can be calculated from the processable data amount of a single operator node and the data amount of the data to be processed, which is not described in detail here. The newly created operator nodes and the existing current operator nodes together form an operator node cluster; node addition processing is thus performed on the current operator node set to obtain the target operator node set, which multiplies the processing performance available at the current operator node, keeps the data consistent, and removes the performance bottleneck of the current operator node.
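The application does not spell out how the number of nodes to create is calculated. One plausible reading, shown here purely as an assumption, is to divide the pending data amount by the per-node processable amount, round up, and subtract the nodes that already exist. The function name `nodes_to_add` is hypothetical.

```python
def nodes_to_add(pending_amount, per_node_capacity, current_count):
    """Extra identical nodes needed so the set can cover the pending amount."""
    if per_node_capacity <= 0:
        raise ValueError("per_node_capacity must be positive")
    required = -(-pending_amount // per_node_capacity)   # integer ceiling division
    return max(0, required - current_count)
```

For example, 450 units of pending data with 100 units per node and one existing node gives a requirement of five nodes, so four more would be requested.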
It will be appreciated that the target operator nodes in the target operator node set are obtained on the basis of the current operator nodes in the current operator node set, and in general, there may be no difference between the current operator node set and the target operator node set except for the difference in the number of nodes.
On the basis of the above embodiment, this embodiment of the present application describes the step of applying to the data flow system to adjust the current operator nodes in the current operator node set to obtain the target operator node set if the current analysis result indicates that the current operator node set does not meet the data processing requirement of the data to be processed. Specifically, the method of this embodiment comprises the following steps:
if the current analysis result is that the total processable data amount is larger than the data amount of the data to be processed, determining that the current operator node set does not meet the data processing requirement of the data to be processed; and applying to the data flow system to perform node recovery processing on the current operator node set to obtain the target operator node set.
Continuing from the foregoing embodiments, if the current analysis result is that the total processable data amount is greater than the data amount of the data to be processed, and the difference between the two is greater than the second data amount difference threshold, the current operator node set has a performance excess when processing this volume of data and is determined not to meet the data processing requirement of the data to be processed. A request is then made to the management node in the data flow system to adjust the number of current operator nodes in the current operator node set. Specifically, a current operator node may initiate a resource management request to the management node, so that the management node recovers the operator nodes in the idle state in the current operator node set according to the received request and releases the resources they occupy, yielding a current operator node set with fewer nodes, i.e., the target operator node set. For the specific node recovery process, refer to the description of the foregoing embodiments, which is not repeated here.
It can be understood that the resources recovered by the management node can subsequently be used to create operator nodes in other operator node sets of the data flow system of the present application when those sets suffer a performance deficiency.
On the basis of the above embodiment, it should be noted that, in this embodiment of the present application, each current operator node in the current operator node set is provided with its own independent input data channel thread and output data channel thread; the input data channel thread is used for receiving the data to be processed, and the output data channel thread is used for outputting the processed data.
It should be noted that data flows in one direction in the ETL service. Each operator node is provided with its own input data channel thread and output data channel thread, and each thread has a single responsibility: the input data channel thread handles data input to the operator node, and the output data channel thread handles data output from the operator node. This clear division of labor between independent channel threads can improve the data processing efficiency of the ETL service.
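The two independent channel threads may be sketched, purely for illustration, with Python's standard `threading` and `queue` modules. The class `OperatorNode`, its `transform` and `emit` callables, and the choice to run the processing step on the input side are assumptions of this sketch; the embodiments do not specify where the processing itself runs.

```python
import queue
import threading
from typing import Any, Callable

_STOP = object()  # sentinel that tells a channel thread to shut down


class OperatorNode:
    """Hypothetical operator node with separate input and output threads."""

    def __init__(self, transform: Callable[[Any], Any],
                 emit: Callable[[Any], None]) -> None:
        self._transform = transform
        self._emit = emit
        self._inbox: "queue.Queue[Any]" = queue.Queue()
        self._outbox: "queue.Queue[Any]" = queue.Queue()
        self._input_thread = threading.Thread(target=self._input_loop)
        self._output_thread = threading.Thread(target=self._output_loop)

    def start(self) -> None:
        self._input_thread.start()
        self._output_thread.start()

    def submit(self, item: Any) -> None:
        """Hand one piece of data to be processed to the input channel."""
        self._inbox.put(item)

    def close(self) -> None:
        """Drain both channels and wait for the threads to finish."""
        self._inbox.put(_STOP)
        self._input_thread.join()
        self._output_thread.join()

    def _input_loop(self) -> None:
        # Input channel thread: receives data and passes results onward.
        while True:
            item = self._inbox.get()
            if item is _STOP:
                self._outbox.put(_STOP)
                return
            self._outbox.put(self._transform(item))

    def _output_loop(self) -> None:
        # Output channel thread: only responsible for emitting results.
        while True:
            item = self._outbox.get()
            if item is _STOP:
                return
            self._emit(item)
```

Because each queue is consumed by a single thread, results leave the node in the order in which the data arrived, which is consistent with the one-way flow described above.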
On the basis of the above embodiments, the embodiments of the present application describe steps after outputting data processed by a target operator node. Specifically, the method of the embodiment comprises the following steps:
determining the remaining data to be processed based on the data to be processed and the processed data output by the target operator node; continuing the analysis based on the data amount of the remaining data to be processed and the total processable data amount of the target operator node set to obtain a continuous analysis result; and if the continuous analysis result represents that the target operator node set does not meet the data processing requirement of the remaining data to be processed, applying to the data flow system to adjust the target operator nodes in the target operator node set.
It can be understood that, once the target operator node outputs the processed data, all or part of the data to be processed has been successfully processed and the amount of remaining data to be processed changes. Performance analysis therefore continues, based on the data amount of the remaining data to be processed and the total amount of processable data of the target operator node set, to obtain a continuous analysis result; if the continuous analysis result represents that the target operator node set does not meet the data processing requirement of the remaining data to be processed, the data flow system is requested to adjust the target operator nodes in the target operator node set.
For ease of understanding, the adjustment process of the target operator node set may refer, in the same way, to the foregoing description of the adjustment process of the current operator node set. Illustratively, if the total amount of processable data of the target operator node set is greater than the data amount of the remaining data to be processed, and the difference between the two is greater than the data amount difference threshold, the target operator node set is characterized as having excess performance; a target operator node in the idle state then initiates a resource management request to the management node in the data flow system, so that the management node performs node recovery processing on the target operator node set and reduces the number of target operator nodes in it.
It should be noted that, after the target operator node outputs the processed data, even though all or part of the data to be processed has been successfully processed, more data to be processed may continue to arrive, so that the data amount of the remaining data to be processed increases. If the target operator node set then has insufficient performance, the handling is the same as described in the foregoing embodiments and is not repeated here.
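The continued analysis after output may be illustrated by the following Python sketch. The function names `remaining_amount` and `continue_analysis`, the string return values, and the `newly_received` parameter are hypothetical; they only model the two situations described above, namely the remaining amount shrinking as data is output and growing as new data arrives.

```python
def remaining_amount(pending: int, processed: int,
                     newly_received: int = 0) -> int:
    """Data still to be processed after the target nodes output results."""
    # Output reduces the backlog; data received in the meantime adds to it.
    return max(pending - processed, 0) + newly_received


def continue_analysis(remaining: int, total_capacity: int,
                      excess_threshold: int) -> str:
    """Re-run the comparison against the target operator node set."""
    if total_capacity < remaining:
        return "add_nodes"      # insufficient performance
    if total_capacity - remaining > excess_threshold:
        return "recycle_nodes"  # excess performance
    return "keep"
```

With a capacity of 600 units and a threshold of 150, a backlog that drops from 500 to 200 units would lead to a recovery request, while the arrival of 900 further units would instead lead to a request for additional nodes.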
On the basis of the above embodiments, the embodiments of the present application describe a step of outputting data processed by a target operator node. Specifically, the method of the embodiment comprises the following steps:
outputting the processed data to a subsequent operator node set in a data streaming system for data processing, wherein the subsequent operator node set comprises at least one subsequent operator node, the subsequent operator node is in series with the target operator node, and the data processing flow of the subsequent operator node is behind the data processing flow of the target operator node; or outputting the processed data to the target device.
The foregoing embodiment describes that, in an application scenario of the present application, the ETL service may contain a plurality of operator nodes corresponding to a plurality of different data processing flows, and these operator nodes execute their data processing flows in series. If the target operator node is not the last operator node in the ETL service, the processed data may be output to the operator node that follows the target operator node, namely the subsequent operator node, for further data processing; if the target operator node is the last operator node in the ETL service, the processed data may be output to the data sending module and sent to the target device.
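The serial arrangement may be sketched as follows, assuming for illustration that each data processing flow is a plain Python callable and that the data sending module is a callable named `send_to_target`. These names, and the `run_serial_stages` function itself, are hypothetical.

```python
from typing import Any, Callable, Iterable, Sequence


def run_serial_stages(records: Iterable[Any],
                      stages: Sequence[Callable[[Any], Any]],
                      send_to_target: Callable[[Any], None]) -> int:
    """Pass records through the stages in order, then send them out."""
    batch = list(records)
    for stage in stages:
        # The output of one stage is the input of the stage that follows.
        batch = [stage(record) for record in batch]
    # After the last stage, the data goes to the target device.
    for record in batch:
        send_to_target(record)
    return len(batch)
```

In this sketch every stage except the last hands its output to a subsequent stage, and only the final stage's output reaches `send_to_target`, mirroring the two output cases described above.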
On the basis of the above embodiments, the embodiments of the present application describe the step of outputting the processed data to the subsequent operator node set in the data streaming system for data processing. Specifically, the method of the embodiment comprises the following steps:
outputting the processed data to a subsequent operator node set, so that the subsequent operator node set analyzes based on the received data amount of the processed data and the processable data total amount of the subsequent operator node set to obtain a subsequent analysis result; and if the subsequent analysis result represents that the subsequent operator node set does not meet the data processing requirement of the processed data, applying for the data flow system to adjust the subsequent operator nodes in the subsequent operator node set.
With reference to the foregoing embodiments and to fig. 1, the operator node sets shown in fig. 1 include an operator node set 1 and an operator node set 2, where the operator node set 1 corresponds to the target operator node set or the current operator node set, and the operator node set 2 corresponds to the subsequent operator node set. After the target operator node outputs the processed data to the subsequent operator nodes in the subsequent operator node set, the subsequent operator nodes may apply to the management node in the data flow system for resource adjustment in the same way as described for the current operator node and the target operator node in the foregoing embodiments, which is not repeated here.
It should be further noted that the execution body of the adaptive data flow method may be an adaptive data flow device. For example, the adaptive data flow method may be executed by a terminal device, a server, or another processing device, where the terminal device may be a User Equipment (UE), a computer, a mobile device, a user terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, or the like. In some possible implementations, the adaptive data flow method may be implemented by a processor invoking computer readable instructions stored in a memory.
The application also provides an adaptive data flow system, which at least comprises:
an operator node set, which is used for analyzing based on the received data amount of the data to be processed and the total processable data amount of the operator node set to obtain an analysis result, wherein the operator node set comprises at least one operator node; if the analysis result represents that the operator node set does not meet the data processing requirement, applying to the data flow system to adjust the operator nodes in the operator node set to obtain an adjusted operator node set; and outputting the data processed by the operator nodes in the adjusted operator node set; and a management node, which is used for responding to the request of the operator node set and adjusting the operator node set.
For the description of the present system, reference may be made to fig. 1 of the present application and to the description of the adaptive data flow method in the foregoing embodiments of the present application, which is not repeated herein.
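The interaction between an operator node set and the management node may be illustrated by the following Python sketch. Everything in it is an assumption made for illustration: the `free_slots` resource pool, the uniform `per_node_capacity`, the rule for how many nodes to request, and the constraint that a set keeps at least one node (taken from the statement that a set comprises at least one operator node).

```python
from dataclasses import dataclass


@dataclass
class OperatorNodeSet:
    name: str
    node_count: int         # kept at one or more by the management node
    per_node_capacity: int  # data amount one node can process

    @property
    def total_capacity(self) -> int:
        return self.node_count * self.per_node_capacity


class ManagementNode:
    """Hypothetical manager holding a pool of free node resources."""

    def __init__(self, free_slots: int) -> None:
        self.free_slots = free_slots

    def add_nodes(self, node_set: OperatorNodeSet, requested: int) -> int:
        granted = min(requested, self.free_slots)
        self.free_slots -= granted
        node_set.node_count += granted
        return granted

    def recycle_nodes(self, node_set: OperatorNodeSet, requested: int) -> int:
        released = min(requested, node_set.node_count - 1)
        node_set.node_count -= released
        self.free_slots += released  # released resources return to the pool
        return released


def request_adjustment(node_set: OperatorNodeSet, manager: ManagementNode,
                       pending_amount: int, excess_threshold: int) -> int:
    """Return the change in node count (positive, negative, or zero)."""
    capacity = node_set.per_node_capacity
    total = node_set.total_capacity
    if total < pending_amount:
        shortfall = pending_amount - total
        # Ceiling division: enough nodes to cover the whole shortfall.
        return manager.add_nodes(node_set, -(-shortfall // capacity))
    surplus = total - pending_amount
    if surplus > excess_threshold:
        # Floor division: never recycle below what the workload needs.
        return -manager.recycle_nodes(node_set, surplus // capacity)
    return 0
```

In this sketch, nodes recycled from one operator node set become free slots that a later request from another set can draw on, which corresponds to the reuse of recovered resources described earlier.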
Fig. 5 is a block diagram of an adaptive data flow device according to an exemplary embodiment of the present application. As shown in fig. 5, the exemplary adaptive data flow device 500 includes: an analysis module 510, an adjustment module 520, and an output module 530. Specifically:
the analysis module 510 is configured to analyze, based on the received data amount of the data to be processed and the total processable data amount of the current operator node set, to obtain a current analysis result, where the current operator node set includes at least one current operator node.
The adjustment module 520 is configured to, if the current analysis result indicates that the current operator node set does not meet the data processing requirement of the data to be processed, apply to the data flow system to adjust the current operator nodes in the current operator node set to obtain a target operator node set, where the target operator nodes in the target operator node set are used to process the data to be processed.
The output module 530 is configured to output the data processed by the target operator node.
In the adaptive data flow device, the received data amount of the data to be processed is compared with the total processable data amount of the current operator node set, so that whether the two match can be judged from the current analysis result; if the current operator node set does not meet the data processing requirement of the data to be processed, the data flow system is requested to adjust the current operator nodes in the current operator node set to obtain a target operator node set for processing the data to be processed, and the processed data is output. Node resources can therefore be adaptively adjusted according to the data amount of the data to be processed, achieving efficient data flow.
For the functions of each module, refer to the adaptive data flow method embodiments, which are not repeated herein.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an embodiment of an electronic device of the present application. The electronic device 600 comprises a memory 601 and a processor 602, the processor 602 being adapted to execute program instructions stored in the memory 601 to implement the steps of any of the adaptive data flow method embodiments described above. In one particular implementation scenario, the electronic device 600 may include, but is not limited to, mobile devices such as a notebook computer and a tablet computer, which is not limited herein.
In particular, the processor 602 is configured to control itself and the memory 601 to implement the steps of any of the adaptive data flow method embodiments described above. The processor 602 may also be referred to as a CPU (Central Processing Unit). The processor 602 may be an integrated circuit chip having signal processing capabilities. The processor 602 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. In addition, the processor 602 may be jointly implemented by a plurality of integrated circuit chips.
According to the above scheme, the received data amount of the data to be processed is compared with the total processable data amount of the current operator node set, so that whether the two match can be judged from the current analysis result; if the current operator node set does not meet the data processing requirement of the data to be processed, the data flow system is requested to adjust the current operator nodes in the current operator node set to obtain a target operator node set for processing the data to be processed, and the processed data is output. Node resources can therefore be adaptively adjusted according to the data amount of the data to be processed, achieving efficient data flow.
Referring to fig. 7, fig. 7 is a schematic structural diagram of an embodiment of a computer readable storage medium of the present application. The computer readable storage medium 710 stores program instructions 711 executable by the processor, the program instructions 711 for implementing the steps of any of the adaptive data streaming method embodiments described above.
According to the above scheme, the received data amount of the data to be processed is compared with the total processable data amount of the current operator node set, so that whether the two match can be judged from the current analysis result; if the current operator node set does not meet the data processing requirement of the data to be processed, the data flow system is requested to adjust the current operator nodes in the current operator node set to obtain a target operator node set for processing the data to be processed, and the processed data is output. Node resources can therefore be adaptively adjusted according to the data amount of the data to be processed, achieving efficient data flow.
In some embodiments, functions or modules included in an apparatus provided by the embodiments of the present disclosure may be used to perform a method described in the foregoing method embodiments, and specific implementations thereof may refer to descriptions of the foregoing method embodiments, which are not repeated herein for brevity.
The foregoing descriptions of the various embodiments tend to emphasize the differences between them; for parts that are the same or similar, the embodiments may be referred to one another, and the details are not repeated herein for the sake of brevity.
In the several embodiments provided in the present application, it should be understood that the disclosed methods and apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division of modules or units is merely a logical functional division, and there may be other divisions in actual implementation; for example, units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections via some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
In addition, the functional units in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in the form of hardware or in the form of software functional units. The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.

Claims (10)

1. An adaptive data streaming method, wherein the method is applied to a current operator node set in a data streaming system, the method comprising:
analyzing based on the received data amount of the data to be processed and the total processable data amount of the current operator node set to obtain a current analysis result, wherein the current operator node set comprises at least one current operator node;
if the current analysis result represents that the current operator node set does not meet the data processing requirement of the data to be processed, applying for the data flow system to adjust current operator nodes in the current operator node set to obtain a target operator node set, wherein the target operator nodes in the target operator node set are used for processing the data to be processed;
and outputting the data processed by the target operator node.
2. The method according to claim 1, wherein the step of applying for the data flow system to adjust the current operator nodes in the current operator node set to obtain the target operator node set if the current analysis result indicates that the current operator node set does not meet the data processing requirement of the data to be processed, includes:
if the current analysis result is that the total amount of the processable data is smaller than the data amount of the data to be processed, the current operator node set is characterized as not meeting the data processing requirement of the data to be processed;
applying for the data flow system to perform node addition processing on the current operator node set to obtain the target operator node set.
3. The method according to claim 1, wherein the step of applying for the data flow system to adjust the current operator nodes in the current operator node set to obtain the target operator node set if the current analysis result indicates that the current operator node set does not meet the data processing requirement of the data to be processed, includes:
if the current analysis result is that the total amount of the processable data is larger than the data amount of the data to be processed, the current operator node set is characterized as not meeting the data processing requirement of the data to be processed;
applying for the data flow system to perform node recovery processing on the current operator node set to obtain the target operator node set.
4. The method of claim 1, wherein each current operator node in the set of current operator nodes is provided with a separate input data channel thread for receiving the data to be processed and an output data channel thread for outputting the processed data.
5. The method of claim 1, wherein after the step of outputting the data processed by the target operator node, the method further comprises:
determining remaining data to be processed based on the data to be processed and the processed data output by the target operator node;
continuously analyzing based on the data quantity of the remaining data to be processed and the total processable data quantity of the target operator node set to obtain a continuous analysis result;
and if the continuous analysis result represents that the target operator node set does not meet the data processing requirement of the residual data to be processed, applying for the data flow system to adjust the target operator nodes in the target operator node set.
6. The method of claim 1, wherein the step of outputting the data processed by the target operator node comprises:
outputting the processed data to a subsequent operator node set in the data streaming system for data processing, wherein the subsequent operator node set comprises at least one subsequent operator node, the subsequent operator node is serial with the target operator node, and the data processing flow of the subsequent operator node is behind the data processing flow of the target operator node;
or outputting the processed data to a target device.
7. The method of claim 6, wherein the step of outputting the processed data to a subsequent set of operator nodes in the data streaming system for data processing comprises:
outputting the processed data to the subsequent operator node set, so that the subsequent operator node set analyzes based on the received data amount of the processed data and the total processable data amount of the subsequent operator node set to obtain a subsequent analysis result; and if the subsequent analysis result represents that the subsequent operator node set does not meet the data processing requirement of the processed data, applying for the data flow system to adjust the subsequent operator nodes in the subsequent operator node set.
8. An adaptive data flow system, the system comprising:
an operator node set, which is used for analyzing based on the received data quantity of the data to be processed and the total processable data quantity of the operator node set to obtain an analysis result, wherein the operator node set comprises at least one operator node; if the analysis result represents that the operator node set does not meet the data processing requirement, applying for the data flow system to adjust operator nodes in the operator node set, and obtaining an adjusted operator node set; outputting the data processed by the operator nodes in the adjusted operator node set;
and a management node, which is used for responding to the request of the operator node set and adjusting the operator node set.
9. An electronic device comprising a memory and a processor for executing program instructions stored in the memory to implement the method of any one of claims 1 to 7.
10. A computer readable storage medium having stored thereon program instructions, which when executed by a processor, implement the method of any of claims 1 to 7.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311748976.2A CN117807272A (en) 2023-12-18 2023-12-18 Adaptive data flow method, system, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117807272A (en) 2024-04-02

Family

ID=90428950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311748976.2A Pending CN117807272A (en) 2023-12-18 2023-12-18 Adaptive data flow method, system, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117807272A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination