CN116992245B - Distributed time sequence data analysis processing method - Google Patents

Publication number
CN116992245B
Authority
CN
China
Prior art keywords
performance
computing node
data processing
sequence data
time sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311255652.5A
Other languages
Chinese (zh)
Other versions
CN116992245A (en
Inventor
李淑琴
肖勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangxi Minxuan Big Data Co ltd
Original Assignee
Jiangxi Minxuan Big Data Co ltd
Application filed by Jiangxi Minxuan Big Data Co ltd
Priority to CN202311255652.5A
Publication of CN116992245A
Application granted
Publication of CN116992245B

Abstract

The application discloses a distributed time sequence data analysis processing method in the technical field of data processing. The interval comprehensive performance value, the time sequence data processing variation index, and the fault abnormality and recovery performance comprehensive evaluation index are normalized to obtain a computing node performance bump evaluation coefficient, which is compared with a computing node performance evaluation threshold. This gives a more comprehensive and quantitative view of computing node performance, so that potential computing node problems can be found and resolved in advance and the availability and stability of time sequence data processing are improved. A comprehensive performance judgment value is also calculated, and when a distributed computing node comprehensive normal signal is generated, a processing distribution strategy for the time sequence data is formulated so that the time sequence data is distributed and transmitted to computing nodes with better performance. This helps ensure real-time analysis and processing of the time sequence data generated by production equipment, and thereby the timeliness of production equipment monitoring and the accuracy of the monitoring results.

Description

Distributed time sequence data analysis processing method
Technical Field
The application relates to the technical field of data processing, in particular to a distributed time sequence data analysis processing method.
Background
Time series data is a data set formed by recording according to time sequence, wherein each data point is associated with a specific time point or time period; time series data is typically used to capture the change in a phenomenon, variable or event over time; the data may be continuous, discrete, or sampled at different time intervals.
In actual production, the time sequence data generated by a plurality of production devices must be monitored in real time and processed by a set algorithm to analyze the running state and fault trend of each device, so that the actual running condition is known accurately in real time and fault trends are flagged early. However, the time sequence data generated by production equipment is generally distributed to computing nodes at random, without considering that the performance and actual running stability of each computing node change with use and differ between nodes. If some computing nodes perform poorly, the speed at which the time sequence data is processed is limited, which adversely affects real-time monitoring and early warning of the running state and fault trend of the production equipment.
In order to solve the above problems, a technical solution is now provided.
Disclosure of Invention
In order to overcome the above-mentioned drawbacks of the prior art, the present application provides a distributed time-series data analysis processing method to solve the above-mentioned problems in the prior art.
In order to achieve the above purpose, the present application provides the following technical solutions:
a distributed time sequence data analysis processing method comprises the following steps:
step S1: acquiring speed information of a computing node, and calculating time sequence data processing performance values; calculating an interval comprehensive performance value from the time sequence data processing performance values, and calculating a time sequence data processing variation index from the fluctuation of the time sequence data processing performance values;
step S2: collecting fault information of the computing node, and calculating a fault abnormality and recovery performance comprehensive evaluation index by analyzing the recovery normal time corresponding to each error report of the computing node and the frequency of error reports;
step S3: calculating a computing node performance bump evaluation coefficient by normalizing the interval comprehensive performance value, the time sequence data processing variation index, and the fault abnormality and recovery performance comprehensive evaluation index; generating a computing node recommended use signal or a computing node operation bad signal by comparing the computing node performance bump evaluation coefficient with a computing node performance evaluation threshold;
step S4: calculating a comprehensive performance judgment value from the computing node recommended use signals or computing node operation bad signals generated for a plurality of computing nodes, and generating a distributed computing node comprehensive bad signal or a distributed computing node comprehensive normal signal by comparing the comprehensive performance judgment value with a comprehensive performance judgment threshold.
In a preferred embodiment, in step S1, a data processing performance monitoring interval is set, where the data processing performance monitoring interval includes a data amount of time series data with a fixed size, the data processing performance monitoring interval is equally divided into a plurality of cells, and a data amount of time series data corresponding to each cell is obtained, where the data amount of time series data corresponding to each cell is the same;
acquiring the time for processing the data quantity of the time sequence data corresponding to the cells; and calculating a time sequence data processing performance value corresponding to each cell, wherein the time sequence data processing performance value is the ratio of the data volume of the time sequence data corresponding to the cell to the time of processing the data volume of the time sequence data corresponding to the cell.
In a preferred embodiment, the interval comprehensive performance value is the ratio of the sum of the time sequence data processing performance values corresponding to all cells in the data processing performance monitoring interval to the number of cells in the data processing performance monitoring interval; the interval comprehensive performance value is marked with a symbol [symbol missing].
In a preferred embodiment, the time sequence data processing performance value corresponding to each cell is obtained, and the time sequence data processing variation index is calculated by the expression: [expression missing]; wherein the symbols of the expression [symbols missing] denote, respectively: the number of cells in the data processing performance monitoring interval; the index of the time sequence data processing performance value corresponding to a cell in the data processing performance monitoring interval, with the stated constraint that these quantities are positive integers greater than 1; the time sequence data processing variation index; and the time sequence data processing performance values corresponding to two indexed cells in the data processing performance monitoring interval.
In a preferred embodiment, in step S2, a fault monitoring interval is set, and a time length corresponding to the fault monitoring interval is obtained; acquiring the error reporting times of the computing node in the fault monitoring interval, and acquiring the normal recovery time corresponding to each error reporting of the computing node in the fault monitoring interval;
setting a recovery normal time threshold, acquiring the number of computing node error reports in the fault monitoring interval whose recovery normal time is greater than the recovery normal time threshold, and acquiring the recovery normal time corresponding to each such error report; the fault abnormality and recovery performance comprehensive evaluation index is calculated by the expression: [expression missing]; wherein the symbols of the expression [symbols missing] denote, respectively: the number of computing node error reports in the fault monitoring interval whose recovery normal time is greater than the recovery normal time threshold, and the index of those error reports, with the stated constraint that these quantities are positive integers greater than 1; the fault abnormality and recovery performance comprehensive evaluation index; the number of computing node error reports in the fault monitoring interval; the time length corresponding to the fault monitoring interval; the recovery normal time of the indexed error report whose recovery normal time is greater than the recovery normal time threshold; and the recovery normal time threshold.
In a preferred embodiment, in step S3, the computing node performance bump evaluation coefficient is calculated by the expression: [expression missing]; wherein the symbols of the expression [symbols missing] denote, respectively: the computing node performance bump evaluation coefficient; the interval comprehensive performance value, the time sequence data processing variation index, and the fault abnormality and recovery performance comprehensive evaluation index; and the preset proportionality coefficients of the interval comprehensive performance value, the time sequence data processing variation index, and the fault abnormality and recovery performance comprehensive evaluation index, the preset proportionality coefficients all being greater than 0;
setting a computing node performance evaluation threshold; generating a computing node operation bad signal when the computing node performance bump evaluation coefficient is larger than the computing node performance evaluation threshold; and when the computing node performance bump evaluation coefficient is smaller than or equal to the computing node performance evaluation threshold, generating a computing node recommended use signal.
In a preferred embodiment, in step S4, a total of h computing nodes are set, h being a positive integer;
acquiring the number of computing nodes, among the h computing nodes, for which a computing node operation bad signal is generated; calculating the ratio of this number to h, and marking the ratio as the comprehensive performance judgment value;
setting a comprehensive performance judgment threshold; when the comprehensive performance judgment value is larger than the comprehensive performance judgment threshold value, generating a comprehensive bad signal of the distributed computing node; and when the comprehensive performance judgment value is smaller than or equal to the comprehensive performance judgment threshold value, generating a comprehensive normal signal of the distributed computing node.
The distributed time sequence data analysis processing method has the technical effects and advantages that:
1. The interval comprehensive performance value, the time sequence data processing variation index, and the fault abnormality and recovery performance comprehensive evaluation index are normalized, and the resulting computing node performance bump evaluation coefficient is compared with the computing node performance evaluation threshold to judge the performance and running state of the computing node. This gives a more comprehensive and quantitative view of computing node performance; by detecting performance problems and fluctuations in time, potential computing node problems can be found and resolved in advance, reducing the impact of performance degradation or faults on the system. This helps to improve the availability and stability of time sequence data processing.
2. A comprehensive performance judgment value is calculated, and when a distributed computing node comprehensive normal signal is generated, a processing distribution strategy for the time sequence data is formulated so that the time sequence data is distributed and transmitted to computing nodes with better performance. This helps ensure real-time analysis and processing of the time sequence data generated by production equipment, and thereby the timeliness of production equipment monitoring and the accuracy of the monitoring results.
Drawings
FIG. 1 is a flow chart of a distributed time series data analysis processing method of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of protection of the present application.
Fig. 1 shows a distributed time series data analysis processing method of the present application, which includes the following steps:
Step S1: acquiring speed information of a computing node, and calculating time sequence data processing performance values; calculating an interval comprehensive performance value from the time sequence data processing performance values, and calculating a time sequence data processing variation index from the fluctuation of the time sequence data processing performance values.
Step S2: collecting fault information of the computing node, and calculating a fault abnormality and recovery performance comprehensive evaluation index by analyzing the recovery normal time corresponding to each error report of the computing node and the frequency of error reports.
Step S3: calculating a computing node performance bump evaluation coefficient by normalizing the interval comprehensive performance value, the time sequence data processing variation index, and the fault abnormality and recovery performance comprehensive evaluation index; generating a computing node recommended use signal or a computing node operation bad signal by comparing the computing node performance bump evaluation coefficient with a computing node performance evaluation threshold.
Step S4: calculating a comprehensive performance judgment value from the computing node recommended use signals or computing node operation bad signals generated for a plurality of computing nodes, and generating a distributed computing node comprehensive bad signal or a distributed computing node comprehensive normal signal by comparing the comprehensive performance judgment value with a comprehensive performance judgment threshold.
In step S1, computing node speed information is collected, and the performance and running state of a single computing node are evaluated by analyzing it. The speed information indicates the performance level of the computing node, including its performance under processing load, and helps identify performance problems in time: if the data processing speed of the computing node drops or fluctuates greatly, this may indicate a resource bottleneck, excessive load, or a hardware failure. Identifying the problem quickly allows measures to be taken before it leads to a system failure.
A data processing performance monitoring interval is set, which contains a fixed-size data volume of time sequence data. This data volume is divided equally into a plurality of cells, so that the data volume of time sequence data corresponding to each cell is the same.
The time taken to process the data volume of time sequence data corresponding to each cell is acquired.
The time sequence data processing performance value corresponding to each cell is calculated as the ratio of the data volume of time sequence data corresponding to the cell to the time taken to process it. The larger the time sequence data processing performance value, the higher the data processing speed and performance of the computing node in that cell.
The interval comprehensive performance value is the ratio of the sum of the time sequence data processing performance values corresponding to all cells in the data processing performance monitoring interval to the number of cells in the data processing performance monitoring interval; it is marked with a symbol [symbol missing]. The larger the interval comprehensive performance value, the higher the data processing speed and performance of the computing node in the data processing performance monitoring interval.
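As a minimal sketch of the two quantities defined above, the following Python computes the per-cell time sequence data processing performance value (data volume divided by processing time) and the interval comprehensive performance value (their arithmetic mean). The function names, the byte and second units, and the sample timings are illustrative assumptions, not part of the original method.

```python
from typing import List, Sequence


def cell_performance_values(cell_data_volume: float,
                            cell_times: Sequence[float]) -> List[float]:
    """Performance value per cell: data volume of the cell / time to process it.

    Every cell holds the same data volume (assumed here to be in bytes);
    cell_times holds the processing time of each cell (assumed seconds).
    """
    if cell_data_volume <= 0:
        raise ValueError("cell data volume must be positive")
    if any(t <= 0 for t in cell_times):
        raise ValueError("processing times must be positive")
    return [cell_data_volume / t for t in cell_times]


def interval_comprehensive_performance(values: Sequence[float]) -> float:
    """Sum of the per-cell performance values divided by the number of cells."""
    if not values:
        raise ValueError("at least one cell is required")
    return sum(values) / len(values)


# Hypothetical monitoring interval: 4 cells of 1000 bytes each.
times = [2.0, 4.0, 5.0, 4.0]
values = cell_performance_values(1000.0, times)
interval_value = interval_comprehensive_performance(values)
print(values)          # [500.0, 250.0, 200.0, 250.0]
print(interval_value)  # 300.0
```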
The stability of the data processing speed and performance within the data processing performance monitoring interval is assessed by analyzing the stability of the time sequence data processing performance values corresponding to the cells in the interval.
The time sequence data processing performance value corresponding to each cell is acquired, the fluctuation of the time sequence data processing performance values corresponding to adjacent cells is analyzed, and the time sequence data processing variation index is calculated by the expression: [expression missing]; wherein the symbols of the expression [symbols missing] denote, respectively: the number of cells in the data processing performance monitoring interval; the index of the time sequence data processing performance value corresponding to a cell in the data processing performance monitoring interval, with the stated constraint that these quantities are positive integers greater than 1; the time sequence data processing variation index; and the time sequence data processing performance values corresponding to two indexed cells in the data processing performance monitoring interval.
The larger the time sequence data processing variation index, the more severe or frequent the fluctuation of the time sequence data processing performance values within the data processing performance monitoring interval, and the more unstable the performance of the computing node. A large time sequence data processing variation index may indicate that the computing node is resource-limited: the node may be affected by a resource bottleneck during some periods, which reduces performance. If the performance of a computing node fluctuates continuously over a short period, its performance is unstable, which affects data processing speed and response time.
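The exact expression of the time sequence data processing variation index is not reproduced in this text. The sketch below therefore uses an assumed stand-in, the root mean square of the differences between adjacent cells' performance values, chosen only because it grows with adjacent-cell fluctuation as the description requires. The function name and the formula itself are assumptions and should not be read as the patented expression.

```python
import math
from typing import Sequence


def variation_index(values: Sequence[float]) -> float:
    """ASSUMED form: root mean square of adjacent-cell differences.

    values[i] is the performance value of cell i; at least two cells are
    needed so that one adjacent pair exists.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two cells are required")
    squared_diffs = [(values[i] - values[i - 1]) ** 2 for i in range(1, n)]
    return math.sqrt(sum(squared_diffs) / (n - 1))


steady = [300.0, 300.0, 300.0, 300.0]
jumpy = [500.0, 250.0, 200.0, 250.0]
print(variation_index(steady))  # 0.0
print(variation_index(jumpy))   # 150.0
```

Under this assumed form, a node that processes every cell at the same speed scores 0, and larger jumps between neighbouring cells raise the score.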
The data volume of time sequence data contained in the data processing performance monitoring interval is the most recent fixed-size data volume of time sequence data relative to the current time. The size of this fixed data volume is set by a person skilled in the art according to the actual requirements for monitoring the data processing speed.
Note that the data volume of time sequence data refers to the number or size of the time sequence data; its units include, but are not limited to, bytes and bits.
In step S2, computing node fault information is collected. The fault information reflects how frequently the computing node fails and how well it recovers afterwards, and can be used to evaluate the fault tolerance, availability, and fault handling capability of the computing node.
A fault monitoring interval is set, and the time length corresponding to the fault monitoring interval is obtained. The number of error reports of the computing node in the fault monitoring interval is obtained; because the system has a self-repairing function when a computing node reports an error, the recovery normal time corresponding to each error report of the computing node in the fault monitoring interval is also obtained.
A recovery normal time threshold is set. When the recovery normal time corresponding to an error report of the computing node is greater than the recovery normal time threshold, the computing node handles the fault poorly, which may be caused by a performance bottleneck, overload, or insufficient resources. A long recovery normal time may indicate that the computing node is not stable enough when facing faults: it cannot cope with the fault effectively, and its self-healing capability (including automatic fault detection, fault location, and fault handling) is insufficient.
The fault abnormality and recovery performance comprehensive evaluation index is calculated by analyzing the degree to which the recovery normal time corresponding to each error report of the computing node exceeds the recovery normal time threshold, together with the frequency of error reports in the fault monitoring interval. The number of computing node error reports in the fault monitoring interval whose recovery normal time is greater than the recovery normal time threshold is obtained, and the index is calculated by the expression: [expression missing]; wherein the symbols of the expression [symbols missing] denote, respectively: the number of computing node error reports in the fault monitoring interval whose recovery normal time is greater than the recovery normal time threshold, and the index of those error reports, with the stated constraint that these quantities are positive integers greater than 1; the fault abnormality and recovery performance comprehensive evaluation index; the number of computing node error reports in the fault monitoring interval; the time length corresponding to the fault monitoring interval; the recovery normal time of the indexed error report whose recovery normal time is greater than the recovery normal time threshold; and the recovery normal time threshold.
The larger the fault abnormality and recovery performance comprehensive evaluation index, the more frequently the computing node reports errors in the fault monitoring interval and the worse its fault self-repairing capability in that interval; the worse the performance of the computing node, the worse its capability to process time sequence data, and the less able it is to process time sequence data in a timely and accurate manner.
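The expression for the fault abnormality and recovery performance comprehensive evaluation index is likewise not reproduced in this text. The sketch below assumes one simple combination of the listed inputs: the error report frequency (error count divided by interval length) multiplied by the total amount by which over-threshold recovery normal times exceed the threshold. The function name, the formula, and the sample values are assumptions for illustration only.

```python
from typing import Sequence


def fault_recovery_index(recovery_times: Sequence[float],
                         interval_length: float,
                         recovery_threshold: float) -> float:
    """ASSUMED form: (error count / interval length) * total excess recovery time.

    recovery_times holds one recovery normal time per error report observed
    in the fault monitoring interval; only reports whose recovery time is
    greater than the threshold contribute to the excess.
    """
    if interval_length <= 0 or recovery_threshold <= 0:
        raise ValueError("interval length and threshold must be positive")
    error_count = len(recovery_times)
    excess = [t - recovery_threshold
              for t in recovery_times if t > recovery_threshold]
    return error_count * sum(excess) / interval_length


# Hypothetical interval of 100 time units with four error reports.
recovery_times = [5.0, 40.0, 12.0, 70.0]
index = fault_recovery_index(recovery_times,
                             interval_length=100.0,
                             recovery_threshold=30.0)
print(index)  # 2.0
```

One consequence of this assumed form is that a node whose error reports all recover within the threshold scores 0 regardless of how often it reports errors; a different weighting would be needed if frequency alone should count.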
The time length corresponding to the fault monitoring interval is set by a person skilled in the art according to the actual requirements for monitoring computing node faults. The time length is fixed, while the position of the fault monitoring interval moves with the current time; that is, one endpoint of the fault monitoring interval is the current time.
The recovery normal time is the time interval from the start of system self-repair after an error report to the recovery of the computing node to normal.
Error reports are typically caused by hardware or software faults, anomalies, or errors of the computing node. Obtaining the number of error reports typically relies on a monitoring system or recording system that captures and records the error events of the computing node. The self-repairing function means that the computing node or the system has an automatic mechanism for resolving an error report and restoring the computing node to a normal running state, for example by restarting services, switching to a backup computing node, or repairing data.
It should be noted that the normal recovery time threshold is set by a person skilled in the art according to the normal recovery time and other practical situations such as a requirement standard for the error reporting self-recovery capability of the computing node, which will not be described herein.
In step S3, the computing node speed information and the computing node fault information are analyzed together to evaluate the performance and running state of each computing node, so that the performance and running state of a plurality of computing nodes are determined and can inform the strategy for distributing time sequence data to the computing nodes.
The interval comprehensive performance value, the time sequence data processing variation index, and the fault abnormality and recovery performance comprehensive evaluation index are normalized, and the computing node performance bump evaluation coefficient is calculated by the expression: [expression missing]; wherein the symbols of the expression [symbols missing] denote, respectively: the computing node performance bump evaluation coefficient; the interval comprehensive performance value, the time sequence data processing variation index, and the fault abnormality and recovery performance comprehensive evaluation index; and the preset proportionality coefficients of the interval comprehensive performance value, the time sequence data processing variation index, and the fault abnormality and recovery performance comprehensive evaluation index, the preset proportionality coefficients all being greater than 0.
The larger the calculation node performance bump evaluation coefficient is, the worse the performance and the running state of the calculation node are, and the worse the processing capability of time sequence data is.
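The expression for the computing node performance bump evaluation coefficient is not reproduced in this text. The sketch below assumes a ratio form that respects the stated directions: a larger interval comprehensive performance value lowers the coefficient, while a larger variation index or fault index raises it. The function name, the formula, and the default proportionality coefficients are assumptions, not the patented expression.

```python
import math
from typing import Tuple


def bump_evaluation_coefficient(interval_value: float,
                                variation_index: float,
                                fault_index: float,
                                weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)
                                ) -> float:
    """ASSUMED form: (a2 * variation + a3 * fault) / (a1 * interval value).

    weights = (a1, a2, a3) are the preset proportionality coefficients,
    all required to be greater than 0.
    """
    a1, a2, a3 = weights
    if min(weights) <= 0:
        raise ValueError("proportionality coefficients must be greater than 0")
    if interval_value <= 0:
        raise ValueError("interval comprehensive performance value must be positive")
    return (a2 * variation_index + a3 * fault_index) / (a1 * interval_value)


# Two hypothetical nodes with the same average speed.
stable = bump_evaluation_coefficient(300.0, variation_index=10.0, fault_index=0.0)
unstable = bump_evaluation_coefficient(300.0, variation_index=150.0, fault_index=2.0)
print(stable < unstable)  # True: the fluctuating, fault-prone node scores worse
```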
The calculation node performance evaluation threshold is set by a person skilled in the art according to the calculation node performance bump evaluation coefficient and other actual conditions such as the requirement standard of the actual operation of the calculation node, and will not be described herein.
And judging the performance and the running state of the computing node through the comparison of the computing node performance bump evaluation coefficient and the computing node performance evaluation threshold value, judging the processing capacity of the time sequence data, and generating a computing node recommended use signal or a computing node running bad signal.
Generating a computing node operation bad signal when the computing node performance bump evaluation coefficient is larger than the computing node performance evaluation threshold; and when the computing node performance bump evaluation coefficient is smaller than or equal to the computing node performance evaluation threshold, generating a computing node recommended use signal.
When a computing node operation bad signal is generated, the performance of the computing node has problems or fluctuates greatly, and further diagnosis and maintenance are needed; an administrator or operator is notified to overhaul the computing node according to the generated signal.
When a computing node recommended use signal is generated, the performance of the computing node is within the normal range, and the computing node can be used normally and provide good service for processing time sequence data.
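The threshold comparison described above can be sketched as follows. The enum and function names and the sample threshold are illustrative assumptions; the comparison rule itself (strictly greater than the threshold yields the bad signal, otherwise the recommended use signal) follows the text.

```python
from enum import Enum


class NodeSignal(Enum):
    RECOMMENDED_USE = "computing node recommended use signal"
    OPERATION_BAD = "computing node operation bad signal"


def node_signal(bump_coefficient: float, evaluation_threshold: float) -> NodeSignal:
    """Bad signal when the coefficient exceeds the threshold, else recommended use."""
    if bump_coefficient > evaluation_threshold:
        return NodeSignal.OPERATION_BAD
    return NodeSignal.RECOMMENDED_USE


threshold = 1.0  # hypothetical computing node performance evaluation threshold
print(node_signal(0.4, threshold).value)  # computing node recommended use signal
print(node_signal(1.0, threshold).value)  # computing node recommended use signal
print(node_signal(1.3, threshold).value)  # computing node operation bad signal
```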
The interval comprehensive performance value, the time sequence data processing variation index, and the fault abnormality and recovery performance comprehensive evaluation index are normalized, and the resulting computing node performance bump evaluation coefficient is compared with the computing node performance evaluation threshold to judge the performance and running state of the computing node. This gives a more comprehensive and quantitative view of computing node performance; by detecting performance problems and fluctuations in time, potential computing node problems can be found and resolved in advance, reducing the impact of performance degradation or faults on the system. This helps to improve the availability and stability of time sequence data processing.
In step S4, a total of h computing nodes are set, where h is a positive integer; and acquiring a calculation node operation bad signal or a calculation node recommended use signal generated by each calculation node correspondingly.
Acquiring the quantity of the operation bad signals of the corresponding generation computing node of the computing node; acquiring the number of the recommended use signals of the corresponding generation computing nodes of the computing nodes; and the ratio of the number of the operation bad signals of the corresponding generation computing node of the computing node to h is marked as a comprehensive performance judgment value.
A comprehensive performance judgment threshold is set. When the comprehensive performance judgment value is greater than the comprehensive performance judgment threshold, a distributed computing node comprehensive bad signal is generated; this indicates that the proportion of the h computing nodes with poor performance and poor running state is too large to guarantee normal processing of the time sequence data, and professional technicians overhaul the entire computing node system according to the generated signal.
When the comprehensive performance judgment value is less than or equal to the comprehensive performance judgment threshold, a distributed computing node comprehensive normal signal is generated; this indicates that the overall performance of the h computing nodes can guarantee normal operation of the system. In this case, a processing distribution strategy for the time sequence data is formulated according to the computing node performance bump evaluation coefficient of each computing node:
The smaller the computing node performance bump evaluation coefficient of a computing node, the better the performance of that computing node.
The computing nodes for which a computing node recommended use signal was generated are sorted in ascending order of their computing node performance bump evaluation coefficients.
The time sequence data is preferentially distributed to the computing nodes with the best performance, that is, those with the smallest coefficients.
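The judgment and distribution steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `judge_and_rank`, the node identifiers, the returned signal strings, and the example thresholds are all hypothetical, and each node's computing node performance bump evaluation coefficient is assumed to have been computed already.

```python
from typing import Dict, List, Tuple


def judge_and_rank(
    coefficients: Dict[str, float],
    node_threshold: float,
    overall_threshold: float,
) -> Tuple[str, List[str]]:
    """Return the system-level signal and the preferred node order.

    coefficients maps a (hypothetical) node id to its performance bump
    evaluation coefficient; smaller means better performance.
    """
    if not coefficients:
        raise ValueError("at least one computing node is required")

    h = len(coefficients)
    # Nodes whose coefficient exceeds the per-node threshold count as poor.
    poor_count = sum(1 for c in coefficients.values() if c > node_threshold)
    # Comprehensive performance judgment value: share of poor nodes among h.
    judgment_value = poor_count / h

    if judgment_value > overall_threshold:
        # Too many poor nodes: no distribution order is produced.
        return "comprehensive bad", []

    # Only recommended nodes are ranked, in ascending order of coefficient.
    recommended = sorted(
        (node for node, c in coefficients.items() if c <= node_threshold),
        key=lambda node: coefficients[node],
    )
    return "comprehensive normal", recommended


if __name__ == "__main__":
    example = {"a": 0.2, "b": 0.9, "c": 0.1, "d": 0.4}
    print(judge_and_rank(example, node_threshold=0.5, overall_threshold=0.3))
```

In this example one of four nodes is poor, giving a judgment value of 0.25; with an overall threshold of 0.3 the normal signal results, and data would be offered first to node `c`, then `a`, then `d`. How much data each node actually receives is not specified in the text and is left out of the sketch.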
The comprehensive performance judgment threshold is set by a person skilled in the art according to the magnitude of the comprehensive performance judgment value and other practical conditions, such as the operating requirements of the plurality of computing nodes, and is not described further here.
By calculating the comprehensive performance judgment value and, when a distributed computing node comprehensive normal signal is generated, formulating a processing distribution strategy for the time sequence data, the time sequence data is distributed to the computing nodes with better performance. This ensures that the time sequence data generated by production equipment is analyzed and processed in real time, and therefore ensures both the real-time monitoring of the production equipment and the accuracy of the monitoring results.
The above formulas are all dimensionless formulas used for numerical calculation; each is the formula closest to the real situation, obtained by software simulation over a large amount of collected data, and the preset parameters and thresholds in the formulas are set by those skilled in the art according to the actual situation.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, the processes or functions described in accordance with the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless (e.g., infrared, microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that contains one or more sets of available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.
Those of ordinary skill in the art will appreciate that the various illustrative modules and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described system, apparatus and module may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; the division of the modules is merely a division by logical function, and other divisions are possible in an actual implementation: multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection via some interfaces, devices, or modules, and may be electrical, mechanical, or in other forms.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules, may be located in one place, or may be distributed over multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in each embodiment of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto; any variation or substitution readily conceivable by a person skilled in the art shall fall within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Finally, the foregoing description of the preferred embodiments of the application is not intended to limit the application to the precise form disclosed; any modifications, equivalents, and alternatives falling within the spirit and principles of the application are intended to be included within the scope of the application.

Claims (7)

1. A distributed time sequence data analysis processing method, characterized by comprising the following steps:
step S1: acquiring speed information of a computing node and calculating a time sequence data processing performance value; calculating an interval comprehensive performance value from the time sequence data processing performance value, and calculating a time sequence data processing variation index from the fluctuation of the time sequence data processing performance value;
step S2: collecting fault information of the computing node, and calculating a fault abnormality and recovery performance comprehensive evaluation index by analyzing the recovery normal time corresponding to each error report of the computing node and the frequency of error reports;
step S3: calculating a computing node performance bump evaluation coefficient by normalizing the interval comprehensive performance value, the time sequence data processing variation index, and the fault abnormality and recovery performance comprehensive evaluation index; generating a computing node recommended use signal or a computing node poor running signal by comparing the computing node performance bump evaluation coefficient with a computing node performance evaluation threshold;
step S4: calculating a comprehensive performance judgment value from the computing node recommended use signals or computing node poor running signals corresponding to a plurality of computing nodes, and generating a distributed computing node comprehensive bad signal or a distributed computing node comprehensive normal signal according to the comparison of the comprehensive performance judgment value with a comprehensive performance judgment threshold.
2. The distributed time sequence data analysis processing method according to claim 1, wherein: in step S1, a data processing performance monitoring interval is set, the data processing performance monitoring interval containing a fixed data volume of time sequence data; the data processing performance monitoring interval is equally divided into a plurality of cells, and the data volume of the time sequence data corresponding to each cell is acquired, the data volumes corresponding to all cells being identical;
the time taken to process the data volume of the time sequence data corresponding to each cell is acquired; and a time sequence data processing performance value is calculated for each cell, the time sequence data processing performance value being the ratio of the data volume of the time sequence data corresponding to the cell to the time taken to process that data volume.
3. The distributed time sequence data analysis processing method according to claim 2, wherein: the interval comprehensive performance value is the ratio of the sum of the time sequence data processing performance values corresponding to all cells within the data processing performance monitoring interval to the number of cells within the data processing performance monitoring interval; the interval comprehensive performance value is denoted [symbol].
4. The distributed time sequence data analysis processing method according to claim 2, wherein: the time sequence data processing performance value corresponding to each cell is acquired, and the time sequence data processing variation index is calculated, the expression being: [formula]; wherein [symbol] is the number of cells within the data processing performance monitoring interval, [symbol] is the serial number of the time sequence data processing performance value corresponding to a cell within the data processing performance monitoring interval, and [symbols] are positive integers greater than 1; [symbols] are respectively the time sequence data processing variation index, the time sequence data processing performance value corresponding to the [symbol]-th cell within the data processing performance monitoring interval, and the time sequence data processing performance value corresponding to the [symbol]-th cell.
5. The distributed time sequence data analysis processing method according to claim 1, wherein: in step S2, a fault monitoring interval is set and the time length corresponding to the fault monitoring interval is acquired; the number of error reports of the computing node within the fault monitoring interval is acquired, and the recovery normal time corresponding to each error report of the computing node within the fault monitoring interval is acquired;
a recovery normal time threshold is set; the number of computing node error reports within the fault monitoring interval whose recovery normal time is greater than the recovery normal time threshold is acquired, and the recovery normal time corresponding to each such error report is acquired; the fault abnormality and recovery performance comprehensive evaluation index is calculated, the expression being: [formula]; wherein [symbols] are respectively the number of computing node error reports within the fault monitoring interval whose recovery normal time is greater than the recovery normal time threshold, and the serial number of the recovery normal time corresponding to such an error report; [symbols] are positive integers greater than 1; [symbols] are respectively the fault abnormality and recovery performance comprehensive evaluation index, the number of error reports of the computing node within the fault monitoring interval, the time length corresponding to the fault monitoring interval, the recovery normal time corresponding to the [symbol]-th computing node error report within the fault monitoring interval whose recovery normal time is greater than the recovery normal time threshold, and the recovery normal time threshold.
6. The distributed time sequence data analysis processing method according to claim 1, wherein: in step S3, the expression for the computing node performance bump evaluation coefficient is: [formula]; wherein [symbol] is the computing node performance bump evaluation coefficient; [symbols] are respectively the interval comprehensive performance value, the time sequence data processing variation index, and the fault abnormality and recovery performance comprehensive evaluation index; [symbols] are respectively the preset proportionality coefficients of the interval comprehensive performance value, the time sequence data processing variation index, and the fault abnormality and recovery performance comprehensive evaluation index, and [symbols] are all greater than 0;
a computing node performance evaluation threshold is set; when the computing node performance bump evaluation coefficient is greater than the computing node performance evaluation threshold, a computing node poor running signal is generated; and when the computing node performance bump evaluation coefficient is less than or equal to the computing node performance evaluation threshold, a computing node recommended use signal is generated.
7. The distributed time sequence data analysis processing method according to claim 1, wherein: in step S4, there are h computing nodes in total, where h is a positive integer;
the number of computing nodes for which a computing node poor running signal was generated is acquired; the ratio of this number to h is calculated and taken as the comprehensive performance judgment value;
a comprehensive performance judgment threshold is set; when the comprehensive performance judgment value is greater than the comprehensive performance judgment threshold, a distributed computing node comprehensive bad signal is generated; and when the comprehensive performance judgment value is less than or equal to the comprehensive performance judgment threshold, a distributed computing node comprehensive normal signal is generated.
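
As an illustration of the quantities defined in words in claims 2 and 3, the per-cell time sequence data processing performance value and the interval comprehensive performance value could be computed as sketched below. This is a hedged example and forms no part of the claims: the function names, the units (megabytes and seconds), and the sample numbers are assumptions.

```python
from typing import List, Sequence


def cell_performance_values(
    cell_volume: float, processing_times: Sequence[float]
) -> List[float]:
    """Per-cell performance value: data volume of a cell divided by the time
    taken to process it. Every cell holds the same data volume."""
    if cell_volume <= 0:
        raise ValueError("cell data volume must be positive")
    if any(t <= 0 for t in processing_times):
        raise ValueError("processing times must be positive")
    return [cell_volume / t for t in processing_times]


def interval_comprehensive_value(values: Sequence[float]) -> float:
    """Interval comprehensive performance value: sum of the per-cell
    performance values divided by the number of cells."""
    if not values:
        raise ValueError("the monitoring interval must contain at least one cell")
    return sum(values) / len(values)


if __name__ == "__main__":
    # Assumed example: three cells of 100 MB processed in 2 s, 4 s and 5 s.
    values = cell_performance_values(100.0, [2.0, 4.0, 5.0])
    print(values, interval_comprehensive_value(values))
```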
CN202311255652.5A 2023-09-27 Distributed time sequence data analysis processing method Active CN116992245B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311255652.5A CN116992245B (en) 2023-09-27 Distributed time sequence data analysis processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311255652.5A CN116992245B (en) 2023-09-27 Distributed time sequence data analysis processing method

Publications (2)

Publication Number Publication Date
CN116992245A (en) 2023-11-03
CN116992245B (en) 2023-12-01

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023097808A1 (en) * 2021-12-03 2023-06-08 株洲瑞德尔冶金设备制造有限公司 Fault monitoring method and apparatus for sintering device
CN116681427A (en) * 2023-08-03 2023-09-01 深圳市新启发汽车用品有限公司 Self-help purchasing method and system for automobile accessories based on intelligent algorithm

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
基于距离度量和健康指数的电子设备健康评估方法;和麟;雷偲凡;刘洋;;计算机测量与控制(第10期);全文 *
基于长短时记忆―自编码神经网络的风电机组性能评估及异常检测;柳青秀;马红占;褚学宁;马斌彬;王峥;;计算机集成制造系统(第12期);全文 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant