CN113254253A - Data processing method, system and equipment - Google Patents

Info

Publication number
CN113254253A
Authority
CN
China
Prior art keywords
data
node
time
index
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110792252.2A
Other languages
Chinese (zh)
Other versions
CN113254253B (en)
Inventor
陈超
严川
张博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cloudwise Beijing Technology Co Ltd
Original Assignee
Cloudwise Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cloudwise Beijing Technology Co Ltd
Priority to CN202110792252.2A
Publication of CN113254253A
Application granted
Publication of CN113254253B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751 Error or fault detection not based on redundancy

Abstract

The invention discloses a data processing method, system and equipment. The method comprises: receiving and caching index time series data collected in real time; and transmitting the index time series data to the node corresponding to the index on a node ring for anomaly detection processing, and obtaining and outputting a detection result, wherein the node ring comprises a plurality of nodes and each node corresponds to one index. In this way, the availability of a real-time anomaly detection system for time series data is improved; computation time is reduced, so more index data can be detected per unit time; and fault tolerance is improved, so that a delay in an individual time period does not spill over into later time periods.

Description

Data processing method, system and equipment
Technical Field
The present invention relates to the field of computer data processing technologies, and in particular, to a data processing method, system, and device.
Background
Stream data is a sequence of ordered, large-volume, fast and continuously arriving data, and can generally be viewed as a dynamic data collection that grows without bound over time. Time series data is the most typical kind of stream data and the most common kind of observed index in operation and maintenance. Performing anomaly detection on time series data as it is generated allows faults to be discovered and handled promptly. In an intelligent operation and maintenance scenario, the service system must be monitored comprehensively and in real time so that system problems are found in time, and this involves real-time anomaly detection on a large number of time series indexes.
In the prior art, real-time detection systems based on a database and a file system have a complex workflow, are time-consuming, and have poor fault tolerance.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a data processing method, system and equipment that solve the problems of prior-art real-time detection systems based on a database and a file system, namely a complex workflow, high time consumption and poor fault tolerance.
In order to solve the technical problems, the technical scheme of the invention is as follows:
a data processing method, comprising:
receiving and caching index time series data collected in real time;
and transmitting the index time series data to the node corresponding to the index on a node ring for anomaly detection processing, and obtaining and outputting a detection result, wherein the node ring comprises a plurality of nodes, and each node corresponds to one index.
Optionally, the receiving and caching of the index time series data collected in real time includes:
in a first unit time, receiving and caching the index time series data collected in real time through a receiving process.
Optionally, the node has a historical data storage area, a to-be-detected data area, and an anomaly detection algorithm model area for the index corresponding to the node.
Optionally, the transmitting of the index time series data to the node corresponding to the index on the node ring for anomaly detection processing includes:
in a second unit time after the first unit time, transmitting the index time series data through a detection process to the to-be-detected data area of the node corresponding to the index on the node ring;
and starting from a starting node on the node ring, performing anomaly detection processing on the to-be-detected data of each node in turn, until every node has been detected or the detection time ends.
Optionally, the data processing method further includes: when the detection time ends, if not all nodes on the node ring have been detected, taking the node that was undergoing anomaly detection processing when the detection time ended as the starting node when the next detection time begins, and performing the anomaly detection processing.
Optionally, after the index time series data is transmitted to the to-be-detected data area of the node corresponding to the index on the node ring, the method further includes:
performing at least one of deduplication, sorting and completion on a backup copy of the index time series data to obtain a first processing result;
and merging the first processing result with the historical data in the historical data storage area to obtain merged historical data.
Optionally, the data processing method further includes: emptying the to-be-detected data area of the node after the anomaly detection processing of the node corresponding to the index on the node ring.
An embodiment of the present invention further provides a data processing system, including:
a receiving module, used for receiving and caching index time series data collected in real time;
a processing module, used for transmitting the index time series data to the node corresponding to the index on a node ring for anomaly detection processing to obtain a detection result;
and an output module, used for outputting the detection result, wherein the node ring comprises a plurality of nodes, and each node corresponds to one index.
An embodiment of the present invention further provides an electronic device, including: a processor, a memory storing a computer program which, when executed by the processor, performs the data processing method as described above.
Embodiments of the present invention also provide a computer-readable storage medium storing instructions that, when executed on a computer, cause the computer to perform the data processing method as described above.
The scheme of the invention at least comprises the following beneficial effects:
the method receives and caches index time series data collected in real time, transmits the index time series data to the node corresponding to the index on a node ring for anomaly detection processing, and obtains and outputs a detection result, wherein the node ring comprises a plurality of nodes and each node corresponds to one index. The availability of the real-time anomaly detection system for time series data is improved; computation cost is low, so more index data can be detected per unit time; and fault tolerance is high, so a delay in an individual time period does not spill over into later time periods.
The foregoing is only an overview of the technical solutions of the embodiments of the present invention. So that the technical means of the embodiments can be understood more clearly and implemented according to the description, and so that the above and other objects, features and advantages of the embodiments are more readily apparent, specific embodiments of the present invention are described below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the embodiments of the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a flow chart of a data processing method provided by an embodiment of the present invention;
FIG. 2 is a flow chart of the anomaly detection process provided by an embodiment of the present invention;
FIG. 3 is a flow chart of ring node detection provided by an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a data processing system provided by an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a computing device provided by an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As shown in FIG. 1, an embodiment of the present invention provides a data processing method, including:
step 11, receiving and caching index time series data collected in real time;
step 12, transmitting the index time series data to the node corresponding to the index on a node ring for anomaly detection processing, and obtaining and outputting a detection result, wherein the node ring comprises a plurality of nodes, and each node corresponds to one index.
The data processing method of this embodiment receives and caches index time series data collected in real time, transmits it to the node corresponding to the index on a node ring for anomaly detection processing, and obtains and outputs a detection result, wherein the node ring comprises a plurality of nodes and each node corresponds to one index. The availability of the real-time anomaly detection system for time series data is improved; computation time is reduced, so more index data can be detected per unit time; and fault tolerance is improved, so a delay in an individual time period does not spill over into later time periods.
As shown in the anomaly detection process flow chart in FIG. 2, in an optional embodiment of the present invention, step 11 may include: in a first unit time, receiving and caching the index time series data collected in real time through a receiving process.
In this embodiment, the receiving process receives the data collected by the collector in each unit time and caches it on the machine where the system is located; the collector typically collects data once per minute. In the first unit time, the latest index time series data collected by the collector is transmitted to a file system or a database for storage. The collected index time series data is index data of the monitored service system, such as memory utilization, advertisement clicks per minute, interface response time, number of interface calls, CPU utilization and the like.
In the first unit time after the system starts, the receiving process receives the data collected by the collector; in the second unit time, the receiving process distributes the real-time data received in the previous unit time to the detection processes, then clears that data and receives the real-time data of the second unit time.
Meanwhile, once each detection process has received its data, it starts anomaly detection on each index's data. Throughout the process, except for the first unit time, the receiving performed by the receiving process and the detection performed by the detection processes run in parallel, which improves detection efficiency.
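The hand-off between receiving and detection described above can be sketched as follows. This is a minimal, single-threaded simulation under stated assumptions, not the patented implementation: the names Receiver, hand_off and run_units are hypothetical, and the real system runs the two processes in parallel rather than in one loop.

```python
from collections import defaultdict


class Receiver:
    """Buffers points for the current unit time (hypothetical helper)."""

    def __init__(self):
        self._buffer = defaultdict(list)  # index name -> [(timestamp, value), ...]

    def receive(self, index, timestamp, value):
        self._buffer[index].append((timestamp, value))

    def hand_off(self):
        """Return everything buffered so far and start a fresh, empty buffer."""
        batch, self._buffer = dict(self._buffer), defaultdict(list)
        return batch


def run_units(units):
    """Simulate the schedule: data received in unit t is detected in unit t+1.

    `units` holds one list of (index, timestamp, value) tuples per unit time.
    Returns, for each unit, the batch that detection would work on.
    """
    receiver = Receiver()
    detected_per_unit = []
    for points in units:
        # Start of the unit: distribute what arrived during the previous unit.
        detected_per_unit.append(receiver.hand_off())
        # During the unit: keep receiving. In the described system this
        # overlaps with detection; here it is sequential for clarity.
        for index, timestamp, value in points:
            receiver.receive(index, timestamp, value)
    return detected_per_unit
```

In the first unit the detection batch is empty, which matches the statement that only receiving takes place in the first unit time.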
In an optional embodiment of the present invention, the node has a historical data storage area, a to-be-detected data area, and an anomaly detection algorithm model area for the index corresponding to the node.
Specifically, the historical data storage area holds the historical data of the index corresponding to the node, cached in memory.
After receiving the latest index time series data, the node performs deduplication, sorting, completion and similar processing on a backup copy of the newly received data, and then merges it with the cached historical data to obtain merged data.
A common system fetches the historical data from the database every time and has the algorithm deduplicate, sort and complete it. In the data processing method provided by the embodiment of the invention, each node caches its historical data in memory, which saves the time of fetching and transmitting historical data from the database.
The to-be-detected data area holds the index time series data collected in real time, which is received and cached through the receiving process in the first unit time.
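One way to picture a node and its three areas is the small data structure below. It is only an illustrative sketch: the class name Node, its field names and the model signature (history, point) -> bool are assumptions, not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Point = Tuple[int, float]  # (timestamp in seconds, value); an assumed layout


@dataclass
class Node:
    index: str                                    # the one index this node serves
    model: Callable[[List[Point], Point], bool]   # anomaly detection model area
    history: List[Point] = field(default_factory=list)  # historical data area
    pending: List[Point] = field(default_factory=list)  # to-be-detected data area

    def accept(self, points):
        """Cache newly distributed points until this node's turn on the ring."""
        self.pending.extend(points)

    def detect(self):
        """Run the model on every pending point, then empty the pending area."""
        results = [(p, self.model(self.history, p)) for p in self.pending]
        self.pending.clear()
        return results
```

Keeping the history inside the node is what lets detection avoid a database read on every round.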
Regarding the anomaly detection algorithm model area: anomaly detection processing is commonly built on a distributed framework that computes and stores real-time data. Take Storm as an example: Storm has a master-slave architecture with one master node and a plurality of worker nodes. The master node distributes tasks to the worker nodes and monitors the execution of each worker node; each worker node listens to the master node and starts or stops worker processes as required. On a cluster, a distributed system can pool and use the cluster's resources, but on a single machine a distributed system such as Spark costs extra time because of task distribution and similar work.
Different time series anomaly detection algorithms place different requirements on data quality: statistical methods such as k-sigma place no requirements on the time series data; time series fitting algorithms such as Prophet require the data to be in order; classical time series forecasting algorithms such as ARIMA require the data to be in order, equally spaced and free of missing values. Streaming data frameworks such as Storm focus on distributing and storing large-scale streaming data at the framework level; they are responsible only for data transmission and storage, not for the computation inside the underlying worker processes.
In large-scale time series anomaly detection, detection of all indexes must be completed within each unit time. The number of indexes is far larger than the number of processes, so each process must complete detection of a large number of indexes per unit time.
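As an illustration of the simplest model family mentioned above, a k-sigma check can be written in a few lines. This is a generic sketch of the k-sigma rule, not the model used in the patent; the function name, the default k = 3 and the handling of constant history are assumptions.

```python
from statistics import mean, pstdev


def k_sigma_is_anomaly(history, value, k=3.0):
    """Flag `value` if it lies more than k standard deviations from the mean.

    `history` is a plain list of past values; order, spacing and gaps do not
    matter, which is why this kind of method tolerates low-quality series.
    """
    if len(history) < 2:
        return False  # too little data to estimate spread (assumed policy)
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return value != mu  # constant history: any change counts as a deviation
    return abs(value - mu) > k * sigma
```

Fitting-based models such as Prophet or ARIMA would need the ordered, gap-free history that the node's preprocessing produces.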
When the number of indexes is in the hundreds of thousands, under common hardware configurations each process must complete detection for thousands of indexes, and within a process the indexes can only be detected one after another. With a 1-minute detection interval, if a process is assigned 4000 indexes, the detection time available to each index averages only 15 ms (60 s / 4000). The time consumed by the detection process is affected only by the algorithm's computation time and is not limited by data transmission time, interface response time, the number of interface calls, CPU utilization or memory usage.
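The 15 ms figure follows directly from dividing the detection interval by the number of indexes a process handles. The helper below only restates that arithmetic; its name is hypothetical.

```python
def per_index_budget_ms(interval_seconds, indexes_per_process):
    """Average detection time available to each index, in milliseconds."""
    return interval_seconds * 1000 / indexes_per_process


# The example from the text: a 1-minute interval and 4000 indexes per process.
budget = per_index_budget_ms(60, 4000)
```

Anything that adds per-index overhead, such as a database read, has to fit inside this budget as well.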
As shown in the ring node detection flow chart in FIG. 3, in an optional embodiment of the present invention, in step 12, transmitting the index time series data to the node corresponding to the index on the node ring for anomaly detection processing includes:
in a second unit time after the first unit time, transmitting the index time series data through a detection process to the to-be-detected data area of the node corresponding to the index on the node ring;
and starting from a starting node on the node ring, performing anomaly detection processing on the to-be-detected data of each node in turn, until every node has been detected or the detection time ends.
In this embodiment, the index time series data collected in real time is first transmitted to the corresponding nodes of the ring-structure detection machine of the anomaly detection system, and anomaly detection processing is then performed.
When the system starts, only data transmission takes place in the first unit time: the data collected by the collector in that unit time is transmitted to the machine where the system is located. In the second unit time, the data collected by the collector in the second unit time is transmitted while the system simultaneously performs time series anomaly detection. The serial process of data collection followed by detection thus becomes a parallel one. Although the result for real-time data is produced one unit time later, the time consumed by a single index detection is reduced, and the number of indexes the machine can detect per unit time is greatly increased.
In the ring-structure detection machine, a plurality of nodes are arranged on a ring; each node is matched with one index and stores the historical data, the to-be-detected data and the algorithm model of that index. In each unit time, the data to be detected is first cached in the nodes, and index anomaly detection is then performed node by node in ring order. In each round of detection, starting from a certain node on the ring, detection is performed on the index data of each subsequent node in turn. When the unit time is about to end, the system stops detecting and outputs the results computed so far, without occupying the next unit time. After a node has performed detection, the to-be-detected data cached in that node is cleared. A round of detection ends when every node on the ring has performed one detection or when the execution time approaches the unit time limit. After detection finishes, the anomaly detection results are sent to the corresponding users.
In an optional embodiment of the present invention, the data processing method further includes:
step 13, when the detection time ends, if not all nodes on the node ring have been detected, taking the node that was undergoing anomaly detection processing when the detection time ended as the starting node when the next detection time begins, and performing the anomaly detection processing.
In this embodiment, a round of detection may end, either because its completion condition is reached or because of a fault or similar factor, while some nodes still hold undetected data, for example when the data volume in a unit time is too large. In that case, in the next unit time, after the system caches the real-time data and starts detecting, the subsequent detection tasks are executed in order from the last executed node, ensuring that no data is left without a detection result.
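The round-based traversal with a deadline and a resume point can be sketched as below. This is a simplified illustration under assumptions: the class NodeRing, the detect callback and the injectable clock are hypothetical, and the deadline is checked before each node, so the node to resume from is the first one not yet processed.

```python
import time


class NodeRing:
    def __init__(self, indexes, detect, clock=time.monotonic):
        self.indexes = list(indexes)                   # ring order, one node per index
        self.pending = {name: [] for name in self.indexes}
        self.detect = detect                           # callable(index, points) -> result
        self.clock = clock                             # injectable for testing
        self.start = 0                                 # where the next round begins

    def cache(self, batch):
        """Store newly received points in each node's to-be-detected area."""
        for name, points in batch.items():
            self.pending[name].extend(points)

    def run_round(self, deadline):
        """Visit nodes in ring order until all are done or the deadline passes."""
        results = {}
        n = len(self.indexes)
        for step in range(n):
            if self.clock() >= deadline:
                # Out of time: remember this node so the next round resumes here.
                self.start = (self.start + step) % n
                return results
            name = self.indexes[(self.start + step) % n]
            if self.pending[name]:
                results[name] = self.detect(name, self.pending[name])
                self.pending[name] = []                # clear after detection
        return results
```

Because unfinished nodes keep their pending data, a slow round delays those indexes by one unit time instead of dropping their results.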
The nodes on the node ring may encounter various abnormal situations, and the data processing method provided in the embodiment of the present invention provides a corresponding logical processing method, as follows:
Real-time data is missing. Because of collector failures, abnormal network transmission and similar problems, there may be no data for some time periods. First, while real-time data is missing, the to-be-detected data cached in the node of the affected index is empty, so no detection can be performed, and the cached historical data is not updated because there are no new data points. Second, once the gap ends and the node splices the new data onto the historical data, the missing data can be filled in by missing-value imputation, to satisfy algorithms that require training data without gaps.
Real-time data is out of order. Likewise, because of network problems, the index data received by the system is not guaranteed to arrive in order. For example, data for 00:01:00 to 00:02:00 may be received at 00:02:00, and data covering 00:00:00 to 00:01:00 may not be received until 00:03:00. In this example, after receiving the real-time data for 00:01:00 to 00:02:00, the algorithm fills in the missing 00:00:00 to 00:01:00 period of the training data and then performs the calculation. When the real-time data covering 00:00:00 to 00:01:00 arrives at 00:03:00, the system replaces the filled-in 00:00:00 to 00:01:00 values in the training data with the real data, caches that real-time data in the node's to-be-detected data, and detects all the to-be-detected data after algorithm training finishes.
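Replacing filled-in values with late-arriving real data can be sketched by tracking which timestamps hold placeholders. The class and method names below are hypothetical, and timestamps are plain seconds for brevity.

```python
class History:
    """Cached training data that remembers which points were filled in."""

    def __init__(self):
        self.values = {}      # timestamp -> value
        self.filled = set()   # timestamps whose value is only a placeholder

    def fill_gap(self, timestamp, value):
        """Insert a placeholder, but never overwrite an existing point."""
        if timestamp not in self.values:
            self.values[timestamp] = value
            self.filled.add(timestamp)

    def add_real(self, timestamp, value):
        """Store a real point; it replaces any placeholder at that timestamp."""
        self.values[timestamp] = value
        self.filled.discard(timestamp)

    def series(self):
        """Return the history as a list of (timestamp, value) in time order."""
        return sorted(self.values.items())
```

Sorting on read keeps the stored series ordered regardless of arrival order.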
The data is not equally spaced. The algorithm requires training data at equal intervals and equal granularity, but because of precision and similar problems the collector cannot collect data at exactly equal intervals. For example, with the interval set to 1 minute, the timestamps of the collected data may be: "00:00:59, 00:02:00, 00:03:02, 00:03:59, ...". When the real-time data is spliced onto the training data, its timestamps are rounded down to the whole minute; in this example they become: "00:00:00, 00:02:00, 00:03:00, 00:03:00, ...". Duplicates are then removed and missing values filled in. After the algorithm is trained, the original to-be-detected data is used for detection, which guarantees data consistency: the detection result the algorithm outputs is the result for the input data.
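The rounding, deduplication and gap-filling steps in this example can be sketched as follows. The function name is hypothetical, and two choices are assumptions the text does not specify: when two points fall in the same minute the later one is kept, and gaps are filled by carrying the previous value forward.

```python
from datetime import timedelta


def regularize(points):
    """Turn irregular (datetime, value) points into a gap-free 1-minute series."""
    by_minute = {}
    for timestamp, value in sorted(points, key=lambda p: p[0]):
        # Round down to the whole minute, as in the 00:00:59 -> 00:00:00 example.
        bucket = timestamp.replace(second=0, microsecond=0)
        by_minute[bucket] = value  # dedupe: the later point in a minute wins
    if not by_minute:
        return []
    series = []
    current, last = min(by_minute), max(by_minute)
    previous = None
    while current <= last:
        if current in by_minute:
            previous = by_minute[current]
        series.append((current, previous))  # missing minutes reuse the last value
        current += timedelta(minutes=1)
    return series
```

Applied to the four timestamps in the example, this yields one point for each of the minutes 00:00, 00:01, 00:02 and 00:03.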
Individual time periods do not complete all index calculations. Computing resources are limited, and to save cost and make full use of computing performance, a machine with limited resources is assigned computing tasks close to its performance limit. Because of occasional faults in collectors, the network and so on, the amount of data to be detected in a unit time may far exceed the usual level. In each unit time, the system first caches the data to be detected in the nodes and then performs index anomaly detection node by node in ring order. When the unit time is about to end, the system stops detecting and outputs the results computed so far, without occupying the next unit time. If faults or other factors make the data volume in some unit time so large that not all nodes can be processed within it, then in the next unit time, after the system caches the real-time data and starts detecting, the subsequent detection tasks are executed in order from the last unfinished node, which solves the problem of some data having no detection result.
In an optional embodiment of the present invention, in step 12, after the index time series data is transmitted to the to-be-detected data area of the node corresponding to the index on the node ring, the method further includes:
step 121, performing at least one of deduplication, sorting and completion on a backup copy of the index time series data to obtain a first processing result;
and step 122, merging the first processing result with the historical data in the historical data storage area to obtain merged historical data.
In this embodiment, the first processing result is obtained by performing at least one of deduplication, sorting and completion on the backup copy of the time series data, which reduces the complexity of algorithm development.
After receiving the latest index data, the node merges the latest index data with the cached historical data and eliminates outdated historical data.
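Merging new points into the cached history and discarding outdated ones might look like the sketch below. The function name and the 7-day default retention window are assumptions; the text only says that outdated historical data is eliminated.

```python
def merge_history(history, new_points, retention_seconds=7 * 24 * 3600):
    """Merge (timestamp, value) pairs and keep only the recent window.

    Timestamps are seconds. A new point replaces an old one with the same
    timestamp, and the result is returned in time order.
    """
    merged = dict(history)
    merged.update(new_points)          # dedupe by timestamp, newest data wins
    if not merged:
        return []
    cutoff = max(merged) - retention_seconds
    return sorted((ts, v) for ts, v in merged.items() if ts > cutoff)
```

Because only the one or two newest points need processing each round, the cost of this merge is small compared with re-reading the whole history.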
Based on the first processing result, the algorithm model is trained, and anomaly detection is then performed on the most recently received data in the historical data. When the system has strict real-time requirements, it generally detects the data points of the latest 1 minute and must complete the whole process from data collection to result output within 1 minute. The system reads historical data for the most recent period, typically 7 days, from a database or file system. The algorithm deduplicates, sorts and completes the data according to its own quality requirements for time series data.
In an optional embodiment of the present invention, the data processing method further includes:
step 14, emptying the to-be-detected data area of the node after the anomaly detection processing of the node corresponding to the index on the node ring.
In this embodiment, during ring detection, a node that has completed detection empties its to-be-detected data area. In each round of detection, starting from a certain node on the ring, detection is performed on the index data of each subsequent node in turn. After a node has performed detection, the to-be-detected data cached in that node is cleared.
In the method of the above embodiment of the present invention, in each unit time the data to be detected is cached in the nodes, and index anomaly detection is performed node by node in ring order. When the unit time is about to end, the system stops detecting and outputs the results computed so far, without occupying the next unit time. If faults or other factors make the data volume in some unit time so large that not all nodes can be processed within it, then in the next unit time, after the system caches the real-time data and starts detecting, the subsequent detection tasks are executed in order from the last unfinished node, which solves the problem of some data having no detection result. Each index corresponds to one node, and the system performs detection tasks in the order of the nodes on the ring. When computation is slow in some unit time and the detection tasks of all indexes are not completed, the system continues in order from the last executed node, so no index is omitted.
In the system of the present invention, the node corresponding to each index caches the history data of the corresponding index in the memory. The system only needs to acquire the latest data generated by the collector in each unit time. Compared with the common system which acquires the historical training data from the database each time and preprocesses the data through an algorithm, in the system, each node caches the corresponding historical data in the memory, so that the time for acquiring the historical data from the database and transmitting the historical data is saved.
In the embodiment of the invention, the preprocessed historical data is cached in the node; after real-time data enters the node, only the real-time data is preprocessed and then appended to the regularized historical data. The collection interval of the collector is generally 30 s to 60 s, that is, a single index produces only 1 to 2 real-time data points per minute, so preprocessing the real-time data takes far less time than preprocessing 7 days of historical data.
In the method, historical data does not need to be fetched, and the transmission process and the calculation process run asynchronously. That is, when the system starts, only data transmission takes place in the first unit time: the data collected by the collector in that unit time is transmitted to the machine where the system is located. In the second unit time, the data collected by the collector in the second unit time is transmitted while the system simultaneously performs time series anomaly detection. The serial process of data collection followed by detection thus becomes a parallel one. Although the result for real-time data is produced one unit time later, the time consumed by a single index detection is reduced, and the number of indexes the machine can detect per unit time is greatly increased.
Fig. 4 is a schematic structural diagram illustrating a data processing system according to an embodiment of the present invention. As shown in fig. 4, the system includes:
the receiving module 41 is configured to receive and cache the index time series data acquired in real time;
the processing module 42 is configured to transmit the indicator time sequence data to a node on the node ring corresponding to the indicator to perform anomaly detection processing, so as to obtain a detection result;
and an output module 43, configured to output the detection result, where the node ring includes multiple nodes, and one node corresponds to one index.
Optionally, when receiving and buffering the index time series data acquired in real time, the receiving module 41 is specifically configured to receive and buffer the index time series data acquired in real time through a receiving process in a first unit time.
Optionally, the node has a historical data storage region, a data region to be detected, and an anomaly detection algorithm model region of an index corresponding to the node.
Optionally, the processing module 42 transmits the indicator time series data to the node ring, and when performing the anomaly detection processing in the node corresponding to the indicator, is specifically configured to: transmitting the index time sequence data to a to-be-detected data area of a node corresponding to the index on the node ring through a detection process in a second unit time after the first unit time; and starting from the initial node on the node ring, sequentially carrying out anomaly detection processing on the data to be detected of each node in sequence until the detection of each node is finished or the detection time is finished.
Optionally, the processing module 42 is further configured to, when the detection time ends and not all nodes on the node ring have been detected, take the node that was undergoing anomaly detection processing when the detection time ended as the starting node when the next detection time begins, and perform the anomaly detection processing from that node.
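The time-bounded traversal of the ring with resumption can be sketched as follows. The names `RingScanner`, `scan`, `budget_s`, and the injectable `clock` are hypothetical. As a simplification, the deadline is checked between nodes rather than interrupting a node mid-detection, so the node that could not be started becomes the next starting node; leaving the starting node unchanged after a full pass is also an assumption.

```python
import time
from typing import Callable, Dict, List


class RingScanner:
    """Visit ring nodes in order within a time budget, resuming where it stopped."""

    def __init__(self, index_names: List[str]) -> None:
        self.index_names = index_names
        self.start = 0  # position of the starting node for the next window

    def scan(
        self,
        detect: Callable[[str], bool],
        budget_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> Dict[str, bool]:
        deadline = clock() + budget_s
        results: Dict[str, bool] = {}
        n = len(self.index_names)
        for step in range(n):
            pos = (self.start + step) % n  # wrap around the ring
            if clock() >= deadline:
                # Detection time is over: remember this node as the
                # starting node of the next detection window.
                self.start = pos
                return results
            name = self.index_names[pos]
            results[name] = detect(name)
        # Full pass completed; the starting node is left unchanged.
        return results
```

Passing a fake `clock` makes the behavior deterministic in tests, while the default `time.monotonic` is suitable for measuring elapsed time in a running process.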
Optionally, the processing module 42 is further configured to perform at least one of deduplication, sorting, and completion processing on the backup data of the index time series data to obtain a first processing result;
and merging the first processing result with the historical data in the historical data storage area to obtain merged historical data.
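The deduplication, sorting, completion, and merge steps could look like the sketch below. The text does not say how completion is performed, so forward-filling missing timestamps is an assumption, as are the function names `preprocess` and `merge_history`, the fixed sampling `step`, and the requirement that timestamps lie on the `step` grid (off-grid samples are dropped in this sketch).

```python
from typing import Dict, List, Tuple

Sample = Tuple[int, float]  # assumed layout: (timestamp in seconds, value)


def preprocess(backup: List[Sample], step: int) -> List[Sample]:
    """Deduplicate by timestamp, sort, and fill missing timestamps."""
    if not backup:
        return []
    by_ts: Dict[int, float] = {}
    for ts, value in backup:
        by_ts[ts] = value  # a later duplicate overwrites an earlier one
    ordered = sorted(by_ts)
    filled: List[Sample] = []
    last_value = by_ts[ordered[0]]
    for ts in range(ordered[0], ordered[-1] + 1, step):
        # Completion (assumed): forward-fill gaps with the last known value.
        last_value = by_ts.get(ts, last_value)
        filled.append((ts, last_value))
    return filled


def merge_history(history: List[Sample], processed: List[Sample]) -> List[Sample]:
    """Merge processed samples into history; newer values win on conflicts."""
    merged = dict(history)
    merged.update(processed)
    return sorted(merged.items())
```

Because this work operates on a backup copy of the incoming data, it can proceed independently of the detection pass over the to-be-detected region.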
Optionally, the processing module 42 is further configured to empty the to-be-detected data area of a node after the anomaly detection processing of that node, corresponding to the index on the node ring, is completed.
It should be noted that this embodiment is a system embodiment corresponding to the above method embodiment, and all implementation manners in the above method embodiment are applicable to this system embodiment, and the same technical effect can be achieved.
An embodiment of the present invention provides a non-volatile computer storage medium in which at least one executable instruction is stored; when executed, the instruction causes a computer to perform the data processing method in any of the above method embodiments.
Fig. 5 is a schematic structural diagram of a computing device according to an embodiment of the present invention, and the specific embodiment of the present invention does not limit the specific implementation of the computing device.
As shown in fig. 5, the computing device may include: a processor, a communication interface, a memory, and a communication bus.
The processor, the communication interface, and the memory communicate with each other via the communication bus. The communication interface is used for communicating with network elements of other devices, such as clients or other servers. The processor is used for executing a program, and in particular may execute the relevant steps of the above data processing method embodiments for the computing device.
In particular, the program may include program code comprising computer operating instructions.
The processor may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The computing device includes one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory is used for storing the program. The memory may comprise high-speed RAM and may also include non-volatile memory, such as at least one disk memory.
The program may specifically be adapted to cause a processor to execute the method of processing data in any of the above-described method embodiments. For specific implementation of each step in the program, reference may be made to corresponding steps and corresponding descriptions in units in the foregoing data processing method embodiments, which are not described herein again. It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described devices and modules may refer to the corresponding process descriptions in the foregoing method embodiments, and are not described herein again.
The algorithms or displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. In addition, embodiments of the present invention are not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of embodiments of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best modes of embodiments of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the embodiments of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. This manner of disclosure, however, should not be interpreted as reflecting an intention that the claimed embodiments of the invention require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments herein include some features included in other embodiments but not other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functionality of some or all of the components according to embodiments of the present invention. Embodiments of the present invention may also be embodied as device or system programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing embodiments of the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. Embodiments of the invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several systems, several of these systems may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any ordering; these words may be interpreted as names. The steps in the above embodiments should not be construed as limiting the order of execution unless otherwise specified.

Claims (10)

1. A data processing method, comprising:
receiving and caching real-time collected index time sequence data;
and transmitting the index time sequence data to a node corresponding to the index on a node ring to perform anomaly detection processing, and obtaining and outputting a detection result, wherein the node ring comprises a plurality of nodes, and one node corresponds to one index.
2. The data processing method of claim 1, wherein receiving and caching the real-time collected index time sequence data comprises:
and in the first unit time, receiving and caching the index time sequence data collected in real time through a receiving process.
3. The data processing method according to claim 2, wherein the node has a history data storage region, a data region to be detected, and an anomaly detection algorithm model region of an index corresponding to the node.
4. The data processing method according to claim 3, wherein transmitting the index time sequence data to the node corresponding to the index on the node ring to perform anomaly detection processing comprises:
transmitting, through a detection process in a second unit time after the first unit time, the index time sequence data to the to-be-detected data area of the node corresponding to the index on the node ring;
and starting from the starting node on the node ring, performing anomaly detection processing on the to-be-detected data of each node in sequence until every node has been detected or the detection time ends.
5. The data processing method of claim 4, further comprising:
and when the detection time ends, if not all nodes on the node ring have been detected, taking the node that was undergoing anomaly detection processing at the end of the detection time as the starting node when the next detection time begins, and performing the anomaly detection processing.
6. The data processing method according to claim 1, wherein after transmitting the indicator time series data to the to-be-detected data region of the node corresponding to the indicator on the node ring, the method further comprises:
performing at least one of duplicate removal, sorting and completion processing on the backup data of the index time sequence data to obtain a first processing result;
and merging the first processing result with the historical data in the historical data storage area to obtain merged historical data.
7. The data processing method of claim 1, further comprising:
and after the anomaly detection processing of the node corresponding to the index on the node ring, emptying the to-be-detected data area of the node.
8. A data processing system, comprising:
the receiving module is used for receiving and caching the index time sequence data collected in real time;
the processing module is used for transmitting the index time sequence data to a node corresponding to the index on the node ring for anomaly detection processing to obtain a detection result;
and the output module is used for outputting the detection result, the node ring comprises a plurality of nodes, and one node corresponds to one index.
9. An electronic device, comprising: a processor, and a memory storing a computer program which, when executed by the processor, performs the data processing method of any one of claims 1 to 7.
10. A computer-readable storage medium storing instructions which, when executed on a computer, cause the computer to perform the data processing method of any one of claims 1 to 7.
CN202110792252.2A 2021-07-14 2021-07-14 Data processing method, system and equipment Active CN113254253B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110792252.2A CN113254253B (en) 2021-07-14 2021-07-14 Data processing method, system and equipment

Publications (2)

Publication Number Publication Date
CN113254253A true CN113254253A (en) 2021-08-13
CN113254253B CN113254253B (en) 2021-11-02

Family

ID=77191184

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110792252.2A Active CN113254253B (en) 2021-07-14 2021-07-14 Data processing method, system and equipment

Country Status (1)

Country Link
CN (1) CN113254253B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH06112993A (en) * 1992-09-29 1994-04-22 Mitsubishi Electric Corp Serial data communication equipment
CN104954153A (en) * 2014-03-24 2015-09-30 中兴通讯股份有限公司 Method and device for node fault detection
CN105764162A (en) * 2016-05-10 2016-07-13 江苏大学 Wireless sensor network abnormal event detecting method based on multi-attribute correlation
WO2019098199A1 (en) * 2017-11-17 2019-05-23 日本電気株式会社 Information processing device, information processing method, and recording medium
CN110113406A (en) * 2019-04-29 2019-08-09 成都网阔信息技术股份有限公司 Based on distributed calculating service cluster frame
CN110213125A (en) * 2019-05-23 2019-09-06 南京维拓科技股份有限公司 Abnormality detection system based on time series data under a kind of cloud environment
CN111522680A (en) * 2020-04-17 2020-08-11 支付宝(杭州)信息技术有限公司 Method, device and equipment for automatically repairing abnormal task node
CN111708672A (en) * 2020-06-15 2020-09-25 北京优特捷信息技术有限公司 Data transmission method, device, equipment and storage medium
CN112363893A (en) * 2021-01-11 2021-02-12 杭州涂鸦信息技术有限公司 Method, equipment and device for detecting time sequence index abnormity
CN112380209A (en) * 2020-10-29 2021-02-19 华东师范大学 Block chain multi-channel state data-oriented structure tree aggregation method
CN112818066A (en) * 2019-11-15 2021-05-18 深信服科技股份有限公司 Time sequence data anomaly detection method and device, electronic equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114462900A (en) * 2022-04-13 2022-05-10 云智慧(北京)科技有限公司 Method, device and equipment for splitting service active node
CN114462900B (en) * 2022-04-13 2022-07-29 云智慧(北京)科技有限公司 Method, device and equipment for splitting service active node

Also Published As

Publication number Publication date
CN113254253B (en) 2021-11-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant