CN116070897A - Order risk control method and device based on anomaly detection algorithm, and storage medium - Google Patents


Publication number
CN116070897A
CN116070897A CN202111239552.4A CN202111239552A CN116070897A CN 116070897 A CN116070897 A CN 116070897A CN 202111239552 A CN202111239552 A CN 202111239552A CN 116070897 A CN116070897 A CN 116070897A
Authority
CN
China
Prior art keywords
data
moving average
anomaly detection
residual
sample data
Legal status
Pending
Application number
CN202111239552.4A
Other languages
Chinese (zh)
Inventor
方传艺
Current Assignee
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Application filed by China Mobile Communications Group Co Ltd and China Mobile Suzhou Software Technology Co Ltd
Priority to CN202111239552.4A
Publication of CN116070897A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Operations Research (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

The invention discloses an order risk control method, an order risk control device, a computing device and a storage medium based on an anomaly detection algorithm. In the technical solution provided by the invention, a historical order data set is acquired and preprocessed to obtain a sample data set; for each sample data in the sample data set, a plurality of residual data corresponding to the sample data are calculated according to a plurality of moving average indexes; a model is trained using the plurality of residual data corresponding to the plurality of sample data to obtain a target anomaly detection model; and a plurality of residual data corresponding to real-time data are calculated according to the plurality of moving average indexes and input into the target anomaly detection model for risk prediction to obtain a corresponding risk prediction result. By combining data preprocessing with different moving average indexes, the invention adapts the model to both the periodicity and the real-time behavior of the data while fully mining potential anomaly information.

Description

Order risk control method and device based on anomaly detection algorithm, and storage medium
Technical Field
The invention relates to the field of risk control data, and in particular to an order risk control method, an order risk control device, a computing device and a computer storage medium based on an anomaly detection algorithm.
Background
With the development of the telecommunications industry, service traffic and the corresponding order volume are growing rapidly, and the large volume of order data puts pressure on the normal operation of the system. Early-warning risk control therefore needs to be applied to the order data, and risk prediction performed on it.
In the prior art, three approaches are generally used. The first is rule-based: the order quantity at the current time is compared with that at the previous time, and an alarm is raised when the change exceeds a certain threshold. The second monitors and alarms using a baseline alarm model built by statistical methods. The third builds a predictor from a prediction algorithm, predicts the value of the order data at the current moment, compares the predicted value with the true value, and uses a threshold to judge whether the order data at the current moment is abnormal. Of these, the first approach has both a high false-alarm rate and a high missed-alarm rate. The second takes the periodic characteristics of the order data time series into account but ignores its real-time characteristics, so its false-alarm and missed-alarm rates are also high, and because alarms are frequent, operations and maintenance staff cannot respond to problems in time. In the third, a point is judged abnormal if the difference between the actual value and the predicted value exceeds a certain proportion of the predicted value, and the false-alarm rate is likewise high.
Disclosure of Invention
The present invention has been made in view of the above problems, and its object is to provide an order risk control method based on an anomaly detection algorithm, together with a corresponding order risk control apparatus, computing device and computer storage medium, which overcome or at least partially solve the above problems.
According to one aspect of the present invention, there is provided an order risk control method based on an anomaly detection algorithm, the method comprising:
acquiring a historical order data set, and performing data preprocessing on the historical order data set to obtain a sample data set;
for each sample data in the sample data set, calculating a plurality of residual data corresponding to the sample data according to a plurality of moving average indexes;
training a model using the plurality of residual data corresponding to the plurality of sample data to obtain a target anomaly detection model;
and calculating a plurality of residual data corresponding to the real-time data according to the plurality of moving average indexes, and inputting the plurality of residual data corresponding to the real-time data into the target anomaly detection model for risk prediction to obtain a corresponding risk prediction result.
In the above scheme, performing data preprocessing on the historical order data set to obtain a sample data set further includes:
filling in the missing values in the historical order data set, and standardizing and normalizing each historical order data in the historical order data set to obtain a sample data set.
In the above scheme, filling in the missing values in the historical order data set further includes:
searching the historical order data set for historical order data with missing values;
and, for each historical order data with a missing value, calculating the average of a first preset number of historical order data without missing values that precede it in the historical order data set, taking that average as the fill value corresponding to the missing value, and filling the missing value in the historical order data with the fill value.
In the above aspect, for each sample data in the sample data set, calculating, according to a plurality of moving average indexes, a plurality of residual data corresponding to the sample data further includes:
calculating a plurality of first moving average data corresponding to the sample data according to a plurality of moving average indexes;
calculating a difference between the sample data and each of the first moving average data;
and obtaining a plurality of residual data corresponding to the sample data according to the difference value between the sample data and each first moving average data.
In the above solution, the performing model training by using a plurality of residual data corresponding to a plurality of sample data, and obtaining the target anomaly detection model further includes:
constructing an initial anomaly detection model by using an isolation forest algorithm;
dividing the sample dataset into a first subset and a second subset;
and training the initial anomaly detection model by utilizing a plurality of residual data corresponding to a plurality of sample data in the first subset to obtain a trained anomaly detection model.
In the above solution, the performing model training by using a plurality of residual data corresponding to a plurality of sample data, and obtaining the target anomaly detection model further includes:
evaluating the trained anomaly detection model by utilizing a plurality of residual data corresponding to a plurality of sample data in the second subset to obtain evaluation parameters;
judging whether the evaluation parameter exceeds a preset threshold value or not;
if yes, taking the trained anomaly detection model as the target anomaly detection model;
if not, performing model training again until the evaluation parameter exceeds the preset threshold, to obtain the target anomaly detection model.
In the above solution, calculating a plurality of residual data corresponding to the real-time data according to the plurality of moving average indexes further includes:
calculating a plurality of second moving average data corresponding to the real-time data according to a plurality of moving average indexes;
calculating a difference between the real-time data and each second moving average data;
and obtaining a plurality of residual data corresponding to the real-time data according to the difference value between the real-time data and each second moving average data.
According to another aspect of the present invention, there is provided an order risk control apparatus based on an anomaly detection algorithm, including: a preprocessing module, a calculation module, a training module and a risk prediction module; wherein,
the preprocessing module is used for acquiring a historical order data set, and performing data preprocessing on the historical order data set to obtain a sample data set;
the calculation module is used for calculating a plurality of residual data corresponding to each sample data in the sample data set according to a plurality of moving average indexes;
the training module is used for performing model training using a plurality of residual data corresponding to a plurality of sample data to obtain a target anomaly detection model;
the risk prediction module is used for calculating a plurality of residual data corresponding to the real-time data according to a plurality of moving average indexes, and inputting the plurality of residual data corresponding to the real-time data into the target anomaly detection model for risk prediction to obtain a corresponding risk prediction result.
According to yet another aspect of the present invention, there is provided a computing device comprising: the device comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete communication with each other through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to perform the operations corresponding to the order risk control method based on the anomaly detection algorithm.
According to still another aspect of the present invention, there is provided a computer storage medium having at least one executable instruction stored therein, the executable instruction causing a processor to perform operations corresponding to the order risk control method based on an anomaly detection algorithm as described above.
According to the technical solution provided by the invention, a historical order data set is acquired and preprocessed to obtain a sample data set; for each sample data in the sample data set, a plurality of residual data corresponding to the sample data are calculated according to a plurality of moving average indexes; a model is trained using the plurality of residual data corresponding to the plurality of sample data to obtain a target anomaly detection model; and a plurality of residual data corresponding to real-time data are calculated according to the plurality of moving average indexes and input into the target anomaly detection model for risk prediction to obtain a corresponding risk prediction result. This addresses the problems that, in the prior art, order data is processed too simply, the periodicity and the real-time changes of the data cannot both be taken into account, the false-alarm rate and the missed-alarm rate are high, and existing prediction-algorithm-based methods cannot adapt efficiently to the different data models of different ordering systems. In the technical solution provided by the invention, the order data is preprocessed and the residual data is calculated by moving averages, so that both the periodicity and the real-time changes of the data are taken into account, and the anomaly detection model trained on the processed data can be applied to anomaly detection for different data models. This greatly improves the generality of the detection model based on the anomaly detection algorithm and greatly reduces the training cost required to cover different data models.
The foregoing is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented in accordance with the specification, and so that the above and other objects, features and advantages of the invention are more readily apparent, specific embodiments of the invention are set forth below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1A illustrates a flow diagram of an order risk control method based on an anomaly detection algorithm, according to one embodiment of the present invention;
FIG. 1B illustrates a process diagram of an order risk control method based on an anomaly detection algorithm, according to one embodiment of the present invention;
FIG. 2 illustrates a training flow diagram of a target anomaly detection model according to one embodiment of the present invention;
FIG. 3 illustrates a block diagram of an order risk control device based on an anomaly detection algorithm, according to one embodiment of the present invention;
FIG. 4 illustrates a schematic diagram of a computing device, according to an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
FIG. 1A shows a flow diagram of an order risk control method based on an anomaly detection algorithm according to one embodiment of the present invention. As shown in FIG. 1A, the method comprises the following steps:
Step S101, acquiring a historical order data set, and performing data preprocessing on the historical order data set to obtain a sample data set.
Specifically, the missing values in the historical order data set are filled in, and each historical order data in the historical order data set is standardized and normalized to obtain a sample data set.
Filling in the missing values in the historical order data set further comprises: searching the historical order data set for historical order data with missing values; and, for each historical order data with a missing value, calculating the average of a first preset number of historical order data without missing values that precede it in the historical order data set, taking that average as the fill value corresponding to the missing value, and filling the missing value in the historical order data with the fill value.
Specifically, a missing value in the historical order data may appear as a blank, a NaN or another placeholder in the data. For each historical order data with a missing value, the average of the first preset number of normal historical order data preceding its time point is calculated and used in place of the missing value.
Specifically, standardizing and normalizing each historical order data in the historical order data set means unifying the orders of magnitude of the historical order data and eliminating the influence of differing dimensions (units) among them, so that the historical order data, the moving average indexes obtained from them later and the residual data are mutually comparable, which facilitates comprehensive comparison and evaluation.
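The filling rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function name `fill_missing`, the default window of 3, and the choice to raise an error when no earlier valid value exists are all hypothetical, since the text does not fix them.

```python
import math
from typing import List, Optional


def fill_missing(values: List[Optional[float]], window: int = 3) -> List[float]:
    """Fill each missing entry (None or NaN) with the mean of up to `window`
    preceding non-missing original values."""
    filled: List[float] = []
    valid_history: List[float] = []  # original non-missing values seen so far
    for v in values:
        missing = v is None or (isinstance(v, float) and math.isnan(v))
        if missing:
            recent = valid_history[-window:]
            if not recent:
                # Assumption: the text does not say what to do at the very start.
                raise ValueError("no earlier valid value to compute a fill value from")
            filled.append(sum(recent) / len(recent))
        else:
            filled.append(float(v))
            valid_history.append(float(v))
    return filled
```

Only original, non-missing values feed the average, so one filled value never influences the next fill value.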
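The text does not say which standardization or normalization formulas are used. As a hedged sketch, the two most common choices are shown here: z-score standardization and min-max normalization. Both function names and the handling of constant series are assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def standardize(values: Sequence[float]) -> List[float]:
    """Z-score standardization: subtract the mean, divide by the population std."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # assumption: a constant series maps to zeros
    return [(v - mu) / sigma for v in values]


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Min-max normalization: rescale the values linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # assumption: a constant series maps to zeros
    return [(v - lo) / (hi - lo) for v in values]
```

Either transform removes the unit and scale of the raw order counts, which is what makes residuals from different ordering systems comparable.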
Step S102, for each sample data in the sample data set, calculating a plurality of residual data corresponding to the sample data according to a plurality of moving average indexes.
Specifically, calculating a plurality of first moving average data corresponding to the sample data according to a plurality of moving average indexes; calculating a difference between the sample data and each of the first moving average data; and obtaining a plurality of residual data corresponding to the sample data according to the difference value between the sample data and each first moving average data.
Each moving average index corresponds to a sliding data selection window of a different length: the window selects the corresponding number of sample data, the average of the selected sample data is calculated, and each such average is one first moving average data.
The moving average index MA (moving average), also called the moving average line index, represents the trend characteristics of the data. Its calculation formula is generally:
MA=(C1+C2+C3+…+Cn)/n
where C1 to Cn are the data at each time point, and n is the number of data selected, also known as the moving average period.
Preferably, the moving average indexes may use, for example, periods of 5, 10 and 30 data, so as to represent the short-term, medium-term and long-term trends of the sample data.
Specifically, calculating a plurality of residual data corresponding to the sample data according to a plurality of moving average indexes means taking, for each moving average index, the absolute value of the residual between the sample data and the corresponding first moving average data (i.e., the absolute residual), which yields the plurality of residual data corresponding to each sample data;
the calculation formula of the residual data, i.e. the absolute value residual, is generally:
e=|y-MAn|
where e is the residual data, y is the sample data, and MAn is the first moving average data corresponding to the sample data under the moving average index with period n;
preferably, MAn is MA5, MA10 and MA30.
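To make the two formulas concrete, the sketch below computes the absolute residuals e=|y-MAn| for MA5, MA10 and MA30 over a series. It assumes that each window ends at, and includes, the current sample, and that samples with fewer than 30 points of history are skipped; the text does not specify either detail, and the function names are hypothetical.

```python
from typing import List, Sequence


def moving_average(series: Sequence[float], end: int, n: int) -> float:
    """MA = (C1 + ... + Cn) / n over the n points ending at index `end` (inclusive)."""
    window = series[end - n + 1 : end + 1]
    return sum(window) / n


def residual_features(
    series: Sequence[float], periods: Sequence[int] = (5, 10, 30)
) -> List[List[float]]:
    """For each sample with enough history, return [|y - MA5|, |y - MA10|, |y - MA30|]."""
    start = max(periods) - 1  # first index where the longest window is full
    rows: List[List[float]] = []
    for t in range(start, len(series)):
        rows.append([abs(series[t] - moving_average(series, t, n)) for n in periods])
    return rows
```

For example, a series of 29 zeros followed by a spike of 30 gives MA5=6, MA10=3 and MA30=1 at the last point, so its residual row is [24.0, 27.0, 29.0], while a constant series yields all-zero rows.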
Step S103, performing model training by using a plurality of residual data corresponding to the plurality of sample data to obtain a target anomaly detection model.
Specifically, each residual data e corresponding to the sample data in the sample data set obtained in step S102 is used as training data for an anomaly detection model, and the target anomaly detection model is obtained through training.
Step S104, calculating a plurality of residual data corresponding to the real-time data according to a plurality of moving average indexes, and inputting the plurality of residual data corresponding to the real-time data into the target anomaly detection model for risk prediction to obtain a corresponding risk prediction result.
Specifically, calculating a plurality of second moving average data corresponding to the real-time data according to a plurality of moving average indexes; calculating a difference between the real-time data and each second moving average data; and obtaining a plurality of residual data corresponding to the real-time data according to the difference value between the real-time data and each second moving average data.
Preferably, the second moving average data and the plurality of residual data corresponding to the real-time data are calculated in the same way as in step S102, which is not repeated here;
preferably, the plurality of residual data corresponding to the real-time data are input into the target anomaly detection model for risk prediction to obtain a corresponding risk prediction result. For example, a risk prediction result of 1 indicates that the data is normal, and a risk prediction result of -1 indicates that the data is abnormal. When the prediction result indicates an anomaly, the order carries a risk and corresponding measures can be taken.
Preferably, when the first moving average data are MA5, MA10 and MA30, the specific flow from model training to anomaly detection is shown in FIG. 1B, which illustrates how an anomaly detection model is trained based on moving average data and how risk prediction is performed using the trained target anomaly detection model.
According to the order risk control method based on the anomaly detection algorithm of this embodiment, a historical order data set is acquired and preprocessed to obtain a sample data set; for each sample data in the sample data set, a plurality of residual data corresponding to the sample data are calculated according to a plurality of moving average indexes; a model is trained using the plurality of residual data corresponding to the plurality of sample data to obtain a target anomaly detection model; and a plurality of residual data corresponding to real-time data are calculated according to the plurality of moving average indexes and input into the target anomaly detection model for risk prediction to obtain a corresponding risk prediction result. With this technical solution, the order data is preprocessed and the anomaly detection model is trained on the moving average indexes and the residual data, so that the resulting target anomaly detection model takes both the periodicity and the real-time changes of the data into account. The anomaly detection model trained on the processed data can be applied to anomaly detection for different data models, which greatly improves the generality of the detection model based on the anomaly detection algorithm and greatly reduces the training cost required to cover different data models.
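The 1 / -1 convention matches the output of scikit-learn's `IsolationForest.predict`. The sketch below assumes that library and trains on synthetic residual rows generated here purely for illustration; the data, the parameter values and the helper name `predict_risk` are hypothetical and are not taken from the embodiment.

```python
import random

from sklearn.ensemble import IsolationForest

# Synthetic stand-in for historical residual rows [e_MA5, e_MA10, e_MA30].
rng = random.Random(0)
train_rows = [[abs(rng.gauss(0.0, 1.0)) for _ in range(3)] for _ in range(500)]

# Assumed settings; random_state is fixed only to make the sketch repeatable.
model = IsolationForest(n_estimators=100, contamination=0.1, random_state=0)
model.fit(train_rows)


def predict_risk(fitted_model: IsolationForest, residual_row) -> str:
    """Map the model output (1 = normal, -1 = abnormal) to a readable label."""
    label = int(fitted_model.predict([residual_row])[0])
    return "abnormal" if label == -1 else "normal"
```

A real-time point whose residuals are far larger than anything seen in training, such as `[50.0, 50.0, 50.0]`, is isolated after very few splits and is therefore reported as abnormal.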
FIG. 2 shows a schematic diagram of the training process of the target anomaly detection model according to one embodiment of the present invention. As shown in FIG. 2:
Step S201: constructing an initial anomaly detection model by using an isolation forest algorithm.
Specifically, this embodiment constructs the initial anomaly detection model using the isolation forest algorithm of the scikit-learn machine learning framework. Isolation forest is a fast ensemble-based anomaly detection method with linear time complexity and high accuracy, and it meets the requirements of big data processing.
Step S202: the sample dataset is divided into a first subset and a second subset.
Specifically, after the sample data set is divided into a first subset and a second subset, the first subset is used to train the anomaly detection model and the second subset is used to evaluate and test it;
preferably, the ratio of the first subset to the second subset may be 7:3.
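A 7:3 division can be sketched as below. The text does not say whether the split is random or chronological; this sketch assumes a chronological split, which keeps each subset contiguous in time so that moving averages can still be computed within it. The function name is hypothetical.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(samples: Sequence[T], train_ratio: float = 0.7) -> Tuple[List[T], List[T]]:
    """Split samples in order: the first `train_ratio` share for training, the rest for evaluation."""
    cut = round(len(samples) * train_ratio)
    return list(samples[:cut]), list(samples[cut:])
```

With ten samples, the first seven go to the first (training) subset and the last three to the second (evaluation) subset.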
Step S203: training the initial anomaly detection model by using a plurality of residual data corresponding to a plurality of sample data in the first subset to obtain a trained anomaly detection model.
Specifically, a plurality of residual data corresponding to the plurality of sample data in the first subset are determined according to the above manner, and are used as training data of the anomaly detection model.
Step S204: evaluating the trained anomaly detection model by using a plurality of residual data corresponding to the plurality of sample data in the second subset to obtain evaluation parameters.
Specifically, a plurality of residual data corresponding to the plurality of sample data in the second subset are determined in the manner described above and used as test data for the trained anomaly detection model;
the test data is predicted by the trained anomaly detection model, where a predicted value of 1 represents normal data and a predicted value of -1 represents abnormal data; the trained anomaly detection model is then evaluated according to the prediction results and the original test data.
Preferably, the model evaluation method is an F1 score evaluation method, and the calculation process is as follows:
Precision=TP/(TP+FP)
where Precision is the precision rate; TP (True Positive) is the number of samples of the current class that are correctly predicted as the current class; FP (False Positive) is the number of samples of other classes that are wrongly predicted as the current class.
Recall=TP/(TP+FN)
where Recall is the recall rate; TP (True Positive) is as defined above; FN (False Negative) is the number of samples of the current class that are wrongly predicted as other classes.
F1=2×Precision×Recall/(Precision+Recall)
where F1 is the F1 score used as the evaluation result, which can be regarded as the harmonic mean of the precision and the recall of the anomaly detection model, with a maximum value of 1 and a minimum value of 0; the closer F1 is to 1, the better the detection performance of the anomaly detection model.
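As a worked illustration of the F1 evaluation, the sketch below computes precision, recall and F1 from label lists that use the 1 / -1 convention. It assumes that the abnormal class (-1) is treated as the positive class and that true labels are available for the test data; the text does not state either point, and the function name is hypothetical.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = -1
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)
```

For true labels `[-1, -1, 1, 1, 1]` and predictions `[-1, 1, -1, 1, 1]` there is one true positive, one false positive and one false negative, so precision, recall and F1 are all 0.5.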
Step S205: judging whether the evaluation parameter exceeds a preset threshold.
Specifically, if yes, step S206 is executed; if not, step S207 is executed.
Preferably, the evaluation parameter may specifically be the F1 score obtained in step S204.
Step S206: taking the trained anomaly detection model as the target anomaly detection model.
Step S207: performing model training again until the evaluation parameter exceeds the preset threshold, to obtain the target anomaly detection model.
Preferably, the following hyperparameters can be tuned when model training is performed again:
n_estimators: int, the number of base estimators in the ensemble (i.e. the number of trees in the isolation forest); the default value is 100;
contamination: float in (0, 0.5), the contamination level of the data set, i.e. the expected proportion of outliers in the data set, with a default value of 0.1; it is used during fitting to define the threshold of the decision function on the sample scores;
max_samples: the number of samples used to train each base estimator; if max_samples is greater than the number of samples, all base estimators (trees) are trained using all samples; the default value is "auto", in which case max_samples=min(256, n_samples);
max_features: the number of features used to train each base estimator (tree); the default value is 1; this embodiment uses a total of 3 moving average indexes, MA5, MA10 and MA30, and if the evaluation shows that the training effect of the anomaly detection model is poor, other moving average indexes can be selected to participate in retraining.
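The loop formed by steps S205 to S207 can be sketched generically: try candidate hyperparameter settings one after another and stop at the first whose F1 score exceeds the threshold. Everything named here is hypothetical, including `retrain_until_threshold`, the candidate values and the stub `fake_train_and_score`, which stands in for real training and evaluation and returns made-up scores.

```python
from typing import Any, Callable, Dict, Iterable, Optional, Tuple

Params = Dict[str, Any]


def retrain_until_threshold(
    candidates: Iterable[Params],
    train_and_score: Callable[[Params], Tuple[Any, float]],
    threshold: float,
) -> Tuple[Optional[Params], Any, float, bool]:
    """Return (params, model, f1, passed); fall back to the best attempt if none passes."""
    best_params, best_model, best_f1 = None, None, -1.0
    for params in candidates:
        model, f1 = train_and_score(params)  # one retraining round plus evaluation
        if f1 > best_f1:
            best_params, best_model, best_f1 = params, model, f1
        if f1 > threshold:
            return params, model, f1, True
    return best_params, best_model, best_f1, False


def fake_train_and_score(params: Params) -> Tuple[str, float]:
    """Stub with invented F1 scores, used only to exercise the loop."""
    scores = {100: 0.62, 200: 0.85}
    return "model-%d" % params["n_estimators"], scores[params["n_estimators"]]


candidates = [
    {"n_estimators": 100, "contamination": 0.1, "max_samples": "auto", "max_features": 1.0},
    {"n_estimators": 200, "contamination": 0.05, "max_samples": "auto", "max_features": 1.0},
]
result = retrain_until_threshold(candidates, fake_train_and_score, threshold=0.8)
```

Because the candidate list is finite, the sketch reports the best attempt with `passed=False` instead of looping forever when no setting clears the threshold; the text itself only says that training is repeated until the threshold is exceeded.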
FIG. 3 shows a block diagram of an order risk control device based on an anomaly detection algorithm, according to one embodiment of the present invention. As shown in FIG. 3, the device comprises: a preprocessing module 301, a calculation module 302, a training module 303, and a risk prediction module 304; wherein,
the preprocessing module 301 is configured to obtain a historical order data set, and perform data preprocessing on the historical order data set to obtain a sample data set.
Specifically, the preprocessing module 301 is further configured to: perform completion processing on the missing values in the historical order data set, and standardize and normalize each historical order data in the historical order data set to obtain the sample data set.
The preprocessing module 301 is further configured to: search the historical order data set for historical order data having missing values; and, for each historical order data having a missing value, calculate the average of a first preset number of historical order data without missing values that precede it in the historical order data set, take this average as the completion value corresponding to the missing value, and fill in the missing value in the historical order data with the completion value.
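The completion rule described above can be sketched as follows. This is a minimal illustration that assumes the order data is a one-dimensional series in time order with `None` marking a missing value; the function name `fill_missing` and the `window` parameter (standing in for the "first preset number") are hypothetical. How to handle a missing value with no preceding non-missing data is not stated in the embodiment, so the sketch leaves it as `None`.

```python
from typing import List, Optional, Sequence


def fill_missing(values: Sequence[Optional[float]], window: int = 3) -> List[Optional[float]]:
    """Replace each missing value with the mean of the preceding non-missing values."""
    filled: List[Optional[float]] = []
    history: List[float] = []  # original (non-missing) values seen so far
    for v in values:
        if v is None:
            recent = history[-window:]  # up to `window` preceding non-missing values
            # Assumption: if nothing precedes the gap, it is left unfilled.
            filled.append(sum(recent) / len(recent) if recent else None)
        else:
            history.append(v)
            filled.append(v)
    return filled
```

For example, with `window=2` the series `[10.0, 12.0, None, 14.0, None]` has its first gap filled with the mean of 10.0 and 12.0 and its second gap with the mean of 12.0 and 14.0.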
The calculation module 302 is configured to calculate, for each sample data in the sample data set, a plurality of residual data corresponding to the sample data according to a plurality of moving average indexes.
Specifically, the calculation module 302 is further configured to: calculate a plurality of first moving average data corresponding to the sample data according to the plurality of moving average indexes; calculate the difference between the sample data and each first moving average data; and obtain the plurality of residual data corresponding to the sample data from the differences between the sample data and each first moving average data.
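The residual calculation can be sketched as below. The sketch assumes a trailing moving average whose window includes the current sample, and it returns `None` where fewer than `window` samples are available; neither detail is fixed by the embodiment. The function names are hypothetical, and the default windows correspond to MA5, MA10, and MA30.

```python
from typing import Dict, List, Optional, Sequence


def moving_average(series: Sequence[float], window: int) -> List[Optional[float]]:
    """Trailing moving average; None until `window` samples are available."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, x in enumerate(series):
        running += x
        if i >= window:
            running -= series[i - window]  # drop the sample that left the window
        out.append(running / window if i >= window - 1 else None)
    return out


def residuals(series: Sequence[float],
              windows: Sequence[int] = (5, 10, 30)) -> Dict[str, List[Optional[float]]]:
    """Residual = sample value minus its moving average, one column per window."""
    result: Dict[str, List[Optional[float]]] = {}
    for w in windows:
        ma = moving_average(series, w)
        result[f"MA{w}"] = [None if m is None else x - m for x, m in zip(series, ma)]
    return result
```

Each sample therefore yields one residual per moving average index, and these residuals together form the feature vector of that sample.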
The training module 303 is configured to perform model training by using a plurality of residual data corresponding to a plurality of sample data, so as to obtain a target anomaly detection model.
Specifically, the training module 303 is further configured to: construct an initial anomaly detection model using an isolation forest algorithm; divide the sample data set into a first subset and a second subset; and train the initial anomaly detection model using the plurality of residual data corresponding to the sample data in the first subset to obtain a trained anomaly detection model.
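The division into a first (training) subset and a second (evaluation) subset can be sketched as follows. The embodiment does not state how the split is made, so the chronological cut and the `train_fraction` parameter are assumptions, as are the function names. Rows in which any residual is undefined are skipped, which is also an assumption.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def to_feature_rows(columns: Dict[str, Sequence[Optional[float]]]) -> List[List[float]]:
    """Turn {index name: residual column} into per-sample rows, skipping incomplete rows."""
    names = list(columns)  # keep the caller's column order
    rows: List[List[float]] = []
    for values in zip(*(columns[n] for n in names)):
        if all(v is not None for v in values):
            rows.append(list(values))
    return rows


def split_dataset(rows: Sequence[List[float]],
                  train_fraction: float = 0.8) -> Tuple[List[List[float]], List[List[float]]]:
    """Chronological split: earlier rows form the first subset, later rows the second."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return list(rows[:cut]), list(rows[cut:])
```

The first subset would then be passed to the isolation forest for fitting, and the second subset kept back for computing the evaluation parameter.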
The training module 303 is further configured to: evaluate the trained anomaly detection model using the plurality of residual data corresponding to the sample data in the second subset to obtain an evaluation parameter; judge whether the evaluation parameter exceeds a preset threshold; if yes, take the trained anomaly detection model as the target anomaly detection model; if not, perform model training again until the evaluation parameter exceeds the preset threshold, to obtain the target anomaly detection model.
The risk prediction module 304 is configured to calculate a plurality of residual data corresponding to real-time data according to a plurality of moving average indexes, and input the plurality of residual data corresponding to the real-time data to the target anomaly detection model for risk prediction, so as to obtain a corresponding risk prediction result.
Specifically, the risk prediction module 304 is further configured to: calculate a plurality of second moving average data corresponding to the real-time data according to the plurality of moving average indexes; calculate the difference between the real-time data and each second moving average data; and obtain the plurality of residual data corresponding to the real-time data from the differences between the real-time data and each second moving average data.
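For real-time data, the same residuals can be maintained incrementally. The sketch below keeps a bounded buffer of the most recent values and recomputes each moving average when a new value arrives. The class name `RealTimeResiduals` and the choice to include the newest value in its own moving average are assumptions that mirror a trailing-window calculation on the training data.

```python
from collections import deque
from typing import Deque, Dict, Optional, Sequence


class RealTimeResiduals:
    """Keeps recent values and returns one residual per moving average window."""

    def __init__(self, windows: Sequence[int] = (5, 10, 30)) -> None:
        self.windows = tuple(windows)
        # Only the largest window's worth of history is ever needed.
        self.buffer: Deque[float] = deque(maxlen=max(self.windows))

    def update(self, value: float) -> Dict[str, Optional[float]]:
        """Add a new real-time value and return its residuals (None if too little history)."""
        self.buffer.append(value)
        recent = list(self.buffer)
        out: Dict[str, Optional[float]] = {}
        for w in self.windows:
            if len(recent) >= w:
                out[f"MA{w}"] = value - sum(recent[-w:]) / w
            else:
                out[f"MA{w}"] = None
        return out
```

The dictionary returned by `update` would be converted to a feature row and passed to the target anomaly detection model to obtain the risk prediction result.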
According to the order wind control device based on the anomaly detection algorithm provided by this embodiment, a historical order data set is obtained and preprocessed to obtain a sample data set; for each sample data in the sample data set, a plurality of residual data corresponding to the sample data are calculated according to a plurality of moving average indexes; model training is performed using the plurality of residual data corresponding to a plurality of sample data to obtain a target anomaly detection model; and a plurality of residual data corresponding to real-time data are calculated according to the plurality of moving average indexes and input into the target anomaly detection model for risk prediction to obtain a corresponding risk prediction result. With this technical scheme, the order data is preprocessed and the anomaly detection model is trained on residual data derived from the moving average indexes, so that the resulting target anomaly detection model takes into account both the periodicity and the real-time changes of the data. An anomaly detection model trained on data processed in this way can be applied to anomaly detection for different data models, which greatly improves the universality of the anomaly detection model and greatly reduces the training cost required for detecting different data models.
The invention also provides a non-volatile computer storage medium storing at least one executable instruction, wherein the executable instruction can perform the order wind control method based on the anomaly detection algorithm in any of the above method embodiments.
FIG. 4 shows a schematic structural diagram of a computing device according to an embodiment of the invention; the specific embodiments of the invention do not limit the specific implementation of the computing device.
As shown in FIG. 4, the computing device may include: a processor 402, a communication interface 404, a memory 406, and a communication bus 408.
Wherein:
The processor 402, the communication interface 404, and the memory 406 communicate with one another via the communication bus 408.
The communication interface 404 is used to communicate with network elements of other devices, such as clients or other servers.
Processor 402 is configured to execute program 410, and may specifically perform relevant steps in the order wind control method embodiment based on the anomaly detection algorithm.
In particular, the program 410 may include program code, and the program code includes computer operation instructions.
The processor 402 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The one or more processors included in the computing device may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 406 is used to store the program 410. The memory 406 may comprise high-speed RAM and may also include non-volatile memory, such as at least one disk memory.
The program 410 may be specifically configured to cause the processor 402 to perform the order wind control method based on the anomaly detection algorithm in any of the method embodiments described above. For the specific implementation of each step in the program 410, refer to the corresponding descriptions of the steps and units in the above method embodiments, which are not repeated here. It will be clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the device and modules described above may refer to the corresponding process descriptions in the foregoing method embodiments, which are likewise not repeated here.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The structure required to construct such a system is apparent from the description above. In addition, the present invention is not directed to any particular programming language. It will be appreciated that the teachings of the present invention described herein may be implemented in a variety of programming languages, and the above description of specific languages is provided to disclose the best mode of the present invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting the intention that: i.e., the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the apparatus of the embodiments may be adaptively changed and disposed in one or more apparatuses different from the embodiments. The modules or units or components of the embodiments may be combined into one module or unit or component and, furthermore, they may be divided into a plurality of sub-modules or sub-units or sub-components. Any combination of all features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be used in combination, except insofar as at least some of such features and/or processes or units are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
Various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that some or all of the functions of some or all of the components in accordance with embodiments of the present invention may be implemented in practice using a microprocessor or Digital Signal Processor (DSP). The present invention can also be implemented as an apparatus or device program (e.g., a computer program and a computer program product) for performing a portion or all of the methods described herein. Such a program embodying the present invention may be stored on a computer readable medium, or may have the form of one or more signals. Such signals may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order. These words may be interpreted as names.

Claims (10)

1. An order wind control method based on an anomaly detection algorithm, comprising:
acquiring a historical order data set, and performing data preprocessing on the historical order data set to obtain a sample data set;
for each sample data in the sample data set, calculating a plurality of residual data corresponding to the sample data according to a plurality of moving average indexes;
performing model training using the plurality of residual data corresponding to a plurality of sample data to obtain a target anomaly detection model;
and calculating a plurality of residual data corresponding to the real-time data according to the plurality of moving average indexes, and inputting the plurality of residual data corresponding to the real-time data into the target anomaly detection model for risk prediction to obtain a corresponding risk prediction result.
2. The method of claim 1, wherein performing data preprocessing on the historical order data set to obtain a sample data set further comprises:
performing completion processing on the missing values in the historical order data set, and standardizing and normalizing each historical order data in the historical order data set to obtain the sample data set.
3. The method of claim 2, wherein performing completion processing on the missing values in the historical order data set further comprises:
searching the historical order data set for historical order data having missing values;
and, for each historical order data having a missing value, calculating the average of a first preset number of historical order data without missing values that precede it in the historical order data set, taking the average as the completion value corresponding to the missing value, and filling in the missing value in the historical order data with the completion value.
4. The method of claim 1, wherein calculating, for each sample data in the sample data set, a plurality of residual data corresponding to the sample data according to a plurality of moving average indexes further comprises:
calculating a plurality of first moving average data corresponding to the sample data according to a plurality of moving average indexes;
calculating a difference between the sample data and each of the first moving average data;
and obtaining a plurality of residual data corresponding to the sample data according to the difference value between the sample data and each first moving average data.
5. The method of claim 1, wherein performing model training using the plurality of residual data corresponding to a plurality of sample data to obtain a target anomaly detection model further comprises:
constructing an initial anomaly detection model using an isolation forest algorithm;
dividing the sample dataset into a first subset and a second subset;
and training the initial anomaly detection model by utilizing a plurality of residual data corresponding to a plurality of sample data in the first subset to obtain a trained anomaly detection model.
6. The method of claim 5, wherein performing model training using the plurality of residual data corresponding to a plurality of sample data to obtain a target anomaly detection model further comprises:
evaluating the trained anomaly detection model by utilizing a plurality of residual data corresponding to a plurality of sample data in the second subset to obtain evaluation parameters;
judging whether the evaluation parameter exceeds a preset threshold value or not;
if yes, taking the trained anomaly detection model as the target anomaly detection model;
if not, performing model training again until the evaluation parameter exceeds the preset threshold, to obtain the target anomaly detection model.
7. The method of any of claims 1-6, wherein calculating a plurality of residual data corresponding to real-time data according to a plurality of moving average indexes further comprises:
calculating a plurality of second moving average data corresponding to the real-time data according to a plurality of moving average indexes;
calculating a difference between the real-time data and each second moving average data;
and obtaining a plurality of residual data corresponding to the real-time data according to the difference value between the real-time data and each second moving average data.
8. An order wind control device based on an anomaly detection algorithm, comprising: a preprocessing module, a calculation module, a training module, and a risk prediction module; wherein:
the preprocessing module is configured to acquire a historical order data set and perform data preprocessing on the historical order data set to obtain a sample data set;
the calculation module is configured to calculate, for each sample data in the sample data set, a plurality of residual data corresponding to the sample data according to a plurality of moving average indexes;
the training module is configured to perform model training using the plurality of residual data corresponding to a plurality of sample data to obtain a target anomaly detection model;
the risk prediction module is configured to calculate a plurality of residual data corresponding to real-time data according to the plurality of moving average indexes, and to input the plurality of residual data corresponding to the real-time data into the target anomaly detection model for risk prediction to obtain a corresponding risk prediction result.
9. A computing device, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus;
the memory is configured to store at least one executable instruction, and the executable instruction causes the processor to perform operations corresponding to the order wind control method based on the anomaly detection algorithm according to any one of claims 1 to 7.
10. A computer storage medium having stored therein at least one executable instruction for causing a processor to perform operations corresponding to the order wind control method based on an anomaly detection algorithm as claimed in any one of claims 1 to 7.
CN202111239552.4A 2021-10-25 2021-10-25 Order wind control method and device based on anomaly detection algorithm and storage medium Pending CN116070897A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111239552.4A CN116070897A (en) 2021-10-25 2021-10-25 Order wind control method and device based on anomaly detection algorithm and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111239552.4A CN116070897A (en) 2021-10-25 2021-10-25 Order wind control method and device based on anomaly detection algorithm and storage medium

Publications (1)

Publication Number Publication Date
CN116070897A true CN116070897A (en) 2023-05-05

Family

ID=86168458

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111239552.4A Pending CN116070897A (en) 2021-10-25 2021-10-25 Order wind control method and device based on anomaly detection algorithm and storage medium

Country Status (1)

Country Link
CN (1) CN116070897A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination