CN116701130A - Dynamic baseline optimization method and device based on index portrait and electronic equipment - Google Patents

Dynamic baseline optimization method and device based on index portrait and electronic equipment

Info

Publication number
CN116701130A
Authority
CN
China
Prior art keywords
target
baseline
early
monitoring
index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310765907.6A
Other languages
Chinese (zh)
Inventor
刘东阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
CCB Finetech Co Ltd
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp, CCB Finetech Co Ltd filed Critical China Construction Bank Corp
Priority to CN202310765907.6A priority Critical patent/CN116701130A/en
Publication of CN116701130A publication Critical patent/CN116701130A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G06F11/327 Alarm or error message display
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a dynamic baseline optimization method and device based on an index portrait, an electronic device, and a storage medium, and relates to the field of computer technology. The method comprises: acquiring real-time data of a target index in a preset period; performing feature extraction on the acquired real-time data to obtain portrait features of the target index; determining a target baseline, calling the baseline admission risk model corresponding to the target baseline, taking the portrait features of the target index as input, predicting a matching result between the target index and the target baseline through the baseline admission risk model, and determining, according to the matching result, whether the target baseline monitors and issues early warnings for the target index. The baseline admission risk model is obtained by training a preset machine learning model with historical data of different indexes and the corresponding monitoring and early-warning results under the target baseline, wherein the historical data of the different indexes comprise the portrait features of the corresponding indexes. The application can effectively reduce false alarms and missed alarms of the baseline.

Description

Dynamic baseline optimization method and device based on index portrait and electronic equipment
Technical Field
The present application relates to the field of computer technology, and in particular, to a dynamic baseline optimization method based on an index portrait, a dynamic baseline optimization device based on an index portrait, an electronic device, and a computer readable storage medium.
Background
Baseline alarms are an important means of database monitoring. Currently, operation and maintenance personnel generally control the admission threshold of a dynamic baseline in two ways. First, when the transaction volume is low, the periodicity of the curve weakens and the accuracy of a dynamic baseline fitted to the transaction volume curve drops sharply, so operation and maintenance personnel use the transaction volume level as the admission threshold of the dynamic baseline. Second, transactions, machines, clusters, or systems may go offline, and the dynamic baseline is generally considered invalid for an object whose transaction volume is 0 over a period of time. However, these existing methods control the admission threshold of the dynamic baseline only coarsely, so false alarms or missed alarms are easily produced.
Disclosure of Invention
The application provides a dynamic baseline optimization method and device based on an index portrait, an electronic device, and a storage medium, which address the prior-art problem that baseline alarms are prone to false alarms and missed alarms.
In a first aspect of the present application, there is provided a dynamic baseline optimization method based on an index portrait, including:
acquiring real-time data of a target index in a preset period;
performing feature extraction on the acquired real-time data to obtain portrait features of the target index, wherein the portrait features of the target index comprise at least one of statistical features, time domain features, and business features of the target index in the preset period;
determining a target baseline, calling the baseline admission risk model corresponding to the target baseline, taking the portrait features of the target index as input, predicting a matching result between the target index and the target baseline through the baseline admission risk model, and determining, according to the matching result, whether the target baseline monitors and issues early warnings for the target index;
wherein the baseline admission risk model is obtained by training a preset machine learning model with historical data of different indexes and the corresponding monitoring and early-warning results under the target baseline, and the historical data of the different indexes comprise the portrait features of the corresponding indexes.
Optionally, determining, according to the matching result, whether the target baseline monitors and issues early warnings for the target index includes:
if the matching result indicates that the target index matches the target baseline, determining that the target index belongs to the control indexes of the target baseline, so that the target baseline monitors and issues early warnings for the target index;
if the matching result indicates that the target index does not match the target baseline, determining that the target index does not belong to the control indexes of the target baseline, so that the target baseline does not monitor or issue early warnings for the target index.
Optionally, the statistical features include at least:
one of the variance, standard deviation, outlier ratio, coefficient of variation, maximum, minimum, mode, median, arithmetic mean, quantiles, and interquartile range of the real-time data;
the time domain features include at least:
one of the autocorrelation, difference mean, difference absolute mean, difference median, difference absolute median, and difference absolute sum of the real-time data;
the business features include at least:
one of the trend type, periodicity, regular spikes, and periodic offset of the real-time data.
Optionally, the machine learning model is a random forest model, and the training step of the baseline admission risk model includes:
acquiring historical data of different indexes in different historical periods and monitoring and early warning results corresponding to the historical data under the target baseline, wherein the monitoring and early warning results at least comprise a first monitoring and early warning result and a second monitoring and early warning result, the first monitoring and early warning result indicates that the corresponding historical data is normal, and the second monitoring and early warning result indicates that the corresponding historical data is abnormal;
clustering historical data with the same monitoring and early warning result, constructing a corresponding sub-training sample data set, and obtaining a sub-training sample data set corresponding to the monitoring and early warning result, wherein all the obtained sub-training sample data sets are training sample data sets of the random forest model;
and training the random forest model with the obtained training sample data set as input and the matching result with the target baseline as output, to obtain the baseline admission risk model corresponding to the target baseline.
Optionally, the method further comprises:
acquiring early warning result feedback information of a second monitoring early warning result, wherein the early warning result feedback information comprises first early warning result feedback information indicating that the second monitoring early warning result is accurate early warning and second early warning result feedback information indicating that the second monitoring early warning result is false early warning;
clustering the historical data with the same monitoring and early-warning result and constructing a corresponding sub-training sample data set comprises:
constructing a first sub-training sample data set from the historical data whose monitoring and early-warning result is the first monitoring and early-warning result;
constructing a second sub-training sample data set from the historical data whose monitoring and early-warning result is the second monitoring and early-warning result and whose corresponding early-warning result feedback information is the first early-warning result feedback information;
and constructing a third sub-training sample data set from the historical data whose monitoring and early-warning result is the second monitoring and early-warning result and whose corresponding early-warning result feedback information is the second early-warning result feedback information.
Optionally, the first sub-training sample data set, the second sub-training sample data set, and the third sub-training sample data set contain the same number of historical data items.
Optionally, each historical period includes at least one cycle, each cycle includes the same plurality of time slots, and the data points in the historical data of the corresponding index correspond one to one to the time slots; before constructing the third sub-training sample data set, the method further comprises:
for historical data whose monitoring and early-warning result is the second monitoring and early-warning result and whose corresponding early-warning result feedback information is the second early-warning result feedback information:
determining all abnormal data points of the current historical data in the corresponding historical period;
if, for abnormal data points occupying the same time slot in different cycles, the ratio of their number to the number of cycles is larger than a preset threshold, updating the monitoring and early-warning result of the current historical data to the first monitoring and early-warning result, or discarding the current historical data.
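The relabeling rule above can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the function names, the representation of each cycle's abnormal points as a set of time-slot indices, the label strings, and the 0.5 threshold are all hypothetical.

```python
def recurring_slot_ratios(abnormal_slots_by_cycle):
    """Map each time slot to the fraction of cycles in which it was abnormal."""
    cycles = len(abnormal_slots_by_cycle)
    counts = {}
    for slots in abnormal_slots_by_cycle:
        for slot in set(slots):
            counts[slot] = counts.get(slot, 0) + 1
    return {slot: n / cycles for slot, n in counts.items()}


def relabel_false_alarm(abnormal_slots_by_cycle, threshold=0.5):
    """Relabel a false-alarm record as normal when its anomalies recur.

    An anomaly that shows up in the same time slot in most cycles looks like
    a regular spike rather than a fault, so the record is treated as normal.
    (The text also allows discarding such a record instead.)
    """
    ratios = recurring_slot_ratios(abnormal_slots_by_cycle)
    if any(ratio > threshold for ratio in ratios.values()):
        return "normal"
    return "false_alarm"
```

For instance, a record whose slot 3 is abnormal in three of four cycles has a ratio of 0.75 and would be relabeled, while a record whose anomalies fall in a different slot each cycle would stay a false alarm.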
In a second aspect of the present application, there is provided a dynamic baseline optimization device based on an index portrait, comprising:
a data acquisition module configured to acquire real-time data of the target index in a preset period;
an index portrait module configured to perform feature extraction on the acquired real-time data to obtain portrait features of the target index, wherein the portrait features of the target index comprise at least one of statistical features, time domain features, and business features of the target index in the preset period;
a baseline adjustment module configured to determine a target baseline, call the baseline admission risk model corresponding to the target baseline, take the portrait features of the target index as input, predict a matching result between the target index and the target baseline through the baseline admission risk model, and determine, according to the matching result, whether the target baseline monitors and issues early warnings for the target index;
the baseline admission risk model is obtained by training a preset machine learning model by using historical data of different indexes and corresponding monitoring and early warning results under the target baseline, wherein the historical data of the different indexes comprise portrait features of the corresponding indexes.
In a third aspect of the present application, there is provided an electronic apparatus comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes the computer-executable instructions stored in the memory to implement the methods described above.
In a fourth aspect of the present application, there is provided a computer-readable storage medium having stored therein computer-executable instructions for performing the method described above when executed by a processor.
In a fifth aspect of the application, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the method described above.
According to the application, features are extracted from different indexes to build a portrait of each index; a machine learning model is trained with the portraits of the different indexes and their early-warning results under the target baseline to obtain the baseline admission risk model of the target baseline; portrait features are then extracted for the target index, and the pre-trained baseline admission risk model predicts from them whether the target index is applicable to the current target baseline. The admission threshold of the target baseline is thereby adjusted dynamically, which can effectively reduce false alarms and missed alarms of the baseline.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
FIG. 1 is a method flow diagram of a dynamic baseline optimization method based on an index portrait according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a dynamic baseline optimization logic provided by an embodiment of the present application;
FIG. 3 is a schematic block diagram of a dynamic baseline optimization device based on an index portrait according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Specific embodiments of the present application have been shown by way of the above drawings and will be described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but rather to illustrate the inventive concepts to those skilled in the art by reference to the specific embodiments.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it should be understood that the detailed description described herein is merely for illustrating and explaining the embodiments of the present application, and is not intended to limit the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
It should be noted that, if directional indications (such as up, down, left, right, front, and rear … …) are included in the embodiments of the present application, the directional indications are merely used to explain the relative positional relationship, movement conditions, etc. between the components in a specific posture (as shown in the drawings), and if the specific posture is changed, the directional indications are correspondingly changed.
In addition, where the embodiments of the present application use descriptions such as "first" and "second", these are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. Furthermore, the technical solutions of the embodiments may be combined with each other, provided that the combination can be realized by those skilled in the art; when a combination of technical solutions is contradictory or cannot be realized, the combination should be considered not to exist and not to fall within the scope of protection claimed in the present application.
As shown in fig. 1 and 2, a first aspect of the present application provides a dynamic baseline optimization method based on an index portrait, including:
S100, acquiring real-time data of a target index in a preset period;
S200, performing feature extraction on the acquired real-time data to obtain portrait features of the target index, wherein the portrait features of the target index comprise at least one of statistical features, time domain features, and business features of the target index in the preset period;
S300, determining a target baseline, calling the baseline admission risk model corresponding to the target baseline, taking the portrait features of the target index as input, predicting a matching result between the target index and the target baseline through the baseline admission risk model, and determining, according to the matching result, whether the target baseline monitors and issues early warnings for the target index;
wherein the baseline admission risk model is obtained by training a preset machine learning model with historical data of different indexes and the corresponding monitoring and early-warning results under the target baseline, and the historical data of the different indexes comprise the portrait features of the corresponding indexes.
In this way, the application extracts features from different indexes to build a portrait of each index, and trains a machine learning model with the portraits of the different indexes and their early-warning results under the target baseline to obtain the baseline admission risk model of the target baseline. Portrait features are then extracted for the target index, and the pre-trained baseline admission risk model predicts from them whether the target index is applicable to the current target baseline. The admission threshold of the target baseline is thereby adjusted dynamically, which can effectively reduce false alarms and missed alarms of the baseline.
It can be understood that in the technical scheme of the application, the processes of acquiring, collecting, storing, using, processing, transmitting, providing, disclosing and applying the data and the like all meet the requirements of related laws and regulations.
In the present application, the index may be a transaction volume index. A transaction volume index is a measure of the number of transactions, or the transaction traffic, handled by a system or application, and represents the total number of transactions completed in a given time. For example, transaction volume indexes may cover all transactions processed by the system, such as website accesses, order submissions, and payment requests, and the target index may be, but is not limited to, one of the transaction volume indexes. At present, when transaction volume indexes are monitored through a target baseline, some of them may not suit the baseline algorithm of that target baseline; monitoring such an index with the current target baseline may produce incorrect early-warning results or miss alarms. Therefore, to improve early-warning accuracy for transaction volume indexes, the application extracts in advance the portrait features of each transaction volume index from the historical data of different transaction volume indexes, wherein the portrait features comprise at least one of statistical features, time domain features, and business features. A preset machine learning model is then trained with the statistical, time domain, and business features of each transaction volume index as input and whether the index is applicable to the target baseline as output, yielding the baseline admission risk model of the target baseline.
It can be understood that, in the application, a separate baseline admission risk model can be trained in advance for each target baseline. When a transaction volume index is to be monitored through a target baseline, the baseline admission risk model corresponding to that target baseline is called with the portrait features of the target index to predict whether the target index is applicable to the target baseline, and the admission threshold of the target baseline is adjusted dynamically according to the prediction result. The admission threshold of the target baseline indicates the range of transaction volume indexes that the target baseline monitors: if the prediction result indicates that the target index is not applicable to the target baseline, the target baseline does not monitor or issue early warnings for it; otherwise, if the prediction result indicates that the target index is applicable to the target baseline, the target index is configured to be monitored by the target baseline.
Thus, by targeting the target index
In step S200, optionally, the statistical features include at least one of the variance, standard deviation, outlier ratio, coefficient of variation, maximum, minimum, mode, median, arithmetic mean, quantiles, and interquartile range of the real-time data; the time domain features include at least one of the autocorrelation, difference mean, difference absolute mean, difference median, difference absolute median, and difference absolute sum of the real-time data; the business features include at least one of the trend type, periodicity, regular spikes, and periodic offset of the real-time data. It can be understood that, in the present application, the real-time data of the target index in the preset period can be obtained through a time window of preset length; that is, the acquired real-time data of the target index is a time series data set, and the statistical, time domain, and business features are extracted from this time series data set. The portrait features may be calculated with conventional methods, which are not limited here.
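As a rough illustration of the statistical and time domain features listed above, the following sketch computes several of them for one time window using only the Python standard library. The function name, the dictionary keys, the choice of lag 1 for the autocorrelation, and the 3-standard-deviation outlier rule are assumptions made for the example; the application does not prescribe them. The window is assumed to hold at least two points.

```python
import statistics


def portrait_features(series, outlier_k=3.0):
    """Return a few statistical and time-domain features of one time window."""
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    q1, _, q3 = statistics.quantiles(series, n=4)
    # First differences x[t+1] - x[t] drive the time-domain features.
    diffs = [b - a for a, b in zip(series, series[1:])]
    abs_diffs = [abs(d) for d in diffs]
    # Lag-1 autocorrelation: covariance of neighbours over total variation.
    total = sum((x - mean) ** 2 for x in series)
    lagged = sum((a - mean) * (b - mean) for a, b in zip(series, series[1:]))
    # Assumed outlier rule: points more than outlier_k std devs from the mean.
    outliers = sum(abs(x - mean) > outlier_k * std for x in series)
    return {
        "variance": statistics.pvariance(series),
        "std": std,
        "coefficient_of_variation": std / mean if mean else 0.0,
        "outlier_ratio": outliers / len(series),
        "max": max(series),
        "min": min(series),
        "median": statistics.median(series),
        "mean": mean,
        "iqr": q3 - q1,
        "lag1_autocorr": lagged / total if total else 0.0,
        "diff_mean": statistics.fmean(diffs),
        "diff_abs_mean": statistics.fmean(abs_diffs),
        "diff_median": statistics.median(diffs),
        "diff_abs_median": statistics.median(abs_diffs),
        "diff_abs_sum": sum(abs_diffs),
    }
```

The resulting dictionary can be flattened into a fixed-order feature vector before it is passed to a model.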
The trend type is the trend of the curve over a time period, such as rising or falling. It is judged by differencing the transaction volume and using the proportions of rising and falling differences, for example to determine whether the target index rises or falls within the preset period. Regular spikes can be extracted by identifying scheduled batch tasks; for example, increases in the transaction volume of the target index that recur at the same time point, within an allowable time offset, in different cycles of the preset period can be identified. The periodic offset refers to the phenomenon that the transaction volume curves of different cycles are shifted relative to one another in the time dimension; cycle-offset detection can reveal both the occurrence of this phenomenon and the size of the offset.
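Two of the business features described above can be sketched in a few lines. The text does not specify the algorithms, so everything here is an assumption for illustration: the function names, the 0.6 proportion used to call a trend, and the use of a circular shift with a sum of absolute differences to estimate the offset between two cycles.

```python
def trend_type(series, ratio=0.6):
    """Classify a window as rising, falling, or flat from its differences."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    if not diffs:
        return "flat"
    up = sum(d > 0 for d in diffs) / len(diffs)
    down = sum(d < 0 for d in diffs) / len(diffs)
    if up >= ratio:
        return "rising"
    if down >= ratio:
        return "falling"
    return "flat"


def cycle_offset(reference, other, max_shift):
    """Return the shift s that best aligns other[(i + s) % n] with reference[i].

    Both cycles must have the same length n. A positive result means the
    second cycle lags the first by that many time slots.
    """
    n = len(reference)

    def cost(shift):
        return sum(abs(reference[i] - other[(i + shift) % n]) for i in range(n))

    return min(range(-max_shift, max_shift + 1), key=cost)
```

A nonzero result from `cycle_offset` would indicate a periodic offset between the two cycles; a production detector would likely use a more robust similarity measure than this one.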
In step S300, determining, according to the matching result, whether the target baseline monitors and issues early warnings for the target index includes:
if the matching result indicates that the target index matches the target baseline, that is, the target index is applicable to the target baseline, determining that the target index belongs to the control indexes of the target baseline and marking it as a control index of the target baseline, so that the target baseline is configured to monitor and issue early warnings for the target index;
if the matching result indicates that the target index does not match the target baseline, determining that the target index does not belong to the control indexes of the target baseline and not marking it as a control index, so that the target baseline is configured not to monitor or issue early warnings for the target index.
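The admission decision described above amounts to maintaining, per target baseline, a set of control indexes and alarming only for members of that set. A minimal sketch, in which the function names, the set-based representation, and the example index names are hypothetical, and the matching result is assumed to arrive as a boolean from the admission risk model:

```python
def update_control_indexes(control_indexes, index_name, matches):
    """Return a new control-index set reflecting the model's matching result."""
    updated = set(control_indexes)
    if matches:
        updated.add(index_name)      # the baseline will monitor this index
    else:
        updated.discard(index_name)  # the baseline will ignore this index
    return updated


def should_alert(control_indexes, index_name, value, lower_baseline):
    """Alarm only for admitted indexes whose value falls below the baseline."""
    return index_name in control_indexes and value < lower_baseline
```

Usage might look like this: after `controls = update_control_indexes(set(), "order_submit", True)`, a call to `should_alert(controls, "order_submit", 40, 95.8)` returns `True`, while the same reading for an index that was not admitted returns `False`.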
In the application, the training of the baseline admission risk model comprises the following steps:
S400, acquiring historical data of different indexes in different historical periods and the monitoring and early-warning results corresponding to the historical data under the target baseline, wherein the monitoring and early-warning results comprise at least a first monitoring and early-warning result, indicating that the corresponding historical data is normal, and a second monitoring and early-warning result, indicating that the corresponding historical data is abnormal. It can be understood that a historical period may have the same length as the preset period; for each index, its historical data in a plurality of historical periods, and the monitoring and early-warning result corresponding to each item of historical data, are obtained through a preset time window.
For example, for different indexes, 4 weeks of transaction volume data can be taken and its gaps filled by interpolation, and the data divided into workdays and non-workdays. The historical transaction volume is averaged over a rolling window of N minutes to downsample it. The workday dynamic baseline is calculated from workday historical data: the historical workday transaction volumes at the same time of day are collected, abnormal values are removed, missing values are filled with the mean, a Gaussian is fitted, and the fitted mean minus 3 times the standard deviation is taken as the lower baseline for that time of day on workdays. The non-workday dynamic baseline is calculated in the same way, with non-workday data as the training data.
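The baseline calculation in this example can be sketched with the standard library alone. The text does not say how abnormal values are removed, so the 1.5 × IQR rule below is an assumption, as are the function names, the use of non-overlapping blocks for the N-minute averaging, and the use of `None` to mark missing points. The Gaussian fit is taken to be the maximum-likelihood fit, that is, the sample mean and the population standard deviation. Each time slot is assumed to have at least one observed value.

```python
import statistics


def downsample(values, n):
    """Average consecutive blocks of n points (the last block may be shorter)."""
    return [statistics.fmean(values[i:i + n]) for i in range(0, len(values), n)]


def lower_baseline(days, k=3.0):
    """Compute a per-slot lower baseline as mean minus k standard deviations.

    days: one list per day, all of equal length, with None for missing points.
    """
    baseline = []
    for slot_values in zip(*days):  # same time of day across all days
        present = [v for v in slot_values if v is not None]
        kept = present
        if len(present) >= 2:
            # Assumed outlier rule: keep values within 1.5 * IQR of the quartiles.
            q1, _, q3 = statistics.quantiles(present, n=4)
            spread = 1.5 * (q3 - q1)
            kept = [v for v in present if q1 - spread <= v <= q3 + spread]
        mu = statistics.fmean(kept)
        # Fill missing and removed points with the mean, then fit the Gaussian.
        filled = kept + [mu] * (len(slot_values) - len(kept))
        sigma = statistics.pstdev(filled)
        baseline.append(mu - k * sigma)
    return baseline
```

Workday and non-workday baselines would be obtained by calling `lower_baseline` separately on the two groups of days.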
The method of the application further comprises the following steps:
acquiring early-warning result feedback information for the second monitoring and early-warning result, wherein the early-warning result feedback information comprises first early-warning result feedback information, indicating that the second monitoring and early-warning result is an accurate warning, and second early-warning result feedback information, indicating that it is a false warning. It can be understood that while the transaction volume dynamic baseline is running, a system administrator can audit the monitoring and early-warning results issued by the dynamic baseline to judge whether each warning is genuine, add a result label to the monitoring and early-warning result, and store the transaction volume index data together with the monitoring and early-warning result. For ease of calculation, the length of the transaction volume index data may be set to the length of the preset period. The result label is either a real alarm or a false alarm, where a false alarm indicates that the transaction volume index is an object unsuited to the existing dynamic baseline.
S500, clustering historical data with the same monitoring and early warning result, constructing a corresponding sub-training sample data set, and obtaining the sub-training sample data set corresponding to the monitoring and early warning result, wherein all the obtained sub-training sample data sets are training sample data sets of a random forest model.
The method for clustering the historical data with the same monitoring and early warning result comprises the steps of: constructing a first sub-training sample data set by taking the monitoring early warning result as the historical data of the first monitoring early warning result, wherein the first sub-training sample data set comprises all normal indexes under a target baseline; a second sub-training sample data set is constructed by taking the monitoring early warning result as a second monitoring early warning result and taking early warning result feedback information corresponding to the second monitoring early warning result as historical data of the first early warning result feedback information, namely the second sub-training sample data set comprises all abnormal indexes of real warning under a target baseline; and constructing a third sub-training sample data set by taking the monitoring early-warning result as a second monitoring early-warning result and taking early-warning result feedback information corresponding to the second monitoring early-warning result as historical data of second early-warning result feedback information, wherein the third sub-training sample data set comprises all abnormal indexes of abnormal warning under a target baseline.
The first, second and third sub-training sample data sets contain the same number of historical data items, i.e. the ratio of historical data across the three sub-training sample data sets is 1:1:1.
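One way to build the three sub-training sample data sets in a 1:1:1 ratio is sketched below. The record fields `alerted` and `feedback`, the label strings, and the choice of random downsampling to the smallest group are hypothetical; the text only requires that the three sets end up the same size.

```python
import random


def build_subsets(records, seed=0):
    """Split records into (normal, real alarm, false alarm) sets of equal size.

    Each record is a dict with hypothetical keys:
      "alerted"  - True if the target baseline raised an early warning
      "feedback" - "accurate" or "false", the administrator's review result
    """
    normal = [r for r in records if not r["alerted"]]
    real_alarm = [r for r in records
                  if r["alerted"] and r["feedback"] == "accurate"]
    false_alarm = [r for r in records
                   if r["alerted"] and r["feedback"] == "false"]
    # Assumed balancing strategy: downsample every group to the smallest one.
    n = min(len(normal), len(real_alarm), len(false_alarm))
    rng = random.Random(seed)
    return tuple(rng.sample(group, n)
                 for group in (normal, real_alarm, false_alarm))
```

Downsampling discards data from the larger groups; oversampling the smaller groups would be an equally valid way to reach the same ratio.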
S600, training the random forest model with the obtained training sample data set as input and the matching result with the target baseline as output, to obtain a baseline admission risk model corresponding to the target baseline.
In the application, the machine learning model is a random forest model, where a random forest is a classifier that trains on and predicts samples using a plurality of trees. A random forest is made up of a number of CART (Classification And Regression Tree) trees. The training set used by each tree is sampled with replacement from the total training set, which means that a given sample in the total training set may appear several times in one tree's training set or not at all. When the nodes of each tree are trained, the features used are drawn randomly, without replacement, from all features in a certain proportion.
The training process of the random forest is as follows:
a. Given a training set S, a test set T, and a feature dimension F, determine the parameters: the number t of CART trees, the depth d of each tree, the number f of features used at each node, and the termination conditions, namely the minimum number of samples s on a node and the minimum information gain m on a node;
b. For the i-th tree, i = 1 to t: draw from S, with replacement, a training set S(i) of the same size as S, use it as the sample set of the root node, and start training from the root node;
c. If the current node meets a termination condition, set it as a leaf node. The prediction output of the leaf node is the class c(j) with the largest number of samples in the current node's sample set, and the probability p is the proportion of c(j) in the current sample set. Then continue training the other nodes. If the current node does not meet a termination condition, randomly select f features from the F features, use these f features to find the single feature k with the best classification effect (using the Gini value as the criterion) and its threshold th, assign the samples on the current node whose k-th feature is smaller than th to the left node, and assign the remaining samples to the right node;
d. repeating b, c until all nodes are trained or marked as leaf nodes;
e. repeating b, c, d until all CART are trained.
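Steps b and c above can be illustrated with a short sketch of bootstrap sampling and the per-node split search. It assumes the Gini impurity as the split criterion, numeric features, and candidate thresholds taken from the observed feature values; the function names are illustrative and not taken from the application.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def bootstrap(rows, labels, rng):
    """Step b: draw a training set of the same size as S, with replacement."""
    idx = [rng.randrange(len(rows)) for _ in rows]
    return [rows[i] for i in idx], [labels[i] for i in idx]


def best_split(rows, labels, f, rng):
    """Step c: among f randomly chosen features, find feature k and threshold th
    that minimise the weighted Gini impurity of the two child nodes.

    Returns (score, k, th), or None if no split separates the samples.
    """
    n_features = len(rows[0])
    best = None
    for k in rng.sample(range(n_features), f):  # features drawn without replacement
        for th in sorted({row[k] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[k] < th]
            right = [y for row, y in zip(rows, labels) if row[k] >= th]
            if not left or not right:
                continue
            score = (len(left) * gini(left)
                     + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, k, th)
    return best
```

A full tree would apply `best_split` recursively until the depth limit d or a termination condition (s or m) is reached; that recursion is omitted here for brevity.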
Following the above steps, the random forest model is trained with the historical data labeled as abnormal alarms, that is, the data in the third sub-training sample data set, as negative samples, and with the data suited to the target baseline, that is, the data in the first and second sub-training sample data sets, as positive samples, thereby obtaining a baseline admission risk model for judging whether a target index is suited to the target baseline.
The prediction process of the random forest is as follows:
a. For the i-th tree, i = 1 to t: starting from the root node of the current tree, compare the sample with the threshold th of the current node to decide whether to enter the left node (< th) or the right node (>= th), until a leaf node is reached, and output its predicted value;
b. Repeat step a until all t trees have output a predicted value. The final prediction is the class with the largest sum of predicted probabilities over all trees, that is, the p of each c(j) is accumulated across the trees.
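The prediction procedure can be sketched as below. The nested-dictionary tree representation (keys `k`, `th`, `left`, `right` for internal nodes and `label`, `p` for leaves) and the class names are assumptions made for illustration.

```python
from collections import defaultdict


def predict_tree(node, x):
    """Step a: walk from the root to a leaf and return (class, probability)."""
    while "label" not in node:
        # Left child when the k-th feature is below the threshold, else right.
        node = node["left"] if x[node["k"]] < node["th"] else node["right"]
    return node["label"], node["p"]


def predict_forest(trees, x):
    """Step b: accumulate p per class over all trees and return the largest."""
    totals = defaultdict(float)
    for tree in trees:
        label, p = predict_tree(tree, x)
        totals[label] += p
    return max(totals, key=totals.get)
```

Summing leaf probabilities, rather than counting one vote per tree, lets confident trees weigh more heavily in the final decision.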
In the present application, each historical period includes at least one period, each period includes a plurality of identical time slots, and the data points in the historical data of the corresponding index correspond one-to-one with the time slots. For example, one historical period includes m periods, and each period includes a plurality of identical time slots; since the historical data is a time-series data set composed of a plurality of data points, each time slot corresponds to one data point in the historical data.
At present, the calculation of a dynamic baseline usually includes a step of removing abnormal points. If sudden drops occur regularly in every period and these points are removed as abnormal points according to the existing method, the dynamic baseline will generate abnormal alarms at the regular sudden drops. To solve this problem, the method further comprises, before constructing the third sub-training sample data set:
for historical data whose monitoring and early-warning result is the second monitoring and early-warning result and whose corresponding early-warning result feedback information is the second early-warning result feedback information, that is, historical data labeled as an abnormal alarm:
determining all abnormal data points of the current historical data in the corresponding historical period; if the ratio of the number of abnormal data points with the same time slot in different periods to the number of periods is larger than a preset threshold, updating the monitoring and early-warning result of the current historical data to be a first monitoring and early-warning result, or discarding the current historical data.
Before removing an abnormal point, the method adds a check of the periodicity of the abnormal point. For example, m periods are selected and, for a given time slot of the index, the number n of periods in which an abnormal point occurs in that time slot is counted; if n/m is larger than a set threshold, such as 80%, the point is considered periodic, and the historical data in which such periodic abnormal points occur is classified as normal data. For example, taking 4 weeks of transaction amount data with one day as a period and each sampling time as a time slot, if the index shows an abnormal point at the same sampling time on more than 80% of the days, the abnormal point is judged to be normal data.
In this way, the cause of each alarm is analyzed while the training data set is constructed, so that alarms caused by an unsuitable calculation method or by defects in the algorithm can be addressed in a targeted manner, which optimizes the training data set and improves prediction accuracy.
As shown in fig. 3, in a second aspect of the present application, there is provided a dynamic baseline optimization apparatus based on an index portrait, including:
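The periodicity check on abnormal points can be sketched as follows. The boolean matrix layout, the function names and the label strings are hypothetical; the default threshold of 0.8 follows the 80% example in the text.

```python
def periodic_anomaly_slots(flags, threshold=0.8):
    """Return the time slots whose abnormal points recur periodically.

    flags[p][s] is True when slot s of period p holds an abnormal point.
    A slot is periodic when n / m exceeds the threshold, where n is the number
    of periods with an abnormal point in that slot and m is the period count.
    """
    m = len(flags)
    periodic = set()
    for s in range(len(flags[0])):
        n = sum(1 for period in flags if period[s])
        if n / m > threshold:
            periodic.add(s)
    return periodic


def relabel(flags, threshold=0.8):
    """Reclassify a false-alarm record as normal if it has a periodic slot."""
    if periodic_anomaly_slots(flags, threshold):
        return "normal"
    return "abnormal_alarm"
```

With one day as a period and 4 weeks of data, `flags` would have 28 rows, one per day, and one column per sampling time.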
the data acquisition module is configured to acquire real-time data of the target index in a preset period;
the index portrait module is configured to perform feature extraction on the acquired real-time data to obtain portrait features of the target index, wherein the portrait features of the target index at least comprise one of statistical features, time domain features and business features of the target index in a preset period;
the baseline adjustment module is configured to determine a target baseline, call a baseline admission risk model corresponding to the target baseline, take the portrait features of the target index as input, predict the matching result of the target index and the target baseline through the baseline admission risk model, and determine, according to the matching result, whether the target baseline monitors and provides early warning for the target index;
the baseline admission risk model is obtained by training a preset machine learning model by using historical data of different indexes and corresponding monitoring and early warning results under a target baseline, wherein the historical data of the different indexes comprise portrait features of the corresponding indexes.
The dynamic baseline optimization device based on the index portrait provided by the embodiment of the application can be used for executing the technical scheme of the dynamic baseline optimization method based on the index portrait in the embodiment, and the implementation principle and the technical effect are similar, and are not repeated here.
It should be understood that the division of the above apparatus into modules is merely a division of logical functions; in actual implementation the modules may be fully or partially integrated into one physical entity or may be physically separate. These modules may all be implemented as software called by a processing element, or all be implemented in hardware; alternatively, some modules may be implemented as software called by a processing element while others are implemented in hardware. For example, the data acquisition module may be a separately provided processing element, may be integrated in a chip of the above apparatus, or may be stored in a memory of the above apparatus in the form of program code, with its functions called and executed by a processing element of the above apparatus. The other modules are implemented similarly. In addition, all or some of the modules may be integrated together or implemented independently. The processing element here may be an integrated circuit with signal processing capability. In implementation, each step of the above method or each of the above modules may be completed by an integrated logic circuit of hardware in the processor element or by instructions in software form.
In a third aspect of the present application, there is provided an electronic apparatus comprising: a processor, a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes the computer-executable instructions stored in the memory to implement the method described above.
In a fourth aspect of the present application, a computer-readable storage medium is provided, in which computer-executable instructions are stored, which when executed by a processor are adapted to carry out the method described above.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 4, the electronic device may include: a transceiver 121, a processor 122, a memory 123.
Processor 122 executes the computer-executable instructions stored in the memory, causing processor 122 to perform the aspects of the embodiments described above. The processor 122 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
Memory 123 is coupled to processor 122 via a system bus, over which the two communicate with each other, and memory 123 is configured to store computer program instructions.
The transceiver 121 may be used to acquire a task to be run and configuration information of the task to be run.
The system bus may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The system bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, the bus is represented in the figure by a single bold line, but this does not mean that there is only one bus or only one type of bus. The transceiver is used to enable communication between the database access device and other computers (e.g., clients, read-write libraries, and read-only libraries). The memory may include random access memory (RAM) and may also include non-volatile memory.
The electronic device provided by the embodiment of the application can be the terminal device of the embodiment.
In a fifth aspect of the application, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the method described above.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (11)

1. A dynamic baseline optimization method based on an index portrait, comprising:
acquiring real-time data of a target index in a preset period;
extracting features of the obtained real-time data to obtain the portrait features of the target indexes, wherein the portrait features of the target indexes at least comprise one of statistical features, time domain features and business features of the target indexes in a preset period;
determining a target baseline, calling a baseline admission risk model corresponding to the target baseline, taking the portrait features of the target index as input, predicting a matching result of the target index and the target baseline through the baseline admission risk model, and determining, according to the matching result, whether the target baseline monitors and provides early warning for the target index;
the baseline admission risk model is obtained by training a preset machine learning model by using historical data of different indexes and corresponding monitoring and early warning results under the target baseline, wherein the historical data of the different indexes comprise portrait features of the corresponding indexes.
2. The dynamic baseline optimization method based on the index portrait according to claim 1, wherein determining whether the target baseline monitors and pre-warns the target index according to the matching result comprises:
if it is determined according to the matching result that the target index matches the target baseline, determining that the target index belongs to the control indexes of the target baseline, so that the target baseline monitors and provides early warning for the target index;
if it is determined according to the matching result that the target index does not match the target baseline, determining that the target index does not belong to the control indexes of the target baseline, so that the target baseline does not monitor or provide early warning for the target index.
3. The method for dynamic baseline optimization based on an index portrait according to claim 1, wherein said statistical features at least include:
one of variance, standard deviation, outlier ratio, coefficient of variation, maximum, minimum, maximum, mode, median, arithmetic mean, quantile, and quartile difference of the real-time data;
the time domain features include at least:
one of autocorrelation, differential average, differential absolute average, differential median, differential absolute median and differential absolute sum of the real-time data;
the service features at least comprise:
one of trend type, periodicity, regularity spike, and periodicity offset of the real-time data.
4. The method for dynamic baseline optimization based on an indicator portrait of claim 1, wherein the machine learning model is a random forest model, and the training step of the baseline admission risk model includes:
acquiring historical data of different indexes in different historical periods and monitoring and early warning results corresponding to the historical data under the target baseline, wherein the monitoring and early warning results at least comprise a first monitoring and early warning result and a second monitoring and early warning result, the first monitoring and early warning result indicates that the corresponding historical data is normal, and the second monitoring and early warning result indicates that the corresponding historical data is abnormal;
clustering historical data with the same monitoring and early warning result, constructing a corresponding sub-training sample data set, and obtaining a sub-training sample data set corresponding to the monitoring and early warning result, wherein all the obtained sub-training sample data sets are training sample data sets of the random forest model;
and training the random forest model by taking the obtained training sample data set as input and the matching result with the target baseline as output to obtain a baseline admittance risk model corresponding to the target baseline.
5. The method of dynamic baseline optimization based on an indicator representation according to claim 4, further comprising:
acquiring early warning result feedback information of a second monitoring early warning result, wherein the early warning result feedback information comprises first early warning result feedback information indicating that the second monitoring early warning result is accurate early warning and second early warning result feedback information indicating that the second monitoring early warning result is false early warning;
clustering the historical data with the same monitoring and early warning result to construct a corresponding sub-training sample data set, wherein the clustering comprises the following steps:
constructing a first sub-training sample data set from the historical data whose monitoring and early-warning result is the first monitoring and early-warning result;
constructing a second sub-training sample data set from the historical data whose monitoring and early-warning result is the second monitoring and early-warning result and whose corresponding early-warning result feedback information is the first early-warning result feedback information;
and constructing a third sub-training sample data set from the historical data whose monitoring and early-warning result is the second monitoring and early-warning result and whose corresponding early-warning result feedback information is the second early-warning result feedback information.
6. The method of dynamic baseline optimization based on an indicator representation according to claim 5, wherein the number of historical data in the first, second, and third sub-training sample data sets is the same.
7. The method of dynamic baseline optimization based on an indicator representation according to claim 5, wherein each historical period comprises at least one period, each period comprises a plurality of same time slots, and data points in the historical data of the corresponding indicator correspond to the time slots one by one; before constructing the third sub-training sample data set, the method further comprises:
for historical data whose monitoring and early-warning result is the second monitoring and early-warning result and whose corresponding early-warning result feedback information is the second early-warning result feedback information:
determining all abnormal data points of the current historical data in the corresponding historical period;
if the ratio of the number of abnormal data points with the same time slot in different periods to the number of periods is larger than a preset threshold, updating the monitoring and early-warning result of the current historical data to be a first monitoring and early-warning result, or discarding the current historical data.
8. A dynamic baseline optimization device based on an index portrait, comprising:
the data acquisition module is configured to acquire real-time data of the target index in a preset period;
the index portrait module is configured to perform feature extraction on the acquired real-time data to obtain portrait features of the target index, wherein the portrait features of the target index at least comprise one of statistical features, time domain features and business features of the target index in a preset period;
the baseline adjustment module is configured to determine a target baseline, call a baseline admittance risk model corresponding to the target baseline, take the portrait characteristic of the target index as input, predict a matching result of the target index and the target baseline through the baseline admittance risk model, and determine whether the target baseline monitors and early warns the target index according to the matching result;
the baseline admission risk model is obtained by training a preset machine learning model by using historical data of different indexes and corresponding monitoring and early warning results under the target baseline, wherein the historical data of the different indexes comprise portrait features of the corresponding indexes.
9. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored in the memory to implement the method of any one of claims 1-7.
10. A computer readable storage medium having stored therein computer executable instructions which when executed by a processor are adapted to carry out the method of any one of claims 1-7.
11. A computer program product comprising a computer program which, when executed by a processor, implements the method of any of claims 1-7.
CN202310765907.6A 2023-06-27 2023-06-27 Dynamic baseline optimization method and device based on index portrait and electronic equipment Pending CN116701130A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310765907.6A CN116701130A (en) 2023-06-27 2023-06-27 Dynamic baseline optimization method and device based on index portrait and electronic equipment

Publications (1)

Publication Number Publication Date
CN116701130A true CN116701130A (en) 2023-09-05

Family

ID=87833844

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310765907.6A Pending CN116701130A (en) 2023-06-27 2023-06-27 Dynamic baseline optimization method and device based on index portrait and electronic equipment

Country Status (1)

Country Link
CN (1) CN116701130A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116665135A (en) * 2023-07-28 2023-08-29 中国华能集团清洁能源技术研究院有限公司 Thermal runaway risk early warning method and device for battery pack of energy storage station and electronic equipment
CN117221008A (en) * 2023-11-07 2023-12-12 中孚信息股份有限公司 Multi-behavior baseline correction method, system, device and medium based on feedback mechanism
CN117221008B (en) * 2023-11-07 2024-02-23 中孚信息股份有限公司 Multi-behavior baseline correction method, system, device and medium based on feedback mechanism
CN117235434A (en) * 2023-11-15 2023-12-15 人工智能与数字经济广东省实验室(深圳) Forestry carbon sink project baseline construction method, system, terminal and medium
CN117235434B (en) * 2023-11-15 2024-03-19 人工智能与数字经济广东省实验室(深圳) Forestry carbon sink project baseline construction method, system, terminal and medium

Similar Documents

Publication Publication Date Title
CN116701130A (en) Dynamic baseline optimization method and device based on index portrait and electronic equipment
CN109213654B (en) Anomaly detection method and device
CN110191094B (en) Abnormal data monitoring method and device, storage medium and terminal
CN109598095B (en) Method and device for establishing scoring card model, computer equipment and storage medium
CN111984503A (en) Method and device for identifying abnormal data of monitoring index data
CN112258093A (en) Risk level data processing method and device, storage medium and electronic equipment
CN113299401B (en) Infectious disease data transmission monitoring method and device, computer equipment and medium
CN108537243B (en) Violation warning method and device
CN111813644B (en) Evaluation method and device for system performance, electronic equipment and computer readable medium
CN113822366A (en) Service index abnormality detection method and device, electronic equipment and storage medium
CN110781220A (en) Fault early warning method and device, storage medium and electronic equipment
CN114879613A (en) Industrial control system information security attack risk assessment method and system
CN110706016A (en) Method and device for detecting business abnormity and computer readable storage medium
CN111949496B (en) Data detection method and device
CN109784586B (en) Prediction method and system for danger emergence condition of vehicle danger
CN115099825A (en) Abnormal fund flow account identification method and device
CN114444570A (en) Fault detection method, device, electronic equipment and medium
CN110928859A (en) Model monitoring method and device, computer equipment and storage medium
CN109118043B (en) Online data quality monitoring method and device, server and storage medium
CN108764290B (en) Method and device for determining cause of model transaction and electronic equipment
CN114285612A (en) Method, system, device, equipment and medium for detecting abnormal data
CN113947076A (en) Policy data detection method and device, computer equipment and storage medium
CN116743637B (en) Abnormal flow detection method and device, electronic equipment and storage medium
CN114936614B (en) Operation risk identification method and system based on neural network
CN107085544B (en) System error positioning method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination