CN111740991A - Anomaly detection method and system - Google Patents

Info

Publication number
CN111740991A
Authority
CN
China
Prior art keywords
model
label
abnormal
data
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010567982.8A
Other languages
Chinese (zh)
Other versions
CN111740991B (en
Inventor
张鹏飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inesa R&d Center
Original Assignee
Inesa R&d Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inesa R&d Center filed Critical Inesa R&d Center
Priority to CN202010567982.8A priority Critical patent/CN111740991B/en
Publication of CN111740991A publication Critical patent/CN111740991A/en
Application granted granted Critical
Publication of CN111740991B publication Critical patent/CN111740991B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425Traffic logging, e.g. anomaly detection
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14Network analysis or design
    • H04L41/145Network analysis or design involving simulating, designing, planning or modelling of a network
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14Network analysis or design
    • H04L41/147Network analysis or design for predicting network behaviour

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of information data processing, and in particular to an anomaly detection method in which an unsupervised model and a supervised model pseudo-label data for each other: starting from a small labeled set, the two models produce positive and negative label sets, and the process is iterated until the positive and negative label sets converge. An anomaly detection system is designed for the method. It comprises a data acquisition unit for acquiring a data set; a model prediction unit for training and fitting the unsupervised model and the supervised model and for predicting unlabeled data; a training set updating unit for integrating the predicted positive and negative sample sets and returning them to the data acquisition unit to update the data set; a judgment unit for judging whether the positive and negative sample sets have converged; and a detection unit for detecting anomalous points in a test set. By dividing the labeling work between the two models, the method improves labeling accuracy as well as indexes such as recall and precision, thereby addressing the low confidence and poor accuracy of anomaly detection when the number of labels is limited.

Description

Anomaly detection method and system
Technical Field
The invention relates to the technical field of information data processing, in particular to an anomaly detection method and an anomaly detection system.
Background
Anomaly detection, also known as outlier detection, refers to the task of finding data points that differ significantly from normal data.
Outliers usually account for a small share of the overall data, but they carry distinctive information compared with normal points. Anomaly detection can therefore often address important problems in the relevant field and lead to significant discoveries, for example in new disease monitoring, credit card fraud identification, network security attack detection, traffic anomaly detection, and planetary detection.
Detection methods include unsupervised, supervised and semi-supervised methods; which one is used is usually determined by how the training samples are labeled.
Unsupervised methods have the advantage of not needing data labels, but their performance is limited. Supervised learning is difficult to apply to monitoring tasks such as novel infectious diseases or unknown fault detection. Semi-supervised learning places low requirements on data labeling and can make full use of the information in unlabeled data to improve detection accuracy, but its effect is unstable when the number of labels is very small.
Therefore, where accurate and representative labels are difficult to obtain, a method that maximizes the accuracy of anomaly detection has important practical significance.
Disclosure of Invention
The invention overcomes these difficulties in the prior art and provides a detection method and system that can detect anomalous points stably and accurately when labeled data are extremely scarce.
In order to achieve the above object, the present invention provides an anomaly detection method, characterized in that: an anomaly detection request sent by a terminal device is received together with a small labeled set on which anomaly detection is to be performed; according to the condition of the small labeled set, the unsupervised model and the supervised model pseudo-label the data for each other to form a positive and negative label set; the mutual pseudo-labeling is then repeated on the positive and negative label set until it converges, yielding an anomaly result data set that has been detected and labeled.
Further, the positive and negative label sets are the sample set labeled "0" by unsupervised prediction and the sample set labeled "1" by supervised prediction.
Further, the specific steps of the mutual pseudo-labeling by the unsupervised model and the supervised model are as follows:
S1, setting the anomaly proportion parameter, taking the whole data set as the training set, and training the unsupervised model;
S2, performing unsupervised model prediction on the unlabeled data set U, labeling the normal samples "0", and denoting the resulting normal sample label set L0;
S3, when the number of labels meets the training requirement of the supervised model, using the small labeled data set L in the data set as the training set, improving the classification capability of the supervised model by increasing sample weights, and setting the class_weight parameter to 'balanced' to train the supervised model;
S4, performing supervised model prediction on the unlabeled data set U, labeling the abnormal samples "1", and denoting the resulting abnormal sample label set L1;
S5, putting L0 and L1, together called the positive and negative label set, into the training set, updating the labeled training set L to Li and the unlabeled training set U to Ui.
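To make the 'balanced' setting in S3 concrete, the following minimal sketch computes the class weights that scikit-learn-style estimators (including the scikit-learn wrapper of LightGBM) document for class_weight='balanced', namely n_samples / (n_classes * class_count). The helper name `balanced_class_weights` and the toy label list are illustrative assumptions and do not come from the patent.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Return {class: weight} with weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Hypothetical small labeled set L: 18 normal samples ("0"), 2 abnormal samples ("1").
labels = [0] * 18 + [1] * 2
weights = balanced_class_weights(labels)
print(weights)  # the rare abnormal class receives the larger weight
```

With this weighting, each abnormal sample counts nine times as much as a normal sample in the toy set, which is how the rare class is kept from being ignored during training.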
Further, the anomaly detection method further comprises test set anomaly point detection.
Further, the specific method for unsupervised model prediction in S2 is as follows: the trained unsupervised model predicts the unlabeled data set U; according to the set anomaly proportion parameter, samples whose anomaly score exceeds a certain threshold are judged to be abnormal and labeled "1", while the remaining samples are the normal samples predicted by the unsupervised model, are labeled "0", and are merged into the data set L0.
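The thresholding described for S2 can be sketched as follows. This is a simplified illustration, assuming that a higher score means "more anomalous" and that the threshold is chosen so that the configured proportion of samples is flagged; the function name `label_by_proportion` and the score values are hypothetical.

```python
def label_by_proportion(scores, anomaly_proportion):
    """Label the highest-scoring fraction of samples 1 (abnormal), the rest 0."""
    n_abnormal = int(round(anomaly_proportion * len(scores)))
    # Indices ordered from most to least anomalous.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    abnormal = set(ranked[:n_abnormal])
    return [1 if i in abnormal else 0 for i in range(len(scores))]

# Hypothetical anomaly scores for ten unlabeled samples.
scores = [0.41, 0.39, 0.77, 0.44, 0.40, 0.43, 0.38, 0.42, 0.81, 0.45]
labels = label_by_proportion(scores, anomaly_proportion=0.2)

# Only the samples labeled 0 are kept as pseudo-labeled normal samples (L0).
l0_indices = [i for i, y in enumerate(labels) if y == 0]
print(labels, l0_indices)
```

Only the "0" predictions are carried forward here, matching the text: the samples flagged as suspicious are left for the supervised model to judge.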
Further, the specific method for supervised model prediction in S4 is as follows: the trained supervised model predicts the unlabeled data set U. Because labeled samples are few, the supervised model is under-fitted, and the confusion matrix of its classification results shows high precision and low recall. Samples predicted to be abnormal therefore have high confidence; they are labeled "1" and merged into the L1 data set, whereas predictions of normal samples have lower confidence.
Further, the method for detecting anomalous points in the test set is as follows: once the unlabeled data have been fully utilized, the training set is L + U, the supervised model is trained on it, and test set prediction is performed.
The invention also provides an anomaly detection system, characterized by comprising: a data acquisition unit for acquiring a data set; a model prediction unit for training and fitting an unsupervised model and a supervised model and for predicting unlabeled data; a training set updating unit for integrating the predicted positive and negative sample sets and returning them to the data acquisition unit to update the data set; a judgment unit for judging whether the positive and negative sample sets have converged; and a detection unit for detecting anomalous points in a test set.
Furthermore, the system further comprises a request receiving module for receiving a request, sent by a terminal device, to perform anomaly detection on a data set to be predicted; and a control module for sending the anomaly detection result to the terminal device and controlling the terminal device to display the anomaly detection result.
The invention also provides a computer device, characterized by comprising a processor and a memory for storing processor-executable instructions, wherein the processor is configured to perform the following steps: receiving an anomaly detection request sent by a terminal device together with a small labeled set on which anomaly detection is to be performed; according to the condition of the small labeled set, having the unsupervised model and the supervised model pseudo-label the data for each other to form a positive and negative label set; and then repeating the mutual pseudo-labeling on the positive and negative label set until it converges, thereby obtaining an anomaly result data set that has been detected and labeled.
The invention also provides a computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions that, when executed by a processor, cause the processor to perform the following steps: receiving an anomaly detection request sent by a terminal device together with a small labeled set on which anomaly detection is to be performed; according to the condition of the small labeled set, having the unsupervised model and the supervised model pseudo-label the data for each other to form a positive and negative label set; and then repeating the mutual pseudo-labeling on the positive and negative label set until it converges, thereby obtaining an anomaly result data set that has been detected and labeled.
Compared with the prior art, the invention uses the unsupervised model and the supervised model to predict the unlabeled data separately, dividing the labeling work between them to improve labeling accuracy, and repeatedly retrains on the predicted positive and negative data sets to improve indexes such as recall and precision, thereby addressing the low confidence and poor precision of anomaly detection when the number of labels is limited.
Drawings
Fig. 1 is a schematic flow chart of an anomaly detection method according to an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of an anomaly detection system according to an embodiment of the present invention.
Fig. 3 is a line chart comparing an anomaly detection method with a conventional method according to an embodiment of the present invention.
Wherein, 1 is a data acquisition unit, 2 is a judgment unit, 3 is a model prediction unit, 4 is a training set updating unit, 5 is a detection unit, 3-1 is an unsupervised model, and 3-2 is a supervised model.
Detailed Description
The invention will be further described with reference to the accompanying drawings, but is not to be construed as being limited thereto.
Referring to Fig. 1, in an embodiment of the present invention the unsupervised model 3-1 is an isolation forest (IForest) model and the supervised model 3-2 is a LightGBM model; other existing, commonly used unsupervised models 3-1 and supervised models 3-2 may be selected according to the actual situation.
The confidence of newly added labels is improved by having the unsupervised model 3-1 (IForest) and the supervised model 3-2 (LightGBM) pseudo-label data for each other. First, the unsupervised IForest performs anomaly detection; after suspicious samples are excluded, the remaining normal samples are given the normal label ("0"). The supervised model 3-2 (LightGBM) then gives suspicious samples the abnormal label ("1"). After iterative training, indexes such as recall and precision are better than those of semi-supervised learning based on LightGBM self-training.
In this embodiment, after an anomaly detection request sent by a terminal device and a small labeled set to be detected are received, the specific anomaly detection flow comprises the following steps:
S1, training the unsupervised model 3-1. It is first assumed that anomalous points account for a small proportion of the whole and that their feature values differ significantly from those of normal points. Based on this assumption, and because the unsupervised model 3-1 requires no labels, the full data set is used to train the isolation forest IForest. An anomaly proportion parameter is set in advance; the closer it is to the proportion of anomalous points in the actual scenario, the better the model performs;
S2, performing unsupervised model 3-1 prediction on the unlabeled data set U. The fitted IForest model predicts the unlabeled data set (initially U); according to the anomaly proportion parameter, samples whose anomaly score exceeds a certain threshold are judged abnormal, and the remaining samples are the normal samples predicted by the unsupervised model 3-1 (label "0"), referred to as L0;
S3, when the number of labels meets the training requirement of the supervised model 3-2, the anomaly detection task can be treated as a supervised classification task, but two problems must be faced: sample labels are few, and the sample classes are imbalanced. Therefore, when the small labeled data set (initially L) is used to train the supervised model 3-2 (LightGBM), the classification capability of the supervised model is improved by increasing sample weights, and the class_weight parameter is set to 'balanced';
S4, performing supervised model 3-2 prediction on the unlabeled data set U. The trained LightGBM predicts the unlabeled data set (initially U). With few labeled samples the model is under-fitted, and the confusion matrix of its classification results shows high precision and low recall. Samples predicted to be abnormal (label "1") therefore have high confidence and form the L1 data set, whereas predictions of normal samples have lower confidence;
S5, putting L0 and L1, together called the positive and negative label set, into the training set, updating the labeled training set L to Li and the unlabeled training set U to Ui;
S6, judging whether the positive and negative sample sets have converged: if the number of samples newly added to the training set in this update does not exceed 10% of the labeled amount, convergence is judged and the next step is executed; alternatively, the state of the unlabeled data set Ui is checked, and if it is empty, convergence is judged and the next step is executed; otherwise, steps S2 to S5 are repeated and iterated until the sample sets converge;
S7, once the unlabeled data have been fully utilized, taking L + U as the training data, training a LightGBM model, and performing test set prediction.
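The control flow of steps S1 to S7 can be sketched in a few lines. This is only an illustration under stated assumptions: the two models are replaced by toy one-dimensional stand-ins (a median-distance scorer instead of IForest, a nearest-centroid classifier instead of LightGBM) so that the example runs without third-party libraries. All function names and data are hypothetical. The "10%" test is interpreted as "newly added samples do not exceed 10% of the labeled set before the update". Samples on which the two models disagree are simply left unlabeled, a choice the text does not specify.

```python
import statistics

def fit_unsupervised(values, anomaly_proportion):
    """Toy stand-in for IForest: anomaly score = distance from the median."""
    center = statistics.median(values)
    ranked = sorted((abs(v - center) for v in values), reverse=True)
    n_abnormal = max(1, round(anomaly_proportion * len(values)))
    threshold = ranked[n_abnormal - 1]
    # Returns 1 (suspicious) when the score reaches the threshold, else 0.
    return lambda v: 1 if abs(v - center) >= threshold else 0

def fit_supervised(labeled):
    """Toy stand-in for LightGBM: nearest class centroid (needs both classes)."""
    c0 = statistics.mean([v for v, y in labeled if y == 0])
    c1 = statistics.mean([v for v, y in labeled if y == 1])
    return lambda v: 1 if abs(v - c1) < abs(v - c0) else 0

def mutual_pseudo_label(data, labels, anomaly_proportion, max_rounds=20):
    """data: list of floats; labels: {index: 0 or 1} for the initial labeled set L."""
    labels = dict(labels)
    unlabeled = [i for i in range(len(data)) if i not in labels]
    unsup = fit_unsupervised(data, anomaly_proportion)                    # S1
    for _ in range(max_rounds):
        sup = fit_supervised([(data[i], y) for i, y in labels.items()])  # S3
        n_before = len(labels)
        added = {}
        for i in unlabeled:
            says_normal = unsup(data[i]) == 0                             # S2
            says_abnormal = sup(data[i]) == 1                             # S4
            if says_normal and not says_abnormal:
                added[i] = 0        # goes to L0
            elif says_abnormal and not says_normal:
                added[i] = 1        # goes to L1
            # Assumption: conflicting or undecided samples stay unlabeled.
        labels.update(added)                                              # S5
        unlabeled = [i for i in unlabeled if i not in added]
        if not unlabeled or len(added) <= 0.1 * n_before:                 # S6
            break
    return labels, unlabeled

# Hypothetical data: twelve normal readings near 10 and three anomalies near 15.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.3, 9.7, 10.0, 10.1, 9.9, 10.2, 9.8,
        15.0, 15.4, 14.8]
initial = {0: 0, 12: 1}  # one labeled normal sample, one labeled anomaly
final_labels, remaining = mutual_pseudo_label(data, initial, anomaly_proportion=0.2)

# S7: train the final supervised model on the enlarged labeled set.
final_model = fit_supervised([(data[i], y) for i, y in final_labels.items()])
print(sorted(final_labels.items()), remaining, final_model(15.2))
```

In a real setting the two stand-ins would be replaced by the IForest and LightGBM models named in the embodiment; the loop structure itself does not depend on which models are used.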
Specifically, the steps of IForest anomaly detection are as follows:
a) randomly selecting m sample points from the training data as subsamples, and putting the subsamples into root nodes of the tree;
b) randomly assigning a dimension (attribute), and randomly generating a cutting point p in the current node data, wherein the cutting point is generated between the maximum value and the minimum value of the assigned dimension in the current node data;
c) a hyperplane is generated by the cut point, and then the data space of the current node is divided into 2 subspaces: placing data smaller than p in the specified dimension on the left child of the current node, and placing data larger than or equal to p on the right child of the current node;
d) repeating steps b and c recursively in the child nodes, continuing to construct new child nodes until a child node contains only one data point (and cannot be cut further) or reaches the defined height limit;
e) after t trees are obtained, for a data point x, traversing each tree, determining the layer in which x finally falls in each tree, and computing the average height of x over the trees, namely its average path length (APL);
f) after obtaining the APL of each test data point, a threshold can be set, and test data whose APL is lower than the threshold are judged abnormal.
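Steps a) to f) can be sketched in plain Python as below. This is a simplified illustration, not a full isolation forest: it omits the leaf-size correction used in the published IForest algorithm, and the function names, parameter values and toy data are assumptions introduced here.

```python
import random

def build_tree(points, height, max_height, rng):
    """Steps b-d: split on a random dimension at a random cut point p."""
    if len(points) <= 1 or height >= max_height:
        return {"size": len(points)}          # leaf: cannot or must not cut further
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:                              # all values equal along this dimension
        return {"size": len(points)}
    cut = rng.uniform(lo, hi)                 # cut point between min and max
    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] >= cut]
    return {"dim": dim, "cut": cut,
            "left": build_tree(left, height + 1, max_height, rng),
            "right": build_tree(right, height + 1, max_height, rng)}

def build_forest(data, n_trees, subsample_size, max_height, seed=0):
    """Step a, repeated for each of the t trees: draw m points and grow a tree."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = rng.sample(data, min(subsample_size, len(data)))
        forest.append(build_tree(sample, 0, max_height, rng))
    return forest

def path_length(tree, x, depth=0):
    """Number of splits x passes through before it reaches a leaf."""
    if "dim" not in tree:
        return depth
    child = tree["left"] if x[tree["dim"]] < tree["cut"] else tree["right"]
    return path_length(child, x, depth + 1)

def average_path_length(forest, x):
    """Step e: average height of x over all trees (APL)."""
    return sum(path_length(t, x) for t in forest) / len(forest)

# Hypothetical data: a dense 2-D cluster plus one far-away point.
rng = random.Random(42)
cluster = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(64)]
outlier = (8.0, 8.0)
data = cluster + [outlier]

forest = build_forest(data, n_trees=100, subsample_size=32, max_height=8, seed=1)
apl_outlier = average_path_length(forest, outlier)
apl_inlier = average_path_length(forest, (0.0, 0.0))
print(apl_outlier, apl_inlier)  # step f: the lower APL marks the more anomalous point
```

The far-away point is separated from the rest after very few cuts, so its APL is shorter than that of a point inside the cluster; a threshold on the APL then yields the abnormal/normal decision of step f.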
Referring to Fig. 2, this embodiment also provides an anomaly detection system, which comprises a data acquisition unit 1 for acquiring a data set; a model prediction unit 3 for training and fitting an unsupervised model 3-1 and a supervised model 3-2 and for predicting unlabeled data; a training set updating unit 4 for integrating the predicted positive and negative sample sets and returning them to the data acquisition unit 1 to update the data set; a judgment unit 2 for judging whether the positive and negative sample sets have converged; and a detection unit 5 for detecting anomalous points in the test set.
Furthermore, the system further comprises a request receiving module for receiving a request, sent by a terminal device, to perform anomaly detection on a data set to be predicted; and a control module for sending the anomaly detection result to the terminal device and controlling the terminal device to display the anomaly detection result.
In one embodiment, a computer device is provided, comprising a processor and a memory for storing processor-executable instructions, wherein the processor is configured to perform the following steps: receiving an anomaly detection request sent by a terminal device together with a small labeled set on which anomaly detection is to be performed; according to the condition of the small labeled set, having the unsupervised model and the supervised model pseudo-label the data for each other to form a positive and negative label set; and then repeating the mutual pseudo-labeling on the positive and negative label set until it converges, thereby obtaining an anomaly result data set that has been detected and labeled.
In one embodiment, the specific anomaly detection flow steps are as follows:
S1, training the unsupervised model 3-1. It is first assumed that anomalous points account for a small proportion of the whole and that their feature values differ significantly from those of normal points. Based on this assumption, and because the unsupervised model 3-1 requires no labels, the full data set is used to train the isolation forest IForest. An anomaly proportion parameter is set in advance; the closer it is to the proportion of anomalous points in the actual scenario, the better the model performs;
S2, performing unsupervised model 3-1 prediction on the unlabeled data set U. The fitted IForest model predicts the unlabeled data set (initially U); according to the anomaly proportion parameter, samples whose anomaly score exceeds a certain threshold are judged abnormal, and the remaining samples are the normal samples predicted by the unsupervised model 3-1 (label "0"), referred to as L0;
S3, when the number of labels meets the training requirement of the supervised model 3-2, the anomaly detection task can be treated as a supervised classification task, but two problems must be faced: sample labels are few, and the sample classes are imbalanced. Therefore, when the small labeled data set (initially L) is used to train the supervised model 3-2 (LightGBM), the classification capability of the supervised model is improved by increasing sample weights, and the class_weight parameter is set to 'balanced';
S4, performing supervised model 3-2 prediction on the unlabeled data set U. The trained LightGBM predicts the unlabeled data set (initially U). With few labeled samples the model is under-fitted, and the confusion matrix of its classification results shows high precision and low recall. Samples predicted to be abnormal (label "1") therefore have high confidence and form the L1 data set, whereas predictions of normal samples have lower confidence;
S5, putting L0 and L1, together called the positive and negative label set, into the training set, updating the labeled training set L to Li and the unlabeled training set U to Ui;
S6, judging whether the positive and negative sample sets have converged: if the number of samples newly added to the training set in this update does not exceed 10% of the labeled amount, convergence is judged and the next step is executed; alternatively, the state of the unlabeled data set Ui is checked, and if it is empty, convergence is judged and the next step is executed; otherwise, steps S2 to S5 are repeated and iterated until the sample sets converge.
In one embodiment, the computer-executable instructions, when executed by the processor, further cause the processor to perform the following step: once the unlabeled data have been fully utilized, taking L + U as the training data and training a LightGBM model to predict the test set.
In one embodiment, the present invention also provides a computer-readable storage medium configured on a server, the computer-readable storage medium storing computer-executable instructions that, when executed by a processor, cause the processor to perform the following steps: receiving an anomaly detection request sent by a terminal device together with a small labeled set on which anomaly detection is to be performed; according to the condition of the small labeled set, having the unsupervised model and the supervised model pseudo-label the data for each other to form a positive and negative label set; and then repeating the mutual pseudo-labeling on the positive and negative label set until it converges, thereby obtaining an anomaly result data set that has been detected and labeled.
To illustrate that the anomaly detection effect of the invention is clearly superior to that of existing supervised and unsupervised models, a comparative anomaly detection test was carried out on river water quality data, as shown in Fig. 3.
In the graph, the horizontal axis x is the number of labeled anomalies as a percentage of the total number of samples; a smaller x means that fewer samples are available for training. The vertical axis y is the F1 value of the result. The F1 value is an index that jointly considers precision and recall and therefore better reflects the actual effect.
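For reference, the F1 value is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below computes it from confusion-matrix counts; the counts are hypothetical and chosen only to illustrate the "high precision, low recall" regime described for the under-fitted supervised model, and they are not results from the patent.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 8 anomalies found, 2 false alarms, 32 anomalies missed.
precision = 8 / (8 + 2)    # 0.8: most flagged samples really are anomalies
recall = 8 / (8 + 32)      # 0.2: most anomalies are missed
value = f1_score(tp=8, fp=2, fn=32)
print(precision, recall, value)
```

Because the harmonic mean is dominated by the smaller of the two quantities, a model with high precision but low recall still receives a low F1 value, which is why the index is a stricter summary than either quantity alone.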
As can be seen in the figure, line a represents the detection result of the unsupervised isolation forest model and line b the detection result of the supervised LightGBM model; both predict very poorly when the sample size is small. Lines c and d represent the prediction effect after fine adjustment by the method of the present invention, which achieves a good anomaly detection effect even with a very small amount of data.
The invention is a semi-supervised anomaly detection framework based on an unsupervised model and a supervised model. Data sets with different label proportions are generated by randomly removing labels; the unsupervised model 3-1 (IForest) and the supervised model 3-2 (LightGBM) each predict the unlabeled data, which improves the accuracy of pseudo-labeling. The experiment is repeated ten times for each label proportion, and the average of the classification performance indexes is taken. Compared with the traditional unsupervised model 3-1, the supervised model 3-2 and a self-training model, the method performs clearly better on the classification indexes for anomaly detection, and its performance remains more stable when the amount of labeled data is extremely small.
It should be noted that, those skilled in the art can understand that the unsupervised model and the supervised model in the above embodiment method can use not only the IForest model and the LightGBM model, but also any unsupervised model and supervised model, so as to achieve the purpose and effect of the present invention.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by hardware related to computer program instructions, and the program may be stored in a computer readable storage medium, for example, in the storage medium of a computer system, and executed by at least one processor in the computer system, so as to implement the processes of the embodiments including the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present invention, and the description is specific and detailed, but it should not be understood as the limitation of the scope of the present invention, it should be noted that, for those skilled in the art, many variations and modifications can be made without departing from the concept of the present invention, and these all fall into the protection scope of the present invention.

Claims (10)

1. An abnormality detection method characterized by: the specific abnormality detection method is as follows: receiving an abnormal detection requirement sent by terminal equipment and a small amount of label sets to be subjected to abnormal detection, performing mutual false label printing processing on the unsupervised model (3-1) and the supervised model (3-2) on the small amount of label sets according to the condition of the small amount of label sets to form a positive label set and a negative label set, then performing mutual false label printing processing on the unsupervised model (3-1) and the supervised model (3-2) on the positive label set and the negative label set until the positive label set and the negative label set appear to be converged, and obtaining an abnormal result data set after detection and marking;
the positive and negative label sets are a sample set marked as '0' after unsupervised prediction and a sample set marked as '1' after supervised prediction.
2. The anomaly detection method according to claim 1, characterized in that the method comprises the following specific steps:
S1, setting an outlier proportion parameter, taking the entire data set as the training set, and training the unsupervised model (3-1);
S2, predicting the unlabeled data set U with the unsupervised model (3-1), labeling the normal samples '0', and denoting the normal-sample label set L0;
S3, when the number of labels meets the training requirement of the supervised model (3-2), taking the small labeled data set L in the data set as the training set, improving the classification ability of the supervised model (3-2) by increasing sample weights, and training the supervised model (3-2) with the class_weight parameter set to 'balanced';
S4, predicting the unlabeled data set U with the supervised model (3-2), labeling the abnormal samples '1', and denoting the abnormal-sample label set L1;
S5, adding L0 and L1, together called the positive and negative label set, to the training set, and updating the labeled training set L to Li and the unlabeled training set U to Ui.
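Step S3 raises the weight of the rarer class through a class_weight setting of 'balanced'. The claim does not name a library; the parameter name matches the scikit-learn convention, under which each class receives the weight n_samples / (n_classes * class_count). Assuming that convention, the weights can be computed with the short sketch below; the function name `balanced_class_weights` is hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} with weight = n_samples / (n_classes * class_count).

    Assumes the 'balanced' heuristic as used by scikit-learn; rarer classes
    receive proportionally larger weights.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}
```

With nine samples labeled '0' and one labeled '1', the abnormal class is weighted nine times as heavily as the normal class, so a single misclassified abnormal sample costs as much as nine misclassified normal samples during training.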
3. The anomaly detection method according to claim 1, characterized in that the method further comprises outlier detection on a test set.
4. The anomaly detection method according to claim 2, characterized in that the prediction by the unsupervised model (3-1) in S2 is performed as follows: the trained unsupervised model (3-1) predicts the unlabeled data set U; according to the set outlier proportion parameter, samples whose anomaly score exceeds a certain threshold are labeled '1'; the remaining samples, which the unsupervised model (3-1) predicts to be normal, are labeled '0' and added to the data set L0.
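The relationship between the outlier proportion parameter and the score threshold in claim 4 can be sketched as follows. The claim does not say how the threshold is derived, so this sketch assumes that the threshold is the score of the k-th highest-scoring sample, where k is the proportion times the number of samples, and that higher scores mean more abnormal. The function name `label_by_proportion` is hypothetical.

```python
def label_by_proportion(scores, proportion):
    """Label the top `proportion` share of anomaly scores '1', the rest '0'.

    Assumes higher score = more abnormal. Ties at the threshold are all
    labeled '1', so slightly more than the requested share may be flagged.
    """
    n_out = round(proportion * len(scores))
    if n_out == 0:
        return [0] * len(scores)
    threshold = sorted(scores, reverse=True)[n_out - 1]  # n_out-th largest score
    return [1 if s >= threshold else 0 for s in scores]


def normal_indices(labels):
    """Indices of samples labeled '0', i.e. the candidates for data set L0."""
    return [i for i, y in enumerate(labels) if y == 0]
```

Only the samples labeled '0' are carried forward into L0; the samples flagged '1' by the unsupervised model are not used as abnormal pseudo-labels, since in the claimed method those come from the supervised model.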
5. The anomaly detection method according to claim 2, characterized in that the prediction by the supervised model (3-2) in S4 is performed as follows: the trained supervised model (3-2) predicts the unlabeled data set U; because labeled samples are few, the supervised model (3-2) is under-fitted, and the confusion matrix of its classification results shows high precision and low recall; the samples predicted abnormal therefore have high confidence, are labeled '1', and are added to the data set L1, while the samples predicted normal have lower confidence.
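The reasoning in claim 5 rests on the precision and recall read from a confusion matrix. The sketch below computes both for the abnormal class using the standard definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN); the function name `precision_recall` and the example labels in the note that follows are illustrative only and are not data from the patent.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall of the `positive` class from paired label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

For example, if four of ten samples are truly abnormal and an under-fitted model flags only one of them and nothing else, precision is 1.0 and recall is 0.25. Every sample it labels '1' is correct, which is why those labels go into L1, while its '0' predictions still hide three abnormal samples and are therefore less trustworthy.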
6. The anomaly detection method according to claim 3, characterized in that the outlier detection on the test set is performed as follows: with the unlabeled data fully utilized, the test set is L + U, the trained model is the supervised model (3-2), and prediction is performed on the test set.
7. An anomaly detection system, characterized in that the system comprises: a data acquisition unit (1) for acquiring a data set; a model prediction unit (3) for training and fitting the unsupervised model (3-1) and the supervised model (3-2) and predicting unlabeled data; a training set updating unit (4) for merging the predicted positive and negative sample sets and returning the updated data set to the data acquisition unit (1); a judgment unit (2) for judging whether the positive and negative sample set has converged; and a detection unit (5) for detecting outliers in a test set.
8. The detection system of claim 7, characterized in that the system further comprises: a request receiving module for receiving a request, sent by a terminal device, to perform anomaly detection on a data set to be predicted;
and a control module for sending the anomaly detection result to the terminal device and controlling the terminal device to display the anomaly detection result.
9. A computer device, characterized by comprising: a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method of any one of claims 1-6.
10. A computer-readable storage medium characterized by: the computer-readable storage medium has stored therein computer-executable instructions that, when executed by a processor, cause the processor to perform the steps of the method of any one of claims 1-6.
CN202010567982.8A 2020-06-19 2020-06-19 Anomaly detection method and system Active CN111740991B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010567982.8A CN111740991B (en) 2020-06-19 2020-06-19 Anomaly detection method and system

Publications (2)

Publication Number Publication Date
CN111740991A (en) 2020-10-02
CN111740991B (en) 2022-08-09

Family

ID=72651843

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010567982.8A Active CN111740991B (en) 2020-06-19 2020-06-19 Anomaly detection method and system

Country Status (1)

Country Link
CN (1) CN111740991B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant