CN116361351A

CN116361351A - Data mining method for health management of industrial equipment

Info

Publication number: CN116361351A
Application number: CN202211531923.0A
Authority: CN
Inventors: 杨小强
Original assignee: Chongqing Creation Vocational College
Current assignee: Chongqing Creation Vocational College
Priority date: 2022-12-01
Filing date: 2022-12-01
Publication date: 2023-06-30
Anticipated expiration: 2042-12-01
Also published as: CN116361351B

Abstract

The present disclosure relates to a data mining method for industrial equipment health management, comprising: acquiring a monitoring data set of the industrial equipment, wherein the monitoring data set comprises a plurality of state monitoring data, and the plurality of state monitoring data comprise first monitoring data and second monitoring data; vectorizing each of the plurality of state monitoring data to obtain a state monitoring vector corresponding to each state monitoring data, wherein the state monitoring vectors comprise first monitoring vectors and second monitoring vectors; for each state monitoring vector, dividing the state monitoring vectors whose vector similarity meets a first preset requirement into the same category to obtain vector clusters of a plurality of categories, wherein a category is determined as a target category when the number and the proportion of the first monitoring vectors in the vector cluster of that category meet a second preset requirement; and determining new fault data based on the second monitoring data corresponding to the second monitoring vectors in the vector cluster of the target category.

Description

Data mining method for health management of industrial equipment

Technical Field

The present disclosure relates generally to the field of industrial equipment monitoring technology, and more particularly, to a data mining method for industrial equipment health management.

Background

At present, emerging technologies such as digital intelligence and the equipment Internet of Things continue to appear in the manufacturing industry, and transformation into intelligent factories has become an inevitable development direction for production lines. Industrial equipment in factories is often complex and highly automated; production equipment, inspection equipment, cutting tool equipment, and the like are high-end precision equipment whose machining and inspection precision is on the order of micrometers, and massive amounts of data are generated during production. In order to grasp the states of products and equipment in time and to intervene promptly and efficiently, big data analysis needs to be performed on the various monitoring data of the industrial equipment, especially the data generated when the industrial equipment fails, so as to realize health management analysis of the industrial equipment.

However, in the massive historical monitoring data of industrial equipment, data from normal operation usually account for the vast majority; that is, data generated when the equipment fails account for only a very small proportion of the whole. Monitoring equipment also produces a large number of false alarms and missed alarms. As a result, fault data of industrial equipment have not been well stored or accumulated, and the conditions for health management analysis of industrial equipment are not yet mature in terms of either basic data volume or data reliability.

In this case, when health management analysis is performed on industrial equipment in a factory, if approaches such as using open source data, cultivating senior data experts, or accumulating data over a short period are adopted to bring the amount of fault data up to the basic requirement for health management analysis, the acquired data are usually poorly coupled with the actual business and the cost investment increases greatly. Therefore, when the amount of fault data for industrial equipment within a factory is small, it is necessary to focus on sorting out and mining the fault data in an economical and efficient manner.

Disclosure of Invention

The data mining method for industrial equipment health management of the present disclosure can make good use of the confirmed fault data in a historical fault database of industrial equipment to mine new fault data, improves the economy and efficiency of mining and sorting out fault data, and provides reliable data support for subsequent equipment health management work.

In one general aspect, there is provided a data mining method for industrial equipment health management, comprising: acquiring a monitoring data set of industrial equipment, wherein the monitoring data set comprises a plurality of state monitoring data, the plurality of state monitoring data comprise first monitoring data and second monitoring data, the first monitoring data are confirmed fault data generated when the industrial equipment failed, and the second monitoring data are data yet to be determined; vectorizing each of the plurality of state monitoring data to obtain a state monitoring vector corresponding to each state monitoring data, wherein the state monitoring vectors comprise first monitoring vectors and second monitoring vectors, the first monitoring data correspond to the first monitoring vectors, and the second monitoring data correspond to the second monitoring vectors; for each state monitoring vector, dividing the state monitoring vectors whose vector similarity meets a first preset requirement into the same category to obtain vector clusters of a plurality of categories, wherein a category is determined as a target category when the number and the proportion of the first monitoring vectors in the vector cluster of that category meet a second preset requirement; and determining new fault data based on the second monitoring data corresponding to the second monitoring vectors in the vector cluster of the target category.

Optionally, dividing, for each state monitoring vector, the state monitoring vectors whose vector similarity meets the first preset requirement into the same category includes: for any one state monitoring vector, calculating the average vector similarity of the state monitoring vector with respect to the current vector cluster of each category; when the maximum of these average vector similarities is greater than a first threshold, assigning the state monitoring vector to the category corresponding to the maximum; and when the maximum is less than or equal to the first threshold, creating a new category and assigning the state monitoring vector to the newly created category.

Optionally, calculating, for any one state monitoring vector, the average vector similarity of the state monitoring vector with respect to the current vector cluster of each category includes: for the current vector cluster of any category, calculating the vector similarity between the state monitoring vector and each seed vector in the vector cluster of the category, wherein the seed vectors comprise first monitoring vectors and/or the first state monitoring vector to be assigned to each category; and taking the average of these vector similarities as the average vector similarity of the state monitoring vector with respect to the vector cluster of the category.

Optionally, determining a category as the target category when the number and the proportion of the first monitoring vectors in the vector cluster of that category meet the second preset requirement includes: determining the category as the target category when the number of first monitoring vectors in the vector cluster of the category is greater than a second threshold and their proportion in the vector cluster is greater than a third threshold.

Optionally, the plurality of state monitoring data further include third monitoring data, the third monitoring data being confirmed normal data generated when the industrial equipment operates normally, wherein the state monitoring vectors include third monitoring vectors, and the third monitoring data correspond to the third monitoring vectors.

Optionally, the determining new fault data based on the second monitoring data corresponding to the second monitoring vector in the vector cluster of the target class includes: obtaining a first vector set based on each first monitoring vector and each third monitoring vector, and obtaining a second vector set based on the second monitoring vector in the vector cluster of each target class; training a classification model by using the first vector set, and predicting each second monitoring vector in the second vector set by using the trained classification model to obtain a prediction result of each second monitoring vector in the second vector set; and determining a fourth monitoring vector from the second vector set based on a prediction result of each second monitoring vector in the second vector set, and determining second monitoring data corresponding to the fourth monitoring vector as new fault data.

Optionally, the first vector set further includes label information corresponding to each first monitoring vector and each third monitoring vector, the label information is used for indicating that the corresponding first monitoring data belongs to fault data or the corresponding third monitoring data belongs to normal data, and the classification model is used for predicting a prediction probability that the second monitoring data corresponding to each second monitoring vector in the second vector set belongs to fault data.

Optionally, the determining a fourth monitoring vector from the second vector set includes: and selecting a second monitoring vector with the prediction probability larger than a fourth threshold value from the second vector set, and determining the second monitoring vector with the prediction probability larger than the fourth threshold value as the fourth monitoring vector.

Optionally, training the binary classification model using the first vector set and predicting each second monitoring vector in the second vector set using the trained binary classification model to obtain a prediction result for each second monitoring vector in the second vector set includes: cross-training the binary classification model using the first vector set to obtain a plurality of trained binary classification models; predicting each second monitoring vector in the second vector set with each of the plurality of trained binary classification models to obtain a plurality of original prediction results for each second monitoring vector in the second vector set; and obtaining the prediction result of each second monitoring vector in the second vector set based on its plurality of original prediction results, wherein, for any one second monitoring vector in the second vector set, the average of its plurality of original prediction results is taken as its prediction result.

Optionally, the cross training the classification model using the first set of vectors includes: randomly dividing the first set of vectors into a first number of vector subsets; and training the classification model by utilizing a second number of vector subsets in the first number of vector subsets during each training, wherein the second number is smaller than the first number, and the second number of vector subsets used during any training are not identical to the second number of vector subsets used during other training.

According to the data mining method for industrial equipment health management of the present disclosure, state monitoring data with a certain degree of similarity are grouped into the same category, and the portion of the data to be determined that is similar to the confirmed fault data is screened out. In this way, the confirmed fault data in the historical fault database of the industrial equipment can be well utilized to mine new fault data, the economy and efficiency of mining and sorting out fault data are improved, and reliable data support is provided for subsequent equipment health management work. The data available for health management analysis of the industrial equipment thus become more sufficient and complete, so that more accurate analysis results can be obtained, equipment management can shift from preventive maintenance to leaner predictive maintenance, and both corrective maintenance after sudden faults and the over-maintenance associated with preventive maintenance are reduced.

Additional aspects and/or advantages of the present general inventive concept will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the general inventive concept.

Drawings

The foregoing and other objects and features of embodiments of the present disclosure will become more apparent from the following description taken in conjunction with the accompanying drawings in which the embodiments are shown, in which:

FIG. 1 is a flow chart illustrating a data mining method for industrial equipment health management according to an embodiment of the present disclosure;

FIG. 2 is a flowchart illustrating step S104 in FIG. 1 according to an embodiment of the present disclosure.

Detailed Description

The following detailed description is provided to assist the reader in obtaining a thorough understanding of the methods, apparatus, and/or systems described herein. However, various changes, modifications, and equivalents of the methods, apparatuses, and/or systems described herein will be apparent after an understanding of the disclosure of the present application. For example, the order of operations described herein is merely an example and is not limited to those set forth herein, but may be altered as will be apparent after an understanding of the disclosure of the present application, except for operations that must occur in a particular order. Furthermore, descriptions of features known in the art may be omitted for clarity and conciseness.

The features described herein may be embodied in different forms and should not be construed as limited to the examples described herein. Rather, the examples described herein have been provided to illustrate only some of the many possible ways to implement the methods, devices, and/or systems described herein, which will be apparent after an understanding of the present disclosure.

Unless defined otherwise, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs after understanding this disclosure. Unless explicitly so defined herein, terms (such as those defined in a general dictionary) should be construed to have meanings consistent with their meanings in the context of the relevant art and the present disclosure, and should not be interpreted in an idealized or overly formal sense.

In addition, in the description of the examples, when it is considered that a detailed description of well-known related structures or functions would make the explanation of the present disclosure ambiguous, such detailed description will be omitted.

A data mining method for industrial equipment health management according to an embodiment of the present disclosure will be described in detail with reference to FIGS. 1 and 2.

FIG. 1 is a flowchart illustrating a data mining method for industrial equipment health management according to an embodiment of the present disclosure.

Referring to FIG. 1, in step S101, a monitoring data set of industrial equipment may be acquired, wherein the monitoring data set includes a plurality of state monitoring data, the plurality of state monitoring data include first monitoring data and second monitoring data, the first monitoring data are confirmed fault data generated when the industrial equipment failed, and the second monitoring data are data yet to be determined. As an example, the industrial equipment may be large industrial equipment such as a numerically controlled machine tool, an engine, or a generator, or may be auxiliary equipment such as a heat sink, a servo motor, a self-priming pump, or a deep well pump, but the present disclosure is not limited thereto. Further, the monitoring data may be signal data collected by monitoring devices such as various meters, photoelectric switches, sensors, and ammeters, including, but not limited to, rotational speed signals, vibration signals, shaft load signals, shaft position signals, torque signals, pressure signals, temperature signals, and current signals; the specific signals may be determined by those skilled in the art according to the industrial equipment actually monitored in the factory, and the present disclosure is not limited thereto. Still further, any one state monitoring data may be a set of the various signals of the industrial equipment collected by the monitoring devices at the same moment, for example the rotational speed signal, vibration signal, shaft load signal, shaft position signal, torque signal, pressure signal, temperature signal, and current signal at that moment.

Next, in step S102, each of the plurality of state monitoring data may be vectorized to obtain a state monitoring vector corresponding to each state monitoring data, where the state monitoring vectors include first monitoring vectors and second monitoring vectors, the first monitoring data correspond to the first monitoring vectors, and the second monitoring data correspond to the second monitoring vectors. In one possible implementation, each signal in the state monitoring data may be normalized, and each normalized state monitoring data may then be arranged in vector form to obtain the corresponding state monitoring vector. In another possible implementation, a pre-trained BERT model may be used to vectorize each state monitoring data to obtain the corresponding state monitoring vector, so that the resulting vectorized data are reliable and subsequent processing is more accurate and efficient.
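The normalization-based implementation of step S102 can be sketched as follows. This is only an illustrative sketch: the source does not say which normalization is used, so min-max scaling is an assumption, and the signal names, the `SIGNALS` tuple, and the `vectorize` function are hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical signal order; the source only lists example signal types.
SIGNALS: Sequence[str] = ("rotational_speed", "vibration", "temperature", "current")


def vectorize(records: List[Dict[str, float]],
              signals: Sequence[str] = SIGNALS) -> List[List[float]]:
    """Min-max normalize each signal over all records (an assumed choice of
    normalization), then emit one state monitoring vector per record."""
    lows = {s: min(r[s] for r in records) for s in signals}
    highs = {s: max(r[s] for r in records) for s in signals}
    vectors = []
    for r in records:
        vec = []
        for s in signals:
            span = highs[s] - lows[s]
            # A constant signal has zero span; map it to 0.0 to avoid division by zero.
            vec.append((r[s] - lows[s]) / span if span else 0.0)
        vectors.append(vec)
    return vectors
```

Each record here stands for one state monitoring data, that is, the signals sampled at the same moment; the output vectors all have the same length and lie in the range 0 to 1 per component.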

Next, in step S103, for each state monitoring vector, the state monitoring vectors whose vector similarity meets the first preset requirement may be divided into the same category to obtain vector clusters of a plurality of categories, where a category is determined as a target category when the number and the proportion of the first monitoring vectors in the vector cluster of that category meet the second preset requirement. Specifically, a category may be determined as a target category when the number of first monitoring vectors in its vector cluster is greater than a second threshold and their proportion in the vector cluster is greater than a third threshold. The second threshold and the third threshold may be set by those skilled in the art according to actual circumstances; for example, the second threshold may be 10 and the third threshold may be 80%, but the present disclosure is not limited thereto.
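The target-category test described above can be illustrated with a short sketch. The function name `target_categories` and the representation of a cluster as a list of kind labels ("first", "second", "third") are assumptions made for illustration; the default thresholds are the example values 10 and 80% given in the text.

```python
from typing import Dict, List


def target_categories(clusters: Dict[int, List[str]],
                      second_threshold: int = 10,
                      third_threshold: float = 0.8) -> List[int]:
    """Return the ids of categories whose vector cluster contains more than
    `second_threshold` first monitoring vectors AND in which those vectors
    make up more than `third_threshold` of the cluster."""
    targets = []
    for category, kinds in clusters.items():
        n_first = kinds.count("first")
        proportion = n_first / len(kinds) if kinds else 0.0
        if n_first > second_threshold and proportion > third_threshold:
            targets.append(category)
    return targets
```

Both conditions are strict inequalities, matching the "greater than" wording of the text, so a cluster with exactly 10 first monitoring vectors would not qualify under the example thresholds.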

According to an embodiment of the present disclosure, for any one state monitoring vector, the average vector similarity of the state monitoring vector with respect to the current vector cluster of each category may be calculated; when the maximum of these average vector similarities is greater than a first threshold, the state monitoring vector is assigned to the category corresponding to the maximum; and when the maximum is less than or equal to the first threshold, a new category is created and the state monitoring vector is assigned to the newly created category. Here, the first threshold may be predetermined by a person skilled in the art or determined gradually through iterative testing, which is not limited by the present disclosure. Further, the similarity between two state monitoring vectors may be obtained by calculating a geometric distance between them, such as, but not limited to, a cosine distance, where the smaller the geometric distance, the higher the vector similarity. Alternatively, the similarity between a state monitoring vector to be classified and the state monitoring vectors of an existing category may be obtained by calculating the distance from the state monitoring vector to be classified to the hyperplane in which the state monitoring vectors of that category lie, where again the smaller the distance, the higher the vector similarity.

According to an embodiment of the present disclosure, for the current vector cluster of any one category, the vector similarity between the state monitoring vector and each seed vector in the vector cluster of the category may be calculated, where the seed vectors include first monitoring vectors and/or the first state monitoring vector to be assigned to each category; the average of these vector similarities may then be used as the average vector similarity of the state monitoring vector with respect to the vector cluster of the category. By using the first monitoring vectors, which correspond to confirmed fault data, as seed vectors and calculating the average vector similarity on that basis, the confirmed fault data in the historical fault database can be further utilized in the classification process, so that the classification result meets the requirements.
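The threshold-based category assignment with seed vectors can be sketched as below. This is a minimal sketch under stated assumptions, not the patented implementation: cosine similarity is used as the similarity measure (the text names cosine distance only as one option), seed vectors are taken to be the vector that opened a category plus any first monitoring vectors later assigned to it, and the names `cosine_similarity`, `cluster`, `seeds`, and `members` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster(vectors: List[Tuple[Vector, bool]], first_threshold: float) -> List[dict]:
    """Assign vectors one by one. Each input is (vector, is_first), where
    is_first marks a first monitoring vector (confirmed fault data)."""
    clusters: List[dict] = []
    for vec, is_first in vectors:
        best_idx, best_avg = -1, float("-inf")
        for idx, c in enumerate(clusters):
            # Average similarity to the seed vectors of this category.
            sims = [cosine_similarity(vec, seed) for seed in c["seeds"]]
            avg = sum(sims) / len(sims)
            if avg > best_avg:
                best_idx, best_avg = idx, avg
        if best_idx >= 0 and best_avg > first_threshold:
            c = clusters[best_idx]
            c["members"].append((vec, is_first))
            if is_first:
                c["seeds"].append(vec)  # confirmed fault vectors also act as seeds
        else:
            # Maximum average similarity <= threshold: open a new category,
            # seeded by the vector that created it.
            clusters.append({"seeds": [vec], "members": [(vec, is_first)]})
    return clusters
```

Because assignment is sequential, the result depends on the order in which vectors are processed; the source does not specify an order, so none is assumed here.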

Next, in step S104, new fault data may be determined based on the second monitoring data corresponding to the second monitoring vectors in the vector cluster of the target category. Step S104 in FIG. 1 according to an embodiment of the present disclosure is described below in conjunction with FIG. 2. Here, the plurality of state monitoring data described above may further include third monitoring data, which are confirmed normal data generated when the industrial equipment operates normally; accordingly, the state monitoring vectors may include third monitoring vectors, the third monitoring data corresponding to the third monitoring vectors.

Referring to FIG. 2, in step S201, a first vector set may be obtained based on each first monitoring vector and each third monitoring vector, and a second vector set may be obtained based on the second monitoring vectors in the vector cluster of each target category.

Next, in step S202, a binary classification model may be trained using the first vector set, and each second monitoring vector in the second vector set may be predicted using the trained binary classification model to obtain a prediction result for each second monitoring vector in the second vector set. Here, the binary classification model may include at least one of the following models: a random forest model, a support vector machine model, a Wide & Deep model, and the like, but the present disclosure is not limited thereto, and a person skilled in the art may train using an appropriate model according to actual circumstances.

According to an embodiment of the present disclosure, in addition to the first monitoring vectors and the third monitoring vectors, the first vector set may further include label information corresponding to each first monitoring vector and each third monitoring vector. The label information indicates that the corresponding first monitoring data are fault data or that the corresponding third monitoring data are normal data, and the binary classification model may be used to predict the probability that the second monitoring data corresponding to each second monitoring vector in the second vector set are fault data. By training the binary classification model on the first vector set containing the label information, the model can learn the distributions of the first monitoring vectors and the third monitoring vectors from their labels, so that its prediction result can accurately indicate whether the second monitoring data corresponding to a second monitoring vector are fault data.

According to an embodiment of the present disclosure, when the binary classification model is trained using the first vector set, it may be cross-trained on the first vector set to obtain a plurality of trained binary classification models, so that a more stable result is obtained from the plurality of models. Each second monitoring vector in the second vector set is then predicted with each of the trained binary classification models to obtain a plurality of original prediction results for that second monitoring vector. The prediction result of each second monitoring vector in the second vector set can then be obtained from its plurality of original prediction results; for example, for any one second monitoring vector in the second vector set, the average of its original prediction results may be used as its prediction result. This prevents the original prediction result of any single binary classification model from unduly influencing the final prediction result, which helps keep the prediction results stable and improves their reliability.

According to embodiments of the present disclosure, when the binary classification model is cross-trained with the first vector set, the first vector set may be randomly divided into a first number of vector subsets, and in each training the binary classification model is trained with a second number of vector subsets out of the first number of vector subsets, where the second number is smaller than the first number. For example, in the case of ten-fold cross training, the first number is 10 and the second number is 9, but the present disclosure is not limited thereto, and specific values of the first number and the second number may be set by those skilled in the art according to actual circumstances. Further, the second number of vector subsets used in any one training are not exactly the same as those used in any other training. In other words, some vector subsets are left out in each training, and the subsets left out differ from one training to the next, which to a certain extent prevents any particular vector subsets from unduly influencing the training results and makes the plurality of binary classification models reliable as a whole.
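The cross training and averaging described above can be sketched as follows. This is a hedged illustration: `cross_train`, `averaged_prediction`, and `train_centroid_model` are hypothetical names, each training leaves out exactly one subset (so the second number is the first number minus 1, as in the ten-fold example), and the distance-to-centroid scorer is a toy stand-in chosen to keep the sketch dependency-free; it is not one of the models named in the text (random forest, support vector machine, Wide & Deep).

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Sample = Tuple[Vector, int]        # (vector, label): 1 = fault, 0 = normal
Model = Callable[[Vector], float]  # returns a predicted fault probability


def train_centroid_model(train: List[Sample]) -> Model:
    """Toy stand-in classifier (an assumption): scores a vector by its
    relative distance to the fault and normal class centroids."""
    def centroid(label: int) -> List[float]:
        rows = [v for v, y in train if y == label]
        if not rows:
            raise ValueError("training subset lacks one of the two classes")
        return [sum(col) / len(rows) for col in zip(*rows)]

    c_fault, c_normal = centroid(1), centroid(0)

    def predict(vec: Vector) -> float:
        d_fault, d_normal = math.dist(vec, c_fault), math.dist(vec, c_normal)
        total = d_fault + d_normal
        return d_normal / total if total else 0.5

    return predict


def cross_train(samples: List[Sample],
                train_fn: Callable[[List[Sample]], Model],
                first_number: int = 10,
                seed: int = 0) -> List[Model]:
    """Randomly split the first vector set into `first_number` subsets and
    train one model per left-out subset."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    subsets = [shuffled[i::first_number] for i in range(first_number)]
    models = []
    for held_out in range(first_number):
        train = [s for i, sub in enumerate(subsets) if i != held_out for s in sub]
        models.append(train_fn(train))
    return models


def averaged_prediction(models: List[Model], vec: Vector) -> float:
    """Average the original prediction results of all trained models."""
    return sum(m(vec) for m in models) / len(models)
```

Since a different subset is left out each time, no two trainings use an identical collection of subsets, which is the property the text requires; any callable with the same shape as `train_centroid_model` could be passed as `train_fn`.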

Next, in step S203, a fourth monitoring vector may be determined from the second vector set based on the prediction result of each second monitoring vector in the second vector set, and the second monitoring data corresponding to the fourth monitoring vector may be determined as new fault data. Here, the second monitoring vectors whose prediction probability is greater than a fourth threshold may be selected from the second vector set and determined as fourth monitoring vectors. The fourth threshold may be determined by one skilled in the art according to actual circumstances, for example, 0.9 or 0.95, but the present disclosure is not limited thereto. Since the trained binary classification model has already learned the distribution of the first monitoring vectors, its prediction result for each second monitoring vector indicates whether the distribution of that second monitoring vector is the same as or similar to that of the first monitoring vectors, so that the second monitoring vectors whose distribution is the same as or similar to that of the first monitoring vectors can be determined as fourth monitoring vectors.
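The selection in step S203 amounts to a strict threshold filter over the prediction probabilities. A minimal sketch follows, assuming the predictions are held in a dictionary keyed by a record identifier; the function name `select_new_fault_ids` and the identifiers are hypothetical, and the default of 0.9 is one of the example values from the text.

```python
from typing import Dict, List


def select_new_fault_ids(predictions: Dict[str, float],
                         fourth_threshold: float = 0.9) -> List[str]:
    """Return the ids of second monitoring vectors whose predicted fault
    probability is strictly greater than the fourth threshold."""
    return [rid for rid, prob in predictions.items() if prob > fourth_threshold]
```

The second monitoring data behind the returned identifiers would then be treated as new fault data; a probability exactly equal to the threshold is not selected, following the "greater than" wording.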

The data mining method for industrial equipment health management according to embodiments of the present disclosure may be written as a computer program and stored on a computer-readable storage medium. The data mining method for industrial equipment health management as described above may be implemented when the computer program is executed by a processor. Examples of the computer-readable storage medium include: read-only memory (ROM), programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), random-access memory (RAM), dynamic random-access memory (DRAM), static random-access memory (SRAM), flash memory, nonvolatile memory, CD-ROM, CD-R, CD+R, CD-RW, CD+RW, DVD-ROM, DVD-R, DVD+R, DVD-RW, DVD+RW, DVD-RAM, BD-ROM, BD-R, BD-R LTH, BD-RE, Blu-ray or optical disk storage, hard disk drives (HDD), solid state disks (SSD), card-type memories (such as multimedia cards, Secure Digital (SD) cards, or Extreme Digital (XD) cards), magnetic tapes, floppy disks, magneto-optical data storage devices, and any other devices configured to store computer programs and any associated data, data files, and data structures in a non-transitory manner and to provide the computer programs and any associated data, data files, and data structures to a processor or computer to enable the processor or computer to execute the programs. In one example, the computer program and any associated data, data files, and data structures are distributed across networked computer systems such that the computer program and any associated data, data files, and data structures are stored, accessed, and executed in a distributed manner by one or more processors or computers.

Although a few embodiments of the present disclosure have been shown and described, it would be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the disclosure, the scope of which is defined in the claims and their equivalents.

Claims

1. A data mining method for health management of industrial equipment, comprising:

acquiring a monitoring data set of industrial equipment, wherein the monitoring data set comprises a plurality of state monitoring data, the plurality of state monitoring data comprise first monitoring data and second monitoring data, the first monitoring data are confirmed fault data generated when the industrial equipment failed, and the second monitoring data are data yet to be determined;

carrying out vectorization processing on each state monitoring data in the plurality of state monitoring data to obtain a state monitoring vector corresponding to each state monitoring data, wherein the state monitoring vector comprises a first monitoring vector and a second monitoring vector, the first monitoring data corresponds to the first monitoring vector, and the second monitoring data corresponds to the second monitoring vector;

dividing the state monitoring vectors with vector similarity meeting the first preset requirement into the same category aiming at each state monitoring vector to obtain a plurality of category vector clusters, wherein the category is determined as a target category under the condition that the quantity and the proportion of the first monitoring vectors in the vector cluster of any category meet the second preset requirement;

and determining new fault data based on the second monitoring data corresponding to the second monitoring vectors in the vector cluster of the target category.

2. The data mining method according to claim 1, wherein the dividing, for each state monitoring vector, of the state monitoring vectors whose vector similarity satisfies the first preset requirement into the same category includes:

for any one state monitoring vector, calculating the average vector similarity of the state monitoring vector relative to the current vector cluster of each category;

when the maximum value among the average vector similarities of the state monitoring vector relative to the current vector cluster of each category is greater than a first threshold, dividing the state monitoring vector into the category corresponding to the maximum value;

when the maximum value is less than or equal to the first threshold, creating a new category and dividing the state monitoring vector into the newly created category.
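
The assignment rule of claim 2 can be illustrated with a minimal Python sketch. The claim does not fix a similarity measure, so cosine similarity is an assumption here, as are the hypothetical names `cosine_similarity`, `average_similarity`, and `assign_to_clusters`. For brevity the average is taken over every vector already in a cluster, which is a simplification.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def average_similarity(vector, cluster_vectors):
    """Mean similarity of `vector` to all vectors currently in a cluster."""
    total = sum(cosine_similarity(vector, member) for member in cluster_vectors)
    return total / len(cluster_vectors)


def assign_to_clusters(vectors, first_threshold):
    """Process vectors one by one; return clusters as lists of vector indices."""
    clusters = []
    for index, vector in enumerate(vectors):
        best_cluster, best_score = None, float("-inf")
        for cluster in clusters:
            score = average_similarity(vector, [vectors[i] for i in cluster])
            if score > best_score:
                best_cluster, best_score = cluster, score
        if best_cluster is not None and best_score > first_threshold:
            # Maximum average similarity exceeds the first threshold: join it.
            best_cluster.append(index)
        else:
            # Otherwise create a new category holding only this vector.
            clusters.append([index])
    return clusters


example_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
clusters = assign_to_clusters(example_vectors, first_threshold=0.8)
```

With these illustrative inputs the first two vectors end up in one category and the last two in another; the result of such a single-pass scheme generally depends on the order in which vectors are processed.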

3. The data mining method according to claim 2, wherein the calculating, for any one of the state monitoring vectors, the average vector similarity of the state monitoring vector with respect to the current vector cluster of each category includes:

for the vector cluster of any current category, respectively calculating the vector similarity between the state monitoring vector and each seed vector in the vector cluster of the category, wherein the seed vectors comprise a first monitoring vector and/or the state monitoring vector that was first divided into the category;

taking the average value of the vector similarity of the state monitoring vector and all seed vectors in the vector cluster of the category as the average vector similarity of the state monitoring vector relative to the vector cluster of the category.
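
As a hedged illustration of claim 3, the sketch below averages similarity over the seed vectors of a category only, rather than over all of its members. The `Cluster` dataclass, the function names, and the use of cosine similarity are assumptions made for the example and are not taken from the disclosure.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical container: seeds drive the similarity, members are all vectors."""
    seeds: list = field(default_factory=list)
    members: list = field(default_factory=list)


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def average_seed_similarity(vector, cluster):
    """Mean similarity between `vector` and every seed vector of the cluster."""
    if not cluster.seeds:
        raise ValueError("a cluster needs at least one seed vector")
    total = sum(cosine_similarity(vector, seed) for seed in cluster.seeds)
    return total / len(cluster.seeds)


# Two seeds; the third member is not a seed and does not affect the score.
cluster = Cluster(
    seeds=[[1.0, 0.0], [0.0, 1.0]],
    members=[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
)
score = average_seed_similarity([1.0, 1.0], cluster)
```

Restricting the average to seed vectors keeps the cost of each comparison bounded by the number of seeds instead of growing with the size of the cluster.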

4. The data mining method according to claim 1, wherein the determining the category as the target category in the case that the number and the proportion of the first monitoring vectors in the vector cluster of any category satisfy the second preset requirement includes:

when the number of first monitoring vectors in the vector cluster of any category is greater than a second threshold and the proportion is greater than a third threshold, the category is determined to be the target category.
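
The two-part test of claim 4 can be sketched as follows. The function name `select_target_categories`, the representation of clusters as a dictionary of vector indices, and the threshold values are all hypothetical choices for illustration.

```python
def select_target_categories(clusters, first_vector_ids, second_threshold, third_threshold):
    """Return categories whose count AND proportion of first vectors pass both thresholds.

    clusters: dict mapping a category name to a list of vector indices.
    first_vector_ids: set of indices that are first (known-fault) monitoring vectors.
    """
    targets = []
    for category, members in clusters.items():
        if not members:
            continue
        count = sum(1 for index in members if index in first_vector_ids)
        proportion = count / len(members)
        if count > second_threshold and proportion > third_threshold:
            targets.append(category)
    return targets


clusters = {
    "A": [0, 1, 2, 3],             # 3 of 4 are known-fault vectors
    "B": [4, 5, 6],                # only 1 known-fault vector: count too low
    "C": [7, 8, 9, 10, 11, 12],    # 2 of 6: proportion too low
}
first_vector_ids = {0, 1, 2, 4, 7, 8}
targets = select_target_categories(
    clusters, first_vector_ids, second_threshold=1, third_threshold=0.5
)
```

Requiring both conditions filters out small clusters that contain a single known fault by chance as well as large clusters in which known faults are diluted.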

5. The data mining method of claim 1, wherein the plurality of state monitoring data further comprises third monitoring data, the third monitoring data being data that have been determined to be normal data generated when the industrial equipment operates normally, wherein the state monitoring vector further comprises a third monitoring vector, the third monitoring data corresponding to the third monitoring vector.

6. The data mining method of claim 5, wherein the determining new fault data based on second monitoring data corresponding to a second monitoring vector in the vector cluster of the target category comprises:

obtaining a first vector set based on each first monitoring vector and each third monitoring vector, and obtaining a second vector set based on the second monitoring vectors in the vector cluster of each target category;

training a classification model by using the first vector set, and predicting each second monitoring vector in the second vector set by using the trained classification model to obtain a prediction result of each second monitoring vector in the second vector set;

and determining a fourth monitoring vector from the second vector set based on a prediction result of each second monitoring vector in the second vector set, and determining second monitoring data corresponding to the fourth monitoring vector as new fault data.

7. The data mining method of claim 6, wherein the first vector set further includes tag information corresponding to each first monitoring vector and each third monitoring vector, the tag information indicating whether the corresponding first monitoring data belongs to fault data or the corresponding third monitoring data belongs to normal data, and the classification model is used to predict a prediction probability that the second monitoring data corresponding to each second monitoring vector in the second vector set belongs to fault data.

8. The data mining method of claim 7, wherein the determining a fourth monitoring vector from the second set of vectors comprises:

selecting, from the second vector set, a second monitoring vector whose prediction probability is greater than a fourth threshold, and determining the second monitoring vector whose prediction probability is greater than the fourth threshold as the fourth monitoring vector.
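
Claims 6 to 8 can be illustrated with a small, self-contained sketch. The claims do not name a classifier, so the logistic regression trained by gradient descent below is an assumption, as are all function names, hyperparameters, and toy vectors.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(vectors, labels, learning_rate=0.5, epochs=1000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    dim = len(vectors[0])
    weights, bias = [0.0] * dim, 0.0
    n = len(vectors)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(vectors, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j in range(dim):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict_probability(model, x):
    """Predicted probability that vector `x` corresponds to fault data."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def select_fourth_vectors(model, second_vectors, fourth_threshold):
    """Indices of second vectors whose fault probability exceeds the threshold."""
    return [
        i for i, x in enumerate(second_vectors)
        if predict_probability(model, x) > fourth_threshold
    ]


# First vector set: first (fault, tag 1) and third (normal, tag 0) vectors.
first_vectors = [[0.9, 0.8], [1.0, 0.9], [0.8, 1.0]]
third_vectors = [[0.1, 0.2], [0.0, 0.1], [0.2, 0.0]]
model = train_logistic_regression(first_vectors + third_vectors, [1, 1, 1, 0, 0, 0])

# Second vector set: undetermined vectors taken from the target categories.
second_vectors = [[0.95, 0.9], [0.05, 0.1]]
new_fault_indices = select_fourth_vectors(model, second_vectors, fourth_threshold=0.5)
```

In this toy setting only the first undetermined vector lies near the known faults, so only its monitoring data would be reported as new fault data.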

9. The data mining method of claim 6, wherein training the classification model using the first vector set and predicting each second monitoring vector in the second vector set using the trained classification model to obtain a prediction result for each second monitoring vector in the second vector set comprises:

cross-training a binary classification model using the first vector set to obtain a plurality of trained binary classification models;

respectively predicting each second monitoring vector in the second vector set by using the plurality of trained binary classification models to obtain a plurality of original prediction results of each second monitoring vector in the second vector set;

and obtaining a prediction result of each second monitoring vector in the second vector set based on the plurality of original prediction results of each second monitoring vector in the second vector set, wherein, for any one second monitoring vector in the second vector set, the average value of the plurality of original prediction results of the second monitoring vector is taken as the prediction result of the second monitoring vector.
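
The averaging step of claim 9 can be sketched as below, assuming each trained model has already produced one fault probability per second monitoring vector. The function name `average_predictions` and the numeric values are illustrative only.

```python
def average_predictions(raw_predictions):
    """Average per-vector predictions across models.

    raw_predictions[m][i] is the original prediction of model m for the
    i-th second monitoring vector; every model must cover the same vectors.
    """
    if not raw_predictions:
        raise ValueError("at least one model's predictions are required")
    n_models = len(raw_predictions)
    n_vectors = len(raw_predictions[0])
    if any(len(model_output) != n_vectors for model_output in raw_predictions):
        raise ValueError("all models must predict the same number of vectors")
    return [
        sum(model_output[i] for model_output in raw_predictions) / n_models
        for i in range(n_vectors)
    ]


raw_predictions = [
    [0.9, 0.2, 0.6],  # model 1
    [0.8, 0.1, 0.4],  # model 2
    [0.7, 0.3, 0.5],  # model 3
]
averaged = average_predictions(raw_predictions)
```

Averaging over several models trained on different portions of the labelled data tends to smooth out the effect of any single unrepresentative training split.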

10. The data mining method of claim 9, wherein cross-training the classification model with the first set of vectors comprises:

randomly dividing the first set of vectors into a first number of vector subsets;

in each training round, training the classification model with a second number of vector subsets out of the first number of vector subsets, wherein the second number is smaller than the first number,

wherein the second number of vector subsets used in any one training round is not exactly the same as the second number of vector subsets used in any other training round.
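
One way to realize the subset scheme of claim 10 is sketched below: the first vector set is shuffled and dealt into a first number of subsets, and each training round uses a distinct combination of a second number of them. The function names, the fixed random seed, and the use of `itertools.combinations` to guarantee distinct combinations are assumptions for illustration.

```python
import itertools
import random


def split_into_subsets(items, first_number, seed=None):
    """Randomly divide `items` into `first_number` subsets of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return [shuffled[i::first_number] for i in range(first_number)]


def training_combinations(subsets, second_number):
    """Yield one training set per distinct choice of `second_number` subsets."""
    if not 0 < second_number < len(subsets):
        raise ValueError("second_number must be positive and smaller than the subset count")
    for chosen in itertools.combinations(range(len(subsets)), second_number):
        yield [item for k in chosen for item in subsets[k]]


# Ten labelled vectors (represented by their indices), five subsets,
# four subsets per training round.
subsets = split_into_subsets(range(10), first_number=5, seed=0)
training_sets = list(training_combinations(subsets, second_number=4))
```

With a first number of 5 and a second number of 4 this reduces to the familiar five-fold arrangement in which each round leaves out a different subset, yielding five trained models.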