CN114462621A

CN114462621A - Machine supervised learning method and device

Info

Publication number: CN114462621A
Application number: CN202210008442.5A
Authority: CN
Inventors: 余为宾; 高磊
Original assignee: Shenzhen Secxun Technology Co ltd
Current assignee: Shenzhen Secxun Technology Co ltd
Priority date: 2022-01-06
Filing date: 2022-01-06
Publication date: 2022-05-10

Abstract

The invention discloses a machine supervised learning method and device. The method comprises the following steps. Step S1: manually judging a plurality of classification results of a machine learning model to create a black sample set and a white sample set. Step S2: performing sample collision on the black sample set and the white sample set to modify the machine weight of each feature dimension. Step S3: optimizing the machine learning model by using the black sample set and the white sample set after the sample collision. In this method, the black sample set and the white sample set provide manually labeled corrections to the results of machine supervised learning; the manually labeled results are collided, and the machine calculates reasonable machine weights. The machine learning model is thereby corrected and optimized according to the training result, the training process is effectively controlled, and the result of machine supervised learning is more accurate.

Description

Machine supervised learning method and device

Technical Field

The invention relates to the technical field of machine supervised learning, in particular to a machine supervised learning method and device.

Background

Machine supervised learning is a machine learning task that infers a function from labeled training data. The training data comprise a set of training examples, each of which, in supervised learning, consists of metadata and a training sample. Existing machine supervised learning has the following technical problems: the machine learning model cannot be corrected according to the training result, the quality of the training result is determined entirely by the training samples, and the training process cannot be controlled.

Disclosure of Invention

(I) technical problem to be solved

In view of the defects of the prior art, the invention provides a machine supervised learning method and device that can solve the above technical problems.

(II) technical scheme

In order to solve the above technical problems, the present invention provides the following technical solution: a machine supervised learning method comprising the following steps:

step S1: manually judging a plurality of classification results of the machine learning model to create a black sample set and a white sample set, wherein each sample data in the black sample set and the white sample set corresponds to a feature vector, and each feature vector comprises a plurality of feature dimensions;

step S2: performing sample collision on the black sample set and the white sample set to modify the machine weight of each feature dimension;

step S3: optimizing the machine learning model by using the black sample set and the white sample set after the sample collision.

Preferably, before step S1, the method further includes:

step S1a: performing machine learning by using the training sample set to obtain a machine learning model;

step S1b: performing prediction classification on the test sample set by using the machine learning model to obtain a plurality of classification results.

Preferably, before step S2, the method further includes:

step S2a: manually setting the machine weight corresponding to each feature dimension of the feature vector.

Preferably, step S3 specifically includes: substituting the feature vectors of the black sample set and the white sample set after the sample collision into a machine supervised learning algorithm for training and learning, so as to obtain an optimized machine learning model.

Preferably, after step S3, the method further includes:

step S4: performing prediction classification on the data set to be classified by using the optimized machine learning model.

In order to solve the above technical problem, the present invention provides another technical solution as follows: a machine supervised learning apparatus comprising the following modules: a creating module, a collision module and a learning module;

the creating module is used for creating a black sample set and a white sample set according to results obtained after manual study and judgment are carried out on a plurality of classification results of the machine learning model, wherein each sample data in the black sample set and the white sample set corresponds to a feature vector, and each feature vector comprises a plurality of feature dimensions;

the collision module is used for carrying out sample collision on the black sample set and the white sample set so as to modify the machine weight of each characteristic dimension;

the learning module is used for optimizing the machine learning model by using the black sample set and the white sample set after sample collision.

Preferably, the learning module is further configured to perform machine learning by using the training sample set to obtain a machine learning model; the machine supervised learning apparatus also comprises a classification module, which is used for performing prediction classification on the test sample set by using the machine learning model so as to obtain a plurality of classification results.

Preferably, the learning module is specifically configured to substitute each feature vector of the black sample set and the white sample set after the sample collision into a machine supervised learning algorithm to perform training learning, so as to obtain an optimized machine learning model.

Preferably, the machine supervised learning algorithm is a decision tree algorithm.

Preferably, the classification module is further configured to perform predictive classification on the to-be-classified data set by using the optimized machine learning model.

(III) advantageous effects

Compared with the prior art, the invention provides a machine supervised learning method and device with the following beneficial effects. A black sample set and a white sample set are created by manually judging a plurality of classification results of a machine learning model; sample collision is performed on the black sample set and the white sample set to modify the machine weight of each feature dimension; and the machine learning model is then optimized by using the black sample set and the white sample set after the sample collision. The black sample set and the white sample set thus provide manually labeled corrections to the results of machine supervised learning, the manually labeled results are collided, and the machine calculates reasonable machine weights. As a result, the machine learning model is corrected and optimized according to the training result, the training process is effectively controlled, and the result of machine supervised learning is more accurate.

Drawings

FIG. 1 is a flow chart of the steps of a first embodiment of the machine supervised learning method of the present invention;

FIG. 2 is a flowchart illustrating steps of a second embodiment of a machine supervised learning method of the present invention;

FIG. 3 is a schematic block diagram of the machine supervised learning apparatus of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to FIG. 1, a first embodiment of the machine supervised learning method of the present invention includes the following steps:

Step S1: manually judging a plurality of classification results of the machine learning model to create a black sample set and a white sample set.

Each sample data in the black sample set and the white sample set corresponds to a feature vector, and each feature vector comprises a plurality of feature dimensions.

For example, machine supervised learning is applied to predict and classify call ticket behavior, so as to judge whether a call ticket is a normal call ticket or an abnormal call ticket such as a fraudulent one. The black sample set then comprises a plurality of sample data manually judged to be abnormal call tickets, and the white sample set comprises a plurality of sample data manually judged to be normal call tickets.
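As a minimal sketch of step S1, the snippet below splits manually reviewed classification results into a black set and a white set. The function name `build_sample_sets`, the two example feature dimensions, and the use of a boolean verdict per sample are illustrative assumptions and are not specified by the invention.

```python
from typing import List, Sequence, Tuple

FeatureVector = Tuple[float, ...]


def build_sample_sets(
    vectors: Sequence[FeatureVector],
    verdicts: Sequence[bool],
) -> Tuple[List[FeatureVector], List[FeatureVector]]:
    """Split reviewed samples into (black set, white set).

    verdicts[i] is the reviewer's judgment for vectors[i]:
    True means abnormal (black), False means normal (white).
    """
    if len(vectors) != len(verdicts):
        raise ValueError("each classification result needs exactly one verdict")
    black = [v for v, abnormal in zip(vectors, verdicts) if abnormal]
    white = [v for v, abnormal in zip(vectors, verdicts) if not abnormal]
    return black, white


# Hypothetical call tickets: (calls per hour, mean call duration in seconds).
tickets = [(40.0, 12.0), (2.0, 180.0), (55.0, 8.0)]
reviewer_verdicts = [True, False, True]  # True = judged abnormal by a reviewer
black_set, white_set = build_sample_sets(tickets, reviewer_verdicts)
print(len(black_set), len(white_set))
```

Every sample keeps its full feature vector, so the later collision step can compare the two sets dimension by dimension.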

Step S2: performing sample collision on the black sample set and the white sample set so as to modify the machine weight of each feature dimension.

Sample collision means taking the difference set of the black sample set and the white sample set, that is, finding the feature dimensions whose values differ between the sample data of the two sets. The machine weight is then increased for the feature dimensions that differ and decreased for the feature dimensions that are the same.
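The difference-set part of sample collision can be sketched as follows. The text does not say how "different" is decided when a set holds several samples, so this sketch assumes one possible rule: a feature dimension differs when the set of values it takes in the black set is not equal to the set of values it takes in the white set. The function name `differing_dimensions` is hypothetical.

```python
from typing import Sequence, Set, Tuple

FeatureVector = Tuple[float, ...]


def differing_dimensions(
    black: Sequence[FeatureVector],
    white: Sequence[FeatureVector],
) -> Set[int]:
    """Return the indices of feature dimensions that differ between the sets.

    Assumed rule: dimension d differs when the values observed at d in the
    black set and in the white set are not the same set of values.
    """
    if not black or not white:
        raise ValueError("both sample sets must be non-empty")
    n_dims = len(black[0])
    if any(len(row) != n_dims for row in list(black) + list(white)):
        raise ValueError("all feature vectors must have the same length")
    different = set()
    for d in range(n_dims):
        black_values = {row[d] for row in black}
        white_values = {row[d] for row in white}
        if black_values ^ white_values:  # non-empty symmetric difference
            different.add(d)
    return different


# One black and one white sample; dimensions 1 and 2 carry different values.
diff = differing_dimensions([(1.0, 30.0, 5.0, 0.0)], [(1.0, 2.0, 50.0, 0.0)])
print(sorted(diff))
```

With one sample per set, as in this usage, the rule reduces to a plain value comparison per dimension.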

Step S3: optimizing the machine learning model by using the black sample set and the white sample set after the sample collision.

It should be understood that step S3 further optimizes the machine learning model by using the black sample set and the white sample set whose machine weights were modified in step S2. Step S3 specifically includes: substituting the feature vectors of the black sample set and the white sample set after the sample collision into a machine supervised learning algorithm for training and learning, so as to obtain an optimized machine learning model. The machine supervised learning algorithm may specifically be a decision tree algorithm, logistic regression, linear regression, K-nearest neighbor, naive Bayes, etc.; each of these algorithms is prior art, and its specific principle is not described in detail here. It should be understood that a higher machine weight means that the feature dimension has a greater influence on the final classification result. A feature dimension with a higher machine weight therefore has a greater influence on the finally formed machine learning model than a feature dimension with a lower machine weight, and the resulting model is more reliable than one obtained by machine supervised learning in which all feature dimensions have equal machine weights, so that the classification results obtained when the machine learning model is actually applied are more accurate.
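The text does not specify how the machine weights enter the learning algorithm. One possible reading, sketched here for K-nearest neighbor, is to scale each dimension's contribution to the distance by its machine weight, so that heavily weighted dimensions dominate the vote. The functions `weighted_distance` and `knn_predict` and the sample values are illustrative assumptions.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

FeatureVector = Tuple[float, ...]


def weighted_distance(
    x: FeatureVector, y: FeatureVector, weights: Sequence[float]
) -> float:
    """Euclidean distance in which each dimension is scaled by its weight."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


def knn_predict(
    train: List[FeatureVector],
    labels: List[str],
    weights: Sequence[float],
    query: FeatureVector,
    k: int = 3,
) -> str:
    """Classify query by majority vote among its k nearest training samples."""
    ranked = sorted(
        range(len(train)),
        key=lambda i: weighted_distance(train[i], query, weights),
    )
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical samples: dimensions 1 and 2 separate the classes, 0 and 3 do not.
black = [(0.0, 9.0, 8.0, 1.0), (5.0, 8.0, 9.0, 0.0)]
white = [(0.0, 1.0, 2.0, 1.0), (5.0, 2.0, 1.0, 0.0)]
train = black + white
labels = ["black", "black", "white", "white"]
machine_weights = [0.1, 0.4, 0.4, 0.1]  # weights after sample collision

prediction = knn_predict(train, labels, machine_weights, (5.0, 8.5, 8.5, 1.0))
print(prediction)
```

Other algorithms named above would need their own way of consuming the weights, for example by scaling the input features before training.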

Compared with the prior art, the invention provides a machine supervised learning method with the following beneficial effects. A black sample set and a white sample set are created by manually judging a plurality of classification results of a machine learning model; sample collision is performed on the black sample set and the white sample set to modify the machine weight of each feature dimension; and the machine learning model is then optimized by using the black sample set and the white sample set after the sample collision. The black sample set and the white sample set thus provide manually labeled corrections to the results of machine supervised learning, the manually labeled results are collided, and the machine calculates reasonable machine weights. As a result, the machine learning model is corrected and optimized according to the training result, the training process is effectively controlled, and the result of machine supervised learning is more accurate.

Referring to FIG. 2, a second embodiment of the machine supervised learning method of the present invention includes the following steps:

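Step S1a: performing machine learning by using the training sample set to obtain a machine learning model.

Step S1b: performing prediction classification on the test sample set by using the machine learning model to obtain a plurality of classification results.

Step S1: manually judging the plurality of classification results of the machine learning model to create a black sample set and a white sample set.

It should be understood that step S1 is a manual judgment of the plurality of classification results obtained in step S1b.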

Step S2a: manually setting the machine weight corresponding to each feature dimension of the feature vector.

It should be understood that the initial machine weight of each feature dimension is set in step S2a. For example, sample data of the black sample set corresponds to a feature vector A1 = {a1, a2, a3, a4}, and sample data of the white sample set corresponds to a feature vector B1 = {b1, b2, b3, b4}; the machine weights of the four feature dimensions a1, a2, a3, a4 of A1 are all manually set to 0.25, and the machine weights of the four feature dimensions b1, b2, b3, b4 of B1 are all manually set to 0.25.

Step S2: performing sample collision on the black sample set and the white sample set so as to modify the machine weight of each feature dimension.

For example, sample collision determines that a2 and b2, and a3 and b3, are corresponding feature dimensions with different values, while a1 and b1, and a4 and b4, are corresponding feature dimensions with the same value. After modification, the machine weights of the four feature dimensions a1, a2, a3, a4 of A1 are 0.1, 0.4, 0.4 and 0.1 respectively, and the machine weights of the four feature dimensions b1, b2, b3, b4 of B1 are 0.1, 0.4, 0.4 and 0.1 respectively; that is, the machine weights of a2 and b2 and of a3 and b3 are increased accordingly. The above machine weight values are only examples, and the specific values can be modified and adjusted as required.
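The example weights can be reproduced with a simple update rule, shown here only as an illustration. The rule multiplies the weight of every differing dimension by a boost factor and then renormalizes so that the weights sum to 1. Both the rule and the factor of 4 are assumptions chosen to match the example values; the text leaves the actual calculation open.

```python
from typing import List, Sequence, Set


def reweight(
    weights: Sequence[float],
    different: Set[int],
    boost: float = 4.0,
) -> List[float]:
    """Raise the weights of differing dimensions, then renormalize to sum to 1.

    After renormalization the weights of the unchanged (same-valued)
    dimensions end up lower than before, as the description requires.
    """
    if boost <= 1.0:
        raise ValueError("boost must be greater than 1 to raise a weight")
    raised = [w * boost if i in different else w for i, w in enumerate(weights)]
    total = sum(raised)
    return [w / total for w in raised]


initial = [0.25, 0.25, 0.25, 0.25]  # manually set in step S2a
different = {1, 2}                  # indices of a2/b2 and a3/b3
updated = [round(w, 6) for w in reweight(initial, different)]
print(updated)
```

A larger boost factor shifts more weight to the differing dimensions; the factor is the kind of value the description says can be adjusted as required.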

Step S3: optimizing the machine learning model by using the black sample set and the white sample set after the sample collision.

Step S4: performing prediction classification on the data set to be classified by using the optimized machine learning model.

It should be understood that in step S4 the optimized machine learning model is put to practical use. The data set to be classified includes a plurality of data items to be classified by prediction; the feature vector corresponding to each data item is extracted and input into the optimized machine learning model to obtain the corresponding classification result.

Referring to FIG. 3, the machine supervised learning apparatus of the present invention includes a creating module 11, a collision module 12 and a learning module 13. The creating module 11 is configured to create a black sample set and a white sample set according to the results obtained by manually judging a plurality of classification results of a machine learning model, where each sample data in the black sample set and the white sample set corresponds to a feature vector, and each feature vector includes a plurality of feature dimensions. The collision module 12 is configured to perform sample collision on the black sample set and the white sample set to modify the machine weight of each feature dimension. The learning module 13 is configured to optimize the machine learning model by using the black sample set and the white sample set after the sample collision.

The learning module 13 is further configured to perform machine learning by using the training sample set to obtain a machine learning model; the machine-supervised learning apparatus further comprises a classification module 14 for performing predictive classification on the test sample set by using the machine learning model to obtain a plurality of classification results.

The learning module 13 is specifically configured to substitute each feature vector of the black sample set and the white sample set after the sample collision into a machine supervised learning algorithm to perform training learning, so as to obtain an optimized machine learning model. The classification module 14 is further configured to perform predictive classification on the to-be-classified data set by using the optimized machine learning model.
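The module structure described above can be sketched as a small composition of callables. Everything below is a hypothetical illustration: the class name `SupervisedLearningDevice`, the placeholder collision function that leaves the weights unchanged, and the nearest-centroid learner are stand-ins chosen to keep the sketch short, not the modules' actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Vector = Tuple[float, ...]
Model = Callable[[Vector], str]


@dataclass
class SupervisedLearningDevice:
    create: Callable[[List[Vector], List[bool]], Tuple[List[Vector], List[Vector]]]
    collide: Callable[[List[Vector], List[Vector], List[float]], List[float]]
    learn: Callable[[List[Vector], List[Vector], List[float]], Model]

    def optimize(self, results, verdicts, weights) -> Model:
        black, white = self.create(results, verdicts)      # creating module 11
        new_weights = self.collide(black, white, weights)  # collision module 12
        return self.learn(black, white, new_weights)       # learning module 13


def classify(model: Model, data: Sequence[Vector]) -> List[str]:
    """Classification module 14: apply the optimized model to each vector."""
    return [model(v) for v in data]


def split_by_verdict(results, verdicts):
    black = [r for r, v in zip(results, verdicts) if v]
    white = [r for r, v in zip(results, verdicts) if not v]
    return black, white


def keep_weights(black, white, weights):
    # Placeholder: a real collision module would raise or lower weights here.
    return list(weights)


def nearest_centroid(black, white, weights) -> Model:
    def centroid(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    cb, cw = centroid(black), centroid(white)

    def dist(x, c):
        return sum(w * (a - b) ** 2 for a, b, w in zip(x, c, weights))

    return lambda x: "black" if dist(x, cb) < dist(x, cw) else "white"


device = SupervisedLearningDevice(split_by_verdict, keep_weights, nearest_centroid)
model = device.optimize(
    [(9.0, 1.0), (8.0, 0.0), (1.0, 1.0), (2.0, 0.0)],
    [True, True, False, False],
    [0.5, 0.5],
)
outcome = classify(model, [(7.0, 1.0), (0.0, 0.0)])
print(outcome)
```

Passing the modules in as callables keeps each one replaceable, which mirrors the way the description treats the learning algorithm as interchangeable.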

For the specific principle of the machine supervised learning apparatus of the present invention, reference may be made to the description of the above embodiments of the machine supervised learning method, which is not repeated here.

It is to be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims

1. A machine supervised learning method, characterized by comprising the following steps:

step S1: manually judging a plurality of classification results of a machine learning model to create a black sample set and a white sample set, wherein each sample data in the black sample set and the white sample set corresponds to a feature vector, and each feature vector comprises a plurality of feature dimensions;

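step S2: performing sample collision on the black sample set and the white sample set so as to modify the machine weight of each feature dimension;

step S3: optimizing the machine learning model by using the black sample set and the white sample set after the sample collision.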

2. The machine supervised learning method of claim 1, wherein: before the step S1, the method further includes:

step S1a: performing machine learning by using a training sample set to obtain the machine learning model;

step S1b: performing prediction classification on the test sample set by using the machine learning model to obtain a plurality of classification results.

3. The machine supervised learning method of claim 1, wherein: before the step S2, the method further includes:

step S2a: manually setting the machine weight corresponding to each feature dimension of the feature vector.

4. The machine supervised learning method of claim 1, wherein: the step S3 specifically includes: substituting the feature vectors of the black sample set and the white sample set after the sample collision into a machine supervised learning algorithm for training and learning so as to obtain the optimized machine learning model.

5. The machine supervised learning method of claim 4, wherein: the method further comprises the following steps after the step S3:

step S4: performing prediction classification on a data set to be classified by using the optimized machine learning model.

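6. A machine supervised learning device, characterized by comprising the following modules: a creating module, a collision module and a learning module;

the creating module is used for creating a black sample set and a white sample set according to results obtained after manually judging a plurality of classification results of a machine learning model, wherein each sample data in the black sample set and the white sample set corresponds to a feature vector, and each feature vector comprises a plurality of feature dimensions;

the collision module is used for performing sample collision on the black sample set and the white sample set so as to modify the machine weight of each feature dimension;

the learning module is used for optimizing the machine learning model by using the black sample set and the white sample set after the sample collision.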

7. The machine supervised learning device of claim 6, wherein: the learning module is further used for performing machine learning by using a training sample set to obtain the machine learning model; the machine supervised learning device further comprises a classification module, which is used for performing prediction classification on the test sample set by using the machine learning model so as to obtain a plurality of classification results.

8. The machine supervised learning device of claim 7, wherein: the learning module is specifically used for substituting the feature vectors of the black sample set and the white sample set after the sample collision into a machine supervised learning algorithm for training and learning so as to obtain the optimized machine learning model.

9. The machine supervised learning device of claim 8, wherein: the machine supervised learning algorithm is a decision tree algorithm.

10. The machine supervised learning device of claim 8, wherein: the classification module is further used for performing prediction classification on a data set to be classified by using the optimized machine learning model.