CN114462621A - Machine supervision learning method and device - Google Patents


Info

Publication number
CN114462621A
Authority
CN
China
Prior art keywords
machine
sample set
learning
learning model
white
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210008442.5A
Other languages
Chinese (zh)
Inventor
余为宾
高磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Secxun Technology Co ltd
Original Assignee
Shenzhen Secxun Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2022-01-06
Filing date
2022-01-06
Publication date
2022-05-10
Application filed by Shenzhen Secxun Technology Co ltd
Priority to CN202210008442.5A
Publication of CN114462621A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/40 Software arrangements specially adapted for pattern recognition, e.g. user interfaces or toolboxes therefor
    • G06F18/41 Interactive pattern learning with a human teacher

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

The invention discloses a machine supervised learning method and device. The method comprises the following steps. Step S1: manually judging a plurality of classification results of a machine learning model to create a black sample set and a white sample set. Step S2: performing sample collision on the black sample set and the white sample set to modify the machine weight of each feature dimension. Step S3: optimizing the machine learning model by using the black sample set and the white sample set after the sample collision. The black sample set and the white sample set provide manual labeling of the machine supervised learning results; the manually labeled results are then collided so that the machine calculates reasonable machine weights. The machine learning model is thereby corrected and optimized according to the training result, the training process is effectively controlled, and the machine supervised learning result becomes more accurate.

Description

Machine supervision learning method and device
Technical Field
The invention relates to the technical field of machine supervised learning, in particular to a machine supervised learning method and device.
Background
Machine supervised learning is a machine learning task that infers a function from labeled training data. The training data comprise a set of training examples; in supervised learning, each training example consists of metadata and a training sample. Existing machine supervised learning has the following technical problems: the machine learning model cannot be corrected according to the training result, the quality of the training result is determined entirely by the training samples, and the training process cannot be controlled.
Disclosure of Invention
(I) Technical problem to be solved
In view of the defects of the prior art, the invention provides a machine supervised learning method and device that can solve the above technical problems.
(II) Technical solution
In order to solve the above technical problems, the present invention provides the following technical solution: a machine supervised learning method comprising the following steps:
step S1: manually judging a plurality of classification results of the machine learning model to create a black sample set and a white sample set, wherein each sample data in the black sample set and the white sample set corresponds to a feature vector, and each feature vector comprises a plurality of feature dimensions;
step S2: performing sample collision on the black sample set and the white sample set to modify the machine weight of each feature dimension;
step S3: optimizing the machine learning model by using the black sample set and the white sample set after the sample collision.
Preferably, before step S1, the method further includes:
step S1a: performing machine learning by using the training sample set to obtain a machine learning model;
step S1b: performing prediction classification on the test sample set by using the machine learning model to obtain a plurality of classification results.
Preferably, before step S2, the method further includes:
step S2a: manually setting the machine weight corresponding to each feature dimension of the feature vector.
Preferably, step S3 specifically includes: substituting the feature vectors of the black sample set and the white sample set after the sample collision into a machine supervised learning algorithm for training, to obtain an optimized machine learning model.
Preferably, after step S3, the method further includes:
step S4: performing prediction classification on the data set to be classified by using the optimized machine learning model.
In order to solve the above technical problem, the present invention provides another technical solution as follows: a machine supervised learning device comprising a creating module, a collision module and a learning module;
the creating module is used for creating a black sample set and a white sample set according to the results of manually judging a plurality of classification results of the machine learning model, wherein each sample data in the black sample set and the white sample set corresponds to a feature vector, and each feature vector comprises a plurality of feature dimensions;
the collision module is used for performing sample collision on the black sample set and the white sample set so as to modify the machine weight of each feature dimension;
the learning module is used for optimizing the machine learning model by using the black sample set and the white sample set after the sample collision.
Preferably, the learning module is further configured to perform machine learning by using the training sample set to obtain a machine learning model; the machine supervision learning device also comprises a classification module which is used for carrying out prediction classification on the test sample set by utilizing the machine learning model so as to obtain a plurality of classification results.
Preferably, the learning module is specifically configured to substitute each feature vector of the black sample set and the white sample set after the sample collision into a machine supervised learning algorithm to perform training learning, so as to obtain an optimized machine learning model.
Preferably, the machine supervised learning algorithm is a decision tree algorithm.
Preferably, the classification module is further configured to perform predictive classification on the to-be-classified data set by using the optimized machine learning model.
(III) Advantageous effects
Compared with the prior art, the invention provides a machine supervised learning method and device with the following beneficial effects. A black sample set and a white sample set are created by manually judging a plurality of classification results of a machine learning model; sample collision is performed on the black sample set and the white sample set to modify the machine weight of each feature dimension; finally, the machine learning model is optimized by using the black sample set and the white sample set after the sample collision. The black sample set and the white sample set provide manual labeling of the machine supervised learning results, the manually labeled results are collided, and the machine calculates reasonable machine weights. The machine learning model is thereby corrected and optimized according to the training result, the training process is effectively controlled, and the machine supervised learning result becomes more accurate.
Drawings
FIG. 1 is a flow chart of the steps of a first embodiment of the machine supervised learning method of the present invention;
FIG. 2 is a flowchart illustrating steps of a second embodiment of a machine supervised learning method of the present invention;
FIG. 3 is a schematic block diagram of the machine supervised learning apparatus of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, a first embodiment of the machine supervised learning method of the present invention includes the following steps:
Step S1: manually judging a plurality of classification results of the machine learning model to create a black sample set and a white sample set.
Each sample data in the black sample set and the white sample set corresponds to a feature vector, and each feature vector comprises a plurality of feature dimensions.
For example, machine supervised learning is applied to predict and classify call ticket (call record) behavior, so as to judge whether a call ticket is a normal call ticket or an abnormal call ticket such as a fraudulent one. The black sample set then comprises a plurality of sample data manually judged to be abnormal call tickets, and the white sample set comprises a plurality of sample data manually judged to be normal call tickets.
Step S2: performing sample collision on the black sample set and the white sample set so as to modify the machine weight of each feature dimension.
Sample collision means taking the difference set of the black sample set and the white sample set, that is, finding the feature dimensions whose values differ between the sample data of the two sets. The machine weight is then increased for the feature dimensions that differ and decreased for the feature dimensions that are the same.
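The difference-set step described above can be sketched as follows. This is only an illustrative reading, not the patent's implementation: the function name `differing_dimensions` is hypothetical, and treating a dimension as "different" whenever the set of values seen in the black samples is not identical to the set seen in the white samples is an assumption, since the text does not define the comparison precisely.

```python
from typing import List, Sequence


def differing_dimensions(black: Sequence[Sequence], white: Sequence[Sequence]) -> List[int]:
    """Return the indices of feature dimensions whose values differ between the two sets.

    Assumption: a dimension counts as "different" when the symmetric difference
    of the value sets observed in the black and white samples is non-empty.
    """
    if not black or not white:
        raise ValueError("both sample sets must be non-empty")
    n_dims = len(black[0])
    differing = []
    for d in range(n_dims):
        black_values = {sample[d] for sample in black}
        white_values = {sample[d] for sample in white}
        if black_values ^ white_values:  # symmetric difference is non-empty
            differing.append(d)
    return differing


# One black and one white sample with four feature dimensions each:
# dimensions 0 and 3 match, dimensions 1 and 2 differ.
print(differing_dimensions([(1, 30, 5, 0)], [(1, 2, 0, 0)]))  # [1, 2]
```

The returned indices identify the dimensions whose machine weight would be increased; all remaining dimensions would have their weight decreased.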
Step S3: optimizing the machine learning model by using the black sample set and the white sample set after the sample collision.
It should be understood that step S3 further optimizes the machine learning model by using the black sample set and the white sample set whose machine weights were modified in step S2. Step S3 specifically includes: substituting the feature vectors of the black sample set and the white sample set after the sample collision into a machine supervised learning algorithm for training, to obtain an optimized machine learning model. The machine supervised learning algorithm may be a decision tree algorithm, logistic regression, linear regression, K-nearest neighbor, naive Bayes, and so on; each of these algorithms is prior art, and their principles are not described in detail here. It should be understood that a higher machine weight means that the feature dimension has a greater influence on the final classification result. A feature dimension with a higher machine weight therefore influences the resulting machine learning model more than one with a lower machine weight, and the model can be more reliable than one obtained by machine supervised learning in which every feature dimension carries the same machine weight, so that the classification results obtained when the model is actually applied are more accurate.
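The text does not say how the machine weights enter the learning algorithm. As one hedged illustration of why a higher weight gives a dimension more influence, the sketch below uses a nearest-neighbor classifier (K-nearest neighbor is among the algorithms named above, here with K = 1) and a weighted Euclidean distance. The function names, the sample values and the choice of distance are all assumptions made for illustration.

```python
import math
from typing import Sequence


def weighted_distance(x: Sequence[float], y: Sequence[float], weights: Sequence[float]) -> float:
    """Euclidean distance in which each squared difference is scaled by its machine weight."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


def classify(sample, black, white, weights) -> str:
    """Label a sample by whichever set contains its nearest neighbor (1-NN)."""
    nearest_black = min(weighted_distance(sample, b, weights) for b in black)
    nearest_white = min(weighted_distance(sample, w, weights) for w in white)
    return "black" if nearest_black < nearest_white else "white"


# Hypothetical two-dimensional data.
black = [(0.0, 10.0)]
white = [(10.0, 0.0)]
sample = (2.0, 4.0)

print(classify(sample, black, white, [0.5, 0.5]))  # black
print(classify(sample, black, white, [0.1, 0.9]))  # white
```

With equal weights the sample lies closer to the black sample; once the second dimension carries most of the weight, the same sample is classified as white, which is the kind of influence the paragraph above attributes to the machine weight.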
Compared with the prior art, the machine supervised learning method of this embodiment has the following beneficial effects. A black sample set and a white sample set are created by manually judging a plurality of classification results of a machine learning model; sample collision is performed on the two sets to modify the machine weight of each feature dimension; finally, the machine learning model is optimized by using the black sample set and the white sample set after the sample collision. The black sample set and the white sample set provide manual labeling of the machine supervised learning results, the manually labeled results are collided, and the machine calculates reasonable machine weights. The machine learning model is thereby corrected and optimized according to the training result, the training process is effectively controlled, and the machine supervised learning result becomes more accurate.
Referring to fig. 2, a second embodiment of the machine supervised learning method of the present invention includes the following steps:
Step S1a: performing machine learning by using the training sample set to obtain a machine learning model.
Step S1b: performing prediction classification on the test sample set by using the machine learning model to obtain a plurality of classification results.
Step S1: manually judging a plurality of classification results of the machine learning model to create a black sample set and a white sample set.
It should be understood that step S1 is a manual judgment of the plurality of classification results obtained in step S1b.
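The hand-off from step S1b to step S1 can be sketched as below. This is a minimal illustration under stated assumptions: the function name `build_sample_sets`, the `review` callback that stands in for the human reviewer, and the verdict strings "abnormal" and "normal" are all hypothetical and do not come from the patent.

```python
from typing import Callable, Iterable, List, Tuple

FeatureVector = Tuple[float, ...]


def build_sample_sets(
    results: Iterable[Tuple[FeatureVector, str]],
    review: Callable[[FeatureVector, str], str],
) -> Tuple[List[FeatureVector], List[FeatureVector]]:
    """Split model classification results into black and white sets using a human verdict.

    `results` pairs each feature vector with the label the model predicted;
    `review` returns the reviewer's verdict for that pair.
    """
    black: List[FeatureVector] = []
    white: List[FeatureVector] = []
    for features, predicted in results:
        verdict = review(features, predicted)
        if verdict == "abnormal":
            black.append(features)
        elif verdict == "normal":
            white.append(features)
        # Any other verdict (for example "unsure") leaves the sample out of both sets.
    return black, white
```

In use, `review` would present each classification result to a person; in a test it can be replaced by a lookup table of verdicts prepared in advance.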
Step S2a: manually setting the machine weight corresponding to each feature dimension of the feature vector.
It should be understood that step S2a sets the initial machine weight of each feature dimension. For example, a sample in the black sample set corresponds to a feature vector A1 = {a1, a2, a3, a4}, and a sample in the white sample set corresponds to a feature vector B1 = {b1, b2, b3, b4}. The machine weights of the four feature dimensions a1, a2, a3, a4 of A1 are all manually set to 0.25, and the machine weights of the four feature dimensions b1, b2, b3, b4 of B1 are likewise all manually set to 0.25.
Step S2: performing sample collision on the black sample set and the white sample set so as to modify the machine weight of each feature dimension.
For example, sample collision determines that a2 and b2, and a3 and b3, are corresponding feature dimensions with different values, while a1 and b1, and a4 and b4, are corresponding feature dimensions with the same value. After modification, the machine weights of the four feature dimensions a1, a2, a3, a4 of A1 are 0.1, 0.4, 0.4 and 0.1 respectively, and the machine weights of the four feature dimensions b1, b2, b3, b4 of B1 are 0.1, 0.4, 0.4 and 0.1 respectively; that is, the machine weights of a2 and b2, and of a3 and b3, are increased. These machine weight values are only examples, and the specific values can be adjusted as required.
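One way to arrive at example weights like these, raising the differing dimensions to 0.4 and lowering the matching ones to 0.1, is to score each differing dimension four times as high as a matching one and then normalize so the weights sum to 1. The patent gives no update rule, so the function name `reweight` and the 4:1 ratio below are assumptions chosen only because they reproduce the example values.

```python
from typing import List, Sequence


def reweight(n_dims: int, differing: Sequence[int], ratio: float = 4.0) -> List[float]:
    """Give differing dimensions `ratio` times the weight of matching ones, normalized to sum to 1."""
    differing_set = set(differing)
    scores = [ratio if d in differing_set else 1.0 for d in range(n_dims)]
    total = sum(scores)
    return [s / total for s in scores]


# Four dimensions; the second and third (indices 1 and 2) differ between the samples.
weights = reweight(4, differing=[1, 2])
print([round(w, 2) for w in weights])  # [0.1, 0.4, 0.4, 0.1]
```

Normalizing keeps the total weight at 1, the same total as the four initial weights of 0.25, so increasing some weights necessarily decreases the others.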
Step S3: optimizing the machine learning model by using the black sample set and the white sample set after the sample collision.
Step S4: performing prediction classification on the data set to be classified by using the optimized machine learning model.
It should be understood that in step S4 the optimized machine learning model is put to practical use. The data set to be classified includes a plurality of data items awaiting prediction classification; the corresponding classification result is obtained by extracting the feature vector of each data item in the data set and inputting it into the optimized machine learning model.
Referring to FIG. 3, the machine supervised learning apparatus of the present invention includes a creating module 11, a collision module 12 and a learning module 13. The creating module 11 is configured to create a black sample set and a white sample set according to the results of manually judging a plurality of classification results of a machine learning model, where each sample data in the black sample set and the white sample set corresponds to a feature vector, and each feature vector includes a plurality of feature dimensions; the collision module 12 is configured to perform sample collision on the black sample set and the white sample set to modify the machine weight of each feature dimension; the learning module 13 is configured to optimize the machine learning model by using the black sample set and the white sample set after the sample collision.
The learning module 13 is further configured to perform machine learning by using the training sample set to obtain a machine learning model; the machine-supervised learning apparatus further comprises a classification module 14 for performing predictive classification on the test sample set by using the machine learning model to obtain a plurality of classification results.
The learning module 13 is specifically configured to substitute each feature vector of the black sample set and the white sample set after the sample collision into a machine supervised learning algorithm to perform training learning, so as to obtain an optimized machine learning model. The classification module 14 is further configured to perform predictive classification on the to-be-classified data set by using the optimized machine learning model.
For the specific principle of the machine supervised learning apparatus of the present invention, reference may be made to the description of the above embodiments of the machine supervised learning method; it is not repeated here.
It is to be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1. A machine supervised learning method, characterized by comprising the following steps:
step S1: manually judging a plurality of classification results of a machine learning model to create a black sample set and a white sample set, wherein each sample data in the black sample set and the white sample set corresponds to a feature vector, and each feature vector comprises a plurality of feature dimensions;
step S2: performing sample collision on the black sample set and the white sample set so as to modify the machine weight of each feature dimension;
step S3: optimizing the machine learning model by using the black sample set and the white sample set after the sample collision.
2. The machine supervised learning method of claim 1, wherein: before the step S1, the method further includes:
step S1a: performing machine learning by using a training sample set to obtain the machine learning model;
step S1b: performing prediction classification on the test sample set by using the machine learning model to obtain a plurality of classification results.
3. The machine supervised learning method of claim 1, wherein: before the step S2, the method further includes:
step S2a: manually setting the machine weight corresponding to each feature dimension of the feature vector.
4. The machine supervised learning method of claim 1, wherein: the step S3 specifically includes: substituting the feature vectors of the black sample set and the white sample set after the sample collision into a machine supervised learning algorithm for training, so as to obtain the optimized machine learning model.
5. The machine supervised learning method of claim 4, wherein: the method further comprises the following step after the step S3:
step S4: performing prediction classification on a data set to be classified by using the optimized machine learning model.
6. A machine supervised learning device, characterized by comprising a creating module, a collision module and a learning module;
the creating module is used for creating a black sample set and a white sample set according to the results of manually judging a plurality of classification results of the machine learning model, wherein each sample data in the black sample set and the white sample set corresponds to a feature vector, and each feature vector comprises a plurality of feature dimensions;
the collision module is used for performing sample collision on the black sample set and the white sample set so as to modify the machine weight of each feature dimension;
the learning module is used for optimizing the machine learning model by using the black sample set and the white sample set after the sample collision.
7. The machine supervised learning device of claim 6, wherein: the learning module is further used for performing machine learning by utilizing a training sample set to obtain the machine learning model; the machine supervision learning device further comprises a classification module which is used for carrying out prediction classification on the test sample set by utilizing the machine learning model so as to obtain a plurality of classification results.
8. The machine supervised learning device of claim 7, wherein: the learning module is specifically used for substituting the feature vectors of the black sample set and the white sample set after the sample collision into a machine supervised learning algorithm for training, so as to obtain the optimized machine learning model.
9. The machine supervised learning device of claim 8, wherein: the machine supervised learning algorithm is a decision tree algorithm.
10. The machine supervised learning device of claim 8, wherein: the classification module is further used for performing prediction classification on a data set to be classified by using the optimized machine learning model.
CN202210008442.5A 2022-01-06 2022-01-06 Machine supervision learning method and device Pending CN114462621A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210008442.5A CN114462621A (en) 2022-01-06 2022-01-06 Machine supervision learning method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210008442.5A CN114462621A (en) 2022-01-06 2022-01-06 Machine supervision learning method and device

Publications (1)

Publication Number Publication Date
CN114462621A true CN114462621A (en) 2022-05-10

Family

ID=81409515

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210008442.5A Pending CN114462621A (en) 2022-01-06 2022-01-06 Machine supervision learning method and device

Country Status (1)

Country Link
CN (1) CN114462621A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111178533A (en) * 2018-11-12 2020-05-19 第四范式(北京)技术有限公司 Method and device for realizing automatic semi-supervised machine learning
CN111222648A (en) * 2020-01-15 2020-06-02 深圳前海微众银行股份有限公司 Semi-supervised machine learning optimization method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220510