CN108229692B - Machine learning identification method based on dual contrast learning - Google Patents

Machine learning identification method based on dual contrast learning

Publication number
CN108229692B
Authority
CN
China
Prior art keywords
sample
machine learning
comparison
samples
learning model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810128018.8A
Other languages
Chinese (zh)
Other versions
CN108229692A (en)
Inventor
徐传运
许洲
张杨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Maoqiao Technology Co.,Ltd.
Original Assignee
Chongqing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Technology
Priority to CN201810128018.8A
Publication of CN108229692A
Application granted
Publication of CN108229692B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a machine learning identification method based on dual contrast learning. Using a limited number of multimedia data samples of known categories, the method applies different input arrangement orders of the comparison samples to give a machine learning model f1 multiple differentiated rounds of learning training for multimedia data category identification. The machine learning model f1 is designed as a combined model framework of convolutional neural network models and/or fully-connected neural network models. This greatly reduces the dependence on massive training samples and makes it easy to extend category identification to multimedia data categories that were never trained. The method thereby addresses the limited practical applicability and universality of existing multimedia data classification methods, which depend on massive training samples and cannot directly classify untrained categories, and it can be applied widely and effectively to more specific multimedia data classification scenarios.

Description

Machine learning identification method based on dual contrast learning
Technical Field
The invention relates to the technical field of multimedia data processing and machine learning, in particular to a machine learning identification method based on dual contrast learning.
Background
Multimedia is a combination of multiple media. In a computer system, multimedia refers to a human-computer interactive medium for information exchange and transmission that combines two or more media; the media used include text, pictures, photos, sound, animation and film, together with interactive functions provided by programs.
With the advent of the big data age, techniques for classifying and mining massive multimedia data have become particularly important. In massive data mining, how to use information classified and mined from existing data to guide the classification and mining of new data has become a new research hotspot. In particular, when the number of samples for some tasks is small, multi-task learning can effectively reduce the time cost of classifying and mining massive data and improve the accuracy of the information obtained. For example, in developing a face recognition-based residential community access control system, if the face images of each owner are treated as an independent image data category, the system must classify and recognize the face image collected at the access point, determine which owner's face it is (i.e., which image data category it belongs to), and then decide whether to grant access.
Deep learning-based methods have proved in practice to be effective and robust for information classification. Deep neural networks (e.g., deep convolutional neural networks) are the most representative machine learning methods. Deep learning models typically have tens of learnable data processing layers and hundreds of thousands, or even millions, of learnable parameters. Since such a large number of parameters constitutes an extremely large learning space, a large amount of training data is usually required to obtain optimal model parameters: a training data set must be constructed whose number of samples is usually in the tens of thousands or more. Constructing such a training set is very difficult and expensive in practical applications. For example, in the face recognition-based community access control task, if the face images of each owner are treated as an independent image data category, it is quite unrealistic to collect tens of thousands of face image training samples for each owner when training the machine learning model for classification. This hunger of deep models for large amounts of data makes deep learning methods difficult to apply in many specific fields, or difficult to realize reliably.
When a deep learning method is used for a classification task, the traditional approach requires that the classes seen in training match the classes of the production samples; that is, the model can only classify classes it has learned. If samples of a new class need to be classified, the machine learning model must be retrained, or must undergo some adaptive training. For example, in the face recognition-based community access control task, if the face images of each owner are treated as an independent image data category, the current deep learning approach requires learning training on the face images of every owner. When a new owner appears, even if the new owner's face images are added directly to the identification comparison sample database, the machine learning model has never been trained on them, so when the new owner's face image is collected again at the entrance, the model still cannot directly classify and identify the new owner from the face image data in the comparison sample database. As a result, training machine learning models based on deep learning methods consumes a large amount of computational resources and a long training time, which limits the convenience and universality of such models in practical applications.
Disclosure of Invention
In view of the defects in the prior art, the technical problem to be solved by the invention is to provide a machine learning identification method based on dual contrast learning that addresses the limited practical applicability of existing multimedia data classification methods, which depend on a large number of training samples, and further addresses their limited universality, which stems from being unable to directly classify and identify categories that have not been trained.
In order to solve the technical problems, the invention adopts the following technical means:
A machine learning identification method based on dual contrast learning selects target identification samples and comparison samples from a plurality of multimedia data of different known categories as the input of a machine learning model f1, performs learning training on the machine learning model f1, and uses the trained machine learning model f1 to perform category identification on multimedia data to be identified. The machine learning model f1 includes a first sub-learning model fDP and a second sub-learning model fDE; the first sub-learning model fDP is a convolutional neural network model or a fully-connected neural network model, and the second sub-learning model fDE is a convolutional neural network model or a fully-connected neural network model. The selected comparison samples comprise multimedia data of two or more different categories, and an input arrangement order is set for the comparison samples input to the machine learning model f1. Following this input arrangement order, the target identification samples and the comparison samples are combined according to a preset combination rule, forming a plurality of data sample combinations that preserve the input arrangement order of the comparison samples. Each data sample combination is used as the input of a second sub-learning model fDE; the outputs of the corresponding second sub-learning models fDE are arranged according to the input arrangement order of the comparison samples to form a data vector, which serves as the input of the first sub-learning model fDP; and the output of the first sub-learning model fDP serves as the result vector of the machine learning model. Through learning training, each result vector element in the result vector output by the trained machine learning model f1 comes to characterize the correlation between the target identification sample and the category to which the comparison sample at the corresponding arrangement position belongs, so that multimedia data samples of known categories can be used, under different input arrangement orders of the comparison samples, to train the machine learning model f1 multiple times.
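The data flow described above can be sketched in a few lines of Python. This is only an illustrative sketch: `f_de` and `f_dp` are hypothetical stand-ins (a negative squared distance and a softmax) for the trained convolutional or fully-connected networks the method actually calls for, and samples are assumed to be plain feature lists.

```python
import math

def f_de(target, comparison):
    """Hypothetical stand-in for the second sub-model fDE: scores one
    (target, comparison) combination. A real fDE would be a trained network."""
    return -sum((t - c) ** 2 for t, c in zip(target, comparison))

def f_dp(scores):
    """Hypothetical stand-in for the first sub-model fDP: maps the ordered
    vector of fDE outputs to a result vector (softmax, for illustration only)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def f1(target, comparisons):
    """Combined model: one fDE output per comparison sample, kept in the
    input arrangement order, then passed through fDP."""
    de_outputs = [f_de(target, a) for a in comparisons]
    return f_dp(de_outputs)

# Three comparison samples in a fixed input order; element i of the result
# vector refers to the comparison sample at position i.
target = [0.9, 0.1]
comparisons = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
result = f1(target, comparisons)
best = max(range(len(result)), key=result.__getitem__)
print(best)  # position of the comparison sample with the highest correlation
```

The point of the sketch is the ordering contract: position i in the result vector always refers to the comparison sample at position i of the input.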
In the above machine learning identification method based on dual contrast learning, as a preferred scheme, the one or more target identification samples input to the machine learning model f1 belong to the same category;
if a single target identification sample is input to the machine learning model f1, then when the target identification sample and the comparison samples are combined according to the preset combination rule, the preset combination rule is one of the following:
combination rule ①: establishing a pairing relationship between the target identification sample and each comparison sample, and pairing them respectively;
combination rule ②: first dividing the comparison samples by category, then establishing a combination relationship between the target identification sample and the comparison samples of each category, and combining them respectively;
if a plurality of target identification samples are input to the machine learning model f1, then when the target identification samples and the comparison samples are combined according to the preset combination rule, the preset combination rule is one of the following:
combination rule a: establishing a pairing relationship between each target identification sample and each comparison sample, and pairing them respectively;
combination rule b: first dividing the comparison samples by category, then establishing a combination relationship between each target identification sample and the comparison samples of each category, and combining them respectively;
combination rule c: establishing a pairing relationship between all target identification samples taken as a whole and each comparison sample, and pairing them respectively;
combination rule d: first dividing the comparison samples by category, then establishing a combination relationship between all target identification samples taken as a whole and the comparison samples of each category, and combining them respectively.
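The four multi-target combination rules can be illustrated with a small helper. This is a sketch under assumptions: the function names and the `(label, sample)` tuple representation are hypothetical, and category groups are ordered by first appearance so that the input arrangement order is preserved.

```python
def group_by_class(comparisons):
    """Group (label, sample) pairs by label, keeping first-appearance order."""
    groups = {}
    for label, sample in comparisons:
        groups.setdefault(label, []).append(sample)
    return list(groups.items())

def combine(targets, comparisons, rule):
    """Build the data sample combinations for combination rules a to d."""
    if rule == "a":  # each target paired with each comparison sample
        return [(t, s) for t in targets for _, s in comparisons]
    if rule == "b":  # each target combined with each category's samples
        return [(t, tuple(g)) for t in targets for _, g in group_by_class(comparisons)]
    if rule == "c":  # all targets as a whole paired with each comparison sample
        return [(tuple(targets), s) for _, s in comparisons]
    if rule == "d":  # all targets as a whole combined with each category
        return [(tuple(targets), tuple(g)) for _, g in group_by_class(comparisons)]
    raise ValueError(f"unknown combination rule: {rule}")

targets = ["r1", "r2"]
comparisons = [("A", "a1"), ("B", "b1"), ("A", "a2")]
for rule in "abcd":
    print(rule, combine(targets, comparisons, rule))
```

With a single target identification sample, rules a and c reduce to combination rule ①, and rules b and d reduce to combination rule ②.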
In the above machine learning identification method based on dual contrast learning, as a preferred scheme, in the process of performing learning training on the machine learning model f1, the target identification samples and the comparison samples are selected from a preset multimedia data sample library. Each time, a part of the multimedia data of known categories contained in the multimedia data sample library is selected as target identification samples and comparison samples to train the machine learning model f1, and target identification samples and comparison samples are selected from the multimedia data sample library multiple times to train the machine learning model f1, ensuring that the selection of target identification samples and comparison samples traverses every multimedia data category contained in the multimedia data sample library and that the comparison sample selection operation is performed at least H times for each multimedia data category in the library, where H is a threshold on the number of training selections set by the user.
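A selection schedule satisfying the "at least H times per category" condition might look like the sketch below. The sample library layout (a dict from category label to samples), the function name, and the policy of favouring the least-selected categories are all assumptions made for illustration; the source only requires that every category be covered at least H times.

```python
import random

def training_rounds(library, classes_per_round, h, seed=0):
    """Yield (target, target_class, comparisons) until every category in
    `library` has been selected as a comparison category at least `h` times."""
    rng = random.Random(seed)
    counts = {label: 0 for label in library}
    while min(counts.values()) < h:
        # Assumed policy: pick the least-selected categories, ties broken randomly.
        labels = sorted(library, key=lambda c: (counts[c], rng.random()))[:classes_per_round]
        rng.shuffle(labels)  # the input arrangement order varies between rounds
        comparisons = [(c, rng.choice(library[c])) for c in labels]
        target_class = rng.choice(labels)
        target = rng.choice(library[target_class])
        for c in labels:
            counts[c] += 1
        yield target, target_class, comparisons

library = {c: [c + "1", c + "2"] for c in "ABCDE"}
rounds = list(training_rounds(library, classes_per_round=2, h=3))
print(len(rounds), rounds[0])
```

In this toy version the target may coincide with the comparison sample drawn from its own category; a fuller implementation would likely exclude that case.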
In the above machine learning identification method based on dual contrast learning, as a preferred scheme, the specific way of using the trained machine learning model f1 to perform category identification on multimedia data to be identified is as follows:
the multimedia data that is the object to be identified is taken as the sample to be identified and, together with comparison samples selected from a plurality of multimedia data of different known categories, is used as the input of the trained machine learning model f1. The selected comparison samples comprise multimedia data of two or more different categories, and an input arrangement order is set for the comparison samples input to the machine learning model f1. Following this input arrangement order, the samples to be identified and the comparison samples are combined according to a preset combination rule, forming a plurality of data sample combinations that preserve the input arrangement order of the comparison samples. Each data sample combination is used as the input of a second sub-learning model fDE; the outputs of the corresponding second sub-learning models fDE are arranged according to the input arrangement order of the comparison samples to form a data vector, which serves as the input of the first sub-learning model fDP; and the output of the first sub-learning model fDP serves as the result vector of the machine learning model. In the category identification process, each result vector element in the result vector output by the machine learning model f1 characterizes the correlation between the sample to be identified and the category to which the comparison sample at the corresponding arrangement position belongs, so that the category of the sample to be identified is determined from these correlations.
In the machine learning identification method based on dual contrast learning, as a preferred scheme, the one or more samples to be identified that are obtained belong to the same category;
if a single sample to be identified is input to the machine learning model f1, then when the sample to be identified and the comparison samples are combined according to the preset combination rule, the preset combination rule is one of the following:
combination rule ①: establishing a pairing relationship between the sample to be identified and each comparison sample, and pairing them respectively;
combination rule ②: first dividing the comparison samples by category, then combining the sample to be identified with the comparison samples of each category respectively;
if a plurality of samples to be identified are input to the machine learning model f1, then when the samples to be identified and the comparison samples are combined according to the preset combination rule, the preset combination rule is one of the following:
combination rule a: establishing a pairing relationship between each sample to be identified and each comparison sample, and pairing them respectively;
combination rule b: first dividing the comparison samples by category, then establishing a combination relationship between each sample to be identified and the comparison samples of each category, and combining them respectively;
combination rule c: establishing a pairing relationship between all samples to be identified taken as a whole and each comparison sample, and pairing them respectively;
combination rule d: first dividing the comparison samples by category, then establishing a combination relationship between all samples to be identified taken as a whole and the comparison samples of each category, and combining them respectively.
In the machine learning identification method based on the dual contrast learning, as a preferred scheme, one or more samples to be identified are obtained and belong to the same category;
if a plurality of samples to be identified are obtained, they can be input to the machine learning model f1 in batches for identification processing, and the specific batch input mode is one of the following:
batch input mode ①: each sample to be identified forms a sample input set together with all the comparison samples, and the resulting plurality of sample input sets are input to the machine learning model f1 one after another;
batch input mode ②: the comparison samples are first divided by category, one comparison sample is selected from each category, and one sample to be identified is then selected to form a sample input set; the plurality of sample input sets formed in this way are input to the machine learning model f1 one after another;
batch input mode ③: the comparison samples are first divided by category, and one comparison sample selected from each category forms a sample input set together with all the samples to be identified; the plurality of sample input sets formed in this way are input to the machine learning model f1 one after another;
batch input mode ④: all the comparison samples and all the samples to be identified form a single sample input set, which is used as the input of the machine learning model f1.
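The four batch input modes can be sketched as follows. The text does not say which comparison sample is drawn from each category when several sample input sets are formed, so the round-robin choice in `one_per_class` is purely an assumption, as are the function names and the `(label, sample)` representation.

```python
def class_groups(comparisons):
    """Group (label, sample) pairs by label, keeping first-appearance order."""
    groups = {}
    for label, sample in comparisons:
        groups.setdefault(label, []).append(sample)
    return groups

def one_per_class(comparisons, k):
    """Assumed policy: take the k-th sample of every category, cycling."""
    return [(label, g[k % len(g)]) for label, g in class_groups(comparisons).items()]

def batch_inputs(to_identify, comparisons, mode):
    """Return a list of sample input sets: (samples to identify, comparisons)."""
    if mode == 1:  # every sample to identify with all comparison samples
        return [([x], list(comparisons)) for x in to_identify]
    if mode == 2:  # one comparison per category plus one sample to identify
        return [([x], one_per_class(comparisons, k)) for k, x in enumerate(to_identify)]
    if mode == 3:  # one comparison per category plus all samples to identify
        rounds = max(len(g) for g in class_groups(comparisons).values())
        return [(list(to_identify), one_per_class(comparisons, k)) for k in range(rounds)]
    if mode == 4:  # a single set with everything
        return [(list(to_identify), list(comparisons))]
    raise ValueError(f"unknown batch input mode: {mode}")

to_identify = ["x1", "x2"]
comparisons = [("A", "a1"), ("A", "a2"), ("B", "b1")]
for mode in (1, 2, 3, 4):
    print(mode, batch_inputs(to_identify, comparisons, mode))
```

Modes ① to ③ yield several input sets and therefore several result vectors per sample to be identified, which is why a rule for merging multiple outputs is needed.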
In the machine learning identification method based on dual contrast learning, as a preferred scheme, the specific way of performing category identification on the result vectors output multiple times by the machine learning model f1 is one of the following:
multiple-output category identification method ①: the result vector elements in every output result vector are compared statistically, the result vector element characterizing the highest correlation is found, and the category of the comparison sample at the arrangement position corresponding to that element is determined to be the category of the sample to be identified;
multiple-output category identification method ②: the result vectors output each time by the machine learning model f1 are accumulated to obtain an accumulated result vector; the correlations characterized by the elements of the accumulated result vector are then compared statistically, the element characterizing the highest correlation is found, and the category of the comparison sample at the arrangement position corresponding to that element is determined to be the category of the sample to be identified.
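The two decision rules can differ on the same outputs, as the short sketch below shows. It assumes, for simplicity, that every run used the same comparison categories in the same input order, so that one list of labels applies to all result vectors; the function names and numbers are illustrative only.

```python
def identify_by_max(result_vectors, labels):
    """Method 1: find the single highest element across all output vectors."""
    best_value, best_label = float("-inf"), None
    for vector in result_vectors:
        for value, label in zip(vector, labels):
            if value > best_value:
                best_value, best_label = value, label
    return best_label

def identify_by_accumulation(result_vectors, labels):
    """Method 2: sum the vectors element-wise, then take the highest total."""
    totals = [sum(column) for column in zip(*result_vectors)]
    best = max(range(len(totals)), key=totals.__getitem__)
    return labels[best]

labels = ["A", "B", "C"]
result_vectors = [
    [0.1, 0.8, 0.1],  # one run strongly favours B
    [0.7, 0.2, 0.1],  # two runs favour A
    [0.7, 0.2, 0.1],
]
print(identify_by_max(result_vectors, labels))           # B
print(identify_by_accumulation(result_vectors, labels))  # A
```

Method ① follows the single most confident output, while method ② rewards a category that scores consistently across runs.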
In the machine learning identification method based on dual contrast learning, as a preferred scheme, the comparison samples are selected from a preset multimedia data sample library, and each time the number L of comparison sample categories selected as input to the machine learning model f1 is smaller than the number S of categories of known-category multimedia data contained in the multimedia data sample library, L and S being integers greater than 1. Comparison samples are selected from the multimedia data sample library multiple times and input to the machine learning model f1 together with the sample to be identified, so that category identification is performed on the sample to be identified multiple times, ensuring that the selection of comparison samples traverses every multimedia data category contained in the multimedia data sample library and that the comparison sample selection operation is executed at least K times for each multimedia data category in the library, where K is a set threshold on the number of identification selections. Then, the result vector elements in the result vectors output by the machine learning model f1 in each category identification are compared statistically, the result vector element characterizing the highest correlation is found, and the category of the comparison sample at the arrangement position corresponding to that element is determined to be the category of the sample to be identified.
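A sketch of this local selection scheme follows. The model is passed in as a callable; `toy_model` is a hypothetical stand-in that returns one score per comparison sample, and the policy of favouring the least-visited categories is an assumption. The sketch also assumes that scores from different runs are directly comparable, which the highest-element rule implicitly requires.

```python
import random

def identify_with_subsets(sample, library, model, l, k, seed=0):
    """Compare `sample` against l < S categories per run until every category
    has been used at least k times; return the label of the best element."""
    assert 1 < l < len(library)
    rng = random.Random(seed)
    counts = {label: 0 for label in library}
    best_score, best_label = float("-inf"), None
    while min(counts.values()) < k:
        # Assumed policy: least-visited categories first, ties broken randomly.
        labels = sorted(library, key=lambda c: (counts[c], rng.random()))[:l]
        comparisons = [rng.choice(library[c]) for c in labels]
        result = model(sample, comparisons)  # one element per comparison
        for label, score in zip(labels, result):
            counts[label] += 1
            if score > best_score:
                best_score, best_label = score, label
    return best_label

def toy_model(sample, comparisons):
    """Hypothetical stand-in for the trained model f1 on scalar samples."""
    return [-(sample - c) ** 2 for c in comparisons]

library = {"a": [0.0], "b": [1.0], "c": [2.0], "d": [3.0]}
print(identify_with_subsets(2.1, library, toy_model, l=2, k=2))  # c
```

Each run processes only L comparison categories, so the per-run cost stays bounded even when the library holds many categories.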
Compared with the prior art, the invention has the following beneficial effects:
1. The machine learning identification method based on dual contrast learning can use a limited number of multimedia data samples of known categories, under different input arrangement orders of the comparison samples, to give the machine learning model f1 multiple differentiated rounds of learning training. In other words, a small number of training samples can provide a large amount of learning training for the machine learning model and achieve the expected category identification effect, which greatly reduces the dependence on massive training samples and addresses the limited practical applicability of existing multimedia data classification methods that rely on a large number of training samples.
2. With the machine learning identification method based on dual contrast learning, even if a certain multimedia data category has not undergone learning training, it is only necessary to add multimedia data samples of that category to the identification comparison sample database. When the sample to be identified is multimedia data of that category, the result vector output by the machine learning model f1 can still reflect both the difference between the sample to be identified and the comparison samples of other categories and the correlation between the sample to be identified and the comparison samples of its own category, so the category of the sample to be identified can still be determined from the correlations. Category identification can thus be conveniently extended to multimedia data categories that have not undergone learning training, which addresses the limited universality caused by being unable to directly classify untrained categories.
3. The machine learning identification method based on dual contrast learning can adopt a category identification mode based on local data selection: in each category identification run, the number L of comparison sample categories input to the machine learning model f1 is smaller than the number S of categories of known-category multimedia data contained in the multimedia data sample library, and comparison samples are selected multiple times and input to the machine learning model f1 together with the sample to be identified for multiple category identification runs. This reduces the amount of data the machine learning model f1 must process in each run and avoids the machine learning model f1 processing too slowly or being unable to process the data effectively.
4. The machine learning identification method based on dual contrast learning addresses the limited practical applicability and universality of existing multimedia data classification methods, which depend on a large number of training samples and cannot directly classify untrained categories. It can be applied more widely and effectively to more specific multimedia data classification scenarios and has broad prospects for technical application and adoption.
Drawings
Fig. 1 is a schematic flow chart of a machine learning training process in the machine learning identification method of the present invention.
Fig. 2 is a schematic flow chart of another machine learning training process in the machine learning identification method of the present invention.
Fig. 3 is a flow chart illustrating a multimedia data class identification process in the machine learning identification method according to the present invention.
Detailed Description
To address the limited practical applicability of existing multimedia data classification methods, which rely on a large number of training samples, the identification principle of the existing methods must first be analyzed to find the cause of the problem. In existing classification methods, a sample to be identified is usually compared separately with a comparison sample of a known category, and either the similarity between them or the difference distance between them is calculated in order to judge whether the two belong to the same category, thereby identifying the category of the sample to be identified. When applied to multimedia data classification and identification, this approach runs into limitations in practice:
On the one hand, data samples of the same multimedia category may differ greatly from one another. For example, in the face recognition-based community access control task, if the face images of each owner are treated as an independent image data category, the system needs to classify and recognize face images; but even face images of the same owner easily differ because of conditions such as ambient light, shooting angle, and the owner's makeup, and training samples taken under a given set of such conditions directly help only in computing the similarity or difference distance for samples to be identified under the same conditions. As a result, a large number of face images of each owner under different ambient light, shooting angles, and makeup are needed as training samples and identification comparison data to train the machine learning model before a good face recognition effect can be ensured. This increases the practical difficulty of model training and limits the technical application.
On the other hand, in existing classification methods, training samples from categories other than that of the sample to be identified can hardly exert a differentiated influence on the identification result. For example, in the face recognition-based community access control task, if the face images of each owner are treated as an independent image data category, then when the face image of one owner is identified or used for training, comparison with the face images of any other owner in the sample database yields the same outcome: insufficient similarity or a large difference distance. A large number of comparison samples from other categories therefore cannot bring any meaningful distinguishing influence to the identification or training result for the sample to be identified. This indirectly means that identification or training relies only on comparison samples of the same category, which increases the dependence of the machine learning identification method on a large number of training samples.
These two limitations lead to a further consequence: for a data category that has not undergone learning training in a multimedia data classification and identification scenario, the existing machine learning identification methods can hardly perform effective category identification on samples to be identified of that category.
In view of the above analysis, and based on the technical idea of solving these problems, the invention provides a machine learning identification method based on dual contrast learning, which trains a machine learning model in a manner different from the prior art. As shown in FIG. 1, a target identification sample R and comparison samples a are selected from multimedia data of a plurality of different known classes as the input of a machine learning model f_1, learning training is performed on f_1, and the trained machine learning model f_1 is then used to perform class identification on multimedia data to be identified. The machine learning model f_1 includes a first sub-learning model f_DP and a second sub-learning model f_DE; the first sub-learning model f_DP is a convolutional neural network model or a fully-connected neural network model, and the second sub-learning model f_DE is likewise a convolutional neural network model or a fully-connected neural network model. The selected comparison samples comprise multimedia data of two or more different classes, and a comparison sample input arrangement order is set for inputting the comparison samples into f_1. For example, in FIG. 1 the comparison samples listed in the comparison sample input arrangement order are denoted a_1, a_2, …, a_n, where n is the number of comparison samples input to f_1. The target identification sample R and the comparison samples a_1, a_2, …, a_n are then combined according to a preset combination rule; in the simple example of FIG. 1, R is combined with each of a_1, a_2, …, a_n in turn, following the comparison sample input arrangement order, to form a plurality of data sample combinations that retain that order. Each data sample combination is used as an input of the second sub-learning model f_DE, and the corresponding outputs DE_1, DE_2, …, DE_n of f_DE are arranged according to the comparison sample input arrangement order to form a data vector, which is used as the input of the first sub-learning model f_DP; the output of f_DP is taken as the result vector C of the machine learning model. Through learning training, each result vector element c_i ∈ C (i ∈ {1, 2, …, n}) in the result vector C output by the trained model f_1 characterizes the correlation between the target identification sample and the class to which the comparison sample a_i (i ∈ {1, 2, …, n}) at the corresponding arrangement order position belongs. As a result, the machine learning model f_1 can be trained many times with multimedia data samples of known classes by using different comparison sample input arrangement orders. For example, FIG. 2 uses the same n comparison samples as FIG. 1 but changes the comparison sample input arrangement order, moving the comparison sample originally at position a_1 to position a_4, so that a different learning training pass is performed on f_1.
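The data flow described above can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the patent allows f_DE and f_DP to be convolutional or fully-connected networks, whereas here each is a single hypothetical fully-connected layer with random, untrained weights, and samples are assumed to be plain feature vectors.

```python
import numpy as np

def f_de(pair, w_de):
    # Hypothetical second sub-model f_DE: one fully connected unit
    # mapping a (target, comparison) pair to a scalar DE_i.
    return np.tanh(pair @ w_de)

def f_dp(de_vector, w_dp):
    # Hypothetical first sub-model f_DP: one fully connected layer with a
    # sigmoid, mapping (DE_1, ..., DE_n) to the result vector C.
    return 1.0 / (1.0 + np.exp(-(w_dp @ de_vector)))

def f_1(target, comparisons, w_de, w_dp):
    # Pair the target with each comparison sample, keeping the input order.
    de = np.array([f_de(np.concatenate([target, a]), w_de) for a in comparisons])
    return f_dp(de, w_dp)

rng = np.random.default_rng(0)
d, n = 8, 5                                   # feature size, number of comparison samples
target = rng.normal(size=d)
comparisons = [rng.normal(size=d) for _ in range(n)]
w_de = rng.normal(size=2 * d)                 # untrained, random weights
w_dp = rng.normal(size=(n, n))

C = f_1(target, comparisons, w_de, w_dp)      # one element c_i per comparison sample

# Moving a_1 to position 4 (as FIG. 2 does) yields a new training input.
reordered = comparisons[1:4] + comparisons[:1] + comparisons[4:]
C_reordered = f_1(target, reordered, w_de, w_dp)
```

Because f_DP mixes all DE_i values, each element of the result vector depends on where every comparison sample sits in the input order, which is what lets a reordering act as a distinct training input.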
Compared with the prior art, the machine learning identification method based on dual contrast learning follows a different technical approach. Multimedia data of known classes are selected as the target identification sample and the comparison samples and are input into the machine learning model f_1 for learning training, and the selected comparison samples must contain multimedia data of two or more different classes so that the differences between comparison samples of different classes are reflected in the input arrangement order. At the same time, because the purpose of training f_1 is that each result vector element in the result vector output by the trained model characterizes the correlation between the target identification sample and the class to which the comparison sample at the corresponding arrangement order position belongs, the machine learning model f_1 is designed to include a first sub-learning model f_DP and a second sub-learning model f_DE. Each of f_DP and f_DE may be a convolutional neural network model or a fully-connected neural network model; the convolutional neural network may be, for example, a residual neural network model (Residual Neural Network, abbreviated ResNet) or a dense convolutional network model (Dense Convolutional Network, abbreviated DenseNet), and the fully-connected neural network may be any neural network model with fully-connected layers commonly used by those skilled in the art. When the machine learning model f_1 built from this combined framework of f_DP and f_DE is trained, the target identification sample (corresponding to the sample to be identified during class identification) is combined with the comparison samples to form a plurality of data sample combinations that retain the comparison sample input arrangement order. These combinations are input into the second sub-learning model f_DE, and the resulting outputs are ordered according to the comparison sample input arrangement order to form a data vector that serves as the input of the first sub-learning model f_DP, whose output is the output of f_1. This ensures that the order of the elements of the output result vector corresponds to the input arrangement order of the comparison samples. Moreover, because f_DP and f_DE may be convolutional or fully-connected neural network models, each result vector element is influenced by the comparison sample input arrangement order, so that the correlation represented by each result vector element output by the trained model f_1 depends on that order. Consequently, multimedia data in the comparison samples that belong to the same class as the target identification sample influence the training result of f_1 differently depending on their position in the input arrangement order. Each such item of multimedia data, when used as a comparison sample, can therefore take part in multiple differentiated training passes of f_1 simply by adjusting its position in the comparison sample input arrangement order.
Likewise, multimedia data in the comparison samples that belong to classes different from that of the target identification sample also influence the training result of f_1 differently depending on their position in the comparison sample input arrangement order, so each such item, when used as a comparison sample, can also take part in multiple differentiated training passes of f_1 by adjusting its position in that order. Thus, with a given amount of multimedia data samples of known classes, the machine learning model f_1 can undergo many differentiated training passes by using different comparison sample input arrangement orders; that is, a small number of training samples suffices to train the model extensively and reach the expected class identification performance. This greatly reduces the dependence on massive training samples and addresses the limitation that existing machine learning methods for multimedia data classification require large numbers of training samples in practical applications.
In a specific application, when the machine learning model f_1 is trained with the machine learning identification method of the invention, the target identification sample and the comparison samples are selected from a preset multimedia data sample library. Each time, only a part of the multimedia data of known classes contained in the library is selected as the target identification sample and the comparison samples for training f_1, and the selection is repeated multiple times so that it traverses every multimedia data class contained in the library and the comparison sample selection operation is executed at least H times for each class, where H is a training selection count threshold. Selecting only a part of the known-class multimedia data in the library for each training pass is a local-selection training scheme. It is adopted because training f_1 on a global selection of all classes in the library tends to produce a huge amount of comparison computation and low computational efficiency, and if the neural network of f_1 has too many layers, f_1 may be unable to process such a large amount of data effectively.
Therefore, selecting only a part of the known-class multimedia data in the library as the target identification sample and the comparison samples each time, and repeating the selection multiple times, reduces the amount of data that f_1 must process in each training pass and avoids the problem that processing becomes too inefficient or cannot be carried out effectively. However, with local selection the multimedia data contained in the library may not be fully used during training. Furthermore, because each result vector element output by f_1 is influenced by the comparison sample input arrangement order, comparison samples of the same class may affect the correlation with the target identification sample differently under different input arrangement orders, which may affect the training result of f_1. To secure the training effect of f_1 as far as possible, it is therefore preferable that the selection of the target identification sample and the comparison samples traverses every multimedia data class contained in the library and that the comparison sample selection operation is executed at least H times for each class, where H is the training selection count threshold, whose specific value can be determined from practical application experience.
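One way to realize the local selection with threshold H is sketched below. The sampling strategy (favoring the least-selected classes), the function name `training_rounds`, and the toy library of string identifiers are all assumptions made for illustration; the patent only requires that every class is traversed and selected at least H times.

```python
import random
from collections import Counter

def training_rounds(library, classes_per_round, H, seed=0):
    """library maps class name -> list of samples (at least 2 per class)."""
    rng = random.Random(seed)
    counts = Counter({c: 0 for c in library})
    rounds = []
    while min(counts.values()) < H:
        # Favor the least-selected classes so every class reaches H.
        chosen = sorted(library, key=lambda c: (counts[c], rng.random()))
        chosen = chosen[:classes_per_round]
        rng.shuffle(chosen)                      # comparison sample input order
        target_class = rng.choice(chosen)
        # Target and same-class comparison sample are distinct library items.
        target, same = rng.sample(library[target_class], 2)
        comparisons = [(same if c == target_class else rng.choice(library[c]), c)
                       for c in chosen]
        counts.update(chosen)
        rounds.append((target, target_class, comparisons))
    return rounds, counts

# Toy library: 5 classes, 3 sample identifiers each.
library = {name: [f"{name}{k}" for k in range(3)]
           for name in ["cat", "dog", "bird", "fish", "horse"]}
rounds, counts = training_rounds(library, classes_per_round=3, H=2)
```

Each round uses only three of the five classes, yet the loop ends only once every class has served as a comparison class at least H times.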
The machine learning model f_1 obtained through the above learning training can be used for class identification of multimedia data. Specifically, class identification of multimedia data to be identified with the trained model f_1 proceeds as follows. As shown in FIG. 3, the multimedia data to be identified is acquired as the sample to be identified R_x and, together with comparison samples a selected from multimedia data of a plurality of different known classes, is used as the input of the trained model f_1. The selected comparison samples comprise multimedia data of two or more different classes, and a comparison sample input arrangement order is set for inputting them into f_1. For example, in FIG. 3 the comparison samples listed in the comparison sample input arrangement order are denoted a_1, a_2, …, a_n, where n is the number of comparison samples input to f_1. The sample to be identified R_x and the comparison samples a_1, a_2, …, a_n are then combined according to a preset combination rule; in the simple example of FIG. 3, R_x is combined with each of a_1, a_2, …, a_n in turn, following the comparison sample input arrangement order, to form a plurality of data sample combinations that retain that order. Each data sample combination is used as an input of the second sub-learning model f_DE, and the corresponding outputs DE_1, DE_2, …, DE_n of f_DE are arranged according to the comparison sample input arrangement order to form a data vector, which is used as the input of the first sub-learning model f_DP; the output of f_DP is taken as the result vector C of the machine learning model. During class identification, each result vector element c_i ∈ C (i ∈ {1, 2, …, n}) in the result vector C output by f_1 characterizes the correlation between the sample to be identified R_x and the class to which the comparison sample a_i (i ∈ {1, 2, …, n}) at the corresponding arrangement order position belongs, so the class of R_x can be determined from this correlation. For example, if the trained model f_1 is such that a smaller value of c_i indicates a higher degree of correlation with the class of the comparison sample at the corresponding position, then, as shown in FIG. 3, when determining the class y_x of the sample to be identified R_x, the class y_i of the comparison sample a_i at the position of the smallest element in the result vector C is determined to be the class of R_x, that is,

$$y_x = y_{i^*}, \quad i^* = \arg\min_{i} c_i, \quad i \in \{1, 2, \dots, n\}.$$
In a specific implementation, the learning training process and the multimedia data class identification process of the machine learning identification method based on dual contrast learning in this embodiment may be loaded, by computer programming, into a processor of a machine learning identification device, so that the processor is configured to execute a machine learning training program for the training process or a multimedia data class identification program for the class identification process. A machine learning class identification device designed on the basis of the machine learning identification method of the invention naturally shares its technical features and technical advantages.
In implementing the machine learning identification method and its device, it is easy to arrange, in the actual training operation, that each result vector element output by the trained model f_1 characterizes the correlation between the target identification sample and the class of the comparison sample at the corresponding arrangement order position. For example, during training, each result vector element at the corresponding position in the result vector output by f_1 is assigned a preset expected correlation value according to whether the target identification sample belongs to the same class as the comparison sample at that position in the input arrangement order: the same class is given a positive expected correlation value (for example, "0"), and different classes are given a negative expected correlation value (for example, "1"). Through learning training, f_1 then learns to distinguish whether the target identification sample and a comparison sample are of the same class, and expresses, through each result vector element in the output result vector, the correlation between the target identification sample and the class of the comparison sample at the corresponding position.
After such training, when f_1 is used for class identification of multimedia data, each result vector element in the output result vector distinguishes and characterizes well the correlation between the sample to be identified and the class of the comparison sample at the corresponding arrangement order position: the closer a result vector element is to the positive expected correlation value, the more strongly the class of the comparison sample at the corresponding input position can be judged to be the class of the sample to be identified. For example, in the training process shown in FIG. 1, the target identification sample R and the comparison sample a_1 at position 1 of the comparison sample input arrangement order belong to the same class, so the element c_1 at position 1 of the result vector is assigned "0" to represent the positive expected correlation value, and the result vector elements corresponding to the positions of the remaining comparison samples of different classes are assigned "1" to represent the negative expected correlation value. In the training process shown in FIG. 2, the target identification sample R and the comparison sample a_4 at position 4 of the comparison sample input arrangement order belong to the same class, so the element c_4 at position 4 of the result vector is assigned "0" to represent the positive expected correlation value, and the result vector elements corresponding to the positions of the remaining comparison samples of different classes are assigned "1" to represent the negative expected correlation value.
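The expected result vectors for FIG. 1 and FIG. 2 can be built mechanically from the class labels, as the sketch below shows. The class names are invented, and the squared-error function is an assumption added only to show how such targets could drive training; the text does not name a loss function.

```python
def expected_vector(target_class, comparison_classes, same=0.0, different=1.0):
    # "0" marks the position whose comparison sample shares the target's class.
    return [same if c == target_class else different for c in comparison_classes]

def squared_error(output, expected):
    # Hypothetical training loss; not specified in the source text.
    return sum((o - e) ** 2 for o, e in zip(output, expected))

fig1_order = ["cat", "dog", "bird", "fish", "horse"]   # target class at position 1
fig2_order = ["dog", "bird", "fish", "cat", "horse"]   # same classes, target class at position 4

print(expected_vector("cat", fig1_order))  # [0.0, 1.0, 1.0, 1.0, 1.0]
print(expected_vector("cat", fig2_order))  # [1.0, 1.0, 1.0, 0.0, 1.0]
```

The same set of comparison samples thus produces two different training targets simply by changing the input order.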
In the machine learning identification method of the invention, one or more target identification samples may be used in the learning training process, provided that all of them belong to the same class; similarly, one or more samples to be identified may be used in the class identification process, again provided that all of them belong to the same class. Several factors arise in specific applications, and the different cases are described separately below.
The first factor is that, in either the learning training process or the class identification process, the target identification sample (corresponding to the sample to be identified in class identification) and the comparison samples must be combined according to a preset combination rule before being input into the machine learning model f_1. This combination serves two purposes: the resulting data sample combinations retain the comparison sample input arrangement order, and an association is established at the data input level between the target identification sample (or sample to be identified), the comparison samples, and the comparison sample input arrangement order. This is an important technical difference between the machine learning identification method of the invention and the prior art. Once this association has been established at the data input level, each result vector element in the result vector output by f_1 is no longer related only to the similarity between the target identification sample (or sample to be identified) and one comparison sample; it is also related to the correlation between the target identification sample (or sample to be identified) and the comparison sample in every data sample combination and, through the fully-connected operations of f_1, to the comparison sample input arrangement order retained by the data sample combinations. This better ensures that each result vector element in the result vector output by f_1 characterizes the correlation between the target identification sample and the class of the comparison sample at the corresponding arrangement order position.
It is precisely this special processing, in which the target identification sample (or sample to be identified) is combined with the comparison samples before being input into f_1 for learning training or class identification, that gives the machine learning identification method based on dual contrast learning the ability to identify data classes that have not undergone learning training. When the model f_1 trained by the method of the invention performs class identification on multimedia data to be identified, each result vector element in its output is related not only to the similarity between the sample to be identified and a comparison sample, but more importantly to the correlation between the sample to be identified and the comparison sample in every data sample combination and to the comparison sample input arrangement order retained by those combinations. Therefore, even if a certain multimedia data class has not been trained, as long as multimedia data samples of that class are added to the comparison sample database used for identification, the result vector output by f_1 for a sample to be identified of that class can still reflect its difference from comparison samples of other classes and its correlation with comparison samples of the same class, so the class of the sample to be identified can still be determined from the correlation.
Therefore, the machine learning identification method based on dual contrast learning can be conveniently extended to identify multimedia data classes that have not undergone learning training, which solves the problem of limited generality caused by the inability to directly classify and identify untrained classes.
At the same time, precisely because an association must be established between the target identification sample (or sample to be identified), the comparison samples and the comparison sample input arrangement order, the specific combination manner differs depending on whether there is one target identification sample (or sample to be identified) or several.
If a single target identification sample (or sample to be identified) is input to the machine learning model f_1, the preset combination rule used to combine it with the comparison samples is one of the following:
combination rule method ①: a pairing combination relationship is established between the target identification sample (or sample to be identified) and each comparison sample, and they are paired respectively;
combination rule method ②: the comparison samples are first divided by class, a combination relationship is then established between the target identification sample (or sample to be identified) and the comparison samples of each class, and they are combined respectively.
If a plurality of target identification samples (or samples to be identified) are input to the machine learning model f_1, the preset combination rule used to combine them with the comparison samples is one of the following:
combination rule method a: establishing a pairing combination relationship between each target identification sample (or to-be-identified sample) and each comparison sample, and respectively carrying out pairing combination;
combination rule method b: firstly, dividing each comparison sample according to categories, then establishing a combination relationship between each target identification sample (or sample to be identified) and each category of comparison sample, and combining the samples respectively;
combination rule mode c: establishing a pairing combination relationship between all target identification samples (or samples to be identified) and each comparison sample as a whole, and respectively carrying out pairing combination;
combination rule method d: firstly, dividing each comparison sample according to categories, then establishing a combination relationship between all target identification samples (or samples to be identified) as a whole and the comparison samples of each category respectively, and combining respectively.
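The six combination rules above differ only in whether comparison samples are grouped by class and whether the target samples are treated as a whole, so they can be sketched with one function and two flags. The function name `combine`, the flag names, and the choice to order class groups by first appearance are assumptions for illustration; the text does not specify how grouped classes are ordered.

```python
def combine(targets, comparisons, group_by_class=False, targets_as_whole=False):
    """comparisons: list of (sample, class) pairs in input arrangement order."""
    if group_by_class:
        groups = {}                       # dicts keep first-appearance order
        for sample, cls in comparisons:
            groups.setdefault(cls, []).append(sample)
        units = [tuple(g) for g in groups.values()]
    else:
        units = [(sample,) for sample, _ in comparisons]
    left = [tuple(targets)] if targets_as_whole else [(t,) for t in targets]
    # One data sample combination per (target unit, comparison unit) pair.
    return [(l, u) for l in left for u in units]

comparisons = [("a1", "cat"), ("a2", "dog"), ("a3", "cat"), ("a4", "bird")]

rule_1 = combine(["R"], comparisons)                               # method ①
rule_2 = combine(["R"], comparisons, group_by_class=True)          # method ②
rule_a = combine(["R1", "R2"], comparisons)                        # method a
rule_b = combine(["R1", "R2"], comparisons, group_by_class=True)   # method b
rule_c = combine(["R1", "R2"], comparisons, targets_as_whole=True) # method c
rule_d = combine(["R1", "R2"], comparisons, group_by_class=True,
                 targets_as_whole=True)                            # method d
```

With four comparison samples in three classes and two targets, the methods yield 4, 3, 8, 6, 4 and 3 combinations respectively.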
Combination rule methods ① and a establish a pairing combination relationship between each target identification sample (or sample to be identified) and each comparison sample and pair them respectively. Such a rule forms as many data sample combinations retaining the comparison sample input arrangement order as possible; in the learning training process, having as many data sample combinations as possible makes it easier to carry out more differentiated training passes by changing the comparison sample input arrangement order, which helps improve the machine learning model f_1. Combination rule methods ②, b, c and d instead treat all target identification samples (or samples to be identified), which belong to one class, as a whole, or all comparison samples of each class as a whole, before combining them respectively. The resulting data sample combinations still retain the comparison sample input arrangement order, while each whole forms one component of a data sample combination. When such a combination is processed by f_1, the operation is equivalent to integrating the common data features of the samples of the corresponding class, which helps improve the rate at which common features of different classes are distinguished and identified.
The second factor is that, in the class identification process, when a plurality of samples to be identified of the same class are obtained and need class identification, and the comparison samples also span multiple classes and are numerous, the samples may be input to the machine learning model f_1 in batches for identification. In a specific operation, the batch input to f_1 may be performed in one of the following ways:
batch input mode ①: all comparison samples together with each single sample to be identified form a sample input set, so that a plurality of sample input sets are formed and used in successive batches as the input of the machine learning model f_1;
batch input mode ②: the comparison samples are first classified by class; one comparison sample is selected from each class and one sample to be identified is selected to form a sample input set, so that a plurality of sample input sets are formed and used in successive batches as the input of the machine learning model f_1;
batch input mode ③: the comparison samples are first classified by class; one comparison sample is selected from each class and, together with all samples to be identified, forms a sample input set, so that a plurality of sample input sets are formed and used in successive batches as the input of the machine learning model f_1;
batch input mode ④: all comparison samples and all samples to be identified form a single sample input set used as the input of the machine learning model f_1.
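The four batch input modes can be sketched as follows. The function name `sample_input_sets`, the `rounds` parameter for mode ③, and the choice to form one set per sample to be identified in mode ② are assumptions; the text leaves the number of sets in modes ② and ③ open.

```python
import random

def sample_input_sets(to_identify, comparisons_by_class, mode, rounds=3, seed=0):
    """Return a list of (samples_to_identify, comparison_samples) input sets."""
    rng = random.Random(seed)
    all_cmp = [s for samples in comparisons_by_class.values() for s in samples]

    def one_per_class():
        # One randomly chosen comparison sample from each class.
        return [rng.choice(samples) for samples in comparisons_by_class.values()]

    if mode == 1:   # all comparison samples + each single sample to be identified
        return [([r], list(all_cmp)) for r in to_identify]
    if mode == 2:   # one comparison sample per class + one sample to be identified
        return [([r], one_per_class()) for r in to_identify]
    if mode == 3:   # one comparison sample per class + all samples to be identified
        return [(list(to_identify), one_per_class()) for _ in range(rounds)]
    if mode == 4:   # everything in a single set
        return [(list(to_identify), list(all_cmp))]
    raise ValueError(f"unknown batch input mode: {mode}")

to_identify = ["x1", "x2", "x3"]
comparisons_by_class = {"cat": ["c1", "c2"], "dog": ["d1", "d2"]}
sets_by_mode = {m: sample_input_sets(to_identify, comparisons_by_class, m)
                for m in (1, 2, 3, 4)}
```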
Accordingly, when samples are input to the machine learning model f_1 in batches for class identification, each batch yields one result vector, so the class identification based on the result vectors output by f_1 over multiple batches may be performed in one of the following ways:
multiple-output class identification mode ①: the result vector elements in all output result vectors are compared statistically, the result vector element representing the highest degree of correlation is found, and the class of the comparison sample at the arrangement order position corresponding to that element is determined to be the class of the sample to be identified;
multiple-output class identification mode ②: the result vectors output by the machine learning model f_1 for each batch are accumulated to obtain an accumulated result vector, the correlations represented by the elements of the accumulated result vector are compared statistically, the element representing the highest degree of correlation is found, and the class of the comparison sample at the arrangement order position corresponding to that element is determined to be the class of the sample to be identified.
Multiple-output class identification mode ① compares all result vector elements of every output result vector directly and uses the highest degree of correlation to determine the class of the sample to be identified, whereas mode ② first accumulates the result vectors of all batches and then compares correlations. By comparison, mode ② amounts to taking an accumulated average over the result vectors output for each batch input to the machine learning model f_1; compared with mode ①, this is better at avoiding misidentification of the sample to be identified caused by occasional errors and helps ensure better identification accuracy.
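The difference between the two modes can be seen on a small made-up example. The sketch assumes that a smaller element means a higher correlation and that every batch uses the same class at each position, so that element-wise accumulation is meaningful; both function names are hypothetical.

```python
def identify_mode_1(result_vectors, classes):
    # Mode ①: take the single most correlated (smallest) element over all batches.
    value, cls = min(((c, cls) for C in result_vectors for c, cls in zip(C, classes)),
                     key=lambda pair: pair[0])
    return cls

def identify_mode_2(result_vectors, classes):
    # Mode ②: accumulate the result vectors element-wise, then take the smallest sum.
    totals = [sum(column) for column in zip(*result_vectors)]
    i_star = min(range(len(totals)), key=lambda i: totals[i])
    return classes[i_star]

classes = ["A", "B", "C"]
result_vectors = [
    [0.90, 0.20, 0.80],
    [0.10, 0.30, 0.90],   # one outlying low value for class A
    [0.80, 0.25, 0.70],
]
print(identify_mode_1(result_vectors, classes))  # A
print(identify_mode_2(result_vectors, classes))  # B
```

Here a single outlying element makes mode ① pick class A, while the accumulated totals (1.8, 0.75, 2.4) make mode ② pick class B, which illustrates why accumulation is less sensitive to occasional errors.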
The third factor is that, when class identification is performed with the machine learning identification method of the invention, the comparison samples may be selected from a preset multimedia data sample library. In a specific application, the number of comparison sample classes L selected each time as input to the machine learning model f_1 may be made smaller than the number of classes S of known-class multimedia data contained in the library, where L and S are integers greater than 1. Comparison samples are then selected from the library multiple times, each selection being used as input to f_1, and class identification is performed on the sample to be identified multiple times, such that the selection of comparison samples traverses every multimedia data class contained in the library and the comparison sample selection operation is executed at least K times for each class, where K is a preset identification selection count threshold. Making L smaller than S in this way is a local-selection class identification scheme.
The reason is as follows. If the multimedia data of all categories contained in the multimedia data sample library were selected globally to perform the category identification processing of the sample to be identified, the amount of comparison data would be huge and the operating efficiency too low; and if the neural network of the machine learning model f1 has too many layers, the machine learning model f1 may be unable to process such a large amount of data effectively. Therefore, the number L of comparison sample categories selected each time as input to the machine learning model f1 is set smaller than the number S of categories of known-category multimedia data contained in the multimedia data sample library, and comparison samples are selected multiple times, each selection serving as input to the machine learning model f1, so that category identification processing is performed on the sample to be identified multiple times. This reduces the amount of data that the machine learning model f1 processes in each category identification process, and avoids the problem of the machine learning model f1 being too inefficient or unable to execute the processing effectively. However, the locally selected category identification processing mode may result in a single selection of comparison samples containing no sample of the category to which the sample to be identified belongs, so that no valid category identification result can be obtained from that selection. In addition, each result vector element in the result vector output by the machine learning model f1 is influenced by the input arrangement order of the comparison samples, so comparison samples of the same category may influence the correlation identification of the sample to be identified differently under different input arrangement orders, which may affect the category identification result of the sample to be identified. This is why the selections traverse every category and each category is selected at least K times. Then, the result vector elements in the result vectors output by the machine learning model f1 in every category identification process are statistically compared, the result vector element representing the highest degree of correlation is found, and the category of the comparison sample at the arrangement position corresponding to that result vector element is determined as the category of the sample to be identified.
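The locally selected identification mode described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the selection strategy (prefer the least-selected categories, then shuffle the arrangement order), the function names, and the toy scoring function standing in for the machine learning model f1 are all hypothetical and not taken from the source.

```python
import random


def plan_selection_rounds(classes, L, K, seed=0):
    """Choose L categories per round until every category was chosen >= K times."""
    if not 1 < L < len(classes):
        raise ValueError("need 1 < L < S")
    rng = random.Random(seed)
    counts = {c: 0 for c in classes}
    rounds = []
    while min(counts.values()) < K:
        # Prefer the least-selected categories; ties are broken randomly.
        order = sorted(classes, key=lambda c: (counts[c], rng.random()))
        chosen = order[:L]
        rng.shuffle(chosen)  # vary the input arrangement order
        for c in chosen:
            counts[c] += 1
        rounds.append(chosen)
    return rounds


def identify_locally(sample, library, L, K, score_fn, seed=0):
    """Run one identification per round and keep the highest-scoring category."""
    rng = random.Random(seed)
    best_score, best_class = float("-inf"), None
    for chosen in plan_selection_rounds(sorted(library), L, K, seed):
        comparison = [rng.choice(library[c]) for c in chosen]
        scores = score_fn(sample, comparison)  # stand-in for the model f1
        for score, cls in zip(scores, chosen):
            if score > best_score:
                best_score, best_class = score, cls
    return best_class


def toy_score(sample, comparison):
    """Hypothetical correlation: closer numbers score higher."""
    return [-abs(sample - x) for x in comparison]


library = {"a": [0.0], "b": [5.0], "c": [10.0], "d": [15.0]}  # S = 4
print(identify_locally(9.6, library, L=2, K=2, score_fn=toy_score))  # c
```

Each round feeds only L categories to the model, while the loop condition guarantees that every category in the library takes part in at least K rounds.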
Identification effect comparative example:
In this embodiment, the machine learning identification method used in the machine learning identification apparatus for multimedia data classification provided by the present invention is compared with several prior-art identification methods that use machine learning models. The identification effect comparison experiments are carried out on the same data set, in order to verify the feasibility and effectiveness of the machine learning identification method used in the apparatus provided by the present invention.
In this embodiment, the method of the present invention is labeled "LCNN". The prior-art machine learning models used for comparison include the BPL (Bayesian Program Learning) model (labeled "BPL [Lake2015]") described in the document "Lake, B. M., Salakhutdinov, R. & Tenenbaum, J. B. Human-level concept learning through probabilistic program induction. Science 350, 1332-1338 (2015)", the model described in the document "Vinyals, O., Blundell, C., Lillicr-", and the Convolutional Siamese Net model (labeled "Convolutional Siamese Net [Koch2015]") described in the document "Koch, G., Zemel, R. & Salakhutdinov, R. Siamese neural networks for one-shot image recognition. (University of Toronto, 2015)".
In this embodiment, based on the Omniglot dataset, samples of 30, 60, 136, 156 and 964 categories, with 20 samples per category, are respectively selected from the training set provided by the Omniglot dataset to form training sets, and each model taking part in the comparison is trained on each of them. Then, 20-way one-shot (one-of-20, single-sample) category identification tests are performed using the 400 test samples of 20 categories (20 samples per category) provided in the document "Koch, G., Richard Zemel & Ruslan Salakhutdinov.
In this embodiment, in the method of the present invention, the first sub-learning model fDP is a single-layer fully connected neural network, and the second sub-learning model fDE is a 121-layer residual neural network (ResNet). The recognition accuracy statistics of the category identification tests of the prior-art machine learning models compared in this embodiment are shown in Table 1, and the recognition accuracy statistics of the category identification tests performed with the method of the present invention are shown in Table 2.
TABLE 1
Figure BDA0001574000330000171
TABLE 2
Figure BDA0001574000330000172
As can be seen from Tables 1 and 2 above, the machine learning identification method of the present invention can subject the machine learning model f1 to more, and more varied, learning training on the basis of the same training sample set. Consequently, with the same number of training sample categories and the same number of training samples, the recognition accuracy of the machine learning identification method of the present invention is superior to that of the prior-art machine learning models in the comparison, which shows that the method is feasible and effective for multimedia data category identification.
In summary, the machine learning identification method based on dual contrast learning of the present invention can use a given quantity of multimedia data samples of known categories, presented in different comparison sample input arrangement orders, to subject the machine learning model f1 to many differentiated rounds of learning training. That is, a small number of training samples can provide a large amount of learning training for the machine learning model and achieve the expected category identification effect. This greatly reduces the dependence on massive numbers of training samples, and solves the problem that existing machine learning identification methods for multimedia data classification are limited in practical application because they rely on large numbers of training samples.
Meanwhile, even if a certain multimedia data category has not undergone learning training, it is only necessary to add multimedia data samples of that category to the identification comparison sample database. When the sample to be identified is multimedia data of that category, the result vector output by the machine learning model f1 can still reflect the difference between the sample to be identified and comparison samples of other categories, and the correlation between the sample to be identified and comparison samples of the same category, so the category to which the sample to be identified belongs can still be determined according to the correlation. Category identification can thus be conveniently extended to multimedia data categories that have not undergone learning training, which solves the problem of limited universality caused by the inability to directly classify and identify categories that have not undergone learning training.
In addition, in the category identification process, a locally selected category identification processing mode can be adopted, in which the number L of comparison sample categories selected each time as input to the machine learning model f1 is smaller than the number S of categories of known-category multimedia data contained in the multimedia data sample library, and comparison samples are selected multiple times, each selection serving as input to the machine learning model f1, so that category identification processing is performed on the sample to be identified multiple times. This reduces the amount of data that the machine learning model f1 processes in each category identification process, and avoids the machine learning model f1 being too inefficient or unable to execute the processing effectively.
Therefore, the machine learning identification method based on dual contrast learning well solves the problems that existing machine learning identification methods for multimedia data classification are limited in practical applicability and universality because they depend on a large number of training samples and cannot directly classify and identify untrained categories. It can be widely and effectively applied to more specific multimedia data classification scenarios, and has broad prospects for technical application and popularization.
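The idea that the same known-category samples can train the machine learning model f1 several times under different comparison sample input arrangement orders can be sketched in Python as follows. The one-hot style target vector, the function name and the `limit` parameter are assumptions made for illustration; the source only states that each result vector element represents the correlation with the category at the corresponding arrangement position.

```python
from itertools import islice, permutations


def ordered_training_examples(target, target_class, comparisons, limit=None):
    """Yield one training example per arrangement order of the comparison samples.

    comparisons: list of (sample, category) pairs.
    limit: optional cap, since the number of orders grows factorially.
    """
    for order in islice(permutations(comparisons), limit):
        samples = [sample for sample, _ in order]
        # Assumed target vector: 1.0 at positions whose category matches.
        label = [1.0 if cls == target_class else 0.0 for _, cls in order]
        yield target, samples, label


comparisons = [("x", "A"), ("y", "B"), ("z", "C")]
examples = list(ordered_training_examples("t", "B", comparisons))
print(len(examples))  # 6
for _, samples, label in examples[:2]:
    print(samples, label)
```

Three comparison samples already give six distinct training examples for a single target identification sample, which is the sense in which a small sample set can support a large amount of training.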
Finally, the above embodiments are only intended to illustrate the technical solution of the present invention and not to limit the same, and although the present invention has been described in detail with reference to the embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention, which should be covered by the claims of the present invention.

Claims (8)

1. A machine learning identification method based on dual contrast learning is characterized by comprising the following steps:
Step one: acquiring image data as training multimedia data, and selecting target identification samples and comparison samples from a plurality of multimedia data of different known categories as the input of a machine learning model f1, so as to perform learning training on the machine learning model f1; the machine learning model f1 comprises a first sub-learning model fDP and a second sub-learning model fDE, the first sub-learning model fDP being a convolutional neural network model or a fully connected neural network model, and the second sub-learning model fDE being a convolutional neural network model or a fully connected neural network model; the selected comparison samples comprise a plurality of multimedia data of more than two different categories; an input arrangement order in which the comparison samples are input to the machine learning model f1 is set, and, according to the input arrangement order of the comparison samples, the target identification samples and the comparison samples are combined by a preset combination rule, thereby forming a plurality of multimedia data sample combinations that retain the input arrangement order of the comparison samples; each multimedia data sample combination is used as an input of the second sub-learning model fDE, the outputs of the second sub-learning model fDE corresponding to the respective inputs are ordered according to the input arrangement order of the comparison samples to form a data vector serving as the input of the first sub-learning model fDP, and the output of the first sub-learning model fDP is used as the result vector of the machine learning model f1; thus, through learning training, each result vector element in the result vector output by the trained machine learning model f1 represents the correlation between the target identification sample and the category to which the comparison sample at the corresponding arrangement position belongs, so that multimedia data samples of known categories can be used, in different comparison sample input arrangement orders, to perform learning training on the machine learning model f1 multiple times;
Step two: using image data as the multimedia data to be identified, and using the trained machine learning model f1 to perform category identification on the multimedia data to be identified, thereby realizing classification and identification of images.
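The data flow recited in claim 1 (sample combinations into fDE, the ordered fDE outputs into fDP, the fDP output as the result vector) can be sketched in plain Python. The two functions below are toy stand-ins, not the claimed neural networks: the distance-based score in `f_de`, the softmax in `f_dp`, and the identity weights are all assumptions chosen so the example runs without any library.

```python
import math


def f_de(target, comparison):
    """Toy stand-in for the second sub-model: scores one sample combination."""
    dist = math.sqrt(sum((t - c) ** 2 for t, c in zip(target, comparison)))
    return math.exp(-dist)


def f_dp(vector, weights, bias):
    """Toy stand-in for the first sub-model: one fully connected layer + softmax."""
    logits = [sum(w * v for w, v in zip(row, vector)) + b
              for row, b in zip(weights, bias)]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def f1(target, comparisons, weights, bias):
    """Result vector: element i relates the target to comparison sample i."""
    # The list comprehension keeps the input arrangement order.
    data_vector = [f_de(target, c) for c in comparisons]
    return f_dp(data_vector, weights, bias)


target = [1.0, 0.0]
comparisons = [[0.0, 1.0], [1.0, 0.1], [5.0, 5.0]]  # arrangement order matters
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # hypothetical
bias = [0.0, 0.0, 0.0]

result = f1(target, comparisons, weights, bias)
best = max(range(len(result)), key=result.__getitem__)
print(best)  # 1
```

In a trained model the weights of both sub-models would be learned; here they are fixed only to show that the largest result vector element lands at the position of the most similar comparison sample.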
2. The machine learning identification method based on dual contrast learning of claim 1, wherein the number of target identification samples input to the machine learning model f1 is one or more, and all of them belong to the same category;
if one target identification sample is used as the input of the machine learning model f1, then, when the target identification sample and the comparison samples are combined by a preset combination rule, the preset combination rule is one of the following:
combination rule ①: establishing a pairing combination relationship between the target identification sample and each comparison sample, and pairing them respectively;
combination rule ②: first dividing the comparison samples by category, then establishing a combination relationship between the target identification sample and the comparison samples of each category, and combining them respectively;
if a plurality of target identification samples are used as the input of the machine learning model f1, then, when the target identification samples and the comparison samples are combined by a preset combination rule, the preset combination rule is one of the following:
combination rule a: establishing a pairing combination relationship between each target identification sample and each comparison sample, and pairing them respectively;
combination rule b: first dividing the comparison samples by category, then establishing a combination relationship between each target identification sample and the comparison samples of each category, and combining them respectively;
combination rule c: establishing a pairing combination relationship between all target identification samples taken as a whole and each comparison sample, and pairing them respectively;
combination rule d: first dividing the comparison samples by category, then establishing a combination relationship between all target identification samples taken as a whole and the comparison samples of each category, and combining them respectively.
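The four combination rules for multiple target identification samples can be sketched in Python as follows. The function names and the tuple representation of a combination are hypothetical; with a single target identification sample, rules ① and ② correspond to calling `rule_a` and `rule_b` with a one-element target list.

```python
def group_by_class(comparisons):
    """Group comparison samples by category, keeping first-seen category order."""
    groups = {}
    for sample, cls in comparisons:
        groups.setdefault(cls, []).append(sample)
    return groups


def rule_a(targets, comparisons):
    """Each target paired with each comparison sample."""
    return [(t, s) for t in targets for s, _ in comparisons]


def rule_b(targets, comparisons):
    """Each target combined with the comparison samples of each category."""
    groups = group_by_class(comparisons)
    return [(t, tuple(g)) for t in targets for g in groups.values()]


def rule_c(targets, comparisons):
    """All targets as a whole paired with each comparison sample."""
    return [(tuple(targets), s) for s, _ in comparisons]


def rule_d(targets, comparisons):
    """All targets as a whole combined with the samples of each category."""
    groups = group_by_class(comparisons)
    return [(tuple(targets), tuple(g)) for g in groups.values()]


targets = ["t1", "t2"]
comparisons = [("x1", "A"), ("x2", "A"), ("y1", "B")]
print(len(rule_a(targets, comparisons)))  # 6
print(len(rule_b(targets, comparisons)))  # 4
print(len(rule_c(targets, comparisons)))  # 3
print(rule_d(targets, comparisons))
```

Each returned list follows the order of the comparison samples (or of their categories), which is what allows the later stage to keep the input arrangement order.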
3. The machine learning identification method based on dual contrast learning of claim 1, wherein, in the process of performing learning training on the machine learning model f1, the target identification samples and the comparison samples are selected from a preset multimedia data sample library; each time, a part of the known-category multimedia data contained in the multimedia data sample library is selected as target identification samples and comparison samples to perform learning training on the machine learning model f1; and target identification samples and comparison samples are selected from the multimedia data sample library multiple times to perform learning training on the machine learning model f1, so as to ensure that the selection of target identification samples and comparison samples traverses each multimedia data category contained in the multimedia data sample library, and that the comparison sample selection operation is executed at least H times for each multimedia data category in the multimedia data sample library, where H is a set threshold on the number of training selections.
4. The machine learning identification method based on dual contrast learning of claim 1, wherein the specific way of using the trained machine learning model f1 to perform category identification on the multimedia data to be identified is as follows:
acquiring the multimedia data serving as the object to be identified as the sample to be identified, and using it, together with comparison samples selected from a plurality of multimedia data of different known categories, as the input of the trained machine learning model f1; the selected comparison samples comprise a plurality of multimedia data of more than two different categories; an input arrangement order in which the comparison samples are input to the machine learning model f1 is set, and, according to the input arrangement order of the comparison samples, the sample to be identified and the comparison samples are combined by a preset combination rule, thereby forming a plurality of multimedia data sample combinations that retain the input arrangement order of the comparison samples; each multimedia data sample combination is used as an input of the second sub-learning model fDE, the outputs of the second sub-learning model fDE corresponding to the respective inputs are ordered according to the input arrangement order of the comparison samples to form a data vector serving as the input of the first sub-learning model fDP, and the output of the first sub-learning model fDP is used as the result vector of the machine learning model f1; in the category identification process, each result vector element in the result vector output by the machine learning model f1 represents the correlation between the sample to be identified and the category to which the comparison sample at the corresponding arrangement position belongs, so that the category to which the sample to be identified belongs is determined according to the correlation.
5. The machine learning identification method based on the dual contrast learning of claim 4, wherein the number of the acquired samples to be identified is one or more, and all belong to the same category;
if one sample to be identified is input to the machine learning model f1, then, when the sample to be identified and the comparison samples are combined by a preset combination rule, the preset combination rule is one of the following:
combination rule ①: establishing a pairing combination relationship between the sample to be identified and each comparison sample, and pairing them respectively;
combination rule ②: first dividing the comparison samples by category, then establishing a combination relationship between the sample to be identified and the comparison samples of each category, and combining them respectively;
if a plurality of samples to be identified are input to the machine learning model f1, then, when the samples to be identified and the comparison samples are combined by a preset combination rule, the preset combination rule is one of the following:
combination rule a: establishing a pairing combination relationship between each sample to be identified and each comparison sample, and pairing them respectively;
combination rule b: first dividing the comparison samples by category, then establishing a combination relationship between each sample to be identified and the comparison samples of each category, and combining them respectively;
combination rule c: establishing a pairing combination relationship between all samples to be identified taken as a whole and each comparison sample, and pairing them respectively;
combination rule d: first dividing the comparison samples by category, then establishing a combination relationship between all samples to be identified taken as a whole and the comparison samples of each category, and combining them respectively.
6. The machine learning identification method based on the dual contrast learning of claim 4, wherein the number of the acquired samples to be identified is one or more, and all belong to the same category;
if a plurality of samples to be identified are acquired, they may be input to the machine learning model f1 in batches for identification processing, the specific way of batch input to the machine learning model f1 being one of the following:
batch input mode ①: all comparison samples and each single sample to be identified respectively form a sample input set, and the plurality of sample input sets thus formed are used in batches as the input of the machine learning model f1;
batch input mode ②: the comparison samples are first divided by category; one comparison sample is then selected from each category and, together with one selected sample to be identified, forms a sample input set; a plurality of sample input sets are formed in this way and used in batches as the input of the machine learning model f1;
batch input mode ③: the comparison samples are first divided by category; one comparison sample is then selected from each category and, together with all samples to be identified, forms a sample input set; a plurality of sample input sets are formed in this way and used in batches as the input of the machine learning model f1;
batch input mode ④: all comparison samples and all samples to be identified form one sample input set as the input of the machine learning model f1.
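The four batch input modes can be sketched in Python as follows. The random choice of one comparison sample per category, the `rounds` parameter of mode ③ and all function names are assumptions made for illustration; the claim does not specify how the per-category sample is chosen or how many input sets mode ③ forms.

```python
import random


def one_per_class(comparisons, rng):
    """Pick one (sample, category) pair from each category."""
    groups = {}
    for sample, cls in comparisons:
        groups.setdefault(cls, []).append((sample, cls))
    return [rng.choice(members) for members in groups.values()]


def batch_mode_1(queries, comparisons):
    """Mode 1: each query with all comparison samples."""
    return [([q], list(comparisons)) for q in queries]


def batch_mode_2(queries, comparisons, seed=0):
    """Mode 2: each query with one comparison sample per category."""
    rng = random.Random(seed)
    return [([q], one_per_class(comparisons, rng)) for q in queries]


def batch_mode_3(queries, comparisons, rounds, seed=0):
    """Mode 3: all queries with one comparison sample per category, repeated."""
    rng = random.Random(seed)
    return [(list(queries), one_per_class(comparisons, rng)) for _ in range(rounds)]


def batch_mode_4(queries, comparisons):
    """Mode 4: all queries and all comparison samples in a single input set."""
    return [(list(queries), list(comparisons))]


queries = ["q1", "q2"]  # samples to be identified, all of the same category
comparisons = [("x1", "A"), ("x2", "A"), ("y1", "B")]
for input_set in batch_mode_2(queries, comparisons):
    print(input_set)
```

Each element of a returned list is one sample input set, that is, one call to the model; modes ① to ③ therefore produce several result vectors for the same group of samples to be identified.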
7. The machine learning identification method based on dual contrast learning of claim 6, wherein the specific way of performing category identification processing according to the result vectors output multiple times by the machine learning model f1 is one of the following:
multiple-output category identification mode ①: statistically comparing the result vector elements in the result vectors output each time, finding the result vector element representing the highest degree of correlation, and determining the category of the comparison sample at the arrangement position corresponding to that result vector element as the category of the sample to be identified;
multiple-output category identification mode ②: accumulating the result vectors output each time by the machine learning model f1 to obtain an accumulated result vector, statistically comparing the correlation represented by each result vector element in the accumulated result vector, finding the result vector element representing the highest degree of correlation, and determining the category of the comparison sample at the arrangement position corresponding to that result vector element as the category of the sample to be identified.
8. The machine learning identification method based on dual contrast learning of claim 4, wherein the comparison samples are selected from a preset multimedia data sample library; the number L of comparison sample categories selected each time as input to the machine learning model f1 is smaller than the number S of categories of known-category multimedia data contained in the multimedia data sample library, L and S being integers greater than 1; comparison samples are selected from the multimedia data sample library multiple times, each selection serving as input to the machine learning model f1, and category identification processing is performed on the sample to be identified multiple times, so as to ensure that the selection of comparison samples traverses every multimedia data category contained in the multimedia data sample library, and that the comparison sample selection operation is executed at least K times for each multimedia data category in the multimedia data sample library, where K is a set threshold on the number of identification selections;
then, the result vector elements in the result vectors output by the machine learning model f1 in every category identification process are statistically compared, the result vector element representing the highest degree of correlation is found, and the category of the comparison sample at the arrangement position corresponding to that result vector element is determined as the category of the sample to be identified.
CN201810128018.8A 2018-02-08 2018-02-08 Machine learning identification method based on dual contrast learning Active CN108229692B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810128018.8A CN108229692B (en) 2018-02-08 2018-02-08 Machine learning identification method based on dual contrast learning

Publications (2)

Publication Number Publication Date
CN108229692A CN108229692A (en) 2018-06-29
CN108229692B true CN108229692B (en) 2020-04-07

Family

ID=62669913

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810128018.8A Active CN108229692B (en) 2018-02-08 2018-02-08 Machine learning identification method based on dual contrast learning

Country Status (1)

Country Link
CN (1) CN108229692B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109255374A (en) * 2018-08-27 2019-01-22 中共中央办公厅电子科技学院 A kind of aesthetic properties evaluation method based on intensive convolutional network and multitask network
CN109344661B (en) * 2018-09-06 2023-05-30 南京聚铭网络科技有限公司 Machine learning-based micro-proxy webpage tamper-proofing method
CN110852376B (en) * 2019-11-11 2023-05-26 杭州睿琪软件有限公司 Method and system for identifying biological species

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101826166A (en) * 2010-04-27 2010-09-08 青岛大学 Novel recognition method of neural network patterns
CN104657745A (en) * 2015-01-29 2015-05-27 中国科学院信息工程研究所 Labelled sample maintaining method and two-way learning interactive classification method
CN106022392A (en) * 2016-06-02 2016-10-12 华南理工大学 Deep neural network sample automatic accepting and rejecting training method
CN106845421A (en) * 2017-01-22 2017-06-13 北京飞搜科技有限公司 Face characteristic recognition methods and system based on multi-region feature and metric learning
CN106897746A (en) * 2017-02-28 2017-06-27 北京京东尚科信息技术有限公司 Data classification model training method and device
CN107193836A (en) * 2016-03-15 2017-09-22 腾讯科技(深圳)有限公司 A kind of recognition methods and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015054666A1 (en) * 2013-10-10 2015-04-16 Board Of Regents, The University Of Texas System Systems and methods for quantitative analysis of histopathology images using multi-classifier ensemble schemes

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101826166A (en) * 2010-04-27 2010-09-08 青岛大学 Novel recognition method of neural network patterns
CN104657745A (en) * 2015-01-29 2015-05-27 中国科学院信息工程研究所 Labelled sample maintaining method and two-way learning interactive classification method
CN107193836A (en) * 2016-03-15 2017-09-22 腾讯科技(深圳)有限公司 A kind of recognition methods and device
CN106022392A (en) * 2016-06-02 2016-10-12 华南理工大学 Deep neural network sample automatic accepting and rejecting training method
CN106845421A (en) * 2017-01-22 2017-06-13 北京飞搜科技有限公司 Face characteristic recognition methods and system based on multi-region feature and metric learning
CN106897746A (en) * 2017-02-28 2017-06-27 北京京东尚科信息技术有限公司 Data classification model training method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
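Deep residual learning for image recognition; Kaiming He et al.; IEEE Conference on Computer Vision and Pattern Recognition; 20161231; 770-778 *
Research on large-scale machine learning theory and applications; Zhang Lijun; China Doctoral Dissertations Full-text Database, Information Science and Technology; 20131215 (No. 12); I140-2 *
Comparison of methods for selecting learning sample points for neural networks; Wang Shaobo et al.; Journal of Zhengzhou University (Engineering Science); 20030330 (No. 1); 63-65, 69 *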

Also Published As

Publication number Publication date
CN108229692A (en) 2018-06-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220119

Address after: 408299 No. 7, floor 1, building B, No. 16, Second Branch Road, Pingdu East Road, Sanhe street, Fengdu County, Chongqing

Patentee after: Chongqing Maoqiao Technology Co.,Ltd.

Address before: 400054 No. 69 Hongguang Avenue, Lijiatuo, Banan District, Chongqing

Patentee before: Chongqing University of Technology
