CN108229692B - Machine learning identification method based on dual contrast learning

Info

Publication number: CN108229692B
Application number: CN201810128018.8A
Authority: CN
Inventors: 徐传运; 许洲; 张杨
Original assignee: Chongqing University of Technology
Current assignee: Chongqing Maoqiao Technology Co.,Ltd.
Priority date: 2018-02-08
Filing date: 2018-02-08
Publication date: 2020-04-07
Anticipated expiration: 2038-02-08
Also published as: CN108229692A

Abstract

The invention provides a machine learning identification method based on dual contrast learning. Using a limited number of multimedia data samples of known categories, the method feeds the comparison samples to a machine learning model f₁ in different input arrangement orders, so that f₁ undergoes multiple differentiated rounds of learning training for multimedia data category identification. The machine learning model f₁ is designed as a combined model framework of convolutional neural network models and/or fully-connected neural network models. This greatly reduces the dependence on massive training samples and makes it convenient to extend category recognition to multimedia data categories that have not been trained. The method thereby addresses the limited practical applicability and universality of existing multimedia data classification machine learning recognition methods, which depend on massive training samples and cannot directly classify categories that have not been trained, and it can be applied widely and effectively to more specific multimedia data classification scenarios.

Description

Machine learning identification method based on dual contrast learning

Technical Field

The invention relates to the technical field of multimedia data processing and machine learning, in particular to a machine learning identification method based on dual contrast learning.

Background

Multimedia is a combination of multiple media. In a computer system, multimedia refers to a human-computer interactive medium for exchanging and transmitting information that combines two or more media; the media used include text, pictures, photos, sound, animation and film, as well as interactive functions provided by programs.

With the advent of the big data age, the technology of classifying and mining mass multimedia data is particularly important. In massive data mining, how to guide classification and mining of new data by using information classified and mined from existing data has become a new research hotspot. Particularly, when the number of samples of some tasks is small, the time cost for classifying and mining mass data can be effectively reduced and the information acquisition accuracy can be improved by utilizing multi-task learning. For example, in the face recognition-based community access control system development task, if the face image of each owner is divided into an independent image data category, the system is required to process and recognize the face image in a classified manner, determine which owner's face the face image collected at the current access control location belongs to (i.e., determine which image data category the face image belongs to), and further determine whether to release the access control.

Deep learning-based methods have proved in practice to be effective and robust for information classification, and deep neural networks (e.g., deep convolutional neural networks) are the most representative of them. Deep learning models typically have tens of learnable data processing layers and hundreds of thousands, or even millions, of learnable parameters. Since such a large number of parameters constitutes an extremely large learning space, a large amount of training data is usually required to obtain optimal model parameters: a training data set must be constructed whose number of samples is usually in the tens of thousands or more. Constructing such a training set is very difficult and expensive in practical applications. For example, in the face recognition-based community access control system development task, if the face image of each owner is treated as an independent image data category, it is quite unrealistic to collect tens of thousands of face image training samples for each owner when training the machine learning model for classification. This appetite of deep models for large amounts of data makes deep learning methods difficult to apply in many fields, or to realize reliably.

When a deep learning method is used for a classification task, the traditional approach requires that the categories of the comparison samples of the classification model be the same as the categories of the production samples; that is, the model can only classify categories it has learned. If samples of a new category need to be classified, the machine learning model must be retrained, or some adaptive training must be carried out on it. For example, in the face recognition-based community access control system development task, if the face image of each owner is treated as an independent image data category, current deep learning methods require the model to be trained on the face images of every owner. When a new owner appears, even if the face image of the new owner is added directly to the identification comparison sample database, the machine learning model has not been trained on that face image before; when the face image of the new owner is collected again at the entrance, the model still cannot directly classify and identify the new owner from the new owner's face image data in the comparison sample database. As a result, training a machine learning model with deep learning methods consumes a large amount of computational resources and a long training time, which limits the convenience and versatility of such models in practical applications.

Disclosure of Invention

In view of the above defects in the prior art, the technical problem to be solved by the invention is to provide a machine learning identification method based on dual contrast learning that addresses two limitations of existing multimedia data classification machine learning identification methods: their practical application is limited because they depend on a large number of training samples, and their universality is limited because they cannot directly classify and identify categories that have not been trained.

In order to solve the technical problems, the invention adopts the following technical means:

A machine learning identification method based on dual contrast learning: a target identification sample and comparison samples are selected from multimedia data of a plurality of different known categories as the input of a machine learning model f₁, learning training is performed on the machine learning model f₁, and the trained machine learning model f₁ is used to perform category identification on multimedia data to be identified. The machine learning model f₁ includes a first sub-learning model f_DP and a second sub-learning model f_DE; the first sub-learning model f_DP is a convolutional neural network model or a fully-connected neural network model, and the second sub-learning model f_DE is a convolutional neural network model or a fully-connected neural network model. The selected comparison samples comprise multimedia data of two or more different categories, and a comparison sample input arrangement order is set for inputting the comparison samples into the machine learning model f₁. According to this input arrangement order, the target identification sample and the comparison samples are combined by a preset combination rule, forming a plurality of multimedia data sample combinations that retain the comparison sample input arrangement order. Each multimedia data sample combination is taken as an input of the second sub-learning model f_DE, and the corresponding outputs of the second sub-learning model f_DE are ordered according to the comparison sample input arrangement order to form a data vector, which is the input of the first sub-learning model f_DP; the output of the first sub-learning model f_DP is the result vector of the machine learning model f₁. Through learning training, each result vector element in the result vector output by the trained machine learning model f₁ characterizes the correlation between the target identification sample and the category to which the comparison sample at the corresponding arrangement order position belongs. In this way, multimedia data samples of known categories can be used, with different comparison sample input arrangement orders, to perform learning training on the machine learning model f₁ multiple times.
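The data flow described above (sample combinations into f_DE, ordered outputs into f_DP, result vector out) can be illustrated with a minimal sketch. Everything beyond that data flow is an assumption made for illustration: both sub-models are reduced to a single fully-connected layer, the same f_DE weights are shared across all combinations, each f_DE output is a scalar, and the result vector is normalized with a softmax. The helper names `make_dense`, `dense`, `f_de`, `f_dp` and `f1` are hypothetical and do not come from the patent.

```python
import math
import random

def make_dense(n_in, n_out, rng):
    """Create one fully-connected layer (hypothetical random initialization)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(layer, x):
    """Apply a fully-connected layer to the input list x."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]

def f_de(layer, target, comparison):
    """Second sub-model: one (target, comparison) combination -> scalar DE_i (assumed scalar)."""
    return math.tanh(dense(layer, target + comparison)[0])

def f_dp(layer, de_vector):
    """First sub-model: ordered DE vector -> result vector C (softmax is an assumption)."""
    logits = dense(layer, de_vector)
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def f1(de_layer, dp_layer, target, comparisons):
    """Combined model: element i of the output corresponds to comparisons[i]."""
    de_vector = [f_de(de_layer, target, a) for a in comparisons]  # input order is kept
    return f_dp(dp_layer, de_vector)

rng = random.Random(0)
dim, n = 4, 3  # feature size and number of comparison samples (illustrative values)
de_layer = make_dense(2 * dim, 1, rng)
dp_layer = make_dense(n, n, rng)
target = [rng.random() for _ in range(dim)]
comparisons = [[rng.random() for _ in range(dim)] for _ in range(n)]
result = f1(de_layer, dp_layer, target, comparisons)
```

The point of the sketch is only that position i of `result` lines up with position i of `comparisons`, which is the property the training procedure relies on.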

In the above machine learning identification method based on dual contrast learning, as a preferred scheme, one or more target identification samples are used as the input of the machine learning model f₁, and they all belong to the same category;

if one target identification sample is used as the input of the machine learning model f₁, then when the target identification sample and the comparison samples are combined by a preset combination rule, the preset combination rule is one of the following:

combination rule ①: establishing a pairing relationship between the target identification sample and each comparison sample, and pairing them respectively;

combination rule ②: first dividing the comparison samples by category, then establishing a combination relationship between the target identification sample and the comparison samples of each category, and combining them respectively;

if a plurality of target identification samples are used as the input of the machine learning model f₁, then when the target identification samples and the comparison samples are combined by a preset combination rule, the preset combination rule is one of the following:

combination rule a: establishing a pairing relationship between each target identification sample and each comparison sample, and pairing them respectively;

combination rule b: first dividing the comparison samples by category, then establishing a combination relationship between each target identification sample and the comparison samples of each category, and combining them respectively;

combination rule c: establishing a pairing relationship between all target identification samples taken as a whole and each comparison sample, and pairing them respectively;

combination rule d: first dividing the comparison samples by category, then establishing a combination relationship between all target identification samples taken as a whole and the comparison samples of each category, and combining them respectively.
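The combination rules listed above differ along two axes: whether comparison samples are taken individually or grouped by category, and whether target identification samples are taken individually or as a whole. The sketch below expresses that with two flags. The function name `combine`, the tuple representation, and the choice to iterate over comparison units in the outer loop (so that combinations stay grouped by comparison position) are assumptions for illustration only.

```python
def combine(targets, comparisons, by_category=False, targets_as_whole=False):
    """Build (target_unit, comparison_unit) combinations.

    targets: list of target samples, all of one category.
    comparisons: list of (category, sample) pairs in input arrangement order.
    Rules 1/a: both flags False; rules 2/b: by_category=True;
    rule c: targets_as_whole=True; rule d: both flags True.
    """
    if by_category:
        groups = {}  # dicts keep first-seen order, so category order follows input order
        for category, sample in comparisons:
            groups.setdefault(category, []).append(sample)
        comparison_units = [tuple(samples) for samples in groups.values()]
    else:
        comparison_units = [(sample,) for _, sample in comparisons]
    if targets_as_whole:
        target_units = [tuple(targets)]
    else:
        target_units = [(t,) for t in targets]
    return [(t, u) for u in comparison_units for t in target_units]

targets = ["R1", "R2"]
comparisons = [("A", "a1"), ("B", "a2"), ("A", "a3")]
rule_a = combine(targets, comparisons)
rule_b = combine(targets, comparisons, by_category=True)
rule_c = combine(targets, comparisons, targets_as_whole=True)
rule_d = combine(targets, comparisons, by_category=True, targets_as_whole=True)
```

With a single target sample, `rule_a` and `rule_b` reduce to combination rules ① and ② respectively.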

In the above machine learning identification method based on dual contrast learning, as a preferred scheme, during learning training of the machine learning model f₁, the target identification sample and the comparison samples are selected from a preset multimedia data sample library. Each time, a part of the multimedia data of known categories contained in the multimedia data sample library is selected as the target identification sample and the comparison samples to perform learning training on the machine learning model f₁, and target identification samples and comparison samples are selected from the multimedia data sample library multiple times for such learning training. The selection of target identification samples and comparison samples is required to traverse the multimedia data sample library, and the comparison sample selection operation is performed at least H times for each multimedia data category in the multimedia data sample library, where H is a preset threshold of the number of training selections.
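One way to satisfy the requirement that every category in the sample library is selected as a comparison category at least H times is to always include the currently least-selected category in each round. This is a sketch under that assumption; the patent does not prescribe a particular selection strategy, and the names `select_training_rounds`, `per_round` and the toy `library` are hypothetical.

```python
import random

def select_training_rounds(library, per_round, h, rng):
    """Pick training inputs until every category was a comparison category >= h times.

    library: dict mapping category -> list of samples of that category.
    per_round: number of comparison categories per round (at least 2).
    Returns (rounds, counts); each round is (target_category, target, comparisons).
    """
    counts = {category: 0 for category in library}
    rounds = []
    while min(counts.values()) < h:
        least = min(counts, key=counts.get)  # guarantees progress toward the threshold
        others = [c for c in library if c != least]
        chosen = [least] + rng.sample(others, per_round - 1)
        rng.shuffle(chosen)  # a different input arrangement order each round
        comparisons = [(c, rng.choice(library[c])) for c in chosen]
        target_category = rng.choice(chosen)
        target = rng.choice(library[target_category])
        for c in chosen:
            counts[c] += 1
        rounds.append((target_category, target, comparisons))
    return rounds, counts

library = {"A": ["a1", "a2"], "B": ["b1", "b2"], "C": ["c1"], "D": ["d1", "d2"]}
rounds, counts = select_training_rounds(library, per_round=3, h=2, rng=random.Random(1))
```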

In the above machine learning identification method based on dual contrast learning, as a preferred scheme, the specific manner of using the trained machine learning model f₁ to perform category identification on multimedia data to be identified is as follows:

multimedia data serving as the object to be recognized is obtained as the sample to be identified, and comparison samples are selected from multimedia data of a plurality of different known categories; together they serve as the input of the trained machine learning model f₁. The selected comparison samples comprise multimedia data of two or more different categories, and a comparison sample input arrangement order is set for inputting the comparison samples into the machine learning model f₁. According to this input arrangement order, the sample to be identified and the comparison samples are combined by a preset combination rule, forming a plurality of multimedia data sample combinations that retain the comparison sample input arrangement order. Each multimedia data sample combination is taken as an input of the second sub-learning model f_DE, and the corresponding outputs of the second sub-learning model f_DE are ordered according to the comparison sample input arrangement order to form a data vector, which is the input of the first sub-learning model f_DP; the output of the first sub-learning model f_DP is the result vector of the machine learning model f₁. In the category identification process, each result vector element in the result vector output by the machine learning model f₁ characterizes the correlation between the sample to be identified and the category to which the comparison sample at the corresponding arrangement order position belongs, so that the category to which the sample to be identified belongs is determined according to this correlation.
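Reading a category off a single result vector can be sketched as follows, assuming that a larger element value means a higher degree of correlation (the text does not fix the numeric convention). The function name `predict_category` is hypothetical.

```python
def predict_category(result_vector, comparison_categories):
    """Return the category of the comparison sample at the best-scoring position.

    result_vector[i] is taken to score the category of the i-th comparison sample;
    "larger means more correlated" is an assumption of this sketch.
    """
    best_position = max(range(len(result_vector)), key=result_vector.__getitem__)
    return comparison_categories[best_position]

predicted = predict_category([0.1, 0.7, 0.2], ["owner_A", "owner_B", "owner_C"])
```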

In the machine learning identification method based on the dual contrast learning, as a preferred scheme, one or more samples to be identified are obtained and belong to the same category;

if one sample to be identified is input to the machine learning model f₁, then when the sample to be identified and the comparison samples are combined by a preset combination rule, the preset combination rule is one of the following:

combination rule ①: establishing a pairing relationship between the sample to be identified and each comparison sample, and pairing them respectively;

combination rule ②: first dividing the comparison samples by category, then combining the sample to be identified with the comparison samples of each category respectively;

if a plurality of samples to be identified are input to the machine learning model f₁, then when the samples to be identified and the comparison samples are combined by a preset combination rule, the preset combination rule is one of the following:

combination rule a: establishing a pairing relationship between each sample to be identified and each comparison sample, and pairing them respectively;

combination rule b: first dividing the comparison samples by category, then establishing a combination relationship between each sample to be identified and the comparison samples of each category, and combining them respectively;

combination rule c: establishing a pairing relationship between all samples to be identified taken as a whole and each comparison sample, and pairing them respectively;

combination rule d: first dividing the comparison samples by category, then establishing a combination relationship between all samples to be identified taken as a whole and the comparison samples of each category, and combining them respectively;

if a plurality of samples to be identified are obtained, they can be input to the machine learning model f₁ in batches for recognition processing, and the specific manner of batch input to the machine learning model f₁ is one of the following:

batch input mode ①: all the comparison samples together with each single sample to be identified form one sample input set, so that a plurality of sample input sets are formed and used in separate passes as the input of the machine learning model f₁;

batch input mode ②: the comparison samples are first divided by category, one comparison sample is then selected from each category, and one sample to be identified is selected to form a sample input set together with them, so that a plurality of sample input sets are formed and used in separate passes as the input of the machine learning model f₁;

batch input mode ③: the comparison samples are first divided by category, one comparison sample is then selected from each category, and these form a sample input set together with all samples to be identified, so that a plurality of sample input sets are formed and used in separate passes as the input of the machine learning model f₁;

batch input mode ④: all comparison samples and all samples to be identified form a single sample input set as the input of the machine learning model f₁.
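The four batch input modes can be sketched as one function that returns a list of sample input sets. The representation of a sample input set as a `(queries, comparisons)` pair, the random choice of one comparison sample per category, and the `draws` parameter controlling how many sets modes ② and ③ produce are all assumptions for illustration; the names `one_per_category` and `batch_inputs` are hypothetical.

```python
import random

def one_per_category(comparisons, rng):
    """Pick one comparison sample from each category, keeping category order."""
    groups = {}
    for category, sample in comparisons:
        groups.setdefault(category, []).append(sample)
    return [(category, rng.choice(samples)) for category, samples in groups.items()]

def batch_inputs(queries, comparisons, mode, rng=None, draws=1):
    """Return a list of sample input sets, each a (queries, comparisons) pair."""
    if mode == 1:  # all comparison samples + one sample to be identified per set
        return [([q], list(comparisons)) for q in queries]
    if mode == 2:  # one comparison per category + one sample to be identified
        return [([q], one_per_category(comparisons, rng))
                for q in queries for _ in range(draws)]
    if mode == 3:  # one comparison per category + all samples to be identified
        return [(list(queries), one_per_category(comparisons, rng))
                for _ in range(draws)]
    if mode == 4:  # everything in a single set
        return [(list(queries), list(comparisons))]
    raise ValueError("mode must be 1, 2, 3 or 4")

queries = ["q1", "q2"]
comparisons = [("A", "a1"), ("A", "a2"), ("B", "b1")]
rng = random.Random(0)
sets_1 = batch_inputs(queries, comparisons, 1)
sets_2 = batch_inputs(queries, comparisons, 2, rng=rng, draws=2)
sets_3 = batch_inputs(queries, comparisons, 3, rng=rng, draws=2)
sets_4 = batch_inputs(queries, comparisons, 4)
```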

In the above machine learning identification method based on dual contrast learning, as a preferred scheme, the specific manner of performing category identification on the result vectors output by the machine learning model f₁ over multiple passes is one of the following:

multiple-output category identification mode ①: the result vector elements in each output result vector are counted and compared to find the result vector element representing the highest degree of correlation, and the category of the comparison sample at the arrangement order position corresponding to that result vector element is determined as the category of the sample to be identified;

multiple-output category identification mode ②: the result vectors output each time by the machine learning model f₁ are accumulated to obtain an accumulated result vector; the correlations represented by the result vector elements of the accumulated result vector are counted and compared to find the result vector element representing the highest degree of correlation, and the category of the comparison sample at the arrangement order position corresponding to that result vector element is determined as the category of the sample to be identified.
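The two ways of handling multiple result vectors can be contrasted in a small sketch. It assumes that larger values mean higher correlation and, for the accumulation variant, that every pass uses the same comparison arrangement order so that element-wise addition is meaningful. The function names and the example numbers are hypothetical.

```python
def identify_by_global_max(result_vectors, categories):
    """Mode 1 sketch: take the single highest element over all output vectors."""
    best_category, best_value = None, float("-inf")
    for vector in result_vectors:
        for value, category in zip(vector, categories):
            if value > best_value:
                best_category, best_value = category, value
    return best_category

def identify_by_accumulation(result_vectors, categories):
    """Mode 2 sketch: sum the vectors element-wise, then take the highest element."""
    accumulated = [sum(column) for column in zip(*result_vectors)]
    best_position = max(range(len(accumulated)), key=accumulated.__getitem__)
    return categories[best_position]

categories = ["A", "B", "C"]
outputs = [
    [0.9, 0.1, 0.0],  # one pass strongly favours A
    [0.0, 0.6, 0.4],  # two passes moderately favour B
    [0.0, 0.6, 0.4],
]
by_max = identify_by_global_max(outputs, categories)
by_sum = identify_by_accumulation(outputs, categories)
```

In this made-up example the two modes disagree: a single strong response decides mode ①, while mode ② rewards a category that scores consistently across passes.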

In the above machine learning identification method based on dual contrast learning, as a preferred scheme, the comparison samples are selected from a preset multimedia data sample library, and each time the number L of comparison sample categories selected as input of the machine learning model f₁ is smaller than the number S of known multimedia data categories contained in the multimedia data sample library, L and S being integers greater than 1. Comparison samples are selected from the multimedia data sample library multiple times and are each used, together with the sample to be identified, as input of the machine learning model f₁, so that category identification is performed on the sample to be identified multiple times. The selection of comparison samples is required to traverse every multimedia data category contained in the multimedia data sample library, and the comparison sample selection operation is performed at least K times for each multimedia data category in the multimedia data sample library, where K is a preset threshold of the number of identification selections. Then the result vector elements in the result vectors output by the machine learning model f₁ in each category identification pass are counted and compared to find the result vector element representing the highest degree of correlation, and the category of the comparison sample at the arrangement order position corresponding to that result vector element is determined as the category of the sample to be identified.
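The local selection scheme (L comparison categories per pass out of S, every category covered at least K times, global maximum at the end) can be sketched as a loop. The trained model f₁ is replaced here by a stand-in scoring function on scalar samples, purely so the sketch runs; the selection strategy, the names `identify_with_subsets` and `toy_score`, and the toy library are assumptions and not part of the patent.

```python
import random

def identify_with_subsets(query, library, score_fn, l, k, rng):
    """Identify `query` using l < S comparison categories per pass.

    library: dict category -> list of samples; score_fn stands in for the trained
    model and returns one score per comparison sample (larger = more correlated).
    """
    counts = {category: 0 for category in library}
    best_category, best_score = None, float("-inf")
    while min(counts.values()) < k:
        least = min(counts, key=counts.get)  # make sure every category gets covered
        others = [c for c in library if c != least]
        chosen = [least] + rng.sample(others, l - 1)
        comparisons = [(c, rng.choice(library[c])) for c in chosen]
        result_vector = score_fn(query, [s for _, s in comparisons])
        for (category, _), value in zip(comparisons, result_vector):
            if value > best_score:
                best_category, best_score = category, value
        for c in chosen:
            counts[c] += 1
    return best_category, counts

def toy_score(query, samples):
    """Stand-in for f1 on scalar samples: closer values score higher."""
    return [-abs(query - s) for s in samples]

library = {"A": [1.0, 1.1], "B": [5.0, 5.2], "C": [9.0, 9.1], "D": [13.0]}
predicted, counts = identify_with_subsets(5.1, library, toy_score, l=2, k=2,
                                          rng=random.Random(0))
```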

Compared with the prior art, the invention has the following beneficial effects:

1. The machine learning identification method based on dual contrast learning can use a limited number of multimedia data samples of known categories, with different comparison sample input arrangement orders, to perform multiple differentiated rounds of learning training on the machine learning model f₁. That is, a small number of training samples can be used to give the machine learning model a large amount of learning training and achieve the expected category recognition effect, which greatly reduces the dependence on massive training samples and addresses the problem that the practical application of existing multimedia data classification machine learning recognition methods is limited by their dependence on a large number of training samples.

2. With the machine learning identification method based on dual contrast learning, even if a certain multimedia data category has not undergone learning training, it is only necessary to add multimedia data samples of that category to the identification comparison sample database. When the sample to be identified is multimedia data of that category, the result vector output by the machine learning model f₁ can still reflect the difference between the sample to be identified and the comparison samples of other categories, and the correlation between the sample to be identified and the comparison samples of the same category, so the category of the sample to be identified can still be determined from the correlation. Category recognition can thus be conveniently extended to multimedia data categories that have not undergone learning training, which addresses the problem that universality is limited because untrained categories cannot be directly classified and identified.

3. In the category identification process, the machine learning identification method based on dual contrast learning can adopt a category identification mode based on local data selection: each time, the number L of comparison sample categories selected as input of the machine learning model f₁ is smaller than the number S of known multimedia data categories contained in the multimedia data sample library, and comparison samples are selected multiple times and each used, together with the sample to be identified, as input of the machine learning model f₁ for multiple passes of category identification. This reduces the amount of data the machine learning model f₁ has to process in each category identification pass, and prevents the processing efficiency of the machine learning model f₁ from becoming too low or the processing from becoming infeasible.

4. The machine learning identification method based on dual contrast learning addresses the problem that existing multimedia data classification machine learning identification methods are limited in practical applicability and universality because they depend on a large number of training samples and cannot directly classify and identify categories that have not been trained. It can be applied more widely and effectively to more specific multimedia data classification scenarios, and has broad prospects for technical application and popularization.

Drawings

Fig. 1 is a schematic flow chart of a machine learning training process in the machine learning identification method of the present invention.

Fig. 2 is a schematic flow chart of another machine learning training process in the machine learning identification method of the present invention.

Fig. 3 is a flow chart illustrating a multimedia data class identification process in the machine learning identification method according to the present invention.

Detailed Description

Aiming at the problem that the practical application is limited because the existing multimedia data classification machine learning identification method needs to rely on a large number of training samples, the identification principle of the existing machine learning identification method needs to be analyzed, and the reason of the problem is found. In the existing classification machine learning identification method, a sample to be identified and a comparison sample of a known class are usually compared separately, and the similarity between the sample to be identified and the comparison sample is calculated, or the difference distance value between the sample to be identified and the comparison sample is calculated, so as to judge whether the sample to be identified and the comparison sample belong to the same class, thereby realizing class identification of the sample to be identified. The machine learning identification method is applied to application scenes of multimedia data classification and identification, and is easily limited by technical application:

On the one hand, data samples of the same category of multimedia data may differ greatly from one another. For example, in the face recognition-based community access control system development task, if the face image of each owner is treated as an independent image data category, the system needs to process and recognize face images; yet even face images of the same owner easily differ because of conditions such as ambient light, shooting angle, and the owner's makeup. Training samples captured under given conditions of ambient light, shooting angle and makeup directly help in computing the similarity or difference distance value for samples to be identified captured under the same conditions. Consequently, a large number of face images of each owner under different ambient light, different shooting angles and different makeup are needed as training samples and identification comparison data to train the machine learning model before a good face recognition effect can be ensured. This increases the operational difficulty of model training in practical applications and limits the application of the technique.

On the other hand, in existing classification machine learning identification methods, training samples of categories other than that of the sample to be identified can hardly exert a differentiated influence on the identification result of the sample to be identified. For example, in the face recognition-based community access control system development task, if the face image of each owner is treated as an independent image data category, then when the face image of one owner is recognized or used in training, comparing it with the face image of any other owner in the sample database always yields the same kind of result: insufficient similarity, or a large difference distance. A large number of comparison samples of other categories therefore cannot bring any meaningful distinguishing influence to the recognition or learning training result of the sample to be identified. This indirectly means that recognition or learning training for the sample to be identified depends only on comparison samples of the same category, which increases the dependence of the machine learning recognition method on a large number of training samples.

Accordingly, the limitations in these two aspects lead to a further consequence: for a data category that has not undergone learning training in a multimedia data classification and identification application scenario, existing machine learning identification methods cannot perform effective category identification on samples to be identified of that category.

In view of the above analysis, and based on the technical idea of solving these problems, the invention provides a machine learning identification method based on dual contrast learning, which trains the machine learning model in a manner different from the prior art. As shown in Fig. 1, a target identification sample R and comparison samples a are selected from multimedia data of a plurality of different known categories as the input of a machine learning model f₁, learning training is performed on the machine learning model f₁, and the trained machine learning model f₁ is then used to perform category identification on multimedia data to be identified. The machine learning model f₁ includes a first sub-learning model f_DP and a second sub-learning model f_DE; the first sub-learning model f_DP is a convolutional neural network model or a fully-connected neural network model, and the second sub-learning model f_DE is a convolutional neural network model or a fully-connected neural network model. The selected comparison samples comprise multimedia data of two or more different categories, and a comparison sample input arrangement order is set for inputting the comparison samples into the machine learning model f₁. For example, in Fig. 1 the comparison samples listed in the comparison sample input arrangement order are denoted a₁, a₂, …, a_n, where n is the number of comparison samples input to the machine learning model f₁. According to the comparison sample input arrangement order, the target identification sample R and the comparison samples a₁, a₂, …, a_n are combined by a preset combination rule (in the simple example of Fig. 1, the target identification sample R is combined with each of the comparison samples a₁, a₂, …, a_n in turn), forming a plurality of multimedia data sample combinations that retain the comparison sample input arrangement order. Each multimedia data sample combination is taken as an input of the second sub-learning model f_DE, and the corresponding outputs DE₁, DE₂, …, DE_n of the second sub-learning model f_DE are ordered according to the comparison sample input arrangement order to form a data vector, which is the input of the first sub-learning model f_DP; the output of the first sub-learning model f_DP is the result vector C of the machine learning model f₁. Through learning training, each result vector element C_i ∈ C (i ∈ {1, 2, …, n}) in the result vector C output by the trained machine learning model f₁ characterizes the correlation between the target identification sample and the category to which the comparison sample a_i (i ∈ {1, 2, …, n}) at the corresponding arrangement order position belongs. In this way, multimedia data samples of known categories can be used, with different comparison sample input arrangement orders, to perform learning training on the machine learning model f₁ multiple times. For example, as shown in Fig. 2, the same n comparison samples as in Fig. 1 are used, but the comparison sample input arrangement order is changed so that the comparison sample originally at position a₁ is moved to position a₄, which gives the machine learning model f₁ a different round of learning training.
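The reuse of one set of comparison samples under different arrangement orders, as in Figs. 1 and 2, can be sketched by enumerating permutations and attaching a position-aligned label vector to each. The 0/1 labels (1 where the comparison sample shares the target's category) are an assumption about how the "correlation" target could be encoded; the patent text does not specify the encoding, and `training_instances` is a hypothetical name.

```python
import itertools

def training_instances(target_category, comparisons):
    """Yield (ordered_comparisons, labels) for every arrangement order.

    comparisons: list of (category, sample) pairs.
    labels[i] is 1.0 if the i-th comparison sample has the target's category,
    else 0.0 (assumed encoding of the desired result vector).
    """
    for order in itertools.permutations(comparisons):
        labels = [1.0 if category == target_category else 0.0 for category, _ in order]
        yield list(order), labels

comparisons = [("A", "a1"), ("B", "a2"), ("C", "a3")]
instances = list(training_instances("B", comparisons))
```

Three comparison samples already give six distinct training inputs for one target sample; with n comparison samples there are n! arrangement orders, which is where the reduced need for raw training samples comes from in this reading.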

Compared with the prior art, the machine learning identification method based on dual contrast learning takes a different technical approach. Multimedia data of known categories are selected as the target identification sample and the comparison samples and are input into the machine learning model f₁ for learning training; the selected comparison samples must contain multimedia data of two or more different categories, so that the difference between comparison samples of different categories is reflected in the input arrangement order. The goal of the learning training is that each result vector element in the result vector output by the trained machine learning model f₁ characterizes the correlation between the target identification sample and the category to which the comparison sample at the corresponding arrangement order position belongs. To this end, the machine learning model f₁ is designed as a combined model framework comprising a first sub-learning model f_DP and a second sub-learning model f_DE. Each of the first sub-learning model f_DP and the second sub-learning model f_DE may be a convolutional neural network model or a fully connected neural network model; the convolutional neural network may be, for example, a residual neural network model (abbreviated as ResNet) or a dense convolutional network model (abbreviated as DenseNet), and the fully connected neural network may be any neural network model with fully connected layers commonly used by those skilled in the art.

During the learning training of the machine learning model f₁ formed by this combined model framework, the target identification sample (corresponding to the sample to be identified during the category identification process) is combined with the comparison samples to form a plurality of data sample combinations that retain the comparison sample input arrangement order rule. These combinations are input into the second sub-learning model f_DE, and the resulting outputs are arranged according to the comparison sample input arrangement order rule to form a data vector, which is the input of the first sub-learning model f_DP; the output of the first sub-learning model f_DP is the result vector of the machine learning model f₁. This ensures that the order of the elements of the output result vector corresponds to the input arrangement order of the comparison samples. Moreover, because the first sub-learning model f_DP and the second sub-learning model f_DE may be convolutional or fully connected neural network models, each result vector element is influenced by the input arrangement order of the comparison samples, so the correlation characterized by each result vector element output by the trained machine learning model f₁ depends on that order. Consequently, multimedia data in the comparison samples that belong to the same category as the target identification sample influence the learning training result of the machine learning model f₁ differently when they occupy different positions in the input arrangement order. Therefore, when a piece of multimedia data belonging to the same category as the target identification sample is used as a comparison sample, the machine learning model f₁ can be given multiple differentiated learning trainings by adjusting the sequence position of that data in the input arrangement order.

Meanwhile, multimedia data in the comparison samples that belong to categories different from that of the target identification sample also influence the learning training result of the machine learning model f₁ differently depending on their positions in the input arrangement order. Therefore, when such multimedia data are used as comparison samples, they too can participate multiple times in the differentiated learning training of the machine learning model f₁ by adjusting their sequence positions in the input arrangement order. In this way, a given quantity of multimedia data samples of known categories can be used with different comparison sample input arrangement orders to perform multiple differentiated learning trainings on the machine learning model f₁; that is, a small number of training samples suffices to carry out a large amount of learning training and achieve the expected category identification effect. This greatly reduces the dependence on massive training samples and solves the problem that the practical applicability of existing multimedia data classification machine learning identification methods is limited by their reliance on large numbers of training samples.

In a specific application, when the machine learning identification method of the invention is used to train the machine learning model f₁, the target identification sample and the comparison samples are selected from a preset multimedia data sample library. Each time, a part of the multimedia data of known categories contained in the library is selected as the target identification sample and comparison samples for learning training of the machine learning model f₁, and this selection is repeated multiple times, such that the selection of target identification samples and comparison samples traverses every multimedia data category contained in the library and the comparison sample selection operation is performed at least H times for each multimedia data category in the library, where H is a training selection count threshold. Selecting only a part of the multimedia data of known categories each time is a locally selecting mode of learning training. If, instead, all categories of multimedia data contained in the library were selected globally for the learning training of the machine learning model f₁, the amount of comparison computation would easily become huge and the computation inefficient, and if the neural network of the machine learning model f₁ has too many layers, the model may be unable to process such a large amount of data effectively.

Therefore, selecting a part of the multimedia data of known categories each time, and selecting target identification samples and comparison samples multiple times for learning training, reduces the amount of data the machine learning model f₁ processes in each learning training pass and avoids the problem that processing efficiency is too low or processing cannot be effectively executed. However, local selection may leave some of the multimedia data in the library insufficiently used during learning training. In addition, because each result vector element in the result vector output by the machine learning model f₁ is influenced by the input arrangement order of the comparison samples, comparison samples of the same category have different correlation influences on the target identification sample at different positions in the input arrangement order, which may affect the learning training effect of the machine learning model f₁. To safeguard the learning training effect as far as possible, it is preferable during learning training that the selection of target identification samples and comparison samples traverses every multimedia data category contained in the library and that the comparison sample selection operation is performed at least H times for each category, where H is the training selection count threshold, whose specific value can be determined from practical application experience.
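The locally selecting training mode can be illustrated with a small scheduling sketch. It assumes, hypothetically, that each round draws a fixed number of categories (`per_round`) and that the least-selected category is always included so the loop terminates; the text itself only requires that every category is traversed and selected at least H times.

```python
import random
from collections import Counter
from typing import List, Sequence


def select_training_rounds(categories: Sequence[str], per_round: int,
                           H: int, seed: int = 0) -> List[List[str]]:
    """Draw category subsets until every category has been selected at least H times."""
    if not 2 <= per_round <= len(categories):
        raise ValueError("per_round must be between 2 and the number of categories")
    rng = random.Random(seed)
    counts = Counter({c: 0 for c in categories})
    rounds: List[List[str]] = []
    while min(counts.values()) < H:
        # Always include the least-selected category so coverage keeps growing.
        least = min(counts, key=counts.get)
        others = [c for c in categories if c != least]
        chosen = [least] + rng.sample(others, per_round - 1)
        rng.shuffle(chosen)          # the input arrangement order varies per round
        counts.update(chosen)
        rounds.append(chosen)
    return rounds


training_rounds = select_training_rounds(list("ABCDEF"), per_round=3, H=2)
selection_counts = Counter(c for chosen in training_rounds for c in chosen)
```

Each round stays small (here three categories out of six), while the counts confirm that no category is left below the threshold H.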

The machine learning model f₁ obtained through the above learning training can be used for category identification of multimedia data. Specifically, the trained machine learning model f₁ identifies the category of multimedia data to be identified as follows. As shown in FIG. 3, the multimedia data to be identified is acquired as the sample to be identified R_x, and a plurality of multimedia data of different known categories are selected as comparison samples; together they serve as the input of the trained machine learning model f₁. The selected comparison samples comprise multimedia data of two or more different categories, and a comparison sample input arrangement order is set for their input to the machine learning model f₁. For example, in FIG. 3 the comparison samples listed in the input arrangement order are denoted a₁、a₂、…、a_n, where n is the number of comparison samples input to the machine learning model f₁. The sample to be identified R_x and the comparison samples a₁、a₂、…、a_n are combined according to the input arrangement order by a preset combination rule. In the simple example of FIG. 3, the sample to be identified R_x is combined with each of the comparison samples a₁、a₂、…、a_n in input arrangement order, forming a plurality of data sample combinations that retain the comparison sample input arrangement order rule. Each data sample combination is used as an input of the second sub-learning model f_DE, the corresponding outputs DE₁、DE₂、…、DE_n are arranged according to the comparison sample input arrangement order rule to form a data vector that is input to the first sub-learning model f_DP, and the output of the first sub-learning model f_DP is the result vector C of the machine learning model.

In the category identification process, each result vector element C_i ∈ C (i ∈ {1,2,…,n}) of the result vector C output by the machine learning model f₁ characterizes the correlation between the sample to be identified R_x and the category to which the comparison sample a_i (i ∈ {1,2,…,n}) at the corresponding arrangement order position belongs, so the category of the sample to be identified R_x can be determined from this correlation. For example, if for the trained machine learning model f₁ a smaller value of a result vector element C_i indicates a higher degree of correlation with the category of the comparison sample at the corresponding arrangement order position, then, as shown in FIG. 3, when determining the category y_x of the sample to be identified R_x, the category y_i of the comparison sample a_i at the arrangement order position of the smallest-valued element of the result vector C is determined as the category to which the sample to be identified R_x belongs, where i ∈ {1,2,…,n}.
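The decision rule in the example above, where a smaller element value indicates a higher degree of correlation, amounts to picking the position of the smallest element of C and reading off the category of the comparison sample at that position. A minimal sketch, with a hypothetical helper name `identify_class` and made-up values:

```python
from typing import List, Sequence


def identify_class(result_vector: Sequence[float],
                   comparison_labels: Sequence[str]) -> str:
    """Return y_i for the position i whose element C_i is smallest.

    Assumes, as in the example in the text, that smaller values mean
    higher correlation; the convention depends on how f_1 was trained.
    """
    if len(result_vector) != len(comparison_labels):
        raise ValueError("one label is needed per result vector element")
    best = min(range(len(result_vector)), key=result_vector.__getitem__)
    return comparison_labels[best]


C: List[float] = [0.82, 0.07, 0.91, 0.64]        # illustrative output of f_1
y = ["cat", "dog", "bird", "fish"]               # categories of a_1 .. a_4
y_x = identify_class(C, y)
```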

In specific implementation, the machine learning training process and the multimedia data type identification processing process in the machine learning identification method based on dual contrast learning in this embodiment may be loaded into a processor of a machine learning identification device through computer programming, so that the processor is configured to execute the machine learning training program of the machine learning training process or execute the multimedia data type identification program of the multimedia data type identification processing process. The machine learning type identification device designed based on the machine learning identification method of the invention naturally has common technical characteristics and technical advantages.

In implementing the machine learning identification method and device, it is easy in concrete training operations to make each result vector element in the result vector output by the trained machine learning model f₁ characterize the correlation between the target identification sample and the category to which the comparison sample at the corresponding arrangement order position belongs. For example, during training, the result vector element at each arrangement order position of the result vector output by the machine learning model f₁ is assigned a preset expected correlation value according to whether the target identification sample belongs to the same category as the comparison sample at that position in the input arrangement order: a positive-correlation expected value for the same category (for example, "0") and a negative-correlation expected value for a different category (for example, "1"). Through such machine learning training, the machine learning model f₁ learns to distinguish whether the target identification sample and a comparison sample belong to the same category, and each result vector element of the output result vector characterizes the correlation between the target identification sample and the category to which the comparison sample at the corresponding arrangement order position belongs.

After such training, when the machine learning model f₁ performs category identification on multimedia data, each result vector element in the output result vector can well distinguish and characterize the correlation between the sample to be identified and the category to which the comparison sample at the corresponding arrangement order position belongs; the closer a result vector element is to the positive-correlation expected value, the more strongly the category of the comparison sample at the corresponding input arrangement order position can be judged to be the category of the sample to be identified. For example, in the learning training process shown in FIG. 1, the target identification sample R and the comparison sample a₁ at the 1st position of the comparison sample input arrangement order belong to the same category, so the element c₁ at the 1st position of the result vector is assigned "0" to represent the positive-correlation expected value, and the result vector elements corresponding to the positions of the remaining comparison samples of different categories are assigned "1" to represent the negative-correlation expected value. In the learning training process shown in FIG. 2, the target identification sample R and the comparison sample a₄ at the 4th position of the comparison sample input arrangement order belong to the same category, so the element c₄ at the 4th position of the result vector is assigned "0" to represent the positive-correlation expected value, and the result vector elements corresponding to the positions of the remaining comparison samples of different categories are assigned "1" to represent the negative-correlation expected value.
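The expected result vectors for FIG. 1 and FIG. 2, and the way re-ordering the same comparison samples yields many distinct training examples, can be sketched as follows. The function names and category labels are illustrative assumptions; only the "0" for same category and "1" for different category convention comes from the text.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def expected_result_vector(target_label: str, comparison_labels: Sequence[str],
                           positive: float = 0.0, negative: float = 1.0) -> List[float]:
    """Expected value per position: `positive` if the categories match, else `negative`."""
    return [positive if y == target_label else negative for y in comparison_labels]


def reordered_training_examples(target_label: str, comparison_labels: Sequence[str],
                                limit: int) -> List[Tuple[List[int], List[float]]]:
    """Enumerate up to `limit` input arrangement orders with their expected vectors."""
    examples: List[Tuple[List[int], List[float]]] = []
    for order in permutations(range(len(comparison_labels))):
        if len(examples) == limit:
            break
        labels = [comparison_labels[i] for i in order]
        examples.append((list(order), expected_result_vector(target_label, labels)))
    return examples


# FIG. 1 style: the same-category comparison sample is at position 1.
fig1_expected = expected_result_vector("k", ["k", "b", "c", "d"])
# FIG. 2 style: the same-category comparison sample is at position 4.
fig2_expected = expected_result_vector("k", ["b", "c", "d", "k"])
# Four comparison samples already allow 4! = 24 differently ordered examples.
all_orders = reordered_training_examples("k", ["k", "b", "c", "d"], limit=24)
```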

In the machine learning identification method of the present invention, one or more target identification samples may be used during learning training, and all target identification samples must belong to the same category; similarly, one or more samples to be identified may be used in the category identification process, and all of them must likewise belong to the same category. In specific applications several factors are involved, and the different cases are described separately below.

The factor of the first aspect is that, in both the learning training process and the category identification process, the target identification sample (corresponding to the sample to be identified in the category identification process) and the comparison samples must be combined by a preset combination rule before being input into the machine learning model f₁. This combination serves, on the one hand, to make the resulting data sample combinations retain the comparison sample input arrangement order rule and, on the other hand, to establish at the data input level the association between the target identification sample (or the sample to be identified) and the comparison samples and between the target identification sample and the comparison sample input arrangement order; this is an important technical difference between the machine learning identification method of the invention and the prior art. Once this association has been established at the data input level, each result vector element in the result vector output by the machine learning model f₁ is no longer related only to the similarity between the target identification sample (or the sample to be identified) and one comparison sample, but is also related to the correlation between the target identification sample (or the sample to be identified) and the comparison samples forming each data sample combination. In addition, the fully connected operation of the machine learning model f₁ makes each result vector element related to the comparison sample input arrangement order rule retained by the data sample combinations used as input. This better ensures that each result vector element in the result vector output by the machine learning model f₁ characterizes the correlation between the target identification sample and the category to which the comparison sample at the corresponding arrangement order position belongs.

It is precisely this special processing mode, in which the target identification sample (corresponding to the sample to be identified in the category identification process) and the comparison samples are combined and input into the machine learning model f₁ for learning training or category identification, that gives the machine learning identification method based on dual contrast learning the ability to identify data categories that have not undergone learning training. When the machine learning model f₁ trained by the method of the invention performs category identification on multimedia data to be identified, each result vector element in its output result vector is related not only to the similarity between the sample to be identified and a comparison sample, but even more to the correlation between the target identification sample (or the sample to be identified) and the comparison samples in each data sample combination and to the comparison sample input arrangement order rule retained by the data sample combinations used as input. Therefore, even if a certain multimedia data category has not undergone learning training, as long as multimedia data samples of that category are added to the identification comparison sample database, then when the sample to be identified is multimedia data of that category, the result vector output by the machine learning model f₁ can still reflect the difference between the sample to be identified and comparison samples of other categories and the correlation between the sample to be identified and comparison samples of the same category, so that the category of the sample to be identified can still be determined from the correlation.

Therefore, the machine learning identification method based on dual contrast learning can conveniently be extended to identify multimedia data categories that have not undergone learning training, and solves the problem of limited universality caused by the inability to directly classify and identify untrained categories.

Meanwhile, precisely because association must be established between the target identification sample (or the sample to be identified) and the comparison samples and the comparison sample input arrangement order, the specific combination mode differs according to whether there is one target identification sample (or sample to be identified) or several.

If a single target identification sample (or sample to be identified) is input to the machine learning model f₁, then when the target identification sample (or sample to be identified) and the comparison samples are combined by a preset combination rule, the preset combination rule is one of the following:

Combination rule mode ①: a pairing combination relationship is established between the target identification sample (or the sample to be identified) and each comparison sample, and pairing combination is carried out respectively;

Combination rule mode ②: the comparison samples are first divided by category, and a combination relationship is then established between the target identification sample (or the sample to be identified) and the comparison samples of each category, and combination is carried out respectively.

If a plurality of target identification samples (or samples to be identified) are input to the machine learning model f₁, then when the target identification samples (or samples to be identified) and the comparison samples are combined by a preset combination rule, the preset combination rule is one of the following:

Combination rule mode a: a pairing combination relationship is established between each target identification sample (or sample to be identified) and each comparison sample, and pairing combination is carried out respectively;

Combination rule mode b: the comparison samples are first divided by category, and a combination relationship is then established between each target identification sample (or sample to be identified) and the comparison samples of each category, and combination is carried out respectively;

Combination rule mode c: a pairing combination relationship is established between all target identification samples (or samples to be identified), taken as a whole, and each comparison sample, and pairing combination is carried out respectively;

Combination rule mode d: the comparison samples are first divided by category, and a combination relationship is then established between all target identification samples (or samples to be identified), taken as a whole, and the comparison samples of each category, and combination is carried out respectively.
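The combination rules listed above differ only in what counts as one component of a data sample combination: a single sample, all samples of one comparison category, or all target identification samples as a whole. A minimal sketch, using hypothetical function names and string placeholders for samples, and assuming categories are ordered by first appearance so that the input arrangement order is retained:

```python
from typing import Dict, List, Sequence, Tuple


def group_by_category(comparisons: Sequence[str],
                      labels: Sequence[str]) -> List[Tuple[str, ...]]:
    """Group comparison samples by category, keeping first-appearance order."""
    groups: Dict[str, List[str]] = {}
    for a, y in zip(comparisons, labels):
        groups.setdefault(y, []).append(a)
    return [tuple(g) for g in groups.values()]


def rule_pairwise(targets: Sequence[str], comparisons: Sequence[str]):
    """Modes ① / a: every target paired with every comparison sample."""
    return [(t, a) for t in targets for a in comparisons]


def rule_by_category(targets, comparisons, labels):
    """Modes ② / b: every target combined with each category's comparison samples."""
    return [(t, g) for t in targets for g in group_by_category(comparisons, labels)]


def rule_whole_pairwise(targets, comparisons):
    """Mode c: all targets as a whole, paired with each comparison sample."""
    whole = tuple(targets)
    return [(whole, a) for a in comparisons]


def rule_whole_by_category(targets, comparisons, labels):
    """Mode d: all targets as a whole, combined with each category's comparison samples."""
    whole = tuple(targets)
    return [(whole, g) for g in group_by_category(comparisons, labels)]


targets = ["r1", "r2"]
comparisons = ["a1", "a2", "a3"]
labels = ["x", "x", "y"]
combos_a = rule_pairwise(targets, comparisons)
combos_b = rule_by_category(targets, comparisons, labels)
combos_c = rule_whole_pairwise(targets, comparisons)
combos_d = rule_whole_by_category(targets, comparisons, labels)
```

Passing a single-element `targets` list to `rule_pairwise` or `rule_by_category` gives modes ① and ② respectively.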

Combination rule mode ① and combination rule mode a establish a pairing combination relationship between each target identification sample (or sample to be identified) and each comparison sample and carry out pairing combination. Such a combination rule forms as many data sample combinations retaining the comparison sample input arrangement order rule as possible. For the learning training process, having as many data sample combinations as possible makes it easier to carry out more differentiated learning trainings by changing the comparison sample input arrangement order, which helps improve the machine learning model f₁. Combination rule mode ② and combination rule modes b, c and d treat all target identification samples (or samples to be identified), which belong to one category, as a whole, or treat all comparison samples of each category as a whole, and then carry out combination respectively. The data sample combinations formed in this way not only retain the comparison sample input arrangement order rule, but also contain, as one component of a data sample combination, all target identification samples (or samples to be identified) as a whole or all comparison samples of a category as a whole. When such a data sample combination is processed by the machine learning model f₁, the processing is equivalent to integrating the common data features of the samples of the corresponding category, which helps improve the rate at which common features of different categories are distinguished and identified.

The factor of the second aspect is that, in the category identification process, when a plurality of samples to be identified of the same category are obtained and need category identification, and the comparison samples also span multiple categories and are numerous, the samples may be input to the machine learning model f₁ in batches for identification. In specific operation, the batch input to the machine learning model f₁ may adopt one of the following modes:

Batch input mode ③: the comparison samples are divided by category, one comparison sample is selected from each category, and the selected comparison samples together with all samples to be identified form a sample input set; a plurality of sample input sets are formed in this way, and each in turn serves as the input of the machine learning model f₁;

Accordingly, when category identification is performed by batch input to the machine learning model f₁, each batch of input yields one result vector, so the category identification based on the result vectors output multiple times by the machine learning model f₁ may adopt one of the following modes:

Multiple-output category identification mode ②: the result vectors output each time by the machine learning model f₁ are accumulated to obtain an accumulated result vector; the correlations characterized by the result vector elements of the accumulated result vector are statistically compared to find the result vector element characterizing the highest degree of correlation, and the category of the comparison sample at the arrangement order position corresponding to that result vector element is determined as the category of the sample to be identified.

Multiple-output category identification mode ① directly performs a statistical comparison of correlation over all result vector elements in the result vectors output each time and takes the highest degree of correlation to determine the category of the sample to be identified, whereas multiple-output category identification mode ② first accumulates the result vectors output each time and then performs the statistical comparison of correlation on the accumulated result to determine the category of the sample to be identified. By comparison, mode ② is equivalent to a comprehensive, accumulated-average consideration of the result vectors output after each batch input to the machine learning model f₁. Compared with mode ①, it is therefore better at avoiding misidentification of the category of the sample to be identified caused by accidental errors, and helps ensure better identification accuracy.
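The difference between the two multiple-output modes can be made concrete with a small sketch. It assumes that a smaller element value means higher correlation and that every batch uses the same category at the same arrangement order position, so that result vectors can be summed element-wise; both function names and all numbers are illustrative.

```python
from typing import List, Sequence


def identify_direct(result_vectors: Sequence[Sequence[float]],
                    labels: Sequence[str]) -> str:
    """Mode ①: take the single best element over all batches."""
    best_value, best_label = None, None
    for vector in result_vectors:
        for value, label in zip(vector, labels):
            if best_value is None or value < best_value:
                best_value, best_label = value, label
    return best_label


def identify_accumulated(result_vectors: Sequence[Sequence[float]],
                         labels: Sequence[str]) -> str:
    """Mode ②: sum the result vectors position by position, then take the best."""
    totals: List[float] = [sum(column) for column in zip(*result_vectors)]
    return labels[totals.index(min(totals))]


labels = ["A", "B", "C"]
result_vectors = [
    [0.30, 0.45, 0.90],
    [0.35, 0.05, 0.80],   # one accidental, very low value for category B
    [0.30, 0.60, 0.90],
]
direct_choice = identify_direct(result_vectors, labels)
accumulated_choice = identify_accumulated(result_vectors, labels)
```

In this made-up example a single outlier decides mode ①, while the accumulated totals of mode ② still favour category A, which illustrates the robustness argument made in the paragraph above.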

The factor of the third aspect is that, when category identification is performed with the machine learning identification method of the invention, the comparison samples may be selected from a preset multimedia data sample library. In a specific application, the number L of comparison sample categories selected each time as input to the machine learning model f₁ may be made smaller than the number S of categories of known-category multimedia data contained in the library, where L and S are integers greater than 1. Comparison samples are then selected from the library multiple times, each selection serving as input to the machine learning model f₁, so that the sample to be identified undergoes category identification multiple times. The selection of comparison samples must traverse every multimedia data category contained in the library, and the comparison sample selection operation must be performed at least K times for each multimedia data category in the library, where K is a preset identification selection count threshold. Making the number L of comparison sample categories selected each time smaller than the number S of categories in the library is a locally selecting mode of category identification.

If, instead, multimedia data of all categories contained in the library were selected globally for the category identification of the sample to be identified, the amount of comparison computation would be huge and the computation too inefficient, and if the neural network of the machine learning model f₁ has too many layers, the model may be unable to process such a large amount of data effectively. Therefore, making the number L of comparison sample categories selected each time smaller than the number S of categories in the library, and selecting comparison samples multiple times as input to the machine learning model f₁ so that the sample to be identified undergoes category identification multiple times, reduces the amount of data the machine learning model f₁ processes in each category identification pass and avoids the problem that processing efficiency is too low or processing cannot be effectively executed. However, with local selection it may happen that the comparison samples chosen in a given selection do not include the category to which the sample to be identified belongs, so that no effective category identification result is obtained. Moreover, because each result vector element in the result vector output by the machine learning model f₁ is influenced by the input arrangement order of the comparison samples, comparison samples of the same category may influence the correlation identification of the sample to be identified differently at different positions in the input arrangement order, which may affect the category identification result. This is why the selection of comparison samples is required to traverse every category in the library, with at least K selection operations per category. The result vector elements in the result vectors output by the machine learning model f₁ in each category identification pass are then statistically compared to find the result vector element characterizing the highest degree of correlation, and the category of the comparison sample at the arrangement order position corresponding to that result vector element is determined as the category of the sample to be identified.
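One way to organise locally selecting category identification is sketched below. The scheduling strategy (K shuffled passes over the library, each cut into chunks of L categories) and the callback `run_model`, which stands in for one call to the trained machine learning model f₁, are assumptions made for illustration; the text only requires L < S, full traversal, and at least K selections per category.

```python
import random
from typing import Callable, List, Sequence


def recognition_rounds(categories: Sequence[str], L: int, K: int,
                       seed: int = 0) -> List[List[str]]:
    """Build rounds of exactly L categories so each category appears at least K times."""
    if not 1 < L < len(categories):
        raise ValueError("L must satisfy 1 < L < S")
    rng = random.Random(seed)
    rounds: List[List[str]] = []
    for _ in range(K):
        order = list(categories)
        rng.shuffle(order)
        for start in range(0, len(order), L):
            chunk = order[start:start + L]
            if len(chunk) < L:
                # Pad the last chunk with other categories to keep the input size fixed.
                rest = [c for c in order if c not in chunk]
                chunk += rng.sample(rest, L - len(chunk))
            rounds.append(chunk)
    return rounds


def identify_over_rounds(rounds: Sequence[Sequence[str]],
                         run_model: Callable[[Sequence[str]], Sequence[float]]) -> str:
    """Compare all result vector elements over all rounds; smaller means more correlated."""
    best_value, best_category = None, None
    for chunk in rounds:
        for value, category in zip(run_model(chunk), chunk):
            if best_value is None or value < best_value:
                best_value, best_category = value, category
    return best_category


library = list("ABCDEFG")                      # S = 7 categories
rounds = recognition_rounds(library, L=3, K=2)


def toy_model(chunk: Sequence[str]) -> List[float]:
    """Hypothetical stand-in for f_1: the true category is 'E'."""
    return [0.0 if c == "E" else 1.0 for c in chunk]


identified = identify_over_rounds(rounds, toy_model)
```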

Identification effect comparative example:

In this embodiment, the machine learning identification method used in the machine learning identification device for multimedia data classification provided by the invention is compared with several prior-art identification methods that use machine learning models. Recognition-effect comparison experiments are carried out on the same data set to verify the feasibility and effectiveness of the machine learning identification method used in the device provided by the invention.

In this embodiment, the method of the invention is labeled "LCNN". The prior-art machine learning models used for comparison include the BPL (Bayesian Program Learning) model (labeled "BPL [Lake2015]") described in the document "Lake, B.M., Salakhutdinov, R. & Tenenbaum, J.B. Human-level concept learning through probabilistic program induction. Science 350, 1332-1338 (2015)" and its supplementary material, the document "Vinyals, O., Blundell, C., Lillicr-", and the Convolutional Siamese Net model (labeled "Convolutional Siamese Net [Koch2015]") described in the document "Koch, G., Zemel, R. & Salakhutdinov, R. Siamese neural networks for one-shot image recognition (University of Toronto, 2015)".

In this embodiment, based on the Omniglot dataset, samples of 30, 60, 136, 156 and 964 categories are respectively selected from the training set provided by the Omniglot dataset to form training sets, with 20 samples per category, and each model participating in the comparison is trained on each of them. Then, 20-way one-shot (single-sample) category identification tests are performed using the 400 test samples of 20 categories (20 samples per category) provided in the document "Koch, G., Richard Zemel & Ruslan Salakhutdinov. ...".
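An N-way one-shot test of this kind can be sketched as below. The sketch is illustrative only: `one_shot_accuracy` and `score_fn` are hypothetical names, and how the single contrast sample per category is drawn (here, at random, excluding the query itself) is an assumption, since the text does not specify the protocol details.

```python
import random


def one_shot_accuracy(test_set, score_fn, seed=0):
    """test_set maps category -> list of samples; returns the fraction of
    queries whose category is identified correctly."""
    rng = random.Random(seed)
    categories = list(test_set)
    correct = total = 0
    for true_cat in categories:
        for i, query in enumerate(test_set[true_cat]):
            # One contrast sample per category (N-way, one shot); the query
            # itself is never used as its own contrast sample.
            support = []
            for cat in categories:
                candidates = [
                    s for j, s in enumerate(test_set[cat])
                    if not (cat == true_cat and j == i)
                ]
                support.append((cat, rng.choice(candidates)))
            rng.shuffle(support)  # vary the input arrangement order
            # score_fn is a hypothetical stand-in for the trained model:
            # one correlation value per contrast sample position.
            scores = score_fn(query, [s for _, s in support])
            best = max(range(len(scores)), key=scores.__getitem__)
            correct += support[best][0] == true_cat
            total += 1
    return correct / total
```

With 20 categories of 20 samples each, the loop evaluates 400 queries, matching the test size stated above.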

In this embodiment, in the method of the present invention, the first sub-learning model f_DP is a single-layer fully connected neural network, and the second sub-learning model f_DE is a 121-layer residual neural network (ResNet). The recognition accuracy statistics of the category identification tests of the prior-art machine learning models compared in this embodiment are shown in Table 1, and the recognition accuracy statistics of the category identification tests performed with the method of the present invention are shown in Table 2.
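The two-stage structure can be illustrated with a toy forward pass. This is a sketch under assumptions, not the embodiment: `f_de` here is a simple negative squared distance standing in for the 121-layer ResNet, it returns one scalar per pair, and the softmax at the end of `f_dp` is an assumed normalization that the text does not specify.

```python
import math


def f_de(target, contrast):
    """Stand-in for the second sub-learning model: scores one
    (target, contrast) pair. Assumption: negative squared distance."""
    return -sum((t - c) ** 2 for t, c in zip(target, contrast))


def f_dp(pair_outputs, weights, biases):
    """Single fully connected layer over the ordered pair outputs,
    followed by a softmax (the softmax is an assumption)."""
    logits = [
        sum(w * x for w, x in zip(row, pair_outputs)) + b
        for row, b in zip(weights, biases)
    ]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def f_1(target, contrasts, weights, biases):
    """Combined model: f_de per pair, outputs kept in input order, then f_dp."""
    pair_outputs = [f_de(target, c) for c in contrasts]
    return f_dp(pair_outputs, weights, biases)
```

Element i of the returned vector corresponds to the contrast sample at input position i, which is why the arrangement order of the contrast samples matters.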

TABLE 1

TABLE 2

As can be seen from Tables 1 and 2, the machine learning identification method of the present invention can subject the machine learning model f₁ to a larger number of differentiated learning training passes based on the same training sample set. Consequently, with the same number of training sample categories and the same number of training samples, its recognition accuracy is superior to that of the prior-art machine learning models participating in the comparison, which shows that the method has good feasibility and effectiveness for multimedia data category identification.
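How one set of samples yields many differentiated training passes can be sketched by enumerating input arrangement orders. The names below are hypothetical, and the one-hot label (1.0 at the position of the contrast sample that shares the target's category) is an assumed training target consistent with the meaning of the result vector, not a detail stated in the text.

```python
import itertools


def ordered_training_examples(target, target_category, contrasts, limit=None):
    """contrasts is a list of (category, sample) pairs. Yields one training
    example per input arrangement order: (target, ordered samples, label)."""
    orders = itertools.permutations(contrasts)
    for order in itertools.islice(orders, limit):
        # Assumed label: 1.0 where the contrast category matches the target.
        label = [1.0 if cat == target_category else 0.0 for cat, _ in order]
        yield target, [sample for _, sample in order], label
```

With n contrast samples there are n! possible orders, so `limit` caps how many are generated; even three contrast samples already give six distinct training examples for a single target.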

In summary, the machine learning identification method based on dual contrast learning of the present invention can use a certain amount of multimedia data samples of known categories, with different contrast sample input arrangement orders, to subject the machine learning model f₁ to multiple differentiated learning training passes. That is, a small number of training samples can provide a large amount of learning training for the machine learning model to reach the expected category identification effect, which greatly reduces the dependence on massive training samples and solves the problem that the practical applicability of existing multimedia data classification machine learning identification methods is limited by their need for a large number of training samples. Meanwhile, even if a certain multimedia data category has not undergone learning training, it is only necessary to add multimedia data samples of that category to the identification contrast sample database; when the sample to be identified is multimedia data of that category, the result vector output by the machine learning model f₁ can still reflect the difference between the sample to be identified and contrast samples of other categories and the correlation between the sample to be identified and contrast samples of the same category, so the category to which the sample to be identified belongs can still be determined from the correlation. Category identification can thus be conveniently extended to multimedia data categories that have not undergone learning training, which solves the problem of limited universality caused by the inability to directly classify and identify untrained categories. In addition, during category identification processing, a locally selected category identification mode can be adopted: the number L of contrast sample categories selected each time as the input of the machine learning model f₁ is smaller than the number S of categories of known-category multimedia data contained in the multimedia data sample library, and contrast samples are selected multiple times, each selection being input to the machine learning model f₁ together with the sample to be identified, so that category identification processing is performed multiple times. This reduces the amount of data the machine learning model f₁ processes in each category identification pass and avoids the problem that its processing efficiency is too low or that the processing cannot be executed effectively. Therefore, the machine learning identification method based on dual contrast learning well solves the problem that existing multimedia data classification machine learning identification methods are limited in practical applicability and universality by their dependence on a large number of training samples and their inability to directly classify and identify untrained categories, can be widely and effectively applied to more specific multimedia data classification scenarios, and has broad prospects for technical application and popularization.
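The extension to an untrained category can be pictured with a small contrast sample library. This is a hedged sketch: `ContrastLibrary` and `score_fn` are hypothetical, and `score_fn` scores a single (query, contrast) pair as a stand-in for the trained model, which is a simplification of the result vector described above.

```python
class ContrastLibrary:
    """Toy identification contrast sample database, keyed by category."""

    def __init__(self):
        self._samples = {}

    def add(self, category, sample):
        # A new category is made recognizable simply by storing its
        # samples; nothing here retrains the scoring model.
        self._samples.setdefault(category, []).append(sample)

    def identify(self, query, score_fn):
        # Return the category of the contrast sample with the highest score.
        scored = (
            (score_fn(query, sample), category)
            for category, samples in self._samples.items()
            for sample in samples
        )
        return max(scored, key=lambda pair: pair[0])[1]
```

In this toy setting, adding one sample of a previously unseen category changes the identification result for queries close to it without any change to `score_fn`.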

Finally, the above embodiments are only intended to illustrate the technical solution of the present invention and not to limit the same, and although the present invention has been described in detail with reference to the embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention, which should be covered by the claims of the present invention.

Claims

1. A machine learning identification method based on dual contrast learning is characterized by comprising the following steps:

step one: acquiring image data as training multimedia data, and selecting target identification samples and contrast samples from a plurality of multimedia data of different known categories as the input of a machine learning model f₁, to perform learning training on the machine learning model f₁; the machine learning model f₁ comprises a first sub-learning model f_DP and a second sub-learning model f_DE, the first sub-learning model f_DP being a convolutional neural network model or a fully-connected neural network model, and the second sub-learning model f_DE being a convolutional neural network model or a fully-connected neural network model; the selected contrast samples comprise a plurality of multimedia data of more than two different categories, an input arrangement order in which the contrast samples are input to the machine learning model f₁ is set, and the target identification samples and the contrast samples are combined according to a preset combination rule in accordance with the input arrangement order of the contrast samples, thereby forming a plurality of multimedia data sample combinations that retain the input arrangement order of the contrast samples; each multimedia data sample combination is respectively taken as the input of the second sub-learning model f_DE, the outputs of the second sub-learning model f_DE corresponding to the respective inputs are ordered according to the input arrangement order of the contrast samples to form a data vector serving as the input of the first sub-learning model f_DP, and the output of the first sub-learning model f_DP serves as the result vector of the machine learning model f₁; through learning training, each result vector element in the result vector output by the trained machine learning model f₁ is used to represent the correlation between the target identification sample and the category to which the contrast sample at the corresponding arrangement order position belongs, so that multimedia data samples of known categories can be used, with different contrast sample input arrangement orders, to perform learning training on the machine learning model f₁ multiple times;

step two: using the image data as the multimedia data to be recognized, and using the machine learning model f after learning training₁And carrying out category identification on the multimedia data to be identified, thereby realizing the classification identification of the image.

2. The machine learning identification method based on dual contrast learning of claim 1, wherein the target identification samples input to the machine learning model f₁ are one or more in number and all belong to the same category;

3. The machine learning identification method based on dual contrast learning of claim 1, wherein, in the process of performing learning training on the machine learning model f₁, the target identification samples and the contrast samples are selected from a preset multimedia data sample library; each time, a part of the known-category multimedia data contained in the multimedia data sample library is selected as target identification samples and contrast samples to perform learning training on the machine learning model f₁, and target identification samples and contrast samples are selected from the multimedia data sample library multiple times to perform learning training on the machine learning model f₁, so as to ensure that the selection of target identification samples and contrast samples traverses every multimedia data category contained in the multimedia data sample library and that the contrast sample selection operation is executed at least H times for each multimedia data category in the multimedia data sample library, where H is a set training selection count threshold.

4. The machine learning identification method based on dual contrast learning of claim 1, wherein the specific way of using the machine learning model f₁ after learning training to perform category identification on the multimedia data to be identified is as follows:

obtaining multimedia data serving as the object to be identified as a sample to be identified, and taking the sample to be identified together with contrast samples selected from a plurality of multimedia data of different known categories as the input of the machine learning model f₁ after learning training; the selected contrast samples comprise a plurality of multimedia data of more than two different categories, an input arrangement order in which the contrast samples are input to the machine learning model f₁ is set, and the samples to be identified and the contrast samples are combined according to a preset combination rule in accordance with the input arrangement order of the contrast samples, thereby forming a plurality of multimedia data sample combinations that retain the input arrangement order of the contrast samples; each multimedia data sample combination is respectively taken as the input of the second sub-learning model f_DE, the outputs of the second sub-learning model f_DE corresponding to the respective inputs are ordered according to the input arrangement order of the contrast samples to form a data vector serving as the input of the first sub-learning model f_DP, and the output of the first sub-learning model f_DP serves as the result vector of the machine learning model f₁; in the category identification process, each result vector element in the result vector output by the machine learning model f₁ is used to represent the correlation between the sample to be identified and the category to which the contrast sample at the corresponding arrangement order position belongs, so that the category to which the sample to be identified belongs is determined according to the correlation.

5. The machine learning identification method based on the dual contrast learning of claim 4, wherein the number of the acquired samples to be identified is one or more, and all belong to the same category;

6. The machine learning identification method based on the dual contrast learning of claim 4, wherein the number of the acquired samples to be identified is one or more, and all belong to the same category;

if a plurality of samples to be identified are obtained, they can be input to the machine learning model f₁ in batches for identification processing, and the specific way of inputting them to the machine learning model f₁ in batches is one of the following:

batch input mode ④: all contrast samples and all samples to be identified form one sample input set as the input of the machine learning model f₁.

7. The machine learning identification method based on dual contrast learning of claim 6, wherein the specific way of performing category identification processing according to the result vectors output by the machine learning model f₁ multiple times is one of the following:

8. The machine learning identification method based on dual contrast learning of claim 4, wherein the contrast samples are selected from a preset multimedia data sample library, the number L of contrast sample categories selected each time as the input of the machine learning model f₁ is smaller than the number S of categories of known-category multimedia data contained in the multimedia data sample library, L and S being integers greater than 1, and contrast samples are selected from the multimedia data sample library multiple times, each selection being input to the machine learning model f₁ together with the sample to be identified, so that category identification processing is performed on the sample to be identified multiple times, so as to ensure that the selection of contrast samples traverses every multimedia data category contained in the multimedia data sample library and that the contrast sample selection operation is executed at least K times for each multimedia data category in the multimedia data sample library, where K is a set identification selection count threshold;

then, the result vector elements in the result vectors output by the machine learning model f₁ in all category identification processing passes are statistically compared, the result vector element representing the highest degree of correlation is found, and the category of the contrast sample at the arrangement order position corresponding to that result vector element is judged to be the category of the sample to be identified.