CN113255838A - Image classification model training method, system and device, medium and classification method - Google Patents
- Publication number
- Publication number: CN113255838A (application CN202110723977.6A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
Abstract
The invention discloses an image classification model training method, system, device, medium and classification method, relating to the field of image classification. The method comprises: constructing a first image classification model; constructing a data set comprising N classes of images; randomly extracting one image from each class in the data set as the training sample of that class, yielding training samples for class 0 to class N−1 images; randomly extracting one image from each class in the data set and combining them to obtain the reference samples; obtaining the corresponding input data from each class's training sample together with the reference samples, and inputting that data into the first image classification model for iterative training, the second image classification model being obtained when the iterative training is complete. The image classification model designed by the invention achieves effective image classification and makes the network parameters converge faster during iteration.
Description
Technical Field
The invention relates to the field of image classification, and in particular to an image classification model training method, system, device, medium and classification method.
Background
The image classification task is a classic problem in the field of deep learning; when a network model is trained or used for prediction, the whole image is generally preprocessed and then input into the network.
Existing deep-learning-based classification technology generally extracts features from an image and uses back-propagation of a loss function to optimize the model parameters; such classification methods are prone to overfitting.
Disclosure of Invention
In order to solve the problems, the invention provides an image classification model training method, an image classification model training system, an image classification model training device, a medium and a classification method.
In order to achieve the above object, the present invention provides an image classification model training method, including:
constructing a first image classification model;
constructing a data set, wherein the data set comprises N types of images, the number corresponding to each type of image is 0, 1, … and N-1 in sequence, and N is an integer greater than or equal to 2;
randomly extracting an image from each type of image in the data set as a training sample of the type of image to obtain training samples of 0 th type to N-1 th type images;
randomly extracting one image from each class of images in the data set and combining them to obtain the reference samples;
obtaining first input data based on the training sample of the class 0 image and the reference sample, obtaining second input data based on the training sample of the class 1 image and the reference sample, …, and obtaining Nth input data based on the training sample of the class N-1 image and the reference sample;
inputting the first input data to the Nth input data into the first image classification model in sequence to carry out iterative training on the first image classification model, and obtaining a second image classification model after the iterative training is finished;
wherein a single training pass of the first image classification model proceeds as follows: the p-th input data is input into the first image classification model for feature extraction, yielding a first feature of the training sample in the p-th input data and second features of the reference samples in the p-th input data, where p is greater than or equal to 1 and less than or equal to N; a difference operation between the second features and the first feature yields difference features; the label corresponding to the difference features is encoded to obtain a label L; the difference features are input into a fully connected layer of the first image classification model to obtain an output vector O; a loss function computes the loss value between the output vector O and the label L, and the parameters of the first image classification model are adjusted based on the loss value.
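The single training pass above can be sketched as follows. This is a minimal NumPy illustration under assumed shapes: the feature extractor is abstracted away as precomputed feature vectors, and all function and variable names are hypothetical, not the patent's own.

```python
import numpy as np

def training_step(f_train, f_refs, true_class, W, b):
    """One training pass of the difference-feature scheme.

    f_train:    (m,) feature of the training sample
    f_refs:     (N, m) features of the N reference samples
    true_class: index i such that the training sample and the
                i-th reference sample share a label
    W, b:       fully connected layer mapping each m-dim
                difference feature to a scalar
    """
    # Difference features: D_i = F_i - F
    D = f_refs - f_train                       # (N, m)
    # Each difference feature is mapped to a scalar by the FC layer
    O = D @ W + b                              # (N,)
    # One-hot label: i-th component 1, the rest 0
    L = np.zeros(len(f_refs))
    L[true_class] = 1.0
    # Smooth-L1 loss between output vector O and label L
    diff = np.abs(O - L)
    loss = np.where(diff < 1.0, 0.5 * diff**2, diff - 0.5).mean()
    return O, L, loss

# Toy example with N=3 classes and m=4 feature dimensions
rng = np.random.default_rng(0)
f_refs = rng.normal(size=(3, 4))
f_train = f_refs[1] + 0.01 * rng.normal(size=4)   # sample close to class 1
O, L, loss = training_step(f_train, f_refs, 1, rng.normal(size=4), 0.0)
```

In an actual training loop the loss value would drive back-propagation through both the FC layer and the shared feature extractor; here only the forward computation is shown.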
Preferably, the parameters of the first image classification model are shared across the iterative training process. Sharing model parameters during iterative training allows the optimized parameters learned in each pass to be reused by the model.
The principle of the invention is as follows:
the method is used for classifying the preset type of images, the designed first image classification model is realized in a characteristic map difference mode, and meanwhile, a predictive value decoding method corresponding to the first image classification model is provided according to the characteristics of the designed first image classification model. If the predicted sample is from the ith class, its corresponding features must differ minimally from the features of the samples in the ith class and significantly from the features of the samples in the other classes. The method extracts the characteristics of one reference image respectively extracted from N classifications while extracting the characteristics of one training image, calculates the difference between the characteristics of the training image and the characteristics of each reference image, and finds out which reference image the characteristics of the training image are closest to. Meanwhile, a decoding method of the prediction classification (namely, a conversion method of which classification a given image belongs to at all) is given according to the characteristics of the designed network.
The label L obtained by encoding the label corresponding to the difference features serves the following purpose: a corresponding one-hot vector is constructed according to how the training sample was formed. In each iteration, an image I is first drawn from class i, and one image is drawn at random from each of classes 0 to N−1 in label order to form the reference sample set; the i-th of the N reference samples therefore carries the same label as I, so the i-th component of the code L is 1 and the remaining N−1 components are all 0. Comparing L with the network output O yields the loss value of the network, which is used to adjust the subsequent network parameters.
Preferably, in the method, the single training process of the first image classification model includes: the training sample of the k-th class image is denoted X_k and its label l_k, where k is greater than or equal to 0 and less than or equal to N−1;
the reference samples are denoted R_0, R_1, …, R_{N−1}, with corresponding labels l_0, l_1, …, l_{N−1};
X_k, R_0, R_1, …, R_{N−1} are input in turn into the first image classification model for feature extraction, yielding the first feature F and the second features F_0, F_1, …, F_{N−1}, where each feature is an m-dimensional vector and m is a hyper-parameter representing the feature dimension;
a difference operation on the second features and the first feature yields the difference features D_i = F_i − F, i = 0, 1, …, N−1;
the label corresponding to the difference features is encoded to obtain the label L;
the difference features are input into a fully connected layer of the first image classification model to obtain an output vector O; a loss function computes the loss value between the output vector O and the label L, and the parameters of the first image classification model are adjusted based on the loss value.
The difference operation on the extracted features finds the differences between the features of the reference sample set and the features of the training sample.
Preferably, the loss function in the method is a Smooth-L1 loss function.
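A Smooth-L1 loss is quadratic near zero and linear for large errors, which keeps outlier components from dominating the parameter updates. A sketch (the function name and the beta threshold are illustrative assumptions, not from the patent):

```python
import numpy as np

def smooth_l1(pred, target, beta=1.0):
    """Smooth-L1 (Huber-style) loss, averaged over components.

    Quadratic for |pred - target| < beta, linear beyond, so large
    per-component errors contribute at most a constant gradient.
    """
    diff = np.abs(pred - target)
    per_elem = np.where(diff < beta, 0.5 * diff**2 / beta, diff - 0.5 * beta)
    return per_elem.mean()

loss_small = smooth_l1(np.array([0.1]), np.array([0.0]))  # quadratic region
loss_large = smooth_l1(np.array([2.0]), np.array([0.0]))  # linear region
```

With the default beta = 1.0, an error of 0.1 costs 0.5 × 0.1² = 0.005, while an error of 2.0 costs 2.0 − 0.5 = 1.5.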
Preferably, the method adopts an One-Hot coding mode to code the label corresponding to the difference characteristic.
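One-Hot encoding of a class index can be illustrated with a minimal helper (names are illustrative):

```python
import numpy as np

def one_hot(index, num_classes):
    """One-hot encode a class index: component `index` is 1, the rest 0."""
    v = np.zeros(num_classes)
    v[index] = 1.0
    return v

# e.g. a training sample drawn from class 2 of a 6-class data set
label = one_hot(2, 6)
```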
Preferably, in the method, the first image classification model is a convolutional neural network model.
The invention also provides an image classification method, which comprises the following steps: obtaining the second image classification model by adopting the image classification model training method;
and inputting the image to be classified into the second image classification model, and outputting the classification result of the image to be classified by the second image classification model.
Preferably, in the image classification method, the image to be classified is denoted X, the reference samples are R_0, R_1, …, R_{N−1}, and their corresponding labels are l_0, l_1, …, l_{N−1};
X, R_0, R_1, …, R_{N−1} are input into the second image classification model, whose outputs are O_0, O_1, …, O_{N−1}, vectorized as O = (O_0, O_1, …, O_{N−1});
the output vector O of the second image classification model is decoded, and the classification result of the image to be classified is c = argmax_i O_i, the class corresponding to the maximum component of the output vector.
the invention also provides an image classification model training system, which comprises:
the model building unit is used for building a first image classification model;
the data set construction unit is used for constructing a data set, the data set comprises N types of images, the number corresponding to each type of image is 0, 1, … and N-1 in sequence, and N is an integer greater than or equal to 2;
a training sample obtaining unit, configured to randomly extract an image from each type of image in the data set as a training sample of the type of image, and obtain training samples of 0 th type to N-1 th type of images;
a reference sample obtaining unit, configured to randomly extract one image from each class of images in the data set and combine them to obtain the reference samples;
a model input data obtaining unit, configured to obtain first input data based on the training sample of the class 0 image and the reference samples, obtain second input data based on the training sample of the class 1 image and the reference samples, …, and obtain N-th input data based on the training sample of the class N−1 image and the reference samples;
the training unit is used for sequentially inputting the first input data to the Nth input data into the first image classification model to perform iterative training on the first image classification model, and a second image classification model is obtained after the iterative training is completed;
wherein a single training pass of the first image classification model proceeds as follows: the p-th input data is input into the first image classification model for feature extraction, yielding a first feature of the training sample in the p-th input data and second features of the reference samples in the p-th input data, where p is greater than or equal to 1 and less than or equal to N; a difference operation between the second features and the first feature yields difference features; the label corresponding to the difference features is encoded to obtain a label L; the difference features are input into a fully connected layer of the first image classification model to obtain an output vector O; a loss function computes the loss value between the output vector O and the label L, and the parameters of the first image classification model are adjusted based on the loss value.
The invention also provides an image classification model training device, which comprises a memory, a processor and a computer program which is stored in the memory and can run on the processor, wherein the processor realizes the steps of the image classification model training method when executing the computer program.
The invention also provides a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the image classification model training method.
One or more technical schemes provided by the invention at least have the following technical effects or advantages:
the invention can achieve the effect of image classification through the designed image classification model.
The invention can correctly decode the category information according to the predictive value decoding method adopted by the designed image classification model.
The invention can lead the parameters of the image classification model to be converged more quickly in the iterative training process.
In the process of convergence of the image classification model parameters, the overfitting degree of the training image classification model is reduced.
Drawings
The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention;
FIG. 1 is a schematic flow chart of an image classification model training method;
FIG. 2 is a schematic diagram of the structure of the model;
FIG. 3 is a diagram illustrating the effect of the model in the iterative training process;
fig. 4 is a schematic diagram of the image classification system based on depth feature difference.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments of the present invention and features of the embodiments may be combined with each other without conflicting with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described and thus the scope of the present invention is not limited by the specific embodiments disclosed below.
It should be understood that "system", "device", "unit" and/or "module" as used herein is a method for distinguishing different components, elements, parts, portions or assemblies at different levels. However, other words may be substituted by other expressions if they accomplish the same purpose.
As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. In general, the terms "comprises" and "comprising" merely indicate that explicitly identified steps and elements are included; these steps and elements do not form an exclusive list, and a method or apparatus may include other steps or elements.
The present description uses flowcharts to illustrate operations performed by a system according to embodiments of the present description. It should be understood that the operations need not be performed in exactly the order shown; the steps may instead be processed in reverse order or simultaneously. Other operations may also be added to the processes, or one or more steps may be removed from them.
Example one
Referring to fig. 1, fig. 1 is a schematic flow chart of an image classification model training method, the present invention provides an image classification model training method, including:
constructing a first image classification model;
constructing a data set, wherein the data set comprises N types of images, the number corresponding to each type of image is 0, 1, … and N-1 in sequence, and N is an integer greater than or equal to 2;
randomly extracting an image from each type of image in the data set as a training sample of the type of image to obtain training samples of 0 th type to N-1 th type images;
randomly extracting one image from each class of images in the data set and combining them to obtain the reference samples;
obtaining first input data based on the training sample of the class 0 image and the reference sample, obtaining second input data based on the training sample of the class 1 image and the reference sample, …, and obtaining Nth input data based on the training sample of the class N-1 image and the reference sample;
inputting the first input data to the Nth input data into the first image classification model in sequence to carry out iterative training on the first image classification model, and obtaining a second image classification model after the iterative training is finished;
wherein a single training pass of the first image classification model proceeds as follows: the p-th input data is input into the first image classification model for feature extraction, yielding a first feature of the training sample in the p-th input data and second features of the reference samples in the p-th input data, where p is greater than or equal to 1 and less than or equal to N; a difference operation between the second features and the first feature yields difference features; the label corresponding to the difference features is encoded to obtain a label L; the difference features are input into a fully connected layer of the first image classification model to obtain an output vector O; a loss function computes the loss value between the output vector O and the label L, and the parameters of the first image classification model are adjusted based on the loss value.
In the embodiment of the invention, the sample categories to be classified are numbered: if the data set contains N classes, the numbers corresponding to the classes are 0, 1, …, N−1 in sequence, where N is an integer greater than or equal to 2. In practical applications the number of image classes can be adjusted according to actual needs; the invention is not specifically limited in this respect.
A network model, the first image classification model, is designed; its structure is shown schematically in FIG. 2. In FIG. 2 the model processes the image from left to right: after the image is input into the model, it is processed in turn by a convolutional layer (conv), an activation layer (relu) and a pooling layer (pooling); the result is fed into a fully connected layer (FC), and the classification result is obtained by computing the loss function on the fully connected layer's output.
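The conv → relu → pooling → FC pipeline of FIG. 2 can be sketched in miniature as a NumPy forward pass. All shapes, random weights and function names below are illustrative assumptions, not the patent's actual network:

```python
import numpy as np

def conv2d(x, k):
    """Valid 2-D cross-correlation of a single-channel image with one kernel."""
    h, w = x.shape
    kh, kw = k.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def relu(x):
    """Element-wise rectified linear activation."""
    return np.maximum(x, 0.0)

def max_pool(x, s=2):
    """Non-overlapping s-by-s max pooling (input dims assumed divisible by s)."""
    h, w = x.shape
    return x.reshape(h // s, s, w // s, s).max(axis=(1, 3))

def forward(img, kernel, W, b):
    """conv -> relu -> pool -> flatten -> fully connected layer."""
    feat = max_pool(relu(conv2d(img, kernel)))
    return feat.ravel() @ W + b

rng = np.random.default_rng(1)
img = rng.normal(size=(6, 6))
kernel = rng.normal(size=(3, 3))      # conv output 4x4, pooled to 2x2 -> 4 dims
out = forward(img, kernel, rng.normal(size=(4, 6)), np.zeros(6))  # 6 classes
```

A real implementation would stack several such layers with learned weights; this sketch only shows how the data flows through the stages named in FIG. 2.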
And constructing a training sample of the 0 th class and a reference sample set corresponding to the training sample, and respectively extracting features.
One image is randomly selected from each of the N classes as a reference sample; the reference samples are recorded as R_0, R_1, …, R_{N−1}, with corresponding labels l_0, l_1, …, l_{N−1}. This set of reference samples contributes to the decoding of subsequent predicted values, so the order of the images must not be shuffled.
X_0, R_0, R_1, …, R_{N−1} are input into the network model in turn for feature extraction; the resulting features are F, F_0, F_1, …, F_{N−1} respectively, where each feature is an m-dimensional vector and m is a hyper-parameter representing the feature dimension.
A difference operation on the extracted features finds the difference between each reference feature and the training sample's feature. The calculation formula is: D_i = F_i − F, i = 0, 1, …, N−1.
One-Hot encoding is applied to the label corresponding to the difference features (other encoding schemes may also be used in the embodiment of the invention; the encoding scheme is not specifically limited). Because the difference is taken with a sample from class 0 as the reference, the label corresponding to the difference features can be expressed as L = (1, 0, …, 0), i.e. the value corresponding to class 0 is 1 and the remaining N−1 values are all 0.
Each difference feature D_i is input into the fully connected layer and mapped to a scalar value; the resulting outputs are O_0, O_1, …, O_{N−1} in turn, vectorized as O = (O_0, O_1, …, O_{N−1}).
The Smooth-L1 loss function computes the loss value between O and L, which serves as the basis for adjusting the model parameters; other types of loss function may also be used in the embodiment of the invention, and the specific type is not limited.
Training samples of classes 1, 2, …, N−1 and their corresponding reference sample sets are constructed in the same way, followed by feature extraction, difference operation and model parameter training.
And (3) predicting by using the trained model, wherein the specific process is as follows:
Assume the prediction sample to be classified is X; reference samples are fixedly selected from the N classes, again denoted R_0, R_1, …, R_{N−1}, with corresponding labels l_0, l_1, …, l_{N−1}.
X, R_0, R_1, …, R_{N−1} are input into the network model in turn; the resulting outputs are O_0, O_1, …, O_{N−1}, vectorized as O = (O_0, O_1, …, O_{N−1}).
The output vector O is decoded, and the predicted class is c = argmax_i O_i, i.e. the predicted classification is the class corresponding to the maximum value of the output vector.
The images in the invention may come from many different fields, and the images in each field may be divided into multiple categories; the invention does not specifically limit the field of the images or the classification categories. This embodiment is described using acquired license images relating to market-entity admission and permits. The designed network model is implemented using feature-map differences, and a predicted-value decoding method corresponding to the network model is provided according to the characteristics of the designed network model.
The acquired license images relating to market-entity admission and permits are used as an embodiment. Six types of license image are used in total, including: business licenses, food service licenses, cafeteria business licenses, pharmaceutical business licenses, and other types of licenses.
The sample categories to be classified are numbered: the data set contains 6 categories, numbered 0, 1, …, 5 in sequence.
A network model is designed, and features are extracted from training data randomly sampled from the 6 categories using shared network parameters. The specific process is as follows:
and constructing a training sample of the 0 th class and a reference sample set corresponding to the training sample, and respectively extracting features.
One reference sample is randomly selected from each of the 6 classes, recorded as R_0, R_1, …, R_5, with corresponding labels l_0, l_1, …, l_5. This set of reference samples contributes to the decoding of subsequent predicted values, so the order of the images must not be shuffled.
X_0, R_0, R_1, …, R_5 are input into the network model in turn for feature extraction; the resulting features are F, F_0, F_1, …, F_5, each an m-dimensional vector. In the experiment the feature-dimension hyper-parameter m is set to 10; its specific value can be adjusted flexibly according to actual needs, and the invention does not specifically limit it.
A difference operation on the obtained features yields the relative-change features, using the formula D_i = F_i − F, i = 0, 1, …, 5.
One-Hot encoding is applied to the label corresponding to the difference features; because the difference is taken with a sample from class 0 as the reference, the label can be expressed as L = (1, 0, 0, 0, 0, 0), i.e. the value corresponding to class 0 is 1 and the remaining 5 values are all 0.
Each difference feature D_i is input into the fully connected layer and mapped to a scalar; the outputs are O_0, O_1, …, O_5 in turn, vectorized as O = (O_0, O_1, …, O_5).
The Smooth-L1 loss function computes the loss value between O and L as the basis for adjusting the model parameters.
In the same way, training samples of classes 1, 2, …, 5 and their corresponding reference sample sets are constructed, followed by feature extraction, difference operation and model parameter training.
And (3) predicting by using the trained model, wherein the specific process is as follows:
Assume the prediction sample to be classified is X; reference samples are fixedly selected from the 6 classes, again denoted R_0, R_1, …, R_5, with corresponding labels l_0, l_1, …, l_5.
X, R_0, R_1, …, R_5 are input into the network model in turn; the resulting outputs are O_0, O_1, …, O_5, vectorized as O = (O_0, O_1, …, O_5).
The output vector O is decoded, and the predicted class is c = argmax_i O_i, i.e. the predicted classification is the class corresponding to the maximum value of the output vector.
Referring to FIG. 3, FIG. 3 shows the model's behaviour during iterative training: the solid line represents the change in accuracy over different iteration counts, while the dash-dot line and the dotted line represent the training average loss and the test average loss as functions of the iteration count, respectively. The curves show that the network parameters converge steadily as the iterations progress.
Example two
Referring to fig. 4, fig. 4 is a schematic diagram illustrating an image classification model training system, and a second embodiment of the present invention provides an image classification model training system, including:
the model building unit is used for building a first image classification model;
the data set construction unit is used for constructing a data set, the data set comprises N types of images, the number corresponding to each type of image is 0, 1, … and N-1 in sequence, and N is an integer greater than or equal to 2;
a training sample obtaining unit, configured to randomly extract an image from each type of image in the data set as a training sample of the type of image, and obtain training samples of 0 th type to N-1 th type of images;
a reference sample obtaining unit, configured to randomly extract one image from each class of images in the data set and combine them to obtain the reference samples;
a model input data obtaining unit, configured to obtain first input data based on the training sample of the class 0 image and the reference samples, obtain second input data based on the training sample of the class 1 image and the reference samples, …, and obtain N-th input data based on the training sample of the class N−1 image and the reference samples;
the training unit is used for sequentially inputting the first input data to the Nth input data into the first image classification model to perform iterative training on the first image classification model, and a second image classification model is obtained after the iterative training is completed;
wherein a single training pass of the first image classification model proceeds as follows: the p-th input data is input into the first image classification model for feature extraction, yielding a first feature of the training sample in the p-th input data and second features of the reference samples in the p-th input data, where p is greater than or equal to 1 and less than or equal to N; a difference operation between the second features and the first feature yields difference features; the label corresponding to the difference features is encoded to obtain a label L; the difference features are input into a fully connected layer of the first image classification model to obtain an output vector O; a loss function computes the loss value between the output vector O and the label L, and the parameters of the first image classification model are adjusted based on the loss value.
EXAMPLE III
The third embodiment of the invention provides an image classification model training device, which comprises a memory, a processor and a computer program which is stored in the memory and can run on the processor, wherein the processor realizes the steps of the image classification model training method when executing the computer program.
The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory may be used to store the computer program and/or modules, and the processor implements the various functions of the image classification model training device by running or executing the computer program and/or modules and invoking the data stored in the memory. The memory may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and the application programs required for at least one function (such as a sound playing function or an image playing function). Further, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, a plug-in hard disk, a smart media card, a secure digital card, a flash memory card, at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage device.
Example Four
The fourth example of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the image classification model training method.
The image classification model training device, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the methods of the embodiments of the present invention may also be implemented by a computer program instructing the relevant hardware; the computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory, a random access memory, a carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in the jurisdiction.
While the basic concepts have been described above, it will be apparent to those skilled in the art that the foregoing detailed disclosure is intended only as an example and does not limit the invention. Although not explicitly stated here, various modifications, improvements, and adaptations of this specification may occur to those skilled in the art. Such modifications, improvements, and adaptations are suggested in this specification and thus remain within the spirit and scope of the exemplary embodiments of this specification.
Also, this specification uses specific words to describe its embodiments. Reference to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of this specification. Therefore, it should be emphasized and noted that two or more references to "an embodiment," "one embodiment," or "an alternative embodiment" in various places in this specification do not necessarily all refer to the same embodiment. Furthermore, certain features, structures, or characteristics of one or more embodiments of this specification may be combined as appropriate.
Moreover, those skilled in the art will appreciate that aspects of this specification may be illustrated and described in terms of several patentable classes or contexts, including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of this specification may be implemented entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.), or in a combination of hardware and software. The above hardware or software may be referred to as a "data block," "module," "engine," "unit," "component," or "system." Furthermore, aspects of this specification may take the form of a computer program product embodied in one or more computer-readable media containing computer-readable program code.
The computer storage medium may comprise a propagated data signal with the computer program code embodied therewith, for example, on baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, etc., or any suitable combination. A computer storage medium may be any computer-readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated over any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or any combination of the preceding.
Computer program code required for the operation of various portions of this specification may be written in any one or more programming languages, including an object-oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, or Python; a conventional procedural programming language such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, or ABAP; a dynamic programming language such as Python, Ruby, or Groovy; or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any form of network, such as a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet), or in a cloud computing environment, or as a service, such as software as a service (SaaS).
Additionally, the order in which the elements and sequences of the process are recited in the specification, the use of alphanumeric characters, or other designations, is not intended to limit the order in which the processes and methods of the specification occur, unless otherwise specified in the claims. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that in the foregoing description of embodiments of this specification, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the embodiments. This method of disclosure, however, is not to be interpreted as implying that the claimed subject matter requires more features than are expressly recited in each claim. Indeed, the claimed embodiments may have fewer than all of the features of a single embodiment disclosed above.
For each patent, patent application, patent application publication, and other material, such as articles, books, specifications, publications, and documents, cited in this specification, the entire contents thereof are hereby incorporated by reference, except for application history documents that are inconsistent with or conflict with the contents of this specification, and except for documents (currently or later appended to this specification) that limit the broadest scope of the claims of this specification. It should be noted that if the descriptions, definitions, and/or use of terms in the materials accompanying this specification are inconsistent with or contrary to the contents of this specification, the descriptions, definitions, and/or use of terms in this specification shall control.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments of the present disclosure. Other variations are also possible within the scope of the present description. Thus, by way of example, and not limitation, alternative configurations of the embodiments of the specification can be considered consistent with the teachings of the specification. Accordingly, the embodiments of the present description are not limited to only those embodiments explicitly described and depicted herein.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
Claims (12)
1. An image classification model training method, characterized by comprising the following steps:
constructing a first image classification model;
constructing a data set, wherein the data set comprises N types of images, the number corresponding to each type of image is 0, 1, … and N-1 in sequence, and N is an integer greater than or equal to 2;
randomly extracting one image from each class of images in the data set as the training sample of that class, obtaining training samples of class 0 to class N-1 images;
randomly extracting one image from each class of images in the data set and combining them to obtain a reference sample;
obtaining first input data based on the training sample of the class 0 image and the reference sample, obtaining second input data based on the training sample of the class 1 image and the reference sample, …, and obtaining Nth input data based on the training sample of the class N-1 image and the reference sample;
inputting the first input data to the Nth input data into the first image classification model in sequence to carry out iterative training on the first image classification model, and obtaining a second image classification model after the iterative training is finished;
wherein the single training process of the first image classification model is as follows: inputting pth input data into the first image classification model for feature extraction, obtaining a first feature of the training sample in the pth input data and second features of the reference sample in the pth input data, wherein p is greater than or equal to 1 and less than or equal to N; performing a difference operation on the second features and the first feature to obtain difference features; encoding the labels corresponding to the difference features to obtain a label L; inputting the difference features into the fully-connected layer of the first image classification model to obtain an output vector O; computing the loss between the output vector O and the label L using a loss function, and adjusting the parameters of the first image classification model based on the loss value.
2. The method of claim 1, wherein the parameters of the first image classification model are shared during the iterative training process.
3. The method of claim 1, wherein the single training process of the first image classification model comprises: the training sample of the k-th class image is x_k, and the label corresponding to the training sample of the k-th class image is k, wherein k is greater than or equal to 0 and less than or equal to N-1;
the reference sample comprises r_0, r_1, …, r_{N-1}, and the labels corresponding to the reference sample are 0, 1, …, N-1;
x_k, r_0, r_1, …, r_{N-1} are sequentially input into the first image classification model for feature extraction to obtain the first feature and the second features, wherein the first feature is f_k and the second features are g_0, g_1, …, g_{N-1}, each of which is an m-dimensional vector, m being a hyper-parameter representing the feature dimension;
the difference operation is performed on the second features and the first feature to obtain the difference features d_i = g_i - f_k, for i = 0, 1, …, N-1;
the labels corresponding to the difference features are encoded to obtain the label L.
5. The image classification model training method according to claim 1, wherein the loss function is a Smooth-L1 loss function.
6. The method of claim 1, wherein the labels corresponding to the difference features are encoded using One-Hot encoding.
7. The method of claim 1, wherein the first image classification model is a convolutional neural network model.
8. An image classification method, characterized in that the method comprises: obtaining the second image classification model by adopting the image classification model training method of any one of claims 1 to 7;
and inputting the image to be classified into the second image classification model, and outputting the classification result of the image to be classified by the second image classification model.
9. The image classification method according to claim 8, wherein the image to be classified is x, the reference sample comprises r_0, r_1, …, r_{N-1}, and the labels corresponding to the reference sample are 0, 1, …, N-1;
x, r_0, r_1, …, r_{N-1} are input into the second image classification model, the outputs of which are o_0, o_1, …, o_{N-1}; the outputs of the second image classification model are expressed as the vector O = (o_0, o_1, …, o_{N-1});
the output vector O of the second image classification model is decoded, and the classification result of the image to be classified is obtained as the label corresponding to the largest component of O.
10. An image classification model training system, the system comprising:
the model building unit is used for building a first image classification model;
the data set construction unit is used for constructing a data set, the data set comprises N types of images, the number corresponding to each type of image is 0, 1, … and N-1 in sequence, and N is an integer greater than or equal to 2;
a training sample obtaining unit, configured to randomly extract an image from each type of image in the data set as a training sample of the type of image, and obtain training samples of 0 th type to N-1 th type of images;
a reference sample obtaining unit, configured to randomly extract an image combination from each type of image in the data set to obtain a reference sample;
a model input data obtaining unit, configured to obtain first input data based on the training sample of the class 0 image and the reference sample, obtain second input data based on the training sample of the class 1 image and the reference sample, …, and obtain Nth input data based on the training sample of the class N-1 image and the reference sample;
the training unit is used for sequentially inputting the first input data to the Nth input data into the first image classification model to perform iterative training on the first image classification model, and a second image classification model is obtained after the iterative training is completed;
wherein the single training process of the first image classification model is as follows: inputting pth input data into the first image classification model for feature extraction, obtaining a first feature of the training sample in the pth input data and second features of the reference sample in the pth input data, wherein p is greater than or equal to 1 and less than or equal to N; performing a difference operation on the second features and the first feature to obtain difference features; encoding the labels corresponding to the difference features to obtain a label L; inputting the difference features into the fully-connected layer of the first image classification model to obtain an output vector O; computing the loss between the output vector O and the label L using a loss function, and adjusting the parameters of the first image classification model based on the loss value.
11. An image classification model training device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the image classification model training method according to any one of claims 1 to 7 when executing the computer program.
12. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the image classification model training method according to any one of claims 1 to 7.
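The classification procedure of claims 8 and 9 can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: `W` and `V` are toy stand-ins for the trained feature extractor and fully-connected layer of the second image classification model, and the random inputs are placeholders for real images.

```python
import numpy as np

rng = np.random.default_rng(1)
N, D, m = 3, 8, 4             # number of classes, input dim, feature dim
W = rng.normal(size=(D, m))   # toy "trained" feature extractor (assumption)
V = rng.normal(size=(m, 1))   # toy "trained" fully-connected layer (assumption)

def classify(x, refs):
    # Pair the image with each reference sample, score the difference features,
    # then decode the output vector O by taking the index of its largest entry.
    f = x @ W                                          # feature of the image to classify
    scores = [float(((r @ W) - f) @ V) for r in refs]  # one score per difference feature
    O = np.array(scores)                               # output vector O
    return int(np.argmax(O))                           # decoded class label

refs = [rng.normal(size=D) for _ in range(N)]          # one reference image per class
x = rng.normal(size=D)                                 # image to be classified
pred = classify(x, refs)
print(pred)
```

Because the reference sample carries one image per class, the argmax over O directly yields the class label without a separate lookup table.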
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110723977.6A CN113255838A (en) | 2021-06-29 | 2021-06-29 | Image classification model training method, system and device, medium and classification method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113255838A true CN113255838A (en) | 2021-08-13 |
Family
ID=77190121
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110723977.6A Pending CN113255838A (en) | 2021-06-29 | 2021-06-29 | Image classification model training method, system and device, medium and classification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113255838A (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108345942A (en) * | 2018-02-08 | 2018-07-31 | 重庆理工大学 | A kind of machine learning recognition methods based on embedded coding study |
CN109492666A (en) * | 2018-09-30 | 2019-03-19 | 北京百卓网络技术有限公司 | Image recognition model training method, device and storage medium |
CN109815801A (en) * | 2018-12-18 | 2019-05-28 | 北京英索科技发展有限公司 | Face identification method and device based on deep learning |
CN109993236A (en) * | 2019-04-10 | 2019-07-09 | 大连民族大学 | Few sample language of the Manchus matching process based on one-shot Siamese convolutional neural networks |
CN110533057A (en) * | 2019-04-29 | 2019-12-03 | 浙江科技学院 | A kind of Chinese character method for recognizing verification code under list sample and few sample scene |
CN111242199A (en) * | 2020-01-07 | 2020-06-05 | 中国科学院苏州纳米技术与纳米仿生研究所 | Training method and classification method of image classification model |
CN111291765A (en) * | 2018-12-07 | 2020-06-16 | 北京京东尚科信息技术有限公司 | Method and device for determining similar pictures |
CN111797930A (en) * | 2020-07-07 | 2020-10-20 | 四川长虹电器股份有限公司 | Fabric material near infrared spectrum identification and identification method based on twin network |
CN111950728A (en) * | 2020-08-17 | 2020-11-17 | 珠海格力电器股份有限公司 | Image feature extraction model construction method, image retrieval method and storage medium |
US20210055737A1 (en) * | 2019-08-20 | 2021-02-25 | Volkswagen Ag | Method of pedestrian activity recognition using limited data and meta-learning |
CN112784929A (en) * | 2021-03-14 | 2021-05-11 | 西北工业大学 | Small sample image classification method and device based on double-element group expansion |
Non-Patent Citations (4)
Title |
---|
Gregory Koch et al.: "Siamese Neural Networks for One-shot Image Recognition", 32nd International Conference on Machine Learning *
Gregory Koch: "Siamese Neural Networks for One-shot Image Recognition", online: https://www.cs.toronto.edu/~gkoch/files/msc-thesis.pdf *
Iulia Alexandra Lungu et al.: "Multi-Resolution Siamese Networks for One-Shot Learning", 2020 2nd IEEE International Conference on Artificial Intelligence Circuits and Systems *
杨树_: "[Paper reading] Siamese Neural Networks for One-shot Image Recognition", online: https://blog.csdn.net/xiaoxu2050/article/details/83614806 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10740865B2 (en) | Image processing apparatus and method using multi-channel feature map | |
US20220108178A1 (en) | Neural network method and apparatus | |
US10803591B2 (en) | 3D segmentation with exponential logarithmic loss for highly unbalanced object sizes | |
US11836603B2 (en) | Neural network method and apparatus with parameter quantization | |
CN113469088B (en) | SAR image ship target detection method and system under passive interference scene | |
KR102250728B1 (en) | Sample processing method and device, related apparatus and storage medium | |
CN113344206A (en) | Knowledge distillation method, device and equipment integrating channel and relation feature learning | |
KR20200144398A (en) | Apparatus for performing class incremental learning and operation method thereof | |
US11120297B2 (en) | Segmentation of target areas in images | |
CN113256592B (en) | Training method, system and device of image feature extraction model | |
CN114168732A (en) | Text emotion analysis method and device, computing device and readable medium | |
EP3637327A1 (en) | Computing device and method | |
CN114444668A (en) | Network quantization method, network quantization system, network quantization apparatus, network quantization medium, and image processing method | |
CN112668381A (en) | Method and apparatus for recognizing image | |
WO2022125181A1 (en) | Recurrent neural network architectures based on synaptic connectivity graphs | |
CN110969239A (en) | Neural network and object recognition method | |
CN113468323A (en) | Dispute focus category and similarity judgment method, dispute focus category and similarity judgment system, dispute focus category and similarity judgment device and dispute focus category and similarity judgment recommendation method | |
CN113255838A (en) | Image classification model training method, system and device, medium and classification method | |
Gaihua et al. | Instance segmentation convolutional neural network based on multi-scale attention mechanism | |
CN111259673A (en) | Feedback sequence multi-task learning-based law decision prediction method and system | |
Cohen et al. | Deepbrain: Functional representation of neural in-situ hybridization images for gene ontology classification using deep convolutional autoencoders | |
Huang et al. | Principles of artificial intelligence in radiooncology | |
US20230206059A1 (en) | Training brain emulation neural networks using biologically-plausible algorithms | |
US20220343134A1 (en) | Convolutional neural network architectures based on synaptic connectivity | |
JP7520753B2 (en) | Learning device, method and program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20210813 |