CN109583501B - Method, device, equipment and medium for picture classification and for generating a classification recognition model

Publication number
CN109583501B
Authority
CN
China
Prior art keywords
classification
level
training
neural network
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811457125.1A
Other languages
Chinese (zh)
Other versions
CN109583501A (en)
Inventor
潘跃
刘振强
梁柱锦
Current Assignee
Bigo Technology Singapore Pte Ltd
Original Assignee
Guangzhou Baiguoyuan Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Baiguoyuan Information Technology Co Ltd
Priority to CN201811457125.1A
Publication of CN109583501A
Priority to PCT/CN2019/120903 (published as WO2020108474A1)
Application granted
Publication of CN109583501B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Abstract

The invention discloses a method, a device, equipment and a medium for picture classification and for generating a classification recognition model. The method comprises the following steps: acquiring a picture set to be classified, wherein the picture set comprises at least two pictures; inputting the picture set into a pre-trained current-level classification recognition model to obtain a classification score of each picture; if the classification score of a picture meets a preset condition, determining a classification recognition result of the picture according to the classification score; if the classification score of the picture does not meet the preset condition, continuing to input the picture into a pre-trained next-level classification recognition model, until a classification recognition result is obtained for every picture; wherein each level of classification recognition model is generated based on neural network training. The embodiment of the invention improves the accuracy and efficiency of picture classification.

Description

Method, device, equipment and medium for picture classification and for generating a classification recognition model
Technical Field
The embodiment of the invention relates to data processing technologies, and in particular to a method, a device, equipment and a medium for picture classification and for generating a classification recognition model.
Background
With the rapid development of deep learning technology, deep neural networks are used in a large amount in the field of image classification.
In the prior art, to give a classification recognition model generated based on deep neural network training higher classification accuracy, the depth of the deep neural network is generally increased.
In the process of implementing the invention, the inventors found that the prior art has at least the following problems. First, because the deep neural network is mainly trained by back-propagation of gradients, the training difficulty grows as the network depth increases. Second, because the forward inference of a deep neural network is computationally expensive, the amount of computation also grows as the network depth increases, which reduces classification efficiency.
Disclosure of Invention
The embodiment of the invention provides a method, a device, equipment and a medium for picture classification and for generating a classification recognition model, so as to improve the accuracy and efficiency of picture classification.
In a first aspect, an embodiment of the present invention provides a method for classifying pictures, where the method includes:
acquiring a picture set to be classified, wherein the picture set comprises at least two pictures;
inputting the picture set into a pre-trained current-level classification recognition model to obtain a classification score of each picture;
if the classification score of the picture meets a preset condition, determining a classification recognition result of the picture according to the classification score; if the classification score of the picture does not meet the preset condition, continuing to input the picture into a pre-trained next-level classification recognition model, until a classification recognition result is obtained for every picture in the picture set; wherein each level of classification recognition model is generated based on neural network training.
Further, after the picture set is input into the pre-trained current-level classification recognition model and the classification score of each picture is obtained, the method further includes:
obtaining the classification probability of each picture according to the classification score of each picture;
wherein the classification score of a picture meets the preset condition if the classification probability of the picture is greater than or equal to a probability threshold, and does not meet the preset condition if the classification probability of the picture is smaller than the probability threshold.
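As an illustration of this preset condition, the sketch below derives a picture's classification probability from its classification scores and checks it against a probability threshold. This is a minimal Python sketch, not the patent's implementation: the softmax mapping and the function names are assumptions, since the embodiment only states that the probability is obtained according to the score.

```python
import math

def classification_probability(scores):
    """Map one picture's per-class classification scores to probabilities.

    A numerically stable softmax; the patent does not fix the exact
    score-to-probability mapping, so this choice is an assumption.
    """
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def meets_preset_condition(scores, probability_threshold):
    """Preset condition from the embodiment: the picture's classification
    probability is greater than or equal to the probability threshold."""
    return max(classification_probability(scores)) >= probability_threshold
```

A picture whose top probability clears the threshold is resolved at the current level; otherwise it is handed on to the next-level model.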
In a second aspect, an embodiment of the present invention further provides a method for generating a classification recognition model, where the method includes:
acquiring a training sample, wherein the training sample comprises a training picture and an original classification label of the training picture;
inputting the training picture and the original classification label of the training picture into a neural network model to obtain the classification score of each level of neural network layer on the training picture and the classification score and classification label of each level of full connection layer on the training picture, wherein the neural network model comprises N levels of neural network layers and N-1 levels of full connection layers, the i-th level full connection layer is positioned behind the (i+1)-th level neural network layer, N is greater than or equal to 3, and i belongs to [1, N-1];
obtaining a first-level loss function of the first-level neural network layer according to the classification score of the first-level neural network layer on the training picture and the original classification label of the training picture;
obtaining a P-th level loss function of the P-th level neural network layer according to the classification score and the classification label of the (P-1)-th level full connection layer on the training picture, wherein P belongs to [2, N];
and determining a loss function of the neural network model according to the loss functions of all levels, and adjusting network parameters of each level of neural network layer and each level of full connection layer until the loss function of the neural network model reaches a preset function value, wherein each level of neural network layer serves as the classification recognition model of the corresponding level.
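The training objective above amounts to one loss per level: the first level is scored against the original label, every deeper level against the preceding full connection layer's output, and the per-level losses are combined into one model loss. The sketch below assumes a cross-entropy loss per level and a plain sum over levels; neither choice is stated in the patent, and all names are illustrative.

```python
import math

def cross_entropy(scores, label):
    """Cross-entropy of one picture's class scores against an integer label,
    with the log-sum-exp computed stably. The loss form is an assumption."""
    peak = max(scores)
    log_sum = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return log_sum - scores[label]

def model_loss(level1_scores, original_label, fc_scores, fc_labels):
    """Loss of the whole neural network model.

    The first-level loss uses the first-level neural network layer's scores
    and the original classification label; the P-th level loss (P >= 2) uses
    the scores and labels produced by the (P-1)-th full connection layer.
    Summing the per-level losses is an assumption.
    """
    total = cross_entropy(level1_scores, original_label)
    for scores, label in zip(fc_scores, fc_labels):
        total += cross_entropy(scores, label)
    return total
```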
Further, the classification score of each level of full connection layer for the training picture is generated as follows:
obtaining the classification score of the first-level full connection layer on the training picture according to the classification score of the first-level neural network layer on the training picture and the classification score of the second-level neural network layer on the training picture;
and obtaining the classification score of the P-th level full connection layer on the training picture according to the classification score of the (P-1)-th level full connection layer on the training picture and the classification score of the (P+1)-th level neural network layer on the training picture, wherein P belongs to [2, N-1].
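The chained score generation can be sketched as follows. The patent only says each full connection layer's score is obtained according to its two inputs; elementwise addition is used here purely as an illustrative stand-in, and the function names are hypothetical.

```python
def fuse_scores(upstream_scores, next_layer_scores):
    """Combine two score vectors for the same training picture.

    Elementwise addition is an assumption; the patent does not specify
    how the two inputs are combined.
    """
    return [a + b for a, b in zip(upstream_scores, next_layer_scores)]

def fc_scores_chain(nn_scores):
    """Scores of full connection layers 1..N-1 for one training picture.

    nn_scores[k] holds the (k+1)-th level neural network layer's scores.
    The first full connection layer fuses levels 1 and 2; the P-th fuses
    the (P-1)-th full connection layer with neural network level P+1.
    """
    fc = [fuse_scores(nn_scores[0], nn_scores[1])]
    for p in range(2, len(nn_scores)):
        fc.append(fuse_scores(fc[-1], nn_scores[p]))
    return fc
```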
Further, the classification label of each level of full connection layer for the training picture is generated as follows:
updating the original classification label of the training picture according to the classification score of the first-level neural network layer on the training picture to obtain the classification label of the first-level full connection layer on the training picture;
and updating the classification label of the (P-1)-th level full connection layer on the training picture according to the classification score of the (P-1)-th level full connection layer on the training picture to obtain the classification label of the P-th level full connection layer on the training picture, wherein P belongs to [2, N-1].
Further, the updating the original classification label of the training picture according to the classification score of the first-level neural network layer on the training picture to obtain the classification label of the first-level full connection layer on the training picture includes:
obtaining the classification probability of the first-level neural network layer for the training picture according to the classification score of the first-level neural network layer on the training picture;
if the classification probability of the first-level neural network layer for the training picture is greater than or equal to a first probability threshold, modifying the original classification label of the training picture into a preset classification label, and taking the preset classification label as the classification label of the first-level full connection layer on the training picture;
and if the classification probability of the first-level neural network layer for the training picture is smaller than the first probability threshold, keeping the original classification label of the training picture unchanged, and taking the original classification label of the training picture as the classification label of the first-level full connection layer on the training picture.
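This first-level label update reduces to a single confident-or-not branch. A minimal sketch, assuming the preset classification label is a sentinel value (here -1, an arbitrary choice) that marks pictures already resolved at this level:

```python
PRESET_LABEL = -1  # sentinel for "already confidently classified";
                   # the concrete value is an assumption

def update_label(probability, current_label, probability_threshold):
    """Label passed on to the next full connection layer.

    A picture the current level classifies with probability at or above
    the threshold gets the preset label, so deeper levels can ignore it;
    otherwise its current label is kept unchanged.
    """
    if probability >= probability_threshold:
        return PRESET_LABEL
    return current_label
```

The P-th level rule is the same branch applied to the (P-1)-th full connection layer's probability and the P-th probability threshold.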
Further, the updating the classification label of the (P-1)-th level full connection layer on the training picture according to the classification score of the (P-1)-th level full connection layer on the training picture to obtain the classification label of the P-th level full connection layer on the training picture, wherein P belongs to [2, N-1], includes:
obtaining the classification probability of the (P-1)-th level full connection layer for the training picture according to the classification score of the (P-1)-th level full connection layer on the training picture, wherein P belongs to [2, N-1];
if the classification probability of the (P-1)-th level full connection layer for the training picture is greater than or equal to a P-th probability threshold, modifying the classification label of the (P-1)-th level full connection layer on the training picture into the preset classification label, and taking the preset classification label as the classification label of the P-th level full connection layer on the training picture;
and if the classification probability of the (P-1)-th level full connection layer for the training picture is smaller than the P-th probability threshold, keeping the classification label of the (P-1)-th level full connection layer on the training picture unchanged, and taking it as the classification label of the P-th level full connection layer on the training picture.
Further, the determining a loss function of the neural network model according to the loss functions of all levels, and adjusting network parameters of each level of neural network layer and each level of full connection layer until the loss function of the neural network model reaches a preset function value, wherein each level of neural network layer serves as the classification recognition model of the corresponding level, includes:
determining the loss function of the neural network model according to the loss functions of all levels;
calculating partial derivatives of the loss function with respect to the network parameters of each level of neural network layer and each level of full connection layer, wherein the partial derivatives for training pictures corresponding to the preset classification label in the loss function are zero;
and adjusting the network parameters of each level of neural network layer and each level of full connection layer according to the partial derivatives, and recalculating the loss function until the loss function reaches the preset function value, wherein each level of neural network layer serves as the classification recognition model of the corresponding level.
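Making the partial derivatives zero for preset-label pictures is equivalent to excluding those pictures from the loss. A sketch of that masking over a batch, assuming the same -1 sentinel as the preset classification label and a plain sum over the surviving pictures (both assumptions):

```python
def masked_batch_loss(per_picture_losses, labels, preset_label=-1):
    """Sum the per-picture losses, skipping pictures whose classification
    label was set to the preset label. Skipped pictures contribute nothing
    to the loss, so the loss's partial derivatives with respect to their
    terms are zero and they no longer drive the deeper levels' updates.
    """
    return sum(loss for loss, label in zip(per_picture_losses, labels)
               if label != preset_label)
```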
In a third aspect, an embodiment of the present invention further provides a picture classification device, where the device includes:
a picture set acquisition module, used for acquiring a picture set to be classified, where the picture set comprises at least two pictures;
a classification score generation module, used for inputting the picture set into a pre-trained current-level classification recognition model to obtain a classification score of each picture;
and a classification recognition result generation module, used for determining a classification recognition result of a picture according to its classification score if the classification score of the picture meets a preset condition, and, if the classification score of the picture does not meet the preset condition, continuing to input the picture into a pre-trained next-level classification recognition model until a classification recognition result of the picture is obtained; wherein each level of classification recognition model is generated based on neural network training.
Further, the apparatus further comprises:
the classification probability acquisition module is used for obtaining the classification probability of each picture according to the classification score of each picture;
wherein the classification score of a picture meets the preset condition if the classification probability of the picture is greater than or equal to a probability threshold, and does not meet the preset condition if the classification probability of the picture is smaller than the probability threshold.
In a fourth aspect, an embodiment of the present invention further provides a device for generating a classification recognition model, where the device includes:
the training sample acquisition module is used for acquiring a training sample, and the training sample comprises a training picture and an original classification label of the training picture;
a classification score and classification label generation module, used for inputting the training picture and the original classification label of the training picture into a neural network model to obtain the classification score of each level of neural network layer on the training picture and the classification score and classification label of each level of full connection layer on the training picture, wherein the neural network model comprises N levels of neural network layers and N-1 levels of full connection layers, the i-th level full connection layer is positioned behind the (i+1)-th level neural network layer, N is greater than or equal to 3, and i belongs to [1, N-1];
a first-level loss function generation module, used for obtaining a first-level loss function of the first-level neural network layer according to the classification score of the first-level neural network layer on the training picture and the original classification label of the training picture;
a P-th level loss function generation module, used for obtaining a P-th level loss function of the P-th level neural network layer according to the classification score and the classification label of the (P-1)-th level full connection layer on the training picture, wherein P belongs to [2, N];
and a classification recognition model generation module, used for determining a loss function of the neural network model according to the loss functions of all levels, and adjusting network parameters of each level of neural network layer and each level of full connection layer until the loss function of the neural network model reaches a preset function value, wherein each level of neural network layer serves as the classification recognition model of the corresponding level.
Further, the classification score of each level of full connection layer for the training picture is generated as follows:
obtaining the classification score of the first-level full connection layer on the training picture according to the classification score of the first-level neural network layer on the training picture and the classification score of the second-level neural network layer on the training picture;
and obtaining the classification score of the P-th level full connection layer on the training picture according to the classification score of the (P-1)-th level full connection layer on the training picture and the classification score of the (P+1)-th level neural network layer on the training picture, wherein P belongs to [2, N-1].
Further, the classification label of each level of full connection layer for the training picture is generated as follows:
updating the original classification label of the training picture according to the classification score of the first-level neural network layer on the training picture to obtain the classification label of the first-level full connection layer on the training picture;
and updating the classification label of the (P-1)-th level full connection layer on the training picture according to the classification score of the (P-1)-th level full connection layer on the training picture to obtain the classification label of the P-th level full connection layer on the training picture, wherein P belongs to [2, N-1].
Further, the updating the original classification label of the training picture according to the classification score of the first-level neural network layer on the training picture to obtain the classification label of the first-level full connection layer on the training picture includes:
obtaining the classification probability of the first-level neural network layer for the training picture according to the classification score of the first-level neural network layer on the training picture;
if the classification probability of the first-level neural network layer for the training picture is greater than or equal to a first probability threshold, modifying the original classification label of the training picture into a preset classification label, and taking the preset classification label as the classification label of the first-level full connection layer on the training picture;
and if the classification probability of the first-level neural network layer for the training picture is smaller than the first probability threshold, keeping the original classification label of the training picture unchanged, and taking the original classification label of the training picture as the classification label of the first-level full connection layer on the training picture.
Further, the updating the classification label of the (P-1)-th level full connection layer on the training picture according to the classification score of the (P-1)-th level full connection layer on the training picture to obtain the classification label of the P-th level full connection layer on the training picture, wherein P belongs to [2, N-1], includes:
obtaining the classification probability of the (P-1)-th level full connection layer for the training picture according to the classification score of the (P-1)-th level full connection layer on the training picture, wherein P belongs to [2, N-1];
if the classification probability of the (P-1)-th level full connection layer for the training picture is greater than or equal to a P-th probability threshold, modifying the classification label of the (P-1)-th level full connection layer on the training picture into the preset classification label, and taking the preset classification label as the classification label of the P-th level full connection layer on the training picture;
and if the classification probability of the (P-1)-th level full connection layer for the training picture is smaller than the P-th probability threshold, keeping the classification label of the (P-1)-th level full connection layer on the training picture unchanged, and taking it as the classification label of the P-th level full connection layer on the training picture.
Further, the determining a loss function of the neural network model according to the loss functions of all levels, and adjusting network parameters of each level of neural network layer and each level of full connection layer until the loss function of the neural network model reaches a preset function value, wherein each level of neural network layer serves as the classification recognition model of the corresponding level, includes:
determining the loss function of the neural network model according to the loss functions of all levels;
calculating partial derivatives of the loss function with respect to the network parameters of each level of neural network layer and each level of full connection layer, wherein the partial derivatives for training pictures corresponding to the preset classification label in the loss function are zero;
and adjusting the network parameters of each level of neural network layer and each level of full connection layer according to the partial derivatives, and recalculating the loss function until the loss function reaches the preset function value, wherein each level of neural network layer serves as the classification recognition model of the corresponding level.
In a fifth aspect, an embodiment of the present invention further provides an apparatus, where the apparatus includes:
one or more processors;
a memory for storing one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to the first aspect or the second aspect of the embodiments of the present invention.
In a sixth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the method according to the first or second aspect of the present invention.
According to the embodiments of the invention, a picture set to be classified is acquired, wherein the picture set comprises at least two pictures; the picture set is input into a pre-trained current-level classification recognition model to obtain a classification score of each picture; if the classification score of a picture meets a preset condition, a classification recognition result of the picture is determined according to the classification score; and if the classification score of the picture does not meet the preset condition, the picture is further input into a pre-trained next-level classification recognition model until a classification recognition result of the picture is obtained, wherein each level of classification recognition model is generated based on neural network training. Because the pictures are classified by a multi-level classification recognition model, the accuracy and the efficiency of picture classification are improved.
Drawings
Fig. 1 is a flowchart of a picture classification method according to an embodiment of the present invention;
FIG. 2 is a flow chart of another method for classifying pictures in an embodiment of the present invention;
FIG. 3 is a flow chart of a method for generating a classification recognition model according to an embodiment of the present invention;
FIG. 4 is a flow chart of another method for generating a classification recognition model in an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a neural network model in an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a picture classification device according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an apparatus for generating a classification recognition model according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an apparatus in an embodiment of the present invention.
Detailed Description
In the following, optional features and examples are provided in each embodiment, and the features described in the embodiments may be combined to form multiple alternative solutions; each numbered embodiment should not be regarded as defining only one technical solution. The present invention will be described in further detail below with reference to the accompanying drawings and embodiments. It is to be understood that the specific embodiments described herein merely illustrate the invention and do not limit it. It should further be noted that, for convenience of description, the drawings show only the structures related to the present invention rather than all structures.
Examples
With the continuous development of network technology, networks have become more and more powerful. Through a network, people can upload pictures they shoot to a network platform, such as a short-video application or a live-streaming platform, where the pictures are viewed by other users. Because the quality of the pictures uploaded by users is uneven, some pictures may harm the physical and mental health of other users or even violate the law. Therefore, the pictures uploaded by users need to be audited, which requires classifying and identifying the audited pictures accurately. Moreover, the uploaded pictures differ in classification difficulty: if the category to which a picture belongs is easy to determine, the picture may be called a simple picture; if the category is not easy to determine, the picture may be called a difficult picture. It is of course understood that the above is only one application scenario requiring picture classification.
In the conventional technology, a classification recognition model generated based on deep neural network training can be adopted to classify the pictures. To obtain a classification recognition model with high classification accuracy, that is, one that can accurately determine the category of a picture regardless of whether the picture is simple or difficult, the depth of the deep neural network can be increased, but this causes the following problems: because the deep neural network is mainly trained by back-propagation of gradients, the training difficulty grows as the network depth increases; in addition, because the forward inference of a deep neural network is computationally expensive, the amount of computation also grows with the network depth, which reduces classification efficiency.
To solve the above problems, that is, to achieve high classification accuracy and improve classification efficiency without increasing the network depth, a multi-level classification recognition model may be used. Here, multi-level refers to models of different levels, where each level of classification recognition model classifies and identifies pictures of the corresponding difficulty. This is further described below with reference to specific embodiments.
Fig. 1 is a flowchart of a picture classification method according to an embodiment of the present invention. This embodiment is applicable to improving the accuracy and efficiency of picture classification. The method may be executed by a picture classification device, which may be implemented in software and/or hardware and may be configured in a device such as a computer or a mobile terminal. As shown in fig. 1, the method specifically includes the following steps:
Step 110, acquiring a picture set to be classified, wherein the picture set comprises at least two pictures.
In the embodiment of the present invention, the picture set to be classified may be a picture set uploaded to the network platform by users, or may be a pre-stored picture set; the specific source of the picture set may be selected according to the actual situation and is not limited herein. The picture set comprises at least two pictures, whose classification difficulty may be the same or different; that is, each picture has its own difficulty, so different pictures may need classification recognition models of different levels to determine their categories.
Step 120, inputting the picture set into a pre-trained current-level classification recognition model to obtain a classification score of each picture.
Step 130, determining whether the classification score of the picture meets a preset condition; if yes, go to step 140; if not, go to step 150.
Step 140, determining a classification recognition result of the picture according to the classification score of the picture.
Step 150, continuing to input the picture into a pre-trained next-level classification recognition model to obtain the classification score of the picture, and returning to execute step 130.
In the embodiment of the present invention, there are classification recognition models of different levels, and each level of classification recognition model may be used to classify and identify pictures of the corresponding difficulty. It can be understood that, because the difficulty of the pictures handled by each level differs, the complexity of the network structure of each level of classification recognition model usually also differs: the more complex the network structure of a classification recognition model, the more difficult the pictures it classifies and identifies, which enables the hierarchical identification of pictures described above. It can also be understood that, in the hierarchical identification process, the number of pictures passing through each successive level of classification recognition model decreases continuously, and the amount of computation decreases accordingly, thereby improving classification efficiency. It should be noted that the complexity of the network structure described here is relative. It should further be noted that the levels of classification recognition models are generated based on neural network training and are generated by collaborative training rather than by separate training; that is, during training, the classification scores of the levels of classification recognition models influence one another.
The current-level classification recognition model may refer to the classification recognition model that classifies the simplest pictures, and may be understood as the first-level classification recognition model. The next-level classification recognition model may refer to a classification recognition model that classifies pictures of greater difficulty, and may be understood as the second-level classification recognition model, the third-level classification recognition model, and so on.
After the picture set to be classified is obtained, the picture set is input into a pre-trained current-level classification recognition model to obtain the classification score of each picture in the picture set, and whether a picture needs to be input into the next-level classification recognition model is determined according to its classification score, until a classification recognition result is obtained for each picture in the picture set. Specifically: the picture set is input into the pre-trained current-level classification recognition model to obtain the classification score of each picture, and whether each picture's classification score meets a preset condition is determined. If the classification score of a picture meets the preset condition, the classification recognition result of the picture is determined according to its classification score, and the picture is not input into the next-level classification recognition model. If the classification score of a picture does not meet the preset condition, the picture is input into the next-level classification recognition model to obtain a new classification score, and whether that score meets the preset condition is determined in the same way; this continues, level by level, until the classification recognition result of each picture in the picture set is obtained.
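The level-by-level decision procedure above can be sketched as follows. The `classify_with_cascade` helper, the callable models, and the threshold-style condition are illustrative assumptions, not the patent's concrete implementation:

```python
# Sketch of the cascaded inference loop (steps 120-150): run a picture through
# successive classification recognition models until its score meets the
# preset condition. All names and interfaces here are illustrative.

def classify_with_cascade(picture, models, meets_condition):
    """Return the classification score from the first level whose score
    satisfies the preset condition; if no level satisfies it, fall back to
    the last level's score (as the text notes for pictures that never pass)."""
    score = None
    for model in models:               # current level, then next level, ...
        score = model(picture)         # step 120/150: get classification score
        if meets_condition(score):     # step 130: preset condition satisfied?
            return score               # step 140: result from this level
    return score                       # no level passed: use the last score


# Toy usage: two "levels" returning a confidence-like score.
level1 = lambda pic: 0.55              # shallow level, uncertain on this picture
level2 = lambda pic: 0.97              # deeper level is confident
result = classify_with_cascade("pic.jpg", [level1, level2],
                               meets_condition=lambda s: s >= 0.9)
```

Note how a picture only reaches the deeper (more expensive) model when the shallow model is not confident, which is the efficiency argument made below.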
The preset condition may be that the classification probability of the picture is greater than or equal to a probability threshold, wherein the classification probability of the picture is calculated according to the classification score of the picture.
It should be noted that the technical solution provided in the embodiment of the present invention is directed to the two-category (binary) classification of pictures, where the two categories represent "yes" and "no", each of which can be represented by a preset identifier. For example, when determining whether a picture contains illegal content, "yes" may be represented by "1" and "no" by "0": a classification score of "1" (i.e., yes) indicates that the picture contains illegal content, and a classification score of "0" (i.e., no) indicates that the picture does not contain illegal content.
Based on the above, determining the classification recognition result of a picture according to its classification score can be understood as follows: the possible classification scores are set in advance, for example "1" and "0", where "1" indicates "yes" and "0" indicates "no", and the specific meaning of "yes" and "no" is determined by the content to be recognized. Taking the determination of whether a picture contains illegal content as above: if the classification score is "1" (i.e., yes), the picture contains illegal content; if the classification score is "0" (i.e., no), the picture does not contain illegal content.
It can be understood that when the difficulty levels of the pictures in the picture set differ, the number of pictures input to each successive level of classification recognition model decreases. In addition, simple pictures are generally the majority; in other words, most pictures in the picture set can be accurately classified by the current-level classification recognition model, so the next-level classification recognition model has fewer pictures to classify, and the classification efficiency of the classification recognition models is therefore improved. Meanwhile, this process performs classification recognition according to the difficulty level of the pictures, which improves classification accuracy compared with having a single classification recognition model classify pictures of all difficulty levels. This improvement in accuracy can be understood as follows: when a single classification recognition model is trained, the classification scores that drive reverse gradient propagation include those of pictures of every difficulty level, not only the difficult ones. When classification recognition models of different levels are trained, the classification scores of pictures that were already classified at earlier levels no longer take part; that is, for the later-level classification recognition models, only the classification scores of the harder pictures drive reverse gradient propagation during training, and this training mechanism makes the specialization of each trained classification recognition model more pronounced.
It should be noted that, if the classification scores of some pictures still do not satisfy the preset condition after passing through all the pre-trained classification recognition models, the classification recognition results of those pictures can be determined according to the classification scores they obtain from the last-level classification recognition model.
For example, suppose there are N levels of classification recognition models, specifically a first-level classification recognition model, a second-level classification recognition model, … …, an (N-1)-th-level classification recognition model, and an N-th-level classification recognition model, and the picture set to be classified includes M pictures. The M pictures are input into the first-level classification recognition model to obtain the classification score of each picture; the classification scores of U pictures meet the preset condition, and their classification recognition results are determined from those scores. The remaining (M-U) pictures are input into the second-level classification recognition model; the classification scores of K pictures meet the preset condition, and their classification recognition results are determined from those scores. The remaining (M-U-K) pictures are input into the third-level classification recognition model; the classification scores of all (M-U-K) of these pictures meet the preset condition, and their classification recognition results are determined from those scores. At this point, the classification recognition result of each picture in the picture set has been obtained, and the classification recognition operation on the picture set to be classified is finished.
According to the technical scheme, the picture set to be classified is obtained and comprises at least two pictures, the picture set is input into a pre-trained current-level classification recognition model to obtain the classification score of each picture, and if the classification score of the picture meets a preset condition, the classification recognition result of the picture is determined according to the classification score; if the classification score of the picture does not meet the preset condition, the picture is continuously input into a pre-trained next-stage classification recognition model until a classification recognition result of the picture is obtained, each stage of classification recognition model is generated based on neural network training, and the picture is classified by adopting the multi-stage classification recognition model, so that the accuracy and the efficiency of picture classification are improved.
Optionally, on the basis of the above technical solution, after the image set is input into a pre-trained current-level classification recognition model to obtain a classification score of each image, the method may further include: and obtaining the classification probability of each picture according to the classification score of each picture. The classification score of the picture meets a preset condition that the classification probability of the picture is greater than or equal to a probability threshold; and if the classification score of the picture does not meet the preset condition, the classification probability of the picture is smaller than a probability threshold.
In the embodiment of the present invention, the picture set is input into the pre-trained current-level classification recognition model, the classification score of each picture is obtained, and the classification score can be understood as a vector, and it can be understood that the classification score of the picture set is composed of the classification scores of the pictures.
A classifier is used to calculate the classification probability of a picture from its classification score. Correspondingly, "if the classification score of the picture meets the preset condition, determining the classification recognition result of the picture according to the classification score of the picture" may specifically include: if the classification probability of the picture is greater than or equal to the probability threshold, determining the classification recognition result of the picture according to its classification score. "If the classification score of the picture does not meet the preset condition, inputting the picture into a pre-trained next-level classification recognition model until a classification recognition result of the picture is obtained" may specifically include: if the classification probability of the picture is smaller than the probability threshold, inputting the picture into the pre-trained next-level classification recognition model until the classification recognition result of the picture is obtained. It should be noted that the classifier may be Softmax, Logistic, or the like.
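The Softmax option named above can be sketched for the two-class case as follows; the threshold value 0.9 and the example logit vectors are illustrative assumptions:

```python
import math

# Sketch of turning a two-class classification score (a vector of two logits)
# into a classification probability with Softmax, then applying the
# probability threshold of the preset condition. Threshold 0.9 is an
# assumed value for illustration.

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def meets_preset_condition(scores, threshold=0.9):
    # The picture's classification probability is the largest class probability.
    return max(softmax(scores)) >= threshold

confident = [4.0, -1.0]   # strongly favours class 0: handled at this level
uncertain = [0.2, 0.1]    # nearly tied: sent to the next-level model
```

A picture whose Softmax probability clears the threshold stops at the current level; otherwise it continues down the cascade, exactly as steps 240 to 260 describe.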
Fig. 2 is a flowchart of another image classification method according to an embodiment of the present invention, and as shown in fig. 2, the method specifically includes the following steps:
step 210, obtaining a picture set to be classified, wherein the picture set comprises at least two pictures.
Step 220, inputting the picture set into a pre-trained current-level classification recognition model to obtain a classification score of each picture.
And step 230, obtaining the classification probability of each picture according to the classification score of each picture.
Step 240, determining whether the classification probability of the picture is greater than or equal to a probability threshold; if yes, go to step 250; if not, go to step 260.
And step 250, determining the classification recognition result of the picture according to the classification score of the picture.
Step 260, inputting the picture into a pre-trained next-level classification recognition model to obtain the classification score of the picture, and returning to step 230.
In the embodiment of the present invention, it should be noted that each class classification recognition model is generated based on neural network training.
According to the technical scheme, the picture set to be classified is obtained and comprises at least two pictures, the picture set is input into a pre-trained current-level classification recognition model to obtain the classification score of each picture, the classification probability of each picture is obtained according to the classification score of each picture, and if the classification probability of the picture is larger than or equal to a probability threshold value, the classification recognition result of the picture is determined according to the classification score; if the classification probability of the picture is smaller than the probability threshold, the picture is continuously input into a pre-trained next-stage classification recognition model until a classification recognition result of the picture is obtained, each stage of classification recognition model is generated based on neural network training, and the pictures are classified by adopting the multi-stage classification recognition model, so that the accuracy and the efficiency of picture classification are improved.
Fig. 3 is a flowchart of a method for generating a classification recognition model according to an embodiment of the present invention, where the method is applicable to a case where accuracy and efficiency of image classification are improved, and the method may be executed by a device for generating a classification recognition model, where the device may be implemented in software and/or hardware, and the device may be configured in a device, such as a computer or a mobile terminal. As shown in fig. 3, the method specifically includes the following steps:
step 310, obtaining a training sample, wherein the training sample comprises a training picture and an original classification label of the training picture.
In the embodiment of the invention, a training sample is obtained; the training sample may include training pictures and the original classification labels of the training pictures, and the number of training pictures is at least two. A classification label is used to represent the classification to which a training picture belongs.
And step 320, inputting the training pictures and the original classification labels of the training pictures into a neural network model to obtain the classification score of each level of neural network layer on the training pictures and the classification score and the classification labels of each level of full connection layer on the training pictures, wherein the neural network model comprises an N level neural network layer and an N-1 level full connection layer, the ith level full connection layer is positioned behind the (i + 1) level neural network layer, N is more than or equal to 3, and i belongs to [1, N-1 ].
In the embodiment of the invention, the neural network model may comprise N levels of neural network layers and N-1 levels of fully-connected layers, where the i-th-level fully-connected layer is located after the (i+1)-th-level neural network layer, N ≥ 3, and i ∈ [1, N-1]. A neural network is a mathematical model that takes the basic principles of biological neural networks as its starting point: after the structure of the human brain and its response mechanism to external stimulation are understood and abstracted, it simulates, with network topology knowledge as the theoretical basis, the processing mechanism of the human brain's nervous system for complex information. The model realizes information processing by adjusting the weights of the interconnections among a large number of internal nodes (neurons) according to the complexity of the system.
A neural network may be a convolutional neural network, a recurrent neural network, or a deep neural network; the convolutional neural network is taken as an example here. The core problem a convolutional neural network solves is how to automatically extract and abstract features and then map them to a task target to solve a practical problem. A convolutional neural network has the characteristic of weight sharing, that is, convolution kernels: the same feature at different positions of the image data can be extracted by the operation of one convolution kernel, because the same object at different positions in one image has essentially the same features. It can be understood that only part of the features can be obtained with one convolution kernel, so multi-kernel convolution is set up, with each convolution kernel learning a different feature, to extract the features of the picture. In image classification, the convolutional layers extract and combine low-level features into high-level features: low-level features are basic features such as textures and edges, while high-level features, such as human faces and the shapes of objects, express the attributes of the samples better; this process is the hierarchy of the convolutional neural network.
It should be noted that the fully-connected layer acts as the "classifier" of the whole convolutional neural network. If the convolutional-layer, excitation-layer, and pooling-layer operations map the raw data to a hidden-layer feature space, the fully-connected layer maps the learned "distributed feature representation" to the sample label space. In practice, a fully-connected layer may be implemented by a convolution operation: a fully-connected layer whose previous layer is also fully connected can be converted into a convolution with a 1x1 convolution kernel, and a fully-connected layer whose previous layer is a convolutional layer can be converted into a global convolution with an H x W convolution kernel, where H and W are respectively the height and width of the previous layer's convolution result. At present, because the parameters of fully-connected layers are redundant (the fully-connected layers alone can account for about 80% of the parameters of the entire network), some network models with excellent performance, such as residual network models, adopt global average pooling instead of a fully-connected layer to fuse the learned depth features; that is, a convolutional neural network may not include a fully-connected layer.
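The stated equivalence between a fully-connected layer and a global convolution can be checked numerically. The following sketch (assuming NumPy is available; shapes and random inputs are purely illustrative) shows that flattening a C x H x W feature map and applying a fully-connected weight matrix gives the same result as a single-position convolution whose kernel covers the whole H x W extent:

```python
import numpy as np

# Numerical check of "a fully-connected layer whose previous layer is a
# convolutional layer can be converted into a global convolution with an
# H x W kernel". Shapes below are illustrative assumptions.

rng = np.random.default_rng(0)
C, H, W, OUT = 3, 4, 5, 2
feature_map = rng.standard_normal((C, H, W))     # previous conv layer's output
weights = rng.standard_normal((OUT, C, H, W))    # shared FC / conv weights

# Fully-connected view: flatten the feature map and multiply.
fc_out = weights.reshape(OUT, -1) @ feature_map.reshape(-1)

# Global-convolution view: one valid position, kernel of size H x W.
conv_out = np.array([(weights[o] * feature_map).sum() for o in range(OUT)])

assert np.allclose(fc_out, conv_out)             # identical outputs
```

The two views use the same weights laid out differently, which is why the conversion mentioned in the text is possible without retraining.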
It should be further noted that the N-1 levels of fully-connected layers provided by the embodiment of the present invention are fully-connected layers outside the N levels of neural network layers; that is, each level of neural network layer may itself include a fully-connected layer, but the fully-connected layers included within the neural network layers are distinct from the N-1 levels of fully-connected layers in the neural network model.
Inputting the training sample into the neural network model, that is, inputting the training pictures and their original classification labels into the neural network model, yields the classification score of each level of neural network layer for the training pictures and the classification score and classification label of each level of fully-connected layer for the training pictures. The classification score and classification label of a fully-connected layer for the training pictures are used to calculate the loss function of a neural network layer, and the classification score of a neural network layer for the training pictures is used to calculate the classification score and classification label of a fully-connected layer.
And 330, obtaining a first-stage loss function of the first-stage neural network layer according to the classification scores of the first-stage neural network layer on the training pictures and the original classification labels of the training pictures.
And 340, obtaining a P-th level loss function of the P-th level neural network layer according to the classification scores and the classification labels of the P-1-th level full connection layer on the training pictures, wherein P belongs to [2, N ].
And 350, determining a loss function of the neural network model according to the loss functions of all levels, and adjusting network parameters of all levels of neural network layers and all levels of fully-connected layers until the loss function of the neural network model reaches a preset function value, wherein each level of neural network layer is used as a classification recognition model of a corresponding level.
In an embodiment of the invention, the loss function is a function that maps an event or the values of one or more variables to a real number that intuitively represents some "cost" associated with it; that is, the loss function maps an event of one or more variables to a real number associated with some cost. The loss function may be used to measure model performance and the inconsistency between actual and predicted values; model performance increases as the value of the loss function decreases. For the embodiment of the present invention, the predicted values here are the classification scores of the first-level neural network layer for the training pictures and the classification scores of each level of fully-connected layer for the training pictures, and the actual values are the original classification labels of the training pictures and the classification labels of each level of fully-connected layer for the training pictures. The loss function may be a cross-entropy loss function, a 0-1 loss function, a square loss function, an absolute loss function, a logarithmic loss function, or the like; it may be set according to the actual situation and is not specifically limited herein.
The training process of the neural network model is to calculate a loss function of the neural network model through forward propagation, calculate partial derivatives of the loss function to network parameters, and adjust the network parameters of each level of neural network layer and each level of full connection layer by adopting a reverse gradient propagation method until the loss function of the neural network model reaches a preset function value. When the loss function value of the neural network model reaches the preset function value, the training of the neural network model is completed, and at the moment, the network parameters of each level of neural network layer and each level of full connection layer are also determined. On the basis, each level of neural network layer is used as a classification recognition model of a corresponding level, namely, the first level of neural network layer is used as a first level of classification recognition model, the P level of neural network layer is used as a P level of classification recognition model, and P belongs to [2, N ].
It should be noted that the loss function of the neural network model according to the embodiment of the present invention is obtained by weighted summation of the loss functions of the N-level neural network layers. The first-level loss function of the first-level neural network layer is obtained by calculation according to the classification score of the first-level neural network layer on the training picture and the original classification label of the training picture, the P-level loss function of the P-level neural network layer is obtained by calculation according to the classification score of the P-1-level full-connection layer on the training picture and the classification label, and P belongs to [2, N ].
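The weighted summation of per-level losses described above can be sketched as follows. The cross-entropy form is one of the options the text names; the per-level weights and example logits are assumptions for illustration:

```python
import math

# Sketch of combining the per-level loss functions into the loss of the
# whole neural network model as a weighted sum. The first level is scored
# against the original label; level P is scored against the label held by
# the (P-1)-th fully-connected layer (both already collected as inputs here).
# Weights w_p and all example numbers are illustrative assumptions.

def cross_entropy(scores, label):
    # scores: two-class logits, label: the target class index (0 or 1).
    exps = [math.exp(s) for s in scores]
    return -math.log(exps[label] / sum(exps))

def model_loss(per_level_scores, per_level_labels, weights):
    return sum(w * cross_entropy(s, y)
               for w, s, y in zip(weights, per_level_scores, per_level_labels))

loss = model_loss(
    per_level_scores=[[2.0, -1.0], [0.5, 0.4], [3.0, -2.0]],  # 3 levels
    per_level_labels=[0, 0, 0],
    weights=[1.0, 1.0, 1.0],
)
```

Training would then back-propagate this single scalar, which is how the levels influence one another during the collaborative training the text describes.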
In addition, it can be understood that each fully-connected layer holds a classification label for each training picture; in other words, the classification label of a training picture is updated once after passing through each fully-connected layer. The updating is further explained here: for each training picture, the classification label held by the P-th-level fully-connected layer may be the same as, or different from, the classification label held by the previous fully-connected layers, where "previous" refers to all levels before the P-th level. "Updating" therefore refers to performing an update operation, whose result may be that the classification label of the training picture is changed (i.e., the P-th-level fully-connected layer's label for the training picture differs from the previous level's) or that it is left unchanged (i.e., the P-th-level fully-connected layer's label for the training picture is the same as the previous level's).
It can also be understood that, because the classification scores and classification labels from which the loss function of each level of neural network layer is determined differ, the loss functions of the various levels differ; and since the network parameters of each level of neural network layer and fully-connected layer are adjusted based on the loss function of the whole neural network model, the finally determined network structures of the various levels differ in complexity, and correspondingly so do the structures of the classification recognition models of the various levels. On this basis, each level of classification recognition model can classify pictures of the corresponding difficulty level; in other words, simple pictures can obtain satisfactory classification results through a classification recognition model with a simple structure, while difficult pictures obtain satisfactory classification results only through classification recognition models with more complex structures. That is, each level of classification recognition model processes the training pictures of its corresponding difficulty level, rather than every level processing all training pictures, which greatly improves classification efficiency.
Meanwhile, the N levels of neural network layers and the N-1 levels of fully-connected layers are generated by collaborative training rather than by separate training, and the results of each level of neural network layer and each level of fully-connected layer affect one another. The performance of each level of neural network layer trained in this way is superior to that of a neural network layer trained on its own. Because each level of neural network layer serves as the classification recognition model of the corresponding level, the performance of each level of classification recognition model obtained by this training is likewise superior to that of a classification recognition model obtained by training a single neural network layer alone.
In addition, when the N-level neural network layer is trained, the N-level neural network layer may be initialized by loading a pre-training model, where the pre-training model refers to a trained model, and the model and the N-level neural network layer to be trained are both used for classifying similar training samples.
According to the technical scheme, a training sample is obtained and comprises a training picture and an original classification label of the training picture, the training picture and the original classification label of the training picture are input into a neural network model, the classification score of each level of neural network layer on the training picture is obtained, the classification score and the classification label of each level of full connection layer on the training picture are obtained, the neural network model comprises an N level of neural network layer and an N-1 level of full connection layer, the ith level of full connection layer is positioned behind the (i + 1) level of neural network layer, N is larger than or equal to 3, and i belongs to [1, N-1 ]. Obtaining a first-stage loss function of the first-stage neural network layer according to the classification score of the first-stage neural network layer on the training picture and an original classification label of the training picture, obtaining a P-stage loss function of the P-stage neural network layer according to the classification score and the classification label of the P-1-stage full connection layer on the training picture, determining the loss function of the neural network model according to the loss functions of all stages, adjusting network parameters of all stages of neural network layers and all stages of full connection layers until the loss function of the neural network model reaches a preset function value, taking each stage of neural network layer as a corresponding stage of classification recognition model, obtaining a multi-stage classification recognition model by adopting a collaborative training mode, and improving the accuracy and efficiency of the classification recognition model for picture classification.
Optionally, on the basis of the above technical solution, the classification score of each full-link layer for the training picture may be generated as follows: and obtaining the classification score of the first-level full-connection layer to the training picture according to the classification score of the first-level neural network layer to the training picture and the classification score of the second-level neural network layer to the training picture. And obtaining the classification score of the P-level full-connection layer to the training picture according to the classification score of the P-1-level full-connection layer to the training picture and the classification score of the P + 1-level neural network layer to the training picture, wherein P belongs to [2, N ].
In the embodiment of the present invention, apart from the first-level fully-connected layer's classification score for the training pictures, the classification score of each other fully-connected layer for the training pictures may be generated as follows: the classification score of the P-th-level fully-connected layer for the training pictures is obtained according to the classification score of the (P-1)-th-level fully-connected layer for the training pictures and the classification score of the (P+1)-th-level neural network layer for the training pictures, where P ∈ [2, N].
The classification score of the first-level full-connection layer on the training picture can be generated in the following way: and obtaining the classification score of the first-level full-connection layer to the training picture according to the classification score of the first-level neural network layer to the training picture and the classification score of the second-level neural network layer to the training picture.
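The recursion above can be sketched as follows. This excerpt does not fix the combination operation, so element-wise addition is used here purely as a stand-in assumption:

```python
# Sketch of the fully-connected layers' score recursion: the first FC layer
# combines the level-1 and level-2 network scores; FC layer P combines FC
# layer P-1's score with the level-(P+1) network score. The `combine`
# operation (element-wise addition) is an illustrative assumption.

def combine(score_a, score_b):
    return [a + b for a, b in zip(score_a, score_b)]

def fc_scores(nn_scores):
    """nn_scores is the list of per-level neural network layer scores,
    in level order; returns the list of fully-connected layer scores."""
    fc = [combine(nn_scores[0], nn_scores[1])]   # first-level FC layer
    for next_nn in nn_scores[2:]:                # FC layer P, P >= 2
        fc.append(combine(fc[-1], next_nn))
    return fc

scores = fc_scores([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
```

Each fully-connected layer's score thus accumulates information from all shallower levels, which is what couples the levels during collaborative training.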
Optionally, on the basis of the above technical solution, the classification label of each level of the full-connected layer for the training picture may be generated as follows: and updating the original classification labels of the training pictures according to the classification scores of the first-stage neural network layer on the training pictures to obtain the classification labels of the first-stage full-connection layer on the training pictures. And updating the classification label of the P-1 level full connection layer to the training picture according to the classification score of the P-1 level full connection layer to the training picture to obtain the classification label of the P level full connection layer to the training picture, wherein P belongs to [2, N ].
In the embodiment of the present invention, each level of the full-connection layer has a classification label for the training picture; in other words, the classification label of the training picture is updated once every time the training picture passes through one level of the full-connection layer. Specifically, the classification label of each level of the full-connection layer for the training picture may be generated in the following manner: the original classification label of the training picture is updated according to the classification score of the first-level neural network layer on the training picture to obtain the classification label of the first-level full-connection layer on the training picture, and the classification label of the P-1-th-level full-connection layer on the training picture is updated according to the classification score of the P-1-th-level full-connection layer on the training picture to obtain the classification label of the P-th-level full-connection layer on the training picture. It should be noted that the updating described here refers to performing an updating operation; whether the classification label is actually changed is determined by whether the classification score of the network layer on the training picture satisfies a preset condition, where the preset condition may be: the classification probability of the network layer on the training picture, obtained according to the classification score of the network layer on the training picture, is greater than or equal to a probability threshold.
Optionally, on the basis of the above technical solution, the original classification label of the training picture is updated according to the classification score of the first-level neural network layer on the training picture, so as to obtain the classification label of the first-level full-link layer on the training picture, which may specifically include: and obtaining the classification probability of the first-stage neural network layer to the training pictures according to the classification scores of the first-stage neural network layer to the training pictures. And if the classification probability of the first-stage neural network layer to the training pictures is greater than or equal to the first probability threshold, modifying the original classification labels of the training pictures into preset classification labels, and taking the preset classification labels as the classification labels of the first-stage full-connection layer to the training pictures. And if the classification probability of the first-level neural network layer to the training pictures is smaller than a first probability threshold value, keeping the original classification labels of the training pictures unchanged, and taking the original classification labels of the training pictures as the classification labels of the first-level full-connection layer to the training pictures.
In the embodiment of the present invention, a classifier may be adopted to convert the classification score of the training picture into the classification probability of the training picture, where the classifier may be a Softmax function, and the Softmax function may map the classification score into a (0, 1) interval, which may be understood as a probability, that is, the classification score of the training picture may be converted into the classification probability of the training picture by Softmax. In addition, the classifier may also be a Logistic function, and which classifier is specifically selected may be determined according to actual conditions, which is not specifically limited herein.
Obtaining the classification probability of the first-level neural network layer to the training pictures according to the classification scores of the first-level neural network layer to the training pictures, if the classification probability of the first-level neural network layer to the training pictures is larger than or equal to a first probability threshold value, modifying the original classification labels of the training pictures into preset classification labels, and taking the preset classification labels as the classification labels of the first-level full-connection layer to the training pictures; if the classification probability of the first-level neural network layer to the training picture is smaller than the first probability threshold, the original classification label of the training picture can be kept unchanged, and the original classification label of the training picture is used as the classification label of the first-level full-connection layer to the training picture. The first probability threshold may be used as a criterion for determining whether to modify the original classification label of the training picture, and the specific value may be set according to an actual situation, which is not specifically limited herein.
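As a concrete illustration, a minimal sketch of the Softmax option mentioned above — the scores, threshold value, and variable names are hypothetical, not taken from the patent:

```python
import numpy as np

def softmax(scores):
    # Map raw classification scores into the (0, 1) interval so that
    # they can be read as class probabilities.
    shifted = scores - np.max(scores)      # shift for numerical stability
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

# Hypothetical two-class scores produced by the first-level neural network layer.
scores = np.array([2.0, 0.5])
probabilities = softmax(scores)

first_probability_threshold = 0.8          # illustrative value only
meets_condition = probabilities.max() >= first_probability_threshold
```

With these toy scores the top-class probability is about 0.82, so the picture would satisfy the first probability threshold of 0.8 and its label would be switched to the preset classification label.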
It should be noted that the target object here is each training picture, that is, it is necessary to determine the relation between the classification probability of each training picture and the first probability threshold, and determine whether the classification label of the training picture is modified or retained according to the result.
It should be noted that, if the classification probability of the first-level neural network layer to the training picture is greater than or equal to the first probability threshold, the reason why the classification label of the training picture is modified to the preset classification label is as follows: if the classification probability of the first-level neural network layer to the training picture is larger than or equal to the first probability threshold, the classification result of the first-level neural network layer to the training picture can be proved to be in accordance with the requirement, and the classification label of the training picture is modified into the preset classification label, so that the training picture corresponding to the preset classification label does not participate in the adjustment of the network parameters of the lower-level neural network layer and the full connection layer when the network parameters are adjusted according to the loss function subsequently.
If the classification probability of the first-level neural network layer to the training picture is smaller than the first probability threshold, the reason for keeping the original classification label of the training picture unchanged is as follows: if the classification probability of the first-level neural network layer to the training picture is smaller than the first probability threshold, it can be shown that the classification result of the first-level neural network layer to the training picture is not in accordance with the requirement, and the original classification label of the training picture is unchanged, so that the training picture participates in the adjustment of the network parameters of the lower-level neural network layer and the full connection layer when the network parameters are adjusted according to the loss function subsequently.
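The first-level update rule just described can be sketched as follows; the sentinel value used for the preset classification label is our own choice for illustration:

```python
def update_label(original_label, classification_probability, probability_threshold,
                 preset_label=-1):
    # If the layer classifies the picture confidently enough, replace its
    # label with the preset classification label so that the picture no
    # longer drives parameter adjustment in lower-level layers; otherwise
    # keep the original classification label unchanged.
    if classification_probability >= probability_threshold:
        return preset_label
    return original_label

# A confidently classified picture gets the preset label ...
label_a = update_label(original_label=3, classification_probability=0.92,
                       probability_threshold=0.8)
# ... while an uncertain one keeps its original classification label.
label_b = update_label(original_label=3, classification_probability=0.40,
                       probability_threshold=0.8)
```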
Optionally, on the basis of the above technical solution, the classification label of the P-1-th-level full-connection layer on the training picture is updated according to the classification score of the P-1-th-level full-connection layer on the training picture, so as to obtain the classification label of the P-th-level full-connection layer on the training picture, where P belongs to [2, N]. Specifically, the method may include: obtaining the classification probability of the P-1-th-level full-connection layer on the training picture according to the classification score of the P-1-th-level full-connection layer on the training picture, where P belongs to [2, N]. If the classification probability of the P-1-th-level full-connection layer on the training picture is greater than or equal to the P-th probability threshold, the classification label of the P-1-th-level full-connection layer on the training picture is modified into the preset classification label, and the preset classification label is taken as the classification label of the P-th-level full-connection layer on the training picture. If the classification probability of the P-1-th-level full-connection layer on the training picture is smaller than the P-th probability threshold, the classification label of the P-1-th-level full-connection layer on the training picture is kept unchanged and taken as the classification label of the P-th-level full-connection layer on the training picture.
In the embodiment of the present invention, as described above, the classifier may also be adopted to convert the classification score of the training picture into the classification probability of the training picture, where the classifier may be a Softmax function or a Logistic function, and which classifier is specifically selected may be determined according to an actual situation, and is not specifically limited herein.
The classification probability of the P-1-th-level full-connection layer on the training picture is obtained according to the classification score of the P-1-th-level full-connection layer on the training picture, where P belongs to [2, N]. If the classification probability of the P-1-th-level full-connection layer on the training picture is greater than or equal to the P-th probability threshold, the classification label of the P-1-th-level full-connection layer on the training picture is modified into the preset classification label, and the preset classification label is taken as the classification label of the P-th-level full-connection layer on the training picture; if the classification probability of the P-1-th-level full-connection layer on the training picture is smaller than the P-th probability threshold, the classification label of the P-1-th-level full-connection layer on the training picture can be kept unchanged and taken as the classification label of the P-th-level full-connection layer on the training picture. The P-th probability threshold may be used as a criterion for determining whether to modify the classification label of the P-1-th-level full-connection layer on the training picture; the specific value may be set according to the actual situation and is not specifically limited herein.
It should be noted that, as described above, the target object here is still each training picture, that is, it is necessary to determine the relation between the classification probability of each training picture and the pth probability threshold, and determine whether the classification label of the training picture is modified or retained according to the result.
It should be noted that, if the classification probability of the P-1-th-level full-connection layer on the training picture is greater than or equal to the P-th probability threshold, the reason why the classification label of the P-1-th-level full-connection layer on the training picture is modified into the preset classification label is as follows: a classification probability at or above the P-th probability threshold shows that the classification result of the P-1-th-level full-connection layer on the training picture already meets the requirement, and modifying the classification label of the training picture into the preset classification label ensures that, when the network parameters are subsequently adjusted according to the loss function, the training picture corresponding to the preset classification label does not participate in the adjustment of the network parameters of the lower-level neural network layers and full-connection layers.
If the classification probability of the P-1-th-level full-connection layer on the training picture is smaller than the P-th probability threshold, the reason for keeping the classification label of the P-1-th-level full-connection layer on the training picture unchanged is as follows: a classification probability below the P-th probability threshold shows that the classification result of the P-1-th-level full-connection layer on the training picture does not yet meet the requirement, and keeping the classification label unchanged ensures that the training picture still participates in the adjustment of the network parameters of the lower-level neural network layers and full-connection layers when the network parameters are subsequently adjusted according to the loss function.
The classification labels of the training pictures are updated once after passing through each level of full connection layer, so that the simple training pictures do not participate in the adjustment of the network parameters of the lower level neural network layer and the full connection layer, and the complexity of the neural network layer structures of all levels obtained by training is different.
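Putting the per-level rules together, the label bookkeeping over the whole cascade might look like the sketch below. It assumes that a label, once set to the preset value, stays preset at all later levels — an assumption we make for illustration, since a "done" picture should not re-enter training:

```python
def propagate_labels(original_label, level_probabilities, level_thresholds,
                     preset_label=-1):
    # Revisit the label once per full-connection layer: the i-th entry of
    # the returned list is the label held for this training picture after
    # the (i+1)-th level's confidence check.
    label = original_label
    labels_per_level = []
    for probability, threshold in zip(level_probabilities, level_thresholds):
        if label != preset_label and probability >= threshold:
            label = preset_label    # picture is "done": stop training on it
        labels_per_level.append(label)
    return labels_per_level

# A picture the second level classifies confidently keeps its original
# label after level 1 and carries the preset label from level 2 onward.
history = propagate_labels(5, [0.3, 0.9, 0.2], [0.8, 0.8, 0.8])
```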
Optionally, on the basis of the above technical scheme, determining a loss function of the neural network model according to the loss functions at each level, and adjusting network parameters of the neural network layers at each level and the fully-connected layers at each level until the loss function of the neural network model reaches a preset function value, where each level of the neural network layer is used as a corresponding level of the classification recognition model, and the method specifically includes: and determining the loss function of the neural network model according to the loss functions at all levels. Calculating partial derivatives of the loss function to network parameters of each level of neural network layer and each level of full connection layer, wherein the partial derivatives of training pictures corresponding to preset classification labels in the loss function are zero. And adjusting network parameters of each level of neural network layer and each level of full connection layer according to the partial derivative, and recalculating the loss function until the loss function reaches a preset function value, wherein each level of neural network layer is used as a classification identification model of a corresponding level.
In an embodiment of the present invention, determining the loss function of the neural network model according to the loss functions of each level can be understood as follows: the loss functions of each level are weighted and summed to obtain the loss function of the neural network model. A proportional coefficient corresponding to each level's loss function can be set; each level's loss function is multiplied by its corresponding proportional coefficient to obtain a weighted value, and the weighted values are added to obtain the loss function of the neural network model. Illustratively, if the loss function of the i-th-level neural network layer is Loss(f_i) and the proportional coefficient corresponding to each level's loss function is T_i, where i ∈ [1, N], then the loss function of the neural network model can be expressed as

Loss = T_1·Loss(f_1) + T_2·Loss(f_2) + … + T_N·Loss(f_N) = Σ_{i=1}^{N} T_i·Loss(f_i)
Based on the above, it can be understood that the proportion of each level's loss function Loss(f_i) in the loss function of the neural network model can be adjusted by adjusting the proportional coefficient T_i corresponding to that level's loss function. The loss function may be a cross-entropy loss function, a 0-1 loss function, a square loss function, an absolute loss function, a logarithmic loss function, or the like; it may be set according to the actual situation and is not specifically limited herein.
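The weighted sum of per-level losses reduces to a few lines of code; the loss values and coefficient values below are illustrative only:

```python
def neural_network_model_loss(level_losses, proportional_coefficients):
    # Loss(model) = sum over i of T_i * Loss(f_i): multiply each level's
    # loss by its proportional coefficient and add the weighted values.
    assert len(level_losses) == len(proportional_coefficients)
    return sum(t * l for t, l in zip(proportional_coefficients, level_losses))

# Three levels with illustrative per-level losses and coefficients:
# 0.5*1.0 + 0.3*2.0 + 0.2*3.0
total_loss = neural_network_model_loss([1.0, 2.0, 3.0], [0.5, 0.3, 0.2])
```

Raising one level's coefficient makes that level's loss dominate the total, which is exactly the knob the paragraph above describes.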
After the loss function of the neural network model is determined according to the loss functions of each level, the partial derivatives of the loss function with respect to the network parameters of each level of the neural network layer and each level of the full-connection layer are calculated, where the network parameters include weights and biases. The network parameters of each level of the neural network layer and each level of the full-connection layer are adjusted according to the partial derivatives by a backward gradient propagation method, and the loss function is recalculated until it reaches a preset function value, which may be a minimum loss function value. When the loss function reaches the preset function value, the training of the neural network model is completed; each level of the neural network layer can be determined according to the network parameters, and each level of the neural network layer is taken as the classification recognition model of the corresponding level.
It should be noted that, in the neural network model training process, the partial derivative of the training picture corresponding to the preset classification label in the loss function is zero, that is, the training picture corresponding to the preset classification label does not participate in the adjustment of the network parameters of the lower neural network layer and the full connection layer.
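The "zero partial derivative" behaviour can be imitated by masking the per-sample gradients of pictures carrying the preset classification label before they are applied; the sentinel value and all names below are our own, a sketch rather than the patent's implementation:

```python
import numpy as np

PRESET_LABEL = -1   # hypothetical sentinel for the preset classification label

def mask_preset_gradients(per_sample_gradients, labels):
    # Zero every gradient row contributed by a training picture whose label
    # has been set to the preset classification label, so those pictures do
    # not adjust the network parameters of lower-level layers.
    keep = (np.asarray(labels) != PRESET_LABEL).astype(per_sample_gradients.dtype)
    return per_sample_gradients * keep[:, np.newaxis]

grads = np.ones((3, 4))            # 3 pictures, 4 parameters (toy sizes)
masked = mask_preset_gradients(grads, [0, PRESET_LABEL, 2])
```

Only the middle picture's row is zeroed, so the subsequent parameter update ignores it — the effect described in the paragraph above.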
Fig. 4 is a flowchart of another method for generating a classification recognition model according to an embodiment of the present invention, where the method is applicable to a case where the accuracy and efficiency of image classification are improved, and the method may be executed by a device for generating a classification recognition model, where the device may be implemented in software and/or hardware, and the device may be configured in a device, such as a computer or a mobile terminal. As shown in fig. 4, the method specifically includes the following steps:
step 410, obtaining a training sample, wherein the training sample comprises a training picture and an original classification label of the training picture.
And step 420, inputting the training pictures and the original classification labels of the training pictures into the neural network model to obtain the classification scores of each level of neural network layer on the training pictures.
430, obtaining the classification score of the first-level full-connection layer on the training picture according to the classification score of the first-level neural network layer on the training picture and the classification score of the second-level neural network layer on the training picture; and obtaining the classification score of the P-th-level full-connection layer on the training picture according to the classification score of the P-1-th-level full-connection layer on the training picture and the classification score of the P+1-th-level neural network layer on the training picture, wherein P belongs to [2, N].
Step 440, updating the original classification labels of the training pictures according to the classification scores of the first-stage neural network layer on the training pictures to obtain the classification labels of the first full-connection layer on the training pictures; and updating the classification labels of the P-1 level full connection layer to the training pictures according to the classification scores of the P-1 level full connection layer to the training pictures to obtain the classification labels of the P level full connection layer to the training pictures.
Step 450, obtaining a first-stage loss function of the first-stage neural network layer according to the classification scores of the first-stage neural network layer on the training pictures and the original classification labels of the training pictures; and obtaining a P-th-level loss function of the P-th-level neural network layer according to the classification scores and the classification labels of the P-1-th-level full connection layer on the training pictures.
And step 460, determining a loss function of the neural network model according to the loss functions at all levels.
Step 470, calculating partial derivatives of the loss function to the network parameters of each level of neural network layer and each level of fully-connected layer, where the partial derivative of the training picture corresponding to the preset classification label in the loss function is zero.
And 480, adjusting network parameters of each level of neural network layer and each level of fully-connected layer according to the partial derivative, and recalculating the loss function until the loss function reaches a preset function value, wherein each level of neural network layer is used as a classification identification model of a corresponding level.
In an embodiment of the present invention, updating the original classification label of the training picture according to the classification score of the first-level neural network layer on the training picture to obtain the classification label of the first-level fully-connected layer on the training picture, which may specifically include: and obtaining the classification probability of the first-stage neural network layer to the training pictures according to the classification scores of the first-stage neural network layer to the training pictures. And if the classification probability of the first-stage neural network layer to the training pictures is greater than or equal to the first probability threshold, modifying the original classification labels of the training pictures into preset classification labels, and taking the preset classification labels as the classification labels of the first-stage full-connection layer to the training pictures. And if the classification probability of the first-level neural network layer to the training pictures is smaller than a first probability threshold value, keeping the original classification labels of the training pictures unchanged, and taking the original classification labels of the training pictures as the classification labels of the first-level full-connection layer to the training pictures.
Updating the classification label of the P-1 level full connection layer to the training picture according to the classification score of the P-1 level full connection layer to the training picture to obtain the classification label of the P level full connection layer to the training picture, wherein P belongs to [2, N ], and the method specifically comprises the following steps: and obtaining the classification probability of the P-1 level full connection layer to the training pictures according to the classification scores of the P-1 level full connection layer to the training pictures, wherein P belongs to [2, N ]. And if the classification probability of the P-1 level full connection layer to the training picture is more than or equal to the P probability threshold value, modifying the classification label of the P-1 level full connection layer to the training picture into a preset classification label, and taking the preset classification label as the classification label of the P level full connection layer to the training picture. And if the classification probability of the P-1 level full connection layer to the training picture is smaller than the P probability threshold, keeping the classification label of the P-1 level full connection layer to the training picture unchanged, and taking the classification label of the P-1 level full connection layer to the training picture as the classification label of the P level full connection layer to the training picture.
In order to better understand the technical solution provided by the embodiment of the present invention, the following description takes an example that the neural network model includes a three-level neural network layer and a two-level fully-connected layer, specifically:
fig. 5 is a schematic structural diagram of a neural network model according to an embodiment of the present invention. The neural network model comprises a three-level neural network layer and two-level full-connection layers, namely a first-level neural network layer, a second-level neural network layer and a third-level neural network layer, and the first-level full-connection layer and the second-level full-connection layer, wherein the first-level full-connection layer is positioned behind the second-level neural network layer, and the second-level full-connection layer is positioned behind the third-level neural network layer.
And acquiring a training sample, wherein the training sample comprises a training picture and an original classification label of the training picture, and inputting the training sample into the neural network model to obtain the classification score of each level of the neural network layer. The classification score of the first-level full-connection layer on the training picture is obtained according to the classification score of the first-level neural network layer on the training picture and the classification score of the second-level neural network layer on the training picture; and the classification score of the second-level full-connection layer on the training picture is obtained according to the classification score of the first-level full-connection layer on the training picture and the classification score of the third-level neural network layer on the training picture.
Updating the original classification labels of the training pictures according to the classification scores of the first-stage neural network layer on the training pictures to obtain the classification labels of the first full-connection layer on the training pictures; and updating the classification labels of the first-level full-connection layer to the training pictures according to the classification scores of the first-level full-connection layer to the training pictures to obtain the classification labels of the second-level full-connection layer to the training pictures.
Obtaining a first-stage loss function of the first-stage neural network layer according to the classification scores of the first-stage neural network layer on the training pictures and the original classification labels of the training pictures; and obtaining a second-level loss function of the second-level neural network layer according to the classification scores and the classification labels of the first-level full connection layer to the training pictures.
And determining a loss function of the neural network model according to the first-stage loss function and the second-stage loss function.
Calculating partial derivatives of the loss function to network parameters of each level of neural network layer and each level of full connection layer, wherein the partial derivatives of training pictures corresponding to preset classification labels in the loss function are zero.
And adjusting network parameters of each level of neural network layer and each level of full-connection layer according to the partial derivative, and recalculating the loss function until the loss function reaches a preset function value, wherein each level of neural network layer is used as a corresponding level of classification recognition model, namely, the first level of neural network layer is used as a first level of classification recognition model, the second level of neural network layer is used as a second level of classification recognition model, and the third level of neural network layer is used as a third level of classification recognition model.
According to the technical scheme, a training sample is obtained and comprises a training picture and an original classification label of the training picture, the training picture and the original classification label of the training picture are input into a neural network model, the classification score of each level of neural network layer on the training picture is obtained, the classification score and the classification label of each level of full connection layer on the training picture are obtained, the neural network model comprises an N level of neural network layer and an N-1 level of full connection layer, the ith level of full connection layer is positioned behind the (i + 1) level of neural network layer, N is larger than or equal to 3, and i belongs to [1, N-1 ]. Obtaining a first-stage loss function of the first-stage neural network layer according to the classification score of the first-stage neural network layer on the training picture and an original classification label of the training picture, obtaining a P-stage loss function of the P-stage neural network layer according to the classification score and the classification label of the P-1-stage full connection layer on the training picture, determining the loss function of the neural network model according to the loss functions of all stages, adjusting network parameters of all stages of neural network layers and all stages of full connection layers until the loss function of the neural network model reaches a preset function value, taking each stage of neural network layer as a corresponding stage of classification recognition model, obtaining a multi-stage classification recognition model by adopting a collaborative training mode, and improving the accuracy and efficiency of the classification recognition model for picture classification.
Fig. 6 is a schematic structural diagram of an image classification device according to an embodiment of the present invention, where the embodiment is applicable to a situation where accuracy and efficiency of image classification are improved, the device may be implemented in a software and/or hardware manner, and the device may be configured in a device, such as a computer or a mobile terminal. As shown in fig. 6, the apparatus specifically includes:
the image set obtaining module 510 is configured to obtain an image set to be classified, where the image set includes at least two images.
And a classification result generating module 520, configured to input the picture set into a pre-trained current-level classification recognition model, so as to obtain a classification score of each picture.
A classification recognition result generating module 530, configured to determine a classification recognition result of the picture according to the classification score if the classification score of the picture meets a preset condition; if the classification score of the picture does not meet the preset condition, continuing to input the picture into a pre-trained next-level classification recognition model until a classification recognition result of the picture is obtained; wherein each class classification recognition model is generated based on neural network training.
According to the technical scheme, the picture set to be classified is obtained and comprises at least two pictures, the picture set is input into a pre-trained current-level classification recognition model to obtain the classification score of each picture, and if the classification score of the picture meets a preset condition, the classification recognition result of the picture is determined according to the classification score; if the classification score of the picture does not meet the preset condition, the picture is continuously input into a pre-trained next-stage classification recognition model until a classification recognition result of the picture is obtained, each stage of classification recognition model is generated based on neural network training, and the picture is classified by adopting the multi-stage classification recognition model, so that the accuracy and the efficiency of picture classification are improved.
Optionally, on the basis of the above technical solution, the apparatus may further include:
and the classification probability acquisition module is used for obtaining the classification probability of each picture according to the classification score of each picture.
The classification score of the picture meets a preset condition that the classification probability of the picture is greater than or equal to a probability threshold; and if the classification score of the picture does not meet the preset condition, the classification probability of the picture is smaller than a probability threshold.
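The cooperation of modules 520 and 530 with the probability-threshold condition above amounts to the following inference cascade; the model callables, threshold values, and return conventions are assumptions made for the sketch, not part of the patent:

```python
def cascade_classify(picture, classification_models, probability_thresholds):
    # Run the picture through each level's classification recognition model
    # in turn; accept a level's result as soon as its classification
    # probability reaches that level's threshold, otherwise fall through to
    # the next-level model. The last level's result is used unconditionally.
    label, probability = None, None
    for model, threshold in zip(classification_models, probability_thresholds):
        label, probability = model(picture)
        if probability >= threshold:
            return label
    return label   # no level was confident: keep the last level's answer

# Toy stand-ins for trained per-level models: each returns (label, probability).
models = [lambda p: ("coarse", 0.55), lambda p: ("fine", 0.93)]
result = cascade_classify("some-picture", models, [0.8, 0.8])
```

Here the first-level model is not confident enough (0.55 < 0.8), so the picture falls through to the second-level model, whose confident answer is accepted.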
The picture classification apparatus configured on a device provided by the embodiment of the present invention can execute the picture classification method applied to a device provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method.
Fig. 7 is a schematic structural diagram of an apparatus for generating a classification recognition model according to an embodiment of the present invention. This embodiment is applicable to improving the accuracy and efficiency of picture classification. The apparatus may be implemented in software and/or hardware, and may be configured on a device such as a computer or a mobile terminal. As shown in fig. 7, the apparatus specifically includes:
the training sample obtaining module 610 is configured to obtain a training sample, where the training sample includes a training picture and an original classification label of the training picture.
And a classification score and classification label generation module 620, configured to input the training picture and the original classification label of the training picture into the neural network model to obtain the classification score of each level of neural network layer for the training picture, and the classification score and classification label of each level of full-connection layer for the training picture, where the neural network model includes N levels of neural network layers and N-1 levels of full-connection layers, the ith-level full-connection layer is located behind the (i+1)th-level neural network layer, N is greater than or equal to 3, and i belongs to [1, N-1].
The first-level loss function generating module 630 is configured to obtain a first-level loss function of the first-level neural network layer according to the classification score of the first-level neural network layer for the training picture and the original classification label of the training picture.
And the Pth-level loss function generating module 640 is configured to obtain a Pth-level loss function of the Pth-level neural network layer according to the classification score and classification label of the (P-1)th-level full connection layer for the training picture, where P belongs to [2, N].
And the classification recognition model generation module 650 is configured to determine a loss function of the neural network model according to the loss functions of all levels, and to adjust the network parameters of each level of neural network layer and each level of full-connection layer until the loss function of the neural network model reaches a preset function value; each level of neural network layer is then used as the classification recognition model of the corresponding level.
According to the technical scheme of this embodiment, a training sample is acquired, comprising a training picture and its original classification label. The training picture and its original classification label are input into a neural network model to obtain the classification score of each level of neural network layer for the training picture, together with the classification score and classification label of each level of full-connection layer for the training picture; the neural network model comprises N levels of neural network layers and N-1 levels of full-connection layers, the ith-level full-connection layer is located behind the (i+1)th-level neural network layer, N is greater than or equal to 3, and i belongs to [1, N-1]. A first-level loss function of the first-level neural network layer is obtained from the classification score of the first-level neural network layer for the training picture and the original classification label of the training picture, and a Pth-level loss function of the Pth-level neural network layer is obtained from the classification score and classification label of the (P-1)th-level full-connection layer for the training picture. The loss function of the neural network model is determined from the loss functions of all levels, and the network parameters of each level of neural network layer and each level of full-connection layer are adjusted until the loss function of the neural network model reaches a preset function value; each level of neural network layer is then used as the classification recognition model of the corresponding level. Since the multi-level classification recognition model is obtained by collaborative training, the accuracy and efficiency of picture classification by the classification recognition models are improved.
Optionally, on the basis of the above technical solution, the classification score of each level of full-connection layer for the training picture may be generated as follows:
and obtaining the classification score of the first-level full-connection layer to the training picture according to the classification score of the first-level neural network layer to the training picture and the classification score of the second-level neural network layer to the training picture.
And obtaining the classification score of the P-level full-connection layer to the training picture according to the classification score of the P-1-level full-connection layer to the training picture and the classification score of the P + 1-level neural network layer to the training picture, wherein P belongs to [2, N ].
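A minimal sketch of this score-fusion chain, in plain Python. The text does not specify how the two score vectors are combined; the affine (full-connection) transform over their concatenation used here, and the toy weights, are illustrative assumptions.

```python
def affine(weights, bias, x):
    """A full-connection (affine) transform: one output score per class."""
    return [sum(w * v for w, v in zip(row, x)) + b0
            for row, b0 in zip(weights, bias)]

def fused_fc_scores(prev_scores, next_level_scores, weights, bias):
    """Combine the incoming score vector (from the first-level neural
    network layer, or from the previous full-connection layer) with the
    next level of neural network layer's score vector, as described
    above. Concatenate-then-affine is an assumed realization."""
    return affine(weights, bias, prev_scores + next_level_scores)

NUM_CLASSES = 3
# Hypothetical parameters mapping 2 * NUM_CLASSES fused inputs to NUM_CLASSES.
W = [[0.5] * (2 * NUM_CLASSES) for _ in range(NUM_CLASSES)]
B = [0.0] * NUM_CLASSES

s1 = [1.0, 0.0, 0.0]                 # level-1 neural network layer scores
s2 = [0.0, 1.0, 0.0]                 # level-2 neural network layer scores
fc1 = fused_fc_scores(s1, s2, W, B)  # level-1 full-connection layer scores
fc2 = fused_fc_scores(fc1, [0.0, 0.0, 1.0], W, B)  # level-2 full-connection layer scores
```

Each full-connection layer thus sees both what the earlier levels concluded and what the next, deeper neural network layer scores, which is what chains the levels together.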
Optionally, on the basis of the above technical solution, the classification label of each level of the full-connected layer for the training picture may be generated as follows:
and updating the original classification labels of the training pictures according to the classification scores of the first-stage neural network layer on the training pictures to obtain the classification labels of the first-stage full-connection layer on the training pictures.
And updating the classification label of the P-1 level full connection layer to the training picture according to the classification score of the P-1 level full connection layer to the training picture to obtain the classification label of the P level full connection layer to the training picture, wherein P belongs to [2, N ].
Optionally, on the basis of the above technical solution, updating the original classification label of the training picture according to the classification score of the first-level neural network layer for the training picture to obtain the classification label of the first-level full-connection layer for the training picture may specifically include:
and obtaining the classification probability of the first-stage neural network layer to the training pictures according to the classification scores of the first-stage neural network layer to the training pictures.
And if the classification probability of the first-stage neural network layer to the training pictures is smaller than a first probability threshold value, modifying the original classification labels of the training pictures into preset classification labels, and taking the preset classification labels as the classification labels of the first-stage full-connection layer to the training pictures.
And if the classification probability of the first-level neural network layer to the training pictures is greater than or equal to the first probability threshold, keeping the original classification labels of the training pictures unchanged, and taking the original classification labels of the training pictures as the classification labels of the first-level full-connection layer to the training pictures.
Optionally, on the basis of the above technical solution, updating the classification label of the (P-1)th-level full connection layer for the training picture according to the classification score of the (P-1)th-level full connection layer for the training picture to obtain the classification label of the Pth-level full connection layer for the training picture, where P belongs to [2, N], may specifically include:
and obtaining the classification probability of the P-1 level full connection layer to the training pictures according to the classification scores of the P-1 level full connection layer to the training pictures, wherein P belongs to [2, N ].
And if the classification probability of the (P-1)th-level full connection layer for the training picture is smaller than the Pth probability threshold, modifying the classification label of the (P-1)th-level full connection layer for the training picture into a preset classification label, and taking the preset classification label as the classification label of the Pth-level full connection layer for the training picture.
And if the classification probability of the (P-1)th-level full connection layer for the training picture is greater than or equal to the Pth probability threshold, keeping the classification label of the (P-1)th-level full connection layer for the training picture unchanged, and taking it as the classification label of the Pth-level full connection layer for the training picture.
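Both label-update rules (the first-level rule and the Pth-level rule above) follow the same pattern and can be sketched together. The sentinel value used for the preset classification label is an assumption; the text only says such a label exists.

```python
PRESET_LABEL = -1  # assumed sentinel for the "preset classification label"

def update_labels(class_probs, labels, threshold, preset=PRESET_LABEL):
    """Per the rules above: a picture whose classification probability at
    the previous level is below that level's probability threshold has
    its label replaced by the preset label; otherwise the incoming label
    is kept and passed on to the next level of full-connection layer."""
    return [preset if prob < threshold else label
            for prob, label in zip(class_probs, labels)]
```

Pictures carrying the preset label are the ones later excluded from parameter adjustment at deeper levels (their partial derivatives are zero), which is what lets each level's trained structure differ in complexity.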
Optionally, on the basis of the above technical solution, determining the loss function of the neural network model according to the loss functions of all levels, and adjusting the network parameters of each level of neural network layer and each level of full connection layer until the loss function of the neural network model reaches a preset function value, with each level of neural network layer serving as the classification recognition model of the corresponding level, may specifically include:
and determining the loss function of the neural network model according to the loss functions at all levels.
Calculating partial derivatives of the loss function with respect to the network parameters of each level of neural network layer and each level of full connection layer, wherein the partial derivatives for training pictures carrying the preset classification label are zero.
And adjusting the network parameters of each level of neural network layer and each level of full connection layer according to the partial derivatives, and recalculating the loss function until the loss function reaches the preset function value; each level of neural network layer is then used as the classification recognition model of the corresponding level.
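The zeroed-partial-derivative rule can be illustrated with a softmax cross-entropy loss. The text does not name the concrete per-level loss, so cross-entropy here is an assumption, and `PRESET_LABEL` is a hypothetical sentinel value for the preset classification label.

```python
import math

PRESET_LABEL = -1  # hypothetical sentinel for the preset classification label

def softmax(scores):
    """Classification scores -> classification probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def masked_grads(score_vectors, labels):
    """Gradient of a softmax cross-entropy loss with respect to each
    picture's classification scores: (softmax - one_hot) for ordinary
    pictures, and an all-zero vector for pictures carrying the preset
    label, so those pictures do not influence the parameter update."""
    grads = []
    for scores, label in zip(score_vectors, labels):
        if label == PRESET_LABEL:
            grads.append([0.0] * len(scores))  # excluded from this level
        else:
            g = softmax(scores)
            g[label] -= 1.0
            grads.append(g)
    return grads
```

The network parameters are then adjusted with these gradients and the loss recalculated until it reaches the preset function value, at which point each level of neural network layer serves as the classification recognition model of its level.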
The apparatus for generating a classification recognition model configured on a device provided by the embodiment of the present invention can execute the method for generating a classification recognition model applied to a device provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method.
Fig. 8 is a schematic structural diagram of a device according to an embodiment of the present invention. Fig. 8 illustrates a block diagram of an exemplary device 712 suitable for implementing embodiments of the present invention. The device 712 shown in fig. 8 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present invention.
As shown in FIG. 8, device 712 may take the form of a general purpose computing device. Components of device 712 may include, but are not limited to: one or more processors 716, a system memory 728, and a bus 718 that couples the various system components including the system memory 728 and the processors 716.
Bus 718 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.
Device 712 typically includes a variety of computer system readable media. Such media may be any available media that are accessible by device 712, and include both volatile and nonvolatile media, removable and non-removable media.
The system memory 728 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 730 and/or cache memory 732. Device 712 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 734 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 8, and commonly referred to as a "hard drive"). Although not shown in FIG. 8, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a Compact Disc Read-Only Memory (CD-ROM), a Digital Versatile Disc (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to the bus 718 by one or more data media interfaces. Memory 728 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
Program/utility 740 having a set (at least one) of program modules 742 may be stored, for instance, in memory 728, such program modules 742 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may include an implementation of a network environment. Program modules 742 generally perform the functions and/or methodologies of embodiments of the invention as described herein.
Device 712 may also communicate with one or more external devices 714 (e.g., keyboard, pointing device, display 724, etc.), with one or more devices that enable a user to interact with device 712, and/or with any devices (e.g., network card, modem, etc.) that enable device 712 to communicate with one or more other computing devices. Such communication may occur through input/output (I/O) interfaces 722. Also, the device 712 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 720. As shown, the network adapter 720 communicates with the other modules of the device 712 via bus 718. It should be appreciated that although not shown in FIG. 8, other hardware and/or software modules may be used in conjunction with device 712, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, Redundant Arrays of Independent Disks (RAID) systems, tape drives, and data backup storage systems, to name a few.
The processor 716 executes various functional applications and data processing by running programs stored in the system memory 728, for example, implementing a picture classification method provided by the embodiment of the present invention, the method includes:
and acquiring a picture set to be classified, wherein the picture set comprises at least two pictures.
And inputting the picture set into a pre-trained current-level classification recognition model to obtain the classification score of each picture.
If the classification score of the picture meets the preset condition, determining a classification recognition result of the picture according to the classification score; if the classification score of the picture does not meet the preset condition, continuing to input the picture into a pre-trained next-level classification recognition model until a classification recognition result of the picture is obtained; wherein each level of classification recognition model is generated based on neural network training.
An embodiment of the present invention further provides another device, including: one or more processors; and a memory for storing one or more programs. When the one or more programs are executed by the one or more processors, the one or more processors implement the method for generating a classification recognition model provided by the embodiments of the present invention, the method including:
and acquiring a training sample, wherein the training sample comprises a training picture and an original classification label of the training picture.
Inputting the training picture and the original classification label of the training picture into a neural network model to obtain the classification score of each level of neural network layer for the training picture, and the classification score and classification label of each level of full connection layer for the training picture, wherein the neural network model comprises N levels of neural network layers and N-1 levels of full connection layers, the ith-level full connection layer is located behind the (i+1)th-level neural network layer, N is greater than or equal to 3, and i belongs to [1, N-1].
And obtaining a first-stage loss function of the first-stage neural network layer according to the classification scores of the first-stage neural network layer on the training pictures and the original classification labels of the training pictures.
And obtaining a P-th level loss function of the P-th level neural network layer according to the classification score and the classification label of the P-1-th level full connection layer on the training picture, wherein P belongs to [2, N ].
And determining a loss function of the neural network model according to the loss functions of all levels, and adjusting network parameters of all levels of neural network layers and all levels of fully-connected layers until the loss functions of the neural network model reach preset function values, wherein each level of neural network layer is used as a classification identification model of a corresponding level.
Of course, those skilled in the art can understand that the processor may also implement the technical solution of the picture classification method, or of the method for generating a classification recognition model, applied to a device provided in any embodiment of the present invention. For the hardware structure and functions of the device, reference may be made to the contents of the above embodiments.
An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements a picture classification method provided in an embodiment of the present invention, where the method includes:
and acquiring a picture set to be classified, wherein the picture set comprises at least two pictures.
And inputting the picture set into a pre-trained current-level classification recognition model to obtain the classification score of each picture.
If the classification score of the picture meets the preset condition, determining a classification recognition result of the picture according to the classification score; if the classification score of the picture does not meet the preset condition, continuing to input the picture into a pre-trained next-level classification recognition model until a classification recognition result of the picture is obtained; wherein each level of classification recognition model is generated based on neural network training.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable Computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM), a flash Memory, an optical fiber, a portable compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, or C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter case, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
Embodiments of the present invention also provide another computer-readable storage medium, where the computer-executable instructions, when executed by a computer processor, perform a method for generating a classification recognition model, the method comprising:
and acquiring a training sample, wherein the training sample comprises a training picture and an original classification label of the training picture.
Inputting the training picture and the original classification label of the training picture into a neural network model to obtain the classification score of each level of neural network layer for the training picture, and the classification score and classification label of each level of full connection layer for the training picture, wherein the neural network model comprises N levels of neural network layers and N-1 levels of full connection layers, the ith-level full connection layer is located behind the (i+1)th-level neural network layer, N is greater than or equal to 3, and i belongs to [1, N-1].
And obtaining a first-stage loss function of the first-stage neural network layer according to the classification scores of the first-stage neural network layer on the training pictures and the original classification labels of the training pictures.
And obtaining a P-th level loss function of the P-th level neural network layer according to the classification score and the classification label of the P-1-th level full connection layer on the training picture, wherein P belongs to [2, N ].
And determining a loss function of the neural network model according to the loss functions of all levels, and adjusting network parameters of all levels of neural network layers and all levels of fully-connected layers until the loss functions of the neural network model reach preset function values, wherein each level of neural network layer is used as a classification identification model of a corresponding level.
Of course, the computer-executable instructions of the computer-readable storage medium provided in the embodiments of the present invention are not limited to the method operations described above, and may also perform related operations in the picture classification method and the method for generating a classification recognition model of a device provided in any embodiment of the present invention. For a description of the storage medium, reference may be made to the above embodiments.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (11)

1. A picture classification method is characterized by comprising the following steps:
acquiring a picture set to be classified, wherein the picture set comprises at least two pictures;
inputting the picture set into a pre-trained current-level classification recognition model to obtain a classification score of each picture;
if the classification score of the picture meets a preset condition, determining a classification recognition result of the picture according to the classification score; if the classification score of the picture does not meet the preset condition, continuing inputting the picture into a pre-trained next-level classification recognition model until a classification recognition result of the picture is obtained; wherein each level of classification recognition model is generated based on neural network training;
the classification recognition model is generated in the following mode:
acquiring a training sample, wherein the training sample comprises a training picture and an original classification label of the training picture;
inputting the training picture and the original classification label of the training picture into a neural network model to obtain the classification score of each level of neural network layer for the training picture and the classification score and classification label of each level of full connection layer for the training picture, wherein the neural network model comprises N levels of neural network layers and N-1 levels of full connection layers, the ith-level full connection layer is located behind the (i+1)th-level neural network layer, N is greater than or equal to 3, and i belongs to [1, N-1]; wherein the classification label is used for determining a loss function and adjusting network parameters according to the loss function, and whether the corresponding training picture participates in the adjustment of the network parameters of lower-level neural network layers and full connection layers is judged according to the classification label, so that the complexity of each level of neural network layer structure obtained by training is different;
obtaining a first-stage loss function of a first-stage neural network layer according to the classification scores of the first-stage neural network layer on the training pictures and the original classification labels of the training pictures;
obtaining a P-th level loss function of a P-th level neural network layer according to the classification scores and the classification labels of the P-1-th level full connection layer on the training pictures, wherein P belongs to [2, N ];
determining a loss function of the neural network model according to the loss functions of all levels, and adjusting network parameters of all levels of neural network layers and all levels of full connection layers until the loss functions of the neural network model reach preset function values, wherein each level of neural network layer is used as a classification identification model of a corresponding level;
the classification score of each level of full-connection layer on the training picture is generated in the following mode:
obtaining the classification score of the first full-connection layer on the training picture according to the classification score of the first neural network layer on the training picture and the classification score of the second neural network layer on the training picture;
and obtaining the classification score of the P-level full-connection layer to the training picture according to the classification score of the P-1-level full-connection layer to the training picture and the classification score of the P + 1-level neural network layer to the training picture, wherein P belongs to [2, N ].
2. The method of claim 1, wherein the inputting the picture set into a pre-trained current-level classification recognition model, after obtaining the classification score of each picture, further comprises:
obtaining the classification probability of each picture according to the classification score of each picture;
the classification score of the picture meets a preset condition that the classification probability of the picture is greater than or equal to a probability threshold; and if the classification score of the picture does not meet the preset condition, the classification probability of the picture is smaller than a probability threshold.
3. A method for generating a classification recognition model is characterized by comprising the following steps:
acquiring a training sample, wherein the training sample comprises a training picture and an original classification label of the training picture;
inputting the training picture and the original classification label of the training picture into a neural network model to obtain the classification score of each level of neural network layer for the training picture and the classification score and classification label of each level of full connection layer for the training picture, wherein the neural network model comprises N levels of neural network layers and N-1 levels of full connection layers, the ith-level full connection layer is located behind the (i+1)th-level neural network layer, N is greater than or equal to 3, and i belongs to [1, N-1]; wherein the classification label is used for determining a loss function and adjusting network parameters according to the loss function, and whether the corresponding training picture participates in the adjustment of the network parameters of lower-level neural network layers and full connection layers is judged according to the classification label, so that the complexity of each level of neural network layer structure obtained by training is different;
obtaining a first-stage loss function of a first-stage neural network layer according to the classification scores of the first-stage neural network layer on the training pictures and the original classification labels of the training pictures;
obtaining a P-th level loss function of a P-th level neural network layer according to the classification scores and the classification labels of the P-1-th level full connection layer on the training pictures, wherein P belongs to [2, N ];
and determining a loss function of the neural network model according to the loss functions of all levels, and adjusting network parameters of all levels of neural network layers and all levels of fully-connected layers until the loss functions of the neural network model reach preset function values, wherein each level of neural network layer is used as a classification identification model of a corresponding level.
The classification score of each level of full-connection layer on the training picture is generated in the following mode:
obtaining the classification score of the first full-connection layer on the training picture according to the classification score of the first neural network layer on the training picture and the classification score of the second neural network layer on the training picture;
and obtaining the classification score of the P-level full-connection layer to the training picture according to the classification score of the P-1-level full-connection layer to the training picture and the classification score of the P + 1-level neural network layer to the training picture, wherein P belongs to [2, N ].
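The chained score fusion of claim 3 can be sketched as follows. This is a minimal illustration, not the patented implementation: the claim does not specify what operation the fully connected layer performs on the two incoming score vectors, so an element-wise sum is assumed here as a stand-in.

```python
# Sketch of the chained score fusion in claim 3, assuming (hypothetically)
# that each fully connected stage combines its two inputs by element-wise
# addition. N neural-network levels each produce one score vector; the N-1
# fusion stages chain them: stage 1 fuses levels 1 and 2, and stage P fuses
# stage P-1's output with level P+1's score.

def fuse_scores(level_scores):
    """level_scores: list of N per-level score vectors (lists of floats).

    Returns the N-1 fused score vectors, mirroring the claim's recursion.
    """
    assert len(level_scores) >= 3, "the claim requires N >= 3"
    fused = []
    # First fully connected stage: level-1 score combined with level-2 score.
    prev = [a + b for a, b in zip(level_scores[0], level_scores[1])]
    fused.append(prev)
    # Stage P combines stage P-1's output with the level-(P+1) score.
    for nn_score in level_scores[2:]:
        prev = [a + b for a, b in zip(prev, nn_score)]
        fused.append(prev)
    return fused

scores = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]  # N = 3 levels, 2 classes
print(fuse_scores(scores))  # [[1.5, 0.5], [1.5, 1.5]]
```

The recursion means each fusion stage sees the accumulated evidence of all shallower levels, which is what lets a deeper level specialize on the samples the shallower ones could not resolve.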
4. The method of claim 3, wherein the classification label of each level of fully connected layer on the training pictures is generated as follows:
updating the original classification labels of the training pictures according to the classification scores of the first-level neural network layer on the training pictures to obtain the classification labels of the first-level fully connected layer on the training pictures;
and updating the classification labels of the (P-1)-th level fully connected layer on the training pictures according to the classification scores of the (P-1)-th level fully connected layer on the training pictures to obtain the classification labels of the P-th level fully connected layer on the training pictures, wherein P ∈ [2, N].
5. The method of claim 4, wherein updating the original classification labels of the training pictures according to the classification scores of the first-level neural network layer on the training pictures to obtain the classification labels of the first-level fully connected layer on the training pictures comprises:
obtaining the classification probability of the first-level neural network layer for a training picture according to the classification score of the first-level neural network layer on the training picture;
if the classification probability of the first-level neural network layer for the training picture is greater than or equal to a first probability threshold, modifying the original classification label of the training picture into a preset classification label, and taking the preset classification label as the classification label of the first-level fully connected layer on the training picture;
and if the classification probability of the first-level neural network layer for the training picture is smaller than the first probability threshold, keeping the original classification label of the training picture unchanged, and taking the original classification label of the training picture as the classification label of the first-level fully connected layer on the training picture.
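The label-update rule of claims 5 and 6 can be sketched as follows. The claims only say the probability is derived from the score, so a sigmoid mapping is assumed here, and the sentinel value used as the "preset classification label" is hypothetical:

```python
import math

PRESET_LABEL = -1  # hypothetical sentinel: "already confidently classified"

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def update_label(score, label, threshold):
    """Map the score to a probability (sigmoid assumed) and apply the rule:
    a sample whose probability clears the level's threshold is re-labelled
    with the preset label, so deeper levels will skip it; otherwise the
    incoming label is kept and passed down unchanged."""
    prob = sigmoid(score)
    if prob >= threshold:
        return PRESET_LABEL
    return label

print(update_label(3.0, 1, 0.9))   # confident sample -> -1 (preset label)
print(update_label(0.1, 1, 0.9))   # uncertain sample -> 1 (label kept)
```

Applied level by level, this rule progressively filters the training set: each deeper level is trained only on the residue of hard samples left over by the levels above it.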
6. The method of claim 5, wherein updating the classification label of the (P-1)-th level fully connected layer on the training picture according to the classification score of the (P-1)-th level fully connected layer on the training picture to obtain the classification label of the P-th level fully connected layer on the training picture, with P ∈ [2, N], comprises:
obtaining the classification probability of the (P-1)-th level fully connected layer for the training picture according to the classification score of the (P-1)-th level fully connected layer on the training picture, wherein P ∈ [2, N];
if the classification probability of the (P-1)-th level fully connected layer for the training picture is greater than or equal to a P-th probability threshold, modifying the classification label of the (P-1)-th level fully connected layer on the training picture into the preset classification label, and taking the preset classification label as the classification label of the P-th level fully connected layer on the training picture;
and if the classification probability of the (P-1)-th level fully connected layer for the training picture is smaller than the P-th probability threshold, keeping the classification label of the (P-1)-th level fully connected layer on the training picture unchanged, and taking it as the classification label of the P-th level fully connected layer on the training picture.
7. The method of claim 6, wherein determining the loss function of the neural network model according to the loss functions of all levels, and adjusting the network parameters of each level of neural network layer and each level of fully connected layer until the loss function of the neural network model reaches the preset function value, whereupon each level of neural network layer serves as the classification recognition model of the corresponding level, comprises:
determining the loss function of the neural network model according to the loss function of each level;
calculating the partial derivatives of the loss function with respect to the network parameters of each level of neural network layer and each level of fully connected layer, wherein the partial derivatives contributed by training pictures carrying the preset classification label are zero;
and adjusting the network parameters of each level of neural network layer and each level of fully connected layer according to the partial derivatives, and recalculating the loss function until it reaches the preset function value, whereupon each level of neural network layer serves as the classification recognition model of the corresponding level.
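The zeroed partial derivatives of claim 7 amount to masking confidently classified samples out of the loss. A minimal sketch, using a hand-written squared-error loss for illustration (the patent does not specify the loss form); deep-learning frameworks achieve the same effect by masking per-sample losses before backpropagation:

```python
PRESET_LABEL = -1  # hypothetical sentinel: "handled by an earlier level"

def masked_loss_and_grad(preds, labels):
    """Return the total loss and the per-sample gradient d(loss)/d(pred).
    Samples carrying the preset label contribute zero loss and a zero
    partial derivative, so they do not move the parameters of this level."""
    loss, grads = 0.0, []
    for p, y in zip(preds, labels):
        if y == PRESET_LABEL:
            grads.append(0.0)          # partial derivative forced to zero
            continue
        loss += (p - y) ** 2           # squared error, assumed for the sketch
        grads.append(2.0 * (p - y))
    return loss, grads

loss, grads = masked_loss_and_grad([0.8, 0.3], [1, PRESET_LABEL])
print(round(loss, 4), [round(g, 4) for g in grads])  # 0.04 [-0.4, 0.0]
```

Because masked samples never generate gradient at deeper levels, those levels effectively train on a smaller, harder subset, which is how the levels end up with different structural complexity.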
8. An apparatus for classifying pictures, comprising:
a picture set acquisition module, configured to acquire a picture set to be classified, wherein the picture set comprises at least two pictures;
a classification result generation module, configured to input the picture set into a pre-trained current-level classification recognition model to obtain a classification score for each picture;
a classification recognition result generation module, configured to determine the classification recognition result of a picture according to its classification score if the classification score of the picture meets a preset condition, and, if the classification score of the picture does not meet the preset condition, to continue inputting the picture into a pre-trained next-level classification recognition model until the classification recognition result of the picture is obtained, wherein each level of classification recognition model is generated based on neural network training;
wherein the classification recognition model is generated as follows:
acquiring a training sample, wherein the training sample comprises training pictures and original classification labels of the training pictures;
inputting the training pictures and the original classification labels of the training pictures into a neural network model to obtain a classification score for the training pictures from each level of neural network layer, and a classification score and a classification label for the training pictures from each level of fully connected layer, wherein the neural network model comprises N levels of neural network layers and N-1 levels of fully connected layers, the i-th level fully connected layer is located after the (i+1)-th level neural network layer, N ≥ 3, and i ∈ [1, N-1]; the classification label is used for determining a loss function, adjusting network parameters according to the loss function, and judging whether the corresponding training picture participates in the adjustment of the network parameters of the lower-level neural network layers and fully connected layers, so that the neural network layers obtained by training differ in structural complexity;
obtaining a first-level loss function of the first-level neural network layer according to the classification scores of the first-level neural network layer on the training pictures and the original classification labels of the training pictures;
obtaining a P-th level loss function of the P-th level neural network layer according to the classification scores and the classification labels of the (P-1)-th level fully connected layer on the training pictures, wherein P ∈ [2, N];
determining a loss function of the neural network model according to the loss functions of all levels, and adjusting the network parameters of each level of neural network layer and each level of fully connected layer until the loss function of the neural network model reaches a preset function value, whereupon each level of neural network layer serves as the classification recognition model of the corresponding level;
wherein the classification score of each level of fully connected layer on the training pictures is generated as follows:
obtaining the classification score of the first-level fully connected layer on the training pictures according to the classification score of the first-level neural network layer on the training pictures and the classification score of the second-level neural network layer on the training pictures;
and obtaining the classification score of the P-th level fully connected layer on the training pictures according to the classification score of the (P-1)-th level fully connected layer on the training pictures and the classification score of the (P+1)-th level neural network layer on the training pictures, wherein P ∈ [2, N].
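The cascaded inference of claim 8 can be sketched as follows, assuming each per-level model returns a (label, probability) pair and that "meets the preset condition" means the probability clears that level's threshold, as in claim 2. The stand-in models and names below are purely illustrative:

```python
def classify(picture, models, thresholds):
    """Run the cascade: try each level's classification recognition model in
    order and stop at the first level whose probability clears its threshold.
    If no level is confident, the final level's answer is returned anyway."""
    for model, threshold in zip(models, thresholds):
        label, prob = model(picture)
        if prob >= threshold:
            return label
    return label  # fall back to the deepest level's result

# Hypothetical stand-in models for illustration only.
level1 = lambda pic: ("cat", 0.55)   # not confident enough at level 1
level2 = lambda pic: ("dog", 0.95)   # level 2 is confident

print(classify("pic.jpg", [level1, level2], [0.9, 0.9]))  # dog
```

The practical payoff of this arrangement is that easy pictures exit the cascade at a cheap early level, and only hard pictures pay the cost of the deeper models.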
9. An apparatus for generating a classification recognition model, comprising:
a training sample acquisition module, configured to acquire a training sample, wherein the training sample comprises training pictures and original classification labels of the training pictures;
a classification score and classification label generation module, configured to input the training pictures and the original classification labels of the training pictures into a neural network model to obtain a classification score for the training pictures from each level of neural network layer, and a classification score and a classification label for the training pictures from each level of fully connected layer, wherein the neural network model comprises N levels of neural network layers and N-1 levels of fully connected layers, the i-th level fully connected layer is located after the (i+1)-th level neural network layer, N ≥ 3, and i ∈ [1, N-1]; the classification label is used for determining a loss function, adjusting network parameters according to the loss function, and judging whether the corresponding training picture participates in the adjustment of the network parameters of the lower-level neural network layers and fully connected layers, so that the neural network layers obtained by training differ in structural complexity;
a first-level loss function generation module, configured to obtain a first-level loss function of the first-level neural network layer according to the classification scores of the first-level neural network layer on the training pictures and the original classification labels of the training pictures;
a P-th level loss function generation module, configured to obtain a P-th level loss function of the P-th level neural network layer according to the classification scores and the classification labels of the (P-1)-th level fully connected layer on the training pictures, wherein P ∈ [2, N];
a classification recognition model generation module, configured to determine a loss function of the neural network model according to the loss function of each level, and to adjust the network parameters of each level of neural network layer and each level of fully connected layer until the loss function of the neural network model reaches a preset function value, whereupon each level of neural network layer serves as the classification recognition model of the corresponding level;
wherein the classification score of each level of fully connected layer on the training pictures is generated as follows:
obtaining the classification score of the first-level fully connected layer on the training pictures according to the classification score of the first-level neural network layer on the training pictures and the classification score of the second-level neural network layer on the training pictures;
and obtaining the classification score of the P-th level fully connected layer on the training pictures according to the classification score of the (P-1)-th level fully connected layer on the training pictures and the classification score of the (P+1)-th level neural network layer on the training pictures, wherein P ∈ [2, N].
10. An apparatus, comprising:
one or more processors;
a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
11. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-7.
CN201811457125.1A 2018-11-30 2018-11-30 Method, device, equipment and medium for generating image classification and classification recognition model Active CN109583501B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811457125.1A CN109583501B (en) 2018-11-30 2018-11-30 Method, device, equipment and medium for generating image classification and classification recognition model
PCT/CN2019/120903 WO2020108474A1 (en) 2018-11-30 2019-11-26 Picture classification method, classification identification model generation method and apparatus, device, and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811457125.1A CN109583501B (en) 2018-11-30 2018-11-30 Method, device, equipment and medium for generating image classification and classification recognition model

Publications (2)

Publication Number Publication Date
CN109583501A CN109583501A (en) 2019-04-05
CN109583501B true CN109583501B (en) 2021-05-07

Family

ID=65926768

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811457125.1A Active CN109583501B (en) 2018-11-30 2018-11-30 Method, device, equipment and medium for generating image classification and classification recognition model

Country Status (2)

Country Link
CN (1) CN109583501B (en)
WO (1) WO2020108474A1 (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109583501B (en) * 2018-11-30 2021-05-07 广州市百果园信息技术有限公司 Method, device, equipment and medium for generating image classification and classification recognition model
CN110222724B (en) * 2019-05-15 2023-12-19 平安科技(深圳)有限公司 Picture instance detection method and device, computer equipment and storage medium
EP3975579A4 (en) * 2019-05-23 2022-12-14 LG Electronics Inc. Display device
CN110210356A (en) * 2019-05-24 2019-09-06 厦门美柚信息科技有限公司 A kind of picture discrimination method, apparatus and system
CN110738267B (en) * 2019-10-18 2023-08-22 北京达佳互联信息技术有限公司 Image classification method, device, electronic equipment and storage medium
CN111738290B (en) * 2020-05-14 2024-04-09 北京沃东天骏信息技术有限公司 Image detection method, model construction and training method, device, equipment and medium
CN111783861A (en) * 2020-06-22 2020-10-16 北京百度网讯科技有限公司 Data classification method, model training device and electronic equipment
CN111782905B (en) * 2020-06-29 2024-02-09 中国工商银行股份有限公司 Data packet method and device, terminal equipment and readable storage medium
CN112182269B (en) * 2020-09-27 2023-11-28 北京达佳互联信息技术有限公司 Training of image classification model, image classification method, device, equipment and medium
CN112286440A (en) * 2020-11-20 2021-01-29 北京小米移动软件有限公司 Touch operation classification method and device, model training method and device, terminal and storage medium
CN112465042B (en) * 2020-12-02 2023-10-24 中国联合网络通信集团有限公司 Method and device for generating classified network model
CN112445410B (en) * 2020-12-07 2023-04-18 北京小米移动软件有限公司 Touch event identification method and device and computer readable storage medium
CN113063843A (en) * 2021-02-22 2021-07-02 广州杰赛科技股份有限公司 Pipeline defect identification method and device and storage medium
CN113935407A (en) * 2021-09-29 2022-01-14 光大科技有限公司 Abnormal behavior recognition model determining method and device
CN113705735A (en) * 2021-10-27 2021-11-26 北京值得买科技股份有限公司 Label classification method and system based on mass information

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106096670A (en) * 2016-06-17 2016-11-09 北京市商汤科技开发有限公司 Concatenated convolutional neural metwork training and image detecting method, Apparatus and system
CN108509978A (en) * 2018-02-28 2018-09-07 中南大学 The multi-class targets detection method and model of multi-stage characteristics fusion based on CNN
CN108875456A (en) * 2017-05-12 2018-11-23 北京旷视科技有限公司 Object detection method, object detecting device and computer readable storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103679185B (en) * 2012-08-31 2017-06-16 富士通株式会社 Convolutional neural networks classifier system, its training method, sorting technique and purposes
US20170161592A1 (en) * 2015-12-04 2017-06-08 Pilot Ai Labs, Inc. System and method for object detection dataset application for deep-learning algorithm training
US10915817B2 (en) * 2017-01-23 2021-02-09 Fotonation Limited Method of training a neural network
CN107403198B (en) * 2017-07-31 2020-12-22 广州探迹科技有限公司 Official website identification method based on cascade classifier
CN109583501B (en) * 2018-11-30 2021-05-07 广州市百果园信息技术有限公司 Method, device, equipment and medium for generating image classification and classification recognition model


Also Published As

Publication number Publication date
WO2020108474A1 (en) 2020-06-04
CN109583501A (en) 2019-04-05

Similar Documents

Publication Publication Date Title
CN109583501B (en) Method, device, equipment and medium for generating image classification and classification recognition model
CN110084281B (en) Image generation method, neural network compression method, related device and equipment
CN111754596B (en) Editing model generation method, device, equipment and medium for editing face image
CN108875807B (en) Image description method based on multiple attention and multiple scales
WO2021042828A1 (en) Neural network model compression method and apparatus, and storage medium and chip
CN111476284B (en) Image recognition model training and image recognition method and device and electronic equipment
US20210003700A1 (en) Method and apparatus for enhancing semantic features of sar image oriented small set of samples
Zhou et al. Dense teacher: Dense pseudo-labels for semi-supervised object detection
CN112734775B (en) Image labeling, image semantic segmentation and model training methods and devices
CN111741330B (en) Video content evaluation method and device, storage medium and computer equipment
CN111061843A (en) Knowledge graph guided false news detection method
CN111582397B (en) CNN-RNN image emotion analysis method based on attention mechanism
WO2021042857A1 (en) Processing method and processing apparatus for image segmentation model
CN113469088A (en) SAR image ship target detection method and system in passive interference scene
CN112580720A (en) Model training method and device
CN111460157A (en) Cyclic convolution multitask learning method for multi-field text classification
CN114842267A (en) Image classification method and system based on label noise domain self-adaption
CN114842343A (en) ViT-based aerial image identification method
CN112749737A (en) Image classification method and device, electronic equipment and storage medium
CN113627550A (en) Image-text emotion analysis method based on multi-mode fusion
CN111445545B (en) Text transfer mapping method and device, storage medium and electronic equipment
CN112801107A (en) Image segmentation method and electronic equipment
CN111783688A (en) Remote sensing image scene classification method based on convolutional neural network
CN114723989A (en) Multitask learning method and device and electronic equipment
CN115188022A (en) Human behavior identification method based on consistency semi-supervised deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20211125

Address after: 31a, 15 / F, building 30, maple mall, bangrang Road, Brazil, Singapore

Patentee after: Baiguoyuan Technology (Singapore) Co.,Ltd.

Address before: 511442 23-39 / F, building B-1, Wanda Plaza North, Wanbo business district, 79 Wanbo 2nd Road, Nancun Town, Panyu District, Guangzhou City, Guangdong Province

Patentee before: GUANGZHOU BAIGUOYUAN INFORMATION TECHNOLOGY Co.,Ltd.
