CN114821207B - Image classification method and device, storage medium and terminal

Image classification method and device, storage medium and terminal

Info

Publication number
CN114821207B
CN114821207B (granted from application CN202210758519.0A)
Authority
CN
China
Prior art keywords
image
image classification
training
model
category
Prior art date
Legal status
Active
Application number
CN202210758519.0A
Other languages
Chinese (zh)
Other versions
CN114821207A (en)
Inventor
王国龙
廖丹萍
戚晓东
施钢杰
Current Assignee
Zhejiang Fenghuang Yunrui Technology Co ltd
Advanced Institute of Information Technology AIIT of Peking University
Hangzhou Weiming Information Technology Co Ltd
Original Assignee
Zhejiang Fenghuang Yunrui Technology Co ltd
Advanced Institute of Information Technology AIIT of Peking University
Hangzhou Weiming Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Fenghuang Yunrui Technology Co ltd, Advanced Institute of Information Technology AIIT of Peking University, and Hangzhou Weiming Information Technology Co Ltd
Priority to CN202210758519.0A
Publication of CN114821207A
Application granted
Publication of CN114821207B
Legal status: Active

Classifications

    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06V10/765 Arrangements for image or video recognition or understanding using classification, using rules for classification or partitioning the feature space
    • G06F18/2431 Pattern recognition; classification techniques relating to the number of classes; multiple classes
    • G06N3/045 Computing arrangements based on biological models; neural networks; combinations of networks
    • G06N3/08 Computing arrangements based on biological models; neural networks; learning methods
    • G06V10/774 Arrangements for image or video recognition or understanding using pattern recognition or machine learning; generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using neural networks


Abstract

The invention discloses an image classification method, an image classification device, a storage medium and a terminal, wherein the method comprises the following steps: acquiring a target image to be classified; inputting the target image to be classified into a pre-trained image classification model, and outputting a plurality of category scores corresponding to the target image to be classified, wherein the loss function of the pre-trained image classification model comprises a cross entropy loss function of image classification and a modular length constraint loss; and determining the category corresponding to the maximum score in the plurality of category scores as the final category of the target image to be classified. According to the method, a corresponding modular length constraint loss is added to the loss function of the model, so that the model can be quickly corrected from an incorrect classification category to the correct classification category in the training stage, and after the model classifies to the correct category, the classification confidence value gap between the correct category and the incorrect categories is further enlarged, thereby improving the accuracy and the generalization performance of the model.

Description

Image classification method and device, storage medium and terminal
Technical Field
The invention relates to the technical field of computer vision, in particular to an image classification method, an image classification device, a storage medium and a terminal.
Background
Image classification is an important research direction for computer vision and digital image processing, and is widely applied to various fields of robot navigation, intelligent video monitoring, industrial detection, aerospace and the like. Image classification is an algorithm that classifies images into different categories based on different features in the image information. The input of the image classification algorithm is one image, and the output is a certain category in the category set. At present, image classification algorithms based on deep neural networks are widely concerned and researched.
In the prior art, image classification algorithms based on deep neural networks generally use a cross entropy loss function to guide the training of the model in the training stage. However, since the cross entropy loss function does not treat correctly and incorrectly classified samples differently during training, it does not explicitly enlarge the gap between the confidence value of the correct class and the confidence values of the wrong classes, which reduces the accuracy and generalization performance of the model.
Disclosure of Invention
The embodiment of the application provides an image classification method, an image classification device, a storage medium and a terminal. The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview and is intended to neither identify key/critical elements nor delineate the scope of such embodiments. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.
In a first aspect, an embodiment of the present application provides an image classification method, where the method includes:
acquiring a target image to be classified;
inputting a target image to be classified into a pre-trained image classification model, and outputting a plurality of category scores corresponding to the target image to be classified; the loss function of the pre-trained image classification model comprises a cross entropy loss function and a modular length constraint loss of image classification;
and determining the category corresponding to the maximum score in the plurality of category scores as the final category of the target image to be classified.
Optionally, before the target image to be classified is acquired, the method further includes:
constructing an image classification training data set;
establishing and initializing an image classification network to obtain an initial image classification model;
constructing a modular length constraint loss aiming at the length of the feature vector, and constructing a target loss function of the image classification model based on the modular length constraint loss;
generating a pre-trained image classification model according to the image classification training data set, the initial image classification model and the target loss function; wherein,
the target loss function expression is:
L_i = -\log \frac{e^{W_{c_i}^{\mathrm{T}} f_i}}{\sum_{k=1}^{K} e^{W_k^{\mathrm{T}} f_i}} + y_i \lambda_1 \lVert f_i \rVert + (1 - y_i) \frac{\lambda_2}{\lVert f_i \rVert + \varepsilon}

wherein i denotes that the input image is the ith image in the data set; the first term is the cross entropy loss function of image classification, in which e is the natural constant, W is the fully connected layer parameter, T is the transposition operation, c_i is the real category of the ith image and K is the number of categories; y_i λ_1 ||f_i|| is the modular length constraint loss on the length of the feature vector when the target classification is wrong, in which f_i is the feature extracted by the model for the input image, y_i is the category indication value (y_i = 1 when the target classification is wrong, and y_i = 0 otherwise) and λ_1 is the weight of the constraint; (1 − y_i) λ_2 / (||f_i|| + ε) is the modular length constraint loss on the length of the feature vector when the target classification is correct, in which λ_2 is the weight of the constraint and ε, set to ε = 10^-7, prevents the denominator from being 0.
Optionally, constructing an image classification training data set includes:
collecting a sample image;
and marking a real class label for each acquired sample image to generate an image classification training data set.
Optionally, creating and initializing an image classification network to obtain an initial image classification model, including:
constructing an image classification network by adopting a neural network;
and initializing the image classification network by adopting a random initialization function or utilizing a pre-training model to obtain an initial image classification model.
Optionally, generating a pre-trained image classification model according to the image classification training data set, the initial image classification model and the target loss function includes:
acquiring an nth training image from an image classification training data set, inputting the nth training image into an image classification model, and obtaining a target feature corresponding to the training image and a confidence coefficient of each category; n is the traversal times;
determining the class corresponding to the maximum confidence coefficient in the confidence coefficients of each class as the output class of the training image;
determining a class indication value of the training image according to the output class of the training image;
calculating a model loss value by combining a target loss function according to the target characteristics corresponding to the training images and the class indication values of the training images;
and generating a pre-trained image classification model according to the model loss value.
Optionally, determining a class indication value of the training image according to the output class of the training image includes:
judging whether the output category of the training image is consistent with the real category of the training image;
if they are consistent, determining 0 as the class indication value of the training image;
or,
if they are not consistent, determining 1 as the class indication value of the training image.
Optionally, generating a pre-trained image classification model according to the model loss value includes:
when the model loss value is lower than a preset threshold value or the number of training iterations is higher than a preset upper limit value, generating a trained image classification model;
or,
when the model loss value is higher than the preset threshold value or the number of training iterations does not reach the preset upper limit value, performing iterative optimization on the image classification model by adopting a stochastic gradient descent algorithm, and continuing to execute the step of acquiring the (n + 1)th training image from the image classification training data set and inputting it into the image classification model; when n + 1 is larger than the number of samples in the image classification training data set, the order of the images in the image classification training data set is randomly re-arranged and n is reset to 1; wherein n is the traversal number.
In a second aspect, an embodiment of the present application provides an image classification apparatus, including:
the image acquisition module is used for acquiring a target image to be classified;
the class score output module is used for inputting the target image to be classified into a pre-trained image classification model and outputting a plurality of class scores corresponding to the target image to be classified; the loss function of the pre-trained image classification model comprises a cross entropy loss function and a modular length constraint loss of image classification;
and the final category determining module is used for determining the category corresponding to the maximum score in the plurality of category scores as the final category of the target image to be classified.
In a third aspect, embodiments of the present application provide a computer storage medium having stored thereon a plurality of instructions adapted to be loaded by a processor and to perform the above-mentioned method steps.
In a fourth aspect, an embodiment of the present application provides a terminal, which may include: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the above-mentioned method steps.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
in the embodiment of the application, an image classification device firstly acquires a target image to be classified, then inputs the target image to be classified into a pre-trained image classification model, outputs a plurality of class scores corresponding to the target image to be classified, a loss function of the pre-trained image classification model comprises a cross entropy loss function and a modular length constraint loss of image classification, and finally determines the class corresponding to the maximum score in the plurality of class scores as the final class of the target image to be classified. According to the method, corresponding modular length constraint loss is added into a loss function of the model, so that the model can be quickly corrected from an incorrect classification category to a correct classification category in a training stage through the loss, and after the model is classified to the correct category, the classification confidence value difference between the correct category and the incorrect category is further enlarged, and therefore the accuracy and the generalization performance of the model are improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
Fig. 1 is a schematic flowchart of an image classification method provided in an embodiment of the present application;
FIG. 2 is a flowchart illustrating an image classification model training method according to an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of an image classification apparatus according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
The following description and the drawings sufficiently illustrate specific embodiments of the invention to enable those skilled in the art to practice them.
It should be understood that the described embodiments are only some embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
In the description of the present invention, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. The specific meanings of the above terms in the present invention can be understood by those of ordinary skill in the art according to the specific situation. In addition, in the description of the present invention, "a plurality" means two or more unless otherwise specified. "And/or" describes the association relationship of the associated objects, meaning that there may be three relationships; e.g., A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
The application provides an image classification method, an image classification device, a storage medium and a terminal, which are used for solving the problems in the related art. In the technical scheme provided by the application, since a corresponding modular length constraint loss is added to the loss function of the model, this loss enables the model to be quickly corrected from the wrong classification category to the correct classification category in the training stage, and after the model is classified to the correct category, the classification confidence value difference between the correct category and the wrong category is further enlarged, so that the accuracy and generalization performance of the model are improved. The scheme is described in detail below by means of an exemplary embodiment.
The image classification method provided by the embodiment of the present application will be described in detail below with reference to fig. 1 to fig. 2. The method may be implemented in dependence on a computer program, executable on an image classification apparatus based on the von neumann architecture. The computer program may be integrated into the application or may run as a separate tool-like application.
Referring to fig. 1, a flowchart of an image classification method is provided in an embodiment of the present application. As shown in fig. 1, the method of the embodiment of the present application may include the steps of:
s101, acquiring a target image to be classified;
the images to be classified are images used for testing the performance of the pre-trained image classification model, or images received when the pre-trained image classification model is applied to a classification scene.
Generally, when the target image to be classified is an image for testing the performance of a pre-trained image classification model, the target image to be classified may be obtained from a test sample, may also be an image obtained from a user terminal, and may also be an image downloaded from a cloud. When the target image to be classified is an image obtained by applying a pre-trained image classification model to a classification application scene, the image to be classified may be an image acquired in real time by an image acquisition device.
In a possible implementation manner, after the training of the image classification model is completed and the trained image classification model is deployed in an actual application scene, when an object sensor or an object monitoring algorithm detects that an object enters the monitoring area of a camera, the photographing function of the image acquisition device is triggered to capture an image of the target entering the monitoring area, and the captured image is determined as the target image to be classified.
In another possible implementation manner, after the image classification model is trained, when the classification performance of the trained image classification model needs to be detected, a user downloads any image carrying an object from a sample test set, a local gallery or a cloud through a user terminal, and determines the image as a target image to be classified.
S102, inputting a target image to be classified into a pre-trained image classification model, and outputting a plurality of category scores corresponding to the target image to be classified;
the loss function of the pre-trained image classification model comprises a cross entropy loss function and a modular length constraint loss of image classification;
in the embodiment of the application, when a pre-trained image classification model is generated, an image classification training data set is firstly constructed, then an image classification network is created and initialized to obtain an initial image classification model, then a model length constraint loss aiming at the length of a feature vector is constructed, a target loss function of the image classification model is constructed based on the model length constraint loss, and finally the pre-trained image classification model is generated according to the image classification training data set, the initial image classification model and the target loss function. Wherein, the target loss function expression is:
L_i = -\log \frac{e^{W_{c_i}^{\mathrm{T}} f_i}}{\sum_{k=1}^{K} e^{W_k^{\mathrm{T}} f_i}} + y_i \lambda_1 \lVert f_i \rVert + (1 - y_i) \frac{\lambda_2}{\lVert f_i \rVert + \varepsilon}

wherein i denotes that the input image is the ith image in the data set; the first term is the cross entropy loss function of image classification, in which e is the natural constant, W is the fully connected layer parameter, T is the transposition operation, c_i is the real category of the ith image and K is the number of categories; y_i λ_1 ||f_i|| is the modular length constraint loss on the length of the feature vector when the target classification is wrong, in which f_i is the feature extracted by the model for the input image, y_i is the category indication value (y_i = 1 when the target classification is wrong, and y_i = 0 otherwise) and λ_1 is the weight of the constraint; (1 − y_i) λ_2 / (||f_i|| + ε) is the modular length constraint loss on the length of the feature vector when the target classification is correct, in which λ_2 is the weight of the constraint and ε, set to ε = 10^-7, prevents the denominator from being 0.
In a possible implementation manner, after the target image to be classified is acquired according to step S101, the target image is input into the pre-trained image classification model, and the model processes it according to the trained model parameters; after the processing is completed, a class score of the target image for each of a plurality of preset classes is obtained, and the pre-trained image classification model outputs these class scores, so that a plurality of class scores of the target image are obtained.
S103, determining the category corresponding to the maximum score in the plurality of category scores as the final category of the target image to be classified.
In a possible implementation manner, after obtaining a plurality of category scores based on step S102, the plurality of category scores may be sorted in a descending order, and a category corresponding to the largest score sorted at the first place is determined as a final category of the target image to be classified.
For example, the target image to be classified is input to the trained model, the category score of the image for each category is obtained, the maximum value s_k = max_j s_j is taken from these category scores, and the corresponding category k is used as the category of the target image to be classified.
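For illustration only, this inference step can be sketched in PyTorch as follows (the framework, the classify helper and the assumption that the model's forward pass returns both the extracted feature and the category scores are choices of this sketch rather than requirements of the embodiment):

import torch

@torch.no_grad()
def classify(model, image_tensor):
    """image_tensor: a (3, H, W) target image to be classified.
    Assumes model(x) returns (features, class_scores)."""
    model.eval()
    _, scores = model(image_tensor.unsqueeze(0))   # scores over the K categories
    probs = torch.softmax(scores, dim=1)           # normalized confidences
    k = int(probs.argmax(dim=1))                   # category with the maximum score
    return k, float(probs[0, k])                   # final category and its confidence

Since softmax is monotonic, taking the argmax of the raw category scores selects the same final category as taking the argmax of the normalized confidences.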
In the embodiment of the application, an image classification device firstly acquires a target image to be classified, then inputs the target image to be classified into a pre-trained image classification model, outputs a plurality of class scores corresponding to the target image to be classified, a loss function of the pre-trained image classification model comprises a cross entropy loss function and a modular length constraint loss of image classification, and finally determines the class corresponding to the maximum score in the plurality of class scores as the final class of the target image to be classified. According to the method, corresponding modular length constraint loss is added into a loss function of the model, so that the model can be quickly corrected from an incorrect classification category to a correct classification category in a training stage through the loss, and after the model is classified to the correct category, the classification confidence value difference between the correct category and the incorrect category is further enlarged, and therefore the accuracy and the generalization performance of the model are improved.
Referring to fig. 2, a flowchart of an image classification model training method is provided in the present embodiment. As shown in fig. 2, the method of the embodiment of the present application may include the following steps:
s201, constructing an image classification training data set;
in the embodiment of the application, when an image classification training data set is constructed, sample images are collected in an image library, and then a real class label is marked on each collected sample image to generate the image classification training data set.
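As a possible implementation sketch only (the per-category folder layout, the ImageClassificationDataset name and the 224×224 resize below are assumptions of this example, not requirements of the embodiment), such a training data set with real class labels can be wrapped as follows:

import os
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class ImageClassificationDataset(Dataset):
    """Sample images collected into one folder per category; each image is
    labeled with the index of its real (ground-truth) category."""
    def __init__(self, root, image_size=224):
        self.classes = sorted(d for d in os.listdir(root)
                              if os.path.isdir(os.path.join(root, d)))
        self.samples = []                                  # (path, real class label)
        for label, cls in enumerate(self.classes):
            cls_dir = os.path.join(root, cls)
            for name in sorted(os.listdir(cls_dir)):
                self.samples.append((os.path.join(cls_dir, name), label))
        self.transform = transforms.Compose([
            transforms.Resize((image_size, image_size)),
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        path, label = self.samples[idx]
        image = Image.open(path).convert("RGB")
        return self.transform(image), label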
S202, establishing and initializing an image classification network to obtain an initial image classification model;
in the embodiment of the application, when the image classification network is created and initialized to obtain the initial image classification model, the image classification network is firstly constructed by adopting the neural network, and then the image classification network is initialized by adopting a random initialization function or utilizing a pre-training model to obtain the initial image classification model.
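A minimal sketch of this step, assuming PyTorch and a torchvision ResNet-18 backbone (torchvision ≥ 0.13 for the weights argument); the embodiment only requires a neural network initialized with a random initialization function or from a pre-training model, so the backbone choice and the wrapper below are illustrative:

import torch
import torch.nn as nn
from torchvision import models

class ImageClassificationNetwork(nn.Module):
    """Backbone feature extractor plus a fully connected layer W; the forward
    pass returns both the extracted feature f_i and the per-category scores."""
    def __init__(self, num_classes, use_pretrained=True):
        super().__init__()
        # Initialize from a pre-training model, or randomly when use_pretrained is False.
        backbone = models.resnet18(
            weights=models.ResNet18_Weights.DEFAULT if use_pretrained else None)
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # drop the original fc
        self.fc = nn.Linear(backbone.fc.in_features, num_classes)       # randomly initialized W

    def forward(self, x):
        f = torch.flatten(self.features(x), 1)   # extracted feature f_i
        return f, self.fc(f)                     # feature and category scores

Returning both the feature and the scores from the forward pass makes the modular length constraint of the following steps straightforward to compute during training.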
S203, constructing a modular length constraint loss aiming at the length of the feature vector, and constructing a target loss function of the image classification model based on the modular length constraint loss;
assume that the images in the image classification training dataset share class K. In the training process, the ith training image is input to a neural network for feature extraction to obtain extracted features
Figure 240724DEST_PATH_IMAGE018
And calculating the extracted features by using a full-connected layer W and a softmax layer to obtain fractional vectors of the current image corresponding to the K categories. Using vectors
Figure 141684DEST_PATH_IMAGE019
To represent the category score vector. Wherein the content of the first and second substances,
Figure 330482DEST_PATH_IMAGE020
representing pairs of input imagesCorresponding to the score of the kth category. Then
Figure 861958DEST_PATH_IMAGE021
Assume that the true label of the input image is the first
Figure 267531DEST_PATH_IMAGE022
And (4) class. When the image classification is correct, the image belongs to the correct category
Figure 175445DEST_PATH_IMAGE022
The category score difference absolute value corresponding to any other category r is as follows:
Figure 186126DEST_PATH_IMAGE023
when the image classification is wrong, the image belongs to the class r and the correct class on the assumption that the image is wrongly classified into the r-th class
Figure 950820DEST_PATH_IMAGE022
The absolute value of the corresponding category score difference is as follows:
Figure 781372DEST_PATH_IMAGE024
in the embodiment of the application, when the image classification is wrong, the correct classification
Figure 24135DEST_PATH_IMAGE025
The smaller the class score difference corresponding to the error class r, the better. When the image classification is correct, the correct classification
Figure 889323DEST_PATH_IMAGE025
The larger the class score difference corresponding to the error class r, the better. It can be shown that when the input image is classified incorrectly, the features
Figure 824918DEST_PATH_IMAGE026
Smaller modulus length of (A), correct class
Figure 142767DEST_PATH_IMAGE025
The smaller the class score difference corresponding to the error class r. When the input image is classified correctly, the features
Figure 189220DEST_PATH_IMAGE026
The larger the modular length of (A), the correct class
Figure 174493DEST_PATH_IMAGE025
The larger the class score difference corresponding to the error class r.
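The dependence on the modular length can be made explicit with a short softmax argument (a reasoning sketch added here for clarity; it is not reproduced from the patent's original equation images):

z_k = W_k^{\mathrm{T}} f_i = \lVert f_i \rVert \, W_k^{\mathrm{T}} u, \qquad u = \frac{f_i}{\lVert f_i \rVert}, \qquad s_k = \frac{e^{z_k}}{\sum_{j=1}^{K} e^{z_j}}
% The logit gap between the correct category c and any other category r scales with the modular length:
z_c - z_r = \lVert f_i \rVert \, (W_c - W_r)^{\mathrm{T}} u
% As \lVert f_i \rVert \to 0, every s_k \to 1/K and |s_c - s_r| \to 0; increasing \lVert f_i \rVert
% amplifies the same angular separation and widens |s_c - s_r|.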
Therefore, the application proposes to add a constraint on the modular length to the loss function according to whether the training sample is classified correctly, so that when the sample is classified incorrectly, the modular length of the feature is updated towards decreasing, thereby reducing the score difference between the correct class and the wrong class, and when the sample is classified correctly, the modular length of the feature is updated towards increasing, thereby further enlarging the score difference between the correct class and the wrong class. Based on this, a classification loss function L based on a feature modular length constraint is proposed, which consists of the softmax cross entropy loss function for classification and a feature modular length constraint:
the target loss function expression is:
L_i = -\log \frac{e^{W_{c_i}^{\mathrm{T}} f_i}}{\sum_{k=1}^{K} e^{W_k^{\mathrm{T}} f_i}} + y_i \lambda_1 \lVert f_i \rVert + (1 - y_i) \frac{\lambda_2}{\lVert f_i \rVert + \varepsilon}

wherein i denotes that the input image is the ith image in the data set; the first term is the cross entropy loss function of image classification, in which e is the natural constant, W is the fully connected layer parameter, T is the transposition operation, c_i is the real category of the ith image and K is the number of categories; y_i λ_1 ||f_i|| is the modular length constraint loss on the length of the feature vector when the target classification is wrong, in which f_i is the feature extracted by the model for the input image, y_i is the category indication value (y_i = 1 when the target classification is wrong, and y_i = 0 otherwise) and λ_1 is the weight of the constraint; (1 − y_i) λ_2 / (||f_i|| + ε) is the modular length constraint loss on the length of the feature vector when the target classification is correct, in which λ_2 is the weight of the constraint and ε, set to ε = 10^-7, prevents the denominator from being 0.
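For illustration, a PyTorch sketch of a loss with this shape is given below. It follows the reconstruction above, which is inferred from the patent's textual definitions (the original equation images are not reproduced here); the helper name, the batch handling and the default weights λ1 = λ2 = 0.1 are assumptions of the sketch:

import torch
import torch.nn.functional as F

def modular_length_constrained_loss(logits, features, targets,
                                    lambda_wrong=0.1, lambda_right=0.1, eps=1e-7):
    """Softmax cross entropy plus a constraint on the modular (L2) length of the features.

    logits:   (B, K) category scores produced by the fully connected layer W
    features: (B, D) features f_i extracted by the backbone
    targets:  (B,)   real category labels
    """
    ce = F.cross_entropy(logits, targets, reduction="none")    # cross entropy loss per sample
    y = (logits.argmax(dim=1) != targets).float()              # y_i = 1 if classified incorrectly, else 0
    feat_norm = features.norm(dim=1)                           # modular length ||f_i||
    loss_wrong = lambda_wrong * y * feat_norm                  # pull ||f_i|| down for wrong samples
    loss_right = lambda_right * (1.0 - y) / (feat_norm + eps)  # push ||f_i|| up for correct samples
    return (ce + loss_wrong + loss_right).mean()

Using the batch argmax as the indication value y_i mirrors step S204 below, where the output category is compared with the real category before the model loss value is calculated.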
And S204, generating a pre-trained image classification model according to the image classification training data set, the initial image classification model and the target loss function.
In the embodiment of the application, when a pre-trained image classification model is generated according to an image classification training data set, an initial image classification model and a target loss function, firstly, an nth training image is acquired from the image classification training data set and input into the image classification model to obtain a target feature corresponding to the training image and a confidence coefficient of each category; then, the category corresponding to the maximum confidence coefficient among the category confidence coefficients is determined as the output category of the training image; next, a category indication value of the training image is determined according to the output category of the training image; then, a model loss value is calculated according to the target feature corresponding to the training image and the category indication value of the training image in combination with the target loss function; and finally, the pre-trained image classification model is generated according to the model loss value, wherein n is the traversal number.
Specifically, when the class indication value of a training image is determined according to the output class of the training image, it is firstly judged whether the output class of the training image is consistent with the real class of the training image; if they are consistent, 0 is determined as the class indication value of the training image; or, if they are not consistent, 1 is determined as the class indication value of the training image.
Specifically, when a pre-trained image classification model is generated according to the model loss value, a trained image classification model is generated when the model loss value is lower than a preset threshold value or the number of training iterations is higher than a preset upper limit value; or, when the model loss value is higher than the preset threshold value or the number of training iterations does not reach the preset upper limit value, iterative optimization is carried out on the image classification model by adopting a stochastic gradient descent algorithm, and the step of acquiring the (n + 1)th training image from the image classification training data set and inputting it into the image classification model is continuously executed; when n + 1 is larger than the number of samples in the image classification training data set, the order of the images in the image classification training data set is randomly re-arranged and n is reset to 1, wherein n is the traversal number.
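A condensed training-loop sketch consistent with the above (stochastic gradient descent, a loss-threshold / iteration-upper-limit stopping rule, and random re-arrangement of the image order after each full pass; the batch size, learning rate, threshold and the assumption that the model returns both features and scores are choices of this sketch):

import torch
from torch.utils.data import DataLoader

def train(model, dataset, loss_fn, loss_threshold=0.05,
          max_iterations=100_000, lr=0.01, device="cpu"):
    """loss_fn is, e.g., the modular_length_constrained_loss sketched above;
    model(images) is assumed to return (features, scores)."""
    model.to(device).train()
    loader = DataLoader(dataset, batch_size=32, shuffle=True)  # re-arranges the order each pass
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    iteration = 0
    while iteration < max_iterations:
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            features, scores = model(images)           # f_i and the per-category confidences
            loss = loss_fn(scores, features, labels)   # model loss value
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            iteration += 1
            if loss.item() < loss_threshold or iteration >= max_iterations:
                return model                           # stopping criterion reached
    return model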
In the embodiment of the application, an image classification device firstly acquires a target image to be classified, then inputs the target image to be classified into a pre-trained image classification model, outputs a plurality of class scores corresponding to the target image to be classified, a loss function of the pre-trained image classification model comprises a cross entropy loss function and a modular length constraint loss of image classification, and finally determines the class corresponding to the maximum score in the plurality of class scores as the final class of the target image to be classified. According to the method, corresponding modular length constraint loss is added into a loss function of the model, so that the model can be quickly corrected from an incorrect classification category to a correct classification category in a training stage, and after the model is classified to the correct category, the classification confidence value difference between the correct category and the incorrect category is further enlarged, and the accuracy and the generalization performance of the model are improved.
The following are embodiments of the apparatus of the present invention that may be used to perform embodiments of the method of the present invention. For details which are not disclosed in the embodiments of the apparatus of the present invention, reference is made to the embodiments of the method of the present invention.
Referring to fig. 3, a schematic structural diagram of an image classification apparatus according to an exemplary embodiment of the present invention is shown. The image classification device can be realized by software, hardware or a combination of the two to form all or part of the terminal. The apparatus 1 includes an image acquisition module 10, a category score output module 20, and a final category determination module 30.
The image acquisition module 10 is used for acquiring a target image to be classified;
the category score output module 20 is configured to input the target image to be classified into a pre-trained image classification model, and output a plurality of category scores corresponding to the target image to be classified; the loss function of the pre-trained image classification model comprises a cross entropy loss function and a model length constraint loss of image classification;
and a final category determining module 30, configured to determine a category corresponding to the largest score in the multiple category scores as a final category of the target image to be classified.
It should be noted that, when the image classification apparatus provided in the foregoing embodiment executes the image classification method, only the division of the functional modules is illustrated, and in practical applications, the above functions may be distributed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the image classification device and the image classification method provided by the above embodiments belong to the same concept, and details of implementation processes thereof are referred to in the method embodiments and are not described herein again.
The above-mentioned serial numbers of the embodiments of the present application are merely for description, and do not represent the advantages and disadvantages of the embodiments.
In the embodiment of the application, an image classification device firstly acquires a target image to be classified, then inputs the target image to be classified into a pre-trained image classification model, outputs a plurality of class scores corresponding to the target image to be classified, a loss function of the pre-trained image classification model comprises a cross entropy loss function and a modular length constraint loss of image classification, and finally determines the class corresponding to the maximum score in the plurality of class scores as the final class of the target image to be classified. According to the method, corresponding modular length constraint loss is added into a loss function of the model, so that the model can be quickly corrected from an incorrect classification category to a correct classification category in a training stage through the loss, and after the model is classified to the correct category, the classification confidence value difference between the correct category and the incorrect category is further enlarged, and therefore the accuracy and the generalization performance of the model are improved.
The present invention also provides a computer readable medium having stored thereon program instructions which, when executed by a processor, implement the image classification method provided by the various method embodiments described above.
The present invention also provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the image classification method of the above-described respective method embodiments.
Please refer to fig. 4, which provides a schematic structural diagram of a terminal according to an embodiment of the present application. As shown in fig. 4, terminal 1000 can include: at least one processor 1001, at least one network interface 1004, a user interface 1003, memory 1005, at least one communication bus 1002.
The communication bus 1002 is used to implement connection communication among these components.
The user interface 1003 may include a Display screen (Display) and a Camera (Camera), and the optional user interface 1003 may also include a standard wired interface and a wireless interface.
The network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), among others.
Processor 1001 may include one or more processing cores, among other things. The processor 1001 connects various parts throughout the electronic device 1000 using various interfaces and lines to perform various functions of the electronic device 1000 and to process data by executing or performing instructions, programs, code sets, or instruction sets stored in the memory 1005 and invoking data stored in the memory 1005. Alternatively, the processor 1001 may be implemented in at least one hardware form of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 1001 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly processes an operating system, a user interface, an application program and the like; the GPU is used for rendering and drawing the content required to be displayed by the display screen; the modem is used to handle wireless communications. It is understood that the modem may not be integrated into the processor 1001, but may be implemented by a single chip.
The Memory 1005 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). Optionally, the memory 1005 includes a non-transitory computer-readable medium. The memory 1005 may be used to store instructions, programs, code, sets of codes, or sets of instructions. The memory 1005 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the various method embodiments described above, and the like; the storage data area may store data and the like referred to in the above respective method embodiments. The memory 1005 may optionally be at least one memory device located remotely from the processor 1001. As shown in fig. 4, the memory 1005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and an image classification application program.
In the terminal 1000 shown in fig. 4, the user interface 1003 is mainly used as an interface for providing input for a user, and acquiring data input by the user; and the processor 1001 may be configured to invoke the image classification application stored in the memory 1005 and specifically perform the following operations:
acquiring a target image to be classified;
inputting a target image to be classified into a pre-trained image classification model, and outputting a plurality of category scores corresponding to the target image to be classified; the loss function of the pre-trained image classification model comprises a cross entropy loss function and a modular length constraint loss of image classification;
and determining the category corresponding to the maximum score in the plurality of category scores as the final category of the target image to be classified.
In one embodiment, the processor 1001, before performing the acquiring of the target image to be classified, further performs the following operations:
constructing an image classification training data set;
establishing and initializing an image classification network to obtain an initial image classification model;
constructing a modular length constraint loss aiming at the length of the feature vector, and constructing a target loss function of the image classification model based on the modular length constraint loss;
generating a pre-trained image classification model according to the image classification training data set, the initial image classification model and the target loss function; wherein,
the target loss function expression is:
L_i = -\log \frac{e^{W_{c_i}^{\mathrm{T}} f_i}}{\sum_{k=1}^{K} e^{W_k^{\mathrm{T}} f_i}} + y_i \lambda_1 \lVert f_i \rVert + (1 - y_i) \frac{\lambda_2}{\lVert f_i \rVert + \varepsilon}

wherein i denotes that the input image is the ith image in the data set; the first term is the cross entropy loss function of image classification, in which e is the natural constant, W is the fully connected layer parameter, T is the transposition operation, c_i is the real category of the ith image and K is the number of categories; y_i λ_1 ||f_i|| is the modular length constraint loss on the length of the feature vector when the target classification is wrong, in which f_i is the feature extracted by the model for the input image, y_i is the category indication value (y_i = 1 when the target classification is wrong, and y_i = 0 otherwise) and λ_1 is the weight of the constraint; (1 − y_i) λ_2 / (||f_i|| + ε) is the modular length constraint loss on the length of the feature vector when the target classification is correct, in which λ_2 is the weight of the constraint and ε, set to ε = 10^-7, prevents the denominator from being 0.
In one embodiment, the processor 1001 performs the following operations when performing the construction of the image classification training data set:
collecting a sample image;
and marking each collected sample image with a real class label to generate an image classification training data set.
In one embodiment, the processor 1001 specifically performs the following operations when creating and initializing an image classification network to obtain an initial image classification model:
adopting a neural network to construct an image classification network;
and initializing the image classification network by adopting a random initialization function or utilizing a pre-training model to obtain an initial image classification model.
In one embodiment, the processor 1001, when executing the generation of the pre-trained image classification model from the image classification training dataset, the initial image classification model and the target loss function, specifically performs the following operations:
acquiring an nth training image from an image classification training data set, inputting the nth training image into an image classification model, and obtaining a target feature corresponding to the training image and a confidence coefficient of each category; n is the traversal number;
determining the class corresponding to the maximum confidence coefficient in the confidence coefficients of each class as the output class of the training image;
determining a class indication value of the training image according to the output class of the training image;
calculating a model loss value by combining a target loss function according to the target characteristics corresponding to the training images and the class indication values of the training images;
and generating a pre-trained image classification model according to the model loss value.
In one embodiment, the processor 1001 specifically performs the following operations when determining the class indication value of the training image according to the output class of the training image:
judging whether the output category of the training image is consistent with the real category of the training image;
if they are consistent, determining 0 as the class indication value of the training image;
or,
if they are not consistent, determining 1 as the class indication value of the training image.
In one embodiment, the processor 1001, when executing the generation of the pre-trained image classification model according to the model loss value, specifically performs the following operations:
when the model loss value is lower than a preset threshold value or the number of training iterations is higher than a preset upper limit value, generating a trained image classification model;
or,
when the model loss value is higher than the preset threshold value or the number of training iterations does not reach the preset upper limit value, performing iterative optimization on the image classification model by adopting a stochastic gradient descent algorithm, and continuing to execute the step of acquiring the (n + 1)th training image from the image classification training data set and inputting it into the image classification model; when n + 1 is larger than the number of samples in the image classification training data set, the order of the images in the image classification training data set is randomly re-arranged and n is reset to 1; wherein n is the traversal number.
In the embodiment of the application, an image classification device firstly obtains a target image to be classified, then inputs the target image to be classified into a pre-trained image classification model, and outputs a plurality of class scores corresponding to the target image to be classified, a loss function of the pre-trained image classification model comprises a cross entropy loss function and a modular length constraint loss of image classification, and finally, a class corresponding to the maximum score in the class scores is determined as a final class of the target image to be classified. According to the method, corresponding modular length constraint loss is added into a loss function of the model, so that the model can be quickly corrected from an incorrect classification category to a correct classification category in a training stage, and after the model is classified to the correct category, the classification confidence value difference between the correct category and the incorrect category is further enlarged, and the accuracy and the generalization performance of the model are improved.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by instructing relevant hardware by a computer program, and the image classification program may be stored in a computer-readable storage medium, and when executed, may include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory or a random access memory.
The above disclosure is only for the purpose of illustrating the preferred embodiments of the present application and is not to be construed as limiting the scope of the present application; the present application is therefore not limited thereto, and equivalent variations and modifications remain within its scope.

Claims (9)

1. A method of image classification, the method comprising:
acquiring a target image to be classified;
inputting the target image to be classified into a pre-trained image classification model, and outputting a plurality of category scores corresponding to the target image to be classified; wherein the loss function of the pre-trained image classification model comprises a cross entropy loss function and a modular length constraint loss of image classification;
determining the category corresponding to the maximum score in the plurality of category scores as the final category of the target image to be classified; wherein,
before the target image to be classified is obtained, the method further comprises the following steps:
constructing an image classification training data set;
establishing and initializing an image classification network to obtain an initial image classification model;
constructing a modular length constraint loss aiming at the length of the feature vector, and constructing a target loss function of the image classification model based on the modular length constraint loss;
generating a pre-trained image classification model according to the image classification training data set, an initial image classification model and the target loss function; wherein,
the target loss function expression is:
L_i = -\log \frac{e^{W_{c_i}^{\mathrm{T}} f_i}}{\sum_{k=1}^{K} e^{W_k^{\mathrm{T}} f_i}} + y_i \lambda_1 \lVert f_i \rVert + (1 - y_i) \frac{\lambda_2}{\lVert f_i \rVert + \varepsilon}

wherein i denotes that the input image is the ith image in the data set; the first term is the cross entropy loss function of image classification, in which e is the natural constant, W is the fully connected layer parameter, T is the transposition operation, c_i is the real category of the ith image and K is the number of categories; y_i λ_1 ||f_i|| is the modular length constraint loss on the length of the feature vector when the target classification is wrong, in which f_i is the feature extracted by the model for the input image, y_i is the category indication value (y_i = 1 when the target classification is wrong, and y_i = 0 otherwise) and λ_1 is the weight of the constraint; (1 − y_i) λ_2 / (||f_i|| + ε) is the modular length constraint loss on the length of the feature vector when the target classification is correct, in which λ_2 is the weight of the constraint and ε, set to ε = 10^-7, prevents the denominator from being 0.
2. The method of claim 1, wherein constructing an image classification training dataset comprises:
collecting a sample image;
and marking a real class label for each acquired sample image to generate an image classification training data set.
3. The method of claim 1, wherein the creating and initializing an image classification network to obtain an initial image classification model comprises:
constructing an image classification network by adopting a neural network;
and initializing the image classification network by adopting a random initialization function or utilizing a pre-training model to obtain an initial image classification model.
4. The method of claim 1, wherein generating a pre-trained image classification model from the image classification training dataset, an initial image classification model, and the objective loss function comprises:
acquiring an nth training image from the image classification training data set, inputting the nth training image into the image classification model, and obtaining a target feature corresponding to the training image and a confidence coefficient of each category; n is the traversal times;
determining the category corresponding to the maximum confidence coefficient in the confidence coefficients of each category as the output category of the training image;
determining a class indication value of the training image according to the output class of the training image;
calculating a model loss value according to the target features corresponding to the training image and the class indication value of the training image, in combination with the target loss function;
and generating a pre-trained image classification model according to the model loss value.
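The per-image forward pass of claim 4 might look like the sketch below. The FeatureClassifier wrapper, which returns both the category scores and the feature vector, and the loss_fn signature are assumptions made so that the target features, confidences, output category and class indication value are all available in one step.

```python
import torch
import torch.nn as nn

class FeatureClassifier(nn.Module):
    """Backbone plus fully connected head; also returns the feature vector."""
    def __init__(self, backbone, feat_dim, num_classes):
        super().__init__()
        self.backbone = backbone                      # maps image -> feature vector f_i
        self.fc = nn.Linear(feat_dim, num_classes)    # fully connected layer parameter W

    def forward(self, x):
        feats = self.backbone(x)
        return self.fc(feats), feats                  # category scores, target features

def training_step(model, image, label, loss_fn):
    logits, feats = model(image)
    confidences = logits.softmax(dim=1)               # confidence coefficient of each category
    output_class = confidences.argmax(dim=1)          # category with the maximum confidence
    indicator = (output_class != label).float()       # class indication value: 1 if wrong, 0 if correct
    loss = loss_fn(logits, feats, label)              # e.g. the target_loss sketched after claim 1
    return loss, output_class, indicator
```

With a torchvision ResNet-18, for example, the backbone could be nn.Sequential(*list(resnet18(weights=None).children())[:-1], nn.Flatten()) with feat_dim = 512.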
5. The method of claim 4, wherein determining the class indicator value for the training image based on the output class of the training image comprises:
judging whether the output category of the training image is consistent with the real category of the training image;
if the output category is consistent with the real category, determining 0 as the class indication value of the training image;
or,
if the output category is inconsistent with the real category, determining 1 as the class indication value of the training image.
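Claim 5 reduces to a one-line comparison; the helper below is a hypothetical batched version of that rule.

```python
import torch

def class_indication(output_class: torch.Tensor, true_class: torch.Tensor) -> torch.Tensor:
    # 0 when the output category matches the real category, 1 otherwise.
    return (output_class != true_class).float()
```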
6. The method of claim 4, wherein generating a pre-trained image classification model from the model loss values comprises:
when the model loss value is lower than a preset threshold value or the number of training iterations exceeds a preset upper limit, generating a trained image classification model;
or,
when the model loss value is higher than the preset threshold value or the number of training iterations has not reached the preset upper limit, performing iterative optimization on the image classification model by using a stochastic gradient descent algorithm, and continuing to perform the step of acquiring the (n+1)th training image from the image classification training data set and inputting it into the image classification model; when n+1 is greater than the number of samples in the image classification training data set, randomly shuffling the order of the images in the image classification training data set and resetting n = 1; wherein n is the traversal count.
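A loop of the shape described in claim 6 is sketched below, assuming the two-output model and loss signature used in the earlier sketches; the learning rate, loss threshold and step cap are placeholder values, and the 0-based index n stands in for the claim's 1-based traversal counter.

```python
import random
import torch

def train(model, dataset, loss_fn, lr=0.01, loss_threshold=0.05, max_steps=100_000):
    """Hypothetical loop: SGD updates until the loss falls below a preset
    threshold or the step count exceeds a preset upper limit."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    order = list(range(len(dataset)))
    random.shuffle(order)
    n = 0
    for step in range(1, max_steps + 1):
        if n >= len(dataset):                 # every sample traversed: reshuffle and reset n
            random.shuffle(order)
            n = 0
        image, label = dataset[order[n]]
        n += 1
        logits, feats = model(image.unsqueeze(0))          # assumed (scores, features) output
        loss = loss_fn(logits, feats, torch.tensor([label]))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if loss.item() < loss_threshold:      # preset threshold reached: stop training
            break
    return model
```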
7. An image classification apparatus, characterized in that the apparatus comprises:
the image acquisition module is used for acquiring a target image to be classified;
the class score output module is used for inputting the target image to be classified into a pre-trained image classification model and outputting a plurality of class scores corresponding to the target image to be classified; wherein the loss function of the pre-trained image classification model comprises a cross entropy loss function and a modular length constraint loss of image classification;
a final category determining module, configured to determine the category corresponding to the maximum score in the plurality of category scores as the final category of the target image to be classified; wherein,
the apparatus is further specifically configured to:
constructing an image classification training data set;
establishing and initializing an image classification network to obtain an initial image classification model;
constructing a modular length constraint loss for the length of the feature vector, and constructing a target loss function of the image classification model based on the modular length constraint loss;
generating a pre-trained image classification model according to the image classification training data set, an initial image classification model and the target loss function; wherein,
the target loss function expression is:

L_i = -log( e^(W_{y_i}^T f_i) / Σ_j e^(W_j^T f_i) ) + δ_i · α · L_err(f_i) + (1 - δ_i) · β · L_cor(f_i)

wherein i denotes that the input image is the i-th image in the data set; the first term is the cross entropy loss function of image classification, in which e is the natural constant, W is the fully connected layer parameter, T is the transposition operation, and y_i is the real category of the i-th image; L_err represents the modular length constraint loss on the length of the feature vector when the target classification is wrong; f_i represents the feature extracted by the model for the input image; δ_i is a class indication value, with δ_i = 1 when the target classification is incorrect and δ_i = 0 when it is correct; α is the weight of the wrong-classification constraint; L_cor represents the modular length constraint loss on the length of the feature vector when the target classification is correct; β is the weight of that constraint; and ε = 10^-7 is set to prevent the denominator from being 0.
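On the inference side, the modules of claim 7 roughly correspond to the following sketch, again assuming a model that returns (scores, features) and an illustrative 224×224 preprocessing; class_names and the device argument are hypothetical.

```python
import torch
import torchvision.transforms as T
from PIL import Image

def classify(model, image_path, class_names, device="cpu"):
    """Acquire the target image, score every category with the pre-trained
    model, and return the category with the maximum score."""
    transform = T.Compose([T.Resize((224, 224)), T.ToTensor()])
    image = transform(Image.open(image_path).convert("RGB")).unsqueeze(0).to(device)
    model.eval()
    with torch.no_grad():
        logits, _ = model(image)                         # plurality of category scores
    return class_names[logits.argmax(dim=1).item()]      # final category
```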
8. A computer storage medium storing instructions adapted to be loaded by a processor and to perform the method steps according to any of claims 1 to 6.
9. A terminal, comprising: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to carry out the method steps according to any one of claims 1 to 6.
CN202210758519.0A 2022-06-30 2022-06-30 Image classification method and device, storage medium and terminal Active CN114821207B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210758519.0A CN114821207B (en) 2022-06-30 2022-06-30 Image classification method and device, storage medium and terminal

Publications (2)

Publication Number Publication Date
CN114821207A CN114821207A (en) 2022-07-29
CN114821207B true CN114821207B (en) 2022-11-04

Family

ID=82523105

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210758519.0A Active CN114821207B (en) 2022-06-30 2022-06-30 Image classification method and device, storage medium and terminal

Country Status (1)

Country Link
CN (1) CN114821207B (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113901922A (en) * 2021-10-11 2022-01-07 北京大学深圳研究生院 Hidden representation decoupling network-based occluded pedestrian re-identification method and system
CN114529750A (en) * 2021-12-28 2022-05-24 深圳云天励飞技术股份有限公司 Image classification method, device, equipment and storage medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114424253A (en) * 2019-11-08 2022-04-29 深圳市欢太科技有限公司 Model training method and device, storage medium and electronic equipment
WO2021097055A1 (en) * 2019-11-14 2021-05-20 Nec Laboratories America, Inc. Domain adaptation for semantic segmentation via exploiting weak labels
CN113033612A (en) * 2021-02-24 2021-06-25 清华大学 Image classification method and device
CN113065013A (en) * 2021-03-25 2021-07-02 携程计算机技术(上海)有限公司 Image annotation model training and image annotation method, system, device and medium
CN113963248A (en) * 2021-10-29 2022-01-21 北京市商汤科技开发有限公司 Method, device, equipment and storage medium for neural network training and scene decision
CN113918743A (en) * 2021-12-15 2022-01-11 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Model training method for image classification under long-tail distribution scene
CN114419378A (en) * 2022-03-28 2022-04-29 杭州未名信科科技有限公司 Image classification method and device, electronic equipment and medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Coast type based accuracy assessment for coastline extraction from satellite image with machine learning classifiers; Osman et al.; The Egyptian Journal of Remote Sensing and Space Science; 2022-02-28; pp. 289-299 *
Research on Image Recognition Algorithms Based on Convolutional Neural Networks (基于卷积神经网络的图像识别算法研究); Ye Jianlong et al.; Journal of Anyang Normal University (安阳师范学院学报); 2021-10-15; pp. 14-18 *

Similar Documents

Publication Publication Date Title
CN112434721B (en) Image classification method, system, storage medium and terminal based on small sample learning
CN108280455B (en) Human body key point detection method and apparatus, electronic device, program, and medium
JP2020501238A (en) Face detection training method, apparatus and electronic equipment
CN111401516A (en) Neural network channel parameter searching method and related equipment
CN109657539B (en) Face value evaluation method and device, readable storage medium and electronic equipment
CN112488999B (en) Small target detection method, small target detection system, storage medium and terminal
CN107292229A (en) A kind of image-recognizing method and device
CN113408570A (en) Image category identification method and device based on model distillation, storage medium and terminal
CN109671055B (en) Pulmonary nodule detection method and device
CN113128419B (en) Obstacle recognition method and device, electronic equipment and storage medium
CN113205142A (en) Target detection method and device based on incremental learning
CN115082740B (en) Target detection model training method, target detection device and electronic equipment
CN111950702A (en) Neural network structure determining method and device
CN110490058B (en) Training method, device and system of pedestrian detection model and computer readable medium
CN110069997B (en) Scene classification method and device and electronic equipment
CN115081615A (en) Neural network training method, data processing method and equipment
CN116485796B (en) Pest detection method, pest detection device, electronic equipment and storage medium
CN114821207B (en) Image classification method and device, storage medium and terminal
CN112633085A (en) Human face detection method, system, storage medium and terminal based on attention guide mechanism
CN114780863B (en) Project recommendation method and device based on artificial intelligence, computer equipment and medium
CN112579907A (en) Abnormal task detection method and device, electronic equipment and storage medium
CN112712121A (en) Image recognition model training method and device based on deep neural network and storage medium
CN115713669B (en) Image classification method and device based on inter-class relationship, storage medium and terminal
JP7053195B2 (en) House change estimation device, house change learning device, house change estimation method, parameter generation method and program of classifier
CN112257840A (en) Neural network processing method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant