CN111159450A - Picture classification method and device, computer equipment and storage medium - Google Patents

Picture classification method and device, computer equipment and storage medium

Info

Publication number
CN111159450A
CN111159450A (application CN201911391376.9A)
Authority
CN
China
Prior art keywords: layer, full, neural network, classification, fully
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911391376.9A
Other languages
Chinese (zh)
Inventor
周康明 (Zhou Kangming)
谈咏东 (Tan Yongdong)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Eye Control Technology Co Ltd
Original Assignee
Shanghai Eye Control Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Eye Control Technology Co Ltd filed Critical Shanghai Eye Control Technology Co Ltd
Priority to CN201911391376.9A priority Critical patent/CN111159450A/en
Publication of CN111159450A publication Critical patent/CN111159450A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes

Abstract

The application relates to a picture classification method and apparatus, a computer device and a storage medium. The classification neural network used is trained based on an initial neural network whose multiple parallel fully-connected layers output different numbers of attribute categories. When the initial neural network is trained, the feature extraction layer learns the parameter features of the pictures while each fully-connected layer is trained, and training the different fully-connected layers further drives the feature extraction layer to learn those features. For the multi-attribute classification problem, providing multiple parallel fully-connected layers in the initial neural network therefore lets the network learn the parameter features of the pictures more accurately, so the classification results obtained when the trained classification neural network classifies the attributes of a picture are more accurate.

Description

Picture classification method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image classification method and apparatus, a computer device, and a storage medium.
Background
As deep convolutional neural networks are applied more widely to image classification, single-attribute classification can no longer meet practical requirements.
Multi-attribute classification is now common: a movie can be classified as both an action film and a science-fiction film, a photograph can belong to both the landscape and the vegetation categories, and so on. In complex multi-attribute classification, the boundaries of the different attribute types differ and the attributes are correlated with one another. As a result, when a deep convolutional neural network is trained, the boundary between attribute types is not drawn accurately enough, the trained model does not converge well, and the individual attributes of a picture cannot be identified accurately.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a picture classification method, apparatus, computer device and storage medium.
In a first aspect, an embodiment of the present application provides a method for classifying pictures, where the method includes:
acquiring a plurality of multi-attribute pictures;
inputting each picture into a preset classification neural network to obtain a plurality of attribute classification results;
the classification neural network is trained based on an initial classification neural network, the initial neural network comprises a plurality of parallel full-connection layers, the plurality of parallel full-connection layers share one input layer and one feature extraction layer, and the number of attribute categories output by each full-connection layer is different.
In one embodiment, the training process of the classification neural network includes:
acquiring multiple sample multi-attribute pictures;
after each sample multi-attribute picture is input into the input layer and the feature extraction layer, the fully-connected layers are trained one at a time, in increasing order of the number of attribute categories they output, each fully-connected layer being trained only after the previous one has converged; once all the fully-connected layers have converged, every fully-connected layer except the one that outputs the largest number of attribute categories is removed, and the classification neural network is obtained.
In one embodiment, the fully-connected layers of the initial neural network comprise a first fully-connected layer, a second fully-connected layer and a third fully-connected layer, where the number of attribute categories output by the first fully-connected layer < the number of attribute categories output by the second fully-connected layer < the number of attribute categories output by the third fully-connected layer;
the training process of the classification neural network includes:
after each sample multi-attribute picture is input into the input layer and the feature extraction layer, the first fully-connected layer is trained; the second fully-connected layer is trained after the first fully-connected layer converges; the third fully-connected layer is trained after the second fully-connected layer converges; and after the third fully-connected layer converges, the first fully-connected layer and the second fully-connected layer are removed to obtain the classification neural network.
In one embodiment, the initial classification neural network further includes a plurality of classification function layers, and the classification function layers are respectively connected to the back of the corresponding full-connection layer;
and the classification function layer is used for converting the output of each fully-connected layer into a value of a preset interval in the training process, determining a classification result corresponding to each fully-connected layer according to the comparison result of the value of the preset interval and a preset threshold value, and predicting the accuracy of each classification result based on a standard classification result.
In one embodiment, the loss functions of all the fully connected layers have a correlation relationship;
the association relationship at least includes: and according to the sequence of the attribute classification quantity of the full-connection layers from small to large, starting from the second full-connection layer, the comprehensive loss function of each full-connection layer is the weighted sum of the hierarchy loss function of the current full-connection layer and the hierarchy loss functions of all full-connection layers before the current full-connection layer.
In one embodiment, if the fully-connected layers of the initial neural network comprise a first fully-connected layer, a second fully-connected layer and a third fully-connected layer, the hierarchy loss function of the first fully-connected layer is a first loss function, the hierarchy loss function of the second fully-connected layer is a second loss function, and the hierarchy loss function of the third fully-connected layer is a third loss function;
the association relationship includes: the comprehensive loss function of the second fully-connected layer is a weighted sum of the first loss function and the second loss function;
the composite loss function of the third fully-connected layer is a weighted sum of the first loss function, the second loss function, and the third loss function.
In one embodiment, the condition for convergence of each fully-connected layer includes:
the accuracy of the classification function layer prediction of each fully connected layer reaches a preset accuracy range, and the value of the loss function of each fully connected layer reaches the preset range.
In a second aspect, an embodiment of the present application provides an image classification device, including:
the acquisition module is used for acquiring a plurality of multi-attribute pictures;
the classification module is used for inputting each picture into a preset classification neural network to obtain a classification result with various attributes;
the classification neural network is trained based on an initial classification neural network, the initial neural network comprises a plurality of parallel full-connection layers, the plurality of parallel full-connection layers share one input layer and one feature extraction layer, and the number of attribute categories output by each full-connection layer is different.
In a third aspect, an embodiment of the present application provides a computer device, including a memory and a processor, where the memory stores a computer program, and the processor implements the steps of any one of the methods provided in the embodiments of the first aspect when executing the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of any one of the methods provided in the embodiments of the first aspect.
In the picture classification method and apparatus, computer device and storage medium provided by the embodiments of the present application, the classification neural network used is trained based on an initial neural network whose multiple parallel fully-connected layers output different numbers of attribute categories. When the initial neural network is trained, the feature extraction layer learns the parameter features of the pictures while each fully-connected layer is trained, and training the different fully-connected layers further drives the feature extraction layer to learn those features. For the multi-attribute classification problem, arranging multiple parallel fully-connected layers in the initial neural network therefore lets the network learn the parameter features of the pictures more accurately, so the classification results obtained when the trained classification neural network classifies the attributes of a picture are more accurate.
Drawings
Fig. 1 is an application environment diagram of a picture classification method according to an embodiment;
fig. 2 is a schematic flowchart of a method for classifying pictures according to an embodiment;
FIG. 3 is a diagram illustrating an initial neural network according to an embodiment;
fig. 4 is a flowchart illustrating a method for classifying pictures according to an embodiment;
fig. 5 is a block diagram illustrating an exemplary embodiment of an apparatus for classifying pictures;
fig. 6 is a block diagram of a picture classifying device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The image classification method provided by the application can be applied to an application environment as shown in fig. 1, wherein a processor of a computer device is used for providing calculation and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing data of the picture classification. The network interface of the computer device is used for communicating with other external devices through network connection. The computer program is executed by a processor to implement a picture classification method.
The embodiments of the application provide a picture classification method and apparatus, a computer device and a storage medium. The technical solutions of the present application, and how they solve the above technical problems, are described in detail below through embodiments and with reference to the drawings. The following specific embodiments may be combined with one another, and the same or similar concepts or processes may not be repeated in some embodiments. It should be noted that, in the picture classification method provided by the present application, the methods of fig. 2 to fig. 4 are executed by a computer device; the execution subject may also be a picture classification apparatus, which may be implemented as part or all of the computer device by software, hardware, or a combination of software and hardware.
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments.
In an embodiment, fig. 2 provides a method for classifying a picture, and this embodiment relates to a specific process in which a computer device classifies a multi-attribute picture by using a preset classification neural network, as shown in fig. 2, the method includes:
s101, acquiring a plurality of multi-attribute pictures.
A multi-attribute picture is a picture that has several attributes at once; for example, the same picture may belong to the landscape category and to the plant category, and may also belong to the flower category. In practical applications the picture must be classified by multiple attributes, that is, its multiple attributes must be identified. It should be noted that the present application applies not only to multi-attribute classification of pictures but also to videos; for example, a single video clip may be both an action video and a science-fiction video, viewed from different perspectives.
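For illustration only, a multi-attribute label of this kind can be represented as a multi-hot vector over a fixed attribute vocabulary; the vocabulary and the example values below are hypothetical and are not part of the present application.

```python
# Hypothetical 3-attribute vocabulary: [landscape, plant, flower]
label_meadow = [1, 1, 1]    # a flowering meadow carries all three attributes
label_mountain = [1, 0, 0]  # a bare mountain carries only the landscape attribute
```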
And S102, inputting each picture into a preset classification neural network to obtain a classification result with various attributes.
The classification neural network is trained based on an initial classification neural network, the initial neural network comprises a plurality of parallel full-connection layers, the plurality of parallel full-connection layers share one input layer and one feature extraction layer, and the number of attribute categories output by each full-connection layer is different.
Based on the acquired multi-attribute pictures, a preset classification neural network is used to classify them. The classification neural network is trained based on an initial neural network whose structure may be as shown in fig. 3: it comprises a plurality of parallel fully-connected layers that share one input layer and one feature extraction layer, and each of the fully-connected layers outputs a different number of attribute categories.
Specifically, referring to fig. 3, the initial neural network includes three parallel fully-connected layers. Named after the number of attribute categories they output, they are a 3-class fully-connected layer, a 10-class fully-connected layer and a 100-class fully-connected layer. The 3-class layer outputs 3 attribute categories, for example dividing a set of pictures into landscapes, people and vehicles; the 10-class layer outputs 10 attribute categories, for example subdividing the same pictures into mountain, water, plant, single person, multiple persons and so on; the 100-class layer outputs 100 attribute categories, subdividing the pictures into 100 classes that are not listed one by one here.
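A minimal sketch of such an initial network is given below, assuming PyTorch; the backbone layout and layer sizes are illustrative assumptions and do not correspond to any specific architecture described in the present application.

```python
import torch
import torch.nn as nn

class InitialClassificationNet(nn.Module):
    """Shared input/feature-extraction backbone feeding parallel fully-connected heads."""

    def __init__(self, num_classes=(3, 10, 100)):
        super().__init__()
        # Shared feature extraction layer (a small convolutional stack here, for illustration).
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Parallel fully-connected heads, one per attribute granularity (3, 10, 100 classes).
        self.heads = nn.ModuleList([nn.Linear(64, c) for c in num_classes])

    def forward(self, x):
        feat = self.features(x)
        # Each head produces raw scores for its own number of attribute categories.
        return [head(feat) for head in self.heads]
```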
That is, in the classification method provided by this embodiment, the classification neural network is trained based on an initial neural network whose parallel fully-connected layers output different numbers of attribute categories. When the initial neural network is trained, the feature extraction layer learns the parameter features of the pictures while each fully-connected layer is trained, and training the different fully-connected layers further drives the feature extraction layer to learn those features. For the multi-attribute classification problem, this means that arranging several parallel fully-connected layers in the initial neural network lets the network learn the parameter features of the pictures more accurately, so the classification results obtained when the trained classification neural network classifies picture attributes are more accurate.
Based on the foregoing embodiments, an embodiment is provided for a specific training process of an initial neural network, as shown in fig. 4, and in an embodiment, the training process of the classification neural network includes:
s201, acquiring multiple sample multi-attribute pictures.
In this embodiment, training the neural network first requires training sample data; here the training sample data are a large number of multi-attribute pictures collected as samples. There is no restriction on the type of picture collected. To ensure that the trained classification neural network classifies pictures accurately, as many different types of sample data as possible should be selected when the training data are obtained, so that the sample data are rich enough for the neural network to learn finer picture features.
S202, after each sample multi-attribute picture is input into the input layer and the feature extraction layer, train the fully-connected layers one at a time, in increasing order of the number of attribute categories they output, training each fully-connected layer only after the previous one has converged, until all the fully-connected layers converge; then remove every fully-connected layer except the one with the largest number of output attribute categories to obtain the classification neural network.
Based on the obtained training sample data, the training proceeds as follows: after the sample multi-attribute pictures are input to the input layer and the feature extraction layer, one of the parallel fully-connected layers is trained; once that layer converges, the next fully-connected layer is trained. Training follows the order of the number of attribute categories output by the fully-connected layers, from smallest to largest, one layer at a time, with each layer trained only after the previous one has converged. When all the fully-connected layers have converged, every fully-connected layer except the one with the largest number of output attribute categories is removed, the classification neural network has been trained, and the classification neural network of fig. 2 is obtained.
It should also be noted that, if a classification neural network that classifies 100 classes of attributes is needed, then when the several fully-connected layers of the initial neural network are set up, the fully-connected layer with the largest number of output attribute categories is set as the 100-class fully-connected layer, and every other fully-connected layer outputs fewer than 100 categories; the 100 classes of attributes are the fine-grained sub-classes of the preceding coarser classes. In other words, when the initial neural network is trained according to this scheme, the complex fine-grained multi-attribute classification problem is converted into a simpler coarse-grained multi-attribute classification problem through progressive training from coarse classes to fine classes: a classification model is first trained on the coarse problem, and the multi-attribute classification model is then fine-tuned on that basis to obtain the final fine-grained model.
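The progressive training just described can be sketched as follows, again assuming PyTorch and the InitialClassificationNet sketched earlier; the optimizer settings, the choice of binary cross-entropy with logits as the per-head loss, and the fixed epoch count standing in for the convergence test are assumptions rather than details given in the application.

```python
import torch
import torch.nn as nn

def train_progressively(model, loader, epochs_per_head=10, device="cpu"):
    criterion = nn.BCEWithLogitsLoss()                 # assumed per-head multi-label loss
    for head_idx, _ in enumerate(model.heads):         # 3-class, then 10-class, then 100-class head
        optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
        # In the scheme described above this inner loop runs until the current
        # head converges; a fixed epoch count stands in for that check here.
        for _ in range(epochs_per_head):
            for images, labels in loader:              # labels: list of multi-hot targets, one per head
                images = images.to(device)
                target = labels[head_idx].to(device).float()
                logits = model(images)[head_idx]       # only the current head is supervised
                loss = criterion(logits, target)
                optimizer.zero_grad()
                loss.backward()                        # the shared backbone keeps learning at every stage
                optimizer.step()
    # Once every head has been trained, only the head with the most attribute
    # categories is kept; the smaller heads are removed from the deployed network.
    model.heads = nn.ModuleList([model.heads[-1]])
    return model
```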
Taking as an example an initial neural network whose fully-connected layers comprise a first, a second and a third fully-connected layer, where the number of attribute categories output by the first fully-connected layer < the number output by the second fully-connected layer < the number output by the third fully-connected layer, an embodiment of the training process is provided. In this embodiment, the training process of the classification neural network includes:
after each sample multi-attribute picture is input into the input layer and the feature extraction layer, the first fully-connected layer is trained; the second fully-connected layer is trained after the first fully-connected layer converges; the third fully-connected layer is trained after the second fully-connected layer converges; and after the third fully-connected layer converges, the first fully-connected layer and the second fully-connected layer are removed to obtain the classification neural network.
More specifically, referring to fig. 2, the task can be regarded as classifying pictures into 100 classes; the same pictures can also be grouped into 10 classes along new boundaries, and those 10 classes can be further grouped into 3 classes along certain boundaries. This corresponds to the embodiment in which the first fully-connected layer is the 3-class fully-connected layer, the second fully-connected layer is the 10-class fully-connected layer, and the third fully-connected layer is the 100-class fully-connected layer.
During training, after the sample multi-attribute pictures are input into the input layer and the feature extraction layer, the 3-class fully-connected layer is trained; after it converges, the 10-class fully-connected layer is trained; after that converges, the 100-class fully-connected layer is trained; and after the 100-class fully-connected layer converges, the 3-class and 10-class fully-connected layers are removed, yielding a classification neural network that can classify a set of pictures into 100 attributes.
In this embodiment the fully-connected layers for the different category counts are connected in parallel. When the 3-class fully-connected layer is trained, a picture enters through the input layer, passes through the feature extraction layer and then reaches the 3-class fully-connected layer; at this point the feature extraction layer of the initial neural network learns the parameter features that divide pictures into three attributes. When the 10-class fully-connected layer is trained, the picture again enters through the input layer and passes through the feature extraction layer, but now reaches the 10-class fully-connected layer; at this point, building on the already-trained 3-class fully-connected layer, the feature extraction layer learns the parameter features that divide pictures into ten attributes. Likewise, when the 100-class fully-connected layer is trained, the feature extraction layer, building on the trained 10-class fully-connected layer, learns the parameter features that divide pictures into 100 attributes. The feature extraction layer thus learns the picture classification features step by step, making its feature recognition and classification of pictures more accurate.
In this embodiment, the initial neural network is trained progressively, from a small number of attribute categories to a large one, so that the feature extraction layer learns increasingly accurate parameter features of the pictures step by step, from easy to difficult. This not only lets the classification neural network converge quickly but also improves its recognition accuracy.
In addition, when the initial neural network is trained, a sigmoid activation function can be connected after each fully-connected layer of the neural network: sigmoid activation is applied to every value output by a fully-connected layer, and when an activated value is larger than a threshold the input is considered to have the corresponding attribute, otherwise it does not. Specifically, in one embodiment, the initial classification neural network further includes a plurality of classification function layers, each connected after its corresponding fully-connected layer; the classification function layer is used, during training, to convert the output of each fully-connected layer into a value within a preset interval, to determine the classification result of each fully-connected layer from the comparison between that value and a preset threshold, and to predict the accuracy of each classification result based on the standard classification result.
In this embodiment the classification function layer is built on a sigmoid activation function, and one classification function layer is connected after each fully-connected layer. The sigmoid activation function in the classification function layer maps the output of each fully-connected layer into an interval, for example the interval [0, 1], producing new output values; for instance, the output of the 3-class fully-connected layer might be transformed into [0.2, 0.6, 0.9].
After the output of the fully-connected layer has been mapped into the preset interval, the classification result for each fully-connected layer is determined by comparing the mapped values with a preset threshold. For example, with a preset threshold of 0.5, comparing the values [0.2, 0.6, 0.9] with 0.5 gives 0.2 < 0.5, while 0.6 and 0.9 are both greater than 0.5. Optionally, the classification result for each fully-connected layer is then encoded from this comparison: if a value greater than the preset threshold maps to 1 and a value smaller than the threshold maps to 0, the encoded classification result is [0, 1, 1]. Further, the accuracy of the classification result is determined from the standard classification result, which represents a preset ground truth; for instance, the accuracy may be obtained from the ratio between the standard classification result and the predicted classification result, which is not limited in this embodiment. Because the accuracy of every classification result can be predicted, the performance of the neural network can be evaluated more quickly and accurately during training.
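A small sketch of this classification-function-layer behaviour follows, assuming PyTorch; the threshold of 0.5 and the example logits are illustrative choices, not values fixed by the application.

```python
import torch

def predict_and_score(logits, standard, threshold=0.5):
    probs = torch.sigmoid(logits)             # e.g. tensor([0.2, 0.6, 0.9])
    predicted = (probs > threshold).int()     # -> tensor([0, 1, 1])
    # Accuracy against the standard (ground-truth) classification result.
    accuracy = (predicted == standard.int()).float().mean().item()
    return predicted, accuracy

logits = torch.tensor([-1.39, 0.41, 2.20])    # sigmoid of these is roughly [0.2, 0.6, 0.9]
standard = torch.tensor([0, 1, 1])            # standard classification result (true labels)
print(predict_and_score(logits, standard))    # (tensor([0, 1, 1], dtype=torch.int32), 1.0)
```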
When the neural network is trained, the adjustment direction of the training needs to be guided by a loss function; in this embodiment a rule is therefore set for the loss function of each fully-connected layer. Specifically, in one embodiment, the loss functions of all the fully-connected layers are related to one another; the relation at least includes: in increasing order of the number of attribute categories of the fully-connected layers, starting from the second fully-connected layer, the comprehensive loss function of each fully-connected layer is the weighted sum of the hierarchy loss function of the current fully-connected layer and the hierarchy loss functions of all the fully-connected layers before it.
In this embodiment, when the fully-connected layers are trained separately in turn, a relation is established among their loss functions; an example of the relation is: in increasing order of the number of attribute categories, starting from the second fully-connected layer, the comprehensive loss function of each fully-connected layer is built from its own hierarchy loss function and the hierarchy loss functions of all the fully-connected layers before it.
Suppose the fully-connected layers of the initial neural network comprise a first, a second and a third fully-connected layer, and the hierarchy loss function of the first fully-connected layer is a first loss function, that of the second fully-connected layer is a second loss function, and that of the third fully-connected layer is a third loss function. Then, for example, the relation includes: the comprehensive loss function of the second fully-connected layer is a weighted sum of the first loss function and the second loss function; the comprehensive loss function of the third fully-connected layer is a weighted sum of the first loss function, the second loss function and the third loss function.
Specifically, assume that the initial neural network ending in the 3-class fully-connected layer is R1, the one ending in the 10-class fully-connected layer is R2, and the one ending in the 100-class fully-connected layer is R3; the hierarchy loss function of the 3-class fully-connected layer is L1, that of the 10-class fully-connected layer is L2, and that of the 100-class fully-connected layer is L3.
Then the comprehensive loss function of R1 is La = L1; the comprehensive loss function of R2 is Lb = a×L1 + b×L2, where a and b are the weights of the loss functions L1 and L2; and the comprehensive loss function of R3 is Lc = c×L3 + e×L2 + f×L1, where c, e and f are the weights of the loss functions L3, L2 and L1. Optionally, Lc = c×L3 + e×Lb, where c and e are the weights of L3 and Lb.
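These associated loss functions can be written out as a small sketch; the weights a, b, c, e, f are hyperparameters whose values the application does not fix, so the defaults below are placeholders.

```python
def composite_losses(L1, L2, L3, a=1.0, b=1.0, c=1.0, e=1.0, f=1.0):
    La = L1                        # loss used while training the 3-class head (R1)
    Lb = a * L1 + b * L2           # loss used while training the 10-class head (R2)
    Lc = c * L3 + e * L2 + f * L1  # loss used while training the 100-class head (R3)
    # Equivalent alternative mentioned in the text: Lc = c * L3 + e * Lb
    return La, Lb, Lc
```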
In this embodiment an association is set among the loss functions of the fully-connected layers, and the loss function of each later fully-connected layer is built on the loss functions of the earlier ones. When the value of a later layer's loss function is observed, the earlier layers' loss functions are therefore taken into account, making the evaluation of the loss more comprehensive, so the value of each fully-connected layer's loss function guides the training direction of the neural network more accurately.
To make the convergence of the neural network more reliable, an embodiment of the convergence condition is provided. In this embodiment, the condition for each fully-connected layer to converge includes: the prediction accuracy of the classification function layer of the fully-connected layer reaches a preset accuracy range, and the value of the loss function of the fully-connected layer reaches a preset range. When the convergence of the neural network is judged, both the prediction accuracy of each fully-connected layer's classification function layer and the value of its loss function are considered, which guarantees the convergence behaviour of the neural network.
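A hedged sketch of this dual convergence test follows; the concrete accuracy and loss bounds are assumptions and would in practice be chosen per task. Such a check could stand in for the fixed epoch count used in the training sketch earlier.

```python
def head_converged(accuracy, loss_value, min_accuracy=0.95, max_loss=0.05):
    # Both conditions must hold: the predicted accuracy is inside the preset
    # accuracy range and the loss value is inside the preset range.
    # The 0.95 / 0.05 bounds are placeholder assumptions.
    return accuracy >= min_accuracy and loss_value <= max_loss
```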
It should be understood that although the steps in the flow charts of fig. 2 to 4 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise herein, the order of these steps is not strictly limited and they may be performed in other orders. Moreover, at least some of the steps in fig. 2 to 4 may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different times, and their order of execution is not necessarily sequential; they may be performed in turn or in alternation with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 5, there is provided a picture classification apparatus including: an obtaining module 10, a classifying module 11, wherein,
the acquisition module 10 is used for acquiring a plurality of multi-attribute pictures;
the classification module 11 is configured to input each picture into a preset classification neural network to obtain a classification result with multiple attributes; the classification neural network is trained based on an initial classification neural network, the initial neural network comprises a plurality of parallel full-connection layers, the plurality of parallel full-connection layers share one input layer and one feature extraction layer, and the number of attribute categories output by each full-connection layer is different.
In one embodiment, as shown in fig. 6, there is provided a picture classification apparatus, further comprising:
the sample acquisition module 12 is configured to acquire multiple sample multi-attribute pictures;
and the training module 13 is configured to, after each sample multi-attribute picture is input to the input layer and the feature extraction layer, train the fully-connected layers one at a time, in increasing order of the number of attribute categories they output, training each next fully-connected layer only after the previous one has converged, until all the fully-connected layers converge, and then remove every fully-connected layer except the one with the largest number of output attribute categories, obtaining the classification neural network.
In one embodiment, the fully-connected layers of the initial neural network comprise a first fully-connected layer, a second fully-connected layer and a third fully-connected layer, where the number of attribute categories output by the first fully-connected layer < the number of attribute categories output by the second fully-connected layer < the number of attribute categories output by the third fully-connected layer;
the training module 13 is configured to train the first fully-connected layer after the sample multi-attribute pictures are input to the input layer and the feature extraction layer, train the second fully-connected layer after the first fully-connected layer converges, train the third fully-connected layer after the second fully-connected layer converges, and, after the third fully-connected layer converges, remove the first fully-connected layer and the second fully-connected layer to obtain the classification neural network.
In one embodiment, the initial classification neural network further includes a plurality of classification function layers, and the classification function layers are respectively connected to the back of the corresponding full-connection layer; and the classification function layer is used for converting the output of each fully-connected layer into a value of a preset interval in the training process, determining a classification result corresponding to each fully-connected layer according to the comparison result of the value of the preset interval and a preset threshold value, and predicting the accuracy of each classification result based on a standard classification result.
In one embodiment, the loss functions of all the fully connected layers have a correlation relationship; the association relationship at least includes: and according to the sequence of the attribute classification quantity of the full-connection layers from small to large, starting from the second full-connection layer, the comprehensive loss function of each full-connection layer is the weighted sum of the hierarchy loss function of the current full-connection layer and the hierarchy loss functions of all full-connection layers before the current full-connection layer.
In one embodiment, if the fully-connected layers of the initial neural network comprise a first fully-connected layer, a second fully-connected layer and a third fully-connected layer, the hierarchy loss function of the first fully-connected layer is a first loss function, the hierarchy loss function of the second fully-connected layer is a second loss function, and the hierarchy loss function of the third fully-connected layer is a third loss function;
the association relationship includes: the comprehensive loss function of the second fully-connected layer is a weighted sum of the first loss function and the second loss function; the composite loss function of the third fully-connected layer is a weighted sum of the first loss function, the second loss function, and the third loss function.
In one embodiment, the above-mentioned condition for convergence of each fully-connected layer includes: the accuracy of the classification function layer prediction of each fully connected layer reaches a preset accuracy range, and the value of the loss function of each fully connected layer reaches the preset range.
The picture classification apparatuses provided in the above embodiments have implementation principles and technical effects similar to those of the method embodiments above, and are not described again here.
For the specific definition of the picture classification apparatus, reference may be made to the definition of the picture classification method above, which is not repeated here. All or part of the modules in the picture classification apparatus can be implemented by software, by hardware, or by a combination of the two. The modules can be embedded, in hardware form, in or independent of a processor of the computer device, or stored in software form in a memory of the computer device, so that the processor can invoke them and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a terminal, the internal structure of which may be as described above in fig. 1. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a picture classification method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 1 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied; a particular computer device may include more or fewer components than shown, combine certain components, or arrange the components differently.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
acquiring a plurality of multi-attribute pictures;
inputting each picture into a preset classification neural network to obtain a plurality of attribute classification results;
the classification neural network is trained based on an initial classification neural network, the initial neural network comprises a plurality of parallel full-connection layers, the plurality of parallel full-connection layers share one input layer and one feature extraction layer, and the number of attribute categories output by each full-connection layer is different.
The implementation principle and technical effect of the computer device provided by the above embodiment are similar to those of the above method embodiment, and are not described herein again.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
acquiring a plurality of multi-attribute pictures;
inputting each picture into a preset classification neural network to obtain a plurality of attribute classification results;
the classification neural network is trained based on an initial classification neural network, the initial neural network comprises a plurality of parallel full-connection layers, the plurality of parallel full-connection layers share one input layer and one feature extraction layer, and the number of attribute categories output by each full-connection layer is different.
The implementation principle and technical effect of the computer-readable storage medium provided by the above embodiments are similar to those of the above method embodiments, and are not described herein again.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware; the program can be stored in a non-volatile computer-readable storage medium, and when executed it can include the processes of the above method embodiments. Any reference to memory, storage, a database or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A method for classifying pictures, the method comprising:
acquiring a plurality of multi-attribute pictures;
inputting each picture into a preset classification neural network to obtain a classification result of multiple attributes;
the classification neural network is trained based on an initial classification neural network, the initial neural network comprises a plurality of parallel full-connection layers, the plurality of parallel full-connection layers share one input layer and one feature extraction layer, and the number of attribute categories output by each full-connection layer is different.
2. The method of claim 1, wherein the training process of the classification neural network comprises:
acquiring multiple sample multi-attribute pictures;
and after each sample multi-attribute picture is input into the input layer and the feature extraction layer, according to the sequence that the number of the attribute categories output by the full connection layers is from small to large and the next full connection layer is trained after the previous full connection layer is converged, each full connection layer is sequentially and independently trained until all the full connection layers are converged, all the full connection layers except the full connection layer with the maximum output attribute category number are annotated, and the classified neural network is obtained.
3. The method of claim 2, wherein if the fully-connected layers of the initial neural network comprise a first fully-connected layer, a second fully-connected layer, and a third fully-connected layer; the number of attribute categories output by the first full connection layer < the number of attribute categories output by the second full connection layer < the number of attribute categories output by the third full connection layer;
the training process of the classification neural network comprises:
after each sample multi-attribute picture is input into the input layer and the feature extraction layer, training the first fully-connected layer; training the second fully-connected layer after the first fully-connected layer converges; training the third fully-connected layer after the second fully-connected layer converges; and, after the third fully-connected layer converges, removing the first fully-connected layer and the second fully-connected layer to obtain the classification neural network.
4. The method of claim 2 or 3, wherein the initial classification neural network further comprises a plurality of classification function layers, each connected after a corresponding fully-connected layer;
and the classification function layer is used for converting the output of each fully-connected layer into a value in a preset interval in the training process, determining a classification result corresponding to each fully-connected layer according to the comparison result of the value in the preset interval and a preset threshold value, and predicting the accuracy of each classification result based on a standard classification result.
5. The method of claim 4, wherein the loss functions of each of the fully-connected layers are correlated;
the association relationship at least comprises: according to the sequence of the attribute classification quantity of the full-connection layers from small to large, starting from the second full-connection layer, the comprehensive loss function of each full-connection layer is the weighted sum of the hierarchy loss function of the current full-connection layer and the hierarchy loss functions of all full-connection layers before the current full-connection layer.
6. The method of claim 5, wherein if the fully-connected layers of the initial neural network comprise a first fully-connected layer, a second fully-connected layer, and a third fully-connected layer; the hierarchical loss function of the first fully-connected layer is a first loss function, the hierarchical loss function of the second fully-connected layer is a second loss function, and the hierarchical loss function of the third fully-connected layer is a third loss function;
the association relationship includes: the composite loss function of the second fully-connected layer is a weighted sum of the first loss function and the second loss function;
the composite loss function of the third fully-connected layer is a weighted sum of the first loss function, the second loss function, and the third loss function.
7. The method of claim 5, wherein the condition for each fully-connected layer to converge comprises:
the prediction accuracy of the classification function layer of each fully connected layer reaches a preset accuracy range, and the value of the loss function of each fully connected layer reaches a preset range.
8. An apparatus for classifying pictures, the apparatus comprising:
the acquisition module is used for acquiring a plurality of multi-attribute pictures;
the classification module is used for inputting each picture into a preset classification neural network to obtain a classification result of multiple attributes;
the classification neural network is trained based on an initial classification neural network, the initial neural network comprises a plurality of parallel full-connection layers, the plurality of parallel full-connection layers share one input layer and one feature extraction layer, and the number of attribute categories output by each full-connection layer is different.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN201911391376.9A 2019-12-30 2019-12-30 Picture classification method and device, computer equipment and storage medium Pending CN111159450A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911391376.9A CN111159450A (en) 2019-12-30 2019-12-30 Picture classification method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111159450A (en) 2020-05-15

Family

ID=70559393

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911391376.9A Pending CN111159450A (en) 2019-12-30 2019-12-30 Picture classification method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111159450A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113450320A (en) * 2021-06-17 2021-09-28 浙江德尚韵兴医疗科技有限公司 Ultrasonic nodule grading and benign and malignant prediction method based on deeper network structure

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106485235A (en) * 2016-10-24 2017-03-08 厦门美图之家科技有限公司 A kind of convolutional neural networks generation method, age recognition methods and relevant apparatus
CN107480774A (en) * 2017-08-11 2017-12-15 山东师范大学 Dynamic neural network model training method and device based on integrated study
CN108052894A (en) * 2017-12-11 2018-05-18 北京飞搜科技有限公司 More attribute recognition approaches, equipment, medium and the neutral net of a kind of target object
WO2019076227A1 (en) * 2017-10-20 2019-04-25 北京达佳互联信息技术有限公司 Human face image classification method and apparatus, and server



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination