CN111401464B - Classification method, classification device, electronic equipment and computer-readable storage medium - Google Patents


Info

Publication number
CN111401464B
Authority
CN
China
Prior art keywords
image
category
preset
sample
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010218996.9A
Other languages
Chinese (zh)
Other versions
CN111401464A (en)
Inventor
郭冠军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Douyin Vision Co Ltd
Original Assignee
Douyin Vision Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Douyin Vision Co Ltd
Priority to CN202010218996.9A
Publication of CN111401464A
Application granted
Publication of CN111401464B
Current legal status: Active


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/243 - Classification techniques relating to the number of classes
    • G06F18/2431 - Multiple classes
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 - Salient features, e.g. scale invariant feature transforms [SIFT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present disclosure relate to the field of image processing and disclose a classification method, a classification apparatus, an electronic device, and a computer-readable storage medium. The classification method includes: determining first image features respectively corresponding to at least one target image; determining, according to the first image features of each target image, a category weight of that target image for each first predetermined image category, the first predetermined image categories being preset by the terminal device according to a category setting instruction of the user; and determining, according to the category weights, the first predetermined image category corresponding to each target image, so as to classify the target images. The method can automatically classify the target images on the terminal device according to user-defined image categories instead of being limited to the device's built-in categories, which greatly improves user participation and the flexibility of category setup, and improves the user experience.

Description

Classification method, classification device, electronic equipment and computer-readable storage medium
Technical Field
Embodiments of the present disclosure relate to the field of image processing, and in particular to a classification method, a classification apparatus, an electronic device, and a computer-readable storage medium.
Background
A terminal device can capture images with its camera, obtain images by interacting with other terminal devices, or obtain images by accessing the Internet or social media. Different images are shot in different scenes, such as beach, snow, night, or mountain scenes, and may contain different target objects, such as cars, people, animals, or plants.
In general, images shot in different scenes have different image features, as do images containing different target objects. A terminal device can sort images according to these features; this is image classification, an image processing technique that assigns each image to a category.
Currently, images can be classified by a pre-trained image classification model. However, the inventors of the present disclosure found in practice that, because the image categories are fixed when the model is trained, the images on a terminal device can only be sorted into those predefined categories. A user therefore cannot adjust the device's image categories to match their own preferences or habits, which leads to a poor user experience.
Disclosure of Invention
The purpose of the embodiments of the present disclosure is to address at least one of the above technical shortcomings. This summary introduces, in simplified form, concepts that are further described in the detailed description; it is not intended to identify key or essential features of the claimed subject matter, nor to limit its scope.
In one aspect, a classification method is provided, including:
determining first image features corresponding to at least one target image respectively;
determining, according to the first image features corresponding to each target image, a category weight of that target image for each first predetermined image category, the first predetermined image categories being preset by the terminal device according to a category setting instruction of the user;
and determining, according to the category weights, the first predetermined image category corresponding to each target image, so as to classify the target images.
In one aspect, a classification apparatus is provided, comprising:
the first determining module is used for determining first image features corresponding to at least one target image respectively;
the second determining module is used for determining, according to the first image features corresponding to each target image, a category weight of that target image for each first predetermined image category, wherein the first predetermined image categories are preset by the terminal device according to a category setting instruction of the user;
and the third determining module is used for determining a first preset image category corresponding to each target image respectively according to each class weight so as to classify each target image.
In one aspect, an electronic device is provided that includes a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the classification method described above when executing the program.
In one aspect, a computer readable storage medium is provided, on which a computer program is stored, which program, when executed by a processor, implements the classification method described above.
According to the classification method provided by the embodiments of the present disclosure, the category weight of each target image for each first predetermined image category is determined from the first image features of that target image, and the first predetermined image category of each target image is then determined from those category weights. Not only can the target images on the terminal device be classified automatically, but the user can also set image categories according to their own preferences or needs, so that the target images are automatically sorted into user-defined categories rather than being limited to the device's built-in categories, which greatly improves the user experience.
Additional aspects and advantages of embodiments of the disclosure will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the disclosure.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
FIG. 1 is a flow diagram of a classification method according to an embodiment of the disclosure;
FIG. 2 is a schematic diagram of a classification model according to an embodiment of the disclosure;
FIG. 3 is a schematic diagram of the basic structure of a classification apparatus according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the accompanying drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and variations thereof as used herein are intended to be open-ended, i.e., including, but not limited to. The term "based on" is based at least in part on. The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments. Related definitions of other terms will be given in the description below.
It should be noted that the terms "first," "second," and the like in this disclosure are used merely to distinguish one device, module, or unit from another device, module, or unit, and are not intended to limit the order or interdependence of the functions performed by the devices, modules, or units.
It should be noted that references to "a" and "an" in this disclosure are intended to be illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the embodiments of the present disclosure will be described in further detail below with reference to the accompanying drawings.
The embodiment of the disclosure provides a classification method, a classification device, an electronic device and a computer readable storage medium, which aim to solve the technical problems in the prior art.
The following describes in detail, with specific embodiments, a technical solution of an embodiment of the present disclosure and how the technical solution of the embodiment of the present disclosure solves the foregoing technical problems. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present disclosure will be described below with reference to the accompanying drawings.
One embodiment of the present disclosure provides a classification method performed by a terminal, which may be a desktop device or a mobile terminal. As shown in fig. 1, the method includes:
step S110, determining each first image feature corresponding to at least one target image.
Specifically, the target image may be an image to be classified. The image to be classified may be one or more images captured by the terminal device with its own image acquisition device, obtained by the terminal device from the Internet or social media, obtained through information interaction with other terminal devices, or obtained through other channels; the embodiments of the present disclosure do not limit the source.
In general, images shot in different scenes often have different image features, and so do images containing different target objects, so one target image may have several image features (i.e., the first image features described above) at the same time. For example, image 1 has image feature 1 and image feature 2, and image 2 has image feature 3, image feature 4, and image feature 5.
The image features of a target image serve as the basis for classifying it; that is, the target image is classified according to those features. It is therefore necessary to determine the image features (i.e., the first image features) corresponding to each target image to be classified, which provides the precondition for subsequently classifying the target images accurately.
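To make step S110 concrete, the sketch below uses a pretrained torchvision CNN with its classifier head removed as a stand-in for the pre-trained feature extraction network described later in this disclosure. The backbone choice, the 512-dimensional feature size, and all helper names are illustrative assumptions, not part of the patent.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# A pretrained CNN with its classifier removed acts as the feature extractor.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()   # keep the 512-dim penultimate features
backbone.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def first_image_features(paths):
    """Return an (N, 512) tensor of first image features, one row per target image."""
    batch = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in paths])
    with torch.no_grad():
        return backbone(batch)
```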
Step S120, determining a category weight of each target image corresponding to each first predetermined image category according to each first image feature corresponding to each target image, wherein each first predetermined image category is preset by the terminal device according to a category setting instruction of the user.
Specifically, while using the terminal device, the user can preset image categories in the device according to personal preference or need through corresponding category setting instructions; the terminal device receives a category setting instruction and sets the corresponding image categories accordingly. For example, user A presets image categories A, B, C, and D in terminal device 1, so that terminal device 1 contains image categories A, B, C, and D; user B presets image categories A, D, E, F, and G in terminal device 2, so that terminal device 2 contains image categories A, D, E, F, and G.
Specifically, after the first image features of each target image are determined, the category weight of each target image for each first predetermined image category can be determined from those features. The sum of the category weights of a target image over the first predetermined image categories is a predetermined value (e.g., 1, 2, or 3).
In one example, the target image is image 1 above, which has image feature 1 and image feature 2, and terminal device 1 contains image categories A, B, C, and D. From image features 1 and 2, the category weights of image 1 for categories A, B, C, and D (denoted A1, B1, C1, and D1, respectively) can be determined, where the sum of A1, B1, C1, and D1 is a predetermined value (e.g., 1, 2, or 3).
In another example, the target image is image 2 above, which has image features 3, 4, and 5. From these features, the category weights of image 2 for categories A, B, C, and D (denoted A2, B2, C2, and D2) can be determined, where the sum of A2, B2, C2, and D2 is likewise the predetermined value.
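One common way to realize the constraint that each image's category weights sum to a predetermined value (here 1) is a softmax over a linear layer; a minimal sketch under that assumption, reusing the 512-dimensional features from the sketch above. The disclosure itself does not prescribe softmax.

```python
import torch

# The four user-defined categories of terminal device 1 in the example above.
categories = ["A", "B", "C", "D"]

# The classification network: one weight per category from a 512-dim feature.
head = torch.nn.Linear(512, len(categories))

def category_weights(features: torch.Tensor) -> torch.Tensor:
    """features: (N, 512) first image features -> (N, 4) category weights."""
    return torch.softmax(head(features), dim=1)  # each row sums to 1
```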
Step S130, determining, according to the category weights, the first predetermined image category corresponding to each target image, so as to classify the target images.
Specifically, after the category weight of each target image for each first predetermined image category has been determined, the first predetermined image category corresponding to each target image can be determined from those weights, so as to classify the target images.
In one example, after determining the category weights A1, B1, C1, and D1 of image 1 for image categories A, B, C, and D, the image category of image 1 can be determined from them, typically by taking the maximum. For example, if the largest of A1, B1, C1, and D1 is A1, image category A is determined to be the category of image 1; that is, image 1 is classified into image category A.
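Step S130 then reduces to an argmax over the category weights. The sketch below continues the hypothetical categories and category_weights helpers from the previous sketch.

```python
def classify(features: torch.Tensor) -> list[str]:
    """Assign each target image the category with the largest weight."""
    weights = category_weights(features)   # (N, num_categories)
    best = weights.argmax(dim=1)           # index of the maximum weight per image
    return [categories[i] for i in best.tolist()]
```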
According to the classification method provided by the embodiments of the present disclosure, the category weight of each target image for each first predetermined image category is determined from the first image features of that target image, and the first predetermined image category of each target image is then determined from those category weights. Not only can the target images on the terminal device be classified automatically, but the user can also set image categories according to their own preferences or needs, so that the target images are automatically sorted into user-defined categories rather than being limited to the device's built-in categories, which greatly improves the user experience.
The following describes the method of the embodiments of the present disclosure in detail:
in one possible implementation, the classification method of the embodiments of the present disclosure is implemented by a classification model comprising a classification network and a pre-trained feature extraction network. The pre-trained feature extraction network determines the first image features corresponding to each target image, and the classification network determines, from the category weights of each target image for the first predetermined image categories, the first predetermined image category corresponding to each target image.
In practical application, when classifying the target images into the user-defined first predetermined image categories through the classification model, the terminal device first determines the first image features of each target image through the pre-trained feature extraction network; the classification network then determines, from those features, the category weight of each target image for each first predetermined image category, and determines from the category weights the first predetermined image category of each target image, thereby classifying each target image into its corresponding category.
Specifically, the pre-trained feature extraction network may be trained offline in advance on a server or other device, using a certain number of sample images belonging to different image categories.
In one example, to ensure that the feature extraction network has strong feature extraction capability and extracts image features accurately, a large number of image categories (e.g., 2000) may be preset and a certain number of sample images (e.g., tens of thousands) screened for each category. The screened sample images are input into the feature extraction network, which extracts the effective features of each sample image, and the image category of each sample image is determined from those features. If the image category of a sample image is not determined correctly from the extracted features, the features have little or no reference value, and the feature extraction network must be trained further until the image category of every sample image can be correctly identified from the features it extracts.
It should be noted that the feature extraction network may be the intermediate layers of a convolutional neural network, and the image category of each sample image may be the result of the network's output layer. Fig. 2 is a schematic diagram of a classification model according to an embodiment of the present disclosure. In Fig. 2, an image is passed through a CNN (Convolutional Neural Network) to obtain M image features (M is an integer greater than 0); based on these features, the category weights of the image for N first predetermined image categories (N is an integer greater than 0) are obtained; the image is then classified into the first predetermined image category corresponding to the maximum category weight (e.g., image category 1). The dashed box in Fig. 2 is the pre-trained feature extraction network, and the output layer corresponds to the classification network.
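The Fig. 2 split between a frozen, pre-trained feature extractor and a trainable output layer might be composed as follows; a sketch only, with M fixed to a 512-dimensional feature vector and all names assumed.

```python
import torch

class ClassificationModel(torch.nn.Module):
    """Fig. 2 as code: frozen CNN feature extractor + trainable output layer."""

    def __init__(self, backbone: torch.nn.Module, num_categories: int, feat_dim: int = 512):
        super().__init__()
        self.backbone = backbone              # pre-trained offline, kept frozen
        for p in self.backbone.parameters():
            p.requires_grad = False           # only the head is trained on-device
        self.head = torch.nn.Linear(feat_dim, num_categories)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(images)         # the M image features per image
        return torch.softmax(self.head(feats), dim=1)  # the N category weights
```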
Specifically, the classification network is obtained by the terminal device training on a first predetermined number of sample images (denoted first sample images) and the user-defined first predetermined image categories, where a certain number of sample images are screened for each first predetermined image category. For example, when a first predetermined image category is cat, a certain number of pictures containing various cats (e.g., 100) are screened as sample images; when a first predetermined image category is dog, a certain number of pictures containing various dogs (e.g., 110) are screened as sample images. Other first predetermined image categories are handled in the same way and are not repeated here.
The screened cat pictures and dog pictures together form the first predetermined number of first sample images; in other words, the first predetermined number is the sum of the numbers of sample images screened for each first predetermined image category. In the above example, the first predetermined number is 210 (100 + 110).
The training of the classification network is described below through specific examples:
in one example, when the first predetermined image categories are cat and dog, a certain number of sample images of cats (e.g., 100) are first acquired and labeled with the first predetermined image category cat, i.e., the category label of the 100 cat sample images is cat; likewise, a certain number of sample images of dogs (e.g., 110) are acquired and labeled with the first predetermined image category dog. Next, through the pre-trained feature extraction network, the image features (denoted second image features) of each of the 100 cat sample images and of each of the 110 dog sample images are determined. Then the classification network is trained on the image features and category labels of the 100 cat sample images, and on the image features and category labels of the 110 dog sample images, until the classification network satisfies a first predetermined condition.
In another example, when the first predetermined image categories are cup and shoe, a certain number of sample images of cups (e.g., 105) are first acquired and labeled cup, i.e., the category label of the 105 cup sample images is cup, and a certain number of sample images of shoes (e.g., 115) are acquired and labeled shoe. Next, through the pre-trained feature extraction network, the image features (denoted second image features) of each of the 105 cup sample images and of each of the 115 shoe sample images are determined. Then the classification network is trained on the image features and category labels of the 105 cup sample images, and on the image features and category labels of the 115 shoe sample images, until the classification network satisfies the first predetermined condition.
Specifically, when training the classification network on the second image features and the first predetermined image categories of the first sample images, the second image features of each first sample image are input to the classification network, and the network's weights of that sample image for the first predetermined image categories are trained so that, after training, the maximum weight produced by the classification network falls on the first predetermined image category of that sample image.
After the image features of a sample image are input to the classification network, a category weight for each first predetermined image category is obtained, and the network assigns the sample image to the category with the maximum weight. For example, if the maximum weight corresponds to dog, the sample image is classified as dog; if it corresponds to cat, the sample image is classified as cat. If the category assigned by the classification network differs from the sample image's category label, the classification result is wrong and training must continue, for example by adjusting the network's parameters, until the maximum category weight produced by the trained network falls on the category given by the sample's label.
In one example, suppose the first predetermined image categories are cat, dog, cup, and shoe, the sample image is a picture of a cup (denoted image S1) with category label cup, and image S1 is one of the first sample images. The image features of S1 (denoted second image features) are input to the classification network to train it. The network produces category weights of S1 for the four categories cat, dog, cup, and shoe, and classifies S1 into the category with the maximum weight: if the weight for cat is largest, S1 is classified as cat; if the weight for shoe is largest, S1 is classified as shoe.
If the classification network wrongly classifies S1 as cat, its result is unreliable and training is not finished; training must continue until the network correctly classifies S1 as cup. In other words, while training the network with S1, its parameters are adjusted until the maximum weight it produces is the weight of S1 for the category cup, for example a trained weight of 0.9 for cup, 0.05 for cat, 0.04 for dog, and 0.01 for shoe.
In general, for each first sample image, the classification network can be trained by inputting that image's second image features until the maximum category weight produced by the trained network is the weight for that image's first predetermined image category.
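A hedged sketch of this head-training procedure: only the classification network's parameters are optimized, on (second image feature, category label) pairs produced by the frozen feature extractor. The loss, optimizer, and epoch count are illustrative assumptions; the disclosure only requires that, after training, the maximum weight fall on the labeled category.

```python
import torch

def train_head(head: torch.nn.Linear,
               features: torch.Tensor,   # (N, 512) second image features
               labels: torch.Tensor,     # (N,) category indices (class labels)
               epochs: int = 200, lr: float = 0.01) -> torch.nn.Linear:
    """Fit the classification network so the labeled category gets the top weight."""
    opt = torch.optim.SGD(head.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()  # expects raw logits, not softmax
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(head(features), labels)
        loss.backward()
        opt.step()
    return head
```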
Specifically, the classification network satisfying the first predetermined condition may mean that its classification accuracy is greater than or equal to a predetermined threshold, which may be expressed as a percentage (e.g., 95% or 98%) or as a fraction (e.g., 0.95 or 0.98), among other possible forms; the embodiments of the present disclosure are not limited in this respect. For example, when 100 cat sample images are input to the classification model, the classification network may be considered trained if 95 or 98 of them are correctly classified as cat.
Alternatively, the classification network satisfying the first predetermined condition may mean that the number of training iterations is greater than or equal to a predetermined number, such as 1000, 1500, or 3000. If the predetermined number is 1000 and 100 cat sample images are input to the classification model, the network may be considered trained once it has been iteratively trained at least 1000 times.
In addition, the classification network satisfying the first predetermined condition may mean that its loss function converges, where the value of the loss function characterizes the difference between the category output by the classification network for a first sample image and the predetermined image category corresponding to that sample image. For example, when 100 cat sample images are input to the classification model, the loss function may be considered converged once its value over those images has stabilized.
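The three variants of the first predetermined condition could be checked as below; the thresholds are the examples quoted in the text, not prescribed values, and the convergence test is one simple interpretation of a "stable" loss.

```python
def should_stop(accuracy: float, iteration: int, losses: list[float],
                acc_threshold: float = 0.95,  # e.g. 95% from the text
                max_iters: int = 1000,        # e.g. 1000 from the text
                tol: float = 1e-4) -> bool:
    """True when any of the three first-predetermined-condition variants holds."""
    if accuracy >= acc_threshold:
        return True                           # accuracy condition
    if iteration >= max_iters:
        return True                           # iteration-count condition
    if len(losses) >= 2 and abs(losses[-1] - losses[-2]) < tol:
        return True                           # loss-convergence condition
    return False
```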
In one possible implementation, the classification network in the classification model may be retrained as follows: a second predetermined number of second sample images are acquired, and the first predetermined image category corresponding to each second sample image is determined; the third image features of each second sample image are then determined through the pre-trained feature extraction network; the classification network is then trained on the third image features and the first predetermined image categories of the second sample images, until the classification network satisfies the first predetermined condition.
Specifically, after the terminal device has trained the classification network on the first predetermined image categories set by the user's category setting instruction, the trained network can classify newly acquired images. For example, when the terminal device acquires an image containing a cat, the classification network sorts it into the first predetermined image category cat; when it acquires an image containing a cup, the network sorts it into the category cup. Images acquired by the terminal device are thus classified automatically.
While the acquired images are being classified through the trained network, the classification network can be retrained when the user is unsatisfied with its results or when the trained network has been in use for too long.
In one case, the user may choose to add a batch of sample images (denoted a third predetermined number of third sample images) on top of the original first predetermined number of first sample images and retrain the classification network on the combined set, improving the accuracy of its results. The user adds the new batch to the terminal device through a sample add instruction, and the terminal device accordingly acquires the third predetermined number of third sample images corresponding to that instruction.
After acquiring the third sample images, the terminal device retrains the classification network on the third sample images together with the original first sample images. For convenience of description, the third predetermined number of third sample images and the first predetermined number of first sample images are together denoted the second predetermined number of second sample images; that is, the classification network is retrained on the second predetermined number of second sample images.
In another case, the user may choose not to reuse the original first sample images but to select an entirely new batch of sample images to retrain the classification network. The user supplies the new batch (denoted a second predetermined number of second sample images) through a sample update instruction, and the terminal device accordingly acquires the second predetermined number of second sample images corresponding to that instruction, then retrains the classification network on them.
Specifically, the terminal device retrains the classification network on the second predetermined number of second sample images by determining the third image features of each second sample image through the pre-trained feature extraction network, and training the classification network on those features and the first predetermined image categories of the second sample images. This process is the same as training on the first predetermined number of first sample images and is not repeated here.
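The two retraining inputs reduce to a simple set construction, sketched below with assumed names: the sample add instruction appends the new third sample images to the original first sample images, while the sample update instruction replaces them outright.

```python
def build_retraining_set(first_samples: list, new_samples: list,
                         replace: bool) -> list:
    """Second sample set for retraining.

    replace=False: sample-add case    -> original first samples + new third samples
    replace=True:  sample-update case -> the new samples only
    """
    return list(new_samples) if replace else list(first_samples) + list(new_samples)
```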
After obtaining the retrained classification network in either of the two cases above, the terminal device classifies subsequently acquired images with it. That is, each time an image is acquired, the terminal device processes it through the classification model (comprising the pre-trained feature extraction network and the retrained classification network) as follows: determine the first image features of the image, determine from them the category weight of the image for each first predetermined image category, and determine from the category weights the first predetermined image category of the image, so as to classify it.
In one possible implementation, when it is detected that the first predetermined image categories have been updated to second predetermined image categories according to a category update instruction of the user, the classification network is retrained as follows: a fourth predetermined number of fourth sample images are acquired, and the second predetermined image category corresponding to each fourth sample image is determined; the fourth image features of each fourth sample image are then determined through the pre-trained feature extraction network; the classification network is then trained on the fourth image features and the second predetermined image categories of the fourth sample images, until the classification network satisfies the first predetermined condition.
Specifically, while the acquired images are being classified through the network trained on the first predetermined image categories, the user may become dissatisfied with one or more of the originally set categories, or those categories may no longer meet the user's needs. The user can then set a number of new image categories (denoted second predetermined image categories) according to their current needs, updating the first predetermined image categories in the terminal device to the second predetermined image categories through a corresponding category update instruction; the terminal device performs the update accordingly.
After updating the first predetermined image categories to the second predetermined image categories, the terminal device needs to retrain the classification network on the updated categories, so that acquired images can be assigned to the corresponding second predetermined image categories by the retrained network.
Specifically, the category update instruction is any one of a category add instruction, a category replacement instruction, and a category deletion instruction. In one case, when the category update instruction is a category add instruction, the second predetermined image categories are determined according to the user's category add instruction and include all of the first predetermined image categories; that is, one or more image categories are added on top of the original first predetermined image categories, and the original categories together with the added ones form the second predetermined image categories.
In one example, if the original first predetermined image categories are the image category a, the image category B, the image category C and the image category D, and the newly added image categories are the image category E and the image category F, the updated image categories are the image category a, the image category B, the image category C, the image category D, the image category E and the image category F, that is, the image category in the terminal device is updated from the original image category a, the image category B, the image category C and the image category D to the image category a, the image category B, the image category C, the image category D, the image category E and the image category F.
In another case, when the category update instruction is a category replacement instruction, the second predetermined image categories are determined according to the user's category replacement instruction and include none of the first predetermined image categories; that is, a number of new second predetermined image categories, each different from every first predetermined image category, are set and replace the original first predetermined image categories.
In one example, if the original first predetermined image categories are the image category a, the image category B, the image category C and the image category D, and the second predetermined image categories determined by the user's category replacement instruction are the image category E, the image category F and the image category G, the original image category a, the image category B, the image category C and the image category D are replaced by the image category E, the image category F and the image category G, that is, the image category in the terminal device is updated from the original image category a, the image category B, the image category C and the image category D to the image category E, the image category F and the image category G.
In still another case, when the category update instruction is a category deletion instruction, the second predetermined image categories are determined according to the user's category deletion instruction and are contained in the first predetermined image categories; that is, one or more first predetermined image categories are deleted from the original set, and the remaining ones form the second predetermined image categories.
In one example, if the original first predetermined image categories are the image category a, the image category B, the image category C and the image category D, respectively, the user's category deletion instruction determines that the deleted first predetermined image categories are the image category C, respectively, and the remaining first predetermined image categories are the image category a, the image category B and the image category D, respectively, and at this time, the image category a, the image category B and the image category D are taken as the first predetermined image categories of the terminal device, that is, the image category in the terminal device is updated from the original image category a, the image category B, the image category C and the image category D to the image category a, the image category B and the image category D.
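The three category update instructions amount to set operations that mirror the inclusion relations stated in the text; a sketch with assumed names. For instance, update_categories(["A", "B", "C", "D"], "delete", ["C"]) reproduces the deletion example above, returning ["A", "B", "D"].

```python
def update_categories(current: list[str], instruction: str,
                      payload: list[str]) -> list[str]:
    """Apply a category update instruction to the predetermined image categories."""
    if instruction == "add":       # new set contains every original category
        return current + [c for c in payload if c not in current]
    if instruction == "replace":   # new set shares no category with the original
        return list(payload)
    if instruction == "delete":    # original set contains the new set
        return [c for c in current if c not in payload]
    raise ValueError(f"unknown category update instruction: {instruction}")
```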
Specifically, since the predetermined image categories in the terminal device have been updated, the classification network in the classification model must be retrained. A batch of sample images matching the updated categories (the second predetermined image categories) must likewise be reselected; that is, for each second predetermined image category, a certain number of matching sample images are selected for the subsequent retraining. For convenience of description, the sum of the sample images selected for all second predetermined image categories is denoted the fourth predetermined number of fourth sample images.
After the terminal device acquires the fourth sample images with the fourth preset number, retraining the classification network based on the fourth sample images with the fourth preset number to obtain a retrained classification network.
Specifically, the process of retraining the classification network by the terminal device based on the fourth predetermined number of fourth sample images may be: and determining fourth image features corresponding to the fourth sample images respectively through a pre-trained feature extraction network, and training the classification network based on the fourth image features corresponding to the fourth sample images respectively and second preset image categories corresponding to the fourth sample images respectively. The process is the same as the process of training the classification network based on the first predetermined number of first sample images, and will not be described in detail herein.
After obtaining the retrained classification network in any of the three cases above, the terminal device classifies subsequently acquired images with it. That is, each time an image is acquired, the terminal device processes it through the classification model (comprising the pre-trained feature extraction network and the retrained classification network) as follows: determine the first image features of the image, determine from them the category weight of the image for each updated predetermined image category, and determine from the category weights the image category of the image, so as to classify it.
Fig. 3 is a schematic structural diagram of a sorting apparatus according to another embodiment of the present disclosure, as shown in fig. 3, the apparatus 300 may include a first determining module 301, a second determining module 302, and a third determining module 303, where:
a first determining module 301, configured to determine each first image feature corresponding to at least one target image;
a second determining module 302, configured to determine, according to the first image features corresponding to each target image, a category weight of that target image for each first predetermined image category, where the first predetermined image categories are preset by the terminal device according to a category setting instruction of the user;
and the third determining module 303 is configured to determine a first predetermined image category corresponding to each target image according to each class weight, so as to classify each target image.
In one possible implementation, the classification apparatus is implemented through a classification model comprising a classification network and a pre-trained feature extraction network; the pre-trained feature extraction network determines the first image features corresponding to each target image, and the classification network determines, from the category weights of each target image for the first predetermined image categories, the first predetermined image category corresponding to each target image;
the classification network is obtained through a first training module, where the first training module is configured to:
acquiring a first preset number of first sample images, and determining first preset image categories corresponding to the first sample images respectively;
determining each second image feature corresponding to each first sample image through a pre-trained feature extraction network;
training the classification network based on each second image feature respectively corresponding to each first sample image and each first preset image category respectively corresponding to each first sample image until the classification network meets a first preset condition.
In one possible implementation manner, the first training module is specifically configured to, when training the classification network based on each second image feature respectively corresponding to each first sample image and each first predetermined image class respectively corresponding to each first sample image:
for each first sample image, input the second image features of that sample image to the classification network, and train the network's weights of that sample image for the first predetermined image categories so that the maximum weight produced by the trained classification network is the weight for that sample image's first predetermined image category.
In a possible implementation manner, the device further comprises a second training module, the second training module is used for retraining the classification network, and the second training module is specifically used for:
acquiring a second preset number of second sample images, and determining first preset image categories corresponding to the second sample images respectively;
determining each third image feature corresponding to each second sample image through a pre-trained feature extraction network;
training the classification network based on each third image feature corresponding to each second sample image and each first preset image category corresponding to each second sample image until the classification network meets the first preset condition.
In one possible implementation, the second training module, when acquiring a second predetermined number of second sample images, is configured to perform any one of:
acquiring a third predetermined number of third sample images corresponding to a sample addition instruction of the user, and taking the third predetermined number of third sample images together with the first predetermined number of first sample images as the second predetermined number of second sample images;
acquiring a second predetermined number of second sample images corresponding to a sample update instruction of the user, where the second sample images do not include the first sample images.
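Purely as an illustration of these two branches (the instruction names and list representation are assumptions, not taken from the patent), the second sample set might be assembled as follows:

    def build_second_sample_set(instruction, first_samples, new_samples):
        if instruction == "sample_add":
            # sample addition: the new samples are appended to the first sample images
            return first_samples + new_samples
        if instruction == "sample_update":
            # sample update: the new set replaces the first sample images entirely
            return list(new_samples)
        raise ValueError(f"unknown instruction: {instruction}")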
In a possible implementation manner, the device further includes a third training module, where the third training module is configured to retrain the classification network when detecting that each first predetermined image category is updated to each second predetermined image category according to a category update instruction of the user, and the third training module is specifically configured to:
acquiring a fourth preset number of fourth sample images, and determining second preset image categories corresponding to the fourth sample images respectively;
determining fourth image features corresponding to the fourth sample images respectively through a pre-trained feature extraction network;
training the classification network based on each fourth image feature respectively corresponding to each fourth sample image and each second preset image category respectively corresponding to each fourth sample image until the classification network meets the first preset condition.
In one possible implementation, the category update instruction includes any one of a category addition instruction, a category replacement instruction, and a category deletion instruction; the third training module is used for executing any one of the following when updating each first preset image category to each second preset image category according to the category update instruction of the user:
determining each second preset image category according to the category addition instruction of the user, wherein the second preset image categories include the first preset image categories;
determining each second preset image category according to the category replacement instruction of the user, wherein the second preset image categories do not include the first preset image categories;
and determining each second preset image category according to the category deletion instruction of the user, wherein the first preset image categories include the second preset image categories.
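The three branches above amount to simple set relations between the first and second category sets; a hypothetical sketch (instruction names and the set representation are illustrative only):

    def update_categories(instruction, first_categories, payload):
        first_categories, payload = set(first_categories), set(payload)
        if instruction == "category_add":
            return first_categories | payload   # second categories include all first categories
        if instruction == "category_replace":
            return payload                      # second categories do not include the first categories (payload assumed disjoint)
        if instruction == "category_delete":
            return first_categories - payload   # second categories are a subset of the first categories
        raise ValueError(f"unknown instruction: {instruction}")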
In one possible implementation, the classification network satisfies a first predetermined condition, including any of:
the classification accuracy of the classification network is greater than or equal to a predetermined threshold;
the number of training iterations of the classification network is greater than or equal to a predetermined number of times;
the loss function of the classification network converges, the value of the loss function characterizing a difference between a class of the first sample image output by the classification network and a second predetermined image class corresponding to the first sample image.
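Expressed as code, the first predetermined condition is a disjunction of the three criteria; the threshold values below are placeholders that the patent leaves to the implementation:

    def first_predetermined_condition(accuracy, iterations, loss_history,
                                      acc_threshold=0.95, max_iterations=1000, eps=1e-4):
        converged = (len(loss_history) >= 2 and
                     abs(loss_history[-1] - loss_history[-2]) < eps)
        return (accuracy >= acc_threshold        # accuracy criterion
                or iterations >= max_iterations  # iteration-count criterion
                or converged)                    # loss-convergence criterion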
According to the device provided by the embodiments of the present disclosure, the category weight of each target image corresponding to each first predetermined image category is determined according to the first image features of the target images, and the first predetermined image category of each target image is determined according to these category weights. As a result, not only can the target images in the terminal device be classified automatically, but the user can also set image categories according to the user's own preferences or needs and have the target images classified automatically according to these user-defined categories, without being limited to the image categories built into the terminal device. This greatly improves the flexibility of user participation and image category setting, and thereby the user experience.
It should be noted that this embodiment is an apparatus embodiment corresponding to the above-mentioned method embodiment, and the two may be implemented in cooperation with each other. The related technical details mentioned in the method embodiment remain valid in this embodiment and, to reduce repetition, are not repeated here. Correspondingly, the related technical details mentioned in this embodiment may also be applied in the above-mentioned method embodiment.
Referring now to fig. 4, a schematic diagram of an electronic device 400 suitable for use in implementing embodiments of the present disclosure is shown. The terminal devices in the embodiments of the present disclosure may include, but are not limited to, terminal devices such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and fixed terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 4 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.
The electronic device comprises a memory and a processor. The processor may be referred to as the processing device 401 described below, and the memory may include at least one of a read-only memory (ROM) 402, a random access memory (RAM) 403, and a storage device 408 described below:
As shown in fig. 4, the electronic device 400 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 401, which may perform various suitable actions and processes according to a program stored in a Read Only Memory (ROM) 402 or a program loaded from a storage means 408 into a Random Access Memory (RAM) 403. In the RAM 403, various programs and data necessary for the operation of the electronic device 400 are also stored. The processing device 401, the ROM 402, and the RAM 403 are connected to each other by a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.
In general, the following devices may be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 407 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 408 including, for example, magnetic tape, hard disk, etc.; and a communication device 409. The communication means 409 may allow the electronic device 400 to communicate with other devices wirelessly or by wire to exchange data. While fig. 4 shows an electronic device 400 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via communications device 409, or from storage 408, or from ROM 402. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing device 401.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: determining first image features corresponding to at least one target image respectively; then, according to each first image feature corresponding to each target image, determining the category weight of each target image corresponding to each first preset image category preset by the terminal equipment according to the category setting instruction of the user; and then, according to the weights of the categories, determining a first preset image category corresponding to each target image respectively so as to classify each target image.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" programming language or similar languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules or units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The name of the module or the unit is not limited to the unit itself in some cases, for example, the acquiring module may also be described as "a module for acquiring at least one event processing manner corresponding to a predetermined live event when the occurrence of the predetermined live event is detected".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, there is provided a classification method comprising:
determining first image features corresponding to at least one target image respectively;
according to the first image features corresponding to each target image, determining the category weight of each target image corresponding to each first preset image category, wherein the first preset image categories are preset by the terminal device according to the category setting instruction of the user;
and determining a first preset image category corresponding to each target image respectively according to each category weight, so as to classify each target image.
In one possible implementation, the classification method is implemented by a classification model, where the classification model includes a classification network and a pre-trained feature extraction network, the pre-trained feature extraction network is used to determine first image features corresponding to at least one target image respectively, and the classification network is used to determine first predetermined image categories corresponding to the target images respectively according to the first category weights of the target images corresponding to the respective first predetermined image categories;
the classification network is trained by:
acquiring a first preset number of first sample images, and determining first preset image categories corresponding to the first sample images respectively;
determining each second image feature corresponding to each first sample image through a pre-trained feature extraction network;
training the classification network based on each second image feature respectively corresponding to each first sample image and each first preset image category respectively corresponding to each first sample image until the classification network meets a first preset condition.
In one possible implementation, training the classification network based on each second image feature respectively corresponding to each first sample image and each first predetermined image class respectively corresponding to each first sample image includes:
for each first sample image, inputting the second image features of the first sample image to the classification network, and training, through the classification network, the category weights of the first sample image corresponding to the respective first predetermined image categories, such that the maximum of the category weights obtained through the trained classification network is the category weight corresponding to the first predetermined image category of that first sample image.
In one possible implementation, the method further includes: retraining the classification network, the classification network retrained by:
acquiring a second preset number of second sample images, and determining first preset image categories corresponding to the second sample images respectively;
determining each third image feature corresponding to each second sample image through a pre-trained feature extraction network;
training the classification network based on each third image feature corresponding to each second sample image and each first preset image category corresponding to each second sample image until the classification network meets the first preset condition.
In one possible implementation, acquiring a second predetermined number of second sample images includes any one of:
acquiring a third predetermined number of third sample images corresponding to a sample addition instruction of the user, and taking the third predetermined number of third sample images together with the first predetermined number of first sample images as the second predetermined number of second sample images;
acquiring a second predetermined number of second sample images corresponding to a sample update instruction of the user, where the second sample images do not include the first sample images.
In one possible implementation, the method further includes:
when detecting that each first preset image category is updated to each second preset image category according to the category updating instruction of the user, retraining the classification network, wherein the classification network is obtained through retraining in the following way:
acquiring a fourth preset number of fourth sample images, and determining second preset image categories corresponding to the fourth sample images respectively;
determining fourth image features corresponding to the fourth sample images respectively through a pre-trained feature extraction network;
training the classification network based on each fourth image feature respectively corresponding to each fourth sample image and each second preset image category respectively corresponding to each fourth sample image until the classification network meets the first preset condition.
In one possible implementation, the category update instruction includes any one of a category addition instruction, a category replacement instruction, and a category deletion instruction; updating each first preset image category to each second preset image category according to the category update instruction of the user includes any one of the following:
determining each second preset image category according to the category addition instruction of the user, wherein the second preset image categories include the first preset image categories;
determining each second preset image category according to the category replacement instruction of the user, wherein the second preset image categories do not include the first preset image categories;
and determining each second preset image category according to the category deletion instruction of the user, wherein the first preset image categories include the second preset image categories.
In one possible implementation, the classification network satisfies a first predetermined condition, including any of:
the classification accuracy of the classification network is greater than or equal to a predetermined threshold;
the number of training iterations of the classification network is greater than or equal to a predetermined number of times;
the loss function of the classification network converges, the value of the loss function characterizing a difference between a class of the first sample image output by the classification network and a second predetermined image class corresponding to the first sample image.
According to one or more embodiments of the present disclosure, there is provided a classification apparatus comprising:
the first determining module is used for determining first image features corresponding to at least one target image respectively;
the second determining module is used for determining, according to the first image features of the target images, the category weights of the target images corresponding to the first preset image categories, where the first preset image categories are preset by the terminal device according to the category setting instruction of the user;
and the third determining module is used for determining a first preset image category corresponding to each target image respectively according to each category weight, so as to classify each target image.
In one possible implementation manner, the classification device is implemented through a classification model, the classification model includes a classification network and a pre-trained feature extraction network, the pre-trained feature extraction network is used for determining first image features corresponding to at least one target image respectively, and the classification network is used for determining first predetermined image categories corresponding to the target images respectively according to the first category weights of the target images corresponding to the respective first predetermined image categories;
The classification network is obtained through a first training module, and the first training module is used for:
acquiring a first preset number of first sample images, and determining first preset image categories corresponding to the first sample images respectively;
determining each second image feature corresponding to each first sample image through a pre-trained feature extraction network;
training the classification network based on each second image feature respectively corresponding to each first sample image and each first preset image category respectively corresponding to each first sample image until the classification network meets a first preset condition.
In one possible implementation manner, the first training module is specifically configured to, when training the classification network based on each second image feature respectively corresponding to each first sample image and each first predetermined image class respectively corresponding to each first sample image:
for each first sample image, inputting the second image features of the first sample image to the classification network, and training, through the classification network, the category weights of the first sample image corresponding to the respective first predetermined image categories, such that the maximum of the category weights obtained through the trained classification network is the category weight corresponding to the first predetermined image category of that first sample image.
In a possible implementation manner, the device further comprises a second training module, which is used for retraining the classification network and is specifically used for:
acquiring a second preset number of second sample images, and determining first preset image categories corresponding to the second sample images respectively;
determining each third image feature corresponding to each second sample image through a pre-trained feature extraction network;
training the classification network based on each third image feature corresponding to each second sample image and each first preset image category corresponding to each second sample image until the classification network meets the first preset condition.
In one possible implementation, the second training module, when acquiring a second predetermined number of second sample images, is configured to perform any one of:
acquiring a third predetermined number of third sample images corresponding to a sample addition instruction of the user, and taking the third predetermined number of third sample images together with the first predetermined number of first sample images as the second predetermined number of second sample images;
acquiring a second predetermined number of second sample images corresponding to a sample update instruction of the user, where the second sample images do not include the first sample images.
In a possible implementation manner, the device further includes a third training module, where the third training module is configured to retrain the classification network when detecting that each first predetermined image category is updated to each second predetermined image category according to a category update instruction of the user, and the third training module is specifically configured to:
acquiring a fourth preset number of fourth sample images, and determining second preset image categories corresponding to the fourth sample images respectively;
determining fourth image features corresponding to the fourth sample images respectively through a pre-trained feature extraction network;
training the classification network based on each fourth image feature respectively corresponding to each fourth sample image and each second preset image category respectively corresponding to each fourth sample image until the classification network meets the first preset condition.
In one possible implementation, the category update instruction includes any one of a category addition instruction, a category replacement instruction, and a category deletion instruction; the third training module is used for executing any one of the following when updating each first preset image category to each second preset image category according to the category update instruction of the user:
determining each second preset image category according to the category addition instruction of the user, wherein the second preset image categories include the first preset image categories;
determining each second preset image category according to the category replacement instruction of the user, wherein the second preset image categories do not include the first preset image categories;
and determining each second preset image category according to the category deletion instruction of the user, wherein the first preset image categories include the second preset image categories.
In one possible implementation, the classification network satisfies a first predetermined condition, including any of:
the classification accuracy of the classification network is greater than or equal to a predetermined threshold;
the number of training iterations of the classification network is greater than or equal to a predetermined number of times;
the loss function of the classification network converges, the value of the loss function characterizing a difference between a class of the first sample image output by the classification network and a second predetermined image class corresponding to the first sample image.
The foregoing description is only of the preferred embodiments of the present disclosure and an explanation of the technical principles employed. It will be appreciated by persons skilled in the art that the scope of the disclosure is not limited to the specific combinations of features described above, but also covers other embodiments formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure, for example, embodiments formed by substituting the above features with technical features having similar functions disclosed in (but not limited to) the present disclosure.
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims (10)

1. A classification method, characterized in that it is applied to a terminal device, the terminal device has fixed image categories and can automatically classify each target image, and a user can set preset image categories according to the user's own preferences or needs; the method comprises the following steps:
determining first image features corresponding to at least one target image respectively;
according to the first image features corresponding to each target image, determining the category weights of each target image corresponding to the first preset image categories, wherein the first preset image categories are preset by the terminal device according to the category setting instruction of the user;
according to each category weight, determining a first preset image category corresponding to each target image respectively, so as to classify each target image;
the classification method is realized through a classification model, the classification model comprises a classification network and a pre-trained feature extraction network, the pre-trained feature extraction network is used for determining first image features corresponding to at least one target image respectively, and the classification network is used for determining first preset image categories corresponding to the target images respectively according to the first category weights of the target images corresponding to the respective first preset image categories;
the classification network is trained by:
acquiring a first preset number of first sample images, and determining first preset image categories corresponding to the first sample images respectively;
determining each second image feature corresponding to each first sample image through the pre-trained feature extraction network;
and training the classification network based on each second image feature respectively corresponding to each first sample image and each first preset image category respectively corresponding to each first sample image until the classification network meets a first preset condition.
2. The method of claim 1, wherein training the classification network based on each second image feature for each first sample image and each first predetermined image class for each first sample image comprises:
for each first sample image, inputting the second image features of the first sample image into the classification network, and training, through the classification network, the category weights of the first sample image corresponding to the respective first predetermined image categories, such that the maximum of the category weights obtained through the trained classification network is the category weight corresponding to the first predetermined image category of that first sample image.
3. The method according to claim 1 or 2, characterized in that the method further comprises: retraining the classification network, the classification network being retrained by:
acquiring a second preset number of second sample images, and determining first preset image categories corresponding to the second sample images respectively;
determining each third image feature corresponding to each second sample image through the pre-trained feature extraction network;
and training the classification network based on each third image feature corresponding to each second sample image and each first preset image category corresponding to each second sample image until the classification network meets a first preset condition.
4. A method according to claim 3, wherein the acquiring a second predetermined number of second sample images comprises any one of:
acquiring a third predetermined number of third sample images corresponding to a sample addition instruction of a user, and taking the third predetermined number of third sample images together with the first predetermined number of first sample images as the second predetermined number of second sample images;
acquiring a second predetermined number of second sample images corresponding to a sample update instruction of a user, wherein the second sample images do not include the first sample images.
5. The method according to claim 1 or 2, characterized in that the method further comprises:
When detecting that each first preset image category is updated to each second preset image category according to the category updating instruction of the user, retraining the classification network, wherein the classification network is retrained by the following steps:
acquiring a fourth preset number of fourth sample images, and determining second preset image categories corresponding to the fourth sample images respectively;
determining fourth image features corresponding to the fourth sample images respectively through the pre-trained feature extraction network;
and training the classification network based on each fourth image feature respectively corresponding to each fourth sample image and each second preset image category respectively corresponding to each fourth sample image until the classification network meets the first preset condition.
6. The method of claim 5, wherein the category update instruction comprises any one of a category addition instruction, a category replacement instruction, and a category deletion instruction; updating each first preset image category to each second preset image category according to the category update instruction of the user comprises any one of the following:
determining each second preset image category according to a category addition instruction of a user, wherein the second preset image categories include the first preset image categories;
determining each second preset image category according to a category replacement instruction of a user, wherein the second preset image categories do not include the first preset image categories;
and determining each second preset image category according to the category deletion instruction of the user, wherein the first preset image categories include the second preset image categories.
7. The method according to any of claims 1-2, wherein the classification network satisfies a first predetermined condition comprising any of:
the classification accuracy of the classification network is greater than or equal to a predetermined threshold;
the number of training iterations of the classification network is greater than or equal to a predetermined number of times;
the loss function of the classification network converges, and the value of the loss function characterizes the difference between the class of the first sample image output by the classification network and the class of the second predetermined image corresponding to the first sample image.
8. A classification device, characterized in that it is applied to a terminal device, the terminal device has fixed image categories and can automatically classify each target image, and a user can set preset image categories according to the user's own preferences or needs; the device comprises:
the first determining module is used for determining first image features corresponding to at least one target image respectively;
The second determining module is used for determining category weights of the target images corresponding to first preset image categories according to the first image features of the target images, wherein the first preset image categories are preset by the terminal equipment according to the category setting instructions of the user;
the third determining module is used for determining a first preset image category corresponding to each target image respectively according to each class weight so as to classify each target image;
the classification device is realized through a classification model, the classification model comprises a classification network and a pre-trained feature extraction network, the pre-trained feature extraction network is used for determining first image features corresponding to at least one target image respectively, and the classification network is used for determining first preset image categories corresponding to the target images respectively according to the first category weights of the target images corresponding to the respective first preset image categories;
the classification network is obtained through a first training module, and the first training module is used for:
acquiring a first preset number of first sample images, and determining first preset image categories corresponding to the first sample images respectively;
determining each second image feature corresponding to each first sample image through a pre-trained feature extraction network;
training the classification network based on each second image feature respectively corresponding to each first sample image and each first preset image category respectively corresponding to each first sample image until the classification network meets a first preset condition.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of claims 1-7 when executing the program.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the method of any of claims 1-7.
CN202010218996.9A 2020-03-25 2020-03-25 Classification method, classification device, electronic equipment and computer-readable storage medium Active CN111401464B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010218996.9A CN111401464B (en) 2020-03-25 2020-03-25 Classification method, classification device, electronic equipment and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010218996.9A CN111401464B (en) 2020-03-25 2020-03-25 Classification method, classification device, electronic equipment and computer-readable storage medium

Publications (2)

Publication Number Publication Date
CN111401464A CN111401464A (en) 2020-07-10
CN111401464B true CN111401464B (en) 2023-07-21

Family

ID=71429164

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010218996.9A Active CN111401464B (en) 2020-03-25 2020-03-25 Classification method, classification device, electronic equipment and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN111401464B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9792530B1 (en) * 2015-12-28 2017-10-17 Amazon Technologies, Inc. Generating and using a knowledge base for image classification
CN108156519A (en) * 2017-12-25 2018-06-12 深圳Tcl新技术有限公司 Image classification method, television equipment and computer readable storage medium
CN110851088A (en) * 2019-10-22 2020-02-28 厦门盈趣科技股份有限公司 Expression management system and method for social printer

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103207879B (en) * 2012-01-17 2016-03-30 阿里巴巴集团控股有限公司 The generation method and apparatus of image index
KR102174470B1 (en) * 2014-03-31 2020-11-04 삼성전자주식회사 System and method for controlling picture based on category recognition
CN104331442A (en) * 2014-10-24 2015-02-04 华为技术有限公司 Video classification method and device
CN106980900A (en) * 2016-01-18 2017-07-25 阿里巴巴集团控股有限公司 A kind of characteristic processing method and equipment
US10318846B2 (en) * 2016-12-28 2019-06-11 Ancestry.Com Operations Inc. Clustering historical images using a convolutional neural net and labeled data bootstrapping
CN107220667B (en) * 2017-05-24 2020-10-30 北京小米移动软件有限公司 Image classification method and device and computer readable storage medium
CN107391703B (en) * 2017-07-28 2019-11-15 北京理工大学 The method for building up and system of image library, image library and image classification method
CN107679525B (en) * 2017-11-01 2022-11-29 腾讯科技(深圳)有限公司 Image classification method and device and computer readable storage medium
CN108304882B (en) * 2018-02-07 2022-03-04 腾讯科技(深圳)有限公司 Image classification method and device, server, user terminal and storage medium
CN108764370B (en) * 2018-06-08 2021-03-12 Oppo广东移动通信有限公司 Image processing method, image processing device, computer-readable storage medium and computer equipment
CN108985208A (en) * 2018-07-06 2018-12-11 北京字节跳动网络技术有限公司 The method and apparatus for generating image detection model
CN110472675B (en) * 2019-07-31 2023-04-18 Oppo广东移动通信有限公司 Image classification method, image classification device, storage medium and electronic equipment

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9792530B1 (en) * 2015-12-28 2017-10-17 Amazon Technologies, Inc. Generating and using a knowledge base for image classification
CN108156519A (en) * 2017-12-25 2018-06-12 深圳Tcl新技术有限公司 Image classification method, television equipment and computer readable storage medium
CN110851088A (en) * 2019-10-22 2020-02-28 厦门盈趣科技股份有限公司 Expression management system and method for social printer

Also Published As

Publication number Publication date
CN111401464A (en) 2020-07-10

Similar Documents

Publication Publication Date Title
CN110288049B (en) Method and apparatus for generating image recognition model
CN109740018B (en) Method and device for generating video label model
CN109947989B (en) Method and apparatus for processing video
CN111340131A (en) Image annotation method and device, readable medium and electronic equipment
CN110728294A (en) Cross-domain image classification model construction method and device based on transfer learning
CN110084317B (en) Method and device for recognizing images
CN112183627B (en) Method for generating prediction density map network and vehicle annual inspection number detection method
CN114494709A (en) Feature extraction model generation method, image feature extraction method and device
CN112365513A (en) Model training method and device
CN116894188A (en) Service tag set updating method and device, medium and electronic equipment
CN113140012B (en) Image processing method, device, medium and electronic equipment
CN110287817B (en) Target recognition and target recognition model training method and device and electronic equipment
CN111738316A (en) Image classification method and device for zero sample learning and electronic equipment
CN110097004B (en) Facial expression recognition method and device
CN113033682B (en) Video classification method, device, readable medium and electronic equipment
CN113033707B (en) Video classification method and device, readable medium and electronic equipment
CN111414921B (en) Sample image processing method, device, electronic equipment and computer storage medium
CN111414966B (en) Classification method, classification device, electronic equipment and computer storage medium
CN110381391B (en) Video fast slicing method and device and electronic equipment
CN112580750A (en) Image recognition method and device, electronic equipment and storage medium
CN117036827A (en) Multi-mode classification model training, video classification method, device, medium and equipment
CN111401464B (en) Classification method, classification device, electronic equipment and computer-readable storage medium
CN111275089A (en) Classification model training method and device and storage medium
CN116363431A (en) Article sorting method, apparatus, electronic device, and computer-readable medium
CN111368209B (en) Information recommendation method and device, electronic equipment and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant after: Tiktok vision (Beijing) Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant before: BEIJING BYTEDANCE NETWORK TECHNOLOGY Co.,Ltd.

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant after: Douyin Vision Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant before: Tiktok vision (Beijing) Co.,Ltd.

GR01 Patent grant