CN111259967B - Image classification and neural network training method, device, equipment and storage medium


Info

Publication number
CN111259967B
Authority
CN
China
Prior art keywords
class
feature
target
image
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010054273.XA
Other languages
Chinese (zh)
Other versions
CN111259967A (en)
Inventor
Zhang Xiao (张潇)
Zhao Rui (赵瑞)
Qiao Yu (乔宇)
Li Hongsheng (李鸿升)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sensetime Technology Development Co Ltd
Original Assignee
Beijing Sensetime Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sensetime Technology Development Co Ltd filed Critical Beijing Sensetime Technology Development Co Ltd
Priority to CN202010054273.XA
Publication of CN111259967A
Application granted
Publication of CN111259967B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent


Abstract

The present disclosure relates to an image classification and neural network training method, apparatus, device, and storage medium. The method includes: performing feature extraction processing on a target image through a target neural network to obtain a target feature of the target image; determining, according to a first parameter, radial basis distances between the target feature and the class center feature of each class of images; and determining the target class of the target image according to the radial basis distances. According to the image classification method of the embodiments of the present disclosure, the class of a target image can be determined from the radial basis distances between the target feature and the class center feature of each class. This improves the distribution of the target feature and the class center features in the feature space, reduces the randomness of that distribution, strengthens the aggregation of same-class features, and improves the classification accuracy for the target image.

Description

Image classification and neural network training method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a storage medium for image classification and neural network training.
Background
Image classification techniques are an important foundation of computer vision: techniques such as object detection, image semantic segmentation, and instance segmentation all build on image classification. An image classification technique takes an image acquired by an image acquisition device, feeds it into a neural network, and outputs a set of probabilities describing whether the image belongs to particular categories or contains particular content. As a computer vision technology, image classification is widely applicable to fields such as industrial production, anomaly detection, and autonomous driving.
A core task of image classification is to determine the class membership of different subjects with high mutual similarity (e.g., distinguishing persons with similar facial appearance, or classifying bird species). During neural network training, the extracted features are optimized in Euclidean space; however, because loss values cannot be reasonably assigned to training samples in Euclidean space, the resulting feature distribution is unsatisfactory, so the training effect of the neural network suffers and classification accuracy is low.
Disclosure of Invention
The present disclosure provides an image classification and neural network training method, apparatus, device and storage medium.
According to an aspect of the present disclosure, there is provided an image classification method, including: performing feature extraction processing on a target image through a target neural network to obtain a target feature of the target image; determining, according to a preset first parameter, radial basis distances between the target feature and the class center features of each class of images in an image set, where the radial basis distances are used to represent the classification probabilities that the target feature belongs to each class, and the image set includes images of at least one class; and determining the target class of the target image according to the radial basis distances.

According to the image classification method of the embodiments of the present disclosure, the class of a target image can be determined from the radial basis distances between the target feature and the class center feature of each class. This improves the distribution of the target feature and the class center features in the feature space, reduces the randomness of that distribution, strengthens the aggregation of same-class features, and improves the classification accuracy for the target image.
In one possible implementation, determining the radial basis distances between the target feature and the class center feature of each class of images according to the preset first parameter includes: determining the Euclidean distance between the target feature and each class center feature; and determining the radial basis distance between the target feature and each class center feature according to the Euclidean distance and the first parameter.

In one possible implementation, the method further includes: performing feature extraction processing on each image in the image set through the target neural network to obtain feature information of each image; and determining the class center feature of each class from the feature information of images of the same class.

In one possible implementation, determining the class center feature of each class from the feature information of images of the same class includes: determining a center image of each class of images; and determining the feature information corresponding to the center image of each class as the class center feature of that class.

In one possible implementation, determining the class center feature of each class from the feature information of images of the same class includes: clustering the feature information of the images to obtain the class center feature of each class.

In one possible implementation, determining the class center feature of each class from the feature information of images of the same class includes: performing weighted averaging on the feature information of each class of images to obtain the class center feature of each class.
According to an aspect of the present disclosure, there is provided a neural network training method, including: performing feature extraction processing on a first sample image through a neural network to obtain a first feature of the first sample image; determining, according to a preset first parameter, the radial basis distances between the first feature and the class center features corresponding to each class of sample images in a training sample set; determining a network loss of the neural network according to a preset second parameter and the radial basis distances; and training the neural network according to the network loss to obtain the target neural network after training is completed.

According to the neural network training method of the embodiments of the present disclosure, the network loss can be determined from the radial basis distances and the neural network optimized accordingly, which improves the distribution of the features extracted by the optimized neural network, the training effect of the neural network, and the classification accuracy of the neural network.

In one possible implementation, determining the network loss of the neural network according to the preset second parameter and the radial basis distances includes: determining, according to the labeled class of the first sample image, a first radial basis distance between the first feature and the class center feature of the class to which the first sample image belongs; determining the classification probability of the first sample image according to the first radial basis distance, the second parameter, and the radial basis distances between the first feature and the class center feature of each class; and determining the network loss of the neural network according to the classification probability.
In this way, the first parameter and the radial basis distance add a distance constraint to the first features and second features extracted by the neural network, so that the distance measure between a first feature and the class center feature of its own class remains bounded. This reduces randomness and outliers and stabilizes the distances between features, lowering the training difficulty and improving the training effect. Moreover, because of the distance constraint, the relative difference between the inter-class radial basis distances and the intra-class radial basis distances shrinks in the middle and later stages of training, so the features of same-class images can be optimized further without additional training or optimization. Determining the classification probability from the second parameter and the radial basis distances, and then the network loss from the classification probability, keeps the network loss on same-class features from becoming too small, so those features can continue to be optimized and the training effect is improved.
In one possible implementation, the method further includes: determining the second parameter according to the number of classes of the plurality of sample images in the sample image set.
According to an aspect of the present disclosure, there is provided an image classification apparatus, including: a first extraction module, configured to perform feature extraction processing on a target image through a target neural network to obtain a target feature of the target image; a first determining module, configured to determine, according to a preset first parameter, radial basis distances between the target feature and the class center features of each class of images in an image set, where the radial basis distances are used to represent the classification probabilities that the target feature belongs to each class, and the image set includes images of at least one class; and a second determining module, configured to determine the target class of the target image according to the radial basis distances.

In one possible implementation, the first determining module is further configured to: determine the Euclidean distance between the target feature and each class center feature; and determine the radial basis distance between the target feature and each class center feature according to the Euclidean distance and the first parameter.

In one possible implementation, the apparatus further includes: a second extraction module, configured to perform feature extraction processing on each image in the image set through the target neural network to obtain feature information of each image; and a third determining module, configured to determine the class center feature of each class from the feature information of images of the same class.

In one possible implementation, the third determining module is further configured to: determine a center image of each class of images; and determine the feature information corresponding to the center image of each class as the class center feature of that class.

In one possible implementation, the third determining module is further configured to: cluster the feature information of the images to obtain the class center feature of each class.

In one possible implementation, the third determining module is further configured to: perform weighted averaging on the feature information of each class of images to obtain the class center feature of each class.
According to an aspect of the present disclosure, there is provided a neural network training apparatus, including: a third extraction module, configured to perform feature extraction processing on a first sample image through a neural network to obtain a first feature of the first sample image; a fourth determining module, configured to determine, according to a preset first parameter, the radial basis distances between the first feature and the class center features corresponding to each class of sample images in a training sample set; a fifth determining module, configured to determine a network loss of the neural network according to a preset second parameter and the radial basis distances; and a training module, configured to train the neural network according to the network loss to obtain the target neural network after training is completed.

In one possible implementation, the fifth determining module is further configured to: determine, according to the labeled class of the first sample image, a first radial basis distance between the first feature and the class center feature of the class to which the first sample image belongs; determine the classification probability of the first sample image according to the first radial basis distance, the second parameter, and the radial basis distances between the first feature and the class center feature of each class; and determine the network loss of the neural network according to the classification probability.

In one possible implementation, the apparatus further includes: a sixth determining module, configured to determine the second parameter according to the number of classes of the plurality of sample images in the sample image set.

According to an aspect of the present disclosure, there is provided an electronic device, including: a processor; and a memory for storing processor-executable instructions, wherein the processor is configured to perform the above method.
According to an aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described method.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the technical aspects of the disclosure.
FIG. 1 illustrates a flow chart of an image classification method according to an embodiment of the present disclosure;
FIG. 2 illustrates a flow chart of a neural network training method, according to an embodiment of the present disclosure;
FIG. 3 illustrates a schematic diagram of a feature space during initial training according to an embodiment of the present disclosure;
FIG. 4 shows a schematic diagram of a feature space in mid-to-late training according to an embodiment of the present disclosure;
FIG. 5 illustrates a graph of Euclidean distance versus radial base distance, according to an embodiment of the present disclosure;
FIG. 6 illustrates a first radial base distance versus classification probability graph according to an embodiment of the disclosure;
FIG. 7 illustrates an application schematic of a neural network training method according to an embodiment of the present disclosure;
FIG. 8 shows a block diagram of an image classification apparatus according to an embodiment of the disclosure;
FIG. 9 illustrates a block diagram of a neural network training device, according to an embodiment of the present disclosure;
FIG. 10 illustrates a block diagram of an electronic device according to an embodiment of the disclosure;
FIG. 11 shows a block diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the disclosure will be described in detail below with reference to the drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Although various aspects of the embodiments are illustrated in the accompanying drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.

The term "and/or" herein merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B both exist, or B exists alone. In addition, the term "at least one" herein means any one of a plurality, or any combination of at least two of a plurality; for example, "including at least one of A, B, and C" may mean including any one or more elements selected from the set consisting of A, B, and C.
In addition, numerous specific details are set forth in the following detailed description in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art have not been described in detail in order not to obscure the present disclosure.
FIG. 1 shows a flowchart of an image classification method according to an embodiment of the present disclosure. As shown in FIG. 1, the method includes:
in step S11, performing feature extraction processing on a target image through a target neural network, so as to obtain target features of the target image;
in step S12, determining, according to a preset first parameter, radial basis distances between the target feature and the class center features of each class of images in an image set, where the radial basis distances are used to represent the classification probabilities that the target feature belongs to each class, and the image set includes images of at least one class;

in step S13, determining the target class of the target image according to the radial basis distances.

According to the image classification method of the embodiments of the present disclosure, the class of a target image can be determined from the radial basis distances between the target feature and the class center feature of each class. This improves the distribution of the target feature and the class center features in the feature space, reduces the randomness of that distribution, strengthens the aggregation of same-class features, and improves the classification accuracy for the target image.
In one possible implementation, the image classification method may be performed by a terminal device or other processing device. The terminal device may be user equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like; the other processing device may be a server, a cloud server, or the like. In some possible implementations, the image classification method may be implemented by a processor invoking computer-readable instructions stored in a memory.
In one possible implementation, the image classification method may be used in a process of classifying a person image, for example, may be used in a process of classifying a captured person image by an access control device, a security device, or the like. For example, target features of a target image may be extracted through a neural network and classified according to the target features. When classifying a target image, class center features of at least one class are generally known, and target features of the image are extracted through a target neural network, so that the class of the target image is determined through feature similarity between the features of the target image and the class center features of each class. The target neural network may be a deep learning neural network such as a convolutional neural network, and the network structure of the target neural network is not limited in the present disclosure.
In one possible implementation, an image set (e.g., a sample image set) including a plurality of reference images may be stored in the access control device or security device and used for comparison with the captured target image. Features of the reference images in the image set may likewise be extracted by the neural network; according to the classes of the reference images, the extracted features cluster into a plurality of clusters in the feature space, each cluster corresponding to one class of reference images. For example, if the reference images are portrait images, each cluster consists of the features of reference images of the same person. Further, each cluster may include the class center feature of its class, for example a feature extracted from an ID photo, or a class center feature obtained as a weighted average of the features of the reference images in that class.
In an example, the image set may include images of multiple classes, e.g., class A, class B, class C, and so on, each having a class center. The method further includes: performing feature extraction processing on each image in the image set through the target neural network to obtain feature information of each image; and determining the class center feature of each class from the feature information of images of the same class.
In one possible implementation, the image set may include a plurality of sample images, for example a plurality of portrait images that can be divided into multiple classes; e.g., the image set includes portrait images of a number of persons, the images of each person forming one class. Feature extraction processing can be performed on each sample image through the target neural network to obtain the feature information of each sample image. The feature information may be a feature vector; the present disclosure does not limit the data type of the feature information.
In one possible implementation, determining the class center feature of each class from the feature information of images of the same class includes: determining a center image of each class of images; and determining the feature information corresponding to the center image of each class as the class center feature of that class.

In an example, a representative image of each class may be selected as the center image (e.g., if the images in the image set are portrait images, they may be classified by the identities of the target objects, and the ID photo or frontal photo of each target object may be selected as the center image of that class), and the feature of the center image extracted by the neural network may be determined as the class center feature.

In one possible implementation, determining the class center feature of each class from the feature information of images of the same class includes: clustering the feature information of the images to obtain the class center feature of each class.

In an example, the class center feature of each class may be obtained by clustering: feature information of each image may be extracted through the neural network and clustered, yielding the class center feature of each class.

In one possible implementation, determining the class center feature of each class from the feature information of images of the same class includes: performing weighted averaging on the feature information of each class of images to obtain the class center feature of each class.

In an example, the class center feature of each class may be obtained by weighted averaging: feature information of each image may be extracted through the neural network, and the feature information of each class may be weighted-averaged to obtain the class center feature of that class.
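By way of illustration of the three options above (center image, clustering, weighted averaging), the following Python sketch shows one possible realization with NumPy and scikit-learn; the function names and the choice of KMeans for the clustering variant are assumptions, not part of the disclosure. The center-image variant is omitted since it simply selects the feature of a designated image per class.

```python
import numpy as np
from sklearn.cluster import KMeans

def centers_by_weighted_average(features, labels, weights=None):
    """Class center = (weighted) mean of the feature vectors of each class.
    features: (N, D) array, labels: (N,) array of class ids."""
    centers = {}
    for c in np.unique(labels):
        mask = labels == c
        w = None if weights is None else weights[mask]
        centers[c] = np.average(features[mask], axis=0, weights=w)
    return centers

def centers_by_clustering(features, num_classes):
    """Class centers = cluster centroids, usable when labels are unavailable."""
    km = KMeans(n_clusters=num_classes, n_init=10).fit(features)
    return km.cluster_centers_
```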
In one possible implementation, after the class center feature of each class is obtained, the class of an image may be determined according to the feature similarity between the target feature of the target image and each class center feature. For example, the Euclidean distance between the target feature and each class center feature may be determined, and the class whose class center feature has the smallest Euclidean distance to the target feature may be taken as the class of the image. However, measuring feature similarity by Euclidean distance leaves the distribution of feature information in the feature space rather random, and the aggregation of same-class feature information is poor, which makes the class of the target feature hard to judge; it is also unfavorable for training the neural network.
In one possible implementation, in step S11, the target neural network may be used to extract the target feature of the target image. In step S12, the radial basis distances between the target feature and the class center feature of each class of reference images are determined according to a preset first parameter. In an example, a first parameter γ may be preset, and the radial basis distance between the target feature and each class center feature may be calculated using γ.

In one possible implementation, step S12 may include: determining the Euclidean distance between the target feature and each class center feature; and determining the radial basis distance between the target feature and each class center feature according to the Euclidean distance and the first parameter.
In one possible implementation, the Euclidean distance between the target feature and each class center feature may be determined. In an example, the target feature and the class center features may be feature vectors, and the Euclidean distance may be determined using the following formula (1):

$$d_{i,j} = \left\| x_i - c_j \right\|_2 \tag{1}$$

where $x_i$ is the target feature (the target image belonging to the $i$-th class, $i$ being a positive integer), $c_j$ is the class center feature of the $j$-th class ($j$ being a positive integer), $\|\cdot\|_2$ denotes the 2-norm, and $d_{i,j}$ is the Euclidean distance between the target feature and the class center feature of the $j$-th class. The Euclidean distance between the target feature and each class center feature may be determined according to formula (1).
In one possible implementation, the Euclidean distance $d_{i,j}$ and the preset first parameter γ can be used to determine the radial basis distance between the target feature and each class center feature. For example, the radial basis distance may be determined according to the following formula (2):

$$K_{i,j} = \exp\!\left( -\frac{d_{i,j}^{2}}{2\gamma^{2}} \right) \tag{2}$$

where $K_{i,j}$ is the radial basis distance between the target feature and the class center feature of the $j$-th class. Through this exponential radial basis form and the preset first parameter γ, a distance constraint is added to $K_{i,j}$ so that its range is $0 < K_{i,j} \le 1$: as the Euclidean distance $d_{i,j}$ tends to infinity, $K_{i,j}$ tends to 0, and when $d_{i,j} = 0$, $K_{i,j} = 1$. The radial basis distance between the target feature and each class center feature may be determined according to formula (2).
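A minimal sketch of formulas (1) and (2), assuming the Gaussian kernel form written above; the function name and the default value of γ are illustrative only.

```python
import numpy as np

def radial_basis_distance(target_feature, class_centers, gamma=1.5):
    """Formulas (1)-(2): Euclidean distance to each class center, mapped
    through the radial basis kernel K = exp(-d^2 / (2 * gamma^2)).
    K lies in (0, 1]: K -> 0 as d -> infinity, and K = 1 when d = 0.
    target_feature: (D,) vector; class_centers: (C, D) matrix."""
    d = np.linalg.norm(class_centers - target_feature, axis=1)  # formula (1)
    return np.exp(-d**2 / (2.0 * gamma**2))                     # formula (2)
```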
In one possible implementation, in step S13, the class of the target image is determined according to the radial basis distances between the target feature and the class center features. Since $K_{i,j} = 1$ at zero Euclidean distance, the class center feature with the largest radial basis distance to the target feature, i.e., the one closest in the feature space, may be determined, and the class to which that class center feature belongs taken as the class of the target image. For example, if the images in the image set and the target image are all portrait images, the images in the image set may be divided into classes by identity; if the class whose center feature has the largest radial basis distance to the target feature corresponds to identity A, the identity of the target object in the target image may be determined as identity A. The present disclosure does not limit the classes of the target image and of the images in the image set, nor the manner of classification.
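Continuing the sketch above, step S13 then reduces to taking the class whose center attains the largest kernel value, equivalently the center with the smallest Euclidean distance to the target feature (the array names below are assumed, as a usage example of the function sketched earlier):

```python
# class_centers: (C, D) array of class center features; target: (D,) target feature
K = radial_basis_distance(target, class_centers, gamma=1.5)
predicted_class = int(np.argmax(K))  # largest K = class center nearest the target
```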
In one possible implementation, the neural network may be trained before being used for feature extraction. In an example, the loss function of the neural network may be determined through the radial basis distance to enhance the training effect.
FIG. 2 shows a flowchart of a neural network training method according to an embodiment of the present disclosure. As shown in FIG. 2, the method includes:
in step S21, performing feature extraction processing on a first sample image through a neural network, so as to obtain a first feature of the first sample image;
in step S22, determining, according to a preset first parameter, the radial basis distances between the first feature and the class center features corresponding to each class of sample images in the training sample set;

in step S23, determining a network loss of the neural network according to a preset second parameter and the radial basis distances;

in step S24, training the neural network according to the network loss, and obtaining the target neural network after training is completed.

According to the neural network training method of the embodiments of the present disclosure, the network loss can be determined from the radial basis distances and the neural network optimized accordingly, which improves the distribution of the features extracted by the optimized neural network, the training effect of the neural network, and the classification accuracy of the neural network.
In one possible implementation, the network loss of the neural network could also be determined from the Euclidean distance during training. However, in the early stage of training the network loss is large and the network parameters are not yet accurate, so the features extracted from images carry large errors and strong randomness, and the features of each class are scattered randomly in the feature space; at this point, the Euclidean distance between features cannot accurately represent the feature similarity between classes.

FIG. 3 shows a schematic diagram of the feature space in the early stage of training according to an embodiment of the present disclosure (units are unit distances in the feature space). Because the randomness is large and there is no distance constraint between features, the Euclidean distance between features of the same class of images (for example, between the class center feature of a class and the feature of any image of that class) may be large, and there are many outliers, so training is difficult, i.e., it is hard to shrink the Euclidean distance between same-class features during training. On the other hand, because of the strong randomness, the Euclidean distance between features of different classes may be small, which also makes training difficult. In short, the Euclidean distances between features are unstable. This may further bias the class center feature during training: for example, in the feature space, the class center feature taken from the ID photo of a target object may not lie at the center of the region occupied by all features of that class, or may deviate considerably from the weighted average of all features of the class.

In one possible implementation, in the middle and later stages of training, when the network loss is low, the network parameters have been optimized and the features extracted from the images carry small errors, so the features of each class of images are distributed compactly, e.g., as clusters, in the feature space.

FIG. 4 shows a schematic diagram of the feature space in the middle and later stages of training according to an embodiment of the present disclosure (units are unit distances in the feature space). In the middle and later stages of training, the features of same-class images are compactly distributed, i.e., the Euclidean distance between features of the same class is far smaller than that between features of different classes. The network loss then becomes very small and may even vanish; however, although there is still room to optimize the features of same-class images, it is difficult to keep optimizing the Euclidean distance between them when the network loss is that small.

In summary, determining the network loss through the Euclidean distance between features may give a poor training effect. The network loss can be improved, for example, by determining it through the radial basis distance between features.
In one possible implementation, the neural network may be trained with a sample image set comprising at least one class of sample images, for example a set of portrait images of a number of persons, the images of each person forming one class. The second feature of each sample image may be obtained by performing feature extraction processing on it through the neural network. The second feature may be a feature vector; the present disclosure does not limit its data type.

In one possible implementation, the class center feature of each class may be determined from the second features of the sample images. In an example, the second features may be clustered to obtain the class center feature of each class. Alternatively, each sample image may carry a labeled class, i.e., its class is known, and the second features of same-class sample images may be weighted-averaged to obtain the class center feature. Alternatively, a representative sample image may be selected as the class center of a class (e.g., if the sample images are portrait images, the ID photo or frontal photo of each target object may be selected as the class center of that class), and the feature of the class center extracted by the neural network determined as the class center feature.

In one possible implementation, in step S21, the class of the first sample image is one of the classes of the sample images in the sample image set. For example, if the first sample image is a portrait image, the identity of its target object may coincide with the identity of the target object in at least one sample image of the set. The first feature of the first sample image may be obtained by performing feature extraction processing on it through the neural network. The data type of the first feature is consistent with that of the second features, e.g., feature vectors of the same dimension, so that feature similarity between them can be computed; the present disclosure does not limit the data types of the first and second features.
In one possible implementation, in step S22, the radial basis distances between the first feature of the first sample image and the class center feature of each class of sample images in the sample image set may be determined: first the Euclidean distance $d_{i,j}$ between the first feature and each class center feature is determined by formula (1) above, and then the radial basis distance $K_{i,j}$ between the first feature and each class center feature is determined from the Euclidean distance by formula (2) above.
FIG. 5 shows the relationship between the Euclidean distance $d_{i,j}$ (in units of distance in the feature space) and the radial basis distance $K_{i,j}$ for first-parameter values γ of 0.8, 1.2, 1.6, 2.0 and 2.4. As the Euclidean distance $d_{i,j}$ increases, the radial basis distance $K_{i,j}$ decreases, and the first parameter γ determines the rate of decrease: the rate falls as γ grows, i.e., the larger γ is, the more slowly the radial basis distance decreases. The value of the first parameter γ may be preset; for example, γ may take a value between 1 and 2.
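To make the role of γ concrete, the small check below (assuming the Gaussian kernel form of formula (2) above; illustrative only) evaluates K at a fixed Euclidean distance for the γ values plotted in FIG. 5:

```python
import numpy as np

d = 2.0  # a fixed Euclidean distance in the feature space
for gamma in (0.8, 1.2, 1.6, 2.0, 2.4):
    K = np.exp(-d**2 / (2.0 * gamma**2))
    print(f"gamma={gamma:.1f}  K={K:.3f}")
# gamma=0.8 gives K~0.044 while gamma=2.4 gives K~0.707: larger gamma, slower decay
```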
In this way, the first parameter and the radial basis distance add a distance constraint to the first features and second features extracted by the neural network, so that the distance measure between a first feature and the class center feature of its own class remains bounded. This reduces randomness and outliers and stabilizes the distances between features, lowering the training difficulty and improving the training effect. Moreover, because of the distance constraint, the relative difference between the inter-class radial basis distances and the intra-class radial basis distances shrinks in the middle and later stages of training, so the features of same-class images can be optimized further without additional training, improving the training effect.

In one possible implementation, in step S23, the network loss of the neural network may be determined from the radial basis distances and a preset second parameter s. In an example, the second parameter s may be preset, and the network loss determined using s and the radial basis distances obtained above.
In one possible implementation, step S23 may include: determining, according to the labeled class of the first sample image, a first radial basis distance between the first feature and the class center feature of the class to which the first sample image belongs; determining the classification probability of the first sample image according to the first radial basis distance, the second parameter, and the radial basis distances between the first feature and the class center feature of each class; and determining the network loss of the neural network according to the classification probability.

In one possible implementation, the first sample image may carry a labeled class, i.e., its true class is annotated, so among the class center features the one belonging to the class of the first sample image can be identified. The radial basis distance between the first feature of the first sample image and the class center feature of its own class is the first radial basis distance $K_{i,i}$, i.e., the radial basis distance $K_{i,j}$ with $j = i$.

In one possible implementation, the classification probability of the first sample image may be calculated using the first radial basis distance, the radial basis distances between the first feature and each class center feature, and the preset second parameter s.
In an example, the classification probability may be calculated according to the following formula (3):

$$P_{i,y_i} = \frac{\exp\left(s \cdot K_{i,i}\right)}{\sum_{j=1}^{C} \exp\left(s \cdot K_{i,j}\right)} \tag{3}$$

where $y_i$ denotes the $i$-th class, $P_{i,y_i}$ is the classification probability, i.e., the probability that the first sample image belongs to its labeled class, and $C$ is the number of classes, e.g., the number of classes of the plurality of sample images in the sample image set.
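A minimal sketch of formula (3): a softmax over the kernel values scaled by the second parameter s. Subtracting the maximum logit is a standard numerical-stability step that leaves the probabilities unchanged; the function and variable names are illustrative.

```python
import numpy as np

def classification_probability(K, label, s):
    """Formula (3): softmax over s * K_{i,j}; returns P_{i, y_i}, the
    probability that the sample belongs to its labeled class.
    K: (C,) radial basis distances to all class centers; label: int."""
    logits = s * K
    logits = logits - logits.max()  # stabilize; softmax is shift-invariant
    p = np.exp(logits) / np.exp(logits).sum()
    return p[label]
```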
FIG. 6 shows the relationship between the first radial basis distance $K_{i,i}$ and the classification probability for second-parameter values s of 8, 16, 32, 64 and 128. As the first radial basis distance $K_{i,i}$ increases, the classification probability $P_{i,y_i}$ increases, and the second parameter s determines the rate of increase: the rate grows with s, i.e., the larger s is, the faster the classification probability rises. Because the first radial basis distance ranges over $0 < K_{i,i} \le 1$, a suitable second parameter s must be preset so that the classification probability can cover the range 0 to 1 as $K_{i,i}$ varies between 0 and 1. For example, as shown in FIG. 6, when s is 8, the classification probability cannot cover 0 to 1 over the value range of the first radial basis distance, so 8 is not a suitable value of s; other values, such as 16 or 32, may be set.
In one possible implementation, the second parameter s may also be calculated, and the method further includes: determining the second parameter according to the number of classes of the sample images in the sample image set. In an example, the second parameter may be determined according to the following formula (4):

$$s = 2\ln(C - 1) \tag{4}$$

where $C$ is the number of classes.
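As a quick worked check of formula (4) (illustrative only), a 10-class problem gives s = 2 ln 9 ≈ 4.39, while a 1000-class problem gives s = 2 ln 999 ≈ 13.81:

```python
import math

for C in (10, 1000):
    s = 2.0 * math.log(C - 1)  # formula (4)
    print(f"C={C}: s={s:.2f}")
# C=10: s=4.39    C=1000: s=13.81
```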
In one possible implementation, the network loss of the neural network, e.g., a radial-basis-exponential normalized cross-entropy loss, may be determined from the classification probability that the first sample image belongs to its labeled class.
In an example, the network loss of the neural network may be determined according to the following formula (5):

$$L = -\log P_{i,y_i} \tag{5}$$

That is, the radial-basis-exponential normalized cross-entropy loss $L$ is the negative logarithm of the classification probability $P_{i,y_i}$. The base of the logarithm may be any positive number, e.g., 10; the present disclosure does not limit the base of the logarithm.
In this way, the classification probability is determined through the second parameter and the radial basis distances, and the network loss is then determined through the classification probability. This reduces the relative difference between inter-class and intra-class radial basis distances in the middle and later stages of training, keeps the network loss on same-class features from becoming too small, allows the features of same-class images to be optimized further, and improves the training effect.
In one possible implementation, in step S24, the neural network may be trained according to the network loss. In an example, the network parameters of the neural network may be adjusted in the direction that minimizes the network loss, e.g., the network loss may be backpropagated using gradient descent to adjust the network parameters. When the training condition is met, the trained target neural network is obtained. The training condition may be a number of adjustments, i.e., the network parameters are adjusted a predetermined number of times; or it may be the magnitude or convergence of the network loss, i.e., adjustment stops when the network loss has decreased sufficiently or converges within a threshold. The trained target neural network can then be used in classification.
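The following PyTorch sketch is one hedged way to realize such a training step; the module name, the Gaussian kernel of formula (2), the learnable class centers, and the random stand-in features are all assumptions, since the patent prescribes neither a framework nor a backbone. Note that torch.nn.CrossEntropyLoss applied to the logits s·K computes exactly the negative log-softmax of formulas (3) and (5) with natural logarithm.

```python
import math
import torch
import torch.nn as nn

class RBFClassificationHead(nn.Module):
    """Replaces a linear classifier: logits are s * K, with K the radial
    basis distance between a feature and each learnable class center."""
    def __init__(self, feature_dim, num_classes, gamma=1.5):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feature_dim))
        self.gamma = gamma
        self.s = 2.0 * math.log(num_classes - 1)  # formula (4)

    def forward(self, features):                        # features: (N, D)
        d = torch.cdist(features, self.centers)         # formula (1): (N, C)
        K = torch.exp(-d.pow(2) / (2 * self.gamma**2))  # formula (2)
        return self.s * K                               # logits for formula (3)

# one optimization step; CrossEntropyLoss on these logits equals formula (5)
head = RBFClassificationHead(feature_dim=512, num_classes=1000)
optimizer = torch.optim.SGD(head.parameters(), lr=0.1)
features = torch.randn(8, 512)            # stand-in for backbone output
labels = torch.randint(0, 1000, (8,))
optimizer.zero_grad()
loss = nn.CrossEntropyLoss()(head(features), labels)
loss.backward()                           # backpropagation (gradient descent)
optimizer.step()
```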
According to the neural network training method of the embodiments of the present disclosure, the first parameter and the radial basis distance add a distance constraint to the first features and second features extracted by the neural network, so that the distance measure between a first feature and the class center feature of its own class remains bounded; randomness and outliers are reduced and the distances between features are stabilized, lowering the training difficulty and improving the distribution of the features extracted by the optimized neural network. Moreover, because of the distance constraint, the relative difference between inter-class and intra-class radial basis distances shrinks in the middle and later stages of training without the network loss on same-class features becoming too small, so the features of same-class images can be optimized further without additional training, improving the training effect and the classification accuracy of the neural network.
FIG. 7 shows a schematic diagram of an application of the neural network training method according to an embodiment of the present disclosure. The sample images in a sample image set may first be classified and the class center feature of each class acquired. The sample image set may include a plurality of sample images, for example portrait images, with the images of each person forming one class. Feature extraction processing may be performed on each sample image through the neural network to obtain its second feature, and the class center feature of each class may be determined among the second features of that class; for example, if the sample images are portrait images, the ID photo or frontal photo of each target object may be selected as the class center of that class, and its second feature determined as the class center feature.
In one possible implementation, feature extraction may be performed on the first sample image through the neural network to obtain its first feature; the Euclidean distances between the first feature and each class center feature may be determined according to formula (1), and, with a preset first parameter γ, the radial basis distances between the first feature and each class center feature determined according to formula (2).

In one possible implementation, the second parameter may be calculated according to formula (4), the classification probability of the first sample image calculated according to formula (3) from the radial basis distances between the first feature and each class center feature, and the network loss of the neural network then calculated according to formula (5).

In one possible implementation, the network parameters of the neural network may be adjusted according to the network loss, and the trained target neural network obtained when the training condition is met. A target neural network trained in this way performs better in classification processing.
In one possible implementation, panel f of FIG. 7 shows the distribution, in the feature space, of features extracted from the sample images by a target neural network trained with the above method, while panels a, b, c, d and e show the distributions of features extracted by neural networks trained with other methods. As shown, the same-class features in panel f are more compactly distributed and lie closer to their class center features, the classes separate more clearly, and the classification accuracy of panel f (99.20%) is higher than that of the other trained neural networks (98.97%, 99.04%, 99.05%, 98.93% and 98.74%).
In one possible implementation, the neural network training method can be used for classifying large numbers of images and for anomaly detection in industrial production; it can also support image-based search engines, album classification and the like, or help improve the performance of methods such as object detection and semantic segmentation. The present disclosure does not limit the application field of the neural network training method.
FIG. 8 shows a block diagram of an image classification apparatus according to an embodiment of the present disclosure. As shown in FIG. 8, the apparatus includes:
the first extraction module 11 is configured to perform feature extraction processing on a target image through a target neural network, so as to obtain target features of the target image; the first determining module 12 is configured to determine, according to a preset first parameter, a radial base distance between the target feature and a class center feature of each class of images in an image set, where the radial base distance is used to represent a classification probability that the target feature respectively belongs to each class, and the image set includes at least one class of images; a second determining module 13, configured to determine a target class of the target image according to the radial base distance.
In one possible implementation, the first determining module is further configured to: determining Euclidean distance between the target feature and each class center feature; and determining radial base distances between the target feature and each class center feature according to the Euclidean distance and the first parameter.
In one possible implementation, the apparatus further includes: the second extraction module is used for respectively carrying out characteristic extraction processing on each image in the image set through the target neural network and respectively obtaining characteristic information of each image; and the third determining module is used for determining the class center characteristics of each class in the characteristic information of the images with the same class.
In one possible implementation, the third determining module is further configured to: determining a center image in each type of image; feature information corresponding to the center image of each category is determined as a category center feature of each category.
In one possible implementation, the third determining module is further configured to: and clustering the characteristic information of each image to obtain class center characteristics of each class.
In one possible implementation, the third determining module is further configured to: and respectively carrying out weighted average processing on the characteristic information of each class of images to obtain class center characteristics of each class.
Fig. 9 shows a block diagram of a neural network training device, as shown in fig. 9, according to an embodiment of the present disclosure, the device comprising: a third extraction module 21, configured to perform feature extraction processing on a first sample image through a neural network, to obtain a first feature of the first sample image; a fourth determining module 22, configured to determine, according to a preset first parameter, a radial base distance between the first feature and a class center feature corresponding to each class of sample image in the training sample set; a fifth determining module 23, configured to determine a network loss of the neural network according to a preset second parameter and the radial base distance; and the training module 24 is used for training the neural network according to the network loss, and obtaining the target neural network after training is completed.
In one possible implementation, the fifth determining module is further configured to: determine, according to the labeled class of the first sample image, a first radial basis distance between the first feature and the class center feature of the class to which the first sample image belongs; determine the classification probability of the first sample image according to the first radial basis distance, the second parameter, and the radial basis distances between the first feature and each class center feature; and determine the network loss of the neural network according to the classification probability.
In one possible implementation, the apparatus further includes: a sixth determining module, configured to determine the second parameter according to the number of categories of the plurality of sample images in the sample image set.
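As a training-side sketch only: assuming the classification probability is a softmax over the radial basis distances scaled by the second parameter s, and assuming that deriving s from the class count is one possible behavior of the sixth determining module (both are illustrative assumptions, and NumPy stands in for a differentiable framework), the network loss for a single first sample image might be computed as follows.

```python
import numpy as np

def rbf_classification_loss(first_feature, class_centers, label, gamma, s):
    """Cross-entropy between the labeled class and a softmax over s * K, where
    K holds the radial basis distances of the first feature to every class
    center feature; k[label] is the first radial basis distance."""
    d = np.linalg.norm(class_centers - first_feature, axis=1)
    k = np.exp(-d / gamma)            # radial basis distances (assumed form)
    logits = s * k                    # the second parameter s scales the distances
    logits = logits - logits.max()    # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[label])      # small when the labeled class dominates

# One hypothetical way to tie the second parameter to the number of
# categories in the sample image set (an assumption, not the disclosed rule):
num_categories = 1000
s = 2.0 * np.log(num_categories)
```

In an actual training loop the loss would be averaged over batches and minimized by backpropagation, so a differentiable framework would replace NumPy in practice.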
It will be appreciated that the above-mentioned method embodiments of the present disclosure may be combined with one another to form combined embodiments without departing from their principles and logic; owing to space limitations, such combinations are not described in detail in the present disclosure.
In addition, the present disclosure further provides an image classification apparatus, an electronic device, a computer readable storage medium, and a program, all of which can be used to implement any of the image classification methods provided in the present disclosure; for the corresponding technical solutions and descriptions, refer to the description of the method sections, which is not repeated here.
It will be appreciated by those skilled in the art that, in the methods of the specific embodiments described above, the written order of the steps does not imply a strict execution order; the actual execution order of the steps should be determined by their functions and possible internal logic.
In some embodiments, functions or modules included in an apparatus provided by the embodiments of the present disclosure may be used to perform the methods described in the foregoing method embodiments; for their specific implementation, refer to the description of the foregoing method embodiments, which is not repeated herein for brevity.
The disclosed embodiments also provide a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described method. The computer readable storage medium may be a non-volatile computer readable storage medium.
The embodiments of the present disclosure also provide an electronic device, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to perform the method described above.
The electronic device may be provided as a terminal, server or other form of device.
Fig. 10 is a block diagram of an electronic device 800, according to an example embodiment. For example, electronic device 800 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, exercise device, personal digital assistant, or the like.
Referring to fig. 10, an electronic device 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interactions between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the electronic device 800. Examples of such data include instructions for any application or method operating on the electronic device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type of volatile or non-volatile memory device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
The power component 806 provides power to the various components of the electronic device 800. The power component 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the electronic device 800.
The multimedia component 808 includes a screen that provides an output interface between the electronic device 800 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or swipe action, but also the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the electronic device 800 is in an operational mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focusing and optical zoom capabilities.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 further includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, click wheel, buttons, etc. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.
The sensor assembly 814 includes one or more sensors for providing status assessments of various aspects of the electronic device 800. For example, the sensor assembly 814 may detect the on/off state of the electronic device 800 and the relative positioning of components, such as the display and keypad of the electronic device 800. The sensor assembly 814 may also detect a change in position of the electronic device 800 or one of its components, the presence or absence of user contact with the electronic device 800, the orientation or acceleration/deceleration of the electronic device 800, and a change in temperature of the electronic device 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices. The electronic device 800 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 816 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 816 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements, for executing the methods described above.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as the memory 804 including computer program instructions, which are executable by the processor 820 of the electronic device 800 to perform the methods described above.
Embodiments of the present disclosure also provide a computer program product, including computer readable code which, when run on a device, causes a processor in the device to execute instructions for implementing the image classification method provided in any of the above embodiments.
The embodiments of the present disclosure also provide another computer program product for storing computer readable instructions which, when executed, cause a computer to perform the operations of the image classification method provided in any of the above embodiments.
The computer program product may be implemented specifically by hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied as a computer storage medium; in another alternative embodiment, the computer program product is embodied as a software product, such as a Software Development Kit (SDK).
Fig. 11 is a block diagram of an electronic device 1900, according to an example embodiment. For example, the electronic device 1900 may be provided as a server. Referring to Fig. 11, the electronic device 1900 includes a processing component 1922, which further includes one or more processors, and memory resources represented by a memory 1932 for storing instructions executable by the processing component 1922, such as application programs. The application programs stored in the memory 1932 may include one or more modules, each corresponding to a set of instructions. Further, the processing component 1922 is configured to execute the instructions to perform the methods described above.
The electronic device 1900 may also include a power component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output (I/O) interface 1958. The electronic device 1900 may operate based on an operating system stored in the memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as the memory 1932 including computer program instructions, which are executable by the processing component 1922 of the electronic device 1900 to perform the methods described above.
The present disclosure may be a system, method, and/or computer program product. The computer program product may include a computer readable storage medium having computer readable program instructions embodied thereon for causing a processor to implement aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile disc (DVD), a memory stick, a floppy disk, a mechanical encoding device such as a punch card or a raised structure in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as a transitory signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (for example, an optical pulse through a fiber optic cable), or an electrical signal transmitted through a wire.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to respective computing/processing devices, or to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer readable program instructions from the network and forwards them for storage in a computer readable storage medium within the respective computing/processing device.
Computer program instructions for performing the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), is personalized by utilizing state information of the computer readable program instructions, and the electronic circuitry can execute the computer readable program instructions to implement aspects of the present disclosure.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing description of the embodiments of the present disclosure has been presented for purposes of illustration and description; it is not intended to be exhaustive or to limit the disclosure to the embodiments described. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the embodiments described. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or the technical improvement over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (20)

1. An image classification method, comprising:
performing feature extraction processing on a target image through a target neural network to obtain a target feature of the target image;
determining, according to a preset first parameter, radial basis distances between the target feature and the class center feature of each class of images in an image set, wherein the radial basis distances are used for representing the classification probabilities that the target feature belongs to each class, and the image set comprises at least one class of images;
determining a target class of the target image according to the radial basis distances;
wherein the determining, according to the preset first parameter, the radial basis distance between the target feature and the class center feature of each class of images comprises: determining the radial basis distance using the following formula,

K_{i,j} = e^{-d_{i,j}/γ}

wherein K_{i,j} is the radial basis distance between the target feature and the class center feature of the j-th class, γ is a preset target parameter, and d_{i,j} is the Euclidean distance between the target feature and the class center feature of the j-th class.
2. The method of claim 1, wherein the determining, according to a preset first parameter, the radial basis distance between the target feature and the class center feature of each class of images comprises:
determining the Euclidean distance between the target feature and each class center feature; and
determining the radial basis distances between the target feature and each class center feature according to the Euclidean distances and the first parameter.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
performing feature extraction processing on each image in the image set through the target neural network to obtain feature information of each image; and
determining the class center feature of each class from the feature information of images of the same class.
4. The method according to claim 3, wherein the determining the class center feature of each class from the feature information of images of the same class comprises:
determining a center image in each class of images; and
determining the feature information corresponding to the center image of each class as the class center feature of that class.
5. The method according to claim 3, wherein the determining the class center feature of each class from the feature information of images of the same class comprises:
clustering the feature information of the images to obtain the class center feature of each class.
6. The method according to claim 3, wherein the determining the class center feature of each class from the feature information of images of the same class comprises:
performing weighted average processing on the feature information of each class of images to obtain the class center feature of each class.
7. A neural network training method, comprising:
performing feature extraction processing on a first sample image through a neural network to obtain a first feature of the first sample image;
determining, according to a preset first parameter, radial basis distances between the first feature and the class center features corresponding to each class of sample images in a training sample set;
determining a network loss of the neural network according to a preset second parameter and the radial basis distances; and
training the neural network according to the network loss, and obtaining a target neural network after training is completed;
the determining, according to a preset first parameter, a radial base distance between the first feature and a class center feature corresponding to each class of sample images in a training sample set, includes: the radial basis distance is determined using the following formula,
wherein K is i,j Between the target feature and the class center feature of the j-th classGamma is a preset target parameter, d i,j Is the euclidean distance between the target feature and the class-center feature of the j-th class.
8. The method of claim 7, wherein the determining the network loss of the neural network according to the preset second parameter and the radial basis distances comprises:
determining, according to the labeled class of the first sample image, a first radial basis distance between the first feature and the class center feature of the class to which the first sample image belongs;
determining the classification probability of the first sample image according to the first radial basis distance, the second parameter, and the radial basis distances between the first feature and each class center feature; and
determining the network loss of the neural network according to the classification probability.
9. The method of claim 7, wherein the method further comprises:
determining the second parameter according to the number of categories of the plurality of sample images in the sample image set.
10. An image classification apparatus, comprising:
the first extraction module is used for performing feature extraction processing on a target image through a target neural network to obtain a target feature of the target image;
the first determining module is used for determining, according to a preset first parameter, radial basis distances between the target feature and the class center feature of each class of images in an image set, wherein the radial basis distances are used for representing the classification probabilities that the target feature belongs to each class, and the image set comprises at least one class of images; and
the second determining module is used for determining a target class of the target image according to the radial basis distances;
wherein the first determining module is configured to determine the radial basis distance using the following formula,

K_{i,j} = e^{-d_{i,j}/γ}

wherein K_{i,j} is the radial basis distance between the target feature and the class center feature of the j-th class, γ is a preset target parameter, and d_{i,j} is the Euclidean distance between the target feature and the class center feature of the j-th class.
11. The apparatus of claim 10, wherein the first determining module is further configured to:
determine the Euclidean distance between the target feature and each class center feature; and
determine the radial basis distances between the target feature and each class center feature according to the Euclidean distances and the first parameter.
12. The apparatus according to claim 10 or 11, characterized in that the apparatus further comprises:
the second extraction module is used for performing feature extraction processing on each image in the image set through the target neural network to obtain feature information of each image; and
the third determining module is used for determining the class center feature of each class from the feature information of images of the same class.
13. The apparatus of claim 12, wherein the third determining module is further configured to:
determine a center image in each class of images; and
determine the feature information corresponding to the center image of each class as the class center feature of that class.
14. The apparatus of claim 12, wherein the third determining module is further configured to:
cluster the feature information of the images to obtain the class center feature of each class.
15. The apparatus of claim 12, wherein the third determining module is further configured to:
perform weighted average processing on the feature information of each class of images to obtain the class center feature of each class.
16. A neural network training device, comprising:
the third extraction module is used for performing feature extraction processing on a first sample image through a neural network to obtain a first feature of the first sample image;
the fourth determining module is used for determining, according to a preset first parameter, radial basis distances between the first feature and the class center features corresponding to each class of sample images in a training sample set;
the fifth determining module is used for determining a network loss of the neural network according to a preset second parameter and the radial basis distances; and
the training module is used for training the neural network according to the network loss, and obtaining a target neural network after training is completed;
the fourth determining module is used for
The radial basis distance is determined using the following formula,
wherein K is i,j For the radial base distance between the target feature and the class center feature of the j-th class, gamma is a preset target parameter, d i,j Is the euclidean distance between the target feature and the class-center feature of the j-th class.
17. The apparatus of claim 16, wherein the fifth determining module is further configured to:
determine, according to the labeled class of the first sample image, a first radial basis distance between the first feature and the class center feature of the class to which the first sample image belongs;
determine the classification probability of the first sample image according to the first radial basis distance, the second parameter, and the radial basis distances between the first feature and each class center feature; and
determine the network loss of the neural network according to the classification probability.
18. The apparatus of claim 16, wherein the apparatus further comprises:
a sixth determining module, configured to determine the second parameter according to the number of categories of the plurality of sample images in the sample image set.
19. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method of any one of claims 1 to 9.
20. A computer readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the method of any of claims 1 to 9.
CN202010054273.XA 2020-01-17 2020-01-17 Image classification and neural network training method, device, equipment and storage medium Active CN111259967B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010054273.XA CN111259967B (en) 2020-01-17 2020-01-17 Image classification and neural network training method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111259967A CN111259967A (en) 2020-06-09
CN111259967B (en) 2024-03-08

Family

ID=70947114


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant