CN108197669B - Feature training method and device of convolutional neural network

Feature training method and device of convolutional neural network

Info

Publication number
CN108197669B
CN108197669B (application CN201810096726.8A)
Authority
CN
China
Prior art keywords: loss function, feature, loss, calculating, neural network
Prior art date
Legal status: Active
Application number
CN201810096726.8A
Other languages
Chinese (zh)
Other versions
CN108197669A (en)
Inventor
张默
刘彬
孙伯元
Current Assignee
Beijing Moshanghua Technology Co ltd
Original Assignee
Beijing Moshanghua Technology Co ltd
Priority date: 2018-01-31
Filing date: 2018-01-31
Publication date: 2021-04-30
Application filed by Beijing Moshanghua Technology Co ltd
Priority to CN201810096726.8A
Publication of CN108197669A
Application granted
Publication of CN108197669B


Classifications

    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches, based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification
    • G06N3/045 Combinations of networks
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06V40/172 Classification, e.g. identification (human faces)

Abstract

The application discloses a feature training method and device for a convolutional neural network. The feature training method comprises the following steps: extracting a first feature picture; determining a feature map of the first feature picture and obtaining a first feature from the feature map; calculating a loss value of a loss function with the first feature as input; and updating the convolutional neural network according to the loss value. The method and device address the technical problem that existing loss objective functions cannot ensure that intra-class distances are smaller and inter-class distances are larger.

Description

Feature training method and device of convolutional neural network
Technical Field
The application relates to the field of computers, and in particular to a feature training method and device for a convolutional neural network.
Background
Convolutional neural networks perform well in the field of computer vision, and are applied in particular to object recognition, object detection, and object segmentation. A trained convolutional neural network, built by stacking convolutional layers and activation layers, achieves strong visual representation capability. Structurally, such a network consists of two parts: the convolutional network and the objective loss function.
The inventors have found that the loss functions used in convolutional neural networks have drawbacks. Some make it difficult to ensure that intra-class distances are small and inter-class distances are large; if that property held, the features extracted by the trained network would be more representative. Other loss functions do make intra-class distances smaller but do not push inter-class distances apart, and they limit object-recognition accuracy, so they are mainly used in the field of face classification. Still other loss functions guarantee both smaller intra-class distances and larger inter-class distances, but their training process is difficult to converge if the training data itself contains some noise.
For the problem in the related art that the loss objective function cannot ensure smaller intra-class distances and larger inter-class distances, no effective solution has been proposed so far.
Disclosure of Invention
The present application mainly aims to provide a feature training method for a convolutional neural network that solves this problem.
In order to achieve the above object, according to one aspect of the present application, there is provided a feature training method of a convolutional neural network, including: extracting a first characteristic picture; determining a feature map of the first feature picture, and acquiring a first feature according to the feature map; calculating a loss value of a loss function using the first feature as an input; and updating the convolutional neural network according to the loss value; wherein the loss function is used to make the features trained in the updated convolutional neural network conform to a preset category.
Further, calculating the loss value of the loss function includes: configuring a first loss function, which is a combined Softmax and cross-entropy loss function; and configuring a second loss function, which is an angle loss function.
Further, calculating the loss value of the loss function includes computing

$$\mathrm{Loss}_1 = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{W_{y_i}^{T}f_i}}{\sum_{j=1}^{M}e^{W_{j}^{T}f_i}}$$

wherein $W_{y_i}$ denotes the weight corresponding to $y_i$ and $N$ denotes the number of input pictures; the loss function averages, over the N input pictures, the negative log of the probability assigned to the true category.

Calculating the loss value of the loss function further includes computing

$$\mathrm{Loss}_2 = \frac{1}{N}\sum_{i=1}^{N}\Bigl(1-\cos(W_{y_i},f_i)\Bigr)$$

wherein $W_{y_i}$ denotes the weight corresponding to $y_i$, $N$ denotes the number of input pictures, and $y_i$ denotes the category corresponding to each input picture; the loss function averages $1-\cos(W_{y_i},f_i)$ over the N pictures.
Further, after updating the convolutional neural network according to the loss value, the method further includes: inputting a second picture to be tested; obtaining a corresponding second feature through the convolutional neural network updated by the loss value; calculating a loss value of the loss function using the second feature as an input; and determining the category of the object corresponding to the second picture.
Further, the loss function is used to make the features trained in the updated convolutional neural network conform to the following presets: a smaller intra-class distance of the features; a larger inter-class distance of the features.
In order to achieve the above object, according to another aspect of the present application, there is provided a feature training apparatus of a convolutional neural network.
The feature training device of the convolutional neural network according to the present application includes: an extraction unit for extracting a first feature picture; a determining unit for determining a feature map of the first feature picture and acquiring a first feature according to the feature map; a loss function unit for calculating a loss value of a loss function using the first feature as an input; and an inverse unit for updating the convolutional neural network according to the loss value; wherein the loss function is used to make the features trained in the updated convolutional neural network conform to preset categories.
Further, the loss function unit includes a first loss function unit and a second loss function unit, wherein the first loss function unit implements the combined Softmax and cross-entropy loss function, and the second loss function unit implements the angle loss function.
Further, the apparatus further comprises: the test unit is used for inputting a second picture to be tested; obtaining a corresponding second characteristic through the convolutional neural network after the loss value is updated; calculating a loss value of a loss function using the second feature as an input; and determining the category of the object corresponding to the second picture.
Further, the inverse unit is further configured to make the features trained in the updated convolutional neural network, through the loss function, conform to the presets that the intra-class distance of the features is smaller and the inter-class distance of the features is larger.
In the embodiments of the application, feature training in the convolutional neural network is optimized: the loss function makes the features trained in the updated convolutional neural network conform to preset categories, achieving the technical effect of training features with stronger discriminative capability and solving the technical problem that the loss objective function cannot ensure smaller intra-class distances and larger inter-class distances.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, serve to provide a further understanding of the application and to make other features, objects, and advantages of the application more apparent. The drawings and their description illustrate the embodiments of the application and do not limit it. In the drawings:
FIG. 1 is a schematic diagram of a feature training method for a convolutional neural network according to a first embodiment of the present application;
FIG. 2 is a schematic diagram of a feature training method for a convolutional neural network according to a second embodiment of the present application;
FIG. 3 is a schematic diagram of a feature training method for a convolutional neural network according to a third embodiment of the present application; and
FIG. 4 is a schematic diagram of a feature training apparatus for convolutional neural networks according to a preferred embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances, such that the embodiments of the application described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Many loss functions have been proposed so far. The original Softmax combined with cross entropy has the disadvantage that it is difficult to ensure that intra-class distances are small and inter-class distances are large; if that property held, the features extracted by the trained network would be more discriminative.
Center-Loss was proposed later; it ensures smaller intra-class distances but not larger inter-class distances, and it affects object-recognition accuracy, so it is mainly used in the field of face classification. L-Softmax was proposed subsequently; it guarantees both smaller intra-class distances and larger inter-class distances, but its training process is difficult to converge if the training data itself contains some noise.
According to the method of the application, feature training in the convolutional neural network is optimized: the loss function makes the features trained in the updated convolutional neural network conform to preset categories, achieving the technical effect of training features with stronger discriminative capability.
The method in the embodiments of the application uses an angle-based loss function, mainly applied in the training process of object recognition based on a deep-learning convolutional neural network. Its main functions are as follows: a. the trained features have stronger characterization capability, i.e., smaller intra-class distances and larger inter-class distances; b. on the premise that a holds, the convergence of the neural-network training process is guaranteed.
The target loss function involved in the method can also be used for model training in tasks other than object recognition, including object detection, object segmentation, and the like.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
As shown in fig. 1, the method includes steps S102 to S108 as follows:
step S102, extracting a first characteristic picture;
N pictures are input and normalized so that all pixel values lie in [-1, 1]; the normalized pictures are then fed into the convolutional neural network.
The convolutional neural network structure comprises a plurality of convolutional layers, each followed by an activation layer; a corresponding feature map is obtained after each convolutional layer.
The normalized data is input into the convolutional neural network to obtain the corresponding feature maps.
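For concreteness, the preprocessing step can be sketched as follows. This is a minimal illustration assuming 8-bit RGB input; the function name preprocess_batch is illustrative and not from the patent.

```python
import numpy as np

def preprocess_batch(images_uint8: np.ndarray) -> np.ndarray:
    """Normalize a batch of N pictures from [0, 255] to [-1, 1]."""
    images = images_uint8.astype(np.float32)
    return images / 127.5 - 1.0

# Example: a batch of N=4 random 32x32 RGB pictures.
batch = np.random.randint(0, 256, size=(4, 3, 32, 32), dtype=np.uint8)
normalized = preprocess_batch(batch)
assert normalized.min() >= -1.0 and normalized.max() <= 1.0
```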
Step S104, determining a feature map of the first feature picture, and acquiring a first feature according to the feature map;
Determining the feature map of the feature picture means obtaining the feature map according to its number of channels and its length and width.
For example, let the size of each feature map be c × h × w, where c is the number of channels of the feature map and h and w are its length and width; since N pictures are input, N feature maps are finally obtained.
Step S106, taking the first characteristic as input, and calculating a loss value of a loss function;
The feature maps are taken as input to a fully connected layer in the convolutional neural network, which outputs multi-dimensional features.
For example, the N feature maps are used as input, and N×M-dimensional features are obtained through the fully connected layer, i.e., N features corresponding to the N pictures, each feature being M-dimensional.
The loss value of the loss function is calculated using the N×M-dimensional features and the class labels of the pictures as input.
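A backbone matching this description might look like the following PyTorch sketch: stacked convolution + activation layers producing a c × h × w feature map, followed by a fully connected layer mapping each picture to an M-dimensional feature. The layer sizes (and M = 128) are illustrative assumptions, not values from the patent.

```python
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    def __init__(self, feature_dim: int = 128):   # M, assumed here to be 128
        super().__init__()
        self.conv = nn.Sequential(                 # each conv layer followed by an activation layer
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),               # final feature map of size c x h x w = 64 x 4 x 4
        )
        self.fc = nn.Linear(64 * 4 * 4, feature_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = self.conv(x)                           # N x c x h x w feature maps
        return self.fc(f.flatten(1))               # N x M features, one M-dim feature per picture
```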
Step S108, updating the convolutional neural network according to the loss value;
the loss function is used for enabling the features trained in the updated convolutional neural network to accord with preset categories.
Making the features conform to the preset categories may mean ensuring that the distance between features of the same class (intra-class) is smaller and the distance between features of different classes (inter-class) is larger.
Specifically, loss values of two loss functions are calculated: the first loss function is a combination of Softmax and cross entropy, and the second loss function is an angle loss function.
From the above description, it can be seen that the present invention achieves the following technical effects:
In the embodiments of the application, feature training in the convolutional neural network is optimized: the loss function makes the features trained in the updated convolutional neural network conform to preset categories, achieving the technical effect of training features with stronger discriminative capability and solving the technical problem that the loss objective function cannot ensure smaller intra-class distances and larger inter-class distances. Moreover, the method introduces no additional hyper-parameters during training, which reduces the cost of manual parameter tuning, and it does not significantly increase video memory or memory usage during training.
In the test process in the embodiment of the application, the extracted picture features can be used in the fields of object identification, object retrieval and the like.
According to an embodiment of the present invention, preferably, as shown in fig. 2, calculating the loss value of the loss function includes:
step S202, configuring a first loss function,
wherein the first loss function is the combined Softmax and cross-entropy loss function.
Calculating the loss value of the first loss function includes computing

$$\mathrm{Loss}_1 = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{W_{y_i}^{T}f_i}}{\sum_{j=1}^{M}e^{W_{j}^{T}f_i}}$$

wherein $W_{y_i}$ denotes the weight corresponding to $y_i$ and $N$ denotes the number of input pictures; the loss function averages, over the N input pictures, the negative log of the probability assigned to the true category.

Here f is the first feature obtained, $W_i$ is the weight vector corresponding to class i, and $W_{y_i}$ is therefore the weight vector corresponding to category $y_i$ (in this application there are M categories, and each input picture corresponds to a specific category $y_i$), $y_i$ being the real category corresponding to the input picture.

Multiplying $W_{y_i}^{T}$ by f yields a score, and the Softmax expression above gives the probability that f is judged to belong to category $y_i$.
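As a concrete reading of this first loss, the following sketch computes the $W_j^{T}f$ scores, applies Softmax, and averages the negative log-probability of the true category over the N pictures. The tensor shapes (and the separate class count K) are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def softmax_cross_entropy(features: torch.Tensor,  # N x M first features f
                          weights: torch.Tensor,   # M x K, one weight vector W_j per category
                          labels: torch.Tensor) -> torch.Tensor:  # N true categories y_i
    scores = features @ weights                    # N x K scores W_j^T f
    log_probs = F.log_softmax(scores, dim=1)       # log of the probability for each category
    # Average, over the N pictures, the negative log-probability of the true category y_i.
    return -log_probs[torch.arange(labels.shape[0]), labels].mean()
```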
Step S204, configuring a second loss function,
wherein the second loss function is the angle loss function.
Calculating the loss value of the second loss function includes computing

$$\mathrm{Loss}_2 = \frac{1}{N}\sum_{i=1}^{N}\Bigl(1-\cos(W_{y_i},f_i)\Bigr),\qquad \cos(W_{y_i},f_i)=\frac{W_{y_i}^{T}f_i}{\lVert W_{y_i}\rVert\,\lVert f_i\rVert}$$

wherein $W_{y_i}$ denotes the weight corresponding to $y_i$, $N$ denotes the number of input pictures, and $y_i$ denotes the category corresponding to each input picture; the loss function averages $1-\cos(W_{y_i},f_i)$ over the N pictures.

Here f is the first feature obtained, and $W_{y_i}$ is the weight vector corresponding to category $y_i$ (in this application there are M categories, and each input picture corresponds to a specific category $y_i$), $y_i$ being the real category corresponding to the input picture.

$\cos(W_{y_i},f)$ denotes the cosine of the angle between $W_{y_i}$ and f; it lies in the range [-1, 1], and the closer it is to 1, the smaller the angle between the $W_{y_i}$ vector and the feature vector f.

By minimizing the average of $1-\cos(W_{y_i},f)$ over the N pictures, the loss function makes the angle between $W_{y_i}$ and f as small as possible.
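A sketch of this angle loss is below, under the assumption (implied but not spelled out by the text) that the minimized quantity is the average of $1-\cos(W_{y_i},f)$: minimizing it drives the cosine toward 1, i.e., the angle toward 0.

```python
import torch
import torch.nn.functional as F

def angle_loss(features: torch.Tensor,   # N x M first features f
               weights: torch.Tensor,    # M x K category weight vectors
               labels: torch.Tensor) -> torch.Tensor:  # N true categories y_i
    w_y = weights[:, labels].t()                       # N x M: weight vector W_{y_i} of each true category
    cos = F.cosine_similarity(features, w_y, dim=1)    # cos(W_{y_i}, f), in [-1, 1]
    return (1.0 - cos).mean()                          # small when every angle is small
```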
According to an embodiment of the present invention, preferably, as shown in fig. 3, after updating the convolutional neural network according to the loss value, the method further includes:
step S302, inputting a second picture to be tested;
Pictures to be tested are input; the number of pictures may be N (N ≥ 1), and the corresponding features are obtained through the trained neural network.
Step S304, obtaining a corresponding second characteristic through the convolutional neural network after the loss value is updated;
After the loss value is calculated in step S108, all parameters of the entire network are updated using back propagation; therefore, the picture to be tested is input into the updated convolutional neural network to obtain the corresponding feature.
Step S306, taking the second characteristic as input, and calculating a loss value of a loss function;
The loss value is calculated through the combined Softmax and cross-entropy loss function $\mathrm{Loss}_1$ and the angle loss function $\mathrm{Loss}_2$ defined above, with the second feature as input.
Step S308, determining the category of the object corresponding to the second picture.
In the testing stage, the feature passes through a Softmax layer to obtain the probabilities of all known classes (the probabilities sum to 1), and the class with the highest probability is selected as the class of the object corresponding to the picture.
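The test stage can be sketched as follows, reusing the FeatureExtractor and category weights from the sketches above; @torch.no_grad() simply disables gradient tracking during testing.

```python
import torch

@torch.no_grad()
def classify(model: torch.nn.Module,
             weights: torch.Tensor,                  # M x K category weight vectors
             images: torch.Tensor) -> torch.Tensor:  # N normalized pictures
    features = model(images)                         # N x M second features
    probs = (features @ weights).softmax(dim=1)      # N x K probabilities, each row sums to 1
    return probs.argmax(dim=1)                       # most probable category per picture
```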
Preferably, in this embodiment, the loss function is used to make the features trained in the updated convolutional neural network conform to the following presets: a smaller intra-class distance of the features; a larger inter-class distance of the features.
That is, the loss function ensures that, when the trained features in the updated convolutional neural network conform to the presets, the intra-class distance of the features is smaller and the inter-class distance of the features is larger.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowcharts, in some cases, the steps illustrated or described may be performed in an order different than presented herein.
According to an embodiment of the present invention, there is also provided an apparatus for implementing the above feature training method of a convolutional neural network. As shown in fig. 4, the apparatus includes: an extraction unit 10, configured to extract a first feature picture; a determining unit 20, configured to determine a feature map of the first feature picture and obtain a first feature according to the feature map; a loss function unit 30, configured to calculate a loss value of a loss function using the first feature as an input; and an inverse unit 40, configured to update the convolutional neural network according to the loss value; wherein the loss function is used to make the features trained in the updated convolutional neural network conform to preset categories.
In the extraction unit 10 of the embodiment of the application, N pictures are input and normalized so that all pixel values lie in [-1, 1]; the normalized pictures are then fed into the convolutional neural network.
The convolutional neural network structure comprises a plurality of convolutional layers, each followed by an activation layer; a corresponding feature map is obtained after each convolutional layer.
The normalized data is input into the convolutional neural network to obtain the corresponding feature maps.
In the determining unit 20 of the embodiment of the application, determining the feature map of the feature picture means obtaining the feature map according to its number of channels and its length and width.
For example, let the size of each feature map be c × h × w, where c is the number of channels of the feature map and h and w are its length and width; since N pictures are input, N feature maps are finally obtained.
In the loss function unit 30 of the embodiment of the application, the feature maps are taken as input to a fully connected layer in the convolutional neural network, which outputs multi-dimensional features.
For example, the N feature maps are used as input, and N×M-dimensional features are obtained through the fully connected layer, i.e., N features corresponding to the N pictures, each feature being M-dimensional.
The loss value of the loss function is calculated using the N×M-dimensional features and the class labels of the pictures as input.
The loss function in the inverse unit 40 of the embodiment of the application is used to make the features trained in the updated convolutional neural network conform to the preset categories.
Making the features conform to the preset categories may mean ensuring that the distance between features of the same class (intra-class) is smaller and the distance between features of different classes (inter-class) is larger.
Specifically, loss values of two loss functions are calculated: the first loss function is a combination of Softmax and cross entropy, and the second loss function is an angle loss function.
Preferably, in this embodiment, the loss function unit 30 includes a first loss function unit and a second loss function unit, wherein the first loss function unit implements the combined Softmax and cross-entropy loss function, and the second loss function unit implements the angle loss function.
In the first loss function unit, calculating the loss value of the loss function includes computing

$$\mathrm{Loss}_1 = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{W_{y_i}^{T}f_i}}{\sum_{j=1}^{M}e^{W_{j}^{T}f_i}}$$

wherein $W_{y_i}$ denotes the weight corresponding to $y_i$ and $N$ denotes the number of input pictures; the loss function averages, over the N input pictures, the negative log of the probability assigned to the true category.

Here f is the first feature obtained, $W_i$ is the weight vector corresponding to class i, and $W_{y_i}$ is therefore the weight vector corresponding to category $y_i$ (in this application there are M categories, and each input picture corresponds to a specific category $y_i$), $y_i$ being the real category corresponding to the input picture.

Multiplying $W_{y_i}^{T}$ by f yields a score, and the Softmax expression above gives the probability that f is judged to belong to category $y_i$.
In the second loss function unit, calculating the loss value of the loss function includes computing

$$\mathrm{Loss}_2 = \frac{1}{N}\sum_{i=1}^{N}\Bigl(1-\cos(W_{y_i},f_i)\Bigr),\qquad \cos(W_{y_i},f_i)=\frac{W_{y_i}^{T}f_i}{\lVert W_{y_i}\rVert\,\lVert f_i\rVert}$$

wherein $W_{y_i}$ denotes the weight corresponding to $y_i$, $N$ denotes the number of input pictures, and $y_i$ denotes the category corresponding to each input picture; the loss function averages $1-\cos(W_{y_i},f_i)$ over the N pictures.

Here f is the first feature obtained, and $W_{y_i}$ is the weight vector corresponding to category $y_i$ (in this application there are M categories, and each input picture corresponds to a specific category $y_i$), $y_i$ being the real category corresponding to the input picture.

$\cos(W_{y_i},f)$ denotes the cosine of the angle between $W_{y_i}$ and f; it lies in the range [-1, 1], and the closer it is to 1, the smaller the angle between the $W_{y_i}$ vector and the feature vector f.

By minimizing the average of $1-\cos(W_{y_i},f)$ over the N pictures, the loss function makes the angle between $W_{y_i}$ and f as small as possible.
Preferably, in this embodiment, the apparatus further includes a test unit for inputting a second picture to be tested, obtaining a corresponding second feature through the convolutional neural network updated by the loss value, calculating a loss value of the loss function using the second feature as an input, and determining the category of the object corresponding to the second picture.
The test unit of the embodiment of the application inputs the pictures to be tested; the number of pictures may be N (N ≥ 1), and the corresponding features are obtained through the trained neural network.
After the loss value is calculated in step S108, all parameters of the entire network are updated using back propagation; therefore, the picture to be tested is input into the updated convolutional neural network to obtain the corresponding feature.
The loss value is calculated through the combined Softmax and cross-entropy loss function $\mathrm{Loss}_1$ and the angle loss function $\mathrm{Loss}_2$ defined above, with the second feature as input.
In the testing stage, the feature passes through a Softmax layer to obtain the probabilities of all known classes (the probabilities sum to 1), and the class with the highest probability is selected as the class of the object corresponding to the picture.
Preferably, in this embodiment, the loss function is used to make the features trained in the updated convolutional neural network conform to the following presets: a smaller intra-class distance of the features; a larger inter-class distance of the features.
That is, the loss function ensures that, when the trained features in the updated convolutional neural network conform to the presets, the intra-class distance of the features is smaller and the inter-class distance of the features is larger.
The device implementing the feature training method of the convolutional neural network trains features with stronger discriminative capability, ensuring that the intra-class distance of the features is smaller and the inter-class distance of the features is larger. The feature training is mainly based on an angle-optimizing loss function combined with the Softmax cross-entropy loss function. Compared with features obtained by the traditional method using only Softmax cross entropy, the recognition rate of the features trained in the device of the embodiment of the application is improved by more than 1% on data sets such as Cifar10 and Cifar100: the original method achieves recognition accuracies of 92.5% and 69.24% on the two data sets respectively, while the device of the embodiment of the application achieves 93.7% and 72%.
Compared with L-Softmax, the method is easier to train. L-Softmax imposes a strong constraint on the features, which has the advantage of training features with a higher recognition rate, but its training process can be difficult to converge.
Specifically, in the apparatus of the embodiment of the application, the feature training method of the neural network proceeds as follows.
The method mainly targets object recognition based on a deep-learning convolutional neural network and comprises a training stage and a testing stage; the method is mainly used in the training stage and helps to train a model with stronger recognition capability.
Training stage: the whole convolutional neural network is regarded as two parts, the first part extracting features, and the second part calculating and optimizing the loss functions of the features.
S1: N pictures are input, where N is the number of input pictures in a batch; the N pictures are normalized so that all pixel values lie in [-1, 1].
S2: A convolutional neural network structure is adopted, composed of a plurality of convolutional layers, each followed by an activation layer; a corresponding feature map is obtained after each convolutional layer. The specific number and structure of the convolutional layers can be changed according to the specific task; only the output of the last layer is needed.
S3: The final feature maps are obtained, the size of each being c × h × w, where c is the number of channels and h and w are the length and width of the feature map; since N pictures are input, N feature maps are finally obtained.
S4: The N feature maps are used as input, and N×M-dimensional features are obtained through a fully connected layer, i.e., N features corresponding to the N pictures, each feature being M-dimensional.
S5: Taking the N×M-dimensional features and the class labels of the pictures as input, the loss value of the loss function is calculated. Two loss functions are included: the first is the combination of Softmax and cross entropy, and the second is the angle loss function. The specific formulas are:

$$\mathrm{Loss}_1 = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{W_{y_i}^{T}f_i}}{\sum_{j=1}^{M}e^{W_{j}^{T}f_i}}$$

$$\mathrm{Loss}_2 = \frac{1}{N}\sum_{i=1}^{N}\Bigl(1-\cos(W_{y_i},f_i)\Bigr),\qquad \cos(W_{y_i},f_i)=\frac{W_{y_i}^{T}f_i}{\lVert W_{y_i}\rVert\,\lVert f_i\rVert}$$
S6: After the loss value is calculated, all parameters of the whole network are updated using back propagation.
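Putting S4-S6 together, one training iteration might look like the sketch below, reusing the loss sketches above. The relative weighting of the two losses is an assumption, since the patent does not state a coefficient.

```python
import torch

def train_step(model, weights, optimizer, images, labels, angle_coeff=1.0):
    # `weights` is assumed to be a learnable nn.Parameter registered with the optimizer.
    features = model(images)                                        # S4: N x M features
    loss = (softmax_cross_entropy(features, weights, labels)        # S5: first loss
            + angle_coeff * angle_loss(features, weights, labels))  # S5: second loss
    optimizer.zero_grad()
    loss.backward()                                                 # S6: back propagation
    optimizer.step()                                                # update all network parameters
    return loss.item()
```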
Testing stage:
S1: Pictures to be tested are input; the number of pictures is N (N ≥ 1), and the corresponding features are obtained through the trained neural network.
S2: The features pass through a Softmax layer to obtain the probabilities of all known classes (the probabilities sum to 1), and the class with the highest probability is selected as the class of the object corresponding to each picture.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general-purpose computing device; they may be centralized on a single computing device or distributed across a network of multiple computing devices. Alternatively, they may be implemented as program code executable by a computing device, stored in a storage device and executed by the computing device, or fabricated as individual integrated circuit modules, or multiple modules or steps among them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (6)

1. A method for training features of a convolutional neural network, comprising:
extracting a first characteristic picture;
determining a feature map of the first feature picture, and acquiring a first feature according to the feature map; calculating a loss value of a loss function using the first feature as an input, wherein calculating the loss value of the loss function comprises: configuring a first loss function, wherein the first loss function is used as a combined loss function of Softmax and cross entropy; configuring a second loss function, wherein the second loss function is used as an angle loss function; and
updating the convolutional neural network according to the loss value;
the loss function is used for enabling the features trained in the updated convolutional neural network to accord with preset categories;
obtaining the probability of the category of the first feature through a first loss function;
reducing the intra-class distance of the features through a second loss function, and increasing the inter-class distance of the features;
the extracting the first feature picture comprises:
inputting N pictures, performing normalization pretreatment on the N pictures, and inputting the normalized pictures into a convolutional neural network to obtain a corresponding characteristic diagram;
taking the first feature as an input, calculating a loss value of a loss function comprises:
N feature maps are used as input, and N×M-dimensional features are obtained through a full connection layer; the N features correspond to the N pictures and each feature is M-dimensional; calculating a loss value of the loss function by taking the N×M-dimensional features and the class labels of the pictures as input;
wherein calculating the loss value of the loss function comprises: calculating a loss value of the first loss function and calculating a loss value of the second loss function;
calculating the loss value of the first loss function includes:
calculating, through the loss function, an average value over the N input pictures;
the loss function is

$$\mathrm{Loss}_1 = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{W_{y_i}^{T}f_i}}{\sum_{j=1}^{M}e^{W_{j}^{T}f_i}}$$

wherein f is the first feature obtained, $W_i$ is the weight vector corresponding to class i, so $W_{y_i}$ is the weight vector corresponding to category $y_i$; there are M categories, each input picture corresponds to a specific category $y_i$, and $y_i$ is the real category corresponding to the input picture;
multiplying $W_{y_i}^{T}$ by f yields a score, and the expression gives the probability that f is judged to belong to category $y_i$;
calculating the loss value of the second loss function includes:
calculating, through the loss function, the average value of $1-\cos(W_{y_i},f_i)$ over the N pictures;
the loss function is

$$\mathrm{Loss}_2 = \frac{1}{N}\sum_{i=1}^{N}\Bigl(1-\cos(W_{y_i},f_i)\Bigr),\qquad \cos(W_{y_i},f_i)=\frac{W_{y_i}^{T}f_i}{\lVert W_{y_i}\rVert\,\lVert f_i\rVert}$$

wherein f is the first feature obtained and $W_{y_i}$ is the weight vector corresponding to category $y_i$; there are M categories, each input picture corresponds to a specific category $y_i$, and $y_i$ is the real category corresponding to the input picture; $\cos(W_{y_i},f)$ denotes the cosine of the angle between $W_{y_i}$ and f, lies in the range [-1, 1], and the closer it is to 1, the smaller the angle between the $W_{y_i}$ vector and the feature vector f.
2. The feature training method of claim 1, further comprising, after updating the convolutional neural network according to the loss value:
inputting a second picture to be tested;
obtaining a corresponding second characteristic through the convolutional neural network after the loss value is updated;
calculating a loss value of a loss function using the second feature as an input;
and determining the category of the object corresponding to the second picture.
3. The feature training method according to any one of claims 1-2, wherein the loss function is used to make the trained features in the updated convolutional neural network conform to the following presets:
the intra-class distance of the features is smaller;
the inter-class distance of the features is larger.
4. A convolutional neural network feature training apparatus, comprising:
the extraction unit is used for extracting a first characteristic picture;
the determining unit is used for determining a feature map of the first feature picture and acquiring a first feature according to the feature map;
a loss function unit configured to calculate a loss value of a loss function using the first feature as an input, wherein the loss function unit includes: a first loss function unit and a second loss function unit,
the first loss function unit is used for the combined Softmax and cross-entropy loss function;
the second loss function unit is used for the angle loss function;
an inverse unit for updating the convolutional neural network according to the loss value; wherein the loss function is used for making the features trained in the updated convolutional neural network conform to preset categories;
obtaining the probability of the category of the first feature through a first loss function;
reducing the intra-class distance of the features through a second loss function, and increasing the inter-class distance of the features;
the extracting the first feature picture comprises:
inputting N pictures, performing normalization pretreatment on the N pictures, and inputting the normalized pictures into a convolutional neural network to obtain a corresponding characteristic diagram;
taking the first feature as an input, calculating a loss value of a loss function comprises:
taking the N feature maps as input, and obtaining N×M-dimensional features, namely N features, through a full connection layer, wherein the N features correspond to the N pictures and each feature is M-dimensional; calculating a loss value of the loss function by taking the N×M-dimensional features and the class labels of the pictures as input;
wherein calculating the loss value of the loss function comprises: calculating a loss value of the first loss function and calculating a loss value of the second loss function;
calculating the loss value of the first loss function includes:
calculating, through the loss function, an average value over the N input pictures;
the loss function is

$$\mathrm{Loss}_1 = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{W_{y_i}^{T}f_i}}{\sum_{j=1}^{M}e^{W_{j}^{T}f_i}}$$

wherein f is the first feature obtained, $W_i$ is the weight vector corresponding to class i, so $W_{y_i}$ is the weight vector corresponding to category $y_i$; there are M categories, each input picture corresponds to a specific category $y_i$, and $y_i$ is the real category corresponding to the input picture;
multiplying $W_{y_i}^{T}$ by f yields a score, and the expression gives the probability that f is judged to belong to category $y_i$;
calculating the loss value of the second loss function includes:
calculating, through the loss function, the average value of $1-\cos(W_{y_i},f_i)$ over the N pictures;
the loss function is

$$\mathrm{Loss}_2 = \frac{1}{N}\sum_{i=1}^{N}\Bigl(1-\cos(W_{y_i},f_i)\Bigr),\qquad \cos(W_{y_i},f_i)=\frac{W_{y_i}^{T}f_i}{\lVert W_{y_i}\rVert\,\lVert f_i\rVert}$$

wherein f is the first feature obtained and $W_{y_i}$ is the weight vector corresponding to category $y_i$; there are M categories, each input picture corresponds to a specific category $y_i$, and $y_i$ is the real category corresponding to the input picture; $\cos(W_{y_i},f)$ denotes the cosine of the angle between $W_{y_i}$ and f, lies in the range [-1, 1], and the closer it is to 1, the smaller the angle between the $W_{y_i}$ vector and the feature vector f.
5. The feature training device according to claim 4, further comprising: the test unit is used for inputting a second picture to be tested;
obtaining a corresponding second characteristic through the convolutional neural network after the loss value is updated;
calculating a loss value of a loss function using the second feature as an input;
and determining the category of the object corresponding to the second picture.
6. The feature training apparatus according to claim 4, wherein the inverse unit is further configured to make the features trained in the updated convolutional neural network, through the loss function, conform to the presets that the intra-class distance of the features is smaller and the inter-class distance of the features is larger.
CN201810096726.8A 2018-01-31 2018-01-31 Feature training method and device of convolutional neural network Active CN108197669B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810096726.8A CN108197669B (en) 2018-01-31 2018-01-31 Feature training method and device of convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810096726.8A CN108197669B (en) 2018-01-31 2018-01-31 Feature training method and device of convolutional neural network

Publications (2)

Publication Number Publication Date
CN108197669A CN108197669A (en) 2018-06-22
CN108197669B (en) 2021-04-30

Family

ID=62591623

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810096726.8A Active CN108197669B (en) 2018-01-31 2018-01-31 Feature training method and device of convolutional neural network

Country Status (1)

Country Link
CN (1) CN108197669B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110717359B (en) * 2018-07-12 2023-07-25 浙江宇视科技有限公司 Counter propagation optimization method and device based on mathematical statistics and electronic equipment
CN109165566B (en) * 2018-08-01 2021-04-27 中国计量大学 Face recognition convolutional neural network training method based on novel loss function
CN109977845B (en) * 2019-03-21 2021-08-17 百度在线网络技术(北京)有限公司 Driving region detection method and vehicle-mounted terminal
CN110414550B (en) * 2019-06-14 2022-07-29 北京迈格威科技有限公司 Training method, device and system of face recognition model and computer readable medium
CN110378278B (en) * 2019-07-16 2021-11-02 北京地平线机器人技术研发有限公司 Neural network training method, object searching method, device and electronic equipment
CN113420737B (en) * 2021-08-23 2022-01-25 成都飞机工业(集团)有限责任公司 3D printing pattern recognition method based on convolutional neural network

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106682734A (en) * 2016-12-30 2017-05-17 中国科学院深圳先进技术研究院 Method and apparatus for increasing generalization capability of convolutional neural network
CN107944410B (en) * 2017-12-01 2020-07-28 中国科学院重庆绿色智能技术研究院 Cross-domain facial feature analysis method based on convolutional neural network
CN107909145A (en) * 2017-12-05 2018-04-13 苏州天瞳威视电子科技有限公司 A kind of training method of convolutional neural networks model

Also Published As

Publication number Publication date
CN108197669A (en) 2018-06-22


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20180622

Assignee: Apple R&D (Beijing) Co., Ltd.

Assignor: BEIJING MOSHANGHUA TECHNOLOGY CO., LTD.

Contract record no.: 2019990000054

Denomination of invention: Characteristic training method and device of convolutional neural network

License type: Exclusive License

Record date: 20190211

GR01 Patent grant