CN110991568A

CN110991568A - Target identification method, device, equipment and storage medium

Info

Publication number: CN110991568A
Application number: CN202010133440.XA
Authority: CN
Inventors: 吴志伟; 李德紘; 张少文; 冯琰一
Original assignee: Guangzhou Xinke Jiadu Technology Co Ltd; PCI Suntek Technology Co Ltd
Current assignee: Guangzhou Xinke Jiadu Technology Co Ltd; PCI Technology Group Co Ltd
Priority date: 2020-03-02
Filing date: 2020-03-02
Publication date: 2020-04-10
Anticipated expiration: 2040-03-02
Also published as: CN110991568B

Abstract

The embodiment of the invention discloses a target identification method, a device, equipment and a storage medium, wherein the method comprises the steps of embedding a channel feature reactivation module and a fine feature self-enhancement module into a neural network structure to generate a first network model; connecting a gradient enhancement cross entropy loss function with the first network model to generate a second network model; training the second network model based on a small batch random gradient descent algorithm; modifying the trained second network model to obtain a reasoning network model; and inputting the image into the inference network model to obtain a target recognition result. According to the scheme, more subtle features can be learned and recognized, and the accuracy of target recognition is improved.

Description

Target identification method, device, equipment and storage medium

Technical Field

The embodiment of the application relates to the field of computers, in particular to a target identification method, a target identification device, target identification equipment and a storage medium.

Background

Object recognition refers to the process by which a particular object (or type of object) is distinguished from other objects (or other types of objects). It includes the identification of both two very similar objects and the identification of one type of object with another type of object. With the continuous development of computer technology, the application range of target identification is more and more extensive, such as identifying the targets of automobile models, flowers, plants, birds and the like.

In the prior art, a corresponding target object can be obtained by performing target identification on an image, however, in the existing target identification mode, for the case of many fine features, for example, when the target identification is performed on the image containing a plurality of faces and a plurality of vehicle types, efficient and accurate identification cannot be performed, which also causes a series of problems, particularly, under social emergency conditions such as high occurrence of epidemic situation, public safety incident and the like, it is necessary to accurately identify highly-dangerous persons such as determination of suspected or diagnosed patients and early warning and tracking of the action track of an important vehicle from a plurality of images, and how to efficiently and accurately perform the target identification is crucial at this time.

Disclosure of Invention

The embodiment of the invention provides a target identification method, a target identification device, target identification equipment and a storage medium, which can learn and identify more subtle features and improve the accuracy of target identification.

In a first aspect, an embodiment of the present invention provides a target identification method, where the method includes:

embedding a channel feature reactivation module and a fine feature self-enhancement module into a neural network structure to generate a first network model;

connecting a gradient enhancement cross entropy loss function with the first network model to generate a second network model;

training the second network model based on a small batch random gradient descent algorithm;

modifying the trained second network model to obtain a reasoning network model;

and inputting the image into the inference network model to obtain a target recognition result.

Optionally, embedding the channel feature reactivation module and the fine feature self-enhancement module into the neural network structure to generate a first network model, including:

the weight of the feature graph output in the neural network structure is redistributed according to the channel through a channel feature reactivation module;

and enhancing the non-significant features of the feature map output by the channel feature reactivation module through the fine feature self-enhancement module, and inhibiting the significant features.

Optionally, the reassigning, by the channel feature reactivation module, the weight of the feature map output in the neural network structure according to the channel includes:

compressing a feature map output in a neural network structure at a spatial level to obtain compressed features;

reactivating the compressed features to obtain activated weights;

and multiplying the activated weight by the input feature map according to channels.

Optionally, the fine feature self-enhancement module includes an enhancement mask and a suppression mask, and the enhancing, by the fine feature self-enhancement module, the non-significant feature of the feature map output by the channel feature reactivation module, and suppressing the significant feature include:

enhancing the area corresponding to the non-significant feature of the feature map output by the channel feature reactivation module according to the enhancement mask;

and according to the suppression mask, suppressing the area corresponding to the salient feature of the feature map output by the channel feature reactivation module.

Optionally, the embedding the channel feature reactivation module and the fine feature self-enhancement module into the neural network structure to generate a first network model includes:

deleting a global pooling layer of the residual network, and modifying the last layer of full-link layer into a convolutional layer with a convolutional kernel size of 1x1 and a channel number of C to obtain a characteristic diagram;

inputting the feature map into a channel feature reactivation module;

inputting the feature graph output by the channel feature reactivation module into a fine feature self-enhancement module and then connecting the fine feature self-enhancement module with a global pooling layer to generate a first network model.

Optionally, the connecting the gradient enhancement cross-entropy loss function to the first network model to generate a second network model includes:

and adjusting the loss value of the sample by a loss adjusting factor introduced in the gradient enhancement cross entropy loss function, and meanwhile, operating the negative sample meeting the preset condition to generate a second network model.

Optionally, the modifying the trained second network model to obtain the inference network model includes:

deleting the fine feature self-enhancement module and the gradient enhancement cross entropy loss function in the trained second network model, and accessing the Softmax loss function after the global pooling layer to obtain the inference network model.

In a second aspect, an embodiment of the present invention further provides an object recognition apparatus, where the apparatus includes:

the first processing module is used for embedding the channel feature reactivation module and the fine feature self-enhancement module into the neural network structure to generate a first network model;

the second processing module is used for connecting the gradient enhancement cross entropy loss function with the first network model to generate a second network model;

the training module is used for training the second network model based on a small batch random gradient descent algorithm;

the third processing module is used for modifying the trained second network model to obtain a reasoning network model;

and the recognition module is used for inputting the image into the reasoning network model to obtain a target recognition result.

In a third aspect, an embodiment of the present invention further provides an apparatus, where the apparatus includes:

one or more processors;

a storage device for storing one or more programs,

when the one or more programs are executed by the one or more processors, the one or more processors implement the object recognition method according to the embodiment of the present invention.

In a fourth aspect, the present invention further provides a storage medium containing computer-executable instructions, which when executed by a computer processor, are configured to perform the object recognition method according to the present invention.

In the embodiment of the invention, a channel feature reactivation module and a fine feature self-enhancement module are embedded into a neural network structure to generate a first network model, a gradient enhancement cross entropy loss function is connected with the first network model to generate a second network model, the second network model is trained based on a small batch random gradient descent algorithm, the trained second network model is modified to obtain a reasoning network model, and an image is input into the reasoning network model to obtain a target recognition result. The problem of accuracy rate reduction caused by unbalanced sample categories in the training process is solved by introducing the channel feature reactivation module, more fine features can be learned by a network model by introducing the fine feature self-enhancement module and the gradient enhancement cross entropy loss function, the method is very effective for improving the identification accuracy rate of similar categories, and is particularly suitable for identification tasks of fine-grained targets, such as identification of fine features of vehicle types and the like, the targets can be quickly and accurately identified in the acquired complex images, so that the target action tracks are further determined, and the method plays a vital role in management and control of public safety events such as epidemic situation prevention and control society.

Drawings

Fig. 1 is a flowchart of a target identification method according to an embodiment of the present invention;

FIG. 2 is a flow chart of another method for identifying objects according to an embodiment of the present invention;

FIG. 3 is a flow chart of another method for identifying objects according to an embodiment of the present invention;

FIG. 4 is a flow chart of another method for identifying objects according to an embodiment of the present invention;

fig. 5 is a schematic structural diagram of a second network model according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of an inference network model provided in an embodiment of the present invention;

FIG. 7 is a flow chart of another method for identifying objects according to an embodiment of the present invention;

fig. 8 is a block diagram of a target identification apparatus according to an embodiment of the present invention;

fig. 9 is a schematic structural diagram of an apparatus according to an embodiment of the present invention.

Detailed Description

The embodiments of the present invention will be described in further detail with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of and not restrictive on the broad invention. It should be further noted that, for convenience of description, only some structures, not all structures, relating to the embodiments of the present invention are shown in the drawings.

Fig. 1 is a flowchart of a target identification method according to an embodiment of the present invention, where the present embodiment is applicable to target identification, and the method may be executed by a device such as a computer, and specifically includes the following steps:

step S101, embedding the channel feature reactivation module and the fine feature self-enhancement module into a neural network structure to generate a first network model.

In one embodiment, the pre-designed channel feature reactivation module and the fine feature self-enhancement module are embedded in a neural network structure, which may be an existing neural network structure. Exemplary, such as a Resnet50 neural network architecture. The channel characteristic reactivation module is used for reallocating weights to the characteristic diagram output in the neural network structure according to the channels, and the fine characteristic self-enhancement module is used for enhancing the non-significant characteristics of the characteristic diagram output in the channel characteristic reactivation module and inhibiting the significant characteristics. The fine feature self-enhancement module comprises an enhancement mask and a suppression mask, and is used for enhancing the area corresponding to the non-significant feature of the feature diagram output by the channel feature reactivation module according to the enhancement mask and suppressing the area corresponding to the significant feature of the feature diagram output by the channel feature reactivation module according to the suppression mask.

And S102, connecting the gradient enhancement cross entropy loss function with the first network model to generate a second network model.

In one embodiment, a second network model is generated by connecting a pre-designed gradient enhancement cross-entropy loss function to the first network model. The gradient enhancement cross entropy loss function is used for supervising the training of the network, so that the network model can focus on samples which are difficult to distinguish, and the recognition rate of samples of similar categories is improved.

And S103, training the second network model based on a small batch random gradient descent algorithm.

After the second network model is obtained, the second network model is trained based on a small-batch stochastic gradient descent algorithm, and the neural network model is trained through the small-batch stochastic gradient descent algorithm to obtain model parameters. In one embodiment, the Adam optimizer may be used to train the neural network model to obtain model parameters, or other optimization algorithms commonly used in deep learning, such as SGD algorithm, RMSProp algorithm, and the like, may be used.

And S104, modifying the trained second network model to obtain a reasoning network model, and inputting the image into the reasoning network model to obtain a target recognition result.

In one embodiment, after the second network model is trained, it is modified to obtain the inference network model. Specifically, the modification mode may be to delete the fine feature self-enhancement module and the gradient enhancement cross entropy loss function in the trained second network model, and access the Softmax loss function after the global pooling layer to obtain the inference network model.

According to the scheme, the channel feature reactivation module and the fine feature self-enhancement module are embedded into the neural network structure to generate a first network model, the gradient enhancement cross entropy loss function is connected with the first network model to generate a second network model, the second network model is trained on the basis of a small-batch random gradient descent algorithm, the trained second network model is modified to obtain a reasoning network model, and an image is input into the reasoning network model to obtain a target recognition result. The method solves the problem of accuracy rate reduction caused by unbalanced sample categories in the training process by introducing the channel feature reactivation module, can enable the network model to learn more fine features by introducing the fine feature self-enhancement module and the gradient enhancement cross entropy loss function, is very effective in improving the accuracy rate of similar categories, and is particularly suitable for identification tasks of fine-grained targets, such as vehicle type identification, flower and plant identification, bird identification and the like.

Fig. 2 is a flowchart of another target identification method according to an embodiment of the present invention, which shows a specific method for processing data by a channel feature reactivation module. As shown in fig. 2, the technical solution is as follows:

step S201, embedding the channel feature reactivation module into a neural network structure.

Step S202, compressing the feature diagram output in the neural network structure in a spatial layer through a channel feature reactivation module to obtain a compressed feature, reactivating the compressed feature to obtain an activated weight, and multiplying the input feature diagram by the activated weight according to a channel.

In one embodiment, the following common variables are defined: training image is I, class label is

Where L is the set of all class labels, C is the number of channels, and the feature map of the selected neural network structure (backbone network) output is

Expressed in the form of a set

Where W, H is the width and height of the feature map and R is the real number field in the mathematical formula.

Specifically, the channel feature reactivation module is designed as follows:

the input of the channel feature reactivation module is a feature map

To, for

Compressing at spatial level to obtain compressed features

The compression formula is:

，

then go right again

Performing reactivation to obtain activated weight

The reactivation formula is:

，

finally, multiplying the input characteristic diagram by the weight according to the channel to obtain

Expressed in the form of a set

The calculation formula is as follows:

。

step S203, embedding the fine feature self-enhancement module into the neural network structure, enhancing the non-significant features of the feature map output by the channel feature reactivation module through the fine feature self-enhancement module, and inhibiting the significant features to generate a first network model.

And step S204, connecting the gradient enhancement cross entropy loss function with the first network model to generate a second network model.

And S205, training the second network model based on a small batch random gradient descent algorithm.

And S206, modifying the trained second network model to obtain a reasoning network model, and inputting the image into the reasoning network model to obtain a target recognition result.

According to the scheme, the designed channel feature reactivation module is embedded into the neural network structure, and the weight is redistributed to the feature graph output by the neural network structure according to the channels, so that the problem of accuracy reduction caused by unbalanced sample types in the training process is solved.

Fig. 3 is a flowchart of another target identification method according to an embodiment of the present invention, which shows a specific method for data processing by the fine feature self-enhancement module. As shown in fig. 3, the technical solution is as follows:

and S301, embedding the channel feature reactivation module into a neural network structure.

Step S302, compressing the feature diagram output in the neural network structure at a spatial level through a channel feature reactivation module to obtain a compressed feature, reactivating the compressed feature to obtain an activated weight, and multiplying the input feature diagram by the activated weight according to a channel.

Step S303, embedding the fine-feature self-enhancement module into the neural network structure, enhancing a region corresponding to the non-salient feature of the feature map output from the channel feature reactivation module according to the enhancement mask, and suppressing a region corresponding to the salient feature of the feature map output from the channel feature reactivation module according to the suppression mask, thereby generating a first network model.

In one embodiment, the fine feature self-enhancement module includes an enhancement mask and a suppression mask, and the specific design of the module is as follows:

the input of the fine feature self-enhancement module is an output feature diagram of the channel feature reactivation module

Defining the output of the fine feature self-enhancement as

. It is determined by a mask which regions need enhancement and which regions need suppression, the mask comprising an enhancement mask and a suppression mask. Defining an enhancement mask as

，

A value of 1 or 0, 1 indicating that enhancement is required, 0 indicating that enhancement is not required, and an enhancement factor of

Indicating the degree of enhancement of the characteristic value, the suppression mask being

，

A value of 1 or 0, 1 indicating that inhibition is required, 0 indicating that inhibition is not required, and the inhibition factor is

The degree of suppression of the characteristic value is indicated. And enhancing or suppressing the corresponding area in the input feature map according to the enhancement mask and the suppression mask, multiplying the corresponding position of the input feature map by an enhancement factor when the enhancement mask at a certain position is 1, multiplying the corresponding position of the input feature map by a suppression factor when the suppression mask at a certain position is 1, and keeping the rest positions unchanged. The calculation formula of the fine feature self-enhancement module is as follows:

wherein the content of the first and second substances,

and

cannot be simultaneously 1.

The enhancement mask is used for calculating an area needing enhancement, the peak value of the input feature map represents a significant feature, besides the peak value, many non-significant features, namely subtle features exist, and the non-significant features need to be enhanced in order to improve the learning capability of the network model for the subtle features. Will input the feature map

Is divided into

Block, defining the m-th row and n-th column of the feature block as

The characteristic diagram is represented in a set form by blocks

Similarly, the enhancement mask is divided into

Block, define m row n column mask block as

The enhancement mask is represented in a set form by blocks

. The non-peak area in the feature block has a certain probability of being a fine feature, so that the corresponding area in the mask block is marked as 1 at random according to the probability p, and the rest positions are 0, namely:

where p represents a probability value, obeying a Bernoulli distribution,

representing the maximum value of the feature block, the enhancement mask corresponding position is 1 if the probability value is greater than or equal to 0.5 and is not the peak position, otherwise 0.

The suppression mask is used for calculating the area needing to be suppressed, the peak value in the input feature diagram represents the significant feature, and the random suppression of the peak value area according to a certain probability can improve the attention of the network model to the non-significant area or the fine feature. The calculation formula of the suppression mask is:

wherein the content of the first and second substances,

representing probability values, obeying bernoulli distributions,

representing the maximum value in the feature map, the suppression mask corresponds to a position of 1 if the probability value is greater than or equal to 0.5 and is the peak position, and 0 otherwise.

And step S304, connecting the gradient enhancement cross entropy loss function with the first network model to generate a second network model.

And S305, training the second network model based on a small batch random gradient descent algorithm.

And S306, modifying the trained second network model to obtain a reasoning network model, and inputting the image into the reasoning network model to obtain a target recognition result.

According to the scheme, the designed fine feature self-enhancement module can enhance the non-significant features in the feature diagram and inhibit the significant features so as to achieve the purpose of fine feature self-enhancement, and the module can be flexibly embedded into a backbone network model of a classical neural network structure so as to improve the recognition capability of the existing network model on similar samples and improve the recognition accuracy of similar class samples.

Fig. 4 is a flowchart of another target identification method according to an embodiment of the present invention, which shows a specific method for generating a second network model, and as shown in fig. 4, the method specifically includes the following steps:

step S401, deleting a global pooling layer of the residual network, modifying the last full-connection layer into a convolution layer with a convolution kernel size of 1x1 and a channel number of C to obtain a feature map, inputting the feature map into a channel feature reactivation module, inputting the feature map output by the channel feature reactivation module into a fine feature self-enhancement module, and then connecting the fine feature self-enhancement module with the global pooling layer to generate a first network model.

In an embodiment, a backbone network is obtained by deleting a global pooling layer and a full link layer of a residual network, a structural diagram of a first network model is shown in fig. 5, fig. 5 is a schematic structural diagram of a second network model provided in an embodiment of the present invention, as shown in the figure, original training data is input to the backbone network (the backbone network is a residual network in which the global pooling layer and the full link layer are deleted), a last full link layer of the original residual network is modified into a convolutional layer with a convolutional kernel size of 1x1 and a channel number of C, an obtained feature map is input to a channel feature reactivation module, and the channel feature reactivation module is connected to a fine feature self-enhancement module and then connected to the global pooling layer to obtain the first network model.

And S402, connecting the gradient enhancement cross entropy loss function with the first network model to generate a second network model.

As shown in fig. 5, after the fine feature self-enhancement module is accessed to the global pooling layer to obtain the feature map, the gradient enhancement cross entropy loss function is connected to obtain the final second network model, which is a complete training network model.

And S403, training the second network model based on a small batch random gradient descent algorithm.

And S404, modifying the trained second network model to obtain a reasoning network model, and inputting the image into the reasoning network model to obtain a target recognition result.

In an embodiment, as shown in fig. 6, fig. 6 is a schematic structural diagram of the inference network model provided in the embodiment of the present invention, specifically, the fine feature self-enhancement module and the gradient enhancement cross entropy loss function in the trained second network model are deleted, and the Softmax loss function is accessed after the global pooling layer to obtain the inference network model. Correspondingly, after the inference model is obtained, test data can be input into the model to obtain a corresponding target recognition result.

According to the scheme, the channel feature reactivation module and the fine feature self-enhancement module are embedded into the neural network structure to generate a first network model, and the gradient enhancement cross entropy loss function is connected with the first network model to generate a second network model; training the second network model based on a small batch random gradient descent algorithm; and modifying the trained second network model to obtain a reasoning network model, and inputting the image into the reasoning network model to obtain a target recognition result.

Fig. 7 is a flowchart of another target identification method according to an embodiment of the present invention, which provides a specific method for generating a second network model by connecting a gradient-enhanced cross entropy loss function to a first network model, and as shown in fig. 7, the method specifically includes the following steps:

and S701, deleting a global pooling layer of the residual network, modifying the last full-link layer into a convolution layer with a convolution kernel size of 1x1 and a channel number of C to obtain a feature map, inputting the feature map into a channel feature reactivation module, and inputting the feature map output by the channel feature reactivation module into a fine feature self-enhancement module to generate a first network model.

Step S702, adjusting the loss value of the sample through a loss adjusting factor introduced in the gradient enhanced cross entropy loss function, and meanwhile, operating the negative sample meeting the preset condition to generate a second network model.

In particular, the fine features are self-enhanced from the output of the module

Then connected with a global pooling layer to obtain scores

Expressed in the form of a set

The conventional cross entropy loss function is

，

Wherein

For training the real label of the image I, the conventional cross entropy loss function treats all classes equally, so that the problem of identifying similar class samples in a fine-grained target identification task cannot be well solved. According to the gradient enhancement cross entropy loss function provided by the scheme, when p (s, l) is calculated, only K (K) with the highest score is considered in negative samples<Class = C) and introduces loss adjustment factors

The loss value of the samples difficult to be distinguished is adjusted, so that the network can focus on the identification of the samples difficult to be distinguished, and the identification rate of similar category samples in the target identification task is improved. Definition of

For positive exemplar labels, the set of labels for all negative exemplar classes is

All negative examplesThe score set of this class is

Will be

Ranked from high to low by score, the Kth score is

The set of categories in which topK scores are high is

Then the GBCE is calculated as:

wherein K is,

For a hyper-parameter, the smaller K,

the larger the relative loss value of the hard-to-separate sample, the smaller the relative loss value of the easy-to-separate sample, and vice versa. The relative loss value of the samples difficult to separate is increased, and the relative loss value of the samples easy to separate is reduced, so that the capability of a network model for focusing the samples difficult to separate is improved.

And S703, training the second network model based on a small batch random gradient descent algorithm.

And step S704, modifying the trained second network model to obtain a reasoning network model, and inputting the image into the reasoning network model to obtain a target recognition result.

Therefore, by introducing the designed gradient enhanced cross entropy loss function, the problem of identifying similar category samples in a fine-grained target identification task can be well solved, so that the network can focus on identifying difficultly-classified samples, and the identification rate of the similar category samples in the target identification task is improved.

Fig. 8 is a block diagram of a target identification apparatus according to an embodiment of the present invention, where the apparatus is configured to execute the target identification method according to the above embodiment, and has corresponding functional modules and beneficial effects of the execution method. As shown in fig. 8, the apparatus specifically includes: a first processing module 101, a second processing module 102, a training module 103, a third processing module 104, and a recognition module 105, wherein,

the first processing module 101 is used for embedding the channel feature reactivation module and the fine feature self-enhancement module into a neural network structure to generate a first network model; a second processing module 102, configured to connect a gradient enhancement cross entropy loss function to the first network model to generate a second network model; the training module 103 is used for training the second network model based on a small batch random gradient descent algorithm; the third processing module 104 is configured to modify the trained second network model to obtain an inference network model; and the recognition module 105 is used for inputting the image into the inference network model to obtain a target recognition result.

In a possible embodiment, the first processing module 101 is specifically configured to:

reactivating the compressed features to obtain activated weights;

inputting the feature map into a channel feature reactivation module;

In a possible embodiment, the second processing module 102 is specifically configured to:

In a possible embodiment, the third processing module 104 is specifically configured to:

Fig. 9 is a schematic structural diagram of an apparatus according to an embodiment of the present invention, as shown in fig. 9, the apparatus includes a processor 201, a memory 202, an input device 203, and an output device 204; the number of the processors 201 in the device may be one or more, and one processor 201 is taken as an example in fig. 9; the processor 201, the memory 202, the input device 203 and the output device 204 in the apparatus may be connected by a bus or other means, and the connection by the bus is exemplified in fig. 9.

The memory 202, which is a computer-readable storage medium, may be used for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the object recognition method in the embodiments of the present invention. The processor 201 executes various functional applications of the device and data processing by executing software programs, instructions and modules stored in the memory 202, that is, implements the above-described object recognition method.

The memory 202 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the memory 202 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the memory 202 may further include memory located remotely from the processor 201, which may be connected to the device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input device 203 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function controls of the apparatus. The output device 204 may include a display device such as a display screen.

Embodiments of the present invention also provide a storage medium containing computer-executable instructions, which when executed by a computer processor, perform a method of object recognition, the method comprising:

modifying the trained second network model to obtain a reasoning network model;

From the above description of the embodiments, it is obvious for those skilled in the art that the embodiments of the present invention can be implemented by software and necessary general hardware, and certainly can be implemented by hardware, but the former is a better implementation in many cases. Based on such understanding, the technical solutions of the embodiments of the present invention may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions to make a computer device (which may be a personal computer, a server, or a network device) perform the methods described in the embodiments of the present invention.

It should be noted that, in the embodiment of the object recognition apparatus, the included units and modules are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the embodiment of the invention.

It should be noted that the foregoing is only a preferred embodiment of the present invention and the technical principles applied. Those skilled in the art will appreciate that the embodiments of the present invention are not limited to the specific embodiments described herein, and that various obvious changes, adaptations, and substitutions are possible, without departing from the scope of the embodiments of the present invention. Therefore, although the embodiments of the present invention have been described in more detail through the above embodiments, the embodiments of the present invention are not limited to the above embodiments, and many other equivalent embodiments may be included without departing from the concept of the embodiments of the present invention, and the scope of the embodiments of the present invention is determined by the scope of the appended claims.

Claims

1. An object recognition method, comprising:

modifying the trained second network model to obtain a reasoning network model;

2. The method of claim 1, wherein embedding the channel feature reactivation module and the fine feature self-enhancement module in a neural network structure generates a first network model, comprising:

3. The method of claim 2, wherein the channel feature re-activation module re-assigns weights to feature maps output in the neural network structure on a channel-by-channel basis, comprising:

reactivating the compressed features to obtain activated weights;

4. The method of claim 2, wherein the fine feature self-enhancement module comprises an enhancement mask and a suppression mask, and the enhancing the non-salient features of the feature map output from the channel feature reactivation module by the fine feature self-enhancement module and the suppressing the salient features comprises:

5. The method of claim 1, wherein embedding the channel feature reactivation module and the fine feature self-enhancement module in a neural network structure generates a first network model, comprising:

inputting the feature map into a channel feature reactivation module;

6. The method of claim 5, wherein said connecting a gradient enhancement cross-entropy loss function to said first network model generates a second network model, comprising:

7. The method according to any one of claims 1-6, wherein modifying the trained second network model to obtain the inference network model comprises:

8. An object recognition apparatus, comprising:

9. An object recognition device, the device comprising: one or more processors; storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to carry out the object recognition method of any one of claims 1-7.

10. A storage medium containing computer-executable instructions for performing the object recognition method of any one of claims 1-7 when executed by a computer processor.