Disclosure of Invention
The main object of the present invention is to provide an image recognition method, an image recognition apparatus, a storage medium, and a device, so as to solve the technical problem in the prior art of how to recognize an image against a complex background.
In order to achieve the above object, the present invention provides an image recognition method, including the steps of:
acquiring an image to be identified;
determining the target characteristics and the first image category probability of the image to be identified through a preset classification network;
preprocessing the image to be identified through a preset area positioning network according to the target characteristics to obtain a target image;
determining a second image category probability of the target image through the preset classification network, and determining the target category probability according to the first image category probability and the second image category probability;
and determining the image type of the image to be identified according to the target category probability.
Preferably, the determining, by a preset classification network, the target feature and the first image category probability of the image to be recognized includes:
extracting the features of the image to be recognized through a preset classification network to obtain target features;
and determining the first image category probability of the image to be recognized through a preset probability distribution model according to the target characteristics.
Preferably, after the determining the target feature and the first image category probability of the image to be recognized through a preset classification network, the image recognition method further includes:
performing feature splicing on the image to be recognized through a preset multi-scale feature fusion model according to the target feature to obtain an image to be processed;
correspondingly, the preprocessing the image to be recognized through a preset area positioning network according to the target feature to obtain a target image, including:
and preprocessing the image to be processed through a preset area positioning network according to the target characteristics to obtain a target image.
Preferably, the preprocessing the image to be processed according to the target feature through a preset area positioning network to obtain a target image includes:
determining a target intercepting area through a preset area positioning network, and intercepting the image to be processed according to the target intercepting area to obtain an image to be adjusted;
and adjusting the size of the image corresponding to the image to be adjusted to a preset size to obtain a target image.
Preferably, the determining a target capture area through a preset area positioning network, and capturing the image to be processed according to the target capture area to obtain the image to be adjusted includes:
determining an intercepted central point and an intercepted side length through a preset area positioning network, and determining a target intercepted area according to the intercepted central point and the intercepted side length;
and carrying out image interception on the image to be processed according to the target interception area to obtain an image to be adjusted.
Preferably, after the determining a target intercepting area through a preset area positioning network and intercepting the image to be processed according to the target intercepting area to obtain the image to be adjusted, the image recognition method further includes:
determining an initial loss function value of the image to be identified through a preset loss function model according to the first image category probability;
determining a second image category probability of the target image through the preset classification network, and determining a target loss function value of the image to be adjusted through the preset loss function model according to the second image category probability;
determining whether the target loss function value is less than the initial loss function value;
and if so, executing the step of adjusting the size of the image corresponding to the image to be adjusted to a preset size to obtain a target image.
Preferably, the determining, by the preset classification network, a second image category probability of the target image, and determining a target category probability according to the first image category probability and the second image category probability includes:
determining a second image category probability of the target image through the preset classification network;
and determining the target class probability through a preset weight algorithm according to the first image class probability and the second image class probability.
Furthermore, to achieve the above object, the present invention also proposes an image recognition apparatus comprising a memory, a processor and an image recognition program stored on the memory and executable on the processor, the image recognition program being configured to implement the steps of the image recognition method as described above.
Furthermore, to achieve the above object, the present invention also proposes a storage medium having stored thereon an image recognition program which, when executed by a processor, implements the steps of the image recognition method as described above.
In addition, to achieve the above object, the present invention also provides an image recognition apparatus including: an acquisition module, a target feature extraction module, a preprocessing module, a target category probability determination module and an identification module;
the acquisition module is used for acquiring an image to be identified;
the target feature extraction module is used for determining the target features and the first image category probability of the image to be identified through a preset classification network;
the preprocessing module is used for preprocessing the image to be recognized through a preset area positioning network according to the target characteristics to obtain a target image;
the target class probability determining module is used for determining a second image class probability of the target image through the preset classification network and determining the target class probability according to the first image class probability and the second image class probability;
and the identification module is used for determining the image category of the image to be identified according to the target category probability.
In the invention, an image to be recognized is obtained, a target feature and a first image category probability of the image to be recognized are determined through a preset classification network, the image to be recognized is preprocessed through a preset area positioning network according to the target feature to obtain a target image, a second image category probability of the target image is determined through the preset classification network, the target category probability is determined according to the first image category probability and the second image category probability, and the image category of the image to be recognized is determined according to the target category probability; according to the image recognition method and device, the image to be recognized is intercepted through the preset classification network and the preset area positioning network, the target image is obtained, and the image category of the image to be recognized is determined according to the first image category probability of the image to be recognized and the second image category probability of the target image, so that the influence of environmental factors on image recognition can be avoided, and user experience is improved.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of an image recognition device in a hardware operating environment according to an embodiment of the present invention.
As shown in fig. 1, the image recognition apparatus may include: a processor 1001, such as a Central Processing Unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display); optionally, the user interface 1003 may further include a standard wired interface and a wireless interface, and in the present invention the wired interface of the user interface 1003 may be a USB interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as a disk memory. Alternatively, the memory 1005 may be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration shown in fig. 1 does not constitute a limitation of the image recognition apparatus, which may include more or fewer components than those shown, combine certain components, or use a different arrangement of components.
As shown in FIG. 1, memory 1005, identified as one type of computer storage medium, may include an operating system, a network communication module, a user interface module, and an image recognition program.
In the image recognition device shown in fig. 1, the network interface 1004 is mainly used for connecting to a background server and performing data communication with the background server; the user interface 1003 is mainly used for connecting user equipment; the image recognition apparatus calls an image recognition program stored in the memory 1005 through the processor 1001 and performs the image recognition method provided by the embodiment of the present invention.
Based on the above hardware structure, embodiments of the image recognition method of the present invention are provided.
Referring to fig. 2, fig. 2 is a flowchart illustrating an image recognition method according to a first embodiment of the present invention.
In a first embodiment, the image recognition method comprises the steps of:
Step S10: acquiring an image to be identified.
It should be understood that the main implementation body of the present embodiment is the image recognition device, wherein the image recognition device may be an electronic device such as a personal computer or a server with an arithmetic function and an information receiving function.
Step S20: determining the target characteristics and the first image category probability of the image to be recognized through a preset classification network.
It can be understood that the step of determining, by the image recognition device, the target feature and the first image category probability of the image to be recognized through the preset classification network may be to perform feature extraction on the image to be recognized through the preset classification network to obtain the target feature, and then determine the first image category probability of the image to be recognized through a preset probability distribution model according to the target feature.
It should be noted that the preset classification network may be a network composed of a convolutional layer, a pooling layer, and an activation layer. The preset probability distribution model may be represented as follows:
$$p(X) = f(W_c * X)$$
wherein $X$ is the image to be recognized, $W_c$ denotes the parameters of the preset classification network, $*$ denotes the convolution calculation, $f(\cdot)$ is the function from the feature layer to the fully connected layer, and $p(X)$ is the image category probability.
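The mapping from features to category probabilities can be illustrated with a minimal sketch. It assumes, purely for illustration, that the features have already been extracted and flattened, and that $f(\cdot)$ is a single fully connected layer followed by a softmax; the names `category_probabilities`, `fc_weights` and `fc_bias` and all numeric values are hypothetical and do not come from the embodiment.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def category_probabilities(features, fc_weights, fc_bias):
    """Map a flattened feature vector to category probabilities.

    features   : list of floats standing in for the extracted features W_c * X
    fc_weights : one weight row per category (hypothetical values)
    fc_bias    : one bias per category (hypothetical values)
    """
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(fc_weights, fc_bias)
    ]
    return softmax(scores)

features = [0.5, 1.0, -0.5]
fc_weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
fc_bias = [0.0, 0.0]
probs = category_probabilities(features, fc_weights, fc_bias)
```

In this toy setup the second category receives the larger score, so it also receives the larger probability.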
Step S30: preprocessing the image to be recognized through a preset area positioning network according to the target characteristics to obtain a target image.
It can be understood that the image recognition device preprocessing the image to be recognized through the preset area positioning network according to the target feature to obtain the target image may be: determining a target intercepting area through the preset area positioning network, performing image interception on the image to be recognized according to the target intercepting area to obtain an image to be adjusted, and adjusting the size of the image to be adjusted to a preset size to obtain the target image.
It should be noted that the predetermined area location network may be composed of a 1 × 1 convolutional layer and two fully connected layers.
Step S40: determining a second image category probability of the target image through the preset classification network, and determining the target category probability according to the first image category probability and the second image category probability.
It should be understood that the image recognition device determining the second image category probability of the target image through the preset classification network may be: performing feature extraction on the target image through the preset classification network to obtain features of the target image, and then determining the second image category probability of the target image through a preset probability distribution model according to the features of the target image. The image recognition device determining the target class probability according to the first image class probability and the second image class probability may be: determining the target class probability through a preset weight algorithm according to the first image class probability and the second image class probability.
Step S50: determining the image category of the image to be identified according to the target category probability.
It should be understood that the image recognition device determining the image category of the image to be recognized according to the target category probability may be to take the image category with the highest target category probability as the image category of the image to be recognized.
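The flow of steps S10 to S50 can be sketched as follows. This is only an outline under stated assumptions: the two networks are passed in as caller-supplied functions, the fusion is taken to be a weighted sum with hypothetical weights, and `recognize`, `toy_classify` and `toy_crop` are illustrative names rather than parts of the embodiment.

```python
def recognize(image, classify, locate_and_crop, weights=(0.5, 0.5)):
    """Run steps S10 to S50 with caller-supplied stand-ins for the networks.

    classify(image)               -> (features, probabilities)
    locate_and_crop(image, feats) -> target image
    weights                       -> hypothetical fusion weights
    """
    features, first_probs = classify(image)            # step S20
    target_image = locate_and_crop(image, features)    # step S30
    _, second_probs = classify(target_image)           # step S40, same network
    fused = [
        weights[0] * p1 + weights[1] * p2
        for p1, p2 in zip(first_probs, second_probs)
    ]
    # Step S50: pick the category with the highest target category probability.
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused

# Toy stand-ins so the sketch can be run end to end.
def toy_classify(image):
    mean = sum(image) / len(image)
    return [mean], [1.0 - mean, mean]

def toy_crop(image, features):
    return image[1:3]

category, fused = recognize([0.0, 0.9, 0.8, 0.1], toy_classify, toy_crop)
```

With these toy inputs the whole image slightly favours category 0, the cropped region strongly favours category 1, and the fused result selects category 1.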
In a first embodiment, an image to be recognized is obtained, a target feature and a first image category probability of the image to be recognized are determined through a preset classification network, the image to be recognized is preprocessed through a preset area positioning network according to the target feature to obtain a target image, a second image category probability of the target image is determined through the preset classification network, the target category probability is determined according to the first image category probability and the second image category probability, and an image category of the image to be recognized is determined according to the target category probability; in the embodiment, the image to be recognized is intercepted through the preset classification network and the preset area positioning network to obtain the target image, and the image category of the image to be recognized is determined according to the first image category probability of the image to be recognized and the second image category probability of the target image, so that the influence of environmental factors on image recognition can be avoided, and the user experience is improved.
Referring to fig. 3, fig. 3 is a flowchart illustrating a second embodiment of the image recognition method according to the present invention, and the second embodiment of the image recognition method according to the present invention is proposed based on the first embodiment shown in fig. 2.
In the second embodiment, the step S20 includes:
Step S201: performing feature extraction on the image to be identified through a preset classification network to obtain target features.
It should be noted that the preset classification network may be a network composed of a convolutional layer, a pooling layer, and an activation layer.
Step S202: determining the first image category probability of the image to be recognized through a preset probability distribution model according to the target characteristics.
It is understood that the preset probability distribution model can be represented as follows:
$$p(X) = f(W_c * X)$$
wherein $X$ is the image to be recognized, $W_c$ denotes the parameters of the preset classification network, $*$ denotes the convolution calculation, $f(\cdot)$ is the function from the feature layer to the fully connected layer, and $p(X)$ is the image category probability.
In the second embodiment, after the step S20, the method further includes:
Step S20': performing feature splicing on the image to be recognized through a preset multi-scale feature fusion model according to the target features to obtain the image to be processed.
It should be understood that, objects with different sizes exist in the image to be recognized, different objects have different features, a simple object can be distinguished by using a shallow feature, a complex object needs to be distinguished by using a deep feature, and in order to recognize the objects with different sizes, the image recognition device may perform feature stitching on the image to be recognized through a preset multi-scale feature fusion model according to the object features to obtain the image to be processed.
In a specific implementation, for example, the image recognition device may sample the features of the last convolutional layer of the classification network to obtain a feature of size $N \times N \times M_1$, sample the features of the penultimate convolutional layer of the classification network to obtain a feature of size $N \times N \times M_2$, and splice the two features to obtain a fused feature of size $N \times N \times M$, where $M = M_1 + M_2$; the image to be processed is then obtained according to the fused feature.
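The splicing step amounts to concatenation along the channel axis. The sketch below assumes both feature maps have already been sampled to the same spatial size $N \times N$ and are stored as nested lists; the function name `concat_channels` and the sample values are illustrative only.

```python
def concat_channels(map_a, map_b):
    """Concatenate two N x N feature maps along the channel axis.

    map_a has shape N x N x M1 and map_b has shape N x N x M2 (nested lists);
    the result has shape N x N x (M1 + M2).
    """
    same_rows = len(map_a) == len(map_b)
    same_cols = all(len(ra) == len(rb) for ra, rb in zip(map_a, map_b))
    if not (same_rows and same_cols):
        raise ValueError("feature maps must share the same spatial size")
    return [
        [list(pa) + list(pb) for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(map_a, map_b)
    ]

# 2 x 2 maps with M1 = 1 and M2 = 2 channels (made-up values).
last_layer = [[[1.0], [2.0]], [[3.0], [4.0]]]
second_last_layer = [[[0.1, 0.2], [0.3, 0.4]], [[0.5, 0.6], [0.7, 0.8]]]
fused = concat_channels(last_layer, second_last_layer)
```

Each spatial position of `fused` now carries $M_1 + M_2 = 3$ channels, with the deep-layer channel first and the shallower-layer channels after it.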
Accordingly, the step S30 includes:
Step S30': preprocessing the image to be processed through a preset area positioning network according to the target characteristics to obtain a target image.
It can be understood that the image recognition device preprocessing the image to be processed through the preset area positioning network according to the target feature to obtain the target image may be: determining a target intercepting area through the preset area positioning network, performing image interception on the image to be processed according to the target intercepting area to obtain an image to be adjusted, and adjusting the size of the image to be adjusted to a preset size to obtain the target image.
It should be noted that the predetermined area location network may be composed of a 1 × 1 convolutional layer and two fully connected layers.
In the second embodiment, the step S40 includes:
Step S401: determining a second image category probability of the target image through the preset classification network.
It can be understood that the step of determining, by the image recognition device, the second image category probability of the target image through the preset classification network may be to perform feature extraction on the target image through the preset classification network to obtain features of the target image, and then determine the second image category probability of the target image through a preset probability distribution model according to the features of the target image.
Step S402: determining the target class probability through a preset weight algorithm according to the first image class probability and the second image class probability.
It should be noted that the preset weight algorithm combines the image category probabilities with their corresponding weights to obtain the target class probability, wherein $x_n$ represents the n-th image category probability and $w_n$ represents the weight corresponding to the n-th image category probability.
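One plausible reading of the weight algorithm is a weighted sum of the per-category probabilities. The sketch below makes that assumption explicit and additionally divides by the total weight so that the result remains a probability distribution; the normalisation, the function name `fuse_probabilities` and the example weights are assumptions, not details given in the embodiment.

```python
def fuse_probabilities(prob_lists, weights):
    """Weighted combination of several per-category probability lists.

    prob_lists : list of probability lists, one per image (x_n)
    weights    : one weight per probability list (w_n)
    """
    if len(prob_lists) != len(weights):
        raise ValueError("one weight is needed per probability list")
    total_weight = sum(weights)
    n_categories = len(prob_lists[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists)) / total_weight
        for c in range(n_categories)
    ]

first = [0.6, 0.4]    # first image category probability (whole image)
second = [0.2, 0.8]   # second image category probability (target image)
target = fuse_probabilities([first, second], weights=[1.0, 3.0])
```

Giving the target image three times the weight of the whole image moves the fused probabilities to roughly 0.3 and 0.7.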
In a second embodiment, an image to be recognized is obtained, feature extraction is performed on the image to be recognized through a preset classification network to obtain target features, a first image category probability of the image to be recognized is determined through a preset probability distribution model according to the target features, feature splicing is performed on the image to be recognized through a preset multi-scale feature fusion model according to the target features to obtain an image to be processed, the image to be processed is preprocessed through a preset area positioning network according to the target features to obtain a target image, a second image category probability of the target image is determined through the preset classification network, and the target category probability is determined through a preset weighting algorithm according to the first image category probability and the second image category probability; according to the method and the device, the preset multi-scale feature fusion model is added to perform feature splicing on the image to be recognized on the basis of the preset classification network and the preset area positioning network, so that the positioning detection of the small target can be improved, and the interference of the background on gesture recognition is eliminated.
Referring to fig. 4, fig. 4 is a flowchart illustrating a third embodiment of the image recognition method according to the present invention, and the third embodiment of the image recognition method according to the present invention is proposed based on the second embodiment shown in fig. 3.
In the third embodiment, the step S30' includes:
Step S301': determining a target intercepting area through a preset area positioning network, and intercepting the image to be processed according to the target intercepting area to obtain the image to be adjusted.
It should be understood that the image recognition device determines a target intercepting region through a preset region positioning network, and performs image interception on the image to be processed according to the target intercepting region, and the obtaining of the image to be adjusted may be determining an intercepting center point and an intercepting side length through the preset region positioning network, determining a target intercepting region according to the intercepting center point and the intercepting side length, and performing image interception on the image to be processed according to the target intercepting region to obtain the image to be adjusted.
Step S302': adjusting the size of the image corresponding to the image to be adjusted to a preset size to obtain a target image.
The preset size may be an image size set according to actual needs, or may be an optimum image size obtained through experiments.
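Adjusting the image to be adjusted to the preset size can be sketched with nearest-neighbour sampling. The embodiment does not state which interpolation is used, so nearest-neighbour is an assumption chosen for brevity, and `resize_nearest` is an illustrative name.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a 2-D image (list of rows) with nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        # Map each output pixel back to the source pixel it falls inside.
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

cropped = [[1, 2], [3, 4]]
target_image = resize_nearest(cropped, 4, 4)
```

Here a 2 x 2 crop is enlarged to a 4 x 4 target image in which every source pixel is repeated in a 2 x 2 block.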
Further, the step S301' includes:
determining an intercepted central point and an intercepted side length through a preset area positioning network, and determining a target intercepted area according to the intercepted central point and the intercepted side length;
and performing image interception on the image to be processed according to the target intercepting area to obtain the image to be adjusted.
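The two sub-steps above can be sketched as a hard crop. The sketch assumes that the intercepted side length is the full side of the square, so that the corners lie half a side away from the center point, and it clamps the square to the image bounds; both choices, like the name `crop_square`, are assumptions for illustration.

```python
def crop_square(image, center_x, center_y, side):
    """Cut a square region out of a 2-D image given as a list of rows.

    Assumes `side` is the full side length, so the corners sit at
    center -/+ side / 2; the box is clamped to the image bounds.
    """
    height, width = len(image), len(image[0])
    half = side // 2
    left = max(center_x - half, 0)
    top = max(center_y - half, 0)
    right = min(center_x + half, width)
    bottom = min(center_y + half, height)
    return [row[left:right] for row in image[top:bottom]]

# 6 x 6 test image whose pixel value encodes its row and column.
image = [[r * 10 + c for c in range(6)] for r in range(6)]
patch = crop_square(image, center_x=3, center_y=2, side=4)
```

For the 6 x 6 test image this returns the 4 x 4 block covering rows 0 to 3 and columns 1 to 4.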
It should be understood that the preset area positioning network is shown as follows:
$$[x, y, l] = g(W_c * X_c)$$
wherein $g(\cdot)$ is the preset area positioning network, $X_c$ is the feature obtained by splicing features of different dimensions, and $W_c$ denotes the parameters of the preset classification network; the output $[x, y, l]$ gives the target area of the original image.
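A forward pass through a network of the stated shape (one 1 × 1 convolutional layer followed by two fully connected layers) can be sketched as below. Several details are assumptions made only to keep the sketch small: the 1 × 1 convolution has a single output channel, a ReLU follows the first fully connected layer, and all weights are hand-picked placeholders rather than trained parameters. The helper names `matvec`, `relu` and `locate_region` are hypothetical.

```python
def matvec(matrix, vector, bias):
    """Dense layer: matrix * vector + bias."""
    return [sum(w * v for w, v in zip(row, vector)) + b for row, b in zip(matrix, bias)]

def relu(values):
    """Element-wise max(v, 0)."""
    return [max(v, 0.0) for v in values]

def locate_region(feature_map, conv_w, fc1_w, fc1_b, fc2_w, fc2_b):
    """1x1 convolution, then two fully connected layers -> (x, y, l).

    feature_map : N x N x M nested lists (the spliced features X_c)
    conv_w      : M weights of a single-output 1x1 convolution
    All weights here are hypothetical placeholders, not trained values.
    """
    # A 1x1 convolution mixes channels independently at every spatial position.
    reduced = [
        sum(w * ch for w, ch in zip(conv_w, pixel))
        for row in feature_map for pixel in row
    ]
    hidden = relu(matvec(fc1_w, reduced, fc1_b))
    x, y, l = matvec(fc2_w, hidden, fc2_b)
    return x, y, l

feature_map = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [0.0, 0.0]]]  # 2 x 2 x 2
conv_w = [0.5, 0.5]
fc1_w = [[1.0, 1.0, 1.0, 1.0], [1.0, -1.0, 1.0, -1.0]]
fc1_b = [0.0, 0.0]
fc2_w = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
fc2_b = [0.0, 0.0, 0.0]
x, y, l = locate_region(feature_map, conv_w, fc1_w, fc1_b, fc2_w, fc2_b)
```

The three outputs play the roles of the intercepted center point and the intercepted side length; in practice they would be scaled to pixel coordinates of the image being cropped.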
Assuming that the detected region is a square, the upper-left and lower-right corner points of the image to be adjusted are $(x_{left}, y_{left})$ and $(x_{right}, y_{right})$, respectively, and are determined from the intercepted center point and the intercepted side length.
Then, according to these coordinate points, a mask $M$ similar to a tanh function is constructed and multiplied element-wise with the input image $X$ to obtain the image to be adjusted $X_{crop}$, as shown in the following formula:
$$X_{crop} = X \odot M(x, y, l)$$
wherein the mask $M$ is shown in the following formula:
$$M(\cdot) = [\sigma(x - x_{left}) - \sigma(x - x_{right})] \cdot [\sigma(y - y_{left}) - \sigma(y - y_{right})]$$
The $\sigma$ function here is taken with a sufficiently large value of $k$.
As shown in FIG. 3, for the function $\sigma(x)$, when $k$ is large enough, $\sigma(x) \approx 1$ when $x \ge 0$ and $\sigma(x) \approx 0$ when $x < 0$, so the $\sigma(x)$ function approximates a step function. If $x_0 < x_1$, then $\sigma(x - x_0) - \sigma(x - x_1)$ is a step function: $\sigma(x - x_0) - \sigma(x - x_1) = 0$ when $x < x_0$ or $x > x_1$, and $\sigma(x - x_0) - \sigma(x - x_1) = 1$ when $x_0 \le x \le x_1$. Finally, the cropped picture $X_{crop}$ is the image to be adjusted.
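The behaviour of the mask can be checked numerically. The sketch below assumes that $\sigma$ is a logistic function with steepness $k$, which matches the limiting behaviour described above but is not confirmed by the text; `sigma`, `box_mask` and `soft_crop` are illustrative names.

```python
import math

def sigma(x, k=10.0):
    """Logistic function with steepness k (an assumed form of sigma)."""
    z = k * x
    # Branch on the sign so that exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def box_mask(x, y, x_left, y_left, x_right, y_right, k=10.0):
    """M = [s(x - x_left) - s(x - x_right)] * [s(y - y_left) - s(y - y_right)]."""
    mx = sigma(x - x_left, k) - sigma(x - x_right, k)
    my = sigma(y - y_left, k) - sigma(y - y_right, k)
    return mx * my

def soft_crop(image, x_left, y_left, x_right, y_right, k=10.0):
    """Element-wise product of the image with the mask (image is a list of rows)."""
    return [
        [pixel * box_mask(c, r, x_left, y_left, x_right, y_right, k)
         for c, pixel in enumerate(row)]
        for r, row in enumerate(image)
    ]

image = [[1.0] * 8 for _ in range(8)]
masked = soft_crop(image, x_left=2, y_left=2, x_right=6, y_right=6)
```

Pixels well inside the square keep almost their full value, while pixels outside it are suppressed to nearly zero. Unlike a hard crop, this mask is a smooth function of the corner coordinates.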
Further, after step S301', the method further includes:
Step S3011': determining an initial loss function value of the image to be identified through a preset loss function model according to the first image category probability.
It should be noted that, in the preset loss function, $Y^{(s)}$ represents the predicted category probability and $Y$ represents the true category; the ranking (permutation) loss function is computed on the resulting prediction probability of the category $t$ at the preset scale $s$.
Step S3012': determining a second image category probability of the target image through the preset classification network, and determining a target loss function value of the image to be adjusted through the preset loss function model according to the second image category probability.
It should be understood that the image recognition device may perform feature extraction on the target image through a preset classification network to obtain features of the target image, and then determine the second image category probability of the target image through a preset probability distribution model according to the features of the target image.
Step S3013': judging whether the target loss function value is smaller than the initial loss function value.
It should be appreciated that determining whether the target loss function value is less than the initial loss function value may be done by directly comparing the target loss function value with the initial loss function value.
Step S3014': if so, executing the step of adjusting the size of the image corresponding to the image to be adjusted to a preset size to obtain a target image.
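Steps S3011' to S3014' can be sketched as a simple gate on the crop. Because the exact form of the preset loss function is not reproduced here, the sketch substitutes a plain cross-entropy on the true category; that substitution, together with the names `cross_entropy` and `crop_improves`, is an assumption for illustration.

```python
import math

def cross_entropy(probabilities, true_category):
    """Negative log-probability assigned to the true category."""
    # The floor avoids log(0) when a probability is exactly zero.
    return -math.log(max(probabilities[true_category], 1e-12))

def crop_improves(first_probs, second_probs, true_category):
    """True when the loss on the cropped image is below the loss on the original."""
    initial_loss = cross_entropy(first_probs, true_category)   # step S3011'
    target_loss = cross_entropy(second_probs, true_category)   # step S3012'
    return target_loss < initial_loss                          # step S3013'

# Example with true category 1: the first crop helps, the second does not.
keep = crop_improves([0.6, 0.4], [0.2, 0.8], true_category=1)
reject = crop_improves([0.6, 0.4], [0.7, 0.3], true_category=1)
```

Only when the comparison returns true would the resizing step be carried out on the image to be adjusted.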
In the third embodiment, an image to be recognized is obtained; a target feature and a first image category probability of the image to be recognized are determined through a preset classification network; feature splicing is performed on the image to be recognized through a preset multi-scale feature fusion model according to the target feature to obtain an image to be processed; a target intercepting area is determined through a preset area positioning network, and the image to be processed is intercepted according to the target intercepting area to obtain an image to be adjusted; an initial loss function value of the image to be recognized is determined through a preset loss function model according to the first image category probability; a second image category probability of the target image is determined through the preset classification network, and a target loss function value of the image to be adjusted is determined through the preset loss function model according to the second image category probability; whether the target loss function value is smaller than the initial loss function value is judged, and if so, the size of the image to be adjusted is adjusted to a preset size to obtain the target image. In this embodiment, the target loss function value is determined through the preset loss function model, and whether the local predicted probability value is greater than the overall probability value is judged according to the target loss function value, so that recursive learning can be performed on the recognition process and the recognition accuracy is improved.
Furthermore, an embodiment of the present invention further provides a storage medium, on which an image recognition program is stored, and the image recognition program, when executed by a processor, implements the steps of the image recognition method as described above.
Furthermore, referring to fig. 5, an embodiment of the present invention further provides an image recognition apparatus, including: an acquisition module 10, a target feature extraction module 20, a preprocessing module 30, a target class probability determination module 40 and an identification module 50;
the acquiring module 10 is configured to acquire an image to be identified.
It should be understood that the main implementation body of the present embodiment is the image recognition device, wherein the image recognition device may be an electronic device such as a personal computer or a server with an arithmetic function and an information receiving function.
The target feature extraction module 20 is configured to determine a target feature and a first image category probability of the image to be identified through a preset classification network.
It can be understood that the step of determining, by the image recognition device, the target feature and the first image category probability of the image to be recognized through the preset classification network may be to perform feature extraction on the image to be recognized through the preset classification network to obtain the target feature, and then determine the first image category probability of the image to be recognized through a preset probability distribution model according to the target feature.
It should be noted that the preset classification network may be a network composed of a convolutional layer, a pooling layer, and an activation layer. The preset probability distribution model may be represented as follows:
$$p(X) = f(W_c * X)$$
wherein $X$ is the image to be recognized, $W_c$ denotes the parameters of the preset classification network, $*$ denotes the convolution calculation, $f(\cdot)$ is the function from the feature layer to the fully connected layer, and $p(X)$ is the image category probability.
The preprocessing module 30 is configured to preprocess the image to be recognized through a preset area positioning network according to the target feature, so as to obtain a target image.
It can be understood that the image recognition device preprocessing the image to be recognized through the preset area positioning network according to the target feature to obtain the target image may be: determining a target intercepting area through the preset area positioning network, performing image interception on the image to be recognized according to the target intercepting area to obtain an image to be adjusted, and adjusting the size of the image to be adjusted to a preset size to obtain the target image.
It should be noted that the predetermined area location network may be composed of a 1 × 1 convolutional layer and two fully connected layers.
The target class probability determining module 40 is configured to determine a second image class probability of the target image through the preset classification network, and determine the target class probability according to the first image class probability and the second image class probability.
It should be understood that the image recognition device determining the second image category probability of the target image through the preset classification network may be: performing feature extraction on the target image through the preset classification network to obtain features of the target image, and then determining the second image category probability of the target image through a preset probability distribution model according to the features of the target image. The image recognition device determining the target class probability according to the first image class probability and the second image class probability may be: determining the target class probability through a preset weight algorithm according to the first image class probability and the second image class probability.
The identification module 50 is configured to determine an image category of the image to be identified according to the target category probability.
It should be understood that the image recognition device determining the image category of the image to be recognized according to the target category probability may be to take the image category with the highest target category probability as the image category of the image to be recognized.
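As a non-limiting illustration, selecting the image category with the highest target category probability can be sketched as below. The function name `pick_category`, the label names, and the probability values are hypothetical.

```python
def pick_category(target_probs, labels):
    """Return the label with the highest target category probability and that probability."""
    if len(target_probs) != len(labels):
        raise ValueError("one label is required per probability")
    best_index = max(range(len(target_probs)), key=target_probs.__getitem__)
    return labels[best_index], target_probs[best_index]


labels = ["cat", "dog", "bird"]  # hypothetical image categories
category, confidence = pick_category([0.33, 0.58, 0.09], labels)
```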
In this embodiment, an image to be recognized is obtained; a target feature and a first image category probability of the image to be recognized are determined through a preset classification network; the image to be recognized is preprocessed through a preset area positioning network according to the target feature to obtain a target image; a second image category probability of the target image is determined through the preset classification network; the target category probability is determined according to the first image category probability and the second image category probability; and the image category of the image to be recognized is determined according to the target category probability. Because the image to be recognized is intercepted through the preset classification network and the preset area positioning network to obtain the target image, and the image category is determined from both the first image category probability of the image to be recognized and the second image category probability of the target image, the influence of environmental factors such as a complex background on image recognition can be reduced, and the user experience is improved.
In an embodiment, the target feature extraction module is further configured to perform feature extraction on the image to be recognized through a preset classification network to obtain a target feature, and determine a first image category probability of the image to be recognized through a preset probability distribution model according to the target feature.
In an embodiment, the image recognition apparatus further includes a splicing module.
The splicing module is configured to perform feature splicing on the image to be recognized through a preset multi-scale feature fusion model according to the target feature to obtain an image to be processed.
In an embodiment, the preprocessing module is further configured to determine a target interception area through a preset area positioning network, perform image interception on the image to be processed according to the target interception area to obtain an image to be adjusted, and adjust the size of the image to be adjusted to a preset size to obtain a target image.
In an embodiment, the preprocessing module is further configured to determine an interception center point and an interception side length through a preset area positioning network, determine a target interception area according to the interception center point and the interception side length, and perform image interception on the image to be processed according to the target interception area to obtain the image to be adjusted.
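As a non-limiting illustration, interception from a center point and a side length, followed by adjustment to a preset size, can be sketched as follows. The image is represented as a list of pixel rows. Clamping the square to the image border and using nearest-neighbour resampling are assumptions of this sketch, since the document does not specify boundary handling or an interpolation method. The function names are hypothetical.

```python
def crop_square(image, center_x, center_y, side_length):
    """Cut a square region around (center_x, center_y), clamped to the image bounds."""
    height, width = len(image), len(image[0])
    half = side_length // 2
    left, top = max(0, center_x - half), max(0, center_y - half)
    right, bottom = min(width, center_x + half), min(height, center_y + half)
    if right <= left or bottom <= top:
        raise ValueError("interception area does not overlap the image")
    return [row[left:right] for row in image[top:bottom]]


def resize_nearest(image, size):
    """Resize to size x size pixels by nearest-neighbour sampling (assumed method)."""
    src_h, src_w = len(image), len(image[0])
    return [[image[y * src_h // size][x * src_w // size] for x in range(size)]
            for y in range(size)]


# Toy 8 x 8 image whose pixel value encodes its position as 10 * row + column.
image = [[10 * y + x for x in range(8)] for y in range(8)]
to_adjust = crop_square(image, center_x=4, center_y=4, side_length=4)
target = resize_nearest(to_adjust, size=8)
```

Here `to_adjust` plays the role of the image to be adjusted and `target` the role of the target image at the preset size.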
In an embodiment, the image recognition apparatus further includes a judgment module.
The judgment module is configured to determine an initial loss function value of the image to be recognized through a preset loss function model according to the first image category probability, determine a second image category probability of the target image through the preset classification network, determine a target loss function value of the image to be adjusted through the preset loss function model according to the second image category probability, judge whether the target loss function value is smaller than the initial loss function value, and if so, execute the step of adjusting the size of the image to be adjusted to a preset size to obtain the target image.
In an embodiment, the target category probability determining module is further configured to determine a second image category probability of the target image through the preset classification network, and determine the target category probability through a preset weight algorithm according to the first image category probability and the second image category probability.
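As a non-limiting illustration, the comparison performed by the judgment module described above can be sketched as follows. The sketch assumes that the preset loss function model is a cross-entropy loss and that a reference label is available when the comparison is made. Neither assumption is stated in this document, and the names `cross_entropy`, `should_keep_crop`, and `true_index` are hypothetical.

```python
import math


def cross_entropy(probs, true_index, eps=1e-12):
    """Cross-entropy loss for one sample (assumed loss function model)."""
    return -math.log(max(probs[true_index], eps))


def should_keep_crop(first_probs, second_probs, true_index):
    """True when the intercepted image yields a smaller loss than the original image."""
    initial_loss = cross_entropy(first_probs, true_index)
    target_loss = cross_entropy(second_probs, true_index)
    return target_loss < initial_loss


# Interception that raises the probability of the reference class is kept.
keep = should_keep_crop([0.5, 0.3, 0.2], [0.8, 0.1, 0.1], true_index=0)
# Interception that lowers the probability of the reference class is rejected.
reject = should_keep_crop([0.5, 0.3, 0.2], [0.3, 0.6, 0.1], true_index=0)
```

Only when the check returns true would the image to be adjusted proceed to the step of adjusting its size to the preset size.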
Other embodiments or specific implementation manners of the image recognition apparatus according to the present invention may refer to the above method embodiments, and are not described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order; rather, these words are to be interpreted as names.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and can certainly also be implemented by hardware, but in many cases the former is the better implementation. Based on such understanding, the technical solutions of the present invention, in essence or in the part contributing over the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (e.g., a Read Only Memory (ROM)/Random Access Memory (RAM), a magnetic disk, or an optical disk) and includes several instructions for enabling a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention and is not intended to limit the scope of the present invention. All equivalent structures or equivalent processes made by using the contents of the present specification and the accompanying drawings, whether applied directly or indirectly in other related technical fields, are likewise included in the scope of the present invention.