CN108764325B - Image recognition method and device, computer equipment and storage medium

Info

Publication number
CN108764325B
Authority
CN
China
Prior art keywords
pixel point
image
current
probability
region
Prior art date
Legal status
Active
Application number
CN201810502263.0A
Other languages
Chinese (zh)
Other versions
CN108764325A (en)
Inventor
陈炳文
王翔
周斌
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201810502263.0A priority Critical patent/CN108764325B/en
Publication of CN108764325A publication Critical patent/CN108764325A/en
Application granted granted Critical
Publication of CN108764325B publication Critical patent/CN108764325B/en

Classifications

    • G06F18/214 (Physics; Computing; Electric digital data processing; Pattern recognition; Analysing; Design or setup of recognition systems or techniques): Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/22 (Physics; Computing; Electric digital data processing; Pattern recognition; Analysing): Matching criteria, e.g. proximity measures
    • G06V10/44 (Physics; Image or video recognition or understanding; Extraction of image or video features): Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an image recognition method, an image recognition device, a computer device and a storage medium, wherein the method comprises the following steps: acquiring a current image of a target to be identified; acquiring a current pixel point from the current image, and determining a corresponding background reference area according to the position of the current pixel point; calculating the background similarity corresponding to the current pixel point according to the background reference area; processing the image characteristics corresponding to the current pixel point according to the trained image target identification model to obtain a first probability corresponding to the current pixel point, wherein the first probability is the probability that the current pixel point belongs to a target object pixel point; calculating to obtain the background similarity and the first probability corresponding to each pixel point in the current image, and identifying and obtaining the image area where the target object is located in the current image according to the background similarity and the first probability corresponding to each pixel point. The method has high image recognition accuracy.

Description

Image recognition method and device, computer equipment and storage medium
Technical Field
The present invention relates to the field of image processing, and in particular, to an image recognition method, apparatus, computer device, and storage medium.
Background
With the development of science and technology, images contain more and more information. To acquire the content in an image, the image content needs to be identified; for example, when a monitored image is analyzed, the target object to be monitored needs to be identified from the image.
At present, when a target object in an image needs to be identified, an area whose gray value is larger than a certain threshold is often taken as the area where the target is located, based on the assumption that the gray value of the image background is relatively small. However, the gray value of the background may be large, or the gray value of the target may be small, so a method that identifies the target object in an image only according to gray values is not accurate.
Disclosure of Invention
Therefore, it is necessary to provide an image recognition method, an image recognition device, a computer device, and a storage medium to solve the above problems. The background similarity reflects, from the reverse side, whether a pixel point is background, while the probability obtained from the model reflects, from the positive side, whether it is the target; the image region where the target object is located is identified by combining the background similarity corresponding to each pixel point with the probability that the pixel point belongs to a target object pixel point, so the image recognition accuracy is high.
An image recognition method, the method comprising: acquiring a current image of a target to be identified; acquiring a current pixel point from the current image, and determining a corresponding background reference area according to the position of the current pixel point; calculating the background similarity corresponding to the current pixel point according to the background reference area; processing the image characteristics corresponding to the current pixel point according to the trained image target identification model to obtain a first probability corresponding to the current pixel point, wherein the first probability is the probability that the current pixel point belongs to a target object pixel point; calculating to obtain the background similarity and the first probability corresponding to each pixel point in the current image, and identifying and obtaining the image area where the target object is located in the current image according to the background similarity and the first probability corresponding to each pixel point.
An image recognition apparatus, the apparatus comprising: the current image acquisition module is used for acquiring a current image of the target to be identified; a background region determining module, configured to obtain a current pixel point from the current image, and determine a corresponding background reference region according to a position of the current pixel point; the similarity calculation module is used for calculating the background similarity corresponding to the current pixel point according to the background reference area; a first probability obtaining module, configured to process, according to a trained image target identification model, an image feature corresponding to the current pixel point to obtain a first probability corresponding to the current pixel point, where the first probability is a probability that the current pixel point belongs to a target object pixel point; and the target area identification module is used for calculating to obtain the background similarity and the first probability corresponding to each pixel point in the current image, and identifying to obtain the image area where the target object is located in the current image according to the background similarity and the first probability corresponding to each pixel point.
In one embodiment, the background region determination module comprises: a first region obtaining unit, configured to obtain a first region and a second region on the current image according to a position of the current pixel point, where the second region is a sub-region of the first region, and the current pixel point is located inside the second region; a first area determination unit configured to use a non-overlapping image area between the first area and the second area as the background reference area.
In one embodiment, the apparatus further comprises: the training area acquisition module is used for acquiring a training image and acquiring a training area corresponding to a target object in the training image; the training feature acquisition module is used for acquiring training image features corresponding to all pixel points in the training area; and the training module is used for carrying out model training according to the training image characteristics to obtain a characteristic mapping function for mapping the training image characteristics to the minimum characteristic space and a central value of the characteristic space.
In one embodiment, the apparatus further comprises: a second region obtaining unit, configured to obtain a third region and a fourth region on the current image according to the position of the current pixel point, where the fourth region is a sub-region of the third region, and the current pixel point is located inside the fourth region; a second region determining unit configured to acquire a non-overlapping image region between the third region and the fourth region; the first statistical unit is used for counting the gray values of the pixel points corresponding to the non-overlapped image area to obtain a first statistical result, and counting the gray values of the pixel points corresponding to the fourth area to obtain a second statistical result; and the contrast characteristic obtaining unit is used for obtaining the contrast characteristic according to the first statistical result and the second statistical result.
In one embodiment, the target area identification module comprises: a second probability obtaining unit, configured to obtain a second probability corresponding to the current pixel point according to the background similarity, where the second probability is a probability that the current pixel point belongs to a target object pixel point, and the second probability and the background similarity are in a negative correlation relationship; a target probability obtaining unit, configured to determine, according to the first probability and the second probability, a current target probability corresponding to the current pixel point, where the current target probability is a probability that the current pixel point belongs to a target object pixel point; and the target area identification unit is used for identifying and obtaining the image area where the target object in the current image is located according to the target probability corresponding to each pixel point in the current image.
In one embodiment, the target area identifying unit is configured to: acquiring a first pixel point of which the target probability is greater than a first threshold value in the current image; acquiring the distribution characteristics of the target probability corresponding to the first pixel point; obtaining a second threshold value according to the distribution characteristics; and acquiring a region obtained by combining first pixel points with the target probability greater than the second threshold value in the current image, and taking the region as an image region where the target object in the current image is located.
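By way of illustration, below is a minimal sketch of this two-stage thresholding in Python. It assumes the "distribution characteristic" is the mean and standard deviation of the first pixel points' probabilities and that the second threshold is derived as the mean plus a weighted deviation; both are assumptions for illustration, since the embodiment does not fix a particular statistic here.

```python
import numpy as np

def segment_target(prob_map: np.ndarray, t1: float = 0.5, k: float = 1.0) -> np.ndarray:
    """Two-stage thresholding of a per-pixel target-probability map (illustrative sketch)."""
    candidates = prob_map[prob_map > t1]           # first pixel points: probability above the first threshold
    if candidates.size == 0:
        return np.zeros_like(prob_map, dtype=bool)
    # Assumed distribution characteristic: mean/std of the candidate probabilities.
    t2 = candidates.mean() + k * candidates.std()  # second threshold from the distribution
    return prob_map > t2                           # combined region of target object pixel points
```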
A computer device comprising a memory and a processor, the memory having stored therein a computer program which, when executed by the processor, causes the processor to carry out the steps of the image recognition method described above.
A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, causes the processor to carry out the steps of the image recognition method described above.
The image recognition method, the image recognition device, the computer device, and the storage medium acquire a current image of a target to be identified; acquire a current pixel point from the current image and determine a corresponding background reference region according to the position of the current pixel point; calculate the background similarity corresponding to the current pixel point according to the background reference region; process the image features corresponding to the current pixel point according to a trained image target recognition model to obtain a first probability corresponding to the current pixel point, where the first probability is the probability that the current pixel point belongs to a target object pixel point; calculate the background similarity and the first probability corresponding to each pixel point in the current image; and identify the image region where the target object is located in the current image according to the background similarity and the first probability corresponding to each pixel point. The background similarity reflects, from the reverse side, whether a pixel point is background, while the probability obtained from the model reflects, from the positive side, whether it is the target; identifying the image region where the target object is located by combining the background similarity corresponding to each pixel point with the probability that the pixel point belongs to a target object pixel point therefore gives high image recognition accuracy.
Drawings
FIG. 1 is a diagram of an application environment of an image recognition method provided in one embodiment;
FIG. 2 is a flow diagram of an image recognition method in one embodiment;
FIG. 3A is a schematic diagram of a first region and a second region in one embodiment;
FIG. 3B is a schematic diagram of a first region and a second region in one embodiment;
FIG. 4 is a flowchart illustrating the calculation of the background similarity corresponding to the current pixel point according to the background reference region in one embodiment;
FIG. 5 is a diagram illustrating an embodiment of dividing a background reference region into a plurality of sub-regions;
FIG. 6 is a flowchart illustrating processing of image features corresponding to a current pixel point according to a trained image target recognition model to obtain a first probability corresponding to the current pixel point in one embodiment;
FIG. 7A is a flow diagram of obtaining an image target recognition model in one embodiment;
FIG. 7B is a diagram illustrating a region corresponding to a target object of a training image, in accordance with an embodiment;
FIG. 8 is a flow diagram for obtaining contrast characteristics corresponding to a current pixel point in one embodiment;
FIG. 9 is a flowchart illustrating obtaining gray scale gradient features corresponding to a current pixel point in one embodiment;
FIG. 10 is a flowchart illustrating an embodiment of identifying an image region where a target object is located in a current image according to a background similarity and a first probability corresponding to each pixel point;
FIG. 11 is a flowchart illustrating an embodiment of identifying and obtaining an image area where a target object in a current image is located according to target probabilities corresponding to respective pixel points in the current image;
FIG. 12 is a schematic diagram illustrating target probabilities corresponding to respective pixel points of a current image in one embodiment;
FIG. 13 is a diagram illustrating an image detection result according to an embodiment;
FIG. 14 is a block diagram showing the structure of an image recognition apparatus according to an embodiment;
FIG. 15 is a block diagram that illustrates the structure of a background region determination module in one embodiment;
FIG. 16 is a block diagram that illustrates the structure of a target area identification module in one embodiment;
FIG. 17 is a block diagram showing an internal configuration of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
FIG. 1 is a diagram of the application environment of the image recognition method provided in an embodiment. As shown in FIG. 1, the application environment includes a camera 110 and a computer device 120. After the camera 110 captures an image to obtain a current image, the current image is sent to the computer device 120; the computer device 120 acquires the current image of the target to be recognized, executes the image recognition method provided by the embodiment of the present invention, and identifies the image area where the target object is located in the current image.
In one embodiment, after obtaining the image area where the target object is located in the current image, the computer device 120 may display the current image and mark the image area where the target object is located, for example, by adding an arrow to the current image, where the area pointed to by the arrow is the image area where the target object is located.
In an embodiment, after obtaining the image region where the target object is located in the current image, the gray-scale value of the pixel point in the image region where the target object is located may also be set to 1, and the gray-scale value of the pixel point in the image region where the target object is not located in the current image is set to 0, so as to obtain the binary image corresponding to the current image.
In one embodiment, the current image is an infrared image and the camera 110 may be an infrared camera.
In one embodiment, the computer device 120 may be an independent physical server or terminal, a server cluster formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud computing, cloud database, cloud storage, and CDN services.
It should be noted that the above application scenario is only an example and is not to be considered as a limitation of the present invention, and in practical applications, other application scenarios may exist. For example, computer device 120 may retrieve the current image from local storage or from another device. Alternatively, the computer device 120 and the camera 110 may be the same device.
As shown in fig. 2, in an embodiment, an image recognition method is provided, and this embodiment is mainly illustrated by applying the method to the computer device 120 in fig. 1. The method specifically comprises the following steps:
step S202, acquiring a current image of the target to be recognized.
Specifically, the current image refers to an image currently required to identify the target object. The current image can be transmitted to the computer equipment by the camera in real time or at regular time, and the current image can also be an image stored locally in the computer equipment. The computer device may also acquire a current image of the target to be recognized in response to the image recognition request. For example, when a user needs to identify a target object in an image, an image identification request may be sent, where the image identification request may carry an image or an image identifier, and after receiving the image identification request, the computer device obtains a current image according to the image identification request. The target object refers to an object to be recognized, for example, the target object may be a human face, a cat, an airplane, or the like, and the target object may be specifically set as needed.
Step S204, obtaining the current pixel point from the current image, and determining the corresponding background reference area according to the position of the current pixel point.
Specifically, a pixel point is the smallest image unit in an image represented by a sequence of numbers. The current image comprises a plurality of pixel points (pixels), and identification is carried out pixel point by pixel point when performing image recognition on the current image. The current pixel point refers to the currently acquired pixel point. Each pixel point of the current image may be taken as the current pixel point in turn, performing steps S204 to S208 sequentially, or a plurality of pixel points may be taken as current pixel points simultaneously, with steps S204 to S208 performed for each of them. The background reference area is an image area corresponding to the background and is determined according to the position of the current pixel point. For example, a region composed of pixel points adjacent to the current pixel point may be used as the background reference area. Alternatively, the area within a preset range of the current pixel point may be used as the background reference area.
In one embodiment, determining the corresponding background reference region according to the position of the current pixel point comprises: and acquiring a first region and a second region on the current image according to the position of the current pixel point, and taking a non-overlapped image region between the first region and the second region as a background reference region. The second area is a sub-area of the first area, and the current pixel point is located in the second area.
Specifically, the sizes of the first region and the second region may be set as needed. The first region and the second region may be partial regions of the current image. For example, the first region may include 7 × 7 pixels, and the second region may include 3 × 3 pixels. That the second region is a sub-region of the first region means that the second region belongs to the first region. It can be understood that, since the current pixel point is located in the second region, and the second region is a sub-region of the first region, the current pixel point is also located in the first region. After the first region and the second region are obtained, the part of the first region outside the second region, i.e., the non-overlapping region of the first region and the second region, is used as the background reference region. In the embodiment of the invention, because the pixel points around the current pixel point may themselves be pixel points corresponding to the target object, using the non-overlapping image region between the first region and the second region as the background reference region improves the accuracy of the selected background reference region.
As shown in FIG. 3A, assume that each square in FIG. 3A represents a pixel and P_ij represents the pixel point in the ith row and jth column, where P_44 is the current pixel point. The second region is the region formed by P_32, P_33, P_34, P_42, P_43, P_44, P_52, P_53, and P_54, i.e., the region marked with oblique lines in FIG. 3A; the first region is the region formed by all the pixels in FIG. 3A; and the non-overlapping region, i.e., the background reference region, is the region of FIG. 3A outside the oblique-line region.
In one embodiment, at least one of the first region and the second region takes the current pixel point as its symmetry center. For example, when the first region is rectangular, the symmetry center is the intersection of its diagonals. As shown in FIG. 3B, P_44 is the current pixel point, the second region is the region marked with oblique lines, and the background reference region is the region of FIG. 3B outside the oblique-line region.
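As a concrete illustration, here is a minimal sketch of extracting the background reference region, assuming a 7 × 7 first region and a 3 × 3 second region centered on the current pixel point (these window sizes are only the examples used above, and border pixels are assumed to be handled by padding upstream):

```python
import numpy as np

def background_reference(img: np.ndarray, x: int, y: int,
                         outer: int = 3, inner: int = 1) -> np.ndarray:
    """Gray values of the first region minus the second region, i.e. the ring
    around pixel (x, y); outer=3 gives a 7x7 first region, inner=1 a 3x3 second region."""
    win = img[x - outer:x + outer + 1, y - outer:y + outer + 1].astype(float)
    mask = np.ones_like(win, dtype=bool)
    mask[outer - inner:outer + inner + 1, outer - inner:outer + inner + 1] = False
    return win[mask]  # pixel values of the non-overlapping image area
```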
Step S206, calculating the background similarity corresponding to the current pixel point according to the background reference area.
Specifically, the background similarity is used to indicate the degree to which the current pixel point is similar to the background. The greater the similarity, the greater the probability that the current pixel is the background. After the background reference area is obtained, all the pixel points can be used as target pixel points, and the similarity between each pixel point in the background reference area and the current pixel point is calculated. Or selecting partial pixel points from the background reference region as target pixel points, and calculating the similarity between the target pixel points and the current pixel points. And after the similarity between the target pixel point and the current pixel point is obtained, obtaining the background similarity according to the similarity obtained by calculation. For example, an average value, a maximum similarity, a minimum similarity, or a median similarity of the calculated similarities may be used as the background similarity. The calculation method of the similarity can be set as required. For example, pixel features corresponding to the pixel points may be obtained, and the similarity between the pixel features may be calculated. The pixel features may be one or more of color features, texture features, and grayscale features.
In one embodiment, the similarity may be calculated according to the gray values of the pixel points. For example, the gray value of a pixel point may be obtained, a complement (negation) operation performed on it to obtain the corresponding complementary gray value, and the gray value and the complementary gray value combined into a gray value vector. The similarity between the vectors is then calculated to obtain the similarity between the pixel points. The similarity between vectors can be calculated using, for example, a cosine similarity algorithm or a Euclidean distance based similarity algorithm.
Step S208, processing the image characteristics corresponding to the current pixel point according to the trained image target identification model to obtain a first probability corresponding to the current pixel point, wherein the first probability is the probability that the current pixel point belongs to the target object pixel point.
Specifically, the image feature is used to represent a corresponding property of the image, such as one or more of a contrast feature, a color feature, or a gray scale feature corresponding to the pixel point, which may be selected as needed. The target object pixel points are pixel points corresponding to the target object. Before processing the image characteristics according to the trained image target recognition model, model training is carried out through training data to determine model parameters corresponding to the image target recognition model, and mapping from the image characteristics to the probability that pixel points belong to target object pixel points is established. The model training method can be a supervised training method or an unsupervised training method. For the supervised training method, it is known whether a pixel point in the training data is a target object pixel point, and the supervised training model may be, for example, a support vector machine or a deep neural learning model. For the unsupervised training method, it may be unknown whether the pixel points in the training data are target object pixel points, and the unsupervised training model may be, for example, a clustering algorithm.
Step S210, calculating to obtain the background similarity and the first probability corresponding to each pixel point in the current image, and identifying to obtain the image area where the target object is located in the current image according to the background similarity and the first probability corresponding to each pixel point.
Specifically, the background similarity and the first probability corresponding to each pixel point in the current image are obtained through calculation according to the steps S202 to S208, and then the image area where the target object is located in the current image is obtained through combination of the background similarity and the first probability corresponding to each pixel point.
In one embodiment, the pixel points with the background similarity smaller than the preset similarity and the first probability larger than the preset probability may be used as the target object pixel points.
In an embodiment, the number of pixel points in the image region corresponding to the target object may also be set, and the image region where the target object is located in the current image is obtained by taking this number into account. For example, when the number of pixel points in the image region where the target object is located is set to 8 in advance, the 10 pixel points with the smallest background similarity can be obtained first; then, from these 10 pixel points, the 8 pixel points with the highest first probabilities are taken as the target object pixel points, and the region formed by the target object pixel points is taken as the image region where the target object is located.
In an embodiment, the image region where the target object is located in the current image may also be obtained by taking the positions of the pixel points into account. For example, pixel points whose background similarity is smaller than the preset similarity and whose first probability is greater than the preset probability may be selected; the positions of the selected pixel points are then obtained, and a continuous image region formed by the selected pixel points is taken as the image region where the target object is located in the current image.
In an embodiment, the second probability may also be obtained according to the background similarity, and the second probability is a probability that the current pixel belongs to the target object pixel. And then multiplying the first probability and the second probability to obtain a target probability, and taking the pixel points with the target probability larger than a preset value as target object pixel points in the current image to obtain an image area where the target object is located.
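As an illustration of this combination, here is a minimal sketch assuming the second probability is taken simply as one minus the background similarity, which is one possible negative-correlation mapping; the embodiment leaves the exact mapping open:

```python
import numpy as np

def target_probability(p1: np.ndarray, bg_sim: np.ndarray, thresh: float = 0.5):
    """Fuse the model's first probability p1 with the background similarity."""
    p2 = 1.0 - bg_sim          # assumed mapping: negatively correlated with background similarity
    p_target = p1 * p2         # per-pixel target probability
    mask = p_target > thresh   # pixels taken as target object pixel points
    return p_target, mask
```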
In an embodiment, the gray value of the pixel point determined as the target object in the current image may also be set to 1, the gray values of other pixel points are set to 0, and the binarized image corresponding to the current image is displayed.
The image recognition method acquires a current image of a target to be identified; acquires a current pixel point from the current image and determines a corresponding background reference region according to the position of the current pixel point; calculates the background similarity corresponding to the current pixel point according to the background reference region; processes the image features corresponding to the current pixel point according to a trained image target recognition model to obtain a first probability corresponding to the current pixel point, where the first probability is the probability that the current pixel point belongs to a target object pixel point; calculates the background similarity and the first probability corresponding to each pixel point in the current image; and identifies the image region where the target object is located in the current image according to the background similarity and the first probability corresponding to each pixel point. The background similarity reflects, from the reverse side, whether a pixel point is background, while the probability obtained from the model reflects, from the positive side, whether it is the target; identifying the image region where the target object is located by combining the background similarity corresponding to each pixel point with the probability that the pixel point belongs to a target object pixel point therefore gives high image recognition accuracy.
The method provided by the embodiment of the invention can be applied to identifying target objects in infrared images. An infrared image is obtained by infrared imaging technology; because imaging based on infrared detection does not depend on illumination, images can be obtained in environments with insufficient light intensity and poor contrast. The target object in an infrared image is generally small (for example, the image area corresponding to the target object is typically between 1 × 1 and 6 × 6 pixels) and easily disappears into a complex background. Moreover, the gray scale of an infrared image changes severely under the influence of factors such as the non-uniformity of atmospheric heat radiation, atmospheric attenuation under different meteorological conditions, and the internal noise of the infrared detector, and thus differs greatly from the gray scale of a traditional visible-light image. Recognizing the target object with conventional image recognition methods is therefore not effective. With the method provided by the embodiment of the invention, the background similarity is obtained from the similarity between the current pixel point and the pixel points of the background reference area, the probability that the pixel point is a target object pixel point is determined with the image recognition model, and the two are combined to determine comprehensively, from both the positive and the negative angle, whether the pixel point is a target object pixel point, so the image recognition effect is good.
In one embodiment, as shown in fig. 4, the step S206 of calculating the background similarity corresponding to the current pixel point according to the background reference area may include the following steps:
step S402, obtaining a target pixel point from the background reference area, and obtaining a gray value corresponding to the target pixel point and the current pixel point.
Specifically, the number of target pixel points may be one or more. The target pixel point may be all pixel points of the background reference region, and of course, the target pixel point may also be obtained according to a preset pixel point screening rule. For example, a pixel point in the background reference region whose gray value is the median of the gray values of the pixel points in the background reference region may be obtained as the target pixel point.
In one embodiment, the background reference region may be further divided into a plurality of sub-regions, and a target pixel point obtained from each sub-region; for example, within each sub-region, the pixel point whose gray value is the median of the gray values of the pixel points in that sub-region. As shown in FIG. 5, the region formed by the pixel points crossed by the horizontal line segment Q1 through the center point of the first region may be used as the first sub-region, the region formed by the pixel points crossed by the vertical line segment Q2 through the center point as the second sub-region, and the regions formed by the pixel points crossed by the diagonals Q3 and Q4 of the first region as the third and fourth sub-regions, respectively. Then the median of the gray values of the pixel points in each sub-region is obtained, and the pixel point with the median gray value in each sub-region is taken as a target pixel point.
In an embodiment, when the second region exists, it can be understood that the pixel points corresponding to each sub-region do not include the pixel points in the second region. For example, the fourth sub-region corresponding to line segment Q4 may be the region formed by P_11, P_22, P_66, and P_77, not including P_33, P_44, and P_55.
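A minimal sketch of this target pixel point selection, assuming a 7 × 7 first region whose 3 × 3 center (the second region) is excluded, with the four line sub-regions taken along the horizontal, vertical, and two diagonal directions as above:

```python
import numpy as np

def reference_grays(win: np.ndarray, inner: int = 1) -> list:
    """Median gray value of each of the four line sub-regions (Q1 horizontal,
    Q2 vertical, Q3/Q4 diagonals) of a square window centered on the current
    pixel point, excluding the pixels of the inner second region."""
    n = win.shape[0]
    c = n // 2
    keep = [i for i in range(n) if abs(i - c) > inner]   # indices outside the second region
    subs = [
        win[c, keep],                          # Q1: horizontal line through the center
        win[keep, c],                          # Q2: vertical line through the center
        win[keep, keep],                       # Q3: main diagonal
        win[keep, [n - 1 - i for i in keep]],  # Q4: anti-diagonal
    ]
    return [float(np.median(s)) for s in subs]
```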
Step S404, calculating according to the gray value of the target pixel point to obtain a reference gray value, performing negation operation on the reference gray value to obtain a corresponding complementary reference gray value, and forming a reference gray value vector by the reference gray value and the complementary reference gray value.
Specifically, the reference gray value may be the median of the gray values corresponding to the respective target pixel points, the gray value obtained by normalizing the gray value of a target pixel point, or the gray value obtained by normalizing the median of the gray values corresponding to the target pixel points. For example, when the gray value of the target pixel point is 200, since the range of gray values is 0 to 255, the normalized reference gray value is 200/255 = 0.784. Or, when there are 4 target pixel points with gray values 100, 250, 200, and 210, the median of the gray values is (200 + 210)/2 = 205, and the corresponding reference gray value is 205/255 = 0.804. The negation operation is used to obtain the complement of the image, where complementary means that the two gray values add up to the maximum gray value, such as 255, or 1 after normalization. For example, when the reference gray value is 0.804, the complementary reference gray value is 1 - 0.804 = 0.196. After the reference gray value and the complementary reference gray value are obtained, they form the reference gray value vector [0.804, 0.196].
Step S406, performing an inversion operation on the gray value corresponding to the current pixel point to obtain a corresponding complementary current gray value, and combining the gray value corresponding to the current pixel point and the complementary current gray value into a current gray value vector.
Specifically, the gray value corresponding to the current pixel point may be a normalized or an unnormalized gray value. After the gray value corresponding to the current pixel point is obtained, a negation operation is performed on it to obtain the corresponding complementary current gray value, and the gray value corresponding to the current pixel point and the complementary current gray value are then combined into the current gray value vector. For example, when the gray value corresponding to the current pixel point is 0.901, the complementary current gray value is 1 - 0.901 = 0.099, and the current gray value vector is [0.901, 0.099].
Step S408, calculating the background similarity corresponding to the current pixel point according to the reference gray value vector and the current gray value vector.
Specifically, the method for calculating the background similarity according to the reference gray value vector and the current gray value vector may be set according to actual needs. For example, the calculation may be performed by a cosine similarity calculation method or an euclidean distance calculation method.
In one embodiment, when the reference gray value vector includes a plurality of vectors, the similarity between each reference gray value vector and the current gray value vector may be calculated, and the background similarity may be obtained according to the similarity between each reference gray value vector and the current gray value vector. For example, one of the median, average, maximum, and minimum values of the similarity between the respective reference gray value vectors and the current gray value vector may be taken as the background similarity.
In one embodiment, the method for calculating the background similarity comprises the following steps: and comparing the vector values of the same position in the reference gray value vector and the current gray value vector to obtain the minimum vector value corresponding to each position. And combining the minimum values to obtain a recombined vector, calculating the square of a module of the recombined vector, obtaining the background fuzzy membership corresponding to the current pixel point according to the square of the module of the recombined vector, the module of the current gray value vector and the module of the reference gray value vector, and obtaining the background similarity according to the background fuzzy membership.
Specifically, the background fuzzy membership represents the degree to which the current pixel point belongs to the background, and the algorithm for obtaining the background similarity from the background fuzzy membership can be set as required. For example, when there is only one background fuzzy membership, that fuzzy membership may be used as the background similarity; when there are several, one of their median, average, maximum, or minimum may be used. For example, if the current gray value vector is [0.901, 0.099] and a reference gray value vector is [0.804, 0.196], the recombined vector is [0.804, 0.099]. Assuming there are 4 target pixel points, the background similarity is calculated by the following formulas (1) to (4):

$$I = [F_t(x,y),\ F_t(x,y)^c], \quad F_t(x,y)^c = 1 - F_t(x,y) \tag{1}$$

$$W_j = [w_j,\ w_j^c], \quad w_j^c = 1 - w_j \tag{2}$$

$$P_j = \frac{\lVert I \wedge W_j \rVert^2}{\varepsilon + \lVert I \rVert \cdot \lVert W_j \rVert} \tag{3}$$

$$Bg(x,y) = \max\{P_j \mid j = 1, \dots, 4\} \tag{4}$$

In the above formulas, F_t(x,y) is the gray value corresponding to the current pixel point t, F_t(x,y)^c is the complementary current gray value of the current pixel point t, and I is the current gray value vector. w_j is the reference gray value corresponding to the jth target pixel point, w_j^c is its complementary reference gray value, and W_j is the reference gray value vector of the jth target pixel point. P_j is the fuzzy membership between the current pixel point and the jth target pixel point. ε is a parameter set to prevent the fuzzy membership from equaling 1 and may be set as needed. "∧" is the fuzzy intersection operator, whose result is the element-wise minimum of the two vectors, i.e., the recombined vector; "‖·‖" denotes the modulus of a vector. Bg(x,y) is the background similarity, and max takes the maximum value, i.e., the background similarity is the largest P_j.
In an embodiment, the image target recognition model is a support vector clustering model, as shown in fig. 6, the step S208 of processing the image features corresponding to the current pixel point according to the trained image target recognition model to obtain the first probability corresponding to the current pixel point may specifically include:
step S602, a feature mapping function corresponding to the image target recognition model is obtained, and a central value of a feature space corresponding to the feature mapping function is obtained.
Specifically, the basic idea of the support vector clustering model is as follows: for the input features used for model training, a feature mapping function may be used to map the input features to a feature space to obtain a feature mapping value, where the feature space is the smallest feature space capable of covering the mapping value obtained after mapping, and the center value of the feature space is the center value of the feature mapping value obtained after mapping by the feature mapping function. For example, the feature space may be a hyper-sphere, and the center of the feature space is the center of the hyper-sphere. In training, since finding a minimum feature space that completely covers all feature vectors results in a relatively large feature space, a model training condition may be set, and when the model training condition is met, the obtained feature space is used as the minimum feature space that satisfies the condition, and the model training method is described later. Therefore, for the trained image target recognition model, the corresponding feature mapping function can be obtained, and the central value of the feature space corresponding to the feature mapping function can be obtained.
Step S604, calculating the image characteristics according to the characteristic mapping function to obtain the mapping values corresponding to the image characteristics.
Specifically, after the feature mapping function is obtained, the feature mapping function is used to map the image features into the feature space, so as to obtain the corresponding mapping values.
Step S606, a first distance between the mapping value and the center value is calculated.
Specifically, after the mapping value is obtained, the distance between the mapping value and the center value in the feature space is calculated. Assuming the image feature is s(i), the feature mapping function is Φ, and the center value is a, the first distance between the mapping value and the center value can be represented as ‖Φ(s(i)) - a‖, where ‖·‖ denotes the Euclidean distance.
Step S608, a first probability corresponding to the current pixel point is obtained according to the first distance, wherein the first distance and the first probability are in a negative correlation relationship.
Specifically, the first distance is inversely related to the first probability, i.e., the first probability becomes smaller as the first distance increases. For example, the first probability may be an inverse of the first distance.
In one embodiment, a second distance from the center of the feature space to the boundary of the feature space may also be obtained. Calculating a first probability that the current pixel point is a pixel point corresponding to the target according to the first distance, wherein the first probability comprises the following steps: and calculating a proportional value of the first distance and the second distance. And calculating to obtain a first probability corresponding to the current pixel point according to the proportion value, wherein the proportion value and the first probability are in a negative correlation relationship.
Specifically, the second distance is a distance from a boundary of the feature space to a center of the feature space, and when the feature space is a hyper sphere, the distance from the boundary to the center of the feature space is a distance from a center of a sphere to a surface of the sphere, that is, a radius of the sphere. The corresponding relationship between the ratio and the first probability may be set, for example, the first probability corresponding to the ratio of 0 to 10% may be set to 0.8, and the first probability corresponding to the ratio of 10 to 20% may be set to 0.6.
In one embodiment, the first probability is 1 minus the proportional value, as expressed by the following formula (5), where H_t(x,y) is the first probability, Φ(s(i)) is the mapping value corresponding to the image feature of the current pixel point, a is the central value, R is the second distance, and ‖·‖ denotes the Euclidean distance:

$$H_t(x,y) = 1 - \frac{\lVert \Phi(s(i)) - a \rVert}{R} \tag{5}$$
In the embodiment of the invention, the first distance between the mapping value of the current pixel point and the central value corresponding to the support vector clustering model is calculated, and the first distance and the first probability are in a negative correlation relationship, namely the probability that the pixel point corresponding to the characteristic mapping value which is farther away from the center of the characteristic space belongs to the pixel point of the target object is smaller, so that the probability that the current pixel point is the pixel point of the target object can be accurately quantized, and the accuracy of image identification is further improved.
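A minimal sketch of formula (5), with the result clipped at 0 for mapping values outside the hypersphere (the clipping is an added assumption; the formula itself can go negative):

```python
import numpy as np

def first_probability(phi_s: np.ndarray, a: np.ndarray, R: float) -> float:
    """Probability that the current pixel point is a target object pixel point,
    from the distance between its feature mapping value and the center value."""
    d = np.linalg.norm(phi_s - a)     # first distance, negatively correlated with the probability
    return max(0.0, 1.0 - d / R)      # eq. (5): one minus the ratio to the radius (second distance)
```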
FIG. 7A shows a flowchart of an implementation of obtaining an image target recognition model according to an embodiment, which may specifically include the following steps:
step S702, a training image is obtained, and a training area corresponding to a target object in the training image is obtained.
Specifically, the training images are used for model training. The training area is the image area where the target object is located in the training image. The training area may be obtained by manual labeling; as shown in FIG. 7B, the area enclosed by the rectangular frame in FIG. 7B is the area corresponding to the target object obtained by manual labeling.
Step S704, a training image feature corresponding to each pixel point in the training area is obtained.
Specifically, the image feature is used to represent a corresponding property of the image, such as one or more of a contrast feature, a color feature, or a gray scale feature corresponding to the pixel point, which may be selected as needed.
Step S706, model training is carried out according to the training image characteristics, and a characteristic mapping function for mapping the training image characteristics to the minimum characteristic space and a central value of the characteristic space are obtained.
Specifically, the minimum feature space is determined by the model training condition, which may be set; when the model training condition is reached, the obtained feature space is used as the minimum feature space. For the support vector clustering model, the optimization target of model training can be expressed by the following formulas, where min denotes minimization and s.t. (subject to) introduces the constraints on the minimized expression:

$$\min_{R,\ a,\ \xi_i}\ R^2 + C \sum_{i=1}^{n} \xi_i$$

$$\text{s.t.} \quad \lVert \Phi(x(i)) - a \rVert^2 \le R^2 + \xi_i, \quad \xi_i \ge 0$$

Here R denotes the radius of the hypersphere and a its center, Φ is the feature mapping function, and the slack variables ξ_i allow the mapping values corresponding to part of the training samples to lie outside the hypersphere; the slack variables may be set as needed, n is the number of training samples, and x(i) denotes an image feature. C is a penalty coefficient used to adjust the balance between the error and the boundary of the hypersphere. The optimization objective can thus be summarized as: under the penalty coefficient and the set slack variables, train on the training samples to obtain the feature mapping function and the minimum hypersphere, whose center value is a and radius is R.
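For intuition only, here is a toy stand-in for this training step: with an explicit feature map (the identity, for simplicity), the Badoiu-Clarkson iteration below approximates the minimum enclosing hypersphere of the training features. It ignores the slack variables ξ_i and the penalty coefficient C of the full objective, so it is a simplification, not the patented training procedure:

```python
import numpy as np

def fit_hypersphere(X: np.ndarray, iters: int = 1000):
    """Approximate center a and radius R of the smallest ball enclosing the
    rows of X (Badoiu-Clarkson iteration); a slack-free simplification."""
    a = X.mean(axis=0)
    for t in range(1, iters + 1):
        far = X[np.argmax(np.linalg.norm(X - a, axis=1))]  # farthest training feature
        a = a + (far - a) / (t + 1)                        # step toward the farthest point
    R = np.linalg.norm(X - a, axis=1).max()
    return a, R
```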
In one embodiment, as shown in fig. 8, the image features include contrast features, and the step of obtaining the contrast feature corresponding to the current pixel point includes:
step S802, a third area and a fourth area are obtained on the current image according to the position of the current pixel point, wherein the fourth area is a sub-area of the third area, and the current pixel point is located in the fourth area.
Specifically, the contrast refers to the contrast of the image, and the sizes of the third region and the fourth region may be set specifically as required, for example, the third region may include 9 × 9 pixels, and the fourth region may include 3 × 3 pixels. It can be understood that, since the current pixel point is located in the fourth region, and the fourth region is a sub-region of the third region, the current pixel point is also located in the third region.
In one embodiment, the fourth region may be the same as the second region, and the third region may be the same as the first region.
In one embodiment, at least one of the fourth region and the third region takes the current pixel point as a symmetry center.
Step S804, a non-overlapping image area between the third area and the fourth area is acquired.
Specifically, since the fourth region is a sub-region of the third region, the non-overlapping image region between the third region and the fourth region is a region other than the fourth region in the third region.
Step S806, count the gray values of the pixels corresponding to the non-overlapping image regions to obtain a first statistical result, and count the gray values of the pixels corresponding to the fourth region to obtain a second statistical result.
Specifically, the first statistical result and the second statistical result may be the sum of the gray values or the average of the gray values. For example, the gray values of the pixels in the non-overlapping region may be summed up and then divided by the number of pixels in the non-overlapping region to obtain the average gray value of the pixels corresponding to the non-overlapping region, which is used as the first statistical result. The gray values of the pixel points in the fourth area can be summed up and then divided by the number of the pixel points in the fourth area to obtain the average gray value of the pixel points corresponding to the fourth area, which is used as a second statistical result.
And step S808, obtaining contrast characteristics according to the first statistical result and the second statistical result.
Specifically, after the first statistical result and the second statistical result are obtained, the contrast characteristic is obtained by combining the first statistical result and the second statistical result. For example, the contrast characteristic may be a ratio of the first statistical result to the second statistical result, or the contrast characteristic may be a difference between the first statistical result and the second statistical result.
In one embodiment, when the statistical results are averages of gray values and the contrast feature is the difference between the two statistical results, the contrast feature is calculated by formula (6), where lmc_i is the contrast feature corresponding to the ith current pixel point, n_in is the number of pixel points in the fourth region Ω_in, n_out is the number of pixel points in the non-overlapping image region Ω_out of the third region, and F(x,y) denotes the gray value of a pixel point:

$$\mathrm{lmc}_i = \frac{1}{n_{in}} \sum_{(x,y) \in \Omega_{in}} F(x,y) - \frac{1}{n_{out}} \sum_{(x,y) \in \Omega_{out}} F(x,y) \tag{6}$$
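A minimal sketch of formula (6), assuming a 9 × 9 third region and a 3 × 3 fourth region as in the example above, and assuming the inner-mean-minus-outer-mean sign convention of the reconstruction:

```python
import numpy as np

def local_mean_contrast(img: np.ndarray, x: int, y: int,
                        outer: int = 4, inner: int = 1) -> float:
    """Mean gray of the fourth (inner) region minus mean gray of the
    non-overlapping part of the third (outer) region around pixel (x, y)."""
    win = img[x - outer:x + outer + 1, y - outer:y + outer + 1].astype(float)
    inner_mask = np.zeros_like(win, dtype=bool)
    inner_mask[outer - inner:outer + inner + 1, outer - inner:outer + inner + 1] = True
    return win[inner_mask].mean() - win[~inner_mask].mean()
```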
In one embodiment, as shown in fig. 9, the image feature includes a gray scale gradient feature, and the step of obtaining the gray scale gradient feature corresponding to the current pixel point includes:
step S902, obtaining a third region and a fourth region on the current image according to the position of the current pixel point, where the fourth region is a sub-region of the third region, and the current pixel point is located inside the fourth region.
Specifically, the gradation gradient feature refers to a feature related to a difference in gradation from pixel to pixel. The third area and the fourth area may be obtained by referring to the method in step S802, and details are not repeated.
In step S904, a non-overlapping image region between the third region and the fourth region is acquired.
Specifically, since the fourth region is a sub-region of the third region, the non-overlapping image region between the third region and the fourth region is a region other than the fourth region in the third region.
Step S906, a first gray scale difference between each pixel point and an adjacent pixel point in the non-overlapping image region is obtained, and a second gray scale difference between each pixel point and an adjacent pixel point in the fourth region is obtained.
Specifically, the adjacent pixel points refer to pixel points with coincident boundaries with the pixel points, and the gray difference value between the adjacent pixel points and all the adjacent pixel points can be calculated, and the gray difference value between part of the adjacent pixel points can also be calculated. In one embodiment, the gray scale difference value may be divided into a gray scale difference value in a horizontal direction and a gray scale difference value in a vertical direction, and the gray scale difference value may be one of the gray scale difference value in the horizontal direction and the gray scale difference value in the vertical direction, or the gray scale difference value may be the sum of the gray scale difference value in the horizontal direction and the gray scale difference value in the vertical direction. To calculate the pixel P of FIG. 3A44Taking the corresponding first gray scale difference as an example, P can be expressed44And P45The absolute value of the gray difference between them is taken as P44Corresponding gray scale difference in horizontal direction, P44And P54The absolute value of the gray difference between them is taken as P44Corresponding gray difference value in vertical direction, and adding the gray difference value in horizontal direction and the gray difference value in vertical direction to obtain P44And the corresponding first gray scale difference value. The calculation method of the gray difference value is expressed by the following formula:
Gh(x,y)=|F(x,y)-F(x+1,y)| (7)
Gv(x,y)=|F(x,y)-F(x,y+1)| (8)
G(x,y)=Gh(x,y)+Gv(x,y) (9)
wherein Gh(x, y) represents the gray difference value in the horizontal direction, Gv(x, y) represents the gray difference value in the vertical direction, G(x, y) is the gray difference value corresponding to the pixel point, and F(x, y) represents the gray value corresponding to the pixel point (x, y), where x may represent a column and y a row.
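As an illustrative sketch only (the array orientation and zero values at the last row and column are assumptions), formulas (7) to (9) can be computed for every pixel point at once:

    import numpy as np

    def gray_difference(image):
        f = image.astype(np.float64)
        # Formula (7): absolute gray difference with the horizontally
        # adjacent pixel point; the last column has no right neighbour.
        gh = np.zeros_like(f)
        gh[:, :-1] = np.abs(f[:, :-1] - f[:, 1:])
        # Formula (8): absolute gray difference with the vertically
        # adjacent pixel point; the last row has no lower neighbour.
        gv = np.zeros_like(f)
        gv[:-1, :] = np.abs(f[:-1, :] - f[1:, :])
        # Formula (9): sum of the horizontal and vertical differences.
        return gh + gv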
Step S908 is to count the first gray scale difference corresponding to each pixel point in the non-overlapping image region to obtain a third statistical result, and to count the second gray scale difference corresponding to each pixel point in the fourth region to obtain a fourth statistical result.
Specifically, the third statistical result and the fourth statistical result may be the sum of the gray difference values or the average of the gray difference values. For example, the first gray difference values of the pixel points in the non-overlapping region may be summed and then divided by the number of pixel points in the non-overlapping region to obtain the average gray difference value corresponding to the non-overlapping region, which serves as the third statistical result. Similarly, the second gray difference values of the pixel points in the fourth region may be summed and divided by the number of pixel points in the fourth region to obtain the average gray difference value corresponding to the fourth region, which serves as the fourth statistical result.
Step S910, obtaining a gray scale gradient feature according to the third statistical result and the fourth statistical result.
Specifically, after the third statistical result and the fourth statistical result are obtained, they are combined to obtain the gray gradient feature. For example, the gray gradient feature may be the ratio of the third statistical result to the fourth statistical result, or the difference between the third statistical result and the fourth statistical result. When the gray gradient feature is the difference between the two statistical results, and both are averages of gray difference values, the gray gradient feature is calculated by formula (10), where lmg_i is the gray gradient feature corresponding to the i-th current pixel point, n_in is the number of pixel points in the fourth region, n_out is the number of pixel points in the non-overlapping image region of the third region, G(x, y) represents the gray difference value, R_in denotes the fourth region, and R_out denotes the non-overlapping image region:

lmg_i = (1/n_in) · Σ_{(x,y)∈R_in} G(x,y) − (1/n_out) · Σ_{(x,y)∈R_out} G(x,y)    (10)
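Continuing the sketch, formula (10) has the same two-region form as formula (6), applied to the gray difference map G rather than to the raw gray values; gray_difference and contrast_feature are the hypothetical helpers sketched above:

    def gray_gradient_feature(image, row, col, outer=5, inner=3):
        # Compute the gray difference map G(x, y) once, then reuse the
        # two-region statistic of formula (6) on it.
        g = gray_difference(image)
        return contrast_feature(g, row, col, outer, inner)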
In the embodiment of the invention, when the image features corresponding to the current pixel point are calculated, they are calculated over the region corresponding to the current pixel point, which is divided into two areas for the calculation, so that the obtained image features reflect the environment in which the current pixel point is located, and the resulting first probability has high accuracy.
In one embodiment, the image features may include one or both of contrast features and gray scale gradient features.
In one embodiment, as shown in fig. 10, the step S210 of identifying and obtaining the image area where the target object is located in the current image according to the background similarity and the first probability corresponding to each pixel point includes:
step S1002, a second probability corresponding to the current pixel point is obtained according to the background similarity, wherein the second probability is the probability that the current pixel point belongs to the target object pixel point, and the second probability and the background similarity are in a negative correlation relationship.
Specifically, the background similarity and the second probability being in a negative correlation relationship means that the second probability becomes smaller as the background similarity increases. For example, the second probability may be the inverse of the background similarity. Alternatively, a correspondence between ranges of the background similarity and the second probability may be set; for example, when the background similarity is 0-10%, the corresponding second probability is 90%, and when the background similarity is 11-20%, the corresponding second probability is 80%. Alternatively, the second probability may be expressed by the following formula, where Bg(x, y) represents the background similarity and Hb(x, y) represents the second probability corresponding to the pixel point (x, y).
Hb(x,y)=1-Bg(x,y) (11)
Step S1004, determining a current target probability corresponding to the current pixel point according to the first probability and the second probability, where the current target probability is a probability that the current pixel point belongs to a target object pixel point.
Specifically, the manner of determining the current target probability corresponding to the current pixel point according to the first probability and the second probability may be set as needed. For example, the first probability and the second probability may be multiplied, and the product used as the current target probability; or the average of the first probability and the second probability may be used as the current target probability. A correspondence between the product and the target probability may also be set, and the target probability obtained from the product. The target probability may further be obtained by combining a third probability, obtained by another method, that the current pixel point belongs to a target object pixel point; for example, the target probability may be the product of the first probability, the second probability and the third probability.
Step S1006, identifying and obtaining the image area where the target object in the current image is located according to the target probability corresponding to each pixel point in the current image.
Specifically, the current image includes a plurality of pixel points, and an image area where the target object is located needs to be obtained according to the target probability corresponding to each pixel point. For example, the pixel points with the target probability greater than the preset probability may be used as the target object pixel points, and the region formed by the target object pixel points may be used as the image region where the target object is located.
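A minimal sketch of steps S1002 to S1006, assuming the first probabilities and background similarities have already been computed as per-pixel arrays, and choosing the product rule and a preset probability (the names and the 0.85 value are illustrative, not prescribed):

    import numpy as np

    def target_mask(first_prob, bg_similarity, preset=0.85):
        # Step S1002, formula (11): the second probability is negatively
        # correlated with the background similarity.
        second_prob = 1.0 - bg_similarity
        # Step S1004: target probability as the product of the first
        # and second probabilities.
        target_prob = first_prob * second_prob
        # Step S1006: pixel points above the preset probability are taken
        # as target object pixel points.
        return target_prob > preset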
In an embodiment, the image area where the target object is located in the current image may also be obtained with reference to the positional relationship of the pixel points. For example, for a target object whose corresponding region is continuous, a pixel point whose target probability is greater than the preset probability but which stands alone, i.e. none of whose surrounding pixel points has a target probability greater than the preset probability, may be excluded from the target object pixel points.
In one embodiment, as shown in fig. 11, the step S1006 of identifying and obtaining an image area where the target object is located in the current image according to the target probability corresponding to each pixel point in the current image includes:
step S1102, a first pixel point in the current image whose target probability is greater than the first threshold is obtained.
Specifically, the first threshold may be set as needed, for example according to the required recognition accuracy; experiments show that a first threshold of 0.85 gives high recognition accuracy.
In an embodiment, for a pixel point with a target probability smaller than the first threshold, the pixel point may be used as a background pixel point, and the target probability corresponding to the pixel point is updated to 0.
Step S1104, a distribution characteristic of the target probability corresponding to the first pixel point is obtained.
Specifically, the distribution characteristics reflect the distribution of the obtained target probability, and may include one or more of the maximum value, the minimum value, the average value of the target probability, and the distribution ratio or number in each numerical range. The specific setting can be according to needs. The respective numerical ranges may be predetermined, for example, 0.85 to 0.88 may be set as a first numerical range, and 0.89 to 0.92 may be set as a second numerical range.
In step S1106, a second threshold is obtained according to the distribution characteristics.
Specifically, after the distribution characteristics are obtained, the second threshold is obtained from them. The method of obtaining the second threshold may be set as needed; for example, the second threshold may be calculated by an image threshold segmentation algorithm, which may include one or more of OTSU (the maximum inter-class variance algorithm), an iterative threshold selection method, a maximum entropy method and the like. When the second threshold is obtained by such an image segmentation algorithm, the gray value feature is replaced with the target probability corresponding to the first pixel points in calculating the second threshold.
Step S1108, a region obtained by combining the first pixel points in the current image whose target probability is greater than the second threshold is obtained as an image region where the target object is located in the current image.
Specifically, after the second threshold is obtained, the first pixel points with the target probability greater than the second threshold are used as target object pixel points, and the area obtained by combining the target object pixel points is used as the image area where the target object is located in the current image.
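Steps S1102 to S1108 may be sketched as the two-stage screening below; select_second_threshold stands for whichever threshold segmentation algorithm is chosen (for example the iterative method sketched after the worked example below) and is an assumed name:

    def locate_target_region(target_prob, first_threshold=0.85):
        # Step S1102: first pixel points whose target probability exceeds
        # the first threshold.
        first_mask = target_prob > first_threshold
        # Steps S1104-S1106: derive the second threshold from the
        # distribution of the retained target probabilities.
        second_threshold = select_second_threshold(target_prob[first_mask])
        # Step S1108: pixel points above the second threshold form the
        # image region where the target object is located.
        return target_prob > second_threshold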
In the embodiment of the invention, after the pixel points are preliminarily screened by the first threshold, the second threshold is obtained according to the distribution characteristics of the target probabilities of the remaining pixel points, and the image is segmented again according to the second threshold to further screen out the target object pixel points, so that the image identification accuracy is further improved.
The method provided by the embodiment of the invention is explained by a specific embodiment, and comprises the following steps:
1. A current image of a target to be identified is acquired. Assume that the current image is an image of 7 × 7 pixels, that is, it comprises 49 pixel points in 7 rows and 7 columns.
2. Each pixel point is taken as the current pixel point in turn, and the background similarity and the first probability corresponding to each current pixel point are calculated according to the method provided by the embodiment of the invention. When the background similarity is calculated, the sizes of the first region and the third region are 5 × 5 pixels, and the sizes of the second region and the fourth region are 3 × 3 pixels.
3. The second probability corresponding to each pixel point is calculated from its background similarity, the second probability being 1 minus the background similarity.
4. The target probability corresponding to each pixel point is obtained from the first probability and the second probability, the target probability being the product of the two. As shown in fig. 12, one square in fig. 12 represents one pixel point, and the number in the square is the target probability corresponding to that pixel point.
5. The first pixel points, whose target probability in the current image is greater than the first threshold, are obtained. With a first threshold of 0.85, P16, P25, P26, P34, P35, P36, P45 and P46 shown in fig. 12 are the first pixel points.
6. A second threshold is obtained. The second threshold may be obtained by an iterative threshold selection method: an initial threshold T is selected, and the image is divided into two parts R1 and R2 according to T; the average values u1 and u2 of the regions R1 and R2 are calculated, a new threshold T = (u1 + u2)/2 is selected, and the above process is repeated until the new threshold no longer changes, or changes by less than a set threshold compared with the previous threshold. In the embodiment of the invention, the maximum and minimum target probabilities corresponding to the first pixel points are first obtained, which are 0.96 and 0.86 respectively, so the initial threshold is (0.86 + 0.96)/2 = 0.91. Classifying the first pixel points with 0.91 as the threshold, P26, P36 and P45 are obtained as target object pixel points, and the remaining first pixel points as background pixel points. The target probability average of the target object pixel points P26, P36 and P45 is (0.96 + 0.91 + 0.96)/3 ≈ 0.94, and the target probability average of the background pixel points is (0.89 + 0.86 + 0.87 + 0.87 + 0.89)/5 = 0.876, so the new threshold is (0.876 + 0.94)/2 = 0.913, and the change between the new threshold and the initial threshold is 0.913 − 0.91 = 0.003. Assuming that the change is smaller than the set threshold, 0.913 is the second threshold; assuming that it is larger, the first pixel points are classified again with 0.913 as the threshold, and the above steps are repeated until the new threshold no longer changes, or changes by less than the set threshold compared with the previous threshold.
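The iterative threshold selection described above can be sketched as follows; eps plays the role of the set threshold on the change between successive thresholds, and the guard for an empty class is an added assumption:

    import numpy as np

    def select_second_threshold(values, eps=0.005):
        values = np.asarray(values, dtype=np.float64)
        # Initial threshold: midpoint of the minimum and maximum values,
        # e.g. (0.86 + 0.96) / 2 = 0.91 in the worked example above.
        t = (values.min() + values.max()) / 2.0
        while True:
            r1 = values[values > t]    # tentative target object pixel points
            r2 = values[values <= t]   # tentative background pixel points
            if r1.size == 0 or r2.size == 0:
                return t
            # New threshold: mean of the two class averages.
            t_new = (r1.mean() + r2.mean()) / 2.0
            # Stop once the threshold no longer changes, or changes by
            # less than the set threshold.
            if abs(t_new - t) < eps:
                return t_new
            t = t_new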
7. The region obtained by combining the first pixel points whose target probability is greater than the second threshold in the current image is obtained and taken as the image area where the target object is located. Assuming that the second threshold is finally 0.87, the region formed by the five pixel points P16, P26, P36, P45 and P46 in fig. 12 is the image area where the target object is located in the current image.
In an embodiment, taking as an example background suppression performed after image recognition on nine segments of infrared video by the feature selective filtering method (CSF), the maximum minimum difference method (DMMF), the multi-scale gradient method (MSG) and the method provided by the embodiment of the invention, the effect of the method provided by the embodiment of the invention is further explained. Three indexes, namely signal-to-noise ratio gain (ISNR), contrast gain (ISCR) and background suppression factor (BSF), are used to evaluate the background suppression results. The larger the background suppression factor, the better the global background smoothing effect; the larger the signal-to-noise ratio gain and the contrast gain, the stronger the target enhancement and clutter suppression capability of the method. The specific index results are shown in table 1, where seq denotes a video and the number after seq is the video number. As can be seen from table 1: the feature selective filtering method (CSF) has a higher BSF but lower ISNR and ISCR, indicating better global background smoothing but a poor clutter suppression effect; the maximum minimum difference method (DMMF) has lower ISNR and BSF, indicating that the target signal-to-noise ratio is not enhanced and the degree of global background suppression is poor; the three indexes of the multi-scale gradient method (MSG) are all low, indicating general background suppression performance. The method provided by the embodiment of the invention has higher ISNR, ISCR and BSF indexes, achieves better global background smoothing and local clutter suppression, and can effectively suppress complex background clutter and highlight the target.
Table 1 image identification evaluation index of four detection algorithms
In one embodiment, the detection rate (r), the accuracy rate (p) and the comprehensive index (F1) may further be used to evaluate the identification effect for weak and small targets, where r is the ratio of the number of correctly detected targets to the total number of real targets, p is the ratio of the number of correctly detected targets to the total number of detected targets, and F1 is a comprehensive index of r and p, which may be the weighted harmonic mean of r and p. A higher F1 value, obtained when a higher r value is maintained together with a higher p value, indicates good recognition performance.
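For illustration, with equal weights the comprehensive index reduces to the standard F1 score, the harmonic mean of r and p; the counts passed in are assumed to be known from the evaluation:

    def detection_metrics(num_correct, num_real, num_detected):
        r = num_correct / num_real        # detection rate
        p = num_correct / num_detected    # accuracy rate
        f1 = 2 * r * p / (r + p)          # harmonic mean of r and p
        return r, p, f1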
The detection and evaluation results of the four methods on the nine segments of infrared video are shown in table 2. As can be seen from table 2: the feature selective filtering method (CSF) has a high detection rate but low accuracy and general overall detection performance; the multi-scale gradient method (MSG) and the maximum minimum difference method (DMMF) have higher detection rates but lower accuracy. The method provided by the embodiment of the invention has both a high detection rate and high accuracy, with the F1 index reaching 97.7%, and shows better detection stability.
Table 2 average target detection evaluation index of four detection methods
In addition, three images are selected: the background of the first image is cloud-free with a dark sky, the background of the second contains clouds with a far shooting distance, and the background of the third contains clouds with a near shooting distance, as shown in fig. 13. The images in the first column are the original images, and the small point in each square frame represents the actual position of the target object. The target object in each image is identified by the four methods; the images in the second to fifth columns of fig. 13 are the detection results corresponding to the feature selective filtering method (CSF), the maximum minimum difference method (DMMF), the multi-scale gradient method (MSG) and the method provided by the embodiment of the invention, respectively. As can be seen from fig. 13, the method provided by the embodiment of the invention can accurately identify the image position where the target object is located.
As shown in fig. 14, in one embodiment, an image recognition apparatus is provided, which may be integrated in the computer device 120 described above, and specifically may include a current image acquisition module 1402, a background area determination module 1404, a similarity calculation module 1406, a first probability derivation module 1408, and a target area recognition module 1410.
A current image obtaining module 1402, configured to obtain a current image of the target to be identified.
A background region determining module 1404, configured to obtain a current pixel point from the current image, and determine a corresponding background reference region according to a position of the current pixel point.
The similarity calculating module 1406 is configured to calculate a background similarity corresponding to the current pixel point according to the background reference region.
A first probability obtaining module 1408, configured to process, according to the trained image target identification model, the image feature corresponding to the current pixel point to obtain a first probability corresponding to the current pixel point, where the first probability is a probability that the current pixel point belongs to a target object pixel point.
And the target area identification module 1410 is configured to calculate and obtain a background similarity and a first probability corresponding to each pixel point in the current image, and identify and obtain an image area where the target object is located in the current image according to the background similarity and the first probability corresponding to each pixel point.
In one embodiment, the background region determination module 1404 includes:
the first region acquisition unit is used for acquiring a first region and a second region on the current image according to the position of the current pixel point, wherein the second region is a sub-region of the first region, and the current pixel point is positioned in the second region.
A first area determination unit for taking a non-overlapping image area between the first area and the second area as a background reference area.
In one embodiment, as shown in fig. 15, the similarity calculation module 1406 includes:
the gray value obtaining unit 1406A is configured to obtain a target pixel point from the background reference region, and obtain a gray value corresponding to the target pixel point and the current pixel point.
The reference vector forming unit 1406B is configured to calculate a reference gray value according to the gray value of the target pixel, perform an inverse operation on the reference gray value to obtain a corresponding complementary reference gray value, and form a reference gray value vector by the reference gray value and the complementary reference gray value.
The current vector forming unit 1406C is configured to perform an inverse operation on the gray value corresponding to the current pixel point to obtain a corresponding complementary current gray value, and form the gray value corresponding to the current pixel point and the complementary current gray value into a current gray value vector.
The similarity calculation unit 1406D is configured to calculate, according to the reference gray value vector and the current gray value vector, a background similarity corresponding to the current pixel point.
In one embodiment, the image target recognition model is a support vector clustering model, and the first probability obtaining module includes:
and the model parameter acquisition unit is used for acquiring a characteristic mapping function corresponding to the image target identification model and acquiring a central value of a characteristic space corresponding to the characteristic mapping function.
And the mapping value calculating unit is used for calculating the image characteristics according to the characteristic mapping function to obtain the mapping values corresponding to the image characteristics.
A first distance calculation unit for calculating a first distance between the mapped value and the center value.
And the first probability obtaining unit is used for obtaining a first probability corresponding to the current pixel point according to the first distance calculation, wherein the first distance and the first probability are in a negative correlation relationship.
In one embodiment, the image recognition apparatus further includes:
and the second distance acquisition module is used for acquiring a second distance from the center of the characteristic space to the boundary of the characteristic space.
The first probability deriving unit is for: and calculating a proportional value of the first distance and the second distance. And calculating to obtain a first probability corresponding to the current pixel point according to the proportion value, wherein the proportion value and the first probability are in a negative correlation relationship.
In one embodiment, the image recognition apparatus further includes:
and the training area acquisition module is used for acquiring a training image and acquiring a training area corresponding to the target object in the training image.
And the training feature acquisition module is used for acquiring training image features corresponding to all pixel points in the training area.
And the training module is used for carrying out model training according to the training image characteristics to obtain a characteristic mapping function for mapping the training image characteristics to the minimum characteristic space and a central value of the characteristic space.
In one embodiment, the image recognition apparatus further includes:
and the second area acquisition unit is used for acquiring a third area and a fourth area on the current image according to the position of the current pixel point, wherein the fourth area is a sub-area of the third area, and the current pixel point is positioned in the fourth area.
A second region determining unit for acquiring a non-overlapping image region between the third region and the fourth region.
The first statistical unit is used for counting the gray values of the pixel points corresponding to the non-overlapped image areas to obtain a first statistical result, and counting the gray values of the pixel points corresponding to the fourth area to obtain a second statistical result.
And the contrast characteristic obtaining unit is used for obtaining the contrast characteristic according to the first statistical result and the second statistical result.
In one embodiment, as shown in fig. 16, the target area identification module 1410 includes:
the second probability obtaining unit 1410A is configured to obtain a second probability corresponding to the current pixel point according to the background similarity, where the second probability is a probability that the current pixel point belongs to a target object pixel point, and the second probability and the background similarity are in a negative correlation relationship.
And a target probability obtaining unit 1410B, configured to determine a current target probability corresponding to the current pixel according to the first probability and the second probability, where the current target probability is a probability that the current pixel belongs to a target object pixel.
And the target area identification unit 1410C is configured to identify an image area where the target object in the current image is located according to the target probability corresponding to each pixel point in the current image.
In one embodiment, the target region identifying unit 1410C is configured to: and acquiring a first pixel point of which the target probability is greater than a first threshold value in the current image. And acquiring the distribution characteristics of the target probability corresponding to the first pixel point. And obtaining a second threshold value according to the distribution characteristics. And obtaining a region obtained by combining the first pixel points with the target probability larger than the second threshold value in the current image, and taking the region as an image region where the target object in the current image is located.
FIG. 17 is a diagram illustrating an internal structure of a computer device in one embodiment. The computer device may specifically be the computer device 120 in fig. 1. As shown in fig. 17, the computer apparatus includes a processor, a memory, a network interface, and an input device connected through a system bus. Wherein the memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program that, when executed by the processor, causes the processor to implement the image recognition method. The internal memory may also have stored therein a computer program that, when executed by the processor, causes the processor to perform an image recognition method. The input device of the computer equipment can be a touch layer covered on a display screen, a key, a track ball or a touch pad arranged on a shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 17 is merely a block diagram of part of the structures associated with the disclosed solution and does not limit the computer devices to which the disclosed solution applies; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, the image recognition apparatus provided in the present application may be implemented in the form of a computer program, and the computer program may be run on a computer device as shown in fig. 17. The memory of the computer device may store various program modules constituting the image recognition apparatus, such as a current image acquisition module 1402, a background area determination module 1404, a similarity calculation module 1406, a first probability derivation module 1408, and a target area recognition module 1410 shown in fig. 14. The computer program constituted by the respective program modules causes the processor to execute the steps in the image recognition method of the respective embodiments of the present application described in the present specification.
For example, the computer device shown in fig. 17 may acquire a current image of the target to be recognized through the current image acquisition module 1402 in the image recognition apparatus shown in fig. 14; obtaining a current pixel point from a current image through a background area determining module 1404, and determining a corresponding background reference area according to the position of the current pixel point; calculating the background similarity corresponding to the current pixel point according to the background reference region by the similarity calculation module 1406; processing the image features corresponding to the current pixel point according to the trained image target identification model by a first probability obtaining module 1408 to obtain a first probability corresponding to the current pixel point, where the first probability is the probability that the current pixel point belongs to a target object pixel point; the background similarity and the first probability corresponding to each pixel point in the current image are calculated and obtained through the target area identification module 1410, and the image area where the target object is located in the current image is identified and obtained according to the background similarity and the first probability corresponding to each pixel point.
In one embodiment, a computer device is proposed, the computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program: acquiring a current image of a target to be identified; acquiring a current pixel point from a current image, and determining a corresponding background reference area according to the position of the current pixel point; calculating the background similarity corresponding to the current pixel point according to the background reference area; processing the image characteristics corresponding to the current pixel point according to the trained image target identification model to obtain a first probability corresponding to the current pixel point, wherein the first probability is the probability that the current pixel point belongs to a target object pixel point; calculating to obtain the background similarity and the first probability corresponding to each pixel point in the current image, and identifying to obtain the image area where the target object is located in the current image according to the background similarity and the first probability corresponding to each pixel point.
In one embodiment, the processor performs the step of determining the corresponding background reference region according to the position of the current pixel point, including: acquiring a first region and a second region on a current image according to the position of a current pixel point, wherein the second region is a sub-region of the first region, and the current pixel point is positioned in the second region; the non-overlapping image area between the first area and the second area is taken as a background reference area.
In one embodiment, the calculating, by the processor, the background similarity corresponding to the current pixel point according to the background reference region includes: acquiring a target pixel point from a background reference region, and acquiring a gray value corresponding to the target pixel point and a current pixel point; calculating according to the gray value of the target pixel point to obtain a reference gray value, performing negation operation on the reference gray value to obtain a corresponding complementary reference gray value, and forming a reference gray value vector by the reference gray value and the complementary reference gray value; performing negation operation on the gray value corresponding to the current pixel point to obtain a corresponding complementary current gray value, and forming a current gray value vector by the gray value corresponding to the current pixel point and the complementary current gray value; and calculating the background similarity corresponding to the current pixel point according to the reference gray value vector and the current gray value vector.
In one embodiment, the image target recognition model is a support vector clustering model, and the processor performs processing on the image features corresponding to the current pixel point according to the trained image target recognition model, and obtaining the first probability corresponding to the current pixel point includes: acquiring a feature mapping function corresponding to the image target recognition model, and acquiring a central value of a feature space corresponding to the feature mapping function; calculating the image characteristics according to the characteristic mapping function to obtain a mapping value corresponding to the image characteristics; calculating a first distance between the mapping value and the central value; and calculating to obtain a first probability corresponding to the current pixel point according to the first distance, wherein the first distance and the first probability are in a negative correlation relationship.
In one embodiment, the computer program further causes the processor to perform the steps of: acquiring a second distance from the center of the feature space to the boundary of the feature space; calculating a first probability that the current pixel point is a pixel point corresponding to the target according to the first distance, wherein the first probability comprises the following steps: calculating a proportional value of the first distance and the second distance; and calculating to obtain a first probability corresponding to the current pixel point according to the proportion value, wherein the proportion value and the first probability are in a negative correlation relationship.
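A hedged sketch of this proportion-value step: the mapping from the proportion value to the first probability is not fixed by the description, and 1 − ratio below is merely one decreasing choice satisfying the stated negative correlation; the mapped value, center value and center-to-boundary distance are assumed to be available from the trained model:

    import numpy as np

    def first_probability(mapped_value, center_value, second_distance):
        # First distance: distance between the mapped image feature and
        # the center of the feature space.
        first_distance = np.linalg.norm(mapped_value - center_value)
        # Proportion value of the first distance to the second distance
        # (from the center of the feature space to its boundary).
        ratio = first_distance / second_distance
        # Negative correlation: a larger proportion value gives a smaller
        # first probability (clamped to [0, 1]).
        return float(np.clip(1.0 - ratio, 0.0, 1.0))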
In one embodiment, the processor performs the step of deriving an image object recognition model comprising: acquiring a training image, and acquiring a training area corresponding to a target object in the training image; acquiring training image characteristics corresponding to each pixel point in a training area; and carrying out model training according to the training image characteristics to obtain a characteristic mapping function for mapping the training image characteristics to the minimum characteristic space and a central value of the characteristic space.
In one embodiment, the processor performs the step of obtaining the image feature including a contrast feature, and the step of obtaining the contrast feature corresponding to the current pixel point includes: acquiring a third area and a fourth area on the current image according to the position of the current pixel point, wherein the fourth area is a sub-area of the third area, and the current pixel point is positioned in the fourth area; acquiring a non-overlapping image area between the third area and the fourth area; counting the gray values of the pixel points corresponding to the non-overlapped image areas to obtain a first statistical result, and counting the gray values of the pixel points corresponding to the fourth area to obtain a second statistical result; and obtaining contrast characteristics according to the first statistical result and the second statistical result.
In one embodiment, the identifying, by the processor, the image region where the target object is located in the current image according to the background similarity and the first probability corresponding to each pixel includes: obtaining a second probability corresponding to the current pixel point according to the background similarity, wherein the second probability is the probability that the current pixel point belongs to the target object pixel point, and the second probability and the background similarity are in a negative correlation relationship; determining a current target probability corresponding to the current pixel point according to the first probability and the second probability, wherein the current target probability is the probability that the current pixel point belongs to a target object pixel point; and identifying and obtaining an image area where the target object in the current image is located according to the target probability corresponding to each pixel point in the current image.
In one embodiment, the identifying, by the processor, the image region where the target object is located in the current image according to the target probability corresponding to each pixel point in the current image includes: acquiring a first pixel point of which the target probability is greater than a first threshold value in a current image; acquiring the distribution characteristics of the target probability corresponding to the first pixel point; obtaining a second threshold value according to the distribution characteristics; and obtaining a region obtained by combining the first pixel points with the target probability larger than the second threshold value in the current image, and taking the region as an image region where the target object in the current image is located.
In one embodiment, a computer readable storage medium is provided, having a computer program stored thereon, which, when executed by a processor, causes the processor to perform the steps of: acquiring a current image of a target to be identified; acquiring a current pixel point from a current image, and determining a corresponding background reference area according to the position of the current pixel point; calculating the background similarity corresponding to the current pixel point according to the background reference area; processing the image characteristics corresponding to the current pixel point according to the trained image target identification model to obtain a first probability corresponding to the current pixel point, wherein the first probability is the probability that the current pixel point belongs to a target object pixel point; calculating to obtain the background similarity and the first probability corresponding to each pixel point in the current image, and identifying to obtain the image area where the target object is located in the current image according to the background similarity and the first probability corresponding to each pixel point.
In one embodiment, the processor performs the step of determining the corresponding background reference region according to the position of the current pixel point, including: acquiring a first region and a second region on a current image according to the position of a current pixel point, wherein the second region is a sub-region of the first region, and the current pixel point is positioned in the second region; the non-overlapping image area between the first area and the second area is taken as a background reference area.
In one embodiment, the calculating, by the processor, the background similarity corresponding to the current pixel point according to the background reference region includes: acquiring a target pixel point from a background reference region, and acquiring a gray value corresponding to the target pixel point and a current pixel point; calculating according to the gray value of the target pixel point to obtain a reference gray value, performing negation operation on the reference gray value to obtain a corresponding complementary reference gray value, and forming a reference gray value vector by the reference gray value and the complementary reference gray value; performing negation operation on the gray value corresponding to the current pixel point to obtain a corresponding complementary current gray value, and forming a current gray value vector by the gray value corresponding to the current pixel point and the complementary current gray value; and calculating the background similarity corresponding to the current pixel point according to the reference gray value vector and the current gray value vector.
In one embodiment, the image target recognition model is a support vector clustering model, and the processor performs processing on the image features corresponding to the current pixel point according to the trained image target recognition model, and obtaining the first probability corresponding to the current pixel point includes: acquiring a feature mapping function corresponding to the image target recognition model, and acquiring a central value of a feature space corresponding to the feature mapping function; calculating the image characteristics according to the characteristic mapping function to obtain a mapping value corresponding to the image characteristics; calculating a first distance between the mapping value and the central value; and calculating to obtain a first probability corresponding to the current pixel point according to the first distance, wherein the first distance and the first probability are in a negative correlation relationship.
In one embodiment, the computer program further causes the processor to perform the steps of: acquiring a second distance from the center of the feature space to the boundary of the feature space; calculating a first probability that the current pixel point is a pixel point corresponding to the target according to the first distance, wherein the first probability comprises the following steps: calculating a proportional value of the first distance and the second distance; and calculating to obtain a first probability corresponding to the current pixel point according to the proportion value, wherein the proportion value and the first probability are in a negative correlation relationship.
In one embodiment, the processor performs the step of deriving an image object recognition model comprising: acquiring a training image, and acquiring a training area corresponding to a target object in the training image; acquiring training image characteristics corresponding to each pixel point in a training area; and carrying out model training according to the training image characteristics to obtain a characteristic mapping function for mapping the training image characteristics to the minimum characteristic space and a central value of the characteristic space.
In one embodiment, the processor performs the step of obtaining the image feature including a contrast feature, and the step of obtaining the contrast feature corresponding to the current pixel point includes: acquiring a third area and a fourth area on the current image according to the position of the current pixel point, wherein the fourth area is a sub-area of the third area, and the current pixel point is positioned in the fourth area; acquiring a non-overlapping image area between the third area and the fourth area; counting the gray values of the pixel points corresponding to the non-overlapped image areas to obtain a first statistical result, and counting the gray values of the pixel points corresponding to the fourth area to obtain a second statistical result; and obtaining contrast characteristics according to the first statistical result and the second statistical result.
In one embodiment, the identifying, by the processor, the image region where the target object is located in the current image according to the background similarity and the first probability corresponding to each pixel includes: obtaining a second probability corresponding to the current pixel point according to the background similarity, wherein the second probability is the probability that the current pixel point belongs to the target object pixel point, and the second probability and the background similarity are in a negative correlation relationship; determining a current target probability corresponding to the current pixel point according to the first probability and the second probability, wherein the current target probability is the probability that the current pixel point belongs to a target object pixel point; and identifying and obtaining an image area where the target object in the current image is located according to the target probability corresponding to each pixel point in the current image.
In one embodiment, the identifying, by the processor, the image region where the target object is located in the current image according to the target probability corresponding to each pixel point in the current image includes: acquiring a first pixel point of which the target probability is greater than a first threshold value in a current image; acquiring the distribution characteristics of the target probability corresponding to the first pixel point; obtaining a second threshold value according to the distribution characteristics; and obtaining a region obtained by combining the first pixel points with the target probability larger than the second threshold value in the current image, and taking the region as an image region where the target object in the current image is located.
It should be understood that, although the steps in the flowcharts of the embodiments of the present invention are shown in sequence as indicated by the arrows, these steps are not necessarily performed in the order indicated. Unless explicitly stated otherwise herein, the order of execution of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in the embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and whose order of execution is not necessarily sequential; they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium; when the program is executed, it may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the present invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (20)

1. An image recognition method, the method comprising:
acquiring a current image of a target to be identified;
acquiring a current pixel point from the current image, and determining a corresponding background reference area according to the position of the current pixel point;
acquiring a target pixel point from the background reference region, calculating according to a gray value of the target pixel point to obtain a reference gray value, performing negation operation on the reference gray value to obtain a complementary reference gray value, forming the reference gray value and the complementary reference gray value into a reference gray value vector, performing negation operation on the gray value of the current pixel point to obtain a complementary current gray value, forming the gray value corresponding to the current pixel point and the complementary current gray value into a current gray value vector, and obtaining a background similarity corresponding to the current pixel point according to the reference gray value vector and the current gray value vector, wherein the background similarity is used for expressing the similarity degree between the current pixel point and a background, and the pixel characteristics comprise one or more of color characteristics, texture characteristics and gray characteristics;
processing the image characteristics corresponding to the current pixel point according to the trained image target identification model to obtain a first probability corresponding to the current pixel point, wherein the first probability is the probability that the current pixel point belongs to a target object pixel point;
calculating to obtain the background similarity and the first probability corresponding to each pixel point in the current image, and identifying and obtaining the image area where the target object is located in the current image according to the background similarity and the first probability corresponding to each pixel point.
2. The method of claim 1, wherein the determining the corresponding background reference region according to the position of the current pixel point comprises:
acquiring a first region and a second region on the current image according to the position of the current pixel point, wherein the second region is a sub-region of the first region, and the current pixel point is positioned in the second region;
and taking a non-overlapped image area between the first area and the second area as the background reference area.
3. The method of claim 1, wherein the obtaining the background similarity corresponding to the current pixel point according to the reference gray value vector and the current gray value vector comprises:
and taking any one of the median, average, maximum or minimum of the similarity between each reference gray value vector and the current gray value vector as the background similarity corresponding to the current pixel point.
4. The method of claim 1, wherein the image target recognition model is a support vector clustering model, and the processing the image features corresponding to the current pixel point according to the trained image target recognition model to obtain the first probability corresponding to the current pixel point comprises:
acquiring a feature mapping function corresponding to the image target recognition model, and acquiring a central value of a feature space corresponding to the feature mapping function;
calculating the image characteristics according to the characteristic mapping function to obtain a mapping value corresponding to the image characteristics;
calculating a first distance of the mapped value from the center value;
and calculating to obtain a first probability corresponding to the current pixel point according to the first distance, wherein the first distance and the first probability are in a negative correlation relationship.
5. The method of claim 4, further comprising:
acquiring a second distance from the center of the feature space to the boundary of the feature space;
the calculating according to the first distance to obtain a first probability that the current pixel is a pixel corresponding to the target includes:
calculating a proportional value of the first distance and the second distance;
and calculating a first probability corresponding to the current pixel point according to the proportion value, wherein the proportion value and the first probability are in a negative correlation relationship.
6. The method of claim 4, wherein the step of deriving the image object recognition model comprises:
acquiring a training image, and acquiring a training area corresponding to a target object in the training image;
acquiring training image characteristics corresponding to each pixel point in the training area;
and carrying out model training according to the training image characteristics to obtain a characteristic mapping function for mapping the training image characteristics to a minimum characteristic space and a central value of the characteristic space.
7. The method of claim 1, wherein the image features comprise contrast features, and the step of obtaining the contrast feature corresponding to the current pixel point comprises:
acquiring a third area and a fourth area on the current image according to the position of the current pixel point, wherein the fourth area is a sub-area of the third area, and the current pixel point is positioned in the fourth area;
acquiring a non-overlapping image area between the third area and the fourth area;
counting the gray values of the pixel points corresponding to the non-overlapped image areas to obtain a first statistical result, and counting the gray values of the pixel points corresponding to the fourth area to obtain a second statistical result;
and obtaining the contrast characteristic according to the first statistical result and the second statistical result.
8. The method according to claim 1, wherein the identifying the image area where the target object is located in the current image according to the background similarity and the first probability corresponding to each pixel point comprises:
obtaining a second probability corresponding to the current pixel point according to the background similarity, wherein the second probability is the probability that the current pixel point belongs to a target object pixel point, and the second probability and the background similarity are in a negative correlation relationship;
determining a current target probability corresponding to the current pixel point according to the first probability and the second probability, wherein the current target probability is the probability that the current pixel point belongs to a target object pixel point;
and identifying and obtaining an image area where a target object in the current image is located according to the target probability corresponding to each pixel point in the current image.
9. The method according to claim 8, wherein the identifying and obtaining the image area where the target object is located in the current image according to the target probability corresponding to each pixel point in the current image comprises:
acquiring a first pixel point of which the target probability is greater than a first threshold value in the current image;
acquiring the distribution characteristics of the target probability corresponding to the first pixel point;
obtaining a second threshold value according to the distribution characteristics;
and acquiring a region obtained by combining first pixel points with the target probability greater than the second threshold value in the current image, and taking the region as an image region where the target object in the current image is located.
10. An image recognition apparatus, the apparatus comprising:
a current image acquisition module, configured to acquire a current image of a target to be identified;
a background region determination module, configured to obtain a current pixel point from the current image, and determine a corresponding background reference region according to the position of the current pixel point;
a similarity calculation module, configured to: obtain target pixel points from the background reference region; calculate a reference gray value according to the gray values of the target pixel points; perform an inversion operation on the reference gray value to obtain a complementary reference gray value; form a reference gray value vector from the reference gray value and the complementary reference gray value; perform an inversion operation on the gray value of the current pixel point to obtain a complementary current gray value; form a current gray value vector from the gray value of the current pixel point and the complementary current gray value; and obtain the background similarity corresponding to the current pixel point according to the reference gray value vector and the current gray value vector, wherein the background similarity represents the degree of similarity between the current pixel point and the background, and the pixel features comprise one or more of color features, texture features, and gray-scale features;
a first probability obtaining module, configured to process the image features corresponding to the current pixel point according to a trained image target recognition model to obtain a first probability corresponding to the current pixel point, wherein the first probability is the probability that the current pixel point belongs to a target object pixel point;
and a target region identification module, configured to calculate the background similarity and the first probability corresponding to each pixel point in the current image, and identify, according to the background similarity and the first probability corresponding to each pixel point, the image region where the target object is located in the current image.
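To make the similarity calculation module concrete, a sketch assuming 8-bit gray values (so the inversion operation is 255 - g), the mean of the background reference region as the reference gray value, and cosine similarity between the two-component vectors; the claim does not commit to any of these choices.

```python
import numpy as np

def background_similarity(current_gray: float,
                          reference_grays: np.ndarray) -> float:
    """reference_grays: gray values of the target pixel points taken
    from the background reference region."""
    reference_gray = float(reference_grays.mean())      # reference gray value
    ref_vec = np.array([reference_gray, 255.0 - reference_gray])
    cur_vec = np.array([current_gray, 255.0 - current_gray])
    # Cosine similarity of [g, 255 - g] vectors: close gray values give
    # values near 1, so this measures similarity to the background.
    return float(ref_vec @ cur_vec /
                 (np.linalg.norm(ref_vec) * np.linalg.norm(cur_vec)))
```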
11. The apparatus of claim 10, wherein the background region determination module comprises:
a first region obtaining unit, configured to obtain a first region and a second region on the current image according to the position of the current pixel point, wherein the second region is a sub-region of the first region, and the current pixel point is located inside the second region;
and a first region determination unit, configured to take the non-overlapping image region between the first region and the second region as the background reference region.
12. The apparatus of claim 10, wherein the similarity calculation module is further configured to:
take any one of the median, average, maximum, or minimum of the similarities between each reference gray value vector and the current gray value vector as the background similarity corresponding to the current pixel point.
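Claim 12's aggregation step, sketched with the median as an assumed default; any of the four listed statistics satisfies the claim.

```python
import numpy as np

def aggregate_similarity(similarities: np.ndarray,
                         mode: str = "median") -> float:
    """similarities: one value per reference gray value vector."""
    reducers = {"median": np.median, "average": np.mean,
                "maximum": np.max, "minimum": np.min}
    return float(reducers[mode](similarities))
```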
13. The apparatus of claim 10, wherein the image target recognition model is a support vector clustering model, and wherein the first probability obtaining module comprises:
a model parameter obtaining unit, configured to obtain a feature mapping function corresponding to the image target recognition model, and obtain a center value of the feature space corresponding to the feature mapping function;
a mapping value calculation unit, configured to calculate mapping values corresponding to the image features according to the feature mapping function;
a first distance calculation unit, configured to calculate a first distance between the mapping value and the center value;
and a first probability obtaining unit, configured to obtain the first probability corresponding to the current pixel point through calculation according to the first distance, wherein the first distance and the first probability are negatively correlated.
14. The apparatus of claim 13, further comprising:
a second distance obtaining module, configured to obtain a second distance from the center of the feature space to the boundary of the feature space;
wherein the first probability obtaining unit is configured to:
calculate a ratio of the first distance to the second distance;
and calculate the first probability corresponding to the current pixel point according to the ratio, wherein the ratio and the first probability are negatively correlated.
15. The apparatus of claim 13, further comprising:
a training region acquisition module, configured to acquire a training image, and acquire a training region corresponding to a target object in the training image;
a training feature acquisition module, configured to acquire training image features corresponding to each pixel point in the training region;
and a training module, configured to perform model training according to the training image features to obtain a feature mapping function that maps the training image features into a minimal feature space, together with a center value of the feature space.
16. The apparatus of claim 10, wherein the image features comprise a contrast feature, the apparatus further comprising:
a second region obtaining unit, configured to obtain a third region and a fourth region on the current image according to the position of the current pixel point, wherein the fourth region is a sub-region of the third region, and the current pixel point is located inside the fourth region;
a second region determination unit, configured to acquire the non-overlapping image region between the third region and the fourth region;
a first statistics unit, configured to perform statistics on the gray values of the pixel points in the non-overlapping image region to obtain a first statistical result, and perform statistics on the gray values of the pixel points in the fourth region to obtain a second statistical result;
and a contrast feature obtaining unit, configured to obtain the contrast feature according to the first statistical result and the second statistical result.
17. The apparatus of claim 10, wherein the target region identification module comprises:
a second probability obtaining unit, configured to obtain a second probability corresponding to the current pixel point according to the background similarity, wherein the second probability is the probability that the current pixel point belongs to a target object pixel point, and the second probability and the background similarity are negatively correlated;
a target probability obtaining unit, configured to determine a current target probability corresponding to the current pixel point according to the first probability and the second probability, wherein the current target probability is the probability that the current pixel point belongs to a target object pixel point;
and a target region identification unit, configured to identify the image region where the target object is located in the current image according to the target probability corresponding to each pixel point in the current image.
18. The apparatus of claim 17, wherein the target region identification unit is configured to:
acquire first pixel points whose target probability is greater than a first threshold in the current image;
acquire distribution characteristics of the target probabilities corresponding to the first pixel points;
obtain a second threshold according to the distribution characteristics;
and take the region formed by the first pixel points whose target probability is greater than the second threshold as the image region where the target object is located in the current image.
19. A computer device, comprising a memory and a processor, wherein the memory stores a computer program which, when executed by the processor, causes the processor to perform the steps of the image recognition method of any one of claims 1 to 9.
20. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the image recognition method of any one of claims 1 to 9.
CN201810502263.0A 2018-05-23 2018-05-23 Image recognition method and device, computer equipment and storage medium Active CN108764325B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810502263.0A CN108764325B (en) 2018-05-23 2018-05-23 Image recognition method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN108764325A CN108764325A (en) 2018-11-06
CN108764325B true CN108764325B (en) 2022-07-08

Family

ID=64005379

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810502263.0A Active CN108764325B (en) 2018-05-23 2018-05-23 Image recognition method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN108764325B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109697460B (en) * 2018-12-05 2021-06-29 华中科技大学 Object detection model training method and target object detection method
CN109587248B (en) * 2018-12-06 2023-08-29 腾讯科技(深圳)有限公司 User identification method, device, server and storage medium
CN109635723B (en) * 2018-12-11 2021-02-09 讯飞智元信息科技有限公司 Shielding detection method and device
CN109886282B (en) * 2019-02-26 2021-05-28 腾讯科技(深圳)有限公司 Object detection method, device, computer-readable storage medium and computer equipment
CN111754538B (en) * 2019-06-29 2022-07-29 浙江大学 Threshold segmentation method for USB surface defect detection
CN110363241A (en) * 2019-07-11 2019-10-22 合肥联宝信息技术有限公司 A kind of picture comparative approach and device
CN113057676B (en) * 2019-10-18 2022-04-29 深圳北芯生命科技股份有限公司 Image noise reduction method of IVUS system
CN111639653B (en) * 2020-05-08 2023-10-10 浙江大华技术股份有限公司 False detection image determining method, device, equipment and medium
CN111598088B (en) * 2020-05-15 2023-12-29 京东方科技集团股份有限公司 Target detection method, device, computer equipment and readable storage medium
CN111932447B (en) * 2020-08-04 2024-03-22 中国建设银行股份有限公司 Picture processing method, device, equipment and storage medium
CN114151942A (en) * 2021-09-14 2022-03-08 海信家电集团股份有限公司 Air conditioner and human face area detection method
CN114820601B (en) * 2022-06-27 2022-09-16 合肥新晶集成电路有限公司 Target image updating method and system, wafer detection method and computer equipment
CN117853932B (en) * 2024-03-05 2024-05-14 华中科技大学 Sea surface target detection method, detection platform and system based on photoelectric pod
CN117834891B (en) * 2024-03-06 2024-05-07 成都凌亚科技有限公司 Video signal compression processing and transmitting method

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103632361B (en) * 2012-08-20 2017-01-18 阿里巴巴集团控股有限公司 An image segmentation method and a system
JP6332951B2 (en) * 2013-11-29 2018-05-30 キヤノン株式会社 Image processing apparatus, image processing method, and program
CN103996189B (en) * 2014-05-05 2017-10-03 小米科技有限责任公司 Image partition method and device
CN105303514B (en) * 2014-06-17 2019-11-05 腾讯科技(深圳)有限公司 Image processing method and device
CN104504007B (en) * 2014-12-10 2018-01-30 成都品果科技有限公司 The acquisition methods and system of a kind of image similarity
WO2017054775A1 (en) * 2015-09-30 2017-04-06 Shanghai United Imaging Healthcare Co., Ltd. System and method for determining a breast region in a medical image
CN108010034A (en) * 2016-11-02 2018-05-08 广州图普网络科技有限公司 Commodity image dividing method and device

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104392468A (en) * 2014-11-21 2015-03-04 南京理工大学 Improved visual background extraction based movement target detection method
CN104992144A (en) * 2015-06-11 2015-10-21 电子科技大学 Method for distinguishing transmission line from road in remote sensing image
CN106023249A (en) * 2016-05-13 2016-10-12 电子科技大学 Moving object detection method based on local binary similarity pattern
CN106056606A (en) * 2016-05-30 2016-10-26 乐视控股(北京)有限公司 Image processing method and device
CN105957093A (en) * 2016-06-07 2016-09-21 浙江树人大学 ATM retention detection method of texture discrimination optimization HOG operator
CN106878674A (en) * 2017-01-10 2017-06-20 哈尔滨工业大学深圳研究生院 A kind of parking detection method and device based on monitor video
CN107833242A (en) * 2017-10-30 2018-03-23 南京理工大学 One kind is based on marginal information and improves VIBE moving target detecting methods

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Face recognition based on background similarity matching; Zhu Songhao et al.; Journal of Central South University (Science and Technology); Jul. 31, 2013; pp. 343-346 *

Also Published As

Publication number Publication date
CN108764325A (en) 2018-11-06

Similar Documents

Publication Publication Date Title
CN108764325B (en) Image recognition method and device, computer equipment and storage medium
Xie et al. Multilevel cloud detection in remote sensing images based on deep learning
Arévalo et al. Shadow detection in colour high‐resolution satellite images
CN111860670A (en) Domain adaptive model training method, image detection method, device, equipment and medium
CN112132093B (en) High-resolution remote sensing image target detection method and device and computer equipment
Jung Detecting building changes from multitemporal aerial stereopairs
Liu et al. A contrario comparison of local descriptors for change detection in very high spatial resolution satellite images of urban areas
CN109903272B (en) Target detection method, device, equipment, computer equipment and storage medium
Zhong et al. Change detection based on pulse-coupled neural networks and the NMI feature for high spatial resolution remote sensing imagery
CN113673530B (en) Remote sensing image semantic segmentation method, device, computer equipment and storage medium
Luotamo et al. Multiscale cloud detection in remote sensing images using a dual convolutional neural network
Rusyn et al. Segmentation of atmospheric clouds images obtained by remote sensing
CN109410189B (en) Image segmentation method, and image similarity calculation method and device
Samadzadegan et al. Automatic detection and classification of damaged buildings, using high resolution satellite imagery and vector data
CN111179270A (en) Image co-segmentation method and device based on attention mechanism
US20170053172A1 (en) Image processing apparatus, and image processing method
CN110807463B (en) Image segmentation method and device, computer equipment and storage medium
Morandeira et al. Assessment of SAR speckle filters in the context of object-based image analysis
Krylov et al. False discovery rate approach to unsupervised image change detection
Basar et al. Color image segmentation using k-means classification on rgb histogram
CN109934870B (en) Target detection method, device, equipment, computer equipment and storage medium
Chen et al. Structural damage detection using bi-temporal optical satellite images
CN113592876A (en) Training method and device for split network, computer equipment and storage medium
Ko et al. Adaptive growing and merging algorithm for image segmentation
CN114519729A (en) Image registration quality evaluation model training method and device and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant