WO2019095997A1

WO2019095997A1 - Image recognition method and device, computer device and computer-readable storage medium

Info

Publication number: WO2019095997A1
Application number: PCT/CN2018/112759
Authority: WO
Inventors: 杨茜; 牟永强
Original assignee: 深圳云天励飞技术有限公司
Priority date: 2017-11-15
Filing date: 2018-10-30
Publication date: 2019-05-23
Also published as: CN107871143B; CN107871143A

Abstract

An image recognition method. The method comprises: performing region division on a query image and a database image; calculating logarithmic relative RGB coordinates of each pixel in each region of the query image and the database image; clustering the pixels in each region of the query image and the database image according to those logarithmic relative RGB coordinates to obtain a cluster center of each region of the query image and the database image; and determining, according to the cluster center of each region of the query image and the database image, whether the query image matches the database image. Also provided are an image recognition device, a computer device and a readable storage medium. The invention enables fast image recognition with high robustness and high resistance to interference.

Description

Image recognition method and device, computer device and computer readable storage medium

The present application claims priority to Chinese Patent Application No. 201711132067.0, entitled "Image Recognition Method and Apparatus, Computer Apparatus, and Computer Readable Storage Medium", filed on November 15, 2017, the entire contents of which are incorporated herein by reference.

Technical field

The present invention relates to the field of computer vision technology, and in particular, to an image recognition method and apparatus, a computer apparatus, and a computer readable storage medium.

Background

Most existing image recognition methods have shortcomings. Methods based on multiple regions and multiple features usually need to build high-dimensional feature vectors before training and prediction, so their computational complexity is high and their accuracy and robustness are easily affected. Features extracted in the RGB color space are affected by illumination and shooting conditions, so the same object yields different features under different shooting conditions, which lowers recognition accuracy. Histograms based on a color space are susceptible to interference when similarity is calculated; the histogram intersection method and distance-based methods are not very accurate, and their complexity is high when multiple regions are involved.

Summary of the invention

In view of the above, it is necessary to provide an image recognition method and apparatus, a computer apparatus, and a computer readable storage medium that can realize fast image recognition with high robustness and high resistance to interference.

A first aspect of the present application provides an image recognition method, the method comprising:

performing area division on the query image and the database image;

calculating the logarithmic relative RGB coordinates of each pixel in each region of the query image and the database image;

clustering the pixels in each region of the query image and the database image according to the logarithmic relative RGB coordinates of each pixel in each region of the query image and the database image, to obtain a cluster center of each region of the query image and the database image; and

determining whether the query image matches the database image according to the cluster center of each region of the query image and the database image.

In another possible implementation manner, the query image and the database image are person images, and performing area division on the query image and the database image includes:

dividing the query image and the database image each into an upper region and a lower region according to the shape of the person in the image, wherein the upper region corresponds to the upper body of the person and the lower region corresponds to the lower body of the person.

In another possible implementation manner, determining whether the query image matches the database image according to the cluster center of each region of the query image and the database image includes:

calculating the distance between the cluster centers of each pair of corresponding regions of the query image and the database image; and

determining whether the query image matches the database image according to the distances between the cluster centers of the corresponding regions of the query image and the database image.

In another possible implementation manner, calculating the distance between the cluster centers of each pair of corresponding regions of the query image and the database image includes:

calculating the Euclidean distance between the cluster centers of each pair of corresponding regions of the query image and the database image; or

calculating the Manhattan distance between the cluster centers of each pair of corresponding regions of the query image and the database image; or

calculating the Mahalanobis distance between the cluster centers of each pair of corresponding regions of the query image and the database image.

A second aspect of the present application provides an image recognition apparatus, the apparatus comprising:

a region dividing unit, configured to perform area division on the query image and the database image;

a coordinate calculation unit, configured to calculate a logarithmic relative RGB coordinate of each pixel of each region of the query image and the database image;

a clustering unit, configured to cluster the pixels in each region of the query image and the database image according to the logarithmic relative RGB coordinates of each pixel in each region of the query image and the database image, to obtain a cluster center of each region of the query image and the database image; and

a matching unit, configured to determine, according to the cluster center of each region of the query image and the database image, whether the query image matches the database image.

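In another possible implementation manner, the query image and the database image are person images, and the area dividing unit is specifically configured to:

divide the query image and the database image each into an upper region and a lower region according to the shape of the person in the image, wherein the upper region corresponds to the upper body of the person and the lower region corresponds to the lower body of the person.

In another possible implementation manner, the matching unit is specifically configured to:

calculate the distance between the cluster centers of each pair of corresponding regions of the query image and the database image; and

determine whether the query image matches the database image according to those distances.

In another possible implementation manner, the matching unit calculating the distance between the cluster centers of each pair of corresponding regions of the query image and the database image specifically includes:

calculating the Euclidean distance, the Manhattan distance, or the Mahalanobis distance between the cluster centers of each pair of corresponding regions of the query image and the database image.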

A third aspect of the present application provides a computer apparatus including a processor that implements the image recognition method when executing a computer program stored in a memory.

A fourth aspect of the present application provides a computer readable storage medium having stored thereon a computer program that, when executed by a processor, implements the image recognition method.

The invention performs area division on the query image and the database image; calculates the logarithmic relative RGB coordinates of each pixel in each region of the query image and the database image; clusters the pixels in each region of the query image and the database image according to those coordinates to obtain a cluster center of each region; and determines whether the query image matches the database image according to the cluster centers of the regions of the two images. The invention uses logarithmic relative RGB coordinates for image recognition. The distributions of logarithmic relative RGB coordinates obtained under different poses and shooting angles are similar, so the method is robust to pose and shooting angle. The logarithmic relative RGB coordinates are the coordinates of the red component Ri and the blue component Bi normalized by the green component Gi, which reduces the influence of illumination on recognition and gives invariance to illumination intensity. The logarithmic relative RGB coordinates reduce the three-dimensional coordinates of each pixel to two dimensions, which lowers the computational complexity and improves the recognition speed. The pixels are clustered region by region according to their logarithmic relative RGB coordinates, so that the pixels of each region are reduced to a cluster center, which reduces the influence of accidental errors (that is, of stray points) and improves the robustness and interference resistance of the recognition.
In addition, the invention determines whether the query image matches the database image according to the cluster centers of the regions of the two images. Operating on cluster centers involves little data and low computational complexity, so the matching result is obtained quickly. The invention can therefore realize fast image recognition with high robustness and high resistance to interference.

DRAWINGS

FIG. 1 is a flowchart of an image recognition method according to Embodiment 1 of the present invention.

FIG. 2 is a logarithmic relative RGB coordinate distribution diagram of an image.

FIG. 3 is a structural diagram of an image recognition apparatus according to Embodiment 2 of the present invention.

FIG. 4 is a schematic diagram of a computer device according to Embodiment 3 of the present invention.

Detailed description

The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that the embodiments in the present application and the features in the embodiments may be combined with each other without conflict.

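In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. The described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.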

All technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs, unless otherwise defined. The terminology used in the description of the present invention is for the purpose of describing particular embodiments and is not intended to limit the invention.

Preferably, the image recognition method of the present invention is applied in one or more computer devices. A computer device is a device capable of automatically performing numerical calculation and/or information processing according to instructions that are set or stored in advance. Its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.

The computer device may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The computer device can perform human-computer interaction with the user through a keyboard, a mouse, a remote controller, a touch panel, or a voice control device.

Embodiment 1

FIG. 1 is a flowchart of an image recognition method according to Embodiment 1 of the present invention. The image recognition method is applied to a computer device.

As shown in FIG. 1, the image recognition method specifically includes the following steps:

101: Perform area division on the query image and the database image.

The query image is an image that needs to be identified or matched, and the database image is an image in a pre-established image library. The image recognition method compares the query image with the database image to determine whether they match, that is, whether the content of the query image is consistent with the content of the database image. For example, in pedestrian recognition, the pedestrian image captured by a roadside camera is the query image and a portrait library image of the traffic management system is the database image; whether the pedestrian image matches the portrait library image is determined according to the similarity coefficient between them. If they match, the person in the pedestrian image is considered to be the person in the portrait library image; otherwise, the person in the pedestrian image is considered not to be the person in that portrait library image, and the pedestrian image can be compared with another portrait library image.

Database images are typically associated with specific information, such as personally identifiable information. According to the matching result, related information (for example, personal identification information) of the query image can be obtained. For example, when performing pedestrian recognition, if the pedestrian image matches the portrait library image, the personal identification information corresponding to the portrait library image is taken as the personal identification information of the person in the pedestrian image.

The image recognition method can be applied to various fields such as video surveillance, product detection, medical diagnosis, and the like. For example, in traffic monitoring, the present invention can be utilized for pedestrian identification, driver identification, vehicle identification, and the like.

The query image and the database image are divided in the same way. For example, the query image and the database image are each divided into an upper region and a lower region, or into a left region and a right region.

In this embodiment, the image recognition method is used for person recognition (for example, pedestrian recognition), and the query image and the database image are person images. The query image and the database image may each be divided into an upper region and a lower region according to the shape of the person in the image. The upper region corresponds to the upper body of the person, and the lower region corresponds to the lower body of the person. For example, the query image is divided into an upper region R1 and a lower region R2, and the database image is divided into an upper region R1' and a lower region R2'. When the people in the images are standing upright, their body proportions are roughly similar even though their postures and actions differ, so dividing the image into upper and lower regions according to the shape of the person is more robust. In addition, the most color-rich item of clothing is usually the upper garment, which is a further reason for dividing the person image into upper and lower regions.

When a person image is divided into upper and lower regions, the dividing position may be determined from an empirical value, for example, according to the golden ratio between the upper and lower body. Alternatively, the boundary between the person's upper and lower garments may be detected in the person image, and the image divided at that boundary.

It can be understood that the query image and the database image can be divided into regions in other ways. For example, the pyramid model can be used to divide the query image and the database image into regions.

The query image and the database image may each be divided into two regions, or each into more than two regions, for example, three regions or four regions.
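The division of step 101 can be illustrated with a minimal sketch. It assumes an image is held as a list of pixel rows of (R, G, B) tuples; the helper name `split_rows` and the split fraction 0.382 are hypothetical choices made for illustration and are not specified by the source.

```python
def split_rows(image, boundaries):
    """Split an image (a list of pixel rows) into horizontal bands.

    `boundaries` holds fractions of the image height, in increasing order,
    at which to cut; for example [0.382] yields an upper and a lower region.
    """
    height = len(image)
    cuts = [0] + [round(height * b) for b in boundaries] + [height]
    return [image[top:bottom] for top, bottom in zip(cuts, cuts[1:])]


# Hypothetical empirical split point (an assumption, not from the source):
# the upper region takes roughly the top 38.2% of the rows.
UPPER_FRACTION = 0.382

# Toy 10-row, 2-column image: reddish rows on top, bluish rows below.
query_image = [[(200, 30, 30)] * 2 for _ in range(4)] + \
              [[(30, 30, 200)] * 2 for _ in range(6)]

# The same call with the same boundaries would be applied to the database image.
upper, lower = split_rows(query_image, [UPPER_FRACTION])
three_regions = split_rows(query_image, [0.3, 0.6])
```

Passing more than one boundary gives the three-region or four-region divisions mentioned above; the only requirement taken from the text is that both images are divided in the same way.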

102: Calculate a logarithmic relative RGB coordinate of each pixel of each region of the query image and the database image.

In this embodiment, for a pixel i whose red component is Ri, green component is Gi, and blue component is Bi, the logarithmic relative RGB coordinates are (xi, yi), where

$x_i = \log(R_i / G_i), \quad y_i = \log(B_i / G_i).$

The natural logarithm (base e) may be used, that is,

$x_i = \ln(R_i / G_i), \quad y_i = \ln(B_i / G_i).$

Alternatively, a logarithm with another base may be used, such as a base-10 logarithm.

Taking $x_i = \log(R_i / G_i)$ as the horizontal axis and $y_i = \log(B_i / G_i)$ as the vertical axis, a logarithmic relative RGB coordinate distribution map of the query image and the database image can be obtained. When the image recognition method of the present invention is used for person recognition, if the colors of the upper-body and lower-body clothing in the person image differ greatly, the logarithmic relative RGB coordinates of the upper region of the person image (corresponding to the upper body of the person) and those of the lower region (corresponding to the lower body of the person) are usually distributed in two different areas, so that two clusters of coordinates are usually obtained.
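The coordinate calculation of step 102 might be sketched as follows, assuming the coordinates are defined as x = ln(R/G) and y = ln(B/G), that is, the red and blue components taken relative to the green component. The function names and the offset `eps` are assumptions of this sketch; `eps` only guards against zero-valued channels and is not part of the source.

```python
import math


def log_relative_rgb(pixel, eps=1.0):
    """Map an (R, G, B) pixel to (x, y) = (ln(R/G), ln(B/G)).

    `eps` is added to every channel so that a zero channel does not cause
    a division by zero or log(0); it is an assumption of this sketch.
    """
    r, g, b = pixel
    x = math.log((r + eps) / (g + eps))
    y = math.log((b + eps) / (g + eps))
    return (x, y)


def region_coordinates(region, eps=1.0):
    """Coordinates of every pixel in a region given as a list of pixel rows."""
    return [log_relative_rgb(p, eps) for row in region for p in row]


# A gray pixel (R = G = B) maps to the origin of the coordinate plane.
gray = log_relative_rgb((100, 100, 100))

# With eps=0, halving the brightness of a pixel leaves its coordinates unchanged.
bright = log_relative_rgb((200, 100, 50), eps=0)
dim = log_relative_rgb((100, 50, 25), eps=0)
```

A nonzero `eps` makes the mapping only approximately invariant to brightness scaling, which is the price of handling black pixels in this sketch.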

FIG. 2 shows the logarithmic relative RGB coordinate distribution of an image. In FIG. 2, the image is divided into two regions R1 and R2 (for example, the query image is divided into an upper region R1 and a lower region R2), where 20 is the logarithmic relative RGB coordinate distribution of the pixels of region R1, and 21 is the logarithmic relative RGB coordinate distribution of the pixels of region R2.

Features extracted in the RGB color space are affected by illumination, pose, and shooting angle, so the same object yields different features under different conditions, which lowers recognition accuracy. The invention uses logarithmic relative RGB coordinates for image recognition. The logarithmic relative RGB coordinate distributions obtained under different poses and shooting angles are very similar, so the method is robust to pose and shooting angle. The logarithmic relative RGB coordinates are the coordinates of the red component Ri and the blue component Bi normalized by the green component Gi, which reduces the influence of illumination and image quality on recognition and gives invariance to illumination intensity. The logarithmic relative RGB coordinates are also symmetric about 0, which gives good symmetry and balance. In addition, they reduce the three-dimensional coordinates of each pixel to two dimensions, which lowers the computational complexity and allows the distribution to be represented as a two-dimensional image, making subsequent clustering possible.
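The illumination intensity invariance and the symmetry about 0 can be checked with a short derivation. It assumes the coordinates are $x_i = \log(R_i / G_i)$ and $y_i = \log(B_i / G_i)$, and it models a change in illumination intensity as a common scale factor $k > 0$ applied to all three components, which is a simplifying assumption rather than a statement from the source.

```latex
\begin{aligned}
x_i' &= \log\frac{k R_i}{k G_i} = \log\frac{R_i}{G_i} = x_i, \\
y_i' &= \log\frac{k B_i}{k G_i} = \log\frac{B_i}{G_i} = y_i, \\
\log\frac{G_i}{R_i} &= -\log\frac{R_i}{G_i}, \qquad
\log\frac{R_i}{G_i} = 0 \ \text{when } R_i = G_i .
\end{aligned}
```

The first two lines show that the factor k cancels, and the third shows that exchanging the roles of two components only flips the sign of the coordinate, so equal components map to 0 and the coordinate range is balanced around it.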

103: Cluster the pixels in each region of the query image and the database image according to the logarithmic relative RGB coordinates of each pixel in each region of the query image and the database image, to obtain a cluster center of each region of the query image and the database image.

For example, the pixels of the upper region R1 and the lower region R2 of the query image are clustered to obtain the cluster center (x1, y1) of the upper region R1 and the cluster center (x2, y2) of the lower region R2 of the query image; the pixels of the upper region R1' and the lower region R2' of the database image are clustered to obtain the cluster center (x1', y1') of the upper region R1' and the cluster center (x2', y2') of the lower region R2' of the database image.

Referring to FIG. 2, the pixels of region R1 are clustered according to their logarithmic relative RGB coordinates to obtain the cluster center 22 of region R1, and the pixels of region R2 are clustered according to their logarithmic relative RGB coordinates to obtain the cluster center 23 of region R2.

A Gaussian Mixture Model (GMM) or the K-Means algorithm may be used to cluster the pixels in each region of the query image and the database image to obtain the cluster center of each region. For example, using a GMM or the K-Means algorithm with the number of cluster centers set to 2, the cluster center (x1, y1) of the upper region R1 and the cluster center (x2, y2) of the lower region R2 of the query image are obtained, and the cluster center (x1', y1') of the upper region R1' and the cluster center (x2', y2') of the lower region R2' of the database image are obtained.

Other clustering algorithms can also be used to cluster the pixels in each region of the query image and the database image, for example, the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm.

Clustering the pixels region by region according to their logarithmic relative RGB coordinates reduces the influence of accidental errors (that is, of stray points) and improves the robustness and interference resistance of the recognition.
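As an illustration of step 103, the following is a minimal K-Means sketch over the two-dimensional coordinates of one region. The source names GMM, K-Means, and DBSCAN but does not fix an implementation; the deterministic initialization, the iteration count, and the choice of the most populated cluster as the region's cluster center are all assumptions of this sketch.

```python
import math


def kmeans_2d(points, k, iterations=20):
    """Plain Lloyd's k-means on 2-D points; returns (centers, labels).

    Initial centers are taken at evenly spaced positions in the input list,
    a simple deterministic choice made for this sketch.
    """
    n = len(points)
    centers = [points[(i * (n - 1)) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * n
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centers, labels


def dominant_center(points, k=2):
    """Center of the most populated cluster (an assumed selection rule)."""
    centers, labels = kmeans_2d(points, k)
    counts = [labels.count(c) for c in range(k)]
    return centers[counts.index(max(counts))]


# Nine pixels of one clothing color plus a single stray point.
region_points = [(1.0, -1.0)] * 9 + [(5.0, 5.0)]
center = dominant_center(region_points)
```

In this toy region the stray point ends up in its own cluster, so it does not pull the reported center away from the main group, which is the kind of stray-point suppression the text describes.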

104: Determine whether the query image and the database image match according to the cluster center of each region of the query image and the database image.

For example, in pedestrian recognition, whether the pedestrian image matches the portrait library image is determined according to the similarity coefficient between the pedestrian image captured by the camera and the portrait library image of the traffic management system. If they match, the person in the pedestrian image is considered to be the person in the portrait library image; otherwise, the person in the pedestrian image is considered not to be the person in that portrait library image, and the pedestrian image can be compared with another portrait library image.

The distance between the cluster centers of each pair of corresponding regions of the query image and the database image may be calculated, and whether the query image matches the database image is determined according to these distances.

In this embodiment, the Euclidean distance between the cluster centers of each pair of corresponding regions of the query image and the database image may be calculated, and whether the query image matches the database image is determined according to these Euclidean distances. For example, the query image has a cluster center (x1, y1) for the upper region R1 and a cluster center (x2, y2) for the lower region R2, and the database image has a cluster center (x1', y1') for the upper region R1' and a cluster center (x2', y2') for the lower region R2'. The Euclidean distance between the cluster center (x1, y1) of the upper region R1 of the query image and the cluster center (x1', y1') of the upper region R1' of the database image is calculated as

$d_1 = \sqrt{(x_1 - x_1')^2 + (y_1 - y_1')^2},$

and the Euclidean distance between the cluster center (x2, y2) of the lower region R2 of the query image and the cluster center (x2', y2') of the lower region R2' of the database image is calculated as

$d_2 = \sqrt{(x_2 - x_2')^2 + (y_2 - y_2')^2}.$

Alternatively, the Manhattan distance (i.e., the absolute-value distance) between the cluster centers of each pair of corresponding regions of the query image and the database image may be calculated, and whether the query image matches the database image is determined according to these Manhattan distances. With the same cluster centers as above, the Manhattan distance between the cluster center (x1, y1) of the upper region R1 of the query image and the cluster center (x1', y1') of the upper region R1' of the database image is $d_1 = |x_1 - x_1'| + |y_1 - y_1'|$, and the Manhattan distance between the cluster center (x2, y2) of the lower region R2 of the query image and the cluster center (x2', y2') of the lower region R2' of the database image is $d_2 = |x_2 - x_2'| + |y_2 - y_2'|$.

In other embodiments, other distances, such as Mahalanobis distances, of the cluster center of each region of the query image corresponding to the database image may be calculated.
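The Euclidean and Manhattan distances between corresponding cluster centers can be sketched as below. The function names are illustrative assumptions; the Mahalanobis distance is omitted because it additionally needs a covariance matrix that the text does not specify.

```python
import math


def euclidean(c, c_prime):
    """Euclidean distance between two cluster centers (x, y) and (x', y')."""
    return math.hypot(c[0] - c_prime[0], c[1] - c_prime[1])


def manhattan(c, c_prime):
    """Manhattan (absolute-value) distance between two cluster centers."""
    return abs(c[0] - c_prime[0]) + abs(c[1] - c_prime[1])


def region_distances(query_centers, database_centers, metric=euclidean):
    """Distances between corresponding regions: R1 vs R1', R2 vs R2', and so on."""
    return [metric(q, d) for q, d in zip(query_centers, database_centers)]


# Cluster centers of the upper and lower regions of two toy images.
query_centers = [(0.0, 0.0), (1.0, 1.0)]
database_centers = [(3.0, 4.0), (1.0, 1.0)]

d_euclid = region_distances(query_centers, database_centers)
d_manhattan = region_distances(query_centers, database_centers, metric=manhattan)
```

Each list holds one distance per region pair, in the same order as the regions, which corresponds to d1 and d2 in the two-region case.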

A preset operation may be performed on the distances between the cluster centers of the corresponding regions of the query image and the database image, and whether the query image matches the database image is determined according to the result of the operation.

In one embodiment, the sum of the distances between the cluster centers of the corresponding regions of the query image and the database image (e.g., d1 + d2) may be calculated and compared with a preset distance. If the sum is less than or equal to the preset distance, the query image is judged to match the database image; otherwise, if the sum is greater than the preset distance, the query image is judged not to match the database image.

Alternatively, the sum of the distances for the query image and the database image may be compared with the sums of the distances for the query image and the other database images. If the sum for the database image is smaller than the sums for the other database images, the query image is judged to match the database image; otherwise, the query image is judged not to match the database image.

In another embodiment, the average of the distances between the cluster centers of the corresponding regions of the query image and the database image (e.g., (d1 + d2)/2) may be calculated and compared with a preset distance. If the average distance is less than or equal to the preset distance, the query image is judged to match the database image; otherwise, if the average distance is greater than the preset distance, the query image is judged not to match the database image.

Alternatively, the average distance for the query image and the database image may be compared with the average distances for the query image and the other database images. If the average distance for the database image is smaller than the average distances for the other database images, the query image is judged to match the database image; otherwise, the query image is judged not to match the database image.
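The two decision rules described above (comparison with a preset distance, and comparison against the other database images) might look as follows in code. The function names and the numeric values are hypothetical; in particular, the source does not give a value for the preset distance.

```python
def matches_by_threshold(distances, preset_distance, use_average=False):
    """Match if the sum (or average) of the region distances is within the preset distance."""
    score = sum(distances)
    if use_average:
        score /= len(distances)
    return score <= preset_distance


def best_match(distances_per_database_image):
    """Index of the database image whose summed region distance is smallest."""
    sums = [sum(d) for d in distances_per_database_image]
    return sums.index(min(sums))


# Hypothetical preset distance; the source does not specify a value.
PRESET_DISTANCE = 0.5

close = matches_by_threshold([0.1, 0.2], PRESET_DISTANCE)
far = matches_by_threshold([0.4, 0.4], PRESET_DISTANCE)
far_on_average = matches_by_threshold([0.4, 0.4], PRESET_DISTANCE, use_average=True)

# Region distances [d1, d2] of one query image against three database images.
winner = best_match([[0.4, 0.4], [0.1, 0.2], [0.3, 0.3]])
```

When every image is divided into the same number of regions, ranking by the sum and ranking by the average select the same database image, so `best_match` covers both comparative variants; the threshold variants differ only in how the preset distance is scaled.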

The image recognition method of the first embodiment performs area division on the query image and the database image; calculates the logarithmic relative RGB coordinates of each pixel in each region of the query image and the database image; clusters the pixels in each region of the query image and the database image according to those coordinates to obtain a cluster center of each region; and determines whether the query image matches the database image according to the cluster centers of the regions of the two images. The first embodiment uses logarithmic relative RGB coordinates for image recognition. The distributions of logarithmic relative RGB coordinates obtained under different poses and shooting angles are similar, so the method is robust to pose and shooting angle. The logarithmic relative RGB coordinates are the coordinates of the red component Ri and the blue component Bi normalized by the green component Gi, which reduces the influence of illumination on recognition and gives invariance to illumination intensity. The logarithmic relative RGB coordinates reduce the three-dimensional coordinates of each pixel to two dimensions, which lowers the computational complexity and improves the recognition speed. In the first embodiment, the pixels are clustered region by region according to their logarithmic relative RGB coordinates, so that the pixels of each region are reduced to a cluster center, which reduces the influence of accidental errors (that is, of stray points) and improves the robustness and interference resistance of the recognition.
In addition, in the first embodiment, whether the query image matches the database image is determined according to the cluster centers of the regions of the two images. Operating on cluster centers involves little data and low computational complexity, so the matching result is obtained quickly. The image recognition method of the first embodiment can therefore realize fast image recognition with high robustness and high resistance to interference.

Embodiment 2

FIG. 3 is a structural diagram of an image recognition apparatus according to Embodiment 2 of the present invention. As shown in FIG. 3, the image recognition apparatus 10 may include an area division unit 301, a coordinate calculation unit 302, a clustering unit 303, and a matching unit 304.
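One way to picture the four-unit structure is as a small class that chains four interchangeable callables. This is only a sketch: the class name, the field names, and the stand-in units (`halves`, `log_coords`, `mean_center`, `within`) are hypothetical, the mean is used as a simplified stand-in for a real clustering step, and the preset distance 0.5 is an arbitrary illustrative value.

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass
class ImageRecognitionApparatus:
    divide: Callable       # area dividing unit 301: image -> list of regions
    coordinates: Callable  # coordinate calculation unit 302: region -> list of (x, y)
    cluster: Callable      # clustering unit 303: list of (x, y) -> cluster center
    match: Callable        # matching unit 304: (query centers, database centers) -> bool

    def centers(self, image):
        return [self.cluster(self.coordinates(r)) for r in self.divide(image)]

    def recognize(self, query_image, database_image):
        return self.match(self.centers(query_image), self.centers(database_image))


# Simplified stand-in units (assumptions of this sketch).
def halves(image):
    mid = len(image) // 2
    return [image[:mid], image[mid:]]


def log_coords(region):
    return [(math.log(r / g), math.log(b / g)) for row in region for r, g, b in row]


def mean_center(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def within(query_centers, database_centers, preset_distance=0.5):
    total = sum(math.dist(q, d) for q, d in zip(query_centers, database_centers))
    return total <= preset_distance


apparatus = ImageRecognitionApparatus(halves, log_coords, mean_center, within)

# image_b is image_a at half brightness; image_c has the two clothing colors swapped.
image_a = [[(200, 100, 50)]] * 2 + [[(50, 100, 200)]] * 2
image_b = [[(100, 50, 25)]] * 2 + [[(25, 50, 100)]] * 2
image_c = [[(50, 100, 200)]] * 2 + [[(200, 100, 50)]] * 2

same_person = apparatus.recognize(image_a, image_b)
different_person = apparatus.recognize(image_a, image_c)
```

Keeping each unit as a separate callable mirrors the unit boundaries of the apparatus, so a different division scheme, clustering algorithm, or distance rule can be swapped in without touching the others.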

The area dividing unit 301 is configured to perform area division on the query image and the database image.

Database images are typically associated with specific information, such as personally identifiable information. Based on the matching result, related information (for example, personal identification information) of the query image can be obtained. For example, when performing pedestrian recognition, if the pedestrian image matches the portrait library image, the personal identification information corresponding to the portrait library image is taken as the personal identification information of the person in the pedestrian image.

The coordinate calculation unit 302 is configured to calculate a logarithmic relative RGB coordinate of each pixel of each region of the query image and the database image.

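The logarithmic relative RGB coordinates (xi, yi) of a pixel i with red component Ri, green component Gi, and blue component Bi may be calculated with the natural logarithm (base e), that is,

$x_i = \ln(R_i / G_i), \quad y_i = \ln(B_i / G_i).$

Taking $x_i$ as the horizontal axis and $y_i$ as the vertical axis, a logarithmic relative RGB coordinate distribution map of the query image and the database image can be obtained.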

Although the color feature is the most resolving feature, it is less robust to lighting environments, camera shooting, and spatial distribution. The use of the RGB color space is affected by illumination and shooting conditions, resulting in different features extracted by the same object under different shooting conditions, affecting the recognition accuracy. The invention utilizes logarithmic relative RGB coordinates for image recognition, and the logarithmic relative RGB coordinate distribution obtained by different poses and shooting angles is very similar, so the robustness to the attitude and the angle is better, and the shooting angle is invariant. The logarithmic relative RGB coordinates are the normalized coordinates of the red component Ri and the green component Gi versus the green component Gi, which reduces the influence of illumination and image quality on the recognition, and has a certain intensity of light intensity. At the same time, the logarithmic relative RGB coordinates are symmetrical about 0, which has good symmetry and balance. In addition, the logarithmic relative RGB coordinates reduce the three-dimensional coordinates of each pixel to two dimensions, which reduces the computational complexity and allows its distribution to be represented by a two-dimensional image, so there is no need to use common histogram statistics. It provides the possibility for further clustering.

The clustering unit 303 is configured to cluster the pixels in each region of the query image and the database image according to the logarithmic relative RGB coordinates of each pixel in each region of the query image and the database image, to obtain a cluster center of each region of the query image and the database image.

For example, the pixels of the upper region R1 and the lower region R2 of the query image are clustered to obtain a cluster center (x1, y1) of the upper region R1 and a cluster center (x2, y2) of the lower region R2 of the query image; the pixels of the upper region R1' and the lower region R2' of the database image are clustered to obtain a cluster center (x1', y1') of the upper region R1' and a cluster center (x2', y2') of the lower region R2' of the database image.
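This passage does not name the clustering algorithm, so the following Python sketch is only one possible reading. It assumes a plain k-means (Lloyd's algorithm) with a hypothetical k = 2 and takes the centroid of the most populated cluster as the cluster center of the region; the function names, the choice of k, and the "largest cluster" rule are all assumptions of the sketch.

```python
import math


def assign(points, centres):
    """Group points by the index of their nearest centre."""
    groups = [[] for _ in centres]
    for p in points:
        nearest = min(range(len(centres)), key=lambda j: math.dist(p, centres[j]))
        groups[nearest].append(p)
    return groups


def mean(points):
    """Arithmetic mean of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def kmeans(points, k, iterations=50):
    """Lloyd's k-means with a deterministic, evenly spaced initialisation."""
    if not points:
        raise ValueError("a region must contain at least one pixel")
    step = max(len(points) // k, 1)
    centres = [points[min(i * step, len(points) - 1)] for i in range(k)]
    for _ in range(iterations):
        groups = assign(points, centres)
        # An empty cluster keeps its previous centre.
        updated = [mean(g) if g else c for g, c in zip(groups, centres)]
        if updated == centres:
            break
        centres = updated
    return centres, assign(points, centres)


def region_cluster_centre(points, k=2):
    """Assumed rule: the centroid of the most populated cluster."""
    centres, groups = kmeans(points, k)
    largest = max(range(len(centres)), key=lambda j: len(groups[j]))
    return centres[largest]


# Four tightly grouped coordinates and one stray point.
coords = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.2, 0.2), (5.0, 5.0)]
print(region_cluster_centre(coords))
print(mean(coords))
```

In this toy region the stray point ends up in its own cluster, so the returned center stays with the dense group, whereas the plain mean is pulled toward the stray point.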

The matching unit 304 is configured to determine whether the query image matches the database image according to the cluster center of each region of the query image and the database image.

Optionally, the Manhattan distance (i.e., the absolute-value distance) between the cluster centers of each pair of corresponding regions of the query image and the database image may be calculated, and whether the query image matches the database image is determined according to these Manhattan distances. For example, the query image has a cluster center (x1, y1) for the upper region R1 and a cluster center (x2, y2) for the lower region R2, and the database image has a cluster center (x1', y1') for the upper region R1' and a cluster center (x2', y2') for the lower region R2'. The Manhattan distance between the cluster center (x1, y1) of the upper region R1 of the query image and the cluster center (x1', y1') of the upper region R1' of the database image is d1 = |x1 - x1'| + |y1 - y1'|, and the Manhattan distance between the cluster center (x2, y2) of the lower region R2 of the query image and the cluster center (x2', y2') of the lower region R2' of the database image is d2 = |x2 - x2'| + |y2 - y2'|.
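The per-region distance computation can be sketched in Python as follows. The distance formula follows the text; the decision rule (every region distance must not exceed a threshold) and the default threshold value are hypothetical, since this passage does not state how the distances are turned into a match decision.

```python
def manhattan_distance(a, b):
    """d = |x - x'| + |y - y'| for two 2-D cluster centres."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def region_distances(query_centres, database_centres):
    """Distances d1, d2, ... between centres of corresponding regions."""
    if len(query_centres) != len(database_centres):
        raise ValueError("both images must have the same number of regions")
    return [manhattan_distance(q, d) for q, d in zip(query_centres, database_centres)]


def images_match(query_centres, database_centres, threshold=0.5):
    """Hypothetical rule: match only if every region distance <= threshold."""
    return all(d <= threshold for d in region_distances(query_centres, database_centres))


# Centres are listed as [upper region, lower region].
query = [(0.1, 0.2), (-0.3, 0.4)]
database = [(0.2, 0.1), (-0.1, 0.5)]
print(region_distances(query, database))
print(images_match(query, database))
```

Other combinations of d1 and d2, such as their sum or a weighted sum, would fit the same interface; the all-regions rule is used here only because it is the simplest to state.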

The image recognition apparatus of the second embodiment performs region division on the query image and the database image; calculates the logarithmic relative RGB coordinates of each pixel in each region of the query image and the database image; clusters the pixels in each region of the query image and the database image according to these logarithmic relative RGB coordinates to obtain a cluster center of each region of the query image and the database image; and determines, according to the cluster center of each region of the query image and the database image, whether the query image matches the database image. In the second embodiment, the logarithmic relative RGB coordinates are used for image recognition. The logarithmic relative RGB coordinate distributions obtained under different poses and shooting angles are similar, so the recognition is robust to pose and shooting angle and is invariant to them. The logarithmic relative RGB coordinates are the coordinates of the red component Ri and the blue component Bi normalized by the green component Gi, which reduces the influence of illumination on recognition, so the recognition is more robust to illumination and is invariant to illumination intensity. The logarithmic relative RGB coordinates reduce the three-dimensional coordinates of each pixel to two dimensions, which lowers the computational complexity and improves the recognition speed. In the second embodiment, the pixels are clustered region by region according to their logarithmic relative RGB coordinates, and the pixels of each region are reduced to a cluster center, which reduces the influence of accidental errors (that is, it eliminates the influence of spurious points) and improves the robustness and anti-interference ability of the recognition. In addition, in the second embodiment, whether the query image matches the database image is determined according to the cluster center of each region of the query image and the database image; the amount of data to be processed for the cluster centers is small and the computational complexity is low, so the matching result can be obtained quickly. Therefore, the image recognition apparatus of the second embodiment can quickly perform image recognition with high robustness and high anti-interference performance.

Embodiment 3

FIG. 4 is a schematic diagram of a computer device according to Embodiment 3 of the present invention. The computer device 1 includes a memory 20, a processor 30, and a computer program 40, such as an image recognition program, stored in the memory 20 and executable on the processor 30. When the processor 30 executes the computer program 40, the steps of the foregoing image recognition method embodiment are implemented, for example, steps 101-104. Alternatively, when the processor 30 executes the computer program 40, the functions of the modules/units in the foregoing device embodiment are implemented, for example, the units 301-304 in FIG. 3.

Illustratively, the computer program 40 can be partitioned into one or more modules/units that are stored in the memory 20 and executed by the processor 30 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing particular functions, and the instruction segments describe the execution of the computer program 40 in the computer device 1. For example, the computer program 40 may be divided into the region dividing unit 301, the coordinate calculation unit 302, the clustering unit 303, and the matching unit 304 in FIG. 3, whose specific functions are as described in the second embodiment.
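One way to picture this partition is the Python sketch below, in which four small functions stand in for units 301-304 and a class wires them together. Every name is hypothetical, and the stand-ins are deliberately simplified assumptions: the region split is a fixed half-height cut, the cluster center is the plain mean of a region, and the match rule is a Manhattan-distance threshold.

```python
import math


def divide_regions(image):
    """Unit 301 stand-in: split the pixel rows into an upper and a lower half.

    Assumes the image is a list of at least two rows of (R, G, B) tuples.
    """
    mid = len(image) // 2
    upper = [pixel for row in image[:mid] for pixel in row]
    lower = [pixel for row in image[mid:] for pixel in row]
    return [upper, lower]


def calculate_coordinates(region):
    """Unit 302 stand-in: (ln(R/G), ln(B/G)) per pixel, channels clamped to >= 1."""
    coords = []
    for r, g, b in region:
        r, g, b = max(r, 1), max(g, 1), max(b, 1)
        coords.append((math.log(r / g), math.log(b / g)))
    return coords


def cluster_centre(points):
    """Unit 303 stand-in: the mean point (a one-cluster simplification)."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def centres_match(query_centres, database_centres, threshold=0.5):
    """Unit 304 stand-in: per-region Manhattan distance against a threshold."""
    return all(abs(q[0] - d[0]) + abs(q[1] - d[1]) <= threshold
               for q, d in zip(query_centres, database_centres))


class ImageRecognitionProgram:
    """Composes the four units; each one can be replaced independently."""

    def __init__(self, divide=divide_regions, coordinates=calculate_coordinates,
                 cluster=cluster_centre, match=centres_match):
        self.divide = divide
        self.coordinates = coordinates
        self.cluster = cluster
        self.match = match

    def region_centres(self, image):
        return [self.cluster(self.coordinates(region)) for region in self.divide(image)]

    def recognise(self, query_image, database_image):
        return self.match(self.region_centres(query_image),
                          self.region_centres(database_image))


query = [[(40, 80, 20), (40, 80, 20)], [(100, 50, 25), (100, 50, 25)]]
brighter = [[tuple(2 * c for c in pixel) for pixel in row] for row in query]
print(ImageRecognitionProgram().recognise(query, brighter))
```

Passing the units as constructor arguments mirrors the remark that the division into units is only a logical one: a different clustering or matching function can be substituted without touching the rest of the program.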

The computer device 1 may be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud server. Those skilled in the art will understand that the schematic diagram of FIG. 4 is merely an example of the computer device 1 and does not constitute a limitation on it. The computer device 1 may include more or fewer components than those illustrated, may combine some components, or may use different components; for example, it may also include input and output devices, network access devices, buses, and the like.

The processor 30 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor 30 may be any conventional processor. The processor 30 is the control center of the computer device 1 and connects the various parts of the entire computer device 1 through various interfaces and lines.

The memory 20 can be used to store the computer program 40 and/or the modules/units. The processor 30 implements the various functions of the computer device 1 by running or executing the computer program and/or modules/units stored in the memory 20 and by calling data stored in the memory 20. The memory 20 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application required for at least one function (such as a sound playing function or an image playing function), and the like; the data storage area may store data (such as audio data or a phone book) created according to the use of the computer device 1. In addition, the memory 20 may include a high-speed random access memory, and may also include a non-volatile memory such as a hard disk, an internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one disk storage device, a flash memory device, or other volatile solid-state storage device.

If the modules/units integrated in the computer device 1 are implemented in the form of software functional units and sold or used as stand-alone products, they may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the foregoing method embodiments of the present invention may also be completed by a computer program instructing related hardware. The computer program may be stored in a computer-readable storage medium, and when the program is executed by a processor, the steps of the method embodiments described above are implemented. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, and a software distribution medium. It should be noted that the content of the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.

In the several embodiments provided by the present invention, it should be understood that the disclosed computer device and method may be implemented in other manners. For example, the computer device embodiments described above are merely illustrative: the division into units is only a division by logical function, and an actual implementation may divide the functions differently.

In addition, each functional unit in each embodiment of the present invention may be integrated in the same processing unit, or each unit may exist physically separately, or two or more units may be integrated in the same unit. The above integrated unit can be implemented in the form of hardware or in the form of hardware plus software function modules.

It is apparent to those skilled in the art that the present invention is not limited to the details of the above-described exemplary embodiments, and that the present invention can be embodied in other specific forms without departing from its spirit or essential characteristics. Therefore, the embodiments are to be considered in all respects as illustrative and not restrictive. The scope of the invention is defined by the appended claims rather than by the foregoing description, and all changes that fall within the meaning and range of equivalents of the claims are therefore intended to be embraced by the present invention. Any reference signs in the claims should not be construed as limiting the claims concerned. In addition, the word "comprising" does not exclude other elements or steps. A plurality of units or computer devices recited in the computer device claims may also be implemented by the same unit or computer device through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.

It should be noted that the above embodiments are intended only to explain the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from their spirit and scope.

Claims

1. An image recognition method, the method comprising:

performing region division on a query image and a database image;

calculating logarithmic relative RGB coordinates of each pixel in each region of the query image and the database image;

clustering the pixels in each region of the query image and the database image according to the logarithmic relative RGB coordinates of each pixel in each region of the query image and the database image, to obtain a cluster center of each region of the query image and the database image; and

determining, according to the cluster center of each region of the query image and the database image, whether the query image matches the database image.

2. The method according to claim 1, wherein the query image and the database image are person images, and the region division of the query image and the database image comprises:

dividing the query image and the database image each into an upper region and a lower region according to the person in the query image and the database image, wherein the upper region corresponds to the upper body of the person and the lower region corresponds to the lower body of the person.

3. The method according to claim 1, wherein the determining, according to the cluster center of each region of the query image and the database image, whether the query image matches the database image comprises:

calculating a distance between the cluster centers of each pair of corresponding regions of the query image and the database image; and

determining whether the query image matches the database image according to the distance between the cluster centers of each pair of corresponding regions of the query image and the database image.

4. The method according to claim 3, wherein the calculating a distance between the cluster centers of each pair of corresponding regions of the query image and the database image comprises:

calculating the Euclidean distance between the cluster centers of each pair of corresponding regions of the query image and the database image; or

calculating the Manhattan distance between the cluster centers of each pair of corresponding regions of the query image and the database image; or

calculating the Mahalanobis distance between the cluster centers of each pair of corresponding regions of the query image and the database image.

5. An image recognition device, characterized in that the device comprises:

a region dividing unit, configured to perform region division on a query image and a database image;

a coordinate calculation unit, configured to calculate logarithmic relative RGB coordinates of each pixel in each region of the query image and the database image;

a clustering unit, configured to cluster the pixels in each region of the query image and the database image according to the logarithmic relative RGB coordinates of each pixel in each region of the query image and the database image, to obtain a cluster center of each region of the query image and the database image; and

a matching unit, configured to determine, according to the cluster center of each region of the query image and the database image, whether the query image matches the database image.

6. The device according to claim 5, wherein the query image and the database image comprise a person image, and the region dividing unit is specifically configured to:

divide the query image and the database image each into an upper region and a lower region according to the person in the query image and the database image, wherein the upper region corresponds to the upper body of the person and the lower region corresponds to the lower body of the person.

7. The device according to claim 5, wherein the matching unit is specifically configured to:

calculate a distance between the cluster centers of each pair of corresponding regions of the query image and the database image; and

determine whether the query image matches the database image according to the distance between the cluster centers of each pair of corresponding regions of the query image and the database image.

8. The device according to claim 7, wherein the matching unit calculating the distance between the cluster centers of each pair of corresponding regions of the query image and the database image specifically comprises:

calculating the Euclidean distance between the cluster centers of each pair of corresponding regions of the query image and the database image; or

calculating the Manhattan distance between the cluster centers of each pair of corresponding regions of the query image and the database image; or

calculating the Mahalanobis distance between the cluster centers of each pair of corresponding regions of the query image and the database image.

9. A computer apparatus, comprising a processor, wherein the processor is configured to implement the image recognition method according to any one of claims 1 to 4 when executing a computer program stored in a memory.

10. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the image recognition method according to any one of claims 1 to 4.