CN111209771A - Neural network identification efficiency improving method and relevant identification efficiency improving device thereof - Google Patents

Info

Publication number
CN111209771A
CN111209771A
Authority
CN
China
Prior art keywords
foreground
input image
pixels
group
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811389529.1A
Other languages
Chinese (zh)
Inventor
刘诚杰
魏家博
王傅民
祁家玮
Current Assignee
Vivotek Inc
Original Assignee
Vivotek Inc
Priority date
Filing date
Publication date
Application filed by Vivotek Inc filed Critical Vivotek Inc
Priority to CN201811389529.1A
Publication of CN111209771A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method and an apparatus for improving the recognition performance of a neural network. The method comprises analyzing an input image to obtain foreground information, generating a foreground mask from the foreground information, and converting the input image into an output image through the foreground mask. The output image serves as the input data for neural network recognition; because the background noise of the input image has been filtered out, the recognition performance of the neural network is improved.

Description

Neural network identification efficiency improving method and relevant identification efficiency improving device thereof
Technical Field
The present invention relates to an image recognition method and device, and more particularly, to a method and device for improving the recognition efficiency of a neural network applied to image recognition.
Background
Traditional image recognition techniques based on neural network algorithms use the original surveillance image directly as input. Because the original surveillance image carries a huge amount of information, improvement of recognition performance is greatly limited. Even if a small specific region is cropped from the original surveillance image for recognition, which raises computational efficiency by reducing the amount of information, the object to be detected in that region is still affected by the surrounding complex background, so an accurate recognition result cannot be obtained quickly. How to design a method for improving the recognition performance of a neural network is therefore one of the key development topics in the surveillance industry.
Disclosure of Invention
The present invention provides a method and a device for improving the recognition performance of a neural network applied to image recognition, so as to solve the above problems.
The invention discloses a method for improving the recognition performance of a neural network, which comprises analyzing an input image to obtain foreground information, generating a foreground mask from the foreground information, and converting the input image into an output image through the foreground mask. The output image serves as the input data for neural network recognition so as to improve object recognition performance.
The invention further discloses a device for improving the recognition performance of a neural network, which comprises an image generator and an arithmetic processor. The image generator is used to obtain an input image. The arithmetic processor is electrically connected to the image generator and is used to analyze the input image to obtain foreground information, generate a foreground mask from the foreground information, and convert the input image into an output image through the foreground mask. The output image serves as the input data for neural network recognition, so that the object recognition performance of the neural network algorithm in a complex environment can be effectively improved.
The method and device of the present invention first separate the foreground information from the input image, define the foreground mask according to the pixel value distribution of the foreground information, convert the input image through the foreground mask to filter out unnecessary information effectively, and use the generated output image as the input data for neural network recognition, thereby improving recognition accuracy. The input image is not limited to the RGB, YUV, HSL, or HSV color modes. Because the foreground information, the foreground mask, and the output image are all obtained by per-pixel operations on the input image, the foreground mask has substantially the same image size as the input image. In addition, the pixel gray-scale values of the output image can optionally be limited to a specific range, reducing the storage capacity required by the apparatus and allowing large volumes of image information to be processed effectively.
Drawings
Fig. 1 is a functional block diagram of a neural network recognition performance improving apparatus according to an embodiment of the present invention.
Fig. 2 is a flowchart of a method for improving the recognition performance of a neural network according to an embodiment of the present invention.
Fig. 3 to fig. 6 are schematic diagrams of an input image at different conversion stages according to an embodiment of the invention.
FIG. 7 is a flow chart of generating a foreground mask according to an embodiment of the present invention.
Fig. 8 is a schematic diagram of a foreground information histogram according to an embodiment of the present invention.
FIG. 9 is a diagram illustrating pixel distribution types used in resolving foreground masks according to an embodiment of the invention.
Wherein the reference numerals are as follows:
10 neural network recognition performance improving apparatus
12 image generator
14 arithmetic processor
I monitoring picture
I1 input image
I2 foreground information
I3 foreground mask
I4 output image
H histogram of the foreground information
H1 first histogram model
H2 second histogram model
S1 first group
S2 second group
Steps S200, S202, S204, S206, S208, S210
Steps S700, S702, S704, S706, S708, S710, S712, S714, S716
Detailed Description
Referring to fig. 1, fig. 1 is a functional block diagram of a neural network recognition performance improving apparatus 10 according to an embodiment of the present invention. The apparatus 10 may include an image generator 12 and an arithmetic processor 14 electrically connected to each other. The image generator 12 is used to obtain an input image I1. The image generator 12 may be an image capturer that directly captures image information of a monitored range as the input image I1; alternatively, the image generator 12 may be an image receiver that receives, in a wired or wireless manner, image information generated by an external image capturer as the input image I1. The input image I1 is mainly used in object recognition technology based on a convolutional neural network (CNN); the arithmetic processor 14 therefore executes a neural network recognition performance improving method, which can effectively improve the object recognition performance of the neural network algorithm in a complex environment.
Referring to fig. 2 to fig. 6, fig. 2 is a flowchart of the neural network recognition performance improving method according to an embodiment of the present invention, and fig. 3 to fig. 6 are schematic diagrams of the input image I1 at different conversion stages. The method of fig. 2 is applied to the apparatus 10 shown in fig. 1. First, steps S200 and S202 are executed to obtain a monitoring picture I associated with a monitored range and to select the range of the input image I1 on the monitoring picture I using an object detection technique. The embodiment of fig. 3 selects a small-range input image I1 within the monitoring picture I, but practical applications are not limited thereto; for example, the entire monitoring picture I may be used as the input image I1. Then, steps S204 and S206 are executed to generate background information of the input image I1 and to calculate the difference between the input image I1 and the background information to obtain the foreground information I2. The background information may be established by a Gaussian mixture model (MOG), by background subtraction based on a neural network algorithm, or by any other algorithm.
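The background-modeling and differencing of steps S204 and S206 can be sketched as follows. This is a minimal NumPy illustration using a running-average background model as a simple stand-in for the MOG model mentioned above; the function name and the `alpha` update rate are the editor's own illustrative choices, not values from the patent.

```python
import numpy as np

def extract_foreground(frames, alpha=0.1):
    """Estimate the background with a running average over a frame
    sequence, then return the foreground information for the last
    frame as the absolute difference from that background
    (a sketch of steps S204 and S206)."""
    background = frames[0].astype(np.float64)
    for frame in frames[1:]:
        # blend each new frame into the background estimate
        background = (1 - alpha) * background + alpha * frame.astype(np.float64)
    # foreground information: per-pixel difference from the background
    foreground = np.abs(frames[-1].astype(np.float64) - background)
    return foreground.astype(np.uint8)
```

Pixels that match the background yield values near zero, while a moving object leaves a bright response, which is what the later histogram analysis operates on.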
Steps S204 and S206 analyze the input image I1 to obtain the foreground information I2. Obtaining the background information first and then calculating the difference between the input image I1 and the background information is only one of many ways of obtaining the foreground information I2, and practical applications are not limited thereto. Next, steps S208 and S210 are performed: a foreground mask I3 is generated from the foreground information I2, and the input image I1 is converted into the output image I4 through the foreground mask I3. If the monitoring picture I is obtained from a complex environment, such as a road with heavy traffic or a crossing shared by pedestrians and vehicles, then even if a small-range input image I1 is cropped from the monitoring picture I, the input image I1 still contains many background patterns that degrade detection accuracy. The invention filters out the background objects of the input image I1 through the foreground information I2, erasing them from the output image I4 shown in FIG. 6; using the output image I4 as the input data for neural network recognition therefore reduces background interference in a complex environment and effectively improves object recognition performance and detection accuracy.
Referring to fig. 3 to fig. 8, fig. 7 is a flowchart of generating the foreground mask I3 according to an embodiment of the present invention, and fig. 8 is a schematic diagram of the histogram H converted from the foreground information I2. First, steps S700 and S702 are executed: the histogram H of the foreground information I2 is calculated and divided into a plurality of groups according to pixel value range, for example into a first group S1 and a second group S2, wherein the pixel value range of the first group S1 is smaller than that of the second group S2. Then, step S704 is executed to compare the number of pixels of the second group S2 with a predetermined parameter. The predetermined parameter may be determined from statistical data, for example according to the environment of the monitoring picture I, or may be set as a ratio between the pixel counts of the second group S2 and the first group S1. If the number of pixels of the second group S2 is greater than the predetermined parameter, a moving object is present within the input image I1; if it is less than the predetermined parameter, it cannot yet be determined whether the objects within the input image I1 remain still or the image is merely disturbed by noise.
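Steps S700 through S704 can be sketched as below, assuming an 8-bit foreground image. The split point between the low group S1 and the high group S2 and the predetermined parameter are illustrative placeholders, not values from the patent.

```python
import numpy as np

def motion_decision(foreground, split=32, predetermined=50):
    """Compute the histogram H of the foreground information, divide it
    into a low-value group S1 and a high-value group S2, and compare the
    pixel count of S2 with a predetermined parameter (steps S700-S704)."""
    hist, _ = np.histogram(foreground, bins=256, range=(0, 256))
    s1 = int(hist[:split].sum())  # pixels that barely differ from the background
    s2 = int(hist[split:].sum())  # pixels that differ strongly from the background
    # True suggests a moving object is present in the input image
    return s1, s2, s2 > predetermined
```

A large S2 count means many pixels changed strongly against the background, which the method takes as evidence of a moving object.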
If the number of pixels in the second group S2 is greater than the predetermined parameter, indicating a significant change between the input image I1 and the background information, step S706 is executed to set a foreground threshold; for example, the foreground threshold may be forty percent of the average of all pixel values of the histogram H. The percentage is not limited to this value and depends on design requirements. Then, step S708 is executed to classify pixels in the foreground information I2 whose pixel values are higher than the foreground threshold into a first group of pixels, and pixels whose values are lower than the foreground threshold into a second group of pixels. Step S710 then sets the pixel values of the foreground mask I3 corresponding to the first group of pixels and the second group of pixels to a first value and a second value, respectively, so as to generate the foreground mask I3. For example, the first value may be 1, as in the non-grid region of the foreground mask I3 shown in fig. 5, and the second value may be 0, as in the grid region of the foreground mask I3 shown in fig. 5.
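A minimal sketch of steps S706 through S710, with the foreground threshold set to forty percent of the mean pixel value as in the example above (the function name and default ratio are illustrative):

```python
import numpy as np

def build_foreground_mask(foreground, ratio=0.4):
    """Set the foreground threshold to `ratio` times the mean of all
    pixel values, then write the first value 1 where the foreground
    information exceeds the threshold and the second value 0
    elsewhere (steps S706-S710)."""
    threshold = ratio * foreground.mean()
    # binary mask: 1 marks the first group of pixels, 0 the second
    return (foreground > threshold).astype(np.uint8)
```

The result is a binary map of the same size as the input image, matching the description of the foreground mask later in the summary.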
If the number of pixels in the second group S2 is less than the predetermined parameter, meaning the variation between the input image I1 and the background information is small, step S712 is executed to determine whether the first group S1 meets a specific condition. The specific condition means that the first group S1 holds a sufficiently large number of pixels; the actual number depends on the environment and on statistics. If the first group S1 meets the specific condition, the pixel distribution of the histogram H is concentrated in the low-value range, the objects in the input image I1 are regarded as still, and step S714 is executed to set the pixel values of all pixels in the foreground mask I3 to the first value; when the first value is 1, the input image I1 can be used directly as the output image I4, i.e. as the input data for neural network recognition. If the first group S1 does not meet the specific condition, the pixels of the histogram H are scattered, the input image I1 is interpreted as being disturbed by noise, and step S716 is executed to set the pixel values of all pixels in the foreground mask I3 to the second value; when the second value is 0, the input image I1 is simply discarded.
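The branch for a small S2 (steps S712 through S716) can be sketched as a check on how concentrated the histogram is in the low-value range. The concentration ratio standing in for the "specific condition" here is an assumed placeholder, since the patent leaves the actual number to the environment and statistics.

```python
import numpy as np

def resolve_small_motion(hist, split=32, concentration=0.9):
    """When |S2| is below the predetermined parameter, decide between a
    still scene and a noisy one.  A histogram concentrated in the low
    range yields an all-1 mask (keep the input image as the output
    image); a scattered histogram yields an all-0 mask (discard it)."""
    hist = np.asarray(hist)
    s1 = hist[:split].sum()
    total = hist.sum()
    if total > 0 and s1 / total >= concentration:
        return 1  # S714: foreground mask of all first values
    return 0      # S716: foreground mask of all second values
```

Returning a single value reflects that in both branches every pixel of the foreground mask receives the same value.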
In step S210, the input image I1 is converted into the output image I4 through the foreground mask I3. One option is to directly compute the products of all pixel values of the input image I1 and the corresponding pixel values of the foreground mask I3; the resulting products are the pixel values of the output image I4. Alternatively, after computing these products, a first set of products whose positions correspond to mask pixels not holding the second value, and a second set of products whose positions correspond to mask pixels holding the second value, may be separated. The second set of products belongs to the background; if they are left at the second value, the background pixels of the output image I4 are black, which may interfere with the rendering of objects in the output image I4. The second set of products can therefore be replaced by a reference value (such as the single-diagonal region of the output image I4 shown in fig. 6), and the first set of products combined with the reference values forms the pixel values of the output image I4. For example, the colors of an object to be detected (e.g., a pedestrian) in the output image I4 often include black and white; if the second set of products were kept at the second value (black), the background would easily be confused with the object's pattern, so the second set of products can optionally be set to another color, such as gray, to separate the object clearly from the background.
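Step S210 with the reference-value variant described above might look like this; the mid-gray reference value 128 is only one possible choice for the "gray" replacement mentioned in the text.

```python
import numpy as np

def apply_foreground_mask(image, mask, reference=128):
    """Multiply every pixel of the input image by the corresponding
    foreground-mask pixel, then replace the second set of products
    (positions where the mask holds the second value 0) with a
    reference gray value instead of leaving the background black."""
    products = image.astype(np.int32) * mask
    # first set of products kept; second set replaced by the reference
    output = np.where(mask == 0, reference, products)
    return output.astype(np.uint8)
```

With a binary mask of first value 1 and second value 0, foreground pixels pass through unchanged while background pixels become a uniform gray, so a dark object is not confused with the erased background.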
Referring to fig. 7 to fig. 9, fig. 9 is a schematic diagram of the pixel distribution types used in resolving the foreground mask according to an embodiment of the invention. Step S704 compares the number of pixels of the second group S2 with the predetermined parameter, and the invention may further preset a first histogram model H1 and a second histogram model H2 as shown in fig. 9. If the histogram H of the foreground information I2 resembles the first histogram model H1, i.e. the number of pixels in the second group S2 is large (greater than the predetermined parameter), step S708 may be performed. If the number of pixels in the second group S2 is small (less than the predetermined parameter), step S712 is performed to determine whether the histogram H resembles the second histogram model H2. If the histogram H resembles the second histogram model H2, i.e. the first group S1 holds a large number of pixels and the specific condition is met, step S714 is performed to generate the associated foreground mask I3; if the histogram H differs from the second histogram model H2, i.e. the number of pixels in the first group S1 is small and the specific condition is not met, step S716 is performed to discard the input image I1. The first histogram model H1 may visualize the predetermined parameter and the second histogram model H2 may visualize the specific condition, although the actual patterns are not limited to the embodiments disclosed above.
In summary, the method and device for improving the recognition performance of a neural network of the present invention separate the foreground information from the input image, classify the foreground information according to its pixel value distribution to define foreground masks for different situations, effectively filter out unnecessary information by converting the input image through the foreground mask, and use the generated output image as the input data for neural network recognition, thereby improving recognition accuracy. Notably, the input image is not limited to the RGB, YUV, HSL, or HSV color modes. Because the foreground information, the foreground mask, and the output image are all obtained by per-pixel operations on the input image, the foreground mask has substantially the same image size as the input image. In addition, the pixel gray-scale values of the output image can optionally be limited to the range of 0 to 128, reducing the storage capacity required by the apparatus and allowing large volumes of image information to be processed effectively; the foreground mask is thus a binary map, and the output image a 256-level or 128-level gray-scale map. Compared with the prior art, the present invention filters out the background noise of the input image, which helps improve the recognition performance of the neural network.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (12)

1. A method for improving the recognition efficiency of a neural network is characterized by comprising the following steps:
analyzing the input image to obtain foreground information;
generating a foreground mask using the foreground information; and
the input image is converted into an output image through the foreground mask, wherein the output image is used as import data of the neural network identification so as to improve the object identification efficiency.
2. The method of claim 1, wherein the neural network performance enhancing method further comprises:
and selecting the range of the input image in the monitoring picture by using an object detection technology.
3. The method of claim 1, wherein analyzing the input image to obtain the foreground information comprises:
generating background information of the input image; and
calculating the difference between the input image and the background information to obtain the foreground information.
4. The method of claim 1, wherein the generating the foreground mask using the foreground information comprises:
calculating a histogram of the foreground information;
dividing the histogram into a first group and a second group according to the pixel value range, wherein the pixel value range of the first group is smaller than the pixel value range of the second group;
comparing the number of pixels of the second group with a predetermined parameter; and
the foreground mask is generated based on the comparison.
5. The method of claim 4, wherein generating the foreground mask according to the comparison further comprises:
setting a foreground threshold when the number of pixels of the second group is greater than the predetermined parameter;
classifying pixels in the foreground information whose pixel values are higher than the foreground threshold into a first group of pixels;
classifying pixels in the foreground information whose pixel values are lower than the foreground threshold into a second group of pixels; and
setting pixel values within the foreground mask corresponding to the first group of pixels to a first value and pixel values corresponding to the second group of pixels to a second value.
6. The method of claim 4, wherein generating the foreground mask according to the comparison further comprises:
when the number of the pixels of the second group is less than the predetermined parameter, determining whether the first group meets a specific condition; and
if the first group meets the specific condition, the pixel values of all the pixels in the foreground mask are set to a first value.
7. The method of claim 6, wherein the pixel values of all pixels in the foreground mask are set to a second value when the first group does not meet the specific condition.
8. The method of claim 4, wherein transforming the input image into the output image via the foreground mask comprises:
the product of all pixel values of the input image with the corresponding pixel values of the foreground mask, respectively, is calculated to generate the output image.
9. The method of claim 4, wherein transforming the input image into the output image via the foreground mask comprises:
calculating the product of all pixel values of the input image and the corresponding pixel values of the foreground mask;
selecting from the products a first set of products whose positions correspond to pixel positions within the foreground mask that do not belong to the second value;
selecting a second set of products from the products whose positions correspond to pixel positions within the foreground mask that belong to a second value, and replacing the second set of products with reference values; and
the first set of products and the reference values are used to generate the output image.
10. The method of claim 1, wherein a pixel gray scale value of the output image is limited to a range of 0-128.
11. The method of claim 1, wherein the foreground mask has a size substantially the same as the input image.
12. An apparatus for enhancing recognition efficiency of neural network, the apparatus comprising:
an image generator for obtaining an input image; and
the computing processor, electrically connected to the image generator, is configured to execute the neural network identification performance improving method according to one or a combination of claims 1 to 11, so as to effectively improve the object identification performance of the neural network algorithm in a complex environment.
CN201811389529.1A 2018-11-21 2018-11-21 Neural network identification efficiency improving method and relevant identification efficiency improving device thereof Pending CN111209771A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811389529.1A CN111209771A (en) 2018-11-21 2018-11-21 Neural network identification efficiency improving method and relevant identification efficiency improving device thereof


Publications (1)

Publication Number Publication Date
CN111209771A true CN111209771A (en) 2020-05-29

Family

ID=70783992


Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0652311A (en) * 1992-07-31 1994-02-25 Kubota Corp Image processing method
CN102054270A (en) * 2009-11-10 2011-05-11 华为技术有限公司 Method and device for extracting foreground from video image
CN103646405A (en) * 2013-10-24 2014-03-19 杭州电子科技大学 Video moving object detection method based on weighting KS background model optimization algorithm
CN103700118A (en) * 2013-12-27 2014-04-02 东北大学 Moving target detection method on basis of pulse coupled neural network
CN104680521A (en) * 2015-02-06 2015-06-03 哈尔滨工业大学深圳研究生院 Improved background modeling and foreground detecting method
KR20160126596A (en) * 2015-04-24 2016-11-02 국방과학연구소 Moving object segmentation method by the pixel-based background estimation
CN107133974A (en) * 2017-06-02 2017-09-05 南京大学 The vehicle type classification method that Gaussian Background modeling is combined with Recognition with Recurrent Neural Network
CN107330916A (en) * 2017-06-15 2017-11-07 精伦电子股份有限公司 A kind of Mobile object detection method and system
CN108629254A (en) * 2017-03-24 2018-10-09 杭州海康威视数字技术股份有限公司 A kind of detection method and device of moving target

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
莫德举 et al.: "Digital Image Processing and Its Applications in Engineering", Harbin Institute of Technology Press, pp. 219-224 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200529