CN111046881B - Pointer type instrument reading identification method based on computer vision and deep learning - Google Patents


Info

Publication number
CN111046881B
CN111046881B (application CN201911219009.0A)
Authority
CN
China
Prior art keywords
image
pointer
area
instrument
line segment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911219009.0A
Other languages
Chinese (zh)
Other versions
CN111046881A (en)
Inventor
王敬宇
孙海峰
王晶
肖凝
郝凌轶
戚琦
李炜
刘国泰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xuchang Beiyou Wanlian Network Technology Co ltd
Beijing University of Posts and Telecommunications
Original Assignee
Xuchang Beiyou Wanlian Network Technology Co ltd
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xuchang Beiyou Wanlian Network Technology Co ltd, Beijing University of Posts and Telecommunications filed Critical Xuchang Beiyou Wanlian Network Technology Co ltd
Priority to CN201911219009.0A priority Critical patent/CN111046881B/en
Publication of CN111046881A publication Critical patent/CN111046881A/en
Application granted granted Critical
Publication of CN111046881B publication Critical patent/CN111046881B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/14 - Image acquisition
    • G06V30/148 - Segmentation of character regions
    • G06V30/153 - Segmentation of character regions using recognition of characters or words

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The pointer instrument reading identification method based on computer vision and deep learning comprises the following operation steps: (1) detecting the dial area of the pointer instrument; (2) preprocessing the pointer instrument image; (3) detecting the scale marks and the pointer; (4) detecting the digit areas; (5) recognizing the digits; (6) calculating the reading. The method realizes fast, accurate and automatic reading of pointer instruments, saving time and labor.

Description

Pointer type instrument reading identification method based on computer vision and deep learning
Technical Field
The invention relates to a pointer instrument reading identification method based on computer vision and deep learning, and belongs to the technical field of computer vision, in particular to the field of detection and recognition based on computer vision.
Background
At present, many pointer-type instruments in China are still read manually, which is time-consuming, labor-intensive, inefficient and error-prone. Computer vision and deep learning have advanced greatly in recent years, and how to use these technologies to read instruments automatically is a technical problem urgently awaiting a solution in many industries.
Disclosure of Invention
In view of the above, the object of the present invention is to provide a method, based on computer vision and deep learning technology, that realizes automatic reading of pointer instruments.
In order to achieve the above purpose, the invention provides a pointer instrument reading identification method based on computer vision and deep learning, which comprises the following operation steps:
(1) Pointer instrument dial area detection: input the pointer instrument image into a pre-trained deep neural network, detect the instrument dial area, and return the coordinate information of the detected dial area;
(2) Pointer instrument image preprocessing: crop the dial area image according to the dial area coordinate information from step (1); resize the cropped dial area image to a uniform size; apply dilation and erosion operations to the resized standard image to obtain the preprocessed image of the pointer instrument dial area;
(3) Scale and pointer detection: from the preprocessed image obtained in step (2), extract the coordinate information of all line segments and record their angles; calculate the perpendicular distance from the image center point to the line containing each segment, and the straight-line distances from the image center point to the two endpoints of each segment; judge whether each segment is a scale mark or the pointer according to preset conditions;
(4) Digit area detection: rotate and crop the preprocessed image of step (2) according to the coordinates and angle of each segment judged in step (3) to be a scale mark, obtaining the image of the area containing the number corresponding to that scale mark; run a contour search on the digit area image to obtain the position of each single digit; further segment the single-digit area images and normalize their resolution;
(5) Digit recognition: recognize each single-digit image returned by step (4) with a pre-trained deep neural network, taking the digit with the highest confidence; order the recognized digits according to the single-digit position information obtained in step (4) to obtain the final number;
(6) Reading calculation: calculate the value represented by each degree from the angle information of the segments judged in step (3) to be scale marks and the final numbers returned by step (5); calculate the final reading from the angle information of the segment judged in step (3) to be the pointer.
The step of returning the coordinate information of the pointer instrument dial area after detection comprises the following operation steps:
(11) Select a deep neural network as the training network;
(12) Input the upper-left and lower-right corner coordinates of the CLOCK-category data in the COCO data set as annotation data for training;
(13) Set the model output of the training network to 2 classes and 2 coordinates: the 2 classes are "not an instrument dial" and "instrument dial"; the 2 coordinates are the upper-left corner coordinates and the lower-right corner coordinates of the instrument dial area.
After cropping in step (2), the instrument dial area image is uniformly resized into a standard image 800 pixels high and 800 pixels wide.
The step (2) of applying dilation and erosion operations to the uniformly resized standard image to obtain the preprocessed image of the pointer instrument dial area comprises the following operation steps:
(21) Convert the uniformly resized dial image into a grayscale image;
(22) Apply one dilation with a kernel of size 3 to the grayscale image;
(23) Apply erosion with a kernel of size 3 twice to the dilated image to obtain the preprocessed image.
The step (3) of extracting the coordinate information of all line segments from the obtained preprocessed image and recording the segment angle information comprises the following operation steps:
(3101) Convert the preprocessed image into a binary image;
(3102) Compute the image edge information from the binary image;
(3103) Extract line segments from the image edge information, recording the coordinates of the two endpoints of each segment and its angle with the horizontal direction.
The step (3) of judging whether each line segment is a scale mark or the pointer according to preset conditions comprises the following operation steps:
(3201) Pointer condition: the distance from the image center point to the line containing the segment is less than m pixels, and the segment has the maximum length among the segments longer than n pixels, m and n being natural numbers;
(3202) Scale-mark condition: the distance from the image center point to the line containing the segment is less than j pixels, and the segment is longer than k pixels, j and k being natural numbers.
The step (4) of rotating and cropping the preprocessed image according to the coordinates and angle of a line segment judged to be a scale mark, to obtain the image of the area containing the number corresponding to that scale, comprises the following operation steps:
(41) Respectively calculate the distances from the two endpoints of the scale mark to the image center point, take the endpoint with the smaller distance, and denote its coordinates (x, y);
(42) Rotate the image according to the angle of the scale mark, with the image center point as the rotation center and the scale-mark angle minus 90 degrees as the rotation angle, so that after rotation the scale mark is perpendicular to the horizontal direction;
(43) Crop the image region whose upper-left corner is (x, y-s) and lower-right corner is (x+t, y+s), s and t being natural numbers.
In step (4), the single-digit area images are further segmented, and each segmented image is resized into a standard image 72 pixels high and 72 pixels wide.
The step (5) of recognizing each returned single-digit image with a pre-trained deep neural network and obtaining the digit with the highest confidence comprises the following operation steps:
(5101) Select a deep neural network as the training network;
(5102) Generate ten thousand images of the ten digits 0 to 9 in different fonts and at different angles using an ImageFont library;
(5103) Train the deep neural network with the generated images as its input and their classes as its labels;
(5104) Input the standard images uniformly processed in step (4) into the trained deep neural network for prediction and recognition;
(5105) The model outputs confidences for 11 classes: a background class (non-digit images) and the ten digit classes 0 to 9; the class with the highest confidence is selected as the output of the deep neural network.
The step (5) of ordering the recognized digits according to the returned single-digit position information to obtain the final number comprises the following operation steps:
(5201) Record the upper-left corner coordinates and the classification result of each image judged by the neural network to be a digit;
(5202) Sort the digits by the x value of the upper-left corner in ascending order and combine the sorted digits into an integer;
(5203) If the number begins with 0, convert the result into a decimal fraction.
The step (6) of calculating the value represented by each degree from the angle information of the scale marks returned in step (3) and the final numbers returned in step (5) comprises the following operation steps:
(6101) Store each detection result as a {final number: angle} entry in a dictionary;
(6102) For every two entries in the dictionary, divide the difference of the final numbers by the difference of the angles to obtain a candidate value per degree;
(6103) Take the median of all candidate values per degree as the final result.
The step (6) of calculating the final reading result from the angle information of the segment judged to be the pointer comprises the following operation steps:
(6201) Take each {final number: angle} entry in the final-number dictionary to compute a candidate reading;
(6202) Reading = final number - (angle - pointer angle) × value per degree;
(6203) Take the median of all candidate readings as the final reading result.
The invention realizes fast, accurate and automatic reading of pointer instruments, saving time and labor.
Drawings
FIG. 1 is a flow chart of a pointer instrument reading identification method based on computer vision and deep learning.
Detailed Description
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings.
Referring to FIG. 1, the pointer instrument reading identification method based on computer vision and deep learning proposed by the present invention comprises the following operation steps:
(1) Pointer instrument dial area detection: input the pointer instrument image into a pre-trained deep neural network, detect the instrument dial area, and return the coordinate information of the detected dial area;
(2) Pointer instrument image preprocessing: crop the dial area image according to the dial area coordinate information from step (1); resize the cropped dial area image to a uniform size; apply dilation and erosion operations to the resized standard image to obtain the preprocessed image of the pointer instrument dial area;
(3) Scale and pointer detection: from the preprocessed image obtained in step (2), extract the coordinate information of all line segments and record their angles; calculate the perpendicular distance from the image center point to the line containing each segment, and the straight-line distances from the image center point to the two endpoints of each segment; judge whether each segment is a scale mark or the pointer according to preset conditions;
(4) Digit area detection: rotate and crop the preprocessed image of step (2) according to the coordinates and angle of each segment judged in step (3) to be a scale mark, obtaining the image of the area containing the number corresponding to that scale mark; run a contour search on the digit area image to obtain the position of each single digit; further segment the single-digit area images and normalize their resolution;
(5) Digit recognition: recognize each single-digit image returned by step (4) with a pre-trained deep neural network, taking the digit with the highest confidence; order the recognized digits according to the single-digit position information obtained in step (4) to obtain the final number;
(6) Reading calculation: calculate the value represented by each degree from the angle information of the segments judged in step (3) to be scale marks and the final numbers returned by step (5); calculate the final reading from the angle information of the segment judged in step (3) to be the pointer.
The step of returning the coordinate information of the pointer instrument dial area after detection comprises the following operation steps:
(11) Select a deep neural network as the training network and pre-train it; in this embodiment, YOLOv3 under the PyTorch framework is used as the training network (PyTorch is a Python deep-learning framework derived from Torch, open-sourced by Facebook, with GPU-accelerated neural-network programming);
(12) Input the upper-left corner coordinates (x1, y1) and lower-right corner coordinates (x2, y2) of the CLOCK-category data in the COCO data set (a large image data set published by Microsoft for object detection, segmentation, human key-point detection and semantic segmentation, covering 80 classes) as annotation data for training;
(13) Set the model output of the training network to 2 classes and 2 coordinates: the 2 classes are "not an instrument dial" and "instrument dial"; the 2 coordinates are the upper-left corner coordinates and the lower-right corner coordinates of the instrument dial area.
In step (2), the cropped instrument dial area image is resized into a standard image 800 pixels high and 800 pixels wide using the resize function in the OpenCV library.
The step (2) of applying dilation and erosion operations to the uniformly resized standard image to obtain the preprocessed image of the pointer instrument dial area comprises the following operation steps:
(21) Convert the uniformly resized dial image into a grayscale image with the cvtColor function in OpenCV;
(22) Set a kernel of size 3 with the getStructuringElement function in OpenCV and apply one dilation to the grayscale image with the dilate function in OpenCV;
(23) Set a kernel of size 3 with the getStructuringElement function in OpenCV and apply erosion twice to the dilated image with the erode function in OpenCV to obtain the preprocessed image.
The step (3) of extracting the coordinate information of all line segments from the obtained preprocessed image and recording the segment angle information comprises the following operation steps:
(3101) Convert the preprocessed image into a binary image using the cvtColor function in OpenCV;
(3102) Compute the image edge information from the binary image using the Canny algorithm;
(3103) Extract line segments using the HoughLines algorithm in OpenCV according to the image edge information, recording the endpoint coordinates (x1, y1), (x2, y2) of each segment and its angle θ with the horizontal direction.
The step (3) of judging whether each line segment is a scale mark or the pointer according to preset conditions comprises the following operation steps:
(3201) Pointer condition: the distance from the image center point to the line containing the segment is less than m pixels (m is 30 in this embodiment), and the segment has the maximum length among the segments longer than n pixels (n is 70 in this embodiment);
(3202) Scale-mark condition: the distance from the image center point to the line containing the segment is less than j pixels (j is 30 in this embodiment), and the segment is longer than k pixels (k is 40 in this embodiment).
The distance from the image center point (x0, y0) to the line Ax + By + C = 0 containing the segment is d = |A·x0 + B·y0 + C| / √(A² + B²).
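The point-to-line distance can be computed directly from the segment endpoints; deriving the coefficients A, B, C from the two endpoints is an illustrative reconstruction, and the helper name is hypothetical.

```python
import math

def point_line_distance(x0, y0, x1, y1, x2, y2):
    """Perpendicular distance from (x0, y0) to the infinite line through
    (x1, y1) and (x2, y2), written as Ax + By + C = 0 with
    A = y2 - y1, B = x1 - x2, C = x2*y1 - x1*y2."""
    A = y2 - y1
    B = x1 - x2
    C = x2 * y1 - x1 * y2
    return abs(A * x0 + B * y0 + C) / math.hypot(A, B)
```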
The step (4) of rotating and cropping the preprocessed image according to the coordinates and angle of a line segment judged to be a scale mark, to obtain the image of the area containing the number corresponding to that scale, comprises the following operation steps:
(41) Respectively calculate the distances from the two endpoints (x1, y1), (x2, y2) of the scale mark to the image center point (x0, y0), di = √((xi - x0)² + (yi - y0)²), i = 1, 2; take the endpoint with the smaller distance and denote its coordinates as (x, y);
(42) From the scale-mark angle θ, compute a rotation matrix with the getRotationMatrix2D function in OpenCV and rotate the image with the affine transformation function warpAffine; the rotation center is the image center point and the rotation angle is θ - 90 degrees, so that after rotation the scale mark is perpendicular to the horizontal direction;
(43) Crop the image region whose upper-left corner is (x, y-s) and lower-right corner is (x+t, y+s); s is a natural number (100 in this embodiment) and t is a natural number (150 in this embodiment).
In step (4), the single-digit area images are further segmented, and each segmented image is uniformly resized into a standard image 72 pixels high and 72 pixels wide using the resize function in the OpenCV library.
The step (5) of recognizing each returned single-digit image with a pre-trained deep neural network and obtaining the digit with the highest confidence comprises the following operation steps:
(5101) Select a deep neural network as the training network; in this embodiment, ResNet50 under the PyTorch framework is used;
(5102) Generate ten thousand images of the ten digits 0 to 9 in different fonts and at different angles using the ImageFont library in Python;
(5103) Train the deep neural network with the generated images as its input and their classes as its labels;
(5104) Input the standard images uniformly processed in step (4) into the trained deep neural network for prediction and recognition;
(5105) The model outputs confidences for 11 classes: a background class (non-digit images) and the ten digit classes 0 to 9; the class with the highest confidence is selected as the output of the deep neural network.
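Step (5105) reduces to an argmax over the 11 confidence scores. A minimal sketch, assuming a softmax output layer (the patent does not specify the network's output activation) and hypothetical names:

```python
import numpy as np

# 11 classes: index 0 is the background (non-digit) class,
# indices 1..10 are the digits 0..9.
CLASSES = ["background"] + [str(d) for d in range(10)]

def classify(logits):
    """Convert 11 raw network scores into confidences with a numerically
    stabilised softmax and return (label, confidence) of the best class."""
    exp = np.exp(logits - np.max(logits))
    conf = exp / exp.sum()
    best = int(np.argmax(conf))
    return CLASSES[best], float(conf[best])
```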
The step (5) of ordering the recognized digits according to the returned single-digit position information to obtain the final number comprises the following operation steps:
(5201) Record the upper-left corner coordinates (x, y) and the classification result of each image judged by the neural network to be a digit;
(5202) Sort the digits by the x value of the upper-left corner in ascending order and combine the sorted digits into an integer. For example, suppose step (5) outputs a digit at coordinates (340, 401) classified as 5, one at (300, 400) classified as 1, and one at (378, 397) classified as 0; the x values in ascending order are 300, 340, 378, the digit order is therefore 1, 5, 0, and the combined number is 150;
(5203) If the number begins with 0, convert the result into a decimal fraction. For example, suppose step (5) outputs a digit at (340, 401) classified as 1, one at (378, 397) classified as 5, and one at (300, 400) classified as 0; the x values in ascending order are 300, 340, 378 and the digit order is 0, 1, 5; since the number begins with 0, the result is converted to 0.15.
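Steps (5201) to (5203), including both worked examples above, can be sketched as follows; `combine_digits` and the detection tuple layout are illustrative assumptions.

```python
def combine_digits(detections):
    """detections: list of ((x, y), digit) pairs from the classifier.
    Sort left-to-right by the x coordinate of the top-left corner,
    concatenate the digits, and treat a leading zero as a decimal
    fraction (e.g. digits 0, 1, 5 become 0.15)."""
    ordered = sorted(detections, key=lambda item: item[0][0])
    s = "".join(str(d) for _, d in ordered)
    if s.startswith("0") and len(s) > 1:
        return float("0." + s[1:])
    return int(s)
```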
The step of calculating the value represented by each degree from the angle information of the scale marks returned in step (3) and the final numbers returned in step (5) comprises the following operation steps:
(6101) Store each detection result as a {final number: angle} entry in a dictionary, for example {20: 89, 40: 151, 80: 272};
(6102) For every two entries in the dictionary, divide the difference of the final numbers by the difference of the angles to obtain a candidate value per degree; for the dictionary above the results are (20-40)/(89-151) = 0.323, (20-80)/(89-272) = 0.327 and (40-80)/(151-272) = 0.331;
(6103) Take the median of all candidate values as the final result; the median of 0.323, 0.327, 0.331 is 0.327, so the value represented by each degree is finally taken to be 0.327.
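Steps (6101) to (6103) in code, following the worked example above; `value_per_degree` is a hypothetical name. Taking the median over all pairwise ratios makes the estimate robust to a single misread scale number.

```python
from itertools import combinations
from statistics import median

def value_per_degree(scale_dict):
    """scale_dict maps each recognised scale number to its tick angle,
    e.g. {20: 89, 40: 151, 80: 272}. For every pair of entries, divide
    the number difference by the angle difference; return the median."""
    ratios = [(a - b) / (ta - tb)
              for (a, ta), (b, tb) in combinations(scale_dict.items(), 2)]
    return median(ratios)
```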
The step (6) of calculating the final reading result from the angle information of the segment judged to be the pointer comprises the following operation steps:
(6201) Take each {final number: angle} entry in the final-number dictionary, for example {20: 89, 40: 151, 80: 272}, to compute a candidate reading;
(6202) Reading = final number - (angle - pointer angle) × value per degree; for example, with a pointer angle of 160, the candidate readings are 20 - (89 - 160) × 0.327 = 43.2, 40 - (151 - 160) × 0.327 = 42.9 and 80 - (272 - 160) × 0.327 = 43.4;
(6203) Take the median of all candidate readings as the final reading; the median of 43.2, 42.9, 43.4 is 43.2, so the final reading result is 43.2.
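Steps (6201) to (6203) can be sketched as follows, using the dictionary and pointer angle from the worked example; the function name is a hypothetical choice.

```python
from statistics import median

def final_reading(scale_dict, pointer_angle, per_degree):
    """For each {number: tick_angle} entry, compute
    number - (tick_angle - pointer_angle) * per_degree,
    then return the median of all candidate readings."""
    readings = [num - (ang - pointer_angle) * per_degree
                for num, ang in scale_dict.items()]
    return median(readings)
```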
The inventors have carried out a large number of experiments with the method, and the experimental results prove that the method is feasible and efficient.

Claims (12)

1. A pointer instrument reading identification method based on computer vision and deep learning, characterized in that the method comprises the following operation steps:
(1) Pointer instrument dial area detection: input the pointer instrument image into a pre-trained deep neural network, detect the instrument dial area, and return the coordinate information of the detected dial area;
(2) Pointer instrument image preprocessing: crop the dial area image according to the dial area coordinate information from step (1); resize the cropped dial area image to a uniform size; apply dilation and erosion operations to the resized standard image to obtain the preprocessed image of the pointer instrument dial area;
(3) Scale and pointer detection: from the preprocessed image obtained in step (2), extract the coordinate information of all line segments and record their angles; calculate the perpendicular distance from the image center point to the line containing each segment, and the straight-line distances from the image center point to the two endpoints of each segment; judge whether each segment is a scale mark or the pointer according to preset conditions;
(4) Digit area detection: rotate and crop the preprocessed image of step (2) according to the coordinates and angle of each segment judged in step (3) to be a scale mark, obtaining the image of the area containing the number corresponding to that scale mark; run a contour search on the digit area image to obtain the position of each single digit; further segment the single-digit area images and normalize their resolution;
(5) Digit recognition: recognize each single-digit image returned by step (4) with a pre-trained deep neural network, taking the digit with the highest confidence; order the recognized digits according to the single-digit position information obtained in step (4) to obtain the final number;
(6) Reading calculation: calculate the value represented by each degree from the angle information of the segments judged in step (3) to be scale marks and the final numbers returned by step (5); calculate the final reading from the angle information of the segment judged in step (3) to be the pointer.
2. The pointer instrument reading identification method based on computer vision and deep learning of claim 1, characterized in that: detecting the pointer instrument dial area and returning its coordinate information comprises the following operation steps:
(11) Selecting a deep neural network as the training network;
(12) Using the upper-left and lower-right corner coordinates of the CLOCK class in the COCO data set as label data during training;
(13) Setting the model output of the training network to 2 categories and 2 coordinates: the 2 categories are "not an instrument dial" and "instrument dial"; the 2 coordinates are the upper-left corner coordinates and the lower-right corner coordinates of the instrument dial area.
3. The pointer instrument reading identification method based on computer vision and deep learning of claim 1, characterized in that: after cropping in step (2), the instrument dial area image is uniformly resized to a standard image 800 pixels high and 800 pixels wide.
4. The pointer instrument reading identification method based on computer vision and deep learning of claim 1, characterized in that: performing dilation and erosion operations on the uniformly resized standard image in step (2) to obtain the preprocessed image of the pointer instrument dial area comprises the following operation steps:
(21) Converting the uniformly resized dial image into a grayscale image;
(22) Applying one dilation with a 3×3 convolution kernel to the grayscale image;
(23) Applying erosion with a 3×3 convolution kernel twice to the dilated image to obtain the preprocessed image.
5. The pointer instrument reading identification method based on computer vision and deep learning of claim 1, characterized in that: extracting the coordinate information of all line segments in the preprocessed image and recording the segment angle information in step (3) comprises the following operation steps:
(3101) Converting the preprocessed image into a binary image;
(3102) Computing the edge information of the binary image;
(3103) Extracting line segments from the edge information, and recording the coordinates of the two endpoints of each segment and its angle to the horizontal direction.
6. The pointer instrument reading identification method based on computer vision and deep learning of claim 1, characterized in that: judging in step (3) whether each line segment is a scale mark or the pointer according to the preset conditions comprises the following operation steps:
(3201) Condition for judging a segment as the pointer: the perpendicular distance from the image center point to the line on which the segment lies is less than m pixels, and the segment is the longest segment whose length exceeds n pixels, where m and n are natural numbers;
(3202) Condition for judging a segment as a scale mark: the perpendicular distance from the image center point to the line on which the segment lies is less than j pixels, and the segment length is greater than k pixels, where j and k are natural numbers.
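The geometric test of claim 6 can be sketched in plain Python. The claim leaves m, n, j, k as unspecified natural numbers, so the defaults below are placeholders; treating segments longer than n exclusively as pointer candidates is an added assumption to keep the two classes disjoint.

```python
import math

def point_line_distance(cx, cy, p1, p2):
    """Perpendicular distance from (cx, cy) to the infinite line through p1, p2."""
    (x1, y1), (x2, y2) = p1, p2
    num = abs((y2 - y1) * cx - (x2 - x1) * cy + x2 * y1 - y2 * x1)
    den = math.hypot(y2 - y1, x2 - x1) or 1e-9  # guard degenerate segments
    return num / den

def classify_segments(segments, center, m=10, n=50, j=10, k=5):
    """Illustrative sketch of claim 6. Radial tick marks and the pointer both
    lie on lines passing near the center; the pointer is the longest segment
    exceeding n pixels. Thresholds are placeholders, not patent values."""
    cx, cy = center
    scales, pointer = [], None
    for seg in segments:
        p1, p2, angle = seg
        d = point_line_distance(cx, cy, p1, p2)
        length = math.hypot(p2[0] - p1[0], p2[1] - p1[1])
        if d < j and k < length <= n:
            scales.append(seg)
        if d < m and length > n and (pointer is None or length > pointer[1]):
            pointer = (seg, length)
    return (pointer[0] if pointer else None), scales
```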
7. The pointer instrument reading identification method based on computer vision and deep learning of claim 1, characterized in that: rotating and cropping the preprocessed image in step (4) according to the coordinate and angle information of the line segments judged to be scale marks, to obtain the image of the area where the number corresponding to each scale mark is located, comprises the following operation steps:
(41) Calculating the distances from the two endpoints of the scale mark to the image center point, taking the endpoint with the smaller distance, and denoting its coordinates as (x, y);
(42) Rotating the image according to the angle of the scale mark, with the image midpoint as the rotation center and the rotation angle equal to the scale mark angle minus 90 degrees, so that after rotation the scale mark is perpendicular to the horizontal direction;
(43) Cropping the image region whose upper-left corner is (x, y−s) and whose lower-right corner is (x+t, y+s), where s and t are natural numbers.
8. The pointer instrument reading identification method based on computer vision and deep learning of claim 1, characterized in that: in step (4), the single-digit area images are further segmented, and each segmented image is processed into a standard image 72 pixels high and 72 pixels wide.
9. The pointer instrument reading identification method based on computer vision and deep learning of claim 1, characterized in that: recognizing the returned single-digit images with a pre-trained deep neural network in step (5) to obtain the digit with the highest confidence comprises the following operation steps:
(5101) Selecting a deep neural network as the training network;
(5102) Using the ImageFont library to generate ten thousand images of the digits 0 to 9 in different fonts and at different angles;
(5103) Training the deep neural network with the generated images as the network input and their digit classes as the labels;
(5104) Inputting the uniformly processed standard images from step (4) into the trained deep neural network for prediction and recognition;
(5105) The model output of the network is the confidence of 11 categories: a background category for non-digit images, and the ten digit categories 0 to 9; the category with the highest confidence is selected as the output result of the deep neural network.
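Step (5105)'s selection of the highest-confidence class reduces to an argmax over the 11-way output. In the sketch below, the label ordering (background first, then 0–9) is an assumption, since the claim names the categories but not their indices.

```python
import numpy as np

# Assumed label order: index 0 = background, indices 1..10 = digits 0..9.
LABELS = ["background"] + [str(d) for d in range(10)]

def decode_prediction(confidences):
    """Sketch of step (5105): pick the class with the highest confidence.
    `confidences` is the 11-way output of the (assumed softmax) classifier."""
    idx = int(np.argmax(confidences))
    return LABELS[idx], float(confidences[idx])
```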
10. The pointer instrument reading identification method based on computer vision and deep learning of claim 1, characterized in that: ordering the recognized digits according to the returned single-digit position information in step (5) to obtain the final number comprises the following operation steps:
(5201) Recording the upper-left corner coordinates and the recognition result of each image judged by the neural network to be a digit;
(5202) Sorting the digits by the x value of the upper-left corner coordinates in ascending order, and concatenating the sorted digits into an integer;
(5203) If the number begins with 0, converting the result to decimal form (for example, "05" becomes 0.5).
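The sorting-and-assembly of claim 10 can be sketched as follows; the detection tuple layout ((x, y) top-left corner paired with a digit string) is an assumed representation, not specified by the patent.

```python
def assemble_number(detections):
    """Sketch of claim 10: sort recognized digits left-to-right by the x of
    their top-left corner and join them; per step (5203), a leading 0 is
    read as a decimal fraction (e.g. "05" -> 0.5)."""
    detections = sorted(detections, key=lambda d: d[0][0])  # d = ((x, y), digit)
    digits = "".join(d[1] for d in detections)
    if digits.startswith("0") and len(digits) > 1:
        return float("0." + digits[1:])
    return float(digits)
```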
11. The pointer instrument reading identification method based on computer vision and deep learning of claim 1, characterized in that: calculating the value represented by each degree in step (6) from the angle information of the scale marks returned in step (3) and the final numbers returned in step (5) comprises the following operation steps:
(6101) Storing each set of detection results in a dictionary as {final number: angle};
(6102) For every two entries in the dictionary, dividing the difference of the final numbers by the difference of the angles to obtain the value represented by each degree;
(6103) Taking the median of all the per-degree values as the final result.
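The per-degree calibration of claim 11 — pairwise difference quotients followed by a median — can be sketched directly; the dictionary layout {final number: angle} is taken from step (6101).

```python
import itertools
import statistics

def value_per_degree(scale_dict):
    """Sketch of claim 11: for every pair of {number: angle} entries, divide
    the number difference by the angle difference, then take the median."""
    ratios = []
    for (n1, a1), (n2, a2) in itertools.combinations(scale_dict.items(), 2):
        if a1 != a2:  # skip degenerate pairs with identical angles
            ratios.append((n1 - n2) / (a1 - a2))
    return statistics.median(ratios)
```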
12. The pointer instrument reading identification method based on computer vision and deep learning of claim 1 or 11, characterized in that: calculating the final reading result in step (6) from the angle information of the line segment judged to be the pointer comprises the following operation steps:
(6201) Selecting each {final number: angle} entry in the final number dictionary to calculate a reading;
(6202) Reading result = final number − (angle − pointer angle) × value represented per degree;
(6203) Taking the median of all the reading results as the final reading result.
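Claim 12's final step can be sketched as one reading estimate per scale entry (formula of step (6202)) followed by the median of step (6203); the function and parameter names are illustrative.

```python
import statistics

def compute_reading(scale_dict, pointer_angle, per_degree):
    """Sketch of claim 12: for each {final number: angle} entry, estimate
    reading = number - (angle - pointer angle) * per-degree value,
    then take the median of all estimates."""
    readings = [num - (ang - pointer_angle) * per_degree
                for num, ang in scale_dict.items()]
    return statistics.median(readings)
```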
CN201911219009.0A 2019-12-02 2019-12-02 Pointer type instrument reading identification method based on computer vision and deep learning Active CN111046881B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911219009.0A CN111046881B (en) 2019-12-02 2019-12-02 Pointer type instrument reading identification method based on computer vision and deep learning


Publications (2)

Publication Number Publication Date
CN111046881A CN111046881A (en) 2020-04-21
CN111046881B true CN111046881B (en) 2023-03-24

Family

ID=70234289

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911219009.0A Active CN111046881B (en) 2019-12-02 2019-12-02 Pointer type instrument reading identification method based on computer vision and deep learning

Country Status (1)

Country Link
CN (1) CN111046881B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111598094B (en) * 2020-05-27 2023-08-18 深圳市铁越电气有限公司 Angle regression instrument reading identification method, equipment and system based on deep learning
CN112488099B (en) * 2020-11-25 2022-12-16 上海电力大学 Digital detection extraction element on electric power liquid crystal instrument based on video
CN112949564B (en) * 2021-02-02 2022-11-29 电子科技大学 Pointer type instrument automatic reading method based on deep learning
CN113435300B (en) * 2021-06-23 2022-10-14 国网智能科技股份有限公司 Real-time identification method and system for lightning arrester instrument
CN113609984A (en) * 2021-08-05 2021-11-05 国网山东省电力公司德州市陵城区供电公司 Pointer instrument reading identification method and device and electronic equipment
CN114037993B (en) * 2021-09-26 2023-06-23 佛山中科云图智能科技有限公司 Substation pointer instrument reading method and device, storage medium and electronic equipment

Citations (4)

Publication number Priority date Publication date Assignee Title
JP3998215B1 (en) * 2007-03-29 2007-10-24 国立大学法人山口大学 Image processing apparatus, image processing method, and image processing program
CN109344820A (en) * 2018-08-06 2019-02-15 北京邮电大学 Digital electric meter Recognition of Reading method based on computer vision and deep learning
CN109543682A (en) * 2018-11-23 2019-03-29 电子科技大学 A kind of readings of pointer type meters method based on deep learning
CN109993166A (en) * 2019-04-03 2019-07-09 同济大学 The readings of pointer type meters automatic identifying method searched based on scale

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
CN107590498B (en) * 2017-09-27 2020-09-01 哈尔滨工业大学 Self-adaptive automobile instrument detection method based on character segmentation cascade two classifiers


Non-Patent Citations (2)

Title
A robust method for automatic reading recognition of electric power pointer instruments; She Shizhou et al.; Computer Technology and Development; 2017-12-05 (No. 04); full text *
Research on reading recognition methods for pointer instruments based on machine vision; Tong Weiyuan et al.; Computer Measurement & Control; 2018-03-25 (No. 03); full text *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant