Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a deep-learning-based automatic reading method and system for pointer instruments, which solve the problems of detecting pointer instruments in complex environments and correcting tilted instrument images, and thereby achieve accurate instrument reading.
In order to achieve the expected effect, the invention adopts the following technical scheme:
the invention discloses an automatic reading method of a pointer instrument based on deep learning, which comprises the following steps:
S1) obtaining an image of an instrument to be read;
S2) performing dial target detection on the instrument image;
S3) performing inclination correction on the detected dial image;
S4) unfolding the corrected dial image into a rectangular dial image;
S5) detecting the position of the pointer based on the rectangular dial image;
S6) calculating a reading from the pointer position.
Further, step S1) specifically includes: obtaining original instrument image data, dividing the data into a training data set and a test data set, and annotating the data sets.
Further, step S2) specifically includes: constructing an instrument identification model, training it on the training data set with a deep learning algorithm to obtain a trained instrument identification model, inputting an image to be detected from the test data set into the trained model for target detection and classification, and cropping the result to obtain an instrument dial image of a determined type.
Further, the deep learning algorithm is an improved YOLOv5 algorithm, in which: firstly, in the backbone network, a global attention module (GAM) is added after the last C3 module; secondly, in the neck, a feature map one quarter the size of the input image is newly added; finally, a decoupled head is introduced.
Further, step S3) specifically includes: performing a primary perspective transformation after key point matching with the AKAZE algorithm, and then performing a secondary perspective transformation after ellipse fitting on the image obtained from the primary perspective transformation, so as to realize inclination correction.
Further, performing the primary perspective transformation after key point matching with the AKAZE algorithm specifically includes:
acquiring in advance a centered, non-tilted meter dial image as a template image;
marking key points, including a starting scale point and an ending scale point, on the template image and storing the annotation data;
converting each image to be read and the pre-acquired template image to grayscale;
detecting feature points in the grayscale images with the AKAZE algorithm, and matching the feature points by a bidirectional matching method;
rejecting feature points with large matching errors by using RANSAC to obtain a homography matrix;
and performing perspective transformation on the image to be read through the homography matrix, so as to correct it to the template image.
Further, step S4) specifically includes: using the positions of the starting and ending scale points of the inclination-corrected circular dial to obtain the corresponding positions on the rectangular dial after polar coordinate transformation, and re-stitching the image to obtain a rectangular dial image that finally contains only the region where the scale is located.
Further, step S5) specifically includes: training with a deep learning algorithm to obtain a trained pointer detection model, and inputting the rectangular dial image into the trained pointer detection model for target detection to obtain a target bounding box, wherein the central axis of the bounding box is the straight line on which the pointer lies, so that the pointer position is obtained.
Further, step S6) specifically includes: calculating the final reading by a distance method from the pointer position, the rectangular dial image and the range of the instrument.
The invention also discloses an automatic pointer instrument reading system based on deep learning, which comprises:
the receiving module is used for acquiring an image of the instrument to be read;
the detection module is used for detecting dial targets of the instrument images; performing tilt correction on the detected dial image; expanding the corrected dial image into a rectangular dial image; detecting the position of the pointer based on the rectangular dial plate image;
and the reading module is used for calculating the reading according to the pointer position.
Compared with the prior art, the invention has the following beneficial effects: the invention discloses a deep-learning-based automatic reading method and system for pointer instruments, in which an image of the instrument to be read is first acquired; dial target detection is then performed on the instrument image; inclination correction is performed on the detected dial image; the corrected dial image is unfolded into a rectangular dial image; the pointer position is detected based on the rectangular dial image; and finally a reading is calculated from the pointer position. The invention is applicable to various dial-type pointer meters with uniform scales, such as single-pointer and double-pointer meters, whereas traditional algorithms must be designed to recognize each type of meter separately; based on deep-learning target detection, the invention can detect and read instruments in complex environments; the invention can also correct tilted instruments in the acquired image, which reduces subsequent reading errors and improves reading accuracy.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, the invention discloses an automatic reading method of a pointer instrument based on deep learning, which comprises the following steps:
S1) obtaining an image of an instrument to be read; this step acquires the original image data for subsequent target detection: through preprocessing, images of various meters under conditions such as inclination, lighting interference and stain interference are selected, and an automatic meter detection and identification data set containing multiple kinds of meter images is constructed.
S2) performing dial target detection on the instrument image; this step detects dial targets mainly by constructing a neural network model.
S3) performing inclination correction on the detected dial image; since the dial plane of the instrument is generally at an angle to the camera plane, the extracted dial area image is not circular but a distorted image with an oblique angle. The image to be detected therefore needs to be processed by perspective transformation and corrected to a front view, so as to reduce the reading error.
S4) unfolding the corrected dial image into a rectangular dial image; to facilitate subsequent pointer position detection and reading, the circular dial image needs to be unfolded into a rectangular dial image.
S5) detecting the position of the pointer based on the rectangular dial image; a pointer detection data set is formed from the dial target detection data set through the preceding steps, and the pointer position is detected by a neural network model.
S6) calculating a reading from the pointer position.
The invention is applicable to various dial-type pointer meters with uniform scales, such as single-pointer and double-pointer meters, whereas traditional algorithms must be designed to recognize each type of meter separately; based on deep-learning target detection, the invention can detect and read instruments in complex environments; the invention can also correct tilted instruments in the acquired image, which reduces subsequent reading errors and improves reading accuracy.
In a preferred embodiment, acquiring the image of the instrument to be read specifically includes: obtaining original instrument image data, dividing the data into a training data set and a test data set, and annotating the data sets.
In a preferred embodiment, 2400 original instrument images are collected in the field by an inspection robot. Through preprocessing, images of various instruments under conditions such as inclination, illumination interference and stain interference are selected to form an automatic instrument detection and identification data set containing multiple kinds of instrument images. The data set is divided into a training set and a test set at a ratio of 9:1 and is annotated with the data annotation tool LabelImg, so that every picture has a target detection label.
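The 9:1 division described above can be sketched as follows. This is a minimal illustration only: the function name `split_dataset`, the fixed random seed and the file names are hypothetical and are not taken from the embodiment, which does not state how the images are shuffled or named.

```python
import random


def split_dataset(image_paths, train_ratio=0.9, seed=0):
    """Shuffle a list of image paths and split it into train and test subsets."""
    paths = list(image_paths)
    # A seeded generator keeps the split reproducible (assumed, not from the source).
    random.Random(seed).shuffle(paths)
    n_train = round(len(paths) * train_ratio)
    return paths[:n_train], paths[n_train:]


# Hypothetical file names standing in for the 2400 field images.
images = [f"meter_{i:04d}.jpg" for i in range(2400)]
train_set, test_set = split_dataset(images)
print(len(train_set), len(test_set))
```

With 2400 images and a ratio of 0.9 the sketch yields 2160 training images and 240 test images.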
In a preferred embodiment, performing dial target detection on the instrument image specifically includes: constructing an instrument identification model, training it on the training data set with a deep learning algorithm to obtain a trained instrument identification model, inputting an image to be detected from the test data set into the trained model for target detection and classification, and cropping the result to obtain an instrument dial image of a determined type.
Mature target detection algorithms include Faster R-CNN, Mask R-CNN, YOLO and SSD. As a typical representative of one-stage algorithms, the YOLO series balances detection speed and detection accuracy and can be widely applied in industrial sites to solve practical problems.
In a preferred embodiment, the deep learning algorithm is an improved YOLOv5 algorithm, in which: firstly, in the backbone network, a global attention module (GAM) is added after the last C3 module; secondly, in the neck, a feature map one quarter the size of the input image is newly added; finally, a decoupled head is introduced.
The invention selects the YOLOv5s algorithm for instrument target detection and improves it; the structure of the improved YOLOv5s algorithm is shown in fig. 2. In the backbone network, a global attention module (GAM) is added after the last C3 module to strengthen the ability of the algorithm to extract instrument target features during detection. In the neck, a feature map one quarter the size of the input instrument image is newly added, which improves the ability of the algorithm to mine features of smaller instrument targets, that is, features can still be extracted when the instrument occupies only a small part of the image. Finally, a decoupled head is introduced to improve how well the detection module of the network classifies and locates instrument targets.
Regarding the added global attention module: in the original YOLOv5 algorithm, the same attention is given to every layer of features during feature extraction, which ignores the fact that different feature channels have different importance and makes some targets difficult to extract. To address this weak sensitivity of the network to feature differences, the invention adds a GAM module to the backbone network and exploits its strong feature utilization capability to reduce information loss and improve global feature interaction; an outline of the module is shown in fig. 3. Given an input feature map $F_1$, the intermediate state $F_2$ and the output $F_3$ are defined as $F_2 = M_c(F_1) \otimes F_1$ and $F_3 = M_s(F_2) \otimes F_2$, where $M_c$ and $M_s$ are the channel attention and the spatial attention, respectively, and $\otimes$ denotes element-wise multiplication. The channel attention sub-module uses a three-dimensional permutation to preserve information across the three dimensions and uses a two-layer multi-layer perceptron (MLP) to amplify cross-dimension channel-spatial dependencies. In the spatial attention sub-module, two convolutional layers are used for spatial information fusion in order to focus on spatial information, and the pooling operation is removed to further preserve the feature map. The global attention module thus strengthens the target feature extraction of the network.
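The sequential gating performed by the attention module, F2 = Mc(F1) ⊗ F1 followed by F3 = Ms(F2) ⊗ F2, can be illustrated with a small sketch. The feature map is flattened to a plain list, and the two attention functions passed in below are toy placeholders chosen only to make the example runnable; they are assumptions and do not reproduce the MLP-based channel sub-module or the convolutional spatial sub-module of GAM.

```python
import math


def sigmoid(v):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def gam_gate(f1, channel_attention, spatial_attention):
    """Apply F2 = Mc(F1) * F1, then F3 = Ms(F2) * F2, element-wise."""
    mc = channel_attention(f1)
    f2 = [m * v for m, v in zip(mc, f1)]
    ms = spatial_attention(f2)
    f3 = [m * v for m, v in zip(ms, f2)]
    return f2, f3


# Toy attention maps (placeholders, not the real GAM sub-modules).
toy_channel = lambda f: [sigmoid(v) for v in f]
toy_spatial = lambda f: [0.5 for _ in f]

f2, f3 = gam_gate([0.0, 2.0], toy_channel, toy_spatial)
print(f2, f3)
```

The point of the sketch is only the order of operations: the spatial attention is computed from the channel-gated map F2, not from the original input.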
Regarding the optimization of the detection end: the original YOLOv5s algorithm uses three feature maps of different sizes to detect targets of different sizes. Following the idea of the feature pyramid network (FPN) used in the network, a feature map obtained after deep convolution carries rich semantic information but, after multiple convolutions, loses part of the position information of the target, which impairs the detection of smaller targets; a feature map obtained after shallow convolution carries less semantic information but richer position information.
In some industrial applications, meters are located far from the camera, so that the meter targets appear small in the captured image. The invention therefore adds a 4-times downsampling stage to the original YOLOv5s algorithm and feeds the image features obtained from this downsampling into the subsequent feature fusion network, which yields a feature map of a new size with a smaller receptive field and richer position information. The newly added feature map passes this position information to the detection end, so that targets are detected on four scales, which improves the detection of smaller targets and the overall detection performance of the network.
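The effect of the added scale on the detection grids can be sketched numerically. The strides 8, 16 and 32 are those of the standard YOLOv5 detection layers, and stride 4 corresponds to the newly added quarter-size feature map; the 640-pixel input size is an assumption made for illustration, since the embodiment does not state the input resolution.

```python
def detection_grid_sizes(input_size, strides):
    """Side length of the square feature map produced at each stride."""
    return {stride: input_size // stride for stride in strides}


ORIGINAL_STRIDES = (8, 16, 32)      # three scales of the original YOLOv5s
IMPROVED_STRIDES = (4, 8, 16, 32)   # with the added 4x-downsampled map

INPUT_SIZE = 640  # assumed input resolution, not given in the source

original = detection_grid_sizes(INPUT_SIZE, ORIGINAL_STRIDES)
improved = detection_grid_sizes(INPUT_SIZE, IMPROVED_STRIDES)
print(original)
print(improved)
```

Under this assumption the added scale has a 160 × 160 grid, so each cell covers a 4 × 4 pixel patch, which is why it retains more position information for small meter targets.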
Regarding the introduced decoupled head: traditional YOLO-series networks use a coupled head, but in a target detection task classification and regression conflict with each other, so the traditional coupled detection head reduces model performance. Fig. 4 is a schematic diagram of the decoupled head of the deep-learning-based automatic reading method for pointer instruments according to an embodiment of the present invention. The invention replaces the original coupled head with a decoupled head. In the decoupled head, a 1×1 convolution first reduces the channel dimension to a uniform number of channels; two parallel branches of 3×3 convolution layers then follow, one for the classification task and one for the localization and confidence tasks; the latter branch is further split by two parallel 1×1 convolutions into a localization output and a confidence output, so that separate detection layers are used for classification, localization and confidence.
In the dial detection part, the method provided by the invention is based on the improved YOLOv5s algorithm. The mAP of the improved algorithm is 3.6% higher than that of the original algorithm and reaches 98.7%, so the dial can be detected effectively and accurately in a complex environment. The detection and classification result obtained after a picture to be read is input into the instrument identification model is shown in fig. 5, and the dial image cropped after detection is shown in fig. 6.
In a preferred embodiment, performing inclination correction on the detected dial image specifically includes: performing a primary perspective transformation after key point matching with the AKAZE algorithm, and then performing a secondary perspective transformation after ellipse fitting on the image obtained from the primary perspective transformation, so as to realize inclination correction. The primary perspective transformation corrects the image to be detected into a non-tilted image consistent with the template image; the secondary perspective transformation obtains the center of the instrument, which serves as the center for unfolding, and ensures that the image to be detected is corrected to a circle, thereby reducing the errors passed on to subsequent steps.
In a preferred embodiment, performing the primary perspective transformation after key point matching with the AKAZE algorithm specifically includes: for each type of instrument, a centered, non-tilted meter dial image is acquired in advance as a template image; the template image annotation is shown in fig. 7. Key points, including a starting scale point and an ending scale point, are marked on the template image and the annotation data are stored; each image to be read and the pre-acquired template image are then converted to grayscale; feature points are detected in the grayscale images with the AKAZE algorithm and matched by a bidirectional matching method; feature points with large matching errors are then rejected by RANSAC to obtain a homography matrix M; finally, perspective transformation is performed on the image to be read through the homography matrix M, so as to correct it to the template image.
Notably, the AKAZE algorithm includes three steps: constructing a nonlinear scale space, detecting and locating feature points, and generating M-LDB (Modified Local Difference Binary) feature descriptors. AKAZE describes feature points with M-LDB, an improvement of the local difference binary algorithm, which uses gradient and intensity information extracted from the nonlinear scale space and is therefore more robust with respect to rotation invariance and scale invariance. With the AKAZE algorithm, the invention can thus better detect feature points that can be matched between the image to be read and the template image.
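Because M-LDB descriptors are binary, they are typically compared by Hamming distance, and the bidirectional matching mentioned above keeps a pair only when each descriptor is the other's nearest neighbour. The sketch below illustrates that cross-check on toy 8-bit descriptors stored as integers; the descriptor values and function names are hypothetical, and real AKAZE descriptors are much longer.

```python
def hamming(a, b):
    """Hamming distance between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def nearest(query, candidates):
    """Index of the candidate descriptor closest to the query."""
    return min(range(len(candidates)), key=lambda j: hamming(query, candidates[j]))


def cross_check_matches(desc_a, desc_b):
    """Keep pair (i, j) only if j is i's best match and i is j's best match."""
    matches = []
    for i, descriptor in enumerate(desc_a):
        j = nearest(descriptor, desc_b)
        if nearest(desc_b[j], desc_a) == i:
            matches.append((i, j))
    return matches


# Toy descriptors: image to be read versus template image.
image_desc = [0b10110010, 0b01001101, 0b11110000]
template_desc = [0b01001111, 0b10110011, 0b00001111]
matches = cross_check_matches(image_desc, template_desc)
print(matches)
```

In this toy case the third image descriptor is discarded: its nearest template descriptor prefers a different image descriptor, so the match is not mutual.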
It is worth noting that the matching result obtained after bidirectional matching still contains wrong matches, and failing to reject them degrades the subsequent perspective transformation, so the RANSAC algorithm is used to reject the wrong matching points. Starting from the obtained matching points, the algorithm randomly draws four of them as sample points and calculates a homography matrix; it then calculates the projection error of all matching points under this matrix and counts the points whose error is smaller than a threshold as consistent with the matrix; sampling and evaluation are repeated in this way, replacing the current matrix whenever a better one is found, until the optimal homography matrix is obtained. The perspective transformation then uses this optimal homography matrix to project the image from its current plane onto the new view plane, which realizes the image correction.
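The evaluation step of RANSAC, in which every match is projected through a candidate homography and kept only if its projection error is below a threshold, can be sketched as follows. The translation-only matrix, the toy matches and the 3-pixel threshold are assumptions for illustration; estimating the matrix from four sampled matches is omitted here.

```python
import math


def project(h, point):
    """Map a 2-D point through a 3x3 homography given as nested lists."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return u / w, v / w


def find_inliers(h, matches, threshold=3.0):
    """Return the matches whose reprojection error under h is below threshold."""
    inliers = []
    for src, dst in matches:
        px, py = project(h, src)
        if math.hypot(px - dst[0], py - dst[1]) < threshold:
            inliers.append((src, dst))
    return inliers


# Toy candidate homography: a pure translation by (10, 5).
H = [[1.0, 0.0, 10.0], [0.0, 1.0, 5.0], [0.0, 0.0, 1.0]]
# (source point, matched destination point); the last pair is a wrong match.
toy_matches = [((0, 0), (10, 5)), ((20, 30), (30.5, 35)), ((5, 5), (80, 90))]
inliers = find_inliers(H, toy_matches)
print(len(inliers))
```

A full RANSAC loop would repeat this count for many randomly sampled candidate matrices and keep the one with the largest consistent set.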
It is worth noting that, because the primary perspective transformation is based on the template image, an ellipse can be fitted using the key points marked on the template image, and the end points of the major and minor axes of the fitted ellipse are used as the basis for calculating the transformation matrix, which ensures that the image to be matched is corrected toward the template image; at the same time, the center of the dial is obtained from the fitted ellipse, which provides the condition for unfolding the dial. Ellipse fitting expresses the distribution of points on a plane by an elliptic equation, that is, it finds an ellipse that lies as close as possible to the points on the plane. The invention fits the dial ellipse by the least squares method, solving the general equation of an ellipse on the plane: $Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$.
According to the principle of perspective transformation, the transformation matrix has a unique solution given the coordinates of four corresponding points in the two view planes. Therefore, after the fitted elliptic equation is obtained, the end points of the major and minor axes of the ellipse are solved and used as the basis for calculating the transformation matrix, which achieves the correction. The image after inclination correction is shown in fig. 8.
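Once the ellipse has been fitted, its center follows directly from the coefficients. The sketch below assumes the general conic form a·x² + b·x·y + c·y² + d·x + e·y + f = 0 and finds the point where both partial derivatives vanish; the function name and the example coefficients are hypothetical, and the least squares fitting itself is not shown.

```python
def ellipse_center(a, b, c, d, e):
    """Centre of the conic a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0.

    The constant term f does not affect the centre, so it is not needed.
    """
    det = b * b - 4.0 * a * c
    if det >= 0:
        # An ellipse requires b^2 - 4ac < 0.
        raise ValueError("coefficients do not describe an ellipse")
    x0 = (2.0 * c * d - b * e) / det
    y0 = (2.0 * a * e - b * d) / det
    return x0, y0


# Circle (x - 3)^2 + (y - 4)^2 = 25, i.e. x^2 + y^2 - 6x - 8y = 0.
print(ellipse_center(1.0, 0.0, 1.0, -6.0, -8.0))
# Tilted ellipse centred at (1, 2): x^2 + xy + y^2 - 4x - 5y + 7 = 0.
print(ellipse_center(1.0, 1.0, 1.0, -4.0, -5.0))
```

The center obtained this way is the point that can serve as the transformation center when the dial is unfolded.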
In a preferred embodiment, unfolding the corrected dial image into a rectangular dial image specifically includes: using the positions of the starting and ending scale points of the inclination-corrected circular dial to obtain the corresponding positions on the rectangular dial after polar coordinate transformation, and re-stitching the image to obtain a rectangular dial image that finally contains only the region where the scale is located. To make it convenient to calculate the meter reading by the distance method, the dial image needs to be unfolded from a circle into a rectangle by means of polar coordinates. The center of the dial scale area is obtained in the secondary perspective transformation stage, and this center is used as the transformation center for the polar coordinate transformation, which maps the image into a polar coordinate system. The polar coordinate transformation is $\rho = \sqrt{(x - x_0)^2 + (y - y_0)^2}$, $\theta = \arctan\frac{y - y_0}{x - x_0}$, where $\rho$ and $\theta$ are the polar radius and polar angle of the polar coordinate system, respectively, $(x_0, y_0)$ is the transformation center, and $x$ and $y$ are the abscissa and ordinate of a pixel in the original coordinate system. In the unfolded image, the starting point of the polar unfolding does not coincide with the starting scale position of the circular dial; the positions on the rectangular dial corresponding to the starting and ending scale points of the circular dial are therefore obtained from the polar coordinate transformation, and the image is re-stitched to obtain a rectangular dial image that finally contains only the scale region. The unfolded rectangular dial image is shown in fig. 9.
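The polar mapping and the re-stitching step can be sketched as below. The sketch uses `math.atan2` instead of a plain arctangent so that the angle is resolved in the correct quadrant, and it treats one row of the unfolded image as a plain list; the function names and the convention that the angle runs along the horizontal axis of the unfolded image are assumptions for illustration.

```python
import math


def to_polar(x, y, x0, y0):
    """Return (rho, theta) of pixel (x, y) about centre (x0, y0); theta in [0, 2*pi)."""
    rho = math.hypot(x - x0, y - y0)
    theta = math.atan2(y - y0, x - x0) % (2.0 * math.pi)
    return rho, theta


def angle_to_column(theta, width):
    """Column of the unfolded image that corresponds to polar angle theta."""
    return int(theta / (2.0 * math.pi) * width) % width


def restitch(row, start_col, end_col):
    """Rotate one unfolded row so it runs from start_col to end_col inclusive."""
    if start_col <= end_col:
        return row[start_col:end_col + 1]
    # The scale wraps past the seam of the unfolded image.
    return row[start_col:] + row[:end_col + 1]


rho, theta = to_polar(0.0, 1.0, 0.0, 0.0)
print(rho, theta)
print(angle_to_column(math.pi, 360))
print(restitch(list(range(8)), 6, 2))
```

Applying `restitch` to every row, with the columns of the starting and ending scale points, leaves a rectangle whose left edge is the minimum scale mark and whose right edge is the maximum scale mark.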
In a preferred embodiment, detecting the pointer position based on the rectangular dial image specifically includes: forming a pointer detection data set from the instrument target detection data set through the preceding steps, training on this data set with a deep learning algorithm to obtain a trained pointer detection model, and inputting the rectangular dial image into the trained pointer detection model for target detection to obtain a target bounding box, wherein the central axis of the bounding box is the straight line on which the pointer lies, so that the pointer position is obtained. After the rectangular dial image is obtained, the pointer is detected with the YOLOv5 algorithm to obtain the target bounding box; compared with conventional methods, this reduces the interference of other areas of the original image when locating the central axis of the pointer.
In a preferred embodiment, because the background for pointer detection is relatively simple and the target is relatively large, the pointer is easier to detect, so the original YOLOv5s algorithm is selected for pointer position detection. The pointer detection result is shown in fig. 10.
In a preferred embodiment, calculating the reading from the pointer position specifically includes: calculating the final reading by a distance method from the pointer position, the rectangular dial image and the range of the instrument. After the pointer position is obtained, the final reading is calculated by the distance method formula from the pointer position, the minimum scale mark position and the maximum scale mark position. The final reading r based on the distance method is: $r = \frac{x_p}{w}\, m + r_0$,
wherein: $w$ is the width of the rectangular dial image, $x_p$ is the abscissa of the pointer line, $m$ is the range of the instrument, and $r_0$ is the starting reading of the instrument.
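The distance method amounts to a linear interpolation along the unfolded scale. The sketch below assumes that the rectangular image spans exactly from the minimum to the maximum scale mark, that the pointer abscissa is the horizontal midpoint of the detected bounding box, and that the box is given as (x_min, y_min, x_max, y_max); the function names and the example numbers are hypothetical.

```python
def pointer_axis_x(box):
    """Abscissa of the vertical central axis of a box (x_min, y_min, x_max, y_max)."""
    x_min, _, x_max, _ = box
    return (x_min + x_max) / 2.0


def distance_method_reading(x_pointer, width, span, start_reading=0.0):
    """Interpolate the reading linearly from the pointer position along the scale."""
    if width <= 0:
        raise ValueError("width must be positive")
    return start_reading + (x_pointer / width) * span


# Hypothetical detection on a 400-pixel-wide unfolded dial of a 0 to 1.6 MPa gauge.
box = (190, 0, 210, 60)
reading = distance_method_reading(pointer_axis_x(box), width=400, span=1.6)
print(round(reading, 3))
```

Here the pointer axis sits at the middle of the scale, so the sketch reports half of the range.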
In a preferred embodiment, to verify the applicability of the invention, the readings obtained by the method of the invention are compared with manual readings, and the reading errors are analyzed quantitatively. The comparison experiments include an analysis of the reading results for different types of meters and an analysis of the reading results in a noisy environment.
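A quantitative comparison of this kind can be sketched as a relative error computation. The embodiment does not state how the relative error is defined, so the definition below (absolute difference divided by the manual reading, in percent) is an assumption, and the sample values are hypothetical rather than experimental data.

```python
def relative_error_percent(auto_reading, manual_reading):
    """Relative error of the automatic reading against the manual one, in percent."""
    if manual_reading == 0:
        raise ValueError("manual reading must be non-zero for a relative error")
    return abs(auto_reading - manual_reading) / abs(manual_reading) * 100.0


# Hypothetical pair of readings, not taken from the experiments.
error = relative_error_percent(1.02, 1.00)
print(round(error, 2))
```

If the error were instead defined relative to the full range of the meter, only the denominator would change.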
Pointer readings for different types of meters: the same number of images was sampled for each type of pointer meter in the data set. A pressure gauge with a range of 2.5 MPa, a pressure gauge with a range of 1.6 MPa, a thermometer with a range of 100 °C, a thermometer with a range of 150 °C and a thermometer with a range of 120 °C are denoted meter 1, meter 2, meter 3, meter 4 and meter 5, respectively; partial test results are shown in table 1 below.
Table 1 Partial reading results for different meters
As can be seen from Table 1, the method yields small reading errors for the different types of meters, with a maximum relative error of 1.7186%, which is within the allowable range of reading error; the method can therefore accurately read different types of pointer meters.
For pointer readings in noisy environments: in an actual working environment, the environment where the instrument is located is complex, and the shot instrument picture is easily influenced by factors such as illumination intensity, dirt and the like. In order to test the stability of the method, a plurality of groups of disturbed pressure gauge images with the measuring range of 1.6MPa are selected for testing, various disturbance example images are shown in fig. 11, and table 2 shows part of test results.
Table 2 Partial meter reading results in noisy environments
Experiments show that the various kinds of environmental interference have a slight influence on the reading; the maximum relative error is 1.9089%, which is still within the expected allowable error range, so the method can read the instrument accurately under interference.
Therefore, the method can accurately detect pointer meters in a complex environment, can perform inclination correction on the detected tilted meters, and finally achieves accurate reading.
Based on the same idea, the invention also discloses a deep-learning-based automatic reading system for pointer instruments, which comprises:
the receiving module is used for acquiring an image of the instrument to be read;
the detection module is used for detecting dial targets of the instrument images; performing tilt correction on the detected dial image; expanding the corrected dial image into a rectangular dial image; detecting the position of the pointer based on the rectangular dial plate image;
and the reading module is used for calculating the reading according to the pointer position.
In a preferred embodiment, when the system enters the meter reading mode, the receiving module first acquires the image of the instrument to be read; the detection module then performs dial target detection on the instrument image, performs inclination correction on the detected dial image, unfolds the corrected dial image into a rectangular dial image, and detects the pointer position based on the rectangular dial image; finally, the reading module calculates the reading from the pointer position and outputs it.
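The order in which the three modules cooperate can be sketched as a small pipeline object. The class name, the field names and the stub stages below are hypothetical; each stage is a pluggable callable, and the stubs only record the call order instead of running real detection or correction.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class MeterReadingSystem:
    acquire_image: Callable[[], Any]         # receiving module
    detect_dial: Callable[[Any], Any]        # detection module: dial detection
    correct_tilt: Callable[[Any], Any]       # detection module: inclination correction
    unroll_dial: Callable[[Any], Any]        # detection module: rectangular unfolding
    locate_pointer: Callable[[Any], Any]     # detection module: pointer detection
    compute_reading: Callable[[Any], float]  # reading module

    def read(self) -> float:
        """Run the stages in the order described for the meter reading mode."""
        image = self.acquire_image()
        dial = self.detect_dial(image)
        corrected = self.correct_tilt(dial)
        rectangle = self.unroll_dial(corrected)
        pointer = self.locate_pointer(rectangle)
        return self.compute_reading(pointer)


trace = []


def stub(name, result):
    """Build a placeholder stage that records its name and returns a fixed result."""
    def run(*args):
        trace.append(name)
        return result
    return run


system = MeterReadingSystem(
    acquire_image=stub("acquire_image", "image"),
    detect_dial=stub("detect_dial", "dial"),
    correct_tilt=stub("correct_tilt", "corrected"),
    unroll_dial=stub("unroll_dial", "rectangle"),
    locate_pointer=stub("locate_pointer", 200.0),
    compute_reading=stub("compute_reading", 0.8),
)
value = system.read()
print(trace, value)
```

Keeping each stage as a separate callable mirrors the module split of the system, so a different detector or correction step could be swapped in without touching the others.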
The invention is applicable to various dial-type pointer meters with uniform scales, such as single-pointer and double-pointer meters, whereas traditional algorithms must be designed to recognize each type of meter separately; based on deep-learning target detection, the invention can detect and read instruments in complex environments; the invention can also correct tilted instruments in the acquired image, which reduces subsequent reading errors and improves reading accuracy.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.