CN116188756A - Instrument angle correction and indication recognition method based on deep learning


Info

Publication number
CN116188756A
CN116188756A (application CN202211507795.6A)
Authority
CN
China
Prior art keywords
scale
instrument
convolution module
image
network
Prior art date
Legal status
Pending
Application number
CN202211507795.6A
Other languages
Chinese (zh)
Inventor
陈运蓬
赵飞
赵锐
马江海
尚文
夏彦
彭柳
Current Assignee
Datong Power Supply Co of State Grid Shanxi Electric Power Co Ltd
Original Assignee
Datong Power Supply Co of State Grid Shanxi Electric Power Co Ltd
Priority date
Filing date
Publication date
Application filed by Datong Power Supply Co of State Grid Shanxi Electric Power Co Ltd filed Critical Datong Power Supply Co of State Grid Shanxi Electric Power Co Ltd
Priority to CN202211507795.6A priority Critical patent/CN116188756A/en
Publication of CN116188756A publication Critical patent/CN116188756A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V10/23Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on positionally close patterns or neighbourhood relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/181Segmentation; Edge detection involving edge growing; involving edge linking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image

Abstract

The invention relates to an instrument angle correction and indication recognition method based on deep learning, which comprises the following steps: step 1, constructing a neural network to detect and correct the angle of an instrument panel image; step 2, acquiring pointer instrument images of multiple scenes and multiple types, constructing a training set, and deeply training a convolutional neural network model with the training set; step 3, performing mean-shift filtering on the image output by the convolutional neural network model, binarizing the image with a Canny edge detection algorithm based on nonlinear bilateral filtering, and then locating the dial by Hough circle detection to obtain the circle center position and radius of the instrument; step 4, processing the located instrument image with a CTPN+CRNN text detection and recognition model to obtain the value and position of the starting scale and the maximum range of the instrument; and step 5, extracting the pointer in the instrument panel with a Hough line detection algorithm based on region selection, and finally calculating the instrument reading by the angle method from the scale, the range and the pointer.

Description

Instrument angle correction and indication recognition method based on deep learning
Technical Field
The invention relates to the technical field of image recognition, in particular to a meter angle correction and indication recognition method based on deep learning.
Background
Pointer instruments play an important role in environments with strong magnetic interference, such as transformer substations, but manual reading is unsuitable for them: the reading procedure is complicated and error-prone, and the instruments usually operate in dangerous scenes with high voltage and strong radiation. Research on intelligent detection and recognition of pointer instruments is therefore of great significance.
In the prior art, however, deep-learning-based instrument panel detection mostly relies on traditional image processing techniques, such as template matching or image subtraction, to locate the instrument pointer and then convert its angle into a reading. These methods suffer from a complicated recognition process, poor generality of intelligent reading, and poor real-time performance. They read meters mounted at regular angles reasonably well, but have difficulty reading meters at irregular angles accurately.
Disclosure of Invention
The invention provides a meter angle correction and indication recognition method based on deep learning, which comprises the following steps:
step 1, constructing a neural network to detect and correct the angle of an instrument panel image;
step 2, acquiring multiple scene and multiple kinds of pointer instrument images, constructing a training set, and performing deep training on a convolutional neural network model by using the training set;
step 3, performing mean shift filtering on an image output by the convolutional neural network model, binarizing the image by using a Canny edge detection algorithm based on nonlinear bilateral filtering, and then detecting and positioning a dial plate by using a Hough circle to obtain the circle center position and the radius of the instrument;
step 4, using a CTPN+CRNN network character detection and recognition model to process the image positioned to the instrument, and obtaining the numerical value and the position of the starting scale and the maximum range of the instrument;
and 5, extracting a pointer in the instrument panel by using a Hough straight line detection algorithm based on region selection, and finally calculating the instrument reading by using an angle method according to the scale, the measuring range and the pointer.
Further, the method further comprises the following steps in step 1:
step 11, collecting image information of an instrument panel, and cleaning and marking data;
step 12, constructing a multi-scale convolutional neural network model, and training the network model by using instrument panel image information;
step 13, loading the trained model parameters, performing angle correction on the instrument panel image, and correcting the detected image angle to a uniform orientation;
in step 12, the multi-scale convolutional neural network comprises a feature extraction module and a multi-scale decision fusion module;
the feature extraction module comprises a small-scale CNN network, a medium-scale CNN network, a large-scale CNN network and a full-scale CNN network, and outputs small-scale feature vectors, medium-scale feature vectors, large-scale feature vectors and full-scale feature vectors respectively;
the multi-scale decision fusion module generates a coefficient matrix from the four feature vectors with different scales, and generates an output vector by fusing the coefficient matrix with the small-scale feature vector, the middle-scale feature vector and the large-scale feature vector.
Further, the method further comprises the following steps at step 12:
step 121, extracting characteristics of a displacement field multi-scale convolutional neural network;
step 122, decision-level fusion of the displacement field multi-scale convolutional neural network;
step 123, calculating a displacement field multi-scale convolutional neural network loss function;
the small-scale CNN network comprises a first small-scale convolution module, an average pooling layer, a second small-scale convolution module, a third small-scale convolution module, a fourth small-scale convolution module, a fifth small-scale convolution module and a sixth small-scale convolution module which are connected in sequence;
the mesoscale CNN network comprises a first mesoscale convolution module, an average pooling layer, a second mesoscale convolution module, a third mesoscale convolution module, a fourth mesoscale convolution module, a fifth mesoscale convolution module and a sixth mesoscale convolution module which are connected in sequence;
the large-scale CNN network comprises a first large-scale convolution module, an average pooling layer, a second large-scale convolution module, a third large-scale convolution module, a fourth large-scale convolution module, a fifth large-scale convolution module, a sixth large-scale convolution module and a seventh large-scale convolution module which are connected in sequence;
the full-scale CNN network comprises a first full-scale convolution module, an average pooling layer, a second full-scale convolution module, a third full-scale convolution module, a fourth full-scale convolution module, a fifth full-scale convolution module, a sixth full-scale convolution module and a seventh full-scale convolution module which are sequentially connected;
each convolution module comprises convolution kernels, normalization units and Relu function units which are sequentially connected and have different numbers and sizes.
Further, in step 122, the output data of the four different scales are concatenated and passed through the fully connected layer to generate a 6×1 vector. This vector is reshaped and then passed through a SoftMax layer to generate a 3×2 coefficient matrix

M = [c_1, d_1; c_2, d_2; c_3, d_3]

In the coefficient matrix, the following equations are satisfied:

c_1 + c_2 + c_3 = 1, d_1 + d_2 + d_3 = 1

wherein c_j and d_j are the elements of the scale coefficient matrix and all lie in the range [0,1]; finally, the displacement vector corresponding to the input subset image can be obtained as:

u = c_1·u_1 + c_2·u_2 + c_3·u_3, v = d_1·v_1 + d_2·v_2 + d_3·v_3

wherein u_1 is the abscissa of the small-scale feature vector, v_1 is the ordinate of the small-scale feature vector, u_2 is the abscissa of the medium-scale feature vector, v_2 is the ordinate of the medium-scale feature vector, u_3 is the abscissa of the large-scale feature vector, and v_3 is the ordinate of the large-scale feature vector.
Further, in step 123, the loss function includes:
optimizing the minimum circumscribed rectangle of the detection frame in any direction by adopting a CIoU loss function:

L_CIoU = 1 - IOU + ρ^2(b, b^gt)/c^2 + α·υ

υ = (4/π^2)·(arctan(w^gt/h^gt) - arctan(w/h))^2

α = υ/((1 - IOU) + υ)

wherein IOU represents the intersection-over-union of the prediction frame and the labeling frame, b represents the center point of the prediction frame, b^gt represents the center point of the labeling frame, ρ^2 represents the square of the distance between the two center points, c is the diagonal length of the smallest box enclosing both frames, α and υ are the aspect-ratio consistency terms, and w, h and w^gt, h^gt represent the width and height of the prediction frame and of the real frame respectively;
confidence loss: L_conf (given as an equation image in the original document);
category loss: L_cls (given as an equation image in the original document);
loss of direction vector:

L_vector = Σ_i Σ_j I_ij^obj · [(t_x^h - T_x^h)^2 + (t_y^h - T_y^h)^2 + (t_x^t - T_x^t)^2 + (t_y^t - T_y^t)^2]

wherein I_ij^obj is the positive sample coefficient, which is 1 where there is a positive sample and 0 otherwise; t_x^h and T_x^h respectively represent the abscissa of the head of the detection frame direction vector and its label value; t_y^h and T_y^h respectively represent the ordinate of the head of the detection frame direction vector and its label value; t_x^t and T_x^t respectively represent the abscissa of the tail of the detection frame direction vector and its label value; and t_y^t and T_y^t respectively represent the ordinate of the tail of the detection frame direction vector and its label value.
Further, the Canny edge detection algorithm in step 3 specifically includes:
performing nonlinear bilateral filtering on the image, which considers the value domain and the spatial domain simultaneously and preserves edge information well; calculating gradient magnitudes in a 3×3 neighborhood of the filtered image with the Sobel operator direction templates, and then performing non-maximum suppression on the edge information to achieve edge refinement;
dividing the image into foreground and background according to its gray-level distribution, finding the threshold that maximizes the between-class variance of the gray levels and defining it as the high threshold, defining the low threshold as k times the high threshold with k ∈ [0.5, 0.8], and performing edge connection according to the high and low thresholds.
In step 2, the MobileNetV3 network in the YOLOv5 algorithm includes five convolution layers. The input image is convolved by the MobileNetV3 network to output corresponding feature maps, which are learned by the FPN network and the PAN network and finally sent to the Prediction Head module to predict the class confidence and the coordinates of the predicted bounding boxes; repeated detection boxes are then removed by a non-maximum suppression algorithm, and after a threshold is set, the class, class confidence and bounding box of the instrument are finally displayed.
Further, in step 5, the Hough line detection algorithm step includes: obtaining an extraction range according to the detected positions of the circle center, the initial scale and the maximum range of the instrument, traversing all edge points in the extraction range, continuously and repeatedly and randomly extracting the edge points to be mapped into polar coordinate space straight lines, extracting line segments after the accumulator of each edge point exceeds a preset value, and finally calculating the lengths of all the extracted line segments, wherein the extracted line segment with the longest length is used as a pointer of the instrument.
In step 5, firstly, the included angle area between the instrument start scale and the maximum range scale is removed in the extraction range of the Hough straight line detection algorithm, and secondly, the detection radius is reduced to avoid mistaking the scale line as a pointer.
Further, in step 3, the step of applying the Hough circle detection positioning dial plate includes: and (3) reading a binarized image output by the Canny edge detection algorithm, traversing all edges of the image, accumulating in a two-dimensional accumulator along the intersection point of the gradient direction of the edge and the opposite direction line segment, sequencing the count in the two-dimensional accumulator from large to small, reserving the position with the maximum count as the circle center of the instrument, and calculating the distance from the circle center to the edge point of the image to obtain the radius.
The beneficial effects achieved by the invention are as follows:
four different-scale CNNs, namely small-scale CNNs, medium-scale CNNs, large-scale CNNs and full-scale CNNs, are used in the invention. The output of the small-scale network, the medium-scale network and the large-scale network is accurate in the respective ranges, while the output precision of the full-scale network in all the ranges is not so high, but is enough to judge the scale to which the result belongs. Thus, by fusing the results of the four networks, an appropriate coefficient matrix can be calculated to determine the correct result. Therefore, the result of the full-scale network plays an important role in decision making although the result does not participate in fusion.
To address the abrupt change of the angle prediction loss at 0° during neural network angle-fitting training, the invention expands the encoding of the object angle information: the object angle is calculated by predicting an angle direction vector, the calculation of the direction vector coordinates is added to the loss function, and the final angle information of the object is obtained by decoding the direction vector. This extends the angle prediction range to [-180°, 180°) and eliminates the periodic abrupt changes of the angle loss.
In the technical scheme, the Canny edge detection algorithm replaces the conventional Gaussian filtering with the nonlinear bilateral filtering, so that the edge details of the image are better kept; in the YOLOv5 algorithm, a MobileNet V3 network is used for replacing a conventional Darknet network, so that the data volume is reduced and the speed is increased; the CTPN+CRNN network character detection and identification model is added to read the scale and range information of different types of meters, so that the method has strong generalization capability and universality; and the pointer is positioned by using a Hough straight line detection algorithm, so that the difficulty in calculating the reading is reduced.
Drawings
FIG. 1 is a flow chart of a method for instrument angle correction and indication recognition based on deep learning;
FIG. 2 is a model framework diagram of the displacement-field multi-scale convolutional neural network in the instrument angle correction and indication recognition method based on deep learning;
FIG. 3 is a schematic diagram of the multi-scale feature extraction module for displacement-field speckle images in the instrument angle correction and indication recognition method based on deep learning;
FIG. 4 is a schematic diagram of the multi-scale decision-stage fusion module for displacement-field information in the instrument angle correction and indication recognition method based on deep learning;
FIG. 5 is a flowchart of the YOLOv5 algorithm in the instrument angle correction and indication recognition method based on deep learning;
FIG. 6 is a flowchart of the Canny edge detection algorithm in the instrument angle correction and indication recognition method based on deep learning;
FIG. 7 is the Sobel operator direction template in the instrument angle correction and indication recognition method based on deep learning;
FIG. 8 is a flowchart of the Hough line detection algorithm in the instrument angle correction and indication recognition method based on deep learning;
FIG. 9 is a schematic diagram of the angle-method meter reading calculation in the instrument angle correction and indication recognition method based on deep learning.
Detailed Description
The technical scheme of the present invention will be described in more detail with reference to the accompanying drawings, and the present invention includes, but is not limited to, the following examples.
As shown in fig. 1, the invention provides a method for instrument angle correction and indication recognition based on deep learning, which comprises the following steps:
step 1, constructing a neural network to detect and correct the angle of an instrument panel image;
specifically, step 1 further includes the following steps:
and 11, collecting image information of the instrument panel, and cleaning and marking the data.
Instrument panel image data are collected with a size of 528 × 528 pixels. The collected data are cleaned, and only image data with clear imaging, normal exposure and obvious features are retained. The dial scale region in the instrument panel image is annotated with an arbitrarily oriented rectangle, represented as [category (numbered from 0 upward, e.g. 0, 1, 2), rectangle width, rectangle height, rectangle center point coordinates (x, y), angle between the rectangle direction vector and the x-axis in the plane rectangular coordinate system (in radians, range [-π, π), clockwise negative)], for example [0, 50, 20, 100, 200, 1.75].
And cutting out the subset images with different sizes by taking the center point of the instrument panel as the center. A small-scale dataset, a medium-scale dataset, and a large-scale dataset are generated according to the subset image size. For images of different scale subsets, the center points of the images are consistent, so that the data volume of the three data sets is the same, and the image centers and displacement labels corresponding to the different data sets are the same.
The marked data set is divided into a training set, a verification set and a test set. Wherein the ratio of the training set, the verification set and the test set is 7:1:2.
Step 12, constructing a multi-scale convolutional neural network model, and training the network model by using instrument panel image information;
as shown in fig. 2, constructing an angle detection neural network model, which comprises the following steps:
step 121, extracting characteristics of a displacement field multi-scale convolutional neural network:
different from the traditional displacement field identification method based on a single-scale neural network, the method divides an instrument panel image into large-scale, middle-scale and small-scale subset images with different pixels through a sliding window, and each scale is subjected to feature extraction through a deep convolution layer.
In the multi-scale neural network training, 2×48×48 pixel subset images are used as the input data of the small-scale CNN, 2×68×68 pixel subset images as the input data of the medium-scale CNN, 2×132×132 pixel subset images as the input data of the large-scale CNN, and 264×264 pixel subset images as the input data of the full-scale CNN, in order to improve the accuracy of the model's predictions for target values that lie near the cross-scale boundaries.
As shown in fig. 3, the network structure of the feature extraction module includes a small-scale CNN network, a medium-scale CNN network, a large-scale CNN network and a full-scale CNN network. The small-scale CNN network comprises a first small-scale convolution module, an average pooling layer, a second small-scale convolution module, a third small-scale convolution module, a fourth small-scale convolution module, a fifth small-scale convolution module and a sixth small-scale convolution module which are connected in sequence; the mesoscale CNN network comprises a first mesoscale convolution module, an average pooling layer, a second mesoscale convolution module, a third mesoscale convolution module, a fourth mesoscale convolution module, a fifth mesoscale convolution module and a sixth mesoscale convolution module which are connected in sequence; the large-scale CNN network comprises a first large-scale convolution module, an average pooling layer, a second large-scale convolution module, a third large-scale convolution module, a fourth large-scale convolution module, a fifth large-scale convolution module, a sixth large-scale convolution module and a seventh large-scale convolution module which are connected in sequence; the full-scale CNN network comprises a first full-scale convolution module, an average pooling layer, a second full-scale convolution module, a third full-scale convolution module, a fourth full-scale convolution module, a fifth full-scale convolution module, a sixth full-scale convolution module and a seventh full-scale convolution module which are sequentially connected; each convolution module comprises convolution kernels, normalization units and Relu function units which are sequentially connected and have different numbers and sizes.
The small-scale CNN network outputs small-scale feature vectors, the medium-scale CNN network outputs medium-scale feature vectors, the large-scale CNN network outputs large-scale feature vectors, and the full-scale CNN network outputs full-scale feature vectors.
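The convolution-module pattern and the small-scale branch described above can be sketched in PyTorch as follows. The module order (a convolution module, an average pooling layer, then further convolution modules) follows the description; the channel widths, kernel sizes and the two-dimensional (u, v) output head are illustrative assumptions, since the text does not specify them.

```python
# A minimal sketch of the convolution-module pattern (Conv -> BatchNorm -> ReLU)
# and of the small-scale branch layout described above. Channel counts and
# kernel sizes are assumed values, not taken from the patent text.
import torch
import torch.nn as nn

class ConvModule(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size, padding=kernel_size // 2),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)

class SmallScaleCNN(nn.Module):
    """Small-scale branch: one conv module, average pooling, five further conv modules."""
    def __init__(self, in_ch=2):
        super().__init__()
        chs = [16, 32, 64, 64, 128, 128]          # assumed channel widths
        self.stem = ConvModule(in_ch, chs[0])
        self.pool = nn.AvgPool2d(2)
        self.body = nn.Sequential(*[ConvModule(chs[i], chs[i + 1]) for i in range(5)])
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(chs[-1], 2))   # (u1, v1) small-scale output

    def forward(self, x):                          # x: 2 x 48 x 48 subset image
        return self.head(self.body(self.pool(self.stem(x))))
```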
Step 122, decision-stage fusion of displacement field multi-scale convolutional neural network:
the decision fusion module automatically fuses the displacement information results of different single scales by designing the neural network, and compared with the traditional empirical fusion, the fusion mode of the decision fusion module learns through the neural network, so that the decision fusion module is more accurate and intelligent. After the feature extraction module, the extracted features further pass through the fully connected layers and generate displacement results on each individual CNN scale. A multi-scale decision fusion module is then proposed to fuse these learned displacements at different scales.
As shown in fig. 4, in the multi-scale fusion module, the output data of the four different scales are concatenated and passed through the fully connected layer to generate a 6×1 vector. This vector is reshaped and then passed through a SoftMax layer to generate a 3×2 coefficient matrix

M = [c_1, d_1; c_2, d_2; c_3, d_3]

In the coefficient matrix, the following equations are satisfied:

c_1 + c_2 + c_3 = 1, d_1 + d_2 + d_3 = 1

wherein c_j and d_j are the elements of the scale coefficient matrix and all lie in the range [0,1]. Finally, the displacement vector corresponding to the input subset image can be obtained as:

u = c_1·u_1 + c_2·u_2 + c_3·u_3, v = d_1·v_1 + d_2·v_2 + d_3·v_3

wherein u_1 is the abscissa of the small-scale feature vector, v_1 is the ordinate of the small-scale feature vector, u_2 is the abscissa of the medium-scale feature vector, v_2 is the ordinate of the medium-scale feature vector, u_3 is the abscissa of the large-scale feature vector, and v_3 is the ordinate of the large-scale feature vector.
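A minimal sketch of this decision-level fusion: the four per-scale outputs are concatenated, mapped to a 6×1 vector by a fully connected layer, reshaped to 3×2 and normalized with SoftMax, and the resulting coefficients weight the small-, medium- and large-scale displacements. Applying the SoftMax over the three scales (so that each column sums to 1) is an assumption consistent with the constraint above.

```python
# Sketch of the multi-scale decision fusion module; layer widths are assumptions.
import torch
import torch.nn as nn

class DecisionFusion(nn.Module):
    def __init__(self, feat_dim=2):
        super().__init__()
        self.fc = nn.Linear(4 * feat_dim, 6)      # four scales -> 6x1 vector

    def forward(self, small, medium, large, full):
        # each input: (batch, 2) displacement (u_k, v_k) from one scale
        coeffs = self.fc(torch.cat([small, medium, large, full], dim=1))
        coeffs = coeffs.view(-1, 3, 2).softmax(dim=1)         # 3x2 coefficient matrix
        stacked = torch.stack([small, medium, large], dim=1)  # (batch, 3, 2)
        return (coeffs * stacked).sum(dim=1)                  # fused (u, v)
```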
Step 123, calculating a loss function;
the loss function comprises three parts, and the minimum circumscribed rectangle of the detection frame in any direction is optimized by adopting the Ciou loss function.
Figure BDA0003964376010000104
Figure BDA0003964376010000105
Figure BDA0003964376010000106
Wherein IOU represents the intersection ratio of the prediction frame and the labeling frame. b represents the center point of the prediction detection frame, b gt Representing the center point of the callout box. ρ 2 Representing the square of the distances between two central points of the prediction frame and the labeling frame, wherein alpha and upsilon are aspect ratios, and w, h and w gt 、h gt Representing the height width of the predicted frame and the height width of the real frame, respectively.
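The CIoU term can be sketched as follows, using the standard CIoU formulation; the (cx, cy, w, h) box format and the numerical-stability epsilon are assumptions of this sketch rather than details given in the text.

```python
# Sketch of a CIoU loss over boxes given as (cx, cy, w, h) tensors.
import math
import torch

def ciou_loss(pred, target, eps=1e-7):
    px, py, pw, ph = pred.unbind(-1)
    tx, ty, tw, th = target.unbind(-1)
    # intersection and union
    inter_w = (torch.min(px + pw / 2, tx + tw / 2) - torch.max(px - pw / 2, tx - tw / 2)).clamp(min=0)
    inter_h = (torch.min(py + ph / 2, ty + th / 2) - torch.max(py - ph / 2, ty - th / 2)).clamp(min=0)
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter + eps
    iou = inter / union
    # squared centre distance over squared enclosing-box diagonal
    cw = torch.max(px + pw / 2, tx + tw / 2) - torch.min(px - pw / 2, tx - tw / 2)
    ch = torch.max(py + ph / 2, ty + th / 2) - torch.min(py - ph / 2, ty - th / 2)
    rho2 = (px - tx) ** 2 + (py - ty) ** 2
    c2 = cw ** 2 + ch ** 2 + eps
    # aspect-ratio consistency term
    v = (4 / math.pi ** 2) * (torch.atan(tw / (th + eps)) - torch.atan(pw / (ph + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    return 1 - iou + rho2 / c2 + alpha * v
```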
Confidence loss: L_conf (given as an equation image in the original document).
Category loss: L_cls (given as an equation image in the original document).
Loss of direction vector:

L_vector = Σ_i Σ_j I_ij^obj · [(t_x^h - T_x^h)^2 + (t_y^h - T_y^h)^2 + (t_x^t - T_x^t)^2 + (t_y^t - T_y^t)^2]

wherein I_ij^obj is the positive sample coefficient, which is 1 where there is a positive sample and 0 otherwise; t_x^h and T_x^h respectively represent the abscissa of the head of the detection frame direction vector and its label value; t_y^h and T_y^h respectively represent the ordinate of the head of the detection frame direction vector and its label value; t_x^t and T_x^t respectively represent the abscissa of the tail of the detection frame direction vector and its label value; and t_y^t and T_y^t respectively represent the ordinate of the tail of the detection frame direction vector and its label value.
In step 124, the neural network is trained with the Adam optimizer until the model performs well on the validation set without overfitting, at which point training is stopped.
And step 13, loading trained model parameters, performing angle correction on the instrument panel image, and correcting the detected image angle to a uniform orientation.
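A possible sketch of this correction step, assuming the trained network outputs a direction vector (dx, dy) that is decoded with atan2 and used to rotate the panel image back to a uniform orientation; the decoding convention and rotation direction are assumptions of this sketch.

```python
# Sketch: decode the predicted direction vector and rotate the panel image back
# to a uniform orientation.
import math
import cv2

def correct_panel_angle(image, dx, dy):
    angle_deg = math.degrees(math.atan2(dy, dx))          # angle in [-180, 180)
    h, w = image.shape[:2]
    rot = cv2.getRotationMatrix2D((w / 2, h / 2), angle_deg, 1.0)
    return cv2.warpAffine(image, rot, (w, h))
```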
Step 2, acquiring multiple scene and multiple kinds of pointer instrument images, constructing a training set, and performing deep training on a convolutional neural network model by using the training set;
Specifically, the YOLOv5 algorithm usually adopts Darknet as its feature extraction network. Although Darknet uses a residual network structure to reduce the training difficulty of the model, the network is so deep that the computation and parameter counts become huge, training becomes complicated, and real-time performance is difficult to achieve. To achieve real-time target detection, this embodiment adopts the lightweight MobileNetV3 network as the feature extraction network.
As shown in fig. 5, the MobileNetV3 network comprises five convolution layers, C1 to C5. The input image is convolved by the MobileNetV3 network to output corresponding feature maps, which are defined to correspond to layers C1 to C5 respectively. The feature maps are sent to layers F3 to F5 of the FPN network for learning: F5 is obtained from C5 through one convolution layer; F5 is then up-sampled and the up-sampled result is added to the convolved C4 layer to obtain F4; F4 is up-sampled once more and the result is added to the convolved C3 layer to obtain F3.
The obtained F3-F5 layers are transmitted to layers P3-P5 of the PAN network for learning: P3 is obtained from F3 through a convolution layer; P3 is then down-sampled and the result is added to the convolved F4 layer to obtain P4; P4 is down-sampled once more and the result is added to the convolved F5 layer to obtain P5. Finally, the obtained P3-P5 layers are sent to the Prediction Head module to predict the class confidence and the coordinates of the predicted bounding boxes, repeated detection boxes are removed by a non-maximum suppression algorithm, and after a threshold is set, the class, class confidence and bounding box of the instrument are finally displayed.
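The non-maximum suppression step that removes repeated detection boxes can be sketched generically as follows; the (x1, y1, x2, y2) box format and the IoU threshold are illustrative, not values taken from the text.

```python
# Minimal sketch of greedy non-maximum suppression over (x1, y1, x2, y2) boxes.
def nms(boxes, scores, iou_thresh=0.45):
    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter + 1e-7)

    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)                       # highest remaining confidence
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep
```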
In this embodiment, 1500 pointer instrument images covering multiple scenes and multiple types are captured and acquired, and after preliminary labeling of targets is performed in LabelImg, the images are input as a training set into the convolutional neural network model for training.
And 3, performing mean shift filtering on an image output by the convolutional neural network model, binarizing the image by using a Canny edge detection algorithm based on nonlinear bilateral filtering, and then detecting and positioning a dial plate by using a Hough circle to obtain the circle center position and the radius of the instrument.
Specifically, according to the information such as instrument category, category confidence coefficient and boundary box output by the convolutional neural network model, the output image is cut, and then mean shift filtering is carried out: the elements with similar color distribution are clustered by means of the segmentation characteristic of the Mean Shift algorithm, color details are smoothed, and subsequent calculated amount is reduced.
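A short sketch of this cropping and mean-shift filtering step with OpenCV, assuming the detector output provides a bounding box (x1, y1, x2, y2); the spatial and color window radii are illustrative values.

```python
# Sketch: crop the detected dial and smooth colour detail with mean-shift filtering.
import cv2

def crop_and_smooth(image, box, sp=21, sr=51):
    x1, y1, x2, y2 = box
    dial = image[y1:y2, x1:x2]
    # cluster pixels with similar colour distribution and smooth colour details
    return cv2.pyrMeanShiftFiltering(dial, sp, sr)
```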
As shown in FIG. 6, since the subsequent step requires the binarized edge image of the pointer instrument, the Canny edge detection algorithm is improved and optimized in the embodiment, and the nonlinear bilateral filtering which considers the value domain and the space domain simultaneously is used for replacing the common Gaussian filtering, so that the image edge information is completely reserved while the image is denoised.
After the bilaterally filtered image is obtained, gradient magnitudes are calculated in a 3×3 neighborhood. As shown in fig. 7, the gradient in each direction is calculated with a Sobel operator direction template, and the gradient magnitude and direction of a pixel are determined from finite differences of the first partial derivatives in the x, y, 45° and 135° directions over the pixel's 8-neighborhood.
After the gradient magnitude and direction of the pixels in the eight-neighborhood are obtained, non-maximum suppression is applied to the edge information to achieve edge refinement.
According to the gray-level characteristics of the image, the foreground and background are segmented by counting the gray-level distribution of the pixels: the threshold that maximizes the between-class variance of the gray levels is found and defined as the high threshold, and the low threshold is defined as k times the high threshold, with k ∈ [0.5, 0.8].
And (3) performing edge connection after obtaining the high and low threshold values of the image:
when the amplitude of a certain pixel point is larger than the high threshold value, the pixel point is an edge point.
When the amplitude of a certain pixel point is lower than a low threshold value, the pixel point is not an edge point;
when the amplitude of a certain pixel point of the image is between the high threshold value and the low threshold value, if the pixel point is connected with the pixel point which is larger than the high threshold value, the pixel point is an edge point, otherwise, the pixel point is not the edge point.
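The improved edge-detection pipeline above can be sketched with OpenCV as follows: bilateral filtering in place of Gaussian filtering, a high threshold from Otsu's method (which maximizes the between-class variance), and a low threshold equal to k times the high threshold. OpenCV's Canny is used here for the gradient, non-maximum suppression and edge-linking stages; the filter parameters are illustrative, and a single-channel 8-bit grayscale input is assumed.

```python
# Sketch of bilateral-filtering-based Canny with Otsu-derived high/low thresholds.
import cv2

def bilateral_canny(gray, k=0.6):
    smoothed = cv2.bilateralFilter(gray, d=9, sigmaColor=75, sigmaSpace=75)
    # Otsu's threshold maximises the between-class variance of the grey levels
    high, _ = cv2.threshold(smoothed, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    low = k * high
    return cv2.Canny(smoothed, low, high)
```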
The binarized pointer instrument image output by the Canny edge detection algorithm is then processed with Hough circle detection based on the Hough gradient method to locate the center of the instrument panel: after the binarized image is read, all edges of the image are traversed and counts are accumulated in a two-dimensional accumulator along the intersection points of line segments in the gradient direction of each edge and its opposite direction; the counts in the accumulator are sorted from large to small, the position with the largest count is kept as the circle center of the instrument panel, and the distance from the circle center to the image edge points is calculated to obtain the radius, thereby locating the position and extent of the instrument panel in the image.
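A sketch of the dial localization step with OpenCV's Hough-gradient circle detection; the accumulator resolution, minimum distance and radius bounds are illustrative parameters, and only the strongest accumulator peak is kept as the dial center.

```python
# Sketch: locate the dial centre and radius with the Hough gradient method.
import cv2
import numpy as np

def locate_dial(gray):
    circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1, minDist=100,
                               param1=100, param2=50, minRadius=50, maxRadius=0)
    if circles is None:
        return None
    x, y, r = np.round(circles[0, 0]).astype(int)   # strongest accumulator peak
    return (x, y), r
```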
And 4, processing the image positioned to the instrument by using a CTPN+CRNN network character detection and recognition model to obtain the numerical value and the position of the starting scale and the maximum range of the instrument.
Specifically, the ctpn+crnn network text detection recognition model includes a CTPN network text detection model and a CRNN network text recognition model, where the CTPN network text detection model includes the following operation steps:
after the position and the range of the pointer instrument in the image are positioned, firstly, the pointer instrument image is subjected to feature extraction by utilizing a VGG16 network to generate a feature map, wherein the VGG16 uses a 3 multiplied by 3 small convolution kernel, and compared with other neural networks using a large convolution kernel, the feature map has better extraction effect.
Then, extracting texts on the feature map by using an RNN network, taking each feature point on the feature map as an anchor point, wherein each anchor point can select and generate 10 text proposal boxes with the width of 16 and different heights; the RNN network roughly classifies the generated text proposal boxes, selects proposal boxes possibly containing texts, inputs the proposal boxes into a full-connection layer to carry out accurate classification prediction, and adjusts position coordinates.
And finally outputting the initial coordinates and the height of the prediction candidate areas, the classification scores of the foreground and the background and the horizontal offset of the text proposal frame by the CTPN network text detection model, dividing the text areas of the pointer instrument image, and inputting the divided text areas into the CRNN network text recognition model.
The CRNN network text recognition model comprises the following operation steps:
and scaling and inputting each text region segmented by the CTPN network text detection model into a CNN network to obtain a Feature map, wherein in the embodiment, the text region is uniformly scaled into a gray image with height=32 and width width=160, and the gray image is input into the CNN network to obtain the Feature map with height=l, width=40 and channel=512.
The Feature Map is extracted into Feature sequences needed by an RNN network through Map-to-Sequence, each Feature vector corresponds to a receptive field of an original image, the Feature Sequence is output to two layers of two-way LSTM networks of 256 units to obtain character labels corresponding to each Feature vector, probability distribution vectors are output by the LSTM networks to form a probability matrix W, each element in the probability matrix W represents probability that the Feature vector possibly contains the character W, characters corresponding to the maximum value of each column in the probability matrix W are output to a CTC layer as actual characters to be combined and redundancy removed, text information of a pointer instrument is obtained, and after only numbers are reserved for sorting, the numerical value and the position of a starting scale corresponding to the minimum value and the numerical value and the position of the maximum range corresponding to the maximum value are set.
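A small sketch of how the recognized text can be reduced to the starting scale and the maximum range, as described above: keep only numeric results, sort them, and take the minimum as the starting scale and the maximum as the range. The `detections` format, a list of (text, position) pairs, is an assumption about the CTPN+CRNN output.

```python
# Sketch: derive starting scale and maximum range from recognised text regions.
def scale_and_range(detections):
    numeric = []
    for text, pos in detections:
        try:
            numeric.append((float(text), pos))
        except ValueError:
            continue                       # discard non-numeric text
    numeric.sort(key=lambda item: item[0])
    start_value, start_pos = numeric[0]    # minimum -> starting scale
    max_value, max_pos = numeric[-1]       # maximum -> maximum range
    return (start_value, start_pos), (max_value, max_pos)
```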
And 5, extracting a pointer in the instrument panel by using a Hough straight line detection algorithm based on region selection, and finally calculating the instrument reading by using an angle method according to the scale, the measuring range and the pointer.
Specifically, after the reading of the starting scale and the maximum range information of the pointer type meter is obtained, the pointer in the pointer type meter needs to be extracted. Because the angle range of the pointer of the instrument panel is limited, the included angle area between the initial scale of the instrument panel and the scale of the maximum range is removed from the extraction range of the Hough straight line detection algorithm.
Within the pointer detection range, the radius of the extraction range is further reduced according to the detected positions of the starting scale and the maximum range, so that long scale marks are not falsely detected as the pointer and the area searched by the Hough line detection is reduced. The pixel points in the pointer detection target region are then substituted into the Hough transform. As shown in fig. 8, the extraction range is obtained from the detected positions of the circle center, the starting scale and the maximum range of the instrument; all edge points within the extraction range are traversed, edge points are repeatedly and randomly sampled and mapped to straight lines in the polar coordinate space, and a line segment is extracted once the accumulator of an edge point exceeds a preset value. In this way all line segments within the extraction range are obtained; their lengths are calculated and sorted, and the longest line segment is taken as the pointer of the instrument.
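A sketch of the region-restricted pointer extraction with probabilistic Hough line detection: the search area is limited to a shrunken circle around the dial center, an externally built mask for the dead sector between the maximum-range scale and the starting scale is applied, and the longest detected segment is returned as the pointer. The Hough parameters and the mask construction are assumptions of this sketch.

```python
# Sketch: extract the pointer as the longest Hough line segment inside a masked region.
import cv2
import numpy as np

def extract_pointer(edges, center, radius, dead_sector_mask=None, shrink=0.8):
    mask = np.zeros_like(edges)
    cv2.circle(mask, center, int(radius * shrink), 255, -1)   # shrunken search radius
    if dead_sector_mask is not None:
        # remove the sector between the maximum-range scale and the starting scale
        mask = cv2.bitwise_and(mask, cv2.bitwise_not(dead_sector_mask))
    lines = cv2.HoughLinesP(cv2.bitwise_and(edges, mask), 1, np.pi / 180,
                            threshold=40, minLineLength=20, maxLineGap=5)
    if lines is None:
        return None
    # the longest detected segment is taken as the pointer
    return max(lines[:, 0], key=lambda l: (l[2] - l[0]) ** 2 + (l[3] - l[1]) ** 2)
```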
As shown in fig. 9, the meter reading is finally calculated with the angle method. The starting scale is point A with coordinates (x_A, y_A), the pointer end point is point B with coordinates (x_B, y_B), the maximum-range scale is point C with coordinates (x_C, y_C), and the center of the instrument panel is point O with coordinates (x_O, y_O). The line from the starting scale to the circle center is the vector OA, the pointer is the vector OB, and the line from the maximum-range scale to the circle center is the vector OC. The angle between the pointer and the starting scale is

θ = arccos((OA · OB)/(|OA|·|OB|))

The slope of the line through the starting scale A and the circle center O is k_A = (y_A - y_O)/(x_A - x_O), and its intercept is b_A = y_O - k_A·x_O. Substituting the coordinates of point B (x_B, y_B) into the equation of this line gives the position information position = y_B - k_A·x_B - b_A. If position ≥ 0, the angle formed by the pointer and the starting scale is θ; otherwise it is 2π - θ. Similarly, the angle between the starting scale and the maximum-range scale is calculated as

φ = arccos((OA · OC)/(|OA|·|OC|))

The maximum range MaxRange of the instrument was obtained earlier by the CTPN+CRNN network text detection and recognition model, and the final indication is obtained as

indication = MaxRange × θ'/φ

where θ' is the angle formed by the pointer and the starting scale determined above. The result is output, completing the reading recognition.
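The angle-method computation above can be sketched as follows, assuming the starting scale corresponds to a zero reading and that the starting scale and the dial center do not share the same x coordinate (so that the slope k_A exists).

```python
# Sketch of the angle-method reading: pointer angle over full-scale angle times range.
import math

def meter_reading(O, A, B, C, max_range):
    """O: dial centre, A: starting scale, B: pointer tip, C: maximum-range scale."""
    def angle_from_start(P):
        ax, ay = A[0] - O[0], A[1] - O[1]
        px, py = P[0] - O[0], P[1] - O[1]
        cos_t = (ax * px + ay * py) / (math.hypot(ax, ay) * math.hypot(px, py))
        theta = math.acos(max(-1.0, min(1.0, cos_t)))
        # side-of-line test from the description (assumes x_A != x_O):
        # position = y_P - k_A * x_P - b_A, evaluated relative to the centre O
        position = py - (ay / ax) * px
        return theta if position >= 0 else 2 * math.pi - theta

    return max_range * angle_from_start(B) / angle_from_start(C)
```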
The present invention is not limited to the above embodiments. Based on the examples and the disclosure of the drawings, those skilled in the art can implement the invention in various other embodiments; simple changes or modifications that adopt the design structure and concept of the invention also fall within its scope of protection.

Claims (10)

1. The instrument angle correction and indication recognition method based on the deep learning is characterized by comprising the following steps of:
step 1, constructing a neural network to detect and correct the angle of an instrument panel image;
step 2, acquiring multiple scene and multiple kinds of pointer instrument images, constructing a training set, and performing deep training on a convolutional neural network model by using the training set;
step 3, performing mean shift filtering on an image output by the convolutional neural network model, binarizing the image by using a Canny edge detection algorithm based on nonlinear bilateral filtering, and then detecting and positioning a dial plate by using a Hough circle to obtain the circle center position and the radius of the instrument;
step 4, using a CTPN+CRNN network character detection and recognition model to process the image positioned to the instrument, and obtaining the numerical value and the position of the starting scale and the maximum range of the instrument;
and 5, extracting a pointer in the instrument panel by using a Hough straight line detection algorithm based on region selection, and finally calculating the instrument reading by using an angle method according to the scale, the measuring range and the pointer.
2. The method for correcting and identifying the angle of a meter based on deep learning according to claim 1, wherein the step 1 further comprises the steps of:
step 11, collecting image information of an instrument panel, and cleaning and marking data;
step 12, constructing a multi-scale convolutional neural network model, and training the network model by using instrument panel image information;
step 13, loading the trained model parameters, performing angle correction on the instrument panel image, and correcting the detected image angle to a uniform orientation;
in step 12, the multi-scale convolutional neural network comprises a feature extraction module and a multi-scale decision fusion module;
the feature extraction module comprises a small-scale CNN network, a medium-scale CNN network, a large-scale CNN network and a full-scale CNN network, and outputs small-scale feature vectors, medium-scale feature vectors, large-scale feature vectors and full-scale feature vectors respectively;
the multi-scale decision fusion module generates a coefficient matrix from the four feature vectors with different scales, and generates an output vector by fusing the coefficient matrix with the small-scale feature vector, the middle-scale feature vector and the large-scale feature vector.
3. The method for instrument angle correction and registration recognition based on deep learning of claim 2, further comprising the steps of, at step 12:
step 121, extracting characteristics of a displacement field multi-scale convolutional neural network;
step 122, decision-level fusion of the displacement field multi-scale convolutional neural network;
step 123, calculating a displacement field multi-scale convolutional neural network loss function;
the small-scale CNN network comprises a first small-scale convolution module, an average pooling layer, a second small-scale convolution module, a third small-scale convolution module, a fourth small-scale convolution module, a fifth small-scale convolution module and a sixth small-scale convolution module which are connected in sequence;
the mesoscale CNN network comprises a first mesoscale convolution module, an average pooling layer, a second mesoscale convolution module, a third mesoscale convolution module, a fourth mesoscale convolution module, a fifth mesoscale convolution module and a sixth mesoscale convolution module which are connected in sequence;
the large-scale CNN network comprises a first large-scale convolution module, an average pooling layer, a second large-scale convolution module, a third large-scale convolution module, a fourth large-scale convolution module, a fifth large-scale convolution module, a sixth large-scale convolution module and a seventh large-scale convolution module which are connected in sequence;
the full-scale CNN network comprises a first full-scale convolution module, an average pooling layer, a second full-scale convolution module, a third full-scale convolution module, a fourth full-scale convolution module, a fifth full-scale convolution module, a sixth full-scale convolution module and a seventh full-scale convolution module which are sequentially connected;
each convolution module comprises convolution kernels, normalization units and Relu function units which are sequentially connected and have different numbers and sizes.
4. The method for instrument angle correction and indication recognition based on deep learning of claim 3, wherein in step 122, the output data of the four different scales are concatenated and passed through the full connection layer to generate a 6×1 vector, which is reshaped and then passed through a SoftMax layer to generate a 3×2 coefficient matrix

M = [c_1, d_1; c_2, d_2; c_3, d_3]

In the coefficient matrix, the following equations are satisfied:

c_1 + c_2 + c_3 = 1, d_1 + d_2 + d_3 = 1

wherein c_j and d_j are the elements of the scale coefficient matrix and all lie in the range [0,1]; finally, the displacement vector corresponding to the input subset image can be obtained as:

u = c_1·u_1 + c_2·u_2 + c_3·u_3, v = d_1·v_1 + d_2·v_2 + d_3·v_3

wherein u_1 is the abscissa of the small-scale feature vector, v_1 is the ordinate of the small-scale feature vector, u_2 is the abscissa of the medium-scale feature vector, v_2 is the ordinate of the medium-scale feature vector, u_3 is the abscissa of the large-scale feature vector, and v_3 is the ordinate of the large-scale feature vector.
5. The method for instrument angle correction and indication recognition based on deep learning of claim 3, wherein in step 123, the loss function comprises:
optimizing the minimum circumscribed rectangle of the detection frame in any direction by adopting a CIoU loss function:

L_CIoU = 1 - IOU + ρ^2(b, b^gt)/c^2 + α·υ

υ = (4/π^2)·(arctan(w^gt/h^gt) - arctan(w/h))^2

α = υ/((1 - IOU) + υ)

wherein IOU represents the intersection-over-union of the prediction frame and the labeling frame, b represents the center point of the prediction frame, b^gt represents the center point of the labeling frame, ρ^2 represents the square of the distance between the two center points, c is the diagonal length of the smallest box enclosing both frames, α and υ are the aspect-ratio consistency terms, and w, h and w^gt, h^gt represent the width and height of the prediction frame and of the real frame respectively;
confidence loss: L_conf (given as an equation image in the original document);
category loss: L_cls (given as an equation image in the original document);
loss of direction vector:

L_vector = Σ_i Σ_j I_ij^obj · [(t_x^h - T_x^h)^2 + (t_y^h - T_y^h)^2 + (t_x^t - T_x^t)^2 + (t_y^t - T_y^t)^2]

wherein I_ij^obj is the positive sample coefficient, which is 1 where there is a positive sample and 0 otherwise; t_x^h and T_x^h respectively represent the abscissa of the head of the detection frame direction vector and its label value; t_y^h and T_y^h respectively represent the ordinate of the head of the detection frame direction vector and its label value; t_x^t and T_x^t respectively represent the abscissa of the tail of the detection frame direction vector and its label value; and t_y^t and T_y^t respectively represent the ordinate of the tail of the detection frame direction vector and its label value.
6. The method for instrument angle correction and indication recognition based on deep learning according to claim 1, wherein the Canny edge detection algorithm in step 3 specifically includes:
performing nonlinear bilateral filtering on the image, which considers the value domain and the spatial domain simultaneously and preserves edge information well; calculating gradient magnitudes in a 3×3 neighborhood of the filtered image with the Sobel operator direction templates, and then performing non-maximum suppression on the edge information to achieve edge refinement;
dividing the image into foreground and background according to its gray-level distribution, finding the threshold that maximizes the between-class variance of the gray levels and defining it as the high threshold, defining the low threshold as k times the high threshold with k ∈ [0.5, 0.8], and performing edge connection according to the high and low thresholds.
7. The method for instrument angle correction and indication recognition based on deep learning according to claim 1, wherein in step 2, the MobileNetV3 network in the YOLOv5 algorithm comprises five convolution layers; the input image is convolved by the MobileNetV3 network to output corresponding feature maps, which are learned by the FPN network and the PAN network and finally sent to a Prediction Head module to predict the class confidence and the coordinates of the predicted bounding boxes; repeated detection boxes are removed by a non-maximum suppression algorithm, and after a threshold is set, the class, class confidence and bounding box of the instrument are finally displayed.
8. The method for instrument angle correction and registration recognition based on deep learning according to claim 1, wherein in step 5, the Hough straight line detection algorithm step includes: obtaining an extraction range according to the detected positions of the circle center, the initial scale and the maximum range of the instrument, traversing all edge points in the extraction range, continuously and repeatedly and randomly extracting the edge points to be mapped into polar coordinate space straight lines, extracting line segments after the accumulator of each edge point exceeds a preset value, and finally calculating the lengths of all the extracted line segments, wherein the extracted line segment with the longest length is used as a pointer of the instrument.
9. The method for correcting and identifying instrument angles and readings based on deep learning as claimed in claim 8, wherein in step 5, in the extraction range of the Hough straight line detection algorithm, firstly, the included angle area between the instrument start scale and the maximum range scale is removed, and secondly, the detection radius is reduced to avoid mistaking the scale line as a pointer.
10. The method for instrument angle correction and indication recognition based on deep learning according to claim 1, wherein in step 3, the step of applying Hough circle detection positioning dial plate comprises: and (3) reading a binarized image output by the Canny edge detection algorithm, traversing all edges of the image, accumulating in a two-dimensional accumulator along the intersection point of the gradient direction of the edge and the opposite direction line segment, sequencing the count in the two-dimensional accumulator from large to small, reserving the position with the maximum count as the circle center of the instrument, and calculating the distance from the circle center to the edge point of the image to obtain the radius.
CN202211507795.6A 2022-11-25 2022-11-25 Instrument angle correction and indication recognition method based on deep learning Pending CN116188756A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211507795.6A CN116188756A (en) 2022-11-25 2022-11-25 Instrument angle correction and indication recognition method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211507795.6A CN116188756A (en) 2022-11-25 2022-11-25 Instrument angle correction and indication recognition method based on deep learning

Publications (1)

Publication Number Publication Date
CN116188756A true CN116188756A (en) 2023-05-30

Family

ID=86444961

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211507795.6A Pending CN116188756A (en) 2022-11-25 2022-11-25 Instrument angle correction and indication recognition method based on deep learning

Country Status (1)

Country Link
CN (1) CN116188756A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117037162A (en) * 2023-08-14 2023-11-10 北京数字绿土科技股份有限公司 Detection method and system of pointer instrument based on deep learning
CN116958998A (en) * 2023-09-20 2023-10-27 四川泓宝润业工程技术有限公司 Digital instrument reading identification method based on deep learning
CN116958998B (en) * 2023-09-20 2023-12-26 四川泓宝润业工程技术有限公司 Digital instrument reading identification method based on deep learning

Similar Documents

Publication Publication Date Title
CN112949564B (en) Pointer type instrument automatic reading method based on deep learning
CN106529537B (en) A kind of digital instrument reading image-recognizing method
CN111325203B (en) American license plate recognition method and system based on image correction
CN106875381B (en) Mobile phone shell defect detection method based on deep learning
CN116188756A (en) Instrument angle correction and indication recognition method based on deep learning
CN113160192A (en) Visual sense-based snow pressing vehicle appearance defect detection method and device under complex background
CN108921163A (en) A kind of packaging coding detection method based on deep learning
CN108629286B (en) Remote sensing airport target detection method based on subjective perception significance model
CN111046881B (en) Pointer type instrument reading identification method based on computer vision and deep learning
CN112307919B (en) Improved YOLOv 3-based digital information area identification method in document image
CN112766136B (en) Space parking space detection method based on deep learning
CN114549981A (en) Intelligent inspection pointer type instrument recognition and reading method based on deep learning
CN110659637A (en) Electric energy meter number and label automatic identification method combining deep neural network and SIFT features
CN115841669A (en) Pointer instrument detection and reading identification method based on deep learning technology
CN111598098A (en) Water gauge water line detection and effectiveness identification method based on full convolution neural network
CN112365462A (en) Image-based change detection method
CN114155527A (en) Scene text recognition method and device
CN114241469A (en) Information identification method and device for electricity meter rotation process
CN114913498A (en) Parallel multi-scale feature aggregation lane line detection method based on key point estimation
CN116453104B (en) Liquid level identification method, liquid level identification device, electronic equipment and computer readable storage medium
CN111369526B (en) Multi-type old bridge crack identification method based on semi-supervised deep learning
CN110991374B (en) Fingerprint singular point detection method based on RCNN
CN111950559A (en) Pointer instrument automatic reading method based on radial gray scale
CN109902751B (en) Dial digital character recognition method integrating convolution neural network and half-word template matching
CN116188755A (en) Instrument angle correction and reading recognition device based on deep learning

Legal Events

Date Code Title Description
PB01 Publication