CN116188756A - Instrument angle correction and indication recognition method based on deep learning - Google Patents
- Publication number
- CN116188756A (application CN202211507795.6A)
- Authority
- CN
- China
- Prior art keywords
- scale
- instrument
- convolution module
- image
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
- G06V10/23—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on positionally close patterns or neighbourhood relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/181—Segmentation; Edge detection involving edge growing; involving edge linking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
Abstract
The invention relates to a meter angle correction and indication recognition method based on deep learning, which comprises the following steps: step 1, constructing a neural network to detect and correct the angle of an instrument panel image; step 2, acquiring multiple scene and multiple kinds of pointer instrument images, constructing a training set, and performing deep training on a convolutional neural network model by using the training set; step 3, performing mean shift filtering on an image output by the convolutional neural network model, binarizing the image by using a Canny edge detection algorithm based on nonlinear bilateral filtering, and then detecting and positioning a dial plate by using a Hough circle to obtain the circle center position and the radius of the instrument; step 4, using a CTPN+CRNN network character detection and recognition model to process the image positioned to the instrument, and obtaining the numerical value and the position of the starting scale and the maximum range of the instrument; and 5, extracting a pointer in the instrument panel by using a Hough straight line detection algorithm based on region selection, and finally calculating the instrument reading by using an angle method according to the scale, the measuring range and the pointer.
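The angle method named in step 5 converts the pointer's angular position into a reading by linear interpolation between the starting scale and the maximum range. A minimal sketch of that conversion (function and parameter names are illustrative, not taken from the patent):

```python
def meter_reading(pointer_deg, start_deg, end_deg, range_min, range_max):
    """Linearly map the pointer angle onto the scale range (angle method)."""
    fraction = (pointer_deg - start_deg) / (end_deg - start_deg)
    return range_min + fraction * (range_max - range_min)

# Example: a 0-1.6 gauge whose scale spans 45 deg to 315 deg;
# a pointer at 135 deg sits one third of the way along the scale.
value = meter_reading(135.0, 45.0, 315.0, 0.0, 1.6)
```

The same formula works for any scale orientation as long as the start and end angles are measured in a consistent direction.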
Description
Technical Field
The invention relates to the technical field of image recognition, in particular to a meter angle correction and indication recognition method based on deep learning.
Background
The pointer instrument plays an important role in environments with strong magnetic interference, such as transformer substations, but manual reading is complex and error-prone, and the instruments normally work in dangerous scenes with high voltage and strong radiation, making manual reading unsuitable. Therefore, research on intelligent detection and identification of pointer instruments is of great significance.
However, in the prior art, deep-learning-based instrument panel detection mostly still relies on traditional image processing techniques to locate the instrument pointer, such as template matching or image subtraction, and then converts the angle into a reading. These methods suffer from a complex recognition process, low generality of intelligent reading, and poor real-time performance. They read meters at regular angles reasonably well, but have difficulty accurately reading meters photographed at irregular angles.
Disclosure of Invention
The invention provides a meter angle correction and indication recognition method based on deep learning, which comprises the following steps:
step 1, constructing a neural network to detect and correct the angle of an instrument panel image;
step 2, acquiring multi-scene and multi-type pointer instrument images, constructing a training set, and deeply training a convolutional neural network model with the training set;
step 3, performing mean shift filtering on the image output by the convolutional neural network model, binarizing the image with a Canny edge detection algorithm based on nonlinear bilateral filtering, and then detecting and positioning the dial plate with Hough circle detection to obtain the circle center position and radius of the instrument;
step 4, using a CTPN+CRNN network character detection and recognition model to process the image positioned to the instrument, and obtaining the numerical value and the position of the starting scale and the maximum range of the instrument;
and 5, extracting a pointer in the instrument panel by using a Hough straight line detection algorithm based on region selection, and finally calculating the instrument reading by using an angle method according to the scale, the measuring range and the pointer.
Further, the method further comprises the following steps in step 1:
step 11, collecting image information of an instrument panel, and cleaning and marking data;
step 12, constructing a multi-scale convolutional neural network model, and training the network model by using instrument panel image information;
in step 12, the multi-scale convolutional neural network comprises a feature extraction module and a multi-scale decision fusion module;
the feature extraction module comprises a small-scale CNN network, a medium-scale CNN network, a large-scale CNN network and a full-scale CNN network, and outputs small-scale feature vectors, medium-scale feature vectors, large-scale feature vectors and full-scale feature vectors respectively;
the multi-scale decision fusion module generates a coefficient matrix from the four feature vectors with different scales, and generates an output vector by fusing the coefficient matrix with the small-scale feature vector, the middle-scale feature vector and the large-scale feature vector.
Further, the method further comprises the following steps at step 12:
step 121, extracting characteristics of a displacement field multi-scale convolutional neural network;
step 122, decision-level fusion of the displacement field multi-scale convolutional neural network;
step 123, calculating a displacement field multi-scale convolutional neural network loss function;
the small-scale CNN network comprises a first small-scale convolution module, an average pooling layer, a second small-scale convolution module, a third small-scale convolution module, a fourth small-scale convolution module, a fifth small-scale convolution module and a sixth small-scale convolution module which are connected in sequence;
the mesoscale CNN network comprises a first mesoscale convolution module, an average pooling layer, a second mesoscale convolution module, a third mesoscale convolution module, a fourth mesoscale convolution module, a fifth mesoscale convolution module and a sixth mesoscale convolution module which are connected in sequence;
the large-scale CNN network comprises a first large-scale convolution module, an average pooling layer, a second large-scale convolution module, a third large-scale convolution module, a fourth large-scale convolution module, a fifth large-scale convolution module, a sixth large-scale convolution module and a seventh large-scale convolution module which are connected in sequence;
the full-scale CNN network comprises a first full-scale convolution module, an average pooling layer, a second full-scale convolution module, a third full-scale convolution module, a fourth full-scale convolution module, a fifth full-scale convolution module, a sixth full-scale convolution module and a seventh full-scale convolution module which are sequentially connected;
each convolution module comprises convolution kernels, normalization units and Relu function units which are sequentially connected and have different numbers and sizes.
Further, in step 122, the output data of the four different scales are superimposed together and then enter the fully connected layer to generate a 6×1 vector. This vector is reshaped and then passed through a SoftMax layer to generate a 3×2 coefficient matrix.
In the coefficient matrix, the following equations are satisfied:
c1 + c2 + c3 = 1 and d1 + d2 + d3 = 1,
where cj and dj are the elements of the scale-coefficient matrix and all lie within the range [0,1]; finally, the displacement vector (u, v) corresponding to the input subset image is obtained as:
u = c1·u1 + c2·u2 + c3·u3, v = d1·v1 + d2·v2 + d3·v3,
where u1 is the abscissa of the small-scale feature vector, v1 is the ordinate of the small-scale feature vector, u2 is the abscissa of the mesoscale feature vector, v2 is the ordinate of the mesoscale feature vector, u3 is the abscissa of the large-scale feature vector, and v3 is the ordinate of the large-scale feature vector.
Further, in step 123, the loss function includes:
optimizing the minimum circumscribed rectangle of the detection frame in any direction by adopting the CIoU loss function:
L_CIoU = 1 − IoU + ρ²(b, b_gt)/c² + α·υ, with υ = (4/π²)·(arctan(w_gt/h_gt) − arctan(w/h))² and α = υ/((1 − IoU) + υ),
wherein IoU represents the intersection-over-union of the prediction frame and the labeling frame, b represents the center point of the prediction detection frame, b_gt represents the center point of the annotation frame, ρ² represents the squared distance between the two center points of the prediction frame and the labeling frame, c is the diagonal length of the smallest box enclosing both frames, υ measures the consistency of the aspect ratios, α is its weighting coefficient, and w, h and w_gt, h_gt represent the width and height of the prediction frame and of the real frame, respectively;
confidence loss:
category loss:
loss of direction vector:
wherein ,for positive sample coefficients, there is a positive sample +.>1, the rest 0, ">Andrespectively representing the abscissa of the head of the detection frame and the tag value thereof,/-> and />Respectively representing the ordinate of the detection frame direction vector head and the label value thereof,/-> and />Respectively representing the tail abscissa value and the label value of the detection frame direction vector, < >> and />The vertical coordinate value of the tail of the detection frame direction vector and the label value thereof are respectively represented. />
Further, the Canny edge detection algorithm in step 3 specifically includes:
nonlinear bilateral filtering, which considers both the value domain and the spatial domain, is applied to the image so that edge information is well preserved; gradient magnitudes are then calculated in a 3×3 neighborhood of the filtered image using Sobel operator direction templates, and non-maximum suppression is applied to the edge information to achieve edge refinement;
the image is divided into foreground and background parts according to its gray distribution, the between-class gray variance is maximized, the threshold that maximizes this variance is found and defined as the high threshold, the low threshold is defined as k times the high threshold with k ∈ [0.5, 0.8], and edge connection is performed according to the high and low thresholds.
In step 2, the MobileNetV3 network in the YOLOv5 algorithm includes five convolution layers. The input image is convolved by the MobileNetV3 network to output the corresponding feature maps, which are then learned by the FPN and PAN networks and finally sent to the Prediction Head module to predict the category confidence and the coordinates of the prediction bounding box; repeated detection frames are removed by the non-maximum suppression algorithm, and after a threshold is set, the category, category confidence and bounding box of the instrument are finally displayed.
Further, in step 5, the Hough line detection algorithm comprises: obtaining an extraction range according to the detected positions of the instrument's circle center, initial scale and maximum range; traversing all edge points in the extraction range; repeatedly selecting random edge points and mapping them to straight lines in the polar coordinate space; extracting a line segment whenever an edge point's accumulator exceeds a preset value; and finally calculating the lengths of all extracted line segments, with the longest extracted segment taken as the pointer of the instrument.
In step 5, firstly, the included-angle area between the instrument's starting scale and its maximum-range scale is removed from the extraction range of the Hough line detection algorithm; secondly, the detection radius is reduced, so that a scale line is not mistaken for the pointer.
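The region selection above can be sketched as follows: candidate line segments (as a Hough line detector would return them) are filtered to those lying within a reduced detection radius of the dial center and outside the dead zone between the maximum-range and starting scales, and the longest survivor is taken as the pointer. All names, and the convention that the dead zone is given as an angle interval, are illustrative assumptions:

```python
import math

def pick_pointer(segments, center, max_radius, dead_lo_deg, dead_hi_deg):
    """segments: list of (x1, y1, x2, y2). Returns the longest segment whose
    far end stays inside max_radius of center and outside the dead zone."""
    cx, cy = center
    best, best_len = None, 0.0
    for x1, y1, x2, y2 in segments:
        d1 = math.hypot(x1 - cx, y1 - cy)
        d2 = math.hypot(x2 - cx, y2 - cy)
        if max(d1, d2) > max_radius:        # likely a scale tick or dial rim
            continue
        fx, fy = (x1, y1) if d1 > d2 else (x2, y2)   # far end of the segment
        ang = math.degrees(math.atan2(fy - cy, fx - cx)) % 360
        if dead_lo_deg <= ang <= dead_hi_deg:        # area between max range and start scale
            continue
        length = math.hypot(x2 - x1, y2 - y1)
        if length > best_len:
            best, best_len = (x1, y1, x2, y2), length
    return best

best = pick_pointer([(0, 0, 30, 0), (0, 0, 0, 40), (10, 10, 45, 45)],
                    center=(0, 0), max_radius=50,
                    dead_lo_deg=80, dead_hi_deg=100)
```

Here the vertical segment falls in the dead zone and the diagonal one exceeds the reduced radius, so the horizontal segment is selected.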
Further, in step 3, the step of applying Hough circle detection to position the dial plate includes: reading the binarized image output by the Canny edge detection algorithm; traversing all edges of the image and accumulating, in a two-dimensional accumulator, along the line segments through each edge point in its gradient direction and the opposite direction; sorting the counts in the two-dimensional accumulator in descending order and keeping the position with the maximum count as the circle center of the instrument; and calculating the distance from the circle center to the image edge points to obtain the radius.
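The gradient-direction voting above can be sketched in NumPy on synthetic data. Each edge point votes along a ray in its gradient direction and the opposite direction; the accumulator maximum is kept as the center, and the radius is the mean distance from that center to the edge points. The distance range, grid size and variable names are illustrative assumptions:

```python
import math
import numpy as np

def hough_circle_center(edge_points, gradients, shape, d_range=(5, 60)):
    """Vote along each edge point's gradient direction (and its opposite)
    into a 2-D accumulator; the most-voted cell is the circle center."""
    acc = np.zeros(shape, dtype=np.int32)
    for (x, y), (gx, gy) in zip(edge_points, gradients):
        n = math.hypot(gx, gy)
        ux, uy = gx / n, gy / n
        for sign in (1, -1):                 # gradient direction and its opposite
            for d in range(*d_range):
                px = int(round(x - sign * d * ux))
                py = int(round(y - sign * d * uy))
                if 0 <= px < shape[0] and 0 <= py < shape[1]:
                    acc[px, py] += 1
    cx, cy = np.unravel_index(np.argmax(acc), acc.shape)
    # radius = mean distance from the found center to the edge points
    r = float(np.mean([math.hypot(x - cx, y - cy) for x, y in edge_points]))
    return (int(cx), int(cy)), r

# Synthetic dial rim: a circle of radius 20 centered at (50, 60); on a circle
# the image gradient points radially, which we use as the gradient input.
pts, grads = [], []
for t in np.linspace(0, 2 * math.pi, 90, endpoint=False):
    pts.append((50 + 20 * math.cos(t), 60 + 20 * math.sin(t)))
    grads.append((math.cos(t), math.sin(t)))
center, radius = hough_circle_center(pts, grads, (120, 120))
```

Every rim point's ray passes through the true center, so that cell collects one vote per point and dominates the accumulator.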
The beneficial effects achieved by the invention are as follows:
four different-scale CNNs, namely small-scale CNNs, medium-scale CNNs, large-scale CNNs and full-scale CNNs, are used in the invention. The output of the small-scale network, the medium-scale network and the large-scale network is accurate in the respective ranges, while the output precision of the full-scale network in all the ranges is not so high, but is enough to judge the scale to which the result belongs. Thus, by fusing the results of the four networks, an appropriate coefficient matrix can be calculated to determine the correct result. Therefore, the result of the full-scale network plays an important role in decision making although the result does not participate in fusion.
According to the invention, aiming at the problem that the angle prediction loss exhibits an abrupt jump at 0° during neural network angle-fitting training, the object angle information is expansion-coded: the object angle is calculated by predicting an angle direction vector, the calculation of the direction vector coordinates is added to the loss function, and the final angle information of the object is obtained by decoding the direction vector. This extends the angle prediction range to [−180°, 180°) and eliminates the periodic abrupt changes in the angle loss.
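The expansion coding can be illustrated as follows: instead of regressing the angle directly, the angle is represented by a unit direction vector and decoded with atan2, which is continuous across the ±180° boundary, so a squared error on the vector coordinates has no jump there. A minimal sketch (function names are illustrative):

```python
import math

def encode_angle(theta_deg):
    """Encode an angle in [-180, 180) as a unit direction vector."""
    t = math.radians(theta_deg)
    return (math.cos(t), math.sin(t))

def decode_angle(vec):
    """Decode a direction vector back to an angle in [-180, 180)."""
    return math.degrees(math.atan2(vec[1], vec[0]))

# Angles of 179 deg and -179 deg are only 2 deg apart; their encodings are
# close too, so a loss on the vector coordinates stays smooth at the boundary.
a, b = encode_angle(179.0), encode_angle(-179.0)
gap = math.hypot(a[0] - b[0], a[1] - b[1])
```

With a direct angle regression, the same pair would produce a loss proportional to 358°, which is the abrupt jump the patent's coding eliminates.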
In the technical scheme, the Canny edge detection algorithm replaces conventional Gaussian filtering with nonlinear bilateral filtering, so the edge details of the image are better preserved; in the YOLOv5 algorithm, a MobileNetV3 network replaces the conventional DarkNet network, reducing the amount of data and increasing speed; the CTPN+CRNN character detection and recognition model is added to read the scale and range information of different types of meters, giving the method strong generalization capability and universality; and the pointer is positioned by a Hough line detection algorithm, reducing the difficulty of calculating the reading.
Drawings
FIG. 1 is a flow chart of the method for instrument angle correction and indication recognition based on deep learning;
FIG. 2 is a model framework diagram of the displacement field recognition multi-scale convolutional neural network in the method for instrument angle correction and indication recognition based on deep learning;
FIG. 3 is a schematic diagram of the multi-scale feature extraction module for displacement field speckle images in the method for instrument angle correction and indication recognition based on deep learning;
FIG. 4 is a schematic diagram of the multi-scale decision-stage fusion module for displacement field information in the method for instrument angle correction and indication recognition based on deep learning;
FIG. 5 is a flowchart of the YOLOv5 algorithm in the method for instrument angle correction and indication recognition based on deep learning;
FIG. 6 is a flowchart of the Canny edge detection algorithm in the method for instrument angle correction and indication recognition based on deep learning;
FIG. 7 shows the Sobel operator direction templates in the method for instrument angle correction and indication recognition based on deep learning;
FIG. 8 is a flow chart of the Hough line detection algorithm in the method for instrument angle correction and indication recognition based on deep learning;
FIG. 9 is a schematic diagram of the angle-method meter reading calculation in the method for instrument angle correction and indication recognition based on deep learning.
Detailed Description
The technical scheme of the present invention will be described in more detail with reference to the accompanying drawings, and the present invention includes, but is not limited to, the following examples.
As shown in fig. 1, the invention provides a method for instrument angle correction and indication recognition based on deep learning, which comprises the following steps:
specifically, step 1 further includes the following steps:
and 11, collecting image information of the instrument panel, and cleaning and marking the data.
Instrument panel image data are collected at a size of 528×528 pixels; the collected data are cleaned, keeping only image data that are clear, normally exposed and show distinct features. The scale-range part of the dial in the panel image is labeled as a rectangle of arbitrary orientation, represented as [category (numbered from 0: 0, 1, 2, and so on), rectangle width, rectangle height, rectangle center point coordinates (x, y), angle between the rectangle direction vector and the x-axis in the planar rectangular coordinate system (in radians, range [−π, π), clockwise negative)], for example [0, 50, 20, 100, 200, 1.75].
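The rotated-rectangle label above can be parsed as in the following sketch, which also recovers the unit direction vector from the stored angle; the class and field names are illustrative, with the field order taken from the example [0, 50, 20, 100, 200, 1.75]:

```python
import math
from dataclasses import dataclass

@dataclass
class RotatedBoxLabel:
    category: int
    width: float
    height: float
    cx: float
    cy: float
    angle_rad: float   # angle to the x-axis, range [-pi, pi), clockwise negative

    @classmethod
    def from_list(cls, raw):
        """Build a label from [category, w, h, x, y, angle]."""
        return cls(int(raw[0]), *map(float, raw[1:]))

    def direction_vector(self):
        """Unit direction vector of the rectangle, decoded from the angle."""
        return (math.cos(self.angle_rad), math.sin(self.angle_rad))

label = RotatedBoxLabel.from_list([0, 50, 20, 100, 200, 1.75])
```
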
And cutting out the subset images with different sizes by taking the center point of the instrument panel as the center. A small-scale dataset, a medium-scale dataset, and a large-scale dataset are generated according to the subset image size. For images of different scale subsets, the center points of the images are consistent, so that the data volume of the three data sets is the same, and the image centers and displacement labels corresponding to the different data sets are the same.
The marked data set is divided into a training set, a verification set and a test set. Wherein the ratio of the training set, the verification set and the test set is 7:1:2.
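The 7:1:2 partition can be sketched with a deterministic shuffle (the seed and function name are illustrative):

```python
import random

def split_dataset(samples, ratios=(0.7, 0.1, 0.2), seed=42):
    """Shuffle and split into training / verification / test sets at 7:1:2."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * ratios[0])
    n_val = int(len(items) * ratios[1])
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

train, val, test = split_dataset(range(1000))
```
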
Step 12, constructing a multi-scale convolutional neural network model, and training the network model by using instrument panel image information;
as shown in fig. 2, constructing an angle detection neural network model, which comprises the following steps:
step 121, extracting characteristics of a displacement field multi-scale convolutional neural network:
different from the traditional displacement field identification method based on a single-scale neural network, the method divides an instrument panel image into large-scale, middle-scale and small-scale subset images with different pixels through a sliding window, and each scale is subjected to feature extraction through a deep convolution layer.
In the multi-scale neural network training, a 2×48×48 pixel subset image is used as the input data of the small-scale CNN, a 2×68×68 pixel subset image as the input data of the medium-scale CNN, a 2×132×132 pixel subset image as the input data of the large-scale CNN, and a 264×264 pixel subset image as the input data of the full-scale CNN, in order to improve the accuracy of the model's predictions for target values near the cross-scale boundaries.
As shown in fig. 3, the network structure of the feature extraction module includes a small-scale CNN network, a medium-scale CNN network, a large-scale CNN network and a full-scale CNN network. The small-scale CNN network comprises a first small-scale convolution module, an average pooling layer, a second small-scale convolution module, a third small-scale convolution module, a fourth small-scale convolution module, a fifth small-scale convolution module and a sixth small-scale convolution module which are connected in sequence; the mesoscale CNN network comprises a first mesoscale convolution module, an average pooling layer, a second mesoscale convolution module, a third mesoscale convolution module, a fourth mesoscale convolution module, a fifth mesoscale convolution module and a sixth mesoscale convolution module which are connected in sequence; the large-scale CNN network comprises a first large-scale convolution module, an average pooling layer, a second large-scale convolution module, a third large-scale convolution module, a fourth large-scale convolution module, a fifth large-scale convolution module, a sixth large-scale convolution module and a seventh large-scale convolution module which are connected in sequence; the full-scale CNN network comprises a first full-scale convolution module, an average pooling layer, a second full-scale convolution module, a third full-scale convolution module, a fourth full-scale convolution module, a fifth full-scale convolution module, a sixth full-scale convolution module and a seventh full-scale convolution module which are connected in sequence. Each convolution module comprises sequentially connected convolution kernels, a normalization unit and a ReLU function unit; the number and size of the convolution kernels differ between modules.
The small-scale CNN network outputs small-scale feature vectors, the medium-scale CNN network outputs medium-scale feature vectors, the large-scale CNN network outputs large-scale feature vectors, and the full-scale CNN network outputs full-scale feature vectors.
Step 122, decision-stage fusion of displacement field multi-scale convolutional neural network:
the decision fusion module automatically fuses the displacement information results of different single scales by designing the neural network, and compared with the traditional empirical fusion, the fusion mode of the decision fusion module learns through the neural network, so that the decision fusion module is more accurate and intelligent. After the feature extraction module, the extracted features further pass through the fully connected layers and generate displacement results on each individual CNN scale. A multi-scale decision fusion module is then proposed to fuse these learned displacements at different scales.
As shown in fig. 4, in the multi-scale fusion module, the output data of the four different scales are superimposed together and then enter the full connection layer to generate a 6×1 vector. This vector is reshaped and then passed through a SoftMax layer to generate a 3×2 coefficient matrix.
In the coefficient matrix, the following equations are satisfied:
c1 + c2 + c3 = 1 and d1 + d2 + d3 = 1,
where cj and dj are the elements of the scale-coefficient matrix and all lie within the range [0,1]. Finally, the displacement vector (u, v) corresponding to the input subset image is obtained as:
u = c1·u1 + c2·u2 + c3·u3, v = d1·v1 + d2·v2 + d3·v3,
where u1 is the abscissa of the small-scale feature vector, v1 is the ordinate of the small-scale feature vector, u2 is the abscissa of the mesoscale feature vector, v2 is the ordinate of the mesoscale feature vector, u3 is the abscissa of the large-scale feature vector, and v3 is the ordinate of the large-scale feature vector.
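The fusion step can be sketched in NumPy: the 6×1 fully connected output is reshaped to 3×2 and a SoftMax produces the coefficient matrix, and the fused displacement is the coefficient-weighted sum of the three single-scale results. Applying the SoftMax column-wise, so that the cj and the dj each sum to 1, is an assumption consistent with the stated constraints:

```python
import numpy as np

def fuse_displacements(fc_out, scale_uv):
    """fc_out: (6,) fully connected output; scale_uv: (3, 2) rows of
    (u_j, v_j) from the small-, medium- and large-scale CNNs."""
    logits = fc_out.reshape(3, 2)
    e = np.exp(logits - logits.max(axis=0))        # numerically stable SoftMax
    coeff = e / e.sum(axis=0)                      # columns (c_j), (d_j) each sum to 1
    u = float(np.dot(coeff[:, 0], scale_uv[:, 0])) # u = c1*u1 + c2*u2 + c3*u3
    v = float(np.dot(coeff[:, 1], scale_uv[:, 1])) # v = d1*v1 + d2*v2 + d3*v3
    return coeff, (u, v)

# Uniform logits give equal weights 1/3, so the fused result is the mean.
coeff, (u, v) = fuse_displacements(np.zeros(6), np.array([[1.0, 2.0],
                                                          [4.0, 5.0],
                                                          [7.0, 8.0]]))
```
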
Step 123, calculating a loss function;
The loss function comprises the following parts; the minimum circumscribed rectangle of the detection frame in any direction is optimized by adopting the CIoU loss function:
L_CIoU = 1 − IoU + ρ²(b, b_gt)/c² + α·υ, with υ = (4/π²)·(arctan(w_gt/h_gt) − arctan(w/h))² and α = υ/((1 − IoU) + υ),
wherein IoU represents the intersection-over-union of the prediction frame and the labeling frame, b represents the center point of the prediction detection frame, b_gt represents the center point of the annotation frame, ρ² represents the squared distance between the two center points of the prediction frame and the labeling frame, c is the diagonal length of the smallest box enclosing both frames, υ measures the consistency of the aspect ratios, α is its weighting coefficient, and w, h and w_gt, h_gt represent the width and height of the prediction frame and of the real frame, respectively.
Confidence loss:
category loss:
loss of direction vector:
wherein 1_i^obj is the positive-sample coefficient, equal to 1 where a positive sample exists and 0 elsewhere; x_i^h and x̂_i^h respectively represent the abscissa of the head of the detection frame direction vector and its label value; y_i^h and ŷ_i^h respectively represent the ordinate of the direction vector head and its label value; x_i^t and x̂_i^t respectively represent the abscissa of the direction vector tail and its label value; and y_i^t and ŷ_i^t respectively represent the ordinate of the direction vector tail and its label value.
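The CIoU term of the loss can be sketched as follows for axis-aligned boxes (the patent applies it to the minimum circumscribed rectangle of the oriented detection frame); the box format and function name are illustrative:

```python
import math

def ciou_loss(box, box_gt):
    """CIoU loss for boxes given as (cx, cy, w, h)."""
    (x, y, w, h), (xg, yg, wg, hg) = box, box_gt
    # intersection over union
    ix = max(0.0, min(x + w / 2, xg + wg / 2) - max(x - w / 2, xg - wg / 2))
    iy = max(0.0, min(y + h / 2, yg + hg / 2) - max(y - h / 2, yg - hg / 2))
    inter = ix * iy
    iou = inter / (w * h + wg * hg - inter)
    # squared center distance over squared enclosing-box diagonal
    cw = max(x + w / 2, xg + wg / 2) - min(x - w / 2, xg - wg / 2)
    ch = max(y + h / 2, yg + hg / 2) - min(y - h / 2, yg - hg / 2)
    rho2 = (x - xg) ** 2 + (y - yg) ** 2
    c2 = cw ** 2 + ch ** 2
    # aspect-ratio consistency term and its weight
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v + 1e-12)
    return 1 - iou + rho2 / c2 + alpha * v

loss_same = ciou_loss((0, 0, 10, 5), (0, 0, 10, 5))   # identical boxes
loss_off = ciou_loss((3, 0, 10, 5), (0, 0, 10, 5))    # shifted prediction
```

Identical boxes give a loss of zero; any offset, size or aspect-ratio mismatch increases it, which is what makes the term suitable for optimizing the circumscribed rectangle.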
At step 124, the neural network is trained with the Adam method; training stops once the model performs well on the validation set without overfitting.
And step 13, loading trained model parameters, performing angle correction on the instrument panel image, and correcting the detected image angle to a uniform orientation.
Specifically, the YOLOv5 algorithm usually adopts DarkNet as its feature extraction network. Although DarkNet uses a residual network structure to reduce the training difficulty of the model, the network is so deep that the amounts of computation and parameters become huge, training becomes complicated, and real-time performance is hard to achieve. To achieve real-time target detection, this embodiment adopts the lightweight MobileNetV3 network as the feature extraction network.
As shown in fig. 5, the MobileNetV3 network comprises five convolution layers, C1 to C5. The input image is convolved by the MobileNetV3 network to output the corresponding feature maps, which are defined to correspond to layers C1 to C5 respectively. These feature maps are sent to layers F3 to F5 of the FPN network for learning: F5 is obtained from C5 through one convolution layer; F5 is then up-sampled and the up-sampled values are added to the convolved C4 layer to obtain F4; F4 is then up-sampled once more and the result is added to the convolved C3 layer to obtain F3.
The obtained F3 to F5 layers are transmitted to layers P3 to P5 of the PAN network for learning: P3 is obtained from F3 through a convolution layer; P3 is then down-sampled and the result is added to the convolved F4 layer to obtain P4; P4 is then down-sampled once more and added to the convolved F5 layer to obtain P5. Finally, the obtained P3 to P5 layers are sent to the Prediction Head module to predict the category confidence and the coordinates of the prediction bounding box; repeated detection frames are removed by the non-maximum suppression algorithm, and after a threshold is set, the category, category confidence and bounding box of the instrument are finally displayed.
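The shape bookkeeping of the FPN top-down and PAN bottom-up paths can be sketched in NumPy, with nearest-neighbor resampling and identity maps standing in for the convolution and up/down-sampling layers (a structural sketch only, not the actual network):

```python
import numpy as np

def upsample2(x):
    """Nearest-neighbor 2x up-sampling (stand-in for the FPN up-sampling)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def downsample2(x):
    """Stride-2 sub-sampling (stand-in for the PAN down-sampling)."""
    return x[::2, ::2]

# Backbone feature maps C3-C5 at successively halved resolutions
# (the 1x1 lateral convolutions are replaced by identity for this sketch).
C3, C4, C5 = np.ones((8, 8)), np.ones((4, 4)), np.ones((2, 2))

# FPN top-down path: F5 from C5, then upsample-and-add
F5 = C5
F4 = upsample2(F5) + C4
F3 = upsample2(F4) + C3

# PAN bottom-up path: P3 from F3, then downsample-and-add
P3 = F3
P4 = downsample2(P3) + F4
P5 = downsample2(P4) + F5
```

The key point the sketch shows is that every add happens between maps of matching resolution, which is why each path resamples before fusing.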
In this embodiment, 1500 pointer instrument images covering multiple scenes and multiple types are captured and acquired, and after preliminary labeling of targets is performed in LabelImg, the images are input as a training set into the convolutional neural network model for training.
And 3, performing mean shift filtering on an image output by the convolutional neural network model, binarizing the image by using a Canny edge detection algorithm based on nonlinear bilateral filtering, and then detecting and positioning a dial plate by using a Hough circle to obtain the circle center position and the radius of the instrument.
Specifically, the output image is cropped according to the instrument class, class confidence, bounding box and other information output by the convolutional neural network model, and mean-shift filtering is then performed: elements with similar color distributions are clustered using the segmentation property of the Mean Shift algorithm, smoothing color details and reducing the subsequent computation.
As shown in FIG. 6, since the subsequent steps require a binarized edge image of the pointer instrument, the Canny edge detection algorithm is improved and optimized in this embodiment: the common Gaussian filtering is replaced with nonlinear bilateral filtering, which considers the value domain and the spatial domain simultaneously, so that the image edge information is fully preserved while the image is denoised.
After the bilaterally filtered image is obtained, gradient magnitudes are calculated in a 3×3 neighborhood. As shown in fig. 7, the gradient calculation in each direction uses a Sobel operator direction template, and the gradient magnitude and direction of a pixel are determined by finite differences of the first partial derivatives in the x, y, 45° and 135° directions over the 8-neighborhood of the pixel.
After the gradient magnitude and direction of the pixels in the eight neighborhoods are obtained, non-maximum suppression is applied to the edge information to achieve edge refinement.
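The gradient computation can be sketched for a single pixel using the horizontal and vertical Sobel direction templates. This is a simplified illustration: the 45° and 135° templates mentioned above are omitted, and `gradient_at` is an assumed helper name, not the patent's exact operator.

```python
import numpy as np

# Sobel direction templates for the x and y directions.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
SOBEL_Y = SOBEL_X.T

def gradient_at(img, r, c):
    """Gradient magnitude and direction (degrees) at pixel (r, c),
    computed from its 3x3 neighbourhood by finite differences."""
    patch = img[r - 1:r + 2, c - 1:c + 2]
    gx = float((SOBEL_X * patch).sum())
    gy = float((SOBEL_Y * patch).sum())
    magnitude = np.hypot(gx, gy)
    direction = np.degrees(np.arctan2(gy, gx))
    return magnitude, direction

# A vertical step edge: dark left half, bright right columns.
img = np.zeros((5, 5)); img[:, 3:] = 255.0
mag, ang = gradient_at(img, 2, 2)
print(mag, ang)  # strong response, gradient pointing in the +x direction
```

Non-maximum suppression would then compare `mag` against the two neighbours along `ang` and keep the pixel only if it is the local maximum.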
According to the gray-level characteristics of the image, the foreground and background parts are segmented by counting the gray-level distribution of the pixels: the between-class variance of the gray levels is maximized, the threshold that maximizes this variance is defined as the high threshold, and the low threshold is defined as k times the high threshold, with k ∈ [0.5, 0.8].
And (3) performing edge connection after obtaining the high and low threshold values of the image:
when the amplitude of a pixel is greater than the high threshold, the pixel is an edge point;
when the amplitude of a pixel is below the low threshold, the pixel is not an edge point;
when the amplitude of a pixel lies between the high and low thresholds, the pixel is an edge point if it is connected to a pixel exceeding the high threshold, and otherwise is not an edge point.
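The threshold selection and the three-way edge decision above can be sketched as follows. The Otsu-style search maximizes between-class variance over a 256-bin histogram, and k is fixed to 0.5 from the stated [0.5, 0.8] range; function names and the toy image are illustrative assumptions.

```python
import numpy as np

def otsu_threshold(gray):
    """Gray threshold maximizing the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = hist[:t].sum(), hist[t:].sum()
        if w0 == 0 or w1 == 0:
            continue  # one class empty: no valid split here
        m0 = (np.arange(t) * hist[:t]).sum() / w0
        m1 = (np.arange(t, 256) * hist[t:]).sum() / w1
        var = (w0 / total) * (w1 / total) * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

def classify(mag, high, low, touches_strong):
    """Three-way edge decision used during hysteresis linking."""
    if mag > high:
        return "edge"
    if mag < low:
        return "non-edge"
    return "edge" if touches_strong else "non-edge"

gray = np.array([[10, 10, 200], [10, 200, 200], [10, 10, 200]], dtype=np.uint8)
high = otsu_threshold(gray)
low = 0.5 * high  # k = 0.5, with k in [0.5, 0.8]
print(high, low, classify(150, high, low, touches_strong=True))
```

In a full Canny pass, `touches_strong` would be determined by checking the 8-neighborhood for pixels already marked as strong edges.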
The output binarized pointer instrument image is obtained through the Canny edge detection algorithm, and Hough circle detection based on the Hough gradient method is then applied to locate the center of the instrument panel: after the binarized image is read, all edges of the image are traversed, and votes are accumulated in a two-dimensional accumulator along the line segments in the gradient direction of each edge point and its opposite direction. The counts in the accumulator are sorted from large to small, the position with the largest count is retained as the center of the instrument panel, and the radius is obtained by calculating the distance from the center to the image edge points, thereby locating the position and extent of the instrument panel in the image.
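The Hough-gradient voting can be sketched directly: each edge point votes into a 2-D accumulator along its gradient line in both directions, the most-voted cell becomes the center, and the radius is the mean distance from the center to the edge points. The synthetic circle, the `reach` limit and all names are assumptions made for this minimal sketch, not the OpenCV implementation.

```python
import numpy as np

def hough_circle_center(edge_pts, grads, shape, reach=30):
    """Each edge point votes along its gradient line (both directions);
    the accumulator cell with the highest count is taken as the center."""
    acc = np.zeros(shape, dtype=int)
    for (x, y), (gx, gy) in zip(edge_pts, grads):
        n = np.hypot(gx, gy)
        dx, dy = gx / n, gy / n
        for sign in (1, -1):
            for r in range(1, reach):
                vx = int(round(x + sign * r * dx))
                vy = int(round(y + sign * r * dy))
                if 0 <= vx < shape[1] and 0 <= vy < shape[0]:
                    acc[vy, vx] += 1
    cy, cx = np.unravel_index(acc.argmax(), acc.shape)
    radius = np.mean([np.hypot(x - cx, y - cy) for x, y in edge_pts])
    return (cx, cy), radius

# Synthetic circle edge of radius 20 centred at (32, 32);
# on a circle the gradient at each edge point is radial.
cx0, cy0, r0 = 32, 32, 20
angles = np.linspace(0, 2 * np.pi, 72, endpoint=False)
pts = [(cx0 + r0 * np.cos(a), cy0 + r0 * np.sin(a)) for a in angles]
grads = [(np.cos(a), np.sin(a)) for a in angles]
center, radius = hough_circle_center(pts, grads, (64, 64))
print(center, round(radius, 1))
```

Because every inward gradient line passes through the true center, the center cell collects one vote per edge point and dominates the accumulator.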
Step 4: the image located to the instrument is processed using the CTPN+CRNN network text detection and recognition model to obtain the value and position of the starting scale and the maximum range of the instrument.
Specifically, the ctpn+crnn network text detection recognition model includes a CTPN network text detection model and a CRNN network text recognition model, where the CTPN network text detection model includes the following operation steps:
After the position and extent of the pointer instrument in the image are located, the pointer instrument image is first passed through a VGG16 network for feature extraction to generate a feature map; VGG16 uses 3×3 small convolution kernels and achieves a better extraction effect than neural networks using large convolution kernels.
Texts are then extracted on the feature map by an RNN network. Each feature point on the feature map serves as an anchor point, and each anchor point generates 10 text proposal boxes with a width of 16 and different heights. The RNN network roughly classifies the generated text proposal boxes, selects the proposal boxes that may contain text, inputs them into a fully-connected layer for precise classification prediction, and adjusts the position coordinates.
Finally, the CTPN network text detection model outputs the initial coordinates and heights of the prediction candidate regions, the foreground/background classification scores and the horizontal offsets of the text proposal boxes; the text regions of the pointer instrument image are segmented and input into the CRNN network text recognition model.
The CRNN network text recognition model comprises the following operation steps:
Each text region segmented by the CTPN network text detection model is scaled and input into a CNN network to obtain a feature map. In this embodiment, the text region is uniformly scaled into a gray image with height = 32 and width = 160, which is input into the CNN network to obtain a feature map with height = 1, width = 40 and channel = 512.
The feature map is converted by Map-to-Sequence into the feature sequence required by the RNN network, where each feature vector corresponds to a receptive field of the original image. The feature sequence is fed into a two-layer bidirectional LSTM network of 256 units to obtain the character label corresponding to each feature vector: the LSTM network outputs probability distribution vectors that form a probability matrix W, in which each element represents the probability that the feature vector contains a given character. The character corresponding to the maximum value of each column of W is output to a CTC layer as the actual character, where repeats are merged and redundancy is removed, yielding the text information of the pointer instrument. After only the numbers are retained and sorted, the value and position of the starting scale correspond to the minimum value, and the value and position of the maximum range correspond to the maximum value.
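The column-wise argmax followed by merging and redundancy removal is the greedy CTC decoding step, which can be sketched as follows. The character set and the toy probability matrix are assumptions for illustration; the trained CRNN would supply the real distributions.

```python
import numpy as np

ALPHABET = "-0123456789."  # index 0 is the CTC blank (hypothetical charset)

def ctc_greedy_decode(probs):
    """probs: (T, C) matrix, one probability distribution per time step.
    Take the argmax per step, merge consecutive repeats, drop blanks."""
    best = probs.argmax(axis=1)
    out, prev = [], -1
    for idx in best:
        if idx != prev and idx != 0:  # skip repeats and the blank symbol
            out.append(ALPHABET[idx])
        prev = idx
    return "".join(out)

# Five time steps whose argmax sequence is "1", "1", blank, "5", "5".
T, C = 5, len(ALPHABET)
probs = np.full((T, C), 0.01)
for t, idx in enumerate([2, 2, 0, 6, 6]):
    probs[t, idx] = 0.9
print(ctc_greedy_decode(probs))  # "15"
```

The blank symbol is what lets CTC distinguish a genuinely repeated character ("11") from one character spread over two time steps.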
Step 5: the pointer in the instrument panel is extracted using the Hough line detection algorithm based on region selection, and the instrument reading is finally calculated using the angle method from the scale, the range and the pointer.
Specifically, after the starting scale and maximum range information of the pointer instrument are obtained, the pointer in the instrument needs to be extracted. Because the angular range of the instrument pointer is limited, the angular region between the starting scale and the maximum-range scale of the instrument panel is excluded from the extraction range of the Hough line detection algorithm.
After the pointer detection range is extracted, the radius of the extraction range is reduced according to the detected positions of the starting scale and the maximum range, which avoids falsely detecting longer scale marks as pointers and reduces the area of the region examined by the Hough line detection. The pixels in the pointer detection target region are then substituted into the Hough transform. As shown in figure 8, the extraction range is obtained from the detected center, starting scale and maximum-range positions of the instrument; all edge points in the extraction range are traversed, edge points are repeatedly and randomly sampled and mapped to straight lines in the polar coordinate space, and a line segment is extracted once the accumulator of an edge point exceeds a preset value. All line segments in the extraction range are thus obtained, their lengths are calculated and sorted, and the longest segment is taken as the instrument pointer.
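The core rho-theta voting of the Hough line transform can be sketched on a small point set. This sketch votes every point into all theta bins and returns the dominant line; the region restriction described above would simply filter the input `points` first. The function name, bin sizes and toy data are assumptions.

```python
import numpy as np

def hough_dominant_line(points, n_theta=180, rho_max=100):
    """Vote each point into (rho, theta) bins; return the (rho, theta)
    of the most-voted bin, i.e. the dominant straight line."""
    thetas = np.deg2rad(np.arange(n_theta))
    acc = np.zeros((2 * rho_max + 1, n_theta), dtype=int)
    for x, y in points:
        # rho = x*cos(theta) + y*sin(theta) for every theta bin at once.
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + rho_max, np.arange(n_theta)] += 1
    r, t = np.unravel_index(acc.argmax(), acc.shape)
    return r - rho_max, t  # rho in pixels, theta in degrees

# Ten collinear points on the horizontal line y = 10, plus one outlier
# (the points surviving the region-selection step in the text above).
points = [(x, 10) for x in range(0, 50, 5)] + [(3, 40)]
rho, theta = hough_dominant_line(points)
print(rho, theta)  # line y = 10 -> rho = 10, theta = 90 degrees
```

A probabilistic variant, closer to the text, would sample random subsets of edge points and stop accumulating once a bin crosses a preset vote threshold.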
As shown in fig. 9, the meter reading is finally calculated using the angle method. The starting scale is point A with coordinates (x_A, y_A), the pointer end point is point B with coordinates (x_B, y_B), the maximum-range scale is point C with coordinates (x_C, y_C), and the center of the instrument panel is point O with coordinates (x_O, y_O). The line from the starting scale to the center is the vector OA, the pointer is the vector OB, and the line from the maximum-range scale to the center is the vector OC. The angle between the pointer and the starting scale is θ = arccos((OA · OB) / (|OA| × |OB|)). The slope of the line through the starting scale A and the center O is k_A = (y_A − y_O) / (x_A − x_O), and its intercept is b_A = y_O − k_A × x_O. Substituting the coordinates (x_B, y_B) of point B into the line equation gives the position information position = y_B − k_A × x_B − b_A. If position ≥ 0, the angle formed by the pointer and the starting scale is θ; otherwise it is 2π − θ. Similarly, the angle Θ between the starting scale and the maximum-range scale is calculated from the vectors OA and OC.
The maximum range MaxRange of the instrument has been obtained by the preceding CTPN+CRNN network text detection and recognition model, so the final indication is obtained from the proportional angle relation, reading = (θ / Θ) × MaxRange, and output, completing the reading recognition.
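The angle method can be sketched numerically. The sketch follows the patent's formulas (arccos of the normalized dot product, disambiguated by the sign of `position` relative to line OA) and assumes a start-scale value of zero with reading = (θ / Θ) × MaxRange; the dial coordinates below are a made-up example, and the slope computation assumes x_A ≠ x_O.

```python
import math

def angle_from_start(O, A, P):
    """Angle swept from vector OA to vector OP: arccos of the normalized
    dot product, extended to [0, 2*pi) by the sign of `position`, i.e.
    which side of the line through O and A the point P lies on."""
    oa = (A[0] - O[0], A[1] - O[1])
    op = (P[0] - O[0], P[1] - O[1])
    cos_t = (oa[0] * op[0] + oa[1] * op[1]) / (math.hypot(*oa) * math.hypot(*op))
    theta = math.acos(max(-1.0, min(1.0, cos_t)))
    k_a = oa[1] / oa[0]                  # slope of line OA (needs x_A != x_O)
    b_a = O[1] - k_a * O[0]              # intercept of line OA
    position = P[1] - k_a * P[0] - b_a   # side test for point P
    return theta if position >= 0 else 2 * math.pi - theta

# Toy dial: center O, start scale A, max-range scale C, pointer tip B
# placed so the pointer sits exactly halfway through a 270-degree sweep.
O, A, B, C = (0.0, 0.0), (-1.0, -1.0), (0.0, 1.0), (1.0, -1.0)
theta = angle_from_start(O, A, B)        # pointer vs. start scale
big_theta = angle_from_start(O, A, C)    # max-range scale vs. start scale
max_range = 100.0
reading = theta / big_theta * max_range
print(round(math.degrees(theta)), round(math.degrees(big_theta)), reading)
```

With a nonzero start-scale value s, the same proportion generalizes to reading = s + (θ / Θ) × (MaxRange − s).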
The present invention is not limited to the above embodiments. Those skilled in the art can implement the present invention in various other embodiments according to the disclosure of the description and the drawings; simple changes or modifications made while adopting the design structure and concept of the present invention all fall within the scope of protection of the present invention.
Claims (10)
1. The instrument angle correction and indication recognition method based on the deep learning is characterized by comprising the following steps of:
step 1, constructing a neural network to detect and correct the angle of an instrument panel image;
step 2, acquiring multiple scene and multiple kinds of pointer instrument images, constructing a training set, and performing deep training on a convolutional neural network model by using the training set;
step 3, performing mean shift filtering on an image output by the convolutional neural network model, binarizing the image by using a Canny edge detection algorithm based on nonlinear bilateral filtering, and then detecting and positioning a dial plate by using a Hough circle to obtain the circle center position and the radius of the instrument;
step 4, using a CTPN+CRNN network character detection and recognition model to process the image positioned to the instrument, and obtaining the numerical value and the position of the starting scale and the maximum range of the instrument;
and 5, extracting a pointer in the instrument panel by using a Hough straight line detection algorithm based on region selection, and finally calculating the instrument reading by using an angle method according to the scale, the measuring range and the pointer.
2. The method for instrument angle correction and indication recognition based on deep learning according to claim 1, wherein step 1 further comprises the following steps:
step 11, collecting image information of an instrument panel, and cleaning and marking data;
step 12, constructing a multi-scale convolutional neural network model, and training the network model by using instrument panel image information;
step 13, loading trained model parameters, carrying out angle correction on the instrument panel image, and correcting the detected pattern angles of the image to a uniform orientation;
in step 12, the multi-scale convolutional neural network comprises a feature extraction module and a multi-scale decision fusion module;
the feature extraction module comprises a small-scale CNN network, a medium-scale CNN network, a large-scale CNN network and a full-scale CNN network, and outputs small-scale feature vectors, medium-scale feature vectors, large-scale feature vectors and full-scale feature vectors respectively;
the multi-scale decision fusion module generates a coefficient matrix from the four feature vectors with different scales, and generates an output vector by fusing the coefficient matrix with the small-scale feature vector, the middle-scale feature vector and the large-scale feature vector.
3. The method for instrument angle correction and indication recognition based on deep learning of claim 2, further comprising the following steps in step 12:
step 121, extracting characteristics of a displacement field multi-scale convolutional neural network;
step 122, decision-level fusion of the displacement field multi-scale convolutional neural network;
step 123, calculating a displacement field multi-scale convolutional neural network loss function;
the small-scale CNN network comprises a first small-scale convolution module, an average pooling layer, a second small-scale convolution module, a third small-scale convolution module, a fourth small-scale convolution module, a fifth small-scale convolution module and a sixth small-scale convolution module which are connected in sequence;
the mesoscale CNN network comprises a first mesoscale convolution module, an average pooling layer, a second mesoscale convolution module, a third mesoscale convolution module, a fourth mesoscale convolution module, a fifth mesoscale convolution module and a sixth mesoscale convolution module which are connected in sequence;
the large-scale CNN network comprises a first large-scale convolution module, an average pooling layer, a second large-scale convolution module, a third large-scale convolution module, a fourth large-scale convolution module, a fifth large-scale convolution module, a sixth large-scale convolution module and a seventh large-scale convolution module which are connected in sequence;
the full-scale CNN network comprises a first full-scale convolution module, an average pooling layer, a second full-scale convolution module, a third full-scale convolution module, a fourth full-scale convolution module, a fifth full-scale convolution module, a sixth full-scale convolution module and a seventh full-scale convolution module which are sequentially connected;
each convolution module comprises convolution kernels, normalization units and Relu function units which are sequentially connected and have different numbers and sizes.
4. The method for instrument angle correction and indication recognition based on deep learning of claim 3, wherein in step 122, the four output data of different scales are superimposed and then passed through the fully-connected layer to generate a 6×1 vector, which is reshaped and passed through a SoftMax layer to generate a 3×2 coefficient matrix.
In the coefficient matrix, the following is satisfied: c_1 + c_2 + c_3 = 1 and d_1 + d_2 + d_3 = 1, where c_j and d_j denote elements of the coefficient matrix and all lie within [0,1]. Finally, the displacement vector corresponding to the input subset image is obtained as u = c_1 u_1 + c_2 u_2 + c_3 u_3 and v = d_1 v_1 + d_2 v_2 + d_3 v_3, where u_1 is the abscissa of the small-scale feature vector, v_1 is the ordinate of the small-scale feature vector, u_2 is the abscissa of the mesoscale feature vector, v_2 is the ordinate of the mesoscale feature vector, u_3 is the abscissa of the large-scale feature vector, and v_3 is the ordinate of the large-scale feature vector.
5. The method for instrument angle correction and indication recognition based on deep learning of claim 3, wherein in step 123, the loss function comprises:
optimizing the minimum circumscribed rectangle of the detection box in any direction with the CIoU loss function L_CIoU = 1 − IOU + ρ²(b, b^gt) / c² + αυ, wherein IOU denotes the intersection-over-union of the prediction box and the annotation box, b and b^gt denote the center points of the prediction box and the annotation box, ρ² denotes the squared distance between the two center points, c denotes the diagonal length of the smallest box enclosing both, α is a trade-off coefficient and υ measures aspect-ratio consistency, and w, h and w^gt, h^gt denote the width and height of the prediction box and of the real box respectively;
confidence loss:
category loss:
loss of direction vector:
wherein the positive-sample coefficient equals 1 for positive samples and 0 for the rest; the first pair of terms respectively denotes the abscissa of the head of the detection-box direction vector and its label value, and the second pair respectively denotes the ordinate of the head of the direction vector and its label value; T_i^j(x) and its label value respectively denote the abscissa of the tail of the detection-box direction vector and its label, and T_i^j(y) and its label value respectively denote the ordinate of the tail of the direction vector and its label.
6. The method for instrument angle correction and indication recognition based on deep learning according to claim 1, wherein the Canny edge detection algorithm in step 3 specifically comprises:
performing nonlinear bilateral filtering on the image, which considers the value domain and the spatial domain simultaneously and preserves the edge information well; calculating gradient magnitudes in a 3×3 neighborhood of the filtered image through Sobel operator direction templates; and then applying non-maximum suppression to the edge information to achieve edge refinement;
the method comprises the steps of dividing a foreground part and a background part according to gray distribution of an image, maximizing gray class variance, finding a threshold value which maximizes the variance and defining the threshold value as a high threshold value, defining a high threshold value which is k times as a low threshold value, and performing edge connection according to the high threshold value and the low threshold value, wherein k is [0.5,0.8].
7. The method for instrument angle correction and indication recognition based on deep learning according to claim 1, wherein in step 2, the MobileNetV3 network in the YOLOv5 algorithm comprises five convolutional layers; the input image is convolved by the MobileNetV3 network to output the corresponding feature maps, which are learned by the FPN network and the PAN network and finally sent to the Prediction Head module to predict the class confidence and the coordinates of the predicted bounding box; repeated detection boxes are removed by a non-maximum suppression algorithm, and after a threshold is set the class, class confidence and bounding box of the instrument are displayed.
8. The method for instrument angle correction and indication recognition based on deep learning according to claim 1, wherein in step 5, the Hough line detection algorithm comprises: obtaining the extraction range according to the detected center, starting scale and maximum-range positions of the instrument; traversing all edge points in the extraction range; repeatedly and randomly sampling edge points and mapping them to straight lines in the polar coordinate space; extracting a line segment once the accumulator of an edge point exceeds a preset value; and finally calculating the lengths of all extracted line segments, the longest extracted segment being taken as the instrument pointer.
9. The method for instrument angle correction and indication recognition based on deep learning according to claim 8, wherein in step 5, within the extraction range of the Hough line detection algorithm, the angular region between the instrument starting scale and the maximum-range scale is first removed, and the detection radius is then reduced to avoid mistaking scale marks for the pointer.
10. The method for instrument angle correction and indication recognition based on deep learning according to claim 1, wherein in step 3, applying Hough circle detection to locate the dial comprises: reading the binarized image output by the Canny edge detection algorithm, traversing all edges of the image, accumulating votes in a two-dimensional accumulator along the line segments in the gradient direction of each edge point and its opposite direction, sorting the counts in the two-dimensional accumulator from large to small, retaining the position with the largest count as the center of the instrument, and calculating the distance from the center to the image edge points to obtain the radius.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211507795.6A CN116188756A (en) | 2022-11-25 | 2022-11-25 | Instrument angle correction and indication recognition method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116188756A true CN116188756A (en) | 2023-05-30 |
Family
ID=86444961
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211507795.6A Pending CN116188756A (en) | 2022-11-25 | 2022-11-25 | Instrument angle correction and indication recognition method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116188756A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117037162A (en) * | 2023-08-14 | 2023-11-10 | 北京数字绿土科技股份有限公司 | Detection method and system of pointer instrument based on deep learning |
CN116958998A (en) * | 2023-09-20 | 2023-10-27 | 四川泓宝润业工程技术有限公司 | Digital instrument reading identification method based on deep learning |
CN116958998B (en) * | 2023-09-20 | 2023-12-26 | 四川泓宝润业工程技术有限公司 | Digital instrument reading identification method based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||