CN113487668A - Radius-unlimited learnable cylindrical surface back projection method - Google Patents
Radius-unlimited learnable cylindrical surface back projection method
- Publication number
- Publication number: CN113487668A (application number CN202110571944.4A)
- Authority
- CN
- China
- Prior art keywords
- cylindrical
- coordinate
- image
- projection
- radius
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
- G06T17/20—Finite element generation, e.g. wire-frame surface description, tesselation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4007—Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/20—Image enhancement or restoration using local operators
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/60—Analysis of geometric attributes
- G06T7/62—Analysis of geometric attributes of area, perimeter, diameter or volume
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
Abstract
The invention discloses a radius-unlimited learnable cylindrical surface back projection method, addressing the problem that the two edge regions of cylindrical label image data undergo different degrees of information compression, so that edge content is missed or misdetected during surface inspection. The projection coordinate transformation formula is redesigned from the geometric principle of cylindrical back projection so that the coordinate transformation no longer depends on a specific radius parameter, and a BP localization prediction network is trained to predict the coordinates, reducing both the number of parameters and the amount of computation. For coordinates in the planar image that have no corresponding point on the cylindrical surface, the pixel values are completed by bilinear interpolation. Finally, the results of the method are verified with the arbitrary-shape text detection model DBNet. The results show that the method does not depend on a specific model radius and achieves a good flattening effect when correcting cylindrical images of different radii and degrees of curvature.
Description
Technical Field
The invention relates to a cylindrical surface back projection correction algorithm in the field of image processing. Such correction is a key step in applications such as label recognition on cylindrical image data, surface defect detection of cylindrical models, and panoramic stitching.
Background
Cylindrical projection refers to the mutual mapping between a planar image and a cylindrical surface, and comprises cylindrical forward projection and cylindrical back projection. Cylindrical forward projection projects a planar image onto a cylindrical surface; cylindrical back projection projects a particular viewing area of the cylindrical surface onto the tangent plane of the cylinder. Cylindrical back projection algorithms generally fall into formula-based methods and quadratic curve fitting methods.
Traditional cylindrical flattening correction is based on a cylindrical back projection algorithm that depends on the specific radius of the cylindrical model: when models of different radii are processed, the parameters must be replaced, which limits the range of objects that can be corrected. Moreover, in such algorithms the projection coordinate transformation is mostly obtained by formula calculation, which is computationally expensive when every pixel of the image is traversed. The invention designs a cylindrical back projection method without radius limitation, which uses a BP localization prediction network to learn and predict the projection coordinates, assigns the pixel value of the corresponding coordinate point, and completes the cylindrical flattening correction. Experiments show that the algorithm has strong universality and practicability.
Disclosure of Invention
The invention aims to provide a radius-unlimited cylindrical back projection method, addressing the problem that the two edge regions of cylindrical label image data undergo different degrees of information compression, so that edge content is missed or misdetected during surface inspection. The method does not depend on a specific model radius and achieves a good flattening effect when correcting cylindrical images of different radii and degrees of curvature.
To this end, the projection coordinate transformation formula is redesigned on the basis of the geometric principle of cylindrical back projection, so that the coordinate transformation no longer depends on a specific radius parameter, and a BP localization prediction network is trained to predict the coordinates, reducing both the number of parameters and the amount of computation. In the experiments, signboard data from cylindrical utility poles of different radii are used as image data, and the collected images are preprocessed to a uniform size. The optimal cylindrical surface edge lines are obtained with the first-order canny edge detection operator, and the coordinate transformation is performed within the edge-line range. The transformed coordinate is computed by the design formula X = Xorigin + Xoffset, where Xorigin is the original horizontal abscissa of a pixel of the cylindrical image and Xoffset is the offset predicted by a BP neural network model trained on the pixel's abscissa x and the angle θ between the point and the origin; the vertical coordinate is unchanged. For coordinates in the planar image that have no corresponding point on the cylindrical surface, the pixel values are completed by bilinear interpolation. Finally, the results of the method are verified with the arbitrary-shape text detection model DBNet.
In order to achieve the technical scheme of the purpose, the specific implementation steps are as follows:
The method comprises the following steps. Step one: cylindrical image data acquisition. The experimental data are manually photographed labels attached to the surfaces of cylindrical utility poles; an investigation found pole diameters ranging from 190 mm to 370 mm, so label data from cylindrical poles with diameters of 190 mm, 240 mm, 290 mm, 340 mm and 390 mm were selected as experimental data.
Step two: the lenticular image size is set. Considering the problems of more cylindrical image pixel points and large experimental data amount, the invention uniformly sets the acquired cylindrical image data to be 200 × 200.
Step three: the canny edge detection operator acquires the cylindrical surface edge lines. Smoothing the image subjected to the operation in the second step by Gaussian filtering, calculating the amplitude and the direction of the gradient by using the finite difference of the first-order partial derivatives, and performing non-maximum suppression operation on the gradient amplitude; and detecting and connecting edges by using a dual-threshold algorithm to obtain edge lines of the cylindrical surface image.
Step four: and establishing a projection coordinate transformation formula. And 4, carrying out back projection coordinate transformation on each pixel point of the image within the range of the edge line of the cylindrical surface obtained in the step three, and calculating corresponding plane coordinates. According to the principle of cylindrical geometric projection, the coordinate in the vertical direction is not changed when cylindrical back projection transformation is carried out, projection transformation is carried out on the coordinate in the horizontal direction, and the horizontal coordinate in the horizontal direction after projection is divided into two parts, namely: the method comprises the steps of obtaining an original horizontal coordinate Xorigin and a projection offset value Xofest, wherein the Xofest can change according to the position change of a pixel point, so that x and theta are used as independent variables, cylindrical surfaces with different radiuses are intercepted under the same coordinate system, and projection coordinates are calculated according to the traditional coordinate transformation, and a coordinate pair of the original coordinate and the projection coordinates is generated.
Step five: and (5) building a BP neural network prediction model. And using the coordinate pair obtained in the step four as input data and label data of the model respectively, wherein the input is horizontal direction abscissa x and an angle theta between a pixel point and a cylinder center point, and the input is marked as: x1, x2, the output is the predicted offset, labeled y. Selecting the number of hidden layer neurons as 5 and 10 respectively according to the optimal result of the experiment, constructing a 2 x 5 x 10 x1 four-layer neural network model, wherein the weight matrix between the first layer and the second layer of the model is W1, the bias matrix is B1, and the matrix elements are WijI and j are the number of the layer number of the neuron, and similarly, the connection parameter matrix among the neuron layers is WiAnd Bi. Model activation function selection sigmoid functionLoss function selection mean square error loss And the model saves parameters W and B of each layer through the steps of forward propagation, loss calculation and backward propagation.
Step six: and inputting the cylindrical image into the neural network prediction model obtained in the fifth step to obtain a plane coordinate corresponding to back projection, and assigning the pixel value of the original coordinate to the back projection plane coordinate to obtain a corresponding projection flattened image.
Step seven: and for the pixel points which cannot be found in the cylindrical surface image on the flattened image projected in the sixth step, filling missing pixel values by adopting a bilinear interpolation method to obtain a complete flattened image.
Step eight: inputting the flattened image obtained in the step seven into a text detection model DBNet network with any shape for verification, and marking the detected text box in the flattened image.
The invention is mainly characterized in that:
(1) Based on the principle of cylindrical back projection, the invention proposes a new coordinate transformation calculation method that no longer depends on the radius parameter of a specific cylindrical model, so back projection transformation can be applied to cylindrical images of any radius.
(2) When the transformed coordinates are calculated, a coordinate localization prediction network is used: the prediction model is trained on data, and the transformed coordinates are obtained from the network, reducing both the number of parameters and the amount of computation.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the embodiments will be briefly described below.
Fig. 1 shows the incomplete detection of both side edges of a cylinder when a certain APP performs text recognition of a cylinder image.
Fig. 2 shows the coordinate axes of the image coordinate transformation used in the experiments of the present invention.
FIG. 3 shows a flow diagram of a learnable cylindrical backprojection algorithm without radius limitations.
Fig. 4 shows a cylindrical projection space geometry.
Fig. 5 shows a cylindrical back-projected coordinate transformation diagram.
FIG. 6 illustrates a schematic diagram of a learnable cylindrical projection without radius limitation of the present invention.
FIG. 7 is a coordinate positioning prediction network model.
FIG. 8 is a diagram of the effect of the cylindrical surface flattening part of the algorithm of the present invention.
Fig. 9 is a placard text detection block of the present invention.
Detailed Description
In order to better understand the technical solution of the present invention, the embodiments of the present invention are described in detail below with reference to the accompanying drawings.
The first embodiment is as follows:
fig. 1 shows the incomplete detection of both side edges of a cylinder when a certain APP performs text recognition of a cylinder image.
As shown in fig. 1, when cylindrical text detection or cylindrical surface inspection is performed, the two edge regions of the cylindrical image undergo different degrees of information compression because of the camera's shooting range, leading to missed and false detections during computer vision inspection.
Example two:
fig. 2 shows the coordinate axes of the image coordinate transformation used in the experiments of the present invention.
As shown in fig. 2, the pixel coordinate system of an image in OpenCV takes the upper-left corner as the origin, the horizontal direction as the x-axis and the vertical direction as the y-axis, whereas the image coordinate system takes the image centre point as the origin. When computing with the OpenCV library, conversion between image coordinates and pixel coordinates is based on the pixel coordinates.
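One common convention for this conversion can be sketched as follows (the function names are hypothetical, and the exact centring convention of the original experiments is not specified in the text):

```python
def pixel_to_image(px, py, width, height):
    # Pixel coords: origin at the top-left corner, y grows downward.
    # Image coords: origin at the image centre, y grows upward.
    return px - width / 2.0, height / 2.0 - py

def image_to_pixel(ix, iy, width, height):
    # Inverse of pixel_to_image.
    return ix + width / 2.0, height / 2.0 - iy

# For a 200 x 200 image, the centre pixel maps to the image-coordinate origin.
centre = pixel_to_image(100, 100, 200, 200)
```

The round trip through both functions returns the original pixel coordinate, which is what "converted based on the pixel coordinate" requires.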
Example three:
FIG. 3 shows the flow chart of the proposed learnable cylindrical back projection algorithm without radius limitation. The experimental pipeline comprises 5 parts: cylindrical image acquisition, cylindrical image preprocessing, the back projection localization prediction network, bilinear interpolation processing, and DBNet model verification. The back projection localization prediction network is the core of the method: canny edge detection yields the cylindrical surface edge lines, the back projection coordinate transformation is designed from the geometric principle of cylindrical back projection, and the traditional formula transformation supplies radius-independent projection coordinate pairs to train the network.
Example four:
fig. 4 shows a cylindrical projection space geometry.
As shown in fig. 4, the red dot lies on the cylindrical surface and the green plane is the tangent plane of the cylindrical projection, where point A is the projection of point A′ on the cylindrical surface. When the cylindrical back projection algorithm is computed, the coordinates of every pixel on the cylindrical surface must be projected onto the tangent plane according to the coordinate formula.
Example five:
fig. 5 shows a cylindrical back-projected coordinate transformation diagram.
The calculation formula of the conventional cylindrical back projection coordinates is introduced with a single point, as shown in fig. 5. Let P(x, y) be an arbitrary point of the planar image; its image under projection transformation is P′(x′, y′), θ is half the field angle, PN = x, and P′Q = x′, where the projection of the planar segment PN onto the cylindrical surface is the arc P′N. Given the coordinates (x, y) of point P and the cylinder radius r, computing P′ is cylindrical forward projection; given the coordinates of P′, computing P is cylindrical back projection. The formulas of the cylindrical forward projection are:
θ=arctan(x/r)
x′=r*θ
The back projection transformation formulas follow by inverting the cylindrical forward projection:
θ=x′/r
x=r*tan(x′/r)
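Under these definitions the forward and back projections invert one another, which a short sketch can check numerically (the function names are assumptions of the sketch):

```python
import math

def cyl_forward(x, r):
    # Forward projection: theta = arctan(x / r), x' = r * theta
    theta = math.atan(x / r)
    return r * theta

def cyl_backward(x_prime, r):
    # Back projection, inverting the forward formulas:
    # theta = x' / r, x = r * tan(theta)
    return r * math.tan(x_prime / r)

# Round trip: back projection undoes forward projection for any radius r.
for r in (95.0, 120.0, 185.0):
    for x in (0.0, 25.0, 80.0):
        assert abs(cyl_backward(cyl_forward(x, r), r) - x) < 1e-9
```

This radius dependence of `cyl_backward` is exactly what the learnable formulation of the next embodiment removes.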
example six:
FIG. 6 illustrates a schematic diagram of a learnable cylindrical projection without radius limitation of the present invention.
Analysing the projection relation between the cylinder and the plane, points P1–P5 on the cylindrical surface correspond to projection points P1′–P5′ on the plane, and the abscissa of a projected point can be written as a base value x plus an increment Δx, i.e. x′ = x + Δx, so that signboards of any radius and any degree of curvature can be corrected without depending on the radius. Since x is known, the problem reduces to computing Δx, and the figure shows that Δx depends on the abscissa of the pixel and on the angle at which the pixel lies. A BP neural network is used to fit the functional relation between x, θ and Δx, and the fitted Δx is denoted Xoffset. The corrected coordinate expression of the algorithm is
X=Xorigin+Xoffset
where Xorigin = x and Xoffset = f(x, θ).
Example seven:
FIG. 7 shows a BP location prediction network model diagram.
Fig. 7 shows the structure of the localization prediction network, a four-layer BP neural network. The input layer consists of two neurons representing the control variables: the abscissa x and the angle θ between the pixel's position and the coordinate origin. According to the experimental results, the error is smallest when the first hidden layer has 5 neurons and the second hidden layer has 10; the output layer is a single neuron representing the predicted projection offset.
Example eight:
FIG. 8 is a graph showing the cylinder flattening correction results under different model radii.
FIG. 8 shows the flattening results of the cylindrical back projection transformation using the algorithm of the present invention. The localization prediction network yields the transformed coordinate for each pixel of the traversed image, and that position is assigned the original pixel value. In fig. 8, a, b and c correspond to different cylinder radii corrected in turn by the algorithm; in each pair the left image is the cylindrical image and the right image is the flattened result. The experiments show that cylindrical images of different radii within the error range achieve a good flattening effect even without changing any radius parameter.
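The traverse-and-assign scheme of this embodiment can be sketched as follows (`offset_fn` is a hypothetical stand-in for the trained BP network's Xoffset prediction, and the angle proxy computed from the image centre is an assumption of the sketch):

```python
import numpy as np

def flatten(cyl_img, offset_fn):
    """Traverse every pixel, predict the horizontal offset, and copy the
    pixel value to the back-projected coordinate; the vertical coordinate
    is unchanged, matching X = Xorigin + Xoffset."""
    h, w = cyl_img.shape[:2]
    cx = w / 2.0
    out = np.zeros_like(cyl_img)
    for y in range(h):
        for x in range(w):
            theta = np.arctan2(x - cx, cx)           # assumed angle proxy
            X = int(round(x + offset_fn(x - cx, theta)))  # Xorigin + Xoffset
            if 0 <= X < w:
                out[y, X] = cyl_img[y, x]
    return out

img = np.arange(16.0).reshape(4, 4)
# A zero offset leaves the image unchanged (sanity check of the scheme).
same = flatten(img, lambda x, t: 0.0)
```

With the real model, `offset_fn` would wrap a forward pass of the trained network, and the bilinear interpolation of step seven would then fill any gaps left in `out`.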
Example nine:
The arbitrary-shape text detection model DBNet uses an FPN feature pyramid network to extract image features, which are fused with the upsampling results to obtain a feature map; the feature map is used to predict a probability map and a threshold map, from which a binary detection map is computed. A differentiable binarization module performs adaptive thresholding, and background and text are segmented according to the adaptive threshold to obtain the detected text boxes.
Example ten:
fig. 9 shows a text detection result diagram of the text detection model.
Fig. 9 shows the text detection results obtained by feeding the plane images produced by the back projection algorithm into the DBNet network model. For the cylinder-corrected images used in detection, the edge information is detected more completely and all text boxes are recognized.
Claims (2)
1. A radius-unlimited learnable cylindrical back projection method is characterized in that: the technical process of the method is as follows:
redesigning the projection coordinate transformation formula on the basis of the geometric principle of cylindrical back projection, so that the coordinate transformation no longer depends on a specific radius parameter, and training a BP localization prediction network to predict the coordinates, reducing both the number of parameters and the amount of computation; in the experiments, signboard data from cylindrical utility poles of different radii are used as image data, and the collected images are preprocessed to a uniform size; the optimal cylindrical surface edge lines are obtained with the first-order canny edge detection operator, the horizontal abscissa is transformed within the edge-line range, and the transformed coordinate is computed by the design formula X = Xorigin + Xoffset, where Xorigin is the original horizontal abscissa of a pixel of the cylindrical image and Xoffset is the offset predicted by a BP neural network model trained on the pixel's abscissa x and the angle θ between the point and the origin, the vertical coordinate being unchanged; for coordinates in the planar image that have no corresponding point on the cylindrical surface, the pixel values are completed by bilinear interpolation; and the results of the method are verified with the arbitrary-shape text detection model DBNet.
2. A radius-unlimited learnable cylindrical backprojection method as claimed in claim 1, wherein: the specific implementation steps are as follows:
step one: cylindrical image data acquisition: the experimental data are manually photographed labels attached to the surfaces of cylindrical utility poles; an investigation found pole diameters ranging from 190 mm to 370 mm, so label data from cylindrical poles with diameters of 190 mm, 240 mm, 290 mm, 340 mm and 390 mm were selected as experimental data;
step two: setting the cylindrical image size: because cylindrical images contain many pixels and the amount of experimental data is large, the acquired cylindrical image data are uniformly resized to 200 × 200;
Step three: the canny edge detection operator acquires the cylindrical surface edge lines. Smoothing the image subjected to the operation in the second step by Gaussian filtering, calculating the amplitude and the direction of the gradient by using the finite difference of the first-order partial derivatives, and performing non-maximum suppression operation on the gradient amplitude; detecting and connecting edges by using a dual-threshold algorithm to obtain edge lines of the cylindrical images;
step four: and establishing a projection coordinate transformation formula. And 4, carrying out back projection coordinate transformation on each pixel point of the image within the range of the edge line of the cylindrical surface obtained in the step three, and calculating corresponding plane coordinates. According to the principle of cylindrical geometric projection, the coordinate in the vertical direction is not changed when cylindrical back projection transformation is carried out, projection transformation is carried out on the coordinate in the horizontal direction, and the horizontal coordinate in the horizontal direction after projection is divided into two parts, namely: the method comprises the following steps of (1) obtaining an original horizontal coordinate Xorigin and a projection offset value Xofest, wherein the Xofest can change according to the position change of a pixel point, so that x and theta are taken as independent variables, cylindrical surfaces with different radiuses are intercepted under the same coordinate system, and projection coordinates are calculated according to the traditional coordinate transformation, and a coordinate pair of the original coordinate and the projection coordinates is generated;
step five: and (5) building a BP neural network prediction model. And using the coordinate pair obtained in the step four as input data and label data of the model respectively, wherein the input is horizontal direction abscissa x and an angle theta between a pixel point and a cylinder center point, and the input is marked as: x1, x2, the output is the predicted offset, labeled y. Selecting the number of hidden layer neurons as 5 and 10 respectively according to the optimal result of the experiment, constructing a 2 x 5 x 10 x1 four-layer neural network model, wherein the weight matrix between the first layer and the second layer of the model is W1, the bias matrix is B1, and the matrix elements are WijI and j are the number of the layer number of the neuron, and similarly, the connection parameter matrix among the neuron layers is WiAnd Bi. Model activation function selection sigmoid functionSelection of loss functionLoss of mean square error The model saves parameters W and B of each layer through the steps of forward propagation, loss calculation and backward propagation;
step six: inputting the cylindrical image into the neural network prediction model obtained in step five to obtain the planar coordinates corresponding to the back projection, and assigning the pixel value of the original coordinate to the back-projected planar coordinate to obtain the corresponding projection-flattened image;
step seven: for points of the flattened image from step six whose corresponding pixels cannot be found in the cylindrical image, filling the missing pixel values by bilinear interpolation to obtain a complete flattened image;
step eight: inputting the flattened image obtained in step seven into the arbitrary-shape text detection model DBNet for verification, and marking the detected text boxes in the flattened image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110571944.4A CN113487668A (en) | 2021-05-25 | 2021-05-25 | Radius-unlimited learnable cylindrical surface back projection method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113487668A true CN113487668A (en) | 2021-10-08 |
Family
ID=77933098
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110571944.4A Pending CN113487668A (en) | 2021-05-25 | 2021-05-25 | Radius-unlimited learnable cylindrical surface back projection method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113487668A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105979241A (en) * | 2016-06-29 | 2016-09-28 | 深圳市优象计算技术有限公司 | Cylinder three-dimensional panoramic video fast inverse transformation method |
CN110443879A (en) * | 2019-07-24 | 2019-11-12 | 华中科技大学 | A kind of perspective error compensation method neural network based |
CN111553859A (en) * | 2020-04-29 | 2020-08-18 | 清华大学 | Laser radar point cloud reflection intensity completion method and system |
CN112657176A (en) * | 2020-12-31 | 2021-04-16 | 华南理工大学 | Binocular projection man-machine interaction method combined with portrait behavior information |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110738697B (en) | Monocular depth estimation method based on deep learning | |
CN112348815B (en) | Image processing method, image processing apparatus, and non-transitory storage medium | |
JP5713159B2 (en) | Three-dimensional position / orientation measurement apparatus, method and program using stereo images | |
CN111862289B (en) | Point cloud up-sampling method based on GAN network | |
CN106447601B (en) | Unmanned aerial vehicle remote sensing image splicing method based on projection-similarity transformation | |
CN111462210A (en) | Monocular line feature map construction method based on epipolar constraint | |
CN112001859A (en) | Method and system for repairing face image | |
CN109712071A (en) | Unmanned plane image mosaic and localization method based on track constraint | |
CN112053441A (en) | Full-automatic layout recovery method for indoor fisheye image | |
CN115115522A (en) | Goods shelf commodity image splicing method and system | |
CN113538569A (en) | Weak texture object pose estimation method and system | |
CN114596290A (en) | Defect detection method, defect detection device, storage medium, and program product | |
CN112257721A (en) | Image target region matching method based on Fast ICP | |
CN104268550A (en) | Feature extraction method and device | |
US20210124969A1 (en) | Planar and/or undistorted texture image corresponding to captured image of object | |
US11348261B2 (en) | Method for processing three-dimensional point cloud data | |
WO2023066142A1 (en) | Target detection method and apparatus for panoramic image, computer device and storage medium | |
CN116630917A (en) | Lane line detection method | |
CN113487668A (en) | Radius-unlimited learnable cylindrical surface back projection method | |
CN116091987A (en) | Industrial scene-oriented multi-strategy image anomaly sample generation method | |
CN112991449B (en) | AGV positioning and mapping method, system, device and medium | |
CN112116653B (en) | Object posture estimation method for multiple RGB pictures | |
CN107341151B (en) | Image retrieval database generation method, and method and device for enhancing reality | |
Arevalo et al. | Improving piecewise linear registration of high-resolution satellite images through mesh optimization | |
EP3469517A1 (en) | Curvature-based face detector |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||