CN112991517B - Three-dimensional reconstruction method for texture image coding and decoding automatic matching


Info

Publication number
CN112991517B
CN112991517B
Authority
CN
China
Prior art keywords
image
point
color
coding
matrix
Prior art date
Legal status
Active
Application number
CN202110250681.7A
Other languages
Chinese (zh)
Other versions
CN112991517A (en)
Inventor
胡庆武
陈雨婷
艾明耀
赵鹏程
李加元
Current Assignee
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN202110250681.7A
Publication of CN112991517A
Application granted
Publication of CN112991517B

Classifications

    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T19/20 Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G06T7/13 Edge detection
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T7/90 Determination of colour characteristics
    • G06T2207/20061 Hough transform
    • G06T2207/30244 Camera pose
    • G06T2219/2012 Colour editing, changing, or manipulating; Use of colour codes


Abstract

The invention discloses a three-dimensional reconstruction method based on texture-coded images, comprising the following steps. Step 1: project a color-coded image generated from an M-array code onto the object to be measured, and photograph the measured area with a camera to obtain left and right images. Step 2: extract the target region using Hough circle detection and the perspective-transformation principle, so that environmental clutter outside the projection region does not interfere. Step 3: preprocess the images by color-transfer-based enhancement and color identification, converting the color information of each image into code values. Step 4: decode directly according to the M-array coding method to obtain matched point pairs. Step 5: using the obtained corresponding point pairs, perform three-dimensional reconstruction based on the stereo-vision principle to obtain the three-dimensional coordinates of the space points corresponding to the two-dimensional points. The invention combines spatial encoding/decoding with binocular structured-light three-dimensional reconstruction, improves both the speed and the accuracy of reconstruction, and offers a new approach to efficient three-dimensional reconstruction while preserving high fidelity.

Description

Three-dimensional reconstruction method for texture image coding and decoding automatic matching
Technical Field
The invention belongs to the fields of image encoding/decoding and structured-light three-dimensional reconstruction. By photographing an object onto which a coded pattern has been projected, texture-enhanced left and right images are obtained; a decoding algorithm then locates points directly in both images to find their corresponding points, achieving fast, high-precision three-dimensional reconstruction.
Background
As computer vision has advanced, three-dimensional reconstruction technology has developed rapidly and matured, and it is now applied widely and effectively across many fields. Conventional three-dimensional reconstruction methods, however, remain limited in production practice; ensuring robust matching during reconstruction, so that results are reliable while the process stays efficient, is a current research focus in related fields. Three-dimensional reconstruction based on coded structured light uses a projector to map a coded texture pattern onto the surface of the object under measurement, then captures images of the object with a camera. Because the coded pattern on the object's surface is geometrically deformed by the object's depth, decoding the captured image with digital image-processing techniques locates each point directly and yields matched point pairs. The encoding and decoding algorithm is therefore the key of the method and determines the efficiency of reconstruction. In recent years the demands on three-dimensional reconstruction have grown: reconstruction of moving objects is expected while high precision is maintained, which permits only a single projected image for texture enhancement. Encoding/decoding algorithms for a single image have thus become a necessary research direction, as well as a key difficulty for research and application.
Disclosure of Invention
To perform structured-light three-dimensional reconstruction from a single M-array coded image, the invention provides a three-dimensional reconstruction method that combines target-region extraction by perspective transformation, image preprocessing based on color transfer, a decoding algorithm for M-array coding, and stereo vision. By uniting spatial encoding/decoding with stereo three-dimensional reconstruction, it avoids the difficult and error-prone feature-point matching of conventional reconstruction methods, improves both the speed of the reconstruction process and the accuracy of its results, and offers a new approach to efficient three-dimensional reconstruction while preserving high fidelity.
In order to achieve the above object, the present invention provides a three-dimensional reconstruction method based on texture image coding, which comprises the following steps:
step 1), projecting a color coding image generated by using the M array coding matrix onto an object to be detected, and shooting an area to be detected by using a camera to obtain a left image and a right image.
Step 2), extracting the projected target areas of the left image and the right image based on Hough circle detection and the perspective-transformation principle, so as to avoid the influence of environmental clutter on the projection area;
step 3), respectively enhancing the left image and the right image after the target area is extracted based on a color migration technology, carrying out color identification, converting the color information of the image into a code value, and obtaining a left image color identification image and a right image color identification image;
step 4), directly decoding the color identification image obtained in the step 3) by aiming at the M array coding method to obtain a matching point pair;
and 5) carrying out three-dimensional reconstruction based on the stereoscopic vision principle by using the obtained matching point pairs to obtain three-dimensional coordinates of the space points corresponding to the two-dimensional points.
Further, the specific implementation manner of the step 1) is as follows;
firstly, projecting a designed single color coding image to the surface of an object to be measured through a projector, then shooting images of the object to be measured from different angles by adopting a camera to obtain a left image and a right image, wherein the color coding image is generated by utilizing an M array coding matrix.
Further, the specific implementation manner of step 2) is as follows;
firstly, performing gradient edge detection on the image to obtain a binary edge map; secondly, applying the Hough-transform circle-detection function with three suitably chosen input parameters (minimum distance between centres, minimum radius and maximum radius) to detect the centre coordinates of the four circles in the photographed image, i.e. the coordinates of the four corner points of the projection area; finally, taking the four detected corner coordinates as the source coordinates and the four corner coordinates of the designed color-coded image as the coordinates after perspective transformation, substituting the four pairs of two-dimensional mapping points into the perspective-transformation equation system, and solving for the eight unknowns to obtain the perspective transformation matrices that extract the target areas of the left and right images.
Further, the specific implementation manner of step 3) is as follows;
firstly, taking the color-coded image as the target image for color transfer and the image with the extracted target area as the original image, and converting both from the RGB color space to the lαβ color space, which consists of an approximately orthogonal luminance component l and two chrominance components α and β;
secondly, computing the per-channel mean and standard deviation of both images in lαβ space: subtract from each lαβ channel of the original image that channel's mean of the original image, scale the result by the ratio of the target image's standard deviation to the original image's standard deviation, and add the corresponding channel mean of the target image; this yields an original image whose color statistics, channel by channel, match the mean and variance of the target image;
finally, performing color identification on the color-transferred image: convert it from the lαβ color space back to the RGB color space, extract the RGB value of each pixel, and classify pixels with simple decision rules. If the gray values of all three channels exceed a threshold t1, the pixel is identified as white; if the red channel is the largest and its gray value exceeds each of the other two channels by more than t2, the pixel is identified as red; green and blue are identified by the same rule as red. The pixels of the three colors are represented by the code values 0, 1 and 2 used in the design of the coding pattern, and white pixels by 3; the resulting matrix of image size is the color identification map.
Further, the specific implementation manner of the step 4) is as follows;
firstly, constructing a right-neighborhood map and a lower-neighborhood map from the left-image color identification map. To build the right-neighborhood map, search rightward from each pixel of the color identification map until the first primitive pixel beyond the white border is found, and record that point's code value and coordinates in the right-neighborhood map; the lower-neighborhood map is built identically with a downward search. This assumes the starting pixel lies inside a single color block; if it lies on the white border, search rightward or downward from it until the first pixel with code value 0, 1 or 2 is found, and take that point as the starting point;
secondly, for a point to be located in the left image, using the color identification map, the right-neighborhood map and the lower-neighborhood map to find the primitive immediately to the right of the current primitive and the primitive immediately below it, constructing the 3 × 3 window containing the current primitive, and matching this window as a template against the M-array coding matrix corresponding to the color-coded image, thereby achieving window positioning and primitive positioning. Here a primitive is a single color block of the color-coded image; the values of the coding matrix correspond one to one with the primitives of the color-coded image, with 0, 1 and 2 corresponding to the red, green and blue primitives when the coded image is designed;
thirdly, refining the position inside the primitive from the relation between the pixel coordinates of the point to be located and the white border surrounding the primitive. If the point lies inside the primitive, search upward, downward, leftward and rightward from it until the white border is reached, i.e. stop at the first point with code value 3, count the number of pixels traversed in each of the four directions, and obtain the position by a simple calculation. If the point is not inside a primitive, search rightward or downward within a threshold range until the first point inside a primitive is found, and take that point as the starting point for precise positioning;
finally, searching the right image for the match of the point to be located: position the primitive according to the location of the point in the left image, then refine the position inside the primitive as in the previous two steps. During primitive positioning, the geometric constraint of the perspective transformation determines the starting position of the search, which is restricted to a fixed range; if no matching primitive is found within that range, the match for this point is abandoned.
Further, the specific implementation manner of step 5) is as follows;
firstly, calibrating the camera intrinsic matrix with the MATLAB camera calibration toolbox: make a checkerboard calibration pattern as the calibration board, photograph the board with the camera from as many different angles as possible, measure the actual side length of the checkerboard squares in millimetres with a ruler, and input the data into the toolbox to obtain the calibrated camera intrinsics;
secondly, computing the fundamental matrix with a RANSAC eight-point algorithm: randomly select eight point pairs from the corresponding point pairs, solve the linear equation system for a candidate fundamental matrix, then count the point pairs among all original pairs that support this candidate. If enough point pairs support it, the candidate is considered credible and a final fundamental matrix is fitted from all supporting pairs by least squares; otherwise, if only a few matched pairs satisfy the initially computed result, repeat the above steps until an optimal solution is found. Once the optimal fundamental matrix is obtained, combine it with the camera intrinsic matrix to obtain the essential matrix;
and finally, since the essential matrix depends only on the relative pose of the cameras when the two pictures were taken, applying singular value decomposition (SVD) to it to obtain the transformation matrix [R | t] of the right image relative to the left image. The decomposition yields four candidate results, only one of which places the points computed by the triangulation principle in front of both cameras; that unique result is selected by triangulating sample points and checking this condition.
The invention has the following positive effects:
1) The invention provides a decoding method based on a single M-array coded pattern: the coordinates of points in the left and right images within the coding pattern are computed separately, and points with identical coordinates form matched pairs. This replaces the feature-point matching of conventional three-dimensional reconstruction, which is computationally complex, time-consuming and prone to mismatches, thereby improving reconstruction accuracy and efficiency; since positioning requires only one coded image, moving scenes are also supported.
2) Before decoding the coded image, the invention first extracts the target with a perspective transformation, which shields the projection area from environmental clutter and simplifies the subsequent matching of corresponding points between the left and right images.
3) The invention proposes preprocessing with a color-transfer technique, which overcomes the low decoding accuracy caused by the object's surface texture and by ambient illumination; after color transfer the image's color information is vivid enough that color identification requires only simple threshold decisions.
The method reconstructs the object under measurement more accurately and robustly; by combining spatial encoding/decoding with binocular structured-light three-dimensional reconstruction and integrating the advantages of both, it improves both the speed of the reconstruction process and the accuracy of its results.
Drawings
FIG. 1 is a color encoded image for projection according to the present invention.
FIG. 2 is a flow chart of the present invention.
FIG. 3 is a flow chart of the present invention for making a neighborhood map.
FIG. 4 is a flowchart of window template matching to achieve primitive positioning in the present invention.
FIG. 5 is a schematic diagram of precise placement within a cell of the present invention.
Detailed Description
In order to make the objects, technical solutions and effects of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings.
In embodiment 1, a three-dimensional reconstruction method based on texture image coding includes the following steps:
Step 1): the texture-coded images of the projection-enhanced object under measurement are obtained by photographing in this step for subsequent processing. Firstly, the designed single color-coded image is projected onto the surface of the object by a projector; a camera then photographs the object from different angles to obtain the left and right images. The color-coded image is generated from an M-array coding matrix.
Step 2): the projected target areas of the left image and the right image are extracted using the perspective-transformation principle, so as to avoid the influence of environmental clutter on the projection area.
Firstly, gradient edge detection is carried out on the image to obtain a binary image of the edge detection.
Secondly, the Hough-transform circle-detection function provided by OpenCV is applied with three suitably chosen input parameters (minimum distance between centres, minimum radius and maximum radius), detecting the centre coordinates of the four circles in the photographed image, i.e. the coordinates of the four corner points of the projection area.
Finally, the four detected corner coordinates are taken as the source coordinates and the four corner coordinates of the designed color-coded image as the coordinates after perspective transformation; the four pairs of two-dimensional mapping points are substituted into the perspective-transformation equation system, and the eight unknowns are solved to obtain the perspective transformation matrices that extract the target areas of the left and right images.
And 3) enhancing the left image and the right image after the target area is extracted by using a color migration technology, identifying colors, and converting color information of the image into code values to obtain a color identification image.
First, the color-coded image is taken as the target image for color transfer and the image with the extracted target area as the original image; both are converted from the RGB color space to the lαβ color space (an approximately orthogonal luminance component l and two chrominance components α and β).
Secondly, the per-channel mean and standard deviation of both images are computed in lαβ space: from each lαβ channel of the original image its own channel mean is subtracted, the result is scaled by the ratio of the target image's standard deviation to the original image's standard deviation, and the corresponding channel mean of the target image is added, yielding an original image whose color statistics match the mean and variance of the target image channel by channel.
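The statistics matching can be sketched as below. The function name is hypothetical; it assumes its inputs have already been converted to lαβ space (H × W × 3 arrays), with the RGB↔lαβ conversion omitted. The scale factor shown is the one that achieves the stated goal of matching the target image's mean and variance:

```python
import numpy as np

def color_transfer_lab(original, target):
    """Per-channel statistics matching in lαβ space: the returned image
    has, channel by channel, the mean and standard deviation of the
    target (coded) image."""
    out = np.empty(original.shape, dtype=np.float64)
    for c in range(3):  # l, alpha, beta channels
        o = original[..., c].astype(np.float64)
        t = target[..., c].astype(np.float64)
        # Subtract the original's channel mean, scale by the ratio of
        # standard deviations, add the target's channel mean.
        out[..., c] = (o - o.mean()) * (t.std() / o.std()) + t.mean()
    return out
```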
Finally, color identification is performed on the color-transferred image. It is converted from the lαβ color space back to the RGB color space and the RGB value of each pixel is extracted; pixels are then classified by simple decision rules. In this example, if the gray values of all three channels exceed 50, the pixel is identified as white; if the red channel is the largest and its gray value exceeds each of the other two channels by more than 60, the pixel is identified as red; green and blue are identified by the same rule as red. The pixels of the three colors are represented by the code values 0, 1 and 2 used in the design of the coding pattern, and white pixels by 3; the resulting matrix of image size is the color identification map.
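A per-pixel sketch of these decision rules, with the example thresholds 50 and 60 from this embodiment. Checking white first follows the text's ordering, which presumes the color-transferred primitives are strongly saturated; the -1 sentinel for unclassified pixels is an added assumption:

```python
def classify_pixel(r, g, b, t1=50, t2=60):
    """Map one RGB pixel to its code value: 0/1/2 for red/green/blue
    primitives, 3 for the white border, -1 if no rule fires."""
    if min(r, g, b) > t1:                                # all channels bright
        return 3                                         # white border
    if r >= g and r >= b and r - g > t2 and r - b > t2:
        return 0                                         # red primitive
    if g >= r and g >= b and g - r > t2 and g - b > t2:
        return 1                                         # green primitive
    if b >= r and b >= g and b - r > t2 and b - g > t2:
        return 2                                         # blue primitive
    return -1                                            # left unclassified
```

Applying this to every pixel of the rectified, color-transferred image produces the matrix of code values that serves as the color identification map.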
And 4) directly decoding the color identification image obtained in the step 3) by aiming at the M array coding method to obtain a matching point pair.
First, as shown in fig. 3, a right-neighborhood map and a lower-neighborhood map are constructed from the left-image color identification map. To build the right-neighborhood map, each pixel of the color identification map is searched rightward until the first primitive pixel beyond the white border (code value 3) is found, and that point's code value and coordinates are recorded in the right-neighborhood map; the lower-neighborhood map is built identically with a downward search. This assumes the starting pixel lies inside a single color block; if it lies on the white border, the search proceeds rightward or downward from it until the first pixel with code value 0, 1 or 2 is found, and that point is taken as the starting point.
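The rightward scan can be sketched as follows (a hedged, deliberately naive version for clarity; the function name is hypothetical and -1 marks pixels with no right neighbor). The lower-neighborhood map is the same scan over rows instead of columns:

```python
import numpy as np

def build_right_neighborhood(ident):
    """For every pixel of the color identification map `ident`, record
    the code value and column of the first primitive pixel found to the
    right beyond the white border (code value 3)."""
    h, w = ident.shape
    code = np.full((h, w), -1, dtype=int)
    col = np.full((h, w), -1, dtype=int)
    for y in range(h):
        for x in range(w):
            j = x
            while j < w and ident[y, j] != 3:   # leave the current block
                j += 1
            while j < w and ident[y, j] == 3:   # cross the white border
                j += 1
            if j < w:                           # first primitive beyond it
                code[y, x], col[y, x] = ident[y, j], j
    return code, col
```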
Next, as shown in fig. 4, for a point to be located in the left image, the color identification map, the right-neighborhood map and the lower-neighborhood map are used to find the primitive immediately to the right of the current primitive and the primitive immediately below it; the 3 × 3 window containing the current primitive is constructed and matched as a template against the M-array coding matrix corresponding to the color-coded image, achieving window positioning and primitive positioning. Here a primitive is a single color block of the color-coded image; the values of the coding matrix correspond one to one with the primitives, with "0", "1" and "2" corresponding to the red, green and blue primitives when the coded image is designed.
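The template-matching step reduces to finding the unique position of a 3 × 3 code window inside the M-array coding matrix. A brute-force sketch (hypothetical function name; the window-uniqueness property of M-arrays guarantees at most one match):

```python
import numpy as np

def locate_window(code_matrix, window):
    """Slide the 3x3 window of code values over the M-array coding
    matrix and return the (row, col) of the matching position, i.e. the
    primitive's location in the designed pattern; None if the window was
    damaged by occlusion or misclassification."""
    m = np.asarray(code_matrix)
    win = np.asarray(window)
    rows, cols = m.shape
    for i in range(rows - 2):
        for j in range(cols - 2):
            if np.array_equal(m[i:i + 3, j:j + 3], win):
                return i, j
    return None
```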
Thirdly, as shown in fig. 5, the position inside the primitive is refined from the relation between the pixel coordinates of the point to be located and the white border surrounding the primitive. If the point lies inside the primitive, a search proceeds from it upward, downward, leftward and rightward until the white border is reached, i.e. it stops at the first point with code value 3; the number of pixels traversed in each of the four directions is counted, and the position is obtained by a simple calculation. If the point is not inside a primitive, the search proceeds rightward or downward within a threshold range until the first point inside a primitive is found, and that point is taken as the starting point for precise positioning.
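One way the four-direction count could be turned into a position is sketched below. The fractional-offset formula is an illustrative assumption (the text only states that the position follows from the four counts by a simple calculation), and the function name is hypothetical:

```python
import numpy as np

def refine_inside_primitive(ident, y, x):
    """Count pixels from (y, x) to the white border (code value 3) in
    the four directions over the code-value array `ident`, then convert
    the counts into the point's fractional position inside the block."""
    up = down = left = right = 0
    while ident[y - up - 1, x] != 3:
        up += 1
    while ident[y + down + 1, x] != 3:
        down += 1
    while ident[y, x - left - 1] != 3:
        left += 1
    while ident[y, x + right + 1] != 3:
        right += 1
    fx = (left + 0.5) / (left + right + 1)   # horizontal offset in [0, 1]
    fy = (up + 0.5) / (up + down + 1)        # vertical offset in [0, 1]
    return fx, fy
```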
Finally, the right image is searched for the match of the point to be located: the primitive is positioned according to the location of the point in the left image, and the position inside the primitive is then refined as in the previous two steps. During primitive positioning, the geometric constraint of the perspective transformation determines the starting position of the search, which is restricted to a fixed range; if no matching primitive is found within that range, the match for this point is abandoned.
And 5) carrying out three-dimensional reconstruction based on the stereoscopic vision principle by using the obtained matching point pairs to obtain three-dimensional coordinates of the space points corresponding to the two-dimensional points.
Firstly, the camera intrinsic matrix is calibrated with the MATLAB camera calibration toolbox: a checkerboard calibration pattern is made as the calibration board, the board is photographed with the camera from as many different angles as possible, the actual side length of the checkerboard squares is measured in millimetres with a ruler, and the data are input into the toolbox to obtain the calibrated camera intrinsics.
Secondly, the fundamental matrix is computed with a RANSAC eight-point algorithm: eight point pairs are randomly selected from the corresponding point pairs and a candidate fundamental matrix is obtained by solving a linear equation system; the point pairs among all original pairs that support this candidate are then counted. If enough point pairs support it, the candidate is considered credible and a final fundamental matrix is fitted from all supporting pairs by least squares; otherwise, if only a few matched pairs satisfy the initially computed result, the above steps are repeated until an optimal solution is found. Once the optimal fundamental matrix is obtained, it is combined with the camera intrinsic matrix to obtain the essential matrix.
Thirdly, since the essential matrix depends only on the relative position and attitude of the cameras at the two exposures, SVD decomposition of the essential matrix yields the transformation matrix [R | t] of the right image relative to the left image. The calculation produces four candidate results, only one of which places the points obtained by the triangulation principle in front of both cameras, so the unique result is identified by triangulating a test point and verifying its position.
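A sketch of that decomposition and of the front-of-both-cameras (cheirality) test, on synthetic data; the ground-truth rotation, baseline, and test point are assumed example values:

```python
import numpy as np

def decompose_essential(E):
    """SVD of the essential matrix -> the four candidate [R | t] pairs."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0:  U = -U     # force det(U) = det(V) = +1 so both
    if np.linalg.det(Vt) < 0: Vt = -Vt   # R candidates are proper rotations
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    R1, R2, t = U @ W @ Vt, U @ W.T @ Vt, U[:, 2]
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]

def depths(x1, x2, R, t):
    """Triangulate one normalized correspondence by DLT and return the
    triangulated point's depth in the left and right camera."""
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([R, t.reshape(3, 1)])
    A = np.vstack([x1[0] * P1[2] - P1[0], x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0], x2[1] * P2[2] - P2[1]])
    X = np.linalg.svd(A)[2][-1]
    X = X[:3] / X[3]
    return X[2], (R @ X + t)[2]

def pick_pose(E, x1, x2):
    """Keep the unique candidate placing the point in front of both cameras."""
    for R, t in decompose_essential(E):
        z1, z2 = depths(x1, x2, R, t)
        if z1 > 0 and z2 > 0:
            return R, t

# Assumed ground truth: rotation about the y axis, unit-norm baseline.
a = 0.3
R_true = np.array([[np.cos(a), 0.0, np.sin(a)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(a), 0.0, np.cos(a)]])
t_true = np.array([1.0, 0.0, 0.2])
t_true /= np.linalg.norm(t_true)
tx = np.array([[0.0, -t_true[2], t_true[1]],
               [t_true[2], 0.0, -t_true[0]],
               [-t_true[1], t_true[0], 0.0]])
E = tx @ R_true                          # E = [t]x R

X = np.array([0.3, -0.2, 5.0])           # a space point in front of both cameras
x1 = X[:2] / X[2]
Xc2 = R_true @ X + t_true
x2 = Xc2[:2] / Xc2[2]
R_est, t_est = pick_pose(E, x1, x2)
```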
And finally, using the obtained [R | t] matrix, the homonymous point pairs are intersected by triangulation, and the three-dimensional coordinates of the space points corresponding to the two-dimensional points are computed.
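The triangular intersection can be sketched as a linear (DLT) triangulation; the camera pose and the test point below are assumed example values, not data from the patent:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangular intersection of one homonymous point pair:
    each view contributes two rows of the homogeneous system A X = 0."""
    A = np.vstack([x1[0] * P1[2] - P1[0], x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0], x2[1] * P2[2] - P2[1]])
    X = np.linalg.svd(A)[2][-1]          # null vector = homogeneous 3D point
    return X[:3] / X[3]

# Assumed example pose: left camera at the origin, right camera at [R | t].
a = 0.3
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([1.0, 0.0, 0.0])
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([R, t.reshape(3, 1)])

X_true = np.array([0.4, -0.1, 6.0])      # ground-truth space point
h = np.append(X_true, 1.0)
x1 = (P1 @ h)[:2] / (P1 @ h)[2]          # its projection in the left image
x2 = (P2 @ h)[:2] / (P2 @ h)[2]          # its projection in the right image
X_est = triangulate(P1, P2, x1, x2)
```

With noiseless projections the intersection is exact; with real, noisy matches the two rays do not meet and the SVD solution is the algebraic least-squares compromise.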
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the present invention, and it will be understood by those skilled in the art that various changes and modifications may be made therein without departing from the spirit and scope of the invention.

Claims (5)

1. A three-dimensional reconstruction method based on texture image coding is characterized by comprising the following steps:
step 1), projecting a color coding image generated by using an M array coding matrix onto an object to be detected, and shooting an area to be detected by using a camera to obtain a left image and a right image;
step 2), extracting the projected target areas of the left image and the right image based on a Hough circle detection method and the perspective transformation principle, so as to avoid the influence of environmental clutter on the projection area;
step 3), respectively enhancing the left image and the right image after the target area is extracted based on a color migration technology, carrying out color identification, converting the color information of the image into a code value, and obtaining a left image color identification image and a right image color identification image;
step 4), decoding the color identification images obtained in step 3) directly according to the M-array coding method to obtain matching point pairs;
the specific implementation manner of the step 4) is as follows;
respectively constructing a right neighborhood map and a lower neighborhood map from the left image color identification map; the right neighborhood map is constructed by searching rightward from each pixel in the color identification map until the first primitive pixel beyond the white border is found, and recording that pixel's code value and coordinates in the right neighborhood map; the lower neighborhood map is constructed by searching downward from each pixel in the color identification map until the first primitive pixel beyond the white border is found, and recording that pixel's code value and coordinates in the lower neighborhood map; this construction assumes the pixel lies inside a single color block, so if a pixel lies on the white border, the search proceeds rightward or downward from that pixel until the first pixel with code value 0, 1 or 2 is found, and that pixel is used as the starting point;
secondly, for an undetermined point in the left image, using the color identification map, the right neighborhood map and the lower neighborhood map, searching for the right primitive nearest to the current primitive in the right neighborhood map and the lower primitive nearest to the current primitive in the lower neighborhood map, constructing the 3 x 3 window containing the current primitive, and matching this window as a template against the M-array coding matrix corresponding to the color coded image, thereby achieving window positioning and primitive positioning; herein a primitive refers to a single color block in the color coded image, the values of the coding matrix correspond one to one to the primitives of the color coded image, and when the coded image is designed the values of the coding matrix correspond respectively to the primitives of the three colors red, green and blue;
thirdly, performing further accurate positioning within the primitive according to the positional relation between the pixel coordinates of the undetermined point and the white border around the primitive: if the undetermined point lies inside the primitive, searching from it in the four directions of up, down, left and right until the white border is reached, that is, stopping at the first pixel with code value 3, counting the number of pixels traversed in each of the four directions, and obtaining the position by a simple calculation; if the undetermined point is not inside a primitive, searching rightward or downward within a threshold range until the first point inside a primitive is found, and performing accurate positioning with that point as the starting point;
finally, searching for a matching point of the undetermined point in the right image: locating the primitive according to the position of the undetermined point in the left image and then performing further accurate positioning within the primitive, following the same primitive-positioning and accurate-positioning steps as above; when locating the primitive, first determining the initial search position using the geometric constraint of the perspective transformation and searching only within a certain range, and abandoning the matching of the undetermined point if no matching primitive is found within that range;
and 5) carrying out three-dimensional reconstruction based on the stereoscopic vision principle by using the obtained matching point pairs to obtain three-dimensional coordinates of the space points corresponding to the two-dimensional points.
2. The method of claim 1, wherein step 1) is specifically implemented as follows:
firstly, projecting a designed single color coding image to the surface of an object to be measured through a projector, then shooting images of the object to be measured from different angles by adopting a camera to obtain a left image and a right image, wherein the color coding image is generated by utilizing an M array coding matrix.
3. The method of claim 1, wherein step 2) is specifically implemented as follows:
firstly, carrying out gradient edge detection on an image to obtain a binary image of edge detection; secondly, detecting a circular function by using Hough transform, selecting three input parameters of proper minimum distance, maximum radius and minimum radius, and detecting to obtain the center coordinates of four circles in the camera-shot image, namely the coordinates of four corner points of the projection area; and finally, taking the four detected corner coordinates as original coordinates, taking the four corner coordinates of the designed color coded image as coordinates after perspective transformation, substituting the coordinates of the four groups of two-dimensional mapping points into a perspective transformation equation set, calculating values of eight unknowns, and obtaining a perspective transformation matrix of the left and right image extraction target areas.
4. The method of claim 1, wherein step 3) is specifically implemented as follows:
firstly, taking a color coding image as a target image of color migration, taking an image with an extracted target area as an original image of the color migration, converting the target image and the original image from an RGB color space to an l alpha beta color space, and approximating an orthogonal luminance component l and two chrominance components alpha and beta;
secondly, respectively calculating the mean value and the standard deviation of two images in the l alpha beta space, firstly subtracting the mean value of the l alpha beta channel of the target image from the data of the l alpha beta channel of the target image, then scaling the obtained new data according to the proportion, wherein the scaling coefficient is the ratio of the standard deviation of the original image to the standard deviation of the target image, and finally respectively adding the obtained results to the mean value of the l alpha beta channel of the original image to obtain the original image after color migration processing with the same mean value and the same variance as the target image;
finally, carrying out color identification on the image subjected to the color migration treatment; converting the image subjected to color migration processing from the l alpha beta color space to an RGB color space, extracting the RGB value of each pixel, classifying the pixels through a judgment statement, if the gray values of three channels are all larger than a threshold t1, identifying the pixels as white, if the gray value of a red channel is maximum and the gray difference value between the red channel and the other two channels is larger than t2, identifying the pixels as red, wherein the identification principle of the green and the blue is the same as that of the red; and for the identified pixels of the three colors, respectively representing the pixels by corresponding 0, 1 and 2 in the design of a coding pattern, and representing the pixels by corresponding 3, and obtaining a matrix of the image size, namely obtaining the color identification graph.
5. The method of claim 1, wherein step 5) is specifically implemented as follows:
firstly, calibrating a camera internal reference matrix by using an MATLAB camera calibration tool box, making a checkerboard calibration picture as a calibration board, shooting the calibration board by using a camera from different angles to obtain pictures as many as possible, taking the actual side length of a checkerboard small square on the calibration board by using a ruler, wherein the unit is millimeter, and inputting the obtained data into the calibration tool box to obtain a calibration result of the camera internal reference;
secondly, calculating the value of a basic matrix by adopting an eight-point algorithm of RANSAC, randomly selecting eight point pairs from the same-name point pairs, calculating by a method for solving a linear equation set to obtain the result of the basic matrix, then searching for the point pairs supporting the calculated basic matrix in all the original point pairs, if the number of the supported point pairs is enough, considering the result of the calculated basic matrix to be credible, and pinching off by using all the supported point pairs through a least square method to obtain a final basic matrix result, otherwise, if only a small number of matched point pairs meet the basic matrix result obtained by initial calculation, repeating the steps until an optimal solution is found; after the optimal solution of the basic matrix is obtained, the basic matrix and the camera internal reference matrix are operated to obtain the result of the essential matrix;
and finally, the essential matrix is only related to the relative position and posture relation of the cameras when two pictures are shot, the essential matrix is subjected to SVD (singular value decomposition) to obtain a transformation matrix [ R | t ] representing the right image relative to the left image, four results are obtained through calculation, only one of the four results enables points obtained through the triangulation principle calculation to fall in front of the two cameras, and the only result is obtained through point taking calculation verification.
CN202110250681.7A 2021-03-08 2021-03-08 Three-dimensional reconstruction method for texture image coding and decoding automatic matching Active CN112991517B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110250681.7A CN112991517B (en) 2021-03-08 2021-03-08 Three-dimensional reconstruction method for texture image coding and decoding automatic matching

Publications (2)

Publication Number Publication Date
CN112991517A CN112991517A (en) 2021-06-18
CN112991517B (en) 2022-04-29

Family

ID=76335674

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110250681.7A Active CN112991517B (en) 2021-03-08 2021-03-08 Three-dimensional reconstruction method for texture image coding and decoding automatic matching

Country Status (1)

Country Link
CN (1) CN112991517B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113506257B (en) * 2021-07-02 2022-09-20 Tongji University Crack extraction method based on self-adaptive window matching
CN115936037B (en) * 2023-02-22 2023-05-30 Qingdao Chuangxin Qizhi Technology Group Co., Ltd. Decoding method and device for two-dimensional code

Citations (11)

Publication number Priority date Publication date Assignee Title
CN101770638A (en) * 2010-01-15 2010-07-07 Nanjing University of Aeronautics and Astronautics Remote sensing image renovating method based on color and vein divergence
CN103093191A (en) * 2012-12-28 2013-05-08 CETC Information Industry Co., Ltd. Object recognition method with three-dimensional point cloud data and digital image data combined
CN103748612A (en) * 2011-01-24 2014-04-23 Intel Corporation Method and system for acquisition, representation, compression, and transmission of three-dimensional data
JP2015059972A (en) * 2013-09-17 2015-03-30 Ricoh Company, Ltd. Projector device, and image projection system
WO2016037486A1 (en) * 2014-09-10 2016-03-17 Shenzhen University Three-dimensional imaging method and system for human body
JP2016071746A (en) * 2014-09-30 2016-05-09 Fujitsu Limited Information projection method, information projection program, and information processing device
CN107292339A (en) * 2017-06-16 2017-10-24 Chongqing University The unmanned plane low altitude remote sensing image high score Geomorphological Classification method of feature based fusion
CN108615556A (en) * 2018-05-02 2018-10-02 Shenzhen Weiteshi Technology Co., Ltd. A kind of osteoarthritis disorders detecting system based on Self-organizing Maps method
CA2979118A1 (en) * 2017-09-12 2019-03-12 Kal Tire Method of and apparatus for inspecting a ferromagnetic object
CN111462326A (en) * 2020-03-31 2020-07-28 Wuhan University Low-cost 360-degree panoramic video camera urban pipeline three-dimensional reconstruction method and system
EP3745357A1 (en) * 2019-05-28 2020-12-02 InterDigital VC Holdings, Inc. A method and apparatus for decoding three-dimensional scenes

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
US7406212B2 (en) * 2005-06-02 2008-07-29 Motorola, Inc. Method and system for parallel processing of Hough transform computations
US8300928B2 (en) * 2008-01-25 2012-10-30 Intermec Ip Corp. System and method for locating a target region in an image
US20150324661A1 (en) * 2014-05-08 2015-11-12 Tandent Vision Science, Inc. Method for detection of blend pixels for use in an image segregation

Non-Patent Citations (3)

Title
Robust registration of Gaussian mixtures for colour transfer; Grogan M et al.; arXiv; 2017-12-31; 1-10 *
A De Bruijn color structured-light decoding algorithm based on color transfer; Bai Hongyun et al.; Laser & Optoelectronics Progress; 2017-09-05 (No. 01); 286-294 *
Research on decoding method for color pseudo-random coded structured light; Tang Suming et al.; Journal of Optoelectronics·Laser; 2015-03-15 (No. 03); 144-154 *

Similar Documents

Publication Publication Date Title
US6792140B2 (en) Image-based 3D digitizer
CN112991517B (en) Three-dimensional reconstruction method for texture image coding and decoding automatic matching
WO2017023210A1 (en) Generating a merged, fused three-dimensional point cloud based on captured images of a scene
US20080118143A1 (en) 3D Geometric Modeling And Motion Capture Using Both Single And Dual Imaging
CN101667303A (en) Three-dimensional reconstruction method based on coding structured light
CN111768452B (en) Non-contact automatic mapping method based on deep learning
WO2018219156A1 (en) Structured light coding method and apparatus, and terminal device
CN110580481B (en) Light field image key position detection method based on EPI
CN108765333B (en) Depth map perfecting method based on depth convolution neural network
WO2007052191A2 (en) Filling in depth results
Serna et al. Data fusion of objects using techniques such as laser scanning, structured light and photogrammetry for cultural heritage applications
Benveniste et al. Nary coded structured light-based range scanners using color invariants
CN115761126A (en) Three-dimensional reconstruction method and device based on structured light, electronic equipment and storage medium
WO2012037085A1 (en) Active lighting for stereo reconstruction of edges
Shibo et al. A new approach to calibrate range image and color image from Kinect
CN109816738B (en) Stripe boundary extraction method based on coded structured light
CN109064536B (en) Page three-dimensional reconstruction method based on binocular structured light
CN108895979B (en) Line segment coded structured light depth acquisition method
CN116125489A (en) Indoor object three-dimensional detection method, computer equipment and storage medium
CN115082538A (en) System and method for three-dimensional reconstruction of surface of multi-view vision balance ring part based on line structure light projection
CN111783877B (en) Depth information measurement method based on single-frame grid composite coding template structured light
TWI595446B (en) Method for improving occluded edge quality in augmented reality based on depth camera
JP2981382B2 (en) Pattern matching method
CN112562057B (en) Three-dimensional reconstruction system and method
Li et al. Structured light based high precision 3D measurement and workpiece pose estimation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant