Disclosure of Invention
The invention aims to provide a method for realizing three-dimensional reconstruction of a workpiece based on binocular stereo vision.
The technical scheme for realizing the purpose of the invention is as follows: a method for realizing three-dimensional reconstruction of a workpiece based on binocular stereo vision comprises the following steps:
step 1: constructing a workpiece image acquisition system, wherein the workpiece image acquisition system comprises a workpiece three-dimensional rotating device and a binocular camera hardware measurement system;
step 2: each time the workpiece three-dimensional rotating device rotates by one angle increment from the initial position, the binocular camera hardware system collects a frame of workpiece image and the inclination angle of the workpiece three-dimensional rotating device is measured;
and step 3: carrying out gray processing, ROI region selection and adaptive median filtering on the collected image to obtain a binary image, and carrying out contour extraction on the binary image by using the Canny edge extraction algorithm;
and 4, step 4: extracting feature points from the left and right contour maps by adopting an SIFT algorithm, and performing stereo matching;
step 5, converting the pixel coordinates of the characteristic points into coordinates under a world coordinate system according to the calibration result obtained in the step 1 and the distance measured by the laser radar;
and 6, performing curve fitting on the coordinates of the obtained image characteristic points in a world coordinate system to obtain a workpiece contour map.
Preferably, the binocular camera hardware measurement system comprises two cameras and a laser radar, the centers of the two cameras and the workpiece rotating device are located on the same horizontal line, and the laser radar is located between the two cameras.
Preferably, an inclination sensor is arranged on the workpiece three-dimensional rotating device and used for measuring a rotating angle.
Preferably, the specific steps of extracting the feature points of the left and right contour maps by adopting the SIFT algorithm are as follows:
searching image positions over all scales, and identifying interest points that are invariant to scale and rotation through a Gaussian differential function;
at the position of each interest point, the position and scale of the feature point are determined by a fitting model.
Preferably, the specific method for determining the positions and scales of the feature points through the fitting model is as follows:
performing curve fitting by using the Taylor expansion of the DoG function in the scale space, wherein the Taylor expansion of the DoG function in the scale space is:

D(X) = D(X_0) + (∂D/∂X)^T (X − X_0) + (1/2)(X − X_0)^T (∂²D/∂X²)(X − X_0)

wherein D(X) is the Gaussian difference operator, X = (x, y, σ)^T represents the pixel coordinates at a given scale, σ is the scale factor, x and y are the coordinates of any pixel point in the image pixel coordinate system, and X_0 = (x_0, y_0, σ_0)^T is the sample point about which the expansion is taken;
and (3) taking the derivative of the Taylor expansion and setting it equal to zero to obtain the offset of the extreme point:

X̂ = −(∂²D/∂X²)^(−1) · (∂D/∂X)

the value of the DoG function at the corresponding extreme point is:

D(X̂) = D(X_0) + (1/2)(∂D/∂X)^T · X̂
preferably, the conversion relationship between the pixel coordinates and the world coordinates of the feature points is specifically:

Z_C · [u, v, 1]^T = [[1/dx, 0, u_0], [0, 1/dy, v_0], [0, 0, 1]] · [[f, 0, 0, 0], [0, f, 0, 0], [0, 0, 1, 0]] · [[R, T], [0^T, 1]] · [X_W, Y_W, Z_W, 1]^T

wherein (u, v) are the coordinates of the feature point in the pixel coordinate system, dx and dy are the sizes of a single pixel in the x and y directions in the physical coordinate system, (u_0, v_0) are the pixel coordinates of the principal point, f represents the focal length of the camera, R represents the third-order rotation matrix, T represents the translation column vector, Z_C is the depth of the point in the camera coordinate system, and (X_W, Y_W, Z_W) indicates the position of the point in the world coordinate system.
Compared with the prior art, the invention has the following remarkable advantages:
(1) the method is simple to operate, fast in processing, has low requirements on the environment, and is suitable for workpiece measurement in different environments;
(2) the method solves the problem that shadow areas are generated on the images due to insufficient illumination in the industrial environment, and the images of the workpieces at different angles are obtained by rotating the workpieces, so that the influence of the shadow areas on the subsequent three-dimensional reconstruction work is effectively avoided;
(3) the distance between the workpiece and the camera can be measured by adopting the laser radar, and the complete three-dimensional coordinates of the workpiece can be obtained by combining image coordinate conversion;
(4) the invention adopts the ADXL345 inclination sensor to measure the rotation angle of the workpiece, and feature matching is carried out between the left and right images at multiple angles;
(5) the invention directly transplants OpenCV to an ARM development board, calls the relevant core algorithms of the computer vision library, and carries out a series of image preprocessing steps on the acquired workpiece image, thereby selecting and identifying the ROI region.
The present invention is described in further detail below with reference to the attached drawing figures.
Detailed Description
As shown in fig. 1 to 3, a method for realizing three-dimensional reconstruction of a workpiece based on binocular stereo vision includes:
step 1: constructing a workpiece image acquisition system, wherein the workpiece image acquisition system comprises a workpiece three-dimensional rotating device and a binocular camera hardware measurement system, and an inclination sensor is arranged on the workpiece three-dimensional rotating device and used for measuring a rotating angle; the binocular camera hardware measurement system comprises two cameras and a laser radar, the centers of the two cameras and the workpiece rotating device are located on the same horizontal line, and the laser radar is located between the two cameras; the distance between the binocular camera hardware measurement system and the workpiece three-dimensional rotating device is measured, and the two cameras can move to achieve measurement of different distances.
Calibrating the left camera and the right camera to obtain internal parameters and relative attitude parameters of the two cameras; measuring the Z coordinate of the characteristic point of the workpiece by using the laser radar;
in some embodiments, the binocular camera hardware measurement system and the workpiece three-dimensional rotating device center are located on the same horizontal line, and the calculation amount of space conversion can be reduced.
When the workpiece image is acquired in an industrial environment, a large shadow area is generated due to insufficient illumination. Therefore, the influence of the shadow area needs to be reduced; the invention adopts the method of rotating the workpiece to acquire workpiece images at different angles, so as to reduce the influence of the shadow area.
Step 2: starting from an initial position, when the workpiece three-dimensional rotating device rotates by an angle, a binocular camera hardware system collects a frame of image, and an inclination angle is measured by an inclination sensor;
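As an illustrative sketch (not part of the claimed method), the inclination angle can be derived from the three axis readings of an ADXL345-class accelerometer; the function name and units below are assumptions:

```python
import math

def tilt_angle_deg(ax, ay, az):
    """Tilt of the device relative to gravity, from raw accelerometer
    components (any consistent unit, e.g. g or m/s^2)."""
    # Angle between the x axis and the horizontal plane:
    # theta = atan2(ax, sqrt(ay^2 + az^2)).
    return math.degrees(math.atan2(ax, math.sqrt(ay * ay + az * az)))

# A sensor lying flat (gravity entirely on z) reads zero tilt;
# rotated 90 degrees so gravity lies along x, it reads 90 degrees.
print(tilt_angle_deg(0.0, 0.0, 1.0))
print(tilt_angle_deg(1.0, 0.0, 0.0))
```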
the video Capture in OpenCV is used to open the camera, which is used to process the video file or the video stream of the camera, and can control the opening and closing of the camera, and the video stream can be read into the hardware platform and stored in the matrix frame by using the cap > frame, so as to process each frame image in the video.
And step 3: analyzing and processing the acquired image, wherein the analyzing and processing comprises gray processing, ROI area selection and self-adaptive median filtering to obtain a binary image, and extracting the contour of the binary image by using a canny edge extraction algorithm;
Because the video collected by the camera is in color, it is converted to a gray-level image during processing; the three components R, G, B of a gray-level image in RGB format are equal to each other and to the gray value. In OpenCV, the function declaration that converts the RGB color space to grayscale is cvCvtColor(const CvArr* src, CvArr* dst, int code), i.e. converting the original image src to dst, with code representing the color space conversion parameter; this function performs the gray-scale conversion on each frame of color image. The specific call is cvtColor(frame, edges, CV_BGR2GRAY), where frame is the original image and edges is the grayscale image.
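As a minimal sketch of what the BGR-to-gray conversion computes per pixel, the standard luminance weighting can be written directly (the function name is an assumption; OpenCV applies this per pixel over the whole matrix):

```python
def bgr_to_gray(b, g, r):
    # Standard luminance weights used by the CV_BGR2GRAY conversion:
    # gray = 0.299*R + 0.587*G + 0.114*B, rounded to an integer.
    return round(0.114 * b + 0.587 * g + 0.299 * r)

print(bgr_to_gray(255, 255, 255))  # 255 (white stays white)
print(bgr_to_gray(0, 0, 0))        # 0 (black stays black)
```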
Image denoising is a common step in image preprocessing; commonly used denoising algorithms include adaptive median filtering, Gaussian filtering and the like. Image noise mainly comes from the image acquisition and transmission process, and common types include additive noise, multiplicative noise, quantization noise, salt-and-pepper noise and the like. Adaptive median filtering is particularly suitable for salt-and-pepper noise, which appears as abrupt white or black spots; therefore, the present invention employs adaptive median filtering to eliminate noise.
Canny edge detection is carried out on the binary image, the edge of the image is detected, and a workpiece edge contour map is obtained;
and 4, extracting feature points from the left and right contour maps by adopting an SIFT algorithm, and performing stereo matching.
The SIFT algorithm is a feature description method used in the field of image processing; it detects key points in an image and describes them with a local feature descriptor. The SIFT algorithm is mainly divided into scale-space extreme value detection, key point positioning and key point feature description.
And (3) detection of extreme values in the scale space: image locations at all scales are searched, and potential scale- and rotation-invariant interest points are identified by Gaussian differential functions. The scale-space image is described as:

L(x, y, σ) = G(x, y, σ) * I(x, y),  where G(x, y, σ) = 1/(2πσ²) · exp(−(x² + y²)/(2σ²))

in the formula, L(x, y, σ) represents the image in the scale space, I(x, y) is the input image, G(x, y, σ) is a two-dimensional Gaussian kernel function whose scale can be changed, (x, y) are the coordinates of a pixel point, and σ is the scale factor.
Key point positioning: at the location of each interest point, the position and scale are determined by fitting a fine model. In some embodiments, curve fitting is performed using the Taylor expansion of the DoG function in scale space;
the Taylor expansion of the DoG function in scale space is:

D(X) = D(X_0) + (∂D/∂X)^T (X − X_0) + (1/2)(X − X_0)^T (∂²D/∂X²)(X − X_0)

wherein D(X) is the Gaussian difference operator, X = (x, y, σ)^T represents the pixel coordinates at a given scale, σ is the scale factor, x and y are the coordinates of any pixel point in the image pixel coordinate system, and X_0 = (x_0, y_0, σ_0)^T is the sample point about which the expansion is taken;
and (3) taking the derivative and setting it equal to zero gives the offset of the extreme point:

X̂ = −(∂²D/∂X²)^(−1) · (∂D/∂X)

the value of the DoG function at the corresponding extreme point is:

D(X̂) = D(X_0) + (1/2)(∂D/∂X)^T · X̂
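The sub-pixel offset computation described above can be sketched in pure Python by solving the 3x3 system H·X̂ = −∇D with Cramer's rule (the test quadratic and function names are illustrative assumptions):

```python
def solve3(h, g):
    # Solve H x = -g by Cramer's rule (3x3), giving the sub-pixel offset
    # x_hat = -H^{-1} grad(D) used in key-point refinement.
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(h)
    b = [-v for v in g]
    sol = []
    for k in range(3):
        m = [row[:] for row in h]
        for j in range(3):
            m[j][k] = b[j]
        sol.append(det(m) / d)
    return sol

# For D(X) = -(x-0.3)^2 - (y+0.2)^2 - (s-0.1)^2 sampled at the origin,
# the gradient is (0.6, -0.4, 0.2) and the Hessian is -2*I; the recovered
# offset is the true extremum (0.3, -0.2, 0.1).
grad = [0.6, -0.4, 0.2]
hess = [[-2, 0, 0], [0, -2, 0], [0, 0, -2]]
print(solve3(hess, grad))
```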
and matching the characteristic points obtained from the left image and the right image.
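The left-right matching step can be sketched with nearest-neighbour matching under Lowe's ratio test, a common acceptance criterion for SIFT descriptors; the toy two-dimensional descriptors and the ratio threshold below are illustrative assumptions, not the claimed matcher:

```python
def match_ratio(desc_left, desc_right, ratio=0.8):
    # Accept a match only if the best Euclidean distance is well below
    # the second-best (Lowe's ratio test), rejecting ambiguous matches.
    matches = []
    for i, d1 in enumerate(desc_left):
        dists = sorted((sum((a - b) ** 2 for a, b in zip(d1, d2)) ** 0.5, j)
                       for j, d2 in enumerate(desc_right))
        if len(dists) > 1 and dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches

left = [[1.0, 0.0], [0.0, 1.0]]
right = [[0.0, 1.0], [1.0, 0.1], [10.0, 10.0]]
print(match_ratio(left, right))  # [(0, 1), (1, 0)]
```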
Step 5, converting the coordinates of the characteristic points under the image pixel coordinate system into the coordinates under the world coordinate system according to the calibration result obtained in the step 1 and the distance measured by the laser radar;
step 5.1: converting the pixel coordinates of the image feature points into image physical coordinates;
for the feature point p, its coordinates are (u, v) in the pixel coordinate system and (x, y) in the physical coordinate system. Given that the dimensions of a single pixel in the x and y directions in the physical coordinate system are dx and dy respectively, the following equations hold:

u = x/dx + u_0,  v = y/dy + v_0

arranged in homogeneous transformation-matrix form:

[u, v, 1]^T = [[1/dx, 0, u_0], [0, 1/dy, v_0], [0, 0, 1]] · [x, y, 1]^T

in the formula, (u_0, v_0) represents the pixel coordinates of the origin of the image physical coordinate system.
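The pixel-to-physical conversion of step 5.1 is a simple affine relation; a minimal sketch (function name and the pixel-pitch/principal-point test values are assumptions):

```python
def pixel_to_physical(u, v, dx, dy, u0, v0):
    # Invert u = x/dx + u0 and v = y/dy + v0.
    return ((u - u0) * dx, (v - v0) * dy)

# Example: 4.8 um (0.0048 mm) square pixels, principal point at (320, 240).
# Pixel (420, 140) maps to roughly x = 0.48 mm, y = -0.48 mm.
print(pixel_to_physical(420, 140, 0.0048, 0.0048, 320, 240))
```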
Step 5.2: converting the physical coordinates of the image feature points into camera coordinates.
The camera coordinate system is a three-dimensional space coordinate system established with the optical center of the camera lens as origin, with the Z axis perpendicular to the image physical coordinate plane. According to the similar-triangle principle, the conversion matrix is:

Z_C · [x, y, 1]^T = [[f, 0, 0, 0], [0, f, 0, 0], [0, 0, 1, 0]] · [X_C, Y_C, Z_C, 1]^T

in the formula, (X_C, Y_C, Z_C) are the coordinates in the camera coordinate system, and f is the focal length of the camera.
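The similar-triangle projection of step 5.2 reduces to two divisions; a minimal sketch (function name and the metre/millimetre test values are assumptions):

```python
def project_to_image_plane(Xc, Yc, Zc, f):
    # Pinhole projection by similar triangles: x = f*Xc/Zc, y = f*Yc/Zc.
    return (f * Xc / Zc, f * Yc / Zc)

# A point 1 m in front of an 8 mm (0.008 m) lens, offset 0.1 m right
# and 0.05 m down, lands near (0.0008, -0.0004) m on the image plane.
print(project_to_image_plane(0.1, -0.05, 1.0, 0.008))
```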
Step 5.3, converting the camera coordinates of the image feature points into world coordinates;
finally, the conversion relation between the world coordinate system and the pixel coordinate system is obtained as:

Z_C · [u, v, 1]^T = [[1/dx, 0, u_0], [0, 1/dy, v_0], [0, 0, 1]] · [[f, 0, 0, 0], [0, f, 0, 0], [0, 0, 1, 0]] · [[R, T], [0^T, 1]] · [X_W, Y_W, Z_W, 1]^T

where f denotes the focal length of the camera, R denotes the third-order rotation matrix, T denotes the translation column vector, and (X_W, Y_W, Z_W) are the world coordinate system coordinates.
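Chaining the three conversions the other way round gives the back-projection used in step 5: with the lidar-measured depth supplying Z_C, a pixel can be lifted to world coordinates. The sketch below assumes an orthonormal R (so R^{-1} = R^T) and uses illustrative millimetre values; the function name and test numbers are not from the disclosure:

```python
def pixel_to_world(u, v, Zc, f, dx, dy, u0, v0, R, T):
    # Pixel -> physical -> camera (using the lidar depth Zc) -> world,
    # inverting X_cam = R * X_world + T.
    x = (u - u0) * dx
    y = (v - v0) * dy
    Xc = [x * Zc / f, y * Zc / f, Zc]
    # For a rotation matrix, R^{-1} = R^T.
    d = [Xc[i] - T[i] for i in range(3)]
    return [sum(R[j][i] * d[j] for j in range(3)) for i in range(3)]

# Identity extrinsics: the world frame coincides with the camera frame.
# f = 8 mm, 0.0048 mm pixels, principal point (320, 240), depth 2000 mm:
# pixel (420, 240) back-projects to approximately [120, 0, 2000] mm.
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
T = [0, 0, 0]
print(pixel_to_world(420, 240, 2000.0, 8.0, 0.0048, 0.0048, 320, 240, R, T))
```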
And 6, performing curve fitting on the world coordinates of the obtained image feature points to obtain a workpiece contour map.
The invention realizes three-dimensional reconstruction of the workpiece by multi-angle fusion: the workpiece is rotated by a certain angle, the angle data are recorded, and the left and right cameras each collect one frame of image. Then image preprocessing, image feature matching and other image algorithms are applied to each frame to extract the image feature points and acquire their pixel coordinates, after which coordinate conversion is performed on the pixel coordinates of the feature points to acquire the actual physical coordinates of the workpiece feature points. The method is simple and convenient to operate, can effectively realize three-dimensional reconstruction of a small workpiece, and effectively reduces the influence of image shadow areas.