CN114255286B - Target size measuring method based on multi-view binocular vision perception - Google Patents
- Publication number
- CN114255286B (application CN202210184835.1A)
- Authority
- CN
- China
- Prior art keywords
- binocular
- formula
- groups
- coordinate system
- group
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/80—Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
- G06T7/85—Stereo camera calibration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G06T5/80—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30244—Camera pose
Abstract
The invention relates to the technical field of image processing and discloses a target size measuring method based on multi-view binocular vision perception, comprising the following steps: acquiring the parameters of two groups of binocular cameras based on Zhang's calibration method; shooting a target with the two groups of binocular cameras to obtain two groups of binocular images, and correcting the two groups of binocular images with an improved Bouguet algorithm so that both groups satisfy the epipolar constraint; performing stereo matching on each group of binocular images to obtain its disparity; segmenting the binocular images to obtain the target regions and, from them, two groups of target three-dimensional point clouds; fusing the three-dimensional data points obtained in the two local coordinate systems and unifying them into the same coordinate system; determining the contour of the target region and measuring the contour length with the fused three-dimensional point cloud. The invention improves the precision of target contour dimension measurement and has considerable application value in industry.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to a target size measuring method based on multi-view binocular vision perception.
Background
Binocular vision imitates the mechanism of human binocular sight; the technique is efficient, its equipment simple and its cost comparatively low, so it is widely applied in many fields. In the industrial field, binocular vision enables non-contact detection and monitoring of products without influencing the motion state of the target, so the technique is often used to carry out three-dimensional reconstruction of a target and, from it, distance measurement, size measurement and similar tasks.
at present, a single group of binocular cameras are mostly adopted for binocular vision-based three-dimensional reconstruction, binocular images are corrected by using camera parameter values obtained through calibration, and three-dimensional point cloud is obtained through stereo matching. The stereo matching and the three-dimensional reconstruction based on the single group of binocular cameras have low precision at the sheltering and shadow positions, neglect the possibility that the target object has different shape characteristics at each angle, have limitations on the target three-dimensional reconstruction, and are difficult to ensure the accuracy of the subsequent contour dimension measurement. According to the method for measuring the contour dimension of the binocular vision target with multiple visual angles, the target image is divided and subjected to stereo matching respectively, two groups of three-dimensional point clouds can be obtained, the two groups of data points are unified, the problem that target information obtained by a single camera is incomplete is solved, the precision of target three-dimensional reconstruction can be effectively improved, the precision of target contour dimension measurement is improved, and the method has important research value and significance in industry.
Disclosure of Invention
The invention aims to overcome the defects in the prior art, and provides a target size measuring method based on multi-view binocular vision perception, which can effectively improve the precision of target three-dimensional reconstruction.
In order to achieve the purpose, the invention provides the following technical scheme: a target size measuring method for multi-view binocular vision perception comprises the following steps:
1. two groups of binocular camera parameters are obtained based on Zhang's calibration method, including the intrinsic matrices of the four cameras and the rotation and translation matrices of three camera pairs;
2. shooting a target by using two groups of binocular cameras to obtain two groups of binocular images, and correcting the two groups of binocular images by using an improved Bouguet algorithm to enable the two groups of binocular images to meet epipolar constraint;
3. performing stereo matching on the two groups of binocular images respectively to obtain the parallax of the two groups of binocular images;
4. dividing the binocular image to obtain a target area of the binocular image and obtain two groups of target three-dimensional point clouds;
5. carrying out three-dimensional data fusion on data points obtained from the two groups of local coordinate systems and unifying the data points to the same coordinate system;
6. determining the contour of the target area, and utilizing the fused three-dimensional point cloud to realize the length measurement of the contour.
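The six steps above can be sketched end to end. Everything below is a trivial stand-in, not the patent's algorithms: the constant 35-px disparity, the focal length f = 700 px and the baseline B = 0.12 m are illustrative assumptions made only to show how the stages chain together.

```python
import numpy as np

# Placeholder stages; names and constants are illustrative assumptions.
def rectify(left, right):
    # step 2: epipolar alignment (identity stand-in)
    return left, right

def stereo_match(pair):
    # step 3: disparity map (constant stand-in of 35 px)
    return np.full(pair[0].shape, 35.0)

def reconstruct(disp, f=700.0, B=0.12):
    # step 4: depth from triangular parallax, Z = f*B/d
    Z = f * B / disp
    return np.dstack([np.zeros_like(Z), np.zeros_like(Z), Z]).reshape(-1, 3)

def fuse(clouds, transforms):
    # step 5: rotate/translate every local cloud into the world frame
    return np.vstack([c @ R.T + T for c, (R, T) in zip(clouds, transforms)])

# two fake 2x2 image pairs; second group offset 0.1 m along Z
pairs = [(np.zeros((2, 2)), np.zeros((2, 2)))] * 2
transforms = [(np.eye(3), np.zeros(3)), (np.eye(3), np.array([0.0, 0.0, 0.1]))]
clouds = [reconstruct(stereo_match(rectify(*p))) for p in pairs]
fused = fuse(clouds, transforms)
```

Each stand-in would be replaced by the corresponding step detailed in the embodiment below.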
The invention provides a target size measuring method based on multi-view binocular vision perception, which has the beneficial effects that:
1. in the design process of the camera model, the characteristics of high technical efficiency, simple equipment and low cost of binocular vision are utilized, the problem that the shooting range of a single group of binocular cameras is limited is considered, two groups of binocular cameras are arranged to shoot a target object in an all-round way, and the comprehensive appearance information of the target object is obtained;
2. the invention solves the problem of cooperation of three-dimensional data obtained by a multi-view camera, unifies a plurality of groups of three-dimensional point clouds obtained by the multi-view camera into a world coordinate system to generate a point cloud picture with a complete target, improves the precision of three-dimensional reconstruction and further improves the precision of contour dimension measurement.
Drawings
FIG. 1 is a schematic view of a multi-view binocular vision target contour dimension measurement algorithm of the present invention;
FIG. 2 is a schematic diagram of two binocular camera models according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1, please refer to fig. 1-2, the present invention provides a technical solution: a target size measuring method for multi-view binocular vision perception comprises the following steps:
step 11, the image acquisition device adopts four cameras of the same specification to form two groups of binocular cameras, C1 = {C_l1, C_r1} and C2 = {C_l2, C_r2}; the device acquires images of the calibration plate from multiple angles, ensuring that the calibration plate appears clear and complete in every view;
step 12, calibrate the four cameras with Zhang's calibration method to obtain the internal and external parameters of each camera;
(1b) calibrate the two camera groups C1 and C2 respectively to obtain the rotation and translation matrices between the cameras of each group, defined as (R1, T1) and (R2, T2);
Step 21, acquire images of the target to be measured with the image acquisition device to obtain two groups of binocular images: the first camera group C1 captures the pair {I_l1, I_r1} and the second group C2 captures {I_l2, I_r2}; define the world coordinate system O_w; the first-group left camera C_l1 has coordinate system O_l1, with O_l1 coinciding with O_w; the second-group left camera C_l2 has coordinate system O_l2; the first-group right camera C_r1 has coordinate system O_r1 and the second-group right camera C_r2 has coordinate system O_r2;
Step 22, use the Bouguet algorithm to construct, from the rotation matrix R1 between the first group's cameras, a preliminary horizontal rectification of {I_l1, I_r1}; the specific steps are as follows:
(2a) first split the rotation matrix R1 into composite half-rotation matrices r_l and r_r for the left and right cameras, where r_l = R1^(1/2) and r_r = R1^(-1/2);
splitting the first group's rotation matrix R1 into two opposite half-rotations is equivalent to rotating the left camera by half of R1 in one direction and the right camera by half of R1 in the opposite direction, so that the image planes of the left and right cameras are turned into the same plane;
(2b) create a rotation matrix R_rect from the direction of the translation vector T = (T_x, T_y, T_z)^T between the two cameras, so that the baseline becomes parallel to the imaging plane;
Formula (1):
e_1 = T / ||T||,  e_2 = (-T_y, T_x, 0)^T / sqrt(T_x^2 + T_y^2),  e_3 = e_1 × e_2,  R_rect = [e_1, e_2, e_3]^T
where e_1 is the epipolar direction, aligned with the translation vector; T_x and T_y are the components of the translation vector T; e_2 is a vector in the image-plane direction, orthogonal to e_1; e_3 is the vector perpendicular to the plane in which e_1 and e_2 lie;
(2c) obtain the overall rotation matrices of the left and right cameras according to formula (2); multiplying the first-group left and right camera coordinate systems O_l1, O_r1 by their respective overall rotation matrices R_l, R_r makes the principal optical axes of the two cameras parallel and the image planes parallel to the baseline, so that after rotation the coordinate systems of the first group's left and right cameras are aligned;
Formula (2):
R_l = R_rect · r_l,  R_r = R_rect · r_r
Step 23, rotate O_l1 and O_r1 simultaneously about their respective optical centres by R_l and R_r to obtain new coordinate systems O'_l1, O'_r1; at this point O'_l1 and O'_r1 coincide with the world coordinate system O_w, and after rotation the row-aligned images {I'_l1, I'_r1} are obtained;
Step 24, repeat step 22 for the second group of binocular images {I_l2, I_r2} to perform a preliminary rectification, obtaining the overall rotation matrices of the second pair and the corrected coordinate systems of the second-group left and right cameras;
Step 25, repeat step 23: rotate the second-group camera coordinate systems simultaneously about their respective optical centres to obtain new, coincident coordinate systems and, after rotation, the row-aligned image pair {I'_l2, I'_r2}.
Steps 24 and 25 correspond to steps 22 and 23 respectively; the operations are identical, the only difference being that steps 24 and 25 act on the second group of binocular images while steps 22 and 23 act on the first group;
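As a numeric check on step (2b), the rectifying rotation of formula (1) can be built directly from a translation vector; the baseline values below are an arbitrary example, not calibration output.

```python
import numpy as np

def rect_rotation(T):
    # rows of R_rect per formula (1): e1 along the baseline,
    # e2 in the image plane orthogonal to e1, e3 = e1 x e2
    e1 = T / np.linalg.norm(T)
    e2 = np.array([-T[1], T[0], 0.0]) / np.hypot(T[0], T[1])
    e3 = np.cross(e1, e2)
    return np.vstack([e1, e2, e3])

T = np.array([-120.0, 1.5, 0.8])   # illustrative translation (mm)
R_rect = rect_rotation(T)
```

R_rect is orthonormal and maps T onto the x-axis, i.e. after this rotation the baseline is horizontal and parallel to the image rows.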
Stereo matching is performed separately on the two rectified image pairs {I'_l1, I'_r1} and {I'_l2, I'_r2} to generate disparity maps D_1 and D_2. An improved stereo matching algorithm based on AD-Census is adopted, divided into four steps: matching cost computation, cost aggregation, disparity computation and disparity optimization. Taking the first image pair as an example, the specific steps are as follows:
Step 31, compute the initial matching cost. The Census matching cost C_Census(p, d), shown in formula (3), is the similarity measure between the Census transforms of pixel p in the left image and the pixel p - d in the right image that corresponds to disparity d;
Formula (3):
C_Census(p, d) = Hamming(Cs_L(p), Cs_R(p - d))
where Cs_L(p) and Cs_R(p - d) are the Census-transform codes of pixel p in the left image and of pixel p - d in the right image, and the Hamming distance counts differing bits via an exclusive-or;
The AD (absolute difference) matching cost is shown in formula (4);
Formula (4):
C_AD(p, d) = (1/3) · Σ_{i ∈ {R,G,B}} | I_L^i(p) - I_R^i(p - d) |
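A minimal sketch of the Census transform and Hamming distance that formula (3) relies on; the 3×3 neighbourhood and the wrap-around border handling (via np.roll) are simplifying assumptions, not the patent's exact window.

```python
import numpy as np

def census(img, r=1):
    # bit-string per pixel: 1 where a neighbour is darker than the centre
    code = np.zeros(img.shape, np.uint64)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue
            neighbour = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
            code = (code << np.uint64(1)) | (neighbour < img).astype(np.uint64)
    return code

def hamming(a, b):
    # per-pixel count of differing bits (the exclusive-or of formula (3))
    x = (a ^ b).ravel()
    return np.array([bin(int(v)).count("1") for v in x]).reshape(a.shape)
```

Identical patches give zero cost; each flipped neighbour comparison adds exactly one bit to the distance.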
where I_L^i(p) and I_R^i(p - d) are the grey values of pixel p in the left image and of pixel p - d in the right image in each RGB channel; the final matching cost C(p, d) is shown in formula (5);
Formula (5):
C(p, d) = ρ(C_Census(p, d), λ_Census) + ρ(C_AD(p, d), λ_AD),  with ρ(c, λ) = 1 - e^(-c/λ)
where λ_Census and λ_AD are control parameters governing the weight of the Census matching cost and the AD matching cost respectively;
formula (5) is equivalent to adding the two normalized costs; since ρ maps each cost into [0, 1), when λ_Census and λ_AD are both positive the C(p, d) of formula (5) is confined to the range [0, 2];
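The exponential normaliser of formula (5) can be checked numerically; the λ values below are illustrative choices, not the patent's parameters.

```python
import numpy as np

def rho(c, lam):
    # maps any non-negative cost into [0, 1)
    return 1.0 - np.exp(-c / lam)

def ad_census_cost(c_census, c_ad, lam_census=30.0, lam_ad=10.0):
    # formula (5): sum of two normalised costs, bounded by [0, 2)
    return rho(c_census, lam_census) + rho(c_ad, lam_ad)
```

The bound holds for any positive λ, and the combined cost stays monotone in each raw cost, which is why the two heterogeneous measures can be added directly.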
Step 32, smooth the matching cost with guided filtering, aggregating the cost with the filter kernel as adaptive weight; the kernel function is defined in formula (6);
Formula (6):
W_pq(I) = (1/|ω|²) · Σ_{k : (p,q) ∈ ω_k} [ 1 + (I_p - μ_k)(I_q - μ_k) / (σ_k² + ε) ]
where |ω| is the window size, μ_k and σ_k² are the mean and variance of the grey values of the pixels inside window ω_k, ε is an adjustment parameter, and I_p, I_q are the grey values of the two pixels p and q; the aggregated matching cost C'(p, d) is given by formula (7);
Formula (7):
C'(p, d) = Σ_{q ∈ N_p} W_pq(I) · C(q, d)
The cost aggregation is based on a cross window: N_p is the selected cross window of p, and q ∈ N_p means the terminating pixel q lies inside the selected cross window;
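A single-window sketch of the guided-filter weighting in formulas (6)-(7): with a flat guide image the weights collapse to a plain average, while an intensity edge in the guide suppresses the dissimilar neighbour. The toy window layout and ε are assumptions for illustration.

```python
import numpy as np

def aggregate_at(cost, guide, eps=1e-3):
    # cost, guide: flat arrays over one window; index 0 is the centre pixel p
    mu, var = guide.mean(), guide.var()
    w = (1.0 + (guide[0] - mu) * (guide - mu) / (var + eps)) / guide.size
    return float(np.sum(w * cost))

flat = aggregate_at(np.array([2.0, 4.0, 6.0]), np.ones(3))
edge = aggregate_at(np.array([2.0, 4.0, 6.0]), np.array([10.0, 10.0, 0.0]))
```

`flat` equals the plain mean of the costs; `edge` is pulled toward the costs of the two pixels whose guide intensity matches the centre.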
Step 33, among the candidate disparity values select the one with the lowest matching cost as the disparity value of pixel p, yielding the initial disparity map D_1 corresponding to the first group of binocular images, as shown in formula (8);
Formula (8):
d_p = argmin_{d ∈ [0, D_max]} C'(p, d)
where D_max denotes the maximum disparity search range;
Step 34, distinguish occluded from non-occluded points by a left-right consistency check, as shown in formula (9);
Formula (9):
| d_L(p) - d_R(p - d_L(p)) | ≤ 1
where d_L(p) is the disparity value of pixel p in the left disparity map and d_R(p - d_L(p)) is the disparity value of its corresponding pixel in the right disparity map; when the two differ by more than 1 pixel, the pixel is an occluded point. After the occluded and non-occluded points are obtained, the colour segmentation regions are classified: a region is reliable when the following condition is met and unreliable otherwise, as shown in formula (10);
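A direct loop implementation of the left-right consistency check of formula (9); flagging pixels whose match falls outside the image as occluded is an added assumption.

```python
import numpy as np

def lrc_occlusions(d_left, d_right, tol=1.0):
    # formula (9): p is occluded if |d_L(p) - d_R(p - d_L(p))| > tol
    h, w = d_left.shape
    occ = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            xr = x - int(round(d_left[y, x]))   # corresponding right pixel
            if xr < 0 or xr >= w or abs(d_left[y, x] - d_right[y, xr]) > tol:
                occ[y, x] = True
    return occ
```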
Formula (10):
N_v ≥ max(c, τ · N)
where N is the total number of pixels of the region, N_v the number of non-occluded points in the region, c a constant and τ a proportionality coefficient; to obtain a better fitting effect, plane fitting is applied only to reliable regions, and the disparity plane equation is defined in formula (11);
Formula (11):
d = a·u + b·v + c₀
where (u, v) are the pixel coordinates, d is the disparity value, and (a, b, c₀) are the disparity-plane parameters, which can be obtained by weighted least squares as shown in formula (12);
Formula (12):
(a, b, c₀)^T = argmin_{a,b,c₀} Σ_{i=1..n} w_i (a·u_i + b·v_i + c₀ - d_i)²
where the summation terms are as given in formula (13);
Formula (13):
(Aᵀ W A)(a, b, c₀)^T = Aᵀ W d,  with rows A_i = (u_i, v_i, 1) and W = diag(w_1, …, w_n)
where n is the number of correct matching points in the region; the plane parameters obtained from formulas (12) and (13) are substituted into formula (11) to obtain the disparity value of every pixel in the reliable region, finally yielding the disparity map D_1 corresponding to the first group of binocular images;
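The weighted least-squares plane fit of formulas (11)-(13) in a few lines; the sample points and weights are fabricated and lie exactly on a plane, so the fit must recover its coefficients.

```python
import numpy as np

def fit_disparity_plane(u, v, d, w):
    # solve min sum w_i (a*u_i + b*v_i + c - d_i)^2 via weighted lstsq
    A = np.column_stack([u, v, np.ones_like(u)])
    s = np.sqrt(w)
    coeffs, *_ = np.linalg.lstsq(A * s[:, None], d * s, rcond=None)
    return coeffs

u = np.array([0.0, 1.0, 0.0, 2.0])
v = np.array([0.0, 0.0, 1.0, 1.0])
d = 2.0 * u - 1.0 * v + 3.0            # plane with (a, b, c) = (2, -1, 3)
abc = fit_disparity_plane(u, v, d, np.ones(4))
```

Scaling each row by √w_i before the ordinary least-squares solve is equivalent to solving the weighted normal equations of formula (13).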
Step 35, repeat steps 31 to 34 to obtain the disparity map D_2 corresponding to the second group of binocular images.
The binocular images {I'_l1, I'_r1} and {I'_l2, I'_r2} are segmented to obtain the target regions and, from them, the two groups of target-region three-dimensional point clouds; the specific steps are as follows:
Step 41, use the GrabCut algorithm, combined with a manually drawn box around the target region, to finely segment the target regions in the two groups of left and right images, obtaining the segmentation result of each group; the specific steps are as follows:
(4a) input an image I_l1 taken by the first-group left camera; the user marks a rectangular region to initialize the foreground: the area inside the rectangle is the foreground region T_F and the area outside is the background region T_B; for each pixel n in the image, if n ∈ T_F assign the label α_n = 1, and if n ∈ T_B assign the label α_n = 0;
(4b) use the K-means clustering algorithm to cluster the pixels of the foreground region T_F and of the background region T_B into K classes each;
(4c) initialize the Gaussian mixture model (GMM) parameters of the foreground and the background with the two label sets respectively; substitute each pixel of the foreground region into the two fitted GMMs to obtain the probabilities that it belongs to the foreground region and to the background region, and take them in negative-logarithm form to obtain the region term;
(4d) compute the boundary term from the Euclidean distances between all pairs of adjacent pixels in the foreground region, obtain the minimum of the energy with the max-flow/min-cut algorithm, and reassign the label set to the pixels of the foreground region according to the computed result;
(4f) repeat steps (4a) to (4e) to segment the target region in the remaining binocular images, finally obtaining the segmentation results of the two groups of binocular images;
Step 42, with the camera parameters obtained by Zhang's calibration method, let the focal length of the four cameras be f, the baseline distance of the first group be B_1, and the principal point of the first-group left image be (u_0, v_0); for a pixel p = (u, v) in I'_l1 whose disparity value d is taken from the disparity map D_1, formula (14), derived from the principle of triangular parallax, gives the three-dimensional coordinates of p in the first-group left camera coordinate system; computing the three-dimensional coordinates of all pixels of the target region in this coordinate system yields all three-dimensional points of the target visible from the first group's viewing angle, recorded as the first point cloud P_1 with N_1 points;
Formula (14):
Z = f · B_1 / d,  X = (u - u_0) · Z / f,  Y = (v - v_0) · Z / f
Step 43, let the baseline distance of the second group be B_2 and the principal point of the second-group left image be (u'_0, v'_0); as in step 42, compute the three-dimensional coordinates of all pixels of the target region in the second-group left camera coordinate system, obtaining all three-dimensional points of the target visible from the second group's viewing angle, recorded as the second point cloud P_2 with N_2 points.
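Formula (14) as code; the focal length, baseline, disparity and principal point below are illustrative numbers, not calibration results.

```python
import numpy as np

def pixel_to_3d(u, v, d, f, B, u0, v0):
    # triangular parallax: Z = f*B/d, then back-project (u, v)
    Z = f * B / d
    X = (u - u0) * Z / f
    Y = (v - v0) * Z / f
    return np.array([X, Y, Z])

# a pixel at the principal point with 35 px disparity, f = 700 px, B = 0.12 m
p = pixel_to_3d(u=320.0, v=240.0, d=35.0, f=700.0, B=0.12, u0=320.0, v0=240.0)
```

Depth is inversely proportional to disparity: halving d doubles Z.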
The three-dimensional point clouds P_1 and P_2 obtained in the two local coordinate systems are fused and unified into the world coordinate system O_w, giving one collaborative point cloud of the target; the specific steps are as follows:
Step 51, since the first-group left camera coordinate system O_l1 coincides with the predefined world coordinate system O_w, the rotation matrix between the two coordinate systems is the identity I and the translation vector is 0; according to formula (15), rotating and translating all three-dimensional data of the first point cloud P_1 into the world coordinate system O_w yields the three-dimensional point set P_1^w under the world coordinate system;
Formula (15):
p_i^w = I · p_i + 0 = p_i,  p_i ∈ P_1
Step 52, the rotation matrix and translation vector between the second-group left camera coordinate system O_l2 and the first-group left camera coordinate system O_l1 are R_21 and T_21 respectively; according to formula (16), rotating and translating all three-dimensional data of the second point cloud P_2 into the world coordinate system O_w yields the three-dimensional point set P_2^w under the world coordinate system;
Formula (16):
p_j^w = R_21 · p_j + T_21,  p_j ∈ P_2
Step 53, unify the three-dimensional point clouds of all local coordinate systems into the world coordinate system to obtain the complete three-dimensional point cloud of the target.
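Steps 51 to 53 amount to one rigid transform per cloud, formulas (15)-(16); R_21 and T_21 below are fabricated extrinsics used only to exercise the transform.

```python
import numpy as np

def to_world(points, R, T):
    # formula (15)/(16): p_w = R p + T, applied to an (N, 3) cloud
    return points @ R.T + T

cloud1 = np.array([[0.1, 0.2, 1.0]])    # already in O_w (R = I, T = 0)
cloud2 = np.array([[0.1, 0.2, 1.0]])    # second group, 0.5 m behind along Z
R21 = np.eye(3)
T21 = np.array([0.0, 0.0, 0.5])
fused = np.vstack([to_world(cloud1, np.eye(3), np.zeros(3)),
                   to_world(cloud2, R21, T21)])
```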
Finally, the contour of the target region is determined, and the dimension measurement of the contour is realized using the fused three-dimensional point cloud.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (6)
1. A target size measuring method of multi-view binocular vision perception is characterized by comprising the following steps:
the method comprises the following steps of (1) acquiring two groups of binocular camera parameters based on a Zhang calibration method;
step (2), shooting a target by using two groups of binocular cameras to obtain two groups of binocular images, and correcting the two groups of binocular images by using an improved Bouguet algorithm to enable the two groups of binocular images to meet epipolar constraint;
step (3), performing stereo matching on the two groups of binocular images respectively to obtain the parallax of the two groups of binocular images;
the method comprises the following steps:
step 31, compute the initial matching cost: the Census matching cost C_Census(p, d), shown in formula (3), is the similarity measure between the Census transforms of pixel p in the left image and the pixel p - d in the right image that corresponds to disparity d;
Formula (3):
C_Census(p, d) = Hamming(Cs_L(p), Cs_R(p - d))
where Cs_L(p) and Cs_R(p - d) are the Census-transform codes of pixel p in the left image and of pixel p - d in the right image, and the Hamming distance counts differing bits via an exclusive-or;
Formula (4):
C_AD(p, d) = (1/3) · Σ_{i ∈ {R,G,B}} | I_L^i(p) - I_R^i(p - d) |
where I_L^i(p) and I_R^i(p - d) are the grey values of pixel p in the left image and of pixel p - d in the right image in each RGB channel; the final matching cost C(p, d) is shown in formula (5);
Formula (5):
C(p, d) = ρ(C_Census(p, d), λ_Census) + ρ(C_AD(p, d), λ_AD),  with ρ(c, λ) = 1 - e^(-c/λ)
where λ_Census and λ_AD are control parameters governing the weight of the Census matching cost and the AD matching cost respectively;
formula (5) is equivalent to adding the two normalized costs;
step 32, smooth the matching cost with guided filtering, aggregating the cost with the filter kernel as adaptive weight; the kernel function is defined in formula (6);
Formula (6):
W_pq(I) = (1/|ω|²) · Σ_{k : (p,q) ∈ ω_k} [ 1 + (I_p - μ_k)(I_q - μ_k) / (σ_k² + ε) ]
where |ω| is the window size, μ_k and σ_k² are the mean and variance of the grey values of the pixels inside window ω_k, ε is an adjustment parameter, and I_p, I_q are the grey values of the two pixels p and q; the aggregated matching cost C'(p, d) is given by formula (7);
Formula (7):
C'(p, d) = Σ_{q ∈ N_p} W_pq(I) · C(q, d)
the cost aggregation is based on a cross window: N_p is the selected cross window of p, and q ∈ N_p means the terminating pixel q lies inside the selected cross window;
step 33, among the candidate disparity values select the one with the lowest matching cost as the disparity value of pixel p, yielding the initial disparity map D_1 corresponding to the first group of binocular images, as shown in formula (8);
Formula (8):
d_p = argmin_{d ∈ [0, D_max]} C'(p, d)
where D_max denotes the maximum disparity search range;
step 34, distinguish occluded from non-occluded points by a left-right consistency check, as shown in formula (9);
Formula (9):
| d_L(p) - d_R(p - d_L(p)) | ≤ 1
where d_L(p) is the disparity value of pixel p in the left disparity map and d_R(p - d_L(p)) is the disparity value of its corresponding pixel in the right disparity map; when the two differ by more than 1 pixel, the pixel is an occluded point; after the occluded and non-occluded points are obtained, the colour segmentation regions are classified: a region is reliable when the following condition is met and unreliable otherwise, as shown in formula (10);
Formula (10):
N_v ≥ max(c, τ · N)
where N is the total number of pixels of the region, N_v the number of non-occluded points in the region, c a constant and τ a proportionality coefficient; to obtain a better fitting effect, plane fitting is applied only to reliable regions, and the disparity plane equation is defined in formula (11);
Formula (11):
d = a·u + b·v + c₀
where (u, v) are the pixel coordinates, d is the disparity value, and (a, b, c₀) are the disparity-plane parameters, which can be obtained by weighted least squares as shown in formula (12);
Formula (12):
(a, b, c₀)^T = argmin_{a,b,c₀} Σ_{i=1..n} w_i (a·u_i + b·v_i + c₀ - d_i)²
where the summation terms are as given in formula (13);
Formula (13):
(Aᵀ W A)(a, b, c₀)^T = Aᵀ W d,  with rows A_i = (u_i, v_i, 1) and W = diag(w_1, …, w_n)
where n is the number of correct matching points in the region; the plane parameters obtained from formulas (12) and (13) are substituted into formula (11) to obtain the disparity value of every pixel in the reliable region, finally yielding the disparity map D_1 corresponding to the first group of binocular images;
Step 35, repeat steps 31 to 34 to obtain the disparity map D_2 corresponding to the second group of binocular images;
step (4), segment the binocular images to obtain the target regions and the two groups of target three-dimensional point clouds;
step (5), fuse the three-dimensional data points obtained in the two local coordinate systems and unify them into the same coordinate system;
step (6), determine the contour of the target region and measure the length of the contour using the fused three-dimensional point cloud.
2. The method for measuring the size of the target based on the multi-view binocular vision perception according to claim 1, wherein the method comprises the following steps: in the step (1), two groups of binocular camera parameters are obtained based on a Zhang calibration method, and the method comprises the following steps:
step 11, the image acquisition device adopts four cameras of the same specification to form two groups of binocular cameras; the device acquires images of the calibration plate from multiple angles, ensuring that the calibration plate is clear and complete in every image;
step 12, calibrating the four cameras with Zhang's calibration method to obtain the internal and external parameters of each camera;
(1b) calibrating the two groups of cameras respectively to obtain the rotation matrix and the translation matrix of each binocular pair;
3. The method for measuring target size based on multi-view binocular vision perception according to claim 2, wherein in step (2), two groups of binocular cameras photograph the target to obtain two groups of binocular images, and the two groups of binocular images are respectively corrected by an improved Bouguet algorithm so that they satisfy the epipolar constraint, comprising the following steps:
step 21, acquiring images of the target to be measured with the image acquisition device to obtain two groups of binocular images, one taken by the first group of cameras and one by the second group; defining the world coordinate system to coincide with the coordinate system of the first group's left-eye camera, and defining the coordinate systems of the second group's left-eye camera and of the two right-eye cameras accordingly;
step 22, constructing the rectification rotation for the first group of binocular images with the Bouguet algorithm and performing a preliminary horizontal correction, with the following specific steps:
(2a) decomposing the rotation matrix $R$ between the first group's left and right cameras into composite matrices for the two cameras, each camera rotating half-way: $r_l = R^{1/2}$, $r_r = R^{-1/2}$;
(2b) creating a rotation matrix $R_{rect}$ from the direction of the translation vector $T$ between the two cameras so that the baseline becomes parallel to the imaging plane, as shown in formula (1);
$R_{rect} = \begin{bmatrix} e_1^{\mathsf T} \\ e_2^{\mathsf T} \\ e_3^{\mathsf T} \end{bmatrix}$  formula (1)
wherein $e_1 = T/\lVert T \rVert$ is the unit vector pointing toward the epipole, in the same direction as the translation vector; $e_2 = [-T_y \;\; T_x \;\; 0]^{\mathsf T} / \sqrt{T_x^2 + T_y^2}$ is a vector in the image plane orthogonal to $e_1$, with $T_x$ and $T_y$ the components of $T$; and $e_3 = e_1 \times e_2$ is the vector perpendicular to the plane in which $e_1$ and $e_2$ lie;
(2c) obtaining the overall rotation matrices $R_l$ and $R_r$ of the left and right cameras according to formula (2); the coordinate systems of the first group's left and right cameras are multiplied by their respective overall rotation matrices so that the principal optical axes of the two cameras become parallel and the image planes become parallel to the baseline; after rotation, the coordinate systems of the first group's left and right cameras have the same orientation;
$R_l = R_{rect}\, r_l, \qquad R_r = R_{rect}\, r_r$  formula (2)
step 23, rotating the two camera coordinate systems simultaneously about their respective optical centers to obtain new coordinate systems, at which point the first group's left-eye coordinate system coincides with the world coordinate system; after rotation, row-aligned images are obtained;
step 24, repeating step 22 for the second group of binocular images to perform the preliminary correction, obtaining the overall rotation matrices of the second pair and the corrected coordinate systems of the second group's left and right cameras;
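The baseline-alignment rotation of formula (1) can be sketched in NumPy as follows; the translation vector used here is an illustrative value, not one from the patent:

```python
import numpy as np

def rectify_rotation(T):
    """Rotation R_rect (formula (1)) that maps the baseline (translation
    vector T between the left and right cameras) onto the image x-axis."""
    e1 = T / np.linalg.norm(T)            # unit vector along the baseline
    e2 = np.array([-T[1], T[0], 0.0])     # in the image plane, orthogonal to e1
    e2 /= np.hypot(T[0], T[1])
    e3 = np.cross(e1, e2)                 # perpendicular to the e1-e2 plane
    return np.vstack([e1, e2, e3])        # rows e1^T, e2^T, e3^T

T = np.array([-120.0, 2.0, 1.5])          # illustrative baseline (mm)
R = rectify_rotation(T)
aligned = R @ (T / np.linalg.norm(T))     # baseline direction after rotation
```

After applying $R_{rect}$, the baseline direction becomes $[1, 0, 0]^{\mathsf T}$, i.e. purely horizontal, which is what makes the rectified image rows epipolar lines.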
4. The method for measuring target size based on multi-view binocular vision perception according to claim 3, wherein in step (4), the binocular images are segmented to obtain the target areas of the binocular images and two groups of target three-dimensional point clouds, comprising:
step 41, finely segmenting the target areas in the two groups of left and right images with the GrabCut algorithm combined with manual frame selection of the target area, obtaining the segmentation results of the image regions, with the following specific steps:
(4a) inputting the image taken by the first group's left-eye camera; the user selects a rectangular region to initialize the foreground: the area inside the rectangle is the foreground region and the area outside the rectangle is the background region; each pixel inside the rectangle is assigned the foreground label and each pixel outside it is assigned the background label;
(4b) clustering the pixels of the foreground region and of the background region into K clusters each, using the K-means clustering algorithm;
(4c) initializing the GMM parameters of the foreground and the background with the two label sets respectively; each pixel of the foreground region is substituted into the two obtained GMMs to get the probabilities that the pixel belongs to the foreground region and to the background region, and the negative logarithms of these probabilities give the region term of the energy;
(4d) computing the boundary term from the Euclidean distance between every pair of adjacent pixels in the foreground region; the minimum of the energy is obtained with the max-flow/min-cut algorithm, and the computed result re-assigns the label set of the pixels in the foreground region;
(4f) repeating steps (4a) to (4e) to segment the target areas in the remaining binocular images, finally obtaining the segmentation results of the two groups of binocular images;
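The clustering of step (4b) can be sketched as a minimal K-means over pixel colours; this covers only the GMM-initialization step, not the full GrabCut energy minimization, and the colour blobs are synthetic:

```python
import numpy as np

def kmeans(pixels, k, iters=20):
    """Minimal K-means over pixel colours (step 4b): clusters foreground
    (or background) pixels into K groups to initialise the GMM components."""
    # deterministic, spread-out initialisation over the input order
    centers = pixels[np.linspace(0, len(pixels) - 1, k).astype(int)].copy()
    for _ in range(iters):
        # assign each pixel to its nearest centre (squared Euclidean distance)
        labels = np.argmin(((pixels[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        # move each centre to the mean of its cluster
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    return labels, centers

# Two well-separated synthetic colour blobs standing in for foreground pixels
rng = np.random.default_rng(1)
blob_a = rng.normal([200.0, 40.0, 40.0], 5.0, (50, 3))   # reddish pixels
blob_b = rng.normal([40.0, 40.0, 200.0], 5.0, (50, 3))   # bluish pixels
labels, centers = kmeans(np.vstack([blob_a, blob_b]), k=2)
```

Each resulting cluster (its mean and covariance) would seed one Gaussian component of the foreground or background GMM.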
step 42, based on the camera parameters obtained by Zhang's calibration method, let the focal length of the four cameras be $f$, the baseline distance of the first binocular pair be $B_1$, and the principal point coordinates of the first group's left-eye image be $(c_x, c_y)$; for a pixel $(u, v)$ in the first left-eye image, its disparity value $d$ is taken from the disparity map; as shown in formula (14), its three-dimensional coordinates $(X, Y, Z)$ in the first group's left-eye camera coordinate system are obtained from the triangular parallax principle; computing the three-dimensional coordinates of all pixels in that coordinate system yields all three-dimensional points of the target visible under the first group's viewing angle, recorded as the first group of three-dimensional point clouds;
$Z = \dfrac{f B_1}{d}, \qquad X = \dfrac{(u - c_x)\,Z}{f}, \qquad Y = \dfrac{(v - c_y)\,Z}{f}$  formula (14)
step 43, let the baseline distance of the second binocular pair be $B_2$ and the principal point coordinates of the second group's left-eye image be $(c'_x, c'_y)$; in the same manner as step 42, computing the three-dimensional coordinates of all pixels in the second left-eye camera coordinate system yields all three-dimensional points of the target visible under the second group's viewing angle, recorded as the second group of three-dimensional point clouds.
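The triangular-parallax back-projection of formula (14) can be sketched as follows; the intrinsics ($f$, $B$, principal point) and the pixel values are illustrative, not taken from the patent:

```python
import numpy as np

def triangulate(u, v, d, f, B, cx, cy):
    """Back-projection of formula (14): pixel (u, v) with disparity d maps
    to the 3-D point (X, Y, Z) in the left-eye camera frame; f is the focal
    length in pixels, B the baseline, (cx, cy) the principal point."""
    Z = f * B / d               # depth from the triangular parallax principle
    X = (u - cx) * Z / f
    Y = (v - cy) * Z / f
    return np.array([X, Y, Z])

# Illustrative values: f = 800 px, B = 120 mm, principal point (640, 360)
P = triangulate(u=720.0, v=400.0, d=96.0, f=800.0, B=120.0, cx=640.0, cy=360.0)
```

Applying this to every pixel of the segmented target region produces the per-view point cloud described in steps 42 and 43.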
5. The method according to claim 4, wherein in step (5), three-dimensional data fusion is carried out on the data points obtained in the two local coordinate systems, unifying them into the same coordinate system, comprising:
step 51, since the first group's left-eye camera coordinate system coincides with the defined world coordinate system, the rotation matrix between the two coordinate systems is the identity matrix $R_1 = I$ and the translation matrix is the zero vector $T_1 = 0$; according to formula (15), all three-dimensional data in the first group of point clouds are rotated and translated into the world coordinate system, giving the first three-dimensional point cloud set in the world coordinate system;
$P_W = R_1 P_1 + T_1$  formula (15)
step 52, the rotation matrix and translation matrix between the second group's left-eye camera coordinate system and the first group's left-eye camera coordinate system are $R_2$ and $T_2$ respectively; according to formula (16), all three-dimensional data in the second group of point clouds are rotated and translated into the world coordinate system, giving the second three-dimensional point cloud set in the world coordinate system;
$P'_W = R_2 P_2 + T_2$  formula (16)
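The rigid transforms of formulas (15) and (16) can be sketched as follows; the second pair's extrinsics (a 90-degree yaw plus a shift) and the point values are illustrative assumptions, not calibration results from the patent:

```python
import numpy as np

def to_world(points, R, T):
    """Apply P_w = R @ P + T (formulas (15)/(16)) to every row of an
    N x 3 point cloud."""
    return points @ R.T + T

# First cloud: its frame coincides with the world frame, so R = I, T = 0
cloud1 = np.array([[0.0, 0.0, 1000.0], [100.0, 50.0, 1000.0]])
world1 = to_world(cloud1, np.eye(3), np.zeros(3))

# Second cloud: illustrative extrinsics of the second left-eye camera
R2 = np.array([[0.0, 0.0, 1.0],
               [0.0, 1.0, 0.0],
               [-1.0, 0.0, 0.0]])        # 90-degree yaw
T2 = np.array([500.0, 0.0, 0.0])         # shift along world x
world2 = to_world(cloud1, R2, T2)

fused = np.vstack([world1, world2])      # unified cloud in the world frame
```

Stacking the two transformed clouds gives the fused point set on which the contour measurement of step (6) operates.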
6. The method according to claim 5, wherein in step (6), the contour of the target area is determined and the dimension measurement of the contour is realized using the fused three-dimensional point cloud.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210184835.1A CN114255286B (en) | 2022-02-28 | 2022-02-28 | Target size measuring method based on multi-view binocular vision perception |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114255286A CN114255286A (en) | 2022-03-29 |
CN114255286B true CN114255286B (en) | 2022-05-13 |
Family
ID=80800014
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |