CN112926503A - Automatic captured data set generation method based on rectangle fitting

Automatic captured data set generation method based on rectangle fitting

Info

Publication number
CN112926503A
CN112926503A (application CN202110307109.XA)
Authority
CN
China
Prior art keywords
rectangle
fitting
image
data set
grabbing
Prior art date
Legal status
Granted
Application number
CN202110307109.XA
Other languages
Chinese (zh)
Other versions
CN112926503B (en)
Inventor
Li Yuwen (李育文)
Zhang Zhihui (张智辉)
Current Assignee
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology
Priority to CN202110307109.XA
Publication of CN112926503A
Application granted
Publication of CN112926503B
Status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an automatic grasp data set generation method based on rectangle fitting, which comprises: clamping the edge of an object with the end effector of a mechanical arm, rotating the object about its center, and acquiring images of the object at multiple angles with a camera; determining the object region and generating an object mask; detecting the contour information of the target object; applying the Hough transform to the image to detect straight lines in the object contour and merging short line segments; sorting the lines by length, detecting corresponding parallel lines in that order, and fitting the object contour with a plurality of rectangles; generating from each fitting rectangle, by equidistant sampling, a plurality of grabbing rectangles suitable for a two-finger gripper; and combining background images with object images to generate the grasp data set images together with the corresponding labels. The invention is simple to operate, convenient and quick to implement, and requires no additional equipment. It removes the time-consuming and labor-intensive manual labeling of data sets for grasping tasks and provides a convenient way to produce the training sets required by deep-learning models for grasping.

Description

Automatic captured data set generation method based on rectangle fitting
Technical Field
The invention belongs to the field of robotics, and particularly relates to an automatic grasp data set generation method based on rectangle fitting.
Background
With the development of computer vision and artificial intelligence, robotics has received increasing attention, and intelligent robots are widely regarded as a mainstream direction of future development. In recent years, journals and conferences in the robotics field have proposed various methods for detecting object grasping poses; in particular, training and testing a deep-learning model that detects the grasping poses of scene objects requires a very large target-grasping data set with annotation information. Such a data set must contain object image information and a number of corresponding grasp poses.
In the field of robotics, available grasp detection data sets include the Cornell Grasp Dataset, the Jacquard Dataset, VMRD and others, all manually annotated by researchers. Each sample in such a data set may contain several objects, and each object has many possible grasp poses. Manual production of a grasp data set is therefore time-consuming and labor-intensive and is influenced by human subjectivity: the annotated data set carries a certain bias, and the labeled grasp positions cover only part of the feasible grasps, so they cannot comprehensively represent the graspable positions of an object.
Disclosure of Invention
The invention aims to provide an automatic grasp data set generation method based on rectangle fitting, addressing the problems in the existing grasp data set production process. The method is simple to operate, saves time and labor, and can complete the production of a grasp database in a short time.
To achieve this purpose, the invention adopts the following technical scheme:
a method for automatically generating a captured data set based on rectangle fitting comprises the following steps:
step one, acquiring object images:
multi-angle image information of the object is acquired by combining a mechanical arm with a camera;
step two, determining the object region and generating an object mask:
the approximate position of the object in the image is first determined from the positional relations among the parts of the data collection system, removing most of the background noise; the object region is then segmented from the background by background subtraction to obtain the object mask;
step three, detecting the contour information of the target object:
the edges of the object are detected with the Canny operator, and the object contour is represented by a large number of short lines;
step four, detecting straight lines by Hough transform:
the Hough transform is applied to the object contour curve to detect the straight lines in the contour, and line segments that satisfy the merging conditions are merged to reduce the number of lines;
step five, fitting the contour with a plurality of rectangles:
the straight lines are sorted by length, and the corresponding parallel lines are detected in order of decreasing length; a rectangle used to fit the object contour must satisfy four conditions simultaneously:
① the included angle between the two line segments is less than 20 degrees;
② the distance between the parallel lines is greater than the threshold th1 = 50;
③ the number of pixels between the parallel lines is not zero;
④ the shorter line segment has a projection onto the longer line segment;
parallel lines that satisfy conditions ① to ④ simultaneously are projected onto each other to generate the plurality of rectangles that fit the object contour;
step six, generating grabbing rectangles:
according to the finger width w, a plurality of grabbing rectangles suitable for a two-finger gripper are generated on each fitting rectangle by equidistant sampling;
step seven, synthesizing the grasp data set images and producing the grasp data set:
using the object mask generated in step two, the color image of the object is combined with a number of preset backgrounds to generate the grasp data set images, and label files are produced for all samples in the data set according to the positions of the object in the images and the grabbing rectangles generated in step six; the overall pipeline is sketched below.
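Read as pseudocode, the seven steps chain together roughly as in the sketch below. The helper names are hypothetical glue, not functions defined by the patent; sketches of most of them appear later in this document.

```python
def build_grasp_dataset(views, backgrounds, finger_width_px):
    """Hypothetical glue for steps one..seven of the method."""
    dataset = []
    for color, depth, empty_scene in views:            # step 1: multi-angle capture
        mask = object_mask(color, empty_scene)         # step 2: background subtraction
        contour = detect_contour(mask)                 # step 3: Canny edges
        lines = [merged_line(c)                        # step 4: Hough + merging
                 for c in cluster_segments(hough_segments(contour), can_merge)]
        rects = fit_rectangles(lines)                  # step 5: parallel-line fitting
        grasps = [g for r in rects                     # step 6: equidistant sampling
                  for g in sample_grasps(r.p1, r.theta_r, r.H, finger_width_px)]
        for bg in backgrounds:                         # step 7: composite + labels
            dataset.append((compose(bg.color, bg.depth, color, depth, mask), grasps))
    return dataset
```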
Preferably, the data acquisition in step one is implemented by combining a mechanical arm with a camera; the camera is mounted on a frame directly above the background desktop, pointing vertically downward. To facilitate later object segmentation, the background desktop is a single color, here white. The gripper is controlled to clamp the edge of the object and position it directly below the camera; the object is then driven to rotate by plus or minus 30 degrees about each of the X, Y and Z coordinate axes of the gripper coordinate system, and the camera acquires an object image after the system has stabilized.
Preferably, in step two, the pose relation between the camera and the mechanical arm is used to compute, through a forward kinematic solution, the image coordinates (u, v) of the midpoint (X, Y, Z) of the line connecting the two fingertips in the mechanical-arm coordinate system, thereby determining the position of the object in the image:
s·[u, v, 1]^T = K · cTr · [X, Y, Z, 1]^T,
where cTr denotes the pose of the mechanical-arm coordinate system with respect to the camera coordinate system, and K is the intrinsic matrix formed from the camera parameters f_x, f_y, c_x, c_y. Assuming the target object is smaller than a cube with a side length of 20 cm, the region of the target in the image can be determined by projecting this cube, and the smallest square containing that region, with side length e, is cropped. This region contains only the target object and the background; the background is then subtracted from the image with a threshold th_v = 10, and the portion whose difference from the background exceeds the threshold is taken as the object region.
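As a rough illustration of step two, the sketch below projects the fingertip midpoint into the image with a pinhole model and segments the object by background subtraction. The 4x4 transform `T_cam_arm`, the intrinsics, and the convention that pixels differing from the background by at least th_v count as foreground are assumptions consistent with the text, not code from the patent.

```python
import cv2
import numpy as np

def project_to_image(p_arm, T_cam_arm, fx, fy, cx, cy):
    """Pinhole projection of a point (X, Y, Z) given in the robot-arm
    frame; T_cam_arm is the assumed 4x4 arm-to-camera transform."""
    X, Y, Z, _ = T_cam_arm @ np.append(np.asarray(p_arm, float), 1.0)
    return fx * X / Z + cx, fy * Y / Z + cy   # (u, v) in pixels

def object_mask(image, background, th_v=10):
    """Background subtraction: pixels whose absolute difference from the
    background reaches th_v are kept as the object region."""
    diff = cv2.absdiff(image, background)
    if diff.ndim == 3:
        diff = diff.max(axis=2)               # strongest channel difference
    return diff >= th_v                       # boolean object mask
```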
Preferably, in step three, edge detection is performed on the object mask image obtained in step two: the Canny operator detects the object edges by computing gradient values at every point of the image. Points with gradient values below 100 are set as non-boundary points, and points with gradient values above 150 as boundary points. Points in between are classified according to their neighbors: a point adjacent to a boundary point is considered a boundary point, and a point adjacent only to non-boundary points is considered a non-boundary point.
Preferably, in step four, all points on the contour are mapped into Hough space by the Hough transform, where each point of Hough space corresponds to a straight line in image space. The number of votes at each intersection point in Hough space is counted, and points with more than th_c = 30 votes are mapped back to straight lines in image space. Then, for any two line segments, the included angle between them and the distance between their two nearest endpoints are computed to decide whether they can be merged: if the included angle of the lines containing the two segments is less than 10 degrees and the nearest-endpoint distance is less than 20 pixels, both segments are considered parts of the same real segment. Pairwise matching generates k segment pairs available for merging. The k segment pairs are then clustered, gathering all segments related by matching into one class. For each class, the minimum bounding rectangle of all points on its lines is computed, and the line connecting the midpoints of the rectangle's two short sides represents the merged line.
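A sketch of the merging test from this paragraph (lines differing by less than 10 degrees, nearest endpoints less than 20 pixels apart); the segment layout (x1, y1, x2, y2) follows OpenCV's HoughLinesP output, which the embodiments use later.

```python
import numpy as np

def line_angle(seg):
    """Undirected angle of a segment (x1, y1, x2, y2) in degrees."""
    x1, y1, x2, y2 = seg
    return np.degrees(np.arctan2(y2 - y1, x2 - x1)) % 180.0

def can_merge(seg_a, seg_b, max_angle=10.0, max_gap=20.0):
    """True if the two segments belong to the same real line segment,
    per the angle and nearest-endpoint conditions described above."""
    diff = abs(line_angle(seg_a) - line_angle(seg_b))
    diff = min(diff, 180.0 - diff)            # angle between undirected lines
    if diff >= max_angle:
        return False
    ends_a = np.asarray(seg_a, float).reshape(2, 2)
    ends_b = np.asarray(seg_b, float).reshape(2, 2)
    gap = min(np.linalg.norm(a - b) for a in ends_a for b in ends_b)
    return gap < max_gap
```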
Preferably, in step five, the merged straight lines are sorted by length, and longer lines are used preferentially to generate rectangles. For each longer line it is judged in turn whether it can form, with one of the remaining lines, a rectangle fitting the object; such a rectangle must satisfy the following conditions:
① the two line segments are parallel lines (their included angle is less than 20 degrees);
② the distance between the parallel lines is greater than the threshold th1 = 50;
③ the region between the parallel lines is not empty;
④ the shorter line segment has a projection onto the longer one.
The fitting rectangle is established from these conditions. The included angle is computed from the cosine of the angle between the two segment vectors, θ = arccos((V1 · V2)/(|V1||V2|)). The distance between the two segments is simplified to the distance between their midpoints. The two endpoints of one segment are connected with the two endpoints of the other to form four vectors; if at least one vector generated from each segment's endpoints forms an acute angle with the other segment, the two segments have projections onto each other. The corresponding endpoints of the two segments are connected to form a quadrilateral, and the number n of non-zero points inside it is counted. If the four conditions cannot be met simultaneously, the next line is examined until all lines have been traversed. For a pair of lines that meets the conditions, the projection points of the shorter segment's endpoints onto the longer segment are computed; if a projection point falls outside the longer segment, a perpendicular is dropped from the corresponding endpoint of the longer segment onto the shorter line, and the intersection point is taken as the updated endpoint of the shorter segment. The midpoint, rotation angle, width and height of the fitting rectangle are then computed: the abscissa and ordinate of the midpoint are the averages of the abscissas and ordinates of the four segment endpoints, the rotation angle θr is the average of the angles that the two segments form with the horizontal, the width W equals the distance between the segment midpoints, and the height H is the length of the shorter segment.
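The closing computation of step five can be summarized as below, assuming the four (projected) endpoints are already available; note the naive angle averaging would need extra care for segments near plus or minus 90 degrees.

```python
import numpy as np

def horizontal_angle(a, b):
    """Angle of segment ab with the horizontal, folded into [-90, 90)."""
    ang = np.degrees(np.arctan2(b[1] - a[1], b[0] - a[0]))
    return (ang + 90.0) % 180.0 - 90.0

def rectangle_params(p1, p2, p3, p4):
    """Fitting-rectangle parameters as stated in the text: midpoint =
    mean of the four endpoints, theta_r = mean of the two segment
    angles, W = distance between segment midpoints, H = shorter
    segment length."""
    pts = np.array([p1, p2, p3, p4], dtype=float)
    center = pts.mean(axis=0)
    theta_r = (horizontal_angle(p1, p2) + horizontal_angle(p3, p4)) / 2.0
    m12 = (pts[0] + pts[1]) / 2.0
    m34 = (pts[2] + pts[3]) / 2.0
    W = np.linalg.norm(m12 - m34)
    H = min(np.linalg.norm(pts[1] - pts[0]), np.linalg.norm(pts[3] - pts[2]))
    return center, theta_r, W, H
```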
Preferably, in step six, the finger width w is measured and the projected finger width w' in the image is computed from the hand-eye relation. With w'/2 as the step length, sampling proceeds from one end of the fitted rectangle to the other along the direction of its side H, generating a series of grabbing rectangles. Each grabbing rectangle has width w_g = w' and height h_g = (6/5)H. The axes of the grabbing rectangle and the fitting rectangle are mutually perpendicular, and the center of each grabbing rectangle lies on the axis of the fitting rectangle, ensuring that the grabbing rectangle does not exceed the range of the fitting rectangle.
Preferably, in step seven, both a monochrome background and a cluttered background are used. The object region is cut out of the image using the object mask obtained in step two and pasted at the center of the background image, rotated by an angle θ that takes values from 0° to 360° at 30° intervals, so that 2 × 12 = 24 samples are generated for a single object. For color images the object pixels directly replace the background: where the mask value is true, I_B(x, y) = I_F(x, y), and the values at other positions remain unchanged. For depth images the object value is subtracted from the background value: where the mask value is true, I_B(x, y) = I_B(x, y) − I_F(x, y), and the values at other positions remain unchanged. The grabbing rectangle of the object in the composite image, (c'_x, c'_y, w'_g, h'_g, θ'_g), is computed from the object's position and angle in the background image and the grabbing rectangles obtained in step six, where c'_x and c'_y are the abscissa and ordinate of the center point of the grabbing rectangle in the composite image (the original center rotated by θ about the object-image center and translated to the background center), w'_g = w_g denotes the finger width of the grabbing rectangle, h'_g = h_g denotes the opening size of the grabbing rectangle, and θ'_g = θ_g represents the angle between the grabbing rectangle and the horizontal axis of the image.
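A minimal sketch of the compositing rule just described, assuming the object crop, the background and the boolean mask have already been aligned to the same size:

```python
import numpy as np

def compose(bg_color, bg_depth, obj_color, obj_depth, mask):
    """Where the mask is true, colour pixels are replaced (I_B = I_F)
    and depth pixels are reduced by the object depth (I_B = I_B - I_F);
    all other positions are left unchanged."""
    out_color = bg_color.copy()
    out_color[mask] = obj_color[mask]
    out_depth = bg_depth.astype(np.int32)              # avoid unsigned underflow
    out_depth[mask] -= obj_depth.astype(np.int32)[mask]
    return out_color, out_depth
```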
Compared with the prior art, the invention has the following prominent substantive features and notable advantages:
1. the production of the grasp data set is simple and time-saving: the object only needs to be clamped manually partway through the process, no further manual intervention is required, and the data set can be produced in a very short time;
2. the method is simple, efficient and easy to implement.
Drawings
FIG. 1 is a flow chart of the present invention.
FIG. 2 is a schematic diagram of a data acquisition system of the present invention.
FIG. 3 is a schematic diagram of the calculation of the fitted rectangle of the present invention.
FIG. 4 is a sample diagram of a capture frame according to the present invention.
Fig. 5 is a diagram illustrating the object segmentation result according to the present invention.
Fig. 6 is a schematic diagram of the contour detection of the present invention.
FIG. 7 is a graphical representation of the results of the rectangle fitting of the present invention.
FIG. 8 is a schematic diagram illustrating the capturing of a sample of a data set according to the present invention.
Detailed Description
The invention is further illustrated by the following figures and examples:
the first embodiment is as follows:
referring to fig. 1, a method for automatically generating a captured data set based on rectangle fitting includes the following steps:
step one, acquiring object images:
multi-angle image information of the object is acquired by combining a mechanical arm with a camera;
step two, determining the object region and generating an object mask:
the approximate position of the object in the image is first determined from the positional relations among the parts of the data collection system, removing most of the background noise; the object region is then segmented from the background by background subtraction to obtain the object mask;
step three, detecting the contour information of the target object:
the edges of the object are detected with the Canny operator, and the object contour is represented by a large number of short lines;
step four, detecting straight lines by Hough transform:
the Hough transform is applied to the object contour curve to detect the straight lines in the contour, and line segments that satisfy the merging conditions are merged to reduce the number of lines;
step five, fitting the contour with a plurality of rectangles:
the straight lines are sorted by length, and the corresponding parallel lines are detected in order of decreasing length; a rectangle used to fit the object contour must satisfy four conditions simultaneously:
① the included angle between the two line segments is less than 20 degrees;
② the distance between the parallel lines is greater than the threshold th1 = 50;
③ the number of pixels between the parallel lines is not zero;
④ the shorter line segment has a projection onto the longer line segment;
parallel lines that satisfy conditions ① to ④ simultaneously are projected onto each other to generate the plurality of rectangles that fit the object contour;
step six, generating grabbing rectangles:
according to the finger width w, a plurality of grabbing rectangles suitable for a two-finger gripper are generated on each fitting rectangle by equidistant sampling;
step seven, synthesizing the grasp data set images and producing the grasp data set:
using the object mask generated in step two, the color image of the object is combined with a number of preset backgrounds to generate the grasp data set images, and label files are produced for all samples in the data set according to the positions of the object in the images and the grabbing rectangles generated in step six.
The production of the grasp data set with this method is simple and time-saving: the object only needs to be clamped manually partway through the process, no further manual intervention is required, and the data set can be produced in a very short time.
Example two:
this embodiment is substantially the same as the first embodiment, and is characterized in that:
in the first step, a fixed Kinect camera is used as data acquisition equipment, the posture of the object is changed by clamping the object through a mechanical arm tail end paw, and image information of the object under various angles is acquired.
In step five, the four rectangle-forming conditions are used to test whether any two parallel lines can synthesize a fitting rectangle, and the fitting rectangle is established from the two parallel lines by mutual projection.
In step six, the projected finger width w' is taken as the width of the grabbing rectangle, 6/5 times the H of the fitting rectangle is taken as the height of the grabbing rectangle, and sampling along the H side of the fitting rectangle at intervals of 0.5 w' constructs the grabbing rectangles.
In step seven, the object image and the background image are combined to form a grasp data set image, and the grasp parameters are generated according to the position of the object image in the background.
This automatic grasp data set generation method based on rectangle fitting is simple to operate, saves time and labor, and can complete the production of the grasp database in a short time.
Example three:
referring to fig. 1 to 8, a method for automatically generating a captured data set based on rectangle fitting includes the following steps:
Step one, the data collection system is built as shown in fig. 2: the camera is mounted on a frame directly above the background desktop, pointing vertically downward, and the mechanical arm is fixed on the desktop. A white desktop background is selected to facilitate later object segmentation. The camera is connected to the computer through an L-to-USB data line, and the mechanical-arm controller is connected to the computer through a twisted pair. After all hardware is successfully connected, all devices are powered on and the data acquisition system is initialized. The gripper of the mechanical arm is manually controlled to clamp the edge of the object to be collected and drag it to the position directly below the camera; the object is then driven to rotate by plus or minus 30 degrees about each of the X, Y and Z coordinate axes of the gripper coordinate system, and the camera collects an object image after the system has stabilized.
Step two, a coordinate system is established at the midpoint of the line connecting the two fingertips, and the projection point of the fingertips in the image is computed from the pose relation between the camera and the mechanical arm to determine the approximate position of the object. A rectangle with side length e = 50 pixels is created parallel to the image edges, with the fingertip projection point located at 3/4 of the rectangle's axis, so that the rectangle bounds the area where the object is located. This area contains only the target object and the background; the background is then subtracted from the image with a threshold th_v = 10, and the portion whose difference exceeds the threshold is taken as the object region; the effect is shown in fig. 5.
Step three, the edges of the object mask image obtained in step two are detected with the cvCanny() function provided by OpenCV. The mask image is the function input, with parameters threshold1 = 100, threshold2 = 150 and aperture_size = 3. The output is a two-dimensional matrix of the same size as the input, with the object contour positions represented by '1' and the other portions by '0'; the effect is shown in fig. 6.
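For reference, the same call with the modern cv2 interface (the legacy cvCanny() wrapper has been removed from recent OpenCV releases); the parameter values follow the text:

```python
import cv2
import numpy as np

def detect_contour(mask_8u):
    """Edge detection on the 8-bit object mask with hysteresis
    thresholds 100/150 and a 3x3 Sobel aperture; returns a matrix
    with 1 on the object contour and 0 elsewhere."""
    edges = cv2.Canny(mask_8u, threshold1=100, threshold2=150, apertureSize=3)
    return (edges > 0).astype(np.uint8)
```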
Step four, the point sets in the contour that may form straight lines are detected with the probabilistic Hough transform function HoughLinesP() provided by OpenCV. The output of step three is the input, with search step rho = 2, search angle interval theta = π/180, accumulator threshold = 30, maximum segment gap maxLineGap = 10 and minimum segment length = 100. The function outputs a two-dimensional matrix L of line segments: the first dimension is the number of detected segments, and the second dimension holds the horizontal and vertical coordinates of each segment's two endpoints. To reduce the computation in segment integration, the segments are first clustered. A collection S of segment sets used for the integrated lines is initialized; segments are taken one at a time from the detected segments and deleted from L. If S contains a set with a segment that can be merged with the current one, the segment is added to that set; otherwise a new set is created, until all detected segments have been traversed. For each class, the minimum bounding rectangle of all points on its lines is computed, and the line connecting the midpoints of the rectangle's two short sides represents the merged line.
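The clustering loop and the merged-line representation could look as follows; `can_merge` is the angle/gap test from step four, and the use of cv2.minAreaRect for the minimum bounding rectangle is an implementation choice, not mandated by the text.

```python
import cv2
import numpy as np

def cluster_segments(segments, can_merge):
    """Greedily gather detected segments into sets of mutually
    mergeable segments, as described above."""
    sets = []
    for seg in segments:
        for s in sets:
            if any(can_merge(seg, member) for member in s):
                s.append(seg)
                break
        else:                                 # no mergeable set found
            sets.append([seg])
    return sets

def merged_line(cluster):
    """Minimum bounding rectangle of all endpoints in a cluster; the
    merged line joins the midpoints of the rectangle's two short sides."""
    pts = np.asarray(cluster, dtype=np.float32).reshape(-1, 2)
    corners = cv2.boxPoints(cv2.minAreaRect(pts))      # 4 corners in order
    if np.linalg.norm(corners[0] - corners[1]) <= np.linalg.norm(corners[1] - corners[2]):
        return (corners[0] + corners[1]) / 2, (corners[2] + corners[3]) / 2
    return (corners[1] + corners[2]) / 2, (corners[3] + corners[0]) / 2
```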
Step five, the lengths of all merged straight lines are computed and the lines are sorted with a quick-sort algorithm. Longer lines are used preferentially to generate rectangles, and it is judged in turn whether a longer line and one of the remaining lines can generate a rectangle fitting the object; such a rectangle must satisfy: ① the included angle of the two line segments is less than 20 degrees; ② the distance between the parallel lines is greater than the threshold th1 = 50; ③ the number of pixels between the parallel lines is not zero; ④ the shorter segment has a projection onto the longer one. The fitting rectangle is then established as follows. For any two lines, let the long one be l1 and the short one l2, with l1 having left endpoint p1 and right endpoint p2, and l2 having left endpoint p3 and right endpoint p4, as shown in fig. 3. Connecting p1 and p2 forms vector V1, and connecting p3 and p4 forms vector V2. The included angle between the two vectors is computed as θ = arccos((V1 · V2)/(|V1||V2|)). The distance d from the midpoint of l2 to l1 represents the distance between the two segments. Points p3 and p4 are connected with points p1 and p2 to form the vectors V13, V14, V23 and V24; if at least one of V13, V14 forms an acute angle with V1 and at least one of V23, V24 forms an acute angle with V1, the two segments have projections onto each other. Connecting points p1, p2, p4, p3, p1 in order forms a quadrilateral, and the number n of non-zero points inside it is counted. If the four conditions cannot be met simultaneously, the next line is examined until all lines have been traversed. For a qualifying pair of lines, the projection points p'3 and p'4 of p3 and p4 onto vector V1 are computed by dropping perpendiculars from p_i (i ∈ {3, 4}) onto the line through p1 and p2, where α1 denotes the angle between V1 and the horizontal. If p'3 lies within segment V1, then p'1 = p'3; otherwise a line through p1 perpendicular to V2 is constructed, its intersection with the line containing V2 is taken as the updated p'3, and p'1 = p1. Similarly, if p'4 lies within segment V1, then p'2 = p'4; otherwise a line through p2 perpendicular to V2 is constructed, its intersection is taken as the updated p'4, and p'2 = p2. The midpoint (x, y) of the fitting rectangle is the average of the four corner points, x = (x_p'1 + x_p'2 + x_p'3 + x_p'4)/4 and y = (y_p'1 + y_p'2 + y_p'3 + y_p'4)/4. The rotation angle is θr = (α1 + α2)/2, the average of the angles that the two segments form with the horizontal. The width W of the fitting rectangle equals the distance from the midpoint of p'1 and p'2 to the midpoint of p'3 and p'4, and the height H is the smaller of the distance between p'1 and p'2 and the distance between p'3 and p'4. The fitting is finally completed by the rectangle, as shown in fig. 7.
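The projections above reduce to one vector operation; the sketch below returns the foot of the perpendicular and whether it falls inside the segment, which decides between the two branches described in the text.

```python
import numpy as np

def project_point(p, a, b):
    """Foot of the perpendicular from p onto the line through a and b,
    and a flag telling whether it lies within segment ab."""
    a, b, p = (np.asarray(q, dtype=float) for q in (a, b, p))
    t = np.dot(p - a, b - a) / np.dot(b - a, b - a)
    return a + t * (b - a), 0.0 <= t <= 1.0

# e.g. foot, inside = project_point(p3, p1, p2): if inside, the foot is a
# rectangle corner; otherwise project p1 onto the short line instead.
```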
Step six, the finger width w is measured and the projected finger width w' in the image is computed from the hand-eye relation. With w'/2 as the step length, sampling proceeds from the P1 end of the fitted rectangle to the P2 end along the direction of its side H, generating a series of grabbing rectangles, as shown in fig. 4. Each grabbing rectangle has width w_g = w' and height h_g = (6/5)H. When θr is positive, the rotation angle of the grabbing rectangle is θg = θr − 90°; when θr is negative, θg = θr + 90°. The abscissa of the grabbing-rectangle center is c_x = x_P1 + k·w_g·cos(θr) and the ordinate is c_y = y_P1 + k·w_g·sin(θr), where k = 0.5, 1, 1.5, 2, ….
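Putting the step-six formulas together; the stop condition, which merely keeps the sampled centers inside the fitted rectangle, is an assumption not spelled out in the text.

```python
import numpy as np

def sample_grasps(p1, theta_r, H, w_proj):
    """Grasp rectangles along the fitted rectangle's axis:
    c = P1 + k*w_g*(cos(theta_r), sin(theta_r)) for k = 0.5, 1, 1.5, ...,
    with w_g = w' (projected finger width), h_g = 6/5 of the fitted H,
    and theta_g = theta_r -/+ 90 deg depending on the sign of theta_r."""
    w_g, h_g = w_proj, 6.0 * H / 5.0
    theta_g = theta_r - 90.0 if theta_r >= 0 else theta_r + 90.0
    t = np.radians(theta_r)
    grasps, k = [], 0.5
    while k * w_g <= H:                        # assumed in-bounds condition
        cx = p1[0] + k * w_g * np.cos(t)
        cy = p1[1] + k * w_g * np.sin(t)
        grasps.append((cx, cy, w_g, h_g, theta_g))
        k += 0.5
    return grasps
```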
Step seven, a monochrome background and a cluttered background of size 300 × 300 are selected. The object region is cut from the image with the object mask obtained in step two, giving a matrix with values only in the object region and 0 elsewhere. The image is rotated with the warpAffine() function, with the rotation center set to the image center, the boundary fill value 0, and the rotation angle θ taking values from 0° to 360° at 30° intervals. The rotated image is added element-wise to the background image, generating 2 × 12 = 24 samples for a single object. The grabbing rectangle of the object in the composite image, (c'_x, c'_y, w'_g, h'_g, θ'_g), is computed from the object's position and angle in the background image and the grabbing rectangles obtained in step six, where c'_x and c'_y are the abscissa and ordinate of the grabbing-rectangle center in the composite image (the original center rotated by θ about the image center and translated to the background center), w'_g = w_g denotes the finger width of the grabbing rectangle, h'_g = h_g denotes the opening size of the grabbing rectangle, and θ'_g = θ_g represents the angle between the grabbing rectangle and the horizontal axis of the image. An example of the resulting grasp data set is shown in fig. 8.
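A sketch of the label transform for the composite image. The exact formulas were embedded as images in the original, so the rotation below (about the crop center, matching warpAffine driven by getRotationMatrix2D, then translation to the 300 × 300 background center) is a reconstruction under OpenCV's y-down convention.

```python
import numpy as np

def center_in_composite(cx, cy, theta_deg, crop_size, bg_size=300):
    """Map a grasp-rectangle centre from the object crop into the
    composite: rotate by theta about the crop centre (counter-clockwise,
    as getRotationMatrix2D does), then move to the background centre."""
    t = np.radians(theta_deg)
    dx, dy = cx - crop_size / 2.0, cy - crop_size / 2.0
    cxp = np.cos(t) * dx + np.sin(t) * dy + bg_size / 2.0
    cyp = -np.sin(t) * dx + np.cos(t) * dy + bg_size / 2.0
    return cxp, cyp
```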
In this embodiment, the automatic grasp data set generation method based on rectangle fitting uses the mechanical-arm end effector to clamp the edge of an object and rotate it about its center while the camera acquires images at multiple angles; determines the object region and generates an object mask; detects the contour information of the target object; applies the Hough transform to the image to detect straight lines in the contour and merges short segments; sorts the lines by length, detects corresponding parallel lines in that order, and fits the object contour with a plurality of rectangles; generates from the fitting rectangles, by equidistant sampling, grabbing rectangles suitable for a two-finger gripper; and combines background images with object images to generate the data set images together with the corresponding labels. The method is simple to operate, convenient and quick to implement, and requires no additional equipment. It removes the time-consuming and labor-intensive manual labeling of data sets for grasping tasks and provides a convenient way to produce the training sets required by deep-learning models.
The present invention is not limited to the above embodiments; all changes, substitutions and simplifications made according to the principles of the invention are equivalent substitutions and fall within the protection scope of this application, whether applied directly or indirectly to other related fields.

Claims (5)

1. A method for automatically generating a grasp data set based on rectangle fitting, characterized by comprising the following steps:
step one, acquiring object images:
multi-angle image information of the object is acquired by combining a mechanical arm with a camera;
step two, determining the object region and generating an object mask:
the approximate position of the object in the image is first determined from the positional relations among the parts of the data collection system, removing most of the background noise; the object region is then segmented from the background by background subtraction to obtain the object mask;
step three, detecting the contour information of the target object:
the edges of the object are detected with the Canny operator, and the object contour is represented by a large number of short lines;
step four, detecting straight lines by Hough transform:
the Hough transform is applied to the object contour curve to detect the straight lines in the contour, and line segments that satisfy the merging conditions are merged to reduce the number of lines;
step five, fitting the contour with a plurality of rectangles:
the straight lines are sorted by length, and the corresponding parallel lines are detected in order of decreasing length; a rectangle used to fit the object contour must satisfy four conditions simultaneously:
① the included angle between the two line segments is less than 20 degrees;
② the distance between the parallel lines is greater than the threshold th1 = 50;
③ the number of pixels between the parallel lines is not zero;
④ the shorter line segment has a projection onto the longer line segment;
parallel lines that satisfy conditions ① to ④ simultaneously are projected onto each other to generate the plurality of rectangles that fit the object contour;
step six, generating grabbing rectangles:
according to the finger width w, a plurality of grabbing rectangles suitable for a two-finger gripper are generated on each fitting rectangle by equidistant sampling;
step seven, synthesizing the grasp data set images and producing the grasp data set:
using the object mask generated in step two, the color image of the object is combined with a number of preset backgrounds to generate the grasp data set images, and label files are produced for all samples in the data set according to the positions of the object in the images and the grabbing rectangles generated in step six.
2. The method for automatically generating a grasp data set based on rectangle fitting as claimed in claim 1, wherein: in step one, a fixed Kinect camera is used as the data acquisition device, the pose of the object is changed by clamping it with the gripper at the end of the mechanical arm, and image information of the object is acquired at various angles.
3. The method for automatically generating a grasp data set based on rectangle fitting as claimed in claim 1, wherein: in step five, the four rectangle-forming conditions are used to test whether any two parallel lines can synthesize a fitting rectangle, and the fitting rectangle is established from the two parallel lines by mutual projection.
4. The method for automatically generating a grasp data set based on rectangle fitting as claimed in claim 1, wherein: in step six, the projected finger width w' is taken as the width of the grabbing rectangle, 6/5 times the H of the fitting rectangle is taken as the height of the grabbing rectangle, and sampling along the H side of the fitting rectangle at intervals of 0.5 w' constructs the grabbing rectangles.
5. The method for automatically generating a grasp data set based on rectangle fitting as claimed in claim 1, wherein: in step seven, the object image and the background image are combined to form a grasp data set image, and the grasp parameters are generated according to the position of the object image in the background.
CN202110307109.XA 2021-03-23 2021-03-23 Automatic generation method of grabbing data set based on rectangular fitting Active CN112926503B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110307109.XA CN112926503B (en) 2021-03-23 2021-03-23 Automatic generation method of grabbing data set based on rectangular fitting

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110307109.XA CN112926503B (en) 2021-03-23 2021-03-23 Automatic generation method of grabbing data set based on rectangular fitting

Publications (2)

Publication Number Publication Date
CN112926503A true CN112926503A (en) 2021-06-08
CN112926503B CN112926503B (en) 2023-07-18

Family

ID=76175527

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110307109.XA Active CN112926503B (en) 2021-03-23 2021-03-23 Automatic generation method of grabbing data set based on rectangular fitting

Country Status (1)

Country Link
CN (1) CN112926503B (en)

Cited By (4)

Publication number Priority date Publication date Assignee Title
CN113506314A (en) * 2021-06-25 2021-10-15 北京精密机电控制设备研究所 Automatic grabbing method and device for symmetrical quadrilateral workpiece under complex background
CN114428876A (en) * 2021-12-29 2022-05-03 广州盖盟达工业品有限公司 Image searching method, device, storage medium and equipment for industrial apparatus
WO2023092519A1 (en) * 2021-11-28 2023-06-01 梅卡曼德(北京)机器人科技有限公司 Grabbing control method and apparatus, and electronic device and storage medium
CN116416217A (en) * 2023-03-06 2023-07-11 赛那德科技有限公司 Method, system and equipment for generating unordered stacking parcel image

Citations (5)

Publication number Priority date Publication date Assignee Title
CN105405126A (en) * 2015-10-27 2016-03-16 大连理工大学 Multi-scale air-ground parameter automatic calibration method based on monocular vision system
CN108460799A (en) * 2018-01-26 2018-08-28 中国地质大学(武汉) A kind of Step wise approximation sub-pix image position method and system
CN108656107A (en) * 2018-04-04 2018-10-16 北京航空航天大学 A kind of mechanical arm grasping system and method based on image procossing
CN109948712A (en) * 2019-03-20 2019-06-28 天津工业大学 A kind of nanoparticle size measurement method based on improved Mask R-CNN
CN111507390A (en) * 2020-04-11 2020-08-07 华中科技大学 Storage box body identification and positioning method based on contour features

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
CN105405126A (en) * 2015-10-27 2016-03-16 大连理工大学 Multi-scale air-ground parameter automatic calibration method based on monocular vision system
CN108460799A (en) * 2018-01-26 2018-08-28 中国地质大学(武汉) A kind of Step wise approximation sub-pix image position method and system
CN108656107A (en) * 2018-04-04 2018-10-16 北京航空航天大学 A kind of mechanical arm grasping system and method based on image procossing
CN109948712A (en) * 2019-03-20 2019-06-28 天津工业大学 A kind of nanoparticle size measurement method based on improved Mask R-CNN
CN111507390A (en) * 2020-04-11 2020-08-07 华中科技大学 Storage box body identification and positioning method based on contour features

Non-Patent Citations (6)

Title
LSL Wong et al., "Data association for semantic world modeling from partial views", International Journal of Robotics Research, 30 June 2015, pages 1064-1082.
Lu Jun et al., "Chip recognition and positioning algorithm based on improved Hu moments and rectangle fitting" (基于改进Hu矩和矩形拟合的芯片识别定位算法), Packaging Engineering (《包装工程》), 16 February 2018, pages 151-156.
Lu Yi et al., "Research on grasping and assembly of moving workpieces based on machine vision" (基于机器视觉的移动工件抓取和装配的研究), Computer Measurement & Control (《计算机测量与控制》), 31 July 2015, pages 1-4.

Cited By (6)

Publication number Priority date Publication date Assignee Title
CN113506314A (en) * 2021-06-25 2021-10-15 北京精密机电控制设备研究所 Automatic grabbing method and device for symmetrical quadrilateral workpiece under complex background
CN113506314B (en) * 2021-06-25 2024-04-09 北京精密机电控制设备研究所 Automatic grabbing method and device for symmetrical quadrilateral workpieces under complex background
WO2023092519A1 (en) * 2021-11-28 2023-06-01 梅卡曼德(北京)机器人科技有限公司 Grabbing control method and apparatus, and electronic device and storage medium
CN114428876A (en) * 2021-12-29 2022-05-03 广州盖盟达工业品有限公司 Image searching method, device, storage medium and equipment for industrial apparatus
CN116416217A (en) * 2023-03-06 2023-07-11 赛那德科技有限公司 Method, system and equipment for generating unordered stacking parcel image
CN116416217B (en) * 2023-03-06 2023-11-28 赛那德科技有限公司 Method, system and equipment for generating unordered stacking parcel image

Also Published As

Publication number Publication date
CN112926503B (en) 2023-07-18

Similar Documents

Publication Publication Date Title
CN112926503A (en) Automatic captured data set generation method based on rectangle fitting
CN111340797B (en) Laser radar and binocular camera data fusion detection method and system
WO2018214195A1 (en) Remote sensing imaging bridge detection method based on convolutional neural network
CN111062915A (en) Real-time steel pipe defect detection method based on improved YOLOv3 model
CN111553949B (en) Positioning and grabbing method for irregular workpiece based on single-frame RGB-D image deep learning
CN110910350B (en) Nut loosening detection method for wind power tower cylinder
CN112509063A (en) Mechanical arm grabbing system and method based on edge feature matching
CN112906797A (en) Plane grabbing detection method based on computer vision and deep learning
CN111209864B (en) Power equipment target identification method
CN109708658B (en) Visual odometer method based on convolutional neural network
CN115816460B (en) Mechanical arm grabbing method based on deep learning target detection and image segmentation
CN114037703B (en) Subway valve state detection method based on two-dimensional positioning and three-dimensional attitude calculation
CN114882109A (en) Robot grabbing detection method and system for sheltering and disordered scenes
CN115272204A (en) Bearing surface scratch detection method based on machine vision
CN113327298A (en) Grabbing attitude estimation method based on image instance segmentation and point cloud PCA algorithm
CN111428815A (en) Mechanical arm grabbing detection method based on Anchor angle mechanism
CN114841923A (en) High-precision real-time crack detection method based on unmanned aerial vehicle
CN115830359A (en) Workpiece identification and counting method based on target detection and template matching in complex scene
CN114812398A (en) High-precision real-time crack detection platform based on unmanned aerial vehicle
CN115147380A (en) Small transparent plastic product defect detection method based on YOLOv5
CN115984186A (en) Fine product image anomaly detection method based on multi-resolution knowledge extraction
WO2022148091A1 (en) Target matching method and device, and robot
Jiang et al. Vision-guided active tactile perception for crack detection and reconstruction
CN114331961A (en) Method for defect detection of an object
CN115861780B (en) Robot arm detection grabbing method based on YOLO-GGCNN

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant