CN115205356A - Binocular stereo vision-based quick debugging method for practical training platform - Google Patents

Binocular stereo vision-based quick debugging method for practical training platform

Info

Publication number
CN115205356A
Authority
CN
China
Prior art keywords
training platform
point
module
feature
debugging
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211092343.6A
Other languages
Chinese (zh)
Other versions
CN115205356B (en)
Inventor
李博
郑泽胜
朱万锦
杨丹媚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Yidao Intelligent Information Technology Co ltd
Original Assignee
Guangzhou Yidao Intelligent Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Yidao Intelligent Information Technology Co ltd filed Critical Guangzhou Yidao Intelligent Information Technology Co ltd
Priority to CN202211092343.6A
Publication of CN115205356A
Application granted
Publication of CN115205356B
Legal status: Active
Anticipated expiration

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G06T7/337Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods involving reference images or patches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • G06T7/593Depth or shape recovery from multiple images from stereo images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T7/85Stereo camera calibration

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a binocular-vision-based quick installation and debugging method for a practical training platform, comprising the following steps: quickly calibrating the binocular stereo vision system; locating each module on the practical training platform with an improved lightweight Yolo v5 network and acquiring the point cloud of each target area by a stereo matching method; registering the point clouds of the training platform modules; and determining debugging errors against the module positions of a reference training platform. The binocular stereo vision positioning system identifies and positions each module of the practical training platform quickly and with high precision to obtain the three-dimensional pose information of each module, and feeds back real-time position-correction information for each module during debugging to guide debugging personnel in correcting installation errors, thereby improving the debugging efficiency of the practical training platform, shortening the debugging time, reducing the debugging cost and achieving the goal of rapid shipment.

Description

Binocular stereo vision-based quick debugging method for practical training platform
Technical Field
The invention relates to the technical field of computers, in particular to a quick debugging method of a practical training platform based on binocular stereo vision.
Background
With the rapid development of the intelligent manufacturing industry, production lines need more employees who can debug and operate robots, so enterprises, higher vocational schools, social training institutions and the like need large quantities of practical training platform equipment to train such employees. A practical training platform contains many modules, and inaccurate assembly of their relative positions during production leads to a long debugging period, because the prefabricated robot program or PLC program cannot simply be copied onto the platform and run directly.
Therefore, it is important to debug the installation position of each module, and there are several existing debugging methods:
First, a debugging engineer measures the position of each module with a ruler. This method is slow and imprecise, and the debugging results vary considerably between engineers.
Second, debugging is performed with monocular machine vision. However, monocular vision can only resolve the translation and rotation of each module within a plane and cannot obtain the offset error of a module's full three-dimensional pose; moreover, because of depth-of-field limitations, if the heights of the modules differ greatly from the height of the monocular camera, the camera cannot image them sharply and the positioning becomes inaccurate.
Third, positioning and debugging are performed with binocular vision. Existing binocular vision positioning, however, is still mainly used to position a single object; when it is applied to a practical training platform with many modules, the modules occlude one another, which makes the positioning inaccurate and affects the debugging result.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a binocular-vision-based rapid installation and debugging method for a practical training platform. The method identifies and positions each module of the practical training platform quickly and precisely to obtain the three-dimensional pose information of each module, and feeds back real-time position-correction information for each module during debugging to guide debugging personnel in correcting installation errors, thereby improving the debugging efficiency of the practical training platform, shortening the debugging time and reducing the debugging cost.
The technical scheme of the invention is realized as follows:
a quick debugging method for a practical training platform based on binocular stereo vision comprises the following steps:
calibrating the binocular vision system, namely calibrating and correcting the relative positions of the monocular cameras in the binocular vision system;
adding a double attention model into a Yolo v5 network, and designing a lightweight Yolo v5 network;
manufacturing a practical training platform module position data set, wherein the module position data set comprises a first characteristic diagram and a second characteristic diagram of a practical training platform, which are shot through a binocular vision system;
inputting the training platform module position data set into the lightweight Yolo v5 network, and positioning a target area of each module on the training platform;
performing feature point extraction operation on the target area through a stereo matching method to obtain a first feature point set and a second feature point set;
performing feature point matching on the first feature point set and the second feature point set through a Euclidean distance matching operation to obtain matching point pairs, and performing epipolar constraint checking on the matching point pairs;
calculating the point cloud to be matched of the target area by using a feature matching point pair calculation formula according to the triangulation principle;
and performing a point cloud registration operation on the point cloud to be registered, comparing the registered point cloud with a preset template point cloud, acquiring the offset error of each installed module, and importing the offset errors into visualization software to output guided debugging information.
Preferably, the binocular calibration further comprises:
a left monocular camera and a right monocular camera in the binocular vision system shoot a series of checkerboard calibration plate images;
searching for corner information in the checkerboard calibration plate images by using a Harris corner detection method;
fitting the internal parameters and the external parameters of the left monocular camera and the right monocular camera according to the corner information;
and performing camera coordinate conversion on the images acquired by the left and right monocular cameras through the intrinsic parameter matrix and multiplying by a rotation matrix to obtain new coordinate systems for the left and right monocular cameras, correcting the distortion of the left and right cameras through a distortion-removal operation, and performing epipolar line verification on the images of the left and right monocular cameras.
Preferably, the dual attention model includes a channel attention module and a location attention module;
adding the dual attention model to backbone and neck of a Yolo v5 network to obtain the lightweight Yolo v5 network.
Preferably, a truncated ICP method is used to perform the registration operation on the point clouds to be matched, in which the point-pair residuals are sorted in ascending order and only the first half is used in the least-squares minimization;
selecting a matching point pair meeting the requirement by utilizing the truncation ratio;
solving rigid body transformation between the matching point sets by using a generalized least square method;
updating the truncation ratio using the rigid body transformation;
and selecting the matching point pairs meeting the requirements by using the updated truncation ratio.
Preferably, the step of truncating the ICP to solve for the rigid transformation comprises constructing a residual metric function; accelerating and traversing by utilizing a K-D tree to obtain the most adjacent point pair of the template point cloud and the point cloud to be matched; solving the nearest point pair meeting the requirement by utilizing the truncation coefficient; the rigid transformation is found from the nearest pair of neighboring points that satisfy the requirements using the SVD method.
Preferably, the stereo matching method is to search the target region through a SURF feature point operator to obtain the first feature point set and the second feature point set.
Preferably, designing the lightweight Yolo v5 network includes collecting a data set of top-view sample images of the practical training platform to construct a sample data set, and training the lightweight Yolo v5 network with the sample data set.
Preferably, the channel Attention module has a Residual + Attention structure: Reshape and Transpose are applied to the feature map A to obtain a feature map R1 and a feature map RT1 respectively; R1 and RT1 are multiplied and passed through softmax to obtain a channel attention feature map X; X is multiplied with the feature map A, multiplied by a scale coefficient and Reshaped back to the original shape; and the result is finally added to the feature map A to obtain an output feature map E.
Preferably, the position Attention module has a Residual + Attention structure: the feature map A is convolved with 3 convolution kernels respectively to obtain 3 feature maps B, C and D; the feature map B is Reshaped and Transposed and multiplied with the Reshaped feature map C, and softmax is applied to obtain a feature map S; S is multiplied with the Reshaped feature map D, multiplied by a scale coefficient and Reshaped back to the original shape; and the result is finally added to the feature map A to obtain a final output feature map E.
Preferably, the training platform module position data set comprises: pictures of the modules at random placement positions; and pictures of the robot on the practical training platform in different postures.
Compared with the prior art, the invention has the following advantages.
According to the invention, the binocular vision system identifies each module on the practical training platform quickly and with high precision; the modules are positioned by the Yolo v5 network, and operations such as point cloud generation and point cloud registration yield the three-dimensional pose information of each module on the platform. By comparing this information with the three-dimensional pose information of the reference version, guided correction information is output for each module, which makes adjustment easier for debugging personnel, improves the debugging efficiency of the practical training platform, shortens the debugging time and reduces the debugging cost.
Because there are many kinds of modules on the practical training platform and they often occlude one another, a lightweight Yolo v5 network is used when positioning the modules: a dual attention module is added to the Yolo v5 model to handle the occlusion problem, so that target features are extracted more accurately, the object position information still preserves the low-level feature information under occlusion, and inference speed and accuracy are maintained without increasing the computational load of the network.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a rapid debugging method of a practical training platform based on binocular stereo vision;
FIG. 2 is a structural diagram of a lightweight Yolo v5 network in an embodiment of the present invention;
FIG. 3 is a block diagram of a channel attention module in an embodiment of the present invention;
FIG. 4 is a block diagram of a location attention module in an embodiment of the present invention;
FIG. 5 is a schematic view of a model for pinhole imaging in an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
As shown in fig. 1 to 4, the embodiment of the invention discloses a method for quickly debugging a practical training platform based on binocular stereo vision, which comprises the following steps:
S01, calibrating and correcting the relative positions of the monocular cameras in a binocular vision system;
S02, training a lightweight Yolo v5 network; the native Yolo v5 network is an object detection algorithm usually used to detect the required object positions in a picture.
S03, manufacturing a practical training platform module position data set, wherein the module position data set comprises a first characteristic diagram and a second characteristic diagram of the practical training platform, shot through the binocular vision system;
S04, inputting the practical training platform module position data set into the lightweight Yolo v5 network and locating the target area of each module on the practical training platform;
S05, extracting the feature points of the target areas by a stereo matching method to obtain a first feature point set and a second feature point set;
S06, performing feature point matching on the first feature point set and the second feature point set through a Euclidean distance matching operation to obtain matching point pairs, and performing epipolar constraint checking on the matching point pairs;
S07, calculating the point cloud to be matched of the target area from the feature matching point pairs according to the triangulation principle;
and S08, performing a point cloud registration operation on the point cloud to be registered, comparing the registered point cloud with a preset template point cloud, acquiring the offset error of each installed module, and importing the offset errors into visualization software to output guided debugging information. The offset error comprises offset error data in six degrees of freedom.
In a specific embodiment, step S01 further includes the following steps:
S11, the monocular cameras in the binocular vision system first shoot a series of images containing a checkerboard calibration plate; in this embodiment, 15 pictures of the calibration plate at different angles are shot with each camera;
S12, the corner information in the checkerboard calibration plate images is searched for using the Harris corner detection method, with the specific formula:
E(u, v) = Σ_(x,y) w(x, y) · [I(x + u, y + v) − I(x, y)]²

where E(u, v) is the corner strength value, (x, y) are the image pixel coordinates, (u, v) are the sliding variables of the Harris corner sliding window, w(x, y) is the weighting function over the image pixel coordinates, I(x, y) is the pixel value of the image at point (x, y), and I(x + u, y + v) is the pixel value at point (x + u, y + v) within the sliding window.
S13, fitting the internal parameters and the external parameters of the left monocular camera and the right monocular camera according to the corner information;
As shown in fig. 5, the pinhole imaging model is the process model of linear camera imaging: light from a point P(X_C, Y_C, Z_C) on the object surface enters the camera lens and forms a projection point p(x, y), which gives the proportional relationship

x = f · X_C / Z_C,   y = f · Y_C / Z_C

This can be expressed in matrix form as

Z_C · [x, y, 1]^T = [f 0 0 0; 0 f 0 0; 0 0 1 0] · [X_C, Y_C, Z_C, 1]^T

Combining the above with the transformation from the world coordinate system to the camera coordinate system gives the relation between the world coordinates of point P and its projection coordinates on the image:

Z_C · [u, v, 1]^T = K · [R t] · X_w = M · X_w

where M is a 3 × 4 projection matrix, K contains the internal parameters related to the internal structure of the camera, [R t] contains the external parameters determined by the orientation of the camera relative to the world coordinate system, and X_w is the homogeneous coordinate of point P in the world coordinate system. The internal parameters K and external parameters [R t] are fitted by a least squares method from the multiple checkerboard calibration plate images that were shot.
And S14, calibrating and correcting the binocular vision system: camera coordinate conversion is performed on the images acquired by the left and right cameras through the intrinsic parameter matrix, and multiplication by a rotation matrix yields new coordinate systems for the left and right monocular cameras; the distortion of the left and right cameras is corrected through a distortion-removal operation, and epipolar line verification is performed on the images of the left and right monocular cameras.
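The following is a minimal sketch of this S11–S14 calibration and rectification flow using OpenCV; it substitutes OpenCV's checkerboard corner detector for a hand-written Harris search, and the image paths, board dimensions and square size are illustrative assumptions rather than values from this embodiment.

```python
import glob
import cv2
import numpy as np

BOARD = (9, 6)     # inner corners per checkerboard row/column (assumed)
SQUARE = 0.025     # square edge length in metres (assumed)

# 3-D coordinates of the checkerboard corners in the board's own frame
objp = np.zeros((BOARD[0] * BOARD[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2) * SQUARE

obj_pts, left_pts, right_pts, image_size = [], [], [], None
crit = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3)
for lf, rf in zip(sorted(glob.glob("calib/left_*.png")),
                  sorted(glob.glob("calib/right_*.png"))):
    gl = cv2.imread(lf, cv2.IMREAD_GRAYSCALE)
    gr = cv2.imread(rf, cv2.IMREAD_GRAYSCALE)
    okl, cl = cv2.findChessboardCorners(gl, BOARD)
    okr, cr = cv2.findChessboardCorners(gr, BOARD)
    if okl and okr:
        cl = cv2.cornerSubPix(gl, cl, (11, 11), (-1, -1), crit)
        cr = cv2.cornerSubPix(gr, cr, (11, 11), (-1, -1), crit)
        obj_pts.append(objp); left_pts.append(cl); right_pts.append(cr)
        image_size = gl.shape[::-1]

# S13: fit the intrinsic and distortion parameters of each monocular camera
_, K1, D1, _, _ = cv2.calibrateCamera(obj_pts, left_pts, image_size, None, None)
_, K2, D2, _, _ = cv2.calibrateCamera(obj_pts, right_pts, image_size, None, None)
# Fit the extrinsic parameters (R, T) relating the two cameras
_, K1, D1, K2, D2, R, T, _, _ = cv2.stereoCalibrate(
    obj_pts, left_pts, right_pts, K1, D1, K2, D2, image_size,
    flags=cv2.CALIB_FIX_INTRINSIC)
# S14: rectification gives the rotated coordinate systems in which epipolar
# lines become image rows, simplifying the later epipolar check
R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, D1, K2, D2, image_size, R, T)
```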
In a specific embodiment, step S03 further includes the following steps:
S31, manufacturing the practical training platform module position data set: the binocular vision system is erected directly above the practical training platform so that all modules on the platform lie within the cameras' field of view, and 2000 top views of the practical training platform are collected with each of the left and right monocular cameras. The images cover different states such as random module positions and different robot postures; the left monocular camera generates the first feature map and the right monocular camera generates the second feature map.
And S32, using a labeling tool, bounding boxes and class labels are drawn for the spraying module, welding module, storage rack module, turntable, drawing module and the like on the practical training platform.
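As a small illustration of one way the bounding boxes and class labels of S32 could be stored, the sketch below writes a single annotation in the standard YOLO txt convention (class index followed by the box centre and size normalised by the image dimensions); the class names and pixel values are assumed for illustration.

```python
CLASSES = ["spraying_module", "welding_module", "storage_rack", "turntable", "drawing_module"]

def yolo_label(cls_name, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel-space bounding box into one YOLO txt label line."""
    cx = (x1 + x2) / 2.0 / img_w   # box centre x, normalised
    cy = (y1 + y2) / 2.0 / img_h   # box centre y, normalised
    w = (x2 - x1) / img_w          # box width, normalised
    h = (y2 - y1) / img_h          # box height, normalised
    return f"{CLASSES.index(cls_name)} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

# e.g. a welding module occupying pixels (320, 180)-(560, 400) in a 1280x720 top view
print(yolo_label("welding_module", 320, 180, 560, 400, 1280, 720))
```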
As shown in fig. 2, the practical training platform contains many kinds of modules, and occlusion often occurs between them; for example, the robot's swing arm occludes different modules in different postures. To allow the binocular vision system to extract target features more accurately, this embodiment adds a dual attention module to the Yolo v5 model to handle the occlusion problem. In addition, the correction system has to be deployed on the industrial personal computer of the practical training platform, which, for reasons of integration cost, has a relatively low configuration, so the native Yolo v5 cannot meet the system's real-time requirements. This embodiment therefore improves the backbone and neck of Yolo v5 with a dual attention mechanism; the resulting lightweight Yolo v5 network improves both the recognition precision and the recognition rate for occluded objects. The dual attention module comprises a channel attention module and a position attention module, and the structure of the lightweight Yolo v5 network fused with the dual attention mechanism is shown in fig. 2.
In a preferred embodiment, the channel Attention module is shown in fig. 3. The channel Attention module itself has a Residual + Attention structure: Reshape and Transpose are applied to the feature map A to obtain the feature map R1 and the feature map RT1 respectively; R1 and RT1 are multiplied and passed through softmax to obtain the channel attention feature map X; X is multiplied with the feature map A, multiplied by a scale coefficient and Reshaped back to the original shape; and the result is finally added to the feature map A to obtain the output feature map E.
In a preferred embodiment, the position Attention module is shown in fig. 4. The position Attention module itself has a Residual + Attention structure: the feature map A is convolved with 3 convolution kernels respectively to obtain 3 feature maps B, C and D; the feature map B is Reshaped and Transposed and multiplied with the Reshaped feature map C, and softmax is applied to obtain the feature map S; S is multiplied with the Reshaped feature map D, multiplied by a scale coefficient and Reshaped back to the original shape; and the result is finally added to the feature map A to obtain the final output feature map E.
As shown in fig. 2, the dual attention module (DA) is added after the CBL layer of the backbone network so that the backbone does not lose detail in either the channel or the object-position dimension when extracting the main features. In the neck, a dual attention module (DA) is inserted after the CBL at each of the three scales, so that the object position information at every scale still preserves the low-level feature information under occlusion, and inference speed and accuracy are maintained without increasing the computational load of the network.
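A PyTorch sketch of such channel- and position-attention blocks (in the DANet style described above) is given below; the channel-reduction ratio, the learnable scale coefficients initialised to zero, and the example tensor size are assumptions rather than details taken from this embodiment.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel attention: Reshape/Transpose A, build a C x C attention map, residual add."""
    def __init__(self):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1))            # learnable scale coefficient

    def forward(self, a):                                    # a: (B, C, H, W)
        b, c, h, w = a.shape
        r1 = a.view(b, c, -1)                                # Reshape   -> (B, C, HW)
        rt1 = r1.permute(0, 2, 1)                            # Transpose -> (B, HW, C)
        x = torch.softmax(torch.bmm(r1, rt1), dim=-1)        # attention map X: (B, C, C)
        out = torch.bmm(x, r1).view(b, c, h, w)              # weight A by X, Reshape back
        return self.gamma * out + a                          # residual: add to A -> E

class PositionAttention(nn.Module):
    """Position attention: 1x1 convs give B, C, D; HW x HW map S; residual add."""
    def __init__(self, channels):
        super().__init__()
        self.conv_b = nn.Conv2d(channels, channels // 8, 1)
        self.conv_c = nn.Conv2d(channels, channels // 8, 1)
        self.conv_d = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))            # learnable scale coefficient

    def forward(self, a):                                    # a: (B, C, H, W)
        b, c, h, w = a.shape
        q = self.conv_b(a).view(b, -1, h * w).permute(0, 2, 1)   # B: (B, HW, C')
        k = self.conv_c(a).view(b, -1, h * w)                    # C: (B, C', HW)
        s = torch.softmax(torch.bmm(q, k), dim=-1)               # spatial map S: (B, HW, HW)
        v = self.conv_d(a).view(b, -1, h * w)                    # D: (B, C, HW)
        out = torch.bmm(v, s.permute(0, 2, 1)).view(b, c, h, w)
        return self.gamma * out + a                          # residual: add to A -> E

# e.g. a dual-attention (DA) block placed after a CBL layer of the backbone or neck
feat = torch.randn(1, 64, 40, 40)
da = nn.Sequential(PositionAttention(64), ChannelAttention())
print(da(feat).shape)   # torch.Size([1, 64, 40, 40])
```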
Specifically, in this embodiment, in step S05 a feature point extraction operation is performed on the target regions by the stereo matching method to obtain the first feature point set and the second feature point set. According to the region of each module in the images captured by the left and right monocular cameras of the binocular vision system obtained in step S03, a SURF feature point operator is used for stereo matching, and the SURF feature points found in each identified region are:
P_c^L = {p_c1^L, p_c2^L, …, p_cn^L},   P_c^R = {p_c1^R, p_c2^R, …, p_cm^R}

where P_c^L is the feature point set of the c-th target area module in the first feature map and p_ci^L is the i-th point in the first feature point set, and P_c^R is the feature point set of the c-th target area module in the second feature map and p_cj^R is the j-th point in the second feature point set.
In a specific embodiment, in step S06, Euclidean distance matching is first performed on the feature points of P_c^L and P_c^R to find the feature point pairs with the shortest Euclidean distance. The Euclidean distance matching formula is:

d_ij = ‖p_ci^L − p_cj^R‖ = sqrt( (x_ci^L − x_cj^R)² + (y_ci^L − y_cj^R)² )

where d_ij is the Euclidean distance from p_ci^L to p_cj^R, p_ci^L is the i-th point in the first feature point set and p_cj^R is the j-th point in the second feature point set. The best-matched feature point pairs in the c-th area are those that minimize this distance:

(p_ci^L, p_cj^R) = argmin_(i,j) d_ij

The matched pairs are then checked against the epipolar constraint. In the present embodiment the preferred threshold is 60 pixels: if the row (epipolar-line) pixel coordinates of the two matched feature points in the left and right images differ by more than 60 px, the matched point pair is rejected.
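A sketch of this S05–S06 matching step under the stated assumptions (SURF keypoints from an opencv-contrib build, brute-force L2 matching, and the 60-pixel row tolerance) might look as follows; the inputs are assumed to be the module regions located by the detector.

```python
import cv2

EPIPOLAR_TOL = 60  # pixels, the row tolerance used in this embodiment

def match_region(left_roi, right_roi):
    """SURF + Euclidean (L2) matching within one module's region, with a row check."""
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)   # needs opencv-contrib
    kp_l, des_l = surf.detectAndCompute(left_roi, None)        # first feature point set
    kp_r, des_r = surf.detectAndCompute(right_roi, None)       # second feature point set
    if des_l is None or des_r is None:
        return []

    # Euclidean-distance matching: nearest descriptor, cross-checked both ways
    matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
    matches = matcher.match(des_l, des_r)

    pairs = []
    for m in matches:
        xl, yl = kp_l[m.queryIdx].pt
        xr, yr = kp_r[m.trainIdx].pt
        # epipolar (row) constraint on rectified images: reject pairs whose rows
        # differ by more than the tolerance
        if abs(yl - yr) <= EPIPOLAR_TOL:
            pairs.append(((xl, yl), (xr, yr)))
    return pairs
```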
In step S07, according to the triangulation principle of the left and right cameras, the point cloud in the c-th area can be calculated from the feature matching point pairs; the point cloud calculation formula is:

z_p = f · d / (x^L − x^R)
x_p = d · x^L / (x^L − x^R)
y_p = d · y^L / (x^L − x^R)

where x_p, y_p and z_p are respectively the abscissa, ordinate and depth coordinate of point p in the c-th area, f is the focal length of the camera, d is the distance between the optical centers of the two cameras, x^L and x^R are the abscissas of the matching point pair, and y^L is the ordinate of the matching point pair.
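The triangulation above can be sketched as follows; the focal length and baseline values are illustrative placeholders, and the principal-point offset is omitted for brevity.

```python
import numpy as np

def pairs_to_cloud(pairs, f=1200.0, d=0.12):
    """Triangulate matched pairs ((xL, yL), (xR, yR)) into (x, y, z) points."""
    cloud = []
    for (xl, yl), (xr, _) in pairs:
        disparity = xl - xr
        if disparity <= 0:            # skip degenerate or mismatched pairs
            continue
        z = f * d / disparity         # depth coordinate
        x = d * xl / disparity        # abscissa
        y = d * yl / disparity        # ordinate
        cloud.append((x, y, z))
    return np.asarray(cloud, dtype=np.float64)
```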
In a specific embodiment, the point cloud registration of each module in step S08 further includes the following steps:
After the point cloud P_c(x, y, z) of the c-th area is obtained, the point cloud to be registered is registered using the truncated ICP method.
This algorithm differs from the traditional ICP method in that traditional ICP minimizes the sum of squared residuals over all point pairs, whereas truncated ICP minimizes the sum of squared residuals over only the first half of the point pairs after sorting them in ascending order of residual, treating the second half as outlier pairs and thereby reducing the influence of noise on the closed-form ICP solution.
The breakdown point of the truncated ICP estimator is determined by the number of point pairs n, the number of parameters m and the number of point pairs l that meet the truncation threshold; because m is smaller than n, the first of the candidate expressions is used to calculate the truncation coefficient, and when l = n/2 the breakdown point is close to 0.5.
Since ordinary least-squares estimation performs poorly when the proportion of correct point pairs is below 50%, the point pairs obtained in the first iteration still contain many wrong pairs even when the truncation coefficient is 0.5.
Therefore, in order to make the truncated ICP robust while still converging, this embodiment uses a truncation coefficient in every ICP iteration to handle erroneous point pairs: the matching point pairs meeting the requirement are selected with the truncation ratio, the rigid body transformation between the point sets is solved by a generalized least squares method, the template matching pairs are updated with the obtained transformation, the truncation ratio φ is updated, and point pairs meeting the requirement are selected again with the new truncation ratio; this process is repeated until convergence. The truncation ratio φ is updated as a function of the iteration count k and grows with the number of iterations, so that the proportion of mismatched pairs among all matched pairs becomes smaller and smaller.
The rigid transformation steps for solving the objective function by truncated ICP are as follows:
S81, constructing the residual metric function

E(R, t) = Σ_i ‖R · p_i + t − q_i‖²

summed over the point pairs retained by the truncation, where p_i is a point of the point cloud to be matched and q_i is its nearest neighbor in the template point cloud.
S82, accelerating the traversal with a K-D tree to obtain the nearest-neighbor point pairs between the template point cloud and the point cloud to be matched.
And S83, calculating the nearest neighbor point pair meeting the requirement by using the truncation coefficient.
And S84, obtaining rigid transformation according to the nearest point pairs meeting the requirement by using an SVD method.
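A compact sketch of this truncated (trimmed) ICP loop, steps S81–S84, is shown below; the monotone schedule used here to grow the truncation ratio φ with the iteration count is an assumption, not the embodiment's exact update formula.

```python
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(src, dst):
    """Closed-form least-squares rigid transform (SVD) mapping src onto dst."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    h = (src - mu_s).T @ (dst - mu_d)
    u, _, vt = np.linalg.svd(h)
    r = vt.T @ u.T
    if np.linalg.det(r) < 0:                  # avoid reflections
        vt[-1] *= -1
        r = vt.T @ u.T
    return r, mu_d - r @ mu_s

def trimmed_icp(cloud, template, iters=30, phi0=0.5, phi_max=0.9):
    tree = cKDTree(template)
    r_total, t_total = np.eye(3), np.zeros(3)
    moved = cloud.copy()
    for k in range(iters):
        # S82: nearest-neighbour pairs via the K-D tree
        dist, idx = tree.query(moved)
        # S83: keep the phi-fraction of pairs with the smallest residuals (ascending sort)
        phi = min(phi_max, phi0 + 0.02 * k)   # assumed monotone phi update
        keep = np.argsort(dist)[: max(3, int(phi * len(moved)))]
        # S84: SVD rigid transform from the retained pairs
        r, t = best_rigid_transform(moved[keep], template[idx[keep]])
        moved = moved @ r.T + t
        r_total, t_total = r @ r_total, r @ t_total + t
    return r_total, t_total
```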
In the process of assembling each module of the practical training equipment, binocular stereo vision is used to quickly and accurately determine the offset error of each installed module relative to the corresponding module of the reference-version practical training platform equipment, namely the offset error in six degrees of freedom (three translational and three rotational offsets), and the visualization software then guides the installation and debugging engineer in adjusting the relative positions on the platform.
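As one way to turn the registration result into the six-degree-of-freedom offset error fed to the visualization software, the sketch below takes the translations directly from the estimated transform and converts the rotation matrix to Euler angles; the xyz angle convention is an assumption.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def offset_error(r_total, t_total):
    """Six-degree-of-freedom offset: translations plus Euler angles of the rotation."""
    rx, ry, rz = Rotation.from_matrix(r_total).as_euler("xyz", degrees=True)
    dx, dy, dz = t_total
    return {"dx": dx, "dy": dy, "dz": dz, "rx": rx, "ry": ry, "rz": rz}

# e.g. with the (r_total, t_total) returned by the trimmed ICP sketch above
print(offset_error(np.eye(3), np.zeros(3)))
```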
According to the invention, the binocular vision system identifies and positions each module on the practical training platform quickly and with high precision to obtain the three-dimensional pose information of each module, and feeds back real-time position-correction information for each module during debugging to guide debugging personnel in correcting installation errors, thereby improving the debugging efficiency of the practical training platform, shortening the debugging time, reducing the debugging cost and achieving the goal of rapid shipment.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. A method for quickly debugging a practical training platform based on binocular stereo vision is characterized by comprising the following steps:
calibrating the binocular, namely calibrating and correcting the relative position of a monocular camera in a binocular vision system;
adding a double attention model into a Yolo v5 network, and designing a lightweight Yolo v5 network;
manufacturing a practical training platform module position data set, wherein the module position data set comprises a first characteristic diagram and a second characteristic diagram of a practical training platform, which are shot through a binocular vision system;
inputting the training platform module position data set into the lightweight Yolo v5 network, and positioning a target area of each module on the training platform;
performing feature point extraction operation on the target area through a stereo matching method to obtain a first feature point set and a second feature point set;
performing feature point matching on the first feature point set and the second feature point set through a Euclidean distance matching operation to obtain matching point pairs, and performing epipolar constraint checking on the matching point pairs;
calculating a point cloud to be matched of the target area by using a feature matching point pair calculation formula according to the triangulation principle;
and performing a point cloud registration operation on the point cloud to be registered, comparing the registered point cloud with a preset template point cloud, acquiring the offset error of each installed module, and importing the offset errors into visualization software to output guided debugging information.
2. The binocular stereo vision based training platform rapid debugging method of claim 1, wherein the binocular calibration further comprises:
a left monocular camera and a right monocular camera in the binocular vision system shoot a series of checkerboard calibration plate images;
searching for corner information in the checkerboard calibration plate images by using a Harris corner detection method;
fitting the internal parameters and the external parameters of the left monocular camera and the right monocular camera according to the corner information;
and carrying out camera coordinate conversion on the images acquired by the left and right monocular cameras through an internal reference matrix, multiplying by a rotation matrix to obtain new coordinate systems of the left and right monocular cameras, carrying out distortion correction on the left and right cameras through the distortion removal operation of the left and right cameras, and carrying out epipolar line verification on the images by the left and right monocular cameras.
3. The binocular stereo vision based training platform rapid debugging method of claim 1, wherein: the dual attention model includes a channel attention module and a location attention module;
adding the dual attention model to the backbone and neck of a Yolo v5 network to obtain the lightweight Yolo v5 network.
4. The binocular stereo vision based training platform rapid debugging method of claim 1, wherein the method comprises the following steps: performing a registration operation on the point clouds to be matched by using a truncated ICP (iterative closest point) method, wherein the point-pair residuals are sorted in ascending order and only the first half is used in the least-squares minimization;
selecting a matching point pair meeting the requirement by utilizing the truncation ratio;
solving rigid body transformation between the matching point sets by using a generalized least square method;
updating the truncation ratio using the rigid body transformation;
and selecting the matching point pairs meeting the requirements by using the updated truncation ratio.
5. The binocular stereo vision based training platform rapid debugging method of claim 1, wherein: the step of truncating the ICP to solve for the rigid transformation includes
Constructing a residual error measurement function;
accelerating and traversing by utilizing a K-D tree to obtain the nearest point pair of the template point cloud and the point cloud to be matched;
solving the nearest point pair meeting the requirement by utilizing the truncation coefficient;
the rigid transformation is found from the nearest-neighbor pairs that satisfy the requirements using the SVD method.
6. The binocular stereo vision based training platform rapid debugging method of claim 1, wherein the method comprises the following steps: the stereo matching method comprises the step of searching in the target area through an SURF feature point operator to obtain the first feature point set and the second feature point set.
7. The binocular stereo vision based training platform rapid debugging method of claim 1, wherein: designing the lightweight Yolo v5 network further includes collecting a data set of top-view sample images of the practical training platform to construct a sample data set, and training the lightweight Yolo v5 network with the sample data set.
8. The binocular stereo vision based training platform rapid debugging method of claim 3, wherein: the channel Attention module has a Residual + Attention structure; Reshape and Transpose are applied to the feature map A to obtain a feature map R1 and a feature map RT1 respectively; the feature map R1 and the feature map RT1 are multiplied and passed through softmax to obtain a channel attention feature map X; the feature map X is multiplied with the feature map A, multiplied by a scale coefficient and Reshaped back to the original shape; and the result is finally added to the feature map A to obtain an output feature map E.
9. The binocular stereo vision based training platform rapid debugging method of claim 3, wherein: the position Attention module has a Residual + Attention structure; the feature map A is convolved with 3 convolution kernels respectively to obtain 3 feature maps B, C and D; the feature map B is Reshaped and Transposed and multiplied with the Reshaped feature map C, and softmax is applied to obtain a feature map S; the feature map S is multiplied with the Reshaped feature map D, multiplied by a scale coefficient and Reshaped back to the original shape; and the result is finally added to the feature map A to obtain the final output feature map E.
10. The binocular stereo vision based training platform rapid debugging method of claim 1, wherein the method comprises the following steps: the training platform module position data set comprises:
taking a picture of the random placement position of the module; and
and shooting pictures of different postures of the robot on the practical training platform.
CN202211092343.6A 2022-09-08 2022-09-08 Binocular stereo vision-based quick debugging method for practical training platform Active CN115205356B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211092343.6A CN115205356B (en) 2022-09-08 2022-09-08 Binocular stereo vision-based quick debugging method for practical training platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211092343.6A CN115205356B (en) 2022-09-08 2022-09-08 Binocular stereo vision-based quick debugging method for practical training platform

Publications (2)

Publication Number Publication Date
CN115205356A 2022-10-18
CN115205356B 2022-12-30

Family

ID=83573482

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211092343.6A Active CN115205356B (en) 2022-09-08 2022-09-08 Binocular stereo vision-based quick debugging method for practical training platform

Country Status (1)

Country Link
CN (1) CN115205356B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150288947A1 (en) * 2014-04-03 2015-10-08 Airbus Ds Gmbh Position and location detection of objects
US20170046840A1 (en) * 2015-08-11 2017-02-16 Nokia Technologies Oy Non-Rigid Registration for Large-Scale Space-Time 3D Point Cloud Alignment
CN111243033A (en) * 2020-01-10 2020-06-05 大连理工大学 Method for optimizing external parameters of binocular camera
AU2020101932A4 (en) * 2020-07-16 2020-10-01 Xi'an University Of Science And Technology Binocular vision–based method and system for pose measurement of cantilever tunneling equipment
CN112734776A (en) * 2021-01-21 2021-04-30 中国科学院深圳先进技术研究院 Minimally invasive surgical instrument positioning method and system
CN114282649A (en) * 2021-12-14 2022-04-05 江苏省特种设备安全监督检验研究院 Target detection method based on bidirectional attention mechanism enhanced YOLO V5

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150288947A1 (en) * 2014-04-03 2015-10-08 Airbus Ds Gmbh Position and location detection of objects
US20170046840A1 (en) * 2015-08-11 2017-02-16 Nokia Technologies Oy Non-Rigid Registration for Large-Scale Space-Time 3D Point Cloud Alignment
CN111243033A (en) * 2020-01-10 2020-06-05 大连理工大学 Method for optimizing external parameters of binocular camera
AU2020101932A4 (en) * 2020-07-16 2020-10-01 Xi'an University Of Science And Technology Binocular vision–based method and system for pose measurement of cantilever tunneling equipment
CN112734776A (en) * 2021-01-21 2021-04-30 中国科学院深圳先进技术研究院 Minimally invasive surgical instrument positioning method and system
CN114282649A (en) * 2021-12-14 2022-04-05 江苏省特种设备安全监督检验研究院 Target detection method based on bidirectional attention mechanism enhanced YOLO V5

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JUN FU ET AL: "Dual Attention Network for Scene Segmentation", https://arxiv.org/abs/1809.02983 *
ZHANG QINGZHE ET AL: "Evaluation Method of Calibration Accuracy for Binocular Stereo Vision Based on Epipolar Constraint", Laser & Optoelectronics Progress *

Also Published As

Publication number Publication date
CN115205356B (en) 2022-12-30

Similar Documents

Publication Publication Date Title
CN108648240B (en) Non-overlapping view field camera attitude calibration method based on point cloud feature map registration
JP6685199B2 (en) System and method for combining machine vision coordinate spaces in a guided assembly environment
CN105740899B (en) A kind of detection of machine vision image characteristic point and match compound optimization method
CN111721259B (en) Underwater robot recovery positioning method based on binocular vision
CN112183171B (en) Method and device for building beacon map based on visual beacon
CN109859277A (en) A kind of robotic vision system scaling method based on Halcon
CN104036542B (en) Spatial light clustering-based image surface feature point matching method
CN113920205B (en) Calibration method of non-coaxial camera
CN107330927B (en) Airborne visible light image positioning method
CN111784655A (en) Underwater robot recovery positioning method
CN108492282B (en) Three-dimensional gluing detection based on line structured light and multitask cascade convolution neural network
CN115131268A (en) Automatic welding system based on image feature extraction and three-dimensional model matching
CN112767546B (en) Binocular image-based visual map generation method for mobile robot
CN112484746A (en) Monocular vision-assisted laser radar odometer method based on ground plane
CN111738971B (en) Circuit board stereoscopic scanning detection method based on line laser binocular stereoscopic vision
CN110363801B (en) Method for matching corresponding points of workpiece real object and three-dimensional CAD (computer-aided design) model of workpiece
CN110992416A (en) High-reflection-surface metal part pose measurement method based on binocular vision and CAD model
CN115222819A (en) Camera self-calibration and target tracking method based on multi-mode information reference in airport large-range scene
CN108180829B (en) Method for measuring target space orientation with parallel line characteristics
CN117197241B (en) Robot tail end absolute pose high-precision tracking method based on multi-eye vision
CN112785647A (en) Three-eye stereo image detection method and system
CN111415384B (en) Industrial image component accurate positioning system based on deep learning
CN115205356B (en) Binocular stereo vision-based quick debugging method for practical training platform
CN114140541B (en) Parameter calibration method of multi-line structured light weld tracking sensor
JP2008224641A (en) System for estimation of camera attitude

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant