CN115965690A - Binocular vision-based non-contact excavator operation posture learning and estimating method - Google Patents

Binocular vision-based non-contact excavator operation posture learning and estimating method

Info

Publication number
CN115965690A
Authority
CN
China
Prior art keywords
image
angle
shovel arm
bucket
angle1
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211722477.1A
Other languages
Chinese (zh)
Inventor
孙辉 (Sun Hui)
杨娇娇 (Yang Jiaojiao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Suzhou Automotive Research Institute of Tsinghua University
Original Assignee
Tsinghua University
Suzhou Automotive Research Institute of Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University and Suzhou Automotive Research Institute of Tsinghua University
Priority to CN202211722477.1A
Publication of CN115965690A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an end-to-end excavator operation attitude estimation method using deep learning, which achieves non-contact measurement of the attitude of the shovel arm and bucket with a binocular camera and a deep learning algorithm. Compared with contact-type measurement using an angle sensor and an IMU (inertial measurement unit), the method greatly reduces system cost and avoids the problem of sensors being easily damaged during excavator use. Compared with conventional methods that estimate angles from feature points alone, the proposed one-end-to-multi-end cascade enhances the stability of the neural network, and the segmented cascade structure gives the network a degree of interpretability.

Description

Binocular vision-based non-contact excavator operation posture learning and estimating method
Technical Field
The invention relates to the field of unmanned engineering machinery, in particular to a binocular vision-based non-contact type excavator operation posture learning and estimating method.
Background
With the increasing maturity of autonomous driving technology, well-known foreign construction-machinery companies such as Caterpillar and Komatsu have long been developing unmanned construction machinery; Komatsu's autonomous mining dump truck dispenses with the cab entirely and is already in practical use in Australia and Chile.
In automating excavator operation, detecting the attitude of the shovel arm and bucket is an indispensable part of closed-loop autonomous work. Angle measurement or estimation is currently performed mainly with angle sensors and IMUs (inertial measurement units). However, the excavator's working environment is harsh: the conventional direct-measurement approach cannot readily be applied to the bucket and is easily damaged. In addition, sensor installation and calibration are complex and costly.
Non-contact measurement techniques are therefore urgently needed. A machine-vision scheme based on a monocular camera is a common choice: the camera detects key points, and the angles are then back-computed by combining the key points with a calibration relation. But a monocular camera provides no scale information, effective constraints on the feature points are hard to find, and stability and reliability are difficult to achieve.
A binocular camera observes scale directly, and combining it with the excavator's own known physical dimensions as constraints makes this a feasible scheme.
Disclosure of Invention
The invention aims to: address the problems that the direct sensors (angle sensor/IMU) used in automated measurement of excavator operation attitude are expensive and easily damaged. It provides a non-contact excavator operation attitude estimation method based on a binocular camera and deep learning, which greatly reduces system cost and overcomes the short service life in harsh environments.
Addressing the instability and lack of interpretability of conventional methods that estimate attitude angles from a camera and feature points, the invention proposes a one-end-to-multi-end cascaded deep learning network. It makes full use of image information, binocular depth information and the excavator's physical-model information, integrating all three into the deep learning network; the cascade enhances the stability of the neural network, and the segmented cascade structure gives the network a degree of interpretability.
Addressing the complex pipelines of most machine-vision methods, the invention provides an end-to-end method that simplifies the overall workflow.
The technical scheme of the invention is as follows:
the method is characterized in that a wide-angle binocular camera mounted above the cab is used, the left and right cameras each acquire an image containing the shovel arm and bucket, and three angle values are obtained through a series of steps: an included angle1 between the bucket and a first shovel arm, an included angle2 between the first shovel arm and a second shovel arm, and an included angle3 between the second shovel arm and a third shovel arm; the processing comprises the following steps:
s01, defining a starting point as a point on the bucket, denoted P0, defining another 9 points as the joint points about which the shovel arm and bucket rotate, and defining the angles to be solved as angle1, angle2 and angle3;
s02, designing a one-end-to-multi-end cascaded deep learning network, wherein one end is the image and the other end splits into 3 branches: branch 1 predicts the rotation key points, branch 2 predicts the three angles, and branch 3 predicts the 3D positions; the 3 branches are progressive and mutually constraining, and the annotated image, the angle-sensor readings and the depth image respectively provide ground truth for the 3 branches, used to construct a loss that enables back-propagation;
s03, in the training stage, first taking the image signal from the left image and constructing a feature-point detection network comprising convolutional layers and fully-connected layers to predict the positions of 10 key feature points in the image, denoted P0' to P9', and comparing these 10 feature points with the annotated feature points P0 to P9 to produce a comparison error Ep; at this point the feature points could be mapped onto a projection plane by projective transformation to compute the attitude angles of the shovel arm and bucket directly; then combining the positions P0' to P9' with the binocular-measured depths at these feature points to form P0d' to P9d';
s04, after P0d' to P9d' are obtained, taking them as input and predicting the shovel-arm and bucket attitude angles angle1' to angle3' as output, adding 2 or more hidden layers to construct a second neural network, and comparing angle1' to angle3' with the ground-truth values angle1 to angle3 from the angle sensor to produce a comparison error Ea;
s05, after angle1' to angle3' are obtained, converting the angles into position information according to the real 3D model of the shovel arm and bucket; at this point, fusing the left and right camera images to compute a depth map, and comparing the depth-map information with the computed 3D position information to produce a comparison error Ed;
s06, in the inference stage, obtaining the predicted angles angle1' to angle3' from the left and right images using the trained model M.
Preferably, the loss in S02 is constructed as Loss = α·Ep + β·Ea + γ·Ed, where α, β and γ are the weights of the respective losses, determined experimentally.
Preferably, in S05 the left and right camera images are fused to compute the depth map; suitable methods include BM and SGM.
The invention has the advantages that:
1. The end-to-end excavator operation attitude estimation method using deep learning achieves non-contact measurement of the shovel-arm and bucket attitude with a binocular camera and a deep learning algorithm. Compared with contact-type measurement using an angle sensor and an IMU (inertial measurement unit), it greatly reduces system cost and avoids sensors being easily damaged during excavator use.
2. Compared with conventional methods that estimate angles from feature points alone, the proposed one-end-to-multi-end cascade enhances the stability of the neural network, and the segmented cascade structure gives the network a degree of interpretability.
Drawings
The invention is further described below with reference to the following figures and examples:
the accompanying drawings are included to provide a further understanding of the invention, are incorporated in and constitute a part of this specification, and together with the description, serve to explain the invention and not to limit the invention, in which:
FIG. 1 is an illustration of hardware installation positions, feature points and angles of a binocular vision-based non-contact excavator work attitude learning and estimation method;
FIG. 2 is the neural-network model construction process of the binocular vision-based non-contact excavator operation posture learning and estimation method.
Detailed Description
The invention provides a binocular vision-based non-contact excavator operation attitude learning and estimation method. As shown in FIG. 1, a wide-angle binocular camera mounted above the cab is used; the left and right cameras each acquire an image containing the shovel arm and bucket, and the three angle values angle1, angle2 and angle3 are obtained through a series of steps. The processing flow comprises the following steps:
And S01, defining a starting point as a point on the bucket, denoted P0, defining another 9 points as the joint points about which the shovel arm and bucket rotate, and defining the angles to be solved as angle1, angle2 and angle3, wherein each angle can be determined jointly by several points to improve robustness.
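Each joint angle defined in S01 can in principle be recovered from three adjacent key points by elementary vector geometry, which is why several points can jointly determine one angle. A minimal illustrative sketch (not the patented network; the function and point names here are our own):

```python
import numpy as np

def joint_angle(p_prev, p_joint, p_next):
    """Angle in degrees at p_joint, formed by the links
    p_joint->p_prev and p_joint->p_next."""
    v1 = np.asarray(p_prev, dtype=float) - np.asarray(p_joint, dtype=float)
    v2 = np.asarray(p_next, dtype=float) - np.asarray(p_joint, dtype=float)
    cos_a = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    # Clip guards against floating-point drift outside [-1, 1]
    return float(np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))))

# Links along the x- and y-axes meet at a right angle
a = joint_angle([1, 0, 0], [0, 0, 0], [0, 1, 0])  # 90.0
```

Averaging such estimates over several point triples is one simple way to gain the robustness the step mentions.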
S02, as shown in FIG. 2, an end-to-end deep learning approach greatly reduces algorithmic complexity while exploiting hidden features more fully through learning. To achieve end-to-end training and inference and to increase reliability, a one-end-to-multi-end cascaded deep learning network is designed: one end is the image, and the other end splits into 3 branches, where branch 1 predicts the rotation key points, branch 2 predicts the three angles, and branch 3 predicts the 3D positions. The 3 branches are progressive and mutually constraining, and the annotated image, the angle-sensor readings and the depth image respectively provide ground truth for the 3 branches, used to construct a loss that enables back-propagation.
S03, in the training stage, the image signal is first taken from the left image, and a feature-point detection network comprising convolutional layers and fully-connected layers is constructed to predict the positions of 10 key feature points in the image, denoted P0' to P9'; these 10 feature points are compared with the annotated feature points P0 to P9 to produce a comparison error Ep. At this point the feature points could be mapped onto a projection plane by projective transformation to compute the attitude angles of the shovel arm and bucket directly. However, such direct calculation has too few constraints and is prone to angular jitter. The positions P0' to P9' are therefore combined with the binocular-measured depths at these feature points to form P0d' to P9d'.
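Forming P0d' to P9d' amounts to lifting each detected pixel (u, v) to a 3D point with its measured depth through the pinhole model X = Z·K⁻¹·(u, v, 1)ᵀ. A sketch under assumed intrinsics (the values of fx, fy, cx, cy below are illustrative, not from the patent):

```python
import numpy as np

def lift_to_3d(uv, depth, fx, fy, cx, cy):
    """Back-project pixel coordinates (N, 2) with per-point depths (N,)
    to camera-frame 3D points (N, 3) via the pinhole model."""
    uv = np.asarray(uv, dtype=float)
    z = np.asarray(depth, dtype=float)
    x = (uv[:, 0] - cx) / fx * z
    y = (uv[:, 1] - cy) / fy * z
    return np.stack([x, y, z], axis=1)

# A detection at the principal point with 5 m depth lifts to (0, 0, 5)
pts = lift_to_3d([[320.0, 240.0]], [5.0], fx=600.0, fy=600.0, cx=320.0, cy=240.0)
```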
S04, after P0d' to P9d' are obtained, they are taken as input and the shovel-arm and bucket attitude angles angle1' to angle3' are predicted as output; 2 or more hidden layers are added to construct a second neural network. Meanwhile, each angle is determined by several feature points, and the neural network automatically synthesises their contributions. The values angle1' to angle3' are compared with the ground-truth values angle1 to angle3 from the angle sensor to produce a comparison error Ea.
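The second network in S04 is in effect a small multilayer perceptron mapping the 10 lifted points (30 scalars) to three angles. The patent does not specify layer sizes or training details; the forward-pass sketch below uses randomly initialised weights purely to show the data flow:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(x, layer_sizes=(30, 64, 64, 3)):
    """Forward pass of a fully-connected network with ReLU hidden layers.
    Weights are random here; in the patent they would be trained against
    the angle-sensor ground truth via the error Ea."""
    h = np.asarray(x, dtype=float)
    for i, (n_in, n_out) in enumerate(zip(layer_sizes[:-1], layer_sizes[1:])):
        w = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))
        b = np.zeros(n_out)
        h = h @ w + b
        if i < len(layer_sizes) - 2:  # ReLU on hidden layers only
            h = np.maximum(h, 0.0)
    return h  # three predicted angles angle1' to angle3'

angles = mlp_forward(np.zeros(30))  # 10 points x (x, y, z), flattened
```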
S05, after angle1' to angle3' are obtained, the angles can be converted into position information according to the real 3D model of the shovel arm and bucket; this step naturally fuses the 3D physical model with the deep learning network, providing a strong rule-based constraint. At this point, the left and right camera images are fused to compute a depth map; typical methods include BM (block matching) and SGM (semi-global matching). The depth-map information is compared with the computed 3D position information to produce a comparison error Ed.
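BM and SGM both search, per pixel, for the disparity d that best matches a left-image patch against a shifted right-image patch; depth then follows from Z = f·b/d. A toy one-dimensional block-matching sketch on synthetic rows (real implementations aggregate costs over full images; the focal length and baseline below are assumed values):

```python
import numpy as np

def match_disparity(left_row, right_row, x, half_win=2, max_disp=8):
    """Disparity at column x of the left row, found by minimising the
    sum of absolute differences over a small window (naive block matching)."""
    left_patch = left_row[x - half_win : x + half_win + 1]
    costs = []
    for d in range(max_disp + 1):
        right_patch = right_row[x - d - half_win : x - d + half_win + 1]
        costs.append(np.abs(left_patch - right_patch).sum())
    return int(np.argmin(costs))

# Synthetic pair: the right row is the left row shifted 3 px to the left
left = np.zeros(32)
left[14:17] = 1.0
right = np.roll(left, -3)
d = match_disparity(left, right, x=15)

fx, baseline = 600.0, 0.12       # assumed focal length (px) and baseline (m)
depth = fx * baseline / d        # Z = f * b / d (d must be > 0)
```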
S06, the loss is constructed as Loss = α·Ep + β·Ea + γ·Ed, where α, β and γ are the weights of the respective losses, determined experimentally.
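The composite loss is a plain weighted sum, so in training code it reduces to one line; the branch errors Ep, Ea and Ed would come from the three network heads. A sketch with placeholder weights (to be tuned experimentally, as the patent states):

```python
def total_loss(ep, ea, ed, alpha=1.0, beta=1.0, gamma=1.0):
    """Loss = alpha*Ep + beta*Ea + gamma*Ed: weighted sum of the keypoint
    error Ep, angle error Ea and 3D-position error Ed."""
    return alpha * ep + beta * ea + gamma * ed

# Example branch errors and weights (illustrative values only)
loss = total_loss(0.5, 0.2, 0.1, alpha=1.0, beta=2.0, gamma=4.0)
```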
S07, in the inference stage, the predicted angles angle1' to angle3' are obtained from the left and right images using only the trained model M.
The preferred embodiments of the present invention will be further described with reference to the accompanying drawings.
As shown in FIG. 1 and FIG. 2, an implementation of the end-to-end excavator operation attitude estimation method using deep learning is divided into two independent phases, training and inference, and may include the following steps:
a training stage:
s1, installing a wide-angle binocular camera at the top of a cab of the excavator, and enabling an optical axis of the camera to be parallel to an operation plane of the excavator. The angle sensor is mounted in place.
S2, use the binocular camera and the angle sensors to collect data under varying weather, illumination and operation postures, amounting to more than 20,000 frames.
And S3, distribute the collected images to annotators to label the key feature points P0 to P9, and compute the binocular disparity offline with an accurate algorithm to form disparity maps for later use.
And S4, form a training set from the left images, right images, disparity maps, angle values and feature-point annotations prepared in S2 and S3, construct the one-end-to-multi-end cascaded deep learning network, and start training; after some time the trained model M is obtained.
And (3) an inference stage:
S5, using the trained model M, take the left image, the right image and the disparity map as input, and predict the feature points P0 to P9 and the three angles angle1 to angle3.
While specific embodiments of the present invention have been described above, it should be understood that these are by way of example only and that numerous changes and modifications can be made to the embodiments without departing from the principles and spirit of the invention.
The above embodiments are merely illustrative of the technical ideas and features of the present invention, and the purpose of the embodiments is to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and not to limit the protection scope of the present invention. All modifications made according to the spirit of the main technical scheme of the invention are covered in the protection scope of the invention.

Claims (3)

1. A binocular vision-based non-contact excavator operation posture learning and estimating method, characterized in that a wide-angle binocular camera mounted above the cab is used, the left and right cameras each acquire an image containing the shovel arm and bucket, and three angle values are obtained through a series of steps: an included angle1 between the bucket and a first shovel arm, an included angle2 between the first shovel arm and a second shovel arm, and an included angle3 between the second shovel arm and a third shovel arm; the processing flow comprises the following steps:
s01, defining a starting point as a point on the bucket, denoted P0, defining another 9 points as the joint points about which the shovel arm and bucket rotate, and defining the angles to be solved as angle1, angle2 and angle3;
s02, designing a one-end-to-multi-end cascaded deep learning network, wherein one end is the image and the other end splits into 3 branches: branch 1 predicts the rotation key points, branch 2 predicts the three angles, and branch 3 predicts the 3D positions; the 3 branches are progressive and mutually constraining, and the annotated image, the angle-sensor readings and the depth image respectively provide ground truth for the 3 branches, used to construct a loss that enables back-propagation;
s03, in the training stage, first taking the image signal from the left image and constructing a feature-point detection network comprising convolutional layers and fully-connected layers to predict the positions of 10 key feature points in the image, denoted P0' to P9', and comparing these 10 feature points with the annotated feature points P0 to P9 to produce a comparison error Ep; at this point the feature points could be mapped onto a projection plane by projective transformation to compute the attitude angles of the shovel arm and bucket directly; then combining the positions P0' to P9' with the binocular-measured depths at these feature points to form P0d' to P9d';
s04, after P0d' to P9d' are obtained, taking them as input and predicting the shovel-arm and bucket attitude angles angle1' to angle3' as output, adding 2 or more hidden layers to construct a second neural network, and comparing angle1' to angle3' with the ground-truth values angle1 to angle3 from the angle sensor to produce a comparison error Ea;
s05, after angle1' to angle3' are obtained, converting the angles into position information according to the real 3D model of the shovel arm and bucket; at this point, fusing the left and right camera images to compute a depth map, and comparing the depth-map information with the computed 3D position information to produce a comparison error Ed;
s06, in the inference stage, obtaining the predicted angles angle1' to angle3' from the left and right images using the trained model M.
2. The binocular vision-based non-contact excavator work attitude learning and estimation method according to claim 1, wherein the loss in S02 is constructed as Loss = α·Ep + β·Ea + γ·Ed, where α, β and γ are the weights of the respective losses, determined experimentally.
3. The binocular vision-based non-contact excavator work posture learning and estimation method according to claim 1, wherein in S05 the left and right camera images are fused to compute the depth map, and suitable computation methods include BM and SGM.
CN202211722477.1A 2022-12-30 2022-12-30 Binocular vision-based non-contact excavator operation posture learning and estimating method Pending CN115965690A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211722477.1A CN115965690A (en) 2022-12-30 2022-12-30 Binocular vision-based non-contact excavator operation posture learning and estimating method


Publications (1)

Publication Number Publication Date
CN115965690A 2023-04-14

Family

ID=87361596

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211722477.1A Pending CN115965690A (en) 2022-12-30 2022-12-30 Binocular vision-based non-contact excavator operation posture learning and estimating method

Country Status (1)

Country Link
CN (1) CN115965690A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117197769A (en) * 2023-11-03 2023-12-08 江苏智能无人装备产业创新中心有限公司 Loader front image generation system and method based on bucket position observation
CN117197769B (en) * 2023-11-03 2024-01-26 江苏智能无人装备产业创新中心有限公司 Loader front image generation system and method based on bucket position observation

Similar Documents

Publication Publication Date Title
JP5832341B2 (en) Movie processing apparatus, movie processing method, and movie processing program
JP4889351B2 (en) Image processing apparatus and processing method thereof
EP1855247B1 (en) Three-dimensional reconstruction from an image sequence with outlier removal
WO2021035669A1 (en) Pose prediction method, map construction method, movable platform, and storage medium
US9892552B2 (en) Method and apparatus for creating 3-dimensional model using volumetric closest point approach
CN107845114B (en) Map construction method and device and electronic equipment
CN109903326B (en) Method and device for determining a rotation angle of a construction machine
KR102113068B1 (en) Method for Automatic Construction of Numerical Digital Map and High Definition Map
JP2018189636A (en) Imaging device, image processing method and program
CN115965690A (en) Binocular vision-based non-contact excavator operation posture learning and estimating method
CN116222543B (en) Multi-sensor fusion map construction method and system for robot environment perception
US6175648B1 (en) Process for producing cartographic data by stereo vision
CN110726413B (en) Multi-sensor fusion and data management method for large-scale SLAM
CN112556719A (en) Visual inertial odometer implementation method based on CNN-EKF
CN111623773A (en) Target positioning method and device based on fisheye vision and inertial measurement
CN111508026A (en) Vision and IMU integrated indoor inspection robot positioning and map construction method
CN113587934A (en) Robot, indoor positioning method and device and readable storage medium
Ji et al. Self-calibration of a rotating camera with a translational offset
Yu et al. Tightly-coupled fusion of VINS and motion constraint for autonomous vehicle
JPH10240934A (en) Object extractor
JP2020107336A (en) Method, device, and robot apparatus of improving robust property of visual inertial navigation system
KR102387717B1 (en) Method for inspection and diagnosis of facilities damage using photo album technique and drone photograph
CN113763481B (en) Multi-camera visual three-dimensional map construction and self-calibration method in mobile scene
CN113379850B (en) Mobile robot control method, device, mobile robot and storage medium
CN112344966B (en) Positioning failure detection method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination