CN115994937A - Depth estimation method and device and robot - Google Patents

Depth estimation method and device and robot

Info

Publication number
CN115994937A
Authority
CN
China
Prior art keywords
depth information
camera module
depth
image
robot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310281183.8A
Other languages
Chinese (zh)
Inventor
赖嘉骏
殷保才
李华清
张圆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
iFlytek Co Ltd
Original Assignee
iFlytek Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by iFlytek Co Ltd filed Critical iFlytek Co Ltd
Priority to CN202310281183.8A priority Critical patent/CN115994937A/en
Publication of CN115994937A publication Critical patent/CN115994937A/en
Pending legal-status Critical Current

Landscapes

  • Length Measuring Devices By Optical Means (AREA)

Abstract

The present application discloses a depth estimation method, a depth estimation device, and a robot. A first camera module and a depth sensor are provided on the front panel of the robot body, and an angle-adjustable second camera module is provided above the robot body. The application acquires a first image captured by the first camera module and a second image captured by the second camera module, calculates first depth information based on the first image and the second image, and fuses it with second depth information collected by the depth sensor to obtain fused depth information. Because the second image is captured by the angle-adjustable second camera module and is combined with the first image captured by the first camera module on the front panel, the first depth information can be calculated, so the second depth information collected by the depth sensor is supplemented, the completeness of the depth information is improved, denser point cloud information can be obtained, and the performance of functions that depend on the depth information is improved.

Description

Depth estimation method and device and robot
Technical Field
The present application relates to the field of man-machine interaction technologies, and in particular, to a depth estimation method, a depth estimation device, and a robot.
Background
With the development of sensors and the empowerment brought by AI, more and more intelligent robots are widely used in production and daily life, such as cleaning robots, transfer robots, companion robots, and the like.
Taking the cleaning robot as an example, existing cleaning robots are no longer limited to a simple cleaning function and can collect images of the surrounding environment through sensors to realize functions such as obstacle avoidance. A typical cleaning robot provides a non-contact obstacle avoidance function, and to achieve this function, the depth information of objects in front of the robot needs to be acquired by means of a depth sensor. Mainstream cleaning robots currently tend to adopt depth sensors based on the TOF principle. Because the number of light beams such a sensor can actively emit is limited, the density of the information (point cloud) it provides becomes increasingly sparse with distance, so the measured depth information is incomplete, which degrades the performance and effect of functions that depend on depth information (such as obstacle avoidance and detection).
Disclosure of Invention
In view of the foregoing, the present application has been proposed to provide a depth estimation method, apparatus, and robot to obtain more comprehensive depth information. The specific scheme is as follows:
In a first aspect, a depth estimation method is provided and applied to a robot, where a first camera module and a depth sensor are disposed on a front panel of the robot body and an angle-adjustable second camera module is disposed above the robot body. The method includes:
acquiring a first image acquired by the first camera module and acquiring a second image acquired by the second camera module;
calculating to obtain first depth information based on the first image and the second image;
and acquiring second depth information acquired by the depth sensor, and fusing the first depth information and the second depth information to obtain fused depth information.
In a second aspect, a depth estimation device is provided and applied to a robot, where a first camera module and a depth sensor are arranged on a front panel of the robot body and an angle-adjustable second camera module is arranged above the robot body. The device comprises:
the image acquisition unit is used for acquiring a first image acquired by the first camera module and acquiring a second image acquired by the second camera module;
a first depth information calculating unit, configured to calculate first depth information based on the first image and the second image;
and the depth information fusion unit is used for acquiring second depth information acquired by the depth sensor, and fusing the first depth information and the second depth information to obtain fused depth information.
In a third aspect, a robot is provided, comprising:
a robot body;
the first camera module and the depth sensor are arranged on the front panel of the robot body, the second camera module is arranged above the robot body, and the angle of the second camera module is adjustable;
the processor is used for acquiring a first image acquired by the first camera module, acquiring a second image acquired by the second camera module, calculating to obtain first depth information based on the first image and the second image, acquiring second depth information acquired by the depth sensor, and fusing the first depth information and the second depth information to obtain fused depth information.
By means of the above technical solution, the depth estimation method of the present application is applied to a robot in which a first camera module and a depth sensor are arranged on the front panel of the robot body and an angle-adjustable second camera module is arranged above the robot body. After the first image captured by the first camera module and the second image captured by the second camera module are acquired, first depth information can be calculated from the first image and the second image using a binocular depth estimation method and then fused with the second depth information collected by the depth sensor to obtain fused depth information. Clearly, by additionally arranging an angle-adjustable second camera module above the robot body to capture the second image and combining it with the first image captured by the first camera module on the front panel, the first depth information can be calculated, so that the second depth information collected by the depth sensor is supplemented, the completeness of the depth information is improved, denser point cloud information can be obtained, and the performance of subsequent functions that depend on the depth information is improved.
In addition, by arranging an angle-adjustable second camera module above the body, the robot of the present application can greatly extend the scanning field of view of the camera module; combined with the first camera module arranged on the front panel, the field of view for face recognition, home monitoring, and similar functions can be increased, so that the performance and effect of such functions are improved.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 is a side view of a robot as exemplified herein;
FIG. 2 is a front view of a robot as an example of the present application;
fig. 3 is a schematic flow chart of a depth estimation method according to an embodiment of the present application;
FIG. 4 illustrates a schematic view of respective angles of view of first and second camera modules;
fig. 5 is a schematic flow chart of a scanning depth estimation method according to an embodiment of the present application;
FIG. 6 illustrates a schematic diagram of a binocular vision system;
fig. 7 is a schematic structural diagram of a depth estimation device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
The depth estimation method of the present application is applied to a robot and can estimate the depth information of the environment in which the robot is located, thereby helping to improve the performance of functions that depend on the depth information.
Robots include, but are not limited to: cleaning robots, transfer robots, companion robots, and the like. Taking a cleaning robot as an example, functions such as map construction, obstacle avoidance, and detection can be realized by acquiring depth information.
Existing robots rely only on a TOF-type depth sensor to acquire the depth information of objects in front of them; the density of the information (point cloud) provided by a TOF-type depth sensor gradually becomes sparse with distance, so the measured depth information is incomplete. To compensate for this, an embodiment of the present application provides a robot arrangement in which a first camera module 101 and a depth sensor 102 are arranged on the front panel of the robot body 100, and a second camera module 103 whose angle (i.e., installation angle, also referred to as tilt angle; hereinafter simply referred to as angle) is adjustable is arranged above the robot body 100. A cleaning robot is illustrated in fig. 1 and 2, but the arrangement is equally applicable to other types of robots.
The first camera module 101 and the second camera module 103 may be RGB cameras. The first camera module 101 may be fixedly mounted, while the second camera module 103 may be mounted with an adjustable angle. Since the second camera module 103 is arranged above the robot body, it may be configured as an upward-looking camera, and its field angle may be smaller than that of the first camera module 101.
The depth sensor 102 may be a TOF-type depth sensor, such as an ITOF depth sensor. Other types of depth sensors, such as structured-light or laser depth sensors, are also possible. An ITOF depth sensor may be preferred in embodiments of the present application to balance performance and price.
Based on the above-mentioned robot structure, the present application provides a depth estimation method, which may be applied to a controller of a robot, where the controller may be disposed on a robot body, or may be other terminal devices, cloud end, server, etc. that communicate with the robot. Next, a depth estimation method of the present application will be described, and as described in connection with fig. 3, the depth estimation method may include the following steps:
step S100, acquiring a first image acquired by the first camera module and acquiring a second image acquired by the second camera module.
Specifically, the fields of view of the first camera module and the second camera module may be different, but the first image and the second image they capture contain at least some objects in a common field of view, so that in the subsequent steps the first depth information can be calculated based on the first image and the second image using a binocular depth estimation method.
Step S110, calculating first depth information based on the first image and the second image.
Specifically, a binocular depth estimation method may be adopted: the depth of each pixel is estimated from the parallax between the first image and the second image, so as to obtain the first depth information.
And step S120, acquiring second depth information acquired by the depth sensor, and fusing the first depth information and the second depth information to obtain fused depth information.
Specifically, the depth sensor may collect a depth point cloud image as the second depth information. A single depth sensor is susceptible to environmental and algorithmic effects, which can make the second depth information incomplete; for example, for an ITOF-type depth sensor, the density of the information (point cloud) it provides gradually becomes sparse with distance, so the measured depth information is incomplete. Therefore, in this step, the second depth information is fused with the first depth information calculated in the previous step to obtain the fused depth information, which is used as the final depth information estimation result.
It will be appreciated that the process of acquiring the second depth information acquired by the depth sensor in this embodiment is not limited to the execution sequence illustrated in fig. 3, and may be performed at any time before the first depth information and the second depth information are fused.
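Putting steps S100 to S120 together, the following minimal sketch illustrates the flow of fig. 3; the robot interface names and the helper functions for binocular estimation and fusion are hypothetical placeholders, not part of the original disclosure:

```python
# Top-level sketch of the depth estimation flow of fig. 3; the robot API names
# (capture_first_image, capture_second_image, read_depth_sensor) and the
# helper functions passed in are hypothetical placeholders.
def estimate_depth(robot, compute_first_depth, fuse):
    first_image = robot.capture_first_image()      # step S100: front-panel camera
    second_image = robot.capture_second_image()    # step S100: angle-adjustable camera
    first_depth = compute_first_depth(first_image, second_image)  # step S110: binocular estimation
    second_depth = robot.read_depth_sensor()       # depth point cloud from the depth sensor
    return fuse(first_depth, second_depth)         # step S120: fused depth information
```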
The above depth estimation method is applied to a robot in which a first camera module and a depth sensor are arranged on the front panel of the robot body and an angle-adjustable second camera module is arranged above the robot body. After the first image captured by the first camera module and the second image captured by the second camera module are acquired, the first depth information can be calculated from the first image and the second image using a binocular depth estimation method and then fused with the second depth information collected by the depth sensor to obtain the fused depth information. Clearly, by additionally arranging an angle-adjustable second camera module above the robot body to capture the second image and combining it with the first image captured by the first camera module on the front panel, the first depth information can be calculated, so that the second depth information collected by the depth sensor is supplemented, the richness of the depth information is improved, denser point cloud information can be obtained, and the performance of subsequent functions that depend on the depth information is improved.
In addition, by arranging the angle-adjustable second camera module 103 above the body, the robot of the present application can greatly extend the scanning field of view of the second camera module 103; combined with the first camera module 101 arranged on the front panel, the field of view for face recognition and home monitoring can be increased, so that the performance and effect of such functions are improved.
Based on the depth estimation method provided by the embodiments of the present application, after the fused depth information is obtained, other functions that depend on the depth information can be realized, such as map construction, obstacle avoidance, and recognition and detection.
For the mapping process, the map information of the place where the robot is located can be determined based on the fused depth information obtained in the previous step, so as to realize the mapping function.
For the obstacle avoidance process, the obstacle around the robot can be identified based on the fused depth information obtained in the previous step, and the obstacle avoidance function can be realized by controlling the robot to avoid the obstacle in the travelling process.
For the recognition and detection process, the gesture or the gesture of the user can be recognized based on the fused depth information obtained in the previous step, and the running or working mode of the robot is controlled according to the gesture or the gesture. For example, a target ground area pointed by a user gesture is identified and the robot is controlled to travel toward or evade the target ground area.
Optionally, in some low-light environments the images captured by the first and second camera modules may be unclear, so that the first depth information cannot be calculated from them. Therefore, after the first image and the second image are acquired in step S100, a step of checking the validity of the images may be added, for example checking whether the sharpness of the first image and the second image meets a set requirement. The process of step S110 is performed only when the images meet the set requirement; otherwise, the second depth information collected by the depth sensor is used directly as the final depth information.
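As one possible way to implement such a validity check (the patent does not specify a particular method), the sketch below uses the variance of the Laplacian as a sharpness measure; both the measure and the threshold value are illustrative assumptions.

```python
# Hedged sketch of an image-validity (sharpness) check; the Laplacian-variance
# measure and the threshold are assumptions, not specified in this disclosure.
import cv2

def image_is_valid(image_bgr, sharpness_threshold=100.0):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()  # variance of the Laplacian as a focus measure
    return sharpness >= sharpness_threshold            # unclear (e.g. dark or blurred) images fail
```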
Based on the above robot structure, the angle of the second camera module 103 is adjustable, so the angle of the second camera module 103 can be controlled and adjusted to realize scanning-type image capture and obtain a plurality of second images at different angles.
Specifically, the angle adjustment step length of the second camera module can be preset, the angle of the second camera module is adjusted according to the set step length, and after each angle adjustment, the second image acquired by the second camera module is acquired, so that the second images under a plurality of different angles can be obtained.
The second camera module has an adjustable angle range, for example 45 ° to 0 °. Therefore, the above-mentioned process of adjusting the angle of the second camera module according to the set step may be that the angle of the second camera module is adjusted according to the set step within the adjustable angle range of the second camera module. When the camera is started initially, the second camera module can be adjusted to a set angle value, such as a maximum or minimum adjustable angle value, and the angle of the second camera module is further adjusted gradually within the adjustable angle range according to the set step length.
As an alternative example, assuming that the adjustable angle range of the second camera module is 45° to 0°, the angle of the second camera module may be adjusted to 45° at the initial moment and then gradually reduced in steps of 2° until it reaches 0°, thereby completing one scanning pass of second-image collection, as shown in fig. 4:
the second camera module illustrated in fig. 4 is denoted by C, and the first camera module is denoted by B. The field angle of the second camera module C is smaller than the field angle of the first camera module B.
The angle of the second camera module C is adjustable, the angle of the first camera module B is fixed, and the angle of view is also fixed. At the initial moment, the angle of the second camera module C can be adjusted to the maximum value, and then the angle of the second camera module C is gradually reduced according to the set adjustment step length until the minimum angle position is reached, so that the scanning type image acquisition process of the second camera module C is realized.
By providing an angle-adjustable second camera module, an accurate angle value of the second camera module can always be obtained, which avoids the situation where loosening of the camera structure makes it impossible to estimate the change in the relative extrinsic parameters between the second camera module and the first camera module (the extrinsic parameters describe the relative pose of the first and second camera modules and are needed when calculating the first depth information). In addition, the angle-adjustable design further enriches the usage scenarios of the second camera module, for example by increasing the field of view for face recognition and home monitoring.
Based on the above, step S110, the process of calculating the first depth information based on the first image and the second image may include:
and calculating to obtain first depth information based on the first image and the second image obtained after each angle adjustment.
Since there are a plurality of second images, there may be a plurality of first depth information calculated, and in step S120, each first depth information and the second depth information may be fused to obtain fused depth information.
Referring to fig. 5, a flowchart of a method for scanning depth estimation is illustrated, comprising the steps of:
step 200, a first image acquired by a first camera module is acquired.
Step S210, adjusting the angle of the second camera module.
Specifically, the angle can be adjusted step by step in a scanning manner within the adjustable angle range of the second camera module according to a set step. The one-round scanning process can be understood as that the adjusted angle covers all the adjustable angle ranges of the second camera module.
Step S220, a second image acquired by the second camera module is acquired.
Specifically, after the angle of the second camera module is adjusted each time, a second image acquired by the second camera module under the current angle is acquired.
Step S230, determining whether the image is valid, if so, executing step S240, and if not, executing step S270.
Specifically, in some low-light environments the images captured by the first and second camera modules may be unclear, so that the first depth information cannot be calculated later. Therefore, in this step, a validity judgment can be made on the first image and the second image. If either the first image or the second image fails the validity check, the flow may jump directly to step S270; in that case, since the first depth information cannot be obtained, the second depth information may be used directly as the final fused depth information. If both the first image and the second image pass the validity judgment, step S240 is entered.
Step S240, calculating first depth information based on the first image and the second image.
Step S250, determining whether the second image capturing module finishes scanning, if yes, executing step S270, and if not, returning to execute step S210.
Specifically, the second camera module has an adjustable angle range, and the angle of the second camera module is gradually adjusted according to the set step in step S210. In this step, it is determined whether all the adjusted angles of the second camera module cover all the adjustable angle ranges, that is, whether a round of scanning process is completed, if yes, the process of depth information fusion in the next step S270 may be performed, otherwise, the process may return to step S210 to continue angle adjustment.
Step S260, second depth information acquired by the depth sensor is acquired.
It is to be understood that the execution sequence of step S260 is not limited to that shown in fig. 5, and the step may be executed at any time before step S270.
Step S270, fusing the first depth information and the second depth information to obtain fused depth information.
Specifically, the first depth information obtained in each cycle of a round of scanning by the second camera module can be fused with the second depth information to obtain the fused depth information.
In an alternative mode, the pieces of first depth information obtained in each cycle of a round of scanning can first be fused with one another to obtain fused first depth information, and the result is then fused with the second depth information to obtain the final depth information.
Alternatively, after the first depth information is obtained by the first calculation, the first depth information can be directly fused with the second depth information, and the fusion result is sequentially fused with the first depth information obtained by each subsequent cycle calculation until the final depth information is obtained after the first depth information obtained by the last cycle calculation is fused.
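The scanning flow of fig. 5 can be sketched as follows, using the incremental fusion alternative just described; the robot interface names (set_camera_angle, capture_first_image, capture_second_image, read_depth_sensor), the angle range, and the step size are illustrative assumptions rather than the actual implementation of this application.

```python
# Sketch of the scanning depth-estimation loop (fig. 5); the robot API names
# and the helper functions passed in are hypothetical placeholders.
def scanning_depth_estimation(robot, compute_first_depth, fuse, is_valid,
                              angle_max=45.0, angle_min=0.0, step=2.0):
    first_image = robot.capture_first_image()            # step S200
    second_depth = robot.read_depth_sensor()             # step S260 (any time before fusion)
    fused = second_depth                                  # start from the sensor's depth map

    angle = angle_max
    while angle >= angle_min:                             # one round covers the adjustable range
        robot.set_camera_angle(angle)                     # step S210: adjust by the set step
        second_image = robot.capture_second_image()       # step S220
        if is_valid(first_image) and is_valid(second_image):               # step S230
            first_depth = compute_first_depth(first_image, second_image)   # step S240
            fused = fuse(first_depth, fused)              # incremental fusion (step S270 alternative)
        angle -= step                                     # step S250: continue until scanning finishes
    return fused                                          # fused depth information
```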
According to the above scanning-type depth information estimation method, second images at different angles are captured by the angle-adjustable second camera module, each second image is combined with the first image captured by the first camera module to calculate first depth information, and the first depth information is then fused with the second depth information collected by the depth sensor. This increases the amount of information in the finally obtained depth information and makes the depth information more complete.
In some embodiments of the present application, the process of calculating the first depth information based on the first image and the second image in step S110 in the foregoing embodiments is further described.
In this embodiment, the first depth information may be calculated by using the first image and the second image based on a binocular vision estimation algorithm, and the specific process may include:
s1, determining parallax of the same feature point on the first image and the second image through a feature point matching mode.
S2, calculating first depth information of the same feature points according to the parallax, the internal and external parameters of the first image pickup module and the internal and external parameters of the second image pickup module when the second image is acquired.
Next, a specific calculation process of the above steps S1 to S2 will be described with reference to a schematic diagram of the binocular vision system illustrated in fig. 6.
As shown in fig. 6:
The projection centers of the left and right cameras are O_L and O_R respectively, and the distance between them is b, also called the baseline. The imaging point of an arbitrary point P in three-dimensional space in the left camera is P_L, and its imaging point in the right camera is P_R. According to the principle of rectilinear light propagation, point P lies at the intersection of the lines connecting each camera's projection center with its corresponding imaging point. The line segment of length L above O_L is the imaging plane of the left camera; X_L is the distance from the imaging point of the left camera to the left edge of its imaging plane, and X_R is the distance from the imaging point of the right camera to the left edge of its imaging plane. The perpendicular distance from point P to the baseline is denoted Z, and the focal length of the left and right cameras is denoted f.
The disparity of point P between the left and right cameras can be defined as d:
d = |X_L - X_R|
The distance between the two imaging points P_L and P_R is:
P_L P_R = b - (X_L - L/2) - (L/2 - X_R) = b - (X_L - X_R)
From similar triangles it can be derived that:
(b - (X_L - X_R)) / b = (Z - f) / Z
so the distance Z from point P to the baseline is:
Z = f · b / (X_L - X_R) = f · b / d
from the above equation, when the parallax d, the base line length b, and the camera focal length f of the same feature point on two images are known, the depth information of the feature point can be calculated.
Of course, fig. 6 illustrates an ideal model and does not consider the case where the optical axes of the two cameras are not parallel. When the relative pose of the two cameras changes, the depth information of the feature points can still be calculated based on the parallax and the internal and external parameters of the two cameras. This process can be derived from triangulation principles and is not described in detail herein.
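For the ideal model of fig. 6, the relationship Z = f · b / d can be illustrated with the short sketch below; the focal length, baseline, and disparity values are placeholder numbers, and image rectification and the non-parallel case discussed above are not handled.

```python
# Sketch of depth from disparity for the ideal binocular model, Z = f * b / d.
import numpy as np

def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    disparity = np.asarray(disparity_px, dtype=np.float64)
    depth = np.full_like(disparity, np.nan)
    valid = disparity > 0                        # zero disparity: no match / point at infinity
    depth[valid] = focal_length_px * baseline_m / disparity[valid]
    return depth                                 # depth in metres per matched point or pixel

# Example: f = 700 px, b = 0.06 m, d = 14 px  ->  Z = 3 m.
print(depth_from_disparity([[14.0]], 700.0, 0.06))
```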
In some embodiments of the present application, the process of fusing the first depth information and the second depth information to obtain the fused depth information in the step S120 is described.
The first depth information and the second depth information are obtained in different ways. Since the first depth information is calculated from the first image and the second image using a binocular vision estimation algorithm, it may contain errors when the first and second images are affected by, for example, a low-light environment. Therefore, the second depth information can be used as the primary source and the first depth information as the auxiliary source, and the two are fused to obtain the final depth information.
In this embodiment, a fusion mode is described, and the specific process is as follows:
for the null coordinate point of the missing depth value in the second depth information:
and determining a target depth value of the null coordinate point based on the depth values of the coordinate points around the null coordinate point in the second depth information and the depth values corresponding to the null coordinate point in the first depth information.
And further, supplementing second depth information by using the target depth value of the null coordinate point to obtain fused depth information.
That is, for coordinate points that have a depth value in the second depth information, the fused depth value remains unchanged and the value from the second depth information is kept. For null coordinate points (or holes) that have no depth value in the second depth information, the target depth value can be determined by weighting, using the depth values of the coordinate points around the null coordinate point in the second depth information and the depth value corresponding to the null coordinate point in the first depth information.
For the weighted calculation process of the target depth value of the null coordinate point, the following steps may be referred to:
s1, calculating the ratio of the depth value of the neighbor coordinate point to the distance from the neighbor coordinate point to the null coordinate point for each neighbor coordinate point in a set range around the null coordinate point in the second depth information.
Specifically, the number of neighbor coordinate points in a set range around a coordinate point of a hollow value in the second depth information is defined as N, and the depth value of the ith neighbor coordinate point is defined as D i The distance from the ith neighbor coordinate point to the null coordinate point is
Figure SMS_3
The ratio calculated in this step is expressed as: />
Figure SMS_4
S2, each neighboring coordinate point is processedCorresponding ratio is added to obtain the reference depth value of the null coordinate point
Figure SMS_5
Figure SMS_6
And S3, weighting the depth value corresponding to the null coordinate point in the first depth information by using a first weight to obtain a first weighted result, weighting the reference depth value of the null coordinate point by using a second weight to obtain a second weighted result, and adding the first weighted result and the second weighted result to obtain the target depth value of the null coordinate point.
Specifically, defining a depth value corresponding to the null coordinate point in the first depth information to be expressed as
Figure SMS_7
The target depth value of the null coordinate point may be expressed as: />
Figure SMS_8
Wherein, the liquid crystal display device comprises a liquid crystal display device,
Figure SMS_9
representing a first weight,/->
Figure SMS_10
Representing the second weight.
Typically, the first weight
Figure SMS_11
Can be greater than the second weight +.>
Figure SMS_12
For example, let' s>
Figure SMS_13
The value is 0.8%>
Figure SMS_14
The value is 0.2.
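A minimal sketch of the hole-filling fusion described above is given below. It assumes the first and second depth maps are already aligned to the same pixel grid (which this embodiment does not detail), treats a value of 0 as a missing depth, and only fills a hole when the first depth information has a value there; the neighborhood radius is an assumption, while the weights follow the 0.8/0.2 example above.

```python
# Sketch of the weighted hole-filling fusion of the first (binocular) and second
# (depth sensor) depth maps; alignment, the 0-means-missing convention, and the
# neighborhood radius are assumptions not specified in this embodiment.
import numpy as np

def fuse_depth(first_depth, second_depth, radius=2, w1=0.8, w2=0.2):
    fused = second_depth.copy()
    h, w = second_depth.shape
    for y in range(h):
        for x in range(w):
            if second_depth[y, x] > 0:            # valid sensor value: keep it unchanged
                continue
            if first_depth[y, x] <= 0:            # no binocular value either: leave the hole
                continue
            ref = 0.0                             # reference value D_s = sum of D_i / r_i
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if (dy, dx) == (0, 0) or not (0 <= ny < h and 0 <= nx < w):
                        continue
                    d_i = second_depth[ny, nx]
                    if d_i > 0:
                        ref += d_i / np.hypot(dy, dx)
            fused[y, x] = w1 * first_depth[y, x] + w2 * ref   # target depth of the null point
    return fused
```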
Of course, the above embodiment only exemplifies one optional way of fusing the first depth information and the second depth information. Other fusion methods may be used instead, for example directly adding the depth values of the same coordinate point in the first depth information and the second depth information; further fusion methods are not listed one by one in this embodiment.
By fusing the first depth information and the second depth information, more complete fused depth information can be obtained.
In the embodiments described above, the depth estimation method of the robot is described, and the first depth information is calculated by means of the second camera module with the adjustable angle arranged above the robot body and matched with the first camera module, so that the first depth information is fused with the second depth information acquired by the depth sensor, and the integrity of the fused depth information is improved.
On this basis, by arranging the angle-adjustable second camera module above the robot body and combining it with the first camera module arranged on the front panel, more functions can be realized.
In an alternative scenario, a face above the robot can be recognized by taking advantage of the adjustable angle of the second camera module. Specifically, when a face recognition instruction is detected, the angle of the second camera module can be adjusted so that the field angle of the second camera module is aimed at the user's head, and face recognition processing is performed based on the second image captured by the second camera module.
Compared with the fixed-angle first camera module arranged on the front panel, the second camera module can capture images over a larger field of view above the robot by adjusting its angle, which improves the accuracy of face recognition.
In another alternative scenario, for tasks such as site monitoring, the monitoring can be completed based on images collected by the first camera module and the second camera module at the same time. Specifically, after a site monitoring instruction is detected, the first image captured by the first camera module and the second image captured by the second camera module are combined into a monitoring image set.
Because the mounting positions and field angles of the first camera module and the second camera module are different, the fields of view of the two captured images are not identical. Aggregating the first image and the second image into a monitoring image set enlarges the field of view of the site monitoring, so that the user can see monitoring images over a larger field of view.
It can be understood that in this embodiment, only two scenes of face recognition and place monitoring are taken as examples for explanation, and for other scenes, different scene tasks can be better adapted by selecting the first camera module and/or the second camera module.
The depth estimation device provided in the embodiments of the present application will be described below, and the depth estimation device described below and the depth estimation method described above may be referred to correspondingly to each other.
The depth estimation device of the present embodiment may be applied to a robot whose structure is described with reference to the foregoing. Referring to fig. 7, fig. 7 is a schematic structural diagram of a depth estimation device according to an embodiment of the present application.
As shown in fig. 7, the apparatus may include:
an image acquisition unit 11, configured to acquire a first image acquired by the first camera module and acquire a second image acquired by the second camera module;
a first depth information calculating unit 12 for calculating first depth information based on the first image and the second image;
and the depth information fusion unit 13 is used for acquiring second depth information acquired by the depth sensor, and fusing the first depth information and the second depth information to obtain fused depth information.
Optionally, the process in which the image acquisition unit acquires the second image acquired by the second camera module may include:
and adjusting the angle of the second camera module according to a set step length, and acquiring a second image acquired by the second camera module after each angle adjustment to obtain a plurality of second images under different angles. On the basis, the process of calculating the first depth information by the first depth information calculating unit based on the first image and the second image may include:
and calculating to obtain first depth information based on the first image and the second image obtained after each angle adjustment.
Optionally, the process in which the image acquisition unit adjusts the angle of the second camera module according to the set step length may include:
and adjusting the angle of the second camera module according to a set step length within the adjustable angle range of the second camera module.
Optionally, the process of calculating the first depth information by the first depth information calculating unit based on the first image and the second image may include:
determining parallax of the same feature point on the first image and the second image in a feature point matching mode;
and calculating first depth information of the same feature point according to the parallax, the internal and external parameters of the first camera module and the internal and external parameters of the second camera module when the second image is acquired.
Optionally, the process of the above depth information fusion unit fusing the first depth information and the second depth information to obtain fused depth information may include:
for the null coordinate points of the missing depth values in the second depth information:
determining a target depth value of the null coordinate point based on the depth values of the coordinate points around the null coordinate point in the second depth information and the depth values corresponding to the null coordinate point in the first depth information;
and supplementing the second depth information by using the target depth value of the null coordinate point to obtain the fused depth information.
Optionally, the process of determining, by the depth information fusion unit, the target depth value of the null coordinate point based on the depth values of coordinate points around the null coordinate point in the second depth information and the depth values corresponding to the null coordinate point in the first depth information may include:
for each neighbor coordinate point in a set range around the null coordinate point in the second depth information, calculating the ratio of the depth value of the neighbor coordinate point to the distance from the neighbor coordinate point to the null coordinate point;
adding the ratio corresponding to each neighbor coordinate point to obtain a reference depth value of the null coordinate point;
and weighting the depth value corresponding to the null coordinate point in the first depth information by using a first weight to obtain a first weighted result, weighting the reference depth value of the null coordinate point by using a second weight to obtain a second weighted result, and adding the first weighted result and the second weighted result to obtain the target depth value of the null coordinate point.
Optionally, the apparatus of the present application may further include: and the face recognition unit is used for adjusting the angle of the second camera module after the face recognition instruction is detected so that the field angle of the second camera module is aligned to the head of the user, and performing face recognition processing based on the second image acquired by the second camera module.
Optionally, the apparatus of the present application may further include: and the place monitoring unit is used for combining the first image acquired by the first camera module and the second image acquired by the second camera module into a monitoring image set after detecting the place monitoring instruction.
In some embodiments of the present application, a robot is also provided, which may be a cleaning robot, a handling robot, or a companion robot, among many different types of intelligent robots.
As shown in connection with fig. 1 and 2, the robot includes:
a robot body 100; a first camera module 101 and a depth sensor 102 disposed on a front panel of the robot body 100, a second camera module 103 disposed above the robot body 100, an angle of the second camera module 103 being adjustable;
a processor (not shown in the figure) configured to acquire a first image acquired by the first camera module 101, acquire a second image acquired by the second camera module 103, calculate first depth information based on the first image and the second image, acquire second depth information acquired by the depth sensor 102, and fuse the first depth information and the second depth information to obtain fused depth information.
According to the robot provided by the embodiments of the present application, a second image is captured by the angle-adjustable second camera module additionally arranged above the robot body, and the first depth information can be calculated by combining it with the first image captured by the first camera module on the front panel, so that the second depth information collected by the depth sensor is supplemented, the richness of the depth information is improved, denser point cloud information can be obtained, and the performance of subsequent functions that depend on the depth information is improved.
The foregoing processing of the processor may refer to the description of the foregoing method, and will not be repeated herein.
Alternatively, the first camera module 101 and the second camera module 103 may be RGB cameras.
The depth sensor 102 may be a TOF-type depth sensor, such as an ITOF depth sensor, for example.
Further optionally, the processor is further configured to: after the face recognition instruction is detected, the angle of the second camera module is adjusted to enable the field angle of the second camera module to be aligned to the head of the user, and face recognition processing is carried out based on the second image acquired by the second camera module.
Further optionally, the processor is further configured to: and after detecting the site monitoring instruction, combining the first image acquired by the first camera module and the second image acquired by the second camera module into a monitoring image set.
According to the robot provided by the embodiments of the present application, by arranging the angle-adjustable second camera module above the body, the scanning field of view of the camera module can be greatly extended; combined with the first camera module arranged on the front panel, the field of view for face recognition and home monitoring can be increased, so that the performance and effect of the face recognition and home monitoring functions are improved.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In the present specification, each embodiment is described in a progressive manner, and each embodiment focuses on the difference from other embodiments, and may be combined according to needs, and the same similar parts may be referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (12)

1. A depth estimation method, characterized in that the method is applied to a robot, wherein a first camera module and a depth sensor are provided on a front panel of the robot body and an angle-adjustable second camera module is provided above the robot body, and the method comprises:
acquiring a first image acquired by the first camera module and acquiring a second image acquired by the second camera module;
calculating to obtain first depth information based on the first image and the second image;
and acquiring second depth information acquired by the depth sensor, and fusing the first depth information and the second depth information to obtain fused depth information.
2. The method of claim 1, wherein acquiring the second image acquired by the second camera module comprises:
the angle of the second camera module is adjusted according to a set step length, and after each angle adjustment, a second image acquired by the second camera module is acquired, so that a plurality of second images under different angles are obtained;
then, based on the first image and the second image, a process of calculating first depth information includes:
and calculating to obtain first depth information based on the first image and the second image obtained after each angle adjustment.
3. The method of claim 2, wherein adjusting the angle of the second camera module in a set step size comprises:
and adjusting the angle of the second camera module according to a set step length within the adjustable angle range of the second camera module.
4. The method of claim 1, wherein fusing the first depth information and the second depth information to obtain fused depth information comprises:
for the null coordinate points of the missing depth values in the second depth information:
determining a target depth value of the null coordinate point based on the depth values of the coordinate points around the null coordinate point in the second depth information and the depth values corresponding to the null coordinate point in the first depth information;
and supplementing the second depth information by using the target depth value of the null coordinate point to obtain the fused depth information.
5. The method of claim 4, wherein determining the target depth value for the null coordinate point based on the depth values for the coordinate points surrounding the null coordinate point in the second depth information and the depth values corresponding to the null coordinate point in the first depth information comprises:
for each neighbor coordinate point in a set range around the null coordinate point in the second depth information, calculating the ratio of the depth value of the neighbor coordinate point to the distance from the neighbor coordinate point to the null coordinate point;
adding the ratio corresponding to each neighbor coordinate point to obtain a reference depth value of the null coordinate point;
and weighting the depth value corresponding to the null coordinate point in the first depth information by using a first weight to obtain a first weighted result, weighting the reference depth value of the null coordinate point by using a second weight to obtain a second weighted result, and adding the first weighted result and the second weighted result to obtain the target depth value of the null coordinate point.
6. The method according to any one of claims 1-5, further comprising:
after the face recognition instruction is detected, the angle of the second camera module is adjusted to enable the field angle of the second camera module to be aligned to the head of the user, and face recognition processing is carried out based on the second image acquired by the second camera module.
7. The method according to any one of claims 1-5, further comprising:
and after detecting the site monitoring instruction, combining the first image acquired by the first camera module and the second image acquired by the second camera module into a monitoring image set.
8. The method according to any one of claims 1-5, further comprising:
determining map information of a place where the robot is located based on the fused depth information;
and/or the number of the groups of groups,
identifying obstacles around the robot based on the fused depth information, and controlling the robot to avoid the obstacles in the travelling process;
and/or the number of the groups of groups,
and based on the fused depth information, recognizing a gesture or a gesture of a user, and controlling the advancing or working mode of the robot according to the gesture or the gesture.
9. A depth estimation device, characterized in that the device is applied to a robot, wherein a first camera module and a depth sensor are provided on a front panel of the robot body and an angle-adjustable second camera module is provided above the robot body, and the device comprises:
the image acquisition unit is used for acquiring a first image acquired by the first camera module and acquiring a second image acquired by the second camera module;
a first depth information calculating unit, configured to calculate first depth information based on the first image and the second image;
and the depth information fusion unit is used for acquiring second depth information acquired by the depth sensor, and fusing the first depth information and the second depth information to obtain fused depth information.
10. A robot, comprising:
a robot body;
the first camera module and the depth sensor are arranged on the front panel of the robot body, the second camera module is arranged above the robot body, and the angle of the second camera module is adjustable;
the processor is used for acquiring a first image acquired by the first camera module, acquiring a second image acquired by the second camera module, calculating to obtain first depth information based on the first image and the second image, acquiring second depth information acquired by the depth sensor, and fusing the first depth information and the second depth information to obtain fused depth information.
11. The robot of claim 10, wherein the first camera module and the second camera module are each RGB cameras;
the depth sensor is a TOF-type depth sensor.
12. The robot of claim 10, wherein the robot is a cleaning robot, a handling robot, or a companion robot.
CN202310281183.8A 2023-03-22 2023-03-22 Depth estimation method and device and robot Pending CN115994937A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310281183.8A CN115994937A (en) 2023-03-22 2023-03-22 Depth estimation method and device and robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310281183.8A CN115994937A (en) 2023-03-22 2023-03-22 Depth estimation method and device and robot

Publications (1)

Publication Number Publication Date
CN115994937A true CN115994937A (en) 2023-04-21

Family

ID=85992304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310281183.8A Pending CN115994937A (en) 2023-03-22 2023-03-22 Depth estimation method and device and robot

Country Status (1)

Country Link
CN (1) CN115994937A (en)

Citations (6)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109741405A (en) * 2019-01-21 2019-05-10 同济大学 A kind of depth information acquisition system based on dual structure light RGB-D camera
US10510155B1 (en) * 2019-06-11 2019-12-17 Mujin, Inc. Method and processing system for updating a first image generated by a first camera based on a second image generated by a second camera
CN110335211A (en) * 2019-06-24 2019-10-15 Oppo广东移动通信有限公司 Bearing calibration, terminal device and the computer storage medium of depth image
CN113034568A (en) * 2019-12-25 2021-06-25 杭州海康机器人技术有限公司 Machine vision depth estimation method, device and system
US20210279950A1 (en) * 2020-03-04 2021-09-09 Magic Leap, Inc. Systems and methods for efficient floorplan generation from 3d scans of indoor scenes
CN115714855A (en) * 2022-10-11 2023-02-24 华中科技大学 Three-dimensional visual perception method and system based on stereoscopic vision and TOF fusion

Similar Documents

Publication Publication Date Title
CN108406731B (en) Positioning device, method and robot based on depth vision
CN109901590B (en) Recharging control method of desktop robot
CN109887040B (en) Moving target active sensing method and system for video monitoring
US8446492B2 (en) Image capturing device, method of searching for occlusion region, and program
JP3895238B2 (en) Obstacle detection apparatus and method
JP5588812B2 (en) Image processing apparatus and imaging apparatus using the same
CN111337947A (en) Instant mapping and positioning method, device, system and storage medium
CN110503040B (en) Obstacle detection method and device
CN110989631A (en) Self-moving robot control method, device, self-moving robot and storage medium
CN108459597B (en) Mobile electronic device and method for processing tasks in task area
KR20110011424A (en) Method for recognizing position and controlling movement of a mobile robot, and the mobile robot using the same
WO2015024407A1 (en) Power robot based binocular vision navigation system and method based on
KR20200071960A (en) Method and Apparatus for Vehicle Detection Using Lidar Sensor and Camera Convergence
WO2019144269A1 (en) Multi-camera photographing system, terminal device, and robot
CN111780744A (en) Mobile robot hybrid navigation method, equipment and storage device
CN115994937A (en) Depth estimation method and device and robot
CN102542563A (en) Modeling method of forward direction monocular vision of mobile robot
CN111399014A (en) Local stereoscopic vision infrared camera system and method for monitoring wild animals
CN113959398B (en) Distance measurement method and device based on vision, drivable equipment and storage medium
JP6734994B2 (en) Stereo measuring device and system
CN112683266A (en) Robot and navigation method thereof
CN115588036A (en) Image acquisition method and device and robot
CN113379850B (en) Mobile robot control method, device, mobile robot and storage medium
JP4584405B2 (en) 3D object detection apparatus, 3D object detection method, and recording medium
TW202311781A (en) Obstacle detection method utilizing an obstacle recognition model to recognize obstacle category corresponding to each obstacle

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination