WO2021092805A1 - Multi-modal data fusion method and apparatus, and intelligent robot - Google Patents

Multi-modal data fusion method and apparatus, and intelligent robot Download PDF

Info

Publication number
WO2021092805A1
Authority
WO
WIPO (PCT)
Prior art keywords
data set
depth
distance
resolution
depth data
Application number
PCT/CN2019/118102
Other languages
French (fr)
Chinese (zh)
Inventor
朱森强
杨光雨
Original Assignee
中新智擎科技有限公司
Application filed by 中新智擎科技有限公司
Priority to PCT/CN2019/118102
Publication of WO2021092805A1

Classifications

    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01S - RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S 17/00 - Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S 17/88 - Lidar systems specially adapted for specific applications
    • G01S 17/93 - Lidar systems specially adapted for anti-collision purposes
    • G - PHYSICS
    • G05 - CONTROLLING; REGULATING
    • G05D - SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D 1/00 - Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
    • G05D 1/02 - Control of position or course in two dimensions

Definitions

  • The embodiments of the present invention relate to the field of electronic information technology, and in particular to a multi-modal data fusion method and device, and an intelligent robot.
  • In the course of implementing the embodiments of the present invention, the inventors found at least the following problem in the related art: intelligent robots currently use a single-beam lidar for navigation and obstacle avoidance, which can only scan for obstacles in the plane of the lidar; obstacles below or above that plane cannot be detected, leaving a considerable visual blind zone.
  • In view of this defect, the purpose of the embodiments of the present invention is to provide a multi-modal data fusion method and device, and an intelligent robot.
  • An embodiment of the present invention provides a multi-modal data fusion method, applied to an intelligent robot provided with a depth camera and a lidar, the method including: collecting a 3D depth image of the external environment through the depth camera; collecting a laser data set of the external environment through the lidar; reading the 3D depth data set in the 3D depth image and converting it into a 2D depth data set; and performing fusion processing on the 2D depth data set and the laser data set.
  • The laser data set includes the one-dimensional coordinates of a linear arrangement of data points and the first distance of each data point; the 3D depth data set includes the X-axis and Y-axis coordinates of pixels distributed along the X axis and the Y axis, and the second distance of each pixel.
  • The step of reading the 3D depth data set in the 3D depth image and converting the 3D depth data set into a 2D depth data set specifically includes: selecting the smallest second distance among the second distances of the pixels sharing the same X-axis coordinate; and composing the 2D depth data set from the X-axis coordinates and the minimum second distance corresponding to each X-axis coordinate.
  • In some embodiments, this step further includes: acquiring the first resolution of the laser data set and the second resolution of the 2D depth data set; determining whether the second resolution is greater than the first resolution; if so, calculating the multiple of the second resolution to the first resolution; grouping the pixels of the 2D depth data set by that multiple; and, in each group, selecting the pixel corresponding to the minimum of the minimum second distances and deleting the other pixels, to obtain a corrected 2D depth data set.
  • The step of performing fusion processing on the 2D depth data set and the laser data set specifically includes: acquiring the overlapping area between the acquisition area of the lidar and the acquisition area of the depth camera; and replacing the data segment of the laser data set lying in the overlapping area with the corrected 2D depth data set.
  • The method further includes: if the second resolution is equal to the first resolution, replacing the data segment of the laser data set lying in the overlapping area with the 2D depth data set.
  • The method further includes: if the second resolution is less than the first resolution, acquiring, for each data point of the laser data set in the overlapping area, the corresponding pixel of the 3D depth image; determining whether the first distance of each data point is greater than the minimum second distance of the corresponding pixel; and, if so, replacing that first distance with the minimum second distance.
  • An embodiment of the present invention further provides a multi-modal data fusion device, applied to an intelligent robot provided with a depth camera and a lidar, the device including:
  • a first acquisition module, configured to collect a 3D depth image of the external environment through the depth camera;
  • a second acquisition module, configured to collect a laser data set of the external environment through the lidar;
  • a conversion module, configured to read the 3D depth data set in the 3D depth image and convert the 3D depth data set into a 2D depth data set;
  • a fusion module, configured to perform fusion processing on the 2D depth data set and the laser data set.
  • The laser data set includes the one-dimensional coordinates of a linear arrangement of data points and the first distance of each data point; the 3D depth data set includes the X-axis and Y-axis coordinates of pixels distributed along the X axis and the Y axis, and the second distance of each pixel.
  • The conversion module is further configured to select the smallest second distance among the second distances of the pixels sharing the same X-axis coordinate, and to compose the 2D depth data set from the X-axis coordinates and the minimum second distance corresponding to each X-axis coordinate.
  • The conversion module is further configured to acquire the first resolution of the laser data set and the second resolution of the 2D depth data set; to determine whether the second resolution is greater than the first resolution; if so, to calculate the multiple of the second resolution to the first resolution; to group the pixels of the 2D depth data set by that multiple; and, in each group, to select the pixel corresponding to the minimum of the minimum second distances and delete the other pixels, obtaining a corrected 2D depth data set.
  • An embodiment of the present invention further provides an intelligent robot, including:
  • at least one processor; and
  • a memory communicatively connected to the at least one processor, wherein
  • the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor so that the at least one processor can execute the method described in the first aspect above.
  • Embodiments of the present invention further provide a computer-readable storage medium storing computer-executable instructions for causing a computer to execute the method described in the first aspect above.
  • Embodiments of the present invention further provide a computer program product, including a computer program stored on a computer-readable storage medium; the computer program includes program instructions which, when executed by a computer, cause the computer to execute the method described in the first aspect above.
  • The embodiment of the present invention thus provides a multi-modal data fusion method applied to an intelligent robot provided with a depth camera and a lidar. The method first collects a 3D depth image and a laser data set of the external environment through the depth camera and the lidar respectively, then reads the 3D depth data set in the 3D depth image and converts it into a 2D depth data set, and finally performs fusion processing on the 2D depth data set and the laser data set. An intelligent robot using this method can detect obstacles that are in front of the robot but in the visual blind zone of the lidar, improving the safety performance of the intelligent robot.
  • FIG. 1 is a schematic diagram of the structure of an intelligent robot to which the multi-modal data fusion method of an embodiment of the present invention is applied, and of the robot's collection areas;
  • FIG. 2 is a flowchart of a multi-modal data fusion method provided by an embodiment of the present invention;
  • FIG. 3 is an example diagram of the data in front of an intelligent robot collected by the depth camera and the lidar provided by an embodiment of the present invention;
  • FIG. 4 is a sub-flowchart of step 130 of the method shown in FIG. 2;
  • FIG. 5 is a sub-flowchart of steps 130 and 140 of the method shown in FIG. 2;
  • FIG. 6 is a schematic structural diagram of a multi-modal data fusion device provided by an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of the hardware structure of an intelligent robot that executes the above-mentioned multi-modal data fusion method provided by an embodiment of the present invention.
  • FIG. 1 shows the structure of the intelligent robot to which the multi-modal data fusion method of the embodiment of the present invention is applied, together with its collection areas. The intelligent robot 10 on the left side of FIG. 1 is capable of executing the multi-modal data fusion method described in the embodiments of the present invention, and is provided with a depth camera 11 and a lidar 12.
  • The depth camera 11 is a binocular camera that can collect a depth image of the area in front of the intelligent robot 10; the depth image contains the distance from the intelligent robot 10 to the obstacles in front of it.
  • It may be, for example, a common depth camera such as a Kinect v1, Fotonic, or ZED camera.
  • The lidar 12 is a device that obtains characteristic quantities such as the position and speed of an obstacle by emitting a laser beam at it; it can usually only detect obstacle distances within a plane at a certain height.
  • Normally, the field angle β of the lidar 12 is much larger than the field angle α of the depth camera 11, and β can typically reach or even exceed 180 degrees.
  • Because the depth camera 11 captures the image in front of the robot with a wide-angle lens, the image is usually distorted at both edges; preferably, therefore, the depth camera 11 of the embodiment of the present invention is a binocular camera whose field angle α is less than ninety degrees.
  • The embodiment of the present invention provides a multi-modal data fusion method that can be executed by the above intelligent robot 10. Please refer to FIG. 2, which shows the flowchart of a multi-modal data fusion method provided by the embodiment of the present invention.
  • The method includes, but is not limited to, the following steps:
  • Step 110: Collect a 3D depth image of the external environment through the depth camera.
  • In the embodiment of the present invention, as shown in FIG. 1, the depth camera 11 collects a 3D depth image S2 of the area in front of the intelligent robot 10 within the field angle α.
  • Each pixel of the 3D depth image S2 contains the straight-line distance from the robot to the obstacle it observes.
  • Step 120: Collect a laser data set of the external environment through the lidar.
  • In the embodiment of the present invention, as shown in FIG. 1, the lidar 12 collects a laser data set S1 of the area in front of the intelligent robot 10 within the field angle β; at each data collection point it records the straight-line distance to the obstacle ahead.
  • The laser data set S1 can be understood as a line carrying distance information, and the line contains multiple data collection points.
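  • To make the two data structures concrete, the following sketch (not part of the patent; the array shapes, variable names, and all numeric values are illustrative assumptions mirroring FIG. 3) represents the laser data set S1 and the 3D depth data set S2' as NumPy arrays:

```python
import numpy as np

# Laser data set S1: a line of 9 data points, each carrying a
# one-dimensional coordinate and a first distance (a1..a9).
laser_x = np.arange(9)                      # one-dimensional coordinates
laser_first_distance = np.array(
    [2.0, 2.1, 2.3, 1.8, 1.7, 1.9, 2.4, 2.2, 2.0])  # metres, made up

# 3D depth data set S2': a 6x6 grid of pixels indexed [Y, X], each
# carrying a second distance (b11..b66).
depth_second_distance = np.random.uniform(1.5, 3.0, size=(6, 6))
```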
  • Step 130: Read the 3D depth data set in the 3D depth image, and convert the 3D depth data set into a 2D depth data set.
  • Because the lidar collects data in a plane at a certain height while its field angle is larger than that of the depth camera, the depth camera's viewing range in the height direction exceeds the lidar's detection range in that direction, whereas the lidar's detection range in the horizontal direction exceeds the depth camera's horizontal viewing range. Moreover, the data obtained by the depth camera and the lidar share an overlapping area S.
  • In the embodiment of the present invention, in order to express the 3D depth data set along the detection direction of the laser data set S1, so that it can be fused with the laser data set to obtain more accurate obstacle detection information, the 3D depth data set must also be converted into a 2D depth data set along the detection direction of the laser data set S1.
  • Step 140: Perform fusion processing on the 2D depth data set and the laser data set.
  • In the embodiment of the present invention, in order to obtain the nearest distance between the obstacles in front of the intelligent robot and the robot itself, the 3D depth image and the laser data set are further fused.
  • Specifically, within the overlapping area S, the 2D depth data set converted from the 3D depth data set is fused with the laser data set.
  • The embodiment of the present invention thus provides a multi-modal data fusion method applied to a robot provided with a depth camera and a lidar.
  • The method first collects a 3D depth image and a laser data set of the external environment through the depth camera and the lidar respectively, then reads the 3D depth data set in the 3D depth image and converts it into a 2D depth data set, and finally fuses the 2D depth data set with the laser data set.
  • An intelligent robot using the method provided by the embodiment of the present invention can detect obstacles in front of the intelligent robot that lie in the blind spot of the lidar, thereby improving the safety performance of the intelligent robot.
  • FIG. 3 shows an example of the data sets in front of the robot collected by the depth camera and the lidar provided by an embodiment of the present invention.
  • The data sets are the laser data set S1 shown in FIG. 1 above and the 3D depth data set S2' contained in the 3D depth image S2.
  • The 3D depth data set S2' includes the X-axis and Y-axis coordinates of pixels distributed along the X axis and the Y axis, and the second distance of each pixel.
  • The laser data set S1 includes 9 data points distributed along the X axis together with the first distance of each data point (a1, a2, ..., a9), and each data point carries its one-dimensional coordinate.
  • The 3D depth data set S2' includes 36 pixels distributed along the X and Y axes together with the second distance of each pixel (b11, b12, ..., b16, b21, ..., b66), and each pixel carries its two-dimensional coordinates.
  • It can be understood that the system superimposes the laser data set S1 and the 3D depth data set S2' through the one-dimensional coordinate of each data point and the two-dimensional coordinates of each pixel, obtaining the superimposed picture of S1 and S2' shown in FIG. 3; specifically, the superposition is calibrated from the one-dimensional coordinates of the data points and the X-axis coordinates of the pixels.
  • It should be noted that the number of data points in the laser data set S1 and the number of pixels in the 3D depth data set S2' shown in FIG. 3 are not limited to those of the above embodiment; they are determined by the sampling frequency of the lidar and the resolution of the depth camera actually used.
  • Please also refer to FIG. 4, which shows a sub-flowchart of step 130 of the method in FIG. 2; step 130 includes:
  • Step 131: Select the smallest second distance among the second distances of the pixels sharing the same X-axis coordinate.
  • Step 132: Compose the 2D depth data set from the X-axis coordinates and the minimum second distance corresponding to each X-axis coordinate.
  • In the embodiment of the present invention, before the 2D depth data set and the laser data set are fused, the 3D depth data set must first be converted into a 2D depth data set.
  • Referring again to FIG. 3, the conversion proceeds as follows: for each column of the 3D depth data set S2' (all pixels sharing one X-axis coordinate), take the minimum of their second distances and assign it to the pixel at that X-axis coordinate in the overlapping area S; once this has been done for every column, the 2D depth data set is obtained.
  • For example, in FIG. 3, the minimum of the second distances of the first column of pixels, i.e. the minimum of the six values b11 to b61, is obtained and assigned to the pixel where b31 is located, and so on for the other columns.
  • The resulting 2D depth data set contains only the single row of pixels on the X axis where b31 lies, and each pixel in that row is assigned the minimum second distance of all pixels in its column, so that every pixel of the final 2D depth data set records the shortest distance to an obstacle anywhere along its column.
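  • A minimal sketch of this conversion (an illustration, not the patent's implementation; it assumes the 3D depth data set is a NumPy array indexed [Y, X] as above):

```python
import numpy as np

def depth_3d_to_2d(depth_second_distance: np.ndarray) -> np.ndarray:
    """Collapse a 3D depth data set to a 2D depth data set.

    For every X-axis coordinate (column), keep only the smallest second
    distance among the pixels sharing that coordinate, so each entry of
    the result is the shortest obstacle distance anywhere in that column.
    """
    return depth_second_distance.min(axis=0)

# Example: a 6x6 grid of second distances (b11..b66) collapses to a
# single row of 6 values, one per X-axis coordinate.
grid = np.random.uniform(1.5, 3.0, size=(6, 6))
depth_2d = depth_3d_to_2d(grid)  # shape (6,)
```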
  • FIG. 5 shows a sub-flowchart of steps 130 and 140 of the method in FIG. 2. Based on the methods shown in FIG. 2 and FIG. 3, the method further includes:
  • Step 133: Obtain the first resolution of the laser data set and the second resolution of the 2D depth data set.
  • Step 134: Determine whether the second resolution is greater than the first resolution. If yes, go to step 135; if not, go to step 143.
  • Step 135: Calculate the multiple of the second resolution to the first resolution.
  • Step 136: Group the pixels in the 2D depth data set according to the multiple.
  • Step 137: In each group, select the pixel corresponding to the minimum of the minimum second distances and delete the other pixels, to obtain a corrected 2D depth data set.
  • In the embodiment of the present invention, the first resolution may refer to the number of data points of the laser data set within the overlapping area, and the second resolution may refer to the number of pixels of the 2D depth data set within the overlapping area.
  • The multiple is obtained by comparing the number of data points and the number of pixels in the overlapping area; the pixels of the 2D depth data set are grouped according to this multiple, the minimum of the minimum second distances is taken within each group, and the set of these minima forms the corrected 2D depth data set.
  • For example, in FIG. 3, the 2D depth data set S2' has 6 pixels in the overlapping area S (from the pixel where b31 is located to the pixel where b36 is located), so the second resolution is 6; the laser data set S1 has 3 data points in the overlapping area S (data point a4 to data point a6), so the first resolution is 3. The multiple of the second resolution to the first resolution is therefore 2.
  • The pixels of the 2D depth data set are accordingly divided into groups of two; in each group, the pixel with the smallest of the minimum second distances is kept and the other pixels are deleted, yielding the corrected 2D depth data set.
  • In the example shown in FIG. 3, the corrected 2D depth data set contains three data points whose values are c4, c5, and c6.
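  • A minimal sketch of this correction for the integer-multiple case (illustrative only; the names and example values are assumptions):

```python
import numpy as np

def correct_2d_depth(depth_2d: np.ndarray, first_resolution: int) -> np.ndarray:
    """Downsample the 2D depth data set to the lidar's resolution.

    Assumes the second resolution (len(depth_2d)) is an integer multiple
    of the first resolution; adjacent pixels are grouped by that multiple
    and only the minimum of each group is kept.
    """
    multiple = len(depth_2d) // first_resolution
    return depth_2d.reshape(first_resolution, multiple).min(axis=1)

# FIG. 3 example: 6 pixels against 3 lidar points, multiple = 2; the
# three group minima play the role of c4, c5, c6.
depth_2d = np.array([2.0, 1.8, 2.5, 2.4, 1.6, 1.9])
corrected = correct_2d_depth(depth_2d, first_resolution=3)  # [1.8, 2.4, 1.6]
```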
  • The method further includes:
  • Step 141: Obtain the overlapping area between the acquisition area of the lidar and the acquisition area of the depth camera.
  • Step 142: With reference to the overlapping area, replace the data segment of the laser data set lying in the overlapping area with the corrected 2D depth data set.
  • It can be understood that the step of correcting the 2D depth data set aims to give the corrected 2D depth data set and the laser data set the same number of pixels/data points within the overlapping area, so that the corrected 2D depth data set can directly replace the data segment of the laser data set in the overlapping area to yield the fused data; otherwise, the resolution of the corrected 2D depth data set would not match the resolution of that data segment.
  • In other embodiments, the multiple of the second resolution to the first resolution may not be an integer.
  • In that case, the data points of the laser data set in the overlapping area are taken as the reference: the multiple is rounded up, and for each laser data point the minimum is taken over that many pixels of the 2D depth data set around the corresponding position, producing the corrected 2D depth data set.
  • For example, if the multiple is 2.4, it is rounded up to 3; the minimum over the 1st, 2nd, and 3rd pixels of the 2D depth data set becomes the first data point of the corrected 2D depth data set, the minimum over the 3rd, 4th, and 5th pixels becomes the second data point, and so on, so that the number of data points in the corrected 2D depth data set matches the number of data points of the laser data set in the overlapping area.
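  • One plausible reading of this rounding scheme, as a sketch (the exact window indexing is an assumption inferred from the 2.4-to-3 example above):

```python
import math
import numpy as np

def correct_2d_depth_noninteger(depth_2d: np.ndarray,
                                first_resolution: int) -> np.ndarray:
    """Variant of the correction for a non-integer resolution multiple.

    The window size is the multiple rounded up; the window for the i-th
    lidar data point starts near i * multiple, so neighbouring windows
    may overlap (multiple 2.4 -> windows over pixels 1-3, 3-5, ...).
    """
    multiple = len(depth_2d) / first_resolution
    window = math.ceil(multiple)
    corrected = np.empty(first_resolution)
    for i in range(first_resolution):
        start = round(i * multiple)
        corrected[i] = depth_2d[start:start + window].min()
    return corrected
```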
  • The method further includes:
  • Step 143: Determine whether the second resolution is equal to the first resolution. If yes, go to step 144; if not, go to step 145.
  • Step 144: Replace the data segment of the laser data set lying in the overlapping area with the 2D depth data set.
  • It can be understood that when the second resolution equals the first resolution, that is, when the number of data points of the laser data set in the overlapping area equals the number of pixels of the 2D depth data set there, the data segment of the laser data set in the overlapping area can be replaced directly with the 2D depth data set to obtain the fused data.
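  • The replacement itself amounts to splicing one array segment into another; a hedged sketch (the overlap index comes from calibration and is simply assumed known here):

```python
import numpy as np

def fuse_replace(laser_first_distance: np.ndarray,
                 overlap_start: int,
                 depth_segment: np.ndarray) -> np.ndarray:
    """Replace the laser data segment in the overlapping area with the
    (corrected) 2D depth data set of the same length."""
    fused = laser_first_distance.copy()
    fused[overlap_start:overlap_start + len(depth_segment)] = depth_segment
    return fused

# FIG. 3 example: laser points a4..a6 (indices 3..5) fall in the
# overlapping area S and are replaced by the corrected values c4..c6.
fused = fuse_replace(
    np.array([2.0, 2.1, 2.3, 1.8, 1.7, 1.9, 2.4, 2.2, 2.0]),
    overlap_start=3,
    depth_segment=np.array([1.8, 2.4, 1.6]))
```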
  • The method further includes:
  • Step 145: For each data point of the laser data set in the overlapping area, obtain the corresponding pixel of the 3D depth image.
  • Step 146: Determine whether the first distance of each data point is greater than the minimum second distance of the corresponding pixel; if yes, go to step 147; if not, retain the first distance of that data point.
  • Step 147: Replace the first distance of each data point whose first distance is greater than the minimum second distance with the minimum second distance of the corresponding pixel.
  • This branch covers the case in which the number of data points of the laser data set in the overlapping area is greater than the number of pixels of the 2D depth data set there.
  • Wherever a data point's first distance is not greater than the minimum second distance of its corresponding pixel, the first distance carried by that data point of the laser data set is retained, so that the distance value of each data point in the final fused data represents the shortest known distance from the robot to the obstacle ahead.
  • After the fusion processing, the minimum of the distance information of all data points in the fused data can be taken to obtain the shortest distance between the intelligent robot and the obstacles in front of it, together with the obstacle at that shortest distance.
  • The forward path of the robot can then be planned around this nearest obstacle, avoiding collisions between the robot and obstacles and ensuring the safety of both.
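  • A sketch of this per-point comparison and of extracting the shortest distance (the correspondence between laser points and depth pixels is assumed to come from the X-coordinate calibration described earlier):

```python
import numpy as np

def fuse_keep_nearest(laser_overlap: np.ndarray,
                      depth_for_each_point: np.ndarray) -> np.ndarray:
    """Case where the lidar is denser than the camera in the overlap:
    each laser first distance is compared with the minimum second distance
    of its corresponding pixel, and the smaller value is kept, so every
    fused point carries the shortest known obstacle distance."""
    return np.minimum(laser_overlap, depth_for_each_point)

fused_overlap = fuse_keep_nearest(np.array([1.8, 1.7, 1.9]),
                                  np.array([2.0, 1.5, 2.2]))
nearest_obstacle = fused_overlap.min()  # 1.5: the distance used for planning
```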
  • Correspondingly, an embodiment of the present invention further provides a multi-modal data fusion device.
  • Please refer to FIG. 6, which shows a schematic structural diagram of a multi-modal data fusion device provided by an embodiment of the present invention.
  • The device 200 is applied to an intelligent robot, and the intelligent robot is provided with a depth camera and a lidar.
  • The device 200 includes: a first acquisition module 210, a second acquisition module 220, a conversion module 230, and a fusion module 240, wherein:
  • the first acquisition module 210 is configured to collect a 3D depth image of the external environment through the depth camera;
  • the second acquisition module 220 is configured to collect a laser data set of the external environment through the lidar;
  • the conversion module 230 is configured to read a 3D depth data set in the 3D depth image, and convert the 3D depth data set into a 2D depth data set;
  • the fusion module 240 is configured to perform fusion processing on the 2D depth data set and the laser data set.
  • The laser data set includes the one-dimensional coordinates of a linear arrangement of data points and the first distance of each data point; the 3D depth data set includes the X-axis and Y-axis coordinates of pixels distributed along the X axis and the Y axis, and the second distance of each pixel.
  • The conversion module 230 is further configured to select the smallest second distance among the second distances of the pixels sharing the same X-axis coordinate, and to compose the 2D depth data set from the X-axis coordinates and the minimum second distance corresponding to each X-axis coordinate.
  • The conversion module 230 is further configured to obtain the first resolution of the laser data set and the second resolution of the 2D depth data set; to determine whether the second resolution is greater than the first resolution; if so, to calculate the multiple of the second resolution to the first resolution; to group the pixels of the 2D depth data set by that multiple; and, in each group, to select the pixel corresponding to the minimum of the minimum second distances and delete the other pixels, obtaining a corrected 2D depth data set.
  • The fusion module 240 is further configured to obtain the overlapping area between the acquisition area of the lidar and the acquisition area of the depth camera,
  • and to replace the data segment of the laser data set lying in the overlapping area with the corrected 2D depth data set.
  • The fusion module 240 is further configured to replace the data segment of the laser data set lying in the overlapping area with the 2D depth data set if the second resolution is equal to the first resolution.
  • The fusion module 240 is further configured to, if the second resolution is smaller than the first resolution, obtain for each data point of the laser data set in the overlapping area the corresponding pixel of the 3D depth image, determine whether the first distance of the data point is greater than the minimum second distance of that pixel, and, if so, replace the first distance with the minimum second distance.
  • An embodiment of the present invention further provides an intelligent robot. Please refer to FIG. 7, which shows the hardware structure of an intelligent robot capable of executing the multi-modal data fusion method described in FIGS. 2 to 5.
  • The intelligent robot 10 may be the intelligent robot 10 shown in FIG. 1.
  • The intelligent robot 10 includes: at least one processor 11; and a memory 12 communicatively connected with the at least one processor 11 (one processor 11 is taken as an example in FIG. 7).
  • The memory 12 stores instructions executable by the at least one processor 11; the instructions are executed by the at least one processor 11 so that the at least one processor 11 can execute the multi-modal data fusion method shown in FIGS. 2 to 5. The processor 11 and the memory 12 may be connected by a bus or in other ways; connection by a bus is taken as an example in FIG. 7.
  • The memory 12, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the program instructions/modules corresponding to the multi-modal data fusion method in the embodiment of the present application (for example, the modules shown in FIG. 6).
  • The processor 11 executes various functional applications and data processing by running the non-volatile software programs, instructions, and modules stored in the memory 12, thereby realizing the multi-modal data fusion method of the foregoing method embodiment.
  • The memory 12 may include a program storage area and a data storage area.
  • The program storage area can store an operating system and an application program required by at least one function; the data storage area can store data created by the use of the multi-modal data fusion device, and the like.
  • the memory 12 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage devices.
  • the memory 12 may optionally include memories remotely provided with respect to the processor 11, and these remote memories may be connected to the multi-modal data fusion device via a network. Examples of the aforementioned networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
  • The one or more modules are stored in the memory 12 and, when executed by the one or more processors 11, perform the multi-modal data fusion method of any of the foregoing method embodiments, for example the method steps of FIGS. 2 to 5 described above, realizing the functions of the modules and units of FIG. 6.
  • An embodiment of the present application further provides a non-volatile computer-readable storage medium storing computer-executable instructions which, when executed by one or more processors, perform, for example, the method steps of FIGS. 2 to 5 described above, realizing the functions of the modules of FIG. 6.
  • Embodiments of the present application further provide a computer program product, including a computer program stored on a non-volatile computer-readable storage medium; the computer program includes program instructions which, when executed by a computer, cause the computer to perform the multi-modal data fusion method of any of the foregoing method embodiments, for example the method steps of FIGS. 2 to 5 described above, realizing the functions of the modules of FIG. 6.
  • In summary, the embodiment of the present invention provides a multi-modal data fusion method applied to an intelligent robot provided with a depth camera and a lidar.
  • The method first collects a 3D depth image and a laser data set of the external environment through the depth camera and the lidar respectively, then reads the 3D depth data set in the 3D depth image and converts it into a 2D depth data set, and finally performs fusion processing on the 2D depth data set and the laser data set.
  • An intelligent robot using the method provided by the embodiment of the present invention can detect obstacles in front of the intelligent robot that lie in the blind spot of the lidar, thereby improving the safety performance of the intelligent robot.
  • The device embodiments described above are merely illustrative; units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units, which can be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • Each implementation can be realized by means of software plus a general-purpose hardware platform, or, of course, by hardware.
  • A person of ordinary skill in the art can understand that all or part of the processes in the methods of the foregoing embodiments can be implemented by a computer program instructing the relevant hardware.
  • The program can be stored in a computer-readable storage medium and, when executed, may include the procedures of the above method embodiments.
  • the storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.

Abstract

Disclosed is a multi-modal data fusion method, relating to the technical field of electronic information. The multi-modal data fusion method is applied to an intelligent robot, and the intelligent robot is provided with a depth camera and a laser radar. The method comprises: firstly, respectively collecting a 3D depth image and a laser data set of an external environment by means of a depth camera and a laser radar (110, 120); then, reading a 3D depth data set in the 3D depth image, and converting the 3D depth data set into a 2D depth data set (130); and finally, performing fusion processing on the 2D depth data set and the laser data set (140). An intelligent robot using the method can detect an obstacle that is in front of the intelligent robot and in a vision blind area of a laser radar, thereby improving the safety performance of the intelligent robot.

Description

Multi-modal data fusion method, device and intelligent robot
Technical Field
The embodiments of the present invention relate to the field of electronic information technology, and in particular to a multi-modal data fusion method and device, and an intelligent robot.
Background
With the development of artificial intelligence technology, robots are becoming more and more intelligent, and intelligent robots are being applied ever more widely across industries. Mobile intelligent robots in particular usually need a certain path-planning capability so that they can navigate and avoid obstacles.
In the course of implementing the embodiments of the present invention, the inventors found at least the following problem in the related art: intelligent robots currently use a single-beam lidar for navigation and obstacle avoidance, which can only scan for obstacles in the plane of the lidar; obstacles below or above that plane cannot be detected, leaving a considerable visual blind zone.
Summary of the Invention
In view of the above defects of the prior art, the purpose of the embodiments of the present invention is to provide a multi-modal data fusion method and device, and an intelligent robot.
The purpose of the embodiments of the present invention is achieved through the following technical solutions:
To solve the above technical problem, in a first aspect, an embodiment of the present invention provides a multi-modal data fusion method, applied to an intelligent robot provided with a depth camera and a lidar, the method including:
collecting a 3D depth image of the external environment through the depth camera;
collecting a laser data set of the external environment through the lidar;
reading the 3D depth data set in the 3D depth image, and converting the 3D depth data set into a 2D depth data set;
performing fusion processing on the 2D depth data set and the laser data set.
In some embodiments, the laser data set includes the one-dimensional coordinates of a linear arrangement of data points and the first distance of each data point, and the 3D depth data set includes the X-axis and Y-axis coordinates of pixels distributed along the X axis and the Y axis, and the second distance of each pixel;
the step of reading the 3D depth data set in the 3D depth image and converting the 3D depth data set into a 2D depth data set specifically includes:
selecting the smallest second distance among the second distances of the pixels sharing the same X-axis coordinate;
composing the 2D depth data set from the X-axis coordinates and the minimum second distance corresponding to each X-axis coordinate.
In some embodiments, the step of reading the 3D depth data set in the 3D depth image and converting the 3D depth data set into a 2D depth data set further includes:
acquiring the first resolution of the laser data set and the second resolution of the 2D depth data set;
determining whether the second resolution is greater than the first resolution;
if so, calculating the multiple of the second resolution to the first resolution;
grouping the pixels in the 2D depth data set according to the multiple;
in each group, selecting the pixel corresponding to the minimum of the minimum second distances and deleting the other pixels, to obtain a corrected 2D depth data set.
In some embodiments, the step of performing fusion processing on the 2D depth data set and the laser data set specifically includes:
acquiring the overlapping area between the acquisition area of the lidar and the acquisition area of the depth camera;
with reference to the overlapping area, replacing the data segment of the laser data set lying in the overlapping area with the corrected 2D depth data set.
In some embodiments, the method further includes:
if the second resolution is equal to the first resolution, replacing the data segment of the laser data set lying in the overlapping area with the 2D depth data set.
In some embodiments, the method further includes:
if the second resolution is less than the first resolution, acquiring, for each data point of the laser data set in the overlapping area, the corresponding pixel of the 3D depth image;
determining whether the first distance of each data point is greater than the minimum second distance of the corresponding pixel;
if so, replacing the first distance of that data point with the minimum second distance of the corresponding pixel.
To solve the above technical problem, in a second aspect, an embodiment of the present invention provides a multi-modal data fusion device, applied to an intelligent robot provided with a depth camera and a lidar, the device including:
a first acquisition module, configured to collect a 3D depth image of the external environment through the depth camera;
a second acquisition module, configured to collect a laser data set of the external environment through the lidar;
a conversion module, configured to read the 3D depth data set in the 3D depth image and convert the 3D depth data set into a 2D depth data set;
a fusion module, configured to perform fusion processing on the 2D depth data set and the laser data set.
In some embodiments, the laser data set includes the one-dimensional coordinates of a linear arrangement of data points and the first distance of each data point, and the 3D depth data set includes the X-axis and Y-axis coordinates of pixels distributed along the X axis and the Y axis, and the second distance of each pixel;
the conversion module is further configured to select the smallest second distance among the second distances of the pixels sharing the same X-axis coordinate;
and to compose the 2D depth data set from the X-axis coordinates and the minimum second distance corresponding to each X-axis coordinate.
In some embodiments, the conversion module is further configured to acquire the first resolution of the laser data set and the second resolution of the 2D depth data set;
to determine whether the second resolution is greater than the first resolution;
if so, to calculate the multiple of the second resolution to the first resolution;
to group the pixels in the 2D depth data set according to the multiple;
and, in each group, to select the pixel corresponding to the minimum of the minimum second distances and delete the other pixels, obtaining a corrected 2D depth data set.
To solve the above technical problem, in a third aspect, an embodiment of the present invention provides an intelligent robot, including:
at least one processor; and
a memory communicatively connected to the at least one processor, wherein
the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor so that the at least one processor can execute the method described in the first aspect above.
To solve the above technical problem, in a fourth aspect, embodiments of the present invention further provide a computer-readable storage medium storing computer-executable instructions for causing a computer to execute the method described in the first aspect above.
To solve the above technical problem, in a fifth aspect, embodiments of the present invention further provide a computer program product, including a computer program stored on a computer-readable storage medium; the computer program includes program instructions which, when executed by a computer, cause the computer to execute the method described in the first aspect above.
Compared with the prior art, the beneficial effect of the present invention is as follows: unlike the prior art, the embodiment of the present invention provides a multi-modal data fusion method applied to an intelligent robot provided with a depth camera and a lidar. The method first collects a 3D depth image and a laser data set of the external environment through the depth camera and the lidar respectively, then reads the 3D depth data set in the 3D depth image and converts it into a 2D depth data set, and finally performs fusion processing on the 2D depth data set and the laser data set. An intelligent robot using the method provided in the embodiment of the present invention can detect obstacles that are in front of the intelligent robot but in the visual blind zone of the lidar, improving the safety performance of the intelligent robot.
Description of the Drawings
One or more embodiments are illustrated by the figures of the corresponding drawings; these illustrations do not constitute a limitation of the embodiments. Elements/modules and steps with the same reference numerals in the drawings denote similar elements/modules and steps, and unless otherwise stated the figures are not drawn to scale.
FIG. 1 is a schematic diagram of the structure of an intelligent robot to which the multi-modal data fusion method of an embodiment of the present invention is applied, and of the robot's collection areas;
FIG. 2 is a flowchart of a multi-modal data fusion method provided by an embodiment of the present invention;
FIG. 3 is an example diagram of the data in front of an intelligent robot collected by the depth camera and the lidar provided by an embodiment of the present invention;
FIG. 4 is a sub-flowchart of step 130 of the method shown in FIG. 2;
FIG. 5 is a sub-flowchart of steps 130 and 140 of the method shown in FIG. 2;
FIG. 6 is a schematic structural diagram of a multi-modal data fusion device provided by an embodiment of the present invention;
FIG. 7 is a schematic diagram of the hardware structure of an intelligent robot that executes the above multi-modal data fusion method, provided by an embodiment of the present invention.
Detailed Description
The present invention is described in detail below in conjunction with specific embodiments. The following examples will help those skilled in the art to further understand the present invention, but do not limit it in any form. It should be pointed out that a person of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present invention; these all belong to the protection scope of the present invention.
In order to make the purpose, technical solutions, and advantages of this application clearer, this application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application, and are not used to limit it.
It should be noted that, where they do not conflict, the features of the embodiments of the present invention can be combined with each other, all within the protection scope of this application. In addition, although functional modules are divided in the device schematic and a logical order is shown in the flowchart, in some cases the steps shown or described may be executed with a different module division or in a different order from that shown. Moreover, the words "first", "second", and "third" used herein do not limit the data or the execution order, but only distinguish items that are essentially the same or similar in function and effect.
Unless otherwise defined, all technical and scientific terms used in this specification have the same meanings as commonly understood by those skilled in the technical field of the present invention. The terms used in the specification are only for the purpose of describing specific embodiments and are not intended to limit the present invention. The term "and/or" used in this specification includes any and all combinations of one or more of the associated listed items.
In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not conflict with each other.
Please refer to FIG. 1, which shows the structure of the intelligent robot to which the multi-modal data fusion method of the embodiment of the present invention is applied, together with its collection areas. The intelligent robot 10 on the left side of FIG. 1 is an intelligent robot capable of executing the multi-modal data fusion method of the embodiment of the present invention, and is provided with a depth camera 11 and a lidar 12.
The depth camera 11 is a binocular camera that can collect a depth image of the area in front of the intelligent robot 10; the depth image contains the distance from the intelligent robot 10 to the obstacles in front of it. It may be, for example, a common depth camera such as a Kinect v1, Fotonic, or ZED camera.
The lidar 12 is a device that obtains characteristic quantities such as the position and speed of an obstacle by emitting a laser beam at it; it can usually only detect obstacle distances within a plane at a certain height.
Normally, the field angle β of the lidar 12 is much larger than the field angle α of the depth camera 11, and β can typically reach or even exceed 180 degrees. Because the depth camera 11 captures the image in front of the robot with a wide-angle lens, the image is usually distorted at both edges; preferably, therefore, the depth camera 11 of the embodiment of the present invention is a binocular camera whose field angle α is less than ninety degrees.
Specifically, the embodiments of the present invention are further described below in conjunction with the accompanying drawings.
The embodiment of the present invention provides a multi-modal data fusion method that can be executed by the above intelligent robot 10. Please refer to FIG. 2, which shows the flowchart of a multi-modal data fusion method provided by the embodiment of the present invention; the method includes, but is not limited to, the following steps:
Step 110: Collect a 3D depth image of the external environment through the depth camera.
In the embodiment of the present invention, as shown in FIG. 1, the depth camera 11 collects a 3D depth image S2 of the area in front of the intelligent robot 10 within the field angle α; each pixel of the 3D depth image S2 contains the straight-line distance from the robot to the obstacle it observes.
Step 120: Collect a laser data set of the external environment through the lidar.
In the embodiment of the present invention, as shown in FIG. 1, the lidar 12 collects a laser data set S1 of the area in front of the intelligent robot 10 within the field angle β; at each data collection point it records the straight-line distance to the obstacle ahead. The laser data set S1 can be understood as a line carrying distance information, and the line contains multiple data collection points.
步骤130:读取所述3D深度图像中的3D深度数据集,将所述3D深度数据集转换为2D深度数据集。Step 130: Read the 3D depth data set in the 3D depth image, and convert the 3D depth data set into a 2D depth data set.
由于激光雷达采集的是某一高度平面上的数据信息,而激光雷达的视场角比深度摄像头的视场角大,因此,所述深度摄像头在高度方向的取景范围大于所述激光雷达在高度方向的探测范围,而所述激光雷达在水平方向上的探测范围也大于深度摄像头在水平方向上的取景范围。且有,深度摄像头和激光雷达所获得图像存在重合区域S。Since the lidar collects data information on a certain height plane, and the field of view of the lidar is larger than the field of view of the depth camera, the viewing range of the depth camera in the height direction is larger than that of the lidar in the height direction. The detection range of the direction, and the detection range of the lidar in the horizontal direction is also larger than the viewing range of the depth camera in the horizontal direction. Also, there is an overlap area S in the images obtained by the depth camera and the lidar.
在本发明实施例中,为了将所述3D深度数据集在所述激光数据集S1的探测方向上表示,以进一步与所述激光数据集进行融合处理得到更精确的障碍物探测信息,还需要将所述3D深度数据集转换为在所述激光数据集S1的探测方向上的2D深度数据集。In the embodiment of the present invention, in order to represent the 3D depth data set in the detection direction of the laser data set S1, to further perform fusion processing with the laser data set to obtain more accurate obstacle detection information, it is also necessary The 3D depth data set is converted into a 2D depth data set in the detection direction of the laser data set S1.
步骤140:对所述2D深度数据集和所述激光数据集进行融合处理。Step 140: Perform fusion processing on the 2D depth data set and the laser data set.
在本发明实施例中,为了获取到在智能机器人的前方的障碍物与该智能机器人的最近的距离的信息,进一步地,将所述3D深度图像和所述激光数据集进行融合处理,所述融合处理具体为,将所述重合区域S内,由所述3D深度数据集转换得到的2D深度数据集和所述激光数据集进行融合处理。In the embodiment of the present invention, in order to obtain the information of the closest distance between the obstacle in front of the intelligent robot and the intelligent robot, further, the 3D depth image and the laser data set are fused. The fusion processing specifically includes performing fusion processing on the 2D depth data set converted from the 3D depth data set and the laser data set in the overlapping area S.
本发明实施例中提供了一种多模态数据融合方法,应用于机器人,所述机器人设置有深度摄像头和激光雷达,该方法首先分别通过深度摄 像头和激光雷达采集外部环境的3D深度图像和激光数据集,然后,读取所述3D深度图像中的3D深度数据集,将所述3D深度数据集转换为2D深度数据集,最后,对所述2D深度数据集和所述激光数据集进行融合处理,采用本发明实施例提供的方法的智能机器人,能够检测到在智能机器人前方且在激光雷达的视觉盲区的障碍物,提高了智能机器人的安全性能。The embodiment of the present invention provides a multi-modal data fusion method, which is applied to a robot, and the robot is provided with a depth camera and a lidar. The method first collects 3D depth images and lasers of the external environment through the depth camera and the lidar, respectively. Data set, then read the 3D depth data set in the 3D depth image, convert the 3D depth data set into a 2D depth data set, and finally, fuse the 2D depth data set and the laser data set For processing, the intelligent robot using the method provided by the embodiment of the present invention can detect obstacles in front of the intelligent robot and in the blind spot of the lidar, thereby improving the safety performance of the intelligent robot.
在一些实施例中,请参见图3,其示出了本发明实施例提供的深度摄像头和激光雷达所采集的机器人前方的一种数据集的示例图,该数据集即为上述图1所示的激光数据集S1和由所述3D深度图像S2中的3D深度数据集S2’,所述激光数据集S1包括一线性排列的数据点的一维坐标以及各所述数据点的第一距离,所述3D深度数据集S2’包括沿X轴和Y轴分布的像素点的X轴坐标和Y轴坐标以及各所述像素点的第二距离。In some embodiments, please refer to FIG. 3, which shows an example diagram of a data set in front of the robot collected by a depth camera and a lidar provided by an embodiment of the present invention. The data set is shown in FIG. 1 above. The laser data set S1 and the 3D depth data set S2' in the 3D depth image S2, the laser data set S1 includes the one-dimensional coordinates of a linearly arranged data point and the first distance of each of the data points, The 3D depth data set S2' includes X-axis coordinates and Y-axis coordinates of pixel points distributed along the X-axis and Y-axis, and the second distance of each pixel point.
在图3所示实施例中,所述激光数据集S1包括沿X轴分布的9个数据点,以及各所述数据点的第一距离(a1、a2、……、a9),每一数据点皆携带有其一维坐标,所述3D深度数据集S2’包括沿X轴和Y轴分布的36个像素点以及各所述像素点的第二距离(b11、b12、……、b16、b21、……、b66),每一像素点皆携带有其二维坐标。可以理解的是,系统通过所述每一数据点的一维坐标和所述每一像素点的二维坐标,将所述激光数据集S1和所述3D深度数据集S2’进行叠加,从而得到如图3所示的所述激光数据集S1和所述3D深度数据集S2’的叠加图像,具体地,根据数据点的一维坐标和像素点在X轴的坐标进行校准以实现叠加。In the embodiment shown in FIG. 3, the laser data set S1 includes 9 data points distributed along the X axis, and the first distance (a1, a2,..., a9) of each of the data points, and each data point The points all carry their one-dimensional coordinates. The 3D depth data set S2' includes 36 pixels distributed along the X-axis and Y-axis and the second distance of each pixel (b11, b12, ..., b16, b21,..., b66), each pixel carries its two-dimensional coordinates. It is understandable that the system superimposes the laser data set S1 and the 3D depth data set S2' through the one-dimensional coordinates of each data point and the two-dimensional coordinates of each pixel point to obtain The superimposed images of the laser data set S1 and the 3D depth data set S2' shown in FIG. 3 are specifically calibrated according to the one-dimensional coordinates of the data points and the coordinates of the pixel points on the X axis to achieve superposition.
It should be noted that the number of data points in the laser data set S1 and the number of pixels in the 3D depth data set S2' shown in FIG. 3 are not limited to those described in the above embodiment; they are determined by the sampling frequency of the lidar and the resolution of the depth camera actually used.
Refer also to FIG. 4, which shows a sub-flowchart of step 130 of the method shown in FIG. 2. Step 130 includes:
Step 131: Select the minimum second distance among the second distances of the pixels sharing the same X-axis coordinate.
Step 132: Form the 2D depth data set from the X-axis coordinates and the minimum second distance corresponding to each X-axis coordinate.
In this embodiment of the present invention, before the 2D depth data set and the laser data set are fused, the 3D depth data set must first be converted into a 2D depth data set. Referring again to FIG. 3, the conversion proceeds as follows: for all pixels of the 3D depth data set S2' lying on the same vertical line, that is, sharing the same X-axis coordinate, take the minimum of their second distances and assign it to the pixel at the corresponding X-axis coordinate in the overlap area S. Once this has been done for every vertical line, the 2D depth data set is obtained.
For example, in FIG. 3, the minimum of the second distances of the first column of pixels, i.e., the minimum of the six values b11 to b61, is obtained and assigned to the pixel where b31 is located, and so on for the remaining columns. The resulting 2D depth data set contains only the single row of pixels on the X axis where b31 is located, and each pixel in that row is assigned the minimum second distance of all pixels in its column, so that every pixel of the final 2D depth data set carries the shortest distance to an obstacle among all the pixels on its vertical line.
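A minimal sketch of this conversion, assuming the 3D depth data set is held as a {(x, y): second_distance} mapping (an assumed representation, not one prescribed by the disclosure):

```python
from collections import defaultdict

def to_2d_depth(depth_set):
    """Steps 131-132: for every X-axis coordinate, keep only the minimum
    second distance over the column of pixels sharing that coordinate."""
    columns = defaultdict(list)
    for (x, _y), second_distance in depth_set.items():
        columns[x].append(second_distance)
    return {x: min(distances) for x, distances in columns.items()}

# First column of FIG. 3: min(b11, ..., b61) is assigned to the single
# remaining pixel of that column (invented values).
depth_2d = to_2d_depth({(1, y): d for y, d in
                        enumerate([1.9, 1.7, 1.5, 1.6, 1.8, 2.0], start=1)})
assert depth_2d == {1: 1.5}
```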
In some embodiments, refer to FIG. 5, which shows a sub-flowchart of steps 130 and 140 of the method shown in FIG. 2. Based on the methods shown in FIG. 2 and FIG. 3, the method further includes:
Step 133: Obtain the first resolution of the laser data set and the second resolution of the 2D depth data set.
Step 134: Determine whether the second resolution is greater than the first resolution. If so, go to step 135; if not, go to step 143.
Step 135: Calculate the multiple of the second resolution relative to the first resolution.
Step 136: Group the pixels of the 2D depth data set according to the multiple.
Step 137: In each group, select the pixel corresponding to the smallest of the minimum second distances and delete the other pixels, obtaining a corrected 2D depth data set.
In this embodiment of the present invention, it is further necessary to determine the relative magnitude and the multiple relationship between the first resolution of the laser data set collected by the lidar and the second resolution of the 2D depth data set derived from the 3D depth image collected by the depth camera, so as to obtain a corrected 2D depth data set for fusing the 2D depth data set with the laser data set. It can be understood that, since the fusion processing obtains the overlap area of the 2D depth data set and the laser data set and re-assigns each data point of the laser data set within that area, the first resolution may refer to the number of data points of the laser data set in the overlap area, and the second resolution may refer to the number of pixels of the 2D depth data set in the overlap area.
When the second resolution is greater than the first resolution, the multiple relationship is obtained by comparing the number of data points with the number of pixels in the overlap area. The pixels of the 2D depth data set are grouped according to this multiple, and the smallest of the minimum second distances within each group is taken; the set of these minima is the corrected 2D depth data set.
For example, referring again to FIG. 3, in the image shown there the number of pixels of the 2D depth data set S2' in the overlap area S is 6 (from the pixel of b31 to the pixel of b36), so the second resolution is 6, while the first resolution of the laser data set S1 in the overlap area S is 3 (from data point a4 to data point a6). The multiple of the second resolution relative to the first resolution is therefore calculated to be 2.
Further, the pixels are grouped according to this multiple, each group containing that many pixels of the 2D depth data set. In each group, the pixel with the smallest of the minimum second distances is kept and the other pixels are deleted, yielding the corrected 2D depth data set. That is, in the example of FIG. 3, every two pixels of the intermediate 2D depth data set form a group, and the minimum of each group is taken, giving a corrected 2D depth data set containing three data points with values c4, c5 and c6.
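The grouping and selection of steps 135 to 137 can be sketched as follows, under the same assumed representation (the function name and test values are illustrative):

```python
def correct_2d_depth(depth_2d, multiple):
    """Steps 135-137 for an integer multiple: walk the pixels in X order,
    form groups of `multiple` consecutive pixels and keep only the
    minimum of each group, leaving one value per laser data point."""
    xs = sorted(depth_2d)
    return [min(depth_2d[x] for x in xs[i:i + multiple])
            for i in range(0, len(xs), multiple)]

# FIG. 3: six pixels in the overlap area, multiple 2, giving the three
# values c4, c5, c6 of the corrected 2D depth data set (invented values).
corrected = correct_2d_depth({1: 1.5, 2: 1.4, 3: 1.6,
                              4: 1.3, 5: 1.8, 6: 1.7}, multiple=2)
assert corrected == [1.4, 1.3, 1.7]
```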
Further, continuing with FIG. 5, the method also includes:
Step 141: Obtain the overlap area between the acquisition area of the lidar and the acquisition area of the depth camera.
Step 142: Using the overlap area, replace the data segment of the laser data set lying in that overlap area with the corrected 2D depth data set.
Finally, the data segment of the laser data set lying in the overlap area is replaced with the corrected 2D depth data set, yielding the fused image S3 shown in FIG. 3. In this embodiment of the present invention, the purpose of correcting the 2D depth data set is to make the number of pixels/data points of the corrected 2D depth data set match that of the laser data set in the overlap area, so that the corrected 2D depth data set can replace the data segment of the laser data set in the overlap area to produce the fused image; the resolution of the corrected 2D depth data set is then consistent with that of the laser data segments outside the overlap area.
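Under these assumptions the replacement of step 142 is a plain splice, and the same operation serves the equal-resolution case of step 144 below; the index bounds here are illustrative and would in practice come from the calibration:

```python
def splice_fused(laser_distances, depth_values, overlap_start):
    """Step 142 (and step 144 when no correction is needed): overwrite
    the laser data segment lying in the overlap area with the 2D depth
    values; both sides now hold the same number of points there."""
    fused = list(laser_distances)
    fused[overlap_start:overlap_start + len(depth_values)] = depth_values
    return fused

# FIG. 3: a1..a9 with the a4..a6 segment replaced by c4..c6 yields the
# fused image S3 (invented values).
s3 = splice_fused([2.1, 2.0, 1.9, 1.8, 1.7, 1.8, 1.9, 2.0, 2.1],
                  [1.4, 1.3, 1.7], overlap_start=3)
assert s3 == [2.1, 2.0, 1.9, 1.4, 1.3, 1.7, 1.9, 2.0, 2.1]
```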
In some embodiments, the multiple of the second resolution relative to the first resolution may not be an integer. In that case the multiple is rounded down and increased by one, and for each data point of the laser data set the minimum is taken over that many pixels of the 2D depth data set located around the data point's position in the overlap area, producing the corrected 2D depth data set. For example, when the multiple is 2.4, rounding down and adding one gives 3: the minimum over the 1st, 2nd and 3rd pixels of the 2D depth data set becomes the first data point of the corrected set, the minimum over the 3rd, 4th and 5th pixels becomes the second data point, and so on, so that the number of data points in the corrected 2D depth data set matches the number of data points of the laser data set in the overlap area.
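A sketch of this non-integer case; the start-index formula below is one plausible reading of "around the data point's position" and follows the 2.4 example just given:

```python
import math

def correct_2d_depth_noninteger(values, multiple, n_laser_points):
    """Window width is the multiple rounded down plus one; the k-th
    corrected value is the minimum over a window anchored near
    k * multiple, so exactly n_laser_points values are produced."""
    width = math.floor(multiple) + 1
    return [min(values[math.floor(k * multiple):
                       math.floor(k * multiple) + width])
            for k in range(n_laser_points)]

# Multiple 2.4: width 3; windows cover pixels 1-3, 3-5, 5-7 (invented
# values), matching the example in the text.
corrected = correct_2d_depth_noninteger(
    [1.5, 1.4, 1.6, 1.3, 1.8, 1.7, 1.2], multiple=2.4, n_laser_points=3)
assert corrected == [1.4, 1.3, 1.2]
```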
In some embodiments, continuing with FIG. 5, the method further includes:
Step 143: Determine whether the second resolution is equal to the first resolution. If so, go to step 144; if not, go to step 145.
Step 144: Replace the data segment of the laser data set lying in the overlap area with the 2D depth data set.
In this embodiment of the present invention, the second resolution may also equal the first resolution, that is, the number of data points of the laser data set in the overlap area may equal the number of pixels of the 2D depth data set. In that case the data segment of the laser data set lying in the overlap area can be replaced directly with the 2D depth data set to obtain the fused image.
In some embodiments, continuing with FIG. 5, the method further includes:
Step 145: Obtain, for each data point of the laser data set in the overlap area, the corresponding pixel of the 3D depth image.
Step 146: Determine whether the first distance of each data point is greater than the minimum second distance of the corresponding pixel. If so, go to step 147; if not, retain the first distance carried by the data point.
Step 147: Replace the first distance of each data point whose first distance is greater than the minimum second distance with the minimum second distance of the corresponding pixel.
In this embodiment of the present invention, the number of data points of the laser data set in the overlap area may also exceed the number of pixels of the 2D depth data set in that area. In this case it is necessary to determine, for each data point, whether its first distance is greater than the minimum second distance of the pixel at the data point's position. For any data point where this is the case, its first distance is replaced by that pixel's minimum second distance; otherwise the first distance carried by the laser data point is retained, so that the distance value of every data point in the final fused image represents the shortest distance from the robot to the obstacle ahead.
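For this denser-laser case the comparison of steps 146 and 147 reduces to taking the smaller of the two distances at each data point, and the minimum over the fused image then gives the shortest obstacle distance used below; the pixel lookup function here is an assumed calibration-derived mapping:

```python
def fuse_dense_laser(laser_overlap, pixel_min_second_distance):
    """Steps 145-147: keep each data point's first distance unless the
    depth pixel covering that position reports something closer."""
    return [min(first_distance, pixel_min_second_distance(x))
            for x, first_distance in laser_overlap]

def shortest_obstacle_distance(fused_distances):
    """Minimum over the fused image: the robot's closest obstacle."""
    return min(fused_distances)

# Two laser data points share one depth pixel whose minimum second
# distance is 1.3; the first point keeps its own, closer, reading.
fused = fuse_dense_laser([(4, 1.1), (5, 1.8)], lambda _x: 1.3)
assert fused == [1.1, 1.3]
assert shortest_obstacle_distance(fused) == 1.1
```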
Further, once the distance from the intelligent robot to the obstacle ahead has been obtained for every data point of the fused image, the minimum of the distance values of all data points on the fused image yields the shortest distance from the robot to the obstacle ahead, as well as which part of the robot the obstacle is closest to. A forward path can then be planned for the robot that avoids collision with the obstacle, keeping both the robot and the obstacle safe.
An embodiment of the present invention further provides a multi-modal data fusion apparatus. Refer to FIG. 6, which shows a schematic structural diagram of a multi-modal data fusion apparatus provided by an embodiment of the present invention. The multi-modal data fusion apparatus 200 is applied to an intelligent robot provided with a depth camera and a lidar, and includes a first acquisition module 210, a second acquisition module 220, a conversion module 230 and a fusion module 240, wherein:
the first acquisition module 210 is configured to collect a 3D depth image of the external environment through the depth camera;
the second acquisition module 220 is configured to collect a laser data set of the external environment through the lidar;
the conversion module 230 is configured to read the 3D depth data set in the 3D depth image and convert the 3D depth data set into a 2D depth data set;
the fusion module 240 is configured to fuse the 2D depth data set with the laser data set.
In some embodiments, the laser data set includes the one-dimensional coordinates of a line of linearly arranged data points and the first distance of each data point, and the 3D depth data set includes the X-axis and Y-axis coordinates of pixels distributed along the X axis and the Y axis and the second distance of each pixel;
the conversion module 230 is further configured to select the minimum second distance among the second distances of the pixels sharing the same X-axis coordinate,
and to form the 2D depth data set from the X-axis coordinates and the minimum second distance corresponding to each X-axis coordinate.
In some embodiments, the conversion module 230 is further configured to obtain the first resolution of the laser data set and the second resolution of the 2D depth data set;
determine whether the second resolution is greater than the first resolution;
if so, calculate the multiple of the second resolution relative to the first resolution;
group the pixels of the 2D depth data set according to the multiple;
and, in each group, select the pixel corresponding to the smallest of the minimum second distances and delete the other pixels, obtaining a corrected 2D depth data set.
In some embodiments, the fusion module 240 is further configured to obtain the overlap area between the acquisition area of the lidar and the acquisition area of the depth camera,
and, using the overlap area, replace the data segment of the laser data set lying in that overlap area with the corrected 2D depth data set.
In some embodiments, the fusion module 240 is further configured to replace the data segment of the laser data set lying in the overlap area with the 2D depth data set if the second resolution equals the first resolution.
In some embodiments, the fusion module 240 is further configured to, if the second resolution is less than the first resolution, obtain, for each data point of the laser data set in the overlap area, the corresponding pixel of the 3D depth image;
determine whether the first distance of each data point is greater than the minimum second distance of the corresponding pixel;
and, if so, replace the first distance of each such data point with the minimum second distance of the corresponding pixel.
An embodiment of the present invention further provides an intelligent robot. Refer to FIG. 7, which shows the hardware structure of an intelligent robot capable of executing the multi-modal data fusion method described with reference to FIG. 2 to FIG. 5. The intelligent robot 10 may be the intelligent robot 10 shown in FIG. 1.
The intelligent robot 10 includes at least one processor 11 and a memory 12 communicatively connected to the at least one processor 11; one processor 11 is taken as an example in FIG. 7. The memory 12 stores instructions executable by the at least one processor 11, and the instructions are executed by the at least one processor 11 to enable the at least one processor 11 to perform the multi-modal data fusion method described above with reference to FIG. 2 to FIG. 5. The processor 11 and the memory 12 may be connected by a bus or in other ways; connection by a bus is taken as an example in FIG. 7.
As a non-volatile computer-readable storage medium, the memory 12 can be used to store non-volatile software programs, non-volatile computer-executable programs and modules, such as the program instructions/modules corresponding to the multi-modal data fusion method in the embodiments of the present application, for example the modules shown in FIG. 6. By running the non-volatile software programs, instructions and modules stored in the memory 12, the processor 11 executes the various functional applications and data processing of the server, that is, implements the multi-modal data fusion method of the above method embodiments.
The memory 12 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required by at least one function, and the data storage area may store data created through the use of the multi-modal data fusion apparatus, and the like. In addition, the memory 12 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device or other non-volatile solid-state storage device. In some embodiments, the memory 12 may optionally include memories remotely located relative to the processor 11, and these remote memories may be connected to the multi-modal data fusion apparatus via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks and combinations thereof.
The one or more modules are stored in the memory 12 and, when executed by the one or more processors 11, perform the multi-modal data fusion method of any of the above method embodiments, for example executing the method steps of FIG. 2 to FIG. 5 described above and realizing the functions of the modules and units in FIG. 6.
The above product can execute the methods provided in the embodiments of the present application and has the functional modules and beneficial effects corresponding to the executed methods. For technical details not described in detail in this embodiment, refer to the methods provided in the embodiments of the present application.
An embodiment of the present application further provides a non-volatile computer-readable storage medium storing computer-executable instructions which, when executed by one or more processors, perform for example the method steps of FIG. 2 to FIG. 5 described above and realize the functions of the modules in FIG. 6.
An embodiment of the present application further provides a computer program product, including a computer program stored on a non-volatile computer-readable storage medium, the computer program including program instructions which, when executed by a computer, cause the computer to perform the multi-modal data fusion method of any of the above method embodiments, for example executing the method steps of FIG. 2 to FIG. 5 described above and realizing the functions of the modules in FIG. 6.
An embodiment of the present invention provides a multi-modal data fusion method applied to an intelligent robot provided with a depth camera and a lidar. The method first collects a 3D depth image and a laser data set of the external environment through the depth camera and the lidar respectively, then reads the 3D depth data set in the 3D depth image and converts it into a 2D depth data set, and finally fuses the 2D depth data set with the laser data set. An intelligent robot using the method provided by this embodiment can detect obstacles that are in front of the robot but in the visual blind spot of the lidar, improving the safety performance of the intelligent robot.
It should be noted that the apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
From the description of the above implementations, those of ordinary skill in the art can clearly understand that each implementation can be realized by means of software plus a general-purpose hardware platform, or of course by hardware. Those of ordinary skill in the art can understand that all or part of the processes of the methods of the above embodiments can be completed by a computer program instructing the relevant hardware; the program can be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM) or the like.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and not to limit them. Under the concept of the present invention, the technical features of the above embodiments or of different embodiments may also be combined, the steps may be implemented in any order, and many other variations of the different aspects of the present invention as described above exist which, for brevity, are not provided in detail. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments, or make equivalent replacements of some of the technical features therein, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (11)

  1. A multi-modal data fusion method, applied to an intelligent robot provided with a depth camera and a lidar, the method comprising:
    collecting a 3D depth image of the external environment through the depth camera;
    collecting a laser data set of the external environment through the lidar;
    reading the 3D depth data set in the 3D depth image and converting the 3D depth data set into a 2D depth data set;
    performing fusion processing on the 2D depth data set and the laser data set.
  2. The method according to claim 1, wherein:
    the laser data set comprises the one-dimensional coordinates of a line of linearly arranged data points and the first distance of each data point, and the 3D depth data set comprises the X-axis and Y-axis coordinates of pixels distributed along the X axis and the Y axis and the second distance of each pixel;
    the step of reading the 3D depth data set in the 3D depth image and converting the 3D depth data set into a 2D depth data set specifically comprises:
    selecting the minimum second distance among the second distances of the pixels sharing the same X-axis coordinate;
    forming the 2D depth data set from the X-axis coordinates and the minimum second distance corresponding to each X-axis coordinate.
  3. The method according to claim 2, wherein the step of reading the 3D depth data set in the 3D depth image and converting the 3D depth data set into a 2D depth data set further comprises:
    obtaining the first resolution of the laser data set and the second resolution of the 2D depth data set;
    determining whether the second resolution is greater than the first resolution;
    if so, calculating the multiple of the second resolution relative to the first resolution;
    grouping the pixels of the 2D depth data set according to the multiple;
    in each group, selecting the pixel corresponding to the smallest of the minimum second distances and deleting the other pixels, to obtain a corrected 2D depth data set.
  4. The method according to claim 3, wherein the step of performing fusion processing on the 2D depth data set and the laser data set specifically comprises:
    obtaining the overlap area between the acquisition area of the lidar and the acquisition area of the depth camera;
    using the overlap area, replacing the data segment of the laser data set lying in that overlap area with the corrected 2D depth data set.
  5. The method according to claim 4, further comprising:
    if the second resolution equals the first resolution, replacing the data segment of the laser data set lying in the overlap area with the 2D depth data set.
  6. The method according to claim 4, further comprising:
    if the second resolution is less than the first resolution, obtaining, for each data point of the laser data set in the overlap area, the corresponding pixel of the 3D depth image;
    determining whether the first distance of each data point is greater than the minimum second distance of the corresponding pixel;
    if so, replacing the first distance of each data point whose first distance is greater than the minimum second distance with the minimum second distance of the corresponding pixel.
  7. A multi-modal data fusion apparatus, applied to an intelligent robot provided with a depth camera and a lidar, the apparatus comprising:
    a first acquisition module, configured to collect a 3D depth image of the external environment through the depth camera;
    a second acquisition module, configured to collect a laser data set of the external environment through the lidar;
    a conversion module, configured to read the 3D depth data set in the 3D depth image and convert the 3D depth data set into a 2D depth data set;
    a fusion module, configured to perform fusion processing on the 2D depth data set and the laser data set.
  8. The apparatus according to claim 7, wherein:
    the laser data set comprises the one-dimensional coordinates of a line of linearly arranged data points and the first distance of each data point, and the 3D depth data set comprises the X-axis and Y-axis coordinates of pixels distributed along the X axis and the Y axis and the second distance of each pixel;
    the conversion module is further configured to select the minimum second distance among the second distances of the pixels sharing the same X-axis coordinate,
    and to form the 2D depth data set from the X-axis coordinates and the minimum second distance corresponding to each X-axis coordinate.
  9. The apparatus according to claim 8, wherein:
    the conversion module is further configured to obtain the first resolution of the laser data set and the second resolution of the 2D depth data set;
    determine whether the second resolution is greater than the first resolution;
    if so, calculate the multiple of the second resolution relative to the first resolution;
    group the pixels of the 2D depth data set according to the multiple;
    and, in each group, select the pixel corresponding to the smallest of the minimum second distances and delete the other pixels, to obtain a corrected 2D depth data set.
  10. An intelligent robot, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method according to any one of claims 1 to 6.
  11. A computer program product, comprising a computer program stored on a computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the method according to any one of claims 1 to 6.
PCT/CN2019/118102 2019-11-13 2019-11-13 Multi-modal data fusion method and apparatus, and intelligent robot WO2021092805A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/118102 WO2021092805A1 (en) 2019-11-13 2019-11-13 Multi-modal data fusion method and apparatus, and intelligent robot


Publications (1)

Publication Number Publication Date
WO2021092805A1 true WO2021092805A1 (en) 2021-05-20

Family

ID=75911325


Country Status (1)

Country Link
WO (1) WO2021092805A1 (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106526605A (en) * 2016-10-28 2017-03-22 北京康力优蓝机器人科技有限公司 Data fusion method and data fusion system for laser radar and depth camera
CN108828606A (en) * 2018-03-22 2018-11-16 中国科学院西安光学精密机械研究所 One kind being based on laser radar and binocular Visible Light Camera union measuring method
WO2019040800A1 (en) * 2017-08-23 2019-02-28 TuSimple 3d submap reconstruction system and method for centimeter precision localization using camera-based submap and lidar-based global map
CN109655825A (en) * 2018-03-29 2019-04-19 上海智瞳通科技有限公司 Data processing method, device and the multiple sensor integrated method of Multi-sensor Fusion
CN109947097A (en) * 2019-03-06 2019-06-28 东南大学 A kind of the robot localization method and navigation application of view-based access control model and laser fusion
CN110428372A (en) * 2019-07-08 2019-11-08 希格斯动力科技(珠海)有限公司 Depth data and 2D laser data fusion method and device, storage medium


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116360466A (en) * 2023-05-31 2023-06-30 天津博诺智创机器人技术有限公司 Robot operation obstacle avoidance system based on depth camera
CN116360466B (en) * 2023-05-31 2023-09-15 天津博诺智创机器人技术有限公司 Robot operation obstacle avoidance system based on depth camera


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 19952294
    Country of ref document: EP
    Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established
    Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 28.09.2022)
122 Ep: pct application non-entry in european phase
    Ref document number: 19952294
    Country of ref document: EP
    Kind code of ref document: A1