CN114299131A - Three-camera-based short and small obstacle detection method and device and terminal equipment - Google Patents
- Publication number
- CN114299131A (application CN202111622043.XA)
- Authority
- CN
- China
- Prior art keywords
- image
- feature map
- difference
- camera
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Image Processing (AREA)
Abstract
The embodiment of the invention provides a three-camera-based method, device and terminal equipment for detecting short and small obstacles, wherein the method comprises the following steps: acquiring a first image, a second image and a third image synchronously acquired by a first camera, a second camera and a third camera; calculating the difference between the first image and the second image to obtain a first difference feature map; calculating the difference between the third image and the second image to obtain a second difference feature map; fusing the first difference feature map and the second difference feature map to obtain a fused feature map; and determining the area where the obstacle is located based on the fused feature map. The difference features generated in two different directions are fused into a complete obstacle difference feature, so that short and small obstacles are effectively identified, the influence of interference factors such as road markings and shadows on identification accuracy is reduced, and the perception of obstacles is enhanced.
Description
Technical Field
The embodiments of the invention relate to the technical field of automatic driving visual perception, and in particular to a three-camera-based short and small obstacle detection method, a three-camera-based short and small obstacle detection device, and terminal equipment.
Background
As vehicles based on assisted driving and automatic driving are applied to ever more scenes, the demand for obstacle detection keeps increasing. Accurate obstacle detection improves driving safety and ensures the normal operation of the vehicle.
At present, automatic driving perception technology mainly identifies obstacles in two ways: visual perception and laser perception. Visual perception includes monocular vision combined with lidar, TOF (Time of Flight), and binocular vision obstacle avoidance. Monocular vision combined with lidar or TOF mainly recognizes a limited set of categories such as people and vehicles, and performs poorly on obstacles outside these categories or on small obstacles close to the ground. Binocular vision is not limited to fixed categories, but it is very sensitive to illumination, and recognition degrades when the scene is slightly too bright or too dark. Laser perception is likewise not limited to obstacle categories, but the laser point cloud is sparse, so the detection of short and small obstacles is poor.
Disclosure of Invention
The embodiment of the invention provides a three-camera-based method and device for detecting short and small obstacles, a robot and a storage medium, aiming to improve the detection accuracy of short and small obstacles.
In a first aspect, an embodiment of the present invention provides a method for detecting short and small obstacles based on three cameras, including:
acquiring a first image, a second image and a third image which are synchronously acquired by a first camera, a second camera and a third camera;
calculating the difference between the first image and the second image to obtain a first difference feature map;
calculating the difference between the third image and the second image to obtain a second difference feature map;
fusing the first difference feature map and the second difference feature map to obtain a fused feature map;
and determining the area where the obstacle is located based on the fused feature map.
In a second aspect, an embodiment of the present invention further provides a short and small obstacle detection device, including:
the image acquisition module is used for acquiring a first image, a second image and a third image which are synchronously acquired by the first camera, the second camera and the third camera;
the first difference feature calculation module is used for calculating the difference between the first image and the second image to obtain a first difference feature map;
the second difference feature calculation module is used for calculating the difference between the third image and the second image to obtain a second difference feature map;
the difference fusion module is used for fusing the first difference feature map and the second difference feature map to obtain a fusion feature map;
and the post-processing module is used for determining the area where the obstacle is located based on the fusion feature map.
In a third aspect, an embodiment of the present invention further provides a short and small obstacle detection device, including:
at least one processor; and at least one memory storing instructions executable by the at least one processor;
the instructions are executable by the at least one processor to cause the at least one processor to implement the three-camera based short and small obstacle detection method according to the first aspect.
In a fourth aspect, an embodiment of the present invention further provides a robot, including:
the first camera is used for acquiring a first image;
the second camera is used for acquiring a second image;
the third camera is used for acquiring a third image;
and a short and small obstacle detecting device according to the second or third aspect.
In a fifth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the three-camera-based short and small obstacle detection method according to the first aspect.
In this embodiment, a first image, a second image and a third image synchronously acquired by a first camera, a second camera and a third camera are obtained; the difference between the first image and the second image is calculated to obtain a first difference feature map; the difference between the third image and the second image is calculated to obtain a second difference feature map; the first difference feature map and the second difference feature map are fused to obtain a fused feature map; and the area where the obstacle is located is determined based on the fused feature map. Through the combination of three cameras and two networks, the difference features generated in two different directions are fused into a complete obstacle difference feature, so that short and small obstacles on indoor and outdoor pavements can be effectively identified, the influence of interference factors such as road markings and shadows on identification accuracy is reduced, the method can complement general visual perception and laser perception, and the perception of obstacles during automatic driving is enhanced.
Drawings
Fig. 1A is a flowchart of a three-camera-based short and small obstacle detection method according to an embodiment of the present invention;
fig. 1B is a flowchart illustrating exemplary data processing of a three-camera-based short and small obstacle detection method according to an embodiment of the present invention;
fig. 1C is a fused feature map obtained by using the three-camera-based short and small obstacle detection method for a conical barricade;
fig. 2 is a schematic structural diagram of a short and small obstacle detection device according to a second embodiment of the present invention;
fig. 3 is a schematic structural diagram of a short and small obstacle detection device according to a third embodiment of the present invention;
fig. 4 is a schematic structural diagram of a robot according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1A is a flowchart of a three-camera-based short and small obstacle detection method according to an embodiment of the present invention, and fig. 1B is a flowchart of an example of the data processing of the three-camera-based short and small obstacle detection method according to an embodiment of the present invention. This embodiment is applicable to improving the detection accuracy of short and small obstacles. The method may be implemented by a three-camera-based short and small obstacle detection apparatus, which may be realized in software and/or hardware and may be configured in computer devices and robots, for example an intelligent robot, a server, a personal computer, and the like, and specifically includes the following steps:
Step 101, acquiring a first image, a second image and a third image which are synchronously acquired by a first camera, a second camera and a third camera.
Illustratively, the first camera, the second camera and the third camera are all arranged on the same plane and all capture the spatial scene in front of that plane. The first camera and the second camera are aligned at a preset distance apart in the same horizontal direction, the second camera and the third camera are aligned at a preset distance apart in the same vertical direction, and the third camera is tilted downwards by a preset angle so as to capture a larger portion of the ground. Three buffer queues are set up; the images captured by the first camera, the second camera and the third camera are fed into their respective buffer queues, and images shot at the same time are selected from the three buffer queues as the first image, the second image and the third image.
Illustratively, the first camera and the second camera are symmetrically arranged 30 cm apart in the same horizontal direction, and both are oriented horizontally with respect to the ground. The second camera and the third camera are arranged 40 cm apart in the same vertical direction, with the third camera mounted above the second camera and tilted downwards by 20 degrees. The first camera, the second camera and the third camera continuously shoot the scene in front of the cameras to collect images. The three cameras feed the collected data into different queues of the three buffer queues; images captured at the same time are taken from the three buffer queues as the first image, the second image and the third image, respectively. If three images with exactly matching timestamps do not exist, three images whose time differences do not exceed a preset interval threshold, for example 10 ms, are selected as the first image, the second image and the third image. The images acquired by the three cameras have a resolution of 1028 × 720 and a frame rate of 30 frames per second.
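As an illustrative sketch only (not a definitive implementation), the frame selection described above could look as follows in Python, assuming each buffer queue holds (timestamp, frame) pairs in arrival order; the function name, queue layout and tolerance constant are hypothetical.

```python
from collections import deque

SYNC_TOLERANCE_S = 0.010  # 10 ms, the preset interval threshold assumed above

def pop_synchronized_frames(q1: deque, q2: deque, q3: deque):
    """Return one (frame1, frame2, frame3) triple whose timestamps differ by
    at most SYNC_TOLERANCE_S, or None if no such triple is buffered yet.
    Each queue holds (timestamp_seconds, image) tuples in arrival order."""
    while q1 and q2 and q3:
        t1, f1 = q1[0]
        t2, f2 = q2[0]
        t3, f3 = q3[0]
        t_min, t_max = min(t1, t2, t3), max(t1, t2, t3)
        if t_max - t_min <= SYNC_TOLERANCE_S:
            q1.popleft(); q2.popleft(); q3.popleft()
            return f1, f2, f3
        # Drop the oldest frame and try again with the next one in that queue.
        if t1 == t_min:
            q1.popleft()
        elif t2 == t_min:
            q2.popleft()
        else:
            q3.popleft()
    return None
```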
It should be noted that, in the above embodiments, the first camera and the second camera are in the same horizontal direction, and the second camera and the third camera are in the same vertical direction, which is an exemplary illustration of the embodiments of the present invention, in other embodiments of the present invention, the first camera, the second camera, and the third camera may also have other different position setting methods, for example, the first camera and the second camera are in the same horizontal direction, and the third camera is located right above a midpoint of the first camera and the second camera, and the present invention is not limited herein.
In some embodiments of the present invention, step 101 comprises:
step 1011, calculating a first homography matrix according to the four pairs of key mark points in the first image and the second image.
Homography describes the positional mapping of an object between two images. A homography matrix H between the two images is calculated from the coordinates of four pairs of key mark points selected in the two images, and a projective transformation is then applied to map one image onto the other; the transformation matrix used for this projective transformation is called the homography matrix. The homography matrix can be represented by a 3 × 3 nonsingular matrix H as follows:
H = [h11 h12 h13; h21 h22 h23; h31 h32 h33]
The homography matrix H is a homogeneous matrix with 8 unknowns, the last element h33 being normalized to 1.
Four pairs of points of the same feature are selected in the first image and the second image as key mark point pairs, and each pair of key mark point pairs is a pixel point pair representing the same feature in the first image and the second image.
The value of the first homography matrix H1 for aligning the first image with the second image is obtained from the coordinate relation of the four pairs of key mark points in the first image and the second image according to the following relation, where (xi, yi) and (x'i, y'i) are the coordinates of the i-th key mark point pair in the first image and the second image and si is a scale factor:
si · [x'i, y'i, 1]^T = H1 · [xi, yi, 1]^T, i = 1, 2, 3, 4
In some embodiments of the present invention, before calculating the first homography matrix, the first image and the second image may be respectively corrected by using the intrinsic (internal reference) matrices of the first camera and the second camera, so as to display the images properly and provide standard position information for navigation. After the first image and the second image are corrected with the intrinsic matrices, the corrected first image and second image are used to mark the key mark point pairs from which the first homography matrix is calculated.
It should be noted that, in the above embodiment, the process of obtaining the first homography matrix through four pairs of key mark points is an exemplary description of the embodiment of the present invention, and in other embodiments of the present invention, there may be other processes of obtaining the first homography matrix through four pairs of key mark points, which is not limited herein.
Step 1012 multiplies the first homography matrix with a matrix comprised of pixel values of the first image to align the first image with the second image.
For points in the first image plane, the point values are normalized with w = 1, and a coordinate matrix [x, y, 1]^T is formed from the two image coordinates x and y. Multiplying this coordinate matrix by the first homography matrix H1 gives the coordinate matrix [x', y', w']^T, i.e. the first image and the second image are aligned by the following formula:
[x', y', w']^T = H1 · [x, y, 1]^T
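For illustration only, a minimal sketch of computing such a homography from four key mark point pairs and warping the first image into the second image's plane, assuming OpenCV is used (the patent names no library); the point coordinates and file names are hypothetical placeholders.

```python
import cv2
import numpy as np

# Four key mark point pairs (pixel coordinates of the same physical features),
# hypothetical values: (x, y) in the first image -> corresponding (x, y) in the second image.
pts_first  = np.float32([[102, 540], [930, 552], [480, 300], [620, 610]])
pts_second = np.float32([[ 88, 538], [915, 549], [468, 298], [605, 607]])

# 3x3 homography H1 with its last element normalized to 1.
H1 = cv2.getPerspectiveTransform(pts_first, pts_second)

first  = cv2.imread("first.png")    # image from the first camera
second = cv2.imread("second.png")   # image from the second camera

# Warp the first image into the second image's plane so the two are aligned.
h, w = second.shape[:2]
first_aligned = cv2.warpPerspective(first, H1, (w, h))
```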
In some embodiments of the present invention, step 101 comprises:
and 1013, calculating a first homography matrix according to the four pairs of key mark points in the second image and the third image.
Four pairs of points of the same feature are selected in the second image and the third image as key mark point pairs, and each pair of key mark point pairs is a pixel point pair representing the same feature in the second image and the third image.
The value of the second homography matrix H2 for aligning the third image with the second image is obtained from the coordinate relation of the four pairs of key mark points in the second image and the third image according to the following relation, where (xi, yi) and (x'i, y'i) are the coordinates of the i-th key mark point pair in the third image and the second image and si is a scale factor:
si · [x'i, y'i, 1]^T = H2 · [xi, yi, 1]^T, i = 1, 2, 3, 4
In some embodiments of the present invention, before calculating the second homography matrix, the second image and the third image may be respectively corrected by using the intrinsic (internal reference) matrices of the second camera and the third camera, so as to display the images properly and provide standard position information for navigation. After the second image and the third image are corrected with the intrinsic matrices, the corrected second image and third image are used to mark the key mark point pairs from which the second homography matrix is calculated.
It should be noted that, the process of obtaining the second homography matrix through four pairs of key mark points in the foregoing embodiment is an exemplary description of the embodiment of the present invention, and in other embodiments of the present invention, there may be other processes of obtaining the second homography matrix through four pairs of key mark points, and the present invention is not limited herein.
Step 1014 multiplies the second homography matrix with a matrix comprised of pixel values of the third image to align the third image with the second image.
For points in the third image plane, the point values are normalized with w = 1, and a coordinate matrix [x, y, 1]^T is formed from the two image coordinates x and y. Multiplying this coordinate matrix by the second homography matrix H2 gives the coordinate matrix [x', y', w']^T, i.e. the third image and the second image are aligned by the following formula:
[x', y', w']^T = H2 · [x, y, 1]^T
Step 102, calculating the difference between the first image and the second image to obtain a first difference feature map.
The first image and the second image contain a common part and are horizontally aligned. The first difference feature map is generated by calculating the difference between the first image and the second image, so features along the horizontal direction of the area in front of the cameras can be obtained; objects that have width produce more pronounced features.
It should be noted that, the horizontal direction in the above embodiments is an exemplary description of the embodiments of the present invention, and in other embodiments of the present invention, according to the setting conditions of the first camera and the second camera, the difference between the first image and the second image may also be calculated to obtain features in other directions, and the present invention is not limited herein.
In some embodiments of the present invention, step 102 comprises:
and step 1021, respectively inputting the first image and the second image into a MobileNet V2 network for processing, and taking feature maps output by the first k convolution blocks in the MobileNet V2 network to obtain k first feature maps corresponding to the first image and k first feature maps corresponding to the second image.
The MobileNetV2 network is a network that significantly reduces model parameters and computation while maintaining similar accuracy. It first expands the low-dimensional compressed representation of the input to a higher dimension and filters it with lightweight depthwise convolution; the features are then projected back into a low-dimensional compressed representation using a linear bottleneck. The depthwise convolution in the MobileNetV2 network is a 3 × 3 depthwise separable convolution. The MobileNetV2 network includes a plurality of sequentially connected convolution blocks, with the output of a previous convolution block serving as the input of the subsequent convolution block.
After the first image is adjusted to a preset size and input into a MobileNet V2 network, taking feature maps output by the first k convolution blocks in the MobileNet V2 network as k first feature maps corresponding to the first image, wherein the first feature maps are feature maps containing image features output by the convolution blocks of the MobileNet V2 network.
And after the second image is adjusted to a preset size and is input into a MobileNet V2 network, taking feature maps output by the first k convolution blocks in the MobileNet V2 network as k first feature maps corresponding to the second image. The number of the first feature maps extracted from the first image and the second image must be the same.
Illustratively, k is 3, after the first image and the second image are adjusted to 512 × 256, the first image and the second image are respectively input into a MobileNetV2 network, and feature maps output by the first 3 convolution blocks are respectively extracted, so as to obtain 3 first feature maps corresponding to the first image and 3 first feature maps corresponding to the second image. They include features of the first image and the second image, respectively.
It should be noted that the value of k in the foregoing embodiments is an exemplary description of the embodiments of the present invention, and in other embodiments of the present invention, k may also be another value, and the present invention is not limited herein.
Step 1022, calculating the difference between the ith first feature map corresponding to the first image and the ith first feature map corresponding to the second image to obtain k first significant feature maps, wherein i is less than or equal to k.
The numbers of first feature maps extracted from the first image and the second image are the same. With i running from 1 to k, the difference between the ith first feature map corresponding to the first image and the ith first feature map corresponding to the second image is calculated layer by layer, which makes the features in the overlapping part of the first image and the second image more prominent; the feature map obtained after this subtraction is called a first significant feature map.
Exemplarily, the value of k is 3, i is set to 1, and the difference between the 1 st first feature map corresponding to the first image and the 1 st first feature map corresponding to the second image is calculated to obtain the 1 st first salient feature map; setting i as 2, and calculating the difference value between the 2 nd first feature map corresponding to the first image and the 2 nd first feature map corresponding to the second image to obtain a 2 nd first significant feature map; and i is set to be 3, calculating the difference value between the 3 rd first feature map corresponding to the first image and the 3 rd first feature map corresponding to the second image to obtain the 3 rd first significant feature map, so that k first significant feature maps are obtained in total.
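For illustration only, a minimal sketch of steps 1021-1022, under the assumption that a recent torchvision implementation of MobileNetV2 serves as the backbone and that "the first k convolution blocks" correspond to the first k modules of its feature extractor; the tensors shown are random placeholders for the aligned images.

```python
import torch
import torchvision

def first_k_feature_maps(image: torch.Tensor, backbone, k: int = 3):
    """Run `image` (shape 1x3x256x512) through the backbone and return the
    outputs of its first k blocks as a list of k feature maps."""
    feats, x = [], image
    for block in list(backbone.features.children())[:k]:
        x = block(x)
        feats.append(x)
    return feats

# weights=None requires torchvision >= 0.13; older versions use pretrained=False.
backbone = torchvision.models.mobilenet_v2(weights=None).eval()

with torch.no_grad():
    first_img  = torch.rand(1, 3, 256, 512)   # placeholder for the aligned first image
    second_img = torch.rand(1, 3, 256, 512)   # placeholder for the second image
    f1 = first_k_feature_maps(first_img,  backbone, k=3)
    f2 = first_k_feature_maps(second_img, backbone, k=3)
    # Step 1022: layer-by-layer difference -> k first significant feature maps.
    first_salient = [a - b for a, b in zip(f1, f2)]
```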
Step 1023, respectively inputting the first image and the second image into an mbv2_ca network for processing, and taking the feature maps output by the first m convolution blocks in the mbv2_ca network to obtain m second feature maps corresponding to the first image and m second feature maps corresponding to the second image.
The mbv2_ca network is a MobileNetV2 network with an attention block added. An attention mechanism can be regarded as a resource allocation mechanism: instead of allocating resources evenly, units are weighted according to the importance of the attended object, with important units receiving larger weights and unimportant or harmful units receiving smaller weights. In the structural design of a deep neural network, the resources allocated by attention are essentially weights, and adding an attention block effectively improves the expressive power of the convolutional features. The mbv2_ca network includes a plurality of sequentially connected convolution blocks, with the output of a previous convolution block serving as the input of the subsequent convolution block.
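The patent does not spell out the attention design. Assuming that "ca" denotes coordinate attention, the following is a minimal sketch of the kind of attention block that could be attached to a MobileNetV2 convolution block; the class name, pooling choice and reduction factor are hypothetical.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Re-weights a feature map with direction-aware attention factors."""
    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        n, c, h, w = x.shape
        x_h = x.mean(dim=3, keepdim=True)                      # pool along width  -> (n, c, h, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)  # pool along height -> (n, c, w, 1)
        y = self.act(self.bn1(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # (n, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (n, c, 1, w)
        return x * a_h * a_w                                       # attention-weighted features
```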
After the first image is adjusted to a preset size and input into an mbv2_ ca network, feature maps output by the first m convolution blocks in the mbv2_ ca network are taken as m second feature maps corresponding to the first image, and the second feature maps are feature maps containing image features output by the convolution blocks in the mbv2_ ca network.
After the second image is adjusted to a preset size and input into an mbv2_ ca network, feature maps output by the first m convolution blocks in the mbv2_ ca network are taken as m second feature maps corresponding to the second image. The number of the second feature maps extracted from the first image and the second image must be the same.
The first image and the second image are subjected to feature extraction by using an mbv2_ ca network to obtain m second feature maps, and the m second feature maps can be combined with k first feature maps obtained by using the MobileNetV2 network feature extraction in the step 1021 to form (k + m) features, so that more different features in the first image and the second image can be conveniently searched.
Illustratively, m is 2, after the first image and the second image are adjusted to 512 × 256, the first image and the second image are respectively input into mbv2_ ca networks, and feature maps output by the first 2 convolution blocks are respectively extracted, so as to obtain 2 second feature maps corresponding to the first image and 2 second feature maps corresponding to the second image. They include features of the first image and the second image, respectively.
It should be noted that the value of m in the foregoing embodiments is an exemplary description of the embodiments of the present invention, and in other embodiments of the present invention, m may also be another value, and the values of m and k may be equal or unequal, which is not limited herein.
Step 1024, calculating a difference value between the ith second feature map corresponding to the first image and the ith second feature map corresponding to the second image to obtain m second significant feature maps, wherein i is less than or equal to m, and the k first significant feature maps and the m second significant feature maps form a first difference feature map.
The numbers of second feature maps extracted from the first image and the second image are the same. With i running from 1 to m, the difference between the ith second feature map corresponding to the first image and the ith second feature map corresponding to the second image is calculated layer by layer, which makes the features in the overlapping part of the first image and the second image more prominent; the feature map obtained after this subtraction is called a second significant feature map.
Exemplarily, the value of m is 2, i is set to 1, and the difference between the 1 st second feature map corresponding to the first image and the 1 st second feature map corresponding to the second image is calculated to obtain the 1 st second significant feature map; setting i as 2, and calculating the difference value between the 2 nd second feature map corresponding to the first image and the 2 nd second feature map corresponding to the second image to obtain a 2 nd second significant feature map; thus, a total of 2 second salient feature maps are obtained.
The first difference feature map thus consists of the k first significant feature maps extracted and computed with the MobileNetV2 network and the m second significant feature maps extracted and computed with the mbv2_ca network. Because the first image and the second image are captured by the first camera and the second camera, which are separated in the same horizontal direction and therefore have a horizontal parallax, the first difference feature map generated from them exhibits obvious difference features in the horizontal direction for wide objects.
The shallow layers of a network extract primary features of an image, such as texture, color and corner points, while the deep layers extract semantic features that cannot be recognized by the naked eye. In this embodiment, by taking the feature maps of the first k layers of the MobileNetV2 network and the feature maps of the first m layers of the mbv2_ca network, primary features such as texture and color can be extracted from each image; these primary features are sufficient for detecting obstacles, which improves the efficiency of detecting short and small obstacles.
And the primary features of the image are extracted by adopting a MobileNet V2 network and a mbv2_ ca network, and the primary features extracted by the two networks are combined to improve the detection accuracy.
It should be noted that the MobileNetV2 network and the mbv2_ ca network used in the above embodiments are exemplary illustrations of the embodiments of the present invention, and in some embodiments of the present invention, other networks may also be used to perform feature extraction on the first image and the second image, and the present invention is not limited herein.
Step 103, calculating the difference between the third image and the second image to obtain a second difference feature map.
The second image and the third image contain a common part and are vertically aligned. The second difference feature map is generated by calculating the difference between the second image and the third image, so features along the vertical direction of the area in front of the cameras can be obtained; objects that have height produce more pronounced features.
In some embodiments of the invention, step 103 comprises:
and step 1031, inputting the third image into a MobileNet V2 network for processing, and taking feature maps output by the first k convolution blocks in the MobileNet V2 network to obtain k first feature maps corresponding to the third image.
And after the third image is adjusted to a preset size and is input into a MobileNet V2 network, taking feature maps output by the first k convolution blocks in the MobileNet V2 network as k first feature maps corresponding to the third image.
Illustratively, k is 3; after the third image is adjusted to 512 × 256, it is input into a MobileNetV2 network, and the feature maps output by the first 3 convolution blocks are extracted, so as to obtain 3 first feature maps corresponding to the third image. They include features of the third image.
It should be noted that the value of k in the foregoing embodiments is an exemplary description of the embodiments of the present invention, and in other embodiments of the present invention, k may also be another value, and the present invention is not limited herein.
Step 1032, calculating the difference between the ith first feature map corresponding to the third image and the ith first feature map corresponding to the second image to obtain k third significant feature maps, wherein i is less than or equal to k.
The numbers of first feature maps extracted from the second image and the third image are the same, namely k. With i running from 1 to k, the difference between the ith first feature map corresponding to the third image and the ith first feature map corresponding to the second image is calculated layer by layer, which makes the features in the overlapping part of the third image and the second image more prominent; the feature map obtained after this subtraction is called a third significant feature map.
Exemplarily, the value of k is 3, i is set to 1, and the difference between the 1 st first feature map corresponding to the third image and the 1 st first feature map corresponding to the second image is calculated to obtain the 1 st third significant feature map; setting i as 2, and calculating the difference value between the 2 nd first feature map corresponding to the third image and the 2 nd first feature map corresponding to the second image to obtain a 2 nd third significant feature map; and i is set to be 3, calculating the difference value between the 3 rd first feature map corresponding to the third image and the 3 rd first feature map corresponding to the second image to obtain a 3 rd third significant feature map, and therefore k third significant feature maps are obtained in total.
Step 1033, inputting the third image into an mbv2_ca network for processing, and taking the feature maps output by the first m convolution blocks in the mbv2_ca network to obtain m second feature maps corresponding to the third image.
And after the third image is adjusted to a preset size and input into an mbv2_ ca network, taking feature maps output by the first m convolution blocks in the mbv2_ ca network as m second feature maps corresponding to the third image.
Illustratively, m is 2; after the third image is adjusted to 512 × 256, it is input into an mbv2_ca network, and the feature maps output by the first 2 convolution blocks are extracted, so as to obtain 2 second feature maps corresponding to the third image. They include features of the third image.
It should be noted that the value of m in the foregoing embodiments is an exemplary description of the embodiments of the present invention, and in other embodiments of the present invention, m may also be another value, and the values of m and k may be equal or unequal, which is not limited herein.
Step 1034, calculating a difference value between the ith second feature map corresponding to the third image and the ith second feature map corresponding to the second image to obtain m fourth significant feature maps, wherein i is less than or equal to m, and the k third significant feature maps and the m fourth significant feature maps form a second difference feature map.
The numbers of second feature maps extracted from the second image and the third image are the same. With i running from 1 to m, the difference between the ith second feature map corresponding to the third image and the ith second feature map corresponding to the second image is calculated layer by layer, which makes the features in the overlapping part of the third image and the second image more prominent; the feature map obtained after this subtraction is called a fourth significant feature map.
Exemplarily, the value of m is 2, i is set to 1, and the difference between the 1 st second feature map corresponding to the third image and the 1 st second feature map corresponding to the second image is calculated to obtain the 1 st fourth significant feature map; setting i as 2, and calculating the difference value between the 2 nd second feature map corresponding to the third image and the 2 nd second feature map corresponding to the second image to obtain a 2 nd fourth significant feature map; thus, a total of 2 fourth saliency maps are obtained.
In step 103, the k third significant feature maps extracted and computed with the MobileNetV2 network and the m fourth significant feature maps extracted and computed with the mbv2_ca network constitute the second difference feature map. Because the second image and the third image are captured by the second camera and the third camera, which are separated in the same vertical direction and therefore have a vertical parallax, the second difference feature map generated from them exhibits obvious difference features in the vertical direction for tall objects.
It should be noted that the MobileNetV2 network and the mbv2_ca network used in the above embodiments are exemplary illustrations of the embodiments of the present invention; in some embodiments of the present invention, other networks may also be used to perform feature extraction on the second image and the third image, and the present invention is not limited herein.
Step 104, fusing the first difference feature map and the second difference feature map to obtain a fused feature map.
The first difference feature map represents features in the horizontal direction, and the second difference feature map represents features in the vertical direction.
The fused feature map generated by fusing the first difference feature map and the second difference feature map can completely represent the salient features of an obstacle, so that obstacles of various shapes, heights and widths can be characterized; this improves detection accuracy and prevents obstacles from being missed because they are too narrow or too short.
In some embodiments of the present invention, step 104 comprises:
step 1041, calculating a sum of the ith first significant feature map in the first difference feature map and the ith third significant feature map in the second difference feature map, wherein i is less than or equal to k, and obtaining k first fusion feature maps.
The first difference feature map comprises k first significant feature maps, and the second difference feature map comprises k third significant feature maps; they represent the features extracted by the MobileNetV2 network in the horizontal and vertical directions, respectively. With i running from 1 to k, the sum of the ith layer of the first difference feature map and the ith layer of the second difference feature map is calculated layer by layer, so that the features in the horizontal and vertical directions together form the complete salient features of the obstacle.
Exemplarily, k is 3, i is set to 1, and the sum of the 1 st first significant feature map and the 1 st third significant feature map is calculated to obtain the 1 st first fused feature map; changing the i into 2, and calculating the sum of the 2 nd first significant feature map and the 2 nd third significant feature map to obtain a 2 nd first fusion feature map; and i is changed into 3, calculating the sum of the 3 rd first significant feature map and the 3 rd third significant feature map to obtain a 3 rd first fusion feature map, and obtaining k first fusion feature maps through the calculation, wherein the first fusion feature map is the complete significant feature of the obstacle obtained by combining the features extracted in the horizontal and vertical directions through a MobileNet V2 network.
It should be noted that the value of k in the foregoing embodiments is an exemplary description of the embodiments of the present invention, and in other embodiments of the present invention, k may also be another value, and the present invention is not limited herein.
Step 1042, calculating a sum of the ith second significant feature map in the first difference feature map and the ith fourth significant feature map in the second difference feature map, wherein i is less than or equal to m, to obtain m second fused feature maps.
The first difference feature map comprises m second significant feature maps, and the second difference feature map comprises m fourth significant feature maps; they represent the features extracted by the mbv2_ca network in the horizontal and vertical directions, respectively. With i running from 1 to m, the sum of the ith layer of the first difference feature map and the ith layer of the second difference feature map is calculated layer by layer, so that the features in the horizontal and vertical directions together form the complete salient features of the obstacle.
Exemplarily, m is 2, i is set to 1, and the sum of the 1 st second significant feature map and the 1 st fourth significant feature map is calculated to obtain the 1 st second fused feature map; changing the i into 2, and calculating the sum of the 2 nd second significant feature map and the 2 nd fourth significant feature map to obtain a 2 nd second fusion feature map; and obtaining m second fusion feature maps through the calculation, wherein the second fusion feature maps are the complete salient features of the obstacle obtained by combining the features extracted in the horizontal and vertical directions through an mbv2_ ca network.
It should be noted that the value of m in the foregoing embodiments is an exemplary description of the embodiments of the present invention, and m may also be another value in other embodiments of the present invention, and the present invention is not limited herein.
Step 1043, multiplying the first fused feature maps and the second fused feature maps by their corresponding hyper-parameters to obtain (k + m) fused hyper-parameter feature maps.
In order to more clearly represent the characteristics of the obstacle, (k + m) hyper-parameters are preset and are respectively multiplied by the k first fused characteristic maps and the m second fused characteristic maps to highlight or weaken partial characteristics. Multiplying the k first fusion feature maps obtained in the step 1041 by the corresponding k hyper-parameters to obtain k fusion hyper-parameter feature maps; and multiplying the m second fusion feature maps obtained in the step 1042 by the corresponding m hyper-parameters to obtain m fusion hyper-parameter feature maps. Therefore, (k + m) fusion hyper-parameter feature maps are obtained in total.
Illustratively, 3 first fused feature maps are obtained in step 1041, and 2 second fused feature maps are obtained in step 1042. Multiplying the 3 first fusion feature maps by the 3 hyper-parameters respectively to obtain 3 fusion hyper-parameter feature maps; multiplying the 2 second fusion feature maps by the 2 hyper-parameters respectively to obtain 2 fusion hyper-parameter feature maps; a total of 5 fused hyper-parameter feature maps are obtained.
Step 1044, calculating the sum of the (k + m) fused hyper-parameter feature maps to obtain the fused feature map.
The (k + m) fused hyper-parameter feature maps are summed, i.e. all feature maps of the obstacle after hyper-parameter adjustment are added together, to obtain a fused feature map with better expressiveness. The fused feature map can represent the features of obstacles more accurately.
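For illustration only, a minimal sketch of steps 1041-1044, assuming the salient feature maps are PyTorch tensors. Upsampling the maps to a common resolution before the final summation and collapsing the channels into a single gray-value map are assumptions, since the patent does not state how maps from different blocks are combined spatially.

```python
import torch
import torch.nn.functional as F

def fuse_difference_features(first_diff, second_diff, hyper_params,
                             out_size=(256, 512)):
    """first_diff / second_diff: lists of (k + m) salient feature maps
    (horizontal- and vertical-direction differences), ordered so that the
    ith entries come from the same backbone block.
    hyper_params: (k + m) scalar hyper-parameters.
    Returns a single-channel fused feature map of spatial size out_size."""
    fused = None
    for d_h, d_v, alpha in zip(first_diff, second_diff, hyper_params):
        s = alpha * (d_h + d_v)                       # steps 1041/1042 and 1043
        s = F.interpolate(s, size=out_size, mode="bilinear", align_corners=False)
        s = s.sum(dim=1, keepdim=True)                # collapse channels (assumption)
        fused = s if fused is None else fused + s     # step 1044: sum all maps
    return fused
```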
Step 105, determining the area where the obstacle is located based on the fused feature map.
Fig. 1C is a fused feature map obtained by using the three-camera-based short and small obstacle detection method. As shown in fig. 1C, the fused feature map obtained through the above processing contains a relatively obvious continuous region of distinct gray values, and this region is the region where the obstacle is located.
It should be noted that, in the foregoing embodiment, taking a relatively obvious and continuous region with different gray values as a region where an obstacle is located is an exemplary description of an embodiment of the present invention, in other embodiments of the present invention, a region where an obstacle in a fusion feature map is located may also be determined by other methods, which is not limited herein.
In some embodiments of the invention, step 105 comprises:
step 1051, comparing the gray value of each pixel in the fusion characteristic graph with a preset gray threshold value.
The preset gray threshold is used to determine whether there is an area where an obstacle exists.
The gray value of each pixel in the fused feature map is compared with the preset gray threshold, and regions containing obstacles are detected according to this threshold.
Step 1053, taking the area formed by the pixels whose gray values are greater than or equal to the gray threshold as the area where the obstacle is located.
When a plurality of pixel points with gray values greater than or equal to the gray threshold form a region in the fused feature map, that region is regarded as the region where the obstacle is located.
Illustratively, the preset grayscale threshold is 200, and when the grayscale values of a plurality of pixel points in the fusion feature map are all greater than or equal to 200, the region formed by the pixel points is regarded as the region where the obstacle is located, that is, the obstacle has been detected.
It should be noted that the preset grayscale threshold 200 in the above embodiment is an exemplary illustration of the embodiment of the present invention, and in other embodiments of the present invention, the grayscale threshold may also be determined according to the situation and accuracy requirement applied in the actual scene, and the present invention is not limited herein.
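For illustration only, a minimal sketch of steps 1051-1053 with OpenCV, assuming the fused feature map has been scaled to 8-bit gray values and using the example threshold of 200; the connected-components pass and the min_area noise filter are added here as one plausible way to turn the surviving pixels into regions, not as the patent's prescribed post-processing.

```python
import cv2
import numpy as np

GRAY_THRESHOLD = 200  # example threshold from the embodiment above

def obstacle_regions(fused_map: np.ndarray, min_area: int = 50):
    """fused_map: fused feature map scaled to uint8 gray values (H x W).
    Returns bounding boxes (x, y, w, h) of regions whose pixels are
    >= GRAY_THRESHOLD; min_area is a hypothetical noise filter."""
    # THRESH_BINARY keeps pixels strictly greater than the threshold argument,
    # so use GRAY_THRESHOLD - 1 to include pixels equal to 200.
    _, mask = cv2.threshold(fused_map, GRAY_THRESHOLD - 1, 255, cv2.THRESH_BINARY)
    n, _, stats, _ = cv2.connectedComponentsWithStats(mask, connectivity=8)
    boxes = []
    for label in range(1, n):          # label 0 is the background
        x, y, w, h, area = stats[label]
        if area >= min_area:
            boxes.append((x, y, w, h))
    return boxes
```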
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
In this embodiment, a first image, a second image and a third image synchronously acquired by a first camera, a second camera and a third camera are obtained; the difference between the first image and the second image is calculated to obtain a first difference feature map; the difference between the third image and the second image is calculated to obtain a second difference feature map; the first difference feature map and the second difference feature map are fused to obtain a fused feature map; and the area where the obstacle is located is determined based on the fused feature map. Through the combination of three cameras and two networks, the difference features generated in two different directions are fused into a complete obstacle difference feature, so that short and small obstacles on indoor and outdoor pavements can be effectively identified, the influence of interference factors such as road markings and shadows on identification accuracy is reduced, the method can complement general visual perception and laser perception, and the perception of obstacles during automatic driving is enhanced.
Example two
Fig. 2 is a block diagram of a structure of a short and small obstacle detection device according to a second embodiment of the present invention, which may specifically include the following modules:
the image acquisition module 201 is configured to acquire a first image, a second image and a third image which are acquired by a first camera, a second camera and a third camera synchronously;
a first difference feature calculating module 202, configured to calculate a difference between the first image and the second image to obtain a first difference feature map;
a second difference feature calculating module 203, configured to calculate a difference between the third image and the second image to obtain a second difference feature map;
a difference fusion module 204, configured to fuse the first difference feature map and the second difference feature map to obtain a fusion feature map;
and the post-processing module 205 is configured to determine an area where the obstacle is located based on the fused feature map.
In some embodiments of the present invention, the image obtaining module 201 includes:
a first homography matrix generation submodule, configured to calculate a first homography matrix according to four pairs of key mark point pairs in the first image and the second image, where each pair of key mark point pairs is a pixel point pair representing the same feature in the first image and the second image;
a first image alignment module for multiplying the first homography matrix with a matrix consisting of pixel values of the first image to align the first image and the second image.
In some embodiments of the present invention, the image obtaining module 201 includes:
a second homography matrix generation submodule, configured to calculate a second homography matrix according to four pairs of key mark point pairs in the second image and the third image, where each pair of key mark point pairs is a pixel point pair representing the same feature in the second image and the third image;
a third image alignment module for multiplying the second homography matrix with a matrix consisting of pixel values of the third image to align the third image with the second image.
In some embodiments of the present invention, the first difference feature calculating module 202 includes:
a first feature map generation sub-module, configured to input the first image and the second image into a MobileNetV2 network respectively for processing, and obtain k first feature maps corresponding to the first image and k first feature maps corresponding to the second image by using feature maps output by the first k convolution blocks in the MobileNetV2 network;
a first salient feature map generation submodule, configured to calculate a difference between an ith first feature map corresponding to the first image and an ith first feature map corresponding to the second image, respectively, to obtain k first salient feature maps, where i is less than or equal to k;
a second feature map generation submodule, configured to input the first image and the second image into an mbv2_ca network respectively for processing, and take the feature maps output by the first m convolution blocks in the mbv2_ca network to obtain m second feature maps corresponding to the first image and m second feature maps corresponding to the second image;
and the second significant feature map generation submodule is used for calculating a difference value between an ith second feature map corresponding to the first image and an ith second feature map corresponding to the second image to obtain m second significant feature maps, wherein i is less than or equal to m, and k first significant feature maps and m second significant feature maps form a first difference feature map.
In some embodiments of the present invention, the second difference feature calculating module 203 includes:
a first feature map generation submodule, configured to input the third image into a MobileNetV2 network for processing, and obtain k first feature maps corresponding to the third image by taking feature maps output by the first k convolution blocks in the MobileNetV2 network;
a third significant feature map generation submodule, configured to calculate a difference between an ith first feature map corresponding to the third image and an ith first feature map corresponding to the second image, respectively, to obtain k third significant feature maps, where i is less than or equal to k;
a second feature map generation submodule, configured to input the third image into an mbv2_ca network for processing, and take the feature maps output by the first m convolution blocks in the mbv2_ca network to obtain m second feature maps corresponding to the third image;
and the fourth significant feature map generation submodule is used for calculating a difference value between an ith second feature map corresponding to the third image and an ith second feature map corresponding to the second image to obtain m fourth significant feature maps, wherein i is less than or equal to m, and the k third significant feature maps and the m fourth significant feature maps form the second difference feature map.
In some embodiments of the present invention, the difference fusion module 204 includes:
a first fused feature map generation submodule, configured to calculate a sum of an ith first significant feature map in the first difference feature map and an ith third significant feature map in the second difference feature map, where i is less than or equal to k, so as to obtain k first fused feature maps;
a second fused feature map generation submodule, configured to calculate a sum of an ith second significant feature map in the first difference feature map and an ith fourth significant feature map in the second difference feature map, where i is less than or equal to m, to obtain m second fused feature maps;
the fusion hyper-parameter feature map generation sub-module is used for calculating the product of the first fusion feature map and the second fusion feature map and corresponding hyper-parameters to obtain (k + m) fusion hyper-parameter feature maps;
and the fusion feature map generation submodule is used for calculating the sum of (k + m) fusion hyper-parameter feature maps to obtain a fusion feature map.
In some embodiments of the present invention, the post-processing module 205 comprises:
the gray level comparison submodule is used for comparing the gray level value of each pixel in the fusion characteristic diagram with a preset gray level threshold value;
and the obstacle judgment submodule is used for taking an area formed by the pixels with the gray values larger than or equal to the gray threshold as an area where the obstacle is positioned.
The three-camera-based short and small obstacle detection device provided by the embodiment of the invention can execute the three-camera-based short and small obstacle detection method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example three
Fig. 3 is a schematic structural diagram of a short and small obstacle detection device according to a third embodiment of the present invention. Fig. 3 shows a block diagram of an exemplary short and small obstacle detection device 12 suitable for implementing an embodiment of the present invention. The short and small obstacle detection device 12 shown in fig. 3 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present invention.
As shown in fig. 3, the short and small obstacle detection device 12 takes the form of a general-purpose computing device. The components of the short and small obstacle detection device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
The short and small obstacle detection device 12 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the short and small obstacle detection device 12, including volatile and non-volatile media and removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The short and small obstacle detection device 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in fig. 3, and commonly referred to as a "hard drive"). Although not shown in fig. 3, a magnetic disk drive for reading from and writing to a removable, non-volatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
The short and small obstacle detection device 12 may also communicate with one or more external devices 14 (e.g., a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with the short and small obstacle detection device 12, and/or with any device (e.g., a network card, a modem, etc.) that enables the short and small obstacle detection device 12 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 22. Also, the short and small obstacle detection device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the internet) via the network adapter 20. As shown, the network adapter 20 communicates with the other modules of the short and small obstacle detection device 12 via the bus 18. It should be appreciated that, although not shown in the figures, other hardware and/or software modules may be used in conjunction with the short and small obstacle detection device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes the programs stored in the system memory 28 to perform various functional applications and data processing, for example, implementing the short and small obstacle detection method provided by the embodiments of the present invention.
Example four
Fig. 4 is a schematic structural diagram of a robot according to a fourth embodiment of the present invention, which may specifically include the following devices:
a first camera 401, configured to acquire a first image;
a second camera 402 for acquiring a second image;
a third camera 403, configured to acquire a third image;
and a short and small obstacle detection device 12.
To help those skilled in the art better understand the embodiments of the present application, an example of the structure of the robot is described below.
Illustratively, the first camera 401, the second camera 402, and the third camera 403 are all connected to the short and small obstacle detection device 12, and the acquired first image, second image, and third image are transmitted to the short and small obstacle detection device 12 for processing. Referring to Figs. 3 and 4, the first camera 401, the second camera 402, and the third camera 403 correspond to the external devices 14 in Fig. 3 and are connected to the short and small obstacle detection device 12 through the I/O interface 22.
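By way of a purely illustrative sketch, the following Python snippet shows one way the three cameras might be read out in near-synchrony and handed to the detection pipeline in software. OpenCV is assumed, the device indices 0, 1, and 2 are placeholders, and detect_obstacle is a hypothetical stand-in for the processing performed by the short and small obstacle detection device 12; a practical robot may instead rely on hardware-triggered cameras or dedicated capture interfaces.

```python
# Minimal sketch: near-synchronized capture from three cameras
# (OpenCV assumed; device indices and the detector are placeholders).
import cv2

def grab_synchronized_frames(caps):
    """Grab on all cameras first, then retrieve, to minimize time skew."""
    for cap in caps:
        cap.grab()                                   # latch a frame on every device
    return [cap.retrieve()[1] for cap in caps]       # [first_image, second_image, third_image]

if __name__ == "__main__":
    caps = [cv2.VideoCapture(i) for i in (0, 1, 2)]  # assumed left, middle, right cameras
    first_image, second_image, third_image = grab_synchronized_frames(caps)
    # region = detect_obstacle(first_image, second_image, third_image)
    # detect_obstacle(...) is a hypothetical placeholder for the device-12 pipeline.
    for cap in caps:
        cap.release()
```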
Specifically, the robot may take different forms for different applications, such as a delivery robot, a cleaning robot, a wheeled robot, or a mobile robot; these are merely examples and not limitations.
The robot provided by the embodiment of the present invention can execute the three-camera-based short and small obstacle detection method provided by any embodiment of the present invention, and has the corresponding functional modules and beneficial effects of the executed method.
Example five
The fifth embodiment of the present invention further provides a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements the processes of the above three-camera-based short and small obstacle detection method and achieves the same technical effects; to avoid repetition, the details are not described here again.
A computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.
Claims (11)
1. A short and small obstacle detection method based on three cameras is characterized by comprising the following steps:
acquiring a first image, a second image and a third image which are synchronously acquired by a first camera, a second camera and a third camera;
calculating the difference between the first image and the second image to obtain a first difference feature map;
calculating the difference between the third image and the second image to obtain a second difference feature map;
fusing the first difference feature map and the second difference feature map to obtain a fused feature map;
and determining the area where the obstacle is located based on the fused feature map.
2. The method of claim 1, wherein obtaining the first image, the second image, and the third image synchronously acquired by the first camera, the second camera, and the third camera comprises:
calculating a first homography matrix according to four pairs of key marker points in the first image and the second image, wherein each pair of key marker points is a pair of pixel points representing the same feature in the first image and the second image;
multiplying the first homography matrix by a matrix consisting of the pixel values of the first image to align the first image with the second image.
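A minimal sketch of this alignment step, assuming Python with OpenCV and with the four marker-point coordinates below being placeholders rather than calibration data: with exactly four matched point pairs the homography can be obtained directly with cv2.getPerspectiveTransform, and cv2.warpPerspective then re-projects the first image onto the second camera's view, which is the practical counterpart of the matrix multiplication recited above. The same routine, applied to the third image, corresponds to claim 3.

```python
# Sketch only: align the first image to the second image with a homography
# estimated from four matched key marker points (coordinates are placeholders).
import cv2
import numpy as np

def align_to_second(first_image, pts_first, pts_second):
    # pts_first / pts_second: 4x2 float32 arrays of matching pixel coordinates.
    H1 = cv2.getPerspectiveTransform(pts_first, pts_second)   # first homography matrix
    h, w = first_image.shape[:2]
    # Warp the first image so that it overlays the second image pixel for pixel.
    return cv2.warpPerspective(first_image, H1, (w, h)), H1

if __name__ == "__main__":
    first = cv2.imread("first.png")                            # placeholder file name
    pts_a = np.float32([[10, 200], [620, 190], [630, 450], [15, 460]])
    pts_b = np.float32([[30, 210], [600, 200], [615, 440], [35, 455]])
    aligned_first, H1 = align_to_second(first, pts_a, pts_b)
```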
3. The method of claim 1, wherein obtaining the first image, the second image, and the third image synchronously acquired by the first camera, the second camera, and the third camera comprises:
calculating a second homography matrix according to four pairs of key marker points in the second image and the third image, wherein each pair of key marker points is a pair of pixel points representing the same feature in the second image and the third image;
multiplying the second homography matrix by a matrix consisting of the pixel values of the third image to align the third image with the second image.
4. The method of claim 1, wherein calculating the difference between the first image and the second image to obtain a first difference feature map comprises:
inputting the first image and the second image into a MobileNetV2 network for processing, and taking the feature maps output by the first k convolution blocks in the MobileNetV2 network to obtain k first feature maps corresponding to the first image and k first feature maps corresponding to the second image;
respectively calculating the difference value between the ith first feature map corresponding to the first image and the ith first feature map corresponding to the second image to obtain k first significant feature maps, wherein i is less than or equal to k;
inputting the first image and the second image respectively into an mbv2_ca network for processing, and taking the feature maps output by the first m convolution blocks in the mbv2_ca network to obtain m second feature maps corresponding to the first image and m second feature maps corresponding to the second image;
and calculating the difference value between the ith second feature map corresponding to the first image and the ith second feature map corresponding to the second image to obtain m second significant feature maps, wherein i is less than or equal to m, and the k first significant feature maps and the m second significant feature maps form the first difference feature map.
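The following PyTorch sketch illustrates one way the per-block difference ("significant") feature maps of this claim might be computed, using the first k blocks of torchvision's MobileNetV2 (torchvision 0.13 or later assumed). The mbv2_ca branch, a MobileNetV2 variant with coordinate attention, is not a torchvision model and would have to be supplied separately; it is only referred to in a comment. All tensors and the value of k are placeholders.

```python
# Sketch: k per-block difference ("significant") feature maps between two images,
# using the first k blocks of torchvision's MobileNetV2.
import torch
import torchvision

def block_features(backbone_blocks, image, k):
    """Run an image through the first k blocks, keeping every block's output."""
    feats, x = [], image
    for block in backbone_blocks[:k]:
        x = block(x)
        feats.append(x)
    return feats

def difference_features(img_a, img_b, backbone_blocks, k):
    feats_a = block_features(backbone_blocks, img_a, k)
    feats_b = block_features(backbone_blocks, img_b, k)
    # The i-th significant feature map is the element-wise difference of the i-th block outputs.
    return [fa - fb for fa, fb in zip(feats_a, feats_b)]

if __name__ == "__main__":
    mbv2 = torchvision.models.mobilenet_v2(weights=None).eval()   # weights omitted in this sketch
    blocks = mbv2.features                                        # sequence of convolution blocks
    first = torch.randn(1, 3, 224, 224)                           # placeholder tensors, not camera frames
    second = torch.randn(1, 3, 224, 224)
    with torch.no_grad():
        first_significant = difference_features(first, second, blocks, k=4)
    # The m maps of the mbv2_ca branch (a coordinate-attention variant not shipped with
    # torchvision) would be produced the same way with that network's blocks; together
    # the two sets form the first difference feature map of claim 4.
```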
5. The method of claim 4, wherein calculating the difference between the third image and the second image to obtain a second difference feature map comprises:
inputting the third image into the MobileNetV2 network for processing, and taking the feature maps output by the first k convolution blocks in the MobileNetV2 network to obtain k first feature maps corresponding to the third image;
respectively calculating the difference value between the ith first feature map corresponding to the third image and the ith first feature map corresponding to the second image to obtain k third significant feature maps, wherein i is less than or equal to k;
inputting the third image into the mbv2_ca network for processing, and taking the feature maps output by the first m convolution blocks in the mbv2_ca network to obtain m second feature maps corresponding to the third image;
and calculating the difference value between the ith second feature map corresponding to the third image and the ith second feature map corresponding to the second image to obtain m fourth significant feature maps, wherein i is less than or equal to m, and the k third significant feature maps and the m fourth significant feature maps form the second difference feature map.
6. The method according to claim 5, wherein fusing the first difference feature map and the second difference feature map to obtain the fused feature map comprises:
calculating the sum of the ith first significant feature map in the first difference feature map and the ith third significant feature map in the second difference feature map, wherein i is less than or equal to k, to obtain k first fusion feature maps;
calculating the sum of the ith second significant feature map in the first difference feature map and the ith fourth significant feature map in the second difference feature map, wherein i is less than or equal to m, to obtain m second fusion feature maps;
multiplying each of the first fusion feature maps and the second fusion feature maps by its corresponding hyper-parameter to obtain (k + m) fusion hyper-parameter feature maps;
and calculating the sum of the (k + m) fusion hyper-parameter feature maps to obtain the fused feature map.
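A minimal sketch of this fusion step, assuming PyTorch: the paired significant feature maps from the two viewing directions are summed, each sum is scaled by an illustrative hyper-parameter, and the scaled maps are added into a single fused feature map. Because the claim does not state how maps of different resolutions and channel counts are made commensurable, the sketch collapses channels by a mean and resamples every map to one spatial size; both choices, like the hyper-parameter values, are assumptions for illustration only.

```python
# Sketch: fuse the two directional difference feature maps (claim 6). PyTorch assumed.
# A real pipeline would define how per-block maps are brought to a common shape;
# here channels are averaged and maps are resampled purely to keep the sketch consistent.
import torch
import torch.nn.functional as F

def fuse(first_diff, second_diff, hyper_params, out_size=(56, 56)):
    """first_diff / second_diff: lists of (k + m) same-indexed significant feature maps."""
    fused_blocks = [a + b for a, b in zip(first_diff, second_diff)]        # directional sums
    scaled = [w * f for w, f in zip(hyper_params, fused_blocks)]           # per-map hyper-parameter
    # Bring every scaled map to one spatial size and a single channel, then add them up.
    resized = [F.interpolate(s.mean(dim=1, keepdim=True), size=out_size,
                             mode="bilinear", align_corners=False) for s in scaled]
    return torch.stack(resized, dim=0).sum(dim=0)                          # fused feature map

if __name__ == "__main__":
    k_plus_m = 6
    first_diff = [torch.randn(1, 16, 56, 56) for _ in range(k_plus_m)]     # placeholder maps
    second_diff = [torch.randn(1, 16, 56, 56) for _ in range(k_plus_m)]
    weights = [1.0, 0.8, 0.6, 1.0, 0.8, 0.6]                               # illustrative hyper-parameters
    fused_map = fuse(first_diff, second_diff, weights)
    print(fused_map.shape)                                                 # torch.Size([1, 1, 56, 56])
```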
7. The method of claim 1, wherein determining the area where the obstacle is located based on the fused feature map comprises:
comparing the gray value of each pixel in the fused feature map with a preset gray threshold;
and taking the area formed by the pixels whose gray values are greater than or equal to the gray threshold as the area where the obstacle is located.
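The post-processing of this claim reduces to a per-pixel comparison with a gray threshold. The sketch below (NumPy and OpenCV assumed, threshold value a placeholder) builds the binary obstacle mask; the optional largest-connected-component refinement is an added assumption for illustration and is not required by the claim.

```python
# Sketch: threshold the fused feature map to obtain the obstacle region (claim 7).
import cv2
import numpy as np

def obstacle_region(fused_map, gray_threshold=128):
    """fused_map: single-channel array scaled to 0-255; threshold value is a placeholder."""
    gray = fused_map.astype(np.uint8)
    mask = (gray >= gray_threshold).astype(np.uint8)         # pixels at or above the threshold
    # Optional refinement (an assumption, not part of the claim): keep the largest blob.
    num, labels, stats, _ = cv2.connectedComponentsWithStats(mask, connectivity=8)
    if num <= 1:
        return mask                                           # no above-threshold pixels found
    largest = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
    return (labels == largest).astype(np.uint8)

if __name__ == "__main__":
    demo = np.random.rand(120, 160) * 255                     # placeholder fused feature map
    region_mask = obstacle_region(demo, gray_threshold=200)
```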
8. A short and small obstacle detection device, comprising:
the image acquisition module is used for acquiring a first image, a second image and a third image which are synchronously acquired by the first camera, the second camera and the third camera;
the first difference feature calculation module is used for calculating the difference between the first image and the second image to obtain a first difference feature map;
the second difference feature calculation module is used for calculating the difference between the third image and the second image to obtain a second difference feature map;
the difference fusion module is used for fusing the first difference feature map and the second difference feature map to obtain a fused feature map;
and the post-processing module is used for determining the area where the obstacle is located based on the fused feature map.
9. A short and small obstacle detection device, comprising:
at least one processor; and at least one memory storing instructions executable by the at least one processor;
wherein the instructions are executable by the at least one processor to cause the at least one processor to implement the three-camera-based short and small obstacle detection method according to any one of claims 1-7.
10. A robot, characterized in that the robot comprises:
the first camera is used for acquiring a first image;
the second camera is used for acquiring a second image;
the third camera is used for acquiring a third image;
and a short and small obstacle detection device according to claim 8 or 9.
11. A computer-readable storage medium, characterized in that a computer program is stored thereon which, when executed by a processor, implements the three-camera-based short and small obstacle detection method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111622043.XA CN114299131A (en) | 2021-12-28 | 2021-12-28 | Three-camera-based short and small obstacle detection method and device and terminal equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114299131A (en) | 2022-04-08
Family
ID=80970335
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111622043.XA Pending CN114299131A (en) | 2021-12-28 | 2021-12-28 | Three-camera-based short and small obstacle detection method and device and terminal equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114299131A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |