WO2020258703A1 - Obstacle detection method, intelligent driving control method, device, medium, and equipment - Google Patents
- Publication number: WO2020258703A1 (application PCT/CN2019/120833)
- Authority: WIPO (PCT)
- Prior art keywords: disparity, obstacle, value, pixel, map
Classifications
- G06T7/73 — Determining position or orientation of objects or cameras using feature-based methods
- G06F18/23 — Pattern recognition; analysing; clustering techniques
- G06N3/08 — Neural networks; learning methods
- G06T7/11 — Segmentation; region-based segmentation
- G06T7/13 — Segmentation; edge detection
- G06T7/70 — Determining position or orientation of objects or cameras
- G06V10/762 — Image or video recognition using machine learning; clustering, e.g. of similar faces in social networks
- G06V10/764 — Image or video recognition using machine learning; classification, e.g. of video objects
- G06V10/82 — Image or video recognition using neural networks
- G06V20/58 — Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; recognition of traffic objects
- G06T2207/10028 — Range image; depth image; 3D point clouds
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/20228 — Disparity calculation for image-based rendering
- G06T2207/30261 — Vehicle exterior; vicinity of vehicle; obstacle
Definitions
- the present disclosure relates to computer vision technology, and in particular to an obstacle detection method, an obstacle detection device, an intelligent driving control method, an intelligent driving control device, electronic equipment, a computer-readable storage medium, and a computer program.
- In the field of computer vision, perception technology is commonly used to sense obstacles in the external environment; that is, perception technology includes obstacle detection.
- The perception results are usually provided to the decision-making layer, which makes decisions based on them.
- For example, the perception layer provides the perceived road information and the information about obstacles around the vehicle to the decision-making layer, so that the decision-making layer can execute driving decisions that avoid obstacles and ensure safe driving.
- In existing approaches, obstacle types are generally defined in advance, such as pedestrians, vehicles, non-motorized vehicles, and other obstacles with inherent shapes, textures, and colors; related detection algorithms are then used to detect these predefined types of obstacles.
- the embodiments of the present disclosure provide a technical solution for obstacle detection and intelligent driving control.
- an obstacle detection method including: acquiring a first disparity map of an environment image, the environment image being an image that characterizes the spatial environment information where the smart device is located during movement; determining a plurality of obstacle pixel regions in the first disparity map of the environment image; performing clustering processing on the plurality of obstacle pixel regions to obtain at least one cluster; and determining the obstacle detection result according to the obstacle pixel regions belonging to the same cluster.
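The claimed steps can be sketched as follows. The function name and the distance-threshold clustering rule are illustrative assumptions for this sketch, not the exact algorithm claimed by the disclosure:

```python
def detect_obstacles(obstacle_regions, merge_dist=2.0):
    """Cluster obstacle pixel regions and return one bounding box per cluster.

    obstacle_regions: list of (x_min, y_min, x_max, y_max) boxes.
    Two regions join the same cluster when their centers are closer than
    merge_dist on both axes (a stand-in for the unspecified clustering rule).
    """
    clusters = []
    for box in obstacle_regions:
        cx = (box[0] + box[2]) / 2
        cy = (box[1] + box[3]) / 2
        for cluster in clusters:
            if any(abs(cx - (b[0] + b[2]) / 2) < merge_dist and
                   abs(cy - (b[1] + b[3]) / 2) < merge_dist
                   for b in cluster):
                cluster.append(box)
                break
        else:
            clusters.append([box])
    # One detection result (enclosing box) per cluster.
    return [(min(b[0] for b in c), min(b[1] for b in c),
             max(b[2] for b in c), max(b[3] for b in c)) for c in clusters]

# Two adjacent regions merge into one detection; a distant one stays separate:
boxes = detect_obstacles([(0, 0, 1, 4), (1, 0, 2, 4), (10, 0, 11, 3)])
# boxes == [(0, 0, 2, 4), (10, 0, 11, 3)]
```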
- a smart driving control method including: acquiring an environment image of the smart device during movement through an image acquisition device arranged on the smart device; performing obstacle detection on the acquired environment image using the above obstacle detection method to determine the obstacle detection result; and generating and outputting a control instruction according to the obstacle detection result.
- an obstacle detection device including: an acquisition module for acquiring a first disparity map of an environment image, the environment image characterizing the spatial environment information where the smart device is located during movement; a first determination module for determining a plurality of obstacle pixel regions in the first disparity map of the environment image; a clustering module for performing clustering processing on the plurality of obstacle pixel regions to obtain at least one cluster; and a second determination module for determining the obstacle detection result according to the obstacle pixel regions belonging to the same cluster.
- an intelligent driving control device including: an acquisition module configured to acquire an environment image of the intelligent device during movement through an image acquisition device arranged on the intelligent device; the above obstacle detection device, which performs obstacle detection on the environment image to determine an obstacle detection result; and a control module for generating and outputting control instructions according to the obstacle detection result.
- an electronic device including: a memory for storing a computer program; and a processor for executing the computer program stored in the memory, where execution of the computer program implements any method embodiment of the present disclosure.
- a computer-readable storage medium having a computer program stored thereon, where the computer program, when executed by a processor, implements any method embodiment of the present disclosure.
- a computer program including computer instructions that, when run in the processor of a device, implement any method embodiment of the present disclosure.
- the present disclosure can determine multiple obstacle pixel regions from the first disparity map of the environment image, and obtain the obstacle detection result by clustering those regions.
- the detection method adopted in the present disclosure does not need to predefine the obstacles to be detected and does not rely on predefined information such as texture, color, shape, or category; the obstacle detection result can be obtained directly by clustering obstacle regions.
- the detected obstacles are therefore not limited to certain predefined types: the method can detect any obstacle in the surrounding space environment that may hinder the movement of the smart device (called a generic obstacle in the present disclosure), thereby realizing generic obstacle detection.
- the technical solution provided by the present disclosure is thus a more general obstacle detection solution: it can be applied to the detection of general types of obstacles and helps handle the diverse obstacle types found in real environments. Moreover, for smart devices, it can detect the varied and random obstacles that may occur during driving and then output control instructions for the driving process based on the detection results, which helps improve the safety of intelligent vehicle driving.
- the technical solutions of the present disclosure will be further described in detail below through the drawings and embodiments.
- FIG. 1 is a flowchart of an embodiment of the obstacle detection method of the present disclosure
- FIG. 2 is a schematic diagram of an embodiment of the environmental image of the present disclosure
- FIG. 3 is a schematic diagram of an implementation manner of the first disparity map of FIG. 2;
- FIG. 4 is a schematic diagram of an embodiment of the first disparity map of the present disclosure.
- FIG. 5 is a schematic diagram of an embodiment of the convolutional neural network of the present disclosure.
- FIG. 6 is a schematic diagram of an embodiment of the first weight distribution diagram of the first disparity map of the present disclosure
- FIG. 7 is a schematic diagram of another embodiment of the first weight distribution diagram of the first disparity map of the present disclosure.
- FIG. 8 is a schematic diagram of an embodiment of the second weight distribution diagram of the first disparity map of the present disclosure.
- FIG. 9 is a schematic diagram of an embodiment of the second mirror image of the present disclosure.
- FIG. 10 is a schematic diagram of an embodiment of the second weight distribution diagram of the second mirror image shown in FIG. 9;
- FIG. 11 is a schematic diagram of an embodiment of the present disclosure to optimize and adjust the disparity map of a monocular image
- FIG. 12 is a schematic diagram of an implementation manner of obstacle edge information in the first disparity map of the environmental image of the present disclosure
- FIG. 13 is a schematic diagram of an embodiment of the statistical disparity map of the present disclosure.
- FIG. 14 is a schematic diagram of an embodiment of forming a statistical disparity map of the present disclosure.
- FIG. 15 is a schematic diagram of an embodiment of the straight line fitting of the present disclosure.
- FIG. 16 is a schematic diagram of the ground area and the non-ground area of the present disclosure.
- FIG. 17 is a schematic diagram of an embodiment of the coordinate system established by the present disclosure.
- FIG. 19 is a schematic diagram of an embodiment of forming an obstacle pixel columnar region of the present disclosure.
- FIG. 20 is a schematic diagram of an embodiment of clustering columnar regions of obstacle pixels in the present disclosure.
- FIG. 21 is a schematic diagram of an embodiment of forming an obstacle detection frame of the present disclosure.
- FIG. 22 is a flowchart of an embodiment of the convolutional neural network training method of the present disclosure.
- FIG. 23 is a flowchart of an embodiment of the intelligent driving control method of the present disclosure.
- FIG. 24 is a schematic structural diagram of an embodiment of the obstacle detection device of the present disclosure.
- FIG. 25 is a schematic structural diagram of an embodiment of the intelligent driving control device of the present disclosure.
- Fig. 26 is a block diagram of an exemplary device for implementing the embodiments of the present disclosure.
- the embodiments of the present disclosure can be applied to electronic devices such as terminal devices, computer systems, and servers, which can operate with many other general or special computing system environments or configurations.
- Examples of well-known terminal devices, computing systems, environments, and/or configurations suitable for use with electronic devices such as terminal devices, computer systems, and servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, small computer systems, large computer systems, and distributed cloud computing environments including any of the above systems.
- Electronic devices such as terminal devices, computer systems, and servers can be described in the general context of computer system executable instructions (such as program modules) executed by the computer system.
- program modules can include routines, programs, target programs, components, logic, and data structures, etc., which perform specific tasks or implement specific abstract data types.
- the computer system/server can be implemented in a distributed cloud computing environment. In the distributed cloud computing environment, tasks are executed by remote processing equipment linked through a communication network.
- program modules may be located on a storage medium of a local or remote computing system including a storage device.
- FIG. 1 is a flowchart of an embodiment of the obstacle detection method of the present disclosure.
- the method in this embodiment includes steps: S100, S110, S120, and S130. The steps are described in detail below.
- the environment image is an image that characterizes the spatial environment information where the smart device is located during the movement.
- the smart device is, for example, a smart driving device (such as an autonomous car), a smart flying device (such as a drone), an intelligent robot, and the like.
- the environment image is, for example, an image that characterizes the road space environment information of the intelligent driving device or the intelligent robot during movement, or an image that characterizes the spatial environment information of the smart flying device during flight.
- the smart device and the environment image in the present disclosure are not limited to the above examples, and the present disclosure does not limit this.
- obstacles in the environment image are detected. Any object in the surrounding space environment of the smart device that may hinder its movement may fall into the obstacle detection range and be regarded as an obstacle detection object. For example, during the driving of the smart driving device, objects such as stones, animals, and fallen goods may appear on the road. These objects have no specific shapes, textures, colors, or types, and they differ greatly from one another, yet each of them is considered an obstacle. In the present disclosure, any object that may cause obstruction during movement is called a generic obstacle.
- the first disparity map of the present disclosure is used to describe the disparity of the environment image. Parallax (disparity) is the apparent difference in position of the same target object when it is observed from two points separated by a certain distance.
- An example of the environmental image is shown in Figure 2.
- An example of the first disparity map of the environment image shown in FIG. 2 is shown in FIG. 3.
- the first disparity map of the environment image in the present disclosure may also be expressed in the form shown in FIG. 4.
- Each number in FIG. 4 (such as 0, 1, 2, 3, 4, 5, etc.) represents the disparity value of the pixel at position (x, y) in the environment image. It should be particularly noted that FIG. 4 does not show a complete disparity map.
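For illustration, a per-pixel disparity value maps to scene depth through the standard stereo relation Z = f·B/d. The focal length and baseline below are hypothetical calibration values, not taken from the disclosure:

```python
def disparity_to_depth(disparity, focal_px=1000.0, baseline_m=0.5):
    """Convert a disparity value (in pixels) to depth (in meters): Z = f * B / d.

    focal_px (focal length in pixels) and baseline_m (camera baseline in
    meters) are hypothetical values used only for this sketch.
    """
    if disparity <= 0:
        return float("inf")  # zero disparity corresponds to a point at infinity
    return focal_px * baseline_m / disparity

# A larger disparity value means a closer point:
near = disparity_to_depth(50)  # 1000 * 0.5 / 50 = 10.0 m
far = disparity_to_depth(5)    # 1000 * 0.5 / 5 = 100.0 m
```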
- the environmental image in the present disclosure may be a monocular image or a binocular image.
- Monocular images are usually obtained by shooting with a monocular camera.
- Binocular images are usually obtained by shooting with a binocular camera device.
- both the monocular image and the binocular image in the present disclosure may be photos or pictures, etc., and may also be video frames in a video.
- by using a monocular image, the present disclosure can realize obstacle detection without providing a binocular camera device, thereby helping to reduce the cost of obstacle detection.
- the present disclosure may use a pre-trained convolutional neural network to obtain the first disparity map of the monocular image.
- a monocular image is input into the convolutional neural network, which performs disparity analysis on the image and outputs the disparity analysis result; the first disparity map of the monocular image can then be obtained from this result.
- by using the convolutional neural network to obtain the first disparity map of the monocular image, the first disparity map can be obtained without pixel-by-pixel disparity calculation over two images and without camera calibration, which helps improve the convenience and real-time performance of obtaining the first disparity map.
- the convolutional neural network in the present disclosure generally includes but is not limited to: multiple convolutional layers (Conv) and multiple deconvolutional layers (Deconv).
- the convolutional neural network of the present disclosure can be divided into two parts, namely an encoding part and a decoding part.
- the monocular image input to the convolutional neural network (e.g., the monocular image shown in FIG. 2) is encoded by the encoding part (i.e., feature extraction processing), and the encoding result is provided to the decoding part;
- the decoding part decodes the encoding result and outputs the decoding result.
- the present disclosure can thus obtain the first disparity map of the monocular image (e.g., the first disparity map shown in FIG. 3) from the decoding result.
- the coding part in the convolutional neural network includes but is not limited to: multiple convolutional layers, and multiple convolutional layers are connected in series.
- the decoding part in the convolutional neural network includes, but is not limited to: multiple convolutional layers and multiple deconvolutional layers, and multiple convolutional layers and multiple deconvolutional layers are arranged at intervals and connected in series.
- An optional example of the convolutional neural network of the present disclosure is shown in FIG. 5.
- In FIG. 5, the first rectangle on the left represents the monocular image input to the convolutional neural network, and the first rectangle on the right represents the disparity map output by the convolutional neural network.
- Each rectangle from the second to the fifteenth on the left represents a convolutional layer.
- The rectangles from the sixteenth on the left to the second on the right represent deconvolution layers and convolution layers arranged alternately; for example, the sixteenth rectangle on the left represents a deconvolution layer, the seventeenth a convolution layer, the eighteenth a deconvolution layer, and the nineteenth a convolution layer.
- the convolutional neural network of the present disclosure may merge the low-level information and high-level information in the network by means of skip connections.
- the output of at least one convolutional layer in the encoding part is provided to at least one deconvolutional layer in the decoding part through a skip connection.
- the input of each convolutional layer in the convolutional neural network usually includes the output of the previous layer (a convolutional or deconvolutional layer), while the input of at least one deconvolutional layer (some or all deconvolutional layers) includes: the upsampled (Upsample) output of the previous convolutional layer, together with the output of the encoding-part convolutional layer that is skip-connected to that deconvolutional layer.
- in FIG. 5, the solid arrows drawn from the bottom of the convolutional layers on the right represent the outputs of those convolutional layers, and the dotted arrows represent the upsampling results provided to the deconvolution layers.
- the solid arrows drawn above the convolutional layers on the left represent the outputs of the convolutional layers that are skip-connected to deconvolutional layers.
- the present disclosure does not limit the number of skip connections or the network structure of the convolutional neural network.
- fusing low-level and high-level information in the convolutional neural network helps improve the accuracy of the disparity map it generates.
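The encoder-decoder layout above can be illustrated by tracing feature-map sizes: the encoder halves the spatial resolution at each stage, the decoder doubles it back, and each decoder stage fuses the same-resolution encoder output via a skip connection. The stage count and stride-2 downsampling below are assumptions for this sketch, not the patent's actual network:

```python
def encoder_decoder_shapes(h, w, downs=4):
    """Trace (height, width) through a sketch of the encoder-decoder.

    Returns the list of encoder-stage shapes and, for each decoder stage,
    a pair (decoder output shape, skip-connected encoder shape).
    """
    enc = [(h, w)]
    for _ in range(downs):
        h, w = h // 2, w // 2  # each encoder stage halves the resolution
        enc.append((h, w))
    dec = []
    for i in range(downs):
        h, w = h * 2, w * 2  # each decoder stage upsamples by 2
        # the skip connection pairs this stage with the encoder output
        # that has the same spatial resolution
        dec.append(((h, w), enc[downs - 1 - i]))
    return enc, dec

enc, dec = encoder_decoder_shapes(256, 512)
# every decoder stage's output size matches its skip-connected encoder stage
assert all(out == skip for out, skip in dec)
```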
- the convolutional neural network of the present disclosure is obtained by training with binocular image samples. For the training process of the convolutional neural network, refer to the description in the following embodiments; it will not be elaborated here.
- the present disclosure may also optimize and adjust the first disparity map of the environment image obtained by using the convolutional neural network, so as to obtain a more accurate first disparity map.
- the present disclosure may use the disparity map of the mirror image of the monocular image to optimize and adjust the first disparity map of the monocular image, so that multiple obstacle pixel regions can be determined in the disparity-adjusted first disparity map.
- for ease of description, the mirror image of the monocular image is referred to as the first mirror image, and the disparity map of the first mirror image is referred to as the second disparity map.
- that is, the first mirror image is obtained, the disparity map of the first mirror image (the second disparity map) is obtained, and the first disparity map of the monocular image is then optimized and adjusted according to the second disparity map.
- multiple obstacle pixel regions can then be determined in the disparity-adjusted first disparity map.
- a specific example of optimal adjustment of the first disparity map is as follows:
- Step A: Obtain the second disparity map of the first mirror image of the monocular image, and obtain the mirror image of the second disparity map.
- the first mirror image of the monocular image in the present disclosure may be a mirror image formed by performing mirror processing (such as left mirror processing or right mirror processing) on the monocular image in the horizontal direction.
- the mirror image of the second disparity map in the present disclosure may be a mirror image formed after performing mirror processing (such as left mirror processing or right mirror processing) on the second disparity map in the horizontal direction.
- the mirror image of the second disparity map is still the disparity map.
- the present disclosure may first perform left or right mirror processing on the monocular image (for a horizontal flip, the left mirror result is the same as the right mirror result, so either may be used) to obtain the first mirror image of the monocular image; then obtain the disparity map of the first mirror image, i.e., the second disparity map; and finally perform left or right mirror processing on the second disparity map (again, the two results are the same) to obtain the mirror image of the second disparity map.
- the mirror image of the second disparity map is referred to as the second mirror image below.
- when performing mirror processing on the monocular image, the present disclosure need not consider whether the monocular image is treated as a left-eye image or a right-eye image: in either case, left or right mirror processing can be performed to obtain the first mirror image. Similarly, when mirror processing the second disparity map, whether left or right mirror processing is performed can likewise be ignored.
- for the convolutional neural network used to generate the disparity map of the monocular image: if the left-eye image samples of binocular image samples are used as input during training, then the successfully trained network treats the input monocular image as a left-eye image in testing and practical applications; if the right-eye image samples are used as input during training, then the successfully trained network treats the input monocular image as a right-eye image in testing and practical applications.
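The observation that left and right mirror processing coincide can be seen with a horizontal flip of a small pixel grid. This is a minimal sketch, not the disclosure's implementation:

```python
def mirror_horizontal(image):
    """Flip each row left-to-right. For a horizontal flip there is no
    distinct 'left' vs 'right' mirror: both simply reverse the columns."""
    return [row[::-1] for row in image]

img = [[1, 2, 3],
       [4, 5, 6]]
flipped = mirror_horizontal(img)          # [[3, 2, 1], [6, 5, 4]]
assert mirror_horizontal(flipped) == img  # flipping twice restores the image
```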
- the present disclosure may also use the aforementioned convolutional neural network to obtain the second disparity map.
- the first mirror image is input into the convolutional neural network, which performs disparity analysis on it and outputs the disparity analysis result, from which the second disparity map can be obtained.
- Step B: Obtain the weight distribution map of the first disparity map and the weight distribution map of the second mirror image of the monocular image.
- the weight distribution map of the first disparity map is used to describe the respective weight values of multiple disparity values (such as all disparity values) in the first disparity map.
- the weight distribution map of the first disparity map may include, but is not limited to: a first weight distribution map of the first disparity map and a second weight distribution map of the first disparity map.
- the first weight distribution map of the first disparity map is a weight distribution map set uniformly for the first disparity maps of multiple different monocular images; that is, the first disparity maps of different monocular images use the same first weight distribution map. The present disclosure may therefore refer to the first weight distribution map of the first disparity map as the global weight distribution map of the first disparity map.
- the global weight distribution map of the first disparity map is used to describe the global weight values corresponding to multiple disparity values (such as all disparity values) in the first disparity map.
- the second weight distribution map of the first disparity map is a weight distribution map set for the first disparity map of a single monocular image; that is, the first disparity maps of different monocular images use different second weight distribution maps. The present disclosure may therefore refer to the second weight distribution map of the first disparity map as the local weight distribution map of the first disparity map.
- the local weight distribution map is used to describe the respective local weight values of multiple disparity values (such as all disparity values) in the first disparity map.
- the weight distribution map of the second mirror image is used to describe the respective weight values of multiple disparity values in the second mirror image.
- the weight distribution map of the second mirror image may include, but is not limited to: a first weight distribution map of the second mirror image and a second weight distribution map of the second mirror image.
- the first weight distribution map of the second mirror image is a weight distribution map set uniformly for the second mirror images of multiple different monocular images; that is, the second mirror images of different monocular images use the same first weight distribution map. Therefore, the present disclosure may refer to the first weight distribution map of the second mirror image as the global weight distribution map of the second mirror image.
- the global weight distribution map of the second mirror image is used to describe the respective global weight values of multiple disparity values (such as all disparity values) in the second mirror image.
- the second weight distribution map of the second mirror image is a weight distribution map set for the second mirror image of a single monocular image; that is, the second mirror images of different monocular images use different second weight distribution maps. Therefore, the present disclosure may refer to the second weight distribution map of the second mirror image as the local weight distribution map of the second mirror image.
- the local weight distribution map of the second mirror image is used to describe the respective local weight values of multiple disparity values (such as all disparity values) in the second mirror image.
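This excerpt defines weight maps for both the first disparity map and the second mirror image but does not state the fusion rule at this point. The per-pixel normalized weighted combination below is given purely as an assumption about how such weights could be applied:

```python
def fuse_disparity(d1, d2_mirror, w1, w2):
    """Per-pixel weighted combination of the first disparity map (d1) and
    the mirror of the second disparity map (d2_mirror), using their weight
    maps w1 and w2. The normalized-average form used here is an assumption;
    the excerpt does not state the exact fusion rule.
    All arguments are equal-sized 2-D lists (rows of floats)."""
    fused = []
    for row1, row2, rw1, rw2 in zip(d1, d2_mirror, w1, w2):
        fused.append([
            (a * wa + b * wb) / (wa + wb) if (wa + wb) > 0 else a
            for a, b, wa, wb in zip(row1, row2, rw1, rw2)
        ])
    return fused

d1 = [[4.0, 8.0]]
d2 = [[2.0, 8.0]]
fused = fuse_disparity(d1, d2, [[1.0, 0.0]], [[1.0, 1.0]])
# first pixel: equal weights average 4 and 2; second pixel: weight fully on d2
```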
- the first weight distribution map of the first disparity map includes: at least two left and right separated regions, and different regions have different weight values.
- the magnitude relationship between the weight value of the area on the left and the weight value of the area on the right is usually related to whether the monocular image is used as the left-eye image or the right-eye image.
- FIG. 6 is a first weight distribution diagram of the first disparity map shown in FIG. 3, and the first weight distribution diagram is divided into five regions, namely, region 1, region 2, region 3, region 4, and region 5 in FIG. 6.
- the weight value of area 5 is not less than the weight value of area 4
- the weight value of area 4 is not less than the weight value of area 3
- the weight value of area 3 is not less than the weight value of area 2
- the weight value of area 2 is not less than the weight value of area 1.
- any region in the first weight distribution map of the first disparity map may have the same weight value, or may have different weight values.
- the weight value of the left part in the region is usually less than or equal to the weight value of the right part in the region.
- the weight value of region 1 in FIG. 6 may be 0; that is, in the first disparity map, the disparity corresponding to region 1 is completely unreliable. The weight value of region 2 may gradually increase from 0 toward 0.5, from left to right; the weight value of region 3 is 0.5; the weight value of region 4 may gradually increase from a value greater than 0.5 toward 1, from left to right; and the weight value of region 5 is 1, that is, in the first disparity map, the disparity corresponding to region 5 is completely credible.
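The five-region layout described above can be sketched as follows. The region boundaries and ramp shapes are illustrative assumptions, since the present disclosure does not fix the exact widths of regions 1-5:

```python
import numpy as np

def build_global_weight_map(height, width):
    """Sketch of a five-region global weight map for a left-eye disparity map.

    Hypothetical region boundaries; the ramps follow the description above:
    region 1 -> 0, region 2 ramps 0 -> 0.5, region 3 = 0.5,
    region 4 ramps 0.5 -> 1, region 5 = 1.
    """
    # Hypothetical split points; the patent does not fix the region widths.
    b1, b2, b3, b4 = (int(width * f) for f in (0.1, 0.3, 0.7, 0.9))
    row = np.empty(width, dtype=np.float32)
    row[:b1] = 0.0                                # region 1: unreliable
    row[b1:b2] = np.linspace(0.0, 0.5, b2 - b1)   # region 2: ramp up to 0.5
    row[b2:b3] = 0.5                              # region 3: constant 0.5
    row[b3:b4] = np.linspace(0.5, 1.0, b4 - b3)   # region 4: ramp up to 1
    row[b4:] = 1.0                                # region 5: fully credible
    return np.tile(row, (height, 1))              # same weights in every row
```

The map for a right-eye image (FIG. 7) would simply mirror this layout left-to-right.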
- FIG. 7 shows a first weight distribution diagram of the first disparity map when the monocular image to be processed is used as the right-eye image.
- the first weight distribution diagram is divided into five regions, namely, region 1, region 2, region 3, region 4, and region 5 in FIG. 7.
- the weight value of region 1 is not less than the weight value of region 2
- the weight value of region 2 is not less than the weight value of region 3
- the weight value of region 3 is not less than the weight value of region 4
- the weight value of region 4 is not less than the weight value of region 5.
- any region in the first weight distribution map of the first disparity map may have the same weight value, or may have different weight values.
- the weight value of the right part in the region is usually not greater than the weight value of the left part in the region.
- the weight value of region 4 may gradually increase from 0 toward 0.5, from right to left; the weight value of region 3 is 0.5; the weight value of region 2 may gradually increase from a value greater than 0.5 toward 1, from right to left; and the weight value of region 1 is 1, that is, in the first disparity map, the disparity corresponding to region 1 is completely credible.
- the first weight distribution map of the second mirror image includes at least two left and right divided regions, and different regions have different weight values.
- the magnitude relationship between the weight value of the area on the left and the weight value of the area on the right is usually related to whether the monocular image is used as the left-eye image or the right-eye image.
- the weight value of the region on the right is not less than that of the region on the left.
- any area in the first weight distribution diagram of the second mirror image may have the same weight value or may have different weight values.
- the weight value of the left part in the region is usually not greater than the weight value of the right part in the region.
- the weight value of the region on the left is not less than the weight value of the region on the right.
- any area in the first weight distribution diagram of the second mirror image may have the same weight value or may have different weight values.
- the weight value of the right part in the region is usually not greater than the weight value of the left part in the region.
- the manner of setting the second weight distribution map of the first disparity map may include the following steps:
- the weight value in the second weight distribution map of the first disparity map is set according to the disparity values in the first disparity map and the mirror disparity map.
- if the disparity value of the pixel at any position in the first disparity map satisfies a first predetermined condition, the weight value of the pixel at that position in the second weight distribution map of the first disparity map is set to the first value.
- otherwise, the weight value of the pixel at that position in the second weight distribution map of the first disparity map is set to the second value.
- the first value in this disclosure is greater than the second value.
- the first value is 1 and the second value is 0.
- an example of the second weight distribution map of the first disparity map is shown in FIG. 8.
- the weight values of the white areas in FIG. 8 are all 1, which indicates that the disparity value at this position is completely reliable.
- the weight value of the black area in FIG. 8 is 0, which means that the disparity value at this position is completely unreliable.
- the first reference value corresponding to a pixel at any position in the present disclosure may be set according to the disparity value of the pixel at that position in the first disparity map and a constant value greater than zero.
- the product of the disparity value of the pixel at the position in the first disparity map and a constant value greater than zero is used as the first reference value corresponding to the pixel at the position in the mirror disparity map.
- the second weight distribution map of the first disparity map may be expressed by the following formula (1):
- L l represents the second weight distribution map of the first disparity map
- Re represents the disparity value of the pixel at the corresponding position of the mirror disparity map
- d l represents the disparity value of the pixel at the corresponding position in the first disparity map
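A minimal sketch of this check follows. Since formula (1) is given only as a figure, the exact form of the first predetermined condition is an assumption here: the weight is the first value (1) where the difference between d_l and Re is smaller than the first reference value α·d_l, and the second value (0) otherwise.

```python
import numpy as np

def consistency_weight_map(d_l, re, alpha=0.1):
    """Sketch of the second (local) weight distribution map L_l.

    Assumption: the first predetermined condition is that the disparity d_l
    agrees with the mirror disparity Re to within the first reference value
    alpha * d_l (alpha > 0 is the constant mentioned above).
    """
    first_reference = alpha * d_l                     # product with a constant
    # first value (1) where the condition holds, second value (0) otherwise
    return np.where(np.abs(d_l - re) < first_reference, 1.0, 0.0)
```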
- the setting manner of the second weight distribution map of the second mirror image may be: according to the disparity value in the first disparity map, the weight value in the second weight distribution map of the second mirror image is set.
- if the disparity value of the pixel at any position in the first disparity map satisfies a second predetermined condition, the weight value of the pixel at that position in the second weight distribution map of the second mirror image is set to the third value.
- otherwise, the weight value of the pixel at that position in the second weight distribution map of the second mirror image is set to the fourth value; where the third value is greater than the fourth value.
- the third value in the present disclosure is greater than the fourth value.
- the third value is 1, and the fourth value is 0.
- the second reference value corresponding to a pixel in the present disclosure may be set according to the disparity value of the pixel at the corresponding position in the mirror disparity map and a constant value greater than zero. For example, left/right mirror image processing is first performed on the first disparity map to form a mirror disparity map; then the product of the disparity value of the pixel at the corresponding position in the mirror disparity map and a constant value greater than zero is used as the second reference value corresponding to the pixel at the corresponding position in the first disparity map.
- an example of the second weight distribution map of the second mirror image shown in FIG. 9 is shown in FIG. 10.
- the weight values of the white areas in FIG. 10 are all 1, which indicates that the disparity value at this position is completely reliable.
- the weight value of the black area in FIG. 10 is 0, which means that the disparity value at this position is completely unreliable.
- the second weight distribution graph of the second mirror image can be expressed by the following formula (2):
- Step C. According to the weight distribution map of the first disparity map of the monocular image and the weight distribution map of the second mirror image, the first disparity map of the monocular image is optimized and adjusted, to finally obtain the optimized and adjusted first disparity map of the monocular image.
- the present disclosure may use the first weight distribution map and the second weight distribution map of the first disparity map to adjust multiple disparity values in the first disparity map, to obtain the adjusted first disparity map;
- the adjusted first disparity map and the adjusted second mirror image are then combined to obtain the optimized and adjusted first disparity map of the monocular image.
- an example of obtaining the first disparity map of the optimized and adjusted monocular image is as follows:
- the third weight distribution graph can be expressed by the following formula (3):
- W l represents the third weight distribution map
- M l represents the first weight distribution map of the first disparity map
- L l represents the second weight distribution map of the first disparity map
- the fourth weight distribution graph can be expressed by the following formula (4):
- W l ' represents the fourth weight distribution map
- M l ' represents the first weight distribution map of the second mirror image
- L l ' represents the second weight distribution map of the second mirror image
- the multiple disparity values in the second mirror image are adjusted according to the fourth weight distribution map to obtain the adjusted second mirror image. For example, for the pixel at any position in the second mirror image, its disparity value is replaced with the product of that disparity value and the weight value of the pixel at the corresponding position in the fourth weight distribution map. After performing the above replacement processing on all pixels in the second mirror image, the adjusted second mirror image is obtained.
- the first disparity map of the monocular image finally obtained can be expressed by the following formula (5):
- d final represents the finally obtained first disparity map of the monocular image (as shown in the first image on the right in FIG. 11);
- W l represents the third weight distribution map (as shown in the first image from the upper left in FIG. 11);
- W l' represents the fourth weight distribution map (as shown in the first image from the lower left in FIG. 11);
- d l represents the first disparity map (as shown in the second image from the upper left in FIG. 11); the second mirror image is as shown in the second image from the lower left in FIG. 11.
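The optimization described by formulas (3)-(5) can be sketched as below. Because the formulas themselves appear only as figures in the source, both the merge operator (assumed here to be an element-wise product) and the final combination (assumed to be a normalized weighted sum) are assumptions:

```python
import numpy as np

def fuse_disparity_maps(d_l, d_l_mirror, M_l, L_l, M_lp, L_lp, eps=1e-6):
    """Sketch of formulas (3)-(5) under two assumptions: the merge of the
    first and second weight distribution maps is an element-wise product,
    and the final disparity map is a normalized weighted sum of the two
    adjusted maps.
    """
    W_l = M_l * L_l            # formula (3): third weight distribution map
    W_lp = M_lp * L_lp         # formula (4): fourth weight distribution map
    # adjusted maps: each disparity value times the corresponding weight
    adj_first = W_l * d_l
    adj_mirror = W_lp * d_l_mirror
    # formula (5): combine the adjusted maps (normalization is an assumption)
    return (adj_first + adj_mirror) / (W_l + W_lp + eps)
```

Where the first disparity map is fully trusted (weights 1) and the mirror image fully distrusted (weights 0), the result reduces to the original first disparity map, as expected.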
- the present disclosure does not limit the execution order of the two steps of merging the first weight distribution map and the second weight distribution map.
- the two merging processing steps can be executed simultaneously or sequentially.
- the present disclosure does not limit the sequence of adjusting the disparity value in the first disparity image and adjusting the disparity value in the second mirror image.
- the two adjustment steps can be performed at the same time or successively.
- the present disclosure performs mirror image processing on the monocular image and mirror processing on the second disparity map, and then uses the mirrored disparity map (i.e., the second mirror image) to optimize and adjust the first disparity map of the monocular image. This helps reduce the phenomenon that the disparity values of the corresponding areas in the first disparity map of the monocular image are degraded, thereby helping to improve the accuracy of obstacle detection.
- the method of obtaining the first disparity map of the binocular image in the present disclosure includes but is not limited to: obtaining the first disparity map of the binocular image by means of stereo matching, for example:
- BM (Block Matching)
- SGBM (Semi-Global Block Matching)
- GC (Graph Cuts)
- a convolutional neural network for obtaining a disparity map of a binocular image is used to perform disparity processing on the binocular image, thereby obtaining a first disparity map of the binocular image.
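As an illustration of the stereo-matching route, a minimal block-matching (BM) sketch follows; real BM/SGBM implementations add cost aggregation, uniqueness checks, and sub-pixel refinement, so this is only a conceptual example:

```python
import numpy as np

def block_matching_disparity(left, right, max_disp=8, block=5):
    """Minimal BM (block matching) sketch: for each pixel of the left image,
    search leftwards along the same row of the right image for the block
    with the minimum SAD (sum of absolute differences) cost.
    """
    h, w = left.shape
    half = block // 2
    disp = np.zeros((h, w), dtype=np.float32)
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            patch = left[y - half:y + half + 1,
                         x - half:x + half + 1].astype(np.int32)
            # SAD cost for each candidate disparity d
            costs = [
                np.abs(patch - right[y - half:y + half + 1,
                                     x - d - half:x - d + half + 1]
                       .astype(np.int32)).sum()
                for d in range(max_disp)
            ]
            disp[y, x] = np.argmin(costs)  # disparity with minimum cost
    return disp
```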
- S110 Determine a plurality of obstacle pixel regions in the first disparity map of the environment image.
- the obstacle pixel area may be a pixel area including at least two consecutive pixels in the first disparity map.
- the obstacle pixel area may be a columnar area of obstacle pixels.
- the columnar area of obstacle pixels in the present disclosure is a stripe area whose width is at least one column of pixels and whose height is at least two rows of pixels. Since the stripe area can be used as the basic unit of an obstacle, the present disclosure refers to the stripe area as an obstacle pixel columnar area.
- the present disclosure may first perform edge detection on the first disparity map of the environment image obtained in the above steps to obtain obstacle edge information; then determine the obstacle area in the first disparity map of the environment image; and finally, according to the obstacle edge information, determine a plurality of obstacle pixel columnar regions in the obstacle area.
- the present disclosure helps avoid forming obstacle pixel columnar areas in regions of low attention value, and helps improve the convenience of forming obstacle pixel columnar areas.
- different obstacles in the actual space are at different distances from the camera device and therefore produce different parallaxes, forming parallax edges between obstacles.
- the present disclosure can separate the obstacles in the parallax map by detecting the obstacle edge information, so that the obstacle pixel columnar areas can be easily formed by searching along the obstacle edge information, which helps improve the convenience of forming the obstacle pixel columnar areas.
- the method of obtaining the obstacle edge information in the first disparity map of the environment image in the present disclosure includes but is not limited to: using a convolutional neural network for edge extraction to obtain the first disparity map of the environment image Obstacle edge information in the image; and using an edge detection algorithm to obtain the obstacle edge information in the first disparity map of the environment image.
- the present disclosure uses an edge detection algorithm to obtain the obstacle edge information in the first disparity map of the environment image as shown in FIG. 12.
- Step 1 Perform histogram equalization processing on the first disparity map of the environment image.
- the first disparity map of the environment image is the image in the upper left corner of FIG. 12, and the first disparity map may be the first disparity map of the environment image shown in FIG. 2 finally obtained by using the above step 100.
- the result of the histogram equalization processing is shown in the second image from the upper left of FIG. 12.
- Step 2 Perform average filtering processing on the result of the histogram equalization processing.
- the result of the filtering process is shown in the third picture from the upper left in Figure 12.
- the above steps 1 and 2 are the preprocessing of the first disparity map of the environment image.
- Steps 1 and 2 are only an example of preprocessing the first disparity map of the environment image. The present disclosure does not limit the specific implementation of preprocessing.
- Step 3 Use an edge detection algorithm to perform edge detection processing on the filtered result to obtain edge information.
- the edge information obtained in this step is shown in the fourth image from the upper left in Figure 12.
- the edge detection algorithms in the present disclosure include, but are not limited to: Canny edge detection algorithm, Sobel edge detection algorithm, or Laplacian edge detection algorithm.
- Step 4 Perform a morphological expansion operation on the obtained edge information.
- the result of the expansion operation is shown in the fifth graph from the upper left in Figure 12.
- This step belongs to a post-processing method of the detection result of the edge detection algorithm.
- the present disclosure does not limit the specific implementation of post-processing.
- Step 5 Perform a reverse operation on the result of the expansion operation to obtain an edge mask (Mask) of the first disparity map of the environment image.
- the edge mask of the first disparity map of the environmental image is shown in the lower left corner of FIG. 12.
- Step 6 Perform an AND operation on the edge mask of the first disparity map of the environment image and the first disparity map of the environment image to obtain the obstacle edge information in the first disparity map of the environment image.
- the right side diagram of FIG. 12 shows obstacle edge information in the first disparity map of the environment image.
- the disparity value at the position of the obstacle edge in the first disparity map of the environment image is set to 0.
- Obstacle edge information is shown as black edge lines in Figure 12.
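Steps 3-6 above can be sketched with NumPy alone; a simple gradient-magnitude detector stands in for the Canny/Sobel/Laplacian algorithms, and the preprocessing of steps 1-2 (histogram equalization, mean filtering) is omitted for brevity:

```python
import numpy as np

def edge_mask_pipeline(disp):
    """Sketch of steps 3-6: edge detection, dilation, inversion, and the AND
    with the disparity map, so that disparity values at obstacle edge
    positions become 0.
    """
    d = disp.astype(np.float32)
    # step 3: gradient-magnitude edge detection (stand-in for Canny)
    gx = np.zeros_like(d)
    gy = np.zeros_like(d)
    gx[:, 1:-1] = d[:, 2:] - d[:, :-2]
    gy[1:-1, :] = d[2:, :] - d[:-2, :]
    edges = np.hypot(gx, gy) > 1.0
    # step 4: morphological dilation with a 3x3 structuring element
    dil = edges.copy()
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            dil |= np.roll(np.roll(edges, dy, axis=0), dx, axis=1)
    # step 5: invert to obtain the edge mask (edges -> 0, elsewhere -> 1)
    mask = ~dil
    # step 6: AND the mask with the disparity map; edge positions become 0
    return np.where(mask, disp, 0)
```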
- an example of determining the obstacle area in the first disparity map in the present disclosure includes the following steps:
- Step a Perform statistical processing on the disparity value of each row of pixels in the first disparity map to obtain statistical information of the disparity value of each row of pixels, and determine based on the statistical information of the disparity value of each row of pixels Statistical disparity map.
- the present disclosure may perform horizontal statistics (row-direction statistics) on the first disparity map of the environment image to obtain a V-disparity map, which may be used as the statistical disparity map. That is, for each row of the first disparity map of the environment image, the number of occurrences of each disparity value in that row is counted, and the statistical result is set in the corresponding column of the V-disparity map.
- the width of the V-disparity map (that is, the number of columns) is related to the value range of the disparity value. For example, if the value range of the disparity value is 0-254, the width of the V-disparity map is 255.
- the height of the V-disparity map is the same as the height of the first disparity map of the environment image, that is, the number of rows included in the two is the same.
- the statistical disparity map formed by the present disclosure is shown in FIG. 13.
- the top row represents disparity values from 0 to 5; the value in the second row and first column is 1, which means that the number of pixels with disparity value 0 in the first row of FIG. 4 is 1; the value in the second row and second column is 6, which means that the number of pixels with disparity value 1 in the first row of FIG. 4 is 6; the value in the fifth row and sixth column is 5, which means that the number of pixels with disparity value 5 in the corresponding row of FIG. 4 is 5.
- the other numerical values in FIG. 13 are not described one by one.
- when the first disparity map of the environment image shown in the left image in FIG. 14 is processed, the obtained V-disparity map is as shown in the right image in FIG. 14.
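The row-direction statistics described above can be sketched as follows:

```python
import numpy as np

def v_disparity(disp, max_disp=255):
    """Build the V-disparity map: for each row of the disparity map, count
    how many pixels take each disparity value. The result has the same
    number of rows as the disparity map and max_disp columns (one column
    per disparity value).
    """
    h = disp.shape[0]
    v_disp = np.zeros((h, max_disp), dtype=np.int32)
    for r in range(h):
        vals, counts = np.unique(disp[r].astype(np.int32), return_counts=True)
        keep = (vals >= 0) & (vals < max_disp)
        v_disp[r, vals[keep]] = counts[keep]  # column index = disparity value
    return v_disp
```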
- Step b Perform a first straight line fitting process on the statistical disparity map (also referred to as a V disparity map in the present disclosure), and determine the ground area and the non-ground area according to the result of the first straight line fitting process.
- the statistical disparity map also referred to as a V disparity map in the present disclosure
- the present disclosure may preprocess the V-disparity map.
- the preprocessing of the V-disparity map may include, but is not limited to: removing noise, etc.
- threshold filtering is performed on the V-disparity map to filter out noise in the V-disparity map.
- the V-disparity map for filtering noise is shown in the second image on the left in FIG. 15.
- v represents the row coordinates in the V disparity map
- d represents the disparity value.
- the oblique line in FIG. 13 represents the fitted first linear equation.
- the white diagonal line in the first picture on the right in FIG. 15 represents the fitted first straight line equation.
- the first straight line fitting method includes but is not limited to: RANSAC straight line fitting method.
- the first straight line equation obtained by the above fitting may represent the relationship between the disparity value of the ground area and the row coordinates of the V disparity map. That is, for any row in the V disparity map, when v is determined, the disparity value d of the ground area should be a certain value.
- the disparity value of the ground area can be expressed in the form of the following formula (6): d road = A·v + B
- d road represents the parallax value of the ground area
- A and B are known values, such as the values obtained by the first straight line fitting.
- the present disclosure can use formula (6) to segment the first disparity map of the environment image, so as to obtain the ground area I road and the non-ground area I notroad .
- the present disclosure may use the following formula (7) to determine the ground area and the non-ground area:
- I(*) represents the set of pixels. If the disparity value of a pixel in the first disparity map of the environment image is close to the ground disparity d road of its row (for example, within a predetermined threshold), the pixel belongs to the ground area I road ; otherwise, it belongs to the non-ground area I notroad .
- the ground area I road may be as shown in the upper right diagram in FIG. 16.
- the non-ground area I notroad can be as shown in the lower right diagram in Figure 16.
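The ground/non-ground segmentation of formulas (6)-(7) can be sketched as below; the linear form d_road = A·v + B and the tolerance test are assumptions, since the original formulas are given only as figures:

```python
import numpy as np

def segment_ground(disp, A, B, thresh=1.0):
    """Sketch of formulas (6)-(7): the fitted first straight line gives the
    expected ground disparity d_road = A * v + B for each row v; pixels
    whose disparity lies within thresh of d_road are labelled ground.
    """
    v = np.arange(disp.shape[0], dtype=np.float32)[:, None]  # row coordinate
    d_road = A * v + B                        # formula (6), assumed linear form
    ground = np.abs(disp - d_road) <= thresh  # formula (7), assumed tolerance
    return ground, ~ground                    # I_road, I_notroad
```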
- the non-ground area I notroad in the present disclosure may include: at least one of a first area I high above the ground and a second area I low below the ground.
- an area that is higher than the ground and whose height above the ground is less than a predetermined height value can be used as an obstacle area.
- the area I low below the ground may be an area such as a pit, a ditch, or a valley.
- the present disclosure may use the area in the non-ground area I notroad that is below the ground and whose depth below the ground is less than a predetermined height value as an obstacle area.
- the first area I high above the ground and the second area I low below the ground in the present disclosure can be expressed by the following formula (8):
- I notroad (*) represents the set of pixels. If the disparity value d of a pixel in the first disparity map of the environment image satisfies d - d road > thresh4, the pixel belongs to the first area I high above the ground; if it satisfies d road - d > thresh4, the pixel belongs to the second area I low below the ground. thresh4 represents a threshold, which is a known value whose size can be set according to the actual situation.
- the first area I high above the ground often includes obstacles that do not need attention, for example, target objects such as traffic lights and overpasses. Because they will not affect the driving of the vehicle, these target objects are, for the vehicle, obstacles that do not require attention. Such obstacles are often at a high position and will not affect the driving of vehicles and pedestrians.
- the present disclosure can remove regions belonging to higher positions from the first region I high above the ground, for example, remove regions whose height above the ground is greater than or equal to a first predetermined height value, thereby forming the obstacle region I obstacle .
- the present disclosure may perform the second straight line fitting process according to the V-disparity map and, according to the result of the second straight line fitting process, determine the area belonging to the higher position in the non-ground area (that is, the area whose height above the ground is greater than or equal to the first predetermined height value), thereby obtaining the obstacle area I obstacle in the non-ground area.
- the second straight line fitting method includes but is not limited to: RANSAC straight line fitting method.
- v represents the row coordinates in the V disparity map
- d represents the disparity value.
- C and D can be expressed in terms of the first straight line fitting result and the height H; therefore, the second straight line equation of the present disclosure can be expressed as d H = C·v + D.
- H is a known constant value, and H can be set according to actual needs. For example, in the intelligent control technology of vehicles, H can be set to 2.5 meters.
- the intermediate image in FIG. 18 contains two white diagonal lines, one upper and one lower; the upper white diagonal line represents the fitted second linear equation.
- the second straight line equation obtained by the above fitting may express the relationship between the parallax value of the obstacle area and the row coordinates of the V-disparity map. That is, for any row in the V disparity map, when v is determined, the disparity value d of the obstacle area should be a determined value.
- the present disclosure may divide the first area I high above the ground into the form represented by the following formula (9):
- I high (*) represents the set of pixels. If the disparity value d of a pixel in the first disparity map of the environment image satisfies d ≤ d H , the pixel is above the ground but below the height H above the ground, and belongs to the region I ≤H ; the present disclosure may regard I ≤H as the obstacle region I obstacle . If the disparity value d of a pixel in the first disparity map of the environment image satisfies d > d H , the pixel belongs to an area that is above the ground and higher than the height H above the ground, that is, I >H .
- d H represents the disparity value of the pixel point at the height H above the ground;
- I >H can be as shown in the upper right figure in FIG. 18.
- I ⁇ H can be as shown in the lower right diagram in FIG. 18.
- the method of determining the pixel columnar area according to the edge information of the obstacle may be as follows: first, the disparity values of the pixels of the non-obstacle area in the first disparity map and the disparity values of the pixels at the obstacle edge information are set to a predetermined value.
- then, the determined target rows are used as the boundaries of the obstacle pixel columnar area in the row direction, and N pixels in the column direction are used as the column width, to determine the obstacle pixel columnar areas in the obstacle area.
- the method of determining the pixel columnar area according to the obstacle edge information may be: first, according to the detected obstacle edge information, the disparity values at the obstacle edge positions in the disparity map are set to a predetermined value (such as 0), and the disparity values in the area other than the obstacle area in the disparity map are also set to a predetermined value (such as 0); then, according to a predetermined column width (at least one column of pixels wide, such as 6 columns of pixels wide), the disparity map is searched upwards from the bottom.
- when the disparity value of any column within the predetermined column width jumps from the predetermined value to a non-predetermined value, the position (the row of the disparity map) is determined to be the bottom of the pixel columnar area, and the pixel columnar area starts to form, that is, it starts to extend upwards; the search then continues upwards for a jump from a non-predetermined value to the predetermined value in the disparity map.
- when the disparity value of any column of pixels within the width jumps from a non-predetermined value to the predetermined value, the upward extension of the pixel columnar area stops, and the position (the row of the disparity map) is determined to be the top of the pixel columnar area, thus forming an obstacle pixel columnar area.
- the present disclosure can start the determination process of the obstacle pixel columnar area from the lower left corner of the disparity map to the lower right corner of the disparity map.
- the above determination process of the obstacle pixel columnar area can be executed starting from the leftmost 6 columns of the disparity map; then the determination process is performed again starting from columns 7-12 of the disparity map, and so on, until the rightmost column of the disparity map.
- the present disclosure can also start the determination process of the obstacle pixel columnar area from the lower right corner of the disparity map to the lower left corner of the disparity map.
- the method of forming the pixel columnar area according to the obstacle edge information may also be: first, according to the detected obstacle edge information, the disparity values at the obstacle edge positions in the disparity map are set to a predetermined value (such as 0), and the disparity values in the area other than the obstacle area in the disparity map are also set to a predetermined value (such as 0); then, according to a predetermined column width (at least one column of pixels wide, such as 6 columns of pixels wide), the disparity map is searched downwards from the top. When the disparity value of any column within the predetermined column width jumps from the predetermined value to a non-predetermined value, the position (the row of the disparity map) is determined to be the top of the pixel columnar area, and the pixel columnar area starts to form, that is, it starts to extend downwards; the search then continues downwards for a jump from a non-predetermined value to the predetermined value in the disparity map.
- the present disclosure can start the determination process of the obstacle pixel columnar area from the upper left corner of the disparity map to the upper right corner of the disparity map.
- the above determination process of the obstacle pixel columnar area can be executed starting from the top leftmost 6 columns of the disparity map; then the determination process is performed again starting from columns 7-12 at the top leftmost side of the disparity map, until the rightmost column of the disparity map.
- the present disclosure can also start the determination process of the obstacle pixel columnar area from the upper right corner of the disparity map to the upper left corner of the disparity map.
- the present disclosure is directed to the environment image shown in FIG. 2, and an example of the formed obstacle pixel columnar area is shown in the right figure of FIG. 19.
- the width of each obstacle pixel columnar area in the right image of FIG. 19 is 6 columns of pixels.
- the width of the obstacle pixel columnar area can be set according to actual requirements. The larger the width of the obstacle pixel columnar area, the rougher the formed obstacle pixel columnar area, but the shorter the time needed to form it.
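The bottom-up search described above can be sketched as follows. The per-row "any column non-zero" test used here is a simplification of the jump conditions, and the predetermined value is assumed to be 0:

```python
import numpy as np

def obstacle_columns(disp, col_width=6):
    """Sketch of the bottom-up columnar-area search: non-obstacle and edge
    pixels are assumed to have already been set to the predetermined value 0.
    Each group of col_width columns is scanned from the bottom row upwards;
    the first 0 -> non-0 jump marks the bottom of a columnar area, and the
    next non-0 -> 0 jump marks its top.
    """
    h, w = disp.shape
    regions = []  # (col_start, col_end, bottom_row, top_row)
    for c0 in range(0, w - col_width + 1, col_width):
        strip = disp[:, c0:c0 + col_width]
        nonzero = strip.any(axis=1)   # simplification of the jump condition
        bottom = None
        for r in range(h - 1, -1, -1):          # search from the bottom up
            if nonzero[r] and bottom is None:
                bottom = r                       # 0 -> non-0: bottom found
            elif bottom is not None and not nonzero[r]:
                regions.append((c0, c0 + col_width, bottom, r + 1))
                bottom = None                    # non-0 -> 0: top found
        if bottom is not None:                   # area reaches the top row
            regions.append((c0, c0 + col_width, bottom, 0))
    return regions
```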
- the attribute information of the obstacle pixel columnar area should be determined.
- the attribute information of the obstacle pixel columnar area includes but is not limited to: the spatial position information of the obstacle pixel columnar area, the bottom information (bottom) of the obstacle pixel columnar area, the disparity value (disp) of the obstacle pixel columnar area, the top information (top) of the obstacle pixel columnar area, and the column information (col) of the obstacle pixel columnar area.
- the spatial position information of the obstacle pixel columnar area may include: the coordinate of the obstacle pixel columnar area on the horizontal coordinate axis (X coordinate axis), and the coordinate of the obstacle pixel columnar area on the depth coordinate axis (Z coordinate axis).
- An example of X, Y, and Z coordinate axes is shown in Figure 17.
- the bottom information of the columnar area of obstacle pixels may be the row number of the bottom end of the columnar area of obstacle pixels.
- the disparity value of the obstacle pixel columnar area may be: the disparity value of the pixel at the non-zero position when the disparity value changes from zero to non-zero; the top information of the obstacle pixel columnar area may be the row number of the pixel at the zero position when the disparity value changes from non-zero to zero.
- the column information of the obstacle pixel columnar area may be the column number of any column among all the columns included in the pixel columnar area, for example, the column number of the column located in the middle of the pixel columnar area.
- the present disclosure uses the following formula (10) to calculate the spatial position information of the obstacle pixel columnar area, that is, the X coordinate, Z coordinate, maximum Y coordinate and minimum Y coordinate of the obstacle pixel columnar area:
- the Y coordinate of each pixel in the obstacle pixel columnar area can be expressed by the following formula (11):
- Y i represents the Y coordinate of the i-th pixel in the obstacle pixel columnar area
- row i represents the row number of the i-th pixel in the obstacle pixel columnar area
- c y represents the vertical coordinate of the optical center (principal point) of the camera
- Z represents the Z coordinate of the obstacle pixel columnar area
- f represents the focal length of the camera.
- the maximum Y coordinate and the minimum Y coordinate can be obtained.
- the maximum Y coordinate and the minimum Y coordinate can be expressed as the following formula (12):
- Y min represents the minimum Y coordinate of the obstacle pixel columnar area
- Y max represents the maximum Y coordinate of the obstacle pixel columnar area
- min(Y i ) represents the minimum value of all the calculated Y i
- max(Y i ) represents the maximum value of all the calculated Y i .
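Since the bodies of formulas (11) and (12) are not reproduced here, the following sketch assumes the standard pinhole-camera form Y i = (row i − c y )·Z / f, and then Y min = min(Y i ), Y max = max(Y i ); the sign convention and function name are assumptions:

```python
def column_y_range(rows, z, cy, f):
    """Sketch of formulas (11)/(12): compute Y_i = (row_i - cy) * Z / f for each
    pixel row of a columnar area, then return (Y_min, Y_max).
    The pinhole sign convention is an assumption."""
    ys = [(r - cy) * z / f for r in rows]
    return min(ys), max(ys)
```
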
- the present disclosure may perform clustering processing on a plurality of obstacle pixel columnar regions to obtain at least one cluster.
- the present disclosure can perform clustering processing on all obstacle pixel columnar regions according to the spatial position information of the obstacle pixel columnar region, and one cluster corresponds to one obstacle instance.
- the present disclosure can use a corresponding clustering algorithm to perform clustering processing on the columnar regions of each obstacle pixel.
- for example, normalization processing may first be performed on the spatial position information of the obstacle pixel columnar areas.
- the present disclosure may adopt the min-max normalization processing method to map the X and Z coordinates of the columnar area of obstacle pixels, so that the X and Z coordinates of the columnar area of obstacle pixels are mapped to [0-1] Within the value range.
- This normalization processing method is shown in the following formula (13):
- X * represents the normalized X coordinate
- Z * represents the normalized Z coordinate
- X represents the X coordinate of the obstacle pixel columnar area
- Z represents the Z coordinate of the obstacle pixel columnar area
- X min represents the minimum value of the X coordinates of all obstacle pixel columnar areas
- X max represents the maximum value of the X coordinates of all obstacle pixel columnar areas
- Z min represents the minimum value of the Z coordinates of all obstacle pixel columnar areas
- Z max represents the maximum value in the Z coordinate of all obstacle pixel columnar regions.
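Formula (13) is the standard min-max mapping into [0, 1]; a minimal sketch (the function name is illustrative, and the formula body itself is not reproduced in this text, so the form (v − min)/(max − min) is assumed from the variable definitions above):

```python
def min_max_normalize(values):
    """Sketch of formula (13): map each value into [0, 1] via
    (v - min) / (max - min), applied per coordinate axis (X or Z)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]
```
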
- the present disclosure may also adopt the Z-score normalization processing method to perform normalization processing on the X coordinate and Z coordinate of the columnar region of obstacle pixels.
- An example of this normalization processing method is shown in the following formula (14):
- X * represents the normalized X coordinate
- Z * represents the normalized Z coordinate
- X represents the X coordinate of the obstacle pixel columnar area
- Z represents the Z coordinate of the obstacle pixel columnar area
- μ X represents the mean value calculated for the X coordinates of all obstacle pixel columnar areas
- σ X represents the standard deviation calculated for the X coordinates of all obstacle pixel columnar areas
- μ Z represents the mean value calculated for the Z coordinates of all obstacle pixel columnar areas
- σ Z represents the standard deviation calculated for the Z coordinates of all obstacle pixel columnar areas.
- after the above processing, the X * and Z * of all obstacle pixel columnar regions conform to the standard normal distribution, that is, the mean is 0 and the standard deviation is 1.
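Formula (14) is the standard Z-score standardization; a minimal sketch consistent with the variable definitions above (the use of the population standard deviation, rather than the sample standard deviation, is an assumption):

```python
import statistics

def z_score_normalize(values):
    """Sketch of formula (14): (v - mean) / std, giving zero mean and
    unit standard deviation, applied per coordinate axis (X or Z)."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population std; sample std also plausible
    return [(v - mu) / sigma for v in values]
```
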
- the present disclosure may adopt a density-based clustering (DBSCAN) algorithm to perform clustering processing on the obstacle pixel columnar regions according to the normalized spatial position information of all the obstacle pixel columnar regions, thereby forming at least one cluster, where each cluster corresponds to one obstacle instance.
- the present disclosure does not limit the clustering algorithm.
- An example of the clustering result is shown in the right panel in Figure 20.
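The DBSCAN clustering step can be illustrated with a minimal pure-Python sketch. The function name and the eps/min_pts values are illustrative; in practice a library implementation (for example scikit-learn's DBSCAN) would normally be used on the normalized (X*, Z*) positions:

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN sketch for clustering the normalized (X*, Z*) positions
    of obstacle pixel columnar areas; label -1 marks noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1                # provisionally noise
            continue
        labels[i] = cluster               # i is a core point: start a new cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster       # border point reclaimed from noise
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:        # j is also a core point: expand cluster
                queue.extend(nb)
        cluster += 1
    return labels
```
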
- the obstacle detection result may include, but is not limited to, at least one of the obstacle detection frame and the spatial position information of the obstacle.
- the present disclosure may determine the obstacle detection frame (Bounding-Box) in the environment image according to the spatial position information of the pixel columnar regions belonging to the same cluster. For example, for one cluster, the present disclosure can calculate the maximum column coordinate u max and the minimum column coordinate u min , in the environment image, of all the obstacle pixel columnar regions in the cluster, and calculate the largest bottom (i.e., v max ) and the smallest top (i.e., v min ) of all the obstacle pixel columnar regions in the cluster (note: it is assumed that the origin of the image coordinate system is at the upper left corner of the image).
- the coordinates of the obstacle detection frame obtained by the present disclosure in the environment image can be expressed as (u min , v min , u max , v max ).
- an example of the obstacle detection frame determined by the present disclosure is shown in the right figure of FIG. 21.
- the multiple rectangular frames in the right figure of FIG. 21 are all obstacle detection frames obtained in the present disclosure.
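The computation of the frame (u min , v min , u max , v max ) from one cluster can be sketched as follows; the tuple layout of each columnar area (left column, right column, top row, bottom row) is an assumption made for illustration:

```python
def cluster_bounding_box(columns):
    """Derive the detection frame (u_min, v_min, u_max, v_max) of one cluster
    from its columnar areas, each given as (col_left, col_right, top, bottom);
    the image origin is assumed to be at the upper-left corner."""
    u_min = min(c[0] for c in columns)
    u_max = max(c[1] for c in columns)
    v_min = min(c[2] for c in columns)   # smallest top of the cluster
    v_max = max(c[3] for c in columns)   # largest bottom of the cluster
    return (u_min, v_min, u_max, v_max)
```
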
- the present disclosure obtains obstacles by clustering a plurality of obstacle pixel columnar regions. There is no need to predefine the obstacles to be detected, and no predefined information such as the texture, color, shape or category of the obstacles is used; obstacles are detected directly by clustering the obstacle areas. The detected obstacles are therefore not limited to certain predefined obstacles: any obstacle in the surrounding space environment that hinders the movement of the smart device can be detected, thereby realizing the detection of general types of obstacles.
- the present disclosure may also determine the spatial position information of obstacles based on the spatial position information of multiple obstacle pixel columnar regions that belong to the same cluster.
- the spatial position information of the obstacle may include, but is not limited to: the coordinate of the obstacle on the horizontal coordinate axis (X coordinate axis), the coordinate of the obstacle on the depth coordinate axis (Z coordinate axis), the height of the obstacle in the vertical direction (that is, the obstacle height), and so on.
- the present disclosure may first determine, based on the spatial position information of multiple obstacle pixel columnar regions belonging to the same cluster, the distances between these obstacle pixel columnar regions and the camera device that generates the environment image; then, based on the spatial position information of the obstacle pixel columnar region nearest to the camera device, the spatial position information of the obstacle is determined.
- the present disclosure may use the following formula (15) to calculate the distance between a plurality of obstacle pixel columnar regions in a cluster and the camera device, and select the minimum distance:
- d min represents the minimum distance
- X i represents the X coordinate of the i-th obstacle pixel columnar region in the cluster
- Z i represents the Z coordinate of the i-th obstacle pixel columnar region in the cluster.
- the X coordinate and Z coordinate of the columnar area of the obstacle pixel with the minimum distance can be used as the spatial position information of the obstacle, as shown in the following formula (16):
- O X represents the coordinate of the obstacle on the horizontal coordinate axis, that is, the X coordinate of the obstacle
- O Z represents the coordinate of the obstacle on the depth coordinate axis (Z coordinate axis), that is, the Z coordinate of the obstacle
- X close represents the X coordinate of the obstacle pixel columnar area with the minimum distance calculated above
- Z close represents the Z coordinate of the obstacle pixel columnar area with the minimum distance calculated above.
- the present disclosure may use the following formula (17) to calculate the height of the obstacle:
- O H represents the height of the obstacle
- Y max represents the maximum Y coordinate of all the pixels of the obstacle pixel columnar regions in the cluster
- Y min represents the minimum Y coordinate of all the pixels of the obstacle pixel columnar regions in the cluster.
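The bodies of formulas (15)-(17) are not reproduced in this text; the sketch below assumes the Euclidean form d i = √(X i ² + Z i ²) for the distance, O X = X close , O Z = Z close for formula (16), and O H = Y max − Y min for formula (17), consistent with the variable definitions above:

```python
import math

def obstacle_from_cluster(columns):
    """Sketch of formulas (15)-(17): each columnar area is (X, Z, Y_min, Y_max).
    The nearest area (by assumed distance sqrt(X^2 + Z^2)) gives (O_X, O_Z);
    O_H is Y_max - Y_min taken over the whole cluster."""
    nearest = min(columns, key=lambda c: math.hypot(c[0], c[1]))
    o_x, o_z = nearest[0], nearest[1]
    o_h = max(c[3] for c in columns) - min(c[2] for c in columns)
    return o_x, o_z, o_h
```
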
- the flow of one embodiment of training a convolutional neural network in the present disclosure is shown in FIG. 22.
- S2200 Input one image sample of a binocular image sample pair (such as the left-eye or right-eye image sample) into the convolutional neural network to be trained.
- the image samples input to the convolutional neural network of the present disclosure may always be left-eye image samples of binocular image samples, or may always be right-eye image samples of binocular image samples.
- if the image sample input to the convolutional neural network is always the left-eye image sample of the binocular image samples, the successfully trained convolutional neural network will treat the input environment image as a left-eye image in testing or actual application scenarios.
- if the image sample input to the convolutional neural network is always the right-eye image sample of the binocular image samples, the successfully trained convolutional neural network will treat the input environment image as a right-eye image in testing or actual application scenarios.
- S2210 Perform disparity analysis processing via a convolutional neural network, and obtain a disparity map of the left-eye image sample and a disparity map of the right-eye image sample based on the output of the convolutional neural network.
- S2220 Reconstruct the right-eye image according to the disparity map of the left-eye image sample and the right-eye image sample.
- the method of reconstructing the right-eye image in the present disclosure includes but is not limited to: performing reprojection calculation on the disparity map of the left-eye image sample and the right-eye image sample to obtain the reconstructed right-eye image.
- S2230 Reconstruct the left-eye image according to the disparity map of the right-eye image sample and the left-eye image sample.
- the method of reconstructing the left-eye image in the present disclosure includes but is not limited to: performing re-projection calculation on the right-eye image sample and the disparity map of the left-eye image sample to obtain the reconstructed left-eye image.
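Reprojection-based view reconstruction can be sketched as a horizontal warp of one view by the disparity map of the other. The sampling direction (x − d), nearest-neighbour sampling, and the function name are simplifying assumptions; bilinear sampling is typical in practice:

```python
import numpy as np

def reconstruct_left(right_img, left_disp):
    """Sketch of reprojection: sample the right image at x - d(x) to rebuild
    the left view. Out-of-range source columns are clamped to the image border."""
    h, w = left_disp.shape
    xs = np.arange(w)[None, :].repeat(h, axis=0)          # target x coordinates
    src = np.clip(np.round(xs - left_disp).astype(int), 0, w - 1)
    rows = np.arange(h)[:, None].repeat(w, axis=1)
    return right_img[rows, src]
```

The reconstructed left image is then compared against the left-eye image sample (e.g. with an L1 loss) to train the network without ground-truth disparity.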
- S2240 Adjust the network parameters of the convolutional neural network according to the difference between the reconstructed left-eye image and the left-eye image sample, and the difference between the reconstructed right-eye image and the right-eye image sample.
- the loss function used in the present disclosure when determining the difference includes, but is not limited to: L1 loss function, smooth loss function, lr-Consistency loss function, etc.
- the present disclosure back-propagates the calculated loss to adjust the network parameters of the convolutional neural network (such as the weights of the convolution kernels); for example, the loss can be back-propagated using gradients calculated based on the chain rule of derivation of the convolutional neural network, which helps to improve the training efficiency of the convolutional neural network.
- the predetermined iterative conditions in the present disclosure may include: the difference between the left-eye image reconstructed based on the disparity map output by the convolutional neural network and the left-eye image sample, and the difference between the right-eye image reconstructed based on the disparity map output by the convolutional neural network and the right-eye image sample, both meet a predetermined difference requirement. If the differences meet the requirement, this training of the convolutional neural network is successfully completed.
- the predetermined iterative conditions in the present disclosure may also include: the number of binocular image samples used in training the convolutional neural network reaches a predetermined number requirement, and so on. If the number of binocular image samples used reaches the predetermined number requirement, but the difference between the left-eye image reconstructed based on the disparity map output by the convolutional neural network and the left-eye image sample, or the difference between the reconstructed right-eye image and the right-eye image sample, does not meet the predetermined difference requirement, this training of the convolutional neural network is not successful.
- FIG. 23 is a flowchart of an embodiment of the intelligent driving control method of the present disclosure.
- the intelligent driving control method of the present disclosure can be applied but not limited to: an automatic driving (such as a fully unassisted automatic driving) environment or an assisted driving environment.
- S2300 Acquire an environment image of the smart device during the movement process through an image acquisition device set on the smart device.
- the image acquisition device includes but is not limited to: an RGB-based camera device and the like.
- S2310 Perform obstacle detection on the acquired environment image, and determine the obstacle detection result.
- the specific implementation process of this step please refer to the description of FIG. 1 in the foregoing method implementation, which is not described in detail here.
- S2320 Generate and output a control instruction according to the obstacle detection result.
- control commands generated by the present disclosure include but are not limited to: speed maintaining control commands, speed adjustment control commands (such as deceleration commands, acceleration commands, etc.), direction maintaining control commands, direction adjustment control commands (such as left steering commands, right steering commands, left lane merging commands, right lane merging commands, etc.), whistle commands, warning prompt control commands, or driving mode switching control commands (such as switching to automatic cruise driving mode, etc.).
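A toy mapping from an obstacle detection result to one of the control instructions listed above might look as follows; the thresholds, parameter names and command strings are purely illustrative assumptions and are not taken from the present disclosure:

```python
def pick_control_command(min_distance_m, lateral_offset_m):
    """Illustrative decision sketch: choose a control instruction from the
    distance to the nearest obstacle and its lateral offset from the lane
    center. All thresholds are hypothetical."""
    if min_distance_m < 5.0:
        return "decelerate"          # speed adjustment control command
    if min_distance_m < 15.0 and abs(lateral_offset_m) < 1.0:
        return "warning"             # warning prompt control command
    return "keep speed"              # speed maintaining control command
```
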
- the obstacle detection technology of the present disclosure can be applied in the field of intelligent driving control as well as in other fields; for example, it can realize obstacle detection in industrial manufacturing, obstacle detection in indoor fields such as supermarkets, obstacle detection in the security field, and so on.
- the present disclosure does not limit the application scenarios of the obstacle detection technology.
- FIG. 24 is a schematic structural diagram of an embodiment of the obstacle detection device of the present disclosure.
- the device in FIG. 24 includes: an acquisition module 2400, a first determination module 2410, a clustering module 2420, a second determination module 2430, and a training module 2440.
- the obtaining module 2400 is used to obtain the first disparity map of the environment image.
- the environment image is an image that characterizes the spatial environment information of the smart device during the movement.
- the environmental image includes a monocular image.
- the obtaining module 2400 may include: a first sub-module, a second sub-module, and a third sub-module.
- the first sub-module is used to analyze and process the disparity of the monocular image using the convolutional neural network, and obtain the first disparity map of the monocular image based on the output of the convolutional neural network; the convolutional neural network uses the binocular image Sample, obtained by training.
- the second sub-module is used to perform mirror image processing on the monocular image to obtain a first mirror image and obtain a disparity map of the first mirror image.
- the third sub-module is configured to perform disparity adjustment on the first disparity map of the monocular image according to the disparity map of the first mirror image to obtain the first disparity map after the disparity adjustment.
- the third sub-module may include: a first unit and a second unit.
- the first unit is used to perform mirror processing on the disparity map of the first mirror image to obtain the second mirror image.
- the second unit is configured to perform disparity adjustment on the first disparity map according to the weight distribution map of the first disparity map and the weight distribution map of the second mirror image to obtain the first disparity map after the disparity adjustment.
- the weight distribution map of the first disparity map includes the weight values corresponding to the multiple disparity values in the first disparity map, and the weight distribution map of the second mirror image includes the weight values corresponding to the multiple disparity values in the second mirror image.
- the weight distribution diagram in the present disclosure includes: at least one of a first weight distribution diagram and a second weight distribution diagram.
- the first weight distribution map is a weight distribution map set uniformly for multiple environmental images; the second weight distribution map is a weight distribution map set separately for different environmental images.
- the first weight distribution map includes at least two left and right regions, and different regions have different weight values.
- in one case, for the first weight distribution map of the first disparity map, the weight value of the region on the right is not less than the weight value of the region on the left, that is, the weight value of the left part of the region is not greater than the weight value of the right part of the region; the first weight distribution map of the second mirror image is set correspondingly.
- in another case, the weight value of the region on the left is not less than the weight value of the region on the right, that is, the weight value of the right part of the region is not greater than the weight value of the left part of the region; this likewise applies to the first weight distribution map of the second mirror image.
- the third sub-module may further include: a third unit configured to set the second weight distribution map of the first disparity map. Specifically, the third unit performs mirror processing on the first disparity map to form a mirror disparity map, and sets the weight values in the second weight distribution map of the first disparity map according to the disparity values in the mirror disparity map of the first disparity map. For example, for a pixel at any position in the mirror disparity map, if the disparity value of the pixel at that position satisfies the first predetermined condition, the third unit sets the weight value of the pixel at that position in the second weight distribution map of the first disparity map to the first value.
- otherwise, if the disparity value of the pixel at that position does not satisfy the first predetermined condition, the third unit may set the weight value of the pixel at that position in the second weight distribution map of the first disparity map to the second value; here, the first value is greater than the second value.
- the first predetermined condition may include: the disparity value of the pixel at the position is greater than the first reference value of the pixel at the position.
- the first reference value of the pixel at the position is set according to the disparity value of the pixel at the position in the first disparity map and a constant value greater than zero.
- the third sub-module may further include: a fourth unit, configured to set the second weight distribution map of the second mirror image.
- the fourth unit sets the weight values in the second weight distribution map of the second mirror image according to the disparity values in the first disparity map. More specifically, for a pixel at any position in the second mirror image, if the disparity value of the pixel at that position in the first disparity map meets the second predetermined condition, the fourth unit sets the weight value of the pixel at that position in the second weight distribution map of the second mirror image to the third value.
- otherwise, if the disparity value does not meet the second predetermined condition, the fourth unit sets the weight value of the pixel at that position in the second weight distribution map of the second mirror image to the fourth value; here, the third value is greater than the fourth value.
- the second predetermined condition includes: the disparity value of the pixel at the position in the first disparity map is greater than the second reference value of the pixel at the position; wherein the second reference value of the pixel at the position is It is set according to the disparity value of the pixel at the position in the mirror disparity map of the first disparity map and a constant value greater than zero.
- the second unit may adjust the disparity values in the first disparity map according to the first weight distribution map and the second weight distribution map of the first disparity map, and adjust the disparity values in the second mirror image according to the first weight distribution map and the second weight distribution map of the second mirror image; the second unit then merges the disparity-adjusted first disparity map and the disparity-adjusted second mirror image to finally obtain the adjusted first disparity map.
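The weight-map-based adjustment and merge performed by the second unit can be sketched as follows. The additive form of the reference value in the predetermined condition and the product-then-sum combination rule are assumptions made for illustration; the passage itself does not state the exact formulas:

```python
import numpy as np

def second_weight_map(disp, mirror_disp, c=1.0, hi=1.0, lo=0.0):
    """Sketch of a predetermined condition: where the mirror disparity exceeds
    the reference value (assumed here to be the other map's disparity plus a
    constant c > 0), the weight is hi, otherwise lo."""
    return np.where(mirror_disp > disp + c, hi, lo)

def fuse_disparities(first_disp, second_mirror, w1_first, w2_first,
                     w1_mirror, w2_mirror):
    """Sketch of the merge: each disparity map is scaled by the element-wise
    product of its first and second weight distribution maps, then summed."""
    return first_disp * w1_first * w2_first + second_mirror * w1_mirror * w2_mirror
```
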
- the first determining module 2410 is configured to determine multiple obstacle pixel regions in the first disparity map of the environment image.
- the first determining module 2410 may include: a fourth sub-module, a fifth sub-module, and a sixth sub-module.
- the fourth sub-module is used to perform edge detection on the first disparity map of the environment image to obtain obstacle edge information.
- the fifth sub-module is used to determine the obstacle area in the first disparity map of the environment image; the sixth sub-module is used to determine a plurality of obstacle pixel columnar areas in the obstacle area of the first disparity map according to the obstacle edge information.
- the fifth sub-module may include: a fifth unit, a sixth unit, a seventh unit, and an eighth unit.
- the fifth unit is used to perform statistical processing on the disparity value of each row of pixels in the first disparity map to obtain statistical information of the disparity value of each row of pixels.
- the sixth unit is used to determine the statistical disparity map based on the statistical information of the disparity value of each row of pixels; the seventh unit is used to perform the first straight line fitting processing on the statistical disparity map, and according to the first straight line fitting processing The result of determining the ground area and the non-ground area; the eighth unit is used to determine the obstacle area according to the non-ground area.
- the non-ground area includes: the first area above the ground.
- the non-ground area includes: the first area above the ground and the second area below the ground.
- the eighth unit may perform the second straight line fitting process on the statistical disparity map and, according to the result of the second straight line fitting process, determine the first target area in the first area whose height above the ground is less than the first predetermined height value; the first target area is an obstacle area. In the case that there is a second area lower than the ground in the non-ground area, the eighth unit determines the second target area in the second area whose height below the ground is greater than the second predetermined height value.
- the second target area is the obstacle area.
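The row-wise disparity statistics and straight-line fitting performed by the fifth through eighth units correspond to the classic V-disparity approach; a minimal sketch under that assumption (function name and the dominant-bin heuristic are illustrative):

```python
import numpy as np

def fit_ground_line(disparity, max_disp=64):
    """Build a V-disparity histogram (one entry per image row, one bin per
    integer disparity) and fit row = a * disp + b through the dominant bin of
    each row; pixels far from this line can then be treated as non-ground."""
    h, _ = disparity.shape
    rows, disps = [], []
    for r in range(h):
        hist = np.bincount(disparity[r].astype(int), minlength=max_disp)
        d = int(hist.argmax())
        if d > 0:                          # ignore rows dominated by zero disparity
            rows.append(r)
            disps.append(d)
    a, b = np.polyfit(disps, rows, 1)      # least-squares straight-line fit
    return a, b
```
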
- the sixth sub-module may set the disparity values of the pixels of the non-obstacle area in the first disparity map and the disparity values of the pixels at the obstacle edge information to a predetermined value; taking N pixels in the column direction of the first disparity map as the traversal unit, the sixth sub-module traverses the disparity values of the N pixels on each row starting from a set row of the first disparity map, and determines the target rows at which the disparity values jump between the predetermined value and a non-predetermined value; then, using N pixels in the column direction as the column width and the determined target rows as the boundaries of the obstacle pixel columnar area in the row direction, the sixth sub-module determines the obstacle pixel columnar areas in the obstacle area.
- the clustering module 2420 is used to perform clustering processing on a plurality of obstacle pixel regions to obtain at least one cluster.
- the clustering module 2420 may perform clustering processing on a plurality of obstacle pixel columnar regions.
- the clustering module 2420 may include a seventh sub-module and an eighth sub-module.
- the seventh sub-module is used to determine the spatial position information of a plurality of obstacle pixel columnar regions.
- the eighth sub-module is used to perform clustering processing on the multiple obstacle pixel columnar regions according to the spatial position information of the multiple obstacle pixel columnar regions.
- more specifically, the eighth sub-module determines the attribute information of the obstacle pixel columnar area according to the pixels contained in the obstacle pixel columnar area, and determines the spatial position information of the obstacle pixel columnar area according to the attribute information of the obstacle pixel columnar area.
- the attribute information of the obstacle pixel columnar area may include at least one of pixel columnar area bottom information, pixel columnar area top information, pixel columnar area disparity value, and pixel columnar area column information.
- the spatial position information of the obstacle pixel columnar area may include the coordinates of the obstacle pixel columnar area on the horizontal coordinate axis, and the coordinate of the obstacle pixel columnar area on the depth coordinate axis.
- the spatial position information of the obstacle pixel columnar area may further include: the coordinates of the highest point of the obstacle pixel columnar area on the vertical coordinate axis, and the coordinates of the lowest point of the obstacle pixel columnar area on the vertical coordinate axis; where The coordinates of the highest point and the lowest point are used to determine the height of the obstacle.
- the second determining module 2430 is configured to determine the obstacle detection result according to the obstacle pixel areas belonging to the same cluster.
- the second determining module may include: at least one of a ninth sub-module and a tenth sub-module.
- the ninth sub-module is used to determine the obstacle detection frame in the environment image according to the spatial position information of the columnar area of obstacle pixels belonging to the same cluster.
- the tenth sub-module is used to determine the spatial position information of the obstacle according to the spatial position information of the columnar area of obstacle pixels belonging to the same cluster.
- the tenth sub-module can determine, according to the spatial position information of the multiple obstacle pixel columnar areas belonging to the same cluster, the distances between these obstacle pixel columnar areas and the camera device that generates the environment image, and then determine the spatial position information of the obstacle according to the spatial position information of the obstacle pixel columnar area nearest to the camera device.
- the training module 2440 is used to train the convolutional neural network. For example, the training module 2440 inputs one image sample of a binocular image sample pair into the convolutional neural network to be trained, performs disparity analysis processing through the convolutional neural network, and obtains the disparity map of the left-eye image sample and the disparity map of the right-eye image sample based on the output of the convolutional neural network; the training module 2440 reconstructs the right-eye image according to the disparity map of the left-eye image sample and the right-eye image sample, and reconstructs the left-eye image according to the disparity map of the right-eye image sample and the left-eye image sample; the training module 2440 adjusts the network parameters of the convolutional neural network according to the difference between the reconstructed left-eye image and the left-eye image sample and the difference between the reconstructed right-eye image and the right-eye image sample.
- for the specific operations performed by the training module 2440, please refer to the above description of FIG. 22, which will not be repeated here.
- FIG. 25 is a schematic structural diagram of an embodiment of the intelligent driving control device of the present disclosure.
- the device in FIG. 25 includes: an acquisition module 2500, an obstacle detection device 2510, and a control module 2520.
- the acquiring module 2500 is configured to acquire an environmental image of the smart device during the movement process through the image acquisition device set on the smart device.
- the obstacle detection device 2510 is used to perform obstacle detection on the environment image and determine the obstacle detection result.
- the control module 2520 is used to generate and output vehicle control instructions according to the obstacle detection result.
- FIG. 26 shows an exemplary device 2600 suitable for implementing the present disclosure.
- the device 2600 may be a control system/electronic system configured in a car, a mobile terminal (for example, a smart mobile phone), a personal computer (PC, for example, a desktop or notebook computer), a tablet, a server, and so on.
- the device 2600 includes one or more processors, a communication part, etc.; the one or more processors may be one or more central processing units (CPUs) 2601 and/or one or more graphics processing units (GPUs) 2613 for neural network processing, etc.
- the processor can execute instructions stored in a read-only memory (ROM) 2602, or instructions loaded from the storage section 2608 into a random access memory (RAM) 2603, so as to perform various appropriate actions and processing.
- the communication unit 2612 may include but is not limited to a network card, and the network card may include but is not limited to an IB (Infiniband) network card.
- the processor can communicate with the read-only memory 2602 and/or the random access memory 2603 to execute executable instructions, connect to the communication unit 2612 via the bus 2604, and communicate with other target devices via the communication unit 2612, thereby completing the steps corresponding to the methods provided in the present disclosure.
- RAM 2603 can also store various programs and data required for device operation.
- the CPU 2601, the ROM 2602, and the RAM 2603 are connected to each other through a bus 2604.
- ROM 2602 is an optional module.
- the RAM 2603 stores executable instructions, or executable instructions are written into the ROM 2602 at runtime; the executable instructions cause the central processing unit 2601 to execute the steps of the above method.
- An input/output (I/O) interface 2605 is also connected to the bus 2604.
- the communication unit 2612 may be integrated, or may be configured to have multiple sub-modules (for example, multiple IB network cards) and be connected to the bus respectively.
- the following components are connected to the I/O interface 2605: an input section 2606 including a keyboard, a mouse, etc.; an output section 2607 including a cathode ray tube (CRT), a liquid crystal display (LCD), speakers, etc.; a storage section 2608 including a hard disk, etc.; and a communication section 2609 including a network interface card such as a LAN card or a modem.
- the communication section 2609 performs communication processing via a network such as the Internet.
- the driver 2610 is also connected to the I/O interface 2605 as needed.
- a removable medium 2611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is installed on the drive 2610 as required, so that a computer program read from it can be installed into the storage section 2608 as needed.
- FIG. 26 shows only an optional implementation. In practice, the number and types of components in FIG. 26 may be selected, deleted, added, or replaced according to actual needs. Different functional components may also be arranged separately or in an integrated manner; for example, the GPU 2613 and the CPU 2601 may be arranged separately, or the GPU 2613 may be integrated into the CPU 2601, and the communication part may be arranged separately or integrated into the CPU 2601 or the GPU 2613. These alternative embodiments all fall within the protection scope of the present disclosure.
- the processes described with reference to the flowcharts can be implemented as computer software programs.
- the embodiments of the present disclosure include a computer program product, which includes a computer program tangibly embodied on a machine-readable medium.
- the computer program includes program code for executing the steps shown in the flowchart.
- the program code may include instructions corresponding to the steps in the method provided by the present disclosure.
- the computer program may be downloaded and installed from the network through the communication part 2609, and/or installed from the removable medium 2611.
- when the computer program is executed by the central processing unit (CPU) 2601, the instructions for implementing the above corresponding steps described in the present disclosure are executed.
- the embodiments of the present disclosure also provide a computer program product for storing computer-readable instructions which, when executed, cause a computer to execute the obstacle detection method or intelligent driving control method described in any of the foregoing embodiments.
- the computer program product can be specifically implemented by hardware, software or a combination thereof.
- in one implementation, the computer program product is embodied as a computer storage medium.
- in another implementation, the computer program product is embodied as a software product, such as a software development kit (SDK).
- the embodiments of the present disclosure also provide another obstacle detection method and intelligent driving control method, as well as corresponding apparatuses, electronic devices, computer storage media, computer programs, and computer program products.
- the method includes: a first device sends an obstacle detection instruction or an intelligent driving control instruction to a second device, the instruction causing the second device to execute the obstacle detection method or the intelligent driving control method in any of the foregoing possible embodiments; and the first device receives the obstacle detection result or the intelligent driving control result sent by the second device.
- the obstacle detection instruction or the intelligent driving control instruction may specifically be a calling instruction;
- the first device may instruct the second device, by way of a call, to perform the obstacle detection operation or the intelligent driving control operation; accordingly,
- in response to receiving the calling instruction, the second device may execute the steps and/or processes in any embodiment of the obstacle detection method or the intelligent driving control method described above.
- any component, data, or structure mentioned in the present disclosure can generally be understood as one or more, unless explicitly defined otherwise or the context suggests the contrary. It should also be understood that the description of the various embodiments in the present disclosure emphasizes the differences between them; for their identical or similar parts, the embodiments may be referred to one another, and for the sake of brevity, these are not repeated one by one.
- the method and apparatus, electronic equipment, and computer-readable storage medium of the present disclosure may be implemented in many ways.
- the method and apparatus, electronic equipment, and computer-readable storage medium of the present disclosure can be implemented by software, hardware, firmware or any combination of software, hardware, and firmware.
- the present disclosure can also be implemented as programs recorded in a recording medium, and these programs include machine-readable instructions for implementing the method according to the present disclosure.
- the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
- the description of the present disclosure is given for the sake of example and description and is not exhaustive, nor does it limit the present disclosure to the disclosed forms. Many modifications and changes are obvious to those of ordinary skill in the art. The embodiments were selected and described in order to better explain the principles and practical applications of the present disclosure, and to enable those of ordinary skill in the art to understand that various embodiments with various modifications suited to particular purposes can be designed.
Abstract
Description
Claims (69)
- An obstacle detection method, characterized by comprising: acquiring a first disparity map of an environment image, the environment image being an image representing information on the spatial environment in which a smart device is located during movement; determining a plurality of obstacle pixel regions in the first disparity map of the environment image; performing clustering processing on the plurality of obstacle pixel regions to obtain at least one cluster; and determining an obstacle detection result according to obstacle pixel regions belonging to a same cluster.
- The method according to claim 1, wherein the environment image comprises a monocular image, and after obtaining the first disparity map of the environment image, the method further comprises: performing mirroring processing on the monocular image to obtain a first mirror image, and acquiring a disparity map of the first mirror image; and performing disparity adjustment on the first disparity map of the monocular image according to the disparity map of the first mirror image to obtain a disparity-adjusted first disparity map; wherein determining the plurality of obstacle pixel regions in the first disparity map of the environment image comprises: determining the plurality of obstacle pixel regions in the disparity-adjusted first disparity map.
- The method according to claim 2, wherein performing disparity adjustment on the first disparity map of the monocular image according to the disparity map of the first mirror image to obtain the disparity-adjusted first disparity map comprises: performing mirroring processing on the disparity map of the first mirror image to obtain a second mirror map; and performing disparity adjustment on the first disparity map according to a weight distribution map of the first disparity map and a weight distribution map of the second mirror map to obtain the disparity-adjusted first disparity map; wherein the weight distribution map of the first disparity map comprises weight values respectively corresponding to a plurality of disparity values in the first disparity map, and the weight distribution map of the second mirror map comprises weights respectively corresponding to a plurality of disparity values in the second mirror map.
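The mirror-and-merge step of claim 3 can be sketched in a few lines. This is an illustrative interpretation only: the function names and the weight normalization are assumptions, not the patent's specification, and the weight maps stand in for the claimed weight distribution maps.

```python
import numpy as np

def second_mirror_map(disp_of_mirrored_image):
    # re-mirror the disparity map computed on the mirrored input so it is
    # back in the coordinate frame of the original monocular image
    return np.fliplr(disp_of_mirrored_image)

def adjust_disparity(first_disp, second_mirror, w_first, w_mirror):
    """Per-pixel weighted blend of the first disparity map and the second
    mirror map, normalized so the result is one adjusted disparity map."""
    total = w_first + w_mirror
    total = np.where(total == 0, 1.0, total)  # guard against all-zero weights
    return (w_first * first_disp + w_mirror * second_mirror) / total
```

Intuitively, the mirrored pass sees occlusions on the opposite image border, so blending the two maps with side-dependent weights (claims 5-9) fills in disparity values that a single pass estimates poorly.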
- The method according to claim 3, wherein the weight distribution map comprises a first weight distribution map and/or a second weight distribution map; the first weight distribution map is a weight distribution map set uniformly for a plurality of environment images, and the second weight distribution map is a weight distribution map set separately for different environment images.
- The method according to claim 4, wherein the first weight distribution map comprises at least two laterally arranged regions, different regions having different weight values.
- The method according to claim 5, wherein, in a case where the monocular image is a left-eye image: for any two regions in the first weight distribution map of the first disparity map, the weight value of the region on the right is not smaller than the weight value of the region on the left; and for any two regions in the first weight distribution map of the second mirror map, the weight value of the region on the right is not smaller than the weight value of the region on the left.
- The method according to claim 6, wherein: for at least one region in the first weight distribution map of the first disparity map, the weight value of the left part of the region is not greater than the weight value of the right part of the region; and for at least one region in the first weight distribution map of the second mirror map, the weight value of the left part of the region is not greater than the weight value of the right part of the region.
- The method according to claim 5, wherein, in a case where the monocular image is a right-eye image: for any two regions in the first weight distribution map of the first disparity map, the weight value of the region on the left is not smaller than the weight value of the region on the right; and for any two regions in the first weight distribution map of the second mirror map, the weight value of the region on the left is not smaller than the weight value of the region on the right.
- The method according to claim 8, wherein: for at least one region in the first weight distribution map of the first disparity map, the weight value of the right part of the region is not greater than the weight value of the left part of the region; and for at least one region in the first weight distribution map of the second mirror map, the weight value of the right part of the region is not greater than the weight value of the left part of the region.
- The method according to any one of claims 4 to 9, wherein the second weight distribution map of the first disparity map is set by: performing mirroring processing on the first disparity map to form a mirror disparity map; and setting weight values in the second weight distribution map of the first disparity map according to disparity values in the mirror disparity map of the first disparity map.
- The method according to claim 10, wherein setting the weight values in the second weight distribution map of the first disparity map according to the disparity values in the mirror disparity map of the first disparity map comprises: for a pixel at any position in the mirror disparity map, in a case where the disparity value of the pixel at that position satisfies a first predetermined condition, setting the weight value of the pixel at that position in the second weight distribution map of the first disparity map to a first value.
- The method according to claim 11, further comprising: in a case where the disparity value of the pixel at that position does not satisfy the first predetermined condition, setting the weight value of the pixel at that position in the second weight distribution map of the first disparity map to a second value, wherein the first value is greater than the second value.
- The method according to claim 11 or 12, wherein the first predetermined condition comprises: the disparity value of the pixel at that position is greater than a first reference value of the pixel at that position, wherein the first reference value of the pixel at that position is set according to the disparity value of the pixel at that position in the first disparity map and a constant value greater than zero.
- The method according to any one of claims 4 to 13, wherein the second weight distribution map of the second mirror map is set by: setting weight values in the second weight distribution map of the second mirror map according to disparity values in the first disparity map.
- The method according to claim 14, wherein setting the weight values in the second weight distribution map of the second mirror map according to the disparity values in the first disparity map comprises: for a pixel at any position in the second mirror map, if the disparity value of the pixel at that position in the first disparity map satisfies a second predetermined condition, setting the weight value of the pixel at that position in the second weight distribution map of the second mirror map to a third value.
- The method according to claim 15, further comprising: in a case where the disparity value of the pixel at that position in the first disparity map does not satisfy the second predetermined condition, setting the weight value of the pixel at that position in the second weight distribution map of the second mirror map to a fourth value, wherein the third value is greater than the fourth value.
- The method according to claim 15 or 16, wherein the second predetermined condition comprises: the disparity value of the pixel at that position in the first disparity map is greater than a second reference value of the pixel at that position, wherein the second reference value of the pixel at that position is set according to the disparity value of the pixel at that position in the mirror disparity map of the first disparity map and a constant value greater than zero.
- The method according to any one of claims 4 to 17, wherein performing disparity adjustment on the first disparity map according to the weight distribution map of the first disparity map and the weight distribution map of the second mirror map to obtain the disparity-adjusted first disparity map comprises: adjusting the disparity values in the first disparity map according to the first weight distribution map and the second weight distribution map of the first disparity map; adjusting the disparity values in the second mirror map according to the first weight distribution map and the second weight distribution map of the second mirror map; and merging the disparity-adjusted first disparity map and the disparity-adjusted second mirror map to finally obtain the disparity-adjusted first disparity map.
- The method according to claim 1, wherein the environment image comprises a monocular image, and acquiring the first disparity map of the environment image comprises: performing disparity analysis processing on the monocular image using a convolutional neural network, and obtaining the first disparity map of the monocular image based on the output of the convolutional neural network, wherein the convolutional neural network is trained using binocular image samples.
- The method according to claim 19, wherein the training process of the convolutional neural network comprises: inputting one of a pair of binocular image samples into the convolutional neural network to be trained, performing disparity analysis processing via the convolutional neural network, and obtaining, based on the output of the convolutional neural network, a disparity map of the left-eye image sample and a disparity map of the right-eye image sample; reconstructing a right-eye image according to the left-eye image sample and the disparity map of the right-eye image sample; reconstructing a left-eye image according to the right-eye image sample and the disparity map of the left-eye image sample; and adjusting network parameters of the convolutional neural network according to the difference between the reconstructed left-eye image and the left-eye image sample and the difference between the reconstructed right-eye image and the right-eye image sample.
- The method according to any one of claims 1 to 20, wherein determining the plurality of obstacle pixel regions in the first disparity map of the environment image comprises: performing edge detection on the first disparity map of the environment image to obtain obstacle edge information; determining an obstacle region in the first disparity map of the environment image; and determining a plurality of obstacle pixel columnar regions in the obstacle region according to the obstacle edge information.
- The method according to claim 21, wherein determining the obstacle region in the first disparity map of the environment image comprises: performing statistical processing on the disparity values of the pixels in each row of the first disparity map to obtain statistical information on the disparity values of each row of pixels; determining a statistical disparity map based on the statistical information on the disparity values of each row of pixels; performing first straight-line fitting processing on the statistical disparity map, and determining a ground region and a non-ground region according to the result of the first straight-line fitting processing; and determining the obstacle region according to the non-ground region.
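The row-wise statistics and first straight-line fitting of claim 22 correspond to the classic V-disparity technique. The sketch below is one possible reading, not the patent's implementation: the vote threshold and the least-squares fit are illustrative assumptions.

```python
import numpy as np

def v_disparity(disp, max_d=64):
    """'Statistical disparity map': entry [r, d] counts how many pixels in
    row r of the disparity map have (integer) disparity d."""
    h = disp.shape[0]
    vmap = np.zeros((h, max_d), dtype=np.int32)
    for r in range(h):
        d = np.clip(disp[r].astype(int), 0, max_d - 1)
        np.add.at(vmap[r], d, 1)  # accumulate votes per disparity bin
    return vmap

def fit_ground_line(vmap, min_votes):
    """Least-squares line row = a*d + b over well-populated cells.
    Pixels whose (row, disparity) lies near this line belong to the ground
    region; pixels well off the line form the non-ground region."""
    rows, ds = np.nonzero(vmap >= min_votes)
    a, b = np.polyfit(ds, rows, 1)
    return a, b
```

Because a planar road projects to a straight line in V-disparity space while vertical obstacles project to vertical segments, this one fit separates ground from non-ground in a single pass.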
- The method according to claim 22, wherein the non-ground region comprises a first region above the ground, or the non-ground region comprises a first region above the ground and a second region below the ground.
- The method according to claim 23, wherein determining the obstacle region according to the non-ground region comprises: performing second straight-line fitting processing on the statistical disparity map, and determining, according to the result of the second straight-line fitting processing, a first target region in the first region whose height above the ground is smaller than a first predetermined height value, the first target region being an obstacle region; and, in a case where the non-ground region includes a second region below the ground, determining a second target region in the second region whose depth below the ground is greater than a second predetermined height value, the second target region being an obstacle region.
- The method according to any one of claims 21 to 24, wherein determining the plurality of obstacle pixel columnar regions in the obstacle region of the first disparity map according to the obstacle edge information comprises: setting the disparity values of pixels in non-obstacle regions of the first disparity map and the disparity values of pixels at the obstacle edge information to a predetermined value; taking N pixels in the column direction of the first disparity map as a traversal unit, traversing the disparity values of N pixels in each row starting from a set row of the first disparity map, and determining target rows where the disparity values of the pixels jump between the predetermined value and a non-predetermined value, N being a positive integer; and determining the obstacle pixel columnar regions in the obstacle region by taking N pixels in the column direction as the column width and the determined target rows as the boundaries of the obstacle pixel columnar regions in the row direction.
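The column traversal of claim 25 can be illustrated for a single N-pixel-wide strip. This is a hedged sketch: the sentinel value and the choice of `max` to collapse the strip are assumptions made for illustration, not details from the patent.

```python
import numpy as np

SENTINEL = 0.0  # stands in for the claim's "predetermined value"

def column_jump_rows(disp, col, n, start_row=0):
    """Scan an N-pixel-wide column strip from a set row downward and return
    the rows where disparity jumps between the sentinel and a real value;
    these target rows bound an obstacle pixel columnar region."""
    strip = disp[start_row:, col:col + n].max(axis=1)  # collapse strip width
    occupied = strip != SENTINEL
    jumps = np.flatnonzero(np.diff(occupied.astype(np.int8))) + 1 + start_row
    return jumps.tolist()
```

Running this for every strip of width N yields, per strip, the row boundaries of its obstacle pixel columnar regions, which are then clustered as in claim 26.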
- The method according to any one of claims 1 to 25, wherein the obstacle pixel regions comprise obstacle pixel columnar regions, and performing clustering processing on the plurality of obstacle pixel regions comprises: determining spatial position information of the plurality of obstacle pixel columnar regions; and performing clustering processing on the plurality of obstacle pixel columnar regions according to the spatial position information of the plurality of obstacle pixel columnar regions.
- The method according to claim 26, wherein determining the spatial position information of the plurality of obstacle pixel columnar regions comprises: for any obstacle pixel columnar region, determining attribute information of the obstacle pixel columnar region according to the pixels contained in the obstacle pixel columnar region, and determining the spatial position information of the obstacle pixel columnar region according to the attribute information of the obstacle pixel columnar region.
- The method according to claim 27, wherein the attribute information of the obstacle pixel columnar region comprises at least one of: bottom information of the pixel columnar region, top information of the pixel columnar region, a disparity value of the pixel columnar region, and column information of the pixel columnar region.
- The method according to any one of claims 26 to 28, wherein the spatial position information of the obstacle pixel columnar region comprises: a coordinate of the obstacle pixel columnar region on a horizontal coordinate axis, and a coordinate of the obstacle pixel columnar region on a depth coordinate axis.
- The method according to claim 29, wherein the spatial position information of the obstacle pixel columnar region further comprises: a highest-point coordinate of the obstacle pixel columnar region on a vertical coordinate axis and a lowest-point coordinate of the obstacle pixel columnar region on the vertical coordinate axis, the highest-point coordinate and the lowest-point coordinate being used to determine the obstacle height.
- The method according to any one of claims 1 to 30, wherein the obstacle pixel regions comprise obstacle pixel columnar regions, and determining the obstacle detection result according to the obstacle pixel regions belonging to a same cluster comprises: determining an obstacle detection box in the environment image according to the spatial position information of the obstacle pixel columnar regions belonging to a same cluster; and/or determining spatial position information of the obstacle according to the spatial position information of the obstacle pixel columnar regions belonging to a same cluster.
- The method according to claim 31, wherein determining the spatial position information of the obstacle according to the spatial position information of the obstacle pixel columnar regions belonging to a same cluster comprises: determining, according to the spatial position information of a plurality of obstacle pixel columnar regions belonging to a same cluster, distances between the plurality of obstacle pixel columnar regions and the camera device that generated the environment image; and determining the spatial position information of the obstacle according to the spatial position information of the obstacle pixel columnar region closest to the camera device.
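For the distance determination in claim 32, the depth of each columnar region follows from the standard rectified-stereo relation Z = f·B/d (focal length in pixels, baseline in meters); the nearest column is simply the one with the largest disparity. The parameter names below are illustrative assumptions, not the patent's notation.

```python
def depth_from_disparity(d, focal_px, baseline_m):
    """Rectified-stereo depth: Z = f * B / d."""
    if d <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / d

def nearest_obstacle_distance(column_disparities, focal_px, baseline_m):
    # the columnar region with the largest disparity is closest to the camera
    return depth_from_disparity(max(column_disparities), focal_px, baseline_m)
```

For example, with a 700 px focal length and a 0.5 m baseline, a column with disparity 7 lies 50 m from the camera; that nearest column then fixes the obstacle's reported spatial position.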
- An intelligent driving control method, characterized by comprising: acquiring, through an image acquisition device provided on a smart device, an environment image of the smart device during movement; performing obstacle detection on the acquired environment image using the method according to any one of claims 1-32 to determine an obstacle detection result; and generating and outputting a control instruction according to the obstacle detection result.
- An obstacle detection apparatus, characterized by comprising: an acquisition module configured to acquire a first disparity map of an environment image, the environment image being an image representing information on the spatial environment in which a smart device is located during movement; a first determination module configured to determine a plurality of obstacle pixel regions in the first disparity map of the environment image; a clustering module configured to perform clustering processing on the plurality of obstacle pixel regions to obtain at least one cluster; and a second determination module configured to determine an obstacle detection result according to obstacle pixel regions belonging to a same cluster.
- The apparatus according to claim 34, wherein the acquisition module further comprises: a second sub-module configured to perform mirroring processing on a monocular image in the environment image to obtain a first mirror image and acquire a disparity map of the first mirror image; and a third sub-module configured to perform disparity adjustment on the first disparity map of the monocular image according to the disparity map of the first mirror image to obtain a disparity-adjusted first disparity map; the first determination module being further configured to determine the plurality of obstacle pixel regions in the disparity-adjusted first disparity map.
- The apparatus according to claim 35, wherein the third sub-module comprises: a first unit configured to perform mirroring processing on the disparity map of the first mirror image to obtain a second mirror map; and a second unit configured to perform disparity adjustment on the first disparity map according to a weight distribution map of the first disparity map and a weight distribution map of the second mirror map to obtain the disparity-adjusted first disparity map; wherein the weight distribution map of the first disparity map comprises weight values respectively corresponding to a plurality of disparity values in the first disparity map, and the weight distribution map of the second mirror map comprises weights respectively corresponding to a plurality of disparity values in the second mirror map.
- The apparatus according to claim 36, wherein the weight distribution map comprises a first weight distribution map and/or a second weight distribution map; the first weight distribution map is a weight distribution map set uniformly for a plurality of environment images, and the second weight distribution map is a weight distribution map set separately for different environment images.
- The apparatus according to claim 37, wherein the first weight distribution map comprises at least two laterally arranged regions, different regions having different weight values.
- The apparatus according to claim 38, wherein, in a case where the monocular image is a left-eye image: for any two regions in the first weight distribution map of the first disparity map, the weight value of the region on the right is not smaller than the weight value of the region on the left; and for any two regions in the first weight distribution map of the second mirror map, the weight value of the region on the right is not smaller than the weight value of the region on the left.
- The apparatus according to claim 39, wherein: for at least one region in the first weight distribution map of the first disparity map, the weight value of the left part of the region is not greater than the weight value of the right part of the region; and for at least one region in the first weight distribution map of the second mirror map, the weight value of the left part of the region is not greater than the weight value of the right part of the region.
- The apparatus according to claim 38, wherein, in a case where the monocular image is a right-eye image: for any two regions in the first weight distribution map of the first disparity map, the weight value of the region on the left is not smaller than the weight value of the region on the right; and for any two regions in the first weight distribution map of the second mirror map, the weight value of the region on the left is not smaller than the weight value of the region on the right.
- The apparatus according to claim 41, wherein: for at least one region in the first weight distribution map of the first disparity map, the weight value of the right part of the region is not greater than the weight value of the left part of the region; and for at least one region in the first weight distribution map of the second mirror map, the weight value of the right part of the region is not greater than the weight value of the left part of the region.
- The apparatus according to any one of claims 37 to 42, wherein the third sub-module further comprises a third unit configured to set the second weight distribution map of the first disparity map; the third unit performs mirroring processing on the first disparity map to form a mirror disparity map, and sets weight values in the second weight distribution map of the first disparity map according to disparity values in the mirror disparity map of the first disparity map.
- The apparatus according to claim 43, wherein the third unit is further configured to: for a pixel at any position in the mirror disparity map, in a case where the disparity value of the pixel at that position satisfies a first predetermined condition, set the weight value of the pixel at that position in the second weight distribution map of the first disparity map to a first value.
- The apparatus according to claim 44, wherein the third unit is further configured to: in a case where the disparity value of the pixel at that position does not satisfy the first predetermined condition, set the weight value of the pixel at that position in the second weight distribution map of the first disparity map to a second value, wherein the first value is greater than the second value.
- The apparatus according to claim 44 or 45, wherein the first predetermined condition comprises: the disparity value of the pixel at that position is greater than a first reference value of the pixel at that position, wherein the first reference value of the pixel at that position is set according to the disparity value of the pixel at that position in the first disparity map and a constant value greater than zero.
- The apparatus according to any one of claims 37 to 46, wherein the third sub-module further comprises a fourth unit configured to set the second weight distribution map of the second mirror map; the fourth unit sets weight values in the second weight distribution map of the second mirror map according to disparity values in the first disparity map.
- The apparatus according to claim 47, wherein the fourth unit is further configured to: for a pixel at any position in the second mirror map, if the disparity value of the pixel at that position in the first disparity map satisfies a second predetermined condition, set the weight value of the pixel at that position in the second weight distribution map of the second mirror map to a third value.
- The apparatus according to claim 48, wherein the fourth unit is further configured to: in a case where the disparity value of the pixel at that position in the first disparity map does not satisfy the second predetermined condition, set the weight value of the pixel at that position in the second weight distribution map of the second mirror map to a fourth value, wherein the third value is greater than the fourth value.
- The apparatus according to claim 48 or 49, wherein the second predetermined condition comprises: the disparity value of the pixel at that position in the first disparity map is greater than a second reference value of the pixel at that position, wherein the second reference value of the pixel at that position is set according to the disparity value of the pixel at that position in the mirror disparity map of the first disparity map and a constant value greater than zero.
- The apparatus according to any one of claims 37 to 50, wherein the second unit is further configured to: adjust the disparity values in the first disparity map according to the first weight distribution map and the second weight distribution map of the first disparity map; adjust the disparity values in the second mirror map according to the first weight distribution map and the second weight distribution map of the second mirror map; and merge the disparity-adjusted first disparity map and the disparity-adjusted second mirror map to finally obtain the disparity-adjusted first disparity map.
- The apparatus according to claim 34, wherein the environment image comprises a monocular image, and the acquisition module comprises: a first sub-module configured to perform disparity analysis processing on the monocular image using a convolutional neural network and obtain the first disparity map of the monocular image based on the output of the convolutional neural network, wherein the convolutional neural network is trained using binocular image samples.
- The apparatus according to claim 52, further comprising a training module for training the convolutional neural network, the training module being further configured to: input one of a pair of binocular image samples into the convolutional neural network to be trained, perform disparity analysis processing via the convolutional neural network, and obtain, based on the output of the convolutional neural network, a disparity map of the left-eye image sample and a disparity map of the right-eye image sample; reconstruct a right-eye image according to the left-eye image sample and the disparity map of the right-eye image sample; reconstruct a left-eye image according to the right-eye image sample and the disparity map of the left-eye image sample; and adjust network parameters of the convolutional neural network according to the difference between the reconstructed left-eye image and the left-eye image sample and the difference between the reconstructed right-eye image and the right-eye image sample.
- The apparatus according to any one of claims 34 to 53, wherein the first determination module comprises: a fourth sub-module configured to perform edge detection on the first disparity map of the environment image to obtain obstacle edge information; a fifth sub-module configured to determine an obstacle region in the first disparity map of the environment image; and a sixth sub-module configured to determine a plurality of obstacle pixel columnar regions in the obstacle region according to the obstacle edge information.
- The apparatus according to claim 54, wherein the fifth sub-module comprises: a fifth unit configured to perform statistical processing on the disparity values of the pixels in each row of the first disparity map to obtain statistical information on the disparity values of each row of pixels; a sixth unit configured to determine a statistical disparity map based on the statistical information on the disparity values of each row of pixels; a seventh unit configured to perform first straight-line fitting processing on the statistical disparity map and determine a ground region and a non-ground region according to the result of the first straight-line fitting processing; and an eighth unit configured to determine the obstacle region according to the non-ground region.
- The apparatus according to claim 55, wherein the non-ground region comprises a first region above the ground, or the non-ground region comprises a first region above the ground and a second region below the ground.
- The apparatus according to claim 56, wherein the eighth unit is further configured to: perform second straight-line fitting processing on the statistical disparity map, and determine, according to the result of the second straight-line fitting processing, a first target region in the first region whose height above the ground is smaller than a first predetermined height value, the first target region being an obstacle region; and, in a case where the non-ground region includes a second region below the ground, determine a second target region in the second region whose depth below the ground is greater than a second predetermined height value, the second target region being an obstacle region.
- The apparatus according to any one of claims 54 to 57, wherein the sixth sub-module is further configured to: set the disparity values of pixels in non-obstacle regions of the first disparity map and the disparity values of pixels at the obstacle edge information to a predetermined value; take N pixels in the column direction of the first disparity map as a traversal unit, traverse the disparity values of N pixels in each row starting from a set row of the first disparity map, and determine target rows where the disparity values of the pixels jump between the predetermined value and a non-predetermined value, N being a positive integer; and determine the obstacle pixel columnar regions in the obstacle region by taking N pixels in the column direction as the column width and the determined target rows as the boundaries of the obstacle pixel columnar regions in the row direction.
- The apparatus according to any one of claims 34 to 58, wherein the obstacle pixel regions comprise obstacle pixel columnar regions, and the clustering module comprises: a seventh sub-module configured to determine spatial position information of the plurality of obstacle pixel columnar regions; and an eighth sub-module configured to perform clustering processing on the plurality of obstacle pixel columnar regions according to the spatial position information of the plurality of obstacle pixel columnar regions.
- The apparatus according to claim 59, wherein the eighth sub-module is further configured to: for any obstacle pixel columnar region, determine attribute information of the obstacle pixel columnar region according to the pixels contained in the obstacle pixel columnar region, and determine the spatial position information of the obstacle pixel columnar region according to the attribute information of the obstacle pixel columnar region.
- The apparatus according to claim 60, wherein the attribute information of the obstacle pixel columnar region comprises at least one of: bottom information of the pixel columnar region, top information of the pixel columnar region, a disparity value of the pixel columnar region, and column information of the pixel columnar region.
- The apparatus according to any one of claims 59 to 61, wherein the spatial position information of the obstacle pixel columnar region comprises: a coordinate of the obstacle pixel columnar region on a horizontal coordinate axis, and a coordinate of the obstacle pixel columnar region on a depth coordinate axis.
- The apparatus according to claim 62, wherein the spatial position information of the obstacle pixel columnar region further comprises: a highest-point coordinate of the obstacle pixel columnar region on a vertical coordinate axis and a lowest-point coordinate of the obstacle pixel columnar region on the vertical coordinate axis, the highest-point coordinate and the lowest-point coordinate being used to determine the obstacle height.
- The apparatus according to any one of claims 34 to 63, wherein the obstacle pixel regions comprise obstacle pixel columnar regions, and the second determination module comprises: a ninth sub-module configured to determine an obstacle detection box in the environment image according to the spatial position information of the obstacle pixel columnar regions belonging to a same cluster; and/or a tenth sub-module configured to determine spatial position information of the obstacle according to the spatial position information of the obstacle pixel columnar regions belonging to a same cluster.
- The apparatus according to claim 64, wherein the tenth sub-module is further configured to: determine, according to the spatial position information of a plurality of obstacle pixel columnar regions belonging to a same cluster, distances between the plurality of obstacle pixel columnar regions and the camera device that generated the environment image; and determine the spatial position information of the obstacle according to the spatial position information of the obstacle pixel columnar region closest to the camera device.
- An intelligent driving control apparatus, characterized by comprising: an acquisition module configured to acquire, through an image acquisition device provided on a smart device, an environment image of the smart device during movement; the apparatus according to any one of claims 34-65, configured to perform obstacle detection on the environment image and determine an obstacle detection result; and a control module configured to generate and output a control instruction according to the obstacle detection result.
- An electronic device, characterized by comprising: a memory configured to store a computer program; and a processor configured to execute the computer program stored in the memory, wherein when the computer program is executed, the method according to any one of claims 1-33 is implemented.
- A computer-readable storage medium having a computer program stored thereon, wherein when the computer program is executed by a processor, the method according to any one of claims 1-33 is implemented.
- A computer program, characterized by comprising computer instructions, wherein when the computer instructions run in a processor of a device, the method according to any one of claims 1-33 is implemented.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SG11202013264YA SG11202013264YA (en) | 2019-06-27 | 2019-11-26 | Obstacle detection method, intelligent driving control method, apparatus, medium, and device |
JP2021513777A JP2021536071A (ja) | 2019-06-27 | 2019-11-26 | 障害物検出方法、知的運転制御方法、装置、媒体、及び機器 |
KR1020217007268A KR20210043628A (ko) | 2019-06-27 | 2019-11-26 | 장애물 감지 방법, 지능형 주행 제어 방법, 장치, 매체, 및 기기 |
US17/137,542 US20210117704A1 (en) | 2019-06-27 | 2020-12-30 | Obstacle detection method, intelligent driving control method, electronic device, and non-transitory computer-readable storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910566416.2A CN112149458A (zh) | 2019-06-27 | 2019-06-27 | 障碍物检测方法、智能驾驶控制方法、装置、介质及设备 |
CN201910566416.2 | 2019-06-27 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/137,542 Continuation US20210117704A1 (en) | 2019-06-27 | 2020-12-30 | Obstacle detection method, intelligent driving control method, electronic device, and non-transitory computer-readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020258703A1 true WO2020258703A1 (zh) | 2020-12-30 |
Family
ID=73868506
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/120833 WO2020258703A1 (zh) | 2019-06-27 | 2019-11-26 | 障碍物检测方法、智能驾驶控制方法、装置、介质及设备 |
Country Status (6)
Country | Link |
---|---|
US (1) | US20210117704A1 (zh) |
JP (1) | JP2021536071A (zh) |
KR (1) | KR20210043628A (zh) |
CN (1) | CN112149458A (zh) |
SG (1) | SG11202013264YA (zh) |
WO (1) | WO2020258703A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113792583A (zh) * | 2021-08-03 | 2021-12-14 | 北京中科慧眼科技有限公司 | 基于可行驶区域的障碍物检测方法、系统和智能终端 |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102125538B1 (ko) * | 2019-12-13 | 2020-06-22 | 주식회사 토르 드라이브 | 자율 주행을 위한 효율적인 맵 매칭 방법 및 그 장치 |
CN112733653A (zh) * | 2020-12-30 | 2021-04-30 | 智车优行科技(北京)有限公司 | 目标检测方法和装置、计算机可读存储介质、电子设备 |
CN112631312B (zh) * | 2021-03-08 | 2021-06-04 | 北京三快在线科技有限公司 | 一种无人设备的控制方法、装置、存储介质及电子设备 |
CN113269838B (zh) * | 2021-05-20 | 2023-04-07 | 西安交通大学 | 一种基于fira平台的障碍物视觉检测方法 |
CN113747058B (zh) * | 2021-07-27 | 2023-06-23 | 荣耀终端有限公司 | 基于多摄像头的图像内容屏蔽方法和装置 |
KR102623109B1 (ko) * | 2021-09-10 | 2024-01-10 | 중앙대학교 산학협력단 | 합성곱 신경망 모델을 이용한 3차원 의료 영상 분석 시스템 및 방법 |
CN114119700B (zh) * | 2021-11-26 | 2024-03-29 | 山东科技大学 | 一种基于u-v视差图的障碍物测距方法 |
CN115474032B (zh) * | 2022-09-14 | 2023-10-03 | 深圳市火乐科技发展有限公司 | 投影交互方法、投影设备和存储介质 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104573646A (zh) * | 2014-12-29 | 2015-04-29 | 长安大学 | 基于激光雷达和双目相机的车前行人检测方法及系统 |
CN105741312A (zh) * | 2014-12-09 | 2016-07-06 | 株式会社理光 | 目标对象跟踪方法和设备 |
CN105866790A (zh) * | 2016-04-07 | 2016-08-17 | 重庆大学 | 一种考虑激光发射强度的激光雷达障碍物识别方法及系统 |
CN108197698A (zh) * | 2017-12-13 | 2018-06-22 | 中国科学院自动化研究所 | 基于多模态融合的多脑区协同自主决策方法 |
Family Cites Families (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100962329B1 (ko) * | 2009-02-05 | 2010-06-10 | 연세대학교 산학협력단 | 스테레오 카메라 영상으로부터의 지면 추출 방법과 장치 및이와 같은 방법을 구현하는 프로그램이 기록된 기록매체 |
CN101701818B (zh) * | 2009-11-05 | 2011-03-30 | 上海交通大学 | 远距离障碍的检测方法 |
CN105095905B (zh) * | 2014-04-18 | 2018-06-22 | 株式会社理光 | 目标识别方法和目标识别装置 |
CN103955943A (zh) * | 2014-05-21 | 2014-07-30 | 西安电子科技大学 | 基于融合变化检测算子与尺度驱动的无监督变化检测方法 |
CN106971348B (zh) * | 2016-01-14 | 2021-04-30 | 阿里巴巴集团控股有限公司 | 一种基于时间序列的数据预测方法和装置 |
CN106157307B (zh) * | 2016-06-27 | 2018-09-11 | 浙江工商大学 | 一种基于多尺度cnn和连续crf的单目图像深度估计方法 |
EP3505310B1 (en) * | 2016-08-25 | 2024-01-03 | LG Electronics Inc. | Mobile robot and control method therefor |
EP3736537A1 (en) * | 2016-10-11 | 2020-11-11 | Mobileye Vision Technologies Ltd. | Navigating a vehicle based on a detected vehicle |
CN106708084B (zh) * | 2016-11-24 | 2019-08-02 | 中国科学院自动化研究所 | 复杂环境下无人机自动障碍物检测和避障方法 |
WO2018120040A1 (zh) * | 2016-12-30 | 2018-07-05 | 深圳前海达闼云端智能科技有限公司 | 一种障碍物检测方法及装置 |
CN107729856B (zh) * | 2017-10-26 | 2019-08-23 | 海信集团有限公司 | 一种障碍物检测方法及装置 |
CN108725440B (zh) * | 2018-04-20 | 2020-11-27 | 深圳市商汤科技有限公司 | 前向碰撞控制方法和装置、电子设备、程序和介质 |
CN108961327B (zh) * | 2018-05-22 | 2021-03-30 | 深圳市商汤科技有限公司 | 一种单目深度估计方法及其装置、设备和存储介质 |
CN109190704A (zh) * | 2018-09-06 | 2019-01-11 | 中国科学院深圳先进技术研究院 | 障碍物检测的方法及机器人 |
CN109087346B (zh) * | 2018-09-21 | 2020-08-11 | 北京地平线机器人技术研发有限公司 | 单目深度模型的训练方法、训练装置和电子设备 |
CN109508673A (zh) * | 2018-11-13 | 2019-03-22 | 大连理工大学 | 一种基于棒状像素的交通场景障碍检测与识别方法 |
-
2019
- 2019-06-27 CN CN201910566416.2A patent/CN112149458A/zh active Pending
- 2019-11-26 SG SG11202013264YA patent/SG11202013264YA/en unknown
- 2019-11-26 JP JP2021513777A patent/JP2021536071A/ja not_active Ceased
- 2019-11-26 KR KR1020217007268A patent/KR20210043628A/ko active Search and Examination
- 2019-11-26 WO PCT/CN2019/120833 patent/WO2020258703A1/zh active Application Filing
-
2020
- 2020-12-30 US US17/137,542 patent/US20210117704A1/en not_active Abandoned
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105741312A (zh) * | 2014-12-09 | 2016-07-06 | 株式会社理光 | 目标对象跟踪方法和设备 |
CN104573646A (zh) * | 2014-12-29 | 2015-04-29 | 长安大学 | 基于激光雷达和双目相机的车前行人检测方法及系统 |
CN105866790A (zh) * | 2016-04-07 | 2016-08-17 | 重庆大学 | 一种考虑激光发射强度的激光雷达障碍物识别方法及系统 |
CN108197698A (zh) * | 2017-12-13 | 2018-06-22 | 中国科学院自动化研究所 | 基于多模态融合的多脑区协同自主决策方法 |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113792583A (zh) * | 2021-08-03 | 2021-12-14 | 北京中科慧眼科技有限公司 | 基于可行驶区域的障碍物检测方法、系统和智能终端 |
Also Published As
Publication number | Publication date |
---|---|
JP2021536071A (ja) | 2021-12-23 |
US20210117704A1 (en) | 2021-04-22 |
CN112149458A (zh) | 2020-12-29 |
SG11202013264YA (en) | 2021-01-28 |
KR20210043628A (ko) | 2021-04-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020258703A1 (zh) | 障碍物检测方法、智能驾驶控制方法、装置、介质及设备 | |
WO2020108311A1 (zh) | 目标对象3d检测方法、装置、介质及设备 | |
EP3295426B1 (en) | Edge-aware bilateral image processing | |
US10353271B2 (en) | Depth estimation method for monocular image based on multi-scale CNN and continuous CRF | |
WO2020029758A1 (zh) | 对象三维检测及智能驾驶控制的方法、装置、介质及设备 | |
US9384556B2 (en) | Image processor configured for efficient estimation and elimination of foreground information in images | |
US11049270B2 (en) | Method and apparatus for calculating depth map based on reliability | |
US20140161359A1 (en) | Method for detecting a straight line in a digital image | |
CN110543858A (zh) | 多模态自适应融合的三维目标检测方法 | |
WO2020238008A1 (zh) | 运动物体检测及智能驾驶控制方法、装置、介质及设备 | |
US11049275B2 (en) | Method of predicting depth values of lines, method of outputting three-dimensional (3D) lines, and apparatus thereof | |
US20210124928A1 (en) | Object tracking methods and apparatuses, electronic devices and storage media | |
CN110570457A (zh) | 一种基于流数据的三维物体检测与跟踪方法 | |
Jia et al. | Real-time obstacle detection with motion features using monocular vision | |
CN114926747A (zh) | 一种基于多特征聚合与交互的遥感图像定向目标检测方法 | |
US20210078597A1 (en) | Method and apparatus for determining an orientation of a target object, method and apparatus for controlling intelligent driving control, and device | |
KR102262671B1 (ko) | 비디오 영상에 보케 효과를 적용하는 방법 및 기록매체 | |
CN114627173A (zh) | 通过差分神经渲染进行对象检测的数据增强 | |
US20100014716A1 (en) | Method for determining ground line | |
EP4323952A1 (en) | Semantically accurate super-resolution generative adversarial networks | |
CN115147809B (zh) | 一种障碍物检测方法、装置、设备以及存储介质 | |
He et al. | A novel way to organize 3D LiDAR point cloud as 2D depth map height map and surface normal map | |
JP2013164643A (ja) | 画像認識装置、画像認識方法および画像認識プログラム | |
CN113284221B (zh) | 一种目标物检测方法、装置及电子设备 | |
CN111765892B (zh) | 一种定位方法、装置、电子设备及计算机可读存储介质 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19935079 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 20217007268 Country of ref document: KR Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 2021513777 Country of ref document: JP Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 31.03.2022) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19935079 Country of ref document: EP Kind code of ref document: A1 |