WO2020258703A1 - Obstacle detection method, intelligent driving control method, apparatus, medium and device

Obstacle detection method, intelligent driving control method, apparatus, medium and device

Info

Publication number: WO2020258703A1
Authority: WIPO (PCT)
Prior art keywords: disparity, obstacle, value, pixel, map
Application number: PCT/CN2019/120833
Other languages: English (en), French (fr)
Inventors: 姚兴华, 周星宇, 刘润涛, 曾星宇
Original Assignee: 商汤集团有限公司
Application filed by 商汤集团有限公司
Priority to SG11202013264YA
Priority to JP2021513777A (published as JP2021536071A)
Priority to KR1020217007268A
Priority to US17/137,542 (published as US20210117704A1)
Publication of WO2020258703A1

Classifications

    • G06T7/73: Image analysis; determining position or orientation of objects or cameras using feature-based methods
    • G06F18/23: Pattern recognition; clustering techniques
    • G06N3/08: Neural networks; learning methods
    • G06T7/11: Image analysis; region-based segmentation
    • G06T7/13: Image analysis; edge detection
    • G06T7/70: Image analysis; determining position or orientation of objects or cameras
    • G06V10/762: Image or video recognition or understanding using clustering, e.g. of similar faces in social networks
    • G06V10/764: Image or video recognition or understanding using classification, e.g. of video objects
    • G06V10/82: Image or video recognition or understanding using neural networks
    • G06V20/58: Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06T2207/10028: Range image; depth image; 3D point clouds
    • G06T2207/20084: Artificial neural networks [ANN]
    • G06T2207/20228: Disparity calculation for image-based rendering
    • G06T2207/30261: Obstacle (vehicle exterior; vicinity of vehicle)

Definitions

  • the present disclosure relates to computer vision technology, and in particular to an obstacle detection method, an obstacle detection device, an intelligent driving control method, an intelligent driving control device, electronic equipment, a computer-readable storage medium, and a computer program.
  • In the field of computer vision technology, perception technology is usually used to perceive obstacles in the outside world; that is, perception technology includes obstacle detection.
  • the perception results of the perception technology are usually provided to the decision-making layer so that the decision-making layer makes decisions based on the perception results.
  • The perception layer provides the perceived road information and the information about obstacles around the vehicle to the decision-making layer, so that the decision-making layer can make driving decisions to avoid obstacles and ensure safe driving of the vehicle.
  • The types of obstacles are generally defined in advance, such as pedestrians, vehicles, non-motorized vehicles and other obstacles with inherent shapes, textures and colors, and related detection algorithms are then used to detect the predefined types of obstacles.
  • the embodiments of the present disclosure provide a technical solution for obstacle detection and intelligent driving control.
  • An obstacle detection method is provided, including: acquiring a first disparity map of an environment image, the environment image being an image that characterizes the spatial environment information where the smart device is located during movement; determining a plurality of obstacle pixel regions in the first disparity map of the environment image; performing clustering processing on the plurality of obstacle pixel regions to obtain at least one cluster; and determining the obstacle detection result according to the obstacle pixel regions belonging to the same cluster.
  • A smart driving control method is provided, including: acquiring an environment image of the smart device during movement through an image acquisition device provided on the smart device; performing obstacle detection on the acquired environment image by using the above obstacle detection method to determine the obstacle detection result; and generating and outputting a control instruction according to the obstacle detection result.
  • An obstacle detection device is provided, including: an acquisition module, configured to acquire a first disparity map of an environment image, the environment image being an image that characterizes the spatial environment information where the smart device is located during movement; a first determination module, configured to determine a plurality of obstacle pixel regions in the first disparity map of the environment image; a clustering module, configured to perform clustering processing on the plurality of obstacle pixel regions to obtain at least one cluster; and a second determination module, configured to determine the obstacle detection result according to the obstacle pixel regions belonging to the same cluster.
  • An intelligent driving control device is provided, including: an acquisition module, configured to acquire an environment image of the intelligent device during movement through an image acquisition device provided on the intelligent device; the above obstacle detection device, configured to perform obstacle detection on the environment image to determine an obstacle detection result; and a control module, configured to generate and output a control instruction according to the obstacle detection result.
  • An electronic device is provided, including: a memory for storing a computer program; and a processor for executing the computer program stored in the memory, where any method embodiment of the present disclosure is implemented when the computer program is executed.
  • a computer-readable storage medium having a computer program stored thereon, and when the computer program is executed by a processor, any method embodiment of the present disclosure is implemented.
  • a computer program including computer instructions, which, when the computer instructions run in the processor of the device, implement any method embodiment of the present disclosure.
  • The present disclosure can determine multiple obstacle pixel regions from the first disparity map of the environment image, and obtain the obstacle detection result by clustering the multiple obstacle pixel regions.
  • The detection method adopted in the present disclosure does not need to predefine the obstacles to be detected, and does not use predefined information such as the texture, color, shape and category of the obstacles; it can determine obstacles directly based on clustering of obstacle regions. The detected obstacles are therefore not limited to certain predefined obstacles, and the method can detect any of the various obstacles in the surrounding space environment that may hinder the movement of the smart device (which may be called generic obstacles in the present disclosure), thereby realizing detection of generic obstacles.
  • The technical solution provided by the present disclosure is thus a more general obstacle detection solution, which can be applied to the detection of generic types of obstacles and is beneficial to coping with the diverse types of obstacle detection in real environments. Moreover, for smart devices, the technical solution provided by the present disclosure can detect the diversified and random obstacles that may occur during driving and then output control instructions for the driving process based on the detection results, which is beneficial to improving the safety of intelligent vehicle driving.
  • the technical solutions of the present disclosure will be further described in detail below through the drawings and embodiments.
  • FIG. 1 is a flowchart of an embodiment of the obstacle detection method of the present disclosure
  • FIG. 2 is a schematic diagram of an embodiment of the environmental image of the present disclosure
  • FIG. 3 is a schematic diagram of an implementation manner of the first disparity map of FIG. 2;
  • FIG. 4 is a schematic diagram of an embodiment of the first disparity map of the present disclosure.
  • FIG. 5 is a schematic diagram of an embodiment of the convolutional neural network of the present disclosure.
  • FIG. 6 is a schematic diagram of an embodiment of the first weight distribution diagram of the first disparity map of the present disclosure
  • FIG. 7 is a schematic diagram of another embodiment of the first weight distribution diagram of the first disparity map of the present disclosure.
  • FIG. 8 is a schematic diagram of an embodiment of the second weight distribution diagram of the first disparity map of the present disclosure.
  • FIG. 9 is a schematic diagram of an embodiment of the second mirror image of the present disclosure.
  • FIG. 10 is a schematic diagram of an embodiment of the second weight distribution diagram of the second mirror image shown in FIG. 9;
  • FIG. 11 is a schematic diagram of an embodiment of the present disclosure to optimize and adjust the disparity map of a monocular image
  • FIG. 12 is a schematic diagram of an implementation manner of obstacle edge information in the first disparity map of the environmental image of the present disclosure
  • FIG. 13 is a schematic diagram of an embodiment of the statistical disparity map of the present disclosure.
  • FIG. 14 is a schematic diagram of an embodiment of forming a statistical disparity map of the present disclosure.
  • FIG. 15 is a schematic diagram of an embodiment of the straight line fitting of the present disclosure.
  • FIG. 16 is a schematic diagram of the ground area and the non-ground area of the present disclosure.
  • FIG. 17 is a schematic diagram of an embodiment of the coordinate system established by the present disclosure.
  • FIG. 19 is a schematic diagram of an embodiment of forming an obstacle pixel columnar region of the present disclosure.
  • FIG. 20 is a schematic diagram of an embodiment of clustering columnar regions of obstacle pixels in the present disclosure.
  • FIG. 21 is a schematic diagram of an embodiment of forming an obstacle detection frame of the present disclosure.
  • FIG. 22 is a flowchart of an embodiment of the convolutional neural network training method of the present disclosure.
  • FIG. 23 is a flowchart of an embodiment of the intelligent driving control method of the present disclosure.
  • FIG. 24 is a schematic structural diagram of an embodiment of the obstacle detection device of the present disclosure.
  • FIG. 25 is a flowchart of an embodiment of the intelligent driving control device of the present disclosure.
  • Fig. 26 is a block diagram of an exemplary device for implementing the embodiments of the present disclosure.
  • the embodiments of the present disclosure can be applied to electronic devices such as terminal devices, computer systems, and servers, which can operate with many other general or special computing system environments or configurations.
  • Examples of well-known terminal devices, computing systems, environments, and/or configurations suitable for use with electronic devices such as terminal devices, computer systems, and servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, minicomputer systems, mainframe computer systems, and distributed cloud computing technology environments including any of the above systems.
  • Electronic devices such as terminal devices, computer systems, and servers can be described in the general context of computer system executable instructions (such as program modules) executed by the computer system.
  • program modules can include routines, programs, target programs, components, logic, and data structures, etc., which perform specific tasks or implement specific abstract data types.
  • the computer system/server can be implemented in a distributed cloud computing environment. In the distributed cloud computing environment, tasks are executed by remote processing equipment linked through a communication network.
  • program modules may be located on a storage medium of a local or remote computing system including a storage device.
  • FIG. 1 is a flowchart of an embodiment of the obstacle detection method of the present disclosure.
  • the method in this embodiment includes steps: S100, S110, S120, and S130. The steps are described in detail below.
  • S100: Acquire a first disparity map of an environment image. The environment image is an image that characterizes the spatial environment information where the smart device is located during the movement.
  • The smart device is, for example, a smart driving device (such as an autonomous car), a smart flying device (such as a drone), an intelligent robot, or the like.
  • The environment image is, for example, an image that characterizes the road space environment information of the intelligent driving device or the intelligent robot during movement, or an image that characterizes the spatial environment information of the smart flying device during flight.
  • the smart device and the environment image in the present disclosure are not limited to the above examples, and the present disclosure does not limit this.
  • In the present disclosure, obstacles in the environment image are detected. Any object in the surrounding space environment where the smart device is located that may hinder its movement may fall into the obstacle detection range and be regarded as an obstacle detection object. For example, during the driving of the smart driving device, objects such as stones, animals and fallen goods may appear on the road; these objects have no specific shape, texture, color or type and may differ greatly from one another, yet each of them is considered an obstacle. In the present disclosure, any object that may cause obstruction during the movement is called a generic obstacle.
  • The first disparity map of the present disclosure is used to describe the disparity of the environment image. Disparity (parallax) can be considered as the difference in the apparent position of the same target object when it is observed from two points separated by a certain distance.
  • An example of the environmental image is shown in Figure 2.
  • An example of the first disparity map of the environment image shown in FIG. 2 is shown in FIG. 3.
  • the first disparity map of the environment image in the present disclosure may also be expressed in the form shown in FIG. 4.
  • Each number in FIG. 4 (such as 0, 1, 2, 3, 4, 5, etc.) represents the disparity value of the pixel at the corresponding position (x, y) in the environment image. It should be particularly noted that FIG. 4 does not show a complete disparity map.
  • the environmental image in the present disclosure may be a monocular image or a binocular image.
  • Monocular images are usually obtained by shooting with a monocular camera.
  • Binocular images are usually obtained by shooting with a binocular camera device.
  • both the monocular image and the binocular image in the present disclosure may be photos or pictures, etc., and may also be video frames in a video.
  • the present disclosure can realize obstacle detection without the need to provide a binocular camera device, thereby helping to reduce obstacle detection costs.
  • The present disclosure may use a convolutional neural network that has been successfully trained in advance to obtain the first disparity map of the monocular image.
  • A monocular image is input into the convolutional neural network, disparity analysis processing is performed on the monocular image via the convolutional neural network, and the convolutional neural network outputs the disparity analysis processing result, so that the present disclosure can obtain the first disparity map of the monocular image based on the disparity analysis processing result.
  • By using the convolutional neural network to obtain the first disparity map of the monocular image, the first disparity map can be obtained without using two images for pixel-by-pixel disparity calculation and without camera calibration, which is beneficial to improving the convenience and real-time performance of obtaining the first disparity map.
  • the convolutional neural network in the present disclosure generally includes but is not limited to: multiple convolutional layers (Conv) and multiple deconvolutional layers (Deconv).
  • the convolutional neural network of the present disclosure can be divided into two parts, namely an encoding part and a decoding part.
  • the monocular image input to the convolutional neural network (the monocular image shown in Figure 2) is encoded by the encoding part (ie feature extraction processing), and the encoding processing result of the encoding part is provided to the decoding part,
  • the decoding part decodes the encoding processing result and outputs the decoding processing result.
  • According to the decoding processing result, the present disclosure can obtain the first disparity map of the monocular image (for example, the first disparity map shown in FIG. 3).
  • the coding part in the convolutional neural network includes but is not limited to: multiple convolutional layers, and multiple convolutional layers are connected in series.
  • the decoding part in the convolutional neural network includes, but is not limited to: multiple convolutional layers and multiple deconvolutional layers, and multiple convolutional layers and multiple deconvolutional layers are arranged at intervals and connected in series.
  • An optional example of the convolutional neural network in the present disclosure is shown in FIG. 5.
  • the first rectangle on the left represents the monocular image input to the convolutional neural network
  • the first rectangle on the right represents the disparity map output by the convolutional neural network.
  • Each rectangle from the second rectangle to the 15th rectangle on the left represents a convolutional layer
  • All the rectangles from the 16th rectangle on the left to the second rectangle on the right represent deconvolutional layers and convolutional layers arranged alternately.
  • the 16th rectangle on the left represents the deconvolution layer
  • the 17th rectangle on the left represents the convolution layer
  • the 18th rectangle on the left represents the deconvolution layer
  • the 19th rectangle on the left represents the convolution layer.
  • the convolutional neural network of the present disclosure may merge the low-level information and high-level information in the convolutional neural network by means of skip connection.
  • The output of at least one convolutional layer in the encoding part is provided to at least one deconvolutional layer in the decoding part through a skip connection.
  • The input of each convolutional layer in the convolutional neural network usually includes the output of the previous layer (such as a convolutional layer or a deconvolutional layer), while the input of at least one deconvolutional layer (such as some or all of the deconvolutional layers) includes: the upsample (Upsample) result of the output of the previous convolutional layer, and the output of the convolutional layer of the encoding part that is skip-connected to this deconvolutional layer.
  • the content pointed by the solid arrow drawn from the bottom of the convolution layer on the right side of Figure 5 represents the output of the convolution layer
  • the dotted arrow in Figure 5 represents the upsampling result provided to the deconvolution layer.
  • The solid arrow drawn above the convolutional layer on the left represents the output of the convolutional layer that is skip-connected to the deconvolutional layer.
  • The present disclosure does not limit the number of skip connections or the network structure of the convolutional neural network.
  • the present disclosure helps to improve the accuracy of the disparity map generated by the convolutional neural network by fusing the low-level information and the high-level information in the convolutional neural network.
  • The convolutional neural network of the present disclosure is obtained by training using binocular image samples. For the training process of the convolutional neural network, refer to the description in the following embodiments, which will not be elaborated here.
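  • As a purely illustrative, non-limiting sketch (assuming a PyTorch implementation; the class name MonoDispNet, the layer counts and the channel widths are assumptions and not the exact network of FIG. 5), an encoder-decoder with skip connections of the kind described above might be organized as follows:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MonoDispNet(nn.Module):
    """Sketch of an encoder-decoder mapping a monocular image to a disparity map."""
    def __init__(self):
        super().__init__()
        # Encoding part: stacked convolutional layers (feature extraction).
        self.enc1 = nn.Conv2d(3, 32, 3, stride=2, padding=1)
        self.enc2 = nn.Conv2d(32, 64, 3, stride=2, padding=1)
        self.enc3 = nn.Conv2d(64, 128, 3, stride=2, padding=1)
        # Decoding part: deconvolutional and convolutional layers arranged alternately.
        self.dec3 = nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1)
        self.conv3 = nn.Conv2d(64 + 64, 64, 3, padding=1)    # receives skip connection from enc2
        self.dec2 = nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1)
        self.conv2 = nn.Conv2d(32 + 32, 32, 3, padding=1)    # receives skip connection from enc1
        self.dec1 = nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1)
        self.pred = nn.Conv2d(16, 1, 3, padding=1)           # single-channel disparity output

    def forward(self, x):
        e1 = F.relu(self.enc1(x))
        e2 = F.relu(self.enc2(e1))
        e3 = F.relu(self.enc3(e2))
        d3 = F.relu(self.dec3(e3))
        d3 = F.relu(self.conv3(torch.cat([d3, e2], dim=1)))  # fuse low-level and high-level features
        d2 = F.relu(self.dec2(d3))
        d2 = F.relu(self.conv2(torch.cat([d2, e1], dim=1)))
        d1 = F.relu(self.dec1(d2))
        return F.relu(self.pred(d1))                          # non-negative disparity values

# Usage sketch: a (1, 3, 256, 512) image tensor yields a (1, 1, 256, 512) disparity tensor.
# disp = MonoDispNet()(torch.randn(1, 3, 256, 512))
```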
  • the present disclosure may also optimize and adjust the first disparity map of the environment image obtained by using the convolutional neural network, so as to obtain a more accurate first disparity map.
  • The present disclosure may use the disparity map of the mirror image of the monocular image to optimize and adjust the first disparity map of the monocular image, so that the present disclosure can determine a plurality of obstacle pixel regions in the disparity-adjusted first disparity map.
  • the mirror image of the monocular image is referred to as the first mirror image
  • the disparity image of the first mirror image is referred to as the second disparity image.
  • The first mirror image of the monocular image is obtained, the disparity map of the first mirror image is obtained, and then the first disparity map of the monocular image is optimized and adjusted according to the disparity map of the first mirror image.
  • multiple obstacle pixel regions can be determined in the first disparity map after the disparity adjustment.
  • a specific example of optimal adjustment of the first disparity map is as follows:
  • Step A Obtain a second disparity map of the first mirror image of the monocular image, and acquire the mirror image of the second disparity map.
  • the first mirror image of the monocular image in the present disclosure may be a mirror image formed by performing mirror processing (such as left mirror processing or right mirror processing) on the monocular image in the horizontal direction.
  • the mirror image of the second disparity map in the present disclosure may be a mirror image formed after performing mirror processing (such as left mirror processing or right mirror processing) on the second disparity map in the horizontal direction.
  • the mirror image of the second disparity map is still the disparity map.
  • The present disclosure can first perform left mirror processing or right mirror processing on the monocular image (since the left mirror processing result is the same as the right mirror processing result, either may be used) to obtain the first mirror image of the monocular image (a left mirror image or a right mirror image); then the disparity map of the first mirror image of the monocular image is obtained, thereby obtaining the second disparity map; finally, left mirror processing or right mirror processing is performed on the second disparity map (again, the left mirror processing result is the same as the right mirror processing result) to obtain the mirror image of the second disparity map (a left mirror image or a right mirror image).
  • the mirror image of the second disparity map is still a disparity map.
  • the mirror image of the second disparity map is referred to as the second mirror image below.
  • the present disclosure when the present disclosure performs mirror image processing on a monocular image, it may not consider whether the monocular image is mirrored as a left-eye image or as a right-eye image. That is to say, regardless of whether the monocular image is used as the left-eye image or the right-eye image, the present disclosure can perform left mirror processing or right mirror processing on the monocular image to obtain the first mirror image. Similarly, when performing mirror image processing on the second disparity map in the present disclosure, it is also possible to ignore whether to perform left mirror processing on the second disparity map or perform right mirror processing on the second disparity map.
  • For the convolutional neural network used to generate the disparity map of the monocular image: if the left-eye image samples in the binocular image samples are used as input and provided to the convolutional neural network for training, the successfully trained convolutional neural network will treat the input monocular image as a left-eye image during testing and in practical applications; if the right-eye image samples in the binocular image samples are used as input and provided to the convolutional neural network for training, the successfully trained convolutional neural network will treat the input monocular image as a right-eye image during testing and in practical applications.
  • the present disclosure may also use the aforementioned convolutional neural network to obtain the second disparity map.
  • The first mirror image is input to the convolutional neural network, disparity analysis processing is performed on the first mirror image through the convolutional neural network, and the convolutional neural network outputs the disparity analysis processing result, so that the present disclosure can obtain the second disparity map according to the output disparity analysis processing result.
  • Step B Obtain the weight distribution map of the first disparity map and the weight distribution map of the second mirror image of the monocular image.
  • the weight distribution map of the first disparity map is used to describe the respective weight values of multiple disparity values (such as all disparity values) in the first disparity map.
  • the weight distribution map of the first disparity map may include, but is not limited to: a first weight distribution map of the first disparity map and a second weight distribution map of the first disparity map.
  • The first weight distribution map of the first disparity map is a weight distribution map set uniformly for the first disparity maps of a plurality of different monocular images; that is, the first disparity maps of different monocular images use the same first weight distribution map. Therefore, the present disclosure may refer to the first weight distribution map of the first disparity map as the global weight distribution map of the first disparity map.
  • the global weight distribution map of the first disparity map is used to describe the global weight values corresponding to multiple disparity values (such as all disparity values) in the first disparity map.
  • The second weight distribution map of the first disparity map is a weight distribution map set for the first disparity map of a single monocular image; that is, the first disparity maps of different monocular images use different second weight distribution maps. Therefore, the second weight distribution map of the first disparity map may be referred to in the present disclosure as the local weight distribution map of the first disparity map.
  • The local weight distribution map of the first disparity map is used to describe the respective local weight values of multiple disparity values (such as all disparity values) in the first disparity map.
  • the weight distribution map of the second mirror image is used to describe the respective weight values of the multiple disparity values in the second mirror image.
  • the weight distribution diagram of the second mirror image may include, but is not limited to: the first weight distribution diagram of the second mirror image and the second weight distribution diagram of the second mirror image.
  • The first weight distribution map of the second mirror image is a weight distribution map set uniformly for the second mirror images of multiple different monocular images; that is, the second mirror images of different monocular images use the same first weight distribution map. Therefore, the present disclosure may refer to the first weight distribution map of the second mirror image as the global weight distribution map of the second mirror image.
  • the global weight distribution diagram of the second mirror image is used to describe the respective global weight values of multiple disparity values (such as all disparity values) in the second mirror image.
  • The second weight distribution map of the second mirror image is a weight distribution map set for the second mirror image of a single monocular image; that is, the second mirror images of different monocular images use different second weight distribution maps. Therefore, the present disclosure may refer to the second weight distribution map of the second mirror image as the local weight distribution map of the second mirror image.
  • the local weight distribution map of the second mirror image is used to describe the respective local weight values of multiple disparity values (such as all disparity values) in the second mirror image.
  • the first weight distribution map of the first disparity map includes: at least two left and right separated regions, and different regions have different weight values.
  • the magnitude relationship between the weight value of the area on the left and the weight value of the area on the right is usually related to whether the monocular image is used as the left-eye image or the right-eye image.
  • FIG. 6 is a first weight distribution map of the first disparity map shown in FIG. 3. The first weight distribution map is divided into five areas, namely area 1, area 2, area 3, area 4 and area 5 in FIG. 6.
  • the weight value of area 5 is not less than the weight value of area 4
  • the weight value of area 4 is not less than the weight value of area 3
  • the weight value of area 3 is not less than the weight value of area 2
  • the weight value of area 2 is not less than the weight value of area 1.
  • any region in the first weight distribution map of the first disparity map may have the same weight value, or may have different weight values.
  • the weight value of the left part in the region is usually less than or equal to the weight value of the right part in the region.
  • The weight value of area 1 in FIG. 6 can be 0, that is, in the first disparity map the disparity corresponding to area 1 is completely unreliable; the weight value of area 2 can gradually increase from 0 toward 0.5 from left to right; the weight value of area 3 is 0.5; the weight value of area 4 can gradually increase from a value greater than 0.5 toward 1 from left to right; and the weight value of area 5 is 1, that is, in the first disparity map the disparity corresponding to area 5 is completely credible.
  • FIG. 7 shows a first weight distribution map used when the disparity map to be processed is that of a right-eye image. The first weight distribution map is divided into five regions, namely region 1, region 2, region 3, region 4 and region 5 in FIG. 7.
  • the weight value of region 1 is not less than the weight value of region 2
  • the weight value of region 2 is not less than the weight value of region 3
  • the weight value of region 3 is not less than the weight value of region 4
  • the weight value of region 4 is not less than the weight value of region 5.
  • any region in the first weight distribution map of the first disparity map may have the same weight value, or may have different weight values.
  • the weight value of the right part in the region is usually not greater than the weight value of the left part in the region.
  • The weight value of region 4 can gradually increase from 0 toward 0.5 from right to left; the weight value of region 3 is 0.5; the weight value of region 2 can gradually increase from a value greater than 0.5 toward 1 from right to left; and the weight value of region 1 is 1, that is, in the first disparity map the disparity corresponding to region 1 is completely credible.
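  • As an illustrative sketch only (the number of regions follows the description of FIG. 6, but the fractional region boundaries in the bounds argument are assumptions), such a global (first) weight distribution map could be generated as follows:

```python
import numpy as np

def global_weight_map(height, width, bounds=(0.1, 0.3, 0.6, 0.8)):
    """Sketch of a global (first) weight distribution map with five left-to-right regions.

    bounds gives hypothetical fractional column positions splitting the map into
    regions 1..5: region 1 has weight 0, region 2 ramps from 0 toward 0.5,
    region 3 is 0.5, region 4 ramps from 0.5 toward 1, and region 5 is 1.
    """
    c1, c2, c3, c4 = (int(b * width) for b in bounds)
    w = np.zeros((height, width), dtype=np.float32)
    w[:, :c1] = 0.0                                  # region 1: completely unreliable
    w[:, c1:c2] = np.linspace(0.0, 0.5, c2 - c1)     # region 2: 0 -> 0.5, left to right
    w[:, c2:c3] = 0.5                                # region 3: constant 0.5
    w[:, c3:c4] = np.linspace(0.5, 1.0, c4 - c3)     # region 4: 0.5 -> 1, left to right
    w[:, c4:] = 1.0                                  # region 5: completely credible
    return w

# For the right-eye case described for FIG. 7, the same map can simply be flipped:
# w_right = global_weight_map(h, w_cols)[:, ::-1]
```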
  • the first weight distribution map of the second mirror image includes at least two left and right divided regions, and different regions have different weight values.
  • the magnitude relationship between the weight value of the area on the left and the weight value of the area on the right is usually related to whether the monocular image is used as the left-eye image or the right-eye image.
  • In one case, the weight value of a region on the right is not less than the weight value of a region on the left. Any region in the first weight distribution map of the second mirror image may have a single weight value or different weight values within the region; in this case, the weight value of the left part of a region is usually not greater than the weight value of the right part of the region.
  • In the other case, the weight value of a region on the left is not less than the weight value of a region on the right. Likewise, any region in the first weight distribution map of the second mirror image may have a single weight value or different weight values within the region; in this case, the weight value of the right part of a region is usually not greater than the weight value of the left part of the region.
  • The manner of setting the second weight distribution map of the first disparity map may include: setting the weight values in the second weight distribution map of the first disparity map according to the disparity values in the second mirror image.
  • For a pixel at any position, if the disparity value of the pixel at that position in the second mirror image satisfies a first predetermined condition with respect to the first reference value corresponding to that pixel, the weight value of the pixel at that position in the second weight distribution map of the first disparity map is set to a first value; otherwise, it is set to a second value.
  • The first value in the present disclosure is greater than the second value. For example, the first value is 1 and the second value is 0.
  • an example of the second weight distribution map of the first disparity map is shown in FIG. 8.
  • the weight values of the white areas in FIG. 8 are all 1, which indicates that the disparity value at this position is completely reliable.
  • the weight value of the black area in FIG. 8 is 0, which means that the disparity value at this position is completely unreliable.
  • the first reference value corresponding to a pixel at any position in the present disclosure may be set according to the disparity value of the pixel at that position in the first disparity map and a constant value greater than zero.
  • the product of the disparity value of the pixel at the position in the first disparity map and a constant value greater than zero is used as the first reference value corresponding to the pixel at the position in the mirror disparity map.
  • The second weight distribution map of the first disparity map may be expressed by formula (1), in which:
  • L_l represents the second weight distribution map of the first disparity map;
  • Re represents the disparity value of the pixel at the corresponding position of the mirror disparity map;
  • d_l represents the disparity value of the pixel at the corresponding position in the first disparity map.
  • the setting manner of the second weight distribution map of the second mirror image may be: according to the disparity value in the first disparity map, the weight value in the second weight distribution map of the second mirror image is set.
  • For a pixel at any position, if the disparity value of the pixel at that position in the first disparity map satisfies a second predetermined condition with respect to the second reference value corresponding to that pixel, the weight value of the pixel at that position in the second weight distribution map of the second mirror image is set to a third value; otherwise, the weight value of the pixel at that position in the second weight distribution map of the second mirror image is set to a fourth value.
  • the third value in the present disclosure is greater than the fourth value.
  • the third value is 1, and the fourth value is 0.
  • The second reference value corresponding to a pixel in the present disclosure may be set according to the disparity value of the pixel at the corresponding position in a mirror disparity map and a constant value greater than zero. For example, left/right mirror processing is first performed on the first disparity map to form a mirror disparity map, and then the product of the disparity value of the pixel at the corresponding position in the mirror disparity map and a constant value greater than zero is used as the second reference value corresponding to the pixel at the corresponding position in the first disparity map.
  • An example of the second weight distribution map of the second mirror image shown in FIG. 9 is shown in FIG. 10.
  • the weight values of the white areas in FIG. 10 are all 1, which indicates that the disparity value at this position is completely reliable.
  • the weight value of the black area in FIG. 10 is 0, which means that the disparity value at this position is completely unreliable.
  • The second weight distribution map of the second mirror image can be expressed by formula (2), with the weight values set as described above.
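  • The exact first and second predetermined conditions of formulas (1) and (2) are not reproduced above; as an illustrative sketch only, assuming the condition is a left-right consistency check against the reference value (the product of the corresponding disparity and a constant alpha), a local (second) weight distribution map could be computed as follows:

```python
import numpy as np

def local_weight_map(disp, disp_ref, alpha=0.1):
    """Sketch of a local (second) weight distribution map.

    disp     : disparity map whose weights are being set (H x W)
    disp_ref : disparity values used for the check (e.g. the second mirror image when
               weighting the first disparity map, or vice versa, per the text above)
    alpha    : constant value greater than zero used to build the reference value

    Assumption: the weight is 1 where the two disparities agree to within the
    reference value alpha * disp, and 0 elsewhere.
    """
    reference = alpha * disp                      # assumed form of the reference value
    consistent = np.abs(disp - disp_ref) <= reference
    return consistent.astype(np.float32)          # 1 = credible, 0 = unreliable
```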
  • Step C: According to the weight distribution maps of the first disparity map of the monocular image and the weight distribution maps of the second mirror image, the first disparity map of the monocular image is optimized and adjusted, and the optimized and adjusted first disparity map of the monocular image is finally obtained.
  • The present disclosure may use the first weight distribution map and the second weight distribution map of the first disparity map to adjust the multiple disparity values in the first disparity map to obtain the adjusted first disparity map, and may use the first weight distribution map and the second weight distribution map of the second mirror image to adjust the multiple disparity values in the second mirror image to obtain the adjusted second mirror image; the adjusted first disparity map and the adjusted second mirror image are then combined to obtain the optimized and adjusted first disparity map of the monocular image.
  • an example of obtaining the first disparity map of the optimized and adjusted monocular image is as follows:
  • The third weight distribution map, which is obtained by merging the first weight distribution map and the second weight distribution map of the first disparity map, can be expressed by formula (3), where:
  • W_l represents the third weight distribution map;
  • M_l represents the first weight distribution map of the first disparity map;
  • L_l represents the second weight distribution map of the first disparity map.
  • The fourth weight distribution map, which is obtained by merging the first weight distribution map and the second weight distribution map of the second mirror image, can be expressed by formula (4), where:
  • W_l' represents the fourth weight distribution map;
  • M_l' represents the first weight distribution map of the second mirror image;
  • L_l' represents the second weight distribution map of the second mirror image.
  • The multiple disparity values in the second mirror image are adjusted according to the fourth weight distribution map to obtain the adjusted second mirror image. For example, for the disparity value of a pixel at any position in the second mirror image, the disparity value of the pixel at that position is replaced with the product of that disparity value and the weight value of the pixel at the corresponding position in the fourth weight distribution map. After performing the above replacement processing on all pixels in the second mirror image, the adjusted second mirror image is obtained.
  • The finally obtained first disparity map of the monocular image can be expressed by formula (5), where:
  • d_final represents the finally obtained first disparity map of the monocular image (as shown in the first image on the right in FIG. 11);
  • W_l represents the third weight distribution map (as shown in the first image at the upper left of FIG. 11);
  • W_l' represents the fourth weight distribution map (as shown in the first image at the lower left of FIG. 11);
  • d_l represents the first disparity map (as shown in the second image from the upper left in FIG. 11), and a further symbol (for example, d_l') represents the second mirror image (as shown in the second image from the lower left in FIG. 11).
  • the present disclosure does not limit the execution order of the two steps of merging the first weight distribution map and the second weight distribution map.
  • the two merging processing steps can be executed simultaneously or sequentially.
  • the present disclosure does not limit the sequence of adjusting the disparity value in the first disparity image and adjusting the disparity value in the second mirror image.
  • The two adjustment steps can be performed at the same time or successively.
  • The present disclosure performs mirror processing on the monocular image and on the second disparity map, and then uses the mirrored disparity map (i.e., the second mirror image) to optimize and adjust the first disparity map of the monocular image, which is beneficial to reducing the phenomenon that the disparity values of the corresponding areas in the first disparity map of the monocular image are erroneously reduced, thereby helping to improve the accuracy of obstacle detection.
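  • As an illustrative sketch only (the merging operation of formulas (3) and (4) and the combination operation of formula (5) are not spelled out in this excerpt; the element-wise products and the element-wise sum below are assumptions consistent with the replacement-by-product adjustment described above), the optimization and adjustment could be organized as follows:

```python
import numpy as np

def fuse_disparities(d_l, d_mirror, M_l, L_l, M_lp, L_lp):
    """Sketch of the optimization/adjustment of the first disparity map.

    d_l        : first disparity map of the monocular image
    d_mirror   : second mirror image (mirror of the second disparity map)
    M_l, L_l   : first (global) and second (local) weight maps of the first disparity map
    M_lp, L_lp : first and second weight maps of the second mirror image
    """
    W_l = M_l * L_l                      # third weight distribution map (assumed merge: product)
    W_lp = M_lp * L_lp                   # fourth weight distribution map (assumed merge: product)
    adjusted_first = W_l * d_l           # per-pixel replacement by the product, as in the text
    adjusted_mirror = W_lp * d_mirror
    return adjusted_first + adjusted_mirror   # assumed combination: element-wise sum
```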
  • The method of obtaining the first disparity map of the binocular image in the present disclosure includes, but is not limited to: obtaining the first disparity map of the binocular image by means of stereo matching.
  • Stereo matching algorithms that may be used include, but are not limited to: BM (Block Matching), SGBM (Semi-Global Block Matching) and GC (Graph Cuts).
  • a convolutional neural network for obtaining a disparity map of a binocular image is used to perform disparity processing on the binocular image, thereby obtaining a first disparity map of the binocular image.
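  • For the binocular case, a minimal sketch using OpenCV's SGBM implementation (the file names and parameter values are illustrative assumptions) might look as follows:

```python
import cv2

# Load a rectified left/right (binocular) image pair; the file names are placeholders.
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Semi-Global Block Matching; numDisparities must be a multiple of 16.
matcher = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=128,
    blockSize=5,
    P1=8 * 5 * 5,          # smoothness penalties (typical settings for a 5x5 block)
    P2=32 * 5 * 5,
    uniquenessRatio=10,
    speckleWindowSize=100,
    speckleRange=2,
)

# compute() returns fixed-point disparities scaled by 16.
disparity = matcher.compute(left, right).astype("float32") / 16.0
```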
  • S110 Determine a plurality of obstacle pixel regions in the first disparity map of the environment image.
  • the obstacle pixel area may be a pixel area including at least two consecutive pixels in the first disparity map.
  • the obstacle pixel area may be a columnar area of obstacle pixels.
  • The columnar area of obstacle pixels in the present disclosure is a stripe area whose width is at least one column of pixels and whose height is at least two rows of pixels. Since the stripe area can be used as the basic unit of an obstacle, the present disclosure refers to the stripe area as an obstacle pixel columnar area.
  • The present disclosure may first perform edge detection on the first disparity map of the environment image obtained in the above steps to obtain obstacle edge information; then determine the obstacle area in the first disparity map of the environment image; and finally, according to the obstacle edge information, determine a plurality of obstacle pixel columnar areas in the obstacle area.
  • In this way, the present disclosure helps avoid forming obstacle pixel columnar areas in areas of low attention value, and helps improve the convenience of forming obstacle pixel columnar areas.
  • Different obstacles in the actual space are at different distances from the camera device and therefore produce different disparities, thereby forming parallax edges around the obstacles.
  • The present disclosure can separate the obstacles in the disparity map by detecting the obstacle edge information, so that the obstacle pixel columnar areas can easily be formed by searching the obstacle edge information, which is beneficial to improving the convenience of forming the obstacle pixel columnar areas.
  • The method of obtaining the obstacle edge information in the first disparity map of the environment image in the present disclosure includes, but is not limited to: using a convolutional neural network for edge extraction to obtain the obstacle edge information in the first disparity map of the environment image; or using an edge detection algorithm to obtain the obstacle edge information in the first disparity map of the environment image.
  • An example in which the present disclosure uses an edge detection algorithm to obtain the obstacle edge information in the first disparity map of the environment image is shown in FIG. 12 and includes the following steps.
  • step 1 Perform histogram equalization processing on the first disparity map of the environment image.
  • the first disparity map of the environment image is the image in the upper left corner of FIG. 12, and the first disparity map may be the first disparity map of the environment image shown in FIG. 2 finally obtained by using the above step 100.
  • the result of the histogram equalization processing is shown in the second image from the upper left of FIG. 12.
  • Step 2 Perform average filtering processing on the result of the histogram equalization processing.
  • the result of the filtering process is shown in the third picture from the upper left in Figure 12.
  • the above steps 1 and 2 are the preprocessing of the first disparity map of the environment image.
  • Steps 1 and 2 are only an example of preprocessing the first disparity map of the environment image. The present disclosure does not limit the specific implementation of preprocessing.
  • Step 3 Use an edge detection algorithm to perform edge detection processing on the filtered result to obtain edge information.
  • the edge information obtained in this step is shown in the fourth image from the upper left in Figure 12.
  • the edge detection algorithms in the present disclosure include, but are not limited to: Canny edge detection algorithm, Sobel edge detection algorithm, or Laplacian edge detection algorithm.
  • Step 4 Perform a morphological expansion operation on the obtained edge information.
  • the result of the expansion operation is shown in the fifth graph from the upper left in Figure 12.
  • This step belongs to a post-processing method of the detection result of the edge detection algorithm.
  • the present disclosure does not limit the specific implementation of post-processing.
  • Step 5 Perform a reverse operation on the result of the expansion operation to obtain an edge mask (Mask) of the first disparity map of the environment image.
  • the edge mask of the first disparity map of the environmental image is shown in the lower left corner of FIG. 12.
  • Step 6 Perform an AND operation on the edge mask of the first disparity map of the environment image and the first disparity map of the environment image to obtain the obstacle edge information in the first disparity map of the environment image.
  • the right side diagram of FIG. 12 shows obstacle edge information in the first disparity map of the environment image.
  • the disparity value at the position of the obstacle edge in the first disparity map of the environment image is set to 0.
  • Obstacle edge information is shown as black edge lines in Figure 12.
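  • A compact sketch of steps 1 to 6 above using OpenCV (the kernel sizes and Canny thresholds are illustrative assumptions) might be:

```python
import cv2
import numpy as np

def obstacle_edges(disp):
    """Apply steps 1-6 above to a first disparity map 'disp'."""
    # Step 1: histogram equalization (requires an 8-bit single-channel image).
    disp8 = cv2.normalize(disp, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    eq = cv2.equalizeHist(disp8)
    # Step 2: mean (average) filtering.
    blurred = cv2.blur(eq, (5, 5))
    # Step 3: edge detection with the Canny algorithm.
    edges = cv2.Canny(blurred, 50, 150)
    # Step 4: morphological dilation of the edge map.
    dilated = cv2.dilate(edges, np.ones((3, 3), np.uint8))
    # Step 5: inversion yields the edge mask of the disparity map.
    mask = cv2.bitwise_not(dilated)
    # Step 6: AND the mask with the disparity map, which sets the disparity
    # values at obstacle edge positions to 0 (the black edge lines of FIG. 12).
    return cv2.bitwise_and(disp8, disp8, mask=mask)
```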
  • an example of determining the obstacle area in the first disparity map in the present disclosure includes the following steps:
  • Step a: Perform statistical processing on the disparity values of each row of pixels in the first disparity map to obtain statistical information of the disparity values of each row of pixels, and determine a statistical disparity map based on the statistical information of the disparity values of each row of pixels.
  • The present disclosure may perform horizontal statistics (row-direction statistics) on the first disparity map of the environment image to obtain a V disparity map, which may be used as the statistical disparity map. That is, for each row of the first disparity map of the environment image, the number of occurrences of each disparity value in the row is counted, and the statistical results are set in the corresponding columns of that row of the V disparity map.
  • the width of the V-disparity map (that is, the number of columns) is related to the value range of the disparity value. For example, if the value range of the disparity value is 0-254, the width of the V-disparity map is 255.
  • the height of the V-disparity map is the same as the height of the first disparity map of the environment image, that is, the number of rows included in the two is the same.
  • the statistical disparity map formed by the present disclosure is shown in FIG. 13.
  • In FIG. 13, the top row represents disparity values from 0 to 5; the value in the second row and the first column is 1, which means that the number of pixels with the corresponding disparity value in the first row of FIG. 4 is 1; the value in the second row and the second column is 6, which means that the number of pixels with the corresponding disparity value in the first row of FIG. 4 is 6; the value in the fifth row and the sixth column is 5, which means that the number of pixels with a disparity value of 5 in the corresponding row of FIG. 4 is 5.
  • The other numerical values in FIG. 13 are interpreted analogously and are not described one by one.
  • The first disparity map of the environment image shown in the left image in FIG. 14 is processed, and the obtained V-disparity map is shown in the right image in FIG. 14.
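  • A minimal sketch of this row-wise statistic (assuming integer disparity values in the range described above) is:

```python
import numpy as np

def v_disparity(disp, max_disp=255):
    """Row-wise histogram of disparities: the V-disparity (statistical disparity) map.

    disp: first disparity map with integer disparity values in [0, max_disp - 1].
    Returns an array of shape (rows, max_disp); entry (v, d) counts how many
    pixels in row v of the disparity map have disparity value d.
    """
    rows = disp.shape[0]
    vdisp = np.zeros((rows, max_disp), dtype=np.int32)
    for v in range(rows):
        values, counts = np.unique(disp[v].astype(np.int64), return_counts=True)
        valid = (values >= 0) & (values < max_disp)
        vdisp[v, values[valid]] = counts[valid]
    return vdisp
```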
  • Step b Perform a first straight line fitting process on the statistical disparity map (also referred to as a V disparity map in the present disclosure), and determine the ground area and the non-ground area according to the result of the first straight line fitting process.
  • the present disclosure may preprocess the V-disparity map.
  • the preprocessing of the V-disparity map may include, but is not limited to: removing noise, etc.
  • threshold filtering is performed on the V-disparity map to filter out noise in the V-disparity map.
  • the V-disparity map for filtering noise is shown in the second image on the left in FIG. 15.
  • In the fitted first straight line equation, v represents the row coordinate in the V disparity map and d represents the disparity value.
  • the oblique line in FIG. 13 represents the fitted first linear equation.
  • the white diagonal line in the first picture on the right in FIG. 15 represents the fitted first straight line equation.
  • the first straight line fitting method includes but is not limited to: RANSAC straight line fitting method.
  • the first straight line equation obtained by the above fitting may represent the relationship between the disparity value of the ground area and the row coordinates of the V disparity map. That is, for any row in the V disparity map, when v is determined, the disparity value d of the ground area should be a certain value.
  • The disparity value of the ground area can be expressed in the form of formula (6), i.e., the fitted first straight line equation relating the ground disparity to the row coordinate v, where d_road represents the disparity value of the ground area, and A and B are known values, such as the values obtained by the first straight line fitting.
  • The present disclosure can use formula (6) to segment the first disparity map of the environment image, so as to obtain the ground area I_road and the non-ground area I_notroad.
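  • As an illustrative sketch only (assuming the ground line is parameterized as d_road = A * v + B and fitted with scikit-learn's RANSACRegressor; the disclosure only specifies RANSAC straight line fitting), the first straight line fitting could be performed as follows:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, RANSACRegressor

def fit_ground_line(vdisp, min_count=20):
    """Fit the ground line of formula (6) in the V-disparity map with RANSAC.

    vdisp: V-disparity map (rows x disparity bins) after noise filtering.
    Returns (A, B), assuming the ground disparity is modeled as d_road = A * v + B.
    """
    # Candidate points: (row v, disparity d) cells with sufficient support,
    # i.e. after the threshold filtering (noise removal) described above.
    v_idx, d_idx = np.nonzero(vdisp >= min_count)
    ransac = RANSACRegressor(LinearRegression())
    ransac.fit(v_idx.reshape(-1, 1), d_idx)          # regress d on v
    A = float(ransac.estimator_.coef_[0])
    B = float(ransac.estimator_.intercept_)
    return A, B
```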
  • The present disclosure may use formula (7) to determine the ground area and the non-ground area, where I(*) represents the set of pixels: if the disparity value of a pixel in the first disparity map of the environment image is sufficiently close to the ground disparity value d_road of its row (for example, the difference between them is within a threshold), the pixel belongs to the ground area I_road; otherwise, the pixel belongs to the non-ground area I_notroad.
  • The ground area I_road may be as shown in the upper right diagram in FIG. 16.
  • The non-ground area I_notroad may be as shown in the lower right diagram in FIG. 16.
  • The non-ground area I_notroad in the present disclosure may include at least one of a first area I_high above the ground and a second area I_low below the ground.
  • an area that is higher than the ground and whose height above the ground is less than a predetermined height value can be used as an obstacle area.
  • The second area I_low below the ground may be an area such as a pit, a ditch or a valley.
  • The present disclosure may also treat an area in the non-ground area I_notroad that is below the ground and whose depth below the ground is less than a predetermined height value as an obstacle area.
  • The first area I_high above the ground and the second area I_low below the ground in the present disclosure can be expressed by formula (8):
  • I_notroad(*) represents the set of pixels: if the disparity value d of a pixel in the first disparity map of the environment image satisfies d - d_road > thresh4, the pixel belongs to the first area I_high above the ground; if the disparity value of a pixel in the first disparity map of the environment image satisfies d_road - d > thresh4, the pixel belongs to the second area I_low below the ground. Here thresh4 represents a threshold, which is a known value whose size can be set according to the actual situation.
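  • As an illustrative sketch only (assuming the fitted ground line d_road = A * v + B from the previous sketch; the ground tolerance used for formula (7) and the value of thresh4 are assumptions), the segmentation of formulas (7) and (8) could be performed as follows:

```python
import numpy as np

def segment_by_ground_line(disp, A, B, ground_tol=3.0, thresh4=3.0):
    """Classify pixels of the first disparity map using the fitted ground line.

    Returns boolean masks (I_road, I_high, I_low):
    I_road : ground area, pixels whose disparity is close to d_road (formula (7))
    I_high : non-ground pixels with d - d_road > thresh4 (above the ground, formula (8))
    I_low  : non-ground pixels with d_road - d > thresh4 (below the ground, formula (8))
    """
    rows = np.arange(disp.shape[0]).reshape(-1, 1)   # row coordinate v
    d_road = A * rows + B                            # ground disparity per row
    diff = disp - d_road
    I_road = np.abs(diff) <= ground_tol
    not_road = ~I_road
    I_high = not_road & (diff > thresh4)
    I_low = not_road & (-diff > thresh4)
    return I_road, I_high, I_low
```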
  • the first area I_high above the ground often includes obstacles that do not need attention, for example, target objects such as traffic lights and overpasses. Because they will not affect the driving of the vehicle, these target objects are, for the vehicle, obstacles that do not require attention; such obstacles are often in a high position and will not affect the driving of vehicles or pedestrians.
  • the present disclosure can remove regions belonging to higher positions from the first area I_high above the ground, for example, remove regions whose height above the ground is greater than or equal to a first predetermined height value, thereby forming the obstacle area I_obstacle.
  • the present disclosure may perform a second straight line fitting process according to the V-disparity map and, according to the result of the second straight line fitting process, determine the area in the non-ground area that belongs to a higher position (that is, the area whose height above the ground is greater than or equal to the first predetermined height value), thereby obtaining the obstacle area I_obstacle in the non-ground area.
  • the second straight line fitting method includes but is not limited to: RANSAC straight line fitting method.
  • v represents the row coordinates in the V disparity map
  • d represents the disparity value.
  • the parameters C and D of the second straight line equation are known values; the second straight line equation of the present disclosure can accordingly be expressed as a linear relationship between the row coordinate v and the disparity value at the height H above the ground.
  • H is a known constant value and can be set according to actual needs; for example, in intelligent vehicle control, H can be set to 2.5 meters.
  • the intermediate image in FIG. 18 contains two white diagonal lines, an upper one and a lower one, and the upper white diagonal line represents the fitted second straight line equation.
  • the second straight line equation obtained by the above fitting may express the relationship between the disparity value of the obstacle area and the row coordinates of the V-disparity map. That is, for any row in the V-disparity map, when v is determined, the disparity value d of the obstacle area should be a determined value.
  • the present disclosure may divide the first area I_high above the ground into the form represented by the following formula (9):
  • I_high(*) represents the set of pixels: if the disparity value d of a pixel in the first disparity map of the environment image satisfies d ≤ d_H, then the pixel belongs to the region I_≤H that is above the ground but lower than the height H above the ground, and the present disclosure may regard I_≤H as the obstacle area I_obstacle; if the disparity value d of a pixel in the first disparity map of the environment image satisfies d > d_H, then the pixel belongs to the region I_>H that is above the ground and higher than the height H above the ground.
  • d_H represents the disparity value of a pixel point at the height H above the ground.
  • I_>H can be as shown in the upper right figure in FIG. 18.
  • I_≤H can be as shown in the lower right diagram in FIG. 18.
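To make the two thresholding steps above concrete, here is a small Python sketch continuing the assumptions of the previous sketch: the fitted lines are written as d_road(v) = a·v + b and d_H(v) = c·v + d, and thresh4 is an illustrative value. It splits non-ground pixels into above-ground and below-ground parts and then keeps only the above-ground pixels lower than the height H as the obstacle area.

```python
def split_non_ground(disp, a, b, thresh4=3.0):
    """Formula (8)-style split of non-ground pixels relative to the ground line d_road(v) = a*v + b."""
    h = disp.shape[0]
    d_road = (a * np.arange(h) + b)[:, None]
    valid = disp > 0
    i_high = valid & (disp - d_road > thresh4)  # first area, above the ground
    i_low = valid & (d_road - disp > thresh4)   # second area, below the ground (pits, ditches, valleys)
    return i_high, i_low

def obstacle_from_high(disp, i_high, c, d):
    """Keep only above-ground pixels lower than the height H, using the second line d_H(v) = c*v + d."""
    h = disp.shape[0]
    d_H = (c * np.arange(h) + d)[:, None]
    i_le_H = i_high & (disp <= d_H)  # obstacle area I_obstacle
    i_gt_H = i_high & (disp > d_H)   # higher than H above the ground, e.g. traffic lights, overpasses
    return i_le_H, i_gt_H
```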
  • the method of determining the pixel columnar area according to the obstacle edge information may be as follows: first, the disparity values of the pixels of the non-obstacle area in the first disparity map and the disparity values of the pixels at the obstacle edge information are set to a predetermined value.
  • then, with N pixels in the column direction as the column width, and with the determined target rows as the boundaries of the obstacle pixel columnar area in the row direction, the obstacle pixel columnar areas in the obstacle area are determined.
  • for example, the method of determining the pixel columnar area according to the obstacle edge information may be: first, according to the detected obstacle edge information, the disparity values at the obstacle edge positions in the disparity map are all set to a predetermined value (such as 0), and the disparity values in the area outside the obstacle area in the disparity map are also set to the predetermined value (such as 0); then, with a predetermined column width (at least one column of pixels wide, such as 6 columns of pixels wide), the disparity map is searched upwards from its bottom.
  • when the disparity value of any column within the predetermined column width jumps from the predetermined value to a non-predetermined value, that position (the corresponding row of the disparity map) is determined to be the bottom of the pixel columnar area, and the pixel columnar area starts to be formed, that is, the pixel columnar area starts to be extended upwards; the search then continues upwards for a jump from a non-predetermined value back to the predetermined value in the disparity map.
  • when the disparity value of any column of pixels within the column width jumps from a non-predetermined value back to the predetermined value, the upward extension of the pixel columnar area is stopped, and that position (the corresponding row of the disparity map) is determined to be the top of the pixel columnar area, thus forming an obstacle pixel columnar area.
  • the present disclosure can start the determination process of the obstacle pixel columnar area from the lower left corner of the disparity map to the lower right corner of the disparity map.
  • for example, the above determination process of the obstacle pixel columnar area can be executed starting from the leftmost 6 columns of the disparity map, and then performed again for columns 7-12, and so on, until the rightmost column of the disparity map is reached.
  • the present disclosure can also start the determination process of the obstacle pixel columnar area from the lower right corner of the disparity map to the lower left corner of the disparity map.
  • for another example, the method of forming the pixel columnar area according to the obstacle edge information may be: first, according to the detected obstacle edge information, the disparity values at the obstacle edge positions in the disparity map are all set to a predetermined value (such as 0), and the disparity values in the area outside the obstacle area in the disparity map are also set to the predetermined value (such as 0); then, with a predetermined column width (at least one column of pixels wide, such as 6 columns of pixels wide), the disparity map is searched downwards from its top; when the disparity value of any column within the predetermined column width jumps from the predetermined value to a non-predetermined value, that position (the corresponding row of the disparity map) is determined to be the top of the pixel columnar area, and the pixel columnar area starts to be formed, that is, the pixel columnar area starts to be extended downwards; the search then continues downwards for a jump from a non-predetermined value back to the predetermined value in the disparity map.
  • the present disclosure can start the determination process of the obstacle pixel columnar area from the upper left corner of the disparity map to the upper right corner of the disparity map.
  • for example, the above determination process of the obstacle pixel columnar area can be executed starting from the top of the leftmost 6 columns of the disparity map, and then performed again from the top of columns 7-12, and so on, until the rightmost column of the disparity map is reached.
  • the present disclosure can also start the determination process of the obstacle pixel columnar area from the upper right corner of the disparity map to the upper left corner of the disparity map.
  • for the environment image shown in FIG. 2, an example of the obstacle pixel columnar areas formed by the present disclosure is shown in the right figure of FIG. 19.
  • the width of each obstacle pixel columnar area in the right image of FIG. 19 is 6 columns of pixels.
  • the width of the obstacle pixel columnar area can be set according to actual requirements; the larger the width is set, the rougher the formed obstacle pixel columnar areas are, and the shorter the time needed to form them.
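The column-forming scan described above can be sketched as follows in Python. The sketch assumes a disparity map in which non-obstacle pixels and obstacle-edge pixels have already been set to the predetermined value 0, scans each band of `col_width` columns from the bottom upwards, and records one (starting column, bottom row, top row) triple per detected jump pair; how overlapping objects inside one band are handled is left as an implementation choice.

```python
def extract_pixel_columns(disp_masked, col_width=6):
    """Bottom-up scan of a disparity map whose non-obstacle and obstacle-edge pixels are already 0.
    Returns one (start_column, bottom_row, top_row) triple per obstacle pixel columnar area."""
    h, w = disp_masked.shape
    columns = []
    for c0 in range(0, w - col_width + 1, col_width):
        band = (disp_masked[:, c0:c0 + col_width] > 0).any(axis=1)  # any non-predetermined value
        v = h - 1
        while v >= 0:
            if band[v]:                       # jump from 0 to non-zero: bottom of a columnar area
                bottom = v
                while v >= 0 and band[v]:     # extend upwards until the values jump back to 0
                    v -= 1
                columns.append((c0, bottom, v + 1))  # v + 1 is the top row of the columnar area
            v -= 1
    return columns
```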
  • the attribute information of the obstacle pixel columnar area should be determined.
  • the attribute information of the obstacle pixel columnar area includes, but is not limited to: the spatial position information of the obstacle pixel columnar area, the bottom information bottom of the obstacle pixel columnar area, the disparity value disp of the obstacle pixel columnar area, the top information top of the obstacle pixel columnar area, and the column information col of the obstacle pixel columnar area.
  • the spatial position information of the obstacle pixel columnar area may include: the coordinate of the obstacle pixel columnar area on the horizontal coordinate axis (X coordinate axis) and the coordinate of the obstacle pixel columnar area on the depth coordinate axis (Z coordinate axis).
  • an example of the X, Y, and Z coordinate axes is shown in FIG. 17.
  • the bottom information of the columnar area of obstacle pixels may be the row number of the bottom end of the columnar area of obstacle pixels.
  • the disparity value of the obstacle pixel columnar area may be: the disparity value of the pixel at the non-zero position when the disparity value changes from zero to non-zero.
  • the top information of the obstacle pixel columnar area can be the row number of the pixel at the zero position when the disparity value changes from non-zero to zero.
  • the column information of the obstacle pixel columnar area may be the column number of any column among all the columns included in the pixel columnar area, for example, the column number of a column located in the middle of the pixel columnar area.
  • the present disclosure uses the following formula (10) to calculate the spatial position information of the obstacle pixel columnar area, that is, the X coordinate, Z coordinate, maximum Y coordinate and minimum Y coordinate of the obstacle pixel columnar area:
  • the Y coordinate of each pixel in the obstacle pixel columnar area can be expressed by the following formula (11):
  • Y_i represents the Y coordinate of the i-th pixel in the obstacle pixel columnar area;
  • row_i represents the row number of the i-th pixel in the obstacle pixel columnar area;
  • c_y represents the vertical (row) coordinate of the optical center of the main camera;
  • Z represents the Z coordinate of the obstacle pixel columnar area;
  • f represents the focal length of the camera.
  • the maximum Y coordinate and the minimum Y coordinate can be obtained.
  • the maximum Y coordinate and the minimum Y coordinate can be expressed in the form of the following formula (12):
  • Y_min represents the minimum Y coordinate of the obstacle pixel columnar area;
  • Y_max represents the maximum Y coordinate of the obstacle pixel columnar area;
  • min(Y_i) represents the minimum value of all the calculated Y_i;
  • max(Y_i) represents the maximum value of all the calculated Y_i.
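A minimal sketch of these attribute computations is given below. The stereo relations Z = f·B/d, X = (col − c_x)·Z/f and Y_i = (row_i − c_y)·Z/f are assumed from the pinhole model, since formula (10) itself is not reproduced in this text; fx, fy, cx, cy and the baseline B are assumed calibration parameters (for a monocular environment image, the baseline would be that of the stereo pair used for training).

```python
def column_spatial_info(col, bottom, top, disp_value, fx, fy, cx, cy, baseline):
    """X, Z, Y_min, Y_max of one obstacle pixel columnar area from its attribute information."""
    Z = fx * baseline / disp_value            # depth from the column's disparity value
    X = (col - cx) * Z / fx                   # lateral coordinate on the X axis
    rows = np.arange(top, bottom + 1)
    Y = (rows - cy) * Z / fy                  # formula (11): Y_i = (row_i - c_y) * Z / f
    return X, Z, Y.min(), Y.max()             # formula (12): Y_min = min(Y_i), Y_max = max(Y_i)
```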
  • the present disclosure may perform clustering processing on a plurality of obstacle pixel columnar regions to obtain at least one cluster.
  • the present disclosure can perform clustering processing on all obstacle pixel columnar regions according to the spatial position information of the obstacle pixel columnar region, and one cluster corresponds to one obstacle instance.
  • the present disclosure can use a corresponding clustering algorithm to perform clustering processing on the columnar regions of each obstacle pixel.
  • the present disclosure may adopt the min-max normalization processing method to map the X and Z coordinates of the obstacle pixel columnar areas, so that the X and Z coordinates of the obstacle pixel columnar areas are mapped into the value range [0, 1].
  • This normalization processing method is shown in the following formula (13):
  • X* represents the normalized X coordinate;
  • Z* represents the normalized Z coordinate;
  • X represents the X coordinate of the obstacle pixel columnar area;
  • Z represents the Z coordinate of the obstacle pixel columnar area;
  • X_min represents the minimum value of the X coordinates of all obstacle pixel columnar areas;
  • X_max represents the maximum value of the X coordinates of all obstacle pixel columnar areas;
  • Z_min represents the minimum value of the Z coordinates of all obstacle pixel columnar areas;
  • Z_max represents the maximum value of the Z coordinates of all obstacle pixel columnar areas.
  • the present disclosure may also adopt the Z-score normalization processing method to perform normalization processing on the X coordinate and Z coordinate of the columnar region of obstacle pixels.
  • An example of this normalization processing method is shown in the following formula (14):
  • X* represents the normalized X coordinate;
  • Z* represents the normalized Z coordinate;
  • X represents the X coordinate of the obstacle pixel columnar area;
  • Z represents the Z coordinate of the obstacle pixel columnar area;
  • μ_X represents the mean value calculated for the X coordinates of all obstacle pixel columnar areas;
  • σ_X represents the standard deviation calculated for the X coordinates of all obstacle pixel columnar areas;
  • μ_Z represents the mean value calculated for the Z coordinates of all obstacle pixel columnar areas;
  • σ_Z represents the standard deviation calculated for the Z coordinates of all obstacle pixel columnar areas.
  • after this processing, the X* and Z* of all obstacle pixel columnar areas conform to the standard normal distribution, that is, the mean is 0 and the standard deviation is 1.
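Both normalization options can be written directly from their standard definitions; the short Python sketch below assumes that the X and Z coordinates of all columnar areas have been collected into NumPy arrays.

```python
def minmax_normalize(X, Z):
    """Formula (13)-style min-max normalization: map X and Z into the value range [0, 1]."""
    Xs = (X - X.min()) / (X.max() - X.min())
    Zs = (Z - Z.min()) / (Z.max() - Z.min())
    return Xs, Zs

def zscore_normalize(X, Z):
    """Formula (14)-style Z-score normalization: zero mean and unit standard deviation."""
    return (X - X.mean()) / X.std(), (Z - Z.mean()) / Z.std()
```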
  • the present disclosure may adopt a density-based clustering (DBSCAN) algorithm to perform clustering processing on the obstacle pixel columnar areas according to the normalized spatial position information of all obstacle pixel columnar areas, thereby forming at least one cluster, where each cluster is an obstacle instance.
  • the present disclosure does not limit the clustering algorithm.
  • An example of the clustering result is shown in the right panel in Figure 20.
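As an illustration, the clustering step can be sketched with scikit-learn's DBSCAN applied to the normalized coordinates; the eps and min_samples values below are illustrative assumptions, not values taken from the disclosure.

```python
from sklearn.cluster import DBSCAN

def cluster_columns(Xs, Zs, eps=0.05, min_samples=2):
    """Cluster the normalized (X*, Z*) coordinates of the obstacle pixel columnar areas;
    each returned label identifies one obstacle instance, -1 marks noise."""
    feats = np.stack([Xs, Zs], axis=1)
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(feats)
```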
  • the obstacle detection result may include, but is not limited to, at least one of the obstacle detection frame and the spatial position information of the obstacle.
  • the present disclosure may determine the obstacle detection frame (bounding box) in the environment image according to the spatial position information of the pixel columnar areas belonging to the same cluster. For example, for a cluster, the present disclosure can calculate the maximum column coordinate u_max and the minimum column coordinate u_min, in the environment image, of all obstacle pixel columnar areas in the cluster, and calculate the largest bottom (i.e., v_max) and the smallest top (i.e., v_min) of all obstacle pixel columnar areas in the cluster (note: it is assumed that the origin of the image coordinate system is at the upper left corner of the image).
  • the coordinates of the obstacle detection frame obtained by the present disclosure in the environment image can be expressed as (u_min, v_min, u_max, v_max).
  • an example of the obstacle detection frame determined by the present disclosure is shown in the right figure of FIG. 21.
  • the multiple rectangular frames in the right figure of FIG. 21 are all obstacle detection frames obtained in the present disclosure.
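Reusing the (starting column, bottom row, top row) triples and the cluster labels from the earlier sketches, the detection frame of one cluster can be computed as follows; the 6-column width is the illustrative value used above.

```python
def cluster_bounding_box(columns, labels, label, col_width=6):
    """(u_min, v_min, u_max, v_max) of one cluster, with the image origin at the upper left corner."""
    members = [c for c, l in zip(columns, labels) if l == label]
    u_min = min(c0 for c0, _, _ in members)
    u_max = max(c0 + col_width - 1 for c0, _, _ in members)
    v_min = min(top for _, _, top in members)        # smallest top of the columnar areas
    v_max = max(bottom for _, bottom, _ in members)  # largest bottom of the columnar areas
    return u_min, v_min, u_max, v_max
```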
  • the present disclosure obtains obstacles by clustering a plurality of obstacle pixel columnar areas. There is no need to predefine the obstacles to be detected, and no predefined information about the texture, color, shape or category of the obstacles is used; obstacles can be detected directly by clustering the obstacle areas. The detected obstacles are therefore not limited to certain predefined obstacles, and the various obstacles in the surrounding space environment that may hinder the movement of the smart device can be detected, thereby realizing the detection of general types of obstacles.
  • the present disclosure may also determine the spatial position information of obstacles based on the spatial position information of multiple obstacle pixel columnar regions that belong to the same cluster.
  • the spatial position information of the obstacle may include, but is not limited to: the coordinate of the obstacle on the horizontal coordinate axis (X coordinate axis), the coordinate of the obstacle on the depth coordinate axis (Z coordinate axis), the height of the obstacle in the vertical direction (that is, the obstacle height), and so on.
  • the present disclosure may first determine, based on the spatial position information of the multiple obstacle pixel columnar areas belonging to the same cluster, the distances between the multiple obstacle pixel columnar areas in the cluster and the camera device that generates the environment image, and then determine the spatial position information of the obstacle based on the spatial position information of the obstacle pixel columnar area closest to the camera device.
  • the present disclosure may use the following formula (15) to calculate the distances between the multiple obstacle pixel columnar areas in a cluster and the camera device, and select the minimum distance:
  • d_min represents the minimum distance;
  • X_i represents the X coordinate of the i-th obstacle pixel columnar area in the cluster;
  • Z_i represents the Z coordinate of the i-th obstacle pixel columnar area in the cluster.
  • the X coordinate and Z coordinate of the obstacle pixel columnar area with the minimum distance can be used as the spatial position information of the obstacle, as shown in the following formula (16):
  • O_X represents the coordinate of the obstacle on the horizontal coordinate axis, that is, the X coordinate of the obstacle;
  • O_Z represents the coordinate of the obstacle on the depth coordinate axis (Z coordinate axis), that is, the Z coordinate of the obstacle;
  • X_close represents the X coordinate of the obstacle pixel columnar area with the minimum distance calculated above;
  • Z_close represents the Z coordinate of the obstacle pixel columnar area with the minimum distance calculated above.
  • the present disclosure may use the following formula (17) to calculate the height of the obstacle:
  • O_H represents the height of the obstacle;
  • Y_max represents the maximum Y coordinate of all obstacle pixel columnar areas in the cluster;
  • Y_min represents the minimum Y coordinate of all obstacle pixel columnar areas in the cluster.
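The distance, position and height computations of formulas (15)-(17) can be sketched as below, operating on per-column arrays X, Z, Y_min, Y_max and the cluster labels produced earlier. Treating these as NumPy arrays, and reading formula (15) as the Euclidean distance in the X-Z plane, are assumptions of the sketch.

```python
def obstacle_position_and_height(X, Z, Y_min, Y_max, labels, label):
    """Obstacle position and height of one cluster, following formulas (15)-(17)."""
    idx = np.nonzero(labels == label)[0]
    dists = np.sqrt(X[idx] ** 2 + Z[idx] ** 2)   # formula (15): distance of each column to the camera
    closest = idx[np.argmin(dists)]
    O_X, O_Z = X[closest], Z[closest]            # formula (16): position of the nearest columnar area
    O_H = Y_max[idx].max() - Y_min[idx].min()    # formula (17): obstacle height from the Y extent
    return O_X, O_Z, O_H
```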
  • The flow of one embodiment of training a convolutional neural network in the present disclosure is shown in FIG. 22.
  • S2200: Input one of the two image samples of a binocular image sample (i.e., the left-eye or the right-eye image sample) into the convolutional neural network to be trained.
  • the image samples input to the convolutional neural network of the present disclosure may always be left-eye image samples of binocular image samples, or may always be right-eye image samples of binocular image samples.
  • if the image sample input to the convolutional neural network is always the left-eye image sample of the binocular image samples, the successfully trained convolutional neural network will treat the input environment image as the left-eye image in testing or actual application scenarios.
  • if the image sample input to the convolutional neural network is always the right-eye image sample of the binocular image samples, the successfully trained convolutional neural network will treat the input environment image as the right-eye image in testing or actual application scenarios.
  • S2210 Perform disparity analysis processing via a convolutional neural network, and obtain a disparity map of the left-eye image sample and a disparity map of the right-eye image sample based on the output of the convolutional neural network.
  • S2220 Reconstruct the right-eye image according to the disparity map of the left-eye image sample and the right-eye image sample.
  • the method of reconstructing the right-eye image in the present disclosure includes but is not limited to: performing reprojection calculation on the disparity map of the left-eye image sample and the right-eye image sample to obtain the reconstructed right-eye image.
  • S2230 Reconstruct the left-eye image according to the disparity map of the right-eye image sample and the left-eye image sample.
  • the method of reconstructing the left-eye image in the present disclosure includes but is not limited to: performing re-projection calculation on the right-eye image sample and the disparity map of the left-eye image sample to obtain the reconstructed left-eye image.
  • S2240 Adjust the network parameters of the convolutional neural network according to the difference between the reconstructed left-eye image and the left-eye image sample, and the difference between the reconstructed right-eye image and the right-eye image sample.
  • the loss function used in the present disclosure when determining the difference includes, but is not limited to: L1 loss function, smooth loss function, lr-Consistency loss function, etc.
  • the present disclosure back-propagates the calculated loss to adjust the network parameters of the convolutional neural network (such as the weights of the convolution kernels); for example, the gradients calculated by chain-rule differentiation through the convolutional neural network can be used to back-propagate the loss, which helps to improve the training efficiency of the convolutional neural network.
  • the predetermined iterative condition in the present disclosure may include: the difference between the left-eye image reconstructed based on the disparity map output by the convolutional neural network and the left-eye image sample, and the difference between the right-eye image reconstructed based on the disparity map output by the convolutional neural network and the right-eye image sample, meet a predetermined difference requirement. If the differences meet the requirement, this training of the convolutional neural network is successfully completed.
  • the predetermined iterative condition in the present disclosure may also include: the number of binocular image samples used for training the convolutional neural network reaches a predetermined number requirement, etc.
  • if the number of binocular image samples used reaches the predetermined number requirement, but the difference between the reconstructed left-eye image and the left-eye image sample, or the difference between the reconstructed right-eye image and the right-eye image sample, does not meet the predetermined difference requirement, this training of the convolutional neural network is not successful.
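A minimal PyTorch-style sketch of one training step is shown below. The network interface (a single call returning both disparity maps), the pairing of disparity maps and views in the warping, and the sign convention are assumptions; only the L1 reconstruction term is shown, and the smoothness and left-right consistency losses mentioned above are omitted for brevity.

```python
import torch
import torch.nn.functional as F

def warp_horizontally(img, disp):
    """Re-projection: sample `img` at x - disp(x), reconstructing the other view of a rectified pair."""
    b, _, h, w = img.shape
    xs = torch.linspace(-1, 1, w, device=img.device).view(1, 1, w).expand(b, h, w)
    ys = torch.linspace(-1, 1, h, device=img.device).view(1, h, 1).expand(b, h, w)
    grid = torch.stack((xs - 2.0 * disp.squeeze(1) / w, ys), dim=3)
    return F.grid_sample(img, grid, align_corners=True)

def train_step(net, optimizer, left, right):
    """One training step: predict both disparity maps, reconstruct both views,
    and minimize the L1 reconstruction differences (steps S2200-S2240, reduced to the L1 term)."""
    disp_left, disp_right = net(left)                 # assumed network interface
    left_rec = warp_horizontally(right, disp_left)    # reconstructed left-eye image
    right_rec = warp_horizontally(left, -disp_right)  # reconstructed right-eye image
    loss = F.l1_loss(left_rec, left) + F.l1_loss(right_rec, right)
    optimizer.zero_grad()
    loss.backward()                                   # chain-rule gradients back-propagate the loss
    optimizer.step()
    return loss.item()
```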
  • FIG. 23 is a flowchart of an embodiment of the intelligent driving control method of the present disclosure.
  • the intelligent driving control method of the present disclosure can be applied but not limited to: an automatic driving (such as a fully unassisted automatic driving) environment or an assisted driving environment.
  • S2300 Acquire an environment image of the smart device during the movement process through an image acquisition device set on the smart device.
  • the image acquisition device includes but is not limited to: an RGB-based camera device and the like.
  • S2310 Perform obstacle detection on the acquired environment image, and determine the obstacle detection result.
  • the specific implementation process of this step please refer to the description of FIG. 1 in the foregoing method implementation, which is not described in detail here.
  • S2320 Generate and output a control instruction according to the obstacle detection result.
  • the control instructions generated by the present disclosure include, but are not limited to: speed keeping control instructions, speed adjustment control instructions (such as deceleration instructions and acceleration instructions), direction keeping control instructions, direction adjustment control instructions (such as left steering instructions, right steering instructions, left lane merging instructions, or right lane merging instructions), whistle instructions, warning prompt control instructions, or driving mode switching control instructions (such as switching to automatic cruise driving mode).
  • the obstacle detection technology of the present disclosure can be applied in the field of intelligent driving control as well as in other fields; for example, it can realize obstacle detection in industrial manufacturing, obstacle detection in indoor settings such as supermarkets, obstacle detection in the security field, and so on.
  • the present disclosure does not limit the application scenarios of the obstacle detection technology.
  • FIG. 24 is a schematic structural diagram of an embodiment of the obstacle detection device of the present disclosure.
  • the device in FIG. 24 includes: an acquisition module 2400, a first determination module 2410, a clustering module 2420, a second determination module 2430, and a training module 2440.
  • the obtaining module 2400 is used to obtain the first disparity map of the environment image.
  • the environment image is an image that characterizes the spatial environment information of the smart device during the movement.
  • the environmental image includes a monocular image.
  • the obtaining module 2400 may include: a first sub-module, a second sub-module, and a third sub-module.
  • the first sub-module is used to perform disparity analysis processing on the monocular image by using the convolutional neural network, and to obtain the first disparity map of the monocular image based on the output of the convolutional neural network; the convolutional neural network is obtained by training with binocular image samples.
  • the second sub-module is used to perform mirror image processing on the monocular image to obtain a first mirror image and obtain a disparity map of the first mirror image.
  • the third sub-module is configured to perform disparity adjustment on the first disparity map of the monocular image according to the disparity map of the first mirror image to obtain the first disparity map after the disparity adjustment.
  • the third sub-module may include: a first unit and a second unit.
  • the first unit is used to perform mirror processing on the disparity map of the first mirror image to obtain the second mirror image.
  • the second unit is configured to perform disparity adjustment on the first disparity map according to the weight distribution map of the first disparity map and the weight distribution map of the second mirror image to obtain the first disparity map after the disparity adjustment.
  • the weight distribution map of the first disparity map includes the weight values corresponding to the multiple disparity values in the first disparity map;
  • the weight distribution map of the second mirror image includes the weight values corresponding to the multiple disparity values in the second mirror image.
  • the weight distribution diagram in the present disclosure includes: at least one of a first weight distribution diagram and a second weight distribution diagram.
  • the first weight distribution map is a weight distribution map set uniformly for multiple environmental images; the second weight distribution map is a weight distribution map set separately for different environmental images.
  • the first weight distribution map includes at least two regions arranged left and right, and different regions have different weight values.
  • in the case where the monocular image is used as the left-eye image: for the first weight distribution map of the first disparity map, the weight value of a region on the right is not less than the weight value of a region on the left, and within a region, the weight value of the left part is not greater than the weight value of the right part; the same holds for the first weight distribution map of the second mirror image.
  • in the case where the monocular image is used as the right-eye image: for the first weight distribution map of the first disparity map, the weight value of a region on the left is not less than the weight value of a region on the right, and within a region, the weight value of the right part is not greater than the weight value of the left part; the same holds for the first weight distribution map of the second mirror image.
  • the third submodule may further include: a third unit configured to set the second weight distribution map of the first disparity map. Specifically, the third unit performs mirror processing on the first disparity map to form a mirrored disparity map, and sets the weight values in the second weight distribution map of the first disparity map according to the disparity values in the mirrored disparity map of the first disparity map. For example, for a pixel at any position in the mirrored disparity map, if the disparity value of the pixel at that position satisfies the first predetermined condition, the third unit sets the weight value of the pixel at that position in the second weight distribution map of the first disparity map to the first value.
  • otherwise, the third unit may set the weight value of the pixel at that position in the second weight distribution map of the first disparity map to the second value, where the first value is greater than the second value.
  • the first predetermined condition may include: the disparity value of the pixel at the position is greater than the first reference value of the pixel at the position.
  • the first reference value of the pixel at the position is set according to the disparity value of the pixel at the position in the first disparity map and a constant value greater than zero.
  • the third sub-module may further include: a fourth unit.
  • the fourth unit is used to set the second weight distribution map of the second mirror image.
  • for example, the fourth unit sets the weight values in the second weight distribution map of the second mirror image according to the disparity values in the first disparity map. More specifically, for a pixel at any position in the second mirror image, if the disparity value of the pixel at that position in the first disparity map meets the second predetermined condition, the fourth unit sets the weight value of the pixel at that position in the second weight distribution map of the second mirror image to the third value.
  • otherwise, the fourth unit sets the weight value of the pixel at that position in the second weight distribution map of the second mirror image to the fourth value, where the third value is greater than the fourth value.
  • the second predetermined condition includes: the disparity value of the pixel at the position in the first disparity map is greater than the second reference value of the pixel at the position; wherein the second reference value of the pixel at the position is It is set according to the disparity value of the pixel at the position in the mirror disparity map of the first disparity map and a constant value greater than zero.
  • the second unit may adjust the disparity values in the first disparity map according to the first weight distribution map and the second weight distribution map of the first disparity map, and adjust the disparity values in the second mirror image according to the first weight distribution map and the second weight distribution map of the second mirror image; the second unit then merges the disparity-adjusted first disparity map and the disparity-adjusted second mirror image, and finally obtains the disparity-adjusted first disparity map.
  • the first determining module 2410 is configured to determine multiple obstacle pixel regions in the first disparity map of the environment image.
  • the first determining module 2410 may include: a fourth sub-module, a fifth sub-module, and a sixth sub-module.
  • the fourth sub-module is used to perform edge detection on the first disparity map of the environment image to obtain obstacle edge information.
  • the fifth sub-module is used to determine the obstacle area in the first disparity map of the environment image; the sixth sub-module is used to determine a plurality of obstacle pixel columnar areas in the obstacle area of the first disparity map according to the obstacle edge information.
  • the fifth sub-module may include: a fifth unit, a sixth unit, a seventh unit, and an eighth unit.
  • the fifth unit is used to perform statistical processing on the disparity value of each row of pixels in the first disparity map to obtain statistical information of the disparity value of each row of pixels.
  • the sixth unit is used to determine the statistical disparity map based on the statistical information of the disparity value of each row of pixels; the seventh unit is used to perform the first straight line fitting processing on the statistical disparity map, and according to the first straight line fitting processing The result of determining the ground area and the non-ground area; the eighth unit is used to determine the obstacle area according to the non-ground area.
  • the non-ground area includes: the first area above the ground.
  • the non-ground area includes: the first area above the ground and the second area below the ground.
  • the eighth unit may perform the second straight line fitting process on the statistical disparity map and, according to the result of the second straight line fitting process, determine the first target area in the first area whose height above the ground is less than the first predetermined height value, where the first target area is an obstacle area; in the case where there is a second area lower than the ground in the non-ground area, the eighth unit determines the second target area in the second area whose height below the ground is greater than the second predetermined height value.
  • the second target area is the obstacle area.
  • for example, the sixth sub-module may set the disparity values of the pixels of the non-obstacle area in the first disparity map and the disparity values of the pixels at the obstacle edge information to a predetermined value;
  • then, with N pixels in the column direction of the first disparity map as the traversal unit, the disparity values of the N pixels on each row are traversed starting from a set row of the first disparity map, and the target rows at which the disparity value of a pixel jumps between the predetermined value and a non-predetermined value are determined; with N pixels in the column direction as the column width, and the determined target rows as the boundaries of the obstacle pixel columnar area in the row direction, the sixth sub-module determines the obstacle pixel columnar areas in the obstacle area.
  • the clustering module 2420 is used to perform clustering processing on a plurality of obstacle pixel regions to obtain at least one cluster.
  • the clustering module 2420 may perform clustering processing on a plurality of obstacle pixel columnar regions.
  • the clustering module 2420 may include a seventh sub-module and an eighth sub-module.
  • the seventh sub-module is used to determine the spatial position information of a plurality of obstacle pixel columnar regions.
  • the eighth sub-module is used to perform clustering processing on the multiple obstacle pixel columnar regions according to the spatial position information of the multiple obstacle pixel columnar regions.
  • for example, the eighth sub-module determines the attribute information of the obstacle pixel columnar area according to the pixels contained in the obstacle pixel columnar area, and determines the spatial position information of the obstacle pixel columnar area according to the attribute information of the obstacle pixel columnar area.
  • the attribute information of the obstacle pixel columnar area may include at least one of pixel columnar area bottom information, pixel columnar area top information, pixel columnar area disparity value, and pixel columnar area column information.
  • the spatial position information of the obstacle pixel columnar area may include the coordinates of the obstacle pixel columnar area on the horizontal coordinate axis, and the coordinate of the obstacle pixel columnar area on the depth coordinate axis.
  • the spatial position information of the obstacle pixel columnar area may further include: the coordinates of the highest point of the obstacle pixel columnar area on the vertical coordinate axis, and the coordinates of the lowest point of the obstacle pixel columnar area on the vertical coordinate axis; where The coordinates of the highest point and the lowest point are used to determine the height of the obstacle.
  • the second determining module 2430 is configured to determine the obstacle detection result according to the obstacle pixel areas belonging to the same cluster.
  • the second determining module may include: at least one of a ninth sub-module and a tenth sub-module.
  • the ninth sub-module is used to determine the obstacle detection frame in the environment image according to the spatial position information of the columnar area of obstacle pixels belonging to the same cluster.
  • the tenth sub-module is used to determine the spatial position information of the obstacle according to the spatial position information of the columnar area of obstacle pixels belonging to the same cluster.
  • the tenth sub-module can determine, according to the spatial position information of the multiple obstacle pixel columnar areas belonging to the same cluster, the distances between the multiple obstacle pixel columnar areas and the camera device that generates the environment image, and then determine the spatial position information of the obstacle based on the spatial position information of the obstacle pixel columnar area closest to the camera device.
  • the training module 2440 is used to train the convolutional neural network. For example, the training module 2440 inputs one of the image samples of a binocular image sample into the convolutional neural network to be trained, performs disparity analysis processing through the convolutional neural network, and obtains the disparity map of the left-eye image sample and the disparity map of the right-eye image sample based on the output of the convolutional neural network; the training module 2440 reconstructs the right-eye image according to the disparity map of the left-eye image sample and the right-eye image sample, and reconstructs the left-eye image according to the disparity map of the right-eye image sample and the left-eye image sample; the training module 2440 adjusts the network parameters of the convolutional neural network according to the difference between the reconstructed left-eye image and the left-eye image sample and the difference between the reconstructed right-eye image and the right-eye image sample.
  • for the specific operations performed by the training module 2440, please refer to the above description of FIG. 22, which is not described in detail here.
  • FIG. 25 is a schematic structural diagram of an embodiment of the intelligent driving control device of the present disclosure.
  • the device in FIG. 25 includes: an acquisition module 2500, an obstacle detection device 2510, and a control module 2520.
  • the acquiring module 2500 is configured to acquire an environmental image of the smart device during the movement process through the image acquisition device set on the smart device.
  • the obstacle detection device 2510 is used to perform obstacle detection on the environment image and determine the obstacle detection result.
  • the control module 2520 is used to generate and output vehicle control instructions according to the obstacle detection result.
  • FIG. 26 shows an exemplary device 2600 suitable for implementing the present disclosure.
  • the device 2600 may be a control system/electronic system configured in a car, a mobile terminal (for example, a smart mobile phone), a personal computer (PC, for example, a desktop or notebook computer), a tablet, a server, or the like.
  • the device 2600 includes one or more processors, a communication part, and the like; the one or more processors may be one or more central processing units (CPU) 2601, and/or one or more graphics processors (GPU) 2613 for visual processing by a neural network, etc.
  • the processor can execute various appropriate actions and processing according to executable instructions stored in a read-only memory (ROM) 2602 or executable instructions loaded from the storage portion 2608 into a random access memory (RAM) 2603.
  • the communication unit 2612 may include but is not limited to a network card, and the network card may include but is not limited to an IB (Infiniband) network card.
  • the processor can communicate with the read-only memory 2602 and/or the random access memory 2603 to execute executable instructions, connect to the communication unit 2612 via the bus 2604, and communicate with other target devices via the communication unit 2612, thereby completing the corresponding in this disclosure step.
  • RAM 2603 can also store various programs and data required for device operation.
  • the CPU 2601, the ROM 2602, and the RAM 2603 are connected to each other through a bus 2604.
  • ROM2602 is an optional module.
  • the RAM 2603 stores executable instructions, or writes executable instructions into the ROM 2602 during operation, and the executable instructions cause the central processing unit 2601 to execute the steps included in the above method.
  • An input/output (I/O) interface 2605 is also connected to the bus 2604.
  • the communication unit 2612 may be integrated, or may be configured to have multiple sub-modules (for example, multiple IB network cards) and be connected to the bus respectively.
  • the following components are connected to the I/O interface 2605: an input part 2606 including a keyboard, a mouse, etc.; an output part 2607 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, etc.; a storage part 2608 including a hard disk, etc.; and a communication part 2609 including a network interface card such as a LAN card, a modem, etc.
  • the communication section 2609 performs communication processing via a network such as the Internet.
  • the driver 2610 is also connected to the I/O interface 2605 as needed.
  • a removable medium 2611, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is installed on the drive 2610 as required, so that the computer program read from it is installed into the storage portion 2608 as needed.
  • FIG. 26 is only an optional implementation. In specific practice, the number and types of components in FIG. 26 can be selected, deleted, added or replaced according to actual needs; for different functional components, separate or integrated settings can also be used. For example, the GPU 2613 and the CPU 2601 can be set separately, or the GPU 2613 can be integrated on the CPU 2601, and the communication part can be set separately or integrated on the CPU 2601 or the GPU 2613, etc. These alternative embodiments all fall within the protection scope of the present disclosure.
  • the process described below with reference to the flowcharts can be implemented as a computer software program.
  • the embodiments of the present disclosure include a computer program product, which includes a computer program tangibly contained on a machine-readable medium.
  • the computer program includes program code for executing the steps shown in the flowchart.
  • the program code may include instructions corresponding to the steps in the method provided by the present disclosure.
  • the computer program may be downloaded and installed from the network through the communication part 2609, and/or installed from the removable medium 2611.
  • when the computer program is executed by the central processing unit (CPU) 2601, the instructions described in the present disclosure for implementing the above-mentioned corresponding steps are executed.
  • the embodiments of the present disclosure also provide a computer program product for storing computer-readable instructions which, when executed, cause a computer to execute the obstacle detection method or the intelligent driving control method described in any of the foregoing embodiments.
  • the computer program product can be specifically implemented by hardware, software or a combination thereof.
  • the computer program product is specifically embodied as a computer storage medium.
  • the computer program product may be specifically embodied as a software product, such as a software development kit (SDK), etc.
  • the embodiments of the present disclosure also provide another obstacle detection method and intelligent driving control method and corresponding devices and electronic equipment, computer storage media, computer programs, and computer program products.
  • the method includes: the first device sends an obstacle detection instruction or an intelligent driving control instruction to a second device, and the instruction causes the second device to execute the obstacle detection method or the intelligent driving control method in any of the foregoing possible embodiments; A device receives the obstacle detection result or the intelligent driving control result sent by the second device.
  • the obstacle detection instruction or the intelligent driving control instruction may specifically be a calling instruction;
  • the first device may instruct, by way of calling, the second device to perform the obstacle detection operation or the intelligent driving control operation; in response to receiving the calling instruction, the second device may execute the steps and/or processes in any embodiment of the obstacle detection method or the intelligent driving control method described above.
  • any component, data, or structure mentioned in the present disclosure can generally be understood as one or more unless it is clearly defined or the context gives opposite enlightenment. It should also be understood that the description of the various embodiments in the present disclosure emphasizes the differences between the various embodiments, and the same or similarities can be referred to each other, and for the sake of brevity, the details are not repeated one by one.
  • the method and apparatus, electronic equipment, and computer-readable storage medium of the present disclosure may be implemented in many ways.
  • the method and apparatus, electronic equipment, and computer-readable storage medium of the present disclosure can be implemented by software, hardware, firmware or any combination of software, hardware, and firmware.
  • the present disclosure can also be implemented as programs recorded in a recording medium, and these programs include machine-readable instructions for implementing the method according to the present disclosure.
  • the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
  • the description of the present disclosure is given for the sake of example and description, and is not exhaustive, nor does it limit the present disclosure to the disclosed forms. Many modifications and changes are obvious to those of ordinary skill in the art. The embodiments were selected and described in order to better explain the principles and practical applications of the present disclosure, and to enable those of ordinary skill in the art to understand the present disclosure so as to design various embodiments with various modifications suitable for specific purposes.

Abstract

Embodiments of the present disclosure disclose an obstacle detection method and device, an intelligent driving control method and device, an electronic device, a computer-readable storage medium, and a computer program. The obstacle detection method includes: acquiring a first disparity map of an environment image, where the environment image is an image representing the spatial environment information of a smart device during movement; determining a plurality of obstacle pixel regions in the first disparity map; performing clustering processing on the plurality of obstacle pixel regions to obtain at least one cluster; and determining an obstacle detection result according to the obstacle pixel regions belonging to the same cluster.

Description

Obstacle detection method, intelligent driving control method, device, medium and equipment
The present disclosure claims priority to the Chinese patent application with application number 201910566416.2 and invention title "Obstacle detection method, intelligent driving control method, device, medium and equipment", filed with the China Patent Office on June 27, 2019, the entire contents of which are incorporated into the present disclosure by reference.
Technical Field
The present disclosure relates to computer vision technology, and in particular to an obstacle detection method, an obstacle detection device, an intelligent driving control method, an intelligent driving control device, an electronic device, a computer-readable storage medium, and a computer program.
Background
In the field of computer vision technology, perception technology is usually used to perceive obstacles in the external environment; that is, perception technology includes obstacle detection.
The perception result of the perception technology is usually provided to a decision-making layer, so that the decision-making layer makes decisions based on the perception result. For example, in an intelligent driving system, the perception layer provides the decision-making layer with the perceived information about the road on which the vehicle is located and the obstacles around the vehicle, so that the decision-making layer makes driving decisions to avoid the obstacles and ensure the safe driving of the vehicle. In the related art, obstacle types are generally predefined, for example, obstacles with inherent shapes, textures and colors such as pedestrians, vehicles and non-motor vehicles, and obstacles of the predefined types are then detected by using a corresponding detection algorithm.
Summary
Embodiments of the present disclosure provide technical solutions for obstacle detection and intelligent driving control.
According to a first aspect of the embodiments of the present disclosure, an obstacle detection method is provided, including: acquiring a first disparity map of an environment image, where the environment image is an image representing the spatial environment information of a smart device during movement; determining a plurality of obstacle pixel regions in the first disparity map of the environment image; performing clustering processing on the plurality of obstacle pixel regions to obtain at least one cluster; and determining an obstacle detection result according to the obstacle pixel regions belonging to the same cluster.
According to a second aspect of the embodiments of the present disclosure, an intelligent driving control method is provided, including: acquiring an environment image of a smart device during movement through an image acquisition device provided on the smart device; performing obstacle detection on the acquired environment image by using the above obstacle detection method to determine an obstacle detection result; and generating and outputting a control instruction according to the obstacle detection result.
According to a third aspect of the embodiments of the present disclosure, an obstacle detection device is provided, including: an acquisition module, configured to acquire a first disparity map of an environment image, where the environment image is an image representing the spatial environment information of a smart device during movement; a first determination module, configured to determine a plurality of obstacle pixel regions in the first disparity map of the environment image; a clustering module, configured to perform clustering processing on the plurality of obstacle pixel regions to obtain at least one cluster; and a second determination module, configured to determine an obstacle detection result according to the obstacle pixel regions belonging to the same cluster.
According to a fourth aspect of the embodiments of the present disclosure, an intelligent driving control device is provided, including: an acquisition module, configured to acquire an environment image of a smart device during movement through an image acquisition device provided on the smart device; the above obstacle detection device, configured to perform obstacle detection on the environment image and determine an obstacle detection result; and a control module, configured to generate and output a control instruction according to the obstacle detection result.
According to a fifth aspect of the embodiments of the present disclosure, an electronic device is provided, including: a memory for storing a computer program; and a processor for executing the computer program stored in the memory, where any method embodiment of the present disclosure is implemented when the computer program is executed.
According to a sixth aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored, and when the computer program is executed by a processor, any method embodiment of the present disclosure is implemented.
According to a seventh aspect of the embodiments of the present disclosure, a computer program is provided, including computer instructions, and when the computer instructions run in a processor of a device, any method embodiment of the present disclosure is implemented.
Based on the obstacle detection method and device, the intelligent driving control method and device, the electronic device, the computer-readable storage medium and the computer program provided by the present disclosure, the present disclosure can determine a plurality of obstacle pixel regions from the first disparity map of the environment image and obtain the obstacle detection result by clustering the plurality of obstacle pixel regions. With the detection approach adopted by the present disclosure, there is no need to predefine the obstacles to be detected, and obstacles can be detected directly by clustering the obstacle regions without using predefined information such as the texture, color, shape or category of the obstacles. Moreover, the detected obstacles are not limited to certain predefined obstacles, so that the various obstacles in the surrounding space environment that may hinder the movement of the smart device (which may be referred to as general-type obstacles in the present disclosure) can be detected, thereby realizing the detection of general-type obstacles.
Compared with the approach in the related art of detecting obstacles of predefined types, the technical solution provided by the present disclosure is a more general obstacle detection solution, which is applicable to the detection of general-type obstacles and helps to cope with the diverse types of obstacles in real environments. Moreover, for a smart device, the technical solution provided by the present disclosure enables the detection of the diverse and random obstacles that may appear during driving, and control instructions for the driving process can then be output based on the detection result, which helps to improve the safety of intelligent driving of the vehicle. The technical solution of the present disclosure is further described in detail below with reference to the drawings and embodiments.
Brief Description of the Drawings
The accompanying drawings, which constitute a part of the specification, describe embodiments of the present disclosure and, together with the description, serve to explain the principles of the present disclosure.
The present disclosure can be understood more clearly from the following detailed description with reference to the accompanying drawings, in which:
FIG. 1 is a flowchart of one embodiment of the obstacle detection method of the present disclosure;
FIG. 2 is a schematic diagram of one embodiment of the environment image of the present disclosure;
FIG. 3 is a schematic diagram of one embodiment of the first disparity map of FIG. 2;
FIG. 4 is a schematic diagram of one embodiment of the first disparity map of the present disclosure;
FIG. 5 is a schematic diagram of one embodiment of the convolutional neural network of the present disclosure;
FIG. 6 is a schematic diagram of one embodiment of the first weight distribution map of the first disparity map of the present disclosure;
FIG. 7 is a schematic diagram of another embodiment of the first weight distribution map of the first disparity map of the present disclosure;
FIG. 8 is a schematic diagram of one embodiment of the second weight distribution map of the first disparity map of the present disclosure;
FIG. 9 is a schematic diagram of one embodiment of the second mirror image of the present disclosure;
FIG. 10 is a schematic diagram of one embodiment of the second weight distribution map of the second mirror image shown in FIG. 9;
FIG. 11 is a schematic diagram of one embodiment of optimizing and adjusting the disparity map of a monocular image in the present disclosure;
FIG. 12 is a schematic diagram of one embodiment of the obstacle edge information in the first disparity map of the environment image of the present disclosure;
FIG. 13 is a schematic diagram of one embodiment of the statistical disparity map of the present disclosure;
FIG. 14 is a schematic diagram of one embodiment of forming the statistical disparity map of the present disclosure;
FIG. 15 is a schematic diagram of one embodiment of straight line fitting in the present disclosure;
FIG. 16 is a schematic diagram of the ground area and the non-ground area of the present disclosure;
FIG. 17 is a schematic diagram of one embodiment of the coordinate system established in the present disclosure;
FIG. 18 is a schematic diagram of the two regions contained in the first area above the ground in the present disclosure;
FIG. 19 is a schematic diagram of one embodiment of forming obstacle pixel columnar areas in the present disclosure;
FIG. 20 is a schematic diagram of one embodiment of clustering obstacle pixel columnar areas in the present disclosure;
FIG. 21 is a schematic diagram of one embodiment of forming obstacle detection frames in the present disclosure;
FIG. 22 is a flowchart of one embodiment of the convolutional neural network training method of the present disclosure;
FIG. 23 is a flowchart of one embodiment of the intelligent driving control method of the present disclosure;
FIG. 24 is a schematic structural diagram of one embodiment of the obstacle detection device of the present disclosure;
FIG. 25 is a schematic structural diagram of one embodiment of the intelligent driving control device of the present disclosure;
FIG. 26 is a block diagram of an exemplary device for implementing embodiments of the present disclosure.
Detailed Description
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that, unless otherwise specified, the relative arrangement of components and steps, the numerical expressions and the numerical values set forth in these embodiments do not limit the scope of the present disclosure.
It should also be understood that, for ease of description, the dimensions of the parts shown in the drawings are not drawn according to actual scale. The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the present disclosure or its application or use. Techniques, methods and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, such techniques, methods and devices should be regarded as part of the specification. It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further discussed in subsequent drawings.
Embodiments of the present disclosure can be applied to electronic devices such as terminal devices, computer systems and servers, which can operate together with numerous other general-purpose or special-purpose computing system environments or configurations. Examples of well-known terminal devices, computing systems, environments and/or configurations suitable for use with electronic devices such as terminal devices, computer systems and servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, minicomputer systems, mainframe computer systems, and distributed cloud computing technology environments including any of the above systems.
Electronic devices such as terminal devices, computer systems and servers can be described in the general context of computer-system-executable instructions (such as program modules) executed by a computer system. Generally, program modules may include routines, programs, target programs, components, logic, data structures and the like, which perform specific tasks or implement specific abstract data types. The computer system/server can be implemented in a distributed cloud computing environment, in which tasks are performed by remote processing devices linked through a communication network. In a distributed cloud computing environment, program modules may be located on local or remote computing system storage media including storage devices.
Exemplary Embodiments
FIG. 1 is a flowchart of one embodiment of the obstacle detection method of the present disclosure.
As shown in FIG. 1, the method of this embodiment includes steps S100, S110, S120 and S130, which are described in detail below.
S100: Acquire a first disparity map of an environment image, where the environment image is an image representing the spatial environment information of a smart device during movement.
Exemplarily, the smart device is, for example, an intelligent driving device (such as an autonomous vehicle), an intelligent flying device (such as a drone), an intelligent robot, or the like. The environment image is, for example, an image representing the road space environment in which an intelligent driving device or an intelligent robot is located during movement, or the space environment in which an intelligent flying device is located during flight. Of course, the smart device and the environment image in the present disclosure are not limited to the above examples, and the present disclosure does not limit this.
The present disclosure detects obstacles in the environment image, where any object in the surrounding space environment of the smart device that may hinder its movement may fall within the obstacle detection range and be regarded as an obstacle detection object. For example, during the driving of an intelligent driving device, objects such as stones, animals and fallen goods may appear on the road; these objects have no specific shape, texture, color or category and differ greatly from one another, but all of them can be regarded as obstacles. In the present disclosure, any object that may hinder the movement process in this way is referred to as a general-type obstacle.
In an optional example, the first disparity map of the present disclosure is used to describe the disparity of the environment image. Disparity can be regarded as the difference in the position of a target object when the same target object is observed from two points separated by a certain distance. An example of the environment image is shown in FIG. 2, and an example of the first disparity map of the environment image shown in FIG. 2 is shown in FIG. 3. Optionally, the first disparity map of the environment image in the present disclosure may also be expressed in the form shown in FIG. 4, where each number (such as 0, 1, 2, 3, 4 and 5) represents the disparity of the pixel at position (x, y) in the environment image. It should be noted in particular that FIG. 4 does not show a complete disparity map.
In an optional example, the environment image in the present disclosure may be a monocular image or a binocular image. A monocular image is usually an image captured by a monocular camera, and a binocular image is usually an image captured by a binocular camera. Optionally, both the monocular image and the binocular image may be photos or pictures, or video frames of a video. In the case where the environment image is a monocular image, the present disclosure can realize obstacle detection without providing a binocular camera, which helps to reduce the cost of obstacle detection.
In an optional implementation, in the case where the environment image is a monocular image, the present disclosure may use a successfully pre-trained convolutional neural network to obtain the first disparity map of the monocular image. For example, the monocular image is input into the convolutional neural network, the convolutional neural network performs disparity analysis processing on the monocular image and outputs a disparity analysis result, and the present disclosure can obtain the first disparity map of the monocular image based on the disparity analysis result. By using the convolutional neural network to obtain the first disparity map of the monocular image, the first disparity map can be obtained without pixel-by-pixel disparity calculation using two images and without camera calibration, which helps to improve the convenience and real-time performance of obtaining the first disparity map.
In an optional example, the convolutional neural network in the present disclosure usually includes, but is not limited to, multiple convolution layers (Conv) and multiple deconvolution layers (Deconv). The convolutional neural network of the present disclosure can be divided into two parts, namely an encoding part and a decoding part. The monocular image input into the convolutional neural network (such as the monocular image shown in FIG. 2) is encoded by the encoding part (i.e., feature extraction processing), the encoding result is provided to the decoding part, and the decoding part decodes the encoding result and outputs a decoding result. The present disclosure can obtain the first disparity map of the monocular image (such as the first disparity map shown in FIG. 3) according to the decoding result output by the convolutional neural network. Optionally, the encoding part in the convolutional neural network includes, but is not limited to, multiple convolution layers connected in series; the decoding part includes, but is not limited to, multiple convolution layers and multiple deconvolution layers arranged alternately and connected in series.
An optional example of the convolutional neural network in the present disclosure is shown in FIG. 5. In FIG. 5, the first rectangle on the left represents the monocular image input into the convolutional neural network, and the first rectangle on the right represents the disparity map output by the convolutional neural network. Each of the 2nd to 15th rectangles from the left represents a convolution layer, and all rectangles from the 16th on the left to the 2nd on the right represent alternately arranged deconvolution layers and convolution layers; for example, the 16th rectangle on the left represents a deconvolution layer, the 17th a convolution layer, the 18th a deconvolution layer, the 19th a convolution layer, and so on, up to the 2nd rectangle on the right, which represents a deconvolution layer.
In an optional example, the convolutional neural network of the present disclosure may fuse low-level information and high-level information in the network by means of skip connections. For example, the output of at least one convolution layer in the encoding part is provided to at least one deconvolution layer in the decoding part through a skip connection. Optionally, the input of each convolution layer in the convolutional neural network usually includes the output of the previous layer (such as a convolution layer or a deconvolution layer), and the input of at least one deconvolution layer (such as some or all deconvolution layers) includes the upsampling result of the output of the previous convolution layer and the output of the convolution layer in the encoding part that is skip-connected to this deconvolution layer. For example, the content pointed to by the solid arrow drawn from below a convolution layer on the right side of FIG. 5 represents the output of that convolution layer, the dashed arrows in FIG. 5 represent the upsampling results provided to the deconvolution layers, and the solid arrows drawn from above the convolution layers on the left side of FIG. 5 represent the outputs of the convolution layers skip-connected to the deconvolution layers. The present disclosure does not limit the number of skip connections or the network structure of the convolutional neural network. By fusing low-level and high-level information in the convolutional neural network, the present disclosure helps to improve the accuracy of the disparity map generated by the convolutional neural network. Optionally, the convolutional neural network of the present disclosure is obtained by training with binocular image samples; the training process of the convolutional neural network can be found in the description of the following embodiments and is not described in detail here.
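The encoder-decoder structure with skip connections described above can be illustrated by the following toy PyTorch sketch. The layer counts, channel widths and kernel sizes are arbitrary illustrative choices and are much smaller than the network described for FIG. 5; only the skip-connection pattern (concatenating low-level encoder features into the decoder) is the point being shown.

```python
import torch
import torch.nn as nn

class DispNet(nn.Module):
    """Toy encoder-decoder with a skip connection: convolution layers encode the monocular image,
    alternating deconvolution/convolution layers decode it into a single-channel disparity map."""
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.dec2 = nn.Sequential(nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU())
        self.conv2 = nn.Sequential(nn.Conv2d(64, 32, 3, padding=1), nn.ReLU())  # takes the skip from enc1
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU())
        self.head = nn.Conv2d(16, 1, 3, padding=1)

    def forward(self, x):
        e1 = self.enc1(x)                            # low-level features
        e2 = self.enc2(e1)                           # high-level features
        d2 = self.dec2(e2)                           # upsampled decoding
        d2 = self.conv2(torch.cat([d2, e1], dim=1))  # skip connection fuses low- and high-level info
        return self.head(self.dec1(d2))              # predicted disparity map
```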
In an optional implementation, the present disclosure may also optimize and adjust the first disparity map of the environment image obtained by the convolutional neural network, so as to obtain a more accurate first disparity map. Optionally, in the case where the environment image is a monocular image, the present disclosure may use the disparity map of the mirror image of the monocular image to optimize and adjust the first disparity map of the monocular image, so that a plurality of obstacle pixel regions can be determined in the disparity-adjusted first disparity map. For ease of description, the mirror image of the monocular image is hereinafter referred to as the first mirror image, and the disparity map of the first mirror image is referred to as the second disparity map. Exemplarily, the monocular image in the environment image may be mirrored to obtain the first mirror image, the disparity map of the first mirror image may be acquired, and then the first disparity map of the monocular image may be disparity-adjusted according to the disparity map of the first mirror image to obtain the disparity-adjusted first disparity map; subsequently, a plurality of obstacle pixel regions can be determined in the disparity-adjusted first disparity map. A specific example of optimizing and adjusting the first disparity map is as follows:
Step A: Acquire the second disparity map of the first mirror image of the monocular image, and acquire the mirror image of the second disparity map.
Optionally, the first mirror image of the monocular image in the present disclosure may be a mirror image formed by mirroring the monocular image in the horizontal direction (such as left mirroring or right mirroring). The mirror image of the second disparity map in the present disclosure may be a mirror image formed by mirroring the second disparity map in the horizontal direction (such as left mirroring or right mirroring); the mirror image of the second disparity map is still a disparity map. The present disclosure may first perform left or right mirroring on the monocular image (since the left mirroring result is the same as the right mirroring result, either may be used) to obtain the first mirror image of the monocular image (a left mirror image or a right mirror image); then acquire the disparity map of the first mirror image, thereby obtaining the second disparity map; and finally perform left or right mirroring on the second disparity map (again, since the two results are the same, either may be used) to obtain the mirror image of the second disparity map (a left mirror image or a right mirror image), which is still a disparity map. For ease of description, the mirror image of the second disparity map is hereinafter referred to as the second mirror image.
As can be seen from the above description, when mirroring the monocular image, the present disclosure may disregard whether the monocular image is mirrored as a left-eye image or as a right-eye image. That is, regardless of whether the monocular image is used as the left-eye image or the right-eye image, the present disclosure may perform left or right mirroring on the monocular image to obtain the first mirror image. Similarly, when mirroring the second disparity map, the present disclosure may also disregard whether left or right mirroring is performed on the second disparity map. It should be noted that, in the process of training the convolutional neural network for generating the disparity map of the monocular image, if the left-eye image samples of the binocular image samples are used as the input for training, the successfully trained convolutional neural network will treat the input monocular image as a left-eye image during testing and in practical applications; if the right-eye image samples of the binocular image samples are used as the input for training, the successfully trained convolutional neural network will treat the input monocular image as a right-eye image during testing and in practical applications.
Optionally, the present disclosure may likewise use the above convolutional neural network to obtain the second disparity map. For example, the first mirror image is input into the convolutional neural network, the convolutional neural network performs disparity analysis processing on the first mirror image and outputs a disparity analysis result, and the present disclosure can obtain the second disparity map based on the output disparity analysis result.
Step B: Acquire the weight distribution map of the first disparity map of the monocular image and the weight distribution map of the second mirror image.
In an optional example, the weight distribution map of the first disparity map is used to describe the weight values corresponding to multiple disparity values (e.g., all disparity values) in the first disparity map. The weight distribution map of the first disparity map may include, but is not limited to, a first weight distribution map of the first disparity map and a second weight distribution map of the first disparity map. Optionally, the first weight distribution map of the first disparity map is a weight distribution map set uniformly for the disparity maps of multiple different monocular images, i.e., the first disparity maps of different monocular images use the same first weight distribution map; therefore, the present disclosure may refer to the first weight distribution map of the first disparity map as the global weight distribution map of the first disparity map, which describes the global weight values corresponding to the multiple disparity values (e.g., all disparity values) in the first disparity map. Optionally, the second weight distribution map of the first disparity map is a weight distribution map set for the first disparity map of a single monocular image, i.e., the first disparity maps of different monocular images use different second weight distribution maps; therefore, the present disclosure may refer to the second weight distribution map of the first disparity map as the local weight distribution map of the first disparity map, which describes the local weight values corresponding to the multiple disparity values (e.g., all disparity values) in the first disparity map.
In an optional example, the weight distribution map of the second mirror image is used to describe the weight values corresponding to multiple disparity values in the second mirror image. The weight distribution map of the second mirror image may include, but is not limited to, a first weight distribution map of the second mirror image and a second weight distribution map of the second mirror image. Optionally, the first weight distribution map of the second mirror image is a weight distribution map set uniformly for the second mirror images of multiple different monocular images, i.e., the second mirror images of different monocular images use the same first weight distribution map; therefore, the present disclosure may refer to the first weight distribution map of the second mirror image as the global weight distribution map of the second mirror image, which describes the global weight values corresponding to the multiple disparity values (e.g., all disparity values) in the second mirror image. Optionally, the second weight distribution map of the second mirror image is a weight distribution map set for the second mirror image of a single monocular image, i.e., the second mirror images of different monocular images use different second weight distribution maps; therefore, the present disclosure may refer to the second weight distribution map of the second mirror image as the local weight distribution map of the second mirror image, which describes the local weight values corresponding to the multiple disparity values (e.g., all disparity values) in the second mirror image.
In an optional example, the first weight distribution map of the first disparity map includes at least two regions arranged left and right, and different regions have different weight values. Optionally, the magnitude relationship between the weight value of the region on the left and the weight value of the region on the right is usually related to whether the monocular image is used as a left-eye image or as a right-eye image.
For example, in the case where the monocular image is used as a left-eye image, for any two regions in the first weight distribution map of the first disparity map, the weight value of the region on the right is not less than the weight value of the region on the left. FIG. 6 shows the first weight distribution map of the first disparity map shown in FIG. 3, which is divided into five regions, namely region 1, region 2, region 3, region 4 and region 5 in FIG. 6. The weight value of region 5 is not less than that of region 4, the weight value of region 4 is not less than that of region 3, the weight value of region 3 is not less than that of region 2, and the weight value of region 2 is not less than that of region 1. In addition, any region in the first weight distribution map of the first disparity map may have the same weight value or different weight values within it; where a region has different weight values within it, the weight value of the left part of the region is usually less than or equal to that of the right part. Optionally, the weight value of region 1 in FIG. 6 may be 0, meaning that the disparity corresponding to region 1 in the first disparity map is completely unreliable; the weight value of region 2 may gradually increase from 0 towards 0.5 from left to right; the weight value of region 3 is 0.5; the weight value of region 4 may gradually increase from a value greater than 0.5 towards 1 from left to right; and the weight value of region 5 is 1, meaning that the disparity corresponding to region 5 in the first disparity map is completely reliable.
For another example, in the case where the monocular image is used as a right-eye image, for any two regions in the first weight distribution map of the first disparity map, the weight value of the region on the left is not less than the weight value of the region on the right. FIG. 7 shows the first weight distribution map of the disparity map of an image used as the right-eye image to be processed, which is divided into five regions, namely region 1, region 2, region 3, region 4 and region 5 in FIG. 7. The weight value of region 1 is not less than that of region 2, the weight value of region 2 is not less than that of region 3, the weight value of region 3 is not less than that of region 4, and the weight value of region 4 is not less than that of region 5. In addition, any region in the first weight distribution map of the first disparity map may have the same weight value or different weight values within it; where a region has different weight values within it, the weight value of the right part of the region is usually not greater than that of the left part. Optionally, the weight value of region 5 in FIG. 7 may be 0, meaning that the disparity corresponding to region 5 in the first disparity map is completely unreliable; the weight value of region 4 may gradually increase from 0 to 0.5 from right to left; the weight value of region 3 is 0.5; the weight value of region 2 may gradually increase from a value greater than 0.5 towards 1 from right to left; and the weight value of region 1 is 1, meaning that the disparity corresponding to region 1 in the first disparity map is completely reliable.
Optionally, the first weight distribution map of the second mirror image includes at least two regions arranged left and right, and different regions have different weight values. Optionally, the magnitude relationship between the weight value of the region on the left and that of the region on the right is usually related to whether the monocular image is used as a left-eye image or as a right-eye image.
For example, in the case where the monocular image is used as a left-eye image, for any two regions in the first weight distribution map of the second mirror image, the weight value of the region on the right is not less than the weight value of the region on the left. In addition, any region in the first weight distribution map of the second mirror image may have the same weight value or different weight values within it; where a region has different weight values within it, the weight value of the left part of the region is usually not greater than that of the right part.
For another example, in the case where the monocular image is used as a right-eye image, for any two regions in the first weight distribution map of the second mirror image, the weight value of the region on the left is not less than the weight value of the region on the right. In addition, any region in the first weight distribution map of the second mirror image may have the same weight value or different weight values within it; where a region has different weight values within it, the weight value of the right part of the region is usually not greater than that of the left part.
可选的,第一视差图的第二权重分布图的设置方式可以包括下述步骤:
首先,对第一视差图进行左/右镜像处理,形成镜像视差图。
其次,根据镜像视差图中的视差值,设置第一视差图的第二权重分布图中的权重值。
可选的,对于镜像视差图中的任一位置处的像素点而言,在该位置处的像素点的视差值满足第一预定条件的情况下,将第一视差图的第二权重分布图中在该位置处的像素点的权重值设置为第一值,在该像素点的视差值不满足第一预定条件的情况下,将第一视差图的第二权重分布图中在该位置处的像素点的权重值设置为第二值。例如,对于镜像视差图中的任一位置处的像素点而言,如果该位置处的像素点的视差值大于该位置处的像素点对应的第一参考值,则将第一视差图的第二权重分布图中的该位置处的像素点的权重值设置为第一值,否则,被设置为第二值。本公开中的第一值大于第二值。例如,第一值为1,第二值为0。可选的,第一视差图的第二权重分布图的一个例子如图8所示。图8中的白色区域的权重值均为1,表示该位置处的视差值完全可信。图8中的黑色区域的权重值为0,表示该位置处的视差值完全不可信。
可选的,本公开中的任一位置处的像素点对应的第一参考值可以是根据第一视差图中在该位置处的像素点的视差值以及大于零的常数值设置的。例如,将第一视差图中在该位置处的像素点的视差值与大于零的常数值的乘积,作为镜像视差图中的该位置处的像素点对应的第一参考值。
可选的,第一视差图的第二权重分布图可以使用下述公式(1)表示:
L l=1（当d′ l>d l·thresh1时）；否则L l=0          公式（1）
在上述公式（1）中，L l表示第一视差图的第二权重分布图；d′ l表示镜像视差图的相应位置处的像素点的视差值；d l表示第一视差图中的相应位置处的像素点的视差值；thresh1表示大于零的常数值，thresh1的取值范围可以为1.1-1.5，如thresh1=1.2或者thresh1=1.25等。
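作为公式（1）的一个示意性实现，可参考如下NumPy草图（其中函数名、thresh1的具体取值等均为示例性假设）：

```python
import numpy as np

def local_weight_map(d_l, thresh1=1.25):
    """按公式（1）计算第一视差图的第二（局部）权重分布图。
    d_l: 第一视差图，形状为 (H, W)。
    先对其做水平镜像得到镜像视差图，再逐像素比较：
    镜像视差值大于 d_l*thresh1 处权重为1（第一值），否则为0（第二值）。"""
    d_mirror = np.fliplr(d_l)                                  # 镜像视差图
    return (d_mirror > d_l * thresh1).astype(np.float32)
```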
在一个可选示例中,第二镜像图的第二权重分布图的设置方式可以为:根据第一视差图中的视差值,设置第二镜像图的第二权重分布图中的权重值。可选的,对于第二镜像图中的任一位置处的像素点而言,在第一视差图中该位置处的像素点的视差值满足第二预定条件,则将第二镜像图的第二权重分布图中该位置处的像素点的权重值设置为第三值。在第一视差图中该位置处的像素点的视差值不满足第二预定条件的情况下,将第二镜像图的第二权重分布图中该位置处的像素点的权重值设置为第四值;其中,第三值大于第四值。例如,对于第一视差图中的任一位置处的像素点而言,如果第一视差图中的该位置处的像素点的视差值大于该位置处的像素点对应的第二参考值,则将第二镜像图的第二权重分布图中的该位置处的像素点的权重值设置为第三值,否则,设置为第四值。可选的,本公开中的第三值大于第四值。例如,第三值为1,第四值为0。
可选的，本公开中的像素点对应的第二参考值可以是根据镜像视差图中的相应位置处的像素点的视差值以及大于零的常数值设置的。例如，先对第一视差图进行左/右镜像处理，形成镜像视差图，然后，将镜像视差图中的相应位置处的像素点的视差值与大于零的常数值的乘积，作为第一视差图中的相应位置处的像素点对应的第二参考值。
可选的,基于图2的环境图像,所形成的第二镜像图的一个例子如图9所示。图9所示的第二镜像图的第二权重分布图的一个例子如图10所示。图10中的白色区域的权重值均为1,表示该位置处的视差值完全可信。图10中的黑色区域的权重值为0,表示该位置处的视差值完全不可信。
可选的,第二镜像图的第二权重分布图可以使用下述公式(2)表示:
L l′=1（当d l>d′ l·thresh2时）；否则L l′=0          公式（2）
在上述公式（2）中，L l′表示第二镜像图的第二权重分布图；d′ l表示镜像视差图的相应位置处的像素点的视差值；d l表示第一视差图中的相应位置处的像素点的视差值；thresh2表示大于零的常数值，thresh2的取值范围可以为1.1-1.5，如thresh2=1.2或者thresh2=1.25等。
步骤C、根据单目图像的第一视差图的权重分布图、以及第二镜像图的权重分布图,对单目图像的第一视差图进行优化调整,优化调整后的视差图即为最终获得的单目图像的第一视差图。
在一个可选示例中，本公开可以利用第一视差图的第一权重分布图和第二权重分布图对第一视差图中的多个视差值进行调整，获得调整后的第一视差图；利用第二镜像图的第一权重分布图和第二权重分布图，对第二镜像图中的多个视差值进行调整，获得调整后的第二镜像图；之后，对调整后的第一视差图和调整后的第二镜像图进行合并处理，从而获得优化调整后的单目图像的第一视差图。
可选的,获得优化调整后的单目图像的第一视差图的一个例子如下:
首先,对第一视差图的第一权重分布图和第一视差图的第二权重分布图进行合并处理,获得第三权重分布图。第三权重分布图可以采用下述公式(3)表示:
W l=M l+L l·0.5          公式(3)
在公式(3)中,W l表示第三权重分布图;M l表示第一视差图的第一权重分布图;L l表示第一视差图的第二权重分布图;其中的0.5也可以变换为其他常数值。
其次,对第二镜像图的第一权重分布图和第二镜像图的第二权重分布图进行合并处理,获得第四权重分布图。第四权重分布图可以采用下述公式(4)表示:
W l'=M l'+L l'·0.5           公式(4)
在公式(4)中,W l'表示第四权重分布图,M l'表示第二镜像图的第一权重分布图;L l'表示第二镜像图的第二权重分布图;其中的0.5也可以变换为其他常数值。
再次,根据第三权重分布图调整第一视差图中的多个视差值,获得调整后的第一视差图。例如,针对第一视差图中的任一位置处的像素点的视差值而言,将该位置处的像素点的视差值替换为:该位置处的像素点的视差值与第三权重分布图中的相应位置处的像素点的权重值的乘积。在对第一视差图中的所有像素点均进行了上述替换处理后,获得调整后的第一视差图。
以及,根据第四权重分布图调整第二镜像图中的多个视差值,获得调整后的第二镜像图。例如,针对第二镜像图中的任一位置处的像素点的视差值而言,将该位置处的像素点的视差值替换为:该位置处的像素点的视差值与第四权重分布图中的相应位置处的像素点的权重值的乘积。在对第二镜像图中的所有像素点均进行了上述替换处理后,获得调整后的第二镜像图。
最后,合并调整后的第一视差图和调整后的第二镜像图,最终获得单目图像的第一视差图。最终获得的单目图像的第一视差图可以采用下述公式(5)表示:
d final=W l·d l+W l′·d′ l          公式（5）
在公式（5）中，d final表示最终获得的单目图像的第一视差图（如图11中的右侧第1幅图所示）；W l表示第三权重分布图（如图11中的左上第1幅图所示）；W l′表示第四权重分布图（如图11中的左下第1幅图所示）；d l表示第一视差图（如图11中的左上第2幅图所示）；d′ l表示第二镜像图（如图11中的左下第2幅图所示）。
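结合公式（3）至公式（5），上述视差优化调整的整体流程可以写成如下NumPy草图（其中将“合并”理解为两幅加权视差图逐像素相加，属于对公式（5）的一种示意性理解；各权重分布图均作为输入给出）：

```python
import numpy as np

def refine_disparity(d_l, d_mirror2, M_l, M_l_p, L_l, L_l_p):
    """d_l: 第一视差图; d_mirror2: 第二镜像图（仍为视差图）;
    M_l / M_l_p: 二者的第一（全局）权重分布图;
    L_l / L_l_p: 二者的第二（局部）权重分布图。所有数组形状均为 (H, W)。"""
    W_l = M_l + 0.5 * L_l              # 公式（3）：第三权重分布图
    W_l_p = M_l_p + 0.5 * L_l_p        # 公式（4）：第四权重分布图
    d_adj = W_l * d_l                  # 按第三权重分布图调整第一视差图
    d_mir_adj = W_l_p * d_mirror2      # 按第四权重分布图调整第二镜像图
    return d_adj + d_mir_adj           # 合并两幅调整后的视差图（对公式（5）的一种理解）
```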
需要说明的是,本公开不限制对第一权重分布图和第二权重分布图进行合并处理的两个步骤的执行顺序,例如,两个合并处理的步骤可以同时执行,也可以先后执行。另外,本公开也不限制对第一视差图中的视差值进行调整和对第二镜像图中的视差值进行调整的先后执行顺序,例如,两个调整的步骤可以同时进行,也可以先后执行。
在单目图像被作为左目图像的情况下，通常会存在左侧视差缺失以及物体的左侧边缘被遮挡等现象，这些现象会导致单目图像的第一视差图中的相应区域的视差值不准确。同样的，在单目图像被作为右目图像的情况下，通常会存在右侧视差缺失以及物体的右侧边缘被遮挡等现象，这些现象会导致单目图像的第一视差图中的相应区域的视差值不准确。本公开通过对单目图像进行镜像处理，并对第二视差图进行镜像处理，进而利用镜像处理后的视差图（即第二镜像图）来优化调整单目图像的第一视差图，有利于减弱单目图像的第一视差图中的相应区域的视差值不准确的现象，从而有利于提高障碍物检测的精度。
在一个可选示例中,在环境图像为双目图像的应用场景中,本公开获得双目图像的第一视差图的方式包括但不限于:利用立体匹配的方式获得双目图像的第一视差图。例如,利用BM(Block Matching,块匹配)算法、SGBM(Semi-Global Block Matching,半全局块匹配)算法、或者GC(Graph Cuts,图割)算法等立体匹配算法获得双目图像的第一视差图。再例如,利用用于获取双目图像的视差图的卷积神经网络,对双目图像进行视差处理,从而获得双目图像的第一视差图。
S110、在环境图像的第一视差图中,确定出多个障碍物像素区域。
示例性的,障碍物像素区域可以为第一视差图中包含位置连续的至少两个像素的像素区域。一种实施方式中,障碍物像素区域可以为障碍物像素柱状区域,例如,本公开中的障碍物像素柱状区域为一条状区域,该条状区域的宽度为至少一列像素,该条状区域的高度为至少两行像素。由于该条状区域可以被作为障碍物的基本单位,因此,本公开将该条状区域称为障碍物像素柱状区域。
在一个可选示例中,本公开可以先对上述步骤获得的环境图像的第一视差图进行边缘检测,获得障碍物边缘信息;然后,再确定环境图像的第一视差图中的障碍物区域;最后,根据障碍物边缘信息,在障碍物区域中,确定多个障碍物像素柱状区域。本公开通过划分出障碍物区域,有利于避免在关注价值低的区域内形成障碍物像素柱状区域的现象,有利于提高形成障碍物像素柱状区域的便捷性。实际空间中的不同障碍物,由于与摄像装置的距离不同,会造成视差不同,从而形成障碍物存在视差边缘。本公开通过检测出障碍物边缘信息,可以对视差图中的障碍物进行分隔,从而本公开可以通过搜索障碍物边缘信息,方便的形成障碍物像素柱状区域,有利于提高形成障碍物像素柱状区域的便捷性。
在一个可选示例中,本公开获得环境图像的第一视差图中的障碍物边缘信息的方式包括但不限于:利用用于边缘提取的卷积神经网络,来获得环境图像的第一视差图中的障碍物边缘信息;以及利用边缘检测算法,来获得环境图像的第一视差图中的障碍物边缘信息。可选的,本公开利用边缘检测算法,来获得环境图像的第一视差图中的障碍物边缘信息的一个实施方式如图12所示。
图12中，步骤1、对环境图像的第一视差图进行直方图均衡化处理。其中的环境图像的第一视差图如图12的左上角的图像，该第一视差图可以为利用上述步骤100最终获得的图2所示的环境图像的第一视差图。直方图均衡化处理的结果如图12的左上第2幅图所示。
步骤2、对直方图均衡化处理的结果进行均值滤波处理。滤波处理后的结果如图12的左上第3幅图所示。上述步骤1和步骤2是对环境图像的第一视差图的预处理。步骤1和步骤2仅为对环境图像的第一视差图进行预处理的一个例子。本公开并不限制预处理的具体实现方式。
步骤3、采用边缘检测算法,对滤波处理后的结果进行边缘检测处理,获得边缘信息。本步骤获得的边缘信息如图12的左上第4幅图所示。本公开中的边缘检测算法包括但不限于:Canny边缘检测算法、Sobel边缘检测算法或者Laplacian边缘检测算法等。
步骤4、对获得的边缘信息进行形态学膨胀运算。膨胀运算的结果如图12的左上第5幅图所示。本步骤属于对边缘检测算法的检测结果的一种后处理方式。本公开并不限制后处理的具体实现方式。
步骤5、对膨胀运算的结果进行反向操作,获得环境图像的第一视差图的边缘掩膜(Mask)。环境图像的第一视差图的边缘掩膜如图12的左下角图所示。
步骤6、将环境图像的第一视差图的边缘掩膜与环境图像的第一视差图进行与运算,获得环境图像的第一视差图中的障碍物边缘信息。图12右侧图示出了环境图像的第一视差图中的障碍物边缘信息,例如,环境图像的第一视差图中的障碍物边缘位置处的视差值被设置为0。障碍物边缘信息在图12中,呈现为黑色边缘线。
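图12所示的预处理、边缘检测及掩膜运算流程，可以用OpenCV大致实现如下（其中滤波核尺寸、Canny阈值等参数均为示例性假设，实际取值可按需调整）：

```python
import cv2
import numpy as np

def obstacle_edge_info(disp):
    """disp: 8位单通道的第一视差图。
    返回障碍物边缘位置处视差被置为0后的视差图（对应图12右侧图）。"""
    eq = cv2.equalizeHist(disp)                             # 步骤1：直方图均衡化
    blur = cv2.blur(eq, (5, 5))                             # 步骤2：均值滤波
    edges = cv2.Canny(blur, 50, 150)                        # 步骤3：Canny边缘检测
    edges = cv2.dilate(edges, np.ones((3, 3), np.uint8))    # 步骤4：形态学膨胀
    mask = cv2.bitwise_not(edges)                           # 步骤5：反向，得到边缘掩膜
    return cv2.bitwise_and(disp, disp, mask=mask)           # 步骤6：与运算，边缘处视差置0
```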
在一个可选示例中,本公开确定第一视差图中的障碍物区域的一个例子包括如下步骤:
步骤a、对第一视差图中每行像素点的视差值进行统计处理,得到对每行像素点的视差值的统计信息,基于对每行像素点的视差值的统计信息,确定统计视差图。
可选的,本公开可以对环境图像的第一视差图进行横向统计(行方向统计),从而获得V视差图,该V视差图可以被作为统计视差图。也就是说,针对环境图像的第一视差图的每一行,对该行中的各个视差值的个数进行统计,并将统计的结果设置在V视差图的相应列上。V视差图的宽度(即列数)与视差值的取值范围相关,例如,视差值的取值范围为0-254,则V视差图的宽度为255。V视差图的高度与环境图像的第一视差图的高度相同,即两者所包含的行数相同。可选的,对于图4所示的环境图像的第一视差图而言,本公开形成的统计视差图如图13所示。图13中,最上面一行表示视差值0至5;第2行第1列的数值为1,表示图4的第1行中的视差值为0的数量为1;第2行第2列的数值为6,表示图4的第1行中的视差值为1的数量为6;第5行第6列的数值为5,表示图4的第5行中的视差值为5的数量为5。在此不再对图13中的其他数值逐一进行说明。
可选的,对于图14中的左侧图所示的环境图像的第一视差图而言,对环境图像的第一视差图进行处理,所获得的V视差图如图14中的右侧图所示。
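统计视差图（V视差图）的构建过程可以用如下NumPy草图表示（假设视差值为非负整数，最大视差max_disp为示例参数）：

```python
import numpy as np

def v_disparity(disp, max_disp=255):
    """对第一视差图逐行统计各视差值的个数，得到 H x (max_disp+1) 的V视差图。
    disp: 第一视差图，元素为0~max_disp的整数。"""
    h = disp.shape[0]
    v_disp = np.zeros((h, max_disp + 1), dtype=np.int32)
    for row in range(h):
        counts = np.bincount(disp[row].astype(np.int64), minlength=max_disp + 1)
        v_disp[row] = counts[:max_disp + 1]                 # 第row行中各视差值出现的次数
    return v_disp
```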
步骤b、对统计视差图(本公开中也称为V视差图)进行第一直线拟合处理,并根据第一直线拟合处理的结果确定地面区域和非地面区域。
首先,本公开可以对V视差图进行预处理。对V视差图的预处理可以包括但不限于:去除噪声等。例如,对V视差图进行阈值滤波(threshold),以滤除V视差图中的噪声。在V视差图如图15中的 左侧第1幅图所示的情况下,滤除噪声的V视差图如图15中的左侧第2幅图所示。
其次,本公开针对去除噪声后的V视差图进行第一直线拟合(fitline),从而获得第一直线方程v=Ad+B。其中的v表示V视差图中的行坐标,d表示视差值。
例如,图13中的斜线表示拟合出的第一直线方程。再例如,图15中的右侧第1幅图中的白斜线表示拟合出的第一直线方程。第一直线拟合方式包括但不限于:RANSAC直线拟合方式。
可选的,上述拟合获得的第一直线方程可以表示地面区域的视差值与V视差图的行坐标的关系。也就是说,针对V视差图中的任一行而言,在v确定的情况下,地面区域的视差值d应为确定值。地面区域的视差值可以表示为下述公式(6)的形式:
d road=(v-B)/A          公式（6）
在公式(6)中,d road表示地面区域的视差值;A和B为已知值,如通过第一直线拟合获得的数值。
再次,本公开可以利用公式(6)对环境图像的第一视差图进行分割,从而获得地面区域I road和非地面区域I notroad
可选的,本公开可以利用下述公式(7)来确定地面区域和非地面区域:
I road={像素 | |d-d road|≤thresh3}；I notroad={像素 | |d-d road|>thresh3}          公式（7）
在上述公式(7)中,I(*)表示像素集合,如果环境图像的第一视差图中的一个像素的视差值满足|d-d road|≤thresh3,则该像素属于地面区域I road;如果环境图像的第一视差图中的一个像素的视差值满足|d-d road|>thresh3,则该像素属于非地面区域;thresh3表示一阈值,为已知值。阈值的大小可以根据实际情况设置。
可选的,地面区域I road可以如图16中的右上图所示。非地面区域I notroad可以如图16中的右下图所示。本公开通过设置阈值,有利于去除环境图像的第一视差图中的噪声对区域判断的影响,从而有利于更加准确的确定出地面区域和非地面区域。
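上述第一直线拟合及地面/非地面分割（对应公式（6）、公式（7））的一个简化示意如下，其中直线拟合以最小二乘代替RANSAC，仅用于说明流程，阈值等参数为假设取值：

```python
import numpy as np

def split_ground(disp, v_disp, thresh3=3.0, min_count=20):
    """disp: 第一视差图; v_disp: V视差图。
    先取V视差图中计数足够大的 (d, v) 点拟合第一直线 v = A*d + B，
    再按 |d - d_road| 与 thresh3 的关系划分地面/非地面像素。"""
    vs, ds = np.nonzero(v_disp > min_count)                 # 行坐标v、视差d
    A, B = np.polyfit(ds, vs, 1)                            # 第一直线拟合（最小二乘示意）
    h, w = disp.shape
    v_grid = np.arange(h).reshape(-1, 1).repeat(w, axis=1)  # 每个像素所在的行坐标
    d_road = (v_grid - B) / A                               # 公式（6）：地面视差
    ground = np.abs(disp - d_road) <= thresh3               # 公式（7）：地面区域
    return ground, ~ground, A, B
```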
最后,根据非地面区域,确定障碍物区域。
可选的，本公开中的非地面区域I notroad可以包括：高于地面的第一区域I high以及低于地面的第二区域I low中的至少一个。本公开可以将非地面区域I notroad中的高于地面，且高于地面的高度小于预定高度值的区域，作为障碍物区域。由于低于地面的区域I low可能是坑、沟渠或山谷等区域，因此，本公开可以将非地面区域I notroad中的低于地面，且低于地面的高度大于预定高度值的区域，作为障碍物区域。
本公开中的高于地面的第一区域I high以及低于地面的第二区域I low可以通过下述公式(8)表示:
I high={像素 | d-d road>thresh4}；I low={像素 | d road-d>thresh4}          公式（8）
在上述公式(8)中,I notroad(*)表示像素集合,如果环境图像的第一视差图中的一个像素的视差值满足d-d road>thresh4,则该像素属于高于地面的第一区域I high;如果环境图像的第一视差图中的一个像素的视差值满足d road-d>thresh4,则该像素属于低于地面的第二区域I low;thresh4表示 一阈值,为已知值。阈值的大小可以根据实际情况设置。
可选的,在高于地面的第一区域I high中,往往包括不需要关注的障碍物,例如,红绿灯以及过街天桥等目标对象,由于其不会影响车辆的驾驶,因此,对于车辆而言,这些目标对象属于不需要关注的障碍物。这些不需要关注的障碍物往往处于较高的位置,对车辆的行驶、行人的行走等不会产生影响。本公开可以从高于地面的第一区域I high中,去除属于较高位置的区域,例如,去除高于地面的高度大于等于第一预定高度值的区域,从而形成障碍物区域I obstacle
可选的,本公开可以通过根据V视差图进行第二直线拟合处理,并根据第二直线拟合处理的结果,确定出非地面区域中的需要去除的属于较高位置的区域(即高于地面的高度大于等于第一预定高度值的区域),从而获得非地面区域中的障碍物区域I obstacle。第二直线拟合方式包括但不限于:RANSAC直线拟合方式。可选的,在非地面区域存在低于地面的第二区域的情况下,确定第二区域中低于地面的高度大于第二预定高度值的第二目标区域,第二目标区域为障碍物区域。本公开针对V视差图进行第二直线拟合,从而获得的第二直线方程可以表示为v=Cd+D。其中的v表示V视差图中的行坐标,d表示视差值。通过推导计算可知,其中的C和D可以表示为:
C=A-H/b，D=B（其中，b表示双目摄像装置的间距，H表示地面之上的高度）
因此,本公开的第二直线方程可以表示为:
v=(A-H/b)·d+B
其中的H为已知的常数值,H可以根据实际需要设置。例如,在车辆的智能控制技术中,H可以设置为2.5米。
可选的,图18中的中间图像中包含上下两条白斜线,上面的白斜线表示拟合出的第二直线方程。
可选的,上述拟合获得的第二直线方程可以表示出障碍物区域的视差值与V视差图的行坐标的关系。也就是说,针对V视差图中的任一行,在v确定的情况下,障碍物区域的视差值d应为确定值。
可选的,本公开可以将高于地面的第一区域I high划分为如下述公式(9)表示的形式:
I <H={像素 | d<d H}；I >H={像素 | d>d H}          公式（9）
在上述公式(9)中,I high(*)表示像素集合,如果环境图像的第一视差图中的一个像素的视差值d满足d<d H,则该像素属于高于地面,但低于地面之上H高度的区域I <H,本公开可以将I <H作为障碍物区域I obstacle;如果环境图像的第一视差图中的一个像素的视差值d满足d>d H,则该像素属于高于地面,且高于地面之上H高度的区域I >H;d H表示地面之上H高度的像素点的视差值;I >H可以如图18中的右上图所示。I <H可以如图18中的右下图所示。在上述公式(9)中:
d H=(v-B)/(A-H/b)，即由第二直线方程反解得到的、地面之上H高度处的像素点对应的视差值。
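基于上文对第二直线方程形式的理解，地面之上且高度低于H的障碍物区域I <H可以按公式（9）筛选，示意草图如下（其中基线间距b、高度H等均为示例参数，且该草图依赖于上述第二直线方程这一假设性推导）：

```python
import numpy as np

def obstacle_region(disp, high_mask, A, B, H=2.5, b=0.5):
    """high_mask: 高于地面的第一区域I_high的掩膜。
    d_H 为地面之上H高度处的视差（由第二直线 v=(A-H/b)*d+B 反解），
    视差小于 d_H 的像素即属于低于H高度的障碍物区域 I_<H。"""
    h, w = disp.shape
    v_grid = np.arange(h).reshape(-1, 1).repeat(w, axis=1)
    d_H = (v_grid - B) / (A - H / b)
    return high_mask & (disp < d_H)
```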
在一个可选示例中,本公开在障碍物区域I obstacle,根据障碍物边缘信息确定像素柱状区域的方式可以为:首先,将第一视差图中的非障碍物区域的像素点的视差值以及障碍物边缘信息处的像素点的视差值设置为预定值。其次,以第一视差图的列方向的N个像素点作为遍历单位,从第一视差图的设定行起遍历每行上N个像素点的视差值,确定像素点的视差值存在预定值和非预定值之间的跳变的目标行;最后,以列方向上的N个像素点作为柱宽度、以确定出的目标行作为障碍物像素柱状区域在行方向上的边界,确定障碍物区域中的障碍物像素柱状区域。例如,本公开在障碍物区域I obstacle,根据障碍物边缘信息确定像素柱状区域的方式可以为:首先,根据检测到的障碍物边缘信息,将视差图中的障碍物边缘位置处的视差值均设置为预定值(如0),并将视差图中的除障碍物区域之外的区域中的视差值也设置为预定值(如0);然后,根据预定柱宽度(至少一列像素宽度,如6列像素宽度等),从视差图的最底部向上搜索,在搜索到预定柱宽度中的任一列像素的视差值由预定值跳变为非预定值时,确定该位置(视差图的行)为该像素柱状区域的底部,开始形成像素柱状区域,即开始向上延伸该像素柱状区域,例如,继续向上搜索视差图中由非预定值到预定值的跳变,在搜索到预定 柱宽度中的任一列像素的视差值由非预定值跳变到预定值时,停止该像素柱状区域的向上延伸,确定该位置(视差图的行)为像素柱状区域的顶部,从而形成一障碍物像素柱状区域。
需要特别说明的是,本公开可以从视差图的左下角开始障碍物像素柱状区域的确定过程,直到视差图的右下角,例如,从视差图最左侧的6列开始执行上述障碍物像素柱状区域的确定过程,然后,从视差图的最左侧的第7-12列开始再次执行上述障碍物像素柱状区域的确定过程,直到视差图的最右侧列。本公开也可以从视差图的右下角开始障碍物像素柱状区域的确定过程,直到视差图的左下角。另外,从视差图的最下端的中间位置向两侧扩展,以形成障碍物像素柱状区域也是完全可行的。
在一个可选示例中，本公开在障碍物区域I obstacle，根据障碍物边缘信息形成像素柱状区域的方式可以为：首先，根据检测到的障碍物边缘信息，将视差图中的障碍物边缘位置处的视差值均设置为预定值（如0），并将视差图中的除障碍物区域之外的区域中的视差值也设置为预定值（如0）；然后，根据预定柱宽度（至少一列像素宽度，如6列像素宽度等），从视差图的最顶部向下搜索，在搜索到预定柱宽度中的任一列像素的视差值由预定值跳变为非预定值时，确定该位置（视差图的行）为该像素柱状区域的顶部，开始形成像素柱状区域，即开始向下延伸该像素柱状区域，例如，继续向下搜索视差图中由非预定值到预定值的跳变，在搜索到预定柱宽度中的任一列像素的视差值由非预定值跳变到预定值时，停止该像素柱状区域的向下延伸，确定该位置（视差图的行）为像素柱状区域的底部，从而形成一障碍物像素柱状区域。需要特别说明的是，本公开可以从视差图的左上角开始障碍物像素柱状区域的确定过程，直到视差图的右上角，例如，从视差图最左侧顶部的6列开始执行上述障碍物像素柱状区域的确定过程，然后，从视差图的最左侧顶部的第7-12列开始再次执行上述障碍物像素柱状区域的确定过程，直到视差图的最右侧列。本公开也可以从视差图的右上角开始障碍物像素柱状区域的确定过程，直到视差图的左上角。另外，从视差图的最上端的中间位置向两侧扩展，以形成障碍物像素柱状区域也是完全可行的。
可选的,本公开针对图2所示的环境图像,所形成的障碍物像素柱状区域的一个例子如图19右图所示。图19右图中的每一个障碍物像素柱状区域的宽度均为6列像素宽度。障碍物像素柱状区域的宽度可以根据实际需求设置。将障碍物像素柱状区域的宽度设置的越大,所形成的障碍物像素柱状区域越粗糙,而形成障碍物像素柱状区域的耗时也会越短。
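上述自下而上搜索视差跳变、逐条形成障碍物像素柱状区域的过程，可以用如下简化草图表示（柱宽N取6列仅为示例；返回结果中仅记录每个柱状区域的列范围及顶部/底部行号）：

```python
import numpy as np

def extract_pixel_columns(disp_masked, N=6):
    """disp_masked: 非障碍物区域及障碍物边缘处视差已置为预定值0的第一视差图。
    以N列为一个遍历单位，自下而上寻找视差值在0与非0之间跳变的行，
    跳变行即柱状区域在行方向上的边界。返回 (左列号, 右列号, 顶部行号, 底部行号) 列表。"""
    h, w = disp_masked.shape
    columns = []
    for c0 in range(0, w - N + 1, N):
        strip = disp_masked[:, c0:c0 + N]
        bottom = None
        for row in range(h - 1, -1, -1):
            has_disp = np.any(strip[row] > 0)
            if bottom is None and has_disp:
                bottom = row                                    # 由0跳变为非0：柱底部
            elif bottom is not None and not has_disp:
                columns.append((c0, c0 + N, row + 1, bottom))   # 由非0跳变为0：柱顶部
                bottom = None
        if bottom is not None:
            columns.append((c0, c0 + N, 0, bottom))             # 一直延伸到图像顶端的柱
    return columns
```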
在一个可选示例中,在形成了障碍物像素柱状区域后,应确定障碍物像素柱状区域的属性信息,障碍物像素柱状区域的属性信息包括但不限于:障碍物像素柱状区域的空间位置信息、障碍物像素柱状区域的底部信息bottom、障碍物像素柱状区域的视差值disp、障碍物像素柱状区域的顶部信息top以及障碍物像素柱状区域的列信息col。
可选的,障碍物像素柱状区域的空间位置信息可以包括:障碍物像素柱状区域在水平方向坐标轴(X坐标轴)上的坐标、障碍物像素柱状区域在深度方向坐标轴(Z坐标轴)上的坐标、障碍物像素柱状区域在竖直方向坐标轴(Y坐标轴)上的最高点坐标和障碍物像素柱状区域在竖直方向坐标轴(Y坐标轴)上的最低点坐标。也就是说,障碍物像素柱状区域的空间位置信息包括:障碍物像素柱状区域的X坐标、Z坐标、最大Y坐标和最小Y坐标。X、Y和Z坐标轴的一个例子如图17所示。
可选的，障碍物像素柱状区域的底部信息可以为障碍物像素柱状区域最下端的行号。在预定值为0的情况下，障碍物像素柱状区域的视差值可以为：视差值由零跳变为非零时，该非零位置处的像素的视差值；障碍物像素柱状区域的顶部信息可以为视差值由非零跳变为零时，零位置处的像素的行号。障碍物像素柱状区域的列信息可以为该障碍物像素柱状区域所包含的所有列中的任一列的列号，例如，位于像素柱状区域中间位置处的一列的列号。
可选的,针对每一个障碍物像素柱状区域,本公开均利用下述公式(10)计算障碍物像素柱状区域的空间位置信息,即障碍物像素柱状区域的X坐标、Z坐标、最大Y坐标和最小Y坐标:
Z=b·f/Disp；X=(Col-c x)·Z/f          公式（10）
在上述公式(10)中,b表示双目摄像装置的间距;f表示摄像装置的焦距;Disp表示障碍物像素柱状区域的视差值;Col表示障碍物像素柱状区域的列信息;c x表示摄像装置主点的X坐标值。
可选的,障碍物像素柱状区域中的每一个像素的Y坐标可以使用下述公式(11)表示:
Y i=(row i-c y)·Z/f          公式（11）
在上述公式(11)中,Y i表示障碍物像素柱状区域中的第i个像素的Y坐标;row i表示障碍物像素柱状区域中的第i个像素的行号;c y表示摄像装置主点的Y坐标值;Z表示障碍物像素柱状区域的Z坐标;f表示摄像装置的焦距。
在获得了一个障碍物像素柱状区域内的所有像素的Y坐标之后,即可获得其中的最大Y坐标和最小Y坐标。最大Y坐标和最小Y坐标可以表示为下述公式(12)所示:
Y min=min(Y i)
Y max=max(Y i)                      公式(12)
在上述公式(12)中,Y min表示障碍物像素柱状区域的最小Y坐标;Y max表示障碍物像素柱状区域的最大Y坐标;min(Y i)表示计算所有Y i中的最小值;max(Y i)表示计算所有Y i中的最大值。
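公式（10）至公式（12）对应的空间位置计算可以写成如下草图（b、f、c x、c y为标定参数，rows为柱状区域内各像素的行号，均作为输入给出）：

```python
import numpy as np

def column_position(disp_val, col, rows, b, f, cx, cy):
    """disp_val: 柱状区域的视差值; col: 列信息; rows: 区域内各像素的行号数组。
    返回该柱状区域的 (X, Z, Y_min, Y_max)。"""
    Z = b * f / disp_val                  # 公式（10）：深度方向坐标
    X = (col - cx) * Z / f                # 公式（10）：水平方向坐标
    Y = (np.asarray(rows) - cy) * Z / f   # 公式（11）：各像素的竖直方向坐标
    return X, Z, Y.min(), Y.max()         # 公式（12）：最小/最大Y坐标
```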
S120、对多个障碍物像素区域进行聚类处理,获得至少一个类簇。
在一个可选示例中,本公开可以对多个障碍物像素柱状区域进行聚类处理,获得至少一个类簇。本公开可以根据障碍物像素柱状区域的空间位置信息,对所有障碍物像素柱状区域进行聚类处理,一个类簇对应一个障碍物实例。本公开可以采用相应聚类算法对各障碍物像素柱状区域进行聚类处理。
可选的，在对多个像素柱状区域进行聚类处理之前，可以先对障碍物像素柱状区域的X坐标和Z坐标进行归一化处理。
例如,本公开可以采用min-max归一化处理方式,对障碍物像素柱状区域的X坐标和Z坐标进行映射,使障碍物像素柱状区域的X坐标和Z坐标被映射到[0-1]取值范围内。该归一化处理方式的一个例子,如下述公式(13)所示:
X *=(X-X min)/(X max-X min)；Z *=(Z-Z min)/(Z max-Z min)          公式（13）
在上述公式(13)中,X *表示归一化处理后的X坐标;Z *表示归一化处理后的Z坐标;X表示障碍物像素柱状区域的X坐标;Z表示障碍物像素柱状区域的Z坐标;X min表示所有障碍物像素柱状区域的X坐标中的最小值;X max表示所有障碍物像素柱状区域的X坐标中的最大值;Z min表示所有障碍物像素柱状区域的Z坐标中的最小值;Z max表示所有障碍物像素柱状区域的Z坐标中的最大值。
再例如,本公开也可以采用Z-score归一化处理方式,对障碍物像素柱状区域的X坐标和Z坐标进行归一化处理。该归一化处理方式的一个例子,如下述公式(14)所示:
X *=(X-μ X)/σ X；Z *=(Z-μ Z)/σ Z          公式（14）
在上述公式(14)中，X *表示归一化处理后的X坐标;Z *表示归一化处理后的Z坐标;X表示障碍物像素柱状区域的X坐标;Z表示障碍物像素柱状区域的Z坐标;μ X表示针对所有障碍物像素柱状区域的X坐标计算出的均值;σ X表示针对所有障碍物像素柱状区域的X坐标计算出的标准差;μ Z表示针对所有障碍物像素柱状区域的Z坐标计算出的均值;σ Z表示针对所有障碍物像素柱状区域的Z坐标计算出的标准差。本公开处理后的所有障碍物像素柱状区域的X *和Z *均符合标准正态分布，即均值为0，标准差为1。
可选的，本公开可以采用密度聚类（DBSCAN）算法，根据归一化处理后的所有障碍物像素柱状区域的空间位置信息，对障碍物像素柱状区域进行聚类处理，从而形成至少一个类簇，每一个类簇即为一个障碍物实例。本公开不对聚类算法进行限制。聚类结果的一个例子，如图20中的右图所示。
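归一化（以公式（13）的min-max方式为例）与密度聚类的组合可以用scikit-learn写成如下草图（eps、min_samples为示例参数，需按实际数据调节）：

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_columns(X, Z, eps=0.05, min_samples=2):
    """X、Z: 所有障碍物像素柱状区域的水平/深度方向坐标（一维数组）。
    先做min-max归一化（公式（13）），再用DBSCAN聚类，返回每个柱状区域的类簇标签。"""
    X = np.asarray(X, dtype=np.float64)
    Z = np.asarray(Z, dtype=np.float64)
    Xn = (X - X.min()) / (X.max() - X.min() + 1e-9)
    Zn = (Z - Z.min()) / (Z.max() - Z.min() + 1e-9)
    feats = np.stack([Xn, Zn], axis=1)
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(feats)
```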
S130、根据属于同一个类簇的障碍物像素区域,确定障碍物检测结果。
示例性的,障碍物检测结果例如可以但不限于包括障碍物检测框、障碍物的空间位置信息中的至少一种。
在一个可选示例中,本公开可以根据属于同一个类簇的像素柱状区域的空间位置信息,确定出环境图像中的障碍物检测框(Bounding-Box)。例如,对于一个类簇而言,本公开可以计算该类簇中的所有障碍物像素柱状区域在环境图像中的最大列坐标u max以及最小列坐标u min,并计算该类簇中的所有障碍物像素柱状区域的最大的bottom(即v max)以及最小的top(即v min)(注:假定图像坐标系的原点位于图像的左上角)。本公开所获得的障碍物检测框在环境图像中的坐标可以表示为(u min,v min,u max,v max)。可选的,本公开确定出的障碍物检测框的一个例子如图21右图所示。图21右图中的多个矩形框均为本公开获得的障碍物检测框。
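由同一类簇内各柱状区域确定障碍物检测框的过程示意如下（假设每个柱状区域以（左列号，右列号，top行号，bottom行号）的形式记录其在环境图像中的位置，图像坐标系原点位于左上角）：

```python
def cluster_bounding_box(columns):
    """columns: 属于同一类簇的柱状区域列表，每项为 (col_left, col_right, top, bottom)。
    返回该类簇对应的障碍物检测框 (u_min, v_min, u_max, v_max)。"""
    u_min = min(c[0] for c in columns)
    u_max = max(c[1] for c in columns)
    v_min = min(c[2] for c in columns)   # 最小的top
    v_max = max(c[3] for c in columns)   # 最大的bottom
    return u_min, v_min, u_max, v_max
```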
本公开通过对多个障碍物像素柱状区域进行聚类处理来获得障碍物,无需对要检测的障碍物进行预定义,没有利用障碍物的纹理、颜色、形状、类别等预定义的信息,便可以直接基于对障碍物区域进行聚类的方式来检测出障碍物,并且,检测出的障碍物并不局限于某些预先定义好的障碍物,可以实现对周围空间环境中可能对智能设备的移动过程造成阻碍的各式各样的障碍物进行检测,从而实现了泛类型障碍物的检测。
在一个可选示例中,本公开也可以根据属于同一个类簇的多个障碍物像素柱状区域的空间位置信息,确定障碍物的空间位置信息。障碍物的空间位置信息可以包括但不限于:障碍物在水平方向坐标轴(X坐标轴)上的坐标、障碍物在深度方向坐标轴(Z坐标轴)上的坐标以及障碍物在竖直方向上的高度(即障碍物的高度)等。可选的,本公开可以先根据属于同一个类簇的多个障碍物像素柱状区域的空间位置信息,确定该类簇中的多个障碍物像素柱状区域与生成环境图像的摄像装置之间的距离,然后,根据距离最近的障碍物像素柱状区域的空间位置信息,确定障碍物的空间位置信息。
可选的,本公开可以采用下述公式(15)来计算一个类簇中的多个障碍物像素柱状区域与摄像装置之间的距离,并选取出最小距离:
d min=min(√(X i²+Z i²))          公式（15）
在上述公式(15)中,d min表示最小距离;X i表示一个类簇中的第i个障碍物像素柱状区域的X坐标;Z i表示一个类簇中的第i个障碍物像素柱状区域的Z坐标。
在确定出最小距离后,可以将具有该最小距离的障碍物像素柱状区域的X坐标和Z坐标作为该障碍物的空间位置信息,如下述公式(16)所示:
O X=X close
O Z=Z close            公式(16)
在上述公式(16)中，O X表示障碍物在水平方向坐标轴上的坐标，即障碍物的X坐标;O Z表示障碍物在深度方向坐标轴（Z坐标轴）上的坐标，即障碍物的Z坐标;X close表示上述计算出具有最小距离的障碍物像素柱状区域的X坐标;Z close表示上述计算出具有最小距离的障碍物像素柱状区域的Z坐标。
可选的,本公开可以采用下述公式(17)来计算障碍物的高度:
O H=Y max-Y min          公式(17)
在上述公式(17)中,O H表示障碍物的高度;Y max表示一类簇中的所有障碍物像素柱状区域的最大Y坐标;Y min表示一类簇中的所有障碍物像素柱状区域的最小Y坐标。
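公式（15）至公式（17）对应的障碍物空间位置及高度计算，可以写成如下草图（输入为同一类簇内各柱状区域的X、Z坐标及最小/最大Y坐标）：

```python
import numpy as np

def cluster_obstacle_info(Xs, Zs, Ymins, Ymaxs):
    """Xs、Zs、Ymins、Ymaxs: 同一类簇内各障碍物像素柱状区域的X、Z坐标及最小/最大Y坐标。"""
    Xs, Zs = np.asarray(Xs), np.asarray(Zs)
    i = np.argmin(np.sqrt(Xs ** 2 + Zs ** 2))   # 公式（15）：距摄像装置最近的柱状区域
    O_X, O_Z = Xs[i], Zs[i]                     # 公式（16）：障碍物的X、Z坐标
    O_H = max(Ymaxs) - min(Ymins)               # 公式（17）：障碍物高度
    return O_X, O_Z, O_H
```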
本公开训练卷积神经网络的一个实施方式的流程,如图22所示。
S2200、将双目图像样本中的其中一目(如左/右目)图像样本输入至待训练的卷积神经网络中。
可选的,本公开输入卷积神经网络中的图像样本可以始终为双目图像样本的左目图像样本,也可以始终为双目图像样本的右目图像样本。在输入卷积神经网络中的图像样本始终为双目图像样本的左目图像样本的情况下,成功训练后的卷积神经网络,在测试或者实际应用场景中,会将输入的环境图像作为左目图像。在输入卷积神经网络中的图像样本始终为双目图像样本的右目图像样本的情况下,成功训练后的卷积神经网络,在测试或者实际应用场景中,会将输入的环境图像作为右目图像。
S2210、经由卷积神经网络进行视差分析处理,基于该卷积神经网络的输出,获得左目图像样本的视差图和右目图像样本的视差图。
S2220、根据左目图像样本以及右目图像样本的视差图重建右目图像。
可选的,本公开重建右目图像的方式包括但不限于:对左目图像样本以及右目图像样本的视差图进行重投影计算,从而获得重建的右目图像。
S2230、根据右目图像样本以及左目图像样本的视差图重建左目图像。
可选的,本公开重建左目图像的方式包括但不限于:对右目图像样本以及左目图像样本的视差图进行重投影计算,从而获得重建的左目图像。
S2240、根据重建的左目图像和左目图像样本之间的差异、以及重建的右目图像和右目图像样本之间的差异,调整卷积神经网络的网络参数。
可选的,本公开在确定差异时,所采用的损失函数包括但不限于:L1损失函数、smooth损失函数以及lr-Consistency损失函数等。另外,本公开在将计算出的损失反向传播,以调整卷积神经网络的网络参数(如卷积核的权值)时,可以基于卷积神经网络的链式求导所计算出的梯度,来反向传播损失,从而有利于提高卷积神经网络的训练效率。
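上述“根据左目图像样本以及右目图像样本的视差图重建右目图像”的重投影计算，可以用PyTorch的grid_sample大致实现如下（视差的符号约定取决于具体数据与视差定义，此处仅为一种假设性写法；重建左目图像的方式与之对称）：

```python
import torch
import torch.nn.functional as F

def warp_by_disparity(src, disp):
    """按视差对源图像做水平方向重投影采样，用于由另一目图像重建当前目图像。
    src:  (N, C, H, W) 源图像（如左目图像样本）
    disp: (N, 1, H, W) 目标视角的视差图，单位为像素。
    假设视差定义为 x_left - x_right；若数据约定相反，则将下面的加号改为减号。"""
    n, _, h, w = src.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    xs = xs.float().to(src.device).unsqueeze(0).expand(n, -1, -1)
    ys = ys.float().to(src.device).unsqueeze(0).expand(n, -1, -1)
    x_src = xs + disp.squeeze(1)                  # 重建右目：在左目的 x + d 处采样
    grid_x = 2.0 * x_src / (w - 1) - 1.0          # 归一化到[-1, 1]
    grid_y = 2.0 * ys / (h - 1) - 1.0
    grid = torch.stack([grid_x, grid_y], dim=-1)  # (N, H, W, 2)
    return F.grid_sample(src, grid, align_corners=True)

# 重建损失（L1）的示意用法：
# recon_right = warp_by_disparity(left_sample, disp_right)
# loss = (recon_right - right_sample).abs().mean()
```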
在一个可选示例中,在针对卷积神经网络的训练达到预定迭代条件时,本次训练过程结束。本公开中的预定迭代条件可以包括:基于卷积神经网络输出的视差图而重建的左目图像与左目图像样本之间的差异、以及基于卷积神经网络输出的视差图而重建的右目图像和右目图像样本之间的差异,满足预定差异要求。在该差异满足要求的情况下,本次对卷积神经网络成功训练完成。本公开中的预定迭代条件也可以包括:对卷积神经网络进行训练,所使用的双目图像样本的数量达到预定数量要求等。在使用的双目图像样本的数量达到预定数量要求,然而,基于卷积神经网络输出的视差图而重建的左目图像与左目图像样本之间的差异、以及基于卷积神经网络输出的视差图而重建的右目图像和右目图像样本之间的差异,并未满足预定差异要求情况下,本次对卷积神经网络并未训练成功。
图23为本公开的智能驾驶控制方法的一个实施例的流程图。本公开的智能驾驶控制方法可以适用但不限于:自动驾驶(如完全无人辅助的自动驾驶)环境或辅助驾驶环境中。
S2300、通过智能设备上设置的图像采集装置获取智能设备在移动过程中的环境图像。该图像采集装置包括但不限于:基于RGB的摄像装置等。
S2310、对获取的环境图像进行障碍物检测,确定障碍物检测结果。本步骤的具体实现过程可参见上述方法实施方式中针对图1的描述,在此不再详细说明。
S2320、根据障碍物检测结果生成并输出控制指令。
可选的,本公开生成的控制指令包括但不限于:速度保持控制指令、速度调整控制指令(如减速行驶指令、加速行驶指令等)、方向保持控制指令、方向调整控制指令(如左转向指令、右转向指令、向左侧车道并线指令、或者向右侧车道并线指令等)、鸣笛指令、预警提示控制指令或者驾驶模式切换控制指令(如切换为自动巡航驾驶模式等)。
需要特别说明的是,本公开的障碍物检测技术除了可以适用于智能驾驶控制领域之外,还可以应用在其他领域中;例如,可以实现工业制造中的障碍物检测、超市等室内领域的障碍物检测、安防领域中的障碍物检测等等,本公开不限制障碍物检测技术的适用场景。
图24为本公开的障碍物检测装置的一个实施例的结构示意图。图24中的装置包括:获取模块2400、第一确定模块2410、聚类模块2420、第二确定模块2430以及训练模块2440。
获取模块2400用于获取环境图像的第一视差图。环境图像为表征智能设备在移动过程中所处的空间环境信息的图像。可选的,环境图像包括单目图像。获取模块2400可以包括:第一子模块、第二子模块和第三子模块。第一子模块,用于利用卷积神经网络对单目图像进行视差分析处理,基于卷积神经网络的输出,获得单目图像的第一视差图;其中的卷积神经网络是利用双目图像样本,训练获得的。第二子模块用于将单目图像进行镜像处理后,得到第一镜像图,并获取第一镜像图的视差图。第三子模块用于根据第一镜像图的视差图,对单目图像的第一视差图进行视差调整,获得经视差调整后的第一视差图。其中的第三子模块可以包括:第一单元和第二单元。其中的第一单元用于将第一镜像图的视差图进行镜像处理后,得到第二镜像图。第二单元用于根据第一视差图的权重分布图、以及 第二镜像图的权重分布图,对第一视差图进行视差调整,获得经视差调整后的第一视差图。其中,第一视差图的权重分布图包括表示第一视差图中多个视差值各自对应的权重值;第二镜像图的权重分布图包括第二镜像图中多个视差值各自对应的权重。
可选的,本公开中的权重分布图包括:第一权重分布图以及第二权重分布图中的至少一个。第一权重分布图是针对多个环境图像统一设置的权重分布图;第二权重分布图是针对不同环境图像分别设置的权重分布图。第一权重分布图包括至少两个左右分列的区域,不同区域具有不同的权重值。
可选的,在单目图像为左目图像的情况下:对于第一视差图的第一权重分布图中的任意两个区域而言,位于右侧的区域的权重值不小于位于左侧的区域的权重值;对于第二镜像图的第一权重分布图中的任意两个区域而言,位于右侧的区域的权重值不小于位于左侧的区域的权重值。对于第一视差图的第一权重分布图中的至少一区域而言,该区域中左侧部分的权重值不大于该区域中的右侧部分的权重值;对于第二镜像图的第一权重分布图中的至少一区域而言,该区域中左侧部分的权重值不大于该区域中右侧部分的权重值。
可选的,在单目图像为右目图像的情况下,对于第一视差图的第一权重分布图中的任意两个区域而言,位于左侧的区域的权重值不小于位于右侧的区域的权重值;对于第二镜像图的第一权重分布图中的任意两个区域而言,位于左侧的区域的权重值不小于位于右侧的区域的权重值。
可选的,对于第一视差图的第一权重分布图中的至少一区域而言,该区域中右侧部分的权重值不大于该区域中左侧部分的权重值;对于第二镜像图的第一权重分布图中的至少一区域而言,该区域中右侧部分的权重值不大于该区域中左侧部分的权重值。
可选的,第三子模块还可以包括:第三单元,用于设置第一视差图的第二权重分布图。具体的,第三单元对第一视差图进行镜像处理,形成镜像视差图;并根据第一视差图的镜像视差图中的视差值,设置第一视差图的第二权重分布图中的权重值。例如,对于镜像视差图中的任一位置处的像素点而言,在该位置处的像素点的视差值满足第一预定条件的情况下,第三单元将第一视差图的第二权重分布图中在该位置处的像素点的权重值设置为第一值。在该像素点的视差值不满足第一预定条件的情况下,第三单元可以将第一视差图的第二权重分布图中在该位置处的像素点的权重值设置为第二值;其中,第一值大于第二值。其中的第一预定条件可以包括:该位置处的像素点的视差值大于该位置处的像素点的第一参考值。该位置处的像素点的第一参考值是根据第一视差图中该位置处的像素点的视差值以及大于零的常数值,设置的。
可选的,第三子模块还可以包括:第四单元。第四单元用于设置第二镜像图的第二权重分布图的第四单元。例如,第四单元根据第一视差图中的视差值,设置第二镜像图的第二权重分布图中的权重值。更具体的,对于第二镜像图中的任一位置处的像素点而言,在所述第一视差图中该位置处的像素点的视差值满足第二预定条件,则第四单元将第二镜像图的第二权重分布图中该位置处的像素点的权重值设置为第三值。在第一视差图中该位置处的像素点的视差值不满足第二预定条件的情况下,第四单元将第二镜像图的第二权重分布图中该位置处的像素点的权重值设置为第四值;其中,第三值大于第四值。其中的第二预定条件包括:第一视差图中该位置处的像素点的视差值大于该位置处的像素点的第二参考值;其中,该位置处的像素点的第二参考值是根据第一视差图的镜像视差图中在该位置处的像素点的视差值以及大于零的常数值,设置的。
可选的,第二单元可以根据第一视差图的第一权重分布图和第二权重分布图,调整第一视差图中的视差值;并根据第二镜像图的第一权重分布图和第二权重分布图,调整第二镜像图中的视差值;第二单元将视差调整后的第一视差图和视差值调整后的第二镜像图进行合并后,最终获得经视差调整后的第一视差图。
获取模块2400所包括的各部分具体执行的操作可以参见上述方法实施例中针对S100的描述,在此不再详细说明。
第一确定模块2410用于在环境图像的第一视差图中确定出多个障碍物像素区域。第一确定模块2410可以包括:第四子模块、第五子模块和第六子模块。第四子模块用于对环境图像的第一视差图进行边缘检测,获得障碍物边缘信息。第五子模块用于确定环境图像的第一视差图中的障碍物区域;第六子模块用于根据障碍物边缘信息,在第一视差图的障碍物区域中,确定多个障碍物像素柱状区域。其中的第五子模块可以包括:第五单元、第六单元、第七单元以及第八单元。第五单元用于对第一视差图中每行像素点的视差值进行统计处理,得到对每行像素点的视差值的统计信息。第六单元用于基于对每行像素点的视差值的统计信息,确定统计视差图;第七单元用于对统计视差图进行第一直线拟合处理,根据第一直线拟合处理的结果确定地面区域和非地面区域;第八单元用于根据非地面区域,确定障碍物区域。其中,非地面区域包括:高于地面的第一区域。非地面区域包括:高于地面的第一 区域和低于地面的第二区域。第八单元可以对统计视差图进行第二直线拟合处理,根据第二直线拟合处理的结果,确定第一区域中的高于地面的高度小于第一预定高度值的第一目标区域,第一目标区域为障碍物区域;在非地面区域存在低于地面的第二区域的情况下,第八单元确定第二区域中低于地面的高度大于第二预定高度值的第二目标区域,第二目标区域为障碍物区域。
可选的,第六子模块可以将第一视差图中的非障碍物区域的像素点的视差值以及障碍物边缘信息处的像素点的视差值设置为预定值;第六子模块以第一视差图的列方向的N个像素点作为遍历单位,从第一视差图的设定行起遍历每行上N个像素点的视差值,确定像素点的视差值存在所述预定值和非预定值之间的跳变的目标行;第六子模块用于以列方向上的N个像素点作为柱宽度、以确定出的目标行作为障碍物像素柱状区域在行方向上的边界,确定障碍物区域中的障碍物像素柱状区域。
第一确定模块2410所包括的各部分具体执行的操作可以参见上述方法实施例中针对S110的描述,在此不再详细说明。
聚类模块2420用于对多个障碍物像素区域进行聚类处理,获得至少一个类簇。例如,聚类模块2420可以对多个障碍物像素柱状区域进行聚类处理。聚类模块2420可以包括第七子模块和第八子模块。第七子模块用于确定多个障碍物像素柱状区域的空间位置信息。第八子模块用于根据多个障碍物像素柱状区域的空间位置信息,对多个障碍物像素柱状区域进行聚类处理。例如,针对任一障碍物像素柱状区域而言,第八子模块根据该障碍物像素柱状区域所包含的像素,确定该障碍物像素柱状区域的属性信息,并根据该障碍物像素柱状区域的属性信息,确定该障碍物像素柱状区域的空间位置信息。其中的障碍物像素柱状区域的属性信息可以包括:像素柱状区域底部信息、像素柱状区域顶部信息、像素柱状区域视差值、以及像素柱状区域列信息中的至少一个。其中的障碍物像素柱状区域的空间位置信息可以包括:障碍物像素柱状区域在水平方向坐标轴上的坐标、障碍物像素柱状区域在深度方向坐标轴上的坐标。障碍物像素柱状区域的空间位置信息还可以包括:障碍物像素柱状区域在竖直方向坐标轴上的最高点坐标、以及障碍物像素柱状区域在竖直方向坐标轴上的最低点坐标;其中的最高点坐标和最低点坐标用于确定障碍物高度。聚类模块2420所包括的各部分具体执行的操作可以参见上述方法实施例中针对S120的描述,在此不再详细说明。
第二确定模块2430用于根据属于同一个类簇的障碍物像素区域,确定障碍物检测结果。第二确定模块可以包括:第九子模块和第十子模块中的至少一个。第九子模块用于根据属于同一个类簇的障碍物像素柱状区域的空间位置信息,确定环境图像中的障碍物检测框。第十子模块用于根据属于同一个类簇的障碍物像素柱状区域的空间位置信息,确定障碍物的空间位置信息。例如,第十子模块可以根据属于同一个类簇的多个障碍物像素柱状区域的空间位置信息,确定多个障碍物像素柱状区域与生成环境图像的摄像装置之间的距离;并根据距离摄像装置最近的障碍物像素柱状区域的空间位置信息,确定障碍物的空间位置信息。第二确定模块2430所包括的各部分具体执行的操作可以参见上述方法实施例中针对S130的描述,在此不再详细说明。
训练模块2440用于训练卷积神经网络的训练模块。例如,训练模块2440将双目图像样本中的其中一目图像样本输入至待训练的卷积神经网络中,经由卷积神经网络进行视差分析处理,基于卷积神经网络的输出,获得左目图像样本的视差图和右目图像样本的视差图;训练模块2440根据左目图像样本以及右目图像样本的视差图重建右目图像;训练模块2440根据右目图像样本以及左目图像样本的视差图重建左目图像;训练模块2440根据重建的左目图像和左目图像样本之间的差异、以及重建的右目图像和右目图像样本之间的差异,调整所述卷积神经网络的网络参数。训练模块2440执行的具体操作可以参见上述针对图22的描述,在此不再详细说明。
图25为本公开的智能驾驶控制装置的一个实施例的结构示意图。图25中的装置包括:获取模块2500、障碍物检测装置2510以及控制模块2520。
获取模块2500用于通过智能设备上设置的图像采集装置获取所述智能设备在移动过程中的环境图像。障碍物检测装置2510用于对环境图像进行障碍物检测,确定障碍物检测结果。控制模块2520用于根据障碍物检测结果生成并输出车辆的控制指令。
示例性设备
图26示出了适于实现本公开的示例性设备2600,设备2600可以是汽车中配置的控制系统/电子系统、移动终端(例如,智能移动电话等)、个人计算机(PC,例如,台式计算机或者笔记型计算机等)、平板电脑以及服务器等。图26中,设备2600包括一个或者多个处理器、通信部等,所述一个或者多个处理器可以为:一个或者多个中央处理单元(CPU)2601,和/或,一个或者多个利用神经网络进行视觉跟踪的图像处理器(GPU)2613等,处理器可以根据存储在只读存储器(ROM)2602中的可执行指令或者从存储部分2608加载到随机访问存储器(RAM)2603中的可执行指令而执行各种 适当的动作和处理。通信部2612可以包括但不限于网卡,所述网卡可以包括但不限于IB(Infiniband)网卡。处理器可与只读存储器2602和/或随机访问存储器2603中通信以执行可执行指令,通过总线2604与通信部2612相连、并经通信部2612与其他目标设备通信,从而完成本公开中的相应步骤。
上述各指令所执行的操作可以参见上述方法实施例中的相关描述,在此不再详细说明。此外,在RAM 2603中,还可以存储有装置操作所需的各种程序以及数据。CPU2601、ROM2602以及RAM2603通过总线2604彼此相连。在有RAM2603的情况下,ROM2602为可选模块。RAM2603存储可执行指令,或在运行时向ROM2602中写入可执行指令,可执行指令使中央处理单元2601执行上述方法所包括的步骤。输入/输出(I/O)接口2605也连接至总线2604。通信部2612可以集成设置,也可以设置为具有多个子模块(例如,多个IB网卡),并分别与总线连接。以下部件连接至I/O接口2605:包括键盘、鼠标等的输入部分2606;包括诸如阴极射线管(CRT)、液晶显示器(LCD)等以及扬声器等的输出部分2607;包括硬盘等的存储部分2608;以及包括诸如LAN卡、调制解调器等的网络接口卡的通信部分2609。通信部分2609经由诸如因特网的网络执行通信处理。驱动器2610也根据需要连接至I/O接口2605。可拆卸介质2611,诸如磁盘、光盘、磁光盘、半导体存储器等等,根据需要安装在驱动器2610上,以便于从其上读出的计算机程序根据需要被安装在存储部分2608中。
需要特别说明的是，如图26所示的架构仅为一种可选实现方式，在具体实践过程中，可根据实际需要对上述图26的部件数量和类型进行选择、删减、增加或替换；在不同功能部件设置上，也可采用分离设置或集成设置等实现方式，例如，GPU2613和CPU2601可分离设置，再如，可将GPU2613集成在CPU2601上，通信部可分离设置，也可集成设置在CPU2601或GPU2613上等。这些可替换的实施方式均落入本公开的保护范围。
特别地,根据本公开的实施方式,下文参考流程图描述的过程可以被实现为计算机软件程序,例如,本公开实施方式包括一种计算机程序产品,其包含有形地包含在机器可读介质上的计算机程序,计算机程序包含用于执行流程图所示的步骤的程序代码,程序代码可包括对应执行本公开提供的方法中的步骤对应的指令。在这样的实施方式中,该计算机程序可以通过通信部分2609从网络上被下载及安装,和/或从可拆卸介质2611被安装。在该计算机程序被中央处理单元(CPU)2601执行时,执行本公开中记载的实现上述相应步骤的指令。在一个或多个可选实施方式中,本公开实施例还提供了一种计算机程序程序产品,用于存储计算机可读指令,所述指令被执行时使得计算机执行上述任意实施例中所述的障碍物检测方法或者智能驾驶控制方法。
该计算机程序产品可以具体通过硬件、软件或其结合的方式实现。在一个可选例子中,所述计算机程序产品具体体现为计算机存储介质,在另一个可选例子中,所述计算机程序产品具体体现为软件产品,例如软件开发包(Software Development Kit,SDK)等等。在一个或多个可选实施方式中,本公开实施例还提供了另一种障碍物检测方法和智能驾驶控制方法及其对应的装置和电子设备、计算机存储介质、计算机程序以及计算机程序产品,其中的方法包括:第一装置向第二装置发送障碍物检测指示或者智能驾驶控制指示,该指示使得第二装置执行上述任一可能的实施例中的障碍物检测方法或者智能驾驶控制方法;第一装置接收第二装置发送的障碍物检测结果或者智能驾驶控制结果。
在一些实施例中，该障碍物检测指示或者智能驾驶控制指示可以具体为调用指令，第一装置可以通过调用的方式指示第二装置执行障碍物检测操作或者智能驾驶控制操作，相应地，响应于接收到调用指令，第二装置可以执行上述障碍物检测方法或者智能驾驶控制方法中的任意实施例中的步骤和/或流程。
应理解，本公开实施例中的“第一”、“第二”等术语仅仅是为了区分，而不应理解成对本公开实施例的限定。还应理解，在本公开中，“多个”可以指两个或两个以上，“至少一个”可以指一个、两个或两个以上。还应理解，对于本公开中提及的任一部件、数据或结构，在没有明确限定或者在前后文给出相反启示的情况下，一般可以理解为一个或多个。还应理解，本公开对各个实施例的描述着重强调各个实施例之间的不同之处，其相同或相似之处可以相互参考，为了简洁，不再一一赘述。
可能以许多方式来实现本公开的方法和装置、电子设备以及计算机可读存储介质。例如，可通过软件、硬件、固件或者软件、硬件、固件的任何组合来实现本公开的方法和装置、电子设备以及计算机可读存储介质。用于方法的步骤的上述顺序仅是为了进行说明，本公开的方法的步骤不限于以上具体描述的顺序，除非以其它方式特别说明。此外，在一些实施方式中，还可将本公开实施为记录在记录介质中的程序，这些程序包括用于实现根据本公开的方法的机器可读指令。因而，本公开还覆盖存储用于执行根据本公开的方法的程序的记录介质。
本公开的描述，是为了示例和描述起见而给出的，而并不是无遗漏的或者将本公开限于所公开的形式。很多修改和变化对于本领域的普通技术人员而言，是显然的。选择和描述实施方式是为了更好地说明本公开的原理以及实际应用，并且使本领域的普通技术人员能够理解本公开，从而设计适于特定用途的带有各种修改的各种实施方式。

Claims (69)

  1. 一种障碍物检测方法,其特征在于,包括:
    获取环境图像的第一视差图,所述环境图像为表征智能设备在移动过程中所处的空间环境信息的图像;
    在所述环境图像的第一视差图中确定出多个障碍物像素区域;
    对所述多个障碍物像素区域进行聚类处理,获得至少一个类簇;
    根据属于同一个类簇的障碍物像素区域,确定障碍物检测结果。
  2. 根据权利要求1所述的方法,其特征在于,所述环境图像包括单目图像;
    在获得环境图像的第一视差图之后,还包括:
    将所述单目图像进行镜像处理后,得到第一镜像图,并获取所述第一镜像图的视差图;
    根据所述第一镜像图的视差图,对所述单目图像的第一视差图进行视差调整,获得经视差调整后的第一视差图;
    所述在所述环境图像的第一视差图中确定出多个障碍物像素区域,包括:
    在所述经视差调整后的第一视差图中确定出多个障碍物像素区域。
  3. 根据权利要求2所述的方法,其特征在于,所述根据所述第一镜像图的视差图,对所述单目图像的第一视差图进行视差调整,获得经视差调整后的第一视差图,包括:
    将所述第一镜像图的视差图进行镜像处理后,得到第二镜像图;
    根据所述第一视差图的权重分布图、以及所述第二镜像图的权重分布图,对所述第一视差图进行视差调整,获得经视差调整后的第一视差图;
    其中,所述第一视差图的权重分布图包括表示所述第一视差图中多个视差值各自对应的权重值;所述第二镜像图的权重分布图包括所述第二镜像图中多个视差值各自对应的权重。
  4. 根据权利要求3所述的方法,其特征在于,所述权重分布图包括:第一权重分布图,和/或,第二权重分布图;
    所述第一权重分布图是针对多个环境图像统一设置的权重分布图;
    所述第二权重分布图是针对不同环境图像分别设置的权重分布图。
  5. 根据权利要求4所述的方法,其特征在于,所述第一权重分布图包括至少两个左右分列的区域,不同区域具有不同的权重值。
  6. 根据权利要求5所述的方法,其特征在于,在所述单目图像为左目图像的情况下:
    对于所述第一视差图的第一权重分布图中的任意两个区域而言,位于右侧的区域的权重值不小于位于左侧的区域的权重值;
    对于所述第二镜像图的第一权重分布图中的任意两个区域而言,位于右侧的区域的权重值不小于位于左侧的区域的权重值。
  7. 根据权利要求6所述的方法,其特征在于:
    对于所述第一视差图的第一权重分布图中的至少一区域而言,该区域中左侧部分的权重值不大于该区域中的右侧部分的权重值;
    对于所述第二镜像图的第一权重分布图中的至少一区域而言,该区域中左侧部分的权重值不大于该区域中右侧部分的权重值。
  8. 根据权利要求5所述的方法,其特征在于,在所述单目图像为右目图像的情况下:
    对于所述第一视差图的第一权重分布图中的任意两个区域而言,位于左侧的区域的权重值不小于位于右侧的区域的权重值;
    对于所述第二镜像图的第一权重分布图中的任意两个区域而言,位于左侧的区域的权重值不小于位于右侧的区域的权重值。
  9. 根据权利要求8所述的方法,其特征在于:
    对于所述第一视差图的第一权重分布图中的至少一区域而言,该区域中右侧部分的权重值不大于该区域中左侧部分的权重值;
    对于所述第二镜像图的第一权重分布图中的至少一区域而言,该区域中右侧部分的权重值不大于该区域中左侧部分的权重值。
  10. 根据权利要求4至9中任一项所述的方法,其特征在于,所述第一视差图的第二权重分布图的设置方式包括:
    对所述第一视差图进行镜像处理,形成镜像视差图;
    根据所述第一视差图的镜像视差图中的视差值,设置所述第一视差图的第二权重分布图中的权重值。
  11. 根据权利要求10所述的方法,其特征在于,所述根据所述第一视差图的镜像视差图中的视差值,设置所述第一视差图的第二权重分布图中的权重值,包括:
    对于所述镜像视差图中的任一位置处的像素点而言,在该位置处的像素点的视差值满足第一预定条件的情况下,将所述第一视差图的第二权重分布图中在该位置处的像素点的权重值设置为第一值。
  12. 根据如权利要求11所述的方法,其特征在于,所述方法还包括:
    在该位置处的像素点的视差值不满足所述第一预定条件的情况下,将所述第一视差图的第二权重分布图中在该位置处的像素点的权重值设置为第二值;其中,所述第一值大于所述第二值。
  13. 根据权利要求11或12所述的方法,其特征在于,所述第一预定条件包括:该位置处的像素点的视差值大于该位置处的像素点的第一参考值;
    其中,该位置处的像素点的第一参考值是根据所述第一视差图中该位置处的像素点的视差值以及大于零的常数值,设置的。
  14. 根据权利要求4至13中任一项所述的方法,其特征在于,所述第二镜像图的第二权重分布图的设置方式包括:
    根据所述第一视差图中的视差值,设置所述第二镜像图的第二权重分布图中的权重值。
  15. 根据如权利要求14所述的方法,其特征在于,所述根据所述第一视差图中的视差值,设置所述第二镜像图的第二权重分布图中的权重值,包括:
    对于所述第二镜像图中的任一位置处的像素点而言,在所述第一视差图中该位置处的像素点的视差值满足第二预定条件,则将所述第二镜像图的第二权重分布图中该位置处的像素点的权重值设置为第三值。
  16. 根据如权利要求15所述的方法,其特征在于,所述方法还包括:
    在所述第一视差图中该位置处的像素点的视差值不满足第二预定条件的情况下,将所述第二镜像图的第二权重分布图中该位置处的像素点的权重值设置为第四值;其中,所述第三值大于第四值。
  17. 根据权利要求15或16所述的方法,其特征在于,所述第二预定条件包括:所述第一视差图中该位置处的像素点的视差值大于该位置处的像素点的第二参考值;
    其中,该位置处的像素点的第二参考值是根据所述第一视差图的镜像视差图中在该位置处的像素点的视差值以及大于零的常数值,设置的。
  18. 根据权利要求4至17任一所述的方法,其特征在于,所述根据所述第一视差图的权重分布图、以及所述第二镜像图的权重分布图,对所述第一视差图进行视差调整,获得经视差调整后的第一视差图,包括:
    根据所述第一视差图的第一权重分布图和第二权重分布图,调整所述第一视差图中的视差值;
    根据所述第二镜像图的第一权重分布图和第二权重分布图,调整所述第二镜像图中的视差值;
    将视差调整后的第一视差图和视差值调整后的第二镜像图进行合并,最终获得经视差调整后的第一视差图。
  19. 根据权利要求1所述的方法,其特征在于,所述环境图像包括单目图像;
    所述获取环境图像的第一视差图,包括:
    利用卷积神经网络对所述单目图像进行视差分析处理,基于所述卷积神经网络的输出,获得所述单目图像的第一视差图;
    其中,所述卷积神经网络是利用双目图像样本,训练获得的。
  20. 根据权利要求19所述的方法,其特征在于,所述卷积神经网络的训练过程,包括:
    将双目图像样本中的其中一目图像样本输入至待训练的卷积神经网络中,经由所述卷积神经网络进行视差分析处理,基于所述卷积神经网络的输出,获得左目图像样本的视差图和右目图像样本的视差图;
    根据所述左目图像样本以及右目图像样本的视差图重建右目图像;
    根据所述右目图像样本以及左目图像样本的视差图重建左目图像;
    根据重建的左目图像和左目图像样本之间的差异、以及重建的右目图像和右目图像样本之间的差异,调整所述卷积神经网络的网络参数。
  21. 根据权利要求1至20中任一项所述的方法,其特征在于,所述在所述环境图像的第一视差图中确定出多个障碍物像素区域,包括:
    对所述环境图像的第一视差图进行边缘检测,获得障碍物边缘信息;
    确定所述环境图像的第一视差图中的障碍物区域;
    根据所述障碍物边缘信息,在所述障碍物区域中,确定多个障碍物像素柱状区域。
  22. 根据权利要求21所述的方法,其特征在于,所述确定所述环境图像的第一视差图中的障碍物区域,包括:
    对所述第一视差图中每行像素点的视差值进行统计处理,得到对每行像素点的视差值的统计信息;
    基于对每行像素点的视差值的统计信息,确定统计视差图;
    对所述统计视差图进行第一直线拟合处理,根据所述第一直线拟合处理的结果确定地面区域和非地面区域;
    根据所述非地面区域,确定障碍物区域。
  23. 根据权利要求22所述的方法,其特征在于,所述非地面区域包括:高于地面的第一区域;或者,所述非地面区域包括:高于地面的第一区域和低于地面的第二区域。
  24. 根据权利要求23所述的方法,其特征在于,所述根据所述非地面区域,确定障碍物区域,包括:
    对所述统计视差图进行第二直线拟合处理,根据所述第二直线拟合处理的结果,确定所述第一区域中的高于地面的高度小于第一预定高度值的第一目标区域,所述第一目标区域为障碍物区域;
    在所述非地面区域存在低于地面的第二区域的情况下,确定所述第二区域中低于地面的高度大于第二预定高度值的第二目标区域,所述第二目标区域为障碍物区域。
  25. 根据权利要求21至24中任一项所述的方法,其特征在于,所述根据所述障碍物边缘信息,在所述第一视差图的障碍物区域中,确定多个障碍物像素柱状区域,包括:
    将所述第一视差图中的非障碍物区域的像素点的视差值以及所述障碍物边缘信息处的像素点的视差值设置为预定值;
    以所述第一视差图的列方向的N个像素点作为遍历单位,从所述第一视差图的设定行起遍历每行上N个像素点的视差值,确定像素点的视差值存在所述预定值和非预定值之间的跳变的目标行;N为正整数;
    以列方向上的N个像素点作为柱宽度、以确定出的目标行作为所述障碍物像素柱状区域在行方向上的边界,确定所述障碍物区域中的障碍物像素柱状区域。
  26. 根据权利要求1至25中任一项所述的方法,其特征在于,所述障碍物像素区域包括障碍物像素柱状区域;
    所述对所述多个障碍物像素区域进行聚类处理,包括:
    确定所述多个障碍物像素柱状区域的空间位置信息;
    根据所述多个障碍物像素柱状区域的空间位置信息,对所述多个障碍物像素柱状区域进行聚类处理。
  27. 根据权利要求26所述的方法,其特征在于,所述确定所述多个障碍物像素柱状区域的空间位置信息,包括:
    针对任一障碍物像素柱状区域而言,根据该障碍物像素柱状区域所包含的像素,确定该障碍物像素柱状区域的属性信息,并根据该障碍物像素柱状区域的属性信息,确定该障碍物像素柱状区域的空间位置信息。
  28. 根据权利要求27所述的方法,其特征在于,所述障碍物像素柱状区域的属性信息包括:像素柱状区域底部信息、像素柱状区域顶部信息、像素柱状区域视差值、以及像素柱状区域列信息中的至少一个。
  29. 根据权利要求26至28中任一项所述的方法,其特征在于,所述障碍物像素柱状区域的空间位置信息包括:障碍物像素柱状区域在水平方向坐标轴上的坐标、障碍物像素柱状区域在深度方向坐标轴上的坐标。
  30. 根据权利要求29所述的方法,其特征在于,所述障碍物像素柱状区域的空间位置信息还包括:障碍物像素柱状区域在竖直方向坐标轴上的最高点坐标、以及障碍物像素柱状区域在竖直方向坐标轴上的最低点坐标;
    所述最高点坐标和所述最低点坐标用于确定障碍物高度。
  31. 根据权利要求1至30中任一项所述的方法，其特征在于，所述障碍物像素区域包括：障碍物像素柱状区域；
    所述根据属于同一个类簇的障碍物像素区域,确定障碍物检测结果,包括:
    根据属于同一个类簇的障碍物像素柱状区域的空间位置信息,确定所述环境图像中的障碍物检测框;和/或
    根据属于同一个类簇的障碍物像素柱状区域的空间位置信息,确定障碍物的空间位置信息。
  32. 根据权利要求31所述的方法,其特征在于,所述根据属于同一个类簇的障碍物像素柱状区域的空间位置信息,确定障碍物的空间位置信息,包括:
    根据属于同一个类簇的多个障碍物像素柱状区域的空间位置信息,确定所述多个障碍物像素柱状区域与生成所述环境图像的摄像装置之间的距离;
    根据距离所述摄像装置最近的障碍物像素柱状区域的空间位置信息,确定障碍物的空间位置信息。
  33. 一种智能驾驶控制方法,其特征在于,包括:
    通过智能设备上设置的图像采集装置获取所述智能设备在移动过程中的环境图像;
    采用如权利要求1-32中任一项所述的方法,对获取的环境图像进行障碍物检测,确定障碍物检测结果;
    根据所述障碍物检测结果生成并输出控制指令。
  34. 一种障碍物检测装置,其特征在于,包括:
    获取模块,用于获取环境图像的第一视差图,所述环境图像为表征智能设备在移动过程中所处的空间环境信息的图像;
    第一确定模块,用于在所述环境图像的第一视差图中确定出多个障碍物像素区域;
    聚类模块,用于对所述多个障碍物像素区域进行聚类处理,获得至少一个类簇;
    第二确定模块,用于根据属于同一个类簇的障碍物像素区域,确定障碍物检测结果。
  35. 根据权利要求34所述的装置,其特征在于,所述获取模块还包括:
    第二子模块,用于将所述环境图像中的单目图像进行镜像处理后,得到第一镜像图,并获取所述第一镜像图的视差图;
    第三子模块,用于根据所述第一镜像图的视差图,对所述单目图像的第一视差图进行视差调整,获得经视差调整后的第一视差图;
    所述第一确定模块进一步用于:
    在所述经视差调整后的第一视差图中确定出多个障碍物像素区域。
  36. 根据权利要求35所述的装置,其特征在于,所述第三子模块,包括:
    第一单元,用于将所述第一镜像图的视差图进行镜像处理后,得到第二镜像图;
    第二单元,用于根据所述第一视差图的权重分布图、以及所述第二镜像图的权重分布图,对所述第一视差图进行视差调整,获得经视差调整后的第一视差图;
    其中,所述第一视差图的权重分布图包括表示所述第一视差图中多个视差值各自对应的权重值;所述第二镜像图的权重分布图包括所述第二镜像图中多个视差值各自对应的权重。
  37. 根据权利要求36所述的装置,其特征在于,所述权重分布图包括:第一权重分布图,和/或,第二权重分布图;
    所述第一权重分布图是针对多个环境图像统一设置的权重分布图;
    所述第二权重分布图是针对不同环境图像分别设置的权重分布图。
  38. 根据权利要求37所述的装置,其特征在于,所述第一权重分布图包括至少两个左右分列的区域,不同区域具有不同的权重值。
  39. 根据权利要求38所述的装置,其特征在于,在所述单目图像为左目图像的情况下:
    对于所述第一视差图的第一权重分布图中的任意两个区域而言,位于右侧的区域的权重值不小于位于左侧的区域的权重值;
    对于所述第二镜像图的第一权重分布图中的任意两个区域而言,位于右侧的区域的权重值不小于位于左侧的区域的权重值。
  40. 根据权利要求39所述的装置,其特征在于:
    对于所述第一视差图的第一权重分布图中的至少一区域而言,该区域中左侧部分的权重值不大于该区域中的右侧部分的权重值;
    对于所述第二镜像图的第一权重分布图中的至少一区域而言,该区域中左侧部分的权重值不大于该区域中右侧部分的权重值。
  41. 根据权利要求38所述的装置,其特征在于,在所述单目图像为右目图像的情况下:
    对于所述第一视差图的第一权重分布图中的任意两个区域而言,位于左侧的区域的权重值不小于 位于右侧的区域的权重值;
    对于所述第二镜像图的第一权重分布图中的任意两个区域而言,位于左侧的区域的权重值不小于位于右侧的区域的权重值。
  42. 根据权利要求41所述的装置,其特征在于:
    对于所述第一视差图的第一权重分布图中的至少一区域而言,该区域中右侧部分的权重值不大于该区域中左侧部分的权重值;
    对于所述第二镜像图的第一权重分布图中的至少一区域而言,该区域中右侧部分的权重值不大于该区域中左侧部分的权重值。
  43. 根据权利要求37至42中任一项所述的装置,其特征在于,所述第三子模块,还包括:用于设置第一视差图的第二权重分布图的第三单元;
    所述第三单元对所述第一视差图进行镜像处理,形成镜像视差图;并根据所述第一视差图的镜像视差图中的视差值,设置所述第一视差图的第二权重分布图中的权重值。
  44. 根据权利要求43所述的装置,其特征在于,所述第三单元进一步用于,对于所述镜像视差图中的任一位置处的像素点而言,在该位置处的像素点的视差值满足第一预定条件的情况下,将所述第一视差图的第二权重分布图中在该位置处的像素点的权重值设置为第一值。
  45. 根据如权利要求44所述的装置,其特征在于,所述第三单元还进一步用于,在该位置处的像素点的视差值不满足所述第一预定条件的情况下,将所述第一视差图的第二权重分布图中在该位置处的像素点的权重值设置为第二值;其中,所述第一值大于所述第二值。
  46. 根据权利要求44或45所述的装置,其特征在于,所述第一预定条件包括:该位置处的像素点的视差值大于该位置处的像素点的第一参考值;
    其中,该位置处的像素点的第一参考值是根据所述第一视差图中该位置处的像素点的视差值以及大于零的常数值,设置的。
  47. 根据权利要求37至46中任一项所述的装置,其特征在于,所述第三子模块还包括:用于设置第二镜像图的第二权重分布图的第四单元;
    所述第四单元根据第一视差图中的视差值,设置所述第二镜像图的第二权重分布图中的权重值。
  48. 根据如权利要求47所述的装置,其特征在于,所述第四单元进一步用于:
    对于所述第二镜像图中的任一位置处的像素点而言,在所述第一视差图中该位置处的像素点的视差值满足第二预定条件,则将所述第二镜像图的第二权重分布图中该位置处的像素点的权重值设置为第三值。
  49. 根据如权利要求48所述的装置,其特征在于,所述第四单元进一步用于,在所述第一视差图中该位置处的像素点的视差值不满足第二预定条件的情况下,将所述第二镜像图的第二权重分布图中该位置处的像素点的权重值设置为第四值;其中,所述第三值大于第四值。
  50. 根据权利要求48或49所述的装置,其特征在于,所述第二预定条件包括:所述第一视差图中该位置处的像素点的视差值大于该位置处的像素点的第二参考值;
    其中,该位置处的像素点的第二参考值是根据所述第一视差图的镜像视差图中在该位置处的像素点的视差值以及大于零的常数值,设置的。
  51. 根据权利要求37至50任一所述的装置,其特征在于,所述第二单元进一步用于:
    根据所述第一视差图的第一权重分布图和第二权重分布图,调整所述第一视差图中的视差值;
    根据所述第二镜像图的第一权重分布图和第二权重分布图,调整所述第二镜像图中的视差值;
    将视差调整后的第一视差图和视差值调整后的第二镜像图进行合并,最终获得经视差调整后的第一视差图。
  52. 根据权利要求34所述的装置,其特征在于,所述环境图像包括单目图像;
    所述获取模块包括:
    第一子模块,用于利用卷积神经网络对所述单目图像进行视差分析处理,基于所述卷积神经网络的输出,获得所述单目图像的第一视差图;
    其中,所述卷积神经网络是利用双目图像样本,训练获得的。
  53. 根据权利要求52所述的装置,其特征在于,所述装置还包括:用于训练卷积神经网络的训练模块,所述训练模块进一步用于:
    将双目图像样本中的其中一目图像样本输入至待训练的卷积神经网络中,经由所述卷积神经网络进行视差分析处理,基于所述卷积神经网络的输出,获得左目图像样本的视差图和右目图像样本的视差图;
    根据所述左目图像样本以及右目图像样本的视差图重建右目图像;
    根据所述右目图像样本以及左目图像样本的视差图重建左目图像;
    根据重建的左目图像和左目图像样本之间的差异、以及重建的右目图像和右目图像样本之间的差异,调整所述卷积神经网络的网络参数。
  54. 根据权利要求34至53中任一项所述的装置,其特征在于,所述第一确定模块,包括:
    第四子模块,用于对所述环境图像的第一视差图进行边缘检测,获得障碍物边缘信息;
    第五子模块,用于确定所述环境图像的第一视差图中的障碍物区域;
    第六子模块,用于根据所述障碍物边缘信息,在所述障碍物区域中,确定多个障碍物像素柱状区域。
  55. 根据权利要求54所述的装置,其特征在于,所述第五子模块,包括:
    第五单元,用于对所述第一视差图中每行像素点的视差值进行统计处理,得到对每行像素点的视差值的统计信息;
    第六单元,用于基于对每行像素点的视差值的统计信息,确定统计视差图;
    第七单元,用于对所述统计视差图进行第一直线拟合处理,根据所述第一直线拟合处理的结果确定地面区域和非地面区域;
    第八单元,用于根据所述非地面区域,确定障碍物区域。
  56. 根据权利要求55所述的装置,其特征在于,所述非地面区域包括:高于地面的第一区域;或者,所述非地面区域包括:高于地面的第一区域和低于地面的第二区域。
  57. 根据权利要求56所述的装置,其特征在于,所述第八单元进一步用于:
    对所述统计视差图进行第二直线拟合处理,根据所述第二直线拟合处理的结果,确定所述第一区域中的高于地面的高度小于第一预定高度值的第一目标区域,所述第一目标区域为障碍物区域;
    在所述非地面区域存在低于地面的第二区域的情况下,确定所述第二区域中低于地面的高度大于第二预定高度值的第二目标区域,所述第二目标区域为障碍物区域。
  58. 根据权利要求54至57中任一项所述的装置,其特征在于,所述第六子模块进一步用于:
    将所述第一视差图中的非障碍物区域的像素点的视差值以及所述障碍物边缘信息处的像素点的视差值设置为预定值;
    以所述第一视差图的列方向的N个像素点作为遍历单位,从所述第一视差图的设定行起遍历每行上N个像素点的视差值,确定像素点的视差值存在所述预定值和非预定值之间的跳变的目标行;N为正整数;
    以列方向上的N个像素点作为柱宽度、以确定出的目标行作为所述障碍物像素柱状区域在行方向上的边界,确定所述障碍物区域中的障碍物像素柱状区域。
  59. 根据权利要求34至58中任一项所述的装置,其特征在于,所述障碍物像素区域包括障碍物像素柱状区域;所述聚类模块,包括:
    第七子模块,用于确定所述多个障碍物像素柱状区域的空间位置信息;
    第八子模块,用于根据所述多个障碍物像素柱状区域的空间位置信息,对所述多个障碍物像素柱状区域进行聚类处理。
  60. 根据权利要求59所述的装置,其特征在于,所述第八子模块进一步用于:
    针对任一障碍物像素柱状区域而言,根据该障碍物像素柱状区域所包含的像素,确定该障碍物像素柱状区域的属性信息,并根据该障碍物像素柱状区域的属性信息,确定该障碍物像素柱状区域的空间位置信息。
  61. 根据权利要求60所述的装置,其特征在于,所述障碍物像素柱状区域的属性信息包括:像素柱状区域底部信息、像素柱状区域顶部信息、像素柱状区域视差值、以及像素柱状区域列信息中的至少一个。
  62. 根据权利要求59至61中任一项所述的装置,其特征在于,所述障碍物像素柱状区域的空间位置信息包括:障碍物像素柱状区域在水平方向坐标轴上的坐标、障碍物像素柱状区域在深度方向坐标轴上的坐标。
  63. 根据权利要求62所述的装置,其特征在于,所述障碍物像素柱状区域的空间位置信息还包括:障碍物像素柱状区域在竖直方向坐标轴上的最高点坐标、以及障碍物像素柱状区域在竖直方向坐标轴上的最低点坐标;
    所述最高点坐标和所述最低点坐标用于确定障碍物高度。
  64. 根据权利要求34至63中任一项所述的装置，其特征在于，所述障碍物像素区域包括：障碍物像素柱状区域；所述第二确定模块包括：
    第九子模块,用于根据属于同一个类簇的障碍物像素柱状区域的空间位置信息,确定所述环境图像中的障碍物检测框;和/或
    第十子模块,用于根据属于同一个类簇的障碍物像素柱状区域的空间位置信息,确定障碍物的空间位置信息。
  65. 根据权利要求64所述的装置,其特征在于,所述第十子模块进一步用于:
    根据属于同一个类簇的多个障碍物像素柱状区域的空间位置信息,确定所述多个障碍物像素柱状区域与生成所述环境图像的摄像装置之间的距离;
    根据距离所述摄像装置最近的障碍物像素柱状区域的空间位置信息,确定障碍物的空间位置信息。
  66. 一种智能驾驶控制装置,其特征在于,包括:
    获取模块,用于通过智能设备上设置的图像采集装置获取所述智能设备在移动过程中的环境图像;
    采用如权利要求34-65中任一项所述的装置,对所述环境图像进行障碍物检测,确定障碍物检测结果;
    控制模块,用于根据所述障碍物检测结果生成并输出控制指令。
  67. 一种电子设备,其特征在于,包括:
    存储器,用于存储计算机程序;
    处理器,用于执行所述存储器中存储的计算机程序,且所述计算机程序被执行时,实现上述权利要求1-33中任一项所述的方法。
  68. 一种计算机可读存储介质,其特征在于,其上存储有计算机程序,该计算机程序被处理器执行时,实现上述权利要求1-33中任一项所述的方法。
  69. 一种计算机程序,其特征在于,包括计算机指令,当所述计算机指令在设备的处理器中运行时,实现上述权利要求1-33中任一项所述的方法。
PCT/CN2019/120833 2019-06-27 2019-11-26 障碍物检测方法、智能驾驶控制方法、装置、介质及设备 WO2020258703A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
SG11202013264YA SG11202013264YA (en) 2019-06-27 2019-11-26 Obstacle detection method, intelligent driving control method, apparatus, medium, and device
JP2021513777A JP2021536071A (ja) 2019-06-27 2019-11-26 障害物検出方法、知的運転制御方法、装置、媒体、及び機器
KR1020217007268A KR20210043628A (ko) 2019-06-27 2019-11-26 장애물 감지 방법, 지능형 주행 제어 방법, 장치, 매체, 및 기기
US17/137,542 US20210117704A1 (en) 2019-06-27 2020-12-30 Obstacle detection method, intelligent driving control method, electronic device, and non-transitory computer-readable storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910566416.2A CN112149458A (zh) 2019-06-27 2019-06-27 障碍物检测方法、智能驾驶控制方法、装置、介质及设备
CN201910566416.2 2019-06-27

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/137,542 Continuation US20210117704A1 (en) 2019-06-27 2020-12-30 Obstacle detection method, intelligent driving control method, electronic device, and non-transitory computer-readable storage medium

Publications (1)

Publication Number Publication Date
WO2020258703A1 true WO2020258703A1 (zh) 2020-12-30

Family

ID=73868506

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/120833 WO2020258703A1 (zh) 2019-06-27 2019-11-26 障碍物检测方法、智能驾驶控制方法、装置、介质及设备

Country Status (6)

Country Link
US (1) US20210117704A1 (zh)
JP (1) JP2021536071A (zh)
KR (1) KR20210043628A (zh)
CN (1) CN112149458A (zh)
SG (1) SG11202013264YA (zh)
WO (1) WO2020258703A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113792583A (zh) * 2021-08-03 2021-12-14 北京中科慧眼科技有限公司 基于可行驶区域的障碍物检测方法、系统和智能终端

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102125538B1 (ko) * 2019-12-13 2020-06-22 주식회사 토르 드라이브 자율 주행을 위한 효율적인 맵 매칭 방법 및 그 장치
CN112733653A (zh) * 2020-12-30 2021-04-30 智车优行科技(北京)有限公司 目标检测方法和装置、计算机可读存储介质、电子设备
CN112631312B (zh) * 2021-03-08 2021-06-04 北京三快在线科技有限公司 一种无人设备的控制方法、装置、存储介质及电子设备
CN113269838B (zh) * 2021-05-20 2023-04-07 西安交通大学 一种基于fira平台的障碍物视觉检测方法
CN113747058B (zh) * 2021-07-27 2023-06-23 荣耀终端有限公司 基于多摄像头的图像内容屏蔽方法和装置
KR102623109B1 (ko) * 2021-09-10 2024-01-10 중앙대학교 산학협력단 합성곱 신경망 모델을 이용한 3차원 의료 영상 분석 시스템 및 방법
CN114119700B (zh) * 2021-11-26 2024-03-29 山东科技大学 一种基于u-v视差图的障碍物测距方法
CN115474032B (zh) * 2022-09-14 2023-10-03 深圳市火乐科技发展有限公司 投影交互方法、投影设备和存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104573646A (zh) * 2014-12-29 2015-04-29 长安大学 基于激光雷达和双目相机的车前行人检测方法及系统
CN105741312A (zh) * 2014-12-09 2016-07-06 株式会社理光 目标对象跟踪方法和设备
CN105866790A (zh) * 2016-04-07 2016-08-17 重庆大学 一种考虑激光发射强度的激光雷达障碍物识别方法及系统
CN108197698A (zh) * 2017-12-13 2018-06-22 中国科学院自动化研究所 基于多模态融合的多脑区协同自主决策方法

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100962329B1 (ko) * 2009-02-05 2010-06-10 연세대학교 산학협력단 스테레오 카메라 영상으로부터의 지면 추출 방법과 장치 및이와 같은 방법을 구현하는 프로그램이 기록된 기록매체
CN101701818B (zh) * 2009-11-05 2011-03-30 上海交通大学 远距离障碍的检测方法
CN105095905B (zh) * 2014-04-18 2018-06-22 株式会社理光 目标识别方法和目标识别装置
CN103955943A (zh) * 2014-05-21 2014-07-30 西安电子科技大学 基于融合变化检测算子与尺度驱动的无监督变化检测方法
CN106971348B (zh) * 2016-01-14 2021-04-30 阿里巴巴集团控股有限公司 一种基于时间序列的数据预测方法和装置
CN106157307B (zh) * 2016-06-27 2018-09-11 浙江工商大学 一种基于多尺度cnn和连续crf的单目图像深度估计方法
EP3505310B1 (en) * 2016-08-25 2024-01-03 LG Electronics Inc. Mobile robot and control method therefor
EP3736537A1 (en) * 2016-10-11 2020-11-11 Mobileye Vision Technologies Ltd. Navigating a vehicle based on a detected vehicle
CN106708084B (zh) * 2016-11-24 2019-08-02 中国科学院自动化研究所 复杂环境下无人机自动障碍物检测和避障方法
WO2018120040A1 (zh) * 2016-12-30 2018-07-05 深圳前海达闼云端智能科技有限公司 一种障碍物检测方法及装置
CN107729856B (zh) * 2017-10-26 2019-08-23 海信集团有限公司 一种障碍物检测方法及装置
CN108725440B (zh) * 2018-04-20 2020-11-27 深圳市商汤科技有限公司 前向碰撞控制方法和装置、电子设备、程序和介质
CN108961327B (zh) * 2018-05-22 2021-03-30 深圳市商汤科技有限公司 一种单目深度估计方法及其装置、设备和存储介质
CN109190704A (zh) * 2018-09-06 2019-01-11 中国科学院深圳先进技术研究院 障碍物检测的方法及机器人
CN109087346B (zh) * 2018-09-21 2020-08-11 北京地平线机器人技术研发有限公司 单目深度模型的训练方法、训练装置和电子设备
CN109508673A (zh) * 2018-11-13 2019-03-22 大连理工大学 一种基于棒状像素的交通场景障碍检测与识别方法

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105741312A (zh) * 2014-12-09 2016-07-06 株式会社理光 目标对象跟踪方法和设备
CN104573646A (zh) * 2014-12-29 2015-04-29 长安大学 基于激光雷达和双目相机的车前行人检测方法及系统
CN105866790A (zh) * 2016-04-07 2016-08-17 重庆大学 一种考虑激光发射强度的激光雷达障碍物识别方法及系统
CN108197698A (zh) * 2017-12-13 2018-06-22 中国科学院自动化研究所 基于多模态融合的多脑区协同自主决策方法

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113792583A (zh) * 2021-08-03 2021-12-14 北京中科慧眼科技有限公司 基于可行驶区域的障碍物检测方法、系统和智能终端

Also Published As

Publication number Publication date
JP2021536071A (ja) 2021-12-23
US20210117704A1 (en) 2021-04-22
CN112149458A (zh) 2020-12-29
SG11202013264YA (en) 2021-01-28
KR20210043628A (ko) 2021-04-21

Similar Documents

Publication Publication Date Title
WO2020258703A1 (zh) 障碍物检测方法、智能驾驶控制方法、装置、介质及设备
WO2020108311A1 (zh) 目标对象3d检测方法、装置、介质及设备
EP3295426B1 (en) Edge-aware bilateral image processing
US10353271B2 (en) Depth estimation method for monocular image based on multi-scale CNN and continuous CRF
WO2020029758A1 (zh) 对象三维检测及智能驾驶控制的方法、装置、介质及设备
US9384556B2 (en) Image processor configured for efficient estimation and elimination of foreground information in images
US11049270B2 (en) Method and apparatus for calculating depth map based on reliability
US20140161359A1 (en) Method for detecting a straight line in a digital image
CN110543858A (zh) 多模态自适应融合的三维目标检测方法
WO2020238008A1 (zh) 运动物体检测及智能驾驶控制方法、装置、介质及设备
US11049275B2 (en) Method of predicting depth values of lines, method of outputting three-dimensional (3D) lines, and apparatus thereof
US20210124928A1 (en) Object tracking methods and apparatuses, electronic devices and storage media
CN110570457A (zh) 一种基于流数据的三维物体检测与跟踪方法
Jia et al. Real-time obstacle detection with motion features using monocular vision
CN114926747A (zh) 一种基于多特征聚合与交互的遥感图像定向目标检测方法
US20210078597A1 (en) Method and apparatus for determining an orientation of a target object, method and apparatus for controlling intelligent driving control, and device
KR102262671B1 (ko) 비디오 영상에 보케 효과를 적용하는 방법 및 기록매체
CN114627173A (zh) 通过差分神经渲染进行对象检测的数据增强
US20100014716A1 (en) Method for determining ground line
EP4323952A1 (en) Semantically accurate super-resolution generative adversarial networks
CN115147809B (zh) 一种障碍物检测方法、装置、设备以及存储介质
He et al. A novel way to organize 3D LiDAR point cloud as 2D depth map height map and surface normal map
JP2013164643A (ja) 画像認識装置、画像認識方法および画像認識プログラム
CN113284221B (zh) 一种目标物检测方法、装置及电子设备
CN111765892B (zh) 一种定位方法、装置、电子设备及计算机可读存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19935079

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 20217007268

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2021513777

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 31.03.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 19935079

Country of ref document: EP

Kind code of ref document: A1