CN111179300A - Method, apparatus, system, device and storage medium for obstacle detection

Publication number: CN111179300A
Application number: CN201911295246.5A
Authority: CN (China)
Legal status: Pending
Language: Chinese (zh)
Inventors: 冯家政, 程邦胜, 方晓波, 张辉
Assignee: Newpoint Enterprise Management Group Co Ltd
Prior art keywords: obstacle, information, dimensional, image, plane

Classifications

    • G06T7/10 Segmentation; Edge detection
    • G06T7/194 Segmentation involving foreground-background segmentation
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/10012 Stereo images

Abstract

The application provides a method, an apparatus, a device, a system and a storage medium for obstacle detection, relating to the technical field of smart devices, and aims to obtain three-dimensional information of obstacles using only a monocular camera. Foreground segmentation is performed on a road condition image acquired by the monocular camera to obtain a foreground region image; obstacle extraction is performed on the foreground region image to obtain plane information of a target obstacle in camera coordinates; and the plane information is substituted into a preset plane equation based on inverse computer vision projection transformation to obtain a three-dimensional frame of the target obstacle.

Description

Method, apparatus, system, device and storage medium for obstacle detection
Technical Field
The present application relates to the field of smart device technologies, and in particular, to a method, an apparatus, a device, a system, and a storage medium for obstacle detection.
Background
In commercial fields such as new urban infrastructure and future communities, the demand for vehicle-road cooperation, automatic driving, assisted driving and the like is increasingly urgent. During the driving of an unmanned vehicle, obstacles on the road must be detected in real time so that obstacle-avoidance and similar operations can be performed according to the obstacle information.
In the prior art, obstacles on the driving road of an unmanned vehicle are detected based on a laser radar, a millimeter wave radar, a camera, or a combination of multiple sensors. However, the millimeter wave radar is large, inconvenient to install, and low in bandwidth and resolution, so its sensed data are not accurate enough and it cannot obtain the height of an obstacle. The laser radar can obtain accurate three-dimensional data of an obstacle, but it is expensive to manufacture, difficult to mass-produce, and its data accuracy is easily affected by sensing distance and environment. When multiple sensors are combined to measure obstacles, the data collected by the different sensors must be synchronized: software synchronization introduces large errors, hardware synchronization requires extra hardware equipment, and the fusion algorithm for multiple sensors is complex to design, so the computation is heavy and places high demands on processor capability.
A camera, by contrast, is low in cost and rich in pixel semantics, and the related image algorithms are simple and mature. At present, however, only two-dimensional information of an obstacle can be obtained from a camera: the positioning accuracy is low, the obstacle cannot be mapped into actual three-dimensional space, and the calculation of the obstacle's speed information is affected.
Disclosure of Invention
The embodiment of the application provides a method, a device, equipment, a system and a storage medium for detecting obstacles, and aims to obtain three-dimensional information of the obstacles by using only monocular camera detection.
A first aspect of an embodiment of the present application provides a method for obstacle detection, where the method includes:
carrying out foreground segmentation on the road condition image acquired by the monocular camera to obtain a foreground region image;
extracting obstacles from the foreground area image to obtain plane information of the target obstacles under the camera coordinates;
and substituting the plane information into a preset plane equation based on inverse computer vision projection transformation to obtain the three-dimensional frame of the target obstacle.
Optionally, the method further comprises:
summarizing a measured bottom surface three-dimensional coordinate point set of a reference obstacle;
substituting the bottom surface three-dimensional coordinate point set into a first preset equation;
and calculating the first preset equation with a random sample consensus algorithm to obtain the preset plane equation.
Optionally, substituting the plane information into a preset plane equation based on inverse computer vision projection transformation to obtain a three-dimensional frame of the target obstacle includes:
performing inverse computer vision projection transformation on the preset plane equation into which the plane information has been substituted, to obtain depth information of the target obstacle;
combining the plane information and the depth information to obtain three-dimensional information of the target obstacle;
and geometrically drawing the three-dimensional information to obtain the three-dimensional frame.
Optionally, the method further comprises:
marking the plane information under the camera coordinate on the first sample obstacle image by using a marking tool;
training a first preset model based on a plurality of first sample obstacle images carrying labels to obtain an obstacle extraction model;
extracting obstacles from the foreground area image, including:
and inputting the foreground area image into the obstacle extraction model to obtain plane information of the target obstacle under the camera coordinates.
Optionally, the method further comprises:
acquiring a second sample obstacle image under a specific environment, wherein the specific environment is an environment in which the obstacle extraction model is not applicable;
marking the second sample obstacle image with plane information under the camera coordinates by using a marking tool;
performing secondary training on the obstacle extraction model based on a plurality of second sample obstacle images carrying labels;
obtaining a stable obstacle extraction model when the obstacle extraction model adapts to the specific environment;
redesigning the first preset model when the obstacle extraction model does not adapt to the specific environment.
Optionally, after obtaining the three-dimensional frame of the target obstacle, the method further includes:
and obtaining the displacement, rotation and scale information of the target obstacle according to the three-dimensional frame of the target obstacle.
A second aspect of the embodiments of the present application provides an apparatus for obstacle detection, including:
the foreground segmentation module is used for carrying out foreground segmentation on the road condition image acquired by the monocular camera to obtain a foreground region image;
the obstacle extraction module is used for extracting obstacles from the foreground area image and selecting a target obstacle by using a two-dimensional image plane frame, wherein the two-dimensional image plane frame carries information of the target obstacle in a camera coordinate system;
the conversion module is used for performing inverse computer vision projection transformation on the two-dimensional image plane frame to obtain a front plane frame of the target obstacle in the camera coordinate system;
and the first calculation module is used for obtaining a three-dimensional frame of the target obstacle according to the front plane frame and a preset plane equation.
Optionally, the apparatus further comprises:
the measuring module is used for summarizing the measured bottom surface three-dimensional coordinate point set of the reference obstacle;
the substituting module is used for substituting the bottom surface three-dimensional coordinate point set into a first preset equation;
and the second calculation module is used for calculating the first preset equation with a random sample consensus algorithm to obtain the preset plane equation.
Optionally, the first computing module comprises:
the coordinate obtaining submodule is used for obtaining the vertex coordinates of the front plane frame;
the depth calculation submodule is used for substituting the vertex coordinates of the front plane frame into the preset plane equation to obtain the depth point coordinates of the obstacle;
and the geometric drawing submodule is used for geometrically drawing the vertex coordinates and the depth point coordinates of the front plane frame to obtain a three-dimensional frame of the target obstacle.
Optionally, the apparatus further comprises:
the first sample marking module is used for marking, with a marking tool, the plane information under the camera coordinates on the first sample obstacle image;
the first model training module is used for training a first preset model based on a plurality of first sample obstacle images carrying labels to obtain an obstacle extraction model;
the obstacle extraction module includes:
and the model calculation submodule is used for inputting the foreground area image into the obstacle extraction model to obtain the two-dimensional image plane frame.
Optionally, the apparatus further comprises:
a sample obtaining module, configured to acquire a second sample obstacle image in a specific environment, where the specific environment is an environment in which the obstacle extraction model is not applicable;
the second sample marking module is used for marking the plane information of the second sample obstacle image under the camera coordinate by using a marking tool;
the second model training module is used for carrying out secondary training on the obstacle extraction model based on a plurality of labeled second sample obstacle images;
the third model training module is used for obtaining a stable obstacle extraction model when the obstacle extraction model is adapted to the specific environment;
a model design module for redesigning the first preset model when the obstacle extraction model is not adapted to the specific environment.
Optionally, the apparatus further comprises:
and the nine-degree-of-freedom information module is used for obtaining the displacement, rotation and scale information of the target obstacle according to the three-dimensional frame of the target obstacle.
A third aspect of embodiments of the present application provides a system for obstacle detection, where the system includes: the device comprises a monocular camera, a computing unit, an information transmission unit and a matched power supply;
the monocular camera is used for acquiring road condition images and sending the road condition images to the computing unit;
the computing unit executes the method for obstacle detection according to the first aspect of the present application, and acquires two-dimensional information and three-dimensional information of obstacles in the road condition image;
the information transmission unit sends the two-dimensional information and the three-dimensional information of the obstacle to a digital rail controller, or stores the two-dimensional information and the three-dimensional information of the obstacle;
the matched power supply provides power required by the system.
A fourth aspect of embodiments of the present application provides a readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the steps in the method according to the first aspect of the present application.
A fifth aspect of the embodiments of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the steps of the method according to the first aspect of the present application.
According to the embodiments of the application, foreground segmentation is performed on the road condition image acquired by the monocular camera; the segmented road condition image is input into the obstacle extraction model, and the plane information of the target obstacle under the camera coordinates is obtained through neural network calculation; inverse computer vision projection transformation is performed on the plane information together with the preset plane equation to obtain the depth data of the target obstacle; and the three-dimensional frame of the target obstacle is obtained through geometric drawing.
From the above analysis, the three-dimensional information of the target obstacle can be obtained from the image frames acquired by a monocular camera alone. Compared with obstacle data obtained in the prior art by detection hardware such as laser radar, combined sensors and millimeter wave radar, the obstacle data obtained by the monocular camera are rich in pixel semantics; the data algorithms are simple and mature, offer many reference ideas and a sufficient technical reserve, and allow the detection model to be updated rapidly; and no complex fusion calculation with data from other sensors is required.
Moreover, the monocular camera is inexpensive and compact, and is suitable for large-scale production and installation.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the description of the embodiments of the present application will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive exercise.
FIG. 1 is a flow chart of steps of a method for obstacle detection according to an embodiment of the present application;
FIG. 2 is a flow chart of image pre-processing according to an embodiment of the present application;
fig. 3 is a foreground region image obtained by foreground segmentation according to an embodiment of the present disclosure;
FIG. 4 is a flowchart illustrating steps for training an obstacle extraction model according to an embodiment of the present application;
FIG. 5 is a foreground region image carrying plane information obtained by an embodiment of the present application;
FIG. 6 is a block diagram of an obstacle detection hardware system according to an embodiment of the present application;
FIG. 7 is a system framework diagram of a computing unit according to an embodiment of the present application;
fig. 8 is a schematic diagram of an obstacle detection device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Obstacles mainly refer to objects that affect the running and passage of vehicles on a road, and are mainly divided into two types: one type mainly refers to pedestrians, vehicles, traffic signs and the like; the other type comprises unconventional objects such as stones and plastic bags.
Obstacle detection: detecting obstacles within the perceived foreground region, including but not limited to pedestrians, bicycles, cars and trucks. Foreground region perception can be performed with hardware devices such as laser radar, depth sensors and cameras, used alone or in combination. This application aims to perform obstacle detection using a monocular camera and to output the two-dimensional information and three-dimensional information of obstacles.
Obstacle detection can be used in fields such as vehicle-road cooperation, automatic driving and assisted driving. By detecting the two-dimensional and three-dimensional information of obstacles on a vehicle's driving road, real-time traffic conditions can be analyzed. For example, the detected obstacle information can be transmitted to a digital rail controller; the digital rail controller integrates obstacle information sensed by other hardware devices (laser radar, depth sensors and the like) and, combined with modern communication and network technologies, realizes intelligent information exchange and sharing between the vehicle and people, vehicles, roads and the cloud. This enables complex environment sensing, intelligent decision-making, cooperative control and other functions, makes "safe, efficient, comfortable and energy-saving" driving possible, and can finally lead to an unmanned vehicle operating in place of a human driver.
The digital rail controller is used for comprehensively processing data of a plurality of detection systems and then making a final decision, wherein the detection systems comprise but are not limited to an obstacle detection system, a positioning system, automobile driving state detection and the like.
In the related art, a laser radar, a millimeter wave radar, or a combination of multiple sensors is generally used to detect obstacles on a road.
The millimeter wave radar has the defects of large size, low bandwidth, low resolution, no height information and unstable sensing data. For example, a millimeter wave radar cannot tell whether a sensed signpost is suspended in the air; when static obstacles are measured in a laboratory with a millimeter wave radar, the obstacle points jump continuously in the resulting visual analysis.
The laser radar costs about as much as the automobile itself, making mass production and use difficult, and its measurement accuracy is affected by sensing distance and environment. For example, at a distance of 25 meters from an obstacle, only a few lines of the vehicle-mounted laser radar reach it, so the amount of collected information is limited; water splashed when the vehicle runs on rainy days adds noise to the collected information, and a large amount of computation is then needed to remove this noise.
When multiple sensors are combined to measure obstacles, the data collected by the different sensors must be synchronized: software synchronization introduces large errors, hardware synchronization requires extra hardware equipment, and the fusion algorithm for multiple sensors is complex to design, so the computation is heavy and places high demands on processor capability.
In view of the above, the present application provides a method that performs foreground-background segmentation on a real-time road condition image acquired by a monocular camera and removes the background portion to obtain a foreground region image; predicts on the foreground region image with a neural network and outputs the plane information, under camera coordinates, of obstacles in the foreground region image; calculates the depth information of each obstacle with a preset plane equation according to the inverse computer vision transformation principle and the assumption that the obstacle lies on a plane; and geometrically draws the plane information and the depth information to obtain the three-dimensional frame of the obstacle.
Most image algorithm research has been conducted on monocular cameras, so algorithms for acquiring two-dimensional and three-dimensional information based on a monocular camera are more mature than those for hardware devices such as laser radar and millimeter wave radar.
In addition, after the three-dimensional frame of the target obstacle is obtained, displacement, rotation and scale information can be derived from it and transmitted to the digital rail controller through the high-speed network transmission unit. The digital rail controller receives the analysis results of multiple other obstacle detection units, processes and analyzes them comprehensively, and finally makes a decision. The application principles of vehicle-road cooperation, automatic driving and the like can be summarized as: sense, analyze, transmit the information, and decide according to the information. The obstacle detection solution of the present application mainly concerns the part that "senses" an obstacle.
The other obstacle detecting units may be various sensor combinations, laser radars, millimeter wave radars, and the like, which is not limited in this application.
The core of automatic driving or assisted driving lies not in the vehicle but in the human: through long-term driving practice, the human driver achieves understanding, learning and memory of the "environment perception - decision planning - control execution" process. Environment perception, as the first link, occupies a key position in the information interaction between an intelligent driving vehicle and the external environment; its key point is to better imitate the perception capability of a human driver, so that the vehicle understands its own driving situation and that of its surroundings. The obstacle detection this application concerns is precisely the perception of target obstacles on the vehicle's driving road.
Referring to fig. 1, fig. 1 is a flowchart illustrating steps of an obstacle detection method according to an embodiment of the present application.
Step S11: carrying out foreground segmentation on the road condition image acquired by the monocular camera to obtain a foreground region image;
the monocular camera means that the obstacle detection device installed in the fixed road area only comprises a single camera, namely, the single camera is used for collecting real-time road condition images. The fixed position and the installation number of the monocular camera can be selected according to ground traffic safety attributes, terrain conditions, weather reasons, economic conditions and other factors.
The road condition image acquired by the monocular camera is high in resolution and rich in pixel semantics, but most of it belongs to the background region; removing the background region saves a large amount of computing resources for the subsequent obstacle extraction.
In addition, the foreground-segmented road condition image can subsequently be fused with a high-precision map at the ontology level, while also providing useful information for path planning.
Before foreground segmentation is performed on the real-time road condition image acquired by the single camera, the image can be preprocessed. First, the road condition image is converted into a format readable by the neural network, and the converted image is scaled by a fixed ratio so that its size suits both the network transmission requirements and the size of the neural network model, while its accuracy is also taken into account: in general, the larger the road condition image, the more feature information is retained, but also the larger the neural network model and the more complicated the calculation, so the image accuracy must be balanced against the computing capability of the current neural network and hardware. Next, the mean is removed to standardize the data features so that each dimension of the data has zero mean and unit variance. Finally, dimension reduction is performed; when the data volume is large, methods such as PCA can be used to reduce the dimensionality, decreasing the computation and increasing the processing speed.
Referring to fig. 2, fig. 2 is a flowchart of image preprocessing according to an embodiment of the present disclosure.
Illustratively, for one frame of 3-channel camera data, the road condition image is converted into 32-bit jpg format, scaled to 1280 × 1040, mean-removed with (128, 128, 128), and finally reduced in dimension using the PCA method.
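A minimal sketch of this preprocessing step, assuming OpenCV and NumPy; the target size and channel means are the illustrative values above, and the optional PCA stage is omitted:

```python
import cv2
import numpy as np

def preprocess(frame: np.ndarray) -> np.ndarray:
    """Prepare one 3-channel road condition frame for the network."""
    # Scale by a fixed ratio to the model's input size.
    img = cv2.resize(frame, (1280, 1040)).astype(np.float32)
    # Remove the per-channel mean (128, 128, 128) to standardize the features.
    img -= np.array([128.0, 128.0, 128.0], dtype=np.float32)
    return img
```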
In this embodiment, the foreground of the preprocessed road condition image is segmented by a segmentation model to complete the extraction of the image foreground. Illustratively, a second preset model can be set in advance, chosen from U-Net or SegNet. Foreground segmentation sample pictures are labeled with an existing labeling method: obstacles in a picture are labeled 1, and other parts of the picture that do not belong to an obstacle are labeled 0. The second preset model is trained with the sample pictures carrying these foreground-background labels until it can segment them accurately, yielding the foreground segmentation model.
When foreground segmentation is performed with the trained segmentation model, a preprocessed road condition picture is input into the model, which outputs the probability of each category for every pixel; the category of each pixel is then decided by the maximum probability principle, taking the category whose probability is largest, to obtain the foreground region image.
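A minimal sketch of this per-pixel decision, assuming the model returns an (H, W, C) array of class probabilities with class 1 meaning "obstacle", matching the labeling scheme above:

```python
import numpy as np

def extract_foreground(image: np.ndarray, probs: np.ndarray) -> np.ndarray:
    """Keep only pixels whose maximum-probability class is foreground (1)."""
    classes = np.argmax(probs, axis=-1)          # maximum probability principle
    mask = (classes == 1).astype(image.dtype)    # 1 = obstacle, 0 = background
    return image * mask[..., None]               # zero out the background region
```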
Referring to fig. 3, fig. 3 is a foreground region image obtained by foreground segmentation according to an embodiment of the present disclosure.
Only obstacles such as pedestrians, vehicles and traffic signs are retained in the foreground region image.
Step S12: extracting obstacles from the foreground area image to obtain plane information of the target obstacles under the camera coordinates;
in order to extract the obstacles in the foreground region image more intelligently, the inventors first establish a first preset model, then collect suitable training samples and train the first preset model to obtain an obstacle extraction model, which is then used to perform this step.
Referring to fig. 4, fig. 4 is a flowchart illustrating steps of training an obstacle extraction model according to an embodiment of the present application.
Step S21: marking the plane information under the camera coordinates on the first sample obstacle image with a marking tool;
firstly, first sample obstacle images are prepared: image frames of current road conditions are collected as first sample obstacle images by a camera arranged in the road traffic area.
The marking of a first sample obstacle image can be completed with a marking tool, and the output is the plane information of the obstacle in the camera coordinate system.
The camera coordinate system of the image is a three-dimensional rectangular coordinate system established with the focus center of the camera taking the image as the origin and the optical axis as the Z axis.
The marking tool in this embodiment includes these basic functions: zooming the picture, moving the picture, and shifting the obstacle frame left, right, up and down and scrolling it left and right. When the marking tool is used to mark a first sample obstacle image, a frame is first generated around the obstacle in the image, and the generated frame is then adjusted through the tool's interface so that it fits the obstacle closely. The coordinates of the points of the frame are the plane information of the obstacle in the camera coordinate system.
Step S22: training a first preset model based on a plurality of first sample obstacle images carrying labels to obtain an obstacle extraction model;
extracting obstacles from the foreground area image, including:
and inputting the foreground area image into the obstacle extraction model to obtain plane information of the target obstacle under the camera coordinates.
In this embodiment, the first preset model is mainly composed of a fully convolutional network, batch processing units and pooling layers, with the fully convolutional network arranged as a pyramid model to better learn the regression of obstacle parameters at multiple scales. For the network initialization weights of the first preset model, the ReLU activation function with the He initialization method can be selected, or the tanh activation function with the Xavier initialization method. The hyper-parameters are debugged in order of importance (learning rate > loss function > batch size > number of iterations > momentum in the optimization function > number of hidden layers > penalty parameter), following a "coarse to fine" principle: first a grid search is performed and the region where the cost function is smallest is selected, and then the optimal parameters are searched randomly within that region.
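A minimal sketch of this coarse-to-fine debugging, shown for the learning rate only; the `train_eval` callable and the search ranges are assumptions for illustration:

```python
import numpy as np

def coarse_to_fine(train_eval, coarse_grid, n_fine=20, seed=0):
    """Grid-search a coarse set of values, then randomly search
    around the best coarse value, where the cost function is smallest."""
    rng = np.random.default_rng(seed)
    coarse = {lr: train_eval(lr) for lr in coarse_grid}
    best = min(coarse, key=coarse.get)
    # Fine stage: random samples within half a decade of the best coarse value.
    fine = {lr: train_eval(lr) for lr in best * 10 ** rng.uniform(-0.5, 0.5, n_fine)}
    return min(fine, key=fine.get)

# Usage: best_lr = coarse_to_fine(train_eval, coarse_grid=[1e-4, 1e-3, 1e-2, 1e-1])
```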
The first preset model is then trained with a plurality of labeled first sample obstacle images.
The obstacle extraction model trained in this way can obtain both the two-dimensional information and the three-dimensional information of a target obstacle.
In another embodiment of the application, in order to cope with constantly changing factors such as climate and illumination, first sample obstacle images can be collected under different environmental conditions to train the first preset model, and the neural network model can be tested in various weather environments and over multiple time periods to check the accuracy and robustness of obstacle detection.
Specifically, when the obstacle extraction model is not suited to a specific environment, new training samples can be collected in that environment. The specific environment may be rain, snow, sunshine, fog and the like.
Step S31: acquiring a second sample obstacle image under a specific environment, wherein the specific environment is an environment in which the obstacle extraction model is not applicable;
a monocular camera is used to collect sample obstacle images in the specific environment, and they are marked with a marking tool. It can be understood that environmental noise (rain, snow, etc.) in sample images collected in the specific environment affects the output result; therefore, the plane information of obstacles in the camera coordinate system is correctly marked on these sample images, and the obstacle extraction model is trained on them so that it becomes better suited to the current specific environment.
Step S32: marking the second sample obstacle image with plane information under the camera coordinates by using a marking tool;
step S33: performing secondary training on the obstacle extraction model based on a plurality of second sample obstacle images carrying labels;
the method for labeling the plane information under the camera coordinate system of the sample obstacle image set carrying the weather environment mark by using the labeling tool is the same as the method, and the inventor does not need to describe any more.
Step S34: obtaining a stable obstacle extraction model when the obstacle extraction model adapts to the specific environment;
if the retrained obstacle extraction model is still not suitable for the current specific environment, the annotation data should be renewed for the environment, the model is retrained again, and if the training result is not ideal, the network model needs to be modified or redesigned, and the model is updated.
Step S35: redesigning the first preset model when the obstacle extraction model does not adapt to the specific environment.
The three-dimensional information of the target obstacle acquired by the obstacle extraction model of this embodiment is not disturbed by weather environments such as rain, snow, sunshine and fog and responds better; at the same time, the camera's rich pixel information makes the obstacle detection more robust.
Furthermore, the obstacle extraction model can be optimized and adjusted according to weather environment conditions during road testing and actual operation, robustness and stability of the obstacle extraction model are improved, obstacle sensing accuracy is improved, and path planning and decision judgment of the automatic driving vehicle are improved.
For a two-dimensional obstacle, its plane information in the camera coordinate system can be obtained. First, by designing specific image features, a large number of regions of interest are obtained through the neural network's training layers; then most of the irrelevant regions of interest are removed according to the degree of overlap between them; and finally the obstacle information at the picture level is output. The picture-level obstacle information can be obtained with any two-dimensional detection technique, for example one-stage or two-stage methods: the one-stage method directly sets anchor frames around pixel points and then regresses their sizes and categories, while the two-stage method mainly generates a series of candidate frames as samples and then classifies the samples through a convolutional neural network (the overlap-based removal is sketched below).
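Removing most of the irrelevant regions of interest according to their degree of overlap is commonly done by non-maximum suppression; the patent does not name the exact procedure, so the following is only an assumed sketch:

```python
import numpy as np

def nms(boxes: np.ndarray, scores: np.ndarray, iou_thresh: float = 0.5):
    """Greedy non-maximum suppression over (N, 4) boxes given as x1, y1, x2, y2:
    keep the highest-scoring frame, drop frames that overlap it too much."""
    order = scores.argsort()[::-1]          # indices by descending confidence
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of box i with every remaining box.
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= iou_thresh]     # keep only weakly overlapping boxes
    return keep
```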
Two-dimensional information lacks distance information: after the position of an obstacle in the picture coordinate system is estimated, its position in the real world and its distance from the vehicle still cannot be determined, which greatly hinders actual environment perception.
When the obstacle is treated as three-dimensional, its three-dimensional information can be obtained by fusing the two-dimensional information with depth data collected by a depth sensor. There are generally two fusion algorithms: one projects the information of the other depth sensors into the picture coordinate system for obstacle fusion and then learns the fusion parameters from training data features; the other projects the picture information back into the actual surroundings and then fuses it with the depth data collected by the depth sensor.
In addition, the obstacle extraction model of the embodiments of this application can directly obtain the three-dimensional information of a three-dimensional obstacle. For a three-dimensional obstacle, a large number of regions of interest are likewise obtained through the neural network's training layers, and most of the irrelevant regions of interest are removed according to the degree of overlap between them; the difference is that the image pixel features used in training and prediction are different, and the output is three-dimensional point information.
Since the obstacle extraction model is trained on first sample obstacle images labeled with the marking tool, inputting the foreground region image obtained in the foreground segmentation step into the obstacle extraction model outputs a two-dimensional image plane frame from which the three-dimensional information of the target obstacle on the foreground region image can be obtained. That is, the plane information, in the camera coordinate system, of the target obstacle labeled in the foreground region image can be obtained through step S12.
Referring to fig. 5, fig. 5 is a foreground region image carrying plane information obtained by an embodiment of the present application.
The vehicle selected by the frame in the foreground area image is a target obstacle, and the frame of the selected target obstacle expresses plane information of the vehicle under a camera coordinate system. The plane information does not carry depth data of the obstacle.
Step S13: substituting the plane information into a preset plane equation based on inverse computer vision projection transformation to obtain a three-dimensional frame of the target obstacle;
as shown in fig. 5, assume that the monocular camera is placed on road B, diagonally behind car C, and an image is obtained. A three-dimensional (x, y, z) coordinate system is established with the monocular camera at its center: the straight line extending horizontally is the x axis, the straight line extending vertically is the y axis, and the straight line extending along the direction of the road is the z axis. After the image is preprocessed and foreground-segmented, it is input into the pre-trained obstacle extraction model to obtain the plane information of the tail of car C, namely the coordinates (x, y), in the camera coordinate system, of a number of points on the tail plane of car C. At the same time, the coordinates (x₁, y₁) of the end point of the bottom surface of the vehicle head are obtained.
Inverse computer vision projection transformation is performed on the points of the tail plane of car C to obtain the set of those points in actual space. For example, performing inverse computer vision projection transformation on a square yields the spatial set of points on the square, which is observed visually as a cube. To obtain the z-axis information of the points on the tail plane of car C, their plane information is substituted into the preset plane equation, giving the z-axis coordinate of each point on the tail plane. The preset plane equation expresses the correspondence of (x, y, z) for any point M in space under the camera coordinate system: when any two of the values (x, y, z) are known and substituted into the preset plane equation, the third value is obtained.
On the same principle, the z-axis coordinate z₁ of the end point of the bottom surface of the vehicle head is obtained. Then, from the three-dimensional information (x, y, z) of the points on the tail plane of car C and the three-dimensional information (x₁, y₁, z₁) of the end point of the bottom surface of the vehicle head in the camera coordinate system, the three-dimensional frame of car C is obtained by geometric drawing.
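A minimal sketch of this inverse-projection step, assuming a pinhole camera with known intrinsic matrix K and the general plane form Px + Qy + Mz + D = 0 in camera coordinates (the offset term D and all numeric values below are assumptions added for illustration):

```python
import numpy as np

def backproject_to_plane(u, v, K, n, D):
    """Intersect the viewing ray through pixel (u, v) with the plane
    n . X + D = 0 in camera coordinates; n = (P, Q, M) is the plane normal."""
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # direction of the viewing ray
    t = -D / float(n @ ray)                         # ray-plane intersection parameter
    return t * ray                                  # 3D point (x, y, z) on the plane

# Usage with assumed intrinsics and an assumed ground plane:
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 520.0],
              [0.0, 0.0, 1.0]])
point = backproject_to_plane(700, 900, K, n=np.array([0.0, 1.0, -0.05]), D=-1.5)
```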
In order to obtain a preset plane equation adapted to the actual road condition, the above steps may further include the following substeps:
step S14-1: summarizing a measured bottom surface three-dimensional coordinate point set of the reference barrier;
after the calibration of the camera coordinate system is completed, the position of the camera in the world coordinate system is fixed, and the ground coordinate data of the reference obstacle in the camera coordinate system are obtained by measurement in any existing manner.
Assuming that the reference obstacle lies on a plane, a calculation model of the preset plane equation is established, and the valid data with small deviation relative to the calculation model are screened out as the optimal bottom surface three-dimensional coordinate point set.
Step S14-2: substituting the bottom surface three-dimensional coordinate point set into a first preset equation;
step S14-3: and calculating the first preset equation by using a random sampling consistency algorithm to obtain the preset plane equation.
The optimal bottom surface three-dimensional coordinate point set is substituted into the calculation model of the preset plane equation, which is solved with the random sample consensus algorithm to obtain the preset plane equation: Px + Qy + Mz = 0, where P, Q, M are the plane normal vector parameters.
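A minimal random sample consensus sketch for this fit, assuming the patent's through-origin plane form n·X = 0 with n = (P, Q, M); if an offset term were included, three points per sample would be needed instead of two:

```python
import numpy as np

def fit_plane_ransac(points, n_iter=200, thresh=0.05, seed=0):
    """Fit a unit normal n = (P, Q, M) of the plane n . X = 0 to measured
    bottom-surface points (N, 3) by random sample consensus."""
    rng = np.random.default_rng(seed)
    best_n, best_inliers = None, 0
    for _ in range(n_iter):
        p0, p1 = points[rng.choice(len(points), size=2, replace=False)]
        n = np.cross(p0, p1)          # normal of the plane through the origin, p0 and p1
        if np.linalg.norm(n) < 1e-9:  # degenerate sample, skip
            continue
        n /= np.linalg.norm(n)
        inliers = int(np.sum(np.abs(points @ n) < thresh))  # point-to-plane distances
        if inliers > best_inliers:
            best_n, best_inliers = n, inliers
    return best_n
```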
Step S14-4: performing inverse computer vision projection transformation on the preset plane equation into which the plane information has been substituted, to obtain the depth information of the target obstacle;
step S14-5: geometrically drawing the plane information and the depth information to obtain a three-dimensional frame of the target obstacle;
it is first assumed that the target obstacle is not airborne but is close to the ground.
The plane information output by the obstacle extraction model and the plane equation are substituted into the inverse computer vision projection transformation to calculate the depth information of the obstacle.
After the plane information and depth information of the obstacle are obtained, the three-dimensional frame of the obstacle is obtained through geometric transformation, as sketched below.
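One simple reading of this geometric-drawing step, under the assumption that the frame is an axis-aligned box in camera coordinates extruded from its front face by the computed depth:

```python
import numpy as np

def frame_from_front_face(front_corners, depth):
    """Build the eight corners of the three-dimensional frame from the four
    front plane frame corners and the obstacle's depth along the z axis."""
    front = np.asarray(front_corners, dtype=float)   # (4, 3) front-face corners
    back = front + np.array([0.0, 0.0, depth])       # extrude along the viewing axis
    return np.vstack([front, back])                  # (8, 3) box corners
```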
In the embodiments of the application, based on the image frames acquired by the monocular camera, the two-dimensional image plane frame of the target obstacle in the picture coordinate system is obtained through neural network calculation, and this frame carries the three-dimensional information of the target obstacle in the camera coordinate system. The two-dimensional image plane frame is then projected into the camera coordinate system by inverse computer vision projection transformation to obtain the front plane frame of the target obstacle; the vertex coordinates of the front plane frame are substituted into the preset plane equation calculated in advance to obtain the depth data of the target obstacle; and the three-dimensional frame of the target obstacle is obtained through geometric drawing.
From the above analysis, the embodiments of this application obtain the three-dimensional information of obstacles in an image frame from the image frames acquired by a monocular camera alone. Compared with the prior art, three-dimensional information is obtained based on the monocular camera only, so the output information is richer and contains the position information of the obstacle, and the posture information of the obstacle can further be derived from its three-dimensional information. The data algorithms for the monocular camera are simple and mature, offer many reference ideas and a sufficient technical reserve, allow the detection model to be updated rapidly, and require no complex fusion calculation with data from other sensors.
In addition, compared with other hardware devices (such as laser radar, solid-state laser radar and the like), the monocular camera is low in manufacturing cost and mature in technology, and is easy to develop for vehicle-road cooperation technology.
The three-dimensional frame carries information such as the actual length, width and height of the target obstacle in the camera coordinate system and the distance and rotation angle of the target obstacle relative to the monocular camera, so further calculation on the three-dimensional frame yields the displacement, rotation and scale information of the target obstacle.
After obtaining the three-dimensional volume frame of the target obstacle, the method further comprises:
and obtaining the displacement, rotation and scale information of the target obstacle according to the three-dimensional frame of the target obstacle.
The displacement may be understood as the longitudinal distance of the target obstacle relative to the camera, the rotation as the lateral distance of the target obstacle relative to the camera, and the scale as the actual size of the target obstacle.
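A toy reading of these three quantities from the frame corners, following the definitions above; the centroid-based choice is purely illustrative:

```python
import numpy as np

def pose_from_frame(corners):
    """corners: (8, 3) three-dimensional frame in camera coordinates."""
    c = np.asarray(corners, dtype=float)
    center = c.mean(axis=0)
    displacement = center[2]                 # longitudinal distance (z) to the camera
    rotation = center[0]                     # lateral distance (x) to the camera
    scale = c.max(axis=0) - c.min(axis=0)    # actual width, height and length
    return displacement, rotation, scale
```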
In another embodiment of the present application, the three-dimensional frame of the target obstacle may be input to the digital rail unit, and the relative distance and relative speed between the target obstacle and the host vehicle may be estimated by using the motion pattern of the object to locate or associate the three-dimensional frame detected this time with the result of the previous frame, in combination with a high-precision map, with the detection information of sensor combinations, laser radar, millimeter wave radar and the like, or with computer vision algorithm analysis.
Meanwhile, the monocular camera is simple and static in shape; it can easily be integrated into the design of the automobile and hidden in its structure without making the vehicle's appearance obtrusive, or installed at a fixed position on the road where it blends into the environment, which is more attractive to consumers.
Referring to fig. 6, fig. 6 is a block diagram of an obstacle detection hardware system according to an embodiment of the present application.
The obstacle detection system mainly comprises a monocular camera, an information transmission unit, a computing unit and a matched power supply. The system acquires and processes images, converting pictures into two-dimensional data, and then identifies the target obstacles in the image; the information transmission unit transmits the target obstacle information to the digital rail controller, which comprehensively processes the data of several other detection systems and then makes the final decision.
Referring to fig. 7, fig. 7 is a system framework diagram of a computing unit according to an embodiment of the present application.
This application mainly concerns the computing unit's prediction process for the road condition images acquired by the monocular camera; the obstacle extraction model provided herein can directly obtain the three-dimensional information of the target obstacle.
As shown in fig. 7, for each frame of road condition image data sent by the monocular camera, image data preprocessing is performed first; image segmentation is then performed with the established segmentation model to extract the foreground of the road condition image; obstacle extraction follows, and its result contains the two-dimensional and three-dimensional information of the target obstacle; data post-processing then gathers information such as the position, posture and scale of the obstacle in a unified way; and finally the information is either sent to the digital rail controller by the information transmission unit or stored by the storage center. The sketch below strings these stages together.
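A schematic tying the sketches above into the per-frame flow of fig. 7; `seg_model` and `extract_model` stand for the trained segmentation and obstacle extraction models, and `bottom_points` for the bottom-edge pixels of each detected frame — all hypothetical names:

```python
def detect_frame(frame, seg_model, extract_model, K, n, D):
    """Per-frame flow of the computing unit (illustrative only)."""
    img = preprocess(frame)                          # image data preprocessing
    fg = extract_foreground(img, seg_model(img))     # foreground segmentation
    results = []
    for box in extract_model(fg):                    # plane information per obstacle
        ground = [backproject_to_plane(u, v, K, n, D) for (u, v) in box.bottom_points]
        results.append((box, ground))                # two- and three-dimensional info
    return results
```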
Based on the same inventive concept, an embodiment of the present application provides an apparatus for obstacle detection, and referring to fig. 8, fig. 8 is a schematic view of the apparatus for obstacle detection according to the embodiment of the present application. The device comprises:
the foreground segmentation module 81 is configured to perform foreground segmentation on the road condition image acquired by the monocular camera to obtain a foreground region image;
an obstacle extraction module 82, configured to perform obstacle extraction on the foreground region image, and select a target obstacle with a two-dimensional image plane frame, where the two-dimensional image plane frame carries information of the target obstacle in a camera coordinate system;
the conversion module 83 is configured to perform inverse computer vision projection transformation on the two-dimensional image plane frame to obtain a front plane frame of the target obstacle in the camera coordinate system;
and the first calculation module 84 is configured to obtain a three-dimensional frame of the target obstacle according to the front plane frame and a preset plane equation.
Optionally, the apparatus further comprises:
the measuring module is used for summarizing the measured bottom surface three-dimensional coordinate point set of the reference obstacle;
the substituting module is used for substituting the bottom surface three-dimensional coordinate point set into a first preset equation;
and the second calculation module is used for calculating the first preset equation with a random sample consensus algorithm to obtain the preset plane equation.
Optionally, the first computing module comprises:
the coordinate obtaining submodule is used for obtaining the vertex coordinates of the front plane frame;
the depth calculation submodule is used for substituting the vertex coordinates of the front plane frame into the preset plane equation to obtain the depth point coordinates of the obstacle;
and the geometric drawing submodule is used for geometrically drawing the vertex coordinates and the depth point coordinates of the front plane frame to obtain a three-dimensional frame of the target obstacle.
Optionally, the apparatus further comprises:
the first sample marking module is used for marking the plane information of the first sample obstacle image under the camera coordinates by using a marking tool;
the first model training module is used for training a first preset model based on a plurality of first sample obstacle images carrying labels to obtain an obstacle extraction model;
the obstacle extraction module includes:
and the model calculation submodule is used for inputting the foreground area image into the obstacle extraction model to obtain the two-dimensional image plane frame.
Optionally, the apparatus further comprises:
a sample obtaining module, configured to acquire a second sample obstacle image in a specific environment, where the specific environment is an environment in which the obstacle extraction model is not applicable;
the second sample marking module is used for marking the plane information of the second sample obstacle image under the camera coordinate by using a marking tool;
the second model training module is used for carrying out secondary training on the obstacle extraction model based on a plurality of labeled second sample obstacle images;
a stable model obtaining module for obtaining a stable obstacle extraction model when the obstacle extraction model is adapted to the specific environment;
a model design module for redesigning the first preset model when the obstacle extraction model is not adapted to the specific environment.
Optionally, the apparatus further comprises:
and the nine-degree-of-freedom information module is used for obtaining the displacement, rotation and scale information of the target obstacle according to the three-dimensional frame of the target obstacle.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
Based on the same inventive concept, another embodiment of the present application provides a system for obstacle detection, the system comprising: the device comprises a monocular camera, a computing unit, an information transmission unit and a matched power supply;
the monocular camera is used for acquiring road condition images and sending the road condition images to the computing unit;
the computing unit executes the method for detecting the obstacle in the first aspect of the application to acquire two-dimensional information and three-dimensional information of the obstacle in the road condition image;
the information transmission unit sends the two-dimensional information and the three-dimensional information of the obstacle to a digital rail controller;
the matched power supply provides power required by the system.
Based on the same inventive concept, another embodiment of the present application provides a readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps in the obstacle detection method according to any of the above-mentioned embodiments of the present application.
Based on the same inventive concept, another embodiment of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and running on the processor, and when the processor executes the computer program, the electronic device implements the steps in the obstacle detection method according to any of the above embodiments of the present application.
The embodiments in the present specification are described in a progressive or descriptive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one of skill in the art, embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus, and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the true scope of the embodiments of the application.
Finally, it should also be noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or terminal that comprises the element.
The method, apparatus, device, system and storage medium for obstacle detection provided by the present application have been described in detail above. The description of the embodiments is intended only to aid understanding of the method and its core idea; at the same time, a person skilled in the art may, following the idea of the present application, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A method of obstacle detection, the method comprising:
performing foreground segmentation on a road condition image acquired by a monocular camera to obtain a foreground region image;
extracting obstacles from the foreground region image to obtain plane information of a target obstacle in camera coordinates;
and substituting the plane information into a preset plane equation based on inverse computer vision projection transformation to obtain a three-dimensional frame of the target obstacle.
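Read as a pipeline, claim 1 is three function calls. The Python sketch below is one possible shape of that pipeline; `segment_fg` and `extract` stand in for the unspecified segmentation and extraction models, and `lift_to_3d` is the inverse-projection helper sketched under claim 3 below.

```python
def detect_obstacles(image, segment_fg, extract, K, plane):
    """One possible shape of the claimed three-step method.  K is the camera
    intrinsic matrix; `plane` is the preset plane equation (a, b, c, d).
    The box schema {"u", "v"} for each obstacle's ground contact point is
    an assumption."""
    foreground = segment_fg(image)          # step 1: foreground region image
    boxes = extract(foreground)             # step 2: plane info in camera coords
    # step 3: inverse projection of each box's ground contact point
    return [lift_to_3d(box["u"], box["v"], K, plane) for box in boxes]
```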
2. The method of claim 1, further comprising:
collecting a measured set of three-dimensional coordinate points on the bottom surface of a reference obstacle;
substituting the set of bottom-surface three-dimensional coordinate points into a first preset equation;
and solving the first preset equation with a random sample consensus (RANSAC) algorithm to obtain the preset plane equation.
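The "random sampling consistency algorithm" is random sample consensus (RANSAC). A minimal Python sketch of fitting the preset plane equation to the measured bottom-surface points follows; the iteration count and inlier tolerance are assumed values, not taken from the application.

```python
import numpy as np

def fit_ground_plane(points, iters=500, tol=0.02, seed=None):
    """Fit a plane a*x + b*y + c*z + d = 0 to bottom-surface 3-D points of
    reference obstacles with RANSAC.  `tol` is the inlier distance (metres,
    assumed); returns the plane coefficients (a, b, c, d)."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    best, best_inliers = None, 0
    for _ in range(iters):
        p0, p1, p2 = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)       # normal of the sampled plane
        norm = np.linalg.norm(normal)
        if norm < 1e-9:                           # degenerate (collinear) sample
            continue
        normal /= norm
        d = -normal.dot(p0)
        inliers = int(np.sum(np.abs(points @ normal + d) < tol))
        if inliers > best_inliers:
            best, best_inliers = (*normal, d), inliers
    return best
```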
3. The method of claim 1, wherein substituting the plane information into the preset plane equation based on inverse computer vision projection transformation to obtain the three-dimensional frame of the target obstacle comprises:
performing inverse computer vision projection transformation on the preset plane equation into which the plane information has been substituted, to obtain depth information of the target obstacle;
combining the plane information and the depth information to obtain three-dimensional information of the target obstacle;
and geometrically rendering the three-dimensional information to obtain the three-dimensional frame.
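The inverse projection in this claim is ordinary pinhole geometry: back-project a pixel through the intrinsic matrix and intersect the resulting viewing ray with the preset plane. A sketch, assuming an undistorted pinhole camera and a pixel taken from the bottom edge of the obstacle's two-dimensional box:

```python
import numpy as np

def lift_to_3d(u, v, K, plane):
    """Intersect the viewing ray of pixel (u, v) with the plane
    a*x + b*y + c*z + d = 0 to recover depth, hence the obstacle's
    3-D position in camera coordinates."""
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # viewing-ray direction
    a, b, c, d = plane
    denom = np.array([a, b, c]).dot(ray)
    if abs(denom) < 1e-9:
        raise ValueError("ray is parallel to the plane")
    s = -d / denom                 # scale at which the ray meets the plane
    return s * ray                 # (X, Y, Z); Z is the depth information
```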
4. The method of claim 1, further comprising:
labeling plane information in camera coordinates on a first sample obstacle image using a labeling tool;
training a first preset model on a plurality of labeled first sample obstacle images to obtain an obstacle extraction model;
wherein extracting obstacles from the foreground region image comprises:
inputting the foreground region image into the obstacle extraction model to obtain the plane information of the target obstacle in camera coordinates.
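The claims do not fix a label schema; for concreteness, a labeled first sample might look like the hypothetical record below, with "plane information" taken to be a class plus a two-dimensional box in the image plane.

```python
# Hypothetical output of the labeling tool for one first sample obstacle
# image; field names and the xywh box convention are assumptions.
label = {
    "image": "sample_000001.png",
    "obstacles": [
        {"class": "pedestrian", "box_xywh": [412, 230, 58, 141]},
        {"class": "vehicle",    "box_xywh": [120, 310, 220, 150]},
    ],
}
```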
5. The method of claim 4, further comprising:
acquiring a second sample obstacle image in a specific environment, wherein the specific environment is an environment to which the obstacle extraction model is not yet adapted;
labeling plane information in camera coordinates on the second sample obstacle image using the labeling tool;
performing secondary training of the obstacle extraction model on a plurality of labeled second sample obstacle images;
obtaining a stable obstacle extraction model when the obstacle extraction model adapts to the specific environment;
and redesigning the first preset model when the obstacle extraction model does not adapt to the specific environment.
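Claims 4 and 5 together describe an adapt-or-redesign loop. The sketch below captures that control flow; the `finetune` and `evaluate` callables, the acceptance threshold and the round limit are all assumptions.

```python
def adapt_model(model, finetune, evaluate, new_samples, threshold=0.5, rounds=3):
    """Secondary-train the obstacle extraction model on labeled samples from
    the specific environment; keep it once it adapts, otherwise signal that
    the first preset model needs redesigning."""
    for _ in range(rounds):
        model = finetune(model, new_samples)          # secondary training
        if evaluate(model, new_samples) >= threshold:
            return model                              # stable extraction model
    raise RuntimeError("model did not adapt; redesign the first preset model")
```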
6. The method of any one of claims 1 to 5, wherein after obtaining the three-dimensional frame of the target obstacle, the method further comprises:
obtaining displacement, rotation and proportion information of the target obstacle from the three-dimensional frame of the target obstacle.
7. An apparatus for obstacle detection, the apparatus comprising:
the foreground segmentation module is used for performing foreground segmentation on a road condition image acquired by a monocular camera to obtain a foreground region image;
the obstacle extraction module is used for extracting obstacles from the foreground region image and selecting a target obstacle with a two-dimensional image plane frame, wherein the two-dimensional image plane frame carries information of the target obstacle in the camera coordinate system;
the conversion module is used for performing inverse computer vision projection transformation on the two-dimensional image plane frame to obtain a front plane frame of the target obstacle in the camera coordinate system;
and the first calculation module is used for obtaining a three-dimensional frame of the target obstacle from the front plane frame and a preset plane equation.
8. A system for obstacle detection, the system comprising a monocular camera, a computing unit, an information transmission unit and a matched power supply, wherein:
the monocular camera is used for acquiring road condition images and sending the road condition images to the computing unit;
the computing unit executes the method for detecting the obstacle according to any one of claims 1 to 6, and acquires two-dimensional information and three-dimensional information of the obstacle in the road condition image;
the information transmission unit sends the two-dimensional information and the three-dimensional information of the obstacle to a digital rail controller, or stores the two-dimensional information and the three-dimensional information of the obstacle;
the matched power supply provides power required by the system.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method according to any of claims 1 to 6 are implemented when the computer program is executed by the processor.
CN201911295246.5A 2019-12-16 2019-12-16 Method, apparatus, system, device and storage medium for obstacle detection Pending CN111179300A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911295246.5A CN111179300A (en) 2019-12-16 2019-12-16 Method, apparatus, system, device and storage medium for obstacle detection

Publications (1)

Publication Number Publication Date
CN111179300A (en) 2020-05-19

Family

ID=70656585

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911295246.5A Pending CN111179300A (en) 2019-12-16 2019-12-16 Method, apparatus, system, device and storage medium for obstacle detection

Country Status (1)

Country Link
CN (1) CN111179300A (en)

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1512455A (en) * 2002-12-27 2004-07-14 中国科学院自动化研究所 Object three-dimensional model quick obtaining method based on active vision
CN103400392A (en) * 2013-08-19 2013-11-20 山东鲁能智能技术有限公司 Binocular vision navigation system and method based on inspection robot in transformer substation
WO2015024407A1 (en) * 2013-08-19 2015-02-26 国家电网公司 Power robot based binocular vision navigation system and method
JP2015184804A (en) * 2014-03-20 2015-10-22 カルソニックカンセイ株式会社 Image display device, image display method, and image display program
CN109116374A (en) * 2017-06-23 2019-01-01 百度在线网络技术(北京)有限公司 Determine the method, apparatus, equipment and storage medium of obstacle distance
CN108596058A (en) * 2018-04-11 2018-09-28 西安电子科技大学 Running disorder object distance measuring method based on computer vision
CN108909624A (en) * 2018-05-13 2018-11-30 西北工业大学 A kind of real-time detection of obstacles and localization method based on monocular vision
CN108898628A (en) * 2018-06-21 2018-11-27 北京纵目安驰智能科技有限公司 Three-dimensional vehicle object's pose estimation method, system, terminal and storage medium based on monocular
CN109242869A (en) * 2018-09-21 2019-01-18 科大讯飞股份有限公司 A kind of image instance dividing method, device, equipment and storage medium
CN109902643A (en) * 2019-03-07 2019-06-18 浙江啄云智能科技有限公司 Intelligent safety inspection method, device, system and its electronic equipment based on deep learning
CN110188696A (en) * 2019-05-31 2019-08-30 华南理工大学 A kind of water surface is unmanned to equip multi-source cognitive method and system
CN110298298A (en) * 2019-06-26 2019-10-01 北京市商汤科技开发有限公司 Target detection and the training method of target detection network, device and equipment
CN110516639A (en) * 2019-08-30 2019-11-29 成都索贝数码科技股份有限公司 A kind of personage's three-dimensional position real-time computing technique based on video flowing natural scene

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PENG KAI ET AL.: "Application of the random sample consensus algorithm in a line-structured-light vision measurement system", vol. 18, no. 3, pages 65-73 *
ZHU QINGTANG ET AL. (EDS.): "Biofabrication and Clinical Evaluation of Repair Materials for Peripheral Nerve Defects", Sun Yat-sen University Press, pages 131-138 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112163446A (en) * 2020-08-12 2021-01-01 浙江吉利汽车研究院有限公司 Obstacle detection method and device, electronic equipment and storage medium
CN112163446B (en) * 2020-08-12 2023-04-28 浙江吉利汽车研究院有限公司 Obstacle detection method and device, electronic equipment and storage medium
WO2022037240A1 (en) * 2020-08-18 2022-02-24 广州小鹏汽车科技有限公司 Three-dimensional information processing method and apparatus
CN111964606A (en) * 2020-08-18 2020-11-20 广州小鹏汽车科技有限公司 Three-dimensional information processing method and device
CN112344855A (en) * 2020-10-27 2021-02-09 北京百度网讯科技有限公司 Obstacle detection method and device, storage medium and drive test equipment
CN112433193A (en) * 2020-11-06 2021-03-02 山东产研信息与人工智能融合研究院有限公司 Multi-sensor-based mold position positioning method and system
CN112433193B (en) * 2020-11-06 2023-04-07 山东产研信息与人工智能融合研究院有限公司 Multi-sensor-based mold position positioning method and system
CN112365959A (en) * 2020-12-07 2021-02-12 推想医疗科技股份有限公司 Method and device for modifying annotation of three-dimensional image
CN112365959B (en) * 2020-12-07 2024-05-28 推想医疗科技股份有限公司 Method and device for modifying annotation of three-dimensional image
CN112896045A (en) * 2021-01-26 2021-06-04 安信通科技(澳门)有限公司 Vehicle A-pillar blind area perspective method and system and vehicle
CN112802087A (en) * 2021-02-04 2021-05-14 上海中通吉网络技术有限公司 Method and device for detecting overall volume of deposit and electronic equipment
WO2022247414A1 (en) * 2021-05-26 2022-12-01 北京地平线信息技术有限公司 Method and apparatus for generating space geometry information estimation model
CN113537047A (en) * 2021-07-14 2021-10-22 广东汇天航空航天科技有限公司 Obstacle detection method, obstacle detection device, vehicle and storage medium
CN115082898A (en) * 2022-07-04 2022-09-20 小米汽车科技有限公司 Obstacle detection method, obstacle detection device, vehicle, and storage medium
CN115840459A (en) * 2023-01-04 2023-03-24 北京科技大学 Single-eye vision obstacle avoidance system for flapping-wing aircraft
CN115817463A (en) * 2023-02-23 2023-03-21 禾多科技(北京)有限公司 Vehicle obstacle avoidance method and device, electronic equipment and computer readable medium
CN116883478A (en) * 2023-07-28 2023-10-13 广州瀚臣电子科技有限公司 Obstacle distance confirmation system and method based on automobile camera
CN116883478B (en) * 2023-07-28 2024-01-23 广州瀚臣电子科技有限公司 Obstacle distance confirmation system and method based on automobile camera

Similar Documents

Publication Publication Date Title
CN111179300A (en) Method, apparatus, system, device and storage medium for obstacle detection
CN108955702B (en) Lane-level map creation system based on three-dimensional laser and GPS inertial navigation system
EP3171292B1 (en) Driving lane data processing method, device, storage medium and apparatus
CN112700470B (en) Target detection and track extraction method based on traffic video stream
CN112380317B (en) High-precision map updating method and device, electronic equipment and storage medium
CN110785719A (en) Method and system for instant object tagging via cross temporal verification in autonomous vehicles
CN110869559A (en) Method and system for integrated global and distributed learning in autonomous vehicles
CN110753953A (en) Method and system for object-centric stereo vision in autonomous vehicles via cross-modality verification
CN108594244B (en) Obstacle recognition transfer learning method based on stereoscopic vision and laser radar
CN103770704A (en) System and method for recognizing parking space line markings for vehicle
US20240017747A1 (en) Method and system for augmenting lidar data
CN114596555B (en) Obstacle point cloud data screening method and device, electronic equipment and storage medium
CN117576652B (en) Road object identification method and device, storage medium and electronic equipment
CN111402632B (en) Risk prediction method for pedestrian movement track at intersection
CN111160132B (en) Method and device for determining lane where obstacle is located, electronic equipment and storage medium
Yebes et al. Learning to automatically catch potholes in worldwide road scene images
CN110909656A (en) Pedestrian detection method and system with integration of radar and camera
Börcs et al. A model-based approach for fast vehicle detection in continuously streamed urban LIDAR point clouds
Philipp et al. Automated 3d object reference generation for the evaluation of autonomous vehicle perception
CN115841660A (en) Distance prediction method, device, equipment, storage medium and vehicle
CN114241373A (en) End-to-end vehicle behavior detection method, system, equipment and storage medium
US20230025579A1 (en) High-definition mapping
CN117593892B (en) Method and device for acquiring true value data, storage medium and electronic equipment
CN117593686B (en) Model evaluation method and device based on vehicle condition true value data
Tarko et al. Guaranteed lidar-aided multi-object tracking at road intersections

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination