CN115953762A - 3D travelable space detection method, device and equipment based on multi-view image - Google Patents

3D travelable space detection method, device and equipment based on multi-view image

Info

Publication number
CN115953762A
Authority
CN
China
Prior art keywords
matrix
view
characteristic
feature
space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310046329.0A
Other languages
Chinese (zh)
Other versions
CN115953762B (en)
Inventor
江建山
罗宇亮
黄乐涵
彭易锦
方志杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GAC Aion New Energy Automobile Co Ltd
Original Assignee
GAC Aion New Energy Automobile Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by GAC Aion New Energy Automobile Co Ltd filed Critical GAC Aion New Energy Automobile Co Ltd
Priority to CN202310046329.0A
Publication of CN115953762A
Application granted
Publication of CN115953762B
Legal status: Active (Current)
Anticipated expiration

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The embodiment of the application provides a method, an apparatus, a device and a computer-readable storage medium for detecting a 3D travelable space based on multi-view images, wherein the method comprises the following steps: acquiring pictures captured by a plurality of cameras on a vehicle; generating a feature matrix of each camera's captured picture; acquiring depth probabilities of the feature values of the feature matrix at different depths of the camera's viewing cone; generating a viewing cone feature point cloud matrix according to the feature values of the feature matrix and the depth probabilities corresponding to the feature values; generating a bird's-eye view feature according to the viewing cone feature point cloud matrices of the plurality of cameras; and detecting the travelable space according to the bird's-eye view feature. By implementing the embodiment of the application, image depth does not need to be acquired through a depth sensor, which saves hardware cost and improves detection speed.

Description

3D travelable space detection method, device and equipment based on multi-view image
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a method, an apparatus, a device, and a computer-readable storage medium for detecting a 3D travelable space based on a multi-view image.
Background
In the field of intelligent vehicle driving, sensing environmental information around a vehicle is the basis for realizing automatic driving functions such as path planning and automatic parking. The travelable space detection is one of approaches for realizing environment perception, and plays an important role in assisting driving or driving safety of automatic driving by judging whether the space around the vehicle can be traveled or not.
Travelable space detection methods in the prior art generally rely on depth information that must be acquired separately, which increases hardware cost and reduces detection speed.
Disclosure of Invention
An object of the embodiments of the present application is to provide a method, an apparatus, a device, and a medium for detecting a 3D travelable space based on a multi-view image, which can reduce hardware cost in a travelable space detection process and improve detection speed.
In a first aspect, an embodiment of the present application provides a method for detecting a 3D travelable space based on a multiview image, including:
acquiring shooting pictures of a plurality of cameras on a vehicle;
generating a characteristic matrix of a shot picture of each camera;
acquiring depth probabilities of the feature values of the feature matrix at different depths of a viewing cone of the camera;
generating a view cone characteristic point cloud matrix according to the characteristic value of the characteristic matrix and the depth probability corresponding to the characteristic value;
generating aerial view characteristics according to the viewing cone characteristic point cloud matrixes of the plurality of cameras;
and detecting the travelable space according to the aerial view characteristics.
In this implementation, image depth does not need to be acquired through a depth sensor. Instead, the pictures captured by the multiple cameras are acquired directly, a feature matrix is generated for each camera's picture, the depth probabilities of the feature values of the feature matrix at different depths of the camera's viewing cone are acquired, and the viewing cone feature point cloud matrix is generated according to the feature values and their corresponding depth probabilities, which saves hardware cost and improves detection speed.
Further, the step of obtaining depth probabilities of the eigenvalues of the feature matrix at different depths of the view frustum of the camera comprises:
and inputting the characteristic matrix into a pre-trained convolution network to obtain the depth probability of the characteristic value of the characteristic matrix at different depths of the visual cone of the camera.
Further, the step of generating a view cone feature point cloud matrix according to the feature values of the feature matrix and the depth probabilities corresponding to the feature values includes:
acquiring the highest depth probability from a plurality of depth probabilities corresponding to each characteristic value;
and multiplying each characteristic value of the characteristic matrix by the highest depth probability corresponding to each characteristic value to obtain the viewing cone characteristic point cloud matrix.
Further, the step of generating aerial view features from the viewing cone feature point cloud matrices of the plurality of cameras includes:
acquiring an internal reference matrix of each camera, a rotation matrix from a camera coordinate system to a vehicle body coordinate system, and a translation matrix from the camera coordinate system to the vehicle body coordinate system;
acquiring the three-dimensional coordinates of each feature point of the view cone feature point cloud matrix of each camera according to the internal reference matrix of each camera, the rotation matrix from the camera coordinate system to the vehicle body coordinate system and the translation matrix from the camera coordinate system to the vehicle body coordinate system;
mapping each feature point of the view cone feature point cloud matrix of each camera into a pre-constructed aerial view space according to the three-dimensional coordinates of each feature point of the view cone feature point cloud matrix of each camera to obtain a plurality of mapped feature points;
and generating the aerial view characteristics according to the plurality of mapped characteristic points.
Further, the bird's eye view space is composed of a plurality of grids;
the step of mapping each feature point of the view cone feature point cloud matrix of each camera into a pre-constructed aerial view space according to the three-dimensional coordinates of each feature point of the view cone feature point cloud matrix of each camera to obtain a plurality of mapped feature points comprises the following steps:
acquiring the range of the aerial view space and the unit size of the grid;
acquiring the position coordinates of each characteristic point of a view cone characteristic point cloud matrix of each camera in the aerial view space according to the range of the aerial view space, the unit size of the grid and the three-dimensional coordinates of each characteristic point;
and mapping each feature point of the view cone feature point cloud matrix of each camera into a pre-constructed aerial view space according to the position coordinates of the feature points in the aerial view space to obtain a plurality of mapped feature points.
Further, the step of acquiring the shot pictures of a plurality of cameras on the vehicle includes:
acquiring multiple groups of shot pictures of multiple cameras on the vehicle;
the step of generating the bird's-eye view feature according to the plurality of mapped feature points includes:
acquiring position codes of the plurality of mapped feature points according to the coordinates of the plurality of mapped feature points in the aerial view space;
and generating the aerial view characteristics according to the position codes of the plurality of mapped characteristic points.
Further, the step of obtaining the position codes of the plurality of mapped feature points according to the coordinates of the plurality of mapped feature points in the bird's eye view space includes:
acquiring the product of the x coordinate of each mapped feature point in the aerial view space, the grid number of the aerial view space in the y direction, the grid number of the aerial view space in the z direction and the group number of the shot pictures to obtain a first product;
obtaining the product of the y coordinate of each mapped feature point in the aerial view space, the grid number in the z direction and the group number of the shot pictures to obtain a second product;
obtaining the product of the z coordinate of each mapped feature point in the aerial view space and the group number of the shot pictures to obtain a third product;
and adding the first product, the second product, the third product and the group serial number of the shooting picture corresponding to the mapped feature point to obtain the position code of the mapped feature point.
Further, the step of generating the bird's-eye view feature from the position codes of the plurality of mapped feature points includes:
determining the mapped feature points positioned on the same grid according to the position codes;
adding the mapped feature point values positioned in the same grid to obtain feature values corresponding to the same grid;
and generating the aerial view characteristic according to the characteristic value of the same grid.
Further, the step of detecting a travelable space according to the bird's eye view feature includes:
inputting the aerial view characteristics into an environment characteristic perception model to obtain first aerial view characteristics;
inputting the first aerial view characteristics into a travelable space detection head to obtain a travelable space inference result;
and acquiring the travelable space according to the travelable space reasoning result.
Further, the travelable space inference result is a heatmap corresponding to the first bird's eye view feature and containing foreground background information;
the step of obtaining the travelable space according to the travelable space inference result includes:
generating a linear scanner;
wrapping the linear scanner around the aerial view feature at a preset speed;
acquiring a first index value of a characteristic point scanned on an aerial view characteristic by the linear scanner;
determining a maximum response point based on softmax in feature points corresponding to the first index value on the heatmap;
and acquiring a second index value of the maximum response point, and acquiring the travelable space in the first aerial view feature according to the second index value.
In a second aspect, an embodiment of the present application provides a device for detecting a 3D travelable space based on a multiview image, including:
a shot picture acquisition module for acquiring shot pictures of a plurality of cameras on a vehicle;
the characteristic matrix generation module is used for generating a characteristic matrix of a shot picture of each camera;
a probability obtaining module, configured to obtain depth probabilities of feature values of the feature matrix at different depths of a viewing cone of the camera;
the point cloud matrix generation module is used for generating a viewing cone characteristic point cloud matrix according to the characteristic value of the characteristic matrix and the depth probability corresponding to the characteristic value;
the aerial view characteristic generation module is used for generating aerial view characteristics according to the viewing cone characteristic point cloud matrixes of the plurality of cameras;
and the travelable space detection module is used for detecting the travelable space according to the aerial view characteristics.
In a third aspect, an electronic device provided in an embodiment of the present application includes: memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method according to any of the first aspect when executing the computer program.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium having instructions stored thereon, which, when executed on a computer, cause the computer to perform the method according to any one of the first aspect.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.
Fig. 1 is a schematic flowchart of a method for detecting a 3D travelable space based on a multi-view image according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a multi-view image-based 3D travelable space detection apparatus according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Example 1
Referring to fig. 1, an embodiment of the present application provides a method for detecting a 3D travelable space based on a multi-view image, including:
s101: acquiring shooting pictures of a plurality of cameras on a vehicle;
s102: generating a characteristic matrix of a shot picture of each camera;
Specifically, multi-scale image features are generated using a ResNet-50 backbone, and the multi-scale image features are up-sampled and fused using an FPN (Feature Pyramid Network) to generate the feature matrix of the captured picture.
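A minimal sketch of step S102, assuming a torchvision-style ResNet-50 backbone, an FPN with 256 output channels and an input resolution of 256 × 704 (all of which are illustrative choices, not values fixed by the application):

```python
import torch
import torchvision
from torchvision.models.feature_extraction import create_feature_extractor
from torchvision.ops import FeaturePyramidNetwork

# ResNet-50 backbone; the outputs of stages layer1-layer4 serve as the multi-scale image features.
backbone = torchvision.models.resnet50(weights=None)
extractor = create_feature_extractor(
    backbone,
    return_nodes={"layer1": "c2", "layer2": "c3", "layer3": "c4", "layer4": "c5"},
)
# The FPN fuses the multi-scale features into 256-channel maps by top-down up-sampling.
fpn = FeaturePyramidNetwork(in_channels_list=[256, 512, 1024, 2048], out_channels=256)

image = torch.randn(1, 3, 256, 704)        # one camera picture (resolution assumed)
multi_scale = extractor(image)             # dict of C2-C5 feature maps
fused = fpn(multi_scale)                   # dict of fused maps at the same scales
feature_matrix = fused["c4"]               # e.g. the 1/16-scale map, shape (1, 256, 16, 44)
```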
S103: acquiring depth probabilities of characteristic values of the characteristic matrix at different depths of a visual cone of the camera;
s104: generating a viewing cone characteristic point cloud matrix according to the characteristic value of the characteristic matrix and the depth probability corresponding to the characteristic value;
s105: generating aerial view characteristics according to the viewing cone characteristic point cloud matrixes of the plurality of cameras;
s106: and detecting the travelable space according to the aerial view characteristics.
In the above embodiment, the plurality of cameras are installed at different positions of the vehicle. Preferably, six cameras are used in the embodiment of the present application, facing forward, front-left, front-right, backward, rear-left and rear-right respectively.
In this implementation process, image depth does not need to be acquired through a depth sensor; instead, the pictures captured by the multiple cameras are acquired directly, a feature matrix is generated for each camera's picture, the depth probabilities of the feature values at different depths of the camera's viewing cone are acquired, and the viewing cone feature point cloud matrix is generated according to the feature values and their corresponding depth probabilities, which saves hardware cost. In the prior art, travelable space detection is realized by analyzing the point cloud of a laser radar, but the detection range of the laser radar limits the applicable scenarios of that scheme; for example, a laser radar with a detection range of 60 m is only suitable for low-speed driving. Because the embodiment of the application derives depth directly from images, it is not limited by such a detection range.
In one possible implementation, S103 includes: and inputting the feature matrix into a pre-trained convolution network to obtain the depth probability of the feature value of the feature matrix at different depths of the visual cone of the camera.
In the prior art, the size and distance information of obstacles are also calculated from the parallax of a binocular camera to realize travelable space detection, but accurate parallax calculation is inefficient.
In one possible implementation, S104 includes: acquiring the highest depth probability from a plurality of depth probabilities corresponding to each characteristic value;
and multiplying each characteristic value of the characteristic matrix by the highest depth probability corresponding to each characteristic value to obtain a viewing cone characteristic point cloud matrix.
Illustratively, in the embodiment of the application, the viewing cone is constructed from 1 m to 60 m away from the camera, with a candidate depth value every 1 m, i.e. 59 discrete depth values. A convolution layer predicts the probability of the feature values of the image feature matrix lying at each of these depths, softmax selects the depth with the highest probability value, and that highest probability is multiplied with the corresponding image feature to form the viewing cone feature point cloud matrix.
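A minimal sketch of steps S103 and S104 under the 59-bin configuration above; the channel count, spatial size and the 1 × 1 convolution are assumptions for illustration, and in practice the convolution network is pre-trained:

```python
import torch
import torch.nn as nn

C, D = 256, 59                              # feature channels, number of discrete depth values
depth_head = nn.Conv2d(C, D, kernel_size=1) # the pre-trained convolution network (assumed 1x1 here)

feature_matrix = torch.randn(1, C, 16, 44)  # feature matrix from the ResNet-50 + FPN backbone
depth_prob = depth_head(feature_matrix).softmax(dim=1)       # (1, 59, H, W) depth probabilities

best_prob, best_bin = depth_prob.max(dim=1, keepdim=True)    # highest depth probability per pixel
frustum_features = feature_matrix * best_prob                # viewing cone feature point cloud matrix
best_depth = 1.0 + best_bin.float()                          # metric depth of the selected bin (1 m spacing assumed)
```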
In one possible implementation, S105 includes: acquiring an internal reference matrix of each camera, a rotation matrix from a camera coordinate system to a vehicle body coordinate system, and a translation matrix from the camera coordinate system to the vehicle body coordinate system; acquiring the three-dimensional coordinates of each feature point of the view cone feature point cloud matrix of each camera according to the internal reference matrix of each camera, the rotation matrix from the camera coordinate system to the vehicle body coordinate system and the translation matrix from the camera coordinate system to the vehicle body coordinate system; mapping each characteristic point of the view cone characteristic point cloud matrix of each camera into a pre-constructed aerial view space according to the three-dimensional coordinates of each characteristic point of the view cone characteristic point cloud matrix of each camera to obtain a plurality of mapped characteristic points; and generating aerial view characteristics according to the plurality of mapped characteristic points.
Specifically, the camera parameters comprise the camera internal reference (intrinsic) matrix, a rotation matrix from the camera coordinate system to the vehicle body coordinate system, and a translation matrix from the camera coordinate system to the vehicle body coordinate system. First, the inverse of the camera internal reference matrix is multiplied with the coordinates of the feature points of each camera's view cone feature point cloud matrix to transform them from the image coordinate system to the camera coordinate system; the result is then multiplied with the rotation matrix, and finally the translation matrix is added, completing the coordinate transformation from the image coordinate system to the vehicle body coordinate system and yielding the three-dimensional coordinates of the feature points of each camera's view cone feature point cloud matrix.
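Under the standard pinhole model this transform chain can be sketched as follows; the function name and the NumPy representation are assumptions for illustration, with K, R and t denoting the internal reference, rotation and translation matrices described above:

```python
import numpy as np

def image_to_body(u, v, depth, K, R, t):
    """Lift an image pixel (u, v) with a predicted depth to vehicle body coordinates."""
    pixel = np.array([u * depth, v * depth, depth])  # homogeneous pixel coordinates scaled by depth
    p_cam = np.linalg.inv(K) @ pixel                 # image coordinate system -> camera coordinate system
    p_body = R @ p_cam + t                           # camera coordinate system -> vehicle body coordinate system
    return p_body
```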
In one possible embodiment, the bird's eye view space is made up of a plurality of grids; mapping each feature point of the view cone feature point cloud matrix of each camera into a pre-constructed aerial view space according to the three-dimensional coordinates of each feature point of the view cone feature point cloud matrix of each camera to obtain a plurality of mapped feature points, wherein the step comprises the following steps of: acquiring the range of the aerial view space and the unit size of the grid; acquiring the position coordinates of each characteristic point of a viewing cone characteristic point cloud matrix of each camera in the aerial view space according to the range of the aerial view space, the unit size of the grid and the three-dimensional coordinates of each characteristic point; and mapping each characteristic point of the view cone characteristic point cloud matrix of each camera in a pre-constructed aerial view space according to the position coordinates of the characteristic points in the aerial view space to obtain a plurality of mapped characteristic points.
Illustratively, the bird's-eye view space constructed in the embodiment of the present application ranges over [-50 m, 50 m] in the x direction, [-50 m, 50 m] in the y direction and [-10 m, 10 m] in the z direction, with a grid interval of 0.5 m in the x and y directions and of 20 m in the z direction. The position coordinates of each feature point in the bird's-eye view space are obtained by computing the offset between the feature point's position in three-dimensional space and the coordinate of the first grid cell and mapping the feature point to the corresponding grid cell, and feature points lying outside the boundary of the bird's-eye view space are filtered out according to the range of the space.
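A minimal sketch of this mapping with the grid parameters above; the NumPy layout and function name are assumptions for illustration:

```python
import numpy as np

bev_min = np.array([-50.0, -50.0, -10.0])   # lower bounds of the BEV space in x, y, z
bev_max = np.array([ 50.0,  50.0,  10.0])   # upper bounds
cell    = np.array([  0.5,   0.5,  20.0])   # grid interval per axis

def map_to_bev(points_xyz):
    """points_xyz: (N, 3) body-frame coordinates -> integer grid indices of the kept points."""
    inside = np.all((points_xyz >= bev_min) & (points_xyz < bev_max), axis=1)  # filter out-of-range points
    kept = points_xyz[inside]
    return ((kept - bev_min) // cell).astype(np.int64)  # offset from the first grid cell, in cells
```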
In one possible embodiment, the step of acquiring the shot pictures of a plurality of cameras on the vehicle comprises: acquiring multiple groups of shot pictures of multiple cameras on a vehicle; generating a bird's-eye view feature from the plurality of mapped feature points, comprising: acquiring position codes of the plurality of mapped feature points according to coordinates of the plurality of mapped feature points in the aerial view space; and generating the bird's-eye view feature according to the position codes of the plurality of mapped feature points.
Exemplarily, the embodiment of the application is to detect the travelable space according to multiple groups of shot pictures.
In one possible embodiment, the step of obtaining the position codes of the plurality of mapped feature points according to the coordinates of the plurality of mapped feature points in the bird's eye view space includes: obtaining the product of the x coordinate of each mapped feature point in the aerial view space, the grid number in the y direction of the aerial view space, the grid number in the z direction of the aerial view space and the group number of the shot pictures to obtain a first product; obtaining the product of the y coordinate of each mapped feature point in the aerial view space, the grid number in the z direction and the group number of the shot pictures to obtain a second product; obtaining the product of the z coordinate of each mapped feature point in the aerial view space and the group number of the shot pictures to obtain a third product; and adding the first product, the second product, the third product and the group serial number of the shooting picture corresponding to the mapped feature point to obtain the position code of the mapped feature point.
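Written out, the position code described above is x * (N_y * N_z * B) + y * (N_z * B) + z * B + b, where N_y and N_z are the numbers of grid cells in the y and z directions of the bird's-eye view space, B is the number of groups of captured pictures and b is the group serial number. A minimal sketch (names assumed for illustration):

```python
def position_code(x, y, z, b, N_y, N_z, B):
    """x, y, z: grid indices of a mapped feature point; b: group serial number of its picture."""
    return x * (N_y * N_z * B) + y * (N_z * B) + z * B + b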
In one possible embodiment, the step of generating the bird's-eye view feature from the position codes of the plurality of mapped feature points includes: determining the mapped feature points positioned on the same grid according to the position codes; adding the mapped feature point values positioned in the same grid to obtain feature values corresponding to the same grid; and generating the aerial view characteristic according to the characteristic value of the same grid.
Here, the same position code means that the points are located in the same grid cell and come from the same group of captured pictures.
That is, after the feature value of each grid cell is obtained, an initial bird's-eye view feature is generated based on the position and feature value of each grid cell; since the matrix corresponding to this initial bird's-eye view feature has a Z dimension of 1, the Z dimension is eliminated to obtain the bird's-eye view feature.
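A minimal sketch of this pooling step, assuming a single group of pictures (B = 1) so that the position code reduces to a flat grid index, and assuming PyTorch tensors:

```python
import torch

def pool_to_bev(codes, point_feats, N_x, N_y, N_z, C):
    """codes: (N,) long tensor of position codes; point_feats: (N, C) mapped feature point values."""
    bev = torch.zeros(N_x * N_y * N_z, C)
    bev.index_add_(0, codes, point_feats)           # sum the feature points falling in the same grid cell
    bev = bev.view(N_x, N_y, N_z, C).squeeze(2)     # Z = 1, so the Z dimension is eliminated
    return bev.permute(2, 0, 1)                     # (C, N_x, N_y) bird's-eye view feature
```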
In a possible embodiment, the step of detecting the travelable space according to the bird's eye view characteristics comprises: inputting the aerial view characteristics into an environment characteristic perception model to obtain first aerial view characteristics; inputting the first aerial view characteristics into a travelable space detection head to obtain a travelable space inference result; and acquiring the travelable space according to the travelable space inference result.
In the above embodiment, the environment feature perception model is a ResNet-18: multi-scale features corresponding to the bird's-eye view feature are generated by the ResNet-18, and these multi-scale features are input to an FPN for up-sampling, so that multi-scale feature fusion is realized and the first bird's-eye view feature is obtained.
The travelable space detection head is a convolutional network, and finally outputs a heatmap containing foreground and background information and the classification of boundary points.
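A minimal sketch of such a detection head; the channel widths, the two-branch layout and the class count are assumptions for illustration, as the application only specifies that the head is a convolutional network outputting a foreground/background heatmap and a boundary-point classification:

```python
import torch.nn as nn

class DrivableSpaceHead(nn.Module):
    """Convolutional detection head applied to the first bird's-eye view feature."""
    def __init__(self, in_channels=256, num_boundary_classes=2):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Conv2d(in_channels, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.heatmap = nn.Conv2d(128, 1, kernel_size=1)                          # foreground/background heatmap
        self.boundary_cls = nn.Conv2d(128, num_boundary_classes, kernel_size=1)  # boundary point classification

    def forward(self, first_bev_feature):
        x = self.shared(first_bev_feature)
        return self.heatmap(x), self.boundary_cls(x)
```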
The embodiment of the present application further provides a method for adjusting the parameters of the aforementioned models: a classification loss is computed on the boundary point classification using Focal Loss, a regression loss is computed on the foreground/background output using L1 Loss, and the model parameters are adjusted according to the loss calculation results.
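A minimal sketch of this loss computation, assuming the torchvision focal-loss helper, equal loss weights and matching tensor shapes for predictions and targets:

```python
import torch.nn.functional as F
from torchvision.ops import sigmoid_focal_loss

def detection_loss(boundary_pred, boundary_target, heatmap_pred, heatmap_target):
    """Classification loss on boundary points plus regression loss on the heatmap."""
    cls_loss = sigmoid_focal_loss(boundary_pred, boundary_target, reduction="mean")  # Focal Loss
    reg_loss = F.l1_loss(heatmap_pred, heatmap_target)                               # L1 Loss
    return cls_loss + reg_loss   # back-propagated to adjust the model parameters
```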
In one possible implementation, the travelable space inference result is a heatmap corresponding to the first bird's-eye view feature and containing foreground background information; the step of obtaining the travelable space according to the travelable space inference result comprises the following steps: generating a linear scanner; surrounding the aerial view characteristic with a linear scanner at a preset speed; acquiring a first index value of a characteristic point scanned on the aerial view characteristic by the linear scanner;
determining a maximal response point based on softmax in the feature points corresponding to the first index value on the heatmap;
and acquiring a second index value of the maximum response point, and acquiring the travelable space in the first aerial view characteristic according to the second index value.
Illustratively, a linear scanner is preset to sweep around the bird's-eye view feature in 0.5-degree steps. During the sweep, the bird's-eye view feature is first divided into four 50 × 50 regions at the sweep angles of 0, 90, 180 and 270 degrees; the positions of the feature points lying on the scanner line are obtained sequentially, fixing the y axis or the x axis as appropriate; the feature point with the highest response on the scanner line, i.e. the boundary point with the highest probability value at the corresponding position of the heatmap, is obtained through softmax; and the boundary point obtained in each interval is connected linearly to the previous one, yielding the boundary line of the travelable space.
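A simplified sketch of the scanning idea, assuming a 200 × 200 BEV heatmap centred on the vehicle and sampling one cell per step along each ray; the per-quadrant fixed-axis traversal of the application is replaced here by direct ray sampling for brevity:

```python
import numpy as np

def scan_boundary(heatmap, step_deg=0.5, max_range=100):
    """heatmap: (H, W) foreground/background heatmap in BEV, vehicle at the centre."""
    h, w = heatmap.shape
    cx, cy = w // 2, h // 2
    boundary = []
    for angle in np.arange(0.0, 360.0, step_deg):
        rad = np.deg2rad(angle)
        # cells crossed by the scanner line, sampled from the vehicle outwards
        r = np.arange(max_range)
        xs = np.clip((cx + r * np.cos(rad)).astype(int), 0, w - 1)
        ys = np.clip((cy + r * np.sin(rad)).astype(int), 0, h - 1)
        best = heatmap[ys, xs].argmax()          # maximum response point along this scan line
        boundary.append((xs[best], ys[best]))    # boundary point of the travelable space
    return boundary                              # consecutive points are connected to form the boundary line
```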
Example 2
Referring to fig. 2, an embodiment of the present application provides a 3D travelable space detection apparatus based on a multi-view image, including:
a photographed picture acquiring module 1 for acquiring photographed pictures of a plurality of cameras on a vehicle;
a feature matrix generation module 2, configured to generate a feature matrix of a captured image of each camera;
the probability acquisition module 3 is used for acquiring the depth probability of the characteristic value of the characteristic matrix at different depths of a viewing cone of the camera;
the point cloud matrix generation module 4 is used for generating a viewing cone characteristic point cloud matrix according to the characteristic value of the characteristic matrix and the depth probability corresponding to the characteristic value;
the aerial view characteristic generating module 5 is used for generating aerial view characteristics according to the viewing cone characteristic point cloud matrixes of the plurality of cameras;
and the travelable space detection module 6 is used for detecting the travelable space according to the aerial view characteristics.
In a possible embodiment, the probability obtaining module 3 is further configured to input the feature matrix into a pre-trained convolution network, so as to obtain depth probabilities of feature values of the feature matrix at different depths of a view cone of the camera.
In a possible embodiment, the point cloud matrix generation module 4 is further configured to obtain a highest depth probability among a plurality of depth probabilities corresponding to each feature value; and multiplying each characteristic value of the characteristic matrix by the highest depth probability corresponding to each characteristic value to obtain a viewing cone characteristic point cloud matrix.
In a possible embodiment, the bird's eye view feature generation module 5 is further configured to acquire an internal reference matrix of each camera, a rotation matrix from the camera coordinate system to the vehicle body coordinate system, and a translation matrix from the camera coordinate system to the vehicle body coordinate system; acquiring the three-dimensional coordinates of each feature point of the view cone feature point cloud matrix of each camera according to the internal reference matrix of each camera, the rotation matrix from the camera coordinate system to the vehicle body coordinate system and the translation matrix from the camera coordinate system to the vehicle body coordinate system; mapping each characteristic point of the view cone characteristic point cloud matrix of each camera into a pre-constructed aerial view space according to the three-dimensional coordinates of each characteristic point of the view cone characteristic point cloud matrix of each camera to obtain a plurality of mapped characteristic points; and generating aerial view characteristics according to the plurality of mapped characteristic points.
In one possible embodiment, the bird's eye view space is made up of a plurality of grids; the bird's-eye view feature generation module 5 is further used for acquiring the range of the bird's-eye view space and the unit size of the grid; acquiring the position coordinates of each characteristic point of the view cone characteristic point cloud matrix of each camera in the aerial view space according to the range of the aerial view space, the unit size of the grid and the three-dimensional coordinates of each characteristic point; and mapping each characteristic point of the view cone characteristic point cloud matrix of each camera in a pre-constructed aerial view space according to the position coordinates of the characteristic points in the aerial view space to obtain a plurality of mapped characteristic points.
In a possible embodiment, the shot image acquisition module 1 is further configured to acquire a plurality of sets of shot images of a plurality of cameras on the vehicle; the bird's-eye view feature generation module 5 is further configured to obtain position codes of the plurality of mapped feature points according to coordinates of the plurality of mapped feature points in the bird's-eye view space;
and generating the aerial view characteristics according to the position codes of the plurality of mapped characteristic points.
In a possible embodiment, the bird's-eye view feature generation module 5 is further configured to obtain a product of an x coordinate of each mapped feature point in the bird's-eye view space and the number of grids in the y direction of the bird's-eye view space, the number of grids in the z direction of the bird's-eye view space, and the number of sets of captured pictures, and obtain a first product; obtaining the product of the y coordinate of each mapped feature point in the aerial view space, the grid number in the z direction and the group number of the shot pictures to obtain a second product; obtaining the product of the z coordinate of each mapped feature point in the aerial view space and the group number of the shot pictures to obtain a third product; and adding the first product, the second product, the third product and the group serial number of the shooting picture corresponding to the mapped feature point to obtain the position code of the mapped feature point.
In a possible embodiment, the drivable space detecting module 6 is further configured to input the bird's-eye view image feature into the environment feature perception model to obtain a first bird's-eye view image feature;
inputting the first aerial view feature into a travelable space detection head to obtain a travelable space inference result;
and acquiring the travelable space according to the travelable space inference result.
In a possible implementation, the travelable space inference result is a heatmap containing foreground-background information corresponding to the first bird's-eye view feature, and the travelable space detection module 6 is further configured to generate a linear scanner; sweep the linear scanner around the bird's-eye view feature at a preset speed; acquire a first index value of a feature point scanned by the linear scanner on the bird's-eye view feature; determine a maximum response point based on softmax among the feature points corresponding to the first index value on the heatmap; and acquire a second index value of the maximum response point and acquire the travelable space in the first bird's-eye view feature according to the second index value.
Fig. 3 is a schematic view of an electronic device, and fig. 3 is a block diagram of the electronic device according to an embodiment of the present disclosure. The electronic device may comprise a processor 31, a communication interface 32, a memory 33 and at least one communication bus 34. Wherein the communication bus 34 is used for realizing direct connection communication of these components. In the embodiment of the present application, the communication interface 32 of the electronic device is used for performing signaling or data communication with other node devices. The processor 31 may be an integrated circuit chip having signal processing capabilities.
The Processor 31 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. The various methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed by it. A general-purpose processor may be a microprocessor, or the processor 31 may be any conventional processor or the like.
The Memory 33 may be, but is not limited to, a Random Access Memory (RAM), a Read Only Memory (ROM), a Programmable Read Only Memory (PROM), an Erasable Programmable Read Only Memory (EPROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), and the like. The memory 33 has stored therein computer-readable instructions which, when executed by the processor 31, enable the electronic device to perform the steps involved in the above-described method embodiments.
Optionally, the electronic device may further include a memory controller, an input output unit.
The memory 33, the memory controller, the processor 31, the peripheral interface, and the input/output unit are electrically connected to each other directly or indirectly to realize data transmission or interaction. For example, these components may be electrically connected to each other via one or more communication buses 34. The processor 31 is adapted to execute executable modules stored in the memory 33, such as software functional modules or computer programs comprised by the electronic device.
The input/output unit is used for presenting tasks to a user and for setting an optional time interval or a preset execution time when a task is created, so as to realize interaction between the user and the server. The input/output unit may be, but is not limited to, a mouse, a keyboard, and the like.
It will be appreciated that the configuration shown in fig. 3 is merely illustrative and that the electronic device may include more or fewer components than shown in fig. 3 or have a different configuration than shown in fig. 3. The components shown in fig. 3 may be implemented in hardware, software, or a combination thereof.
The embodiments of the present application further provide a computer-readable storage medium, where instructions are stored in the computer-readable storage medium, and when the instructions are run on a computer, a computer program is executed by a processor to implement the method of the method embodiments, and for avoiding repetition, details are not repeated here.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above embodiments are merely examples of the present application and are not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined or explained in subsequent figures.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.

Claims (13)

1. A3D travelable space detection method based on multi-view images is characterized by comprising the following steps:
acquiring shooting pictures of a plurality of cameras on a vehicle;
generating a characteristic matrix of a shot picture of each camera;
acquiring depth probabilities of the feature values of the feature matrix at different depths of a viewing cone of the camera;
generating a view cone characteristic point cloud matrix according to the characteristic value of the characteristic matrix and the depth probability corresponding to the characteristic value;
generating aerial view characteristics according to the viewing cone characteristic point cloud matrixes of the plurality of cameras;
and detecting the travelable space according to the aerial view characteristics.
2. The multi-view image-based 3D travelable space detection method according to claim 1, wherein the step of obtaining depth probabilities of the eigenvalues of the eigenvalue matrix at different depths of the view frustum of the camera comprises:
and inputting the characteristic matrix into a pre-trained convolution network to obtain the depth probability of the characteristic value of the characteristic matrix at different depths of the viewing cone of the camera.
3. The multi-view image-based 3D drivable space detection method as claimed in claim 2, wherein the step of generating a view cone feature point cloud matrix according to the feature values of the feature matrix and the depth probabilities corresponding to the feature values comprises:
acquiring the highest depth probability from a plurality of depth probabilities corresponding to each characteristic value;
and multiplying each characteristic value of the characteristic matrix by the highest depth probability corresponding to each characteristic value to obtain the viewing cone characteristic point cloud matrix.
4. The multi-view image-based 3D travelable space detection method according to claim 3, wherein the step of generating a bird's eye view feature from the view cone feature point cloud matrices of the plurality of cameras comprises:
acquiring an internal reference matrix of each camera, a rotation matrix from a camera coordinate system to a vehicle body coordinate system, and a translation matrix from the camera coordinate system to the vehicle body coordinate system;
acquiring the three-dimensional coordinates of each feature point of the view cone feature point cloud matrix of each camera according to the internal reference matrix of each camera, the rotation matrix from the camera coordinate system to the vehicle body coordinate system and the translation matrix from the camera coordinate system to the vehicle body coordinate system;
mapping each feature point of the view cone feature point cloud matrix of each camera into a pre-constructed aerial view space according to the three-dimensional coordinates of each feature point of the view cone feature point cloud matrix of each camera to obtain a plurality of mapped feature points;
and generating the aerial view characteristics according to the plurality of mapped characteristic points.
5. The multi-view image-based 3D travelable space detection method according to claim 4, characterized in that the bird's eye view space is composed of a plurality of grids;
the step of mapping each feature point of the view cone feature point cloud matrix of each camera into a pre-constructed aerial view space according to the three-dimensional coordinates of each feature point of the view cone feature point cloud matrix of each camera to obtain a plurality of mapped feature points comprises the following steps:
acquiring the range of the aerial view space and the unit size of the grid;
acquiring the position coordinates of each characteristic point of a view cone characteristic point cloud matrix of each camera in the aerial view space according to the range of the aerial view space, the unit size of the grid and the three-dimensional coordinates of each characteristic point;
and mapping each feature point of the view cone feature point cloud matrix of each camera into a pre-constructed aerial view space according to the position coordinates of the feature points in the aerial view space to obtain a plurality of mapped feature points.
6. The multiview image-based 3D drivable space detection method as claimed in claim 5, wherein the step of acquiring the photographed pictures of the plurality of cameras on the vehicle comprises:
acquiring multiple groups of shot pictures of multiple cameras on the vehicle;
the step of generating the bird's-eye view feature according to the plurality of mapped feature points includes:
acquiring position codes of the plurality of mapped feature points according to the coordinates of the plurality of mapped feature points in the aerial view space;
and generating the aerial view characteristics according to the position codes of the plurality of mapped characteristic points.
7. The multi-view image-based 3D travelable space detection method according to claim 6, wherein the step of obtaining the position codes of the plurality of mapped feature points from the coordinates of the plurality of mapped feature points in the bird's eye view space includes:
acquiring the product of the x coordinate of each mapped feature point in the aerial view space, the grid number of the aerial view space in the y direction, the grid number of the aerial view space in the z direction and the group number of the shot pictures to obtain a first product;
obtaining the product of the y coordinate of each mapped feature point in the aerial view space, the grid number in the z direction and the group number of the shot pictures to obtain a second product;
obtaining the product of the z coordinate of each mapped feature point in the aerial view space and the group number of the shot pictures to obtain a third product;
and adding the first product, the second product, the third product and the group serial number of the shooting picture corresponding to the mapped feature point to obtain the position code of the mapped feature point.
8. The multi-view image-based 3D travelable space detection method according to claim 7, wherein the step of generating the bird's eye view feature from the position codes of the plurality of mapped feature points includes:
determining the mapped feature points positioned on the same grid according to the position codes;
adding the mapped feature point values positioned in the same grid to obtain feature values corresponding to the same grid;
and generating the aerial view characteristic according to the characteristic value of the same grid.
9. The multi-view image-based 3D drivable space detection method as claimed in claim 7, wherein said step of detecting the drivable space in accordance with said bird's eye view features comprises:
inputting the aerial view characteristics into an environment characteristic perception model to obtain first aerial view characteristics;
inputting the first aerial view characteristics into a travelable space detection head to obtain a travelable space inference result;
and acquiring the travelable space according to the travelable space reasoning result.
10. The multi-view image-based 3D travelable space detection method according to claim 9, wherein the travelable space inference result is a heatmap containing foreground-background information corresponding to the first bird's-eye view feature;
the step of obtaining the travelable space according to the travelable space inference result includes:
generating a linear scanner;
encircling the linear scanner around the aerial view feature at a preset speed;
acquiring a first index value of a characteristic point scanned on an aerial view characteristic by the linear scanner;
determining a maximum response point based on softmax in feature points corresponding to the first index value on the heatmap;
and acquiring a second index value of the maximum response point, and acquiring the travelable space in the first aerial view feature according to the second index value.
11. A multi-view image-based 3D travelable space detection apparatus, comprising:
a shot picture acquisition module for acquiring shot pictures of a plurality of cameras on a vehicle;
the characteristic matrix generating module is used for generating a characteristic matrix of a shot picture of each camera;
a probability obtaining module, configured to obtain depth probabilities of feature values of the feature matrix at different depths of a viewing cone of the camera;
the point cloud matrix generation module is used for generating a viewing cone characteristic point cloud matrix according to the characteristic value of the characteristic matrix and the depth probability corresponding to the characteristic value;
the aerial view characteristic generation module is used for generating aerial view characteristics according to the viewing cone characteristic point cloud matrixes of the plurality of cameras;
and the travelable space detection module is used for detecting the travelable space according to the aerial view characteristics.
12. An electronic device, comprising: memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method according to any of claims 1-10 when executing the computer program.
13. A computer-readable storage medium having stored thereon instructions which, when executed on a computer, cause the computer to perform the method of any one of claims 1-10.
CN202310046329.0A 2023-01-31 2023-01-31 3D (three-dimensional) drivable space detection method, device and equipment based on multi-view images Active CN115953762B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310046329.0A CN115953762B (en) 2023-01-31 2023-01-31 3D (three-dimensional) drivable space detection method, device and equipment based on multi-view images

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310046329.0A CN115953762B (en) 2023-01-31 2023-01-31 3D (three-dimensional) drivable space detection method, device and equipment based on multi-view images

Publications (2)

Publication Number Publication Date
CN115953762A true CN115953762A (en) 2023-04-11
CN115953762B CN115953762B (en) 2023-05-26

Family

ID=85897661

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310046329.0A Active CN115953762B (en) 2023-01-31 2023-01-31 3D (three-dimensional) drivable space detection method, device and equipment based on multi-view images

Country Status (1)

Country Link
CN (1) CN115953762B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190094875A1 (en) * 2017-09-28 2019-03-28 Nec Laboratories America, Inc. Generating occlusion-aware bird eye view representations of complex road scenes
CN113628265A (en) * 2021-08-07 2021-11-09 北京地平线信息技术有限公司 Vehicle panoramic point cloud generation method and depth estimation model training method and device
CN114283394A (en) * 2022-01-03 2022-04-05 南昌智能新能源汽车研究院 Traffic target detection system with integrated vehicle-mounted sensor
CN114565706A (en) * 2022-02-25 2022-05-31 苏州易航远智智能科技有限公司 Point cloud processing method and device based on viewing cone, electronic equipment and storage medium
CN115331025A (en) * 2022-07-21 2022-11-11 北京迈格威科技有限公司 Three-dimensional target detection method, electronic device and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
同志学; 赵涛; 贺利乐; 王消为: "Positioning and travel speed detection of engineering vehicles based on binocular vision" (基于双目视觉的工程车辆定位与行驶速度检测), China Mechanical Engineering (中国机械工程)

Also Published As

Publication number Publication date
CN115953762B (en) 2023-05-26

Similar Documents

Publication Publication Date Title
CN112419494B (en) Obstacle detection and marking method and device for automatic driving and storage medium
CN107636680B (en) Obstacle detection method and device
CN105335955B (en) Method for checking object and object test equipment
CN111179329B (en) Three-dimensional target detection method and device and electronic equipment
CN109360239B (en) Obstacle detection method, obstacle detection device, computer device, and storage medium
CN113362444A (en) Point cloud data generation method and device, electronic equipment and storage medium
CN112106111A (en) Calibration method, calibration equipment, movable platform and storage medium
CN113111513B (en) Sensor configuration scheme determining method and device, computer equipment and storage medium
EP3953903A1 (en) Scale-aware monocular localization and mapping
CN114119992A (en) Multi-mode three-dimensional target detection method and device based on image and point cloud fusion
CN114445480A (en) Transformer-based thermal infrared image stereo matching method and device
CN114919584A (en) Motor vehicle fixed point target distance measuring method and device and computer readable storage medium
CN115953762B (en) 3D (three-dimensional) drivable space detection method, device and equipment based on multi-view images
CN116168384A (en) Point cloud target detection method and device, electronic equipment and storage medium
Wu et al. HeightFormer: Explicit Height Modeling without Extra Data for Camera-only 3D Object Detection in Bird's Eye View
JP2021051347A (en) Distance image generation apparatus and distance image generation method
CN116012712A (en) Object general feature-based target detection method, device, equipment and medium
CN115240168A (en) Perception result obtaining method and device, computer equipment and storage medium
EP3842757B1 (en) Verification method and device for modeling route, unmanned vehicle, and storage medium
CN115407338A (en) Vehicle environment information sensing method and system
CN114359891A (en) Three-dimensional vehicle detection method, system, device and medium
EP3961556A1 (en) Object recognition device and object recognition method
CN111986248A (en) Multi-view visual perception method and device and automatic driving automobile
CN111898396A (en) Obstacle detection method and device
CN117152231B (en) Three-dimensional shape estimation method and device for preset type target and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant