CN117058640A - 3D lane line detection method and system integrating image features and traffic target semantics - Google Patents

3D lane line detection method and system integrating image features and traffic target semantics

Info

Publication number
CN117058640A
CN117058640A (application CN202311039603.8A)
Authority
CN
China
Prior art keywords
image
features
lane line
top view
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311039603.8A
Other languages
Chinese (zh)
Inventor
徐林海
鲁志瑶
张皓霖
王若彤
陈仕韬
郑南宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University filed Critical Xian Jiaotong University
Priority to CN202311039603.8A priority Critical patent/CN117058640A/en
Publication of CN117058640A publication Critical patent/CN117058640A/en
Pending legal-status Critical Current

Classifications

    • G06V 20/588: Recognition of the road, e.g. of lane markings; recognition of the vehicle driving pattern in relation to the road (context of the image exterior to a vehicle, using sensors mounted on the vehicle)
    • G06N 3/0455: Auto-encoder networks; encoder-decoder networks
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • G06V 10/26: Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06V 10/764: Image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06V 10/806: Fusion, i.e. combining data from various sources at the sensor, preprocessing, feature extraction or classification level, of extracted features
    • G06V 10/82: Image or video recognition or understanding using pattern recognition or machine learning, using neural networks
    • G06V 20/70: Labelling scene content, e.g. deriving syntactic or semantic representations
    • Y02T 10/40: Engine management systems (climate change mitigation technologies related to road transport)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses a 3D lane line detection method and system integrating image features and traffic target semantics. The method is based on Gen-LaneNet. For the captured image, a segmentation network is used to obtain the road semantic information in the image; the captured image and the road semantic information are each fed into a downsampling network to obtain the road semantic features and the image visual features, and a fusion sub-network fuses the image features with the road semantic information to obtain fused features; the fused features are projected into a virtual top view, and a lane line detection head network predicts the lane lines of the top-view space; the top-view lane lines are geometrically transformed to obtain the lane lines in 3D space. By integrating semantic and visual features with the feature fusion module, predicting the top-view lane lines from the fused feature map, and obtaining the real three-dimensional lane points directly through geometric projection, lane line prediction at the far end of the image becomes more accurate: compared with the model without semantic fusion, the prediction accuracy is nearly doubled.

Description

3D lane line detection method and system integrating image features and traffic target semantics
Technical Field
The invention belongs to the field of automatic driving, and particularly relates to a 3D lane line detection method and system integrating image features and traffic target semantics.
Background
Lane detection has become an important problem in the field of automatic driving in recent years. As a key part of intelligent driving assistance, accurate recognition of lane lines plays a vital role in advanced driving systems such as lane departure warning (LDW), blind spot monitoring (BSM) and adaptive cruise control (ACC).
Most lane detection methods treat lane detection as a 2D lane segmentation task. To convert the 2D detection results to 3D space, inverse perspective mapping (IPM) is typically used as a post-processing step. However, IPM assumes a flat ground plane, while real traffic scenes often contain uphill and downhill stretches, so this approach is unreliable in practical traffic environments.
In the existing field of 3D lane perception, inspired by the success of CNNs in monocular depth estimation, 3D-LaneNet designed an end-to-end framework that unifies image encoding, top-view transformation and 3D curve extraction, predicting 3D lane lines directly from the front view. However, the end-to-end framework is strongly affected by visual variation. Gen-LaneNet therefore decouples the image segmentation and three-dimensional feature extraction sub-networks, forming a two-stage network. In the first stage, the input image is encoded and the fused features are decoded into a lane segmentation map. In the second stage, 3D-GeoNet projects the segmentation map into a virtual top view, predicts the lanes with a lane detection head, and obtains the points of the real three-dimensional lane lines directly through a geometric transformation.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a 3D lane line detection method fusing image features and traffic target semantics. Based on Gen-LaneNet, it uses a segmentation network to obtain the road semantic information in the image, encodes the input image features and the road semantic information separately, and fuses them with a fusion sub-network. By fusing the embedded semantic features with the visual features, the method can effectively mine the valuable relations between the objects around the lane lines and the lane lines themselves, yielding better prediction results.
In addition, a computer device is provided, which comprises a processor and a memory, wherein the memory is used for storing a computer executable program, the processor reads the computer executable program from the memory and executes the computer executable program, and the processor can realize the 3D lane line detection method for fusing the image characteristics and the traffic target semantics when executing the program.
The invention also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program can realize the 3D lane line detection method for fusing the image features and the traffic target semantics when being executed by a processor.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows. A 3D lane line detection method integrating image features and traffic target semantics comprises the following steps: based on Gen-LaneNet, for the captured image, obtaining the road semantic information in the image by using a segmentation network;
respectively inputting the acquired image and the road semantic information into a downsampling network for processing to obtain road semantic information features and image visual features, and fusing the image features and the road semantic information by using a fusion sub-network to obtain fused features;
projecting the fused features into the virtual top view, and predicting lane lines of the top view space by a lane line detection head network;
and carrying out geometric transformation on the lane lines in the top view space to obtain lane lines in the 3D space.
Further, when the road semantic information in an image is obtained using a segmentation network, two visual semantic segmentation networks, FCN and DeepLabV3+, are adopted. Based on a ResNet-50 backbone network, the semantic segmentation network classifies each pixel in the image and generates an image mask partitioned by class; the segmented road object mask is used as the road semantic information for analyzing and understanding the scene.
Further, the downsampling network consists of several convolutional layers; the input image is an H × W × 3 tensor, which the downsampling network maps to a lower-resolution feature tensor.
Further, when the semantic features and the visual features are fused, either the semantic features and the visual features are added element-wise, or the semantic features are used as attention weights and multiplied element-wise with the visual features, so that the semantic information modulates the influence of the visual features.
Further, projecting the fused features into the virtual top view and predicting the lane lines of the top-view space by the lane line detection head network comprises: processing the fused features through an upsampling network to obtain a feature tensor, projecting the feature tensor into the virtual top view through inverse perspective mapping, and then predicting and outputting the lane lines of the top-view space by the lane line detection head network.
Further, the stage-one fusion sub-network feeding the lane line detection head network outputs a feature tensor, which is projected via IPM into a top-view feature tensor and then passed through several pooling and convolution layers to obtain the output tensor encoding the lane line parameter representation.
Further, when the lane lines in the top-view space are geometrically transformed to obtain the lane lines in 3D space, a 3D lane line point (x, y, z) in the ego-vehicle coordinate system is projected to the 2D image pixel point (u, v), and a point (x̄, ȳ) in the top-view coordinate system is mapped to the same 2D image pixel point (u, v) through the homography matrix, formulated as:

$$\lambda\begin{bmatrix}u\\ v\\ 1\end{bmatrix}=K\!\left(R\begin{bmatrix}x\\ y\\ z\end{bmatrix}+T\right),\qquad \bar{\lambda}\begin{bmatrix}u\\ v\\ 1\end{bmatrix}=K\!\left(R\begin{bmatrix}\bar{x}\\ \bar{y}\\ 0\end{bmatrix}+T\right)$$

where R is the rotation matrix, T is the translation vector, and K is the camera intrinsic matrix,

$$R=\begin{bmatrix}1&0&0\\ 0&-s&-c\\ 0&c&-s\end{bmatrix},\qquad T=\begin{bmatrix}0\\ ch\\ sh\end{bmatrix}$$

where θ is the camera pitch angle and h is the camera height; s and c are used to replace sin θ and cos θ. Expanding the two projections and eliminating K yields the following system of equations:

$$\frac{x}{c\,y+s\,(h-z)}=\frac{\bar{x}}{c\,\bar{y}+s\,h},\qquad \frac{-s\,y+c\,(h-z)}{c\,y+s\,(h-z)}=\frac{-s\,\bar{y}+c\,h}{c\,\bar{y}+s\,h}$$

Further processing of the second equation gives $\bar{y}(h-z)=hy$, i.e. $y=\alpha\bar{y}$ with $\alpha=(h-z)/h$. Substituting α into the system gives:

$$x=\alpha\,\bar{x}=\frac{h-z}{h}\,\bar{x},\qquad y=\alpha\,\bar{y}=\frac{h-z}{h}\,\bar{y}$$

whereby points within the top-view space are transformed directly to points in 3D space by this geometric relation.
Based on the conception of the method, the invention provides a 3D lane line detection system for fusing image features and traffic target semantics, which comprises a feature acquisition module, a feature fusion module, a top view lane line acquisition module and a geometric transformation module;
the feature acquisition module is based on Gen-Lananenet, and uses a segmentation network to acquire road semantic information in the image aiming at the acquired image; respectively inputting the acquired image and the road semantic information into a downsampling network for processing to acquire road semantic information characteristics and image visual characteristics;
the feature fusion module uses a fusion sub-network to fuse the image features and the road semantic information to obtain fused features;
the top view lane line acquisition module projects the fused features into a virtual top view, and a lane line detection head network predicts lane lines of a top view space;
the geometric transformation module is used for carrying out geometric transformation on the lane lines in the top view space to obtain lane lines in the 3D space.
The invention also provides computer equipment, which comprises a processor and a memory, wherein the memory is used for storing a computer executable program, the processor reads the computer executable program from the memory and executes the computer executable program, and the processor can realize the 3D lane line detection method for fusing the image characteristics and the traffic target semantics when executing the program.
Meanwhile, a computer readable storage medium is provided, and a computer program is stored in the computer readable storage medium, and when the computer program is executed by a processor, the 3D lane line detection method for fusing the image features and the traffic target semantics can be realized.
Compared with the prior art, the invention has at least the following beneficial effects:
the method comprises the steps of integrating semantic information from a road traffic object into a three-dimensional visual lane detection model based on deep learning, integrating semantic masks of the road moving object into a new branch, integrating semantic features and visual features by using a feature fusion module, predicting a top view lane line by using a fused feature map, directly obtaining a real three-dimensional lane point through geometric projection, and improving the prediction precision of the whole lane line on an Apollo data set by testing the method; the lane line prediction at the far end of the image is more accurate, and compared with the unused prediction precision, the prediction precision is improved by nearly one time; the fusion mechanism provided by the invention is effective and good in universality for the 3D lane detection method, and the proposed target segmentation algorithm can be successfully applied to practical application.
In the image segmentation sub-network of the invention, the RGB image and the vehicle mask are input into the encoder simultaneously; by embedding the vehicle mask, valuable information can be mined from the detected surrounding objects. Furthermore, the invention provides different segmentation networks to obtain the road semantic information for auxiliary detection, and uses two different fusion methods to fuse the semantic and visual features.
Drawings
Fig. 1 is a two-stage framework of the present invention.
Fig. 2 shows semantic mask extraction and its impact on lane detection. Since a car obstructs the leftmost lane line in the two-dimensional image, direct prediction based on the image alone leads to significant deviations at long distances. The invention, which fuses the road semantic information, achieves accurate prediction of the lane lines, demonstrating the auxiliary effect of object semantics.
Fig. 3 shows a representation of the lane lines (anchor) and the principle of geometrical projection.
Fig. 4 shows a specific composition of a one-stage downsampling network.
Fig. 5 shows a specific composition of a one-stage up-sampling network.
FIG. 6 shows a specific configuration of a two-stage lane line detector network.
FIG. 7 is a comparison of the prediction results of the present invention with the prior-art method, wherein the red lines indicate the predicted lane lines and the blue lines indicate the ground-truth lane lines.
Detailed Description
As shown in fig. 1, the present invention includes two stages, divided into four steps.
In the first stage, the subnets are converged.
In the first step, to obtain the road semantic information in the image, the invention adopts two mainstream visual semantic segmentation methods, FCN and DeepLabV3+, corresponding to the semantic segmentation network of FIG. 2(a). The semantic segmentation network uses a ResNet-50 backbone to classify each pixel in the image, producing a class-partitioned image mask. The invention mainly uses the segmented road object mask as the road semantic information for further analyzing and understanding the scene.
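As an illustrative embodiment of this first step, the following minimal sketch extracts a road-object mask with an off-the-shelf ResNet-50-based segmentation network; the particular torchvision weights and the class indices kept as road objects are assumptions for illustration, not fixed by the description above.

```python
import torch
from torchvision.models.segmentation import deeplabv3_resnet50

# Pretrained ResNet-50-based segmentation model (a stand-in for the FCN /
# DeepLabV3+ networks named above; the exact weights are an assumption).
model = deeplabv3_resnet50(weights="DEFAULT").eval()

@torch.no_grad()
def road_semantic_mask(image: torch.Tensor, road_classes=(6, 7, 14)) -> torch.Tensor:
    """image: (1, 3, H, W) normalized RGB; returns a (1, 1, H, W) binary mask.

    road_classes are hypothetical ids of traffic objects (e.g. bus, car,
    motorbike in the VOC label set); adjust to the label set actually used.
    """
    logits = model(image)["out"]                  # (1, num_classes, H, W)
    labels = logits.argmax(dim=1, keepdim=True)   # per-pixel class id
    mask = torch.zeros_like(labels, dtype=torch.float32)
    for c in road_classes:                        # keep only road-object classes
        mask += (labels == c).float()
    return mask.clamp(max=1.0)
```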
Secondly, in the fusion network of FIG. 2(a), the road semantic information obtained in the first step and the captured image are fed simultaneously into a downsampling network composed of several convolutional layers, yielding the road semantic features and the image visual features respectively. The specific structure of the downsampling network is shown in FIG. 4: the input image is an H × W × 3 tensor, which the downsampling network maps to a lower-resolution feature tensor. To fuse the semantic features with the visual features, the invention adopts two methods: one adds the semantic and visual features element-wise; the other uses the semantic features as attention weights, multiplied element-wise with the visual features, so that the semantic information modulates the influence of the visual features. Specifically:

Let $F_I^{ij}$ denote the element in row $i$, column $j$ of the image visual feature tensor $F_I$, and $F_M^{ij}$ the corresponding element of the road semantic feature tensor $F_M$; $H$ and $W$ are the height and width of the feature map.

Additive fusion:

$$F^{ij} = F_I^{ij} + F_M^{ij}$$

Multiplicative fusion:

$$F = F_I \odot F_M,\qquad F^{ij} = F_I^{ij} \cdot F_M^{ij}$$

where $\odot$ denotes element-wise multiplication.
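A minimal sketch of the two fusion rules above, assuming the visual features F_I and the semantic features F_M have already been brought to the same shape by the two downsampling branches (module and tensor names are illustrative):

```python
import torch
import torch.nn as nn

class FeatureFusion(nn.Module):
    """Fuses image visual features with road semantic features."""
    def __init__(self, mode: str = "mul"):
        super().__init__()
        self.mode = mode  # "add": element-wise sum; "mul": semantics as attention

    def forward(self, f_img: torch.Tensor, f_sem: torch.Tensor) -> torch.Tensor:
        if self.mode == "add":
            return f_img + f_sem   # additive fusion: F^{ij} = F_I^{ij} + F_M^{ij}
        return f_img * f_sem       # multiplicative fusion: F = F_I (.) F_M

fuse = FeatureFusion("mul")
fused = fuse(torch.randn(1, 128, 45, 60), torch.randn(1, 128, 45, 60))
```

In the multiplicative variant the semantic tensor acts as a per-element attention weight, so regions flagged as traffic objects rescale the visual response at those locations.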
The fused features then pass through an upsampling network consisting of several deconvolution layers (FIG. 2(b)); its specific structure is shown in FIG. 5. The upsampling network maps the fused low-resolution feature tensor back to a higher-resolution decoded feature tensor.
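The encoder-decoder pair might look as follows; the channel widths and the number of stride-2 stages are assumptions, since the exact dimensions shown in FIG. 4 and FIG. 5 do not reproduce in this text:

```python
import torch.nn as nn

def down_block(c_in: int, c_out: int) -> nn.Sequential:
    """One stride-2 convolution stage of the downsampling network."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, stride=2, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

# H x W x 3 image -> lower-resolution feature tensor (one such branch each for
# the RGB image and the semantic mask, with 3 and 1 input channels respectively).
downsample = nn.Sequential(down_block(3, 32), down_block(32, 64), down_block(64, 128))

# Fused features -> higher-resolution decoded features via deconvolutions.
upsample = nn.Sequential(
    nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1), nn.ReLU(inplace=True),
    nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(inplace=True),
)
```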
Second stage, 3D geometry subnetwork.
Third, as shown in FIG. 2(c), the decoded features are first projected into the virtual top view through inverse perspective mapping (IPM), and a lane line detection head network consisting of several pooling and convolution layers then predicts and outputs the lane lines of the top-view space. The specific composition of the lane line detection head network is shown in FIG. 6: the stage-one fusion sub-network outputs a feature tensor, which is projected via IPM into a top-view feature tensor and then passed through several pooling and convolution layers to produce the output tensor encoding the lane line parameter representation.
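The IPM step can be sketched as a homography warp of the feature map; here H_ipm, assumed to be precomputed from K, θ and h, maps top-view pixel coordinates to image pixel coordinates, and the output resolution is an illustrative choice:

```python
import torch
import torch.nn.functional as F

def ipm_warp(feat: torch.Tensor, H_ipm: torch.Tensor, out_hw=(208, 128)) -> torch.Tensor:
    """feat: (1, C, H, W) image-space features -> (1, C, out_h, out_w) top-view features.

    H_ipm: (3, 3) float tensor taking top-view pixels to image pixels (assumed given).
    """
    _, _, H, W = feat.shape
    out_h, out_w = out_hw
    ys, xs = torch.meshgrid(torch.arange(out_h, dtype=torch.float32),
                            torch.arange(out_w, dtype=torch.float32), indexing="ij")
    ones = torch.ones_like(xs)
    grid = torch.stack([xs, ys, ones], dim=-1).reshape(-1, 3)  # top-view pixels, homogeneous
    src = (H_ipm @ grid.T).T                                   # map into the image plane
    src = src[:, :2] / src[:, 2:3]                             # dehomogenize
    src[:, 0] = src[:, 0] / (W - 1) * 2 - 1                    # normalize x to [-1, 1]
    src[:, 1] = src[:, 1] / (H - 1) * 2 - 1                    # normalize y to [-1, 1]
    return F.grid_sample(feat, src.reshape(1, out_h, out_w, 2), align_corners=True)
```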
Finally, referring to the geometric transformation of FIG. 1(d) and FIG. 3(a), the lane line points of the top-view space are projected directly into 3D space, yielding the lane line points of the 3D space. The geometric transformation is as follows.

As shown in FIG. 3(b), a 3D lane line point (x, y, z) in the ego-vehicle coordinate system is projected to the 2D image pixel point (u, v), and a point (x̄, ȳ) in the top-view coordinate system is mapped to the same 2D image pixel point (u, v) through the homography matrix, formulated as:

$$\lambda\begin{bmatrix}u\\ v\\ 1\end{bmatrix}=K\!\left(R\begin{bmatrix}x\\ y\\ z\end{bmatrix}+T\right),\qquad \bar{\lambda}\begin{bmatrix}u\\ v\\ 1\end{bmatrix}=K\!\left(R\begin{bmatrix}\bar{x}\\ \bar{y}\\ 0\end{bmatrix}+T\right)$$

where R is the rotation matrix, T is the translation vector, and K is the camera intrinsic matrix,

$$R=\begin{bmatrix}1&0&0\\ 0&-s&-c\\ 0&c&-s\end{bmatrix},\qquad T=\begin{bmatrix}0\\ ch\\ sh\end{bmatrix}$$

where θ is the camera pitch angle and h is the camera height; s and c are used to replace sin θ and cos θ. Expanding the two projections and eliminating K yields the following system of equations:

$$\frac{x}{c\,y+s\,(h-z)}=\frac{\bar{x}}{c\,\bar{y}+s\,h},\qquad \frac{-s\,y+c\,(h-z)}{c\,y+s\,(h-z)}=\frac{-s\,\bar{y}+c\,h}{c\,\bar{y}+s\,h}$$

Cross-multiplying the second equation and simplifying gives $\bar{y}(h-z)=hy$, i.e. $y=\alpha\bar{y}$ with $\alpha=(h-z)/h$. Substituting α into the system gives:

$$x=\alpha\,\bar{x}=\frac{h-z}{h}\,\bar{x},\qquad y=\alpha\,\bar{y}=\frac{h-z}{h}\,\bar{y}$$

whereby points in the top-view space can be transformed directly to points in 3D space by this geometric transformation.
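The closing step then reduces to a one-line scaling per lane point; a minimal sketch under the derivation above:

```python
def topview_to_3d(x_bar: float, y_bar: float, z: float, h: float) -> tuple:
    """Map a top-view point (x_bar, y_bar) with predicted height z to the real
    3D lane point (x, y, z) in the ego-vehicle frame; h is the camera height."""
    alpha = (h - z) / h   # points above the ground plane appear farther out in IPM
    return (x_bar * alpha, y_bar * alpha, z)

# Example: a top-view point at (2.0 m, 40.0 m) with z = 0.5 m and camera height
# h = 1.6 m corresponds to the real 3D point (1.375 m, 27.5 m, 0.5 m).
print(topview_to_3d(2.0, 40.0, 0.5, 1.6))
```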
A top-view lane line is predicted from the fused feature map, and the real three-dimensional lane points are obtained directly through geometric projection. As shown in FIG. 7, the invention predicts more accurately than the original model in long-range detection. In addition, the various object segmentation algorithms can be applied successfully in practice.
The invention provides a 3D lane line detection system integrating image features and traffic target semantics, which comprises a feature acquisition module, a feature fusion module, a top-view lane line acquisition module and a geometric transformation module;
the feature acquisition module is based on Gen-Lananenet, and uses a segmentation network to acquire road semantic information in the image aiming at the acquired image; respectively inputting the acquired image and the road semantic information into a downsampling network for processing to acquire road semantic information characteristics and image visual characteristics;
the feature fusion module uses a fusion sub-network to fuse the image features and the road semantic information to obtain fused features;
the top view lane line acquisition module projects the fused features into a virtual top view, and a lane line detection head network predicts lane lines of a top view space;
the geometric transformation module is used for carrying out geometric transformation on the lane lines in the top view space to obtain lane lines in the 3D space.
The invention also provides computer equipment, which comprises a processor and a memory, wherein the memory is used for storing a computer executable program, the processor reads the computer executable program from the memory and executes the computer executable program, and the processor can realize the 3D lane line detection method for fusing the image characteristics and the traffic target semantics when executing the computer executable program.
On the other hand, the invention also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and when the computer program is executed by a processor, the 3D lane line detection method for fusing the image features and the traffic target semantics can be realized.
The computer device may be a notebook computer, a desktop computer, a vehicle computer, or a workstation.
The processor of the present invention may be a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA).
The memory can be an internal storage unit of a notebook computer, desktop computer, vehicle-mounted computer or workstation, such as main memory or a hard disk; external storage units such as a removable hard disk or a flash memory card may also be used.
Computer readable storage media may include computer storage media and communication media. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer readable instructions, data structures, program modules or other data. A computer readable storage medium may include read-only memory (ROM), random access memory (RAM), solid state drives (SSD), optical disks, etc. Random access memory may include resistive random access memory (ReRAM) and dynamic random access memory (DRAM), among others.
It should be noted that the above description only illustrates specific embodiments of the present invention, and the scope of the present invention is not limited thereto; those skilled in the art should understand that modifications or variations made according to the technical solution of the present invention and its inventive concept are covered by the scope of the present invention.

Claims (10)

1. A 3D lane line detection method integrating image features and traffic target semantics, characterized by comprising the following steps: based on Gen-LaneNet, for the captured image, obtaining the road semantic information in the image by using a segmentation network;
respectively inputting the acquired image and the road semantic information into a downsampling network for processing to obtain road semantic information features and image visual features, and fusing the image features and the road semantic information by using a fusion sub-network to obtain fused features;
projecting the fused features into the virtual top view, and predicting lane lines of the top view space by a lane line detection head network;
and carrying out geometric transformation on the lane lines in the top view space to obtain lane lines in the 3D space.
2. The 3D lane line detection method for fusing image features and traffic target semantics as claimed in claim 1, wherein when the road semantic information in an image is obtained by using a segmentation network, two visual semantic segmentation networks, FCN and DeepLabV3+, are adopted; based on a ResNet-50 backbone network, the semantic segmentation network classifies each pixel in the image and generates an image mask partitioned by class, and the segmented road object mask is used as the road semantic information for analyzing and understanding the scene.
3. The method for detecting 3D lane lines by combining image features and traffic target semantics according to claim 1, wherein the downsampling network is composed of several convolutional layers, the input image is an H × W × 3 tensor, and the downsampling network maps it to a lower-resolution feature tensor.
4. The 3D lane line detection method of fusing image features and traffic target semantics according to claim 3, wherein when the semantic features and the visual features are fused, the semantic features and the visual features are added element-wise; or the semantic features are used as attention weights and multiplied element-wise with the visual features, the semantic information modulating the influence of the visual features.
5. The method for detecting 3D lane lines by fusing image features and traffic target semantics according to claim 1, wherein projecting the fused features into the virtual top view and predicting the lane lines of the top-view space by the lane line detection head network comprises: processing the fused features through an upsampling network to obtain a feature tensor, projecting the feature tensor into the virtual top view through inverse perspective mapping, and then predicting and outputting the lane lines of the top-view space by the lane line detection head network.
6. The method for 3D lane line detection based on the fusion of image features and traffic target semantics according to claim 5, wherein the stage-one fusion sub-network feeding the lane line detection head network outputs a feature tensor, which is projected via IPM into a top-view feature tensor and then passed through several pooling and convolution layers to obtain the output tensor encoding the lane line parameter representation.
7. The method for detecting 3D lane lines by fusing image features and traffic target semantics as claimed in claim 1, wherein when the lane lines in the top-view space are geometrically transformed to obtain the lane lines in 3D space, a 3D lane line point (x, y, z) in the ego-vehicle coordinate system is projected to the 2D image pixel point (u, v), and a point (x̄, ȳ) in the top-view coordinate system is mapped to the same 2D image pixel point (u, v) through the homography matrix, formulated as:

$$\lambda\begin{bmatrix}u\\ v\\ 1\end{bmatrix}=K\!\left(R\begin{bmatrix}x\\ y\\ z\end{bmatrix}+T\right),\qquad \bar{\lambda}\begin{bmatrix}u\\ v\\ 1\end{bmatrix}=K\!\left(R\begin{bmatrix}\bar{x}\\ \bar{y}\\ 0\end{bmatrix}+T\right)$$

where R is the rotation matrix, T is the translation vector, and K is the camera intrinsic matrix,

$$R=\begin{bmatrix}1&0&0\\ 0&-s&-c\\ 0&c&-s\end{bmatrix},\qquad T=\begin{bmatrix}0\\ ch\\ sh\end{bmatrix}$$

where θ is the camera pitch angle and h is the camera height; s and c are used to replace sin θ and cos θ. Expanding the two projections and eliminating K yields the following system of equations:

$$\frac{x}{c\,y+s\,(h-z)}=\frac{\bar{x}}{c\,\bar{y}+s\,h},\qquad \frac{-s\,y+c\,(h-z)}{c\,y+s\,(h-z)}=\frac{-s\,\bar{y}+c\,h}{c\,\bar{y}+s\,h}$$

Further processing of the second equation gives $\bar{y}(h-z)=hy$, i.e. $y=\alpha\bar{y}$ with $\alpha=(h-z)/h$. Substituting α into the system gives:

$$x=\alpha\,\bar{x}=\frac{h-z}{h}\,\bar{x},\qquad y=\alpha\,\bar{y}=\frac{h-z}{h}\,\bar{y}$$

whereby points within the top-view space are transformed directly to points in 3D space by this geometric relation.
8. A 3D lane line detection system integrating image features and traffic target semantics, characterized by comprising a feature acquisition module, a feature fusion module, a top-view lane line acquisition module and a geometric transformation module;
the feature acquisition module is based on Gen-Lananenet, and uses a segmentation network to acquire road semantic information in the image aiming at the acquired image; respectively inputting the acquired image and the road semantic information into a downsampling network for processing to acquire road semantic information characteristics and image visual characteristics;
the feature fusion module uses a fusion sub-network to fuse the image features and the road semantic information to obtain fused features;
the top view lane line acquisition module projects the fused features into a virtual top view, and a lane line detection head network predicts lane lines of a top view space;
the geometric transformation module is used for carrying out geometric transformation on the lane lines in the top view space to obtain lane lines in the 3D space.
9. A computer device comprising a processor and a memory, the memory storing a computer executable program, the processor reading the computer executable program from the memory and executing the program, the processor executing the program implementing the 3D lane line detection method of fusing image features and traffic target semantics according to any one of claims 1-7.
10. A computer readable storage medium, wherein a computer program is stored in the computer readable storage medium, and when the computer program is executed by a processor, the 3D lane line detection method of fusing image features and traffic target semantics according to any one of claims 1 to 7 can be implemented.
CN202311039603.8A 2023-08-17 2023-08-17 3D lane line detection method and system integrating image features and traffic target semantics Pending CN117058640A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311039603.8A CN117058640A (en) 2023-08-17 2023-08-17 3D lane line detection method and system integrating image features and traffic target semantics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311039603.8A CN117058640A (en) 2023-08-17 2023-08-17 3D lane line detection method and system integrating image features and traffic target semantics

Publications (1)

Publication Number Publication Date
CN117058640A 2023-11-14

Family

ID=88654908

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311039603.8A Pending CN117058640A (en) 2023-08-17 2023-08-17 3D lane line detection method and system integrating image features and traffic target semantics

Country Status (1)

Country Link
CN (1) CN117058640A (en)


Legal Events

Code: Description
PB01: Publication
SE01: Entry into force of request for substantive examination