CN114842085A - Full-scene vehicle attitude estimation method
- Publication number
- CN114842085A (application CN202210780438.0A)
- Authority
- CN
- China
- Prior art keywords
- image
- vehicle
- layer
- key point
- full
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/08—Detecting or categorising vehicles
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention belongs to the technical field of vehicle attitude estimation, and relates to a full-scene vehicle attitude estimation method.
Description
Technical Field
The invention belongs to the technical field of vehicle attitude estimation, and relates to a full-scene vehicle attitude estimation method.
Background
Autonomous driving has broad prospects and is the development trend of future automobiles. Its development requires vehicles to judge the surrounding environment clearly, select correct driving routes and driving behaviors, and assist drivers in controlling the vehicle. Real driving scenes are complex and changeable, and each complex scene demands different countermeasures. Vehicle attitude estimation, an important task in autonomous driving technology, aims to locate vehicle key points from images or videos and helps judge the driving states of surrounding vehicles.
At present, the main challenge of vehicle attitude estimation is occlusion, which exists in every driving scene: occlusion between vehicles, between pedestrians and vehicles, and between other objects and vehicles. Existing vehicle attitude estimation methods struggle to identify vehicle attitude in occluded scenes, so a vehicle attitude estimation method oriented to full scenes is urgently needed.
Convolutional neural networks achieve excellent performance in attitude estimation, but most work treats the deep convolutional neural network as a powerful black-box predictor, and how it captures the spatial relationships between components remains unclear. From the viewpoints of both science and practical application, model interpretability helps explain how the model relates variables to reach the final prediction and how the attitude estimation algorithm processes various input images. In the vehicle attitude estimation task, a Transformer can capture long-distance relationships and thereby reveal the dependencies between key points.
Since the advent of the Transformer, its high computational efficiency and scalability have made it dominant in natural language processing. It is a deep neural network based mainly on the self-attention mechanism, and owing to its powerful performance, researchers have sought ways to apply it to computer vision tasks, where Transformer-based models perform similarly to or better than other network types (such as convolutional and recurrent networks) on various vision benchmarks. However, no report or use of such models in vehicle pose estimation is known at present.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by designing and providing a full-scene vehicle attitude estimation method that realizes efficient vehicle attitude estimation. It takes the Swin Transformer as the backbone network for feature extraction, uses a Transformer encoder to encode feature-map information into position representations of key points, obtains key-point dependency terms by calculating attention scores, and predicts the final key-point positions, effectively solving the vehicle occlusion problem and realizing full-scene vehicle attitude estimation.
To achieve this purpose, the Swin Transformer is introduced as the backbone network, and the network structure is optimized according to the characteristics of the vehicle attitude estimation task. The original image information is compressed into a compact key-point position sequence, converting the vehicle attitude estimation task into an encoding task; key-point dependency terms are obtained by calculating attention scores, and the final key-point positions are predicted. The specific process comprises the following steps:
(1) data set construction:
selecting vehicle images from an open-source data set, collecting images of various vehicles from traffic monitoring and parking lots, constructing a vehicle data set, and dividing the vehicle data set into a training set, a validation set and a test set;
(2) image segmentation: the images in the vehicle data set are segmented into non-overlapping image slices by a slice segmentation module; each image slice is regarded as a token and is characterized by the serialized RGB values of the input image;
(3) backbone hierarchical feature extraction: the image slice tokens obtained in step (2) first pass through the linear embedding layer of the first stage of the backbone network, which changes the feature dimension to an arbitrary dimension C; hierarchical feature extraction is then carried out by two Swin Transformer blocks and a second stage to obtain a feature map;
(4) position coding: the feature map obtained in step (3) is input into the position coding layer for position coding; the feature map passes through a 1×1 convolution or a linear layer and is flattened into (H/8)×(W/8) d-dimensional vectors, which pass through four attention layers and a feed-forward neural network to output the feature vectors, where H and W are the height and width of the image respectively;
(5) key-point heat map generation: the feature vectors obtained in step (4) are reshaped back to (H/8)×(W/8); the channel dimension is then lowered from d to K, generating the K predicted key-point heat maps, where K is the number of key points per vehicle, with a value of 78;
(6) result output: the key-point heat maps are reduced to key-point coordinates by non-maximum suppression, and the key-point positions are marked in the original image to realize full-scene vehicle attitude estimation.
Further, in step (1), 78 key points are defined for each vehicle in the vehicle images, and the bounding box and category of the vehicle, namely the minimum enclosing rectangle of the vehicle, are labeled.
Further, the backbone network in step (3) adopts a Swin Transformer backbone. The first stage comprises a linear embedding layer and two Swin Transformer blocks, and the number of tokens for the two Swin Transformer blocks is (H/4)×(W/4), where H and W are the height and width of the input image. The second stage comprises a linear merging layer and two Swin Transformer blocks: the image slice tokens after first-stage feature extraction are reduced by the linear merging layer, which concatenates the features of each group of 2×2 adjacent blocks; a linear layer acting on the resulting 4C-dimensional features reduces the number of tokens by a factor of 4 and changes the output dimension to 2C. Feature transformation is then carried out by the two Swin Transformer blocks, giving an image resolution of (H/8)×(W/8) and realizing hierarchical feature extraction.
Further, the position coding layer in step (4) adopts an encoder with the standard Transformer architecture. The position coding layer regards the feature map as dynamic weights determined by the specific image content and re-weights the information flow in forward propagation. Key-point dependencies are obtained by calculating the scores of the last attention layer; the higher the attention score of a position in the image, the greater its contribution to predicting the key points, and occluded key points are predicted through these dependencies.
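Reading key-point dependencies off the final attention layer, as described above, can be sketched as follows (an illustrative NumPy sketch; the row-normalized random score matrix, the function name, and the top-5 cutoff are all hypothetical stand-ins, not specified by the invention):

```python
import numpy as np

def keypoint_dependencies(last_scores, query_pos, top=5):
    """Rank the image positions that contribute most to predicting the
    key point queried at query_pos, using the final attention layer's
    score row: higher score means larger contribution to the prediction."""
    row = last_scores[query_pos]
    order = np.argsort(row)[::-1][:top]   # positions sorted by score, descending
    return order, row[order]

rng = np.random.default_rng(1)
raw = rng.random((784, 784))                     # stand-in for a 28x28-grid score matrix
scores = raw / raw.sum(axis=1, keepdims=True)    # rows normalized like softmax output
deps, weights = keypoint_dependencies(scores, query_pos=100)
print(deps.shape, weights.shape)                 # (5,) (5,)
```

An occluded key point's position can then be inferred from the visible positions its attention row depends on most.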
Compared with the prior art, the invention replaces the traditional convolutional neural network with the Swin Transformer, adopting a hierarchical Transformer as the backbone network, which improves computational efficiency with linear computational complexity. The encoder of the standard Transformer captures long-distance relationships in the image and reveals the dependencies of the predicted key points; the final attention layer collects the dependency terms that contribute most to each key point to form its final predicted position, solving the occlusion problem. The method achieves a good balance between detection precision and speed and has high practical application value.
Drawings
Fig. 1 is a schematic structural framework diagram of a vehicle attitude estimation system provided by the present invention.
Fig. 2 is a schematic structural diagram of a first stage of the backbone network according to the present invention.
Fig. 3 is a schematic structural diagram of a second stage of the backbone network according to the present invention.
FIG. 4 is a structural diagram of a single coding layer according to the present invention.
FIG. 5 is a block flow diagram of a vehicle attitude estimation method according to the present invention.
Detailed Description
The invention will be further described by way of examples with reference to the accompanying drawings, without in any way limiting the scope of the invention.
Example:
This embodiment provides a full-scene vehicle attitude estimation method based on a Transformer backbone and a position encoder. It introduces the Swin Transformer as the backbone network, converts the vehicle attitude estimation task into an encoding task by compressing the original image information into a compact key-point position sequence, obtains key-point dependency terms by calculating attention scores, and predicts the final key-point positions. The method can effectively predict the positions of occluded vehicle key points and realizes full-scene vehicle attitude estimation, as shown in FIGS. 1-5. It specifically comprises the following steps:
(1) data set construction:
selecting vehicle images from an open-source data set and collecting images containing various vehicles in real scenes such as traffic monitoring and parking lots to construct a vehicle data set; defining 78 key points on each vehicle, mainly points with strong local texture feature information (taking a car as an example, corner points such as the 4 corner points of the car lamps and the 4 corner points of the front and rear windshields); labeling the bounding box and category of the vehicle, namely the minimum enclosing rectangle of the vehicle; and finally dividing the data set into a training set, a validation set and a test set;
(2) image segmentation:
the vehicle image is segmented into non-overlapping image slices by a slice segmentation module, each image slice being of a sizeTheir characteristic dimension isEach image slice is regarded as a mark and is characterized by serial RGB values of the input image;
(3) backbone hierarchical feature extraction:
the backbone network is divided into two stages. The image slice tokens first pass through the first stage which, as shown in FIG. 2, comprises a linear embedding layer and two Swin Transformer blocks: the linear embedding layer is applied to the raw-valued features of the image slices and maps them to an arbitrary dimension C, and the number of tokens is (H/4)×(W/4), where H and W are the height and width of the input image. The second stage follows, as shown in FIG. 3, reducing the number of tokens through a linear merging layer as the network deepens: the linear merging layer concatenates the features of each group of 2×2 adjacent blocks, and a linear layer acting on the resulting 4C-dimensional features reduces the number of tokens by a factor of 4 and changes the output dimension to 2C; feature transformation is then carried out by two Swin Transformer blocks, giving an image resolution of (H/8)×(W/8) and realizing hierarchical feature extraction;
(4) position coding:
the feature graph output by the backbone network is input into the coding layer, the embodiment has 4 coding layers, each coding layer is as shown in fig. 4, firstly, the feature graph passes throughConvolution or a linear layer, flattened intoAnDimensional vectors which are subjected to 4 attention layers and a feedforward neural network to obtain characteristic vectors;
(5) generating a keypoint heat map:
the coding layer outputs the feature vectors, which are first reshaped backThen channel dimensions are determined fromDecreasing to K (K is the number of keypoints per vehicle, value 78), generating a predicted K keypoint heat map;
(6) result output: non-maximum suppression is applied to the key-point heat maps generated in step (5) to obtain key-point coordinates, and the key-point positions are marked in the original image to realize full-scene vehicle attitude estimation.
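The simplest form of the non-maximum suppression in step (6) — keeping only each heat map's global peak — can be sketched as follows (an illustrative NumPy sketch; full NMS would also suppress secondary local maxima, and mapping grid coordinates back to original-image pixels is omitted):

```python
import numpy as np

def heatmaps_to_keypoints(heatmaps):
    """Reduce each key-point heat map to a coordinate: keep the map's
    peak and return its (row, col) grid position with peak confidence."""
    K, h, w = heatmaps.shape
    flat = heatmaps.reshape(K, -1)
    idx = flat.argmax(axis=1)
    coords = np.stack([idx // w, idx % w], axis=1)   # (K, 2) grid coordinates
    return coords, flat.max(axis=1)

hm = np.zeros((78, 28, 28))
hm[0, 5, 7] = 1.0                      # plant a known peak for key point 0
coords, conf = heatmaps_to_keypoints(hm)
print(coords[0], conf[0])              # [5 7] 1.0
```

Multiplying the grid coordinates by the downsampling stride (8 in this sketch's assumptions) would give the key-point positions to mark in the original image.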
Structures, algorithms, and computational processes not described in detail herein are all common in the art.
It is noted that the disclosed embodiments are intended to aid further understanding of the invention, but those skilled in the art will appreciate that various substitutions and modifications are possible without departing from the spirit and scope of the invention and the appended claims. Therefore, the invention should not be limited to the disclosed embodiments; the scope of protection is defined by the appended claims.
Claims (5)
1. A full-scene vehicle attitude estimation method, characterized by comprising the following steps:
(1) data set construction:
selecting vehicle images from an open-source data set, collecting images of various vehicles from traffic monitoring and parking lots, constructing a vehicle data set, and dividing the vehicle data set into a training set, a validation set and a test set;
(2) image segmentation: the images in the vehicle data set are segmented into non-overlapping image slices by a slice segmentation module; each image slice is regarded as a token and is characterized by the serialized RGB values of the input image;
(3) backbone hierarchical feature extraction: the image slice tokens obtained in step (2) first pass through the linear embedding layer of the first stage of the backbone network, which changes the feature dimension to an arbitrary dimension C; hierarchical feature extraction is then carried out by two Swin Transformer blocks and a second stage to obtain a feature map;
(4) position coding: the feature map obtained in step (3) is input into the position coding layer for position coding; the feature map passes through a 1×1 convolution or a linear layer and is flattened into (H/8)×(W/8) d-dimensional vectors, which pass through four attention layers and a feed-forward neural network to output the feature vectors, where H and W are the height and width of the image respectively;
(5) key-point heat map generation: the feature vectors obtained in step (4) are reshaped back to (H/8)×(W/8); the channel dimension is then lowered from d to K, generating the K predicted key-point heat maps, where K is the number of key points per vehicle, with a value of 78;
(6) result output: the key-point heat maps are reduced to key-point coordinates by non-maximum suppression, and the key-point positions are marked in the original image to realize full-scene vehicle attitude estimation.
2. The full-scene vehicle attitude estimation method according to claim 1, wherein 78 key points are defined for each vehicle in the vehicle images in step (1), and the bounding box and category of the vehicle are labeled.
4. The full-scene vehicle attitude estimation method according to claim 3, wherein the backbone network in step (3) adopts a Swin Transformer backbone; the first stage comprises a linear embedding layer and two Swin Transformer blocks, and the number of tokens for the two Swin Transformer blocks is (H/4)×(W/4), where H and W are the height and width of the input image; the second stage comprises a linear merging layer and two Swin Transformer blocks, the image slice tokens after first-stage feature extraction being reduced by the linear merging layer, which concatenates the features of each group of 2×2 adjacent blocks, with a linear layer acting on the resulting 4C-dimensional features reducing the number of tokens by a factor of 4 and changing the output dimension to 2C; feature transformation is then carried out by the two Swin Transformer blocks, giving an image resolution of (H/8)×(W/8) and realizing hierarchical feature extraction.
5. The full-scene vehicle attitude estimation method according to claim 4, wherein the position coding layer in step (4) adopts an encoder of the standard Transformer architecture; the position coding layer regards the feature map as dynamic weights determined by the specific image content and re-weights the information flow in forward propagation; key-point dependencies are obtained by calculating the scores of the last attention layer, wherein the higher the attention score of a position in the image, the greater its contribution to predicting the key points, and occluded key points are predicted through the key-point dependencies.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210780438.0A CN114842085B (en) | 2022-07-05 | 2022-07-05 | Full-scene vehicle attitude estimation method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114842085A | 2022-08-02
CN114842085B | 2022-09-16
Family
ID=82574897
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210780438.0A Active CN114842085B (en) | 2022-07-05 | 2022-07-05 | Full-scene vehicle attitude estimation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114842085B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200020117A1 (en) * | 2018-07-16 | 2020-01-16 | Ford Global Technologies, Llc | Pose estimation |
CN109598339A (en) * | 2018-12-07 | 2019-04-09 | 电子科技大学 | A kind of vehicle attitude detection method based on grid convolutional network |
CN113591936A (en) * | 2021-07-09 | 2021-11-02 | 厦门市美亚柏科信息股份有限公司 | Vehicle attitude estimation method, terminal device and storage medium |
CN113792669A (en) * | 2021-09-16 | 2021-12-14 | 大连理工大学 | Pedestrian re-identification baseline method based on hierarchical self-attention network |
CN114663917A (en) * | 2022-03-14 | 2022-06-24 | 清华大学 | Multi-view-angle-based multi-person three-dimensional human body pose estimation method and device |
Non-Patent Citations (2)
Title |
---|
ZHIHONG WU等: "DST3D: DLA-Swin Transformer for Single-Stage Monocular 3D Object Detection", 《2022 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV)》 * |
ZINAN XIONG 等: "SWIN-POSE: SWIN TRANSFORMER BASED HUMAN POSE ESTIMATION", 《ARXIV:2201.07384V1》 * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115272992A (en) * | 2022-09-30 | 2022-11-01 | 松立控股集团股份有限公司 | Vehicle attitude estimation method |
CN116758341A (en) * | 2023-05-31 | 2023-09-15 | 北京长木谷医疗科技股份有限公司 | GPT-based hip joint lesion intelligent diagnosis method, device and equipment |
CN116758341B (en) * | 2023-05-31 | 2024-03-19 | 北京长木谷医疗科技股份有限公司 | GPT-based hip joint lesion intelligent diagnosis method, device and equipment |
CN117352120A (en) * | 2023-06-05 | 2024-01-05 | 北京长木谷医疗科技股份有限公司 | GPT-based intelligent self-generation method, device and equipment for knee joint lesion diagnosis |
CN116740714A (en) * | 2023-06-12 | 2023-09-12 | 北京长木谷医疗科技股份有限公司 | Intelligent self-labeling method and device for hip joint diseases based on unsupervised learning |
CN116740714B (en) * | 2023-06-12 | 2024-02-09 | 北京长木谷医疗科技股份有限公司 | Intelligent self-labeling method and device for hip joint diseases based on unsupervised learning |
CN116894973A (en) * | 2023-07-06 | 2023-10-17 | 北京长木谷医疗科技股份有限公司 | Integrated learning-based intelligent self-labeling method and device for hip joint lesions |
CN116894973B (en) * | 2023-07-06 | 2024-05-03 | 北京长木谷医疗科技股份有限公司 | Integrated learning-based intelligent self-labeling method and device for hip joint lesions |
Also Published As
Publication number | Publication date |
---|---|
CN114842085B (en) | 2022-09-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114842085B (en) | Full-scene vehicle attitude estimation method | |
CN108399419B (en) | Method for recognizing Chinese text in natural scene image based on two-dimensional recursive network | |
CN111598030B (en) | Method and system for detecting and segmenting vehicle in aerial image | |
CN111401436B (en) | Streetscape image segmentation method fusing network and two-channel attention mechanism | |
CN112396607A (en) | Streetscape image semantic segmentation method for deformable convolution fusion enhancement | |
CN111126359A (en) | High-definition image small target detection method based on self-encoder and YOLO algorithm | |
CN110781850A (en) | Semantic segmentation system and method for road recognition, and computer storage medium | |
CN113486726A (en) | Rail transit obstacle detection method based on improved convolutional neural network | |
CN113743269B (en) | Method for recognizing human body gesture of video in lightweight manner | |
CN113870335A (en) | Monocular depth estimation method based on multi-scale feature fusion | |
CN112801027A (en) | Vehicle target detection method based on event camera | |
CN110688905A (en) | Three-dimensional object detection and tracking method based on key frame | |
CN112990065A (en) | Optimized YOLOv5 model-based vehicle classification detection method | |
CN116797787B (en) | Remote sensing image semantic segmentation method based on cross-modal fusion and graph neural network | |
CN112163447B (en) | Multi-task real-time gesture detection and recognition method based on Attention and Squeezenet | |
CN111696038A (en) | Image super-resolution method, device, equipment and computer-readable storage medium | |
CN115588126A (en) | GAM, CARAFE and SnIoU fused vehicle target detection method | |
CN111881914B (en) | License plate character segmentation method and system based on self-learning threshold | |
CN113096133A (en) | Method for constructing semantic segmentation network based on attention mechanism | |
Yu et al. | Intelligent corner synthesis via cycle-consistent generative adversarial networks for efficient validation of autonomous driving systems | |
CN112581423A (en) | Neural network-based rapid detection method for automobile surface defects | |
CN117037119A (en) | Road target detection method and system based on improved YOLOv8 | |
CN111626298B (en) | Real-time image semantic segmentation device and segmentation method | |
CN114187569A (en) | Real-time target detection method integrating Pearson coefficient matrix and attention | |
CN114693951A (en) | RGB-D significance target detection method based on global context information exploration |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||