CN114842085A - Full-scene vehicle attitude estimation method - Google Patents

Full-scene vehicle attitude estimation method

Info

Publication number
CN114842085A
Authority
CN
China
Prior art keywords
image
vehicle
layer
key point
full
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210780438.0A
Other languages
Chinese (zh)
Other versions
CN114842085B (en)
Inventor
刘寒松
王永
王国强
刘瑞
翟贵乾
李贤超
焦安健
谭连胜
董玉超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sonli Holdings Group Co Ltd
Original Assignee
Sonli Holdings Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sonli Holdings Group Co Ltd filed Critical Sonli Holdings Group Co Ltd
Priority to CN202210780438.0A priority Critical patent/CN114842085B/en
Publication of CN114842085A publication Critical patent/CN114842085A/en
Application granted granted Critical
Publication of CN114842085B publication Critical patent/CN114842085B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08 Detecting or categorising vehicles
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Abstract

The invention belongs to the technical field of vehicle attitude estimation, and relates to a full-scene vehicle attitude estimation method.

Description

Full-scene vehicle attitude estimation method
Technical Field
The invention belongs to the technical field of vehicle attitude estimation, and relates to a full-scene vehicle attitude estimation method.
Background
Autonomous driving has broad prospects and is the development trend of future automobiles. Its development requires vehicles to be able to clearly judge the surrounding environment, select correct driving routes and driving behaviors, and assist the driver in controlling the vehicle. Real driving scenes are complex and changeable, and every complex scene calls for different countermeasures. Vehicle attitude estimation, an important task in autonomous driving technology, aims to locate the key points of vehicles in images or videos and helps to judge the driving states of surrounding vehicles.
At present, the main challenge of vehicle attitude estimation is occlusion. Occlusion exists in every driving scene, for example occlusion between vehicles, between pedestrians and vehicles, and between other objects and vehicles, yet existing vehicle attitude estimation methods have difficulty recognizing the vehicle attitude in occluded scenes. A vehicle attitude estimation method oriented to the full scene is therefore urgently needed.
Convolutional neural networks have achieved excellent performance in the field of attitude estimation, but most work treats the deep convolutional neural network as a powerful black-box predictor, and how it captures the spatial relationships between components remains unclear. From the viewpoint of both science and practical application, the interpretability of a model helps to understand how the model relates variables to reach the final prediction and how the attitude estimation algorithm processes various input images. In the vehicle attitude estimation task, the Transformer can capture long-distance relationships and thereby reveal the dependency relationships between key points.
Since the advent of the Transformer, its high computational efficiency and scalability have made it dominant in natural language processing. It is a deep neural network based mainly on the self-attention mechanism, and owing to its powerful performance, researchers have been looking for ways to apply the Transformer to computer vision tasks; the performance of Transformer-based models on various vision benchmarks is similar to or better than that of other types of networks (such as convolutional and recurrent networks). However, no report or use of such models in vehicle attitude estimation is known at present.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provides a full-scene vehicle attitude estimation method that achieves efficient vehicle attitude estimation. The method takes Swin Transformer as the backbone network for feature extraction, uses a Transformer encoder to encode the feature map information into position representations of key points, obtains key point dependency terms by calculating attention scores, and predicts the final key point positions, thereby effectively handling vehicle occlusion and realizing full-scene vehicle attitude estimation.
To achieve this purpose, the invention introduces Swin Transformer as the backbone network, optimizes the network structure according to the characteristics of the vehicle attitude estimation task, compresses the original image information into a compact position sequence of key points so that the vehicle attitude estimation task is converted into an encoding task, obtains key point dependency terms by calculating attention scores, and predicts the final key point positions. The specific process comprises the following steps:
(1) data set construction:
selecting vehicle images from open source data sets, collecting images of various vehicles from traffic monitoring and parking lot scenes, constructing a vehicle data set, and dividing the vehicle data set into a training set, a verification set and a test set;
(2) image segmentation: each image in the vehicle data set is segmented into non-overlapping image slices by a slice segmentation module; each image slice is regarded as a token and is characterized by the concatenated raw RGB values of the input image;
(3) hierarchical feature extraction by the backbone network: the image slice tokens obtained in step (2) first pass through the linear embedding layer of the first stage of the backbone network, which maps the feature dimension to an arbitrary dimension C; hierarchical feature extraction is then carried out through the two Swin Transformer blocks and the second stage to obtain a feature map;
(4) position coding: the feature map obtained in step (3) is input into the position coding layer for position coding; the feature map first passes through a 1×1 convolution or a linear layer and is flattened into (H/8)×(W/8) vectors of dimension 2C, and these vectors pass through four attention layers and a feed-forward neural network to output feature vectors, where H and W are the height and width of the image respectively;
(5) key point heat map generation: the feature vectors obtained in step (4) are reshaped back into an (H/8)×(W/8)×2C feature map, the channel dimension is then reduced from 2C to K, and the predicted K key point heat maps are generated, where K is the number of key points of each vehicle and takes the value 78;
(6) result output: the key point heat maps are converted into key point coordinates through non-maximum suppression, and the key point positions are marked in the original image, realizing full-scene vehicle attitude estimation.
Further, 78 key points are defined for each vehicle in the vehicle images in step (1), and the bounding box and category of the vehicle, i.e. the minimum bounding rectangle of the vehicle, are labeled.
Further, the size of each image slice in step (2) is 4×4 pixels, with a feature dimension of 4×4×3 = 48.
Further, the backbone network in step (3) adopts a Swin Transformer backbone network. The first stage comprises a linear embedding layer and two Swin Transformer blocks, and the number of tokens in the two Swin Transformer blocks is (H/4)×(W/4), where H and W are the height and width of the input image; the second stage comprises a linear merging layer and two Swin Transformer blocks, the image slice tokens after the first-stage feature extraction are reduced by the linear merging layer, which concatenates the features of each group of 2×2 adjacent blocks and applies a linear layer on the resulting 4C-dimensional features, so that the number of tokens is reduced by a factor of 4 and the output dimension becomes 2C; feature transformation is then carried out through the two Swin Transformer blocks, and the resolution of the obtained feature map is (H/8)×(W/8), realizing hierarchical feature extraction.
Further, the position coding layer in step (4) adopts an encoder with the standard Transformer architecture. The position coding layer regards the feature map as dynamic weights determined by the specific image content and re-weights the information flow in forward propagation; key point dependency items are obtained by calculating the scores of the last attention layer, where a higher attention score at a position in the image indicates a larger contribution to predicting the key point, and occluded key points are predicted through these key point dependencies, as illustrated by the sketch below.
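A minimal sketch of this dependency extraction, under assumed tensor shapes and names (the patent itself gives no code): the attention row of the last encoder layer belonging to a key point token is read out, and the image positions with the highest scores are taken as its dependency items.

```python
import torch

def keypoint_dependencies(last_attn: torch.Tensor, k: int, top_n: int = 5):
    """Pick the positions that contribute most to key point k.

    last_attn: attention weights of the final encoder layer, averaged over heads,
               shape (num_tokens, num_tokens) -- an assumed layout.
    k:         index of the token associated with key point k.
    """
    scores = last_attn[k]                          # attention row for key point k
    top_scores, top_positions = torch.topk(scores, top_n)
    return top_positions, top_scores               # dependency positions and their scores

# usage sketch: average a dummy 8-head attention map and query key point 0
attn = torch.softmax(torch.randn(8, 784, 784), dim=-1).mean(dim=0)
positions, scores = keypoint_dependencies(attn, k=0)
```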
Compared with the prior art, the invention replaces the traditional convolutional neural network with Swin Transformer and adopts a hierarchical Transformer as the backbone network, which improves computational efficiency and keeps the computational complexity linear. A standard Transformer encoder is used to capture long-distance relationships in the image and reveal the dependency relationships of the predicted key points; the final attention layer collects the dependency items that contribute most to each key point to form the final predicted key point position, which addresses the occlusion problem. The method achieves a good balance between detection accuracy and speed and has high practical application value.
Drawings
Fig. 1 is a schematic structural framework diagram of a vehicle attitude estimation system provided by the present invention.
Fig. 2 is a schematic structural diagram of a first stage of the backbone network according to the present invention.
Fig. 3 is a schematic structural diagram of a second stage of the backbone network according to the present invention.
FIG. 4 is a structural diagram of a single coding layer according to the present invention.
FIG. 5 is a block flow diagram of a vehicle attitude estimation method according to the present invention.
Detailed Description
The invention will be further described below by way of examples and with reference to the accompanying drawings, without in any way limiting the scope of the invention.
Example:
This embodiment provides a full-scene vehicle attitude estimation method based on a Transformer backbone and a position encoder. The method introduces Swin Transformer as the backbone network, converts the vehicle attitude estimation task into an encoding task by compressing the original image information into a compact position sequence of key points, obtains key point dependency terms by calculating attention scores, and predicts the final key point positions, so that the positions of occluded vehicle key points can be effectively predicted and full-scene vehicle attitude estimation is realized. As shown in FIGS. 1-5, the method specifically comprises the following steps:
(1) data set construction:
vehicle images are selected from open source data sets and images containing various vehicles are collected from real scenes such as traffic monitoring and parking lots to construct a vehicle data set; 78 key points are defined on each vehicle, taking a car as an example, mainly points with strong local texture feature information, such as corner points (the 4 corner points of the car lamps, the 4 corner points of the front and rear windshields, and the like); the bounding box and category of the vehicle, i.e. the minimum bounding rectangle of the vehicle, are labeled; finally the data set is divided into a training set, a verification set and a test set; a possible annotation layout is sketched after this paragraph;
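A minimal sketch of what one annotation record could look like under the description above (the field names and file layout are assumptions; the patent does not prescribe a storage format):

```python
# one hypothetical annotation record for a single vehicle instance
annotation = {
    "image": "parking_lot/000123.jpg",   # assumed relative path to the source frame
    "category": "car",                   # vehicle category label
    "bbox": [412, 208, 655, 374],        # minimum bounding rectangle (x1, y1, x2, y2) in pixels
    "keypoints": [[0.0, 0.0]] * 78,      # 78 [x, y] key points, e.g. lamp and windshield corners
}
assert len(annotation["keypoints"]) == 78
```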
(2) image segmentation:
the vehicle image is segmented into non-overlapping image slices by the slice segmentation module; the size of each image slice is 4×4 pixels and its feature dimension is 4×4×3 = 48; each image slice is regarded as a token and is characterized by the concatenated raw RGB values of the input image, as sketched in the code below;
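A minimal sketch of this patch partition step (tensor layout and function name are assumptions, following a plain 4×4 non-overlapping split):

```python
import torch

def partition_patches(image: torch.Tensor, patch: int = 4) -> torch.Tensor:
    """Split an image of shape (3, H, W) into non-overlapping patch tokens.

    Each 4x4 patch is flattened into a 4*4*3 = 48-dimensional token,
    giving (H/4)*(W/4) tokens in total.
    """
    c, h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "H and W must be divisible by the patch size"
    tokens = image.unfold(1, patch, patch).unfold(2, patch, patch)     # (3, H/4, W/4, 4, 4)
    tokens = tokens.permute(1, 2, 0, 3, 4).reshape(-1, c * patch * patch)
    return tokens                                                      # ((H/4)*(W/4), 48)

# usage sketch on a dummy 224x224 RGB image
tokens = partition_patches(torch.rand(3, 224, 224))
print(tokens.shape)   # torch.Size([3136, 48])
```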
(3) hierarchical feature extraction by the backbone network:
the backbone network is divided into two stages, and the image slice tokens first pass through the first stage; as shown in fig. 2, the first stage comprises a linear embedding layer and two Swin Transformer blocks, the linear embedding layer is applied to the raw-value features of the image slices and maps them to an arbitrary dimension C, and the number of tokens in the Swin Transformer blocks is (H/4)×(W/4), where H and W are the height and width of the input image; this is followed by the second stage, as shown in fig. 3, in which the tokens are reduced by a linear merging layer as the network deepens: the linear merging layer concatenates the features of each group of 2×2 adjacent blocks and applies a linear layer on the resulting 4C-dimensional features, so that the number of tokens is reduced by a factor of 4 and the output dimension becomes 2C; feature transformation is then carried out through two Swin Transformer blocks, and the resolution of the obtained feature map is (H/8)×(W/8), realizing hierarchical feature extraction (a sketch of the embedding and merging operations follows below);
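A minimal sketch of the linear embedding and token merging operations described above (the Swin Transformer blocks themselves are omitted; layer names, the default C = 96, and the token-grid layout are assumptions consistent with the standard Swin Transformer design):

```python
import torch
import torch.nn as nn

class LinearEmbedding(nn.Module):
    """Stage 1 entry: project 48-dimensional patch tokens to an arbitrary dimension C."""
    def __init__(self, c: int = 96):
        super().__init__()
        self.proj = nn.Linear(48, c)

    def forward(self, tokens):                       # tokens: ((H/4)*(W/4), 48)
        return self.proj(tokens)                     # ((H/4)*(W/4), C)

class PatchMerging(nn.Module):
    """Stage 2 entry: concatenate each 2x2 group of tokens (4C) and project to 2C."""
    def __init__(self, c: int = 96):
        super().__init__()
        self.reduce = nn.Linear(4 * c, 2 * c)

    def forward(self, x, h, w):                      # x: (h*w, C) tokens on an h x w grid
        c = x.shape[-1]
        x = x.view(h, w, c)
        groups = torch.cat([x[0::2, 0::2], x[1::2, 0::2],
                            x[0::2, 1::2], x[1::2, 1::2]], dim=-1)    # (h/2, w/2, 4C)
        return self.reduce(groups.view(-1, 4 * c))   # ((h/2)*(w/2), 2C), 4x fewer tokens

# usage sketch for a 224x224 input: 56x56 tokens of dim C -> 28x28 tokens of dim 2C
x = LinearEmbedding()(torch.rand(56 * 56, 48))
x = PatchMerging()(x, 56, 56)
print(x.shape)   # torch.Size([784, 192])
```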
(4) position coding:
the feature map output by the backbone network is input into the coding layers; this embodiment has 4 coding layers, each structured as shown in fig. 4: the feature map first passes through a 1×1 convolution or a linear layer and is flattened into (H/8)×(W/8) vectors of dimension 2C, and these vectors pass through the four attention layers and a feed-forward neural network to obtain the feature vectors, as sketched below;
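A minimal sketch of the flatten-and-encode step (a stock PyTorch Transformer encoder is used as a stand-in for the coding layers; channel size 2C = 192, 4 layers, and 8 heads are assumptions based on the description):

```python
import torch
import torch.nn as nn

class PositionCodingHead(nn.Module):
    """Flatten the backbone feature map and pass it through Transformer encoder layers."""
    def __init__(self, channels: int = 192, num_layers: int = 4, heads: int = 8):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)      # the 1x1 convolution
        layer = nn.TransformerEncoderLayer(d_model=channels, nhead=heads,
                                           dim_feedforward=4 * channels,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, feat):                        # feat: (B, 2C, H/8, W/8)
        x = self.proj(feat)
        b, c, h, w = x.shape
        x = x.flatten(2).transpose(1, 2)            # (B, (H/8)*(W/8), 2C) token sequence
        return self.encoder(x)                      # attention layers + feed-forward network

# usage sketch for a 224x224 input (feature map 28x28, 2C = 192)
out = PositionCodingHead()(torch.rand(1, 192, 28, 28))
print(out.shape)   # torch.Size([1, 784, 192])
```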
(5) generating a keypoint heat map:
the coding layers output the feature vectors, which are first reshaped back into an (H/8)×(W/8)×2C feature map; the channel dimension is then reduced from 2C to K (K being the number of key points per vehicle, with the value 78), and the predicted K key point heat maps are generated, as sketched below;
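A minimal sketch of this heat map head (the channel reduction is shown here as a 1×1 convolution, which is an assumption; the patent only states that the channel dimension is reduced from 2C to K):

```python
import torch
import torch.nn as nn

class HeatmapHead(nn.Module):
    """Reshape encoder tokens back to a feature map and predict K = 78 heat maps."""
    def __init__(self, channels: int = 192, num_keypoints: int = 78):
        super().__init__()
        self.to_heatmaps = nn.Conv2d(channels, num_keypoints, kernel_size=1)

    def forward(self, tokens, h, w):                 # tokens: (B, h*w, 2C)
        b, n, c = tokens.shape
        feat = tokens.transpose(1, 2).reshape(b, c, h, w)   # back to (B, 2C, H/8, W/8)
        return self.to_heatmaps(feat)                        # (B, 78, H/8, W/8)

# usage sketch continuing the previous example (28x28 token grid)
heatmaps = HeatmapHead()(torch.rand(1, 784, 192), 28, 28)
print(heatmaps.shape)   # torch.Size([1, 78, 28, 28])
```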
(6) result output: non-maximum suppression is applied to the key point heat maps generated in step (5) to obtain the key point coordinates, and the key point positions are marked in the original image, realizing full-scene vehicle attitude estimation; a sketch of the coordinate extraction is given below.
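A minimal sketch of turning heat maps into coordinates (simple per-channel peak picking is used here as a stand-in for non-maximum suppression; the ×8 rescaling assumes the (H/8)×(W/8) heat map size from the previous steps):

```python
import torch

def heatmaps_to_keypoints(heatmaps: torch.Tensor, stride: int = 8):
    """Take the peak of each of the K heat maps and map it back to image coordinates.

    heatmaps: (K, h, w) tensor of predicted key point heat maps.
    Returns (K, 2) pixel coordinates (x, y) and (K,) peak scores.
    """
    k, h, w = heatmaps.shape
    flat = heatmaps.view(k, -1)
    scores, idx = flat.max(dim=1)                                    # peak response per key point
    ys = torch.div(idx, w, rounding_mode="floor")
    xs = idx % w
    coords = torch.stack([xs, ys], dim=1).float() * stride           # back to original image scale
    return coords, scores

# usage sketch on the 78-channel heat maps from the previous step
coords, scores = heatmaps_to_keypoints(torch.rand(78, 28, 28))
print(coords.shape)   # torch.Size([78, 2])
```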
Structures, algorithms, and computational processes not described in detail herein are all common in the art.
It is noted that the disclosed embodiments are intended to aid further understanding of the invention, but those skilled in the art will appreciate that various substitutions and modifications are possible without departing from the spirit and scope of the invention and the appended claims. Therefore, the invention should not be limited to the disclosed embodiments; rather, the scope of the invention is defined by the appended claims.

Claims (5)

1. A full-scene vehicle attitude estimation method, characterized by comprising the following steps:
(1) data set construction:
selecting vehicle images from open source data sets, collecting images of various vehicles from traffic monitoring and parking lot scenes, constructing a vehicle data set, and dividing the vehicle data set into a training set, a verification set and a test set;
(2) image segmentation: each image in the vehicle data set is segmented into non-overlapping image slices by a slice segmentation module; each image slice is regarded as a token and is characterized by the concatenated raw RGB values of the input image;
(3) hierarchical feature extraction by the backbone network: the image slice tokens obtained in step (2) first pass through the linear embedding layer of the first stage of the backbone network, which maps the feature dimension to an arbitrary dimension C; hierarchical feature extraction is then carried out through the two Swin Transformer blocks and the second stage to obtain a feature map;
(4) position coding: the feature map obtained in step (3) is input into the position coding layer for position coding; the feature map first passes through a 1×1 convolution or a linear layer and is flattened into (H/8)×(W/8) vectors of dimension 2C, and these vectors pass through four attention layers and a feed-forward neural network to output feature vectors, where H and W are the height and width of the image respectively;
(5) key point heat map generation: the feature vectors obtained in step (4) are reshaped back into an (H/8)×(W/8)×2C feature map, the channel dimension is then reduced from 2C to K, and the predicted K key point heat maps are generated, where K is the number of key points of each vehicle and takes the value 78;
(6) result output: the key point heat maps are converted into key point coordinates through non-maximum suppression, and the key point positions are marked in the original image, realizing full-scene vehicle attitude estimation.
2. The full-scene vehicle attitude estimation method according to claim 1, wherein 78 key points are defined for each vehicle in the vehicle images in step (1), and the bounding box and category of the vehicle are labeled.
3. The full-scene vehicle attitude estimation method according to claim 2, wherein the size of each image slice in step (2) is 4×4 pixels, with a feature dimension of 4×4×3 = 48.
4. The full-scene vehicle attitude estimation method according to claim 3, wherein the backbone network in step (3) adopts a Swin Transformer backbone network, the first stage comprises a linear embedding layer and two Swin Transformer blocks, and the number of tokens in the two Swin Transformer blocks is (H/4)×(W/4), where H and W are the height and width of the input image; the second stage comprises a linear merging layer and two Swin Transformer blocks, the image slice tokens after the first-stage feature extraction are reduced by the linear merging layer, which concatenates the features of each group of 2×2 adjacent blocks and applies a linear layer on the resulting 4C-dimensional features, so that the number of tokens is reduced by a factor of 4 and the output dimension becomes 2C; feature transformation is then carried out through the two Swin Transformer blocks, and the resolution of the obtained feature map is (H/8)×(W/8), realizing hierarchical feature extraction.
5. The full-scene vehicle attitude estimation method according to claim 4, wherein the position coding layer in step (4) adopts an encoder with the standard Transformer architecture; the position coding layer regards the feature map as dynamic weights determined by the specific image content, re-weights the information flow in forward propagation, obtains key point dependency items by calculating the scores of the last attention layer, and predicts occluded key points through the key point dependencies, where a higher attention score at a position in the image indicates a larger contribution to predicting the key point.
CN202210780438.0A 2022-07-05 2022-07-05 Full-scene vehicle attitude estimation method Active CN114842085B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210780438.0A CN114842085B (en) 2022-07-05 2022-07-05 Full-scene vehicle attitude estimation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210780438.0A CN114842085B (en) 2022-07-05 2022-07-05 Full-scene vehicle attitude estimation method

Publications (2)

Publication Number Publication Date
CN114842085A true CN114842085A (en) 2022-08-02
CN114842085B CN114842085B (en) 2022-09-16

Family

ID=82574897

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210780438.0A Active CN114842085B (en) 2022-07-05 2022-07-05 Full-scene vehicle attitude estimation method

Country Status (1)

Country Link
CN (1) CN114842085B (en)



Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200020117A1 (en) * 2018-07-16 2020-01-16 Ford Global Technologies, Llc Pose estimation
CN109598339A (en) * 2018-12-07 2019-04-09 电子科技大学 A kind of vehicle attitude detection method based on grid convolutional network
CN113591936A (en) * 2021-07-09 2021-11-02 厦门市美亚柏科信息股份有限公司 Vehicle attitude estimation method, terminal device and storage medium
CN113792669A (en) * 2021-09-16 2021-12-14 大连理工大学 Pedestrian re-identification baseline method based on hierarchical self-attention network
CN114663917A (en) * 2022-03-14 2022-06-24 清华大学 Multi-view-angle-based multi-person three-dimensional human body pose estimation method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHIHONG WU等: "DST3D: DLA-Swin Transformer for Single-Stage Monocular 3D Object Detection", 《2022 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV)》 *
ZINAN XIONG 等: "SWIN-POSE: SWIN TRANSFORMER BASED HUMAN POSE ESTIMATION", 《ARXIV:2201.07384V1》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115272992A (en) * 2022-09-30 2022-11-01 松立控股集团股份有限公司 Vehicle attitude estimation method
CN116758341A (en) * 2023-05-31 2023-09-15 北京长木谷医疗科技股份有限公司 GPT-based hip joint lesion intelligent diagnosis method, device and equipment
CN116758341B (en) * 2023-05-31 2024-03-19 北京长木谷医疗科技股份有限公司 GPT-based hip joint lesion intelligent diagnosis method, device and equipment
CN117352120A (en) * 2023-06-05 2024-01-05 北京长木谷医疗科技股份有限公司 GPT-based intelligent self-generation method, device and equipment for knee joint lesion diagnosis
CN116740714A (en) * 2023-06-12 2023-09-12 北京长木谷医疗科技股份有限公司 Intelligent self-labeling method and device for hip joint diseases based on unsupervised learning
CN116740714B (en) * 2023-06-12 2024-02-09 北京长木谷医疗科技股份有限公司 Intelligent self-labeling method and device for hip joint diseases based on unsupervised learning
CN116894973A (en) * 2023-07-06 2023-10-17 北京长木谷医疗科技股份有限公司 Integrated learning-based intelligent self-labeling method and device for hip joint lesions
CN116894973B (en) * 2023-07-06 2024-05-03 北京长木谷医疗科技股份有限公司 Integrated learning-based intelligent self-labeling method and device for hip joint lesions

Also Published As

Publication number Publication date
CN114842085B (en) 2022-09-16

Similar Documents

Publication Publication Date Title
CN114842085B (en) Full-scene vehicle attitude estimation method
CN108399419B (en) Method for recognizing Chinese text in natural scene image based on two-dimensional recursive network
CN111598030B (en) Method and system for detecting and segmenting vehicle in aerial image
CN111401436B (en) Streetscape image segmentation method fusing network and two-channel attention mechanism
CN112396607A (en) Streetscape image semantic segmentation method for deformable convolution fusion enhancement
CN111126359A (en) High-definition image small target detection method based on self-encoder and YOLO algorithm
CN110781850A (en) Semantic segmentation system and method for road recognition, and computer storage medium
CN113486726A (en) Rail transit obstacle detection method based on improved convolutional neural network
CN113743269B (en) Method for recognizing human body gesture of video in lightweight manner
CN113870335A (en) Monocular depth estimation method based on multi-scale feature fusion
CN112801027A (en) Vehicle target detection method based on event camera
CN110688905A (en) Three-dimensional object detection and tracking method based on key frame
CN112990065A (en) Optimized YOLOv5 model-based vehicle classification detection method
CN116797787B (en) Remote sensing image semantic segmentation method based on cross-modal fusion and graph neural network
CN112163447B (en) Multi-task real-time gesture detection and recognition method based on Attention and Squeezenet
CN111696038A (en) Image super-resolution method, device, equipment and computer-readable storage medium
CN115588126A (en) GAM, CARAFE and SnIoU fused vehicle target detection method
CN111881914B (en) License plate character segmentation method and system based on self-learning threshold
CN113096133A (en) Method for constructing semantic segmentation network based on attention mechanism
Yu et al. Intelligent corner synthesis via cycle-consistent generative adversarial networks for efficient validation of autonomous driving systems
CN112581423A (en) Neural network-based rapid detection method for automobile surface defects
CN117037119A (en) Road target detection method and system based on improved YOLOv8
CN111626298B (en) Real-time image semantic segmentation device and segmentation method
CN114187569A (en) Real-time target detection method integrating Pearson coefficient matrix and attention
CN114693951A (en) RGB-D significance target detection method based on global context information exploration

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant