CN117523549B - Three-dimensional point cloud object identification method based on deep and wide knowledge distillation - Google Patents

Three-dimensional point cloud object identification method based on deep and wide knowledge distillation Download PDF

Info

Publication number
CN117523549B
CN117523549B (application CN202410009182.2A)
Authority
CN
China
Prior art keywords
model
point cloud
deep
output
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410009182.2A
Other languages
Chinese (zh)
Other versions
CN117523549A (en
Inventor
田逸非
陈敏
李朋阳
尹捷明
吕梦婕
周剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202410009182.2A priority Critical patent/CN117523549B/en
Publication of CN117523549A publication Critical patent/CN117523549A/en
Application granted granted Critical
Publication of CN117523549B publication Critical patent/CN117523549B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/64Three-dimensional objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/096Transfer learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The invention belongs to the field of three-dimensional point cloud object identification and discloses a three-dimensional point cloud object identification method based on deep and wide knowledge distillation. First, a deep learning model is selected as the teacher model, and raw point cloud data are input into the teacher model for pre-training and testing, obtaining feature nodes, enhancement nodes and prediction results from training and testing respectively. Next, the data obtained from training are knowledge-distilled and used as training samples for a stacked width learning model, yielding a width learning classifier. Finally, the sample data obtained by knowledge-distilling the teacher model's test outputs are input into the trained width learning classifier to obtain class labels. By transferring the advantages of the teacher model to the stacked width learning model, the knowledge-distilled student model attains better classification ability, while the stacked width model greatly reduces the computation of the model and improves classification speed.

Description

Three-dimensional point cloud object identification method based on deep and wide knowledge distillation
Technical Field
The invention belongs to the field of three-dimensional point cloud object identification, and particularly relates to a three-dimensional point cloud object identification method based on deep and wide knowledge distillation.
Background
A point cloud is a three-dimensional dataset made up of a large number of discrete points that can represent objects and scenes in three-dimensional space. Each point contains spatial coordinate information and sometimes other information such as color, normal vector, etc. Point clouds are commonly used to capture geometric and visual information of objects or scenes in the real world, while the identification of three-dimensional point cloud objects is a classical task in the field of computer vision, and has wide application in the fields of automatic driving, industrial part production, and the like.
One of the most challenging tasks in three-dimensional point cloud object recognition is feature extraction and structural information analysis, particularly when dealing with unique properties of the point cloud, such as disorder, irregularity, and the like. To overcome these challenges, powerful methods are needed to efficiently handle the complexity inherent in point cloud data.
Inspired by the success of deep learning models in image processing, many researchers have used convolutional neural networks to identify three-dimensional objects from point clouds. However, because of the special nature of the point cloud, it cannot be fed directly into a traditional deep convolutional network and must first be preprocessed. Some network models translate the point cloud into a 2D/3D regular grid, e.g., multi-view images or voxels, in order to apply existing deep learning algorithms directly to three-dimensional object identification. However, these preprocessing steps may cause a loss of the original geometric details, so grid-based algorithms are only suitable for objects with distinct distinguishing features. To avoid losing features during rasterization, some researchers use multi-layer perceptrons (MLPs) to simulate convolution kernels for point cloud feature extraction; by stacking weight-sharing MLPs, high-dimensional point-wise features and their neighborhood information can be obtained. In addition, to handle the unstructured nature of point clouds, point convolution kernels that directly extract point cloud features have also been studied; although these kernels have stronger feature extraction capability than MLPs, their parameters require considerable time and memory to train and fine-tune. Moreover, most of these neural networks focus on operator design and ignore the influence of nonlinear classification at the fully connected layer on object recognition performance.
While a deep network structure gives a network strong learning ability, such structures suffer from a large number of hyper-parameters and the corresponding propagation processes, which make training time-consuming. The width learning system (BLS) is a shallow neural network structure that reduces layer-to-layer coupling compared with a deep structure, making the network more compact. The width learning system generates feature nodes and enhancement nodes from the input; the feature nodes and enhancement nodes are connected to the output layer, and their weights are obtained by computing a pseudo-inverse. In addition, the width learning system supports incremental learning: when feature nodes, enhancement nodes or input data are newly added, the network need not be retrained from scratch, and only the weights of the newly added part are calculated. Compared with a deep network, the width learning system is therefore fast and efficient.
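The BLS construction described above (random feature and enhancement nodes, output weights obtained in closed form via a ridge-regression pseudo-inverse) can be sketched as follows. The node counts, tanh activations and regularization value are illustrative assumptions, not taken from the invention:

```python
# Minimal one-module width learning system (BLS) sketch.
import numpy as np

rng = np.random.default_rng(0)

def bls_fit(x, y, n_feat=20, n_enh=40, lam=1e-3):
    """Fit a single BLS module: random feature nodes Z, random enhancement
    nodes H, then solve the output weights W by ridge regression."""
    We = rng.standard_normal((x.shape[1], n_feat))   # input -> feature nodes
    Z = np.tanh(x @ We)                              # feature nodes
    Wd = rng.standard_normal((n_feat, n_enh))        # feature -> enhancement nodes
    H = np.tanh(Z @ Wd)                              # enhancement nodes
    A = np.hstack([Z, H])                            # concatenated state matrix
    # Ridge-regression pseudo-inverse: W = (A^T A + lam*I)^-1 A^T y
    W = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)
    return We, Wd, W

def bls_predict(x, We, Wd, W):
    Z = np.tanh(x @ We)
    H = np.tanh(Z @ Wd)
    return np.hstack([Z, H]) @ W

x = rng.standard_normal((100, 8))
y = (x[:, :1] > 0).astype(float)                     # toy binary target
We, Wd, W = bls_fit(x, y)
pred = bls_predict(x, We, Wd, W)
print(pred.shape)  # (100, 1)
```

Because the readout weights come from a single linear solve rather than backpropagation, adding nodes or data only requires updating this solve, which is the incremental property described above.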
However, the learning ability of the width learning system as a shallow neural network is relatively limited, and the accuracy of the width learning system cannot be well ensured when facing complex tasks.
Disclosure of Invention
In order to solve the technical problems, the invention provides a three-dimensional point cloud object identification method based on deep and wide knowledge distillation.
In order to achieve the above purpose, the invention is realized by the following technical scheme:
the invention relates to a three-dimensional point cloud object identification method based on deep and wide knowledge distillation, which specifically comprises the following steps:
step 1, selecting a deep learning network as a teacher model, training the teacher model based on a training point cloud data set, extracting distinguishing features of original point cloud data by using the teacher model, and obtaining a soft label generated by the model;
step 2, constructing a stackKDBLS model with an n-layer width learning network, taking the stackKDBLS model as the student model, and concatenating the features acquired in step 1 as input to the student model;
step 3, training a stackKDBLS model by utilizing the soft tag information and the real tag of the point cloud data acquired by the deep learning network;
step 4, if the final training result in step 3 exceeds a preset threshold, further stacking a width learning network on the basis of the original stackKDBLS model.
The invention further improves that: the distinguishing features to be obtained in step 1 are global features and local features. The global features are obtained by performing a series of spatial transformation, feature extraction and pooling operations on the input point cloud data; they represent the global properties of the point cloud, summarizing its geometric structure and characteristics. The local features are obtained by first determining the local neighborhood range of each point, then normalizing the coordinates of each selected local neighborhood to reduce the effects of transformations such as rotation and translation, and finally applying convolution or pooling. Unlike the global features, they capture the more specific and abstract feature information learned in the intermediate layers of the deep neural network, which helps the network understand the structure and features of the point cloud data more fully.
The invention further improves that: the student model constructed in step 2 comprises n width learning system modules stacked through residual connection; the output of the (i−1)-th width learning system module is used as the input of the i-th width learning system module, the desired output of the i-th width learning system module is the residual of the 1st, 2nd, …, (i−1)-th width learning system modules, i ≤ n, and the final output is the sum of the outputs of the n width learning system modules; each width learning system module comprises feature nodes, feature node weights, enhancement nodes and enhancement node weights.
Assuming that the input data is x and the output data is y, the output of the i-th width learning system module is u_i:

u_i = Z_i·W_z^i + H_i·W_h^i,  with Z_i = P(v_i·W_e^i) and H_i = Q(Z_i·W_δ^i)

wherein W_z^i and W_h^i are the connection weights of the feature nodes and the enhancement nodes to the output layer, W_e^i is the randomly generated weight between the input and the feature nodes, and W_δ^i is the randomly generated connection weight between the feature nodes and the enhancement nodes; Q_p(·) is the composite mapping of P(·) and Q(·), wherein P(·) is the generalized function of the feature nodes and Q(·) is the generalized function of the enhancement nodes; v_i = g(u_{i−1}) with v_1 = x, and g(·) is a mapping function. The final output of the system is:

y = Σ_{i=1}^{n} u_i
W_z^i and W_h^i are obtained by solving an optimization problem:

min_{W_i} ||A_i·W_i − y_i||² + λ||W_i||²

wherein y_i is the desired output for the training data v_i in the i-th width learning system module, and λ is a balance coefficient.

The optimization problem is solved by a ridge regression approximation:

W_i = (A_i^T·A_i + λI)^(−1)·A_i^T·y_i

wherein A_i = [Z_i | H_i]; W_i denotes the connection weights of the feature nodes and enhancement nodes to the output layer, and I is an identity matrix.
The invention further improves that: typically, in a classification task, the object labels are represented by one-hot coding, where each class is represented by a vector in which exactly one element is 1 and the remaining elements are 0. The output of the neural network is a set of scores called logits, which have not been normalized by the softmax function and therefore may contain more information, particularly the similarities and correlations between the target class and the other classes. Thus, knowledge distillation is adopted in step 3, and the teacher model θ is used to assist in training the stackKDBLS model.
The logits and predicted output of the teacher model θ are:

Z_t = θ(X)

Y_t = softmax(θ(X))

wherein X is the input data, Z_t is the logits of the teacher model, and Y_t is the predicted output of the teacher model θ.
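The relation between logits and the softened teacher prediction can be illustrated with a temperature-scaled softmax; the toy logits and the temperature value below are hypothetical, not from the patent:

```python
# Temperature-scaled softmax over teacher logits Z_t, giving soft labels Y_t.
import numpy as np

def softmax(z, t=1.0):
    """Row-wise softmax with distillation temperature t."""
    z = z / t
    z = z - z.max(axis=1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

logits = np.array([[4.0, 1.0, 0.5],
                   [0.2, 3.0, 0.1]])       # Z_t: raw teacher scores (toy values)
hard = softmax(logits)                      # Y_t at t = 1: nearly one-hot
soft = softmax(logits, t=4.0)               # higher t: smoother, keeps class similarities
print(hard.round(3))
print(soft.round(3))
```

Raising the temperature flattens the distribution, which is why the softened output carries more inter-class similarity information than the t = 1 prediction.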
If the teacher model θ is not considered, then W_k = (A_k)^+·Y_k, wherein (A_k)^+ is the pseudo-inverse matrix. When the teacher model θ is used to assist in training the stackKDBLS model, the calculation of the target output Y_k changes to:

Y_k = (1 − 1/t_k)·Y_GT + (1/t_k)·(Y_t + Σ_{l=1}^{k−1} Y_l)/k

wherein k is the number of stacked stackKDBLS modules, t_k is the distillation temperature, and Y_l is the predicted output of the l-th stackKDBLS module. This calculation takes into account both the output of the teacher model and the outputs of the previous stackKDBLS modules, which helps convey more information for training the stackKDBLS model.

If k = 1, then Y_1 = (1 − 1/t)·Y_GT + (1/t)·Y_t, and the summary of the outputs of the first k stackKDBLS modules need not be considered.
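A minimal sketch of the distilled target computation, assuming the averaged form that reduces to (1 − 1/t)·Y_GT + (1/t)·Y_t when k = 1; the function name and toy values are illustrative:

```python
# Distilled target: mix ground truth with teacher output and previous module outputs.
import numpy as np

def distilled_target(y_gt, y_teacher, prev_outputs, t=2.0):
    """Hypothetical form of Y_k: blend one-hot labels Y_GT with the teacher
    prediction Y_t and the outputs of previously stacked modules."""
    k = len(prev_outputs) + 1
    summary = (y_teacher + sum(prev_outputs)) / k if prev_outputs else y_teacher
    return (1 - 1 / t) * y_gt + (1 / t) * summary

y_gt = np.array([[1.0, 0.0], [0.0, 1.0]])   # ground-truth one-hot labels
y_t = np.array([[0.8, 0.2], [0.3, 0.7]])    # teacher soft predictions
y1 = distilled_target(y_gt, y_t, [], t=2.0)     # k = 1: only teacher is mixed in
y2 = distilled_target(y_gt, y_t, [y1], t=2.0)   # k = 2: averages in module 1's output
print(y1)  # [[0.9  0.1 ] [0.15 0.85]]
```

Each target stays a valid probability distribution because it is a convex combination of distributions.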
The invention further improves that: in step 4, to determine whether to stack more width learning networks into the stackKDBLS model, the KL divergence is used to measure the difference between the predicted output Y_k and the target output. The KL divergence is calculated as:

L_k = D_KL(Y_k || Y_GT)

If L_k is above a predetermined threshold ε, a BLS network is added to improve the performance of the model.
The beneficial effects of the invention are as follows:
1. the invention allows knowledge distillation between different types of models, namely, knowledge is transferred from a complex deep neural network to a lightweight width learning network, so that knowledge sharing and migration between models are promoted;
2. the invention can effectively utilize the width learning network as the student model to learn more knowledge from the teacher model more quickly and directly;
3. the stackKDBLS model improves the overall performance of three-dimensional shape recognition on raw point clouds through the knowledge transfer framework, and can achieve higher classification accuracy than the original deep learning network with less time and resource expenditure.
Drawings
FIG. 1 is a flow chart of a three-dimensional point cloud object identification method based on deep and wide knowledge distillation.
Fig. 2 is a schematic diagram of the deep and wide knowledge distillation model, namely stackKDBLS.
Detailed Description
Embodiments of the invention are disclosed in the drawings, and for purposes of explanation, numerous practical details are set forth in the following description. However, it should be understood that these practical details are not to be taken as limiting the invention. That is, in some embodiments of the invention, these practical details are unnecessary.
As shown in fig. 1, the invention relates to a three-dimensional point cloud object identification method based on deep and wide knowledge distillation, which specifically comprises the following steps:
step 1, selecting a deep learning network as a teacher model, training the teacher model based on a training point cloud data set, extracting distinguishing features of original point cloud data by using the teacher model, and obtaining a soft label generated by the model.
The specific operation is as follows:
(1) Selecting a deep learning network model, and taking original point cloud data as the input of the model;
(2) Performing a series of operations such as spatial transformation, feature extraction and pooling on the original point cloud data through the deep network model to capture the global and local features of the original point cloud;
(3) And obtaining the soft tag information by using the trained deep learning network.
Step 2, constructing a stackKDBLS model with a three-layer width learning network, taking the stackKDBLS model as the student model, and concatenating the features acquired in step 1 as input to the student model. The specific operations are as follows:

Step 21, for the first width learning system module, randomly initialize the weight matrices W_e^1 and W_δ^1, and use x, W_e^1 and W_δ^1 to compute the feature nodes and enhancement nodes Z_1, H_1 by the formulas:

Z_1 = P(x·W_e^1),  H_1 = Q(Z_1·W_δ^1)

Step 22, compute the weights W_z^1, W_h^1 between the input data x and the desired output y, and then obtain the predicted output u_1 by the formula:

u_1 = Z_1·W_z^1 + H_1·W_h^1

Step 23, stack a new width learning module on the basis of the first width learning system module. The input of the i-th (i = 2, 3) width learning system module is v_i = g(u_{i−1}), i.e., a mapping of the output of the previous width learning system module, and its desired output is the residual:

y_i = y − Σ_{l=1}^{i−1} u_l

Likewise, randomly initialize the weight matrices W_e^i and W_δ^i, and use v_i, W_e^i and W_δ^i to compute the feature nodes and enhancement nodes Z_i, H_i by the formulas:

Z_i = P(v_i·W_e^i),  H_i = Q(Z_i·W_δ^i)

then compute the weights W_z^i, W_h^i between the input v_i and the desired output y_i.

Step 24, thereby obtain the predicted output by the formula:

u_i = Z_i·W_z^i + H_i·W_h^i

Step 25, repeat step 23 until the number of stacked width learning system modules equals n; the final predicted output is:

y = Σ_{i=1}^{n} u_i
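Steps 21-25 can be sketched as the following residual stacking loop; the module sizes, the tanh mapping used for g(·), and the toy data are assumptions for illustration:

```python
# Residual stacking of BLS modules: each module fits the remaining residual,
# and the final prediction is the sum of all module outputs.
import numpy as np

rng = np.random.default_rng(1)

def fit_module(v, target, n_feat=16, n_enh=32, lam=1e-3):
    """One BLS module: random feature/enhancement nodes, ridge-regression readout."""
    We = rng.standard_normal((v.shape[1], n_feat))   # input -> feature nodes
    Wd = rng.standard_normal((n_feat, n_enh))        # feature -> enhancement nodes
    Z = np.tanh(v @ We)
    H = np.tanh(Z @ Wd)
    A = np.hstack([Z, H])
    W = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ target)
    return A @ W                                     # module output u_i

x = rng.standard_normal((64, 6))                     # toy input features
y = rng.standard_normal((64, 3))                     # toy targets

v, total = x, np.zeros_like(y)
for i in range(3):                                   # n = 3 stacked modules
    u = fit_module(v, y - total)                     # module i fits the running residual
    total = total + u                                # final output: sum of module outputs
    v = np.tanh(u)                                   # v_{i+1} = g(u_i) feeds the next module

print(total.shape)  # (64, 3)
```

Fitting each module to the residual of the accumulated prediction is what makes the sum of module outputs a progressively refined estimate.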
and step 3, training a stackKDBLS model by utilizing the soft tag information and the real tag of the point cloud data acquired by the deep learning network. The specific operation is as follows:
the knowledge distillation mode is adopted: the deep learning network selected in step 1 is used as the teacher model θ to assist in training the stackKDBLS model, and the logits and predicted output of the teacher model θ are:

Z_t = θ(X)

Y_t = softmax(θ(X))

When the teacher model θ is employed to assist in training the stackKDBLS model, the calculation of the desired output Y_k changes to:

Y_k = (1 − 1/t_k)·Y_GT + (1/t_k)·(Y_t + Σ_{l=1}^{k−1} Y_l)/k

wherein t_k is the distillation temperature;
step 4, if the final training result in step 3 exceeds a preset threshold, further stacking a width learning network on the basis of the original stackKDBLS model. The specific operation is as follows:

to determine whether to stack more width learning systems into the model, the KL divergence is used to measure the difference between the predicted output Y_k and the target output. The KL divergence is calculated as:

L_k = D_KL(Y_k || Y_GT)

If L_k is above the predetermined threshold ε, an additional width learning system is added to improve the performance of the model.
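The step-4 stopping rule can be sketched as follows; the clipping constant guards against the zeros of one-hot targets, and the threshold value is illustrative:

```python
# KL-divergence stopping rule: stack another BLS module while L_k exceeds eps.
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Mean row-wise KL divergence D_KL(p || q); inputs are clipped so that
    one-hot targets with exact zeros do not produce infinities."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.mean(np.sum(p * np.log(p / q), axis=1)))

y_pred = np.array([[0.7, 0.3], [0.4, 0.6]])   # predicted output Y_k (toy values)
y_gt = np.array([[1.0, 0.0], [0.0, 1.0]])     # target output Y_GT (one-hot)
L_k = kl_divergence(y_pred, y_gt)
threshold = 0.1                               # illustrative epsilon
print("add another BLS module:", L_k > threshold)
```

A larger divergence means the current stack still disagrees with the targets, triggering one more width learning module.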
According to the invention, the advantages of the teacher model are transferred to the stacking width learning model, so that better classification capability can be obtained by using the student model after knowledge distillation, the calculation amount of the model is greatly reduced by using the stacking width model, and the classification speed is improved.
The foregoing description is only illustrative of the invention and is not to be construed as limiting the invention. Various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, or the like, which is within the spirit and principles of the present invention, should be included in the scope of the claims of the present invention.

Claims (6)

1. A three-dimensional point cloud object identification method based on deep and wide knowledge distillation is characterized in that: the three-dimensional point cloud object identification method comprises the following steps:
step 1, selecting a deep learning network model as a teacher model, training the teacher model based on a training point cloud data set, extracting distinguishing characteristics of original point cloud data by using the teacher model, and acquiring a soft label generated by the teacher model;
step 2, constructing a stackKDBLS model with an n-layer width learning network, taking the stackKDBLS model as the student model, and concatenating the distinguishing features acquired in step 1 as input to the student model;
step 3, training the stackKDBLS model constructed in the step 2 by utilizing the soft tag information and the real tag of the point cloud data acquired by the deep learning network in the step 1;
step 4, if the final training result in step 3 exceeds a predetermined threshold, stacking a width learning network based on the original stackKDBLS model, wherein:
the stackKDBLS model constructed in step 2, namely the student model, comprises n width learning system modules stacked through residual connection, wherein the output of the (i−1)-th width learning system module is used as the input of the i-th width learning system module, the desired output of the i-th width learning system module is the residual of the 1st, 2nd, …, (i−1)-th width learning system modules, i ≤ n, the final output of the model is the sum of the outputs of the n width learning system modules, and each width learning system module comprises feature nodes, feature node weights, enhancement nodes and enhancement node weights,
the step 2 specifically includes:
assuming that the input data is x and the output data is y, the output of the ith width learning system module is u i The method comprises the following steps:
wherein,and->For the connection weight of the feature node to the output layer, < ->For the weight between the randomly generated input and the feature node, < ->The connection weight between the randomly generated characteristic node and the enhancement node is used; q (Q) p (. Cndot.) is a composite mapping of P (-) and Q (-) where P (-) is the generalized function of the feature node and Q (-) is the generalized function of the enhancement node, vi=g(u i -1), g (·) is a mapping function, the final output of the system being:
2. the three-dimensional point cloud object identification method based on deep and wide knowledge distillation according to claim 1, wherein the method comprises the following steps: the step 1 specifically comprises the following steps:
step 1.1, selecting a deep learning network model, and taking original point cloud data as input of the deep learning network model;
step 1.2, performing space transformation, feature extraction and pooling operation on original point cloud data through a depth network model to capture global features and local features of the original point cloud data;
and 1.3, acquiring soft tag information of the point cloud data by using the trained deep learning network.
3. The three-dimensional point cloud object identification method based on deep and wide knowledge distillation according to claim 2, wherein the method comprises the following steps: the distinguishing features in the step 1 are global features and local features respectively, the global features are obtained by performing spatial transformation, feature extraction and pooling operation on input original point cloud data, the global features represent global properties of point clouds, the geometric structures and features of the point clouds are summarized and extracted, the local features are obtained by determining a local neighborhood range of each point, normalizing coordinates of each selected local neighborhood, and finally obtaining the local features through convolution or pooling, so that more local and abstract feature information learned in a deep neural network middle layer is captured.
4. The three-dimensional point cloud object identification method based on deep and wide knowledge distillation according to claim 1, wherein: the W_z^i and W_h^i are obtained by solving an optimization problem:

min_{W_i} ||A_i·W_i − y_i||² + λ||W_i||²

wherein y_i is the desired output for the training data v_i in the i-th width learning system module, and λ is the balance coefficient;

the optimization problem is solved by a ridge regression approximation:

W_i = (A_i^T·A_i + λI)^(−1)·A_i^T·y_i

wherein A_i = [Z_i | H_i], W_i denotes the connection weights of the feature nodes and the enhancement nodes to the output layer, and I is an identity matrix.
5. The three-dimensional point cloud object identification method based on deep and wide knowledge distillation according to claim 1, wherein the method comprises the following steps: the step 3 specifically includes:
step 3.1, adopting a knowledge distillation mode, using the teacher model θ to assist in training the stackKDBLS model, wherein the logits and the predicted output of the teacher model θ are:

Z_t = θ(X)

Y_t = softmax(θ(X))

wherein X is the input data, Z_t is the logits of the teacher model, and Y_t is the predicted output of the teacher model θ;

step 3.2, if the teacher model θ is not considered, then W_k = (A_k)^+·Y_k, wherein (A_k)^+ is a pseudo-inverse matrix; when the teacher model θ is used to assist in training the stackKDBLS model, the calculation of the target output Y_k changes to:

Y_k = (1 − 1/t_k)·Y_GT + (1/t_k)·(Y_t + Σ_{l=1}^{k−1} Y_l)/k

wherein k is the number of stacked stackKDBLS modules, t_k is the distillation temperature, and Y_l is the predicted output of the l-th width learning system module;

if k = 1, Y_1 = (1 − 1/t)·Y_GT + (1/t)·Y_t, and the summary of the outputs of the first k stackKDBLS modules need not be considered.
6. The three-dimensional point cloud object identification method based on deep and wide knowledge distillation according to claim 1, wherein: in step 4, to determine whether to stack more width learning networks into the stackKDBLS model, the KL divergence is used to measure the difference between the predicted output Y_k and the target output, wherein the KL divergence is calculated as:

L_k = D_KL(Y_k || Y_GT)

if L_k is above a predetermined threshold ε, a width learning network is added to improve the performance of the model.
CN202410009182.2A 2024-01-04 2024-01-04 Three-dimensional point cloud object identification method based on deep and wide knowledge distillation Active CN117523549B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410009182.2A CN117523549B (en) 2024-01-04 2024-01-04 Three-dimensional point cloud object identification method based on deep and wide knowledge distillation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410009182.2A CN117523549B (en) 2024-01-04 2024-01-04 Three-dimensional point cloud object identification method based on deep and wide knowledge distillation

Publications (2)

Publication Number Publication Date
CN117523549A CN117523549A (en) 2024-02-06
CN117523549B true CN117523549B (en) 2024-03-29

Family

ID=89764844

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410009182.2A Active CN117523549B (en) 2024-01-04 2024-01-04 Three-dimensional point cloud object identification method based on deep and wide knowledge distillation

Country Status (1)

Country Link
CN (1) CN117523549B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115035595A (en) * 2022-06-02 2022-09-09 西北大学 3D model compression method based on spatio-temporal information transfer knowledge distillation technology
CN115690708A (en) * 2022-10-21 2023-02-03 苏州轻棹科技有限公司 Method and device for training three-dimensional target detection model based on cross-modal knowledge distillation
CN116189172A (en) * 2023-04-20 2023-05-30 福建环宇通信息科技股份公司 3D target detection method, device, storage medium and chip

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11587291B2 (en) * 2021-06-30 2023-02-21 Tencent America LLC Systems and methods of contrastive point completion with fine-to-coarse refinement


Also Published As

Publication number Publication date
CN117523549A (en) 2024-02-06

Similar Documents

Publication Publication Date Title
CN111489358B (en) Three-dimensional point cloud semantic segmentation method based on deep learning
CN110414432B (en) Training method of object recognition model, object recognition method and corresponding device
CN108491880B (en) Object classification and pose estimation method based on neural network
TWI742382B (en) Neural network system for vehicle parts recognition executed by computer, method for vehicle part recognition through neural network system, device and computing equipment for vehicle part recognition
CN111583263B (en) Point cloud segmentation method based on joint dynamic graph convolution
CN109948475B (en) Human body action recognition method based on skeleton features and deep learning
CN111507378A (en) Method and apparatus for training image processing model
US20220215227A1 (en) Neural Architecture Search Method, Image Processing Method And Apparatus, And Storage Medium
CN110222718B (en) Image processing method and device
CN113705769A (en) Neural network training method and device
CN111476806B (en) Image processing method, image processing device, computer equipment and storage medium
CN111008618B (en) Self-attention deep learning end-to-end pedestrian re-identification method
CN113780292A (en) Semantic segmentation network model uncertainty quantification method based on evidence reasoning
CN110281949B (en) Unified hierarchical decision-making method for automatic driving
CN114359631A (en) Target classification and positioning method based on coding-decoding weak supervision network model
CN116704431A (en) On-line monitoring system and method for water pollution
CN113870160A (en) Point cloud data processing method based on converter neural network
CN116258990A (en) Cross-modal affinity-based small sample reference video target segmentation method
CN115797629A (en) Example segmentation method based on detection enhancement and multi-stage bounding box feature refinement
CN110111365B (en) Training method and device based on deep learning and target tracking method and device
CN114973031A (en) Visible light-thermal infrared image target detection method under view angle of unmanned aerial vehicle
CN114626476A (en) Bird fine-grained image recognition method and device based on Transformer and component feature fusion
CN114445816A (en) Pollen classification method based on two-dimensional image and three-dimensional point cloud
CN113128564B (en) Typical target detection method and system based on deep learning under complex background
CN114492634A (en) Fine-grained equipment image classification and identification method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant