CN112052626B - Automatic design system and method for neural network - Google Patents
- Publication number
- CN112052626B (application CN202010818278.5A)
- Authority
- CN
- China
- Prior art keywords
- module
- neural network
- model
- video
- block module
- Prior art date: 2020-08-14 (priority date)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06393—Score-carding, benchmarking or key performance indicator [KPI] analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/49—Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
Abstract
The invention discloses an automatic neural network design system and method that use NAS to automatically design a model for the supervised VO task, improving both the NAS field and the VO field. For NAS, a more general framework is provided that can process spatial and temporal information simultaneously, making it suitable for video vision tasks. For VO, the VONAS algorithm of the invention searches out a network model with better performance and lighter weight.
Description
Technical Field
The invention belongs to the field of visual odometry within computer vision, and particularly relates to an automatic design system and method for a neural network.
Background
Visual Odometry (VO) is a critical task in autonomous driving and robotics, aimed at estimating the camera pose from successive frames. The conventional VO pipeline is a typical geometric task: the pose is obtained by rigorous computation from matches between feature points or pixels. With the rapid development of CNNs (Convolutional Neural Networks) and RNNs (Recurrent Neural Networks) in visual tasks, more and more end-to-end network frameworks are being applied to the VO task.
In deep-learning-based frameworks, the VO task is a video regression task, distinct from semantics-based visual tasks (e.g., image classification, object detection). First, the VO task predicts a 6-DoF (degrees of freedom) camera pose and focuses on the geometric feature stream rather than on semantic features; camera motion cannot be computed accurately from semantic information alone, such as by merely detecting or identifying objects in the image. Second, the VO task must process at least two images simultaneously to compute the relative pose, which places demands on its ability to extract temporal features and makes it sensitive to the input order of the images: a different input order yields a different prediction.
Using neural architecture search (Neural Architecture Search, abbreviated as NAS) to automatically design a model for the VO task, that is, to select a lightweight model suited to extracting both geometric and temporal features, is an innovative and very challenging attempt. However, as described above, while the requirement of geometric feature extraction can be met by NAS-based automatic model design, existing NAS frameworks cannot process temporal information.
Disclosure of Invention
The technical problem the invention aims to solve is that existing NAS cannot process temporal information.
To solve this problem, the present invention provides an automatic design system and method for a neural network.
The technical scheme adopted by the invention is as follows:
An automatic neural network design method, characterized by a super-network structure and a controller model, comprises the following steps:
s1, preparing a video sequence containing video data and real camera pose data;
s2, extracting a video segment V1 from the video sequence of the S1, forming training batch data by the video segment V1, uniformly sampling each block operation of the super-network structure, selecting the operation of the training batch, forming a path after the selection is completed, wherein the path is a sub-network model, sequentially inputting two adjacent frames of images in the V1 into the sub-network model according to time sequence to obtain a pose sequence between the image frames, calculating an error by using a loss function, and updating network parameters until the loss function is not reduced;
s3, outputting operands selected by each block by using a controller model to generate codes of sub-models, extracting video segments v2 and corresponding real camera pose data from the video sequence of the S1 according to video time sequence by adopting the network parameters iterated in the S2, inputting v2 into the sub-models to obtain predicted poses, comparing the predicted poses with the real camera poses, calculating to obtain segment evaluation indexes, repeating the operation of the S3 until the complete video sequence is extracted, and calculating all segment evaluation indexes to obtain final evaluation indexes of the sub-models;
s4, carrying out parameter updating on the controller model parameters by utilizing the final evaluation index of the submodel obtained in the S3, and repeating the S3 until the set iteration number is reached or the performance of the submodel is not improved;
s5, outputting n submodels by using a controller, and picking out the submodel with the best final evaluation index as a final output result.
Preferably, in step S2, the loss function is:
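(A plausible reconstruction; the formula is assumed to be the general adaptive robust loss with shape parameter a and scale parameter c:)

$$f(x; a, c) = \frac{|a-2|}{a}\left[\left(\frac{(x/c)^{2}}{|a-2|} + 1\right)^{a/2} - 1\right]$$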
where x is the Euclidean distance between the predicted pose and the real camera pose, and a and c are parameters controlling the loss.
Preferably, the video sequence of S1 is divided into a training set and a verification set; S2 uses the training set and S3 uses the verification set.
Preferably, the final evaluation index of the sub-model in S3 is calculated by averaging all segment evaluation indexes.
Preferably, n in S5 is an integer of 10 or less.
Preferably, in S3, the final evaluation index is recorded.
An automatic neural network design system comprises a super-network structure and a controller model, wherein the super-network structure comprises a Stem module, a Convolution Block module, a Reduction Block module, a Sequential Block module and a Tail module; the Stem module is used for processing the input, which is two stacked RGB images; the Tail module is used for processing temporal information; the Convolution Block module comprises a parallel combination of convolution operations with different parameters; the Reduction Block module comprises a parallel combination of convolution-based downsampling operations; and the Sequential Block module is used for integrating the temporal information of the input.
Preferably, the Sequential Block module employs one of 4 operations: ConvLSTMs 3x3, ConvLSTMs 5x5, ConvGRUs 3x3, or ConvGRUs 5x5.
Compared with the prior art, the invention has the following advantages and effects:
the invention utilizes the NAS to automatically design the model with the supervision of VO tasks, and improves the NAS field and the VO field. For the NAS field, a more general NAS framework is provided, which can process space and time sequence information simultaneously so as to be suitable for video visual tasks. In the VO aspect, by utilizing our VONAS algorithm, a network model with better performance and lighter weight is obtained by searching.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
FIG. 1 is a schematic diagram of a super network structure according to the present invention;
FIG. 2 is a schematic diagram of the Sequential Block module;
FIG. 3 is an operational diagram of the Convolution Block module and the Reduction Block module;
FIG. 4 is a schematic diagram of a controller model;
FIG. 5 is a schematic diagram of the output results of the present invention;
FIG. 6 is a comparison of the pose estimation results of the present invention (VONAS-A and VONAS-B) with other algorithms;
FIG. 7 is a comparison of the complexity and performance of the present invention against existing advanced network architectures for pose estimation.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Example 1:
An automatic neural network design system comprises a super-network structure and a controller model. As shown in FIG. 1, the super-network structure comprises a Stem initial processing module, a plurality of block modules, and a Tail module. The Stem module contains a convolution operation with kernel size 7x7 and stride 2. The Tail module contains a standard ConvLSTM layer for the final processing of temporal information.

The blocks come in 3 types. First, the Convolution Block comprises a parallel combination of several convolution operations with different parameters; the feature map it outputs matches the input in both size and channel count (the number of convolution kernels in each layer is unchanged). Second, the Reduction Block comprises a parallel combination of convolution-based downsampling operations; the feature map it outputs is half the size of the input, and the channel count is doubled. Third, the Sequential Block comprises different convolution-based ConvRNN operations that integrate the temporal information of the input sequence; the size and channel count of its output feature map are unchanged.

As shown in FIG. 2, the Sequential Block offers two kinds of module. The first, named ConvLSTMs (simple), removes one gate operation from the ConvLSTM, thereby reducing model complexity, eliminates the tanh operation, and adds normalization. The second, named ConvGRUs (simple), removes the tanh operation from the ConvGRU and adds normalization. Each Sequential Block contains both operations with convolution kernel sizes of 3 and 5, so the number of candidate operations in a Sequential Block is 4. FIG. 3 shows the candidate operations contained in the Convolution Block and the Reduction Block.
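To make the simplified recurrent cell concrete, the following is a minimal PyTorch sketch of what a ConvLSTMs (simple) cell could look like. The patent does not specify which gate is removed or which normalization is added, so this sketch assumes the output gate is dropped (the hidden state equals the cell state) and uses GroupNorm; all names and signatures are illustrative, not the patent's implementation.

```python
import torch
import torch.nn as nn

class ConvLSTMSimple(nn.Module):
    """Sketch of a simplified ConvLSTM cell ("ConvLSTMs"): one gate fewer than a
    standard ConvLSTM, no tanh, with normalization added. Assumptions (not in
    the patent): the output gate is the removed one; GroupNorm is the added norm."""

    def __init__(self, in_ch, hid_ch, kernel=3):
        super().__init__()
        pad = kernel // 2
        # 3 gate maps instead of 4: input gate, forget gate, candidate state
        self.gates = nn.Conv2d(in_ch + hid_ch, 3 * hid_ch, kernel, padding=pad)
        self.norm = nn.GroupNorm(3, 3 * hid_ch)  # normalize gate pre-activations
        self.hid_ch = hid_ch

    def forward(self, x, state=None):
        b, _, h, w = x.shape
        if state is None:
            zeros = x.new_zeros(b, self.hid_ch, h, w)
            state = (zeros, zeros)
        h_prev, c_prev = state
        z = self.norm(self.gates(torch.cat([x, h_prev], dim=1)))
        i, f, g = torch.chunk(z, 3, dim=1)
        c = torch.sigmoid(f) * c_prev + torch.sigmoid(i) * g  # no tanh on g
        h_new = c  # output gate removed: hidden state equals cell state
        return h_new, (h_new, c)
```

The ConvGRUs (simple) variant would follow the same pattern starting from a ConvGRU cell, with the tanh removed and normalization added.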
As shown in FIGS. 1-5, the automatic neural network design method includes the following steps.
Step 1, prepare a video sequence containing video data and real camera pose data, such as the KITTI autonomous driving dataset, which contains video data captured by a vehicle-mounted camera together with the real camera pose data provided with the dataset. Step 2 may then be performed directly, or the whole video sequence may first be divided into a training set and a validation set before performing step 2.
Step 2 is divided into 5 sub-steps:
2.1, extract a continuous video segment V1 from the video sequence or training set of step 1, preferably comprising 5-10 frames of data, to form the training data of the current batch.
2.2, sample the structure of a sub-network model: uniformly sample one candidate operation for each block of the super-network as the operation for the current training batch; after all blocks have been selected, the selections form a path from input to output in the super-network, and this path is a sub-network model.
2.3, input adjacent image pairs from V1 into the sub-model sequentially in video temporal order to obtain the pose sequence between the image frames.
2.4, calculate the error with the Adaptive loss and back-propagate to update the network parameters, where the Adaptive loss is defined as follows:
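(Again assuming the general adaptive robust loss form:)

$$f(x; a, c) = \frac{|a-2|}{a}\left[\left(\frac{(x/c)^{2}}{|a-2|} + 1\right)^{a/2} - 1\right]$$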
where x is the Euclidean distance between the predicted pose and the real pose, and a and c are parameters controlling the loss, which can be adjusted through the network's gradient feedback.
2.5, repeat the training iterations 2.1 to 2.4 until the loss function no longer decreases, so that all candidate operations of each block in the super-network are sufficiently trained. A minimal sketch of this training loop follows.
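The sketch below illustrates steps 2.1-2.5 in PyTorch under assumed interfaces: supernet.sample_path() for the uniform sampling of 2.2, a video_sequence object with len() and slice(), and an adaptive_loss callable for 2.4. None of these names come from the patent; they are illustrative only.

```python
import random
import torch

def train_supernet(supernet, video_sequence, optimizer, adaptive_loss,
                   segment_len=7, patience=5):
    """Step 2 sketch: uniform path sampling plus pose regression.
    The optimizer holds the super-network's parameters; the sampled path
    shares them, so updating the path updates the super-network."""
    best_loss, stall = float("inf"), 0
    while stall < patience:                      # 2.5: stop when loss plateaus
        # 2.1: extract a continuous 5-10 frame segment as the training batch
        start = random.randrange(len(video_sequence) - segment_len)
        frames, gt_poses = video_sequence.slice(start, segment_len)
        # 2.2: uniformly sample one candidate op per block -> one sub-network
        subnet = supernet.sample_path()
        # 2.3: feed adjacent frame pairs in temporal order
        pred_poses = [subnet(frames[i], frames[i + 1])
                      for i in range(segment_len - 1)]
        # 2.4: adaptive loss on the Euclidean pose error, then backprop
        # (gt_poses is assumed to hold the segment_len - 1 relative poses)
        loss = sum(adaptive_loss(torch.dist(p, g))
                   for p, g in zip(pred_poses, gt_poses))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        stall = 0 if loss.item() < best_loss else stall + 1
        best_loss = min(best_loss, loss.item())
    return supernet
```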
Step 3, perform the model search using the verification set of step 1, comprising the following steps:
and 3.1, outputting operands selected by each block in time sequentially by using the controller model, and generating a code of a sub-model after outputting. The code is a series of sequences, each value in the sequence being represented as a candidate operation number selected in each block. The code uniquely corresponds to a sub-model. A schematic of the process is shown in fig. 4.
3.2, once the network structure of the sub-model is determined, its parameters are inherited (directly copied) from the network parameters of the corresponding structure in the super-network trained in step 2; no retraining is needed, which avoids the huge time cost of retraining each sub-model.
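One plausible way to realize this parameter inheritance, assuming the sub-model reuses the super-network's parameter names (a sketch, not the patent's implementation):

```python
import torch.nn as nn

def inherit_weights(submodel: nn.Module, supernet: nn.Module) -> nn.Module:
    """Sketch of 3.2: copy the trained super-network parameters used by the
    chosen path into the sub-model; nothing is retrained. Assumes matching
    parameter names between the two modules."""
    super_state = supernet.state_dict()
    shared = {k: v for k, v in super_state.items() if k in submodel.state_dict()}
    submodel.load_state_dict(shared, strict=False)
    return submodel
```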
3.3, with the structure and parameters of the sub-model determined through 3.1 and 3.2, extract video segments V2 from the video sequence or verification set in video temporal order and perform pose prediction: input V2 into the sub-model to obtain the predicted poses, compare them with the real poses, and calculate a segment evaluation index. Average the evaluation indexes of the prediction results over all verification-set video sequences to obtain the final evaluation index of the sub-model, which evaluates its performance. Step 4 may then be performed directly, or after recording the sub-model structure and its evaluation index.
Step 4: steps 3.1-3.3 determine and evaluate one complete sub-model; the resulting evaluation index is used to update the parameters of the controller model, specifically by the policy gradient method from reinforcement learning, so that the performance of the sub-network models output by the controller gradually improves. Steps 3.1-3.3 are then repeated for further training iterations; the training of the controller model is complete when the preset number of training iterations is reached or the performance of the sub-networks output by the controller no longer improves (i.e., the accuracy no longer increases). A minimal sketch of this update follows.
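The sketch below shows the policy-gradient (REINFORCE) update of step 4, reusing the hypothetical Controller.sample() interface sketched earlier. The reward definition (the negated evaluation index, since lower is better) and the moving-average baseline are assumptions the patent does not state.

```python
def controller_step(controller, ctrl_optimizer, evaluate, baseline=None, beta=0.9):
    """Step 4 sketch: one REINFORCE update of the controller from one sub-model
    evaluation. `evaluate(code)` is assumed to run 3.2-3.3 and return the final
    evaluation index as a float (lower is better)."""
    code, log_prob = controller.sample()          # 3.1: sample a sub-model code
    reward = -evaluate(code)                      # lower index -> higher reward
    # Exponential moving-average baseline to reduce variance (assumption)
    baseline = reward if baseline is None else beta * baseline + (1 - beta) * reward
    loss = -(reward - baseline) * log_prob        # policy gradient objective
    ctrl_optimizer.zero_grad()
    loss.backward()
    ctrl_optimizer.step()
    return baseline
```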
Step 5, use the controller model trained in step 4 to output n sub-models (preferably no more than 10), and select the sub-model with the best pose prediction evaluation index (i.e., the smallest evaluation value) as the final search result.
As shown in FIGS. 6-7, the average pose estimation errors of the present invention (Avg terr and Avg rerr) are lower than those of other advanced network models, i.e., its performance is better.
The foregoing description covers the preferred embodiments of the invention and is not intended to limit the invention to the precise form disclosed; any modifications, equivalent substitutions, and alternatives falling within the spirit and scope of the invention are intended to be included within its scope.
Claims (8)
1. An automatic neural network design method, characterized by comprising a super-network structure and a controller model, wherein the super-network structure comprises a Stem module, a Convolution Block module, a Reduction Block module, a Sequential Block module and a Tail module; the Stem module is used for processing the input, which is two stacked RGB images; the Tail module is used for processing temporal information; the Convolution Block module comprises a parallel combination of convolution operations with different parameters; the Reduction Block module comprises a parallel combination of convolution-based downsampling operations; and the Sequential Block module is used for integrating the temporal information of the input; the automatic neural network design method comprises the following steps:
s1, preparing a video sequence containing video data and real camera pose data;
s2, extracting a video segment V1 from the video sequence of the S1, forming training batch data by the video segment V1, uniformly sampling each block operation of the super-network structure, selecting the operation of the training batch, forming a path after the selection is completed, wherein the path is a sub-network model, sequentially inputting two adjacent frames of images in the V1 into the sub-network model according to time sequence to obtain a pose sequence between the image frames, calculating an error by using a loss function, and updating network parameters until the loss function is not reduced;
s3, outputting operands selected by each block by using a controller model to generate codes of sub-models, extracting video segments v2 and corresponding real camera pose data from the video sequence of the S1 according to video time sequence by adopting the network parameters iterated in the S2, inputting v2 into the sub-models to obtain predicted poses, comparing the predicted poses with the real camera poses, calculating to obtain segment evaluation indexes, repeating the operation of the S3 until the complete video sequence is extracted, and calculating the segment evaluation indexes to obtain final evaluation indexes of the sub-models;
s4, carrying out parameter updating on the controller model parameters by utilizing the final evaluation index of the submodel obtained in the S3, and repeating the S3 until the set iteration number is reached or the performance of the submodel is not improved;
s5, outputting n submodels by using a controller, and picking out the submodel with the best final evaluation index as a final output result.
2. The automatic neural network design method according to claim 1, wherein in step S2 the loss function is:
where x is the Euclidean distance between the predicted pose and the real camera pose, and a and c are parameters controlling the loss.
3. The automatic neural network design method according to claim 1, wherein the video sequence of S1 is divided into a training set and a verification set; S2 uses the training set and S3 uses the verification set.
4. The automatic neural network design method according to claim 1, wherein the final evaluation index of the sub-model in S3 is calculated by averaging all segment evaluation indexes.
5. The automatic neural network design method according to claim 1, wherein in S3 the final evaluation index is recorded.
6. The automatic neural network design method according to claim 1, wherein n in S5 is an integer no greater than 10.
7. An automatic neural network design system for implementing the automatic neural network design method according to claim 1, comprising a super-network structure and a controller model, wherein the super-network structure comprises a Stem module, a Convolution Block module, a Reduction Block module, a Sequential Block module and a Tail module; the Stem module is used for processing the input, which is two stacked RGB images; the Tail module is used for processing temporal information; the Convolution Block module comprises a parallel combination of convolution operations with different parameters; the Reduction Block module comprises a parallel combination of convolution-based downsampling operations; and the Sequential Block module is used for integrating the temporal information of the input.
8. The automatic neural network design system of claim 7, wherein the Sequential Block module employs one of 4 operations: ConvLSTMs 3x3, ConvLSTMs 5x5, ConvGRUs 3x3, or ConvGRUs 5x5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010818278.5A CN112052626B (en) | 2020-08-14 | 2020-08-14 | Automatic design system and method for neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010818278.5A CN112052626B (en) | 2020-08-14 | 2020-08-14 | Automatic design system and method for neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112052626A CN112052626A (en) | 2020-12-08 |
CN112052626B (en) | 2024-01-19
Family
ID=73600419
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010818278.5A Active CN112052626B (en) | 2020-08-14 | 2020-08-14 | Automatic design system and method for neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112052626B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114492570A (en) * | 2021-12-21 | 2022-05-13 | 绍兴市北大信息技术科创中心 | Key point extraction network construction method and system of neural network architecture |
- 2020-08-14 CN application CN202010818278.5A filed; patent CN112052626B, status: Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018052875A1 (en) * | 2016-09-15 | 2018-03-22 | Google Llc | Image depth prediction neural networks |
WO2019213459A1 (en) * | 2018-05-04 | 2019-11-07 | Northeastern University | System and method for generating image landmarks |
CN111028282A (en) * | 2019-11-29 | 2020-04-17 | 浙江省北大信息技术高等研究院 | Unsupervised pose and depth calculation method and system |
CN111182292A (en) * | 2020-01-05 | 2020-05-19 | 西安电子科技大学 | No-reference video quality evaluation method and system, video receiver and intelligent terminal |
CN111369608A (en) * | 2020-05-29 | 2020-07-03 | 南京晓庄学院 | Visual odometer method based on image depth estimation |
Non-Patent Citations (1)
Title |
---|
Residual neural networks and their applications in medical image processing; Zhou Tao; Huo Bingqiang; Lu Huiling; Ren Hailing; Acta Electronica Sinica (07); full text *
Also Published As
Publication number | Publication date |
---|---|
CN112052626A (en) | 2020-12-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||