WO2024032010A1 - Real-time few-shot object detection method based on a transfer learning strategy - Google Patents
Real-time few-shot object detection method based on a transfer learning strategy
- Publication number
- WO2024032010A1 (application PCT/CN2023/086781)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- few
- detection
- sample
- model
- training
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Definitions
- the invention belongs to the field of image processing and relates to a real-time detection method of few-sample targets based on a transfer learning strategy.
- Object detection is one of the most important and fundamental tasks in computer vision.
- CNN Convolutional Neural Network
- visual Transformer with high detection performance.
- the excellent detection performance of these models is achieved at the expense of large amounts of data. Due to the complexity of the object and the large number of model parameters, the detection accuracy will drop rapidly when the amount of data is limited. Therefore, few-shot target detection has received more and more attention in recent years.
- the purpose of the method based on meta-learning strategy is to obtain the correlation between the current image and the few samples.
- although detection performance on few-sample categories has improved, the feature extraction structure in the few-sample detection branch, the structure that relates input features to few-sample features, and the number of few-sample categories greatly increase the computational complexity of the model.
- the purpose of the method based on the transfer learning strategy is to enable the detection model that already has feature representation capabilities to be well adapted to the few-sample target.
- the purpose of the present invention is to provide a dual-path combined real-time target detection model based on the transfer learning strategy, using Darknet-53 combined with Spatial Pyramid Pooling (SPP) as the backbone and a Feature Pyramid Network (FPN) as the neck, which respectively extract image features and provide semantic features at different scales.
- SPP Spatial Pyramid Pooling
- FPN Feature Pyramid Network
- the large-sample category detection branch is only used to detect large-sample category objects, while the few-sample category detection branch is used to detect all categories of objects.
- after the two branches output their detection results in parallel, the discriminator scans both results and outputs the more appropriate of the two based on a metric criterion.
- the main reason for using the dual-path combination structure is that when the model is trained on a small number of samples, the detection accuracy of objects in the large sample category will degrade, and the few sample detection branch will have false positive bounding boxes that actually belong to the large sample category.
- the few-sample detection branch also learns the prediction differences of large-sample categories from the large-sample detection branch through knowledge distillation, thereby improving the generalization ability of the detection branch.
- the present invention proposes a feature-response-based Attentive DropBlock regularization method, which guides the model to focus on the overall characteristics of the target, avoids domination by locally salient features, and enhances the generalization ability of the model.
- a real-time detection method of few-sample targets based on transfer learning strategy including the following steps:
- S4: fine-tune the few-sample category detection branch on the few-sample category data; use a new regularization method to guide the model to focus on the overall characteristics of the object during fine-tuning;
- the detection network model includes: a backbone network, Darknet-53 combined with Spatial Pyramid Pooling (SPP), used to extract image features; a detection neck network, composed of a Feature Pyramid Network (FPN), used to provide semantic features at different scales to the detection head network; and a detection head network, a dual-path detection branch structure with a discriminator, in which the large-sample category detection branch detects only targets of the large-sample categories, the few-sample category detection branch detects targets of all categories, and the discriminator scans the results of the two branches in sequence and obtains the final output according to a measurement criterion.
- in step S2, the limited data are processed using random affine transformation, a multi-scale image training strategy, the MixUp data fusion strategy and the Label Smoothing label processing strategy.
- in step S3, the backbone network is initialized with weights trained on the ImageNet data set, and the network model except for the few-sample detection branch is trained from scratch using large-sample category data.
- $L_{box}$ is the additive combination of the GIoU loss function and the smooth L1 loss for coordinate regression;
- $L_{cls}$ and $L_{obj}$ are the Focal Loss function and the binary cross-entropy loss function, respectively.
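The box-regression term described above can be illustrated with a minimal sketch. It assumes boxes in `(x1, y1, x2, y2)` form with positive area and an equal weighting of the two terms; the function names and the `beta` parameter are illustrative and do not come from the patent text.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # Area of the smallest axis-aligned box enclosing both inputs.
    hull = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou - (hull - union) / hull


def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss averaged over the coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)


def box_loss(pred, target):
    """Additive combination (1 - GIoU) + smooth L1; equal weights are an assumption."""
    return (1.0 - giou(pred, target)) + smooth_l1(pred, target)
```

A perfectly predicted box gives a loss of zero, while two disjoint boxes give a GIoU below zero, so the first term exceeds 1.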
- in step S4, the model parameters of the backbone, the detection neck and the large-sample category detection branch are frozen, and only the few-sample category detection branch is fine-tuned.
- the loss function at this stage involves the coordinates of the predicted bounding box, the target confidence, the classification results and the difference from the large-sample category detection branch.
- step S4 specifically includes the following steps:
- N represents the batch size
- l represents the absolute error function
- a weighting coefficient controls the impact of the base class distillation loss on the model's gradient update
- $O_d(i, j)$ represents the discriminator output at a specific spatial grid cell.
- the new regularization method is the Attentive DropBlock algorithm, which has a dynamic coefficient γ, as shown below:
- the parameters keep_prob and block_size affect how often and over what range the feature map is set to zero
- σ represents the sigmoid function, which is used to control the response range
- a scalar coefficient serves as the response amplification factor
- the Attentive DropBlock algorithm first determines whether the model is currently in the fine-tuning stage. If so, it obtains the channel response $f_C$ and the spatial response $f_S$ of the few-sample category detection branch; it then calculates the parameter γ from keep_prob, block_size and the response amplification factor. Next, spatial positions in each channel feature are set to zero according to a Bernoulli distribution with parameter γ; finally, a mask block with side length block_size is constructed around each zeroed position, thereby regularizing the model.
- step S5 train and test on the PASCAL VOC and MS COCO data sets
- the training set and the verification set are first merged into one set for training the detection model, and then the test set is selected for testing.
- the test evaluation standard uses the mean Average Precision (mAP) at an Intersection over Union (IoU) threshold of 0.5, i.e. mAP@50, and the mean Frames Per Second (mFPS) over multiple different few-sample sets to represent the detection accuracy and speed of the detection model;
- mAP i.e. AP
- FPS frames per second
- in step S5, stochastic gradient descent is used as the optimization method of the network model, the initial learning rate is $1 \times 10^{-3}$, and the mini-batch size is set to 16 on the different data sets; for the PASCAL VOC and MS COCO data sets, the number of training rounds for both initial training and fine-tuning of the detection model is 300, and the CosineLR learning rate schedule (from 0.001 to 0.00001) is used during training; during prediction, the input image size is fixed at 448 × 448; FPS is obtained from the sum of the time spent waiting for each result and the time spent post-processing the results, and mFPS is the average FPS over the different few-sample sets.
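As a rough illustration of the speed metric, the sketch below divides the number of processed images by the summed waiting and post-processing time, then averages the result over few-sample sets. The exact timing protocol is not given in the text, so this reading, and the function and argument names, are assumptions.

```python
def fps(wait_times, post_times):
    """Frames per second from per-image wait and post-processing times (seconds)."""
    total = sum(wait_times) + sum(post_times)
    return len(wait_times) / total


def mean_fps(per_set_fps):
    """mFPS: average of the FPS values measured on each few-sample set."""
    return sum(per_set_fps) / len(per_set_fps)
```

For example, two images that each take 20 ms of waiting and 5 ms of post-processing give about 40 frames per second.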
- the present invention proposes an Attentive DropBlock regularization method based on feature response to guide the model to pay attention to the overall characteristics of the object, avoid over-fitting in the fine-tuning stage, avoid being dominated by locally salient features, and enhance the generalization ability of the model. The present invention can therefore not only detect few-sample category objects accurately with fewer model parameters, but also detect the relevant targets in real time.
- Figure 1 is an overall flow chart of the model proposed by the present invention.
- Figure 2 is a visual comparison chart of DropBlock algorithm and Attentive DropBlock algorithm
- Figure 3 is a diagram showing the visual detection results of large sample and small sample category objects by the model proposed by the present invention.
- Figure 4 shows the response to the target and the visual detection results of the large-sample category detection branch and the few-sample category detection branch of the model proposed by the present invention.
- a real-time detection method of few-sample targets based on transfer learning strategy includes the following steps:
- the S1 specifically includes the following steps:
- the multi-scale image training strategy (320, 352, 384, 416, 448, 480, 512, 544, 576 and 608), the MixUp data fusion strategy and the Label Smoothing label processing strategy are used to process the limited data, thereby improving the generalization performance of the detection model.
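The three strategies named above can be sketched in a simplified, classification-style form. This is an assumption-laden illustration: images are flat pixel lists, labels are class-probability vectors, and the helper names, `eps` and `lam` are hypothetical. A detection pipeline would also have to merge the bounding-box lists of the two mixed images, which is omitted here.

```python
import random


def smooth_labels(one_hot, eps=0.1):
    """Label smoothing: move eps of the probability mass to a uniform distribution."""
    k = len(one_hot)
    return [(1.0 - eps) * y + eps / k for y in one_hot]


def mixup(image_a, image_b, labels_a, labels_b, lam):
    """Blend two equally sized images (flat pixel lists) and their label vectors."""
    image = [lam * a + (1.0 - lam) * b for a, b in zip(image_a, image_b)]
    labels = [lam * a + (1.0 - lam) * b for a, b in zip(labels_a, labels_b)]
    return image, labels


def pick_train_size(rng, sizes=(320, 352, 384, 416, 448, 480, 512, 544, 576, 608)):
    """Multi-scale training: draw one input resolution, e.g. once per batch."""
    return rng.choice(sizes)
```

In practice the mixing weight `lam` is often drawn from a Beta distribution, for instance with `random.Random(0).betavariate(1.5, 1.5)`; the distribution parameters here are illustrative only.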
- $L_{box}$ is the additive combination of the GIoU loss function for coordinate regression and the smooth L1 loss.
- $L_{cls}$ and $L_{obj}$ are the Focal Loss function and the binary cross-entropy loss function, respectively.
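For reference, the two classification-side losses named above have the following standard per-prediction forms. This is a generic sketch for a single predicted probability; the default values of `alpha` and `focusing` are the commonly used ones and are not stated in the patent text. The parameter is called `focusing` to avoid confusion with the DropBlock coefficient γ.

```python
import math


def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one predicted probability p and a target y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def focal_loss(p, y, alpha=0.25, focusing=2.0, eps=1e-7):
    """Focal loss: cross-entropy down-weighted for well-classified examples."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** focusing * math.log(p_t)
```

With `alpha=1.0` and `focusing=0.0` the focal loss of a positive example reduces to the plain cross-entropy, which is a quick sanity check on the implementation.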
- the backbone, the detection neck and the large-sample detection branch are frozen to maintain strong generalization ability, and only the few-sample detection branch, the SPP layer and its adjacent convolutional layers are trained.
- many false positive bounding boxes are generated, resulting in low detection accuracy due to the similarity between objects in the two classes. Therefore, we randomly sample K instances from the corresponding data for each large-sample category, so that the few-shot detection branch predicts all categories of objects.
- the large-sample category detection branch has strong generalization ability, and the few-sample detection branch should learn from it to obtain better generalization ability. Therefore, we establish the base class distillation loss $L_b$ between the two branches, and the calculation formula is as follows:
- N represents the batch size.
- a weighting coefficient controls the impact of the base class distillation loss on the model's gradient update.
- $O_d(i, j)$ represents the discriminator output at a specific spatial grid cell.
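The formula itself is not reproduced in this text, so the following is only one plausible reading of the listed ingredients: an absolute error between the two branches' large-sample-class outputs, weighted per grid cell by the discriminator output, averaged over the batch and scaled by a weighting coefficient. The function name, argument layout and the use of $O_d(i, j)$ as a per-cell weight are all assumptions.

```python
def base_distillation_loss(few_preds, large_preds, disc_mask, weight=1.0):
    """Weighted mean absolute error between the two branches' base-class outputs.

    few_preds, large_preds: [batch][rows][cols] scores for the large-sample classes.
    disc_mask: [rows][cols] values of O_d(i, j), used here as a per-cell weight (assumed).
    """
    n = len(few_preds)
    total = 0.0
    for few, large in zip(few_preds, large_preds):
        for i, row in enumerate(few):
            for j, value in enumerate(row):
                total += disc_mask[i][j] * abs(value - large[i][j])
    return weight * total / n
```

Under this reading the loss is zero when the few-sample branch reproduces the large-sample branch exactly, and cells where the discriminator output is zero contribute nothing.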
- the present invention proposes an Attentive DropBlock algorithm. This algorithm is affected not only by the parameters keep_prob and block_size, but also by the response of the model's semantic features.
- the DropBlock algorithm sets a constant coefficient for all locations within the feature map, as follows:
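For context, the constant coefficient of the original DropBlock method (Ghiasi et al., listed under Non-Patent Citations) is $\gamma = \frac{1 - keep\_prob}{block\_size^2} \cdot \frac{feat\_size^2}{(feat\_size - block\_size + 1)^2}$. The short sketch below computes it for a square feature map; the function name is illustrative.

```python
def dropblock_gamma(keep_prob, block_size, feat_size):
    """Constant DropBlock seed probability for a square feat_size x feat_size map."""
    valid = feat_size - block_size + 1  # positions where a full block fits
    return (1.0 - keep_prob) / block_size ** 2 * feat_size ** 2 / valid ** 2
```

With `block_size=1` the expression reduces to ordinary dropout with rate `1 - keep_prob`.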
- γ is a dynamic coefficient that relies on the feature map response extracted in the Attentive DropBlock algorithm.
- for a feature map $F \in \mathbb{R}^{B \times C \times H \times W}$, the global max pooling function is applied to each channel feature to obtain the response $f_C \in \mathbb{R}^{B \times C \times 1 \times 1}$
- the global average pooling function yields the response $f_S \in \mathbb{R}^{B \times 1 \times H \times W}$. Therefore, the calculation formula of γ in the Attentive DropBlock algorithm is as follows:
- σ represents the sigmoid function used to control the response range
- a scalar coefficient serves as the response amplification factor
- the Attentive DropBlock algorithm first determines whether the model is currently in the fine-tuning stage. If so, it obtains the channel response $f_C$ and the spatial response $f_S$ of the few-sample category detection branch. After γ is calculated from the two responses, keep_prob, block_size and the response amplification factor, spatial positions in each channel feature are set to zero according to a Bernoulli distribution with parameter γ. Finally, a mask block with side length block_size is constructed around each zeroed position to regularize the model.
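The procedure above can be sketched for a single image as follows. The exact γ formula is not reproduced in this text, so the form used here (a DropBlock-style base rate multiplied by the sigmoid of the amplified product of $f_C$ and $f_S$) is an assumption, as are the function name, the `amplify` argument, the omission of DropBlock's feature-size correction and activation rescaling, and the restriction to odd `block_size`.

```python
import math
import random


def attentive_dropblock(feature, keep_prob, block_size, amplify, rng, fine_tuning=True):
    """Zero square blocks of a [C][H][W] feature map with response-dependent seeds."""
    if not fine_tuning:
        return feature
    channels, height, width = len(feature), len(feature[0]), len(feature[0][0])
    # Channel response f_C: global max over each channel.
    f_c = [max(max(row) for row in ch) for ch in feature]
    # Spatial response f_S: average over channels at each position.
    f_s = [[sum(feature[c][i][j] for c in range(channels)) / channels
            for j in range(width)] for i in range(height)]
    base = (1.0 - keep_prob) / block_size ** 2  # size-correction term omitted
    half = block_size // 2
    out = [[row[:] for row in ch] for ch in feature]
    for c in range(channels):
        for i in range(height):
            for j in range(width):
                # Assumed form: the sigmoid of the amplified response scales the base rate.
                gamma = base / (1.0 + math.exp(-amplify * f_c[c] * f_s[i][j]))
                if rng.random() < gamma:  # Bernoulli draw with parameter gamma
                    for y in range(max(0, i - half), min(height, i + half + 1)):
                        for x in range(max(0, j - half), min(width, j + half + 1)):
                            out[c][y][x] = 0.0
    return out
```

Under this assumed form, positions with a stronger response receive a larger γ and are therefore more likely to seed a dropped block, which matches the stated goal of steering the model away from locally dominant features.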
- Figure 2 shows the difference between DropBlock and Attentive DropBlock. It can be observed that the γ value in Attentive DropBlock is related to the target response. Feature maps that contain more target response have higher γ values, which means that the detection model can better avoid being dominated by locally obvious features and pay more attention to less obvious features during training, thereby obtaining better few-sample target detection accuracy.
- in S5, for the PASCAL VOC data set, three different data combinations are constructed in which 15 categories are large-sample categories and the remaining 5 are few-sample categories (the first few-sample set includes birds, buses, cows, motorcycles and sofas; the second includes airplanes, bottles, cows, horses and sofas; the third includes boats, cats, motorcycles, sheep and sofas); for the MS COCO data set, the 20 categories shared with the PASCAL VOC data set are few-sample categories, and the remaining 60 categories are large-sample categories.
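The three PASCAL VOC splits described above can be written down as a small configuration. The class strings follow the standard PASCAL VOC label names (for example `motorbike` for motorcycles and `aeroplane` for airplanes); the variable and function names are illustrative.

```python
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]

# Few-sample categories of the three splits listed in the text.
FEW_SAMPLE_SPLITS = {
    1: ["bird", "bus", "cow", "motorbike", "sofa"],
    2: ["aeroplane", "bottle", "cow", "horse", "sofa"],
    3: ["boat", "cat", "motorbike", "sheep", "sofa"],
}


def large_sample_classes(split):
    """Return the 15 large-sample categories that remain for the given split."""
    few = set(FEW_SAMPLE_SPLITS[split])
    return [name for name in VOC_CLASSES if name not in few]
```

Each split leaves 15 large-sample categories for the initial training stage.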
- the present invention uses stochastic gradient descent as the optimization method of the network model, with an initial learning rate of $1 \times 10^{-3}$ and a mini-batch size of 16 on the different data sets. For both data sets, the number of training rounds for training from scratch and for fine-tuning is 300, and the CosineLR learning rate schedule (from 0.001 to 0.00001) is used during training.
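A cosine schedule that decays from 0.001 to 0.00001 over 300 rounds can be sketched as below. The text names only the schedule and its end points, so the per-round update granularity and the absence of warm-up are assumptions, and the function name is illustrative.

```python
import math


def cosine_lr(step, total_steps, lr_max=1e-3, lr_min=1e-5):
    """Cosine decay from lr_max at step 0 to lr_min at the last step."""
    progress = step / max(1, total_steps - 1)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))
```

Calling `cosine_lr(r, 300)` for `r` in `range(300)` yields a learning rate that starts at the maximum, falls slowly at first, fastest in the middle, and flattens out at the minimum.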
- the length and width of the input image are fixed at 448 × 448.
- the present invention compares the detection accuracy and detection speed of various few-sample target detection models proposed in recent years on the PASCAL VOC 2007 and MS COCO 2014 data sets.
- the detection model of the present invention was evaluated on the challenging PASCAL VOC 2007 and MS COCO 2014 data sets according to the evaluation criteria specified in the PASCAL VOC and MS COCO data.
- both benchmark data sets contain training, validation and test sets.
- the PASCAL VOC 2007 data set contains 20 target categories
- the MS COCO 2014 data set contains 80 categories.
- the present invention first combines the PASCAL VOC 2007 and PASCAL VOC 2012 training sets and verification sets into one set for training the detection model, and selects the PASCAL VOC 2007 test set for testing.
- the test evaluation standard uses the mean Average Precision (mAP) at an Intersection over Union (IoU) threshold of 0.5 (i.e. mAP@50) and the mean Frames Per Second (mFPS) over multiple different few-sample sets to represent the detection accuracy and speed of the detection model.
- mAP mean Average Precision
- IoU Intersection over Union
- mFPS mean Frames Per Second
- the present invention uses only the MS COCO 2014 training set for training and its validation set for evaluation in the test phase; the mAP averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05 (i.e. AP) and the Frames Per Second (FPS) represent the detection accuracy and speed of the detection model.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Image Analysis (AREA)
Abstract
The present invention belongs to the field of image processing and relates to a real-time few-shot object detection method based on a transfer learning strategy, comprising the following steps: S1: constructing a detection network model; S2: preprocessing input data; S3: training an object detection model from scratch using large-sample class data; S4: fine-tuning a few-shot class detection branch using few-shot class data, and during fine-tuning using a new regularization method to guide the model to pay attention to the overall characteristics of an object; and S5: training the detection model with a training set and testing it with a test set. The present invention avoids over-fitting of the model in the fine-tuning stage, avoids domination by locally salient features, and improves the generalization ability of the model. The present invention can not only achieve accurate detection of few-shot class objects with fewer model parameters, but can also achieve real-time detection of the related objects.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210962295.5 | 2022-08-11 | ||
CN202210962295.5A CN115393634B (zh) | 2022-08-11 | 2022-08-11 | A real-time few-sample object detection method based on a transfer learning strategy |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024032010A1 (fr) | 2024-02-15 |
Family
ID=84118843
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/086781 WO2024032010A1 (fr) | 2022-08-11 | 2023-04-07 | Real-time few-shot object detection method based on a transfer learning strategy |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115393634B (fr) |
WO (1) | WO2024032010A1 (fr) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117876823A (zh) * | 2024-03-11 | 2024-04-12 | 浙江甲骨文超级码科技股份有限公司 | A tea garden image detection method, and a model training method and system therefor |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
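CN115393634B (zh) * | 2022-08-11 | 2023-12-26 | Chongqing University of Posts and Telecommunications | A real-time few-sample object detection method based on a transfer learning strategy |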
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
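CN109615016A (zh) * | 2018-12-20 | 2019-04-12 | Beijing Institute of Technology | An object detection method using a convolutional neural network based on pyramid input gain |
CN110674866A (zh) * | 2019-09-23 | 2020-01-10 | Lanzhou University of Technology | Method for detecting X-ray breast lesion images with a transfer-learning feature pyramid network |
CN111223553A (zh) * | 2020-01-03 | 2020-06-02 | Dalian University of Technology | A two-stage deep transfer learning model for traditional Chinese medicine tongue diagnosis |
AU2020100705A4 (en) * | 2020-05-05 | 2020-06-18 | Chang, Jiaying Miss | A helmet detection method with lightweight backbone based on yolov3 network |
US20220067335A1 (en) * | 2020-08-26 | 2022-03-03 | Beijing University Of Civil Engineering And Architecture | Method for dim and small object detection based on discriminant feature of video satellite data |
CN114663729A (zh) * | 2022-03-29 | 2022-06-24 | Nanjing Institute of Technology | A meta-learning-based few-sample defect detection method for cylinder liners |
CN115393634A (zh) * | 2022-08-11 | 2022-11-25 | Chongqing University of Posts and Telecommunications | A real-time few-sample object detection method based on a transfer learning strategy |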
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
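CN110008842A (zh) * | 2019-03-09 | 2019-07-12 | Tongji University | A pedestrian re-identification method based on a deep multi-loss fusion model |
CN109977812B (zh) * | 2019-03-12 | 2023-02-24 | Nanjing University of Posts and Telecommunications | A deep-learning-based object detection method for vehicle-mounted video |
CN113971815A (zh) * | 2021-10-28 | 2022-01-25 | Xidian University | Few-sample object detection method based on singular value decomposition feature enhancement |
CN114841257B (zh) * | 2022-04-21 | 2023-09-22 | Beijing Jiaotong University | A few-sample object detection method under self-supervised contrastive constraints |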
-
2022
- 2022-08-11 CN CN202210962295.5A patent/CN115393634B/zh active Active
-
2023
- 2023-04-07 WO PCT/CN2023/086781 patent/WO2024032010A1/fr unknown
Non-Patent Citations (2)
Title |
---|
GHIASI GOLNAZ, TSUNG-YI LIN, LE QUOC V: "Dropblock: A regularization method for convolutional networks", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, ARXIV.ORG, ITHACA, 30 October 2018 (2018-10-30), Ithaca, XP093137589, [retrieved on 20240305], DOI: 10.48550/arXiv.1810.12890 * |
XIA RUIYANG; LI GUOQUAN; HUANG ZHENGWEN; MENG HONGYING; PANG YU: "Bi-path Combination YOLO for Real-time Few-shot Object Detection", PATTERN RECOGNITION LETTERS., ELSEVIER, AMSTERDAM., NL, vol. 165, 1 December 2022 (2022-12-01), NL , pages 91 - 97, XP087247996, ISSN: 0167-8655, DOI: 10.1016/j.patrec.2022.11.025 * |
Also Published As
Publication number | Publication date |
---|---|
CN115393634A (zh) | 2022-11-25 |
CN115393634B (zh) | 2023-12-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23851240 Country of ref document: EP Kind code of ref document: A1 |