WO2024032010A1 - Real-time few-shot object detection method based on a transfer learning strategy

Info

Publication number
WO2024032010A1
Authority
WO
WIPO (PCT)
Prior art keywords
few
detection
sample
model
training
Prior art date
Application number
PCT/CN2023/086781
Other languages
English (en)
Chinese (zh)
Inventor
李国权
夏瑞阳
林金朝
庞宇
朱宏钰
Original Assignee
重庆邮电大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 重庆邮电大学
Publication of WO2024032010A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Definitions

  • The invention belongs to the field of image processing and relates to a real-time few-shot object detection method based on a transfer learning strategy.
  • Object detection is one of the most important and fundamental tasks in computer vision.
  • Detection models based on the Convolutional Neural Network (CNN) and the visual Transformer achieve high detection performance.
  • The excellent detection performance of these models relies on large amounts of training data. Because objects are complex and the models have many parameters, detection accuracy drops rapidly when the amount of data is limited. Few-shot object detection has therefore received increasing attention in recent years.
  • the purpose of the method based on meta-learning strategy is to obtain the correlation between the current image and the few samples.
  • Although these methods improve detection performance on few-sample categories, the feature extraction structure in the few-sample detection branch, the structure that relates input features to few-sample features, and the number of few-sample categories greatly increase the computational complexity of the model.
  • the purpose of the method based on the transfer learning strategy is to enable the detection model that already has feature representation capabilities to be well adapted to the few-sample target.
  • The purpose of the present invention is to provide a dual-path combined real-time object detection model based on the transfer learning strategy. Darknet-53 combined with Spatial Pyramid Pooling (SPP) serves as the backbone and extracts image features, and a Feature Pyramid Network (FPN) serves as the neck and provides semantic features at different scales.
  • SPP: Spatial Pyramid Pooling
  • FPN: Feature Pyramid Network
  • the large-sample category detection branch is only used to detect large-sample category objects, while the few-sample category detection branch is used to detect all categories of objects.
  • After the two branches output their detection results in parallel, the discriminator scans both results and, based on a metric criterion, outputs the more appropriate of the two.
  • the main reason for using the dual-path combination structure is that when the model is trained on a small number of samples, the detection accuracy of objects in the large sample category will degrade, and the few sample detection branch will have false positive bounding boxes that actually belong to the large sample category.
  • the few-sample detection branch also learns the prediction differences of large-sample categories from the large-sample detection branch through knowledge distillation, thereby improving the generalization ability of the detection branch.
  • The present invention proposes Attentive DropBlock, a regularization method based on feature response, which guides the model to focus on the overall characteristics of the target, avoids domination by locally salient features, and enhances the generalization ability of the model.
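  • A real-time few-shot object detection method based on a transfer learning strategy, including the following steps:
  • S1: Construct a detection network model;
  • S2: Preprocess the input data;
  • S3: Train the object detection model from scratch using large-sample category data;
  • S4: Fine-tune the few-sample category detection branch on the few-sample category data; during fine-tuning, use a new regularization method to guide the model to focus on the overall characteristics of the object;
  • S5: Train the detection model using the training set, and test it using the test set.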
  • The detection network model includes a backbone network, a detection neck network and a detection head network. The backbone network is Darknet-53 combined with Spatial Pyramid Pooling (SPP) and extracts image features. The detection neck network is a Feature Pyramid Network (FPN) that provides semantic features at different scales to the detection head network. The detection head network is a dual-path detection branch structure with a discriminator: the large-sample category detection branch detects only targets of the large-sample categories, the few-sample category detection branch detects targets of all categories, and the discriminator scans the results of the two branches in turn and produces the final output according to a metric criterion.
  • In step S2, the limited data is processed using random affine transformation, a multi-scale image training strategy, the MixUp data fusion strategy and the Label Smoothing label processing strategy.
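The MixUp and Label Smoothing strategies named in step S2 can be illustrated with a minimal sketch. The function names, the Beta(alpha, alpha) sampling of the mixing weight, the smoothing value `eps=0.1` and the nested-list image format are assumptions made for illustration; the patent text does not state these details.

```python
import random


def sample_lambda(alpha=1.0, rng=random):
    """Draw a mixing weight from Beta(alpha, alpha) (assumed, as in the MixUp paper)."""
    return rng.betavariate(alpha, alpha)


def mixup(image_a, image_b, lam):
    """Blend two equally sized images (nested lists [H][W]) pixel-wise with weight lam."""
    return [[lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(image_a, image_b)]


def smooth_labels(one_hot, eps=0.1):
    """Move eps of the probability mass from the true class to a uniform prior."""
    k = len(one_hot)
    return [y * (1.0 - eps) + eps / k for y in one_hot]


# Usage: blend two tiny 2x2 "images" and smooth a 4-class one-hot label.
img_a = [[0.0, 1.0], [1.0, 0.0]]
img_b = [[1.0, 1.0], [0.0, 0.0]]
mixed = mixup(img_a, img_b, lam=0.25)
soft = smooth_labels([0.0, 1.0, 0.0, 0.0], eps=0.1)
```

In detection pipelines the boxes of both source images are usually kept for the blended image; whether the patent does this is not stated.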
  • In step S3, the backbone network is initialized with weights pretrained on the ImageNet data set, and the network model, excluding the few-sample detection branch, is trained from scratch using large-sample category data.
  • $L_{box}$ is the sum of the GIoU loss and the smooth L1 loss for coordinate regression;
  • $L_{cls}$ and $L_{obj}$ are the Focal Loss function and the binary cross-entropy loss function, respectively.
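The box regression term described above, a sum of a GIoU loss and a smooth L1 loss, can be sketched as follows. The `(x1, y1, x2, y2)` box format, the equal weighting of the two terms, the mean reduction and `beta=1.0` are assumptions; the patent only states that the two losses are added.

```python
def giou_loss(box_p, box_t):
    """GIoU loss (1 - GIoU) for two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = box_p
    tx1, ty1, tx2, ty2 = box_t
    area_p = max(0.0, px2 - px1) * max(0.0, py2 - py1)
    area_t = max(0.0, tx2 - tx1) * max(0.0, ty2 - ty1)
    # Intersection rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = area_p + area_t - inter
    # Smallest axis-aligned box enclosing both boxes.
    enclose = (max(px2, tx2) - min(px1, tx1)) * (max(py2, ty2) - min(py1, ty1))
    if union <= 0.0 or enclose <= 0.0:
        return 1.0  # degenerate boxes: treated as "no overlap" (assumption)
    giou = inter / union - (enclose - union) / enclose
    return 1.0 - giou


def smooth_l1(pred, target, beta=1.0):
    """Mean smooth L1 loss over the coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)


def box_loss(box_p, box_t):
    """Additive combination of the two regression losses (equal weights assumed)."""
    return giou_loss(box_p, box_t) + smooth_l1(box_p, box_t)
```

Identical boxes give a loss of zero, while disjoint boxes are still penalized in proportion to the empty area of their enclosing box, which is the property that distinguishes GIoU from plain IoU.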
  • In step S4, the model parameters of the backbone, the detection neck and the large-sample category detection branch are frozen, and only the few-sample category detection branch is fine-tuned.
  • The loss function at this stage covers the predicted box coordinates, the target confidence, the classification results and the difference from the large-sample category detection branch.
  • step S4 specifically includes the following steps:
  • $N$ represents the batch size
  • $l$ represents the absolute error function
  • a weighting coefficient controls the impact of the base class distillation loss on the model's gradient update
  • $O_d(i, j)$ represents the discriminator output at the spatial grid cell $(i, j)$.
  • the new regularization method is the Attentive DropBlock algorithm, which has a dynamic coefficient γ, as shown below:
  • the parameters keep_prob and block_size affect how often and over what range the feature map is set to zero
  • the sigmoid function is used to control the response range
  • the formula also includes a response amplification factor
  • the Attentive DropBlock algorithm first determines whether the model is currently in the fine-tuning stage. If it is, the channel response $f_C$ and the spatial response $f_S$ of the few-sample category detection branch are obtained. The parameter γ is then calculated from keep_prob, block_size and the response amplification factor. Next, spatial positions in each channel's feature map are set to zero according to a Bernoulli distribution with parameter γ. Finally, a mask block whose length and width equal block_size is constructed around each zeroed position, thereby regularizing the model.
  • In step S5, the model is trained and tested on the PASCAL VOC and MS COCO data sets.
  • The training set and the validation set are first merged into one set for training the detection model, and the test set is then used for testing.
  • For PASCAL VOC, the evaluation uses the mean Average Precision (mAP) at an Intersection over Union (IoU) threshold of 0.5 (i.e. mAP@50) and the mean Frames Per Second (mFPS) over multiple different few-sample sets to represent the detection accuracy and speed of the detection model;
  • for MS COCO, the mAP averaged over IoU thresholds from 0.5 to 0.95 (i.e. AP) and the Frames Per Second (FPS) are used.
  • In step S5, stochastic gradient descent is used to optimize the network model, with an initial learning rate of $1 \times 10^{-3}$ and a mini-batch size of 16 on both data sets. For the PASCAL VOC and MS COCO data sets, initial training and fine-tuning of the detection model each run for 300 epochs, and the CosineLR learning rate schedule (from 0.001 to 0.00001) is used during training. During prediction, the input image size is fixed at 448 × 448. FPS is computed from the sum of the time spent waiting for each result and the time spent post-processing it, and mFPS is the average FPS over the different few-sample sets.
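The CosineLR schedule mentioned above (from 0.001 down to 0.00001) can be sketched with the standard cosine annealing formula. Updating once per epoch, reading the 300 training rounds as epochs, and reaching the minimum exactly at the last epoch are assumptions; the function name `cosine_lr` is hypothetical.

```python
import math


def cosine_lr(epoch, total_epochs=300, lr_max=1e-3, lr_min=1e-5):
    """Cosine-annealed learning rate for a 0-based epoch index."""
    progress = epoch / (total_epochs - 1)  # 0.0 at the first epoch, 1.0 at the last
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


# Usage: the rate falls smoothly from lr_max at epoch 0 to lr_min at epoch 299.
schedule = [cosine_lr(e) for e in range(300)]
```

The schedule decays slowly at first, fastest around the midpoint and slowly again near the end, which keeps late updates small while the model converges.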
  • The present invention proposes an Attentive DropBlock regularization method based on feature response that guides the model to attend to the overall characteristics of the object, avoids overfitting in the fine-tuning stage, avoids domination by locally salient features, and enhances the generalization ability of the model. The present invention can therefore achieve accurate detection of few-sample category objects with fewer model parameters and also achieve real-time detection of the relevant targets.
  • Figure 1 is an overall flow chart of the model proposed by the present invention.
  • Figure 2 is a visual comparison chart of DropBlock algorithm and Attentive DropBlock algorithm
  • Figure 3 is a diagram showing the visual detection results of large sample and small sample category objects by the model proposed by the present invention.
  • Figure 4 shows the response to the target and the visual detection results of the large-sample category detection branch and the few-sample category detection branch of the model proposed by the present invention.
  • a real-time detection method of few-sample targets based on transfer learning strategy includes the following steps:
  • the S1 specifically includes the following steps:
  • Random affine transformation, a multi-scale image training strategy (320, 352, 384, 416, 448, 480, 512, 544, 576 and 608), the MixUp data fusion strategy and the Label Smoothing label processing strategy are used to process the limited data, thereby improving the generalization performance of the detection model.
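The ten training resolutions listed above are the multiples of 32 from 320 to 608, which matches the largest downsampling stride of a Darknet-53 (YOLOv3-style) detector. A minimal sketch of drawing a resolution at random follows; how often the size changes (here, once per call, for example once per batch) is an assumption, and `pick_train_size` is a hypothetical helper.

```python
import random

# The ten input sizes named in the text: 320, 352, ..., 608 (step of 32).
TRAIN_SIZES = list(range(320, 609, 32))


def pick_train_size(rng=random):
    """Draw one square input resolution for the next training batch."""
    return rng.choice(TRAIN_SIZES)


# Usage: choose a resolution, then resize the batch to (size, size).
size = pick_train_size()
```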
  • $L_{box}$ is the sum of the GIoU loss and the smooth L1 loss for coordinate regression.
  • $L_{cls}$ and $L_{obj}$ are the Focal Loss function and the binary cross-entropy loss function, respectively.
  • The backbone, the detection neck and the large-sample detection branch are frozen to maintain strong generalization ability, and only the few-sample detection branch, the SPP layer and its adjacent convolutional layers are trained.
  • many false positive bounding boxes are generated, resulting in low detection accuracy due to the similarity between objects in the two classes. Therefore, we randomly sample K instances from the corresponding data for each large-sample category, so that the few-shot detection branch predicts all categories of objects.
  • The large-sample category detection branch has strong generalization ability, so the few-sample detection branch should learn from it to obtain better generalization ability. Therefore, we establish the base class distillation loss $L_b$ between the two branches, calculated as follows:
  • $N$ represents the batch size.
  • a weighting coefficient controls the impact of the base class distillation loss on the model's gradient update.
  • $O_d(i, j)$ represents the discriminator output at the spatial grid cell $(i, j)$.
  • The present invention proposes an Attentive DropBlock algorithm. This algorithm is affected not only by the parameters keep_prob and block_size but also by the response of the model's semantic features.
  • the DropBlock algorithm sets a constant coefficient for all locations within the feature map, as follows:
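For reference, the original DropBlock paper (Ghiasi et al., listed under the non-patent citations) defines this constant coefficient as shown here, where `feat_size` is the side length of the feature map. It is assumed, not confirmed by this text, that the patent uses the same definition.

```latex
\gamma = \frac{1 - \mathit{keep\_prob}}{\mathit{block\_size}^{2}}
         \cdot
         \frac{\mathit{feat\_size}^{2}}{\left(\mathit{feat\_size} - \mathit{block\_size} + 1\right)^{2}}
```

The first factor spreads the drop probability over the area of one block, and the second corrects for the smaller region in which block centers can be placed.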
  • In the Attentive DropBlock algorithm, γ is a dynamic coefficient that depends on the extracted feature map response.
  • For a feature map $F \in \mathbb{R}^{B \times C \times H \times W}$, global max pooling over each channel's features gives the channel response $f_C \in \mathbb{R}^{B \times C \times 1 \times 1}$,
  • and global average pooling gives the spatial response $f_S \in \mathbb{R}^{B \times 1 \times H \times W}$. The calculation formula of γ in the Attentive DropBlock algorithm is therefore as follows:
  • the sigmoid function is used to control the response range
  • the formula also includes a response amplification factor
  • The Attentive DropBlock algorithm first determines whether the model is currently in the fine-tuning stage. If it is, the channel response $f_C$ and the spatial response $f_S$ of the few-sample category detection branch are obtained. The parameter γ is then calculated from the two responses, keep_prob, block_size and the response amplification factor, and spatial positions in each channel's feature map are set to zero according to a Bernoulli distribution with parameter γ. Finally, a mask block whose length and width equal block_size is constructed around each zeroed position to regularize the model.
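The procedure above can be sketched in plain Python for a single sample. The patent's formula for the dynamic γ is not reproduced in this text, so the expression used below (the constant DropBlock coefficient scaled by twice the sigmoid of `alpha * f_C * f_S`) is a HYPOTHETICAL stand-in chosen only so that stronger responses give a higher γ. The name `alpha`, the nested-list layout `[C][H][W]`, the odd `block_size` and the omission of DropBlock's output rescaling are likewise assumptions.

```python
import math
import random


def sigmoid(x):
    x = max(-60.0, min(60.0, x))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))


def attentive_dropblock(feat, keep_prob=0.9, block_size=3, alpha=1.0,
                        fine_tuning=True, rng=random):
    """Return a masked copy of feat ([C][H][W]); block_size is assumed odd."""
    out = [[row[:] for row in channel] for channel in feat]
    if not fine_tuning:  # the method is only active during fine-tuning
        return out
    C, H, W = len(feat), len(feat[0]), len(feat[0][0])
    # Channel response f_C: global max pooling over each channel.
    f_c = [max(max(row) for row in channel) for channel in feat]
    # Spatial response f_S: average over channels at each position.
    f_s = [[sum(feat[c][h][w] for c in range(C)) / C for w in range(W)]
           for h in range(H)]
    # Constant DropBlock coefficient, using the shorter side as feat_size.
    side = min(H, W)
    base = ((1.0 - keep_prob) / block_size ** 2) * (
        side ** 2 / (side - block_size + 1) ** 2)
    half = block_size // 2
    for c in range(C):
        for h in range(H):
            for w in range(W):
                # HYPOTHETICAL dynamic gamma: grows with the feature response.
                gamma = min(1.0, base * 2.0 * sigmoid(alpha * f_c[c] * f_s[h][w]))
                if rng.random() < gamma:  # Bernoulli(gamma) block center
                    for i in range(max(0, h - half), min(H, h + half + 1)):
                        for j in range(max(0, w - half), min(W, w + half + 1)):
                            out[c][i][j] = 0.0
    return out
```

With `keep_prob=1.0` the coefficient is zero and the feature map passes through unchanged, which is a quick sanity check for any variant of the formula.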
  • Figure 2 shows the difference between DropBlock and Attentive DropBlock. It can be observed that the γ value in Attentive DropBlock is related to the target response. Feature maps that contain more target responses have higher γ values, which means that the detection model can better avoid being dominated by locally obvious features and pays more attention to less obvious features during training, thereby obtaining better few-sample target detection accuracy.
  • In step S5, for the PASCAL VOC data set, three different data splits are constructed, each with 15 large-sample categories and the remaining 5 categories as few-sample categories (the first few-sample split includes birds, buses, cows, motorcycles and sofas; the second includes airplanes, bottles, cows, horses and sofas; the third includes boats, cats, motorcycles, sheep and sofas). For the MS COCO data set, the 20 categories shared with the PASCAL VOC data set are few-sample categories, and the remaining 60 categories are large-sample categories.
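The three PASCAL VOC splits described above can be written down as a small configuration sketch. The class identifiers follow the official VOC naming ("aeroplane", "motorbike") rather than the translated words in the text; the variable and function names are hypothetical.

```python
# The 20 PASCAL VOC object classes, in their official spelling.
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]

# Few-sample categories of the three splits listed in the text.
FEW_SAMPLE_SPLITS = {
    1: ["bird", "bus", "cow", "motorbike", "sofa"],
    2: ["aeroplane", "bottle", "cow", "horse", "sofa"],
    3: ["boat", "cat", "motorbike", "sheep", "sofa"],
}


def large_sample_classes(split):
    """Return the 15 large-sample categories for the given split."""
    few = set(FEW_SAMPLE_SPLITS[split])
    return [name for name in VOC_CLASSES if name not in few]


# Usage: the large-sample categories used for initial training in split 1.
base_1 = large_sample_classes(1)
```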
  • The present invention uses stochastic gradient descent to optimize the network model, with an initial learning rate of $1 \times 10^{-3}$ and a mini-batch size of 16 on both data sets. For both data sets, training from scratch and fine-tuning each run for 300 epochs, and the CosineLR learning rate schedule (from 0.001 to 0.00001) is used during training.
  • The input image size is fixed at 448 × 448.
  • the present invention compares the detection accuracy and detection speed of various few-sample target detection models proposed in recent years on the PASCAL VOC 2007 and MS COCO 2014 data sets.
  • the detection model of the present invention was evaluated on the challenging PASCAL VOC 2007 and MS COCO 2014 data sets according to the evaluation criteria specified in the PASCAL VOC and MS COCO data.
  • Both benchmark data sets contain training, validation and test sets. The PASCAL VOC 2007 data set contains 20 target categories, and the MS COCO 2014 data set contains 80 categories.
  • For PASCAL VOC, the present invention first combines the PASCAL VOC 2007 and PASCAL VOC 2012 training and validation sets into one set for training the detection model, and uses the PASCAL VOC 2007 test set for testing.
  • The evaluation uses the mean Average Precision (mAP) at an Intersection over Union (IoU) threshold of 0.5 (i.e. mAP@50) and the mean Frames Per Second (mFPS) over multiple different few-sample sets to represent the detection accuracy and speed of the detection model.
  • For MS COCO, the present invention uses only the MS COCO 2014 training set for training and its validation set for evaluation in the test phase. The mAP averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05 (i.e. AP) and the Frames Per Second (FPS) represent the detection accuracy and speed of the detection model.
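The speed and accuracy conventions used in the evaluation can be made concrete with a short sketch. The helper names and the timing values in the usage lines are hypothetical; only the definitions (FPS from waiting time plus post-processing time, mFPS as the mean over few-sample sets, and the IoU thresholds 0.5 to 0.95 in steps of 0.05) come from the text.

```python
def fps(num_images, inference_seconds, postprocess_seconds):
    """Frames per second over the total of waiting time and post-processing time."""
    return num_images / (inference_seconds + postprocess_seconds)


def mean_fps(fps_per_set):
    """mFPS: the average FPS across the different few-sample sets."""
    return sum(fps_per_set) / len(fps_per_set)


# IoU thresholds averaged by the MS COCO AP metric: 0.50, 0.55, ..., 0.95.
COCO_IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]

# Usage with made-up timings for three few-sample sets.
per_set = [fps(100, 1.5, 0.5), fps(100, 1.0, 1.0), fps(100, 2.0, 0.5)]
m_fps = mean_fps(per_set)
```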

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The present invention belongs to the field of image processing and relates to a real-time few-shot object detection method based on a transfer learning strategy, comprising the following steps: S1: constructing a detection network model; S2: preprocessing input data; S3: training an object detection model from scratch using large-sample class data; S4: fine-tuning a few-shot class detection branch using few-shot class data, and, during fine-tuning, using a new regularization method to guide the model to attend to the overall characteristics of an object; and S5: training the detection model using a training set and testing it using a test set. The present invention avoids overfitting of the model in the fine-tuning stage, avoids domination by locally salient features, and improves the generalization ability of the model. The present invention can not only achieve accurate detection of few-shot class objects with fewer model parameters, but can also achieve real-time detection of the relevant objects.
PCT/CN2023/086781 2022-08-11 2023-04-07 Real-time few-shot object detection method based on a transfer learning strategy WO2024032010A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210962295.5 2022-08-11
CN202210962295.5A CN115393634B (zh) 2022-08-11 2022-08-11 Real-time few-shot object detection method based on a transfer learning strategy

Publications (1)

Publication Number Publication Date
WO2024032010A1 (fr) 2024-02-15

Family

ID=84118843

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/086781 WO2024032010A1 (fr) 2022-08-11 2023-04-07 Real-time few-shot object detection method based on a transfer learning strategy

Country Status (2)

Country Link
CN (1) CN115393634B (fr)
WO (1) WO2024032010A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117876823A (zh) * 2024-03-11 2024-04-12 浙江甲骨文超级码科技股份有限公司 Tea garden image detection method, and model training method and system therefor

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115393634B (zh) * 2022-08-11 2023-12-26 重庆邮电大学 Real-time few-shot object detection method based on a transfer learning strategy

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109615016A (zh) * 2018-12-20 2019-04-12 北京理工大学 Object detection method using a convolutional neural network based on pyramid input gain
CN110674866A (zh) * 2019-09-23 2020-01-10 兰州理工大学 Method for detecting X-ray breast lesion images with a transfer learning feature pyramid network
CN111223553A (zh) * 2020-01-03 2020-06-02 大连理工大学 Two-stage deep transfer learning model for tongue diagnosis in traditional Chinese medicine
AU2020100705A4 (en) * 2020-05-05 2020-06-18 Chang, Jiaying Miss A helmet detection method with lightweight backbone based on yolov3 network
US20220067335A1 (en) * 2020-08-26 2022-03-03 Beijing University Of Civil Engineering And Architecture Method for dim and small object detection based on discriminant feature of video satellite data
CN114663729A (zh) * 2022-03-29 2022-06-24 南京工程学院 Meta-learning-based few-sample defect detection method for cylinder liners
CN115393634A (zh) * 2022-08-11 2022-11-25 重庆邮电大学 Real-time few-shot object detection method based on a transfer learning strategy

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
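CN110008842A (zh) * 2019-03-09 2019-07-12 同济大学 Pedestrian re-identification method based on a deep multi-loss fusion model
CN109977812B (zh) * 2019-03-12 2023-02-24 南京邮电大学 Deep-learning-based object detection method for vehicle-mounted video
CN113971815A (zh) * 2021-10-28 2022-01-25 西安电子科技大学 Few-shot object detection method based on singular value decomposition feature enhancement
CN114841257B (zh) * 2022-04-21 2023-09-22 北京交通大学 Few-shot object detection method under self-supervised contrastive constraints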

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GHIASI GOLNAZ, TSUNG-YI LIN, LE QUOC V: "DropBlock: A regularization method for convolutional networks", arXiv, Cornell University Library, Ithaca, 30 October 2018 (2018-10-30), XP093137589, [retrieved on 20240305], DOI: 10.48550/arXiv.1810.12890 *
XIA RUIYANG; LI GUOQUAN; HUANG ZHENGWEN; MENG HONGYING; PANG YU: "Bi-path Combination YOLO for Real-time Few-shot Object Detection", Pattern Recognition Letters, Elsevier, Amsterdam, NL, vol. 165, 1 December 2022 (2022-12-01), pages 91-97, XP087247996, ISSN: 0167-8655, DOI: 10.1016/j.patrec.2022.11.025 *

Also Published As

Publication number Publication date
CN115393634A (zh) 2022-11-25
CN115393634B (zh) 2023-12-26

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23851240

Country of ref document: EP

Kind code of ref document: A1