WO2023216725A1 - Improved active learning remote sensing sample labeling method - Google Patents

Improved active learning remote sensing sample labeling method Download PDF

Info

Publication number
WO2023216725A1
Authority
WO
WIPO (PCT)
Prior art keywords
sample
value
labeled
classifier
label
Prior art date
Application number
PCT/CN2023/082939
Other languages
English (en)
French (fr)
Inventor
董铱斐
段红伟
邹圣兵
陈婷
Original Assignee
北京数慧时空信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京数慧时空信息技术有限公司 filed Critical 北京数慧时空信息技术有限公司
Publication of WO2023216725A1 publication Critical patent/WO2023216725A1/zh

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/243 Classification techniques relating to the number of classes
    • G06F 18/2431 Multiple classes

Definitions

  • the invention relates to the technical field of remote sensing image processing, and in particular to an improved active learning remote sensing sample labeling method.
  • Global land cover data is essential information for humans to understand nature and grasp natural laws. It is also the most basic data required for various resource management and geographical information services.
  • the advantage of remote sensing data is that it contains rich spatial information, which is conducive to studying the spatial characteristics of ground objects. With continuous breakthroughs in China's satellite hardware and earth-observation technology, the spatial, temporal, and even spectral resolution of remote sensing data has steadily improved, and the volume of remote sensing data has grown exponentially. Labeling all of it manually would be prohibitively costly. Against this background, active learning methods for sample labeling have emerged.
  • the existing active-learning labeling principle is to select some high-value unlabeled samples from the unlabeled set, have experts label them and add them to the labeled sample set, and then train the classifier on the augmented labeled set to improve its accuracy; the current classifier is then used to select further value samples for expert labeling and is retrained, until the classifier satisfies a preset stopping condition, after which the trained classifier labels the remaining unlabeled samples.
  • Active learning actively selects some high-value unlabeled samples to be labeled by experts in related fields. Such samples usually contain rich information and play a good role in model tuning.
  • the present invention proposes an improved active learning remote sensing sample labeling method, which remedies both the high labor cost of existing active learning that relies on experts to label value samples and the high error rate of existing machine labeling of value samples.
  • the present invention generates value-generated samples through a generative adversarial network to increase the data richness of the value samples, further uses the value-generated samples to train a second classifier group that can better learn the features of the value samples, and improves the accuracy of machine labeling of value samples, ensuring labeling accuracy while greatly reducing labor costs.
  • An improved active learning remote sensing sample labeling method which includes the following steps:
  • S1 obtains a sample set, which includes an unlabeled sample set and a labeled sample set;
  • S2 obtains the first classifier model through training on the labeled sample set
  • S3 determines whether the conditions for terminating training of the first classifier model are met:
  • S4 feeds unlabeled samples into the first classifier model for prediction, and screens them with the improved value-sample query strategy to obtain a value sample set {b_i} with both uncertainty and diversity, where b_i is a value sample;
  • S5 obtains the prediction result of value sample b_i in the first classifier, including category labels and their prediction scores, sorts the prediction scores in descending order, and selects the s category labels with the highest scores to obtain the candidate pseudo-label set {L_i^n} of value sample b_i;
  • i is the index of the value sample
  • n is the index of the candidate pseudo-label
  • s is the number of candidate pseudo-labels in the candidate pseudo-label set, where s ≥ 2;
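As a concrete illustration of step S5, here is a minimal sketch (the function name and inputs are hypothetical, not from the patent) of selecting the top-s candidate pseudo-labels from a classifier's score vector:

```python
import numpy as np

def candidate_pseudo_labels(scores: np.ndarray, s: int = 2) -> list[int]:
    """Return the indices of the s class labels with the highest
    prediction scores, ordered from most to least likely (s >= 2)."""
    if s < 2:
        raise ValueError("the method requires s >= 2")
    order = np.argsort(scores)[::-1]   # sort class indices by descending score
    return order[:s].tolist()

# e.g. scores for a 4-class problem: classes 1 and 2 are the two candidates
print(candidate_pseudo_labels(np.array([0.1, 0.6, 0.25, 0.05]), s=2))  # [1, 2]
```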
  • S6 trains a generative adversarial network on the value sample set {b_i}, and obtains the value-generated sample set {b_ij} through the trained generative adversarial network;
  • S7 assigns the s candidate pseudo-labels L_i^n of value sample b_i to the value-generated sample set {b_ij}, obtaining s labeled value-generated sample sets {b_ij/L_i^n};
  • S8 takes the union of each of the s labeled value-generated sample sets with the labeled sample set, obtaining s merged labeled sample sets;
  • S9 trains a second classifier group on the merged labeled sample sets, and uses the responses of the second classifier group to select the true label L_i^a of value sample b_i from the s candidate pseudo-labels, completing the labeling of b_i and obtaining the labeled value sample set {(b_i/L_i^a)};
  • a is the index of the value-sample pseudo-label
  • S10 adds the labeled value sample set {b_i/L_i^a} to the labeled sample set and returns to step S2;
  • S11 labels the unlabeled sample set with the first classifier model.
  • step S7 includes:
  • the s candidate pseudo-labels L_i^n of value sample b_i are assigned to the value-generated sample set {b_ij}, yielding s labeled value-generated sample sets {(b_ij/L_i^1)}, {(b_ij/L_i^2)}, ..., {(b_ij/L_i^s)}.
  • step S8 includes:
  • the labeled sample set is united with each of the s labeled value-generated sample sets, yielding s merged labeled sample sets.
  • step S9 includes:
  • S91 trains s second classifier models on the s merged labeled sample sets to form a second classifier group;
  • S92 inputs value sample b_i into the s second classifier models and determines the pseudo-label of b_i among the s candidate pseudo-labels according to the differences in the responses of the s second classifier models.
  • the improved value sample query strategy includes:
  • S41 clusters the labeled sample set with a clustering algorithm and obtains n cluster centers x_c;
  • S42 generates, for each unlabeled sample, a prediction-probability vector f(x) through uncertainty screening of the first classifier's predictions;
  • S43 computes the maximum distance between the unlabeled sample x and the n cluster centers x_c, and generates a diversity vector g(x);
  • S44 obtains the sample value T of the unlabeled sample from the prediction-probability vector f(x) and the diversity vector g(x);
  • S45 determines whether the current dynamic threshold exists: if it does not exist, execute step S46; if it exists, jump to S47;
  • step S46 sets the initial value of the dynamic threshold as the current dynamic threshold and constructs a value sample set, initialized as the empty set, then executes step S47;
  • S47 determines whether to adjust the current dynamic threshold according to whether the value sample set is empty: if empty, the current dynamic threshold is increased by a preset amount to obtain a new current dynamic threshold; if not empty, it is kept unchanged;
  • S48 determines whether the unlabeled sample x is a value sample according to the relationship between its sample value T and the current dynamic threshold T_THR;
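The dynamic-threshold loop of steps S45-S48 can be sketched as follows; the threshold value, step size, round limit, and function name are illustrative assumptions, not values given in the patent:

```python
def screen_value_samples(sample_values, t_thr=0.2, step=0.05, max_rounds=10):
    """Sketch of steps S45-S48: a sample whose value T falls below the
    current dynamic threshold T_THR is treated as a value sample; while
    the value sample set stays empty, T_THR is raised by a preset step."""
    for _ in range(max_rounds):
        # S48: T < T_THR marks a value sample (T >= T_THR means "worthless")
        value_idx = [i for i, t in enumerate(sample_values) if t < t_thr]
        if value_idx:          # S47: set non-empty, keep threshold unchanged
            return value_idx, t_thr
        t_thr += step          # S47: set empty, raise threshold and retry
    return [], t_thr

idx, thr = screen_value_samples([0.9, 0.15, 0.4], t_thr=0.2)
print(idx)   # [1]
```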
  • the conditions for terminating training of the first classifier model include that the number of value samples reaches a preset upper limit or the training error of the first classifier model is within a preset range.
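A minimal sketch of this termination test; the sample bound and error band are illustrative placeholders, not values from the patent:

```python
def should_stop(num_value_samples, train_error, max_value_samples=1000,
                error_band=(0.0, 0.05)):
    """Step S3's termination test: stop when the number of selected value
    samples reaches a preset upper bound, or the first classifier's
    training error falls within a preset range."""
    lo, hi = error_band
    return num_value_samples >= max_value_samples or lo <= train_error <= hi

print(should_stop(10, 0.04))  # True
```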
  • step S6 includes:
  • S61 encodes value sample b_i to obtain its corresponding latent variable;
  • S65 inputs the latent variable and random noise corresponding to the value sample into the trained generator to obtain a value-generated sample b_ij; the value-generated sample b_ij follows the same distribution as value sample b_i.
  • the present invention obtains, through a generative adversarial network, value-generated samples with the same distribution as the value samples, then uses them to train a second classifier group that can better learn the features of the value samples, and uses this group to complete the labeling of the value samples. Compared with existing active learning methods, this increases the data richness of value samples, lets the classifier learn their features better, significantly improves labeling accuracy, and reduces manual labeling cost;
  • this invention considers the difference between value samples and labeled samples, screens samples for value by both uncertainty and diversity, and obtains value samples with training value in both respects; compared with existing active-learning query strategies, this method avoids the problem of sample bias in the selection of value samples.
  • Figure 1 is a flow chart of an improved active learning remote sensing sample labeling method provided by the present invention
  • Figure 2 is a schematic diagram of the training process of a generative adversarial network in a specific embodiment of the present invention
  • Figure 3 is a schematic diagram of an improved active learning remote sensing sample labeling process in a specific embodiment of the present invention.
  • Figure 1 is a flow chart of an improved active learning remote sensing sample labeling method provided by the present invention. The method includes the following steps:
  • S1 obtains a sample set, which includes an unlabeled sample set and a labeled sample set.
  • S2 obtains the first classifier model by training the labeled sample set.
  • ResNet-50 can be selected as the network architecture, and the first classifier model C_1 is obtained by training on the labeled sample set;
  • S3 determines whether the conditions for terminating training of the first classifier model are met.
  • if the condition is met, training stops, the trained first classifier model C_1 is output, and step S11 is executed; otherwise training continues and step S4 is executed;
  • S4 feeds unlabeled samples into the first classifier model for prediction, and screens them with the improved value-sample query strategy to obtain a value sample set {b_i} with both uncertainty and diversity, where b_i is a value sample.
  • the improved value sample query strategy includes:
  • S41 clusters the labeled sample set with a clustering algorithm to obtain n cluster centers x_c;
  • the distance calculation uses the Euclidean distance; other metrics, such as the cosine distance, can also be used, chosen according to the specific task. Letting x_L^k(i) denote the k-th dimension of the i-th labeled sample, the Euclidean distance dist(i,j) between samples x_L(i) and x_L(j) is dist(i,j) = sqrt( Σ_k ( x_L^k(i) − x_L^k(j) )² );
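Under the definition above, the distance used when clustering the labeled samples in step S41 can be written as a short helper (the function name is illustrative):

```python
import numpy as np

def euclidean_dist(x_i: np.ndarray, x_j: np.ndarray) -> float:
    """dist(i, j) = sqrt(sum_k (x_L^k(i) - x_L^k(j))**2): the Euclidean
    distance between two labeled samples, summed over dimensions k."""
    return float(np.sqrt(np.sum((x_i - x_j) ** 2)))

print(euclidean_dist(np.array([0.0, 3.0]), np.array([4.0, 0.0])))  # 5.0
```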
  • S42 feeds unlabeled samples into the first classifier model for prediction; for each unlabeled sample x, the most likely category label L_1 and the second most likely category label L_2 are selected to generate a prediction probability f(x);
  • P(L_1|x) and P(L_2|x) respectively denote the probability scores with which the first classifier predicts the most likely label L_1 and the second most likely label L_2 for the unlabeled sample x;
  • S43 computes the maximum distance between the unlabeled sample x and the cluster centers x_c(i), generating a diversity vector g(x);
  • for datasets with different distributions, p takes different values: when p = 1, dist(·) is the Manhattan distance; when p = 2, dist(·) is the Euclidean distance; n is the number of cluster centers, k indexes the dimensions of the unlabeled sample and the cluster-center samples, x^k is the k-th dimension of the unlabeled sample x, and x_C^k(i) is the k-th dimension of the i-th cluster-center sample x_C;
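Following the parameter description above, the diversity term g(x) as a maximum Minkowski distance to the cluster centers can be sketched like this (a sketch under the stated assumptions; the function name is not from the patent):

```python
import numpy as np

def diversity(x: np.ndarray, centers: np.ndarray, p: int = 2) -> float:
    """g(x): the maximum Minkowski distance between unlabeled sample x and
    the n cluster centers x_c(i); p=1 gives Manhattan, p=2 Euclidean."""
    # sum over dimensions k of |x_c^k(i) - x^k|^p, then take the p-th root
    dists = np.sum(np.abs(centers - x) ** p, axis=1) ** (1.0 / p)
    return float(dists.max())

centers = np.array([[0.0, 0.0], [3.0, 4.0]])
print(diversity(np.array([0.0, 0.0]), centers, p=2))  # 5.0
```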
  • S44 performs the value-sample query through the improved value-sample query-strategy formula and obtains the sample value T of the unlabeled sample.
  • S45 determines whether the current dynamic threshold exists: if it does not exist, execute step S46; if it exists, jump to S47;
  • step S46 sets the initial value of the dynamic threshold as the current dynamic threshold and constructs a value sample set, initialized as the empty set, then executes step S47;
  • S47 determines whether to adjust the current dynamic threshold according to whether the value sample set is empty: if empty, the current dynamic threshold is increased by a preset amount to obtain a new current dynamic threshold; if not empty, it is kept unchanged;
  • S48 determines whether the unlabeled sample x is a value sample according to the relationship between its sample value T and the current dynamic threshold T_THR;
  • sample uncertainty is usually used to screen value samples, but this may cause data bias. Improving the query strategy with sample diversity favors points that tend to be outlying, avoids sample bias, and makes the queried value samples both uncertain and diverse. In the diversity computation, samples far from the labeled samples tend to be selected; concretely, the labeled samples are clustered and distances are measured to the cluster centers;
  • an unlabeled sample with T ≥ T_THR is worthless to the first classifier: the probability of its most likely class exceeds that of the second most likely class to a sufficient degree;
  • the current first classifier model can already distinguish the category of such a worthless sample, so it need not be used to fine-tune the model; that is, worthless samples contribute little or nothing to the model, and retraining on them is unnecessary.
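The margin P(L_1|x) − P(L_2|x) underlying this uncertainty test can be sketched as follows; the exact formula combining f(x) and g(x) into T is given in the original filing only as an image, so only the margin itself is shown here (function name assumed):

```python
import numpy as np

def margin(probs: np.ndarray) -> float:
    """P(L1|x) - P(L2|x): the gap between the most likely and second most
    likely class probabilities. A large gap means the first classifier is
    already confident (a "worthless" sample); a small gap marks an
    uncertain, potentially valuable sample."""
    top2 = np.sort(probs)[-2:]   # the two largest probabilities, ascending
    return float(top2[1] - top2[0])

print(round(margin(np.array([0.05, 0.7, 0.25])), 2))  # 0.45
```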
  • S5 obtains the prediction result of value sample b_i in the first classifier, including category labels and their prediction scores, sorts the prediction scores in descending order, and selects the s category labels with the highest scores to obtain the candidate pseudo-label set {L_i^n} of value sample b_i.
  • i is the index of the value sample
  • n is the index of the candidate pseudo-label
  • s is the number of candidate pseudo-labels in the candidate pseudo-label set, where s ≥ 2.
  • step S5 is described in detail taking the acquisition of the candidate pseudo-label set {L_1^n} of value sample b_1 as an example:
  • the category labels of value sample b_1 in the first classifier, arranged by predicted probability score in descending order, are L_1^1, L_1^2, ..., where L_1^1 is the label the first classifier C_1 predicts as most likely for b_1, L_1^2 is the label C_1 predicts as second most likely, and so on;
  • the first s category labels, taken in descending score order, serve as the candidate pseudo-labels of b_1 and constitute its candidate pseudo-label set {L_1^n}. This means the machine cannot determine the category label of b_1 and must find its true category label among the s labels L_1^1, L_1^2, ..., L_1^s.
  • S6 trains a generative adversarial network on the value sample set {b_i}, and obtains the value-generated sample set {b_ij} through the trained generative adversarial network.
  • step S6 includes:
  • S61 encodes value sample b_i to obtain its corresponding latent variable;
  • S65 inputs the latent variable and random noise corresponding to the value sample into the trained generator to obtain a value-generated sample b_ij; the value-generated sample b_ij follows the same distribution as value sample b_i.
  • steps S61-S64 are the training process of the generative adversarial network; see Figure 2.
  • during training, synthetic samples are judged "fake samples" by the discriminator, while the generator's purpose is to "cheat" the discriminator into judging its synthetic samples "real samples".
  • the entire training process is a game between the discriminator and the generator;
  • the final result of the game is a set of discriminator parameters that maximize classification accuracy, and generator parameters that maximally deceive the discriminator.
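The patent does not state the GAN objective explicitly; the standard two-player objective that formalizes this game (Goodfellow et al.'s formulation, stated here as an assumption) is:

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

Here the discriminator D maximizes V (its accuracy at separating real value samples x from synthetic samples G(z)), while the generator G minimizes it (deceiving D); per steps S61-S65, the generator input z would combine the encoded latent variable of the value sample with random noise.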
  • S7 assigns the s candidate pseudo-labels L_i^n of value sample b_i to the value-generated sample set {b_ij}, obtaining s labeled value-generated sample sets {b_ij/L_i^n}.
  • step S7 includes:
  • the s candidate pseudo-labels L_i^n of value sample b_i are assigned to the value-generated samples b_ij, yielding s labeled value-generated sample sets {(b_ij/L_i^1)}, {(b_ij/L_i^2)}, ..., {(b_ij/L_i^s)}.
  • S8 takes the union of each of the s labeled value-generated sample sets with the labeled sample set, obtaining s merged labeled sample sets.
  • step S8 includes:
  • the s labeled value-generated sample sets {(b_ij/L_i^1)}, {(b_ij/L_i^2)}, ..., {(b_ij/L_i^s)} are each united with the labeled sample set R, yielding s merged labeled sample sets {(b_ij/L_i^1)}∪R, {(b_ij/L_i^2)}∪R, ..., {(b_ij/L_i^s)}∪R.
  • S9 trains a second classifier group on the merged labeled sample sets, and uses the responses of the second classifier group to select the pseudo-label L_i^a of value sample b_i from the s candidate pseudo-labels, completing the labeling of b_i and obtaining the labeled value sample set {(b_i/L_i^a)}, where a is the index of the value-sample pseudo-label.
  • ResNet-50 can be selected as the network architecture of the second classifier model.
  • step S9 includes:
  • S91 trains s second classifier models on the s merged labeled sample sets to form a second classifier group;
  • S92 inputs value sample b_i into the s second classifier models and determines the pseudo-label of b_i among the s candidate pseudo-labels according to the differences in the responses of the s second classifier models.
  • the labeled value-generated sample sets {b_ij/L_i^n}, mixed with the labeled sample set, are used to train s second classifier models C_2^in;
  • according to the differing responses of value sample b_i in the s second classifier models, the pseudo-label L_i^a of b_i is determined within its predicted label set {L_i^n}, completing the labeling of b_i and yielding the labeled value sample b_i/L_i^a.
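The response-based selection can be illustrated with a small sketch of the comparison rule from the embodiment (the function name and the tie-handling return value are assumptions; the embodiment returns an unlabelable sample to the value-sample pool):

```python
def select_pseudo_label(responses, labels):
    """Pick the pseudo-label of value sample b_i by comparing the responses
    m_i1, ..., m_is of the s second classifiers: the candidate label of the
    classifier with the strongest response wins; a tie (e.g. m_11 == m_12)
    means the sample cannot be labeled yet."""
    best = max(responses)
    if responses.count(best) > 1:
        return None                      # tie: return b_i to the value set
    return labels[responses.index(best)]

print(select_pseudo_label([0.3, 0.8], ["L11", "L12"]))  # L12
```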
  • steps S7-S9 are described in detail taking the labeling of value sample b_1 as an example:
  • if the two responses are equal (m_11 = m_12), value sample b_1 cannot be labeled and is returned to the value sample set {b_i};
  • steps S7-S9 of this example use a generative adversarial network to simulate the distribution of the value samples, increasing their data richness and obtaining value-generated samples; the value-generated samples are further used to train a second classifier group that can better learn the features of the value samples,
  • and this second classifier group completes the labeling of the value samples. Because the method increases the data richness of value samples and lets the classifier learn their features better, the present invention can significantly reduce manual labeling cost while significantly improving the accuracy of machine labeling.
  • S11 uses the first classifier model to label the unlabeled sample set.
  • the present invention obtains, through a generative adversarial network, value-generated samples with the same distribution as the value samples, then uses them to train a second classifier group that can better learn the features of the value samples,
  • and the second classifier group is used to complete the labeling of the value samples.
  • this method increases the data richness of value samples, lets the classifier learn their features better, and significantly improves labeling accuracy while reducing manual labeling cost;
  • through the improved value-sample query strategy, the present invention takes into account the difference between value samples and labeled samples, screens samples for value by both uncertainty and diversity, and obtains value samples with training value in both respects. Compared with existing active-learning query strategies, this method avoids the problem of sample bias in the selection of value samples.

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present invention relates to an improved active learning remote sensing sample labeling method, belonging to the technical field of remote sensing image processing. The method first uses an improved value-sample screening strategy to select value samples that have both uncertainty and diversity, then uses a generative adversarial network to produce value-generated samples to increase the data richness of the value samples, and further trains a second classifier group with the value-generated samples; the trained second classifier group completes the labeling of the value sample set. The invention solves the problem of high labor cost caused by expert labeling of value samples in traditional active learning: a second classifier group that has learned rich value-sample features performs the labeling, which reduces labor cost while effectively increasing labeling accuracy.

Description

Improved active learning remote sensing sample labeling method
Technical Field
The present invention relates to the technical field of remote sensing image processing, and in particular to an improved active learning remote sensing sample labeling method.
Background Art
Global land cover data is essential information for humans to understand nature and grasp natural laws, and it is also the most basic data required for various resource-management and geographic-information services. The advantage of remote sensing data is that it contains rich spatial information, which is conducive to studying the spatial characteristics of ground objects. With continuous breakthroughs in China's satellite hardware and earth-observation technology, the spatial, temporal, and even spectral resolution of remote sensing data has steadily improved, and the volume of remote sensing data has grown explosively; labeling all of it manually would be prohibitively costly. Against this background, active learning methods for sample labeling have emerged.
The existing active-learning labeling principle is to select some high-value unlabeled samples from the unlabeled set, have experts label them and add them to the labeled sample set, and then train the classifier on the augmented labeled set to improve its accuracy; the current classifier is then used to select further value samples for expert labeling and is retrained, until the classifier satisfies a preset stopping condition, after which the trained classifier labels the remaining unlabeled samples. Active learning actively selects high-value unlabeled samples for experts in the relevant field to label; such samples usually contain rich information and are very effective for model tuning.
Existing active-learning methods still have many shortcomings. First, traditional active learning labels value samples with expert knowledge, but in practice, owing to the scarcity of expertise, manual labeling is extremely difficult and costly; machine labeling is therefore used instead, yet because value samples are few, the machine cannot learn their features well and labeling accuracy cannot be guaranteed. Second, existing active-learning methods usually measure the value of unlabeled samples by uncertainty alone; uncertainty-based methods ignore differences among the data and may repeatedly select unnecessary samples from the same class.
Summary of the Invention
The present invention proposes an improved active learning remote sensing sample labeling method, which remedies both the high labor cost of existing active learning that relies on experts to label value samples and the high error rate of existing machine labeling of value samples. The invention uses a generative adversarial network to produce value-generated samples, increasing the data richness of the value samples, and further uses the value-generated samples to train a second classifier group that can better learn the features of the value samples, improving the accuracy of machine labeling and greatly reducing labor cost while ensuring labeling accuracy.
To achieve the above technical objectives, the technical solution of the present invention is as follows:
An improved active learning remote sensing sample labeling method, comprising the following steps:
S1: obtain a sample set comprising an unlabeled sample set and a labeled sample set;
S2: train a first classifier model on the labeled sample set;
S3: determine whether the termination condition for training the first classifier model is met:
if met, end training and execute step S11;
if not met, execute step S4;
S4: feed unlabeled samples into the first classifier model for prediction, and screen them with the improved value-sample query strategy to obtain a value sample set {b_i} with both uncertainty and diversity, where b_i is a value sample;
S5: obtain the prediction result of value sample b_i in the first classifier, including category labels and their prediction scores, sort the prediction scores in descending order, and select the s category labels with the highest scores to obtain the candidate pseudo-label set {L_i^n} of value sample b_i;
i is the index of the value sample, n is the index of the candidate pseudo-label, and s is the number of candidate pseudo-labels in the set, where s ≥ 2;
S6: train a generative adversarial network with the value sample set {b_i}, and obtain a value-generated sample set {b_ij} from the trained network;
S7: assign the s candidate pseudo-labels L_i^n of value sample b_i to the value-generated sample set {b_ij}, obtaining s labeled value-generated sample sets {b_ij/L_i^n};
S8: take the union of each of the s labeled value-generated sample sets with the labeled sample set, obtaining s merged labeled sample sets;
S9: train a second classifier group on the merged labeled sample sets, and use the responses of the second classifier group to select the true label L_i^a of value sample b_i from the s candidate pseudo-labels, completing the labeling of b_i and obtaining the labeled value sample set {(b_i/L_i^a)};
a is the index of the value-sample pseudo-label;
S10: add the labeled value sample set {b_i/L_i^a} to the labeled sample set and return to step S2;
S11: label the unlabeled sample set with the first classifier model.
Further, step S7 comprises:
assigning the s candidate pseudo-labels L_i^n of value sample b_i to the value-generated sample set {b_ij}, obtaining s labeled value-generated sample sets {(b_ij/L_i^1)}, {(b_ij/L_i^2)}, ..., {(b_ij/L_i^s)}.
Further, step S8 comprises:
taking the union of the labeled sample set with each of the s labeled value-generated sample sets, obtaining s merged labeled sample sets.
Further, step S9 comprises:
S91: training s second classifier models on the s merged labeled sample sets to form a second classifier group;
S92: inputting value sample b_i into the s second classifier models and determining its pseudo-label among the s candidate pseudo-labels according to the differences in the models' responses.
Further, the improved value-sample query strategy comprises:
S41: cluster the labeled sample set with a clustering algorithm to obtain n cluster centers x_c;
S42: for each unlabeled sample x, generate a prediction-probability vector f(x) through uncertainty screening based on the first classifier's prediction;
S43: compute the maximum distance between the unlabeled sample x and the n cluster centers x_c, producing a diversity vector g(x);
S44: obtain the sample value T of the unlabeled sample from the prediction-probability vector f(x) and the diversity vector g(x);
S45: determine whether a current dynamic threshold exists: if not, execute step S46; if so, jump to S47;
S46: set the initial value of the dynamic threshold as the current dynamic threshold and construct a value sample set, initialized as the empty set; execute step S47;
S47: decide whether to adjust the current dynamic threshold according to whether the value sample set is empty:
if the value sample set is empty, increase the current dynamic threshold by a preset amount to obtain a new current dynamic threshold;
if the value sample set is not empty, keep the current dynamic threshold unchanged;
S48: decide whether the unlabeled sample x is a value sample according to the relationship between its sample value T and the current dynamic threshold T_THR;
if not, assign the pseudo-label y_1 to the unlabeled sample x and merge the sample (y_1|x) into the labeled sample set;
if so, record the unlabeled sample x as value sample b_i, i being the value-sample index, and add it to the value sample set {b_i}.
Further, the termination condition for training the first classifier model includes the number of value samples reaching a preset upper limit, or the training error of the first classifier model falling within a preset range.
Further, step S6 comprises:
S61: encoding value sample b_i to obtain its corresponding latent variable;
S62: initializing the parameters of the generator and discriminator of the generative adversarial network;
S63: feeding the latent variable corresponding to value sample b_i together with random noise into the generator to obtain a synthetic sample, and feeding the synthetic sample and the value sample into the discriminator to distinguish real from fake;
S64: letting the generator and the discriminator play against each other until the discriminator judges the synthetic sample to be real;
S65: feeding the latent variable corresponding to the value sample and random noise into the trained generator to obtain a value-generated sample b_ij, which follows the same distribution as value sample b_i.
The beneficial effects of the present invention are:
The invention obtains, through a generative adversarial network, value-generated samples with the same distribution as the value samples; the value-generated samples are then used to train a second classifier group that can better learn the features of the value samples, and this group completes the labeling of the value samples. Compared with existing active learning methods, this increases the data richness of value samples, lets the classifier learn their features better, significantly improves labeling accuracy, and reduces manual labeling cost.
Through the improved value-sample query strategy, the invention takes into account the difference between value samples and labeled samples, screening samples for value by both uncertainty and diversity and obtaining value samples with training value in both respects; compared with existing active-learning query strategies, this avoids the problem of sample bias in the selection of value samples.
Brief Description of the Drawings
To explain the embodiments of the present invention or the prior art more clearly, the drawings required by the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a flow chart of the improved active learning remote sensing sample labeling method provided by the invention;
Figure 2 is a schematic diagram of the training process of a generative adversarial network in a specific embodiment of the invention;
Figure 3 is a schematic diagram of an improved active learning remote sensing sample labeling process in a specific embodiment of the invention.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some of the embodiments of the invention, not all of them; all other embodiments obtained by those of ordinary skill in the art based on these embodiments fall within the protection scope of the invention.
The following specific embodiments should be understood with reference to Figures 1 and 3. Figure 1 is a flow chart of the improved active learning remote sensing sample labeling method provided by the invention. The method comprises the following steps:
S1: obtain a sample set comprising an unlabeled sample set and a labeled sample set.
S2: train a first classifier model on the labeled sample set.
In a specific implementation, ResNet-50 can be selected as the network architecture, and the first classifier model C_1 is obtained by training on the labeled sample set.
S3: determine whether the termination condition for training the first classifier model is met.
If met, end training and execute step S11;
if not met, execute step S4.
In a specific implementation, it is determined whether the training error of the current first classifier model C_1 is within a preset range:
if so, stop training, output the trained first classifier model C_1, and execute step S11;
if not, continue training the first classifier model C_1 and execute step S4.
S4: feed unlabeled samples into the first classifier model for prediction, and screen them with the improved value-sample query strategy to obtain a value sample set {b_i} with both uncertainty and diversity, where b_i is a value sample.
As one implementation, the improved value-sample query strategy comprises:
S41: cluster the labeled sample set with a clustering algorithm to obtain n cluster centers x_c.
In a specific implementation, the distance calculation uses the Euclidean distance; other metrics, such as the cosine distance, can also be used, chosen according to the specific task. Letting x_L^k(i) denote the k-th dimension of the i-th labeled sample, the Euclidean distance dist(i,j) between samples x_L(i) and x_L(j) is dist(i,j) = sqrt( Σ_k ( x_L^k(i) − x_L^k(j) )² ).
From the clustering result, n cluster centers x_c are obtained.
S42: feed the unlabeled samples into the first classifier model for prediction; for each unlabeled sample x, select the most likely category label L_1 and the second most likely category label L_2, and generate a prediction probability f(x) through uncertainty screening,
where P(L_1|x) and P(L_2|x) denote, for the unlabeled sample x, the probability scores with which the first classifier predicts the most likely label L_1 and the second most likely label L_2.
S43: compute the maximum distance between the unlabeled sample x and the cluster centers x_c(i), producing a diversity vector g(x),
where, for datasets with different distributions, p takes different values: when p = 1, dist(·) is the Manhattan distance; when p = 2, dist(·) is the Euclidean distance; n is the number of cluster centers; k indexes the dimensions of the unlabeled sample and the cluster-center samples; x^k is the k-th dimension of the unlabeled sample x; and x_C^k(i) is the k-th dimension of the i-th cluster-center sample x_C.
S44: perform the value-sample query through the improved value-sample query-strategy formula to obtain the sample value T of the unlabeled sample.
S45: determine whether a current dynamic threshold exists: if not, execute step S46; if so, jump to S47.
S46: set the initial value of the dynamic threshold as the current dynamic threshold and construct a value sample set, initialized as the empty set; execute step S47.
S47: decide whether to adjust the current dynamic threshold according to whether the value sample set is empty:
if the value sample set is empty, increase the current dynamic threshold by a preset amount to obtain a new current dynamic threshold;
if the value sample set is not empty, keep the current dynamic threshold unchanged.
S48: decide whether the unlabeled sample x is a value sample according to the relationship between its sample value T and the current dynamic threshold T_THR:
if not, assign the pseudo-label y_1 to the unlabeled sample x and merge the sample (y_1|x) into the labeled sample set;
if so, record the unlabeled sample x as value sample b_i, i being the value-sample index, and add it to the value sample set {b_i}.
In existing value-sample query strategies, sample uncertainty is usually used for screening, but this may cause data bias. Improving the query strategy with sample diversity favors points that tend to be outlying, avoids sample bias, and makes the queried value samples both uncertain and diverse. In the diversity computation, samples far from the labeled samples tend to be selected; concretely, the labeled samples are clustered and distances are measured to the cluster centers.
It is worth noting that x comes from the unlabeled sample set and x_C(i) from the labeled sample set; samples in both sets have k dimensions.
It is also worth noting that an unlabeled sample with T ≥ T_THR is worthless to the first classifier: since T ≥ T_THR, the probability of its most likely class exceeds that of the second most likely class to a sufficient degree, and the current first classifier model can already distinguish its category. Such worthless samples therefore need not be used to fine-tune the model; they contribute little or nothing to it, and retraining on them is unnecessary.
S5: obtain the prediction result of value sample b_i in the first classifier, including category labels and their prediction scores, sort the prediction scores in descending order, and select the s category labels with the highest scores to obtain the candidate pseudo-label set {L_i^n} of value sample b_i. i is the index of the value sample, n is the index of the candidate pseudo-label, and s is the number of candidate pseudo-labels in the set, where s ≥ 2.
In a specific implementation, with reference to Figure 3, step S5 is described in detail taking the acquisition of the candidate pseudo-label set of value sample b_1 as an example:
The category labels of value sample b_1 in the first classifier, arranged by predicted probability score in descending order, are L_1^1, L_1^2, ..., where L_1^1 is the label the first classifier C_1 predicts as most likely for b_1, L_1^2 is the label C_1 predicts as second most likely, and so on. The first s category labels, taken in descending score order, serve as the candidate pseudo-labels of b_1 and constitute its candidate pseudo-label set {L_1^n}. This means the machine cannot determine the category label of b_1 and must find its true category label among the s labels L_1^1, L_1^2, ..., L_1^s.
S6: train a generative adversarial network with the value sample set {b_i}, and obtain a value-generated sample set {b_ij} from the trained network.
As one implementation, step S6 comprises:
S61: encode value sample b_i to obtain its corresponding latent variable;
S62: initialize the parameters of the generator and discriminator of the generative adversarial network;
S63: feed the latent variable corresponding to value sample b_i together with random noise into the generator to obtain a synthetic sample, and feed the synthetic sample and the value sample into the discriminator to distinguish real from fake;
S64: let the generator and the discriminator play against each other until the discriminator judges the synthetic sample to be real;
S65: feed the latent variable corresponding to the value sample and random noise into the trained generator to obtain a value-generated sample b_ij, which follows the same distribution as value sample b_i.
Steps S61-S64 are the training process of the generative adversarial network; see Figure 2. During training, synthetic samples are judged "fake samples" by the discriminator, while the generator's purpose is to "cheat" the discriminator into judging its synthetic samples "real samples". The whole training process is a game between the discriminator and the generator, whose final result is a set of discriminator parameters that maximize classification accuracy and generator parameters that maximally deceive the discriminator.
S7: assign the s candidate pseudo-labels L_i^n of value sample b_i to the value-generated sample set {b_ij}, obtaining s labeled value-generated sample sets {b_ij/L_i^n}.
As one implementation, step S7 comprises:
assigning the s candidate pseudo-labels L_i^n of value sample b_i to the value-generated samples b_ij, obtaining s labeled value-generated sample sets {(b_ij/L_i^1)}, {(b_ij/L_i^2)}, ..., {(b_ij/L_i^s)}.
S8: take the union of each of the s labeled value-generated sample sets with the labeled sample set, obtaining s merged labeled sample sets.
As one implementation, step S8 comprises:
uniting each of the s labeled value-generated sample sets {(b_ij/L_i^1)}, {(b_ij/L_i^2)}, ..., {(b_ij/L_i^s)} with the labeled sample set R, yielding s merged labeled sample sets {(b_ij/L_i^1)}∪R, {(b_ij/L_i^2)}∪R, ..., {(b_ij/L_i^s)}∪R.
S9: train a second classifier group on the merged labeled sample sets, and use the responses of the second classifier group to select the pseudo-label L_i^a of value sample b_i from the s candidate pseudo-labels, completing the labeling of b_i and obtaining the labeled value sample set {(b_i/L_i^a)}, where a is the index of the value-sample pseudo-label.
In a specific implementation, ResNet-50 can be selected as the network architecture of the second classifier model.
As one implementation, step S9 comprises:
S91: train s second classifier models on the s merged labeled sample sets to form a second classifier group;
S92: input value sample b_i into the s second classifier models and determine its pseudo-label among the s candidate pseudo-labels according to the differences in the models' responses.
The s predicted labels are assigned to the generated samples b_ij, yielding s labeled value-generated samples b_ij/L_i^n.
Then, the labeled value-generated sample sets {b_ij/L_i^n}, mixed with the labeled sample set, are used to train s second classifier models C_2^in; according to the differing responses of value sample b_i in the s second classifier models, the pseudo-label L_i^a of b_i is determined within its predicted label set {L_i^n}, completing the labeling of b_i and yielding the labeled value sample b_i/L_i^a.
In a specific implementation, with reference to Figure 3, steps S7-S9 are described in detail taking the labeling of value sample b_1 as an example:
(1) From step S4, obtain the candidate pseudo-label L_1^1 corresponding to the largest prediction score of value sample b_1 in the first classifier model C_1, and the candidate pseudo-label L_1^2 corresponding to the second largest prediction score;
(2) Assign the predicted labels L_1^1 and L_1^2 to the value-generated sample set {b_1j}, obtaining two labeled value-generated sample sets {(b_1j/L_1^1)} and {(b_1j/L_1^2)};
(3) Unite the two labeled value-generated sample sets {(b_1j/L_1^1)} and {(b_1j/L_1^2)} with the labeled sample set R, obtaining two merged labeled sample sets {(b_1j/L_1^1)}∪R and {(b_1j/L_1^2)}∪R.
(4) Train a second classifier group on the two merged labeled sample sets, comprising the two second classifier models C_2^11 and C_2^12; specifically:
train the second classifier model C_2^11 with the merged labeled sample set {(b_1j/L_1^1)}∪R;
train the second classifier model C_2^12 with the merged labeled sample set {(b_1j/L_1^2)}∪R.
(5) Input value sample b_1 into the trained second classifier models C_2^11 and C_2^12 to obtain responses m_11 and m_12, and determine the pseudo-label of value sample b_1 between the first predicted label L_1^1 and the second predicted label L_1^2 according to the difference between the output responses m_11 and m_12:
if m_11 < m_12, assign label L_1^2 to value sample b_1, obtaining the labeled value sample b_1/L_1^2;
if m_11 > m_12, assign label L_1^1 to value sample b_1, obtaining the labeled value sample b_1/L_1^1;
if m_11 = m_12, value sample b_1 cannot be labeled and is returned to the value sample set {b_i};
(6) Following steps (1)-(5), label the other value samples b_i (i ≠ 1) in the value sample set {b_i}, obtaining the labeled value sample set {(b_i/L_i^a)}.
It is worth noting that traditional active learning labels value samples with expert knowledge; in practice, owing to the scarcity of expertise, manual labeling is extremely difficult and costly, so machine labeling is now used instead. However, because value samples are few, the machine cannot learn their features well and labeling accuracy cannot be guaranteed.
To address these problems, steps S7-S9 of this example use a generative adversarial network to simulate the distribution of the value samples, increasing their data richness and obtaining value-generated samples; the value-generated samples are further used to train a second classifier group that can better learn the features of the value samples, and this group completes their labeling. Because the method increases the data richness of value samples and lets the classifier learn their features better, the invention significantly reduces manual labeling cost while significantly improving the accuracy of machine labeling.
S10: add the labeled value sample set {b_i/L_i^a} to the labeled sample set and return to step S2.
S11: label the unlabeled sample set with the first classifier model.
The beneficial effects of the present invention are as follows. On the one hand, the invention obtains, through a generative adversarial network, value-generated samples with the same distribution as the value samples; the value-generated samples are then used to train a second classifier group that can better learn the features of the value samples, and this group completes the labeling of the value samples. Compared with existing active learning methods, this increases the data richness of value samples, lets the classifier learn their features better, significantly improves labeling accuracy, and reduces manual labeling cost.
On the other hand, through the improved value-sample query strategy, the invention takes into account the difference between value samples and labeled samples, screening samples for value by both uncertainty and diversity and obtaining value samples with training value in both respects; compared with existing active-learning query strategies, this avoids the problem of sample bias in the selection of value samples.
The above are only preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall be included within its protection scope.

Claims (7)

  1. An improved active learning remote sensing sample labeling method, wherein the method comprises the following steps:
    S1: obtaining a sample set comprising an unlabeled sample set and a labeled sample set;
    S2: training a first classifier model on the labeled sample set;
    S3: determining whether the termination condition for training the first classifier model is met:
    if met, ending training and executing step S11;
    if not met, executing step S4;
    S4: feeding unlabeled samples into the first classifier model for prediction, and screening them with the improved value-sample query strategy to obtain a value sample set {b_i} with both uncertainty and diversity, where b_i is a value sample;
    S5: obtaining the prediction result of value sample b_i in the first classifier, including category labels and their prediction scores, sorting the prediction scores in descending order, and selecting the s category labels with the highest scores to obtain the candidate pseudo-label set {L_i^n} of value sample b_i;
    i being the index of the value sample, n the index of the candidate pseudo-label, and s the number of candidate pseudo-labels in the set, where s ≥ 2;
    S6: training a generative adversarial network with the value sample set {b_i}, and obtaining a value-generated sample set {b_ij} from the trained network;
    S7: assigning the s candidate pseudo-labels L_i^n of value sample b_i to the value-generated sample set {b_ij}, obtaining s labeled value-generated sample sets {b_ij/L_i^n};
    S8: taking the union of each of the s labeled value-generated sample sets with the labeled sample set, obtaining s merged labeled sample sets;
    S9: training a second classifier group on the merged labeled sample sets, and using the responses of the second classifier group to select the true label L_i^a of value sample b_i from the s candidate pseudo-labels, completing the labeling of b_i and obtaining the labeled value sample set {(b_i/L_i^a)};
    a being the index of the value-sample pseudo-label;
    S10: adding the labeled value sample set {b_i/L_i^a} to the labeled sample set and returning to step S2;
    S11: labeling the unlabeled sample set with the first classifier model.
  2. The method according to claim 1, wherein step S7 comprises:
    assigning the s candidate pseudo-labels L_i^n of the value sample b_i to the value-generated sample set {b_ij}, obtaining s labeled value-generated sample sets {(b_ij / L_i^1)}, {(b_ij / L_i^2)}, ..., {(b_ij / L_i^s)}.
  3. The method according to claim 1, wherein step S8 comprises:
    taking the union of the labeled sample set with each of the s labeled value-generated sample sets, obtaining s merged labeled sample sets.
  4. The method according to claim 1, wherein step S9 comprises:
    S91: training s second classifier models on the s merged labeled sample sets to form the second classifier group;
    S92: inputting the value sample b_i into each of the s second classifier models and determining the pseudo-label of the value sample b_i among the s candidate pseudo-labels according to the differences between its responses in the s second classifier models.
  5. The method according to claim 1, wherein the improved value-sample query strategy comprises:
    S41: clustering the labeled sample set with a clustering algorithm, obtaining n cluster centers x_c;
    S42: for each unlabeled sample x, producing a prediction probability vector f(x) through the prediction of the first classifier, for uncertainty screening;
    S43: computing the maximum distance between the unlabeled sample x and the n cluster centers x_c, producing a diversity vector g(x) for diversity screening;
    S44: obtaining the sample value T of the unlabeled sample from the prediction probability vector f(x) and the diversity vector g(x);
    S45: judging whether a current dynamic threshold exists: if not, executing step S46; if so, jumping to step S47;
    S46: setting the initial dynamic threshold value as the current dynamic threshold and constructing a value sample set, the initial value sample set being empty, then executing step S47;
    S47: judging from whether the value sample set is empty whether the current dynamic threshold should be adjusted:
    if the value sample set is empty, increasing the current dynamic threshold by a preset value to obtain a new current dynamic threshold;
    if the value sample set is not empty, keeping the current dynamic threshold unchanged;
    S48: judging from the relation between the sample value T and the current dynamic threshold T_THR whether the unlabeled sample x is a value sample:
    if not, assigning a pseudo-label y_1 to the unlabeled sample x and merging the sample (x, y_1) into the labeled sample set;
    if so, recording the unlabeled sample x as a value sample b_i, where i is the index of the value sample, and adding it to the value sample set {b_i}.
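Steps S42-S48 above can be sketched under assumptions the claim leaves open: here uncertainty is measured by the entropy of f(x), diversity by the maximum distance to the cluster centers, T by a weighted sum of the two, and the "preset value" of S47 by 0.1; all of these choices, and the names, are illustrative only:

```python
import math

def sample_value(probs, x, centers, alpha=0.5):
    """S42-S44 under the assumptions noted above: entropy of the prediction
    probability vector f(x) as uncertainty, maximum (1-D) distance to the
    cluster centers x_c as diversity, combined by a weighted sum."""
    uncertainty = -sum(p * math.log(p) for p in probs if p > 0)   # entropy of f(x)
    diversity = max(abs(x - c) for c in centers)                  # farthest cluster center
    return alpha * uncertainty + (1 - alpha) * diversity

def screen(x, T, threshold, value_set, labeled_set, pseudo_label):
    """S47-S48: adjust the dynamic threshold as claimed, then route the sample."""
    if not value_set:        # S47: empty value sample set -> raise the threshold
        threshold += 0.1     # the claim's "preset value"; 0.1 is illustrative
    if T > threshold:        # S48: valuable in both senses -> value sample set
        value_set.append(x)
    else:                    # otherwise pseudo-label it and merge into the labeled set
        labeled_set.append((x, pseudo_label))
    return threshold
```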
  6. The method according to claim 1, wherein the termination condition for training the first classifier model comprises the number of value samples reaching a preset upper limit, or the training error of the first classifier model falling within a preset range.
  7. The method according to claim 1, wherein step S6 comprises:
    S61: encoding the value sample b_i to obtain a latent variable corresponding to the value sample b_i;
    S62: initializing the parameters of the generator and the discriminator of the generative adversarial network;
    S63: feeding the latent variable corresponding to the value sample b_i and random noise into the generator to obtain a synthetic sample, and feeding the synthetic sample and the value sample into the discriminator simultaneously to distinguish real from fake;
    S64: letting the generator and the discriminator play against each other until the discriminator judges the synthetic sample to be real;
    S65: feeding the latent variable corresponding to the value sample and random noise into the trained generator to obtain a value-generated sample b_ij, the value-generated sample b_ij following the same distribution as the value sample b_i.
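The loop of claim 7 can be sketched structurally, with the networks abstracted behind callables. The claim does not specify architectures or training rules, so `encode`, `generator_step`, and `discriminator` are placeholders supplied by the caller, and stopping per sample once the discriminator judges a synthetic sample real is a simplification of the adversarial game of S64:

```python
import random

def train_gan(value_samples, encode, generator_step, discriminator, max_iters=100):
    """Control flow of S61-S65 with the networks abstracted behind callables:
    encode maps a value sample b_i to its latent variable (S61); generator_step
    produces a synthetic sample from a latent variable plus noise (S63); the
    loop stops for a sample once the discriminator judges it real (S64)."""
    latents = [encode(b) for b in value_samples]          # S61: latent variables
    for z in latents:                                     # S62-S64, simplified
        for _ in range(max_iters):
            noise = random.gauss(0.0, 1.0)
            synthetic = generator_step(z, noise)          # S63: generator forward pass
            if discriminator(synthetic):                  # S64: judged real -> stop
                break
    # S65: the trained generator now yields value-generated samples b_ij,
    # intended to follow the same distribution as the value samples b_i.
    return [generator_step(encode(b), random.gauss(0.0, 1.0)) for b in value_samples]

# Deterministic toy: the generator ignores the noise, so each b_ij equals its b_i.
generated = train_gan([1.0, 2.0],
                      encode=lambda b: b,
                      generator_step=lambda z, n: z,
                      discriminator=lambda s: True)
```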
PCT/CN2023/082939 2022-05-12 2023-03-22 An improved active learning remote sensing sample labeling method WO2023216725A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210512002.3A CN114627390B (zh) 2022-05-12 2022-05-12 An improved active learning remote sensing sample labeling method
CN202210512002.3 2022-05-12

Publications (1)

Publication Number Publication Date
WO2023216725A1 true WO2023216725A1 (zh) 2023-11-16


Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114627390B (zh) * 2022-05-12 2022-08-16 北京数慧时空信息技术有限公司 An improved active learning remote sensing sample labeling method
CN115063692B (zh) * 2022-07-06 2024-02-27 西北工业大学 A remote sensing image scene classification method based on active learning
CN115272870A (zh) * 2022-09-19 2022-11-01 北京数慧时空信息技术有限公司 Remote sensing sample labeling method based on geoscience information and active learning

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109800785A (zh) * 2018-12-12 2019-05-24 中国科学院信息工程研究所 A data classification method and device based on self-expression correlation
CN110309868A (zh) * 2019-06-24 2019-10-08 西北工业大学 Hyperspectral image classification method combining unsupervised learning
CN110990576A (zh) * 2019-12-24 2020-04-10 用友网络科技股份有限公司 Intent classification method based on active learning, computer device and storage medium
CN111950619A (zh) * 2020-08-05 2020-11-17 东北林业大学 An active learning method based on dual generative adversarial networks
US20210019629A1 (en) * 2019-07-17 2021-01-21 Naver Corporation Latent code for unsupervised domain adaptation
CN112818791A (zh) * 2021-01-25 2021-05-18 哈尔滨工业大学 A collaborative semi-supervised algorithm with two-stage screening-mode fusion verification
CN113408605A (zh) * 2021-06-16 2021-09-17 西安电子科技大学 Semi-supervised classification method for hyperspectral images based on few-shot learning
CN113780097A (зh) * 2021-08-17 2021-12-10 北京数慧时空信息技术有限公司 Cultivated land extraction method based on knowledge graph and deep learning
CN114627390A (zh) * 2022-05-12 2022-06-14 北京数慧时空信息技术有限公司 An improved active learning remote sensing sample labeling method

Also Published As

Publication number Publication date
CN114627390A (zh) 2022-06-14
CN114627390B (zh) 2022-08-16

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23802501

Country of ref document: EP

Kind code of ref document: A1