WO2022062419A1 - Target re-identification method and system based on non-supervised pyramid similarity learning - Google Patents


Info

Publication number
WO2022062419A1
WO2022062419A1 PCT/CN2021/092935 CN2021092935W
Authority
WO
WIPO (PCT)
Prior art keywords
target
pyramid
scene domain
unsupervised
samples
Prior art date
Application number
PCT/CN2021/092935
Other languages
French (fr)
Chinese (zh)
Inventor
董文会
曲培树
刘汉平
唐延柯
陈慧杰
高迎
张俊叶
Original Assignee
德州学院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 德州学院 filed Critical 德州学院
Publication of WO2022062419A1 publication Critical patent/WO2022062419A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/10Machine learning using kernel methods, e.g. support vector machines [SVM]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects

Definitions

  • the invention belongs to the field of target re-identification, and in particular relates to a target re-identification method and system based on unsupervised pyramid similarity learning.
  • target re-identification compares and matches the pedestrian target image to be searched with pedestrian images obtained under different cameras, in order to determine whether the target pedestrian appears in the surveillance scenes of different cameras.
  • This technology plays an important role in intelligent surveillance and public safety. This problem has always been challenging in complex surveillance environments (such as illumination changes, objects occluded by other things, different surveillance perspectives, etc.).
  • the deep re-identification method based on unsupervised cross-domain learning uses the labeled source scene domain data to train a deep learning framework and obtain an original model, then trains the original model on unlabeled data in the target scene domain so that the model adapts to the target-domain data and yields an accurate target model. Due to the difference between the source scene domain and the target scene domain, how to obtain a good adaptive model is the key problem to be solved by such methods.
  • the current methods to solve this problem include: learning a target model with invariant features and adaptively updating it by aligning attributes and labels, and generating images in the target domain that are consistent with the style of the labeled images in the source scene as training samples through an adversarial network.
  • the inventor found that the target model constructed by the current target re-identification method is inaccurate, and the target model is not suitable for the characteristics of unlabeled samples.
  • the present invention provides a target re-identification method and system based on unsupervised pyramid similarity learning, which classifies and labels feature blocks of different scales by means of unsupervised clustering and screens out valid data samples.
  • the screened samples are used to train and update the initial model; through continuous iterative training and updating, the model becomes increasingly adapted to the sample data in the target scene domain, which can improve the accuracy of pedestrian target re-identification.
  • a first aspect of the present invention provides an object re-identification method based on unsupervised pyramid similarity learning.
  • An object re-identification method based on unsupervised pyramid similarity learning including:
  • the training and update process of the target re-identification model is as follows:
  • the sample images of the target scene domain are automatically marked and the training samples are selected to train and update the initial model, and obtain the target re-identification model.
  • a second aspect of the present invention provides an object re-identification system based on unsupervised pyramid similarity learning.
  • An object re-identification system based on unsupervised pyramid similarity learning including:
  • an image acquisition module which is used to acquire the sample image to be queried and the target scene domain image
  • the training and update process of the target re-identification model is as follows:
  • a fourth aspect of the present invention provides a computer apparatus.
  • a computer device comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the program, implements the steps of the above-mentioned target re-identification method based on unsupervised pyramid similarity learning.
  • the multi-scale pyramid feature block of the present invention is simple and general, can fully describe the sample features from the whole to the local, and fully mine the identification information of the samples.
  • the present invention integrates multi-scale pyramid similarity learning into an unsupervised deep convolutional neural network and constructs a multi-scale feature depth model to learn the characteristics of unlabeled samples; the model comprehensively learns the similarity between different samples and between feature blocks of different scales, and is stable and robust.
  • the present invention designs a function to measure the similarity between the source scene domain and the target scene domain and the similarity distance between samples in the target scene domain in the transfer learning process.
  • on this basis, the feature blocks at each scale use DBSCAN clustering to realize automatic labeling and filtering.
  • the samples screened by this method are more conducive to the transfer and adaptation of the model, resulting in better performance.
  • FIG. 1 is a flowchart of a target re-identification method based on unsupervised pyramid similarity learning according to an embodiment of the present invention
  • FIG. 2 is a framework diagram of an initial model deep convolutional neural network according to an embodiment of the present invention
  • Fig. 3 is the multi-scale pyramid feature block flow chart of the embodiment of the present invention.
  • FIG. 4 is a framework diagram of an adaptive transfer learning according to an embodiment of the present invention.
  • FIG. 7 is a Rank-1 recognition accuracy curve diagram corresponding to different parameters p according to an embodiment of the present invention.
  • the target re-identification method based on unsupervised pyramid similarity learning in this embodiment includes:
  • Step 2: output the target image matching the sample image to be queried in the target scene domain through the target re-identification model
  • the training and update process of the target re-identification model is as follows:
  • the sample images of the target scene domain are automatically marked and the training samples are selected to train and update the initial model, and obtain the target re-identification model.
  • the labeled and filtered samples are used to continue training the model. After several iterations of training, the updated model becomes better suited to the target scene domain, thereby achieving higher target re-identification accuracy.
  • the initial model is to provide experience for the early learning of unidentified samples in the target scene domain, and to improve the accuracy of the preliminary learning.
  • the initial model is obtained by training a deep convolutional neural network built on labeled samples in the source scene domain.
  • a specific example of the initial model of this embodiment is shown in FIG. 2; the initial model is a modified ResNet-50 deep convolutional neural network.
  • the output dimension of FC1 is 2048, and the output dimension of FC2 is the number of actual target identities.
  • the loss function is designed as a combination of the cross-entropy loss and the triplet loss: the triplet loss is used at the first fully connected layer and the cross-entropy loss at the second fully connected layer. The combination of the two loss functions gives full play to the advantages of both the classification and the verification approach.
  • the triplet loss function (triplet loss) adopts batch-hard triplet loss, and each mini-batch is constructed by randomly sampling K sample instances of P target entities, which are defined as follows:
  • m is the margin parameter.
  • the cross entropy loss function is defined as:
  • This embodiment provides a computer device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor; when the processor executes the program, it implements the steps of the target re-identification method based on unsupervised pyramid similarity learning described in Embodiment 1 above.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

A target re-identification method and system based on non-supervised pyramid similarity learning. The target re-identification method based on non-supervised pyramid similarity learning comprises: obtaining a sample image to be queried and a target scene domain image; and outputting a target image, matching the sample image to be queried, in a target scene domain via a target re-identification model, wherein a training and updating process of the target re-identification model comprises: performing non-supervised multiscale horizontal pyramid similarity learning on images of a source scene domain and the target scene domain; and automatically labeling a target scene domain sample image according to similarity and screening out a training sample to train and update an initial model so as to obtain the target re-identification model. By means of continuous iterative training and updating, the model is more and more adaptive to sample data in a target scene domain, and the accuracy of pedestrian target re-identification can be improved.

Description

Object Re-Identification Method and System Based on Unsupervised Pyramid Similarity Learning

TECHNICAL FIELD

The invention belongs to the field of target re-identification, and in particular relates to a target re-identification method and system based on unsupervised pyramid similarity learning.

BACKGROUND

The statements in this section merely provide background information related to the present invention and do not necessarily constitute prior art.

The purpose of target re-identification is to compare and match the pedestrian target image to be searched with pedestrian images obtained under different cameras, in order to determine whether the target pedestrian appears in the surveillance scenes of those cameras. This technology plays an important role in intelligent surveillance and public safety. The problem remains challenging in complex surveillance environments (e.g., illumination changes, occlusion by other objects, differing camera viewpoints).

Recently, target re-identification methods based on deep learning frameworks have achieved good performance. These methods can be divided into supervised and unsupervised deep re-identification methods. Supervised methods attain high recognition accuracy, but they require annotating a large number of pedestrian targets in the surveillance scene, which consumes considerable manpower and material resources; they are not adaptive to different application scenarios, where the data must be re-labeled. Unsupervised methods do not require labeling the data in the surveillance scene; their difficulty lies in how to effectively learn a model of the pedestrian targets. Among these, deep re-identification based on unsupervised cross-domain learning performs well: the labeled source scene domain data are used to train a deep learning framework and obtain an original model, which is then trained on unlabeled data in the target scene domain so that the model adapts to the target-domain data and yields an accurate target model. Because the source scene domain and the target scene domain differ, obtaining a good adaptive model is the key problem such methods must solve.

Current approaches to this problem include: learning a target model of invariant features and adaptively updating it by aligning attributes and labels; generating, through an adversarial network, target-domain images consistent in style with the labeled source-scene images and using them as training samples for adaptation; and learning the inconsistency of similarity across different cameras. These methods still underperform their supervised counterparts, and problems remain in model construction and transfer algorithms. Most of them adopt a holistic feature model, whose performance drops sharply when the target is occluded or the camera viewpoint changes.

To sum up, the inventor found that the target models constructed by current target re-identification methods are inaccurate, and that they do not fit the characteristics of unlabeled samples.
SUMMARY OF THE INVENTION

In order to solve the above problems, the present invention provides a target re-identification method and system based on unsupervised pyramid similarity learning, which classifies and labels feature blocks of different scales by means of unsupervised clustering and screens out valid data samples to train and update the initial model. Through continuous iterative training and updating, the model becomes increasingly adapted to the sample data in the target scene domain, which can improve the accuracy of pedestrian target re-identification.

In order to achieve the above object, the present invention adopts the following technical solutions:

A first aspect of the present invention provides a target re-identification method based on unsupervised pyramid similarity learning, including:

obtaining a sample image to be queried and target scene domain images;

outputting, through a target re-identification model, the target image in the target scene domain that matches the sample image to be queried;

wherein the training and updating process of the target re-identification model is:

performing unsupervised multi-scale horizontal pyramid similarity learning on the source scene domain and target scene domain images;

automatically labeling the target scene domain sample images according to the similarity, and screening out training samples to train and update the initial model, obtaining the target re-identification model.

A second aspect of the present invention provides a target re-identification system based on unsupervised pyramid similarity learning, including:

an image acquisition module, used to acquire the sample image to be queried and the target scene domain images;

a target re-identification module, used to output, through the target re-identification model, the target image in the target scene domain that matches the sample image to be queried;

wherein the training and updating process of the target re-identification model is:

performing unsupervised multi-scale horizontal pyramid similarity learning on the source scene domain and target scene domain images;

automatically labeling the target scene domain sample images according to the similarity, and screening out training samples to train and update the initial model, obtaining the target re-identification model.

A third aspect of the present invention provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the steps of the target re-identification method based on unsupervised pyramid similarity learning described above are implemented.

A fourth aspect of the present invention provides a computer device, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor; when the processor executes the program, the steps of the target re-identification method based on unsupervised pyramid similarity learning described above are implemented.
Compared with the prior art, the beneficial effects of the present invention are:

The multi-scale pyramid feature partitioning of the present invention is simple and general; it can describe sample features comprehensively from the whole to the local and fully mine the discriminative information of the samples.

The present invention integrates multi-scale pyramid similarity learning into an unsupervised deep convolutional neural network and constructs a multi-scale feature depth model to learn the characteristics of unlabeled samples. The model comprehensively learns the similarity between different samples and between feature blocks of different scales, and is stable and robust.

In the transfer-learning process, the present invention designs distance functions that measure the similarity between the source scene domain and the target scene domain as well as the similarity between samples within the target scene domain. On this basis, the feature blocks at each scale use DBSCAN clustering to realize automatic sample labeling and filtering. The samples screened in this way are more conducive to model transfer and adaptation, resulting in better performance.
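The automatic labeling and filtering step described above can be sketched with a minimal pure-numpy DBSCAN over feature vectors. This is an illustrative sketch, not the patent's implementation: `eps` and `min_pts` are assumed hyperparameters, and points labeled -1 (DBSCAN noise) stand in for the samples that would be screened out rather than used for training.

```python
import numpy as np

def dbscan_pseudo_labels(features, eps=0.5, min_pts=3):
    """Cluster unlabeled feature vectors with DBSCAN and return pseudo-labels.

    Points labeled -1 are noise; in the screening step they would be
    discarded rather than used for training.  eps and min_pts are assumed
    hyperparameters, not values taken from the patent.
    """
    n = len(features)
    # Pairwise Euclidean distances between all feature vectors.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    neighbors = [np.flatnonzero(dist[i] <= eps) for i in range(n)]

    labels = np.full(n, -1)                       # -1 = noise / unlabeled
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or len(neighbors[i]) < min_pts:
            continue                              # already labeled, or not core
        labels[i] = cluster                       # start a new cluster at i
        queue = list(neighbors[i])
        while queue:                              # breadth-first expansion
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster
                if len(neighbors[j]) >= min_pts:  # j is also a core point
                    queue.extend(neighbors[j])
        cluster += 1
    return labels
```

Each cluster of block features then receives one pseudo-identity label, while noise points are excluded from the next training round.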
BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which form a part of the present invention, are provided for further understanding of the invention; the exemplary embodiments of the invention and their descriptions explain the invention and do not unduly limit it.

FIG. 1 is a flowchart of the target re-identification method based on unsupervised pyramid similarity learning according to an embodiment of the present invention;

FIG. 2 is a framework diagram of the deep convolutional neural network of the initial model according to an embodiment of the present invention;

FIG. 3 is a flowchart of multi-scale pyramid feature partitioning according to an embodiment of the present invention;

FIG. 4 is a framework diagram of adaptive transfer learning according to an embodiment of the present invention;

FIG. 5 is a curve of Rank-1 recognition accuracy for different scales according to an embodiment of the present invention;

FIG. 6 is a curve of Rank-1 recognition accuracy for different values of the parameter β∈[0,1] according to an embodiment of the present invention;

FIG. 7 is a curve of Rank-1 recognition accuracy for different values of the parameter p according to an embodiment of the present invention.
DETAILED DESCRIPTION

The present invention will be further described below with reference to the accompanying drawings and embodiments.

It should be noted that the following detailed description is exemplary and intended to provide further explanation of the invention. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

It should also be noted that the terminology used herein is for the purpose of describing specific embodiments only and is not intended to limit the exemplary embodiments according to the present invention. As used herein, unless the context clearly dictates otherwise, singular forms are intended to include plural forms as well. Furthermore, it should be understood that when the terms "comprising" and/or "including" are used in this specification, they indicate the presence of the stated features, steps, operations, devices, components and/or combinations thereof.

Embodiment 1

As shown in FIG. 1, the target re-identification method based on unsupervised pyramid similarity learning of this embodiment includes:

Step 1: obtain the sample image to be queried and the target scene domain images;

Step 2: output, through the target re-identification model, the target image in the target scene domain that matches the sample image to be queried;

wherein the training and updating process of the target re-identification model is:

performing unsupervised multi-scale horizontal pyramid similarity learning on the source scene domain and target scene domain images;

automatically labeling the target scene domain sample images according to the similarity, and screening out training samples to train and update the initial model, obtaining the target re-identification model.

The labeled and filtered samples are used to continue training the model; after several iterations of training, the updated model becomes better suited to the target scene domain, thereby achieving higher target re-identification accuracy.

In a specific implementation, the initial model provides prior experience for the early learning of the unlabeled samples in the target scene domain and improves the accuracy of the preliminary learning. The initial model is obtained by training a deep convolutional neural network on the labeled samples of the source scene domain.

A specific example of the initial model of this embodiment is shown in FIG. 2: the initial model is a modified ResNet-50 deep convolutional neural network.

It should be noted that, in other embodiments, the initial model may also be implemented with other existing deep convolutional neural network models, which will not be detailed here.

The modified ResNet-50 deep convolutional neural network is taken as an example below. The specific modification is as follows:

The first four stages of ResNet-50 are retained, and a uniform pooling layer and two fully connected layers, FC1 and FC2, are added. The output dimension of FC1 is 2048, and the output dimension of FC2 is the number of actual target identities.
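The shape of the modified head can be traced numerically. The sketch below is a shape-only illustration: the 8×4 spatial size of the stage-4 feature map (for a 256×128 input crop) and the random weights are assumptions; only the 2048 and 750 dimensions come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed backbone output: ResNet-50's first four stages produce a
# 2048-channel feature map; an 8x4 spatial size is assumed here.
feature_map = rng.standard_normal((2048, 8, 4))

# Uniform (average) pooling over the spatial dimensions -> 2048-d vector.
pooled = feature_map.mean(axis=(1, 2))

# FC1: 2048 -> 2048 embedding (the triplet loss is applied here).
W1 = rng.standard_normal((2048, 2048)) * 0.01
f1 = W1 @ pooled

# FC2: 2048 -> number of identities (the cross-entropy loss is applied
# to these logits); 750 matches the Market-1501 example in the text.
num_identities = 750
W2 = rng.standard_normal((num_identities, 2048)) * 0.01
logits = W2 @ f1
```

In the real network the two layers are trained jointly; here they only demonstrate the data flow and output dimensions.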
The loss function is designed as the combination of a cross-entropy loss and a triplet loss: the triplet loss is used at the first fully connected layer and the cross-entropy loss at the second fully connected layer. The combination of the two loss functions gives full play to the advantages of both the classification and the verification approach.

The triplet loss adopts the batch-hard triplet loss. Each mini-batch is constructed by randomly sampling K sample instances of each of P target identities, and the loss is defined as follows:

L_triplet = Σ_{i=1}^{P} Σ_{a=1}^{K} [ m + max_{p=1,…,K} ‖f(x_a^i) − f(x_p^i)‖_2 − min_{j=1,…,P, j≠i; n=1,…,K} ‖f(x_a^i) − f(x_n^j)‖_2 ]_+    (1)

where f(x_a^i) is the feature of the selected (anchor) sample, f(x_p^i) is the feature of a sample with the same label as x_a^i, f(x_n^j) is the feature of a sample with a different label, and m is the margin parameter.

The cross-entropy loss is defined as:

l_CE = − Σ_k y_k log ŷ_k    (2)

where y_k and ŷ_k are the actual label and the predicted label (class probability), respectively, and l_CE is the cross-entropy loss of a sample.

The loss function L_source used for training in the source scene domain is the sum of formulas (1) and (2):

L_source = L_triplet + L_CE    (3)

Taking training on the public Market-1501 database as an example, the number of pedestrian identities in the database is 750, so the output dimension of FC2 is 750. The loss functions used in the training process are the cross-entropy loss function and the triplet loss function.
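The source-domain objective described above (batch-hard triplet loss plus cross-entropy, summed as L_source) can be sketched in numpy. This is a minimal illustration, not the patent's implementation: the margin value and the toy batch below are assumptions.

```python
import numpy as np

def batch_hard_triplet_loss(features, labels, margin=0.3):
    """Batch-hard triplet loss: for each anchor, take the hardest positive
    (farthest same-label sample) and hardest negative (closest
    different-label sample).  The margin value is an assumption."""
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1) + 1e-12)     # pairwise L2 distances
    same = labels[:, None] == labels[None, :]
    hardest_pos = np.where(same, dist, 0.0).max(axis=1)
    hardest_neg = np.where(same, np.inf, dist).min(axis=1)
    return np.maximum(margin + hardest_pos - hardest_neg, 0.0).mean()

def cross_entropy_loss(logits, labels):
    """Cross-entropy loss over softmax class probabilities."""
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def source_loss(features, logits, labels, margin=0.3):
    """L_source = L_triplet + L_CE: the combined source-domain objective."""
    return (batch_hard_triplet_loss(features, labels, margin)
            + cross_entropy_loss(logits, labels))
```

With well-separated identities and confident logits both terms approach zero, which is the intended training signal.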
非督导多尺度金字塔相似性学习为:Unsupervised multi-scale pyramid similarity learning is:
非督导多尺度相似性学习用于挖掘目标场景域样本与源场景域样本之间以及目标场景域内样本之间在多种尺度上的相似性。目标场景域样本与源场景域样本之间相似性学习主要是为了挖掘源场景与目标场景域之间的相似性,该相似性的挖掘有助于初始模型向目标场景域的迁移,尤其初始学习阶段。目标场景域内样本间相似性学习主要是为了挖掘样本间的相似度,为目标域样本的自动标注提供依据。Unsupervised multi-scale similarity learning is used to mine the similarity at multiple scales between samples in the target scene domain and the source scene domain samples as well as between samples in the target scene domain. The similarity learning between the target scene domain samples and the source scene domain samples is mainly to mine the similarity between the source scene and the target scene domain. The similarity mining helps the migration of the initial model to the target scene domain, especially the initial learning stage. The purpose of learning the similarity between samples in the target scene domain is to mine the similarity between samples and provide a basis for the automatic labeling of samples in the target domain.
The specific scheme of unsupervised multi-scale pyramid similarity learning is as follows:
Let the j-th sample image of the target scene domain be
Figure PCTCN2021092935-appb-000009
and let the feature map obtained after inputting it into the initial model be
Figure PCTCN2021092935-appb-000010
According to the set scale parameter σ, the feature map is evenly split horizontally into 2^σ blocks; after uniformly pooling each block, the feature set
Figure PCTCN2021092935-appb-000011
is obtained. The multi-scale pyramid is embodied as follows: for σ = σ_0, the scale-parameter set is taken as the set of all non-negative integers not exceeding σ_0, i.e. {0, 1, …, σ_0}; then, for the feature map
Figure PCTCN2021092935-appb-000012
the multi-scale pyramid feature set finally obtained is
Figure PCTCN2021092935-appb-000013
This set contains features at different scales, from the global feature (scale parameter 0) down to 2^σ local features, and can fully describe the characteristics of the image.
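The horizontal partitioning and pooling step described above can be sketched as follows (an illustrative NumPy version assuming an H×W×C feature map; the function and argument names are ours, not the patent's):

```python
import numpy as np

def pyramid_features(feature_map, sigma0=3):
    """Split an (H, W, C) feature map horizontally into 2**s strips for each
    scale s in {0, 1, ..., sigma0} and average-pool each strip, yielding the
    multi-scale horizontal pyramid feature set described above."""
    h = feature_map.shape[0]
    feats = []
    for s in range(sigma0 + 1):
        blocks = 2 ** s
        for b in range(blocks):
            strip = feature_map[b * h // blocks:(b + 1) * h // blocks]
            feats.append(strip.mean(axis=(0, 1)))  # pool over H and W
    return np.stack(feats)  # (1 + 2 + ... + 2**sigma0, C) feature blocks
```

With sigma0 = 3 this yields 1 + 2 + 4 + 8 = 15 feature blocks per image, matching the {1, 2, 4, 8} decomposition of FIG. 3.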
The similarity between target scene domain samples and source scene domain samples is defined as:
Figure PCTCN2021092935-appb-000014
where
Figure PCTCN2021092935-appb-000015
is the nearest-neighbor sample, in the source scene domain, of the target scene domain sample feature
Figure PCTCN2021092935-appb-000016
The smaller
Figure PCTCN2021092935-appb-000017
is, the closer the sample is to the source scene domain. Computing the similarity of the corresponding block features in the source and target scene domains with formula (4) fully analyzes the similarity of the two different scene domains.
To learn the similarity among samples within the target scene domain more accurately, this scheme describes each sample by its context, using the k-reciprocal nearest-neighbor vector (k-reciprocal vector). For a sample
Figure PCTCN2021092935-appb-000018
the k-reciprocal vector v_{i,k} is defined as follows: when the sample
Figure PCTCN2021092935-appb-000019
is a k-reciprocal nearest neighbor of the sample
Figure PCTCN2021092935-appb-000020
then
Figure PCTCN2021092935-appb-000021
and when the two are not k-reciprocal nearest neighbors, v_{i,k} = 0.
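A minimal sketch of building such k-reciprocal vectors from a pairwise distance matrix (a simplified binary encoding; the patent's exact non-zero entries are in an image we cannot read, so we set them to 1):

```python
import numpy as np

def k_reciprocal_vectors(dist, k):
    """Build binary k-reciprocal encoding vectors from an (n, n) pairwise
    distance matrix: v[i, j] = 1 iff i and j are each within the other's
    k nearest neighbours, and 0 otherwise."""
    n = dist.shape[0]
    order = np.argsort(dist, axis=1)
    # k nearest neighbours of each sample, excluding the sample itself
    knn = [set(order[i][order[i] != i][:k]) for i in range(n)]
    v = np.zeros((n, n))
    for i in range(n):
        for j in knn[i]:
            if i in knn[j]:          # reciprocal condition
                v[i, j] = 1.0
    return v
```

Each row v[i] then serves as the context description v_{i,k} of sample i used in the target-domain similarity.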
The similarity between samples within the target scene domain is defined as:
Figure PCTCN2021092935-appb-000022
where
Figure PCTCN2021092935-appb-000023
are two sample features in the target scene domain, v_{i,k} and v_{j,k} are the k-reciprocal vectors of samples i and j, respectively, and N_T is the total number of samples in the target scene domain. Applying formula (5) to all sample feature blocks yields the similarity corresponding to each block feature.
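Reading claim 5 literally, the quantity for two target-domain samples could be computed as below (a hypothetical rendering of formula (5), which is itself an image in this copy: one minus the ratio of the summed element-wise minima to the summed element-wise maxima of the two k-reciprocal vectors):

```python
import numpy as np

def target_domain_similarity(v_i, v_j):
    """One minus the min/max ratio of two k-reciprocal vectors, per the
    wording of claim 5. Note this quantity decreases as the vectors'
    shared context grows, so in practice it behaves like a distance."""
    denom = np.maximum(v_i, v_j).sum()
    if denom == 0:
        return 1.0  # no context overlap at all
    return 1.0 - np.minimum(v_i, v_j).sum() / denom
```

This is the Jaccard-style measure commonly used with k-reciprocal encodings; whether the patent scales or smooths it further cannot be determined from the placeholder.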
FIG. 3 is a flowchart of an embodiment of multi-scale pyramid feature partitioning. Specifically, the feature map is evenly split into 2^σ blocks according to the scale parameter σ. The multi-scale property lies in partitioning the feature map at multiple scales: the scales used in FIG. 3 are {0, 1, 2, 3}, so the feature map is finally decomposed into {1, 2, 4, 8} blocks, and these feature blocks are uniformly pooled to form the multi-scale pyramid features.
In the process of automatically labeling target scene domain samples and screening training samples:
Sample labeling and sample screening are mainly used for model training; training the model with accurately labeled, appropriate samples helps achieve higher recognition accuracy.
The automatic sample labeling and screening scheme is as follows: the unsupervised clustering algorithm DBSCAN is applied to the block-feature sets at different scales to cluster the samples and assign pseudo-labels. The distance criterion used in DBSCAN clustering combines the two distances of formulas (4) and (5), specifically:
Figure PCTCN2021092935-appb-000024
where
Figure PCTCN2021092935-appb-000025
is the k-th pyramid feature block of the target-scene sample, and β ∈ [0, 1] is a balance parameter.
To screen the data samples, all sample-pair distances computed by formula (6) are sorted in ascending order, and the scanning radius ε of the DBSCAN clustering algorithm is set to the mean of the first pN distances, where p is a proportion factor and N is the total number of sample pairs in the target scene. Only samples within the scanning radius are selected.
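The ε selection rule above can be sketched as follows (names are illustrative; p = 1.7×10⁻³ is the value the simulation section later reports as best):

```python
import numpy as np

def dbscan_eps(pair_dists, p=1.7e-3):
    """Scanning radius for DBSCAN: the mean of the smallest p*N pairwise
    distances, where N is the total number of sample pairs."""
    d = np.sort(np.asarray(pair_dists, dtype=float))
    top = max(1, int(p * d.size))  # guard against an empty slice for tiny p*N
    return d[:top].mean()
```

The resulting ε would then be passed to a standard DBSCAN implementation together with the combined distance of formula (6).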
In the process of training and updating the model:
Training and updating the model realizes the migration of the model from the source scene domain to the target scene domain; the trained and updated model is better adapted to the target scene domain and therefore performs well.
The loss function used when training in the target scene domain treats all pyramid feature blocks as independent individuals: each block is substituted into formula (3) and the results are summed:
Figure PCTCN2021092935-appb-000026
The specific process of adaptive transfer learning in the target scene domain is shown in FIG. 4. All samples obtain multi-scale pyramid features by the process of FIG. 3 and are then labeled and screened with the unsupervised DBSCAN clustering algorithm, whose distance criterion is computed by formula (6). For screening, the distances computed by formula (6) are sorted in ascending order; samples within the scanning radius ε are used for adaptive transfer-learning training, and the rest are excluded. The pyramid features at each scale are clustered by DBSCAN as independent individuals, i.e., each sample obtains labels at multiple scales. The deep learning framework used for adaptive transfer learning in the target scene domain is basically similar to the initial model of FIG. 2; the difference is that the sample features at each scale participate in training as independent individuals, so the loss function, formula (7), is the cumulative sum of the loss functions over all scales.
The model is trained and updated over multiple iterations: in each iteration, sample features are re-extracted, samples are re-labeled, and training samples are re-screened. As the number of iterations increases, the model gradually adapts to the target scene domain samples, thus achieving an accurate recognition rate. For target re-identification, the query sample image is input into the model to obtain the matching target image, realizing the query purpose.
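The iterative scheme can be summarized by the following skeleton (all callables are injected placeholders standing in for the feature-extraction, DBSCAN labeling/screening, and training steps of FIG. 4; nothing here is an API from the patent):

```python
def adaptive_transfer_learning(extract_features, cluster_and_label,
                               train_one_round, model, images, n_iters=3):
    """Iteration skeleton of the target-domain adaptation described above:
    each round re-extracts multi-scale pyramid features, re-labels and
    screens samples with DBSCAN pseudo-labels, then retrains the model
    with the summed per-scale loss of formula (7)."""
    for _ in range(n_iters):
        feats = extract_features(model, images)    # multi-scale pyramid features
        labeled = cluster_and_label(feats)         # pseudo-labels + screening
        model = train_one_round(model, labeled)    # update with loss (7)
    return model
```

Each round uses the current model to generate better pseudo-labels, which in turn train a model better adapted to the target scene domain.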
This is further illustrated by the following simulation:
Simulation calculations were performed for the key parameter choices in the target re-identification method of this embodiment, including the scale parameter σ, the parameter β that balances source-scene similarity and target scene domain sample similarity in the distance-criterion calculation, and the proportion parameter p needed to compute ε. The source scene domain image dataset used in the simulation is DukeMTMC-ReID and the target scene domain image dataset is Market1501, both commonly used public target re-identification datasets. The simulation results can serve as a reference for practitioners in specific applications.
FIG. 5 shows the Rank-1 recognition accuracy of this embodiment's scheme for different scale parameters σ. Different scale parameters yield different recognition rates; the simulation results show that σ = 2, i.e., the parameter set σ = {0, 1, 2}, gives the highest recognition accuracy.
FIG. 6 shows the Rank-1 recognition accuracy for different values of the parameter β. As seen from the distance-criterion calculation of formula (6), β weights the two similarities in similarity learning. The simulation results show that the highest Rank-1 recognition accuracy is obtained when β = 0.1, i.e., when the source-scene similarity carries a weight of 0.1 and the target scene domain sample similarity carries a weight of 0.9.
FIG. 7 shows the Rank-1 recognition accuracy for different proportion parameters p. In this embodiment, the scanning radius ε is set to the mean of the first pN distances, where N is the number of sample pairs. Since N is very large, the specific choice of p strongly affects the recognition accuracy. The simulation results show that the recognition accuracy is highest when p is set to 1.7×10^-3.
Embodiment 2
The target re-identification system based on unsupervised pyramid similarity learning of this embodiment includes:
an image acquisition module for acquiring the sample image to be queried and target scene domain images;
a target re-identification module for outputting, through the target re-identification model, the target image in the target scene domain that matches the sample image to be queried;
wherein the training and updating process of the target re-identification model is:
performing unsupervised multi-scale horizontal pyramid similarity learning on source scene domain and target scene domain images; and
automatically labeling the target scene domain sample images according to the similarity and screening out training samples to train and update the initial model, obtaining the target re-identification model.
The modules of the target re-identification system based on unsupervised pyramid similarity learning of this embodiment correspond one-to-one to the steps of the target re-identification method based on unsupervised pyramid similarity learning in Embodiment 1; the specific implementation process is as described in Embodiment 1 and is not repeated here.
Embodiment 3
This embodiment provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the steps in the target re-identification method based on unsupervised pyramid similarity learning described in Embodiment 1 above.
Embodiment 4
This embodiment provides a computer device comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor; when executing the program, the processor implements the steps in the target re-identification method based on unsupervised pyramid similarity learning described in Embodiment 1 above.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Therefore, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operation steps to be performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be accomplished by instructing relevant hardware through a computer program; the program may be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The above descriptions are only preferred embodiments of the present invention and are not intended to limit it; for those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (10)

  1. A target re-identification method based on unsupervised pyramid similarity learning, characterized by comprising:
    acquiring a sample image to be queried and target scene domain images;
    outputting, through a target re-identification model, a target image in the target scene domain that matches the sample image to be queried;
    wherein the training and updating process of the target re-identification model is:
    performing unsupervised multi-scale horizontal pyramid similarity learning on source scene domain and target scene domain images; and
    automatically labeling the target scene domain sample images according to the similarity and screening out training samples to train and update an initial model, obtaining the target re-identification model.
  2. The target re-identification method based on unsupervised pyramid similarity learning according to claim 1, characterized in that the initial model is obtained by training a deep convolutional neural network constructed with labeled samples from the source scene domain.
  3. The target re-identification method based on unsupervised pyramid similarity learning according to claim 1, characterized in that, in the process of unsupervised multi-scale horizontal pyramid similarity learning, the initial model is used to extract feature maps of unlabeled samples in the target scene domain and split them horizontally into blocks at different scales, and features from the global level down to different local levels are used to mine the discriminative information of the unlabeled samples.
  4. The target re-identification method based on unsupervised pyramid similarity learning according to claim 1, characterized in that, in the process of unsupervised multi-scale horizontal pyramid similarity learning, the similarity between a target scene domain sample and the source scene domain samples can be expressed as the difference between 1 and a natural-logarithm term, the natural-logarithm term being taken on the negated distance between the target scene domain sample feature and its nearest-neighbor sample in the source scene domain.
  5. The target re-identification method based on unsupervised pyramid similarity learning according to claim 1, characterized in that, in the process of unsupervised multi-scale horizontal pyramid similarity learning, the similarity between samples within the target scene domain is the difference between 1 and a ratio, the ratio being the cumulative sum of the element-wise smaller of the k-reciprocal vectors of any two samples to the cumulative sum of the element-wise larger of the k-reciprocal vectors of those two samples.
  6. The target re-identification method based on unsupervised pyramid similarity learning according to claim 1, characterized in that, in the process of automatically labeling and screening out training samples, the feature blocks at different scales are classified and labeled by unsupervised clustering, and valid data samples are screened out.
  7. The target re-identification method based on unsupervised pyramid similarity learning according to claim 1, characterized in that the training and updating of the target re-identification model adopts multiple training iterations: in each iteration, sample features are re-extracted, samples are re-labeled, and training samples are re-screened; as the number of iterations increases, the target re-identification model gradually adapts to the target scene domain samples.
  8. A target re-identification system based on unsupervised pyramid similarity learning, characterized by comprising:
    an image acquisition module for acquiring a sample image to be queried and target scene domain images;
    a target re-identification module for outputting, through a target re-identification model, a target image in the target scene domain that matches the sample image to be queried;
    wherein the training and updating process of the target re-identification model is:
    performing unsupervised multi-scale horizontal pyramid similarity learning on source scene domain and target scene domain images; and
    automatically labeling the target scene domain sample images according to the similarity and screening out training samples to train and update an initial model, obtaining the target re-identification model.
  9. A computer-readable storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, the steps in the target re-identification method based on unsupervised pyramid similarity learning according to any one of claims 1-7 are implemented.
  10. A computer device, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, characterized in that, when executing the program, the processor implements the steps in the target re-identification method based on unsupervised pyramid similarity learning according to any one of claims 1-7.
PCT/CN2021/092935 2020-09-22 2021-05-11 Target re-identification method and system based on non-supervised pyramid similarity learning WO2022062419A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011003036.7 2020-09-22
CN202011003036.7A CN112132014B (en) 2020-09-22 2020-09-22 Target re-identification method and system based on non-supervised pyramid similarity learning

Publications (1)

Publication Number Publication Date
WO2022062419A1 true WO2022062419A1 (en) 2022-03-31

Family

ID=73842376

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/092935 WO2022062419A1 (en) 2020-09-22 2021-05-11 Target re-identification method and system based on non-supervised pyramid similarity learning

Country Status (3)

Country Link
CN (1) CN112132014B (en)
NL (1) NL2029214B1 (en)
WO (1) WO2022062419A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112132014B (en) * 2020-09-22 2022-04-12 德州学院 Target re-identification method and system based on non-supervised pyramid similarity learning
CN112949406A (en) * 2021-02-02 2021-06-11 西北农林科技大学 Sheep individual identity recognition method based on deep learning algorithm
CN112906557B (en) * 2021-02-08 2023-07-14 重庆兆光科技股份有限公司 Multi-granularity feature aggregation target re-identification method and system under multi-view angle
CN113420824A (en) * 2021-07-03 2021-09-21 上海理想信息产业(集团)有限公司 Pre-training data screening and training method and system for industrial vision application
CN114565839A (en) * 2022-02-17 2022-05-31 广州市城市规划勘测设计研究院 Remote sensing image target detection method, device, equipment and computer medium

Citations (4)

Publication number Priority date Publication date Assignee Title
US20160078359A1 (en) * 2014-09-12 2016-03-17 Xerox Corporation System for domain adaptation with a domain-specific class means classifier
CN110414462A (en) * 2019-08-02 2019-11-05 中科人工智能创新技术研究院(青岛)有限公司 A kind of unsupervised cross-domain pedestrian recognition methods and system again
CN111259756A (en) * 2020-01-10 2020-06-09 西安培华学院 Pedestrian re-identification method based on local high-frequency features and mixed metric learning
CN112132014A (en) * 2020-09-22 2020-12-25 德州学院 Target re-identification method and system based on non-supervised pyramid similarity learning

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN107622229B (en) * 2017-08-29 2021-02-02 中山大学 Video vehicle re-identification method and system based on fusion features
CN111259836A (en) * 2020-01-20 2020-06-09 浙江大学 Video pedestrian re-identification method based on dynamic graph convolution representation
CN111476168B (en) * 2020-04-08 2022-06-21 山东师范大学 Cross-domain pedestrian re-identification method and system based on three stages

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
US20160078359A1 (en) * 2014-09-12 2016-03-17 Xerox Corporation System for domain adaptation with a domain-specific class means classifier
CN110414462A (en) * 2019-08-02 2019-11-05 中科人工智能创新技术研究院(青岛)有限公司 A kind of unsupervised cross-domain pedestrian recognition methods and system again
CN111259756A (en) * 2020-01-10 2020-06-09 西安培华学院 Pedestrian re-identification method based on local high-frequency features and mixed metric learning
CN112132014A (en) * 2020-09-22 2020-12-25 德州学院 Target re-identification method and system based on non-supervised pyramid similarity learning

Also Published As

Publication number Publication date
CN112132014A (en) 2020-12-25
CN112132014B (en) 2022-04-12
NL2029214B1 (en) 2023-03-14
NL2029214A (en) 2022-05-23

Similar Documents

Publication Publication Date Title
WO2022062419A1 (en) Target re-identification method and system based on non-supervised pyramid similarity learning
Huixian The analysis of plants image recognition based on deep learning and artificial neural network
CN113378632B (en) Pseudo-label optimization-based unsupervised domain adaptive pedestrian re-identification method
CN111666843B (en) Pedestrian re-recognition method based on global feature and local feature splicing
CN105701502B (en) Automatic image annotation method based on Monte Carlo data equalization
CN107133569B (en) Monitoring video multi-granularity labeling method based on generalized multi-label learning
CN111126360A (en) Cross-domain pedestrian re-identification method based on unsupervised combined multi-loss model
CN110414462A (en) A kind of unsupervised cross-domain pedestrian recognition methods and system again
CN112507901B (en) Unsupervised pedestrian re-identification method based on pseudo tag self-correction
CN108491766B (en) End-to-end crowd counting method based on depth decision forest
CN103425996B (en) A kind of large-scale image recognition methods of parallel distributed
CN108875816A (en) Merge the Active Learning samples selection strategy of Reliability Code and diversity criterion
CN109299707A (en) A kind of unsupervised pedestrian recognition methods again based on fuzzy depth cluster
CN109063649B (en) Pedestrian re-identification method based on twin pedestrian alignment residual error network
CN110619059B (en) Building marking method based on transfer learning
CN110399895A (en) The method and apparatus of image recognition
CN112990282B (en) Classification method and device for fine-granularity small sample images
CN104268546A (en) Dynamic scene classification method based on topic model
CN111914902A (en) Traditional Chinese medicine identification and surface defect detection method based on deep neural network
Tan et al. Rapid fine-grained classification of butterflies based on FCM-KM and mask R-CNN fusion
CN112183464A (en) Video pedestrian identification method based on deep neural network and graph convolution network
Xu et al. Graphical modeling for multi-source domain adaptation
CN114782752A (en) Small sample image grouping classification method and device based on self-training
CN109784404A (en) A kind of the multi-tag classification prototype system and method for fusion tag information
Chang et al. Fine-grained butterfly and moth classification using deep convolutional neural networks

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21870806

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21870806

Country of ref document: EP

Kind code of ref document: A1