CN115205632B - Semi-supervised multi-view metric learning method in Riemann space - Google Patents

Semi-supervised multi-view metric learning method in Riemann space

Info

Publication number
CN115205632B
Authority
CN
China
Prior art keywords
matrix
view
semi-supervised
learning method
Prior art date
Legal status
Active
Application number
CN202210847014.1A
Other languages
Chinese (zh)
Other versions
CN115205632A (en)
Inventor
梁建青 (Liang Jianqing)
梁吉业 (Liang Jiye)
Current Assignee
Shanxi Jinxinan Technology Co ltd
Original Assignee
Shanxi University
Priority date
Filing date
Publication date
Application filed by Shanxi University filed Critical Shanxi University
Priority to CN202210847014.1A priority Critical patent/CN115205632B/en
Publication of CN115205632A publication Critical patent/CN115205632A/en
Application granted granted Critical
Publication of CN115205632B publication Critical patent/CN115205632B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/70 Labelling scene content, e.g. deriving syntactic or semantic representations
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Abstract

The invention discloses a semi-supervised multi-view metric learning method in Riemann space, which comprises the following steps: extracting multi-view features of images from the training set and generating sample pairs; constructing multi-view intra-class and inter-class divergence matrices, embedding semantic information into the feature subspace, and realizing the migration and fusion of data and knowledge; embedding the data and knowledge from Euclidean space into a Riemannian manifold subspace to complete the feature mapping; and carrying out multi-view fusion to obtain a unified representation of the features. The invention addresses the heavy dependence on strong supervision information and on Euclidean space found in the related art, provides a novel and efficient metric learning method suited to complex application scenarios and weakly supervised labeling environments, and improves performance on weakly supervised heterogeneous data mining and pattern recognition tasks.

Description

Semi-supervised multi-view metric learning method in Riemann space
Technical Field
The invention belongs to the technical field of machine learning, and particularly relates to a semi-supervised multi-view metric learning method in Riemann space.
Background
Distance metrics play a decisive role in the performance of most machine learning methods. Faced with complex and varied application scenarios, conventional metric functions can no longer capture the true structure of the data. How to learn flexible, task- and data-driven distance metrics is a research hotspot in machine learning. As one of the mainstream techniques in machine learning today, metric learning aims to automatically learn a suitable metric from data, and is widely used in face recognition, information retrieval, network link prediction, and other fields.
In the context of big data, data exhibit high dimensionality, multi-source heterogeneity, and extremely weak supervision. This makes it difficult to learn fast and effective distance metrics, and poses unprecedented challenges to intelligent information processing in traditional machine learning, pattern recognition, and related fields. Heavy dependence on strong supervision information and on Euclidean space is a common problem in current metric learning research, and it greatly limits the applicability of existing learning models and algorithms in practice.
Disclosure of Invention
The invention provides a semi-supervised multi-view metric learning method in Riemann space, which aims to overcome the heavy dependence on strong supervision information and on Euclidean space. The invention can accurately describe the manifold distribution of data in weakly supervised labeling environments and non-Euclidean spaces, and improves the performance of metric learning on weakly supervised heterogeneous data.
The technical scheme of the invention is as follows. A semi-supervised multi-view metric learning method in Riemann space comprises the following specific steps:
step 101: extracting multi-view features of the image from the training set and generating a sample pair;
step 102: constructing a multi-view intra-class and inter-class divergence matrix, embedding semantic information into a feature subspace, and realizing migration and fusion of data and knowledge;
step 103: embedding data and knowledge from Euclidean space into Riemann manifold subspace to complete feature mapping;
step 104: and carrying out multi-view fusion to obtain the unified representation of the features.
Optionally, step 101, extracting multi-view features of the image from the training set and generating sample pairs, further includes:
The training set is passed through local HOG features, SIFT feature descriptors, and a deep convolutional neural network; after a bag-of-words model and the final fully connected layer of the feature extraction network, a 500-dimensional bag-of-words representation and 1024-dimensional deep features of the image are obtained, respectively. The similar sample-pair set S, the dissimilar sample-pair set D, and the unlabeled sample set U are then obtained according to the sample labels.
Optionally, the loss function is L = L_dis + λ_1 L_reg1 + λ_2 L_reg2,
where L is the total metric learning loss, L_dis is the discrimination loss, λ_1 and λ_2 are balance parameters between the objectives, L_reg1 is the semi-supervised graph regularization loss, L_reg2 is the metric regularization loss, w_v is the weight of view v, A^(v) is the metric matrix of view v, S^(v) is the intra-class divergence of view v, D^(v) is the inter-class divergence of view v, X^(v) is the feature matrix of view v, L in the graph regularizer is the Laplacian matrix, D_sld(A^(v), A_0) is the symmetric LogDet divergence, and A_0 is a prior symmetric positive-definite matrix.
Optionally, the discrimination loss L_dis yields, under the metric matrix constructed for each view, a distance metric with strong discriminative power.
Optionally, the Laplacian matrix L = D - W, where D is the diagonal degree matrix with D_ii = Σ_j W_ij and W is the adjacency matrix of the sample graph.
Optionally, the semi-supervised graph regularization loss L_reg1 and the Laplacian matrix L are built on the manifold assumption that samples located within a local region of the low-dimensional manifold have similar classes.
Optionally, the metric regularization loss L_reg2 ensures that A^(v) has a solution even when the matrix S^(v) is nearly singular or non-invertible.
Optionally, for the discrimination loss term L_dis of the objective function, the solution of the metric matrix A^(v) is generalized to an objective defined over the Riemannian manifold of SPD matrices, where δ_R denotes the Riemannian distance between SPD matrices:
δ_R(X, Y) := ||log(Y^{-1/2} X Y^{-1/2})||_F, for symmetric positive-definite X and Y.
Optionally, the metric matrix A^(v) of each view is obtained first, and the view weights w are then solved.
The metric learning method solves the problem of heavy dependence on strong supervision information and on Euclidean space in the related art, provides a novel and efficient metric learning method suited to complex application scenarios and weakly supervised labeling environments, and improves the performance of metric learning on weakly supervised heterogeneous data.
Drawings
FIG. 1 is a flow chart of a semi-supervised multi-view metric learning method in Riemann space according to an embodiment of the present invention;
FIG. 2 is a specific technical scheme of an embodiment of the present invention.
Detailed Description
In order to enable those skilled in the art to better understand the present invention, the technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. It is apparent that the described embodiments are only some, and not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without inventive effort shall fall within the scope of protection of the present invention.
Assume there are N samples drawn from m views. For each view, the invention obtains a distance metric with strong discriminative power under the metric matrix constructed for that view. To make effective use of the large number of unlabeled samples, the invention constructs a Laplacian matrix and a semi-supervised graph regularization loss that guide the data distribution based on the manifold assumption. Considering the case where the intra-class divergence matrix is nearly singular or non-invertible, the invention uses the symmetric LogDet divergence to construct a metric regularization loss, thereby ensuring that each view's metric matrix has a solution. Finally, the invention generalizes the solution of each metric matrix from Euclidean space to Riemannian space, so that the learned distance metric better meets the needs of real, complex application scenarios. In the solving process, the invention first obtains each view's metric matrix and then computes the view weights.
The steps of the invention are specifically described below with reference to fig. 1 and 2:
step 101: multi-view features of the image are extracted from the training set and pairs of samples are generated.
The training set is passed through local HOG features, SIFT feature descriptors, and a deep convolutional neural network; after a bag-of-words model and the final fully connected layer of the feature extraction network, a 500-dimensional bag-of-words representation and 1024-dimensional deep features of the image are obtained, respectively. The similar sample-pair set S, the dissimilar sample-pair set D, and the unlabeled sample set U are then obtained according to the sample labels.
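For illustration only, the following is a minimal Python sketch of the pair-generation part of step 101. It assumes the per-view features (e.g., the 500-dimensional bag-of-words representation and the 1024-dimensional deep features) have already been extracted by separate routines, and shows only how the sets S, D, and U might be built from partially labeled data, with unlabeled samples marked by -1.

import itertools

def build_pair_sets(labels):
    """Split sample indices into similar pairs S, dissimilar pairs D,
    and the unlabeled index set U. Unlabeled samples carry the label -1."""
    labeled = [i for i, y in enumerate(labels) if y != -1]
    U = [i for i, y in enumerate(labels) if y == -1]
    S, D = [], []
    for i, j in itertools.combinations(labeled, 2):
        if labels[i] == labels[j]:
            S.append((i, j))   # same class: similar pair
        else:
            D.append((i, j))   # different class: dissimilar pair
    return S, D, U

# Example: 6 samples, two of them unlabeled.
labels = [0, 0, 1, -1, 1, -1]
S, D, U = build_pair_sets(labels)
print(S)   # [(0, 1), (2, 4)]
print(D)   # [(0, 2), (0, 4), (1, 2), (1, 4)]
print(U)   # [3, 5]

The same index sets can then be reused across all views, since the views describe the same underlying samples.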
Step 102: and constructing a multi-view intra-class and inter-class divergence matrix, embedding semantic information into a feature subspace, and realizing migration and fusion of data and knowledge.
Using the large-margin idea, the discrimination loss L_dis yields, under the metric matrix constructed for each view, a distance metric with strong discriminative power.
Based on the manifold assumption that samples located in a local region of the low-dimensional manifold have similar classes, a semi-supervised graph regularization loss L_reg1 is constructed together with a Laplacian matrix L = D - W, where D is the diagonal degree matrix with D_ii = Σ_j W_ij and W is the adjacency matrix of the sample graph.
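As an illustrative sketch only (the patent's own definition of W is not legible in this text), a k-nearest-neighbor graph with heat-kernel weights is one common way to instantiate the adjacency matrix W and the resulting Laplacian L = D - W:

import numpy as np

def graph_laplacian(X, k=5, sigma=1.0):
    """X: (n, d) feature matrix, one sample per row.
    Returns the unnormalized graph Laplacian D - W for a k-NN,
    heat-kernel adjacency (an assumed, common construction)."""
    n = X.shape[0]
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T   # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                   # exclude self-edges
    W = np.zeros((n, n))
    knn = np.argsort(d2, axis=1)[:, :k]
    for i in range(n):
        W[i, knn[i]] = np.exp(-d2[i, knn[i]] / (2 * sigma**2))
    W = np.maximum(W, W.T)            # symmetrize the adjacency
    Dg = np.diag(W.sum(axis=1))       # degree matrix, D_ii = sum_j W_ij
    return Dg - W

Labeled pairs from S and D can additionally be wired into W so that the regularizer propagates label information to the unlabeled samples.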
Considering the case where the intra-class divergence matrix is nearly singular or non-invertible, a metric regularization loss L_reg2 is constructed from the symmetric LogDet divergence, thereby ensuring that the metric matrix A^(v) of each view has a solution.
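The patent's expression for D_sld is not reproduced legibly here; the standard symmetrized LogDet (Burg) divergence between d x d symmetric positive-definite matrices, assumed here for illustration, is:

% Standard symmetrized LogDet (Burg) divergence; assumed for illustration,
% since the patent's own expression is not legible in this text.
D_{sld}\big(A^{(v)}, A_0\big)
  = \operatorname{tr}\big(A^{(v)} A_0^{-1}\big)
  + \operatorname{tr}\big(A_0 (A^{(v)})^{-1}\big) - 2d

This quantity is symmetric in its arguments, non-negative, and zero only when A^(v) = A_0, which is what makes it a natural regularizer pulling each view's metric toward the prior A_0 when S^(v) is nearly singular.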
The total loss function is defined as L = L_dis + λ_1 L_reg1 + λ_2 L_reg2,
where L is the total metric learning loss, L_dis is the discrimination loss, λ_1 and λ_2 are balance parameters between the objectives, L_reg1 is the semi-supervised graph regularization loss, L_reg2 is the metric regularization loss, w_v is the weight of view v, A^(v) is the metric matrix of view v, S^(v) is the intra-class divergence of view v, D^(v) is the inter-class divergence of view v, X^(v) is the feature matrix of view v, L in the graph regularizer is the Laplacian matrix, D_sld(A^(v), A_0) is the symmetric LogDet divergence, and A_0 is a prior symmetric positive-definite matrix.
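The per-term expressions are not reproduced legibly in this text. One formulation that is consistent with the symbols defined above and with the closed-form solution given in step 103 (an assumption for illustration, not a verbatim copy of the patent's equation; X^(v) is assumed to store the samples of view v as columns) is:

% Assumed formulation consistent with the symbols above and the
% weighted-geometric-mean closed form of step 103.
L = \sum_{v=1}^{m} w_v \Big[ \operatorname{tr}\big(A^{(v)} S^{(v)}\big)
      + \operatorname{tr}\big((A^{(v)})^{-1} D^{(v)}\big) \Big]          % L_{dis}
  + \lambda_1 \sum_{v=1}^{m} w_v\,
      \operatorname{tr}\big(X^{(v)} L\, (X^{(v)})^{\top} A^{(v)}\big)    % L_{reg1}
  + \lambda_2 \sum_{v=1}^{m} D_{sld}\big(A^{(v)}, A_0\big)               % L_{reg2}

Under this reading, the first term rewards metrics that shrink intra-class distances while expanding inter-class distances, the second term smooths the metric over the sample graph, and the third term keeps each metric close to the prior A_0.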
Step 103: and embedding the data and the knowledge from the Euclidean space into the Riemann manifold subspace to finish the feature mapping.
First, with w fixed, A is solved. For the discrimination loss term L_dis of the objective function, the solution of the metric matrix A^(v) is generalized to an objective defined over the Riemannian manifold of SPD matrices, where δ_R denotes the Riemannian distance between SPD matrices:
δ_R(X, Y) := ||log(Y^{-1/2} X Y^{-1/2})||_F, for symmetric positive-definite X and Y.
the above problem has a closed-form solution in the Riemann manifold subspace in the form of a weighted geometric mean
A (v) =(S (v) ) -1 # t D (v)
Further for the overall objective function, each view measures matrix A (v) Solution of (2)
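The following Python sketch, using SciPy, illustrates the Riemannian (affine-invariant) distance δ_R and the weighted geometric mean (S^(v))^{-1} #_t D^(v) named in the closed-form solution. The interpolation parameter t = 0.5 and the small ridge added to keep S^(v) invertible are illustrative assumptions, not values taken from the patent.

import numpy as np
from scipy.linalg import sqrtm, logm, fractional_matrix_power, inv

def riemannian_distance(X, Y):
    """delta_R(X, Y) = || log(Y^{-1/2} X Y^{-1/2}) ||_F for SPD X, Y."""
    Y_inv_sqrt = inv(sqrtm(Y))
    M = Y_inv_sqrt @ X @ Y_inv_sqrt
    return np.linalg.norm(logm(M), 'fro')

def weighted_geometric_mean(A, B, t=0.5):
    """A #_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2} for SPD A, B."""
    A_sqrt = sqrtm(A)
    A_inv_sqrt = inv(A_sqrt)
    middle = fractional_matrix_power(A_inv_sqrt @ B @ A_inv_sqrt, t)
    return A_sqrt @ middle @ A_sqrt

def solve_metric(S_v, D_v, t=0.5, eps=1e-6):
    """Closed-form metric for one view: (S_v)^{-1} #_t D_v.
    A small ridge keeps S_v invertible (assumption for numerical stability)."""
    d = S_v.shape[0]
    S_reg = S_v + eps * np.eye(d)
    return weighted_geometric_mean(inv(S_reg), D_v, t)

The parameter t moves the solution along the geodesic between (S^(v))^{-1} and D^(v), so it controls how strongly intra-class compactness is traded against inter-class separation.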
Step 104: and carrying out multi-view fusion to obtain the unified representation of the features.
After the metric matrix A^(v) of each view has been obtained via the alternating solving strategy, the constraint conditions are substituted into the objective function to construct a generalized Lagrangian, which is differentiated to solve for the view weights w.
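For illustration, a skeleton of the alternating strategy might look as follows, reusing solve_metric() from the sketch after step 103. The view-weight update shown here (an inverse-cost normalization) is only a placeholder for demonstration; the patent's Lagrangian-derived update is not reproduced legibly in this text.

import numpy as np

def alternate_optimize(S_list, D_list, n_iter=10, t=0.5):
    """S_list, D_list: per-view intra-class and inter-class divergence matrices.
    Assumes solve_metric() from the previous sketch is in scope."""
    m = len(S_list)
    w = np.full(m, 1.0 / m)          # start from uniform view weights
    A_list = [None] * m
    for _ in range(n_iter):
        # (1) fix w, solve each view's metric matrix in closed form
        A_list = [solve_metric(S_v, D_v, t) for S_v, D_v in zip(S_list, D_list)]
        # (2) fix the metrics, update view weights from per-view costs
        #     (placeholder inverse-cost rule, not the patent's formula)
        costs = np.array([np.trace(A @ S) + np.trace(np.linalg.inv(A) @ D)
                          for A, S, D in zip(A_list, S_list, D_list)])
        w = (1.0 / costs) / np.sum(1.0 / costs)
    return A_list, w

The returned weights w then fuse the per-view metrics into the unified representation of step 104.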
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, and that such modifications and substitutions do not depart from the spirit of the invention.

Claims (9)

1. A semi-supervised multi-view metric learning method in Riemann space, characterized by comprising the following steps:
step 101: extracting multi-view features of the image from the training set and generating a sample pair;
step 102: constructing a multi-view intra-class and inter-class divergence matrix, embedding semantic information into a feature subspace, and realizing migration and fusion of data and knowledge;
step 103: embedding data and knowledge from Euclidean space into Riemann manifold subspace to complete feature mapping;
step 104: and carrying out multi-view fusion to obtain the unified representation of the features.
2. The semi-supervised multi-view metric learning method in Riemann space according to claim 1, wherein step 101, extracting multi-view features of the image from the training set and generating sample pairs, further comprises:
passing the training set through local HOG features, SIFT feature descriptors, and a deep convolutional neural network; after a bag-of-words model and the final fully connected layer of the feature extraction network, obtaining a 500-dimensional bag-of-words representation and 1024-dimensional deep features of the image, respectively; and obtaining a similar sample-pair set S, a dissimilar sample-pair set D, and an unlabeled sample set U according to the sample labels.
3. The semi-supervised multi-view metric learning method in Riemann space according to claim 1, wherein the loss function is L = L_dis + λ_1 L_reg1 + λ_2 L_reg2,
where L is the total metric learning loss, L_dis is the discrimination loss, λ_1 and λ_2 are balance parameters between the objectives, L_reg1 is the semi-supervised graph regularization loss, L_reg2 is the metric regularization loss, w_v is the weight of view v, A^(v) is the metric matrix of view v, S^(v) is the intra-class divergence of view v, D^(v) is the inter-class divergence of view v, X^(v) is the feature matrix of view v, L in the graph regularizer is the Laplacian matrix, D_sld(A^(v), A_0) is the symmetric LogDet divergence, and A_0 is a prior symmetric positive-definite matrix.
4. The semi-supervised multi-view metric learning method in Riemann space according to claim 3, wherein the discrimination loss L_dis yields, under the metric matrix constructed for each view, a distance metric with strong discriminative power.
5. The semi-supervised multi-view metric learning method in Riemann space according to claim 3, wherein the Laplacian matrix L = D - W, where D is the diagonal degree matrix with D_ii = Σ_j W_ij and W is the adjacency matrix of the sample graph.
6. The semi-supervised multi-view metric learning method in Riemann space according to claim 3, wherein the semi-supervised graph regularization loss L_reg1 and the Laplacian matrix L are constructed according to the manifold assumption that samples located within a local region of the low-dimensional manifold have similar classes.
7. The semi-supervised multi-view metric learning method in Riemann space according to claim 3, wherein the metric regularization loss L_reg2 ensures that A^(v) has a solution even when the matrix S^(v) is nearly singular or non-invertible.
8. The semi-supervised multi-view metric learning method in Riemann space according to claim 3, wherein, for the discrimination loss term L_dis of the objective function, the solution of the metric matrix A^(v) is generalized to an objective defined over the Riemannian manifold of SPD matrices, where δ_R denotes the Riemannian distance between SPD matrices:
δ_R(X, Y) := ||log(Y^{-1/2} X Y^{-1/2})||_F, for symmetric positive-definite X and Y.
9. The semi-supervised multi-view metric learning method in Riemann space according to claim 3, wherein the metric matrix A^(v) of each view is obtained first, and the view weights w are then solved.
CN202210847014.1A 2022-07-07 2022-07-07 Semi-supervised multi-view metric learning method in Riemann space Active CN115205632B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210847014.1A CN115205632B (en) 2022-07-07 2022-07-07 Semi-supervised multi-view metric learning method in Riemann space

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210847014.1A CN115205632B (en) 2022-07-07 2022-07-07 Semi-supervised multi-view metric learning method in Riemann space

Publications (2)

Publication Number Publication Date
CN115205632A CN115205632A (en) 2022-10-18
CN115205632B (en) 2023-07-18

Family

ID=83581743

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210847014.1A Active CN115205632B (en) 2022-07-07 2022-07-07 Semi-supervised multi-view metric learning method in Riemann space

Country Status (1)

Country Link
CN (1) CN115205632B (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110414575A (en) * 2019-07-11 2019-11-05 东南大学 A kind of semi-supervised multiple labeling learning distance metric method merging Local Metric
CN110598733A (en) * 2019-08-05 2019-12-20 南京智谷人工智能研究院有限公司 Multi-label distance measurement learning method based on interactive modeling
CN111488951B (en) * 2020-05-22 2023-11-28 南京大学 Method for generating countermeasure metric learning model for RGB-D image classification

Also Published As

Publication number Publication date
CN115205632A (en) 2022-10-18


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231206

Address after: Room 1806, Block B, Huanya Times Square, No. 7 Yari Street, Taiyuan Xuefu Park, Shanxi Comprehensive Reform Demonstration Zone, Taiyuan City, Shanxi Province, 030000

Patentee after: Shanxi Jinxinan Technology Co.,Ltd.

Address before: 030006 803, science and technology building, Shanxi University, No. 92, Wucheng Road, Xiaodian District, Taiyuan City, Shanxi Province

Patentee before: SHANXI University