CN110619348A - Feature learning algorithm based on personalized discrimination - Google Patents

Feature learning algorithm based on personalized discrimination

Info

Publication number
CN110619348A
Authority
CN
China
Prior art keywords
personalized
samples
weight
updating
global
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910724615.1A
Other languages
Chinese (zh)
Inventor
郭艳蓉
郝世杰
汪萌
洪日昌
陈涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Hefei Polytechnic University
Original Assignee
Hefei Polytechnic University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei Polytechnic University
Priority to CN201910724615.1A
Publication of CN110619348A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2323 Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts

Abstract

The invention discloses a feature learning algorithm based on personalized discrimination, which comprises an algorithm for learning weights for different classes of samples. The algorithm proceeds as follows: start; update the global weight by determining the terms related to the global weight; update the personalized weight by determining the terms related to the personalized weight; update the similarity matrix by fixing the variables W_G and W_P and converting the objective function; end. By simultaneously exploring features compatible with all samples and features specific to each sample topic, the method identifies the globally shared discriminative features and the personalized heterogeneous features of the samples. In addition, the method introduces an adaptive spectral clustering algorithm to mine the manifold structure of the data. The method can thereby obtain better prediction performance.

Description

Feature learning algorithm based on personalized discrimination
Technical Field
The invention relates to the field of algorithms, and in particular to a feature learning algorithm based on personalized discrimination.
Background
In recent years, large amounts of high-dimensional data have appeared in applications such as data mining, machine learning, computer vision, and natural language processing. Such data not only incur enormous computational and storage costs but can also degrade model performance. Feature selection is an efficient preprocessing technique for high-dimensional data: it selects relevant features directly from the original feature space and learns a more compact, representative feature subset. Feature selection therefore yields more compact and more interpretable models, which further improves learning performance.
Supervised algorithms are mainly used for classification and regression problems and aim to identify discriminative features for predicting samples. In feature selection based on sparse learning, all samples share the same global model, and this approach has achieved some success in classification accuracy. In practical applications, however, samples are highly personalized. A global model inevitably ignores the personalized features of the samples, so neither the personalized features among the samples nor the heterogeneity of the samples is fully explored.
In existing algorithms, the global model can identify only globally discriminative features and cannot capture the heterogeneity and individuality of the samples. The personalized features among the samples are therefore ignored, which degrades the performance of the supervised algorithm.
Disclosure of Invention
The invention aims to provide a feature learning algorithm based on personalized discrimination that not only establishes a global model for all samples but also explores personalized models among the samples, thereby addressing the following shortcoming of existing algorithms noted in the background: the global model can identify only globally discriminative features and cannot capture the heterogeneity and individuality of the samples, so the personalized features among the samples are ignored.
To achieve this purpose, the invention provides the following technical solution: a feature learning algorithm based on personalized discrimination that finds the personalized features corresponding to each class of samples together with the global features of all samples, thereby improving the prediction performance of the model. The method first clusters the samples of each class and regards each clustering result as a topic. It then simultaneously explores features compatible with all samples and features specific to each sample topic, so as to identify the globally shared discriminative features and the personalized heterogeneous features of the samples.
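The clustering step described above can be sketched as follows. This is an illustration only, not the patented implementation: the description does not name a clustering method, so the use of plain k-means, the helper names `kmeans` and `assign_topics`, and the parameter `clusters_per_class` are all assumptions.

```python
import numpy as np

def kmeans(points, k, n_iter=50, seed=0):
    """Plain k-means on the rows of `points`; returns one label per row."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise the centres with k distinct samples (fancy indexing copies).
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        # Squared distance from every point to every centre.
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's centre as is
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def assign_topics(X, y, clusters_per_class):
    """X is d x n (one sample per column), y holds the n class labels.

    Each class is clustered separately, so a topic never mixes classes.
    Returns an integer array with one topic id per sample.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    topics = np.empty(X.shape[1], dtype=int)
    for class_index, c in enumerate(np.unique(y)):
        idx = np.flatnonzero(y == c)
        local = kmeans(X[:, idx].T, clusters_per_class)
        # Offset the local labels so topic ids are unique across classes.
        topics[idx] = class_index * clusters_per_class + local
    return topics
```

Clustering within each class, rather than over the whole data set, keeps every topic label-pure, which matches the statement that each clustering result of a class is regarded as one topic.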
Preferably, the algorithm for learning the weights of different classes of samples comprises the following steps:
start;
update the global weight;
update the personalized weight;
update the similarity matrix;
end.
Preferably, the algorithm for learning the weights of different classes of samples is implemented as follows:
start;
update the global weight by determining the terms related to the global weight;
update the personalized weight by determining the terms related to the personalized weight;
update the similarity matrix by fixing the variables W_G and W_P and converting the objective function;
end.
Beneficial effects:
By simultaneously exploring features compatible with all samples and features specific to each sample topic, the invention identifies the globally shared discriminative features and the personalized heterogeneous features of the samples. In addition, the method introduces an adaptive spectral clustering algorithm to mine the manifold structure of the data. The method can thereby obtain better prediction performance.
Drawings
FIG. 1 is a schematic structural view of the present invention;
FIG. 2 is a diagram of a method of practicing the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
As shown in FIGS. 1-2, the present invention provides the following technical solution: a feature learning algorithm based on personalized discrimination that finds the personalized features corresponding to each class of samples together with the global features of all samples, thereby improving the prediction performance of the model. The samples of each class are first clustered, and each clustering result is regarded as a topic. Features compatible with all samples and features specific to each sample topic are then explored simultaneously, so as to identify the globally shared discriminative features and the personalized heterogeneous features of the samples.
The algorithm for learning the weights of different classes of samples comprises the following steps:
start;
update the global weight (101);
update the personalized weight (102);
update the similarity matrix (103);
end.
The algorithm for learning the weights of different classes of samples is implemented as follows:
start;
update the global weight by determining the terms related to the global weight (S101);
update the personalized weight by determining the terms related to the personalized weight (S102);
update the similarity matrix by fixing the variables W_G and W_P and converting the objective function (S103);
end.
The overall objective function is: [equation not reproduced in the source text]
where $X = [x_1, \ldots, x_i, \ldots, x_n] \in \mathbb{R}^{d \times n}$ and each sample is described by a $d$-dimensional feature vector. Each class is first clustered ($k$ clusters), with $m$ samples in each personalized cluster. $W_G$ is the global weight, and $W_P$ is the personalized weight learned for each cluster. A loss function [not reproduced in the source text] is first used to model the objective, and the constraint $\|W_G\|_{2,1}$ is added on the weight matrix, which avoids over-fitting and also yields sparsity across features.
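As a small illustration of the $\ell_{2,1}$ constraint mentioned above, the norm is the sum of the Euclidean norms of the rows of the weight matrix. The sketch below assumes that each row of the weight matrix corresponds to one feature, which the description does not state explicitly; the function name `l21_norm` is hypothetical.

```python
import numpy as np

def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W."""
    W = np.asarray(W, dtype=float)
    return float(np.linalg.norm(W, axis=1).sum())

# Three features (rows), two outputs (columns).
W_G = np.array([[3.0, 4.0],
                [0.0, 0.0],    # an all-zero row: this feature is not used
                [5.0, 12.0]])
print(l21_norm(W_G))  # 5 + 0 + 13 = 18.0
```

Because the penalty acts on whole rows, minimizing it tends to drive entire rows to zero, which is how the constraint produces sparsity across features.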
A spectral clustering term is also introduced into the objective function to explore the structure of the data. However, because the data contain many useless and noisy features, the Laplacian matrix in the objective function cannot accurately reflect the relationships between samples. Adaptive manifold structure learning is therefore introduced as well: the original data are mapped to another space by the personalized weight matrix and the similarity matrix is constrained, so that the learned data structure is as accurate as possible.
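Because the objective function itself is not reproduced in this text, the following LaTeX sketch gives one objective that is consistent with the two closed-form updates stated later in this description. It is an assumption, not the patent's formula: $Y$ is taken to be a label matrix and $Z$ a sample matrix arranged by cluster (both appear in the update formulas but are not defined in the text), $\Omega$ is an unspecified sparsity regularizer on $W_P$ whose gradient is written $2FW_P$, $U_{ii} = 1/(2\|w_G^i\|_2)$ is the usual reweighting for the $\ell_{2,1}$ norm, and terms that depend only on the similarity matrix are omitted.

```latex
\begin{align}
\min_{W_G,\,W_P}\;&
  \left\| X^{T} W_G + Z W_P - Y \right\|_F^{2}
  + \alpha \left\| W_G \right\|_{2,1}
  + \beta\,\Omega(W_P)
  + \operatorname{tr}\!\left( W_P^{T} Z^{T} L Z\, W_P \right) \\[4pt]
% Stationarity in W_G (gradient divided by 2):
& X\left( X^{T} W_G + Z W_P - Y \right) + \alpha U W_G = 0
  \;\Longrightarrow\;
  W_G = -\left( X X^{T} + \alpha U \right)^{-1} X \left( Z W_P - Y \right) \\[4pt]
% Stationarity in W_P (gradient divided by 2, L symmetric):
& Z^{T}\!\left( X^{T} W_G + Z W_P - Y \right) + \beta F W_P + Z^{T} L Z\, W_P = 0
  \;\Longrightarrow\;
  W_P = -\left( Z^{T} Z + \beta F + Z^{T} L Z \right)^{-1} Z^{T} \left( X^{T} W_G - Y \right)
\end{align}
```

Under these assumptions, each update is the solution of a linear system obtained by holding the other variables fixed, which is why the optimization alternates between the weights and the similarity matrix.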
The optimization process of the algorithm for learning the weights of different classes of samples is as follows:
(1) Updating the global weight $W_G$
The terms related to the global weight $W_G$ are: [equation not reproduced in the source text]
Taking the partial derivative with respect to $W_G$ gives: [equation not reproduced in the source text]
where $U$ is a diagonal matrix (the expression for its diagonal elements is not reproduced in the source text). Setting the partial derivative to 0 gives:
$$W_G = -(XX^{T} + \alpha U)^{-1} X (Z W_P - Y)$$
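The closed-form update of the global weight can be sketched in NumPy as follows. The formula itself is taken from the description; everything else is an assumption for illustration: the function name, the array shapes, and in particular the diagonal of $U$, which is taken here to be the common $\ell_{2,1}$ reweighting $U_{ii} = 1/(2\|w_G^i\|_2)$ computed from the previous iterate, because the description's own expression is not reproduced.

```python
import numpy as np

def update_global_weight(X, Z, W_P, Y, W_G_prev, alpha, eps=1e-8):
    """One update W_G = -(X X^T + alpha U)^{-1} X (Z W_P - Y).

    Assumed shapes: X is d x n, Z is n x kd, W_P is kd x c,
    Y is n x c, W_G_prev is d x c.
    """
    # ASSUMPTION: U_ii = 1 / (2 * ||i-th row of W_G_prev||_2).
    # eps guards against division by zero for all-zero rows.
    row_norms = np.linalg.norm(W_G_prev, axis=1)
    U = np.diag(1.0 / (2.0 * np.maximum(row_norms, eps)))
    A = X @ X.T + alpha * U
    B = X @ (Z @ W_P - Y)
    # Solve A W_G = -B rather than forming the inverse explicitly.
    return -np.linalg.solve(A, B)
```

Since $U$ depends on $W_G$, a practical loop would recompute $U$ from the latest $W_G$ before each update. Using `np.linalg.solve` instead of an explicit inverse is a numerical choice and does not change the result of the formula.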
(2) Updating the personalized weight $W_P$
The terms related to the personalized weight are: [equation not reproduced in the source text]
$L$ is a Laplacian matrix learned from the similarity between samples:
$L = D - G$, where $G$ is the similarity matrix and $D$ is a diagonal matrix whose diagonal elements are the row sums of $G$.
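The definition $L = D - G$ can be written directly in NumPy. The function name `graph_laplacian` and the small example matrix are hypothetical and serve only to illustrate the definition.

```python
import numpy as np

def graph_laplacian(G):
    """Return L = D - G, where D is diagonal with the row sums of G."""
    G = np.asarray(G, dtype=float)
    D = np.diag(G.sum(axis=1))
    return D - G

# A small symmetric similarity matrix over three samples.
G = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 2.0],
              [0.0, 2.0, 0.0]])
L = graph_laplacian(G)
# L is [[1, -1, 0], [-1, 3, -2], [0, -2, 2]]; every row sums to zero.
```

Each row of $L$ sums to zero by construction, and for a symmetric non-negative $G$ the matrix $L$ is positive semidefinite, which is what makes the trace term built from it a smoothness penalty.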
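Taking the partial derivative with respect to the personalized weight gives: [equation not reproduced in the source text]
where $F \in \mathbb{R}^{kd \times kd}$ is a diagonal matrix whose diagonal elements are defined as: [equation not reproduced in the source text]
$I_{mj}$ is an indicator function: it is set to 1 if the membership condition holds (the condition itself is not reproduced in the source text) and to 0 otherwise. A closed-form solution for the personalized weight then follows:
$$W_P = -(Z^{T} Z + \beta F + Z^{T} L Z)^{-1} Z^{T} (X^{T} W_G - Y)$$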
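The personalized-weight update can be sketched in the same way. The update formula is taken from the description, but the layout of $Z$ is not defined in the text, so the block structure built by the hypothetical helper `build_cluster_design` is an assumption chosen to match the stated size $kd$. The diagonal matrix $F$ is passed in as an argument because its definition is not reproduced.

```python
import numpy as np

def build_cluster_design(X, topics, k):
    """ASSUMED layout of Z (n x kd): row i holds sample x_i in the block of
    d columns belonging to its cluster topics[i], and zeros elsewhere."""
    d, n = X.shape
    Z = np.zeros((n, k * d))
    for i in range(n):
        j = int(topics[i])
        Z[i, j * d:(j + 1) * d] = X[:, i]
    return Z

def update_personalized_weight(X, Z, W_G, Y, L, F, beta):
    """W_P = -(Z^T Z + beta F + Z^T L Z)^{-1} Z^T (X^T W_G - Y).

    Assumed shapes: X is d x n, Z is n x kd, W_G is d x c, Y is n x c,
    L is n x n, F is kd x kd (diagonal, supplied by the caller).
    """
    A = Z.T @ Z + beta * F + Z.T @ L @ Z
    B = Z.T @ (X.T @ W_G - Y)
    # Solve the linear system instead of inverting A explicitly.
    return -np.linalg.solve(A, B)
```

With this layout, row $i$ of $Z W_P$ uses only the block of $W_P$ that belongs to the cluster of sample $i$, so every cluster gets its own weights while $X^{T} W_G$ is shared by all samples.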
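(3) Updating the similarity matrix $G$
With the variables $W_G$ and $W_P$ fixed, the objective function can be converted to the following: [equation not reproduced in the source text]
where $\mathbf{1}$ denotes a vector whose elements are all 1. A further substitution is then made [not reproduced in the source text], and the Lagrangian function is: [equation not reproduced in the source text]
where $\tau$ and $\eta$ are Lagrange multipliers. Based on the Karush-Kuhn-Tucker (KKT) conditions, a closed-form solution for the similarity matrix is obtained: [equation not reproduced in the source text]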
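The closed-form solution for the similarity matrix is not reproduced in this text, so the sketch below is not the patent's formula. It assumes a commonly used adaptive-neighbor formulation that fits the ingredients named above (a constraint involving the all-ones vector, Lagrange multipliers, and the KKT conditions): each row $g_i$ minimizes $\sum_j (d_{ij} g_{ij} + \gamma g_{ij}^2)$ subject to $g_i^{T}\mathbf{1} = 1$ and $g_{ij} \ge 0$, where $d_{ij}$ is the squared distance between mapped samples. The parameter `gamma`, the function names, and the exclusion of self-similarity are all assumptions.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {g : g >= 0, sum(g) = 1}."""
    u = np.sort(v)[::-1]                  # sort in descending order
    css = np.cumsum(u) - 1.0
    ks = np.arange(1, v.size + 1)
    keep = u - css / ks > 0               # always true for the first entry
    theta = css[keep][-1] / ks[keep][-1]  # shift enforcing the sum constraint
    return np.maximum(v - theta, 0.0)

def update_similarity(P, gamma):
    """P is n x c: the samples after mapping (for example Z @ W_P).

    Under the ASSUMED row-wise problem, minimizing d_i.g + gamma*||g||^2 on
    the simplex equals projecting -d_i / (2*gamma) onto the simplex.
    """
    n = P.shape[0]
    sq = (P ** 2).sum(axis=1)
    dist = np.maximum(sq[:, None] + sq[None, :] - 2.0 * P @ P.T, 0.0)
    G = np.zeros((n, n))
    for i in range(n):
        others = np.delete(np.arange(n), i)  # no self-similarity
        G[i, others] = project_to_simplex(-dist[i, others] / (2.0 * gamma))
    return G
```

A larger `gamma` spreads each row of $G$ over more neighbors, and a smaller one concentrates the weight on the nearest samples. The resulting $G$ is generally not symmetric, so a symmetrized version such as $(G + G^{T})/2$ may be preferable before forming the Laplacian.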
Although embodiments of the present invention have been shown and described, those skilled in the art will appreciate that changes, modifications, substitutions, and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.

Claims (3)

1. An algorithm for learning weights for different classes of samples, characterized in that: the personalized features corresponding to each class of samples and the global features of all samples are found, thereby improving the prediction performance of the model; the samples of each class are first clustered, each clustering result is regarded as a topic, and features compatible with all samples and features specific to each sample topic are explored simultaneously, so as to identify the globally shared discriminative features and the personalized heterogeneous features of the samples.
2. The feature learning algorithm based on personalized discrimination as claimed in claim 1, wherein the algorithm for learning the weights of different classes of samples comprises the following steps: start; update the global weight (101); update the personalized weight (102); update the similarity matrix (103); end.
3. The feature learning algorithm based on personalized discrimination as claimed in claim 1, wherein the algorithm for learning the weights of different classes of samples is implemented as follows:
start; update the global weight by determining the terms related to the global weight (S101);
update the personalized weight by determining the terms related to the personalized weight (S102);
update the similarity matrix by fixing the variables W_G and W_P and converting the objective function (S103);
end.
CN201910724615.1A 2019-08-07 2019-08-07 Feature learning algorithm based on personalized discrimination Pending CN110619348A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910724615.1A CN110619348A (en) 2019-08-07 2019-08-07 Feature learning algorithm based on personalized discrimination

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910724615.1A CN110619348A (en) 2019-08-07 2019-08-07 Feature learning algorithm based on personalized discrimination

Publications (1)

Publication Number Publication Date
CN110619348A true CN110619348A (en) 2019-12-27

Family

ID=68921692

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910724615.1A Pending CN110619348A (en) 2019-08-07 2019-08-07 Feature learning algorithm based on personalized discrimination

Country Status (1)

Country Link
CN (1) CN110619348A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160210532A1 (en) * 2015-01-21 2016-07-21 Xerox Corporation Method and system to perform text-to-image queries with wildcards
CN106503727A (en) * 2016-09-30 2017-03-15 西安电子科技大学 A kind of method and device of classification hyperspectral imagery
US20170308770A1 (en) * 2016-04-26 2017-10-26 Xerox Corporation End-to-end saliency mapping via probability distribution prediction
CN107403196A (en) * 2017-07-28 2017-11-28 江南大学 Instant learning modeling method based on spectral clustering analysis
CN109344889A (en) * 2018-09-19 2019-02-15 深圳大学 A kind of cerebral disease classification method, device and user terminal

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
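JUNDONG LI et al.: "Unsupervised personalized feature selection", 《2018 ASSOCIATION FOR THE ADVANCEMENT OF ARTIFICIAL INTELLIGENCE》 *
ZONG Linlin et al.: "A multi-manifold regularized multi-view non-negative matrix factorization algorithm", 《Journal of Nanjing University (Natural Science)》 *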

Similar Documents

Publication Publication Date Title
EP3757905A1 (en) Deep neural network training method and apparatus
CN113221905B (en) Semantic segmentation unsupervised domain adaptation method, device and system based on uniform clustering and storage medium
CN108984642B (en) Printed fabric image retrieval method based on Hash coding
CN112926654B (en) Pre-labeling model training and certificate pre-labeling method, device, equipment and medium
CN106708929B (en) Video program searching method and device
Zhang et al. Protein complex prediction in large ontology attributed protein-protein interaction networks
CN110737839A (en) Short text recommendation method, device, medium and electronic equipment
CN109670418B (en) Unsupervised object identification method combining multi-source feature learning and group sparsity constraint
CN113868366A (en) Streaming data-oriented online cross-modal retrieval method and system
CN109871934A (en) Feature selection method based on a Spark distributed parallel binary moth-flame optimization algorithm
CN111080551A (en) Multi-label image completion method based on depth convolution characteristics and semantic neighbor
CN114299362A (en) Small sample image classification method based on k-means clustering
CN115048539A (en) Social media data online retrieval method and system based on dynamic memory
CN116910571B (en) Open-domain adaptation method and system based on prototype comparison learning
CN113837232A (en) Black box model distillation method based on sample selection and weighting loss function
WO2023116816A1 (en) Protein sequence alignment method and apparatus, and server and storage medium
CN110619348A (en) Feature learning algorithm based on personalized discrimination
Ng et al. Incremental hashing with sample selection using dominant sets
Chen et al. D-trace: deep triply-aligned clustering
CN114595741A (en) High-dimensional data rapid dimension reduction method and system based on neighborhood relationship
CN113869398A (en) Unbalanced text classification method, device, equipment and storage medium
Wu et al. Unsupervised query by example spoken term detection using features concatenated with self-organizing map distances
Wang et al. Characteristics analysis of applied mathematics in colleges and universities based on big data mining algorithm model
CN111061939A (en) Scientific research academic news keyword matching recommendation method based on deep learning
Ye et al. Hypersphere anchor loss for K-Nearest neighbors

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination