CN110633732A - Multi-modal image recognition method based on low-rank and joint sparsity - Google Patents

Multi-modal image recognition method based on low-rank and joint sparsity

Info

Publication number
CN110633732A
Authority
CN
China
Prior art keywords
dictionary
matrix
low
rank
updating
Prior art date
2019-08-15
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910751979.9A
Other languages
Chinese (zh)
Other versions
CN110633732B (en)
Inventor
孙彬
杨轲
王子强
朱韦丹
卢陶然
刘强
徐利梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
2019-08-15
Publication date
2019-12-31
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN201910751979.9A priority Critical patent/CN110633732B/en
Publication of CN110633732A publication Critical patent/CN110633732A/en
Application granted granted Critical
Publication of CN110633732B publication Critical patent/CN110633732B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2132Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on discrimination criteria, e.g. discriminant analysis
    • G06F18/21322Rendering the within-class scatter matrix non-singular
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/28Determining representative reference patterns, e.g. by averaging or distorting; Generating dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2132Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on discrimination criteria, e.g. discriminant analysis
    • G06F18/21322Rendering the within-class scatter matrix non-singular
    • G06F18/21324Rendering the within-class scatter matrix non-singular involving projections, e.g. Fisherface techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/513Sparse representations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-modal image recognition method based on low rank and joint sparsity, belonging to the technical field of image recognition. To overcome the technical problem that the difference between modalities of multi-modal images is larger than the difference between categories, the original multi-modal data are projected into a low-rank common subspace. The low-rank constraint on the common subspace effectively retains the information shared by different modalities of the same category, so that samples of the same category are more closely linked in the low-rank common subspace; it also reduces the image dimensionality and, to a certain extent, avoids the curse of dimensionality. A joint sparsity constraint is then applied to obtain a joint sparse representation of the data of the different modalities, yielding the fused features, which are classified by an ordinary classifier to obtain the final recognition result. For the multi-modal problem, the invention combines the features of multiple modalities through feature fusion to obtain features that are more favorable for recognition, thereby improving recognition performance.

Description

Multi-modal image recognition method based on low-rank and joint sparsity
Technical Field
The invention belongs to the technical field of image recognition, and particularly relates to a multi-modal image recognition technology based on low-rank and joint sparsity.
Background
Image recognition technology uses a computer to process and analyze images, classify the objects they contain, and make meaningful judgments. With the development of sensors, multi-modal image data are easily captured in real life. Fusing multi-modal data can provide complementary information and thereby improve recognition performance, so schemes based on multi-modal information have higher practical value than schemes based on single-modal information. Owing to the different imaging mechanisms of different modalities, traditional single-modal image recognition algorithms cannot process multi-modal images, which limits the further application of image recognition. Because the data of different modalities differ greatly, multi-modal data can be regarded as coming from different domains with different distributions, and therefore cannot be compared directly. Compared with single-modal recognition, a multi-modal recognition algorithm faces the challenge of linking the information of multiple modalities so as to reduce the modality gap.
To combine multi-modal features, feature fusion techniques can be applied to multi-modal feature fusion and extraction. Joint sparse representation is a common feature-fusion tool; its basic principle is to combine the features of multiple modalities by constraining the sparse representations of samples of the same class to share the same sparsity pattern (i.e., to use the same dictionary atoms). The document "X. Yuan, S. Yan. Visual classification with multi-task joint sparse representation [C]. 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010: 3493-3500" uses a shared sparsity pattern for feature fusion. This assumption suits the case where different features extracted from the same source serve as the multi-modal information; when the observations of same-class samples differ greatly, as with multi-view or multi-sensor data, constraining an identical sparsity pattern restricts the composition of dictionary atoms and is unsuitable for data with large modality gaps. The document "S. Shekhar, V. M. Patel, N. M. Nasrabadi, et al. Joint Sparse Representation for Robust Multimodal Biometrics Recognition [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(1): 113-126" proposes a joint sparsity constraint that, relative to the shared-pattern assumption, relaxes the constraint on the sparsity pattern and is more applicable to multi-modal situations. Because the similarity within a modality is greater than the similarity within a category, directly fusing features with large differences, as conventional multi-modal feature fusion methods do, easily loses modality information.
Disclosure of Invention
The invention aims to: in view of the above existing problems, a recognition method suitable for use in a multi-modal situation is provided.
The invention discloses a multi-modal recognition method based on low-rank and joint sparsity constraints, which specifically comprises the following steps:
step S1: training a low-rank projection matrix P for multi-modal recognition and a dictionary D:
step S101: constructing an optimization model:

$$\min_{P,D,\Lambda}\ \sum_{i=1}^{K}\left\|P^{T}X_{i}-D_{i}\Lambda_{i}\right\|_{F}^{2}+\lambda\left\|\Lambda\right\|_{1,2}+\left\|P\right\|_{*}$$

$$\text{s.t.}\ \ P^{T}P=I,\quad \left\|d_{i}^{(j)}\right\|_{2}^{2}\leq 1\ \text{for every atom}\ d_{i}^{(j)}\ \text{of}\ D_{i}$$

wherein K represents the number of modalities, C represents the number of categories, ‖·‖_F represents the Frobenius norm, ‖·‖_{1,2} represents the joint sparse norm, ‖·‖_* represents the nuclear norm, and the superscript T represents matrix transposition;
X_i represents the feature matrix of the training samples of the i-th modality, and X_i is an m_i × n matrix, where m_i represents the feature dimension (image feature dimension) of the training samples of the i-th modality and n represents the number of samples contained in each modality; D_i represents the dictionary of the i-th modality; Λ_i represents the coefficient matrix of the dictionary D_i, i.e., the sparse coefficient matrix of D_i; the sparse coefficient Λ = [Λ_1, Λ_2, ..., Λ_K]; λ represents a regularization parameter; d_i^{(j)} represents an atom of the dictionary D_i of the i-th modality; I represents an identity matrix;
step S102: based on a preset training sample set, solving the constructed optimization model by the alternating direction method of multipliers (ADMM) to obtain the low-rank projection matrix P and the dictionary D;
step S2: based on the feature matrices Y_1, Y_2, ..., Y_K of the different modalities of the object to be classified, projecting the object to be classified through the low-rank projection matrix P and then solving its joint sparse representation with respect to the dictionary D, thereby obtaining the joint sparse coefficient $\hat{\Lambda}$ of the object to be classified;
step S3: carrying out classification processing based on the joint sparse coefficient $\hat{\Lambda}$ of the object to be classified to obtain its classification and recognition result.
Further, in step S102, the specific steps of obtaining the low-rank projection matrix P and the dictionary D are as follows:
step S102-1: initializing the parameters, including the sparse coefficient Λ_0, the low-rank projection matrix P_0, the dictionary D_0, the auxiliary variables Z_0 and W_0, the Lagrange multipliers A_{Z,0} and A_{W,0}, and the Lagrange parameters α_Z, α_W; setting the iteration counter t = 0; and setting the maximum number of iterations k;
wherein the dictionary D_0 = [D_{1,0}, D_{2,0}, ..., D_{K,0}] and the sparse coefficient Λ_0 = [Λ_{1,0}, Λ_{2,0}, ..., Λ_{K,0}]; the matrix dimensions of the auxiliary variable Z_0 are the same as those of the sparse coefficient Λ_0, and the matrix dimensions of W_0 are the same as those of the low-rank projection matrix P_0 (the dimensions of the low-rank projection matrix are preset based on the actual application scene);
Step S102-2: updating the sparse coefficient lambda:
by the formula
Figure BDA0002167509000000032
Obtaining a coefficient matrix of the dictionary corresponding to the ith mode after the t +1 th update
Figure BDA0002167509000000033
Thereby obtaining the coefficient Lambda after the t +1 time of updatingt+1
Step S102-3: updating the dictionary D:
solving the equation by a quadratic problem solverThus obtaining t +1 updated dictionary Dt+1
Step S102-4: update low rank projection P:
by the formulaObtaining the low-rank projection matrix P after the t +1 time of updatingt+1
Step S102-5: updating the auxiliary variable Z:
by the formula
Figure BDA0002167509000000036
Obtaining the auxiliary variable Z at the t +1Row vector z of ith row after secondary updatei,t+1So as to obtain the t +1 updated auxiliary variable Zt+1
Wherein,
Figure BDA0002167509000000037
are respectively Λt+1、AZ,tThe row vector of the ith row of (1);
step S102-6: updating the auxiliary variable W:
by the formula

$$W_{t+1}=F\,S\left(\Sigma,1/\alpha_{W}\right)B^{T}$$

obtaining the (t+1)-times-updated auxiliary variable W_{t+1};
wherein FΣB^T is the singular value decomposition of P_{t+1} + A_{W,t}/α_W; the function S(a, b) takes the values: when |a| ≥ b, S(a, b) = sgn(a)(|a| − b); when |a| < b, S(a, b) = 0;
step S102-7: updating the Lagrange multipliers A_Z and A_W:
by the formulas A_{Z,t+1} = A_{Z,t} + α_Z(Λ_{t+1} − Z_{t+1}) and A_{W,t+1} = A_{W,t} + α_W(P_{t+1} − W_{t+1}), obtaining the Lagrange multipliers A_{Z,t+1} and A_{W,t+1} after the (t+1)-th update;
Step S102-8: judging whether the iteration time t reaches the maximum iteration time k, if so, updating the latest Pt+1And Dt+1As training result values of the projection matrix P and the dictionary D; otherwise, t +1 is updated, and the procedure returns to step S102-2.
In summary, due to the adoption of the technical scheme, the invention has the beneficial effects that:
By means of the low-rank projection, the method reduces the differences between modalities; the low-rank constraint on the common subspace effectively retains the information shared by different modalities of the same category, so that samples of the same category are more closely linked in the low-rank common subspace, the image dimensionality is reduced, and the curse of dimensionality is avoided to a certain extent. For the multi-modal problem, feature fusion is adopted to combine the features of multiple modalities, yielding features that are more favorable for recognition and improving recognition performance.
Drawings
FIG. 1 shows the recognition rate of the present invention on the near-infrared and visible-light face data set (CASIA HFB).
FIG. 2 is a parameter characteristic diagram of the present invention. The numbers 1 to 8 on the abscissa respectively represent the values: 0.001, 0.01, 0.1, 0.5, 1, 5, 10 and 100.
Fig. 3 is a graph of the convergence of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the following embodiments and accompanying drawings.
The solution of the invention is as follows: first, the multi-modal images distributed in different spaces are projected by a low-rank projection, so as to reduce the differences between modalities, retain the important discriminative information of each category, and reduce the data dimensionality; then, in the same projected space, feature fusion is performed under a joint sparsity constraint; finally, image recognition is carried out on the fused features. Because the same category observed in different modalities carries similar information, and this shared cross-modal information is low-dimensional compared with the high-dimensional original image information, the invention extracts the shared information between modalities under the premise that the multi-modal projection matrix is low-rank, thereby reflecting this low-dimensional characteristic in the multi-modal common subspace, reducing the modality gap and improving recognition performance. The invention can be applied to recognition processing in scenes such as identity recognition, security monitoring, and criminal investigation.
The specific implementation process of the multi-modal identification method based on the low-rank and joint sparse constraint is as follows:
Consider training samples with C classes and K modalities, the training samples of each modality being expressed as X_i ∈ R^{m_i×n}, i = 1, 2, ..., K, where m_i denotes the feature dimension of the training samples and n the number of samples contained in each modality. If the low-rank subspace is represented by P (also called the low-rank common projection), then P^T X_i is obtained after low-rank projection of the samples. The invention adopts a joint sparse representation based on dictionary learning, designing for each modality a corresponding dictionary D_i; with N_i denoting the noise of the samples of the i-th modality, the joint sparse representation model is:

$$P^{T}X_{i}=D_{i}\Lambda_{i}+N_{i},\qquad i=1,2,\ldots,K$$

wherein Λ_i represents the coefficient matrix of the dictionary D_i.
Introducing low-rank constraint and joint sparse constraint, and solving by the following optimization formula (optimization model) to obtain low-rank public projection P and dictionary DiAnd its coefficient matrix Λi
Figure BDA0002167509000000051
Wherein
Figure BDA0002167509000000052
As a dictionary Di(ii) atom (| · |) non-combustible gasFRepresenting the Frobenius norm with a sparse coefficient Λ ═ Λ12,...,ΛK]I.e. ΛiIs a sub-matrix of the matrix Λ, which may also be referred to as a sparse coefficient matrix; kernel norm P (| non-conducting phosphor)*=∑iσi(P) the value is the sum of the eigenvalues of the matrix. Since the rank minimization problem cannot be directly handled with low rank (p), the rank minimization problem can be approximately solved by the kernel norm. Orthogonal constraint PTI ensures that the resulting P is the basis transformation matrix, where I is the identity matrix and λ is the regularization parameter.
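As a concrete illustration, the objective above can be evaluated numerically. The following sketch (in Python/NumPy; the function name `objective` and the list-based data layout are illustrative assumptions, not from the patent) computes the three terms, taking the nuclear norm as the sum of singular values:

```python
import numpy as np

def objective(P, D_list, Lam_list, X_list, lam):
    """Value of the training objective above (illustrative names)."""
    # Data-fidelity term: sum_i ||P^T X_i - D_i Lam_i||_F^2
    fit = sum(np.linalg.norm(P.T @ X - D @ L, "fro") ** 2
              for X, D, L in zip(X_list, D_list, Lam_list))
    # Joint-sparsity term: l_{1,2} norm of the stacked coefficient matrix,
    # i.e. the sum of the l2 norms of the rows of [Lam_1, ..., Lam_K]
    joint = np.sum(np.linalg.norm(np.hstack(Lam_list), axis=1))
    # Low-rank term: nuclear norm of P = sum of its singular values
    nuclear = np.sum(np.linalg.svd(P, compute_uv=False))
    return fit + lam * joint + nuclear
```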
The model is solved by the Alternating Direction Method of Multipliers (ADMM). Introducing the auxiliary variables Z and W, the augmented Lagrange function is defined as:

$$\mathcal{L}=\sum_{i=1}^{K}\left\|P^{T}X_{i}-D_{i}\Lambda_{i}\right\|_{F}^{2}+\lambda\left\|Z\right\|_{1,2}+\left\|W\right\|_{*}+\left\langle A_{Z},\Lambda-Z\right\rangle+\frac{\alpha_{Z}}{2}\left\|\Lambda-Z\right\|_{F}^{2}+\left\langle A_{W},P-W\right\rangle+\frac{\alpha_{W}}{2}\left\|P-W\right\|_{F}^{2}$$

wherein A_Z and A_W are the multipliers of the linear constraints (i.e., Lagrange multipliers), α_Z and α_W are positive parameters, the notation ⟨A, B⟩ denotes tr(A^T B), and tr(·) denotes the trace of a matrix; A_{Z,i} and Z_i denote the sub-matrices of A_Z and Z corresponding to the i-th modality, i.e., Z = [Z_1, Z_2, ..., Z_K]; the dictionary D = [D_1, D_2, ..., D_K].
Solving functions for P, Λ, Z and W according to augmented Lagrange multiplication
Figure BDA0002167509000000057
While maintaining AZAnd AWNot changing, then fixing other variables, for AZAnd AWAnd (6) updating. Having an objective functionWith a distributed structure, to simplify the problem, the problem can be solved by taking the variables P, Λ, Z and W as the unique variables of the objective function, respectively. The solving process for each sub-optimization goal is described in detail below. Since the optimization process is an iterative update solving process, the result of the t-th update is represented by adding subscript t (t ≧ 0) to the corresponding variable in the following equation.
(1) Updating the sparse coefficient Λ.
When solving for the sparse coefficient, the optimization problem is converted into:

$$\min_{\Lambda}\ \sum_{i=1}^{K}\left\|P^{T}X_{i}-D_{i}\Lambda_{i}\right\|_{F}^{2}+\left\langle A_{Z},\Lambda-Z\right\rangle+\frac{\alpha_{Z}}{2}\left\|\Lambda-Z\right\|_{F}^{2}$$

This is a convex function; setting its first-order partial derivative to zero yields the update formula of the sparse coefficient Λ:

$$\Lambda_{i,t+1}=\left(2D_{i,t}^{T}D_{i,t}+\alpha_{Z}I\right)^{-1}\left(2D_{i,t}^{T}P_{t}^{T}X_{i}+\alpha_{Z}Z_{i,t}-A_{Z,i,t}\right)$$

wherein I is the identity matrix and Z_i is a sub-matrix of Z, i.e., Z = [Z_1, Z_2, ..., Z_K].
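A minimal sketch of this closed-form update, one linear solve per modality; it assumes the Lagrangian defined above (the factor of 2 follows from the unhalved data-fidelity term), and all names are illustrative:

```python
import numpy as np

def update_lambda(P, X_list, D_list, Z_list, A_Z_list, alpha_z):
    """Closed-form Lambda update; one linear solve per modality."""
    Lam_list = []
    for X, D, Z, A in zip(X_list, D_list, Z_list, A_Z_list):
        lhs = 2.0 * D.T @ D + alpha_z * np.eye(D.shape[1])
        rhs = 2.0 * D.T @ (P.T @ X) + alpha_z * Z - A
        Lam_list.append(np.linalg.solve(lhs, rhs))
    return Lam_list
```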
(2) Updating the dictionary D.
Fixing the other variables and parameters gives the following optimization problem:

$$\min_{D}\ \sum_{i=1}^{K}\left\|P^{T}X_{i}-D_{i}\Lambda_{i}\right\|_{F}^{2}\quad \text{s.t.}\ \left\|d_{i}^{(j)}\right\|_{2}^{2}\leq 1$$

This is a quadratically constrained quadratic programming (QCQP) problem, which can be solved by a quadratic solver.
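The patent solves this QCQP with a quadratic solver; as a simpler, hedged substitute, the sketch below takes the unconstrained least-squares minimizer and rescales any atom whose norm exceeds 1. This is a common approximation, not the patent's exact procedure:

```python
import numpy as np

def update_dictionary(P, X_list, Lam_list):
    """Approximate dictionary update: least squares plus projection of
    each atom onto the unit l2 ball (NOT the patent's exact QCQP solve)."""
    D_list = []
    for X, Lam in zip(X_list, Lam_list):
        # Unconstrained minimizer of ||P^T X - D Lam||_F^2 via pseudoinverse
        D = (P.T @ X) @ np.linalg.pinv(Lam)
        # Rescale atoms violating ||d_j||_2 <= 1; compliant atoms unchanged
        norms = np.maximum(np.linalg.norm(D, axis=0), 1.0)
        D_list.append(D / norms)
    return D_list
```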
(3) Updating the low-rank projection P.
The optimization problem for the low-rank projection is:

$$\min_{P}\ \sum_{i=1}^{K}\left\|P^{T}X_{i}-D_{i}\Lambda_{i}\right\|_{F}^{2}+\left\langle A_{W},P-W\right\rangle+\frac{\alpha_{W}}{2}\left\|P-W\right\|_{F}^{2}$$

This is a convex function; setting its first-order partial derivative to zero gives:

$$P_{t+1}=\left(2\sum_{i=1}^{K}X_{i}X_{i}^{T}+\alpha_{W}I\right)^{-1}\left(2\sum_{i=1}^{K}X_{i}\left(D_{i,t+1}\Lambda_{i,t+1}\right)^{T}+\alpha_{W}W_{t}-A_{W,t}\right)$$

wherein the superscript T denotes matrix transposition, i.e., (D_iΛ_i)^T denotes the transpose of D_iΛ_i.
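A sketch of this closed-form update, assuming all modalities share one feature dimension so that a single P applies (names illustrative; the orthogonality constraint is handled through W rather than enforced here):

```python
import numpy as np

def update_p(X_list, D_list, Lam_list, W, A_W, alpha_w):
    """Closed-form P update from the first-order condition above."""
    m = X_list[0].shape[0]               # shared feature dimension (assumed)
    lhs = alpha_w * np.eye(m)
    rhs = alpha_w * W - A_W
    for X, D, Lam in zip(X_list, D_list, Lam_list):
        lhs += 2.0 * X @ X.T
        rhs += 2.0 * X @ (D @ Lam).T
    return np.linalg.solve(lhs, rhs)
```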
(4) Updating the auxiliary variable Z.
The optimization problem for the auxiliary variable Z becomes:

$$\min_{Z}\ \lambda\left\|Z\right\|_{1,2}+\left\langle A_{Z},\Lambda-Z\right\rangle+\frac{\alpha_{Z}}{2}\left\|\Lambda-Z\right\|_{F}^{2}$$

which is equivalently transformed into:

$$\min_{Z}\ \frac{\lambda}{\alpha_{Z}}\left\|Z\right\|_{1,2}+\frac{1}{2}\left\|Z-\left(\Lambda+\frac{A_{Z}}{\alpha_{Z}}\right)\right\|_{F}^{2}$$

Since this problem has a separable structure, each row of Z can be treated separately. Let γ_i, a_i and z_i denote the i-th rows of Λ, A_Z and Z respectively. For each row the problem becomes

$$\min_{z_{i}}\ \frac{\lambda}{\alpha_{Z}}\left\|z_{i}\right\|_{2}+\frac{1}{2}\left\|z_{i}-u_{i}\right\|_{2}^{2},\qquad u_{i}=\gamma_{i,t+1}+\frac{a_{i,t}}{\alpha_{Z}}$$

and z_{i,t+1} has the following closed form:

$$z_{i,t+1}=\frac{\left(\left\|u_{i}\right\|_{2}-\lambda/\alpha_{Z}\right)_{+}}{\left\|u_{i}\right\|_{2}}\,u_{i}$$

wherein the operator (c)_+ denotes taking max(c, 0).
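This row-wise shrinkage is the proximal operator of the ℓ_{1,2} norm; a vectorized sketch (names illustrative, with a small epsilon guarding against division by zero):

```python
import numpy as np

def update_z(Lam, A_Z, alpha_z, lam):
    """Row-wise group soft-threshold: prox of (lam/alpha_z)*||.||_{1,2}."""
    U = Lam + A_Z / alpha_z                      # u_i = gamma_i + a_i/alpha_z
    row_norms = np.linalg.norm(U, axis=1, keepdims=True)
    # Rows with ||u_i|| <= lam/alpha_z are set exactly to zero
    scale = np.maximum(1.0 - (lam / alpha_z) / np.maximum(row_norms, 1e-12), 0.0)
    return scale * U
```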
(5) Updating the auxiliary variable W.
Updating the auxiliary variable W requires solving:

$$\min_{W}\ \left\|W\right\|_{*}+\left\langle A_{W},P-W\right\rangle+\frac{\alpha_{W}}{2}\left\|P-W\right\|_{F}^{2}$$

which is equivalently transformed into:

$$\min_{W}\ \frac{1}{\alpha_{W}}\left\|W\right\|_{*}+\frac{1}{2}\left\|W-\left(P+\frac{A_{W}}{\alpha_{W}}\right)\right\|_{F}^{2}$$

This is a shrinkage problem, and its solution is:

$$W_{t+1}=F\,S\left(\Sigma,1/\alpha_{W}\right)B^{T}$$

wherein FΣB^T is the singular value decomposition (SVD) of P_{t+1} + A_{W,t}/α_W, and the values of the function S(a, b) are: when |a| ≥ b, S(a, b) = sgn(a)(|a| − b); when |a| < b, S(a, b) = 0.
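This is the standard singular value thresholding operator; a sketch (names illustrative):

```python
import numpy as np

def update_w(P, A_W, alpha_w):
    """Singular value thresholding: soft-threshold the singular values
    of P + A_W/alpha_w at 1/alpha_w, then rebuild the matrix."""
    F, sig, Bt = np.linalg.svd(P + A_W / alpha_w, full_matrices=False)
    return (F * np.maximum(sig - 1.0 / alpha_w, 0.0)) @ Bt
```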
(6) Updating the parameters A_Z and A_W.
The update formulas of the Lagrange multipliers A_Z and A_W are:

$$A_{Z,t+1}=A_{Z,t}+\alpha_{Z}\left(\Lambda_{t+1}-Z_{t+1}\right),\qquad A_{W,t+1}=A_{W,t}+\alpha_{W}\left(P_{t+1}-W_{t+1}\right)$$
In summary, to solve for the low-rank common projection P, the dictionaries D_i and their coefficient matrices Λ_i, the specific solving process is as follows:
Step 1: initializing the parameters: the sparse coefficient Λ_0, the low-rank projection matrix P_0, the dictionary D_0, the auxiliary variables Z_0 and W_0, the linear-constraint multipliers A_{Z,0} and A_{W,0}, the Lagrange parameters α_Z and α_W, and the maximum number of iterations k.
Step 2: updating the sparse coefficient Λ by the closed-form formula of sub-problem (1).
Step 3: updating the dictionary D by solving the QCQP of sub-problem (2) with a quadratic problem solver.
Step 4: updating the low-rank projection P by the closed-form formula of sub-problem (3).
Step 5: updating the auxiliary variable Z by the row-wise shrinkage formula of sub-problem (4).
Step 6: updating the auxiliary variable W by the singular-value shrinkage formula of sub-problem (5).
Step 7: updating the Lagrange multipliers by A_{Z,t+1} = A_{Z,t} + α_Z(Λ_{t+1} − Z_{t+1}) and A_{W,t+1} = A_{W,t} + α_W(P_{t+1} − W_{t+1}).
Step 8: judging the number of iterations: when the iteration count t < k, set t = t + 1 and return to Step 2; when t ≥ k, terminate and output the obtained projection matrix P and dictionary D.
Through the steps, the training process of the optimization model is completed, and a projection matrix P and a dictionary D are obtained.
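The following sketch wires the above steps into a training loop. It assumes the update functions from the preceding sketches (update_lambda, update_dictionary, update_p, update_z, update_w) are defined in the same scope; the initialization strategy and the hyperparameter defaults are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def train(X_list, d, n_atoms, lam=0.1, alpha_z=1.0, alpha_w=1.0, k=15, seed=0):
    """ADMM training loop for P and D (illustrative wiring of the sketches)."""
    rng = np.random.default_rng(seed)
    m, n = X_list[0].shape
    K = len(X_list)
    P = np.linalg.qr(rng.standard_normal((m, d)))[0]    # orthonormal P_0
    D_list = [rng.standard_normal((d, n_atoms)) for _ in range(K)]
    D_list = [D / np.linalg.norm(D, axis=0) for D in D_list]
    Z = np.zeros((n_atoms, K * n))
    A_Z = np.zeros_like(Z)
    W = np.zeros((m, d))
    A_W = np.zeros_like(W)
    for t in range(k):
        Z_list, A_Z_list = np.hsplit(Z, K), np.hsplit(A_Z, K)
        Lam_list = update_lambda(P, X_list, D_list, Z_list, A_Z_list, alpha_z)
        D_list = update_dictionary(P, X_list, Lam_list)
        P = update_p(X_list, D_list, Lam_list, W, A_W, alpha_w)
        Lam = np.hstack(Lam_list)
        Z = update_z(Lam, A_Z, alpha_z, lam)
        W = update_w(P, A_W, alpha_w)
        A_Z += alpha_z * (Lam - Z)                      # dual ascent steps
        A_W += alpha_w * (P - W)
    return P, D_list
```

Consistent with the convergence observation of Fig. 3, a maximum iteration count k between 10 and 15 is a reasonable default.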
Step 9: solving the joint sparse coefficient $\hat{\Lambda}$ of the test samples.
In the test process, the test samples of the K modalities are denoted {Y_1, Y_2, ..., Y_K}. After the test samples are projected through the low-rank projection matrix P, their joint sparse representation with respect to the dictionary D is solved, and the joint sparse coefficient $\hat{\Lambda}$ is obtained from:

$$\hat{\Lambda}=\arg\min_{\Lambda}\ \sum_{i=1}^{K}\left\|P^{T}Y_{i}-D_{i}\Lambda_{i}\right\|_{F}^{2}+\lambda\left\|\Lambda\right\|_{1,2}$$

wherein the value of λ is the same as in the training process.
Step 10: the classifier classifies according to the joint sparse coefficient to obtain the recognition result.
For example, based on the actual application scene, category labels matched with different joint sparse coefficients are preset, so that the joint sparse coefficient obtained from the current solution is matched with the corresponding category label, giving the category recognition result of the current image to be classified.
The classifier includes, but is not limited to, the KNN classifier, the support vector machine (SVM) and the naive Bayes classifier.
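As an illustration of Steps 9 and 10, the sketch below codes the projected test samples against the learned dictionaries with a proximal-gradient (ISTA) loop, a common substitute for re-running the ADMM solve, and classifies the fused coefficients with a 1-nearest-neighbour rule. All names, the solver choice and the defaults are illustrative assumptions:

```python
import numpy as np

def joint_sparse_code(P, D_list, Y_list, lam=0.1, n_iter=100):
    """Joint sparse coding of test samples with P and D fixed, via
    proximal gradient (ISTA) with the row-wise group soft-threshold."""
    R_list = [P.T @ Y for Y in Y_list]             # projected test features
    K, n_atoms = len(D_list), D_list[0].shape[1]
    Lam = np.zeros((n_atoms, K * Y_list[0].shape[1]))
    # Step size 1/L, with L the Lipschitz constant of the smooth fit term
    step = 1.0 / (2.0 * max(np.linalg.norm(D, 2) ** 2 for D in D_list))
    for _ in range(n_iter):
        G = np.hstack([2.0 * D.T @ (D @ L - R)
                       for D, R, L in zip(D_list, R_list, np.hsplit(Lam, K))])
        U = Lam - step * G                          # gradient step
        rn = np.linalg.norm(U, axis=1, keepdims=True)
        Lam = np.maximum(1.0 - step * lam / np.maximum(rn, 1e-12), 0.0) * U
    return Lam

def knn_classify(Lam_train, labels, Lam_test):
    """1-nearest-neighbour over coefficient columns (Euclidean distance)."""
    preds = []
    for j in range(Lam_test.shape[1]):
        dist = np.linalg.norm(Lam_train - Lam_test[:, [j]], axis=0)
        preds.append(labels[np.argmin(dist)])
    return np.array(preds)
```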
Examples
To further verify the recognition performance of the invention, simulation verification was carried out in MATLAB 2016. For ease of analysis, the simulation considers face recognition in near-infrared/visible-light scenes and multi-view scenes. The eight existing classification methods selected for the comparative experiments are: SCDL (Semi-Coupled Dictionary Learning), CDL (Coupled Dictionary Learning), GCDL1 and GCDL2 (Generalized Coupled Dictionary Learning), PCA (Principal Component Analysis), SRRS (Supervised Regularization-based Robust Subspace), LRCS (Low-Rank Common Subspace) and CLRS (Collective Low-Rank Subspace); among them, SCDL, CDL, GCDL1 and GCDL2 are dictionary-learning-based methods, while PCA, SRRS, LRCS and CLRS are common-subspace-learning-based methods.
Comparing the method of the invention (Ours) with the above eight existing methods, recognition-rate comparison tests were performed on a face data set with two different viewing angles (CMU Multi-PIE); the specific comparison is shown in Table 1, where Case 1 to Case 6 represent different viewing-angle combinations, i.e., two different viewing angles are selected from CMU Multi-PIE and combined, giving 6 different combinations for simulation verification.
TABLE 1
[Table 1 appears as an image in the original publication; its recognition-rate entries for Cases 1 to 6 are not reproducible here.]
As can be seen from Table 1, the multi-modal image recognition method based on low rank and joint sparsity of the invention achieves a better recognition rate than the existing methods.
Fig. 1 shows the comparison of the recognition rate of the invention with the above eight existing classification methods on the face data set of near-infrared and visible-light scenes (CASIA HFB), where the recognition rate of the invention is the highest.
Fig. 2 shows the parameter characteristic diagram of the invention when recognition is performed in the near-infrared and visible-light scene. The numbers 1 to 8 on the abscissa respectively represent the following values of the regularization parameter λ: 0.001, 0.01, 0.1, 0.5, 1, 5, 10 and 100.
Fig. 3 shows the convergence graph when recognition is performed in the near-infrared and visible-light scenes, where the abscissa is the number of iterations, the circled curve is the convergence curve (objective value) and the curve marked with "x" is the recognition-rate curve (recognition rate). As can be seen from Fig. 3, in this embodiment the optimal maximum number of iterations can be set between 10 and 15, reducing the computation while guaranteeing the recognition rate.
While the invention has been described with reference to specific embodiments, any feature disclosed in this specification may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise; all of the disclosed features, or all of the method or process steps, may be combined in any combination, except mutually exclusive features and/or steps.

Claims (3)

1. A multi-modal image recognition method based on low rank and joint sparsity is characterized by comprising the following steps:
step S1: training a low-rank projection matrix P for multi-modal recognition and a dictionary D:
step S101: constructing an optimization model:

$$\min_{P,D,\Lambda}\ \sum_{i=1}^{K}\left\|P^{T}X_{i}-D_{i}\Lambda_{i}\right\|_{F}^{2}+\lambda\left\|\Lambda\right\|_{1,2}+\left\|P\right\|_{*}\quad \text{s.t.}\ P^{T}P=I,\ \left\|d_{i}^{(j)}\right\|_{2}^{2}\leq 1$$

wherein K represents the number of modalities, C represents the number of categories, ‖·‖_F represents the Frobenius norm, ‖·‖_{1,2} represents the joint sparse norm, ‖·‖_* represents the nuclear norm, and the superscript T represents matrix transposition;
X_i represents the feature matrix of the training samples of the i-th modality, and X_i is an m_i × n matrix, where m_i represents the feature dimension of the training samples of the i-th modality and n represents the number of samples contained in each modality; D_i represents the dictionary of the i-th modality; Λ_i represents the coefficient matrix of the dictionary D_i; the sparse coefficient Λ = [Λ_1, Λ_2, ..., Λ_K]; λ represents a regularization parameter; d_i^{(j)} represents an atom of the dictionary D_i of the i-th modality; I represents an identity matrix;
step S102: based on a preset training sample set, solving the constructed optimization model by the alternating direction method of multipliers (ADMM) to obtain the low-rank projection matrix P and the dictionary D;
step S2: based on the feature matrices Y_1, Y_2, ..., Y_K of the different modalities of the object to be classified, projecting the object to be classified through the low-rank projection matrix P and then solving its joint sparse representation with respect to the dictionary D, thereby obtaining the joint sparse coefficient $\hat{\Lambda}$ of the object to be classified;
step S3: carrying out classification processing based on the joint sparse coefficient $\hat{\Lambda}$ of the object to be classified to obtain its classification and recognition result.
2. The method of claim 1, wherein the step S102 of obtaining the low rank projection matrix P and the dictionary D comprises the steps of:
step S102-1: initializing the parameters, including the sparse coefficient Λ_0, the low-rank projection matrix P_0, the dictionary D_0, the auxiliary variables Z_0 and W_0, the Lagrange multipliers A_{Z,0} and A_{W,0}, and the Lagrange parameters α_Z, α_W; setting the iteration counter t = 0; and setting the maximum number of iterations k;
wherein the dictionary D_0 = [D_{1,0}, D_{2,0}, ..., D_{K,0}] and the sparse coefficient Λ_0 = [Λ_{1,0}, Λ_{2,0}, ..., Λ_{K,0}]; the matrix dimensions of the auxiliary variable Z_0 are the same as those of the sparse coefficient Λ_0, and the matrix dimensions of the auxiliary variable W_0 are the same as those of the low-rank projection matrix P_0;
step S102-2: updating the sparse coefficient Λ:
by the formula

$$\Lambda_{i,t+1}=\left(2D_{i,t}^{T}D_{i,t}+\alpha_{Z}I\right)^{-1}\left(2D_{i,t}^{T}P_{t}^{T}X_{i}+\alpha_{Z}Z_{i,t}-A_{Z,i,t}\right)$$

obtaining the coefficient matrix Λ_{i,t+1} of the dictionary corresponding to the i-th modality after the (t+1)-th update, thereby obtaining the updated sparse coefficient Λ_{t+1};
Step S102-3: updating the dictionary D:
solving the equation by a quadratic problem solver
Figure FDA0002167508990000024
s.t.
Figure FDA0002167508990000025
Thus obtaining t +1 updated dictionary Dt+1
Step S102-4: update low rank projection P:
by the formula
Figure FDA0002167508990000026
Obtaining the low-rank projection matrix P after the t +1 time of updatingt+1
Step S102-5: updating the auxiliary variable Z:
by the formula
Figure FDA0002167508990000027
Obtaining the row vector Z of the ith row of the auxiliary variable Z after the t +1 th updatei,t+1So as to obtain the t +1 updated auxiliary variable Zt+1
Wherein,
Figure FDA0002167508990000028
γi,t+1
Figure FDA00021675089900000211
are respectively Λt+1、AZ,tThe row vector of the ith row of (1);
step S102-6: updating the auxiliary variable W:
by the formula

$$W_{t+1}=F\,S\left(\Sigma,1/\alpha_{W}\right)B^{T}$$

obtaining the (t+1)-times-updated auxiliary variable W_{t+1};
wherein FΣB^T is the singular value decomposition of P_{t+1} + A_{W,t}/α_W; the function S(a, b) takes the values: when |a| ≥ b, S(a, b) = sgn(a)(|a| − b); when |a| < b, S(a, b) = 0;
step S102-7: updating the Lagrange multipliers A_Z and A_W:
by the formulas A_{Z,t+1} = A_{Z,t} + α_Z(Λ_{t+1} − Z_{t+1}) and A_{W,t+1} = A_{W,t} + α_W(P_{t+1} − W_{t+1}), obtaining the Lagrange multipliers A_{Z,t+1} and A_{W,t+1} after the (t+1)-th update;
Step S102-8: judging whether the iteration time t reaches the maximum iteration time k, if so, updating the latest Pt+1And Dt+1As training result values of the projection matrix P and the dictionary D; otherwise, t +1 is updated, and the procedure returns to step S102-2.
3. The method of claim 2, wherein the maximum number of iterations k ranges from 10 to 15.
CN201910751979.9A 2019-08-15 2019-08-15 Multi-modal image recognition method based on low-rank and joint sparsity Active CN110633732B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910751979.9A CN110633732B (en) 2019-08-15 2019-08-15 Multi-modal image recognition method based on low-rank and joint sparsity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910751979.9A CN110633732B (en) 2019-08-15 2019-08-15 Multi-modal image recognition method based on low-rank and joint sparsity

Publications (2)

Publication Number Publication Date
CN110633732A true CN110633732A (en) 2019-12-31
CN110633732B CN110633732B (en) 2022-05-03

Family

ID=68969698

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910751979.9A Active CN110633732B (en) 2019-08-15 2019-08-15 Multi-modal image recognition method based on low-rank and joint sparsity

Country Status (1)

Country Link
CN (1) CN110633732B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112149053A (en) * 2020-08-19 2020-12-29 江苏大学 Multi-view image characterization method based on low-rank correlation analysis
CN112541554A (en) * 2020-12-18 2021-03-23 华中科技大学 Multi-modal process monitoring method and system based on time constraint kernel sparse representation
CN116246712A (en) * 2023-02-13 2023-06-09 中国人民解放军军事科学院军事医学研究院 Data subtype classification method with sparse constraint multi-mode matrix joint decomposition

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103632138A (en) * 2013-11-20 2014-03-12 南京信息工程大学 Low-rank partitioning sparse representation human face identifying method
CN103761537A (en) * 2014-02-07 2014-04-30 重庆市国土资源和房屋勘测规划院 Image classification method based on low-rank optimization feature dictionary model
US20170024855A1 (en) * 2015-07-26 2017-01-26 Macau University Of Science And Technology Single Image Super-Resolution Method Using Transform-Invariant Directional Total Variation with S1/2+L1/2-norm
CN107563968A (en) * 2017-07-26 2018-01-09 昆明理工大学 A kind of method based on the group medicine image co-registration denoising for differentiating dictionary learning
CN107977949A (en) * 2017-07-26 2018-05-01 昆明理工大学 A kind of method improved based on projection dictionary to the Medical image fusion quality of study
CN108460412A (en) * 2018-02-11 2018-08-28 北京盛安同力科技开发有限公司 A kind of image classification method based on subspace joint sparse low-rank Structure learning
CN109215780A (en) * 2018-08-24 2019-01-15 齐鲁工业大学 The multi-modal data analysis method and system of high Laplace regularization low-rank representation
CN109447009A (en) * 2018-11-02 2019-03-08 南京审计大学 Hyperspectral image classification method based on subspace nuclear norm regularized regression model
CN109522956A (en) * 2018-11-16 2019-03-26 哈尔滨理工大学 A kind of low-rank differentiation proper subspace learning method
CN110069978A (en) * 2019-03-04 2019-07-30 杭州电子科技大学 The face identification method that the non-convex low-rank decomposition of identification and superposition Sparse indicate

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103632138A (en) * 2013-11-20 2014-03-12 南京信息工程大学 Low-rank partitioning sparse representation human face identifying method
CN103761537A (en) * 2014-02-07 2014-04-30 重庆市国土资源和房屋勘测规划院 Image classification method based on low-rank optimization feature dictionary model
US20170024855A1 (en) * 2015-07-26 2017-01-26 Macau University Of Science And Technology Single Image Super-Resolution Method Using Transform-Invariant Directional Total Variation with S1/2+L1/2-norm
CN107563968A (en) * 2017-07-26 2018-01-09 昆明理工大学 A kind of method based on the group medicine image co-registration denoising for differentiating dictionary learning
CN107977949A (en) * 2017-07-26 2018-05-01 昆明理工大学 A kind of method improved based on projection dictionary to the Medical image fusion quality of study
CN108460412A (en) * 2018-02-11 2018-08-28 北京盛安同力科技开发有限公司 A kind of image classification method based on subspace joint sparse low-rank Structure learning
CN109215780A (en) * 2018-08-24 2019-01-15 齐鲁工业大学 The multi-modal data analysis method and system of high Laplace regularization low-rank representation
CN109447009A (en) * 2018-11-02 2019-03-08 南京审计大学 Hyperspectral image classification method based on subspace nuclear norm regularized regression model
CN109522956A (en) * 2018-11-16 2019-03-26 哈尔滨理工大学 A kind of low-rank differentiation proper subspace learning method
CN110069978A (en) * 2019-03-04 2019-07-30 杭州电子科技大学 The face identification method that the non-convex low-rank decomposition of identification and superposition Sparse indicate

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
DING Z et al.: "Low-rank embedded ensemble semantic dictionary for zero-shot learning", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition *
LIU G et al.: "Robust recovery of subspace structures by low-rank representation", IEEE Transactions on Pattern Analysis and Machine Intelligence *
SUN B et al.: "Fusion of noisy images based on joint distribution model in dual-tree complex wavelet domain", International Journal of Imaging Systems and Technology *
DENG Zhihua et al.: "Medical image fusion based on low-rank sparse decomposition and saliency measure" (in Chinese), Optical Technique *
GAO Shibo et al.: "Research progress of sparse representation methods for target detection" (in Chinese), Acta Electronica Sinica *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112149053A (en) * 2020-08-19 2020-12-29 江苏大学 Multi-view image characterization method based on low-rank correlation analysis
CN112541554A (en) * 2020-12-18 2021-03-23 华中科技大学 Multi-modal process monitoring method and system based on time constraint kernel sparse representation
CN112541554B (en) * 2020-12-18 2024-03-22 华中科技大学 Multi-mode process monitoring method and system based on time constraint and nuclear sparse representation
CN116246712A (en) * 2023-02-13 2023-06-09 中国人民解放军军事科学院军事医学研究院 Data subtype classification method with sparse constraint multi-mode matrix joint decomposition
CN116246712B (en) * 2023-02-13 2024-03-26 中国人民解放军军事科学院军事医学研究院 Data subtype classification method with sparse constraint multi-mode matrix joint decomposition

Also Published As

Publication number Publication date
CN110633732B (en) 2022-05-03

Similar Documents

Publication Publication Date Title
Tang et al. Learning a joint affinity graph for multiview subspace clustering
Patel et al. Latent space sparse and low-rank subspace clustering
Patel et al. Kernel sparse subspace clustering
Zheng et al. Iterative re-constrained group sparse face recognition with adaptive weights learning
Yan et al. Graph embedding and extensions: A general framework for dimensionality reduction
Zhang et al. Graph based constrained semi-supervised learning framework via label propagation over adaptive neighborhood
He et al. Robust principal component analysis based on maximum correntropy criterion
CN110633732B (en) Multi-modal image recognition method based on low-rank and joint sparsity
Li et al. Mutual component analysis for heterogeneous face recognition
CN107392107B (en) Face feature extraction method based on heterogeneous tensor decomposition
Zheng et al. A novel approach inspired by optic nerve characteristics for few-shot occluded face recognition
Nguyen et al. Kernel low-rank representation for face recognition
Lu et al. Nuclear norm-based 2DLPP for image classification
CN104715266B (en) The image characteristic extracting method being combined based on SRC DP with LDA
CN107918761A (en) A kind of single sample face recognition method based on multiple manifold kernel discriminant analysis
Nguyen et al. Discriminative low-rank dictionary learning for face recognition
Puthenputhussery et al. A sparse representation model using the complete marginal fisher analysis framework and its applications to visual recognition
Li et al. Robust subspace clustering with independent and piecewise identically distributed noise modeling
CN111310813A (en) Subspace clustering method and device for potential low-rank representation
Jin et al. Multiple graph regularized sparse coding and multiple hypergraph regularized sparse coding for image representation
Zhang et al. Cost-sensitive joint feature and dictionary learning for face recognition
He et al. Low-rank representation with graph regularization for subspace clustering
Chen et al. Semi-supervised dictionary learning with label propagation for image classification
Li et al. Unsupervised active learning via subspace learning
Givens et al. Biometric face recognition: from classical statistics to future challenges

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant