CN116401528A - Multi-element time sequence unsupervised dimension reduction method based on global-local divergence - Google Patents

Multi-element time sequence unsupervised dimension reduction method based on global-local divergence Download PDF

Info

Publication number
CN116401528A
Authority
CN
China
Prior art keywords
neighborhood
sequence
global
feature
dimension reduction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310278160.1A
Other languages
Chinese (zh)
Inventor
李正欣
胡钢
刘嘉
吴虎胜
吴丹阳
刘斌
周漩
杨波
吴诗辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Air Force Engineering University of PLA
Original Assignee
Air Force Engineering University of PLA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Air Force Engineering University of PLA
Priority to CN202310278160.1A
Publication of CN116401528A
Legal status: Pending


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213: Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods

Abstract

The invention discloses a multi-element time sequence unsupervised dimension reduction method based on global-local divergence, which comprises the following steps: S1, calculate the covariance matrix of each multivariate time series, extract its upper-triangular elements and combine them into a feature sequence, obtaining the feature set Fea = {f_i | i = 1, 2, …, n} of the multivariate time series; S2, establish a neighborhood set N_k(f_i) = {f_j | j = 1, 2, …, k} for each feature sequence using k nearest neighbors under the Euclidean distance (ED) measure; S3, after finding the neighborhood of each sample point, calculate the neighborhood center sequence m_i of each feature sequence f_i in the feature set; S4, represent the local divergence by the neighborhood variance of the projected sample points: first calculate the variance of each sample point's projected neighborhood set, then sum these variances to obtain the local divergence; S5, calculate the variance of the neighborhood center points obtained in step S3 to obtain the global divergence. Experimental results show that the low-dimensional projection sequences obtained by the method can represent the original MTS and achieve a marked dimension reduction effect.

Description

Multi-element time sequence unsupervised dimension reduction method based on global-local divergence
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a multi-element time sequence unsupervised dimension reduction method based on global-local divergence.
Background
Time series generally refers to time-varying data acquired by various sensors, and is widely used in fields such as environmental science, medicine and finance. According to the number of variables, time series can be classified into univariate time series (UTS) and multivariate time series (MTS). An MTS can be seen as a combination of the UTS generated by different factors in the same system. Compared with a UTS, an MTS is high-dimensional not only in the time dimension but also in the feature dimension, and correlations exist between features. Data mining for MTS is therefore more complex than for UTS.
At present, data mining of MTS is applied to industrial fault monitoring. The data in fault monitoring are mostly obtained through sensors, so they contain a large amount of noise; the monitoring data collected by the sensors are high-dimensional in both the feature dimension and the time dimension, which makes data processing during fault monitoring difficult and inefficient. How to effectively reduce the dimensionality of multivariate time series data in fault monitoring is therefore the key to solving this problem.
Time series data mining generally includes clustering, classification, prediction, anomaly detection, correlation analysis, etc. These mining tasks are closely related to the size and complexity of the data, and MTS is high-dimensional in two dimensions at once, so dimension reduction or feature representation is usually required before mining to reduce data complexity and mitigate the interference caused by redundant information. The prior art is mainly divided into dimension reduction of the feature dimension, dimension reduction of the time dimension, and dimension reduction of both dimensions simultaneously. Dimension reduction of the feature dimension alone cannot solve the problem that MTS of different lengths have unequal time lengths, which complicates the similarity measurement in subsequent data mining; dimension reduction of the time dimension alone ignores the variable correlations in the feature dimension, so information redundancy may remain after dimension reduction; and reducing both dimensions simultaneously generally requires a bidirectional dimension reduction technique with a high computational cost.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a multi-element time sequence unsupervised dimension reduction method based on global-local divergence.
In order to achieve the above purpose, the invention adopts the following technical scheme:
the multi-element time sequence unsupervised dimension reduction method based on global-local divergence is characterized by comprising the following steps:
s1: acquiring fault monitoring data through a sensor, forming a multi-element time sequence of the acquired fault monitoring data into a multi-element time sequence original data set D, respectively calculating covariance matrixes of each multi-element time sequence in the data set, extracting upper triangle elements of the covariance matrixes, and combining the upper triangle elements into a characteristic sequence; all the feature sequences form a feature sequence set, and the length of each feature sequence is the same;
s2: based on the obtained feature sequence set, measuring by using k neighbor numbers and Euclidean distances to establish each sample neighbor set in the feature sequence set;
s3: calculating a neighborhood center sequence of each neighborhood set in the feature set according to the neighborhood sets obtained in the step S2;
s4: calculating the variance of each neighborhood set of the sample point to be projected according to the neighborhood sets obtained in the step S2, and then accumulating and summing the variances to calculate the local divergence;
s5: calculating a neighborhood global variance according to the neighborhood central point obtained in the step S3 to obtain a global divergence;
s6: according to the local divergence and the global divergence obtained in the steps S4 and S5, solving a projection matrix;
s7: according to the projection matrix obtained in the step S6, the feature sequence set obtained in the step S1 is projected, so that a dimension-reduced feature sequence is obtained:
y_i = W^T f_i (6);

s8: according to the dimension-reduced feature sequences, obtain the dimension-reduced feature set D' = {y_i | i = 1, 2, …, n} of the fault monitoring data, where y_i ∈ ℝ^d;
s9: and processing the fault monitoring data after the dimension reduction to obtain a fault monitoring result.
Further, the specific operation steps of step S1 include:
s11: form the multivariate time series of the acquired fault monitoring data into the multivariate time series original data set D = {X_i | i = 1, 2, …, n}, where n is the number of samples and

X_i = (x_1, x_2, …, x_m)^T ∈ ℝ^{m×t_i} (7)

represents the ith MTS sample in the original data set; x_j (j = 1, 2, …, m) is the series of observations of the jth variable, m is the number of variables, and t_i is the time length of the ith multivariate time series. Zero-mean each multivariate time series: X_i = X_i − E(X_i);
S12: compute the covariance matrix of each zero-meaned multivariate time series; the covariance matrix of the ith MTS is

Σ_i = (1/t_i) X_i X_i^T ∈ ℝ^{m×m}

s13: extract the upper-triangular elements of the covariance matrix Σ_i and form them row by row into the row vector:

f_i = (a_11, a_12, …, a_1m, a_22, …, a_2m, …, a_mm) (8)

Taking f_i as the feature sequence of the ith MTS yields the feature set Fea = {f_i | i = 1, 2, …, n} of the multivariate time series.
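Steps S11 to S13 can be sketched in Python as follows. This is an illustrative sketch, not the patent's own code: the function name `mts_feature_sequence` and the 1/t covariance normalization are assumptions (the patent renders its covariance formula only as an image).

```python
import numpy as np

def mts_feature_sequence(X):
    """Turn one MTS sample X (m variables x t time steps) into an
    equal-length feature sequence: the upper-triangular elements of its
    covariance matrix, read row by row (steps S11-S13)."""
    X = X - X.mean(axis=1, keepdims=True)   # zero-mean each variable (S11)
    t = X.shape[1]
    cov = (X @ X.T) / t                     # m x m covariance matrix (S12, 1/t assumed)
    iu = np.triu_indices(cov.shape[0])      # row-wise upper-triangle indices
    return cov[iu]                          # feature sequence of length m(m+1)/2
```

Because the covariance matrix is symmetric, the upper triangle loses no information, and MTS samples of different time lengths all map to feature sequences of the same length m(m+1)/2.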
Further, in step S3 the neighborhood center sequence m_i is calculated as:

m_i = (1/k) Σ_{f_j ∈ N_k(f_i)} f_j (1)

where k is the number of neighbors and m_i is the neighborhood center sequence of the k neighbors of f_i.
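Steps S2 and S3 (k-nearest-neighbor search under the Euclidean distance, followed by the neighborhood centers) can be sketched as below. The function name and the choice to exclude each point from its own neighborhood are assumptions for illustration:

```python
import numpy as np

def neighborhoods(Fea, k):
    """For each feature sequence f_i (row of the n x p array Fea), find its
    k nearest neighbours under the Euclidean distance (step S2) and the
    neighbourhood centre m_i as their mean (step S3)."""
    d2 = ((Fea[:, None, :] - Fea[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    np.fill_diagonal(d2, np.inf)         # a point is not its own neighbour (assumption)
    idx = np.argsort(d2, axis=1)[:, :k]  # indices of the k nearest neighbours
    centers = Fea[idx].mean(axis=1)      # n x p neighbourhood centre sequences m_i
    return idx, centers
```

The brute-force distance matrix keeps the sketch short; for large n a spatial index would be the usual substitute.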
Further, the specific steps of S4 include:
s41: based on the neighborhood sets, calculate the variance of each projected neighborhood set and sum these variances to obtain the local divergence:

J_L = Σ_{i=1}^{n} Σ_{f_j ∈ N_k(f_i)} ||y_j − p_i||² (2)

where p_i = W^T m_i is the low-dimensional projection of the neighborhood center point, y_j is the low-dimensional projection of the sample point, W is the projection matrix, and the subscript L stands for Local;
s42: transforming formula (2) gives:

J_L = tr(W^T S_L W) (9)

where S_L = Σ_{i=1}^{n} S_Li, and S_Li, the local divergence matrix of the ith feature sequence f_i, is:

S_Li = Σ_{f_j ∈ N_k(f_i)} (f_j − m_i)(f_j − m_i)^T
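A minimal sketch of accumulating the local divergence matrix S_L of step S4 from precomputed neighborhoods; the function name and the array layout (rows are feature sequences) are assumptions:

```python
import numpy as np

def local_scatter(Fea, idx, centers):
    """Accumulate the local divergence matrix of step S4:
    S_L = sum_i sum_{f_j in N_k(f_i)} (f_j - m_i)(f_j - m_i)^T,
    where idx holds each sample's neighbour indices and centers the m_i."""
    p = Fea.shape[1]
    S_L = np.zeros((p, p))
    for nbrs, m_i in zip(idx, centers):
        diff = Fea[nbrs] - m_i          # k x p deviations from the centre
        S_L += diff.T @ diff            # per-neighbourhood scatter, summed over i
    return S_L
```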
further, the specific step of S5 includes:
s51: calculating a neighborhood global variance according to a neighborhood center point in the neighborhood center sequence to obtain a global divergence calculation formula:
Figure BDA0004137076730000045
wherein p is i =W T m i A low-dimensional projection for a neighborhood center point; subscript G is an abbreviation for Global;
s52: the transformation of formula (3) can be obtained:
Figure BDA0004137076730000046
wherein S is G Is a global divergence matrix:
Figure BDA0004137076730000047
s53: the formula (12) is simplified to obtain:
Figure BDA0004137076730000051
wherein, the liquid crystal display device comprises a liquid crystal display device,
Figure BDA0004137076730000052
is the center point of all neighborhood centers, namely the global neighborhood center.
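Given the neighborhood centers, the global divergence matrix S_G of step S5 reduces to a few lines. A sketch under the same array-layout assumption as above (rows are center sequences m_i):

```python
import numpy as np

def global_scatter(centers):
    """Global divergence matrix of step S5:
    S_G = sum_i (m_i - m_bar)(m_i - m_bar)^T, with m_bar the mean of
    all neighbourhood centres (the global neighbourhood centre)."""
    dev = centers - centers.mean(axis=0)   # deviations from the global centre
    return dev.T @ dev                     # p x p scatter of the centres
```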
Further, the specific steps of S6 include:
s61: combining the local divergence of formula (9) and the global divergence of formula (11) yields formula (4):

max_W tr(W^T S_G W) / tr(W^T S_L W) (4)

s62: formula (4) is converted into a generalized eigenvalue problem, and the projection matrix is obtained by solving it:

S_G ω = λ S_L ω (5)

where ω is a generalized eigenvector and λ is a generalized eigenvalue;
s63: solving formula (5) yields the first d largest eigenvalues λ_1, λ_2, …, λ_d (d < p) and the corresponding eigenvectors ω_1, ω_2, …, ω_d; the projection matrix is W = (ω_1, ω_2, …, ω_d).
Compared with the prior art, the invention has the beneficial effects that:
the invention firstly provides a feature sequence extraction method, which extracts the upper triangle element of a multi-element time sequence covariance matrix and combines the upper triangle element into a feature sequence. Then, an unsupervised dimension reduction model is provided by taking the basic ideas of minimum local divergence and maximum global divergence, and global information is reserved as much as possible while the local neighbor relation is maintained. And taking the characteristic sequence as input, minimizing the sum of all sample point neighborhood variances, and maximizing the neighborhood center point variance. The projection matrix obtained by solving the model can realize dimension reduction of the multi-element time sequence; finally, the dimension reduction method and the related comparison method provided by the invention are subjected to experimental verification through 20 groups of public data sets, and experimental results show that the dimension reduction method provided by the invention can effectively reduce the dimension of the MTS data set, thereby improving the fault monitoring accuracy.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention discloses a multi-element time sequence unsupervised dimension reduction method based on global-local divergence, which specifically comprises the following steps:
S1, acquire fault monitoring data through sensors, obtain the corresponding multivariate time series, form the multivariate time series original data set D, and convert it into the equal-length feature sequence set Fea = {f_i | i = 1, 2, …, n};
Specifically, the specific operation steps of S1 include:
S11, form the multivariate time series into the original data set D = {X_i | i = 1, 2, …, n}, where n is the number of samples and

X_i = (x_1, x_2, …, x_m)^T ∈ ℝ^{m×t_i} (7)

represents the ith MTS sample in the data set; x_j (j = 1, 2, …, m) is the series of observations of the jth variable, m is the number of variables, and t_i is the time length of the ith MTS. Zero-mean each MTS, i.e. X_i = X_i − E(X_i);
S12, compute the covariance matrix of each zero-meaned multivariate time series; the covariance matrix of the ith MTS is

Σ_i = (1/t_i) X_i X_i^T ∈ ℝ^{m×m}

which is a symmetric matrix.
Extract the upper-triangular elements of Σ_i and form them row by row into the row vector:

f_i = (a_11, a_12, …, a_1m, a_22, …, a_2m, …, a_mm) (8)

Taking f_i as the feature sequence of the ith MTS yields the feature set Fea = {f_i | i = 1, 2, …, n} of the MTS.
S2, based on the obtained feature set Fea, use k nearest neighbors under the Euclidean distance (ED) measure to obtain the neighborhood set N_k(f_i) = {f_j | j = 1, 2, …, k} of each sample in the feature set Fea;
S3, according to the neighborhood sets obtained in step S2, calculate the neighborhood center sequence m_i of each neighborhood set in the feature set Fea:

m_i = (1/k) Σ_{f_j ∈ N_k(f_i)} f_j (1)

where k is the number of neighbors and m_i is the neighborhood center sequence of the k neighbors of f_i;
S4, according to the neighborhood sets obtained in step S2, compute the variance of each projected neighborhood set and sum these variances to obtain the local divergence:

J_L = Σ_{i=1}^{n} Σ_{f_j ∈ N_k(f_i)} ||y_j − p_i||² (2)

where p_i = W^T m_i is the low-dimensional projection of the neighborhood center point, y_j is the low-dimensional projection of the sample point, W is the projection matrix, and the subscript L stands for Local;
Specifically, S4 further includes:
S41, transforming formula (2) gives:

J_L = tr(W^T S_L W) (9)

where S_L = Σ_{i=1}^{n} S_Li, and S_Li is the local divergence matrix of the ith feature sequence f_i, namely:

S_Li = Σ_{f_j ∈ N_k(f_i)} (f_j − m_i)(f_j − m_i)^T
S5, according to the neighborhood center points obtained in step S3, calculate the neighborhood global variance to obtain the global divergence:

J_G = Σ_{i=1}^{n} ||p_i − p̄||² (3)

where p_i = W^T m_i is the low-dimensional projection of the neighborhood center point, p̄ = W^T m̄ is the low-dimensional projection of the global neighborhood center, and the subscript G stands for Global;
Specifically, S5 further includes:
S51, transforming formula (3) gives:

J_G = tr(W^T S_G W) (11)

where S_G is the global divergence matrix:

S_G = Σ_{i=1}^{n} (m_i − m̄)(m_i − m̄)^T (12)

S52, simplifying formula (12) gives:

S_G = Σ_{i=1}^{n} m_i m_i^T − n·m̄ m̄^T

where m̄ = (1/n) Σ_{i=1}^{n} m_i is the center point of all neighborhood centers, called the global neighborhood center;
S6, combining the local divergence (formula (9)) and the global divergence (formula (11)) obtained in steps S4 and S5 yields formula (4), and solving formula (4) gives the projection matrix W = (ω_1, ω_2, …, ω_d):

max_W tr(W^T S_G W) / tr(W^T S_L W) (4)

Formula (4) is converted into the generalized eigenvalue problem:

S_G ω = λ S_L ω (5)

where ω is a generalized eigenvector and λ is a generalized eigenvalue.
Solving formula (5) yields the first d largest eigenvalues λ_1, λ_2, …, λ_d (d < p), the corresponding eigenvectors ω_1, ω_2, …, ω_d, and the projection matrix W = (ω_1, ω_2, …, ω_d);
S7, according to the projection matrix obtained in step S6, project the sample points in the feature sequence set obtained in step S1 (i.e. the elements of Fea) to obtain the dimension-reduced feature sequences:

y_i = W^T f_i (6)

finally obtaining the dimension-reduced feature set D' = {y_i | i = 1, 2, …, n}, where y_i ∈ ℝ^d.
examples
In order to verify the dimension reduction method (hereinafter referred to as GLSUP) of the present invention, a correlation experiment was performed.
1. Data set selection
Let the MTS data set be D = {X_i | i = 1, 2, …, n}, where n is the number of samples and X_i = (x_1, x_2, …, x_m)^T ∈ ℝ^{m×t_i} is the ith MTS in the data set; x_j (j = 1, 2, …, m) is the series of observations of the jth variable and m is the number of variables, so the feature sequence length is m(m+1)/2. Assume the average length of the MTS in the data set is t and the number of selected low-dimensional features is d (d < m²).
2. Algorithm complexity
The training process of the invention mainly comprises three parts: feature extraction, model solving and projection. In the feature extraction stage, the computational cost is dominated by computing the MTS covariance matrices, with complexity O(nm²t). In the model solving stage, the cost is dominated by computing the local divergence matrix and solving the generalized eigenvalue problem, with complexities O(nm⁴) and O(m⁶) respectively. After the projection matrix W is obtained, the GLSUP method projects the feature sequences in the feature set Fea with complexity O(dm²). The overall complexity of the GLSUP method is therefore O(nm²t + nm⁴ + m⁶).
3. MTS data dimension reduction experimental process
In this example, 20 sets of multivariate time series data from different fields were selected, see Table 1: LP1, LP2, LP3, LP4, LP5, Daily and Sports Activities (DSA), FingerMovements (FM), HandMovementDirection (HMD), NATOPS, Cricket, RacketSports (RS), Epilepsy, BasicMotions (BM), LSST, ArticularyWordRecognition (AWR), EEGeye, Wafer, WalkvsRun (WR), KickvsPunch (KP) and Australian Sign Language (ASL). The first 15 are equal-length data sets and the last 5 are unequal-length data sets.
Table 1 MTS data set information
Dimension reduction effectiveness refers to the degree to which the information retained after dimension reduction characterizes the MTS features. The experiment feeds the dimension reduction result into a KNN (k = 1) classifier and evaluates effectiveness through the classification accuracy. The experimental procedure is as follows: reduce the original MTS data set with a dimension reduction algorithm to obtain a reduced data set; select samples from the reduced data set in turn and input them into the classifier; a nearest-neighbor query returns the single most similar sample, whose label is taken as the predicted category of the queried sample; if this label matches the queried sample's label the classification is correct, otherwise it is incorrect. After performing this operation on all samples, the classification accuracy is obtained:
ε = n_true / n

where ε is the classification accuracy, n_true is the number of correctly classified samples, and n is the number of samples.
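The leave-one-out 1-NN evaluation protocol described above can be sketched as follows (a hypothetical helper for illustration, not the experiment's actual code):

```python
import numpy as np

def one_nn_accuracy(Y, labels):
    """Leave-one-out 1-NN classification accuracy epsilon = n_true / n on a
    dimension-reduced sample matrix Y (n x d), following the evaluation
    protocol described above."""
    d2 = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    np.fill_diagonal(d2, np.inf)        # a sample cannot match itself
    nearest = d2.argmin(axis=1)         # index of the most similar sample
    return float((labels[nearest] == labels).mean())
```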
Five unsupervised dimension reduction methods (PCA, CPCA, LPP, PBLDA and VPCA) are selected as comparison methods. Since the VPCA method is only applicable to equal-length data sets, it was tested only on the 15 equal-length data sets. On the unequal-length MTS data sets, the dimension reduction results of the PCA and CPCA methods are still sequences of unequal length, and the Euclidean distance can only measure sequences of equal length; therefore, on the 5 unequal-length data sets, the dimension reduction results of the PCA and CPCA methods are measured with the dynamic time warping (DTW) distance to realize KNN classification.
The PCA, CPCA and VPCA methods involve the variance contribution ratio σ; the LPP method involves the neighbor number k, the heat-kernel parameter t and the low-dimensional feature number d; the PBLDA method involves the feature dimension p_c and the time dimension p_r; the GLSUP method involves the neighbor number k and the low-dimensional feature number d. In the experiments, the variance contribution ratio σ of the PCA, CPCA and VPCA methods is set to 80%, and the neighbor number k and the heat-kernel parameter t are both set to 1. The time dimension p_r is kept at the original time series length, while the feature dimension p_c and the low-dimensional feature number d are tuned to obtain the best matching accuracy for the PBLDA, GLSUP and LPP dimension reduction algorithms.
The results of the dimension reduction effectiveness experiment are shown in Table 2, with the highest classification accuracy of each row shown in bold. The results show that the GLSUP method achieves good classification accuracy on all 20 data sets. The GLSUP method converts MTS into equal-length feature sequences, retains the correlation information between different variables, considers both the global and local information of the data set, and realizes dimension reduction on the feature sequences.
TABLE 2 dimension reduction effectiveness test results
4. Conclusion of the experiment
Among the other methods, the CPCA method improves dimension reduction effectiveness considerably compared with the PCA method, because the former projects the MTS into a common low-dimensional subspace while the latter projects each MTS into a different low-dimensional subspace. However, both methods reduce only the variable dimension without reducing the sequence length. The VPCA method achieves high classification accuracy but can only be used on equal-length data sets. The PBLDA method reduces both the variable dimension and the time dimension, but its effect is poor on part of the data sets: when facing unequal-length data sets, PBLDA truncates sequences of different lengths into equal-length sequences, which causes information loss. The LPP method extracts feature sequences from the MTS and performs LPP dimension reduction, solving the unequal-length problem, but it considers only the local information of the data set and ignores the global information; in addition, it requires a singular value decomposition for each MTS and thus has a higher computational complexity.
In addition, the GLSUP method has significant advantages over other methods in the multi-category dataset ASL. The reason is that the GLSUP method takes global and local information of the data set into account, and samples can be projected into clusters with better separability in the unlabeled data set.
In summary, the dimension reduction method uses the covariance matrix of the MTS as the feature sequence of the MTS, and can convert MTS with unequal time length into the feature sequence with equal length; and the equal-length characteristic sequences are projected to the same common low-dimensional space, and the obtained low-dimensional projection sequences can represent the original MTS, so that a relatively obvious dimension reduction effect is realized.
The above embodiments are merely preferred embodiments for fully explaining the present invention, and the scope of the present invention is not limited thereto. Equivalent substitutions and changes are intended by those skilled in the art on the basis of the present invention, and are within the scope of the present invention. The protection scope of the invention is subject to the claims.

Claims (6)

1. The multi-element time sequence unsupervised dimension reduction method based on global-local divergence is characterized by comprising the following steps:
s1: acquiring fault monitoring data through a sensor, forming a multi-element time sequence of the acquired fault monitoring data into a multi-element time sequence original data set D, respectively calculating covariance matrixes of each multi-element time sequence in the data set, extracting upper triangle elements of the covariance matrixes, and combining the upper triangle elements into a characteristic sequence; all the feature sequences form a feature sequence set, and the length of each feature sequence is the same;
s2: based on the obtained feature sequence set, measuring by using k neighbor numbers and Euclidean distances to establish each sample neighbor set in the feature sequence set;
s3: calculating a neighborhood center sequence of each neighborhood set in the feature set according to the neighborhood sets obtained in the step S2;
s4: calculating the variance of each neighborhood set of the sample point to be projected according to the neighborhood sets obtained in the step S2, and then accumulating and summing the variances to calculate the local divergence;
s5: calculating a neighborhood global variance according to the neighborhood central point obtained in the step S3 to obtain a global divergence;
s6: according to the local divergence and the global divergence obtained in the steps S4 and S5, solving a projection matrix;
s7: according to the projection matrix obtained in the step S6, the feature sequence set obtained in the step S1 is projected, so that a dimension-reduced feature sequence is obtained:
y_i = W^T f_i (6);

s8: according to the dimension-reduced feature sequences, obtain the dimension-reduced feature set D' = {y_i | i = 1, 2, …, n} of the fault monitoring data, where y_i ∈ ℝ^d;
s9: and processing the fault monitoring data after the dimension reduction to obtain a fault monitoring result.
2. The method for unsupervised dimension reduction of a multivariate time series based on global-local divergence as set forth in claim 1, wherein the specific operation steps of step S1 include:
s11: form the multivariate time series of the acquired fault monitoring data into the multivariate time series original data set D = {X_i | i = 1, 2, …, n}, where n is the number of samples and

X_i = (x_1, x_2, …, x_m)^T ∈ ℝ^{m×t_i} (7)

represents the ith MTS sample in the original data set; x_j (j = 1, 2, …, m) is the series of observations of the jth variable, m is the number of variables, and t_i is the time length of the ith multivariate time series; zero-mean each multivariate time series: X_i = X_i − E(X_i);
S12: compute the covariance matrix of each zero-meaned multivariate time series; the covariance matrix of the ith MTS is

Σ_i = (1/t_i) X_i X_i^T ∈ ℝ^{m×m}

s13: extract the upper-triangular elements of the covariance matrix Σ_i and form them row by row into the row vector:

f_i = (a_11, a_12, …, a_1m, a_22, …, a_2m, …, a_mm) (8)

Taking f_i as the feature sequence of the ith MTS yields the feature set Fea = {f_i | i = 1, 2, …, n} of the multivariate time series.
3. The multi-element time series unsupervised dimension reduction method based on global-local divergence according to claim 2, wherein in step S3 the neighborhood center sequence m_i is calculated as:

m_i = (1/k) Σ_{f_j ∈ N_k(f_i)} f_j (1)

where k is the number of neighbors and m_i is the neighborhood center sequence of the k neighbors of f_i.
4. A multi-element time series unsupervised dimension reduction method based on global-local divergence as claimed in claim 3, wherein the specific steps of S4 include:
s41: based on the neighborhood sets, calculate the variance of each projected neighborhood set and sum these variances to obtain the local divergence:

J_L = Σ_{i=1}^{n} Σ_{f_j ∈ N_k(f_i)} ||y_j − p_i||² (2)

where p_i = W^T m_i is the low-dimensional projection of the neighborhood center point, y_j is the low-dimensional projection of the sample point, W is the projection matrix, and the subscript L stands for Local;
s42: transforming formula (2) gives:

J_L = tr(W^T S_L W) (9)

where S_L = Σ_{i=1}^{n} S_Li, and S_Li, the local divergence matrix of the ith feature sequence f_i, is:

S_Li = Σ_{f_j ∈ N_k(f_i)} (f_j − m_i)(f_j − m_i)^T
5. the method for unsupervised dimension reduction of a multivariate time series based on global-local divergence as set forth in claim 4, wherein the specific step of S5 comprises:
s51: calculating a neighborhood global variance according to a neighborhood center point in the neighborhood center sequence to obtain a global divergence calculation formula:
Figure FDA0004137076720000035
wherein p is i =W T m i A low-dimensional projection for a neighborhood center point; subscript G is an abbreviation for Global;
s52: the transformation of formula (3) can be obtained:
Figure FDA0004137076720000036
wherein S is G Is a global divergence matrix:
Figure FDA0004137076720000037
s53: the formula (12) is simplified to obtain:
Figure FDA0004137076720000041
wherein, the liquid crystal display device comprises a liquid crystal display device,
Figure FDA0004137076720000042
is the center point of all neighborhood centers, namely the global neighborhood center.
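The simplified global divergence matrix of S52–S53 is the scatter of the neighborhood centers around their mean. A minimal sketch (the function name `global_scatter` is an illustrative choice):

```python
import numpy as np

def global_scatter(M):
    """Global divergence matrix S_G = sum_i (m_i - m_bar)(m_i - m_bar)^T,
    where the rows of M are the neighborhood centers m_i and m_bar is
    their mean (the global neighborhood center)."""
    m_bar = M.mean(axis=0)              # global neighborhood center
    diff = M - m_bar                    # each center minus the global center
    return diff.T @ diff                # sum of outer products in one product
```

The vectorized form `diff.T @ diff` equals the sum of the n rank-one outer products written in the claim.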
6. The multivariate time series unsupervised dimension reduction method based on global-local divergence according to claim 5, wherein step S6 specifically comprises:
s61: combining the local divergence obtained in the formula (9) and the global divergence obtained in the formula (11) to obtain the formula (4):
Figure FDA0004137076720000043
s62: converting the formula (4) into a generalized eigenvalue solution problem, and obtaining a projection matrix by solving the formula (4):
S G ω=λS L ω (5)
wherein: omega is a generalized eigenvector, lambda is a generalized eigenvalue;
s63: solving (5) to obtain the first d maximum eigenvalues lambda 12 ,…,λ d (d < p), corresponding feature vector ω 12 ,…,ω d Projection matrix w= (ω) 12 ,…,ω d )。
CN202310278160.1A 2023-03-21 2023-03-21 Multi-element time sequence unsupervised dimension reduction method based on global-local divergence Pending CN116401528A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310278160.1A CN116401528A (en) 2023-03-21 2023-03-21 Multi-element time sequence unsupervised dimension reduction method based on global-local divergence


Publications (1)

Publication Number Publication Date
CN116401528A true CN116401528A (en) 2023-07-07

Family

ID=87009515

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310278160.1A Pending CN116401528A (en) 2023-03-21 2023-03-21 Multi-element time sequence unsupervised dimension reduction method based on global-local divergence

Country Status (1)

Country Link
CN (1) CN116401528A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116738866A (en) * 2023-08-11 2023-09-12 中国石油大学(华东) Instant learning soft measurement modeling method based on time sequence feature extraction
CN116738866B (en) * 2023-08-11 2023-10-27 中国石油大学(华东) Instant learning soft measurement modeling method based on time sequence feature extraction


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination