CN114492566A - Weight-adjustable high-dimensional data dimension reduction method and system

Info

Publication number: CN114492566A (application CN202111557901.7A)
Authority: CN (China)
Prior art keywords: data, dimensional, weight, matrix, dimensional space
Legal status: Pending
Original language: Chinese (zh)
Inventors: 杨旭东, 张树巍, 刘焰明, 张庆明
Current and original assignee: Southwest University of Science and Technology
Priority and filing date: 2021-12-20; publication date: 2022-05-13


Classifications

    • G06F 18/213: Feature extraction, e.g. by transforming the feature space; summarisation; mappings, e.g. subspace methods
    • G06F 18/22: Matching criteria, e.g. proximity measures
    • G06F 18/23213: Non-hierarchical clustering techniques using statistics or function optimisation, e.g. modelling of probability density functions, with a fixed number of clusters, e.g. k-means clustering
    • G06N 3/006: Artificial life, i.e. computing arrangements simulating life, based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]


Abstract

The invention discloses a weight-adjustable dimension reduction method and system for high-dimensional data, in the technical field of data dimension reduction. The dimension reduction method comprises the following steps: Step 1, extracting the data; Step 2, obtaining an attribute weight matrix; Step 3, calculating the weighted Euclidean point-pair distances; Step 4, calculating the high-dimensional-space joint probabilities; and Step 5, obtaining the low-dimensional-space point distribution. The invention addresses the low dimension-reduction accuracy, large errors and related problems of the prior art.

Description

Weight-adjustable high-dimensional data dimension reduction method and system
Technical Field
The invention relates to the technical field of data dimension reduction, in particular to a weight-adjustable high-dimensional data dimension reduction method and system.
Background
Human society is entering the big-data era. With the rapid development of computer and information technology, industries across society are being digitised, and ever more data are generated and stored. How to convert complex high-dimensional data into low-dimensional data that can be conveniently observed and further used is an important and pressing problem. Most current dimension reduction methods are classified as linear or nonlinear and are chiefly represented by PCA, MDS, t-SNE and the like. t-SNE measures the similarity between point pairs in the high-dimensional and low-dimensional spaces through conditional probabilities and uses the KL divergence as the objective function, so the low-dimensional space preserves a good embedding. Because the t-SNE algorithm adopts a Gaussian kernel function when calculating the similarity of high-dimensional point pairs, it must calculate the Euclidean distance between point pairs. Owing to the characteristics of the data and the differences between attributes, the contribution of each attribute to that distance is not equally important, so a Gaussian kernel built on the plain Euclidean distance cannot fully reflect the true probability structure of the high-dimensional space. Dimension reduction on this basis is therefore not ideal, it is difficult to reduce dimensions accurately and flexibly according to the characteristics of the data, and the effect weakens as the complexity of the data grows.
Several dimension reduction and clustering methods for high-dimensional complex data exist. In the patent "Re-recognition method for confusing digital handwriting" (Chinese patent publication No. CN109034021A, published 2018.12.18), the distance calculation for the original t-SNE high-dimensional points is group-weighted, reducing the dimension-reduction error and improving the re-recognition accuracy. However, the differences among the attributes of the original data are not considered; only the already-calculated Euclidean distances are group-weighted, which remains limited when analysing multi-attribute data. In the patent "A score clustering analysis method based on t-SNE" (Chinese patent publication No. CN111625576A, published 2020.09.04), the t-SNE algorithm is applied directly to reduce the dimension of high-dimensional student score data. Although the visual experimental results show an effect on that data, the data have strong attribute characteristics, the correlations among attributes are not considered, and quantitative comparison indices for the experimental results are lacking.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a weight-adjustable high-dimensional data dimension reduction method and system, solving the problems of low dimension-reduction accuracy, large errors and the like in the prior art.
The technical scheme adopted by the invention for solving the problems is as follows:
a weight-adjustable high-dimensional data dimension reduction method comprises the following steps:
Step 1, extracting data: extract n pieces of m-dimensional high-dimensional data to form an n × m data matrix X;

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{bmatrix}$$

where $x_{ik}$ is the element in row $i$, column $k$ of the high-dimensional data; $n > 2$ and $m > 3$ are positive integers; $i$ is a positive integer with $1 \le i \le n$; and $k$ is a positive integer with $1 \le k \le m$;
Step 2, obtaining an attribute weight matrix: perform the attribute weight calculation on the data matrix X to obtain the attribute weight matrix weight;

$$\mathrm{weight} = [\,wc_1 \;\cdots\; wc_i \;\cdots\; wc_m\,]$$

where $wc_i$ is the attribute weight of the $i$-th column of data in the data matrix X;
Step 3, calculating the weighted Euclidean point-pair distances: substitute weight into the high-dimensional-space point-pair Euclidean distance formula to obtain the attribute-weighted Euclidean point-pair distance matrix D between all point pairs;

$$D = \begin{bmatrix} d_{11} & \cdots & d_{1n} \\ \vdots & \ddots & \vdots \\ d_{n1} & \cdots & d_{nn} \end{bmatrix}$$

where

$$d_{ij} = \sqrt{\sum_{k=1}^{m} wc_k\,(x_{ik} - x_{jk})^2}$$

$d_{ij}$ is the weighted Euclidean distance in the high-dimensional space between the $i$-th and $j$-th rows of data in the data matrix X, the high-dimensional space being a space of dimension greater than 3; $x_{ik}$ is the element in row $i$, column $k$, and $x_{jk}$ the element in row $j$, column $k$, of the data matrix X;
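As an illustration of Step 3, a minimal numpy sketch of the attribute-weighted distance matrix follows; the function name and array shapes are illustrative, not part of the patent:

```python
import numpy as np

def weighted_distance_matrix(X: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """Attribute-weighted Euclidean point-pair distances.

    X      : (n, m) standardised data matrix
    weight : (m,) attribute weights wc_1 ... wc_m
    Returns the (n, n) matrix D with D[i, j] = sqrt(sum_k wc_k * (x_ik - x_jk)^2).
    """
    diff = X[:, None, :] - X[None, :, :]          # (n, n, m) pairwise differences
    return np.sqrt((weight * diff ** 2).sum(axis=-1))
```

The broadcasted form keeps an (n, n, m) intermediate in memory; for large n, a chunked loop over rows would be the usual trade of time for memory.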
Step 4, calculating the high-dimensional-space joint probabilities: from the attribute-weighted Euclidean point-pair distance matrix D, continue to calculate the joint probabilities $p_{ij}$ of the high-dimensional space of the data matrix X;

Step 5, obtaining the low-dimensional-space point distribution: calculate the low-dimensional-space joint probabilities $q_{ij}$, adopt the KL divergence as the objective function, and continue to calculate the low-dimensional-space similarity until the value of the KL divergence converges, obtaining the distribution of the low-dimensional-space points; the low-dimensional space is a space of dimension less than or equal to 3.
As a preferred technical scheme, in Step 2 the attribute weights are calculated with the SVD weighting method and the Critic weighting method respectively; the two weight vectors are denoted $\mathrm{weight}_a$ and $\mathrm{weight}_b$. $\mathrm{weight}_a$ and $\mathrm{weight}_b$ are then used as the initial positions of the two sample points of a particle swarm algorithm, and the attribute weights corresponding to the global optimal solution are calculated.
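The patent names the SVD and Critic weighting methods without spelling out their formulas; the sketch below therefore uses the standard CRITIC construction (column dispersion times accumulated conflict with the other columns) and one plausible SVD-based scoring, both stated as assumptions rather than as the patent's definitions:

```python
import numpy as np

def critic_weights(X: np.ndarray) -> np.ndarray:
    """CRITIC weights (standard form): each column's standard deviation times
    its total conflict, the sum of (1 - correlation), with the other columns."""
    std = X.std(axis=0, ddof=1)
    corr = np.corrcoef(X, rowvar=False)           # (m, m) attribute correlations
    info = std * (1.0 - corr).sum(axis=0)
    return info / info.sum()

def svd_weights(X: np.ndarray) -> np.ndarray:
    """An assumed SVD-based weighting: score each attribute by the
    singular-value-weighted magnitude of its right-singular-vector loadings."""
    _, s, vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
    score = (s[:, None] * np.abs(vt)).sum(axis=0)  # (m,) attribute scores
    return score / score.sum()
```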
As a preferred embodiment, in Step 4, based on the set perplexity value, a binary search is used to find the optimal standard deviation $\sigma_i$ centred on the $i$-th row of data in the data matrix X and the optimal standard deviation $\sigma_j$ centred on the $j$-th row of data; the conditional probabilities $p_{i|j}$ and $p_{j|i}$ of the high-dimensional data matrix are calculated, and then the joint probability $p_{ij}$ of the high-dimensional space; the calculation formulas are as follows:

$$p_{j|i} = \frac{\exp\!\big(-d_{ij}^2 / 2\sigma_i^2\big)}{\sum_{k \neq i} \exp\!\big(-d_{ik}^2 / 2\sigma_i^2\big)}$$

$$p_{i|j} = \frac{\exp\!\big(-d_{ij}^2 / 2\sigma_j^2\big)}{\sum_{k \neq j} \exp\!\big(-d_{kj}^2 / 2\sigma_j^2\big)}$$

$$p_{ij} = \frac{p_{i|j} + p_{j|i}}{2n}$$

where $k$ is a positive integer with $1 \le k \le n$.
As a preferred technical solution, the condition for stopping the binary search is as follows: the absolute value of the difference between the set perplexity and the currently calculated perplexity is less than 0.0001, or the number of binary searches exceeds 50.
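A hedged sketch of the binary search for one bandwidth $\sigma_i$ under the stopping rule above (tolerance 0.0001, at most 50 splits); the starting value and search bounds are illustrative assumptions:

```python
import numpy as np

def sigma_for_row(d_i: np.ndarray, i: int, target_perplexity: float,
                  tol: float = 1e-4, max_iter: int = 50) -> float:
    """Binary-search the bandwidth sigma_i so that the perplexity 2**H of
    row i's conditional distribution matches the set value."""
    lo, hi = 0.0, np.inf
    sigma = 1.0                                    # assumed starting bandwidth
    for _ in range(max_iter):
        p = np.exp(-d_i ** 2 / (2.0 * sigma ** 2))
        p[i] = 0.0                                 # a point is not its own neighbour
        p /= p.sum()
        h = -(p[p > 0] * np.log2(p[p > 0])).sum()  # Shannon entropy, in bits
        perp = 2.0 ** h
        if abs(perp - target_perplexity) < tol:    # the 0.0001 criterion
            break
        if perp > target_perplexity:               # too flat: shrink sigma
            hi = sigma
            sigma = (lo + hi) / 2.0
        else:                                      # too peaked: grow sigma
            lo = sigma
            sigma = sigma * 2.0 if np.isinf(hi) else (lo + hi) / 2.0
    return sigma
```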
As a preferred technical solution, in Step 5, with the KL divergence as the objective function, the positions of all points in the low-dimensional space are continuously updated by a gradient descent method, and the conditional and joint probabilities of the low-dimensional space and the new value of the KL divergence are recalculated until the value of the KL divergence function C converges, obtaining the distribution of the low-dimensional-space points;

$$q_{ij} = \frac{\big(1 + \lVert y_i - y_j \rVert^2\big)^{-1}}{\sum_{k \neq l}\big(1 + \lVert y_k - y_l \rVert^2\big)^{-1}}$$

$$C = KL(P \,\|\, Q) = \sum_i \sum_j p_{ij} \log \frac{p_{ij}}{q_{ij}}$$

$$\frac{\partial C}{\partial y_i} = 4 \sum_j \big(p_{ij} - q_{ij}\big)\big(y_i - y_j\big)\big(1 + \lVert y_i - y_j \rVert^2\big)^{-1}$$

where Y is a randomly initialised n × 2 low-dimensional matrix following a t-distribution, $y_i$ denotes the $i$-th point of that matrix, $y_{i1}$ is the abscissa of the $i$-th point, and $y_{i2}$ is its ordinate.
As a preferred technical solution, in Step 5, a momentum parameter is further introduced to change the positions of the low-dimensional-space points, and the joint probabilities of the low-dimensional space are calculated iteratively until the value of the KL divergence function converges, obtaining the distribution of the low-dimensional-space points.
As a preferred technical solution, in Step 5, the criterion for convergence of the KL divergence function is whether the difference between the KL divergence value of the current iteration and that of the previous iteration is less than 0.005. If it is less than 0.005, the low-dimensional dimension-reduction result matrix is output; otherwise, the positions of the low-dimensional-space points continue to be updated iteratively until the difference is less than 0.005.
As a preferred technical solution, in Step 5, the calculation formula for obtaining the distribution of the low-dimensional-space points is:

$$y^{(u)} = y^{(u-1)} + \eta \,\frac{\partial C}{\partial y} + \alpha(u)\big(y^{(u-1)} - y^{(u-2)}\big)$$

where $y^{(u)}$ is the updated low-dimensional two-dimensional matrix, $y^{(u-1)}$ is the low-dimensional two-dimensional matrix generated by the previous iteration, $\eta$ is the step size, $\alpha(u)$ is the learning rate, and $\alpha(u)\big(y^{(u-1)} - y^{(u-2)}\big)$ is the momentum gradient, with $\big(y^{(u-1)} - y^{(u-2)}\big)$ defaulting to 0 at the first iteration.
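Putting the preferred Step 5 together, the sketch below runs the momentum gradient descent on KL(P‖Q) with the 0.005 convergence test; the learning rate and the momentum schedule (0.5, then 0.8) are common t-SNE defaults assumed here, not values stated in the patent:

```python
import numpy as np

def optimise_embedding(P: np.ndarray, n_iter: int = 1000, eta: float = 200.0,
                       tol: float = 5e-3, seed: int = 0) -> np.ndarray:
    """Gradient descent with momentum on C = KL(P || Q) for an n x 2 embedding."""
    rng = np.random.default_rng(seed)
    n = P.shape[0]
    Y = rng.standard_normal((n, 2)) * 1e-4         # random initialisation
    Y_prev = Y.copy()
    kl_prev = np.inf
    for u in range(n_iter):
        diff = Y[:, None, :] - Y[None, :, :]       # (n, n, 2) pairwise y_i - y_j
        num = 1.0 / (1.0 + (diff ** 2).sum(-1))    # Student-t kernel, 1 dof
        np.fill_diagonal(num, 0.0)
        Q = np.maximum(num / num.sum(), 1e-12)     # low-dimensional q_ij
        # gradient: 4 * sum_j (p_ij - q_ij)(y_i - y_j)(1 + ||y_i - y_j||^2)^-1
        grad = 4.0 * ((P - Q)[:, :, None] * diff * num[:, :, None]).sum(axis=1)
        alpha = 0.5 if u < 250 else 0.8            # assumed momentum coefficient
        Y, Y_prev = Y - eta * grad + alpha * (Y - Y_prev), Y
        kl = (P * np.log(np.maximum(P, 1e-12) / Q)).sum()
        if abs(kl_prev - kl) < tol:                # the patent's 0.005 criterion
            break
        kl_prev = kl
    return Y
```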
As a preferred embodiment, in Step 5, the dimension-reduction effect is evaluated using the silhouette coefficient, the DBI, the CH score and/or the KL divergence value.
A weight-adjustable high-dimensional data dimension reduction system, applying the above weight-adjustable high-dimensional data dimension reduction method, comprises the following modules connected in sequence (a combined sketch follows the list):

Data extraction module: extracts n pieces of m-dimensional high-dimensional data to form an n × m data matrix X;

$$X = \begin{bmatrix} x_{11} & \cdots & x_{1m} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nm} \end{bmatrix}$$

where $x_{ik}$ is the element in row $i$, column $k$ of the high-dimensional data; $n > 2$ and $m > 3$ are positive integers; $1 \le i \le n$ and $1 \le k \le m$;

Attribute weight matrix acquisition module: performs the attribute weight calculation on the data matrix X to obtain the attribute weight matrix weight;

$$\mathrm{weight} = [\,wc_1 \;\cdots\; wc_i \;\cdots\; wc_m\,]$$

where $wc_i$ is the attribute weight of the $i$-th column of data in the data matrix X;

Weighted Euclidean point-pair distance module: substitutes weight into the high-dimensional-space point-pair Euclidean distance formula to obtain the attribute-weighted Euclidean point-pair distance matrix D;

$$D = \begin{bmatrix} d_{11} & \cdots & d_{1n} \\ \vdots & \ddots & \vdots \\ d_{n1} & \cdots & d_{nn} \end{bmatrix}, \qquad d_{ij} = \sqrt{\sum_{k=1}^{m} wc_k\,(x_{ik} - x_{jk})^2}$$

where $d_{ij}$ is the weighted Euclidean distance in the high-dimensional space (a space of dimension greater than 3) between the $i$-th and $j$-th rows of the data matrix X; $x_{ik}$ and $x_{jk}$ are the elements in rows $i$ and $j$, column $k$, of X;

High-dimensional-space joint probability module: calculates, from the attribute-weighted Euclidean point-pair distance matrix D, the joint probabilities $p_{ij}$ of the high-dimensional space of the data matrix X;

Low-dimensional-space point distribution module: calculates the low-dimensional joint probabilities $q_{ij}$, adopts the KL divergence as the objective function, and iterates the low-dimensional similarity calculation until the value of the KL divergence function converges, obtaining the distribution of the low-dimensional-space points; the low-dimensional space is a space of dimension less than or equal to 3.
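For orientation, a sketch wiring the five modules together, reusing the helper sketches given earlier in this document (critic_weights, weighted_distance_matrix, sigma_for_row, optimise_embedding); the perplexity default of 30 is an assumption:

```python
import numpy as np

def reduce_dimension(X: np.ndarray, perplexity: float = 30.0) -> np.ndarray:
    """Chain the modules: weights -> weighted distances -> p_ij -> embedding."""
    n = X.shape[0]
    weight = critic_weights(X)                    # attribute weight module
    D = weighted_distance_matrix(X, weight)       # weighted distance module
    # high-dimensional joint probability module
    P_cond = np.zeros((n, n))
    for i in range(n):
        s = sigma_for_row(D[i], i, perplexity)
        p = np.exp(-D[i] ** 2 / (2.0 * s ** 2))
        p[i] = 0.0
        P_cond[i] = p / p.sum()
    P = (P_cond + P_cond.T) / (2.0 * n)           # symmetrised p_ij
    return optimise_embedding(P)                  # low-dimensional module
```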
Compared with the prior art, the invention has the following beneficial effects:

(1) the invention improves the high-dimensional point-pair distance of the t-SNE algorithm through attribute weights; when the attribute weights are calculated, they can be solved by a specific weighting algorithm, and different attributes can also be assigned different weights according to the user's dimension-reduction requirements, realising user-defined weight assignment and dimension reduction;

(2) the method reflects more distinctly the differences of the data both in the original high-dimensional space and in the low-dimensional embedding after dimension reduction, solves the dimension-reduction problem of high-dimensional complex data, and improves the clustering effect and the dimension-reduction accuracy;

(3) under the same conditions, the KL divergence value is reduced, the silhouette coefficient is better, the Davies-Bouldin index is better, and the Calinski-Harabasz score is higher.
Drawings
FIG. 1 is a diagram illustrating the steps of a method for reducing dimension of high-dimensional data with adjustable weight according to the present invention;
FIG. 2 is an overall framework of the dimension reduction method of the present invention;
FIG. 3 is a flow chart of a dimension reduction method according to the present invention;
FIG. 4 compares the visual dimension-reduction results of PCA (top left), MDS (top right), t-SNE (bottom left) and the dimension reduction method of the present invention (bottom right) in the case of 2000 sets of data;

FIG. 5 compares the silhouette-coefficient index of PCA, MDS, t-SNE and the dimension reduction method of the present invention in the case of 2000 sets of data;

FIG. 6 compares the Davies-Bouldin index of PCA, MDS, t-SNE and the dimension reduction method of the present invention in the case of 2000 sets of data;

FIG. 7 compares the Calinski-Harabasz score index of PCA, MDS, t-SNE and the dimension reduction method of the present invention in the case of 2000 sets of data;

FIG. 8 compares the KL divergence values of t-SNE and the dimension reduction method of the present invention in the case of 2000 sets of data at the 1000th iteration;

FIG. 9 shows the change of the KL divergence value as the number of iterations increases, for t-SNE and the dimension reduction method of the present invention in the case of 2000 sets of data.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited to these examples.
Example 1
As shown in fig. 1 to 9, the present embodiment takes the complex high-dimensional medical record data as an example.
The method comprises the following steps:
Step 1: preprocess the complex high-dimensional medical record data: n pieces of data, each comprising m attributes, form the experimental data set; the n pieces of m-dimensional data are then standardised to form the experimental data matrix X;

$$X = \begin{bmatrix} x_{11} & \cdots & x_{1m} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nm} \end{bmatrix}$$

where $x_{ik}$ is the element in row $i$, column $k$ of the high-dimensional data; $n > 2$ and $m > 3$ are positive integers; $1 \le i \le n$ and $1 \le k \le m$;
Step 2: perform the weight calculations on the n × m data matrix with the SVD (singular value decomposition) and Critic weighting methods respectively;
Step 3: substitute the weights calculated in Step 2 into the high-dimensional-space point-pair Euclidean distance formula, multiplying the squared difference of each component by the corresponding weight, to obtain the weighted point-pair Euclidean distance matrix D;

$$D = \begin{bmatrix} d_{11} & \cdots & d_{1n} \\ \vdots & \ddots & \vdots \\ d_{n1} & \cdots & d_{nn} \end{bmatrix}, \qquad d_{ij} = \sqrt{\sum_{k=1}^{m} wc_k\,(x_{ik} - x_{jk})^2}$$

where $d_{ij}$ is the weighted Euclidean distance in the high-dimensional space (a space of dimension greater than 3) between the $i$-th and $j$-th rows of the data matrix X; $x_{ik}$ and $x_{jk}$ are the elements in rows $i$ and $j$, column $k$, of X;
Step 4: based on the set perplexity value, use a binary search to find the optimal standard deviation $\sigma_i$ centred on the $i$-th row of data in the data matrix X and the optimal standard deviation $\sigma_j$ centred on the $j$-th row of data; calculate the conditional probabilities $p_{i|j}$ and $p_{j|i}$ of the high-dimensional data matrix, and then the joint probability $p_{ij}$ of the high-dimensional space; the calculation formulas are as follows:

$$p_{j|i} = \frac{\exp\!\big(-d_{ij}^2 / 2\sigma_i^2\big)}{\sum_{k \neq i} \exp\!\big(-d_{ik}^2 / 2\sigma_i^2\big)}$$

$$p_{i|j} = \frac{\exp\!\big(-d_{ij}^2 / 2\sigma_j^2\big)}{\sum_{k \neq j} \exp\!\big(-d_{kj}^2 / 2\sigma_j^2\big)}$$

$$p_{ij} = \frac{p_{i|j} + p_{j|i}}{2n}$$

where $k$ is a positive integer with $1 \le k \le n$.
Step 5: randomly generate an n × 2 normally distributed matrix and calculate for it the low-dimensional conditional probabilities and the low-dimensional joint probabilities $q_{ij}$; adopt the KL divergence as the objective function; iterate the similarity calculation between the high- and low-dimensional spaces by a gradient descent method, introducing a momentum parameter to change the positions of the low-dimensional points and recalculating the low-dimensional joint probabilities $q_{ij}$, until the value of the KL divergence function is essentially unchanged, obtaining the distribution of the low-dimensional-space points;

$$q_{ij} = \frac{\big(1 + \lVert y_i - y_j \rVert^2\big)^{-1}}{\sum_{k \neq l}\big(1 + \lVert y_k - y_l \rVert^2\big)^{-1}}$$

objective function, the KL divergence:

$$C = KL(P \,\|\, Q) = \sum_i \sum_j p_{ij} \log \frac{p_{ij}}{q_{ij}}$$

gradient formula:

$$\frac{\partial C}{\partial y_i} = 4 \sum_j \big(p_{ij} - q_{ij}\big)\big(y_i - y_j\big)\big(1 + \lVert y_i - y_j \rVert^2\big)^{-1}$$

position update formula of the low-dimensional-space points:

$$y^{(u)} = y^{(u-1)} + \eta \,\frac{\partial C}{\partial y} + \alpha(u)\big(y^{(u-1)} - y^{(u-2)}\big)$$
Preferably, the two weight vectors obtained in Step 2 are used as sample points in an m-dimensional space; with the KL divergence as the fitness function, the positions and movement velocities of the points are continuously updated through the particle swarm algorithm, the fitness function is continuously evaluated at the point positions, and the globally optimal position is obtained. That position is taken as the weight vector, the weighted Euclidean distances are recalculated as in Step 3, and Step 5 is repeated to obtain the optimal dimension-reduction effect.

Velocity update formula:

$$v_i = \omega v_i + c_1 \cdot \mathrm{rand}() \cdot (pbest_i - x_i) + c_2 \cdot \mathrm{rand}() \cdot (gbest_i - x_i)$$

Position update formula:

$$x_i = x_i + v_i$$
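A minimal sketch of this two-particle PSO refinement; the inertia factor, learning factors and iteration count are illustrative defaults, and `fitness` stands for whichever criterion is minimised (the KL divergence in this embodiment; embodiment 2 instead uses the silhouette coefficient, which one would negate):

```python
import numpy as np

def pso_refine_weights(p1, p2, fitness, n_iter=100,
                       omega=0.7, c1=1.5, c2=1.5, seed=0):
    """Two-particle PSO over the m-dimensional weight space, seeded with the
    SVD and Critic weight vectors; returns the global best position."""
    rng = np.random.default_rng(seed)
    pos = np.stack([np.asarray(p1, float), np.asarray(p2, float)])  # (2, m)
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([fitness(p) for p in pos])
    g = pbest[pbest_val.argmin()].copy()           # global best position
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = omega * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = pos + vel
        val = np.array([fitness(p) for p in pos])
        improved = val < pbest_val                 # update local optima
        pbest[improved], pbest_val[improved] = pos[improved], val[improved]
        g = pbest[pbest_val.argmin()].copy()
    return g
```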
example 2
As shown in fig. 1 to 9, as a further optimization of embodiment 1, this embodiment includes all the technical features of embodiment 1, and in addition, this embodiment further includes the following technical features:
the present embodiment takes the complex high-dimensional medical record data as an example.
Firstly, selecting n pieces of complex high-dimensional case data, forming a data matrix X of n X m because each piece of data has m attributes, and then carrying out standardization processing on the matrix;
Figure BDA0003419644680000103
wherein x isikThe ith row and the kth column of the high-dimensional data, n is more than 2 and is a positive integer, m is more than 3 and is a positive integer, i is a positive integer and is more than or equal to 1 and less than or equal to n, k is a positive integer and is more than or equal to 1 and less than or equal to m;
xi=[xi1 xi2 … xik … xim]。
secondly, carrying out attribute weight calculation on the formed data matrix X by two methods of SVD (singular value decomposition) and Critic weight method to obtain weightaAnd weightbTwo weights;
weighta=[wa1 … wai … wam]
weightb=[wb1 … wbi … wbm]
wherein wai、wbiAttribute weight values of ith row data in the data matrix X are calculated by SVD and Critic weight value methods respectively;
thirdly, the weight calculated in the second stepaAnd weightbInitial positions p of two sample points as Particle Swarm Optimization (PSO)1And p2In this case, the search space dimension and the number of particles of the particle algorithm are m (m is p)1And p2I.e. representing how many attributes there are) and 2. By setting the number of iterations by itself, the initial velocity viInertia factor ω, learning factor c1And c2Will beThe contour coefficient is used as a fitness function, iteration is started, the fitness of each sample point is evaluated, and the local optimum value pbest of each sample point is updatediAnd global optimum gbestiBy continuously varying the speed v of the sample pointsiAnd position piAnd returning the position corresponding to the global optimal value at the moment as a new weight until the iteration number meets the set maximum iteration number requirement, wherein the calculation formula is as follows:
vi=ω×vi+c1×rand()×(pbesti-pi)+c2×rand()×(gbesti-pi (1);
xi=pi+vi (2);
vifor the updated velocity value, v, (left 1 in equation 9)iIs (right 2 in equation 9) the original velocity value, ω is the inertia factor, c1And c2As a learning factor, pbestiFor the locally optimal solution of the ith particle, gbestiIs a globally optimal solution, and rand () generates an interval [0, 1%]Random number between, xi(left 1 in equation 10) is the position after the particle update, xi(Right 1 in equation 10) is the home position, pbesti
Indicating the local optimum, gbest, of the ith particle calculated from the fitness functioniRepresents the global optimum value calculated by the ith particle according to the fitness function.
Fourthly, the attribute weight obtained by calculation in the third step is substituted into a high-dimensional space point pair Euclidean distance calculation formula (3) to obtain a square matrix D of the attribute weighted Euclidean point pair distance between each point pair;
Figure BDA0003419644680000121
Figure BDA0003419644680000122
fifthly, weighting according to the calculated attributesThe Euclidean point pair distance square matrix D is continuously calculated through the Gaussian kernel function to obtain the conditional probability p of the high-dimensional space of the data matrix Xi|j
Figure BDA0003419644680000123
Sixth, it sets a confusion PyFinding the optimal σj. In the fourth step pi|jThe perplexity P is calculated by the formula (5)xHandle PxAnd PyPerforming difference operation, performing dichotomy iteration, and updating sigmajAnd PxUp to PxAnd PyIs less than the minimum limit value of 0.00001, the iteration is stopped, the current sigma isjI.e. the optimal value, and obtains the optimal p of the high-dimensional spacei|j
Figure BDA0003419644680000124
pi|jCalculating the high-dimensional space conditional probability for the fifth step;
seventhly, according to the calculated conditional probability of the high-dimensional space of the data matrix X, calculating the joint probability p of the high-dimensional spaceij
Figure BDA0003419644680000125
Figure BDA0003419644680000126
Wherein p isi|jIs a matrix. Therefore, only p is needed herei|jTransposing to obtain pj|i
Eighthly, randomly initializing a low-dimensional space matrix Y of n x 2 according with t distribution, simultaneously calculating the Euclidean distance of a low-dimensional space point pair for the Y matrix, and calculating the low-dimensional space through a t distribution probability density function with the degree of freedom of 1Joint probability qij
Figure BDA0003419644680000131
Wherein y isiDenotes the ith point, where yi1Is the abscissa of the ith point, yi2Is the ordinate of the ith point.
Figure BDA0003419644680000132
Ninthly, take the KL divergence as the objective function and minimise it so that the similarities of the high-dimensional and low-dimensional point pairs are as close as possible; the objective-function KL divergence is shown in formula (8):

$$C = KL(P \,\|\, Q) = \sum_i \sum_j p_{ij} \log \frac{p_{ij}}{q_{ij}} \qquad (8)$$

where $p_{ij}$ is the high-dimensional joint probability and $q_{ij}$ is the low-dimensional joint probability.

Tenthly, find the minimum of the objective-function KL divergence by a gradient descent method; the gradient formula is:

$$\frac{\partial C}{\partial y_i} = 4 \sum_j \big(p_{ij} - q_{ij}\big)\big(y_i - y_j\big)\big(1 + \lVert y_i - y_j \rVert^2\big)^{-1} \qquad (9)$$

Eleventhly, to speed up the search and avoid falling into a locally optimal solution, a momentum parameter is added when updating the low-dimensional matrix Y, as shown in formula (10). The updated low-dimensional matrix Y continues to be refined through formulas (7), (8), (9) and (10); when the set number of iterations is reached, the iteration stops, giving a relatively accurate low-dimensional two-dimensional matrix after dimension reduction:

$$y^{(u)} = y^{(u-1)} + \eta \,\frac{\partial C}{\partial y} + \alpha(u)\big(y^{(u-1)} - y^{(u-2)}\big) \qquad (10)$$

where $y^{(u)}$ is the updated two-dimensional matrix, $y^{(u-1)}$ is the two-dimensional matrix generated by the previous iteration, $\eta$ is the step size, $\alpha(u)$ is the learning rate, and $\alpha(u)\big(y^{(u-1)} - y^{(u-2)}\big)$ is the momentum gradient used to strengthen the effect of the gradient-descent algorithm; at the first iteration $\big(y^{(u-1)} - y^{(u-2)}\big)$ defaults to 0.

Twelfthly, the relatively accurate low-dimensional two-dimensional matrix obtained in the eleventh step is the actual output of reducing the high-dimensional medical record data to two dimensions with the improved original t-SNE algorithm. The complex high-dimensional medical record data are unlabelled, so the invention proceeds as follows: the k-means algorithm first clusters the low-dimensional two-dimensional matrix obtained in the eleventh step, and the silhouette coefficient, the DBI, the CH score and the KL divergence value then evaluate the clustering effect of the reduced data, reflecting the dimension-reduction effect of the invention indirectly.
Thirteenth, the invention provides the improvement of the distance of the t-SNE algorithm high-dimensional space point through the attribute weight, when calculating the attribute weight, the invention can not only solve through a specific weight algorithm, but also distribute different weights of different attributes according to the requirement of the user on dimension reduction, thereby achieving the self-defined weight distribution and dimension reduction.
The high-dimensional attribute weights are calculated by the SVD and Critic weighting methods and used as the initial sample points of the PSO (particle swarm optimisation); with the silhouette coefficient as the fitness function, the sample-point positions and velocities are iterated continually to obtain the optimal attribute-weight assignment for the data. The optimal weights are introduced into the high-dimensional Euclidean distance calculation to obtain the attribute-weighted Euclidean distance, which serves as the distance inside the high-dimensional Gaussian kernel function; dimension reduction then proceeds on the basis of the t-SNE algorithm. The method therefore reflects more distinctly the differences of the data in the original high-dimensional space and in the low-dimensional embedding after dimension reduction, solves the dimension-reduction problem of high-dimensional complex data, improves the clustering effect, and improves the dimension-reduction accuracy.
Comparing accuracy and clustering indices with several dimension reduction algorithms such as PCA, MDS and t-SNE, the improved t-SNE algorithm of the invention, under the same conditions, reduces the KL divergence value by 52.2% relative to t-SNE; its silhouette coefficient is 27.6% to 98.6% better than PCA, MDS and t-SNE; its Davies-Bouldin index is 31.1% to 45.7% better than PCA, MDS and t-SNE; and its Calinski-Harabasz score is 2 to 5 times higher than PCA, MDS and t-SNE.
As described above, the present invention can be preferably realized.
All features disclosed in all embodiments of the present specification, or all methods or process steps implicitly disclosed, may be combined and/or expanded, or substituted, in any way, except for mutually exclusive features and/or steps.
The foregoing is only a preferred embodiment of the present invention, and the present invention is not limited thereto in any way, and any simple modification, equivalent replacement and improvement made to the above embodiment within the spirit and principle of the present invention still fall within the protection scope of the present invention.

Claims (10)

1. A weight-adjustable high-dimensional data dimension reduction method, characterised by comprising the following steps:

Step 1, extracting data: extract n pieces of m-dimensional high-dimensional data to form an n × m data matrix X;

$$X = \begin{bmatrix} x_{11} & \cdots & x_{1m} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nm} \end{bmatrix}$$

where $x_{ik}$ is the element in row $i$, column $k$ of the high-dimensional data; $n > 2$ and $m > 3$ are positive integers; $i$ is a positive integer with $1 \le i \le n$; and $k$ is a positive integer with $1 \le k \le m$;

Step 2, obtaining an attribute weight matrix: perform the attribute weight calculation on the data matrix X to obtain the attribute weight matrix weight;

$$\mathrm{weight} = [\,wc_1 \;\cdots\; wc_i \;\cdots\; wc_m\,]$$

where $wc_i$ is the attribute weight of the $i$-th column of data in the data matrix X;

Step 3, calculating the weighted Euclidean point-pair distances: substitute weight into the high-dimensional-space point-pair Euclidean distance formula to obtain the attribute-weighted Euclidean point-pair distance matrix D between all point pairs;

$$D = \begin{bmatrix} d_{11} & \cdots & d_{1n} \\ \vdots & \ddots & \vdots \\ d_{n1} & \cdots & d_{nn} \end{bmatrix}, \qquad d_{ij} = \sqrt{\sum_{k=1}^{m} wc_k\,(x_{ik} - x_{jk})^2}$$

where $d_{ij}$ is the weighted Euclidean distance in the high-dimensional space between the $i$-th and $j$-th rows of data in the data matrix X, the high-dimensional space being a space of dimension greater than 3; $x_{ik}$ and $x_{jk}$ are the elements in rows $i$ and $j$, column $k$, of the data matrix X;

Step 4, calculating the high-dimensional-space joint probabilities: from the attribute-weighted Euclidean point-pair distance matrix D, continue to calculate the joint probabilities $p_{ij}$ of the high-dimensional space of the data matrix X;

Step 5, obtaining the low-dimensional-space point distribution: calculate the low-dimensional joint probabilities $q_{ij}$, adopt the KL divergence as the objective function, and continue to calculate the low-dimensional-space similarity until the value of the KL divergence function converges, obtaining the distribution of the low-dimensional-space points; the low-dimensional space is a space of dimension less than or equal to 3.
2. The method as claimed in claim 1, wherein in Step 2 the attribute weights are calculated with the SVD weighting method and the Critic weighting method respectively, the two weight vectors being denoted $\mathrm{weight}_a$ and $\mathrm{weight}_b$; $\mathrm{weight}_a$ and $\mathrm{weight}_b$ are then used as the initial positions of the two sample points of a particle swarm algorithm, and the attribute weights corresponding to the global optimal solution are calculated.
3. The method of claim 2, wherein in Step 4, based on the set perplexity value, a binary search is used to find the optimal standard deviation $\sigma_i$ centred on the $i$-th row of data in the data matrix X and the optimal standard deviation $\sigma_j$ centred on the $j$-th row of data; the conditional probabilities $p_{i|j}$ and $p_{j|i}$ of the high-dimensional data matrix are calculated, and then the joint probability $p_{ij}$ of the high-dimensional space; the calculation formulas are as follows:

$$p_{j|i} = \frac{\exp\!\big(-d_{ij}^2 / 2\sigma_i^2\big)}{\sum_{k \neq i} \exp\!\big(-d_{ik}^2 / 2\sigma_i^2\big)}$$

$$p_{i|j} = \frac{\exp\!\big(-d_{ij}^2 / 2\sigma_j^2\big)}{\sum_{k \neq j} \exp\!\big(-d_{kj}^2 / 2\sigma_j^2\big)}$$

$$p_{ij} = \frac{p_{i|j} + p_{j|i}}{2n}$$

where $k$ is a positive integer with $1 \le k \le n$.
4. The method of claim 3, wherein the condition for stopping the binary search is: the absolute value of the difference between the set perplexity and the currently calculated perplexity is less than 0.0001, or the number of binary searches exceeds 50.
5. The method according to claim 4, wherein in Step 5, with the KL divergence as the objective function, the positions of all points in the low-dimensional space are continuously updated by a gradient descent method, and the conditional and joint probabilities of the low-dimensional space and the new value of the KL divergence are recalculated until the value of the KL divergence function C converges, obtaining the distribution of the low-dimensional-space points;

$$q_{ij} = \frac{\big(1 + \lVert y_i - y_j \rVert^2\big)^{-1}}{\sum_{k \neq l}\big(1 + \lVert y_k - y_l \rVert^2\big)^{-1}}$$

$$C = KL(P \,\|\, Q) = \sum_i \sum_j p_{ij} \log \frac{p_{ij}}{q_{ij}}$$

$$\frac{\partial C}{\partial y_i} = 4 \sum_j \big(p_{ij} - q_{ij}\big)\big(y_i - y_j\big)\big(1 + \lVert y_i - y_j \rVert^2\big)^{-1}$$

where Y is a randomly initialised n × 2 low-dimensional matrix following a t-distribution, $y_i$ denotes the $i$-th point of that matrix, $y_{i1}$ is the abscissa of the $i$-th point, and $y_{i2}$ is its ordinate.
6. The method of claim 5, wherein in Step 5 a momentum parameter is further introduced to change the positions of the low-dimensional-space points, and the joint probabilities of the low-dimensional space are calculated iteratively until the value of the KL divergence function converges, obtaining the distribution of the low-dimensional-space points.
7. The method of claim 6, wherein in Step 5 the criterion for convergence of the KL divergence function is whether the difference between the KL divergence value of the current iteration and that of the previous iteration is less than 0.005; if it is less than 0.005, the low-dimensional dimension-reduction result matrix is output; otherwise, the positions of the low-dimensional-space points continue to be updated iteratively until the difference is less than 0.005.
8. The method of claim 7, wherein in Step 5 the calculation formula for obtaining the distribution of the low-dimensional-space points is:

$$y^{(u)} = y^{(u-1)} + \eta \,\frac{\partial C}{\partial y} + \alpha(u)\big(y^{(u-1)} - y^{(u-2)}\big)$$

where $y^{(u)}$ is the updated low-dimensional two-dimensional matrix, $y^{(u-1)}$ is the low-dimensional two-dimensional matrix generated by the previous iteration, $\eta$ is the step size, $\alpha(u)$ is the learning rate, and $\alpha(u)\big(y^{(u-1)} - y^{(u-2)}\big)$ is the momentum gradient, with $\big(y^{(u-1)} - y^{(u-2)}\big)$ defaulting to 0 at the first iteration.
9. The weight-adjustable high-dimensional data dimension reduction method according to any one of claims 1 to 8, wherein in Step 5 the dimension-reduction effect is evaluated using the silhouette coefficient, the DBI, the CH score and/or the KL divergence value.
10. A weight-adjustable high-dimensional data dimension reduction system, applied to the weight-adjustable high-dimensional data dimension reduction method of any one of claims 1 to 9, comprising the following modules connected in sequence:

Data extraction module: extracts n pieces of m-dimensional high-dimensional data to form an n × m data matrix X;

$$X = \begin{bmatrix} x_{11} & \cdots & x_{1m} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nm} \end{bmatrix}$$

where $x_{ik}$ is the element in row $i$, column $k$ of the high-dimensional data; $n > 2$ and $m > 3$ are positive integers; $1 \le i \le n$ and $1 \le k \le m$;

Attribute weight matrix acquisition module: performs the attribute weight calculation on the data matrix X to obtain the attribute weight matrix weight;

$$\mathrm{weight} = [\,wc_1 \;\cdots\; wc_i \;\cdots\; wc_m\,]$$

where $wc_i$ is the attribute weight of the $i$-th column of data in the data matrix X;

Weighted Euclidean point-pair distance module: substitutes weight into the high-dimensional-space point-pair Euclidean distance formula to obtain the attribute-weighted Euclidean point-pair distance matrix D;

$$D = \begin{bmatrix} d_{11} & \cdots & d_{1n} \\ \vdots & \ddots & \vdots \\ d_{n1} & \cdots & d_{nn} \end{bmatrix}, \qquad d_{ij} = \sqrt{\sum_{k=1}^{m} wc_k\,(x_{ik} - x_{jk})^2}$$

where $d_{ij}$ is the weighted Euclidean distance in the high-dimensional space (a space of dimension greater than 3) between the $i$-th and $j$-th rows of the data matrix X; $x_{ik}$ and $x_{jk}$ are the elements in rows $i$ and $j$, column $k$, of X;

High-dimensional-space joint probability module: calculates, from the attribute-weighted Euclidean point-pair distance matrix D, the joint probabilities $p_{ij}$ of the high-dimensional space of the data matrix X;

Low-dimensional-space point distribution module: calculates the low-dimensional joint probabilities $q_{ij}$, adopts the KL divergence as the objective function, and iterates the low-dimensional similarity calculation until the value of the KL divergence function converges, obtaining the distribution of the low-dimensional-space points; the low-dimensional space is a space of dimension less than or equal to 3.
Priority Application (1)

CN202111557901.7A: "Weight-adjustable high-dimensional data dimension reduction method and system", priority and filing date 2021-12-20, filed by Southwest University of Science and Technology.

Publication (1)

CN114492566A: published 2022-05-13; status pending. Family ID: 81493950. Country: CN (China).

Cited By (3)

* Cited by examiner, † Cited by third party

    • CN116743961A (published 2023-09-12; assignee 中国铁塔股份有限公司安徽省分公司): Visual intelligent analysis system of high altitude monitoring *
    • CN117176011A (published 2023-12-05; assignee 南通威尔电机有限公司): Parameter intelligent adjusting method and system for permanent magnet synchronous submersible motor *
    • CN117176011B (granted 2024-02-13; assignee 南通威尔电机有限公司): Parameter intelligent adjusting method and system for permanent magnet synchronous submersible motor *


Legal Events

    • PB01: Publication
    • SE01: Entry into force of request for substantive examination