CN108073695A

CN108073695A - High-dimensional time-varying data visualization method with enhanced visual perception in the dimension-reduced space

Info

Publication number: CN108073695A
Application number: CN201711304971.5A
Authority: CN
Inventors: 周志光
Original assignee: Zhejiang University of Finance and Economics
Current assignee: Zhejiang University of Finance and Economics
Priority date: 2017-12-10
Filing date: 2017-12-10
Publication date: 2018-05-25
Anticipated expiration: 2037-12-10
Also published as: CN108073695B

Abstract

The invention discloses a high-dimensional time-varying data visualization method with enhanced visual perception in the dimension-reduced space. The method comprises: reading high-dimensional time-varying data; obtaining the two-dimensional coordinates of the data with a multidimensional scaling algorithm; and solving formula (1) over those coordinates to obtain a target orthogonal matrix. In formula (1), Q denotes the target orthogonal matrix; the two coordinate terms denote the two-dimensional coordinates of the i-th data item at the current moment and at the previous moment, respectively; N denotes the number of data items; and D^t(Q) denotes the projection offset of the orthogonally transformed data at the current moment relative to the previous moment. The target orthogonal matrix is then used to minimize the projection offset of the data in two-dimensional space, which yields the projection of the data at the current moment. The invention can significantly improve the efficiency of visualizing and analyzing high-dimensional time-varying data.

Description

A high-dimensional time-varying data visualization method with enhanced visual perception in the dimension-reduced space

Technical field

The invention relates to a method for visualizing high-dimensional time-varying data in a dimension-reduced space, and belongs to the technical fields of computer graphics and data visualization.

Background art

High-dimensional data analysis usually considers several attributes of the data at once, especially when the dimensions are strongly correlated and cannot be considered separately. Displaying the multidimensional attributes of data effectively, and using them to discover latent features, is a research hotspot in visual analytics. Parallel coordinates (Mcdonnell K T, Mueller K. Illustrative Parallel Coordinates [J]. Computer Graphics Forum, 2008, 27(3): 1031-1038) represent high-dimensional data as polylines crossing a set of parallel vertical axes, where each axis represents one dimension. They let users extract implicit information along the direction in which the axes are arranged and thereby find correlations between dimensions. However, a poor axis ordering causes severe line clutter that greatly reduces the readability of the visualization, and many researchers have studied this problem in depth. Graham et al. replaced polylines with free-form curves to reduce occlusion within the same class of data and to distinguish data items within one data set (Graham M, Kennedy J. Using Curves to Enhance Parallel Coordinate Visualisations [C]// International Conference on Information Visualization. IEEE Computer Society, 2003: 10). Peng et al. reordered the parallel axes on the basis of density analysis to reduce clutter and improve readability (Peng W, Ward M O, Rundensteiner E A. Clutter Reduction in Multi-Dimensional Data Visualization Using Dimension Reordering [C]// IEEE Symposium on Information Visualization. IEEE Computer Society, 2004: 89-96). Although parallel coordinates can show the distribution of multidimensional data, the display becomes cluttered for large data sets, which makes patterns of change hard to find and hinders understanding. The scatterplot matrix (Cleveland W C, Mcgill M E. Dynamic graphics for statistics [J]. Journal of the Royal Statistical Society, 1990, 153(1)) arranges the dimensions of the data in a two-dimensional grid in a fixed order and shows the scatterplot of every pair of dimensions independently, so that users can read the raw data and analyze the correlation between any two dimensions. However, it cannot show clusters that span more than two dimensions. Elmqvist et al. proposed rotating scatterplots in three-dimensional space to find relationships among several dimensions, but the approach remains limited for higher-dimensional data (Elmqvist N, Dragicevic P, Fekete J D. Rolling the Dice: Multidimensional Visual Exploration using Scatterplot Matrix Navigation [J]. IEEE Transactions on Visualization & Computer Graphics, 2008, 14(6): 1141).

Dimension-reduction mapping represents the intrinsic structure of multidimensional data in a low-dimensional space and is an effective technique for visualizing such data. It helps users explore and understand abstract multidimensional information and its structure in a low-dimensional space, so that during knowledge discovery, information cognition, and decision-making they can quickly and accurately uncover the features, relationships, patterns, trends, and clusters hidden in a data set. Principal component analysis is often used to reduce data to two or three dimensions to generate a similarity layout. However, this method discards the other dimensions of the data and has difficulty preserving the high-dimensional distribution of the data in the original space. A classic method is locality preserving projection (LPP) (He X, Niyogi P. Locality preserving projections [J]. Advances in Neural Information Processing Systems, 2004, 16(1): 186-197), which linearly preserves the local high-dimensional distribution of the data. As a representative low-dimensional embedding algorithm, isometric mapping (Isomap) (Tenenbaum J B, Silva V D, Langford J C. A global geometric framework for nonlinear dimensionality reduction [C]// 2000: 2319-23) preserves the distribution of the data in the original high-dimensional space. Compared with LPP, Isomap computes the distances between all pairs of data items (which can be regarded as pairwise dissimilarities) in order to preserve the global high-dimensional distribution. The multidimensional scaling algorithm, which is commonly used within Isomap, effectively combines the advantages of parallel coordinates and the scatterplot matrix: it approximates the data distribution of the multidimensional space in a low-dimensional space and shows the similarity of multidimensional data. To verify and strengthen the visual-analysis capability of multidimensional scaling, researchers in China and abroad have studied it in depth. Keim et al. designed visualization methods for mining large databases based on multidimensional scaling and compared them with parallel coordinates and scatterplot matrices, demonstrating the efficiency of multidimensional scaling for high-dimensional data visualization (Keim D A, Kriegel H P. Visualization Techniques for Mining Large Databases: A Comparison [M]. IEEE Educational Activities Department, 1996). Vera et al. used interpolation to smooth the two-dimensional mapping produced by multidimensional scaling, strengthening its ability to analyze time-series data (Vera J F, Angulo J M, Roldán J A. Stability analysis in nonstationary spatial covariance estimation [J]. Stochastic Environmental Research & Risk Assessment, 2016: 1-14).

Multidimensional data often changes over time, and its temporal evolution patterns are often difficult to explore. Visual analysis of multidimensional data with pronounced temporal attributes is therefore of great significance. Animation (Bender S, Mcfarland D A. The art and science of dynamic network visualization [J]. Journal of Social Structure, 2006, 7(2): 1206-1241) has become an intuitive way to show how data evolves over time because it effectively preserves the user's mental map. However, Archambault et al. showed experimentally that the mental map obtained from animation may not help much with visual cognition over longer time series; moreover, drastic changes of the projected points across multiple moments make it impossible to quickly track how patterns change between consecutive time steps (Archambault D, Purchase H, Pinaud B. Animation, Small Multiples, and the Effect of Mental Map Preservation in Dynamic Graphs [J]. IEEE Transactions on Visualization & Computer Graphics, 2011, 17(4): 539-552). Recent methods (Burch M, Fritz M, Beck F, et al. TimeSpiderTrees: A Novel Visual Metaphor for Dynamic Compound Graphs [C]// IEEE Symposium on Visual Languages and Human-Centric Computing. IEEE Computer Society, 2010: 168-175; Liu S, Wu Y, Wei E, Liu M, Liu Y. Storyflow: Tracking the evolution of stories [C]// IEEE Transactions on Visualization Computer Graphics, 19(12): 2436-45, 2013) therefore mostly focus on showing time-series data with static graphs. A time axis and small multiples are the two options for encoding the time dimension in a static graph. Many researchers use small multiples for visual analysis. Hadlak et al. proposed a visualization method based on small multiples that lets users interactively select focus regions in several of the small graphs and builds a suitable layout for the selected data (Hadlak S, Schulz H J, Schumann H. In situ exploration of large dynamic networks [J]. IEEE Trans Vis Comput Graph, 2011, 17(12): 2334-2343). Walker et al. brought the advantages of spherical coordinates into parallel coordinates to show the temporal characteristics of multidimensional data more intuitively (Walker J, Geng Z, Jones M, et al. Visualization of Large, Time-Dependent, Abstract Data with Integrated Spherical and Parallel Coordinates [C]// EuroVis-Short Papers. 2012: 43-47). However, for long time series the small-multiples approach has to show a large number of projection plots, which severely reduces readability, increases visual clutter, and limits the user's ability to track temporal features.

Time-axis methods use time as one axis and compress the high-dimensional information into a one-dimensional or two-dimensional space, which greatly improves readability. Much of the research that uses multidimensional scaling therefore shows multidimensional time-varying data in this way to strengthen the user's visual cognition in the low-dimensional space. Dwyer et al. proposed a technique similar to the space-time cube, treating time as a third dimension on top of the two-dimensional multidimensional scaling projection and thereby showing how the data changes over time in three-dimensional space (Dwyer T, Gallagher D R. Visualising Changes in Fund Manager Holdings in Two and a Half-Dimensions [J]. Information Visualization, 2004, 3(4): 227-244). Bernard et al. designed a time-path method for showing multidimensional time-varying data: the same data entity at different moments is connected into a time path, which helps users understand how the pattern of a single entity changes over time (Bernard J, Wilhelm N, Scherer M, et al. TimeSeriesPaths: Projection-Based Explorative Analysis of Multivariate Time Series Data [C]// Conference in Central Europe on Computer Graphics, Visualization and Computer Vision. 2012). Other methods that use multidimensional scaling to project multidimensional data into two-dimensional space (Hu Y, Wu S, Xia S, et al. Motion track: Visualizing variations of human motion data [C]// Visualization Symposium. IEEE, 2010: 153-160; Ward M O, Guo Z. Visual Exploration of Time-Series Data with Shape Space Projections [J]. Computer Graphics Forum, 2011, 30(3): 701-710; Mao Y, Dillon J, Lebanon G. Sequential document visualization [J]. IEEE Transactions on Visualization & Computer Graphics, 2007, 13(6): 1208-1215) focus on the movement of individual data items over time and on the paths they trace. Although these techniques allow cyclic patterns in the data to be detected, the data items are projected to arbitrary two-dimensional coordinates, which causes severe visual clutter and interferes with the user's visual tracking. Jackle et al. proposed a sliding-window multidimensional scaling algorithm (Jackle D, Fischer F, Schreck T, Keim D A. Temporal mds plots for analysis of multivariate data [C]// IEEE Transactions on Visualization Computer Graphics, 22(1): 141, 2016) that compresses multidimensional data to one dimension and, with time as the other axis, supports visual analysis of network security data in two-dimensional space. However, mapping multidimensional data to a one-dimensional space suppresses the visual perception of feature differences, and although the sliding window effectively smooths abrupt changes in the multidimensional scaling result, it also suppresses the display of outliers.

As can be seen, high-dimensional time-varying data visualization methods help domain experts analyze and display data features quickly and strengthen the visual perception of the visualization, which improves the efficiency of visualizing and analyzing such data to some extent. However, existing methods such as parallel coordinates, the scatterplot matrix, and dimension-reduction mapping still have limitations: they struggle to show the temporal characteristics of the data and do not lend themselves to interactive analysis, which holds back further gains in visualization efficiency.

Summary of the invention

The purpose of the present invention is to provide a high-dimensional time-varying data visualization method with enhanced visual perception in the dimension-reduced space.

To achieve this purpose, the invention adopts the following technical solution.

The high-dimensional time-varying data visualization method with enhanced visual perception in the dimension-reduced space comprises:

Reading the high-dimensional time-varying data, obtaining the two-dimensional coordinates of the data with a multidimensional scaling algorithm, and solving formula (1) over those two-dimensional coordinates to obtain the target orthogonal matrix.

In formula (1), Q denotes the target orthogonal matrix; the two coordinate terms denote the two-dimensional coordinates of the i-th data item at the current moment and at the previous moment, respectively; N denotes the number of data items; and D^t(Q) denotes the projection offset of the orthogonally transformed data at the current moment relative to the previous moment.

Using the target orthogonal matrix to minimize the projection offset of the high-dimensional time-varying data in two-dimensional space, which yields the projection of the data at the current moment.
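Formula (1) is not reproduced in the text above. Judging only from the symbol definitions given for it, one plausible reading is the least-squares alignment below. This is a sketch under assumptions: the symbol names $p_i^{t}$ and $p_i^{t-1}$ for the current and previous coordinates, the squared norm, and the absence of a normalizing factor are all illustrative choices, not the patent's confirmed notation.

```latex
\[
Q^{*} = \arg\min_{Q^{\top} Q = I} D^{t}(Q),
\qquad
D^{t}(Q) = \sum_{i=1}^{N} \left\lVert Q\, p_i^{t} - p_i^{t-1} \right\rVert^{2}
\]
```

Under this reading, applying $Q^{*}$ to every current coordinate gives the layout at the current moment that lies closest to the layout at the previous moment.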

Further, the invention solves formula (1) over the two-dimensional coordinates of the high-dimensional time-varying data in two forms, rotation and flipping, obtaining a rotation orthogonal matrix and a flip orthogonal matrix, respectively. The projection offset of the data at the current moment relative to the previous moment is computed after transformation by each of the two matrices, and the matrix that yields the smaller projection offset is selected as the target orthogonal matrix.
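The rotation-or-flip selection can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the projection offset is the sum of squared distances between matched points, and it swaps in the closed-form best angle for each family of 2x2 orthogonal matrices instead of an iterative search. All function names are hypothetical.

```python
import math

def transform(q, p):
    """Apply the 2x2 matrix q to the 2D point p."""
    return (q[0][0] * p[0] + q[0][1] * p[1], q[1][0] * p[0] + q[1][1] * p[1])

def projection_offset(curr, prev, q):
    """Assumed offset: sum of squared distances after transforming curr by q."""
    total = 0.0
    for p, r in zip(curr, prev):
        x, y = transform(q, p)
        total += (x - r[0]) ** 2 + (y - r[1]) ** 2
    return total

def cross_terms(curr, prev):
    """Entries a[j][k] = sum over i of curr_i[j] * prev_i[k]."""
    a = [[0.0, 0.0], [0.0, 0.0]]
    for p, r in zip(curr, prev):
        for j in range(2):
            for k in range(2):
                a[j][k] += p[j] * r[k]
    return a

def best_rotation(curr, prev):
    """Rotation [[c, -s], [s, c]] that minimizes the assumed offset."""
    a = cross_terms(curr, prev)
    theta = math.atan2(a[0][1] - a[1][0], a[0][0] + a[1][1])
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def best_flip(curr, prev):
    """Reflection [[c, s], [s, -c]] that minimizes the assumed offset."""
    a = cross_terms(curr, prev)
    theta = math.atan2(a[0][1] + a[1][0], a[0][0] - a[1][1])
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

def target_orthogonal_matrix(curr, prev):
    """Keep whichever candidate gives the smaller projection offset."""
    candidates = [best_rotation(curr, prev), best_flip(curr, prev)]
    return min(candidates, key=lambda q: projection_offset(curr, prev, q))
```

Because every 2x2 orthogonal matrix is either a rotation or a reflection, comparing the best member of each family covers the whole search space under the stated assumption.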

Further, after the target orthogonal matrix has been used to minimize the projection offset of the high-dimensional time-varying data in two-dimensional space, the invention partitions the projection space of the data at the current moment into N or more subspaces and fills different projection points into different subspaces, which yields the projection of the data at the current moment.

Further, the subspaces are equilateral triangles, squares, or regular hexagons.
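For the hexagonal case, assigning projection points to subspaces can be sketched as below. The sketch assumes a pointy-top hexagonal tiling addressed by axial coordinates (q, r) and a hypothetical `size` parameter (center-to-corner distance); the patent text does not specify the orientation, the addressing scheme, or the cell size.

```python
import math

def hex_center(q, r, size):
    """Cartesian center of the hexagon with axial coordinates (q, r)."""
    return (size * math.sqrt(3) * (q + r / 2), size * 1.5 * r)

def point_to_hex(x, y, size):
    """Axial coordinates of the hexagon that contains the point (x, y)."""
    fq = (math.sqrt(3) / 3 * x - y / 3) / size
    fr = (2 / 3 * y) / size
    fs = -fq - fr
    q, r, s = round(fq), round(fr), round(fs)
    dq, dr, ds = abs(q - fq), abs(r - fr), abs(s - fs)
    # Cube rounding: recompute the coordinate with the largest rounding error.
    if dq > dr and dq > ds:
        q = -r - s
    elif dr > ds:
        r = -q - s
    return (q, r)

def bin_points(points, size):
    """Group projection points by the hexagon they fall into."""
    cells = {}
    for p in points:
        cells.setdefault(point_to_hex(p[0], p[1], size), []).append(p)
    return cells
```

Any cell that ends up holding two or more points in `bin_points` is one whose points would need to be spread over separate hexagons.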

Further, when the subspaces are regular hexagons and a target subspace contains two or more target projection points, the target projection points are filled one by one into different regular hexagonal subspaces, in the order in which they are traversed within the target subspace, as follows.

If the current target projection point is the first projection point, fill it into the target subspace; otherwise, perform the following steps:

Step a. If the current target projection point lies at the center of the current subspace, go to step b; otherwise go to step c.

Step b. Search outward layer by layer from the current subspace until an unfilled outer regular hexagon is found. Fill the current target projection point into that hexagon and reset the current subspace to the target subspace. Then check whether all target projection points in the target subspace have been filled; if not, take the next target projection point as the current target projection point and return to step a.

Step c. Among the regular hexagons adjacent to the current subspace, find the two with the smallest included angle. For an adjacent hexagon, the included angle is the angle between the line joining the hexagon's center to the center of the current subspace and the line joining the current target projection point to the center of the current subspace.

Step d. Of the two hexagons found in step c, select the one whose center is closer to the current target projection point as the preferred hexagon.

Step e. If the preferred hexagon is unfilled, fill the current target projection point into it and reset the current subspace to the target subspace. Then check whether all target projection points in the target subspace have been filled; if not, take the next target projection point as the current target projection point and return to step a. If the preferred hexagon is already filled, go to step f.

Step f. Of the two adjacent hexagons that share an edge with both the preferred hexagon and the current subspace, select the one whose center is closer to the current target projection point as the second-best hexagon. If the second-best hexagon is unfilled, fill the current target projection point into it and reset the current subspace to the target subspace. Then check whether all target projection points in the target subspace have been filled; if not, take the next target projection point as the current target projection point and return to step a. If the second-best hexagon is already filled, go to step g.

Step g. Update the current subspace to the preferred hexagon, then move the current target projection point from the target subspace into the current subspace so that its position within the current subspace is the same as its position within the target subspace before the move, and return to step c.
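Steps a to g can be sketched as follows. This is an illustrative reading rather than the patent's implementation: it assumes pointy-top hexagons of unit size addressed by axial coordinates, tracks occupied hexagons in a set, and treats a point as lying at the center when it is within a small tolerance `eps`. All names are hypothetical.

```python
import math

# Axial-coordinate offsets of the six neighbours of a hexagon.
NEIGHBOURS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

def center(cell):
    """Cartesian center of a unit-size pointy-top hexagon (axial coordinates)."""
    return (math.sqrt(3) * (cell[0] + cell[1] / 2), 1.5 * cell[1])

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def hex_distance(a, b):
    dq, dr = a[0] - b[0], a[1] - b[1]
    return max(abs(dq), abs(dr), abs(dq + dr))

def angle_to(cell, current, point):
    """Angle between (cell center - current center) and (point - current center)."""
    c0, c1 = center(current), center(cell)
    a = math.atan2(c1[1] - c0[1], c1[0] - c0[0])
    b = math.atan2(point[1] - c0[1], point[0] - c0[0])
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def nearest_free_ring_cell(current, filled):
    """Step b: search outward ring by ring for an unfilled hexagon."""
    k = 1
    while True:
        for dq in range(-k, k + 1):
            for dr in range(-k, k + 1):
                cell = (current[0] + dq, current[1] + dr)
                if hex_distance(cell, current) == k and cell not in filled:
                    return cell
        k += 1

def place(point, target, filled, eps=1e-9):
    """Find a free hexagon for a point whose target hexagon is occupied."""
    current = target
    while True:
        if dist(point, center(current)) < eps:                    # step a
            return nearest_free_ring_cell(current, filled)        # step b
        around = [(current[0] + dq, current[1] + dr) for dq, dr in NEIGHBOURS]
        two = sorted(around, key=lambda c: angle_to(c, current, point))[:2]  # step c
        preferred = min(two, key=lambda c: dist(center(c), point))           # step d
        if preferred not in filled:                                          # step e
            return preferred
        # Step f: the two hexagons adjacent to both current and preferred.
        shared = [c for c in around if hex_distance(c, preferred) == 1]
        second = min(shared, key=lambda c: dist(center(c), point))
        if second not in filled:
            return second
        # Step g: move on to the preferred hexagon, keeping the relative position.
        old, new = center(current), center(preferred)
        point = (point[0] - old[0] + new[0], point[1] - old[1] + new[1])
        current = preferred

def fill_cell(points, target, filled):
    """Assign each point of one target hexagon its own hexagon, in order."""
    cells = []
    for p in points:
        cell = target if target not in filled else place(p, target, filled)
        filled.add(cell)
        cells.append(cell)
    return cells
```

For example, `fill_cell([(0.1, 0.0), (0.3, 0.05)], (0, 0), set())` keeps the first point in the target hexagon and moves the second to the neighbouring hexagon that lies in its direction. After step g the loop re-checks the step a condition, which is harmless because the moved point keeps its nonzero offset from the new center.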

Compared with the prior art, the invention has the following beneficial effects:

(1) After multidimensional scaling has produced the two-dimensional coordinates of the high-dimensional time-series data in the dimension-reduced space, the target orthogonal matrix is found by iterating over orthogonal transformation matrices to minimize the projection offset between adjacent time steps. The target orthogonal matrix is then applied to the two-dimensional coordinates of the data, which minimizes the projection offset in two-dimensional space and reduces the displacement of projected points between adjacent time steps. The invention thereby reduces the visual clutter of high-dimensional time-varying data in the dimension-reduced space, shows the temporal characteristics of the data quickly, facilitates interactive analysis, markedly improves the efficiency of visualizing and analyzing such data, and effectively meets users' application needs.

(2) Further, restricting the solution space to the two forms of rotation and flipping allows the target orthogonal matrix to be obtained faster.

(3) To achieve better visual perception, the invention partitions the projection space of the high-dimensional time-varying data at the current moment and, taking both the smallest included angle and the smallest distance into account, projects different data items into different subspaces. This further avoids the visual clutter that can arise when, as the number of projection points grows, two or more of them fall on the same regular hexagon.

Brief description of the drawings

Fig. 1 compares the visualization before and after the projection offset of the high-dimensional time-varying data in two-dimensional space is minimized: (a) is the result of projecting the high-dimensional data into two-dimensional space with classical multidimensional scaling at the current moment; (b) is the corresponding result at the next moment; (c) is the result at the current moment when the classical multidimensional scaling projection is followed by the orthogonal transformation; (d) is the corresponding result at the next moment.

Fig. 2 shows the visualization obtained by stacking four projection points directly on the same regular hexagon.

Fig. 3 shows the visualization obtained when the invention fills the projection points with the scheme that considers both included angle and distance.

Fig. 4 is a flow chart of filling the projection points with the scheme that considers both included angle and distance.

Fig. 5 compares occlusion-free rendering results: (a) is the result of a classical scatterplot, and (b) is the occlusion-free rendering result of the invention.

Detailed description of the embodiments

The high-dimensional time-varying data visualization method with enhanced visual perception in dimension-reduced space according to the present invention is further described below with specific embodiments. The method is as follows:

The initial high-dimensional time-varying data are loaded and projected into two-dimensional space for display by the classical multidimensional scaling algorithm, which yields the two-dimensional coordinates of the high-dimensional time-varying data. The procedure is as follows:

First, the original high-dimensional time-varying data are normalized according to formula (2) to eliminate the influence of the differing units of measurement of the dimensions.

In formula (2), x_i denotes the value of the i-th data item in dimension x; x_min and x_max denote the minimum and maximum values of dimension x, respectively; and the result of formula (2) is the i-th data item of dimension x after normalization.
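The normalization described for formula (2) reads like ordinary min-max scaling. The following is a minimal sketch under that assumption; the function names `min_max_normalize` and `normalize_rows` and the handling of a constant dimension are illustrative choices, not taken from the patent.

```python
def min_max_normalize(column):
    """Scale the values of one dimension to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant dimension has zero range; mapping it to 0.0 is a
        # choice made for this sketch only.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def normalize_rows(rows):
    """Normalize every dimension (column) of a list of data items (rows)."""
    columns = [min_max_normalize(list(col)) for col in zip(*rows)]
    return [list(item) for item in zip(*columns)]
```

Each dimension is scaled independently, so a dimension measured in large units no longer dominates the distances computed afterwards.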

Next, the distance matrix of the normalized high-dimensional time-varying data in the original space is computed. The present invention may use the Euclidean distance matrix D, calculated as shown in formula (3):

In formula (3), the two operands are the i-th and j-th data items after normalization, ||·|| denotes the 2-norm, the resulting entry is the Euclidean distance between the normalized i-th and j-th data items, and n denotes the number of data items.
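The distance matrix of formula (3) can be sketched as follows, assuming each data item is a list of normalized values of equal length. The function name `euclidean_distance_matrix` is hypothetical.

```python
import math


def euclidean_distance_matrix(items):
    """Return the symmetric n x n matrix of pairwise Euclidean distances."""
    n = len(items)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(items[i], items[j])))
            dist[i][j] = dist[j][i] = d  # the matrix is symmetric
    return dist
```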

The centered inner-product matrix B of the low-dimensional representation is then computed from the distance matrix, as shown in formula (4):

In formula (4), the distance term denotes the Euclidean distance between the i-th and j-th data items, and n denotes the number of data items.
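In classical multidimensional scaling, the centered inner-product matrix is obtained by double-centering the squared distances. The sketch below assumes formula (4) has this standard form; `double_center` is a hypothetical name.

```python
def double_center(dist):
    """Double-center squared distances: B = -1/2 * J * D^2 * J, with J = I - (1/n) 11'."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    col_mean = [sum(sq[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_mean) / n
    return [
        [-0.5 * (sq[i][j] - row_mean[i] - col_mean[j] + grand_mean) for j in range(n)]
        for i in range(n)
    ]
```

For three collinear points at 0, 1 and 2 the result equals the outer product of the centered coordinates (-1, 0, 1), which is the property the later eigendecomposition relies on.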

Finally, B is decomposed by orthogonal (eigen) decomposition, and the two largest eigenvalues and their corresponding eigenvectors are selected to obtain the fitted configuration in two-dimensional space. The calculation is shown in formula (5):

In formula (5), U, Λ and U′ are respectively the eigenvector matrix of B, the eigenvalue matrix of B, and the transpose of the eigenvector matrix; the reduced eigenvalue matrix holds the two eigenvalues of B with the largest absolute values, and U₂ holds the eigenvectors corresponding to those eigenvalues; the resulting configuration is the representation of the high-dimensional time-varying data in the dimension-reduced space (that is, the two-dimensional space).
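For orientation, classical multidimensional scaling usually writes the quantities described for formulas (4) and (5) as below. This is the textbook form, given on the assumption that the patent follows it; the symbols $d_{ij}$, $b_{ij}$, $\Lambda_2$ and $\hat{X}$ are names chosen here.

```latex
b_{ij} = -\frac{1}{2}\left( d_{ij}^{2}
         - \frac{1}{n}\sum_{k=1}^{n} d_{ik}^{2}
         - \frac{1}{n}\sum_{k=1}^{n} d_{kj}^{2}
         + \frac{1}{n^{2}}\sum_{k=1}^{n}\sum_{l=1}^{n} d_{kl}^{2} \right),
\qquad
B = U \Lambda U',
\qquad
\hat{X} = U_{2}\,\Lambda_{2}^{1/2}
```

Here $\hat{X}$ is an $n \times 2$ matrix whose i-th row is the two-dimensional coordinate of the i-th data item.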

Because the multidimensional scaling algorithm preserves only the relative positions of the data, the projection of the same data item at different moments is largely arbitrary. The difference usually stems from an orthogonal transformation or from normalization; it seriously interferes with the user's visual tracking of the temporal features of interest and causes severe visual confusion for data analysts. Comparing Fig. 1(a) with Fig. 1(b) shows that some data items are projected to the upper-right corner at one moment and abruptly mapped to the lower-right corner at the next; relative to Fig. 1(c) and Fig. 1(d) this is a severe visual disturbance, which to some extent limits the efficiency of high-dimensional time-varying data visualization.

To improve the user's visual cognition of time-varying features, the present invention therefore adds an orthogonal transformation on top of the multidimensional scaling algorithm to correct the projection coordinates of the data. Without changing the relative positions of the data in the low-dimensional space, this minimizes the offset of the multi-dimensional time-varying data in the low-dimensional space and strengthens the ability to explore the time-varying features of the data.

An orthogonal transformation can correct the coordinates of the projection points without changing the positional relationships among the data, and can thereby minimize the offset distance between adjacent maps. The present invention therefore solves formula (1) iteratively over the two-dimensional coordinates of the high-dimensional time-varying data to obtain the target orthogonal matrix that minimizes the projection offset.

$$\arg\min_{Q} D^{t}(Q) = \sum_{i=1}^{N} \left\lVert Q v_i^{t} - v_i^{t-1} \right\rVert \tag{1}$$

In formula (1), Q denotes the target orthogonal matrix, v_i^t denotes the two-dimensional coordinates of the i-th data item of the high-dimensional time-varying data at the current moment, v_i^(t-1) denotes the two-dimensional coordinates of the i-th data item at the previous moment, N denotes the number of data items, and D^t(Q) denotes the projection offset, at the current moment relative to the previous moment, of the high-dimensional time-varying data after transformation by the orthogonal matrix.
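The objective D^t(Q) is a sum of point-wise Euclidean distances between the transformed current coordinates and the previous coordinates. A minimal sketch, assuming Q is given as a 2 x 2 nested list and the coordinates as lists of (x, y) pairs; the name `projection_offset` is hypothetical.

```python
import math


def projection_offset(q, current, previous):
    """Sum over i of ||Q v_i^t - v_i^(t-1)|| for paired 2-D points."""
    total = 0.0
    for (x, y), (px, py) in zip(current, previous):
        qx = q[0][0] * x + q[0][1] * y  # first row of Q applied to v_i^t
        qy = q[1][0] * x + q[1][1] * y  # second row of Q applied to v_i^t
        total += math.hypot(qx - px, qy - py)
    return total
```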

Further, the target orthogonal matrix is used to minimize the projection offset of the high-dimensional time-varying data in two-dimensional space, which yields the projection of the high-dimensional time-varying data at the current moment. The present invention thereby reduces the visual confusion of high-dimensional time-varying data in the dimension-reduced space, lets the temporal features of the data be displayed quickly, facilitates interactive analysis by the user, markedly improves the efficiency of visualizing and analyzing high-dimensional time-varying data, and effectively meets the user's application needs.

As a preferred mode of the present invention, to obtain the target orthogonal matrix faster, the solution space can be reduced by using only the rotations and flips among the orthogonal transformations to minimize the offset of the sequence of maps. That is, formula (1) is solved over the two-dimensional coordinates of the high-dimensional time-varying data separately for rotation and for flipping. The rotation transformation may use the matrix shown in formula (6):

In formula (6), Q_Rotation denotes the orthogonal matrix of the rotation transformation and θ denotes the rotation angle, so the problem of minimizing the offset of the sequence of maps reduces to finding the optimal rotation angle. Likewise, the flip transformation may use the matrix shown in formula (7):

In formula (7), Q_Overturn denotes the orthogonal matrix of the flip transformation and k denotes the slope of the axis of symmetry of the flip.
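The two matrix families can be sketched as follows. The forms used are the standard counter-clockwise rotation by θ and the standard reflection across the line y = kx through the origin; the patent's formulas (6) and (7) may differ in sign or orientation convention, and the function names are hypothetical.

```python
import math


def rotation_matrix(theta):
    """Counter-clockwise rotation by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def reflection_matrix(k):
    """Reflection across the line y = k * x through the origin."""
    scale = 1.0 + k * k
    return [[(1 - k * k) / scale, 2 * k / scale],
            [2 * k / scale, (k * k - 1) / scale]]


def apply(q, point):
    """Multiply the 2 x 2 matrix q by the column vector point."""
    x, y = point
    return (q[0][0] * x + q[0][1] * y, q[1][0] * x + q[1][1] * y)
```

A slope k cannot express a vertical axis of symmetry; parameterizing the axis by its angle avoids that limitation.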

Minimizing the offset by a flip transformation reduces to finding the optimal axis of symmetry. The present invention can obtain the most accurate orthogonal transformation by traversing all possible angles and axes of symmetry.
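The traversal can be illustrated with a grid search over a finite set of candidate angles. This is a sketch, not the patent's procedure: the step count `steps`, the function names, and the choice to parameterize the flip axis by its angle (so that vertical axes are included) are assumptions.

```python
import math


def projection_offset(q, current, previous):
    """Sum of Euclidean distances between Q * current[i] and previous[i]."""
    return sum(
        math.hypot(q[0][0] * x + q[0][1] * y - px, q[1][0] * x + q[1][1] * y - py)
        for (x, y), (px, py) in zip(current, previous)
    )


def best_orthogonal(current, previous, steps=360):
    """Try `steps` rotations and `steps` flips; return (offset, kind, matrix)."""
    best = None
    for n in range(steps):
        a = 2 * math.pi * n / steps
        c, s = math.cos(a), math.sin(a)
        candidates = (
            ("rotation", [[c, -s], [s, c]]),  # rotation by angle a
            ("flip", [[c, s], [s, -c]]),      # reflection across the axis at angle a / 2
        )
        for kind, q in candidates:
            d = projection_offset(q, current, previous)
            if best is None or d < best[0]:
                best = (d, kind, q)
    return best
```

Because both families are searched in one loop, the returned transformation is already the better of the best rotation and the best flip on the chosen grid.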

As a preferred embodiment, the present invention may use the average offset angle of the data items between adjacent maps as the rotation angle, thereby obtaining the optimal rotation angle; the calculation is shown in formula (8):

In formula (8), θ^t denotes the rotation angle at time t; the two coordinate symbols denote the projection coordinates of the i-th projection point at time t and at time t-1, respectively; and N is the number of projection points.
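One way to read the "average offset angle" is as the mean signed angle that turns each current projection point onto its previous position. The sketch below takes that reading; it is an assumption about formula (8), and `mean_offset_angle` is a hypothetical name.

```python
import math


def mean_offset_angle(current, previous):
    """Mean signed angle (radians) from each current point to its previous point."""
    angles = []
    for (x, y), (px, py) in zip(current, previous):
        cross = x * py - y * px  # proportional to sin of the signed angle
        dot = x * px + y * py    # proportional to cos of the signed angle
        if cross == 0.0 and dot == 0.0:
            continue  # a point at the origin has no direction
        angles.append(math.atan2(cross, dot))
    return sum(angles) / len(angles) if angles else 0.0
```

Averaging per-point angles is cheap but approximate: individual angles near ±π can partly cancel, so the result is best treated as an estimate to be checked against the offset of formula (1).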

As a preferred embodiment for computing the optimal axis of symmetry, the present invention may use the approximate axis of symmetry of the m data items with the largest offsets between adjacent maps as the axis of symmetry of the flip transformation; the calculation is shown in formula (9):

In formula (9), the two coordinate symbols denote the projection point with the i-th largest offset on the maps at time t and at time t-1, respectively; m is the user-defined number of data items with the largest offsets; and k^t denotes the slope of the axis of symmetry at time t. The slope k^t gives the optimal axis of symmetry through the origin.
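The exact form of formula (9) is not reproduced here, so the following is only one plausible reading. If a point and its image are mirrored across an axis through the origin, their midpoint lies on that axis, so the slope of the midpoint approximates k^t. The sketch averages that slope over the m most displaced pairs; the function name and the averaging are assumptions.

```python
import math


def estimate_axis_slope(current, previous, m):
    """Average midpoint slope over the m pairs with the largest offset."""
    pairs = sorted(
        zip(current, previous),
        key=lambda p: math.hypot(p[0][0] - p[1][0], p[0][1] - p[1][1]),
        reverse=True,
    )[:m]
    slopes = []
    for (x, y), (px, py) in pairs:
        mx, my = x + px, y + py  # twice the midpoint; the factor cancels in the ratio
        if mx != 0.0:
            slopes.append(my / mx)
    # None signals that no finite slope could be estimated (e.g. a vertical axis).
    return sum(slopes) / len(slopes) if slopes else None
```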

Once the optimal rotation angle and the optimal axis of symmetry have been obtained from formula (8) and formula (9), the offsets produced by the rotation and by the flip are compared to determine the appropriate orthogonal transformation. Applying the better of the two yields the dimension-reduced map with the minimum offset and gives the user better visual cognition.

When projecting with the classical multidimensional scaling algorithm, the results are usually shown as a scatter plot. As the amount of data grows, however, the complex distribution of multi-dimensional data causes projection points to occlude one another, which easily leads to visual ambiguity and interaction difficulties for the user, and a suitable projection point size is hard to choose. The present invention therefore introduces a space partitioning technique that divides the scatter plot produced by the multidimensional scaling projection into multiple subspaces, each of which can represent one projection point, thereby removing the mutual occlusion of projection points. The process is as follows:

The present invention may partition the projection space with regular triangles or squares, the two simplest tiling polygons. As a preferred embodiment, regular hexagons are used to partition the projection space, which not only tiles the plane seamlessly but also improves the aesthetics of the visualization. With a one-to-one mapping, each projection point is filled into its corresponding regular hexagonal subspace. By making the regular hexagons as small as possible, the present invention can ensure that each regular hexagonal subspace holds at most one projection point. If, as the number of projection points grows, two or more projection points fall on the same regular hexagon, then as a preferred embodiment the scheme shown in Fig. 4 may further be adopted: following the principles of minimum included angle and shortest distance, the other projection points are filled into other regular hexagons, so that each regular hexagon is filled with at most one projection point and visual confusion is effectively avoided. The method of partitioning the projection space and filling projection points with the scheme of Fig. 4 is described in detail below:
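Assigning projection points to regular hexagons can be sketched with axial hexagon coordinates and cube rounding, a common technique for hexagonal grids. The pointy-top orientation, the `size` parameter (center-to-corner distance) and all function names are assumptions made for illustration; the patent does not specify a coordinate scheme.

```python
import math
from collections import defaultdict


def hex_center(cell, size):
    """Cartesian center of the pointy-top hexagon with axial coordinates (q, r)."""
    q, r = cell
    return (size * math.sqrt(3) * (q + r / 2), size * 1.5 * r)


def point_to_hex(x, y, size):
    """Axial coordinates of the hexagon that contains the point (x, y)."""
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    # Cube rounding: recompute the component with the largest rounding error
    # so that the three cube coordinates still sum to zero.
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return (rq, rr)


def bin_points(points, size):
    """Group point indices by the hexagon they fall into."""
    cells = defaultdict(list)
    for index, (x, y) in enumerate(points):
        cells[point_to_hex(x, y, size)].append(index)
    return dict(cells)
```

Any cell whose list holds two or more indices is a collision that the angle-and-distance scheme described next would have to resolve.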

When the subspaces are regular hexagons and the target subspace contains two or more target projection points, the target projection points are filled one by one, in their traversal order within the target subspace, into different regular hexagonal subspaces by the following method:

The preferred scheme of the present invention, which jointly considers included angle and distance, is described in detail below with a specific example.

For comparison, Fig. 2 shows the visualization obtained when the four projection points A, B, C and D are stacked directly in the same regular hexagon (that is, all fall within regular hexagon 1).

As shown in Fig. 3(a), the four projection points A, B, C and D fall in the target subspace one after another (that is, all fall within regular hexagon 1). The projection space of the high-dimensional time-varying data at the current moment is divided into no fewer than four regular hexagonal subspaces. In Fig. 3(a), regular hexagons 2, 3, 4, 5, 6 and 7 are the adjacent regular hexagons that share an edge with the current subspace (regular hexagon 1); relative to the current subspace they form the first ring of surrounding regular hexagons. Regular hexagons 8-19 form the second ring of surrounding regular hexagons relative to the current subspace (regular hexagon 1).

With the preferred scheme of the present invention shown in Fig. 4, the first target projection point A, the second target projection point B, the third target projection point C and the fourth target projection point D are filled in turn, following the traversal order of the target projection points. The method is as follows:

(1) First, the first target projection point A is filled. As shown in Fig. 3(a), since A is the first target projection point, it is filled into the current subspace; at this point the current subspace is the target subspace (regular hexagon 1).

(2) Next, the second target projection point B is filled. Since the current target projection point B does not fall at the center point of the current subspace, the following steps are performed:

Step a. It is determined that the current target projection point B does not fall at the center point O₁ of the current subspace (regular hexagon 1), so step c is executed.

Step c. The current target projection point B is connected to the center point O₁ of the current subspace (regular hexagon 1) to obtain the line O₁B. The center points of all adjacent regular hexagons 2-7 of the current subspace are each connected to the center point O₁, and the included angles between these lines and the line O₁B are compared. Among them, the included angles α and β (α = β) are the smallest. As shown in Fig. 3(a), α is the included angle between the line O₁O₂, which joins the center point O₁ of regular hexagon 1 and the center point O₂ of regular hexagon 2, and the line O₁B; β is the included angle between the line O₁O₅, which joins the center point O₁ of regular hexagon 1 and the center point O₅ of regular hexagon 5, and the line O₁B. Since α and β are the smallest included angles, regular hexagon 2 and regular hexagon 5 are both "regular hexagons with the smallest included angle". Step d is executed.

Step d. Since regular hexagon 2 and regular hexagon 5 both have the smallest included angle, the one whose center point is closer to the current target projection point B must be selected from them as the preferred regular hexagon. As shown in Fig. 3(a), the distance between the current target projection point B and the center point O₂ is smaller than the distance between B and the center point O₅; therefore, following the shortest-distance principle, regular hexagon 2 is selected as the preferred regular hexagon to be filled with the current target projection point B.

Step e. Since the preferred regular hexagon (regular hexagon 2) has not been filled, the current target projection point B is filled into it, and the current subspace is reset to the target subspace (that is, regular hexagon 1). [See Fig. 3(b).]

(3) Next, the third projection point C is filled.

Step a. It is determined that the current target projection point C does not fall at the center point of the current subspace, so the following steps are performed:

Steps c to d proceed as in the filling of target projection point B above: following the principles of minimum included angle and shortest distance, regular hexagon 2 is the preferred regular hexagon for filling target projection point C. Step e is then executed, as follows:

Step e. Since the preferred regular hexagon (regular hexagon 2) has already been filled with target projection point B, target projection point C cannot be filled into regular hexagon 2, and step f is executed.

Step f. From the two adjoining regular hexagons that share an edge with both the preferred regular hexagon (regular hexagon 2) and the current subspace (regular hexagon 1), namely regular hexagon 3 and regular hexagon 4, the one whose center point is closer to the current target projection point is selected as the second-best regular hexagon. Since the line O₃C, which joins the center point O₃ of regular hexagon 3 and target projection point C, is shorter than the line O₄C, which joins the center point O₄ of regular hexagon 4 and projection point C, regular hexagon 3 is taken as the second-best regular hexagon.

Since the second-best regular hexagon (regular hexagon 3) has not been filled, the current target projection point C is filled into it, and the current subspace is reset to the target subspace (regular hexagon 1). [See Fig. 3(c).]

(4) The fourth projection point D is filled.

Step a. It is determined that the current target projection point D does not fall at the center point of the current subspace, so the following steps are performed:

Similarly, steps b to f proceed as in the filling of target projection point C above: following the principles of minimum included angle and shortest distance, regular hexagon 2 is the preferred regular hexagon for filling target projection point D. However, since the preferred regular hexagon (regular hexagon 2) has already been filled with target projection point B, target projection point D cannot be filled into regular hexagon 2. The second-best regular hexagon (regular hexagon 3) has already been filled with target projection point C, so target projection point D cannot be filled into regular hexagon 3 either. Step g is therefore executed.

Step g. The current subspace is updated to the preferred regular hexagon (that is, regular hexagon 2), which becomes the new current subspace. The current target projection point D is then moved from the target subspace (regular hexagon 1) into the current subspace (regular hexagon 2) so that its position within regular hexagon 2 is the same as its position within regular hexagon 1 before the move. Step h is then executed in the manner of step c.

Step h. As shown in Fig. 3(c), regular hexagons 1, 3, 4, 8, 9 and 10 are the adjacent regular hexagons of the current subspace (regular hexagon 2). The center points of all of these adjacent regular hexagons are each connected to the center point O₂ of the current subspace, and the included angles between these lines and the line O₂D are compared. Among them, the included angles γ and δ are the smallest (γ = δ). As shown in Fig. 3(c), γ is the included angle between the line O₉O₂, which joins the center point O₉ of regular hexagon 9 and the center point O₂ of the current subspace, and the line O₂D; δ is the included angle between the line O₂O₁, which joins the center point O₂ of the current subspace and the center point O₁ of regular hexagon 1, and the line O₂D. Therefore, the two regular hexagons with the smallest included angle are regular hexagon 1 and regular hexagon 9.

Step i. In the manner of step d, since the distance between the current target projection point D and the center point O₉ is smaller than the distance between D and the center point O₁, regular hexagon 9 is selected, following the shortest-distance principle, as the preferred regular hexagon to be filled with the current target projection point D.

Step j. Since regular hexagon 9 has not yet been filled, the current target projection point D is filled into regular hexagon 9. After the filling is complete, the current subspace is reset to the target subspace (that is, regular hexagon 1).

If regular hexagon 1 contains further target projection points, the procedure returns to step a, and the corresponding regular hexagon is selected and filled for each of them by the method above.
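The walkthrough for points A to D can be condensed into the following sketch. It assumes pointy-top hexagons of unit size addressed by axial coordinates, expresses each point as an offset `rel` from the center of the target hexagon, and measures the included angle between undirected lines so that opposite neighbors tie and distance decides, as in steps c and d. The case of a point lying exactly at the center (step b) is not described in this section, so the sketch simply falls back to the first neighbor. All names are hypothetical.

```python
import math

# Axial offsets of the six neighbors, in angular order, so that entries i-1 and
# i+1 are the two cells adjacent to both the current cell and neighbor i.
DIRECTIONS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def neighbor_offset(i):
    """Cartesian offset from a cell center to the center of neighbor i."""
    dq, dr = DIRECTIONS[i % 6]
    return (math.sqrt(3) * (dq + dr / 2), 1.5 * dr)


def distance_to_neighbor(rel, i):
    cx, cy = neighbor_offset(i)
    return math.hypot(rel[0] - cx, rel[1] - cy)


def preferred_direction(rel):
    """Steps c and d: smallest included angle first, shortest distance second."""
    def key(i):
        cx, cy = neighbor_offset(i)
        norm = math.hypot(rel[0], rel[1]) * math.hypot(cx, cy)
        cos_abs = abs(rel[0] * cx + rel[1] * cy) / norm if norm else 0.0
        return (-round(cos_abs, 9), distance_to_neighbor(rel, i))
    return min(range(6), key=key)


def place(rel, target, occupied):
    """Choose a free hexagon for a point at offset rel inside `target`."""
    if target not in occupied:
        return target  # the first point stays in the target hexagon
    current = target
    while True:
        i = preferred_direction(rel)
        step = lambda j: (current[0] + DIRECTIONS[j % 6][0], current[1] + DIRECTIONS[j % 6][1])
        if step(i) not in occupied:  # step e
            return step(i)
        second = min((i - 1, i + 1), key=lambda j: distance_to_neighbor(rel, j))
        if step(second) not in occupied:  # step f
            return step(second)
        current = step(i)  # step g: same relative position, one cell further out


def fill_target(rels, target, occupied):
    """Place the points in traversal order and record the occupied cells."""
    placed = []
    for rel in rels:
        cell = place(rel, target, occupied)
        occupied.add(cell)
        placed.append(cell)
    return placed
```

With four points in the same angular sector, the first stays in the target cell, the second takes the preferred neighbor, the third takes the nearer shared neighbor, and the fourth moves one cell further out along the preferred direction, mirroring the placement of A, B, C and D in hexagons 1, 2, 3 and 9.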

As a preferred embodiment of the present invention, filling the target projection points one by one into the regular hexagons by the method shown in Fig. 4 removes the visual ambiguity caused by projection points occluding one another after high-dimensional data are reduced to a low-dimensional space. The user obtains an occlusion-free projection of the high-dimensional data in the low-dimensional space, which enhances the user's visual cognition of multi-dimensional data.

Fig. 1 compares, using scatter plots, the result images obtained for high-dimensional time-varying data with the classical multidimensional scaling algorithm and with the method of the present invention. Fig. 1(a) and Fig. 1(b) are the rendering results of the classical multidimensional scaling algorithm given by formulas (3), (4) and (5); because only the relative positions of the data are preserved, the projection of the same data item at different moments is highly arbitrary, which greatly hinders the user's visual perception and makes the temporal patterns of the data hard to track visually. Fig. 1(c) and Fig. 1(d) are the rendering results of the high-dimensional time-varying data visualization with enhanced visual perception in dimension-reduced space, using formulas (8) and (9): the orthogonal transformation matrix is computed and the projection point offset between adjacent time steps is minimized, which enhances the user's visual perception of the temporal features of the high-dimensional time-varying data. Comparing Fig. 1(b) with Fig. 1(d) shows that, with the algorithm of the present invention, the projection points of high-dimensional time-varying data at adjacent time steps can be tracked clearly.

Fig. 5 compares, for the modified multidimensional scaling algorithm using formula (1), the result images obtained by displaying the high-dimensional data as a scatter plot and with hexagonal space partitioning. Fig. 5(a) is the rendering result of an ordinary scatter plot; because the size of the projection points in a scatter plot is hard to define, severe occlusion often occurs. Fig. 5(b) is the rendering result obtained by partitioning the space with regular hexagons; when several points are projected onto the same regular hexagon, the preferred scheme described above, which jointly considers included angle and distance, removes the mutual occlusion of the projection points.

It can be seen that the result image obtained with the method of the present invention is free of the occlusion found in the scatter plot and supports convenient interactive analysis, which improves the efficiency of visualizing and analyzing high-dimensional time-varying data.

Compared with the conventional visualization process for high-dimensional time-varying data, the greatest advantage of the present invention is that, having obtained the representation of the high-dimensional time-varying data in a low-dimensional space, it applies an orthogonal transformation on top of the multidimensional scaling algorithm to minimize the projection offset of the data in two-dimensional space. This helps the user perceive and track the temporal patterns of interest effectively and enhances the user's visual perception of high-dimensional time-varying data in the dimension-reduced space. In addition, to avoid mutual occlusion between projection points, regular hexagons are introduced to partition the projection space of the high-dimensional time-varying data, which enhances the user's visual cognition of, and interaction with, the features of the high-dimensional data in the dimension-reduced space, allows the temporal patterns implicit in the data to be displayed quickly, and improves the efficiency of visualizing and analyzing high-dimensional time-varying data. With the method of the present invention, the user's visual perception of the time-varying features of high-dimensional data is enhanced in the low-dimensional space; an orthogonal matrix is computed to minimize the projection offset of the high-dimensional time-varying data in two-dimensional space, which reduces the visual confusion that the multidimensional scaling algorithm introduces when analyzing time-varying data; and partitioning the projection space with hexagons removes the mutual occlusion between projection points, improving the efficiency of visualizing and analyzing high-dimensional time-varying data.

Claims

1. A high-dimensional time-varying data visualization method with enhanced visual perception in dimension-reduced space, characterized in that it comprises:

reading high-dimensional time-varying data, obtaining the two-dimensional coordinates of the high-dimensional time-varying data with a multidimensional scaling algorithm, and solving formula (1) over the two-dimensional coordinates of the high-dimensional time-varying data to obtain a target orthogonal matrix,

$$\arg\min_{Q} D^{t}(Q) = \sum_{i=1}^{N} \left\lVert Q v_i^{t} - v_i^{t-1} \right\rVert \tag{1}$$

In formula (1), Q denotes the target orthogonal matrix, v_i^t denotes the two-dimensional coordinates of the i-th data item of the high-dimensional time-varying data at the current moment, v_i^(t-1) denotes the two-dimensional coordinates of the i-th data item at the previous moment, N denotes the number of data items, and D^t(Q) denotes the projection offset, at the current moment relative to the previous moment, of the high-dimensional time-varying data after transformation by the orthogonal matrix;

The projection offset of high-dimensional time-varying data in two-dimensional space is minimized by using the target orthogonal matrix, and the projection of high-dimensional time-varying data at the current moment is obtained.

2. The high-dimensional time-varying data visualization method according to claim 1, characterized in that: formula (1) is solved over the two-dimensional space coordinates of the high-dimensional time-varying data by rotation and by flipping respectively, yielding a rotation orthogonal matrix and a flip orthogonal matrix; the projection offset at the current moment relative to the previous moment is calculated for the high-dimensional time-varying data transformed by each of the two matrices; and the orthogonal matrix with the smaller projection offset is selected as the target orthogonal matrix.
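
As an illustration of claims 1 and 2 (a minimal sketch, not the patented implementation): in two dimensions an orthogonal matrix is either a rotation or a reflection, so one candidate of each kind can be computed and the one with the smaller offset under formula (1) kept. The sketch assumes each candidate angle comes from the closed-form least-squares fit, which minimizes the sum of squared distances and is therefore only a stand-in for the unsquared sum in formula (1). The names `offset`, `best_rotation`, `best_flip`, `target_matrix` and `transform` are hypothetical.

```python
import math

def offset(q, cur, prev):
    """D^t(Q) of formula (1): sum of distances between Q*v_i^t and v_i^(t-1)."""
    (a, b), (c, d) = q
    return sum(math.hypot(a * x + b * y - px, c * x + d * y - py)
               for (x, y), (px, py) in zip(cur, prev))

def best_rotation(cur, prev):
    """Rotation (det = +1) aligning cur to prev; least-squares angle (assumption)."""
    dot = sum(x * px + y * py for (x, y), (px, py) in zip(cur, prev))
    cross = sum(x * py - y * px for (x, y), (px, py) in zip(cur, prev))
    t = math.atan2(cross, dot)
    return ((math.cos(t), -math.sin(t)), (math.sin(t), math.cos(t)))

def best_flip(cur, prev):
    """Reflection (det = -1) aligning cur to prev; least-squares angle (assumption)."""
    dot = sum(x * px - y * py for (x, y), (px, py) in zip(cur, prev))
    cross = sum(x * py + y * px for (x, y), (px, py) in zip(cur, prev))
    t = math.atan2(cross, dot)
    return ((math.cos(t), math.sin(t)), (math.sin(t), -math.cos(t)))

def target_matrix(cur, prev):
    """Claim 2: keep whichever candidate gives the smaller projection offset."""
    return min((best_rotation(cur, prev), best_flip(cur, prev)),
               key=lambda q: offset(q, cur, prev))

def transform(q, pts):
    """Apply the 2x2 matrix q to every point."""
    (a, b), (c, d) = q
    return [(a * x + b * y, c * x + d * y) for x, y in pts]

# Example: the current layout is the previous layout rotated by 90 degrees.
prev = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -2.0)]
cur = [(-y, x) for x, y in prev]
q = target_matrix(cur, prev)
aligned = transform(q, cur)  # approximately equal to prev
```

In this example the rotation candidate undoes the 90-degree turn, so it is selected over the flip candidate; if `cur` were a mirror image of `prev`, the flip candidate would be selected instead.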

3. The high-dimensional time-varying data visualization method according to claim 1 or 2, characterized in that: after the target orthogonal matrix has been used to minimize the projection offset of the high-dimensional time-varying data in two-dimensional space, the projection space of the high-dimensional time-varying data at the current moment is divided into more than N subspaces, and different projection points are filled into different subspaces to obtain the projection of the high-dimensional time-varying data at the current moment.

4. The high-dimensional time-varying data visualization method according to claim 3, wherein the subspace is a regular triangle, a square or a regular hexagon.
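
To illustrate the regular-hexagon case of claims 3 and 4, the following minimal sketch assigns each projection point to a hexagonal subspace. It assumes pointy-top hexagons addressed by axial coordinates `(q, r)` and a hypothetical parameter `size` (center-to-corner distance); the claims fix neither the orientation nor the cell size, and all function names are illustrative.

```python
import math
from collections import defaultdict

def hex_round(qf, rf):
    """Round fractional axial coordinates to the nearest hexagon (cube rounding)."""
    sf = -qf - rf
    q, r, s = round(qf), round(rf), round(sf)
    dq, dr, ds = abs(q - qf), abs(r - rf), abs(s - sf)
    if dq > dr and dq > ds:
        q = -r - s          # q had the largest rounding error: recompute it
    elif dr > ds:
        r = -q - s          # r had the largest rounding error: recompute it
    return q, r

def point_to_hex(x, y, size):
    """Axial coordinates of the pointy-top hexagon containing (x, y)."""
    qf = (math.sqrt(3) / 3 * x - y / 3) / size
    rf = (2 / 3 * y) / size
    return hex_round(qf, rf)

def hex_center(q, r, size):
    """Center point of hexagon (q, r)."""
    return (size * math.sqrt(3) * (q + r / 2), size * 1.5 * r)

def bin_points(points, size):
    """Map each hexagon to the indices of the projection points that fall in it."""
    cells = defaultdict(list)
    for i, (x, y) in enumerate(points):
        cells[point_to_hex(x, y, size)].append(i)
    return dict(cells)

points = [(0.0, 0.0), (0.1, 0.1), (math.sqrt(3), 0.0)]
cells = bin_points(points, size=1.0)
crowded = [cell for cell, idx in cells.items() if len(idx) > 1]
```

Hexagons listed in `crowded` hold several projection points and are the ones whose points would need to be redistributed so that each subspace holds a different point.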

5. The high-dimensional time-varying data visualization method according to claim 4, wherein, when the subspace is a regular hexagon and there are more than two target projection points in the target subspace, the target projection points are filled one by one, following their traversal order in the target subspace, into different regular hexagonal subspaces by the following method:

If the current target projection point is the first projection point, fill the current target projection point in the target subspace; otherwise, perform the following steps:

Step a. If the current target projection point falls at the center point of the current subspace, then execute step b, otherwise execute step c;

Step b. Search outward layer by layer from the current subspace until an unfilled outer regular hexagon is found; fill the current target projection point into that outer regular hexagon and update the current subspace to the target subspace; then determine whether all target projection points in the target subspace have been filled, and if not, take the next target projection point as the current target projection point and return to step a;

Step c. Among the regular hexagons adjacent to the current subspace, find the two with the smallest included angles, where the included angle of an adjacent regular hexagon is the angle between the line connecting its center point to the center point of the current subspace and the line connecting the current target projection point to the center point of the current subspace;

Step d. From the two regular hexagons with the smallest included angles, select as the preferred regular hexagon the one whose center point is closer to the current target projection point;

Step e. If the preferred regular hexagon is not filled, fill the current target projection point into the preferred regular hexagon and update the current subspace to the target subspace; then determine whether all target projection points in the target subspace have been filled, and if not, take the next target projection point as the current target projection point and return to step a; if the preferred regular hexagon has already been filled, perform step f;

Step f. From the two adjacent regular hexagons that each share a side with both the preferred regular hexagon and the current subspace, select as the suboptimal regular hexagon the one whose center point is closer to the current target projection point; if the suboptimal regular hexagon is not filled, fill the current target projection point into the suboptimal regular hexagon and update the current subspace to the target subspace; then determine whether all target projection points in the target subspace have been filled, and if not, take the next target projection point as the current target projection point and return to step a; if the suboptimal regular hexagon has already been filled, go to step g;

Step g. Update the current subspace to the preferred regular hexagon, then transfer the current target projection point from the target subspace to the current subspace so that its position within the current subspace is the same as its position within the target subspace before the transfer, and then return to step c.
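
Steps a to g can be sketched as follows. This is one hedged reading of the claim, not the patented implementation. It assumes pointy-top hexagons in axial coordinates, a hypothetical `size` parameter (center-to-corner distance), a `filled` set of occupied hexagons, and a target hexagon that is still empty when its first point is placed. The claim does not say which unfilled hexagon of a layer is taken in step b; the sketch takes the first one it encounters. All names are illustrative.

```python
import math

# Axial offsets of the six neighbours in ring order, so that consecutive
# entries are hexagons that share a side with each other.
DIRS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

def center(cell, size=1.0):
    """Center point of a pointy-top hexagon given in axial coordinates."""
    q, r = cell
    return (size * math.sqrt(3) * (q + r / 2), size * 1.5 * r)

def ring_search(cell, filled):
    """Step b: scan layers of growing radius; return the first unfilled hexagon."""
    k = 1
    while True:
        for dq in range(-k, k + 1):
            for dr in range(-k, k + 1):
                cand = (cell[0] + dq, cell[1] + dr)
                if max(abs(dq), abs(dr), abs(dq + dr)) == k and cand not in filled:
                    return cand
        k += 1

def place(point, target, filled, size=1.0, eps=1e-9):
    """Steps a-g for one projection point that cannot stay in `target`."""
    cur, pos = target, point
    cx, cy = center(cur, size)
    if math.hypot(pos[0] - cx, pos[1] - cy) < eps:       # step a
        return ring_search(cur, filled)                   # step b
    while True:
        cx, cy = center(cur, size)
        dx, dy = pos[0] - cx, pos[1] - cy
        nbrs = [(cur[0] + dq, cur[1] + dr) for dq, dr in DIRS]

        def included_angle(i):
            nx, ny = center(nbrs[i], size)
            ux, uy = nx - cx, ny - cy
            return math.atan2(abs(ux * dy - uy * dx), ux * dx + uy * dy)

        def distance(i):
            nx, ny = center(nbrs[i], size)
            return math.hypot(nx - pos[0], ny - pos[1])

        two = sorted(range(6), key=included_angle)[:2]    # step c
        best = min(two, key=distance)                     # step d
        if nbrs[best] not in filled:                      # step e
            return nbrs[best]
        side = min(((best - 1) % 6, (best + 1) % 6), key=distance)  # step f
        if nbrs[side] not in filled:
            return nbrs[side]
        px, py = center(nbrs[best], size)                 # step g: shift the point,
        pos = (pos[0] + px - cx, pos[1] + py - cy)        # keeping its relative position
        cur = nbrs[best]

def fill_target(points, target, filled, size=1.0):
    """Give every point of one crowded hexagon a hexagon of its own."""
    placement = []
    for i, p in enumerate(points):
        cell = target if i == 0 else place(p, target, filled, size)
        filled.add(cell)
        placement.append(cell)
    return placement

filled = set()
cells = fill_target([(0.1, 0.0), (0.2, 0.0), (0.3, 0.05), (0.0, 0.0)], (0, 0), filled)
```

In this example the first point stays in the target hexagon, the second goes to the neighbour it points toward, the third falls back to the suboptimal neighbour because the preferred one is taken, and the fourth, which sits at the center, is placed by the layer-by-layer search.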