CN114519086A - Incremental interactive clustering visualization method and system for credible cloud data sharing - Google Patents

Info

Publication number
CN114519086A
CN114519086A CN202210145820.4A CN202210145820A CN114519086A CN 114519086 A CN114519086 A CN 114519086A CN 202210145820 A CN202210145820 A CN 202210145820A CN 114519086 A CN114519086 A CN 114519086A
Authority
CN
China
Prior art keywords
data
clustering
incremental
cluster
visualization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210145820.4A
Other languages
Chinese (zh)
Inventor
金福生
韩华旭
黄罡
陈朔鹰
张舒汇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University Shenzhen Graduate School
Beijing Institute of Technology BIT
Original Assignee
Peking University Shenzhen Graduate School
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Shenzhen Graduate School, Beijing Institute of Technology BIT
Priority to CN202210145820.4A
Publication of CN114519086A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G06F16/287 Visualization; Browsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 Visual data mining; Browsing structured data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an incremental interactive clustering visualization method and system for trusted cloud data sharing. The method comprises the following steps: writing a data-sharing smart contract and running it; extracting target data from the data set according to the written data-sharing smart contract; clustering the extracted target data with a clustering algorithm and outputting a clustering result; reducing the dimensionality of the clustering result by multidimensional scaling (MDS) so that it is projected into a two-dimensional space, and displaying it visually; and performing the corresponding projection interaction, cluster analysis and visualization operations according to the user's changes to the target data. Building on trusted blockchain-based data sharing, the method analyzes and mines data incrementally and interactively, which protects data security, improves data processing efficiency, and lets the user analyze and mine the data more intuitively.

Description

Incremental interactive clustering visualization method and system for credible cloud data sharing
Technical Field
The invention relates to the technical field of system engineering, in particular to an incremental interactive clustering visualization method and system for trusted sharing of cloud data.
Background
Data visualization is an important way of applying data: after transformation, data is presented as charts or graphs so that people can better understand and analyze the information hidden in it. Clustering is a common data-mining method, often used for mining multivariate relational data, sequence analysis, anomaly detection and the like. Many data analysis and mining platforms currently provide clustering algorithms together with a visual display of the results. In most cases, the general process is to collect all the data to be analyzed from each system, process all of it at once in batch mode, and finally show the clustering result. This workflow has two main problems. First, all data must be gathered before the subsequent steps can run, and gathering takes a long time. In a distributed data-sharing system in particular, every round of processing has to wait until all the data has been aggregated. Collecting and processing all the raw data in one place also carries a risk of data leakage, so data security is not well protected. Second, cluster analysis in this workflow is a static process: each time the parameters are set, the algorithm runs and shows a result to the user, so the user cannot adjust the clustering parameters intuitively.
Therefore, building on existing data analysis and mining technology, how to provide an incremental interactive clustering visualization method and system for trusted cloud data sharing that greatly improves data processing efficiency, guarantees the security and trustworthiness of the data, and makes the data processing operable through a visual interface has become a problem to be solved by those skilled in the art.
Disclosure of Invention
In view of the above problems, the present invention provides an incremental interactive clustering visualization method and system for trusted cloud data sharing, which at least solve some of the above technical problems, so as to effectively ensure data security, improve data processing efficiency, and facilitate users to perform analysis and mining of data more intuitively.
An embodiment of the invention provides an incremental interactive clustering visualization method for trusted cloud data sharing, comprising the following steps:
S1, writing a data-sharing smart contract and running the smart contract;
S2, extracting target data from the data set according to the written data-sharing smart contract;
S3, clustering the extracted target data with a clustering algorithm, and outputting a clustering result;
S4, reducing the dimensionality of the clustering result by multidimensional scaling (MDS), projecting the clustering result into a two-dimensional space, and displaying it visually;
S5, performing the corresponding projection interaction, cluster analysis and visualization operations according to the user's changes to the target data.
Further, in step S4, reducing the dimensionality of the clustering result by multidimensional scaling comprises:
S41, calculating a distance matrix according to the clustering result and the set low-dimensional space dimension;
S42, calculating an inner product matrix from the distance matrix;
S43, performing eigenvalue decomposition on the inner product matrix, computing the n largest eigenvalues and their eigenvectors, and generating the diagonal matrix formed by the n largest eigenvalues and the corresponding eigenvector matrix;
S44, calculating the dimension-reduced matrix from the diagonal matrix and the eigenvector matrix, and outputting the low-dimensional representation of the clustering result.
Further, step S5 includes:
S51, if the user changes an attribute of the target data, constructing a forward projection and displaying the result of the attribute change on the visual interface, thereby realizing forward projection interaction;
S52, if the user drags and drops a visualized data point of the target data, constructing a backward projection and calculating the adjusted attributes of the data point, thereby realizing backward projection interaction;
S53, if the user adds incremental data to the target data, performing cluster analysis and visualization operations on the incremental data.
Further, step S51 includes:
S511, acquiring the data point whose attributes were changed and the centroid of each cluster in the current clustering model;
S512, calculating the Euclidean distance from the data point to the centroid of each cluster;
S513, obtaining the shortest of the Euclidean distances, and assigning the data point to the cluster corresponding to that shortest distance;
S514, projecting the data point onto the scatter plot and color-coding it with the color of its cluster, thereby realizing forward projection interaction.
Further, in S511, the centroid of each cluster in the current clustering model is obtained by the mean method.
Further, step S52 includes:
S521, acquiring the low-dimensional data of the projection point after the visualized data point has been dragged and dropped;
S522, calculating the change vector from the original projection point to the new projection point according to the low-dimensional data;
S523, calculating Δx using the PCA dimension-reduction method:
Δx [e_0 e_1] = Δy
where Δx is the feature change vector of the original data point; [e_0 e_1] is the eigenvector matrix of the original data points; and Δy is the position change vector;
S524, solving for the optimal Δx by regularized least squares to obtain the adjusted attributes of the data point, thereby realizing backward projection interaction.
Further, step S53 includes:
S531, acquiring the data points of the newly added incremental data, reading them into the processing system, and setting a K-L divergence threshold;
S532, calculating the K-L divergence between the data set with the added data points and the original data set;
S533, comparing the K-L divergence with the K-L divergence threshold, and using either out-of-sample extension or re-clustering to carry out cluster analysis and visualization of the incremental data.
Further, in S533, carrying out cluster analysis and visualization of the incremental data by out-of-sample extension includes: when the K-L divergence is smaller than the K-L divergence threshold, assigning the newly added data points to clusters according to their distance to the cluster centers, and updating the cluster centers; and projecting the newly added data points into the scatter plot by out-of-sample extension.
Further, in S533, carrying out cluster analysis and visualization of the incremental data by re-clustering includes: when the K-L divergence is larger than the K-L divergence threshold, re-clustering the data set, finding the position that best overlaps the dimension-reduction result of the target data, and projecting; and color-coding the dimension-reduced projection points according to the clustering result.
An embodiment of the invention also provides an incremental interactive clustering visualization system for trusted cloud data sharing, comprising:
a writing and running module, used for writing the data-sharing smart contract and running the smart contract;
an extraction module, used for extracting target data from the data set according to the written data-sharing smart contract;
a clustering module, used for clustering the extracted target data with a clustering algorithm and outputting a clustering result;
a visualization module, used for reducing the dimensionality of the clustering result by multidimensional scaling, projecting the clustering result into a two-dimensional space, and displaying it visually;
and for performing the corresponding projection interaction, cluster analysis and visualization operations according to the user's changes to the target data.
The technical solution provided by the embodiments of the invention has at least the following beneficial effects:
An embodiment of the invention provides an incremental interactive clustering visualization method for trusted cloud data sharing, comprising: writing a data-sharing smart contract and running it; extracting target data from the data set according to the written data-sharing smart contract; clustering the extracted target data with a clustering algorithm and outputting a clustering result; reducing the dimensionality of the clustering result by multidimensional scaling so that it is projected into a two-dimensional space, and displaying it visually; and performing the corresponding projection interaction, cluster analysis and visualization operations according to the user's changes to the target data. Building on trusted blockchain-based data sharing, the method analyzes and mines data incrementally and interactively, which protects data security, improves data processing efficiency, and lets the user analyze and mine the data more intuitively.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
Fig. 1 is a flowchart of an incremental interactive clustering visualization method for trusted cloud data sharing according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
An embodiment of the invention provides an incremental interactive clustering visualization method for trusted cloud data sharing which, as shown in Fig. 1, comprises the following steps:
S1, writing a data-sharing smart contract and running the smart contract;
S2, extracting target data from the data set according to the written data-sharing smart contract;
S3, clustering the extracted target data with a clustering algorithm, and outputting a clustering result;
S4, reducing the dimensionality of the clustering result by multidimensional scaling (MDS), projecting the clustering result into a two-dimensional space, and displaying it visually;
S5, performing the corresponding projection interaction, cluster analysis and visualization operations according to the user's changes to the target data.
In this embodiment, incremental interactive analysis and mining are performed on the data on the basis of trusted blockchain-based data sharing, so that the user can analyze and mine the data more intuitively while data security is protected and data processing efficiency is improved.
The following specifically describes each of the above steps:
Specifically, in steps S1 and S2 the data-sharing smart contract is written and run, and the target data of the data set is extracted according to the written contract. Using blockchain technology makes trusted, distributed data sharing possible and guarantees the security and trustworthiness of the shared data.
Further, in step S3 the extracted target data is clustered with a clustering algorithm and a clustering result is output. Using incremental cluster analysis and mining greatly improves data processing efficiency and shortens the response time of data analysis and mining. Specifically:
S31, selecting a clustering algorithm;
S32, setting the clustering parameters;
S33, clustering the extracted target data and outputting a clustering result X.
For example:
1. selecting a k-means algorithm;
2. setting input of k-means, including a data set and a k value;
3. clustering is performed by using an algorithm, and the data set is divided into k classes.
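The k-means example above can be sketched in a few lines of NumPy. This is a minimal illustration only: the function name `kmeans`, the random initialisation, the iteration cap and the toy data are assumptions made here and are not specified by the patent.

```python
import numpy as np

def kmeans(data, k, n_iter=100, seed=0):
    """Minimal Lloyd-style k-means; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids with k distinct samples from the data set.
    centroids = data[rng.choice(len(data), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign every sample to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its members; keep the old
        # centroid if a cluster happens to be empty.
        new_centroids = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Toy input: the data set and the k value (two well-separated blobs, k = 2).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 3)), rng.normal(5.0, 0.1, (20, 3))])
labels, centroids = kmeans(X, k=2)
```

Here `labels` plays the role of the clustering result that the later steps color-code, and `centroids` are the cluster centers used by the interaction steps.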
Further, in step S4 the clustering result is reduced in dimensionality by multidimensional scaling, so that the clustering result of the target data (which is high-dimensional) is projected into a two-dimensional space (as low-dimensional data) and displayed visually. Displaying the result on the interface lets the clustering parameters be adjusted more intuitively. Specifically:
S41, calculating a distance matrix D according to the clustering result X and the set low-dimensional space dimension n, where the element d_ij in row i and column j of D is the distance between samples i and j;
S42, calculating an inner product matrix B from the distance matrix D, on the principle that distances before and after dimension reduction should be as close as possible;
S43, performing eigenvalue decomposition on the inner product matrix B, computing the n largest eigenvalues and their corresponding eigenvectors, and generating the diagonal matrix U_n formed by the n largest eigenvalues and the eigenvector matrix V_n;
S44, calculating the dimension-reduced matrix Z from the diagonal matrix U_n and the eigenvector matrix V_n, and outputting the low-dimensional representation of the clustering result, that is, the low-dimensional representation of the initial target data.
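Steps S41 to S44 correspond to classical multidimensional scaling. The sketch below assumes Euclidean distances and the usual double-centering formula B = -1/2 · J · D² · J for the inner product matrix; the patent only states the principle, so these choices, the function name `classical_mds` and the toy data are assumptions.

```python
import numpy as np

def classical_mds(X, n=2):
    """Classical MDS: embed the rows of X into n dimensions."""
    # S41: pairwise Euclidean distance matrix D.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2))
    # S42: inner product matrix B by double centering the squared distances.
    m = D.shape[0]
    J = np.eye(m) - np.ones((m, m)) / m
    B = -0.5 * J @ (D ** 2) @ J
    # S43: eigen-decomposition of the symmetric matrix B; keep the n largest
    # eigenvalues (clipped at zero against round-off) and their eigenvectors.
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n]
    U_n = np.diag(np.clip(eigvals[order], 0.0, None))
    V_n = eigvecs[:, order]
    # S44: dimension-reduced matrix Z = V_n * U_n^(1/2).
    return V_n @ np.sqrt(U_n)

# Toy data: 10 points that lie in a 2-D plane embedded in 5-D space.
rng = np.random.default_rng(0)
X = np.hstack([rng.normal(size=(10, 2)), np.zeros((10, 3))])
Z = classical_mds(X, n=2)   # each row is the (abscissa, ordinate) of one point
```

Because the toy data is intrinsically two-dimensional, the pairwise distances between rows of `Z` match those between rows of `X`; for general high-dimensional data they are only approximated.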
Further, in step S5 the corresponding projection interaction, cluster analysis and visualization operations are performed according to the user's changes to the target data. After the high-dimensional data has been reduced to two dimensions, the two dimensions serve as the abscissa and ordinate of each point, and the points are drawn directly on the plot. When the user adds new data or otherwise changes the target data set, the corresponding operation is carried out on the visual interface. Specifically:
S51, if the user changes a data attribute, constructing a forward projection and displaying the result of the attribute change on the visual interface, thereby realizing forward projection interaction;
S52, if the user drags and drops a visualized data point, constructing a backward projection and calculating the adjusted attributes of the data point, thereby realizing backward projection interaction;
S53, if the user adds incremental data, performing cluster analysis and visualization operations on the incremental data.
Specifically, step S51 includes:
S511, the user interacts and changes the attribute values of a data point; the data point x with the changed attributes is obtained, and the centroids m_j, 0 < j ≤ k, of the k clusters in the current clustering model are obtained by the mean method;
S512, calculating the Euclidean distance l_j from the data point x to the centroid of each of the k clusters;
S513, obtaining the shortest of the Euclidean distances, min l_j, and assigning the data point x to the cluster corresponding to that shortest distance;
S514, projecting the data point x onto the scatter plot and color-coding it with the color of its cluster, thereby realizing forward projection interaction.
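A minimal sketch of the forward projection in S511 to S514, assuming the clustering result is available as an array of samples plus an integer label per sample. The function name `forward_project`, the toy data and the color palette are hypothetical.

```python
import numpy as np

def forward_project(x, data, labels, k):
    """Return (cluster index, distances) for a data point x whose attributes changed."""
    # S511: centroid m_j of each cluster by the mean method
    # (assumes no cluster is empty).
    centroids = np.array([data[labels == j].mean(axis=0) for j in range(k)])
    # S512: Euclidean distance l_j from x to every centroid.
    dists = np.linalg.norm(centroids - x, axis=1)
    # S513: the cluster with the shortest distance.
    return int(np.argmin(dists)), dists

data = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.2, 5.0]])
labels = np.array([0, 0, 1, 1])
palette = ["tab:blue", "tab:orange"]   # hypothetical colors, one per cluster

x = np.array([4.8, 5.1])               # point after the user edited its attributes
cluster, dists = forward_project(x, data, labels, k=2)
colour = palette[cluster]              # S514: reuse the cluster's color in the scatter plot
```

The edited point is only re-assigned and re-colored; the clustering model itself is not re-run, which is what keeps this interaction cheap.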
Step S52 includes:
S521, the user drags a node to a new position in the projection space on the front-end interface, and the low-dimensional data y of the projection point is acquired;
S522, calculating the change vector Δy from the original projection point to the new projection point according to the low-dimensional data y;
S523, calculating Δx using the PCA dimension-reduction method:
Δx [e_0 e_1] = Δy
where Δx is the feature change vector of the original data point; [e_0 e_1] is the eigenvector matrix of the original data points; and Δy is the position change vector;
S524, solving for the optimal Δx by regularized least squares to obtain the adjusted attributes of the data point, thereby realizing backward projection interaction.
In step S522, the original projection point is the point before the drag-and-drop and the new projection point is the point after it; that is, a point is moved to another place by dragging and dropping. The original data point in step S523 is the point before the drag-and-drop.
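The relation in S523 has more unknowns (one per attribute) than equations (two), which is why S524 needs regularization. A minimal sketch, assuming ridge (Tikhonov) regularization with a hypothetical weight `lam` and taking e_0 and e_1 to be the top two principal axes of the data; the function names are illustrative, not from the patent.

```python
import numpy as np

def pca_axes(data, n=2):
    """Return the top-n principal axes [e_0 e_1 ...] as columns of a (d, n) matrix."""
    centered = data - data.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    return eigvecs[:, np.argsort(eigvals)[::-1][:n]]

def back_project(delta_y, E, lam=1e-3):
    """Solve min ||E^T dx - dy||^2 + lam * ||dx||^2 for the attribute change dx."""
    d = E.shape[0]
    # Normal equations of the regularized problem: (E E^T + lam I) dx = E dy.
    return np.linalg.solve(E @ E.T + lam * np.eye(d), E @ delta_y)

rng = np.random.default_rng(0)
data = rng.normal(size=(50, 4))          # toy high-dimensional data
E = pca_axes(data)                       # [e_0 e_1], shape (4, 2)
delta_y = np.array([1.0, -0.5])          # S522: drag vector in the 2-D plot
delta_x = back_project(delta_y, E)       # S523/S524: attribute change vector
x_adjusted = data[0] + delta_x           # adjusted attributes of the dragged point
```

With a small `lam`, projecting `delta_x` back through `E` reproduces the drag vector almost exactly, while the penalty selects the smallest attribute change that does so.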
Furthermore, to ensure an accurate and fast response to incremental data changes during visualization, two different modes are used. When the newly added incremental data has little influence on the current clustering result, the new data is projected by out-of-sample extension. When the influence is large, the visualization is rebuilt by re-clustering, and a Procrustes transformation is used to find the best overlap with the projection positions of the data before re-clustering. Specifically, step S53 includes:
S531, acquiring a data point x of the newly added incremental data, reading it into the processing system (from a database or the like), and setting a K-L divergence threshold K;
S532, calculating the K-L divergence between the data set with the added data points and the original data set (the data set of step S2);
S533, comparing the K-L divergence with the threshold K, and using either out-of-sample extension or re-clustering to carry out cluster analysis and visualization of the incremental data. Specifically:
when the K-L divergence is smaller than the threshold K, assigning the newly added data points to clusters according to their distance to the cluster centers, updating the cluster centers, and projecting the newly added data point x into the scatter plot by out-of-sample extension;
when the K-L divergence is larger than the threshold K, re-clustering the data set, applying a Procrustes geometric transformation to find the position that best overlaps the previous dimension-reduction result of the target data, and projecting; the dimension-reduced projection points are then color-coded according to the clustering result.
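The decision in S531 to S533 and the Procrustes alignment can be sketched as follows. The patent does not say how the distributions behind the K-L divergence are estimated, so the per-feature histogram estimate with a small smoothing constant is an assumption, as are the function names, the threshold value and the restriction of the Procrustes step to rotation, reflection and translation (no scaling).

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """K-L divergence of two histograms, smoothed by eps to avoid log(0)."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def dataset_kl(new_data, old_data, bins=10):
    """S532 (assumed estimator): sum of per-feature histogram K-L divergences."""
    total = 0.0
    for f in range(old_data.shape[1]):
        lo = min(new_data[:, f].min(), old_data[:, f].min())
        hi = max(new_data[:, f].max(), old_data[:, f].max())
        p, _ = np.histogram(new_data[:, f], bins=bins, range=(lo, hi))
        q, _ = np.histogram(old_data[:, f], bins=bins, range=(lo, hi))
        total += kl_divergence(p, q)
    return total

def choose_update_mode(old_data, increment, threshold):
    """S533: pick out-of-sample extension or re-clustering."""
    kl = dataset_kl(np.vstack([old_data, increment]), old_data)
    return ("out_of_sample" if kl < threshold else "recluster"), kl

def procrustes_align(new_layout, old_layout):
    """Rotate/reflect and shift new_layout so its first rows best overlap old_layout."""
    n_old = len(old_layout)
    mu_new = new_layout[:n_old].mean(axis=0)
    mu_old = old_layout.mean(axis=0)
    # Orthogonal Procrustes: R = U V^T from the SVD of A^T B.
    U, _, Vt = np.linalg.svd((new_layout[:n_old] - mu_new).T @ (old_layout - mu_old))
    return (new_layout - mu_new) @ (U @ Vt) + mu_old

rng = np.random.default_rng(0)
old = rng.normal(size=(200, 2))
mode_small, kl_small = choose_update_mode(old, old[:5] * 0.5, threshold=0.5)
mode_large, kl_large = choose_update_mode(old, old[:50] + 10.0, threshold=0.5)

# A re-computed layout that is a rotated, shifted copy of the old one.
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
aligned = procrustes_align(old @ rot + np.array([3.0, -1.0]), old)
```

Aligning the new layout to the old one keeps points that did not change near their previous screen positions, so the user does not lose orientation after a re-clustering.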
The above steps are illustrated below by a specific application example:
The task is to cluster data held in different systems and operate on it interactively.
The process is as follows:
1. writing the contract;
2. extracting data;
3. clustering;
4. reducing the dimensionality to two dimensions;
5. displaying the result on a visual interface;
6. establishing forward interaction from the data to the interface, namely: manually changing an attribute of the data and then displaying the change on the interface;
7. establishing backward interaction from the interface to the data, namely: dragging and dropping a point on the interface and then obtaining the resulting attribute values;
8. establishing a visualization scheme for newly added data, namely: how to merge a newly added data item into the existing layout.
The incremental interactive clustering visualization method for trusted cloud data sharing optimizes the clustering-based data analysis and mining process in three respects. First, computation over the full data set is replaced by incremental computation: cluster analysis proceeds in the order in which data arrives, without changing the result of the data operations. Second, centralized sharing of raw data is avoided during data sharing: a distributed strategy is adopted, the data is transformed by encryption, dimensionality reduction and similar means, and the transformed data is shared without affecting data processing. Third, processing results are not merely displayed: analysis and processing are carried out directly on the interface and the results are shown there, so that the value of the data is mined more intuitively. As a result, data processing efficiency is improved, the data leakage caused by centralized data sharing and processing is avoided, and interactive visual cluster analysis and mining is realized.
In another aspect, an embodiment of the invention also provides an incremental interactive clustering visualization system for trusted cloud data sharing, which is suited to the above incremental interactive clustering visualization method for trusted cloud data sharing and comprises:
a writing and running module, used for writing the data-sharing smart contract and running the smart contract;
an extraction module, used for extracting target data from the data set according to the written data-sharing smart contract;
a clustering module, used for clustering the extracted target data with a clustering algorithm and outputting a clustering result;
a visualization module, used for reducing the dimensionality of the clustering result by multidimensional scaling, projecting the clustering result into a two-dimensional space, and displaying it visually;
and for performing the corresponding projection interaction, cluster analysis and visualization operations according to the user's changes to the target data.
Because the system is suited to the above method, its implementation can follow the implementation of the method, and the details are not repeated here.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, as the system is suitable for the method disclosed by the embodiment, the description is simple, and the relevant points can be referred to the description of the method part.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. An incremental interactive clustering visualization method for trusted cloud data sharing, characterized by comprising the following steps:
S1, writing a data-sharing smart contract and running the smart contract;
S2, extracting target data from the data set according to the written data-sharing smart contract;
S3, clustering the extracted target data with a clustering algorithm, and outputting a clustering result;
S4, reducing the dimensionality of the clustering result by multidimensional scaling, projecting the clustering result into a two-dimensional space, and displaying it visually;
S5, performing the corresponding projection interaction, cluster analysis and visualization operations according to the user's changes to the target data.
2. The incremental interactive clustering visualization method for trusted cloud data sharing according to claim 1, wherein in step S4, reducing the dimensionality of the clustering result by multidimensional scaling comprises:
S41, computing a distance matrix from the clustering result, given the preset dimension of the low-dimensional space;
S42, computing an inner product matrix from the distance matrix;
S43, performing eigenvalue decomposition on the inner product matrix, taking the n largest eigenvalues and their eigenvectors, and forming the diagonal matrix of those n eigenvalues and the corresponding eigenvector matrix;
and S44, computing the dimension-reduced matrix from the diagonal matrix of the n largest eigenvalues and the eigenvector matrix, and outputting the low-dimensional representation of the clustering result.
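Steps S41 to S44 match the usual classical MDS recipe, which can be sketched as follows. This is a minimal illustration, not the applicants' implementation: it assumes the clustering result is available as a NumPy array of feature vectors, uses Euclidean distances, and obtains the inner product matrix by double centering. The function name `classical_mds` and its parameters are hypothetical.

```python
import numpy as np

def classical_mds(points, n_components=2):
    """Classical MDS sketch following steps S41-S44 (assumed Euclidean distances)."""
    X = np.asarray(points, dtype=float)
    m = X.shape[0]

    # S41: pairwise squared Euclidean distance matrix.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)

    # S42: inner product matrix by double centering, B = -1/2 * J * D^2 * J.
    J = np.eye(m) - np.ones((m, m)) / m
    B = -0.5 * J @ d2 @ J

    # S43: eigen-decomposition of the symmetric matrix B; keep the n largest.
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = np.clip(eigvals[order], 0.0, None)  # guard against tiny negative values
    V = eigvecs[:, order]

    # S44: low-dimensional coordinates Z = V * sqrt(Lambda).
    return V * np.sqrt(lam)
```

With `n_components=2` the returned array has one row of two-dimensional coordinates per data point, which can be drawn directly as a scatter plot.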
3. The incremental interactive clustering visualization method for trusted cloud data sharing according to claim 1, wherein step S5 comprises:
S51, if the user changes an attribute of the target data, constructing a forward projection and displaying the result of the attribute change on the visual interface, thereby realizing forward projection interaction;
S52, if the user drags and drops a visualized data point of the target data, constructing a backward projection and computing the attributes of the adjusted data point, thereby realizing backward projection interaction;
and S53, if the user adds incremental data to the target data, performing cluster analysis and visualization operations on the incremental data.
4. The incremental interactive clustering visualization method for trusted cloud data sharing according to claim 3, wherein S51 comprises:
S511, obtaining the data point whose attribute has changed and the centroid of each cluster in the current clustering model;
S512, computing the Euclidean distance from the data point to the centroid of each cluster;
S513, taking the shortest of these Euclidean distances and assigning the data point to the corresponding cluster;
and S514, projecting the data point into the scatter plot and color-coding it with the color of the corresponding cluster, thereby realizing forward projection interaction.
5. The incremental interactive clustering visualization method for trusted cloud data sharing according to claim 4, wherein in S511, the centroid of each cluster in the current clustering model is obtained by the mean method.
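The nearest-centroid assignment of S511 to S513, with the mean-based centroid of claim 5, can be illustrated with a short sketch. The function names, the array layout, and the `PALETTE` mapping used for the color coding of S514 are all hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical color table for S514; any per-cluster palette would do.
PALETTE = {0: "tab:blue", 1: "tab:orange", 2: "tab:green"}

def cluster_centroids(X, labels):
    """S511 / claim 5: centroid of each cluster as the mean of its members."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)
    centroids = np.stack([X[labels == k].mean(axis=0) for k in ids])
    return ids, centroids

def assign_to_nearest(point, ids, centroids):
    """S512-S513: Euclidean distance to every centroid, then pick the shortest."""
    dists = np.linalg.norm(centroids - np.asarray(point, dtype=float), axis=1)
    return ids[int(np.argmin(dists))], dists

def color_for(label, palette=PALETTE):
    """S514: reuse the color already assigned to the chosen cluster."""
    return palette[int(label)]
```

After an attribute edit, the changed point would be passed to `assign_to_nearest`, and the returned label decides which cluster color it takes in the scatter plot.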
6. The incremental interactive clustering visualization method for trusted cloud data sharing according to claim 3, wherein S52 comprises:
S521, obtaining the low-dimensional data of the projection point after the visualized data point has been dragged and dropped;
S522, computing the change vector from the original projection point to the new projection point from the low-dimensional data;
S523, computing $\Delta x$ by the PCA dimension reduction method:
$\Delta x\,[e_0\ e_1] = \Delta y$
wherein $\Delta x$ is the feature change vector of the original data point; $[e_0\ e_1]$ is the eigenvector matrix of the original data points; and $\Delta y$ is the position change vector;
and S524, solving for the optimal $\Delta x$ by regularized least squares to obtain the attributes of the adjusted data point, thereby realizing backward projection interaction.
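Because the relation in S523 has more unknowns than equations, some regularizer is needed to pick one solution in S524. The sketch below assumes a ridge (L2) penalty, which the claim does not specify, and minimizes ||Δx E − Δy||² + λ||Δx||² with E = [e0 e1]. The function names and the default value of `lam` are hypothetical.

```python
import numpy as np

def pca_axes(X, n_components=2):
    """Leading principal axes of X as columns; plays the role of [e0 e1]."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:n_components].T

def back_project(delta_y, E, lam=1e-3):
    """Ridge solution of delta_x @ E = delta_y (assumed L2 regularizer).

    Closed form: delta_x = delta_y (E^T E + lam I)^-1 E^T.
    """
    delta_y = np.asarray(delta_y, dtype=float)
    E = np.asarray(E, dtype=float)
    gram = E.T @ E + lam * np.eye(E.shape[1])
    # gram is symmetric, so solving gram z = delta_y gives z = delta_y gram^-1.
    return np.linalg.solve(gram, delta_y) @ E.T
```

Adding the returned `delta_x` to the original attribute vector gives the adjusted data point. A larger `lam` shrinks the attribute change, so the projected point moves only part of the way toward the drop position.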
7. The incremental interactive clustering visualization method for trusted cloud data sharing according to claim 3, wherein S53 comprises:
S531, obtaining the data points of the newly added incremental data, reading them into the processing system, and setting a K-L divergence threshold;
S532, computing the K-L divergence between the data set after the data points have been added and the original data set;
and S533, comparing the K-L divergence with the K-L divergence threshold and, depending on the result, using either out-of-sample extension or re-clustering to perform cluster analysis and visualization operations on the incremental data.
8. The incremental interactive clustering visualization method for trusted cloud data sharing according to claim 7, wherein in S533, performing cluster analysis and visualization operations on the incremental data by out-of-sample extension comprises: when the K-L divergence is smaller than the K-L divergence threshold, assigning each newly added data point to a cluster according to its distance to the cluster centers, and updating the cluster centers; and projecting the newly added data points into the scatter plot by out-of-sample extension.
9. The incremental interactive clustering visualization method for trusted cloud data sharing according to claim 7, wherein in S533, performing cluster analysis and visualization operations on the incremental data by re-clustering comprises: when the K-L divergence is larger than the K-L divergence threshold, re-clustering the data set, finding the optimal overlap position of the dimension-reduction result of the target data, and projecting it; and color-coding the dimension-reduced projection points according to the clustering result.
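The threshold test of S531 to S533 can be sketched as below. The claims do not say how the K-L divergence between two data sets is estimated, so this example makes several assumptions: each feature is binned into a shared histogram, the divergence is taken from the updated data set to the original one, the per-feature values are summed, and a divergence exactly equal to the threshold is treated as a re-clustering case. All function names and the `bins` and `eps` parameters are hypothetical.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Discrete K-L divergence D(p || q); eps avoids division by zero."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def dataset_kl(original, updated, bins=10):
    """S532 (assumed estimator): sum of per-feature histogram divergences."""
    original = np.asarray(original, dtype=float)
    updated = np.asarray(updated, dtype=float)
    total = 0.0
    for j in range(original.shape[1]):
        lo = min(original[:, j].min(), updated[:, j].min())
        hi = max(original[:, j].max(), updated[:, j].max())
        edges = np.linspace(lo, hi, bins + 1)  # shared bins for both data sets
        p, _ = np.histogram(updated[:, j], bins=edges)
        q, _ = np.histogram(original[:, j], bins=edges)
        total += kl_divergence(p, q)
    return total

def choose_strategy(original, increment, threshold, bins=10):
    """S533: small drift -> out-of-sample extension, otherwise re-cluster."""
    updated = np.vstack([original, increment])
    kl = dataset_kl(original, updated, bins)
    return ("out_of_sample" if kl < threshold else "recluster"), kl
```

A caller would dispatch on the returned string: the out-of-sample branch assigns the new points to the nearest existing cluster centers and updates those centers, while the re-clustering branch reruns clustering and dimension reduction on the whole data set.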
10. An incremental interactive clustering visualization system for trusted cloud data sharing, characterized by comprising:
a writing and running module, used for writing a data-sharing smart contract and running the smart contract;
an extraction module, used for extracting target data from the data set according to the written data-sharing smart contract;
a clustering module, used for clustering the extracted target data with a clustering algorithm and outputting a clustering result;
a visualization module, used for reducing the dimensionality of the clustering result by multidimensional scaling, projecting the clustering result into a two-dimensional space, and displaying it visually;
and performing the corresponding projection interaction, cluster analysis, and visualization operations according to the user's change requirements for the target data.
CN202210145820.4A 2022-02-17 2022-02-17 Incremental interactive clustering visualization method and system for credible cloud data sharing Pending CN114519086A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210145820.4A CN114519086A (en) 2022-02-17 2022-02-17 Incremental interactive clustering visualization method and system for credible cloud data sharing

Publications (1)

Publication Number Publication Date
CN114519086A true CN114519086A (en) 2022-05-20

Family

ID=81599406

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210145820.4A Pending CN114519086A (en) 2022-02-17 2022-02-17 Incremental interactive clustering visualization method and system for credible cloud data sharing

Country Status (1)

Country Link
CN (1) CN114519086A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116955117A (en) * 2023-09-18 2023-10-27 深圳市艺高智慧科技有限公司 Computer radiator performance analysis system based on data visualization enhancement
CN116955117B (en) * 2023-09-18 2023-12-22 深圳市艺高智慧科技有限公司 Computer radiator performance analysis system based on data visualization enhancement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination