CN111783832A - Interactive selection method of space-time data prediction model - Google Patents

Interactive selection method of space-time data prediction model

Publication number
CN111783832A
Authority
CN
China
Prior art keywords
node
algorithm
prediction
data
points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010492269.1A
Other languages
Chinese (zh)
Other versions
CN111783832B (en
Inventor
孙国道
查梦
朱素佳
徐超清
王磊
梁荣华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT
Priority to CN202010492269.1A
Publication of CN111783832A
Application granted
Publication of CN111783832B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/24765 Rule-based classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

An interactive selection method for space-time data prediction models: the original data are first screened and cleaned, abnormal values are deleted and missing values are filled in; the data are then clustered by combining the Canopy and K-Means clustering algorithms to obtain K clusters, and the first K1 cluster regions are extracted and their boundaries calculated; after the data of each region are differenced, models are trained with the Random Forest, SVM, ARIMA and LSTM algorithms respectively, and each region is predicted with the resulting models; the prediction results are displayed in each region on the map using visual glyphs; finally, the layout of the region glyphs and of the lines connecting the regions is optimized. The interactive exploration components provided by the system help the user distinguish the differences among the models intuitively. The glyph design and the geographic layout design of the invention enable users to analyze the prediction output in depth.

Description

Interactive selection method of space-time data prediction model
Technical Field
The invention relates to an interactive selection method of a space-time data prediction model.
Background
With the development of society, a great amount of spatio-temporal data is generated in various fields, such as: tracking industrial production, conducting financial transactions and monitoring the environment. Analysis of spatiotemporal data typically includes statistical analysis, model recommendations and best predictions, which are important to many aspects of society. Reasonable prediction is made by analyzing the spatio-temporal data, so that researchers can be helped to master social/technical development trends and make better decisions.
The core function of space-time prediction is to establish a prediction model. Prediction methods for space-time data fall into two categories: traditional statistical methods and machine learning methods. Traditional statistical methods are typically based on the user's observations and involve data sampling, data mapping and curve fitting/parameter estimation; they include moving average (MA) models, autoregressive moving average (ARMA) models and autoregressive integrated moving average (ARIMA) models. Machine learning methods are mainly based on data classification and involve data decomposition, model training and model prediction; they include models such as the Support Vector Machine (SVM), Random Forest and nonparametric regression models. More recently, many advanced models, such as artificial neural networks (ANNs) and long short-term memory networks (LSTMs), have emerged to help people gain deeper insight into spatiotemporal data in different application domains.
Since different types of models suit different application scenarios and each has specific limitations, researchers have conducted a series of studies on model recommendation to help people choose models. However, this work faces some challenges. One is that its focus is data-driven: although a great deal of interesting information can be found through data transformation analysis, a great deal of application context also needs to be considered, and this is difficult because of the various levels of uncertainty involved. Another is that it does not properly explain the prediction performance of the models. The user may want to know why one model works well while others fail in certain situations, as well as the advantages of the different models and their scope of use.
At this stage, visual analysis systems for spatiotemporal prediction models combine multiple machine learning models with interactive visualization, allowing users to examine the models and visually compare their prediction performance. Many visualization researchers perform the analysis by linking model performance metrics (e.g., accuracy and precision) to visualization components (e.g., line graphs, bar graphs and heat maps). Through interactive exploration, the user can learn about the various possibilities of the prediction models and the differences among them. However, existing work on multi-model steering and selection aims at inspecting performance indicators rather than fully considering the input data and model outputs, and an inadequate understanding of the relationship between the spatial data and the model outputs may result in incorrect model selection.
To address the above problems, we designed a visualization framework that combines time-series/geospatial data with the performance of various prediction models, which helps machine learning beginners and laymen select and understand models. The present invention provides a visual analysis system that enables a user to interactively examine the similarity of the prediction outputs of spatiotemporal data models. In contrast to other work, we map geographic information to the outputs of the prediction models and show the correlation among multiple models. Clustering is applied to the similarity analysis, and combined with a correlation matrix view it lets a user see the similarity of different models intuitively. A novel glyph is also designed so that a user can analyze the prediction output in depth. Furthermore, we designed zoom-driven multi-layer layouts that help eliminate geospatial overlap when the user closely inspects glyphs linked to the prediction output of the model. At each level, a force-directed graph algorithm is applied to the glyphs to avoid collisions. In addition, the time axis view helps the user quickly grasp the original time series.
In many fields, data prediction is a common data analysis method. However, it is difficult for beginners and non-professionals to select a suitable prediction model and to understand the prediction performance and usage scenarios of the different models. Existing model recommendation systems recommend the most appropriate prediction model in a data-driven way but do not properly explain its prediction performance, which makes it difficult for the user to find points of interest. The user may want to know why one model works well while others fail in certain situations, as well as the advantages of the different models and their scope of use.
Disclosure of Invention
Existing work on multi-model steering and selection aims at inspecting performance indicators and does not comprehensively consider the input data and model output; an insufficient understanding of the relationship between the spatial data and the model output leads to incorrect model selection. To overcome these shortcomings of the prior art, the invention provides an interactive selection method of a space-time data prediction model.
In order to solve the technical problems, the invention provides the following technical scheme:
an interactive selection method of a spatiotemporal data prediction model, which explains and contrasts the spatiotemporal prediction model in a visual way based on a map area, the method comprises the following steps:
(1) acquiring shared bicycle data, deleting records that fall outside the analysis area or have abnormal riding times, checking whether the bicycle data for each time point of each day in the area is empty, filling in 0 where it is empty, and building a data set;
(2) clustering the obtained data set by combining a Canopy clustering algorithm and a K-means clustering algorithm, and taking the first K1 cluster classes as prediction areas; then modeling and predicting the flow data of the K1 prediction areas, wherein the test result comprises prediction data, real data, MAE value of the prediction data, RMSE value of the prediction data, R-Squared value and 1-R-Squared value of the prediction data;
(3) the prediction results of the four prediction algorithms for each region are displayed with visual glyphs, and the analysis and display steps are as follows:
(3-1) using a multilayer radar chart as the first layer of the region glyph: taking the obtained (1-R-Squared) value, MAE value and RMSE value, in counterclockwise order, as the vertices of the radar chart, wherein each vertex displays the parameter name and its value; the four algorithms are displayed by stacking four radar-chart layers, each drawn with a different texture: backslash hatching represents the Random Forest algorithm, vertical lines represent the LSTM algorithm, horizontal lines represent the ARIMA algorithm, and forward-slash hatching represents the SVM algorithm;
(3-2) using a histogram as the second layer of the region glyph: drawing, in counterclockwise order, the bars for the obtained R-Squared value, MAE value and RMSE value on the second layer, wherein the height of each bar represents the value and each bar displays the parameter name; different textures represent the prediction results of the different prediction algorithms: backslash hatching represents the Random Forest algorithm, vertical lines the LSTM algorithm, horizontal lines the ARIMA algorithm, and forward-slash hatching the SVM algorithm; within the same texture, different gray levels of the bars show the prediction results obtained by different trained models of the same algorithm;
(4) a map-based layout algorithm for the prediction-region glyphs: each glyph is initially placed at the center point of its region, and overlapping glyphs are rearranged using an improved force-directed layout algorithm, wherein the process of the layout algorithm is as follows:
(4-1) inputting K1 initial node positions, wherein the radius of each node is r, calculating the distance between the nodes, and if the distance between the nodes is less than 2r, the two nodes are overlapped;
(4-2) calculating the relative position between every pair of nodes to obtain a relative-position matrix M; for the nodes {a1, a2, ..., ai} (i = K1), the matrix M is calculated as:
[Equation image BDA0002521525680000031 is not reproduced in the text.]
for each node, another node to its upper right is marked 0, to its upper left 1, to its lower left 2, and to its lower right 3; the relative position between each original node and all other nodes is then calculated to obtain a matrix of order K1;
(4-4) calculating the displacement generated by the force action between the two nodes, wherein the two-node displacement calculation formula is as follows:
[Formula image BDA0002521525680000032 is not reproduced in the text.]
wherein x represents the displacement generated by the force action between two points, Δ x represents the difference between the abscissa of the two points, Δ y represents the difference between the ordinate of the two points, and k is the force action coefficient;
(4-5) calculating the displacement generated by the action of force between each node and all other nodes, and accumulating to obtain the unit displacement of each node;
(4-6) sequentially updating the coordinates of the K1 nodes according to the unit displacement of each node;
(4-7) calculating the relative position between every pair of updated nodes to obtain a relative-position matrix M1, and comparing M with M1, i.e. comparing the relative positions of the points before and after the update; for two nodes P1 and P2 whose relative position has changed, the maximum displacement of the two points is calculated from their original relative angle and original coordinates, and the coordinates of P2 are updated;
(4-8) repeating the steps (4-4) to (4-7) for n times until all the nodes are not overlapped;
(5) connecting the glyphs of similar regions with lines; the attributes of each region node, which consist of the indices of the prediction algorithms, are calculated; similar points are then grouped into one class by a clustering algorithm; finally, the glyphs of similar regions are connected by line segments, and Bezier curves are created to smooth them, wherein the algorithm for connecting glyphs with line segments is as follows:
(5-1) inputting a series of coordinate points S = {P1, P2, ..., Pt};
(5-2) selecting the first point P1 as the initial point p0;
(5-3) calculating the Euclidean distances from p0 to the other points, selecting the nearest point pk, saving the line segment [p0, pk] and the nearest distance dis, and deleting p0 from S;
(5-4) letting p0 = pk and repeating step (5-3);
(5-5) iterating (t-1) times, i.e. stopping when only one element remains in S, saving the connecting line as lines0 and its line length as distance0;
(5-6) selecting P2, ..., Pt in turn as the initial point p0 and repeating steps (5-2) to (5-5) to obtain (t-1) connecting lines and their line lengths; keeping only the connecting lines whose line segments do not intersect (a shared endpoint does not count as an intersection), and selecting the one with the shortest line length as the line-segment connection scheme;
(6) a line layout algorithm is provided for detecting the collision between the line segment and the node and replanning the line path, and the implementation steps are as follows:
(6-1) the radius of each node is r, the distance d from each node to each line segment is detected, and if r is larger than d, the node is detected to collide with the line segment; otherwise, continuing to judge the next node;
(6-2) when a node collides with a line segment, judging the relative position of the node and the line segment: if the node is on the left side of the line segment, a stagnation point is selected on the right side of the node; otherwise, a stagnation point is selected on the left side of the node;
(6-3) connecting the stagnation point with the center of the node to obtain two intersection points on the circumference of the non-similar node, finding the intersection point closest to the line segment, and then generating a virtual node within a threshold θ of that point and connecting the virtual node to the similar nodes.
The technical conception of the invention is as follows: an interactive visual analysis system is designed for interactively analyzing, understanding and visually comparing prediction models and their performance. First, the spatial data are analyzed; then prediction models are established from the data set and prediction is completed; finally, a set of visual glyphs is designed to better display the prediction results of the models. A novel layout algorithm is also provided to resolve glyph-to-glyph and glyph-to-line collisions and to show the differences and correlations among the prediction models. The method helps the user better understand how a prediction model and its parameters influence the prediction result, and how the prediction results of different models differ.
The invention has the beneficial effects that: through visual analysis and comparison of the prediction models, the relation between the spatio-temporal data and model output is comprehensively considered by combining time series/geographic space data and various prediction model performances, and an interactive visual analysis system is designed, so that a user is allowed to interactively explore the similarity of the spatio-temporal data model prediction output, and the understanding of the model performances is deepened. Innovative glyph and geography designs allow users to intuitively analyze the predicted output in depth.
Drawings
FIG. 1 is a flow chart of the present invention.
FIG. 2 is a visual glyph diagram of the present invention.
Fig. 3 is a glyph layout diagram of the present invention, in which (a) shows the region glyphs and (b) shows the region glyphs after layout.
Fig. 4 is a connection line layout diagram of the present invention, in which (a) shows connection lines of region glyphs, and (b) shows connection lines of region glyphs after layout.
Detailed description of the preferred embodiments
The present invention is described in further detail below with reference to the accompanying drawings and preferred embodiments, from which its objects and effects will become more apparent. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit it.
Referring to fig. 1 to 4, an interactive selection method of a spatiotemporal data prediction model includes the following steps:
(1) acquiring shared bicycle data, deleting records that fall outside the analysis area or have abnormal riding times, checking whether the bicycle data for each time point of each day in the area is empty, filling in 0 where it is empty, and building a data set;
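The cleaning in step (1) can be sketched as follows. This is a minimal illustration, not the patented implementation: the record layout `(region, day, slot, ride_minutes)`, the function name `build_dataset` and the `max_minutes` threshold for an "abnormal" ride are all assumptions introduced here.

```python
from collections import defaultdict


def build_dataset(records, regions, days, slots_per_day, max_minutes=180):
    """Count rides per (region, day, time slot) and fill empty slots with 0.

    records: iterable of (region, day, slot, ride_minutes) tuples (assumed layout).
    max_minutes: hypothetical cut-off for an abnormal riding time.
    """
    counts = defaultdict(int)
    for region, day, slot, ride_minutes in records:
        if region not in regions:
            continue  # record lies outside the analysis area
        if not (0 < ride_minutes <= max_minutes):
            continue  # abnormal riding time
        counts[(region, day, slot)] += 1
    # Every time point of every day gets a value; missing ones become 0.
    return {
        region: [
            [counts.get((region, day, slot), 0) for slot in range(slots_per_day)]
            for day in days
        ]
        for region in regions
    }
```

Each region then maps to a day-by-slot table with no gaps, which is the shape the later per-region models need.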
(2) clustering the obtained data set by combining a Canopy clustering algorithm and a K-means clustering algorithm, and taking the first K1 cluster classes as prediction areas; then modeling and predicting the flow data of the K1 prediction areas, wherein the test result comprises prediction data, real data, MAE value of the prediction data, RMSE value of the prediction data, R-Squared value and 1-R-Squared value of the prediction data;
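One common way to combine the two clustering algorithms in step (2) is to let Canopy choose the number of clusters and the initial centers, then refine them with K-means. The sketch below assumes that reading. The threshold `t2`, the fixed iteration count and the rule "first K1 clusters means the K1 largest" are assumptions, and only the tight Canopy threshold is used because only the centers are needed.

```python
import math


def canopy_centers(points, t2):
    """Pick canopy centers: every point within t2 of a center leaves the pool."""
    pool = list(points)
    centers = []
    while pool:
        center = pool.pop(0)
        centers.append(center)
        pool = [p for p in pool if math.dist(p, center) > t2]
    return centers


def kmeans(points, centers, iterations=20):
    """Plain Lloyd iterations started from the given centers."""
    centers = [tuple(c) for c in centers]
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its members; keep it if empty.
        centers = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
    # clusters holds the assignment made in the last iteration.
    return centers, clusters


def top_regions(clusters, k1):
    """Assumed ranking: keep the k1 clusters with the most points."""
    return sorted(clusters, key=len, reverse=True)[:k1]
```

With `points` as coordinate tuples, `canopy_centers` fixes K, `kmeans` produces the K clusters, and `top_regions` selects the K1 prediction regions.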
(3) the prediction results of the four prediction algorithms for each region are displayed with visual glyphs (as shown in fig. 2), and the analysis and display steps are as follows:
(3-1) using a multilayer radar chart as the first layer of the region glyph: taking the obtained (1-R-Squared) value, MAE value and RMSE value, in counterclockwise order, as the vertices of the radar chart, wherein each vertex displays the parameter name and its value; the four algorithms are displayed by stacking four radar-chart layers, each drawn with a different texture: backslash hatching represents the Random Forest algorithm, vertical lines represent the LSTM algorithm, horizontal lines represent the ARIMA algorithm, and forward-slash hatching represents the SVM algorithm;
(3-2) using a histogram as the second layer of the region glyph: drawing, in counterclockwise order, the bars for the obtained R-Squared value, MAE value and RMSE value on the second layer, wherein the height of each bar represents the value and each bar displays the parameter name; different textures represent the prediction results of the different prediction algorithms: backslash hatching represents the Random Forest algorithm, vertical lines the LSTM algorithm, horizontal lines the ARIMA algorithm, and forward-slash hatching the SVM algorithm; within the same texture, different gray levels of the bars show the prediction results obtained by different trained models of the same algorithm;
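The values plotted on the glyph in step (3) are standard regression metrics. The sketch below computes them for one algorithm in one region and lists the radar vertices in the order given in (3-1). The function names and the `HATCH` table are illustrative; the pattern strings follow matplotlib's hatch notation, although no drawing is done here.

```python
import math

# Texture per algorithm, as listed in steps (3-1) and (3-2).
HATCH = {"Random Forest": "\\", "LSTM": "|", "ARIMA": "-", "SVM": "/"}


def glyph_metrics(actual, predicted):
    """Return MAE, RMSE, R-Squared and 1-R-Squared for one prediction run."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean = sum(actual) / n
    ss_tot = sum((a - mean) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R-Squared is undefined for a constant series")
    r2 = 1 - sum(e * e for e in errors) / ss_tot
    return {"MAE": mae, "RMSE": rmse, "R2": r2, "1-R2": 1 - r2}


def radar_vertices(metrics):
    """Vertices in counterclockwise order: (1-R-Squared), MAE, RMSE."""
    return [(name, metrics[name]) for name in ("1-R2", "MAE", "RMSE")]
```

Using 1-R-Squared on the radar layer means that all three vertices read the same way: a smaller polygon indicates a better fit.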
(4) a map-based layout algorithm for the prediction-region glyphs: each glyph is initially placed at the center point of its region, and overlapping glyphs are rearranged using an improved force-directed layout algorithm (as shown in FIG. 3), wherein the process of the layout algorithm is as follows:
(4-1) inputting K1 initial node positions, wherein the radius of each node is r, calculating the distance between the nodes, and if the distance between the nodes is less than 2r, the two nodes are overlapped;
(4-2) calculating the relative position between every pair of nodes to obtain a relative-position matrix M; for the nodes {a1, a2, ..., ai} (i = K1), the matrix M is calculated as:
[Equation image BDA0002521525680000051 is not reproduced in the text.]
for each node, another node to its upper right is marked 0, to its upper left 1, to its lower left 2, and to its lower right 3; the relative position between each original node and all other nodes is then calculated to obtain a matrix of order K1;
(4-4) calculating the displacement generated by the force action between the two nodes, wherein the two-node displacement calculation formula is as follows:
[Formula image BDA0002521525680000061 is not reproduced in the text.]
wherein x represents the displacement generated by the force action between two points, Δ x represents the difference between the abscissa of the two points, Δ y represents the difference between the ordinate of the two points, and k is the force action coefficient;
(4-5) calculating the displacement generated by the action of force between each node and all other nodes, and accumulating to obtain the unit displacement of each node;
(4-6) sequentially updating the coordinates of the K1 nodes according to the unit displacement of each node;
(4-7) calculating the relative position between every pair of updated nodes to obtain a relative-position matrix M1, and comparing M with M1, i.e. comparing the relative positions of the points before and after the update; for two nodes P1 and P2 whose relative position has changed, the maximum displacement of the two points is calculated from their original relative angle and original coordinates, and the coordinates of P2 are updated;
(4-8) repeating the steps (4-4) to (4-7) and iterating for n times until all the nodes are not overlapped.
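The overlap-removal loop of step (4) can be sketched as below. Because the displacement formula survives only as an image reference, the push used here, `k * (2r - d)` along the line joining two overlapping centers, is a stand-in assumption and not the patent's formula. The correction of step (4-7) is omitted; `relative_matrix` is included so that the quadrant codes before and after a run can be compared.

```python
import math


def quadrant(a, b):
    """Code of b relative to a: 0 upper right, 1 upper left, 2 lower left, 3 lower right."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    if dx >= 0 and dy >= 0:
        return 0
    if dx < 0 and dy >= 0:
        return 1
    if dx < 0 and dy < 0:
        return 2
    return 3


def relative_matrix(nodes):
    """K1 x K1 matrix of quadrant codes (None on the diagonal)."""
    return [[None if i == j else quadrant(a, b) for j, b in enumerate(nodes)]
            for i, a in enumerate(nodes)]


def has_overlap(nodes, r):
    """Step (4-1): two nodes overlap when their centers are closer than 2r."""
    return any(math.dist(a, b) < 2 * r
               for i, a in enumerate(nodes) for b in nodes[i + 1:])


def spread(nodes, r, k=0.6, max_iter=200):
    """Push overlapping nodes apart until none overlap or max_iter is reached."""
    nodes = [list(p) for p in nodes]
    for _ in range(max_iter):
        if not has_overlap(nodes, r):
            break
        moves = [[0.0, 0.0] for _ in nodes]
        for i, a in enumerate(nodes):
            for j, b in enumerate(nodes):
                if i == j:
                    continue
                dx, dy = a[0] - b[0], a[1] - b[1]
                d = math.hypot(dx, dy)
                if d >= 2 * r:
                    continue  # this pair does not overlap
                if d == 0:  # coincident centers: separate along x by index order
                    ux, uy = (1.0, 0.0) if i < j else (-1.0, 0.0)
                else:
                    ux, uy = dx / d, dy / d
                push = k * (2 * r - d)  # assumed force law
                moves[i][0] += ux * push
                moves[i][1] += uy * push
        for p, m in zip(nodes, moves):
            p[0] += m[0]
            p[1] += m[1]
    return [tuple(p) for p in nodes]
```

A value of `k` slightly above 0.5 makes a colliding pair overshoot the 2r gap a little, which avoids stalling just below the threshold because of floating-point rounding.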
(5) Connecting the glyphs of similar regions with lines (see fig. 4 (a)); the attributes of each region node, which consist of the indices of the prediction algorithms, are calculated; similar points are then grouped into one class by a clustering algorithm; finally, the glyphs of similar regions are connected by line segments, and Bezier curves are created to smooth them, wherein the algorithm for connecting glyphs with line segments is as follows:
(5-1) inputting a series of coordinate points S = {P1, P2, ..., Pt};
(5-2) selecting the first point P1 as the initial point p0;
(5-3) calculating the Euclidean distances from p0 to the other points, selecting the nearest point pk, saving the line segment [p0, pk] and the nearest distance dis, and deleting p0 from S;
(5-4) letting p0 = pk and repeating step (5-3);
(5-5) iterating (t-1) times, i.e. stopping when only one element remains in S, saving the connecting line as lines0 and its line length as distance0;
(5-6) selecting P2, ..., Pt in turn as the initial point p0 and repeating steps (5-2) to (5-5) to obtain (t-1) connecting lines and their line lengths; keeping only the connecting lines whose line segments do not intersect (a shared endpoint does not count as an intersection), and selecting the one with the shortest line length as the line-segment connection scheme.
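Steps (5-1) to (5-6) describe a greedy nearest-neighbour chain tried from every starting point. A minimal sketch follows; the function names are illustrative, points are assumed to be coordinate tuples, and the intersection test only detects proper crossings (collinear overlaps are not treated as intersections, which is a simplification).

```python
import math


def chain_from(points, start):
    """Steps (5-2) to (5-5): repeatedly link the current point to its nearest remaining point."""
    remaining = list(points)
    current = remaining.pop(start)
    segments, total = [], 0.0
    while remaining:
        nearest = min(remaining, key=lambda p: math.dist(current, p))
        segments.append((current, nearest))
        total += math.dist(current, nearest)
        remaining.remove(nearest)
        current = nearest
    return segments, total


def _orient(a, b, c):
    """Signed area of triangle abc; the sign tells on which side of ab the point c lies."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(s1, s2):
    """True for a proper crossing; a shared endpoint does not count."""
    (a, b), (c, d) = s1, s2
    if {a, b} & {c, d}:
        return False
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)


def best_chain(points):
    """Step (5-6): try every start, drop self-crossing chains, keep the shortest."""
    candidates = []
    for start in range(len(points)):
        segments, total = chain_from(points, start)
        crossing = any(segments_cross(s, t)
                       for i, s in enumerate(segments) for t in segments[i + 1:])
        if not crossing:
            candidates.append((total, segments))
    return min(candidates, key=lambda c: c[0]) if candidates else None
```

`best_chain` returns a `(length, segments)` pair, or `None` when every candidate chain crosses itself.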
(6) Referring to fig. 4, in a preferred embodiment, a line layout algorithm is proposed to detect the collision between a line segment and a node and to re-plan a line path, and the implementation steps are as follows:
(6-1) the radius of each node is r, the distance d from each node to each line segment is detected, and if r is larger than d, the node is detected to collide with the line segment; otherwise, continuing to judge the next node;
(6-2) when a node collides with a line segment, judging the relative position of the node and the line segment: if the node is on the left side of the line segment, a stagnation point is selected on the right side of the node; otherwise, a stagnation point is selected on the left side of the node;
(6-3) connecting the stagnation point with the center of the node to obtain two intersection points on the circumference of the non-similar node, finding the intersection point closest to the line segment, and then generating a virtual node within a threshold θ of that point and connecting the virtual node to the similar nodes.
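The collision test of step (6-1) is a point-to-segment distance check, and steps (6-2) and (6-3) amount to placing a waypoint just outside the node circle on the side opposite to the node's position relative to the line. The sketch below is one simplified reading of those steps: the `margin` parameter stands in for the threshold θ, "left" and "right" assume a y-up coordinate system, and all names are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment ab."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.dist(p, a)  # degenerate segment
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.dist(p, (a[0] + t * dx, a[1] + t * dy))


def side_of_segment(p, a, b):
    """'left' if p lies to the left of the direction a -> b (y axis pointing up)."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return "left" if cross > 0 else "right"


def detour_point(node, r, a, b, margin=0.2):
    """Return None without a collision, else a virtual waypoint outside the node circle."""
    if point_segment_distance(node, a, b) >= r:
        return None  # step (6-1): r > d is required for a collision
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return None  # no direction to detour around
    # Unit normal pointing to the right of the direction a -> b.
    nx, ny = dy / length, -dx / length
    if side_of_segment(node, a, b) == "right":
        nx, ny = -nx, -ny  # node is on the right, so detour on its left
    # The waypoint sits margin beyond the circle on the chosen side.
    return (node[0] + nx * (r + margin), node[1] + ny * (r + margin))
```

The returned waypoint can then be used as an extra control point when the connecting line is redrawn as a Bezier curve.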
This embodiment designs an interactive visual analysis system for interactively analyzing, understanding and visually comparing prediction models and their performance. First, the spatial data are analyzed; then prediction models are established from the data set and prediction is completed; finally, a set of visual glyphs is designed to better display the prediction results of the models. A novel layout algorithm is also provided to resolve glyph-to-glyph and glyph-to-line collisions and to show the differences and correlations among the prediction models. The method helps the user better understand how a prediction model and its parameters influence the prediction result, and how the prediction results of different models differ.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and although the invention has been described in detail with reference to the foregoing examples, it will be apparent to those skilled in the art that various changes in the form and details of the embodiments may be made and equivalents may be substituted for elements thereof. All modifications, equivalents and the like which come within the spirit and principle of the invention are intended to be included within the scope of the invention.

Claims (1)

1. A method for interactive selection of a spatio-temporal data prediction model, the method comprising the steps of:
(1) acquiring shared bicycle data, deleting records that fall outside the analysis area or have abnormal riding times, checking whether the bicycle data for each time point of each day in the area is empty, filling in 0 where it is empty, and building a data set;
(2) clustering the obtained data set by combining a Canopy clustering algorithm and a K-means clustering algorithm, and taking the first K1 cluster classes as prediction areas; then modeling and predicting the flow data of the K1 prediction areas, wherein the test result comprises prediction data, real data, MAE value of the prediction data, RMSE value of the prediction data, R-Squared value and 1-R-Squared value of the prediction data;
(3) presenting the prediction results of the four prediction algorithms for each region with visual glyphs, wherein the analysis and display steps are as follows:
(3-1) using a multilayer radar chart as the first layer of the area map: taking the obtained (1-R-Squared) value, MAE value and RMSE value in turn, in counterclockwise order, as the vertices of a radar chart, wherein each vertex displays the parameter name and its value; the four algorithms are displayed by stacking four radar chart layers, each layer drawn with a different texture: backslash hatching represents the Random Forest algorithm, vertical lines represent the LSTM algorithm, horizontal lines represent the ARIMA algorithm, and slash hatching represents the SVM algorithm;
(3-2) using a histogram as the second layer of the area map: drawing the obtained R-Squared value, MAE value and RMSE value as a histogram on the second layer in counterclockwise order, wherein the height of each column represents the value and each column displays the parameter name; different textures represent the prediction results of the different prediction algorithms: backslash hatching represents the Random Forest algorithm, vertical lines represent the LSTM algorithm, horizontal lines represent the ARIMA algorithm, and slash hatching represents the SVM algorithm; different gray levels of columns with the same texture show the prediction results obtained by different trained models of the same algorithm;
(4) a map-based layout algorithm for the prediction-region glyphs: the initial position of each glyph is the center point of its region, and overlapping glyphs are rearranged using an improved force-directed layout algorithm, whose process is as follows:
(4-1) inputting K1 initial node positions, each node having radius r, and calculating the distance between the nodes; if the distance between two nodes is less than 2r, the two nodes overlap;
(4-2) calculating the relative position between each pair of nodes to obtain a relative matrix M; for the nodes {a1, a2, …, ai} (i = K1), the matrix M is calculated as:
Figure FDA0002521525670000011
for each node, the upper right of the node is coded as 0, the upper left as 1, the lower left as 2, and the lower right as 3; the relative position between each original node and all other nodes is then calculated to obtain a matrix of order K1;
(4-4) calculating the displacement generated by the force between two nodes, wherein the displacement of the two nodes is calculated as:
Figure FDA0002521525670000012
wherein x represents the displacement generated by the force between two points, Δx represents the difference between the abscissas of the two points, Δy represents the difference between the ordinates of the two points, and k is the force coefficient;
(4-5) calculating the displacement generated by the force between each node and all other nodes, and accumulating these displacements to obtain the unit displacement of each node;
(4-6) sequentially updating the coordinates of the K1 nodes according to the unit displacement of each node;
(4-7) calculating the relative position between each pair of updated nodes to obtain a relative matrix M1, and comparing M with M1, that is, comparing the relative position between each pair of points before and after the update; for two nodes P1 and P2 whose relative position has changed, the maximum displacement of the two points is calculated from their original relative angle and original coordinates, and the coordinates of P2 are updated;
(4-8) repeating steps (4-4) to (4-7) n times until no nodes overlap;
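Steps (4-1) to (4-8) can be sketched as follows. The displacement formula of step (4-4) is only available as a figure, so this sketch substitutes an assumed repulsion rule: each overlapping pair is pushed apart along the line between the centers by k·(2r − d)/2 per node. The function names, the tie-breaking in `quadrant`, the −1 diagonal, and the y-axis pointing upward are all assumptions. The correction of step (4-7) is not implemented; `changed_pairs` only reports which pairs changed their relative position.

```python
import math
from itertools import combinations


def quadrant(a, b):
    """Code the position of b relative to a: 0 upper right, 1 upper left,
    2 lower left, 3 lower right (y grows upward; tie handling is an assumption)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    if dy >= 0:
        return 0 if dx >= 0 else 1
    return 2 if dx < 0 else 3


def relative_matrix(nodes):
    """Matrix of quadrant codes of order K1; the diagonal is set to -1 (assumption)."""
    return [[-1 if i == j else quadrant(a, b) for j, b in enumerate(nodes)]
            for i, a in enumerate(nodes)]


def changed_pairs(m_before, m_after):
    """Pairs (i, j), i < j, whose quadrant code differs between two matrices."""
    n = len(m_before)
    return [(i, j) for i, j in combinations(range(n), 2)
            if m_before[i][j] != m_after[i][j]]


def overlapping_pairs(nodes, r):
    """Step (4-1): two nodes overlap when their centers are closer than 2r."""
    return [(i, j) for i, j in combinations(range(len(nodes)), 2)
            if math.dist(nodes[i], nodes[j]) < 2 * r]


def separate(nodes, r, k=1.0, max_iter=1000):
    """Iterate an ASSUMED pairwise repulsion until no nodes overlap."""
    pos = [(float(x), float(y)) for x, y in nodes]
    for _ in range(max_iter):
        pairs = overlapping_pairs(pos, r)
        if not pairs:
            break
        shift = [[0.0, 0.0] for _ in pos]      # accumulated displacement per node
        for i, j in pairs:
            dx, dy = pos[j][0] - pos[i][0], pos[j][1] - pos[i][1]
            d = math.hypot(dx, dy)
            # Coincident centers have no direction; pick the x-axis arbitrarily.
            ux, uy = (dx / d, dy / d) if d > 0 else (1.0, 0.0)
            push = k * (2 * r - d) / 2 + 1e-9  # small margin against rounding
            shift[i][0] -= ux * push
            shift[i][1] -= uy * push
            shift[j][0] += ux * push
            shift[j][1] += uy * push
        pos = [(x + sx, y + sy) for (x, y), (sx, sy) in zip(pos, shift)]
    return pos


if __name__ == "__main__":
    start = [(0, 0), (1, 0), (0.5, 0.5)]
    result = separate(start, r=1.0)
    print(result)
    print(changed_pairs(relative_matrix(start), relative_matrix(result)))
```

Comparing `relative_matrix` before and after an update is what allows step (4-7) to detect that a glyph has jumped to a different side of its neighbor, which would distort the geographic arrangement on the map.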
(5) connecting the glyphs of similar regions with lines: calculating the attributes of each region node, the attributes consisting of the indices of the prediction algorithms; then grouping similar points into one class with a clustering algorithm; finally connecting the glyphs of similar regions with line segments and creating Bezier curves for beautification, wherein the algorithm for connecting the glyphs with line segments is as follows:
(5-1) inputting a series of coordinate points S = {P1, P2, …, Pt};
(5-2) selecting the first point P1 as the initial point p0;
(5-3) calculating the Euclidean distance between p0 and the other points, selecting the point pk closest to p0, saving the line segment [p0, pk] and the nearest distance dis, and deleting p0 from S;
(5-4) letting p0 = pk and repeating step (5-3);
(5-5) iterating (t-1) times, that is, stopping when only one element remains in S, and saving the connecting line as lines0 and its line length as distance0;
(5-6) selecting P2 … Pt in sequence as the initial point p0 and repeating steps (5-2) to (5-5) to obtain (t-1) connecting lines and their line lengths; keeping only the connecting lines whose line segments do not intersect one another, and selecting among them the connecting line with the shortest line length as the line segment connection scheme;
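Steps (5-1) to (5-6) describe a greedy nearest-neighbor chain tried from every starting point. A minimal sketch follows, with hypothetical function names. Two points are assumptions not stated in the claim: only proper crossings between non-adjacent segments count as intersections, and if every candidate crosses itself the shortest candidate is returned anyway.

```python
import math


def greedy_chain(points, start):
    """Steps (5-2) to (5-5): repeatedly hop to the nearest unvisited point."""
    remaining = [i for i in range(len(points)) if i != start]
    order, length, current = [start], 0.0, start
    while remaining:
        nearest = min(remaining, key=lambda j: math.dist(points[current], points[j]))
        length += math.dist(points[current], points[nearest])
        remaining.remove(nearest)
        order.append(nearest)
        current = nearest
    return order, length


def _orient(a, b, c):
    """Signed area test: > 0 if c is to the left of the directed line a -> b."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(p1, p2, p3, p4):
    """True if segments p1p2 and p3p4 properly cross (touching is ignored)."""
    d1, d2 = _orient(p3, p4, p1), _orient(p3, p4, p2)
    d3, d4 = _orient(p1, p2, p3), _orient(p1, p2, p4)
    return d1 * d2 < 0 and d3 * d4 < 0


def self_intersects(points, order):
    """Check every pair of non-adjacent segments of the polyline for a crossing."""
    segs = [(points[a], points[b]) for a, b in zip(order, order[1:])]
    return any(segments_cross(*segs[i], *segs[j])
               for i in range(len(segs)) for j in range(i + 2, len(segs)))


def best_connection(points):
    """Step (5-6): try every start, drop self-crossing chains, keep the shortest."""
    candidates = [greedy_chain(points, s) for s in range(len(points))]
    valid = [c for c in candidates if not self_intersects(points, c[0])]
    # Assumption: fall back to all candidates if none is free of crossings.
    return min(valid or candidates, key=lambda c: c[1])


if __name__ == "__main__":
    glyph_centers = [(0, 0), (2, 0), (2, 1), (0, 1)]
    print(best_connection(glyph_centers))
```

Trying each point as the start costs roughly t times as much as a single chain, but the groups of similar regions are small, so the extra work buys a shorter and less tangled connection at little cost.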
(6) a line layout algorithm for detecting collisions between line segments and nodes and re-planning the line paths, implemented as follows:
(6-1) each node has radius r; the distance d from each node to each line segment is measured, and if r is larger than d the node collides with the line segment; otherwise the next node is checked;
(6-2) when a node collides with a line segment, determining the position of the node relative to the line segment: if the node is on the left side of the line segment, a stagnation point is selected on the right side of the node; otherwise a stagnation point is selected on the left side of the node;
(6-3) connecting the stagnation point with the center of the node to obtain two intersection points on the circumference of the non-similar node, finding the intersection point closest to the line segment, and then generating a virtual node within the threshold theta of that point and connecting the virtual node with the similar nodes.
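The collision test of step (6-1) and the side test of step (6-2) can be sketched as below. The sketch covers detection only; the construction of the stagnation point and the virtual node in step (6-3) is not specified precisely enough to reproduce. The function names are hypothetical, the y-axis is assumed to point upward, and skipping the two nodes that a segment connects is an assumption (their distance to that segment is always 0).

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment with endpoints a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:                      # degenerate segment: a == b
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Projection parameter of p onto the line, clamped to the segment.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def side_of_segment(p, a, b):
    """'left', 'right' or 'on', looking from a toward b with y pointing up."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if cross > 0:
        return "left"
    return "right" if cross < 0 else "on"


def find_collisions(nodes, segments, r):
    """Return (node index, segment index, side) for every node with r > d."""
    hits = []
    for i, node in enumerate(nodes):
        for j, (a, b) in enumerate(segments):
            if node == a or node == b:       # assumption: skip the segment's own ends
                continue
            if r > point_segment_distance(node, a, b):
                hits.append((i, j, side_of_segment(node, a, b)))
    return hits


if __name__ == "__main__":
    centers = [(0, 0), (4, 0), (2, 0.5)]
    links = [((0, 0), (4, 0))]
    print(find_collisions(centers, links, r=1.0))
```

The reported side tells the re-routing step on which side of the node the detour should pass: per step (6-2), a node on the left of the segment gets its stagnation point on the right, and vice versa.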
CN202010492269.1A 2020-06-03 2020-06-03 Interactive selection method of space-time data prediction model Active CN111783832B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010492269.1A CN111783832B (en) 2020-06-03 2020-06-03 Interactive selection method of space-time data prediction model

Publications (2)

Publication Number Publication Date
CN111783832A true CN111783832A (en) 2020-10-16
CN111783832B CN111783832B (en) 2022-07-15

Family

ID=72754042

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010492269.1A Active CN111783832B (en) 2020-06-03 2020-06-03 Interactive selection method of space-time data prediction model

Country Status (1)

Country Link
CN (1) CN111783832B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022227613A1 (en) * 2021-04-27 2022-11-03 中国科学院深圳先进技术研究院 Method and system for dynamically analyzing interaction between protein and small molecules, and computer-readable carrier

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106354760A (en) * 2016-08-18 2017-01-25 北京工商大学 Deforming statistical map based multi-view spatio-temporal data visualization method and application
EP3505957A1 (en) * 2017-12-27 2019-07-03 Avantix Treatment method for multi-target detection, characterisation and tracking and associated device
CN110096500A (en) * 2019-05-07 2019-08-06 上海海洋大学 A kind of visual analysis method and system towards ocean multidimensional data
CN110389982A (en) * 2019-07-25 2019-10-29 东北师范大学 A kind of spatiotemporal mode visual analysis system and method based on air quality data
CN110888912A (en) * 2019-10-15 2020-03-17 中国人民解放军国防科技大学 Target behavior semantic track prediction method based on space-time big data

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
X. CHEN: "A Survey of Multi-Space Techniques in Spatio-Temporal Simulation Data Visualization", 《VISUAL INFORMATICS》, 16 August 2019 (2019-08-16), pages 1 - 11 *
孙国道: "基于地理标签的推文话题时空演变的可视分析方法", 《计算机科学》, 31 August 2019 (2019-08-31), pages 1 - 8 *

Also Published As

Publication number Publication date
CN111783832B (en) 2022-07-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant