CN106815320B - Investigation big data visual modeling method and system based on expanded three-dimensional histogram

Info

Publication number
CN106815320B
Authority
CN
China
Prior art keywords
data
dimensional
expanded
dimensional histogram
investigation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611225454.4A
Other languages
Chinese (zh)
Other versions
CN106815320A (en
Inventor
胡钦太
黄昌勤
张瑜
卢春和
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GUANGZHOU CREATEVIEW OPTOELECTRONICS TECHNOLOGY Co Ltd
South China Normal University
Original Assignee
GUANGZHOU CREATEVIEW OPTOELECTRONICS TECHNOLOGY Co Ltd
South China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by GUANGZHOU CREATEVIEW OPTOELECTRONICS TECHNOLOGY Co Ltd and South China Normal University
Priority to CN201611225454.4A
Publication of CN106815320A
Application granted
Publication of CN106815320B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 Browsing; Visualisation therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/174 Form filling; Merging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/177 Editing, e.g. inserting or deleting of tables; using ruled lines
    • G06F40/18 Editing, e.g. inserting or deleting of tables; using ruled lines of spreadsheets
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0637 Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education

Abstract

The invention discloses an investigation big data visual modeling method and system based on an expanded three-dimensional histogram. The method comprises the following steps: initializing a three-dimensional visualization model; classifying and preprocessing the investigation data according to their specific data types; reading the original investigation data, then extracting and normalizing expanded three-dimensional histogram data according to the requirements of the three-dimensional visualization model and the classification preprocessing result, thereby generating expanded three-dimensional histogram data that conform to the standard format of the three-dimensional visualization model; and visually displaying the generated expanded three-dimensional histogram data by graphics methods. Because the method is based on the expanded three-dimensional histogram, it can form a unified overall visual analysis result and is widely applicable; because the expanded three-dimensional histogram is a visual chart associated with the original investigation data, fidelity is high; and because a suitable preprocessing method can be adopted for each data type, the method is more effective. The invention can be widely applied in the field of big data processing.

Description

Investigation big data visual modeling method and system based on expanded three-dimensional histogram
Technical Field
The invention relates to the field of big data processing, and in particular to an investigation big data visual modeling method and system based on an expanded three-dimensional histogram.
Background
Education equipment is a necessary condition for the modernization of education. Education decision-making departments need to collect and compile investigation data on the level of information technology application in order to understand how the basic infrastructure for education informatization is configured and how software and hardware facilities are being used. They also need to describe the progress of education informatization comprehensively on the basis of these data and to generate corresponding reports, such as application analysis reports, problem diagnoses, planning consultations, and development decisions, so that the abundant investigation data can be used to evaluate, advise on, and plan the implementation status, application effect, and development level of education informatization at every level.
In recent years, the large investment in education informatization and the wide gap between its actual application and the expected effect have forced attention onto deeper problems such as strategic decision-making and return on investment. More and more people have begun to pay attention to the performance of education informatization, and the focus of the work has shifted from investment and the provision of information schemes, platforms, and systems to the integration, value evaluation, and sustainable development of education informatization. Evaluation of education informatization is likewise shifting from determining its level mainly by investment to determining its level mainly by performance, so that evaluating performance promotes the application and development of education informatization. However, research on performance evaluation of education informatization, both domestic and foreign, is still at an exploratory stage.
Evaluating the performance of education informatization is difficult. On the one hand, education informatization is a dynamic development process and a multiple-input, multiple-output problem, and its output is not easy to measure with quantitative indicators. On the other hand, the field has no mature theoretical guidance and no appropriate measuring methods or tools. The benefits of education informatization are also diverse: they include social benefits as well as economic ones, they appear more as long-term benefits than as current ones, and they appear more as derived benefits than as inherent ones. Performance evaluation of education informatization has therefore become an important and urgent topic of concern.
The data types of education informatization investigation data are complex, and the data are dynamic and change with circumstances. Large-scale investigations at the provincial and municipal levels generate huge volumes of data, and means of automatic processing and visual analysis are urgently needed. Conventional visualization mainly performs comparative statistics and visualization on simplified numerical types. This no longer meets the needs of investigation analysis when the volume and variety of data surge, so analysis is usually carried out through clustering instead.
The traditional cluster analysis methods mainly include the following:
1. Partitioning methods
Given a data set with N tuples or records, a partitioning method constructs K groups, each representing a cluster, with K < N. The K groups satisfy the following conditions: (1) each group contains at least one data record; (2) each data record belongs to exactly one group (this requirement can be relaxed in some fuzzy clustering algorithms). For a given K, the algorithm first produces an initial grouping and then changes the grouping iteratively so that each new grouping is better than the previous one. The criterion for a good grouping is that records in the same group are as close as possible, while records in different groups are as far apart as possible. Algorithms based on this idea include the K-MEANS algorithm, the K-MEDOIDS algorithm, and the CLARANS algorithm.
Most partitioning methods are distance-based. Given the number K of partitions to construct, a partitioning method first creates an initial partition and then applies an iterative relocation technique that improves the partition by moving objects from one group to another. A good partition is generally one in which objects in the same cluster are as close or as related to each other as possible, while objects in different clusters are as far apart or as different as possible. Traditional partitioning methods can be extended to subspace clustering, which searches subspaces rather than the whole data space; this is useful when the data have many attributes and are sparse. Achieving a global optimum with partition-based clustering may require enumerating all possible partitions, which is computationally very expensive. In practice, most applications use popular heuristics, such as the k-means and k-medoids algorithms, which progressively improve clustering quality and approach a locally optimal solution. These heuristic clustering methods are well suited to finding spherical clusters in small and medium-scale databases. To find clusters with complex shapes and to cluster very large data sets, partition-based clustering methods need to be extended further.
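The iterative relocation idea described above can be illustrated with a minimal k-means sketch. This is a generic textbook illustration, not part of the claimed method; the function name `kmeans`, the evenly spaced choice of initial centers, and the sample points are assumptions made for this example.

```python
def kmeans(points, k, iterations=20):
    """Lloyd-style k-means on a list of equal-length numeric tuples."""
    # Assumed initial partition: k evenly spaced records serve as centers.
    centers = [points[i * len(points) // k] for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Relocation step: move each record to the group of its nearest center.
        for i, p in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: each center becomes the mean of its group.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a group becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment, centers


points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
labels, centers = kmeans(points, k=2)
```

With these well-separated sample points the two groups are recovered, but in general the result depends on the initial centers, which is the local-optimum behavior noted in the text.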
2. Hierarchical methods
A hierarchical method decomposes a given data set hierarchically until a certain condition is met, and can follow either a bottom-up or a top-down scheme. In the bottom-up scheme, each data record initially forms a separate group; in each subsequent iteration, the method merges groups that are close to each other, until all records form a single group or some condition is met. Representative hierarchical algorithms include the BIRCH algorithm, the CURE algorithm, and the CHAMELEON algorithm.
Hierarchical clustering methods may be based on distance, density, or connectivity, and some extensions also consider subspace clustering. Their drawback is that once a step (a merge or a split) is completed, it cannot be undone. This rigidity keeps the computational cost low, because the combinatorial number of alternative choices need not be considered, but it also means that erroneous decisions cannot be corrected. The clustering quality of hierarchical methods therefore also needs further improvement.
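The bottom-up scheme can be sketched as follows. This is a minimal single-linkage illustration on one-dimensional values, not BIRCH, CURE, or CHAMELEON; the function name `agglomerative` and the sample values are assumptions made for this example.

```python
def agglomerative(values, target_groups):
    """Bottom-up single-linkage clustering of 1-D numeric values."""
    clusters = [[v] for v in values]  # every record starts as its own group
    while len(clusters) > max(1, target_groups):
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])  # a merge is final and is never undone
        del clusters[j]
    return [sorted(c) for c in clusters]


groups = agglomerative([1.0, 1.5, 9.0, 2.0, 9.5, 20.0], target_groups=3)
```

Because each merge is permanent, an early merge of two records that belong apart would persist in every later level of the hierarchy, which is the drawback described above.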
3. Model-based methods
A model-based method assumes a model for each cluster and then finds the data that best fit that model. Such a model may be a density distribution function of the data points in space, or some other function. The underlying assumption is that the target data set is generated by a series of probability distributions. Model-based methods generally follow one of two directions: statistical approaches and neural network approaches.
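As one example of the statistical direction, the sketch below fits a mixture of two one-dimensional Gaussian distributions with expectation-maximization. The source does not name this algorithm; it is chosen here only as a common illustration, and the function name `fit_two_gaussians`, the initialization at the data extremes, and the variance floor are assumptions.

```python
import math


def fit_two_gaussians(xs, iterations=50):
    """EM for a two-component 1-D Gaussian mixture (illustrative only)."""
    mu = [min(xs), max(xs)]   # assumed initialization: extremes of the data
    var = [1.0, 1.0]
    weight = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [
                weight[k]
                * math.exp(-((x - mu[k]) ** 2) / (2 * var[k]))
                / math.sqrt(2 * math.pi * var[k])
                for k in range(2)
            ]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate mean, variance, and weight of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk == 0:
                continue
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            spread = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            var[k] = max(spread, 1e-6)  # floor keeps the density well defined
            weight[k] = nk / len(xs)
    return mu, var, weight


mu, var, weight = fit_two_gaussians([0.8, 1.0, 1.2, 4.8, 5.0, 5.2])
```

Each point is then described by how probable it is under each fitted distribution, rather than by a hard distance-based assignment.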
In summary, existing visual modeling methods for education informatization investigation big data have the following shortcomings:
(1) they can only realize simple visualization of individual data types, are only suitable for visual comparison within the same data type, cannot form a unified overall visual analysis result, and are therefore not widely applicable;
(2) the visualization cannot be traced back through the data processing that produced it (it is irreversible), so fidelity is low;
(3) they cannot meet the processing requirements of complex multi-type data, because they cannot apply a suitable preprocessing method to each data type.
Disclosure of Invention
To solve the above technical problems, one object of the present invention is to provide an investigation big data visual modeling method based on an expanded three-dimensional histogram that is widely applicable, high in fidelity, and effective.
Another object of the present invention is to provide an investigation big data visual modeling system based on an expanded three-dimensional histogram that is widely applicable, high in fidelity, and effective.
The technical scheme adopted by the invention is as follows:
the investigation big data visualization modeling method based on the expanded three-dimensional histogram comprises the following steps:
initializing a three-dimensional visual model;
classifying and preprocessing the research data according to specific data types;
reading the original investigation data, and extracting and normalizing expanded three-dimensional histogram data according to the requirements of the three-dimensional visualization model and the classification preprocessing result, so as to generate expanded three-dimensional histogram data conforming to the standard format of the three-dimensional visualization model, wherein the horizontal dimension of the expanded three-dimensional histogram consists of reporting entities at different levels, the longitudinal dimension comprises multiple complex data types, and the Z-direction dimension is a structure that consists of cell attributes, cell heights, and top textures and is associated with the original investigation data; the reporting entities at different levels include, but are not limited to, provinces, cities, counties, and schools; the complex data types include, but are not limited to, logic data, text data, numerical data, and enumeration data; and the top textures use different colors to express the variation trend of the data;
and carrying out visual display on the generated expanded three-dimensional histogram data according to a graphics method.
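The three dimensions described in the steps above can be pictured as a small data structure. The following is a minimal sketch under stated assumptions: the class names `Cell` and `ExpandedHistogram`, the field names, the trend labels, and the color palette are all hypothetical, since the source specifies only that each cell carries attributes, a height, and a top texture whose color expresses the variation trend, and that it stays associated with the original data.

```python
from dataclasses import dataclass, field

# Hypothetical palette: the source only says colors express the trend.
TREND_COLORS = {"up": "green", "down": "red", "flat": "gray"}


@dataclass
class Cell:
    """One bar of the expanded histogram (the Z-direction structure)."""
    entity: str        # horizontal axis: province, city, county, school, ...
    data_type: str     # longitudinal axis: logic, text, numerical, enumeration
    height: float      # cell height, assumed to be normalized already
    trend: str         # "up", "down", or "flat"
    source_id: str     # link back to the original investigation record
    attributes: dict = field(default_factory=dict)  # free-form cell attributes

    @property
    def top_texture(self) -> str:
        """Color of the bar's top face, chosen from the variation trend."""
        return TREND_COLORS[self.trend]


@dataclass
class ExpandedHistogram:
    entities: list
    data_types: list
    cells: dict = field(default_factory=dict)

    def put(self, cell: Cell) -> None:
        """Store a cell at its (entity, data_type) grid position."""
        if cell.entity not in self.entities:
            raise KeyError(f"unknown entity: {cell.entity}")
        if cell.data_type not in self.data_types:
            raise KeyError(f"unknown data type: {cell.data_type}")
        self.cells[(cell.entity, cell.data_type)] = cell


hist = ExpandedHistogram(
    entities=["Province P", "City C", "School S"],
    data_types=["logic", "text", "numerical", "enumeration"],
)
hist.put(Cell("School S", "numerical", 3.5, "up", "record-001", {"item": "computers"}))
```

Keeping `source_id` on every cell is one simple way to preserve the association with the original investigation data that the method relies on for fidelity.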
Further, the step of initializing the three-dimensional visualization model includes:
determining the horizontal and longitudinal absolute widths, data intervals and the total number of the minimum unit dimensions contained in the expanded three-dimensional histogram;
determining the specific dimension structure of the horizontal and longitudinal axes of the expanded three-dimensional histogram;
determining the total height of the expanded three-dimensional histogram in the z-axis direction, the position of the ground plane, and the height representation methods of the different data types;
and setting the top texture parameters of the expanded three-dimensional histogram in the z-axis direction.
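The initialization parameters listed above could be gathered in a single configuration object. This is a minimal sketch, assuming that the "data interval" means the gap between adjacent cells; the class name `ModelConfig`, all field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ModelConfig:
    x_width: float       # absolute width along the horizontal axis
    y_width: float       # absolute width along the longitudinal axis
    x_units: int         # number of minimum units on the horizontal axis
    y_units: int         # number of minimum units on the longitudinal axis
    spacing: float       # assumed meaning of "data interval": gap between cells
    total_height: float  # total extent along the z axis
    ground_z: float      # position of the ground plane within that extent
    height_methods: dict = field(default_factory=dict)  # per data type
    top_texture: dict = field(default_factory=dict)     # texture parameters

    def __post_init__(self):
        if not 0.0 <= self.ground_z <= self.total_height:
            raise ValueError("ground plane must lie within the total height")

    def cell_footprint(self):
        """Width and depth left for one cell after subtracting the gaps."""
        def per_cell(width, units):
            return (width - self.spacing * (units - 1)) / units
        return per_cell(self.x_width, self.x_units), per_cell(self.y_width, self.y_units)


config = ModelConfig(
    x_width=100.0, y_width=60.0, x_units=4, y_units=3, spacing=4.0,
    total_height=20.0, ground_z=10.0,
    height_methods={"numerical": "deviation from mean", "logic": "fixed step"},
    top_texture={"up": "green", "down": "red", "flat": "gray"},
)
footprint = config.cell_footprint()
```

Placing the ground plane in the middle of the z extent, as in this sample configuration, leaves room for bars both above and below it.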
Further, the step of performing classification preprocessing on research data according to specific data types includes:
reading in original research data;
performing corresponding classification processing according to the specific data type of the original research data to obtain processed data;
and archiving the processed data.
Further, the step of performing corresponding classification processing according to the specific data type of the original research data to obtain processed data specifically includes:
if the original investigation data are numerical data, first determining the value range of the numerical data, then determining their average value, then determining a numerical mapping method for them, and finally identifying the variation trend of each individual numerical datum; if the original investigation data are logic data, first enumerating the value range of each investigation item of the logic data, then determining the reference value of the logic data, then determining a value mapping method for the logic data, and finally identifying the variation trend of each individual logic datum; if the original investigation data are text data, first listing the keywords of the text data, then extracting an abstract of the text data, then determining a keyword mapping method for the text data, and finally identifying the variation trend of each individual text datum.
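The per-type processing described above can be sketched as a dispatch over three small functions. The source does not define the mapping methods or how a trend is identified, so everything concrete here is an assumption: min-max mapping for numerical data, a 0/1 mapping for logic data, keyword coverage for text data, the first sentence as the abstract, and a trend defined as the position relative to a reference value. All function names are hypothetical.

```python
def trend(value, reference):
    """Classify a value as above, below, or equal to a reference."""
    if value > reference:
        return "up"
    if value < reference:
        return "down"
    return "flat"


def preprocess_numerical(values):
    lo, hi = min(values), max(values)   # value range
    mean = sum(values) / len(values)    # average value
    span = (hi - lo) or 1.0             # avoid division by zero
    return [
        {"raw": v, "mapped": (v - lo) / span, "trend": trend(v, mean)}
        for v in values
    ]


def preprocess_logic(values, reference=False):
    if any(v not in (True, False) for v in values):  # enumerated value range
        raise ValueError("logic data must be True or False")
    return [
        {"raw": v, "mapped": 1.0 if v else 0.0, "trend": trend(int(v), int(reference))}
        for v in values
    ]


def preprocess_text(texts, keywords):
    results = []
    for text in texts:
        lowered = text.lower()
        found = [k for k in keywords if k.lower() in lowered]
        results.append({
            "raw": text,
            "abstract": text.split(".")[0].strip(),  # assumed: first sentence
            "keywords": found,
            "mapped": len(found) / len(keywords),    # keyword coverage
        })
    mean = sum(r["mapped"] for r in results) / len(results)
    for r in results:
        r["trend"] = trend(r["mapped"], mean)
    return results


PREPROCESSORS = {
    "numerical": preprocess_numerical,
    "logic": preprocess_logic,
    "text": preprocess_text,
}


def classify_and_preprocess(kind, values, **options):
    """Apply the preprocessing branch that matches the data type."""
    return PREPROCESSORS[kind](values, **options)


numeric = classify_and_preprocess("numerical", [10, 20, 30])
logic = classify_and_preprocess("logic", [True, False])
text = classify_and_preprocess(
    "text",
    ["Broadband installed. Labs pending.", "No equipment."],
    keywords=["broadband", "labs"],
)
```

A dispatch table keeps the branches independent, so a further type such as enumeration data could be added without touching the existing ones.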
Further, the step of reading the original investigation data, extracting and normalizing the expanded three-dimensional histogram data according to the requirements of the three-dimensional visualization model and the classification preprocessing result, and generating the expanded three-dimensional histogram data conforming to the standard format of the three-dimensional visualization model includes:
reading the original investigation data one by one, and parsing them in depth, layer by layer, according to the predetermined data format of the composite-structure investigation data, until the minimum data unit of the original investigation data is reached;
extracting graphical representation data required by the original research data according to the specific data type of the original research data, the requirements of the three-dimensional visual model and the classification preprocessing result, and calculating the corresponding data change trend;
normalizing the extracted graphical representation data and the calculated data change trend against the three-dimensional visualization model to obtain data conforming to the standard format of the three-dimensional visualization model;
and writing the data which accords with the standard format of the three-dimensional visual model into an expanded three-dimensional histogram data set.
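The four sub-steps above (layer-by-layer reading, extraction, normalization, writing) can be sketched for the numerical case. This is a simplified illustration, assuming the composite structure is a nested dictionary whose leaves are the minimum data units; the function names, the record layout, the scaling by the largest absolute value, and the mean-based trend are all assumptions.

```python
def walk(node, path=()):
    """Yield (path, value) for every minimum data unit in a nested record."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from walk(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from walk(child, path + (index,))
    else:
        yield path, node


def build_histogram_dataset(raw, max_height=10.0):
    """Extract numeric leaves, normalize their heights, and collect records."""
    units = [
        (path, value)
        for path, value in walk(raw)
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    ]
    if not units:
        return []
    peak = max(abs(v) for _, v in units) or 1.0  # assumed scaling reference
    mean = sum(v for _, v in units) / len(units)
    dataset = []
    for path, value in units:
        direction = "up" if value > mean else "down" if value < mean else "flat"
        dataset.append({
            "entity": "/".join(str(p) for p in path[:-1]),
            "item": str(path[-1]) if path else "",
            "height": value / peak * max_height,
            "trend": direction,
            "source_path": path,  # keeps the link to the original data
        })
    return dataset


raw = {
    "Province P": {
        "City C": {
            "School A": {"computers": 40},
            "School B": {"computers": 10},
        }
    }
}
dataset = build_histogram_dataset(raw)
```

Each output record retains `source_path`, so a displayed cell can be traced back to the exact leaf of the original record from which it was computed.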
Further, the step of performing the normalization of the expanded three-dimensional histogram data includes:
taking the ground plane as the reference plane, and formulating a cuboid (bar) height normalization strategy according to the characteristics of the different data types;
and normalizing the expanded three-dimensional histogram data according to the cuboid height normalization strategy, wherein the height of a bar of the expanded three-dimensional histogram in the Z-axis direction may be above or below the ground plane.
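A signed height relative to the ground plane can be sketched as follows. The source does not give the normalization formulas, so the strategies here are assumptions: numerical values are scaled by their deviation from the mean, and logic values are mapped to a fixed height above or below the ground plane. The function names and the `half_height` parameter are hypothetical.

```python
def signed_height(value, reference, max_deviation, half_height):
    """Map a value to a bar height above (+) or below (-) the ground plane."""
    if max_deviation <= 0:
        raise ValueError("max_deviation must be positive")
    ratio = (value - reference) / max_deviation
    ratio = max(-1.0, min(1.0, ratio))  # clamp to the available z extent
    return ratio * half_height


def numerical_height(value, values, half_height):
    """Assumed numerical strategy: deviation from the mean of the series."""
    mean = sum(values) / len(values)
    max_dev = max(abs(v - mean) for v in values) or 1.0
    return signed_height(value, mean, max_dev, half_height)


def logic_height(value, half_height):
    """Assumed logic strategy: True rises above the ground plane, False sinks."""
    return half_height if value else -half_height


values = [10, 20, 30]
heights = [numerical_height(v, values, half_height=5.0) for v in values]
below = numerical_height(15, values, half_height=5.0)
```

A bar of height zero sits exactly on the ground plane, so in this sketch the plane marks the reference value of each data type.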
The other technical scheme adopted by the invention is as follows:
an investigation big data visualization modeling system based on an expanded three-dimensional histogram comprises:
the three-dimensional visualization model initialization module is used for initializing a three-dimensional visualization model;
the investigation data classification preprocessing module is used for classifying and preprocessing the investigation data according to specific data types;
an expanded three-dimensional histogram data generation module, used for reading the original investigation data, and extracting and normalizing expanded three-dimensional histogram data according to the requirements of the three-dimensional visualization model and the classification preprocessing result, so as to generate expanded three-dimensional histogram data conforming to the standard format of the three-dimensional visualization model, wherein the horizontal dimension of the expanded three-dimensional histogram consists of reporting entities at different levels, the longitudinal dimension comprises multiple complex data types, and the Z-direction dimension is a structure that consists of cell attributes, cell heights, and top textures and is associated with the original investigation data; the reporting entities at different levels include, but are not limited to, provinces, cities, counties, and schools; the complex data types include, but are not limited to, logic data, text data, numerical data, and enumeration data; and the top textures use different colors to express the variation trend of the data;
and the investigation data expansion three-dimensional histogram display module is used for visually displaying the generated expansion three-dimensional histogram data according to a graphics method.
Further, the investigation data classification preprocessing module comprises:
the data reading unit is used for reading in original research data;
the classification processing unit is used for carrying out corresponding classification processing according to the specific data type of the original research data to obtain processed data;
and the archiving unit is used for archiving the processed data.
Further, the classification processing unit specifically performs the following operations:
if the original investigation data are numerical data, first determining the value range of the numerical data, then determining their average value, then determining a numerical mapping method for them, and finally identifying the variation trend of each individual numerical datum; if the original investigation data are logic data, first enumerating the value range of each investigation item of the logic data, then determining the reference value of the logic data, then determining a value mapping method for the logic data, and finally identifying the variation trend of each individual logic datum; if the original investigation data are text data, first listing the keywords of the text data, then extracting an abstract of the text data, then determining a keyword mapping method for the text data, and finally identifying the variation trend of each individual text datum.
Further, the expanded three-dimensional histogram data generation module includes:
the reading and parsing unit is used for reading the original investigation data one by one and parsing them in depth, layer by layer, according to the predetermined data format of the composite-structure investigation data, until the minimum data unit of the original investigation data is reached;
the extraction and calculation unit is used for extracting graphical representation data required by the original research data according to the specific data type of the original research data, the requirements of the three-dimensional visual model and the classification preprocessing result, and calculating the corresponding data change trend;
the normalization unit is used for normalizing the extracted graphical representation data and the calculated data change trend against the three-dimensional visualization model to obtain data conforming to the standard format of the three-dimensional visualization model;
and the writing unit is used for writing the data conforming to the standard format of the three-dimensional visual model into the expanded three-dimensional histogram data set.
The beneficial effects of the method of the invention are as follows. The method comprises initializing a three-dimensional visualization model, classifying and preprocessing the investigation data according to their specific data types, extracting and normalizing expanded three-dimensional histogram data, and visually displaying the result. Because expanded three-dimensional histogram data conforming to the standard format of the three-dimensional visualization model are generated by extraction and normalization and then displayed, a unified overall visual analysis result can be formed, and applicability is wider. Because the Z-direction dimension of the expanded three-dimensional histogram is a structure that consists of cell attributes, cell heights, and top textures and is associated with the original investigation data, the visual chart remains linked to the original investigation data; this overcomes the irreversibility of conventional visualization with respect to data processing, and fidelity is high. Because the longitudinal dimension of the expanded three-dimensional histogram includes multiple complex data types and a classification preprocessing step is added for each specific data type, the processing requirements of complex multi-type data are met, a suitable preprocessing method can be adopted for each data type, and the method is more effective.
The beneficial effects of the system of the invention are as follows. The system comprises a three-dimensional visualization model initialization module, an investigation data classification preprocessing module, an expanded three-dimensional histogram data generation module, and an investigation data expanded three-dimensional histogram display module. Because the generation module produces expanded three-dimensional histogram data conforming to the standard format of the three-dimensional visualization model by extraction and normalization, and the display module displays them visually, a unified overall visual analysis result can be formed, and applicability is wider. Because the Z-direction dimension handled by the generation module is a structure that consists of cell attributes, cell heights, and top textures and is associated with the original investigation data, the visual chart remains linked to the original investigation data; this overcomes the irreversibility of conventional visualization with respect to data processing, and fidelity is high. Because the longitudinal dimension handled by the generation module includes multiple complex data types and a classification preprocessing module is added to preprocess the investigation data according to their specific data types, the processing requirements of complex multi-type data are met, a suitable preprocessing method can be adopted for each data type, and the system is more effective.
Drawings
FIG. 1 is a flow chart of the steps of the investigation big data visual modeling method based on an expanded three-dimensional histogram according to the present invention;
FIG. 2 is a flow chart of the steps of classification processing according to the specific data type of the original investigation data;
FIG. 3 is a block diagram of the overall structure of the investigation big data visual modeling system based on an expanded three-dimensional histogram according to the present invention.
Detailed Description
Referring to FIG. 1, the investigation big data visual modeling method based on the expanded three-dimensional histogram includes the following steps:
initializing a three-dimensional visual model;
classifying and preprocessing the research data according to specific data types;
reading the original investigation data, and extracting and normalizing expanded three-dimensional histogram data according to the requirements of the three-dimensional visualization model and the classification preprocessing result, so as to generate expanded three-dimensional histogram data conforming to the standard format of the three-dimensional visualization model, wherein the horizontal dimension of the expanded three-dimensional histogram consists of reporting entities at different levels, the longitudinal dimension comprises multiple complex data types, and the Z-direction dimension is a structure that consists of cell attributes, cell heights, and top textures and is associated with the original investigation data; the reporting entities at different levels include, but are not limited to, provinces, cities, counties, and schools; the complex data types include, but are not limited to, logic data, text data, numerical data, and enumeration data; and the top textures use different colors to express the variation trend of the data;
and carrying out visual display on the generated expanded three-dimensional histogram data according to a graphics method.
Further as a preferred embodiment, the step of initializing the three-dimensional visualization model includes:
determining the horizontal and longitudinal absolute widths, data intervals and the total number of the minimum unit dimensions contained in the expanded three-dimensional histogram;
determining the specific dimension structure of the horizontal and longitudinal axes of the expanded three-dimensional histogram;
determining the total height of the expanded three-dimensional histogram in the z-axis direction, the position of the ground plane, and the height representation methods of the different data types;
and setting the top texture parameters of the expanded three-dimensional histogram in the z-axis direction.
Referring to fig. 2, further as a preferred embodiment, the step of performing classification preprocessing on the research data according to specific data types includes:
reading in original research data;
performing corresponding classification processing according to the specific data type of the original research data to obtain processed data;
and archiving the processed data.
Referring to fig. 2, as a further preferred embodiment, the step of performing corresponding classification processing according to a specific data type of the original research data to obtain processed data specifically includes:
if the original research data is numerical data, determining the value range of the numerical data, then determining the average value of the numerical data, then determining a numerical mapping method of the numerical data, and finally identifying the variation trend of the single numerical data; if the original investigation data is logic data, firstly enumerating the value range of each investigation item of the logic data, then determining the reference value of the logic data, then determining the logical data value mapping method, and finally identifying the variation trend of the single logic data; if the original research data is text data, firstly listing keywords of the text data, then extracting an abstract of the text data, then determining a method for mapping the keywords of the text data, and finally identifying the variation trend of the single text data.
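The three type-specific branches above can be sketched as follows. The sketch is a simplified illustration, not the method of the invention: the function names, the min-max mapping, the +1/-1 logic mapping, the keyword-fraction mapping, the truncation used as a stand-in for abstract extraction, and the first-to-last comparison used for the trend are all assumptions chosen for brevity.

```python
from statistics import mean


def _trend(delta):
    # Classify a change as rising, falling, or unchanged.
    return "up" if delta > 0 else "down" if delta < 0 else "flat"


def preprocess_numerical(values):
    """Value range, average, an assumed min-max mapping, and the trend."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0          # avoid division by zero for constant data
    return {"range": (lo, hi),
            "mean": mean(values),
            "mapped": [(v - lo) / span for v in values],
            "trend": _trend(values[-1] - values[0])}


def preprocess_logic(values, reference=True):
    """Enumerated value range, reference value, an assumed +1/-1 mapping, trend."""
    mapped = [1.0 if v == reference else -1.0 for v in values]
    return {"domain": sorted(set(values)),
            "reference": reference,
            "mapped": mapped,
            "trend": _trend(mapped[-1] - mapped[0])}


def preprocess_text(texts, keywords, abstract_len=20):
    """Keyword list, a truncated text as a stand-in abstract, keyword mapping, trend."""
    mapped = [sum(k in t for k in keywords) / len(keywords) for t in texts]
    return {"keywords": list(keywords),
            "abstracts": [t[:abstract_len] for t in texts],
            "mapped": mapped,
            "trend": _trend(mapped[-1] - mapped[0])}
```

Each function returns the intermediate results in one dictionary so that they can be archived together, as the archiving step requires.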
As a preferred embodiment, the step of reading the original research data, extracting and normalizing the expanded three-dimensional histogram data according to the requirement of the three-dimensional visualization model and the result of the classification preprocessing, and generating the expanded three-dimensional histogram data conforming to the standard format of the three-dimensional visualization model includes:
reading the original investigation data one by one, and carrying out deep analysis on the original investigation data layer by layer according to the data format of the predetermined composite structure investigation data until the minimum data unit of the original investigation data is analyzed;
extracting graphical representation data required by the original research data according to the specific data type of the original research data, the requirements of the three-dimensional visual model and the classification preprocessing result, and calculating the corresponding data change trend;
normalizing the extracted graphical representation data and the calculated data change trend according to the three-dimensional visual model to obtain data conforming to the standard format of the three-dimensional visual model;
and writing the data which accords with the standard format of the three-dimensional visual model into an expanded three-dimensional histogram data set.
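The layer-by-layer parsing and writing steps above can be sketched as a recursive walk. The sketch assumes, purely for illustration, that the composite-structure investigation data is a nested dictionary whose leaves are lists of numeric observations over time; the function names, the record layout, and the sample keys are hypothetical.

```python
def iter_min_units(node, path=()):
    """Walk the nested data layer by layer and yield (path, leaf) pairs."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from iter_min_units(child, path + (key,))
    else:
        # Anything that is not a dict is treated as a minimum data unit.
        yield path, node


def build_dataset(raw, normalize):
    """Extract, normalize, and collect one record per minimum data unit."""
    dataset = []
    for path, values in iter_min_units(raw):
        *entity, item = path
        dataset.append({
            "entity": tuple(entity),            # e.g. province, city, school
            "item": item,                       # the investigation item
            "height": normalize(values[-1]),    # latest value, normalized
            # Trend as the sign of the change from first to last observation.
            "trend": (values[-1] > values[0]) - (values[-1] < values[0]),
        })
    return dataset
```

With the made-up input `{"province_A": {"city_B": {"school_C": {"computers": [10, 20]}}}}` and `normalize=lambda v: v / 100`, the result is a single record for `school_C` with height 0.2 and trend 1.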
Further, as a preferred embodiment, the step of performing the normalization of the expanded three-dimensional histogram data includes:
taking a ground plane as a reference plane, and formulating a cuboid height normalization strategy according to the characteristics of the different data types;
and normalizing the expanded three-dimensional histogram data according to the cuboid height normalization strategy, wherein the height of a cuboid of the expanded three-dimensional histogram in the Z-axis direction may be above or below the ground plane.
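One possible height normalization around a ground plane is sketched below. It assumes that a per-type reference value (for instance the average of numerical data or the reference value of logic data) is placed on the ground plane and that values are scaled linearly on each side; the text does not specify the strategy, so the function name, the parameters, and the piecewise-linear mapping are all assumptions.

```python
def normalize_height(value, reference, lo, hi, ground_z, total_height):
    """Map a value to the z coordinate of a cuboid top.

    The reference value lands on the ground plane; larger values rise toward
    total_height and smaller values sink toward z = 0.
    """
    if not lo <= reference <= hi:
        raise ValueError("reference must lie inside [lo, hi]")
    value = min(max(value, lo), hi)          # clamp out-of-range input
    if value >= reference:
        span = hi - reference
        room = total_height - ground_z       # space above the ground plane
        return ground_z + (room * (value - reference) / span if span else 0.0)
    span = reference - lo                    # positive here because value < reference
    return ground_z - ground_z * (reference - value) / span
```

With a ground plane at z = 20 in a total height of 50, a value range of 2 to 6 and a reference of 4, the value 5 maps to z = 35 (above the ground plane) and the value 3 maps to z = 10 (below it).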
Referring to fig. 3, the research big data visualization modeling system based on the extended three-dimensional histogram includes:
the three-dimensional visualization model initialization module is used for initializing a three-dimensional visualization model;
the investigation data classification preprocessing module is used for classifying and preprocessing the investigation data according to specific data types;
the expanded three-dimensional histogram data generation module is used for reading the original investigation data, and extracting and normalizing expanded three-dimensional histogram data according to the requirements of the three-dimensional visualization model and the classification preprocessing result, to generate expanded three-dimensional histogram data conforming to the standard format of the three-dimensional visualization model, wherein the horizontal dimension of the expanded three-dimensional histogram is formed by different filling main bodies at different levels, the longitudinal dimension comprises multiple complex data types, and the Z-direction dimension is a structure that is composed of cell attributes, cell heights and top textures and is linked to the original investigation data; the filling main bodies at different levels include, but are not limited to, provinces, cities, counties and schools; the complex data types include, but are not limited to, logic data, text data, numerical data and enumeration data; and the top textures use different colors to express the variation trend of the data;
and the investigation data expanded three-dimensional histogram display module is used for visually displaying the generated expanded three-dimensional histogram data according to a graphics method.
Further preferably, the research data classification preprocessing module includes:
the data reading unit is used for reading in original research data;
the classification processing unit is used for carrying out corresponding classification processing according to the specific data type of the original research data to obtain processed data;
and the archiving unit is used for archiving the processed data.
Further as a preferred embodiment, the classification processing unit specifically performs the following operations:
if the original research data is numerical data, determining the value range of the numerical data, then determining the average value of the numerical data, then determining a numerical mapping method of the numerical data, and finally identifying the variation trend of the single numerical data; if the original investigation data is logic data, firstly enumerating the value range of each investigation item of the logic data, then determining the reference value of the logic data, then determining the logical data value mapping method, and finally identifying the variation trend of the single logic data; if the original research data is text data, firstly listing keywords of the text data, then extracting an abstract of the text data, then determining a method for mapping the keywords of the text data, and finally identifying the variation trend of the single text data.
Further as a preferred embodiment, the expanded three-dimensional histogram data generating module includes:
the reading and parsing unit is used for reading the original investigation data one by one and parsing the original investigation data layer by layer according to the predetermined data format of the composite-structure investigation data until the minimum data unit of the original investigation data is parsed;
the extraction and calculation unit is used for extracting graphical representation data required by the original research data according to the specific data type of the original research data, the requirements of the three-dimensional visual model and the classification preprocessing result, and calculating the corresponding data change trend;
the normalization unit is used for normalizing the extracted graphical representation data and the calculated data change trend according to the three-dimensional visualization model to obtain data conforming to the standard format of the three-dimensional visualization model;
and the writing unit is used for writing the data conforming to the standard format of the three-dimensional visual model into the expanded three-dimensional histogram data set.
The invention will be further explained and illustrated with reference to the drawings and the embodiments in the description.
Example one
To address the low applicability, low fidelity and limited effectiveness of the prior art, the invention provides a novel investigation big data visualization modeling method and system based on an expanded three-dimensional histogram. The key point of the invention is to move from the conventional visualization mode, which compares data of a single type, to a unified, integrated visual analysis model that brings together the processing of investigation data of multiple levels, dimensions and types. The invention is based on an expanded three-dimensional histogram, which extends the ordinary three-dimensional histogram as follows: the horizontal dimension, ordinarily formed by different parts of a single simple hierarchy, is extended to be formed by different filling main bodies (the entities that fill in the investigation) at different levels such as province, city, county and school, which provides the conditions for later big data processing at different granularities; the longitudinal dimension is extended from the usual purely numerical data type to a set of multiple complex data types such as logic data, text data, numerical data and enumeration data, which greatly widens the expressive power and range of application of the three-dimensional visualization model; and the Z-direction dimension is developed from the usual single height into a high-fidelity structure that consists of three elements, namely cell attributes, cell heights and top textures, and is linked to the original investigation data. The top texture uses different colors to express the variation trend of the data, which lays a foundation for the application of temporal databases. In the visual modeling process, when the newly introduced complex data types are preprocessed, a cuboid height normalization strategy is formulated according to the characteristics of the different data types.
To this end, the invention also introduces a ground plane as a reference plane, so that the extended height can be represented as above or below the ground plane, which enriches the meaning and expressive power of the cuboid height.
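The use of top-texture color to express a variation trend can be illustrated with a short sketch. The text states only that different colors express the trend; the function name, the two-round comparison, the 5% tolerance, and the red/gray/blue palette below are assumptions made for the example.

```python
def trend_color(previous, current, tolerance=0.05):
    """Pick a top-texture color from the relative change between two rounds."""
    if previous == 0:
        # Relative change is undefined at zero; fall back to the sign of current.
        if current == 0:
            change = 0.0
        else:
            change = float("inf") if current > 0 else float("-inf")
    else:
        change = (current - previous) / abs(previous)
    if change > tolerance:
        return "red"     # rising
    if change < -tolerance:
        return "blue"    # falling
    return "gray"        # roughly unchanged
```

Because the color is derived from two successive investigation rounds, repeating the computation after each new round is what lets a static chart carry time-dependent information.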
As shown in fig. 3, the investigation big data visualization modeling system of the present invention comprises four main parts: an investigation data classification preprocessing module, a three-dimensional visualization model initialization module, an expanded three-dimensional histogram data generation module, and an investigation data expanded three-dimensional histogram display module. The investigation data classification preprocessing module preprocesses the different types of investigation data. The three-dimensional visualization model initialization module initializes the parameters of the three-dimensional visualization model that require initialization, such as width, height and precision. The expanded three-dimensional histogram data generation module reads the original investigation data and combines and normalizes the data according to the requirements of the three-dimensional visualization model, finally obtaining normalized data in the standard format of the three-dimensional visualization model. The investigation data expanded three-dimensional histogram display module combines the normalized data from the expanded three-dimensional histogram data generation module row by row and column by column, by height and top texture, according to a graphics method, to form output data that can be visualized directly.
As shown in FIG. 1, the research big data visualization modeling method of the invention comprises the following steps:
(1) Initializing the three-dimensional visualization model, which comprises: determining the absolute horizontal and longitudinal widths and the data intervals; determining the total number of minimum unit dimensions; determining, according to the requirements of the investigation system, the containment structure of the dimensions on the horizontal and vertical coordinates; and, in the z-axis direction, determining the total height, the position of the ground plane and the height representation methods of the different data types, so that purely numerical heights at different plane positions are given different practical meanings related to the application level of information technology. A set of distinguishable top textures is also prepared during initialization to represent changes in the values.
(2) Performing classification preprocessing on the investigation data. As shown in fig. 2, the classification preprocessing mainly handles three types of data: for numerical data, the value range is determined first, then the average value, then the numerical mapping method, and finally the variation trend of each single numerical datum is identified; for logic data, the value range of each investigation item is enumerated first, then the reference value is determined, then the logic value mapping method, and finally the variation trend of each single logic datum is identified; for text data, the keywords are listed first, then an abstract of the text is extracted, then the keyword mapping method is determined, and finally the variation trend of each single text datum is identified.
(3) Generating the expanded three-dimensional histogram data.
The specific process of generating the expanded three-dimensional histogram data is as follows: first, the original investigation data are read one by one and parsed layer by layer according to the predetermined data format of the composite-structure investigation data (that is, the target format to be parsed is set in advance) until the minimum data unit of the original investigation data is reached; next, the required graphical representation data are extracted according to the data type of the original investigation data and the processing method for that type of data in step (2), and the variation trend of the corresponding data is calculated; then the extracted and calculated data are normalized according to the three-dimensional visualization model; and finally the normalized data are written uniformly into the expanded three-dimensional histogram data set.
(4) Displaying the expanded three-dimensional histogram of the investigation data.
This step converts the meaningful expanded three-dimensional histogram data generated in step (3) into a data set expressed entirely in the format required by graphics, so that it can be visualized. This step can specify the length, width and height of the three-dimensional visualization model, the spacing and height of each row and column, and specific display requirements such as the top texture map, and it provides the basic conditions for operations such as display, movement, rotation, and projection or cutting along each coordinate axis.
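The conversion into a graphics-ready data set can be sketched as a layout pass that combines heights and top-texture colors row by row and column by column. The function name, the record fields, and the default cell size and spacing are assumptions; the output is shaped so that it could be handed to a bar-style renderer (for example, the fields match the `x, y, z, dx, dy, dz` arguments of matplotlib's `Axes3D.bar3d`), but no particular renderer is prescribed by the text.

```python
def layout_cuboids(tops, colors, cell=(1.0, 1.0), spacing=0.2, ground_z=0.0):
    """Turn a grid of cuboid-top z values and colors into drawable boxes.

    tops[row][col] is the z coordinate of a cuboid top; colors has the same shape.
    """
    boxes = []
    for r, row in enumerate(tops):
        for c, top in enumerate(row):
            boxes.append({
                "x": c * (cell[0] + spacing),   # column position
                "y": r * (cell[1] + spacing),   # row position
                "z": min(top, ground_z),        # box base: the lower of top and ground
                "dx": cell[0],
                "dy": cell[1],
                "dz": abs(top - ground_z),      # box extent above or below ground
                "color": colors[r][c],          # top-texture color for the trend
            })
    return boxes
```

A cuboid whose top lies below the ground plane is stored with its base at the top value and a positive `dz`, so a renderer draws it hanging beneath the ground plane without needing negative extents.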
Example two
Evaluating the application level of educational information technology is a complex systematic process. Modeling with the visual modeling system of the first embodiment specifically comprises the following steps:
(1) establishing an index system for evaluating information technology application, wherein the index system covers the main indexes for evaluating planning, management, investment, application, training and the like, and remains relatively stable across successive evaluations;
(2) establishing a network evaluation system, carrying out cross-region investigation by network means as far as possible, and accumulating a sufficient volume of investigation data;
(3) establishing the visual modeling system of the first embodiment, and moving the traditional data processing mode toward visual processing and analysis;
(4) performing visual analysis and presentation of the investigation data with the visual modeling system of the first embodiment;
(5) on the basis of the visual analysis, establishing a long-term investigation mechanism based on the top texture, so as to move from one-off, end-point investigation and evaluation to continuous monitoring.
Compared with the prior art, the invention has the following advantages:
a. a visualization model is established to improve the statistical analysis framework, converting a numerical form of service into a graphical one;
b. visualization is extended from simple, local, single-type data to system-wide data;
c. because the chart of the expanded three-dimensional histogram is linked to the original investigation data, the irreversibility of data processing in conventional visualization is overcome and fidelity is higher;
d. complex multi-type data are clustered and extended on a semantic basis, the different data types are distinguished from one another in the model, and a corresponding data preprocessing method is applied to each, which makes the method more effective;
e. the top texture is fully used to add a fourth dimension of information, the variation trend, to the traditional three-dimensional visualization model, which makes it possible to turn an investigation system into a continuous monitoring system;
f. the meaning of the three-dimensional histogram is extended, and a structured basic visualization model that is hierarchical, analyzable and interpretable is constructed.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. The investigation big data visualization modeling method based on the expanded three-dimensional histogram is characterized in that: the method comprises the following steps:
initializing a three-dimensional visual model;
classifying and preprocessing the research data according to specific data types;
reading original investigation data, extracting and normalizing expanded three-dimensional histogram data according to the requirements of a three-dimensional visual model and classification preprocessing results, and generating expanded three-dimensional histogram data which accords with the standard format of the three-dimensional visual model, wherein the horizontal dimension of the expanded three-dimensional histogram is formed by different filling main bodies with different levels, the longitudinal dimension comprises multiple complex data types, and the Z-direction dimension is a structure which is formed by cell attributes, cell heights and top textures and is connected with the original investigation data, wherein the different filling main bodies with different levels comprise provinces, cities, counties and schools, the multiple complex data types comprise logic data, text data and numerical data, and the top textures adopt different colors to express the variation trend of the data;
and carrying out visual display on the generated expanded three-dimensional histogram data according to a graphics method.
2. The investigation big data visualization modeling method based on the extended three-dimensional histogram of claim 1, characterized in that: the step of initializing the three-dimensional visualization model includes:
determining the horizontal and longitudinal absolute widths, data intervals and the total number of the minimum unit dimensions contained in the expanded three-dimensional histogram;
determining a specific dimension structure for expanding the horizontal coordinate and the vertical coordinate of the three-dimensional histogram;
determining the total height of the expanded three-dimensional histogram in the z-axis direction, the position of a ground plane and height representation methods of different data types;
and setting a top texture parameter expanding the z-axis direction of the three-dimensional histogram.
3. The investigation big data visualization modeling method based on the extended three-dimensional histogram of claim 1, characterized in that: the step of classifying and preprocessing the research data according to specific data types comprises the following steps:
reading in original research data;
performing corresponding classification processing according to the specific data type of the original research data to obtain processed data;
and archiving the processed data.
4. The investigation big data visualization modeling method based on the extended three-dimensional histogram of claim 3, characterized in that: the step of performing corresponding classification processing according to the specific data type of the original research data to obtain processed data specifically includes:
if the original research data is numerical data, determining the value range of the numerical data, then determining the average value of the numerical data, then determining a numerical mapping method of the numerical data, and finally identifying the variation trend of the single numerical data; if the original investigation data is logic data, firstly enumerating the value range of each investigation item of the logic data, then determining the reference value of the logic data, then determining the logical data value mapping method, and finally identifying the variation trend of the single logic data; if the original research data is text data, firstly listing keywords of the text data, then extracting an abstract of the text data, then determining a method for mapping the keywords of the text data, and finally identifying the variation trend of the single text data.
5. The investigation big data visualization modeling method based on the extended three-dimensional histogram of claim 3, characterized in that: the step of reading original investigation data, extracting and normalizing expanded three-dimensional histogram data according to the requirements of the three-dimensional visual model and the classification preprocessing result, and generating expanded three-dimensional histogram data conforming to the standard format of the three-dimensional visual model comprises the following steps:
reading the original investigation data one by one, and carrying out deep analysis on the original investigation data layer by layer according to the data format of the predetermined composite structure investigation data until the minimum data unit of the original investigation data is analyzed;
extracting graphical representation data required by the original research data according to the specific data type of the original research data, the requirements of the three-dimensional visual model and the classification preprocessing result, and calculating the corresponding data change trend;
normalizing the extracted graphical representation data and the calculated data change trend according to the three-dimensional visual model to obtain data conforming to the standard format of the three-dimensional visual model;
and writing the data which accords with the standard format of the three-dimensional visual model into an expanded three-dimensional histogram data set.
6. The method for visual modeling of research big data based on extended three-dimensional histograms according to any of claims 1-5, characterized in that: the step of performing extended three-dimensional histogram data normalization comprises:
taking a ground plane as a reference plane, and formulating a cuboid height normalization strategy according to the characteristics of the different data types;
and normalizing the expanded three-dimensional histogram data according to the cuboid height normalization strategy, wherein the height of a cuboid of the expanded three-dimensional histogram in the Z-axis direction may be above or below the ground plane.
7. An investigation big data visualization modeling system based on an expanded three-dimensional histogram, characterized in that the system comprises:
the three-dimensional visualization model initialization module is used for initializing a three-dimensional visualization model;
the investigation data classification preprocessing module is used for classifying and preprocessing the investigation data according to specific data types;
an expanded three-dimensional histogram data generation module for reading the original investigation data, extracting and normalizing the expanded three-dimensional histogram data according to the requirements of the three-dimensional visualization model and the classification preprocessing result, generating expanded three-dimensional histogram data in accordance with the standard format of the three-dimensional visualization model, the horizontal dimension of the expanded three-dimensional histogram is formed by different filling main bodies with different levels, the longitudinal dimension comprises a plurality of complex data types, the Z-direction dimension is a structure which is formed by cell attributes, cell heights and top textures and is related to original research data, the different filling bodies of different levels comprise provinces, cities, counties and schools, the various complex data types comprise logic data, text data and numerical data, and the top texture adopts different colors to express the variation trend of the data;
and the investigation data expanded three-dimensional histogram display module is used for visually displaying the generated expanded three-dimensional histogram data according to a graphics method.
8. The investigation big data visualization modeling system based on the expanded three-dimensional histogram according to claim 7, characterized in that: the investigation data classification preprocessing module comprises:
the data reading unit is used for reading in original research data;
the classification processing unit is used for carrying out corresponding classification processing according to the specific data type of the original research data to obtain processed data;
and the archiving unit is used for archiving the processed data.
9. The investigation big data visualization modeling system based on the expanded three-dimensional histogram according to claim 8, characterized in that: the classification processing unit specifically executes the following operations:
if the original research data is numerical data, determining the value range of the numerical data, then determining the average value of the numerical data, then determining a numerical mapping method of the numerical data, and finally identifying the variation trend of the single numerical data; if the original investigation data is logic data, firstly enumerating the value range of each investigation item of the logic data, then determining the reference value of the logic data, then determining the logical data value mapping method, and finally identifying the variation trend of the single logic data; if the original research data is text data, firstly listing keywords of the text data, then extracting an abstract of the text data, then determining a method for mapping the keywords of the text data, and finally identifying the variation trend of the single text data.
10. The investigation big data visualization modeling system based on the expanded three-dimensional histogram according to claim 7, 8 or 9, characterized in that: the expanded three-dimensional histogram data generation module comprises:
the reading and parsing unit is used for reading the original investigation data one by one and parsing the original investigation data layer by layer according to the predetermined data format of the composite-structure investigation data until the minimum data unit of the original investigation data is parsed;
the extraction and calculation unit is used for extracting graphical representation data required by the original research data according to the specific data type of the original research data, the requirements of the three-dimensional visual model and the classification preprocessing result, and calculating the corresponding data change trend;
the normalization unit is used for normalizing the extracted graphical representation data and the calculated data change trend according to the three-dimensional visualization model to obtain data conforming to the standard format of the three-dimensional visualization model;
and the writing unit is used for writing the data conforming to the standard format of the three-dimensional visual model into the expanded three-dimensional histogram data set.
CN201611225454.4A 2016-12-27 2016-12-27 Investigation big data visual modeling method and system based on expanded three-dimensional histogram Active CN106815320B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611225454.4A CN106815320B (en) 2016-12-27 2016-12-27 Investigation big data visual modeling method and system based on expanded three-dimensional histogram

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611225454.4A CN106815320B (en) 2016-12-27 2016-12-27 Investigation big data visual modeling method and system based on expanded three-dimensional histogram

Publications (2)

Publication Number Publication Date
CN106815320A CN106815320A (en) 2017-06-09
CN106815320B true CN106815320B (en) 2020-03-17

Family

ID=59110274

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611225454.4A Active CN106815320B (en) 2016-12-27 2016-12-27 Investigation big data visual modeling method and system based on expanded three-dimensional histogram

Country Status (1)

Country Link
CN (1) CN106815320B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109033204B (en) * 2018-06-29 2021-10-08 浙江大学 Hierarchical integral histogram visual query method based on world wide web
CN112365110A (en) * 2019-07-24 2021-02-12 中移信息技术有限公司 Research method, platform, server and computer storage medium
CN111523009B (en) * 2020-07-03 2020-10-13 北京每日优鲜电子商务有限公司 Data visualization processing method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103295025A (en) * 2013-05-03 2013-09-11 南京大学 Automatic selecting method of three-dimensional model optimal view
CN103412871A (en) * 2013-07-08 2013-11-27 北京百度网讯科技有限公司 Method and device for generating visualized view
CN103617220A (en) * 2013-11-22 2014-03-05 北京掌阔移动传媒科技有限公司 Method and device for implementing mobile terminal 3D (three dimensional) model
CN105069020A (en) * 2015-07-14 2015-11-18 国家信息中心 3D visualization method and system of natural resource data

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8626750B2 (en) * 2011-01-28 2014-01-07 Bitvore Corp. Method and apparatus for 3D display and analysis of disparate data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
海底多维综合数据建模及可视化技术研究;苏天赟;《中国优秀博硕士学位论文全文数据库 (博士)基础科学辑》;20070215(第02期);全文 *

Also Published As

Publication number Publication date
CN106815320A (en) 2017-06-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant