CN106599241B - Visual management method for big data in GIS software - Google Patents

Publication number
CN106599241B
Authority
CN
China
Prior art keywords
data
reading
big data
visual
csv file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611182291.6A
Other languages
Chinese (zh)
Other versions
CN106599241A (en)
Inventor
钟耳顺
王尔琪
陈国雄
陈勇
胡辰璞
王少华
刘晓妮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Supermap Software Co ltd
Original Assignee
Supermap Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2016-12-20
Filing date
2016-12-20
Publication date
2020-06-30
Application filed by Supermap Software Co ltd
Priority to CN201611182291.6A
Publication of CN106599241A
Application granted
Publication of CN106599241B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/29 Geographical information databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Remote Sensing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a visual management method for big data in GIS software, comprising the following steps: 1) constructing distributed data sources suited to different data storage modes; 2) entering known parameters to open the data source that corresponds to the data storage mode, then accessing and reading the big data stored on the server; 3) performing visual data management operations on the data read; and 4) uploading the processed data to the server for storage or for sharing with others. Through interactive operation, the invention makes it convenient and intuitive to operate and manage a data cluster and displays data analysis results directly, which helps ordinary users understand the data better, helps data analysis experts carry out deeper analysis, and assists managers in making decisions.

Description

Visual management method for big data in GIS software
Technical Field
The invention relates to the technical field of computers and the field of geographic information systems, in particular to a visual management method for big data in GIS software.
Background
With the advent of the cloud era, big data has attracted more and more attention. For big data and its technologies, the value contained in the data and the cost of mining it matter more than sheer volume. How to make use of such large-scale data is critical to many industries, so the storage and processing of big data are particularly important. Big data processing is value-driven and covers operations such as processing, mining, and optimization.
Big data is a data set so large that its acquisition, storage, management, and analysis greatly exceed the capabilities of traditional database software tools. It has four characteristics: large data scale, rapid data circulation, diverse data types, and low value density. The core of big data technology lies in the specialized processing of meaningful data. The data volume of big data is generally above the TB level and generally cannot be processed by a single computer, so a distributed architecture is usually adopted. Its hallmark is distributed data mining on big data, which must rely on the distributed processing, distributed databases, cloud storage, and virtualization technologies of cloud computing. The value of big data is reflected in the following three aspects: (1) enterprises that offer products or services to a large number of consumers can use big data technology for precision marketing; (2) small, medium, and micro enterprises that follow a "small but beautiful" model can use big data technology to transform their services; and (3) traditional enterprises that are forced to transform under pressure from the internet need to make full use of the potential value of big data.
Data visualization is the most fundamental requirement of data analysis tools, whether for data analysis experts or for ordinary users. Displaying data visually lets the data speak for itself and lets the audience grasp the results. GIS software can additionally combine the data with spatial geographic positions, so that results can be seen more intuitively on a map and deeper data mining can be carried out.
Most existing software for visual management of big data is professional data analysis software, and most of it runs on a Linux operating system with a Hadoop runtime environment. Such software also carries a high learning cost, a long time cost, and a large financial cost.
Internationally there is no shortage of professional data processing software for processing, mining, and optimizing big data. However, professional GIS software that can combine big data with geographic information system software, process and mine the big data inside that software, and display it in combination with geographic positions is uncommon, and such software needs to be realized in a visual manner. In the domestic GIS industry this field is still blank.
Disclosure of Invention
The main purpose of the invention is to provide a visual management method for big data in GIS software, so as to solve problems in the prior art such as low analysis efficiency and poor display quality when GIS software processes spatiotemporal big data.
To solve this technical problem, the application provides a visual management method for big data in GIS software. The method comprises the following steps: 1) constructing distributed data sources suited to different data storage modes; 2) entering known parameters to open the data source that corresponds to the data storage mode, then accessing and reading the big data stored on the server; 3) performing visual data management operations on the data read, including setting fields, creating indexes, appending data, importing data, and exporting data; and 4) uploading the processed data to the server for storage or for sharing with others.
Further, the data sources in step 1) include an HDFS data source and a MongoDB data source.
Further, in step 3), while the big data is being read, the user can apply a custom configuration to batch-convert the data that meets the configured conditions into geospatial data.
Further, the data management in step 3) is a multitask operation: the tasks currently in progress are displayed visually, their progress can be checked, and operations such as cancelling a task in progress are supported.
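The multitask behaviour described above (tasks listed visually, progress that can be checked, cancellation of a task in progress) can be illustrated with a minimal Python sketch. This is an assumption-laden illustration, not the patented implementation: the names ManagedTask and TaskManager are hypothetical, and a real desktop GIS would run each task on a worker thread and bind the overview to a UI panel.

```python
import threading
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ManagedTask:
    """One data management job (for example an import) shown in a task view."""
    name: str
    total_steps: int
    done_steps: int = 0
    state: str = "pending"  # pending -> running -> finished or cancelled
    _cancel: threading.Event = field(default_factory=threading.Event, repr=False)

    @property
    def progress(self) -> float:
        """Fraction of the work completed, between 0.0 and 1.0."""
        return self.done_steps / self.total_steps if self.total_steps else 1.0

    def cancel(self) -> None:
        """Request cancellation; the loop notices it before its next step."""
        self._cancel.set()

    def run(self, work_step: Callable[[int], None]) -> None:
        """Call work_step(i) for each step, stopping early if cancelled."""
        self.state = "running"
        for i in range(self.total_steps):
            if self._cancel.is_set():
                self.state = "cancelled"
                return
            work_step(i)
            self.done_steps = i + 1
        self.state = "finished"


class TaskManager:
    """Keeps every submitted task so a UI could list name, state and progress."""

    def __init__(self) -> None:
        self.tasks: Dict[str, ManagedTask] = {}

    def submit(self, name: str, total_steps: int) -> ManagedTask:
        task = ManagedTask(name, total_steps)
        self.tasks[name] = task
        return task

    def overview(self) -> List[Tuple[str, str, float]]:
        return [(t.name, t.state, t.progress) for t in self.tasks.values()]
```

Cancellation here is cooperative: the task checks the flag between steps, so a step that is already running finishes before the task stops.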
The beneficial effects of the invention are as follows:
1. New engine types are added in the form of two data sources, the HDFS data source and the MongoDB data source. The user only needs to enter the corresponding parameters to read the big data stored on the server directly. While the big data is being read, the user can batch-convert data that meets the configured conditions into geospatial data based on a custom configuration, and this process is also visual.
2. The method integrates the reading and management of big data into domestic GIS software in a visual manner. The two added engine types are easy for users to understand and are also common big data storage modes. Management of the data read is convenient to operate and easy to use and understand, and conversion of source data into geospatial data is supported.
3. The user accesses big data deployed on the server interactively and ultimately obtains a data format suitable for geographic information operations; information that includes a geospatial position is automatically converted into geospatial data. In addition, big data is stored in a distributed manner, and in the invention the system adapts automatically to the underlying physical environment. The user does not need to know which computer holds the data and only needs to enter the relevant parameters; the system then matches and reads the data in the background and displays it at the front end.
4. To address problems such as low analysis efficiency and poor display quality when common domestic GIS software processes spatiotemporal big data, the invention realizes visual management of big data in domestic GIS software. Through interactive operation, a data cluster can be operated and managed conveniently and intuitively and data analysis results are displayed directly, which helps ordinary users understand the data better, helps data analysis experts carry out deeper analysis, and assists managers in making decisions.
5. Based on the Spark framework and the Scala programming language, an openable distributed data source is constructed in domestic desktop GIS software. The user can obtain data resources stored on the server by entering the corresponding parameters, such as address, instance name, user name, and password, and can convert them into a data format readable by the GIS software by setting the corresponding field parameters (for example, a text file (CSV) containing geographic coordinate information is converted into a spatial point data set). This achieves efficient visual management of big data.
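As a rough illustration of the connection parameters named in item 5 (address, instance name, user name, password) and the two engine types, the following Python sketch groups them into one value object. The class and field names are hypothetical and do not come from the patent or from any SuperMap API; the address used in the usage note is a documentation-only example.

```python
from dataclasses import dataclass
from enum import Enum


class Engine(Enum):
    """The two distributed data source types named in the text."""
    HDFS = "hdfs"
    MONGODB = "mongodb"


@dataclass(frozen=True)
class DataSourceConnection:
    """Parameters a user would type into an 'open data source' dialog."""
    engine: Engine
    server: str      # address of the server that stores the data
    instance: str    # instance name
    alias: str       # name displayed inside the GIS software
    user: str
    password: str

    def validate(self) -> None:
        """Reject connections that lack a required parameter."""
        missing = [n for n in ("server", "instance", "user") if not getattr(self, n)]
        if missing:
            raise ValueError("missing connection parameters: " + ", ".join(missing))

    def display_label(self) -> str:
        """Text a workspace tree could show; the password is never included."""
        return f"{self.alias or self.instance} [{self.engine.value}] @ {self.server}"
```

For example, DataSourceConnection(Engine.HDFS, "192.0.2.10:9000", "gisdata", "Taxi", "analyst", "secret").display_label() yields "Taxi [hdfs] @ 192.0.2.10:9000". Which parameters are mandatory is an assumption of this sketch.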
The invention aims to fill the gap in distributed big data management in domestic GIS software and to manage distributed big data visually without depending on a particular operating system or a Hadoop runtime environment, thereby lowering the difficulty of operation for the user and greatly improving the user's efficiency.
Drawings
FIG. 1 is a flowchart of the visual management method for big data in GIS software according to the invention;
FIG. 2 is a flowchart of reading a CSV file according to the first embodiment of the invention.
Detailed Description
The following embodiments further illustrate the invention:
First embodiment
As shown in FIG. 1 and FIG. 2, a method for visually managing big data in GIS software includes the following steps S01 to S04.
Step S01: Construct an HDFS distributed data source, where the data stored on the server is in CSV file format.
Step S02: Enter known parameters to open the data source that corresponds to the data storage mode, then access and read the big data stored on the server.
The data is stored in an Oracle database. To open it in SuperMap GIS software, parameters such as the server address (the address of the server that stores the data), instance name, alias (the name displayed in the GIS software), user name, and password must be entered, after which the HDFS data source is opened.
Step S03: Perform visual data management operations on the data read, including setting fields, creating indexes, appending data, importing data, and exporting data. The data stored on the server is in CSV file format. Data that carries geographic coordinate information is converted into a point data set, a data form that GIS software can recognize. When a CSV file is imported, settings such as the first-line field (header) and the separator can be configured; after import, a data index can be created, and operations such as appending batch-imported data and exporting data are also supported.
a) Managing the data read. The directory structure of the data files is displayed as a directory tree, and creating, deleting, and renaming directories are supported. An HDFS data source is opened by entering the server address, instance name, user name, password, and so on. The relevant attributes are configured when a CSV file is read, and the field information in the CSV file is read and converted into field information that the software can recognize. The CSV file reading process comprises: predefining the relevant attributes for reading the CSV file, such as the file path, starting line, character encoding, and separator; setting those attributes according to the preset parameter items; predefining the field structure of the CSV file; creating an index; reading the fields in the CSV file and creating them with their original types; and, if a geographic coordinate information field is detected, directly generating a point data set.
b) Visual operations on the data. Creating and appending data are supported, as are interactive operations between client and server, uploading and downloading data with resumable transfer (breakpoint resume), and importing and exporting data. The files accessed in the server directory are displayed in a sub-window, and the displayed content includes information such as index, file name, size, occupied block size, owner, and group.
c) Task management. The data management operations currently being performed can be viewed in task management: the tasks in progress are displayed visually, their progress can be checked, and operations such as cancelling a task in progress are supported. For HDFS data sources, field information is specified when an index is created for the data; specifying field information is also supported when data without an index is calculated and analyzed; and the data set type can be matched by setting field information.
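The CSV reading sequence in item a) (predefined read attributes, a predefined field structure, typed field creation, an index, and a point data set when coordinate fields are detected) might be sketched in Python as follows. This is only an illustration under stated assumptions: the column names lon and lat, the function names, and the in-memory byte input are hypothetical, the optional keep predicate stands in for a user-configured condition, and rows are assumed to be well formed. The patent itself describes a Spark and Scala implementation reading from a server.

```python
import csv
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, object]


@dataclass
class CsvReadOptions:
    """Predefined attributes for reading a CSV file."""
    start_line: int = 1      # 1-based line that holds the field names
    encoding: str = "utf-8"  # character encoding
    separator: str = ","
    x_field: str = "lon"     # hypothetical coordinate column names
    y_field: str = "lat"


def read_csv_dataset(
    raw: bytes,
    opts: CsvReadOptions,
    field_types: Dict[str, Callable[[str], object]],
    keep: Optional[Callable[[Record], bool]] = None,
) -> Tuple[str, List[Record]]:
    """Parse CSV bytes into typed records; emit points if coordinates exist."""
    lines = raw.decode(opts.encoding).splitlines()[opts.start_line - 1:]
    reader = csv.DictReader(lines, delimiter=opts.separator)
    fields = reader.fieldnames or []
    has_coords = opts.x_field in fields and opts.y_field in fields
    records: List[Record] = []
    for row in reader:
        # Create each field with its predefined type; unknown fields stay text.
        record: Record = {n: field_types.get(n, str)(v) for n, v in row.items()}
        if keep is not None and not keep(record):
            continue  # row does not meet the configured condition
        if has_coords:
            record["geometry"] = (float(row[opts.x_field]), float(row[opts.y_field]))
        records.append(record)
    return ("point" if has_coords else "tabular"), records


def build_index(records: List[Record], field_name: str) -> Dict[object, List[int]]:
    """Map each value of field_name to the positions of matching records."""
    index: Dict[object, List[int]] = {}
    for pos, rec in enumerate(records):
        index.setdefault(rec[field_name], []).append(pos)
    return index
```

The returned data set kind ("point" or "tabular") is decided once from the header, so a file without both coordinate columns is still imported as plain attribute data.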
Step S04: Upload the processed data to the server for storage or for sharing with others.
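The upload in step S04, combined with the resumable transfer mentioned in item b), can be pictured with the small Python sketch below. It is a sketch only: a local destination file stands in for the server, the function name and the max_chunks parameter (used to simulate an interruption) are hypothetical, and it assumes the bytes already at the destination are a correct prefix of the source, which a real system would verify, for example with a checksum.

```python
import os
import tempfile
from typing import Optional


def resume_upload(src_path: str, dst_path: str, chunk_size: int = 64 * 1024,
                  max_chunks: Optional[int] = None) -> int:
    """Append the bytes of src that dst is still missing; return bytes sent."""
    # The breakpoint is the number of bytes the destination already holds.
    offset = os.path.getsize(dst_path) if os.path.exists(dst_path) else 0
    sent = 0
    chunks = 0
    with open(src_path, "rb") as src, open(dst_path, "ab") as dst:
        src.seek(offset)
        while max_chunks is None or chunks < max_chunks:
            chunk = src.read(chunk_size)
            if not chunk:
                break  # nothing left to send
            dst.write(chunk)
            sent += len(chunk)
            chunks += 1
    return sent


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "points.csv")
        dst = os.path.join(tmp, "uploaded.csv")
        with open(src, "wb") as f:
            f.write(b"0123456789")
        resume_upload(src, dst, chunk_size=4, max_chunks=1)  # interrupted early
        resume_upload(src, dst, chunk_size=4)                # continues at byte 4
```

Calling the function again after an interruption continues from the recorded offset instead of restarting the transfer.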
Second embodiment
As shown in FIG. 1 and FIG. 2, a method for visually managing big data in GIS software includes the following steps S01 to S04.
Step S01: Construct a MongoDB distributed data source, where the data stored on the server is in CSV file format.
Step S02: Enter known parameters to open the data source that corresponds to the data storage mode, then access and read the big data stored on the server.
The data is stored in an Oracle database. To open it in SuperMap GIS software, parameters such as the server address (the address of the server that stores the data), instance name, alias (the name displayed in the GIS software), user name, and password must be entered, after which the MongoDB data source is opened.
Step S03: Perform visual data management operations on the data read, including setting fields, creating indexes, appending data, importing data, and exporting data. The data stored on the server is in CSV file format. Data that carries geographic coordinate information is converted into a point data set, a data form that GIS software can recognize. When a CSV file is imported, settings such as the first-line field (header) and the separator can be configured; after import, a data index can be created, and operations such as appending batch-imported data and exporting data are also supported.
a) Managing the data read. Creating, deleting, and renaming directories are supported. A MongoDB data source is opened by entering the server address, instance name, user name, password, and so on. The relevant attributes are configured when a CSV file is read, and the field information in the CSV file is read and converted into field information that the software can recognize. The CSV file reading process comprises: predefining the relevant attributes for reading the CSV file, such as the file path, starting line, character encoding, and separator; setting those attributes according to the preset parameter items; predefining the field structure of the CSV file; creating an index; reading the fields in the CSV file and creating them with their original types; and, if a geographic coordinate information field is detected, directly generating a point data set.
b) Visual operations on the data. Creating and appending data are supported, as are interactive operations between client and server, uploading and downloading data with resumable transfer (breakpoint resume), and importing and exporting data. The files accessed in the server directory are displayed in a sub-window, and the displayed content includes information such as index, file name, size, occupied block size, owner, and group.
c) Task management. The data management operations currently being performed can be viewed in task management: the tasks in progress are displayed visually, their progress can be checked, and operations such as cancelling a task in progress are supported. For MongoDB data sources, the fields of all tables are stored in a fixed-name table (e.g. smfieldlnfos), and the data set type is matched through the configured field information.
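The idea in item c) of keeping the fields of all tables in one fixed-name table and deriving the data set type from that field information can be sketched with plain Python dictionaries. Everything beyond the table name printed in the text is an assumption: the record layout, the function names, and the lon and lat column names are hypothetical, and an in-memory dictionary stands in for the MongoDB database so the example stays self-contained.

```python
from typing import Dict, Iterable, List, Tuple

# Table name exactly as printed in the text; its real schema is not given there.
FIELD_INFO_TABLE = "smfieldlnfos"

Store = Dict[str, List[Dict[str, str]]]


def register_fields(store: Store, dataset: str,
                    fields: Iterable[Tuple[str, str]]) -> None:
    """Record (field name, field type) pairs for a data set in the shared table."""
    store.setdefault(FIELD_INFO_TABLE, []).extend(
        {"dataset": dataset, "name": name, "type": ftype} for name, ftype in fields
    )


def dataset_type(store: Store, dataset: str, x: str = "lon", y: str = "lat") -> str:
    """Return 'point' when both coordinate fields are registered, else 'tabular'."""
    names = {f["name"] for f in store.get(FIELD_INFO_TABLE, [])
             if f["dataset"] == dataset}
    return "point" if {x, y} <= names else "tabular"
```

Because every data set shares the one field table, looking up the type of a data set needs no scan of the data itself, only of its registered field records.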
Step S04: Upload the processed data to the server for storage or for sharing with others.
Description of terms:
Spark: an open-source, general-purpose parallel framework that can process large data (TB level) in parallel across large-scale clusters in a reliable and fault-tolerant manner. By using in-memory distributed data sets, it can both provide interactive queries and optimize iterative workloads. Spark is implemented in Scala and uses Scala as its application framework. Spark and Scala are tightly integrated, and Scala can manipulate distributed data sets as easily as local collection objects. Spark can be used to build large, low-latency data analysis applications. Spark provides in-memory distributed computing and offers APIs for the Java, Scala, Python, and R programming languages.
Scala: a multi-paradigm programming language with features such as object-oriented programming, functional programming, and static typing. It is extensible and can interoperate with Java and .NET.
HDFS (Hadoop Distributed File System): the Hadoop distributed file system. HDFS is a highly fault-tolerant system suitable for deployment on inexpensive machines. HDFS provides high-throughput data access and is well suited to applications on large-scale data sets.
The above description is only a preferred embodiment of the invention. It should be noted that those skilled in the art can make several modifications and refinements without departing from the technical principle of the invention, and these modifications and refinements should also be regarded as falling within the protection scope of the invention.

Claims (4)

1. A visual management method for big data in GIS software, comprising the following steps:
1) constructing distributed data sources suited to different data storage modes;
2) entering known parameters to open the data source that corresponds to the data storage mode, and accessing and reading the big data stored on a server;
3) performing visual data management operations on the data read, including setting fields, creating indexes, appending data, importing data, and exporting data, wherein the data management operations comprise:
displaying the directory structure of the data files as a directory tree, and supporting creating, deleting, and renaming directories; opening an HDFS data source by entering a server address, an instance name, a user name, and a password; configuring relevant attributes when reading a CSV file, and reading the field information in the CSV file and converting it into recognizable field information, wherein reading the CSV file comprises:
predefining the relevant attributes for reading the CSV file, the relevant attributes comprising a file path, a starting line, a character encoding, and a separator; setting the relevant attributes for reading the CSV file according to the preset parameter items; predefining the field structure of the CSV file; creating an index; reading the fields in the CSV file and creating them with their original types; and generating a point data set when a geographic coordinate information field is detected;
4) uploading the processed data to the server for storage or for sharing with others.
2. The visual management method for big data in GIS software according to claim 1, wherein the data sources in step 1) include an HDFS data source and a MongoDB data source.
3. The visual management method for big data in GIS software according to claim 1, wherein in step 3), while the big data is being read, the user can batch-convert the data that meets the configured conditions into geospatial data based on a custom configuration.
4. The visual management method for big data in GIS software according to claim 1, wherein the data management of step 3) is a multitask operation, and the tasks currently in progress are displayed visually, including that the progress of a task can be viewed and that cancelling a task in progress is supported.
CN201611182291.6A 2016-12-20 2016-12-20 Visual management method for big data in GIS software Active CN106599241B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611182291.6A CN106599241B (en) 2016-12-20 2016-12-20 Visual management method for big data in GIS software

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611182291.6A CN106599241B (en) 2016-12-20 2016-12-20 Visual management method for big data in GIS software

Publications (2)

Publication Number Publication Date
CN106599241A CN106599241A (en) 2017-04-26
CN106599241B CN106599241B (en) 2020-06-30

Family

ID=58599585

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611182291.6A Active CN106599241B (en) 2016-12-20 2016-12-20 Visual management method for big data in GIS software

Country Status (1)

Country Link
CN (1) CN106599241B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108681450A (en) * 2018-05-08 2018-10-19 北京明朝万达科技股份有限公司 A kind of flow establishment dispositions method and system based on Activiti
CN112100123B (en) * 2020-08-31 2023-06-09 长江空间信息技术工程有限公司(武汉) Method for layering and displaying large-data-volume CAD (computer aided design) files at front end of Web
CN112559453A (en) * 2020-12-09 2021-03-26 恒安嘉新(北京)科技股份公司 Data storage method and device, electronic equipment and storage medium
CN112612864B (en) * 2020-12-28 2022-08-16 厦门市美亚柏科信息股份有限公司 Data visualization display method and terminal

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103065205A (en) * 2012-12-26 2013-04-24 深圳先进技术研究院 Three-dimensional intelligent transportation junction passenger flow time-space analysis and prediction system
KR20160025364A (en) * 2014-08-27 2016-03-08 한전케이디엔주식회사 Smart grids distribute network system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104007948B (en) * 2014-05-23 2017-06-13 广东威创视讯科技股份有限公司 Method and device based on the visualization display of three-dimension GIS mass data Distributed Calculation
CN104361110B (en) * 2014-12-01 2016-01-20 广东电网有限责任公司清远供电局 Magnanimity electricity consumption data analysis system and in real time calculating, data digging method
CN105512302A (en) * 2015-12-14 2016-04-20 浪潮软件股份有限公司 Distributed GIS (geographic information system) platform system data access and embedding method

Also Published As

Publication number Publication date
CN106599241A (en) 2017-04-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant