CN107066511A - Distributed time-series geographic information service system and method - Google Patents

Distributed time-series geographic information service system and method

Info

Publication number
CN107066511A
CN107066511A CN201710047723.0A CN201710047723A CN107066511A CN 107066511 A CN107066511 A CN 107066511A CN 201710047723 A CN201710047723 A CN 201710047723A CN 107066511 A CN107066511 A CN 107066511A
Authority
CN
China
Prior art keywords
data
row
length
index
stored
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710047723.0A
Other languages
Chinese (zh)
Inventor
龚杰
王希
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201710047723.0A
Publication of CN107066511A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21: Design, administration or maintenance of databases
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29: Geographical information databases
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Remote Sensing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a distributed time-series geographic information service system and method. The system includes a data acquisition module, a data storage module and a data query module. The data acquisition module collects data from distributed automatic weather stations by means of a message queue. The data storage module stores the data collected by the data acquisition module as time series in row-column form, using a distributed non-relational database. The data query module interacts with the data storage module independently through multiple distributed processes, providing a scalable query service. By encapsulating the complex implementation of data acquisition, storage and query within the system, the invention provides clients with a simple, highly available data service that is better suited to meteorological data.

Description

Distributed time-series geographic information service system and method
Technical Field
The present invention relates to the application of big data technology in meteorological science, and in particular to a distributed time-series geographic information service system and method.
Background
Because the number of automatic weather stations is itself very large and each station produces data every minute, the data volume grows ever larger. Maintaining data of this magnitude in a dedicated relational database is very costly. Some schemes solve the storage problem with a distributed approach, but they still require substantial research and development effort for the data samples. The prior-art scheme closest to this proposal addresses the query problems of traditional centralized, single-point data storage by providing a storage and retrieval method for massive meteorological data. Using the Hadoop platform, it builds a secondary index on the distributed non-relational database HBase and imports the data into a cloud platform through conversion and migration, achieving reliable storage and fast retrieval of massive data. It comprises the following steps: data filtering; defining the corresponding table format in HBase; building the secondary index; importing data case by case; and retrieving data case by case. That invention enables real-time data queries and avoids the high cost of conventional storage and maintenance of large volumes of data, so that massive meteorological data can be queried in real time more economically and efficiently while keeping sensitive data secure.
However, the prior-art scheme has the following defects:
1. The prior art only describes the theory of setting up HBase and a secondary index; it does not provide a specific table design or secondary index scheme for meteorological data.
2. The secondary index theory of the prior art is the general theory of HBase secondary indexes; it is not optimized for meteorological data and wastes storage space.
3. The prior art only proposes a storage and retrieval scheme; actually improving how meteorological data is used still requires a systematic, complete solution.
Summary of the Invention
The technical problem to be solved by the present invention is to provide a distributed time-series geographic information service system and method with which a dedicated system can be built for the storage, retrieval and analysis of massive automatic weather station data, providing storage, indexing and services for the monitoring data collected by geographically widely distributed automatic weather stations and making these data easier to access.
The technical solution of the present invention to the above technical problem is as follows: a distributed time-series geographic information service system, comprising a data acquisition module, a data storage module and a data query module, wherein
the data acquisition module is configured to collect data from distributed automatic weather stations by means of a message queue;
the data storage module is configured to store the data collected by the data acquisition module as time series in row-column form, using a distributed non-relational database;
the data query module is configured to interact with the data storage module independently through multiple distributed processes, providing a scalable query service.
The beneficial effects of the present invention are as follows. The distributed time-series geographic information service system encapsulates the complex implementation of data acquisition, storage and query, and provides clients with a simple, highly available data service that is better suited to meteorological data. Using a distributed non-relational database, a dedicated system can be built for the storage, retrieval and analysis of massive automatic weather station data; it provides storage, indexing and services for the monitoring data collected by geographically widely distributed automatic weather stations and makes these data easier to access. Meanwhile, the data query module consists of multiple stateless service processes running on different servers. The processes interact with the data storage module independently, cache independently and each provide their own HTTP access interface. When a large number of clients connect, a load-balancing server can route requests to spread the load across multiple query servers, giving the system a highly available, scalable query service. By calling the database service interface, clients can not only query data directly but also build applications on top of the automatic weather station data service, such as an automatic weather station network and an automatic weather station app.
On the basis of the above technical solution, the present invention can also be improved as follows.
Further, the system also includes a data analysis module, which is configured to analyze the data stored in the data storage module based on Spark, directly and/or using the analysis library TgisML.
The beneficial effect of this further scheme is that the data analysis module spares clients from reinventing the wheel and makes it convenient to analyze massive automatic weather station data.
Further, the analysis library TgisML is a data analysis library based on the Spark computing framework and the TGIS data model; it relies on Spark's machine learning library MLlib and is developed for the geographic information data structures stored in the TGIS database.
Further, the distributed non-relational database is specifically HBase. The HBase contains a data table named data; the data table data contains one column family t, and column family t contains a row key and multiple column qualifiers;
The row key of column family t of the data table data stores "administrative area code + observation time + station type + station code", where the "administrative area code" is 3 bytes long, the "observation time" is 4 bytes long, the "station type" is 1 byte long and the "station code" is 3 bytes long;
The column qualifiers of column family t of the data table data store "value type identifier + element name", where the value type identifier is 1 byte long, with the first 4 bits reserved and the last 4 bits storing the type and format of the value, and the element name is 3 bytes long.
The beneficial effect of this further scheme is as follows. Because the data table data uses the administrative area code as the leading part of the row key, it avoids the write hotspot on a small number of regions that monotonically increasing time-series keys would cause. The new row key design contains all the required information and takes less storage space than the row key of a data table designed with a conventional storage model. In addition, with shorter column names, the space taken by the key in each key-value pair is considerably reduced for the same number of rows in HBase.
Further, the "observation time" stored in the row key of column family t of the data table data is the time of the whole hour, and the minute data within that hour is compressed into the cells of the row for the corresponding hour.
The beneficial effect of this further scheme is that, because the minute data is packed into the single row of the hourly data, the number of data rows is greatly reduced.
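The hourly row layout can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names and the packed cell layout (a repeated pair of a 1-byte minute offset and a 2-byte signed value) are assumptions, since the source does not specify how the minute data is compressed inside the cell.

```python
import struct
from collections import defaultdict

SECONDS_PER_HOUR = 3600


def split_timestamp(ts):
    """Split a Unix timestamp (seconds) into (whole-hour base, minute offset)."""
    hour_base = ts - ts % SECONDS_PER_HOUR
    minute_offset = (ts - hour_base) // 60
    return hour_base, minute_offset


def pack_hour_cells(samples):
    """Group (timestamp, value) samples into one packed cell per whole hour.

    Assumed cell layout (hypothetical): repeated big-endian pairs of
    minute offset (uint8) and value (int16).
    """
    cells = defaultdict(bytearray)
    for ts, value in sorted(samples):
        hour_base, minute = split_timestamp(ts)
        cells[hour_base] += struct.pack(">Bh", minute, value)
    return {hour: bytes(cell) for hour, cell in cells.items()}
```

With this layout, sixty one-minute readings share a single row key instead of producing sixty rows.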
Further, the HBase also contains an information table info, which maps to and from the data table data. The information table info contains two column families: column family id and column family name;
Column family id of the information table info contains a row key and column qualifiers. The row key of column family id stores "station type + station code", where the "station type" is 1 byte long and the "station code" is 3 bytes long. The column qualifiers of column family id store "station name + longitude and latitude values + altitude + administrative area code";
Column family name of the information table info contains a row key and a column qualifier. The row key of column family name stores the "station name", and the column qualifier of column family name stores the "station code".
The beneficial effect of this further scheme is that the information table info stores separately the station name, longitude and latitude values, altitude, administrative area code and similar information that would otherwise be repeated in every row, reducing duplicate storage. By writing a filter, matching station names and station codes can easily be searched for automatically from an input character string.
Further, the HBase also contains a time-series query index table data_ts_index, which contains a row key. The row key of the time-series query index table data_ts_index stores "station type + station code + observation time", where the "station type" is 1 byte long, the "station code" is 3 bytes long and the "observation time" is 4 bytes long.
The beneficial effect of this further scheme is as follows. Besides the conventional query dimension, "query the data of a certain class of stations within a certain area at a certain time", the time-series query index table data_ts_index supports another query dimension: "query the data of a certain station over a certain period", that is, querying a station's time-series data. When scanning, explicit start and stop positions can be specified, which shortens time-series queries.
Further, the HBase also contains an alarm query index table data-alert-index, which contains 1 row key and 1 column family f;
The row key of the alarm query index table data-alert-index stores "administrative area code + observation time + alarm element name + station type + station code", where the "administrative area code" is 3 bytes long, the "observation time" is 4 bytes long, the "alarm element name" is 3 bytes long, the "station type" is 1 byte long and the "station code" is 3 bytes long;
Column family f of the alarm query index table data-alert-index has only 1 column qualifier v, which represents the alarm value of the alarm element.
The beneficial effect of this further scheme is that the alarm query index table data-alert-index is a query index table built for checking whether "the value of a certain element, for a certain class of stations within a certain area at a certain time" exceeds a given warning threshold.
Further, the HBase also contains a geographic location query index table data-geo-index, which contains a row key. The row key stores the Geohash code obtained by applying the spatial index algorithm Geohash to the longitude and latitude of each automatic weather station.
The beneficial effect of this further scheme is that the geographic location query index table data-geo-index allows data to be partitioned and queried by geographic location, which broadens the available query methods.
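For readers unfamiliar with Geohash, the following minimal sketch shows the standard encoding: longitude and latitude ranges are bisected alternately, and every 5 bits become one base-32 character. The function name and the default precision are assumptions; the source does not state which code length the index table uses.

```python
# Standard Geohash base-32 alphabet (digits plus letters without a, i, l, o).
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=8):
    """Encode a latitude/longitude pair as a Geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # Geohash starts with a longitude bit
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)
```

Because nearby stations share a code prefix, a row-key prefix scan over such codes returns stations in the same geographic cell.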
Based on the above distributed time-series geographic information service system, the present invention also provides a distributed time-series geographic information service method.
A distributed time-series geographic information service method comprises the following steps:
S1, the data acquisition module collects data from distributed automatic weather stations by means of a message queue;
S2, the data storage module stores the data collected by the data acquisition module as time series in row-column form, using a distributed non-relational database;
S3, the data query module interacts with the data storage module independently through multiple distributed processes, providing a scalable query service.
The beneficial effects of the present invention are as follows. The distributed time-series geographic information service method encapsulates the complex implementation of data acquisition, storage and query, and provides clients with a simple, highly available data service that is better suited to meteorological data. Using a distributed non-relational database, a dedicated system can be built for the storage, retrieval and analysis of massive automatic weather station data; it provides storage, indexing and services for the monitoring data collected by geographically widely distributed automatic weather stations and makes these data easier to access. Meanwhile, the data query module consists of multiple stateless service processes running on different servers. The processes interact with the data storage module independently, cache independently and each provide their own HTTP access interface. When a large number of clients connect, a load-balancing server can route requests to spread the load across multiple query servers, giving the system a highly available, scalable query service. By calling the database service interface, clients can not only query data directly but also build applications on top of the automatic weather station data service, such as an automatic weather station network and an automatic weather station app.
Brief Description of the Drawings
Fig. 1 is a structural block diagram of a distributed time-series geographic information service system of the present invention;
Fig. 2 is a flow chart of a distributed time-series geographic information service method of the present invention.
Detailed Description
The principles and features of the present invention are described below with reference to the accompanying drawings. The examples given serve only to explain the present invention and are not intended to limit its scope.
As shown in Fig. 1, a distributed time-series geographic information service system includes a data acquisition module, a data storage module and a data query module. The data acquisition module is configured to collect data from distributed automatic weather stations by means of a message queue. The data storage module is configured to store the data collected by the data acquisition module as time series in row-column form, using a distributed non-relational database. The data query module is configured to interact with the data storage module independently through multiple distributed processes, providing a scalable query service.
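The message-queue style of collection can be illustrated with a small in-process sketch. It uses Python's standard queue module as a stand-in for whatever message queue the real system uses (the source does not name one); the function names, the message fields and the list used as a storage sink are all hypothetical.

```python
import queue
import threading


def station_producer(q, station_code, readings):
    """Simulate one automatic weather station publishing its readings."""
    for obs_time, value in readings:
        q.put({"station": station_code, "time": obs_time, "value": value})


def storage_consumer(q, sink):
    """Drain the queue into the sink until the None sentinel arrives."""
    while True:
        msg = q.get()
        if msg is None:
            break
        sink.append(msg)


def collect(stations):
    """Run one producer thread per station and a single storage consumer."""
    q = queue.Queue()
    sink = []
    consumer = threading.Thread(target=storage_consumer, args=(q, sink))
    consumer.start()
    producers = [
        threading.Thread(target=station_producer, args=(q, code, readings))
        for code, readings in stations.items()
    ]
    for p in producers:
        p.start()
    for p in producers:
        p.join()
    q.put(None)  # all stations are done; tell the consumer to stop
    consumer.join()
    return sink
```

The queue decouples the stations from the storage side, so a slow write does not block the stations that are still reporting.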
The system of the present invention also includes a data analysis module, which is configured to analyze the data stored in the data storage module based on Spark, directly and/or using the analysis library TgisML. The analysis library TgisML is a data analysis library based on the Spark computing framework and the TGIS data model; it relies on Spark's machine learning library MLlib and is developed for the geographic information data structures stored in the TGIS database. It encapsulates the additional data extraction and conversion operations required before running machine learning algorithms and provides concise analysis interfaces for common data analysis tasks. The data analysis module spares clients from reinventing the wheel and makes it convenient to analyze massive automatic weather station data.
The distributed time-series geographic information service system of the present invention (also referred to as TGIS) encapsulates the complex implementation of data acquisition, storage and query, and provides clients with a simple, highly available data service that is better suited to meteorological data. Using a distributed non-relational database, a dedicated system can be built for the storage, retrieval and analysis of massive automatic weather station data; it provides storage, indexing and services for the monitoring data collected by geographically widely distributed automatic weather stations and makes these data easier to access. Meanwhile, the data query module consists of multiple stateless service processes (tgis processes) running on different servers. The processes interact with the data storage module independently, cache independently and each provide their own HTTP access interface. When a large number of clients connect, a load-balancing server can route requests to spread the load across multiple query servers, giving the system a highly available, scalable query service. By calling the database service interface, clients can not only query data directly but also build applications on top of the automatic weather station data service, such as an automatic weather station network and an automatic weather station app.
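How stateless query processes behind a load balancer share the load can be sketched as below. This is an illustrative model only: the class names, the round-robin policy and the dictionary used as shared storage are assumptions, and the HTTP layer is omitted.

```python
import itertools


class QueryProcess:
    """One query process: reads shared storage and keeps only its own cache."""

    def __init__(self, name, storage):
        self.name = name
        self._storage = storage  # shared backing store (a dict in this sketch)
        self._cache = {}         # independent per-process cache

    def handle(self, row_key):
        if row_key not in self._cache:
            self._cache[row_key] = self._storage.get(row_key)
        return self.name, self._cache[row_key]


class RoundRobinBalancer:
    """Route each request to the next query process in turn."""

    def __init__(self, processes):
        self._cycle = itertools.cycle(processes)

    def route(self, row_key):
        return next(self._cycle).handle(row_key)
```

Because no process holds session state, any process can answer any request, and capacity grows by adding processes to the balancer.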
Specifically:
The distributed non-relational database is HBase, which contains the data table data, the information table info, the time-series query index table data_ts_index, the alarm query index table data-alert-index and the geographic location query index table data-geo-index. Before these tables are introduced, two key HBase concepts are described: the row key and the column family. The row key (Row Key) is the only way HBase builds an index. Rows are stored sorted by the byte order of their row keys, and all access to a table goes through the row key (get: access by a single row key; scan: access by a row key range, or a full table scan). Column families (Column Family) must be specified when the table is defined. Each column family can have one or more column qualifiers (Column Qualifier); column qualifiers need not be specified when the table is defined and can be added dynamically. Data is stored separately per column family.
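The two access paths described above, get by a single row key and scan over a row key range, can be modeled with a small in-memory structure. This is a teaching sketch and not the HBase client API; the class and method names are hypothetical.

```python
import bisect


class SortedRowStore:
    """In-memory stand-in for a table whose rows are ordered by row-key bytes."""

    def __init__(self):
        self._keys = []  # row keys kept in ascending byte order
        self._rows = {}  # row key -> {column qualifier: value}

    def put(self, row_key, columns):
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        # Qualifiers are not declared up front; new ones are simply added.
        self._rows[row_key].update(columns)

    def get(self, row_key):
        """Access a single row by its exact row key."""
        return self._rows.get(row_key)

    def scan(self, start=b"", stop=None):
        """Yield rows with start <= row key < stop, in byte order."""
        i = bisect.bisect_left(self._keys, start)
        while i < len(self._keys) and (stop is None or self._keys[i] < stop):
            key = self._keys[i]
            yield key, self._rows[key]
            i += 1
```

Since a range scan only touches keys between the start and stop positions, the order of the fields inside the row key decides which queries are cheap.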
The table designs in HBase are described in detail below:
The data table data contains a single column family t (having only one column family avoids the unnecessary I/O caused by queries that span column families in the original design). Column family t contains a row key and multiple column qualifiers;
The row key of column family t of the data table data stores "administrative area code + observation time + station type + station code", where the "administrative area code" is 3 bytes long, the "observation time" is 4 bytes long, the "station type" is 1 byte long and the "station code" is 3 bytes long;
The row key of column family t of the data table data is designed as follows (11 bytes, hexadecimal):
06 69 0F 58 2B CB B0 51 00 E0 18
In the row key of column family t of the data table data, bytes 1 to 3 store the complete "administrative area code". For example, the stored hexadecimal 06 69 0F converts to decimal 420111, which denotes Hongshan District, Wuhan City, Hubei Province. Storing the Chinese text for "Hongshan District, Wuhan City, Hubei Province" would need 18 bytes (GBK encoding), whereas storing the "administrative area code" as a number needs only 3 bytes. Bytes 4 to 7 store the "observation time" as a Linux timestamp. Hexadecimal 58 2B CB B0 converts to decimal 1479265200, which denotes "Wed, 16 Nov 2016 03:00:00 GMT", that is, 11:00:00 on 16 November 2016 Beijing time. This saves storage space compared with the original time string "2016-11-16 11:00:00" (22 bytes) or the reversed time "16110000201611" (14 bytes). Specifically, the time stored here is the time of the whole hour, and the minute data within that hour is compressed into a single cell; because the minute data is packed into the single row of the hourly data, the number of data rows is greatly reduced. Byte 8 stores the "station type" of the automatic weather station; here 0x51 is the character Q, which denotes a regional automatic weather station (national stations use the space character, 0x20). Bytes 9 to 11 store the 5-digit "station code" of the automatic weather station; here 00 E0 18 converts to decimal 57368, denoting "station code" 57368.
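The 11-byte row key above can be reproduced with a few lines of code. The sketch below assumes big-endian unsigned integers for the numeric fields, which is what the worked example implies (06 69 0F for 420111); the function names are illustrative.

```python
import struct


def encode_data_row_key(area_code, obs_time, station_type, station_code):
    """Build the 11-byte key: area (3) + time (4) + station type (1) + code (3)."""
    if len(station_type) != 1:
        raise ValueError("station type must be a single character")
    return (
        area_code.to_bytes(3, "big")
        + struct.pack(">I", obs_time)
        + station_type.encode("ascii")
        + station_code.to_bytes(3, "big")
    )


def decode_data_row_key(key):
    """Split an 11-byte key back into its four fields."""
    if len(key) != 11:
        raise ValueError("data row key must be 11 bytes")
    area_code = int.from_bytes(key[0:3], "big")
    (obs_time,) = struct.unpack(">I", key[3:7])
    station_type = key[7:8].decode("ascii")
    station_code = int.from_bytes(key[8:11], "big")
    return area_code, obs_time, station_type, station_code
```

Encoding the values from the worked example (420111, 1479265200, Q, 57368) yields the bytes 06 69 0F 58 2B CB B0 51 00 E0 18.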
The column qualifiers of column family t of the data table data store "value type identifier + element name". The value type identifier is 1 byte (8 bits) long, of which the first 4 bits are reserved and the last 4 bits store the type and format of the value. Counting from the start of these last 4 bits, the first bit is a flag that shows whether the value is an integer or a floating-point number: 0 means integer and 1 means floating point. The remaining 3 bits give the length of the data, offset by 1: 000 means 1 byte and 001 means 2 bytes. The length must be one of 1, 2, 4 or 8; any other value indicates an error. The element name is 3 bytes long; compressing the element name to 3 bytes reduces the space wasted by repeatedly storing long column qualifiers.
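The value type identifier can be packed and unpacked with simple bit operations. In this sketch the function names are hypothetical, and the 3-byte element name is assumed to be ASCII text such as P10; the source does not state how element names are encoded.

```python
VALID_LENGTHS = (1, 2, 4, 8)


def encode_value_type(is_float, length):
    """Pack the flag byte: high 4 bits reserved (0), then 1 float bit, then length - 1."""
    if length not in VALID_LENGTHS:
        raise ValueError("length must be 1, 2, 4 or 8")
    return (int(is_float) << 3) | (length - 1)


def decode_value_type(flag):
    """Return (is_float, length) from the low 4 bits of the flag byte."""
    low = flag & 0x0F
    is_float = bool(low & 0x08)
    length = (low & 0x07) + 1
    if length not in VALID_LENGTHS:
        raise ValueError("invalid length in value type identifier")
    return is_float, length


def encode_qualifier(is_float, length, element_name):
    """Build the 4-byte qualifier: value type identifier + 3-byte element name."""
    name = element_name.encode("ascii")  # assumption: ASCII element names
    if len(name) != 3:
        raise ValueError("element name must be exactly 3 bytes")
    return bytes([encode_value_type(is_float, length)]) + name
```

For example, a 4-byte floating-point value gets the low bits 1011: the flag bit 1 followed by 011, which is length 4 minus 1.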
Because the data table data uses the administrative area code as the leading part of the row key, it avoids the write hotspot on a small number of regions that monotonically increasing time-series keys would cause. The new row key design contains all the required information and takes less storage space than the row key of a data table designed with a conventional storage model. In addition, with shorter column names, the space taken by the key in each key-value pair is considerably reduced for the same number of rows in HBase.
The HBase also contains an information table info, which maps to and from the data table data (specifically, the station information stored in the information table info and the "station code" stored in the row key of the data table data map to each other). The information table info contains two column families: column family id and column family name. Column family id and column family name are independent of each other in queries. Because the table is a mutual mapping between "station code" and station information, the two column families have the same number of rows, so one column family cannot be affected by excessive growth in the row count of the other.
Column family id of the information table info contains a row key and column qualifiers. The row key of column family id stores "station type + station code", where the "station type" is 1 byte long and the "station code" is 3 bytes long. The column qualifiers of column family id store "station name + longitude and latitude values + altitude + administrative area code";
The row key of column family id of the information table info is designed as follows (4 bytes, hexadecimal):
51 00 E0 18
In the row key of column family id of the information table info, byte 1 stores the "station type" of the automatic weather station; here 0x51 is the character Q, which denotes a regional automatic weather station (national stations use the space character, 0x20). Bytes 2 to 4 store the 5-digit "station code" of the automatic weather station; here 00 E0 18 converts to decimal 57368, denoting "station code" 57368.
Column family name of the information table info contains a row key and a column qualifier. The row key of column family name stores the "station name", and the column qualifier of column family name stores the "station code".
The information table info stores separately the station name, longitude and latitude values, altitude, administrative area code and similar information that would otherwise be repeated in every row, reducing duplicate storage. By writing a filter, matching station names and station codes can easily be searched for automatically from an input character string.
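The two-way mapping held by the information table can be pictured with two dictionaries, one per column family. This is an in-memory sketch only: the class, its methods and the station names used with it are hypothetical, and the substring match merely stands in for the HBase filter mentioned above.

```python
class StationDirectory:
    """Two-way station lookup mirroring column families id and name."""

    def __init__(self):
        self._by_id = {}    # row key (station type + code) -> station attributes
        self._by_name = {}  # station name -> station code

    @staticmethod
    def row_key(station_type, station_code):
        """4-byte key: station type (1 byte) + station code (3 bytes, big-endian)."""
        return station_type.encode("ascii") + station_code.to_bytes(3, "big")

    def add(self, station_type, station_code, name, lon, lat, altitude, area_code):
        self._by_id[self.row_key(station_type, station_code)] = {
            "name": name, "lon": lon, "lat": lat,
            "altitude": altitude, "area_code": area_code,
        }
        self._by_name[name] = station_code

    def by_id(self, station_type, station_code):
        return self._by_id.get(self.row_key(station_type, station_code))

    def search(self, text):
        """Return sorted (name, code) pairs whose name or code contains text."""
        return sorted(
            (name, code) for name, code in self._by_name.items()
            if text in name or text in str(code)
        )
```

A single input string can therefore match either a station name or a station code, which is the behavior the filter is meant to provide.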
Besides the conventional query dimension, "query the data of a certain class of stations within a certain area at a certain time", another query dimension is also very important: "query the data of a certain station over a certain period", that is, querying a station's time-series data. Because the HBase row key is the only means of indexing, the only way to build an index is to create a new table whose row key describes the other dimension; it is therefore necessary to create the time-series query index table data_ts_index. The content of the new table can be maintained by a custom coprocessor that copies a row's data to the new table whenever a row is added in HBase. Alternatively, the client can insert the data into both tables itself, but using a coprocessor reduces the I/O between client and server.
The time-series query index table data_ts_index contains a row key, which stores "station type + station code + observation time", where the "station type" is 1 byte long, the "station code" is 3 bytes long and the "observation time" is 4 bytes long.
The row key of the time-series query index table data_ts_index is designed as follows (8 bytes, hexadecimal):
51 00 E0 18 58 2B CB B0
In the row key of the time-series query index table data_ts_index, byte 1 stores the "station type" of the automatic weather station; here hexadecimal 51 is the character Q, which denotes a regional weather station (national stations use the space character, hexadecimal 20). Bytes 2 to 4 store the 5-digit "station code" of the automatic weather station; here hexadecimal 00 E0 18 converts to decimal 57368, denoting "station code" 57368. Bytes 5 to 8 store the "observation time" as a Linux timestamp; here hexadecimal 58 2B CB B0 converts to decimal 1479265200, which denotes "Wed, 16 Nov 2016 03:00:00 GMT", that is, 11:00:00 on 16 November 2016 Beijing time.
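The index row key is a reordering of fields that already appear in the data table row key, so it can be derived without re-reading any station metadata. The sketch below shows that derivation and a dual write into two dictionaries standing in for the two tables. What the index row stores as its value is not stated in the source; storing the original data row key is an assumption, and both function names are hypothetical.

```python
def data_key_to_ts_index_key(data_key):
    """Reorder an 11-byte data key into the 8-byte time-series index key.

    Data key layout:  area (3) + time (4) + station type (1) + station code (3)
    Index key layout: station type (1) + station code (3) + time (4)
    """
    if len(data_key) != 11:
        raise ValueError("data row key must be 11 bytes")
    obs_time = data_key[3:7]
    station = data_key[7:11]
    return station + obs_time


def mirror_put(data_table, index_table, data_key, cells):
    """Write a row to the data table and mirror it into the index table.

    Assumption: the index row stores the data row key it points to.
    """
    data_table[data_key] = cells
    index_table[data_key_to_ts_index_key(data_key)] = data_key
```

A coprocessor would run the equivalent of mirror_put on the server side, which is why it saves a second round trip from the client.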
Because HBase sorts rows by row key, records of the same type and the same station automatically fall into contiguous storage locations. When scanning data, explicit start and stop positions can therefore be specified, which shortens the time series query time. In addition, because the data are contiguous, the need for faster grouped summation of an element's values at a station over a period of time is also met (a coprocessor is defined to replace the RegionScanner with a scanner that performs grouped summation, and the results are merged at the client).
In summary, in addition to the common query of "the data of a certain class of stations, within a certain area, at a certain moment", the time series query index table data_ts_index supports the second query dimension, "the data of a given station over a certain period of time", i.e. the time series data of a station, and a scan over it can be bounded by explicit start and stop positions.
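The effect of sorted row keys on a range scan can be illustrated with a small sketch. The sorted Python list below is a hypothetical stand-in for a table ordered by row key, and the names ts_key and scan_bounds are illustrative; the half-open range (start inclusive, stop exclusive) is assumed here to mirror the usual scan convention.

```python
import bisect
import struct


def ts_key(station_type: str, station_code: int, observed_at: int) -> bytes:
    """8-byte key: station type (1) + station code (3) + observation time (4)."""
    return (
        station_type.encode("ascii")
        + station_code.to_bytes(3, "big")
        + struct.pack(">I", observed_at)
    )


def scan_bounds(station_type, station_code, t_start, t_end):
    """Return (start_row, stop_row) for one station over [t_start, t_end)."""
    return (
        ts_key(station_type, station_code, t_start),
        ts_key(station_type, station_code, t_end),
    )


# Hypothetical table contents: two stations, five hourly records each.
HOUR = 3600
T0 = 1479265200
rows = sorted(
    ts_key("Q", code, T0 + i * HOUR)
    for code in (57368, 57369)
    for i in range(5)
)

# Locate the contiguous slice for station 57368 between hour 1 and hour 4.
start, stop = scan_bounds("Q", 57368, T0 + HOUR, T0 + 4 * HOUR)
lo = bisect.bisect_left(rows, start)
hi = bisect.bisect_left(rows, stop)
hits = rows[lo:hi]
print(len(hits))  # 3
```

Only the rows between the two bounds are touched; the records of the other station are never visited.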
The HBase also includes an alarm query index table data-alert-index, which includes 1 row key and 1 column family f.
The row key of the alarm query index table data-alert-index stores "administrative area code + observation time + alarm element name + station type + station code". The length of the "administrative area code" is 3 bytes, the length of the "observation time" is 4 bytes, the length of the "alarm element name" is 3 bytes, the length of the "station type" is 1 byte, and the length of the "station code" is 3 bytes.
The row key of the alarm query index table data-alert-index is designed as follows (14 bytes, hexadecimal):
06 69 0F 58 2B CB B0 P 1 0 51 00 E0 18
In the row key of the alarm query index table data-alert-index, the 1st to 3rd bytes store the complete "administrative area code", here "06 69 0F"; the 4th to 7th bytes store the "observation time", here "58 2B CB B0"; the 8th to 10th bytes store the "alarm element name", here the characters "P 1 0"; the 11th byte stores the "station type" of the automatic weather station, here "51"; and the 12th to 14th bytes store, in 3 bytes, the 5-digit "station code" of the automatic weather station, here "00 E0 18".
Column family f of the alarm query index table data-alert-index has only 1 column qualifier, v, which represents the alarm value of the alarm element.
The alarm query index table data-alert-index is a query index table built to answer whether "a certain element value of a certain class of stations, within a certain area, at a certain moment" exceeds a given warning value. With it, filtering on the row key alone yields all stations of a certain class, within a certain area, at a certain moment, whose value for a certain element exceeds the warning value, together with the values that triggered the alarm.
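As a hedged illustration of the 14-byte layout, the sketch below packs the five fields in the stated order. The function name alert_index_row_key is hypothetical, and ASCII encoding of the alarm element name and big-endian integers are assumptions consistent with the example bytes.

```python
import struct


def alert_index_row_key(area_code: int, observed_at: int, element: str,
                        station_type: str, station_code: int) -> bytes:
    """Pack area code (3) + time (4) + element name (3) + type (1) + station code (3)."""
    if len(element) != 3 or len(station_type) != 1:
        raise ValueError("element name must be 3 characters, station type 1 character")
    return (
        area_code.to_bytes(3, "big")        # bytes 1-3
        + struct.pack(">I", observed_at)    # bytes 4-7, Unix timestamp
        + element.encode("ascii")           # bytes 8-10, e.g. 'P10'
        + station_type.encode("ascii")      # byte 11, e.g. 'Q' -> 0x51
        + station_code.to_bytes(3, "big")   # bytes 12-14
    )


# Values taken from the example row key in the text (area code given in hex).
key = alert_index_row_key(0x06690F, 1479265200, "P10", "Q", 57368)
print(len(key))      # 14
print(key.hex(" "))  # 06 69 0f 58 2b cb b0 50 31 30 51 00 e0 18
```

The characters P, 1 and 0 appear in the packed key as their ASCII byte values 50, 31 and 30.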
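Row-key filtering for the alarm query can be sketched as a prefix match followed by an optional check on the station type byte. The dictionary below is a hypothetical in-memory stand-in for data-alert-index (row key mapped to the value of qualifier v); the function names and the sample alarm values are invented for illustration only.

```python
import struct


def alert_key(area_code, observed_at, element, station_type, station_code):
    return (
        area_code.to_bytes(3, "big")
        + struct.pack(">I", observed_at)
        + element.encode("ascii")
        + station_type.encode("ascii")
        + station_code.to_bytes(3, "big")
    )


def query_alerts(table, area_code, observed_at, element, station_type=None):
    """Return (station type, station code, alarm value) for matching row keys."""
    # The first 10 bytes (area + time + element) form the scan prefix.
    prefix = area_code.to_bytes(3, "big") + struct.pack(">I", observed_at) + element.encode("ascii")
    results = []
    for key in sorted(table):
        if not key.startswith(prefix):
            continue
        s_type = key[10:11].decode("ascii")
        if station_type is not None and s_type != station_type:
            continue
        results.append((s_type, int.from_bytes(key[11:14], "big"), table[key]))
    return results


# Hypothetical alarm entries; the numeric alarm values are made up.
T = 1479265200
table = {
    alert_key(0x06690F, T, "P10", "Q", 57368): 12.5,
    alert_key(0x06690F, T, "P10", " ", 57494): 15.0,
    alert_key(0x06690F, T + 3600, "P10", "Q", 57368): 11.0,
    alert_key(0x066910, T, "P10", "Q", 57370): 20.0,
}

print(query_alerts(table, 0x06690F, T, "P10", station_type="Q"))  # [('Q', 57368, 12.5)]
```

Because area code, time and element name lead the key, all matching entries share one prefix and sit next to each other in sorted order.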
In some cases stations do not need to be classified by administrative area. Division by administrative area is useful for units such as administrative bodies, but Internet applications aimed at the general public divide by geographical position. For location-related analysis, such as KNN nearest-neighbour queries, it is even more important to represent the relationship between stations in terms of geographical position. The spatial index algorithm Geohash can therefore be introduced to compute the Geohash code corresponding to the longitude and latitude of each station, and the Geohash code is stored in the row key in place of the administrative area code to implement a spatial index. That is, the geographical position queries described above can be served by creating a geographical position query index table data-geo-index.
The geographical position query index table data-geo-index includes a row key, which stores the Geohash code computed by the spatial index algorithm Geohash from the longitude and latitude of each automatic weather station. The geographical position query index table data-geo-index allows queries to be partitioned by geographical position, which broadens the available query methods.
Based on the distributed time series geographic information service system described above, the present invention also provides a distributed time series geographic information service method.
As shown in Fig. 2, a distributed time series geographic information service method comprises the following steps:
S1, the data acquisition module collects data from the distributed automatic weather stations by means of a message queue;
S2, the data storage module uses a distributed non-relational database to store the data collected by the data acquisition module in columnar form as time series;
S3, the data query module interacts with the data storage module independently through multiple distributed processes, providing a scalable query service.
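Steps S1 to S3 can be pictured with a toy, single-process sketch. It is only an analogy under stated assumptions: queue.Queue stands in for the message queue, a dictionary of column maps stands in for the distributed non-relational database, and query is a stateless function standing in for one query process. All names and sample values are hypothetical.

```python
import queue
from collections import defaultdict

# S1: observations arrive through a message queue.
message_queue = queue.Queue()


def collect(observation: dict) -> None:
    message_queue.put(observation)


# S2: row key -> {column qualifier: value}, a stand-in for columnar storage.
store = defaultdict(dict)


def drain_into_store() -> None:
    while not message_queue.empty():
        obs = message_queue.get()
        row_key = (obs["station"], obs["time"])
        store[row_key][obs["element"]] = obs["value"]


# S3: a stateless query; any number of such workers could serve requests.
def query(station: int, t_start: int, t_end: int, element: str):
    return [
        (t, cols[element])
        for (s, t), cols in sorted(store.items())
        if s == station and t_start <= t < t_end and element in cols
    ]


T0 = 1479265200
for i in range(3):
    collect({"station": 57368, "time": T0 + i * 3600, "element": "TEM", "value": 10.0 + i})
drain_into_store()
print(query(57368, T0, T0 + 7200, "TEM"))  # [(1479265200, 10.0), (1479268800, 11.0)]
```

Because query keeps no state between calls, identical copies can run on several servers behind a load balancer, which is the property the method relies on for scalability.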
The distributed time series geographic information service method of the present invention encapsulates the complex implementation of data acquisition, storage and query inside the system and offers clients a simple, highly available data service that is well suited to meteorological data. Using a distributed non-relational database, a dedicated system can be built for the storage, retrieval and analysis of massive automatic weather station data; it provides storage and indexing services for the monitoring data collected by automatic weather stations that are widely distributed geographically, and makes these data easy to access. Meanwhile, the data query module consists of multiple stateless service processes running on different servers. Each process interacts with the data storage module independently, caches independently, and provides its own HTTP access interface. When a large number of clients access the system, a load-balancing server can route requests to multiple query servers to share the load, thereby providing a highly available and scalable query service. By calling the database service interface, data can not only be queried directly, but applications such as an automatic weather station website and an automatic weather station app can also be built on top of the automatic weather station data service.
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the invention. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A distributed time series geographic information service system, characterized in that it includes a data acquisition module, a data storage module and a data query module, wherein:
the data acquisition module is used to collect data from the distributed automatic weather stations by means of a message queue;
the data storage module is used to store, with a distributed non-relational database, the data collected by the data acquisition module in columnar form as time series;
the data query module is used to interact with the data storage module independently through multiple distributed processes, providing a scalable query service.
2. The distributed time series geographic information service system according to claim 1, characterized in that it further includes a data analysis module, which is used to analyse the data stored in the data storage module directly on the basis of Spark and/or by using the analysis library TgisML.
3. The distributed time series geographic information service system according to claim 2, characterized in that the analysis library TgisML is a data analysis library that is based on the Spark computing framework and the TGIS data model, relies on Spark's machine learning library MLlib, and is developed for the geographic information data structures stored in the TGIS database.
4. The distributed time series geographic information service system according to any one of claims 1 to 3, characterized in that the distributed non-relational database is HBase; the HBase includes a data table data; the data table data includes a column family t; and the column family t of the data table data includes a row key and multiple column qualifiers;
the row key of the column family t of the data table data stores "administrative area code + observation time + station type + station code", wherein the length of the "administrative area code" is 3 bytes, the length of the "observation time" is 4 bytes, the length of the "station type" is 1 byte, and the length of the "station code" is 3 bytes;
each column qualifier of the column family t of the data table data stores "value type mark + element name", wherein the length of the value type mark is 1 byte, of which the first 4 bits are reserved and the last 4 bits store the type and format of the value, and the length of the element name is 3 bytes.
5. The distributed time series geographic information service system according to claim 4, characterized in that the "observation time" stored in the row key of the column family t of the data table data is an on-the-hour time, and the minute-level data within an hour are compressed and stored in the cell corresponding to that hour.
6. The distributed time series geographic information service system according to claim 5, characterized in that the HBase further includes an information table info, which is mapped to the data table data and includes two column families, namely column family id and column family name;
the column family id of the information table info includes a row key and a column qualifier; the row key stores "station type + station code", wherein the length of the "station type" is 1 byte and the length of the "station code" is 3 bytes; and the column qualifier stores "station name + longitude and latitude values + altitude + administrative area code";
the column family name of the information table info includes a row key and a column qualifier; the row key stores the "station name", and the column qualifier stores the "station code".
7. The distributed time series geographic information service system according to claim 6, characterized in that the HBase further includes a time series query index table data_ts_index, which includes a row key; the row key of the time series query index table data_ts_index stores "station type + station code + observation time", wherein the length of the "station type" is 1 byte, the length of the "station code" is 3 bytes, and the length of the "observation time" is 4 bytes.
8. The distributed time series geographic information service system according to claim 7, characterized in that the HBase further includes an alarm query index table data-alert-index, which includes 1 row key and 1 column family f;
the row key of the alarm query index table data-alert-index stores "administrative area code + observation time + alarm element name + station type + station code", wherein the length of the "administrative area code" is 3 bytes, the length of the "observation time" is 4 bytes, the length of the "alarm element name" is 3 bytes, the length of the "station type" is 1 byte, and the length of the "station code" is 3 bytes;
the column family f of the alarm query index table data-alert-index has only 1 column qualifier, v, which represents the alarm value of the alarm element.
9. The distributed time series geographic information service system according to claim 8, characterized in that the HBase further includes a geographical position query index table data-geo-index, which includes a row key; the row key of the geographical position query index table data-geo-index stores the Geohash code computed by the spatial index algorithm Geohash from the longitude and latitude of each automatic weather station.
10. A distributed time series geographic information service method, characterized in that it comprises the following steps:
S1, the data acquisition module collects data from the distributed automatic weather stations by means of a message queue;
S2, the data storage module uses a distributed non-relational database to store the data collected by the data acquisition module in columnar form as time series;
S3, the data query module interacts with the data storage module independently through multiple distributed processes, providing a scalable query service.
CN201710047723.0A 2017-01-20 2017-01-20 A kind of Distributed Time sequence service system of gis and method Pending CN107066511A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710047723.0A CN107066511A (en) 2017-01-20 2017-01-20 A kind of Distributed Time sequence service system of gis and method

Publications (1)

Publication Number Publication Date
CN107066511A true CN107066511A (en) 2017-08-18

Family

ID=59598097

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710047723.0A Pending CN107066511A (en) 2017-01-20 2017-01-20 A kind of Distributed Time sequence service system of gis and method

Country Status (1)

Country Link
CN (1) CN107066511A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108052566A (en) * 2017-12-06 2018-05-18 广东建邦计算机软件股份有限公司 City element information processing method, device, server and storage medium
CN108763323A (en) * 2018-05-03 2018-11-06 华风象辑(北京)气象科技有限公司 Meteorological lattice point file application process based on resource set and big data technology
CN109657018A (en) * 2018-11-13 2019-04-19 平安科技(深圳)有限公司 A kind of distribution vehicle operation data querying method and terminal device
CN109951313A (en) * 2019-01-18 2019-06-28 长江大学 A kind of monitoring device and method of Hadoop cloud platform
CN110147377A (en) * 2019-05-29 2019-08-20 大连大学 General polling algorithm based on secondary index under extensive spatial data environment
CN111090794A (en) * 2019-11-07 2020-05-01 远景智能国际私人投资有限公司 Meteorological data query method, device and storage medium
CN111339229A (en) * 2020-02-24 2020-06-26 交通运输部水运科学研究所 Ship autonomous navigation aid decision-making system
CN111352956A (en) * 2020-02-24 2020-06-30 交通运输部水运科学研究所 Acquisition and storage system for shipping big data
CN111581220A (en) * 2020-05-28 2020-08-25 泰康保险集团股份有限公司 Storage and retrieval method, device, equipment and storage medium for time series data

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101853290A (en) * 2010-05-25 2010-10-06 南京信息工程大学 Meteorological service performance evaluation method based on geographical information system (GIS)
CN104008212A (en) * 2014-06-23 2014-08-27 中国科学院重庆绿色智能技术研究院 Method for storing IOT time series data related to geographical location information
CN104063478A (en) * 2014-07-02 2014-09-24 董可 Specific region population consumption level sensing system and method based on mobile devices

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108052566A (en) * 2017-12-06 2018-05-18 广东建邦计算机软件股份有限公司 City element information processing method, device, server and storage medium
CN108763323A (en) * 2018-05-03 2018-11-06 华风象辑(北京)气象科技有限公司 Meteorological lattice point file application process based on resource set and big data technology
CN108763323B (en) * 2018-05-03 2022-03-15 华风象辑(北京)气象科技有限公司 Meteorological grid point file application method based on resource set and big data technology
CN109657018A (en) * 2018-11-13 2019-04-19 平安科技(深圳)有限公司 A kind of distribution vehicle operation data querying method and terminal device
CN109657018B (en) * 2018-11-13 2024-05-07 平安科技(深圳)有限公司 Distributed vehicle running data query method and terminal equipment
CN109951313A (en) * 2019-01-18 2019-06-28 长江大学 A kind of monitoring device and method of Hadoop cloud platform
CN109951313B (en) * 2019-01-18 2022-04-19 长江大学 Monitoring device and method for Hadoop cloud platform
CN110147377A (en) * 2019-05-29 2019-08-20 大连大学 General polling algorithm based on secondary index under extensive spatial data environment
CN110147377B (en) * 2019-05-29 2022-12-27 大连大学 General query method based on secondary index under large-scale spatial data environment
WO2021091495A1 (en) * 2019-11-07 2021-05-14 Envision Digital International Pte. Ltd. Method for inquiring weather data, and electronic device and storage medium thereof
CN111090794B (en) * 2019-11-07 2023-12-05 远景智能国际私人投资有限公司 Meteorological data query method, device and storage medium
CN111090794A (en) * 2019-11-07 2020-05-01 远景智能国际私人投资有限公司 Meteorological data query method, device and storage medium
CN111352956A (en) * 2020-02-24 2020-06-30 交通运输部水运科学研究所 Acquisition and storage system for shipping big data
CN111339229A (en) * 2020-02-24 2020-06-26 交通运输部水运科学研究所 Ship autonomous navigation aid decision-making system
CN111581220A (en) * 2020-05-28 2020-08-25 泰康保险集团股份有限公司 Storage and retrieval method, device, equipment and storage medium for time series data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170818
