CN106504169A - Stream-processing-based waterlogging data processing system and processing method - Google Patents

Stream-processing-based waterlogging data processing system and processing method

Info

Publication number
CN106504169A
Authority
CN
China
Prior art keywords
modules
result
waterlogging
flume
stream process
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611026709.4A
Other languages
Chinese (zh)
Inventor
史鑫明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SUZHOU AEROSPACE SYSTEM ENGINEERING Co Ltd
Original Assignee
SUZHOU AEROSPACE SYSTEM ENGINEERING Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SUZHOU AEROSPACE SYSTEM ENGINEERING Co Ltd
Priority to CN201611026709.4A
Publication of CN106504169A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24568 Data stream processing; Continuous queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • Educational Administration (AREA)
  • Economics (AREA)
  • Databases & Information Systems (AREA)
  • Development Economics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a stream-processing-based waterlogging data processing system comprising a waterlogging model computation module, a Flume module, a Kafka module, a SparkStreaming module, and an application system. The Spark Streaming stream processing framework is used to improve reading and processing efficiency: computation results are submitted to the stream processing framework at timestamp intervals, the Shp files are parsed inside the framework, and the result for each node is compared with that node's result from the previous time step. For each node, only the result that differs from the previous one is output, that is, the triangular mesh cells whose water depth values have changed. This meets practical requirements and improves the efficiency of processing and display.

Description

Stream-processing-based waterlogging data processing system and processing method
Technical field
The invention belongs to the field of big data stream processing applications, and in particular relates to a system and method for processing waterlogging data.
Background
With the development of big data, the demands placed on big data processing keep rising. The original batch processing framework, MapReduce, is suited to offline computation but cannot serve workloads with stricter real-time requirements, such as real-time recommendation and user behavior analysis.
Spark Streaming is a real-time computation framework built on Spark. Through its rich API and its in-memory high-speed execution engine, users can combine streaming, batch, and interactive query applications. Spark is a distributed computing framework similar to MapReduce whose core is the resilient distributed dataset (RDD). It offers a richer model than MapReduce and can iterate over a dataset many times quickly in memory, which supports complex data mining algorithms and graph computation algorithms. Spark Streaming extends Spark's ability to process large-scale streaming data.
Flume is a distributed, reliable, and highly available system for collecting, aggregating, and transporting massive volumes of logs. It supports customizing various kinds of data senders within the logging system to collect data. Flume can also apply simple processing to the data and write it to various (customizable) data receivers.
Flume mainly consists of three important components:
Source: collects the log data, splits it into transactions and events, and pushes them into the channel.
Channel: mainly acts as a queue, providing simple buffering for the data supplied by the source.
Sink: takes the data out of the channel and stores it in the corresponding file system or database, or submits it to a remote server.
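The division of labor among the three components can be illustrated with a minimal sketch in plain Python. It is not Flume code and does not use any Flume API; the names `source`, `run_pipeline`, and `capacity` are hypothetical and chosen only for this illustration. The sketch assumes a bounded in-memory channel that the sink drains whenever it fills up.

```python
from collections import deque


def source(lines):
    """Source: wrap each raw log line as an event (a plain dict here)."""
    for seq, line in enumerate(lines):
        yield {"seq": seq, "body": line.strip()}


def run_pipeline(lines, capacity=2):
    """Push events from the source through a bounded channel into a sink.

    `capacity` is a hypothetical channel size used only for illustration.
    """
    channel = deque()   # Channel: a simple buffering queue
    storage = []        # stands in for a file system, database, or remote server

    def sink():
        # Sink: take events out of the channel and store their bodies.
        while channel:
            storage.append(channel.popleft()["body"])

    for event in source(lines):
        if len(channel) >= capacity:
            sink()      # drain the channel before it overflows
        channel.append(event)
    sink()              # flush whatever is still buffered
    return storage
```

Events leave the pipeline in the order they entered it, because the channel is a first-in, first-out queue.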
The least intrusive way to adopt Flume in an existing program is to read directly the log files that the program already writes. This allows essentially seamless integration without any change to the existing program.
Logically, Flume is divided into three tiers: agent, collector, and storage.
① agent
The agent collects data. It is where data flows are produced in Flume, and it forwards the data flows it produces to the collector.
② collector
The collector aggregates the data from multiple agents and loads it into storage.
③ storage
Storage is the storage system, which can be an ordinary file, or HDFS, Hive, HBase, and so on.
At present, because of the characteristics of geographic information, real-time prediction with waterlogging models has not been able to use distributed computing to improve its own computational efficiency. For large-area waterlogging models, therefore, multiple nodes compute different regions and the result of each node is then processed. However, as the area covered by model prediction grows, the amount of data to process grows as well, and a single workstation, even one configured as a higher-end server, finds it increasingly difficult to meet this changing demand.
Summary of the invention
To overcome the deficiencies of the prior art, the object of the invention is to provide a stream-processing-based waterlogging data processing system that improves the efficiency and timeliness with which results are displayed.
To achieve the above technical object and technical effect, the invention is implemented through the following technical solution:
A stream-processing-based waterlogging data processing system comprises a waterlogging model computation module, a Flume module, a Kafka module, a SparkStreaming module, and an application system. The waterlogging model computation module produces a large volume of waterlogging prediction result data and stores it as Shp files in the Shp format. (The Shp format was developed by ESRI; an ESRI Shp file set comprises a main file, an index file, and a dBASE table, and the main file carries the .shp suffix.) The Flume module collects the Shp files through its agents and aggregates them at its collector. The sink of the Flume module then delivers the logs to the Kafka module, which completes the data production flow. The SparkStreaming module tracks the offset of the data it consumes. A program for parsing the Shp files is coded in the SparkStreaming module; after parsing the Shp files, the program returns the results that changed each time and passes them back to the Kafka module. The application system then establishes communication with the Kafka module, listens on a specific message queue, obtains the changed results, and completes the display of the GIS information.
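The patent does not describe how its parsing program reads Shp files. As a hedged illustration of what the first step of such a parser might look like, the sketch below decodes the fixed 100-byte header of a .shp main file using only the Python standard library. The field layout follows the published ESRI shapefile format (big-endian file code 9994 and file length, little-endian version, shape type, and bounding box); the function name `parse_shp_header` and the returned dictionary keys are hypothetical.

```python
import struct

# A few shape type codes from the ESRI shapefile format.
SHAPE_TYPES = {0: "Null", 1: "Point", 3: "PolyLine", 5: "Polygon", 8: "MultiPoint"}


def parse_shp_header(data: bytes) -> dict:
    """Decode the 100-byte header at the start of a .shp main file."""
    if len(data) < 100:
        raise ValueError("a .shp header is 100 bytes long")

    # Bytes 0-3: file code, big-endian, always 9994.
    (file_code,) = struct.unpack(">i", data[0:4])
    if file_code != 9994:
        raise ValueError("not a .shp main file (file code %d)" % file_code)

    # Bytes 24-27: file length in 16-bit words, big-endian.
    (length_words,) = struct.unpack(">i", data[24:28])
    # Bytes 28-35: version and shape type, little-endian.
    version, shape_code = struct.unpack("<ii", data[28:36])
    # Bytes 36-67: bounding box Xmin, Ymin, Xmax, Ymax, little-endian doubles.
    bbox = struct.unpack("<4d", data[36:68])

    return {
        "length_bytes": length_words * 2,
        "version": version,
        "shape_type": SHAPE_TYPES.get(shape_code, "code %d" % shape_code),
        "bbox": bbox,
    }
```

In practice the header would be read with something like `open(path, "rb").read(100)`; the geometry records that follow the header are not handled by this sketch.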
Another object of the invention is to provide a stream-processing-based waterlogging data processing method, which comprises the following steps:
1) The waterlogging model computation module performs the computation for different regions on separate nodes;
2) The Flume module collects the prediction results of these multiple nodes;
3) The SparkStreaming module processes the collected results: the computation results are submitted to the stream processing framework at timestamp intervals, and the Shp files are parsed in the stream processing framework;
4) Through the Kafka module, the result for each node is compared with that node's result from the previous time step;
5) The application system outputs, for each node, the result relative to the previous one, that is, the triangular mesh cells whose water depth values differ.
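Steps 4 and 5 amount to a difference between two time steps. A minimal Python sketch of that comparison follows, assuming each node's result can be represented as a mapping from a triangle identifier to a water depth. The patent does not specify this representation; the function name `changed_triangles` and the `tolerance` parameter are hypothetical.

```python
def changed_triangles(previous, current, tolerance=0.0):
    """Return the triangles whose water depth differs from the previous step.

    `previous` and `current` map a triangle id to a water depth.
    A triangle that is new in `current` also counts as changed.
    """
    changed = {}
    for triangle_id, depth in current.items():
        old_depth = previous.get(triangle_id)
        if old_depth is None or abs(depth - old_depth) > tolerance:
            changed[triangle_id] = depth
    return changed
```

Sending only the changed triangles to the display layer keeps each update small when most of the mesh is unchanged between time steps, which is the efficiency argument the method relies on.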
The beneficial effects of the invention are:
Compared with the prior art, the system and method of the invention feed the computation results of the waterlogging model into a stream computing framework, which speeds up the display of waterlogging warnings. Managers can take precautionary measures sooner and reduce losses.
The above is only an overview of the technical solution of the invention. So that the technical means of the invention can be understood more clearly and implemented according to the content of the specification, a preferred embodiment of the invention is described in detail below with reference to the accompanying drawing. The specific embodiment of the invention is given in detail by the following embodiment and its drawing.
Brief description of the drawings
The drawing described here provides a further understanding of the invention and forms part of this application. The illustrative embodiment of the invention and its description serve to explain the invention and do not unduly limit it. In the drawing:
Fig. 1 is a schematic diagram of the system framework of the invention.
Detailed description of the embodiment
The invention is described in detail below with reference to the drawing and in conjunction with the embodiment.
As shown in Fig. 1, a stream-processing-based waterlogging data processing system comprises a waterlogging model computation module 1, a Flume module 2, a Kafka module 3, a SparkStreaming module 4, and an application system 5. The waterlogging model computation module 1 produces a large volume of waterlogging prediction result data and stores it as Shp files in the Shp format. The Flume module 2 collects the Shp files through its agents and aggregates them at its collector. The sink of the Flume module 2 then delivers the logs to the Kafka module 3, which completes the data production flow. The SparkStreaming module 4 tracks the offset of the data it consumes. A program for parsing the Shp files is coded in the SparkStreaming module 4; after parsing the Shp files, the program returns the results that changed each time and passes them back to the Kafka module 3. The application system 5 then establishes communication with the Kafka module 3, listens on a specific message queue, obtains the changed results, and completes the display of the GIS information.
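The offset tracking mentioned above can be pictured with a small in-memory simulation. This is plain Python, not the Kafka or Spark client API; the classes `TopicLog` and `OffsetTrackingConsumer` and their methods are hypothetical stand-ins. The sketch assumes the consumer advances its offset only after a batch has been processed, so an uncommitted batch is read again on the next poll.

```python
class TopicLog:
    """An append-only list standing in for one partition of a message queue."""

    def __init__(self):
        self._records = []

    def append(self, value):
        self._records.append(value)
        return len(self._records) - 1   # offset of the record just written

    def read_from(self, offset, max_records):
        return self._records[offset:offset + max_records]


class OffsetTrackingConsumer:
    """Reads from a TopicLog and remembers how far it has consumed."""

    def __init__(self, log):
        self.log = log
        self.committed = 0              # offset of the next unprocessed record

    def poll(self, max_records=10):
        # Always read from the committed offset, so that a batch that was
        # never committed is delivered again.
        return self.log.read_from(self.committed, max_records)

    def commit(self, count):
        # Advance the offset once `count` records have been processed.
        self.committed += count
```

A typical loop polls a batch, processes it, and then calls `commit(len(batch))`; if processing fails before the commit, the same records are returned by the next `poll`.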
The processing method of the waterlogging data processing system of this embodiment is as follows:
1) The waterlogging model computation module 1 performs the computation for different regions on separate nodes;
2) The Flume module 2 collects the prediction results of these multiple nodes;
3) The SparkStreaming module 4 processes the collected results: the computation results are submitted to the stream processing framework at timestamp intervals, and the Shp files are parsed in the stream processing framework;
4) Through the Kafka module 3, the result for each node is compared with that node's result from the previous time step;
5) The application system 5 outputs, for each node, the result relative to the previous one, that is, the triangular mesh cells whose water depth values differ.
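Step 3 submits results to the stream processing framework at timestamp intervals. One way to picture that grouping is the sketch below, which assigns each record to a fixed-length time window in plain Python. It is an illustration only and does not use the Spark Streaming API; the function name `batch_by_interval` and the `(timestamp, payload)` record shape are assumptions made for this example.

```python
from collections import defaultdict


def batch_by_interval(records, interval_seconds):
    """Group (timestamp, payload) records into fixed-length time windows.

    Returns a list of (window_start, payloads) pairs ordered by window start.
    """
    batches = defaultdict(list)
    for timestamp, payload in records:
        # Round the timestamp down to the start of its window.
        window_start = (timestamp // interval_seconds) * interval_seconds
        batches[window_start].append(payload)
    return [(start, batches[start]) for start in sorted(batches)]
```

Each resulting batch can then be parsed and compared with the previous batch for the same node as one unit of work.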
The above is only a preferred embodiment of the invention and is not intended to limit the invention. For those skilled in the art, the invention may have various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims (2)

1. A stream-processing-based waterlogging data processing system, characterized in that it comprises a waterlogging model computation module (1), a Flume module (2), a Kafka module (3), a SparkStreaming module (4), and an application system (5);
the waterlogging model computation module (1) produces a large volume of waterlogging prediction result data and stores it as Shp files in the Shp format; the Flume module (2) collects the Shp files through its agents and aggregates them at the collector of the Flume module (2); the sink of the Flume module (2) delivers the logs to the Kafka module (3), which completes the data production flow; the SparkStreaming module (4) tracks the offset of the data it consumes; a program for parsing the Shp files is coded in the SparkStreaming module (4), and after parsing the Shp files the program returns the results that changed each time and passes them back to the Kafka module (3); the application system (5) then establishes communication with the Kafka module (3), listens on a specific message queue, obtains the changed results, and completes the display of the GIS information.
2. A stream-processing-based waterlogging data processing method, characterized in that it comprises the following steps:
1) the waterlogging model computation module (1) performs the computation for different regions on separate nodes;
2) the Flume module (2) collects the prediction results of these multiple nodes;
3) the SparkStreaming module (4) processes the collected results: the computation results are submitted to the stream processing framework at timestamp intervals, and the Shp files are parsed in the stream processing framework;
4) through the Kafka module (3), the result for each node is compared with that node's result from the previous time step;
5) the application system (5) outputs, for each node, the result relative to the previous one, that is, the triangular mesh cells whose water depth values differ.
CN201611026709.4A 2016-11-22 2016-11-22 Stream-processing-based waterlogging data processing system and processing method Pending CN106504169A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611026709.4A CN106504169A (en) 2016-11-22 2016-11-22 Stream-processing-based waterlogging data processing system and processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611026709.4A CN106504169A (en) 2016-11-22 2016-11-22 Stream-processing-based waterlogging data processing system and processing method

Publications (1)

Publication Number Publication Date
CN106504169A (en) 2017-03-15

Family

ID=58328051

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611026709.4A Pending CN106504169A (en) 2016-11-22 2016-11-22 Stream-processing-based waterlogging data processing system and processing method

Country Status (1)

Country Link
CN (1) CN106504169A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107317838A (en) * 2017-05-24 2017-11-03 重庆邮电大学 A kind of astronomical metadata archiving method and system based on stream data processing framework
CN110377653A (en) * 2019-07-15 2019-10-25 武汉中地数码科技有限公司 A kind of real-time big data calculates and storage method and system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060095202A1 (en) * 2004-11-01 2006-05-04 Hitachi, Ltd. Method of delivering difference map data
CN101727261A (en) * 2008-10-17 2010-06-09 华硕电脑股份有限公司 Page operation method and electronic device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈任飞等: "基于Flume/Kafka/Spark的分布式日志流处理系统的设计与实现", 《中国科技论文在线》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107317838A (en) * 2017-05-24 2017-11-03 重庆邮电大学 A kind of astronomical metadata archiving method and system based on stream data processing framework
CN107317838B (en) * 2017-05-24 2020-11-17 重庆邮电大学 Astronomical metadata filing method and system based on streaming data processing architecture
CN110377653A (en) * 2019-07-15 2019-10-25 武汉中地数码科技有限公司 A kind of real-time big data calculates and storage method and system

Similar Documents

Publication Publication Date Title
Yang IoT stream processing and analytics in the fog
CN106709035B (en) A kind of pretreatment system of electric power multidimensional panoramic view data
CN103297503B (en) Mobile terminal intelligent perception system based on information retrieval server by different level
CN102902752B (en) Method and system for monitoring log
Wang et al. A deep learning based energy-efficient computational offloading method in Internet of vehicles
CN105512297A (en) Distributed stream-oriented computation based spatial data processing method and system
CN109710731A (en) A kind of multidirectional processing system of data flow based on Flink
CN109831478A (en) Rule-based and model distributed processing intelligent decision system and method in real time
CN106951552A (en) A kind of user behavior data processing method based on Hadoop
Yan et al. Big data driven wireless communications: A human-in-the-loop pushing technique for 5G systems
CN111586091A (en) Edge computing gateway system for realizing computing power assembly
CN111198918B (en) Data processing system based on big data platform and link optimization method
CN103916478B (en) The method and apparatus that streaming based on distributed system builds data side
CN106504169A (en) A kind of waterlogging data handling system and its processing method based on stream process
CN110995652B (en) Big data platform unknown threat detection method based on deep migration learning
CN106682225A (en) Big data collecting and storing method and system
CN107995278B (en) A kind of scene intelligent analysis system and method based on metropolitan area grade Internet of Things perception data
CN106990913B (en) A kind of distributed approach of extensive streaming collective data
CN104778355A (en) Trajectory outlier detection method based on wide-area distributed traffic system
CN106970976A (en) A kind of real-time dynamic passenger flow volume statistical method in scenic spot based on visitor's mobile signaling protocol data
CN115391429A (en) Time sequence data processing method and device based on big data cloud computing
CN114219165A (en) Electricity consumption big data storage system, prediction algorithm and visual display platform
CN111813833B (en) Real-time two-degree communication relation data mining method
Liu et al. Distributed and real-time query framework for processing participatory sensing data streams
CN113360576A (en) Power grid mass data real-time processing method and device based on Flink Streaming

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Shi Xinming

Inventor after: Li Yujie

Inventor after: Liu Jia

Inventor after: Chen Kun

Inventor after: Liu Changxin

Inventor after: Yang Fang

Inventor before: Shi Xinming

WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20170315
