CN112597205A - Real-time data calculation and storage method based on stream and message scheduling - Google Patents

Real-time data calculation and storage method based on stream and message scheduling

Info

Publication number
CN112597205A
Authority
CN
China
Prior art keywords
data
real
time
stream
analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011608430.3A
Other languages
Chinese (zh)
Inventor
姜宇
周含笑
刘源
于雷
王兆祥
董丽娜
李墨野
王建勋
赵辉
邵文杰
马刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Space Star Data System Technology Co ltd
Original Assignee
Harbin Space Star Data System Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-12-30
Filing date
2020-12-30
Publication date
2021-04-02
Application filed by Harbin Space Star Data System Technology Co ltd
Priority to CN202011608430.3A
Publication of CN112597205A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/252 Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/547 Remote procedure calls [RPC]; Web services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a real-time data calculation and storage method based on stream and message scheduling, comprising the following steps: step one, data stream buffering, in which a data buffering channel is established for big-data, high-concurrency scenarios, and a Kafka system is used to provide data consumers with a stable, peak-shaved data source; step two, real-time data analysis, in which a quasi-real-time stream computing engine performs online analysis of massive real-time data according to actual requirements; and step three, real-time data storage, in which the source data are stored in a distributed time-series database by time point, enabling real-time time-series storage and historical retrieval of big data. The method can be applied to real-time analysis and storage systems for massive data; it improves the system's rapid data processing capability, real-time response speed and mass data storage capability, and meets data subscription requirements.

Description

Real-time data calculation and storage method based on stream and message scheduling
Technical Field
The invention relates to the field of data processing, in particular to a real-time data calculating and storing method based on stream + message scheduling.
Background
In smart city construction, the wide application and development of intelligent sensor terminals, data transmission networks, big data and cloud computing technologies have made multi-source data increasingly easy to acquire. In the urban computing environment, a growing number of real-time data streams must be processed in fields such as urban planning, traffic control, environmental protection, residents' daily life, social operation and public affairs. Developing a smart city requires massive and varied data to build diversified urban services and to provide a solid data foundation for serving society, which brings new opportunities and challenges to smart city planning and construction.
Generally, smart city applications rest on the collection, transmission, storage and analysis of large volumes of sensor data. Current data processing approaches face major challenges in real-time performance and analytical diversity: analysis today is mostly offline computation, which may take minutes or even hours and therefore cannot meet real-time requirements.
The data acquisition approaches described above met basic management needs in the early stage of application. However, as intelligent systems deepen and the demand for fine-grained, intelligent city management grows, earlier schemes can no longer meet application needs. First, simple data storage takes a single form; large volumes of data back up, and packet loss is common. Second, traditional analysis is not timely enough, whereas problems arising in a smart city require timely analysis and response from the system. Third, traditional data analysis requires a large investment of manpower for real-time dynamic monitoring.
Disclosure of Invention
The invention aims to solve the above technical problems in the prior art and provides a real-time data calculation and storage method based on stream and message scheduling.
The invention discloses a real-time data calculating and storing method based on stream and message scheduling, which comprises the following steps:
step one, data stream buffering: a data buffering channel is established for big-data, high-concurrency scenarios, and a Kafka system is used to provide data consumers with a stable, peak-shaved data source;
step two, real-time data analysis: a quasi-real-time stream computing engine performs online analysis of massive real-time data according to actual requirements;
step three, real-time data storage: source data are stored in a distributed time-series database by time point, enabling real-time time-series storage and historical retrieval of big data;
and step four, analysis result storage: the analysis results are stored in a corresponding distributed database according to actual requirements.
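The peak-shaving role of the buffer in step one can be illustrated with a small simulation. This is only a sketch under stated assumptions: an in-memory `deque` stands in for the Kafka-based channel, and the function name, the tick model and the fixed consumer rate are hypothetical and do not come from the patent.

```python
from collections import deque


def drain_at_steady_rate(arrivals_per_tick, max_per_tick):
    """Simulate a buffer between bursty producers and a rate-limited consumer.

    arrivals_per_tick: number of messages arriving in each time tick.
    max_per_tick: the most messages the consumer handles per tick (assumed constant).
    Returns the number of messages consumed in each tick until the buffer is empty.
    """
    buffer = deque()          # stand-in for the message queue
    consumed = []
    next_id = 0
    tick = 0
    while tick < len(arrivals_per_tick) or buffer:
        # Producers enqueue whatever arrives in this tick, however bursty.
        if tick < len(arrivals_per_tick):
            for _ in range(arrivals_per_tick[tick]):
                buffer.append(next_id)
                next_id += 1
        # The consumer takes at most max_per_tick messages, so its load stays bounded.
        taken = 0
        while buffer and taken < max_per_tick:
            buffer.popleft()
            taken += 1
        consumed.append(taken)
        tick += 1
    return consumed


# A burst of 10 messages is spread over several ticks at no more than 3 per tick.
print(drain_at_steady_rate([10, 0, 0, 1], 3))
```

The consumer never sees more than `max_per_tick` messages per tick, and no message is dropped; the burst is absorbed by the buffer and worked off over later ticks.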
During data stream buffering, a high-throughput, high-availability data reporting channel is established, and the data receiving gateway is scaled horizontally by expanding the message scheduling cluster.
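One common way a message cluster scales horizontally is by spreading messages over partitions according to a key. The sketch below assumes the sensor ID is used as the key and uses CRC32 only because it is in the Python standard library; Kafka's own partitioner uses a different hash function, and the function names and device IDs here are hypothetical.

```python
import zlib
from collections import Counter


def partition_for(device_id, num_partitions):
    """Map a device ID to a partition index in [0, num_partitions)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # The same device always hashes to the same partition, preserving per-device order.
    return zlib.crc32(device_id.encode("utf-8")) % num_partitions


def load_per_partition(device_ids, num_partitions):
    """Count how many devices land on each partition."""
    counts = Counter(partition_for(d, num_partitions) for d in device_ids)
    return [counts.get(p, 0) for p in range(num_partitions)]


devices = [f"lamp-{i}" for i in range(1000)]   # hypothetical device IDs
print(load_per_partition(devices, 4))
print(load_per_partition(devices, 8))          # more partitions, less load on each
```

Adding partitions (and brokers or gateway instances to serve them) lowers the share of traffic each one handles, which is the sense of horizontal expansion illustrated here.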
The real-time data are stored in OpenTSDB, a time-series database based on distributed technology; OpenTSDB is built on HBase and provides an API data interface over HTTP.
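As an illustration of the HTTP interface, the sketch below builds a write request for OpenTSDB's `/api/put` endpoint, which accepts JSON data points with `metric`, `timestamp`, `value` and `tags` fields. The metric name, tag, timestamp and host address are assumptions chosen for the example, not values from the patent, and the request is only constructed, not sent.

```python
import json
import urllib.request


def to_point(metric, timestamp, value, tags):
    """Build one data point in the JSON shape accepted by OpenTSDB's /api/put."""
    if not tags:
        # OpenTSDB expects at least one tag pair on every data point.
        raise ValueError("at least one tag is required")
    return {
        "metric": metric,
        "timestamp": int(timestamp),   # Unix epoch seconds
        "value": value,
        "tags": dict(tags),
    }


def build_put_request(base_url, points):
    """Create (but do not send) an HTTP POST carrying the data points as JSON."""
    body = json.dumps(points).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/put",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical street lamp reading; the metric and tag names are illustrative only.
point = to_point("streetlamp.power", 1609286400, 42.5, {"lamp_id": "lamp-0001"})
request = build_put_request("http://localhost:4242", [point])
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
print(request.full_url)
```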
In the online stream computation, the stream computing engine consumes the queued messages of the message scheduling module in real time and completes online real-time analysis by comparing the data against alarm rules stored in the relational database MySQL.
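The rule comparison can be sketched as follows. The patent does not give the rule schema, so the fields `name`, `field`, `op` and `threshold` are assumptions; the rules are held in a plain list here, standing in for rows that would be loaded from the MySQL rule table.

```python
import operator

# Comparison operators a rule may name (assumed rule vocabulary).
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}


def evaluate(reading, rules):
    """Return the names of all rules that the reading violates."""
    alarms = []
    for rule in rules:
        value = reading.get(rule["field"])
        if value is None:
            continue  # the reading does not carry this field; nothing to compare
        if OPERATORS[rule["op"]](value, rule["threshold"]):
            alarms.append(rule["name"])
    return alarms


# Hypothetical rules, as they might look after being read from the rule table.
rules = [
    {"name": "over_power", "field": "power_w", "op": ">", "threshold": 150},
    {"name": "under_voltage", "field": "voltage_v", "op": "<", "threshold": 200},
]

print(evaluate({"power_w": 180, "voltage_v": 221}, rules))
print(evaluate({"power_w": 90, "voltage_v": 221}, rules))
```

Keeping the rules as data rather than code means they can be changed in the database without redeploying the stream job, which is one plausible reason for storing them in a relational table.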
The analysis results are pushed to the data stream engine Kafka to complete the message subscription relationship.
According to the actual requirements of the system, the analysis results can be pushed to a specific real-time consumption topic of the message pipeline, and subscribers receive the analysis results in real time by subscribing to that topic.
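The topic-based subscription relationship can be sketched with a tiny in-memory broker. This is a deliberate simplification: the class, topic name and callback style are hypothetical, and a real Kafka consumer pulls messages from the broker rather than receiving callbacks.

```python
from collections import defaultdict


class TopicBroker:
    """Minimal in-memory publish/subscribe broker keyed by topic name."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback to receive every message published to the topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver the message to all subscribers of the topic; return the count."""
        delivered = 0
        for callback in self._subscribers.get(topic, []):
            callback(message)
            delivered += 1
        return delivered


broker = TopicBroker()
inbox = []
broker.subscribe("streetlamp.alarms", inbox.append)   # hypothetical topic name

broker.publish("streetlamp.alarms", {"lamp_id": "lamp-0001", "alarm": "over_power"})
broker.publish("other.topic", {"ignored": True})      # no subscriber, not delivered
print(inbox)
```

Only subscribers of the named topic receive the result, so different downstream systems can each subscribe to the analysis topics they need.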
The beneficial effects of the invention include:
1. The method provided by the invention, based on the combination of stream and message scheduling, offers the field a new angle and reference for solving the problem.
2. The method provided by the invention can be applied to a real-time data analysis and storage system of mass data, improves the rapid data processing capability, the real-time operation response speed and the mass data storage capability of the system, and meets the data subscription requirement.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic diagram of stream computation analysis in an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below with reference to fig. 1 to 2.
FIG. 1 is a schematic diagram of the real-time stream computation storage of the present invention.
In step S1, the data queue buffers data: data stream buffering is completed by establishing an association between the acquisition client and the data stream engine (i.e., the message scheduling module), providing the system with basic terminal data.
In step S2, online stream computation is performed: the stream computing engine consumes the queued messages of the message scheduling module in real time to complete online analysis. The analysis result is obtained by comparing the real-time data against the data model, so that the computation responds quickly in real time.
In step S3, subscription data pushing is completed: the analysis result is pushed to subscribers in real time, providing data support for rapid urban response.
In step S4, source data persistence is performed: the source data are stored in the source-data time-series database; the optimized design of the open-source distributed time-series database effectively improves storage capacity and fast retrieval capability.
FIG. 2 shows the online data analysis of steps S1 and S2.
In step S2, the stream computing engine performs online stream computation on the real-time data buffered by the message queue in S1, consuming the queued messages of the message scheduling module in real time and completing online analysis as required.
Example 1
Take as an example the collection of data from city street lamp sensing terminals and its subsequent analysis.
Street lamp real-time data are uploaded through sensors to the message scheduling module. The data reported in real time by the massive number of sensing terminals are cached and distributed by the message queue cache-and-distribution module to form a real-time data source. The real-time analysis module of the stream computing engine then performs real-time computation in the engine's computing mode and makes predictions on the real-time data according to the existing street lamp parameter model. The output can indicate whether the power consumption state is normal, the line loss indicators, and whether false switching has occurred; the system may define false switching as a lamp switching on and off many times in succession within a period of time. The sensed raw street lamp time-series data and the analysis data are stored adaptively by the data storage module, which stores time-series data and stores result data in a distributed manner.
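The false-switching check in this example can be sketched with a sliding time window over on/off state changes. The patent does not specify the window length or the toggle limit, so `window_seconds` and `max_toggles` are assumed parameters, and the event format is hypothetical.

```python
from collections import deque


def detect_false_switching(events, window_seconds, max_toggles):
    """Flag moments when a lamp toggles too often within a trailing time window.

    events: iterable of (timestamp_seconds, is_on) pairs, sorted by timestamp.
    Returns the timestamps at which the number of state changes inside the
    trailing window exceeds max_toggles.
    """
    toggles = deque()     # timestamps of recent state changes
    flagged = []
    previous = None
    for timestamp, is_on in events:
        if previous is not None and is_on != previous:
            toggles.append(timestamp)
            # Drop state changes that have fallen out of the window.
            while toggles and timestamp - toggles[0] > window_seconds:
                toggles.popleft()
            if len(toggles) > max_toggles:
                flagged.append(timestamp)
        previous = is_on
    return flagged


# Four toggles within 60 seconds exceed the assumed limit of three.
events = [(0, False), (10, True), (20, False), (30, True), (40, False), (500, True)]
print(detect_false_switching(events, window_seconds=60, max_toggles=3))
```

Because the check only keeps the recent toggle timestamps per lamp, it fits the per-message style of processing that a stream computing engine applies to the buffered data.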
The above embodiments are only preferred embodiments of the present invention and are not intended to limit it. Those skilled in the art can readily make various changes and modifications within the main concept and spirit of the present invention, so the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (5)

1. A real-time data calculation and storage method based on stream and message scheduling is characterized by comprising the following steps:
step one, data stream buffering: a data buffering channel is established for big-data, high-concurrency scenarios, and a Kafka system is used to provide data consumers with a stable, peak-shaved data source;
step two, real-time data analysis: a quasi-real-time stream computing engine performs online analysis of massive real-time data according to actual requirements;
step three, real-time data storage: source data are stored in a distributed time-series database by time point, enabling real-time time-series storage and historical retrieval of big data;
and step four, analysis result storage: the analysis results are stored in a corresponding distributed database according to actual requirements.
2. The method according to claim 1, wherein in step two, a high-throughput, high-availability data reporting channel is established during data stream buffering, and the data receiving gateway is scaled horizontally by expanding the message scheduling cluster.
3. The real-time data calculation and storage method based on stream and message scheduling as claimed in claim 1, wherein in step two, the real-time data are stored in OpenTSDB, a time-series database based on distributed technology, which is built on HBase and provides an API data interface over HTTP.
4. The method of claim 1, wherein in the online stream computation, the stream computing engine consumes the queued messages of the message scheduling module in real time and completes online real-time analysis by comparing the data against alarm rules stored in the relational database MySQL.
5. The real-time data calculation and storage method based on stream and message scheduling as claimed in claim 1, wherein the analysis results are pushed to the data stream engine Kafka to complete the message subscription relationship.
CN202011608430.3A 2020-12-30 2020-12-30 Real-time data calculation and storage method based on stream and message scheduling Pending CN112597205A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011608430.3A CN112597205A (en) 2020-12-30 2020-12-30 Real-time data calculation and storage method based on stream and message scheduling

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011608430.3A CN112597205A (en) 2020-12-30 2020-12-30 Real-time data calculation and storage method based on stream and message scheduling

Publications (1)

Publication Number Publication Date
CN112597205A true CN112597205A (en) 2021-04-02

Family

ID=75206422

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011608430.3A Pending CN112597205A (en) 2020-12-30 2020-12-30 Real-time data calculation and storage method based on stream and message scheduling

Country Status (1)

Country Link
CN (1) CN112597205A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113282611A (en) * 2021-06-29 2021-08-20 深圳平安智汇企业信息管理有限公司 Method and device for synchronizing stream data, computer equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107704545A (en) * 2017-11-08 2018-02-16 华东交通大学 Railway distribution net magnanimity information method for stream processing based on Storm Yu Kafka message communicatings
CN111077870A (en) * 2020-01-06 2020-04-28 浙江中烟工业有限责任公司 Intelligent OPC data real-time acquisition and monitoring system and method based on stream calculation
CN111177276A (en) * 2020-01-06 2020-05-19 浙江中烟工业有限责任公司 Spark calculation framework-based kinetic energy data processing system and method

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113282611A (en) * 2021-06-29 2021-08-20 深圳平安智汇企业信息管理有限公司 Method and device for synchronizing stream data, computer equipment and storage medium
CN113282611B (en) * 2021-06-29 2024-04-23 深圳平安智汇企业信息管理有限公司 Method, device, computer equipment and storage medium for synchronizing stream data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210402
