CN112597205A - Real-time data calculation and storage method based on stream and message scheduling - Google Patents
Real-time data calculation and storage method based on stream and message scheduling
- Publication number
- CN112597205A (application number CN202011608430.3A)
- Authority
- CN
- China
- Prior art keywords
- data
- real
- time
- stream
- analysis
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24552—Database cache management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/252—Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/547—Remote procedure calls [RPC]; Web services
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a real-time data calculation and storage method based on stream and message scheduling, comprising the following steps: step one, data stream buffering, in which a data buffering channel is established for big-data, high-concurrency scenarios and a Kafka system provides data consumers with a stable, peak-shaved data source; step two, real-time data analysis, in which a quasi-real-time stream computing engine performs online analysis of massive real-time data according to actual requirements; and step three, real-time data storage, in which the source data are stored by time point in a distributed time-series database, enabling real-time time-series storage and historical data retrieval for big data. The method can be applied to real-time analysis and storage systems for massive data; it improves the system's data processing speed, real-time response speed and mass storage capacity, and meets data subscription requirements.
Description
Technical Field
The invention relates to the field of data processing, in particular to a real-time data calculating and storing method based on stream + message scheduling.
Background
In the field of smart city construction, the wide application and development of intelligent sensor terminals, data transmission networks, big data and cloud computing technologies have made multi-source data increasingly easy to acquire. In an urban computing environment, more and more real-time data streams need to be processed in fields such as urban planning, traffic control, environmental protection, residents' daily life, social operation and public affairs. The development and construction of smart cities requires massive and varied data to build diversified urban services and to provide a solid data foundation for serving society, which brings new opportunities and challenges to smart city planning and construction.
Generally, smart city applications are built on the collection, transmission, storage and analysis of large volumes of sensor data. Current data processing approaches face major challenges in real-time performance and analytical diversity: analysis is mostly performed offline, and a computation may take minutes or even hours, which cannot meet real-time requirements.
Data acquisition in the smart city field met basic management needs in the early stage of application. However, as intelligent applications deepen, the demand for fine-grained, intelligent city management grows stronger, and earlier schemes can no longer meet application needs. First, storage relies on a single simple data store, so large amounts of data back up and packet loss is common. Second, traditional analysis is too slow: problems arising in a smart city demand high timeliness and require the system to analyze and respond promptly. Third, traditional data analysis requires a large investment of manpower for real-time dynamic monitoring.
Disclosure of Invention
The invention aims to solve the technical problems of the prior art, and accordingly provides a real-time data calculation and storage method based on stream and message scheduling.
The invention discloses a real-time data calculation and storage method based on stream and message scheduling, which comprises the following steps:
step one, data stream buffering, in which a data buffering channel is established for big-data, high-concurrency scenarios, and a Kafka system provides data consumers with a stable, peak-shaved data source;
step two, real-time data analysis, in which a quasi-real-time stream computing engine performs online analysis of massive real-time data according to actual requirements;
step three, real-time data storage, in which the source data are stored by time point in a distributed time-series database, enabling real-time time-series storage and historical data retrieval for big data;
and step four, analysis result storage, in which the analysis results are stored in a corresponding distributed database according to actual requirements.
During data stream buffering, a high-throughput, high-availability data reporting channel is established, and the data receiving gateway is scaled horizontally by expanding the message scheduling cluster.
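The buffering and scaling behaviour described here can be illustrated with a minimal in-memory sketch. It assumes a bounded `queue.Queue` as a stand-in for a Kafka topic and a CRC32 hash as a stand-in for the broker's real partitioner; `drain`, `partition_for` and the batch size are hypothetical names chosen for illustration and are not taken from the patent or from Kafka's API.

```python
import queue
import zlib


def partition_for(device_id: str, partitions: int) -> int:
    """Map a device to a partition (crc32 is a stand-in for the broker's hash)."""
    return zlib.crc32(device_id.encode("utf-8")) % partitions


def drain(buffer: queue.Queue, max_batch: int) -> list:
    """Take at most max_batch messages so the consumer sees a bounded load."""
    batch = []
    while len(batch) < max_batch:
        try:
            batch.append(buffer.get_nowait())
        except queue.Empty:
            break
    return batch


# Stand-in for one topic of the message scheduling cluster (assumption).
buffer = queue.Queue(maxsize=1000)

# A burst of 25 sensor reports arrives at once.
for i in range(25):
    buffer.put({"device": f"lamp-{i % 5}", "seq": i})

# The consumer pulls fixed-size batches no matter how bursty the input was.
batch_sizes = []
while not buffer.empty():
    batch_sizes.append(len(drain(buffer, max_batch=10)))
```

Adding partitions (and consumers reading them) is the kind of horizontal expansion the text refers to; in this sketch that would only mean calling `partition_for` with a larger `partitions` value.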
In the real-time data storage step, the real-time data are stored in OpenTSDB, a distributed time-series database built on HBase that provides an HTTP API data interface.
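As a rough illustration of writing through that HTTP interface, the sketch below builds a data point in the JSON shape accepted by OpenTSDB's `/api/put` endpoint and prepares, without sending, the corresponding POST request. The metric name, tag, host and port are assumptions made for the example, and `build_put_payload` and `make_put_request` are hypothetical helper names.

```python
import json
import urllib.request


def build_put_payload(metric, timestamp, value, tags):
    """Build one data point in the JSON shape used by OpenTSDB's /api/put."""
    if not tags:
        raise ValueError("OpenTSDB requires at least one tag per data point")
    return {
        "metric": metric,
        "timestamp": int(timestamp),
        "value": value,
        "tags": dict(tags),
    }


def make_put_request(base_url, points):
    """Prepare (but do not send) the HTTP POST for a batch of data points."""
    body = json.dumps(points).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/put",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical street lamp reading; names and address are illustrative only.
point = build_put_payload(
    "streetlamp.power", 1609300000, 41.5, {"lamp_id": "lamp-001"}
)
request = make_put_request("http://localhost:4242", [point])
# urllib.request.urlopen(request) would send it to a running OpenTSDB instance.
```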
In the online streaming computation, the stream computing engine consumes the queue messages of the message scheduling module in real time and completes online real-time analysis by comparing the data against alarm rules stored in the relational database MySQL.
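The rule comparison can be sketched as follows. To keep the example runnable without a database server, the standard-library `sqlite3` module stands in for MySQL; the `alarm_rules` table, its columns and the threshold values are all assumptions, since the patent does not specify a rule schema.

```python
import sqlite3

# sqlite3 stands in for MySQL here (assumption) so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE alarm_rules ("
    "metric TEXT PRIMARY KEY, min_value REAL, max_value REAL)"
)
conn.executemany(
    "INSERT INTO alarm_rules VALUES (?, ?, ?)",
    [("voltage", 200.0, 240.0), ("current", 0.0, 5.0)],
)


def load_rules(connection):
    """Read every rule into a dict: metric -> (lower bound, upper bound)."""
    rows = connection.execute(
        "SELECT metric, min_value, max_value FROM alarm_rules"
    )
    return {metric: (low, high) for metric, low, high in rows}


def check(message, rules):
    """Return one alarm record per reading that falls outside its rule range."""
    alarms = []
    for metric, value in message["readings"].items():
        if metric not in rules:
            continue  # no rule configured for this metric
        low, high = rules[metric]
        if not low <= value <= high:
            alarms.append(
                {"device": message["device"], "metric": metric, "value": value}
            )
    return alarms


rules = load_rules(conn)
alarms = check(
    {"device": "lamp-001", "readings": {"voltage": 251.3, "current": 1.2}},
    rules,
)
```

In a streaming deployment the rules would typically be loaded once and refreshed periodically, rather than queried for every message.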
The analysis results are pushed to the data stream engine Kafka to fulfil the message subscription relationship.
According to the actual requirements of the system, the analysis results can be pushed to a specific real-time consumption topic of the message pipeline, and subscribers receive the results in real time by subscribing to that topic.
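The topic-based subscription relationship can be illustrated with a small in-memory sketch. `TopicBus` is a hypothetical class, not a Kafka API, and it simplifies the model: real Kafka consumers pull messages from the broker, whereas this sketch calls subscriber callbacks directly. The topic name is likewise an assumption.

```python
from collections import defaultdict


class TopicBus:
    """Minimal in-memory publish/subscribe bus keyed by topic name."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback that receives every message on the topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver the message to each subscriber; return how many got it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


bus = TopicBus()
received = []
bus.subscribe("streetlamp.alarms", received.append)

delivered = bus.publish(
    "streetlamp.alarms", {"device": "lamp-001", "alarm": "over_voltage"}
)
ignored = bus.publish("unrelated.topic", {"device": "lamp-002"})
```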
The beneficial effects of the invention include:
1. The method provided by the invention, based on the combination of stream computing and message scheduling, offers the field a new problem-solving perspective and a point of reference.
2. The method provided by the invention can be applied to a real-time data analysis and storage system of mass data, improves the rapid data processing capability, the real-time operation response speed and the mass data storage capability of the system, and meets the data subscription requirement.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic diagram of the stream computation analysis in an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below with reference to fig. 1 to 2.
FIG. 1 is a schematic diagram of the real-time stream computation and storage of the present invention.
In step S1, the data queue buffers the data. Data stream buffering is accomplished by connecting the acquisition clients to the data stream engine, i.e., the message scheduling module, which provides the system with basic terminal data.
In step S2, online streaming computation is performed: the stream computing engine consumes the queue messages of the message scheduling module in real time to complete online analysis. An analysis result is obtained by comparing the real-time data against the data model, so that the computation can respond quickly and in real time.
In step S3, subscription data pushing is performed: the analysis results are pushed to subscribers in real time, providing data support for rapid urban response.
In step S4, source data persistence is performed: the source data are stored in the source-data time-series database, and the optimized design of the open-source distributed time-series database effectively improves storage capacity and retrieval speed.
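Steps S1 to S4 can be tied together in a single-process sketch. Everything here is an in-memory stand-in chosen for illustration: a `queue.Queue` for the message scheduling module, a plain function for the stream computing engine, a callback list for subscribers, and a dictionary for the time-series database. The names and the fixed threshold of 100.0 are assumptions, not details from the patent.

```python
import queue
from collections import defaultdict

message_queue = queue.Queue()          # S1: buffered terminal data
subscribers = []                       # S3: callbacks receiving results
time_series_store = defaultdict(list)  # S4: device -> [(timestamp, value)]


def analyze(reading, limit=100.0):
    """S2: compare a reading with a simple model (a fixed limit, by assumption)."""
    return {
        "device": reading["device"],
        "ts": reading["ts"],
        "over_limit": reading["value"] > limit,
    }


def run_once():
    """Consume everything currently buffered; return the number processed."""
    processed = 0
    while True:
        try:
            reading = message_queue.get_nowait()
        except queue.Empty:
            return processed
        result = analyze(reading)
        for notify in subscribers:  # S3: push the result to each subscriber
            notify(result)
        # S4: persist the raw source data in time order per device
        time_series_store[reading["device"]].append(
            (reading["ts"], reading["value"])
        )
        processed += 1


pushed = []
subscribers.append(pushed.append)
message_queue.put({"device": "lamp-001", "ts": 1, "value": 42.0})
message_queue.put({"device": "lamp-001", "ts": 2, "value": 130.0})
count = run_once()
```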
FIG. 2 shows the online analysis of data in steps S1 and S2.
In step S2, the stream computing engine performs online streaming computation on the real-time data buffered by the message queue in S1, consuming the queue messages of the message scheduling module in real time and completing the online analysis as required.
Example 1
The acquisition of data from urban street lamp sensing terminals, and the analysis that follows, is taken as an example.
After sensors upload real-time street lamp data to the message scheduling module, the massive volume of data reported in real time by the sensing terminals is cached and distributed by the message queue caching and distribution module to form a real-time data source. The real-time analysis module of the stream computing engine then performs real-time computation in the engine's computation mode, and the real-time data are evaluated against an existing street lamp parameter model. The output can indicate whether the power consumption state is normal, the line loss index, and whether false switching has occurred, where false switching may be defined by the system as a street lamp being switched on and off repeatedly within a period of time. The original sensed street lamp time-series data and the analysis data are stored by the data storage module, which stores time-series data and result data in a distributed manner as appropriate.
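One of the checks in this example, detecting false switching, can be sketched as a sliding-window count of state changes. The sketch assumes, as the example suggests, that false switching means a lamp toggling on and off repeatedly within a period of time; the class name, the 60-second window and the limit of three toggles are hypothetical values, not parameters given in the patent.

```python
from collections import deque


class FalseSwitchDetector:
    """Flag a lamp whose on/off state changes too often within a time window."""

    def __init__(self, window_seconds, max_toggles):
        self.window_seconds = window_seconds
        self.max_toggles = max_toggles
        self._last_state = None
        self._toggles = deque()  # timestamps of recent state changes

    def update(self, timestamp, is_on):
        """Record one reading; return True if toggles in the window exceed the limit."""
        if self._last_state is not None and is_on != self._last_state:
            self._toggles.append(timestamp)
        self._last_state = is_on
        # Drop state changes that have fallen out of the sliding window.
        while self._toggles and timestamp - self._toggles[0] > self.window_seconds:
            self._toggles.popleft()
        return len(self._toggles) > self.max_toggles


detector = FalseSwitchDetector(window_seconds=60, max_toggles=3)
events = [(0, False), (5, True), (10, False), (15, True), (20, False), (200, False)]
flags = [detector.update(ts, state) for ts, state in events]
```

Here the fourth state change within 60 seconds raises the flag, and the quiet reading at 200 seconds clears it because all earlier toggles have left the window.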
The above-mentioned embodiments are only preferred embodiments of the present invention, and are not intended to limit the embodiments of the present invention, and those skilled in the art can easily make various changes and modifications according to the main concept and spirit of the present invention, so the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (5)
1. A real-time data calculation and storage method based on stream and message scheduling is characterized by comprising the following steps:
step one, data stream buffering, in which a data buffering channel is established for big-data, high-concurrency scenarios, and a Kafka system provides data consumers with a stable, peak-shaved data source;
step two, real-time data analysis, in which a quasi-real-time stream computing engine performs online analysis of massive real-time data according to actual requirements;
step three, real-time data storage, in which the source data are stored by time point in a distributed time-series database, enabling real-time time-series storage and historical data retrieval for big data;
and step four, analysis result storage, in which the analysis results are stored in a corresponding distributed database according to actual requirements.
2. The real-time data calculation and storage method based on stream and message scheduling according to claim 1, wherein, during data stream buffering, a high-throughput, high-availability data reporting channel is established, and the data receiving gateway is scaled horizontally by expanding the message scheduling cluster.
3. The real-time data calculation and storage method based on stream and message scheduling according to claim 1, wherein the real-time data are stored in OpenTSDB, a distributed time-series database built on HBase that provides an HTTP API data interface.
4. The real-time data calculation and storage method based on stream and message scheduling according to claim 1, wherein, in the online streaming computation, the stream computing engine consumes the queue messages of the message scheduling module in real time and completes online real-time analysis by comparing the data against alarm rules stored in the relational database MySQL.
5. The real-time data calculation and storage method based on stream and message scheduling according to claim 1, wherein the analysis results are pushed to the data stream engine Kafka to fulfil the message subscription relationship.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011608430.3A CN112597205A (en) | 2020-12-30 | 2020-12-30 | Real-time data calculation and storage method based on stream and message scheduling |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011608430.3A CN112597205A (en) | 2020-12-30 | 2020-12-30 | Real-time data calculation and storage method based on stream and message scheduling |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112597205A true CN112597205A (en) | 2021-04-02 |
Family
ID=75206422
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011608430.3A Pending CN112597205A (en) | 2020-12-30 | 2020-12-30 | Real-time data calculation and storage method based on stream and message scheduling |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112597205A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113282611A (en) * | 2021-06-29 | 2021-08-20 | 深圳平安智汇企业信息管理有限公司 | Method and device for synchronizing stream data, computer equipment and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107704545A (en) * | 2017-11-08 | 2018-02-16 | 华东交通大学 | Stream processing method for massive railway distribution network information based on Storm and Kafka message communication |
CN111077870A (en) * | 2020-01-06 | 2020-04-28 | 浙江中烟工业有限责任公司 | Intelligent OPC data real-time acquisition and monitoring system and method based on stream calculation |
CN111177276A (en) * | 2020-01-06 | 2020-05-19 | 浙江中烟工业有限责任公司 | Spark calculation framework-based kinetic energy data processing system and method |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113282611A (en) * | 2021-06-29 | 2021-08-20 | 深圳平安智汇企业信息管理有限公司 | Method and device for synchronizing stream data, computer equipment and storage medium |
CN113282611B (en) * | 2021-06-29 | 2024-04-23 | 深圳平安智汇企业信息管理有限公司 | Method, device, computer equipment and storage medium for synchronizing stream data |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107729413B (en) | Regional traffic intelligent management system based on big data | |
CN108346010B (en) | Shared automobile scheduling method based on user demand analysis | |
CN111741073B (en) | Electric power data transmission system based on 5G communication network | |
Waluyo et al. | Research in mobile database query optimization and processing | |
WO2022127234A1 (en) | Cloud platform-based network comprehensive monitoring method and system | |
CN113299059B (en) | Data-driven road traffic control decision support method | |
CN101938814B (en) | Mobile terminal paging method and mobile call center equipment | |
CN110912200B (en) | Cascade hydropower station optimal scheduling system and method and safety power grid system | |
CN105049298A (en) | Method and system for monitoring cloud resource | |
CN108737519A (en) | A kind of industrial Internet of Things cloud service platform intelligent acquisition method | |
CN112597205A (en) | Real-time data calculation and storage method based on stream and message scheduling | |
CN110377653A (en) | A kind of real-time big data calculates and storage method and system | |
CN108001282B (en) | Charging device and method for realizing dynamic electricity price adjustment based on big data | |
CN105303292A (en) | Distribution data storage method and device | |
CN103118102B (en) | A kind of under cloud computing environment statistics and control system and the method for spatial data accessing rule | |
CN105682124B (en) | A kind of power-economizing method based on virtual network | |
WO2024001266A1 (en) | Video stream transmission control method and apparatus, device, and medium | |
CN116366692A (en) | High-performance intelligent edge terminal system | |
CN114266377A (en) | Mobile emergency power supply space scheduling system and method based on power Internet of things | |
CN105205605B (en) | Interactive service system of city intelligent portal terminal and electric power marketing terminal | |
CN109271395A (en) | Extensive real time data for comprehensive monitoring system updates delivery system and method | |
CN115100898A (en) | Cooperative computing task unloading method for urban intelligent parking management system | |
CN112181920A (en) | Internet of vehicles big data high-performance compression storage method and system | |
CN104468515B (en) | A kind of intelligent transformer substation communication method and system based on information centre's network | |
Zhang et al. | Framework design of urban traffic planning based on wireless network optimisation and cognitive sustainable data retrieval |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210402 |