CN107016128A - A kind of data processing method and device - Google Patents

Info

Publication number
CN107016128A
Authority
CN
China
Prior art keywords
data
message
oriented middleware
database
data processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710343831.2A
Other languages
Chinese (zh)
Inventor
臧勇真
戴雪冰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201710343831.2A
Publication of CN107016128A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/217 Database tuning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a data processing method and device. The method comprises the following steps: transmitting data from different data sources through message-oriented middleware; and performing stream processing on the data and saving the processed data to a database. Data is thus collected once and distributed multiple times, so it does not need to be extracted repeatedly, which relieves the pressure on the source database. The method also enables real-time computation on the data and improves data processing efficiency.

Description

A kind of data processing method and device
Technical field
The invention belongs to the field of computers, and in particular relates to a data processing method and device.
Background
As cloud computing technology continues to develop and be deployed, it has become a mainstay supporting the development of information technology across industries. Distributed clusters based on Hadoop and HBase are now a popular research topic in cloud computing at home and abroad. Hadoop HDFS distributed storage provides the storage layer for a cloud platform, while HBase provides its database service. Enterprises and government departments hold more and more data; to perform deep association analysis and data mining on that data, it must first be integrated and placed on a unified data platform. The emergence of big data processing technology copes well with this problem. However, the data sources of the various business systems are numerous and varied, and their data standards are not unified. How to collect the data from each business database into the big data platform and preprocess it, converting it into a unified data format or standard, therefore becomes the primary problem in big data processing.
Therefore, an efficient data processing scheme is urgently needed to solve the problem of complex data processing in big data platforms.
Summary of the invention
The present invention provides a data processing method and device to solve the above problems.
The present invention provides a data processing method. The method comprises the following steps:
transmitting data from different data sources through message-oriented middleware;
performing stream processing on the data, and saving the processed data to a database.
The present invention also provides a data processing device, comprising a data transmission module and a data processing module, wherein the data transmission module is connected with the data processing module;
the data transmission module is configured to transmit the data from different data sources through message-oriented middleware;
the data processing module is configured to perform stream processing on the data and save the processed data to the database.
By transmitting data from different data sources through message-oriented middleware, data is collected once and distributed multiple times, so it does not need to be extracted repeatedly, which relieves the pressure on the source database.
By performing stream processing on the data and saving the processed data to a database, real-time computation on the data is achieved and data processing efficiency is improved; the data is also preprocessed in preparation for subsequent offline computation.
Brief description of the drawings
The accompanying drawings described herein provide a further understanding of the present invention and form a part of this application. The illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not constitute an improper limitation of it. In the drawings:
Fig. 1 is a processing flowchart of the data processing method of Embodiment 1 of the present invention;
Fig. 2 is an architecture diagram of the data processing method of Embodiment 2 of the present invention;
Fig. 3 is a sequence diagram of the data processing method of Embodiment 3 of the present invention;
Fig. 4 is a structure diagram of the data processing device of Embodiment 4 of the present invention.
Detailed description
The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, provided there is no conflict, the embodiments in this application and the features in the embodiments may be combined with one another.
Fig. 1 is a processing flowchart of the data processing method of Embodiment 1 of the present invention, which comprises the following steps:
Step 102: Transmit data from different data sources through message-oriented middleware.
The message-oriented middleware is an MQ data bus. MQ serves as the data bus and performs data distribution. All data arriving at MQ is pushed there by ETL (Extract-Transform-Load, the process of extracting data from a source, transforming it, and loading it into a destination). The data is pushed to the message-oriented middleware in JSON format.
Step 104: Perform stream processing on the data, and save the processed data to a database.
Further, performing stream processing on the data includes:
monitoring the data, and compiling statistics on the resources of the data according to the monitoring results;
creating an index of the data.
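The two sub-steps of stream processing can be illustrated with a minimal Python sketch. It is not the Storm implementation the patent refers to. It assumes that "statistics on the resources of the data" means a running count of records per source table, and it assumes a hypothetical `text` field on each record to have something to index; neither detail is specified in the patent.

```python
from collections import Counter, defaultdict


def process_stream(records):
    """Consume records one at a time, updating statistics and an index."""
    stats = Counter()          # assumed statistic: records seen per table
    index = defaultdict(set)   # inverted index: token -> row keys containing it
    for rec in records:
        stats[rec["tablename"]] += 1
        # "text" is a hypothetical field used only for this illustration.
        for token in rec["text"].lower().split():
            index[token].add(rec["rowkey"])
    return stats, index


records = [
    {"tablename": "orders", "rowkey": "r1", "text": "Paid invoice"},
    {"tablename": "orders", "rowkey": "r2", "text": "Pending invoice"},
    {"tablename": "users", "rowkey": "r3", "text": "New account"},
]
stats, index = process_stream(records)
```

Because each record is handled as it arrives, the statistics and the index are current after every record rather than only at the end of a batch.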
Stream computing is performed using Storm. Many distributed computing systems can process large data streams in real time or near real time; frameworks for streaming big data processing include Storm, Spark, and Samza.
The database is an HBase database or a Hadoop database.
Using message-oriented middleware as the data bus makes it possible to cope with data peaks and allows a message to be produced once and consumed multiple times. Stream computing with Storm allows data to be processed in real time. During real-time processing, the data can be preprocessed, distributed, and indexed, which gives high real-time performance and saves computing and network resources.
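The "produced once, consumed multiple times" behaviour can be sketched with a toy in-memory bus in Python. This is an illustration only: the `MessageBus` class, the topic name, and the two list-based consumers are hypothetical stand-ins, not the MQ product used in the patent, and the sketch shows fan-out only, not the buffering that lets a real broker absorb data peaks.

```python
from collections import defaultdict


class MessageBus:
    """Toy in-memory bus: each published message reaches every subscriber."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # The producer publishes once; the bus fans the message out.
        for handler in self._subscribers[topic]:
            handler(message)


bus = MessageBus()
hbase_rows, third_party_rows = [], []       # stand-ins for two consumers
bus.subscribe("ingest", hbase_rows.append)
bus.subscribe("ingest", third_party_rows.append)
bus.publish("ingest", {"rowkey": "r1"})     # extracted from the source once
```

The source database is read a single time, yet both consumers receive the record, which is the property the text credits with relieving pressure on the source.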
Fig. 2 is an architecture diagram of the data processing method of Embodiment 2 of the present invention.
As shown in Fig. 2, data from each data source is pushed to the message-oriented middleware MQ. The data source may be a relational database; the extracted data is processed by ETL, and the final data is pushed to MQ in JSON form, with the format {'tablename': 'table name', 'rowkey': 'assembled rowkey', 'source': 'source'}. The data source may also be a third-party database.
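A message in the JSON format given above could be built as in the following Python sketch. The three keys come from the description; the function name `build_message` and the sample values (`orders`, `orders|0001`, `crm_mysql`) are hypothetical.

```python
import json


def build_message(table_name, row_key, source):
    """Serialize one ETL output record using the three keys from the description."""
    message = {"tablename": table_name, "rowkey": row_key, "source": source}
    # ensure_ascii=False keeps non-ASCII table names readable in the payload.
    return json.dumps(message, ensure_ascii=False)


payload = build_message("orders", "orders|0001", "crm_mysql")
decoded = json.loads(payload)   # what a consumer would see after parsing
```

Keeping the source table and the row key in every message lets a downstream consumer write the record to the right place without querying the source database again.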
MQ pushes the data to Storm, and Storm is responsible for writing the data to the HBase database and for real-time processing of the data: creating indexes, monitoring resources, and other business processing.
This solves the problem of collecting data from relational databases and other data sources and processing it for the Hadoop big data platform. Data is collected once and distributed multiple times, so it does not need to be extracted repeatedly, which relieves the pressure on the source database; the data can be computed on in real time; and the data can be preprocessed in preparation for subsequent offline computation.
Fig. 3 is a sequence diagram of the data processing method of Embodiment 3 of the present invention.
Data is extracted from the data source and processed by ETL; the processed data is sent to the data bus (MQ), and the data bus distributes the data to the HBase database and to third-party databases. Meanwhile, Storm processes the data in real time, performing full-text indexing and resource statistics, and stores the processed data in the HBase database.
Message-oriented middleware technology has two core functions: asynchrony and decoupling. Together these improve the overall operating efficiency of the application system, enhance its availability, stability, and scalability, and improve the user experience. Using the OneMM message-oriented middleware system, decoupling and asynchronous message passing can be achieved between the modules of an application system, or between the application system and other systems (such as ERP or payment systems). This replaces exchanging data directly at the database level, which causes excessive coupling of the underlying data between systems, and it addresses data exchange between remote, cross-region application systems. Through message-oriented middleware and stream computing, data from traditional relational databases and other data sources can be collected and processed into the big data platform. Data is collected once and distributed multiple times, so it does not need to be extracted repeatedly, which relieves the pressure on the source database; the data can be computed on in real time; and the data can be preprocessed in preparation for subsequent offline computation.
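Asynchrony and decoupling can be illustrated with a short Python sketch in which an `asyncio.Queue` stands in for the middleware. It is not the OneMM API, which the patent does not describe; the producer, the consumer, and the upper-casing "processing" step are hypothetical.

```python
import asyncio


async def producer(queue, items):
    for item in items:
        await queue.put(item)   # waits only if the bounded queue is full
    await queue.put(None)       # sentinel: no more data


async def consumer(queue, sink):
    while True:
        item = await queue.get()
        if item is None:
            break
        sink.append(item.upper())   # stand-in for real processing


async def main():
    queue = asyncio.Queue(maxsize=2)   # bounded buffer between the two sides
    sink = []
    await asyncio.gather(producer(queue, ["a", "b", "c"]), consumer(queue, sink))
    return sink


result = asyncio.run(main())
```

The producer and the consumer share only the queue: neither calls the other, so either side can be replaced or scaled without changing the other, which is the decoupling the paragraph describes.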
Fig. 4 is a structure diagram of the data processing device of Embodiment 4 of the present invention.
As shown in Fig. 4, a data processing device according to an embodiment of the present invention comprises a data transmission module 402 and a data processing module 404, wherein the data transmission module 402 is connected with the data processing module 404;
the data transmission module 402 is configured to transmit the data from different data sources through message-oriented middleware;
the data processing module 404 is configured to perform stream processing on the data and save the processed data to the database.
Further, the data processing module 404 comprises:
a statistics unit 4042, configured to monitor the data and compile statistics on the resources of the data according to the monitoring results;
an index creation unit 4044, configured to create an index of the data.
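The module structure above can be mirrored in a small Python sketch. The class names follow the modules and units named in the description; everything else is an assumption made for illustration: a dict stands in for the database, a direct method call stands in for the message-oriented middleware, and the statistic kept is a simple per-source record count.

```python
class StatisticsUnit:
    """Counterpart of unit 4042: counts records per source (assumed statistic)."""

    def __init__(self):
        self.counts = {}

    def observe(self, record):
        source = record["source"]
        self.counts[source] = self.counts.get(source, 0) + 1


class IndexCreationUnit:
    """Counterpart of unit 4044: maps each table name to its row keys."""

    def __init__(self):
        self.index = {}

    def add(self, record):
        self.index.setdefault(record["tablename"], []).append(record["rowkey"])


class DataProcessingModule:
    """Counterpart of module 404: processes each record and saves it."""

    def __init__(self, database):
        self.statistics = StatisticsUnit()
        self.indexer = IndexCreationUnit()
        self.database = database   # a dict standing in for the database

    def handle(self, record):
        self.statistics.observe(record)
        self.indexer.add(record)
        self.database[record["rowkey"]] = record


class DataTransmissionModule:
    """Counterpart of module 402: a direct call stands in for the middleware."""

    def __init__(self, processor):
        self.processor = processor

    def push(self, record):
        self.processor.handle(record)


database = {}
device = DataTransmissionModule(DataProcessingModule(database))
device.push({"tablename": "orders", "rowkey": "r1", "source": "crm"})
```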
The message-oriented middleware is an MQ data bus.
The data processing module 404 performs stream computing using Storm;
the database is an HBase database or a Hadoop database.
The data transmission module 402 pushes the data to the message-oriented middleware in JSON format.
By transmitting data from different data sources through message-oriented middleware, data is collected once and distributed multiple times, so it does not need to be extracted repeatedly, which relieves the pressure on the source database.
By performing stream processing on the data and saving the processed data to a database, real-time computation on the data is achieved and data processing efficiency is improved; the data is also preprocessed in preparation for subsequent offline computation.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. A data processing method, characterized by comprising the following steps:
transmitting data from different data sources through message-oriented middleware;
performing stream processing on the data, and saving the processed data to a database.
2. The method according to claim 1, characterized in that performing stream processing on the data includes:
monitoring the data, and compiling statistics on the resources of the data according to the monitoring results;
creating an index of the data.
3. The method according to claim 1, characterized in that the message-oriented middleware is an MQ data bus.
4. The method according to claim 1, characterized in that stream computing is performed using Storm;
the database is an HBase database or a Hadoop database.
5. The method according to any one of claims 1 to 4, characterized in that the data is pushed to the message-oriented middleware in JSON format.
6. A data processing device, characterized by comprising a data transmission module and a data processing module, wherein the data transmission module is connected with the data processing module;
the data transmission module is configured to transmit the data from different data sources through message-oriented middleware;
the data processing module is configured to perform stream processing on the data and save the processed data to a database.
7. The device according to claim 6, characterized in that the data processing module comprises:
a statistics unit, configured to monitor the data and compile statistics on the resources of the data according to the monitoring results;
an index creation unit, configured to create an index of the data.
8. The device according to claim 6, characterized in that the message-oriented middleware is an MQ data bus.
9. The device according to claim 6, characterized in that the data processing module performs stream computing using Storm;
the database is an HBase database or a Hadoop database.
10. The device according to any one of claims 6 to 9, characterized in that the data transmission module pushes the data to the message-oriented middleware in JSON format.
CN201710343831.2A 2017-05-16 2017-05-16 A kind of data processing method and device Pending CN107016128A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710343831.2A CN107016128A (en) 2017-05-16 2017-05-16 A kind of data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710343831.2A CN107016128A (en) 2017-05-16 2017-05-16 A kind of data processing method and device

Publications (1)

Publication Number Publication Date
CN107016128A true CN107016128A (en) 2017-08-04

Family

ID=59449987

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710343831.2A Pending CN107016128A (en) 2017-05-16 2017-05-16 A kind of data processing method and device

Country Status (1)

Country Link
CN (1) CN107016128A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130124483A1 (en) * 2011-11-10 2013-05-16 Treasure Data, Inc. System and method for operating a big-data platform
CN104573068A (en) * 2015-01-23 2015-04-29 四川中科腾信科技有限公司 Information processing method based on megadata
CN104657502A (en) * 2015-03-12 2015-05-27 浪潮集团有限公司 System and method for carrying out real-time statistics on mass data based on Hadoop
CN105677752A (en) * 2015-12-30 2016-06-15 深圳先进技术研究院 Streaming computing and batch computing combined processing system and method

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108629016A (en) * 2018-05-08 2018-10-09 成都信息工程大学 Support real-time stream calculation towards big data database control system, computer program
CN108629016B (en) * 2018-05-08 2022-05-24 成都信息工程大学 Big data base oriented control system supporting real-time stream computing and computer program
CN109492012A (en) * 2018-10-31 2019-03-19 厦门安胜网络科技有限公司 A kind of method, apparatus and storage medium of data real-time statistics and retrieval
CN110334075A (en) * 2019-04-04 2019-10-15 平安科技(深圳)有限公司 Data migration method and relevant device based on message-oriented middleware
CN110334075B (en) * 2019-04-04 2023-06-20 平安科技(深圳)有限公司 Data migration method based on message middleware and related equipment

Similar Documents

Publication Publication Date Title
CN110784419B (en) Method and system for visualizing professional railway electric service data
CN109254982A (en) A kind of stream data processing method, system, device and computer readable storage medium
CN104424229B (en) A kind of calculation method and system that various dimensions are split
CN104536965B (en) A kind of data query display systems under the conditions of big data and method
CN112860695B (en) Monitoring data query method, device, equipment, storage medium and program product
CN108038207A (en) A kind of daily record data processing system, method and server
CN105760449B (en) A kind of cloud method for pushing towards multi-source heterogeneous data
CN107016128A (en) A kind of data processing method and device
CN112948492A (en) Data processing system, method and device, electronic equipment and storage medium
CN106910146A (en) A kind of isomery educational data switching plane and method based on Stream Processing technology
CN103455633A (en) Method of distributed analysis for massive network detailed invoice data
CN114707914B (en) Supply and marketing management center platform system based on SaaS framework
CN108924228B (en) Industrial internet optimization system based on edge calculation
CN112749940A (en) Electric power operation supporting platform
CN108573029A (en) A kind of method, apparatus and storage medium obtaining network access relational data
CN108268569A (en) The acquisition of water resource monitoring data and analysis system and method based on big data technology
CN107636655A (en) Data are provided in real time to service(DaaS)System and method
CN104821958B (en) Electricity consumption data packet interactive interface method based on WebService
CN103916368B (en) A kind of method and device for realizing data processing between different data sources
CN105610823A (en) Stream media processing method and processing system architecture based on task vectors
CN111143651B (en) Data acquisition and analysis system for new media integrated operation management
CN107451301A (en) Processing method, device, equipment and the storage medium of bill mail are delivered in real time
CN104391949B (en) A kind of wide-area data method for managing resource based on data dictionary
CN111049898A (en) Method and system for realizing cross-domain architecture of computing cluster resources
CN207691854U (en) A kind of Government Affair Information System based on data share exchange

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170804