CN105677836A - Big data processing and solving system simultaneously supporting offline data and real-time online data - Google Patents


Info

Publication number
CN105677836A
Authority
CN
China
Prior art keywords
data
module
real
distributed
configuration information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610005212.8A
Other languages
Chinese (zh)
Inventor
许丹霞
刘寅
汪伟
郑宇�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Huishang Rongtong Information Technology Co Ltd
Original Assignee
Beijing Huishang Rongtong Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Huishang Rongtong Information Technology Co Ltd
Priority to CN201610005212.8A
Publication of CN105677836A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a big data processing system that simultaneously supports offline data and real-time online data. The system comprises a data acquisition module, a data preprocessing module, a distributed storage module, a distributed real-time stream computing module, an offline data processing module, a database, a comprehensive data analysis and query module, a comprehensive display module, and a unified configuration center. The system can process both real-time data and offline data, and does so promptly and efficiently.

Description

A big data processing system that simultaneously supports offline data and real-time online data
Technical field
The present invention relates to a big data processing scheme, and in particular to a complete big data processing system that simultaneously supports offline data and real-time online data.
Background art
As technology develops, people increasingly need to build complex, low-latency processing systems. Neither of the two tools available to them fully solves the problem: scalable but high-latency batch processing systems handle historical data, while low-latency stream processing systems cannot reprocess results. Connecting the two tools, however, makes it possible to build a workable solution.
The Hadoop framework brought batch data processing, but real-time processing of web-scale big data remains a challenge. Many technologies can be used to build such a complete data processing system, but selecting suitable tools and arranging them to work together is complicated and arduous.
Summary of the invention
In view of the above, the present invention proposes a complete big data processing scheme that simultaneously supports offline data and real-time online data. It includes:
1. A configurable data acquisition module that can collect data from multiple data sources and that introduces a distributed fault detection mechanism to improve the stability and reliability of data acquisition.
2. A configurable data preprocessing module that reads configuration information from the unified configuration center and loads the corresponding processing program.
3. A distributed file storage module with an improved algorithm: a method for evaluating node performance is proposed and the HDFS storage algorithm is improved so that it can store massive data more quickly, efficiently, and accurately.
4. A high-performance real-time data processing module that uses the Storm distributed stream processing framework to process massive real-time data and stores the computed results in the database in real time.
5. A high-performance offline data processing module that uses the Hadoop MapReduce programming model and proposes a task allocation algorithm based on inferred dynamic node performance, improving the performance and stability of the offline data processing module.
6. A highly customizable comprehensive display module that provides query services based on a web container and visualizes analysis results with ECharts. Users can define their own layout by dragging and dropping to build personalized display pages, and charts support linkage and drill-down between one another. The module also provides an interface for maintaining the unified configuration center.
To achieve the purpose of the present invention, the following technical solutions are adopted:
A big data processing system that simultaneously supports offline data and real-time online data, comprising:
a data acquisition module, a data preprocessing module, a distributed storage module, a distributed real-time stream computing module, an offline data processing module, a database, a comprehensive data analysis and query module, a comprehensive display module, and a unified configuration center;
wherein:
The data acquisition module is configured to read configuration information from the unified configuration center, read data from a relational database according to that configuration information, and import the data into the distributed file storage module; to receive processing requests sent by the application cluster and supply the received request data directly to the distributed real-time stream computing module; and to send the application cluster's log files to local disk for backup storage;
The data preprocessing module is configured to read configuration information from the unified configuration center, read the application log files stored on local disk, process the data and store the result on local disk, and upload the files to the distributed file storage module;
The distributed storage module is configured to store massive data;
The distributed real-time stream computing module is configured to read data from the data acquisition module, read the configuration information of the unified configuration center, perform real-time computation, and store the results in the database; the offline data processing module is configured to process the data stored in the distributed file storage module and to write each indicator to the database once it has been computed;
The database is configured to store data;
The comprehensive data analysis and query module is configured to access the database and to provide query interfaces for the various indicators;
The comprehensive display module is configured to provide query services based on a web container and to visualize analysis results;
The unified configuration center is configured to configure the application cluster.
In the big data processing system, preferably: the data acquisition module includes a message middleware module, which receives the processing requests sent by the application cluster and supplies the received data directly to the distributed real-time stream computing module; the message middleware module also sends the application cluster's log files to local disk for backup storage.
In the big data processing system, preferably: the preprocessing performed by the data preprocessing module includes cleaning and reducing the data and compressing data of the same category.
In the big data processing system, preferably, the distributed file storage module includes storage nodes and a node performance evaluation module;
wherein:
(1) The node performance evaluation module evaluates the performance of each server in the application cluster and generates a dynamic node performance reference file, which is updated periodically as required; the evaluation of server node performance in the cluster covers the server's CPU processing capability, memory performance, and disk I/O performance;
(2) When a user uploads a file, the node performance evaluation module first calculates the ratio of a storage node's performance value to the sum of all nodes' performance values, and then uses that ratio to determine the proportion of the cluster's total stored data that the node may store.
In the big data processing system, preferably:
The performance value P_i of server node i is described by the following function, where C_i denotes the CPU performance value, M_i the memory performance value, D_i the disk I/O performance value, and W_i the network I/O performance value:
$$P_i = \alpha C_i + \beta M_i + \gamma D_i + \delta W_i$$
$$\alpha + \beta + \gamma + \delta = 1$$
In the formulas above, the four parameters α, β, γ, and δ are the weights that express how strongly each indicator affects server performance.
A big data processing method that simultaneously supports offline data and real-time online data, comprising:
Reading configuration information from the unified configuration center, reading data from a relational database according to that configuration information, and importing the data into the distributed file storage module; receiving processing requests sent by the application cluster and supplying the received request data directly to the distributed real-time stream computing module; and sending the application cluster's log files to local disk for backup storage;
Reading configuration information from the unified configuration center, reading the application log files stored on local disk, preprocessing the data and storing the result on local disk, and uploading the files to the distributed file storage module;
Reading data from the data acquisition module, reading the configuration information of the unified configuration center, performing real-time computation, and storing the results in the database; the offline data processing module processes the data stored in the distributed file storage module and writes each indicator to the database once it has been computed.
In the big data processing method, preferably: preprocessing the data includes cleaning and reducing the data and compressing data of the same category.
Brief description of the drawings
Fig. 1 is a schematic diagram of the big data processing system provided by the invention, which simultaneously supports offline data and real-time online data;
Fig. 2 is a schematic diagram of the improved scheduling algorithm of the invention.
Detailed description of the invention
As shown in Fig. 1, the big data processing system that simultaneously supports offline data and real-time online data includes: a data acquisition module, a data preprocessing module, a unified configuration center, a distributed file storage module, a distributed real-time stream computing module, an offline data processing module, a database, a comprehensive data analysis and query module, and a comprehensive display module.
Data acquisition module:
(1) The module reads configuration information from the unified configuration center and, through periodic scheduled jobs, incrementally imports data from a relational database (for example MySQL or Oracle) into the distributed file storage module, for example HDFS. For instance, it imports the user information table, the production plan table, and similar tables stored in an Oracle database; these serve as base data that is combined with log data for analysis, computation, and other processing during subsequent log processing. From the configuration center, the data acquisition module reads the settings for these exports: the connection information of the source database, the tables to export from, the export mode (full or incremental), the start time of the export, the data types, and so on.
(2) The module includes a message middleware module, which may be WebSphere MQ message middleware. This middleware module receives the processing requests sent by the application cluster and supplies the received data directly to the distributed real-time stream computing module for real-time computation. Through the message middleware module, the log files of each application (the application cluster log files) are sent to local disk for backup storage. This part does not process or modify the data in any way, which ensures that the data is stored intact. The log files are supplied directly to the data preprocessing module. The acquisition module may also include a scheduling module and a synchronization task management module: the synchronization task management module synchronizes the data collected from the database to HDFS, and the scheduling module triggers that data collection on a timer.
(3) The data acquisition module is a distributed system with a multi-node structure, and data must be transferred between different nodes. Node failures, process failures, and node overload can therefore occur, and all of these cause data loss. To keep data transmission safe, the invention also proposes a distributed fault detection framework based on the data acquisition module. In this framework, the data sources collected by the data acquisition module comprise two kinds of nodes: a master node and controlled nodes. A monitoring node management server acts as the master node, and each application node acts as a controlled node. Monitoring and control of each application are carried out at its application node, and the monitoring master node exchanges data with each controlled node through heartbeat communication. Each controlled node must send heartbeat data periodically to report the status of the application on that node, which may include the application's name, its storage location, its IP address, and node status such as the current CPU load. When a controlled node sends no heartbeat data within a heartbeat period, it is judged to have failed temporarily. Raising an alarm when a node fails helps the responsible staff clear the fault quickly and improves the fault tolerance and reliability of data transmission. In addition, an administrator can modify a controlled node's configuration through a web interface, and the monitoring node notifies the controlled node through heartbeat communication to update its configuration.
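The heartbeat-based fault detection described in item (3) can be illustrated with a minimal sketch. The class name `HeartbeatMonitor`, its methods, and the status fields are hypothetical; the patent does not specify a data structure or a timeout rule beyond "no heartbeat within one heartbeat period".

```python
import time
from typing import Dict, List, Optional


class HeartbeatMonitor:
    """Master-node bookkeeping: record heartbeats and report nodes that missed a period.

    Hypothetical sketch; names and fields are assumptions, not taken from the patent.
    """

    def __init__(self, heartbeat_period: float):
        self.heartbeat_period = heartbeat_period
        self.last_seen: Dict[str, float] = {}   # node id -> time of last heartbeat
        self.status: Dict[str, dict] = {}       # node id -> last reported status

    def receive(self, node_id: str, status: dict, now: Optional[float] = None) -> None:
        """Record a heartbeat carrying the node's status (app name, IP, CPU load, ...)."""
        now = time.monotonic() if now is None else now
        self.last_seen[node_id] = now
        self.status[node_id] = status

    def failed_nodes(self, now: Optional[float] = None) -> List[str]:
        """Return the nodes whose last heartbeat is older than one heartbeat period."""
        now = time.monotonic() if now is None else now
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.heartbeat_period
        )
```

A monitoring loop would call `failed_nodes()` once per period and raise an alarm for every node returned; the `now` argument exists only so that the behaviour can be checked deterministically.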
Data preprocessing module: The module reads configuration information from the unified configuration center, reads the log files of each application that the message middleware module sent to local disk, and starts the processing program according to the configuration information. It deduplicates, cleans, and reduces the data, handles abnormal records, compresses data of the same category, and stores the results on local disk after classifying them by business. It then uploads the files to the distributed file storage module, HDFS. The data preprocessing module may include a log collection module for uploading log files; the log collection module may be a Flume system.
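As a rough illustration of the preprocessing steps named above (deduplication, cleaning, grouping by category, compression), the following sketch processes log lines in memory. The `<category>|<payload>` line format is an assumption made for the example; the patent does not define a log format, and the data reduction step is not shown.

```python
import gzip
from collections import defaultdict
from typing import Dict, Iterable


def preprocess(lines: Iterable[str]) -> Dict[str, bytes]:
    """Deduplicate, clean, group by category, and gzip each group.

    Assumes each record looks like "<category>|<payload>" (hypothetical format).
    """
    seen = set()
    groups = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        if not line or "|" not in line:
            continue                      # cleaning: drop blank or malformed records
        if line in seen:
            continue                      # deduplication: keep the first occurrence only
        seen.add(line)
        category, payload = line.split("|", 1)
        groups[category].append(payload)  # classify records by category
    # Compress the records of each category together.
    return {
        category: gzip.compress("\n".join(rows).encode("utf-8"))
        for category, rows in groups.items()
    }
```

In a real deployment the malformed records would more likely be written to an exception log than silently dropped, and the compressed groups would be written to local disk before upload.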
Distributed file storage module: Hadoop's HDFS distributed file storage system may be used. The invention proposes a novel method for evaluating node performance and uses it to improve the storage algorithm of the distributed file storage module HDFS. The implementation is as follows. The distributed file storage module includes storage nodes and a node performance evaluation module (the NameNode):
(1) The node performance evaluation module evaluates the performance of each server in the cluster and generates a dynamic node performance reference file, which is updated periodically as required. When evaluating the performance of server nodes in a Hadoop cluster, the invention focuses mainly on the server's CPU processing capability, memory performance, disk I/O performance, and network I/O performance. The performance value P_i of a server node i can be described by the following function, where C_i denotes the CPU performance value, M_i the memory performance value, D_i the disk performance value, and W_i the network performance value:
$$P_i = \alpha C_i + \beta M_i + \gamma D_i + \delta W_i$$
$$\alpha + \beta + \gamma + \delta = 1$$
In the formulas above, the four parameters α, β, γ, and δ are the weights expressing how strongly each indicator affects server performance, and the weights differ between application scenarios. For example, when a Hadoop cluster is used for data mining, CPU performance dominates. In practice, therefore, the weight of each parameter must be adjusted to the specific application scenario. Once the values of α, β, γ, and δ are fixed, the performance value P_i of each node is obtained by testing with performance benchmark tools.
(2) When a user uploads a file, the NameNode must select, according to some algorithm, the nodes that will store the file's data blocks. During performance evaluation, the node performance evaluation module assigns each node a performance value. It first calculates the ratio of a node's performance value to the sum of all nodes' performance values, and then uses that ratio to determine the proportion of the cluster's total stored data that the node may store. In this way, when a file is stored, each node receives a share of the data blocks that matches its performance.
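The two steps above, the weighted performance value and the performance-proportional storage share, can be sketched as follows. The function names are hypothetical, and rounding the shares to whole blocks with a largest-remainder rule is an assumption of this example; the patent only states that the share follows the performance ratio.

```python
import math
from typing import Dict


def performance_value(cpu: float, memory: float, disk: float, network: float,
                      alpha: float, beta: float, gamma: float, delta: float) -> float:
    """P_i = alpha*C_i + beta*M_i + gamma*D_i + delta*W_i, with weights summing to 1."""
    if not math.isclose(alpha + beta + gamma + delta, 1.0):
        raise ValueError("weights must sum to 1")
    return alpha * cpu + beta * memory + gamma * disk + delta * network


def storage_shares(perf: Dict[str, float]) -> Dict[str, float]:
    """Share of the cluster's data each node may store: P_i divided by the sum of all P_j."""
    total = sum(perf.values())
    if total <= 0:
        raise ValueError("total performance must be positive")
    return {node: p / total for node, p in perf.items()}


def allocate_blocks(perf: Dict[str, float], n_blocks: int) -> Dict[str, int]:
    """Split n_blocks across nodes in proportion to performance.

    Whole-block rounding uses the largest-remainder rule (an assumption of this sketch).
    Performance values are assumed to be non-negative.
    """
    total = sum(perf.values())
    if total <= 0:
        raise ValueError("total performance must be positive")
    exact = {node: p * n_blocks / total for node, p in perf.items()}
    alloc = {node: int(x) for node, x in exact.items()}
    leftover = n_blocks - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - alloc[n], reverse=True)
    for node in by_remainder[:leftover]:
        alloc[node] += 1                  # hand out the blocks lost to rounding
    return alloc
```

For a CPU-heavy workload such as data mining, one would pick a larger `alpha`; the example weights used to exercise this code are arbitrary.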
Distributed real-time stream computing module: To meet timeliness requirements, this module is implemented with Storm, an open-source big data processing cluster system. It reads data from the message middleware module, reads the configuration information of the unified configuration center, performs real-time computation according to that configuration, and stores the results in a database such as HBase or Oracle.
Offline data processing module: This module processes the massive data in the distributed file system, writes each indicator (for example the day's ranking of goods by order volume, or the ranking of merchandise sales by category) to the database once it has been computed, and stores the processed data in the distributed file system. By studying the task scheduling mechanism of MapReduce, the invention proposes a task allocation algorithm based on inferred dynamic node performance that is suited to heterogeneous environments, in order to improve the performance and stability of offline data processing. The offline processing module includes a task allocation node and task processing nodes.
The invention uses a node's data processing rate to represent its performance. The data processing rate of a node is the amount of data the node processes per unit time. In a heterogeneous Hadoop cluster, the data processing rate accurately reflects the performance differences between nodes.
When processing data, the offline data processing module allocates tasks as follows:
(1) When the task allocation node is about to allocate a task, it combines the performance of each task processing node, the amount of data that must be transferred, and the network performance of each node in an overall calculation, and selects the most suitable node to run the task on the basis of that analysis. In a default Hadoop cluster, the node on which a task runs is random; in the improved algorithm of the invention, the node that runs a task is selected according to node performance and load, which improves Hadoop's performance and stability to some extent.
The specific algorithm flow is shown in Fig. 2. The task allocation node first obtains the performance values of the nodes in the cluster from the node dynamic performance inference module; at the same time it obtains other node information through heartbeat communication with the task processing nodes, and builds a node state list. When the task allocation node begins allocating Reduce tasks, it allocates them actively. It queries the node information in the node state list in order of node performance from high to low, then queries each node's load, and assigns one Reduce task to the best-performing node that has idle Reduce task capacity. It then selects, in turn, as many nodes as there are tasks still to be run.
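The allocation flow just described can be approximated by the sketch below: nodes are ranked by performance, and each node with idle Reduce capacity receives one task per pass. `NodeState` and `assign_reduce_tasks` are hypothetical names, and making further passes when there are more tasks than nodes is an assumption of the example rather than something the patent spells out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeState:
    """One entry of the node state list (hypothetical shape)."""
    name: str
    performance: float       # e.g. data processed per unit time
    free_reduce_slots: int   # idle Reduce capacity reported via heartbeat


def assign_reduce_tasks(nodes: List[NodeState], n_tasks: int) -> List[str]:
    """Return node names in assignment order, one entry per assigned task."""
    ranked = sorted(nodes, key=lambda n: n.performance, reverse=True)
    free = {n.name: n.free_reduce_slots for n in ranked}
    assignments: List[str] = []
    while len(assignments) < n_tasks:
        progressed = False
        for node in ranked:               # highest performance first
            if len(assignments) == n_tasks:
                break
            if free[node.name] > 0:       # skip nodes with no idle Reduce capacity
                free[node.name] -= 1
                assignments.append(node.name)
                progressed = True
        if not progressed:                # no capacity left anywhere; stop early
            break
    return assignments
```

If the cluster runs out of idle capacity, the function returns fewer assignments than requested, and the remaining tasks would wait for the next heartbeat round.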
Comprehensive data analysis and query module: This module accesses databases such as HBase and Oracle and provides query interfaces for the various indicators. It also provides a maintenance interface for the unified configuration center.
Comprehensive display module: This module provides query services based on a web container and visualizes analysis results, offering intuitive, vivid, interactive, and personalized data visualization charts. Innovative features such as drag-and-drop recalculation, data view, and value range roaming greatly improve the user experience and give users the ability to mine and integrate data. The comprehensive display module also provides an interface for maintaining the unified configuration center, mainly for configuring information such as data source types, acquisition server addresses, acquisition strategies, and preprocessing strategies.
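To make the configuration items concrete, the sketch below shows one possible shape for an import entry in the unified configuration center and how a scheduled import job might turn it into a query. Every field name, the `time_column` default, and the `?` placeholder style are assumptions for illustration; the patent lists only the kinds of information configured (data source type, server address, tables, full or incremental mode, start time).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ImportConfig:
    """Hypothetical import entry held by the unified configuration center."""
    source_type: str                   # e.g. "oracle" or "mysql"
    server_address: str                # connection information of the source database
    table: str                         # table to export from
    mode: str                          # "full" or "incremental"
    start_time: Optional[str] = None   # where an incremental export starts
    time_column: str = "updated_at"    # assumed column used to detect new rows


def build_export_query(cfg: ImportConfig) -> Tuple[str, tuple]:
    """Build the SELECT statement and bound parameters for one scheduled import run."""
    if cfg.mode == "full":
        return f"SELECT * FROM {cfg.table}", ()
    if cfg.mode == "incremental":
        if cfg.start_time is None:
            raise ValueError("incremental export needs a start time")
        # The value is passed as a bound parameter; "?" is one common placeholder style.
        return (f"SELECT * FROM {cfg.table} WHERE {cfg.time_column} >= ?",
                (cfg.start_time,))
    raise ValueError(f"unknown export mode: {cfg.mode}")
```

Keeping the export mode and start time in configuration rather than in code is what lets the acquisition behaviour be changed from the maintenance interface without redeploying the module.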
The invention can process massive data both offline and in real time, meeting users' needs for batch processing and for timeliness at once. It also proposes concrete measures for improving system performance, making the storage and processing of massive data more efficient. Configuration is done at the system level, so much of the work can be completed through page-based configuration. The display pages are personalized and provide a better user experience.

Claims (6)

1. A big data processing system simultaneously supporting offline data and real-time online data, characterized in that it comprises a data acquisition module, a data preprocessing module, a distributed storage module, a distributed real-time stream computing module, an offline data processing module, a database, a comprehensive data analysis and query module, a comprehensive display module, and a unified configuration center;
wherein:
The data acquisition module is configured to read configuration information from the unified configuration center, read data from a relational database according to that configuration information, and import the data into the distributed file storage module; to receive processing requests sent by the application cluster and supply the received request data directly to the distributed real-time stream computing module; and to send the application cluster's log files to local disk for backup storage;
The data preprocessing module is configured to read configuration information from the unified configuration center, read the application log files stored on local disk, preprocess the log file data and store the result on local disk, and upload the preprocessed log file data to the distributed storage module;
The distributed storage module is configured to store massive data;
The distributed real-time stream computing module is configured to read data from the data acquisition module, read the configuration information of the unified configuration center, perform real-time computation on the data read from the data acquisition module according to that configuration information, and store the results in the database;
The offline data processing module is configured to process the data stored in the distributed storage module and to write each indicator to the database once it has been computed;
The database is configured to store data;
The comprehensive data analysis and query module is configured to access the database and to provide query interfaces for the various indicators;
The comprehensive display module is configured to provide query services based on a web container and to visualize analysis results;
The unified configuration center is configured to configure the application cluster.
2. The big data processing system according to claim 1, characterized in that: the data acquisition module includes a message middleware module, which receives the processing requests sent by the application cluster and supplies the received request data directly to the distributed real-time stream computing module; the message middleware module also sends the application cluster's log files to local disk for backup storage.
3. The big data processing system according to claim 1, characterized in that: the preprocessing performed by the data preprocessing module includes cleaning and reducing the data and compressing data of the same category.
4. The big data processing system according to claim 1, characterized in that: the distributed storage module includes storage nodes and a node performance evaluation module;
wherein:
(1) The node performance evaluation module evaluates the performance of each server in the application cluster and generates a dynamic node performance reference file, which is updated periodically as required; the evaluation of server node performance in the cluster covers the server's CPU processing capability, memory performance, disk I/O performance, and network I/O performance;
(2) When a user uploads a file, the node performance evaluation module first calculates the ratio of a storage node's performance value to the sum of all nodes' performance values, and then uses that ratio to determine the proportion of the cluster's total stored data that the node may store.
5. A big data processing method simultaneously supporting offline data and real-time online data, characterized in that it comprises:
Reading configuration information from the unified configuration center, reading data from a relational database according to that configuration information, and importing the data into the distributed file storage module; receiving processing requests sent by the application cluster and supplying the received request data directly to the distributed real-time stream computing module; and sending the application cluster's log files to local disk for backup storage;
Reading configuration information from the unified configuration center, reading the application log files stored on local disk, preprocessing the log file data and storing the result on local disk, and uploading the preprocessed log file data to the distributed file storage module;
Reading data from the data acquisition module, reading the configuration information of the unified configuration center, performing real-time computation on the data read from the data acquisition module according to that configuration information, and storing the results in the database; the offline data processing module processes the data stored in the distributed file storage module and writes each indicator to the database once it has been computed.
6. The big data processing method according to claim 5, characterized in that: preprocessing the data includes cleaning and reducing the data and compressing data of the same category.
CN201610005212.8A 2016-01-05 2016-01-05 Big data processing and solving system simultaneously supporting offline data and real-time online data Pending CN105677836A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610005212.8A CN105677836A (en) 2016-01-05 2016-01-05 Big data processing and solving system simultaneously supporting offline data and real-time online data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610005212.8A CN105677836A (en) 2016-01-05 2016-01-05 Big data processing and solving system simultaneously supporting offline data and real-time online data

Publications (1)

Publication Number Publication Date
CN105677836A (en) 2016-06-15

Family

ID=56298913

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610005212.8A Pending CN105677836A (en) 2016-01-05 2016-01-05 Big data processing and solving system simultaneously supporting offline data and real-time online data

Country Status (1)

Country Link
CN (1) CN105677836A (en)

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250444A (en) * 2016-07-27 2016-12-21 北京集奥聚合科技有限公司 The real-time Input System of a kind of heterogeneous data source and method
CN106446170A (en) * 2016-09-27 2017-02-22 努比亚技术有限公司 Data querying method and device
CN106790572A (en) * 2016-12-27 2017-05-31 广州华多网络科技有限公司 The system and method that a kind of distributed information log is collected
CN106920158A (en) * 2017-03-22 2017-07-04 北京再塑宝科技有限公司 Order real-time monitoring system based on Storm and Kafka technologies
CN106940724A (en) * 2017-03-20 2017-07-11 天津大学 A kind of many pattern convergence analysis processing methods towards big data
CN106991070A (en) * 2016-10-11 2017-07-28 阿里巴巴集团控股有限公司 Real-time computing technique and device
CN107016133A (en) * 2017-05-24 2017-08-04 成都享之道网络科技有限公司 Based on the online big data system with offline double processing
CN107103051A (en) * 2017-04-05 2017-08-29 成都爱途享科技有限公司 Set up the quick loading device in processing data
CN107193988A (en) * 2017-05-30 2017-09-22 梅婕 The quick method for cleaning of data
CN107395669A (en) * 2017-06-01 2017-11-24 华南理工大学 A kind of collecting method and system based on the real-time distributed big data of streaming
CN107464056A (en) * 2017-08-03 2017-12-12 武汉远众科技有限公司 A kind of offline real-time task type collecting method based on mobile phone
CN107577809A (en) * 2017-09-27 2018-01-12 北京锐安科技有限公司 Offline small documents processing method and processing device
CN107870982A (en) * 2017-10-02 2018-04-03 深圳前海微众银行股份有限公司 Data processing method, system and computer-readable recording medium
CN108228830A (en) * 2018-01-03 2018-06-29 广东工业大学 Data processing system
CN108519914A (en) * 2018-04-09 2018-09-11 腾讯科技(深圳)有限公司 Big data calculation method and system and computer equipment
CN108573348A (en) * 2018-04-18 2018-09-25 鑫涌算力信息科技(上海)有限公司 Financial index distributed computing method and system
CN108595644A (en) * 2018-04-26 2018-09-28 宁波银行股份有限公司 Big data platform operation management system
CN108629016A (en) * 2018-05-08 2018-10-09 成都信息工程大学 Big data base oriented control system supporting real-time stream computing and computer program
CN108920498A (en) * 2018-05-23 2018-11-30 阿里巴巴集团控股有限公司 Data query method, device and equipment
CN108984610A (en) * 2018-06-11 2018-12-11 华南理工大学 Method and system for offline and real-time data processing based on a big data framework
CN108985981A (en) * 2018-06-28 2018-12-11 北京奇虎科技有限公司 Data processing system and method
CN109327351A (en) * 2018-09-12 2019-02-12 拉扎斯网络科技(上海)有限公司 Real-time log data collection method, device, electronic equipment and storage medium
CN109359109A (en) * 2018-08-23 2019-02-19 阿里巴巴集团控股有限公司 Data processing method and system based on distributed stream computing
CN109408567A (en) * 2018-09-11 2019-03-01 广东布田电子商务有限公司 Network architecture of a big data processing platform
CN109739925A (en) * 2019-01-07 2019-05-10 北京云基数技术有限公司 Data processing system and method based on big data
CN110110170A (en) * 2019-04-30 2019-08-09 北京字节跳动网络技术有限公司 Data processing method, apparatus, medium and electronic equipment
CN110489476A (en) * 2019-08-22 2019-11-22 金瓜子科技发展(北京)有限公司 Data processing method, system and server
CN110659270A (en) * 2019-08-19 2020-01-07 苏宁金融科技(南京)有限公司 Data processing and transmitting method and device
CN111061799A (en) * 2019-12-23 2020-04-24 集奥聚合(北京)人工智能科技有限公司 Distributed big data processing system
CN111159280A (en) * 2020-01-02 2020-05-15 南京欣网通信科技股份有限公司 Big data processing system
CN111949637A (en) * 2020-08-18 2020-11-17 上海七牛信息技术有限公司 Log data processing method, device and system, electronic equipment and storage medium
CN113360268A (en) * 2021-06-23 2021-09-07 成都房联云码科技有限公司 Weak centralized distributed scheduling system based on container operation
CN113407617A (en) * 2021-06-25 2021-09-17 交控科技股份有限公司 Real-time and off-line service unified processing method and device based on big data technology
CN113468246A (en) * 2021-07-20 2021-10-01 上海齐屹信息科技有限公司 Intelligent data counting and subscribing system and method based on OLTP

Citations (3)

Publication number Priority date Publication date Assignee Title
CN102176696A (en) * 2011-02-25 2011-09-07 曙光信息产业(北京)有限公司 Multi-computer system
CN104036025A (en) * 2014-06-27 2014-09-10 蓝盾信息安全技术有限公司 Distribution-base mass log collection system
CN105138615A (en) * 2015-08-10 2015-12-09 北京思特奇信息技术股份有限公司 Method and system for building big data distributed log

Cited By (43)

Publication number Priority date Publication date Assignee Title
CN106250444A (en) * 2016-07-27 2016-12-21 北京集奥聚合科技有限公司 Real-time input system and method for heterogeneous data sources
CN106446170A (en) * 2016-09-27 2017-02-22 努比亚技术有限公司 Data querying method and device
CN106991070B (en) * 2016-10-11 2021-02-26 创新先进技术有限公司 Real-time computing method and device
CN106991070A (en) * 2016-10-11 2017-07-28 阿里巴巴集团控股有限公司 Real-time computing method and device
CN106790572A (en) * 2016-12-27 2017-05-31 广州华多网络科技有限公司 Distributed log collection system and method
CN106940724A (en) * 2017-03-20 2017-07-11 天津大学 Multi-mode fusion analysis and processing method for big data
CN106920158A (en) * 2017-03-22 2017-07-04 北京再塑宝科技有限公司 Order real-time monitoring system based on Storm and Kafka technologies
CN107103051A (en) * 2017-04-05 2017-08-29 成都爱途享科技有限公司 Set up the quick loading device in processing data
CN107016133A (en) * 2017-05-24 2017-08-04 成都享之道网络科技有限公司 Big data system based on online and offline dual processing
CN107193988A (en) * 2017-05-30 2017-09-22 梅婕 Method for quickly cleaning data
CN107395669A (en) * 2017-06-01 2017-11-24 华南理工大学 Data acquisition method and system based on streaming real-time distributed big data
CN107395669B (en) * 2017-06-01 2020-04-07 华南理工大学 Data acquisition method and system based on streaming real-time distributed big data
CN107464056A (en) * 2017-08-03 2017-12-12 武汉远众科技有限公司 Offline real-time task-type collection method based on mobile phone
CN107577809A (en) * 2017-09-27 2018-01-12 北京锐安科技有限公司 Offline small file processing method and device
CN107870982A (en) * 2017-10-02 2018-04-03 深圳前海微众银行股份有限公司 Data processing method, system and computer-readable recording medium
CN107870982B (en) * 2017-10-02 2021-04-23 深圳前海微众银行股份有限公司 Data processing method, system and computer readable storage medium
CN108228830A (en) * 2018-01-03 2018-06-29 广东工业大学 Data processing system
CN108519914A (en) * 2018-04-09 2018-09-11 腾讯科技(深圳)有限公司 Big data calculation method and system and computer equipment
CN108519914B (en) * 2018-04-09 2021-10-26 腾讯科技(深圳)有限公司 Big data calculation method and system and computer equipment
CN108573348A (en) * 2018-04-18 2018-09-25 鑫涌算力信息科技(上海)有限公司 Financial index distributed computing method and system
CN108573348B (en) * 2018-04-18 2021-01-01 鑫涌算力信息科技(上海)有限公司 Financial index distributed computing method and system
CN108595644A (en) * 2018-04-26 2018-09-28 宁波银行股份有限公司 Big data platform operation management system
CN108629016A (en) * 2018-05-08 2018-10-09 成都信息工程大学 Big data base oriented control system supporting real-time stream computing and computer program
CN108629016B (en) * 2018-05-08 2022-05-24 成都信息工程大学 Big data base oriented control system supporting real-time stream computing and computer program
CN108920498A (en) * 2018-05-23 2018-11-30 阿里巴巴集团控股有限公司 Data query method, device and equipment
CN108920498B (en) * 2018-05-23 2022-03-25 创新先进技术有限公司 Data query method, device and equipment
CN108984610A (en) * 2018-06-11 2018-12-11 华南理工大学 Method and system for offline and real-time data processing based on a big data framework
CN108985981A (en) * 2018-06-28 2018-12-11 北京奇虎科技有限公司 Data processing system and method
CN108985981B (en) * 2018-06-28 2021-04-23 北京奇虎科技有限公司 Data processing system and method
CN109359109B (en) * 2018-08-23 2022-05-27 创新先进技术有限公司 Data processing method and system based on distributed stream computing
CN109359109A (en) * 2018-08-23 2019-02-19 阿里巴巴集团控股有限公司 Data processing method and system based on distributed stream computing
CN109408567A (en) * 2018-09-11 2019-03-01 广东布田电子商务有限公司 Network architecture of a big data processing platform
CN109327351A (en) * 2018-09-12 2019-02-12 拉扎斯网络科技(上海)有限公司 Real-time log data collection method, device, electronic equipment and storage medium
CN109739925A (en) * 2019-01-07 2019-05-10 北京云基数技术有限公司 Data processing system and method based on big data
CN110110170A (en) * 2019-04-30 2019-08-09 北京字节跳动网络技术有限公司 Data processing method, apparatus, medium and electronic equipment
CN110659270A (en) * 2019-08-19 2020-01-07 苏宁金融科技(南京)有限公司 Data processing and transmitting method and device
CN110489476A (en) * 2019-08-22 2019-11-22 金瓜子科技发展(北京)有限公司 Data processing method, system and server
CN111061799A (en) * 2019-12-23 2020-04-24 集奥聚合(北京)人工智能科技有限公司 Distributed big data processing system
CN111159280A (en) * 2020-01-02 2020-05-15 南京欣网通信科技股份有限公司 Big data processing system
CN111949637A (en) * 2020-08-18 2020-11-17 上海七牛信息技术有限公司 Log data processing method, device and system, electronic equipment and storage medium
CN113360268A (en) * 2021-06-23 2021-09-07 成都房联云码科技有限公司 Weak centralized distributed scheduling system based on container operation
CN113407617A (en) * 2021-06-25 2021-09-17 交控科技股份有限公司 Real-time and off-line service unified processing method and device based on big data technology
CN113468246A (en) * 2021-07-20 2021-10-01 上海齐屹信息科技有限公司 Intelligent data counting and subscribing system and method based on OLTP

Similar Documents

Publication Publication Date Title
CN105677836A (en) Big data processing and solving system simultaneously supporting offline data and real-time online data
US20220283208A1 (en) Systems and methods for processing different data types
US20210176136A1 (en) Continuous data sensing of functional states of networked computing devices to determine efficiency metrics for servicing electronic messages asynchronously
Bi et al. Big data analytics with applications
CN106716454B (en) Identifying non-technical losses using machine learning
CN109690524A (en) Data Serialization in distributed event processing system
CN107256443A (en) Line loss real-time computing technique based on business and data integration
US9043317B2 (en) System and method for event-driven prioritization
CN102227121A (en) Distributed buffer memory strategy adaptive switching method based on machine learning and system thereof
US20180107961A1 (en) Task Support System and Task Support Method
CN106920158A (en) Order real-time monitoring system based on Storm and Kafka technologies
US10466686B2 (en) System and method for automatic configuration of a data collection system and schedule for control system monitoring
CN110348821A (en) Intelligent manufacturing management system and method combined with the Internet of Things
CN104113605A (en) Enterprise cloud application development monitoring processing method
Raj et al. Big data analytics processes and platforms facilitating smart cities
CN107220271A (en) Method and system for distributed digital resource storage, processing and management
Liu et al. On construction of an energy monitoring service using big data technology for smart campus
CN105260931A (en) Financial service platform system based on MOT module
Chircu et al. Visualization and machine learning for data center management
CN110837970A (en) Regional health platform quality control method and system
Yang et al. On construction of the air pollution monitoring service with a hybrid database converter
Adhikari et al. A distinctive real-time information for industries and new business opportunity analysis offered by SAP and AnyLogic simulation
Lloret-Gallego et al. Methodology for the evaluation of resilience of ICT systems for smart distribution grids
CN109522349A (en) Across categorical data calculating and sharing method, system, equipment
Wolak-Tuzimek et al. Effect of Integrated IT Systems on Enterprise Competitiveness at Time of “Industry 4.0”

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 2016-06-15