CN104615526A - Monitoring system of large data platform - Google Patents


Info

Publication number
CN104615526A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410740935.3A
Other languages
Chinese (zh)
Inventor
熊桂喜
乔少卿
姜骁
赵明
杜博文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The invention relates to a monitoring system for a large data platform. The system comprises a platform operation information statistics module, a job monitoring module, and a job statistical analysis module. It addresses three problems: first, the platform uses Hadoop to store and manage data, and the running state of its components is hard to obtain; second, the platform stores only the final state of a job and does not record its intermediate states, which hinders job analysis; third, the platform lacks statistics and analysis of job state and trends, so only current job information can be obtained. The system monitors the components of the platform and displays the results on an interface. It monitors jobs while they run, and collects and stores each job's input and output data volumes and dependency information. By collecting statistics on and analyzing the data gathered during job runs, it provides statistics and analysis of job trends on the large data platform.

Description

A monitoring system for a large data platform
Technical field
The present invention relates to the monitoring of resources and tasks on a large data platform, and belongs to the field of computer and network technology applications.
Background art
With the continuous advance of information technology and the rapid spread of the Internet, the volume of data to be processed keeps growing, and so does the demand for mass data processing in every field. Because the storage capacity and computing power of a single machine cannot meet this demand, distributed computing and parallel computing have developed and been applied rapidly, eventually evolving into grid computing. The monitoring information of a large-scale distributed system is massive, and the monitored resources are multi-level and multi-source; the dynamic nature and complexity of a large data platform make it difficult to monitor. Monitoring the software and hardware resources of the platform effectively, predicting resource bottlenecks in time, and taking corrective measures before a failure occurs are key to improving the platform's quality of service and are a current research focus.
Monitoring is an important component of a large data platform, yet existing open-source large data platforms lack an easy-to-use, unified monitoring function. Specifically: the running state of the platform is difficult to obtain; the running state of jobs on the platform cannot be displayed in real time; and statistics and analysis functions for jobs are missing. The resources that a data platform must monitor are of many kinds and sit at many levels. Hardware resources include CPU, memory, network, and disk. Software resources include Hadoop, HBase, and ZooKeeper running on the platform. Job resources include the progress, resource usage, and scheduling information of the jobs running on the platform.
Summary of the invention
Technical problem to be solved: collecting and integrating multi-source, multi-dimensional monitoring data on a large data platform, and monitoring, gathering statistics on, and analyzing the jobs on the platform, so as to provide an intuitive, easy-to-use, and responsive monitoring system.
Technical solution: a monitoring system for a large data platform, comprising a large data platform operation information statistics subsystem, a large data platform job monitoring subsystem, and a large data platform job statistical analysis subsystem.
Large data platform operation information statistics subsystem
This subsystem monitors the overall running state of the large data platform in real time and presents the monitoring information of all platform components in one place. It integrates the running-state displays of the distributed file system HDFS, the resource management framework YARN, the distributed coordination service ZooKeeper, and the NoSQL database HBase.
● HDFS operation information monitoring
The performance metrics of the NameNode and the HDFS information of the DataNodes are obtained through JMX. JMX (Java Management Extensions) is a framework for adding management functions to applications, devices, and systems. Hadoop exposes a JMX monitoring interface; for HDFS it is <Namenode>:50070/jmx. The interface returns JSON, which can be parsed with json.loads from Python's json module to obtain the HDFS monitoring information.
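The JSON parsing step described above might look like the following minimal sketch. The sample payload is invented for illustration, and the helper names (find_bean, hdfs_usage) are hypothetical; the bean name and attribute names are assumed to follow the usual NameNode JMX layout and should be checked against the actual output of the cluster.

```python
import json

# Invented sample of a JMX response body; real responses contain many more beans.
SAMPLE_JMX = """
{"beans": [
  {"name": "Hadoop:service=NameNode,name=FSNamesystem",
   "CapacityTotal": 1000, "CapacityUsed": 250, "FilesTotal": 42},
  {"name": "java.lang:type=Memory", "HeapMemoryUsage": {"used": 123}}
]}
"""

def find_bean(jmx_text, bean_name):
    """Return the first bean whose "name" equals bean_name, or None."""
    payload = json.loads(jmx_text)
    for bean in payload.get("beans", []):
        if bean.get("name") == bean_name:
            return bean
    return None

def hdfs_usage(jmx_text):
    """Extract capacity figures from the (assumed) FSNamesystem bean."""
    bean = find_bean(jmx_text, "Hadoop:service=NameNode,name=FSNamesystem")
    if bean is None:
        return None
    total = bean["CapacityTotal"]
    used = bean["CapacityUsed"]
    ratio = used / total if total else 0.0
    return {"total": total, "used": used, "used_ratio": ratio}
```

In a deployment the text passed to hdfs_usage would be the body fetched from <Namenode>:50070/jmx rather than the sample string.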
WebHDFS is the HDFS REST implementation provided by Hadoop. It allows HDFS to be accessed over HTTP through a REST API that supports GET, POST, PUT, and DELETE operations. Jobs running on the platform operate on data in HDFS, so the job-related file information on HDFS must be monitored. From this information the system obtains the size of a job's data, the total amount of data processed, the total amount of results generated, and the exported result data. To meet users' need to monitor job data on HDFS, the system wraps WebHDFS to obtain file statistics for a job's input and output data, and thereby monitors the data flow within a task.
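One way such a WebHDFS wrapper could be sketched is shown below. It builds the URL for the WebHDFS GETCONTENTSUMMARY operation and reduces the JSON response to a few figures. The function names, the host name, and the sample response are assumptions made for illustration; the patent does not describe how its wrapper is written.

```python
import json
from urllib.parse import quote

def content_summary_url(namenode, path, port=50070):
    """Build the WebHDFS URL that asks for the content summary of an HDFS path."""
    return "http://{}:{}/webhdfs/v1{}?op=GETCONTENTSUMMARY".format(
        namenode, port, quote(path))

def parse_content_summary(body):
    """Reduce a GETCONTENTSUMMARY response body to file count, dir count and bytes."""
    summary = json.loads(body)["ContentSummary"]
    return {
        "files": summary["fileCount"],
        "dirs": summary["directoryCount"],
        "bytes": summary["length"],
    }

def job_io_summary(input_body, output_body):
    """Combine the summaries of a job's input and output directories."""
    return {
        "input": parse_content_summary(input_body),
        "output": parse_content_summary(output_body),
    }
```

The response bodies would be fetched with an HTTP GET on the generated URLs; keeping the fetch separate from the parsing makes the parsing testable without a cluster.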
● YARN computing resource monitoring
YARN is Hadoop's distributed resource management framework and consists of the ResourceManager and the NodeManagers. The ResourceManager (RM) controls the whole cluster and manages the allocation of basic computing resources to the applications running on YARN. The JMX interface provided by the RM reports its current running state, chiefly the CPU and memory available to the platform for computation, their usage, and RM service information. The RESTful API provided by the RM reports RM running-state information, RM metrics, RM scheduler information, information on the applications running on the RM, and information on the RM's cluster nodes.
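As an illustration of reading RM metrics over REST, the sketch below targets the ResourceManager's cluster metrics endpoint and derives a memory usage ratio. The port 8088 default, the helper names, and the sample values are assumptions; only a handful of the fields the endpoint returns are used.

```python
import json

def cluster_metrics_url(rm_host, port=8088):
    """URL of the ResourceManager cluster metrics resource."""
    return "http://{}:{}/ws/v1/cluster/metrics".format(rm_host, port)

def summarize_cluster_metrics(body):
    """Pick a few fields from the clusterMetrics object and derive memory usage."""
    metrics = json.loads(body)["clusterMetrics"]
    total_mb = metrics["totalMB"]
    used_ratio = metrics["allocatedMB"] / total_mb if total_mb else 0.0
    return {
        "apps_running": metrics["appsRunning"],
        "active_nodes": metrics["activeNodes"],
        "memory_used_ratio": used_ratio,
    }
```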
● ZooKeeper operation monitoring
ZooKeeper provides four-letter-word commands. Sending the command "mntr" over the network to a ZooKeeper server returns that server's running information: the number of connections, the size of the in-memory database, the server role, the number of watchers, and latency figures.
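A minimal sketch of the mntr exchange follows. It assumes the server listens on the default client port 2181 and answers with tab-separated key and value pairs, one per line; the function names are hypothetical.

```python
import socket

def send_four_letter_word(host, port=2181, command=b"mntr", timeout=5.0):
    """Send a four-letter-word command and return the server's full reply as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")

def parse_mntr(text):
    """Turn "key<TAB>value" lines into a dict, converting integer values."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if not sep:
            continue  # skip lines that are not key/value pairs
        value = value.strip()
        is_int = value.lstrip("-").isdigit()
        stats[key.strip()] = int(value) if is_int else value
    return stats
```

Calling parse_mntr(send_four_letter_word("zk1.example")) would yield keys such as zk_num_alive_connections and zk_server_state; the host name here is a placeholder.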
● HBase operation monitoring
HBase provides a JMX monitoring interface. The system requests it to obtain HBase running-state information, HBase node monitoring information, and HBase table monitoring information.
Large data platform job monitoring subsystem
Offline computation and data analysis are the key workloads of a large data platform, and the jobs on existing platforms are MapReduce jobs. The MapReduce job monitoring function collects the data information, running information, and statistics of MapReduce jobs on Hadoop. Because of the way Hadoop manages jobs, running jobs and completed jobs must be monitored by different methods. Information on running jobs is obtained from Hadoop through a RESTful API. After a job finishes, Hadoop stores its final state and statistics in a directory on HDFS, so the history of completed jobs is obtained by reading the job history files on HDFS.
● Real-time job monitoring
The RESTful interface provided by YARN is used to obtain the information of running jobs. The monitoring information available for a running job is described in the following table:
To monitor a job while it runs, the system uses a daemon to collect job monitoring data. A one-minute timer triggers a collection task every minute; the task generates the URL of the RESTful interface for the job's monitoring information, sends the request, and stores the result in the database. This yields the trend of each metric over the job's run, which reflects the job's network and I/O behavior and provides a basis for task analysis. The trend data provided by the system are:
Metric | Calculation | Trend
HDFS_BYTES_WRITTEN | Current value minus the value one minute earlier | HDFS write rate
FILE_BYTES_WRITTEN | Current value minus the value one minute earlier | Local file write rate
HDFS_BYTES_READ | Current value minus the value one minute earlier | HDFS read rate
FILE_BYTES_READ | Current value minus the value one minute earlier | Local file read rate
memory snapshot | Current virtual memory and physical memory values | Change in job memory usage
Reduce shuffle bytes | Current value minus the value one minute earlier | Network transfer rate
Map output records | Current value minus the value one minute earlier | Records output by Map per minute
Map input records | Current value minus the value one minute earlier | Records read by Map per minute
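The "current value minus the value one minute earlier" calculation in the table can be sketched as follows. The function name and the sample layout (one dict of cumulative counter values per minute, oldest first) are assumptions about how the stored samples might be arranged.

```python
def per_minute_deltas(samples, cumulative_keys):
    """Return one dict of per-minute increases for each pair of adjacent samples.

    samples: list of dicts holding cumulative counter values, oldest first,
             taken one minute apart.
    cumulative_keys: names of the counters to difference.
    """
    deltas = []
    for previous, current in zip(samples, samples[1:]):
        deltas.append({
            key: current.get(key, 0) - previous.get(key, 0)
            for key in cumulative_keys
        })
    return deltas
```

The memory snapshot row is not differenced, since the table uses its current value directly; only cumulative counters should be passed in cumulative_keys.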
● Historical job monitoring
After a Hadoop job finishes, its monitoring information is stored in a specific folder on HDFS. By default the history file is /tmp/hadoop-yarn/staging/history/done/{date}/{id}.jhist. The jhist file is in JSON format, and parsing it yields the statistics recorded at each state of the job. The system uses the HDFS Java API to fetch the job history files from HDFS, parses the JSON to obtain the historical job monitoring information, and stores it in the database. Completed jobs are monitored by a collector that is started on a schedule: a daemon runs a collection task at 00:10 every day, generates the path under which the previous day's completed-job information is stored, reads the files under that path through the HDFS API, parses them, and stores each item of monitoring information in the database. This yields the trend of a periodic job's statistics, which reflects how the job's behavior changes from day to day and provides a basis for task analysis and prediction. The trend data provided by the system cover the daily change in the amount of data read and written, in execution time, and in resource usage.
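The daily schedule and path generation described above could be sketched as below. The text only gives the placeholder {date}, so the "%Y/%m/%d" layout used here is an assumption, as are the function names; the actual system reads the files through the HDFS Java API, which this sketch does not attempt.

```python
from datetime import datetime, timedelta

HISTORY_DONE_DIR = "/tmp/hadoop-yarn/staging/history/done"

def previous_day_dir(now, date_format="%Y/%m/%d"):
    """Directory assumed to hold the .jhist files of jobs finished the day before."""
    yesterday = now - timedelta(days=1)
    return "{}/{}".format(HISTORY_DONE_DIR, yesterday.strftime(date_format))

def seconds_until_next_run(now, hour=0, minute=10):
    """Seconds from now until the next daily run time (00:10 by default)."""
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        target += timedelta(days=1)  # today's slot has passed, use tomorrow's
    return (target - now).total_seconds()
```

A daemon loop would sleep for seconds_until_next_run(datetime.now()) seconds, then list and parse the files under previous_day_dir(datetime.now()).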
Large data platform job statistical analysis subsystem
Some jobs on a large data platform produce large amounts of intermediate data during execution. When platform storage runs short, this intermediate data can severely reduce the platform's computing capacity, slow down the whole cluster, and cause large-scale task failures. It is therefore necessary to collect statistics on the job run: once a task starts, its running information is sampled periodically, and the resulting series of task-state data is analyzed and displayed, so that the trend during job execution can be analyzed and smooth execution ensured. New data is imported into the platform every day, and specific programs must run daily to process and analyze it. For these daily jobs, regularities can be found in the historical job statistics, so that job behavior can be assessed and predicted. The system analyzes the following statistics for jobs on the platform:
● Network traffic
Network traffic is the traffic a Hadoop job generates when it moves data during its run. It arises in three phases: the Map side reads input data from HDFS; the shuffle phase fetches the Map output; and, after Reduce completes, the output is written to HDFS. The traffic between the job and HDFS is given by two I/O counters in the FileSystemCounters group: HDFS_BYTES_READ and HDFS_BYTES_WRITTEN. The Reduce shuffle bytes counter in the Map-Reduce Framework group gives the traffic generated by fetching data during the shuffle, that is, the total amount of data transferred from the Map side to the Reduce side. The network traffic of a Hadoop job is the sum of these three counters.
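The sum described above is small enough to state directly. In this sketch the counters are assumed to be available as a flat dict keyed by the counter names used in the text; the function name is hypothetical.

```python
def network_traffic_bytes(counters):
    """Total job network traffic: HDFS reads + HDFS writes + shuffle bytes."""
    return (
        counters["HDFS_BYTES_READ"]
        + counters["HDFS_BYTES_WRITTEN"]
        + counters["Reduce shuffle bytes"]
    )
```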
● I/O reads and writes
Analyzing a job's I/O read and write data shows which file system the job leans on during execution. By default Hadoop records all of a job's file system I/O operations, and the statistics are kept in the FileSystemCounters group. The counters in this group are:
1. HDFS_BYTES_READ is the number of bytes the job reads from HDFS during execution. Because only the Map phase of a MapReduce job reads from HDFS, it is also the total amount of data Map obtains from HDFS, including split metadata.
2. HDFS_BYTES_WRITTEN is the total number of bytes the job writes to HDFS during execution. The Reduce phase writes its results to HDFS when it finishes; if the job has no Reduce phase, the output of the Map phase is written to HDFS when the Map phase finishes.
3. HDFS_READ_OPS is the total number of read operations the job performs on HDFS during execution.
4. HDFS_WRITTEN_OPS is the total number of write operations the job performs on HDFS during execution.
5. FILE_BYTES_READ is the number of bytes the job reads from local disk during execution. Both the Map and Reduce sides of a MapReduce job sort data, which requires reading intermediate data from local disk.
6. FILE_BYTES_WRITTEN is the number of bytes the job writes to local disk during execution. Both the Map and Reduce sides of a MapReduce job sort data, which requires writing temporary intermediate results to local disk.
7. FILE_READ_OPS counts the read operations on local disk.
8. FILE_WRITTEN_OPS counts the write operations on local disk.
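One possible reading of "which file system the job leans on" is the share of bytes that go to HDFS rather than local disk. The sketch below computes that share from the byte counters listed above; the hdfs_share measure and the function name are illustrative assumptions, not a definition given in the text.

```python
def io_profile(counters):
    """Split a job's byte traffic between HDFS and local disk."""
    hdfs = counters.get("HDFS_BYTES_READ", 0) + counters.get("HDFS_BYTES_WRITTEN", 0)
    local = counters.get("FILE_BYTES_READ", 0) + counters.get("FILE_BYTES_WRITTEN", 0)
    total = hdfs + local
    return {
        "hdfs_bytes": hdfs,
        "local_bytes": local,
        # Fraction of all bytes that touched HDFS; 0.0 when nothing was read or written.
        "hdfs_share": hdfs / total if total else 0.0,
    }
```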
● Memory usage
Analyzing a job's memory usage shows how much memory the job uses during execution. The memory statistics are kept in the Map-Reduce Framework group. The counters are:
1. Physical memory snapshot is the physical memory the job is currently using, summed over the process memory snapshots of all tasks of the job.
2. Virtual memory snapshot is the virtual memory the job is currently using, summed over the process memory snapshots of all tasks of the job.
3. Total committed heap usage is the heap the job is currently using, summed over the heap information of the processes of all tasks of the job.
● Locality optimization
Hadoop is a distributed computing framework in which both data and computation are distributed. By running each Map task on a node that holds its input data, Hadoop reduces network overhead and speeds up computation, which optimizes distributed computation for data locality. The counter Job Counters.Data-local Map tasks gives the number of Map tasks that read their data locally, and the counter Job Counters.Launched Map tasks gives the total number of Map tasks in the job. Dividing the former by the latter gives the job's locality optimization rate.
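The division described above, as a minimal sketch. The dict keys mirror the counter names as they commonly appear in job output and are an assumption about how the counters are stored; the function name is hypothetical.

```python
def locality_rate(job_counters):
    """Fraction of launched Map tasks that ran on a node holding their input data."""
    launched = job_counters.get("Launched map tasks", 0)
    if launched == 0:
        return 0.0  # no Map tasks launched, so the rate is undefined; report 0.0
    return job_counters.get("Data-local map tasks", 0) / launched
```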
● Computation bias rate
The total CPU time a Hadoop job uses can be obtained from its statistics and used to measure the job's computational load. The CPU time consumed during the run is given by the counter Map-Reduce Framework: CPU time spent (ms), which sums the user and kernel CPU time used by each task process of the job. Dividing the job's total CPU time by its total running time indicates whether the job is computation-heavy.
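The ratio described above can be sketched as follows. Both arguments are in milliseconds; the function name is hypothetical. Because tasks run in parallel across the cluster, the summed CPU time can exceed the wall-clock running time, so the ratio is not bounded by 1.

```python
def computation_bias_rate(cpu_time_ms, elapsed_ms):
    """Total CPU time of all task processes divided by the job's running time."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return cpu_time_ms / elapsed_ms
```

What threshold counts as computation-heavy is not stated in the text and would have to be chosen for the cluster at hand.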
Brief description of the drawings
Fig. 1 is the system architecture diagram of the present invention.
Fig. 2 is the flow chart of running-job monitoring in the present invention.
Fig. 3 is the flow chart of historical-job monitoring in the present invention.
Fig. 4 is the flow chart of the large data platform operation information statistics subsystem of the present invention.
Fig. 5 is the functional structure diagram of the large data platform designed according to the present invention.
Embodiment
As shown in Fig. 1, the monitoring system is built on the large data platform and comprises a collection layer, a storage layer, a function layer, and a presentation layer. The collection layer gathers monitoring information from the distributed cluster and writes it to the database. The storage layer uses the relational database MySQL and stores the monitoring data and statistics gathered by the collection layer. The function layer is the system's service controller and handles data processing and web interface requests. The presentation layer displays monitoring information, cluster management, and job execution information, and handles user interaction.
Large data platform: the monitoring system is built on a large data platform composed of the Hadoop ecosystem, including the distributed file system HDFS, the resource management framework YARN, the distributed coordination tool ZooKeeper, and the distributed NoSQL database HBase. The software versions are Apache Hadoop 2.4.1, Apache HBase 0.98, and Apache ZooKeeper 3.4.6.
Collection layer: built on the large data platform, the collection layer gathers monitoring information in several ways. A collection tool deployed on each machine gathers resource, node, daemon, and job monitoring information, aggregates it, and writes it to the database.
Storage layer: the storage layer interacts with the database and stores the data gathered by the collection layer and the statistics computed by the function layer. It holds a real-time job database, a historical job database, a monitoring information database, and a statistics database.
Function layer: the function layer is the controller layer of the MVC architecture. It handles user requests and returns the results, processes data, and performs statistical analysis on the collected data. It parses, extracts, and analyzes the JMX monitoring information of Hadoop, HBase, and other components, and does the same for the information returned by the RESTful interfaces of Hadoop and other components. By requirement it is divided into HDFS monitoring, resource monitoring, job monitoring, node monitoring, and job analysis and statistics. HDFS monitoring covers cluster storage, HDFS nodes, and data snapshots. Resource monitoring covers hardware resources such as CPU and memory and the virtual resources available on YARN. Job monitoring covers the resources a job occupies, job execution status, current job statistics, and historical job execution statistics. Node monitoring covers the running state and daemons of each node.
Presentation layer: the presentation layer displays cluster monitoring information and handles user interaction. It is developed with HTML5, CSS, and JS libraries, and the look and feel is built with Bootstrap 3.0. Interface content is generated dynamically with Django templates, so one template can generate all pages of the same type, which greatly reduces code redundancy. Through the monitoring interface the user sees an overview of the platform's monitoring information, displayed in charts and other forms, and can obtain it intuitively and conveniently.
Fig. 2 shows the workflow for monitoring running jobs. The system first obtains the list of currently running jobs, then uses the YARN API to check, by job id, whether each job has finished. If a job is still running, the system's MRUrl.py generates the RESTful URL for monitoring it, the request is sent with urllib2, and the response is parsed; the useful monitoring information is extracted and returned to the controller. The monitoring page sends data requests to the controller dynamically, which completes the monitoring of running jobs.
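The URL generation and response parsing steps of this workflow might be sketched as follows. This is not the content of MRUrl.py, which the text does not reproduce; the function names are hypothetical, and the URL layout assumes the MapReduce Application Master REST API reached through the ResourceManager web proxy on port 8088. The source sends the request with urllib2 (Python 2); in Python 3 the equivalent call would be urllib.request.urlopen.

```python
import json

def job_id_from_app_id(app_id):
    """Derive the MapReduce job id from a YARN application id."""
    prefix = "application_"
    if not app_id.startswith(prefix):
        raise ValueError("not a YARN application id: {!r}".format(app_id))
    return "job_" + app_id[len(prefix):]

def running_job_counters_url(rm_host, app_id, port=8088):
    """URL of the counters resource of a running MapReduce job, via the RM proxy."""
    return "http://{}:{}/proxy/{}/ws/v1/mapreduce/jobs/{}/counters".format(
        rm_host, port, app_id, job_id_from_app_id(app_id))

def flatten_counters(body):
    """Map each counter name to its total value across all counter groups."""
    groups = json.loads(body)["jobCounters"]["counterGroup"]
    flat = {}
    for group in groups:
        for counter in group["counter"]:
            flat[counter["name"]] = counter["totalCounterValue"]
    return flat
```

The flat dict returned by flatten_counters is a convenient shape to store once per minute and to difference later when computing rates.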
Fig. 3 shows the flow for obtaining monitoring information on completed tasks. Hadoop provides a history server through which the information of finished jobs can be viewed, including the number of Map tasks, the number of Reduce tasks, the job submission time, the job completion time, and the job's statistics. The job monitoring information is stored in a specific folder on HDFS; by default the history file is /tmp/hadoop-yarn/staging/history/done/{date}/{id}.jhist. The jhist file is in JSON format, and parsing it yields the statistics recorded at each state of the job. The system uses the HDFS Java API to fetch the job history files: it first reads the monitoring files of all tasks completed the previous day, parses the text into JSON, extracts the job running information, and finally stores it in the database. A history collection function is called once a day to collect the monitoring data of the jobs completed each day. Jobs on the platform are thus summarized daily, and the monitoring data of periodic jobs is analyzed to obtain their trends, which reflect daily changes in the amount of data read and written, in execution time, and in resource usage.
Fig. 4 shows the execution flow of the large data platform operation information statistics subsystem. When the user requests the platform operation statistics page, the controller invokes the monitoring information collection module, which obtains the HDFS, RM, node, ZooKeeper, and HBase monitoring data. The data is then parsed, aggregated, and integrated into the platform's monitoring information, and displayed on the interface.
The implementation of the present invention has been described in detail above; parts not described in detail belong to techniques well known in the art.

Claims (7)

1. A monitoring system for a large data platform, characterized in that it comprises a large data platform operation information statistics subsystem, a large data platform job monitoring subsystem, and a large data platform job statistical analysis subsystem.
2. The monitoring system for a large data platform according to claim 1, characterized in that the operation information statistics subsystem monitors the overall running state of the large data platform in real time and presents the monitoring information of all components of the platform in one place.
3. The monitoring system for a large data platform according to claim 2, characterized in that the job monitoring subsystem obtains job running information in real time and monitors each job without interruption from start to finish, thereby recording the information of the job's run.
4. The monitoring system for a large data platform according to claim 3, characterized in that periodic jobs are monitored: the monitoring data of the jobs completed each day is collected, the jobs on the platform are summarized daily, and the monitoring data of periodic jobs is extracted and stored.
5. The monitoring system for a large data platform according to claim 4, characterized in that statistics and analysis are performed on job runs on the platform: the information of a job's run is analyzed to obtain resource usage statistics, data input and output statistics, execution statistics, and trends; and periodic job information is analyzed by comparing the runs of the same job over a period of time to find the job's trend and anomalies.
6. The monitoring system for a large data platform according to claim 4, characterized in that the network traffic, I/O reads and writes, resource usage, and running Map and Reduce information of a job's run are analyzed, and the computation bias rate, local data optimization rate, and data processing rate trend during job execution are computed.
7. The monitoring system for a large data platform according to claim 4, characterized in that the statistics recorded after each run of the same job within a period are analyzed to obtain the job's trend over that period, namely the change in the amount of data the job operates on, the change in job execution time, and the change in the job's resource usage.
CN201410740935.3A 2014-12-05 2014-12-05 Monitoring system of large data platform Pending CN104615526A (en)


Publications (1)

Publication Number Publication Date
CN104615526A true CN104615526A (en) 2015-05-13

Family

ID=53149983


CN102929667A (en) * 2012-10-24 2013-02-13 曙光信息产业(北京)有限公司 Method for optimizing hadoop cluster performance
CN103064664A (en) * 2012-11-28 2013-04-24 华中科技大学 Hadoop parameter automatic optimization method and system based on performance pre-evaluation
US20140226975A1 (en) * 2013-02-13 2014-08-14 Sodero Networks, Inc. Method and apparatus for boosting data intensive processing through optical circuit switching

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张棋胜: "云计算平台监控系统的研究与应用", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (45)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104915793A (en) * 2015-06-30 2015-09-16 北京西塔网络科技股份有限公司 Public information intelligent analysis platform based on big data analysis and mining
CN105069066A (en) * 2015-07-29 2015-11-18 江苏方天电力技术有限公司 Big data platform based distributed calculation frame and method for monitoring energy conservation and emission reduction
CN110222923A (en) * 2015-09-11 2019-09-10 福建师范大学 Dynamically configurable big data analysis system
CN105512002A (en) * 2015-11-23 2016-04-20 上海汽车集团股份有限公司 Equipment monitoring method and monitoring server
CN105573892A (en) * 2015-12-21 2016-05-11 农信银资金清算中心有限责任公司 Business data batch processing method and system
CN105573892B (en) * 2015-12-21 2018-03-02 农信银资金清算中心有限责任公司 Business datum runs batch method and system
CN105630652A (en) * 2016-02-02 2016-06-01 中国石油大学(华东) Real-time big data platform Storm oriented runtime three-dimensional visualization system
CN107436806A (en) * 2016-05-27 2017-12-05 苏宁云商集团股份有限公司 A kind of resource regulating method and system
CN106022007A (en) * 2016-06-14 2016-10-12 中国科学院北京基因组研究所 Cloud platform system and method oriented to biological omics big data calculation
CN106022007B (en) * 2016-06-14 2019-03-26 中国科学院北京基因组研究所 The cloud platform system and method learning big data and calculating is organized towards biology
CN106357781A (en) * 2016-09-29 2017-01-25 郑州云海信息技术有限公司 Method and system for establishing resource service calling interface
CN106452970A (en) * 2016-11-03 2017-02-22 合肥微梦软件技术有限公司 Analysis system for network flow monitoring
CN106686145A (en) * 2017-03-14 2017-05-17 郑州云海信息技术有限公司 Web service method for managing plurality of servers
CN107169084A (en) * 2017-05-11 2017-09-15 深圳市茁壮网络股份有限公司 A kind of data processing method, distributed file system and data server
CN107241752A (en) * 2017-05-26 2017-10-10 华中科技大学 The YARN dispatching methods and system of a kind of sensing network flow
CN107301113A (en) * 2017-05-26 2017-10-27 北京小度信息科技有限公司 Mission Monitor method and device
CN107241752B (en) * 2017-05-26 2019-10-25 华中科技大学 A kind of the YARN dispatching method and system of sensing network flow
CN107391342B (en) * 2017-07-21 2021-01-15 苏州浪潮智能科技有限公司 Database all-in-one machine and monitoring method thereof
CN107391342A (en) * 2017-07-21 2017-11-24 郑州云海信息技术有限公司 A kind of database all-in-one and its monitoring method
CN107844568A (en) * 2017-11-03 2018-03-27 广东电网有限责任公司电力调度控制中心 A kind of MapReduce implementation procedure optimization methods of processing data source renewal
CN107885834A (en) * 2017-11-09 2018-04-06 郑州云海信息技术有限公司 A kind of Hadoop big datas component uniformly verifies system
CN107885834B (en) * 2017-11-09 2021-07-20 浪潮云信息技术股份公司 Hadoop big data assembly unified verification system
CN109933484A (en) * 2017-12-15 2019-06-25 北京京东尚科信息技术有限公司 Big data cluster quasi real time container resource allocation monitoring analysis method
CN110019044A (en) * 2017-12-15 2019-07-16 北京京东尚科信息技术有限公司 Big data cluster quasi real time Yarn Mission Monitor analysis method
CN108304293A (en) * 2017-12-27 2018-07-20 武汉长江通信智联技术有限公司 A kind of software systems monitoring method based on big data technology
CN108712465A (en) * 2018-04-13 2018-10-26 电信科学技术第五研究所有限公司 Big data platform monitoring method
WO2019237585A1 (en) * 2018-06-13 2019-12-19 平安科技(深圳)有限公司 Zookeeper monitoring method, device, computer equipment and storage medium
CN108959626B (en) * 2018-07-23 2023-06-13 四川省烟草公司成都市公司 Efficient automatic generation method for cross-platform heterogeneous data profile
CN108959626A (en) * 2018-07-23 2018-12-07 四川省烟草公司成都市公司 A kind of cross-platform efficient automatic generation method of isomeric data bulletin
CN110795301A (en) * 2018-08-01 2020-02-14 马上消费金融股份有限公司 Job monitoring method, device, terminal and computer storage medium
CN109509019A (en) * 2018-10-15 2019-03-22 佛山市顺德区碧桂园物业发展有限公司 Real estate project management state monitors application method, system and cloud application system
CN110069335A (en) * 2019-05-07 2019-07-30 江苏满运软件科技有限公司 Task processing system, method, computer equipment and storage medium
CN111930493B (en) * 2019-05-13 2023-08-01 中国移动通信集团湖北有限公司 NodeManager state management method and device in cluster and computing equipment
CN111930493A (en) * 2019-05-13 2020-11-13 中国移动通信集团湖北有限公司 NodeManager state management method and device in cluster and computing equipment
CN110351384A (en) * 2019-07-19 2019-10-18 深圳前海微众银行股份有限公司 Big data platform method for managing resource, device, equipment and readable storage medium storing program for executing
CN110493082A (en) * 2019-08-22 2019-11-22 北京辛诺创新科技有限公司 Big data monitoring method, device, electronic equipment and read/write memory medium
CN112445674A (en) * 2019-08-30 2021-03-05 中国石油化工股份有限公司 Data processing method and storage medium of computer cluster
CN110990227A (en) * 2019-12-04 2020-04-10 哈尔滨工程大学 Numerical pool application characteristic performance acquisition and monitoring system and operation method thereof
CN110990227B (en) * 2019-12-04 2023-08-04 哈尔滨工程大学 Numerical pool application characteristic performance acquisition and monitoring system and operation method thereof
CN112306818B (en) * 2020-11-20 2022-03-22 新华三大数据技术有限公司 Streaming operation processing method and device
CN112306818A (en) * 2020-11-20 2021-02-02 新华三大数据技术有限公司 Streaming operation processing method and device
CN112633683A (en) * 2020-12-22 2021-04-09 北京百度网讯科技有限公司 Resource usage amount statistical method, device, system, electronic equipment and storage medium
CN112633683B (en) * 2020-12-22 2023-09-01 北京百度网讯科技有限公司 Resource usage statistics method, device, system, electronic equipment and storage medium
CN113159731A (en) * 2021-05-12 2021-07-23 河南雪城软件有限公司 Intelligent analysis system and method for automatic monitoring data of pollution source
CN114268629A (en) * 2021-12-22 2022-04-01 杭州玳数科技有限公司 Private cloud based EMR system

Similar Documents

Publication Publication Date Title
CN104615526A (en) Monitoring system of large data platform
Coutinho et al. Elasticity in cloud computing: a survey
Garraghan et al. An analysis of the server characteristics and resource utilization in google cloud
CN106067080B (en) Configurable workflow capabilities are provided
JP5584780B2 (en) Data collection method, data collection apparatus, and network management device
US9483288B2 (en) Method and system for running a virtual appliance
CN104915793A (en) Public information intelligent analysis platform based on big data analysis and mining
US20130254196A1 (en) Cost-based optimization of configuration parameters and cluster sizing for hadoop
CN107766402A (en) A kind of building dictionary cloud source of houses big data platform
CN111666490A (en) Information pushing method, device, equipment and storage medium based on kafka
CN108052679A (en) A kind of Log Analysis System based on HADOOP
CN105049218A (en) PhiCloud cloud charging method and system
CN111324445A (en) Task scheduling simulation system
CN102982489A (en) Power customer online grouping method based on mass measurement data
CN102902775A (en) Internet real-time computing method and internet real-time computing system
CN109190025A (en) information monitoring method, device, system and computer readable storage medium
Dubuc et al. Mapping the big data landscape: technologies, platforms and paradigms for real-time analytics of data streams
CN110168503A (en) Timeslice inserts facility
Singh et al. Big data: technologies, trends and applications
JP5870927B2 (en) Server system, management apparatus, server management method, and program
US10691653B1 (en) Intelligent data backfill and migration operations utilizing event processing architecture
Lee et al. Refining micro services placement over multiple kubernetes-orchestrated clusters employing resource monitoring
CN112181972A (en) Data management method and device based on big data and computer equipment
CN106293949A (en) Resource scheduling strategy based on baseline analysis in computing environment
Khan Hadoop performance modeling and job optimization for big data analytics

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150513
