CN105279603A - Dynamically configured big data analysis system and method - Google Patents

Info

Publication number
CN105279603A
Authority
CN
China
Prior art keywords
real
time
configuration
dynamic
analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510577285.XA
Other languages
Chinese (zh)
Other versions
CN105279603B (en)
Inventor
肖如良
彭行雄
丘志鹏
倪友聪
杜欣
蔡声镇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Normal University
Original Assignee
Fujian Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Normal University
Priority to CN201910332409.6A (CN110222923A)
Priority to CN201510577285.XA (CN105279603B)
Publication of CN105279603A
Application granted
Publication of CN105279603B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a dynamically configured big data analysis system and method. The system comprises a real-time data storage management module, a real-time stream analysis and computing module, an offline analysis module, and a visualization module. Each module contains at least one component that performs dynamic configuration management: a data management configuration component, a real-time stream analysis and computing configuration component, an offline analysis and computing configuration component, and a dynamic configuration component. The invention further provides a dynamic configuration method for the big data analysis system. The data structures and information structures of each module are designed, the state information in the warning data structure of the dynamic configuration manager drives the dynamic configuration of the system, and a method for computing the early-warning redundancy and a dynamic configuration method are provided. The invention lets the system run at an efficient level of big data analysis and computation and effectively addresses the optimization of big data analysis platform management.

Description

Dynamically configured big data analysis system and method
Technical field
The present invention relates to the field of big data analysis applications, and in particular to a dynamically configured big data analysis system and method.
Background art
Business intelligence systems, decision support systems, and similar applications increasingly need to analyze large data sets. Because big data analysis involves large data volumes, complex processing, and long processing times, such applications face a new challenge: the system must be highly reliable and the software must adapt to change. These systems need to update their configuration without interrupting service, and they need fault-tolerant management so that an abnormal or failed update is handled and the system keeps running normally and stably. Dynamic reconfiguration is thus an important means of achieving adaptive reliability in big data platform software.
Hadoop, an early framework for large-scale parallel data processing, was limited by a single point of failure and a relatively narrow computation model. Hadoop 2.0 introduced YARN, a general-purpose resource management system, which improved system reliability and the resource utilization of the whole cluster and made it possible to run multiple big data processing frameworks and programming models, including the real-time stream processing frameworks Storm and Spark. Nevertheless, improving the survivability of big data analysis applications and making them more reliable remains a difficult problem.
Spark, a big data engine now coming into wide use, was originally developed at the AMPLab at UC Berkeley and is now an open source project managed by the Apache Software Foundation. Spark aims to serve most data processing and mining applications with a general model that supports in-memory computation, so that data analysis programs run faster and tolerate faults better. Spark introduced the Resilient Distributed Dataset (RDD) model, which makes full use of memory to improve computational efficiency. Unlike other big data processing frameworks, Spark, building on Shark, MLlib, GraphX, and Spark Streaming, can use a single engine to handle efficiently the pipeline from ETL to SQL to machine learning to stream processing. Spark with Spark Streaming (or Shark, BlinkDB) serves real-time and batch processing; Spark Streaming with MLlib serves stream processing and machine learning; Spark with GraphX serves graph pipelines; and so on. Although the real-time performance and fault tolerance of this new real-time stream computing framework have improved considerably, high reliability and high availability of the system remain a challenge.
As the distributed systems in big data platforms grow larger and their behavior more complex, the faults that occur in them grow exponentially, causing serious harm and loss to industry and government departments. Once a system goes down, the losses and disruption are heavy. These big data analysis systems therefore need to configure themselves automatically without interrupting service, in order to improve system reliability, strengthen risk control, and raise the overall operating efficiency of the software platform. No effective solution to these problems in the related art has yet been proposed.
Summary of the invention
The technical problem to be solved by the present invention is to provide dynamic optimized configuration at run time for big data analysis and computation, so as to improve system reliability and strengthen risk control.
To solve the above technical problem, the present invention adopts the following technical solution: a dynamically configured big data analysis system, comprising:
a real-time data storage management module, for acquiring real-time streaming data in a distributed service cluster, dynamically configuring the associated control parameters, and storing the data;
a real-time stream analysis and computing module, for statistically analyzing real-time data to obtain real-time computation results, and for adjusting tasks according to the load of the real-time analysis algorithms;
an offline analysis module, for statistically analyzing offline data to obtain offline computation results, and for adjusting tasks according to the load of the offline analysis algorithms;
a visualization module, for visually presenting the real-time and offline computation results, providing dynamic charts within a configured delay range, displaying the running state and response of the cluster services, and raising timely alerts for data that exceed a threshold.
To solve the above problem, the present invention further provides a dynamic configuration method for a big data analysis system, comprising the following steps:
S1: preset a time window, and predefine and initialize the warning data structure through the dynamic configuration manager;
S2: according to the task type of an object instance in a node, set empirical initial values for the lower and upper bounds of the object instance's early-warning redundancy, together with a constant parameter-adjustment step size;
S3: compute the early-warning redundancy value of the object instance;
S4: confirm that the early-warning redundancy value lies between the empirical initial values of the lower and upper bounds, and generate a random number;
S5: from the step size, the random number, and the empirical initial values of the upper and lower bounds, compute an optimized upper bound and an optimized lower bound;
S6: confirm that the early-warning redundancy value lies between the optimized lower bound and the optimized upper bound;
S7: within the preset time window, poll the warning information lists held by the dynamic configuration manager;
S8: for the warning information list of node states, modify the node state so as to maintain the nodes dynamically.
The beneficial effect of the present invention is as follows. Unlike the prior art, the invention optimizes the configuration of system performance through the cooperation of the above modules and, by computing the early-warning redundancy, lets the system run on an efficient big data analysis and computing platform, which both improves system reliability and strengthens risk control.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the system of the present invention;
Fig. 2 is a schematic flowchart, in the method of the present invention, of the dynamic maintenance of the redundancy of each class of object instance when every node of the system is in the NORMAL state;
Fig. 3 is a schematic diagram, in the method of the present invention, of the dynamic configuration maintenance process based on the node states in the warning information list.
Detailed description of the embodiments
The technical content, objects, and effects of the present invention are described in detail below with reference to the embodiments and the accompanying drawings.
The key idea of the present invention is to optimize the configuration of system performance through the cooperation of the system's modules, and thereby obtain an efficient big data analysis and computing platform.
Referring to Fig. 1, an embodiment of the present invention provides a dynamically configured big data analysis system, comprising:
a real-time data storage management module, for acquiring real-time streaming data in a distributed service cluster, dynamically configuring the associated control parameters, and storing the data;
a real-time stream analysis and computing module, for statistically analyzing real-time data to obtain real-time computation results, and for adjusting tasks according to the load of the real-time analysis algorithms;
an offline analysis module, for statistically analyzing offline data to obtain offline computation results, and for adjusting tasks according to the load of the offline analysis algorithms;
a visualization module, for visually presenting the real-time and offline computation results, providing dynamic charts within a configured delay range, displaying the running state and response of the cluster services, and raising timely alerts for data that exceed a threshold.
The real-time data storage management module comprises:
a real-time streaming data acquisition component, for acquiring the real-time streaming data in the distributed service cluster and formatting, filtering, and collecting it, with the batching of the streaming data completed during collection;
a real-time storage component, for asynchronously transferring the formatted data, in a data interchange format, to HDFS for batch storage;
a storage management configuration component, for dynamically configuring the associated control parameters of the real-time data storage management module.
The real-time stream analysis and computing module comprises:
a real-time stream processing component, for acquiring data from HDFS for real-time analysis to obtain real-time computation results, and for persisting those results, sending them to the visualization module, and storing them in HDFS;
a real-time data analysis component, for performing statistical analysis and machine-learning-based intelligent analysis on real-time data, and for scheduling tasks according to the load of the real-time analysis algorithms so as to balance the load;
a real-time stream analysis and computing configuration component, for dynamically configuring the associated control parameters of the real-time stream analysis and computing module.
The offline analysis module comprises:
an offline data processing component, for acquiring offline data from HDFS for offline analysis to obtain offline computation results, and for persisting those results, sending them to the visualization module, and storing them in HDFS and NoSQL;
an offline data analysis component, for performing statistical analysis and machine-learning-based intelligent analysis on offline data, and for scheduling tasks according to the load of the offline analysis algorithms so as to balance the load;
an offline data analysis and computing configuration component, for dynamically configuring the associated control parameters of the offline analysis module.
The visualization module comprises:
a dynamic configuration component, for coordinating the above modules to optimize the configuration of system performance;
a real-time analysis view component, for visually presenting the real-time computation results, including the set of analysis results and the data from real-time statistical analysis and intelligent prediction;
an offline analysis view component, for visually presenting the offline computation results, including aggregated topic messages, state analysis and intelligent prediction results, and statistical summaries of location service requests;
a dynamic configuration view component, for displaying configuration data, which can be linked to the display of the detection accuracy of the real-time and offline computation results.
A dynamic configuration method for a big data analysis system comprises the following steps:
S1: preset a time window, and predefine and initialize the warning data structure through the dynamic configuration manager;
S2: according to the task type of an object instance in a node, set empirical initial values for the lower and upper bounds of the object instance's early-warning redundancy, together with a constant parameter-adjustment step size;
S3: compute the early-warning redundancy value of the object instance;
S4: confirm that the early-warning redundancy value lies between the empirical initial values of the lower and upper bounds, and generate a random number;
S5: from the step size, the random number, and the empirical initial values of the upper and lower bounds, compute an optimized upper bound and an optimized lower bound;
S6: confirm that the early-warning redundancy value lies between the optimized lower bound and the optimized upper bound;
S7: within the preset time window, poll the warning information lists held by the dynamic configuration manager;
S8: for the warning information list of node states, modify the node state so as to maintain the nodes dynamically.
Step S4 specifically comprises:
S41: determine whether the early-warning redundancy value is greater than or equal to the empirical initial value of the lower bound;
if so, perform S42: update the object instance;
if not, perform S411: determine whether the object instance is in the ready state;
if so, perform S412: activate the object instance, and return to step S41;
if not, perform S413: create a task instance, and return to step S41.
After step S42, the method further comprises S43: determine whether the early-warning redundancy value is less than or equal to the empirical initial value of the upper bound;
if so, perform S44: update the object instance, and generate a random number;
if not, perform S431: determine whether the object instance is in the ready state or the overloaded state;
if so, perform S432: delete the object instance, and return to step S43;
if not, perform S433: adjust the parameters of the object instance, and return to step S43.
After step S412, the method further comprises:
S414: determine whether the activation succeeded;
if so, return to step S41;
otherwise, perform S415: set the node state in the warning node information list to overloaded.
After step S413, the method further comprises S416: determine whether the creation succeeded;
if so, return to step S41;
otherwise, perform S415.
After step S432, the method further comprises S434: determine whether the deletion succeeded;
if so, return to step S43;
otherwise, perform S415.
After step S433, the method further comprises S436: determine whether the adjustment succeeded;
if so, return to step S43;
otherwise, perform S415.
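The control flow of steps S41 to S44, including the failure path to S415, can be sketched roughly as follows. This is only an illustration under stated assumptions: ToyPool, its counters, and its toy redundancy value (the number of serving instances) are hypothetical stand-ins, since the text here does not give the redundancy formula of step S3, and the max_rounds guard is an addition of the sketch.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyPool:
    """Hypothetical stand-in for one object class on a node (not from the patent)."""
    normal: int                    # instances in the NORMAL state
    ready: int = 0                 # instances in the READY state
    overloaded: int = 0            # instances in the OVERRIDE state
    node_status: str = "NORMAL"
    log: list = field(default_factory=list)

    def redundancy(self):
        # Toy value only; the patent computes the real value in step S3.
        return self.normal + self.overloaded

    def activate(self):            # S412
        self.ready -= 1
        self.normal += 1
        self.log.append("activate")
        return True

    def create(self):              # S413
        self.normal += 1
        self.log.append("create")
        return True

    def delete(self):              # S432
        if self.overloaded > 0:
            self.overloaded -= 1
        else:
            self.ready -= 1
        self.log.append("delete")
        return True

    def adjust(self):              # S433 (toy effect: retire one NORMAL instance)
        self.normal -= 1
        self.log.append("adjust")
        return True


def maintain(pool, lower_init, upper_init, rng, max_rounds=100):
    """Run S41 to S44; return the S44 random number, or None after S415."""
    for _ in range(max_rounds):                       # S41
        if pool.redundancy() >= lower_init:
            break
        ok = pool.activate() if pool.ready > 0 else pool.create()   # S411
        if not ok:                                    # S414 / S416 reported failure
            pool.node_status = "OVERRIDE"             # S415
            return None
    else:
        pool.node_status = "OVERRIDE"                 # guard added by this sketch
        return None
    pool.log.append("update")                         # S42
    for _ in range(max_rounds):                       # S43
        if pool.redundancy() <= upper_init:
            break
        removable = pool.ready > 0 or pool.overloaded > 0           # S431
        ok = pool.delete() if removable else pool.adjust()
        if not ok:                                    # S434 / S436 reported failure
            pool.node_status = "OVERRIDE"             # S415
            return None
    else:
        pool.node_status = "OVERRIDE"                 # guard added by this sketch
        return None
    pool.log.append("update")                         # S44
    return rng.random()


pool = ToyPool(normal=1, ready=1)
r = maintain(pool, lower_init=3, upper_init=5, rng=random.Random(1))
```

In this toy run the pool first activates its ready instance and then creates one more to reach the lower bound. Any operation that reports failure marks the node as OVERRIDE, mirroring S415.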
Step S5 specifically comprises:
S51: compute the optimized lower bound: optimized lower bound = empirical initial value of the lower bound + step size × random number;
S52: compute the optimized upper bound: optimized upper bound = empirical initial value of the upper bound − step size × random number.
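The two formulas of steps S51 and S52 can be written as a short function. This is a minimal sketch: the function and parameter names are illustrative, and drawing the random number uniformly from [0, 1) is an assumption, since the text does not state its distribution.

```python
import random


def optimized_bounds(lower_init, upper_init, step, rng=None):
    """Return (optimized_lower, optimized_upper) following S51 and S52."""
    if rng is None:
        rng = random.Random()
    r = rng.random()                    # assumption: uniform in [0, 1)
    opt_lower = lower_init + step * r   # S51
    opt_upper = upper_init - step * r   # S52
    return opt_lower, opt_upper


lo, hi = optimized_bounds(lower_init=2.0, upper_init=8.0, step=1.0,
                          rng=random.Random(0))
```

Because the same random number is used in both formulas, the interval shrinks symmetrically: the lower bound moves up by exactly as much as the upper bound moves down.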
To aid understanding of the above technical solution, a specific embodiment is set out below with reference to Figs. 1 to 3.
First, it should be noted that in big data analysis and computation, large-scale distributed computing services need system-level optimization, and improving fault tolerance through the development process alone is not enough. A big data analysis and computing system has many performance-related parameters that are hard to tune, which makes this a very difficult job. To address this challenge, the present invention proposes a dynamically configured big data analysis system and method. The system comprises four modules: a real-time data storage management module, a real-time stream analysis and computing module, an offline analysis module, and a visualization module. Each module contains a component that performs dynamic configuration management: the data management configuration component, the real-time stream analysis and computing configuration component, the offline analysis and computing configuration component, and the dynamic configuration component, respectively. The dynamic configuration component is the core of the system's dynamic configuration management; it cooperates with each module to optimize the configuration of system performance. The system can be built with current big data platform technologies such as Hadoop, Kafka, Spark Streaming, and Hive. The system provided by the invention has been deployed and tested on a production line and runs well.
The overall structure of the dynamically configured big data analysis system proposed by the present invention is shown in Fig. 1. The system is modular and mainly comprises four modules: the real-time data storage management module, the real-time stream analysis and computing module, the offline analysis module, and the visualization module. The main functions of each module are as follows:
(1) Real-time data storage management module
This module consists of three components: a real-time streaming data acquisition component, a real-time storage component, and a storage management configuration component.
The real-time streaming data acquisition component is mainly responsible for acquiring the real-time streaming data in the existing large-scale distributed service cluster. It formats, filters, and collects the data, and batches the streaming data during collection (the batching module).
The real-time storage component asynchronously transfers the formatted data, in the JSON data interchange format, to HDFS for batch storage; it also delivers the data to a batch queue for use by the real-time computing component.
The storage management configuration component is mainly responsible for dynamically configuring this module's associated control parameters.
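The batching and storage path described for this module can be illustrated with a small sketch. It assumes a fixed batch size and uses a plain callable named store as a hypothetical stand-in for an asynchronous HDFS writer; no real HDFS client API is shown, and the class and field names are illustrative only.

```python
import json
import queue


class Batcher:
    """Collect formatted records and flush them in fixed-size batches."""

    def __init__(self, batch_size, store, batch_queue):
        self.batch_size = batch_size
        self.store = store              # hypothetical stand-in for an HDFS writer
        self.batch_queue = batch_queue  # feeds the real-time computing component
        self.pending = []

    def add(self, record):
        # Format the record as JSON, the interchange format named in the text.
        self.pending.append(json.dumps(record, sort_keys=True))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        batch, self.pending = self.pending, []
        self.store(batch)               # batch storage
        self.batch_queue.put(batch)     # same batch offered to real-time analysis


stored = []
batch_queue = queue.Queue()
batcher = Batcher(batch_size=2, store=stored.append, batch_queue=batch_queue)
for i in range(5):
    batcher.add({"service": "demo", "seq": i})
```

After five records with a batch size of two, two batches have been stored and queued and one record is still pending. A real deployment would presumably hand store to a background worker so that the transfer is asynchronous.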
(2) Real-time stream analysis and computing module
This module consists of three components: a real-time stream processing component, a real-time data analysis component, and a real-time stream analysis and computing configuration component.
The real-time stream processing component mainly serves the real-time analysis component. On the one hand, it pulls the relevant offline computation results from HDFS for the analysis component to use as a reference; this is the precomputation for real-time analysis. On the other hand, it persists the analysis results, both supplying them as a data source to the visualization layer above and storing them in HDFS.
The real-time data analysis component is mainly responsible for classical statistical analysis and machine-learning-based intelligent analysis, and it schedules tasks according to the load of the analysis algorithms so as to balance the load.
The real-time stream analysis and computing configuration component is mainly responsible for dynamically configuring this module's associated control parameters.
(3) Offline analysis module
This module consists of three components: an offline data processing component, an offline data analysis component, and an offline data analysis and computing configuration component.
The offline data processing component mainly serves the offline analysis component. On the one hand, it pulls the relevant data from HDFS to precompute for offline analysis. On the other hand, it persists the offline analysis results, both supplying them as a data source to the visualization layer above and storing the computation results in HDFS and NoSQL.
The offline data analysis component is mainly responsible for classical global statistical analysis and global machine-learning-based intelligent analysis, and it schedules tasks according to the load of the offline analysis algorithms so as to balance the load.
The offline data analysis and computing configuration component is mainly responsible for dynamically configuring this module's associated control parameters.
The offline analysis module mainly performs classical offline statistical analysis on the data in the distributed cluster.
Offline data analysis tasks are scheduled by setting a time window, and reports are generated from the computation results, giving service developers and operations staff a reference for allocating resources to the services and optimizing them later.
(4) Visualization module
This module consists of four components: a dynamic configuration component, a dynamic configuration view component, a real-time analysis view component, and an offline analysis view component.
This module mainly presents visually the computation results produced by the real-time stream analysis and computing module and the offline analysis module. It provides dynamic charts within a configured delay range, displays the running state and response of the cluster services in a timely manner, and raises alerts for data that exceed a threshold.
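The threshold alerting mentioned above can be illustrated with a few lines. The metric names and threshold values below are made up for the example; the text does not specify which metrics are monitored.

```python
def find_alerts(metrics, thresholds):
    """Return, in sorted order, the names of metrics that exceed their threshold."""
    return sorted(name for name, value in metrics.items()
                  if name in thresholds and value > thresholds[name])


# Hypothetical cluster metrics and limits, for illustration only.
metrics = {"response_ms": 850.0, "queue_depth": 12, "error_rate": 0.002}
thresholds = {"response_ms": 500.0, "queue_depth": 100, "error_rate": 0.01}
alerts = find_alerts(metrics, thresholds)
```

Metrics without a configured threshold are ignored rather than treated as alerts, which is a design choice of this sketch.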
The data displayed by this module fall into three classes:
A. The real-time analysis view component displays real-time analysis data.
This part mainly comprises the set of analysis results and the data from real-time statistical analysis and intelligent prediction.
B. The offline analysis view component displays offline analysis data.
This part mainly comprises aggregated messages on the various topics together with state analysis and intelligent prediction results, including statistical summaries of location service requests.
C. The dynamic configuration view component displays configuration data, which can be linked to the display of the detection accuracy of the analysis results.
To analyze the service state of today's large-scale distributed service systems effectively and to raise the value of instant, real-time analysis, abnormal analysis tasks often need to be discovered promptly, which raises the availability requirements on the real-time analysis system. The present invention therefore builds a redundancy configuration technique for the real-time stream computing system to configure it dynamically, improving the performance of the real-time big data analysis system and the timeliness of instant analysis while preserving system availability.
For ease of discussion, the following assumptions are made about the system of the present invention:
(1) The system has N nodes and provides M classes of data analysis or statistical computation tasks in total;
(2) The components with which the system completes a class of tasks are loosely coupled; that is, the system provides a reliable asynchronous communication mechanism between nodes, and the cost of asynchronous communication is the same between any pair of nodes.
A dynamic configuration method is proposed below with the timeliness of real-time analysis in mind. First, the data structures for the various configuration management tasks in the system are expressed in the basic syntax of Backus-Naur form (BNF).
1. Data structure of instance tasks
Let the time window be timeWindow. The requests made to the instance objects of the whole big data analysis system within a given timeWindow are defined as:
Task::=<Td,Load,λArrive,λCur>
Here Td is the timeout after which an object's computation task is judged to have failed, Load is the average task load of the object instance's task requests, λArrive is an array storing the arrival rates of object instance requests, and λCur is the current mean arrival rate of object requests; initially, λCur = λArrive[0].
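As an illustration, the Task tuple maps naturally onto a small record type. The field names and the units used in the example values are assumptions of this sketch, not part of the definition above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    td: float                    # Td: timeout after which the task counts as failed
    load: float                  # Load: average task load of the requests
    arrival_rates: List[float]   # λArrive: recorded request arrival rates
    current_rate: float = field(init=False)   # λCur

    def __post_init__(self):
        # Initially λCur = λArrive[0].
        self.current_rate = self.arrival_rates[0]


# Example values are arbitrary.
task = Task(td=5.0, load=120.0, arrival_rates=[30.0, 42.0])
```

Here current_rate is derived rather than passed in, so the initial condition λCur = λArrive[0] holds by construction.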
2. Node-related data structures
Each node in the system is uniquely identified by NodeID, and NodeName denotes the node name. The node information list NodeList and the object instance list ObjectList of the system are defined as follows.
NodeList[NodeID]::=<NodeName,NodeCapacity,ActiveInstNum,ObjectList,NodeStatus,ObjTypeSet>
Here NodeCapacity is the task load that node NodeID can handle per unit time, ActiveInstNum is the number of active instances on the node, and ObjTypeSet is the set of object types on the node, which can be INADMIN, RTADMIN, or OLADMIN.
ObjectList[ObjID]::=<ObjectName,ObjInstList,Task>
Here ObjID is the unique identifier of a service object class in the system, ObjectName is the name of the class, ObjInstList is the list of instances managed by the class, and Task is the task information model of the class.
ObjInstList[ObjInstID]::=<NodeID,InstStatus,InstLoad>
Here ObjInstID is the unique identifier of a service object instance in the system, NodeID identifies the node that hosts the instance, InstStatus is the state of the instance (normal or overloaded), and InstLoad is the current load of object instance ObjInstID.
InstStatus::=<NORMAL|OVERRIDE>
Here NORMAL means the instance is in the normal state, and OVERRIDE means the instance is in the overloaded state.
NodeStatus::=<NORMAL|READY|OVERRIDE>
Here NORMAL means the node is in the normal state, READY means the node is in the ready state, and OVERRIDE means the node is in the overloaded state.
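One possible rendering of these node-related definitions as record types is sketched below. The Python names, the use of dictionaries keyed by ID, and the dict placeholder for the Task model are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional, Set


class InstStatus(Enum):
    NORMAL = "NORMAL"
    OVERRIDE = "OVERRIDE"    # overloaded


class NodeStatus(Enum):
    NORMAL = "NORMAL"
    READY = "READY"
    OVERRIDE = "OVERRIDE"    # overloaded


class ObjType(Enum):
    INADMIN = "INADMIN"
    RTADMIN = "RTADMIN"
    OLADMIN = "OLADMIN"


@dataclass
class ObjInst:                       # ObjInstList[ObjInstID]
    node_id: str
    inst_status: InstStatus
    inst_load: float


@dataclass
class ServiceObject:                 # ObjectList[ObjID]
    object_name: str
    obj_inst_list: Dict[str, ObjInst] = field(default_factory=dict)
    task: Optional[dict] = None      # placeholder for the Task model


@dataclass
class Node:                          # NodeList[NodeID]
    node_name: str
    node_capacity: float
    active_inst_num: int
    object_list: Dict[str, ServiceObject]
    node_status: NodeStatus
    obj_type_set: Set[ObjType]


inst = ObjInst(node_id="n1", inst_status=InstStatus.NORMAL, inst_load=0.4)
obj = ServiceObject(object_name="stats", obj_inst_list={"i1": inst})
node_list = {
    "n1": Node("worker-1", 100.0, 1, {"o1": obj},
               NodeStatus.NORMAL, {ObjType.RTADMIN}),
}
```

The nesting follows the definitions: a node holds its object classes, and each class holds its instances, each of which points back to its host node through node_id.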
3. Data structure of the data storage management configuration component
The configuration data structure of the storage management configuration component is denoted InConfAdmin; it mainly sets the module parameters for streaming-data batching control and the storage parameters. There are usually two strategies for deciding how much data makes up one processing batch: a static setting and a dynamic setting. When the data source supplies data evenly, a fixed batch size is simple and suits real-time analysis and real-time display of results; when the data source supplies data at random, a dynamic time window timeWindow is more suitable. The data form is defined as follows:
InConfAdmin::={<NodeList,BatchingRef,StoringRef>}
NodeList is as defined above (section 3.2), with ObjTypeSet in NodeList set to INADMIN to denote a node of the data storage management type. BatchingRef denotes the batching parameter control in the stream processing component, and StoringRef denotes the storage parameter control in the stream processing component. The warning information for batching and storage in the data storage management component is defined as follows:
AlarmIn::={<BatchingAlarm,StoringAlarm>}
BatchingAlarm warns that a batch contains too little or too much data, and StoringAlarm warns of storage delay.
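The time-window batching strategy and the two warnings can be sketched as follows. The grouping rule, the size limits, and the delay limit are hypothetical choices for illustration; the text names the warnings but does not give their thresholds.

```python
def window_batches(events, time_window):
    """Group (timestamp, payload) pairs into consecutive windows of equal length."""
    batches = {}
    for ts, payload in events:
        batches.setdefault(int(ts // time_window), []).append(payload)
    return [batches[k] for k in sorted(batches)]   # empty windows are skipped


def batching_alarm(batch, min_size, max_size):
    """Return a BatchingAlarm message when the batch size is out of range."""
    if len(batch) < min_size:
        return "BatchingAlarm: batch too small"
    if len(batch) > max_size:
        return "BatchingAlarm: batch too large"
    return None


def storing_alarm(delay_s, max_delay_s):
    """Return a StoringAlarm message when storage took longer than allowed."""
    return "StoringAlarm: storage delayed" if delay_s > max_delay_s else None


events = [(0.1, "a"), (0.5, "b"), (1.2, "c"), (3.7, "d")]
batches = window_batches(events, time_window=1.0)
alarms = [batching_alarm(b, min_size=2, max_size=3) for b in batches]
```

With a one-second window the irregular arrivals produce batches of uneven size, which is the situation the batching warning is meant to surface.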
4. Data structure of the real-time stream analysis and computing configuration component
The real-time stream analysis and computing configuration component RTConfAdmin manages the parameters of the real-time analysis and computing module. It mainly maintains the information list NodeList of the nodes that take part in analysis and computation and the real-time analysis task list RTTaskList; it also sets the load balancing parameter (RTLoad) borne by the analysis components in the real-time stream analysis and computing module. The corresponding data structure is defined as follows:
RTConfAdmin::=<NodeList,RTAnalysisReferenceList,RTTaskList,RTLoad>
NodeList is as defined above (section 3.2), with ObjTypeSet in NodeList set to RTADMIN; correspondingly, the ObjectList in NodeList is the information list of the real-time analysis objects in the system. RTAnalysisReferenceList is the list of real-time analysis parameters. RTTaskList is the list of real-time analysis tasks, whose elements are task models of analysis instances as specified by Task. RTLoad is the load of the real-time computing configuration component's nodes.
5. Data structure of the offline analysis and computing configuration component
The offline analysis and computing configuration component OLConfAdmin manages the parameters of the offline analysis and computing module. It mainly maintains the information list NodeList of the nodes that take part in offline analysis and computation and the offline analysis task list OLTaskList; it also sets the load balancing parameter (OLLoad) borne by the analysis components in the offline analysis and computing module. The corresponding data structure is defined as follows:
OLConfAdmin::=<NodeList,OLAnalysisReferenceList,OLTaskList,OLLoad>
NodeList is as defined above (section 3.2), with ObjTypeSet in NodeList set to OLADMIN; correspondingly, the ObjectList in NodeList is the information list of the offline analysis objects in the system. OLAnalysisReferenceList is the list of offline analysis parameters. OLTaskList is the list of offline analysis tasks, whose elements are task models of analysis instances as specified by Task. OLLoad is the load of the offline computing configuration component's nodes.
Six, the data structure of dynamic-configuration view component
Dynamic-configuration view (being designated as ViewConf_Admin) is positioned at the visual aspect on system upper strata of the present invention, the configuration information list during maintenance system is overall.
ViewConfAdmin::={<AlarmAdmin,RTConfAdmin,OLConfAdmin,InConfAdmin>}
The dynamic configuration view ViewConfAdmin is used for global management and is composed of four configuration components: AlarmAdmin is the warning configuration manager; RTConfAdmin is the real-time stream analysis and calculation configuration component, which manages the parameters of the real-time analysis and calculation module; OLConfAdmin is the offline data analysis and calculation configuration component, mainly responsible for dynamically configuring the control parameters of the offline calculation module; InConfAdmin is the storage management configuration component, mainly responsible for dynamically configuring the control parameters of the storage management module.
Seven, the data structure of dynamic-configuration manager
The warning manager AlarmAdmin in the upper-layer dynamic configuration view ViewConfAdmin is able to maintain the configuration information lists in the view structure. The operations required in response to system warning information mainly act on nodes and object instances, for example reload (OverLoad), parameter adjustment, deletion, activation, update and set-null operations. The warning configuration management data structure, denoted AlarmAdmin, is used to perform the tasks related to dynamically configuring the nodes shown in ViewConfAdmin. Its structure is defined as follows:
AlarmAdmin::=<AlarmNodeList,AlarmObjectInstList,AlarmTaskList>
AlarmNodeList[NodeID]::={<NodeStatus,NodeLoad>}
AlarmObjectInstList[ObjectInstID]::={<NodeID,AlarmObjID,AlarmInstID,InstLoad>}
AlarmTaskList[TaskID]::={<TaskName>}
Here AlarmNodeList is the list of nodes in the warning information, AlarmObjectInstList is the list of object instances to be operated on in the warning information, and AlarmTaskList is the list of warning task information.
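The three keyed lists of AlarmAdmin map naturally onto dictionaries. The sketch below is one possible rendering under stated assumptions: the snake_case field names, the string and integer key types and the sample identifiers are illustrative, while the three status values NORMAL, READY and OVERRIDE are the ones the description uses for NodeStatus.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class NodeStatus(Enum):
    NORMAL = "NORMAL"
    READY = "READY"
    OVERRIDE = "OVERRIDE"


@dataclass
class AlarmNode:
    """AlarmNodeList[NodeID] ::= {<NodeStatus, NodeLoad>}"""
    node_status: NodeStatus = NodeStatus.NORMAL
    node_load: float = 0.0


@dataclass
class AlarmObjectInst:
    """AlarmObjectInstList[ObjectInstID] ::= {<NodeID, AlarmObjID, AlarmInstID, InstLoad>}"""
    node_id: str
    alarm_obj_id: str
    alarm_inst_id: str
    inst_load: float = 0.0


@dataclass
class AlarmAdmin:
    """AlarmAdmin ::= <AlarmNodeList, AlarmObjectInstList, AlarmTaskList>"""
    alarm_node_list: Dict[str, AlarmNode] = field(default_factory=dict)
    alarm_object_inst_list: Dict[str, AlarmObjectInst] = field(default_factory=dict)
    alarm_task_list: Dict[int, str] = field(default_factory=dict)  # TaskID -> TaskName


admin = AlarmAdmin()
admin.alarm_node_list["node-1"] = AlarmNode(node_load=0.7)          # hypothetical node
admin.alarm_object_inst_list["inst-9"] = AlarmObjectInst("node-1", "obj-3", "inst-9", 0.2)
admin.alarm_node_list["node-1"].node_status = NodeStatus.OVERRIDE   # mark the node for reload
```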
Eight, the messages in the dynamic configuration method
When the dynamic configuration method is used in a big data analysis system, the message structure is the communication basis for dynamically adjusting the system configuration at run time. The necessary message types are formally defined below, where AlarmObjectInstList denotes the information of the object instances involved in a warning operation and AlarmNodeID is the unique identifier of the warning node.
(1) Alarm(<[NodeID]>, <[ObjInstList]>): the data storage module, the real-time calculation module and the offline calculation module send the list of warning nodes and object instances to the dynamic configuration view ViewConfAdmin.
(2) Adjust(<[NodeList], [ReferenceList]> | <[NodeID], [ObjInstList], [ReferenceList]>): the dynamic configuration view component ViewConfAdmin, through the dynamic configuration manager, sends parameter adjustment information for nodes and object instances to the data storage module, the real-time calculation module and the offline calculation module.
(3) OverLoad([NodeList] | <[NodeID], [ObjInstList]>): ViewConfAdmin, through the dynamic configuration manager, sends reload information for nodes and object instances to the data storage module, the real-time calculation module and the offline calculation module.
(4) Active([NodeList] | <[NodeID], [ObjInstList]>): ViewConfAdmin, through the dynamic configuration manager, sends activation information for object instances to the data storage module, the real-time calculation module and the offline calculation module.
(5) Delete([NodeList] | <[NodeID], [ObjInstList]>): ViewConfAdmin, through the dynamic configuration manager, sends deletion information for nodes or object instances to the data storage module, the real-time calculation module and the offline calculation module.
(6) Update(<[NodeList], [ReferenceList]> | <[NodeID], [ObjInstList], [ReferenceList]>): each module node (data storage module, real-time calculation module and offline calculation module), through the dynamic configuration manager, sends state or parameter update information for nodes or object instances to ViewConfAdmin.
(7) GetLoad([NodeList] | <[NodeID], [ObjInstList]>): ViewConfAdmin, through the dynamic configuration manager, sends a request for the load entropy of nodes or object instances to the data storage module, the real-time calculation module and the offline calculation module.
(8) SetNull([NodeList] | <[NodeID], [ObjInstList]>): ViewConfAdmin, through the dynamic configuration manager, sends set-null information for nodes to the data storage module, the real-time calculation module and the offline calculation module.
(9) Create([NodeList] | <[NodeID], [ObjInstList]>): ViewConfAdmin, through the dynamic configuration manager, sends creation information for nodes or object instances to the data storage module, the real-time calculation module and the offline calculation module, and initializes the related information.
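The nine message types and their direction of travel can be summarized in a small type. This is a sketch under assumptions: the Message class, its field names and the direction labels are hypothetical; only the nine message names and who sends each one come from the list above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List


class MsgType(Enum):
    ALARM = "Alarm"
    ADJUST = "Adjust"
    OVERLOAD = "OverLoad"
    ACTIVE = "Active"
    DELETE = "Delete"
    UPDATE = "Update"
    GETLOAD = "GetLoad"
    SETNULL = "SetNull"
    CREATE = "Create"


# Alarm and Update are sent by the modules to ViewConfAdmin;
# every other message is sent by ViewConfAdmin to the modules.
UPWARD = {MsgType.ALARM, MsgType.UPDATE}


@dataclass
class Message:
    msg_type: MsgType
    node_ids: List[str]                                   # [NodeID] or [NodeList]
    obj_inst_list: List[str] = field(default_factory=list)
    reference_list: Dict[str, Any] = field(default_factory=dict)

    def direction(self) -> str:
        return "module->view" if self.msg_type in UPWARD else "view->module"


alarm = Message(MsgType.ALARM, ["node-1"], ["inst-9"])
adjust = Message(MsgType.ADJUST, ["node-1"], ["inst-9"], {"batch_size": 64})  # hypothetical parameter
```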
With the above data structures defined, refer to Fig. 2 and Fig. 3 for an understanding of the dynamic configuration method of the present invention.
It should be understood that a big data analysis and calculation system is itself a complex, large-scale distributed application. A dynamic configuration manager (DynamicConfigurationManager, DCM) configures and drives run-time fault-tolerant processing and the dynamic execution of the corresponding resource allocation functions, so that the relevant parameters and configuration can be adjusted dynamically to adapt to changes in the environment, application demands and system resources. By dynamically adjusting the configuration parameters of the analysis and calculation instances, the performance of the system can be improved while its availability remains unchanged. The present invention proposes a dynamic configuration method for big data analysis systems, whose goal is to optimize system performance and improve system efficiency.
1, load entropy and early-warning redundancy in dynamic configuration
During system operation, an increase in the mean arrival rate of analysis and processing requests, or unreasonable scheduling of requests, may cause individual nodes to become overloaded.
Let the load entropy function be delta(). The response delay of a request to a failed instance object is defined as infinite (∞), and the user's requirements on system performance and availability are quantified as a penalty function of the request response time: the shorter the response time, the lower the user-defined penalty value. delta(<[NodeID],[ObjInstID]>) is defined by formula (1).
$$\mathrm{delta}(\langle[\mathrm{NodeID}],[\mathrm{ObjInstID}]\rangle) ::= \int_{0}^{T_D} w(t)\cdot P(t)\,dt + w_f\cdot F(u) \tag{1}$$
Here delta() is the task load entropy function of object instance ObjInstID on node NodeID; P(t) is the probability that the response time of a request to instance object ObjInstID is t; w(t) is the penalty function defined on the request response time; w_f is the penalty value when the instance object fails; F(u) is the probability that the instance object fails; and u is the static threshold of the instance object, whose calculation depends on the mean arrival rate of requests to the instance object. The functional relationship between the response time of instance requests and the load can be obtained from an empirical formula.
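Formula (1) can be evaluated numerically once w(t), P(t), w_f and F(u) are known. The sketch below uses the composite trapezoidal rule; the linear penalty w(t) = t, the exponential response-time density P(t) = e^(-t) and all numeric values are illustrative assumptions, since the source leaves these functions to empirical choice.

```python
import math
from typing import Callable


def load_entropy(w: Callable[[float], float],
                 p: Callable[[float], float],
                 t_d: float,
                 w_f: float,
                 f_u: float,
                 steps: int = 1000) -> float:
    """Approximate delta = integral_0^{T_D} w(t) * P(t) dt + w_f * F(u)."""
    h = t_d / steps
    # Composite trapezoidal rule over [0, t_d].
    total = 0.5 * (w(0.0) * p(0.0) + w(t_d) * p(t_d))
    for i in range(1, steps):
        t = i * h
        total += w(t) * p(t)
    return total * h + w_f * f_u


# Assumed inputs: penalty grows linearly with response time, response times
# follow an exponential density, and the instance fails with probability 0.01.
delta_value = load_entropy(w=lambda t: t,
                           p=lambda t: math.exp(-t),
                           t_d=5.0,
                           w_f=10.0,
                           f_u=0.01)
```

For these assumed inputs the integral has the closed form 1 - 6e^(-5), which gives a convenient check on the numerical result.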
The redundancy of a service object measures the extent to which service objects of the same class can take on additional load Load from object instance tasks within the current time window. It depends on the current mean arrival rate and the response time of instance requests. The early-warning redundancy of an object instance of a specified class is denoted AlarmRedundancy().
Let λCur be the current mean request arrival rate of the instance object; AlarmRedundancy() is then defined as follows:
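$$\mathrm{AlarmRedundancy}(\langle[\mathrm{NodeID}],[\mathrm{ObjInstID}]\rangle) ::= (\lambda_{\mathrm{Cur}} - \lambda_{\mathrm{Array}})\cdot K + (\lambda_{\mathrm{RespondTime}} - T_d)\cdot H + \mathrm{delta}(\langle[\mathrm{NodeID}],[\mathrm{ObjInstID}]\rangle)\cdot L \tag{2}$$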
Here K, H and L are empirical constants, and delta() is the task load entropy function of object instance ObjInstID on node NodeID. λRespondTime is the average response time of the current object class and is defined as follows:
$$\lambda_{\mathrm{RespondTime}}(\langle[\mathrm{NodeID}],[\mathrm{ObjInstID}]\rangle) ::= T_{\mathrm{Executive}} + T_{\mathrm{DesignWait}} + T_{\mathrm{Wait}} \tag{3}$$
TExecutive is the average execution time of the instance tasks, TDesignWait is the waiting time of tasks awaiting assignment on the node, and TWait is the average waiting time of instances of this object class.
$$T_{\mathrm{Executive}}(\langle[\mathrm{NodeID}],[\mathrm{ObjInstID}]\rangle) ::= \frac{\mathrm{Load}(\langle[\mathrm{NodeID}],[\mathrm{ObjInstID}]\rangle)}{E(\mathrm{NodeCapacity}(\langle[\mathrm{NodeID}],[\mathrm{ObjInstID}]\rangle))} \tag{4}$$
Here Load() is the task load of the instances of the specified object on the specified node NodeID (see Section 3.1), and E(NodeCapacity()) is the mathematical expectation of the amount of tasks that the specified node NodeID can process per unit time (see Section 3.2). It is defined as follows:
$$E(\mathrm{NodeCapacity}(\langle[\mathrm{NodeID}],[\mathrm{ObjInstID}]\rangle)) ::= \frac{\sum_{\mathrm{ObjInstID}}^{\mathrm{Sum}(\mathrm{ObjInstNum})} \mathrm{NodeCapacity}(\langle[\mathrm{NodeID}],[\mathrm{ObjInstID}]\rangle)}{\mathrm{Sum}(\mathrm{ObjInstNum})} \tag{5}$$
Here Sum(ObjInstNum) is the number of instances of the specified object ObjInstID on the specified node NodeID.
$$T_{\mathrm{DesignWait}}(\langle[\mathrm{NodeID}],[\mathrm{ObjInstID}]\rangle) ::= \frac{T_{\mathrm{QueueLength}}([\mathrm{NodeID}])}{\lambda_{\mathrm{Cur}}([\mathrm{NodeID}],[\mathrm{ObjInstID}])} \tag{6}$$
Here TQueueLength() is the length of the queue awaiting assignment on the specified node, and λCur() is the mean request arrival rate of the specified object class on the specified node.
TWait(<[NodeID],[ObjInstID]>) is the average waiting time of instances of this object class.
Combining formulas (1), (3), (4), (5) and (6), the early-warning redundancy of formula (2) can be calculated.
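Formulas (2) to (6) chain together into a single calculation. The sketch below follows them term by term; the function and parameter names are illustrative, the value of delta is taken as an input (it comes from formula (1)), and every number in the example, including the constants K, H and L, is an assumed value chosen only to make the arithmetic easy to follow.

```python
from typing import Sequence


def expected_capacity(capacities: Sequence[float]) -> float:
    """Formula (5): mean NodeCapacity over the instances on the node."""
    return sum(capacities) / len(capacities)


def t_executive(load: float, capacities: Sequence[float]) -> float:
    """Formula (4): Load / E(NodeCapacity)."""
    return load / expected_capacity(capacities)


def t_design_wait(queue_length: float, lam_cur: float) -> float:
    """Formula (6): TQueueLength / lambda_Cur."""
    return queue_length / lam_cur


def respond_time(load: float, capacities: Sequence[float],
                 queue_length: float, lam_cur: float, t_wait: float) -> float:
    """Formula (3): TExecutive + TDesignWait + TWait."""
    return (t_executive(load, capacities)
            + t_design_wait(queue_length, lam_cur)
            + t_wait)


def alarm_redundancy(lam_cur: float, lam_array: float, lam_respond: float,
                     t_d: float, delta: float,
                     k: float, h: float, l: float) -> float:
    """Formula (2)."""
    return (lam_cur - lam_array) * k + (lam_respond - t_d) * h + delta * l


# Assumed example values.
resp = respond_time(load=40.0, capacities=[10.0, 20.0, 30.0],
                    queue_length=6.0, lam_cur=3.0, t_wait=1.0)   # 2.0 + 2.0 + 1.0
ar = alarm_redundancy(lam_cur=3.0, lam_array=2.0, lam_respond=resp,
                      t_d=4.0, delta=0.2, k=1.0, h=0.5, l=2.0)   # 1.0 + 0.5 + 0.4
```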
The early-warning redundancy of an object instance should have a reasonable lower bound LowerBound and a reasonable upper bound SupperBound, such that LowerBound ≤ AlarmRedundancy ≤ SupperBound. A value below the lower bound indicates that the object instance on the node is lightly loaded and its task amount can be increased; a value above the upper bound indicates that the object instance on the node is busy and its task amount can be reduced appropriately.
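The bound check just described reduces to a three-way comparison. In this small sketch the return labels "loose", "normal" and "busy" are hypothetical names for the three conditions; the identifier SupperBound is kept as spelled in the source.

```python
def classify(ar: float, lower_bound: float, supper_bound: float) -> str:
    """Classify an early-warning redundancy value against its bounds."""
    if ar < lower_bound:
        return "loose"   # lightly loaded: the task amount can be increased
    if ar > supper_bound:
        return "busy"    # heavily loaded: the task amount should be reduced
    return "normal"      # LowerBound <= AR <= SupperBound


states = [classify(ar, 1.0, 3.0) for ar in (0.5, 1.9, 3.5)]
```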
In this big data analysis system, the dynamic configuration flow is shown in Fig. 2 and Fig. 3 and consists of two main parts. In the first, while the node state NodeStatus is NORMAL, each module is driven to dynamically maintain the redundancy of every class of instance on each node of the system; reload, delete, update, create and similar operations are performed according to instance state so that the running instances on the node stay at an efficient level, as shown in Fig. 2. In the second, redundancy maintenance takes the node as the basic object: reload, update, delete, create and similar operations are performed according to the NodeStatus value of the node so that the whole system stays in a good running state, as shown in Fig. 3.
Specifically, for dynamic redundancy maintenance of the object-class instances within a node, that is, when the state variable NodeStatus of each system node is NORMAL, the flow for dynamically maintaining the redundancy of each class of object instance on the node is shown in Fig. 2.
The main process is as follows:
(1) Preset a time window timeWindow; have the dynamic configuration manager preset and initialize a dynamic configuration warning data structure AlarmAdmin; start in-node system maintenance within the time window.
(2) In node NodeID, according to the task type of the object instance ObjInstID, set empirical initial values for the early-warning redundancy lower bound LowerBound and upper bound SupperBound of the object instance, and a parameter-adjustment step constant S.
(3) Calculate the corresponding value AR = AlarmRedundancy(<[NodeID],[ObjInstID]>).
(4) If AR ≤ LowerBound, search the warning message queue for the status information of task instances of the same class on the current node:
A. If an object instance in the READY state exists, perform the activation operation Active([NodeID],[ObjInstList]) on it.
B. If the activation fails, set NodeStatus of AlarmNodeList[NodeID] in the dynamic configuration warning data structure AlarmAdmin to OVERRIDE.
C. If no object instance in the READY state exists, create a new task instance by performing Create([NodeID],[ObjInstList]); if this operation fails, set NodeStatus of AlarmNodeList[NodeID] in AlarmAdmin to OVERRIDE.
(5) Repeat step (4) until AR > LowerBound.
(6) Perform the update operation Update([NodeID],[ObjInstList]).
(7) If AR ≥ SupperBound, search the warning message queue for the status information of task instances of the same class on the current node:
A. If an object instance in the READY or OVERRIDE state exists, perform the deletion operation Delete([NodeID],[ObjInstList]); if the deletion fails, set NodeStatus of AlarmNodeList[NodeID] in AlarmAdmin to OVERRIDE.
B. Repeat step A until AR ≤ SupperBound or the deletion operation has been performed on all such instances.
C. If no object instance in the READY or OVERRIDE state exists, perform the parameter adjustment operation Adjust([NodeID],[ObjInstList],[ReferenceList]).
D. If the parameter adjustment succeeds, repeat step 3 until AR ≤ SupperBound.
E. If the parameter adjustment fails, set NodeStatus of AlarmNodeList[NodeID] in AlarmAdmin to OVERRIDE.
(8) Perform the update operation Update([NodeID],[ObjInstList]).
(9) If LowerBound ≤ AR ≤ SupperBound, generate a random number random:
A. LowerBound = LowerBound + random*S;
B. BestLowerBound := LowerBound;
C. If LowerBound ≤ SupperBound, go to step A;
D. SupperBound = SupperBound - random*S;
E. BestSupperBound := SupperBound;
F. If AR ≤ SupperBound, go to step D.
(10) Output the optimized bounds BestLowerBound and BestSupperBound of the object instance redundancy.
(11) Move on to the next instance ObjInstID of the same object class in the same node NodeID and go to step (3).
After the iteration ends, the running object instances on the node remain at a highly efficient level.
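The bound search of step (9) can be sketched as two small loops that tighten each bound by a random fraction of the step S. This is an interpretation rather than the patent's exact procedure: it assumes that both loops stop as soon as the next candidate bound would no longer enclose AR, and that the last enclosing value is kept as the best bound. The function name, the max_iter safeguard and the seeded generator are additions for illustration.

```python
import random
from typing import Tuple


def tighten_bounds(ar: float, lower: float, supper: float, step: float,
                   rng: random.Random, max_iter: int = 1000) -> Tuple[float, float]:
    """Return (BestLowerBound, BestSupperBound) that still enclose AR."""
    best_lower, best_supper = lower, supper
    # Raise the lower bound: LowerBound = LowerBound + random * S.
    for _ in range(max_iter):
        candidate = best_lower + rng.random() * step
        if candidate > ar:          # assumed stop condition: bound would pass AR
            break
        best_lower = candidate
    # Lower the upper bound: SupperBound = SupperBound - random * S.
    for _ in range(max_iter):
        candidate = best_supper - rng.random() * step
        if candidate < ar:          # assumed stop condition: bound would pass AR
            break
        best_supper = candidate
    return best_lower, best_supper


rng = random.Random(0)  # seeded so the run is repeatable
best_lower, best_supper = tighten_bounds(ar=5.0, lower=0.0, supper=10.0,
                                         step=0.5, rng=rng)
```

A smaller step S gives tighter final bounds at the cost of more iterations; the resulting pair can serve as the empirical starting values for the next instance of the same class.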
As shown in Fig. 3, node-level dynamic maintenance operates on the status information NodeStatus in the node warning list AlarmNodeList of the warning configuration data structure AlarmAdmin in the dynamic configuration manager.
The main process is as follows:
(1) Within the preset timeWindow, poll the warning list AlarmNodeList in AlarmAdmin in the dynamic configuration manager and, for the warning message queue of NodeStatus, start node-based dynamic maintenance.
(2) If NodeStatus in AlarmNodeList[NodeID] is NORMAL, perform dynamic maintenance of the object instance redundancy within the node.
(3) If NodeStatus in AlarmNodeList[NodeID] is READY, a new object instance task can be created by performing the Create([NodeID],[ObjInstList]) operation.
A. If the creation succeeds, set the corresponding NodeStatus to NORMAL.
B. If the creation fails, set the corresponding NodeStatus to OVERRIDE.
C. Perform the update operation Update([NodeID]).
(4) If NodeStatus in AlarmNodeList[NodeID] is OVERRIDE, the node can be reloaded by performing the OverLoad([NodeID]) operation.
A. If the reload succeeds, set the corresponding NodeStatus to READY.
B. If the reload fails, perform the set-null operation SetNull([NodeID]) and then the deletion operation Delete([NodeID]).
C. Perform the update operation Update([NodeID]).
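The node-level flow above behaves like a small state machine over NodeStatus. The sketch below models one polling pass for a single node; the callback parameters create and overload, the log list and the use of None for a deleted node are illustrative assumptions, while the transitions themselves follow steps (2) to (4).

```python
from enum import Enum
from typing import Callable, List, Optional


class NodeStatus(Enum):
    NORMAL = "NORMAL"
    READY = "READY"
    OVERRIDE = "OVERRIDE"


def maintain_node(status: NodeStatus,
                  create: Callable[[], bool],
                  overload: Callable[[], bool],
                  log: List[str]) -> Optional[NodeStatus]:
    """Run one polling pass; return the new status, or None if the node was deleted."""
    if status is NodeStatus.NORMAL:
        # Step (2): hand over to the in-node redundancy maintenance.
        log.append("MaintainInstances")
        return status
    if status is NodeStatus.READY:
        # Step (3): try to create a new object instance task.
        new_status = NodeStatus.NORMAL if create() else NodeStatus.OVERRIDE
        log.append("Update")
        return new_status
    # Step (4): status is OVERRIDE, so try to reload the node.
    if overload():
        log.append("Update")
        return NodeStatus.READY
    log.extend(["SetNull", "Delete", "Update"])
    return None


log: List[str] = []
after_create = maintain_node(NodeStatus.READY, lambda: True, lambda: True, log)
after_failed_reload = maintain_node(NodeStatus.OVERRIDE, lambda: True, lambda: False, log)
```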
Unlike the prior art, the dynamic configuration of the present invention offers the following performance advantages:
(1) When a node is in the NORMAL state, the object instances on the node need reload, delete and activation operations during optimization. These operations are performed without polling traversal, which reduces network communication overhead within the system. Calculating the warning redundancy has a relatively high time complexity, but given the computing power of a big data analysis platform this complexity is acceptable.
When searching for the upper and lower bounds of object-instance redundancy on a system node, the first run incurs a relatively large overhead, but subsequent optimization can use the best bound values from earlier runs as empirical values, which greatly reduces the overhead and achieves fast configuration optimization.
(2) When a node is in the READY or OVERRIDE state, the node reload operation makes the system performance overhead relatively large, while the communication overhead is relatively small.
In summary, the system of the present invention offers better performance than a conventional configuration process.
The above are merely embodiments of the present invention and do not thereby limit its patent scope. Any equivalent transformation made using the description and drawings of the present invention, or any direct or indirect application in related technical fields, is likewise included in the scope of patent protection of the present invention.

Claims (10)

1. A dynamically configurable big data analysis system, characterized by comprising:
a real-time data storage management module, configured to obtain real-time streaming data in a distributed service cluster, dynamically configure the related control parameters, and store the data;
a real-time stream analysis and calculation module, configured to statistically analyze real-time data, obtain real-time calculation results, and adjust tasks according to the real-time analysis calculation load;
an offline analysis module, configured to statistically analyze offline data, obtain offline calculation results, and adjust tasks according to the offline analysis calculation load;
a visualization module, configured to visualize the real-time and offline calculation results, provide dynamic charts within a set delay range, display the running state and response of cluster services, and promptly raise alarms for data that exceed thresholds.
2. The dynamically configurable big data analysis system according to claim 1, characterized in that the real-time data storage management module comprises:
a real-time streaming data acquisition component, configured to obtain the real-time streaming data in the distributed service cluster, format, filter and collect it, and complete the stream-data jobs during collection;
a real-time storage component, configured to asynchronously transmit the formatted data, in a data interchange format, to HDFS for batch storage;
a storage management configuration component, configured to dynamically configure the related control parameters of the real-time data storage management module.
3. The dynamically configurable big data analysis system according to claim 1, characterized in that the real-time stream analysis and calculation module comprises:
a real-time stream processing component, configured to obtain data from HDFS for real-time analysis and obtain real-time calculation results, and to persist the real-time calculation results, send them to the visualization module and store them in HDFS;
a real-time data analysis component, configured to perform statistical analysis and machine-learning-based intelligent analysis on real-time data, and to schedule tasks according to the real-time analysis calculation load so as to achieve load balancing;
a real-time stream analysis and calculation configuration component, configured to dynamically configure the related control parameters of the real-time stream analysis and calculation module.
4. The dynamically configurable big data analysis system according to claim 1, characterized in that the offline analysis module comprises:
an offline data processing component, configured to obtain offline data from HDFS for offline analysis and obtain offline calculation results, and to persist the offline calculation results, send them to the visualization module and store them in HDFS and NoSQL;
an offline data analysis component, configured to perform statistical analysis and machine-learning-based intelligent analysis on offline data, and to schedule tasks according to the offline analysis calculation load so as to achieve load balancing;
an offline data analysis and calculation configuration component, configured to dynamically configure the related control parameters of the offline analysis module.
5. The dynamically configurable big data analysis system according to claim 1, characterized in that the visualization module comprises:
a dynamic configuration component, configured to coordinate the above modules and achieve configuration optimization of system performance;
a real-time analysis view component, configured to visualize the real-time calculation results, including system data, real-time statistical analysis and intelligent prediction data;
an offline analysis view component, configured to visualize the offline calculation results, including the display of topic-message aggregation, state analysis and intelligent prediction results, and statistical summaries of location service requests;
a dynamic configuration view component, configured to display configuration data together with the associated detection accuracy of the real-time and offline calculation results.
6. A dynamic configuration method for a big data analysis system, characterized by comprising the following steps:
S1: presetting a time window, and having the dynamic configuration manager preset and initialize a warning data structure;
S2: in a node, according to the task type of an object instance, setting empirical initial values for the early-warning redundancy lower bound and upper bound of the object instance, and a parameter-adjustment step constant;
S3: calculating the early-warning redundancy value of the object instance;
S4: determining that the early-warning redundancy value lies between the empirical initial values of the lower bound and the upper bound, and generating a random number;
S5: calculating an optimized upper bound value and an optimized lower bound value from the step, the random number and the empirical initial values of the upper and lower bounds;
S6: determining that the early-warning redundancy value lies between the optimized lower bound value and the optimized upper bound value;
S7: within the preset time window, polling the warning information list in the dynamic configuration manager;
S8: for the warning information list of node states, modifying the node state to achieve dynamic maintenance of the node.
7. The dynamic configuration method for a big data analysis system according to claim 6, characterized in that step S4 specifically comprises:
S41: judging whether the early-warning redundancy value is greater than or equal to the empirical initial value of the lower bound;
if so, performing S42: updating the object instance;
if not, performing S411: judging whether the object instance is in the ready state;
if so, performing S412: activating the object instance, and returning to step S41;
if not, performing S413: creating a task instance, and returning to step S41;
wherein after step S42 the method further comprises S43: judging whether the early-warning redundancy value is less than or equal to the empirical initial value of the upper bound;
if so, performing S44: updating the object instance, and generating a random number;
if not, performing S431: judging whether the object instance is in the ready state or the override state;
if so, performing S432: deleting the object instance, and returning to step S43;
if not, performing S433: adjusting the parameters of the object instance, and returning to step S43.
8. The dynamic configuration method for a big data analysis system according to claim 7, characterized in that, after step S412, the method further comprises:
S414: judging whether the activation succeeded;
if so, returning to step S41;
otherwise, performing S415: setting the node state in the warning node information list to override;
after step S413, the method further comprises S416: judging whether the creation succeeded;
if so, returning to step S41;
otherwise, performing S415.
9. The dynamic configuration method for a big data analysis system according to claim 7, characterized in that, after step S432, the method further comprises S434: judging whether the deletion succeeded;
if so, returning to step S43;
otherwise, performing S415;
after step S433, the method further comprises S436: judging whether the adjustment succeeded;
if so, returning to step S43;
otherwise, performing S415.
10. The dynamic configuration method for a big data analysis system according to claim 6, characterized in that step S5 specifically comprises:
S51: calculating the optimized lower bound value: optimized lower bound value = empirical initial value of the lower bound + step * random number;
S52: calculating the optimized upper bound value: optimized upper bound value = empirical initial value of the upper bound - step * random number.
CN201510577285.XA 2015-09-11 2015-09-11 Dynamically configurable big data analysis system and method Active CN105279603B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910332409.6A CN110222923A (en) 2015-09-11 2015-09-11 Dynamically configurable big data analysis system
CN201510577285.XA CN105279603B (en) 2015-09-11 2015-09-11 Dynamically configurable big data analysis system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510577285.XA CN105279603B (en) 2015-09-11 2015-09-11 Dynamically configurable big data analysis system and method

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN201910332409.6A Division CN110222923A (en) 2015-09-11 2015-09-11 Dynamically configurable big data analysis system

Publications (2)

Publication Number Publication Date
CN105279603A true CN105279603A (en) 2016-01-27
CN105279603B CN105279603B (en) 2020-02-07

Family

ID=55148575

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201910332409.6A Pending CN110222923A (en) 2015-09-11 2015-09-11 Dynamically configurable big data analysis system
CN201510577285.XA Active CN105279603B (en) 2015-09-11 2015-09-11 Dynamically configurable big data analysis system and method

Country Status (1)

Country Link
CN (2) CN110222923A (en)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105760459A (en) * 2016-02-04 2016-07-13 四川嘉宝资产管理集团股份有限公司 Distributed data processing system and method
CN105912582A (en) * 2016-03-31 2016-08-31 畅捷通信息技术股份有限公司 Control method for users' behavior analyses and control system for users' behavior analyses
CN106126641A (en) * 2016-06-24 2016-11-16 中国科学技术大学 A kind of real-time recommendation system and method based on Spark
CN106407472A (en) * 2016-11-01 2017-02-15 广西电网有限责任公司电力科学研究院 Visual editing and management system for big data analysis and calculation task of order model
CN106776984A (en) * 2016-12-02 2017-05-31 航天星图科技(北京)有限公司 A kind of cleaning method of distributed system mining data
CN107145789A (en) * 2017-05-22 2017-09-08 国网江苏省电力公司电力科学研究院 A kind of Visual Interactive method of big data safety analysis
CN107220261A (en) * 2016-03-22 2017-09-29 中国移动通信集团山西有限公司 A kind of real-time method for digging and device based on distributed data
CN107294801A (en) * 2016-12-30 2017-10-24 江苏号百信息服务有限公司 Stream Processing method and system based on magnanimity real-time Internet DPI data
CN107451147A (en) * 2016-05-31 2017-12-08 北京京东尚科信息技术有限公司 A kind of method and apparatus of kafka clusters switching at runtime
CN107623737A (en) * 2017-09-28 2018-01-23 南京轨道交通系统工程有限公司 A kind of track traffic radio communication scheduling system and its design method
CN107621972A (en) * 2016-07-15 2018-01-23 中兴通讯股份有限公司 Big data task dynamic management approach, device and server
CN107918579A (en) * 2016-10-09 2018-04-17 北京神州泰岳软件股份有限公司 A kind of method and apparatus of Mass production base-line data
CN108463813A (en) * 2016-11-30 2018-08-28 华为技术有限公司 A kind of method and apparatus carrying out data processing
CN108494600A (en) * 2018-03-30 2018-09-04 努比亚技术有限公司 Topology creates the method, apparatus and storage medium of management and control
CN108959954A (en) * 2018-03-30 2018-12-07 努比亚技术有限公司 Method, apparatus, server and the storage medium of Storm authority managing and controlling
CN109144508A (en) * 2018-07-23 2019-01-04 北京科东电力控制系统有限责任公司 It generates, the method and device of customization alarm picture
CN109154897A (en) * 2016-05-17 2019-01-04 起元技术有限责任公司 Reconfigurable distributed treatment
CN109343138A (en) * 2018-09-29 2019-02-15 深圳市华讯方舟太赫兹科技有限公司 A kind of load-balancing method and rays safety detection apparatus of safe examination system
CN109727649A (en) * 2017-10-30 2019-05-07 埃森哲环球解决方案有限公司 Use machine learning design data analysis platform
CN109739925A (en) * 2019-01-07 2019-05-10 北京云基数技术有限公司 A kind of data processing system and method based on big data
CN110019189A (en) * 2017-09-18 2019-07-16 飞狐信息技术(天津)有限公司 A kind of generation method and generation system of chart
CN111245559A (en) * 2018-11-29 2020-06-05 阿里巴巴集团控股有限公司 Information determination method, information judgment method and device and computing equipment
CN112693502A (en) * 2019-10-23 2021-04-23 上海宝信软件股份有限公司 Urban rail transit monitoring system and method based on big data architecture
CN114285891A (en) * 2021-12-15 2022-04-05 北京天融信网络安全技术有限公司 SSLVPN-based session reconstruction method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6338072B1 (en) * 1997-07-23 2002-01-08 Bull S.A. Device and process for dynamically controlling the allocation of resources in a data processing system
CN102043675A (en) * 2010-12-06 2011-05-04 北京华证普惠信息股份有限公司 Thread pool management method based on task quantity of task processing request
US20110161980A1 (en) * 2009-12-31 2011-06-30 English Robert M Load Balancing Web Service by Rejecting Connections
CN103903455A (en) * 2014-04-14 2014-07-02 东南大学 Urban road traffic signal control optimization system
CN104317658A (en) * 2014-10-17 2015-01-28 华中科技大学 MapReduce based load self-adaptive task scheduling method
CN104375621A (en) * 2014-11-28 2015-02-25 广东石油化工学院 Dynamic weighting load assessment method based on self-adaptive threshold values in cloud computing

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5853465B2 (en) * 2011-07-27 2016-02-09 沖電気工業株式会社 Network analysis system
CN102497292A (en) * 2011-11-30 2012-06-13 中国科学院微电子研究所 Computer cluster monitoring method and system thereof
CN103618644A (en) * 2013-11-26 2014-03-05 曙光信息产业股份有限公司 Distributed monitoring system based on hadoop cluster and method thereof
CN104615526A (en) * 2014-12-05 2015-05-13 北京航空航天大学 Monitoring system of large data platform
CN104579761B (en) * 2014-12-24 2018-03-23 西安工程大学 NoSQL cluster automatic configuration system and automatic configuration method based on cloud computing

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
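Hu Yao et al.: "Real-time monitoring and on-demand adjustment of memory resources based on Xen virtual machines", Journal of Computer Applications *
Wei Bin: "Design and implementation of a data cloud service platform based on a distributed log system", China Master's Theses Full-text Database, Information Science and Technology *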

Cited By (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105760459A (en) * 2016-02-04 2016-07-13 四川嘉宝资产管理集团股份有限公司 Distributed data processing system and method
CN107220261A (en) * 2016-03-22 2017-09-29 中国移动通信集团山西有限公司 Real-time mining method and device based on distributed data
CN107220261B (en) * 2016-03-22 2020-10-30 中国移动通信集团山西有限公司 Real-time mining method and device based on distributed data
CN105912582A (en) * 2016-03-31 2016-08-31 畅捷通信息技术股份有限公司 Control method for users' behavior analyses and control system for users' behavior analyses
CN109154897A (en) * 2016-05-17 2019-01-04 起元技术有限责任公司 Reconfigurable distributed treatment
CN107451147A (en) * 2016-05-31 2017-12-08 北京京东尚科信息技术有限公司 Method and apparatus for switching Kafka clusters at runtime
CN106126641B (en) * 2016-06-24 2019-02-05 中国科学技术大学 Real-time recommendation system and method based on Spark
CN106126641A (en) * 2016-06-24 2016-11-16 中国科学技术大学 Real-time recommendation system and method based on Spark
CN107621972A (en) * 2016-07-15 2018-01-23 中兴通讯股份有限公司 Big data task dynamic management approach, device and server
CN107918579A (en) * 2016-10-09 2018-04-17 北京神州泰岳软件股份有限公司 Method and apparatus for batch generation of baseline data
CN106407472A (en) * 2016-11-01 2017-02-15 广西电网有限责任公司电力科学研究院 Visual editing and management system for big data analysis and calculation task of order model
CN106407472B (en) * 2016-11-01 2019-08-20 广西电网有限责任公司电力科学研究院 Visual editing and management system for big data analysis and calculation task of order model
CN108463813A (en) * 2016-11-30 2018-08-28 华为技术有限公司 Method and device for processing data
CN108463813B (en) * 2016-11-30 2020-12-04 华为技术有限公司 Method and device for processing data
CN106776984B (en) * 2016-12-02 2018-09-25 航天星图科技(北京)有限公司 Cleaning method for distributed system mining data
CN106776984A (en) * 2016-12-02 2017-05-31 航天星图科技(北京)有限公司 Cleaning method for distributed system mining data
CN107294801B (en) * 2016-12-30 2020-03-31 江苏号百信息服务有限公司 Streaming processing method and system based on massive real-time internet DPI data
CN107294801A (en) * 2016-12-30 2017-10-24 江苏号百信息服务有限公司 Streaming processing method and system based on massive real-time internet DPI data
CN107145789B (en) * 2017-05-22 2019-08-23 国网江苏省电力公司电力科学研究院 Visual interaction method for big data security analysis
CN107145789A (en) * 2017-05-22 2017-09-08 国网江苏省电力公司电力科学研究院 Visual interaction method for big data security analysis
CN110019189A (en) * 2017-09-18 2019-07-16 飞狐信息技术(天津)有限公司 Chart generation method and generation system
CN107623737A (en) * 2017-09-28 2018-01-23 南京轨道交通系统工程有限公司 Rail transit wireless communication dispatching system and its design method
CN109727649A (en) * 2017-10-30 2019-05-07 埃森哲环球解决方案有限公司 Design data analysis platform using machine learning
CN109727649B (en) * 2017-10-30 2023-04-11 埃森哲环球解决方案有限公司 Design data analysis platform using machine learning
CN108494600B (en) * 2018-03-30 2022-12-23 大唐丘北风电有限责任公司 Topology creation control method, device and storage medium
CN108494600A (en) * 2018-03-30 2018-09-04 努比亚技术有限公司 Topology creation control method, device and storage medium
CN108959954A (en) * 2018-03-30 2018-12-07 努比亚技术有限公司 Method, apparatus, server and the storage medium of Storm authority managing and controlling
CN109144508A (en) * 2018-07-23 2019-01-04 北京科东电力控制系统有限责任公司 Method and device for generating and customizing alarm pictures
CN109343138A (en) * 2018-09-29 2019-02-15 深圳市华讯方舟太赫兹科技有限公司 Load-balancing method for a security inspection system, and security inspection device
CN111245559A (en) * 2018-11-29 2020-06-05 阿里巴巴集团控股有限公司 Information determination method, information judgment method and device and computing equipment
CN111245559B (en) * 2018-11-29 2023-04-18 阿里巴巴集团控股有限公司 Information determination method, information judgment method and device and computing equipment
CN109739925A (en) * 2019-01-07 2019-05-10 北京云基数技术有限公司 Data processing system and method based on big data
CN112693502A (en) * 2019-10-23 2021-04-23 上海宝信软件股份有限公司 Urban rail transit monitoring system and method based on big data architecture
CN114285891A (en) * 2021-12-15 2022-04-05 北京天融信网络安全技术有限公司 SSLVPN-based session reconstruction method and system
CN114285891B (en) * 2021-12-15 2024-01-23 北京天融信网络安全技术有限公司 SSLVPN-based session reconstruction method and system

Also Published As

Publication number Publication date
CN105279603B (en) 2020-02-07
CN110222923A (en) 2019-09-10

Similar Documents

Publication Publication Date Title
CN105279603A (en) Dynamically configured big data analysis system and method
CN110794800B (en) Intelligent factory information management monitoring system
Barbagallo et al. A bio-inspired algorithm for energy optimization in a self-organizing data center
CN104317658A (en) MapReduce based load self-adaptive task scheduling method
EP2648137A2 (en) Generic reasoner distribution method
CN104915407A (en) Resource scheduling method under Hadoop-based multi-job environment
KR101303690B1 (en) Power management apparatus and its method, power control system
CN110287228A (en) Implementation method based on dispatching of power networks domain equipment monitoring real-time data acquisition
CN105893158A (en) Big data hybrid scheduling model on private cloud condition
CN105446816A (en) Heterogeneous platform oriented energy consumption optimization scheduling method
CN113268486A (en) Integrated data application system of intelligent factory
CN112347636A (en) Equipment guarantee simulation modeling method based on Multi-Agent technology
CN110086855A (en) Spark task Intellisense dispatching method based on ant group algorithm
CN111324460B (en) Power monitoring control system and method based on cloud computing platform
CN115756833A (en) AI inference task scheduling method and system oriented to multiple heterogeneous environments
CN103442087B (en) Apparatus and method for controlling Web service system access volume based on response time trend analysis
CN114490049A (en) Method and system for automatically allocating resources in containerized edge computing
CN107995026B (en) Management and control method, management node, managed node and system based on middleware
CN113225994A (en) Intelligent air conditioner control method facing data center
CN116028193B (en) Big data task dynamic high-energy-efficiency scheduling method and system for mixed part cluster
CN115378789B (en) Multi-level cooperative stream resource management method and system
CN103346906A (en) Intelligent operation and maintenance method and system based on cloud computing
CN110266515A (en) Operation information system based on pervasive computing
CN112947173B (en) Method, controller and system for predicting running state of digital twin workshop
Feng et al. Simulation optimization framework for online deployment and adjustment of reconfigurable machines in job shops

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant