CN106506254B - Bottleneck node detection method for a large-scale stream data processing system - Google Patents

Info

Publication number: CN106506254B
Authority: CN (China)
Prior art keywords: fuzzy, node, reasoning, bottleneck, result
Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN201610835764.1A
Other languages: Chinese (zh)
Other versions: CN106506254A (en)
Inventors: 翟岩龙, 吴煦, 王子硕, 扶聪, 张鑫宇
Current Assignee: Beijing Institute of Technology BIT (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Beijing Institute of Technology BIT
Application filed by: Beijing Institute of Technology BIT
Priority to: CN201610835764.1A
Publication of CN106506254A
Publication of CN106506254B (application granted)

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/08: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0817: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning

Landscapes

  • Engineering & Computer Science (AREA)
  • Environmental & Geological Engineering (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The bottleneck node detection method for a large-scale stream data processing system of the present invention belongs to the technical fields of big data computing, fuzzy logic and stream preprocessing. The method (hereinafter "this method") relies on a bottleneck detection system based on fuzzy logic control (hereinafter "this system"), which comprises an initialization unit, a node state acquisition unit, a fuzzy reasoning unit and a defuzzification unit. The steps of the method are: (1) the initialization unit initializes the fuzzy logic engine, sets the semantic labels and the membership function of each state variable, loads the fuzzy rule set, and sets the decision parameters for the reasoning result; (2) the node state is acquired; (3) the input variables are fuzzified; (4) fuzzy reasoning is performed; (5) the result is defuzzified to obtain the decision. The invention detects in time the changes in system load caused by changes in traffic and identifies bottleneck nodes promptly so that they can be extended, so that only a cluster with optimal resource utilization needs to be run and maintained, which reduces the cluster scale.

Description

Bottleneck node detection method for a large-scale stream data processing system
Technical field
The present invention relates to a bottleneck node detection method for a large-scale stream data processing system, and belongs to the technical fields of big data computing, fuzzy logic and stream preprocessing.
Background art
With the development of real-time big data technology, many companies have begun to deploy their own stream data processing clusters, and keeping these clusters running is very costly. Stream data processing systems are typically characterized by unstable data stream volume and by system conditions that change rapidly as events arrive. To keep the system running normally during the rare periods of heavy traffic, resources must be allocated according to the estimated peak traffic when the cluster is configured. Heavy traffic, however, occurs only rarely; if resources are provisioned for peak demand, most resources sit idle most of the time, system resource utilization is very low, and resources are seriously wasted. How to monitor a running cluster, detect bottleneck nodes in it quickly and efficiently, and extend them has therefore become one of the key problems in the field of cloud computing architecture.
None of today's mainstream stream data processing engines is designed to detect and extend an individual overloaded (bottleneck) node. Storm and S4, for example, run jobs with a static configuration and cannot dynamically allocate and reclaim resources on demand when traffic is unstable. They can only observe the running state of the system as a whole, and when extension is needed the cluster must be stopped and the static configuration file edited to reallocate resources before the cluster can resume running. In response to the demand of today's cloud computing platforms for scalability, researchers have studied several methods for detecting bottleneck nodes and integrated them on platforms such as Storm, and these have been widely applied in stream data processing.
Methods for bottleneck node detection and extension in stream data processing systems fall into three main classes. The first class is static judgment based on thresholds. This is simple and intuitive, but setting static thresholds correctly requires the user to understand the application's load trends very well, and the thresholds are specific to each application, so the cloud platform cannot know how they should be determined. The second class is automatic decision-making based on reinforcement learning for automatic extension. This approach uses a Markov decision process model and the Q-learning algorithm, training a judgment model by machine learning to assess the system's load state. It has two defects: first, initial performance is poor and a long training time is required; second, it needs a very large state space, since the number of states grows exponentially with the number of state variables, and performance degrades severely when there are too many states. The third class is based on control theory, which has been used for the automated management of systems such as web servers, storage systems and data centers. Control-theoretic methods are usually divided into open loop and feedforward-feedback (feedback and feed-forward). Open loop is a mode without feedback: it computes a value from the current system state and the system model and does not check whether the effect of that output on the system achieved the desired result. A controller with feedback, by contrast, observes the system output and corrects its computation accordingly in order to reach the desired result.
Methods that integrate control-theoretic techniques into stream data processing systems have been studied extensively. Palden Lama, Xiaobo Zhou et al., in "Automated control in cloud computing: challenges and opportunities", presented at the 2010 IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, proposed a control method that adjusts the number of virtual machines in a cluster based on average CPU utilization. The method uses intuitive, easily understood control logic and achieves automatic allocation of virtual machine resources, but it is too simple: it considers only the single parameter of CPU usage, and so narrow a variable can hardly reflect the overall load state of a stream data processing system, so the results are less reliable and the error is larger. The article "Self-Tuning Virtual Machines for Predictable eScience", presented at the 9th IEEE/ACM International Symposium on Cluster Computing and the Grid in 2009, proposed a PI (Proportional-Integral) controller for controlling the resources of batch jobs. Based on the execution progress of a job, the method builds a feedback model of the resources already allocated. Although the controller works effectively, the model is mainly used to predict execution progress and resource allocation in batch processing systems and is not fully applicable to stream data processing systems. Another widely used control-theoretic technique is fuzzy logic. Fuzzy logic control maps load parameters to fuzzy sets, applies a defined set of fuzzy rules to obtain the fuzzy variables of the result and their degrees of membership, and finally obtains the result of fuzzy reasoning through a defuzzification operation. A control system based on fuzzy logic theory is called a fuzzy controller. "Fuzzy Modeling Based Resource Management for Virtualized Database Systems", presented at the 2011 IEEE 19th International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems, proposed a method that uses fuzzy logic to control dynamic resource allocation. That method uses CPU usage to represent the input load and verified the feasibility of using fuzzy logic as a resource allocation controller, but the paper mainly considers resource allocation for the business logic layer (database), its choice of input variables is likewise too simple, and it cannot fully reflect the diversity of system states caused by the variability of data streams in a stream data processing system.
Although the existing resource control schemes above are effective to some degree in their respective application scenarios, each has shortcomings. Methods based on control theory mostly choose input variables that are too simple. Methods based on reinforcement learning perform poorly during initialization, and the results of the learned model are not reliably guaranteed. The simple, intuitive threshold-setting method cannot adapt to a wider range of application scenarios, because an independent threshold must be set for each application and setting it depends on understanding the application's complexity. The present invention aims to solve these problems by proposing a control-theory-based bottleneck node detection method for large-scale stream data processing systems. The method achieves good computational performance and selects enough variables reflecting the characteristics of a stream processing system to take part in the computation. The invention detects in time the changes in system load brought about by changes in traffic and identifies bottleneck nodes promptly so that they can be extended, so that only a cluster with optimal resource utilization needs to be run and maintained, which reduces the cluster scale and saves resources.
Summary of the invention
The purpose of the present invention is to overcome the technical defects of existing extension theories, namely insufficient selection of state variables and low reliability, by proposing a bottleneck node detection method for a large-scale stream data processing system.
The system on which the bottleneck node detection method relies is a bottleneck detection system based on fuzzy logic control (hereinafter "this system"); it comprises an initialization unit, a node state acquisition unit, a fuzzy reasoning unit and a defuzzification unit;
The bottleneck node detection method for a large-scale stream data processing system (hereinafter "this method") has the following specific steps:
Step 1: the initialization unit initializes the fuzzy logic engine, sets the semantic labels of the input variables and the membership function of each semantic label, loads the fuzzy rule set, and sets the decision parameters for the reasoning result;
Here, the fuzzy logic engine is a program engine that implements the Fuzzy Control Language (FCL) standard (IEC 1131-7) and can perform fuzzy reasoning; fuzzylite, implemented in C++, or jFuzzyLogic, implemented in Java, can be used;
Semantic labels are the "truth values" used by fuzzy logic. Each input value (i.e. node state) has its own semantic labels and corresponding membership functions. These should be set during the initialization stage; they are usually recorded in a configuration file, which is read and parsed by the fuzzy logic engine;
Each semantic label of an input variable corresponds to one membership function whose value lies between 0 and 1. Suppose the value range of some input variable x is (m, n); the function f(x, v) then gives the degree of membership in semantic label v when the value is x. In general, the membership function of a semantic label is a trapezoid or a triangle in a rectangular coordinate system. The fuzzy rule set is a collection of fuzzy reasoning rules written in advance; the fuzzy logic engine reads and parses these rules and uses them for subsequent logical reasoning;
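The trapezoidal and triangular membership functions described above can be sketched in a few lines of Python. This is only an illustration: the function names `trapezoid` and `triangle` and the breakpoint parameters `a`, `b`, `c`, `d` are assumptions introduced here, not names from the patent or from any fuzzy logic engine.

```python
def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Degree of membership for a trapezoid with feet at a and d and a
    plateau between b and c (requires a <= b <= c <= d)."""
    if b <= x <= c:          # on the plateau: full membership
        return 1.0
    if x <= a or x >= d:     # outside the support: no membership
        return 0.0
    if x < b:                # rising edge between a and b
        return (x - a) / (b - a)
    return (d - x) / (d - c) # falling edge between c and d


def triangle(x: float, a: float, b: float, c: float) -> float:
    """A triangle is a trapezoid whose plateau has collapsed to the peak b."""
    return trapezoid(x, a, b, b, c)
```

Setting `a == b` or `c == d` gives a shoulder shape that stays at 1 up to the edge of the value range, which is convenient for labels at either end of a range.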
Decision on the reasoning result: two thresholds must also be set for the final result, a shrink threshold and an extension threshold, denoted threshold_scale_in and threshold_scale_out respectively. When the defuzzification result is less than threshold_scale_in, the current node can be reclaimed; when the defuzzification result is greater than threshold_scale_out, the current node needs to be extended;
Step 2: the node state acquisition unit obtains the node state;
To judge whether a node is a bottleneck, its current running state must first be obtained. For a stream data processing cluster, we select the node's CPU usage, memory usage and data tuple size as the basic information;
The node state is represented as a tuple: status_i = {C_i, M_i, S_i, Miss_i}, whose elements are, for node i, the current CPU load C_i; the memory usage M_i; the size of the data tuple currently being processed, S_i; and the number of tuples that were not processed in time during the most recent period, Miss_i;
For stream data processing engines that strictly require every data tuple to be processed within a specified time, i.e. engines that do not allow timeouts, a processing timeout on any tuple indicates that the system already needs to be extended, so for this kind of engine the value of Miss_i in this system is always 0. For stream data processing engines that tolerate some faults, a group of labels and corresponding membership functions can be set for the Miss option in the semantic labels;
Here, the range of C_i is 0 to 100 and the range of M_i is 0 to 100; the value ranges of S_i and Miss_i depend on the specific application scenario;
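As a rough illustration of the state tuple status_i = {C_i, M_i, S_i, Miss_i}, the sketch below models it as an immutable Python record and checks the stated 0 to 100 ranges for CPU and memory. The class name `NodeStatus` and its field names are hypothetical; the patent does not prescribe a data structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeStatus:
    """Hypothetical container for the state tuple of one node."""
    cpu: float         # C_i, CPU load in percent (0..100)
    mem: float         # M_i, memory usage in percent (0..100)
    tuple_size: float  # S_i, size of the tuple being processed (scenario-specific unit)
    missed: int        # Miss_i, tuples not processed in time recently

    def __post_init__(self) -> None:
        # C_i and M_i have a fixed range; S_i and Miss_i only need to be non-negative
        # because their upper bounds depend on the application scenario.
        for name in ("cpu", "mem"):
            value = getattr(self, name)
            if not 0.0 <= value <= 100.0:
                raise ValueError(f"{name} must be within 0..100, got {value}")
        if self.tuple_size < 0 or self.missed < 0:
            raise ValueError("tuple_size and missed must be non-negative")
```

For an engine that does not allow timeouts, `missed` would simply always be passed as 0.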
Step 3: the fuzzy reasoning unit fuzzifies the input variables, specifically:
The fuzzy reasoning unit uses the state tuple obtained in step 2 to set the inputs of the fuzzy logic engine, and the inputs are fuzzified by the defined membership functions; this step can be completed by the fuzzy logic engine. The detailed process of fuzzifying the input variables is as follows:
Step 3.1: a membership function records the degree to which a variable belongs to a fuzzy set; for every fuzzy set of an input variable, the degree of membership must be computed separately;
Here, the membership function is denoted μ_A(x); the degree of membership is a real number between 0 and 1;
Computing the degree of membership separately for every fuzzy set of an input variable means the following:
Suppose there are n fuzzy sets A_1, A_2, ..., A_n; the degree of membership must then be computed for each of these n fuzzy sets, giving [μ_A1, μ_A2, ..., μ_An];
Step 3.2: the degrees of membership in their fuzzy sets are computed in this way for all input variables;
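Steps 3.1 and 3.2 amount to evaluating every membership function of a variable at the observed value. A minimal sketch, assuming a hypothetical `fuzzify` helper and three illustrative fuzzy sets for a count in the range 0 to 10 (the set shapes and breakpoints are assumptions chosen only for the example):

```python
from typing import Callable, Dict

Membership = Callable[[float], float]


def fuzzify(x: float, fuzzy_sets: Dict[str, Membership]) -> Dict[str, float]:
    """Return the degree of membership of x in every fuzzy set A_1..A_n."""
    return {label: mu(x) for label, mu in fuzzy_sets.items()}


# Illustrative fuzzy sets over 0..10 (shapes are assumed, not from the patent).
MISS_SETS: Dict[str, Membership] = {
    "small": lambda x: max(0.0, min(1.0, (5.0 - x) / 5.0)),
    "medium": lambda x: max(0.0, 1.0 - abs(x - 5.0) / 5.0),
    "large": lambda x: max(0.0, min(1.0, (x - 5.0) / 5.0)),
}

degrees = fuzzify(2.0, MISS_SETS)
```

With these assumed shapes, a value of 2 belongs mostly to "small", partly to "medium" and not at all to "large". Repeating the call for each input variable yields the full fuzzification result used in step 4.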
Step 4: fuzzy reasoning;
Here, fuzzy reasoning is reasoning based on fuzzy rules. The premise of a fuzzy rule, i.e. the condition of the fuzzy reasoning, is a logical combination of fuzzy propositions; the conclusion of a fuzzy rule is a fuzzy proposition expressing the reasoning result. The degree to which each fuzzy proposition holds is expressed by the membership function of the qualitative value of the corresponding linguistic variable, i.e. the fuzzification result computed in step 3;
Step 4, specifically:
Step 4.1: the fuzzy reasoning unit computes the conclusion of every fuzzy rule. Using the fuzzification result obtained in step 3, it evaluates the logical combination of the fuzzy propositions in the premise of the rule, then takes the min of the degree of membership of the premise combination and the membership function of the conclusion proposition to obtain the fuzzy degree of the conclusion;
Step 4.2: a max operation is applied to the fuzzy degrees of the conclusions of all fuzzy rules from step 4.1 to obtain the fuzzy reasoning result;
At this point the fuzzy logic engine already provides a complete implementation of fuzzy reasoning; we only need to define the fuzzy rule set and call the interface provided by the engine to obtain the reasoning result;
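The min and max operations of steps 4.1 and 4.2 can be sketched as follows. This is a simplified stand-in for what a fuzzy logic engine does internally, not the engine's API: the `infer` function, the rule encoding and the two example rules are all assumptions made for illustration, and premises are combined with AND only.

```python
from typing import Dict, List, Tuple

# A rule is (premise, conclusion): the premise is a list of (variable, label)
# pairs joined by AND, the conclusion is an output label.
Rule = Tuple[List[Tuple[str, str]], str]


def infer(fuzzified: Dict[str, Dict[str, float]], rules: List[Rule]) -> Dict[str, float]:
    """Min-max inference: AND inside a rule is min, aggregation across rules is max."""
    result: Dict[str, float] = {}
    for premise, conclusion in rules:
        # Step 4.1: firing strength of the rule is the min over its premise terms.
        strength = min(fuzzified[var][label] for var, label in premise)
        # Step 4.2: several rules with the same conclusion are combined by max.
        result[conclusion] = max(result.get(conclusion, 0.0), strength)
    return result


# Hypothetical rules, loosely following "very high CPU and memory means extend".
RULES: List[Rule] = [
    ([("cpu", "very_high"), ("mem", "very_high")], "extend"),
    ([("cpu", "low"), ("mem", "low")], "shrink"),
    ([("cpu", "medium")], "maintain"),
]
```

The returned strengths are the levels at which each output fuzzy set is clipped before defuzzification in step 5.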
Step 5: defuzzify and obtain the decision result;
Step 4 yields a set of degrees of membership for the fuzzy sets of the result. This set must be defuzzified to reach a conclusion on whether a node is in a bottleneck state. Preferred defuzzification methods are the maximum membership method, the weighted average method and the center of gravity method (Center of Gravity, COG). The maximum membership method takes the result with the largest degree of membership as the final decision; it is simple to implement but imprecise. COG is more commonly used; it takes the position of the center of gravity of the result set as the result;
Fuzzy logic engines implement a variety of defuzzification algorithms. In jFuzzyLogic, for example, specifying COG as the DEFUZZIFY METHOD value in the configuration file is enough to use center of gravity defuzzification. The result of center of gravity defuzzification is a single number, which is compared with the two thresholds set in the initialization stage to reach the final decision on whether the node should be extended, shrunk or kept unchanged.
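Center of gravity defuzzification can be approximated numerically by sampling the output range, clipping each output fuzzy set at its rule strength, aggregating with max, and taking the weighted mean of the sample positions. The sketch below assumes an output range of 0 to 100 and three triangular output sets; the function `cog`, the set shapes and the sampling resolution are illustrative assumptions, and a real engine may integrate differently.

```python
from typing import Callable, Dict


def cog(strengths: Dict[str, float],
        output_sets: Dict[str, Callable[[float], float]],
        lo: float = 0.0, hi: float = 100.0, steps: int = 1000) -> float:
    """Discrete center of gravity of the clipped, max-aggregated output sets."""
    num = 0.0
    den = 0.0
    for k in range(steps + 1):
        x = lo + (hi - lo) * k / steps
        # Clip every output set at its rule strength (min), then aggregate (max).
        mu = max(min(strengths.get(label, 0.0), f(x))
                 for label, f in output_sets.items())
        num += x * mu
        den += mu
    if den == 0.0:
        raise ValueError("no rule fired, center of gravity is undefined")
    return num / den


# Assumed output sets on 0..100: low values favour shrinking, high values extending.
OUT_SETS: Dict[str, Callable[[float], float]] = {
    "shrink": lambda x: max(0.0, 1.0 - x / 50.0),
    "maintain": lambda x: max(0.0, 1.0 - abs(x - 50.0) / 50.0),
    "extend": lambda x: max(0.0, (x - 50.0) / 50.0),
}
```

When only the "extend" rule fires, the result lands in the upper part of the range; when the strengths are symmetric, it lands at the midpoint. The number returned here is what gets compared against threshold_scale_in and threshold_scale_out.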
Steps 1 to 5 above complete the bottleneck node detection method for a large-scale stream data processing system.
Beneficial effects
Compared with other bottleneck node detection methods for large-scale stream data processing systems, the present method has the following beneficial effects:
1. The bottleneck node detection method of the present invention depends only on the current state of the system and the system state at the previous time point, and does not require computing an integral over time;
2. The bottleneck node detection method of the present invention requires no model training, so its accuracy and stability are not affected by training data (the method itself needs no training);
3. The advantage of the bottleneck node detection method of the present invention is that it simplifies computation while offering good stability and reliability;
Brief description of the drawings
Fig. 1 is the execution flow chart of the bottleneck node detection method for a large-scale stream data processing system of the present invention;
Fig. 2 is an example diagram of the membership function of CPU utilization in the bottleneck node detection method for a large-scale stream data processing system of the present invention.
Detailed description of the embodiments
The method of the invention is explained in depth below with reference to the drawings and specific embodiments.
Embodiment 1
This embodiment describes in detail the flow of applying the present invention to bottleneck node detection in a stream data processing system.
Step A: initialization.
The fuzzy logic engine used in this example is jFuzzyLogic, implemented in Java; the semantic labels and membership functions are configured and initialized through the configuration file provided by jFuzzyLogic.
The node in this example only needs to process one incoming data stream. The task of O is denoted by T, and at time t_i, Task_i denotes the state of the processing being executed. We use the following parameters to describe this state.
p_i(t): the size of the data tuple currently being processed
c_i(t): the CPU usage of the current node
m_i(t): the memory usage of the current node
miss_i(t): the number of tuples currently missed because they were not processed in time
The purpose of using fuzzy logic for the decision is to judge whether a node has reached a bottleneck: if the node is already a bottleneck it needs to be extended, and if its load is very low it can be reclaimed. We therefore define the output as the action to be performed on the node, with the output set Out = {extend, maintain, shrink}.
Using the chosen membership functions, we fuzzify the four input parameters above and the one output parameter. Semantic labels are set for each chosen parameter, i.e. their fuzzy sets are defined. For CPU usage, the domain is 0%-100%; based on experience, the set of semantic labels for CPU usage can be set to C = {very low, low, medium, high, very high}, which we call the linguistic labels of CPU usage. For memory usage, the domain is also 0%-100%, and it can similarly be set to M = {very low, low, medium, high, very high}. For the number of timed-out (Miss) tuples, the interval in this example is 0-10, and its fuzzy set is E = {small, medium, large}. Similarly, for the size of the data tuple currently being processed, the domain in this example is 0 Mb-10 Mb, and its fuzzy set is P = {small, medium, large, very large}.
A membership function can be expressed as a piecewise function or as a line chart. For convenience, we use line charts to show the membership function of each dimension. Based on experience, when CPU usage is below 5% its degree of membership in "very low" is taken to be 1, i.e. it is considered very low with 100% certainty; when CPU usage is above 90%, its degree of membership in "very high" is taken to be 1; in the remaining cases, i.e. between 5% and 90%, the degree of membership is as shown in Fig. 2.
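The two stated anchor points of the CPU membership functions (full membership in "very low" below 5% and in "very high" above 90%) can be written as piecewise functions. The shape between 5% and 90% is defined by Fig. 2, which is not reproduced in the text, so the linear ramps and the `zero_at` breakpoints of 20% and 75% below are purely assumed values for illustration.

```python
def cpu_very_low(cpu: float, zero_at: float = 20.0) -> float:
    """Membership in "very low": 1 up to 5%, then an assumed linear fall to 0."""
    if cpu <= 5.0:
        return 1.0
    if cpu >= zero_at:
        return 0.0
    return (zero_at - cpu) / (zero_at - 5.0)


def cpu_very_high(cpu: float, zero_at: float = 75.0) -> float:
    """Membership in "very high": 1 from 90% up, with an assumed linear rise before."""
    if cpu >= 90.0:
        return 1.0
    if cpu <= zero_at:
        return 0.0
    return (cpu - zero_at) / (90.0 - zero_at)
```

In a jFuzzyLogic setup these shapes would normally be declared in the configuration file rather than coded by hand; the functions above only make the piecewise definition concrete.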
Fuzzy rules are designed for the combinations of the fuzzified variables according to intuitive understanding, for example that a node needs to be extended when both its CPU usage and its memory usage are very high. The following table is a subset of the fuzzy rule base used in this example, and fuzzy reasoning is carried out with this rule base:
Fig. 1 is the execution flow chart of the system on which the method of the present invention relies.
As can be seen from Fig. 1, our system exists as a plug-in of the stream data processing system and fetches status data from the stream data processing system to perform the computation. The fuzzy logic engine jFuzzyLogic performs initialization by reading the definitions of the semantic labels and membership functions. The result is then obtained through the steps of fuzzification, fuzzy reasoning and defuzzification.
Step B: the node state acquisition unit obtains the node state. The variables determined in step 1 are easy to obtain at run time: the size of a data tuple is an attribute of the data stream, and CPU usage, memory usage and network interface traffic can all be obtained through system interfaces. These parameters have the greatest influence on a stream data processing engine: a node's processing performance depends on its CPU performance and memory size, and system throughput is affected by the volume of data traffic and the size of individual data tuples. Given the configuration of mainstream processing nodes, we consider that disk I/O performance is not a major factor affecting node throughput; even if the storage device is a mechanical hard disk, its transfer speed should be sufficient for current stream data processing.
Step C: the fuzzy reasoning unit fuzzifies the input variables; after the engine is initialized, the inputs can be set through the Java interface.
Step D: fuzzy reasoning, which can likewise be performed by calling the Java interface of jFuzzyLogic.
Step E: defuzzify and obtain the decision result.
In this example, defuzzification uses the common COG (Center of Gravity) algorithm, with the defuzzifier set to COG in jFuzzyLogic. COG defuzzification yields a numerical value for the output variable (Out). We set the decision thresholds to threshold_scale_in = 20 and threshold_scale_out = 80: when the defuzzification result is less than 20 the node is judged to be reclaimable, and when the defuzzification result is greater than 80 the current node is considered to be in a bottleneck state and needs to be extended.
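The final comparison against the two thresholds is a three-way decision. A minimal sketch using the example's values of 20 and 80, where the function name `decide` and the returned strings are illustrative choices and the parameter names follow the patent's threshold_scale_in and threshold_scale_out:

```python
def decide(score: float,
           threshold_scale_in: float = 20.0,
           threshold_scale_out: float = 80.0) -> str:
    """Map a defuzzified output value to one of the actions in Out."""
    if score < threshold_scale_in:
        return "shrink"    # load is low enough that the node can be reclaimed
    if score > threshold_scale_out:
        return "extend"    # node is a bottleneck and needs to be extended
    return "maintain"      # anything in between leaves the node unchanged
```

Values exactly equal to a threshold fall into "maintain", because the text specifies strict "less than" and "greater than" comparisons.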
The above is a preferred embodiment of the present invention, and the present invention should not be limited to the content disclosed in this embodiment and the drawings. All equivalents or modifications made without departing from the spirit disclosed by the present invention fall within the scope of protection of the present invention.

Claims (5)

1. A bottleneck node detection method for a large-scale stream data processing system, characterized in that it relies on a bottleneck detection system based on fuzzy logic control, comprising an initialization unit, a node state acquisition unit, a fuzzy reasoning unit and a defuzzification unit; the specific steps of the bottleneck node detection method are as follows:
Step 1: the initialization unit initializes the fuzzy logic engine, sets the semantic labels of the input variables and the membership function of each semantic label, loads the fuzzy rule set, and sets the decision parameters for the reasoning result;
Here, the fuzzy logic engine is a program engine that implements the Fuzzy Control Language FCL standard IEC 1131-7 and can perform fuzzy reasoning; fuzzylite, implemented in C++, or jFuzzyLogic, implemented in Java, can be used;
Semantic labels are the "truth values" used by fuzzy logic; each input value, i.e. node state, has its own semantic labels and corresponding membership functions, which should be set during the initialization stage, are recorded in a configuration file, and are read and parsed by the fuzzy logic engine;
Step 2: the node state acquisition unit obtains the node state; to judge whether a node is a bottleneck, the current running state of the node is obtained first, and for a stream data processing cluster the node's CPU usage, memory usage and data tuple size are selected as the basic information; for stream data processing engines that strictly require every data tuple to be processed within a specified time, i.e. engines that do not allow timeouts, a processing timeout on any tuple indicates that the system already needs to be extended, so for this kind of engine the value of Miss_i in this system is always 0; for stream data processing engines that tolerate some faults, a group of labels and corresponding membership functions can be set for the Miss_i option in the semantic labels;
Here, Miss_i denotes the number of tuples that were not processed in time on the current node i during the most recent period;
Step 3: the fuzzy reasoning unit fuzzifies the input variables, specifically: the fuzzy reasoning unit uses the data tuple obtained in step 2 to set the inputs of the fuzzy logic engine, and the inputs are fuzzified by the defined membership functions, which is completed by the fuzzy logic engine;
Here, the detailed process of fuzzifying the inputs is as follows:
Step 3.1: a membership function records the degree to which a variable belongs to a fuzzy set; for every fuzzy set of an input variable, the degree of membership must be computed separately;
Here, the membership function is denoted μ_A(x); computing the degree of membership separately for every fuzzy set of an input variable means the following:
Suppose there are n fuzzy sets A_1, A_2, ..., A_n; the degree of membership must then be computed for each of these n fuzzy sets, giving [μ_A1, μ_A2, ..., μ_An];
Step 3.2: the degrees of membership in their fuzzy sets are computed in this way for all input variables;
Step 4: fuzzy reasoning, specifically:
Step 4.1: the fuzzy reasoning unit computes the conclusion of every fuzzy rule; using the fuzzification result obtained in step 3, it evaluates the logical combination of the fuzzy propositions in the premise of the rule, then takes the min of the degree of membership of the premise combination and the membership function of the conclusion proposition to obtain the fuzzy degree of the conclusion;
Step 4.2: a max operation is applied to the fuzzy degrees of the conclusions of all fuzzy rules from step 4.1 to obtain the fuzzy reasoning result;
At this point the fuzzy logic engine already provides a complete implementation of fuzzy reasoning; it is only necessary to define the fuzzy rule set and call the interface provided by the engine to obtain the reasoning result;
Step 5: defuzzify and obtain the decision result.
2. The bottleneck node detection method for a large-scale stream data processing system according to claim 1, further characterized in that: in step 1, each semantic label of an input variable corresponds to one membership function whose value lies between 0 and 1; supposing the value range of some input variable x is (m, n), the function f(x, v) gives the degree of membership in semantic label v when the value is x; the membership function of a semantic label is a trapezoid or a triangle in a rectangular coordinate system; the fuzzy rule set is a collection of fuzzy reasoning rules written in advance, which the fuzzy logic engine reads and parses and uses for subsequent logical reasoning; for the decision on the reasoning result, two thresholds must also be set for the final result, a shrink threshold and an extension threshold, denoted threshold_scale_in and threshold_scale_out respectively; when the defuzzification result is less than threshold_scale_in the current node can be reclaimed, and when the defuzzification result is greater than threshold_scale_out the current node needs to be extended.
3. The bottleneck node detection method for a large-scale stream data processing system according to claim 1, further characterized in that: in step 2, the node state is represented by a tuple: status_i = {C_i, M_i, S_i, Miss_i}, whose elements respectively denote, for node i, the current CPU load, C_i; the memory usage, M_i; the size of the tuple data currently being processed, S_i; and the number of tuples that were not processed in time during the most recent period, Miss_i;
Wherein C_i ranges from 0 to 100 and M_i ranges from 0 to 100; the value ranges of S_i and Miss_i depend on the specific application scenario.
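The node state tuple could be modelled as a small record type. The sketch below is an assumption about one reasonable representation; the class and field names are hypothetical, and the non-negativity check on S_i and Miss_i is an added assumption, since the claim only says their ranges are application-dependent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeStatus:
    """status_i = {C_i, M_i, S_i, Miss_i} for one node."""

    cpu_load: float      # C_i, 0 to 100
    mem_usage: float     # M_i, 0 to 100
    data_size: float     # S_i, range depends on the application
    missed_tuples: int   # Miss_i, range depends on the application

    def __post_init__(self) -> None:
        # C_i and M_i are bounded by the claim.
        for name in ("cpu_load", "mem_usage"):
            value = getattr(self, name)
            if not 0 <= value <= 100:
                raise ValueError(f"{name} must be in [0, 100], got {value}")
        # Assumption: sizes and counts cannot be negative.
        if self.data_size < 0 or self.missed_tuples < 0:
            raise ValueError("data_size and missed_tuples must be >= 0")
```

For example, `NodeStatus(85.0, 60.0, 2048.0, 12)` describes a node at 85 percent CPU load that failed to process 12 tuples in time.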
4. The bottleneck node detection method for a large-scale stream data processing system according to claim 1, further characterized in that: the membership degree in step 3.1 is a real number between 0 and 1.
5. The bottleneck node detection method for a large-scale stream data processing system according to claim 1, further characterized in that: in step 4, fuzzy reasoning is reasoning based on fuzzy rules; the premise of a fuzzy rule, i.e. the condition of the fuzzy reasoning, is a logical combination of fuzzy propositions; the conclusion of a fuzzy rule is a fuzzy proposition expressing the reasoning result; the degree to which each fuzzy proposition holds is given by the membership function of the qualitative value of the corresponding linguistic variable, i.e. the fuzzification result obtained in step 3;
Step 4 yields a set of membership degree values over the fuzzy set of the result; this set is defuzzified to conclude whether a node is a bottleneck. Defuzzification methods include the maximum membership method, the weighted average method and the center of gravity method (COG). The maximum membership method takes the result with the largest membership degree among all results as the final decision; it is simple to implement but has poor precision. COG is more commonly used; it takes the position of the center of gravity of the result set as the result;
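Two of the defuzzification methods named above can be sketched over a sampled output universe. This is a discrete approximation written for illustration; the function names are hypothetical, and averaging tied maxima in the maximum membership method is an assumption, since the text does not say how ties are resolved.

```python
from typing import List


def defuzzify_cog(universe: List[float], membership: List[float]) -> float:
    """Center of gravity: membership-weighted mean of the sample points."""
    total = sum(membership)
    if total == 0:
        raise ValueError("no rule fired; the result set is empty")
    return sum(y * m for y, m in zip(universe, membership)) / total


def defuzzify_max(universe: List[float], membership: List[float]) -> float:
    """Maximum membership: the point with the largest membership degree.

    Assumption: if several points tie, their mean is returned.
    """
    peak = max(membership)
    peaks = [y for y, m in zip(universe, membership) if m == peak]
    return sum(peaks) / len(peaks)


# Illustrative values only.
sample_universe = [0.0, 25.0, 50.0, 75.0, 100.0]
sample_membership = [0.2, 0.2, 0.2, 0.8, 0.8]
cog_value = defuzzify_cog(sample_universe, sample_membership)
max_value = defuzzify_max(sample_universe, sample_membership)
```

The two methods generally disagree: COG takes the low-membership region into account, while the maximum membership method ignores everything below the peak, which is one way to see why it is described as less precise.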
The fuzzy logic engine implements multiple defuzzification algorithms; in jFuzzyLogic, center of gravity defuzzification is selected simply by setting the value of DEFUZZIFY METHOD to COG in the configuration file. The result of center of gravity defuzzification is a single numeric value, which is compared with the two thresholds set in the initialization phase to reach the final decision: scale out, scale in, or remain unchanged.
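The final comparison against the two thresholds reduces to a three-way decision. In the sketch below the threshold names follow the claims, while the function name, the returned strings, and the check that the scale-in threshold is below the scale-out threshold are assumptions made for the example.

```python
def decide(
    score: float,
    threshold_scale_in: float,
    threshold_scale_out: float,
) -> str:
    """Map the defuzzified value to a scaling decision."""
    # Assumption: the two thresholds must leave a non-empty middle band.
    if threshold_scale_in >= threshold_scale_out:
        raise ValueError("threshold_scale_in must be below threshold_scale_out")
    if score < threshold_scale_in:
        return "scale_in"     # node can be reclaimed
    if score > threshold_scale_out:
        return "scale_out"    # node is a bottleneck and needs extending
    return "unchanged"
```

With hypothetical thresholds of 30 and 70, a defuzzified value of 82 would trigger a scale-out, while a value of 50 would leave the node as it is.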
CN201610835764.1A 2016-09-20 2016-09-20 A kind of bottleneck node detection method of extensive stream data processing system Active CN106506254B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610835764.1A CN106506254B (en) 2016-09-20 2016-09-20 A kind of bottleneck node detection method of extensive stream data processing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610835764.1A CN106506254B (en) 2016-09-20 2016-09-20 A kind of bottleneck node detection method of extensive stream data processing system

Publications (2)

Publication Number Publication Date
CN106506254A CN106506254A (en) 2017-03-15
CN106506254B true CN106506254B (en) 2019-04-16

Family

ID=58291455

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610835764.1A Active CN106506254B (en) 2016-09-20 2016-09-20 A kind of bottleneck node detection method of extensive stream data processing system

Country Status (1)

Country Link
CN (1) CN106506254B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109669436B (en) * 2018-12-06 2021-04-13 广州小鹏汽车科技有限公司 Test case generation method and device based on functional requirements of electric automobile
CN112148566B (en) * 2020-11-09 2023-07-25 中国平安人寿保险股份有限公司 Method and device for monitoring computing engine, electronic equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1345149A (en) * 2000-08-07 2002-04-17 香港科技大学 Flow-type data method and device
CN102404399A (en) * 2011-11-18 2012-04-04 浪潮电子信息产业股份有限公司 Fuzzy dynamic allocation method for cloud storage resource
CN102624870A (en) * 2012-02-01 2012-08-01 北京航空航天大学 Intelligent optimization algorithm based cloud manufacturing computing resource reconfigurable collocation method
CN103491024A (en) * 2013-09-27 2014-01-01 中国科学院信息工程研究所 Job scheduling method and device for streaming data
CN103530189A (en) * 2013-09-29 2014-01-22 中国科学院信息工程研究所 Automatic scaling and migrating method and device oriented to stream data
CN103853766A (en) * 2012-12-03 2014-06-11 中国科学院计算技术研究所 Online processing method and system oriented to streamed data
CN105069025A (en) * 2015-07-17 2015-11-18 浪潮通信信息系统有限公司 Intelligent aggregation visualization and management control system for big data
CN105721199A (en) * 2016-01-18 2016-06-29 中国石油大学(华东) Real-time cloud service bottleneck detection method based on kernel density estimation and fuzzy inference system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A matching algorithm for network streaming text data (一种针对网络流式文本数据的匹配算法); Lin Jianqiu et al.; Journal of Qiqihar University; 2005-06-30; pp. 37-41

Also Published As

Publication number Publication date
CN106506254A (en) 2017-03-15

Similar Documents

Publication Publication Date Title
US10515002B2 (en) Utilizing artificial intelligence to test cloud applications
CN109345377B (en) Data real-time processing system and data real-time processing method
Huang et al. Stochastic configuration networks based adaptive storage replica management for power big data processing
CN104317658B (en) A kind of loaded self-adaptive method for scheduling task based on MapReduce
CN107404523A (en) Cloud platform adaptive resource dispatches system and method
KR102522005B1 (en) Apparatus for VNF Anomaly Detection based on Machine Learning for Virtual Network Management and a method thereof
US11132293B2 (en) Intelligent garbage collector for containers
JP2016100005A (en) Reconcile method, processor and storage medium
CN113590451B (en) Root cause positioning method, operation and maintenance server and storage medium
CN103354990B (en) The system and method for the virtual machine in process cloud platform
CN106506254B (en) A kind of bottleneck node detection method of extensive stream data processing system
JP2022017588A (en) Training method of deep-running framework, device, and storage medium
CN115564071A (en) Method and system for generating data labels of power Internet of things equipment
KR101686919B1 (en) Method and apparatus for managing inference engine based on big data
Dogani et al. K-agrued: a container autoscaling technique for cloud-based web applications in kubernetes using attention-based gru encoder-decoder
Gao et al. Workload prediction of cloud workflow based on graph neural network
CN110515716B (en) Cloud optimization scheduling method and system supporting priority and inverse affinity
Chehida et al. Applied statistical model checking for a sensor behavior analysis
US11782923B2 (en) Optimizing breakeven points for enhancing system performance
Yongdnog et al. A scalable and integrated cloud monitoring framework based on distributed storage
CN106713051A (en) Network management system
McGough et al. Using machine learning in trace-driven energy-aware simulations of high-throughput computing systems
CN114529017A (en) Steam turbine fault maintenance system, maintenance method and electronic equipment
Zhang et al. Research on the construction and robustness testing of SaaS cloud computing data center based on the MVC design pattern
CN114443205B (en) Fault analysis method, device and non-transitory computer readable storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant