CN110740054A - data center virtualization network fault diagnosis method based on reinforcement learning - Google Patents

data center virtualization network fault diagnosis method based on reinforcement learning

Info

Publication number
CN110740054A
CN110740054A (application CN201910644115.7A)
Authority
CN
China
Prior art keywords
fault
network
action
fault diagnosis
information
Prior art date
Legal status
Granted
Application number
CN201910644115.7A
Other languages
Chinese (zh)
Other versions
CN110740054B (en)
Inventor
东方
沈典
张欢欢
王士琦
罗军舟
Current Assignee
Southeast University
Original Assignee
Southeast University
Priority date
Filing date
Publication date
Application filed by Southeast University
Priority to CN201910644115.7A
Publication of CN110740054A
Application granted
Publication of CN110740054B
Legal status: Active

Classifications

    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/142 Network analysis or design using statistical or mathematical methods
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network


Abstract

The invention discloses a data center virtualization network fault diagnosis method based on reinforcement learning, which comprises the following steps: 1) initializing a network fault diagnosis model; 2) training a Q table with a reinforcement learning algorithm according to a set fault diagnosis target, where the Q table records the accumulated discounted reward obtained by taking each action under each fault; 3) when a fault occurs, mapping the collected network state information to a network state in the Q table, querying the Q table by that state, and selecting the action with the maximum reward value as the fault diagnosis result; 4) further optimizing the network state space with an information gain method, which reduces the memory cost of the model and improves the diagnosis precision.

Description

data center virtualization network fault diagnosis method based on reinforcement learning
Technical Field
The invention belongs to the field of data center networks and reinforcement learning, and particularly relates to a diagnosis method that uses a reinforcement learning algorithm to diagnose data center virtualized network faults.
Background
Data centers connect servers into large-scale clusters through the network and provide mass storage capacity and powerful computing capacity to upper-layer applications through on-demand allocation and elastic scaling.
However, network failures cause problems such as extended task completion time, slow response and unavailable services for applications, which degrades user experience and reduces data center availability. Taking a cache system as an example: the cache system places recently or frequently accessed content in memory, and when a request arrives, reading the cache first speeds up the application response and relieves the access pressure on the servers; a network failure (such as the network becoming unreachable) causes user requests to be sent directly to the servers, the server load rises sharply, the response to some user requests slows down, and the service may even become unavailable. Because the data center network is heterogeneous and its communication is complex, a large number of network failures occur. A Microsoft research team found that an average data center suffers 5.2 device failures and 40.8 link failures per day, that locating and diagnosing each failure takes about 5 minutes, and that the longest failure diagnosis time reaches weeks; such failures cause substantial economic losses and drive up operating costs, so rapid and accurate network fault diagnosis is a key problem for data center operation.
Network fault diagnosis mainly comprises the core steps of information acquisition, model construction and fault identification. Common information acquisition methods fall into two classes: acquisition based on network devices and acquisition based on terminals. Network-device-based acquisition implements the collection function on switches, using techniques such as packet coloring, packet sampling and packet mirroring.
The existing network fault diagnosis research work mainly targets traditional networks. With the development of virtualization technology, virtualized networking has become the mainstream way to build data center networks in recent years. A virtual network is an abstraction of the traditional network: a user customizes a private network on a shared physical network by building virtual devices, virtual links and virtual machines, thereby enabling communication between virtual machines under a specific network topology. The virtualized network resides in the servers and mainly consists of software-defined network devices, virtual links, virtual machines and the like. It achieves connectivity between virtual machines by introducing a large number of virtual devices: the TAP device acts as a virtual network card and connects the virtual machine to external devices, and Open vSwitch (OVS) acts as a virtual bridge, providing packet forwarding, flow control and other functions. The virtualized network is characterized by frequently changing network state and by virtual devices that share server resources through configuration parameters. To make full use of resources, a cloud data center migrates virtual machines frequently; research shows that a large data center may migrate thousands of virtual machines per second, and frequent migration causes wide-ranging changes in the network state of the data center, which may trigger faults. Meanwhile, the virtual network relies on a large number of virtual devices to connect virtual machines; these devices are deployed in the servers, share server resources, and allocate the available resources through globally configured parameters such as flow table priorities, routing and forwarding rules and queue lengths, so reasonable parameter configuration and tuning are required for high-quality communication in the virtual network. The use of virtualized networks therefore poses new problems and challenges for traditional network fault diagnosis, mainly in two aspects:
the conventional data center network has the characteristics of stable network communication and relatively few temporary faults, and the data acquisition scale or data quality loss caused by reducing the information acquisition granularity has low influence on the network fault diagnosis precision, so the conventional research work mainly uses a sampling acquisition or information periodic uploading method to reduce the acquisition cost.
On the other hand, virtual devices share server resources through configuration parameters, so a single network fault degrades the performance of several devices at once, the boundaries between different fault signatures become blurred, and a large number of faults with similar characteristics appear; for example, CPU contention and memory-bandwidth contention both make the network card and the TUN device drop packets. In existing research on data center network fault diagnosis, modeling methods based on packet paths diagnose faults such as unreachability and loops effectively, but their information acquisition cost is high and their diagnosis range is limited. Classification models built with machine learning methods suffer from incomplete training data and low data quality, and they sacrifice accuracy during training to avoid overfitting, so the precision of fault diagnosis models trained in this way is difficult to improve and faults with similar characteristics are hard to diagnose accurately. Moreover, although such a fault diagnosis model can identify the fault type, it has difficulty attributing the fault to its cause on a specific virtual device, which makes the diagnosis result hard to act on.
Existing fault diagnosis methods are therefore severely limited when applied to fault diagnosis of virtualized networks: their information acquisition overhead and fault diagnosis precision cannot meet the low-overhead, high-precision diagnosis requirements of the data center virtualized network.
Disclosure of Invention
The invention aims to provide a data center virtualization network fault diagnosis method based on reinforcement learning that overcomes the problems, pointed out in the background art, of high information acquisition overhead and low fault diagnosis precision that arise when existing fault diagnosis methods are applied to virtualized networks.
In order to achieve the above purpose, the solution of the invention is:
A data center virtualization network fault diagnosis method based on reinforcement learning comprises the following steps:
step 1, initializing a network fault diagnosis model;
step 2, training a Q table by adopting a reinforcement learning algorithm according to a set fault diagnosis target, wherein the Q table records the accumulated discount reward value obtained by taking each action under each fault;
step 3, when a fault occurs, mapping the network state information to the network state in the Q table, inquiring the Q table according to the network state, and selecting an action as a fault diagnosis result according to the principle of maximum reward value;
step 4, the information gain method is used to further optimize the network state space.
The specific process of the step 1 is as follows:
step 11, representing the virtualized network environment with a 1028-dimensional vector composed of server operating environment information, virtual device parameter configuration information and virtual machine network information;
step 12, dividing each dimension of the data with an equal-width interval radius r = <r_1, r_2, …, r_d> to construct the network state space set, where r_d denotes the division interval of the d-th dimension and d is the number of virtualized network features;
step 13, setting the action set: the executable action set contains 21 instructions, and each instruction represents the solution to one fault;
step 14, selecting action a_t with an ε-greedy exploration strategy during action selection, balancing the training time and the fault diagnosis precision of the model;
step 15, updating the training memory in round (episode) mode: the immediate reward is R if the action resolves the fault and 0 otherwise; after each fault is resolved, the system updates the Q table with the following formula:
Q_n(S_n, a_n) ← Q_n(S_n, a_n) + α [ R + γ max_a Q(S_{n+1}, a) − Q_n(S_n, a_n) ]
where γ ∈ (0,1) is the discount rate, α ∈ (0,1) is the learning rate, R denotes the immediate reward obtained after selecting action a_n in state S_n, Q_n denotes the value recorded in the Q table for action a_n in state S_n, and Q_n(S_n, a_n) denotes the accumulated reward value after selecting action a_n in state S_n.
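The following Python sketch illustrates steps 11 to 15 under stated assumptions: the discretization helper, the ε-greedy selector and the episode update mirror the description above, while the constants ALPHA, GAMMA, EPSILON, R and the function names are illustrative choices, not values or interfaces fixed by the patent.

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON, R = 0.1, 0.9, 0.1, 1.0    # illustrative hyper-parameters
N_ACTIONS = 21                                    # step 13: 21 repair instructions

def discretize(features, radii):
    """Step 12: map a d-dimensional feature vector to a discrete network state
    by dividing each dimension with an equal-width interval r_d."""
    return tuple(int(x // r) for x, r in zip(features, radii))

Q = defaultdict(lambda: [0.0] * N_ACTIONS)        # Q table: state -> value per action

def select_action(state):
    """Step 14: epsilon-greedy exploration."""
    if random.random() < EPSILON:
        return random.randrange(N_ACTIONS)
    values = Q[state]
    return values.index(max(values))

def update_episode(trajectory):
    """Step 15: round (episode) update after the fault is resolved.
    `trajectory` is the list of (state, action) pairs of one episode;
    only the final action receives the immediate reward R."""
    for i, (s, a) in enumerate(trajectory):
        if i == len(trajectory) - 1:
            target = R                                   # fault resolved
        else:
            next_state = trajectory[i + 1][0]
            target = 0.0 + GAMMA * max(Q[next_state])    # intermediate reward is 0
        Q[s][a] += ALPHA * (target - Q[s][a])
```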
The step 2 of training the Q table by using the reinforcement learning algorithm specifically comprises the following steps:
step 21, injecting a fault into the virtualized network by using a fault injection mode;
step 22, identifying network abnormality by using a network fault perception model and sending a diagnosis request to a fault diagnosis server;
step 23, the fault diagnosis server maps the multi-dimensional information in the diagnosis request to a state space in the Q table after preprocessing;
step 24, selecting an action by using an ε-greedy exploration strategy and transmitting the action to a fault server;
step 25, the fault server executes the issued action, judges whether the fault is solved by using a fault perception model, and feeds back a perception result to the fault diagnosis server;
step 26, if the fault is solved, updating the Q table, and turning to step 27; if the fault is not resolved, repeating steps 22-26;
step 27, inject the next fault, and repeat steps 21 through 27 until the Q table converges.
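A hedged sketch of the training loop in steps 21 to 27, reusing the Q-table helpers from the previous sketch; inject_fault, collect_features, execute_action and fault_resolved stand in for the fault injection tool, the collection agent, the action executor and the fault perception model, and are assumptions rather than interfaces defined by the patent.

```python
def train(n_faults, radii, inject_fault, collect_features,
          execute_action, fault_resolved):
    """Steps 21-27: inject faults one at a time and diagnose each until it is
    resolved; a convergence check on the Q table is omitted for brevity."""
    for _ in range(n_faults):                    # step 27: move on to the next fault
        inject_fault()                           # step 21: fault injection tool
        trajectory = []
        while True:
            features = collect_features()        # step 22: anomaly triggers a request
            state = discretize(features, radii)  # step 23: map to a Q-table state
            action = select_action(state)        # step 24: epsilon-greedy choice
            execute_action(action)               # step 25: fault server runs the command
            trajectory.append((state, action))
            if fault_resolved():                 # step 25: perception model re-checks
                break
        update_episode(trajectory)               # step 26: episode update of the Q table
```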
In step 3, the classification model trained with a decision tree algorithm is used to identify network faults; it is deployed on all information acquisition servers, and the information acquired in real time is fed into the model to identify the network fault.
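As an illustration only, such a decision-tree perception model could be trained with scikit-learn; the patent does not name a library, and the feature matrix X and labels y below are placeholders for locally collected data.

```python
from sklearn.tree import DecisionTreeClassifier

def build_perception_model(X, y):
    """X: n_samples x d matrix of collected metrics; y: 0 = normal, 1 = faulty."""
    clf = DecisionTreeClassifier(max_depth=8)  # a shallow tree keeps the edge-side cost low
    clf.fit(X, y)
    return clf

def is_faulty(model, sample):
    """Runs on each collection server; only faulty samples trigger a diagnosis request."""
    return bool(model.predict([sample])[0])
```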
The specific steps of the step 4 are as follows:
step 41, setting a memory use constraint condition L;
step 42, for the optimal action in any network state, the Q value after the Q table converges is (n denotes the number of iterations):
Q = R(1 − (1 − α)^n) ≤ R
for a non-optimal action in any network state, the Q value after the Q table converges is:
Q = γR(1 − (1 − α)^n) ≤ γR
step 43, for a network state containing multiple faults, the Q values of the different actions in the Q table are:
Q_S = <R, R, …, R, γR, γR, …, γR>
and the number of entries equal to R in Q_S identifies a multi-fault state;
step 44, during Q table training, counting all the training samples X = (x_1, x_2, …, x_d) received by the Q table, where x_d denotes the value of the d-th attribute; assuming that X is divided into state S and that action a_t is selected, the data X divided into state S and the action a_t selected under it form the sample data T = (X, a_t), with a_t as the category label of the new sample;
step 45, setting the division boundary condition using the fault ratio and the number of samples in state S:
M ≤ Q
or
max_i (c_i / M) ≥ θ
where c_i denotes the number of samples X corresponding to each action a_t, Q is the minimum number of samples required for splitting, θ is the fault ratio threshold, and M, the number of samples divided into state S, is:
M = Σ_{i=1}^{m} c_i
where m is the number of faults obtained according to step 43;
step 46, for a state that does not satisfy the boundary condition of step 44, using the information gain to compute the gain of each feature in state S, selecting the feature with the maximum information gain to split the state into two new states, and retraining the model;
and step 47, repeating the steps 43 to 46, and constructing an optimal network state space under the condition that the memory constraint condition is met.
In step 45, the calculation formula of the information gain is as follows:
Ent(D) = − Σ_{k=1}^{γ} p_k log₂ p_k
Gain(D, d) = Ent(D) − Σ_{v=1}^{V} (|D^v| / |D|) Ent(D^v)
where Ent(D) denotes the information entropy of sample set D, p_k is the proportion of samples of the k-th class in D (k = 1, 2, …, γ, with γ the number of classes), and Gain(D, d) denotes the information gain obtained when attribute d is used to divide D. The value range of d is handled by the bisection method: for the value range f(X) of an attribute X, sort its values from small to large as (x_1, x_2, …, x_n); the candidate partition nodes constructed by the dichotomy are then
T = { (x_i + x_{i+1}) / 2 | 1 ≤ i ≤ n − 1 }
Dividing D by attribute d produces V branches, where the v-th branch contains all samples of D whose value of attribute d is D^v, denoted D^v.
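A small sketch of the information-gain computation defined above: the entropy of the class labels in a state, the bisection candidate split points, and the gain of splitting one continuous feature; the function names are illustrative, not part of the patent.

```python
import math
from collections import Counter

def entropy(labels):
    """Information entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def candidate_splits(values):
    """Bisection method: midpoints of consecutive distinct sorted values."""
    v = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(v, v[1:])]

def best_split(feature_values, labels):
    """Return (gain, threshold) of the best binary split of one feature."""
    base, n = entropy(labels), len(labels)
    best = (0.0, None)
    for t in candidate_splits(feature_values):
        left = [l for x, l in zip(feature_values, labels) if x <= t]
        right = [l for x, l in zip(feature_values, labels) if x > t]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if gain > best[0]:
            best = (gain, t)
    return best
```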
By adopting the above scheme, the invention addresses the large information acquisition overhead and low fault diagnosis precision of existing virtual network fault diagnosis methods, which arise because the virtual network state changes frequently and the virtual devices share server resources through parameter configuration. The core logic includes: constructing the fault diagnosis system framework, acquiring system information, and sensing and diagnosing faults. First, a Q table is built with a reinforcement learning algorithm according to the set network fault diagnosis target; then the data center virtualized network is monitored for faults with the information acquisition module and the fault perception model, the fault information is mapped to a network state in the Q table, and an action is selected according to the maximum reward principle to diagnose the network fault.
Compared with the prior art, the invention has the following advantages:
(1) the invention deploys a fault perception model on the edge servers, filtering normal network state information during information acquisition and reducing the information acquisition overhead;
(2) the invention uses a reinforcement learning algorithm to establish the relationship between data center virtualized network faults and their solutions, and can effectively identify the large number of faults with similar characteristics in the virtualized network;
(3) the invention further optimizes the network state space in the Q table with an information gain method, which effectively reduces the memory overhead of the model and improves the fault diagnosis precision.
Drawings
FIG. 1 is a schematic diagram of a reinforcement learning-based virtualized network fault diagnosis model training process according to the present invention;
FIG. 2 is a schematic diagram of a reinforcement learning based virtualization network fault diagnosis framework of the present invention;
FIG. 3 is a schematic diagram of a fault diagnosis model module of the present invention;
FIG. 4 is a flow chart of the fault diagnosis model training and use of the present invention.
Detailed Description
The technical solution and the advantages of the present invention will be described in detail with reference to the accompanying drawings.
The invention provides a data center virtualization network fault diagnosis method based on reinforcement learning, which comprises four parts: construction of the reinforcement-learning-based fault diagnosis framework, information acquisition, fault perception, and fault diagnosis. The method specifically proceeds as follows:
In the training process of the reinforcement-learning-based virtualized network fault diagnosis model, shown in fig. 1, the invention applies a reinforcement learning algorithm to data center network fault diagnosis and achieves low-overhead, high-precision diagnosis of virtualized network faults. During training, an action is selected according to the policy in each state, the action is then verified against the environment, and its quality is measured by the reward value fed back. If the states in the reinforcement learning model are defined as nodes and the action sets as edges, the training process is as shown in fig. 1. Nodes a, b, c, d in the figure represent states, b is the terminal state, and A1, A2 represent action sets; the node relations, edge directions and weights in the graph are determined by executing actions and observing the feedback. When the values in the <state, action> table no longer change, or change only slightly, model training is complete. When the model is used, the action with the maximum expected benefit recorded in the table is selected greedily each time; for example, action A1 is selected in state a. The goal of the training process is to learn the table on the right side of fig. 1, where the vertical axis lists the state space S = {a, b, c, d}, the horizontal axis lists the action set A = {A1, A2}, and each entry represents the expected benefit of selecting action A in state S.
To address the low diagnosis precision for faults with similar characteristics, the ideal network state division is one in which each state corresponds to a single fault. In the four states {a, b, c, d}, {a, c, d} are fault states and b is the normal state; once the normal state is reached, the current round of diagnosis ends. A fault in the virtualized network may degrade performance in several places (i.e., have multiple possible causes): for example, network performance drops when CPU contention occurs, when the network card cache queue is too small, when the CPU processes packets in the network card queue slowly, or when the application inside the virtual machine responds slowly. To identify these numerous fault causes, multiple actions may actually need to be executed. Formally:
∀ S_E ∃ A_k : S_E —A_k→ S_H
where S_H denotes the normal state, S_E denotes a fault state, and A_k denotes an action execution sequence; that is, for any fault state there exists an action sequence A_k that brings the network from the fault state back to the normal state.
Based on the above analysis, the present invention designs a schematic diagram of a reinforcement learning-based virtualized network fault diagnosis framework as shown in fig. 2, and the main steps of the framework are as follows:
Step A1, the information acquisition module and the fault perception module are deployed on the servers of the data center, and the fault diagnosis module is deployed on the fault diagnosis server.
Step A2, a fault injection tool is used to inject various types of faults into all servers.
Step A3, the fault perception model identifies the fault and sends a fault diagnosis request to the fault diagnosis server.
Step A4, the fault diagnosis server maps the fault information to the network state space of the Q table, selects a suitable action from the action set using the ε-greedy exploration strategy, and sends the action to the fault server.
Step A5, the fault server receives the issued action and executes it.
Step A6, the network state information of the next period is collected, the fault perception model performs fault perception, and the perception result is sent to the fault diagnosis server.
Step A7, the fault diagnosis server updates the record in the Q table according to the feedback result.
Step A8, steps A1 through A7 are repeated until the Q table converges.
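A hedged sketch of the diagnosis-server side of steps A3-A4, reusing Q, discretize and select_action from the earlier sketch; the dictionary-based request format is an assumption for illustration only.

```python
def diagnose(request, radii, explore=False):
    """Map a diagnosis request to a Q-table state and pick an action:
    epsilon-greedy while training (step A4), greedy max-reward selection
    when the trained model is used online (step 3)."""
    state = discretize(request["features"], radii)
    if explore:
        action = select_action(state)         # epsilon-greedy during training
    else:
        values = Q[state]
        action = values.index(max(values))    # maximum-reward principle online
    return {"server": request["server"], "action": action}
```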
According to the analysis of fig. 1 and fig. 2, the precision of the network state space division in the Q table directly determines the precision of fault diagnosis. Fine-grained division improves diagnosis precision but increases the memory overhead of the model; coarse-grained division reduces the memory overhead, but some of the divided network states then contain several network faults, which lowers diagnosis precision. Analysis of virtualized network states shows that most of the network state space corresponds to normal states that do not need to be stored, and that if the network states containing several faults can be divided further in stages, the precision of the state space division can be improved effectively. The main steps for improving the precision of network state space division are therefore as follows:
Step B1, identifying multi-fault states. The values of the actions after the Q table converges are computed from the update formula
Q_n(S_n, a_n) ← Q_n(S_n, a_n) + α [ R + γ max_a Q(S_{n+1}, a) − Q_n(S_n, a_n) ]
For the optimal action in any network state, the Q value after the Q table converges is
Q = R(1 − (1 − α)^n) ≤ R
For a non-optimal action in any network state, the Q value after the Q table converges is
Q = γR(1 − (1 − α)^n) ≤ γR
Therefore, for a network state containing multiple faults, the Q values of the different actions in the Q table are
Q_S = <R, R, …, R, γR, γR, …, γR>
By counting the number of entries in Q_S greater than γR, a multi-fault network state can be identified.
Step B2, setting the splitting condition. For any state S_t ∈ S, when the Q table receives a sample X = (x_1, x_2, …, x_d), where x_d denotes the value of the d-th attribute, suppose X is divided into state S_t and action a_t is selected; the data X divided into S_t and the action a_t selected under it then form the sample data T = (X, a_t), with a_t as the category label of the new sample. Suppose the faults occurring in S_t correspond to the sample counts c_1, c_2, …, c_m. Treating the samples of each fault in S_t as equally likely, the probability of each fault in S_t can be represented by its share of the fault samples, i.e.
p_i = c_i / M
where the number of samples divided into state S_t is defined as
M = Σ_{i=1}^{m} c_i
State S_t does not need to be split when either of the following conditions holds:
M ≤ Q
or
max_i (c_i / M) ≥ θ
where Q is the minimum number of samples required for splitting and θ is the fault split ratio. That is, if the total number of samples in S_t is small, the state occurs with low probability; and if the ratio of some fault exceeds θ, a single fault dominates the state; in both cases state S_t does not need to be split.
In step B3, information entropy is used to measure sample purity and information gain is used to decide how to split an attribute. Suppose S_t = (D_1, D_2, …, D_d), where D_k is the value interval of the k-th feature. Borrowing the treatment of continuous values in decision tree construction, the dichotomy is used to discretize continuous values before computing the information gain; the information gain of each of the d features of S_t = (D_1, D_2, …, D_d) with respect to the different classes is computed according to the formula in step 47, and suppose the maximum information gain is obtained at value z of the k-th feature D_k. State S_t is then split on its k-th attribute at the value z into the two states
S_t1 = (D_1, …, D_k ∩ (−∞, z], …, D_d)
and
S_t2 = (D_1, …, D_k ∩ (z, +∞), …, D_d)
the two states are added to the Q-table and the model continues to be trained.
In order to implement the reinforcement-learning-based virtualized network fault diagnosis model, the invention implements the fault diagnosis model modules shown in fig. 3. The fault diagnosis server mainly deploys: an automatic fault injection module, a diagnosis process rollback module, a model training module and a communication agent module. The modules deployed on the edge server mainly comprise: an information acquisition module, a fault perception module, an action instruction execution module and a communication agent module.
(1) Action execution module
The reinforcement learning module issues an action to the fault server according to the policy, and the fault server executes the action to verify the diagnosis result. However, not every action designed here is a complete instruction. An action such as `sudo ethtool -G eth0 rx 1024` can be executed directly after being sent to the fault server, but several instructions require the fault server to complete them. Take the cpulimit instruction as an example: in `cpulimit -p PID -l limit`, PID is a process number and limit is the CPU usage cap (set to 10 here), so the whole instruction limits the CPU utilization of process PID to 10%. When the fault diagnosis module issues this instruction, the fault server must first obtain the current system process list, find the PID of the process using the most CPU, splice that PID into the cpulimit instruction, and execute it. The process information is obtained by executing `top -n 1 -b`; the action execution module implemented in this section runs `top -n 1 -b`, parses the output of the top instruction to find the process with the highest CPU utilization, splices its PID into the cpulimit instruction, and executes the resulting command.
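A hedged sketch of the two-stage action described above: run `top -b -n 1`, take the PID with the highest CPU usage, and splice it into a cpulimit command. The column positions assume the default top batch layout and may need adjusting on other systems.

```python
import subprocess

def pid_with_highest_cpu():
    """Run `top -b -n 1` and return the PID of the process with the highest %CPU."""
    out = subprocess.run(["top", "-b", "-n", "1"],
                         capture_output=True, text=True).stdout
    lines = out.splitlines()
    # the process table starts after the header line that begins with "PID"
    start = next(i for i, l in enumerate(lines) if l.lstrip().startswith("PID")) + 1
    rows = [l.split() for l in lines[start:] if l.strip()]
    rows.sort(key=lambda r: float(r[8].replace(",", ".")), reverse=True)  # col 9 = %CPU
    return rows[0][0]                                                     # col 1 = PID

def limit_busiest_process(limit=10):
    """Splice the busiest PID into the cpulimit instruction and launch it."""
    pid = pid_with_highest_cpu()
    # cpulimit keeps running to enforce the cap, so start it without blocking
    return subprocess.Popen(["cpulimit", "-p", pid, "-l", str(limit)])
```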
(2) Fault automation injection module and fault diagnosis process rollback module
In order to accelerate model training, once the action selected by the fault diagnosis module resolves the fault, all previously executed actions must be rolled back immediately and a new fault injected; the timed tasks provided by Linux cannot achieve such fine-grained fault injection. Following the theoretical part of chapter four, the round (episode) update of the Q value is used, so the complete diagnosis process is memorized during training and the Q values of all operations are updated after the diagnosis ends. All executed actions can therefore be recovered by examining the diagnosis record, rolled back, and a new fault selected for injection. Specifically, assume the command sequence executed in a finished diagnosis round is: 1. increase the network card receive queue buffer; 2. limit the CPU usage of a process with cpulimit. The network card buffer adjustment can be rolled back directly to the original parameter. The CPU limit requires restoring the process execution state of the Linux server: in this embodiment the real operating environment is simulated with stress-ng and MBW, and the process with the highest CPU utilization found when the cpulimit instruction was executed is necessarily a process simulated by stress-ng. Therefore, during instruction recovery this embodiment directly kills all relevant processes, re-executes the environment simulation instruction to restore the original running state, and then executes a new fault injection instruction. The implementation of this module increases the fault injection rate and shortens the model training time.
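A sketch of the rollback and re-injection flow under stated assumptions: the original ring-buffer size (512) and the stress-ng arguments are illustrative, and inject_next_fault is a placeholder for the fault injection module.

```python
import subprocess

def rollback_and_reinject(inject_next_fault, rx_default=512):
    """Undo the executed action sequence, restore the simulated environment,
    then inject the next fault for the following training round."""
    # action 1 rollback: restore the NIC receive ring buffer to its old size
    subprocess.run(["ethtool", "-G", "eth0", "rx", str(rx_default)])
    # action 2 rollback: remove the CPU cap and the simulated workload it targeted
    subprocess.run(["pkill", "-f", "cpulimit"])
    subprocess.run(["pkill", "-f", "stress-ng"])
    # re-run the environment simulation so the server returns to its baseline state
    subprocess.Popen(["stress-ng", "--cpu", "2"])
    # inject the next fault
    inject_next_fault()
```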
(3) Network state space division and training module
The specific implementation process is as follows: first check whether the memory limit is satisfied; if the limit is exceeded, training ends. Otherwise, check whether any network state in the Q table contains several faults; if so, evaluate whether the state division condition is met, and if the state needs further division, add it to an Error queue. If the Error queue is empty, model training ends; if it is not empty, compute the information gain of each feature according to the network state division algorithm, split the state on the feature with the largest information gain into two states, add them to the Q table, and delete the pre-split state from the Q table.
(4) The overall process of model training is shown in fig. 4, and the main steps are as follows:
step C1, training a Q table;
step C2, checking whether the network state in the Q table meets the memory use constraint;
step C3, if not, the model training is finished;
step C4, if the memory use constraint is satisfied, traversing all the states to check whether splitting is needed;
step C5, if the splitting is not needed, the model training is finished;
step C6, if splitting is needed, splitting the network state by using an information gain mode, and adding the split network state into a network state space;
and C7, repeating the steps C1 to C6, solving the optimal network state space division method meeting the memory use constraint, and improving the model precision.
The above embodiments are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modifications made on the basis of the technical scheme according to the technical idea of the present invention fall within the protection scope of the present invention.

Claims (6)

1. A data center virtualization network fault diagnosis method based on reinforcement learning, characterized by comprising the following steps:
step 1, initializing a network fault diagnosis model;
step 2, training a Q table by adopting a reinforcement learning algorithm according to a set fault diagnosis target, wherein the Q table records the accumulated discount reward value obtained by taking each action under each fault;
step 3, when a fault occurs, mapping the network state information to the network state in the Q table, inquiring the Q table according to the network state, and selecting an action as a fault diagnosis result according to the principle of maximum reward value;
step 4, the information gain method is used to further optimize the network state space.
2. The reinforcement learning-based data center virtualization network fault diagnosis method of claim 1, wherein: the specific process of the step 1 is as follows:
step 11, representing the virtualized network environment with a 1028-dimensional vector composed of server operating environment information, virtual device parameter configuration information and virtual machine network information;
step 12, dividing each dimension of the data with an equal-width interval radius r = <r_1, r_2, …, r_d> to construct the network state space set, where r_d denotes the division interval of the d-th dimension and d is the number of virtualized network features;
step 13, setting the action set: the executable action set contains 21 instructions, and each instruction represents the solution to one fault;
step 14, selecting action a_t with an ε-greedy exploration strategy during action selection, balancing the training time and the fault diagnosis precision of the model;
step 15, updating the training memory in round (episode) mode: the immediate reward is R if the action resolves the fault and 0 otherwise; after each fault is resolved, the system updates the Q table with the following formula:
Q_n(S_n, a_n) ← Q_n(S_n, a_n) + α [ R + γ max_a Q(S_{n+1}, a) − Q_n(S_n, a_n) ]
where γ ∈ (0,1) is the discount rate, α ∈ (0,1) is the learning rate, R denotes the immediate reward obtained after selecting action a_n in state S_n, Q_n denotes the value recorded in the Q table for action a_n in state S_n, and Q_n(S_n, a_n) denotes the accumulated reward value after selecting action a_n in state S_n.
3. The reinforcement learning-based data center virtualization network fault diagnosis method of claim 1, wherein: the step 2 of training the Q table by using the reinforcement learning algorithm specifically comprises the following steps:
step 21, injecting a fault into the virtualized network by using a fault injection mode;
step 22, identifying network abnormality by using a network fault perception model and sending a diagnosis request to a fault diagnosis server;
step 23, the fault diagnosis server maps the multi-dimensional information in the diagnosis request to a state space in the Q table after preprocessing;
step 24, selecting an action by using an ε-greedy exploration strategy and transmitting the action to a fault server;
step 25, the fault server executes the issued action, judges whether the fault is solved by using a fault perception model, and feeds back a perception result to the fault diagnosis server;
step 26, if the fault is solved, updating the Q table, and turning to step 27; if the fault is not resolved, repeating steps 22-26;
step 27, inject the next fault, and repeat steps 21 through 27 until the Q table converges.
4. The reinforcement learning-based data center virtualization network fault diagnosis method of claim 1, wherein: in step 3, a classification model trained with a decision tree algorithm is used to identify network faults; the model is deployed on all information acquisition servers, and the information acquired in real time is fed into the model to identify the network fault.
5. The reinforcement learning-based data center virtualization network fault diagnosis method of claim 1, wherein: the specific steps of the step 4 are as follows:
step 41, setting a memory use constraint condition L;
step 42, for the optimal action in any network state, the Q value after the Q table converges is (n denotes the number of iterations):
Q = R(1 − (1 − α)^n) ≤ R
for a non-optimal action in any network state, the Q value after the Q table converges is:
Q = γR(1 − (1 − α)^n) ≤ γR
step 43, for a network state containing multiple faults, the Q values of the different actions in the Q table are:
Q_S = <R, R, …, R, γR, γR, …, γR>
and the number of entries equal to R in Q_S identifies a multi-fault state;
step 44, during Q table training, counting all the training samples X = (x_1, x_2, …, x_d) received by the Q table, where x_d denotes the value of the d-th attribute; assuming that X is divided into state S and that action a_t is selected, the data X divided into state S and the action a_t selected under it form the sample data T = (X, a_t), with a_t as the category label of the new sample;
step 45, setting the division boundary condition using the fault ratio and the number of samples in state S:
M ≤ Q
or
max_i (c_i / M) ≥ θ
wherein c_i denotes the number of samples X corresponding to each action a_t, Q is the minimum number of samples required for splitting, θ is the fault ratio threshold, and M, the number of samples divided into state S, is:
M = Σ_{i=1}^{m} c_i
where m is the number of faults obtained according to step 43;
step 46, for a state that does not satisfy the boundary condition of step 44, using the information gain to compute the gain of each feature in state S, selecting the feature with the maximum information gain to split the state into two new states, and retraining the model;
and step 47, repeating the steps 43 to 46, and constructing an optimal network state space under the condition that the memory constraint condition is met.
6. The reinforcement learning-based data center virtualization network fault diagnosis method of claim 5, wherein: in step 45, the calculation formula of the information gain is as follows:
Ent(D) = − Σ_{k=1}^{γ} p_k log₂ p_k
Gain(D, d) = Ent(D) − Σ_{v=1}^{V} (|D^v| / |D|) Ent(D^v)
wherein Ent(D) denotes the information entropy of sample set D, p_k is the proportion of samples of the k-th class in D (k = 1, 2, …, γ, with γ the number of classes), and Gain(D, d) denotes the information gain obtained when attribute d is used to divide the sample set D; the value range of d is constructed by the bisection method: for the value range f(X) of some attribute X, sort its values from small to large as (x_1, x_2, …, x_n), and the candidate partition nodes constructed by the dichotomy are:
T = { (x_i + x_{i+1}) / 2 | 1 ≤ i ≤ n − 1 }
dividing D by attribute d produces V branches, wherein the v-th branch contains all samples of D whose value of attribute d is D^v, denoted D^v.
CN201910644115.7A 2019-07-17 2019-07-17 Data center virtualization network fault diagnosis method based on reinforcement learning Active CN110740054B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910644115.7A CN110740054B (en) 2019-07-17 2019-07-17 Data center virtualization network fault diagnosis method based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910644115.7A CN110740054B (en) 2019-07-17 2019-07-17 Data center virtualization network fault diagnosis method based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN110740054A true CN110740054A (en) 2020-01-31
CN110740054B CN110740054B (en) 2022-04-01

Family

ID=69237784

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910644115.7A Active CN110740054B (en) 2019-07-17 2019-07-17 Data center virtualization network fault diagnosis method based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN110740054B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106411749A (en) * 2016-10-12 2017-02-15 国网江苏省电力公司苏州供电公司 Path selection method for software defined network based on Q learning
CN106603293A (en) * 2016-12-20 2017-04-26 南京邮电大学 Network fault diagnosis method based on deep learning in virtual network environment
CN108092804A (en) * 2017-12-08 2018-05-29 国网安徽省电力有限公司信息通信分公司 Power telecom network maximization of utility resource allocation policy generation method based on Q-learning
CN110011876A (en) * 2019-04-19 2019-07-12 福州大学 A kind of network measure method of the Sketch based on intensified learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHANGHUANHUAN: "Virtual Network Fault Diagnosis Mechanism", 《PROCEEDINGS OF THE 2017 IEEE 21ST INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN》 *
卞辉: "基于Q学习的无线传感网络自愈算法", 《电子设计工程》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112801272A (en) * 2021-01-27 2021-05-14 北京航空航天大学 Fault diagnosis model self-learning method based on asynchronous parallel reinforcement learning
CN113114354A (en) * 2021-04-16 2021-07-13 河南工业大学 Method for simultaneously positioning optical switch structure switch and optical link fault in optical data center
TWI821666B (en) * 2021-05-13 2023-11-11 中華電信股份有限公司 Service management system and adaption method of service information process
CN114884836A (en) * 2022-04-28 2022-08-09 济南浪潮数据技术有限公司 High-availability method, device and medium for virtual machine
CN115865617A (en) * 2022-11-17 2023-03-28 广州鲁邦通智能科技有限公司 VPN remote diagnosis and maintenance system
CN115865617B (en) * 2022-11-17 2023-10-03 广州鲁邦通智能科技有限公司 VPN remote diagnosis and maintenance system
CN116339134A (en) * 2022-12-30 2023-06-27 华能国际电力股份有限公司德州电厂 Frequency modulation optimization control system of large-disturbance thermal power generating unit
CN116339134B (en) * 2022-12-30 2023-10-24 华能国际电力股份有限公司德州电厂 Frequency modulation optimization control system of large-disturbance thermal power generating unit
CN116390138A (en) * 2023-04-25 2023-07-04 中南大学 Fault diagnosis method based on digital twin network and related equipment
CN116390138B (en) * 2023-04-25 2024-03-08 中南大学 Fault diagnosis method based on digital twin network and related equipment

Also Published As

Publication number Publication date
CN110740054B (en) 2022-04-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant