CN112383485B

CN112383485B - Network congestion control method and device

Info

Publication number: CN112383485B
Application number: CN202011188905.8A
Authority: CN
Inventors: 程诚
Original assignee: New H3C Technologies Co Ltd
Current assignee: New H3C Technologies Co Ltd
Priority date: 2020-10-30
Filing date: 2020-10-30
Publication date: 2022-08-19
Anticipated expiration: 2040-10-30
Also published as: CN112383485A

Abstract

The embodiment of the invention provides a network congestion control method and device, relating to the technical field of network communication. The method comprises the following steps: acquiring first information reflecting the relationship between the number of data packets sent by a network device in a communication network and the number of acknowledgement characters (ACKs) received within a preset time length; obtaining second information reflecting the network delay of the communication network within the preset time length; obtaining the packet loss number of the communication network within the preset time length; predicting the value of a parameter to be adjusted of the network device according to the first information, the second information and the packet loss number; and controlling the network device to adjust the parameter to be adjusted according to the predicted value, so as to realize network congestion control. The scheme provided by the embodiment of the invention can achieve a better network congestion control effect in different networks.

Description

Network congestion control method and device

Technical Field

The present invention relates to the field of network communication technologies, and in particular, to a method and an apparatus for controlling network congestion.

Background

Since the data processing capability of the network device is limited, when there is a lot of data to be transmitted in the communication network, it is difficult for the network device to process a large amount of data in a short time, which is likely to cause a decrease in data transmission efficiency and a network congestion problem.

In the prior art, network congestion is often controlled by a relatively fixed algorithm, for example, network congestion is controlled by a Vegas algorithm. In this case, when the network delay exceeds a preset threshold, the size of a CWnd (Congestion Window) of the network device that transmits data, which indicates the amount of data that the network device can transmit, is reduced to half of the original size.

Although the above manner can be applied to implement control over network congestion under some circumstances, because network devices and transmitted data included in different communication networks are different, causes of network congestion problems in different communication networks are often different, and therefore, it is difficult for an algorithm with a fixed rule to achieve a better network congestion control effect in different communication networks.

Disclosure of Invention

Embodiments of the present invention provide a method and an apparatus for controlling network congestion, so as to achieve a better network congestion control effect in different networks. The specific technical scheme is as follows:

In a first aspect, an embodiment of the present invention provides a network congestion control method, where the method includes:

acquiring first information reflecting the relationship between the number of data packets sent by a network device in a communication network and the number of acknowledgement characters (ACKs) received within a preset time length;

obtaining second information reflecting the network delay of the communication network within the preset time length;

obtaining the packet loss number of the communication network in the preset time length;

predicting the value of a parameter to be adjusted of the network device according to the first information, the second information and the packet loss number, wherein the parameter to be adjusted is a parameter for preventing network congestion in the communication network;

and controlling the network equipment to adjust the parameter to be adjusted according to the predicted value so as to realize network congestion control.

In an embodiment of the present invention, the first information is: the ratio of the number of data packets sent by the network device in the communication network to the number of ACKs received within the preset time length;

the second information is: the ratio of the average network delay of the communication network within the preset time length to the minimum network delay among the historical delays.

In an embodiment of the present invention, the parameters to be adjusted are: the data transmission rate of the network device or the size of the congestion window CWnd of the network device.

In an embodiment of the present invention, the predicting, according to the first information, the second information, and the packet loss amount, a value of a parameter to be adjusted of the network device includes:

inputting the first information, the second information and the packet loss quantity into a pre-trained congestion control model, and predicting the value of the parameter to be adjusted of the network equipment;

wherein the congestion control model is used to predict, according to the first information, the second information and the packet loss number, the value of the parameter to be adjusted for preventing network congestion in the communication network;

the congestion control model is a model obtained by performing reinforcement learning on a preset neural network model using sample network information, sample parameter values and sample state information obtained by sample network devices in a preset number of different sample communication networks, wherein the neural network model is deployed in the sample network devices;

the sample network information includes: the sample first information, the sample second information and the sample packet loss number of the sample communication network; the sample parameter value is: the value of the parameter to be adjusted, predicted according to the sample network information, for preventing network congestion in the sample communication network; and the sample state information reflects the network state of the sample communication network after the parameter to be adjusted of the sample network device in the sample communication network has been adjusted using the sample parameter value.

In an embodiment of the present invention, the congestion control model is obtained by performing reinforcement learning on the preset neural network model in the following manner:

acquiring sample network information, sample parameter values and sample state information of sample network equipment in different sample communication networks;

inputting the sample network information into the neural network model, and predicting state prediction information representing a network state which can be achieved by the communication network;

calculating a model parameter gradient of the neural network model according to the sample network information, the sample parameter value, the sample state information and the state prediction information;

adjusting model parameters of the neural network model according to the model parameter gradient, and sending the adjusted model parameters to sample network equipment in the sample communication network, so that the sample network equipment configures the self-deployed neural network model according to the adjusted model parameters, and acquires new sample network information, sample parameter values and sample state information;

if the preset training end condition is not met, returning to execute the step of obtaining the sample network information, the sample parameter value and the sample state information of the sample network equipment in different sample communication networks;

otherwise, determining the neural network model as the congestion control model.

In an embodiment of the present invention, the obtaining of the sample network information, the sample parameter value, and the sample state information from the sample network devices in different sample communication networks includes:

randomly acquiring a sample data group from a sample data pool, wherein the sample data pool is used for storing the sample data groups acquired by the sample network devices in each sample communication network, and each sample data group comprises: sample network information; a sample parameter value predicted according to the sample network information included in the sample data group; and sample state information of the sample communication network after the parameter to be adjusted of the sample network device in the sample communication network has been adjusted according to the sample parameter value included in the sample data group.

In an embodiment of the present invention, the sample state information is calculated according to the following formula:

r = a * throughput * [1 - latency * latency / (latency_min * 2 + b)]

wherein r is the sample state information, a is a first preset coefficient, b is a second preset coefficient, throughput is the network throughput of the sample communication network, latency is the average delay of the sample communication network, and latency_min is the minimum delay of the sample communication network.
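As an illustration only, the formula above can be evaluated with a few lines of Python. The function name, the default coefficients a = 1.0 and b = 1.0, and the numeric inputs are assumptions made for this sketch; the text does not fix the coefficients or the units of throughput and delay.

```python
def sample_state_info(throughput: float, latency: float, latency_min: float,
                      a: float = 1.0, b: float = 1.0) -> float:
    """Evaluate r = a * throughput * [1 - latency * latency / (latency_min * 2 + b)].

    The defaults a = 1.0 and b = 1.0 are placeholders, not values from the text.
    """
    denominator = latency_min * 2.0 + b
    if denominator == 0:
        raise ValueError("latency_min * 2 + b must be non-zero")
    return a * throughput * (1.0 - latency * latency / denominator)


# Same throughput, rising average delay: the value of r drops.
low_delay = sample_state_info(throughput=10.0, latency=1.0, latency_min=0.5)
high_delay = sample_state_info(throughput=10.0, latency=2.0, latency_min=0.5)
```

With these hypothetical inputs, a higher average delay at the same throughput gives a lower r, which matches the role of r as a measure of how good the resulting network state is.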

In a second aspect, an embodiment of the present invention provides a network congestion control apparatus, where the apparatus includes:

the first information acquisition module is used for acquiring first information reflecting the relationship between the number of data packets sent by a network device in a communication network and the number of acknowledgement characters (ACKs) received within a preset time length;

the second information acquisition module is used for acquiring second information reflecting the network delay of the communication network within the preset time length;

a packet loss number obtaining module, configured to obtain a packet loss number of the communication network within the preset time period;

a value prediction module, configured to predict a value of a parameter to be adjusted of the network device according to the first information, the second information, and the packet loss number, where the parameter to be adjusted is: parameters for preventing network congestion in the communication network;

and the congestion control module is used for controlling the network equipment to adjust the parameter to be adjusted according to the predicted value so as to realize network congestion control.

In an embodiment of the present invention, the value prediction module is specifically configured to:

input the first information, the second information and the packet loss number into a pre-trained congestion control model, and predict the value of the parameter to be adjusted of the network device;

wherein the congestion control model is a model obtained by performing reinforcement learning on a preset neural network model using sample network information, sample parameter values and sample state information obtained by sample network devices in a preset number of different sample communication networks, wherein the neural network model is deployed in the sample network devices;

the sample network information includes: the sample first information, the sample second information and the sample packet loss number of the sample communication network; the sample parameter value is: the value of the parameter to be adjusted, predicted according to the sample network information, for preventing network congestion in the sample communication network; and the sample state information reflects the network state of the sample communication network after the parameter to be adjusted of the sample network device in the sample communication network has been adjusted using the sample parameter value.

In an embodiment of the present invention, the apparatus further includes a model training module, where the model training module performs reinforcement learning on the preset neural network model to obtain the congestion control model, and the model training module includes:

the information acquisition submodule is used for acquiring sample network information, sample parameter values and sample state information of sample network equipment in different sample communication networks;

the information prediction submodule is used for inputting the sample network information into the neural network model and predicting state prediction information representing a network state which can be reached by the communication network;

the gradient calculation submodule is used for calculating the model parameter gradient of the neural network model according to the sample network information, the sample parameter value, the sample state information and the state prediction information;

the parameter adjusting submodule is used for adjusting the model parameters of the neural network model according to the model parameter gradient and sending the adjusted model parameters to the sample network equipment in the sample communication network, so that the sample network equipment configures the self-deployed neural network model according to the adjusted model parameters and acquires new sample network information, sample parameter values and sample state information; if the preset training end condition is not met, returning to execute the information acquisition submodule;

and the model determining submodule is used for determining the neural network model as the congestion control model under the condition that a preset training end condition is met.

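In an embodiment of the present invention, the information obtaining sub-module is specifically configured to randomly acquire a sample data group from a sample data pool.

In an embodiment of the present invention, the sample state information is calculated according to the following formula: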

r = a * throughput * [1 - latency * latency / (latency_min * 2 + b)]

wherein r is the sample state information, a is a first preset coefficient, b is a second preset coefficient, throughput is the network throughput of the sample communication network, latency is the average delay of the sample communication network, and latency_min is the minimum delay of the sample communication network.

In a third aspect, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete mutual communication through the communication bus;

a memory for storing a computer program;

a processor for implementing the method steps of any of the first aspect when executing a program stored in the memory.

In a fourth aspect, the present invention provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements the method steps of any one of the first aspect.

In a fifth aspect, embodiments of the present invention also provide a computer program product comprising instructions, which when run on a computer, cause the computer to perform the method steps of any of the first aspects described above.

The embodiment of the invention has the following beneficial effects:

when the scheme provided by the embodiment of the invention is applied to network congestion control, the value of the parameter to be adjusted of the network equipment is predicted by determining the first information, the second information and the packet loss number in the preset time duration in the communication network, and the network equipment is controlled to adjust the parameter to be adjusted according to the value, so that the network congestion control is realized.

The first information is information reflecting a relationship between the number of data packets sent by the network equipment in the communication network and the number of the received Acknowledgement Characters (ACKs) in the preset time length, and the second information is information reflecting the network delay of the communication network in the preset time length, so that the first information, the second information and the packet loss number in different time lengths in different communication networks may be different, and the first information, the second information and the packet loss number reflect the network congestion condition of the communication networks. Therefore, the scheme provided by the embodiment of the invention can control the network congestion conditions of different communication networks, and achieves better network congestion control effect in different communication networks.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of a first network congestion control method according to an embodiment of the present invention;

Fig. 2 is a flowchart illustrating a second network congestion control method according to an embodiment of the present invention;

Fig. 3 is a flowchart illustrating a first congestion control model training method according to an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of a neural network model according to an embodiment of the present invention;

Fig. 5 is a flowchart illustrating a second congestion control model training method according to an embodiment of the present invention;

Fig. 6 is a schematic diagram of data transmission in a congestion control model training process according to an embodiment of the present invention;

Fig. 7 is a schematic structural diagram of a network congestion control apparatus according to an embodiment of the present invention;

Fig. 8 is a schematic structural diagram of a model training module according to an embodiment of the present invention;

Fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Because the network congestion control methods in the prior art have difficulty achieving a good network congestion control effect in different communication networks, embodiments of the present invention provide a method and an apparatus for controlling network congestion to solve this problem.

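In an embodiment of the present invention, a method for controlling network congestion is provided, where the method includes:

acquiring first information reflecting the relationship between the number of data packets sent by a network device in a communication network and the number of acknowledgement characters (ACKs) received within a preset time length;

obtaining second information reflecting the network delay of the communication network within the preset time length;

obtaining the packet loss number of the communication network within the preset time length;

predicting the value of the parameter to be adjusted of the network device according to the first information, the second information and the packet loss number, wherein the parameter to be adjusted is a parameter for preventing network congestion in the communication network;

and controlling the network device to adjust the parameter to be adjusted according to the predicted value, so as to realize network congestion control.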

The first information is information reflecting the relationship between the number of data packets sent by the network equipment in the communication network and the number of the received acknowledgement characters ACK in the preset time length, and the second information is information reflecting the network delay of the communication network in the preset time length, so that the first information, the second information and the packet loss number in different communication networks are possibly different, and the first information, the second information and the packet loss number reflect the network congestion condition of the communication network. Therefore, the scheme provided by the embodiment of the invention can control the network congestion conditions of different communication networks, and achieves better network congestion control effect in different communication networks.

The following describes a method and an apparatus for controlling network congestion according to embodiments of the present invention.

Referring to fig. 1, an embodiment of the present invention provides a flowchart of a first network congestion control method, and in particular, the method may be applied to various network devices such as a server and a router in a communication network. The above method includes the following steps S101 to S105.

S101: Obtaining first information reflecting the relationship between the number of data packets sent by a network device in a communication network and the number of ACKs received within a preset time length.

The preset time period may be 1s, 3s, 5s, and the like.

Specifically, the first information may be a ratio between the number of packets sent by the network device and the number of ACKs (acknowledgement characters) received by the network device in the communication network within the preset time period, or may be a difference between the number of packets sent and the number of ACKs received.

After the network device sends a data packet, the packet must travel through the communication network and reaches its destination device only after a period of time; the ACK that the destination device sends for the received packet likewise takes a period of time to travel back to the sending network device. If the transmission efficiency of the communication network is high and network congestion is not serious, the network device receives ACKs shortly after sending data packets, so the number of data packets sent within the preset time length is close to the number of ACKs received. Conversely, if the transmission efficiency is low and network congestion is severe, the number of data packets sent within the preset time length differs greatly from the number of ACKs received. The first information can therefore reflect the network congestion condition of the communication network.
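A minimal sketch of how the first information could be computed from two counters collected over the preset time length. The function name, the `mode` argument, and the choice to return infinity when no ACK has arrived are assumptions made for illustration; the text specifies only that a ratio or a difference may be used.

```python
def first_information(packets_sent: int, acks_received: int,
                      mode: str = "ratio") -> float:
    """Relate packets sent to ACKs received within one preset time window."""
    if mode == "ratio":
        if acks_received == 0:
            # Assumption: with no ACKs at all, report an unbounded ratio.
            return float("inf")
        return packets_sent / acks_received
    if mode == "difference":
        return float(packets_sent - acks_received)
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical counters for one window: 100 packets sent, 80 ACKs received.
ratio = first_information(100, 80)
difference = first_information(100, 80, mode="difference")
```

A ratio near 1 (or a difference near 0) corresponds to the lightly congested case described above, and larger values to heavier congestion.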

S102: Obtaining second information reflecting the network delay of the communication network within the preset time length.

The second information may be an average value, a maximum value, or a median value of network delays of the communication network within the preset duration.

The second information may be: a ratio between an average network delay of the communication network within the preset time period and a minimum network delay in historical delays, or a difference between the average network delay and the minimum network delay.

Specifically, the historical delay may be a network delay of the communication network obtained before the second information is obtained at this time.

Or the second information may be: a ratio between the average network delay and a preset delay, or a difference between the average network delay and the preset delay.

Or the second information may be: the ratio of the average network delay to the average of the historical delays, or the difference between the average network delay and the average of the historical delays.

The network delay is the time from when the network device sends a data packet until the destination device receives it. The network delay is low when network congestion is not serious, and higher otherwise. The second information can therefore reflect the network congestion condition of the communication network.
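One of the variants described above, the average delay in the current window relative to the minimum historical delay, can be sketched as follows. The function name, the argument names, and the sample delay values (in milliseconds) are assumptions for illustration.

```python
from statistics import mean
from typing import Sequence


def second_information(window_delays: Sequence[float],
                       historical_delays: Sequence[float],
                       mode: str = "ratio") -> float:
    """Compare the average delay of the current window with the minimum historical delay."""
    average_delay = mean(window_delays)
    minimum_delay = min(historical_delays)
    if mode == "ratio":
        if minimum_delay <= 0:
            raise ValueError("minimum historical delay must be positive for a ratio")
        return average_delay / minimum_delay
    if mode == "difference":
        return average_delay - minimum_delay
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical delays in milliseconds.
delay_ratio = second_information([30.0, 50.0], [20.0, 25.0, 40.0])
delay_difference = second_information([30.0, 50.0], [20.0, 25.0, 40.0], mode="difference")
```

Using the minimum historical delay as the baseline means a ratio close to 1 indicates that the network is currently performing near its best observed delay.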

S103: Obtaining the packet loss number of the communication network within the preset time length.

The more serious the network congestion, the more likely packet loss is to occur in the communication network, and the larger the packet loss number within the preset time length. The packet loss number can therefore reflect the network congestion condition of the communication network.

Specifically, the steps S101 to S103 may be executed in parallel, or may be executed according to a preset sequence, which is not limited in the embodiment of the present invention.

S104: Predicting the value of the parameter to be adjusted of the network device according to the first information, the second information and the packet loss number.

The parameter to be adjusted is a parameter for preventing network congestion in the communication network.

Specifically, the parameter to be adjusted may be a data transmission rate of the network device or a CWnd size of the network device.

Since the first information, the second information, and the packet loss number may all represent a network congestion condition of the communication network, a value of the parameter to be adjusted of the network device may be selected from a preset value array according to the network congestion condition represented by the first information, the second information, and the packet loss number.

Specifically, the value of the parameter to be adjusted may be determined according to a correspondence between preset network information and a value in a value array. The network information includes the first information, the second information and the packet loss amount.
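The correspondence between network information and entries of the preset value array is not spelled out in the text. The following sketch shows one possible shape for it: each of the three inputs is compared with a threshold, and the number of exceeded thresholds selects an entry. The array contents, the thresholds, and the mapping itself are all hypothetical.

```python
PRESET_VALUES = [0.5, 0.75, 1.0, 1.25, 1.5]  # hypothetical array of rate multipliers

# Hypothetical thresholds above which each input is treated as a congestion signal.
FIRST_INFO_THRESHOLD = 1.2
SECOND_INFO_THRESHOLD = 1.5
PACKET_LOSS_THRESHOLD = 0

# Hypothetical correspondence: number of congestion signals -> index into PRESET_VALUES.
SIGNALS_TO_INDEX = {0: 4, 1: 2, 2: 1, 3: 0}


def select_value(first_info: float, second_info: float, packet_loss: int) -> float:
    """Pick a value from the preset array based on how congested the inputs look."""
    signals = 0
    if first_info > FIRST_INFO_THRESHOLD:
        signals += 1
    if second_info > SECOND_INFO_THRESHOLD:
        signals += 1
    if packet_loss > PACKET_LOSS_THRESHOLD:
        signals += 1
    return PRESET_VALUES[SIGNALS_TO_INDEX[signals]]


uncongested = select_value(first_info=1.0, second_info=1.0, packet_loss=0)
congested = select_value(first_info=1.5, second_info=2.0, packet_loss=10)
```

A fixed table like this is the simplest reading of the passage; a trained model replaces the hand-written thresholds with a learned mapping.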

In addition, the value of the parameter to be adjusted of the network device may also be predicted in step S104A, which will not be described in detail herein.

In addition, when the parameter to be adjusted is a data transmission rate of the network device, a specific value of the data transmission rate may be recorded in the preset value array, and a first adjustment amplitude value for adjusting the data transmission rate may also be recorded.

For example, the first adjustment amplitude value may be a first preset multiple by which the original data transmission rate is multiplied, such as 0.5, 0.75, 1.25, 1.5 or 2, or a first preset difference added to the original data transmission rate, where the first preset difference may be positive or negative.

If the first information, the second information and the packet loss number indicate that the network congestion condition is serious, the value of the data sending rate can be determined to be a lower value, so that the data sending rate of the network equipment is reduced, the data quantity of the data needing to be transmitted by the communication network is reduced, and the network congestion is controlled. Otherwise, the value of the data sending rate may be determined to be a higher value, so as to shorten the time required by the network device to send data.

Furthermore, when the parameter to be adjusted is the size of the CWnd of the network device, the specific value of the size of the CWnd may be recorded in the preset value array, or a second adjustment amplitude value for adjusting the size of the CWnd may be recorded.

For example, the second adjustment amplitude value may be a second preset multiple by which the original CWnd size is multiplied, such as 0.5, 0.75, 1.25, 1.5 or 2, or a second preset difference added to the original CWnd size, where the second preset difference may be positive or negative.

If the first information, the second information and the packet loss number indicate that the network congestion situation is serious, the value of the CWnd can be determined to be a lower value, so that the data volume of data which can be sent by the network equipment is reduced, the data volume of data which needs to be transmitted by the communication network is reduced, and the network congestion is further controlled. Conversely, the value of the CWnd may be determined to be a higher value, so that the network device can send more data.
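The three kinds of entries described above (a specific value, a preset multiple, or a preset difference) can be applied to a data transmission rate or a CWnd size in the same way. The sketch below is illustrative; the function name, the `kind` labels, and the lower bound `floor` are assumptions that do not appear in the text.

```python
def apply_adjustment(current: float, value: float, kind: str = "absolute",
                     floor: float = 1.0) -> float:
    """Return the new rate or CWnd size after applying one preset-array entry.

    kind="absolute":   value is the new setting itself
    kind="multiple":   value multiplies the current setting (e.g. 0.5, 1.25)
    kind="difference": value is added to the current setting (may be negative)
    """
    if kind == "absolute":
        new_value = value
    elif kind == "multiple":
        new_value = current * value
    elif kind == "difference":
        new_value = current + value
    else:
        raise ValueError(f"unknown kind: {kind!r}")
    # Assumption: never let the setting fall below a small positive floor.
    return max(new_value, floor)


halved_cwnd = apply_adjustment(20.0, 0.5, kind="multiple")
reduced_cwnd = apply_adjustment(20.0, -4.0, kind="difference")
```

Multiples below 1 and negative differences shrink the setting under heavy congestion, while multiples above 1 and positive differences let the device send more when the network is lightly loaded.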

S105: Controlling the network device to adjust the parameter to be adjusted according to the predicted value, so as to realize network congestion control.

Specifically, if the network device is itself the electronic device that executes the network congestion control method, it may directly adjust the parameter to be adjusted to the predicted value. Otherwise, the electronic device executing the network congestion control method may send the predicted value to the network device, so that the network device adjusts the parameter to be adjusted according to that value. The network device then continues to operate with the adjusted parameter, thereby controlling network congestion.
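The two cases in step S105 (the executing device is the network device itself, or it controls a separate network device) can be sketched as a single dispatch function. The `NetworkDevice` class, the `send` callback, and the use of CWnd as the adjusted parameter are hypothetical; the text does not describe a message format or transport.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class NetworkDevice:
    name: str
    cwnd: float  # the parameter to be adjusted in this sketch


def control(device: NetworkDevice, predicted_value: float, is_local: bool,
            send: Optional[Callable[[str, float], None]] = None) -> None:
    """Apply the predicted value locally, or forward it to a remote device."""
    if is_local:
        # The device running the method is the network device itself.
        device.cwnd = predicted_value
        return
    if send is None:
        raise ValueError("a send callback is required for a remote device")
    # Otherwise hand the value to the remote device, which applies it itself.
    send(device.name, predicted_value)


local = NetworkDevice("router-a", cwnd=20.0)
control(local, 10.0, is_local=True)

outbox = []
remote = NetworkDevice("server-b", cwnd=20.0)
control(remote, 10.0, is_local=False, send=lambda name, value: outbox.append((name, value)))
```

In the remote case the sketch only records the outgoing message; how the receiving device applies it is left to that device.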

Referring to fig. 2, an embodiment of the present invention provides a flowchart of a second network congestion control method, and compared with the foregoing embodiment shown in fig. 1, the step S104 may be implemented by the following step S104A.

S104A: Inputting the first information, the second information and the packet loss number into a pre-trained congestion control model, and predicting the value of the parameter to be adjusted of the network device.

The congestion control model is used to predict, according to the first information, the second information and the packet loss number, the value of the parameter to be adjusted for preventing network congestion in the communication network.

Specifically, the congestion control model may select a value of the parameter to be adjusted from a preset value array.

In addition, the congestion control model is a model obtained by performing reinforcement learning on a preset neural network model using sample network information, sample parameter values and sample state information obtained by sample network devices in a preset number of different sample communication networks, where the neural network model is deployed in the sample network devices.

The sample network information includes: the sample first information, the sample second information and the sample packet loss number of the sample communication network.

The sample parameter value is: the value of the parameter to be adjusted, predicted according to the sample network information, for preventing network congestion in the sample communication network.

The sample state information reflects the network state of the sample communication network after the parameter to be adjusted of the sample network device in the sample communication network has been adjusted using the sample parameter value. The sample state information can therefore be used to evaluate how much adjusting the parameter with the sample parameter value improved the network congestion condition of the sample communication network, and thus whether the sample parameter value was well chosen. The model parameters of the neural network model can then be adjusted according to the sample state information.

Specifically, the reinforcement learning process proceeds as follows. The neural network model is trained on sample data; sample network information continues to be acquired; the trained neural network model predicts a new sample parameter value from that sample network information; and new sample state information is acquired. The neural network model is then trained further on the newly acquired sample network information, sample parameter values, and sample state information, and the trained neural network model is obtained through repeated iteration. Because the neural network model is deployed in the sample network device, the sample parameter values for the sample network in which the device is located can be obtained through the neural network model.

The sample communication network may be a network formed by communication devices that actually exist, or a virtual network obtained through computer simulation.

Because the congestion control model has strong data processing capability, predicting the value of the parameter to be adjusted through the model is more efficient, which in turn improves the efficiency of network congestion control.

Referring to fig. 3, an embodiment of the present invention provides a flowchart of a first congestion control model training method, which obtains the congestion control model by performing reinforcement learning on the preset neural network model through the following steps S301 to S305.

S301: Acquire sample network information, sample parameter values, and sample state information from sample network devices in different sample communication networks.

Specifically, the sample network information, sample parameter values, and sample state information may be divided into sample data groups. The sample data groups are ordered in time according to when they were generated, and a sequence number may be assigned to each sample data group according to that order.

Each sample data group comprises: sample network information; the sample parameter value predicted according to that sample network information; and the sample state information of the sample communication network after the parameter to be adjusted of the sample network device in the sample communication network has been adjusted according to that sample parameter value.

The sample state information may be calculated according to the following formula:

r = a * throughput * [1 - latency * latency / (latency_min * 2 + b)]

where r is the sample state information, a is a first preset coefficient, b is a second preset coefficient, throughput is the network throughput of the sample communication network, latency is the average delay of the sample communication network, and latency_min is the minimum delay of the sample communication network.

Specifically, the network throughput may be a network throughput of the sample communication network within a preset sample duration, the average time delay may be an average value of network time delays of the sample communication network within the preset sample duration, and the minimum time delay may be a minimum value of the network time delays of the sample communication network within the preset sample duration.

As can be seen from the above formula, the larger the network throughput, the larger the calculated sample state information; likewise, the smaller the average delay, the larger the calculated sample state information. Since larger network throughput and smaller average delay indicate less severe network congestion, a larger value of the sample state information indicates a better network congestion control effect.
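The formula can be evaluated directly, as in the sketch below. The function name `sample_state_info` and the default values of the coefficients a and b are hypothetical; the expression follows the formula exactly as written above.

```python
def sample_state_info(throughput, latency, latency_min, a=1.0, b=1.0):
    """Compute r = a * throughput * [1 - latency * latency / (latency_min * 2 + b)].

    a and b stand for the first and second preset coefficients; their
    default values here are placeholders, not values from the embodiment.
    """
    return a * throughput * (1.0 - latency * latency / (latency_min * 2 + b))


# Same throughput, two different average delays: the lower delay scores higher.
low_delay = sample_state_info(throughput=100.0, latency=1.0, latency_min=1.0, b=2.0)
high_delay = sample_state_info(throughput=100.0, latency=1.5, latency_min=1.0, b=2.0)
```

With these inputs `low_delay` is larger than `high_delay`, which matches the monotonic behavior described above.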

In addition, the sample network information can correspond to the state in reinforcement learning, the sample parameter value can correspond to the action in reinforcement learning, and the sample state information can correspond to the reward in reinforcement learning.

In an embodiment of the present invention, a first preset number of sample data sets may be obtained.

S302: Input the sample network information into the neural network model, and predict state prediction information representing the network state that the communication network can reach.

Referring to fig. 4, an embodiment of the present invention provides a schematic structural diagram of a neural network model.

The neural network model comprises a policy network and a value network, each framed by a dotted line in the figure. The policy network predicts sample parameter values according to the sample network information and may correspond to the policy function in reinforcement learning. The value network predicts the network state that the communication network can reach according to the sample network information and may correspond to the value function in reinforcement learning. The output of the value network may be used as the state prediction information.

The structures of the policy network and the value network may be the same as those commonly used in reinforcement learning in the prior art, which is not limited in the embodiment of the present invention.

S303: Calculate the model parameter gradient of the neural network model according to the sample network information, the sample parameter value, the sample state information, and the state prediction information.

Because the neural network model comprises a policy network and a value network, the calculated model parameter gradient can be divided into a policy model parameter gradient and a value model parameter gradient.

Specifically, the gradient of the parameter of the policy model may be calculated according to the following formula:

where dθ is the policy model parameter gradient, α is a first preset step length, c is a preset entropy coefficient, θ is the network parameter of the policy network, the gradient operator denotes differentiation with respect to θ, i is the sequence number of the sample data group, s_i is the sample network information contained in the i-th sample data group, a_i is the sample parameter value contained in the i-th sample data group, π_θ(s_i, a_i) is the probability that the neural network model outputs the sample parameter value a_i when s_i is input, V(s_i; ω) is the state prediction information predicted according to s_i, ω is the network parameter of the value network, and H(π(s_i; θ)) is the entropy term of π_θ(s_i, a_i).

When a plurality of sample data sets are acquired, the value of Q(s, i) depends on the sequence number of the sample data set. When i is the largest sequence number among the acquired sample data sets, Q(s, i) = V(s_i; ω). Otherwise, Q(s, i) = r_i + γ Q(s, i+1), where r_i is the sample state information contained in the i-th sample data set and γ is a preset discount factor.
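The recursion for Q(s, i) can be evaluated backwards from the largest sequence number, as in the sketch below. The function name `compute_q_values`, the list-based layout, and the default value of γ are assumptions for illustration.

```python
def compute_q_values(rewards, last_state_value, gamma=0.9):
    """Evaluate Q(s, i) for sample data sets ordered by sequence number.

    rewards[i] is r_i, the sample state information of the i-th set;
    last_state_value is V(s_i; omega) for the largest sequence number.
    """
    n = len(rewards)
    if n == 0:
        return []
    q = [0.0] * n
    # Base case: for the largest sequence number, Q equals the value prediction.
    q[n - 1] = last_state_value
    # Recursive case: Q(s, i) = r_i + gamma * Q(s, i + 1), walking backwards.
    for i in range(n - 2, -1, -1):
        q[i] = rewards[i] + gamma * q[i + 1]
    return q
```

As the rule is stated, the sample state information of the last set does not enter its own Q value; the value network's prediction bootstraps it instead.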

In an embodiment of the present invention, when a plurality of sample data groups are obtained, the policy model parameter gradient may be obtained by separately calculating according to each sample data group, and the policy model parameter θ of the policy network may be separately adjusted according to each policy model parameter gradient.

In addition, the value model parameter gradient can be calculated according to the following formula:

where dω is the value model parameter gradient and β is a second preset step length.

In an embodiment of the present invention, when a plurality of sample data sets are obtained, the value model parameter gradient may be obtained by calculation according to each sample data set, and the value model parameter ω of the value network may be adjusted according to each value model parameter gradient.
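As an illustration only, the symbols defined above match those of the standard advantage actor-critic update. Assuming that form (an assumption, not a statement of the formulas used in this embodiment), the two gradients for the i-th sample data set could be written as below. The sign convention in particular is assumed; it would have to be chosen so that subtracting dθ and dω in step S304 improves the policy and reduces the value error.

```latex
d\theta = \alpha \, \nabla_\theta \log \pi_\theta(s_i, a_i) \, \bigl( Q(s, i) - V(s_i; \omega) \bigr)
          + c \, \nabla_\theta H\bigl( \pi(s_i; \theta) \bigr)

d\omega = \beta \, \frac{\partial \bigl( Q(s, i) - V(s_i; \omega) \bigr)^2}{\partial \omega}
```

In this form the difference Q(s, i) - V(s_i; ω) plays the role of the advantage, and the entropy term weighted by c discourages the policy from collapsing onto a single value too early.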

S304: Adjust the model parameters of the neural network model according to the model parameter gradient, and send the adjusted model parameters to the sample network device in the sample communication network, so that the sample network device configures its own deployed neural network model according to the adjusted model parameters and acquires new sample network information, sample parameter values, and sample state information.

Specifically, dθ may be subtracted from the policy model parameter θ of the policy network to adjust θ, and dω may be subtracted from the value model parameter ω of the value network to adjust ω.

After the adjusted model parameters are sent to the sample network devices in the sample communication network, each sample network device can configure its own deployed neural network model according to the adjusted model parameters. The sample network device then continues to acquire new sample network information, predicts a new sample parameter value from it through the neural network model with the adjusted model parameters, and acquires new sample state information after adjusting the parameter to be adjusted according to the new sample parameter value.

If the preset training end condition is not satisfied, the process returns to step S301: the sample network information, sample parameter values, and sample state information of the sample network devices in different sample communication networks are acquired again, and training of the neural network model continues.

Specifically, the preset training end condition may be that the number of times of sending the adjusted model parameter to the sample network device in the sample communication network reaches a first preset number of times.

S305: Determine the neural network model as the congestion control model when the preset training end condition is met.
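Steps S301 to S305 can be sketched as the loop below. All callables (`acquire_samples`, `predict_state`, `compute_gradients`, `push_parameters`) are hypothetical placeholders for the operations described above, model parameters are represented as plain lists of floats, and the end condition is the count of parameter transmissions mentioned as one possible preset training end condition.

```python
def train_congestion_model(acquire_samples, predict_state, compute_gradients,
                           push_parameters, theta, omega, max_rounds):
    """Run the S301-S305 loop until parameters have been sent max_rounds times."""
    rounds_sent = 0
    while rounds_sent < max_rounds:          # preset training end condition
        groups = acquire_samples()           # S301: sample data sets
        # S302: state prediction information for each sample data set
        predictions = [predict_state(group, omega) for group in groups]
        # S303: policy and value model parameter gradients
        d_theta, d_omega = compute_gradients(groups, predictions, theta, omega)
        # S304: subtract the gradients, then send parameters to the sample devices
        theta = [t - d for t, d in zip(theta, d_theta)]
        omega = [w - d for w, d in zip(omega, d_omega)]
        push_parameters(theta, omega)
        rounds_sent += 1
    return theta, omega                      # S305: parameters of the final model
```

Passing the operations in as callables keeps the loop independent of how samples are gathered, which is what allows step S301 to be swapped for a different acquisition strategy.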

The congestion control model is thus obtained by training the neural network model through reinforcement learning. During training, after the neural network model has been trained on the sample network information, sample parameter values, and sample state information, the trained model is used to keep acquiring new sample network information, sample parameter values, and sample state information in the sample network, and the model is then trained again. As a result, the trained congestion control model is suited to the changing network state of a communication network.

In addition, because the sample network information, the sample parameter value and the sample state information come from different sample networks, the trained congestion control model can be suitable for different communication networks, and thus, the network congestion control can be performed on different communication networks through the congestion control model.

Referring to fig. 5, an embodiment of the present invention provides a flowchart of a second congestion control model training method, and compared with the foregoing embodiment shown in fig. 3, the step S301 may be implemented by the step S301A.

S301A: Randomly acquire sample data sets from a sample data pool.

The sample data pool is used for storing sample network information, sample parameter values and sample state information acquired by sample network equipment in each sample communication network.

Specifically, the sample data pool may be a database for storing a sample data set. The number of the randomly acquired sample data sets may be a second preset number of data sets.
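A minimal in-memory sketch of such a pool with random acquisition follows. The names `SampleGroup` and `SampleDataPool` are hypothetical, and a plain list stands in for the database mentioned above.

```python
import random
from collections import namedtuple

# One sample data set: network information, parameter value, state information.
SampleGroup = namedtuple("SampleGroup", ["network_info", "parameter_value", "state_info"])


class SampleDataPool:
    """Stores sample data sets and hands out random subsets for training."""

    def __init__(self, seed=None):
        self._groups = []
        self._rng = random.Random(seed)

    def add(self, group):
        """Store one sample data set sent by a sample network device."""
        self._groups.append(group)

    def sample(self, count):
        """Return up to `count` distinct sample data sets chosen at random."""
        count = min(count, len(self._groups))
        return self._rng.sample(self._groups, count)

    def __len__(self):
        return len(self._groups)
```

Here `count` plays the role of the second preset number of data sets; sampling without replacement is a design choice of this sketch, not something the text specifies.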

Because the sample data pool can store a large number of sample data groups, more sample data groups can be used when training the neural network model, so the neural network model converges more easily and the trained congestion control model achieves a better network congestion control effect.

Referring to fig. 6, an embodiment of the present invention provides a data transmission diagram of a congestion control model training process.

The learning agent shown in the figure (the intelligent learner) is used to train the neural network model. The arrows in the figure indicate the direction of data transmission. Each cube represents a neural network model, deployed in the learning agent or in a sample network device, together with its outputs π(s) and V(s), where π(s) represents the probability of each candidate value being output as the sample parameter value when s is input into the neural network model, and V(s) represents the state prediction information predicted from s.

The intelligent learner acquires a sample data group from the sample data pool, trains the neural network model according to the sample data group, and sends the adjusted model parameters to the sample network equipment.

The sample network device collects sample network information of the sample network and, through the neural network model deployed on it, predicts a sample parameter value according to the collected sample network information. After the parameter to be adjusted is adjusted according to the sample parameter value, the network state of the sample network changes, and the sample network device determines sample state information representing the changed network state.

The sample network device sends the acquired sample network information, sample parameter value, and sample state information to the sample data pool, where they are stored.

In an embodiment of the present invention, the sample network device may send a sample data set to the sample data pool each time one is acquired. Alternatively, it may send the acquired sample data sets to the sample data pool only when their number reaches a third preset number of data sets. The process of the sample network device sending sample data sets to the sample data pool may be performed a second preset number of times.

As can be seen from the above, because the sample data sets used in training the neural network model are randomly acquired from the sample data pool, the correlation among the acquired sample data sets is low, and so is the similarity between the network states they correspond to. This prevents the situation in which a model trained on highly correlated sample data sets exerts a good network congestion control effect only in networks in certain network states, and thus keeps the congestion control model from falling into a local optimum.

In addition, sample data sets are not exchanged directly between the sample network device and the intelligent learner. When the sample network device acquires a new sample data set, it can send the sample data set to the sample data pool for storage. When the intelligent learner needs to train the neural network model, it can acquire sample data sets from the sample data pool. The process by which the sample network device obtains sample data sets and the process by which the intelligent learner trains the model are therefore independent and asynchronous, and neither affects the other. The intelligent learner does not need to wait for sample data sets sent by the sample network device, and the sample network device does not need to wait for the intelligent learner to finish model training, which improves the efficiency of training the congestion control model.
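The decoupling described above can be sketched with threads that share only the pool. Everything here is hypothetical scaffolding: `SharedPool`, `device_loop`, `learner_loop`, and `run_demo` are illustrative names, sample data sets are reduced to `(device_id, step)` tuples, and no real training is performed.

```python
import random
import threading


class SharedPool:
    """Thread-safe pool; devices and the learner interact only through it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._groups = []

    def put(self, group):
        with self._lock:
            self._groups.append(group)

    def sample(self, count):
        with self._lock:
            k = min(count, len(self._groups))
            return random.sample(self._groups, k)


def device_loop(pool, device_id, rounds):
    # A sample network device: produces sample data sets without waiting for training.
    for step in range(rounds):
        pool.put((device_id, step))


def learner_loop(pool, rounds, batch_size, seen):
    # The learner: draws whatever is available without waiting for any device.
    for _ in range(rounds):
        seen.append(len(pool.sample(batch_size)))


def run_demo(devices=3, device_rounds=50, learner_rounds=20, batch_size=8):
    pool, seen = SharedPool(), []
    threads = [threading.Thread(target=device_loop, args=(pool, d, device_rounds))
               for d in range(devices)]
    threads.append(threading.Thread(target=learner_loop,
                                    args=(pool, learner_rounds, batch_size, seen)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pool, seen
```

Neither loop blocks on the other beyond the brief lock around the shared list, which is the property the paragraph above attributes to the sample data pool.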

Corresponding to the network congestion control method, the invention also provides a network congestion control device.

Referring to fig. 7, an embodiment of the present invention provides a schematic structural diagram of a network congestion control apparatus, where the apparatus includes:

a first information obtaining module 701, configured to obtain first information that reflects a relationship between the number of packets sent by a network device in a communication network and the number of received acknowledgement characters ACK in a preset duration;

a second information obtaining module 702, configured to obtain second information that reflects a network delay of the communication network within the preset time duration;

a packet loss number obtaining module 703, configured to obtain the number of packet losses of the communication network within the preset time duration;

a value prediction module 704, configured to predict a value of a parameter to be adjusted of the network device according to the first information, the second information, and the packet loss amount, where the parameter to be adjusted is: a parameter for preventing network congestion of the communication network;

and a congestion control module 705, configured to control the network device to adjust the parameter to be adjusted according to the predicted value, so as to implement network congestion control.

In an embodiment of the present invention, the value prediction module 704 is specifically configured to:

In an embodiment of the present invention, the apparatus further includes a model training module, and the model training module is configured to perform reinforcement learning on the preset neural network model to obtain the congestion control model.

Referring to fig. 8, an embodiment of the present invention provides a schematic structural diagram of a model training module. The model training module comprises:

the information acquisition submodule 801 is configured to acquire sample network information, sample parameter values, and sample state information from sample network devices in different sample communication networks;

an information prediction sub-module 802, configured to input the sample network information into the neural network model, and predict state prediction information indicating a network state that a communication network can reach;

a gradient calculation submodule 803, configured to calculate a model parameter gradient of the neural network model according to the sample network information, the sample parameter value, the sample state information, and the state prediction information;

a parameter adjusting submodule 804, configured to adjust the model parameters of the neural network model according to the model parameter gradient, and send the adjusted model parameters to the sample network device in the sample communication network, so that the sample network device configures its own deployed neural network model according to the adjusted model parameters and obtains new sample network information, sample parameter values, and sample state information; and, if the preset training end condition is not met, to trigger the information acquisition submodule 801 again;

a model determining sub-module 805, configured to determine the neural network model as the congestion control model when a preset training end condition is met.

The congestion control model is thus obtained by training the neural network model through reinforcement learning. During training, after the neural network model has been trained on the sample network information, sample parameter values, and sample state information, the trained model is used to keep acquiring new sample network information, sample parameter values, and sample state information in the sample network, and the model is then trained again. As a result, the trained congestion control model is suited to the changing network state of a communication network.

In an embodiment of the present invention, the information obtaining sub-module 801 is specifically configured to:

As can be seen from the above, because the sample data sets used in training the neural network model are randomly acquired from the sample data pool, the correlation among the acquired sample data sets is low, and so is the similarity between the network states they correspond to. This prevents the situation in which a model trained on highly correlated sample data sets exerts a good network congestion control effect only in networks in certain network states, and thus keeps the congestion control model from falling into a local optimum.

In addition, sample data sets are not exchanged directly between the sample network device and the intelligent learner. When the sample network device acquires a new sample data set, it can send the sample data set to the sample data pool for storage. When the intelligent learner needs to train the neural network model, it can acquire sample data sets from the sample data pool. The process by which the sample network device obtains sample data sets and the process by which the intelligent learner trains the model are therefore independent and asynchronous, and neither affects the other. The intelligent learner does not need to wait for sample data sets sent by the sample network device, and the sample network device does not need to wait for the intelligent learner to finish model training, which improves the efficiency of training the congestion control model.

r = a * throughput * [1 - latency * latency / (latency_min * 2 + b)]

where r is the sample state information, a is a first preset coefficient, b is a second preset coefficient, throughput is the network throughput of the sample communication network, latency is the average delay of the sample communication network, and latency_min is the minimum delay of the sample communication network.

An embodiment of the present invention further provides an electronic device, as shown in fig. 9, including a processor 901, a communication interface 902, a memory 903, and a communication bus 904, where the processor 901, the communication interface 902, and the memory 903 complete mutual communication through the communication bus 904,

a memory 903 for storing computer programs;

the processor 901 is configured to implement the method steps of any of the above network congestion control methods when executing the program stored in the memory 903.

When the electronic device provided by the embodiment of the present invention is applied to network congestion control, the first information reflects the relationship between the number of data packets sent by the network device in the communication network and the number of acknowledgement characters (ACKs) received within the preset time length, and the second information reflects the network delay of the communication network within the preset time length. The first information, the second information, and the packet loss number may therefore differ across communication networks and across time periods, and together they reflect the network congestion condition of the communication network. By referring to the first information, the second information, and the packet loss number of each communication network, the scheme provided by the embodiment of the present invention can determine the value of the parameter to be adjusted that prevents network congestion in that network, and thereby realize network congestion control, rather than applying a fixed rule to the communication network. The scheme can therefore control network congestion in different communication networks and achieve a better network congestion control effect in each of them.

The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this is not intended to represent only one bus or type of bus.

The communication interface is used for communication between the electronic equipment and other equipment.

The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Alternatively, the memory may be at least one memory device located remotely from the processor.

The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but also Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components.

In another embodiment provided by the present invention, a computer-readable storage medium is further provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the steps of any one of the above network congestion control methods.

When the computer program stored in the computer-readable storage medium and applied to the agent is executed to perform network congestion control, the first information reflects the relationship between the number of data packets sent by the network device in the communication network and the number of acknowledgement characters (ACKs) received within the preset time length, and the second information reflects the network delay of the communication network within the preset time length. The first information, the second information, and the packet loss number may therefore differ across communication networks and across time periods, and together they reflect the network congestion condition of the communication network. By referring to the first information, the second information, and the packet loss number of each communication network, the scheme provided by the embodiment of the present invention can determine the value of the parameter to be adjusted that prevents network congestion in that network, and thereby realize network congestion control, rather than applying a fixed rule to the communication network. The scheme can therefore control network congestion in different communication networks and achieve a better network congestion control effect in each of them.

In yet another embodiment provided by the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform any of the network congestion control methods of the above embodiments.

When the computer program applied to the agent terminal provided by the embodiment of the present invention is executed to perform network congestion control, the first information reflects the relationship between the number of data packets sent by the network device in the communication network and the number of acknowledgement characters (ACKs) received within the preset time length, and the second information reflects the network delay of the communication network within the preset time length. The first information, the second information, and the packet loss number may therefore differ across communication networks and across time periods, and together they reflect the network congestion condition of the communication network. By referring to the first information, the second information, and the packet loss number of each communication network, the scheme provided by the embodiment of the present invention can determine the value of the parameter to be adjusted that prevents network congestion in that network, and thereby realize network congestion control, rather than applying a fixed rule to the communication network. The scheme can therefore control network congestion in different communication networks and achieve a better network congestion control effect in each of them.

In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)).

It should be noted that, in this document, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on differences from other embodiments. In particular, as for the apparatus, the electronic device, the computer-readable storage medium and the computer program product, since they are substantially similar to the method embodiments, the description is relatively simple, and reference may be made to the partial description of the method embodiments for relevant points.

The above describes only preferred embodiments of the present invention and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims

1. A method for network congestion control, the method comprising:

acquiring first information reflecting the relationship between the number of data packets sent by network equipment in a communication network and the number of acknowledgements (ACKs) received within a preset time length;

predicting the value of a parameter to be adjusted of the network equipment according to the first information, the second information and the packet loss number, wherein the parameter to be adjusted is a parameter for preventing network congestion of the communication network;

controlling the network equipment to adjust the parameter to be adjusted according to the predicted value so as to realize network congestion control;

the predicting a value of a parameter to be adjusted of the network device according to the first information, the second information and the packet loss number includes:

2. The method of claim 1,

the first information is: the ratio of the number of data packets sent by the network equipment in the communication network to the number of received ACKs in the preset time length;

the second information is: the ratio of the average network delay of the communication network within the preset time length to the minimum network delay among the historical delays.
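As an illustration only, the two ratios defined in claim 2 could be computed as in the following minimal sketch. The function names, argument names, and the handling of a window in which no ACK is received are assumptions made for this example and are not specified in the source.

```python
from typing import Sequence


def first_information(packets_sent: int, acks_received: int) -> float:
    """Ratio of data packets sent to ACKs received within the preset time length."""
    # Assumption: the source does not say what happens when no ACK arrives
    # in the window; this sketch returns infinity in that case.
    if acks_received == 0:
        return float("inf")
    return packets_sent / acks_received


def second_information(window_delays: Sequence[float],
                       historical_min_delay: float) -> float:
    """Ratio of the average delay in the window to the minimum historical delay."""
    if not window_delays:
        raise ValueError("at least one delay sample is required")
    if historical_min_delay <= 0:
        raise ValueError("historical minimum delay must be positive")
    average_delay = sum(window_delays) / len(window_delays)
    return average_delay / historical_min_delay


if __name__ == "__main__":
    # Hypothetical measurements for one preset time length (delays in ms).
    print(first_information(120, 100))
    print(second_information([30.0, 50.0], 20.0))
```

Under this reading, a first-information value above 1 means more packets were sent than acknowledged in the window, and a second-information value above 1 means the current average delay exceeds the best delay observed so far.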

3. The method of claim 1, wherein the parameter to be adjusted is: the data transmission rate of the network device or the size of the congestion window (CWnd) of the network device.

4. The method according to claim 1, wherein the congestion control model is obtained by performing reinforcement learning on the preset neural network model in the following manner:

calculating the model parameter gradient of the neural network model according to the sample network information, the sample parameter value, the sample state information and the state prediction information;

5. The method of claim 4, wherein obtaining sample network information, sample parameter values, and sample status information from sample network devices in different sample communication networks comprises:

6. The method of claim 4, wherein the sample state information is calculated according to the following formula:

r = a * throughput * [1 - latency * latency / (latency_min * 2 + b)]

wherein r is the sample state information, a is a first preset coefficient, b is a second preset coefficient, throughput is the network throughput of the sample communication network, latency is the average delay of the sample communication network, and latency_min is the minimum delay of the sample communication network.
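The formula of claim 6 can be evaluated directly, as in the following minimal sketch. The function name, the default values of the coefficients a and b, and the units are assumptions for illustration; the source states only that a and b are preset coefficients. The term "latency_min * 2" is read literally as multiplication by two.

```python
def sample_state_information(throughput: float,
                             latency: float,
                             latency_min: float,
                             a: float = 1.0,
                             b: float = 1.0) -> float:
    """Evaluate r = a * throughput * [1 - latency * latency / (latency_min * 2 + b)].

    a and b are the first and second preset coefficients; the defaults of 1.0
    are placeholders chosen for this sketch, not values from the source.
    """
    denominator = latency_min * 2 + b
    if denominator == 0:
        raise ValueError("latency_min * 2 + b must be non-zero")
    return a * throughput * (1.0 - latency * latency / denominator)


if __name__ == "__main__":
    # Hypothetical values: with latency_min = 1.5 and b = 1, the denominator is 4.
    print(sample_state_information(throughput=10.0, latency=1.0, latency_min=1.5))
    print(sample_state_information(throughput=10.0, latency=2.0, latency_min=1.5))
```

With these placeholder coefficients, r grows with throughput and shrinks as the average delay rises relative to the minimum delay; it reaches zero when latency squared equals latency_min * 2 + b and becomes negative beyond that point.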

7. An apparatus for network congestion control, the apparatus comprising:

a packet loss number obtaining module, configured to obtain a packet loss number of the communication network within the preset time duration;

the congestion control module is used for controlling the network equipment to adjust the parameter to be adjusted according to the predicted value so as to realize network congestion control;

the value prediction module is specifically configured to:

the sample network information includes: sample first information, sample second information, and the packet loss number of the sample communication network; the sample parameter value is: the value of the parameter to be adjusted, predicted according to the sample network information, for preventing network congestion of the sample communication network; and the sample state information reflects: the network state of the sample communication network after the parameter to be adjusted of the sample network device in the sample communication network has been adjusted using the sample parameter value.

8. The apparatus of claim 7, wherein the first information is: the ratio of the number of data packets sent by the network equipment in the communication network to the number of received ACKs in the preset time length;

9. The apparatus of claim 7, wherein the parameter to be adjusted is: the data transmission rate of the network device or the size of the congestion window (CWnd) of the network device.

10. The apparatus of claim 7, further comprising a model training module, wherein the model training module performs reinforcement learning on the preset neural network model to obtain the congestion control model, and the model training module comprises:

11. The apparatus according to claim 10, wherein the information obtaining sub-module is specifically configured to:

12. The apparatus of claim 10, wherein the sample state information is calculated according to the following formula:

r = a * throughput * [1 - latency * latency / (latency_min * 2 + b)]

wherein r is the sample state information, a is a first preset coefficient, b is a second preset coefficient, throughput is the network throughput of the sample communication network, latency is the average delay of the sample communication network, and latency_min is the minimum delay of the sample communication network.