CN110300026A - Network connection fault processing method and device - Google Patents

Network connection fault processing method and device

Info

Publication number
CN110300026A
Authority
CN
China
Prior art keywords
queue
server
semi
overflow
connection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910578595.1A
Other languages
Chinese (zh)
Inventor
赵帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Cloud Network Technology Co Ltd
Beijing Kingsoft Cloud Technology Co Ltd
Original Assignee
Beijing Kingsoft Cloud Network Technology Co Ltd
Beijing Kingsoft Cloud Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Cloud Network Technology Co Ltd, Beijing Kingsoft Cloud Technology Co Ltd filed Critical Beijing Kingsoft Cloud Network Technology Co Ltd
Priority to CN201910578595.1A priority Critical patent/CN110300026A/en
Publication of CN110300026A publication Critical patent/CN110300026A/en
Priority to PCT/CN2020/097989 priority patent/WO2020259551A1/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0677 Localisation of faults
    • H04L47/00 Traffic control in data switching networks
    • H04L47/50 Queue scheduling
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer And Data Communications (AREA)

Abstract

Embodiments of the invention provide a network connection fault processing method and device. When establishment of a network connection between a client and a server deployed in a container fails, it is determined whether the semi-connection queue of the server overflows. If the semi-connection queue overflows, the queue length of the semi-connection queue is increased according to a preset adjustment rule until the queue no longer overflows. If the semi-connection queue does not overflow, the network connection path between the client and the server is determined, fault detection is performed on the nodes in the path, and the node to be repaired is determined so that it can be repaired. In this way, a network fault can be diagnosed by judging whether the semi-connection queue of the server overflows, which reduces labor consumption and improves the efficiency of network fault processing.

Description

Network connection fault processing method and device
Technical Field
The invention relates to the technical field of internet, in particular to a network connection fault processing method and device.
Background
In one container, one or more applications and environment files necessary for the operation of the applications are included. By deploying the application in the container, the running difference of the application caused by the change of the release version of the host operating system and other basic environments can be reduced. In some scenarios, the server may be deployed in a container, so that the server may provide stable service for the client even if the host in which the server is located changes.
Generally, a connection needs to be established between a client and a server deployed in a container through a Transmission Control Protocol (TCP) to perform data Transmission. However, in the process of establishing a connection through TCP, there is a possibility of connection establishment failure, and it can be understood that the connection establishment failure will cause a communication failure between the client and the server, and further, may have a large influence on the availability of the service provided by the server. At present, network connection faults can be processed only by a manual troubleshooting mode of operation and maintenance personnel, a large amount of labor is consumed, and the efficiency is low.
Therefore, there is a need for an automated network connection failure handling method for container platform based services.
Disclosure of Invention
The embodiment of the invention aims to provide a network connection fault processing method and device so as to realize automatic network connection fault processing based on a container. The specific technical scheme is as follows:
the embodiment of the invention provides a network connection fault processing method, which comprises the following steps:
determining whether a semi-connection queue of a server overflows under the condition that network connection establishment between a client and the server deployed in a container fails;
if the semi-connection queue of the server overflows, increasing the queue length of the semi-connection queue according to a preset adjustment rule until the semi-connection queue of the server no longer overflows;
and if the semi-connection queue of the server does not overflow, determining a network connection path between the client and the server, performing fault detection on nodes in the network connection path, and determining nodes to be repaired so as to repair the nodes to be repaired.
Optionally, after increasing the queue length of the semi-connection queue according to a preset adjustment rule until the semi-connection queue of the server no longer overflows, the method further includes:
if the network connection between the client and the server fails to be established under the condition that the semi-connection queue of the server does not overflow any more, determining a network connection path between the client and the server, performing fault detection on nodes in the network connection path, and determining nodes to be repaired so as to repair the nodes to be repaired.
Optionally, the determining whether the semi-connection queue of the server overflows includes:
acquiring overflow information of a semi-connection queue of the server;
if the overflow information is acquired, determining that the semi-connection queue of the server side overflows;
and if the overflow information is not acquired, determining that the semi-connection queue of the server side does not overflow.
Optionally, the overflow information includes an overflow number; increasing the queue length of the semi-connection queue according to a preset adjustment rule until the semi-connection queue of the server side does not overflow any more, comprising:
after the queue length of the semi-connection queue is increased every time, acquiring the overflow quantity of the semi-connection queue after the queue length is increased;
comparing the overflow quantity of the semi-connection queue after the queue length is increased with the overflow quantity obtained the previous time, and if the two are the same, determining that the semi-connection queue of the server no longer overflows.
Optionally, the obtaining overflow information of the semi-connection queue of the server includes:
logging in the container, inputting a network information query instruction in the container to acquire the network information of the container, and querying character information in a preset format corresponding to the overflow information in the network information to serve as the overflow information.
Optionally, increasing the queue length of the semi-connection queue according to a preset adjustment rule includes:
increasing the numerical value of the queue parameter of the server by N times, wherein N is a natural number, and the value of N is greater than 1; or,
increasing the numerical value of the queue parameter of the server by a preset value;
the queue parameter is used for defining the queue length of the semi-connection queue of the server.
Optionally, the performing fault detection on the node in the network connection path to determine the node to be repaired includes:
sending a fault detection instruction to the client so that the client sends a preset number of detection messages to the server by using ping (Packet Internet Groper);
acquiring the number of detection messages received by each node in the network connection path;
and judging whether the number of the detection messages received by each node is the same as the preset number, and if not, determining the node as a node to be repaired.
Optionally, the detection packet includes identification information; the obtaining the number of the detection packets received by each node in the network connection path includes:
and acquiring the number of the detection messages received by each node in the network connection path according to the identification information.
The embodiment of the invention also provides a network connection fault processing device, which comprises:
a determining module, configured to determine whether the semi-connection queue of the server overflows when establishment of the network connection between the client and the server deployed in the container fails;
the adjusting module is used for increasing the queue length of the semi-connection queue according to a preset adjusting rule until the semi-connection queue of the server does not overflow any more if the semi-connection queue of the server overflows;
the first detection module is configured to determine a network connection path between the client and the server if it is determined that the semi-connection queue of the server does not overflow, perform fault detection on a node in the network connection path, determine a node to be repaired, and repair the node to be repaired.
Optionally, the apparatus further comprises:
and the second detection module is used for determining a network connection path between the client and the server if the network connection between the client and the server fails to be established under the condition that the semi-connection queue of the server does not overflow any more, performing fault detection on nodes in the network connection path, determining nodes to be repaired, and repairing the nodes to be repaired.
Optionally, the determining module is specifically configured to:
acquiring overflow information of a semi-connection queue of the server;
if the overflow information is acquired, determining that the semi-connection queue of the server side overflows;
and if the overflow information is not acquired, determining that the semi-connection queue of the server side does not overflow.
Optionally, the overflow information includes an overflow number; the adjusting module is specifically configured to:
after the queue length of the semi-connection queue is increased every time, acquiring the overflow quantity of the semi-connection queue after the queue length is increased;
comparing the overflow quantity of the semi-connection queue after the queue length is increased with the overflow quantity obtained the previous time, and if the two are the same, determining that the semi-connection queue of the server no longer overflows.
Optionally, the determining module is specifically configured to:
logging in the container, inputting a network information query instruction in the container to acquire the network information of the container, and querying character information in a preset format corresponding to the overflow information in the network information to serve as the overflow information.
Optionally, the adjusting module is specifically configured to:
increasing the numerical value of the queue parameter of the server by N times, wherein N is a natural number, and the value of N is greater than 1; or,
increasing the numerical value of the queue parameter of the server by a preset value;
the queue parameter is used for defining the queue length of the semi-connection queue of the server.
Optionally, the first detection module is specifically configured to:
sending a fault detection instruction to the client so that the client sends a preset number of detection messages to the server by using ping (Packet Internet Groper);
acquiring the number of detection messages received by each node in the network connection path;
and judging whether the number of the detection messages received by each node is the same as the preset number, and if not, determining the node as a node to be repaired.
Optionally, the detection packet includes identification information; the first detection module is specifically configured to:
and acquiring the number of the detection messages received by each node in the network connection path according to the identification information.
The embodiment of the invention also provides electronic equipment which comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory complete mutual communication through the communication bus;
a memory for storing a computer program;
and a processor for implementing any of the above-described network connection failure processing methods when executing the program stored in the memory.
An embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements any of the above-mentioned network connection failure processing methods.
Embodiments of the present invention further provide a computer program product containing instructions, which when run on a computer, cause the computer to execute any one of the above-mentioned network connection failure processing methods.
The embodiment of the invention has the following beneficial effects:
the method and the device for processing the network connection fault provided by the embodiment of the invention determine whether a semi-connection queue of a server overflows under the condition that the network connection between a client and the server deployed in a container is failed, if the semi-connection queue of the server overflows, the queue length of the semi-connection queue is increased according to a preset adjustment rule until the semi-connection queue of the server does not overflow any more, and if the semi-connection queue of the server does not overflow, a network connection path between the client and the server is determined, the fault detection is carried out on nodes in the network connection path, and the nodes to be repaired are determined so as to repair the nodes to be repaired. Therefore, the network fault can be diagnosed by judging whether the semi-connection queue of the server side overflows or not, if the semi-connection queue of the server side overflows, the network connection fault caused by the full semi-connection queue is judged, the queue length of the semi-connection queue is increased for repairing, if the semi-connection queue of the server side does not overflow, the network connection fault caused by the node fault in the network connection path is judged, the node to be repaired with the fault is automatically positioned, the node to be repaired is repaired, the labor consumption is reduced, and the network fault processing efficiency is improved.
Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a flow chart illustrating a method for TCP connection establishment;
FIG. 2 is a schematic flowchart of a network connection failure processing method according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a network connection failure processing apparatus according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In some scenarios, the server may be deployed in a container, so that the server may provide stable service for the client even if the host in which the server is located changes. Generally, a connection between a client and a server deployed in a container needs to be established through a Transmission Control Protocol (TCP) first to perform data Transmission.
As shown in FIG. 1, a connection is established through TCP as follows: first, the client sends a SYN (Synchronize Sequence Numbers) message to the server; after receiving the SYN message, the server returns a SYN+ACK (Acknowledgement) message; after receiving the SYN+ACK message, the client sends an ACK message to the server, thereby establishing the connection between the client and the server.
After receiving the SYN message, the server generates an entry corresponding to the SYN message, and stores the entry into a semi-connection queue, and after receiving the ACK message corresponding to the SYN message, stores the entry from the semi-connection queue into a full-connection queue.
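The queue behaviour described above can be illustrated with a toy model. This is only a sketch for intuition, not how any operating system implements its queues; the class name `ListenSocketModel` and its fields are hypothetical.

```python
from collections import deque


class ListenSocketModel:
    """Toy model of a server's semi-connection and full-connection queues."""

    def __init__(self, semi_queue_length):
        self.semi_queue_length = semi_queue_length
        self.semi_queue = deque()   # entries waiting for the final ACK
        self.full_queue = deque()   # entries whose handshake has completed
        self.overflow_count = 0     # cumulative number of SYNs that found the queue full

    def on_syn(self, client_id):
        """Handle a SYN: store an entry, or count an overflow if the queue is full."""
        if len(self.semi_queue) >= self.semi_queue_length:
            self.overflow_count += 1
            return False            # in this model no SYN+ACK is returned
        self.semi_queue.append(client_id)
        return True                 # a SYN+ACK would be returned

    def on_ack(self, client_id):
        """Handle the final ACK: move the entry to the full-connection queue."""
        if client_id not in self.semi_queue:
            return False
        self.semi_queue.remove(client_id)
        self.full_queue.append(client_id)
        return True
```

With a queue length of 2, a third SYN arriving before any ACK is counted as an overflow, which is the situation the method later detects and repairs.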
However, in the process of establishing a connection through TCP, there is a possibility that the connection is established with a failure. For example, the server may not reply to the SYN + ACK packet in time due to the full semi-connection queue of the server, which may result in a connection establishment failure, or may also result in a connection establishment failure due to a packet loss of a certain node in a network connection path, and the like. At present, network connection faults can be processed only by a manual checking mode of operation and maintenance personnel, a large amount of labor is consumed, the efficiency is low, and an automatic network connection fault processing method based on containers is lacked.
In order to solve the above technical problem, the present invention provides a network connection failure processing method, which may be applied to any electronic device, such as a host of a container where a server is located, another computer in a network, a mobile terminal, and the like, and is not limited in this embodiment of the present invention.
The following generally describes a network connection failure processing method provided in an embodiment of the present invention, where the network connection failure processing method includes:
determining whether a semi-connection queue of a server overflows under the condition that network connection between a client and the server deployed in a container fails to be established;
if the semi-connection queue of the server overflows, increasing the queue length of the semi-connection queue according to a preset adjustment rule until the semi-connection queue of the server no longer overflows;
and if the semi-connection queue of the server does not overflow, determining a network connection path between the client and the server, performing fault detection on nodes in the network connection path, and determining nodes to be repaired so as to repair the nodes to be repaired.
As can be seen from the above, the network connection fault processing method provided in the embodiments of the present invention can diagnose a network fault by determining whether a semi-connection queue of a server overflows, determine that the network connection fault is caused by the fact that the semi-connection queue is full if the semi-connection queue of the server overflows, and repair the network connection fault by increasing the queue length of the semi-connection queue, and determine that the network connection fault is caused by a node fault in a network connection path if the semi-connection queue of the server does not overflow, and automatically locate a node to be repaired, where the fault occurs, to repair the node to be repaired, thereby reducing labor consumption and improving network fault processing efficiency.
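The overall decision flow can be sketched as a small dispatcher. All four callables are hypothetical placeholders supplied by the caller; the re-check after resizing corresponds to the optional step in which the path is diagnosed when the connection still fails after the queue stops overflowing.

```python
def handle_connection_failure(queue_overflows, grow_queue_until_stable,
                              connection_succeeds, diagnose_path):
    """Return (action, nodes_to_repair) for one connection failure.

    queue_overflows()         -> bool, True if the semi-connection queue overflows
    grow_queue_until_stable() -> enlarges the queue until it no longer overflows
    connection_succeeds()     -> bool, True if the client can now connect
    diagnose_path()           -> list of nodes to be repaired on the path
    """
    if queue_overflows():
        grow_queue_until_stable()
        if connection_succeeds():
            # The fault was caused by the full semi-connection queue.
            return "queue_resized", []
    # Either the queue never overflowed, or resizing did not help:
    # look for a faulty node on the network connection path.
    return "path_diagnosed", diagnose_path()
```

Passing the checks in as functions keeps the sketch independent of how overflow information or path data is actually collected.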
The following describes in detail a network connection failure processing method according to an embodiment of the present invention by using a specific embodiment.
As shown in fig. 2, a schematic flow chart of a network connection failure processing method provided in an embodiment of the present invention includes the following steps:
S201: When establishment of a network connection between a client and a server deployed in a container fails, determine whether the semi-connection queue of the server overflows. If it overflows, S202 is executed; if it does not overflow, S203 is executed.
In this step, the server is deployed in a container. A container is a complete operating environment: a container package may include the server together with the class libraries, other binary files, configuration files, and the like that provide the operating environment for the server. By containerizing the server and its running environment, differences in the server's behavior caused by the operating system release version and other basic environments can be reduced.
In one implementation, whether the semi-connection queue of the server overflows or not can be determined by acquiring overflow information of the semi-connection queue of the server. And if the overflow information is acquired, determining that the semi-connection queue of the server side overflows, and if the overflow information is not acquired, determining that the semi-connection queue of the server side does not overflow.
The overflow information may be a piece of identification information; that is, when the semi-connection queue of the server overflows, identification information may be generated to indicate the overflow. The overflow information may also contain the overflow number, that is, the number of entries that overflowed the semi-connection queue, which is output when the semi-connection queue of the server overflows. Generally, the overflow number is a cumulative value: if the semi-connection queue overflows, the overflow number increases; if the semi-connection queue does not overflow, either no overflow number is output or the overflow number is unchanged from the previous output.
When the overflow information of the semi-connection queue of the server is obtained, the container where the server is located can be logged in first. Then, a network information query instruction is input into the container to acquire the network information of the container. And further querying character information in a preset format corresponding to the overflow information in the network information, if the character information in accordance with the preset format is queried in the network information, analyzing the character information in accordance with the preset format to obtain the overflow information of the semi-connection queue of the server, and if the character information in accordance with the preset format is not queried in the network information, judging that the overflow information is not obtained.
For example, the container in which the server is located can be logged into through a docker exec command (container execution command). Then, the network information of the container is acquired by inputting the "netstat -s | grep overflow" instruction in the container. If the output network information contains a line similar to "XXX times the listen queue of a socket overflowed", characters conforming to the preset format exist in the network information, where "XXX" represents a specific number, namely the overflow number of the semi-connection queue of the server.
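A minimal sketch of this query and parsing step follows. It assumes that the docker CLI is available on the host, that netstat is installed inside the container, and that the relevant line starts with the overflow number and contains the word "overflow"; the exact wording of the line can differ between systems. The function names are hypothetical.

```python
import re
import subprocess


def read_network_info(container_name):
    """Run `netstat -s` inside the container and return its text output."""
    result = subprocess.run(
        ["docker", "exec", container_name, "netstat", "-s"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_overflow_count(network_info):
    """Return the overflow number, or None if no overflow line is present.

    Mirrors `grep overflow`: the first line containing "overflow" that
    begins with an integer is taken as the overflow information.
    """
    for line in network_info.splitlines():
        if "overflow" in line:
            match = re.match(r"\s*(\d+)\s", line)
            if match:
                return int(match.group(1))
    return None
```

A return value of `None` corresponds to the case in which no overflow information is obtained, so the queue is judged not to overflow.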
S202: Increase the queue length of the semi-connection queue according to a preset adjustment rule until the semi-connection queue of the server no longer overflows.
In this step, if it is determined that the semi-connection queue of the server overflows, the semi-connection queue is currently full, so it can be preliminarily determined that the network connection failure is caused by the full semi-connection queue. In this case, the queue length of the semi-connection queue may be increased until the semi-connection queue of the server no longer overflows, so that the queue can store new entries.
The queue length of the semi-connection queue may be increased according to the preset adjustment rule as follows: first, the queue parameter of the server is obtained, where the queue parameter defines the queue length of the semi-connection queue of the server; then, the value of the queue parameter is increased according to a parameter adjustment rule.
According to the parameter adjustment rule, the value of the queue parameter may be increased to N times its current value, where N is a natural number greater than 1. For example, the value of the queue parameter is increased to 2 times the current value.
Alternatively, according to the parameter adjustment rule, the value of the queue parameter may be increased by a preset value. For example, the value of the queue parameter is increased by 1000 relative to the current value.
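The two adjustment rules can be written as one small helper. This is a sketch; the function name and keyword arguments are hypothetical, and how the new value is written back to the server is left out.

```python
def increase_queue_parameter(current_value, multiplier=None, increment=None):
    """Return the new queue parameter value under one of the two rules.

    multiplier: natural number N > 1, new value = current_value * N
    increment:  positive preset value, new value = current_value + increment
    """
    if (multiplier is None) == (increment is None):
        raise ValueError("specify exactly one of multiplier or increment")
    if multiplier is not None:
        if not isinstance(multiplier, int) or multiplier <= 1:
            raise ValueError("multiplier must be a natural number greater than 1")
        return current_value * multiplier
    if increment <= 0:
        raise ValueError("increment must be positive")
    return current_value + increment
```

For instance, `increase_queue_parameter(128, multiplier=2)` gives 256, and `increase_queue_parameter(128, increment=1000)` gives 1128.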
For example, in one implementation, the queue parameters include a net. In this case, the maximum queue length of the semi-connected queue is the minimum of the net.
In another implementation, the queue parameter may further include a tcp_max_syn_backlog parameter, in which case the maximum queue length of the semi-connected queue is the minimum of the net.
After the queue length of the semi-connection queue is increased according to the preset adjustment rule, the client side can try to establish network connection with the server side. Specifically, an instruction for establishing a connection may be sent to the client, and after receiving the instruction, the client establishes a network connection with the server deployed in the container again. Alternatively, the client may continuously attempt to establish a connection with the server deployed in the container at preset time intervals.
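Retrying at preset intervals might look like the sketch below. The helper names, the number of attempts, and the timeout are assumptions for illustration; `try_connect` simply opens and closes a TCP connection.

```python
import socket
import time


def try_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def retry_connect(connect, attempts, interval_seconds, sleep=time.sleep):
    """Call connect() up to `attempts` times, waiting between failed attempts."""
    for attempt in range(attempts):
        if connect():
            return True
        if attempt < attempts - 1:
            sleep(interval_seconds)
    return False
```

A caller could pass `lambda: try_connect(server_ip, server_port)` as `connect`; the `sleep` parameter exists only so the waiting can be replaced in tests.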
In one implementation, if the overflow information includes the overflow number, the overflow number of the semi-connection queue after the queue length of the semi-connection queue is increased each time may be obtained, and then the overflow number of the semi-connection queue after the queue length is increased this time and the overflow number obtained at the previous time may be compared.
If the two are the same, it can be determined that the semi-connection queue of the server does not overflow any more, and if the overflow quantity of the semi-connection queue after the queue length is increased is larger than the overflow quantity obtained at the previous time, it can be determined that the semi-connection queue of the server still overflows, and further, the queue length of the semi-connection queue can be continuously increased according to a preset adjustment rule until the semi-connection queue does not overflow any more.
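The compare-and-grow loop can be sketched as follows. The callables are hypothetical hooks for reading the cumulative overflow number and for reading and writing the queue parameter. The `max_rounds` bound is an added safeguard that the source does not specify, and a real run would also need new connection attempts between two readings for the comparison to be meaningful.

```python
def grow_until_stable(read_overflow_count, read_queue_length,
                      write_queue_length, adjust, max_rounds=10):
    """Enlarge the queue until the cumulative overflow number stops growing.

    Returns True once two consecutive readings are equal, or False if the
    overflow number is still growing after max_rounds adjustments.
    """
    previous = read_overflow_count()
    for _ in range(max_rounds):
        write_queue_length(adjust(read_queue_length()))
        current = read_overflow_count()
        if current == previous:
            return True      # no new overflows since the last adjustment
        previous = current   # still overflowing: adjust again
    return False
```

Here `adjust` can be any rule that returns a larger value, such as doubling or adding a preset increment.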
If the network connection between the client and the server is successfully established once the semi-connection queue of the server no longer overflows, the earlier network connection fault was caused by the full semi-connection queue of the server, and it has been resolved by increasing the queue length of the semi-connection queue.
If the network connection between the client and the server is still failed to be established under the condition that the semi-connection queue of the server does not overflow any more, the network connection failure is indicated to be caused by packet loss of a certain node in a network connection path. In this case, a network connection path between the client and the server may be further determined, a fault detection may be performed on a node in the network connection path, a node to be repaired may be determined, and the node to be repaired may be repaired.
S203: determining a network connection path between a client and a server, performing fault detection on nodes in the network connection path, and determining nodes to be repaired so as to repair the nodes to be repaired.
In this step, if it is determined that the semi-connection queue of the server does not overflow, it can be inferred that a network failure is caused by packet loss due to a node failure in a network connection path. Then, a network connection path between the client and the server may be determined, fault detection may be performed on each node in the network connection path, and a node to be repaired is determined.
Specifically, the network connection path between the client and the server may be determined by querying a routing table, or by requesting the path information from the network operator; this is not specifically limited.
The method for detecting the fault of the node in the network connection path and determining the node to be repaired may be: firstly, a fault detection instruction is sent to a client so that the client sends a preset number of detection messages to a server by using ping (Packet Internet Groper), then, the number of the detection messages received by each node in a network connection path is obtained, whether the number of the detection messages received by each node is the same as the preset number is judged, and if not, the node is determined as a node to be repaired.
The fault detection instruction may be a ping -c 1000 -Q 0x2 <server_ip> command, and the preset number of detection messages to be sent may be carried in the fault detection instruction, or may be randomly generated by the client.
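As a rough sketch of the detection step, the helpers below build the probe command and flag every node whose received count differs from the preset number. The function names and the `received` mapping are hypothetical; how the per-node counts are collected is not specified here and is left outside the sketch.

```python
from typing import Dict, List


def build_probe_command(server_ip: str, count: int = 1000, tos: str = "0x2") -> List[str]:
    """Return the ping argument list: -c sets the probe count, -Q the marker."""
    return ["ping", "-c", str(count), "-Q", tos, server_ip]


def nodes_to_repair(path: List[str], received: Dict[str, int], expected: int) -> List[str]:
    """Return, in path order, the nodes whose received count differs from expected.

    A node with no recorded count is treated as having received nothing
    (an assumption of this sketch).
    """
    return [node for node in path if received.get(node, 0) != expected]
```

For example, `nodes_to_repair(["a", "b", "c"], {"a": 1000, "b": 997, "c": 997}, 1000)` flags `b` and `c`, since both counts differ from the preset number.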
Each node to be repaired, that is, each node whose number of received detection messages differs from the preset number, may be repaired, so that the network connection fault can be handled as soon as possible. Alternatively, the nodes to be repaired may be repaired one by one in the order in which the network connection path passes through them from the client to the server, and after each node is repaired, an attempt is made to establish the network connection between the client and the server.
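The sequential alternative can be sketched as below. The `repair` and `try_connect` callables are hypothetical, and stopping as soon as a connection attempt succeeds is an assumption of this sketch; the text only states that a connection is attempted after each repair.

```python
from typing import Callable, Iterable, List


def repair_in_path_order(
    path: List[str],
    to_repair: Iterable[str],
    repair: Callable[[str], None],
    try_connect: Callable[[], bool],
) -> List[str]:
    """Repair flagged nodes in client-to-server order, retrying after each one.

    Returns the nodes that were actually repaired.
    """
    pending = set(to_repair)
    repaired: List[str] = []
    for node in path:  # path is ordered from the client to the server
        if node not in pending:
            continue
        repair(node)
        repaired.append(node)
        if try_connect():
            # Assumption: once the connection succeeds, remaining nodes are left alone.
            break
    return repaired
```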
The detection message may further include identification information, so that the number of detection messages received by each node in the network connection path can be obtained according to the identification information, thereby improving the accuracy of fault detection. For example, the identification information may be a DSCP (Differentiated Services Code Point) field in the detection message.
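Counting by identification information might look like the sketch below. It assumes, purely for illustration, that each node can report its observed packets as `(node, marker)` pairs; that record format and the function name are not taken from the source.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def count_marked_probes(records: Iterable[Tuple[str, int]], marker: int) -> Dict[str, int]:
    """Count, per node, only the packets that carry the probe marker.

    Packets with any other marker value are treated as unrelated traffic
    and ignored, so they cannot inflate a node's count.
    """
    counts: Counter = Counter()
    for node, observed_marker in records:
        if observed_marker == marker:
            counts[node] += 1
    return dict(counts)
```

Filtering on the marker is what lets the per-node totals be compared directly with the preset number even when other traffic flows between the same endpoints.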
In another implementation, the second historical overflow amount may be greater than the first historical overflow amount. In this case, the method may return to the step of increasing the queue length of the semi-connection queue according to the preset adjustment rule, repeating until the newly obtained historical overflow amount is not greater than the historical overflow amount obtained the previous time.
As can be seen from the above, the network connection fault processing method provided in the embodiments of the present invention diagnoses a network fault by determining whether the semi-connection queue of the server overflows. If the semi-connection queue overflows, the method determines that the network connection fault is caused by the semi-connection queue being full, and repairs the fault by increasing the queue length of the semi-connection queue. If the semi-connection queue does not overflow, the method determines that the network connection fault is caused by a node fault in the network connection path, and automatically locates the faulty node so that it can be repaired. This reduces labor consumption and improves the efficiency of network fault processing.
Corresponding to the above network connection failure processing method, an embodiment of the present invention further provides a network connection failure processing apparatus. Fig. 3 is a schematic structural diagram of the network connection failure processing apparatus, and the apparatus includes:
a determining module 310, configured to determine whether a semi-connection queue of a server overflows in a case that establishment of a network connection between a client and the server deployed in a container fails;
an adjusting module 320, configured to, if it is determined that the semi-connection queue of the server side overflows, increase the queue length of the semi-connection queue according to a preset adjusting rule until the semi-connection queue of the server side does not overflow any more;
the first detection module 330 is configured to determine a network connection path between the client and the server if it is determined that the semi-connection queue of the server does not overflow, perform fault detection on a node in the network connection path, determine a node to be repaired, and repair the node to be repaired.
In one implementation, the apparatus further comprises:
a second detection module (not shown in the figure), configured to determine a network connection path between the client and the server if the network connection between the client and the server fails to be established when the semi-connection queue of the server does not overflow any more, perform fault detection on a node in the network connection path, determine a node to be repaired, and repair the node to be repaired.
In one implementation, the determining module 310 is specifically configured to:
acquiring overflow information of a semi-connection queue of a server;
if the overflow information is acquired, determining that the semi-connection queue of the server side overflows;
and if the overflow information is not acquired, determining that the semi-connection queue of the server side does not overflow.
In one implementation, the overflow information includes an overflow amount; the adjusting module 320 is specifically configured to:
after the queue length of the semi-connection queue is increased every time, acquiring the overflow quantity of the semi-connection queue after the queue length is increased;
and comparing the overflow quantity of the semi-connection queue after the queue length is increased with the overflow quantity obtained at the previous time, and if the overflow quantity of the semi-connection queue after the queue length is increased is the same as the overflow quantity obtained at the previous time, determining that the semi-connection queue of the server side does not overflow any more.
In one implementation, the determining module 310 is specifically configured to:
and logging in the container, inputting a network information query instruction in the container to acquire the network information of the container, and querying character information in a predetermined format corresponding to the overflow information in the network information to serve as the overflow information.
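The query-and-match step could be sketched as follows. This section names neither the query instruction nor the predetermined format, so both are assumptions here: the sketch supposes the network information is the text printed by `netstat -s` on Linux and that the overflow information is the line ending in "SYNs to LISTEN sockets dropped".

```python
import re
from typing import Optional

# Assumed format: a counter followed by the fixed phrase, on a line of its own.
_OVERFLOW_LINE = re.compile(r"^\s*(\d+)\s+SYNs to LISTEN sockets dropped\s*$", re.MULTILINE)


def parse_overflow_count(network_info: str) -> Optional[int]:
    """Return the overflow counter, or None when no matching line is present.

    None corresponds to "overflow information not acquired", i.e. the
    semi-connection queue is taken not to have overflowed.
    """
    match = _OVERFLOW_LINE.search(network_info)
    return int(match.group(1)) if match else None
```

Returning `None` rather than zero keeps the "not acquired" case distinct from a counter that is present but has not grown.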
In one implementation, the adjusting module 320 is specifically configured to:
increasing the numerical value of the queue parameter of the server by N times, wherein N is a natural number, and the value of N is more than 1; or,
increasing the numerical value of the queue parameter of the server by a preset value;
the queue parameter is used for defining the queue length of the semi-connection queue of the server.
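The two adjustment rules can be written as small factories, as in the sketch below. Reading "increase by N times" as multiplying the current value by N is an interpretation, and the function names are hypothetical.

```python
from typing import Callable


def multiply_rule(n: int) -> Callable[[int], int]:
    """Rule 1: scale the queue parameter by N, where N is a natural number > 1."""
    if n <= 1:
        raise ValueError("N must be a natural number greater than 1")
    return lambda value: value * n


def increment_rule(step: int) -> Callable[[int], int]:
    """Rule 2: raise the queue parameter by a fixed preset value."""
    if step <= 0:
        raise ValueError("the preset value must be positive")
    return lambda value: value + step
```

With a starting value of 128, `multiply_rule(2)` yields 256 and `increment_rule(64)` yields 192; either rule can be applied repeatedly until the queue stops overflowing.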
In one implementation, the first detecting module 330 is specifically configured to:
sending a fault detection instruction to the client so that the client sends a preset number of detection messages to the server by using ping (Packet Internet Groper);
acquiring the number of detection messages received by each node in a network connection path;
and judging whether the number of the detection messages received by each node is the same as the preset number, and if not, determining the node as a node to be repaired.
In one implementation, the detection message includes identification information; the first detecting module 330 is specifically configured to:
and acquiring the number of the detection messages received by each node in the network connection path according to the identification information.
As can be seen from the above description, the network connection fault processing apparatus provided in the embodiment of the present invention diagnoses a network fault by determining whether the semi-connection queue of the server overflows. If the semi-connection queue overflows, the apparatus determines that the network connection fault is caused by the semi-connection queue being full, and repairs the fault by increasing the queue length of the semi-connection queue. If the semi-connection queue does not overflow, the apparatus determines that the network connection fault is caused by a node fault in the network connection path, and automatically locates the faulty node so that it can be repaired. This reduces labor consumption and improves the efficiency of network fault processing.
An embodiment of the present invention further provides an electronic device, as shown in fig. 4, including a processor 401, a communication interface 402, a memory 403, and a communication bus 404, where the processor 401, the communication interface 402, and the memory 403 communicate with one another through the communication bus 404;
a memory 403 for storing a computer program;
the processor 401, when executing the program stored in the memory 403, implements the following steps:
determining whether a semi-connection queue of a server overflows under the condition that network connection between a client and the server deployed in a container fails to be established;
if the half-connection queue of the server side overflows, increasing the queue length of the half-connection queue according to a preset adjustment rule until the half-connection queue of the server side does not overflow any more;
and if the semi-connection queue of the server does not overflow, determining a network connection path between the client and the server, performing fault detection on nodes in the network connection path, and determining nodes to be repaired so as to repair the nodes to be repaired.
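The three steps above amount to a single branch on the overflow check, which can be sketched as follows. The callables and the returned labels are hypothetical and stand in for the operations described in the steps.

```python
from typing import Callable, List, Tuple


def handle_connection_failure(
    queue_overflowed: Callable[[], bool],
    grow_queue_until_stable: Callable[[], None],
    find_nodes_to_repair: Callable[[], List[str]],
) -> Tuple[str, List[str]]:
    """Dispatch a failed connection to the queue branch or the path branch."""
    if queue_overflowed():
        # Overflow detected: treat the full semi-connection queue as the cause.
        grow_queue_until_stable()
        return ("queue_enlarged", [])
    # No overflow: look for a faulty node on the client-to-server path.
    return ("path_checked", find_nodes_to_repair())
```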
As can be seen from the above, the electronic device provided in the embodiment of the present invention diagnoses a network fault by determining whether the semi-connection queue of the server overflows. If the semi-connection queue overflows, the electronic device determines that the network connection fault is caused by the semi-connection queue being full, and repairs the fault by increasing the queue length of the semi-connection queue. If the semi-connection queue does not overflow, the electronic device determines that the network connection fault is caused by a node fault in the network connection path, and automatically locates the faulty node so that it can be repaired. This reduces labor consumption and improves the efficiency of network fault processing.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component.
In yet another embodiment of the present invention, a computer-readable storage medium is further provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the steps of any of the above network connection failure processing methods.
In yet another embodiment, a computer program product containing instructions is provided, which when run on a computer causes the computer to perform any one of the above-mentioned network connection failure processing methods.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the device embodiment, the electronic device embodiment and the storage medium embodiment, since they are basically similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (18)

1. A method for handling network connection failures, the method comprising:
determining whether a semi-connection queue of a server overflows under the condition that network connection establishment between a client and the server deployed in a container fails;
if the semi-connection queue of the server overflows, increasing the queue length of the semi-connection queue according to a preset adjustment rule until the semi-connection queue of the server does not overflow any more;
and if the semi-connection queue of the server does not overflow, determining a network connection path between the client and the server, performing fault detection on nodes in the network connection path, and determining nodes to be repaired so as to repair the nodes to be repaired.
2. The method according to claim 1, wherein after said increasing the queue length of the semi-connection queue according to the preset adjustment rule until the semi-connection queue of the server no longer overflows, the method further comprises:
if the network connection between the client and the server fails to be established under the condition that the semi-connection queue of the server does not overflow any more, determining a network connection path between the client and the server, performing fault detection on nodes in the network connection path, and determining nodes to be repaired so as to repair the nodes to be repaired.
3. The method of claim 1, wherein the determining whether the semi-connection queue of the server overflows comprises:
acquiring overflow information of a semi-connection queue of the server;
if the overflow information is acquired, determining that the semi-connection queue of the server side overflows;
and if the overflow information is not acquired, determining that the semi-connection queue of the server side does not overflow.
4. The method of claim 3, wherein the overflow information comprises an overflow amount; increasing the queue length of the semi-connection queue according to a preset adjustment rule until the semi-connection queue of the server side does not overflow any more, comprising:
after the queue length of the semi-connection queue is increased every time, acquiring the overflow quantity of the semi-connection queue after the queue length is increased;
and comparing the overflow quantity of the semi-connection queue after the queue length is increased with the overflow quantity obtained at the previous time, and if the overflow quantity of the semi-connection queue after the queue length is increased is the same as the overflow quantity obtained at the previous time, determining that the semi-connection queue of the server side does not overflow any more.
5. The method of claim 3, wherein the obtaining overflow information of the semi-connection queue of the server comprises:
logging in the container, inputting a network information query instruction in the container to acquire the network information of the container, and querying character information in a preset format corresponding to the overflow information in the network information to serve as the overflow information.
6. The method according to claim 1, wherein said increasing the queue length of the semi-connection queue according to a preset adjustment rule comprises:
increasing the numerical value of the queue parameter of the server by N times, wherein N is a natural number, and the value of N is greater than 1; or,
increasing the numerical value of the queue parameter of the server by a preset value;
the queue parameter is used for defining the queue length of the semi-connection queue of the server.
7. The method according to any one of claims 1 to 6, wherein the performing fault detection on the nodes in the network connection path and determining the node to be repaired comprises:
sending a fault detection instruction to the client so that the client sends a preset number of detection messages to the server by using ping (Packet Internet Groper);
acquiring the number of detection messages received by each node in the network connection path;
and judging whether the number of the detection messages received by each node is the same as the preset number, and if not, determining the node as a node to be repaired.
8. The method of claim 7, wherein the detection packet includes identification information;
the obtaining the number of the detection packets received by each node in the network connection path includes:
and acquiring the number of the detection messages received by each node in the network connection path according to the identification information.
9. A network connection failure handling apparatus, the apparatus comprising:
a determining module, configured to determine whether a semi-connection queue of a server overflows in a case that establishment of a network connection between a client and the server deployed in a container fails;
the adjusting module is used for increasing the queue length of the semi-connection queue according to a preset adjusting rule until the semi-connection queue of the server does not overflow any more if the semi-connection queue of the server overflows;
the first detection module is configured to determine a network connection path between the client and the server if it is determined that the semi-connection queue of the server does not overflow, perform fault detection on a node in the network connection path, determine a node to be repaired, and repair the node to be repaired.
10. The apparatus of claim 9, further comprising:
and the second detection module is used for determining a network connection path between the client and the server if the network connection between the client and the server fails to be established under the condition that the semi-connection queue of the server does not overflow any more, performing fault detection on nodes in the network connection path, determining nodes to be repaired, and repairing the nodes to be repaired.
11. The apparatus of claim 9, wherein the determining module is specifically configured to:
acquiring overflow information of a semi-connection queue of the server;
if the overflow information is acquired, determining that the semi-connection queue of the server side overflows;
and if the overflow information is not acquired, determining that the semi-connection queue of the server side does not overflow.
12. The apparatus of claim 11, wherein the overflow information comprises an overflow amount; the adjusting module is specifically configured to:
after the queue length of the semi-connection queue is increased every time, acquiring the overflow quantity of the semi-connection queue after the queue length is increased;
and comparing the overflow quantity of the semi-connection queue after the queue length is increased with the overflow quantity obtained at the previous time, and if the overflow quantity of the semi-connection queue after the queue length is increased is the same as the overflow quantity obtained at the previous time, determining that the semi-connection queue of the server side does not overflow any more.
13. The apparatus of claim 11, wherein the determining module is specifically configured to:
logging in the container, inputting a network information query instruction in the container to acquire the network information of the container, and querying character information in a preset format corresponding to the overflow information in the network information to serve as the overflow information.
14. The apparatus of claim 9, wherein the adjustment module is specifically configured to:
increasing the numerical value of the queue parameter of the server by N times, wherein N is a natural number, and the value of N is greater than 1; or,
increasing the numerical value of the queue parameter of the server by a preset value;
the queue parameter is used for defining the queue length of the semi-connection queue of the server.
15. The apparatus according to any one of claims 9 to 14, wherein the first detection module is specifically configured to:
sending a fault detection instruction to the client so that the client sends a preset number of detection messages to the server by using ping (Packet Internet Groper);
acquiring the number of detection messages received by each node in the network connection path;
and judging whether the number of the detection messages received by each node is the same as the preset number, and if not, determining the node as a node to be repaired.
16. The apparatus according to claim 15, wherein the detection message includes identification information; the first detection module is specifically configured to:
and acquiring the number of the detection messages received by each node in the network connection path according to the identification information.
17. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any of claims 1 to 8 when executing a program stored in the memory.
18. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of the claims 1-8.
CN201910578595.1A 2019-06-28 2019-06-28 A kind of network connectivity failure processing method and processing device Pending CN110300026A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910578595.1A CN110300026A (en) 2019-06-28 2019-06-28 A kind of network connectivity failure processing method and processing device
PCT/CN2020/097989 WO2020259551A1 (en) 2019-06-28 2020-06-24 Method and apparatus for handling network connection fault

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910578595.1A CN110300026A (en) 2019-06-28 2019-06-28 A kind of network connectivity failure processing method and processing device

Publications (1)

Publication Number Publication Date
CN110300026A (en) 2019-10-01

Family

ID=68029463

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910578595.1A Pending CN110300026A (en) 2019-06-28 2019-06-28 A kind of network connectivity failure processing method and processing device

Country Status (2)

Country Link
CN (1) CN110300026A (en)
WO (1) WO2020259551A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1512749A (en) * 2002-12-27 2004-07-14 华为技术有限公司 Warning method for frequent discrete event fault
CN1674485A (en) * 2004-03-25 2005-09-28 国际商业机器公司 Method and system for dynamically provisioning computer system resources
CN101808021A (en) * 2010-04-16 2010-08-18 华为技术有限公司 Fault detection method, device and system, message statistical method and node equipment
CN104754003A (en) * 2013-12-30 2015-07-01 腾讯科技(深圳)有限公司 Data transmission method and system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107342885A (en) * 2016-05-03 2017-11-10 中兴通讯股份有限公司 Method of adjustment, device and the terminal device of terminal MTU
CN109245955B (en) * 2017-07-10 2022-12-09 阿里巴巴集团控股有限公司 Data processing method and device and server
CN110300026A (en) * 2019-06-28 2019-10-01 北京金山云网络技术有限公司 A kind of network connectivity fai_lure processing method and processing device

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020259551A1 (en) * 2019-06-28 2020-12-30 北京金山云网络技术有限公司 Method and apparatus for handling network connection fault
CN112019499A (en) * 2020-07-15 2020-12-01 上海趣蕴网络科技有限公司 Method and system for optimizing connection request in handshaking process
CN113726553A (en) * 2021-07-29 2021-11-30 浪潮电子信息产业股份有限公司 Node fault recovery method and device, electronic equipment and readable storage medium
CN114116128A (en) * 2021-11-23 2022-03-01 北京字节跳动网络技术有限公司 Method, device, equipment and storage medium for fault diagnosis of container instance
CN114116128B (en) * 2021-11-23 2023-08-08 抖音视界有限公司 Container instance fault diagnosis method, device, equipment and storage medium

Also Published As

Publication number Publication date
WO2020259551A1 (en) 2020-12-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191001