CN111163189A - IP monitoring and recycling system and method based on network name space management and control - Google Patents

IP monitoring and recycling system and method based on network name space management and control

Info

Publication number
CN111163189A
Authority
CN
China
Prior art keywords
container
network
monitoring
management
name space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010012138.9A
Other languages
Chinese (zh)
Other versions
CN111163189B (en
Inventor
徐俊杰
蓝维洲
朱晖
潘远航
胡心悦
颜开
陈齐彦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Daoke Network Technology Co Ltd
Original Assignee
Shanghai Daoke Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Daoke Network Technology Co Ltd
Priority to CN202010012138.9A
Publication of CN111163189A
Application granted
Publication of CN111163189B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/50 Address allocation
    • H04L61/5046 Resolving address allocation conflicts; Testing of addresses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/50 Address allocation
    • H04L61/5007 Internet protocol [IP] addresses
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45562 Creating, deleting, cloning virtual machine instances
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45595 Network integration; Enabling network access in virtual machine instances

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses an IP monitoring and recovery system and method based on network namespace management and control. The system comprises an IP allocation management server, an IP allocation management agent, a plurality of container groups, a container engine, a cluster API server and a database. The container groups are connected with the container engine for data interaction; the IP allocation management server is connected with the container engine, the cluster API server and the database, respectively, for data interaction; and the IP allocation management agent is connected with the container engine and the cluster API server, respectively, for monitoring and confirmation. By monitoring network namespaces, the correspondence between containers and network namespaces is obtained; after a container is deleted, network namespaces that are no longer in use are detected and their data recovered, so that the IP of a deleted container group is reliably released, is not leaked abnormally, and does not conflict with subsequently allocated IPs.

Description

IP monitoring and recycling system and method based on network name space management and control
Technical Field
The invention relates to the network management function of a Kubernetes-based container cloud platform, and in particular to an IP monitoring and recycling system and method based on network namespace management and control.
Background
The network is the foundation of a container cloud platform, and management of container group IP addresses is the basis on which applications provide services externally. In operation and maintenance, the allocation, management and monitoring of IPs affect not only the external services of the cluster but also functions such as security auditing.
In Kubernetes-based container cloud platform solutions, Kubernetes natively supports CNI networks such as MacVlan and provides default IPAM options, such as:
DHCP (Dynamic Host Configuration Protocol): as the name implies, addresses are allocated dynamically; a specific address cannot be designated, and no monitoring function is provided;
host-local: a different segment can be customized for each node by specifying the start and end IP addresses of the segment as well as the gateway, subnet and default route; both IPv4 and IPv6 are supported. host-local allocates IP addresses deterministically within the configured range, but because it is a local static configuration it cannot allocate IPs to applications that may drift dynamically, it does not support multi-VLAN scenarios, and it provides no monitoring function;
ptp (point-to-point): connects the container and the host through a veth pair and supports only subnet segment configuration. It cannot satisfy application IP allocation management, and it also lacks a monitoring function.
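By way of illustration only, the per-node static range described for host-local above can be sketched in Python as follows. The function name `allocatable_ips` and the example addresses are assumptions made for this sketch; it models only the idea of a fixed start and end address per node with the gateway excluded, and is not the actual plugin implementation.

```python
import ipaddress

def allocatable_ips(subnet, range_start, range_end, gateway):
    """Return the addresses a static per-node range would hand out.

    Hypothetical helper: models a start/end range inside a subnet,
    skipping the gateway address.
    """
    net = ipaddress.ip_network(subnet)
    start = ipaddress.ip_address(range_start)
    end = ipaddress.ip_address(range_end)
    gw = ipaddress.ip_address(gateway)
    if start not in net or end not in net or start > end:
        raise ValueError("range must lie inside the subnet and be ordered")
    count = int(end) - int(start) + 1
    # Walk the range address by address and leave out the gateway.
    return [str(start + i) for i in range(count) if start + i != gw]

# Each node gets its own fixed slice, so an application that moves to
# another node cannot keep its address.
node_a = allocatable_ips("10.1.0.0/24", "10.1.0.10", "10.1.0.13", "10.1.0.1")
node_b = allocatable_ips("10.1.0.0/24", "10.1.0.20", "10.1.0.23", "10.1.0.1")
```

Because the two slices are disjoint and configured per node, the sketch also shows why such a scheme offers no cluster-wide view of which addresses are actually in use.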
In the existing scheme, Kubernetes manages IP allocation only down to the container group (Pod) layer, while Docker is responsible for the specific network configuration and IP recovery. The cluster network controller focuses on the creation and deletion of container groups, but these operations are actually carried out by the Docker daemon on each node, and the platform-layer network controller, which sees only Pod creation and deletion events, usually cannot take Docker's running state and abnormal conditions into account. Docker can report local container start and delete events, but other information is not reported by the node's kubelet. For example, a memory OOM may unexpectedly kill some of a container's core processes (docker-containerd-shim, docker-containerd), which leaves a residual network NS behind.
Therefore, the actual creation and deletion of a container's network namespace cannot be reported to the network controller component promptly and completely, which is why IP conflicts arise at the network controller. IP conflicts are increasingly common in production environments, and as scale grows, monitoring and managing IP usage becomes extremely costly to maintain; the most common problem is incomplete IP recovery when a container group is shut down.
In a traditional IP allocation and recovery scenario, the Kubernetes controller and the network management controller are concerned only with the creation and deletion of container groups, while the network namespace (NS) of the actual container is created and managed by Docker. This separation of responsibilities leaves residual network namespaces behind and creates a risk of IP conflicts in the cluster. A management and control mechanism for network namespaces is therefore needed to ensure that container group IP addresses on the cloud platform do not conflict, so that applications run stably and the platform remains secure and controllable.
Disclosure of Invention
The invention provides an IP monitoring and recovery system based on network namespace management and control, which aims to solve the technical problem that IP addresses of container groups in a cloud platform are prone to conflict. By monitoring network namespaces, the system obtains the correspondence between containers and network namespaces; after a container is deleted, it detects network namespaces that are no longer in use and recovers their data, ensuring that the IP of the deleted container group is reliably released, is not leaked abnormally, and does not conflict with subsequently allocated IPs.
The scheme solves the IP management problem of a customer's MacVlan container cloud platform, focusing on problems caused by incorrect IP recovery, and the mechanism effectively improves operation and maintenance automation. IP management is a heavy task in production and in test and development environments: operation and maintenance personnel must accurately track IP allocation and its changes across the whole environment, and spend a large amount of time handling problems caused by confused IP management or IP conflicts. A container platform may run thousands of container groups, a quantity and scale far beyond IP management for virtual machines. Having the IPAM monitor the container network namespaces in the platform effectively ensures that platform IPs do not conflict, which increases the degree of automation of the customer's operation and maintenance, reduces manual operation, and overcomes the defects of the prior art.
The invention also provides an IP monitoring and recycling method based on network name space management and control.
In order to solve the above technical problems, the invention provides the following technical scheme: an IP monitoring and recovery system based on network namespace management and control comprises an IP allocation management server (dswitch-server), an IP allocation management agent (dswitch-agent), a plurality of container groups (Pod), a container engine (Kubelet & Docker), a cluster API server (Kubernetes API server) and a database (etcd), wherein the container groups are connected with the container engine for data interaction, the IP allocation management server is connected with the container engine, the cluster API server and the database, respectively, for data interaction, and the IP allocation management agent is connected with the container engine and the cluster API server, respectively, for monitoring and confirmation;
the container group is an operation unit and is used for applying to the IP allocation management server, through the container engine, for an IP, obtaining the IP and creating a network namespace;
the container engine is used for starting the container groups and monitoring their state, and is also used for recovering network namespaces;
the IP allocation management server is used for obtaining an allocatable IP from the database and allocating it after confirming, through the cluster API server, that no conflict exists in the cluster;
the IP allocation management agent is used for monitoring the container engine and the network namespaces, and for obtaining from the cluster API server the network namespaces that need to be recovered;
and the cluster API server is used for interacting with the database and providing the data required by the IP allocation management agent.
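As a non-authoritative sketch of the server role described above, the following Python fragment shows allocation that is confirmed against a cluster view before an IP is handed out. The class `IpAllocator`, its methods and the `in_use_in_cluster` argument are hypothetical names introduced here; the in-memory list merely stands in for the database (etcd), and the set stands in for the answer obtained through the cluster API server.

```python
class IpAllocator:
    """Toy stand-in for the IP allocation management server (assumed design)."""

    def __init__(self, pool):
        self.free = list(pool)   # stands in for allocatable IPs kept in the database
        self.leases = {}         # ip -> name of the container group holding it

    def allocate(self, pod, in_use_in_cluster):
        """Hand out the first free IP that the cluster view does not report as used."""
        for ip in self.free:
            if ip in in_use_in_cluster:
                continue         # conflict: some workload already answers on this IP
            self.free.remove(ip)
            self.leases[ip] = pod
            return ip
        raise RuntimeError("no conflict-free IP available")

    def release(self, ip):
        """Return an IP to the pool; unknown IPs are ignored."""
        pod = self.leases.pop(ip, None)
        if pod is not None:
            self.free.append(ip)
        return pod

allocator = IpAllocator(["10.0.0.2", "10.0.0.3"])
# 10.0.0.2 is free in the database but still seen in the cluster, so it is skipped.
ip_a = allocator.allocate("pod-a", in_use_in_cluster={"10.0.0.2"})
```

The point of the sketch is the ordering: the pool lookup alone is not trusted, and the allocation is only recorded once the conflict check has passed.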
In the above IP monitoring and recycling system based on network namespace management and control, the IP allocation management server is connected to an external network, and the IP allocation management server detects whether there is an IP conflict in the external network through the Address Resolution Protocol (ARP).
In the above IP monitoring and recycling system based on network namespace management and control, the IP allocation management agent, the container group, and the container engine form a node;
the number of the nodes is multiple;
the container engine in each node is respectively connected with the IP distribution management server to realize data interaction;
the IP distribution management agent in each node is respectively connected with the cluster API server for confirmation;
the IP allocation management agent in each of the nodes establishes a connection with the container engine within that node for monitoring.
The IP monitoring and recycling system based on network namespace management and control is characterized in that a plurality of container groups are provided, and each container group is connected with the container engine.
In the above IP monitoring and recycling system based on network namespace management and control, two container groups are provided.
The cluster API server is provided with an access interface, and the access interface interacts with the database.
In a second aspect, an IP monitoring and recycling method based on network namespace management and control includes the following steps:
step one, starting a plurality of container groups, wherein a container in one container group sends an IP allocation application request to an IP allocation management server through a container engine;
step two, the IP allocation management server receives the request for applying for allocating the IP and then allocates the IP after confirming that no conflict exists in the cluster through the cluster API server;
step three, the machine where the container is located creates a corresponding network name space and mounts the network name space into the container for use, and simultaneously, an IP distribution management agent of the node where the first container group is located mounts the network name space;
step four, in an abnormal state, deleting the container group and canceling the mounting, wherein the network namespace is not correctly recovered by the container engine, the IP allocation management agent cleans up the network namespace, and the IP returns to an allocatable state;
step five, other container groups send requests for applying for IP distribution to the IP distribution management server;
step six, the IP allocation management server returns the IP after receiving the request for applying for allocating the IP;
and step seven, the other container groups use the IP address to create a new network name space.
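The seven steps above can be walked through with a small Python model. This is a sketch under stated assumptions only: the class `Node`, its methods and the single-address pool are hypothetical, and the real components (container engine, agent, server) are collapsed into one object to keep the flow visible.

```python
class Node:
    """Toy model of one node: the IP pool plus the namespaces the agent has mounted."""

    def __init__(self, pool):
        self.free_ips = list(pool)
        self.pinned = {}       # pod name -> IP of the namespace the agent mounted for it
        self.running = set()   # container groups currently alive

    def start_pod(self, pod):
        # Steps one to three: request an IP, create the namespace, agent mounts it.
        ip = self.free_ips.pop(0)   # raises IndexError when the pool is exhausted
        self.pinned[pod] = ip
        self.running.add(pod)
        return ip

    def crash_pod(self, pod):
        # Step four, abnormal case: the pod is gone but nothing was cleaned up.
        self.running.discard(pod)

    def agent_sweep(self):
        # Step four, agent side: clean namespaces whose pod no longer runs.
        cleaned = []
        for pod, ip in list(self.pinned.items()):
            if pod not in self.running:
                del self.pinned[pod]
                self.free_ips.append(ip)   # the IP becomes allocatable again
                cleaned.append(pod)
        return cleaned

node = Node(["192.168.1.10"])
first_ip = node.start_pod("pod-a")
node.crash_pod("pod-a")
cleaned = node.agent_sweep()
second_ip = node.start_pod("pod-b")   # steps five to seven reuse the released IP
```

Without the sweep, the second `start_pod` call would find an empty pool, which is the leak that step four is meant to prevent.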
In the above IP monitoring and recycling method based on network namespace management and control, in the second step, the IP allocation management server detects whether an IP conflict exists in an external network through the Address Resolution Protocol (ARP).
In the foregoing IP monitoring and recovering method based on network namespace management and control, in the first step, the plurality of container groups are started by the container engine.
In the fourth step, the abnormal state includes the following scenarios:
the container exits abnormally;
the node where the container group is located fails.
In the above IP monitoring and recovery method based on network namespace management and control, the abnormal state further includes the following scenarios:
the service is updated;
the service is stopped or scaled down.
In the fourth step, apart from the abnormal state, the container group can also be deleted manually.
In a third aspect, an embodiment of the present invention further provides an IP monitoring and recycling apparatus based on network namespace management and control, including: at least one processor; a memory coupled with the at least one processor, the memory storing executable instructions, wherein the executable instructions, when executed by the at least one processor, cause the method of any of the second aspects above to be implemented.
In a fourth aspect, an embodiment of the present invention further provides a chip, configured to perform the method in the second aspect. Specifically, the chip includes: a processor for calling and running the computer program from the memory so that the device on which the chip is installed is used for executing the method of the second aspect.
In a fifth aspect, the present invention also provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method according to any one of the second aspects above.
In a sixth aspect, the present invention also provides a computer program product, which includes computer program instructions, and the computer program instructions make a computer execute the method in the second aspect.
According to the technical scheme provided by the IP monitoring and recycling system and method based on network name space management and control, the technical scheme has the following technical effects:
by monitoring network namespaces, the correspondence between containers and network namespaces is obtained; after a container is deleted, network namespaces that are no longer in use are detected and their data recovered, so that the IP of the deleted container group is reliably released, is not leaked abnormally, and does not conflict with subsequently allocated IPs;
the scheme solves the IP management problem of a customer's MacVlan container cloud platform, focusing on problems caused by incorrect IP recovery, and the mechanism effectively improves operation and maintenance automation. IP management is a heavy task in production and in test and development environments: operation and maintenance personnel must accurately track IP allocation and its changes across the whole environment, and spend a large amount of time handling problems caused by confused IP management or IP conflicts. A container platform may run thousands of container groups, a quantity and scale far beyond IP management for virtual machines. Having the IPAM monitor the container network namespaces in the platform effectively ensures that platform IPs do not conflict, which increases the degree of automation of the customer's operation and maintenance and reduces manual operation.
Drawings
FIG. 1 is a schematic structural diagram of an IP monitoring and recovery system based on network namespace management and control according to the present invention;
FIG. 2 is a schematic flow chart of an IP monitoring and recovery method based on network namespace management and control according to the present invention;
FIG. 3 is a schematic view of a management interface of an IP monitoring and recovery system based on network namespace management and control according to the present invention;
FIG. 4 is a relational diagram of a container engine and an operation unit when the IP monitoring and recovery system based on network namespace management and control is implemented;
FIG. 5 is a schematic diagram of the IP monitor and the container network NS of an IP monitoring and recycling system based on network namespace management and control according to the present invention.
Wherein the reference numbers are as follows:
IP distribution management server 101, IP distribution management agent 102, container group 103, container engine 104, cluster API server 105, database 106, external network 107, node 108.
Detailed Description
In order to make the technical means, the inventive features, the objectives and the effects of the invention easily understood and appreciated, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the specific drawings, and it is obvious that the described embodiments are a part of the embodiments of the present invention, but not all of the embodiments.
All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
It should be understood that the structures, proportions, sizes and the like shown in the drawings of this specification are only intended to accompany the content disclosed in the specification, for the understanding and reading of those skilled in the art, and are not intended to limit the conditions under which the present invention can be implemented; they therefore have no substantive technical significance, and any structural modification, change of proportion or adjustment of size that does not affect the efficacy and achievable purpose of the present invention shall still fall within the scope of the present invention.
In addition, terms such as "upper", "lower", "left", "right", "middle" and "one" used in this specification are for clarity of description only and are not intended to limit the scope of the present invention; changes to or adjustments of the relative relationships they describe are likewise to be regarded as within the scope of the present invention.
In the prior art, as shown in FIG. 4, the IPAM of the network CNI in a container platform cluster effectively reduces IP misallocation and abnormal allocation, but community CNI schemes lack a complete IP recovery capability at the Linux network namespace layer.
The network namespace is configured by Docker's container creation process, which correctly creates and reclaims the network namespace during normal container creation and deletion. However, when Docker itself behaves abnormally, for example when the Docker core process dockerd or containerd exits abnormally, the network namespace is left behind and its IP remains occupied by the historical container.
When a container process is terminated abnormally, for example by being forcibly killed, its network NS file (/proc/PID/ns/net) is reclaimed by the system. However, the dockerd process remains in an abnormal state and still holds a file descriptor (fd) for the container's network NS, so the network NS continues to exist (the macvlan network card is a special case in that its IP continues to work). Meanwhile, K8S detects that the container is not working properly and tries to return the container group IP, but the CNI binary cannot enter the container's network NS through the network NS file path, so returning the IP fails. The residual network NS then persists until the host reboots, resulting in IP conflicts or IP leakage; in severe cases, IP conflicts can lead to production incidents. Only when the IPAM takes on the role of monitoring and recovering IPs can correct IP allocation be ensured under such abnormal conditions. When Docker exits abnormally, the Docker daemon itself may leave the network namespaces used by previous containers on the host, and these residual network namespaces may cause IP conflicts and, in turn, related network failures.
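To make the role of the network NS file path concrete, the following Python sketch checks whether a process still exposes a network namespace under /proc. The helper names `parse_netns_link` and `netns_inode_of` are hypothetical; the sketch relies on the Linux convention that the symbolic link /proc/PID/ns/net resolves to a string of the form `net:[inode]`, and it returns None when the path can no longer be read, which corresponds to the case above where the path is gone after the container process has been killed.

```python
import os
import re

# On Linux the link target looks like "net:[4026531992]".
_NS_LINK = re.compile(r"^net:\[(\d+)\]$")

def parse_netns_link(target):
    """Extract the namespace inode number from a /proc/PID/ns/net link target."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError("not a network namespace link: %r" % (target,))
    return int(match.group(1))

def netns_inode_of(pid):
    """Return the network namespace inode of a process, or None if the
    path cannot be read (process gone, no permission, or not Linux)."""
    try:
        return parse_netns_link(os.readlink("/proc/%d/ns/net" % pid))
    except OSError:
        return None
```

Two processes that report the same inode share a network namespace, so a monitor could use this value as the key when mapping containers to namespaces; that use is an assumption of this sketch rather than something the text specifies.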
As shown in FIG. 5, in a complex container cloud platform, the ability to allocate IPs flexibly and rapidly also brings more IP management risk; to cope with these potential risks, a complete IP reclamation and inspection capability is needed.
At present there is no mature scheme for managing and reclaiming network namespaces. When Docker exits abnormally, or a destructive event such as a power-off restart occurs, some network namespaces are often left behind, and they can cause errors in the cluster's internal network. The most common error is that a deleted container group still occupies an IP, so that an ARP scan finds multiple MAC addresses for that IP, which poses a potential risk to the cluster.
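The symptom described above, several MAC addresses answering for one IP, can be detected from scan results with a few lines of Python. This is only a sketch: the function name `find_ip_conflicts` and the input format, a list of (IP, MAC) pairs, are assumptions, and how the ARP replies are collected is outside its scope.

```python
from collections import defaultdict

def find_ip_conflicts(arp_replies):
    """Map each IP answered by more than one MAC to the sorted list of those MACs.

    arp_replies: iterable of (ip, mac) pairs, one per ARP reply seen.
    """
    macs_by_ip = defaultdict(set)
    for ip, mac in arp_replies:
        macs_by_ip[ip].add(mac.lower())   # MAC case differs between tools
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}

replies = [
    ("10.0.0.7", "02:42:0a:00:00:07"),
    ("10.0.0.8", "02:42:0a:00:00:08"),
    ("10.0.0.7", "02:42:0A:00:00:99"),   # a residual namespace still answering
    ("10.0.0.8", "02:42:0A:00:00:08"),   # same MAC, different case: not a conflict
]
conflicts = find_ip_conflicts(replies)
```

Here only 10.0.0.7 is reported, because 10.0.0.8 is answered twice by the same interface.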
One approach is to run a network NS monitoring service that continuously watches all network namespaces created by Docker. When a network NS created by Docker is no longer monitored by Docker and is not used by any process (that is, not used by any container), the NS monitoring service enters the network NS and performs the IP cleanup.
Here a more robust design is adopted: an independent process is started, and whenever Docker creates a network namespace, this process mounts the namespace into its own management area. The process takes full responsibility for adding and deleting network namespaces. It continuously checks the state of the system's container groups against the network namespaces it currently manages, and when an unrecovered network namespace appears, it clears the abnormal namespace. This eliminates the IP conflicts introduced by Docker and thereby the possibility of IP conflicts inside the container platform.
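One pass of the check performed by such a monitoring process can be sketched in Python as a set comparison. The names `residual_namespaces` and `sweep`, the use of plain string identifiers for namespaces, and the `clear` callback are all assumptions of this sketch; in a real agent the callback would enter the namespace and remove its IP configuration.

```python
def residual_namespaces(managed, known_to_docker, used_by_process):
    """Namespaces the monitor holds that Docker no longer tracks and no process uses."""
    return sorted(
        ns for ns in managed
        if ns not in known_to_docker and ns not in used_by_process
    )

def sweep(managed, known_to_docker, used_by_process, clear):
    """Run one check pass and call `clear` for every residual namespace."""
    cleared = residual_namespaces(managed, known_to_docker, used_by_process)
    for ns in cleared:
        clear(ns)             # placeholder for entering the namespace and clearing its IP
        managed.discard(ns)   # stop managing the namespace once it is cleaned
    return cleared

managed = {"ns-1", "ns-2", "ns-3"}
log = []
cleared = sweep(managed, known_to_docker={"ns-1"}, used_by_process={"ns-2"}, clear=log.append)
```

A namespace is cleared only when both conditions hold, so one that Docker has lost track of but that is still used by a live process is left untouched.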
The first embodiment of the invention provides an IP monitoring and recovery system and method based on network namespace management and control. Its purpose is to obtain the correspondence between containers and network namespaces by monitoring the network namespaces and, after a container is deleted, to detect network namespaces that are no longer in use and recover their data, so that the IP of the deleted container group is reliably released, is not leaked abnormally, and does not conflict with subsequently allocated IPs.
The scheme solves the IP management problem of a customer's MacVlan container cloud platform, focusing on problems caused by incorrect IP recovery, and the mechanism effectively improves operation and maintenance automation. IP management is a heavy task in production and in test and development environments: operation and maintenance personnel must accurately track IP allocation and its changes across the whole environment, and spend a large amount of time handling problems caused by confused IP management or IP conflicts. A container platform may run thousands of container groups, a quantity and scale far beyond IP management for virtual machines. Having the IPAM monitor the container network namespaces in the platform effectively ensures that platform IPs do not conflict, which increases the degree of automation of the customer's operation and maintenance and reduces manual operation.
First aspect, as shown in FIG. 1, first embodiment:
the embodiment provides an IP monitoring and recovery system based on network namespace management and control, wherein the system comprises an IP allocation management server 101 (dswitch-server), an IP allocation management agent 102 (dswitch-agent), a plurality of container groups 103 (Pod), a container engine 104 (Kubelet & Docker), a cluster API server 105 (Kubernetes API server) and a database 106 (etcd), the container groups 103 are connected with the container engine 104 for data interaction, the IP allocation management server 101 is connected with the container engine 104, the cluster API server 105 and the database 106, respectively, for data interaction, and the IP allocation management agent 102 is connected with the container engine 104 and the cluster API server 105, respectively, for monitoring and confirmation;
the container group 103 is an operation unit, and is used for applying to the IP allocation management server 101, through the container engine 104, for an IP, obtaining the IP, and creating a network namespace;
the container engine 104 is used for starting the container groups 103 and monitoring their state, and is also used for recovering network namespaces;
the IP allocation management server 101 is used for obtaining an allocatable IP from the database 106 and allocating it after confirming, through the cluster API server 105, that no conflict exists in the cluster;
the IP allocation management agent 102 is used for monitoring the container engine 104 and the network namespaces, and for obtaining from the cluster API server 105 the network namespaces to be recovered;
the cluster API server 105 is used for interacting with the database 106 and providing the data required by the IP allocation management agent 102, and also serves as an access interface.
The IP allocation management server 101 is connected to the external network 107, and the IP allocation management server 101 detects whether there is an IP conflict in the external network 107 through the Address Resolution Protocol (ARP).
Wherein, the IP distribution management agent 102, the container group 103, and the container engine 104 form a node 108;
the node 108 is provided in plurality;
the container engine 104 in each node 108 respectively establishes connection with the IP distribution management server 101 to realize data interaction;
the IP allocation management agent 102 in each node 108 establishes a connection with the cluster API server 105 for confirmation;
the IP allocation management agent 102 in each node 108 establishes a connection with the container engine 104 within that node 108 for monitoring;
the IP allocation management agent 102 monitors a single node 108, while the IP allocation management server 101 manages the nodes 108; monitoring and automatically releasing network namespaces keeps the information held by the two consistent.
Wherein, a plurality of container groups 103 are provided, and each container group 103 is connected with the container engine 104.
Two of the container groups 103 are provided.
In a second aspect, as shown in Fig. 2, the second embodiment is as follows:
the IP monitoring and recovery method based on network namespace management and control provided by this embodiment includes the following steps:
step one, starting a plurality of container groups 103, wherein a container in one container group 103 sends a request for IP allocation to the IP allocation management server 101 through the container engine 104;
step two, after receiving the request for IP allocation, the IP allocation management server 101 allocates an IP after confirming, through the cluster API server 105, that no conflict exists in the cluster;
step three, the machine where the container is located creates a corresponding network namespace and mounts the network namespace into the container for use, and meanwhile the IP allocation management agent 102 of the node where the first container group is located mounts the network namespace;
step four, in an abnormal state, the container group 103 is deleted and the mount is cancelled; where the network namespace is not correctly recovered by the container engine 104, the IP allocation management agent 102 cleans up the network namespace, and the IP returns to an allocable state;
step five, another container group 103 sends a request for IP allocation to the IP allocation management server 101;
step six, the IP allocation management server 101 returns the IP after receiving the request for IP allocation;
step seven, the other container group 103 creates a new network namespace using the IP address.
In step two, the IP allocation management server 101 detects whether there is an IP conflict in the external network through the Address Resolution Protocol (ARP).
In step one, the plurality of container groups 103 are each started by the container engine 104.
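Steps one through seven can be walked through with a small in-memory model. The sketch below is only illustrative: the classes `IpamServer` and `Node`, their methods, and the pod names are hypothetical and stand in for the server 101, a node 108 with its agent 102, and container groups 103. It shows that an IP whose namespace was leaked stays unavailable until the agent cleans up, after which the same IP is handed to another container group.

```python
from typing import Dict, List, Optional, Set


class IpamServer:
    """Hypothetical stand-in for the IP allocation management server."""

    def __init__(self, pool: Set[str]) -> None:
        self.free: Set[str] = set(pool)
        self.allocated: Dict[str, str] = {}  # ip -> container group name

    def request_ip(self, group: str) -> Optional[str]:
        if not self.free:
            return None
        ip = min(self.free)  # deterministic choice for the example
        self.free.remove(ip)
        self.allocated[ip] = group
        return ip

    def release_ip(self, ip: str) -> None:
        if self.allocated.pop(ip, None) is not None:
            self.free.add(ip)


class Node:
    """Hypothetical node: container engine plus local agent bookkeeping."""

    def __init__(self, server: IpamServer) -> None:
        self.server = server
        self.namespaces: Dict[str, str] = {}  # group name -> ip of its namespace
        self.running: Set[str] = set()

    def start_group(self, group: str) -> str:
        ip = self.server.request_ip(group)  # steps one and two
        if ip is None:
            raise RuntimeError("no allocable IP")
        self.namespaces[group] = ip  # step three: namespace created and mounted
        self.running.add(group)
        return ip

    def delete_group_abnormally(self, group: str) -> None:
        # Step four, failure case: the group is gone but its namespace is left.
        self.running.discard(group)

    def agent_cleanup(self) -> List[str]:
        # Step four, agent side: clean namespaces with no running group.
        leaked = [g for g in self.namespaces if g not in self.running]
        for group in leaked:
            self.server.release_ip(self.namespaces.pop(group))
        return leaked


server = IpamServer({"10.0.0.10"})
node = Node(server)
first_ip = node.start_group("pod-a")
node.delete_group_abnormally("pod-a")
blocked = server.request_ip("pod-b")   # IP is still leaked at this point
leaked = node.agent_cleanup()
second_ip = node.start_group("pod-b")  # steps five to seven
print(first_ip, blocked, leaked, second_ip)
```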
In the fourth step, the abnormal state includes the following scenes:
when the container exits abnormally;
when the node 108 where the container group 103 is located fails, the container group 103 of that node 108 may be rescheduled to another node 108 (if the service is configured with a fixed-node policy, the container group 103 will not be rescheduled), and after the node 108 recovers, the rescheduled container group 103 may be deleted.
Wherein, the abnormal state also comprises the following scenes:
when the service is updated, the old version of the container group 103 is deleted;
when the service stops or is scaled down, the container group 103 of the service is deleted.
In step four, besides occurring in the abnormal state, deletion of the container group 103 can also be performed manually.
During deletion of the container group 103, an IP recovery interface is called, and the IP allocation management server 101 ensures, where necessary, that the network namespace of the IP is completely recovered. Even if the recovery call involving the cluster API server 105 fails, the service started by the local IP allocation management agent 102 periodically checks the network namespaces of the current host against the currently used IPs obtained from the cluster API server 105, recovers abandoned IPs in time, and makes them available for allocation to new containers later;
the specific method for recovering the network namespace of the IP is as follows: each network namespace corresponds to a network device in the Linux system and, in a container environment, to a virtual network device on a one-to-one basis; a container uses a given network namespace by having the corresponding network device mounted into the container;
the network namespace is monitored as follows: after the network device is mounted into the container, the network monitoring component (the IP allocation management agent 102) also mounts the network device to the path of the network device;
the network namespace is recovered by cancelling the mount of the corresponding network device.
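The periodic check described above can be sketched as two small functions: one that lists namespace mounts from a Linux mount table, and one that flags those whose IP is no longer reported as in use. This is a sketch under stated assumptions. It relies on the Linux convention that bind-mounted namespaces appear with filesystem type `nsfs` in `/proc/mounts`; the mount paths, the `ns_to_ip` mapping (a record the agent is assumed to keep), and the function names are hypothetical, and the actual unmount call is left out.

```python
from typing import Dict, List, Set


def list_netns_mounts(mount_table: str) -> List[str]:
    """Return mount points whose filesystem type is nsfs."""
    points = []
    for line in mount_table.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "nsfs":
            points.append(fields[1])
    return points


def find_abandoned(
    mounts: List[str], ns_to_ip: Dict[str, str], active_ips: Set[str]
) -> Dict[str, str]:
    """Map each mounted namespace whose IP is no longer in use to that IP."""
    return {
        m: ns_to_ip[m]
        for m in mounts
        if m in ns_to_ip and ns_to_ip[m] not in active_ips
    }


# Hypothetical mount table text; on a real host it would be read from /proc/mounts.
sample = "\n".join([
    "proc /proc proc rw,nosuid 0 0",
    "nsfs /run/netns/pod-a nsfs rw 0 0",
    "nsfs /run/netns/pod-b nsfs rw 0 0",
])
ns_to_ip = {"/run/netns/pod-a": "10.0.0.10", "/run/netns/pod-b": "10.0.0.11"}
active = {"10.0.0.11"}  # IPs the cluster API server reports as currently used

mounts = list_netns_mounts(sample)
abandoned = find_abandoned(mounts, ns_to_ip, active)
print(abandoned)
```

Each entry in `abandoned` names a mount that the agent would unmount and an IP that would be returned to the allocable pool.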
As shown in Fig. 3, the above implementation steps have been productized in DaoCloud Enterprise (the container cloud platform product of Shanghai Daoke Network Technology Co., Ltd.). The product is developed in Python, and the following information is obtained through the Python Kubernetes library and the node 108 information reported by each node 108: the allocable network status of the node 108; and the IP address segment assignable to the tenant;
during deletion of the container group 103, an IP recycling interface is called, and the IPAM server ensures, where necessary, that the network namespace of the IP is completely recycled. Even if the call to the recycling API fails, the service started by the local Agent periodically checks the network namespaces of the current host against the currently used IPs obtained from the Kubernetes API, recycles discarded IPs in time, and makes them available for allocation to new containers later.
In a third aspect, the present invention further provides an IP monitoring and recycling apparatus based on network namespace management and control, including:
at least one processor; a memory coupled to the at least one processor, the memory storing executable instructions, wherein the executable instructions, when executed by the at least one processor, cause the method of the second aspect of the invention to be carried out.
This embodiment provides an IP monitoring and recycling apparatus based on network namespace management and control, which includes: at least one processor; and a memory coupled to the at least one processor. The processor and the memory may be provided separately or may be integrated together.
For example, the memory may include random access memory, flash memory, read-only memory, programmable read-only memory, non-volatile memory, registers, and the like. The processor may be a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), or the like. The memory may store executable instructions. The processor may execute the executable instructions stored in the memory to implement the various processes described herein.
It will be appreciated that the memory in this embodiment can be either volatile memory or non-volatile memory, or can include both volatile and non-volatile memory. The non-volatile memory may be a ROM (read-only memory), a PROM (programmable read-only memory), an EPROM (erasable programmable read-only memory), an EEPROM (electrically erasable programmable read-only memory), or a flash memory. The volatile memory may be a RAM (random access memory), which serves as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as SRAM (static RAM, static random access memory), DRAM (dynamic RAM, dynamic random access memory), SDRAM (synchronous DRAM, synchronous dynamic random access memory), DDR SDRAM (double data rate SDRAM, double data rate synchronous dynamic random access memory), ESDRAM (enhanced SDRAM, enhanced synchronous dynamic random access memory), SLDRAM (Synchlink DRAM, synchronous link dynamic random access memory), and DRRAM (direct Rambus RAM, direct Rambus random access memory). The memory described herein is intended to comprise, without being limited to, these and any other suitable types of memory.
In some embodiments, the memory stores the following elements, upgrade packages, executable units, or data structures, or a subset or an extended set thereof: an operating system and application programs.
The operating system includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, and is used for implementing various basic services and processing hardware-based tasks. The application programs comprise various application programs and are used for realizing various application services. The program for implementing the method of the embodiment of the present invention may be included in the application program.
In an embodiment of the present invention, the processor is configured to execute the method steps provided in the second aspect by calling a program or an instruction stored in the memory, specifically, a program or an instruction stored in the application program.
In a fourth aspect, an embodiment of the present invention further provides a chip, configured to perform the method in the second aspect. Specifically, the chip includes: a processor for calling and running the computer program from the memory so that the device on which the chip is installed is used for executing the method of the second aspect.
Furthermore, in a fifth aspect, the present invention also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of the second aspect of the present invention.
For example, the machine-readable storage medium may include, but is not limited to, various known and unknown types of non-volatile memory.
In a sixth aspect, the present invention also provides a computer program product, which includes computer program instructions, and the computer program instructions make a computer execute the method in the second aspect.
Those of skill in the art would understand that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments of the present application, the disclosed system, apparatus and method may be implemented in other ways. For example, the division of the unit is only one logic function division, and there may be another division manner in actual implementation. For example, multiple units or components may be combined or may be integrated into another system. In addition, the coupling between the respective units may be direct coupling or indirect coupling. In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or may exist separately and physically.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the processes do not mean the execution sequence, and the execution sequence of the processes should be determined by the functions and the inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a machine-readable storage medium. Therefore, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a machine-readable storage medium and may include several instructions to cause an electronic device to perform all or part of the processes of the technical solution described in the embodiments of the present application. The storage medium may include various media that can store program codes, such as ROM, RAM, a removable disk, a hard disk, a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present application, and the present invention is described in detail by the specific examples, but these are not to be construed as limitations of the present invention, and the scope of the present application is not limited thereto. Those skilled in the art can make changes or substitutions within the technical scope disclosed in the present application, and such changes or substitutions should be considered to be within the protective scope of the present application.
In summary, with the IP monitoring and recovery system and method based on network namespace management and control, the correspondence between containers and network namespaces can be obtained by monitoring the network namespaces, and after a container is deleted, network namespaces that are no longer in use can be detected and recovered. This ensures that the IP of a deleted container group is reliably released and not leaked, and that a deleted container does not conflict with a subsequently allocated IP.
The IP management method and system solve the IP management problem of customers' MacVlan container cloud platforms, with a particular focus on problems caused by incorrect IP recovery, and the mechanism effectively improves operation and maintenance automation. IP management is a heavy task in production and in test and development environments: operation and maintenance personnel must accurately track IP allocation and changes across the whole environment, and spend a large amount of time handling problems caused by IP management confusion or IP conflicts. A container platform may run thousands of container groups, far exceeding the number and scale involved in IP management for virtual machines. IPAM monitoring of the container network namespaces in the platform effectively prevents IP conflicts in the platform, which increases the automation of customers' operation and maintenance and reduces manual operation.

Claims (16)

1. An IP monitoring and recovery system based on network namespace management and control, characterized by comprising an IP allocation management server, an IP allocation management agent, a plurality of container groups, a container engine, a cluster API server and a database; the container group and the container engine establish a connection to realize data interaction, the IP allocation management server establishes connections with the container engine, the cluster API server and the database, respectively, to realize data interaction, and the IP allocation management agent establishes connections with the container engine and the cluster API server, respectively, for monitoring and confirmation;
the container group is an operation unit and is used for applying to the IP allocation management server, through the container engine, for an IP, acquiring the IP and creating a network namespace;
the container engine is used for starting the container group and monitoring its state, and is also used for recovering the network namespace;
the IP allocation management server is used for acquiring an allocable IP from the database and allocating the IP after confirming, through the cluster API server, that no conflict exists in the cluster;
the IP allocation management agent is used for monitoring the container engine and the network namespace, and is also used for acquiring, from the cluster API server, the network namespaces that need to be recovered;
and the cluster API server is used for interacting with the database and providing data required by the IP allocation management agent.
2. The IP monitoring and recovery system based on network namespace management and control as claimed in claim 1, wherein the IP allocation management server is connected to an external network, and the IP allocation management server detects whether there is an IP conflict in the external network through an address resolution protocol.
3. The IP monitoring and recovery system based on network namespace management and control as claimed in claim 1 or 2, wherein the IP allocation management agent, the container group, and the container engine constitute one node;
a plurality of the nodes are provided;
the container engine in each node establishes a connection with the IP allocation management server to realize data interaction;
the IP allocation management agent in each node establishes a connection with the cluster API server for confirmation;
the IP allocation management agent in each node establishes a connection with the container engine within that node for monitoring.
4. The IP monitoring and recovery system based on network namespace management and control as claimed in claim 3, wherein each of the container groups is connected to the container engine.
5. The IP monitoring and recovery system based on network namespace management and control as claimed in claim 4, wherein two container groups are provided.
6. The IP monitoring and recovery system based on network namespace management and control as claimed in claim 1, wherein the cluster API server is provided with an access interface, and interacts with the database to provide data required by the IP allocation management agent.
7. An IP monitoring and recycling method based on network namespace management and control is characterized by comprising the following steps:
step one, starting a plurality of container groups, wherein a container in one container group sends a request for IP allocation to an IP allocation management server through a container engine;
step two, after receiving the request for IP allocation, the IP allocation management server allocates an IP after confirming, through the cluster API server, that no conflict exists in the cluster;
step three, the machine where the container is located creates a corresponding network namespace and mounts the network namespace into the container for use, and meanwhile an IP allocation management agent of the node where the first container group is located mounts the network namespace;
step four, in an abnormal state, the container group is deleted and the mount is cancelled; where the network namespace is not correctly recovered by the container engine, the IP allocation management agent cleans up the network namespace, and the IP returns to an allocable state;
step five, another container group sends a request for IP allocation to the IP allocation management server;
step six, the IP allocation management server returns the IP after receiving the request for IP allocation;
and step seven, the other container group uses the IP address to create a new network namespace.
8. The method as claimed in claim 7, wherein in step two, the IP allocation management server detects whether there is an IP conflict in the external network through an address resolution protocol.
9. The method as claimed in claim 8, wherein in step one, the plurality of container groups are started by the container engine.
10. The IP monitoring and recycling method based on network namespace management and control as claimed in claim 7, 8 or 9, wherein in step four, the abnormal state includes the following scenarios:
when the container exits abnormally;
when the node where the container group is located fails.
11. The IP monitoring and recycling method based on network namespace management and control as claimed in claim 10, wherein the abnormal state further includes the following scenarios:
when the service is updated;
when the service stops or is scaled down.
12. The IP monitoring and recycling method based on network namespace management and control as claimed in claim 7, 8 or 9, wherein in step four, besides occurring in the abnormal state, deletion of the container group can also be performed manually.
13. An IP monitoring and recycling apparatus based on network namespace management and control, characterized by comprising:
at least one processor;
a memory coupled with the at least one processor, the memory storing executable instructions, wherein the executable instructions, when executed by the at least one processor, cause the method of any of claims 7 to 12 to be implemented.
14. A chip, comprising: a processor for calling and running a computer program from a memory, so that a device in which the chip is installed performs the method of any one of claims 7 to 12.
15. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out the steps of the method according to any one of claims 7 to 12.
16. A computer program product comprising computer program instructions for causing a computer to perform the method of any one of claims 7 to 12.
CN202010012138.9A 2020-01-07 2020-01-07 IP monitoring and recycling system and method based on network name space management and control Active CN111163189B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010012138.9A CN111163189B (en) 2020-01-07 2020-01-07 IP monitoring and recycling system and method based on network name space management and control

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010012138.9A CN111163189B (en) 2020-01-07 2020-01-07 IP monitoring and recycling system and method based on network name space management and control

Publications (2)

Publication Number Publication Date
CN111163189A true CN111163189A (en) 2020-05-15
CN111163189B CN111163189B (en) 2020-09-15

Family

ID=70561615

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010012138.9A Active CN111163189B (en) 2020-01-07 2020-01-07 IP monitoring and recycling system and method based on network name space management and control

Country Status (1)

Country Link
CN (1) CN111163189B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107783837A (en) * 2016-08-31 2018-03-09 阿里巴巴集团控股有限公司 A kind of method, apparatus and electronic equipment for carrying out storing extension
CN108958794A (en) * 2017-05-23 2018-12-07 深圳先进技术研究院 A kind of Docker host, cloud robot system and its construction method based on Docker
US20190227840A1 (en) * 2018-01-22 2019-07-25 International Business Machines Corporation System and method for in-process namespace switching
CN108989091A (en) * 2018-06-22 2018-12-11 杭州才云科技有限公司 Based on the tenant network partition method of Kubernetes network, storage medium, electronic equipment
CN109743261A (en) * 2019-01-07 2019-05-10 中国人民解放军国防科技大学 SDN-based container network resource scheduling method
CN110413437A (en) * 2019-07-26 2019-11-05 济南浪潮数据技术有限公司 Network namespace abnormality eliminating method, device, equipment and readable storage medium storing program for executing
CN110519361A (en) * 2019-08-22 2019-11-29 北京宝兰德软件股份有限公司 Container cloud platform multi-tenant construction method and device based on kubernetes

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈刚等: ""基于Linux 名字空间的Web 服务器动态防御方法"", 《计算机应用》 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112231061A (en) * 2020-10-22 2021-01-15 浪潮云信息技术股份公司 Method for running cloud-native container
CN112231061B (en) * 2020-10-22 2023-01-20 浪潮云信息技术股份公司 Method for running cloud-native container
CN112650553A (en) * 2020-12-09 2021-04-13 湖南麒麟信安科技股份有限公司 Universal container management method and system
CN112650553B (en) * 2020-12-09 2023-07-14 湖南麒麟信安科技股份有限公司 Universal container management method and system
CN113760448A (en) * 2021-04-30 2021-12-07 中科天玑数据科技股份有限公司 Big data management platform based on kubernets
CN113438295A (en) * 2021-06-22 2021-09-24 康键信息技术(深圳)有限公司 Container group address allocation method, device, equipment and storage medium
CN114938375A (en) * 2022-05-16 2022-08-23 聚好看科技股份有限公司 Container group updating equipment and container group updating method
CN114938375B (en) * 2022-05-16 2023-06-02 聚好看科技股份有限公司 Container group updating equipment and container group updating method
CN115225612A (en) * 2022-06-29 2022-10-21 济南浪潮数据技术有限公司 Management method, device, equipment and medium for K8S cluster reserved IP
CN115225612B (en) * 2022-06-29 2023-11-14 济南浪潮数据技术有限公司 Management method, device, equipment and medium for K8S cluster reserved IP
CN115473766A (en) * 2022-08-22 2022-12-13 苏州思萃工业互联网技术研究所有限公司 Method and system for realizing vip based on distributed gateway
CN115473766B (en) * 2022-08-22 2024-01-26 苏州思萃工业互联网技术研究所有限公司 Vip implementation method and system based on distributed gateway

Also Published As

Publication number Publication date
CN111163189B (en) 2020-09-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: 200433 floor 7, building 6, No. 99, jiangwancheng Road, Yangpu District, Shanghai

Patentee after: Shanghai Daoke Network Technology Co.,Ltd.

Address before: Room 1305-12, No.6 Weide Road, Yangpu District, Shanghai 200433

Patentee before: Shanghai Daoke Network Technology Co.,Ltd.
