CN114866416A - Multi-cluster unified management system and deployment method - Google Patents

Publication number
CN114866416A
Authority
CN
China
Prior art keywords
module
management
cluster
service
service module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202210410910.1A
Other languages
Chinese (zh)
Inventor
陈曦
王超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202210410910.1A priority Critical patent/CN114866416A/en
Publication of CN114866416A publication Critical patent/CN114866416A/en
Withdrawn legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0803 Configuration setting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0893 Assignment of logical groups to network elements

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed are a multi-cluster unified management system and a deployment method thereof. Each service module in the integrated management module, the data sharing module and the user interface module is deployed through the Docker service, and each service module in the inference service orchestration modules of the plurality of clusters is deployed through the Kubernetes service, so that the modules are isolated from one another. The integrated management module receives a service request, sends it to a target cluster for processing and then returns the processing result; meanwhile, clusters can be added or removed, or nodes within a cluster can be added or removed, as needed to improve processing efficiency. Data interaction among the clusters takes place through the data sharing module, which is deployed in isolation, so that the data interaction is secure.

Description

Multi-cluster unified management system and deployment method
Technical Field
The invention relates to the technical field of cluster management, in particular to a multi-cluster unified management system and a deployment method.
Background
With the rapid development of the cloud-native field, more and more enterprises, customers and organizations are migrating their artificial intelligence inference infrastructure onto Kubernetes. However, Kubernetes is a single-cluster management scheme. It supports namespaces for soft isolation (meeting the needs of multi-tenant management and data flow in different virtual isolation scenarios), but it cannot by itself provide data interaction between multiple physical clusters or unified management of them. As a result, an enterprise or organization usually needs a large number of operation and maintenance personnel to maintain cluster resources and data through independently deployed management systems, and to move data between clusters by means of large-capacity storage devices, which cannot meet the high requirements on production efficiency and information security in service.
To solve these problems, the invention provides a multi-cluster unified management system and a deployment method built on Docker and Kubernetes as the core supporting technologies. It ensures the timeliness, security, isolation and high availability of multi-cluster data management, ensures that both the number of clusters and the nodes/computing power within a cluster can be scaled, and keeps interference with the configuration and operation of the original independent clusters to a minimum.
Disclosure of Invention
To solve the problems in the prior art, the invention provides a multi-cluster unified management system and a deployment method thereof.
In a first aspect, the present application provides a multi-cluster unified management system, which is characterized by comprising an integrated management module, a data sharing module, and a plurality of clusters, wherein the integrated management module, the data sharing module, and the plurality of clusters are respectively deployed on different network nodes, and the integrated management module is deployed on a network center node;
the integrated management module comprises a node management service module and a multi-cluster management service module, wherein the node management service module is used for carrying out node service management on nodes in a cluster according to a management request of a user, the multi-cluster management service module is used for carrying out cluster service management on a plurality of clusters according to the management request of the user, the integrated management module is also used for transmitting static and/or dynamic data generated in the node service management and the cluster service management to the data sharing module, and the integrated management module is deployed through a docker;
the data sharing module comprises at least one database, and is used for storing static and/or dynamic data generated by the integrated management module and the plurality of clusters, wherein the database is deployed through docker;
the plurality of clusters store respective cluster data in the data sharing module according to each cluster identifier and/or acquire cluster data from the data sharing module according to each cluster identifier, and the plurality of clusters are deployed through Kubernetes.
Preferably, the multi-cluster unified management system further includes a user interface module, where the user interface module is configured to receive a management request of a user for the multiple clusters, and transmit the management request to the integrated management module, where the user interface module is deployed through a docker.
Preferably, the integrated management module further includes an independent gateway service module, configured to forward the service request to each service module outside the cluster, or forward the service request to a target service module of the target cluster according to the cluster name.
Preferably, the integrated management module further comprises a user management service module and/or an authentication management service module and/or a log management service module deployed through docker, wherein the user management service module and/or the authentication management service module are/is configured to be a service module of the integrated management system;
the user management service module is used for managing users/user groups so as to realize the management of the operation authority of the cluster and/or the node;
the authentication management service module authenticates the user operation in a mode of distributing and verifying the token code;
the log management service module is used for reading data from the data sharing module and displaying the data according to log attributes, wherein the log attributes comprise at least one of the following items: cluster identification, operation time and operation users.
Preferably, each of the plurality of clusters comprises an intra-cluster inference service orchestration module, and the intra-cluster inference service orchestration module comprises a service deployment service module, a monitoring management service module, a mirror image management service module and a model management service module, wherein the service deployment service module, the monitoring management service module, the mirror image management service module and the model management service module are respectively connected with the plurality of clusters;
the service deployment service module is used for supporting the deployment of an inference model and an inference mirror image through a Kubernetes component;
the monitoring management service module is used for monitoring real-time/historical usage information of hardware resources in the cluster;
the mirror image management service module is used for storing, distributing and/or managing the inference mirror image by means of a mirror image warehouse;
and the model management service module is used for storing, distributing and/or managing the inference model by virtue of the file warehouse.
Preferably, the intra-cluster inference service orchestration module further comprises a data storage migration module and a communication service injection module, wherein the data storage migration module is connected with the communication service injection module;
the data storage migration module is used for storing cluster data into the database in the data sharing module that holds cluster data;
and the communication service injection module is used for requesting and acquiring, from the multi-cluster management service module, the information about services outside the cluster that services in the cluster require.
In a second aspect, the present application further provides a method for deploying a multi-cluster unified management system, including:
deploying an integrated management module through a docker, wherein the integrated management module comprises a node management service module and a multi-cluster management service module, the node management service module is used for carrying out node service management on nodes in a cluster according to a management request of a user, the multi-cluster management service module is used for carrying out cluster service management on a plurality of clusters according to the management request of the user, and the integrated management module is also used for transmitting static and/or dynamic data generated in the cluster service management and the node service management to a data sharing module;
deploying a data sharing module through a docker, wherein the data sharing module comprises at least one database and is used for storing static and/or dynamic data generated by the integrated management module and the plurality of clusters;
deploying a plurality of clusters through Kubernetes, wherein the plurality of clusters store respective cluster data in a data sharing module according to each cluster identifier and/or acquire cluster data from the data sharing module according to each cluster identifier;
the integrated management module, the data sharing module and the plurality of clusters are respectively deployed on different network nodes, and the integrated management module is deployed on a network center node.
Preferably, the method further comprises:
and deploying a user interface module through the docker, wherein the user interface module is used for receiving a management request of a user for the plurality of clusters and transmitting the management request to the integrated management module.
Preferably, the method further comprises:
and deploying an independent gateway service module through the docker, wherein the independent gateway service module is used for forwarding the service request to each service module outside the cluster, or forwarding the service request to a target service module of a target cluster according to the cluster name.
Preferably, the method further comprises:
deploying a user management service module and/or an authentication management service module and/or a log management service module through a docker, wherein the user management service module and/or the authentication management service module are/is deployed;
the user management service module is used for managing users/user groups so as to realize the management of the operation authority of the cluster and/or the node;
the authentication management service module authenticates the user operation in a mode of distributing and verifying the token code;
the log management service module is used for reading data from the data sharing module and displaying the data according to log attributes, wherein the log attributes comprise at least one of the following items: cluster identification, operation time and operation users.
The technical scheme provided by the invention has the beneficial effects that:
according to the technical scheme, each service module in the integrated management module, the data sharing module and the user interface module is deployed through the docker service, and each service module in the inference service orchestration modules of the plurality of clusters is deployed through the Kubernetes service, so that the modules are isolated from one another; the integrated management module receives a service request, sends it to the target cluster for processing and then returns the processing result, and meanwhile clusters can be added or removed, or nodes within a cluster can be added or removed, as needed to improve processing efficiency; data interaction among the clusters takes place through the data sharing module, which is deployed in isolation, so that the data interaction is secure.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is an architecture diagram of a multi-cluster unified management system according to an embodiment of the present invention;
Fig. 2 is a flowchart of a service request processing method according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In a first aspect, the present application discloses a multi-cluster unified management system, as shown in Fig. 1, including an integrated management module, a data sharing module, and a plurality of clusters, where the integrated management module, the data sharing module, and the plurality of clusters are respectively deployed on different network nodes, and the integrated management module is deployed on a network center node.
It should be noted that the integrated management module needs to be deployed on a central node of a network to manage the multiple clusters and the data sharing module.
The integrated management module comprises a node management service module and a multi-cluster management service module, wherein the node management service module is used for carrying out node service management on nodes in a cluster according to a management request of a user, the multi-cluster management service module is used for carrying out cluster service management on a plurality of clusters according to the management request of the user, the integrated management module is also used for transmitting static and/or dynamic data generated in the node service management and the cluster service management to the data sharing module, and the integrated management module is deployed through a docker;
the data sharing module comprises at least one database, and is used for storing static and/or dynamic data generated by the integrated management module and the plurality of clusters, wherein the database is deployed through docker;
the plurality of clusters store respective cluster data in the data sharing module according to each cluster identifier and/or acquire cluster data from the data sharing module according to each cluster identifier, and the plurality of clusters are deployed through Kubernetes (k8s).
The node management service module is configured to perform node service management on nodes in a cluster according to a management request of a user, which specifically includes: adding nodes to and deleting nodes from the cluster; providing an overview of the information of all nodes in the cluster; providing detailed information of a node; and providing the taint, label and annotation information of the nodes in the cluster.
Nodes can be added in the following ways:
Manually adding a node: before a node is added manually, kubelet, kube-proxy and the other Kubernetes (k8s) components must be installed on the node to be added. The kubeadm interface of the master node of the target cluster is called to obtain a join command, and the user then executes that command on the node to be added to complete the addition.
Automatically adding a node: before a node is added automatically, an operating system must be installed on the node to be added. After the IP address of the node and the password of its root account are obtained, the drivers, docker, nvidia-docker, kubelet, kube-proxy and other components are configured and installed automatically and remotely according to the node type, and the node is joined to the cluster in the node role.
So that node information can be understood and stored, after a node is added the monitoring management service module in the cluster is requested to obtain the static attribute information of the node, and this information is stored in a database of the data sharing module. The static attribute information of a node object includes the node name, cluster name, node role, node state, BMC (baseboard management controller) address, CPU (central processing unit) architecture, operating system version, operating system kernel version, container runtime version, node description, creation time and the like. After the node is added, the label, taint and annotation adding function is also requested, so that all label, taint and annotation information of the node is brought under management.
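The static attribute record described above could be modelled roughly as follows. This is a minimal sketch, not the patented implementation: the field names, the `node_table` dictionary standing in for the database of the data sharing module, and all sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, asdict
from typing import Dict, Tuple


@dataclass(frozen=True)
class NodeRecord:
    """Static attributes recorded once a node has joined a cluster."""
    node_name: str
    cluster_name: str
    role: str
    state: str
    bmc_address: str
    cpu_arch: str
    os_version: str
    kernel_version: str
    container_runtime_version: str
    description: str
    created_at: str


# Hypothetical stand-in for the database in the data sharing module,
# keyed by (cluster name, node name).
node_table: Dict[Tuple[str, str], dict] = {}


def store_node_record(record: NodeRecord) -> None:
    """Persist the static attributes of a newly added node."""
    node_table[(record.cluster_name, record.node_name)] = asdict(record)


# Illustrative values only.
example = NodeRecord(
    node_name="worker-1", cluster_name="cluster-a", role="node", state="Ready",
    bmc_address="10.0.0.50", cpu_arch="x86_64", os_version="CentOS 7.9",
    kernel_version="3.10.0", container_runtime_version="docker://20.10.12",
    description="GPU inference node", created_at="2022-04-19T08:00:00Z",
)
store_node_record(example)
```

Keying the record by cluster name and node name reflects that node names only need to be unique within one cluster.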
Deleting a node: the node deleting function deletes the node by requesting the communication service module in the cluster to call the 'delete node' interface of Kubernetes (k8s). To facilitate node management, after a node is deleted, the node information record of the node in the data sharing module is deleted, all user records and service records in the data sharing module that are associated with the intra-cluster inference service orchestration module information are deleted, and the label, taint and annotation deleting function is requested to delete all label, taint and annotation information of the node.
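The bookkeeping that follows a node deletion can be sketched as a cascade over the stored records. The three in-memory containers below are hypothetical stand-ins for tables in the data sharing module, and the call to Kubernetes itself is assumed to have already succeeded.

```python
from typing import Dict, List, Tuple

NodeKey = Tuple[str, str]  # (cluster name, node name)

# Hypothetical in-memory stand-ins for tables in the data sharing module.
node_records: Dict[NodeKey, dict] = {
    ("cluster-a", "worker-1"): {"state": "Ready"},
    ("cluster-a", "worker-2"): {"state": "Ready"},
}
node_metadata: Dict[NodeKey, dict] = {
    ("cluster-a", "worker-1"): {"labels": {"gpu": "true"}, "taints": [], "annotations": {}},
    ("cluster-a", "worker-2"): {"labels": {}, "taints": [], "annotations": {}},
}
related_records: List[dict] = [
    {"node": ("cluster-a", "worker-1"), "kind": "service"},
    {"node": ("cluster-a", "worker-2"), "kind": "user"},
]


def delete_node_bookkeeping(key: NodeKey) -> None:
    """Remove every stored trace of a node after Kubernetes has deleted it."""
    node_records.pop(key, None)    # node information record
    node_metadata.pop(key, None)   # labels, taints and annotations
    # Drop user and service records that referenced the deleted node.
    related_records[:] = [r for r in related_records if r["node"] != key]


delete_node_bookkeeping(("cluster-a", "worker-1"))
```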
Providing an overview of the information of all nodes in a cluster: the data sharing module is requested for the static attribute information recorded when the nodes were added, such as name, role, state, owning cluster, BMC address, CPU architecture, operating system version, kernel version, container runtime version, node description and creation time; the monitoring management service module is requested for the total amount and the occupied amount of resources such as CPU, memory and GPU (graphics processing unit); historical occupancy curves of resources such as memory, CPU, storage and container groups are obtained by calling a Kubernetes (k8s) interface; and information about the accelerator cards on a node, including name, state, number, UUID (universally unique identifier), utilization, temperature, power and sharing state, is obtained by calling a shell-export interface.
The multi-cluster management service module is configured to perform cluster service management on the multiple clusters according to a management request of a user, and specifically includes: adding new clusters, removing clusters, deleting clusters, providing overview information of multiple clusters, and providing detailed information of a certain cluster.
Adding a new cluster: this function brings an already built Kubernetes (k8s) cluster under the management and control of the system. Before the cluster is brought under management, its domain name, label and description must be set, and its resource partitioning mode must be selected, for example a node mode (the minimum granularity of resource allocation to a user group is one node) or a pooling mode (the minimum granularity of resource allocation to a user group is 0.001 CPU core, 1M of memory and 1 GPU).
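As an illustration of the pooling-mode granularity, the sketch below rejects CPU requests finer than 0.001 core by converting them to integer millicores. The class and function names are hypothetical, and treating "1M" as one megabyte-sized unit is an assumption.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class PoolAllocation:
    cpu_millicores: int  # 1 millicore = 0.001 core, the stated CPU granularity
    memory_m: int        # whole units of 1M of memory
    gpus: int            # whole GPUs only


def parse_cpu(cores: str) -> int:
    """Convert a decimal core count such as '0.25' to millicores.

    Raises ValueError if the request is negative or finer than 0.001 core.
    """
    millis = Decimal(cores) * 1000
    if millis < 0 or millis != millis.to_integral_value():
        raise ValueError(f"CPU request {cores} is not a multiple of 0.001 core")
    return int(millis)


def pool_allocation(cpu_cores: str, memory_m: int, gpus: int) -> PoolAllocation:
    """Build a pooling-mode allocation for a user group."""
    if memory_m < 0 or gpus < 0:
        raise ValueError("memory and GPU counts must be non-negative")
    return PoolAllocation(parse_cpu(cpu_cores), memory_m, gpus)


quota = pool_allocation("0.25", 512, 1)
```

Parsing the CPU amount with `Decimal` rather than `float` avoids rounding surprises when checking whether a request is an exact multiple of the minimum unit.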
In order to fully master the information of all clusters of the system, after the clusters are added, the following operations are carried out:
storing the relevant attribute information of the cluster, including its domain name, label, resource partitioning mode and description, in a database of the data sharing module; requesting the node adding function of the node management service module to bring all nodes in the cluster under system management; requesting the label, taint and annotation adding function of the node management service module to bring all label, taint and annotation information of all nodes in the cluster under management; and requesting the DNS management service module to add the domain name information of the newly added cluster to all DNS services, so as to ensure the connectivity and timeliness of cluster management.
Removing a cluster: this function only deletes the information record of the cluster in the data sharing module. When a cluster that was removed from system management is later added to the system again, the services that were still running when it was removed are restored as before, which ensures service continuity.
Deleting a cluster: besides deleting the information record of the cluster in the data sharing module, this function also needs to perform the following:
requesting the node deleting function of the node management service module to delete all nodes in the cluster; requesting the label, taint and annotation deleting function of the node management service module to delete all label, taint and annotation information of all nodes in the cluster;
initializing the Kubernetes (k8s) cluster, and deleting all namespaces and services running in the cluster;
and deleting all user records and service records associated with the cluster information in the data sharing module.
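The difference between removing and deleting a cluster can be sketched with two in-memory dictionaries. This is a simplified model under stated assumptions: `cluster_records` stands in for the data sharing module, `cluster_workloads` stands in for what is actually running inside each Kubernetes cluster, and the node, label and user-record cleanup performed on deletion is omitted.

```python
from typing import Dict, List

# Hypothetical model: management records live in the data sharing module,
# while workloads live inside the Kubernetes clusters themselves.
cluster_records: Dict[str, dict] = {}
cluster_workloads: Dict[str, List[str]] = {
    "cluster-a": ["svc-1"],
    "cluster-b": ["svc-2"],
}


def add_cluster(name: str, domain: str) -> None:
    """Bring an already built cluster under management."""
    cluster_records[name] = {"domain": domain}


def remove_cluster(name: str) -> None:
    """Remove: drop only the management record; running services are untouched."""
    cluster_records.pop(name, None)


def delete_cluster(name: str) -> None:
    """Delete: drop the record and also clear what runs inside the cluster."""
    cluster_records.pop(name, None)
    cluster_workloads[name] = []


add_cluster("cluster-a", "cluster-a.example.internal")
add_cluster("cluster-b", "cluster-b.example.internal")
remove_cluster("cluster-a")   # services in cluster-a keep running
delete_cluster("cluster-b")   # services in cluster-b are cleared
```

Because removal leaves the workloads alone, adding `cluster-a` back later finds its services still in place, which is the continuity property described above.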
Providing overview information of a plurality of clusters: the data sharing module is requested for the information configured when the clusters were added, such as domain name, name and label, and the communication service module in each cluster is requested through the gateway service to obtain the health status of Kubernetes (k8s).
Providing detailed information of a certain cluster: the user management service module, configuration management service module, monitoring management service module, alarm management service module and the like in the cluster are requested for cluster user/user group information, cluster configuration information, cluster resource information and the like.
Label and annotation are attribute information of nodes in a Kubernetes (k8s) cluster and are used for marking different node roles so that different nodes can be managed in groups. Taint is also attribute information of nodes in a Kubernetes (k8s) cluster; it defines the affinity relationship between a pod and a node and enables fine-grained control of service scheduling. The key and value of every taint, label and annotation can be obtained by calling the relevant Kubernetes (k8s) interfaces. When a cluster or a node is added, the label, annotation and taint information of the related nodes is obtained and stored in the data sharing module; when a cluster or a node is deleted, the label, annotation and taint information of the related nodes stored in the data sharing module is deleted as well.
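A minimal sketch of pulling these three kinds of attributes out of a node object follows. It assumes the node has already been fetched from the Kubernetes API and decoded into a dictionary, where labels and annotations sit under `metadata` and taints sit under `spec`; the sample node and the function name are illustrative.

```python
from typing import Any, Dict


def extract_node_metadata(node: Dict[str, Any]) -> Dict[str, Any]:
    """Collect the labels, annotations and taints of one node object."""
    meta = node.get("metadata", {})
    spec = node.get("spec", {})
    return {
        "labels": dict(meta.get("labels") or {}),
        "annotations": dict(meta.get("annotations") or {}),
        "taints": [
            {"key": t["key"], "value": t.get("value", ""), "effect": t["effect"]}
            for t in (spec.get("taints") or [])
        ],
    }


# Illustrative node object; the names and values are made up.
sample_node = {
    "metadata": {
        "name": "worker-1",
        "labels": {"kubernetes.io/arch": "amd64", "accelerator": "gpu"},
        "annotations": {"example.com/owner": "team-a"},
    },
    "spec": {
        "taints": [{"key": "dedicated", "value": "inference", "effect": "NoSchedule"}],
    },
}

stored = extract_node_metadata(sample_node)
```

The returned dictionary is what would be written to the data sharing module when a node is added and removed again when the node is deleted.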
In some embodiments, the multi-cluster unified management system may further include:
the user interface module, which is used for receiving the management requests of users for the plurality of clusters and transmitting the management requests to the integrated management module, wherein the user interface module is deployed through docker.
Specifically, the user interface module may be set in two modes:
An out-of-cluster mode, used for displaying information of the integrated management module and the plurality of clusters, including cluster overview information, cluster node overview information, user information, log information and the like, so that an operator can grasp the global picture.
An intra-cluster mode, used for displaying information of each service module in the plurality of clusters. This mode is designed so that the clusters can be viewed individually and supports free navigation between the sub-pages, which makes viewing convenient for operators.
It should be noted that, in order to interact well with the user interface module and the clusters, the integrated management module is composed of a plurality of service modules, each deployed through docker. During deployment, the restart policy of each docker service is set to unless-stopped and combined with the autoheal function to ensure high availability of the service.
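A minimal sketch of how such a deployment command might be assembled, assuming the restart mode mentioned above corresponds to Docker's `unless-stopped` restart policy. The container name, image name and the `autoheal=true` label are hypothetical; whether an auto-heal helper selects containers by label depends on the helper that is used.

```python
from typing import List, Optional


def docker_run_args(name: str, image: str, heal_label: Optional[str] = None) -> List[str]:
    """Build a `docker run` command line for one service module.

    `--restart unless-stopped` restarts the container automatically unless
    it was stopped explicitly.
    """
    args = ["docker", "run", "-d", "--name", name, "--restart", "unless-stopped"]
    if heal_label:
        # Assumed convention: the auto-heal helper watches labelled containers.
        args += ["--label", heal_label]
    args.append(image)
    return args


command = docker_run_args(
    "node-management", "registry.example.internal/node-management:1.0", "autoheal=true"
)
```

The list form can be passed directly to `subprocess.run` without shell quoting concerns.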
In some embodiments, the integrated management module may further include:
the independent gateway service module is used for forwarding the service request to each service module outside the cluster or forwarding the service request to a target service module of a target cluster according to the cluster name;
Specifically, the request URL (uniform resource locator) of the independent gateway service module is designed as IP:Port/<module>/<interface>, where IP and Port are the IP address and port number of the gateway service module, module is the service name of the target service module, and interface is the interface name of the target service module. In addition, the header of a request to the independent gateway service module may optionally contain a cluster name. If the cluster name is present, the request is forwarded to a service module in a Kubernetes (k8s) cluster; if it is absent, the request is forwarded to a service module created through docker outside the cluster. When a cluster is added, its domain name is stored in a database of the data sharing module. The independent gateway service module queries the database with the cluster name in the request header to obtain the cluster domain name, combines the cluster domain name with the module name and interface name in the request URL into the target interface URL, accesses that URL to forward the request, and the request then ends.
The DNS management service module is used for setting the cluster domain names at the same level and reconfiguring the DNS service;
specifically, in order to reduce domain name resolution delay, when a new cluster is added to the system, the DNS management service module is triggered to reconfigure the DNS service: the domain names of all clusters are set at the same DNS level, and a domain name configuration item is added for the newly added cluster.
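The forwarding rule of the independent gateway service module described above can be sketched as a pure function that maps a request URL and an optional cluster name to a target URL. The lookup tables, host names and the `http` scheme are assumptions for illustration; in the described system the cluster domain names come from the database of the data sharing module.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit

# Hypothetical lookup tables.
cluster_domains: Dict[str, str] = {"cluster-a": "cluster-a.example.internal"}
external_services: Dict[str, str] = {"user": "user-service:8080"}


def resolve_target(request_url: str, cluster_name: Optional[str]) -> str:
    """Map a gateway request of the form IP:Port/<module>/<interface> to its target."""
    path = urlsplit(request_url).path
    module, interface = path.strip("/").split("/", 1)
    if cluster_name is None:
        # No cluster name in the header: forward to a docker-deployed module
        # outside the clusters.
        return f"http://{external_services[module]}/{interface}"
    # Cluster name present: combine the cluster domain name with the module
    # and interface names taken from the request URL.
    return f"http://{cluster_domains[cluster_name]}/{module}/{interface}"


in_cluster = resolve_target("http://10.0.0.1:8000/monitor/nodes", "cluster-a")
out_of_cluster = resolve_target("http://10.0.0.1:8000/user/list", None)
```

Here `in_cluster` resolves to a URL under the cluster's domain name, while `out_of_cluster` resolves to the address of the docker-deployed user service.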
In some embodiments, the integrated management module may further include:
the user management service module is used for managing users/user groups;
specifically, the user management service module is designed to be independent of the multi-cluster management service module, so users/user groups can be managed before any cluster is added to the system. The users/user groups are classified by level into system administrators, group administrators and ordinary users. The system administrator has the authority to operate multiple clusters and users/user groups, including the authority to add clusters, remove clusters, delete clusters, add nodes, delete nodes, add users, delete users, modify user information, create user groups, modify user group resource allocation, delete user groups, modify user group information, view log information, and view monitoring information and alarm information. The group administrator and ordinary users have authority for inference model service deployment, inference mirror image service deployment, viewing inference algorithm alarm information and the like; the group administrator additionally has the authority to modify the members of the user's current group and to review and approve service deployments within the group.
And/or;
the authentication management service module is used for ensuring, by distributing token codes to front-end users and verifying the token codes at the back end, that operations by users of different identities cannot exceed their authority.
And/or;
and the log management service module is used for reading log data from the data sharing module and, with the generating cluster of each log clearly indicated, allowing the system administrator role to filter and view the logs by fields such as generating cluster, generating module, generation time range, generating user group and generating user. The sources of all log information include the integrated management module and the clusters.
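The three optional service modules above can be illustrated together in one sketch: a role-to-permission table, a signed token that the back end verifies before allowing an action, and a log filter. Everything here is an assumption for illustration, including the HMAC-based token format, the signing key, the action names (a small subset of the authorities listed above) and the log fields; the patent does not specify the token scheme.

```python
import hashlib
import hmac
from typing import Dict, List, Optional, Set

SECRET = b"demo-secret"  # hypothetical signing key; keep real keys out of source code

# Illustrative subset of the authorities described for each role.
PERMISSIONS: Dict[str, Set[str]] = {
    "system_admin": {"add_cluster", "remove_cluster", "delete_cluster",
                     "add_node", "delete_node", "view_logs"},
    "group_admin": {"deploy_service", "approve_deployment", "modify_group_members"},
    "user": {"deploy_service"},
}


def _sign(payload: str) -> str:
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(user: str, role: str) -> str:
    """Distribute a token binding the user to a role."""
    payload = f"{user}:{role}"
    return f"{payload}:{_sign(payload)}"


def authorize(token: str, action: str) -> bool:
    """Verify the token at the back end, then check the role's permissions."""
    parts = token.rsplit(":", 2)
    if len(parts) != 3:
        return False
    user, role, signature = parts
    if not hmac.compare_digest(signature, _sign(f"{user}:{role}")):
        return False
    return action in PERMISSIONS.get(role, set())


def filter_logs(logs: List[dict], cluster: Optional[str] = None,
                user: Optional[str] = None) -> List[dict]:
    """Return log entries matching the given cluster and/or user."""
    return [e for e in logs
            if (cluster is None or e["cluster"] == cluster)
            and (user is None or e["user"] == user)]


admin_token = issue_token("alice", "system_admin")
logs = [
    {"cluster": "cluster-a", "user": "alice", "message": "added node worker-1"},
    {"cluster": "cluster-b", "user": "bob", "message": "deployed model"},
]
```

A production token would normally also carry an expiry time; it is left out here to keep the sketch short.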
Further, the data sharing module is further configured to:
providing data create, read, update and delete services;
confirming the generating cluster of the data by acquiring the cluster's domain name environment variable and cluster ID. The mechanism is as follows: the communication service module in the cluster reads the cluster domain name environment variable and requests, from the integrated management module, the cluster ID corresponding to that domain name; when a service module in the cluster sends a create, read, update or delete request to the data sharing module, it first accesses the communication service module to obtain the cluster ID, and the associated data in the database is then operated on according to that cluster ID.
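The cluster-ID mechanism above can be sketched as follows. This is only an illustration under stated assumptions: the environment variable name `CLUSTER_DOMAIN`, the class names and the in-memory store are hypothetical stand-ins for the integrated management module, the communication service module and the shared database.

```python
import os
from collections import defaultdict


class ManagementRegistry:
    """Stand-in for the integrated management module's domain-to-ID lookup."""

    def __init__(self, domain_to_id):
        self._domain_to_id = dict(domain_to_id)

    def cluster_id_for(self, domain):
        return self._domain_to_id[domain]


class CommunicationService:
    """Reads the cluster domain name from the environment and resolves the cluster ID."""

    def __init__(self, registry, env=None, env_var="CLUSTER_DOMAIN"):
        self._registry = registry
        self._env = os.environ if env is None else env
        self._env_var = env_var
        self._cluster_id = None

    def cluster_id(self):
        if self._cluster_id is None:  # resolve once, then reuse
            self._cluster_id = self._registry.cluster_id_for(self._env[self._env_var])
        return self._cluster_id


class DataSharingStore:
    """In-memory stand-in for the shared database, partitioned by cluster ID."""

    def __init__(self):
        self._rows = defaultdict(dict)

    def put(self, cluster_id, key, value):  # covers both add and change
        self._rows[cluster_id][key] = value

    def get(self, cluster_id, key):
        return self._rows[cluster_id].get(key)

    def delete(self, cluster_id, key):
        self._rows[cluster_id].pop(key, None)


class ClusterScopedClient:
    """Used by a service module in a cluster: obtains the cluster ID before every operation."""

    def __init__(self, comm, store):
        self._comm = comm
        self._store = store

    def put(self, key, value):
        self._store.put(self._comm.cluster_id(), key, value)

    def get(self, key):
        return self._store.get(self._comm.cluster_id(), key)

    def delete(self, key):
        self._store.delete(self._comm.cluster_id(), key)
```

Because every operation is keyed by the resolved cluster ID, two clusters can write the same key to the shared store without seeing each other's data.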
In some embodiments, each of the plurality of clusters comprises an intra-cluster inference service orchestration module comprising a service deployment service module, a monitoring management service module, an image management service module, and a model management service module, wherein:
the service deployment service module is used for supporting application deployment of inference models and inference images through Kubernetes (k8s) components such as Knative and KFServing;
the monitoring management service module is used for monitoring real-time and historical usage information for resources in the cluster, such as memory, CPU, accelerator cards, Pods and hard disks;
the image management service module is used for storing and/or distributing and/or managing inference images by means of an image registry (such as Harbor);
and the model management service module is used for storing and/or distributing and/or managing inference models by means of a file repository (such as HDFS).
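As a rough illustration of what a deployment through KFServing might involve, the sketch below builds an `InferenceService` manifest as a plain Python dictionary. The helper name `build_inference_service`, the model name, the namespace and the storage location are hypothetical, and which storage schemes are accepted depends on the KFServing version in use; the application itself does not specify a manifest.

```python
import json


def build_inference_service(name, namespace, framework, storage_uri):
    """Return a KFServing InferenceService manifest as a plain dict."""
    return {
        "apiVersion": "serving.kubeflow.org/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        # The predictor is keyed by framework and points at the stored model.
        "spec": {"predictor": {framework: {"storageUri": storage_uri}}},
    }


manifest = build_inference_service(
    name="demo-model",
    namespace="inference",
    framework="tensorflow",
    storage_uri="s3://example-bucket/models/demo",  # placeholder location
)
print(json.dumps(manifest, indent=2))
```

The sketch only constructs the manifest; submitting it to a cluster is left to whatever Kubernetes client the service deployment service module uses.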
In some embodiments, to facilitate unified data management and convenient operation, the intra-cluster inference service orchestration module further includes a data storage migration module and a communication service injection module, wherein:
the data storage migration module is used for storing cluster data into the corresponding database hosted in the data sharing module;
the communication service injection module is used for requesting, from the integrated management module, the out-of-cluster information that services in the cluster need. For example, the communication service injection module in a cluster obtains that cluster's ID by accessing the integrated management module and distributes the ID to each service module in the cluster, so that each service module operates on its related data in the data sharing module according to the cluster ID.
In some embodiments, to facilitate management of users within a cluster and ensure that users of each type do not exceed their authority, the intra-cluster inference service orchestration module further includes:
an intra-cluster user management service, used for supporting the group administrator's management of the user group within the cluster, including invitations issued by the group administrator and the group administrator's approval of service deployment applications;
an intra-cluster authentication management service, used for ensuring that the operations of group administrators and general users do not exceed their authority;
an in-site message management service, used for generating and storing notification messages for a given monitoring item and its set threshold, and pushing the notifications to the group administrators and general users who have the relevant authority;
and a parameter management service, used for injecting, configuring and managing the algorithm parameters of inference applications.
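The threshold-based notification service in the list above can be sketched as follows. This is a minimal illustration that assumes a notification is generated when a sampled value exceeds its threshold; the names `AlertRule`, `Inbox` and `evaluate` and the metric names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AlertRule:
    metric: str        # monitoring item, for example "gpu_memory_percent"
    threshold: float   # value above which a notification is generated
    recipients: list   # users with the authority to receive this notification


@dataclass
class Inbox:
    """Stores pushed notifications per user."""
    messages: dict = field(default_factory=dict)

    def push(self, user, text):
        self.messages.setdefault(user, []).append(text)


def evaluate(rules, samples, inbox):
    """Generate and push a notification for every rule whose metric exceeds its threshold."""
    generated = []
    for rule in rules:
        value = samples.get(rule.metric)
        if value is not None and value > rule.threshold:
            text = f"{rule.metric}={value} exceeded threshold {rule.threshold}"
            generated.append(text)
            for user in rule.recipients:
                inbox.push(user, text)
    return generated
```

Calling `evaluate` with the latest samples returns the notifications generated in that pass, while the `Inbox` keeps them stored for each authorized recipient.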
In a second aspect, the present application provides a method for deploying a multi-cluster unified management system, where the method includes:
deploying an integrated management module through Docker, wherein the integrated management module comprises a node management service module and a multi-cluster management service module, the node management service module is used for carrying out node service management on nodes in a cluster according to a management request of a user, the multi-cluster management service module is used for carrying out cluster service management on a plurality of clusters according to the management request of the user, and the integrated management module is also used for transmitting static and/or dynamic data generated in the cluster service management and the node service management to a data sharing module;
deploying a data sharing module through Docker, wherein the data sharing module comprises at least one database and is used for storing static and/or dynamic data generated by the integrated management module and the plurality of clusters;
deploying a plurality of clusters through Kubernetes (k8s), wherein the plurality of clusters store respective cluster data in the data sharing module according to each cluster identifier and/or acquire cluster data from the data sharing module according to each cluster identifier;
the integrated management module, the data sharing module and the plurality of clusters are respectively deployed on different network nodes, and the integrated management module is deployed on a network center node.
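The placement constraint in the last step (separate network nodes, with the integrated management module on the central node) can be checked with a small sketch. It assumes a deployment plan expressed as a mapping from component name to node name; the function name, the component key `integrated_management` and the node names are hypothetical.

```python
def validate_placement(placement, center_node):
    """Return a list of violations of the placement constraint; an empty list means valid.

    placement maps a component name to the network node it is deployed on.
    """
    problems = []
    if placement.get("integrated_management") != center_node:
        problems.append("integrated_management is not on the central node")
    first_on_node = {}
    for component, node in placement.items():
        if node in first_on_node:
            problems.append(f"{component} shares node {node} with {first_on_node[node]}")
        else:
            first_on_node[node] = component
    return problems
```

A plan that places the data sharing module on the same node as one of the clusters would be reported as a violation, as would one that moves the integrated management module off the central node.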
In some embodiments, the method for deploying a multi-cluster unified management system further includes:
deploying a user interface module through Docker, wherein the user interface module is used for receiving a management request of a user for the plurality of clusters and transmitting the management request to the integrated management module.
In some embodiments, the method for deploying a multi-cluster unified management system further includes:
deploying an independent gateway service module through Docker, wherein the independent gateway service module is used for splicing the URL (uniform resource locator) in the management request with the cluster domain name corresponding to the cluster identifier in the management request header to form a target interface URL, so that the management request is forwarded to the target service of the target cluster according to the cluster identifier.
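The URL splicing performed by the gateway can be illustrated with a short sketch. The header name `X-Cluster-Id`, the function name and the example domain are assumptions made for illustration; the application does not name the header that carries the cluster identifier.

```python
def build_target_url(request_path, headers, cluster_domains, header_name="X-Cluster-Id"):
    """Join the cluster's domain name with the request path to form the target interface URL.

    cluster_domains maps a cluster identifier to that cluster's base domain URL.
    """
    cluster_id = headers.get(header_name)
    if cluster_id not in cluster_domains:
        raise LookupError(f"unknown or missing cluster identifier: {cluster_id!r}")
    domain = cluster_domains[cluster_id]
    # Normalize slashes so exactly one separates the domain and the path.
    return domain.rstrip("/") + "/" + request_path.lstrip("/")
```

For example, a request for `/api/v1/deployments` carrying cluster identifier `c-1`, where `c-1` maps to `http://cluster-1.example.internal/`, would be forwarded to `http://cluster-1.example.internal/api/v1/deployments`.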
In some embodiments, the method for deploying a multi-cluster unified management system further includes:
independently deploying a user management service module and/or an authentication management service module and/or a log management service module through Docker, wherein:
the user management service module is used for managing users/user groups so as to realize the management of the operation authority of the cluster and/or the node;
the authentication management service module authenticates user operations by distributing and verifying tokens;
the log management service module is used for reading data from the data sharing module and displaying the data according to log attributes, wherein the log attributes comprise at least one of the following items: cluster identification, operation time and operation users.
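Displaying logs by attribute, as described for the log management service module, amounts to filtering entries on cluster identification, operating user and operation time. A minimal sketch follows; the `LogEntry` fields and the `filter_logs` signature are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogEntry:
    cluster_id: str
    user: str
    time: datetime
    message: str


def filter_logs(entries, cluster_id=None, user=None, start=None, end=None):
    """Return entries matching every filter that is set; None means no restriction."""
    selected = []
    for entry in entries:
        if cluster_id is not None and entry.cluster_id != cluster_id:
            continue
        if user is not None and entry.user != user:
            continue
        if start is not None and entry.time < start:
            continue
        if end is not None and entry.time > end:
            continue
        selected.append(entry)
    return selected
```

Filters combine conjunctively, so passing both `cluster_id` and a time range returns only the entries that cluster generated within that range.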
After the system deployment is completed, the service request initiated by the client is processed as follows:
the comprehensive management module acquires a client service request sent by the user interface module;
the comprehensive management module analyzes the service request, and inquires a target cluster corresponding to the service request and a target service corresponding to the target cluster;
the comprehensive management module sends the service request to the target service of the target cluster;
the comprehensive management module receives a processing result of the target service of the target cluster;
and the comprehensive management module returns the processing result to the client through the user interface module.
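The request flow above can be sketched as a single dispatch function. This is only an illustration: the request fields `cluster`, `service` and `payload`, and the use of local callables in place of remote services, are assumptions rather than details from the application.

```python
def handle_request(request, services):
    """Resolve the target cluster and service, forward the request, and return the result.

    services maps (cluster, service) to a callable standing in for the remote target service.
    """
    # Parse the request and look up the target cluster and target service.
    target = (request.get("cluster"), request.get("service"))
    handler = services.get(target)
    if handler is None:
        return {"ok": False, "error": f"no such target: {target}"}
    # Send the request to the target service and receive its processing result.
    result = handler(request.get("payload"))
    # Hand the result back so the user interface module can return it to the client.
    return {"ok": True, "result": result}
```

A request naming a registered cluster and service receives that service's result, while a request for an unknown target is answered with an error instead of being forwarded.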
The technical solutions provided by the present application are described in detail above. The embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the others, and the parts that are the same or similar among the embodiments may be cross-referenced. Because the method and the system disclosed in the embodiments correspond to each other, the description of one is kept brief, and the relevant details can be found in the description of the other. It should be noted that those skilled in the art can make several improvements and modifications to the present application without departing from its principle, and those improvements and modifications also fall within the protection scope of the claims of the present application.
It is further noted that, in the present specification, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims (10)

1. A multi-cluster unified management system is characterized by comprising a comprehensive management module, a data sharing module and a plurality of clusters, wherein the comprehensive management module, the data sharing module and the clusters are respectively deployed on different network nodes, and the comprehensive management module is deployed on a network central node;
the integrated management module comprises a node management service module and a multi-cluster management service module, wherein the node management service module is used for carrying out node service management on nodes in a cluster according to a management request of a user, the multi-cluster management service module is used for carrying out cluster service management on a plurality of clusters according to the management request of the user, the integrated management module is also used for transmitting static and/or dynamic data generated in the node service management and the cluster service management to the data sharing module, and the integrated management module is deployed through Docker;
the data sharing module comprises at least one database, and is used for storing static and/or dynamic data generated by the integrated management module and the clusters, wherein the database is deployed through Docker;
the plurality of clusters store respective cluster data in the data sharing module according to each cluster identifier and/or acquire cluster data from the data sharing module according to each cluster identifier, and the plurality of clusters are deployed through Kubernetes.
2. The multi-cluster unified management system according to claim 1, further comprising a user interface module configured to receive a management request from a user for the plurality of clusters and transmit the management request to the integrated management module, wherein the user interface module is deployed through Docker.
3. The multi-cluster unified management system according to claim 2, wherein the integrated management module further comprises an independent gateway service module for forwarding the service request to each service module outside the cluster or to a target service module of the target cluster according to the cluster name.
4. The multi-cluster unified management system according to claim 1, wherein said integrated management module further comprises a user management service module and/or an authentication management service module and/or a log management service module deployed through Docker, wherein:
the user management service module is used for managing users/user groups so as to realize the management of the operation authority of the cluster and/or the node;
the authentication management service module authenticates the user operation in a mode of distributing and verifying the token code;
the log management service module is used for reading data from the data sharing module and displaying the data according to log attributes, wherein the log attributes comprise at least one of the following items: cluster identification, operation time and operation users.
5. The multi-cluster unified management system according to claim 1, wherein each of said plurality of clusters comprises an intra-cluster inference service orchestration module comprising a service deployment service module, a monitoring management service module, an image management service module, and a model management service module, wherein:
the service deployment service module is used for supporting the deployment of inference models and inference images through Kubernetes components;
the monitoring management service module is used for monitoring real-time/historical information of hardware resource usage in the cluster;
the image management service module is used for storing, distributing and/or managing inference images by means of an image registry;
and the model management service module is used for storing, distributing and/or managing inference models by means of a file repository.
6. The multi-cluster unified management system according to claim 5, wherein said intra-cluster inference service orchestration module further comprises a data storage migration module and a communication service injection module, wherein:
the data storage migration module is used for storing cluster data into the corresponding database hosted in the data sharing module;
and the communication service injection module is used for requesting, from the multi-cluster management service module, the out-of-cluster information that services in the cluster need.
7. A method for deploying a multi-cluster unified management system, the method comprising:
deploying an integrated management module through Docker, wherein the integrated management module comprises a node management service module and a multi-cluster management service module, the node management service module is used for carrying out node service management on nodes in a cluster according to a management request of a user, the multi-cluster management service module is used for carrying out cluster service management on a plurality of clusters according to the management request of the user, and the integrated management module is also used for transmitting static and/or dynamic data generated in the cluster service management and the node service management to a data sharing module;
deploying a data sharing module through Docker, wherein the data sharing module comprises at least one database and is used for storing static and/or dynamic data generated by the integrated management module and the plurality of clusters;
deploying a plurality of clusters through Kubernetes, wherein the plurality of clusters store respective cluster data in a data sharing module according to each cluster identifier and/or acquire cluster data from the data sharing module according to each cluster identifier;
the integrated management module, the data sharing module and the plurality of clusters are respectively deployed on different network nodes, and the integrated management module is deployed on a network center node.
8. The method for deploying a multi-cluster unified management system according to claim 7, further comprising:
deploying a user interface module through Docker, wherein the user interface module is used for receiving a management request of a user for the plurality of clusters and transmitting the management request to the integrated management module.
9. The method for deploying a multi-cluster unified management system according to claim 7, further comprising:
deploying an independent gateway service module through Docker, wherein the independent gateway service module is used for forwarding the service request to each service module outside the cluster, or forwarding the service request to a target service module of a target cluster according to the cluster name.
10. The method for deploying a multi-cluster unified management system according to claim 7, further comprising:
deploying a user management service module and/or an authentication management service module and/or a log management service module through Docker, wherein:
the user management service module is used for managing users/user groups so as to realize the management of the operation authority of the cluster and/or the node;
the authentication management service module authenticates the user operation in a mode of distributing and verifying the token code;
the log management service module is used for reading data from the data sharing module and displaying the data according to log attributes, wherein the log attributes comprise at least one of the following items: cluster identification, operation time and operation users.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210410910.1A CN114866416A (en) 2022-04-19 2022-04-19 Multi-cluster unified management system and deployment method

Publications (1)

Publication Number Publication Date
CN114866416A true CN114866416A (en) 2022-08-05

Family

ID=82632007

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210410910.1A Withdrawn CN114866416A (en) 2022-04-19 2022-04-19 Multi-cluster unified management system and deployment method

Country Status (1)

Country Link
CN (1) CN114866416A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115396302A (en) * 2022-08-11 2022-11-25 臻乐尔科技服务(上海)有限公司 Multi-node high-availability configuration distribution system and working method thereof
CN115396302B (en) * 2022-08-11 2024-01-30 臻乐尔科技服务(上海)有限公司 Multi-node high-availability configuration distribution system and working method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20220805