WO2019153880A1 - Method, node, and query server for downloading an image file in a cluster

Publication number
WO2019153880A1
Authority
WO
WIPO (PCT)
Prior art keywords
download
node
image file
nodes
upstream
Prior art date
Application number
PCT/CN2018/121070
Other languages
English (en)
French (fr)
Inventor
孙宇霖
熊英
单海军
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2019153880A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Definitions

  • the present application relates to the field of information technology, and more specifically, to a method, a node, and a query server for downloading an image file in a cluster environment.
  • a cluster is usually a parallel or distributed system of interconnected nodes (such as computers or virtual machines). These nodes work together and run a common set of applications, while presenting a single system image to users and applications.
  • a server cluster connects multiple servers through a communication link. From the outside, these servers appear to work as a single server.
  • external loads are dynamically allocated to the servers through certain mechanisms, achieving the high performance and high availability otherwise available only in super servers.
  • a cluster may have dozens to tens of thousands of nodes simultaneously downloading image files.
  • the current image file downloading method is a centralized download, that is, all nodes in the cluster simultaneously download image files from a central image server.
  • with this approach, the download latency grows linearly with the number of downloading nodes, which reduces download efficiency.
  • the present application provides a method, a node, and a query server for downloading an image file in a cluster environment, which can improve the download efficiency of the cluster.
  • a method for downloading an image file in a cluster is provided.
  • the method is applicable to a cluster including a file server and N nodes, where the file server provides a download service of the image file for at least one of the N nodes, at least one node of the N nodes that has downloaded the image file provides a download service of the image file for at least one of the N nodes, and N is a positive integer greater than 1. The method includes: a first download node of the N nodes receives information of a first upstream download node sent by a query server, where the first upstream download node is the node, determined in a download source set based on an equalization download policy, that provides the download service of the image file for the first download node.
  • a node that has downloaded the image file provides the download service of the image file for other nodes, and the upstream download node of the first download node is selected by the query server, so that the first download node can download the image file from that upstream download node. This avoids having all nodes in the cluster download the image file from the file server, which reduces the load on download service resources in the cluster environment and improves the download efficiency of the cluster.
  • the method further includes: the first download node sends first download information to the query server, where the first download information is the download information of the image file downloaded by the first download node and is used by the query server to update the download information list of the image file in the query server; the download information list includes download information of the image file at the N nodes.
  • the first download information includes a file name of the image file and a file size of the image file that the first download node has downloaded.
  • the first download information includes a file name of the image file, a file size of the image file that the first download node has downloaded, the download time for the first download node to download the image file, and the upstream download node from which the first download node downloads the image file.
  • the equalization download policy includes a first download condition, where the first download condition is that the number of downstream download nodes of the first upstream download node is less than a preset threshold.
  • the downstream download node is a node that is downloading the image file from the first upstream download node.
  • by adopting the first download condition in the equalization download policy, multiple downstream download nodes in the cluster can obtain the image file from the same upstream download node, thereby improving the download efficiency of the cluster.
  • the equalization download policy further includes a second download condition, where the second download condition is that the size of the image file that the first download node has downloaded is smaller than the size of the image file that the first upstream download node has downloaded.
  • by adopting the second download condition in the equalization download policy, the data of the image file that the downstream download node still needs to download has already been obtained by the upstream download node.
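The two download conditions can be sketched as a small selection routine. This is a minimal illustration under stated assumptions, not the application's implementation: the `Source` record, the `pick_upstream` function, the first-match selection order, and all sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """One entry of the download source set (hypothetical record layout)."""
    name: str            # node identifier, or "file-server"
    downloaded: int      # bytes of the image file this source already holds
    downstream: int = 0  # nodes currently downloading from this source


def pick_upstream(sources, requester_downloaded, max_downstream):
    """Return the first source that meets both download conditions, else None."""
    for src in sources:
        # First download condition: fewer downstream nodes than the threshold.
        under_threshold = src.downstream < max_downstream
        # Second download condition: the source holds more data than the requester.
        is_ahead = requester_downloaded < src.downloaded
        if under_threshold and is_ahead:
            return src
    return None


sources = [
    Source("file-server", downloaded=1000, downstream=2),
    Source("node-a", downloaded=400, downstream=0),
    Source("node-b", downloaded=100, downstream=1),
]
chosen = pick_upstream(sources, requester_downloaded=150, max_downstream=2)
print(chosen.name)
```

Here the file server is skipped because it already serves the assumed maximum of two downstream nodes, so the request is directed to a peer that is ahead of the requester.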
  • before the first download node receives the information of the first upstream download node sent by the query server, the method further includes: the first download node sends a query request to the query server, where the query request is used to query information of an upstream download node for downloading the image file.
  • the query server is a centralized query server.
  • the centralized query server may be a specific one of the N nodes.
  • the centralized query server is in an active/standby mode.
  • because the centralized query server provides its service in active/standby mode, multiple servers are ready to provide the query service at any time. When the active query server goes down, a standby query server takes over the query service, which gives higher availability than a single query server.
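The active/standby behaviour can be pictured from the client side as trying the active server first and falling back to a standby. The sketch below is only an illustration: the `QueryServer` class, the `query_with_failover` helper, and the use of `ConnectionError` to signal a server that is down are assumptions, and the source does not specify how failover is detected.

```python
class QueryServer:
    """Stand-in for a query server that may be unavailable (hypothetical)."""

    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive

    def query(self, image):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} answered for {image}"


def query_with_failover(servers, image):
    """Try the active server first, then each standby in order."""
    for server in servers:
        try:
            return server.query(image)
        except ConnectionError:
            continue  # this server is down, try the next one
    raise RuntimeError("no query server available")


active = QueryServer("active", alive=False)
standby = QueryServer("standby")
result = query_with_failover([active, standby], "app.img")
print(result)
```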
  • some or all of the N nodes in the cluster may be reused as query servers in a distributed query server system.
  • using a distributed query server system can reduce the load on any single query server.
  • because the distributed query server system reuses nodes to provide the query service, it avoids the resource consumption of deploying separate query servers.
  • the method further comprises: the first download node determines the query server from the distributed query server system.
  • the N nodes in the cluster are reused to form the distributed query server system, and the cluster further comprises a storage server that stores list information of the N nodes. The first download node determining the query server from the distributed query server system comprises: the first download node acquires the list information from the storage server; the first download node determines, according to the list information, the query server from the distributed query server system.
  • the list information of the N nodes in the cluster may be stored in the storage server.
  • the first download node can determine the query server of the image file from the distributed query server system according to the list information.
  • the first download node determining the query server from the distributed query server system according to the list information includes: the first download node determines the query server by using a consistent hash algorithm according to the list information.
  • a consistent hash algorithm may be used to determine, from the distributed query server system, the query server for querying the upstream download node information, so that the query servers of different image files are distributed across different nodes, reducing the load on a single query server.
  • the method further includes: the first download node sends a registration request to the storage server, the registration request including an IP address and a registration port number of the first download node.
  • the list information of the N nodes includes the IP addresses and registration port numbers of the N nodes.
  • the nodes in the cluster that are reused as query servers send registration requests to the storage server, thereby forming the list information in the storage server. Based on the list information, the information of the distributed query server system can be determined, thereby providing a distributed query server system for the cluster. With a distributed query server system, the load on a single query server can be reduced. In addition, the distributed query server system can reuse nodes to provide query services, thereby improving the download efficiency of the cluster.
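Registration and consistent-hash lookup can be sketched together as follows. This is a hedged illustration only: the in-memory `registry` list standing in for the storage server, the `register` helper, the `HashRing` class, the use of SHA-256, the 50 virtual points per node, and the addresses and port are all assumptions not taken from the source.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (SHA-256 is an arbitrary choice)."""
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)


class HashRing:
    """Consistent hash ring built from the registered node list."""

    def __init__(self, nodes, replicas=50):
        # Each node is placed on the ring at several virtual points.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(replicas)
        )
        self._keys = [point for point, _ in self._ring]

    def lookup(self, image_name: str) -> str:
        """Return the node owning the first ring point after the image's hash."""
        idx = bisect.bisect_right(self._keys, _hash(image_name)) % len(self._ring)
        return self._ring[idx][1]


registry = []  # stands in for the list information held by the storage server


def register(ip: str, port: int) -> None:
    """Record a node's IP address and registration port number."""
    entry = f"{ip}:{port}"
    if entry not in registry:
        registry.append(entry)


for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3"):
    register(ip, 7000)

ring = HashRing(registry)
server = ring.lookup("ubuntu:18.04")
print(server)
```

Because every node builds the ring from the same list information, all nodes resolve a given image file to the same query server, while different image files tend to land on different nodes.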
  • the first download node acts both as a download node for a first image file and as a query server for other nodes of the N nodes (for example, a second download node).
  • when acting as a query server, the first download node also performs the functions of the query server.
  • the first download node determines, based on the equalization download policy, a second upstream download node of the second download node of the N nodes, where the second upstream download node is a node in the download source set that provides the second download node with a download service of a second image file, and the download source set includes the file server and at least one node of the N nodes that has downloaded the second image file; the first download node sends information of the second upstream download node to the second download node.
  • the method further includes: the first download node receives second download information sent by the second download node, where the second download information is the download information of the second image file downloaded by the second download node; the first download node updates the download information list of the second image file in the first download node according to the second download information.
  • the first download node determines the second upstream download node of the second download node of the N nodes based on the equalization download policy, including:
  • the first download node determines the second upstream download node based on the equalization download policy and the download information list of the second image file, and the download information list includes download information of the second image file at the N nodes.
  • the download information of the second image file in the download information list of the second image file may include a file name and a file size of the downloaded second image file.
  • the first download node (as the query server) may also record the number of downstream download nodes of the allocated second upstream download node in the download information list of the second image file.
  • the equalization download policy includes a first download condition
  • the second upstream download node is determined based on the equalization download policy and the download information list, including:
  • the download information list includes download information of the second image file at the N nodes.
  • the equalization download policy further includes a second download condition, and determining the second upstream download node based on the equalization download policy and the download information list, including:
  • the method before the first download node determines the second upstream download node of the second download node of the N nodes, the method further includes:
  • the first download node receives a query request sent by the second download node, where the query request is used to query information of the second upstream download node.
  • the second download node sends the second download information to the first download node, where the second download information is the download information of the second image file downloaded by the second download node and is used by the first download node to update the download information list of the second image file; the download information list includes download information of the second image file at the N nodes.
  • the second download information includes a file name of the second image file and a file size of the second image file that the second download node has downloaded.
  • the second download information includes a file name of the second image file, a file size of the second image file that the second download node has downloaded, the upstream download node from which the second download node downloads the second image file, and the download time for the second download node to download the second image file.
  • the node can be used as a download node of the image file or as a query server of other download nodes, which can reduce the occupation of the download service resources in the cluster environment, thereby improving the download efficiency of the cluster.
  • a method for downloading an image file in a cluster is provided. The method is applicable to a cluster including a file server and N nodes, where the file server provides a download service of the image file for at least one of the N nodes, at least one node of the N nodes that has downloaded the image file provides a download service of the image file for at least one of the N nodes, and N is a positive integer greater than 1.
  • the method includes: the query server determines, based on the equalization download policy, a first upstream download node of a first download node of the N nodes, where the first upstream download node is a node in the download source set that provides the first download node with the download service of the image file, and the download source set includes the file server and at least one node of the N nodes that has downloaded the image file; the query server sends the information of the first upstream download node to the first download node.
  • the upstream download node of the first download node is selected by the query server from the file server and the at least one node of the N nodes that has downloaded the image file, so that the first download node can download the image file from that upstream download node. This avoids having all the nodes in the cluster download the image file from the file server, which reduces the load on download service resources in the cluster environment and improves the download efficiency of the cluster.
  • the query server determining the first upstream download node of the first download node of the N nodes includes: the query server determines the first upstream download node based on the equalization download policy and a download information list, where the download information list includes download information of the image file at the N nodes.
  • the equalization download policy includes a first download condition
  • the query server determines the first upstream download node based on the equalization download policy and the download information list, including:
  • the query server determines the first upstream download node according to the first download condition, where the first download condition is that the number of downstream download nodes of the first upstream download node is less than a preset threshold.
  • the downstream download node is a node that is downloading the image file from the first upstream download node.
  • by adopting the first download condition in the equalization download policy, multiple downstream download nodes in the cluster can obtain the image file from the same upstream download node, thereby improving the download efficiency of the cluster.
  • the equalization download policy further includes a second download condition, and the query server determining the first upstream download node based on the equalization download policy and the download information list includes: the query server determines the first upstream download node according to the first download condition and the second download condition, where the first download condition is that the number of downstream download nodes of the first upstream download node is less than a preset threshold, and the second download condition is that the size of the image file that the first download node has downloaded is smaller than the size of the image file that the first upstream download node has downloaded.
  • by adopting the second download condition in the equalization download policy, the data of the image file that the downstream download node still needs to download has already been obtained by the upstream download node.
  • the method further includes: the query server receives the first download information sent by the first download node, where the first download information is the download information of the image file downloaded by the first download node; the query server updates the download information list of the image file in the query server according to the first download information.
  • the first download information includes a file name of the image file and a file size of the image file that the first download node has downloaded.
  • the first download information includes a file name of the image file, a file size of the image file that the first download node has downloaded, the download time for the first download node to download the image file, and the upstream download node from which the first download node downloads the image file.
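One way to picture the download information list kept by the query server is a table keyed by image file and node. The sketch below is an assumption-laden illustration: the `DownloadInfo` and `DownloadInfoList` names, the field layout, and the dictionary storage are hypothetical, and the downstream count simply counts every node whose latest report names a given upstream (a fuller version would drop nodes that have finished downloading).

```python
from dataclasses import dataclass


@dataclass
class DownloadInfo:
    """Download information reported by a node (hypothetical field names)."""
    file_name: str
    downloaded_size: int
    upstream: str = ""
    download_time: float = 0.0  # seconds spent downloading so far


class DownloadInfoList:
    """Per-image download information for the nodes, kept by the query server."""

    def __init__(self):
        self._records = {}  # (file_name, node) -> DownloadInfo

    def update(self, node: str, info: DownloadInfo) -> None:
        """Replace the node's previous report for this image file."""
        self._records[(info.file_name, node)] = info

    def downloaded_size(self, file_name: str, node: str) -> int:
        record = self._records.get((file_name, node))
        return record.downloaded_size if record else 0

    def downstream_count(self, file_name: str, upstream: str) -> int:
        """Count nodes whose latest report names `upstream` for this image."""
        return sum(
            1
            for (name, _), record in self._records.items()
            if name == file_name and record.upstream == upstream
        )


table = DownloadInfoList()
table.update("node-a", DownloadInfo("app.img", 400, upstream="file-server"))
table.update("node-b", DownloadInfo("app.img", 100, upstream="node-a"))
table.update("node-b", DownloadInfo("app.img", 250, upstream="node-a"))
print(table.downloaded_size("app.img", "node-b"))
```

The downloaded size and downstream count read from such a table are the inputs that the two download conditions of the equalization download policy compare.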
  • before the query server determines the first upstream download node of the first download node of the N nodes, the method further includes: the query server receives a query request sent by the first download node, where the query request is used to query information of an upstream download node for downloading the image file.
  • the query server is a centralized query server.
  • the centralized query server may be a specific one of the N nodes.
  • the centralized query server is in an active/standby mode.
  • because the centralized query server provides its service in active/standby mode, multiple servers are ready to provide the query service at any time. When the active query server goes down, a standby query server takes over the query service, which gives higher availability than a single query server.
  • some or all of the N nodes in the cluster may be reused as query servers in a distributed query server system.
  • using a distributed query server system can reduce the load on any single query server.
  • because the distributed query server system reuses nodes to provide the query service, it avoids the resource consumption of deploying separate query servers.
  • the above method for downloading image files in a cluster can be used in a Docker environment.
  • a node is provided, where the node is a node in a cluster, the cluster includes a file server and N nodes, the file server provides a download service of the image file for at least one of the N nodes, at least one node of the N nodes that has downloaded the image file provides a download service of the image file for at least one of the N nodes, and N is a positive integer greater than 1. The node includes a server module and a data download service module. The server module is configured to send a download request for the image file to the data download service module. The data download service module is configured to: obtain, from the query server according to the download request for the image file, information of a first upstream download node, where the first upstream download node is a node, determined in the download source set based on the equalization download policy, that provides the download service of the image file for the node, and the download source set includes the file server and at least one node of the N nodes that has downloaded the image file; and download the image file from the first upstream download node.
  • a node that has downloaded the image file provides the download service of the image file for other nodes, and the upstream download node of the first download node is selected by the query server, so that the first download node can download the image file from that upstream download node. This avoids having all nodes in the cluster download the image file from the file server, which reduces the load on download service resources in the cluster environment and improves the download efficiency of the cluster.
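The interaction between the server module and the data download service module can be sketched as below. It is a minimal illustration in which the class names, method names, and the three injected callables (`query_upstream`, `fetch`, `report`) are hypothetical stand-ins for the query server, the upstream transfer, and the download information report; no real network protocol is implied.

```python
class DataDownloadService:
    """Queries for an upstream node, downloads from it, then reports progress."""

    def __init__(self, node_name, query_upstream, fetch, report):
        self.node_name = node_name
        self.query_upstream = query_upstream  # asks the query server for an upstream
        self.fetch = fetch                    # downloads the image from the upstream
        self.report = report                  # sends download information back

    def handle_download_request(self, image):
        upstream = self.query_upstream(image, self.node_name)
        data = self.fetch(upstream, image)
        self.report(self.node_name, image, len(data), upstream)
        return data


class ServerModule:
    """Forwards image download requests to the data download service module."""

    def __init__(self, download_service):
        self.download_service = download_service

    def pull(self, image):
        return self.download_service.handle_download_request(image)


# In-memory stubs used only to exercise the flow.
blobs = {"file-server": {"app.img": b"abc"}}
reports = []

service = DataDownloadService(
    node_name="node-1",
    query_upstream=lambda image, node: "file-server",
    fetch=lambda upstream, image: blobs[upstream][image],
    report=lambda node, image, size, upstream: reports.append(
        (node, image, size, upstream)
    ),
)
data = ServerModule(service).pull("app.img")
print(len(data), reports)
```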
  • the data download service module is further configured to: send first download information to the query server, where the first download information is the download information of the image file downloaded by the node and is used by the query server to update the download information list of the image file in the query server; the download information list includes download information of the image file at the N nodes.
  • the first download information includes a file name of the image file and a file size of the image file that the node has downloaded.
  • the first download information includes a file name of the image file, a file size of the image file that the node has downloaded, the download time for the node to download the image file, and the upstream download node from which the node downloads the image file.
  • the equalization download policy includes a first download condition, where the first download condition is that the number of downstream download nodes of the first upstream download node is less than a preset threshold.
  • the downstream download node is a node that is downloading the image file from the first upstream download node.
  • by adopting the first download condition in the equalization download policy, multiple downstream download nodes in the cluster can obtain the image file from the same upstream download node, thereby improving the download efficiency of the cluster.
  • the equalization download policy further includes a second download condition, where the second download condition is that the size of the image file that the node has downloaded is smaller than the size of the image file that the first upstream download node has downloaded.
  • by adopting the second download condition in the equalization download policy, the data of the image file that the downstream download node still needs to download has already been obtained by the upstream download node.
  • the data download service module is further configured to: send a query request to the query server, where the query request is used to query information of an upstream download node for downloading the image file.
  • the query server is a centralized query server.
  • the centralized query server may be a specific one of the N nodes.
  • the centralized query server is in an active/standby mode.
  • because the centralized query server provides its service in active/standby mode, multiple servers are ready to provide the query service at any time. When the active query server goes down, a standby query server takes over the query service, which gives higher availability than a single query server.
  • some or all of the N nodes in the cluster may be reused as query servers in a distributed query server system.
  • using a distributed query server system can reduce the load on any single query server.
  • the distributed query server system can also provide the query service by reusing nodes, which avoids the resource consumption of deploying separate query servers.
  • the data download service module is further configured to: determine the query server from the distributed query server system.
  • the N nodes in the cluster are reused to form the distributed query server system, and the cluster further comprises a storage server that stores list information of the N nodes. The data download service module is specifically configured to: obtain the list information of the N nodes from the storage server; and determine, according to the list information, the query server for querying the first upstream download node from the distributed query server system.
  • the storage server may include list information of N nodes in the cluster. According to the list information, the information of the distributed query server system can be determined, thereby providing a distributed query server system for the cluster, and improving the download efficiency of the cluster.
  • the data download service module is specifically configured to determine the query server by using a consistent hash algorithm according to the list information.
  • a consistent hash algorithm may be used to determine, from the distributed query server system, the query server for querying the upstream download node information, so that the query servers of different image files are distributed across different nodes, reducing the load on a single query server.
  • the data download service module is further configured to: send a registration request to the storage server, the registration request including an IP address of the node and a registration port number.
  • the list information of the N nodes includes an IP address of the N nodes and a registration port number.
  • the N nodes in the cluster form the list information in the storage server by sending registration requests to the storage server.
  • the first download node can determine the query server of the image file from the distributed query server system according to the list information.
  • the data download service module is further configured to provide a query server function.
  • the node is both a download node of the first image file and a query server for other nodes of the N nodes (for example, a second node).
  • the data download service module further provides the following functions:
  • determining a second upstream download node of the second node, where the second upstream download node is a node in the download source set that provides the second node with a download service of the second image file, and the download source set includes the file server and at least one node of the N nodes that has downloaded the second image file; and sending the information of the second upstream download node to the second node.
  • the download information of the second image file in the download information list of the second image file may include a file name and a file size of the downloaded second image file.
  • the first download node (as a query server) records the number of downstream download nodes of the allocated second upstream download node in the download information list of the second image file.
  • the data download service module determines a second upstream download node according to the download information list, where the download information list includes download information of the second image file at the N nodes.
  • the data download service module is further configured to:
  • the second download condition is that the size of the second image file that the second download node has downloaded is smaller than the size of the second image file that the second upstream download node has downloaded.
  • the data download service module is further configured to:
  • the second download information includes a file name of the second image file and a file size of the second image file that the second download node has downloaded.
  • the second download information includes a file name of the second image file, a file size of the second image file that the second download node has downloaded, the download time for the second download node to download the second image file, and the upstream download node from which the second download node downloads the second image file.
  • the node is any one of the N nodes; the node downloads the first image file and the second node downloads the second image file, and different image files correspond to different query servers.
  • a query server is provided, wherein the query server is applied to a cluster, the cluster includes a file server and N nodes, and the file server provides the image file for at least one of the N nodes.
  • a download service wherein at least one node of the N nodes that downloaded the image file provides a download service of the image file for at least one of the N nodes, wherein the N is a positive integer greater than 1,
  • the query The server includes: a processing module, configured to determine, according to the equalization download policy, a first upstream download node of the first one of the N nodes, wherein the first upstream download node is provided in the download source set for the first download node a node of the download service of the image file, the download source set includes the file server and at least one node of the N nodes in which the image file is downloaded; and a transceiver module, configured to send the first upstream download to the first download node Node information.
  • a node that has downloaded the image file provides the download service of the image file for other nodes, and the upstream download node of the first download node is selected by the query server, so that the first download node can download the image file from the upstream download node. This prevents all nodes in the cluster from downloading the image file from the file server, which reduces the occupation of download service resources in the cluster environment and improves the download efficiency of the cluster.
  • the processing module is specifically configured to: determine the first upstream download node according to the equalization download policy and a download information list, where the download information list includes the download information of the image file at the N nodes.
  • the equalization download policy includes a first download condition, and the processing module is configured to: determine the first upstream download node according to the first download condition, where the first download condition is that the number of downstream download nodes of the first upstream download node is less than a preset threshold.
  • the downstream download node is a node that is downloading the image file from the first upstream download node.
  • the first download condition in the equalization download policy prevents too many downstream download nodes in the cluster from obtaining the image file from the same upstream download node, thereby improving the download efficiency of the cluster.
  • the equalization download policy further includes a second download condition, where the processing module is configured to: determine the first upstream download node according to the first download condition and the second download condition.
  • the first download condition is that the number of the downstream download nodes of the first upstream download node is less than a preset threshold;
  • the second download condition is that the size of the image file that the first download node has downloaded is smaller than the size of the image file that the first upstream download node has downloaded.
  • the second download condition in the equalization download policy ensures that the data of the image file that the downstream download node still needs to download has already been obtained at the upstream download node.
  • the transceiver module is further configured to: receive first download information sent by the first download node, where the first download information is the download information of the first download node downloading the image file; and the processing module is further configured to: update the download information list of the image file in the query server according to the first download information.
  • the first download information includes a file name of the image file and a file size that the first download node has downloaded the image file.
  • the first download information includes the file name of the image file, the size of the image file that the first download node has downloaded, the download time at which the first download node downloaded the image file, and the upstream download node from which the first download node downloads the image file.
  • the transceiver module is further configured to: receive a query request sent by the first download node, where the query request is used to query information of an upstream download node for downloading the image file.
  • the query server is a centralized query server.
  • the centralized query server may be a specific one of the N nodes.
  • the centralized query server employs an active/standby mode.
  • because the centralized query server provides its service in the active/standby mode, multiple servers are ready to provide the query service at any time. When the main query server goes down, a standby query server provides the query service, which gives higher availability than a single query server.
  • some or all of the N nodes in the cluster may be multiplexed as a query server in a distributed query server system.
  • the distributed query server system can reduce the load on any single query server.
  • the distributed query server system can also provide a query service by multiplexing nodes, which can avoid the resource consumption caused by separately configuring the query server.
  • a node comprising: a memory for storing a computer program; and a processor for executing the computer program stored in the memory, to cause the node to perform the method in the first aspect or in any possible implementation of the first aspect.
  • a query server comprising: a memory for storing a computer program; and a processor for executing the computer program stored in the memory, to cause the query server to perform the method in the second aspect or in any possible implementation of the second aspect.
  • the node and the query server may be chips.
  • a system comprising a file server and the node in any of the above aspects or in any possible implementation of any of the above aspects.
  • a readable storage medium comprising a program or instructions which, when run on a computer, cause the method in the first aspect and the second aspect, or in any possible implementation thereof, to be executed.
  • a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the first aspect and the second aspect described above or any one of its possible implementations.
  • FIG. 1 is a schematic diagram of an implementation manner of a cluster scenario applied in an embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of implementing image file downloading according to an embodiment of the present application.
  • FIG. 3 is an interaction flowchart of a method for downloading an image file in a cluster according to an embodiment of the present application.
  • FIG. 4 is an interactive flow chart of a method for downloading an image file in a cluster according to another embodiment of the present application.
  • Figure 5 is a schematic illustration of a hash ring in accordance with one embodiment of the present application.
  • FIG. 6 is a schematic diagram of image file downloading in a cluster according to an embodiment of the present application.
  • Figure 7 is a schematic block diagram of a node in accordance with one embodiment of the present application.
  • FIG. 8 is a schematic block diagram of a node in accordance with another embodiment of the present application.
  • FIG. 9 is a schematic block diagram of a query server in accordance with one embodiment of the present application.
  • FIG. 10 is a schematic block diagram of a node in accordance with another embodiment of the present application.
  • the embodiment of the present application is applicable to image file downloading in a cluster environment.
  • the image file can be a data backup of data stored on one disk on another disk.
  • the image file can also be a file processing tool, such as converting files of other formats into a specific file format.
  • the image file may be a file similar to the compressed package, for example, a specific series of files are made into a single file according to a certain format, so that the user can download and use the file.
  • the image file can also be a registration document that includes the instructions for creating a Docker container.
  • a cluster aggregates multiple servers to provide the same service, and from the client's perspective the cluster appears as a single server.
  • Clusters use multiple computers or nodes for parallel computing to achieve high computational speeds.
  • FIG. 1 is a schematic diagram of an implementation manner of a cluster applied in an embodiment of the present application.
  • a file server 101 and a plurality of nodes 102 may be included in the cluster.
  • the file server 101 is used to provide an image file download service to the plurality of nodes 102 in the cluster.
  • Node 102 can be a device having processing functionality, for example any computing device known in the art, such as a server or a desktop computer.
  • a memory and a processor can be included in node 102.
  • Memory can be used to store program code, such as operating systems and other applications.
  • the processor can be used to call program code stored in the memory to implement the corresponding functions of the node.
  • the processor and the memory included in the node may be implemented by a chip, which is not specifically limited herein.
  • An operating system and other applications can be installed at the node. For example, a container application such as Docker can be installed at the node.
  • In a Docker environment, a client-server (C/S) architecture can be used to receive requests from clients, such as data download requests, and to process those requests.
  • when the download of image files in the cluster adopts a central download mode, all nodes in the cluster need to obtain image files from a central file server. As the number of nodes downloading simultaneously increases, the download delay grows linearly with the number of downloading nodes, resulting in a longer download delay for each node. To keep the download delay from increasing with the number of downloading nodes, the resources of the server need to be increased.
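  • The linear growth described above can be illustrated with a rough model. The sketch below assumes, purely for illustration, that the file server's outbound bandwidth is shared equally among all concurrent downloaders; the function name, parameters, and figures are hypothetical and do not come from the application.

```python
def central_download_delay(file_size_mb: float,
                           server_bandwidth_mb_per_s: float,
                           num_nodes: int) -> float:
    """Seconds until every node finishes when all nodes pull the same file
    from one server whose bandwidth is split evenly (illustrative assumption)."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    per_node_bandwidth = server_bandwidth_mb_per_s / num_nodes
    return file_size_mb / per_node_bandwidth


# Under this model, ten times as many nodes means ten times the delay.
one_node = central_download_delay(1000, 100, 1)
ten_nodes = central_download_delay(1000, 100, 10)
```

  • Under this assumption the delay scales with the node count unless the server bandwidth is raised in proportion, which is the resource cost the paragraph above refers to.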
  • the node can obtain the image file from the file server, or obtain the image file from other nodes that have obtained some or all of the image files.
  • the file server can provide the image file download service to an initial node. Any node in the cluster can be used as the initial node. After the initial node obtains the image file, it can serve as a file source of the image file and provide the image file download service to other nodes.
  • the topology formed by the download sources and download nodes can constitute a multi-way tree structure for image file download.
  • the download node represents a node that performs image file downloading.
  • the terms “download node” and “node” may be used interchangeably.
  • FIG. 2 is a schematic structural diagram of an image file downloading implementation according to an embodiment of the present application.
  • the query server may be a file tracker (FileTracker) server.
  • the structural diagram shows one implementation of the download sources and download nodes of an image file, as formed after the upstream download nodes are determined by the query server.
  • the cluster in FIG. 2 includes a file server and a plurality of nodes, such as a first node, a second node, a third node, a fourth node, and a fifth node in FIG. 2.
  • the node in FIG. 2 can also be any one of the nodes in FIG. 1.
  • the first node may download the image file directly from the file server, and may then, as a file source of the image file, provide a download service for the image file to other nodes in the cluster (for example, the second node and the third node).
  • after the second node downloads the image file, the second node can, as a file source of the image file, provide the image file download service to other nodes in the cluster (for example, the fourth node and the fifth node), and so on.
  • the N nodes in the cluster do not all need to download the image file from the file server; the image file can instead be downloaded from an upstream download node.
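  • The multi-way tree formed in the FIG. 2 example can be sketched as a mapping from each node to its upstream download node. The node labels and helper names below are illustrative assumptions, not identifiers from the application.

```python
from collections import defaultdict

# node -> upstream download node, following the FIG. 2 example
upstream_of = {
    "node1": "file_server",
    "node2": "node1",
    "node3": "node1",
    "node4": "node2",
    "node5": "node2",
}


def build_tree(upstream_of):
    """Invert the upstream mapping into upstream -> list of downstream nodes."""
    children = defaultdict(list)
    for node, upstream in upstream_of.items():
        children[upstream].append(node)
    return dict(children)


def hops_to_file_server(node, upstream_of):
    """Count the download hops between a node and the root file server."""
    hops = 0
    while node in upstream_of:
        node = upstream_of[node]
        hops += 1
    return hops


tree = build_tree(upstream_of)
```

  • In this sketch the file server serves only the first node directly, while the remaining nodes are served by nodes that already hold the image file.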
  • the query server is configured to provide each download node with information of the upstream download node.
  • the process by which any node in the cluster obtains information of its upstream download node from the query server is described below with reference to FIG. 2.
  • the first node sends a query request to the query server, where the query request is used to query information of the upstream download node for downloading the image file, that is, the query request is used to query information of the upstream download node of the first node.
  • the query request may include the file name of the image file that needs to be downloaded at the first node and the size of the image file that has been downloaded at the first node.
  • the query server sends a query result to the first node, where the query result includes information of the upstream download node of the first node.
  • for example, the first node sends a query request to the query server, and the request is used to query the information of the upstream download node from which the first node downloads image file 1.
  • after receiving the query request sent by the first node, the query server sends the query result to the first node. For example, if the query server finds that no node has downloaded any data of image file 1, the query result is that the first node needs to download image file 1 from the file server. After the first node obtains the query result, the first node downloads image file 1 from the file server.
  • the third node sends a query request to the query server, where the query request is used to query the information of the upstream download node of the third node, and the query request may include the file name of the image file to be downloaded at the third node and The size of the image file that has been downloaded at the third node.
  • the query server sends a query result to the third node, where the query result includes information of the upstream download node of the third node.
  • the size of the image file 1 is 300 MB, and the data of 100 MB in the image file 1 has been downloaded at the third node.
  • the third node needs to continue to download the remaining 200 MB data in the image file 1.
  • after receiving the query request of the third node, the query server sends the query result to the third node.
  • the result of the query is that the first node may be its upstream download node (assuming that the first node here has completed the download of 200 MB of data required by the third node).
  • the third node continues to download the remaining 200 MB of data in the image file 1 from the first node.
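  • The query exchange in the 300 MB example can be sketched as two small message types. The class and field names are hypothetical; the application only states that the query request may carry the file name and the size already downloaded, and that the result carries the upstream download node information.

```python
from dataclasses import dataclass


@dataclass
class QueryRequest:
    file_name: str      # image file the node needs to download
    downloaded_mb: int  # size of that file already present at the node


@dataclass
class QueryResult:
    upstream: str       # information of the upstream download node


def remaining_mb(total_mb: int, request: QueryRequest) -> int:
    """Data the requesting node still has to fetch from its upstream node."""
    return total_mb - request.downloaded_mb


# Third node: image file 1 is 300 MB and 100 MB is already downloaded.
request = QueryRequest(file_name="File1", downloaded_mb=100)
result = QueryResult(upstream="node1")
to_fetch = remaining_mb(300, request)
```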
  • the manner in which any one of the plurality of nodes queries the information of its upstream download node from the query server may include, but is not limited to, the above manner.
  • the entire image file may be downloaded from a single upstream download node, or may be downloaded from different upstream download nodes, in which case the entire data of the image file is obtained after several downloads.
  • FIG. 3 is a schematic flowchart diagram of a method for downloading an image file in a cluster according to an embodiment of the present application.
  • the first download node may be any one of the N nodes in the cluster, for example, may be any one of the nodes in FIG.
  • the method for downloading the image file is applicable to a cluster that includes a file server and N nodes, where N is a positive integer greater than 1. The file server provides a download service of the image file for at least one of the N nodes, and at least one of the N nodes that has downloaded the image file provides a download service of the image file for at least one of the N nodes.
  • the first download node of the N nodes receives information of a first upstream download node sent by the query server, where the first upstream download node is a node, determined in the download source set according to the equalization download policy, that provides the download service of the image file for the first download node, and the download source set includes the file server and at least one of the N nodes that has downloaded the image file.
  • the first upstream download node is selected from the download sources that hold the image file.
  • the first upstream download node may be a node in the cluster that has downloaded the image file, or may be the file server. This embodiment of the present application does not limit this.
  • the first download node of the N nodes acquires information of the first upstream download node from the query server when downloading the image file.
  • the first download node may first send a query request for the first upstream download node to the query server, and the query server sends the information of the first upstream download node to the first download node according to the query request.
  • the query request is used to query information of an upstream download node for downloading the image file. That is, the query request is used to query information of the first upstream download node.
  • the query request may include the file name of the image file that needs to be downloaded and the file size of the image file that has been downloaded at the first download node.
  • the query server may also send the information of the corresponding upstream download node to a download node according to the download information list; that is, the embodiment of the present application does not require that the information of the upstream download node be sent in response to a request.
  • the query server determines the first upstream download node based on the equalization download policy and the download information list (refer to the following description for the specific determination process).
  • the download information list includes download information of the image file at the N nodes.
  • the download information of a node in the download information list may be empty. That is, the download information of a node that has downloaded the image file is not empty, and the download information of a node that has not downloaded the image file is empty.
  • the download information in the download information list may include a file name and a file size of the downloaded image file.
  • the query server may maintain a download information list according to download information of each node, for example, a download information list shown in Table 1 below, and record the number of downstream download nodes of the allocated upstream download node in the download information list.
  • the downstream download node is the node that is acquiring the image file from the set of download sources.
  • the download source set includes at least one node of the file server and the N nodes of the cluster that have downloaded the image file.
  • the download information in the download information list may include the file name of the downloaded image file and the size of the image file that has been downloaded, and the download information of each node may further include the download time of downloading the image file and the upstream download node from which the image file is downloaded.
  • the query server can generate a more detailed list of download information, so as to determine the upstream download node based on the equalization download policy, which is not limited in this embodiment of the present application.
  • the download information list may be download information of the image file at multiple nodes. For downloading different image files, the download information list is different.
  • the download information in the download information list records information of different download nodes that have downloaded the same image file.
  • for example, Table 1 shows the download information of image file 1 (File1) at each node.
  • Table 1 is one implementation of the download information list of an image file in the embodiment of the present application, which is not limited in this embodiment of the present application.
  • the download information list of an image file may include the file name of the downloaded image file, the name of each node that downloads the image file, and the size of the image file that has been downloaded at each of those nodes.
  • the nodes that download image file 1 are a first node, a second node, a third node, a fourth node, and a fifth node, where 1000 MB of image file 1 has been downloaded at the first node, 500 MB at the second node, 800 MB at the third node, 400 MB at the fourth node, and 400 MB at the fifth node.
  • the number of downstream download nodes of each node that downloads the image file 1 is also recorded in Table 1.
  • the first node has two downstream download nodes, the second node has two downstream download nodes, and the third node, the fourth node, and the fifth node have no downstream download nodes downloading image file 1.
  • the query server may determine the first upstream download node based on the equalization download policy.
  • the equalization download strategy is used to balance the download load of each node in the cluster.
  • the equalization download policy may be a policy set based on the download status of each node in the cluster.
  • the equalization download policy may be set according to the current download situation of the image file at each node, and may also be dynamically adjusted, so as to prevent too many nodes in the cluster from downloading the image file from one node and causing a download bottleneck.
  • the equalization download policy may include a first download condition, that is, the query server may implement balancing of load in the cluster according to the first download condition.
  • the first download condition is that the number of the downstream download nodes of the first upstream download node is less than a preset threshold.
  • the equalization download policy may further include a second download condition, that is, the query server may determine the first upstream download node according to the first download condition and the second download condition.
  • the first download condition is that the number of downstream download nodes of the first upstream download node is less than a preset threshold;
  • the second download condition is that the size of the image file that the first download node has downloaded is smaller than the size of the image file that the first upstream download node has downloaded.
  • the first downloading condition in the balancing policy is adopted, so that multiple downstream downloading nodes in the cluster can be prevented from acquiring the image file from the same upstream downloading node.
  • the second download condition in the equalization policy is adopted to ensure that the data of the image file that needs to be downloaded at the downstream download node has been acquired at the upstream download node.
  • the sixth node when the sixth node needs to download the data in the image file 1, it is assumed that 450 MB of the data in the image file 1 has been downloaded at the sixth node before this. At this time, the sixth node sends a query request to the query server, where the query request is used to query information of the upstream download node of the sixth node. If the preset threshold is 2, in combination with Table 1, the query server may determine the upstream download node of the sixth node by using the first download condition and the second download condition.
  • the first download condition is that the number of downstream download nodes of the first upstream download node is less than the preset threshold; the second download condition is that the size of the image file that the first download node has downloaded is smaller than the size of the image file that the first upstream download node has downloaded.
  • the number of downstream download nodes downloading the image file from the first node and from the second node is 2 in each case, which does not satisfy the first download condition; and the 450 MB of image file 1 already downloaded at the sixth node is larger than the 400 MB of image file 1 already downloaded at the fourth node and at the fifth node, which does not satisfy the second download condition. Therefore, based on Table 1, the query server determines according to the first download condition and the second download condition that the third node can be the upstream download node of the sixth node.
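  • The sixth-node example can be checked with a minimal sketch that applies the two download conditions to the Table 1 values quoted in the text. The tuple layout and the function name are illustrative assumptions; the application does not prescribe how the query server stores or scans the download information list.

```python
# (node, MB of File1 already downloaded, number of downstream download nodes)
TABLE_1 = [
    ("node1", 1000, 2),
    ("node2", 500, 2),
    ("node3", 800, 0),
    ("node4", 400, 0),
    ("node5", 400, 0),
]


def candidate_upstreams(entries, requester_downloaded_mb, threshold):
    """Return the nodes that satisfy both download conditions."""
    candidates = []
    for name, downloaded_mb, downstream_count in entries:
        # First condition: fewer downstream download nodes than the threshold.
        meets_first = downstream_count < threshold
        # Second condition: the candidate holds more data than the requester.
        meets_second = requester_downloaded_mb < downloaded_mb
        if meets_first and meets_second:
            candidates.append(name)
    return candidates


# Sixth node: 450 MB already downloaded, preset threshold of 2.
print(candidate_upstreams(TABLE_1, 450, 2))  # ['node3']
```

  • With a threshold of 3, the first and second nodes would also qualify, which shows how the preset threshold trades load spreading against the choice of sources.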
  • the first download node downloads the image file from the first upstream download node.
  • the first download node receives the information of the first upstream download node sent by the query server, determines the first upstream download node according to that information, and obtains the image file from the first upstream download node.
  • when the first download node downloads the image file from the first upstream download node, it may download the entire image file or only part of the image file.
  • after the first download node downloads the image file from the first upstream download node, the first download node sends first download information to the query server.
  • the first download information is the download information of the first download node downloading the image file.
  • the first download node starts the download of the image file, but before the image file download is completed, it can provide the download service of the image file for other downstream download nodes.
  • for example, when the first download node has downloaded 10 MB of data of the image file, the first download node sends the first download information for the image file to the query server. While the first download node is downloading the remaining data of the image file, if another download node needs to download that 10 MB of data, the query server may determine, according to the first download information, that the first download node can provide the download service for that 10 MB of data of the image file to the other download node.
  • the time at which the first download node sends the first download information of the download image file to the query server is not specifically limited.
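  • the first download information may include only the file name of the image file and the size of the image file that the first download node has downloaded; or it may additionally include the download time at which the first download node downloaded the image file and the upstream download node from which the first download node downloads the image file.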
  • the query server may update the download information list in the query server according to the received first download information, where the download information list includes download information of the image file at multiple nodes.
  • for example, the download information of the nodes that download image file 1 (File1) in Table 1 is updated.
  • when a node downloads an image file from its upstream download node, the node may choose to send the download information of the image file to the query server at any time after starting the download.
  • the query server updates the list of download information in the server according to the received download information.
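  • The 10 MB example, in which a node reports partial progress and can then serve that part to others, can be sketched as follows. The nested dictionary layout and the function names are assumptions made for illustration only.

```python
def record_progress(download_info_list, file_name, node, downloaded_mb):
    """Store the latest size reported by a node for an image file."""
    download_info_list.setdefault(file_name, {})[node] = downloaded_mb


def can_serve(download_info_list, file_name, node, needed_mb):
    """True if the node is recorded as holding at least the first needed_mb."""
    return download_info_list.get(file_name, {}).get(node, 0) >= needed_mb


download_info_list = {}
# The first download node reports after fetching 10 MB of the image file.
record_progress(download_info_list, "File1", "node1", 10)
serves_10 = can_serve(download_info_list, "File1", "node1", 10)
serves_11 = can_serve(download_info_list, "File1", "node1", 11)
```

  • Because the report can be sent at any time after the download starts, a node can appear as a download source before it holds the complete image file.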
  • the query server may be a centralized query server.
  • the centralized query server can be a node in the cluster.
  • the centralized query server may be a specific node of the N nodes of the cluster to provide a centralized query server function.
  • the centralized query server adopts an active/standby mode.
  • the main query server is responsible for monitoring all the standby query servers, and when a standby query server goes down, the main query server restarts that standby query server.
  • when the main query server goes down, one of the standby query servers in the cluster takes over the work of the main query server.
  • because the centralized query server provides its service in the active/standby mode, at least two query servers are ready to provide the query service, and when the main query server goes down, a standby query server provides the query service. Therefore, compared with a single query server, the embodiment of the present application can provide a highly available service. However, the active/standby mode may have a bottleneck: since only one query server works at a time, when multiple nodes in the cluster query the information of the upstream download node from that query server at the same time, there may be a single-point performance bottleneck.
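  • The active/standby behaviour can be sketched with a toy model. The class name, the method names, and the choice to promote the first live standby are assumptions; the application does not describe how failures are detected or how the new main query server is elected.

```python
class ActiveStandbyGroup:
    """Toy model of one main query server plus standby query servers."""

    def __init__(self, server_names):
        if len(server_names) < 2:
            raise ValueError("active/standby needs at least two query servers")
        self.alive = {name: True for name in server_names}
        self.primary = server_names[0]

    def report_down(self, name):
        if name == self.primary:
            # Main server failed: promote the first standby that is still up.
            self.alive[name] = False
            standbys = [n for n, up in self.alive.items() if up]
            if not standbys:
                raise RuntimeError("no standby query server available")
            self.primary = standbys[0]
        else:
            # Standby failed: the main server restarts it (modelled as instant).
            self.alive[name] = True


group = ActiveStandbyGroup(["qs1", "qs2", "qs3"])
group.report_down("qs2")   # standby is restarted, main server unchanged
group.report_down("qs1")   # main server fails, qs2 takes over
```

  • Only `group.primary` answers queries at any moment, which is the single-point performance limit noted above.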
  • the query server may also be a distributed query server system.
  • the query server that sends the information of the first upstream download node to the first download node is a query server in the distributed query server system.
  • the distributed query server system may be a system formed by multiplexing some or all of the N nodes in the cluster. There are multiple query servers in the distributed query server system, and multiple query servers provide services in a distributed manner.
  • the query server may also be an independent server, that is, a server that does not reuse a node in the cluster; this is not specifically limited herein.
  • the distributed query server can also be a collection of independent servers and servers that reuse nodes in the cluster.
  • the cluster further includes a storage server, where the storage server is configured to provide the node with list information, and the node uses the list information to determine the query server of the image file in the distributed query server system.
  • a first download node of the N nodes sends a registration request to the storage server, where the registration request includes the IP address and the registration port number of the first download node. The IP address and the registration port number of the first download node are stored in the list information in the storage server.
  • the storage server obtains the IP address and the registration port number of each download node carried in its registration request, and forms the list information of the N nodes from the registration requests that the N nodes send to the storage server.
  • for example, when all the nodes in the cluster send registration requests to the storage server, each carrying the IP address and the registration port number of the sending node, the storage server holds list information of the N nodes.
  • the storage server includes list information of the partial nodes.
  • the cluster includes 100 nodes, and the list information in the storage server may be list information of all nodes (for example, 100 nodes), or may be a list of some nodes (for example, 50 nodes or 80 nodes) in the cluster. information.
  • the storage server is ETCD, a highly available distributed key-value (KV) store for shared configuration and service discovery.
  • ETCD uses the Raft protocol, a strongly consistent distributed log protocol, to keep the state of each node in the ETCD cluster consistent.
  • the ETCD cluster is a distributed system, where multiple nodes communicate with each other to form an overall external service. Each node stores complete data, and the Raft protocol ensures that the data maintained by each node is consistent.
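  • The registration step can be sketched with an in-memory stand-in for the storage server. This is not the ETCD client API; the class, its methods, and the addresses are hypothetical and only mirror the described behaviour of storing each node's IP address and registration port number in the list information.

```python
class InMemoryRegistry:
    """In-memory stand-in for the storage server (not an ETCD client)."""

    def __init__(self):
        self._nodes = {}

    def register(self, node_name, ip, port):
        """Handle a registration request carrying an IP address and port."""
        self._nodes[node_name] = (ip, port)

    def list_info(self):
        """Return the list information as sorted 'ip:port' strings."""
        return sorted(f"{ip}:{port}" for ip, port in self._nodes.values())


registry = InMemoryRegistry()
registry.register("node1", "10.0.0.1", 7000)
registry.register("node2", "10.0.0.2", 7000)
registry.register("node1", "10.0.0.1", 7000)  # re-registration is idempotent
```

  • A download node would read `registry.list_info()` before choosing the query server for an image file.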
  • the first downloading node may determine, from the distributed query server system, a query server that queries the first upstream download node information, including:
  • the first download node obtains list information from the storage server
  • the first download node determines the query server from the distributed query server system by using a consistent hash algorithm according to the list information.
  • The first download node first obtains the list information from the storage server and uses a hash algorithm to calculate the hash value of the file name of the image file and the hash values of the N nodes in the list information.
  • The query server is then determined from the distributed query server system by comparing the hash values. For example, the hash value of the file name of the image file is compared with the hash values of the N nodes, and the node whose hash value is closest to that of the file name serves as the query server of the image file. The first download node then sends a query request for the image file to the determined query server.
  • If the query server receives a query request for the image file for the first time, it determines from this first query request that it is the query server of the image file.
  • In this case, the query server determines that the file server is the upstream download node of the first download node, that is, the first download node needs to obtain the image file from the file server. After the first download node finishes downloading the image file, it sends the first download information about the downloaded image file to the query server, and the query server generates the download information list of the image file. In this way, a node becomes the query server of the image file when it first receives a query request for that image file.
  • At this point, the query server holds the download information list of the image file.
  • The distributed query server system here reuses the N nodes in the cluster, and the list information stores the information of each node. The hash value calculated for each of the N nodes (for example, the hash value of the node name) is therefore the hash value of the corresponding query server in the distributed query server system.
  • The first download node thus determines the query server of the image file from the distributed query server system by applying a consistent hash algorithm to the list information (the specific steps are described with reference to FIG. 5 below).
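  • The selection of a query server from hash values can be sketched as follows. The text does not name a hash function, so the use of MD5, the 32-bit truncation, and the function names `hash_value` and `select_query_server` are assumptions made only for illustration.

```python
import hashlib


def hash_value(name: str) -> int:
    # Assumption: MD5 truncated to 32 bits; the source does not specify a hash function.
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def select_query_server(file_name: str, node_names: list[str]) -> str:
    """Return the node whose hash value is closest to the file name's hash value."""
    file_hash = hash_value(file_name)
    # The node name is used as a tie-breaker so the result does not depend on list order.
    return min(node_names, key=lambda n: (abs(hash_value(n) - file_hash), n))


nodes = ["node-1", "node-2", "node-3", "node-4"]
query_server = select_query_server("nginx:latest", nodes)
```

  • Because every node computes the same hash values from the same list information, all nodes that download the same image file arrive at the same query server without coordinating with each other.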
  • Any one of the N nodes (referred to below as the first node) may function in one of the following three ways, which is not limited by the present application. In the first possibility, the first node serves only as a download node of the first image file and has the functions of the first download node in the above method of downloading an image file in a cluster.
  • In the second possibility, the first node serves only as a query server for querying the upstream download node information of the first image file and has the functions of the query server in the above method of downloading an image file in a cluster.
  • In the third possibility, the first node serves both as a download node of the first image file and as a query server for other nodes (for example, the second download node) of the N nodes.
  • The following description takes as an example the first image file downloaded by the first download node and the second image file downloaded by the second download node.
  • Different image files can correspond to different query servers.
  • For example, the first node serves both as a download node of the first image file and as a query server for other nodes (for example, the second download node) of the N nodes, and the second download node downloads the second image file.
  • The second download node calculates the hash value of the file name of the second image file and the hash values of the N nodes in the list information.
  • By comparing the hash value of the file name of the second image file with the hash values of the N nodes, the second download node finds the node hash value closest to that of the file name; the first node, which corresponds to this hash value, serves as the query server of the second image file.
  • When the first node serves as a query server, it has the functions of the query server described above.
  • the first node receives a query request sent by the second download node, where the query request is used to query information of an upstream download node that downloads the second image file.
  • the first node as the query server, determines a second upstream download node that can provide the second image download for the second download node.
  • The second upstream download node is a node in the download source set that provides the download service of the second image file for the second download node, where the download source set includes the file server and at least one of the N nodes that has downloaded the second image file.
  • After determining the second upstream download node, the first node sends the information of the second upstream download node to the second download node.
  • The first node may determine the second upstream download node according to the download information list of the second image file, where the download information list includes the download information of the second image file at the N nodes.
  • the first node as the query server may determine the second upstream download node based on the equalization download policy and the download information list.
  • The equalization download policy may include a first download condition, and the first node may determine the second upstream download node according to it. The first download condition is that the number of downstream download nodes of the second upstream download node is less than a preset threshold.
  • The equalization download policy may further include a second download condition, in which case the first node determines the second upstream download node according to both conditions. The first download condition is as stated above; the second download condition is that the size of the second image file data already downloaded by the second download node is smaller than the size of the second image file data already downloaded by the second upstream download node.
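  • The two download conditions can be sketched as a filter over the download source set. The `DownloadInfo` record, the `pick_upstream` function, and the rule of preferring the candidate with the fewest downstream nodes are hypothetical; the text only states the two conditions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DownloadInfo:
    node: str              # node name, or "file server"
    downloaded_mb: int     # size of the image file data already downloaded
    downstream_count: int  # number of downstream download nodes currently served


def pick_upstream(requester: DownloadInfo, sources: list[DownloadInfo],
                  threshold: int) -> Optional[DownloadInfo]:
    candidates = [
        s for s in sources
        if s.node != requester.node
        and s.downstream_count < threshold              # first download condition
        and requester.downloaded_mb < s.downloaded_mb   # second download condition
    ]
    # Assumption: among eligible sources, prefer the one serving the fewest nodes.
    return min(candidates, key=lambda s: s.downstream_count, default=None)


sources = [
    DownloadInfo("file server", 500, 3),  # already at the threshold, so excluded
    DownloadInfo("node 1", 100, 0),
]
chosen = pick_upstream(DownloadInfo("node 2", 0, 0), sources, threshold=3)
```

  • In this sample, the file server already serves as many downstream nodes as the threshold allows, so the request from node 2 is directed to node 1, which spreads the load across the download source set.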
  • The download information sent by the second download node to the query server may include the file name of the downloaded second image file and the size of the second image file data that has been downloaded.
  • The query server may maintain the download information list according to the download information of each node, for example, the download information list shown in Table 1 above, record in the download information list the number of downstream download nodes of each upstream download node, and determine the upstream download node based on the equalization download policy and the download information list.
  • In addition to the file name and the downloaded size, the download information of each node may further include the time at which the second image file was downloaded and the upstream download node from which it was downloaded.
  • The query server (here, the first node) can then generate a more detailed download information list and determine the upstream download node based on the equalization download policy, which is not limited in this embodiment of the present application.
  • After the second download node starts downloading the second image file, it may send the second download information to the first node before it completes the download.
  • The first node updates its download information list of the second image file according to the second download information.
  • In other words, when the second download node downloads from the second upstream download node, it may download the entire second image file or only part of it.
  • The second download information is similar to the first download information. The main difference is that the first download information is the information about the first image file downloaded at the first download node, whereas the second download information is the information about the second image file downloaded at the second download node.
  • The foregoing method in the embodiments of the present application may be implemented by a Data Download Service (DDS) on the node.
  • For example, the function of the first download node in the foregoing embodiment may be implemented by the service management module (DDS Service Handle) in the DDS.
  • The function of the query server in the foregoing embodiment may be implemented by the file tracking module (FileTracker) in the DDS. If a node serves as a download node and also provides the query server function for other nodes, it can implement the two functions with the DDS Service Handle module and the FileTracker module in the DDS, respectively (described in detail with reference to FIG. 6 below).
  • In the embodiments of the present application, a method for downloading an image file in a cluster is provided, in which each node queries the information of its upstream download node through the query server.
  • The download of image files in the cluster thus changes from the central download mode, in which all nodes download the image file from the file server, to a mode in which the file server and other nodes both provide file sources, which reduces the time complexity of downloading the image file.
  • The time complexity of downloading an image file in the cluster can be reduced from O(N) (proportional to N) to O(LogN) (logarithmic in N). For example, with 100 nodes in the cluster, the download time changes from being proportional to 100K to being proportional to Log(100)K, which shortens the download time and improves the download efficiency in the cluster.
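  • The difference between O(N) and O(LogN) can be illustrated with a simplified round-based model. The model is an assumption made for illustration and is not taken from the text: every source (the file server or a node that already holds the image file) serves exactly one node per round, and all rounds take the same time.

```python
def rounds_central(n_nodes: int) -> int:
    # Central download mode: only the file server is a source, one node per round.
    return n_nodes


def rounds_with_peer_sources(n_nodes: int) -> int:
    # Every node that finished downloading becomes a source in the next round.
    finished, rounds = 0, 0
    while finished < n_nodes:
        sources = 1 + finished  # the file server plus the nodes holding the file
        finished += min(sources, n_nodes - finished)
        rounds += 1
    return rounds


central_100 = rounds_central(100)           # 100 rounds
peer_100 = rounds_with_peer_sources(100)    # 7 rounds
peer_1000 = rounds_with_peer_sources(1000)  # 10 rounds
```

  • Under this model the number of sources roughly doubles each round, so the number of rounds grows with the logarithm of the node count rather than with the node count itself.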
  • The method of downloading an image file in a cluster of the present application was verified in a 1000-node cluster.
  • In the test, 1000 nodes simultaneously downloaded a 180 MB image of Nginx, a high-performance HTTP and reverse proxy server.
  • Without the image file downloading method of the present application, the download took 95 minutes (5700 seconds) to complete; with the method, it took only 50 seconds, a speedup of about 114 times (5700 s / 50 s).
  • FIG. 4 is an interactive flowchart of a method for downloading an image file in a cluster according to an embodiment of the present application.
  • the first node and the second node in FIG. 4 may be any two of the N nodes in the cluster.
  • the query server may be a centralized query server or a query server in a distributed query server system.
  • first node may also be referred to as a first download node
  • second node may also be referred to as a second download node
  • For example, the data size of image file 1 is 500 MB, and the data of image file 1 already downloaded at the first node and the second node before the current download is shown in Table 2: the first node has already downloaded 100 MB of image file 1 (this 100 MB may be data left from a previous download of image file 1), and the second node has not downloaded any of image file 1.
  • the first node and the second node respectively send a query request for downloading the image file 1 to the query server.
  • the first node and the second node send a query request to the same query server.
  • The query request sent by the first node may include information indicating that the first node has already downloaded 100 MB of image file 1; through this request the first node queries the query server for the information of the first upstream download node.
  • The query request sent by the second node includes information indicating that the second node has not downloaded any data of image file 1; through this request the second node queries the query server for the information of the second upstream download node.
  • The query server sends a query result to the first node and to the second node according to their query requests.
  • The query result sent to the first node indicates that the first upstream download node is the file server, and the query result sent to the second node indicates that the second upstream download node is the first node.
  • The first node acquires image file 1 from the file server, and the second node acquires image file 1 from the first node.
  • The first node and the second node each send their download information for image file 1 to the query server.
  • For example, the second node may send its download information to the query server after acquiring 100 MB of image file 1 from the first node.
  • The first node may send its download information to the query server after acquiring all 500 MB of image file 1 from the file server.
  • The query server updates the download information list of image file 1 accordingly.
  • The updated download information list of image file 1 is shown in Table 3. It should be noted that the second node and the first node may also send their download information for image file 1 to the query server at other times.
  • For example, the second node may send its download information when it has downloaded less than 100 MB of image file 1; alternatively, after downloading 100 MB of image file 1, the second node may continue to download the data that the first node downloads, and send its download information to the query server after downloading all 500 MB of image file 1.
  • The query server then selects, according to the updated download information list, an appropriate upstream download node to provide the download service of image file 1 to other nodes.
  • The first node and the second node may send their download information to the query server when they finish downloading image file 1 or while they are still downloading it.
  • In the latter case, the part of image file 1 that the first node and the second node have already downloaded can serve as a file source for other nodes that download image file 1.
  • This is not specifically limited in the embodiments of the present application.
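  • The interaction in FIG. 4 can be sketched with a small query server model. The `QueryServer` class and its selection rule (prefer a node that holds more data than the requester, otherwise fall back to the file server) are assumptions for illustration; the downstream-count threshold is left out for brevity.

```python
FILE_SERVER = "file server"
IMAGE_SIZE_MB = 500


class QueryServer:
    def __init__(self, download_list: dict[str, int]):
        # Download information list: node name -> MB of image file 1 already downloaded.
        self.download_list = dict(download_list)

    def query(self, node: str) -> str:
        have = self.download_list.get(node, 0)
        peers = {n: mb for n, mb in self.download_list.items()
                 if n != node and mb > have}
        # Assumption: pick the peer holding the most data, else the file server.
        return max(peers, key=peers.get) if peers else FILE_SERVER

    def report(self, node: str, downloaded_mb: int) -> None:
        # Update the download information list from a node's download information.
        self.download_list[node] = downloaded_mb


qs = QueryServer({"first node": 100, "second node": 0})
upstream_first = qs.query("first node")    # no peer holds more than 100 MB
upstream_second = qs.query("second node")  # the first node already holds 100 MB
qs.report("second node", 100)
qs.report("first node", IMAGE_SIZE_MB)
```

  • With the starting state described above, the first node is directed to the file server and the second node to the first node; after both reports, the list records 500 MB for the first node and 100 MB for the second node.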
  • In the embodiments of the present application, a method for downloading an image file in a cluster is provided. Specifically, each node queries the information of its upstream download node through the query server, so that the download of image files in the cluster changes from the central download mode, in which all nodes download the image file from the file server, to a mode in which the file server and other nodes both provide file sources. The time complexity of downloading an image file in the cluster is reduced from O(N) to O(LogN), which reduces the occupation of download service resources in the cluster environment and improves the download efficiency in the cluster.
  • Figure 5 is a schematic illustration of a hash ring in accordance with one embodiment of the present application.
  • The first download node determines the query server for querying the first upstream download node by applying a consistent hash algorithm to the list information.
  • The N nodes form a distributed query server system, that is, the distributed query server system reuses the N nodes in the cluster.
  • The cluster further includes a storage server that holds the list information of the N nodes. Using a consistent hash algorithm, the first download node calculates the hash value of each node from the list information and compares the hash value of the file name of the image file with the hash value of each node to determine, in the distributed query server system, the query server for querying the first upstream download node.
  • The hash value of the file name may be obtained by applying the consistent hash algorithm to the file name or the file number.
  • The hash value of a node may be obtained by applying the consistent hash algorithm to the number of the node.
  • The hash value of a server name may be obtained by applying the consistent hash algorithm to the server name or number.
  • The distributed query server system reuses the N nodes in the cluster, so the hash value of each node is the hash value of the query server that reuses that node.
  • The hash information obtained by the hash algorithm in the embodiments of the present application may take the form of a table or, for example, a hash ring.
  • The following description takes a hash ring as an example of the hash information; the hash ring below is equivalent to the hash information, which is not limited by the embodiments of the present application.
  • the consistent hash algorithm as a data distribution reference for distributed computing, has certain advantages compared with the traditional modulo and segmentation.
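  • For example, when 10 data items (0 to 9) are placed by modulo on three nodes, Node a holds items 0, 3, 6, 9 and Node b holds items 1, 4, 7. After a fourth node is added, Node a holds items 0, 4, 8 and Node b holds items 1, 5, 9, so most items move to a different node.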
  • With the consistent hash algorithm, the hash values of the nodes and of the data are compared, and the node whose hash value is closest to that of the data is determined as the storage node. This ensures that the least amount of data is affected when nodes are added or removed.
  • For example, Node a has the hash value 203 and holds data items 0, 1, 2, both before and after a node is added.
  • When a node is added to the cluster, for example a node Node n, only data items 5 and 6 of the 10 data items need to be migrated under the consistent hash algorithm; the remaining data stays on its original nodes.
  • The data distribution under the consistent hash algorithm therefore changes relatively little when the number of nodes increases or decreases.
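  • The contrast between modulo placement and closest-hash placement can be checked with a short sketch. All hash values below are hypothetical and were chosen only so that the outcome mirrors the example in the text (data items 5 and 6 migrate); they are not values from the source.

```python
def place_modulo(items, nodes):
    # Traditional modulo placement: item i goes to node number i mod (node count).
    return {i: nodes[i % len(nodes)] for i in items}


def place_closest(item_hashes, node_hashes):
    # Closest-hash placement: each item goes to the node with the nearest hash value.
    return {i: min(node_hashes, key=lambda n: abs(node_hashes[n] - h))
            for i, h in item_hashes.items()}


def moved(before, after):
    return sorted(i for i in before if before[i] != after[i])


items = list(range(10))
mod_moved = moved(place_modulo(items, ["a", "b", "c"]),
                  place_modulo(items, ["a", "b", "c", "n"]))

item_hashes = {i: 10 * i for i in items}    # hypothetical hash values of the data
node_hashes = {"a": 5, "b": 45, "c": 85}    # hypothetical hash values of the nodes
ch_moved = moved(place_closest(item_hashes, node_hashes),
                 place_closest(item_hashes, {**node_hashes, "n": 54}))
```

  • With these sample values, modulo placement moves seven of the ten items when the node is added, while closest-hash placement moves only items 5 and 6.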
  • A hash ring is a ring-shaped logical structure obtained by hashing the numbers of the data blocks in the physical nodes of a distributed storage system to obtain the hash values of the data blocks, and then sorting the hash values.
  • the identifier in the embodiment of the present application may be a version number, that is, the identifier of the hash information may be a version number of the hash information.
  • the identifier of the hash information may be a number, for example, the identifiers of the N hash information may be 1, 2, 3, .
  • the identifier may also be in other forms, for example, an English letter or the like, which is not limited by the embodiment of the present application. In the following, the description will be made only by taking the identification as the node number as an example, and the node number in the following can be replaced by the identifier. However, embodiments of the present application are not limited thereto.
  • the node in the embodiment of the present application may be a physical node or a virtual node. This embodiment of the present application does not limit this.
  • The first download node needs to determine, from the distributed query server system, the query server for querying the information of the first upstream download node from which image file 1 is downloaded. Similar to the method above, the image file can be regarded as the data and the query server as the storage node: the hash values of the storage nodes and of the data are compared, and the node whose hash value is closest to that of the data is determined as the storage node.
  • Applied to the query server, the hash value of the file name of the image file is compared with the hash values of the N nodes, and the node whose hash value is closest to that of the file name serves as the query server of the image file, that is, the query server that sends the information of the first upstream download node to the first node.
  • In other words, the first download node first determines, from the distributed query server system, the query server for querying the information of the first upstream download node, and then queries that query server for the information of the first upstream download node.
  • For example, the first download node calculates the hash value of each node and the hash value of image file 1 with the consistent hash algorithm, and then compares the hash value of image file 1 with the hash value of each node to find the node with the closest hash value.
  • The query server that reuses this node is the query server for querying the information of the first upstream download node.
  • For example, the hash value of the downloaded image file 1 is 18, and the hash values of node 1 to node 7 are 10, 15, 22, 33, 50, 60, and 70, respectively.
  • The hash value closest to the hash value 18 of image file 1 is the hash value 15 of node 2, so node 2 is finally determined as the query server of image file 1.
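  • The worked example above can be reproduced in a few lines, using the hash values given in the text and the absolute difference as the measure of closeness; the variable names are illustrative only.

```python
node_hashes = {"node 1": 10, "node 2": 15, "node 3": 22, "node 4": 33,
               "node 5": 50, "node 6": 60, "node 7": 70}
image_file_hash = 18

# Distance of every node's hash value from the hash value of image file 1.
distances = {n: abs(h - image_file_hash) for n, h in node_hashes.items()}
query_server = min(distances, key=distances.get)  # "node 2" (distance 3)
```

  • Node 3 is the runner-up with a distance of 4, which shows how a small change in the file hash value can select a different query server.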
  • In the embodiments of the present application, a method for downloading an image file in a cluster is provided.
  • Each node queries the information of its upstream download node through the query server, so that the download of image files in the cluster changes from the central download mode, in which all nodes download the image file from the file server, to a mode in which the file server and other nodes both provide file sources.
  • Moreover, different image files can have different query servers in the distributed query server system, so the query server function is distributed across multiple query servers and there is no single-point performance bottleneck.
  • The sequence numbers of the foregoing processes do not imply an order of execution. The order of execution of each process should be determined by its function and internal logic, and the sequence numbers should not constitute any limitation on the implementation process of the embodiments of the present application.
  • FIG. 6 is a schematic block diagram of an implementation manner of image file downloading in a cluster according to an embodiment of the present application.
  • The method for downloading an image file in a cluster described above can be used in a Docker environment.
  • In a Docker environment, each node runs a Docker Daemon (daemon) and, through the Docker Daemon, downloads files from the Docker Registry Server (registration server) using the Docker Registry Protocol.
  • The embodiments of the present application do not require modifying the source code of the Docker Daemon and can adopt a non-intrusive integration method.
  • Non-intrusive integration means that the user code does not depend heavily on a framework, so the existing code can still be used when the code design is refactored. Non-intrusive integration is therefore less costly than intrusive integration and makes better use of the source code.
  • Through the architecture shown in FIG. 6, the method for downloading an image file in a cluster of the present application can be integrated non-intrusively into the Docker download environment: integrating the method with the Docker system only requires a configuration change, for example, to the Docker Registry API Endpoint configuration, without modifying the source code of Docker-related components.
  • a Data Download Service (DDS) file can be installed at each node 610.
  • The Docker Daemon (daemon) on the download node receives data download requests from the client and forwards them to the DDS for processing.
  • The DDS provides a Docker Registry API Endpoint that simulates the Docker Registry Server, so that the DDS offers the Docker Registry Server service to the Docker Daemon through this endpoint.
  • DDS can include 4 modules: Docker Registry Proxy, DDS Service Handle, File Tracker, and Node Manager.
  • The Docker Registry Proxy provides the Docker Daemon with image file downloads through the Docker Registry API Endpoint.
  • When the Docker Registry Proxy receives a request from the Docker Daemon through the Docker Registry API Endpoint, it checks the type of the request. If the request is not a download request for a Docker image file, for example a metadata (Metadata) request, the Docker Registry Proxy forwards it to the Docker Registry Server 620 for processing; if the request is a download request for a Docker image file, the Docker Registry Proxy calls the DDS Service Handler to provide the download service for the image file.
  • For example, the Docker Daemon in the node 610 in FIG. 6 receives a data download request sent by the client and sends it to the Docker Registry Proxy. The Docker Registry Proxy filters the data download request and, if it is a request for the image file, sends it to the DDS Service Handle.
  • The main function of the Docker Registry Proxy is to filter the data download requests received by the Docker Daemon. If the client's data download request is an image file download request, it is sent to the DDS Service Handler; if it is a download request for other data, it is sent to the file server for processing. The Docker Registry Server (registration server) can be the file server.
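  • The filtering performed by the Docker Registry Proxy can be sketched as a routing function. The text does not say how an image file download request is recognized; this sketch assumes it corresponds to a blob GET in the Docker Registry HTTP API V2 (`/v2/<name>/blobs/<digest>`). The function name and the returned labels are hypothetical.

```python
import re

# Assumption: image file data is fetched as blobs, /v2/<name>/blobs/<digest>.
BLOB_PATH = re.compile(
    r"^/v2/(?P<name>.+)/blobs/(?P<digest>[A-Za-z0-9_+.-]+:[A-Fa-f0-9]+)$"
)


def route_request(method: str, path: str) -> str:
    """Return which component should handle a request from the Docker Daemon."""
    if method == "GET" and BLOB_PATH.match(path):
        return "dds_service_handler"   # image file download request
    return "docker_registry_server"    # metadata and all other requests


blob_route = route_request("GET", "/v2/library/nginx/blobs/sha256:abc123")
manifest_route = route_request("GET", "/v2/library/nginx/manifests/latest")
```

  • Keeping the routing decision in one small function means that requests the proxy does not recognize fall through to the registry server unchanged, which matches the non-intrusive goal described in the text.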
  • The DDS Service Handler module implements the functions of the first download node in the embodiments of the present application: determining the upstream download node and downloading the image file from it.
  • The function of the query server is implemented by the File Tracker module on the node.
  • A node that is reused as a query server provides the query server function through this module.
  • Node Manager provides an ETCD-based node registration service, through which each node in the cluster registers its own IP address and port with a centralized storage server 630 such as ETCD. When a specific file needs to be downloaded in the distributed query server system, a node can apply the consistent hash algorithm to the complete list information to calculate which node provides the query service for that file.
  • Because the Node Manager provides the ETCD-based node registration service for the distributed query server system, it is disabled when a centralized query server is used.
  • In that case, the nodes in the cluster query the information of their upstream download nodes through the centralized query server.
  • non-intrusive integration means that the Docker Daemon source code does not need to be modified, only the configuration needs to be modified.
  • The Docker Daemon needs to access the Docker Registry Server to download image files, and the address of the Docker Registry Server is configured in the Docker Daemon parameters.
  • In the present application, the DDS acts as a simulated Docker Registry Server: the Docker Daemon parameters are modified so that the Docker Daemon accesses the server simulated by the DDS, and data requests are then processed by the DDS file. The Docker Daemon source code therefore does not need to be modified, which provides non-intrusive integration in a Docker environment.
  • the functions of the first download node and the first upstream download node may be implemented by a DDS Service Handler module in a DDS file installed on the node, and the function of the query server may be implemented by a File Tracker module in the DDS file.
  • The method of downloading an image file in a cluster can also be implemented in a Kubernetes cluster, an open source platform for automated container operations.
  • The embodiments of the present application are used for downloading image files in a cluster environment, for example Docker image files, and can also be used for downloading other specific files to multiple nodes in a cluster environment, such as virtual machine images or big data runtimes, which is not limited by the embodiments of the present application.
  • Applying the method for downloading an image file in a cluster to the Docker environment not only solves the single point of failure and the single-point performance bottleneck that exist when image files are downloaded in the central download mode in current data center environments, but also provides a non-intrusive integration method for the Docker environment. This does not affect the existing Docker open source system, which improves the utilization of the Docker system and reduces cost.
  • The sequence numbers of the foregoing processes do not imply an order of execution. The order of execution of each process should be determined by its function and internal logic, and the sequence numbers should not constitute any limitation on the implementation process of the embodiments of the present application.
  • The method for downloading an image file in a cluster has been described in detail above.
  • With this method, the time complexity of downloading is reduced from O(N) to O(LogN).
  • The query server tracks the list of file information downloaded by each node in the cluster and, on request from other nodes, selects their upstream download nodes based on a selection policy. It should be understood that the node and the query server in the embodiments of the present application can perform the methods of the foregoing embodiments; for the specific working processes of the products described below, refer to the corresponding processes in the foregoing method embodiments.
  • FIG. 7 shows a schematic block diagram of a node 700 (the node in FIG. 7 may be any one of the nodes in FIG. 1) in accordance with an embodiment of the present application.
  • the node 700 may correspond to any one of the N nodes in each method embodiment, and may have any function of the nodes in the method.
  • The node 700 in FIG. 7 is a node in the cluster. The cluster includes a file server and N nodes; the file server provides a download service of the image file for at least one of the N nodes, and at least one of the N nodes that has downloaded the image file provides a download service of the image file for at least one other of the N nodes, where N is a positive integer greater than one.
  • the node 700 includes a server module 710 and a data download service module 720.
  • the server module 710 is configured to send a download request of the image file to the data download service module.
  • The data download service module 720 is configured to obtain the information of the first upstream download node from the query server according to the download request of the image file, where the first upstream download node is a node that is determined from the download source set based on the equalization download policy and that provides the node 700 with the download of the image file, and the download source set includes the file server and at least one of the N nodes that has downloaded the image file; and to download the image file from the first upstream download node.
  • the server module 710 and the data download service module 720 included in the node 700 may be implemented in the same module, that is, the server module and the data download service module may be included in the same module.
  • the main function of the server module 710 is to receive the data download request sent by the client and to forward all data download requests to the data download service module 720.
  • the server module 710 may be the Docker Daemon in FIG. 7, and the data download service module 720 may be a DDS file.
  • the equalization download policy may include a first download condition, where the number of downstream download nodes of the first upstream download node is less than a preset threshold.
  • the equalization download policy further includes a second download condition, where the size of the image file that the first download node has downloaded is smaller than the size of the image file that the first upstream download node has downloaded.
  • the data download service module 720 is further configured to:
  • the download information list includes download information of the image file at N nodes.
  • the data download service module 720 is further configured to:
  • the query server is a centralized query server, and the centralized query server is a specific one of the N nodes.
  • the centralized query server adopts an active/standby mode.
  • the N nodes form a distributed query server system
  • the query server is a query server in the distributed query server system.
  • the data download service module 720 is further configured to:
  • a query server for querying the first upstream download node is determined from the distributed query server system.
  • the cluster further includes a storage server, where the storage server includes list information of the N nodes, and the data download service module 720 is specifically configured to:
  • a query server for querying the first upstream download node is determined from the distributed query server system.
  • the data download service module 720 is further configured to:
  • the query server is determined using a consistent hash algorithm.
  • the data download service module 720 is further configured to:
  • a registration request is sent to the storage server, the registration request including the IP address of the node and the registration port number.
  • the list information includes an IP address of the N nodes and a registration port number.
  • the first download information includes the file name of the image file and the size of the image file that the node 700 has downloaded.
  • the first download information includes the file name of the image file, the size of the image file that the node 700 has downloaded, the download time of the node 700 downloading the image file, and the upstream download node from which the node 700 downloaded the image file.
  • the data download service module 720 is further configured to provide a query server function.
  • the node 700 serves as both a download node of the first image file and a query server of other nodes (eg, the second node) of the N nodes.
  • the data download service module 720 further includes the following functions:
  • the second upstream download node is a node in the download source set that provides the second node with a download service of the second image file, and the download source set includes the file server and at least one of the N nodes that has downloaded the second image file;
  • the node 700 transmits information of the second upstream download node to the second node.
  • the data download service module 720 is further configured to:
  • the data download service module 720 is further configured to:
  • the download information list including download information of the second image file at the N nodes.
  • the data download service module 720 is further configured to:
  • the second upstream download node is determined according to the first download condition and the second download condition, wherein the first download condition is that the number of downstream download nodes of the second upstream download node is less than a preset threshold, and the second download condition is that the size of the second image file that the second node has downloaded is smaller than the size of the second image file that the second upstream download node has downloaded.
  • the data download service module 720 is further configured to:
  • the second download information includes the file name of the second image file and the size of the second image file that the second download node has downloaded.
  • the second download information includes the file name of the second image file, the size of the second image file that the second download node has downloaded, the download time of the second download node downloading the second image file, and the upstream download node from which the second download node downloads the second image file.
  • the node 700 is any one of the N nodes, the node 700 downloads the first image file, and the second node downloads the second image file.
  • a method for downloading an image file in a cluster is provided. Specifically, each node looks up the information of its upstream download node through the query server, so that the download of the image file in the cluster changes from the centralized download mode, in which all nodes download the image file from the file server, to a mode in which the image file is downloaded from sources provided by the file server and by other nodes. The time complexity of downloading the image file in the cluster is thereby reduced from O(N) to O(log N), which reduces the occupation of download service resources in the cluster environment and improves the download efficiency in the cluster.
  • FIG. 8 shows a schematic block diagram of a node 800 according to another embodiment of the present application (the node 800 in FIG. 8 may be any one of the nodes in FIG. 1); specifically, it shows a schematic structural diagram of the data download service module.
  • the node 800 may correspond to any one of the N nodes in each method embodiment, and may have any function of the nodes in the method.
  • the node 800 is a node in a cluster; the cluster includes a file server and N nodes; the file server provides a download service of the image file for at least one of the N nodes, and at least one of the N nodes that has downloaded the image file provides a download service of the image file for at least one of the N nodes, where N is a positive integer greater than one.
  • the node 800 can include a server module 810 and a data download service module 820.
  • the server module 810 can be the server module 710 in FIG. 7, and the data download service module 820 can be the data download service module 720 in FIG. 7.
  • the data download service module 820 can specifically include one or more of the following modules:
  • the proxy module 821, the processing module 822, the FileTracker file tracking module 823, and the node management module 824.
  • the proxy module 821, the processing module 822, the FileTracker file tracking module 823, and the node management module 824 need not all be present in the data download service module 820 at the same time.
  • the server module 810 is configured to receive a data download request sent by the client, and send the data download request to the data download service module 820.
  • the data download service module 820 is configured to obtain information of the first upstream download node from the query server according to the download request of the image file in the data download request received by the server module 810, where the first upstream download node is a node, determined in the download source set based on the equalization download policy, that provides the node 800 with a download service of the image file, and the download source set includes the file server and at least one of the N nodes that has downloaded the image file; and to download the image file from the first upstream download node.
  • the proxy module 821 is configured to receive a data download request sent by the server module 810, and filter the data request.
  • if the data request is an image file download request, the image file download request is sent to the processing module 822.
  • the equalization download policy includes a first download condition, where the number of downstream download nodes of the first upstream download node is less than a preset threshold.
  • the equalization download policy includes a second download condition, where the size of the image file that the first download node has downloaded is smaller than the size of the image file that the first upstream download node has downloaded.
  • processing module 822 is configured to:
  • processing module 822 is further configured to:
  • the download information list includes download information of the image file at the N nodes.
  • processing module 822 is further configured to:
  • processing module 822 is further configured to:
  • a query server for querying the first upstream download node is determined from the distributed query server system.
  • the cluster further includes a storage server, where the storage server includes list information of N nodes, and the processing module 822 is further configured to: obtain list information of the N nodes from the storage server;
  • a query server for querying the first upstream download node is determined from the distributed query server system.
  • processing module 822 is further configured to:
  • the query server is determined by a consistent hash algorithm.
  • processing module 822 is further configured to:
  • a registration request is sent to the storage server, the registration request including the IP address of the node and the registration port number.
  • the list information includes an IP address of the N nodes and a registration port number.
  • the first download information includes the file name of the image file, the size of the image file that the node 800 has downloaded, the download time of the node 800 downloading the image file, and the upstream download node from which the node 800 downloads the image file.
  • the node 800 may serve as both a download node of the first image file and a query server of other nodes (eg, the second node) of the N nodes.
  • when the node 800 acts as a query server, it also includes the functions of a query server.
  • the data download service module 820 may further include a FileTracker module 823.
  • FileTracker module 823 is used to:
  • the second upstream download node is a node, determined in the download source set based on the equalization download policy, that provides the second node with a download service of the second image file, and the download source set includes the file server and at least one of the N nodes that has downloaded the second image file;
  • the node 800 transmits information of the second upstream download node to the second node.
  • FileTracker module 823 is also used to:
  • the equalization download policy includes a first download condition
  • the FileTracker module 823 is further configured to:
  • the downstream download node is a node that is downloading the image file from the first upstream download node.
  • the equalization download policy further includes a second download condition
  • the FileTracker module 823 is further configured to:
  • FileTracker module 823 is also used to:
  • the second download information includes the file name of the second image file and the size of the second image file that the second download node has downloaded.
  • the second download information includes the file name of the second image file, the size of the second image file that the second download node has downloaded, the download time of the second download node downloading the second image file, and the upstream download node from which the second download node downloads the second image file.
  • the node 800 may be any one of N nodes, the node 800 downloads a first image file, and the second node downloads a second image file.
  • the data download service module 820 may further include a node management module 824.
  • the node management module 824 is used to:
  • nodes in the cluster register their IP address and registered port number to the storage server.
  • the second node may determine a node providing the function of the query server by using a consistent hash algorithm based on the list information.
  • the data download service module 820 includes a node management module 824. If the query server in the cluster is a centralized query server, the node management module 824 is not included in the data download service module 820.
  • a node in the cluster looks up the information of its upstream download node through the query server and downloads the image file from that upstream download node, so that the download of the image file in the cluster changes from the centralized download mode, in which all nodes download the image file from the file server, to a mode in which the image file is downloaded from sources provided by the file server and by other nodes. The time complexity of downloading the image file in the cluster is thereby reduced from O(N) to O(log N), which reduces the occupation of download service resources in the cluster environment and improves the download efficiency in the cluster.
  • FIG. 9 shows a schematic block diagram of a query server 900 in accordance with an embodiment of the present application.
  • the query server 900 may correspond to a query server in each method embodiment, and the query server 900 may be a centralized query server or a query server in a distributed query server system.
  • the query server 900 in FIG. 9 is applied to a cluster; the cluster includes a file server and N nodes; the file server provides a download service of the image file for at least one of the N nodes, and at least one of the N nodes that has downloaded the image file provides a download service of the image file for at least one of the N nodes, where N is a positive integer greater than one.
  • the query server 900 includes:
  • the processing module 910 is configured to determine, based on the equalization download policy, a first upstream download node of a first download node among the N nodes, where the first upstream download node is a node in the download source set that provides the first download node with a download service of the image file, and the download source set includes the file server and at least one of the N nodes that has downloaded the image file;
  • the transceiver module 920 is configured to send information of the first upstream download node to the first download node.
  • the transceiver module 920 is further configured to:
  • processing module 910 is specifically configured to:
  • the first upstream download node is determined based on the equalization download policy and the download information list, and the download information list includes download information of the image file at the N nodes.
  • the equalization download policy includes a first download condition
  • the processing module 910 is specifically configured to:
  • the downstream download node is a node that is downloading the image file from the first upstream download node.
  • the equalization download policy further includes a second download condition
  • the processing module 910 is specifically configured to:
  • the transceiver module 920 is further configured to:
  • the first download information includes the file name of the image file and the size of the image file that the first download node has downloaded.
  • the first download information includes the file name of the image file, the size of the image file that the first download node has downloaded, the download time of the first download node downloading the image file, and the upstream download node from which the first download node downloads the image file.
  • the processing module 910 is further configured to:
  • the download information list of the image file in the query server 900 is updated according to the first download information.
  • the query server 900 is a centralized query server, and the centralized query server is a specific one of the N nodes.
  • the centralized query server can adopt the active/standby mode.
  • the N nodes form a distributed query server system
  • the query server 900 is a query server in the distributed query server system.
  • the query server sends the information of the upstream download node to the node, so that the node downloads the image file from the upstream download node. The download of the image file in the cluster thus changes from the centralized download mode, in which all nodes download the image file from the file server, to a mode in which the image file is downloaded from sources provided by the file server and by other nodes, so that the time complexity of downloading the image file in the cluster is reduced from O(N) to O(log N).
  • FIG. 10 shows a schematic block diagram of a node of another embodiment of the present application, including at least one processor 1020 (eg, a CPU), at least one network interface 1040 or other communication interface, and a memory 1060 with communication connections between the components.
  • the processor 1020 is configured to execute executable modules, such as computer programs, stored in the memory 1060.
  • the memory 1060 may include a high speed random access memory (RAM), and may also include a non-volatile memory such as at least one disk memory.
  • a communication connection with at least one other network element is achieved by at least one network interface 1040 (which may be wired or wireless).
  • memory 1060 stores program 1011, and processor 1020 executes program 1011 for performing the methods of the various embodiments described above.
  • the processor may be configured to execute S310 in FIG. 3, in which the query server sends information of the first upstream download node to the first download node, or S320, in which the first download node obtains the image file from the first upstream download node.
  • the processor may be configured to execute the steps in FIG. 4: S410, in which the first node sends a query request for downloading image file 1 to the query server; S420, in which the query server sends the query result to the first node; S430, in which the first node obtains data from the file server; and S440, in which the first node sends the download information of image file 1 to the query server.
  • the memory 1060 can store the DDS file in the embodiment of the present application.
  • the processor 1020 is configured to execute various modules in the DDS file, such as a Docker Registry Proxy, a DDS Service Handle, a File Tracker, and a Node Manager, so as to implement the embodiment of the present application.
  • the processor 1020 executes the DDS Service Handle module to provide the function of the download node in the embodiments of the present application. If the node acts as a query server, the processor 1020 executes the File Tracker module to provide the function of the query server in the embodiments of the present application. If the node is both a download node and a query server of other nodes, the processor 1020 executes the DDS Service Handle module to provide the function of the download node and executes the File Tracker module to provide the function of the query server. Further, if the node provides the function of a distributed query server, the processor 1020 also executes the Node Manager module to register the node on the storage server.
  • the node may further comprise a memory
  • the memory may store program code
  • the processor calls the program code stored in the memory to implement a corresponding function of the node.
  • the processor and memory can be implemented by a chip.
  • the embodiment of the present application further provides a cluster system, including the foregoing node and the query server.
  • the cluster system may include the nodes shown in FIGS. 7 and 8 described above and the query server shown in FIG. 9.
  • the computer program product includes one or more computer instructions.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions can be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another; for example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, or microwave).
  • the computer readable storage medium can be any available media that can be accessed by a computer or a data storage device such as a server, data center, or the like that includes one or more available media.
  • the usable medium may be a magnetic medium (eg, a floppy disk, a hard disk, a magnetic tape), an optical medium (eg, a DVD), or a semiconductor medium (such as a Solid State Disk (SSD)) or the like.
  • the term "and/or" merely describes an association between associated objects and indicates that three relationships may exist.
  • A and/or B may indicate three cases: A exists alone, both A and B exist, and B exists alone.
  • the character "/" in this document generally indicates an "or" relationship between the associated objects before and after it.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • in actual implementation there may be another division manner; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the functions may be stored in a computer readable storage medium if implemented in the form of a software functional unit and sold or used as a standalone product.
  • the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the various embodiments of the present application.
  • the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Transfer Between Computers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

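The present application provides a method, a node, and a query server for downloading an image file in a cluster. The method is applicable to a cluster that includes a file server and N nodes, where the file server provides a download service of the image file for at least one of the N nodes, and at least one of the N nodes that has downloaded the image file provides a download service of the image file for at least one of the N nodes, N being a positive integer greater than 1. The method includes: a first download node among the N nodes receives information of a first upstream download node sent by a query server, where the first upstream download node is a node, determined in a download source set based on an equalization download policy, that provides the first download node with a download service of the image file, and the download source set includes the file server and at least one of the N nodes that has downloaded the image file; and the first download node downloads the image file from the first upstream download node. The technical solutions of the embodiments of the present application can reduce the occupation of download service resources in a cluster environment and thereby improve the download efficiency of the cluster.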

Description

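Method, node, and query server for downloading an image file in a cluster
This application claims priority to Chinese Patent Application No. 201810146877.X, filed with the Chinese Patent Office on February 12, 2018 and entitled "Method, node, and query server for downloading an image file in a cluster", which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of information technology, and more specifically, to a method, a node, and a query server for downloading an image file in a cluster environment.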
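Background
A cluster is usually a parallel or distributed system made up of interconnected nodes (for example, computers or virtual machines). These nodes work together and run a common set of applications while presenting a single system image to users and applications. For example, viewed from the outside, a computer cluster is one system that provides a unified service. Internally, the computers in the cluster are physically connected by cables and logically connected by cluster software. A server cluster connects multiple servers through communication links; from the outside, these servers appear to work as a single server, while internally the external load is dynamically distributed to the servers through a certain mechanism, thereby achieving the high performance and high availability that would otherwise require a super server.
In a data center environment, a cluster may have anywhere from dozens to tens of thousands of nodes that start downloading an image file at the same time. Image files are currently downloaded in a centralized manner, that is, all nodes in the cluster download the image file from a central image server at the same time. As a result, the download latency grows linearly with the number of download nodes, which affects download efficiency.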
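Summary
The present application provides a method, a node, and a query server for downloading an image file in a cluster environment, which can improve the download efficiency of the cluster.
In a first aspect, a method for downloading an image file in a cluster is provided. The method is applicable to a cluster that includes a file server and N nodes, where the file server provides a download service of the image file for at least one of the N nodes, and at least one of the N nodes that has downloaded the image file provides a download service of the image file for at least one of the N nodes, N being a positive integer greater than 1. The method includes: a first download node among the N nodes receives information of a first upstream download node sent by a query server, where the first upstream download node is a node, determined in a download source set based on an equalization download policy, that provides the first download node with a download service of the image file, and the download source set includes the file server and at least one of the N nodes that has downloaded the image file; and the first download node downloads the image file from the first upstream download node.
In the technical solutions of the embodiments of the present application, a node that has downloaded the image file provides a download service of the image file for other nodes, and the query server selects the upstream download node of the first download node, so that the first download node can download the image file from that upstream download node. This avoids having all nodes in the cluster download the image file from the file server, reduces the occupation of download service resources in the cluster environment, and can therefore improve the download efficiency of the cluster.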
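To illustrate why letting nodes that already hold the image file serve other nodes relieves the file server, the following minimal sketch counts download rounds under strong simplifying assumptions that are not part of the application: every transfer takes exactly one round, each source serves at most a fixed number of nodes per round, and a node becomes a source only after it holds the whole file. The function names and the per-source limit are hypothetical.

```python
import math


def rounds_centralized(n_nodes: int, server_slots: int) -> int:
    """Rounds needed when only the file server serves `server_slots` nodes per round."""
    return math.ceil(n_nodes / server_slots)


def rounds_cascaded(n_nodes: int, max_downstream: int) -> int:
    """Rounds needed when every holder of the file serves up to
    `max_downstream` nodes per round (assumed model, see lead-in)."""
    holders = 1   # initially only the file server holds the image file
    done = 0      # nodes that have finished downloading
    rounds = 0
    while done < n_nodes:
        newly_served = min(holders * max_downstream, n_nodes - done)
        done += newly_served
        holders += newly_served  # finished nodes join the download source set
        rounds += 1
    return rounds
```

Under these assumptions, 1000 nodes with a limit of 2 concurrent downloads per source need 500 rounds when only the file server serves and 7 rounds when every node that holds the file also serves it, which is the linear-versus-logarithmic growth the embodiments refer to.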
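With reference to the first aspect, in some implementations of the first aspect, after the first download node downloads the image file from the first upstream download node, the method further includes: the first download node sends first download information to the query server, where the first download information is the download information of the first download node downloading the image file and is used by the query server to update the download information list of the image file in the query server, and the download information list includes the download information of the image file at the N nodes.
With reference to the first aspect, in some implementations of the first aspect, the first download information includes the file name of the image file and the size of the image file that the first download node has downloaded.
With reference to the first aspect, in some implementations of the first aspect, the first download information includes the file name of the image file, the size of the image file that the first download node has downloaded, the download time of the first download node downloading the image file, and the upstream download node from which the first download node downloads the image file.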
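The download information and the per-image download information list described above could be modeled roughly as follows. This is only a sketch: the field names, the `node_id` key used to identify the reporting node, and the in-memory dictionary are assumptions made for illustration and are not specified by the application.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DownloadInfo:
    file_name: str                          # file name of the image file
    node_id: str                            # reporting node (assumed identifier)
    downloaded_bytes: int                   # size of the image file downloaded so far
    download_time_s: Optional[float] = None # optional: download time
    upstream_node: Optional[str] = None     # optional: upstream download node


class DownloadInfoList:
    """Query-server-side list, one table of node entries per image file."""

    def __init__(self) -> None:
        self._by_file: Dict[str, Dict[str, DownloadInfo]] = {}

    def update(self, info: DownloadInfo) -> None:
        # A newer report from the same node replaces its previous entry.
        self._by_file.setdefault(info.file_name, {})[info.node_id] = info

    def entries(self, file_name: str) -> List[DownloadInfo]:
        return list(self._by_file.get(file_name, {}).values())
```

A node would send one `DownloadInfo` after (or during) a download, and the query server would call `update` so that later queries see the node as a possible download source.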
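With reference to the first aspect, in some implementations of the first aspect, the equalization download policy includes a first download condition, namely that the number of downstream download nodes of the first upstream download node is less than a preset threshold. A downstream download node is a node that is downloading the image file from the first upstream download node.
In the technical solutions of the embodiments of the present application, applying the first download condition of the equalization policy prevents many downstream download nodes in the cluster from obtaining the image file from the same upstream download node, and can therefore improve the download efficiency of the cluster.
With reference to the first aspect, in some implementations of the first aspect, the equalization download policy further includes a second download condition, namely that the size of the image file that the first download node has downloaded is smaller than the size of the image file that the first upstream download node has downloaded.
In the technical solutions of the embodiments of the present application, applying the second download condition of the equalization policy ensures that the upstream download node has already obtained the image file data that the downstream download node needs to download.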
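The two download conditions can be read as a filter over the download source set. The sketch below shows one possible selection routine; the `SourceState` record, the tie-breaking rule (fewest downstream nodes first, then most data), and the choice to count the file server against the same threshold are assumptions for illustration, since the text only states the two conditions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SourceState:
    node_id: str             # file server or a node that holds (part of) the image file
    downloaded_bytes: int    # size of the image file this source has downloaded
    downstream_count: int = 0  # nodes currently downloading from this source


def select_upstream(requester_id: str, requester_bytes: int,
                    sources: List[SourceState],
                    max_downstream: int) -> Optional[SourceState]:
    """Pick an upstream download node that satisfies both download conditions."""
    candidates = [
        s for s in sources
        if s.node_id != requester_id
        and s.downstream_count < max_downstream      # first download condition
        and s.downloaded_bytes > requester_bytes     # second download condition
    ]
    if not candidates:
        return None
    # Assumed tie-break: least-loaded source, then the one holding the most data.
    best = min(candidates, key=lambda s: (s.downstream_count, -s.downloaded_bytes))
    best.downstream_count += 1  # record the newly assigned downstream node
    return best
```

Returning `None` when no source qualifies leaves the retry or fallback behavior to the caller, which the text does not describe.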
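With reference to the first aspect, in some implementations of the first aspect, before the first download node receives the information of the first upstream download node sent by the query server, the method further includes: the first download node sends a query request to the query server, where the query request is used to query information of an upstream download node for downloading the image file.
With reference to the first aspect, in some implementations of the first aspect, the query server is a centralized query server. Optionally, the centralized query server may be a specific node among the N nodes.
With reference to the first aspect, in some implementations of the first aspect, the centralized query server adopts the active/standby mode.
In the technical solutions of the embodiments of the present application, the query server adopts the active/standby mode. Because the centralized query server provides its service in active/standby mode, multiple servers are ready to provide the query service at any time. Therefore, when the active query server goes down, the standby query server provides the query service, which offers a highly available service compared with having only one query server.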
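From the point of view of a download node, the active/standby mode can be sketched as trying the active query server first and falling back to a standby one. The callables standing in for query servers, the use of `ConnectionError` to signal an outage, and the exception name are hypothetical; the application does not describe how failover is detected.

```python
from typing import Callable, Sequence


class QueryServerUnavailable(Exception):
    """Raised when neither the active nor any standby query server answers."""


def query_with_failover(servers: Sequence[Callable[[str], str]],
                        image_name: str) -> str:
    """Ask each query server in priority order (active first, then standbys)
    for the upstream download node of `image_name`."""
    for query in servers:
        try:
            return query(image_name)
        except ConnectionError:
            continue  # this server is down; try the next one
    raise QueryServerUnavailable(image_name)
```

Each callable is assumed to return the address of the selected upstream download node as a string.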
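With reference to the first aspect, in some implementations of the first aspect, some or all of the N nodes in the cluster may be reused as query servers in a distributed query server system.
In the technical solutions of the embodiments of the present application, using a distributed query server system can reduce the load on a single query server. In addition, the distributed query server system reuses nodes to provide the query service, which avoids the resource consumption of deploying separate query servers.
With reference to the first aspect, in some implementations of the first aspect, the method further includes: the first download node determines the query server from the distributed query server system.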
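With reference to the first aspect, in some implementations of the first aspect, the N nodes in the cluster are reused to form a distributed query server system, and the cluster further includes a storage server that stores list information of the N nodes. That the first download node determines the query server from the distributed query server system includes: the first download node obtains the list information from the storage server; and the first download node determines the query server from the distributed query server system according to the list information.
In the technical solutions of the embodiments of the present application, the storage server may store the list information of the N nodes in the cluster. According to the list information, the first download node can determine the query server for the image file from the distributed query server system.
With reference to the first aspect, in some implementations of the first aspect, that the first download node determines the query server from the distributed query server system according to the list information includes: the first download node determines the query server by using a consistent hashing algorithm according to the list information.
In the technical solutions of the embodiments of the present application, when different image files are downloaded, a consistent hashing algorithm can be used to determine, from the distributed query server system, the query server used to query upstream download node information, so that the query servers for different image files are distributed across different nodes, which reduces the load on a single query server.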
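A consistent hashing lookup of this kind could look like the sketch below, which maps an image file name to one of the registered nodes. The use of SHA-256, 64 virtual points per node, and the `(ip, port)` tuples are illustrative assumptions; the application only states that a consistent hashing algorithm is applied to the list information.

```python
import bisect
import hashlib
from typing import List, Sequence, Tuple

Node = Tuple[str, int]  # (IP address, registration port number)


def _hash(key: str) -> int:
    # 64-bit position on the ring derived from SHA-256 (assumed hash choice).
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class QueryServerRing:
    def __init__(self, nodes: Sequence[Node], vnodes: int = 64) -> None:
        # Each node is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (_hash(f"{ip}:{port}#{i}"), (ip, port))
            for ip, port in nodes
            for i in range(vnodes)
        )
        self._keys: List[int] = [h for h, _ in self._ring]

    def lookup(self, image_name: str) -> Node:
        """Return the node acting as query server for `image_name`."""
        if not self._ring:
            raise ValueError("no nodes registered")
        # First ring point clockwise from the image's hash, wrapping around.
        idx = bisect.bisect_right(self._keys, _hash(image_name)) % len(self._ring)
        return self._ring[idx][1]
```

Because every node builds the ring from the same list information, all nodes agree on the query server for a given image file without contacting each other, and removing a node only reassigns the image files that node was responsible for.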
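With reference to the first aspect, in some implementations of the first aspect, the method further includes: the first download node sends a registration request to the storage server, where the registration request includes the IP address and the registration port number of the first download node.
With reference to the first aspect, in some implementations of the first aspect, the list information of the N nodes includes the IP addresses and registration port numbers of the N nodes.
In the technical solutions of the embodiments of the present application, the nodes in the cluster that are reused as query servers send registration requests to the storage server, thereby forming the list information in the storage server. Information about the distributed query server system can be determined from the list information, which provides the cluster with a distributed query server system. Using a distributed query server system can reduce the load on a single query server. In addition, the distributed query server system can reuse nodes to provide the query service, thereby improving the download efficiency of the cluster.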
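The registration step can be pictured with a small in-memory stand-in for the storage server. The class and method names are hypothetical, and a real deployment would use a networked store; the sketch only shows that a registration carries an IP address and a registration port number and that the resulting list information can be read back by any node.

```python
import ipaddress
from typing import List, Set, Tuple


class StorageServer:
    """In-memory stand-in for the storage server holding the node list."""

    def __init__(self) -> None:
        self._nodes: Set[Tuple[str, int]] = set()

    def register(self, ip: str, port: int) -> None:
        ipaddress.ip_address(ip)  # raises ValueError for a malformed address
        if not 0 < port < 65536:
            raise ValueError(f"invalid port: {port}")
        self._nodes.add((ip, port))  # registering twice is harmless

    def list_nodes(self) -> List[Tuple[str, int]]:
        # Sorted so that every node sees the list information in the same order.
        return sorted(self._nodes)
```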
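In some implementations, the first download node serves both as a download node of a first image file and as the query server of other nodes (for example, a second download node) among the N nodes. When acting as a query server, the first download node also includes the functions of a query server.
In some implementations, the first download node determines a second upstream download node of a second download node among the N nodes, where the second upstream download node is a node, determined in a download source set based on the equalization download policy, that provides the second download node with a download service of the second image file, and the download source set includes the file server and at least one of the N nodes that has downloaded the second image file; and the first download node sends information of the second upstream download node to the second download node.
In some implementations, after the second download node downloads the image file from the second upstream download node, the method further includes: the first download node receives second download information sent by the second download node, where the second download information is the download information of the second download node downloading the second image file; and the first download node updates the download information list of the second image file in the first download node according to the second download information.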
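In some implementations, that the first download node (acting as a query server) determines, based on the equalization download policy, the second upstream download node of the second download node among the N nodes includes:
the first download node determines the second upstream download node based on the equalization download policy and the download information list of the second image file, where the download information list includes the download information of the second image file at the N nodes.
In some implementations, the download information of the second image file in the download information list of the second image file may include the file name and file size of the downloaded second image file. Optionally, the first download node (acting as a query server) may also record, in the download information list of the second image file, the number of downstream download nodes assigned to the second upstream download node.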
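In some implementations, the equalization download policy includes a first download condition, and determining the second upstream download node based on the equalization download policy and the download information list includes:
the first download node determines the second upstream download node according to the first download condition, where the first download condition is that the number of downstream download nodes of the second upstream download node is less than a preset threshold, and the download information list includes the download information of the second image file at the N nodes.
In some implementations, the equalization download policy further includes a second download condition, and determining the second upstream download node based on the equalization download policy and the download information list includes:
the first download node determines the second upstream download node according to the first download condition and the second download condition, where the second download condition is that the size of the second image file that the first download node has downloaded is smaller than the size of the second image file that the second upstream download node has downloaded.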
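In some implementations, before the first download node determines the second upstream download node of the second download node among the N nodes, the method further includes:
the first download node receives a query request sent by the second download node, where the query request is used to query information of the second upstream download node.
In some implementations, the second download node sends second download information to the first download node, where the second download information is the download information of the second download node downloading the second image file and is used by the first download node to update the download information list of the second image file, and the download information list includes the download information of the second image file at the N nodes.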
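With reference to the first aspect, in some implementations of the first aspect, the second download information includes the file name of the second image file and the size of the second image file that the second download node has downloaded.
In some implementations, the second download information includes the file name of the second image file, the size of the second image file that the second download node has downloaded, the upstream download node from which the second download node downloads the second image file, and the download time of the second download node downloading the second image file.
In the technical solutions of the embodiments of the present application, a node can serve both as a download node of an image file and as the query server of other download nodes, which reduces the occupation of download service resources in the cluster environment and can therefore improve the download efficiency of the cluster.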
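In a second aspect, a method for downloading an image file in a cluster is provided, characterized in that the method is applicable to a cluster that includes a file server and N nodes, where the file server provides a download service of the image file for at least one of the N nodes, and at least one of the N nodes that has downloaded the image file provides a download service of the image file for at least one of the N nodes, N being a positive integer greater than 1. The method includes: a query server determines, based on an equalization download policy, a first upstream download node of a first download node among the N nodes, where the first upstream download node is a node in a download source set that provides the first download node with a download service of the image file, and the download source set includes the file server and at least one of the N nodes that has downloaded the image file; and the query server sends information of the first upstream download node to the first download node.
In the technical solutions of the embodiments of the present application, the query server selects the upstream download node of the first download node, so that the first download node can download the image file from that upstream download node, where the upstream download node of the first download node is the file server or one of the N nodes that has downloaded the image file. This avoids having all nodes in the cluster download the image file from the file server, reduces the occupation of download service resources in the cluster environment, and can therefore improve the download efficiency of the cluster.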
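With reference to the second aspect, in some implementations of the second aspect, that the query server determines the first upstream download node of the first download node among the N nodes includes: the query server determines the first upstream download node based on the equalization download policy and a download information list, where the download information list includes the download information of the image file at the N nodes.
With reference to the second aspect, in some implementations of the second aspect, the equalization download policy includes a first download condition, and that the query server determines the first upstream download node based on the equalization download policy and the download information list includes: the query server determines the first upstream download node according to the first download condition, where the first download condition is that the number of downstream download nodes of the first upstream download node is less than a preset threshold. A downstream download node is a node that is downloading the image file from the first upstream download node.
In the technical solutions of the embodiments of the present application, applying the first download condition of the equalization policy prevents many downstream download nodes in the cluster from obtaining the image file from the same upstream download node, and can therefore improve the download efficiency of the cluster.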
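With reference to the second aspect, in some implementations of the second aspect, the equalization download policy further includes a second download condition, and that the query server determines the first upstream download node based on the equalization download policy and the download information list includes: the query server determines the first upstream download node according to the first download condition and the second download condition, where the first download condition is that the number of downstream download nodes of the second upstream download node is less than a preset threshold, and the second download condition is that the size of the second image file that the second download node has downloaded is smaller than the size of the second image file that the second upstream download node has downloaded.
In the technical solutions of the embodiments of the present application, applying the second download condition of the equalization policy ensures that the upstream download node has already obtained the image file data that the downstream download node needs to download.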
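With reference to the second aspect, in some implementations of the second aspect, the method further includes: the query server receives first download information sent by the first download node, where the first download information is the download information of the first download node downloading the image file; and the query server updates the download information list of the image file in the query server according to the first download information.
With reference to the second aspect, in some implementations of the second aspect, the first download information includes the file name of the image file and the size of the image file that the first download node has downloaded.
With reference to the second aspect, in some implementations of the second aspect, the first download information includes the file name of the image file, the size of the image file that the first download node has downloaded, the download time of the first download node downloading the image file, and the upstream download node from which the first download node downloads the image file.
With reference to the second aspect, in some implementations of the second aspect, before the query server determines the first upstream download node of the first download node among the N nodes, the method further includes: the query server receives a query request sent by the first download node, where the query request is used to query information of an upstream download node for downloading the image file.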
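With reference to the second aspect, in some implementations of the second aspect, the query server is a centralized query server. Optionally, the centralized query server may be a specific node among the N nodes.
With reference to the second aspect, in some implementations of the second aspect, the centralized query server adopts the active/standby mode.
In the technical solutions of the embodiments of the present application, the query server adopts the active/standby mode. Because the centralized query server provides its service in active/standby mode, multiple servers are ready to provide the query service at any time. Therefore, when the active query server goes down, the standby query server provides the query service, which offers a highly available service compared with having only one query server.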
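With reference to the second aspect, in some implementations of the second aspect, some or all of the N nodes in the cluster may be reused as query servers in a distributed query server system.
In the technical solutions of the embodiments of the present application, using a distributed query server system can reduce the load on a single query server. In addition, the distributed query server system reuses nodes to provide the query service, which avoids the resource consumption of deploying separate query servers.
In some implementations, the foregoing method for downloading an image file in a cluster can be used in a Docker environment.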
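In a third aspect, a node is provided, characterized in that the node is a node in a cluster, where the cluster includes a file server and N nodes, the file server provides a download service of the image file for at least one of the N nodes, and at least one of the N nodes that has downloaded the image file provides a download service of the image file for at least one of the N nodes, N being a positive integer greater than 1. The node includes a server module and a data download service module. The server module is configured to send a download request of the image file to the data download service module. The data download service module is configured to obtain information of a first upstream download node from a query server according to the download request of the image file, where the first upstream download node is a node, determined in a download source set based on an equalization download policy, that provides the node with a download service of the image file, and the download source set includes the file server and at least one of the N nodes that has downloaded the image file; and to download the image file from the first upstream download node.
In the technical solutions of the embodiments of the present application, a node that has downloaded the image file provides a download service of the image file for other nodes, and the query server selects the upstream download node of the first download node, so that the first download node can download the image file from that upstream download node. This avoids having all nodes in the cluster download the image file from the file server, reduces the occupation of download service resources in the cluster environment, and can therefore improve the download efficiency of the cluster.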
结合第三方面,在第三方面的某些实现方式中,该数据下载服务模块还用于:向该查询服务器发送第一下载信息,该第一下载信息为该节点下载该镜像文件的下载信息,该第一下载信息用于该查询服务器更新该查询服务器中的该镜像文件的下载信息列表,该下载信息列表包括该镜像文件在该N个节点的下载信息。
结合第三方面,在第三方面的某些实现方式中,该第一下载信息包括该镜像文件的文件名和该节点已下载该镜像文件的文件大小。
结合第三方面,在第三方面的某些实现方式中,该第一下载信息包括该镜像文件的文件名、该节点已下载该镜像文件的文件大小、该节点下载该镜像文件的下载时间和该节点下载该镜像文件的上游下载节点。
结合第三方面,在第三方面的某些实现方式中,所述均衡下载策略包括第一下载条件,该第一下载条件为该第一上游下载节点的下游下载节点的个数小于预设阈值。其中,下游下载节点即正在从该第一上游下载节点下载该镜像文件的节点。
在本申请实施例的技术方案中,采用均衡策略中的第一下载条件,能够避免集群中多个下游下载节点从同一上游下载节点获取镜像文件,从而能够提高集群的下载效率。
结合第三方面,在第三方面的某些实现方式中,均衡下载策略还包括第二下载条件,该第二下载条件为第一下载节点已下载镜像文件的大小小于第一上游下载节点已下载该镜像文件的大小。
在本申请实施例的技术方案中,采用均衡策略中的第二下载条件,能够确保上游下载节点处已经获取下游下载节点处需要下载的镜像文件的数据。
结合第三方面,在第三方面的某些实现方式中,该数据下载服务模块还用于:向该查询服务器发送查询请求,该查询请求用于查询用于下载镜像文件的上游下载节点的信息。结合第三方面,在第三方面的某些实现方式中,该查询服务器为集中式的查询服务器。可选地,该集中式的查询服务器可以为该N个节点中的特定节点。
结合第三方面,在第三方面的某些实现方式中,该集中式的查询服务器采用主备模式。
在本申请实施例的技术方案中,查询服务器采用主备模式,由于集中式的查询服务器采用主备模式的服务,即有多个服务器随时准备着提供查询服务。因此,当主查询服务器出现宕机时,备查询服务器会提供查询服务,与仅有一个查询服务器相比,提供了高可用的服务。
结合第三方面,在第三方面的某些实现方式中,可以通过复用所述集群中N个节点中的部分或全部,作为分布式查询服务器系统中的查询服务器。
在本申请实施例的技术方案中,采用分布式的查询服务器系统,可以降低单个查询服务器的负荷。此外,分布式的查询服务器系统还可以复用节点提供查询服务,能够避免单独配置查询服务器带来的资源消耗。
结合第三方面,在第三方面的某些实现方式中,该数据下载服务模块还用于:从该分布式的查询服务器系统中确定该查询服务器。
结合第三方面,在第三方面的某些实现方式中,复用集群中的N个节点形成分布式的查询服务器系统,该集群还包括存储服务器,该存储服务器包括该N个节点的列表信息;该数据下载服务模块具体用于:从该存储服务器中获取该N个节点的列表信息;根据该列表信息,从该分布式的查询服务器系统中确定用于查询该第一上游下载节点的查询服务器。
在本申请实施例的技术方案中,存储服务器中可以包括集群中N个节点的列表信息。根据列表信息可以确定分布式的查询服务器系统的信息,从而为集群提供分布式的查询服务器系统,提高了集群的下载效率。
结合第三方面,在第三方面的某些实现方式中,该数据下载服务模块具体用于:根据该列表信息,采用一致性哈希算法确定该查询服务器。
在本申请实施例的技术方案中,在下载不同的镜像文件时,采用一致性哈希算法可以从分布式的查询服务器系统中,确定用于查询上游下载节点信息的查询服务器,可以将不同镜像文件的查询服务器分布到不同的节点上,从而降低单个查询服务器的负荷。结合第三方面,在第三方面的某些实现方式中,该数据下载服务模块还用于:向该存储服务器发送注册请求,该注册请求包括该节点的IP地址以及注册端口号。
结合第三方面,在第三方面的某些实现方式中,该N个节点的列表信息包括该N个节点的IP地址以及注册端口号。
在本申请实施例的技术方案中,集群中的N个节点通过向存储服务器发送注册请求,从而形成存储服务器中的列表信息。第一下载节点根据列表信息,能够从分布式的查询服务器系统中确定该镜像文件的查询服务器。结合第三方面,在第三方面的某些实现方式中,该数据下载服务模块还用于提供查询服务器功能。
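In some implementations, the node serves both as a download node of a first image file and as the query server for other nodes among the N nodes (for example, a second node). When the node acts as a query server, the data download service module further provides the following functions: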
确定该N个节点中的第二节点的第二上游下载节点,其中,该第二上游下载节点是下载源集合中的为该第二节点提供该第二镜像文件的下载服务的节点,其中,下载源集合包括文件服务器和N个节点中下载了该第二镜像文件的至少一个节点;该节点向该第二节点发送第二上游下载节点的信息。
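In some implementations, the download information of the second image file in the download information list of the second image file may include the file name and the downloaded file size of the second image file. The first download node (acting as the query server) also records, in the download information list of the second image file, the number of downstream download nodes of each assigned second upstream download node.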
在某些实现方式中,该数据下载服务模块根据下载信息列表确定第二上游下载节点,该下载信息列表包括第二镜像文件在该N个节点的下载信息。
在某些实现方式中,该数据下载服务模块还用于:
根据第一下载条件和第二下载条件确定该第二上游下载节点,其中,该第一下载条件为该第二上游下载节点的下游下载节点的个数小于预设阈值;第二下载条件为该第二下载节点已下载第二镜像文件的大小小于该第二上游下载节点已下载该第二镜像文件的大小。
在某些实现方式中,该数据下载服务模块还用于:
接收第二节点发送的第二下载信息,该第二下载信息为该第二下载节点下载第二镜像文件的下载信息;
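Updating, according to the second download information, the download information list of the second image file maintained in this node (which acts as the query server).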
在某些实现方式中,第二下载信息包括第二镜像文件的文件名和第二下载节点已下载该第二镜像文件的文件大小。
在某些实现方式中,第二下载信息包括第二镜像文件的文件名、第二下载节点已下载该第二镜像文件的文件大小、该第二下载节点下载第二镜像文件的下载时间和该第二下载节点下载第二镜像文件的上游下载节点。
在本申请实施例中该节点即为N个节点中的任意一个节点,该节点是对第一镜像文件进行下载的节点,第二节点下载的是第二镜像文件,不同的镜像文件对应着不同的查询服务器。
第四方面,提供了一种查询服务器,其特征在于,该查询服务器应用于集群中,该集群包括文件服务器和N个节点,该文件服务器为该N个节点中的至少一个节点提供该镜像文件的下载服务,该N个节点中下载了该镜像文件的至少一个节点为该N个节 点中的至少一个节点提供该镜像文件的下载服务,其中,所述N为大于1的正整数,该查询服务器包括:处理模块,用于基于均衡下载策略确定N个节点中的第一下载节点的第一上游下载节点,其中,该第一上游下载节点是下载源集合中的为该第一下载节点提供该镜像文件的下载服务的节点,该下载源集合包括该文件服务器和该N个节点中下载了该镜像文件的至少一个节点;收发模块,用于向该第一下载节点发送该第一上游下载节点的信息。
在本申请实施例的技术方案中,下载了镜像文件的节点为其他节点提供该镜像文件的下载服务,通过查询服务器对第一下载节点的上游下载节点进行选择,使得第一下载节点可以从该上游下载节点下载该镜像文件,这样能够避免集群中的全部节点都从文件服务器下载镜像文件,可以降低集群环境中下载服务资源的占用,从而能够提高集群的下载效率。
结合第四方面,在第四方面的某些实现方式中,该处理模块具体用于:基于所述均衡下载策略和下载信息列表确定该第一上游下载节点,该下载信息列表包括该镜像文件在该N个节点的下载信息。
结合第四方面,在第四方面的某些实现方式中,所述均衡下载策略包括第一下载条件,该处理模块具体用于:根据第一下载条件确定该第一上游下载节点,其中,该第一下载条件为该第一上游下载节点的下游下载节点的个数小于预设阈值。其中,下游下载节点即正在从该第一上游下载节点下载该镜像文件的节点。
在本申请实施例的技术方案中,采用均衡策略中的第一下载条件,能够避免集群中多个下游下载节点从同一上游下载节点获取镜像文件,从而能够提高集群的下载效率。
结合第四方面,在第四方面的某些实现方式中,均衡下载策略还包括第二下载条件,该处理模块具体用于:根据第一下载条件和第二下载条件确定该第一上游下载节点,其中,该第一下载条件为该第一上游下载节点的下游下载节点的个数小于预设阈值;该第二下载条件为该第一下载节点已下载该镜像文件的大小小于该第一上游下载节点已下载该镜像文件的大小。
在本申请实施例的技术方案中,采用均衡策略中的第二下载条件,能够确保上游下载节点处已经获取下游下载节点处需要下载的镜像文件的数据。
结合第四方面,在第四方面的某些实现方式中,该收发模块还用于:接收该第一下载节点发送的第一下载信息,该第一下载信息为该第一下载节点下载该镜像文件的下载信息;该处理模块还用于:根据该第一下载信息更新该查询服务器中的该镜像文件的下载信息列表。
结合第四方面,在第四方面的某些实现方式中,该第一下载信息包括该镜像文件的文件名和该第一下载节点已下载该镜像文件的文件大小。
结合第四方面,在第四方面的某些实现方式中,该第一下载信息包括该镜像文件的文件名、该第一下载节点已下载该镜像文件的文件大小、该第一下载节点下载该镜像文件的下载时间和该第一下载节点下载该镜像文件的上游下载节点。
结合第四方面,在第四方面的某些实现方式中,该收发模块还用于:接收该第一节点发送的查询请求,该查询请求用于查询用于下载镜像文件的上游下载节点的信息。
结合第四方面,在第四方面的某些实现方式中,该查询服务器为集中式的查询服务 器。可选地,该集中式的查询服务器可以为该N个节点中的特定节点。
结合第四方面,在第四方面的某些实现方式中,该集中式的查询服务器采用主备模式。
在本申请实施例的技术方案中,查询服务器通过采用主备模式,由于集中式的查询服务器采用主备模式的服务,即有多个服务器随时准备着提供查询服务。因此,当主查询服务器出现宕机时,备查询服务器会提供查询服务,与仅有一个查询服务器相比,提供了高可用的服务。
结合第四方面,在第四方面的某些实现方式中,可以通过复用所述集群中N个节点中的部分或全部,作为分布式查询服务器系统中的查询服务器。
在本申请实施例的技术方案中,采用分布式的查询服务器系统,可以降低单个查询服务器的负荷。此外,分布式的查询服务器系统还可以复用节点提供查询服务,能够避免单独配置查询服务器带来的资源消耗。
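In a fifth aspect, a node is provided. The node includes: a memory configured to store a computer program; and a processor configured to execute the computer program stored in the memory, so that the node performs the method in the first aspect or in any possible implementation of the first aspect.
In a sixth aspect, a query server is provided. The query server includes: a memory configured to store a computer program; and a processor configured to execute the computer program stored in the memory, so that the query server performs the method in the second aspect or in any possible implementation of the second aspect.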
结合上述任一方面,在某些实现方式中,上述节点、查询服务器可以为芯片。
第七方面,提供了一种系统,该系统包括文件服务器和上述任一方面或任一方面中的任一种可能实现方式中的节点。
第八方面,提供一种可读存储介质,包括程序或指令,当所述程序或指令在计算机上运行时,根据上述第一方面和第二方面中或其任一种可能实现方式中的方法被执行。
第九方面,提供了一种包含指令的计算机程序产品,其在计算机上运行时,使得计算机执行上述第一方面和第二方面中或其任一种可能实现方式中的方法。
附图说明
图1是本申请实施例应用的集群场景的一种实现方式的示意图。
图2是根据本申请实施例的一种实现镜像文件下载的结构示意图。
图3是根据本申请一个实施例的集群中镜像文件下载的方法的交互流程图。
图4是根据本申请另一个实施例的集群中镜像文件下载的方法的交互性流程图。
图5是根据本申请一个实施例的哈希环的示意图。
图6是根据本申请一个实施例的集群中镜像文件下载的示意图。
图7是根据本申请一个实施例的节点的示意性框图。
图8是根据本申请另一个实施例的节点的示意性框图。
图9是根据本申请一个实施例的查询服务器的示意性框图。
图10是根据本申请另一个实施例的节点的示意性框图。
具体实施方式
下面将结合附图,对本申请中的技术方案进行描述。
本申请实施例适用于集群环境的镜像文件下载。其中,镜像文件可以是一个磁盘上的数据在另一个磁盘上存储的一个数据备份。或者,镜像文件也可以是文件处理工具,例如将其他格式的文件转换为特定的文件格式。或者,镜像文件还可以是与压缩包类似的文件,例如,将特定的一系列文件按照一定的格式制作成单一的文件,以方便用户下载和使用。在Docker(容器)环境下,镜像文件还可以为一个注册说明文件,其中包括创建Docker的说明。
集群将多个服务器集中起来进行同一种服务,从客户的角度可以将集群视为一个服务器。集群利用多个计算机或节点进行并行计算从而获得高的计算速度。
图1是本申请实施例应用的集群的一种实现方式的示意图。
如图1所示,集群中可以包括文件服务器101和多个节点102。文件服务器101用于向集群中的多个节点102提供镜像文件下载的服务。
节点102可以为具有处理功能的设备,例如可以包括当前技术已知的任何计算设备,如服务器、台式计算机等等。节点102中可以包括存储器和处理器。存储器可以用于存储程序代码,例如,操作系统以及其他应用程序。处理器可以用于调用存储器存储的程序代码,以实现节点的相应功能。节点中包括的处理器和存储器可以通过芯片实现,此处不作具体的限定。
节点处可以安装有操作系统以及其他应用程序。例如,可以在节点处安装应用程序:容器Docker等。在Docker环境下,可以使用客户端-服务器(Client/Server,C/S)架构模式,接收来自客户的请求,例如,数据下载请求,并处理这些请求。
目前在数据中心环境下,集群中镜像文件的下载采用中心下载的方式,即集群中的全部节点需要从文件中心服务器获取镜像文件,在这种方式下随着同时下载的节点数量的增加,下载时延与下载的节点数量呈线性增长的关系,从而导致节点的下载时延较长。若要保证下载时延不随下载的节点数增加而增长,则需要增加服务器的资源。
在本申请的实施例中,节点既可以从文件服务器获取镜像文件,也可以从已经获取部分或全部镜像文件的其它节点处获取镜像文件。文件服务器可以向初始节点提供镜像文件的下载服务,在集群中的任何节点均可以作为初始节点,初始节点获取镜像文件后,可以作为该镜像文件的文件源,向其它的节点提供该镜像文件的下载服务。这些节点的下载源和下载节点的拓扑可以构成一个镜像文件下载的多叉树结构。
下面将结合具体的例子详细描述本申请的实施例。需要说明的是,这只是为了帮助本领域技术人员更好地理解本申请实施例,而非限制本申请实施例的范围。
应理解,在本申请的各实施例中,下载节点表示进行镜像文件下载的节点,在本申请实施例的关于镜像文件下载方案的描述中,“下载节点”和“节点”可以通用。
还应理解,在本申请的各实施例中,“第一”、“第二”“第三”等仅是为了指代不同的对象,并不表示对指代的对象有其它限定。
图2示出了本申请实施例的一种实现镜像文件下载的结构示意图。同时结合图2,简要描述了集群的多个节点中任一个节点通过查询服务器获取其上游下载节点的信息的过程。此外,在本申请实施例中,查询服务器可以是文件追踪FileTracker服务器。
需要说明的是,在本申请的实施例中,可以应用在Docker环境下,由于在Docker 的协议中可以支持顺序下载,因此集群中的镜像文件可以顺序下载。此处以2叉树结构示意图举例说明,该结构示意图为根据查询服务器确定上游下载节点后,形成的镜像文件的下载源和下载节点的一种实现方式的结构示意图。
图2中的集群包括文件服务器和多个节点,多个节点如图2中的第一节点、第二节点、第三节点、第四节点和第五节点。图2中的节点也可以是图1中的任意一个节点。
在本申请的实施例中,第一节点可以直接从文件服务器下载镜像文件,第一节点下载镜像文件后可以作为该镜像文件的文件源向集群中的其它节点(例如第二节点和第三节点)提供该镜像文件的下载服务。
在第二节点下载镜像文件后,第二节点又可以作为该镜像文件的文件源向集群中的其它节点(例如第四节点和第五节点)提供该镜像文件的下载服务,依次类推。
在本申请的实施例中,集群中的N个节点不需要均通过文件服务器进行镜像文件的下载,其可以通过上游下载节点进行镜像文件的下载。
在本申请的实施例中,查询服务器用于为每个下载节点提供上游下载节点的信息。下面结合图2,描述集群中多个节点中的任一个节点通过查询服务器获取其上游下载节点的信息的过程。
在S210中,第一节点向查询服务器发送查询请求,该查询请求用于查询用于下载镜像文件的上游下载节点的信息,即该查询请求用于查询第一节点的上游下载节点的信息。该查询请求可以包括第一节点处需要下载的镜像文件的文件名以及第一节点处已经下载的该镜像文件的大小。
在S220中,查询服务器向第一节点发送查询结果,该查询结果中包括第一节点的上游下载节点的信息。
例如,第一节点向查询服务器发送查询请求,该请求用于查询第一节点下载镜像文件1的上游下载节点的信息。
查询服务器接收到第一节点发送的查询请求后,向所述第一节点发送查询结果。例如,查询服务器未查询到任何节点处已经下载镜像文件1中的数据,则查询结果为第一节点需要从文件服务器中下载镜像文件1。当第一节点获取查询结果后,第一节点从文件服务器进行镜像文件1的下载。
同理,在S230中第三节点向查询服务器发送查询请求,该查询请求用于查询第三节点的上游下载节点的信息,该查询请求可以包括第三节点处需要下载的镜像文件的文件名以及第三节点处已经下载的该镜像文件的大小。
在S240中,查询服务器向第三节点发送查询结果,该查询结果包括第三节点的上游下载节点的信息。
例如,镜像文件1的大小为300MB,第三节点处已经下载了镜像文件1中100MB的数据,此时第三节点需要继续下载镜像文件1中剩余的200MB数据。查询服务器接收到第三节点的查询请求后,向第三节点发送查询结果。例如,查询结果为第一节点可以是其上游下载节点(假设此处的第一节点已经完成第三节点所需要的200MB数据的下载)。当第三节点收到查询结果后,第三节点从第一节点继续下载镜像文件1中剩余的200MB数据。
应理解,多个节点中的任一节点通过查询服务器查询其上游下载节点的信息的方式 可以包括但不限定为以上方式。
还应理解,多个节点中的任一节点进行镜像文件的下载时,可以是通过一个上游下载节点,经过一次下载后完成镜像文件的全部数据的下载,也可以是通过不同的上游下载节点经过几次下载后完成镜像文件全部数据的下载。
下面将结合图3,具体介绍集群中的任意一个节点通过查询服务器查询上游下载节点的信息,并从上游下载节点下载镜像文件的过程。
图3是根据本申请一个实施例的集群中镜像文件下载的方法的流程示意图。其中,第一下载节点可以是集群中N个节点中的任意一个节点,例如,可以是图1中的任意一个节点。
在本申请的实施例中,第一下载节点可以为集群中N个节点中的任一个节点,镜像文件的下载方法适用于包括文件服务器和N个节点的集群中,其中,N为大于1的正整数,文件服务器为该N个节点中的至少一个节点提供镜像文件的下载服务,该N个节点中下载了该镜像文件的至少一个节点为该N个节点中的至少一个节点提供该镜像文件的下载服务。
S310,N个节点中的第一下载节点接收查询服务器发送的第一上游下载节点的信息,其中,第一上游下载节点是基于均衡下载策略在下载源集合中确定的为该第一下载节点提供镜像文件的下载服务的节点,下载源集合包括该文件服务器和N个节点中下载了该镜像文件的至少一个节点。
应理解,在本申请的实施例中,第一上游下载节点是从下载了镜像文件的节点中选取的一个节点,该第一上游下载节点可以适用两种场景:
第一种场景为第一上游下载节点为集群全部节点中,下载了该镜像文件的节点和文件服务器中的一个节点。
第二种场景为第一上游下载节点为集群部分节点中,下载了该镜像文件的节点和文件服务器中的一个节点。本申请实施例对此不作限定。
在本申请的实施例中,N个节点中的第一下载节点在下载镜像文件时,从查询服务器获取第一上游下载节点的信息。
可选地,第一下载节点可以先向查询服务器发送第一上游下载节点的查询请求,查询服务器根据查询请求,向该第一下载节点发送第一上游下载节点的信息。
其中,查询请求用于查询用于下载镜像文件的上游下载节点的信息。即该查询请求用于查询该第一上游下载节点的信息。查询请求可以包括需要下载镜像文件的文件名以及第一下载节点处目前已经下载的该镜像文件的文件大小。
可选地,在本申请的实施例中,查询服务器也可以根据下载信息列表主动向下载节点发送相应的上游下载节点的信息,也就是说,本申请实施例并不限定上游下载节点的信息是基于请求而发送的。
在本申请的实施例中,查询服务器基于均衡下载策略和下载信息列表确定所述第一上游下载节点(具体确定过程参见后续描述)。所述下载信息列表包括所述镜像文件在所述N个节点的下载信息。
需要说明的是,下载信息列表包括所述镜像文件在所述N个节点的下载信息,当集群中的某一个节点没有下载该镜像文件时,下载信息列表中对应该节点的下载信息可以 为空的。即下载列表中对应下载了该镜像文件的节点的下载信息不为空,对应没有下载该镜像文件的节点的下载信息为空。
可选地,在本申请的一个实施例中,所述下载信息列表中的下载信息可以包括下载的镜像文件的文件名和文件大小。
查询服务器可以根据各个节点的下载信息维护下载信息列表,例如,以下表1所示的下载信息列表,并在下载信息列表中记录分配的上游下载节点的下游下载节点的个数。应理解,在本申请的实施例中,下游下载节点即从下载源集合中正在获取镜像文件的节点。其中,下载源集合包括文件服务器和集群的N个节点中下载了镜像文件的至少一个节点。
可选地,在本申请的一个实施例中,所述下载信息列表中的下载信息可以包括下载镜像文件的文件名、已下载的该镜像文件的文件大小、下载该镜像文件的下载时间和下载该镜像文件的上游下载节点。
除了下载镜像文件的文件名和已下载的该镜像文件的文件大小外,各节点的下载信息还可以包括下载该镜像文件的下载时间和下载该镜像文件的上游下载节点。这样,查询服务器可以生成更详细的下载信息列表,从而基于均衡下载策略确定上游下载节点,本申请实施例对此不作限定。
需要说明的是,下载信息列表可以为镜像文件在多个节点处的下载信息。对于下载不同的镜像文件,下载信息列表不同。下载信息列表中的下载信息记录的是下载了同一镜像文件的不同下载节点的信息。
例如,表1所示镜像文件1(File1)在各节点处具体的下载信息。表1为本申请实施例中一种镜像文件下载信息列表的实现方式,本申请实施例对此不作限定。
如表1所示,一个镜像文件的下载信息列表中可以包括下载镜像文件的文件名、下载该镜像文件的节点名称以及每个下载该镜像文件的节点处已经下载该镜像文件的大小和下载该镜像文件1的下游下载节点数量。
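Table 1: Download information list of image file 1 (File1)
File name | Node        | Downloaded size | Number of downstream download nodes
File1     | First node  | 1000 MB         | 2
File1     | Second node | 500 MB          | 2
File1     | Third node  | 800 MB          | 0
File1     | Fourth node | 400 MB          | 0
File1     | Fifth node  | 400 MB          | 0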
在表1中,下载镜像文件1(File1)的节点有第一节点、第二节点、第三节点、第四节点和第五节点,其中第一节点处已下载镜像文件1中的1000MB数据、第二节点处已下载镜像文件1的500MB数据、第三节点处已下载镜像文件1中的800MB数据、第四节点处已下载镜像文件1中的400MB数据、第五节点处已下载镜像文件1中的400MB数据。同时表1中还记录了下载镜像文件1的每个节点的下游下载节点个数,例如,第 一节点具有2个下游下载节点,第二节点具有2个下游下载节点,第三节点、第四节点和第五节点处没有下载镜像文件1的下游下载节点。在本申请的实施例中,查询服务器可以基于均衡下载策略确定该第一上游下载节点。均衡下载策略用于均衡集群中各个节点的下载负荷。该均衡下载策略可以是基于集群中每个节点的下载情况而设置的策略,例如,可以根据当前每个节点处镜像文件的下载情况进行设置该均衡下载策略,并且还可以动态地调节该均衡下载策略,从而避免集群中过多节点从一个节点处下载镜像文件,而造成下载瓶颈。
可选地,在本申请的一个实施例中,该均衡下载策略可以包括第一下载条件,即所述查询服务器可以根据第一下载条件实现集群中负载的均衡。其中,第一下载条件为第一上游下载节点的下游下载节点的个数小于预设阈值。
可选地,在本申请的一个实施例中,均衡下载策略还可以包括第二下载条件,即所述查询服务器可以根据第一下载条件和第二下载条件确定所述第一上游下载节点。其中,第一下载条件为该第一上游下载节点的下游下载节点的个数小于预设阈值;第二下载条件为所述第一下载节点已下载该镜像文件的大小小于该第一上游下载节点已下载该镜像文件的大小。
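In the technical solutions of the embodiments of this application, the first download condition of the balancing policy prevents too many downstream download nodes in the cluster from fetching the image file from the same upstream download node. The second download condition ensures that the upstream download node already holds the image file data that the downstream download node still needs to download.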
结合表1,例如,当第六节点需要下载镜像文件1中的数据时,假设在此之前第六节点处已经下载了镜像文件1中450MB的数据。此时,第六节点向查询服务器发送查询请求,该查询请求用于查询第六节点的上游下载节点的信息。若预设阈值为2,结合表1,查询服务器可以采用第一下载条件和第二下载条件,确定第六节点的上游下载节点。其中,第一下载条件为第一上游下载节点的下游下载节点的个数小于预设阈值,第二下载条件为第一下载节点已下载该镜像文件的大小小于第一上游下载节点已下载该镜像文件的大小。
根据第一下载条件和第二下载条件可知,由于第一节点和第二节点处下载该镜像文件的下游下载节点数量为2,不满足第一下载条件;由于第六节点处已经下载的镜像文件1的数据大小为450MB,大于第四节点和第五节点处已经下载的镜像文件1的数据大小400MB,不符合第二下载条件。因此,查询服务器结合表1根据第一下载条件和第二下载条件,确定第三节点可以为第六节点的上游下载节点。
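The selection in this example can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the names `DownloadRecord` and `select_upstream`, the tie-break rule, and returning `None` when no node qualifies (so that the caller can fall back to the file server) are assumptions made for the sketch. The data are the Table 1 values described in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DownloadRecord:
    node: str           # node name as listed in Table 1
    downloaded_mb: int  # size of File1 this node has already downloaded
    downstream: int     # number of nodes currently downloading from it


# Contents of Table 1 as described in the text.
TABLE_1 = [
    DownloadRecord("node1", 1000, 2),
    DownloadRecord("node2", 500, 2),
    DownloadRecord("node3", 800, 0),
    DownloadRecord("node4", 400, 0),
    DownloadRecord("node5", 400, 0),
]


def select_upstream(records: List[DownloadRecord],
                    requester_mb: int,
                    threshold: int) -> Optional[str]:
    """Pick an upstream node that satisfies both download conditions."""
    candidates = [
        r for r in records
        if r.downstream < threshold          # first download condition
        and requester_mb < r.downloaded_mb   # second download condition
    ]
    if not candidates:
        return None  # assumption: the caller then uses the file server
    # Tie-break (an assumption): least loaded first, then most data held.
    best = min(candidates, key=lambda r: (r.downstream, -r.downloaded_mb))
    return best.node


# Sixth node: 450 MB already downloaded, preset threshold 2.
print(select_upstream(TABLE_1, requester_mb=450, threshold=2))  # node3
```

With the Table 1 data, the first and second nodes fail the first condition and the fourth and fifth nodes fail the second, so only the third node remains.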
S320,第一下载节点从第一上游下载节点下载该镜像文件。
在本申请的实施例中,第一下载节点接收查询服务器发送的第一上游下载节点的信息,第一下载节点根据该第一上游下载节点的信息确定第一上游下载节点,从该第一上游下载节点获取镜像文件。
应理解,此处第一下载节点从第一上游下载节点下载镜像文件时,可以下载全部的镜像文件或者下载部分的镜像文件。
可选地,当第一下载节点在第一上游下载节点下载该镜像文件后,第一下载节点向该查询服务器发送第一下载信息。其中,第一下载信息为该第一下载节点下载该镜像文 件的下载信息。
应理解,当第一下载节点开始镜像文件的下载,但在没有完成本次镜像文件下载之前,它就可以为其它下游下载节点提供该镜像文件的下载服务。
例如,当第一下载节点下载了镜像文件中的10MB数据时,第一下载节点向查询服务器发送下载该镜像文件的第一下载信息。这样第一下载节点在下载该镜像文件的剩余数据的同时,若其它下载节点需要下载该镜像文件的这10MB数据,则查询服务器可以根据上述第一下载信息,确定第一下载节点可以为其他下载节点提供该镜像文件的这10MB数据的下载服务。
在本申请的实施例中,第一下载节点向查询服务器发送下载镜像文件的第一下载信息的时间不作具体的限定。
在本申请的实施例中,第一下载信息可以只包括该镜像文件的文件名和第一下载节点已下载的该镜像文件的文件大小;或者,只包括该镜像文件的文件名、第一下载节点已下载的该镜像文件的文件大小和第一下载节点下载该镜像文件的上游下载节点。本申请实施例对此并不限定。
查询服务器可以根据接收到的第一下载信息更新该查询服务器中的下载信息列表,该下载信息列表包括该镜像文件在多个节点的下载信息。例如,更新表1中的镜像文件1(File1)在已经下载镜像文件1的节点的下载信息。
在本申请的实施例中,当一个节点从该节点的上游下载节点下载镜像文件时,该节点可以选择在开始下载后的任意时刻将镜像文件的下载信息发送至查询服务器。查询服务器根据接收到的下载信息,更新该服务器中的下载信息列表。
在本申请的实施例中,查询服务器可以为集中式的查询服务器。该集中式的查询服务器可以为集群中的某个节点。
需要说明的是,集中式的查询服务器可以为复用集群N个节点中某些特定的节点,以提供集中式的查询服务器功能。
在本申请的实施例中,集中式的查询服务器采用主备模式。例如,主查询服务器负责监测全部的备查询服务器,并在备查询服务器宕机时,主查询服务器会对备查询服务器进行重启。
如果主查询服务器宕机了,集群中的一个备查询服务器会执行主查询服务器的工作。这种主/备(Master/Slave)架构方式下会有至少2台或者更多的服务器作为查询服务器,但是主备模式中一个时刻仅有一个主查询服务器进行工作。
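The master/standby behaviour described above can be sketched as follows. The classes `QueryServer` and `MasterStandbyGroup` are hypothetical names introduced for illustration, and the promotion rule (first live server in the configured order) is an assumption. The sketch does not cover how the download information list is replicated to the standby servers, which the text does not specify.

```python
class QueryServer:
    def __init__(self, name: str):
        self.name = name
        self.alive = True


class MasterStandbyGroup:
    """At most one server (the master) answers queries at any moment."""

    def __init__(self, servers):
        if len(servers) < 2:
            raise ValueError("master/standby mode needs at least 2 servers")
        self.servers = list(servers)
        self.master = self.servers[0]

    def active(self) -> QueryServer:
        """Return the working master, promoting a standby if it is down."""
        if not self.master.alive:
            for candidate in self.servers:
                if candidate.alive:
                    self.master = candidate  # standby takes over
                    break
            else:
                raise RuntimeError("no query server available")
        return self.master


group = MasterStandbyGroup([QueryServer("qs-a"), QueryServer("qs-b")])
print(group.active().name)       # qs-a
group.servers[0].alive = False   # simulate a crash of the master
print(group.active().name)       # qs-b
```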
对于集中式的查询服务器而言,由于集中式的查询服务器采用主备模式的服务,即有多个服务器随时准备着提供查询服务。因此,当主查询服务器出现宕机时,备查询服务器会提供查询服务。本申请实施例中的集中式的查询服务器与仅有一台查询服务器相比,至少会有2台查询服务器同时准备着提供查询服务。因此本申请实施例与仅有一个查询服务器相比,能够提供高可用的服务。但是集中式的查询服务器采取主备模式的服务时,也可能会存在一定的瓶颈。由于一个时刻仅能有一个查询服务器进行工作,故当集群中的多个节点在同一时刻从同一个查询服务器查询上游下载节点的信息时,可能会存在单点性能的瓶颈。
因此,对于本申请的实施例中,查询服务器也可以为分布式的查询服务器系统。
在本申请的实施例中,向第一下载节点发送第一上游下载节点的信息的查询服务器,为分布式的查询服务器系统中的一个查询服务器。
作为一种可选的实现方式,分布式的查询服务器系统可以是复用集群中N个节点中的部分或全部形成的系统。在该分布式的查询服务器系统中具有多个查询服务器,多个查询服务器以分布式的方式提供服务。
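It should be understood that, in the embodiments of this application, a query server may also be a standalone server, that is, it need not reuse a node in the cluster; this is not specifically limited here. The distributed query server system may also be a combination of standalone servers and servers that reuse nodes in the cluster.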
应理解,对于分布式的查询服务器系统,不同的镜像文件可以按照分布式的方式对应不同的查询服务器,这样查询服务器功能就被分布到了不同的查询服务器上进行执行,从而不会产生单点性能瓶颈的问题。
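In the embodiments of this application, the cluster further includes a storage server. The storage server provides list information to the nodes, and a node uses the list information to determine the query server for the image file within the distributed query server system. The first download node among the N nodes sends a registration request to the storage server. The request includes the IP address and the registration port number of the first download node, which are then stored in the list information on the storage server.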
也就是说,存储服务器上的列表信息是根据N个节点中的每一个节点向存储服务器发送注册请求后,存储服务器获取注册请求中携带的下载节点的IP地址以及注册端口号的信息,从而形成的N个节点的列表信息。
对于分布式的查询服务器系统复用集群中N个节点的情况,在存储服务器中包括N个节点的列表信息,集群中的N个节点均会向存储服务器发送注册请求,该注册请求包括该第一下载节点的IP地址以及注册端口号。存储服务器根据N个节点的注册请求形成N个节点的列表信息。类似的,对于分布式的查询服务器系统复用集群中部分节点的情况,存储服务器中包括该部分节点的列表信息。例如,集群中包括100个节点,存储服务器中的列表信息可以为全部节点(例如,100个节点)的列表信息,也可以为集群中部分节点(例如,50个节点或80个节点)的列表信息。
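The registration step can be illustrated with an in-memory stand-in for the storage server. This is only a sketch under stated assumptions: `StorageServer`, `register` and `list_info` are hypothetical names, the IP addresses and port are made-up examples, and no real storage client API is used.

```python
from typing import Dict, List, Tuple


class StorageServer:
    """In-memory stand-in that keeps the list information of the nodes."""

    def __init__(self) -> None:
        self._nodes: Dict[str, Tuple[str, int]] = {}

    def register(self, node: str, ip: str, port: int) -> None:
        # A registration request carries the node's IP address and
        # registration port number; registering again overwrites the entry.
        self._nodes[node] = (ip, port)

    def list_info(self) -> List[Tuple[str, str, int]]:
        # Sorted so that every node derives the same view of the list.
        return sorted((n, ip, port) for n, (ip, port) in self._nodes.items())


storage = StorageServer()
storage.register("node1", "10.0.0.1", 7000)
storage.register("node2", "10.0.0.2", 7000)
storage.register("node1", "10.0.0.1", 7000)  # duplicate registration
print(storage.list_info())
```

Returning the list in a fixed order is a design choice of the sketch: every node that later hashes the list should see identical input.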
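In the embodiments of this application, the storage server is, for example, ETCD (a highly-available key value store for shared configuration and service discovery), that is, a distributed, consistent key-value (KV) storage system used for shared configuration and service discovery.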
ETCD的工作原理为使用分布式强一致性日志Raft协议,来维护集群内各个节点状态的一致性。简单说,ETCD集群是一个分布式系统,由多个节点相互通信构成整体对外进行服务,每个节点都存储了完整的数据,并且通过Raft协议保证每个节点维护的数据是一致的。
在本申请的实施例中,第一下载节点可以从分布式的查询服务器系统中确定查询第一上游下载节点信息的查询服务器,包括:
第一下载节点从存储服务器中获取列表信息;
第一下载节点根据列表信息采用一致性哈希算法,从分布式的查询服务器系统中确定查询服务器。
第一下载节点首先从存储服务器中获取列表信息,第一下载节点采用哈希算法计算该镜像文件的文件名的哈希值和列表信息中N个节点的哈希值。通过比较哈希值的大小,从分布式的查询服务器系统中确定查询服务器。例如,通过比较该镜像文件的文件名的哈希值和N个节点的哈希值,确定与该镜像文件的文件名的哈希值最接近的哈希值所对应的节点,该节点作为该镜像文件的查询服务器。然后,第一下载节点向确定的查询服务器发送该镜像文件的查询请求,若第一下载节点为集群中第一个下载该镜像文件的节点,查询服务器此时第一次接收到该镜像文件的查询请求,查询服务器根据第一次接收到的该镜像文件的查询请求确定该查询服务器作为该镜像文件的查询服务器。
可选地,在本申请的实施例中,若第一下载节点为集群中第一个下载该镜像文件的节点时,查询服务器中没有该镜像文件的下载信息列表,则查询服务器确定文件服务器为该第一下载节点的上游下载节点,即第一下载节点需要从文件服务器获取该镜像文件。当第一下载节点完成本次镜像文件的下载后,会向该查询服务器发送下载该镜像文件的第一下载信息,则查询服务器中生成下载该镜像文件的下载信息列表。因此,查询服务器第一次接收到该镜像文件的查询请求时,即作为该镜像文件的查询服务器。
可选地,在本申请的实施例中,若第一下载节点不是集群中第一个下载该镜像文件的节点时,集群中已经有节点下载过该镜像文件。此时,该查询服务器中已经有下载该镜像文件的下载信息列表。
需要说明的是,此处分布式的查询服务器系统为复用集群中的N个节点,信息列表中存储的信息为各个节点的信息,因此计算出N个节点的哈希值,例如节点名称的哈希值,即得到分布式的查询服务器系统中的各个查询服务器的哈希值。
第一下载节点根据列表信息,采用一致性哈希算法从分布式的查询服务器系统中确定该镜像文件的查询服务器。(具体步骤在下面图5中进行描述)。
应理解,在本申请的实施例中,集群中的任意一个节点(例如,第一节点)的功能可能包括以下三种情况,本申请对此不作限定:第一种可能:第一节点仅作为第一镜像文件的下载节点,可以具有上述集群中镜像文件下载的方法中的第一下载节点的功能。
第二种可能:第一节点仅作为查询下载第一镜像文件的上游下载节点信息的查询服务器,可以具有上述集群中镜像文件下载的方法中的查询服务器的功能。
第三种可能:第一节点既作为第一镜像文件的下载节点,又作为N个节点中其它节点(例如第二下载节点)的查询服务器。
本申请实施例中,以第一下载节点下载的为第一镜像文件,第二下载节点下载的为第二镜像文件为例进行说明。不同镜像文件可以分别对应不同的查询服务器。
对于第三种可能,第一节点既作为第一镜像文件的下载节点,又作为N个节点中其它节点(例如第二下载节点)的查询服务器,例如,第二下载节点处下载第二镜像文件时,第二下载节点计算第二镜像文件的文件名的哈希值和列表信息中N个节点的哈希值。通过比较哈希值的大小,例如,通过比较该第二镜像文件的文件名的哈希值和N个节点的哈希值,确定与第二镜像文件的文件名的哈希值最接近的哈希值所对应的第一节点,作为第二镜像文件的查询服务器。第一节点作为查询服务器时还包括查询服务器的功能。
可选地,在本申请的一个实施例中,第一节点接收到第二下载节点发送的查询请求, 该查询请求用于查询下载第二镜像文件的上游下载节点的信息。第一节点作为查询服务器确定能够为该第二下载节点提供该第二镜像文件下载的第二上游下载节点。其中,该第二上游下载节点是下载源集合中为该第二下载节点提供该第二镜像文件的下载服务的节点,该下载源集合包括该文件服务器和N个节点中下载了该第二镜像文件的至少一个节点。第一节点在确定该第二上游下载节点后,向该第二下载节点发送该第二上游下载节点的信息。
可选地,在本申请的一个实施例中,第一节点作为查询服务器可以根据第二镜像文件的下载信息列表确定第二上游下载节点,该下载信息列表包括第二镜像文件在该N个节点的下载信息。
在本申请的实施例中,第一节点作为查询服务器可以基于所述均衡下载策略和下载信息列表确定第二上游下载节点。
可选地,在本申请的一个实施例中,该均衡下载策略可以包括第一下载条件,第一节点可以根据第一下载条件确定该第二上游下载节点,第一下载条件为该第二上游下载节点的下游下载节点的个数小于预设阈值。
可选地,在本申请的一个实施例中,该均衡下载策略还可以包括第二下载条件,第一节点可以根据第一下载条件和第二下载条件确定该第二上游下载节点,其中,该第一下载条件为该第二上游下载节点的下游下载节点的个数小于预设阈值;第二下载条件为该第二下载节点已下载的第二镜像文件的大小小于该第二上游下载节点已下载该第二镜像文件的大小。
可选地,在本申请的一个实施例中,该第二下载节点向查询服务器发送的下载信息,可以包括下载第二镜像文件的文件名和已下载的该第二镜像文件的文件大小。
查询服务器可以根据各个节点的下载信息维护下载信息列表,例如,上述表1所示的下载信息列表,并进一步基于均衡下载策略和下载信息列表确定上游下载节点,并在下载信息列表记录分配的上游下载节点的下游节点个数。在本申请的一个实施例中,该下载信息可以包括下载第二镜像文件的文件名和已下载的该第二镜像文件的文件大小。
除了该第二镜像文件的文件名和已下载的该第二镜像文件的文件大小外,各节点的下载信息还可以包括下载该第二镜像文件的下载时间和下载该第二镜像文件的上游下载节点。这样,查询服务器(此处为第一节点)可以根据生成更详细的下载信息列表基于均衡下载策略确定上游下载节点,本申请实施例对此不作限定。
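It should be noted that once the second download node has started downloading the second image file, it can send the second download information to the first node before the download of the second image file is complete. The first node updates its download information list of the second image file according to the second download information.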
可选地,在本申请的实施例中,第二下载节点从第二上游下载节点下载第二镜像文件时,可以下载全部第二镜像文件或者也可以下载部分第二镜像文件。
在本申请的实施例中,第一下载节点下载的第一镜像文件的信息为第一下载信息,第二下载信息与此相类似,主要区别在于,第二下载信息为第二下载节点中下载的第二镜像文件的信息。
需要说明的是,在本申请实施例的上述方法可以通过在节点的数据下载服务(Data Download Service,DDS)来实现。具体的,可以通过DDS中的服务管理DDS Service  Handle模块来实现上述实施例中的第一下载节点的功能。通过DDS中的文件追踪FileTracker模块来实现上述实施例中的查询服务器的功能。若节点既作为下载节点,又为其它节点提供查询服务器的功能,则该节点可以通过DDS中的DDS Service Handle和FileTracker模块分别实现上述实施例中的下载节点的功能和查询服务器的功能(下面在图6中进行具体描述)。
在本申请的实施例中,提供了一种集群中镜像文件下载的方法,通过查询服务器查找各个节点的上游下载节点的信息。集群中镜像文件的下载,由采用中心下载方式,即全部节点均在文件服务器中下载镜像文件,变为基于文件服务器和其它节点提供文件源的镜像文件下载方式,能够降低镜像文件下载的时间复杂度。例如,可以使得集群中镜像文件下载的时间复杂度由O(N)(表示与N为正比关系)下降到了O(LogN)(表示与N为对数关系)。例如,若集群中有100节点则集群的下载时间由与100K成正比变成与Log100K成正比,从而减少下载时间提高了集群中的下载效率。
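The drop from O(N) to O(LogN) can be made concrete by counting levels of the download tree. The sketch below is an idealised model, not an analysis taken from the application: it assumes that every download source serves at most `fan_out` downstream nodes and that the tree below the file server is filled level by level. The function name `tree_depth` is hypothetical.

```python
def tree_depth(n_nodes: int, fan_out: int) -> int:
    """Smallest depth d with fan_out + fan_out**2 + ... + fan_out**d >= n_nodes."""
    if n_nodes < 1 or fan_out < 1:
        raise ValueError("n_nodes and fan_out must be positive")
    depth = 0
    capacity = 0     # nodes that fit in the levels counted so far
    level_width = 1  # the file server alone forms level 0
    while capacity < n_nodes:
        level_width *= fan_out
        capacity += level_width
        depth += 1
    return depth


print(tree_depth(100, 2))   # 6
print(tree_depth(1000, 2))  # 9
```

Under these assumptions, a tenfold increase in cluster size adds only three levels to a binary tree, whereas with central download every added node competes for the same file server.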
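The method for downloading an image file in a cluster according to this application was validated in a cluster of 1000 nodes. In the validation, the 1000 nodes simultaneously downloaded a 180 MB image of Nginx, a high-performance HTTP and reverse proxy server, and the download completion time was recorded to measure the acceleration achieved by this solution. Without the method, the download took 95 minutes (5700 seconds) to complete; with the method, it took only 50 seconds, an acceleration reported as more than 120 times.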
图4是本申请一个实施例的集群中镜像文件下载的方法的交互性流程图。
图4中的第一节点、第二节点可以为集群中N个节点中的任意两个节点。例如,可以是图1中的任意一个节点,查询服务器可以为集中式的查询服务器或者为分布式的查询服务器系统中的查询服务器。
应理解,第一节点也可以称为第一下载节点,第二节点也可以称为第二下载节点。
在本申请的实施例中,例如,镜像文件1的数据大小为500MB,在本次下载之前第一节点和第二节点处已经下载镜像文件1中的数据的情况如表2所示,其中第一节点处已经下载镜像文件1中的100MB数据(此处100MB镜像文件1中的数据可以为上一次下载镜像文件1的数据),第二节点处还未下载镜像文件1。
S410,第一节点、第二节点分别向查询服务器发送下载镜像文件1的查询请求。
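Because both the first node and the second node need to download image file 1, they send their query requests to the same query server. In the embodiments of this application, the query request sent by the first node may include the information that the first node has already downloaded 100 MB of image file 1, and the first node queries the query server for the information of the first upstream download node. The query request sent by the second node includes the information that the second node has not yet downloaded any data of image file 1, and the second node queries the query server for the information of the second upstream download node.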
S420,查询服务器根据查询请求分别向第一节点、第二节点发送查询结果。
在本申请的实施例中,查询服务器向第一节点发送查询结果为第一上游下载节点为文件服务器,向第二节点发送查询结果为第二上游下载节点为第一节点。
S430,根据S420的查询结果,第一节点从文件服务器获取镜像文件1以及第二节点从第一节点处获取镜像文件1。
S440,第一节点以及第二节点分别向查询服务器发送镜像文件1的下载信息。
例如,第二节点可以在从第一节点处获取镜像文件1中的100MB数据后,向查询 服务器发送镜像文件1的下载信息。第一节点可以在从文件服务器获取镜像文件1中的500MB数据后,向查询服务器发送镜像文件1的下载信息。当查询服务器接收第一节点和第二节点发送的镜像文件1的下载信息后,更新镜像文件1的下载信息列表。更新后的镜像文件1的下载信息列表如表3所示。需要说明的是,第二节点、第一节点还可以在其它时机向查询服务器发送镜像文件1的下载信息。以第二节点为例,第二节点可以在下载镜像文件1小于100MB的数据时,向查询服务器发送镜像文件1的下载信息,或者第二节点可以在下载完镜像文件1的100MB数据后继续从第一节点下载镜像文件1中的数据,并在下载完镜像文件1的500MB数据后,向查询服务器发送镜像文件1的下载信息。应理解,此处仅为举例,并不对本申请的实施例作限定。
下载信息列表更新后,其它节点下载该镜像文件1时,查询服务器会根据更新后的下载信息列表,选择合适的上游下载节点为其它节点提供该镜像文件1的下载服务。
表2
Figure PCTCN2018121070-appb-000002
表3
Figure PCTCN2018121070-appb-000003
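The update of the download information list from the state in Table 2 to the state in Table 3 can be sketched as follows, using the sizes given in the text (first node: 100 MB before and 500 MB after; second node: nothing before and 100 MB after). The function name `apply_report` and the rule that a reported size never decreases are assumptions made for illustration.

```python
from typing import Dict


def apply_report(download_list: Dict[str, int],
                 node: str,
                 downloaded_mb: int) -> Dict[str, int]:
    """Record a node's latest reported downloaded size for one image file."""
    previous = download_list.get(node, 0)
    # Assumption: sizes only grow, so an out-of-order report is ignored.
    download_list[node] = max(previous, downloaded_mb)
    return download_list


# State before this download round; the second node has no entry yet.
file1_list = {"node1": 100}

apply_report(file1_list, "node2", 100)  # second node fetched 100 MB
apply_report(file1_list, "node1", 500)  # first node fetched the full 500 MB
print(file1_list)  # {'node1': 500, 'node2': 100}
```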
应理解,第一节点、第二节点向查询服务器发送下载信息可以是在第一节点、第二节点下载镜像文件1结束时发送,也可以是与下载镜像文件1同时进行。第一节点、第二节点下载的部分镜像文件1的数据能够作为文件源为其它节点提供镜像文件1的下载。此处为本申请的实施例,不作具体限定。
本申请的实施例中,提供了一种在集群中镜像文件下载的方法,具体地通过查询服务器查找各个节点的上游下载节点的信息,使得集群中的镜像文件的下载由采用中心下载方式,即全部节点均在文件服务器中下载镜像文件,变为基于文件服务器和其它节点提供的文件源的镜像文件下载,使得集群中镜像文件下载的时间复杂度由O(N)下降到了O(LogN),从而降低了集群环境中下载服务资源的占用,提高了集群中的下载效率。
图5是根据本申请一个实施例的哈希环的示意图。
应理解,图5仅是示例,不构成对本申请实施例的限定。根据图3中的描述可知,第一下载节点根据列表信息,采用一致性哈希算法确定用于查询第一上游下载节点的查 询服务器。
如图5所示,一个分布式的查询服务器系统由N个节点构成,此处分布式的查询服务器系统为复用集群中的N个节点。集群中还包括存储服务器,在存储服务器中包括N个节点的列表信息,第一下载节点根据列表信息采用一致性哈希算法分别计算出每个节点的哈希值,根据镜像文件的文件名的哈希值以及每个节点的哈希值进行比较,确定分布式的查询服务器系统中用于查询第一上游下载节点的查询服务器。
需要说明的是,文件名的哈希值可以为根据文件名称或者文件编号采用一致性哈希算法得到的哈希值。节点的哈希值可以为根据节点的编号采用一致性哈希算法得到的哈希值。服务器名称哈希值可以为根据服务器名称编号采用一致性哈希算法得到的哈希值。在本申请的实施例中,分布式的查询服务器系统复用集群中的N个节点,即每个节点的哈希值就为每个复用该节点的查询服务器的哈希值。
下面将举例说明一致性哈希算法。
应理解,本申请实施例中采用的哈希算法得到的哈希信息可以是表格的形式,还可以例如为哈希环的形式,在下文中仅以哈希信息为哈希环的形式进行说明。下文中的哈希环可以等同替换为哈希信息,本申请的实施例对此不作限定。
通常一致性哈希算法,作为分布式计算的数据分配参考,与传统的取模,划段相比具有一定的优势。
传统的取模方式:
例如有10个数据,3个节点,如果按照取模的方式,即:
Node a:0,3,6,9;
Node b:1,4,7;
Node c:2,5,8;
当增加一个节点时,数据分布就变更为:
Node a:0,4,8;
Node b:1,5,9;
Node c:2,6;
Node d:3,7;
根据传统取模的方式可知,当集群中增加一个节点Node d时,数据3,4,5,6,7,8,9的分布都需要做迁移。当集群中存在更多的数据时,增加一个节点就会有更大的数据迁移,从而导致工作量以及成本过高。
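The migration cost of the modulo scheme can be checked with a few lines. The sketch assumes the data items are the integers 0 to 9 and that node a, b, c, d correspond to indices 0, 1, 2, 3; `assign_by_modulo` is a hypothetical helper name.

```python
def assign_by_modulo(keys, node_count):
    """Map each key to a node index using key mod node_count."""
    return {k: k % node_count for k in keys}


keys = range(10)
before = assign_by_modulo(keys, 3)  # Node a, b, c
after = assign_by_modulo(keys, 4)   # Node a, b, c, d
moved = [k for k in keys if before[k] != after[k]]
print(moved)  # [3, 4, 5, 6, 7, 8, 9]
```

Seven of the ten items change nodes after a single node is added, which matches the list given in the text.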
采用一致性哈希算法,对节点和数据都只需要做一次哈希运算,然后通过比较节点和数据的哈希值,确定数据和节点哈希值最相近的节点作为存放节点。由此可以保证当节点增加或者减少时,产生影响的数据最少。
例如,依旧以有10个数据,3个节点为例,首先分别计算出10个数据的哈希值:
0:192;1:196;2:200;3:204;4:208;5:212;6:216;7:220;8:224;9:228;
再分别对于三个节点,计算出每个节点的哈希值:
Node a:203;
Node g:209;
Node z:228。
此时,比较两者哈希值的大小,如果数据的哈希值大于228,则数据存放到前面哈希值为203的节点,相当于整个哈希值构成了一个哈希环。对应的映射结果:
Node a:0,1,2;
Node g:3,4;
Node z:5,6,7,8,9。
此时,当集群中加入节点Node n时,只需要先计算出Node n的哈希值,例如:Node n:216,这时相应的数据就会做迁移。如,
Node a:0,1,2;
Node g:3,4;
Node n:5,6;
Node z:7,8,9。
通过采用一致性哈希算法,若集群中增加一个节点,例如,上述集群中增加一个节点Node n,采用一致性哈希算法之后则10个数据中可以只需要把数据5和6进行迁移,其它数据可以保持原有数据分布节点。
因此,与传统取模的方式相比,通过一致性哈希算法的数据分布,在节点数量增加或者减少时,其数据的迁移规模相对较小。
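The worked example above can be reproduced with a small hash ring. The sketch uses the hash values given in the text rather than a real hash function, and it places each data item on the first node whose hash is greater than or equal to the data hash, wrapping around to the smallest node hash, which is the rule the example mapping follows. The function name `place` is hypothetical.

```python
from bisect import bisect_left

# Hash values of the 10 data items, as listed in the example.
DATA_HASH = {0: 192, 1: 196, 2: 200, 3: 204, 4: 208,
             5: 212, 6: 216, 7: 220, 8: 224, 9: 228}


def place(data_hash, node_hash):
    """Assign each data item to the next node clockwise on the ring."""
    ring = sorted((h, name) for name, h in node_hash.items())
    points = [h for h, _ in ring]
    placement = {}
    for key, h in data_hash.items():
        i = bisect_left(points, h)               # first node hash >= data hash
        placement[key] = ring[i % len(ring)][1]  # wrap past the largest hash
    return placement


before = place(DATA_HASH, {"a": 203, "g": 209, "z": 228})
after = place(DATA_HASH, {"a": 203, "g": 209, "z": 228, "n": 216})
moved = [k for k in DATA_HASH if before[k] != after[k]]
print(moved)  # [5, 6]
```

Only data items 5 and 6 move to the new Node n; every other item stays on its original node.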
哈希环是指分布式存储系统中对物理节点中的数据块的编号进行哈希计算,获得数据块的哈希值,并以该哈希值进行排序获得的一个环状逻辑结构。
应理解,本申请实施例中的标识可以为版本号,也就是说,哈希信息的标识可以为哈希信息的版本号。例如,哈希信息的标识可以为数字,例如N个哈希信息的标识可以依次为1、2、3…N。标识还可以为其他形式,例如,可以英文字母等,本申请实施例并不对此做限定。在下文中,仅以标识为节点编号举例进行相关说明,下文中的节点编号可以等同替换为标识。但本申请实施例并不限于此。
应注意,本申请实施例中的节点可以是物理节点,也可以是虚拟节点。本申请实施例并不对此作限定。在本申请的实施例中,如图5所示,第一下载节点需要从分布式的查询服务器系统中确定查询下载镜像文件1的第一上游下载节点的信息。与上述方法类似,可以把镜像文件看作数据,把查询服务器看作存放节点,通过比较存放节点和数据的哈希值,确定数据和节点哈希值最相近的节点作为存放节点。即通过比较镜像文件和查询服务器的哈希值,例如,通过比较该镜像文件的文件名的哈希值和N个节点的哈希值,确定与该镜像文件的文件名的哈希值最接近的哈希值所对应的节点,该节点作为该镜像文件的查询服务器。即确定向第一节点发送第一上游下载节点的信息的查询服务器。
在本申请的实施例中,第一下载节点首先需要从分布式的查询服务器系统中,确定用于查询第一上游下载节点的信息的查询服务器,进而从该查询服务器中查询第一上游下载节点的信息。
例如,首先第一下载节点根据一致性哈希算法,分别计算出每个节点的哈希值以及镜像文件1的哈希值,然后将镜像文件1的哈希值与每个节点的哈希值进行比较,从而找到哈希值最接近的节点。
由于N个节点构成分布式的查询服务器系统,因此复用该节点的查询服务器即为用于查询第一上游下载节点的信息的查询服务器。
例如,如图5所示,下载的镜像文件1的哈希值为18,节点1的哈希值为10、节点2的哈希值为15、节点3的值为22、节点4的哈希值为33,节点5的哈希值为50,节点6的哈希值为60,节点7的哈希值为70。通过比较哈希值的大小,与镜像文件1的哈希值18最接近的为节点2的哈希值15,因此最终确定节点2作为镜像文件1的查询服务器。
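The selection in this example can be sketched as a nearest-hash lookup. The sketch takes the hash values from the example as given, compares plain numeric distance without wrap-around, and breaks ties in favour of the smaller node hash; the tie-break and the function name `pick_query_server` are assumptions, since the text does not address equal distances.

```python
def pick_query_server(file_hash, node_hashes):
    """Return the node whose hash is numerically closest to the file hash."""
    return min(
        node_hashes,
        key=lambda node: (abs(node_hashes[node] - file_hash), node_hashes[node]),
    )


# Hash values from the Figure 5 example.
NODE_HASHES = {"node1": 10, "node2": 15, "node3": 22, "node4": 33,
               "node5": 50, "node6": 60, "node7": 70}

print(pick_query_server(18, NODE_HASHES))  # node2
```

Because every node computes the same hashes from the same list information, all nodes that download image file 1 arrive at the same query server without coordinating with each other.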
本申请的实施例中,提供了一种在集群中镜像文件下载的方法,具体地通过查询服务器查找各个节点的上游下载节点信息,使得集群中的镜像文件的下载由采用中心下载方式,即全部节点均在文件服务器中下载镜像文件的下载方式,变为基于文件服务器和其它节点提供的文件源的镜像文件的下载方式,其中分布式的查询服务器系统中不同的镜像文件可以具有不同的查询服务器,确保了查询服务器功能被分布到多个查询服务器上执行,不存在单点性能的瓶颈问题。
应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
图6是根据本申请一个实施例的集群中镜像文件下载的一种实现方式示意框图。
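In the embodiments of this application, the above method for downloading an image file in a cluster can be used in a Docker environment. In a Docker environment, the Docker Daemon, the main component of the Docker architecture, runs as a background service on every Docker node, and users communicate with the Docker Daemon through the Docker Client. Each node downloads files through the Docker Daemon from the Docker Registry Server using the Docker Registry Protocol.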
应用本申请实施例可以不需要对Docker Daemon的源代码进行修改,能够通过一种非侵入式的集成方式。非侵入式的集成方法指用户代码并不需要过多的依赖框架,当重构代码设计时,之前的代码仍然可以运用。因此,非侵入式的集成方法与侵入式的集成方法相比相对成本较小,同时对于源代码的利用率较高。
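The method for downloading an image file in a cluster according to this application can be integrated into the Docker download environment non-intrusively through the architecture shown in Figure 6. That is, integrating the method with a Docker system only requires a configuration change, for example changing the Docker Registry API Endpoint configuration, and does not require modifying the source code of any Docker component.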
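In the embodiments of this application, a Data Download Service (DDS) file may be installed on each node 610. The Docker Daemon on a download node receives data download requests sent by the client and then forwards them to the DDS for processing.
Specifically, the Docker Daemon may use the Docker Registry API Endpoint provided by the DDS service. Through this endpoint, the DDS service emulates the Docker Registry Server and provides Docker Registry Server services to the Docker Daemon.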
DDS可以包括4个模块:Docker Registry Proxy(注册代理)、DDS Service Handle(服务管理)、File Tracker(文件追踪)和Node Manager(节点管理)。
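Docker Registry Proxy: the Docker Registry Proxy provides the image file download service to the Docker Daemon through the Docker Registry API Endpoint. When the Docker Registry Proxy receives a request from the Docker Daemon through the Docker Registry API Endpoint, it checks the request type. If the request is not a download request for a Docker image file, for example a metadata-related request, the Docker Registry Proxy forwards it to the Docker Registry Server 620, which handles it. If the request is a download request for a Docker image file, the Docker Registry Proxy calls the DDS Service Handler to provide the image file download service.
It should be noted that, in the embodiments of this application, for example, the Docker Daemon in node 610 in Figure 6 may receive a data download request sent by the client and pass it to the Docker Registry Proxy. The Docker Registry Proxy filters the data download requests and, if a request is an image file download request, sends it to the DDS Service Handler.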
应理解,Docker Registry Proxy的主要功能在于将Docker Daemon接收到的客户端的数据下载请求进行筛选。若该客户端的数据下载请求为镜像文件下载请求,则将该镜像文件下载请求发送至DDS Service Handler。若为其它数据的下载请求,则将客户端的请求发送至文件服务器进行数据下载处理,Docker Registry Server(注册服务器)可以为文件服务器。
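DDS Service Handler: the DDS Service Handler module implements the function, described in the embodiments of this application, by which the first download node determines its upstream download node and downloads the image file from that upstream download node.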
File Tracker:查询服务器的功能通过节点上的File Tracker模块来实现,查询服务器可以为复用节点提供查询服务器的功能。
Node Manager:Node Manager提供了基于ETCD的节点注册服务。通过这个服务,集群中每个节点将本节点的IP地址和端口注册到ETCD这样的集中存储服务器630上。当需要下载特定文件时,在分布式查询服务器系统中节点可以基于这个完整列表信息使用一致性哈希算法计算出其中某个节点提供查询服务的查询服务器。
应理解,Node Manager提供了基于ETCD的节点注册服务,对于集中式的查询服务器,Node Manager的功能是关闭的。集群中的节点都通过集中式的查询服务器来查询上游下载节点的信息。
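The request filtering performed by the Docker Registry Proxy, as described in the module list above, can be sketched as a routing function. This is an illustration under assumptions: it treats a GET on a blob path of the Docker Registry HTTP API V2 (`/v2/<name>/blobs/<digest>`) as the "image file download request" and everything else, such as manifest requests, as metadata. The text does not state which requests count as image file downloads, and the names `route`, `DDS_SERVICE_HANDLER` and `REGISTRY_SERVER` are hypothetical.

```python
import re

DDS_SERVICE_HANDLER = "dds_service_handler"
REGISTRY_SERVER = "registry_server"

# Blob path of the Registry HTTP API V2: /v2/<name>/blobs/<algorithm>:<hex>
_BLOB_PATH = re.compile(r"^/v2/.+/blobs/[A-Za-z0-9_+.-]+:[0-9a-fA-F]+$")


def route(method: str, path: str) -> str:
    """Decide which component handles a request from the Docker Daemon."""
    if method.upper() == "GET" and _BLOB_PATH.match(path):
        return DDS_SERVICE_HANDLER  # image data: download through DDS
    return REGISTRY_SERVER          # metadata and other requests: forward


layer = "/v2/library/nginx/blobs/sha256:" + "ab" * 32
print(route("GET", layer))                                 # dds_service_handler
print(route("GET", "/v2/library/nginx/manifests/latest"))  # registry_server
```

Keeping the decision in one pure function makes the proxy easy to test without starting a Docker Daemon or a registry.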
通过上述本申请的实施例,为Docker环境提供了一个非侵入式的镜像文件下载的方法。
应理解,非侵入式的集成意味Docker Daemon源代码不需要修改,仅需要修改配置。Docker Daemon需要访问Docker Registry Server服务器下载镜像文件,Docker Registry Server服务器的地址是配置在Docker Daemon的参数中的。在本申请的实施例中DDS相当于是一个模拟的Docker Registry Server服务器,然后修改Docker Daemon的参数让它访问本申请中模拟的服务器,最终通过DDS文件来处理数据请求。因此不需要修改Docker Daemon源代码,提供了一种Docker环境下非侵入式的集成方式。
需要说明的是,例如,第一下载节点和第一上游下载节点的功能可以通过节点上安装的DDS文件中的DDS Service Handler模块实现,查询服务器的功能可以通过DDS文件中的File Tracker模块实现。
在本申请的实施例中,集群中镜像文件下载的方法也可以在自动化容器操作的开源 平台Kubernetes集群中实现。
本申请的实施例用于集群环境的镜像文件的下载,例如,Docker中的镜像文件下载,它还可以用于集群环境中多个节点下载其他特定的同一镜像文件,例如虚拟机镜像下载,大数据运行时刻Runtime下载,本申请的实施例对此不作限定。
在本申请的实施例中,将集群中镜像文件下载的方法应用于Docker环境下,不仅解决了目前数据中心环境下的镜像文件下载采用中心下载的方式而存在的单点故障与单点性能瓶颈问题,同时本申请为Docker环境提供了一种非侵入式的集成方式,不会影响现有Docker开源系统,从而提高了Docker系统的利用率,减少了成本。
应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
上文详细描述了根据本申请实施例的集群中镜像文件下载的方法,在本申请中集群环境下的镜像文件的下载方法,下载的时间复杂度由O(N)下降到了O(LogN),通过查询服务器来追踪集群中各个节点已下载的文件信息列表,并根据其它节点的请求基于一定的选择策略来选择其上游下载节点。应理解,本申请实施例的节点、查询服务器可以执行前述本申请实施例的各种方法,即以下各种产品的具体工作过程,可以参考前述方法实施例中的对应过程。
图7示出了根据本申请实施例的节点700的示意性框图(图7中的节点可以是图1中的任意一个节点)。该节点700可以对应于各方法实施例中N个节点中的任意一个节点,可以具有方法中的节点的任意功能。
如图7中的节点700为集群中的节点,该集群包括文件服务器和N个节点,该文件服务器为该N个节点中的至少一个节点提供该镜像文件的下载服务,该N个节点中下载了该镜像文件的至少一个节点为该N个节点中的至少一个节点提供该镜像文件的下载服务,其中,N为大于1的正整数。
如图7所示,该节点700包括服务端模块710和数据下载服务模块720。
服务端模块710,用于向该数据下载服务模块发送该镜像文件的下载请求。
数据下载服务模块720,用于根据该镜像文件的下载请求,从查询服务器获取第一上游下载节点的信息,其中,该第一上游下载节点是基于均衡下载策略在下载源集合中确定的为该节点700提供该镜像文件的下载服务的节点,下载源集合包括文件服务器和N个节点中下载了该镜像文件的至少一个节点;以及从该第一上游下载节点下载该镜像文件。
应理解,节点700包括的服务端模块710和数据下载服务模块720的功能可以在同一个模块中执行,即服务端模块和数据下载服务模块可以包括在同一个模块中,服务端模块710的功能主要在于接收客户端发送的数据下载请求,并将全部的数据下载请求接收并发送至数据下载服务模块720。
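It should be understood that, in the embodiments of this application, the server-side module 710 may be the Docker Daemon in Figure 6, and the data download service module 720 may be the DDS file.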
可选地,均衡下载策略可以包括第一下载条件,该第一下载条件为该第一上游下载节点的下游下载节点的个数小于预设阈值。
可选地,均衡下载策略还包括第二下载条件,该第二下载条件为该第一下载节点已下载该镜像文件的大小小于该第一上游下载节点已下载该镜像文件的大小。
可选地,数据下载服务模块720还用于:
向该查询服务器发送第一下载信息,该第一下载信息为该节点700下载该镜像文件的下载信息,该第一下载信息用于该查询服务器更新该查询服务器中的该镜像文件的下载信息列表,该下载信息列表包括该镜像文件在N个节点的下载信息。
可选地,数据下载服务模块720还用于:
向查询服务器发送查询请求,该查询请求用于查询该第一上游下载节点的信息。
可选地,该查询服务器为集中式的查询服务器,该集中式的查询服务器为N个节点中的特定节点。
可选地,该集中式的查询服务器采用主备模式。
可选地,该N个节点形成分布式的查询服务器系统,该查询服务器为该分布式的查询服务器系统中的查询服务器。
可选地,数据下载服务模块720还用于:
从该分布式的查询服务器系统中确定用于查询该第一上游下载节点的查询服务器。
可选地,集群还包括存储服务器,该存储服务器包括该N个节点的列表信息;数据下载服务模块720具体用于:
从该存储服务器中获取该N个节点的列表信息;
根据该列表信息,从该分布式的查询服务器系统中确定用于查询该第一上游下载节点的查询服务器。
可选地,数据下载服务模块720还用于:
根据该列表信息,采用一致性哈希算法确定该查询服务器。
可选地,数据下载服务模块720还用于:
向该存储服务器发送注册请求,该注册请求包括该节点的IP地址以及注册端口号。
可选地,该列表信息包括该N个节点的IP地址以及注册端口号。
可选地,该第一下载信息包括该镜像文件的文件名和该节点700已下载该镜像文件的文件大小。
可选地,该第一下载信息包括该镜像文件的文件名、该节点700已下载该镜像文件的文件大小、该节点700下载该镜像文件的下载时间和该节点700下载该镜像文件的上游下载节点。
可选地,该数据下载服务模块720还用于提供查询服务器功能。
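Optionally, in the embodiments of this application, the node 700 serves both as a download node of the first image file and as the query server for other nodes among the N nodes (for example, a second node). When the node 700 acts as a query server, the data download service module 720 further provides the following functions:
Determining a second upstream download node for the second node among the N nodes, where the second upstream download node is the node in the download source set that provides the download service of the second image file to the second node, and the download source set includes the file server and at least one of the N nodes that has downloaded the second image file;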
该节点700向该第二节点发送第二上游下载节点的信息。
可选地,该数据下载服务模块720还用于:
接收第二节点发送的查询请求,该查询请求用于查询该第二上游下载节点的信息。
可选地,该数据下载服务模块720还用于:
根据下载信息列表确定第二上游下载节点,该下载信息列表包括第二镜像文件在该N个节点的下载信息。
可选地,该数据下载服务模块720还用于:
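Determining the second upstream download node according to the first download condition and the second download condition, where the first download condition is that the number of downstream download nodes of the second upstream download node is less than the preset threshold, and the second download condition is that the size of the second image file already downloaded by the second node is less than the size of the second image file already downloaded by the second upstream download node.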
可选地,该数据下载服务模块720还用于:
接收第二节点发送的第二下载信息,该第二下载信息为该第二下载节点下载第二镜像文件的下载信息;
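Updating, according to the second download information, the download information list of the second image file maintained in the node 700 (which acts as the query server).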
可选地,第二下载信息包括第二镜像文件的文件名和第二下载节点已下载该第二镜像文件的文件大小。
可选地,第二下载信息包括第二镜像文件的文件名、第二下载节点已下载该第二镜像文件的文件大小、该第二下载节点下载第二镜像文件的下载时间和该第二下载节点下载第二镜像文件的上游下载节点。
应理解,在本申请实施例中该节点700即为N个节点中的任意一个节点,该节点700下载的是第一镜像文件,第二节点下载的是第二镜像文件。
本申请的实施例中,提供了一种在集群中镜像文件下载的方法,具体地通过查询服务器查找各个节点的上游下载节点的信息,使得集群中的镜像文件的下载由采用中心下载方式,即全部节点均在文件服务器中下载镜像文件的下载方式,变为基于文件服务器和其它节点提供文件源的镜像文件的下载方式,使得集群中镜像文件下载的时间复杂度由O(N)下降到了O(LogN),从而降低了集群环境中下载服务资源的占用,提高了集群中的下载效率。
图8示出了根据本申请另一实施例的节点800的示意性框图(图8中的节点800可以是图1中的任意一个节点),其中具体地,示出了数据下载服务模块的结构示意图。该节点800可以对应于各方法实施例中N个节点中的任意一个节点,可以具有方法中的节点的任意功能。
如图8所示,节点800为集群中的节点,该集群包括文件服务器和N个节点,该文件服务器为该N个节点中的至少一个节点提供该镜像文件的下载服务,该N个节点中下载了该镜像文件的至少一个节点为该N个节点中的至少一个节点提供该镜像文件的下载服务,其中,N为大于1的正整数。
如图8所示,该节点800可以包括:服务端模块810和数据下载服务模块820。服务端模块810可以为图7中的服务端模块710,数据下载服务模块820可以为图7中的数据下载服务模块720。在一些实施例中,数据下载服务模块820具体可以包括以下中的一个或多个模块:
代理模块821、处理模块822、FileTracker文件追踪模块823以及节点管理模块824。
应注意,代理模块821、处理模块822、FileTracker文件追踪模块823以及节点管 理模块824可以不同时存在于数据下载服务模块820中。
在本申请的实施例中,服务端模块810,用于接收客户端发送的数据下载请求,并将数据下载请求发送至数据下载服务模块820。
数据下载服务模块820,用于根据服务端模块810接收到数据下载请求中的镜像文件的下载请求,从查询服务器获取第一上游下载节点的信息,其中,该第一上游下载节点是基于均衡下载策略在下载源集合中确定的为该节点800提供该镜像文件的下载服务的节点,下载源集合包括文件服务器和N个节点中下载了该镜像文件的至少一个节点;以及从该第一上游下载节点下载该镜像文件。
其中,代理模块821,用于接收服务端模块810发送的数据下载请求,并对数据请求进行筛选。当数据请求为镜像文件下载请求时,将该镜像文件下载请求发送至处理模块822。
可选地,均衡下载策略包括第一下载条件,该第一下载条件为该第一上游下载节点的下游下载节点的个数小于预设阈值。
可选地,均衡下载策略包括第二下载条件,该第二下载条件为该第一下载节点已下载该镜像文件的大小小于该第一上游下载节点已下载该镜像文件的大小。
可选地,处理模块822用于:
根据该镜像文件的下载请求,从查询服务器获取第一上游下载节点的信息。
可选地,处理模块822还用于:
向该查询服务器发送第一下载信息,该第一下载信息为节点800下载该镜像文件的下载信息,该第一下载信息用于该查询服务器更新该查询服务器中的该镜像文件的下载信息列表,该下载信息列表包括该镜像文件在该N个节点的下载信息。
可选地,处理模块822还用于:
向查询服务器发送查询请求,该查询请求用于查询该第一上游下载节点的信息。
可选地,处理模块822还用于:
从分布式的查询服务器系统中确定用于查询该第一上游下载节点的查询服务器。
可选地,集群中还包括存储服务器,存储服务器包括N个节点的列表信息;处理模块822还用于:从存储服务器中获取N个节点的列表信息;
根据列表信息,从分布式的查询服务器系统中确定用于查询该第一上游下载节点的查询服务器。
可选地,处理模块822还用于:
根据列表信息,采用一致性哈希算法确定该查询服务器。
可选地,处理模块822还用于:
向存储服务器发送注册请求,该注册请求包括节点的IP地址以及注册端口号。
可选地,列表信息包括该N个节点的IP地址以及注册端口号。
可选地,该第一下载信息包括该镜像文件的文件名、节点800已下载该镜像文件的文件大小、节点800下载该镜像文件的下载时间和节点800下载该镜像文件的上游下载节点。
需要说明的是,在本申请的一个实施例中,节点800可以既作为第一镜像文件的下载节点,又作为N个节点中其它节点(例如第二节点)的查询服务器。该节点800作为 查询服务器时还包括查询服务器的功能。
可选地,当节点800作为N个节点中其它节点(例如第二节点)的查询服务器时,数据下载服务模块820中还可以包括FileTracker模块823。
可选地,FileTracker模块823用于:
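Determining a second upstream download node for the second node among the N nodes, where the second upstream download node is the node, determined in the download source set based on the balanced download policy, that provides the download service of the second image file to the second node, and the download source set includes the file server and at least one of the N nodes that has downloaded the second image file;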
该节点800向第二节点发送第二上游下载节点的信息。
可选地,FileTracker模块823还用于:
接收第二节点发送的查询请求,该查询请求用于查询该第二上游下载节点的信息。
可选地,均衡下载策略包括第一下载条件,该FileTracker模块823还用于:
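Determining the second upstream download node according to the first download condition, where the first download condition is that the number of downstream download nodes of the second upstream download node is less than the preset threshold. Here, a downstream download node is a node that is currently downloading the second image file from the second upstream download node.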
可选地,均衡下载策略还包括第二下载条件,该FileTracker模块823还用于:
根据第一下载条件和第二下载条件确定第二上游下载节点,其中,该第二下载条件为该第二下载节点已下载该第二镜像文件的大小小于该第二上游下载节点已下载该第二镜像文件的大小。
可选地,FileTracker模块823还用于:
接收第二节点发送的第二下载信息,该第二下载信息为该第二下载节点下载第二镜像文件的下载信息;
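Updating, according to the second download information, the download information list of the second image file maintained in the node 800 (which acts as the query server).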
可选地,第二下载信息包括第二镜像文件的文件名和第二下载节点已下载该第二镜像文件的文件大小。
可选地,第二下载信息包括第二镜像文件的文件名、第二下载节点已下载该第二镜像文件的文件大小、该第二下载节点下载第二镜像文件的下载时间和该第二下载节点下载第二镜像文件的上游下载节点。
应理解,在本申请实施例中该节点800可以为N个节点中的任意一个节点,该节点800下载的是第一镜像文件,第二节点下载的是第二镜像文件。
可选地,当集群中的查询服务器为分布式的查询服务器系统时,数据下载服务模块820中还可以包括节点管理模块824。
节点管理模块824用于:
提供节点到存储服务器上注册的服务。
通过这个服务,集群中的节点将其IP地址和注册端口号注册到存储服务器上。在分布式的查询服务器系统中第二节点可以基于这个列表信息采用一致性哈希算法确定提供查询服务器功能的节点。
应理解,在本申请实施例中,若集群中的查询服务器为分布式的查询服务器,则数据下载服务模块820中包括节点管理模块824。若集群中的查询服务器为集中式的查询 服务器,则数据下载服务模块820中不包括节点管理模块824。
本申请实施例的技术方案中,集群中的节点通过查询服务器查找该节点的上游下载节点的信息,从该上游下载节点下载该镜像文件,使得集群中的镜像文件的下载由采用中心下载方式,即全部节点在文件服务器中下载镜像文件的下载方式,变为基于文件服务器和其它节点提供文件源的镜像文件的下载方式,使集群中镜像文件下载的时间复杂度由O(N)下降到了O(LogN),从而降低了集群环境中下载服务资源的占用,提高了集群中的下载效率。
图9示出了根据本申请实施例的查询服务器900的示意性框图。该查询服务器900可以对应于各方法实施例中的查询服务器,该查询服务器900可以为集中式的查询服务器或者分布式查询服务器系统中的查询服务器。
如图9中的查询服务器900应用于集群中,该集群包括文件服务器和N个节点,该文件服务器为该N个节点中的至少一个节点提供该镜像文件的下载服务,该N个节点中下载了该镜像文件的至少一个节点为该N个节点中的至少一个节点提供该镜像文件的下载服务,其中,N为大于1的正整数。该查询服务器900包括:
处理模块910,用于基于均衡下载策略确定N个节点中的第一下载节点的第一上游下载节点,其中,该第一上游下载节点是下载源集合中的为该第一下载节点提供该镜像文件的下载服务的节点,下载源集合包括该文件服务器和N个节点中下载了该镜像文件的至少一个节点;
收发模块920,用于向该第一下载节点发送该第一上游下载节点的信息。
可选地,该收发模块920还用于:
接收第一节点发送的查询请求,该查询请求用于查询该第一上游下载节点的信息。
可选地,处理模块910具体用于:
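Determining the first upstream download node based on the balanced download policy and the download information list, where the download information list includes the download information of the image file at the N nodes.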
可选地,均衡下载策略包括第一下载条件,处理模块910具体用于:
根据第一下载条件确定该第一上游下载节点,该第一下载条件为该第一上游下载节点的下游下载节点的个数小于预设阈值。其中,所述下游下载节点即正在从该第一上游下载节点下载该镜像文件的节点。
可选地,均衡下载策略还包括第二下载条件,处理模块910具体用于:
根据第一下载条件和第二下载条件确定该第一上游下载节点,其中,该第二下载条件为该第一下载节点已下载该镜像文件的大小小于该第一上游下载节点已下载该镜像文件的大小。
可选地,该收发模块920还用于:
接收第一下载节点发送的第一下载信息,该第一下载信息为该第一下载节点下载该镜像文件的下载信息;
可选地,第一下载信息包括该镜像文件的文件名和第一下载节点已下载该镜像文件的文件大小。
可选地,第一下载信息包括该镜像文件的文件名、第一下载节点已下载该镜像文件的文件大小、第一下载节点下载该镜像文件的下载时间和第一下载节点下载该镜像文件 的上游下载节点。
该处理模块910还用于:
根据第一下载信息更新该查询服务器900中的该镜像文件的下载信息列表。
可选地,查询服务器900为集中式的查询服务器,集中式的查询服务器为该N个节点中的特定节点。
可选地,集中式的查询服务器可以采用主备模式。
可选地,N个节点形成分布式的查询服务器系统,查询服务器900为该分布式的查询服务器系统中的查询服务器。
本申请实施例的技术方案中,查询服务器向节点发送上游下载节点的信息,从而该节点从该上游下载节点下载镜像文件,使得集群中的镜像文件的下载由采用中心下载方式,即全部节点在文件服务器中下载镜像文件的下载方式,变为基于文件服务器和其它节点提供文件源的镜像文件的下载方式,使集群中镜像文件下载的时间复杂度由O(N)下降到了O(LogN),从而降低了集群环境中下载服务资源的占用,提高了集群中的下载效率。
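Figure 10 shows a schematic block diagram of a node according to another embodiment of this application. The node includes at least one processor 1020 (for example, a CPU), at least one network interface 1040 or other communication interface, and a memory 1060, which are communicatively connected to one another. The processor 1020 is configured to execute executable modules, such as computer programs, stored in the memory 1060. The memory 1060 may include high-speed random access memory (RAM) and may also include non-volatile memory, for example at least one disk storage device. A communication connection with at least one other network element is established through the at least one network interface 1040 (which may be wired or wireless).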
在一些实施方式中,存储器1060存储了程序1011,处理器1020执行程序1011,用于执行前述本申请各种实施例中的方法。
例如,处理器可以用于执行上述图3中的S310查询服务器向第一下载节点发送第一上游下载节点的信息;或者S320第一下载节点从第一上游下载节点处获取镜像文件。
例如,处理器可以用于执行图4中S410第一节点向查询服务器发送下载镜像文件1的查询请求、S420查询服务器向第一节点发送查询结果、S430第一节点从文件服务器获取数据以及S440第一节点向查询服务器发送镜像文件1的下载信息。
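For example, the memory 1060 may store the DDS file of the embodiments of this application. The processor 1020 is configured to execute the modules in the DDS file, such as Docker Registry Proxy (registry proxy), DDS Service Handle (service management), File Tracker (file tracking), and Node Manager (node management), thereby implementing the technical solution of the embodiments of this application.
For example, if the node acts as a download node, the processor 1020 executes the DDS Service Handle module to provide the download node functions of the embodiments of this application. If the node acts as a query server, the processor 1020 executes the File Tracker module to provide the query server functions of the embodiments of this application. If the node acts both as a download node and as a query server for other nodes, the processor 1020 executes the DDS Service Handle module to provide the download node functions and executes the File Tracker module to provide the query server functions. Further, if the node provides the functions of a distributed query server, the processor 1020 also executes the Node Manager module to register with the storage server.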
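The mapping from node role to executed modules described above can be summarized in a short Python sketch. The function name `modules_for` and its boolean parameters are hypothetical; only the module names come from the description.

```python
from typing import List


def modules_for(is_download_node: bool,
                is_query_server: bool,
                is_distributed: bool = False) -> List[str]:
    """Return the DDS modules a node would execute for the given roles."""
    modules: List[str] = []
    if is_download_node:
        # Download node functions.
        modules.append("DDS Service Handle")
    if is_query_server:
        # Query server functions.
        modules.append("File Tracker")
        if is_distributed:
            # Registration for the distributed query server case.
            modules.append("Node Manager")
    return modules


print(modules_for(is_download_node=True, is_query_server=True, is_distributed=True))
```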
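Optionally, the node may further include a memory. The memory may store program code, and the processor invokes the program code stored in the memory to implement the corresponding functions of the node. Optionally, the processor and the memory may be implemented as a chip.
An embodiment of this application further provides a cluster system that includes the foregoing node and query server. For example, the cluster system may include the nodes shown in FIG. 7 and FIG. 8 and the query server shown in FIG. 9.
The foregoing embodiments may be implemented wholly or partly in software, hardware, firmware, or any combination thereof. When software is used, the embodiments may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of this application are produced wholly or partly. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium accessible by a computer, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid-state disk (Solid State Disk, SSD)).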
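It should be understood that, in the embodiments of this application, the terms "first", "second", and the like are used only to refer to objects and do not indicate any order among the corresponding objects.
It should be understood that, in the embodiments of this application, the term "and/or" merely describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B may mean: A alone, both A and B, or B alone. In addition, the character "/" in this document generally indicates an "or" relationship between the objects before and after it.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of an embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented as software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
The foregoing is merely a description of specific implementations of this application, and the protection scope of this application is not limited thereto. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.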

Claims (26)

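  1. A method for downloading an image file in a cluster, wherein the method is applicable to a cluster comprising a file server and N nodes, the file server provides a download service for the image file to at least one of the N nodes, at least one of the N nodes that has downloaded the image file provides a download service for the image file to at least one of the N nodes, and N is a positive integer greater than 1, the method comprising:
    receiving, by a first download node among the N nodes, information about a first upstream download node sent by a query server, wherein the first upstream download node is a node determined in a download source set, based on a balanced download policy, to provide the download service for the image file to the first download node, and the download source set comprises the file server and at least one of the N nodes that has downloaded the image file; and
    downloading, by the first download node, the image file from the first upstream download node.
  2. The method according to claim 1, wherein after the first download node downloads the image file from the first upstream download node, the method further comprises:
    sending, by the first download node, first download information to the query server, wherein the first download information is download information about the first download node downloading the image file, the first download information is used by the query server to update a download information list of the image file in the query server, and the download information list comprises download information of the image file on the N nodes.
  3. The method according to claim 1, wherein the balanced download policy comprises a first download condition, and
    the first download condition is that the number of downstream download nodes of the first upstream download node is less than a preset threshold.
  4. The method according to claim 3, wherein the balanced download policy further comprises a second download condition, and the second download condition is that the size of the image file already downloaded by the first download node is smaller than the size of the image file already downloaded by the first upstream download node.
  5. The method according to any one of claims 1 to 4, wherein before the first download node receives the information about the first upstream download node sent by the query server, the method further comprises:
    sending, by the first download node, a query request to the query server, wherein the query request is used to query the information about the first upstream download node.
  6. The method according to claim 2, wherein the first download information comprises the file name of the image file and the file size of the image file already downloaded by the first download node.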
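  7. A method for downloading an image file in a cluster, wherein the method is applicable to a cluster comprising a file server and N nodes, the file server provides a download service for the image file to at least one of the N nodes, at least one of the N nodes that has downloaded the image file provides a download service for the image file to at least one of the N nodes, and N is a positive integer greater than 1, the method comprising:
    determining, by a query server based on a balanced download policy, a first upstream download node for a first download node among the N nodes, wherein the first upstream download node is a node in a download source set that provides the download service for the image file to the first download node, and the download source set comprises the file server and at least one of the N nodes that has downloaded the image file; and
    sending, by the query server, information about the first upstream download node to the first download node.
  8. The method according to claim 7, wherein the determining, by the query server based on the balanced download policy, of the first upstream download node for the first download node among the N nodes comprises:
    determining, by the query server, the first upstream download node based on the balanced download policy and a download information list, wherein the download information list comprises download information of the image file on the N nodes.
  9. The method according to claim 8, wherein the balanced download policy comprises a first download condition, and the determining, by the query server, of the first upstream download node based on the balanced download policy and the download information list comprises:
    determining, by the query server, the first upstream download node according to the first download condition, wherein the first download condition is that the number of downstream download nodes of the first upstream download node is less than a preset threshold.
  10. The method according to claim 9, wherein the balanced download policy further comprises a second download condition, and the determining, by the query server, of the first upstream download node based on the balanced download policy and the download information list comprises:
    determining, by the query server, the first upstream download node according to the first download condition and the second download condition, wherein the second download condition is that the size of the image file already downloaded by the first download node is smaller than the size of the image file already downloaded by the first upstream download node.
  11. The method according to any one of claims 7 to 10, wherein after the query server sends the information about the first upstream download node to the first download node, the method further comprises:
    receiving, by the query server, first download information sent by the first download node, wherein the first download information is download information about the first download node downloading the image file; and
    updating, by the query server, a download information list of the image file in the query server according to the first download information.
  12. The method according to any one of claims 7 to 11, wherein before the query server determines the first upstream download node for the first download node, the method further comprises:
    receiving, by the query server, a query request sent by the first download node, wherein the query request is used to query the information about the first upstream download node.
  13. The method according to claim 11, wherein the first download information comprises the file name of the image file and the file size of the image file already downloaded by the first download node.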
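  14. A node, wherein the node is a node in a cluster, the cluster comprises a file server and N nodes, the file server provides a download service for the image file to at least one of the N nodes, at least one of the N nodes that has downloaded the image file provides a download service for the image file to at least one of the N nodes, and N is a positive integer greater than 1, the node comprising:
    a server-side module and a data download service module, wherein
    the server-side module is configured to send a download request for the image file to the data download service module; and
    the data download service module is configured to obtain, according to the download request for the image file, information about a first upstream download node from a query server, wherein the first upstream download node is a node determined in a download source set, based on a balanced download policy, to provide the download service for the image file to the node, and the download source set comprises the file server and at least one of the N nodes that has downloaded the image file; and to download the image file from the first upstream download node.
  15. The node according to claim 14, wherein the data download service module is further configured to:
    send first download information to the query server, wherein the first download information is download information about the node downloading the image file, the first download information is used by the query server to update a download information list of the image file in the query server, and the download information list comprises download information of the image file on the N nodes.
  16. The node according to claim 14, wherein the balanced download policy comprises a first download condition, and the first download condition is that the number of downstream download nodes of the first upstream download node is less than a preset threshold.
  17. The node according to claim 14 or 16, wherein the balanced download policy further comprises a second download condition, and the second download condition is that the size of the image file already downloaded by the first download node is smaller than the size of the image file already downloaded by the first upstream download node.
  18. The node according to any one of claims 14 to 17, wherein the data download service module is further configured to:
    send a query request to the query server, wherein the query request is used to query the information about the first upstream download node.
  19. The node according to claim 15, wherein the first download information comprises the file name of the image file and the file size of the image file already downloaded by the node.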
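  20. A query server, wherein the query server is applied to a cluster, the cluster comprises a file server and N nodes, the file server provides a download service for the image file to at least one of the N nodes, at least one of the N nodes that has downloaded the image file provides a download service for the image file to at least one of the N nodes, and N is a positive integer greater than 1, the query server comprising:
    a processing module, configured to determine, based on a balanced download policy, a first upstream download node for a first download node among the N nodes, wherein the first upstream download node is a node in a download source set that provides the download service for the image file to the first download node, and the download source set comprises the file server and at least one of the N nodes that has downloaded the image file; and
    a transceiver module, configured to send information about the first upstream download node to the first download node.
  21. The query server according to claim 20, wherein the processing module is specifically configured to:
    determine the first upstream download node based on the balanced download policy and a download information list, wherein the download information list comprises download information of the image file on the N nodes.
  22. The query server according to claim 20 or 21, wherein the balanced download policy comprises a first download condition, and the processing module is specifically configured to:
    determine the first upstream download node according to the first download condition, wherein the first download condition is that the number of downstream download nodes of the first upstream download node is less than a preset threshold.
  23. The query server according to claim 22, wherein the balanced download policy further comprises a second download condition, and the processing module is specifically configured to:
    determine the first upstream download node according to the first download condition and the second download condition, wherein the second download condition is that the size of the image file already downloaded by the first download node is smaller than the size of the image file already downloaded by the first upstream download node.
  24. The query server according to any one of claims 20 to 23, wherein the transceiver module is further configured to:
    receive first download information sent by the first download node, wherein the first download information is download information about the first download node downloading the image file; and
    the processing module is further configured to:
    update a download information list of the image file in the query server according to the first download information.
  25. The query server according to any one of claims 20 to 24, wherein the transceiver module is further configured to:
    receive a query request sent by the first node, wherein the query request is used to query the information about the first upstream download node.
  26. The query server according to claim 24, wherein the first download information comprises the file name of the image file and the file size of the image file already downloaded by the first download node.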
PCT/CN2018/121070 2018-02-12 2018-12-14 Method, node, and query server for downloading an image file in a cluster WO2019153880A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810146877.XA CN108200211B (zh) 2018-02-12 2018-02-12 Method, node, and query server for downloading an image file in a cluster
CN201810146877.X 2018-02-12

Publications (1)

Publication Number Publication Date
WO2019153880A1 true WO2019153880A1 (zh) 2019-08-15

Family

ID=62593228

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/121070 WO2019153880A1 (zh) 2018-02-12 2018-12-14 Method, node, and query server for downloading an image file in a cluster

Country Status (2)

Country Link
CN (1) CN108200211B (zh)
WO (1) WO2019153880A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
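CN108200211B (zh) * 2018-02-12 2020-10-09 Huawei Technologies Co., Ltd. Method, node, and query server for downloading an image file in a cluster
CN109246234B (zh) * 2018-09-30 2021-09-24 Beijing Kingsoft Cloud Network Technology Co., Ltd. Image file download method and apparatus, electronic device, and storage medium
CN110401702B (zh) * 2019-07-09 2022-03-25 Beijing Dajia Internet Information Technology Co., Ltd. Offline package download method and apparatus, electronic device, and storage medium
CN111367880A (zh) * 2020-02-05 2020-07-03 Beijing Huadian Tianren Electric Power Control Technology Co., Ltd. General-purpose real-time data storage management system and implementation method thereof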

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
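CN106506587A (zh) * 2016-09-23 2017-03-15 National University of Defense Technology of the Chinese People's Liberation Army Docker image download method based on distributed storage
WO2017067484A1 (zh) * 2015-10-23 2017-04-27 ZTE Corporation Virtualized data center scheduling system and method
CN107426258A (zh) * 2016-05-23 2017-12-01 Huawei Technologies Co., Ltd. Method and apparatus for uploading and downloading an image file
CN108200211A (zh) * 2018-02-12 2018-06-22 Huawei Technologies Co., Ltd. Method, node, and query server for downloading an image file in a cluster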

Also Published As

Publication number Publication date
CN108200211B (zh) 2020-10-09
CN108200211A (zh) 2018-06-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18905786

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18905786

Country of ref document: EP

Kind code of ref document: A1