CN109688222B

CN109688222B - Shared computing resource scheduling method, shared computing system, server and storage medium

Info

Publication number: CN109688222B
Application number: CN201811601521.7A
Authority: CN
Inventors: 李�浩
Original assignee: Shenzhen Onething Technologies Co Ltd
Current assignee: Shenzhen Onething Technologies Co Ltd
Priority date: 2018-12-26
Filing date: 2018-12-26
Publication date: 2020-12-25
Anticipated expiration: 2038-12-26
Also published as: WO2020133967A1; CN109688222A

Abstract

The invention discloses a scheduling method for shared computing resources. The method includes: acquiring a shared computing task to be executed; acquiring a list of all candidate shared computing nodes; selecting, from the shared computing node list, a shared computing node matching the shared computing task; and delivering the shared computing task to the shared computing node matching the shared computing task. The invention also provides a shared computing system, a server and a storage medium. The invention can select appropriate shared computing nodes according to the user's resource requirements and reschedule in real time in response to node fluctuations.

Description

Scheduling method for shared computing resources, shared computing system, server and storage medium

Technical Field

The present invention relates to the technical field of shared computing, and in particular to a scheduling method for shared computing resources, a shared computing system, a server and a storage medium.

Background

At present, many enterprises need large amounts of bandwidth, disk and CPU resources to provide stable, high-speed services to users distributed across different regions and network environments. At the same time, bandwidth, storage and other resources in home environments sit largely idle. By using intelligent hardware deployed in users' homes as home nodes, a shared computing system can be built that makes full use of these resources and greatly reduces enterprises' service costs. Home nodes have the following characteristics: 1. they are numerous, possibly on the order of hundreds of thousands, millions or more; 2. they are less stable than server nodes; 3. they are interconnected over the public network, and their IP addresses change dynamically; 4. a single node owns few physical resources, and those resources fluctuate in real time.

In this model, the core requirement is flexible and efficient management of the resources collected by the intelligent hardware: different service programs must be deployed quickly, resource management and security control must be applied to those programs, and each node's resource usage must be scheduled in real time according to the services so as to maximize use of the node's physical resources. The industry currently has no mature solution for abstracting virtual computing, storage and network resources from more than one million nodes deployed in home network environments.

Summary of the Invention

In view of this, the present invention proposes a scheduling method for shared computing resources, a shared computing system, a server and a storage medium, to solve at least one of the above technical problems.

First, to achieve the above purpose, the present invention provides a scheduling method for shared computing resources, characterized in that the method includes:

acquiring a shared computing task to be executed;

acquiring a list of all candidate shared computing nodes;

selecting, from the shared computing node list, a shared computing node matching the shared computing task;

delivering the shared computing task to the shared computing node matching the shared computing task.
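The four steps above can be sketched as a single function whose collaborators are passed in as callables. This is only an illustrative outline: the `Task` and `Node` types, their field names and the `schedule` function are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    task_id: str
    demand: Dict[str, float]  # hypothetical, e.g. {"bandwidth_mbps": 100.0}


@dataclass
class Node:
    node_id: str
    available: Dict[str, float]  # hypothetical available resource data


def schedule(
    acquire_task: Callable[[], Task],
    list_candidates: Callable[[], List[Node]],
    select: Callable[[Task, List[Node]], List[Node]],
    deliver: Callable[[Task, Node], None],
) -> List[str]:
    """Run the four steps once; return the IDs of the nodes that received the task."""
    task = acquire_task()               # step 1: acquire the task to be executed
    candidates = list_candidates()      # step 2: acquire the candidate node list
    matched = select(task, candidates)  # step 3: select the matching nodes
    for node in matched:                # step 4: deliver the task to each match
        deliver(task, node)
    return [node.node_id for node in matched]
```

Passing the steps in as callables keeps the outline independent of how matching and delivery are actually implemented.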

Optionally, the shared computing node list includes the ID and available resource data of each shared computing node;

the shared computing task includes a requirement for the shared computing resources to be configured;

selecting, from the shared computing node list, a shared computing node matching the shared computing task includes:

selecting, from the shared computing node list, a shared computing node matching the shared computing task according to the requirement for the shared computing resources to be configured and the available resource data of each shared computing node.

Optionally, the requirement for shared computing resources includes at least one of a bandwidth requirement, a storage space requirement and a computing resource requirement.

Optionally, the available resource data in the shared computing node list is calculated from the real-time node status and task status uploaded by each shared computing node and the data generated when tasks are executed on the node.
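The text does not give a formula for the available resource data. One plausible reading, shown here purely as an assumption, is that a node's available amount of each resource is its reported capacity minus what its running tasks already use. The `NodeReport` type and its fields are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NodeReport:
    node_id: str
    # Assumed content of the real-time node status: total capacity per resource.
    capacity: Dict[str, float]
    # Assumed content of the task status: per-task resource usage.
    task_usage: List[Dict[str, float]] = field(default_factory=list)


def available_resources(report: NodeReport) -> Dict[str, float]:
    """Capacity minus the usage of all running tasks, floored at zero."""
    available: Dict[str, float] = {}
    for resource, total in report.capacity.items():
        used = sum(usage.get(resource, 0.0) for usage in report.task_usage)
        available[resource] = max(total - used, 0.0)
    return available
```

Flooring at zero keeps an over-committed node from reporting negative availability. A real system would probably also weigh historical data, which this sketch ignores.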

Optionally, selecting, from the shared computing node list, a shared computing node matching the shared computing task according to the requirement for the shared computing resources to be configured and the available resource data of each shared computing node includes:

obtaining the available resource data of each shared computing node in the shared computing node list;

selecting, from the shared computing node list, the shared computing nodes whose available resource data reaches a preset value, to generate an available node list;

scoring each shared computing node in the available node list according to preset indicators, and using a bin-packing algorithm to split the requirement for the shared computing resources to be configured across the shared computing nodes whose scores exceed a preset threshold, to obtain the final matching node list.
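The filter, score and split sequence above can be illustrated for a single resource dimension. The patent does not specify its bin-packing algorithm, so the sketch below substitutes a simple greedy heuristic that fills the highest-scoring nodes first. The function name, the node dictionary shape and the scoring callback are all hypothetical.

```python
from typing import Callable, Dict, Tuple

NodeData = Dict[str, float]  # hypothetical shape: "available" plus any scoring inputs


def match_nodes(
    nodes: Dict[str, NodeData],
    demand: float,
    min_available: float,
    score: Callable[[NodeData], float],
    score_threshold: float,
) -> Tuple[Dict[str, float], float]:
    """Return (allocation per node ID, demand left unserved)."""
    # Step 1: keep nodes whose available resource reaches the preset value.
    usable = {nid: n for nid, n in nodes.items() if n["available"] >= min_available}
    # Step 2: score the usable nodes and keep those above the threshold.
    scored = [(score(n), nid) for nid, n in usable.items()]
    eligible = sorted(
        (pair for pair in scored if pair[0] > score_threshold),
        key=lambda pair: (-pair[0], pair[1]),  # best score first, node ID breaks ties
    )
    # Step 3: greedy stand-in for bin packing: fill the best nodes first.
    allocation: Dict[str, float] = {}
    remaining = demand
    for _, nid in eligible:
        if remaining <= 0:
            break
        share = min(usable[nid]["available"], remaining)
        allocation[nid] = share
        remaining -= share
    return allocation, remaining
```

A non-zero second return value means the eligible nodes could not cover the whole demand, and the caller would have to relax the thresholds or wait for more nodes.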

Optionally, selecting, from the shared computing node list, a shared computing node matching the shared computing task according to the requirement for the shared computing resources to be configured and the available resource data of each shared computing node further includes:

periodically obtaining the current available resource data of the selected shared computing nodes;

determining, according to the requirement for the shared computing resources to be configured and the current available resource data of the shared computing nodes, whether nodes need to be added or removed.
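The text leaves the add/remove criterion open. As one possible interpretation, the sketch below removes any selected node whose current availability has dropped below a floor and reports how much demand must be covered by adding nodes. The function name, the `min_available` floor and the single-resource simplification are assumptions.

```python
from typing import Dict, List, Tuple


def plan_node_changes(
    demand: float,
    current_available: Dict[str, float],
    min_available: float,
) -> Tuple[List[str], float]:
    """Return (node IDs to remove, shortfall that newly added nodes must cover)."""
    # Assumed rule: a node that fell below the floor no longer counts toward the task.
    to_remove = sorted(
        nid for nid, avail in current_available.items() if avail < min_available
    )
    healthy_total = sum(
        avail for avail in current_available.values() if avail >= min_available
    )
    shortfall = max(demand - healthy_total, 0.0)
    return to_remove, shortfall
```

A caller would run this on a timer with freshly reported data and, when the shortfall is positive, rerun node selection for just that amount.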

Optionally, the preset indicators include regional resource surplus and historical stability.

Optionally, acquiring the shared computing task to be executed includes: acquiring a Docker image generated from the shared computing task to be executed.

Optionally, delivering the shared computing task to the shared computing node matching the shared computing task includes: delivering the Docker image corresponding to the shared computing task to the shared computing node matching the shared computing task.

In addition, to achieve the above purpose, the present invention also provides a server. The server includes a memory and a processor; the memory stores a scheduling program for shared computing resources that can run on the processor, and the scheduling program, when executed by the processor, implements the scheduling method for shared computing resources described above.

Further, to achieve the above purpose, the present invention also provides a shared computing system, the system including:

a task management unit, configured to receive a shared computing task to be executed from a client and dispatch the shared computing task to a scheduling service unit;

the scheduling service unit, configured to obtain the shared computing task from the task management unit, obtain a list of all candidate shared computing nodes according to the status and historical data of each shared computing node provided by a node management unit and a data warehouse, and select, from the shared computing node list, a shared computing node matching the shared computing task;

a deployment service unit, configured to deliver the shared computing task to the shared computing node that the scheduling service unit selected as matching the shared computing task.

Further, to achieve the above purpose, the present invention also provides a storage medium. The storage medium stores a scheduling program for shared computing resources, and the scheduling program can be executed by at least one processor so that the at least one processor performs the scheduling method for shared computing resources described above.

The scheduling method for shared computing resources, the shared computing system, the server and the storage medium proposed by the present invention can uniformly manage a Docker cluster composed of shared computing nodes on the order of millions, allocate to each shared computing task the shared computing nodes that match the resources the task requires, and reschedule nodes at any time as node status changes, keeping the total amount of resources stable.

Brief Description of the Drawings

FIG. 1 is a schematic architecture diagram of a shared computing system proposed by the first embodiment of the present invention;

FIG. 2 is a schematic architecture diagram of a scheduling server proposed by the second embodiment of the present invention;

FIG. 3 is a schematic flowchart of a scheduling method for shared computing resources proposed by the third embodiment of the present invention;

FIG. 4 is a schematic diagram of the detailed flow of S24 in FIG. 3.

The realization of the purpose, the functional features and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.

Detailed Description

To make the purpose, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present invention, not to limit it. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention, without creative effort, fall within the protection scope of the present invention.

It should be noted that descriptions involving "first", "second" and the like in the present invention are for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments may be combined with one another, provided that the combination can be realized by those of ordinary skill in the art. When a combination of technical solutions is contradictory or cannot be realized, the combination should be considered not to exist and falls outside the protection scope claimed by the present invention.

First Embodiment

Referring to FIG. 1, the first embodiment of the present invention provides a shared computing system. The shared computing system is an IaaS (Infrastructure as a Service) system built from distributed node resources. Its core function is to select suitable nodes according to the user's resource requirements, apply lightweight virtualization to them so that they carry the user's program logic, and schedule and adjust in real time in response to fluctuations in the nodes' network location, bandwidth, storage and so on.

In this embodiment, the shared computing system 1 includes a server 10 and shared computing nodes 19. The server 10 includes a task management unit 11, a scheduling service unit 12, a node management unit 13, a data warehouse 14, a deployment service unit 15 and an image repository 17. The shared computing system 1 communicates with a client 2 over a network and allocates the corresponding shared computing nodes 19 according to the shared computing task initiated by the client 2, so that the task can be executed.

The client 2 is used to select the specification and capacity of the required resources and the program logic to be executed, to automatically generate a Docker (application container engine) image from that program logic, and to encapsulate the selected resource requirements into a standardized shared computing task. In this embodiment, the user of the client 2 can use a management console, a CLI (Command-Line Interface) tool, API (Application Programming Interface) calls or other means to select the specification and capacity of the required resources (for example, the amount of bandwidth or storage) and to choose the program logic to be executed, which may be implemented in any of several languages. After processing by a debugging platform and a cross-compilation platform, the program logic is automatically built into a Docker image. For example, the resource requirement may be 100 Gbps of bandwidth and 10 PB of storage, and the logic code to execute may be hello.py. The user of the client 2 can also start, stop, add and delete the program logic. After encapsulating the selected resource requirements into a standardized task, the client 2 submits the task to the task management unit 11. The program logic selected by the user on the client 2 is encapsulated into a standardized Docker image, which hides differences in programming language and execution environment, and is then submitted to the image repository 17.
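The standardized task in the example above (100 Gbps of bandwidth, 10 PB of storage, hello.py) could be represented as a small serializable record. The patent does not define a task format, so the class, its field names, the JSON encoding, the task ID and the image reference below are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SharedComputingTask:
    task_id: str
    bandwidth_gbps: float
    storage_pb: float
    image: str       # hypothetical reference to the Docker image built by the client
    entrypoint: str  # program logic to run inside the container


def encode_task(task: SharedComputingTask) -> str:
    """Serialize the task so it can be handed to the task management unit."""
    return json.dumps(asdict(task), sort_keys=True)


# The example from the text: 100 Gbps bandwidth, 10 PB storage, hello.py.
task = SharedComputingTask(
    task_id="task-0001",               # made-up ID
    bandwidth_gbps=100.0,
    storage_pb=10.0,
    image="registry.example/hello:1",  # made-up image reference
    entrypoint="hello.py",
)
payload = encode_task(task)
```

Keeping the resource requirement separate from the image reference mirrors the text, where the task goes to the task management unit and the image goes to the image repository.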

The task management unit 11 is used to receive the above task from the client 2 and then dispatch it to the scheduling service unit 12. In this embodiment, the task management unit 11 places the received tasks into multiple parallel pipelines according to priority and relevance, and the scheduling service unit 12 takes tasks from these pipelines in order.
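How priority and relevance map onto pipelines is not spelled out. The sketch below assumes that tasks sharing a relevance key (for instance, the same application) land in the same pipeline, and that each pipeline serves lower priority numbers first, in arrival order within a priority. The `TaskPipelines` class and the use of a CRC32 hash are illustrative choices, not taken from the patent.

```python
import heapq
import itertools
import zlib
from typing import List, Optional, Tuple


class TaskPipelines:
    """Several parallel pipelines, each a priority queue of task IDs."""

    def __init__(self, pipeline_count: int) -> None:
        self._heaps: List[List[Tuple[int, int, str]]] = [[] for _ in range(pipeline_count)]
        self._seq = itertools.count()  # arrival counter, keeps FIFO order within a priority

    def pipeline_for(self, relevance_key: str) -> int:
        # Stable hash, so related tasks always map to the same pipeline.
        return zlib.crc32(relevance_key.encode("utf-8")) % len(self._heaps)

    def submit(self, task_id: str, priority: int, relevance_key: str) -> int:
        index = self.pipeline_for(relevance_key)
        heapq.heappush(self._heaps[index], (priority, next(self._seq), task_id))
        return index

    def next_task(self, index: int) -> Optional[str]:
        """Pop the next task of one pipeline, or None when it is empty."""
        heap = self._heaps[index]
        return heapq.heappop(heap)[2] if heap else None
```

Routing related tasks to one pipeline preserves their relative order while unrelated tasks proceed in parallel on other pipelines.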

The scheduling service unit 12 is used to obtain tasks from the task management unit 11 and to select the shared computing nodes 19 that match a shared computing task according to the status and historical data of each shared computing node 19 provided by the node management unit 13 and the data warehouse 14. To select nodes, the scheduling service unit 12 relies on the real-time status of all nodes obtained from the node management unit 13 and on the historical data of nodes and tasks obtained from the data warehouse 14 (for example, the historical stability of nodes). For example, the scheduling service unit 12 first obtains the current list of all candidate shared computing nodes. The list includes the ID and available resource data of each shared computing node 19, and the available resource data can be calculated from the real-time node status and task status uploaded by each shared computing node 19 and the data generated when tasks are executed on the node. The scheduling service unit 12 then splits the task's resource requirements and, according to region, ISP (Internet Service Provider), NAT (Network Address Translation) type, bandwidth, storage space, computing resources and so on, selects the nodes that reach the preset values to form an available node list. Finally, it scores each shared computing node 19 in the available node list according to preset indicators such as regional resource surplus and historical stability and, taking resource cost into account, uses a bin-packing algorithm under the principle of maximizing resource utilization to split the shared computing resources the task requires across the shared computing nodes 19 whose scores exceed a preset threshold, producing the final matching node list. In addition, after the selected shared computing nodes 19 upload their real-time node status and task status (from which the current available resource data is obtained), the scheduling service unit 12 further determines whether nodes need to be added or removed.
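The preliminary selection by region, ISP, NAT type, bandwidth and storage can be read as a hard-constraint filter applied before scoring. The following sketch shows one such filter. The `NodeInfo` and `Constraints` types, their fields and the sample values are hypothetical, and a constraint set to `None` is taken to mean no restriction.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class NodeInfo:
    node_id: str
    region: str
    isp: str
    nat_type: str
    bandwidth_mbps: float
    storage_gb: float


@dataclass(frozen=True)
class Constraints:
    regions: Optional[FrozenSet[str]] = None    # None means any region is acceptable
    isps: Optional[FrozenSet[str]] = None
    nat_types: Optional[FrozenSet[str]] = None
    min_bandwidth_mbps: float = 0.0             # assumed preset values
    min_storage_gb: float = 0.0


def filter_nodes(nodes: List[NodeInfo], c: Constraints) -> List[NodeInfo]:
    """Keep only nodes that satisfy every categorical and numeric constraint."""

    def allowed(value: str, accepted: Optional[FrozenSet[str]]) -> bool:
        return accepted is None or value in accepted

    return [
        n
        for n in nodes
        if allowed(n.region, c.regions)
        and allowed(n.isp, c.isps)
        and allowed(n.nat_type, c.nat_types)
        and n.bandwidth_mbps >= c.min_bandwidth_mbps
        and n.storage_gb >= c.min_storage_gb
    ]
```

The surviving nodes form the available node list that the scoring and splitting steps then work on.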

The node management unit 13 is used to receive the real-time node status and task status uploaded by each shared computing node 19 and provide them to the scheduling service unit 12 for scheduling.

The data warehouse 14 is used to receive the data that each shared computing node 19 generates and uploads while executing tasks, and to provide that data to the scheduling service unit 12 for scheduling.

The deployment service unit 15 is used to deliver the task to be deployed to the shared computing nodes 19 selected by the scheduling service unit 12.

The image repository 17 is used to receive the Docker image generated by the client 2 and to provide the Docker image to the shared computing nodes 19.

The shared computing node 19 is used to receive and execute the task deployed by the deployment service unit 15, download the corresponding Docker image from the image repository 17, start an image instance, and upload the real-time node status, the task status and the data generated on the node. In this embodiment, the shared computing node 19 downloads the Docker image from the image repository 17. In other embodiments, a shared computing node 19 can obtain a Docker image already downloaded by other shared computing nodes 19 through P2P transmission between shared computing nodes 19. Once a node has downloaded the Docker image, it can also transmit the image to other shared computing nodes over P2P.
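The two ways of obtaining an image described above, from the image repository or from peers over P2P, can be combined into a simple fallback. The order used here, peers first and then the repository, is an assumption made for illustration, and the `Fetcher` callables stand in for real transport code.

```python
from typing import Callable, Iterable, Optional

# A fetcher returns the image bytes, or None when it does not have the image.
Fetcher = Callable[[str], Optional[bytes]]


def fetch_image(image_ref: str, peers: Iterable[Fetcher], repository: Fetcher) -> bytes:
    """Try each peer in turn, then fall back to the image repository."""
    for peer in peers:
        data = peer(image_ref)
        if data is not None:
            return data
    data = repository(image_ref)
    if data is None:
        raise LookupError(f"image not found: {image_ref}")
    return data
```

Trying peers first is what would relieve download pressure on the server side. Integrity checks on peer-supplied images are left out of this sketch.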

Further, the shared computing system 1 also includes:

a signaling gateway 16, used to deliver the task deployed by the deployment service unit 15 to the corresponding shared computing node 19, and to receive the real-time node status and task status uploaded by the shared computing node 19 and send them to the node management unit 13;

a data gateway 18, used to transmit the Docker image to the shared computing node 19, and to receive the data that the shared computing node 19 uploads from the execution of Docker instances and pass it to the data warehouse 14.

The transmission of the above signaling and data is dynamically accelerated by a content delivery network (CDN).

Further, the shared computing node 19 includes a local signaling agent 190, a local data agent 192 and a Docker manager 194. The local signaling agent 190, local data agent 192 and Docker manager 194 deployed on each shared computing node 19 partition and manage the node's resources through virtualization, and collect the node status, the task status and the data generated on the node in real time.

The local signaling agent 190 is used to receive signaling (for example, a deployed task) from the signaling gateway 16, parse it and pass it to the Docker manager 194, and to upload the real-time node status and task status to the signaling gateway 16. The Docker manager 194 is used to download the Docker image according to the task received by the local signaling agent 190, and to load and start an image instance. The local data agent 192 is used to receive, via the data gateway 18, the Docker image downloaded from the image repository 17, or to obtain the Docker image from other shared computing nodes 19 through P2P transmission, and to upload the data generated while the Docker instance runs, such as results, logs and core dumps. This data can later serve as the node's historical data and be used as a reference when the scheduling service unit 12 performs scheduling. Once some shared computing nodes 19 have downloaded the Docker image, it can be spread over P2P through the local data agent 192, reducing the download bandwidth pressure on the data gateway 18.
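The receive, parse and hand-off role of the local signaling agent can be sketched as follows. The JSON message layout (`type`, `task_id`, `image`) is invented for illustration, and the `start_instance` callable stands in for the Docker manager rather than calling any real Docker API.

```python
import json
from typing import Callable, Dict


class LocalSignalingAgent:
    """Parses signaling messages and forwards deploy requests to a Docker manager stub."""

    def __init__(self, start_instance: Callable[[str, str], str]) -> None:
        # start_instance(task_id, image_ref) returns the resulting task state.
        self._start_instance = start_instance
        self.task_status: Dict[str, str] = {}

    def handle(self, raw: str) -> None:
        message = json.loads(raw)  # hypothetical wire format
        if message.get("type") != "deploy":
            raise ValueError(f"unsupported signaling type: {message.get('type')!r}")
        task_id = message["task_id"]
        self.task_status[task_id] = self._start_instance(task_id, message["image"])

    def status_report(self, node_id: str) -> str:
        """Build the status upload that would be sent back through the signaling gateway."""
        return json.dumps({"node_id": node_id, "tasks": self.task_status}, sort_keys=True)
```

Injecting the Docker manager as a callable keeps the agent testable without a container runtime.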

The shared computing system 1 provided by this embodiment can apply lightweight, Docker-based virtualization to resource-constrained home intelligent hardware and uniformly manage a Docker cluster composed of public network nodes on the order of millions, with cluster management and fault tolerance that span provinces and operators. Signaling and data transmission are dynamically accelerated by a CDN, and Docker images are distributed by P2P diffusion, which improves distribution efficiency and saves server-side bandwidth. The Docker image instances carried by the shared computing nodes 19 run in a public network environment, where a node's NAT type, operator and region can change dynamically; the scheduling service unit 12 continuously adds and removes nodes using the bin-packing algorithm, which keeps the total amount of resources stable.

Second Embodiment

Referring to FIG. 2, the second embodiment of the present invention provides a server 10.

The server 10 includes a memory 21, a processor 23, a network interface 25 and a communication bus 27. The network interface 25 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The communication bus 27 provides the connection and communication between these components.

The memory 21 includes at least one type of readable storage medium. The at least one type of readable storage medium may be a non-volatile storage medium such as flash memory, a hard disk, a multimedia card or card-type memory. In some embodiments, the memory 21 may be an internal storage unit of the server 10, such as a hard disk of the server 10. In other embodiments, the memory 21 may be an external storage unit of the server 10, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card fitted to the server 10.

The memory 21 can be used to store application software installed on the server 10 and various kinds of data, such as the program code of the scheduling program 20 for shared computing resources and the related data generated while it runs.

In some embodiments, the processor 23 may be a central processing unit, a microprocessor or another data processing chip, used to run the program code stored in the memory 21 or to process data.

FIG. 2 shows only a server 10 with components 21-27 and the scheduling program 20 for shared computing resources. It should be understood that FIG. 2 does not show all components of the server 10, and that more or fewer components may be implemented instead.

In the embodiment of the server 10 shown in FIG. 2, the memory 21, as a computer storage medium, stores the program code of the scheduling program 20 for shared computing resources. When the processor 23 executes this program code, the following method is implemented:

(1) Acquire a shared computing task to be executed.

(2) Acquire a list of all candidate shared computing nodes.

(3) Select, from the shared computing node list, the shared computing node 19 matching the shared computing task.

(4) Deliver the shared computing task to the shared computing node 19 matching the shared computing task.

For a detailed description of the above method, refer to the third embodiment below; it is not repeated here.

第三实施例Third Embodiment

参阅图3所示，本发明第三实施例提出一种共享计算资源的调度方法，应用于上述服务器10。在本实施例中，根据不同的需求，图3所示的流程图中的步骤的执行顺序可以改变，某些步骤可以省略。该方法包括：Referring to FIG. 3 , a third embodiment of the present invention proposes a scheduling method for sharing computing resources, which is applied to the above server 10 . In this embodiment, according to different requirements, the execution order of the steps in the flowchart shown in FIG. 3 can be changed, and some steps can be omitted. The method includes:

S20，获取待执行的共享计算任务。S20: Acquire a shared computing task to be executed.

在本实施例中，上述共享计算任务包括需要配置的共享计算资源的需求。上述共享计算资源的需求包括带宽需求、存储空间需求和计算资源需求中的至少一种。当用户在客户端2选择所需资源的规格和容量及待执行的程序逻辑后，客户端2根据上述程序逻辑自动生成Docker镜像，并将选择的所需资源封装成标准化任务。然后，客户端2将该任务递交至任务管理单元11，将该Docker镜像递交至镜像仓库17。任务管理单元11会根据优先级和关联度将接收到的任务排入多条并行化的流水线，调度服务单元12从该流水线中按顺序获取任务。In this embodiment, the above-mentioned shared computing tasks include requirements for shared computing resources that need to be configured. The above requirements for shared computing resources include at least one of bandwidth requirements, storage space requirements, and computing resource requirements. After the user selects the specification and capacity of the required resources and the program logic to be executed on the client 2, the client 2 automatically generates a Docker image according to the above program logic, and encapsulates the selected required resources into standardized tasks. Then, the client 2 submits the task to the task management unit 11 , and submits the Docker image to the image repository 17 . The task management unit 11 will arrange the received tasks into a plurality of parallelized pipelines according to the priority and the degree of association, and the scheduling service unit 12 obtains tasks from the pipelines in sequence.

S22，获取所有备选的共享计算节点列表。S22: Obtain a list of all candidate shared computing nodes.

在本实施例中，上述共享计算节点列表包括各共享计算节点19的ID、可用资源数据，上述可用资源数据可以根据各个共享计算节点19上传的节点实时状态、任务状态及节点上执行任务时产生的数据计算得到。节点管理单元13接收各个共享计算节点19上传的节点实时状态和任务状态并提供给调度服务单元12进行调度。数据仓库14接收各个共享计算节点19上传的产生的数据并提供给调度服务单元12进行调度。调度服务单元12选取节点需要依赖从节点管理单元13获取的全量节点的实时状态，以及从数据仓库14中获取的节点和任务的历史数据(例如节点的历史稳定性等)。In this embodiment, the above-mentioned shared computing node list includes the IDs and available resource data of each shared computing node 19 , and the above-mentioned available resource data can be generated according to the real-time status and task status of the nodes uploaded by each shared computing node 19 and when tasks are executed on the nodes. data is calculated. The node management unit 13 receives the node real-time status and task status uploaded by each shared computing node 19 and provides them to the scheduling service unit 12 for scheduling. The data warehouse 14 receives the generated data uploaded by each shared computing node 19 and provides it to the scheduling service unit 12 for scheduling. The selection of nodes by the scheduling service unit 12 needs to rely on the real-time status of all nodes obtained from the node management unit 13 and the historical data of nodes and tasks obtained from the data warehouse 14 (eg, historical stability of nodes, etc.).

S24: Select, from the shared computing node list, the shared computing nodes 19 that match the shared computing task.

The scheduling service unit 12 selects, from the shared computing node list, the shared computing nodes 19 that match the shared computing task, according to the requirements for the shared computing resources to be configured and the available resource data of each shared computing node 19. For example, the scheduling service unit 12 first obtains the list of all current candidate shared computing nodes, then splits the resource requirements of the task and, according to region, ISP, NAT type, bandwidth, storage space, computing resources and the like, selects the nodes that reach the preset values to form an available node list. Finally, it scores each shared computing node 19 in the available node list according to preset indicators such as regional resource surplus and historical stability and, taking resource cost into account, uses a bin-packing algorithm under the principle of maximizing resource utilization to split the task's shared computing resource requirements across the shared computing nodes 19 whose scores exceed a preset threshold, yielding the final matching node list. In addition, after the selected shared computing nodes 19 upload their node real-time status and task status (from which the current available resource data is obtained), the scheduling service unit 12 further determines whether nodes need to be added or removed.
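As a non-limiting illustration, the filter, score, and split sequence might be sketched as follows. The description names a bin-packing algorithm without specifying it, so this sketch substitutes a simple greedy split that fills the largest-capacity nodes first; it also handles only one resource dimension (bandwidth) and uses hypothetical equal weights in the score. `Node`, `filter_nodes`, `score`, and `split_demand` are illustrative names.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    node_id: str
    region: str
    isp: str
    bandwidth_mbps: float   # currently available bandwidth
    stability: float        # historical stability in the range 0..1


def filter_nodes(nodes: List[Node], region: str, isp: str,
                 min_bandwidth: float) -> List[Node]:
    """Keep the nodes that reach the preset values (the available node list)."""
    return [n for n in nodes
            if n.region == region and n.isp == isp
            and n.bandwidth_mbps >= min_bandwidth]


def score(node: Node, regional_surplus: Dict[str, float]) -> float:
    """Score a node; the 0.5/0.5 weights are an assumption of this sketch."""
    return 0.5 * node.stability + 0.5 * regional_surplus.get(node.region, 0.0)


def split_demand(nodes: List[Node], demand_mbps: float,
                 scores: Dict[str, float], threshold: float) -> Dict[str, float]:
    """Greedy stand-in for bin packing: fill the largest eligible nodes first."""
    eligible = [n for n in nodes if scores[n.node_id] > threshold]
    eligible.sort(key=lambda n: n.bandwidth_mbps, reverse=True)
    allocation: Dict[str, float] = {}
    remaining = demand_mbps
    for n in eligible:
        if remaining <= 0:
            break
        share = min(n.bandwidth_mbps, remaining)
        allocation[n.node_id] = share
        remaining -= share
    if remaining > 0:
        raise ValueError("eligible nodes cannot cover the requested demand")
    return allocation
```

Filling the largest nodes first keeps the number of matched nodes small, which is one plausible reading of maximizing resource utilization; a production scheduler would likely pack several resource dimensions at once.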

FIG. 3 is a schematic diagram of the detailed flow of step S24 above. The detailed flow includes:

S240: Obtain the available resource data of each shared computing node 19 in the shared computing node list.

S242: Select, from the shared computing node list, the shared computing nodes 19 whose available resource data reaches a preset value, and generate an available node list.

S244: Score each shared computing node 19 in the available node list according to preset indicators, and use a bin-packing algorithm to split the shared computing resource requirements to be configured for the task across the shared computing nodes 19 whose scores exceed a preset threshold, obtaining the final matching node list.

S246: Periodically obtain the current available resource data of the selected shared computing nodes 19.

S248: Determine, according to the shared computing resource requirements and the current available resource data of the shared computing nodes 19, whether nodes need to be added or removed. For example, nodes may need to be added or removed when a node goes online or offline, when its NAT type or carrier changes, when its disk storage changes, or when its task load changes.
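As a non-limiting illustration, the periodic check of steps S246 and S248 might be sketched as a reconciliation pass over the matched nodes. The criteria below (a node is dropped when it is offline, its carrier no longer matches, or its NAT type is not allowed, and spare nodes are added largest first to cover any shortfall) are assumptions of this sketch, as are the names `NodeState` and `reconcile`; only bandwidth is considered.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class NodeState:
    """Latest state reported by a node (assumed fields)."""
    node_id: str
    online: bool
    isp: str
    nat_type: str
    bandwidth_mbps: float   # currently available bandwidth


def reconcile(matched: Dict[str, float], current: Dict[str, NodeState],
              required_isp: str, allowed_nat: Set[str], demand_mbps: float,
              spares: List[NodeState]) -> Tuple[Dict[str, float], List[str], List[str]]:
    """Return (new allocation, added node IDs, removed node IDs)."""
    def usable(state) -> bool:
        return (state is not None and state.online
                and state.isp == required_isp
                and state.nat_type in allowed_nat)

    kept: Dict[str, float] = {}
    removed: List[str] = []
    for node_id, share in matched.items():
        state = current.get(node_id)
        # A node keeps at most what it can still provide.
        capacity = min(share, state.bandwidth_mbps) if usable(state) else 0.0
        if capacity > 0:
            kept[node_id] = capacity
        else:
            removed.append(node_id)

    # Cover any shortfall from spare nodes, largest available capacity first.
    shortfall = demand_mbps - sum(kept.values())
    added: List[str] = []
    for spare in sorted(spares, key=lambda s: s.bandwidth_mbps, reverse=True):
        if shortfall <= 0:
            break
        if spare.node_id in kept or not usable(spare):
            continue
        share = min(spare.bandwidth_mbps, shortfall)
        if share <= 0:
            continue
        kept[spare.node_id] = share
        added.append(spare.node_id)
        shortfall -= share
    return kept, added, removed
```

Calling such a function on a timer, with `current` refreshed from the latest node reports, would keep the total allocated resources close to the task's requirement as individual nodes fluctuate.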

S26: Deliver the shared computing task to the shared computing nodes 19 that match the shared computing task.

After the scheduling service unit 12 has selected the shared computing nodes 19, it can allocate the task obtained from the task management unit 11 among the selected shared computing nodes 19; the deployment service unit 15 then delivers to each selected shared computing node 19 the portion of the task allocated to it.

The shared computing node 19 receives and executes the delivered task: it downloads the corresponding Docker image from the image repository 17, starts an image instance, and uploads the node real-time status, the task status, and the data generated on the node.

The scheduling method for shared computing resources provided by this embodiment can apply lightweight Docker-based virtualization to resource-constrained home smart hardware and uniformly manage a Docker cluster composed of millions of public-network nodes, with cluster management and fault tolerance across provinces and carriers. The Docker image instances hosted by the shared computing nodes 19 run in a public-network environment, where the NAT type, carrier, and region of a node change dynamically. The scheduling service unit 12 continuously adds and removes nodes by means of the bin-packing algorithm, which keeps the total amount of resources stable.

Fourth Embodiment

The present invention further provides another embodiment, namely a computer-readable storage medium. The computer-readable storage medium stores a scheduling program 20 for shared computing resources, and the scheduling program 20 can be executed by at least one processor so that the at least one processor performs the scheduling method for shared computing resources described above.

From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, and can of course also be implemented by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a client device (which may be a mobile phone, a computer, an electronic device, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of the present invention.

The above are only preferred embodiments of the present invention and do not thereby limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims

1. A scheduling method for shared computing resources, wherein the method comprises:

obtaining, from an image repository, a Docker image generated according to a shared computing task to be executed;

obtaining a list of all candidate shared computing nodes;

selecting, from the shared computing node list, shared computing nodes matching the shared computing task, including: selecting, from the shared computing node list, the shared computing nodes whose available resource data reaches a preset value to generate an available node list; scoring each shared computing node in the available node list according to preset indicators; using a bin-packing algorithm to split the shared computing resource requirements to be configured for the shared computing task across the shared computing nodes whose scores exceed a preset threshold, to obtain a matching node list; periodically obtaining the current available resource data of the shared computing nodes in the matching node list; determining, according to the shared computing resource requirements and the available resource data, whether a node addition or deletion operation needs to be performed on the shared computing nodes in the matching node list; and, when it is determined that an addition or deletion operation needs to be performed, performing the addition or deletion operation on the shared computing nodes in the matching node list;

transmitting the Docker image to the matching shared computing nodes by means of CDN dynamic acceleration, and receiving the node real-time status, the task status, and the data generated on the node that are returned by the matching shared computing nodes after they download the Docker image and load and start an image instance, wherein: after any one of the matching shared computing nodes completes the download of the Docker image, the downloaded Docker image is propagated via P2P to the other matching shared computing nodes through a local data agent.

2. The scheduling method for shared computing resources according to claim 1, wherein the shared computing node list includes the ID and available resource data of each shared computing node.

3. The scheduling method for shared computing resources according to claim 1 or 2, wherein the requirements for the shared computing resources include at least one of: bandwidth requirements, storage space requirements, and computing resource requirements.

4. The scheduling method for shared computing resources according to claim 1 or 2, wherein the available resource data in the shared computing node list is calculated from the node real-time status, the task status, and the data generated when tasks are executed on the node, as uploaded by each shared computing node.

5. The scheduling method for shared computing resources according to claim 1, wherein the preset indicators include regional resource surplus and historical stability.

6. A server, wherein the server comprises a memory and a processor, the memory stores a scheduling program for shared computing resources that can run on the processor, and the scheduling program for shared computing resources, when executed by the processor, implements the method according to any one of claims 1-5.

7. A shared computing system, wherein the system comprises:

a task management unit, configured to receive, from the image repository, the Docker image generated by the client according to the shared computing task to be executed, and to distribute the Docker image to the scheduling service unit;

the scheduling service unit, configured to obtain the Docker image from the task management unit, to obtain a list of all candidate shared computing nodes according to the status and historical data of each shared computing node provided by the node management unit and the data warehouse, and to select, from the shared computing node list, shared computing nodes matching the shared computing task, including: selecting, from the shared computing node list, the shared computing nodes whose available resource data reaches a preset value to generate an available node list; scoring each shared computing node in the available node list according to preset indicators; using a bin-packing algorithm to split the shared computing resource requirements to be configured for the shared computing task across the shared computing nodes whose scores exceed a preset threshold, to obtain a matching node list; periodically obtaining the current available resource data of the shared computing nodes in the matching node list; determining, according to the shared computing resource requirements and the available resource data, whether a node addition or deletion operation needs to be performed on the shared computing nodes in the matching node list; and, when it is determined that an addition or deletion operation needs to be performed, performing the addition or deletion operation on the shared computing nodes in the matching node list;

a deployment service unit, configured to deliver the Docker image to the matching shared computing nodes by means of CDN dynamic acceleration;

the node management unit, configured to receive the node real-time status, the task status, and the data generated on the node that are returned after the matching shared computing nodes have downloaded the Docker image and loaded and started an image instance;

the data warehouse, configured to receive the data generated on each shared computing node;

wherein: after any one of the matching shared computing nodes completes the download of the Docker image, the downloaded Docker image is propagated via P2P to the other matching shared computing nodes through a local data agent.

8. A storage medium storing a scheduling program for shared computing resources, the scheduling program for shared computing resources being executable by at least one processor so that the at least one processor performs the scheduling method for shared computing resources according to any one of claims 1-5.