CN116866440B

CN116866440B - Cluster node selection scheduling method and device, electronic equipment and storage medium

Info

Publication number: CN116866440B
Application number: CN202311133826.0A
Authority: CN
Inventors: 王斌; 荆荣讯
Original assignee: Suzhou Inspur Intelligent Technology Co Ltd
Current assignee: Suzhou Inspur Intelligent Technology Co Ltd
Priority date: 2023-09-05
Filing date: 2023-09-05
Publication date: 2024-01-19
Anticipated expiration: 2043-09-05
Also published as: CN116866440A

Abstract

The embodiment of the invention provides a cluster node selection scheduling method, a cluster node selection scheduling device, electronic equipment and a storage medium, and relates to the technical field of computer systems and storage; comprising the following steps: acquiring node data from a cluster platform; constructing a node scheduling topological graph according to the node data, and determining a node task flow index according to the node data; constructing an updated virtual cluster topology view according to the node scheduling topology map, wherein the updated virtual cluster topology view comprises a plurality of areas, and each area corresponds to a node list; responding to a cluster concurrent operation request, and determining a target area from a virtual cluster topology view according to a node task flow index; sequentially reading nodes in a node list of the target area, determining target nodes, and responding to the job request based on the target nodes; the embodiment of the invention can perform rapid and efficient scheduling processing in a large-scale cluster scene, and meets the product demand target in a multi-service scene.

Description

Cluster node selection scheduling method and device, electronic equipment and storage medium

Technical Field

The present invention relates to the field of computer systems and storage technologies, and in particular, to a cluster node selection scheduling method, a cluster node selection scheduling device, an electronic device, and a storage medium.

Background

At present, scheduling strategies for large-scale clusters either consider only the optimization algorithm for scheduling a large number of submitted jobs, or take the large number of cluster nodes into account only crudely: candidate nodes are randomly sampled and the scheduling algorithm selects the best node from that sample, so the selected node is only locally optimal, not globally optimal. Little consideration is given to the resource and environmental characteristics of the large-scale cluster nodes. Traversing all nodes makes the latency of scheduling a single job long and reduces scheduling performance, while scheduling over a random subset of nodes biases the search range so that the optimal node may not be selected. It can be seen that support for large-scale cluster scenarios is imperfect, and scheduling performance is greatly affected.

Disclosure of Invention

In view of the foregoing, embodiments of the present invention are presented to provide a cluster node selection scheduling method, a cluster node selection scheduling apparatus, an electronic device, and a storage medium that overcome or at least partially solve the foregoing problems.

In order to solve the above-mentioned problems, in a first aspect of the present invention, an embodiment of the present invention discloses a cluster node selection scheduling method, where a cluster is deployed on a cluster platform, and the method includes:

Acquiring node data from the cluster platform;

constructing a node scheduling topological graph according to the node data, and determining a node task flow index according to the node data;

constructing an updated virtual cluster topology view according to the node scheduling topology map, wherein the updated virtual cluster topology view comprises a plurality of areas, and each area corresponds to a node list;

responding to the concurrent job request of the cluster, and determining a target area from the virtual cluster topology view according to the node task flow index;

and sequentially reading nodes in the node list of the target area, determining a target node, and responding to the job request based on the target node.

Optionally, the method further comprises:

in response to the remaining node resources of the target area being fewer than the resources required by the job request, scheduling the job request to another area, the other area being an area of the plurality of areas other than the target area;

and executing the steps of sequentially reading the nodes in the node list of the target area by adopting the other areas as the target area, determining a target node and responding to the job request based on the target node.

Optionally, the step of constructing a node scheduling topology according to the node data includes:

determining node topology configuration data according to the node data, wherein the node topology configuration data comprises a node topology hierarchical structure;

constructing a node physical topology structure diagram according to the node topology hierarchical structure;

and filtering the node physical topological structure diagram to generate the node scheduling topological diagram.

Optionally, the step of determining node topology configuration data from the node data comprises:

determining a node topology type based on the node data, wherein the node topology type corresponds to a node topology level identifier;

determining a node topology distance based on the node data;

and combining the node topology type and the node topology distance to determine the node topology configuration data.

Optionally, the step of filtering the physical topology structure diagram of the node and generating the node scheduling topology diagram includes:

extracting available schedulable nodes in the node physical topology structure diagram;

and constructing the node scheduling topological graph based on the available schedulable nodes.

Optionally, the node data includes multi-service dimension information, and the step of determining the node task traffic index according to the node data includes:

And calculating the weighted sum value of the multi-service dimension information, normalizing the weighted sum value, and determining the node task flow index.

Optionally, the multi-service dimension information includes the current job network traffic of the node, the job count, the data set, and the image cache state; the node task flow index includes a service load; and the step of calculating the weighted sum value of the multi-service dimension information, normalizing the weighted sum value, and determining the node task flow index includes:

calculating the weighted sum value of the current job network traffic of the node, the job count, the data set, and the image cache state, normalizing the weighted sum value, and determining the node service load.

Optionally, the method further comprises:

and determining the business affinity between the nodes based on the node type and the node task flow index.

Optionally, the method further comprises:

and carrying out coding processing on the multidimensional resource vector value of the job request.

Optionally, the step of encoding the multidimensional resource vector value of the job request includes:

performing numerical coding on the resource vector characteristic value of any dimension in the multi-dimensional resource vector values;

And combining the numerical codes according to the front and back dimensions to generate a first total characteristic numerical value.

Optionally, the method further comprises:

and performing coding mapping on the resources corresponding to the nodes of the virtual cluster topology view to generate node resource characteristic values.

Optionally, the method further comprises:

and performing coding mapping on the multidimensional features corresponding to the nodes of the virtual cluster topology view to generate a second total feature value.

Optionally, the step of determining, in response to the job request concurrent with the cluster, a target area from the virtual cluster topology view according to the node task traffic index includes:

dynamically planning nodes in the virtual cluster topology view to generate the region;

based on the resource characteristic value and the node service load, ordering the node list of the region to generate a region node sequence;

and based on the job request, associating the areas and determining the target area.

Optionally, the step of dynamically planning the nodes in the virtual cluster topology view and generating the area includes:

determining an initial region, and determining a region score based on the nodes in the initial region and the initial region;

Determining a central node according to the region score;

the region is generated based on the central node.

Optionally, the nodes in the initial area correspond to a location affinity and a service affinity, and the step of determining the region score based on the nodes in the initial area and the initial area includes:

calculating a weighted sum of the location affinity and the business affinity;

calculating the ratio of the weighted sum value to the traffic load;

and determining the ratio as the region score.

Optionally, the step of associating the areas based on the job request, and determining the target area includes:

associating the areas based on the job type of the job request, and determining the target area.

Optionally, the step of sequentially reading nodes in the node list of the target area and determining the target node includes:

reading nodes in the sequence of the regional nodes one by one according to the sequence;

and when the resources corresponding to the nodes meet the job request, determining the nodes as target nodes.

In a second aspect of the present invention, an embodiment of the present invention discloses a cluster node selection scheduling apparatus, where the cluster is deployed on a cluster platform, and includes:

The first acquisition module is used for acquiring node data from the cluster platform;

the first construction module is used for constructing a node scheduling topological graph according to the node data and determining a node task flow index according to the node data;

the second construction module is used for constructing an updated virtual cluster topology view according to the node scheduling topology map, wherein the updated virtual cluster topology view comprises a plurality of areas, and each area corresponds to a node list;

the target area determining module is used for responding to the concurrent job request of the cluster and determining a target area from the virtual cluster topology view according to the node task flow index;

and the request response module is used for sequentially reading the nodes in the node list of the target area, determining the target node and responding to the job request based on the target node.

In a third aspect of the present invention, an embodiment of the present invention discloses an electronic device, including a processor, a memory, and a computer program stored on the memory and capable of running on the processor, the computer program implementing the steps of the cluster node selection scheduling method as described above when executed by the processor.

In a fourth aspect of the present invention, embodiments of the present invention disclose a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of a cluster node selection scheduling method as described above.

The embodiment of the invention has the following advantages:

according to the embodiment of the invention, the node data are acquired from the cluster platform; constructing a node scheduling topological graph according to the node data, and determining a node task flow index according to the node data; constructing an updated virtual cluster topology view according to the node scheduling topology map, wherein the updated virtual cluster topology view comprises a plurality of areas, and each area corresponds to a node list; responding to the concurrent job request of the cluster, and determining a target area from the virtual cluster topology view according to the node task flow index; sequentially reading nodes in a node list of the target area, determining a target node, and responding to the job request based on the target node; the area of service execution is determined by dynamically dividing the virtual cluster scheduling topological graph, and the current node can be allocated to the job as long as the nodes in the area can be scheduled, so that the concurrent scheduling of the job in a large-scale cluster scene and the rapid allocation of the nodes in the area can be realized.

Drawings

FIG. 1 is a flow chart of steps of an embodiment of a cluster node selection scheduling method of the present invention;

FIG. 2 is a flow chart of steps of another embodiment of a cluster node selection scheduling method of the present invention;

FIG. 3 is a schematic diagram of numerical encoding of an example cluster node selection scheduling method of the present invention;

FIG. 4 is a schematic diagram of an initial area of an example cluster node selection scheduling method of the present invention;

FIG. 5 is a schematic diagram of a central node after initial area update of an exemplary cluster node selection scheduling method of the present invention;

FIG. 6 is a schematic diagram of a sequence of regional nodes of an example cluster node selection scheduling method of the present invention;

FIG. 7 is a hardware schematic of an example cluster node selection scheduling method of the present invention;

FIG. 8 is a block diagram illustrating an embodiment of a cluster node selection scheduler of the present invention;

FIG. 9 is a block diagram of an electronic device according to an embodiment of the present invention;

FIG. 10 is a block diagram of a storage medium according to an embodiment of the present invention.

Detailed Description

In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.

Referring to fig. 1, a flowchart illustrating steps of an embodiment of a cluster node selection scheduling method according to the present invention is shown, where in the embodiment of the present invention, the clusters are deployed on a cluster platform, and information of the clusters is collected and stored in the cluster platform. The cluster node selection scheduling method specifically comprises the following steps:

step 101, acquiring node data from the cluster platform;

node data for each node in the cluster may be obtained from the cluster platform, including but not limited to nodes, jobs, job tasks, traffic monitoring data, and the like.

Step 102, constructing a node scheduling topological graph according to the node data, and determining a node task flow index according to the node data;

the relation among the nodes can be determined according to the node data, and a node scheduling topological graph is constructed. Wherein the node scheduling topology is a topology of nodes at a resource scheduling level. And the node task flow index of the node can be calculated according to the node data, so that the node can be accurately distributed.

Step 103, constructing an updated virtual cluster topology view according to the node scheduling topology map, wherein the updated virtual cluster topology view comprises a plurality of areas, and each area corresponds to a node list;

An updated virtual cluster topology view may be constructed from the node scheduling topology, the updated virtual cluster topology view including a plurality of regions, each region corresponding to a list of nodes. Distribution nodes within the region are determined by updating the virtual cluster topology view.

Step 104, responding to the concurrent job request of the cluster, and determining a target area from the virtual cluster topology view according to the node task flow index;

after the virtual cluster topology view is updated, a target area can be determined from the virtual cluster topology view according to the task flow index of the nodes in response to the concurrent job request of the clusters, and the request is distributed to the area.

Step 105, sequentially reading nodes in the node list of the target area, determining a target node, and responding to the job request based on the target node.

Then, in the target area, the nodes in the node list of the target area can be sequentially read, and the condition of the nodes is determined to determine the target nodes in the target area. And responds to the job request based on the target node to effect processing of the job.

Referring to fig. 2, a flowchart illustrating steps of another embodiment of a cluster node selection scheduling method according to the present invention, where a cluster is deployed on a cluster platform, the cluster node selection scheduling method specifically may include the following steps:

step 201, obtaining node data from the cluster platform;

the node data such as node cache information, job resources, job task information, service flow monitoring data and the like of the cluster can be obtained from the cluster platform.

Step 202, constructing a node scheduling topological graph according to the node data, and determining a node task flow index according to the node data;

The node physical topological graph can be constructed based on the obtained node data, and determining the node task flow index from the node data provides a quantitative parameter for the subsequent area selection.

In an optional embodiment of the invention, the step of constructing a node scheduling topology according to the node data includes:

step S2021, determining node topology configuration data according to the node data, wherein the node topology configuration data comprises a node topology hierarchical structure;

the node topology configuration data, which is information of the node itself, may be determined according to the node data, wherein the node topology configuration data includes a node topology hierarchy. The node topology hierarchy may include: zone (zone), switch (switch), node (node), etc.

Specifically, the step of determining node topology configuration data according to the node data includes: determining a node topology type based on the node data, wherein the node topology type corresponds to a node topology level identifier; determining a node topology distance based on the node data; and combining the node topology type and the node topology distance to determine the node topology configuration data.

First, the node topology type is determined from the node data and a node topology level identifier is set; then the node topology distance is set. For example, the distance from a switch to a node beneath it is 10: {switch->node: 10}. The node topology hierarchy is as follows: zone->[switch->switch->...->switch]->node; the switch layer may have multiple levels, configured according to the actual data center network topology. The node topology type and the node topology distance are then combined to determine the node topology configuration data.

Step S2021, constructing a node physical topology structure diagram according to the node topology hierarchical structure;

and constructing the physical topology of the nodes based on the node topology level in the obtained node topology configuration data, and generating an original node physical topology structure diagram.

Step S2022, filtering the physical topological structure diagram of the node to generate the node scheduling topological diagram.

The node physical topological structure diagram is filtered to generate the node scheduling topological diagram at the resource scheduling level.

Specifically, the step of filtering the physical topology structure diagram of the node and generating the node scheduling topology diagram includes: extracting available schedulable nodes in the node physical topology structure diagram; and constructing the node scheduling topological graph based on the available schedulable nodes.

In practical application, the available schedulable nodes in the node physical topological structure diagram can be extracted and the unavailable nodes removed, and the node scheduling topological diagram is then constructed from the available schedulable nodes. The node scheduling topology tree structure is as follows:

topoView：zones->zone->switchs->switch->nodes->node: distance

The switch layer may have multiple levels, such as layer-3 core switches and layer-2 switches.
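The filtering step above can be illustrated with a minimal Python sketch. The class names, the `schedulable` flag, and the sample topology are assumptions made for illustration only; the source does not specify how availability is recorded, and the sketch uses a single switch level rather than the multi-level switch layer.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    schedulable: bool = True  # hypothetical availability flag
    distance: int = 10        # switch->node distance, as in {switch->node: 10}


@dataclass
class Switch:
    name: str
    nodes: list = field(default_factory=list)


@dataclass
class Zone:
    name: str
    switches: list = field(default_factory=list)


def build_scheduling_topology(zones):
    """Copy the physical topology, keeping only available schedulable nodes."""
    result = []
    for zone in zones:
        kept_switches = []
        for sw in zone.switches:
            kept_nodes = [n for n in sw.nodes if n.schedulable]
            if kept_nodes:  # drop switches that have no schedulable node left
                kept_switches.append(Switch(sw.name, kept_nodes))
        if kept_switches:
            result.append(Zone(zone.name, kept_switches))
    return result


# Hypothetical physical topology: zone -> switch -> node
physical = [Zone("zone-1", [
    Switch("switch-1", [Node("node1"), Node("node2", schedulable=False)]),
    Switch("switch-2", [Node("node3", schedulable=False)]),
])]
topo_view = build_scheduling_topology(physical)
```

In this sample, only `switch-1` with `node1` survives the filter; the physical structure itself is left untouched so it can be re-filtered when node availability changes.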

In an optional embodiment of the invention, the node data includes multi-service dimension information, and the step of determining the node task traffic index according to the node data includes:

step S2023, calculating the weighted sum value of the multi-service dimension information, normalizing the weighted sum value, and determining the node task traffic index.

In the embodiment of the invention, the weight corresponding to each dimension in the multi-service dimension information and the single service dimension parameter can be weighted and summed to obtain a weighted sum value, and the weighted sum value is normalized to determine the node task flow index.

Specifically, the multi-service dimension information includes the current job network traffic of the node, the job count, the data set, and the image cache state, and the node task flow index includes a service load. The step of calculating the weighted sum value of the multi-service dimension information, normalizing the weighted sum value, and determining the node task flow index includes: calculating the weighted sum value of the current job network traffic of the node, the job count, the data set, and the image cache state, normalizing the weighted sum value, and determining the node service load.

Based on current or short-term monitoring statistics of the platform tasks, the weighted sum value of the node's current job network traffic, job count, data set, and image cache state is calculated, and the weighted sum value is then normalized to determine the node service load. The service load of the node is used as the node task flow index.
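As a rough illustration of the weighted sum and normalization described above, the following Python sketch computes a service load per node. The weights, the metric keys, the numeric encoding of the data set and image cache state, and the choice of min-max normalization across nodes are all assumptions; the source names the four inputs but not the weights or the normalization method.

```python
def weighted_sum(metrics, weights):
    """Weighted sum over the service dimensions listed in `weights`."""
    return sum(weights[key] * metrics[key] for key in weights)


def service_loads(nodes_metrics, weights):
    """Min-max normalize each node's weighted sum into [0, 1]."""
    raw = {name: weighted_sum(m, weights) for name, m in nodes_metrics.items()}
    lo, hi = min(raw.values()), max(raw.values())
    span = hi - lo
    # If all nodes have the same raw value, report a load of 0.0 for each.
    return {name: (v - lo) / span if span else 0.0 for name, v in raw.items()}


# Hypothetical weights and monitoring data
weights = {"network_traffic": 0.4, "job_count": 0.3,
           "dataset_cache": 0.2, "image_cache": 0.1}
nodes_metrics = {
    "node1": {"network_traffic": 10, "job_count": 1, "dataset_cache": 0, "image_cache": 0},
    "node2": {"network_traffic": 50, "job_count": 4, "dataset_cache": 1, "image_cache": 1},
    "node3": {"network_traffic": 90, "job_count": 8, "dataset_cache": 1, "image_cache": 1},
}
loads = service_loads(nodes_metrics, weights)
```

With this data the least loaded node maps to 0.0 and the most loaded to 1.0, which makes the value directly comparable across nodes when regions are scored or sorted.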

Specifically, the node data includes a node type, and the step of determining the node task traffic index according to the node data further includes: and determining the business affinity between the nodes based on the node type and the node task flow index.

The business affinity between nodes can be determined based on the node type and the node task flow index. For example, if two nodes both have high-speed network cards and high-performance disks and their index data are the same or close, the business affinity between the two nodes is high. If one node has a 100-megabit network card and the other has a gigabit network card, the two nodes are considered far apart and the business affinity between them is low.

Furthermore, in an alternative embodiment of the present invention, the method further comprises:

Step S1, performing coding processing on the multidimensional resource vector value of the job request.

Because of the complexity of using resources by jobs in today's platforms, a variety of resource types and amounts may be requested. In order to increase the processing speed, the multidimensional resource vector value of the job request can be subjected to coding processing so as to reduce the dimensional complexity of the resource.

Specifically, the step of encoding the multidimensional resource vector value of the job request includes: performing numerical coding on the resource vector characteristic value of any dimension in the multi-dimensional resource vector values; and combining the numerical codes according to the front and back dimensions to generate a first total characteristic numerical value.

The resource vector characteristic value of each dimension may be numerically encoded. In the example of FIG. 3, the characteristic value of each dimension is encoded as a fixed-width 3-digit decimal number. The codes of the dimensions are then concatenated in dimension order to form a total characteristic value bitmap, namely the first total characteristic value. As shown in FIG. 3, the values of 4 different resource types are mapped to 004, 003, 001, 001 by this encoding, so the total characteristic value can be represented as the 12-digit value 004003001001, which becomes 4003001001 once the leading zeros are omitted. The bitmap illustrated here is a numerical bitmap, and numerical operations can be performed on it directly.
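The encoding in the FIG. 3 example can be sketched in Python as follows. The function names and the range check are illustrative assumptions; only the fixed-width 3-digit decimal layout and the sample values come from the text.

```python
def encode_resource_vector(values, digits=3):
    """Pack per-dimension values into one integer, `digits` decimal digits each."""
    base = 10 ** digits
    total = 0
    for v in values:
        if not 0 <= v < base:
            raise ValueError(f"value {v} does not fit in {digits} decimal digits")
        total = total * base + v  # earlier dimensions occupy the higher digits
    return total


def decode_resource_vector(code, dims, digits=3):
    """Inverse of encode_resource_vector for a vector with `dims` dimensions."""
    base = 10 ** digits
    values = []
    for _ in range(dims):
        code, v = divmod(code, base)
        values.append(v)
    return values[::-1]


first_total = encode_resource_vector([4, 3, 1, 1])
```

Here `first_total` is the integer 4003001001, matching the example above. Storing the value as an integer drops the leading zeros automatically, and the decode helper recovers the per-dimension values when they are needed individually.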

Furthermore, in an alternative embodiment of the present invention, the method may further include:

Step S2, performing coding mapping on the resources corresponding to the nodes of the virtual cluster topology view to generate node resource characteristic values.

The same coding mapping used to calculate the first total characteristic value can be applied to the resources corresponding to the nodes of the virtual cluster topology view, that is, the various remaining resources of each node, to calculate the node resource characteristic value.

In an alternative embodiment of the present invention, the method may further include:

Step S3, performing coding mapping on the multidimensional features corresponding to the nodes of the virtual cluster topological view to generate a second total feature value.

The coding mapping used in calculating the first total feature value is applied to the multidimensional features corresponding to the nodes of the virtual cluster topology view to calculate a second total feature value. For example, the multidimensional features of a node are: {the quantity of resources of each dimension type, the computing network type, the storage type, the computing resource type, …}. The resource and feature bitmaps of the nodes are processed into the total feature vector of the zone, which is the aggregate of the resource types and the maximum quantity of each resource among the nodes the zone contains. The bitmap here contains both a numerical bitmap and a type tag bitmap. This achieves dimension reduction for feature values such as the job request and the node resource information. When a region is selected for a job request, correlation matching can be computed between the job request resource vector and the total feature vector of the region, so that the region can be selected quickly.
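One possible reading of the zone total feature vector is sketched below in Python: the per-dimension maximum over the zone's nodes, plus the union of their type tags. The dictionary layout, the tag names, and the matching rule are assumptions made for illustration; the source does not define the correlation calculation.

```python
def region_feature(nodes):
    """Aggregate a zone: per-dimension resource maximum and union of type tags."""
    dims = len(nodes[0]["resources"])
    max_resources = [max(n["resources"][i] for n in nodes) for i in range(dims)]
    tags = set().union(*(n["tags"] for n in nodes))
    return max_resources, tags


def region_matches(feature, request, required_tags):
    """Quick pre-check: could some node in the zone possibly satisfy the request?"""
    max_resources, tags = feature
    fits = all(m >= r for m, r in zip(max_resources, request))
    return fits and required_tags <= tags


# Hypothetical zone with two nodes
zone_a = [
    {"resources": [2, 16, 4, 0], "tags": {"rdma"}},
    {"resources": [8, 8, 2, 2], "tags": {"ssd"}},
]
feature_a = region_feature(zone_a)
ok = region_matches(feature_a, [4, 3, 1, 1], {"ssd"})
```

Because the maximum of each dimension may come from a different node, a match here is only a necessary condition. It rules zones out cheaply, and the per-node check inside the chosen zone still decides whether a node actually fits.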

Step 203, constructing an updated virtual cluster topology view according to the node scheduling topology map, wherein the updated virtual cluster topology view comprises a plurality of areas, and each area corresponds to a node list;

an updated virtual cluster topology view can be constructed based on the node scheduling topology map for job allocation, the updated virtual cluster topology view comprises a plurality of areas, and each area corresponds to a node list; the node list records the resource information of each node in the area.

In an optional embodiment of the invention, the step of determining, in response to the job request concurrent with the cluster, a target area from the virtual cluster topology view according to the node task traffic index includes:

step S2031, dynamically planning nodes in the virtual cluster topology view, and generating the region;

and dynamically planning the nodes in the virtual cluster topology view, and dynamically dividing the cluster area.

Specifically, the step of dynamically planning the nodes in the virtual cluster topology view and generating the region includes: determining an initial region, and determining a region score based on the nodes in the initial region and the initial region; determining a central node according to the region score; and generating the region based on the central node.

The initial areas are set first. Referring to FIG. 4, the three initial areas are:

Zone-A: switch-1->[node1, node2, node3], with regional center node node1;

Zone-B: switch-2->[node4, node5, node6], with regional center node node4;

Zone-C: switch-3->[node7, node8, node9], with regional center node node7.

Based on the job association between each node in the initial region and the initial region, a region score is determined, and the center node of the region is iteratively updated. The virtual cluster areas are then generated based on the central nodes. As shown in FIG. 5, the virtual cluster areas are:

Zone-A: [node2, node3, node6], with zone center node node2;

Zone-B: [node1, node4, node5, node6], with zone center node node4;

Zone-C: [node7, node8, node9], with zone center node node7;

where node6 belongs to both Zone-A and Zone-B.

Further, the nodes in the initial area correspond to a location affinity and a traffic affinity, and the step of determining the region score based on the nodes in the initial area and the initial area includes: calculating a weighted sum of the location affinity and the business affinity; calculating the ratio of the weighted sum value to the traffic load; and determining the ratio as the region score.

The region score can be quantified using the formula:

zoneScore = (locationAffinity + trafficAffinity) / trafficLoad

zoneScore is a linear regression multi-factor model that captures the influence of locationAffinity, trafficAffinity and trafficLoad.

locationAffinity: the link distance between nodes in the area is judged according to the physical topological view. The shorter the link distance, the higher the locationAffinity value, and the more readily the nodes are regarded, by topological position, as belonging to the same region.

trafficAffinity: the service affinity between nodes is judged according to the historical service data traffic and the number of tasks run between the nodes. The higher the trafficAffinity value, the more frequently the nodes have historically run distributed jobs together, and the more readily they are divided into the same region.

trafficLoad: the node service load is judged according to the current service data traffic of the node. The higher the trafficLoad value, the greater the load; dividing such a node into a region increases traffic congestion there.
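The zoneScore formula can be sketched in Python as follows. The weights `w_location` and `w_traffic`, the `eps` guard against a zero load, the candidate values, and the rule of picking the highest-scoring node as the center are illustrative assumptions; with the default weights of 1.0 the function reduces to the formula as printed above.

```python
def zone_score(location_affinity, traffic_affinity, traffic_load,
               w_location=1.0, w_traffic=1.0, eps=1e-9):
    """Weighted sum of the two affinities divided by the traffic load."""
    weighted = w_location * location_affinity + w_traffic * traffic_affinity
    return weighted / max(traffic_load, eps)  # eps avoids division by zero


def pick_center(candidates):
    """Return the candidate with the highest zone score.

    `candidates` maps a node name to a tuple of
    (locationAffinity, trafficAffinity, trafficLoad).
    """
    return max(candidates, key=lambda name: zone_score(*candidates[name]))


# Hypothetical per-node values for one initial region
candidates = {
    "node1": (0.9, 0.2, 0.8),
    "node2": (0.8, 0.7, 0.3),
    "node3": (0.6, 0.5, 0.5),
}
center = pick_center(candidates)
```

In this made-up data node2 has high affinity and low load, so it scores highest and becomes the center. Dividing by the load means a heavily loaded node is penalized even when its affinities are high.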

Step S2032, sorting the node list of the area based on the resource feature value and the node traffic load, to generate a region node sequence;

The node list of the region is sorted according to the resource feature value and the node service load to generate the region node sequence. As shown in FIG. 6, a plurality of region node sequences are generated, each region node sequence including a plurality of nodes.
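A minimal Python sketch of building one region node sequence is shown below. The sort key is an assumption: the source states only that the list is ordered by resource feature value and service load, so the sketch puts the lowest load first and breaks ties by the larger encoded resource value. The field names and sample values are hypothetical.

```python
from collections import deque


def build_region_sequence(nodes):
    """Order a region's node list and return it as a queue of node names."""
    ordered = sorted(nodes, key=lambda n: (n["load"], -n["resource_code"]))
    return deque(n["name"] for n in ordered)


# Hypothetical node list for one region
zone_a_nodes = [
    {"name": "node2", "load": 0.5, "resource_code": 8016004002},
    {"name": "node3", "load": 0.2, "resource_code": 4003001001},
    {"name": "node6", "load": 0.2, "resource_code": 8016004002},
]
sequence = build_region_sequence(zone_a_nodes)
```

A `deque` is used so that reading nodes in order later is a constant-time pop from the front.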

In a substep S2033, the target area is determined by associating the areas based on the job request.

The regions are associated based on the job request. The manner of association includes, but is not limited to, tag attribute matching selection, historical job data statistics correlation coefficients, load balancing or centralized use, and the like. In the embodiment of the invention, the job request can be processed in parallel to enter the associated regional job queue. And then a target area in which the request is performed is determined.

Specifically, in one embodiment of the present invention, the step of associating the areas based on the job request and determining the target area includes: associating the areas based on the job type of the job request, and determining the target area.

The areas of the same type may be associated according to the type of job request and the target area may be determined therefrom.
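The job-type association can be illustrated with a simple tag match. This is a sketch under the assumption that each region carries a single type tag; the mapping `region_types`, the tag values, and the function name are hypothetical.

```python
def select_target_region(job_type, region_types, default=None):
    """Return the first region whose type tag equals the job type.

    region_types maps a region identifier to its (assumed) type tag.
    If no region matches, the default is returned.
    """
    for region_id, region_type in region_types.items():
        if region_type == job_type:
            return region_id
    return default
```

For example, with `{"zone-a": "training", "zone-b": "inference"}`, a job of type `"inference"` would be associated with `"zone-b"`.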

Step 204, in response to a concurrent job request of the cluster, determining a target area from the virtual cluster topology view according to the node task flow index;

In the embodiment of the invention, job requests can be processed in parallel and entered into the job queue of the associated region. A target area is determined from the virtual cluster topology view according to the node task flow index, and further scheduling is performed within the target area.

Step 205, sequentially reading nodes in the node list of the target area, determining a target node, and responding to the job request based on the target node.

The nodes in the node list of the target area can be sequentially read, the target node is determined, and the target node is adopted to respond to the job request so as to perform job processing.

In an optional embodiment of the invention, the step of sequentially reading nodes in the node list of the target area and determining the target node includes:

sub-step S2051, reading nodes in the sequence of the regional nodes one by one according to the sequence;

Sub-step S2052, when the resource corresponding to the node meets the job request, determining the node as the target node.

The nodes in the region are held in an ordered sequence. Nodes are dequeued in order, each dequeued node is compared directly against the job request, and the first node whose feature value satisfies the comparison is determined as the target node.
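The sequential read can be sketched as a first-fit scan over the ordered queue. The sketch assumes that the node and the request are each represented by a single comparable feature value and that "meets the job request" means the node value is at least the requested value; both are assumptions, as are the names used.

```python
from collections import deque


def pick_target_node(region_sequence, required_feature):
    """Dequeue nodes in order and return the first one that fits.

    region_sequence is a deque of (node_name, feature_value) pairs.
    Returns None when no node in the region satisfies the request.
    """
    while region_sequence:
        name, feature = region_sequence.popleft()
        if feature >= required_feature:
            return name
    return None
```

Because the scan stops at the first match, the cost depends on the position of the first fitting node rather than on the total number of nodes. In this sketch, nodes dequeued before the match are discarded; a real scheduler would presumably keep or rebuild the sequence.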

Step 206, in response to the node resource remaining number of the target area being smaller than the resource demand number of the job request, scheduling the job request to other areas, wherein the other areas are areas outside the target area in the plurality of areas;

When the node resource remaining number of the target area is smaller than the resource demand number of the job request, that is, when the resources of the area are insufficient to process the job request, the job request can be scheduled to another area, namely an area other than the target area among the plurality of areas.

Step 207, using the other area as the target area, executing the steps of sequentially reading the nodes in the node list of the target area, determining a target node, and responding to the job request based on the target node.

After the other area is determined to be the new target area, the nodes in the node list of the new target area are read sequentially, a target node is determined, and the job request is responded to based on that target node, so that the job is scheduled using the new target area.
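Steps 206 and 207 together amount to a fallback across regions, which the following sketch illustrates. It assumes a single scalar resource per node and tries the remaining regions in dictionary order; the specification does not say how the other area is chosen, so that order and all names here are hypothetical.

```python
def schedule(job_demand, target_region, regions):
    """Pick (region_id, node_name) for a job, falling back across regions.

    regions maps a region identifier to an ordered list of
    (node_name, free_resources) pairs. Returns None if nothing fits.
    """
    # Try the target region first, then every other region.
    candidates = [target_region] + [r for r in regions if r != target_region]
    for region_id in candidates:
        nodes = regions[region_id]
        remaining = sum(free for _, free in nodes)
        if remaining < job_demand:
            continue  # step 206: region resources insufficient, move on
        for node_name, free in nodes:  # step 207: sequential read
            if free >= job_demand:
                return region_id, node_name
    return None
```

The region-level check lets the scheduler skip an exhausted region without reading its node list at all.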

In addition, referring to fig. 7, in practical applications, steps 201 to 203 may be performed by the node topology management part, and steps 204 to 207 may be implemented by the code of the node fast scheduling policy algorithm.

In the embodiment of the invention, the influence of both resource-layer and service-layer factors on scheduling is considered. Areas are dynamically divided through the virtual cluster scheduling topology graph, node resource feature values are quickly matched and optimized, concurrent jobs are processed through the area queues, and ordered area node sequences are quickly generated. Based on a scheduling score function expression that can be quantified in engineering terms, the cluster scheduler can efficiently perform optimized scheduling for the cloud platform in a large-scale node cluster. The scheduling performance problem caused by the scale of the node count is solved adaptively, and the running stability and continuity of the platform services are ensured.

It should be noted that, for simplicity of description, the method embodiments are shown as a series of acts, but it should be understood by those skilled in the art that the embodiments are not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred embodiments, and that the acts are not necessarily required by the embodiments of the invention.

Referring to fig. 8, a block diagram of an embodiment of a cluster node selection scheduling device according to the present invention is shown, where the cluster node selection scheduling device may specifically include the following modules:

a first obtaining module 801, configured to obtain node data from the cluster platform;

a first construction module 802, configured to construct a node scheduling topology according to the node data, and determine a node task traffic index according to the node data;

a second construction module 803, configured to construct an updated virtual cluster topology view according to the node scheduling topology map, where the updated virtual cluster topology view includes a plurality of areas, and each area corresponds to a node list;

a target area determining module 804, configured to determine a target area from the virtual cluster topology view according to the node task traffic index in response to a concurrent job request of the cluster;

a request response module 805, configured to sequentially read the nodes in the node list of the target area, determine a target node, and respond to the job request based on the target node.

In an alternative embodiment of the invention, the apparatus further comprises:

a scheduling module, configured to schedule the job request to another area in response to the node resource remaining number of the target area being smaller than the resource demand number of the job request, where the other area is an area outside the target area in the plurality of areas;

an execution module, configured to take the other area as the target area and execute the steps of sequentially reading the nodes in the node list of the target area, determining a target node, and responding to the job request based on the target node.

In an alternative embodiment of the present invention, the first construction module 802 includes:

a configuration sub-module, configured to determine node topology configuration data according to the node data, where the node topology configuration data includes a node topology hierarchy;

a construction sub-module for constructing a node physical topology structure according to the node topology hierarchical structure;

and the filtering sub-module is used for filtering the node physical topological structure diagram and generating the node scheduling topological diagram.

In an alternative embodiment of the present invention, the configuration submodule includes:

the topology type determining unit is used for determining a node topology type based on the node data, wherein the node topology type corresponds to a node topology level identifier;

a node topology distance determination unit that determines a node topology distance based on the node data;

and the aggregation unit is used for combining the node topology type and the node topology distance to determine the node topology configuration data.

In an alternative embodiment of the present invention, the filtering submodule includes:

the extracting unit is used for extracting available schedulable nodes in the node physical topological structure diagram;

and the first construction unit is used for constructing the node scheduling topological graph based on the available schedulable nodes.
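The filtering performed by these two units can be sketched as taking the subgraph of the physical topology induced by the schedulable nodes. The adjacency-dictionary representation and the boolean `schedulable` map are assumptions made for illustration; the specification does not define the data structures.

```python
def build_scheduling_topology(physical_topology, schedulable):
    """Keep only available schedulable nodes and the links between them.

    physical_topology maps each node to the set of its neighbours.
    schedulable maps each node to True if it is available for scheduling.
    """
    available = {n for n in physical_topology if schedulable.get(n, False)}
    return {
        n: {m for m in physical_topology[n] if m in available}
        for n in available
    }
```

Nodes missing from `schedulable` are treated as unavailable, so an incomplete status report never adds a node to the scheduling graph.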

In an alternative embodiment of the present invention, the node data includes multi-service dimension information, and the target area determining module 804 includes:

and the node task flow index calculation sub-module is used for calculating the weighted sum value of the multi-service dimension information, normalizing the weighted sum value and determining the node task flow index.

In an alternative embodiment of the present invention, the multi-service dimension information includes: the node current operation network flow, the operation quantity, the data set and the mirror image cache state, wherein the node task flow index comprises a service load, and the node task flow index calculation submodule comprises:

and the calculation unit is used for calculating the weighted sum value of the current operation network flow of the node, the operation quantity, the data set and the mirror image cache state, normalizing the weighted sum value and determining the node business load.
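The weighted-sum-and-normalize calculation can be sketched as follows. The specification gives neither the weights nor the normalization method, so the weights below and the use of min-max normalization across nodes are assumptions; the four inputs are taken to be already numeric.

```python
def node_traffic_loads(metrics, weights=(0.4, 0.3, 0.2, 0.1)):
    """Compute a normalized service load per node.

    metrics maps a node to a 4-tuple: (network flow, job count,
    data set measure, image cache measure). The weights and the
    min-max normalization are illustrative assumptions.
    """
    if not metrics:
        return {}
    raw = {
        node: sum(w * v for w, v in zip(weights, values))
        for node, values in metrics.items()
    }
    lo, hi = min(raw.values()), max(raw.values())
    span = hi - lo
    if span == 0:
        # All nodes carry the same weighted load.
        return {node: 0.0 for node in raw}
    return {node: (score - lo) / span for node, score in raw.items()}
```

Normalizing to the range 0 to 1 makes loads comparable across nodes regardless of the units of the underlying measurements.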

In an alternative embodiment of the invention, the apparatus further comprises:

and the inter-node business affinity determining unit is used for determining the inter-node business affinity based on the node type and the node task flow index.

In an alternative embodiment of the invention, the apparatus further comprises:

and the conversion module is used for carrying out coding processing on the multidimensional resource vector value of the job request.

In an alternative embodiment of the invention, the conversion module comprises:

the first coding submodule is used for carrying out numerical coding on the resource vector characteristic value of any dimension in the multidimensional resource vector values;

and the combination sub-module is used for combining the numerical codes according to the front and back dimensions to generate a first total characteristic numerical value.
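One possible reading of this encoding is sketched below: each dimension is encoded as a fixed-width decimal field and the fields are concatenated in dimension order to form a single total feature value. The field width, the decimal encoding, and the function name are assumptions; the specification states only that per-dimension numerical codes are combined in dimension order.

```python
def encode_resource_vector(values, width=3):
    """Encode each dimension as a zero-padded field and concatenate.

    values is a sequence of non-negative integers, one per resource
    dimension, most significant dimension first.
    """
    if not values:
        raise ValueError("values must not be empty")
    limit = 10 ** width
    code = ""
    for v in values:
        if not 0 <= v < limit:
            raise ValueError(f"value {v} does not fit in {width} digits")
        code += f"{v:0{width}d}"
    return int(code)
```

Under this scheme a single integer comparison orders vectors lexicographically by dimension, so the leading dimension dominates. Whether that matches the intended comparison of feature values is not stated in the specification.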

In an alternative embodiment of the invention, the apparatus further comprises:

and the first mapping module is used for carrying out coding mapping on the resources corresponding to the nodes of the virtual cluster topology view to generate node resource characteristic values.

In an alternative embodiment of the invention, the apparatus further comprises:

and the second mapping module is used for carrying out coding mapping on the multidimensional features corresponding to the nodes of the virtual cluster topology view to generate a second total feature value.

In an alternative embodiment of the present invention, the target area determining module 804 includes:

the dynamic planning sub-module is used for dynamically planning the nodes in the virtual cluster topology view to generate the region;

the sequencing sub-module is used for sequencing the node list of the area based on the resource characteristic value and the node service load to generate an area node sequence;

and the association sub-module is used for associating the areas based on the job request and determining the target area.

In an alternative embodiment of the present invention, the dynamic programming submodule includes:

a score calculation unit, configured to determine an initial region and calculate a region score based on the nodes in the initial region and the initial region;

a center node determining unit, configured to determine a center node according to the region score;

and the region generating unit is used for generating the region based on the central node.

In an optional embodiment of the invention, the nodes in the initial area correspond to a location affinity and a business affinity, and the score calculation unit comprises:

a weighted calculation subunit for calculating a weighted sum of the location affinity and the business affinity;

a ratio calculating subunit, configured to calculate a ratio of the weighted sum value to the traffic load;

and the score determining subunit is used for determining the ratio as the region score.

In an alternative embodiment of the present invention, the association submodule includes:

and the association unit is used for associating the areas based on the job type of the job request and determining the target area.

In an alternative embodiment of the present invention, the request response module 805 includes:

the reading submodule is used for reading the nodes in the sequence of the regional nodes one by one according to the sequence;

and the target node determining submodule is used for determining the node as a target node when the resource corresponding to the node meets the job request.

For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.

Referring to fig. 9, an embodiment of the present invention further provides an electronic device, including:

a processor 901 and a storage medium 902, said storage medium 902 storing a computer program executable by said processor 901, said processor 901 executing said computer program when the electronic device is running to perform a cluster node selection scheduling method according to any one of the embodiments of the present invention. The cluster node selection scheduling method, wherein the clusters are deployed on a cluster platform, comprises the following steps:

Acquiring node data from the cluster platform;

Optionally, the method further comprises:

determining a node topology distance based on the node data;

Optionally, the node data includes a node type, and the method further includes:

Optionally, the method further comprises:

determining a central node according to the region score;

the region is generated based on the central node.

calculating a weighted sum of the location affinity and the business affinity;

calculating the ratio of the weighted sum value to the traffic load;

and determining the ratio as the region score.

and associating the areas based on the job type of the job request, and determining the target area.

The memory may include random access memory (RAM) or non-volatile memory, such as at least one disk memory. Optionally, the memory may also be at least one memory device located remotely from the aforementioned processor.

The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

Referring to fig. 10, an embodiment of the present invention further provides a computer readable storage medium 1001, where the storage medium 1001 stores a computer program, and the computer program when executed by a processor performs a cluster node selection scheduling method according to any one of the embodiments of the present invention. The cluster node selection scheduling method, wherein the clusters are deployed on a cluster platform, comprises the following steps:

Acquiring node data from the cluster platform;

Optionally, the method further comprises:

determining a node topology distance based on the node data;

Optionally, the method further comprises:

determining a central node according to the region score;

the region is generated based on the central node.

calculating a weighted sum of the location affinity and the business affinity;

calculating the ratio of the weighted sum value to the traffic load;

and determining the ratio as the region score.

In this specification, the embodiments are described in a progressive manner, each embodiment focusing on its differences from the others; for identical or similar parts, the embodiments may be referred to one another.

It will be apparent to those skilled in the art that embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the invention may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.

Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the invention.

Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or terminal device comprising the element.

The cluster node selection scheduling method, device, electronic equipment and storage medium provided by the invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the above embodiments is only intended to help in understanding the method and core idea of the invention. Meanwhile, those skilled in the art may, in accordance with the ideas of the invention, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the invention.

Claims

1. The cluster node selection scheduling method is characterized by comprising the following steps of:

acquiring node data from the cluster platform; the node data comprises the current operation network flow of the node, the number of the operations, a data set and a mirror image cache state, and the node task flow index comprises a service load;

sequentially reading nodes in a node list of the target area, determining a target node, and responding to the job request based on the target node;

the node scheduling topological graph is constructed based on available schedulable nodes in a node physical topological structure graph, and the node physical topological structure graph is determined through node topology configuration data, wherein the node topology configuration data is determined by: determining a node topology type based on the node data, the node topology type corresponding to a node topology level identifier; determining a node topology distance based on the node data; and combining the node topology type and the node topology distance;

the node task flow index is determined by calculating the weighted sum value of the current operation network flow of the node, the operation quantity, the data set and the mirror image cache state, and normalizing the weighted sum value.

2. The method according to claim 1, wherein the method further comprises:

3. The method of claim 1, wherein the step of constructing a node scheduling topology from the node data comprises:

4. The method of claim 1, wherein the step of determining a node task traffic indicator from the node data comprises:

5. The method according to claim 1, wherein the method further comprises:

6. The method according to claim 1, wherein the method further comprises:

7. The method of claim 6, wherein the step of encoding the multi-dimensional resource vector value of the job request comprises:

8. The method of claim 7, wherein the method further comprises:

9. The method of claim 8, wherein the method further comprises:

10. The method of claim 9, wherein the step of determining a target area from the virtual cluster topology view in accordance with the node task traffic index in response to the cluster concurrent job request comprises:

11. The method of claim 10, wherein the step of dynamically planning nodes in the virtual cluster topology view to generate the region comprises:

determining a central node according to the region score;

the region is generated based on the central node.

12. The method of claim 11, wherein the nodes in the initial region correspond to location affinity and business affinity, and wherein the step of calculating the region score based on the nodes in the initial region and the initial region comprises:

calculating a weighted sum of the location affinity and the business affinity;

calculating the ratio of the weighted sum value to the traffic load;

and determining the ratio as the region score.

13. The method of claim 10, wherein the step of associating the regions based on the job request, determining the target region comprises:

14. The method of claim 10, wherein the step of sequentially reading nodes in the node list of the target area and determining the target node comprises:

15. A cluster node selection scheduling device, wherein the cluster is deployed on a cluster platform, and the cluster node selection scheduling device is characterized by comprising:

the first acquisition module is used for acquiring node data from the cluster platform; the node data comprises the current operation network flow of the node, the number of the operations, a data set and a mirror image cache state, and the node task flow index comprises a service load;

the request response module is used for sequentially reading the nodes in the node list of the target area, determining a target node and responding to the job request based on the target node;

16. An electronic device comprising a processor, a memory and a computer program stored on the memory and executable on the processor, the computer program implementing the steps of the cluster node selection scheduling method of any one of claims 1 to 14 when executed by the processor.

17. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the cluster node selection scheduling method of any of claims 1 to 14.