CN115495246A - Hybrid remote memory scheduling method under separated memory architecture - Google Patents


Info

Publication number
CN115495246A
Authority
CN
China
Prior art keywords
memory
task
node
far
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211212624.0A
Other languages
Chinese (zh)
Other versions
CN115495246B (en)
Inventor
李超 (Chao Li)
王靖 (Jing Wang)
贺昊 (Hao He)
梅君夷 (Junyi Mei)
汪陶磊 (Taolei Wang)
过敏意 (Minyi Guo)
Current Assignee
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN202211212624.0A priority Critical patent/CN115495246B/en
Publication of CN115495246A publication Critical patent/CN115495246A/en
Application granted granted Critical
Publication of CN115495246B publication Critical patent/CN115495246B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/48: Program initiating; program switching, e.g. by interrupt
    • G06F 9/4806: Task transfer initiation or dispatching
    • G06F 9/4843: Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881: Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F 9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005: Allocation of resources to service a request
    • G06F 9/5011: Allocation of resources, the resources being hardware resources other than CPUs, servers and terminals
    • G06F 9/5016: Allocation of resources, the resource being the memory
    • G06F 9/5027: Allocation of resources, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F 9/505: Allocation of resources considering the load
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A hybrid far-memory scheduling method under a separated memory architecture. Runtime data are first collected by limiting each application's local-memory usage, and tasks are accordingly divided into far-memory-insensitive tasks, far-memory-sensitive tasks, and far-memory-forbidden tasks. Memory-insensitive and memory-sensitive tasks are allocated to the same compute node under a complementary-sensitivity principle, and each task yields as much memory as it can under a common performance constraint. When the overall yieldable memory values of the corresponding servers differ substantially, cross-node memory resource adjustment is performed to determine each server's yielded memory value or rented far-memory value; intra-node memory resource adjustment then allocates resources to each task according to the server's current remaining memory and the principle that sensitive tasks receive more additional local memory, thereby realizing hybrid far-memory scheduling. The invention fully exploits application characteristics in a far-memory environment and improves data-center memory utilization and memory-use efficiency through an efficient far-memory allocation strategy.

Description

Hybrid remote memory scheduling method under separated memory architecture
Technical Field
The invention relates to a technology in the field of distributed computing, and in particular to a method and system for hybrid remote memory scheduling under a separated memory architecture.
Background
Inefficient use of memory resources in data centers is one of the main reasons many complex computing applications hit performance bottlenecks. Remote memory access enables sharing of memory and storage resources and thus better resource utilization. Existing techniques use high-performance storage such as SSDs as vertical far memory for data swapping, and use high-speed networks such as RDMA to realize horizontal far-memory reads and writes. However, existing servers and job schedulers neither consider nor solve application resource allocation under a combined horizontal-and-vertical hybrid far-memory architecture: they cannot allocate far-memory resources according to application characteristics, cannot capture task dynamics to readjust the deployment of memory resources for load balancing, and cannot achieve high-performance resource sharing under limited memory resources.
Disclosure of Invention
Aiming at the shortcomings of existing memory scheduling techniques, namely ignoring tasks' far-memory sensitivity, failing to capture task dynamics so as to readjust the deployment of memory resources, insufficient throughput for tasks that use far memory, and poor overall resource-use efficiency in scheduling, which together cause cluster load imbalance, the invention provides a hybrid remote memory scheduling method under a separated memory architecture.
The invention is realized by the following technical scheme:
the invention relates to a mixed remote memory scheduling method under a separated memory architecture, which comprises the steps of firstly collecting data during operation in a mode of limiting the use of a local memory, thereby dividing tasks into a remote memory insensitive task, a remote memory sensitive task and a remote memory forbidden task; allocating memory insensitive tasks and memory sensitive tasks to the same computing node according to a sensitivity degree complementation principle, yielding memory to the maximum extent under the condition of the same performance limit by the tasks, performing cross-node memory resource adjustment when the integral yielding memory value difference between corresponding servers is large, determining yielding memory value or rented far memory value of the server, then performing internal memory resource adjustment of the nodes, and performing resource allocation for each task according to the current residual memory resource of the server and the principle of more additional local memory resources of the sensitive tasks, thereby realizing mixed far memory scheduling.
The separated (disaggregated) memory architecture refers to flexibly combining the CPUs and memories of multiple data-center servers over a network connection: servers whose function is computation serve as compute nodes, and servers whose function is memory access serve as memory nodes. A task's main program runs on a compute node while accessing a memory node's memory as far memory. A server may act as a compute node while also contributing memory, or act as a memory node providing far-memory access.
In the far-memory architecture, each server is equipped with an RDMA network card, and the network cards are connected by copper cable. Each server uses its CPU as the computing core and its DRAM as the memory unit, with the RDMA card attached to the server's motherboard via PCIe; each server's CPU uses its local memory directly and accesses remote memory through the RDMA card without occupying the remote CPU's resources.
The hybrid remote memory comprises horizontal far memory and vertical far memory, wherein: the horizontal far memory comprises remote DRAM space accessed over an RDMA network card connection, and the vertical far memory comprises storage space of the same server, such as disk and SSD devices, accessed through the Linux swap mechanism and the I/O interface. An application's total memory space Mem is the sum of three parts: the local memory space Mem_lm, the horizontal far-memory space Mem_hfm, and the vertical far-memory space Mem_vfm, i.e. Mem = Mem_lm + Mem_hfm + Mem_vfm.
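The three-part memory space above can be sketched as a small data type. This is a minimal illustration of the relation Mem = Mem_lm + Mem_hfm + Mem_vfm; the class and field names are illustrative, not from the patent's implementation.

```python
# Illustrative sketch: a task's memory space is the sum of local memory,
# horizontal far memory (remote DRAM over RDMA), and vertical far memory
# (local SSD/disk reached via swap). Names are assumptions.
from dataclasses import dataclass

@dataclass
class TaskMemory:
    lm: int   # local DRAM (MB)
    hfm: int  # horizontal far memory: remote DRAM over RDMA (MB)
    vfm: int  # vertical far memory: SSD/disk via Linux swap (MB)

    @property
    def total(self) -> int:
        # Mem = Mem_lm + Mem_hfm + Mem_vfm
        return self.lm + self.hfm + self.vfm

m = TaskMemory(lm=2048, hfm=1024, vfm=512)
print(m.total)  # 3584
```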
Collecting runtime data by limiting the application's local-memory usage means: activating vertical far-memory access by capping the application's local-memory ratio, and recording each local-memory ratio together with the corresponding overall application performance.
Allocating memory-insensitive and memory-sensitive tasks to the same compute node means computing, from each task i's minimum local memory Min_lm^i, its allowable (yieldable) memory value SI_i, and each server j's id and remaining memory capacity C_j, the server node id on which each task should be placed, specifically:
i) when the total remaining resource Res = Σ_j C_j exceeds the minimum resource allocation of the current task group, Min_Allo = Σ_i Min_lm^i, i.e. Res > Min_Allo, perform step ii);
ii) sort all tasks in the current task group by the difference between each task's maximum allowable memory SI_i^max and the in-group mean SI_avg, i.e. by (SI_i^max − SI_avg); allocate using a knapsack algorithm over each task's minimum local memory Min_lm^i, and preferentially place tasks whose (SI_i^max − SI_avg) values are close in magnitude but opposite in sign on the same server;
iii) compute each server's predicted remaining capacity C_j in real time; when C_j can no longer accommodate the next task's minimum local memory Min_lm^i, i.e. C_j − Min_lm^i < 0, the current server is considered full and placement continues on the next server;
iv) return to step i) until every task has been traversed.
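The placement steps above can be sketched as follows. This is a hedged simplification: ordering tasks by how far their allowable memory SI_i deviates from the group mean interleaves strongly sensitive and strongly insensitive tasks (the complementary-sensitivity principle), and a plain first-fit by minimum local memory stands in for the knapsack allocation the text mentions. All names and the exact pairing heuristic are assumptions.

```python
# Hedged sketch of steps i)-iv): feasibility check, sensitivity-complementary
# ordering, first-fit placement, and the "server full" test.
def place_tasks(tasks, capacities):
    """tasks: list of (task_id, min_lm, si); capacities: list of C_j.
    Returns {task_id: server_index}, or None if Res < Min_Allo."""
    res = sum(capacities)
    min_allo = sum(t[1] for t in tasks)
    if res < min_allo:                       # step i): Res must exceed Min_Allo
        return None
    si_avg = sum(t[2] for t in tasks) / len(tasks)
    # step ii): largest |SI_i - SI_avg| first, so tasks with deviations of
    # similar magnitude but opposite sign end up adjacent and co-located.
    ordered = sorted(tasks, key=lambda t: abs(t[2] - si_avg), reverse=True)
    remaining = list(capacities)
    placement, server = {}, 0
    for tid, min_lm, _si in ordered:
        while server < len(remaining) and remaining[server] < min_lm:
            server += 1                      # step iii): server full, move on
        if server == len(remaining):
            return None
        placement[tid] = server
        remaining[server] -= min_lm
    return placement

tasks = [("a", 4, 10.0), ("b", 4, -10.0), ("c", 2, 1.0), ("d", 2, -1.0)]
print(place_tasks(tasks, [8, 8]))
```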
Preferably, the compute node proportionally adjusts each task's local-memory, horizontal far-memory, and vertical far-memory usage according to the task's maximum allowable memory.
The cross-node memory resource adjustment specifically comprises:
i) computing each server's overall allowable memory value ServerSI_j, i.e. the difference between the server's remaining memory capacity before task allocation and the minimum local memory of the tasks allocated to it: ServerSI_j = C_j − Min_Allo;
ii) computing the mean ServerSI_avg of all servers' ServerSI_j; when ServerSI_j − ServerSI_avg > 0, server j yields ServerSI_j − ServerSI_avg of memory capacity; otherwise it borrows |ServerSI_j − ServerSI_avg| of memory capacity.
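The two cross-node steps can be sketched directly. The helper below computes ServerSI_j = C_j − Min_Allo per server, then each server's deviation from the mean: positive deviations are memory to yield, negative ones memory to borrow. Function and parameter names are illustrative.

```python
# Hedged sketch of the cross-node adjustment: deviation of each server's
# overall allowable memory from the cluster mean decides yield vs. borrow.
def cross_node_adjust(c, min_allo):
    """c[j]: remaining capacity of server j before placement;
    min_allo[j]: minimum local memory of tasks placed on server j.
    Returns deltas per server: > 0 means yield, < 0 means borrow |delta|."""
    server_si = [cj - mj for cj, mj in zip(c, min_allo)]   # ServerSI_j
    si_avg = sum(server_si) / len(server_si)               # ServerSI_avg
    return [s - si_avg for s in server_si]

deltas = cross_node_adjust(c=[100, 60, 80], min_allo=[20, 20, 30])
print(deltas)
```

By construction the deltas sum to zero, so the total memory yielded across the cluster matches the total borrowed.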
The intra-node memory resource adjustment adjusts the sizes of each task's local, horizontal far, and vertical far memory within a server node, so that sensitive tasks obtain a relatively larger share of the allowable memory value SI, specifically:
i) collect the minimum local memory value Min_lm^i and allowable memory value SI_i of each task on the current server, and compute the server's maximum allocatable memory resource ServerSI = min(ServerSI_j, ServerSI_avg); servers that yield memory only need to provide the total yielded memory, which serves as horizontal far memory for tasks on other servers, with each task's far-memory usage computed in the following steps;
ii) compute the additional local memory Δlm_i each task can receive, each task's share being in inverse proportion to its own allowable memory value SI_i, i.e. Δlm_i ∝ 1/SI_i; the task's local memory space is then Mem_lm = Min_lm + Δlm;
iii) compute each task's horizontal far-memory space Mem_hfm^ii, where the index ii ranges only over the far-memory-sensitive tasks in the server, subject to ServerSI_j − ServerSI_avg > 0 on the lending side;
iv) compute each task's vertical far-memory value Mem_vfm = Mem − Mem_lm − Mem_hfm;
v) return each node's final memory allocation.
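The intra-node steps can be sketched as below, under explicit assumptions: the server's slack ServerSI is split as extra local memory in inverse proportion to each task's SI_i (so the most sensitive task, with the smallest SI, gets the most extra local DRAM), the rented horizontal far memory is handed to the most sensitive task, and whatever remains of a task's demand falls through to vertical far memory. The exact proportions and the single-recipient simplification are assumptions, not the patent's formula.

```python
# Hedged sketch of intra-node allocation steps i)-v).
def intra_node_allocate(tasks, server_si, borrowed_hfm):
    """tasks: list of (total_mem, min_lm, si); returns (lm, hfm, vfm) per task.
    borrowed_hfm: horizontal far memory rented cross-node, given here to the
    task with the smallest SI (most far-memory-sensitive) as a simplification."""
    inv = [1.0 / si for _, _, si in tasks]         # dlm_i proportional to 1/SI_i
    inv_sum = sum(inv)
    sensitive = min(range(len(tasks)), key=lambda k: tasks[k][2])
    out = []
    for k, (mem, min_lm, _si) in enumerate(tasks):
        dlm = server_si * inv[k] / inv_sum
        lm = min_lm + dlm                          # Mem_lm = Min_lm + dlm
        hfm = borrowed_hfm if k == sensitive else 0.0
        vfm = max(mem - lm - hfm, 0.0)             # Mem_vfm = Mem - Mem_lm - Mem_hfm
        out.append((lm, hfm, vfm))
    return out

alloc = intra_node_allocate([(32, 8, 2.0), (32, 8, 8.0)], server_si=10, borrowed_hfm=6)
print(alloc)
```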
Preferably, when a compute node's far-memory sensitivity changes, or when one task ends while other tasks continue, the compute node performs cross-node memory resource adjustment and intra-node resource adjustment again, releases the related resources after the run ends, and recomputes the allocation of tasks in the next task queue.
A change in a compute node's far-memory sensitivity is judged by periodically sampling the difference in the page-fault count. When the page-fault difference is positive and exceeds the average page-fault count more than three consecutive times, the task is considered to have changed from far-memory-insensitive to far-memory-sensitive; when the page-fault difference is negative and its magnitude exceeds the average page-fault count more than three consecutive times, the task is considered to have changed from far-memory-sensitive to far-memory-insensitive.
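The periodic check above can be sketched as a small detector. The three-sample streak follows the text; the windowing and the use of a fixed running mean are illustrative assumptions.

```python
# Hedged sketch: flag a sensitivity transition after three consecutive
# page-fault deltas whose magnitude exceeds the average page-fault count.
def detect_transition(pf_deltas, mean_pf, streak=3):
    """Return 'to_sensitive', 'to_insensitive', or None."""
    up = down = 0
    for d in pf_deltas:
        up = up + 1 if (d > 0 and abs(d) > mean_pf) else 0
        down = down + 1 if (d < 0 and abs(d) > mean_pf) else 0
        if up >= streak:
            return "to_sensitive"      # insensitive -> sensitive
        if down >= streak:
            return "to_insensitive"    # sensitive -> insensitive
    return None

print(detect_transition([120, 150, 130], mean_pf=100))
```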
Performing the cross-node and intra-node adjustment again means: when a task changes from far-memory-sensitive to insensitive, or when one task completes while others continue, the change ΔSI in the server's current yieldable memory value SI is obtained first; when the change exceeds a threshold, the cross-node adjustment module is invoked to yield part of the memory for use as far memory, after which the intra-node adjustment module allocates an appropriate proportion of memory to each running task. Conversely, when a task changes from far-memory-insensitive to sensitive, ΔSI is likewise obtained first; when it exceeds the threshold, the cross-node adjustment module borrows part of another server's memory as far-memory access for local tasks, and the intra-node adjustment module then allocates an appropriate proportion of far memory to each running task.
The resource allocation for each task comprises allocating the sizes of the local memory resource, the horizontal far-memory resource, and the vertical far memory.
The invention further relates to a system implementing the above method, comprising: a far-memory-based staged application sensitivity analysis unit, a task grouping unit based on sensitivity and task characteristics, a load-balance-based compute node selection unit, a cross-node memory resource adjustment unit, and an intra-node memory resource adjustment unit, wherein: the application sensitivity analysis unit collects online runtime data by limiting the application's local-memory usage and computes sensitivity-related application parameters; the task grouping unit performs far-memory sensitivity analysis on the collected runtime data and parameters and divides tasks into far-memory-insensitive, far-memory-sensitive, and far-memory-forbidden tasks; the compute node selection unit assigns memory-insensitive and memory-sensitive tasks to the same compute node under the complementary-sensitivity principle, using the task sensitivity information from the analysis and grouping units; the cross-node memory resource adjustment unit computes each server's overall allowable memory value from the tasks' maximum allowable memory under a common performance constraint and, when the values differ substantially between servers, adjusts memory resources across nodes and determines each server's yielded memory value or rented far-memory value; and the intra-node memory resource adjustment unit allocates local memory to each task according to the server's current remaining memory and the principle that sensitive tasks receive more additional local memory, determines the horizontal far-memory size by combining the cross-node unit's result, and computes the vertical far-memory size, finally realizing efficient hybrid far-memory scheduling.
Technical effects
By limiting local-memory usage, the method collects online runtime data of the application and computes sensitivity-related parameters; combined with task grouping by sensitivity and task characteristics, it performs far-memory sensitivity analysis and divides tasks into far-memory-insensitive, far-memory-sensitive, and far-memory-forbidden tasks; load-balance-based compute node selection then assigns memory-insensitive and memory-sensitive tasks to the same compute node under the complementary-sensitivity principle, using the collected task sensitivity information. Cross-node memory resource adjustment determines each server's yielded memory value or rented far-memory value from the tasks' maximum yieldable memory under a common performance constraint and each server's overall yieldable memory value; intra-node memory resource adjustment then allocates local memory to each task according to the server's current remaining memory and the principle that sensitive tasks receive more additional local memory, and determines the horizontal and vertical far-memory sizes.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a schematic diagram of a module framework embodying the present invention;
FIG. 3 is a flow chart of cross-node and intra-node resource readjustment;
FIG. 4 is a diagram illustrating the optimization results of the embodiment;
in the figure: (a) memory utilization optimization; (b) overall performance optimization; (c) overall memory-use efficiency optimization;
FIG. 5 is a diagram illustrating the sensitivity and effect of different tasks in one embodiment;
in the figure: (a) memory use efficiency of the S-Trace; (b) memory use efficiency of M-Trace; and (c) the memory use efficiency of the L-Trace.
Detailed Description
In this embodiment, several real application frameworks are taken as examples, including general computation, graph computation, video processing, AI training, AI inference, image recognition, and video recognition tasks. RDMA serves as the far-memory medium when collecting runtime data, and the system environment is: two servers, each with two 20-core Intel(R) Xeon(R) Gold 6148 CPUs, 256 GB of memory, two 1 TB hard disks, and a dual-channel Mellanox ConnectX-5 RDMA network card. One server serves as the compute node and the other as the far-memory access node (remote node). For simulated scheduling, a Python program simulates three scenarios (10 nodes with 200 tasks, 20 nodes with 500 tasks, and 50 nodes with 2000 tasks) to give and compare optimization results, and task mixes with far-memory-sensitive task proportions of 10%, 30%, 50%, 70%, and 90% demonstrate the method's memory-use efficiency optimization.
As shown in FIG. 1, the hybrid far-memory scheduling method under a separated memory architecture according to this embodiment comprises:
i) First, collect runtime data by limiting the application's local-memory usage, so as to collect and analyze application characteristics stage by stage in a far-memory environment, specifically: activate the vertical far memory by capping the application's local-memory ratio, and record:
(1) the local-memory usage ratio and the corresponding application run time;
(2) the task performance under each setting compared with the performance without far memory; record the maximum used memory capacity Mem_max, local-memory ratio L, and local-memory value Mem_lm at which the performance ratio does not exceed the SLO (default value 1.2); compute the maximum memory offload ratio R = 1 − L and the maximum allowable memory value SI_max = Mem_max − Mem_lm; each task's current allowable memory value SI equals its allocated memory capacity Mem minus its local-memory value, i.e. SI = Mem − Mem_lm;
(3) task stage-division points according to sensitivity behavior, determined from the page-fault count PF recorded at fixed time interval Interval: a time point T at which the page-fault difference ΔPF exceeds mean(PF(T)) three consecutive times is selected as a stage-division point.
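The profiling computation in step i) can be sketched as follows: from the recorded (local-memory ratio L, slowdown) samples, keep the smallest L whose slowdown stays within the SLO, then derive R = 1 − L and SI_max = Mem_max − Mem_lm. The sample data and helper name are illustrative; the code assumes at least one sample meets the SLO.

```python
# Hedged sketch of deriving R and SI_max from profiling samples.
SLO = 1.2  # default performance-ratio limit from the text

def profile(samples, mem_max):
    """samples: list of (L, slowdown), L being the local-memory ratio.
    Returns (R, mem_lm, si_max) for the most aggressive feasible offload."""
    feasible = [(l, s) for l, s in samples if s <= SLO]
    l_best = min(l for l, _ in feasible)   # smallest local ratio within SLO
    mem_lm = l_best * mem_max              # Mem_lm = L * Mem_max
    return 1 - l_best, mem_lm, mem_max - mem_lm  # R, Mem_lm, SI_max

r, mem_lm, si_max = profile([(1.0, 1.0), (0.6, 1.1), (0.4, 1.18), (0.2, 1.5)],
                            mem_max=100)
print(r, mem_lm, si_max)
```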
ii) Group the tasks by sensitivity and task characteristics according to the collected and analyzed application data, dividing them into far-memory-insensitive, far-memory-sensitive, and far-memory-forbidden tasks, specifically: compute each task's allowable memory value SI and the mean SI_avg over the task group to be allocated; when SI > SI_avg: R < 0.2 is far-memory-forbidden, 0.2 < R < 0.5 is far-memory-sensitive, and R > 0.5 is far-memory-insensitive; when SI < SI_avg: R < 0.5 is far-memory-forbidden, 0.5 < R < 0.8 is far-memory-sensitive, and R > 0.8 is far-memory-insensitive.
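The grouping rule in step ii) maps directly to a small classifier. The thresholds are taken from the text; how exact boundary values (e.g. R = 0.5) are handled is an assumption, since the text leaves the comparisons strict on both sides.

```python
# Hedged sketch of the task-grouping thresholds on the offload ratio R,
# which depend on whether SI is above or below the group mean SI_avg.
def classify(si, si_avg, r):
    if si > si_avg:
        if r < 0.2:
            return "forbidden"      # far-memory-forbidden
        if r < 0.5:
            return "sensitive"      # far-memory-sensitive
        return "insensitive"        # far-memory-insensitive
    if r < 0.5:
        return "forbidden"
    if r < 0.8:
        return "sensitive"
    return "insensitive"

print(classify(si=12, si_avg=10, r=0.3))  # above-mean SI tolerates more offload
print(classify(si=8, si_avg=10, r=0.3))
```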
iii) Select a compute node for each task, specifically: given a group of task_num tasks with local memory values Min_lm^i and allowable memory values SI_i, and a group of server_num servers with ids and remaining memory capacities C_j, where the current task group's minimum allocation Min_Allo = Σ_i Min_lm^i is less than the total remaining resource Res = Σ_j C_j: sort all tasks in the current task group by the difference between each task's maximum allowable memory SI_i^max and the in-group mean SI_avg, i.e. by (SI_i^max − SI_avg); allocate using a knapsack algorithm over each task's minimum local memory Min_lm^i, preferentially placing tasks whose (SI_i^max − SI_avg) values are close in magnitude but opposite in sign on the same server; compute each server's remaining capacity C_j in real time, and when C_j can no longer accommodate the next task's minimum local memory Min_lm^i, consider the current server full and begin placing tasks on the next server; traverse every task and output the server id corresponding to each task.
iv) Perform cross-node memory resource adjustment, comprising: 1) compute each server's overall allowable memory value ServerSI_j, i.e. the difference between the server's remaining memory capacity before task allocation and the minimum local memory of the allocated tasks: ServerSI_j = C_j − Min_Allo; 2) compute the mean ServerSI_avg of all servers' ServerSI_j; when ServerSI_j − ServerSI_avg > 0, server j yields ServerSI_j − ServerSI_avg of memory capacity; otherwise it borrows |ServerSI_j − ServerSI_avg| of memory capacity.
v) Adjust the tasks' local memory within the node, comprising: 1) collect each task's minimum local memory value Min_lm^i and allowable memory value SI_i on the current server, and compute the server's maximum allocatable memory resource ServerSI = min(ServerSI_j, ServerSI_avg); 2) compute the additional local memory Δlm_i each task can receive, each task's share being in inverse proportion to its own allowable memory value SI_i, i.e. Δlm_i ∝ 1/SI_i; the task's local memory space is then Mem_lm = Min_lm + Δlm; 3) compute each task's horizontal far-memory space Mem_hfm^ii, where the index ii ranges only over the far-memory-sensitive tasks in the server, subject to ServerSI_j − ServerSI_avg > 0 on the lending side; 4) compute each task's vertical far-memory value Mem_vfm = Mem − Mem_lm − Mem_hfm; and return each node's final memory allocation.
vi) Adjust cross-node and intra-node resources again, judging the tasks' far-memory sensitivity by periodically sampling the difference in the page-fault count, comprising: 1) when the page-fault difference is positive and exceeds the average page-fault count more than three consecutive times, the task is considered to have changed from far-memory-insensitive to far-memory-sensitive; when the page-fault difference is negative and its magnitude exceeds the average page-fault count more than three consecutive times, the task is considered to have changed from far-memory-sensitive to far-memory-insensitive; 2) when a task changes from far-memory-sensitive to insensitive, or when one task completes while others continue, the change ΔSI in the server's current SI value is obtained first; when it exceeds a threshold, the cross-node adjustment module is invoked to yield part of the memory for use as far memory, and the intra-node adjustment module then allocates an appropriate proportion of memory to each running task; 3) conversely, when a task changes from far-memory-insensitive to sensitive, ΔSI is likewise obtained first; when it exceeds the threshold, the cross-node adjustment module borrows part of another server's memory as far-memory access for local tasks, and the intra-node adjustment module then allocates an appropriate proportion of far memory to each running task.
vii) Release the related resources after the run ends, and recompute the task allocation in the next task queue.
In practical experiments under the RDMA-based far-memory environment described above, running graph computation, video processing, AI training, AI inference, image recognition, and automatically generated sample applications, the implemented far-memory scheduling strategy successfully simulated task mixes with far-memory-sensitive proportions of 10%, 30%, 50%, 70%, and 90% under the three scenarios of 10 nodes with 200 tasks (S-Trace), 20 nodes with 500 tasks (M-Trace), and 50 nodes with 2000 tasks (L-Trace). Compared with a strategy that uses no far memory (No-FM), a last-in-first-compressed strategy (LIFC), and a first-in-first-compressed strategy (FIFC), the invention improves overall memory utilization by 17.6%, far-memory application performance by 20.7%, and memory-use efficiency by 20.5%, while reaching a maximum memory utilization of 98%.
Compared with the prior art, the method achieves higher memory utilization, more flexible far-memory access, better application performance in a far-memory environment and higher memory use efficiency.
The foregoing embodiments may be modified in many different ways by those skilled in the art without departing from the spirit and scope of the invention; the scope of protection is defined by the appended claims rather than by the preceding embodiments, and all embodiments falling within that scope are covered by the invention.

Claims (9)

1. A hybrid far memory scheduling method under a separated memory architecture, characterized in that runtime data is collected by limiting an application's use of local memory, whereby tasks are divided into far-memory-insensitive tasks, far-memory-sensitive tasks and far-memory-forbidden tasks; the far-memory-insensitive tasks and far-memory-sensitive tasks are allocated to the same computing node according to a sensitivity complementation principle, each task yields as much memory as possible under the same performance limit, cross-node memory resource adjustment is performed when the difference in overall yieldable memory between servers is large, determining the memory each server yields or the far memory it rents; intra-node memory resource adjustment is then performed, allocating resources to each task according to the server's current remaining memory resources and the principle that sensitive tasks receive more additional local memory resources, thereby realizing hybrid far memory scheduling.
2. The method according to claim 1, characterized in that the hybrid far memory comprises transverse far memory and longitudinal far memory, wherein: the transverse far memory comprises remote DRAM space accessed over an RDMA network card connection, and the longitudinal far memory comprises storage space on the same server, such as magnetic disk and SSD devices, accessed through the Linux swap mechanism and the I/O interface; the total memory space Mem of an application is the sum of three parts: the local memory space Mem_lm, the transverse far memory space Mem_hfm and the longitudinal far memory space Mem_vfm, i.e. Mem = Mem_lm + Mem_hfm + Mem_vfm.
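The three-part memory decomposition can be illustrated with a small sketch; the `TaskMemory` name and the MiB units are illustrative assumptions, not part of the patent.

```python
from dataclasses import dataclass

@dataclass
class TaskMemory:
    """One task's hybrid memory layout (all sizes in MiB)."""
    lm: int   # local DRAM
    hfm: int  # transverse far memory: remote DRAM over RDMA
    vfm: int  # longitudinal far memory: same-server disk/SSD via swap

    @property
    def total(self) -> int:
        # Mem = Mem_lm + Mem_hfm + Mem_vfm
        return self.lm + self.hfm + self.vfm
```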
3. The method according to claim 1, characterized in that allocating the far-memory-insensitive tasks and far-memory-sensitive tasks to the same computing node means: according to the minimum local memory space Min_lm^i of each task i, its yieldable memory value SI_i, the server id and the server's remaining memory capacity C_j, calculating the server node id on which each task should be placed, specifically:
i) when the total remaining resource Res, i.e. the sum of the remaining capacities C_j of all servers, is larger than the minimum resource allocation of the current task group Min_Allo, i.e. the sum of the minimum local memory values Min_lm^i of all tasks in the group, performing step ii);
ii) sorting all tasks in the current task group by the difference between each task's maximum yieldable memory value SI_i and the in-group average yieldable memory SI_avg, i.e. by SI_i − SI_avg; allocating according to each task's minimum local memory Min_lm^i using a knapsack algorithm, and preferentially placing tasks whose SI_i − SI_avg values are close in magnitude but opposite in sign on the same server;
iii) calculating the server's predicted remaining capacity C_j in real time; when the predicted remaining capacity can no longer hold the minimum local memory Min_lm^i of the next task to be placed, the current server is considered full, and tasks are placed on the next server;
iv) returning to step i) until every task has been traversed.
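The placement loop of steps i)-iv) can be sketched as follows. This is an illustrative greedy first-fit pass standing in for the knapsack formulation of the claim; the `Task` and `place_tasks` names are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Task:
    min_lm: int  # minimum local memory Min_lm^i
    si: int      # maximum yieldable memory SI_i

def place_tasks(tasks, capacities):
    """Return {task index: server id}, steering tasks whose SI deviation
    from the group mean is large and of opposite sign onto the same server."""
    # i) feasibility: total remaining resources must cover Min_Allo
    if sum(capacities) < sum(t.min_lm for t in tasks):
        raise ValueError("insufficient total memory for the task group")
    si_avg = sum(t.si for t in tasks) / len(tasks)
    # ii) order by |SI_i - SI_avg| descending so that large positive and
    # large negative deviations get interleaved onto the same node
    order = sorted(range(len(tasks)), key=lambda i: -abs(tasks[i].si - si_avg))
    remaining = list(capacities)
    placement = {}
    for i in order:
        # iii) pick a server whose predicted remaining capacity C_j still
        # holds the task's minimum local memory (first-fit here)
        server = next(j for j, c in enumerate(remaining)
                      if c >= tasks[i].min_lm)
        remaining[server] -= tasks[i].min_lm
        placement[i] = server
    return placement
```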
4. The method according to claim 1 or 3, characterized in that the computing node proportionally adjusts, for each task, the local memory usage and the transverse and longitudinal far-memory usage according to the task's maximum yieldable memory.
5. The method according to claim 1, characterized in that the cross-node memory resource adjustment comprises:
i) calculating each server's overall yieldable memory value ServerSI, i.e. the difference between the server's remaining memory capacity before tasks are allocated and the minimum local memory resources of the allocated tasks: ServerSI_j = C_j − Min_Allo;
ii) calculating the average ServerSI_avg of the overall yieldable memory ServerSI_j across servers; when ServerSI_j − ServerSI_avg > 0, server j needs to yield ServerSI_j − ServerSI_avg of memory capacity; otherwise server j needs to borrow |ServerSI_j − ServerSI_avg| of memory capacity.
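The cross-node computation can be sketched in a few lines; the function name `cross_node_adjust` and the list-based interface are illustrative assumptions.

```python
def cross_node_adjust(capacities, min_allocs):
    """For each server j compute ServerSI_j = C_j - Min_Allo_j and return
    how much memory it yields (positive) or must borrow (negative)
    relative to the cluster average ServerSI_avg."""
    server_si = [c - m for c, m in zip(capacities, min_allocs)]
    si_avg = sum(server_si) / len(server_si)
    # ServerSI_j - ServerSI_avg > 0 -> yield that much memory;
    # otherwise borrow |ServerSI_j - ServerSI_avg|
    return [si - si_avg for si in server_si]
```

By construction the yields and borrows cancel out across the cluster, so the transfers are always balanced.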
6. The method according to claim 1 or 2, characterized in that the intra-node memory resource adjustment is: inside a server node, adjusting the local memory, transverse far memory and longitudinal far memory resource sizes of each task so that sensitive tasks receive a relatively larger proportion of the SI value, with the following specific steps:
i) collecting the minimum local memory value Min_lm^i and yieldable memory value SI_i of each task in the current server, and calculating the maximum allocatable memory resource value of the current server, ServerSI = min(ServerSI_j, ServerSI_avg); a server that yields memory only needs to provide the total yielded memory, these memory spaces being used as transverse far memory by tasks on other servers, with each task's far-memory usage calculated in the following steps;
ii) calculating the increasable local memory resource value Δlm of each task, where each task's share of the increasable local resources is inversely proportional to its own minimum local memory Min_lm^i; the local memory space of a task is then Mem_lm = Min_lm + Δlm;
iii) calculating each task's transverse far memory space Mem_hfm, where only the far-memory-sensitive tasks in the server are considered, the memory being drawn from servers satisfying ServerSI_j − ServerSI_avg > 0;
iv) calculating each task's longitudinal far memory value as Mem_vfm = Mem − Mem_lm − Mem_hfm;
v) returning the final memory allocation of each node.
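Steps i)-v) can be sketched as below. This is a minimal illustration under assumptions: the `adjust_node` name, the dict-based task records, and treating each task's transverse far-memory share as already decided by the cross-node step are not specified by the patent.

```python
def adjust_node(tasks, server_si):
    """tasks: dicts with minimum local memory 'min_lm', total demand 'mem',
    and the transverse far-memory share 'hfm' granted across nodes.
    server_si: the server's allocatable pool ServerSI."""
    # ii) each task's extra local memory d_lm is inversely proportional
    # to its own minimum local memory Min_lm^i
    inv = [1.0 / t["min_lm"] for t in tasks]
    inv_sum = sum(inv)
    plan = []
    for t, w in zip(tasks, inv):
        d_lm = server_si * w / inv_sum
        lm = t["min_lm"] + d_lm        # Mem_lm = Min_lm + d_lm
        hfm = t["hfm"]                 # iii) transverse far memory
        vfm = t["mem"] - lm - hfm      # iv) Mem_vfm = Mem - Mem_lm - Mem_hfm
        plan.append({"lm": lm, "hfm": hfm, "vfm": vfm})
    return plan                        # v) final per-task allocation
```

Note how the task with the smaller minimum local memory receives the larger local-memory increment, matching the inverse-proportion rule of step ii).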
7. The method according to claim 1, 2, 3 or 5, characterized in that when the far-memory sensitivity of a task on a computing node changes, or when one task ends while other tasks have not, the computing node performs the cross-node memory resource adjustment and the intra-node resource adjustment again, releases the related resources after the tasks finish, and recalculates the task allocation for the next task queue;
the change of far-memory sensitivity is judged by periodically detecting the difference in page-fault counts: when the difference is positive and exceeds the average page-fault count more than three consecutive times, the task is considered to have changed from far-memory insensitive to far-memory sensitive; when the difference is negative and its magnitude exceeds the average page-fault count more than three consecutive times, the task is considered to have changed from far-memory sensitive to far-memory insensitive;
performing the cross-node memory resource adjustment and intra-node resource adjustment again means: when a task changes from far-memory sensitive to insensitive, or when one task completes while other tasks have not finished, the change ΔSI of the server's current yieldable memory value SI is obtained first; when the change exceeds a threshold, the cross-node adjustment module is invoked to yield part of the memory for use as far-memory access, and the intra-node adjustment module is then invoked to allocate an appropriate proportion of memory to each running task; conversely, when a task changes from far-memory insensitive to sensitive, ΔSI is likewise obtained first; when the change exceeds the threshold, the cross-node adjustment module is invoked to borrow part of the memory as far-memory access for local tasks, and the intra-node adjustment module is then invoked to allocate an appropriate proportion of far memory to each running task.
8. The method according to claim 1, characterized in that allocating resources to each task comprises allocating the local memory resource size, the transverse far memory resource size and the longitudinal far memory size.
9. A system for implementing the hybrid far memory scheduling method under a separated memory architecture according to any one of claims 1 to 8, comprising: an application sensitivity analysis unit based on far memory, a task grouping unit based on sensitivity and task characteristics, a computing node selection unit based on load balancing, a cross-node memory resource adjustment unit and an intra-node memory resource adjustment unit, wherein: the application sensitivity analysis unit collects runtime data online by limiting an application's use of local memory and calculates the application parameters related to sensitivity; the task grouping unit performs far-memory sensitivity analysis and calculation on the collected runtime data and related parameters, dividing tasks into far-memory-insensitive, far-memory-sensitive and far-memory-forbidden tasks; the computing node selection unit allocates far-memory-insensitive and far-memory-sensitive tasks to the same computing node according to the task sensitivity information collected by the application sensitivity analysis unit and the task grouping unit and the sensitivity complementation principle; the cross-node memory resource adjustment unit calculates each server's overall yieldable memory value from the maximum yieldable memory of its tasks under the same performance limit and, when the difference in overall yieldable memory between servers is large, performs cross-node memory resource adjustment and determines the memory each server yields or the far memory it rents; the intra-node memory resource adjustment unit allocates the local memory resource size to each task according to the server's current remaining memory resources and the principle that sensitive tasks receive more additional local memory resources, determines the transverse far memory resource size by combining the result of the cross-node memory resource adjustment unit, and simultaneously calculates the longitudinal far memory size, finally realizing efficient hybrid far memory scheduling.
CN202211212624.0A 2022-09-30 2022-09-30 Hybrid remote memory scheduling method under separated memory architecture Active CN115495246B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211212624.0A CN115495246B (en) 2022-09-30 2022-09-30 Hybrid remote memory scheduling method under separated memory architecture

Publications (2)

Publication Number Publication Date
CN115495246A true CN115495246A (en) 2022-12-20
CN115495246B CN115495246B (en) 2023-04-18

Family

ID=84471597

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211212624.0A Active CN115495246B (en) 2022-09-30 2022-09-30 Hybrid remote memory scheduling method under separated memory architecture

Country Status (1)

Country Link
CN (1) CN115495246B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104166597A (en) * 2013-05-17 2014-11-26 华为技术有限公司 Remote memory allocation method and device
US9535740B1 (en) * 2015-08-26 2017-01-03 International Business Machines Corporation Implementing dynamic adjustment of resources allocated to SRIOV remote direct memory access adapter (RDMA) virtual functions based on usage patterns
US20170277655A1 (en) * 2016-03-25 2017-09-28 Microsoft Technology Licensing, Llc Memory sharing for working data using rdma
CN109885381A (en) * 2019-02-15 2019-06-14 合肥谐桐科技有限公司 The method and its system of memory share management and running are realized based on KVM virtualization
CN112817887A (en) * 2021-02-24 2021-05-18 上海交通大学 Far memory access optimization method and system under separated combined architecture
CN114756388A (en) * 2022-03-28 2022-07-15 北京航空航天大学 RDMA (remote direct memory Access) -based method for sharing memory among cluster system nodes as required

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHEN Youmin: "A survey of RDMA-based distributed storage systems" *

Also Published As

Publication number Publication date
CN115495246B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
CN107038069B (en) Dynamic label matching DLMS scheduling method under Hadoop platform
WO2021136137A1 (en) Resource scheduling method and apparatus, and related device
US20120221730A1 (en) Resource control system and resource control method
CN103699433B (en) One kind dynamically adjusts number of tasks purpose method and system in Hadoop platform
CN109005130B (en) Network resource allocation scheduling method and device
Zhang et al. Virtual machine placement strategy using cluster-based genetic algorithm
CN107273200B (en) Task scheduling method for heterogeneous storage
CN110018781B (en) Disk flow control method and device and electronic equipment
CN116089051A (en) Task allocation method, device and system
CN111522885A (en) Distributed database system collaborative optimization method based on dynamic programming
CN117349026B (en) Distributed computing power scheduling system for AIGC model training
CN115495246B (en) Hybrid remote memory scheduling method under separated memory architecture
CN110069319B (en) Multi-target virtual machine scheduling method and system for cloud resource management
CN108616583B (en) Storage space allocation method based on computer cloud
CN115357368A (en) MapReduce job scheduling method based on heterogeneous environment perception
CN112988363B (en) Resource scheduling method, device, server and storage medium
CN111208943B (en) IO pressure scheduling system of storage system
CN110580192B (en) Container I/O isolation optimization method in mixed scene based on service characteristics
CN115344358A (en) Resource scheduling method, device and management node
CN113656150A (en) Deep learning computing power virtualization system
JP2012038275A (en) Transaction calculation simulation system, method, and program
CN115379014B (en) Data request distribution method and device and electronic equipment
CN113835869B (en) MPI-based load balancing method, MPI-based load balancing device, computer equipment and storage medium
CN115543222B (en) Storage optimization method, system, equipment and readable storage medium
CN113535388B (en) Task-oriented service function aggregation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant