CN114253457A - Memory control method and device - Google Patents


Info

Publication number
CN114253457A
Authority
CN
China
Prior art keywords
memory
node
control group
job
total occupied
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010996974.5A
Other languages
Chinese (zh)
Inventor
丁肇辉
朱波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202010996974.5A priority Critical patent/CN114253457A/en
Priority to PCT/CN2021/117914 priority patent/WO2022057754A1/en
Publication of CN114253457A publication Critical patent/CN114253457A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0644 Management of space entities, e.g. partitions, extents, pools
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • G06F3/0653 Monitoring storage devices or systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System (AREA)

Abstract

The embodiments of the application provide a memory control method and device applied to a first node. The method comprises: acquiring a first total occupied memory of the jobs included in a job control group, wherein the job control group includes each job that has not finished processing on the first node; and when the first total occupied memory meets a first preset condition, reducing the total occupied memory of the jobs included in the job control group, wherein the first preset condition is that the first total occupied memory is larger than a maximum available memory, or that the first total occupied memory is smaller than or equal to the maximum available memory and larger than a first preset proportion of the maximum available memory. Because every job that has not finished processing on the first node is in the same job control group, the first node can obtain the total occupied memory of those jobs in real time and reduce it when the first preset condition is met, which avoids out-of-memory (OOM) conditions while still making full use of the system memory of the first node.

Description

Memory control method and device
Technical Field
The application relates to the technical field of computers, in particular to a memory control method and device.
Background
The main objective of high performance computing (HPC) services is to increase computation speed and capacity, reaching computation speeds on the order of trillions of calculations per second. HPC can handle large-scale scientific computation and the processing of massive data sets, for example in weather forecasting, automobile simulation, military research, biopharmaceuticals, gene sequencing, and nuclear explosion simulation. Computers that provide high performance computing services may be referred to as "high performance computers" or "HPC computers". An HPC computer carries a heavy load, for example executing a large number of jobs, so controlling its memory is important.
Current memory control methods for HPC computers suffer from memory overflow (out of memory, OOM for short) and from being unable to make full use of the system memory of the HPC computer.
Disclosure of Invention
The embodiments of the application provide a memory control method and device that can make full use of system memory while effectively preventing memory overflow.
In a first aspect, an embodiment of the present application provides a memory control method applied to a first node. The method includes: acquiring a first total occupied memory of the jobs included in a job control group, wherein the job control group includes each job that has not finished processing on the first node; and when the first total occupied memory meets a first preset condition, reducing the total occupied memory of the jobs included in the job control group, wherein the first preset condition includes that the first total occupied memory is larger than a maximum available memory, or the first preset condition includes that the first total occupied memory is smaller than or equal to the maximum available memory and larger than a first preset proportion of the maximum available memory.
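The two branches of the first preset condition can be sketched as follows. This is only an illustrative sketch: the names `Condition`, `check_first_preset_condition`, and `first_preset_ratio` are hypothetical and do not appear in the application, and all memory values are assumed to be in the same unit.

```python
from enum import Enum


class Condition(Enum):
    OK = "ok"                  # neither branch of the first preset condition holds
    NEAR_LIMIT = "near_limit"  # <= maximum, but above the first preset proportion
    OVER_LIMIT = "over_limit"  # above the maximum available memory


def check_first_preset_condition(total_occupied: int,
                                 max_available: int,
                                 first_preset_ratio: float) -> Condition:
    """Classify the group's total occupied memory against both branches."""
    if total_occupied > max_available:
        return Condition.OVER_LIMIT
    if total_occupied > first_preset_ratio * max_available:
        return Condition.NEAR_LIMIT
    return Condition.OK
```

With a maximum available memory of 100 units and a first preset proportion of 90%, a total of 95 units falls in the second branch, while 110 units falls in the first.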
It can be understood that, in the present solution, the first node invokes the control group system monitoring process to obtain the first total occupied memory of each job included in the job control group.
In this solution, there is one job control group, and the first node adds each job it acquires to that job control group. In a specific implementation, after the first node acquires a job, it starts a task management process for managing the job; the first node calls the task management process to start the job and adds the process corresponding to the job to the job control group, thereby binding the job to the job control group.
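As one possible illustration, assuming the job control group is realized as a Linux control group (cgroup v2), binding a process and reading the group's usage could look like the sketch below. The application does not name the kernel interface, so the use of the `cgroup.procs` and `memory.current` files is an assumption, and the function names are hypothetical.

```python
from pathlib import Path


def add_process_to_group(cgroup_dir: Path, pid: int) -> None:
    """Bind a job's process to the job control group by writing its PID.

    Assumes cgroup v2, where writing a PID to cgroup.procs moves the process.
    """
    (cgroup_dir / "cgroup.procs").write_text(f"{pid}\n")


def read_total_occupied_memory(cgroup_dir: Path) -> int:
    """Return the group's current memory usage in bytes.

    Assumes cgroup v2, where memory.current holds the usage of the whole group.
    """
    return int((cgroup_dir / "memory.current").read_text().strip())
```

Because all jobs share one group, a single read returns the total for every job on the node, which is what makes the real-time check inexpensive.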
One method for determining the maximum available memory is as follows: the first node reads a maximum available memory proportion from the configuration file and determines the maximum available memory from that proportion and the total memory of the first node. Alternatively, the maximum available memory may be the system memory of the first node; this solution does not limit this.
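A minimal sketch of this calculation is given below. The configuration format is not specified in the application, so the INI layout, the `[memory]` section, and the `max_available_ratio` key are hypothetical.

```python
import configparser


def load_max_available_ratio(config_text: str) -> float:
    """Read the maximum available memory proportion from INI-style text.

    The section and key names are illustrative assumptions.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser.getfloat("memory", "max_available_ratio")


def max_available_memory(total_memory: int, ratio: float) -> int:
    """Maximum available memory = proportion x total memory of the node."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    return int(total_memory * ratio)
```

For example, a node with 64 GiB of total memory and a proportion of 0.5 would have a maximum available memory of 32 GiB.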
The control group system monitoring process can monitor the job control group in real time, and every job that has not finished processing on the first node is in the same job control group. The first node can therefore obtain, in real time, the total occupied memory of the jobs that have not finished processing on the first node, and reduce the total occupied memory of the jobs included in the job control group when it is larger than the maximum available memory, or is smaller than or equal to the maximum available memory and larger than the first preset proportion of the maximum available memory. The method of this solution thus makes full use of the system memory of the first node while avoiding memory overflow.
In a possible implementation, reducing the total occupied memory of the jobs included in the job control group includes: processing the process corresponding to a first job in the job control group in a first processing manner, wherein the first processing manner is suspension or termination.
In a first alternative, the first job may be a job satisfying the following conditions: the first actually used memory of the first job is larger than the first maximum available memory of the first job, and the memory excess ratio of the first job is the highest. The memory excess ratio of the first job is the ratio of a second difference to the first maximum available memory, where the second difference is the difference between the first actually used memory and the first maximum available memory. This first alternative suspends or terminates, in a targeted manner, the job with the highest memory excess ratio, and ensures the normal execution of the other jobs on the first node.
In a second alternative, the first job may be the job with the highest actually used memory. This second alternative can quickly reduce the total occupied memory of the jobs included in the job control group.
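The two ways of choosing the first job can be sketched as follows. The `Job` record and the function names are hypothetical; `limit` stands for the job's own first maximum available memory.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Job:
    job_id: str
    used: int   # actually used memory of the job
    limit: int  # the job's own maximum available memory


def excess_ratio(job: Job) -> float:
    """Memory excess ratio = (used - limit) / limit."""
    return (job.used - job.limit) / job.limit


def pick_by_excess_ratio(jobs: Sequence[Job]) -> Optional[Job]:
    """First alternative: among jobs over their limit, pick the highest excess ratio."""
    over_limit = [job for job in jobs if job.used > job.limit]
    return max(over_limit, key=excess_ratio, default=None)


def pick_by_usage(jobs: Sequence[Job]) -> Optional[Job]:
    """Second alternative: pick the job with the highest actually used memory."""
    return max(jobs, key=lambda job: job.used, default=None)
```

The two selections can differ: a small job that is 50% over its limit is chosen by the first alternative even when a much larger job is only 20% over its own limit, whereas the second alternative always chooses the larger job.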
Optionally, before the first node processes the process corresponding to the first job in the job control group in the first processing manner, the method further includes: the first node reads processing manner indication information from the configuration file, wherein the processing manner indication information indicates the first processing manner. That is, the first node may obtain the first processing manner from the processing manner indication information read from the configuration file. This option lets the first node determine how to process the first job, so that the first job is processed correctly, and makes it easier for the first node to manage jobs.
In this solution, the total occupied memory of the jobs included in the job control group is reduced by suspending or terminating the first job, which ensures the normal execution of the jobs other than the first job.
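On a POSIX system, suspension and termination could be realized with signals, as in the sketch below. The application does not specify the mechanism, so the choice of `SIGSTOP` and `SIGKILL`, the mode strings, and the function names are assumptions made for illustration.

```python
import os
import signal

# Assumed mapping from the configured processing manner to a POSIX signal.
_SIGNALS = {
    "suspend": signal.SIGSTOP,    # stops the process; its memory stays allocated
    "terminate": signal.SIGKILL,  # ends the process; its memory is released
}


def signal_for(mode: str) -> signal.Signals:
    """Translate the processing manner indication into a signal."""
    try:
        return _SIGNALS[mode]
    except KeyError:
        raise ValueError(f"unknown processing manner: {mode!r}") from None


def apply_processing_manner(pid: int, mode: str) -> None:
    """Suspend or terminate the process corresponding to the first job."""
    os.kill(pid, signal_for(mode))
```

A suspended process does not free its memory by itself; under this assumption, suspension mainly stops further growth and lets the kernel reclaim or swap out the idle pages.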
In a possible implementation, before processing the process corresponding to the first job in the job control group in the first processing manner, the method further includes: determining that the first total occupied memory is less than or equal to the maximum available memory; and determining first memory information according to the first total occupied memory and the maximum available memory, wherein the first memory information indicates that the first total occupied memory is larger than the first preset proportion of the maximum available memory.
Optionally, the first memory information includes: a first remaining total available memory proportion is smaller than a second preset proportion, where the sum of the first preset proportion and the second preset proportion is 100%. The first remaining total available memory proportion is the ratio of a first difference to the maximum available memory, and the first difference is the difference between the maximum available memory and the first total occupied memory. The second preset proportion is the minimum threshold for the remaining total memory proportion of the first node, and may be read by the first node from the configuration file.
Optionally, the first memory information includes: a used total memory proportion is greater than the first preset proportion. The used total memory proportion is the ratio of the first total occupied memory to the maximum available memory. The first preset proportion is the threshold for the used total memory proportion of the first node, and may be read by the first node from the configuration file.
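The two forms of the first memory information are equivalent, which the short derivation below makes explicit. The symbols M, T, p_1, and p_2 are introduced here only for illustration, and the numbers in the worked example are assumed values.

```latex
% M: maximum available memory, T: first total occupied memory
% p_1, p_2: first and second preset proportions, with p_1 + p_2 = 1
\begin{align*}
\frac{M - T}{M} < p_2
  &\iff 1 - \frac{T}{M} < 1 - p_1 \\
  &\iff \frac{T}{M} > p_1 .
\end{align*}
% Worked example (assumed values): M = 100 GB, p_1 = 90\%, p_2 = 10\%, T = 93 GB.
% Used proportion: 93/100 = 93\% > 90\%. Remaining proportion: 7/100 = 7\% < 10\%.
```

Either proportion can therefore be stored in the configuration file; the other follows from the 100% sum.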
That is, in this solution, when the first preset condition is that the first total occupied memory is less than or equal to the maximum available memory and greater than the first preset proportion of the maximum available memory, and the first total occupied memory meets that condition, the process corresponding to the first job in the job control group is processed in the first processing manner. Processing the first job as soon as the first total occupied memory exceeds the first preset proportion of the maximum available memory effectively reduces the probability that the total occupied memory of the jobs included in the job control group exceeds the maximum available memory.
In a possible implementation, before obtaining the first total occupied memory of the jobs included in the job control group, the method further includes: acquiring a second total occupied memory of the jobs included in the job control group; and if the second total occupied memory is larger than the maximum available memory, migrating part of the memory data corresponding to the jobs included in the job control group to the swap partition, so that the total occupied memory of the jobs included in the job control group becomes the first total occupied memory. When the first preset condition is that the first total occupied memory is less than or equal to the maximum available memory and greater than the first preset proportion of the maximum available memory, and the first total occupied memory meets that condition, the process corresponding to the first job in the job control group is processed in the first processing manner, wherein the first processing manner is suspension or termination.
This solution effectively prevents the memory of the first node from overflowing. In addition, because the first job is suspended or terminated when the total occupied memory of the jobs in the job control group is smaller than or equal to the maximum available memory and larger than the first preset proportion of the maximum available memory, the probability that the total occupied memory of the jobs in the job control group exceeds the maximum available memory is reduced. This in turn reduces the probability that part of the memory data corresponding to the job control group is migrated to the swap partition of the first node, and therefore the probability that job execution time is prolonged by such migration.
In a possible implementation, the first processing manner is suspension, and the first preset condition is that the first total occupied memory is less than or equal to the maximum available memory and greater than the first preset proportion of the maximum available memory. After the process corresponding to the first job in the job control group is processed in the first processing manner when the first total occupied memory meets the first preset condition, the method further includes: acquiring a third total occupied memory of the jobs included in the job control group; determining that the third total occupied memory is smaller than a third preset proportion of the maximum available memory, wherein the third preset proportion is smaller than or equal to the first preset proportion; and waking the process corresponding to the first job.
That is, in this solution, when the first preset condition is that the first total occupied memory is less than or equal to the maximum available memory and greater than the first preset proportion of the maximum available memory, after the first node suspends the process corresponding to the first job, if it detects that the third total occupied memory is less than the third preset proportion of the maximum available memory, it may wake the process corresponding to the first job so that the first job continues to execute, which improves the reliability of the first node.
In a possible implementation, the first processing manner is suspension, and the first preset condition is that the first total occupied memory is greater than the maximum available memory. After the process corresponding to the first job in the job control group is processed in the first processing manner when the first total occupied memory meets the first preset condition, the method further includes: acquiring a third total occupied memory of the jobs included in the job control group; determining that the third total occupied memory is smaller than a fourth preset proportion of the maximum available memory; and waking the process corresponding to the first job.
That is, in this solution, when the first preset condition is that the first total occupied memory is greater than the maximum available memory, after the first node suspends the process corresponding to the first job, if it detects that the third total occupied memory is less than the fourth preset proportion of the maximum available memory, it may wake the process corresponding to the first job so that the first job continues to execute, which improves the reliability of the first node.
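The suspend and wake thresholds form a simple hysteresis loop, sketched below. The function name, the returned strings, and the parameter names are hypothetical; `resume_ratio` stands for the third (or fourth) preset proportion and `suspend_ratio` for the proportion above which the first job is suspended.

```python
def next_action(total_occupied: int,
                max_available: int,
                suspend_ratio: float,
                resume_ratio: float,
                suspended: bool) -> str:
    """Decide what to do with the first job on one monitoring pass.

    Returns "suspend", "wake", or "none". Keeping resume_ratio at or below
    suspend_ratio prevents the job from flapping between the two states.
    """
    if resume_ratio > suspend_ratio:
        raise ValueError("resume_ratio must not exceed suspend_ratio")
    if not suspended and total_occupied > suspend_ratio * max_available:
        return "suspend"
    if suspended and total_occupied < resume_ratio * max_available:
        return "wake"
    return "none"
```

With a maximum available memory of 100 units, a suspend proportion of 90%, and a resume proportion of 70%, a job suspended at 95 units stays suspended at 80 units and is woken only once the total drops below 70 units.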
In a possible implementation, reducing the total occupied memory of the jobs included in the job control group includes: migrating at least part of the memory data corresponding to at least one job included in the job control group to the swap partition, so that the total occupied memory of the jobs included in the job control group is less than or equal to the maximum available memory.
In this solution, at least part of the memory data corresponding to at least one job included in the job control group is migrated to the swap partition, so that the total occupied memory of the jobs included in the job control group is less than or equal to the maximum available memory, which effectively prevents memory overflow.
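How much data has to move can be sketched as follows. The application does not state how the amount is divided among jobs, so the largest-first policy below is purely an assumption, as are the function and parameter names; the actual page migration would be performed by the operating system.

```python
from typing import Dict


def plan_swap_migration(usage: Dict[str, int], max_available: int) -> Dict[str, int]:
    """Return how much memory to migrate per job so the total fits the maximum.

    usage maps a job identifier to its occupied memory. Jobs are drained
    largest first (an illustrative policy, not taken from the application).
    """
    deficit = sum(usage.values()) - max_available
    plan: Dict[str, int] = {}
    for job_id, used in sorted(usage.items(), key=lambda item: item[1], reverse=True):
        if deficit <= 0:
            break
        moved = min(used, deficit)
        plan[job_id] = moved
        deficit -= moved
    return plan
```

When the total is already at or below the maximum available memory, the plan is empty and nothing is migrated.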
In a possible implementation, after the first node migrates part of the memory data corresponding to the jobs included in the job control group to the swap partition, the method further includes: if a first occupied storage space, which is the storage space of the swap partition occupied by the job control group, is larger than the maximum available storage space of the swap partition, processing the process corresponding to a second job in the job control group in the first processing manner, so that the storage space of the swap partition occupied by the job control group becomes smaller than or equal to the maximum available storage space, wherein the first processing manner is suspension or termination.
The second job is any one of the jobs in the job control group that occupy the swap partition; or the second job is the job in the job control group that occupies the most storage space of the swap partition. The first node may determine the maximum available storage space of the swap partition as follows: the first node reads, from the configuration file, a first ratio of the maximum available memory to the maximum available storage space of the swap partition, and determines the maximum available storage space of the swap partition according to the first ratio and the maximum available memory.
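The swap limit and the choice of the second job can be sketched as follows. The function names are hypothetical, and the sketch takes the first ratio to be memory divided by swap space, so the swap limit is the maximum available memory divided by that ratio.

```python
from typing import Dict, Optional


def max_swap_space(max_available_memory: int, memory_to_swap_ratio: float) -> int:
    """Maximum available swap space = maximum available memory / first ratio."""
    if memory_to_swap_ratio <= 0:
        raise ValueError("ratio must be positive")
    return int(max_available_memory / memory_to_swap_ratio)


def pick_second_job(swap_usage: Dict[str, int]) -> Optional[str]:
    """Pick the job occupying the most swap space, ignoring jobs that use none."""
    using_swap = {job_id: used for job_id, used in swap_usage.items() if used > 0}
    if not using_swap:
        return None
    return max(using_swap, key=using_swap.get)
```

For example, with a maximum available memory of 64 GiB and a first ratio of 2, the job control group may occupy at most 32 GiB of the swap partition.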
The scheme can prevent the storage space of the swap partition from being excessively used.
In a second aspect, an embodiment of the present application provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first aspect or any of its possible implementations.
In a third aspect, an embodiment of the present application provides a memory control system, including a second node and at least one first node. The second node is configured to send job information to the first node. The first node is configured to obtain a job based on the job information and add the job to a job control group, wherein the job control group includes each job that has not finished processing on the first node. The first node is further configured to: obtain a first total occupied memory of the jobs included in the job control group; and when the first total occupied memory meets a first preset condition, reduce the total occupied memory of the jobs included in the job control group, wherein the first preset condition includes that the first total occupied memory is larger than a maximum available memory, or the first preset condition includes that the first total occupied memory is smaller than or equal to the maximum available memory and larger than a first preset proportion of the maximum available memory.
In a possible implementation, the first node is specifically configured to: call a control group system monitoring process to acquire the first total occupied memory of the jobs included in the job control group.
In a possible implementation manner, the number of the job control groups is one, and each job acquired by the first node is added to the job control group by the first node.
In a possible implementation, the first node is specifically configured to: process the process corresponding to a first job in the job control group in a first processing manner, wherein the first processing manner is suspension or termination.
In a possible implementation, before the first node processes the process corresponding to the first job in the job control group in the first processing manner, the first node is further configured to: determine that the first total occupied memory is less than or equal to the maximum available memory; and determine first memory information according to the first total occupied memory and the maximum available memory, wherein the first memory information indicates that the first total occupied memory is larger than the first preset proportion of the maximum available memory.
In one possible implementation, the first memory information includes: a used total memory proportion is greater than the first preset proportion, where the used total memory proportion is the ratio of the first total occupied memory to the maximum available memory.
In one possible implementation, the first node is further configured to: read the first preset proportion from the configuration file.
In one possible implementation, the first memory information includes: a first remaining total available memory proportion is smaller than a second preset proportion, where the sum of the first preset proportion and the second preset proportion is 100%; the first remaining total available memory proportion is the ratio of a first difference to the maximum available memory, and the first difference is the difference between the maximum available memory and the first total occupied memory.
In one possible implementation, the first node is further configured to: read the second preset proportion from the configuration file.
In a possible implementation, before the first node acquires the first total occupied memory of the jobs included in the job control group, the first node is further configured to: acquire a second total occupied memory of the jobs included in the job control group; and if the second total occupied memory is larger than the maximum available memory, migrate part of the memory data corresponding to the jobs included in the job control group to the swap partition, so that the total occupied memory of the jobs included in the job control group becomes the first total occupied memory.
In one possible implementation, the first actually used memory of the first job is larger than the first maximum available memory of the first job, and the memory excess ratio of the first job is the highest; the memory excess ratio of the first job is the ratio of a second difference to the first maximum available memory, and the second difference is the difference between the first actually used memory and the first maximum available memory.
In a possible implementation, the first processing manner is suspension, and after the first node processes the process corresponding to the first job in the job control group in the first processing manner, the first node is further configured to: acquire a third total occupied memory of the jobs included in the job control group; determine that the third total occupied memory is smaller than a third preset proportion of the maximum available memory, wherein the third preset proportion is smaller than or equal to the first preset proportion; and wake the process corresponding to the first job.
In a possible implementation, before the first node processes the process corresponding to the first job in the job control group in the first processing manner, the first node is further configured to: read processing manner indication information from the configuration file, wherein the processing manner indication information indicates the first processing manner.
In a possible implementation, the first node is specifically configured to: migrate at least part of the memory data corresponding to at least one job included in the job control group to the swap partition, so that the total occupied memory of the jobs included in the job control group is less than or equal to the maximum available memory.
In one possible implementation, the first node is further configured to: read a maximum available memory proportion from the configuration file; and determine the maximum available memory according to the maximum available memory proportion and the total memory of the first node.
In a possible implementation manner, the first node is a computing node or a cloud server in a distributed computing system, and the second node is a management node in the distributed computing system; the job information includes the job.
In a possible implementation manner, the second node is a terminal device, and the first node is an application server; the job information includes a user request, the job being for execution of the user request.
In a fourth aspect, an embodiment of the present application provides a memory control device, including an acquisition module and a processing module. The acquisition module is configured to acquire a first total occupied memory of the jobs included in a job control group, wherein the job control group includes each job that has not finished processing on the first node. The processing module is configured to reduce the total occupied memory of the jobs included in the job control group when the first total occupied memory meets a first preset condition, wherein the first preset condition includes that the first total occupied memory is greater than a maximum available memory, or the first preset condition includes that the first total occupied memory is less than or equal to the maximum available memory and greater than a first preset proportion of the maximum available memory.
In a possible implementation, the acquisition module is specifically configured to: call a control group system monitoring process to acquire the first total occupied memory of the jobs included in the job control group.
In one possible implementation manner, the number of the job control groups is one, and each job acquired by the first node is added to the job control group by the first node.
In one possible implementation, the processing module is specifically configured to: and processing the process corresponding to the first job in the job control group by adopting a first processing mode, wherein the first processing mode is suspension or termination.
In one possible implementation, before the processing module processes a process corresponding to a first job in the job control group by using a first processing manner, the processing module is further configured to: determining that the first total occupied memory is less than or equal to the maximum used memory; and determining first memory information according to the first total occupied memory and the maximum used memory, wherein the first memory information indicates that the first total occupied memory is larger than a first preset proportion of the maximum available memory.
In one possible implementation, the first memory information includes: the proportion of the used total memory is greater than the first preset proportion; the used total memory proportion is the ratio of the first total occupied memory to the maximum available memory.
In one possible implementation, the processing module is further configured to: and reading the first preset proportion in the configuration file.
In one possible implementation, the first memory information includes: the proportion of the first remaining total available memory is smaller than a second preset proportion, and the sum of the first preset proportion and the second preset proportion is 100%; the first remaining total available memory ratio is a ratio of a first difference value to the maximum available memory, and the first difference value is a difference value between the maximum available memory and the first total occupied memory.
In one possible implementation, the processing module is further configured to: and reading the second preset proportion in the configuration file.
In one possible implementation manner, before the acquisition module acquires the first total occupied memory of the jobs included in the job control group, the acquisition module is further configured to: acquire a second total occupied memory of the jobs included in the job control group; and if the second total occupied memory is greater than the maximum available memory, the processing module is further configured to: migrate part of the memory data corresponding to the jobs included in the job control group to the swap partition, so that the total occupied memory of the jobs included in the job control group becomes the first total occupied memory.
In one possible implementation, the first actually used memory of the first job is larger than the first maximum available memory of the first job, and the memory excess proportion of the first job is the highest; the memory excess proportion of the first job is the ratio of a second difference value to the first maximum available memory, and the second difference value is the difference between the first actually used memory and the first maximum available memory.
In a possible implementation manner, the first processing manner is suspension, and after the processing module processes the process corresponding to the first job in the job control group by using the first processing manner, the acquisition module is further configured to: acquire a third total occupied memory of the jobs included in the job control group; and the processing module is further configured to: determine that the third total occupied memory is smaller than a third preset proportion of the maximum available memory, wherein the third preset proportion is smaller than or equal to the first preset proportion; and wake up the process corresponding to the first job.
In one possible implementation, before the processing module processes a process corresponding to a first job in the job control group by using a first processing manner, the processing module is further configured to: and reading processing mode indication information in the configuration file, wherein the processing mode indication information indicates the first processing mode.
In one possible implementation, the processing module is specifically configured to: migrate at least part of the memory data corresponding to at least one job included in the job control group to the swap partition, so that the total occupied memory of the jobs included in the job control group is less than or equal to the maximum available memory.
In a possible implementation manner, after the processing module migrates part of the memory data corresponding to the jobs included in the job control group to the swap partition, the processing module is further configured to: if a first occupied storage space, which is the storage space of the swap partition occupied by the job control group, is larger than the maximum available storage space of the swap partition, process the process corresponding to a second job in the job control group by using a first processing manner, so that the storage space of the swap partition occupied by the job control group is smaller than or equal to the maximum available storage space, where the first processing manner is suspension or termination.
In a possible implementation manner, the second job is any one of the jobs in the job control group that occupy the swap partition; or the second job is the job in the job control group that occupies the most storage space of the swap partition.
In one possible implementation, the processing module is further configured to: read, from a configuration file, a first ratio of the maximum available memory to the maximum available storage space of the swap partition; and determine the maximum available storage space of the swap partition according to the first ratio and the maximum available memory.
In one possible implementation, the processing module is further configured to: read the maximum available memory proportion from the configuration file; and determine the maximum available memory according to the maximum available memory proportion and the total memory of the first node.
In a fifth aspect, an embodiment of the present application provides a storage medium, where the storage medium includes a computer program, and the computer program is used to implement the method described in the first aspect or any possible implementation manner of the first aspect.
In a sixth aspect, an embodiment of the present application provides a chip, which includes a processor, a memory, and a communication interface, where the processor and the memory are connected to the communication interface, and the processor is configured to read and execute a computer program stored in the memory to perform the method described in the first aspect or any possible implementation manner of the first aspect.
Drawings
FIG. 1 is a schematic diagram illustrating a memory control method according to the prior art;
FIG. 2 is a schematic diagram illustrating another conventional memory control method;
FIG. 3A is a diagram of a system architecture according to an embodiment of the present application;
FIG. 3B is a specific system architecture diagram of a distributed computing system according to an embodiment of the present application;
FIG. 4 is a schematic block diagram of an electronic device according to an embodiment of the present application;
FIG. 5 is a first flowchart of a memory control method according to an embodiment of the present application;
FIG. 6 is a second flowchart of a memory control method according to an embodiment of the present application;
FIG. 7 is a schematic diagram of an embodiment of the method shown in FIG. 6;
FIG. 8 is a process diagram of the embodiment of the method shown in FIG. 6;
FIG. 9 is a third flowchart of a memory control method according to an embodiment of the present application;
FIG. 10 is a fourth flowchart of a memory control method according to an embodiment of the present application;
FIG. 11 is a fifth flowchart of a memory control method according to an embodiment of the present application;
FIG. 12 is a schematic block diagram of a memory control device according to an embodiment of the present application;
FIG. 13 is a schematic block diagram of a memory control system according to an embodiment of the present application.
Detailed Description
First, elements related to the present application will be described.
1. Distributed computing system: a distributed computing system includes a plurality of computers interconnected by a network, such as a plurality of HPC computers interconnected by a network. The plurality of computers in the distributed computing system may include at least one management node for receiving jobs submitted by users and distributing the jobs to the computing nodes, and a plurality of computing nodes for executing the jobs distributed by the management node. A job may be distributed to a compute node or multiple compute nodes for execution.
2. Job (job): the set of program instances that need to be executed to complete a particular computing service typically corresponds to a set of processes, containers, or other runtime entities on one or more computers. That is, one job corresponds to a plurality of processes. For example, in a distributed management system, after receiving a request from a user, a management node generates at least one job for processing the request, and distributes the job to a computing node for execution. For another example, in some other scenarios, after receiving a request of a user for an application, the terminal device sends the request to an application server of the application, and the application server generates at least one job for processing the request and executes the at least one job.
3. Memory overuse (overload): may cause multiple jobs on the same machine to compete viciously for resources, so that jobs execute slowly or exit abnormally. For example, the processor is occupied by other jobs and system processes (e.g., nfsd) are affected, which slows system service responses; in severe cases, the machine may go down because it runs out of memory (OOM).
4. Underutilization of memory (underuse): low utilization of memory resources may result in longer job queuing latency and reduced throughput.
5. Control group (cgroup): a mechanism for controlling the use of resources by a particular set of processes. A cgroup binds a set of processes to one or more subsystems. A subsystem is a module that manages sets of processes through the tools and interfaces provided by cgroup. A subsystem is typically a "resource controller" that schedules resources or enforces an upper bound on resource usage.
For a better understanding of the present application, the technical problems that exist at present are explained below.
The job memory limit (Job Limit) method is a memory control method applied in a distributed computing system. Referring to FIG. 1, in this method a management node receives a job submitted by a user, where the job carries a job-level limit, that is, the maximum available memory of the job; the management node allocates the job to a computing node together with the maximum available memory of the job. While executing the job, the computing node periodically monitors the actually used memory of the job, and if the actually used memory of the job is larger than the maximum available memory of the job, the computing node suspends or terminates the job. For example, if the maximum available memory of a job is 10G, the job is suspended or terminated when its actually used memory exceeds 10G.
This method has two drawbacks. On one hand, the computing node monitors the actually used memory of the job only periodically rather than in real time, so the actually used memory of the job may exceed the maximum available memory of the job between two monitoring points, and OOM may occur. On the other hand, when the actually used memory of a job is larger than the maximum available memory of the job, the job is suspended or terminated even if the system of the computing node has a large amount of idle memory at that time, so the memory cannot be fully utilized.
Hard resource restriction is another memory control method applied in distributed computing systems. Referring to FIG. 2, in this method a computing node creates a control group (cgroup) for each job and calls a control group system monitoring process to monitor the actually used memory of the job. When the actually used memory of the job is greater than the maximum available memory of the job, part of the memory data of the job is migrated to a swap partition. When the actually used memory of the job is greater than the maximum available memory of the job and the storage space of the swap partition occupied by the job is greater than the maximum available swap storage space of the job (job swap partition), the computing node suspends or terminates the job.
In this method, because a control group is created for each job, the computing node can call the control group system monitoring process to monitor the actually used memory of the job in real time, so OOM is almost entirely avoided. However, when the actually used memory of a job is larger than the maximum available memory of the job, part of the memory data of the job is migrated to the swap partition even if the system of the computing node has a large amount of idle memory at that time, so the memory still cannot be fully utilized.
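The second drawback can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the names `Job` and `jobs_over_limit` and the sample figures are hypothetical and do not come from the application. The sketch applies the per-job check to one sampled snapshot and shows that a job is flagged even though the node as a whole still has idle memory.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class Job:
    name: str
    used_bytes: int   # actually used memory at the sampling instant
    limit_bytes: int  # job-level limit (maximum available memory of the job)


def jobs_over_limit(jobs):
    """Return the names of jobs whose actually used memory exceeds their own limit."""
    return [job.name for job in jobs if job.used_bytes > job.limit_bytes]


# Hypothetical snapshot on a node with 100 GiB of memory.
node_total = 100 * GIB
snapshot = [Job("job_a", 11 * GIB, 10 * GIB), Job("job_b", 4 * GIB, 10 * GIB)]

flagged = jobs_over_limit(snapshot)                        # jobs to suspend or terminate
idle = node_total - sum(j.used_bytes for j in snapshot)    # memory nobody is using
```

Here `job_a` is flagged although 85 GiB of the node's memory is idle, which is the underutilization the paragraph above describes.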
In order to solve the above technical problem, the method of the embodiment of the present application is proposed.
Fig. 3A is a system architecture diagram according to an embodiment of the present application. Referring to fig. 3A, the system architecture includes at least one second node and at least one first node. The second node may be a management node in the distributed computing system, a terminal device, or another server; the first node may be a computing node or a cloud server in a distributed computing system, or may be another server, such as a cloud server or an application server. For example, when the second node is a management node in the distributed computing system, the first node may be a computing node or a cloud server in the distributed computing system and the number of the first nodes is plural. For another example, when the second node is a terminal device, the first node may be an application server.
The first node runs jobs. A job on the first node may be allocated by the second node, or may be generated by the first node based on a request sent by the second node.
For example, when the method of the present embodiment is applied to a distributed computing system, a specific system architecture diagram can be shown in FIG. 3B. Referring to FIG. 3B, the system architecture includes a management node and a computing node, and may further include a cloud server, where the computing node and the cloud server may each be a first node shown in FIG. 3A, and the management node may be a second node shown in FIG. 3A.
Fig. 4 is a schematic block diagram of an electronic device according to an embodiment of the present application. The electronic device of this embodiment may be the first node in the system architecture, or may also be a chip, a chip system, or a processor, which supports the first node to implement the following method, and this electronic device may be used to implement the method corresponding to the first node described in the following method embodiment, and specifically, refer to the description in the method embodiment.
The electronic device may comprise one or more processors 401, where the processors 401 may also be referred to as processing units and may implement certain control functions. The processor 401 may be a general purpose processor or a special purpose processor, etc.
In an alternative design, the processor 401 may also store instructions and/or data 403, and the instructions and/or data 403 may be executed by the processor to cause the electronic device to perform the methods described in the method embodiments below.
In another alternative design, the processor 401 may include a transceiver unit to perform receive and transmit functions. The transceiver unit may be, for example, a transceiver circuit, an interface, or an interface circuit. The transceiver circuits, interfaces, or interface circuits used to implement the receive and transmit functions may be separate or integrated. The transceiver circuit, interface, or interface circuit may be used for reading and writing code/data, or for transmitting or transferring signals.
Optionally, the electronic device may include one or more memories 402, on which instructions 404 may be stored, the instructions being executable on the processor to cause the electronic device to perform the methods described in the method embodiments described below. Optionally, the memory may further store data therein. Optionally, instructions and/or data may also be stored in the processor. The processor and the memory may be provided separately or may be integrated together. For example, the correspondence described in the method embodiments described below may be stored in a memory or in a processor.
Optionally, the electronic device may further comprise a transceiver 405 and/or an antenna 406. The processor 401 may be referred to as a processing unit and controls the electronic device. The transceiver 405 may be referred to as a transceiver unit, a transceiving circuit, or the like, and implements the transceiving function.
The following describes the memory control method according to the present application with specific embodiments.
FIG. 5 is a first flowchart of a memory control method according to an embodiment of the present disclosure, where the execution subject of the embodiment may be the first node in FIG. 3A. Referring to FIG. 5, the method of the present embodiment includes:
Step S501, the first node obtains a first total occupied memory of the jobs included in the job control group, where the job control group includes each job on the first node that has not finished processing.
In the initialization phase of the first node, the first node creates a job control group. When the first node acquires a job, the first node allocates the job to the job control group, that is, binds the process corresponding to the job to the job control group. In a specific implementation, after the first node obtains the job, a task management process for managing the job is started; the first node calls the task management process to start the job and adds the process corresponding to the job to the job control group, thereby binding the job to the job control group. In other words, the first node has one job control group, every job acquired by the first node is assigned to it, and it therefore contains all jobs that have not finished processing on the first node, including both jobs that are being executed and jobs that are suspended.
The first node can call the control group system monitoring process to monitor the first total occupied memory of the jobs included in the job control group. Because the control group system monitoring process monitors the job control group in real time, the first node can obtain the first total occupied memory of the jobs included in the job control group, that is, of all jobs that have not finished processing on the first node, in real time.
Step S502, when the first total occupied memory meets a first preset condition, the first node reduces the total occupied memory of the jobs included in the job control group, where the first preset condition includes that the first total occupied memory is greater than the maximum available memory, or the first preset condition includes that the first total occupied memory is less than or equal to the maximum available memory and greater than a first preset proportion of the maximum available memory.
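As a hedged illustration of how a monitoring process might read the total occupied memory of a single control group, the Python sketch below assumes a Linux memory controller, where the usage of a group is exposed as `memory.current` (cgroup v2) or `memory.usage_in_bytes` (cgroup v1). The application does not state which interface its monitoring process uses, and the function name `read_group_usage_bytes` and the directory used here are hypothetical.

```python
import tempfile
from pathlib import Path

# Usage files of the Linux memory controller: cgroup v2 first, then cgroup v1.
USAGE_FILES = ("memory.current", "memory.usage_in_bytes")


def read_group_usage_bytes(cgroup_dir):
    """Return the memory currently charged to the control group, in bytes."""
    for name in USAGE_FILES:
        usage_file = Path(cgroup_dir) / name
        if usage_file.is_file():
            return int(usage_file.read_text().strip())
    raise FileNotFoundError(f"no memory usage file found in {cgroup_dir}")


# Demonstration against a temporary directory that stands in for the
# (hypothetical) directory of the job control group.
with tempfile.TemporaryDirectory() as fake_group:
    (Path(fake_group) / "memory.current").write_text("1048576\n")
    demo_usage = read_group_usage_bytes(fake_group)
```

Because all unfinished jobs share one group, a single read of this kind yields the total for the whole node's job set rather than one value per job.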
In one arrangement, the maximum available memory may be the total system memory of the first node.
In another scheme, the maximum available memory may be the product of the total system memory of the first node and a maximum available memory proportion (jobgroup.memory.limit_rate). The maximum available memory proportion may be read from the configuration file by the first node; after reading it, the first node multiplies its total system memory by the maximum available memory proportion to obtain the maximum available memory. Optionally, the maximum available memory proportion may be any value from 70% to 95%. For example, if the total system memory of the first node is 100G and the maximum available memory proportion is 90%, the maximum available memory is 90G.
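The two schemes for the maximum available memory can be sketched in Python as follows. This is a minimal sketch, not the application's implementation: the function name `max_available_memory` is hypothetical, and the proportion is passed as an integer percentage (an assumption made here to keep the arithmetic exact).

```python
GIB = 1024 ** 3


def max_available_memory(total_system_bytes, limit_percent=None):
    """Return the maximum available memory of the node in bytes.

    With no proportion configured, the whole system memory is used (first
    scheme); otherwise the system memory is scaled by the configured
    proportion (second scheme), given here as an integer percentage.
    """
    if limit_percent is None:
        return total_system_bytes
    if not 0 < limit_percent <= 100:
        raise ValueError("limit_percent must be in (0, 100]")
    return total_system_bytes * limit_percent // 100


# The example from the text: 100G of system memory and a 90% proportion.
example = max_available_memory(100 * GIB, 90)
```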
The target node can receive a maximum available memory proportion configuration instruction, and the configuration instruction can be used for generating a configuration file. The target node may be the first node, may be the second node, or may be another device, that is, the target node is the first node itself of this embodiment, or the target node is a node that establishes a communication connection with the first node of this embodiment and enables the first node of this embodiment to read the configuration file. Because the configuration file comprises the maximum available memory proportion set by the user, the maximum available memory proportion is read from the configuration file at the first node so as to obtain the maximum available memory, and the maximum available memory determined by the first node is more reasonable.
The first node judges, according to the first total occupied memory and the maximum available memory, whether the first total occupied memory meets the first preset condition, where the first preset condition includes that the first total occupied memory is greater than the maximum available memory, or the first preset condition includes that the first total occupied memory is less than or equal to the maximum available memory and greater than the first preset proportion of the maximum available memory. If the first total occupied memory meets the first preset condition, the first node reduces the total occupied memory of the jobs included in the job control group, for example by suspending or terminating a job, or by migrating part of the memory data to the swap partition.
In this embodiment, all jobs that have not finished processing on the first node are located in the same job control group, so the first node can obtain their total occupied memory in real time. When the total occupied memory is greater than the maximum available memory, or is less than or equal to the maximum available memory and greater than the first preset proportion of the maximum available memory, the first node reduces the total occupied memory of the jobs included in the job control group. In this way OOM is avoided while the system memory of the first node is fully utilized.
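The first preset condition can be expressed as a small classification function. The Python sketch below is illustrative only: the names `Pressure` and `classify` are hypothetical, the proportion is given as an integer percentage, and the 80% value used in the examples is an assumed configuration, not a figure from the application.

```python
from enum import Enum


class Pressure(Enum):
    NONE = "first preset condition not met"
    ABOVE_PROPORTION = "at most the maximum, above the first preset proportion"
    ABOVE_MAXIMUM = "above the maximum available memory"


def classify(total_occupied, max_available, first_percent):
    """Classify the first total occupied memory against the first preset condition."""
    if total_occupied > max_available:
        return Pressure.ABOVE_MAXIMUM
    # Integer comparison of total/max > first_percent/100, avoiding float rounding.
    if total_occupied * 100 > max_available * first_percent:
        return Pressure.ABOVE_PROPORTION
    return Pressure.NONE


# Assumed figures: maximum available memory 90 units, first preset proportion 80%.
states = [classify(t, 90, 80) for t in (60, 72, 80, 90, 95)]
```

Either of the two non-`NONE` results means the node should reduce the total occupied memory of the job control group; which of the two applies determines whether the node suspends or terminates a job or migrates data to the swap partition.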
The embodiment shown in fig. 5 will be described in detail below using specific examples.
FIG. 6 is a second flowchart of the memory control method according to the embodiment of the present application. This embodiment addresses the scenario in which the first preset condition includes that the first total occupied memory is less than or equal to the maximum available memory and greater than a first preset proportion of the maximum available memory. Referring to FIG. 6, the method of the present embodiment differs from the method shown in FIG. 5 in that the specific implementation of "the first node reduces the total occupied memory of each job included in the job control group" in step S502 shown in FIG. 5 is step S603 of this embodiment, and the method of this embodiment further includes step S601 and step S602 before step S603:
Step S601, the first node determines that the first total occupied memory is less than or equal to the maximum available memory.
Step S602, the first node determines first memory information according to the first total occupied memory and the maximum available memory, where the first memory information indicates that the first total occupied memory is greater than a first preset proportion of the maximum available memory.
In one scheme, the first node determines the first memory information according to the first total occupied memory and the maximum available memory through the following steps a1 to a2:
a1, the first node determines the first remaining total available memory proportion according to the first total occupied memory and the maximum available memory.
The first node may first obtain a first difference between the maximum available memory and the first total occupied memory, and determine the ratio of the first difference to the maximum available memory as the first remaining total available memory proportion. Alternatively, the first node may first obtain the used total memory proportion, and determine the difference between 100% and the used total memory proportion as the first remaining total available memory proportion.
a2, the first node determines that the first remaining total available memory ratio is less than a second predetermined ratio.
At this time, the first memory information includes that the first remaining total available memory proportion is smaller than the second preset proportion. The sum of the first remaining total available memory proportion and the used total memory proportion is 100%, and the sum of the second preset proportion and the first preset proportion is 100%. Therefore, if the first remaining total available memory proportion is smaller than the second preset proportion, the used total memory proportion is larger than the first preset proportion. Because the used total memory proportion is the ratio of the first total occupied memory to the maximum available memory, first memory information of this form indicates that the first total occupied memory is larger than the first preset proportion of the maximum available memory.
The second preset proportion is a minimum threshold value of the remaining total memory proportion of the first node. The second preset proportion may be read by the first node from the configuration file. Correspondingly, the target node also receives a second preset proportion configuration instruction, and the configuration instruction is used for generating a configuration file. It is understood that one of the first preset proportion and the second preset proportion may be included in the configuration file.
In another scheme, the first node determines the first memory information according to the first total occupied memory and the maximum available memory through the following steps b1 to b2:
b1, the first node determines the used total memory proportion according to the first total occupied memory and the maximum available memory.
After the first node determines that the first total occupied memory is smaller than or equal to the maximum available memory, it determines the used total memory proportion. The used total memory proportion is the ratio of the first total occupied memory to the maximum available memory.
b2, the first node determines that the ratio of the total used memory is greater than a first preset ratio.
At this time, the first memory information includes: the total used memory ratio is greater than a first predetermined ratio, and the first memory information may indicate that the first total occupied memory is greater than the first predetermined ratio of the maximum available memory because the total used memory ratio is a ratio of the first total occupied memory to the maximum available memory.
The first preset proportion is a maximum threshold value of the used total memory proportion of the first node. The first preset proportion may be read from the configuration file by the first node. Correspondingly, the target node also receives a first preset proportion configuration instruction, and the configuration instruction is used for generating a configuration file.
In schemes b1 to b2, the first memory information includes that the used total memory proportion is greater than the first preset proportion. The used total memory proportion can be determined directly from the first total occupied memory and the maximum available memory, without first obtaining the difference between the maximum available memory and the first total occupied memory and then dividing that difference by the maximum available memory to obtain the remaining total available memory proportion. Therefore, compared with schemes a1 to a2, schemes b1 to b2 obtain the first memory information more efficiently.
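The equivalence of the two schemes follows from the two proportions summing to 100%: (max − total)/max < 1 − p1 holds exactly when total/max > p1. A brief Python sketch, with hypothetical function names and integer percentages chosen here to avoid rounding, checks this on a range of values.

```python
def remaining_below_second(total_occupied, max_available, second_percent):
    """Scheme a1-a2: remaining proportion (max - total) / max < second preset proportion."""
    return (max_available - total_occupied) * 100 < max_available * second_percent


def used_above_first(total_occupied, max_available, first_percent):
    """Scheme b1-b2: used proportion total / max > first preset proportion."""
    return total_occupied * 100 > max_available * first_percent


# Assumed configuration: first preset proportion 80%, hence second preset proportion 20%.
FIRST, SECOND = 80, 20
MAX_AVAILABLE = 90
agree = all(
    remaining_below_second(t, MAX_AVAILABLE, SECOND) == used_above_first(t, MAX_AVAILABLE, FIRST)
    for t in range(MAX_AVAILABLE + 1)
)
```

Scheme b needs one multiplication and comparison on the measured value, whereas scheme a computes a difference first, which matches the efficiency remark above.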
Step S603, the first node processes the process corresponding to the first job in the job control group by using a first processing mode, where the first processing mode is suspend or terminate.
That is, in this embodiment, when the first total occupied memory is less than or equal to the maximum available memory and greater than the first preset proportion of the maximum available memory, the first node suspends or terminates the first job in the job control group. The number of first jobs may be one or more.
In a first alternative, the first job may be a job that satisfies the following condition: the first actually used memory of the first job is larger than the first maximum available memory of the first job, and the memory excess proportion of the first job is the highest. The memory excess proportion of the first job is the ratio of a second difference value to the first maximum available memory, and the second difference value is the difference between the first actually used memory and the first maximum available memory. The first maximum available memory (job-level limit) of the first job may be sent to the first node when the second node allocates the first job to the first node.
The first optional mode can suspend or terminate the operation with the highest memory exceeding ratio in a targeted manner, and ensures the normal execution of other operations on the first node.
In a second alternative, the first job may be the job with the largest actually used memory. This second alternative can quickly reduce the total occupied memory of the jobs included in the job control group.
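The two alternatives for choosing the first job can be sketched as follows. The Python below is a minimal sketch under stated assumptions: the names `JobUsage`, `excess_ratio`, `pick_by_excess_ratio`, and `pick_by_usage` are hypothetical, every job-level limit is assumed to be positive, and the sample figures are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class JobUsage:
    name: str
    used: int   # actually used memory of the job
    limit: int  # maximum available memory of the job (job-level limit), assumed > 0


def excess_ratio(job):
    """Memory excess proportion: (used - limit) / limit."""
    return (job.used - job.limit) / job.limit


def pick_by_excess_ratio(jobs):
    """First alternative: among jobs over their limit, the highest excess proportion."""
    over = [job for job in jobs if job.used > job.limit]
    return max(over, key=excess_ratio, default=None)


def pick_by_usage(jobs):
    """Second alternative: the job with the largest actually used memory."""
    return max(jobs, key=lambda job: job.used, default=None)


jobs = [JobUsage("job_a", 12, 10), JobUsage("job_b", 30, 40), JobUsage("job_c", 6, 4)]
first_choice = pick_by_excess_ratio(jobs)  # job_c: (6 - 4) / 4 = 0.5
second_choice = pick_by_usage(jobs)        # job_b: uses the most memory, within its limit
```

The example shows how the alternatives can diverge: the first targets the job that most exceeds its own limit, while the second frees the most memory even if the chosen job is within its limit.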
Optionally, before the first node processes the process corresponding to the first job in the job control group by using the first processing manner, the method further includes: the first node reads processing mode indication information in the configuration file, wherein the processing mode indication information indicates a first processing mode. The optional mode can enable the first node to determine the processing mode of the first job so as to realize correct processing of the first job, and facilitates management of the first node on the job.
That is, the configuration file includes processing mode indication information for the at least one job that is processed in order to reduce the total occupied memory of the job control group.
Optionally, the first processing manner is suspension, that is, the first node suspends the process corresponding to the first job. After the first node processes the process corresponding to the first job in the job control group by using the first processing manner, the method may further include: the first node acquires a third total occupied memory of the jobs included in the job control group, and wakes up the process corresponding to the first job when the third total occupied memory is smaller than a third preset proportion of the maximum available memory, wherein the third preset proportion is smaller than or equal to the first preset proportion.
When the third preset proportion is different from the first preset proportion, the third preset proportion may be read from the configuration file by the first node. Correspondingly, the target node also receives a third preset proportion configuration instruction input by the user, and the configuration instruction is used for generating a configuration file.
That is to say, after the first node suspends the process corresponding to the first job, if it is monitored that the third total occupied memory is smaller than the third preset proportion of the maximum available memory, the process corresponding to the first job may be awakened, so that the first job continues to be executed, and the reliability of the first node is increased.
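The suspend and wake-up thresholds described above behave like a two-threshold (hysteresis) controller. The following is a minimal sketch under stated assumptions: the class name, field names, and returned strings are hypothetical, and the sketch tracks a single suspended job only.

```python
from dataclasses import dataclass


@dataclass
class SuspendController:
    """Tracks whether the first job is suspended, using two thresholds."""
    max_available: float    # maximum available memory of the job control group
    first_ratio: float      # first preset proportion (suspend threshold)
    third_ratio: float      # third preset proportion (wake-up threshold)
    suspended: bool = False

    def __post_init__(self) -> None:
        # The text requires the third preset proportion <= the first preset proportion.
        if not 0 < self.third_ratio <= self.first_ratio <= 1:
            raise ValueError("expected 0 < third_ratio <= first_ratio <= 1")

    def update(self, total_occupied: float) -> str:
        """Return 'suspend', 'wake', or 'none' for one monitoring sample."""
        if not self.suspended:
            if self.first_ratio * self.max_available < total_occupied <= self.max_available:
                self.suspended = True
                return "suspend"
        elif total_occupied < self.third_ratio * self.max_available:
            self.suspended = False
            return "wake"
        return "none"
```

Choosing a third preset proportion strictly below the first leaves a gap between the two thresholds, so a job that has just been woken up is less likely to be suspended again immediately.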
Optionally, in this embodiment, the method further includes step S500 before step S501: the first node obtains a second total occupied memory of the jobs included in the job control group, and if the second total occupied memory is greater than the maximum available memory, the first node migrates part of the memory data corresponding to the job control group to the swap partition of the first node, so that the total occupied memory of the jobs included in the job control group becomes the first total occupied memory, which is less than or equal to the maximum available memory. The first node may call the control group system monitoring process to obtain the second total occupied memory. When the first node is a computing node in a distributed computing system and the second node is a management node, a schematic diagram of this embodiment may be as shown in fig. 7, and a schematic diagram of the corresponding process may be as shown in fig. 8.
Referring to fig. 8, the first node creates a job control group after it starts up. When the first node receives a job, it adds the job to the job queue, and after the job is started it adds the process corresponding to the job to the job control group. The first node calls the control group system monitoring process to obtain the total occupied memory of the jobs included in the job control group, and if the total occupied memory is greater than the maximum available memory, the first node migrates part of the memory data corresponding to the job control group to the swap partition. The first node then continues to call the control group system monitoring process to obtain the total occupied memory of the jobs included in the job control group. If the total occupied memory is now less than or equal to the maximum available memory, the first node obtains first memory information; if the first memory information indicates that the total occupied memory is greater than the first preset proportion of the maximum available memory, the first node suspends job 1, which has the highest memory excess ratio, to reduce the total occupied memory of the jobs included in the job control group.
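The per-sample decision in the fig. 8 flow can be summarized as a three-way branch. The sketch below is illustrative only; the function name and the returned labels are hypothetical.

```python
def next_action(total_occupied: float, max_available: float, first_ratio: float) -> str:
    """Decide what the first node does for one monitoring sample.

    total_occupied: total occupied memory of the jobs in the job control group
    max_available:  maximum available memory of the job control group
    first_ratio:    first preset proportion, as a fraction in (0, 1]
    """
    if total_occupied > max_available:
        # Above the limit: migrate part of the group's memory data to swap.
        return "migrate_to_swap"
    if total_occupied > first_ratio * max_available:
        # Within the limit but above the early-warning threshold:
        # suspend the job with the highest memory excess ratio.
        return "suspend_job"
    return "none"
```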
In this embodiment, the first job is suspended or terminated when the total occupied memory of the jobs in the job control group is less than or equal to the maximum available memory and greater than the first preset proportion of the maximum available memory. This reduces the probability that the total occupied memory of the jobs in the job control group exceeds the maximum available memory, and therefore reduces the probability of OOM.
In addition, consider the scenario in which part of the memory data corresponding to the job control group is migrated to the swap partition of the first node whenever the total occupied memory of the jobs in the job control group is greater than the maximum available memory. Because this embodiment suspends or terminates the first job before the maximum available memory is exceeded, it reduces the probability that memory data has to be migrated to the swap partition of the first node, and therefore reduces the probability that the execution time of a job is lengthened by such migration.
Fig. 9 is a third flowchart of the memory control method according to the embodiment of the present application, where the present embodiment is applicable to a scenario where the first preset condition includes a condition greater than the maximum available memory. Referring to fig. 9, the present embodiment is different from the embodiment shown in fig. 5 in that: in this embodiment, the specific implementation of the step S502 "the first node reduces the total memory occupied by each job included in the job control group" in the embodiment shown in fig. 5 is the step S902 in this embodiment, and the method of this embodiment further includes, before the step S902, the step S901:
Step S901, the first node determines that the first total occupied memory is greater than the maximum available memory.
Step S902, the first node processes a process corresponding to a first job in the job control group by using a first processing mode, where the first processing mode is suspend or terminate.
For the first processing manner and the first job in this embodiment, refer to the description in the previous embodiment; details are not repeated here.
Optionally, if the first processing manner is suspension, after the first node processes the process corresponding to the first job in the job control group in the first processing manner, the method may further include: the first node obtains a third total occupied memory of the jobs included in the job control group, and wakes up the process corresponding to the first job when the third total occupied memory is less than a fourth preset proportion of the maximum available memory, or is less than the maximum available memory. The fourth preset proportion may or may not be the same as the third preset proportion. Optionally, the fourth preset proportion may be any proportion from 70% to 90%.
In this embodiment, the first node suspends or terminates the first job when the total occupied memory of the jobs in the job control group is greater than the maximum available memory, which reduces the probability of OOM while making full use of the system memory of the first node.
Fig. 10 is a fourth flowchart of a memory control method according to an embodiment of the present application. The embodiment is applicable to a scenario where the first preset condition includes a condition greater than the maximum available memory. Referring to fig. 10, the present embodiment is different from the embodiment shown in fig. 5 in that: in this embodiment, the specific implementation of the step S502 "the first node reduces the total memory occupied by each job included in the job control group" in the embodiment shown in fig. 5 is the step S1002 in this embodiment, and the method of this embodiment further includes, before the step S1002, the step S1001:
Step S1001, the first node determines that the first total occupied memory is greater than the maximum available memory.
In step S1002, the first node migrates at least a portion of the memory data corresponding to the job control group to the swap partition of the first node, so that the total occupied memory of each job included in the job control group is less than or equal to the maximum available memory.
In this embodiment, when the first total occupied memory is greater than the maximum available memory, at least a portion of memory data corresponding to the job control group is migrated to the swap partition of the first node, so that the total occupied memory of each job included in the job control group is less than or equal to the maximum available memory. That is, when the first total occupied memory is less than or equal to the maximum available memory, the first node does not perform any operation.
In this embodiment, when the total occupied memory of the jobs in the job control group is greater than the maximum available memory, the first node migrates at least part of the memory data corresponding to the job control group to the swap partition of the first node, so no job on the first node needs to be suspended or terminated.
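As a small worked illustration of "at least part": the least amount of memory data that must be migrated in step S1002 is the amount by which the first total occupied memory exceeds the maximum available memory. The function name below is hypothetical, and the sketch says nothing about which pages are chosen for migration.

```python
def min_bytes_to_migrate(total_occupied: int, max_available: int) -> int:
    """Smallest amount of memory data to move to swap so that the job
    control group's total occupied memory is <= max_available."""
    return max(0, total_occupied - max_available)
```

When the first total occupied memory does not exceed the maximum available memory, the result is zero, which matches the statement that the first node performs no operation in that case.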
The foregoing embodiment covers the scenario in which the first node migrates at least part of the memory data corresponding to the job control group to the swap partition of the first node when the total occupied memory of the jobs in the job control group is greater than the maximum available memory. To prevent the storage space of the swap partition from being overused in that scenario, this embodiment makes a further improvement on the basis of the foregoing embodiment. Fig. 11 is a fifth flowchart of a memory control method provided in the embodiment of the present application. Referring to fig. 11, the method of this embodiment includes:
Step S1101, the first node determines a first occupied storage space, which is the storage space of the swap partition occupied by the job control group.
Step S1102, if the first occupied storage space is greater than the maximum available storage space of the swap partition, the first node processes the process corresponding to a second job in the job control group in a first processing manner, so that the storage space of the swap partition occupied by the job control group becomes less than or equal to the maximum available storage space, where the first processing manner is suspension or termination.
Optionally, the second job is the job occupying the most storage space of the swap partition.
Optionally, the second job is any one of the jobs in the job control group that occupy storage space of the swap partition.
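Steps S1101 and S1102, with the "most swap space" option for the second job, can be sketched as follows. The function name and the per-job usage mapping are hypothetical; how the first node measures per-job swap usage is not specified here, and the sketch assumes non-negative usage values.

```python
from typing import Dict, Optional


def pick_second_job(swap_usage: Dict[str, int], max_swap: int) -> Optional[str]:
    """Return the job to suspend or terminate, or None if swap is not overused.

    swap_usage: swap space occupied by each job in the job control group
    max_swap:   maximum available storage space of the swap partition
    """
    first_occupied = sum(swap_usage.values())  # step S1101
    if first_occupied <= max_swap:
        return None
    # Step S1102: only jobs that actually occupy swap space are candidates.
    users = {job: used for job, used in swap_usage.items() if used > 0}
    return max(users, key=users.get)
```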
The method for the first node to obtain the maximum available storage space may be as follows: the first node obtains a first ratio of the maximum used memory to the maximum available storage space of the swap partition, and determines the maximum available storage space of the swap partition according to the first ratio and the maximum used memory. For example, the maximum used memory is 90G, the first ratio of the maximum used memory to the maximum available storage space of the swap partition is 1:2, and then the maximum available storage space of the swap partition is 180G.
The first ratio of the maximum used memory to the maximum available storage space of the swap partition can be stored in a configuration file, and the first node reads the first ratio from the configuration file. Correspondingly, the target node also receives a configuration instruction of the first ratio input by a user, and the configuration instruction is used for generating a configuration file. The meaning of the target node in this embodiment is the same as that of the target node in the above embodiment, and is not described here again.
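The 1:2 example can be reproduced with exact arithmetic. The function name is hypothetical, and the first ratio is represented here as a fraction (maximum used memory divided by maximum available swap space), which is one possible encoding of "1:2".

```python
from fractions import Fraction


def max_swap_space(max_used_memory_gb: int, first_ratio: Fraction) -> Fraction:
    """first_ratio = maximum used memory : maximum available swap space."""
    if first_ratio <= 0:
        raise ValueError("first_ratio must be positive")
    return max_used_memory_gb / first_ratio


# 90G of memory at a ratio of 1:2 gives 180G of swap space.
swap_gb = max_swap_space(90, Fraction(1, 2))
```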
Optionally, if the first processing manner is suspension, after the first node processes the process corresponding to the second job in the job control group in the first processing manner, the method further includes the following c1 to c4:
c1, the first node determines a second occupied storage space, which is the storage space of the swap partition occupied by the job control group.
c2, the first node determines that the second occupied storage space is less than or equal to the maximum available storage space of the swap partition.
c3, the first node determines first storage space information according to the second occupied storage space and the maximum available storage space of the swap partition, where the first storage space information indicates that the second occupied storage space is less than or equal to a fifth preset proportion of the maximum available storage space of the swap partition.
Optionally, the fifth preset proportion may be any proportion from 60% to 80%.
In one scheme, the first node determining the first storage space information according to the second occupied storage space and the maximum available storage space of the swap partition includes the following c31 and c32:
c31, the first node determines the used total storage space proportion according to the second occupied storage space and the maximum available storage space of the swap partition.
The used total storage space proportion is the ratio of the second occupied storage space to the maximum available storage space of the swap partition.
c32, the first node determines that the used total storage space proportion is less than or equal to the fifth preset proportion.
In this case, the first storage space information includes that the used total storage space proportion is less than or equal to the fifth preset proportion. Because the used total storage space proportion is the ratio of the second occupied storage space to the maximum available storage space of the swap partition, the first storage space information indicates that the second occupied storage space is less than or equal to the fifth preset proportion of the maximum available storage space of the swap partition.
The fifth preset proportion is the maximum threshold of the used total storage space proportion of the job control group. The fifth preset proportion may be read by the first node from the configuration file. Correspondingly, the target node also receives a configuration instruction for the fifth preset proportion input by a user, and the configuration instruction is used to generate the configuration file.
In another scheme, the first node determining the first storage space information according to the second occupied storage space and the maximum available storage space of the swap partition includes the following c33 and c34:
c33, the first node determines the remaining total available storage space proportion according to the second occupied storage space and the maximum available storage space of the swap partition.
The first node may obtain a third difference between the maximum available storage space of the swap partition and the second occupied storage space, and determine the ratio of the third difference to the maximum available storage space of the swap partition as the remaining total available storage space proportion. Alternatively, the first node may first obtain the used total storage space proportion and determine the difference between 100% and the used total storage space proportion as the remaining total available storage space proportion.
c34, the first node determines that the remaining total available storage space proportion is greater than or equal to a sixth preset proportion.
In this case, the first storage space information includes that the remaining total available storage space proportion is greater than or equal to the sixth preset proportion. The sum of the remaining total available storage space proportion and the used total storage space proportion is 100%, and the sum of the fifth preset proportion and the sixth preset proportion is 100%. Therefore, if the remaining total available storage space proportion is greater than or equal to the sixth preset proportion, the used total storage space proportion is less than or equal to the fifth preset proportion. In combination with the above description of the used total storage space proportion, when the first storage space information includes that the remaining total available storage space proportion is greater than or equal to the sixth preset proportion, the first storage space information indicates that the second occupied storage space is less than or equal to the fifth preset proportion of the maximum available storage space of the swap partition.
The sixth preset proportion is the minimum threshold of the remaining total available storage space proportion of the job control group. The sixth preset proportion may be read by the first node from the configuration file. Correspondingly, the target node also receives a configuration instruction for the sixth preset proportion input by the user, and the configuration instruction is used to generate the configuration file. It can be understood that the configuration file may include either one of the fifth preset proportion and the sixth preset proportion.
c4, the first node wakes up the process corresponding to the second job.
The method of the embodiment can prevent the storage space of the swap partition from being used excessively.
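The two schemes for the wake-up check (used proportion against the fifth preset proportion, remaining proportion against the sixth) are equivalent when the two preset proportions sum to 100%. The sketch below checks this with exact fractions; the function names are hypothetical, and 70% is simply one value from the 60% to 80% range mentioned above.

```python
from fractions import Fraction


def wake_by_used(second_occupied: int, max_swap: int, fifth: Fraction) -> bool:
    """c31-c32: used total storage space proportion <= fifth preset proportion."""
    used = Fraction(second_occupied, max_swap)
    return used <= fifth


def wake_by_remaining(second_occupied: int, max_swap: int, sixth: Fraction) -> bool:
    """c33-c34: remaining total available storage space proportion >= sixth preset proportion."""
    remaining = Fraction(max_swap - second_occupied, max_swap)
    return remaining >= sixth


fifth = Fraction(7, 10)   # assumed value within the 60%-80% range
sixth = 1 - fifth         # the two preset proportions sum to 100%
```

With a maximum available swap space of 180G, both checks allow wake-up at 126G occupied (exactly 70%) and refuse it at 127G.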
The method according to the present application is explained above, and the apparatus according to the present application is explained below.
Fig. 12 is a schematic block diagram of a memory control device according to an embodiment of the present application. Referring to fig. 12, the device according to the embodiment includes: an obtaining module 1201 and a processing module 1202.
An obtaining module 1201, configured to obtain a first total occupied memory of each job included in a job control group, where the job control group includes each job that is not processed completely on the first node.
A processing module 1202, configured to reduce the total occupied memory of each job included in the job control group when the first total occupied memory meets a first preset condition, where the first preset condition includes that the total occupied memory is greater than a maximum available memory, or the first preset condition includes that the total occupied memory is less than or equal to the maximum available memory and is greater than a first preset proportion of the maximum available memory.
Optionally, the obtaining module 1201 is specifically configured to call a control group system monitoring process to obtain the first total occupied memory of the jobs included in the job control group.
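The application does not state which interface the control group system monitoring process uses. As one possible realization, assuming a Linux host with cgroup v2, the memory currently used by a control group is exposed as a byte count in the `memory.current` file of the group's directory (cgroup v1 exposes `memory.usage_in_bytes` instead). The function names and the directory argument below are hypothetical.

```python
from pathlib import Path


def parse_usage_bytes(text: str) -> int:
    """Parse the single integer byte count stored in a cgroup usage file."""
    return int(text.strip())


def read_group_usage_bytes(cgroup_dir: str, filename: str = "memory.current") -> int:
    """Read the total memory used by one control group.

    cgroup_dir is the group's directory under the cgroup mount point
    (an assumption; the job control group's location is not specified).
    Pass filename="memory.usage_in_bytes" on a cgroup v1 host.
    """
    return parse_usage_bytes((Path(cgroup_dir) / filename).read_text())
```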
Optionally, the number of the job control groups is one, and each job acquired by the first node is added to the job control group by the first node.
Optionally, the processing module 1202 is specifically configured to process the process corresponding to a first job in the job control group in a first processing manner, where the first processing manner is suspension or termination.
Optionally, before the processing module 1202 processes the process corresponding to the first job in the job control group in the first processing manner, the processing module 1202 is further configured to:
determine that the first total occupied memory is less than or equal to the maximum available memory; and
determine first memory information according to the first total occupied memory and the maximum available memory, where the first memory information indicates that the first total occupied memory is greater than a first preset proportion of the maximum available memory.
Optionally, the first memory information includes: the proportion of the used total memory is greater than the first preset proportion; the used total memory proportion is the ratio of the first total occupied memory to the maximum available memory.
Optionally, the processing module 1202 is further configured to read the first preset proportion from the configuration file.
Optionally, the first memory information includes: the proportion of the first remaining total available memory is smaller than a second preset proportion, and the sum of the first preset proportion and the second preset proportion is 100%; the first remaining total available memory ratio is a ratio of a first difference value to the maximum available memory, and the first difference value is a difference value between the maximum available memory and the first total occupied memory.
Optionally, the processing module 1202 is further configured to read the second preset proportion from the configuration file.
Optionally, before the obtaining module 1201 obtains the first total occupied memory of the jobs included in the job control group, the obtaining module 1201 is further configured to obtain a second total occupied memory of the jobs included in the job control group; and if the second total occupied memory is greater than the maximum available memory, the processing module 1202 is further configured to migrate part of the memory data corresponding to the jobs included in the job control group to the swap partition, so that the total occupied memory of the jobs included in the job control group becomes the first total occupied memory.
Optionally, the first actually used memory of the first job is greater than the first maximum available memory of the first job, and the memory excess ratio of the first job is the highest; the memory excess ratio of the first job is the ratio of a second difference to the first maximum available memory, and the second difference is the difference between the first actually used memory and the first maximum available memory.
Optionally, the first processing manner is suspension, and after the processing module 1202 processes the process corresponding to the first job in the job control group in the first processing manner, the obtaining module 1201 is further configured to obtain a third total occupied memory of the jobs included in the job control group; and the processing module 1202 is further configured to determine that the third total occupied memory is less than a third preset proportion of the maximum available memory, where the third preset proportion is less than or equal to the first preset proportion, and to wake up the process corresponding to the first job.
Optionally, before the processing module 1202 processes the process corresponding to the first job in the job control group in the first processing manner, the processing module 1202 is further configured to read processing mode indication information from the configuration file, where the processing mode indication information indicates the first processing manner.
Optionally, the processing module 1202 is specifically configured to migrate at least part of the memory data corresponding to at least one job included in the job control group to the swap partition, so that the total occupied memory of the jobs included in the job control group is less than or equal to the maximum available memory.
Optionally, after the processing module 1202 migrates part of the memory data corresponding to the jobs included in the job control group to the swap partition, the processing module 1202 is further configured to: if a first occupied storage space of the swap partition occupied by the job control group is greater than the maximum available storage space of the swap partition, process the process corresponding to a second job in the job control group in a first processing manner, so that the storage space of the swap partition occupied by the job control group becomes less than or equal to the maximum available storage space, where the first processing manner is suspension or termination.
Optionally, the second job is any one job in the jobs occupying the swap partition in the job control group; or the second operation is the operation which occupies the most storage space of the swap partition in the operation control group.
Optionally, the processing module 1202 is further configured to: reading a first ratio of a maximum used memory in a configuration file to a maximum available storage space of the swap partition; and determining the maximum available storage space of the swap partition according to the first ratio and the maximum used memory.
Optionally, the processing module 1202 is further configured to: read the maximum available memory proportion from the configuration file; and determine the maximum available memory according to the maximum available memory proportion and the total memory of the first node.
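Deriving the memory limit of the job control group from a configured proportion is a single multiplication. In the sketch below the function name is hypothetical, and the 100G node total and 90% proportion are assumed values chosen only so that the result matches the 90G figure used in the swap example.

```python
from fractions import Fraction


def group_memory_limit(total_node_memory_gb: int, proportion: Fraction) -> Fraction:
    """Memory limit of the job control group = proportion x total node memory."""
    if not 0 < proportion <= 1:
        raise ValueError("proportion must be in (0, 1]")
    return total_node_memory_gb * proportion


# Assumed values: a node with 100G of memory and a configured proportion of 90%.
limit_gb = group_memory_limit(100, Fraction(9, 10))
```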
The apparatus of this embodiment may be configured to execute the technical solution in the foregoing method embodiment, and the implementation principle and the technical effect are similar, which are not described herein again.
Fig. 13 is a schematic block diagram of a memory control system according to an embodiment of the present application. Referring to fig. 13, the system in the present embodiment includes at least a first node 1301 and a second node 1302;
the second node 1302, configured to send job information to the first node 1301;
the first node 1301 is configured to obtain a job based on the job information, and add the job to a job control group, where the job control group includes each job that is not processed on the first node 1301;
the first node 1301 is further configured to obtain a first total occupied memory of each job included in a job control group, where the job control group includes each job that is not processed on the first node 1301;
when the first total occupied memory meets a first preset condition, reducing the total occupied memory of each operation included in the operation control group, wherein the first preset condition includes that the first total occupied memory is larger than a maximum available memory, or the first preset condition includes that the first total occupied memory is smaller than or equal to the maximum available memory and is larger than a first preset proportion of the maximum available memory.
Optionally, the first node 1301 is specifically configured to call a control group system monitoring process to obtain the first total occupied memory of the jobs included in the job control group.
Optionally, the number of the job control groups is one, and each job acquired by the first node 1301 is added to the job control group by the first node 1301.
Optionally, the first node 1301 is specifically configured to process the process corresponding to a first job in the job control group in a first processing manner, where the first processing manner is suspension or termination.
Optionally, before the first node 1301 processes the process corresponding to the first job in the job control group in the first processing manner, the first node 1301 is further configured to:
determine that the first total occupied memory is less than or equal to the maximum available memory; and
determine first memory information according to the first total occupied memory and the maximum available memory, where the first memory information indicates that the first total occupied memory is greater than a first preset proportion of the maximum available memory.
Optionally, the first memory information includes: the proportion of the used total memory is greater than the first preset proportion; the used total memory proportion is the ratio of the first total occupied memory to the maximum available memory.
Optionally, the first node 1301 is further configured to read the first preset proportion from the configuration file.
Optionally, the first memory information includes: the proportion of the first remaining total available memory is smaller than a second preset proportion, and the sum of the first preset proportion and the second preset proportion is 100%; the first remaining total available memory ratio is a ratio of a first difference value to the maximum available memory, and the first difference value is a difference value between the maximum available memory and the first total occupied memory.
Optionally, the first node 1301 is further configured to read the second preset proportion from the configuration file.
Optionally, before the first node 1301 acquires the first total occupied memory of each job included in the job control group, the first node 1301 is further configured to:
acquiring a second total occupied memory of each operation included in the operation control group;
if the second total occupied memory is larger than the maximum available memory, migrating part of memory data corresponding to each operation included in the operation control group to the swap partition, so that the total occupied memory of each operation included in the operation control group is the first total occupied memory.
Optionally, the first actually used memory of the first job is greater than the first maximum available memory of the first job, and the memory excess ratio of the first job is the highest;
the memory excess ratio of the first job is the ratio of a second difference to the first maximum available memory, and the second difference is the difference between the first actually used memory and the first maximum available memory.
Optionally, the first processing manner is suspension, and after the first node 1301 processes the process corresponding to the first job in the job control group by using the first processing manner, the first node 1301 is further configured to:
acquiring a third total occupied memory of each operation included in the operation control group;
determining that the third total occupied memory is smaller than a third preset proportion of the maximum available memory, wherein the third preset proportion is smaller than or equal to the first preset proportion;
and waking up the process corresponding to the first job.
Optionally, before the first node 1301 processes the process corresponding to the first job in the job control group in the first processing manner, the first node 1301 is further configured to:
reading processing mode indication information from the configuration file, where the processing mode indication information indicates the first processing manner.
Optionally, the first node 1301 is specifically configured to migrate at least part of the memory data corresponding to at least one job included in the job control group to the swap partition, so that the total occupied memory of the jobs included in the job control group is less than or equal to the maximum available memory.
Optionally, the first node 1301 is further configured to:
reading the maximum available memory proportion from the configuration file;
and determining the maximum available memory according to the maximum available memory proportion and the total memory of the first node 1301.
Optionally, the first node 1301 is a computing node or a cloud server in a distributed computing system, and the second node 1302 is a management node in the distributed computing system; the job information includes the job.
Optionally, the second node 1302 is a terminal device, and the first node 1301 is an application server; the job information includes a user request, the job being for execution of the user request.
The system of this embodiment may be configured to execute the technical solutions in the method embodiments, and the implementation principles and technical effects are similar, which are not described herein again.
The present application further provides a computer-readable medium, on which a computer program is stored, where the computer program is executed by a computer to implement the method of any of the above method embodiments.
The embodiment of the present application further provides a computer program product, and when being executed by a computer, the computer program product implements the method described in any of the above method embodiments.
The embodiment of the present application further provides a chip, which includes a processor, a memory, and a communication interface, where the processor and the memory are connected to the communication interface, and the processor is configured to read and execute a computer program stored in the memory, so as to perform the method according to any one of the above method embodiments.
The processors and transceivers described in embodiments of the present application may be fabricated using various IC process technologies, such as complementary metal oxide semiconductor (CMOS), N-type metal oxide semiconductor (NMOS), P-type metal oxide semiconductor (PMOS), bipolar junction transistor (BJT), bipolar CMOS (BiCMOS), silicon germanium (SiGe), gallium arsenide (GaAs), and the like.
It should be understood that the processor in the embodiments of the present application may be an integrated circuit chip having signal processing capability. In implementation, the steps of the above method embodiments may be performed by integrated logic circuits of hardware in a processor or instructions in the form of software. The processor may be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic device, or discrete hardware components.
It will be appreciated that the memory in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example, but not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct Rambus RAM (DR RAM). It should be noted that the memory of the systems and methods described herein is intended to include, without being limited to, these and any other suitable types of memory.
Where the above embodiments are implemented using software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or another programmable device. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or a wireless manner (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a digital video disc (DVD)), or a semiconductor medium (e.g., a solid state disk (SSD)), among others.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative. The division of the units is only one logical division, and other divisions are possible in practice; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the part of it that contributes over the prior art, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application.
The above description covers only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
It should be appreciated that reference throughout this specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, the various embodiments are not necessarily referring to the same embodiment throughout the specification. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
It should also be understood that, in the present application, "when …" and "if" both mean that the first node performs the corresponding processing under a certain objective condition. They do not limit the time, do not require the first node to perform a determining action during implementation, and do not imply any other limitation.
Reference in the present application to an element using the singular is intended to mean "one or more" rather than "one and only one" unless specifically stated otherwise. In the present application, unless otherwise specified, "at least one" is intended to mean "one or more" and "a plurality" is intended to mean "two or more".
In addition, the term "and/or" herein describes only an association relationship between associated objects and means that three relationships may exist. For example, "A and/or B" may mean: A exists alone, both A and B exist, or B exists alone, where A can be singular or plural, and B can be singular or plural.
The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
Herein, the term "at least one of …" means all or any combination of the listed items. For example, "at least one of A, B and C" may mean the following six cases: A exists alone, B exists alone, C exists alone, both A and B exist, both B and C exist, and A, B and C all exist, where A can be singular or plural, B can be singular or plural, and C can be singular or plural.
It should be understood that in the embodiments of the present application, "B corresponding to A" means that B is associated with A and that B can be determined from A. It should also be understood that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.

Claims (35)

1. A memory control method, applied to a first node, the method comprising:
acquiring a first total occupied memory of the jobs included in a job control group, wherein the job control group includes each job that has not been processed on the first node; and
when the first total occupied memory meets a first preset condition, reducing the total occupied memory of the jobs included in the job control group, wherein the first preset condition includes that the first total occupied memory is greater than a maximum available memory, or the first preset condition includes that the first total occupied memory is less than or equal to the maximum available memory and greater than a first preset proportion of the maximum available memory.
2. The method of claim 1, wherein the acquiring the first total occupied memory of the jobs included in the job control group comprises:
calling a control group system monitoring process to acquire the first total occupied memory of the jobs included in the job control group.
3. The method according to claim 1 or 2, wherein the number of the job control groups is one, and each job acquired by the first node is added to the job control group by the first node.
4. The method according to any one of claims 1 to 3, wherein the reducing the total occupied memory of the jobs included in the job control group comprises:
processing a process corresponding to a first job in the job control group in a first processing manner, wherein the first processing manner is suspension or termination.
5. The method according to claim 4, wherein before the process corresponding to the first job in the job control group is processed in the first processing manner, the method further comprises:
determining that the first total occupied memory is less than or equal to the maximum available memory; and
determining first memory information according to the first total occupied memory and the maximum available memory, wherein the first memory information indicates that the first total occupied memory is greater than the first preset proportion of the maximum available memory.
6. The method of claim 5, wherein the first memory information comprises: a used total memory proportion is greater than the first preset proportion;
the used total memory proportion is the ratio of the first total occupied memory to the maximum available memory.
7. The method of claim 6, further comprising:
reading the first preset proportion from a configuration file.
8. The method of claim 5, wherein the first memory information comprises: a first remaining total available memory proportion is less than a second preset proportion, and the sum of the first preset proportion and the second preset proportion is 100%;
the first remaining total available memory proportion is the ratio of a first difference to the maximum available memory, and the first difference is the difference between the maximum available memory and the first total occupied memory.
9. The method of claim 8, further comprising:
reading the second preset proportion from a configuration file.
10. The method according to any one of claims 5 to 9, wherein before the acquiring the first total occupied memory of the jobs included in the job control group, the method further comprises:
acquiring a second total occupied memory of the jobs included in the job control group; and
if the second total occupied memory is greater than the maximum available memory, migrating part of the memory data corresponding to the jobs included in the job control group to a swap partition, so that the total occupied memory of the jobs included in the job control group is the first total occupied memory.
11. The method according to any one of claims 4 to 10, wherein a first actually used memory of the first job is greater than a first maximum available memory of the first job, and a memory overrun ratio of the first job is the highest;
the memory overrun ratio of the first job is the ratio of a second difference to the first maximum available memory, and the second difference is the difference between the first actually used memory and the first maximum available memory.
12. The method according to any one of claims 5 to 11, wherein the first processing manner is suspension, and after the process corresponding to the first job in the job control group is processed in the first processing manner, the method further comprises:
acquiring a third total occupied memory of the jobs included in the job control group;
determining that the third total occupied memory is less than a third preset proportion of the maximum available memory, wherein the third preset proportion is less than or equal to the first preset proportion; and
waking up the process corresponding to the first job.
13. The method according to any one of claims 4 to 12, wherein before the process corresponding to the first job in the job control group is processed in the first processing manner, the method further comprises:
reading processing manner indication information from a configuration file, wherein the processing manner indication information indicates the first processing manner.
14. The method according to any one of claims 1 to 3, wherein the reducing the total occupied memory of the jobs included in the job control group comprises:
migrating at least part of the memory data corresponding to at least one job included in the job control group to a swap partition, so that the total occupied memory of the jobs included in the job control group is less than or equal to the maximum available memory.
15. The method of any one of claims 1 to 14, further comprising:
reading a maximum available memory proportion from a configuration file; and
determining the maximum available memory according to the maximum available memory proportion and the total memory of the first node.
16. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-15.
17. A memory control system, comprising a second node and at least one first node, wherein:
the second node is configured to send job information to the first node;
the first node is configured to obtain a job based on the job information and add the job to a job control group, wherein the job control group includes each job that has not been processed on the first node;
the first node is further configured to acquire a first total occupied memory of the jobs included in the job control group; and
the first node is further configured to, when the first total occupied memory meets a first preset condition, reduce the total occupied memory of the jobs included in the job control group, wherein the first preset condition includes that the first total occupied memory is greater than a maximum available memory, or the first preset condition includes that the first total occupied memory is less than or equal to the maximum available memory and greater than a first preset proportion of the maximum available memory.
18. The system of claim 17, wherein the first node is specifically configured to:
call a control group system monitoring process to acquire the first total occupied memory of the jobs included in the job control group.
19. The system according to claim 17 or 18, wherein the number of the job control groups is one, and each job acquired by the first node is added to the job control group by the first node.
20. The system according to any one of claims 17 to 19, wherein the first node is specifically configured to:
process a process corresponding to a first job in the job control group in a first processing manner, wherein the first processing manner is suspension or termination.
21. The system of claim 20, wherein before the first node processes the process corresponding to the first job in the job control group in the first processing manner, the first node is further configured to:
determine that the first total occupied memory is less than or equal to the maximum available memory; and
determine first memory information according to the first total occupied memory and the maximum available memory, wherein the first memory information indicates that the first total occupied memory is greater than the first preset proportion of the maximum available memory.
22. The system of claim 21, wherein the first memory information comprises: a used total memory proportion is greater than the first preset proportion;
the used total memory proportion is the ratio of the first total occupied memory to the maximum available memory.
23. The system of claim 22, wherein the first node is further configured to:
read the first preset proportion from a configuration file.
24. The system of claim 21, wherein the first memory information comprises: a first remaining total available memory proportion is less than a second preset proportion, and the sum of the first preset proportion and the second preset proportion is 100%;
the first remaining total available memory proportion is the ratio of a first difference to the maximum available memory, and the first difference is the difference between the maximum available memory and the first total occupied memory.
25. The system of claim 24, wherein the first node is further configured to:
read the second preset proportion from a configuration file.
26. The system according to any one of claims 21 to 25, wherein before the first node acquires the first total occupied memory of the jobs included in the job control group, the first node is further configured to:
acquire a second total occupied memory of the jobs included in the job control group; and
if the second total occupied memory is greater than the maximum available memory, migrate part of the memory data corresponding to the jobs included in the job control group to a swap partition, so that the total occupied memory of the jobs included in the job control group is the first total occupied memory.
27. The system according to any one of claims 20 to 26, wherein a first actually used memory of the first job is greater than a first maximum available memory of the first job, and a memory overrun ratio of the first job is the highest;
the memory overrun ratio of the first job is the ratio of a second difference to the first maximum available memory, and the second difference is the difference between the first actually used memory and the first maximum available memory.
28. The system according to any one of claims 21 to 27, wherein the first processing manner is suspension, and after the first node processes the process corresponding to the first job in the job control group in the first processing manner, the first node is further configured to:
acquire a third total occupied memory of the jobs included in the job control group;
determine that the third total occupied memory is less than a third preset proportion of the maximum available memory, wherein the third preset proportion is less than or equal to the first preset proportion; and
wake up the process corresponding to the first job.
29. The system according to any one of claims 20 to 28, wherein before the first node processes the process corresponding to the first job in the job control group in the first processing manner, the first node is further configured to:
read processing manner indication information from a configuration file, wherein the processing manner indication information indicates the first processing manner.
30. The system according to any one of claims 17 to 19, wherein the first node is specifically configured to:
migrate at least part of the memory data corresponding to at least one job included in the job control group to a swap partition, so that the total occupied memory of the jobs included in the job control group is less than or equal to the maximum available memory.
31. The system according to any one of claims 17 to 30, wherein the first node is further configured to:
read a maximum available memory proportion from a configuration file; and
determine the maximum available memory according to the maximum available memory proportion and the total memory of the first node.
32. The system according to any one of claims 17 to 31, wherein the first node is a computing node or a cloud server in a distributed computing system, and the second node is a management node in the distributed computing system;
the job information includes the job.
33. The system according to any one of claims 17 to 31, wherein the second node is a terminal device, and the first node is an application server;
the job information includes a user request, the job being for execution of the user request.
34. A storage medium, comprising a computer program for implementing the method according to any one of claims 1 to 15.
35. A chip, comprising a processor, a memory and a communication interface, the processor and the memory being connected to the communication interface, wherein the processor is configured to read and execute a computer program stored in the memory to perform the method of any one of claims 1 to 15.
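The job selection of claim 11 and the wake-up check of claim 12 can be illustrated with a minimal Python sketch. It is a reading of the claim language under stated assumptions, not the applicant's implementation: the `Job` type, its field names, and the function names are hypothetical, and each job's own maximum available memory is assumed to be a positive number.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Job:
    job_id: str
    used: int   # memory actually used by the job
    limit: int  # the job's own maximum available memory (assumed > 0)


def overrun_ratio(job: Job) -> float:
    """(actually used memory - maximum available memory) / maximum available memory."""
    return (job.used - job.limit) / job.limit


def pick_first_job(jobs: Sequence[Job]) -> Optional[Job]:
    """Return the job that exceeds its own limit with the highest overrun ratio."""
    over = [j for j in jobs if j.used > j.limit]
    return max(over, key=overrun_ratio) if over else None


def may_wake(third_total: int, max_available: int, third_proportion: float) -> bool:
    """A suspended job may be woken once usage falls below the third preset proportion."""
    return third_total < third_proportion * max_available
```

Because the third preset proportion is no greater than the first, a job woken under this check does not immediately retrigger the suspension condition at the same usage level.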
CN202010996974.5A 2020-09-21 2020-09-21 Memory control method and device Pending CN114253457A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010996974.5A CN114253457A (en) 2020-09-21 2020-09-21 Memory control method and device
PCT/CN2021/117914 WO2022057754A1 (en) 2020-09-21 2021-09-13 Memory control method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010996974.5A CN114253457A (en) 2020-09-21 2020-09-21 Memory control method and device

Publications (1)

Publication Number Publication Date
CN114253457A true CN114253457A (en) 2022-03-29

Family

ID=80776479

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010996974.5A Pending CN114253457A (en) 2020-09-21 2020-09-21 Memory control method and device

Country Status (2)

Country Link
CN (1) CN114253457A (en)
WO (1) WO2022057754A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521029A (en) * 2011-12-02 2012-06-27 曙光信息产业(北京)有限公司 Job scheduling method based on exclusive memory
CN107066316A (en) * 2017-04-25 2017-08-18 华中科技大学 Alleviate the dispatching method and system of memory pressure in distributed data processing system
CN107665146A (en) * 2016-07-29 2018-02-06 华为技术有限公司 Memory management apparatus and method
CN109992471A (en) * 2018-01-02 2019-07-09 中国移动通信有限公司研究院 A kind of method and device of internal memory monitoring

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101833512A (en) * 2010-04-22 2010-09-15 中兴通讯股份有限公司 Method and device thereof for reclaiming memory
CN106407010A (en) * 2016-09-06 2017-02-15 北京珠穆朗玛移动通信有限公司 Internal memory management method and mobile terminal
US9946577B1 (en) * 2017-08-14 2018-04-17 10X Genomics, Inc. Systems and methods for distributed resource management
CN109379246B (en) * 2018-09-21 2021-03-05 锐捷网络股份有限公司 Memory detection method and device
CN110221921A (en) * 2019-06-13 2019-09-10 深圳Tcl新技术有限公司 EMS memory management process, terminal and computer readable storage medium

Also Published As

Publication number Publication date
WO2022057754A1 (en) 2022-03-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination