CN114185687B - Heap memory management method and device for shared memory type coprocessor - Google Patents


Publication number
CN114185687B
Authority
CN
China
Prior art keywords
memory
linked list
coprocessor
memory block
available
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210131446.2A
Other languages
Chinese (zh)
Other versions
CN114185687A (en)
Inventor
张昂
廖湘科
崔英博
杨灿群
黄春
唐滔
彭林
夏泽宇
郭逸飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005: Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00: Digital computers in general; Data processing equipment in general
    • G06F15/16: Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163: Interprocessor communication
    • G06F15/167: Interprocessor communication using a common memory, e.g. mailbox
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005: Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5022: Mechanisms to release resources

Abstract

The application relates to a heap memory management method and device for a shared-memory coprocessor. The method comprises the following steps: before a coprocessor-side program is executed, applying for a large contiguous heap memory space through a first interface function, converting the virtual address of that heap memory space into a physical address, passing the physical address to the coprocessor-side program, and organizing the heap memory space into an available linked list and an allocated linked list; when the coprocessor-side program requests memory, searching from the head of the available linked list through a second interface function for the first memory block that is large enough, allocating from it, and adding the newly allocated memory to the tail of the allocated linked list; while the coprocessor-side program is executing, releasing coprocessor-side heap memory through a third interface function and adding the newly released memory to the available linked list; and after the coprocessor-side program has finished executing, cleaning up the coprocessor heap memory space by its virtual address through a fourth interface function.

Description

Heap memory management method and device for shared memory type coprocessor
Technical Field
The present application relates to the field of computer technologies, and in particular, to a heap memory management method and apparatus for a shared memory coprocessor.
Background
Heterogeneous computing continues to develop in the field of high-performance computing because of its high performance and energy efficiency, and more and more coprocessors are emerging, such as the GPU (Graphics Processing Unit) and the FPGA (Field-Programmable Gate Array). A coprocessor is usually attached to the host CPU (Central Processing Unit) as a PCIe (Peripheral Component Interconnect Express) peripheral, has its own independent storage, and cannot share memory directly with the CPU; such a device is a separate-memory coprocessor. Programming a separate-memory coprocessor requires explicit data transfers between host memory and the coprocessor's storage, which increases both the programming difficulty and the overhead of program execution.
To address these problems, a shared-memory coprocessor is connected directly to the CPU inside the chip through a high-speed bus and can share memory with the CPU. No data transfers are needed during programming, which avoids the explicit data-transfer overhead of separate-memory coprocessors and improves the programmability and program performance of the coprocessor.
Data used while a program runs is stored mainly in memory. The space occupied by some data can be determined when the program is compiled, while the size of other data is known only at run time, so memory must be requested and released dynamically during execution. Both the shared-memory coprocessor and the CPU can access memory directly. The CPU side can run an operating system on its own, is capable of virtual-to-physical address translation and dynamic heap memory management, and can therefore request and release memory dynamically. Existing shared-memory coprocessors, however, cannot run an operating system, have no virtual-to-physical address translation capability, and can only recognize physical addresses. The coprocessor therefore has no dynamic memory management capability and cannot satisfy the need for dynamic memory on the coprocessor side.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a heap memory management method and apparatus, a computer device, and a storage medium for a shared memory coprocessor, which can implement a dynamic memory management capability of a coprocessor.
A heap memory management method for a shared memory coprocessor, the method comprising:
acquiring coprocessor heap memory space size information to be applied by a coprocessor, applying for a heap memory space through a first interface function according to the coprocessor heap memory space size information, converting a virtual address of the applied heap memory space into a physical address, and transmitting the physical address to a coprocessor end program;
respectively organizing the applied heap memory space into an available linked list and an allocated linked list through the first interface function;
when a coprocessor-side program requests memory, searching from the head of the available linked list through a second interface function for the first memory block whose size satisfies the request, allocating memory from that block, and adding the newly allocated memory to the tail of the allocated linked list;
in the process of executing the coprocessor-side program, releasing the memory space of the coprocessor-side heap through a third interface function, and adding a newly released memory into the available linked list;
and after the execution of the coprocessor end program is finished, cleaning the memory space of the coprocessor heap according to the virtual address through a fourth interface function.
In one embodiment, the method further comprises the following steps: initializing the available linked list and the allocated linked list through the first interface function, wherein the information stored by each node of the available linked list and the allocated linked list includes the memory size of the current node and a pointer next to the next node;
pointing the head pointer of the available linked list to the physical address;
assigning the head pointer of the allocated linked list to null.
In one embodiment, the method further comprises the following steps: when the coprocessor-side program requests memory, assigning the head pointer of the available linked list to the current memory block pointer through the second interface function;
if the size of the current memory block is greater than or equal to the requested memory size, returning the pointer of the current memory block as the allocated physical address; otherwise, taking the pointer of the next memory block as the pointer of the current memory block and checking again whether its size satisfies the request, until a memory block of sufficient size is found, and returning the pointer of that memory block as the allocated physical address;
and adding the newly allocated memory to the tail of the allocated linked list.
In one embodiment, the method further comprises the following steps: obtaining the size of the current memory block;
obtaining the requested memory size;
subtracting the requested memory size from the size of the current memory block to obtain the size of the current free memory block;
and adding the current free memory block to the available linked list.
In one embodiment, the method further comprises the following steps: while the coprocessor-side program is executing, obtaining the physical address of the memory block to be released through the third interface function;
and, starting from the head of the allocated linked list, comparing the pointer of the current memory block in the allocated linked list with the physical address of the memory block to be released in turn, and if they are equal, setting the next-node pointer of the previous memory block in the allocated linked list to the physical address of the memory block that follows the current memory block.
In one embodiment, the method further comprises the following steps: if the newly released memory is adjacent to a free memory block in the available linked list, merging the newly released memory with that adjacent memory block;
and if the newly released memory is not adjacent to any free memory block in the available linked list, traversing the available linked list and adding the newly released memory block to it in order of memory address.
In one embodiment, the method further comprises the following steps: starting from the head of the available linked list, adding the pointer of the current memory block in the available linked list to the size of that block in turn, and judging whether the sum is equal to the physical address of the newly released memory block; if so, merging the current memory block with the newly released memory block, with the current memory block in front and the newly released memory block behind;
if not, adding the pointer of the newly released memory block to the size of the newly released memory block and comparing the sum with the pointer of the current memory block in the available linked list in turn; if they are equal, merging the newly released memory block with the current memory block, with the newly released memory block in front and the current memory block behind.
A heap memory management apparatus for a shared memory coprocessor, the apparatus comprising:
the coprocessor heap memory space application module is used for acquiring coprocessor heap memory space size information to be applied by a coprocessor, applying for a heap memory space through a first interface function according to the coprocessor heap memory space size information, converting a virtual address of the applied heap memory space into a physical address, and transmitting the physical address to a coprocessor end program;
a linked list organizing module, configured to organize the applied heap memory space into an available linked list and an allocated linked list respectively through the first interface function;
a heap memory space allocation module, configured to, when a coprocessor-side program requests memory, search from the head of the available linked list through a second interface function for the first memory block whose size satisfies the request, and add the newly allocated memory to the tail of the allocated linked list;
the heap memory space releasing module is used for releasing the heap memory space of the coprocessor end through a third interface function in the process of executing the coprocessor end program and adding a newly released memory into the available linked list;
and the heap memory space cleaning module is used for cleaning the coprocessor heap memory space according to the virtual address through a fourth interface function after the coprocessor end program is executed.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring coprocessor heap memory space size information to be applied by a coprocessor, applying for a heap memory space through a first interface function according to the coprocessor heap memory space size information, converting a virtual address of the applied heap memory space into a physical address, and transmitting the physical address to a coprocessor end program;
respectively organizing the applied heap memory space into an available linked list and an allocated linked list through the first interface function;
when a coprocessor-side program requests memory, searching from the head of the available linked list through a second interface function for the first memory block whose size satisfies the request, allocating memory from that block, and adding the newly allocated memory to the tail of the allocated linked list;
in the process of executing the coprocessor-side program, releasing the memory space of the coprocessor-side heap through a third interface function, and adding a newly released memory into the available linked list;
and after the execution of the coprocessor end program is finished, cleaning the memory space of the coprocessor heap according to the virtual address through a fourth interface function.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring coprocessor heap memory space size information to be applied by a coprocessor, applying for a heap memory space through a first interface function according to the coprocessor heap memory space size information, converting a virtual address of the applied heap memory space into a physical address, and transmitting the physical address to a coprocessor end program;
respectively organizing the applied heap memory space into an available linked list and an allocated linked list through the first interface function;
when a coprocessor-side program requests memory, searching from the head of the available linked list through a second interface function for the first memory block whose size satisfies the request, allocating memory from that block, and adding the newly allocated memory to the tail of the allocated linked list;
in the process of executing the coprocessor-side program, releasing the memory space of the coprocessor-side heap through a third interface function, and adding a newly released memory into the available linked list;
and after the execution of the coprocessor end program is finished, cleaning the memory space of the coprocessor heap according to the virtual address through a fourth interface function.
Before the coprocessor-side program is executed, a large contiguous heap memory space is applied for through a first interface function, the virtual address of that space is converted into a physical address, the physical address is passed to the coprocessor-side program, and the heap memory space is organized into an available linked list and an allocated linked list; when the coprocessor-side program requests memory, the available linked list is searched from its head through a second interface function for the first memory block that is large enough, and the newly allocated memory is added to the tail of the allocated linked list; while the coprocessor-side program is executing, coprocessor-side heap memory is released through a third interface function and the newly released memory is added to the available linked list; and after the coprocessor-side program has finished executing, the coprocessor heap memory space is cleaned up by its virtual address through a fourth interface function. The invention thus provides a heap memory management mechanism for shared-memory coprocessors, solves the problem that dynamic memory management cannot be implemented on the shared-memory coprocessor side, and gives coprocessor-side programs the ability to request and release memory dynamically.
Drawings
FIG. 1 is a flow diagram illustrating an embodiment of a shared memory coprocessor-oriented heap memory management method;
FIG. 2 is a block diagram of an embodiment of a heap memory management device for a shared memory coprocessor;
FIG. 3 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The heap memory management method for a shared-memory coprocessor provided by the present application can be applied in the following environment. The shared-memory coprocessor is connected to the CPU through a high-speed on-chip bus and can share memory with the CPU. On the coprocessor side, free memory and allocated memory are organized into two linked lists, and each node of the lists stores the memory size of the current node and a pointer to the next node. When memory is requested, the free-memory linked list is searched from its head for the first block that is large enough; when memory is released, an attempt is made to merge it with the free memory blocks immediately before and after it.
In one embodiment, as shown in fig. 1, a shared memory coprocessor-oriented heap memory management method is provided, including the following steps:
Step 102: obtaining the size of the coprocessor heap memory space to be applied for on behalf of the coprocessor, applying for the heap memory space through a first interface function according to that size, converting the virtual address of the applied heap memory space into a physical address, and passing the physical address to the coprocessor-side program.
The invention provides a heap memory management mechanism for a shared-memory coprocessor and implements a set of programming interfaces for coprocessor heap memory management, consisting of four interfaces: init_cp_heap(), which initializes the coprocessor heap memory; malloc_cp_heap(), which applies for coprocessor-side heap memory; free_cp_heap(), which releases coprocessor-side heap memory; and destroy_cp_heap(), which cleans up the coprocessor heap memory. In this embodiment, init_cp_heap() is referred to as the first interface function, malloc_cp_heap() as the second interface function, free_cp_heap() as the third interface function, and destroy_cp_heap() as the fourth interface function.
Specifically, the size of the coprocessor heap memory space is initialized to heap_size; heap_size must be large enough to meet the dynamic memory needs of the coprocessor-side program;
a memory space of size heap_size is applied for, and its virtual address is obtained as heap_start_va;
the virtual address heap_start_va is converted into the physical address heap_start_pa;
the heap physical address heap_start_pa is passed to the coprocessor-side program.
Step 104: organizing the applied heap memory space into an available linked list and an allocated linked list through the first interface function.
Specifically, the available-memory linked list head pointer free_head is assigned heap_start_pa, and the allocated-memory linked list head pointer allocated_head is assigned NULL.
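The list initialization in step 104 can be sketched in C as follows. This is a minimal illustration, not the patented implementation: the type names block_t and cp_lists_t and the function init_lists() are hypothetical, and the sketch assumes that the first free node covers the whole heap and that a block's size field counts its own header, neither of which is stated in the description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Header stored at the start of every memory block. The description names
 * only the two fields: the block size and the pointer next. */
typedef struct block {
    size_t        size;  /* bytes in this block (header included: an assumption) */
    struct block *next;  /* next node in the same list, or NULL */
} block_t;

/* Hypothetical container for the two list heads named in the description. */
typedef struct {
    block_t *free_head;       /* head of the available linked list */
    block_t *allocated_head;  /* head of the allocated linked list */
} cp_lists_t;

/* free_head is pointed at heap_start_pa and allocated_head is set to NULL.
 * Assumption: the whole heap starts out as one free block. */
static void init_lists(cp_lists_t *lists, void *heap_start_pa, size_t heap_size)
{
    assert(heap_size >= sizeof(block_t));
    lists->free_head = (block_t *)heap_start_pa;
    lists->free_head->size = heap_size;
    lists->free_head->next = NULL;
    lists->allocated_head = NULL;
}
```

Keeping the header inside the managed region means the coprocessor needs no separate metadata area, at the cost of one header per block.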
Step 106: when the coprocessor-side program requests memory, searching from the head of the available linked list through the second interface function for the first memory block that is large enough, and adding the newly allocated memory to the tail of the allocated linked list.
An available memory block is located first, the physical address of the allocated coprocessor heap memory is then returned, and finally the newly allocated memory is added to the tail of the allocated-memory linked list.
Step 108: while the coprocessor-side program is executing, releasing coprocessor-side heap memory through the third interface function and adding the newly released memory to the available linked list.
The memory block to be released is located first, the memory is then released, and finally adjacent free memory is merged.
Step 110: after the coprocessor-side program has finished executing, cleaning up the coprocessor heap memory space by its virtual address through the fourth interface function.
After the coprocessor-side program has finished executing, the coprocessor heap memory is released through heap_start_va.
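Steps 102 and 110 bracket the coprocessor-side program on the host. The C sketch below shows one way that bracket might look. The description does not say how the contiguous region is obtained or how the address is translated, so malloc() and the identity-mapping helper virt_to_phys() are stand-ins chosen only so that the sketch runs on an ordinary host; the struct cp_heap_t and the function signatures are likewise assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical host-side record of the coprocessor heap. */
typedef struct {
    void     *heap_start_va;  /* virtual address, kept for destroy_cp_heap() */
    uintptr_t heap_start_pa;  /* physical address passed to the coprocessor */
    size_t    heap_size;
} cp_heap_t;

/* HYPOTHETICAL stand-in: real translation is platform-specific. The identity
 * mapping exists only to make the sketch executable. */
static uintptr_t virt_to_phys(void *va)
{
    return (uintptr_t)va;
}

/* First interface function (sketch): returns 0 on success, -1 on failure. */
static int init_cp_heap(cp_heap_t *heap, size_t heap_size)
{
    assert(heap != NULL && heap_size > 0);
    heap->heap_start_va = malloc(heap_size);  /* stand-in for a contiguous region */
    if (heap->heap_start_va == NULL)
        return -1;
    heap->heap_start_pa = virt_to_phys(heap->heap_start_va);
    heap->heap_size = heap_size;
    return 0;
}

/* Fourth interface function (sketch): release through the virtual address. */
static void destroy_cp_heap(cp_heap_t *heap)
{
    free(heap->heap_start_va);
    heap->heap_start_va = NULL;
    heap->heap_start_pa = 0;
    heap->heap_size = 0;
}
```

The host keeps heap_start_va because only the CPU side can interpret virtual addresses, while the coprocessor works exclusively with heap_start_pa.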
In the above heap memory management method for a shared-memory coprocessor, before the coprocessor-side program is executed, a large contiguous heap memory space is applied for through the first interface function, the virtual address of that space is converted into a physical address, the physical address is passed to the coprocessor-side program, and the heap memory space is organized into an available linked list and an allocated linked list; when the coprocessor-side program requests memory, the available linked list is searched from its head through the second interface function for the first memory block that is large enough, and the newly allocated memory is added to the tail of the allocated linked list; while the coprocessor-side program is executing, coprocessor-side heap memory is released through the third interface function and the newly released memory is added to the available linked list; and after the coprocessor-side program has finished executing, the coprocessor heap memory space is cleaned up by its virtual address through the fourth interface function. The invention thus provides a heap memory management mechanism for shared-memory coprocessors, solves the problem that dynamic memory management cannot be implemented on the shared-memory coprocessor side, and gives coprocessor-side programs the ability to request and release memory dynamically.
In one embodiment, the method further comprises the following steps: initializing the available linked list and the allocated linked list through the first interface function, wherein the information stored by each node of the available linked list and the allocated linked list includes the memory size of the current node and a pointer next to the next node; pointing the head pointer of the available linked list to the physical address; and assigning the head pointer of the allocated linked list to null.
In one embodiment, the method further comprises the following steps: when the coprocessor-side program requests memory, assigning the head pointer of the available linked list to the current memory block pointer through the second interface function; if the size of the current memory block is greater than or equal to the requested memory size, returning the pointer of the current memory block as the allocated physical address; otherwise, taking the pointer of the next memory block as the pointer of the current memory block and checking again whether its size satisfies the request, until a memory block of sufficient size is found, and returning the pointer of that memory block as the allocated physical address; and adding the newly allocated memory to the tail of the allocated linked list.
Specifically, the available-memory linked list head pointer free_head is assigned to the current block pointer current_ptr, and the previous memory block pointer previous_ptr is assigned NULL;
the size current_ptr->size of the current free memory block is compared with the requested memory size required_size. If current_ptr->size is smaller than required_size, the current node does not have enough available memory, and the pointer moves on to the next memory block to continue checking; if current_ptr->size is greater than or equal to required_size, the current node has enough available memory, and the physical address of the allocated coprocessor heap memory is returned.
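The search just described is a first-fit traversal. A minimal C sketch follows; the function name find_first_fit(), the block_t type, and the previous_out parameter are illustrative assumptions, and returning NULL when no block fits is an added convention that the description does not spell out.

```c
#include <assert.h>
#include <stddef.h>

typedef struct block {
    size_t        size;  /* size of this block in bytes */
    struct block *next;  /* next node in the list, or NULL */
} block_t;

/* Walk the available list from free_head and stop at the first block whose
 * size is >= required_size. *previous_out receives the node before the hit
 * (NULL if the hit is the head) so the caller can relink the list. */
static block_t *find_first_fit(block_t *free_head, size_t required_size,
                               block_t **previous_out)
{
    block_t *previous_ptr = NULL;
    block_t *current_ptr = free_head;

    while (current_ptr != NULL && current_ptr->size < required_size) {
        previous_ptr = current_ptr;       /* not enough room: move on */
        current_ptr = current_ptr->next;
    }
    if (previous_out != NULL)
        *previous_out = previous_ptr;
    return current_ptr;                   /* NULL when nothing is large enough */
}
```

First fit stops at the earliest adequate block, so it needs no global view of the list and keeps the search logic small enough for a coprocessor without an operating system.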
Adding the newly allocated memory to the tail of the allocated linked list specifically comprises the following steps:
assigning the allocated-memory linked list head pointer allocated_head to the current block pointer current_ptr;
checking whether current_ptr->next is NULL; if it is not NULL, moving the pointer to the next memory block and checking again; if it is NULL, adding the newly allocated memory to the tail of the allocated-memory linked list and assigning NULL to the next-node pointer of the newly allocated memory.
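The tail insertion can be sketched in C as below. The name append_allocated() and the block_t type are hypothetical. The description starts the walk at allocated_head without saying what happens while the list is still empty (allocated_head is NULL after initialization), so the empty-list branch here is an assumption added to keep the sketch safe.

```c
#include <assert.h>
#include <stddef.h>

typedef struct block {
    size_t        size;  /* size of this block in bytes */
    struct block *next;  /* next node in the list, or NULL */
} block_t;

/* Append new_block to the tail of the allocated list. */
static void append_allocated(block_t **allocated_head, block_t *new_block)
{
    assert(allocated_head != NULL && new_block != NULL);
    new_block->next = NULL;               /* the new node becomes the tail */

    if (*allocated_head == NULL) {        /* assumption: first allocation */
        *allocated_head = new_block;
        return;
    }
    block_t *current_ptr = *allocated_head;
    while (current_ptr->next != NULL)     /* walk until the current tail */
        current_ptr = current_ptr->next;
    current_ptr->next = new_block;
}
```

The walk makes each append linear in the number of live allocations; a separate tail pointer would avoid that, but the description does not mention one.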
In one embodiment, the method further comprises the following steps: obtaining the size of the current memory block; obtaining the requested memory size; subtracting the requested memory size from the size of the current memory block to obtain the size of the current free memory block; and adding the current free memory block to the available linked list.
Specifically, the return address returned_pa is assigned current_ptr;
the size of the current free memory block is assigned current_ptr->size - required_size;
and the next pointer previous_ptr->next of the previous memory block is assigned current_ptr + current_size, adding the new free memory block to the linked list.
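Splitting the chosen block can be sketched in C as follows. Several points are assumptions rather than statements from the description: the remainder is taken to start required_size bytes after current_ptr, required_size is assumed to already include the block header and to be a multiple of the header's alignment, and a remainder too small to hold a header is handed out with the block instead of being kept free. The name split_block() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct block {
    size_t        size;  /* size of this block in bytes, header included */
    struct block *next;  /* next node in the list, or NULL */
} block_t;

/* Carve required_size bytes off the front of current_ptr, keep the rest in
 * the available list, and return the carved block (returned_pa). */
static block_t *split_block(block_t **free_head, block_t *previous_ptr,
                            block_t *current_ptr, size_t required_size)
{
    assert(current_ptr->size >= required_size);
    size_t remaining = current_ptr->size - required_size;
    block_t *successor = current_ptr->next;

    if (remaining >= sizeof(block_t)) {
        /* The new free block begins right after the allocated part. */
        block_t *rest = (block_t *)((unsigned char *)current_ptr + required_size);
        rest->size = remaining;
        rest->next = current_ptr->next;
        successor = rest;
        current_ptr->size = required_size;
    }
    if (previous_ptr == NULL)
        *free_head = successor;           /* the head block was consumed */
    else
        previous_ptr->next = successor;

    current_ptr->next = NULL;
    return current_ptr;
}
```

Because the remainder takes the place of the original block in the list, the available list stays in address order, which the later merge step relies on.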
In one embodiment, the method further comprises the following steps: while the coprocessor-side program is executing, obtaining the physical address of the memory block to be released through the third interface function; and, starting from the head of the allocated linked list, comparing the pointer of the current memory block in the allocated linked list with the physical address of the memory block to be released in turn, and if they are equal, setting the next-node pointer of the previous memory block in the allocated linked list to the physical address of the memory block that follows the current memory block.
Specifically, the physical address to_free_pa of the memory block to be released is obtained, the allocated-memory linked list head pointer allocated_head is assigned to the current block pointer current_ptr, and the previous memory block pointer previous_ptr is assigned NULL;
current_ptr is compared with to_free_pa. If they are equal, the memory block to be released has been located and the memory is released; if not, the current block pointer moves back by one node and the comparison continues.
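Locating the block and taking it out of the allocated list can be sketched in C as below. The name unlink_allocated() is hypothetical, and returning NULL for an address that is not in the allocated list is an added convention; the description does not say how an unknown address is handled.

```c
#include <assert.h>
#include <stddef.h>

typedef struct block {
    size_t        size;  /* size of this block in bytes */
    struct block *next;  /* next node in the list, or NULL */
} block_t;

/* Find to_free_pa in the allocated list and unlink it. Returns the unlinked
 * block, or NULL if the address was never handed out (assumed convention). */
static block_t *unlink_allocated(block_t **allocated_head, block_t *to_free_pa)
{
    block_t *previous_ptr = NULL;
    block_t *current_ptr = *allocated_head;

    while (current_ptr != NULL && current_ptr != to_free_pa) {
        previous_ptr = current_ptr;       /* no match: step to the next node */
        current_ptr = current_ptr->next;
    }
    if (current_ptr == NULL)
        return NULL;

    if (previous_ptr == NULL)
        *allocated_head = current_ptr->next;     /* the head was released */
    else
        previous_ptr->next = current_ptr->next;  /* bypass the released node */

    current_ptr->next = NULL;
    return current_ptr;
}
```

Checking membership before releasing gives the allocator a cheap guard against double frees and stray addresses, since only addresses found in the allocated list are accepted.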
In one embodiment, the method further comprises the following steps: if the newly released memory is adjacent to a free memory block in the available linked list, merging the newly released memory with that adjacent memory block. Starting from the head of the available linked list, the pointer of the current memory block is added to the size of that block in turn, and the sum is compared with the physical address of the newly released memory block; if they are equal, the current memory block is merged with the newly released memory block, with the current memory block in front and the newly released memory block behind. If they are not equal, the pointer of the newly released memory block is added to the size of the newly released memory block, and the sum is compared with the pointer of the current memory block in the available linked list in turn; if they are equal, the newly released memory block is merged with the current memory block, with the newly released memory block in front and the current memory block behind. If the newly released memory is not adjacent to any free memory block in the available linked list, the available linked list is traversed and the newly released memory block is added to it in order of memory address.
The specific implementation steps are as follows: the current block pointer current_ptr is set to the free linked list head pointer free_head, and the previous block pointer previous_ptr is set to NULL;
current_ptr + current_ptr->size is compared with to_free_pa; if they are equal, the current memory block is adjacent to the released memory block, with the released block behind it; the two blocks are merged by setting current_ptr->size to current_ptr->size + to_free_pa->size, and the merge is complete;
if they are not equal, to_free_pa + to_free_pa->size is compared with current_ptr; if they are equal, the current memory block is adjacent to the released memory block, with the released block in front of it; the two blocks are merged by setting current_ptr->size to current_ptr->size + to_free_pa->size, setting current_ptr to to_free_pa, and setting previous_ptr->next to current_ptr, and the merge is complete;
if they are still not equal and current_ptr->next is NULL, no free memory is adjacent to the released memory on either side; the free linked list is traversed again until current_ptr is greater than to_free_pa and previous_ptr + previous_ptr->size is smaller than to_free_pa, then previous_ptr->next is set to to_free_pa and to_free_pa->next is set to current_ptr, which adds the released memory block to the free linked list; if current_ptr->next is not NULL, the current block pointer is moved back by one node and the check continues.
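As a non-limiting illustration, the merging logic may be sketched in C as below. The sketch is a simplification and not the described procedure itself: it assumes that the free linked list is kept in address order, that each node header (the hypothetical type block_t) sits at the start of the block it describes, and that size counts the whole block in bytes, header included. Under those assumptions a single pass can test both neighbours, instead of the two traversals described above. The function name release_to_free_list is likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct block {
    size_t size;          /* assumed: total bytes in the block, header included */
    struct block *next;
} block_t;

/* Address one past the last byte of a block. */
static char *block_end(block_t *b) { return (char *)b + b->size; }

/* Insert to_free_pa into the address-ordered free list, merging it with any
   free block that touches it on either side. */
void release_to_free_list(block_t **free_head, block_t *to_free_pa)
{
    block_t *previous_ptr = NULL;
    block_t *current_ptr = *free_head;

    /* Stop at the first free block that lies above the released one. */
    while (current_ptr != NULL && current_ptr < to_free_pa) {
        previous_ptr = current_ptr;
        current_ptr = current_ptr->next;
    }

    /* Released block in front: absorb the free block that follows it. */
    if (current_ptr != NULL && block_end(to_free_pa) == (char *)current_ptr) {
        to_free_pa->size += current_ptr->size;
        to_free_pa->next = current_ptr->next;
    } else {
        to_free_pa->next = current_ptr;
    }

    /* Free block in front: grow it over the released block. */
    if (previous_ptr != NULL && block_end(previous_ptr) == (char *)to_free_pa) {
        previous_ptr->size += to_free_pa->size;
        previous_ptr->next = to_free_pa->next;
    } else if (previous_ptr != NULL) {
        previous_ptr->next = to_free_pa;   /* plain insert in address order */
    } else {
        *free_head = to_free_pa;           /* released block becomes the head */
    }
}
```

When the released block is in front, the sketch keeps the merged size in the released block's own header, because that header is the one that survives the merge.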
It should be understood that, although the steps in the flowchart of fig. 1 are shown in sequence as indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in fig. 1 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be performed at different times; these sub-steps or stages need not be performed sequentially, and may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, a heap memory management method for a shared memory coprocessor is provided, including the following steps:
step 1: calling the init_cp_heap() interface to initialize the coprocessor heap memory space;
step 1.1: setting the size of the coprocessor heap memory space to heap_size, which must be large enough to meet the dynamic memory requirements of the coprocessor end program;
step 1.2: applying for a memory space of size heap_size, and obtaining its virtual address heap_start_va;
step 1.3: converting the virtual address heap_start_va into the physical address heap_start_pa;
step 1.4: transmitting the heap physical address heap_start_pa to the coprocessor end program;
step 1.5: setting the available linked list head pointer free_head to heap_start_pa, and setting the allocated linked list head pointer allocated_head to NULL;
step 2: calling the malloc_cp_heap() interface to apply for coprocessor end heap memory space;
step 2.1: locating an available memory block;
step 2.1.1: setting the current block pointer current_ptr to the available linked list head pointer free_head, and setting the previous block pointer previous_ptr to NULL;
step 2.1.2: comparing the size of the current free memory block, current_ptr->size, with the requested memory size required_size; if current_ptr->size is smaller than required_size, the current node does not have enough available memory, and step 2.1.3 is executed; if current_ptr->size is larger than or equal to required_size, the current node has enough available memory, and step 2.2 is executed;
step 2.1.3: setting previous_ptr to current_ptr, setting current_ptr to the address of the next memory block, current_ptr->next, and executing step 2.1.2;
step 2.2: returning the physical address of the allocated coprocessor heap memory space;
step 2.2.1: setting the return address returned_pa to current_ptr;
step 2.2.2: setting the size of the current free memory block to current_ptr->size - required_size;
step 2.2.3: setting the next pointer of the previous memory block, previous_ptr->next, to current_ptr + current_size, thereby adding the new free memory block to the linked list;
step 2.3: adding the newly allocated memory to the tail of the allocated linked list;
step 2.3.1: setting the current block pointer current_ptr to the allocated linked list head pointer allocated_head;
step 2.3.2: checking whether current_ptr->next is NULL; if it is not NULL, executing step 2.3.3; if it is NULL, executing step 2.3.4;
step 2.3.3: setting current_ptr to the address of the next memory block, current_ptr->next, and executing step 2.3.2;
step 2.3.4: setting current_ptr->next to returned_pa, thereby adding the newly allocated memory to the tail of the allocated linked list;
step 2.3.5: setting current_ptr to the address of the next memory block, current_ptr->next, and setting current_ptr->next to NULL;
step 3: calling the free_cp_heap() interface to release coprocessor end heap memory space;
step 3.1: locating the memory block to be released;
step 3.1.1: obtaining the physical address to_free_pa of the memory block to be released, setting the current block pointer current_ptr to the allocated linked list head pointer allocated_head, and setting the previous block pointer previous_ptr to NULL;
step 3.1.2: comparing current_ptr with to_free_pa; if they are equal, the memory block to be released has been located, and step 3.2 is executed; if not, step 3.1.3 is executed;
step 3.1.3: setting previous_ptr to current_ptr, setting current_ptr to the address of the next memory block, current_ptr->next, and executing step 3.1.2;
step 3.2: releasing the memory;
step 3.2.1: if current_ptr equals allocated_head, executing step 3.2.2; if current_ptr is not equal to allocated_head, executing step 3.2.3;
step 3.2.2: setting allocated_head to current_ptr->next, and executing step 3.2.4;
step 3.2.3: setting the next pointer of the previous memory block, previous_ptr->next, to the address of the memory block that follows the current one, current_ptr->next;
step 3.3: merging free memory;
step 3.3.1: setting the current block pointer current_ptr to the free linked list head pointer free_head, and setting the previous block pointer previous_ptr to NULL;
step 3.3.2: comparing current_ptr + current_ptr->size with to_free_pa; if they are equal, executing step 3.3.3; if not, executing step 3.3.4;
step 3.3.3: the current memory block is adjacent to the released memory block, with the released block behind it; the two blocks are merged by setting current_ptr->size to current_ptr->size + to_free_pa->size, and the merge is complete;
step 3.3.4: comparing to_free_pa + to_free_pa->size with current_ptr; if they are equal, executing step 3.3.5; if not, executing step 3.3.6;
step 3.3.5: the current memory block is adjacent to the released memory block, with the released block in front of it; the two blocks are merged by setting current_ptr->size to current_ptr->size + to_free_pa->size, setting current_ptr to to_free_pa, and setting previous_ptr->next to current_ptr, and the merge is complete;
step 3.3.6: if current_ptr->next is NULL, executing step 3.3.8; otherwise, executing step 3.3.7;
step 3.3.7: setting previous_ptr to current_ptr, setting current_ptr to the address of the next memory block, current_ptr->next, and executing step 3.3.2;
step 3.3.8: no free memory is adjacent to the released memory on either side; the free linked list is traversed again until current_ptr is greater than to_free_pa and previous_ptr + previous_ptr->size is smaller than to_free_pa, then previous_ptr->next is set to to_free_pa and to_free_pa->next is set to current_ptr, which adds the released memory block to the free linked list;
step 4: calling destroy_cp_heap() to clean up the coprocessor heap memory space;
step 4.1: after the coprocessor end program has finished executing, releasing the coprocessor heap memory through heap_start_va.
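Steps 1, 2 and 4 above may be sketched in C as follows, purely as an illustration under stated assumptions. The interface names init_cp_heap, malloc_cp_heap and destroy_cp_heap come from the description, but their signatures, the types block_t and cp_heap_t, and the rule that sizes are multiples of the header size are assumptions of this sketch. The standard malloc and free stand in for the host allocation, the virtual-to-physical conversion of step 1.3 and the transfer of step 1.4 are omitted, and the host virtual address is used wherever the description uses a physical address. The sketch returns the block header address, as step 2.2.1 does; a practical allocator would presumably hand the caller an address past the header. It also updates free_head when the first block is chosen, a case in which previous_ptr is still NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct block {
    size_t size;              /* assumed: total bytes, header included */
    struct block *next;
} block_t;

typedef struct {
    void    *heap_start_va;   /* address returned by the host allocator */
    block_t *free_head;       /* available list */
    block_t *allocated_head;  /* allocated list, newest block at the tail */
} cp_heap_t;

/* Step 1: reserve heap_size bytes and treat them as one large free block. */
int init_cp_heap(cp_heap_t *h, size_t heap_size)
{
    if (heap_size == 0 || heap_size % sizeof(block_t) != 0)
        return -1;
    h->heap_start_va = malloc(heap_size);
    if (h->heap_start_va == NULL)
        return -1;
    h->free_head = (block_t *)h->heap_start_va;
    h->free_head->size = heap_size;
    h->free_head->next = NULL;
    h->allocated_head = NULL;
    return 0;
}

/* Step 2: first fit from the head of the available list. required_size is
   the whole block size and must be a nonzero multiple of sizeof(block_t). */
block_t *malloc_cp_heap(cp_heap_t *h, size_t required_size)
{
    block_t *previous_ptr = NULL, *current_ptr = h->free_head;

    if (required_size == 0 || required_size % sizeof(block_t) != 0)
        return NULL;
    while (current_ptr != NULL && current_ptr->size < required_size) {
        previous_ptr = current_ptr;
        current_ptr = current_ptr->next;
    }
    if (current_ptr == NULL)
        return NULL;                        /* no block is large enough */

    /* Step 2.2: split off the remainder, or hand out the whole block. */
    block_t *rest = current_ptr->next;
    if (current_ptr->size > required_size) {
        rest = (block_t *)((char *)current_ptr + required_size);
        rest->size = current_ptr->size - required_size;
        rest->next = current_ptr->next;
        current_ptr->size = required_size;
    }
    if (previous_ptr == NULL)
        h->free_head = rest;
    else
        previous_ptr->next = rest;

    /* Step 2.3: append the block to the tail of the allocated list. */
    current_ptr->next = NULL;
    if (h->allocated_head == NULL) {
        h->allocated_head = current_ptr;
    } else {
        block_t *tail = h->allocated_head;
        while (tail->next != NULL)
            tail = tail->next;
        tail->next = current_ptr;
    }
    return current_ptr;
}

/* Step 4: hand the whole region back through the saved virtual address. */
void destroy_cp_heap(cp_heap_t *h)
{
    free(h->heap_start_va);
    h->heap_start_va = NULL;
    h->free_head = NULL;
    h->allocated_head = NULL;
}
```

Restricting sizes to multiples of the header size keeps every remainder large enough to hold its own header and keeps every header aligned, which is why the sketch rejects other sizes instead of rounding them.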
In one embodiment, as shown in fig. 2, a heap memory management apparatus for a shared memory co-processor is provided, including: a heap memory space application module 202, a linked list organization module 204, a heap memory space allocation module 206, a heap memory space release module 208, and a heap memory space cleaning module 210, wherein:
a heap memory space application module 202, configured to obtain coprocessor heap memory space size information to be applied by a coprocessor, apply for a heap memory space through a first interface function according to the coprocessor heap memory space size information, convert a virtual address of the applied heap memory space into a physical address, and transmit the physical address to a coprocessor end program;
a linked list organizing module 204, configured to organize the applied heap memory space into an available linked list and an allocated linked list through a first interface function;
a heap memory space allocation module 206, configured to, when the coprocessor end program applies for memory usage, search from the head of the available linked list, through a second interface function, for the first available memory block of sufficient size and allocate it, and add the newly allocated memory to the tail of the allocated linked list;
a heap memory space releasing module 208, configured to release the heap memory space of the coprocessor end through a third interface function and add a newly released memory into an available linked list during the execution of the coprocessor end program;
and a heap memory space cleaning module 210, configured to clean the coprocessor heap memory space according to the virtual address through the fourth interface function after the coprocessor end program is executed.
The linked list organizing module 204 is further configured to initialize the available linked list and the allocated linked list through the first interface function; point the head pointer of the available linked list to the physical address; and set the head pointer of the allocated linked list to null. The information stored by each node of the available linked list and the allocated linked list includes: the memory size of the current node and a next pointer to the next node.
The heap memory space allocation module 206 is further configured to, when the coprocessor end program applies for memory usage, set the current memory block pointer to the head pointer of the available linked list through the second interface function; if the size of the current memory block is larger than or equal to the requested memory size, return the pointer of the current memory block as the allocated physical address; otherwise, take the pointer of the next memory block as the current memory block pointer and check whether the size of the current memory block satisfies the requested memory size, until a memory block of suitable size is found, and return the pointer of that memory block as the allocated physical address; and add the newly allocated memory to the tail of the allocated linked list.
The heap memory space allocation module 206 is further configured to obtain a size of the current memory block; acquiring the size of an application memory; subtracting the size of the applied memory from the size of the current memory block to obtain the size of the current idle memory block; and adding the current idle memory block into the available linked list.
The heap memory space releasing module 208 is further configured to, during the execution of the coprocessor end program, obtain the physical address of the memory block to be released through the third interface function; and, starting from the head of the allocated linked list, sequentially compare the pointer of the current memory block in the allocated linked list with the physical address of the memory block to be released, and if they are equal, set the next pointer of the previous memory block in the allocated linked list to the physical address of the memory block that follows the current memory block.
The heap memory space releasing module 208 is further configured to merge the newly released memory with an adjacent memory block in the available linked list if the newly released memory is adjacent to an idle memory block in the available linked list; and if the newly released memory is not adjacent to the idle memory block in the available linked list, traversing the available linked list, and adding the newly released memory block into the available linked list according to the sequence of the memory addresses.
The heap memory space releasing module 208 is further configured to, starting from the head of the available linked list, sequentially judge whether the pointer of the current memory block in the available linked list plus the size of the current memory block is equal to the physical address of the newly released memory block, and if so, merge the current memory block in the available linked list with the newly released memory block, the current memory block being in front and the newly released memory block behind; if not, add the size of the newly released memory block to the pointer of the newly released memory block, compare the result in sequence with the pointer of the current memory block in the available linked list, and if they are equal, merge the newly released memory block with the current memory block in the available linked list, the newly released memory block being in front and the current memory block behind.
For specific limitations of the heap memory management device for the shared memory coprocessor, reference may be made to the above limitations of the heap memory management method for the shared memory coprocessor, and details are not repeated here. All or part of each module in the heap memory management device for the shared memory type coprocessor can be implemented by software, hardware, or a combination thereof. The modules can be embedded in, or independent of, a processor in the computer device in hardware form, or can be stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 3. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a heap memory management method for a shared memory co-processor. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by those skilled in the art that the configuration shown in fig. 3 is a block diagram of only a portion of the configuration associated with the present application, and is not intended to limit the computing device to which the present application may be applied, and that a particular computing device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In an embodiment, a computer device is provided, comprising a memory storing a computer program and a processor implementing the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features contains no contradiction, it should be considered to fall within the scope of this specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these all fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A heap memory management method for a shared memory type coprocessor is characterized by comprising the following steps:
acquiring coprocessor heap memory space size information to be applied by a coprocessor through a CPU (central processing unit), applying for a heap memory space through a first interface function according to the coprocessor heap memory space size information, converting a virtual address of the applied heap memory space into a physical address, and transmitting the physical address to a coprocessor end program; the CPU and the coprocessor form a heterogeneous system; the coprocessor is a shared memory type coprocessor;
respectively organizing the applied heap memory space into an available linked list and an allocated linked list through the first interface function; the information stored by each node of the available linked list and the allocated linked list includes: the memory size of the current node and a pointer to the next node;
when a coprocessor end program applies for memory usage, searching from the head of the available linked list, through a second interface function, for the first available memory block of sufficient size and allocating it, and adding the newly allocated memory to the tail of the allocated linked list;
in the process of executing the coprocessor-side program, releasing the memory space of the coprocessor-side heap through a third interface function, and adding a newly released memory into the available linked list;
and after the execution of the coprocessor end program is finished, cleaning the memory space of the coprocessor heap according to the virtual address through a fourth interface function.
2. The method of claim 1, wherein organizing the applied heap memory space into an available linked list and an allocated linked list respectively through the first interface function comprises:
initializing the available linked list and the allocated linked list through the first interface function; pointing the head pointer of the available linked list to the physical address;
setting the head pointer of the allocated linked list to null.
3. The method of claim 2, wherein, when the coprocessor end program applies for memory usage, searching from the head of the available linked list, through the second interface function, for the first available memory block of sufficient size and allocating it, and adding the newly allocated memory to the tail of the allocated linked list, comprises:
when the coprocessor end program applies for memory usage, setting the current memory block pointer to the head pointer of the available linked list through the second interface function;
if the size of the current memory block is larger than or equal to the requested memory size, returning the pointer of the current memory block as the allocated physical address; otherwise, taking the pointer of the next memory block as the current memory block pointer and judging whether the size of the current memory block satisfies the requested memory size, until a memory block of suitable size is found, and returning the pointer of that memory block as the allocated physical address;
and adding the newly allocated memory to the tail of the allocated linked list.
4. The method according to claim 3, wherein if the size of the current memory block is larger than or equal to the size of the application memory, returning the pointer of the current memory block as the allocated physical address further comprises:
obtaining the size of a current memory block;
acquiring the size of an application memory;
subtracting the size of the application memory from the size of the current memory block to serve as the size of the current idle memory block;
and adding the current idle memory block into the available linked list.
5. The method of claim 4, wherein releasing the coprocessor side heap memory space through the third interface function during the coprocessor side program execution process comprises:
in the process of executing the coprocessor-side program, acquiring a physical address of a memory block to be released through a third interface function;
and, starting from the head of the allocated linked list, sequentially comparing the pointer of the current memory block in the allocated linked list with the physical address of the memory block to be released, and if they are equal, setting the next pointer of the previous memory block in the allocated linked list to the physical address of the memory block that follows the current memory block.
6. The method of claim 5, wherein adding the newly released memory to the available linked list comprises:
if the newly released memory is adjacent to the idle memory block in the available linked list, merging the newly released memory with the adjacent memory block in the available linked list;
and traversing the available linked list if the newly released memory is not adjacent to the idle memory block in the available linked list, and adding the newly released memory block into the available linked list according to the sequence of memory addresses.
7. The method according to claim 6, wherein merging the newly released memory with the adjacent memory block in the available linked list if the newly released memory is adjacent to the free memory block in the available linked list comprises:
starting from the head of the available linked list, sequentially judging whether the pointer of the current memory block in the available linked list plus the size of the current memory block is equal to the physical address of the newly released memory block, and if so, merging the current memory block in the available linked list with the newly released memory block; wherein the current memory block in the available linked list is in front, and the newly released memory block is behind;
if not, adding the size of the newly released memory block to the pointer of the newly released memory block, comparing the result in sequence with the pointer of the current memory block in the available linked list, and if they are equal, merging the newly released memory block with the current memory block in the available linked list; wherein the newly released memory block is in front, and the current memory block in the available linked list is behind.
8. A heap memory management device for a shared memory coprocessor, the device comprising:
a heap memory space application module, configured to acquire, through a CPU (central processing unit), size information of the coprocessor heap memory space to be applied for by a coprocessor, apply for the heap memory space through a first interface function according to the size information of the coprocessor heap memory space, convert the virtual address of the applied heap memory space into a physical address, and transmit the physical address to the coprocessor end program; the CPU and the coprocessor form a heterogeneous system; the coprocessor is a shared memory type coprocessor;
a linked list organizing module, configured to organize the applied heap memory space into an available linked list and an allocated linked list respectively through the first interface function; the information stored by each node of the available linked list and the allocated linked list includes: the memory size of the current node and a pointer to the next node;
a heap memory space allocation module, configured to, when the coprocessor end program applies for memory usage, search from the head of the available linked list, through a second interface function, for the first available memory block of sufficient size and allocate it, and add the newly allocated memory to the tail of the allocated linked list;
the heap memory space releasing module is used for releasing the heap memory space of the coprocessor end through a third interface function in the process of executing the coprocessor end program and adding a newly released memory into the available linked list;
and the heap memory space cleaning module is used for cleaning the coprocessor heap memory space according to the virtual address through a fourth interface function after the coprocessor end program is executed.
9. The apparatus of claim 8, wherein the linked list organizing module is further configured to:
initializing the available linked list and the allocated linked list through the first interface function; the information stored by each node of the available linked list and the allocated linked list includes: the memory size of the current node and a next pointer to the next node;
pointing the head pointer of the available linked list to the physical address;
setting the head pointer of the allocated linked list to null.
10. The apparatus of claim 9, wherein the heap memory space allocation module is further configured to:
when the coprocessor end program applies for memory usage, setting the current memory block pointer to the head pointer of the available linked list through a second interface function;
if the size of the current memory block is larger than or equal to the requested memory size, returning the pointer of the current memory block as the allocated physical address; otherwise, taking the pointer of the next memory block as the current memory block pointer and judging whether the size of the current memory block satisfies the requested memory size, until a memory block of suitable size is found, and returning the pointer of that memory block as the allocated physical address;
and adding the newly allocated memory to the tail of the allocated linked list.
CN202210131446.2A 2022-02-14 2022-02-14 Heap memory management method and device for shared memory type coprocessor Active CN114185687B (en)


Publications (2)

Publication Number Publication Date
CN114185687A (en) 2022-03-15
CN114185687B (en) 2022-05-24





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant