WO2021232769A1 - Method for storing data and data processing device - Google Patents

Method for storing data and data processing device

Info

Publication number
WO2021232769A1
Authority
WO
WIPO (PCT)
Prior art keywords
data, candidate storage, stored
Prior art date
Application number
PCT/CN2020/136966
Other languages
English (en)
French (fr)
Inventor
张峰
周乃彪
胡英俊
王文强
蒋科
Original Assignee
北京市商汤科技开发有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京市商汤科技开发有限公司
Priority to KR1020217031361A priority Critical patent/KR20210144730A/ko
Priority to JP2021557735A priority patent/JP7164733B2/ja
Publication of WO2021232769A1 publication Critical patent/WO2021232769A1/zh

Classifications

    • G06F3/0649 Lifecycle management (horizontal data movement and migration mechanisms in storage systems)
    • G06F12/0253 Garbage collection, i.e. reclamation of unreferenced memory
    • G06F12/023 Free address space management
    • G06F12/084 Multiuser, multiprocessor or multiprocessing cache systems with a shared cache
    • G06F12/0871 Allocation or management of cache space (caches for peripheral storage systems, e.g. disk cache)
    • G06F3/0616 Improving the reliability of storage systems in relation to life time, e.g. increasing Mean Time Between Failures [MTBF]
    • G06F3/0626 Reducing size or complexity of storage systems
    • G06F3/0631 Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Definitions

  • This application relates to the computer field, in particular to methods and related products for storing data.
  • AI chips are generally composed of multiple computing units with different functions, a high-speed shared cache with limited space, and Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM, hereinafter DDR).
  • the embodiments of the present application disclose methods and related products for storing data.
  • an embodiment of the present application provides a method for storing data.
  • the method includes: determining at least two candidate storage spaces in the target memory based on the size of the storage space required by the data to be stored; based on at least one of the first data release time and the life cycle of the data to be stored, determining the target weight of each of multiple candidate storage schemes for storing the data to be stored in the at least two candidate storage spaces, where each candidate storage space corresponds to at least one candidate storage scheme; and determining the target storage scheme for the data to be stored based on the target weight of each of the multiple candidate storage schemes.
  • an embodiment of the present application provides a data processing device, the device including: a first determining unit configured to determine at least two candidate storage spaces in the target memory based on the size of the storage space required by the data to be stored; a second determining unit configured to determine, based on at least one of the first data release time and the life cycle of the data to be stored, the target weight of each of multiple candidate storage schemes for storing the data to be stored in the at least two candidate storage spaces, where each candidate storage space corresponds to at least one candidate storage scheme; and a third determining unit configured to determine the target storage scheme of the data to be stored based on the target weight of each of the multiple candidate storage schemes.
  • an embodiment of the present application provides an electronic device, the electronic device including: a memory storing processor-executable instructions, a target memory, and a processor, where the processor, when executing the instructions, implements the method of the first aspect or any optional implementation thereof.
  • an embodiment of the present application provides a chip that includes a processor, a data interface, and the target memory described in the first aspect, where the processor is configured to execute the method of the first aspect or any possible implementation thereof.
  • an embodiment of the present application provides a computer-readable storage medium that stores a computer program, the computer program including program instructions that, when executed by a processor of an electronic device, cause the processor to implement the method of the first aspect or any of its optional implementations.
  • the embodiments of the present application provide a computer program product, the computer program product including program instructions that, when executed by a processor, cause the processor to execute the method of the first aspect or any of its optional implementations.
  • a storage scheme that can effectively reduce memory fragmentation can be determined from a variety of candidate storage schemes.
  • FIG. 1 is a flowchart of a method for storing data provided by an embodiment of the present application;
  • FIG. 2 is a schematic diagram of a process of calculating a target weight provided by an embodiment of the present application;
  • FIG. 3 is a flowchart of another method for storing data provided by an embodiment of the present application;
  • FIG. 4 is a schematic structural diagram of a data processing device provided by an embodiment of the present application;
  • FIG. 5 is a schematic structural diagram of another data processing device provided by an embodiment of the present application;
  • FIG. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the high-speed shared cache of an AI chip is generally a few MB in size, commonly 8 MB or 12 MB. The instructions of an AI chip differ from those of a Central Processing Unit (CPU): CPU registers have a fixed size (for example, the registers of a 32-bit CPU are 32 bits wide), whereas an AI chip has no registers, and the size of a neural-network tensor is not fixed, so a fixed storage space cannot be allocated to each tensor. Given the limited shared-cache space and the variable size of the tensors to be allocated, an ineffective allocation strategy easily produces memory fragments. These fragments appear as small, discontinuous free regions at different locations, leaving that free memory unusable and preventing the high-speed shared cache from being fully exploited. Such free memory fragments exist in two forms: internal fragments and external fragments.
  • for example, suppose the state of the entire memory space is: addresses 0-9 free, 10-14 occupied, 15-24 occupied, and 25-99 free. Here the interval 0-9 is a memory fragment: if addresses 10-14 remain occupied and every later request asks for more than 10 units, the interval 0-9 can never be used and becomes an external fragment.
  • the embodiments of the present application provide a method for storing data that can reduce fragmentation.
  • the method for storing data provided in the embodiments of the present application is mainly applied to the allocation scenario of the shared cache in the AI chip.
  • when the AI chip performs data processing tasks, such as text recognition, image recognition, image super-resolution processing, speech recognition, and text translation, the shared cache must be occupied. That is to say, the method for storing data provided by the embodiments of the present application is mainly applied to scenarios where the AI chip performs data processing tasks, but it can also be applied to other memory or cache allocation scenarios; the embodiments of the present disclosure do not limit this.
  • the method for storing data provided by the embodiments of the present application can also be applied to a compilation scenario of an AI model, that is, a scenario in which the AI model is compiled into a sequence of instructions executable by the AI chip using compilation software.
  • the data processing device can execute the method for storing data provided by the embodiments of the application to simulate the allocation of the shared cache when the AI model performs processing operations, and then compile the AI model to obtain an instruction sequence that indicates the memory allocation and release of the shared cache.
  • the AI chip executes the instruction sequence obtained by compiling the AI model, the memory allocation and release process of the shared cache is the same as the memory allocation and release process obtained by executing the method for storing data provided by the embodiment of the present application.
  • the AI chip does not need to execute the method for storing data provided in the embodiments of the present application in real time when performing data processing tasks, but only needs to execute the instruction sequence, which takes less time.
  • the AI chip in the data processing device can reduce the generation of memory fragments and improve the success rate of cache allocation when performing data processing tasks.
  • the shared cache of the AI chip is dynamically allocated when the program of the data processing device is running.
  • the shared cache can be divided into multiple storage spaces, such as cache blocks.
  • the sizes of different cache blocks can be the same or different, and can be decided based on the needs of the cached data.
  • the state of the cache block can be marked.
  • the allocated block can be marked as used_item
  • the unallocated block can be marked as free_item.
  • the initial state is that the entire shared cache is one free_item. After a number of memory allocations and releases, there may be multiple used_items, and between adjacent used_items there may be one free_item or none.
  • Allocated blocks refer to occupied storage space
  • unallocated blocks refer to unoccupied storage space.
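  • The bookkeeping described above can be sketched as a small data structure (our own illustrative model, not the patent's actual layout): the shared cache is an ordered list of items, each marked as a used_item or a free_item.

```python
from dataclasses import dataclass

@dataclass
class Item:
    start: int  # offset of the block within the shared cache
    size: int   # size of the block in allocation units
    used: bool  # True for used_item, False for free_item

def initial_cache(total_size: int) -> list:
    # Initially the entire shared cache is a single free_item.
    return [Item(start=0, size=total_size, used=False)]

cache = initial_cache(8 * 1024 * 1024)  # e.g. an 8 MB shared cache
```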
  • the compiler generates an instruction sequence for the AI chip; the sequence number of each instruction in the instruction sequence is called its instruction sequence number.
  • the compiler is a piece of software or a piece of program code run by the data processing device.
  • Each tensor (which can be understood as data) may be used by multiple instructions (as the output of the instruction or as the input of the instruction).
  • the smallest of these sequence numbers can be called the tensor's start program counter (start_pc), and the largest can be called its end program counter (end_pc).
  • the difference between end_pc and start_pc can be called the life cycle of the tensor.
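  • Under these definitions, the life cycle of a tensor can be computed from the sequence numbers of the instructions that use it; a minimal sketch (the function name is ours):

```python
def life_cycle(instruction_seq_nums):
    # start_pc: smallest sequence number among the instructions using the tensor
    # end_pc: largest sequence number among those instructions
    start_pc = min(instruction_seq_nums)
    end_pc = max(instruction_seq_nums)
    return start_pc, end_pc, end_pc - start_pc

# A tensor used by instructions 3, 7 and 12 has start_pc 3, end_pc 12
# and life cycle 12 - 3 = 9.
```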
  • the data release time of the data refers to the time when the address occupied by the data is released, that is, the time when the data is released.
  • Fig. 1 is a flowchart of a method for storing data provided by an embodiment of the application.
  • the data processing device determines at least two candidate storage spaces in the target memory based on the size of the storage space required for the data to be stored.
  • the data to be stored may be input image data, or an intermediate result and/or final result produced by processing the input image through a neural network.
  • the data to be stored may be at least a part of a feature map, or it may be model data, such as the weights of a model; the embodiments of the present disclosure do not limit this.
  • the size of each candidate storage space (corresponding to a free_item) is greater than or equal to the size of the storage space required to store the aforementioned data to be stored.
  • the data processing device may be a device that can perform data processing operations, such as a server, a desktop computer, a notebook computer, a mobile phone, and a tablet computer.
  • the aforementioned target memory is a shared cache in the artificial intelligence AI chip.
  • the data processing device may determine two or more candidate storage spaces that can store the data to be stored from among the multiple discrete storage spaces (ie, free_item) that are not allocated by the target memory.
  • the processor in the data processing device can linearly scan all the storage spaces (i.e., items) of the shared cache and take each free_item that is greater than or equal to the storage space required by the data to be stored (such as a tensor) as a candidate storage space, thereby obtaining the at least two candidate storage spaces.
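  • This linear scan can be sketched as follows, modeling each item as a (start, size, used) tuple (an illustrative simplification, not the patent's data layout):

```python
def candidate_spaces(items, required_size):
    # Keep every free_item whose size can hold the data to be stored.
    return [(start, size) for (start, size, used) in items
            if not used and size >= required_size]

# Items mirroring the earlier 0-99 example: 0-9 free, 10-14 and 15-24
# occupied, 25-99 free.
items = [(0, 10, False), (10, 5, True), (15, 10, True), (25, 75, False)]
print(candidate_spaces(items, 20))  # only the free block at 25 is large enough
```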
  • each candidate storage space corresponds to at least one candidate storage solution.
  • the first data release time of the data to be stored may be the time when the data to be stored is released, that is, the time when the storage space occupied by the data to be stored is released.
  • the life cycle of the data to be stored may be the interval between the time when the data to be stored is released and the time when the data to be stored is stored.
  • the target weight of each candidate storage scheme is negatively correlated with the interval between the first data release time of the data to be stored and the second data release time, where the second data release time is the data release time of the data stored in the storage space adjacent to the storage location of the data to be stored in the candidate storage scheme. The implementation of step 102 will be described in detail later.
  • determining the target storage scheme of the data to be stored may mean that the data processing device determines the candidate storage scheme with the largest target weight among the respective target weights of the multiple candidate storage schemes as the target storage scheme; it may also mean that the data processing device determines as the target storage scheme any candidate storage scheme whose target weight exceeds a preset weight threshold, where the weight threshold may be 0.6, 0.75, 0.8, and so on.
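  • Both selection rules (largest target weight, or first weight above a preset threshold) can be sketched as follows (the function and its structure are our own illustration):

```python
def select_scheme(weighted_schemes, threshold=None):
    # weighted_schemes: list of (scheme, target_weight) pairs.
    if threshold is not None:
        # Accept any scheme whose target weight exceeds the preset threshold.
        for scheme, weight in weighted_schemes:
            if weight > threshold:
                return scheme
    # Otherwise fall back to the scheme with the largest target weight.
    return max(weighted_schemes, key=lambda sw: sw[1])[0]
```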
  • the data processing apparatus may also perform the following operations: store the data to be stored from the first address to the second address of the candidate storage space corresponding to the target storage scheme, and set the storage space from the first address to the second address as allocated storage space (i.e., used_item).
  • one of the first address and the second address is the start address of the candidate storage space corresponding to the target storage scheme, or one of the first address and the second address is the target storage scheme The end address of the corresponding candidate storage space.
  • the data processing apparatus may also perform the following operations: release the storage space corresponding to the first address to the second address, and set the storage space from the first address to the second address as unallocated storage space (i.e., free_item).
  • a certain memory management software run by the data processing device executes the method flow of FIG. 1.
  • if the candidate storage space corresponding to the target storage scheme is larger than the storage space required by the data to be stored, then after the data to be stored is written from the first address to the second address, the part of the candidate storage space not occupied by the data to be stored is still set as unallocated storage space (i.e., free_item). For example, if the first address is the start address of the candidate storage space, the storage space between the address following the second address and the end address of the candidate storage space is set as unallocated storage space; if the second address is the end address of the candidate storage space, the storage space between the start address of the candidate storage space and the address preceding the first address is set as unallocated storage space.
  • based on at least one of the first data release time and the life cycle of the data to be stored, the target weight of each of the multiple candidate storage schemes for storing the data to be stored in the at least two candidate storage spaces is determined; according to the multiple target weights, a storage scheme that can effectively reduce memory fragmentation can be determined from the multiple candidate storage schemes.
  • the candidate storage scheme corresponding to each candidate storage space includes at least one of a first candidate storage scheme and a second candidate storage scheme, where the starting storage address in the first candidate storage scheme is the start address of the candidate storage space, and the ending storage address in the second candidate storage scheme is the end address of the candidate storage space.
  • each candidate storage space corresponds to one or two allocation methods, namely left allocation (corresponding to the first candidate storage scheme) and right allocation (corresponding to the second candidate storage scheme), and the target weight of each allocation method can be calculated separately.
  • Left allocation refers to storing the data to be stored from the starting address of a candidate storage space up to a certain address, that is, allocating the starting address of the candidate storage space and multiple consecutive addresses after it to the data to be stored.
  • Right allocation refers to storing the data to be stored from an address to the end address of a candidate storage space, that is, allocating the end address of the storage space and multiple consecutive addresses before the end address for the data to be stored.
  • when the size of a candidate storage space is greater than the size of the storage space required to store the data to be stored, the candidate storage space has two allocation methods (the left allocation and the right allocation differ); when its size equals the size of the required storage space, the candidate storage space has only one allocation method (the left allocation and the right allocation are the same). For example, if there are 10 candidate storage spaces whose size is greater than the required storage space, the data processing device performs 20 rounds of target weight calculation, that is, it calculates, for each candidate storage space, the target weight corresponding to the left allocation method and the target weight corresponding to the right allocation method.
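  • The enumeration of left/right schemes per candidate space can be sketched as follows, returning (first address, last address) ranges; the representation is our own:

```python
def candidate_schemes(space_start, space_size, required_size):
    # Left allocation: data starts at the space's start address.
    left = (space_start, space_start + required_size - 1)
    # Right allocation: data ends at the space's end address.
    right = (space_start + space_size - required_size,
             space_start + space_size - 1)
    # When the space exactly fits the data, both schemes coincide
    # and only one scheme is produced.
    return [left] if left == right else [left, right]
```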
  • in this way, after the storage space occupied by the data to be stored is released, it can be merged with its adjacent free storage space into a larger storage space, reducing memory fragmentation.
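  • The merge on release can be sketched as follows, again modeling blocks as (start, size, used) tuples; a freed block is coalesced with any free neighbours (an illustrative simplification, not the patent's implementation):

```python
def release(items, index):
    # Mark the block at `index` as free.
    start, size, _ = items[index]
    items[index] = (start, size, False)
    # Merge with the following block if it is also free.
    if index + 1 < len(items) and not items[index + 1][2]:
        _, next_size, _ = items.pop(index + 1)
        size += next_size
        items[index] = (start, size, False)
    # Merge with the preceding block if it is also free.
    if index > 0 and not items[index - 1][2]:
        prev_start, prev_size, _ = items.pop(index - 1)
        items[index - 1] = (prev_start, prev_size + size, False)
    return items
```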
  • FIG. 2 is a schematic diagram of a process of calculating the target weight of a candidate storage solution provided by an embodiment of this application.
  • the black rectangular areas 211-216 represent allocated storage spaces (i.e., used_items) in the target memory, and the white rectangular areas 201-205 represent unallocated storage spaces (i.e., free_items). Assume that storage space 201, storage space 203, and storage space 205 can each hold the data to be stored; the sizes of storage space 201 and storage space 203 are greater than the size of the storage space required to store the data to be stored, while the size of storage space 205 is equal to it.
  • the black rectangular area in the figure represents the occupied part of the storage space
  • the white rectangular area represents the unoccupied part of the storage space
  • the upper edge of the rectangular area represents the starting address of the corresponding storage space.
  • the bottom edge of the rectangular area indicates the end address of the corresponding storage space.
  • the target weight is calculated when the data to be stored is stored from the starting address of the storage space 203 to a certain address (allocated to the left).
  • the target weight when the data to be stored is stored in a certain address of the storage space 203 to the end address (allocated to the right) is calculated.
  • the target weight when storing the data to be stored in the storage space 205 from the start address to the end address (that is, the left allocation and the right allocation are the same); and so on.
  • each time the data processing device calculates the target weight for storing the data to be stored in a candidate storage space, the result may be taken as a first target weight, and the following operation may be performed: when the current maximum target weight is less than the first target weight, update the current maximum target weight to the first target weight. That is, the target weight obtained in the first round of calculation is saved as the current maximum target weight; the target weight obtained in the i-th round (i being a positive integer greater than 1) is compared with the saved current maximum target weight, and if the newly calculated target weight is greater, the current maximum target weight is updated to it; otherwise the current maximum target weight remains unchanged.
  • the foregoing embodiment does not describe in detail how to determine the target weight of each of the multiple candidate storage schemes for storing the data to be stored in the at least two candidate storage spaces. The following takes the calculation of the target weight of a reference candidate storage scheme as an example to introduce some optional implementations, where the reference candidate storage scheme is any candidate storage scheme corresponding to the at least two candidate storage spaces.
  • the target weight corresponding to the reference candidate storage scheme is negatively correlated with the time interval between the first data release time of the data to be stored and the second data release time, where the second data release time is the data release time of the data stored in the storage space adjacent to the storage location of the data to be stored in the reference candidate storage scheme. For example, the target weight corresponding to the reference candidate storage scheme may be the reciprocal of the interval between the first data release time and the second data release time.
  • if the first data release time is t1 and the second data release time is t2, the target weight corresponding to the reference candidate storage scheme is 1/|t1 - t2|.
  • taking FIG. 2 as an example, for storage space 201 the adjacent storage space is 211 or 212: with the left allocation the adjacent storage space of storage space 201 is 211, and with the right allocation it is 212. Similarly, for storage space 205 the adjacent storage space may be 215 or 216.
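  • The reciprocal weight described above can be sketched as follows (the epsilon guard against equal release times is our addition; the patent fixes only the negative correlation and the reciprocal example):

```python
def release_time_weight(t1, t2, eps=1e-9):
    # t1: first data release time (of the data to be stored)
    # t2: second data release time (of the data in the adjacent space)
    # Closer release times yield a larger weight, favouring placements
    # whose neighbours are freed at nearly the same time, so that the
    # freed blocks can be merged into one larger free block.
    return 1.0 / (abs(t1 - t2) + eps)
```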
  • in some embodiments, determining the target weight of each candidate storage scheme includes: determining the target weight of the candidate storage scheme based on the life cycle of the data to be stored and the start address of the candidate storage space corresponding to the candidate storage scheme. Determining the target storage scheme in this way enables the life cycles of the data stored in the target memory to increase or decrease with the storage address.
  • in other words, the data processing device executing the method tries to store data with a short life cycle on one side of the storage space (for example, on the left) and data with a long life cycle on the other side (for example, on the right). In some embodiments, determining, based on at least one of the first data release time and the life cycle of the data to be stored, the target weight of each of the multiple candidate storage schemes includes: determining the maximum life cycle corresponding to the data to be stored; determining a first ratio between the life cycle of the data to be stored and the maximum life cycle; determining a second ratio between the start address of the candidate storage space corresponding to the candidate storage scheme and the end address of the target memory; and determining the target weight of the candidate storage scheme based on the first ratio and the second ratio.
  • the target weight of the candidate storage solution is negatively correlated with the absolute value of the difference between the first ratio and the second ratio.
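As an illustrative sketch (not part of the disclosed embodiments; the function name and the exact functional form are assumptions), a weight that is negatively correlated with the absolute difference between the first ratio and the second ratio could be written as:

```python
def lifecycle_weight(lifecycle, max_lifecycle, start_addr, mem_end_addr):
    """Illustrative weight: larger when the lifecycle ratio (r1) and the
    address ratio (r2) are close, i.e. negatively correlated with |r1 - r2|."""
    r1 = lifecycle / max_lifecycle      # first ratio
    r2 = start_addr / mem_end_addr      # second ratio
    return 1.0 - abs(r1 - r2)
```

Under this form, long-lived data placed near the end of the memory (r1 and r2 both close to 1) scores highest, matching the tendency described above for life cycles to increase with the storage address.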
  • the maximum life cycle corresponding to the data to be stored may be the maximum life cycle among the life cycles of the data corresponding to each instruction in the instruction sequence, that is, the maximum length of time that the data related to the data to be stored occupies the target memory.
  • the maximum life cycle corresponding to the data to be stored is the maximum value of the life cycles of all data that needs to be stored during the image processing process, including all data with allocated memory and all data with unallocated memory.
  • the embodiments of the present disclosure are not limited to this.
  • the start address of the candidate storage space may be expressed as the offset value of the start address of the candidate storage space relative to the start address of the target memory
  • the end address of the target memory may be expressed as the offset value of the end address of the target memory relative to the start address of the target memory.
  • the second ratio between the start address of the candidate storage space and the total storage space size of the target memory may be determined, and this second ratio may be used as the second ratio of at least one candidate storage scheme corresponding to the candidate storage space, but the embodiments of the present disclosure are not limited to this.
  • determining the target weight of each candidate storage scheme includes: determining the first weight of the candidate storage scheme based on the first data release time corresponding to the above-mentioned data to be stored and the second data release time of the data stored in the storage space adjacent to the storage location corresponding to the above-mentioned candidate storage scheme; determining the second weight of the candidate storage scheme based on the life cycle of the data to be stored and the start address of the candidate storage space corresponding to the candidate storage scheme; and obtaining the target weight of the above-mentioned candidate storage scheme based on the weighted sum of the first weight and the second weight.
  • determining the target weight of each candidate storage scheme includes: determining the target weight of each of the multiple candidate storage schemes based on the first data release time and life cycle of the above-mentioned data to be stored and the storage space size corresponding to the candidate storage scheme.
  • the size of the storage space corresponding to the candidate storage solution may be the size of the candidate storage space corresponding to the candidate storage solution.
  • the target weight corresponding to the aforementioned candidate storage solution includes a weighted sum of the first index, the second index, and the third index.
  • the first indicator is determined by the interval between the first data release time and the second data release time of the data to be stored, and the second data release time is related to the storage location of the data to be stored in the candidate storage solution.
  • the second indicator is determined by the difference between the first ratio and the second ratio, where the first ratio is the ratio between the life cycle of the data to be stored and the maximum life cycle corresponding to the data to be stored
  • the second ratio is the ratio between the start address of the candidate storage space corresponding to the candidate storage scheme and the end address of the target memory
  • the third index is determined by the ratio of the storage space corresponding to the candidate storage scheme to the total storage space of the aforementioned target memory.
  • the first data release time, life cycle, and required storage space size of the data to be stored are comprehensively considered, so that the determined target storage solution can more effectively reduce memory fragmentation and reduce the occupied storage space.
  • the target weight corresponding to the above candidate storage scheme satisfies the following formula (1):
  • e represents the first data release time
  • e1 represents the second data release time
  • abs(e-e1) represents the absolute value of the difference between e and e1.
  • w3 = 1 - s_cand/mem_size
  • s_cand represents the size of the candidate storage space corresponding to the candidate storage scheme
  • mem_size represents the size of the total storage space of the target storage.
  • the method of calculating the target weight of the candidate storage solution is the combined result of the three allocation principles.
  • w1 corresponds to the first allocation principle, which is to allocate as close to end_pc as possible, so that the release time of adjacent storage spaces is similar, which is beneficial to merge into a large free storage space, thereby reducing memory fragmentation.
  • Each piece of data corresponds to an end_pc, which indicates the point in time when the storage space occupied by that data is released. Allocating as close to end_pc as possible means allocating the data to be stored adjacent to data whose end_pc is close to the end_pc corresponding to the data to be stored; that is, the data to be stored is allocated to a storage space adjacent to that data's storage space.
  • w2 corresponds to the second allocation principle, which is to separate data with a short life cycle (frequently allocated and released) from data with a long life cycle, and to place the frequently allocated and released data as close together as possible, which can also reduce memory fragmentation.
  • w3 corresponds to the third allocation principle, which is to allocate the smallest free storage space that can meet the demand to the data to be stored. In this implementation, combining multiple allocation principles to allocate addresses for the data to be stored can effectively reduce memory fragmentation.
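Formula (1) itself is not reproduced in this extract. As a hedged sketch consistent with the three allocation principles described here (w1 falling with abs(e-e1), w2 falling with the difference between the lifecycle ratio and the address ratio, and w3 = 1 - s_cand/mem_size), a combined target weight could look like the following; the coefficient names a1-a3 and the exact forms of w1 and w2 are assumptions, not the patent's formula:

```python
def target_weight(e, e1, lifecycle, max_lifecycle, start_addr, mem_end,
                  s_cand, mem_size, a1=1.0, a2=1.0, a3=1.0):
    """Illustrative weighted sum of the three allocation-principle terms."""
    w1 = 1.0 / (1.0 + abs(e - e1))              # similar release times preferred
    w2 = 1.0 - abs(lifecycle / max_lifecycle    # lifecycle ratio vs.
                   - start_addr / mem_end)      # address ratio
    w3 = 1.0 - s_cand / mem_size                # smaller candidate block preferred
    return a1 * w1 + a2 * w2 + a3 * w3
```

With equal coefficients, a candidate block of 10 units scores higher than one of 50 units for the same data, reflecting the best-fit principle behind w3.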
  • the data processing device may combine any two of the three allocation principles to calculate the target weight, or may use only the first principle or the second principle to calculate the target weight.
  • the target weight corresponding to the above candidate storage scheme satisfies the following formula (2):
  • the target weight corresponding to the above candidate storage scheme satisfies the following formula (3):
  • the target weight corresponding to the above candidate storage scheme satisfies the following formula (4):
  • the target weight corresponding to the above candidate storage scheme satisfies the following formula (5):
  • the target weight corresponding to the above candidate storage scheme satisfies the following formula (6):
  • FIG. 3 is a flowchart of another method for storing data provided by an embodiment of the application. As shown in Figure 3, the method may include the following steps.
  • the data processing device determines two or more candidate storage spaces that can store data to be stored from a plurality of discrete storage spaces not allocated by the target memory.
  • in the Nth round of target weight calculation, the first target weight for storing the to-be-stored data in the first candidate storage space is calculated based on at least one of the first data release time and the life cycle of the to-be-stored data.
  • the above-mentioned first candidate storage space is any one of the above-mentioned two or more candidate storage spaces
  • the calculation of the first target weight for storing the data to be stored in the first candidate storage space may be performed by using any one of formulas (1) to (6) to calculate the target weight.
  • the data processing apparatus calculates the target weight assuming that the data to be stored is stored in the first candidate storage space, and does not perform the operation of storing the data to be stored in the first candidate storage space.
  • the aforementioned N is an integer greater than zero.
  • the data processing device may calculate one target weight or two target weights corresponding to the data to be stored in each candidate storage space, and each round of target weight calculation may calculate a target weight.
  • updating the current maximum target weight may be to save the target weight calculated in the first round as the current maximum target weight.
  • updating the current maximum target weight can be: in the case that the target weight calculated in the Nth round is greater than the currently saved current maximum target weight, updating the current maximum target weight to the target weight calculated in the Nth round; in the case that the target weight calculated in the Nth round is not greater than the currently saved current maximum target weight, keeping the current maximum target weight unchanged.
  • determining whether to stop the next round of target weight calculation may be: in the case that the target weights of all candidate storage schemes have been calculated, determining to stop the next round of target weight calculation; in the case that there remains a candidate storage scheme whose target weight has not been calculated, determining to continue to the next round. If the next round of target weight calculation is not stopped, N is incremented by 1 and step 302 is executed; if the next round of target weight calculation is stopped, step 305 is executed.
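The round-by-round procedure above amounts to an argmax loop over candidate schemes. A minimal sketch follows; the scheme representation and function names are assumptions for illustration:

```python
def choose_target_scheme(schemes, weight_fn):
    """Compute one target weight per round and keep the current maximum."""
    best, best_w = None, None
    for scheme in schemes:                  # one round per candidate scheme
        w = weight_fn(scheme)               # the round's target weight
        if best_w is None or w > best_w:    # update the current maximum
            best, best_w = scheme, w
    return best, best_w
```

The scheme whose weight survives all rounds is the target storage scheme of the data to be stored.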
  • the storage space corresponding to the above-mentioned first address to the above-mentioned second address may be set as an unallocated storage space.
  • the second address is the end address of the candidate storage space
  • the third address is located to the right of the second address
  • the first address to the aforementioned third address are set as an unallocated discrete storage space.
  • the storage space whose starting address is the next address of the third address is the allocated storage space (used_item).
  • Step 308 can be replaced by: if the first address is the start address of the candidate storage space, and no data is stored from the fourth address of the target memory (the fourth address is located to the left of the first address) to the address preceding the first address, the fourth address to the second address are set as one unallocated discrete storage space. The storage space whose end address is the address preceding the fourth address is an allocated storage space (used_item).
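The release logic above can be sketched as merging a freed block with any free neighbours; the list-of-tuples representation is an illustrative assumption, not the patent's data structure:

```python
def release(items, idx):
    """items: address-ordered list of (start, end, state) tuples, state being
    'used' or 'free'.  Releasing items[idx] merges it with any adjacent free
    items into one unallocated discrete storage space."""
    start, end, _ = items[idx]
    items[idx] = (start, end, 'free')
    # merge with a free neighbour on the right
    if idx + 1 < len(items) and items[idx + 1][2] == 'free':
        _, end, _ = items.pop(idx + 1)
        items[idx] = (start, end, 'free')
    # merge with a free neighbour on the left
    if idx - 1 >= 0 and items[idx - 1][2] == 'free':
        start, _, _ = items.pop(idx - 1)
        items[idx - 1] = (start, end, 'free')
    return items
```

Releasing a used_item flanked by two free_items collapses all three into a single free_item, which is the fragmentation-reducing effect the release steps aim for.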
  • the method provided by the embodiment of the present application can effectively reduce memory fragmentation.
  • the method for storing data described in the foregoing embodiment can be applied to a scenario where a data processing device performs data processing tasks through an AI chip, that is, real-time management of address allocation and release of a shared cache; it can also be applied to a compilation scenario of an AI model.
  • the data processing device can execute the method for storing data provided by the embodiments of the application to simulate the allocation of the shared cache when the AI model performs processing operations, and then compile the AI model to obtain an instruction sequence that can indicate the memory allocation and release of the shared cache; the instruction sequence can be executed to perform data processing tasks.
  • the AI chip stores data in the shared cache and releases the data in the shared cache according to the instructions in the instruction sequence, which can improve the utilization rate of the shared cache.
  • FIG. 4 is a schematic structural diagram of a data processing device provided by an embodiment of the application. As shown in FIG. 4, the device includes a first determining unit 401, a second determining unit 402, and a third determining unit 403.
  • the first determining unit 401 is configured to determine at least two candidate storage spaces in the target memory based on the size of the storage space required for the data to be stored.
  • the second determining unit 402 is configured to determine, based on at least one of the first data release time and the life cycle of the to-be-stored data, to store the to-be-stored data in multiple candidate storage solutions of the at least two candidate storage spaces The target weight of each candidate storage solution, where each candidate storage space corresponds to at least one candidate storage solution.
  • the third determining unit 403 is configured to determine the target storage scheme of the data to be stored based on the target weight of each candidate storage scheme among the multiple candidate storage schemes.
  • the candidate storage solution corresponding to the candidate storage space includes at least one of a first candidate storage solution and a second candidate storage solution, wherein the starting storage address in the first candidate storage solution is Is the start address of the candidate storage space, and the end storage address in the second candidate storage solution is the end address of the candidate storage space.
  • the second determining unit 402 is further configured to, for each candidate storage scheme, determine the target weight of the candidate storage scheme based on the first data release time and the second data release time of the data to be stored, wherein the second data release time is the data release time of the data stored in the storage space adjacent to the storage location of the data to be stored in the candidate storage scheme.
  • the target weight corresponding to the candidate storage solution is negatively related to the interval between the first data release time and the second data release time of the data to be stored.
  • the second determining unit 402 is further configured to determine, for each candidate storage scheme, based on the life cycle of the data to be stored and the starting address of the candidate storage space corresponding to the candidate storage scheme. The target weight of the candidate storage solution.
  • the second determining unit 402 is further configured to determine the maximum life cycle corresponding to the data to be stored; determine the first ratio between the life cycle of the data to be stored and the maximum life cycle; Determine the second ratio between the start address of the candidate storage space corresponding to the candidate storage scheme and the end address of the target memory; determine the target weight of the candidate storage scheme based on the first ratio and the second ratio.
  • the target weight of the candidate storage solution is negatively related to the absolute value of the difference between the first ratio and the second ratio.
  • the second determining unit 402 is further configured to determine, for each candidate storage scheme, the first weight of the candidate storage scheme based on the first data release time and the second data release time corresponding to the above-mentioned data to be stored, where the second data release time is the data release time of the data stored in the storage space adjacent to the storage location of the data to be stored in the candidate storage scheme; determine the second weight of the candidate storage scheme based on the life cycle of the data to be stored and the start address of the candidate storage space corresponding to the scheme; and obtain the target weight of the candidate storage scheme based on the weighted sum of the first weight and the second weight.
  • the second determining unit 402 is further configured to, for each candidate storage scheme, based on the first data release time and life cycle of the data to be stored, and the storage space size corresponding to the candidate storage scheme, Determine the target weight of the candidate storage solution.
  • the second determining unit 402 is further configured to determine the first weight of the candidate storage scheme based on the first data release time corresponding to the data to be stored and the second data release time, where the second data release time is the data release time of the data stored in the storage space adjacent to the storage location of the data to be stored in the candidate storage scheme; determine the second weight of the candidate storage scheme based on the life cycle of the data to be stored and the start address of the candidate storage space corresponding to the scheme; determine the third weight of the candidate storage scheme based on the size of the candidate storage space corresponding to the scheme and the total storage space of the target memory; and obtain the target weight of the candidate storage scheme based on the weighted sum of the first weight, the second weight, and the third weight.
  • the second determining unit 402 is further configured to determine, for each candidate storage scheme, the target weight of the candidate storage scheme based on the first data release time of the data to be stored and the storage space size corresponding to the candidate storage scheme.
  • the second determining unit 402 is further configured to determine, for each candidate storage scheme, the target weight of the candidate storage scheme based on the life cycle of the data to be stored and the storage space size corresponding to the candidate storage scheme.
  • the device further includes a setting unit 404, configured to store the data to be stored at the first address to the second address of the candidate storage space corresponding to the target storage scheme, and to set the storage space corresponding to the first address to the second address as allocated storage space; wherein one of the first address and the second address is the start address of the candidate storage space corresponding to the target storage scheme, or one of the first address and the second address is the end address of the candidate storage space corresponding to the target storage scheme.
  • the device further includes a releasing unit 405, configured to release the storage space corresponding to the first address to the second address after the first data release time corresponding to the data to be stored arrives; the setting unit 404 is further configured to set the storage space corresponding to the first address to the second address as unallocated storage space.
  • the third determining unit 403 is further configured to determine the candidate storage scheme corresponding to the largest target weight among the target weights of the multiple candidate storage schemes as the target storage scheme of the data to be stored; or, among the target weights of the multiple candidate storage schemes, to determine the candidate storage scheme corresponding to any target weight exceeding a preset weight threshold as the target storage scheme.
  • the aforementioned target memory is a shared cache in the artificial intelligence AI chip.
  • the first determining unit 401 is further configured to determine the at least two candidate storage spaces that can store the to-be-stored data from the multiple discrete storage spaces that are not allocated by the target memory, wherein The size of the candidate storage space is greater than or equal to the storage space occupied by the data to be stored.
  • the setting unit 404 is further configured to: if the second address is the end address of the candidate storage space, and no data is stored from the address following the second address to the third address, set the first address to the third address as one unallocated discrete storage space, where the storage space whose start address is the address following the third address is an allocated storage space.
  • the setting unit 404 is further configured to: if the first address is the start address of the candidate storage space, and no data is stored from the fourth address of the target memory to the address preceding the first address, set the fourth address to the second address as one unallocated discrete storage space, where the storage space whose end address is the address preceding the fourth address is an allocated storage space.
  • Fig. 5 is a schematic structural diagram of a data processing device provided by an embodiment of the present application.
  • the data processing device includes an AI chip 510 and a memory 520.
  • the AI chip 510 can obtain data and instructions from the memory 520 and output the final processing result to the memory 520.
  • the calculation unit 501 in the AI chip 510 performs processing tasks; in the process of processing data, the calculation unit 501 stores data in the shared cache 502 (that is, the target memory) and obtains data from the shared cache 502.
  • the address allocation and release of the shared cache 502 can use the method for storing data in the foregoing embodiment.
  • the memory 520 may be located inside the AI chip 510.
  • a certain memory management software run by the data processing device executes the method for storing data in the foregoing embodiments to manage the address allocation and release of the shared cache.
  • the instructions read from the memory are executed to implement the data processing task, and the instructions read from the memory in the process of implementing the data processing task indicate the address allocation and release of the shared cache. In other words, the AI chip executes the instructions read from the memory to achieve the same memory allocation and release process as in the foregoing embodiments.
  • FIG. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device 600 may have relatively large differences due to different configurations or performance, and may include one or more central processing units (CPU) 622 (for example, one or more processors) and memory 632, one or more storage media 630 (for example, one or more mass storage devices) for storing application programs 642 or data 644, and one or more AI chips 624.
  • the memory 632 and the storage medium 630 may be short-term storage or persistent storage.
  • the program stored in the storage medium 630 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the electronic device.
  • the central processing unit 622 may be configured to communicate with the storage medium 630, and execute a series of instruction operations in the storage medium 630 on the electronic device 600.
  • the AI chip 624 can perform various data processing tasks assigned by the CPU 622.
  • the electronic device 600 may be the data processing apparatus provided by this application.
  • the electronic device 600 may also include one or more power supplies 626, one or more wired or wireless network interfaces 650, one or more input and output interfaces 658, and/or one or more operating systems 641, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
  • the steps performed by the data processing apparatus in the foregoing embodiment may be based on the electronic device structure shown in FIG. 6.
  • the central processing unit 622 can implement the functions of each unit in FIG. 4.
  • the embodiment of the present application provides a computer-readable storage medium.
  • the computer-readable storage medium stores a computer program.
  • the computer program includes program instructions that, when executed by a processor, implement: determining at least two candidate storage spaces in the target memory based on the size of the storage space required by the data to be stored; determining, based on at least one of the first data release time and life cycle of the data to be stored, the target weight of each of multiple candidate storage schemes for storing the data to be stored in the at least two candidate storage spaces, where each candidate storage space corresponds to at least one candidate storage scheme; and determining the target storage scheme for the data to be stored based on the target weight of each of the multiple candidate storage schemes.
  • the computer-readable storage medium may be a non-volatile storage medium.
  • the embodiments of the present application provide a computer program product containing instructions, which when run on a computer, cause the computer to execute the method for storing data provided in the foregoing embodiments.


Abstract

The embodiments of the present application disclose a method for storing data and related products. The method includes: determining at least two candidate storage spaces in a target memory based on the size of the storage space required by data to be stored; determining, based on at least one of a first data release time and a life cycle of the data to be stored, a target weight of each of multiple candidate storage schemes for storing the data to be stored in the at least two candidate storage spaces, where each candidate storage space corresponds to at least one candidate storage scheme; and determining a target storage scheme for the data to be stored based on the target weight corresponding to each of the multiple candidate storage schemes.

Description

[Title of the invention established by the ISA under Rule 37.2] A method for storing data and a data processing apparatus. Technical Field
The present application relates to the field of computers, and in particular to a method for storing data and related products.
Background
An artificial intelligence (AI) chip generally consists of multiple computing units with different functions, a high-speed shared cache with limited space, and Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM, DDR for short).
Summary
The embodiments of the present application disclose a method for storing data and related products.
In a first aspect, an embodiment of the present application provides a method for storing data, the method including: determining at least two candidate storage spaces in a target memory based on the size of the storage space required by data to be stored; determining, based on at least one of a first data release time and a life cycle of the data to be stored, a target weight of each of multiple candidate storage schemes for storing the data to be stored in the at least two candidate storage spaces, where each candidate storage space corresponds to at least one candidate storage scheme; and determining a target storage scheme for the data to be stored based on the target weight of each of the multiple candidate storage schemes.
In a second aspect, an embodiment of the present application provides a data processing apparatus, the apparatus including: a first determining unit configured to determine at least two candidate storage spaces in a target memory based on the size of the storage space required by data to be stored; a second determining unit configured to determine, based on at least one of a first data release time and a life cycle of the data to be stored, a target weight of each of multiple candidate storage schemes for storing the data to be stored in the at least two candidate storage spaces, where each candidate storage space corresponds to at least one candidate storage scheme; and a third determining unit configured to determine a target storage scheme for the data to be stored based on the target weight of each of the multiple candidate storage schemes.
In a third aspect, an embodiment of the present application provides an electronic device, including: a memory storing processor-executable instructions, a target memory, and a processor, where the processor, when executing the instructions, implements the method of the first aspect or any optional implementation thereof.
In a fourth aspect, an embodiment of the present application provides a chip, including a processor, a data interface, and the target memory of the first aspect, where the processor is configured to execute the method of the first aspect or any possible implementation thereof.
In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program, the computer program including program instructions that, when executed by a processor of an electronic device, cause the processor to execute the method of the first aspect or any optional implementation thereof.
In a sixth aspect, an embodiment of the present application provides a computer program product including program instructions that, when executed by a processor, cause the processor to execute the method of the first aspect or any optional implementation thereof.
In the embodiments of the present application, the target weight of each of multiple candidate storage schemes for storing the data to be stored in the at least two candidate storage spaces is determined based on at least one of the first data release time and the life cycle of the data to be stored, so that a storage scheme that can effectively reduce memory fragmentation can be selected from the multiple candidate storage schemes.
Brief Description of the Drawings
FIG. 1 is a flowchart of a method for storing data provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of a process of calculating target weights provided by an embodiment of the present application;
FIG. 3 is a flowchart of another method for storing data provided by an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a data processing apparatus provided by an embodiment of the present application;
FIG. 5 is a schematic structural diagram of another data processing apparatus provided by an embodiment of the present application;
FIG. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
Detailed Description
The terms "first", "second", and "third" in the specification, claims, and drawings of the present application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. In addition, the terms "include" and "have" and any variations thereof are intended to cover non-exclusive inclusion, for example, inclusion of a series of steps or units. A method, system, product, or device is not necessarily limited to the steps or units explicitly listed, but may include other steps or units that are not explicitly listed or that are inherent to the process, method, product, or device.
The high-speed shared cache of an AI chip is generally a few MB in size, commonly 8 MB or 12 MB. AI chip instructions differ from central processing unit (CPU) instructions: a CPU's registers have a fixed size (for example, a 32-bit CPU has fixed 32-bit registers), whereas an AI chip has no registers, and the size of a neural network tensor is not fixed, so a fixed storage space cannot be allocated to a tensor. Given that the shared cache space is limited and the sizes of the tensors to be allocated vary, a poorly performing allocation strategy easily produces memory fragments, which appear as small, discontiguous regions at different locations; this free memory becomes unusable, and the high-speed shared cache cannot be fully utilized. In practice, such free memory fragments exist in two forms: internal fragmentation and external fragmentation.
Generation of internal fragmentation: because all memory allocations must start at an address divisible by 4, 8, or 16 (depending on the processor architecture), or because of the limitations of the paging mechanism of the memory management unit (MMU), the memory allocation algorithm can only assign memory blocks of predetermined sizes to data. Suppose storing a piece of data requires a 43-byte memory block; because no block of exactly that size is available, a slightly larger block of 44 or 48 bytes may be obtained. The surplus space produced by rounding the required size up is called internal fragmentation.
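The rounding just described can be sketched as follows (an illustrative helper; the 4-byte default is one of the alignments mentioned above):

```python
def aligned_size(size, alignment=4):
    """Round a request up to the allocator's granularity; the surplus
    (aligned_size - size) is the internal fragmentation."""
    return (size + alignment - 1) // alignment * alignment
```

For the 43-byte example, a 4-byte granularity yields a 44-byte block (1 byte of internal fragmentation) and a 16-byte granularity yields 48 bytes.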
Generation of external fragmentation: frequent allocation and release of physical pages leaves many small, contiguous free page blocks sandwiched between allocated pages, producing external fragmentation. Suppose there is a contiguous free memory space of 100 units (for example, addresses) in total, covering the range 0~99. If a block of memory, say 10 units, is requested, the allocated block can occupy the range 0~9. If another block, say 5 units, is then requested, the second allocated block can occupy 10~14. Now release the first block and request a block larger than 10 units, say 20 units. Because the just-released block cannot satisfy the new request, the 20-unit block can only be allocated starting from 15. The state of the whole memory space is now: 0~9 free, 10~14 occupied, 15~34 occupied, and 35~99 free, where 0~9 is a memory fragment. If 10~14 remains occupied and all subsequently requested spaces are larger than 10 units, the 0~9 range cannot be used and becomes external fragmentation.
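The external-fragmentation walkthrough above can be reproduced with a minimal first-fit allocator (an illustrative sketch only; the allocator disclosed in this application is weight-based, not first-fit):

```python
def first_fit(free_list, size):
    """Allocate `size` units from the first free [start, end] range that fits."""
    for r in free_list:
        if r[1] - r[0] + 1 >= size:
            addr = r[0]
            r[0] += size
            if r[0] > r[1]:
                free_list.remove(r)
            return addr
    return None

free = [[0, 99]]
a = first_fit(free, 10)   # occupies 0~9
b = first_fit(free, 5)    # occupies 10~14
free.insert(0, [0, 9])    # release the first block
c = first_fit(free, 20)   # 0~9 cannot hold 20 units, so allocation starts at 15
```

After these steps the free list is [[0, 9], [35, 99]]; the 0~9 range is the external fragment described above.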
To make full use of the high-speed shared cache, the embodiments of the present application provide a method for storing data that can reduce fragmentation.
The method for storing data provided by the embodiments of the present application is mainly applied to shared-cache allocation scenarios in AI chips. It should be understood that data processing tasks executed by an AI chip, such as text recognition, image recognition, image super-resolution processing, speech recognition, and text translation, all occupy the shared cache. That is, the method is mainly applied to scenarios in which an AI chip executes data processing tasks, but the method provided by the embodiments of the present disclosure can also be applied to other memory or cache allocation scenarios, which is not limited by the embodiments of the present disclosure.
The method for storing data provided by the embodiments of the present application can also be applied to the compilation scenario of an AI model, that is, the scenario in which an AI model is compiled by compilation software into an instruction sequence executable by an AI chip. In the compilation scenario, the data processing apparatus can execute the method to simulate the allocation of the shared cache when the AI model performs processing operations, and then compile the AI model to obtain an instruction sequence that indicates the memory allocation and release of the shared cache. When the AI chip executes the instruction sequence obtained by compiling the AI model, the memory allocation and release flow of the shared cache is the same as that obtained by executing the method provided by the embodiments of the present application. In this scenario, the AI chip does not need to execute the method in real time when performing data processing tasks; it only needs to execute the instruction sequence, which takes less time.
In the above scenarios, the AI chip in the data processing apparatus can reduce memory fragmentation and improve the success rate of cache allocation when executing data processing tasks.
The meanings of some terms appearing in the embodiments of the present disclosure are introduced below.
The shared cache of the AI chip is dynamically allocated while the program of the data processing apparatus runs. The shared cache can be divided into multiple storage spaces, such as cache blocks; different cache blocks may have the same or different sizes, determined by the requirements of the cached data. In the embodiments of the present disclosure, the state of a cache block can be marked; for example, an allocated block can be marked as used_item and an unallocated block as free_item. In the initial state the whole shared cache is one free_item; after a certain number of memory allocations and releases, there may be multiple used_items, with one or zero free_item between them. An allocated block is an occupied storage space, and an unallocated block is an unoccupied storage space.
In some embodiments, a compiler generates an instruction sequence for the AI chip, and the index of each instruction in the sequence is called its instruction number. The compiler is a piece of software or program code run by the data processing apparatus. Each tensor (which can be understood as data) may be used by multiple instructions (as an instruction's output or input). The smallest of these instruction numbers is called the tensor's start number (start program counter, start_pc for short), the largest is called the tensor's end number (end program counter, end_pc for short), and the difference between end_pc and start_pc is called the tensor's life cycle. The data release time of a piece of data is the time at which the addresses occupied by the data are released, that is, the time at which the data is released.
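The start_pc, end_pc, and life-cycle definitions can be sketched directly (the function name is illustrative):

```python
def tensor_lifecycle(instruction_numbers):
    """Given the numbers of all instructions that use a tensor, return
    (start_pc, end_pc, life_cycle) per the definitions above."""
    start_pc = min(instruction_numbers)
    end_pc = max(instruction_numbers)
    return start_pc, end_pc, end_pc - start_pc
```

A tensor consumed by instructions 3, 7, and 12 thus has start_pc 3, end_pc 12, and a life cycle of 9.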
FIG. 1 is a flowchart of a method for storing data provided by an embodiment of the present application.
101. The data processing apparatus determines at least two candidate storage spaces in the target memory based on the size of the storage space required by the data to be stored.
Optionally, the data to be stored may be input picture data, or an intermediate and/or final result produced by processing an input picture with a neural network; for example, the data to be stored may be at least part of a feature map, or may be model data such as model weights, but the embodiments of the present disclosure are not limited thereto.
The size of each candidate storage space (corresponding to a free_item) is greater than or equal to the size of the storage space required to store the data to be stored. The data processing apparatus may be a server, desktop computer, laptop, mobile phone, tablet, or another device capable of performing data processing operations. Optionally, the target memory is a shared cache in an artificial intelligence (AI) chip.
The data processing apparatus may determine, from multiple unallocated discrete storage spaces (i.e., free_items) of the target memory, two or more candidate storage spaces that can store the data to be stored. In practice, the processor in the data processing apparatus may linearly scan all storage spaces (i.e., items) of the shared cache and take each free_item greater than or equal to the storage space required by the data to be stored (such as a tensor) as a candidate storage space, obtaining the at least two candidate storage spaces.
102. Based on at least one of the first data release time and the life cycle of the data to be stored, determine the target weight of each of multiple candidate storage schemes for storing the data to be stored in the at least two candidate storage spaces.
Each candidate storage space corresponds to at least one candidate storage scheme. The first data release time of the data to be stored may be the time at which the data to be stored is released, that is, the time at which the storage space occupied by the data to be stored is released. The life cycle of the data to be stored may be the interval between the time at which the data to be stored is released and the time at which it is stored. Exemplarily, the target weight of each candidate storage scheme is negatively correlated with the interval between the first data release time of the data to be stored and a second data release time, where the second data release time is the data release time of the data stored in the storage space adjacent to the storage location of the data to be stored in the candidate storage scheme. The implementation of step 102 is detailed later.
103. Determine the target storage scheme for the data to be stored based on the target weight of each of the multiple candidate storage schemes.
Determining the target storage scheme based on the target weights of the multiple candidate storage schemes may mean that the data processing apparatus determines, among the target weights of the multiple candidate storage schemes, the candidate storage scheme corresponding to the largest target weight as the target storage scheme of the data to be stored; or that the data processing apparatus determines, among the target weights, the candidate storage scheme corresponding to any target weight exceeding a preset weight threshold as the target storage scheme, where the weight threshold may be, for example, 0.6, 0.75, or 0.8.
Optionally, after executing step 103, the data processing apparatus may further perform the following operations: store the data to be stored at the first address to the second address of the candidate storage space corresponding to the target storage scheme; and set the storage space corresponding to the first address to the second address as allocated storage space (i.e., used_item). Optionally, one of the first address and the second address is the start address of the candidate storage space corresponding to the target storage scheme, or one of the first address and the second address is the end address of that candidate storage space. In some embodiments, after the first data release time corresponding to the data to be stored arrives, the data processing apparatus may further: release the storage space corresponding to the first address to the second address; and set the storage space corresponding to the first address to the second address as unallocated storage space (i.e., free_item). In some embodiments, memory management software run by the data processing apparatus executes the method flow of FIG. 1.
In some embodiments, if the candidate storage space corresponding to the target storage scheme is larger than the storage space required by the data to be stored, then after the data to be stored is stored at the first address to the second address, the part of that candidate storage space not storing the data to be stored remains set as unallocated storage space (i.e., free_item). For example, if the first address is the start address of the candidate storage space, the storage space from the address following the second address to the end address of the candidate storage space is set as unallocated storage space. As another example, if the second address is the end address of the candidate storage space, the storage space from the start address of the candidate storage space to the address preceding the first address is set as unallocated storage space.
In the embodiments of the present application, the target weight of each of the multiple candidate storage schemes for storing the data to be stored in the at least two candidate storage spaces is determined based on at least one of the first data release time and the life cycle of the data to be stored; according to the multiple target weights, a storage scheme that can effectively reduce memory fragmentation can be selected from the multiple candidate storage schemes.
在一些实施例中,每个候选存储空间对应的候选存储方案包括第一候选存储方案和第二候选存储方案中的至少一种,其中,上述第一候选存储方案中的起始存储地址为上述候选存储空间的起始地址,上述第二候选存储方案中的结束存储地址为上述候选存储空间的结束地址。也就是说,每个候选存储空间对应1种或2种分配方法,即靠左分配(对应于第一候选存储方案)和靠右分配(对应于第二候选存储方案),可以分别计算这两种分配方法的目标权重。靠左分配是指将待存储数据存储至某个候选存储空间的起始地址至某个地址,即为该待存储数据分配该候选存储空间的起始地址至后面连续多个地址。靠右分配是指将待存储数据存储至某个候选存储空间的某个地址至结束地址,即为该待存储数据分配该存储空间的结束地址以及该结束地址前面连续的多个地址。当某个候选存储空间的大小大于存储待存储数据所需的存储空间的大小时,该候选存储空间有两种分配方法(即靠左分配和靠右分配不同);当某个候选存储空间的大小等于存储待存储数据所需的存储空间的大小时,该候选存储空间只有1种分配方法(即靠左分配和靠右分配相同)。举例来说,有10个候选存储空间的大小大于存储待存储数据所需的存储空间的大小,则数据处理装置执行20轮目标权重计算,即计算每个候选存储空间采用靠左分配方式对应的目标权重和采用靠右分配方式对应的目标权重。
在该实现方式中,采用第一候选存储方案或第二候选存储方案存储待存储数据之后,在该待存储数据占用的存储空间被释放之后能够与其相邻的存储空间合并为一个更大的存储空间,以减少内存碎片。
图2为本申请实施例提供的一种计算候选存储方案的目标权重的过程示意图。如图2所示,如211-216所示的黑色的矩形区域表示目标存储器中已分配的存储空间(即used_item),如201-205所示的白色的矩形区域表示该目标存储器中未分配的存储空间(即free_item),假设存储空间201、存储空间203、存储空间205均可存储待存储数据,存储空间201和存储空间203的大小大于存储待存储数据所需的存储空间的大小,存储空间205的大小等于存储待存储数据所需的存储空间的大小。如图2所示,在权重计算中,图中黑色矩形区域表示存储空间被占用的部分,白色矩形区域表示存储空间未被占用的部分,矩形区域的上沿表示对应的存储空间的起始地址,矩形区域的下沿表示对应的存储空间的结束地址。在第1轮目标权重计算中,计算将待存储数据存储至存储空间201的起始地址至某个地址(靠左分配)时的目标权重。在第2轮目标权重计算中,计算将待存储数据存储至存储空间201的某个地址至结束地址(即靠右分配)时的目标权重。在第3轮目标权重计算中,计算将待存储数据存储至存储空间203的起始地址至某个地址(靠左分配)时的目标权重。在第4轮目标权重计算中,计算将待存储数据存储至存储空间203的某个地址至结束地址(靠右分配)时的目标权重。在第5轮目标权重计算中,计算将待存储数据存储至存储空间205的起始地址至结束地址(即靠左分配和靠右分配相同)时的目标权重;以此类推。
在一些实施例中,数据处理装置在第N轮目标权重计算中,计算将上述待存储数据存储至某个候选存储空间的目标权重,可以将该目标权重作为第一目标权重,之后还可执行如下操作:在当前最大目标权重小于上述第一目标权重的情况下,将上述当前最大目标权重更新为上述第一目标权重。可选的,数据处理装置执行第1轮的目标权重计算得到一个目标权重之后,将该目标权重作为当前最大目标权重并保存;将第i轮目标权重计算得到的目标权重与保存的当前最大目标权重进行比较,如果新计算得到的目标权重大于当前最大目标权重,则将当前最大目标权重更新为新计算得到的目标权重,否则,保持当前最大目标权重不变,其中i为大于1的正整数。
前述实施例未详述确定将待存储数据存储到至少两个候选存储空间的多种候选存储方案中每种候选存储方案的目标权重的实现方式,下面以计算参考候选存储方案的目标权重为例介绍一些计算目标权重的可选实现方式。上述参考候选存储方案为上述至少两个候选存储空间对应的多种候选存储方案中的任一种。
在一个可选的实现方式中,基于待存储数据的第一数据释放时间和第二数据释放时间之间的时间间隔,可以确定候选存储方案的目标权重。参考候选存储方案对应的目标权重与上述待存储数据的第一数据释放时间和第二数据释放时间之间的时间间隔负相关,其中,上述第二数据释放时间为与上述待存储数据在上述参考候选存储方案中的存储位置相邻的存储空间所存储数据的数据释放时间。示例性的,参考候选存储方案对应的目标权重为上述待存储数据的第一数据释放时间和第二数据释放时间之间的间隔的倒数。举例来说,第一数据释放时间为t1,第二数据释放时间为t2,参考候选存储方案对应的目标权重为
weight=1/abs(t1-t2)
以图2为例,对于存储空间201,其相邻的存储空间为211或212。在第1轮权重计算时,由于为靠左分配,存储空间201的相邻的存储空间为211,在第2轮权重计算时,由于为靠右分配,存储空间201的相邻的存储空间为212。对于存储空间205,其相邻的存储空间可以为215,也可以为216。
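基于第一数据释放时间与第二数据释放时间之间的间隔计算目标权重的方式可示意如下(函数名为本文假设;两时间相等时二者同时释放,此处返回无穷大以示权重最大,该上限取值为本文假设):

```python
def w1_weight(t1, t2):
    """目标权重与第一、第二数据释放时间之间的间隔负相关, 此处取间隔的倒数。

    t1: 待存储数据的第一数据释放时间;
    t2: 相邻存储空间所存储数据的第二数据释放时间。"""
    gap = abs(t1 - t2)
    return 1.0 / gap if gap > 0 else float('inf')
```

间隔越小,相邻存储空间释放的时间越接近,权重越大,越有利于释放后合并成大的空闲存储空间。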
在一个可选的实现方式中,上述基于上述待存储数据对应的第一数据释放时间和生命周期中的至少一项,确定将上述待存储数据存储至上述至少两个候选存储空间的多种候选存储方案中每种候选存储方案的目标权重,包括:基于上述待存储数据的生命周期和上述候选存储方案对应的候选存储空间的起始地址,确定上述候选存储方案的目标权重。可选的,上述目标存储方案的确定使得上述目标存储器中存储的数据的生命周期随着存储地址递增或递减。可以理解,数据处理装置执行本申请实施例提供的用于存储数据的方法可以使得上述目标存储器中存储的数据的生命周期随着存储地址递增或递减。也就是说,尽量将生命周期短的待存储数据存储在存储空间的一侧(如靠左侧存储),将生命周期长的待存储数据存储在存储空间的另一侧(如靠右侧存储)。在一些实施例中,上述基于上述待存储数据的第一数据释放时间和生命周期中的至少一项,确定将上述待存储数据存储至上述至少两个候选存储空间的多种候选存储方案中每种候选存储方案的目标权重,包括:确定上述待存储数据对应的最大生命周期;确定上述待存储数据的生命周期与上述最大生命周期之间的第一比值;确定上述候选存储方案对应的候选存储空间的起始地址与上述目标存储器的结束地址之间的第二比值;基于第一比值和第二比值,确定上述候选存储方案的目标权重。示例性的,上述候选存储方案的目标权重与上述第一比值和上述第二比值之差的绝对值负相关。待存储数据对应的最大生命周期可以是指令序列中各指令分别对应的数据的生命周期中最大的生命周期,即与待存储数据相关的数据占用目标存储器的最大时长。示例性的,待存储数据对应的最大生命周期为本次图像处理过程中产生的所有需要存储的数据的生命周期的最大值,包括已分配内存和尚未分配内存的所有数据的生命周期的最大值,但本公开实施例不限于此。
在一些实施例中,候选存储空间的起始地址可以表示为候选存储空间的起始地址相对于目标存储器的起始地址的偏移值,目标存储器的结束地址可以表示为目标存储器的结束地址相对于目标存储器的起始地址的偏移值。
在一个可选的实现方式中,可以确定候选存储空间的起始地址与目标存储器的总存储空间大小之间的第二比值,并将该第二比值作为该候选存储空间对应的至少一种候选存储方案的第二比值,但本公开实施例不限于此。
在一个可选的实现方式中,上述基于上述待存储数据对应的第一数据释放时间和生命周期中的至少一项,确定将上述待存储数据存储至上述至少两个候选存储空间的多种候选存储方案中每种候选存储方案的目标权重,包括:基于上述待存储数据对应的第一数据释放时间和与上述候选存储方案对应的存储位置相邻的存储空间所存储数据的第二数据释放时间,确定上述候选存储方案的第一权重;基于上述待存储数据的生命周期和上述候选存储方案对应的候选存储空间的起始地址,确定上述候选存储方案的第二权重;基于上述第一权重和上述第二权重的加权和,得到上述候选存储方案的目标权重。
在该实现方式中,综合考虑待存储数据的第一数据释放时间和生命周期,能够更有效的减少内存碎片。
在一个可选的实现方式中,上述基于上述待存储数据的第一数据释放时间和生命周期中的至少一项,确定将上述待存储数据存储至上述至少两个候选存储空间的多种候选存储方案中每种候选存储方案的目标权重,包括:基于上述待存储数据的第一数据释放时间、生命周期和多种候选存储方案中每种候选存储方案对应的存储空间大小,确定所述每种候选存储方案的目标权重。其中,候选存储方案对应的存储空间大小可以是该候选存储方案对应的候选存储空间的大小。
在一些实施例中,上述候选存储方案对应的目标权重包括第一指标、第二指标以及第三指标的加权和。其中,上述第一指标由上述待存储数据的第一数据释放时间和第二数据释放时间之间的间隔确定,上述第二数据释放时间为与上述待存储数据在上述候选存储方案中的存储位置相邻的存储空间所存储数据的数据释放时间;上述第二指标由第一比值和第二比值之间的差值确定,上述第一比值为上述待存储数据的生命周期与上述待存储数据对应的最大生命周期之间的比值,上述第二比值为上述候选存储方案对应的候选存储空间的起始地址与上述目标存储器的结束地址之间的比值;上述第三指标由上述候选存储方案对应的存储空间与上述目标存储器的总存储空间的比值确定。
在该实现方式中,综合考虑待存储数据的第一数据释放时间、生命周期和所需的存储空间大小,以便于确定的目标存储方案能更有效的减少内存碎片,并减少占用的存储空间。
可选的,上述候选存储方案对应的目标权重满足如下公式(1):
weight=α*w1+β*w2+γ*w3       (1);
其中,α、β、γ均为不小于0的目标权重系数,且α+β+γ=1,weight表示上述候选存储方案对应的目标权重,w1表示第一指标,w2表示第二指标,w3表示第三指标。可选的,cost1=abs(e-e1),w1=1/cost1,e表示上述第一数据释放时间,e1表示上述第二数据释放时间,abs(e-e1)表示e和e1的差值的绝对值。可选的,cost2=abs((c/c_max)-(start/mem_size)),w2=1-cost2,c表示上述待存储数据的生命周期,c_max表示上述待存储数据对应的最大生命周期,start表示上述候选存储方案对应的候选存储空间的起始地址,mem_size表示目标存储器的总存储空间的大小,可以表示为目标存储器的结束地址。可选的,w3=1-s_cand/mem_size,s_cand表示候选存储方案对应的候选存储空间的大小,mem_size表示上述目标存储器的总存储空间的大小。
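式(1)的计算过程可示意如下(函数名为本文假设;示例中的系数取值也仅为示意,并非测试得到的最优组合):

```python
def target_weight(e, e1, c, c_max, start, s_cand, mem_size,
                  alpha=0.4, beta=0.3, gamma=0.3):
    """按式(1)计算候选存储方案的目标权重 weight = α*w1 + β*w2 + γ*w3。

    e, e1: 第一、第二数据释放时间; c, c_max: 生命周期及对应的最大生命周期;
    start: 候选存储空间的起始地址; s_cand: 候选存储空间的大小;
    mem_size: 目标存储器的总存储空间大小。"""
    w1 = 1.0 / abs(e - e1)                        # cost1 = abs(e-e1), w1 = 1/cost1
    w2 = 1.0 - abs(c / c_max - start / mem_size)  # cost2 = abs((c/c_max)-(start/mem_size))
    w3 = 1.0 - s_cand / mem_size                  # w3 = 1 - s_cand/mem_size
    return alpha * w1 + beta * w2 + gamma * w3
```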
在该实现方式中,目标权重系数α、β和γ是通过测试得到的结果。可以按一定的步进改变α、β和γ的值,并保证α+β+γ=1,这样可以得到多组不同的参数组合方式,运行一组测试集合,并保存每组参数组合方式在该测试集合下的结果。从而最终选择一组性能优越的参数组合方式。
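按一定步进枚举满足约束的系数组合的过程可示意如下(函数名为本文假设;在测试集合上逐组评估并选出性能最优组合的过程此处省略):

```python
def coefficient_grid(step=0.1):
    """按步进step枚举满足 alpha+beta+gamma=1 且三者均不小于0 的全部系数组合,
    供逐组运行测试集合并比较结果。"""
    n = round(1 / step)
    combos = []
    for i in range(n + 1):          # alpha = i*step
        for j in range(n - i + 1):  # beta = j*step, gamma由约束确定
            a, b = i * step, j * step
            combos.append((round(a, 10), round(b, 10), round(1 - a - b, 10)))
    return combos
```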
在该实现方式中,计算候选存储方案的目标权重的方式为三种分配原则的综合结果。w1对应第一种分配原则,该原则是尽量分配在end_pc相近的位置,使得相邻存储空间释放时间相近,有利于合并成大的空闲存储空间,从而减少内存碎片。每个数据对应一个end_pc,每个数据对应的end_pc表示该数据占用的存储空间被释放的时间点。尽量分配在end_pc相近的位置是指尽量将待存储数据分配在对应的end_pc与该待存储数据对应的end_pc较接近的数据相邻的位置。举例来说,目标存储器中某个存储空间存储的数据对应的end_pc与该待存储数据对应的end_pc较接近,则将该待存储数据分配在与该存储空间相邻的存储空间。w2对应第二种分配原则,该原则是将生命周期短的数据(分配释放频繁的)与生命周期长的数据分段分配,将分配和释放频繁的数据所设置的位置尽量接近,也可减少内存碎片。w3对应第三种分配原则,该原则是把既能满足需求,又是最小的空闲存储空间分配给待存储数据。在该实现方式中,结合多种分配原则来为待分配数据分配地址,可以有效减少内存碎片。
应理解,数据处理装置可以结合这三种分配原则中任意两种来计算目标权重,也可以仅采用根据第一种原则或者第二种原则来计算目标权重。举例来说,上述候选存储方案对应的目标权重满足如下公式(2):
weight=α*w1+β*w2       (2);
其中,式(2)中的w1、w2分别与式(1)中w1、w2相同,α、β均为大于0的权重系数,且α+β=1。
又举例来说,上述候选存储方案对应的目标权重满足如下公式(3):
weight=α*w1+γ*w3       (3);
其中,式(3)中的w1、w3分别与式(1)中w1、w3相同,α、γ均为大于0的权重系数,且α+γ=1。
又举例来说,上述候选存储方案对应的目标权重满足如下公式(4):
weight=β*w2+γ*w3       (4);
其中,式(4)中的w2、w3分别与式(1)中w2、w3相同,β、γ均为大于0的目标权重系数,且β+γ=1。
又举例来说,上述候选存储方案对应的目标权重满足如下公式(5):
weight=w2=1-cost2       (5);
又举例来说,上述候选存储方案对应的目标权重满足如下公式(6):
weight=w1=1/cost1       (6);
在该实现方式中,结合多种分配原则来为待分配数据分配地址,可以有效减少内存碎片。
图3为本申请实施例提供的另一种用于存储数据的方法流程图。如图3所示,该方法可包括以下步骤。
301、数据处理装置从目标存储器未分配的多个离散存储空间中,确定可存储待存储数据的两个或两个以上候选存储空间。
302、在第N轮目标权重计算中,基于待存储数据的第一数据释放时间和生命周期中的至少一项,计算将待存储数据存储至第一候选存储空间的第一目标权重。
可选的,上述第一候选存储空间为上述两个或两个以上候选存储空间中任一候选存储空间,计算将待存储数据存储至第一候选存储空间的第一目标权重可以是采用式(1)至式(6)中的任一个来计算目标权重。可以理解,数据处理装置是计算假定将待存储数据存储至第一候选存储空间时的目标权重,并不执行将待存储数据存储至第一候选存储空间的操作。上述N为大于0的整数。在实际应用中,数据处理装置可以计算每个候选存储空间存储待存储数据对应的一个目标权重或者两个目标权重,每轮目标权重计算可计算得到一个目标权重。
303、更新当前最大目标权重。
在一些实施例中,当N=1时,更新当前最大目标权重可以是将第1轮计算得到的目标权重保存为当前最大目标权重。当N>1时,更新当前最大目标权重可以是在第N轮计算得到的目标权重大于当前保存的当前最大目标权重的情况下,将当前最大目标权重更新为在第N轮计算得到的目标权重;在第N轮计算得到的目标权重不大于当前保存的当前最大目标权重的情况下,保持当前最大目标权重不变。
304、判断是否停止下一轮目标权重的计算。
在一些实施例中,判断是否停止下一轮目标权重的计算可以是在当前计算得到每种候选存储方案的目标权重的情况下,判断停止下一轮目标权重的计算;在当前未计算得到每种候选存储方案的目标权重的情况下,判断继续下一轮目标权重的计算。若不停止下一轮目标权重计算,则令N=N+1,并执行步骤302;若停止下一轮目标权重计算,则执行步骤305。
305、将当前最大目标权重对应的候选存储方案作为目标存储方案,并将待存储数据存储至目标存储方案对应的候选存储空间的第一地址至第二地址。
306、将上述第一地址至上述第二地址对应的存储空间设置为已分配的存储空间。
307、在第一数据释放时间到达后,释放上述第一地址至上述第二地址。
在一些实施例中,可以将上述第一地址至上述第二地址对应的存储空间设置为未分配的存储空间。
308、若第二地址为候选存储空间的结束地址,则在上述第二地址的下一地址至第三地址(第三地址位于第二地址的右侧)均未存储数据的情况下,将上述第一地址至上述第三地址设置为一个未分配的离散存储空间。其中,以第三地址的下一地址为起始地址的存储空间为已分配的存储空间(used_item)。
步骤308可以替换为:若第一地址为候选存储空间的起始地址,则在上述目标存储器的第四地址(第四地址位于第一地址的左侧)至上述第一地址的上一地址均未存储数据的情况下,将上述第四地址至上述第二地址设置为一个未分配的离散存储空间。其中,以第四地址的上一地址为结束地址的存储空间为已分配的存储空间(used_item)。
这样,可以快速地将相邻的两个未分配的存储空间设置为一个较大的未分配的存储空间。
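步骤308中相邻空闲存储空间的合并可示意如下(区间用(起始地址, 大小)表示,函数名与数据结构为本文假设):

```python
def merge_free(intervals):
    """将地址上紧邻的未分配存储空间合并为一个更大的未分配存储空间。

    intervals: 未分配区间(起始地址, 大小)的列表。"""
    merged = []
    for start, size in sorted(intervals):
        # 若当前区间的起始地址恰为前一个空闲区间的结束地址, 则二者相邻, 合并
        if merged and merged[-1][0] + merged[-1][1] == start:
            merged[-1] = (merged[-1][0], merged[-1][1] + size)
        else:
            merged.append((start, size))
    return merged
```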
本申请实施例提供的方法,能够有效减少内存碎片。
前述实施例描述的用于存储数据的方法可以应用于数据处理装置通过AI芯片执行数据处理任务的场景,即实时管理共享缓存的地址分配和释放;也可以应用于AI模型的编译场景。在AI模型的编译场景中,数据处理装置可执行本申请实施例提供的用于存储数据的方法来模拟AI模型执行处理操作时共享缓存的分配,进而对AI模型编译得到能指示共享缓存的内存分配和释放的指令序列。数据处理装置中的AI芯片可执行指令序列来执行数据处理任务。AI芯片在执行指令序列来执行数据处理任务的过程中,按照指令序列中的指令将数据存储至共享缓存以及释放共享缓存中的数据,可以提高共享缓存的利用率。
图4为本申请实施例提供的一种数据处理装置的结构示意图,如图4所示,该装置包括第一确定单元401、第二确定单元402和第三确定单元403。
第一确定单元401,用于基于待存储数据所需的存储空间大小,确定目标存储器中的至少两个候选存储空间。
第二确定单元402,用于基于上述待存储数据的第一数据释放时间和生命周期中的至少一项,确定将上述待存储数据存储至上述至少两个候选存储空间的多种候选存储方案中每种候选存储方案的目标权重,其中,每个候选存储空间对应于至少一种候选存储方案。
第三确定单元403,用于基于上述多种候选存储方案中每种候选存储方案的目标权重,确定上述待存储数据的目标存储方案。
在一个可选的实现方式中,上述候选存储空间对应的候选存储方案包括第一候选存储方案和第二候选存储方案中的至少一种,其中,上述第一候选存储方案中的起始存储地址为上述候选存储空间的起始地址,上述第二候选存储方案中的结束存储地址为上述候选存储空间的结束地址。
在一个可选的实现方式中,第二确定单元402还用于对于每种候选存储方案,基于所述待存储数据的第一数据释放时间和第二数据释放时间,确定该候选存储方案的目标权重,其中,所述第二数据释放时间为与所述待存储数据在所述候选存储方案中的存储位置相邻的存储空间所存储数据的数据释放时间。
在一个可选的实现方式中,上述候选存储方案对应的目标权重与上述待存储数据的第一数据释放时间和第二数据释放时间之间的间隔负相关。
在一个可选的实现方式中,第二确定单元402,还用于对于每种候选存储方案,基于上述待存储数据的生命周期和上述候选存储方案对应的候选存储空间的起始地址,确定上述候选存储方案的目标权重。
在一个可选的实现方式中,第二确定单元402,还用于确定所述待存储数据对应的最大生命周期;确定上述待存储数据的生命周期与上述最大生命周期之间的第一比值;确定上述候选存储方案对应的候选存储空间的起始地址与上述目标存储器的结束地址之间的第二比值;基于第一比值和第二比值,确定上述候选存储方案的目标权重。
在一个可选的实现方式中,上述候选存储方案的目标权重与上述第一比值和上述第二比值之差的绝对值负相关。
在一个可选的实现方式中,第二确定单元402,还用于对于每种候选存储方案,基于上述待存储数据对应的第一数据释放时间和第二数据释放时间,确定上述候选存储方案的第一权重,其中,第二数据释放时间为与待存储数据在该候选存储方案中的存储位置相邻的存储空间所存储数据的数据释放时间;基于上述待存储数据的生命周期和上述候选存储方案对应的候选存储空间的起始地址,确定上述候选存储方案的第二权重;基于上述第一权重和上述第二权重的加权和,得到上述候选存储方案的目标权重。
在一个可选的实现方式中,第二确定单元402,还用于对于每种候选存储方案,基于上述待存储数据的第一数据释放时间、生命周期和该候选存储方案对应的存储空间大小,确定该候选存储方案的目标权重。
在一个可选的实现方式中,第二确定单元402,还用于基于所述待存储数据对应的第一数据释放时间和第二数据释放时间,确定该候选存储方案的第一权重,其中,所述第二数据释放时间为与所述待存储数据在该候选存储方案中的存储位置相邻的存储空间所存储数据的数据释放时间;基于所述待存储数据的生命周期和该候选存储方案对应的候选存储空间的起始地址,确定该候选存储方案的第二权重;基于所述候选存储方案对应的候选存储空间的大小和所述目标存储器的总存储空间的大小,确定该候选存储方案的第三权重;基于所述第一权重、所述第二权重和所述第三权重的加权和,得到该候选存储方案的所述目标权重。
在一个可选的实现方式中,第二确定单元402,还用于对于每种候选存储方案,基于所述待存储数据的第一数据释放时间和该候选存储方案对应的存储空间大小,确定该候选存储方案的目标权重。
在一个可选的实现方式中,第二确定单元402,还用于对于每种候选存储方案,基于所述待存储数据的生命周期和该候选存储方案对应的存储空间大小,确定该候选存储方案的目标权重。
在一个可选的实现方式中,所述装置还包括设置单元404,用于将所述待存储数据存储至所述目标存储方案对应的候选存储空间的第一地址至第二地址;并将所述第一地址至所述第二地址对应的存储空间设为已分配的存储空间;其中,所述第一地址和所述第二地址中的一个为所述目标存储方案对应的候选存储空间的起始地址,或者,所述第一地址和所述第二地址中的一个为所述目标存储方案对应的候选存储空间的结束地址。
在一个可选的实现方式中,所述装置还包括释放单元405,用于在所述待存储数据对应的第一数据释放时间到达之后,释放所述第一地址至所述第二地址对应的存储空间;设置单元404还用于将所述第一地址至所述第二地址对应的存储空间设置为未分配的存储空间。
在一个可选的实现方式中,第三确定单元403,还用于在所述多种候选存储方案各自的目标权重中,将最大的一个目标权重对应的候选存储方案确定为所述待存储数据的所述目标存储方案;或所述多种候选存储方案各自的目标权重中,将超过预设权重阈值的任一个目标权重对应的候选存储方案确定为所述目标存储方案。
在一个可选的实现方式中,上述目标存储器为人工智能AI芯片中的共享缓存。
在一个可选的实现方式中,第一确定单元401,还用于从上述目标存储器未分配的多个离散存储空间中,确定可存储上述待存储数据的上述至少两个候选存储空间,其中所述候选存储空间的大小大于或等于待存储数据占用的存储空间。
在一个可选的实现方式中,设置单元404,还用于若第二地址为候选存储空间的结束地址,在上述第二地址的下一地址至第三地址均未存储数据的情况下,将上述第一地址至上述第三地址设置为一个未分配的离散存储空间。其中,以第三地址的下一地址为起始地址的存储空间为已分配的存储空间。
在一个可选的实现方式中,设置单元404,还用于若第一地址为候选存储空间的起始地址,在上述目标存储器的第四地址至上述第一地址的上一地址均未存储数据的情况下,将上述第四地址至上述第二地址设置为一个未分配的离散存储空间。其中,以第四地址的上一地址为结束地址的存储空间为已分配的存储空间。
图5是本申请实施例提供的一种数据处理装置的结构示意图。如图5所示,数据处理装置包括AI芯片510和存储器520,AI芯片510可从存储器520获取数据和指令,并将最终的处理结果输出至存储器520,AI芯片510中的计算单元501执行处理任务,计算单元501在处理数据的过程中将数据存储至共享缓存502(即目标存储器)以及从该共享缓存502获取数据。共享缓存502的地址分配和释放可采用前述实施例中的用于存储数据的方法。在一些实施例中,存储器520可能位于AI芯片510内部。在一些实施例中,在AI芯片执行某种数据处理任务时,数据处理装置运行的某个内存管理软件执行前述实施例中的用于存储数据的方法来管理共享缓存的地址分配和释放。在一些实施例中,在AI芯片执行某种数据处理任务时,执行从存储器读取的指令来实现数据处理任务,在实现数据处理任务的过程中从存储器读取的指令指示了共享缓存的地址分配和释放。也就是说,AI芯片执行从存储器读取的指令就可实现与前述实施例相同的内存分配和释放流程。
图6是本申请实施例提供的一种电子设备的结构示意图,该电子设备600可因配置或性能不同而产生比较大的差异,可以包括一个或多个中央处理器(central processing units,CPU)622(例如,一个或多个处理器)和存储器632,一个或多个存储应用程序642或数据644的存储介质630(例如一个或多个海量存储设备),一个或多个AI芯片624。其中,存储器632和存储介质630可以是短暂存储或持久存储。存储在存储介质630的程序可以包括一个或多个模块(图中未标出),每个模块可以包括对电子设备中的一系列指令操作。更进一步地,中央处理器622可以设置为与存储介质630通信,在电子设备600上执行存储介质630中的一系列指令操作。AI芯片624可执行CPU 622分配的各种数据处理任务。电子设备600可以为本申请提供的数据处理装置。
电子设备600还可以包括一个或多个电源626,一个或多个有线或无线网络接口650,一个或多个输入输出接口658,和/或,一个或多个操作系统641,例如Windows Server TM,Mac OS X TM,Unix TM,Linux TM,FreeBSD TM等等。
上述实施例中由数据处理装置所执行的步骤可以基于该图6所示的电子设备结构。具体的,中央处理器622可实现图4中各单元的功能。
本申请实施例提供了一种计算机可读存储介质,上述计算机可读存储介质存储有计算机程序,所述计算机程序包括程序指令,上述计算机程序被处理器执行时实现:基于待存储数据所需的存储空间大小,确定目标存储器中的至少两个候选存储空间;基于上述待存储数据的第一数据释放时间和生命周期中的至少一项,确定将上述待存储数据存储至上述至少两个候选存储空间的多种候选存储方案中每种候选存储方案的目标权重,其中,每个候选存储空间对应于至少一种候选存储方案;基于上述多种候选存储方案中每种候选存储方案的目标权重,确定上述待存储数据的目标存储方案。该计算机可读存储介质可以是非易失性的存储介质。
本申请实施例提供了一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行前述实施例所提供的用于存储数据的方法。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到各种等效的修改或替换,这些修改或替换都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以权利要求的保护范围为准。

Claims (20)

  1. 一种用于存储数据的方法,其特征在于,包括:
    基于待存储数据所需的存储空间大小,确定目标存储器中的至少两个候选存储空间;
    基于所述待存储数据的第一数据释放时间和生命周期中的至少一项,确定将所述待存储数据存储至所述至少两个候选存储空间的多种候选存储方案中每种候选存储方案的目标权重,其中,每个候选存储空间对应于至少一种候选存储方案;
    基于所述多种候选存储方案中每种候选存储方案的目标权重,确定所述待存储数据的目标存储方案。
  2. 根据权利要求1所述的方法,其特征在于,所述候选存储空间对应的候选存储方案包括第一候选存储方案和第二候选存储方案中的至少一种,其中,所述第一候选存储方案中的起始存储地址为所述候选存储空间的起始地址,所述第二候选存储方案中的结束存储地址为所述候选存储空间的结束地址。
  3. 根据权利要求1或2所述的方法,其特征在于,所述基于所述待存储数据对应的第一数据释放时间和生命周期中的至少一项,确定将所述待存储数据存储至所述至少两个候选存储空间的多种候选存储方案中每种候选存储方案的目标权重,包括:
    对于每种候选存储方案,基于所述待存储数据的第一数据释放时间和第二数据释放时间,确定该候选存储方案的目标权重,其中,所述第二数据释放时间为与所述待存储数据在所述候选存储方案中的存储位置相邻的存储空间所存储数据的数据释放时间。
  4. 根据权利要求3所述的方法,其特征在于,该候选存储方案的所述目标权重与所述第一数据释放时间和所述第二数据释放时间之间的时间间隔负相关。
  5. 根据权利要求1或2所述的方法,其特征在于,所述基于所述待存储数据对应的第一数据释放时间和生命周期中的至少一项,确定将所述待存储数据存储至所述至少两个候选存储空间的多种候选存储方案中每种候选存储方案的目标权重,包括:
    对于每种候选存储方案,基于所述待存储数据的生命周期和所述候选存储方案对应的候选存储空间的起始地址,确定该候选存储方案的目标权重。
  6. 根据权利要求5所述的方法,其特征在于,所述对于每种候选存储方案,基于所述待存储数据的生命周期和该候选存储方案对应的候选存储空间的起始地址,确定该候选存储方案的目标权重,包括:
    确定所述待存储数据对应的最大生命周期;
    确定所述待存储数据的生命周期与所述最大生命周期之间的第一比值;
    确定该候选存储方案对应的候选存储空间的起始地址与所述目标存储器的结束地址之间的第二比值;
    基于所述第一比值和所述第二比值,确定该候选存储方案的所述目标权重。
  7. 根据权利要求6所述的方法,其特征在于,所述候选存储方案的所述目标权重与所述第一比值和所述第二比值之差的绝对值负相关。
  8. 根据权利要求1或2所述的方法,其特征在于,所述基于所述待存储数据对应的第一数据释放时间和生命周期中的至少一项,确定将所述待存储数据存储至所述至少两个候选存储空间的多种候选存储方案中每种候选存储方案的目标权重,包括:
    对于每种候选存储方案,
    基于所述待存储数据对应的第一数据释放时间和第二数据释放时间,确定该候选存储方案的第一权重,其中,所述第二数据释放时间为与所述待存储数据在该候选存储方案中的存储位置相邻的存储空间所存储数据的数据释放时间;
    基于所述待存储数据的生命周期和该候选存储方案对应的候选存储空间的起始地址,确定该候选存储方案的第二权重;
    基于所述第一权重和所述第二权重的加权和,得到该候选存储方案的目标权重。
  9. 根据权利要求1或2所述的方法,其特征在于,所述基于所述待存储数据的第一数据释放时间和生命周期中的至少一项,确定将所述待存储数据存储至所述至少两个候选存储空间的多种候选存储方案中每种候选存储方案的目标权重,包括:
    对于每种候选存储方案,基于所述待存储数据的第一数据释放时间、生命周期和该候选存储方案对应的存储空间大小,确定该候选存储方案的目标权重。
  10. 根据权利要求9所述的方法,其特征在于,所述确定该候选存储方案的目标权重,包括:
    基于所述待存储数据对应的第一数据释放时间和第二数据释放时间,确定该候选存储方案的第一权重,其中,所述第二数据释放时间为与所述待存储数据在该候选存储方案中的存储位置相邻的存储空间所存储数据的数据释放时间;
    基于所述待存储数据的生命周期和该候选存储方案对应的候选存储空间的起始地址,确定该候选存储方案的第二权重;
    基于所述候选存储方案对应的候选存储空间的大小和所述目标存储器的总存储空间的大小,确定该候选存储方案的第三权重;
    基于所述第一权重、所述第二权重和所述第三权重的加权和,得到该候选存储方案的所述目标权重。
  11. 根据权利要求1或2所述的方法,其特征在于,所述基于所述待存储数据的第一数据释放时间和生命周期中的至少一项,确定将所述待存储数据存储至所述至少两个候选存储空间的多种候选存储方案中每种候选存储方案的目标权重,包括:
    对于每种候选存储方案,基于所述待存储数据的第一数据释放时间和该候选存储方案对应的存储空间大小,确定该候选存储方案的目标权重。
  12. 根据权利要求1或2所述的方法,其特征在于,所述基于所述待存储数据的第一数据释放时间和生命周期中的至少一项,确定将所述待存储数据存储至所述至少两个候选存储空间的多种候选存储方案中每种候选存储方案的目标权重,包括:
    对于每种候选存储方案,基于所述待存储数据的生命周期和该候选存储方案对应的存储空间大小,确定该候选存储方案的目标权重。
  13. 根据权利要求1至12任一项所述的方法,其特征在于,所述方法还包括:
    将所述待存储数据存储至所述目标存储方案对应的候选存储空间的第一地址至第二地址;
    将所述第一地址至所述第二地址对应的存储空间设置为已分配的存储空间;
    其中,所述第一地址和所述第二地址中的一个为所述目标存储方案对应的候选存储空间的起始地址,或者,所述第一地址和所述第二地址中的一个为所述目标存储方案对应的候选存储空间的结束地址。
  14. 根据权利要求13所述的方法,其特征在于,所述方法还包括:
    在所述待存储数据对应的第一数据释放时间到达之后,释放所述第一地址至所述第二地址对应的存储空间;并将所述第一地址至所述第二地址对应的存储空间设置为未分配的存储空间。
  15. 根据权利要求1至14任一项所述的方法,其特征在于,所述基于所述多种候选存储方案中每种候选存储方案的目标权重,确定所述待存储数据的目标存储方案,包括:
    在所述多种候选存储方案各自的目标权重中,将最大的一个目标权重对应的候选存储方案确定为所述待存储数据的所述目标存储方案;或
    在所述多种候选存储方案各自的目标权重中,将超过预设权重阈值的任一个目标权重对应的候选存储方案确定为所述目标存储方案。
  16. 根据权利要求1至15任一项所述的方法,其特征在于,所述目标存储器为人工智能AI芯片中的共享缓存。
  17. 一种数据处理装置,其特征在于,包括:
    第一确定单元,用于基于待存储数据所需的存储空间大小,确定目标存储器中的至少两个候选存储空间;
    第二确定单元,用于基于所述待存储数据的第一数据释放时间和生命周期中的至少一项,确定将所述待存储数据存储至所述至少两个候选存储空间的多种候选存储方案中每种候选存储方案的目标权重,其中,每个候选存储空间对应于至少一种候选存储方案;
    第三确定单元,用于基于所述多种候选存储方案中每种候选存储方案的目标权重,确定所述待存储数据的目标存储方案。
  18. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质中存储有计算机程序,所述计算机程序包括程序指令,所述程序指令当被电子设备的处理器执行时,使所述处理器执行权利要求1至16任一项所述的方法。
  19. 一种电子设备,其特征在于,包括存储有处理器可执行指令的存储器、目标存储器和处理器,其中,所述处理器,在执行所述指令时用于实现如权利要求1至16任一项所述的方法。
  20. 根据权利要求19所述的电子设备,其特征在于,所述电子设备为AI芯片,所述目标存储器为所述AI芯片中的共享缓存。
PCT/CN2020/136966 2020-05-18 2020-12-16 一种存储数据的方法及数据处理装置 WO2021232769A1 (zh)
