CN110245024B - Dynamic allocation system and method for static storage blocks - Google Patents


Info

Publication number
CN110245024B
Authority
CN
China
Prior art keywords
data
metadata
host
external device
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910633585.3A
Other languages
Chinese (zh)
Other versions
CN110245024A (en)
Inventor
袁进辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Oneflow Technology Co Ltd
Original Assignee
Beijing Oneflow Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Oneflow Technology Co Ltd filed Critical Beijing Oneflow Technology Co Ltd
Priority to CN201910633585.3A priority Critical patent/CN110245024B/en
Publication of CN110245024A publication Critical patent/CN110245024A/en
Application granted granted Critical
Publication of CN110245024B publication Critical patent/CN110245024B/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals

Abstract

The present disclosure relates to a dynamic allocation system for static memory blocks, comprising: a storage space determining unit that, for a data block to be processed, generates header data containing metadata corresponding to the content data of the data block, thereby determining the size of the space to be allocated for the data block in the host storage unit and the size of the space to be allocated for the data block in the external device, and that modifies the metadata of the header data when a specific data block is acquired; a host storage space allocation unit that, based on the modified metadata, allocates a continuous fixed-size overall storage space for the header data and the content data of the specific data block and stores them contiguously in the storage unit of the host; and an external device storage space allocation unit that, based on the metadata contained in the header data modified by the storage space determining unit, allocates a fixed-size storage space for the content data of the specific data block in the storage unit of the external device and stores the content data there.

Description

Dynamic allocation system and method for static storage blocks
Technical Field
The present disclosure relates to a dynamic allocation system of static memory blocks and a method thereof, and more particularly, to a system and a method for implementing dynamic allocation of fixed memory space in a static memory system.
Background
With the advent of big data computing and deep learning, coprocessors such as the GPU (Graphics Processing Unit) and APU are commonly used to offload data-processing work from the CPU. The GPU has a highly parallel architecture, so it processes graphics data and complex algorithms more efficiently than the CPU. When executing a computing task, a CPU core processes only one piece of data at a time, with no true parallelism, whereas a GPU has many processor cores and can process many pieces of data in parallel at the same time. Compared with the CPU, the GPU devotes more of its hardware to ALUs (Arithmetic Logic Units) for data processing rather than to data caching and flow control. Such a structure is well suited to large volumes of data that are highly uniform in type and mutually independent, in a computing environment that does not need to be interrupted. Existing big data computing and deep learning systems employ dynamic allocation of storage space, which controls the exact size and lifetime of each storage location. During system operation, if these storage spaces are not freed in time, the application can crash: at some point the system can no longer allocate more memory, and a memory leak or out-of-memory (OOM) condition occurs.
It is therefore desirable to have a big data computing and deep learning system that is insensitive to variations in data-block size, thereby eliminating out-of-memory (OOM) crashes.
Disclosure of Invention
Since the data processed in big data and deep learning workloads are highly uniform in type, it is possible to provide a method that eliminates the above-mentioned problems of the prior art. According to one aspect of the present disclosure, there is provided a dynamic allocation system for static memory blocks, comprising: a storage space determining unit that, for a data block to be processed, generates header data corresponding to the content data of the data block, the header data containing metadata describing the specific content of the content data, thereby determining the size of the space to be allocated for the data block in the host storage unit and the size of the space to be allocated for the data block in the external device, and that modifies the header data when a specific data block is acquired; a host storage space allocation unit that allocates a continuous fixed-size overall storage space for the header data and the content data of the specific data block within the storage unit of the host, based on the metadata contained in the header data modified by the storage space determining unit; and an external device storage space allocation unit that allocates a fixed-size storage space for the content data of the specific data block in the storage unit of the external device, based on the metadata contained in the header data modified by the storage space determining unit.
A dynamic allocation system for static memory blocks according to the present disclosure, wherein the metadata includes size metadata of the data blocks and shape metadata of the data blocks.
A dynamic allocation system of static memory blocks according to the present disclosure, wherein the metadata further comprises metadata describing the tensors that make up the specific content of the content data.
The dynamic allocation system of static memory blocks according to the present disclosure, wherein the external device accesses the content data in the corresponding sub-storage space based on an instruction from the host containing header data.
According to another aspect of the present disclosure, there is provided a method of dynamically allocating static memory blocks, comprising: generating, by a storage space determining unit, header data corresponding to the content data of a data block to be processed, the header data containing metadata describing the specific content of the content data; determining the size of the space to be allocated for the data block by the host storage unit and the size of the space to be allocated for the data block by the external device; modifying the header data when a specific data block is acquired; on the host, allocating, by the host storage space allocation unit, a continuous fixed-size overall storage space for the header data and the content data of the specific data block within the storage unit of the host, based on the metadata contained in the header data modified by the storage space determining unit; and on an external device connected to the host, allocating, by the external device storage space allocation unit, a fixed-size storage space for the content data of the specific data block in the storage unit of the external device, based on the metadata contained in the header data modified by the storage space determining unit, and dynamically allocating and storing the specific content portion of the content data in the corresponding sub-storage space according to the value of the modified metadata.
The method for dynamically allocating static memory blocks according to the present disclosure, wherein the metadata includes size metadata of the data blocks and shape metadata of the data blocks.
The method for dynamically allocating static memory blocks according to the present disclosure, wherein the metadata further comprises metadata describing the tensors that make up the specific content of the content data.
According to the method for dynamically allocating static memory blocks, the external device accesses the content data in the corresponding sub-storage space based on an instruction from the host containing header data.
Statically allocating the storage spaces of the host and the external device enhances the control that a device adopting this system has over its data and operations. Meanwhile, although data types in deep learning and big data computation are highly uniform, the data varies slightly in real business scenarios. By modifying the metadata in the header data, the content data of each data block is kept in a dynamic storage state within a static space, and the host and the external device can know exactly how each data block is stored in their respective storage units, so the content data stored in the fixed-size storage space is used more efficiently.
The advantage of the static approach is that all required memory is allocated at program start-up, which eliminates the OOM problem entirely. This is the benefit of static memory allocation relative to dynamic memory allocation.
When the host according to the present disclosure is used in the field of big data and deep learning and forms a distributed system with external devices such as CPUs and GPUs, there are a large number of data copies and data interactions; since the communication bandwidth between the GPU and the outside world is fixed, reducing the amount of data exchanged is crucial to increasing transmission speed. In the dynamic storage system with static storage blocks according to the present disclosure, the external device stores the content data separately from the header data, so data interaction between the external device and the host omits the header data: only the values of the metadata in the header data need to be transmitted for the external device to operate on its locally stored content data, which saves communication overhead between the host and the external device. More importantly, because the fixed storage space is determined from the header data, different specific data blocks can share the same storage space during operation: once a previous data block has been used, the next data block is stored directly in the same storage space under the guidance of the header data, without any dynamic space allocation for the subsequent data. This saves the operational cost of storage space allocation, improves the fluidity of data reception and processing, and naturally increases data processing speed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
The disclosure will now be described in detail by way of example with reference to the accompanying drawings, in which:
FIG. 1 is a schematic diagram of a dynamic allocation system of static memory blocks according to the present disclosure;
FIG. 2 is a flow chart illustrating a method of dynamically storing static memory blocks in a distributed system according to the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to examples and drawings, so that those skilled in the art can practice the disclosure based on this description.
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the present disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in this disclosure to describe various items, these items should not be limited by these terms. These terms are only used to distinguish items of the same type from one another. For example, without departing from the scope of the present disclosure, one of two possible devices may be referred to below as a first host and the other as a second host, and similarly one of two possible external devices may be referred to as a first external device and the other as a second external device. The word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining", depending on the context.
In order that those skilled in the art will better understand the present disclosure, the present disclosure will be described in further detail below with reference to the accompanying drawings and detailed description.
FIG. 1 is a schematic diagram illustrating a dynamic allocation system of static memory blocks according to the present disclosure. As shown in FIG. 1, in a system for big data computation and deep learning, an external device 200, such as a GPU, may be connected to a host 100. In a distributed system, the hosts are also connected to one another to carry out the computing service. As shown in FIG. 1, the hosts are labeled 100-1 and 100-2, and the external devices are labeled 200-1 and 200-2. In the following description, they are referred to as the host 100 and the external device 200 unless otherwise indicated. The host 100 includes a control unit 10 and a storage unit 40. The host 100 also has the other constituent elements of a typical host, which are not described in detail here. The control unit 10 includes a storage space determining unit 20 and a storage space allocation unit 30. The external device 200 connected to the host 100 includes a control unit 50 and a storage unit 70. The external device 200 likewise has the usual constituent units of an external device (for example, a GPU), which are not described again. The control unit 50 of the external device 200 includes a storage space allocation unit 60.
The size of the storage space required for data of the same type can be determined in advance based on their regular size, so a fixed storage space can be configured in the computing system. To this end, the storage space determining unit 20 generates, for a data block to be processed, header data corresponding to the content data of the data block, the header data containing metadata describing the specific content of the content data. The storage space determining unit 20 then determines the size of the space to be allocated for the data block by the storage unit 40 of the host 100 and the size of the space to be allocated for the data block by the storage unit 70 of the external device 200, using the maximum size of data blocks of the same type as an upper limit. For example, if the maximum size of a data block of a given type is 4M, a storage space of 5M may be determined for such data blocks, or a space of exactly 4M. The metadata includes the data type, size metadata, and data shape metadata of the data block. The metadata it contains is typically expressed in specified fields. Table 1 schematically shows the structure of the header data:
Table 1 (header data structure)
As shown in Table 1 above, the header data that the storage space determining unit 20 generates for a data block to be processed is initially only a header-data frame. In other words, the initially generated header data fixes the metadata items and initial values contained in the header data, and specifies the uniform description information shared by the data blocks to be processed. Specifically, the values of these metadata fields are at first only Boolean values or scalars. For example, the data type metadata is a scalar, and the data shape metadata is also a scalar. Metadata describing the details of the content data may be a Boolean value, e.g. 0 or 1: an initial value of 1 indicates that the metadata item is present, and an initial value of 0 indicates that it is absent. Although Table 1 schematically shows one structure of header data, this does not mean that the header data mentioned in the present disclosure must contain all the metadata names and fields shown in the illustrative example. For example, the field is_body_disable is normally not included in the header data. This field typically exists only in one type of transport handler, so that the transport handler provides only header data to its downstream handlers.
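Such an initial header frame, and the fixed space sizes derived from it, can be sketched as follows. This is a minimal illustration, not the patent's own data layout: the field names are loosely modeled on those mentioned in the description, and the 64-byte header region and element size are assumptions.

```python
from dataclasses import dataclass

@dataclass
class HeaderData:
    """Illustrative header-data frame: scalar/Boolean fields at first,
    with the valid-count value filled in when a specific block arrives."""
    data_type: int = 0                # scalar: encodes the element type
    shape: tuple = ()                 # shape metadata: per-type upper bound
    has_dim0_valid_num: bool = False  # flag: is valid-count metadata present?
    dim0_valid_num: int = 0           # filled in per specific data block

def determine_spaces(header: HeaderData, elem_size: int, header_size: int = 64):
    """Fixed space sizes derived from the header frame: the host holds
    header + content contiguously; the external device holds content only."""
    max_elems = 1
    for d in header.shape:
        max_elems *= d
    content_size = max_elems * elem_size
    return header_size + content_size, content_size  # (host, device)
```

For a block class with shape upper bound (4, 8) and 4-byte elements, the host space is 64 + 128 bytes and the device space is 128 bytes.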
When a specific data block arrives, the storage space determining unit 20 correspondingly modifies the metadata of the already-generated header data that describes the details of the content data, based on the specific content of that data block, thereby obtaining an accurate description of the content data. For example, the value of the field "has_dim1_valid_num" describing the content data of a data block is modified into a matrix in order to describe the specific composition of the content data.
After the modified header data is obtained, the storage space allocation unit 30 of the host 100 allocates a continuous fixed-size overall storage space for the header data and the content data of the specific data block in the host storage unit 40, based on the metadata contained in the header data modified by the storage space determining unit 20.
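The contiguous host-side layout described here can be sketched as one pre-allocated buffer with a reserved header region followed by the content, as below. The function name and the explicit header-capacity parameter are illustrative assumptions; the point is that the buffer is fixed in size and only the used extent varies.

```python
def store_on_host(buf: bytearray, header: bytes, content: bytes,
                  header_capacity: int) -> int:
    """Store header data and content data contiguously in one
    pre-allocated, fixed-size host buffer: header first, content
    immediately after the reserved header region. Returns the number
    of bytes in use; the buffer itself never grows or shrinks."""
    if len(header) > header_capacity:
        raise ValueError("header exceeds its reserved region")
    if header_capacity + len(content) > len(buf):
        raise ValueError("content exceeds the fixed overall space")
    buf[:len(header)] = header
    buf[header_capacity:header_capacity + len(content)] = content
    return header_capacity + len(content)
```

A later, smaller block of the same type can be written into the same buffer without any reallocation, which is the reuse behavior the description attributes to the static space.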
Alternatively, in some cases there may be no need to allocate any space, or the allocated space size may be zero. For example, when the header data contains the field is_body_disable, no space needs to be allocated for the content data.
Table 2 shows a schematic structure of the header data and content data stored contiguously in the storage unit 40 of the host 100.
Table 2
Meanwhile, when the external device 200 needs to copy or process a specific data block, or when the control unit 10 of the host 100 sends a data copy or processing command to the external device 200, the storage space allocation unit 60 of the external device 200 allocates a fixed-size storage space in the storage unit 70 based on the values of the metadata in the header data carried in the command obtained from the host 100, and stores the content of the obtained data block in the allocated fixed storage space. Table 3 shows a schematic structure of the content data stored in the storage unit 70 of the external device.
Table 3
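The device-side behavior, where only metadata values arrive with each command while the header data itself stays on the host, can be sketched as follows. The class and method names are illustrative assumptions, not from the patent.

```python
class ExternalDeviceStore:
    """Sketch of the device-side allocation: one fixed content-only
    space per block class, statically allocated once and refilled as
    commands arrive. Only metadata *values* travel with each command;
    the header data itself stays on the host."""
    def __init__(self, fixed_size: int):
        self.space = bytearray(fixed_size)  # allocated once, never resized
        self.valid = 0                      # dynamically varying extent

    def on_command(self, valid_num: int, content: bytes) -> None:
        # valid_num plays the role of a metadata value (e.g. dim0_valid_num)
        # extracted from the host's header data and carried in the command.
        if not (valid_num == len(content) <= len(self.space)):
            raise ValueError("command metadata inconsistent with content")
        self.space[:valid_num] = content
        self.valid = valid_num
```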
In a data processing system containing a dynamic storage system of static storage blocks according to the present disclosure, the storage space statically allocated to each executable of the host and the external device is retained throughout the processing of data of the same kind, bounded by its maximum size. Because the metadata in the header data changes as the content data of each specific data block changes, the occupied portion of the allocated storage unit 40 or 70 is in a dynamically changing state until data processing is complete. This header-data-based static allocation of storage space avoids the storage crashes that can occur under special conditions when storage space is allocated dynamically. Furthermore, within the statically allocated space, the storage is used dynamically based on changes in the metadata of the header data, so other operation units of the external device can directly obtain a specific part of the content data guided by the metadata of the header data, which speeds up data reading and operations on the data.
As shown in FIG. 1, when a data block is transferred between the host 100-1 and the host 100-2, the two hosts operate independently of each other, so when the host 100-1 needs data block 2 held by the host 100-2, all the data of data block 2 must be copied to the host 100-1. Because the header data and the content data of data block 2 are stored contiguously, the copy can be performed with a single launch (LAUNCH); if the header data and the content data were stored separately, transferring the data block between the hosts 100-1 and 100-2 would require two transfers. Storing the header data and the content data contiguously in the static storage space on each host therefore reduces the number of transmissions needed when copying data between hosts, and thus reduces transmission overhead.
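The launch-count argument above can be made concrete with a small sketch; the function and parameter names are illustrative, and `launch` stands in for whatever transfer primitive the system uses.

```python
def copy_block(launch, header: bytes, content: bytes,
               contiguous: bool) -> int:
    """Count the transfer launches needed to move one data block
    between hosts: contiguous header + content moves in a single
    launch, while separately stored header and content need two."""
    if contiguous:
        launch(header + content)
        return 1
    launch(header)
    launch(content)
    return 2
```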
Between the host 100-1 and the external device 200-1 connected to it, for the external device to obtain a data block from the host 100-1, the host only needs to copy the content data portion; it does not need to copy the header data, because the values of the command-related metadata from the header data are carried in the command sent to the external device. Therefore, in actual data processing, apart from the transmission of the content data itself, only a small number of command signals, together with the metadata values from the header data that they carry, pass between the host 100 and the external device. When the content data of a data block changes, only the specific values of the metadata of the header data stored on the host 100 change; on the external device this is reflected only as a change in the content data and a change in the metadata values contained in the command sent to the device. Since there is a large amount of data interaction between the host and the external device, eliminating or reducing the interaction of header data, i.e. keeping the header data the external device needs for its operations stored separately on the host, reduces the amount of data exchanged and increases processing speed. The content data on the external device 200 is then read based on the values of the command-related metadata from the header data carried in the command sent by the host 100 to the external device.
FIG. 2 is a flow chart illustrating a method of dynamically storing static memory blocks in a distributed system according to the present disclosure. As shown in FIG. 2, first, at step S310, before a large number of data blocks of the same kind are processed, the storage space determining unit 20 of the host 100 generates, for a data block to be processed, header data corresponding to the content data of the data block, the header data containing metadata describing the specific content of the content data. As described above, big data computation and deep learning generally face data of the same type, so the size of the required storage space can be determined in advance based on their regular size, and a fixed storage space can be configured in the computing system. This opens up the possibility of static storage. A fixed storage space is determined by one of the metadata items in the header data. Thus, at step S320, the storage space determining unit 20 determines the size of the space to be allocated for the data block by the storage unit 40 of the host 100 and the size of the space to be allocated for the data block by the storage unit 70 of the external device 200.
Subsequently, at step S330, upon receiving a data block, the storage space determining unit 20 modifies the metadata in the header data based on the content data of that specific data block. Specifically, as shown in Table 2 above, for example, the metadata "has_dim0_valid_num" in the initially generated header data is modified to "dim0_valid_num", "has_dim1_valid_num" is modified to "dim1_valid_num", and so on. After the header data has been modified for the specific data block, at step S340 the storage space allocation unit 30 of the host 100 allocates a continuous fixed-size overall storage space for the header data and the content data of the specific data block and stores them contiguously in the storage unit 40 of the host 100, based on the metadata contained in the modified header data. In parallel with step S340, at step S350, the storage space allocation unit 60 of the external device 200, after receiving the values of the metadata contained in the header data modified by the storage space determining unit 20, allocates a fixed-size storage space for the content data of the specific data block in the storage unit 70 of the external device 200 based on those values, and dynamically allocates and stores the specific content portion of the content data in the corresponding sub-storage space according to the values of the modified metadata. When a new specific data block is subsequently received at step S360, the flow returns to step S330 and the header data is updated, modifying the values of the metadata of the header data based on the content of the new data block; the storage space in the host is then updated with the header data and content data of the new block, while the fixed storage space in the storage unit of the external device is updated based on the newly modified metadata.
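The S310-S360 loop can be sketched end to end as follows: a single fixed space is allocated up front and reused for every incoming same-type block, with only a metadata value changing per block. The function name is an illustrative assumption, and the "valid length" stands in for the modified metadata of the header data.

```python
def process_blocks(blocks, fixed_size: int):
    """Sketch of the S310-S360 loop: one fixed space is allocated up
    front (S320) and reused for every incoming same-type block; only
    the metadata value (here, the valid length) changes per block
    (S330), while the space itself is never reallocated (S340/S350)."""
    space = bytearray(fixed_size)   # static allocation, done once
    valid_lengths = []
    for content in blocks:          # S360: each new block re-enters S330
        if len(content) > fixed_size:
            raise ValueError("block exceeds the same-type upper bound")
        space[:len(content)] = content      # reuse of the same space
        valid_lengths.append(len(content))  # modified metadata value
    return valid_lengths
```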
Although the storage space determining unit and the storage space allocation unit are described as two separate components in this disclosure, the storage space determining unit may alternatively be part of the storage space allocation unit, the two together forming one integral part of the host control unit that generates the header data for the data blocks and determines the static storage space. Thus, although the present disclosure describes the two separately, this does not imply that they must exist separately from each other in an arrangement implementing the present disclosure.
When the host according to the present disclosure is used in the field of big data and deep learning and forms a distributed system with external devices such as CPUs and GPUs, there are a large number of data copies and data interactions; since the communication bandwidth between the GPU and the outside world is fixed, reducing the amount of data exchanged is crucial to increasing transmission speed. In the dynamic storage system with static storage blocks according to the present disclosure, the external device stores the content data separately from the header data, so data interaction between the external device and the host omits the header data: only the values of the metadata in the header data need to be transmitted for the external device to operate on its locally stored content data, which saves communication overhead between the host and the external device. More importantly, because the fixed storage space is determined from the header data, different specific data blocks can share the same storage space during operation: once a previous data block has been used, the next data block is stored directly in the same storage space under the guidance of the header data, without any dynamic space allocation for the subsequent data. This saves the operational cost of storage space allocation, improves the fluidity of data reception and processing, and naturally increases data processing speed.
Thus far, this specification describes a dynamic memory system of a static memory block and a method thereof according to an embodiment of the present disclosure. According to the dynamic storage system and the method for the static storage block, the header data of the data block is generated and the header data is modified to describe the content data of the data block, so that the use efficiency of the storage space of the storage unit is greatly improved, the data communication overhead between the host and the external equipment and between the host and the host is reduced, and the data processing efficiency of the distributed computing system is improved.
While the basic principles of the present disclosure have been described above in connection with specific embodiments, it should be noted that all or any steps or components of the methods and apparatus of the present disclosure can be implemented in hardware, firmware, software, or combinations thereof in any computing device (including processors, storage media, etc.) or network of computing devices, as would be apparent to one of ordinary skill in the art upon reading the present disclosure.
Thus, the objects of the present disclosure may also be achieved by running a program or set of programs on any computing device. The computing device may be a well-known general purpose device. Thus, the objects of the present disclosure may also be achieved by simply providing a program product containing program code for implementing the method or apparatus. That is, such a program product also constitutes the present disclosure, and a storage medium storing such a program product also constitutes the present disclosure. It is apparent that the storage medium may be any known storage medium or any storage medium developed in the future.
It should also be noted that in the apparatus and methods of the present disclosure, it is apparent that the components or steps may be disassembled and/or assembled. Such decomposition and/or recombination should be considered equivalent to the present disclosure. The steps of executing the series of processes may naturally be executed in chronological order in the order described, but are not necessarily executed in chronological order. Some steps may be performed in parallel or independently of each other.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (8)

1. A dynamic allocation system for static memory blocks, comprising:
a storage space determining unit that, for a data block to be processed, generates header data corresponding to the content data of the data block, the header data containing metadata describing the specific content of the content data, thereby determining the size of the space to be allocated for the data block by the storage unit of the host and the size of the space to be allocated for the data block by the external device, and that modifies the header data when a specific data block is acquired;
a host storage space allocation unit that, based on the metadata contained in the header data modified by the storage space determining unit, statically allocates, for the header data and content data of the specific data block of each executive body, a contiguous, fixed-size overall storage space within the storage unit of the host and stores the header data and the content data contiguously therein, wherein the allocated storage space is maintained throughout the processing of data of the same kind, and, because the metadata in the header data changes as the content data of the specific data block changes, the allocated storage unit remains in a dynamically changing state of use until the data processing is completed; and
an external device storage space allocation unit that, when the external device needs to copy or process a specific data block, or when a control unit of the host transmits a data copy or processing command to the external device, statically allocates a fixed-size storage space for the content data of the specific data block in a storage unit of the external device based on the value of the metadata contained in the header data, as modified by the storage space determining unit, contained in the command obtained from the host, and, according to the value of the modified metadata, dynamically allocates and stores a specific content portion of the content data in a corresponding sub-storage space,
wherein, when the content data changes, only specific metadata values of the header data stored in the host change, and the host sends to the external device only the command-related metadata values of the header data, so that the external device acquires only the command-related metadata values of the header data contained in the command transmitted to it; other operation units of the external device can thus directly acquire specific portions of the content data under the guidance of the header data metadata, and, because the metadata in the header data changes with the content data of the specific data block, the statically allocated storage unit of the external device remains in a state in which its content data changes dynamically until the data processing is completed, whereby the allocated static storage space is used dynamically and the speed of data reading and operation is increased.
2. The dynamic allocation system for static memory blocks of claim 1, wherein the metadata comprises metadata representing the size of the data block and metadata representing the shape of the data block.
3. The dynamic allocation system for static memory blocks according to claim 1 or 2, wherein the metadata further comprises metadata describing the tensor of the specific content of the content data.
4. The dynamic allocation system for static memory blocks according to claim 3, wherein the external device accesses the content data in the corresponding sub-storage space based on an instruction from the host containing the header data.
5. A method of dynamically allocating static memory blocks, comprising:
generating, by a storage space determining unit, for a data block to be processed, header data corresponding to the content data of the data block, the header data containing metadata describing the specific content of the content data;
determining the size of the space to be allocated for the data block by the storage unit of the host and the size of the space to be allocated for the data block by the external device;
modifying the header data when a specific data block is acquired;
on the storage unit of the host, statically allocating, by a host storage space allocation unit and based on the metadata contained in the header data modified by the storage space determining unit, a contiguous, fixed-size overall storage space for the header data and content data of the specific data block of each executive body, and storing the header data and the content data contiguously in the storage unit of the host, wherein the allocated storage space is maintained throughout the processing of data of the same kind, and, because the metadata in the header data changes as the content data of the specific data block changes, the allocated storage unit remains in a dynamically changing state of use until the data processing is completed; and
on an external device connected to the host, when the external device needs to copy or process a specific data block, or when a control unit of the host transmits a data copy or processing command to the external device, statically allocating, by an external device storage space allocation unit, a fixed-size storage space for the content data of the specific data block in a storage unit of the external device based on the value of the metadata contained in the header data, as modified by the storage space determining unit, contained in the command obtained from the host, and dynamically allocating and storing a specific content portion of the content data in a corresponding sub-storage space according to the value of the modified metadata,
wherein, when the content data changes, only specific metadata values of the header data stored in the host change, and the host sends to the external device only the command-related metadata of the header data, so that the external device acquires only the command-related metadata of the header data contained in the command transmitted to it; other operation units of the external device can thus directly acquire specific portions of the content data under the guidance of the header data metadata, and, because the metadata in the header data changes with the content data of the specific data block, the statically allocated storage unit of the external device remains in a state in which its content data changes dynamically until the data processing is completed, whereby the allocated static storage space is used dynamically and the speed of data reading and operation is increased.
6. The method for dynamically allocating static memory blocks according to claim 5, wherein the metadata comprises metadata representing the size of the data block and metadata representing the shape of the data block.
7. The method for dynamically allocating static memory blocks according to claim 5 or 6, wherein the metadata further comprises metadata describing the tensor of the specific content of the content data.
8. The method for dynamically allocating static memory blocks according to claim 7, wherein the external device accesses the content data in the corresponding sub-storage space based on an instruction from the host containing the header data.
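The host/device exchange recited in the method claims, in which the host sends only the command-related header metadata while the external device's statically allocated space is reused dynamically, can be sketched as follows. This is an illustrative sketch, not code from the patent; the class name `ExternalDevice`, the `handle_command` method, and the `nbytes`/`max_nbytes` metadata keys are all assumptions made for illustration.

```python
class ExternalDevice:
    """Illustrative sketch: allocate once from metadata, then reuse sub-spaces."""

    def __init__(self):
        self.storage = None   # statically allocated on the first command
        self.capacity = 0

    def handle_command(self, metadata, payload):
        # metadata carries only the command-related header values, not the data
        # layout itself; the host never resends the full header.
        nbytes = metadata["nbytes"]
        if self.storage is None:
            # Static allocation: a fixed-size space, sized once from metadata.
            self.capacity = metadata["max_nbytes"]
            self.storage = bytearray(self.capacity)
        if nbytes > self.capacity:
            raise ValueError("content exceeds statically allocated space")
        # Dynamic use: store this block's content in a sub-storage space.
        self.storage[:nbytes] = payload
        return nbytes
```

Successive commands with different `nbytes` values reuse the same fixed-size region; only the valid prefix changes, so no reallocation or extra host-device transfer of layout information is needed.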
CN201910633585.3A 2019-07-15 2019-07-15 Dynamic allocation system and method for static storage blocks Active CN110245024B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910633585.3A CN110245024B (en) 2019-07-15 2019-07-15 Dynamic allocation system and method for static storage blocks

Publications (2)

Publication Number Publication Date
CN110245024A CN110245024A (en) 2019-09-17
CN110245024B true CN110245024B (en) 2023-12-05

Family

ID=67892218

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910633585.3A Active CN110245024B (en) 2019-07-15 2019-07-15 Dynamic allocation system and method for static storage blocks

Country Status (1)

Country Link
CN (1) CN110245024B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111158919B (en) * 2020-01-20 2020-09-22 北京一流科技有限公司 Memory resource in-place sharing decision system and method thereof
CN112148795B (en) * 2020-09-27 2021-06-15 上海依图网络科技有限公司 Data processing method, device, equipment and medium
CN113568677A (en) * 2021-07-14 2021-10-29 上海淇玥信息技术有限公司 Data object analysis method and device and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103678414A (en) * 2012-09-25 2014-03-26 腾讯科技(深圳)有限公司 Method and device for storing and inquiring data
CN103885728A (en) * 2014-04-04 2014-06-25 华中科技大学 Magnetic disk cache system based on solid-state disk
CN104881333A (en) * 2014-02-27 2015-09-02 国际商业机器公司 Storage system and method for using same
CN105808622A (en) * 2014-12-31 2016-07-27 乐视网信息技术(北京)股份有限公司 File storage method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180088846A1 (en) * 2016-09-27 2018-03-29 Seagate Technology Llc Multi-user dynamic storage allocation and encryption



Similar Documents

Publication Publication Date Title
US11010681B2 (en) Distributed computing system, and data transmission method and apparatus in distributed computing system
EP3889774A1 (en) Heterogeneous computing-based task processing method and software-hardware framework system
CN106991011B (en) CPU multithreading and GPU (graphics processing unit) multi-granularity parallel and cooperative optimization based method
CN108647104B (en) Request processing method, server and computer readable storage medium
CN110245024B (en) Dynamic allocation system and method for static storage blocks
US9274904B2 (en) Software only inter-compute unit redundant multithreading for GPUs
TW201413456A (en) Method and system for processing nested stream events
JP2014504416A (en) Device discovery and topology reporting in combined CPU / GPU architecture systems
US9378533B2 (en) Central processing unit, GPU simulation method thereof, and computing system including the same
CN111667542B (en) Decompression technique for processing compressed data suitable for artificial neural network
CN113034629B (en) Image processing method, image processing device, computer equipment and storage medium
CN112162854A (en) Method, system and medium for scheduling calculation tasks between CPU-GPU
US9041719B2 (en) Method and system for transparently directing graphics processing to a graphical processing unit (GPU) of a multi-GPU system
TW201337829A (en) Shaped register file reads
KR20140001970A (en) Device discovery and topology reporting in a combined cpu/gpu architecture system
CN110347450B (en) Multi-stream parallel control system and method thereof
CN103218259A (en) Computer-implemented method for selection of a processor, which is incorporated in multiple processors to receive work, which relates to an arithmetic problem
CN115525417A (en) Data communication method, communication system, and computer-readable storage medium
CN114816777A (en) Command processing device, method, electronic device and computer readable storage medium
US10198784B2 (en) Capturing commands in a multi-engine graphics processing unit
CN111240745A (en) Enhanced scalar vector dual pipeline architecture for interleaved execution
US20110271060A1 (en) Method And System For Lockless Interprocessor Communication
CN114371920A (en) Network function virtualization system based on graphic processor accelerated optimization
CN115378937A (en) Distributed concurrency method, device and equipment for tasks and readable storage medium
JP4964219B2 (en) Image processing apparatus, method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant