CN111666150B - Storage space allocation method and device, terminal and computer readable storage medium - Google Patents

Storage space allocation method and device, terminal and computer readable storage medium

Info

Publication number
CN111666150B
CN111666150B (application CN202010390297.2A)
Authority
CN
China
Prior art keywords
storage space
layer combination
target
combination
size
Prior art date
Legal status
Active
Application number
CN202010390297.2A
Other languages
Chinese (zh)
Other versions
CN111666150A (en)
Inventor
文博
曹庆新
Current Assignee
Shenzhen Intellifusion Technologies Co Ltd
Original Assignee
Shenzhen Intellifusion Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Intellifusion Technologies Co Ltd filed Critical Shenzhen Intellifusion Technologies Co Ltd
Priority to CN202010390297.2A priority Critical patent/CN111666150B/en
Publication of CN111666150A publication Critical patent/CN111666150A/en
Priority to PCT/CN2021/088444 priority patent/WO2021227789A1/en
Application granted granted Critical
Publication of CN111666150B publication Critical patent/CN111666150B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources to service a request
    • G06F 9/5011 Allocation of resources to service a request, the resources being hardware resources other than CPUs, servers and terminals
    • G06F 9/5016 Allocation of resources, the resource being the memory
    • G06F 9/5022 Mechanisms to release resources
    • G PHYSICS
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Abstract

The present application belongs to the technical field of data storage, and in particular relates to a storage space allocation method and apparatus, a terminal, and a computer-readable storage medium. The method includes: traversing the input layer combinations and output layer combinations of each layer combination of a convolutional neural network to obtain a first target layer combination; determining whether a first target input layer combination with allocated storage space exists among the plurality of input layer combinations included in the output layer combination of the first target layer combination; and, if such a first target input layer combination exists, allocating the storage space recorded for the first target input layer combination in a storage space mapping table to the first target layer combination as well, thereby simplifying the software programming complexity of a convolutional neural network processor.

Description

Storage space allocation method and device, terminal and computer readable storage medium
Technical Field
The present application belongs to the technical field of data storage, and in particular, to a method, an apparatus, a terminal and a computer-readable storage medium for allocating storage space.
Background
A convolutional neural network (CNN) is composed of basic layers (Layers), each of which corresponds to an operation. The operation types may include convolution (Convolution), pooling (Pooling), element-wise operations (Element-Wise), concatenation (Concatenate), fully-connected operations (Fully-Connected), batch normalization (Batch-Normalization), and the like.
A Neural Network Processor (NNP) is a processor dedicated to performing convolutional neural network computational tasks. However, the software programming complexity of current convolutional neural network processors is generally high.
Disclosure of Invention
The embodiment of the application provides a storage space allocation method, a storage space allocation device, a terminal and a computer readable storage medium, which can simplify the software programming complexity of a convolutional neural network processor.
A first aspect of an embodiment of the present application provides a method for allocating a storage space, including:
traversing the input layer combination and the output layer combination of each layer combination of the convolutional neural network to obtain a first target layer combination; an output layer combination of the first target layer combination comprises a plurality of input layer combinations;
allocating storage space for each of the first target layer combinations;
wherein allocating storage space for each of the first target layer combinations comprises:
determining whether a first target input layer combination with allocated storage space exists among the plurality of input layer combinations included in the output layer combination of the first target layer combination;
if there is a first target input layer combination with allocated storage space among the plurality of input layer combinations included in the output layer combination of the first target layer combination, allocating the storage space recorded for the first target input layer combination in a storage space mapping table to the first target layer combination as well, recording the space size required when the storage space of the first target input layer combination is used once as size1 + size2, and recording the space size required when the storage space of the first target input layer combination is reused as the larger of size1 + size2 and size_max1; where size1 is the space size historically required when the storage space of the first target input layer combination is used once, size2 is the space size occupied by the calculation result of the first target layer combination, and size_max1 is the space size historically required when the storage space of the first target input layer combination is reused.
A second aspect of the embodiments of the present application provides an apparatus for allocating storage space, including:
the traversal unit is used for traversing the input layer combination and the output layer combination of each layer combination of the convolutional neural network to obtain a first target layer combination; an output layer combination of the first target layer combination comprises a plurality of input layer combinations;
an allocation unit for allocating a storage space for each of the first target layer combinations;
the allocation unit, when allocating storage space for each of the first target layer combinations, is further configured to:
determining whether a first target input layer combination with allocated storage space exists among the plurality of input layer combinations included in the output layer combination of the first target layer combination;
if there is a first target input layer combination with allocated storage space among the plurality of input layer combinations included in the output layer combination of the first target layer combination, allocating the storage space recorded for the first target input layer combination in a storage space mapping table to the first target layer combination as well, recording the space size required when the storage space of the first target input layer combination is used once as size1 + size2, and recording the space size required when the storage space of the first target input layer combination is reused as the larger of size1 + size2 and size_max1; where size1 is the space size historically required when the storage space of the first target input layer combination is used once, size2 is the space size occupied by the calculation result of the first target layer combination, and size_max1 is the space size historically required when the storage space of the first target input layer combination is reused.
A third aspect of the embodiments of the present application provides a terminal, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the method when executing the computer program.
A fourth aspect of the embodiments of the present application provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the steps of the above method.
In an embodiment of the present application, a first target layer combination whose output layer combination includes a plurality of input layer combinations is obtained by traversing the input layer combinations and output layer combinations of each layer combination of the convolutional neural network. When storage space is allocated for each first target layer combination, it is determined whether a first target input layer combination with allocated storage space exists among the plurality of input layer combinations included in the output layer combination of the first target layer combination. If such a first target input layer combination exists, the storage space recorded for it in the storage space mapping table is also allocated to the first target layer combination. That is, the storage space of the first target input layer combination is shared with the first target layer combination, so that a single piece of storage space simultaneously holds the calculation results of the plurality of input layer combinations included in the output layer combination of the first target layer combination. The convolutional neural network processor can therefore read data from the same piece of storage space when executing an operation with multiple input layer combinations, instead of reading from several pieces of storage space, which simplifies the software programming complexity of the convolutional neural network processor.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings only illustrate some embodiments of the present application and should not be considered as limiting the scope; those skilled in the art can obtain other related drawings from them without inventive effort.
FIG. 1 is a first structural schematic diagram of data input-output relationships between convolutional neural network layer combinations provided by an embodiment of the present application;
FIG. 2 is a diagram illustrating a first result of allocating storage space by using an existing storage space allocation method;
fig. 3 is a schematic implementation flow chart of a method for allocating a storage space according to an embodiment of the present application;
fig. 4 is a flowchart illustrating a specific implementation of step 302 of a method for allocating a storage space according to an embodiment of the present application;
fig. 5 is a flowchart illustrating a specific implementation of step 403 of a method for allocating a storage space according to an embodiment of the present application;
fig. 6 is a schematic flowchart of a specific implementation of freeing a storage space according to an embodiment of the present application;
FIG. 7 is a diagram illustrating a first result of allocating a storage space according to the storage space allocation method of the present application;
FIG. 8 is a second structural diagram of data input output relationships between convolutional neural network layer combinations provided by embodiments of the present application;
FIG. 9 is a diagram illustrating a second result of allocating storage space using an existing storage space allocation method;
FIG. 10 is a diagram illustrating a second result of allocating storage space using the storage space allocation method of the present application;
FIG. 11 is a schematic structural diagram of an apparatus for allocating storage space according to an embodiment of the present application;
fig. 12 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application.
A convolutional neural network (CNN) is composed of basic layers (Layers), each of which corresponds to an operation. The operation types may include convolution (Convolution), pooling (Pooling), element-wise operations (Element-Wise), concatenation (Concatenate), fully-connected operations (Fully-Connected), batch normalization (Batch-Normalization), and the like.
A neural network processor (NNP) is a processor dedicated to executing convolutional neural network computation tasks. A compiler matched with the neural network processor compiles a convolutional neural network model into machine code that can execute the computation tasks on the neural network processor. To reduce the neural network processor's bandwidth requirements on memory other than its local memory, the compiler tries, when partitioning the convolutional neural network, to keep the result of each Layer in the local memory of the neural network processor. A plurality of consecutive Layers form a layer combination (Layer-Group); Layer-Groups exchange data through memory other than the local memory of the neural network processor, while Layers inside a Layer-Group exchange data through the local memory of the neural network processor. Allocating storage space in memory other than the local memory of the neural network processor to each Layer-Group is the work that the compiler's memory management needs to do.
Some operations in these convolutional neural networks have only one input, while others have multiple inputs. For example, Element-Wise typically has two inputs, and Concatenate has two or more. Therefore, when a plurality of consecutive Layers form one layer combination (Layer-Group) and the operation type of the first Layer of the Layer-Group is a multi-input type such as Element-Wise or Concatenate, the Layer-Group has a plurality of input layer combinations, and the Layer-Group is an output layer combination corresponding to each of those input layer combinations.
For example, as shown in FIG. 1, the operation of the first Layer of Layer-Group n+1 is Element-Wise, and the two inputs of the Element-Wise operation come from Layer-Group n-1 and Layer-Group n, respectively. Layer-Group n+1 therefore has the two input layer combinations Layer-Group n-1 and Layer-Group n, and Layer-Group n+1 is the output layer combination of both input layer combinations.
In practical applications, if the calculation results of Layer-Group n-1 and Layer-Group n are stored in different storage spaces, for example BUF0 and BUF1 as shown in FIG. 2, the data in both BUF0 and BUF1 must be read during the Element-Wise calculation, and the calculation result of Layer-Group n+1 must be stored in an unoccupied storage space BUF2. When the Element-Wise calculation runs for multiple rounds, this data storage mode requires alternately reading data from the two different storage spaces BUF0 and BUF1, so the software coding of the neural network processor is highly complex.
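As a rough illustration of this contrast, the two sketches below compare reading Element-Wise operands from two separate buffers with reading them from one shared buffer. The function names and the list-based buffer layout are illustrative assumptions, not the patent's implementation:

```python
def elementwise_two_buffers(buf0, buf1):
    """Operands in two separate storage spaces (BUF0, BUF1): every round
    must read from both buffers, i.e. manage two address streams."""
    return [x + y for x, y in zip(buf0, buf1)]

def elementwise_shared_buffer(buf, size1):
    """Operands in one shared storage space laid out as
    [result of Layer-Group n-1 | result of Layer-Group n]:
    a single base address plus fixed intra-buffer offsets suffices."""
    return [buf[i] + buf[size1 + i] for i in range(size1)]
```

Both compute the same element-wise sum; the shared-buffer variant needs only one storage space, which is the property the allocation method below establishes.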
Based on this, the embodiments of the present application provide a method, an apparatus, a terminal, and a computer-readable storage medium for allocating a storage space, which can simplify the software programming complexity of a convolutional neural network processor.
In order to explain the technical means of the present application, the following description will be given by way of specific examples.
Fig. 3 is a schematic flow chart illustrating an implementation of a storage space allocation method provided by an embodiment of the present application. The method is applied to a terminal, can be executed by a storage space allocation apparatus configured on the terminal, and is suitable for situations where the software programming complexity of a convolutional neural network processor needs to be simplified. The terminal may be an intelligent terminal such as a computer or a server. The storage space allocation method may include steps 301 to 302.
Step 301, traversing the input layer combination and the output layer combination of each layer combination of the convolutional neural network to obtain a first target layer combination. The output layer combination of the first target layer combination comprises a plurality of input layer combinations.
In practical applications, the above process of traversing the input layer combinations and output layer combinations of each layer combination of the convolutional neural network may include: traversing the convolutional neural network to determine from which layer combinations the input data of each layer combination comes, thereby obtaining the input layer combinations and output layer combinations of each layer combination of the convolutional neural network, a first target layer combination whose output layer combination includes a plurality of input layer combinations, and a second target layer combination whose output layer combination includes only one input layer combination.
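This traversal step can be sketched in Python as follows. The `LayerGroup` structure and the function name are assumptions made for illustration, not structures defined by the patent:

```python
from dataclasses import dataclass, field

@dataclass
class LayerGroup:
    name: str
    inputs: list = field(default_factory=list)  # layer combinations feeding this one

def classify(groups):
    """Split layer combinations into first targets (some output layer
    combination consuming them has multiple input layer combinations,
    e.g. one whose first Layer is Element-Wise) and second targets
    (every consumer has a single input layer combination)."""
    first, second = [], []
    for g in groups:
        # output layer combinations of g = the groups that consume g's result
        consumers = [o for o in groups if g in o.inputs]
        if any(len(o.inputs) > 1 for o in consumers):
            first.append(g)
        else:
            second.append(g)
    return first, second
```

In the FIG. 1 example, Layer-Group n-1 and Layer-Group n would be first target layer combinations, because their shared output layer combination Layer-Group n+1 has two input layer combinations.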
By traversing the input layer combinations and output layer combinations of each layer combination of the convolutional neural network, the present application obtains a first target layer combination whose output layer combination includes a plurality of input layer combinations and a second target layer combination whose output layer combination includes only one input layer combination. When storage space is allocated for the first target layer combination, the storage space of the other input layer combinations included in the output layer combination corresponding to the first target layer combination can be allocated to the first target layer combination, so that the calculation results of the multiple input layer combinations included in that output layer combination are stored in the same piece of storage space. The convolutional neural network processor can then read data from the same piece of storage space when executing an operation with multiple input layer combinations, which simplifies the software programming complexity of the convolutional neural network processor.
It should be noted that, in the embodiments of the present application, allocating storage space for the first target layer combination refers to allocating storage space in memory other than the local memory of the neural network processor. The type of that memory may include double data rate synchronous dynamic random access memory (DDR SDRAM), synchronous dynamic random access memory (SDRAM), or Rambus dynamic random access memory (RDRAM), which is not limited in this application.
Step 302, allocating a storage space for each of the first target layer combinations.
In the embodiment of the present application, when storage space is allocated for each first target layer combination, the calculation results of the plurality of input layer combinations included in the output layer combination corresponding to that first target layer combination need to be stored in the same piece of storage space. Specifically, as shown in fig. 4, when allocating storage space for each first target layer combination, steps 401 to 402 may be performed.
Step 401, determining whether a first target input layer combination with allocated storage space exists among the plurality of input layer combinations included in the output layer combination of the first target layer combination.
In an embodiment of the present application, by determining whether a first target input layer combination with allocated storage space exists among the plurality of input layer combinations, it is determined whether storage space allocation has already been completed for some of the input layer combinations included in the output layer combination of the first target layer combination.
Step 402, if there is a first target input layer combination with allocated storage space among the plurality of input layer combinations included in the output layer combination of the first target layer combination, allocating the storage space recorded for the first target input layer combination in a storage space mapping table to the first target layer combination as well, recording the space size required when the storage space of the first target input layer combination is used once as size1 + size2, and recording the space size required when the storage space of the first target input layer combination is reused as the larger of size1 + size2 and size_max1.
Here, size1 is the space size historically required when the storage space of the first target input layer combination is used once, size2 is the space size occupied by the calculation result of the first target layer combination, and size_max1 is the space size historically required when the storage space of the first target input layer combination is reused.
The presence of a first target input layer combination with allocated storage space among the plurality of input layer combinations included in the output layer combination of the first target layer combination indicates that storage space allocation has already been completed for some of those input layer combinations. To ensure that the calculation results of the input layer combinations included in the same output layer combination can be stored in the same piece of storage space, the storage space recorded for the first target input layer combination in the storage space mapping table must also be allocated to the first target layer combination. In addition, the space size required when the storage space of the first target input layer combination is used once is recorded as size1 + size2, and the space size required when it is reused is recorded as the larger of size1 + size2 and size_max1, so that each piece of storage space can be divided according to the reuse space size recorded in the storage space mapping table, and the data storage locations inside each piece of storage space can be divided according to the size2 of each newly added space.
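The bookkeeping rule above can be sketched as a small update function on the mapping table. The dictionary layout (`"once"`/`"reuse"` keys) is an assumption for illustration; only the arithmetic follows the text:

```python
def share_storage(table, buf, size2):
    """Update the mapping-table entry for storage space `buf` when a first
    target layer combination is added to the already-allocated space.

    Each entry records two sizes:
      "once"  - space needed when the storage space is used once (size1)
      "reuse" - space needed when the storage space is reused (size_max1)
    """
    entry = table[buf]
    size1, size_max1 = entry["once"], entry["reuse"]
    entry["once"] = size1 + size2                   # record size1 + size2
    entry["reuse"] = max(size1 + size2, size_max1)  # larger of the two values
    return entry
```

For example, starting from an entry with size1 = 100 and size_max1 = 150, adding a result of size2 = 30 yields a single-use size of 130 while the reuse size stays at 150; adding a further result of size2 = 40 raises both to 170.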
Optionally, in some embodiments of the present application, as shown in fig. 4, after step 401, a step 403 may be further included: allocating unoccupied storage space to the first target layer combination if no first target input layer combination with allocated storage space exists among the plurality of input layer combinations included in the output layer combination of the first target layer combination.
Since the absence of a first target input layer combination with allocated storage space among the plurality of input layer combinations included in the output layer combination of the first target layer combination indicates that no storage space has been allocated to any of those input layer combinations, any unoccupied storage space can be allocated to the first target layer combination to complete its storage space allocation.
For example, as shown in FIG. 1, the output layer combination Layer-Group n+1 corresponding to Layer-Group n-1 includes the two input layer combinations Layer-Group n-1 and Layer-Group n. When the first target layer combination is Layer-Group n-1 and storage space is allocated for it, the corresponding output layer combination Layer-Group n+1 includes the two input layer combinations Layer-Group n-1 and Layer-Group n, and no first target input layer combination with allocated storage space exists, so any unoccupied storage space BUF m can be allocated to the first target layer combination Layer-Group n-1.
When the first target layer combination is Layer-Group n and storage space is allocated for it, the corresponding output layer combination Layer-Group n+1 includes Layer-Group n-1 and Layer-Group n, and the first target input layer combination Layer-Group n-1 with allocated storage space exists, so the storage space BUF m allocated to Layer-Group n-1 must also be allocated to Layer-Group n. That is, BUF m is recorded in the storage space mapping table as allocated to Layer-Group n as well, to ensure that the calculation results of the two input layer combinations (Layer-Group n-1 and Layer-Group n) included in the same output layer combination Layer-Group n+1 can be stored in the same piece of storage space BUF m. The convolutional neural network processor can then read the calculation results of Layer-Group n-1 and Layer-Group n from the same piece of storage space BUF m, rather than from two pieces of storage space, when executing the multi-input operation of Layer-Group n+1, which simplifies the software programming complexity of the convolutional neural network processor.
Specifically, when BUF m is also allocated to Layer-Group n in the storage space mapping table, the space size required when the storage space of the first target input layer combination is used once must be recorded as size1 + size2, and the space size required when it is reused must be recorded as the larger of size1 + size2 and size_max1.
Here, size1 is the space size historically required when the storage space of the first target input layer combination Layer-Group n-1 is used once, i.e., the space occupied by the calculation result of Layer-Group n-1; size2 is the space occupied by the calculation result of the first target layer combination Layer-Group n; and size_max1 is the space size historically required when the storage space of the first target input layer combination is reused. Thus, each piece of storage space can be divided according to the space size required when it is reused, and the data storage locations inside each piece of storage space can be divided according to the size2 of each newly added space.
For example, from the storage space allocation information recorded in the storage space mapping table, the space size of BUF m is the larger of size1 + size2 and size_max1; the calculation result of Layer-Group n-1 may be stored at the address base address + relative address offset of BUF m; and the calculation result of Layer-Group n may be stored at base address + relative address offset of BUF m + size1. The base address refers to the starting hardware address at which layer combination calculation results are stored, and the relative address offset of BUF m refers to the offset of the address of BUF m relative to the base address.
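The address arithmetic in this example can be sketched as follows; the numeric values for the base address, the offset of BUF m, and size1 are hypothetical, chosen only to make the computation concrete:

```python
def data_address(base, buf_offset, preceding_sizes):
    """Hardware address of one calculation result inside a shared buffer:
    base address + relative address offset of the buffer + sizes of the
    results already stored ahead of it in the same buffer."""
    return base + buf_offset + sum(preceding_sizes)

base, buf_m_offset, size1 = 0x8000_0000, 0x1000, 0x400  # assumed values
addr_prev = data_address(base, buf_m_offset, [])        # Layer-Group n-1
addr_next = data_address(base, buf_m_offset, [size1])   # Layer-Group n
```

With these values, Layer-Group n-1 lands at 0x80001000 and Layer-Group n immediately after it at 0x80001400, so both operands of the Element-Wise operation are reachable from one base address.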
In the embodiments of the present application, when storage space is allocated to each first target layer combination, the calculation results of the multiple input layer combinations contained in the output layer combination corresponding to the first target layer combination are stored in the same piece of storage space. As a result, when the convolutional neural network processor executes an operation involving multiple input layer combinations, it can read the data from a single piece of storage space instead of from several, which simplifies the software programming of the convolutional neural network processor.
In some embodiments of the present application, in order to reduce memory fragmentation of the storage space, as shown in fig. 5, step 403 of allocating the unoccupied storage space to the first target layer combination may include steps 501 to 502.
Step 501, searching the storage space mapping table, and judging whether the storage space mapping table is empty.
Step 502, if the storage space mapping table is empty, recording in the storage space mapping table that the storage space with a relative address offset of 0 has been allocated to the first target layer combination; recording the space required when the storage space with a relative address offset of 0 is used once as size2; and recording the space required when it is reused as the larger of size2 and size_max2, where size_max2 is the storage space historically required when the storage space with a relative address offset of 0 is reused.
In this embodiment, to reduce memory fragmentation of the storage space, new storage spaces are allocated sequentially when storage space is allocated for each first target layer combination and each second target layer combination. That is, allocation starts from a first piece of storage space with a relative address offset of 0; a second piece with a relative address offset of 0 + A is then allocated to the corresponding layer combination; a third piece with a relative address offset of 0 + A + B follows, where A is the size of the first piece and B is the size of the second piece; and so on. No storage space is therefore wasted between adjacent pieces, which reduces memory fragmentation. A storage space with a relative address offset of 0 is one whose address offset relative to the base address is 0.
Specifically, the present application determines whether the storage space with a relative address offset of 0 has already been allocated to one or more layer combinations by checking whether the storage space mapping table is empty. If the table is empty, the storage space with a relative address offset of 0 is allocated to the first target layer combination: the table records that this storage space has been allocated to the first target layer combination, the space required when it is used once is recorded as size2, and the space required when it is reused is recorded as the larger of size2 and size_max2.
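The sequential, back-to-back allocation order above can be sketched as follows. This is an illustrative sketch under assumed names (`allocate_sequential`, list-of-dicts mapping table) and made-up sizes, not the patented implementation.

```python
def next_relative_offset(mapping_table):
    """Offset for a new buffer: 0 if the table is empty,
    otherwise the end of the most recently allocated buffer."""
    if not mapping_table:
        return 0
    last = mapping_table[-1]
    return last["offset"] + last["size_reuse"]

def allocate_sequential(mapping_table, layer_group, size2, size_max2=0):
    """Append a new buffer record so consecutive buffers leave no gap."""
    offset = next_relative_offset(mapping_table)
    mapping_table.append({
        "layer_group": layer_group,
        "offset": offset,
        "size_once": size2,
        "size_reuse": max(size2, size_max2),
    })
    return offset

table = []
allocate_sequential(table, "Layer-Group 0", size2=100)  # offset 0
allocate_sequential(table, "Layer-Group 1", size2=50)   # offset 0 + A = 100
allocate_sequential(table, "Layer-Group 2", size2=30)   # offset 0 + A + B = 150
```

Because each new offset is exactly the end of the previous buffer, no unused gap exists between adjacent pieces, which is the fragmentation-reduction property the text describes.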
In some embodiments of the present application, as shown in fig. 5, after step 501, steps 503 to 504 may be further included.
Step 503, if the storage space mapping table is not empty, determining from the storage space mapping table whether a released storage space exists.
Step 504, if a released storage space exists, recording in the storage space mapping table that the released storage space has been allocated to the first target layer combination; recording the space required when the released storage space is used once as size2; and recording the space required when it is reused as the larger of size2 and size_max3, where size_max3 is the storage space historically required when the released storage space is reused.
Since the storage space mapping table is not empty, the storage space with a relative address offset of 0 has already been allocated to one or more layer combinations, so other unoccupied storage space must be found and allocated to the first target layer combination.
To further reduce memory fragmentation, when searching for other unoccupied storage space, whether a released storage space exists may first be determined from the storage space mapping table, so that any released storage space is preferentially allocated to the first target layer combination. That is, the storage space mapping table records that the released storage space has been allocated to the first target layer combination, the space required when the released storage space is used once is recorded as size2, and the space required when it is reused is recorded as the larger of size2 and size_max3, where size_max3 is the storage space historically required when the released storage space is reused.
It should be noted that the storage space historically required when a released storage space is reused (size_max3) may be larger or smaller than size2. If size_max3 is larger than size2 and the reuse requirement were recorded simply as size2 rather than as the larger of size2 and size_max3, then, when each piece of storage space is sized according to the reuse requirement recorded in the storage space mapping table, the released storage space would be too small for the layer combination that occupied it before it was released. Therefore, when the released storage space is allocated to the first target layer combination, the space required when it is reused must be recorded as the larger of size2 and size_max3. Similarly, in step 402 above, when the storage space mapping table records that the storage space of the first target input layer combination is simultaneously allocated to the first target layer combination, the space required when that storage space is reused must be recorded as the larger of size1 + size2 and size_max1.
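The max rule above can be sketched concisely. The function name `reuse_freed_space` and the record layout are illustrative assumptions; the point is only that the reuse size never shrinks below the buffer's historical requirement.

```python
def reuse_freed_space(entry, size2):
    """Reallocate a released buffer record to a new layer combination.

    entry["size_reuse"] plays the role of size_max3 (historical reuse
    requirement); it must never shrink below history, or earlier users'
    data layout would no longer fit.
    """
    size_max3 = entry["size_reuse"]
    entry["size_once"] = size2
    entry["size_reuse"] = max(size2, size_max3)  # the larger value wins
    entry["freed"] = False
    return entry

small = {"size_reuse": 200, "freed": True}
reuse_freed_space(small, size2=80)    # history (200) dominates
large = {"size_reuse": 200, "freed": True}
reuse_freed_space(large, size2=300)   # new requirement (300) dominates
```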
Alternatively, so that released storage space is available when unoccupied storage space is allocated to a first target layer combination, as shown in fig. 6, when storage space is allocated for the first target layer combinations by the allocation methods described above, the following steps 601 to 602 may be performed after each allocation of storage space for a first target layer combination is completed.
Step 601, determining whether there is a first target output layer combination with unallocated storage space in the output layer combinations of the input layer combinations of the first target layer combination.
In an embodiment of the present application, whether the calculation result of an input layer combination of the first target layer combination has been read by its output layer combinations is determined by determining whether a first target output layer combination with unallocated storage space exists among the output layer combinations of the input layer combinations of the first target layer combination.
Step 602, if no first target output layer combination with unallocated storage space exists among the output layer combinations of the input layer combinations of the first target layer combination, marking in the storage space mapping table that the storage space occupied by the input layer combinations of the first target layer combination has been released.
When no first target output layer combination with unallocated storage space exists among the output layer combinations of the input layer combinations of the first target layer combination, the calculation results of those input layer combinations have been read by their output layer combinations, and those output layer combinations have completed their calculations using the results. In other words, the data stored in the storage space occupied by the input layer combinations of the first target layer combination has been fully consumed, so that storage space can be marked as released and reallocated to other layer combinations of the convolutional neural network that have not yet been allocated storage space.
In some embodiments of the present application, when a first target output layer combination with unallocated storage space does exist among the output layer combinations of the input layer combinations of the first target layer combination, the calculation results of those input layer combinations still need to be read by that first target output layer combination, and the storage space they occupy therefore cannot yet be marked as released.
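The release check in steps 601 to 602 can be sketched as a predicate over the layer-combination graph. The graph representation (a dict mapping each layer combination to its output layer combinations) and the names are assumptions for illustration, not the patented data structure.

```python
def can_free(input_layer_group, output_groups_of, allocated):
    """A buffer may be marked released only when every output layer
    combination of its producer has itself been allocated storage space,
    i.e. no consumer of the data is still pending."""
    return all(out in allocated
               for out in output_groups_of[input_layer_group])

# Toy graph: Layer-Group n feeds both Layer-Group n+1 and Layer-Group n+2,
# but only Layer-Group n+1 has been allocated storage so far.
outputs = {
    "Layer-Group n-1": ["Layer-Group n+1"],
    "Layer-Group n": ["Layer-Group n+1", "Layer-Group n+2"],
}
allocated = {"Layer-Group n+1"}
```

Under these toy inputs, Layer-Group n-1's buffer may be released (its sole consumer is allocated), while Layer-Group n's buffer must stay: Layer-Group n+2 has not yet been allocated and still needs to read it.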
It should be noted that, in some embodiments of the present application, when the storage space mapping table is determined not to be empty, the unoccupied storage space with the smallest relative address offset may also be allocated to the first target layer combination. Likewise, when it is determined that no released storage space exists, the unoccupied storage space with the smallest relative address offset may also be allocated to the first target layer combination.
In each of the above-described embodiments, the storage space allocation method may further include: traversing the input layer combinations and output layer combinations of each layer combination of the convolutional neural network to obtain second target layer combinations, where the output layer combination of a second target layer combination contains only one input layer combination. Correspondingly, the method may further include: when allocating storage space for each second target layer combination, allocating unoccupied storage space to the second target layer combination.
Specifically, in order to reduce memory fragmentation of the storage space, allocating the unoccupied storage space to the second target layer combination may include: searching the storage space mapping table and determining whether it is empty; if the storage space mapping table is empty, recording in the table that the storage space with a relative address offset of 0 has been allocated to the second target layer combination; recording the space required when that storage space is used once as size3; and recording the space required when it is reused as the larger of size3 and size_max2, where size3 is the storage space occupied by the calculation result of the second target layer combination.
Similarly, in some embodiments of the present application, when storage space is allocated for each second target layer combination and the storage space mapping table is not empty, whether a released storage space exists may be determined from the table. If a released storage space exists, the table records that the released storage space has been allocated to the second target layer combination, the space required when the released storage space is used once is recorded as size3, and the space required when it is reused is recorded as the larger of size3 and size_max3, where size_max3 is the storage space historically required when the released storage space is reused. After the unoccupied storage space is allocated to the second target layer combination, the method may further include: determining whether a second target output layer combination with unallocated storage space exists among the output layer combinations of the input layer combinations of the second target layer combination; and, if not, marking in the storage space mapping table that the storage space occupied by the input layer combinations of the second target layer combination has been released.
In the embodiments of the present application, after storage space has been allocated to each layer combination of the convolutional neural network by the allocation methods described in the above embodiments, each piece of storage space and the data storage positions inside it can be laid out according to the space required when that storage space is used once and the space required when it is reused, both recorded in the storage space mapping table. When the convolutional neural network processor executes an operation involving multiple input layer combinations, the calculation results of those input layer combinations can be read from the same piece of storage space, which simplifies the software programming of the processor; when the operation is Concatenate, it can be skipped entirely, improving data access efficiency and saving storage space.
For example, as shown in fig. 7, the two storage spaces recorded in the storage space mapping table are BUF0 and BUF1. BUF0 is simultaneously allocated to Layer-Group n-1 and Layer-Group n as shown in fig. 1; the space required when BUF0 is used once is recorded as size4 + size5, and the space required when it is reused is recorded as size_max4. BUF1 is allocated to Layer-Group n+1 as shown in fig. 1; the space required when BUF1 is used once is recorded as size6, and the space required when it is reused is recorded as size_max5. Therefore, when each piece of storage space and the data storage positions inside it are laid out according to the recorded sizes, BUF0 is sized to size_max4 and BUF1 to size_max5; the calculation result of Layer-Group n-1 is stored at base address + relative address offset of BUF0; the calculation result of Layer-Group n is stored at base address + relative address offset of BUF0 + size4; and the calculation result of Layer-Group n+1 is stored at base address + relative address offset of BUF1. When the convolutional neural network processor executes the Element-Wise operation of the first layer of Layer-Group n+1, it only needs to read the data stored in BUF0, rather than reading from two different storage spaces, which simplifies the software programming of the processor.
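The address arithmetic of the fig. 7 example can be worked through with made-up numbers. All sizes, the base address, and the variable names below are assumptions for illustration; only the layout rules (buffers sized by their reuse requirement, results placed back to back) come from the text.

```python
base = 0x8000                     # assumed starting hardware address
size4, size5, size6 = 64, 64, 96  # assumed result sizes
size_max4, size_max5 = 160, 96    # assumed historical reuse requirements

# Buffers are laid out sequentially; each is sized by its reuse requirement.
buf0_offset = 0
buf1_offset = buf0_offset + size_max4   # BUF1 starts where BUF0 ends

# Results inside BUF0 sit back to back: Layer-Group n-1 first, then
# Layer-Group n at offset size4.
addr_lg_n_minus_1 = base + buf0_offset
addr_lg_n = base + buf0_offset + size4
addr_lg_n_plus_1 = base + buf1_offset
```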
As another example, as shown in fig. 8, the operation of the first layer of Layer-Group n+1 is Concatenate, whose two inputs come from Layer-Group n-1 and Layer-Group n, and whose calculation result is output to Layer-Group n+2. If the calculation results of Layer-Group n-1 and Layer-Group n are stored in different storage spaces, for example storage spaces BUF0 and BUF1 as shown in fig. 9, then when performing the Concatenate calculation the convolutional neural network processor needs to read the data of BUF0 and BUF1 and place the calculation result of Layer-Group n+1 into the unoccupied storage space BUF2; when performing the operation of Layer-Group n+2, it then needs to read the Concatenate result from BUF2 and store its own result in the released storage space BUF0.
When storage space is allocated by the allocation method provided in the embodiments of the present application, the output layer combination of Layer-Group n-1 and Layer-Group n is Layer-Group n+1, which contains these two input layer combinations, so the same piece of storage space is allocated to Layer-Group n-1 and Layer-Group n. For example, as shown in fig. 10, the calculation results of Layer-Group n-1 and Layer-Group n are stored in the same piece of storage space BUF0; when the convolutional neural network processor performs the operation of Layer-Group n+2, it can read the calculation results in BUF0 directly and contiguously and store its result in the unoccupied storage space BUF1. This eliminates the Concatenate transport step and the storage space it would occupy, improves data access efficiency, and simplifies the software programming of the convolutional neural network processor.
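The contrast between fig. 9 and fig. 10 can be shown with toy buffer contents (the lists below stand in for real tensor data and are purely illustrative): when the two Concatenate inputs already sit back to back in one buffer, the Concatenate copy step disappears.

```python
# fig. 9 layout: inputs in separate buffers, so Concatenate must
# physically copy both into a third buffer before Layer-Group n+2 runs.
buf0_separate = [1, 2]                 # result of Layer-Group n-1
buf1_separate = [3, 4]                 # result of Layer-Group n
buf2 = buf0_separate + buf1_separate   # explicit Concatenate transport step

# fig. 10 layout: the two results were written back to back into BUF0
# at allocation time, so Layer-Group n+2 reads them contiguously and the
# transport step (and BUF2 itself) is never needed.
buf0_shared = [1, 2, 3, 4]
```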
It should be noted that, for simplicity of description, the foregoing method embodiments are each presented as a series of acts; however, those skilled in the art will appreciate that the present invention is not limited by the described order of acts, as some steps may, in accordance with the present invention, be performed in other orders.
Fig. 11 shows a schematic structural diagram of an allocation apparatus 1100 for a storage space according to an embodiment of the present application, which includes a traversal unit 1101 and an allocation unit 1102.
A traversal unit 1101, configured to traverse the input layer combination and the output layer combination of each layer combination of the convolutional neural network to obtain a first target layer combination; an output layer combination of the first target layer combination comprises a plurality of input layer combinations;
an allocation unit 1102 for allocating a storage space for each of the first target layer combinations;
the allocation unit, when allocating storage space for each of the first target layer combinations, is further configured to:
determining whether a first target input layer combination of allocated storage space exists among a plurality of input layer combinations included in an output layer combination of the first target layer combination;
if a first target input layer combination with allocated storage space exists among the multiple input layer combinations contained in the output layer combination of the first target layer combination, simultaneously allocate the storage space recorded for the first target input layer combination in the storage space mapping table to the first target layer combination; record the space required when the storage space of the first target input layer combination is used once as size1 + size2; and record the space required when it is reused as the larger of size1 + size2 and size_max1, where size1 is the storage space historically required when the storage space of the first target input layer combination is used once, size2 is the storage space occupied by the calculation result of the first target layer combination, and size_max1 is the storage space historically required when the storage space of the first target input layer combination is reused.
In some embodiments of the present application, the allocating unit 1102 is further configured to, after the determining whether there is a first target input layer combination of allocated storage space in the plurality of input layer combinations included in the output layer combination of the first target layer combination, allocate an unoccupied storage space to the first target layer combination if there is no first target input layer combination of allocated storage space in the plurality of input layer combinations included in the output layer combination of the first target layer combination.
In some embodiments of the present application, the allocating unit 1102 is further configured to search the storage space mapping table and determine whether it is empty; if the storage space mapping table is empty, record in the table that the storage space with a relative address offset of 0 has been allocated to the first target layer combination; record the space required when that storage space is used once as size2; and record the space required when it is reused as the larger of size2 and size_max2, where size_max2 is the storage space historically required when the storage space with a relative address offset of 0 is reused.
In some embodiments of the present application, the allocating unit 1102 is further configured to, after determining whether the storage space mapping table is empty, determine from the table whether a released storage space exists if the table is not empty; if a released storage space exists, record in the storage space mapping table that the released storage space has been allocated to the first target layer combination; record the space required when the released storage space is used once as size2; and record the space required when it is reused as the larger of size2 and size_max3, where size_max3 is the storage space historically required when the released storage space is reused.
In some embodiments of the present application, the allocating unit 1102 is further configured to determine, after each allocation of storage space for a first target layer combination is completed, whether a first target output layer combination with unallocated storage space exists among the output layer combinations of the input layer combinations of the first target layer combination; and, if not, mark in the storage space mapping table that the storage space occupied by the input layer combinations of the first target layer combination has been released.
In some embodiments of the present application, the allocation unit 1102 is further configured to allocate an unoccupied storage space to the second target layer combination.
In some embodiments of the present application, the allocating unit 1102 is further configured to determine, after the unoccupied storage space is allocated to the second target layer combination, whether a second target output layer combination with unallocated storage space exists among the output layer combinations of the input layer combinations of the second target layer combination; and, if not, mark in the storage space mapping table that the storage space occupied by the input layer combinations of the second target layer combination has been released.
It should be noted that, for convenience and brevity of description, the specific working process of the above-described allocation apparatus 1100 for storage space may refer to the corresponding process of the method described in fig. 1 to fig. 10, and is not described herein again.
As shown in fig. 12, the present application provides a terminal for implementing the above-mentioned allocation method of storage space, where the terminal 12 may include: a processor 120, a memory 121, and a computer program 122, such as a memory allocation program, stored in the memory 121 and operable on the processor 120. The processor 120 executes the computer program 122 to implement the steps in the above-mentioned embodiments of the allocation method of the storage space, such as the steps 301 to 302 shown in fig. 3. Alternatively, the processor 120, when executing the computer program 122, implements the functions of each module/unit in each device embodiment, for example, the functions of the units 1101 to 1102 shown in fig. 11.
The computer program may be divided into one or more modules/units, which are stored in the memory 121 and executed by the processor 120 to accomplish the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program in the terminal. For example, the computer program may be partitioned into a traversal unit and an allocation unit, each unit having the following specific functions:
the traversal unit is used for traversing the input layer combination and the output layer combination of each layer combination of the convolutional neural network to obtain a first target layer combination; an output layer combination of the first target layer combination comprises a plurality of input layer combinations;
an allocation unit for allocating a storage space for each of the first target layer combinations;
the allocation unit, when allocating storage space for each of the first target layer combinations, is further configured to:
determining whether a first target input layer combination of allocated storage space exists among a plurality of input layer combinations included in an output layer combination of the first target layer combination;
if a first target input layer combination with allocated storage space exists among the multiple input layer combinations contained in the output layer combination of the first target layer combination, simultaneously allocate the storage space recorded for the first target input layer combination in the storage space mapping table to the first target layer combination; record the space required when the storage space of the first target input layer combination is used once as size1 + size2; and record the space required when it is reused as the larger of size1 + size2 and size_max1, where size1 is the storage space historically required when the storage space of the first target input layer combination is used once, size2 is the storage space occupied by the calculation result of the first target layer combination, and size_max1 is the storage space historically required when the storage space of the first target input layer combination is reused.
The terminal can be a computer, a server and other computing equipment. The terminal may include, but is not limited to, a processor 120, a memory 121. Those skilled in the art will appreciate that fig. 12 is only an example of a terminal and is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or different components, e.g., the terminal may also include input-output devices, network access devices, buses, etc.
The Processor 120 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 121 may be an internal storage unit of the terminal, such as a hard disk or a memory of the terminal. The memory 121 may also be an external storage device of the terminal, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the terminal. Further, the memory 121 may also include both an internal storage unit and an external storage device of the terminal. The memory 121 is used to store the computer program and other programs and data required by the terminal. The memory 121 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal and method may be implemented in other ways. For example, the above-described apparatus/terminal embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
If the integrated modules/units are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow in the methods of the above embodiments may be implemented by a computer program, which may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in the relevant jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals and telecommunications signals in accordance with legislation and patent practice.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A method for allocating storage space, comprising:
traversing the input layer combination and the output layer combination of each layer combination of the convolutional neural network to obtain a first target layer combination in the layer combinations contained in the convolutional neural network; an output layer combination of the first target layer combination comprises a plurality of input layer combinations;
allocating storage space for each of the first target layer combinations;
wherein allocating storage space for each of the first target layer combinations comprises:
determining whether a first target input layer combination with allocated storage space exists among the plurality of input layer combinations included in the output layer combination of the first target layer combination;
if there is a first target input layer combination with allocated storage space among the plurality of input layer combinations included in the output layer combination of the first target layer combination, allocating the storage space of the first target input layer combination to the first target layer combination and recording the allocation in the storage space mapping table, recording the space size required when the storage space of the first target input layer combination is used once as size1 + size2, and recording the space size required when the storage space of the first target input layer combination is reused as the larger of size1 + size2 and size_max1; wherein size1 is the space size historically required when the storage space of the first target input layer combination is used once, size2 is the storage space required for the calculation result of the first target layer combination, and size_max1 is the space size historically required when the storage space of the first target input layer combination is reused.
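The bookkeeping rule of claim 1 can be illustrated with a minimal sketch. The dict-based mapping table and the field names `size_once` and `size_reuse` are assumptions for illustration only; the claim does not fix a concrete data structure.

```python
def share_storage(space_map, target, input_combo, size1, size2, size_max1):
    """Allocate the storage already recorded for `input_combo` to `target`
    as well, updating the two bookkeeping sizes as claim 1 specifies."""
    entry = space_map[input_combo]
    # Space required when the storage is used once: size1 + size2.
    entry["size_once"] = size1 + size2
    # Space required when the storage is reused: the larger of
    # size1 + size2 and the historical reuse size size_max1.
    entry["size_reuse"] = max(size1 + size2, size_max1)
    # `target` now shares the same storage record as its input combination.
    space_map[target] = entry
    return entry

# Hypothetical usage: "convA" already holds storage; "concat" shares it.
space_map = {"convA": {"size_once": 100, "size_reuse": 120}}
entry = share_storage(space_map, "concat", "convA",
                      size1=100, size2=50, size_max1=120)
```

After the call, `size_once` is 150 (size1 + size2) and `size_reuse` is max(150, 120) = 150, mirroring the two recorded quantities in the claim.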
2. The method of allocating storage space of claim 1, wherein after the determining whether there is a first target input layer combination with allocated storage space among the plurality of input layer combinations included in the output layer combination of the first target layer combination, the method further comprises:
if there is no first target input layer combination with allocated storage space among the plurality of input layer combinations included in the output layer combination of the first target layer combination, allocating unoccupied storage space to the first target layer combination.
3. The method of allocating storage space of claim 2, wherein said allocating unoccupied storage space to the first target layer combination comprises:
searching the storage space mapping table, and judging whether the storage space mapping table is empty or not;
if the storage space mapping table is empty, recording in the storage space mapping table that the storage space with a relative address offset of 0 is allocated to the first target layer combination; and recording the space size required when the storage space with the relative address offset of 0 is used once as size2, and the space size required when that storage space is reused as the larger of size2 and size_max2; wherein size_max2 is the space size historically required when the storage space with the relative address offset of 0 is reused.
4. The method for allocating storage space according to claim 3, wherein after said determining whether said storage space mapping table is empty, further comprising:
if the storage space mapping table is not empty, judging whether the released storage space exists according to the storage space mapping table;
if released storage space exists, allocating the released storage space to the first target layer combination and recording in the storage space mapping table the space size required when the released storage space is used once as size2, and the space size required when the released storage space is reused as the larger of size2 and size_max3; wherein size_max3 is the space size historically required when the released storage space is reused.
5. The method of allocating storage space of any one of claims 1-4, wherein after each allocation of storage space for the first target layer combination is completed, the method comprises:
determining whether a first target output layer combination with unallocated storage space exists among the output layer combinations of the input layer combinations of the first target layer combination;
if there is no first target output layer combination with unallocated storage space among the output layer combinations of the input layer combinations of the first target layer combination, marking, in the storage space mapping table, the storage space occupied by the input layer combinations of the first target layer combination as released.
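The release step of claim 5 amounts to a liveness check over the layer-combination graph. In this sketch, `inputs_of` and `outputs_of` are assumed helper mappings describing the graph, and a combination counts as "allocated" once it appears in `space_map`.

```python
def release_inputs(space_map, inputs_of, outputs_of, target):
    """After `target` is allocated, release the storage of each of its
    input layer combinations whose output layer combinations all have
    storage allocated (claim 5)."""
    for inp in inputs_of[target]:
        # Release only when no output combination of this input is still
        # waiting for a storage allocation.
        if all(out in space_map for out in outputs_of[inp]):
            space_map[inp]["released"] = True

# Hypothetical usage: "concat" is the only consumer of "convA"'s output,
# so once "concat" has storage, "convA"'s storage can be released.
space_map = {"convA": {"released": False}, "concat": {}}
inputs_of = {"concat": ["convA"]}
outputs_of = {"convA": ["concat"]}
release_inputs(space_map, inputs_of, outputs_of, "concat")
```

After the call, `space_map["convA"]["released"]` is True, making that region a candidate for reuse under claim 4.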
6. The method of allocating storage space of claim 1, wherein said method of allocating storage space further comprises: traversing the input layer combination and the output layer combination of each layer combination of the convolutional neural network to obtain a second target layer combination; the output layer combination of the second target layer combination comprises only one input layer combination;
allocating unoccupied storage space to each of the second target layer combinations when allocating storage space for the second target layer combination.
7. The method of allocating storage space of claim 6, wherein after the allocating unoccupied storage space to the second target layer combination, the method comprises:
determining whether a second target output layer combination with unallocated storage space exists among the output layer combinations of the input layer combinations of the second target layer combination;
if there is no second target output layer combination with unallocated storage space among the output layer combinations of the input layer combinations of the second target layer combination, marking, in the storage space mapping table, the storage space occupied by the input layer combinations of the second target layer combination as released.
8. An apparatus for allocating storage space, comprising:
the traversal unit is used for traversing the input layer combination and the output layer combination of each layer combination of the convolutional neural network to obtain a first target layer combination in the layer combinations contained in the convolutional neural network; an output layer combination of the first target layer combination comprises a plurality of input layer combinations;
an allocation unit for allocating a storage space for each of the first target layer combinations;
the allocation unit, when allocating storage space for each of the first target layer combinations, is further configured to:
determining whether a first target input layer combination with allocated storage space exists among the plurality of input layer combinations included in the output layer combination of the first target layer combination;
if there is a first target input layer combination with allocated storage space among the plurality of input layer combinations included in the output layer combination of the first target layer combination, allocating the storage space of the first target input layer combination to the first target layer combination and recording the allocation in the storage space mapping table, recording the space size required when the storage space of the first target input layer combination is used once as size1 + size2, and recording the space size required when the storage space of the first target input layer combination is reused as the larger of size1 + size2 and size_max1; wherein size1 is the space size historically required when the storage space of the first target input layer combination is used once, size2 is the storage space required for the calculation result of the first target layer combination, and size_max1 is the space size historically required when the storage space of the first target input layer combination is reused.
9. A terminal comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202010390297.2A 2020-05-09 2020-05-09 Storage space allocation method and device, terminal and computer readable storage medium Active CN111666150B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010390297.2A CN111666150B (en) 2020-05-09 2020-05-09 Storage space allocation method and device, terminal and computer readable storage medium
PCT/CN2021/088444 WO2021227789A1 (en) 2020-05-09 2021-04-20 Storage space allocation method and device, terminal, and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010390297.2A CN111666150B (en) 2020-05-09 2020-05-09 Storage space allocation method and device, terminal and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN111666150A CN111666150A (en) 2020-09-15
CN111666150B true CN111666150B (en) 2022-01-11

Family

ID=72383508

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010390297.2A Active CN111666150B (en) 2020-05-09 2020-05-09 Storage space allocation method and device, terminal and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN111666150B (en)
WO (1) WO2021227789A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111666150B (en) * 2020-05-09 2022-01-11 深圳云天励飞技术股份有限公司 Storage space allocation method and device, terminal and computer readable storage medium
CN112256440B (en) * 2020-12-23 2021-03-09 上海齐感电子信息科技有限公司 Memory management method and device for neural network inference

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2880961B2 (en) * 1996-08-16 1999-04-12 日本電気アイシーマイコンシステム株式会社 Data buffering device and control method thereof
US8826438B2 (en) * 2010-01-19 2014-09-02 Damballa, Inc. Method and system for network-based detecting of malware from behavioral clustering
CN108615077B (en) * 2016-12-09 2021-08-24 杭州海康威视数字技术股份有限公司 Cache optimization method and device applied to deep learning network
CN106874219B (en) * 2016-12-23 2018-11-02 深圳云天励飞技术有限公司 A kind of data dispatching method of convolutional neural networks, system and computer equipment
CN106919918B (en) * 2017-02-27 2022-11-29 腾讯科技(上海)有限公司 Face tracking method and device
CN110245748B (en) * 2018-03-09 2021-07-13 赛灵思电子科技(北京)有限公司 Convolutional neural network implementation method, device, hardware accelerator and storage medium
CN110597616B (en) * 2018-06-13 2022-07-29 华为技术有限公司 Memory allocation method and device for neural network
CN110866589B (en) * 2018-08-10 2023-06-30 阿里巴巴(中国)有限公司 Operation method, device and framework of deep neural network model
CN109886390B (en) * 2019-01-10 2023-11-24 平安科技(深圳)有限公司 Convolutional neural network model optimization method, device, computer equipment and storage medium
CN109976903B (en) * 2019-02-22 2021-06-29 华中科技大学 Deep learning heterogeneous computing method and system based on layer width memory allocation
CN110750351B (en) * 2019-12-20 2020-12-22 安徽寒武纪信息科技有限公司 Multi-core task scheduler, multi-core task scheduling method, multi-core task scheduling device and related products
CN111666150B (en) * 2020-05-09 2022-01-11 深圳云天励飞技术股份有限公司 Storage space allocation method and device, terminal and computer readable storage medium

Also Published As

Publication number Publication date
WO2021227789A1 (en) 2021-11-18
CN111666150A (en) 2020-09-15

Similar Documents

Publication Publication Date Title
CN110149803B (en) Data storage method, system and terminal equipment
CN106407207B (en) Real-time newly-added data updating method and device
CN111666150B (en) Storage space allocation method and device, terminal and computer readable storage medium
CN111813805A (en) Data processing method and device
CN112667405B (en) Information processing method, device, equipment and storage medium
CN112269661B (en) Partition migration method and device based on Kafka cluster
CN111324427A (en) Task scheduling method and device based on DSP
CN111897493B (en) Storage space management method and device, electronic equipment and storage medium
CN109977074B (en) HDFS-based LOB data processing method and device
CN111143240A (en) Image storage method, system and terminal equipment
US9189382B2 (en) Noncontiguous representation of an array
CN115374232A (en) Tensor allocation method, medium, electronic device, and program product
CN113077344B (en) Block chain-based transaction method, device, electronic equipment and storage medium
CN111708715B (en) Memory allocation method, memory allocation device and terminal equipment
CN112269665B (en) Memory processing method and device, electronic equipment and storage medium
CN111352868B (en) Serial port access method, device, terminal equipment and storage medium
CN111679909A (en) Data processing method and device and terminal equipment
CN113051105A (en) Data processing method, device, equipment and storage medium
CN117033002B (en) Memory management method, device, equipment and storage medium
CN115037799B (en) Current limiting method, device, equipment and medium
CN111158605B (en) Method and device for optimizing disk storage policy of operating system and intelligent equipment
JP4668562B2 (en) Memory management program and memory management method
CN116991595B (en) Memory allocation method, device, equipment and medium based on Bitmap
WO2019041826A1 (en) Breakpoint list cleaning method and apparatus, storage medium, and server
CN117648128A (en) Instruction distribution method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 518000 1/F, 17 Building, Shenzhen Dayun Software Town, 8288 Longgang Avenue, Henggang Street, Longgang District, Shenzhen City, Guangdong Province

Applicant after: Shenzhen Yuntian Lifei Technology Co., Ltd.

Address before: 518000 1/F, 17 Building, Shenzhen Dayun Software Town, 8288 Longgang Avenue, Henggang Street, Longgang District, Shenzhen City, Guangdong Province

Applicant before: SHENZHEN INTELLIFUSION TECHNOLOGIES Co.,Ltd.

GR01 Patent grant