CN106529679B - Machine learning method and system


Info

Publication number
CN106529679B
CN106529679B
Authority
CN
China
Prior art keywords
storage
storage space
space
layer
machine learning
Prior art date
Legal status
Active
Application number
CN201610898838.6A
Other languages
Chinese (zh)
Other versions
CN106529679A (en)
Inventor
赵凌
李季檩
Current Assignee
Tencent Technology Shanghai Co Ltd
Original Assignee
Tencent Technology Shanghai Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shanghai Co Ltd filed Critical Tencent Technology Shanghai Co Ltd
Priority to CN201610898838.6A priority Critical patent/CN106529679B/en
Publication of CN106529679A publication Critical patent/CN106529679A/en
Priority to PCT/CN2017/102836 priority patent/WO2018068623A1/en
Application granted granted Critical
Publication of CN106529679B publication Critical patent/CN106529679B/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning

Abstract

The embodiment of the invention discloses a machine learning method and a machine learning system, applied to the technical field of information processing. In the method of this embodiment, the machine learning system allocates a single overall storage space according to the storage values of the storage spaces corresponding to the computing architectures of the respective layers of the machine learning model, so that the computing architectures of the respective layers can reuse the overall storage space in turn. Compared with the prior art, in which a separate storage space must be allocated to each layer of the computing architecture, the method reduces the memory fragmentation caused by multiple allocations and greatly reduces the storage space required by the machine learning model at runtime, so that system performance is improved and the machine learning method can run on terminal devices with limited storage.

Description

Machine learning method and system
Technical Field
The invention relates to the technical field of information processing, in particular to a machine learning method and a machine learning system.
Background
Existing deep learning forward-prediction technology is only suitable for computing platforms with large storage capacity, such as servers or personal computers (PCs) with large memory or video memory. Deep learning uses different models in different application fields, such as face detection, face registration and face recognition; the problems to be solved differ in complexity, and so does the complexity of the corresponding computing architecture.
When the computing platform performs forward prediction with a deep learning method, an independent storage space is allocated to each layer of the computing architecture to store the information that layer needs during computation, and the storage spaces of the layers are independent of one another. However, if the deep learning model is complex, the total memory required is large, and on a platform with limited storage the computing performance degrades or the algorithm cannot run at all. In addition, the computing platform allocates storage for whichever layer is about to execute, so the allocation operation occurs many times and storage fragments form easily.
Disclosure of Invention
The embodiment of the invention provides a machine learning method and a machine learning system, which allocate a single overall storage space for a machine learning model so that each layer of the computing architecture of the machine learning model can reuse it in turn.
The embodiment of the invention provides a machine learning method, which comprises the following steps:
respectively determining the storage values of the storage spaces corresponding to each layer of computing architecture of the machine learning model;
distributing a corresponding overall storage space for the machine learning model according to the storage values of the storage spaces of the computing architectures of the layers, so that the overall storage space can store information required by the operation of any computing architecture;
and each layer of computing architecture of the machine learning model respectively utilizes the whole storage space to perform corresponding computation.
An embodiment of the present invention further provides a machine learning system, including:
the storage determining unit is used for respectively determining the storage values of the storage spaces corresponding to the computing architectures of the layers of the machine learning model;
the distribution unit is used for distributing a corresponding overall storage space for the machine learning model according to the storage values of the storage spaces of the computing architectures of all layers, so that the overall storage space can store information required by the running of any computing architecture;
and the computing unit is used for performing corresponding computation by utilizing the whole storage space for each layer of computing architecture of the machine learning model.
As can be seen, in the method of this embodiment, the machine learning system allocates one overall storage space according to the storage values of the storage spaces corresponding to the computing architectures of the respective layers of the machine learning model, so that the layers can reuse that space in turn. Compared with the prior art, in which a separate storage space must be allocated to each layer of the computing architecture, the method reduces the memory fragmentation caused by multiple allocations and greatly reduces the storage space required by the machine learning model at runtime, so that system performance is improved and the machine learning method can run on terminal devices with limited storage.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart of a machine learning method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method for determining the storage value of the storage space corresponding to one layer of the computing architecture according to an embodiment of the present invention;
FIG. 3 is a flow chart of a method by which one layer of the computing architecture performs its computation using the overall storage space according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating the recycling of the overall storage space by the computing architectures of the respective layers according to an embodiment of the present invention;
FIG. 5 is a flow chart of a machine learning method provided in an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a machine learning system according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of another machine learning system provided by an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiment of the invention provides a machine learning method, executed by a machine learning system such as a deep learning system. The flow chart is shown in FIG. 1 and comprises the following steps:
Step 101: respectively determining the storage values of the storage spaces corresponding to each layer of the computing architecture of the machine learning model.
It is understood that the machine learning model includes multiple layers of computing architecture, such as convolutional layers, additive layers, subtractive layers, etc., which are combined to form the computing architecture of the machine learning model to implement certain functions, such as forward prediction through deep learning.
Each layer of the computing architecture comprises an input unit, a computing unit and an output unit, and these units occupy a certain amount of storage space at runtime, so the storage space corresponding to a given layer comprises the storage space required by each of its units during operation. When performing this step, the machine learning system obtains these storage values from the configuration file of the machine learning model.
Step 102: allocating a corresponding overall storage space for the machine learning model according to the storage values of the storage spaces of the layers determined in step 101, so that the overall storage space can store the information required by any layer of the computing architecture at runtime.
In this embodiment, the machine learning system allocates a single overall storage space for the machine learning model, so that the overall storage space is reused as the computing architectures of the respective layers run. The overall storage space must be large enough to store the information required when any layer runs; for example, the storage value of the overall storage space equals the storage value of the largest storage space among the storage spaces of the respective layers.
Step 103: each layer of the computing architecture of the machine learning model performs its corresponding computation using the overall storage space.
As can be seen, in the method of this embodiment, the machine learning system allocates one overall storage space according to the storage values of the storage spaces corresponding to the computing architectures of the respective layers of the machine learning model, so that the layers can reuse that space in turn. Compared with the prior art, in which a separate storage space must be allocated to each layer of the computing architecture, the method reduces the memory fragmentation caused by multiple allocations and greatly reduces the storage space required by the machine learning model at runtime, so that system performance is improved and the machine learning method can run on terminal devices with limited storage.
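The scheme of steps 101 to 103 can be pictured with a minimal sketch. This is not code from the patent: all names (LayerSpec, allocate_shared_buffer) are invented for illustration, and the per-layer storage value follows the sum rule that step 203 below describes.

```python
from dataclasses import dataclass

@dataclass
class LayerSpec:
    input_size: int   # bytes for the layer's input parameters
    param_size: int   # bytes for the layer's fixed calculation parameters
    output_size: int  # bytes for the layer's output parameters

def required_bytes(layer: LayerSpec) -> int:
    # Step 101: one layer's storage value = input + calculation + output parameters.
    return layer.input_size + layer.param_size + layer.output_size

def allocate_shared_buffer(layers: list[LayerSpec]) -> bytearray:
    # Step 102: a single overall space sized for the most demanding layer,
    # so any one layer's working set fits inside it.
    return bytearray(max(required_bytes(layer) for layer in layers))

# Step 103: every layer computes inside the same buffer instead of receiving
# its own allocation, avoiding the per-layer fragmentation of the prior art.
layers = [LayerSpec(400, 100, 300), LayerSpec(300, 500, 200), LayerSpec(200, 50, 100)]
shared = allocate_shared_buffer(layers)
print(len(shared))  # 1000: the largest per-layer requirement, not the sum of all layers
```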
Referring to FIG. 2, in a specific embodiment, the machine learning system determines the storage value of the storage space corresponding to a certain layer of the computing architecture in step 101 through the following steps:
Step 201: obtaining a configuration file of the machine learning model, where the configuration file includes information of the calculation parameters of the certain layer of the computing architecture, information of the input parameters, and structural information.
The calculation parameters are the fixed parameters, such as coefficients, that the layer uses during computation; the information of the calculation parameters may include their size and type. The structural information may include description information of the layer's computing architecture. The information of the input parameters may include their size and type.
Step 202: determining the size of the output parameters of the certain layer of the computing architecture according to the information of the input parameters, the information of the calculation parameters and the structural information. Specifically, the machine learning system may assign values to the input parameters according to their information, calculate the output parameters according to the structural information and the assigned input values, and then determine the size of the output parameters, as illustrated in the sketch below.
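As a concrete illustration of how an output size follows from input-parameter information and structural information, the sketch below uses standard convolution size arithmetic. The convolution layer is only an assumed example (the patent fixes no particular layer type and instead determines the size by assigning values to the inputs and running the layer), and the function name is hypothetical.

```python
def conv_output_shape(in_h: int, in_w: int, kernel: int, stride: int = 1, pad: int = 0):
    # Standard convolution size arithmetic: floor((in + 2*pad - kernel) / stride) + 1.
    out_h = (in_h + 2 * pad - kernel) // stride + 1
    out_w = (in_w + 2 * pad - kernel) // stride + 1
    return out_h, out_w

# A 224x224 input through a 3x3 kernel with stride 1 and padding 1 keeps its size,
# so the output-parameter storage can be sized before any real data is processed.
print(conv_output_shape(224, 224, kernel=3, stride=1, pad=1))  # (224, 224)
```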
After determining the size of the output parameters, the machine learning system may perform step 203 or step 204 as follows.
Step 203: taking the sum of the sizes of the input parameters, the output parameters and the calculation parameters as the storage value of the storage space corresponding to the certain layer of the computing architecture.
In this case, when the machine learning system executes step 102, it may specifically determine the storage value of the largest storage space among the storage spaces corresponding to the layers, so that when the overall storage space is allocated, its storage value coincides with that of the largest storage space.
Step 204: determining that the storage value of the storage space corresponding to the certain layer of the computing architecture comprises the storage value of the space storing the input parameters, the storage value of the space storing the calculation parameters, and the storage value of the space storing the output parameters, i.e., the storage values of the storage spaces respectively occupied by the input unit, the computing unit and the output unit of the layer.
In this case, when the machine learning system executes step 102, it may specifically determine, among the storage spaces corresponding to the layers, the storage value of the first maximum space storing the input parameters, the storage value of the second maximum space storing the calculation parameters, and the storage value of the third maximum space storing the output parameters, so that when the overall storage space is allocated, its storage value is the sum of the storage values of the first, second and third maximum spaces.
For example, suppose a machine learning model has a 3-layer computing architecture. For the first layer, the storage values of the spaces storing the input parameters, the calculation parameters and the output parameters are a1, a2 and a3, respectively; for the second layer they are b1, b2 and b3; and for the third layer they are c1, c2 and c3. If a1 > b1 > c1, b2 > a2 > c2 and c3 > a3 > b3, then the storage values of the first, second and third maximum spaces are a1, b2 and c3, respectively, and the storage value of the allocated overall storage space is a1 + b2 + c3.
The first, second and third maximum spaces therefore do not refer to the three largest storage values overall; they are the maxima of three different categories (input, calculation and output parameters).
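The two sizing rules can be sketched side by side. The function names are invented, and the numeric triples simply reproduce the orderings of the example above (a1 > b1 > c1, b2 > a2 > c2, c3 > a3 > b3).

```python
def size_by_layer_sum(layers):
    # Step 203: overall space = the largest per-layer sum (input + params + output).
    return max(i + p + o for (i, p, o) in layers)

def size_by_component_max(layers):
    # Step 204: overall space = max over inputs + max over params + max over outputs.
    return (max(i for (i, _, _) in layers)
            + max(p for (_, p, _) in layers)
            + max(o for (_, _, o) in layers))

# (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) with a1 > b1 > c1, b2 > a2 > c2, c3 > a3 > b3.
layers = [(9, 4, 3), (7, 8, 2), (5, 1, 6)]
print(size_by_layer_sum(layers))      # 17: the second layer's sum 7 + 8 + 2
print(size_by_component_max(layers))  # 23: a1 + b2 + c3 = 9 + 8 + 6
```

Rule 204 may reserve more space, but it keeps distinct input, parameter and output regions, which matches the three storage spaces used by the pointer rotation described later.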
Referring to FIG. 3, in another specific embodiment, when the machine learning system performs step 103 and a certain layer of the computing architecture computes using the allocated overall storage space, this may specifically be implemented through the following steps:
Step 301: the machine learning system stores the calculation parameters of the certain layer of the computing architecture into a second storage space within the overall storage space.
Step 302: the certain layer of the computing architecture uses a first storage space within the overall storage space as the space storing its input parameters.
Step 303: the certain layer of the computing architecture calculates the output parameters from the input parameters and the calculation parameters, and stores the output parameters into a third storage space within the overall storage space.
If the certain layer is the first layer, i.e., the layer at the first position of the machine learning model, its input parameters can be input into the machine learning system by the user. If it is not the first layer, then when it executes steps 301 to 303, the first storage space in step 302 is the space that stored the output parameters of the previous layer, and the input parameters of the certain layer are the output parameters of the previous layer; the third storage space in step 303 is the space that stored the input parameters of the previous layer.
For example, referring to FIG. 4, the overall storage space allocated by the machine learning system may comprise a first storage space, a second storage space and a third storage space. During the initial calculation, the machine learning system stores the input parameters of the first-layer computing architecture in the first storage space and the calculation parameters of the first-layer computing architecture in the second storage space; the first-layer computing architecture calculates the output parameters from the input parameters and the calculation parameters and writes them to the third storage space. The machine learning system may then assign the address pointer of the third storage space directly to the space storing the input parameters of the second-layer computing architecture, assign the address pointer of the second storage space directly to the space storing the calculation parameters of the second-layer computing architecture, and assign the address pointer of the first storage space to the space storing the output parameters of the second-layer computing architecture.
In this way, the machine learning system can store the calculation parameters of the second-layer computing architecture into the second storage space according to its address pointer, specifically in an overlay manner: the calculation parameters of the second-layer computing architecture directly overwrite the information previously held in the second storage space. The second-layer computing architecture then takes the information stored in the third storage space as its input parameters, calculates the output parameters from these input parameters and the calculation parameters stored in the second storage space, and writes the output parameters to the first storage space. Thus, through pointer assignment, the second-layer computing architecture reuses the overall storage space.
The machine learning system then directly assigns the address pointer of the first storage space to the space storing the input parameters of the third-layer computing architecture, directly assigns the address pointer of the second storage space to the space storing the calculation parameters of the third-layer computing architecture, and assigns the address pointer of the third storage space to the space storing the output parameters of the third-layer computing architecture, so that the third-layer computing architecture can likewise reuse the overall storage space. Proceeding in this way, each layer of the computing architecture of the machine learning model performs its computation in the overall storage space layer by layer, which greatly reduces the storage space the machine learning model needs at runtime and improves system performance.
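The pointer rotation of FIG. 4 can be sketched as follows. Python lists stand in for the three regions of the overall space, reference swapping stands in for address-pointer assignment, and the Layer class and stand-in compute function are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    params: list  # the layer's fixed calculation parameters

def compute(layer: Layer, inputs: list, params: list) -> list:
    # Stand-in calculation: scale every input by the first parameter.
    return [x * params[0] for x in inputs]

def run_model(layers: list[Layer], first_input: list) -> list:
    in_buf = list(first_input)  # first storage space: input parameters
    param_buf: list = []        # second storage space: calculation parameters
    out_buf: list = []          # third storage space: output parameters
    for layer in layers:
        param_buf[:] = layer.params  # overlay-store: overwrite the old parameters
        out_buf[:] = compute(layer, in_buf, param_buf)
        # Pointer assignment: the previous output region becomes the next input
        # region and vice versa; no data is copied between layers.
        in_buf, out_buf = out_buf, in_buf
    return in_buf  # after the final swap, this region holds the model's output

print(run_model([Layer([2]), Layer([3]), Layer([10])], [1, 1]))  # [60, 60]
```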
A specific embodiment of the method of the present invention is described below. In this embodiment, the machine learning system is a deep learning system and the machine learning model is a deep learning model comprising n layers of computing architecture; n = 3 is taken as an example. The flow chart of this embodiment is shown in FIG. 5 and comprises:
Step 401: the deep learning system obtains a configuration file of the deep learning model, where the configuration file includes the information of the calculation parameters respectively corresponding to the 3 layers of the computing architecture, the information of the input parameters, and the structural information.
Step 402: for each layer of the computing architecture, the deep learning system assigns values to the input parameters according to their information and calculates the output parameters according to the structural information and the assigned input values, thereby determining the size of the output parameters of the corresponding layer.
Step 403: the deep learning system takes the sum of the sizes of the input parameters, the output parameters and the calculation parameters of each layer as the storage value of the storage space corresponding to that layer.
Alternatively, the deep learning system determines that the storage value of the storage space corresponding to each layer comprises the storage value of the space storing the input parameters, the storage value of the space storing the calculation parameters, and the storage value of the space storing the output parameters.
Step 404: the deep learning system allocates a corresponding overall storage space for the deep learning model.
If the storage value of each layer's storage space determined by the deep learning system is the sum of the sizes of the input parameters, the output parameters and the calculation parameters of the corresponding layer, the storage value of the overall storage space allocated by the deep learning system coincides with the storage value of the largest storage space among the storage spaces of the 3 layers of the computing architecture.
If the deep learning system instead determines the per-layer storage values component by component, the storage value of the allocated overall storage space is the sum of the storage values of the first, second and third maximum spaces, where the first maximum space is the largest space storing input parameters among the storage spaces corresponding to the 3 layers, the second maximum space is the largest space storing calculation parameters, and the third maximum space is the largest space storing output parameters.
After the deep learning system allocates an overall storage space for the deep learning model through steps 401 to 404, it reuses that space as described in steps 405 to 407 below.
Step 405: the deep learning system receives parameters input by a user, stores them as the input parameters of the first-layer computing architecture into a first storage space of the overall storage space, and stores the calculation parameters of the first-layer computing architecture into a second storage space of the overall storage space; the deep learning system then calculates the output parameters of the first-layer computing architecture from its input parameters and calculation parameters, and stores the output parameters into a third storage space of the overall storage space.
Step 406: the deep learning system directly assigns the address pointer of the third storage space to the space storing the input parameters of the second-layer computing architecture, taking the output parameters of the first-layer computing architecture as the input parameters of the second layer; directly assigns the address pointer of the second storage space to the space storing the calculation parameters of the second-layer computing architecture, storing the calculation parameters of the second layer into the second storage space; and assigns the address pointer of the first storage space to the space for the output parameters of the second-layer computing architecture, storing the output parameters obtained from the second layer's input and calculation parameters into the first storage space.
Step 407: the deep learning system directly assigns the address pointer of the first storage space to the space storing the input parameters of the third-layer computing architecture, taking the output parameters of the second-layer computing architecture as the input parameters of the third layer; directly assigns the address pointer of the second storage space to the space storing the calculation parameters of the third-layer computing architecture, storing the calculation parameters of the third layer into the second storage space; and assigns the address pointer of the third storage space to the space for the output parameters of the third-layer computing architecture, storing the output parameters obtained from the third layer's input and calculation parameters into the third storage space. These output parameters are the final output of the deep learning model.
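The role each region plays during steps 405 to 407 can be traced in a few lines; this is only a restatement of the three steps above, with illustrative region names.

```python
# Which storage region holds which kind of parameter at each step of the embodiment.
roles = {"first": "input", "second": "params", "third": "output"}
for step in (405, 406, 407):
    print(step, roles)
    # After each layer finishes, the input and output regions trade roles
    # (the pointer swap); the params region is simply overwritten next time.
    roles["first"], roles["third"] = roles["third"], roles["first"]
```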
An embodiment of the present invention further provides a machine learning system, whose schematic structural diagram is shown in FIG. 6. The system specifically includes:
a storage determining unit 10, configured to respectively determine the storage values of the storage spaces corresponding to the computing architectures of the respective layers of the machine learning model;
an allocating unit 11, configured to allocate a corresponding overall storage space to the machine learning model according to the storage values of the storage spaces of the layers determined by the storage determining unit 10, so that the overall storage space can store the information required by any layer of the computing architecture at runtime;
and a computing unit 12, configured to perform, for each layer of the computing architecture of the machine learning model, the corresponding computation using the overall storage space allocated by the allocating unit 11.
Specifically, the computing unit 12 is configured to store the calculation parameters corresponding to a certain layer of the computing architecture in a second storage space of the overall storage space; the certain layer of the computing architecture uses the first storage space in the overall storage space as the space storing its input parameters, calculates the output parameters from the input parameters and the calculation parameters, and stores the output parameters into a third storage space in the overall storage space.
If the certain layer is not the first layer, the first storage space is the space that stored the output parameters of the previous layer of the computing architecture, and the input parameters of the certain layer are the output parameters of the previous layer; the third storage space is the space that stored the input parameters of the previous layer.
As can be seen, in the machine learning system of this embodiment, the allocating unit 11 allocates one overall storage space according to the storage values of the storage spaces corresponding to the computing architectures of the layers of the machine learning model, so that the computing architectures run by the computing unit 12 can reuse that space in turn. Compared with the prior art, in which a corresponding storage space must be allocated to each layer of the computing architecture, this reduces the memory fragmentation caused by multiple allocations and greatly reduces the storage space required while the machine learning model runs, so that system performance is improved and the machine learning system can operate on a terminal device with limited storage.
Referring to FIG. 7, in a specific embodiment, the storage determining unit 10 may specifically include an output determining unit 110 and a final determining unit 120, where:
the output determining unit 110 is configured to obtain a configuration file of the machine learning model, where the configuration file includes information of the calculation parameters of the certain layer of the computing architecture, information of the input parameters, and structural information; and to determine the size of the output parameters of the certain layer according to the information of the input parameters, the information of the calculation parameters and the structural information;
the final determining unit 120 is configured to take the sum of the size of the input parameters, the size of the output parameters determined by the output determining unit 110, and the size of the calculation parameters as the storage value of the storage space corresponding to the certain layer; or to determine that the storage value of the storage space corresponding to the certain layer comprises the storage value of the space storing the input parameters, the storage value of the space storing the calculation parameters, and the storage value of the space storing the output parameters.
If the final determining unit 120 takes the sum of the sizes of the input parameters, the output parameters and the calculation parameters as the storage value of the storage space corresponding to the certain layer, the allocating unit 11 is specifically configured to determine the storage value of the largest storage space among the storage spaces corresponding to the layers, and to allocate an overall storage space whose storage value coincides with that of the largest storage space.
If the final determining unit 120 determines the per-layer storage value component by component, the allocating unit 11 is specifically configured to determine, among the storage spaces corresponding to the layers, the storage value of the first maximum space storing the input parameters, the storage value of the second maximum space storing the calculation parameters, and the storage value of the third maximum space storing the output parameters, and to allocate an overall storage space whose storage value is the sum of the storage values of the first, second and third maximum spaces.
The present invention further provides a terminal device, whose schematic structural diagram is shown in FIG. 8. The terminal device may vary considerably with configuration and performance, and may include one or more central processing units (CPUs) 20 (e.g., one or more processors), a memory 21, and one or more storage media 22 (e.g., one or more mass storage devices) storing application programs 221 or data 222, where the memory 21 and the storage medium 22 may be transient or persistent storage. The program stored in the storage medium 22 may include one or more modules (not shown), each of which may include a series of instruction operations for the terminal device. Still further, the central processing unit 20 may be arranged to communicate with the storage medium 22 and to execute, on the terminal device, the series of instruction operations in the storage medium 22.
The terminal device may also include one or more power supplies 23, one or more wired or wireless network interfaces 24, one or more input/output interfaces 25, and/or one or more operating systems 223, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
The steps performed by the machine learning system in the above-described method embodiment may be based on the structure of the terminal device shown in fig. 8.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: read Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
The machine learning method and system provided by the embodiments of the present invention are described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help understand the method and its core idea. Meanwhile, a person skilled in the art may, according to the idea of the present invention, vary the specific embodiments and the application scope. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (12)

1. A machine learning method, comprising:
respectively determining the storage values of the storage spaces corresponding to each layer of computing architecture of the machine learning model when forward prediction is carried out by a deep learning method;
distributing a corresponding overall storage space for the machine learning model according to the storage values of the storage spaces of the computing architectures of the layers, so that the overall storage space can store information required by the operation of any computing architecture;
each layer of computing architecture of the machine learning model respectively utilizes the whole storage space to perform corresponding computation;
determining the storage value of the storage space corresponding to a certain layer of the computing architecture of the machine learning model specifically comprises:
acquiring a configuration file of the machine learning model, wherein the configuration file comprises information of the calculation parameters of the certain layer of the computing architecture, information of the input parameters, and structural information;
determining the size of the output parameters of the certain layer of the computing architecture according to the information of the input parameters, the information of the calculation parameters and the structural information;
taking the sum of the sizes of the input parameters, the output parameters and the calculation parameters as the storage value of the storage space corresponding to the certain layer of the computing architecture; or determining that the storage value of the storage space corresponding to the certain layer of the computing architecture comprises the storage value of the space storing the input parameters, the storage value of the space storing the calculation parameters, and the storage value of the space storing the output parameters.
2. The method according to claim 1, wherein the allocating the corresponding overall storage space for the machine learning model according to the storage values of the storage spaces of the respective layers of computing architectures specifically comprises:
determining a storage value of the maximum storage space in the storage spaces corresponding to the computing architectures of the layers;
allocating an overall storage space such that the stored value of the overall storage space coincides with the stored value of the maximum storage space.
3. The method according to claim 1, wherein the allocating the corresponding overall storage space for the machine learning model according to the storage values of the storage spaces of the respective layers of computing architectures specifically comprises:
determining a storage value of a first maximum space for storing input parameters, a storage value of a second maximum space for storing calculation parameters and a storage value of a third maximum space for storing output parameters in storage spaces corresponding to the calculation architectures of the layers;
allocating an overall storage space such that the stored value of the overall storage space is the sum of the stored values of the first maximum space, the second maximum space, and the third maximum space.
4. The method according to any one of claims 1 to 3, wherein the method is applied to a machine learning system, and a certain layer of computing architecture of the machine learning model performs corresponding computation by using the entire storage space, and specifically includes:
the machine learning system stores the calculation parameters corresponding to the certain layer of calculation architecture into a second storage space in the whole storage space;
and the certain layer of calculation framework takes the first storage space in the whole storage space as a space for storing corresponding input parameters, calculates according to the input parameters and the calculation parameters to obtain output parameters, and stores the output parameters into a third storage space in the whole storage space.
5. The method of claim 4,
if the certain layer of computing architecture is a non-first layer of computing architecture, the first storage space is a space for storing output parameters of a previous layer of computing architecture of the certain layer of computing architecture, and the input parameters of the certain layer of computing architecture are the output parameters of the previous layer of computing architecture; the third storage space is a space for storing input parameters in the previous layer of computing architecture.
6. A machine learning system, comprising:
the storage determining unit is used for respectively determining the storage values of the storage spaces corresponding to each layer of computing architecture of the machine learning model when the forward prediction is carried out by the deep learning method;
the distribution unit is used for distributing a corresponding overall storage space for the machine learning model according to the storage values of the storage spaces of the computing architectures of all layers, so that the overall storage space can store information required by the running of any computing architecture;
the computing unit is used for carrying out corresponding computation by utilizing the whole storage space by each layer of computing architecture of the machine learning model;
the storage determination unit specifically includes an output determination unit and a final determination unit, wherein:
the output determining unit is specifically configured to obtain a configuration file of the machine learning model, where the configuration file includes information of a calculation parameter of a certain layer of calculation architecture, information of an input parameter, and structural information; determining the size of the output parameter of the certain layer of computing architecture according to the information of the input parameter, the information of the computing parameter and the structural information;
the final determining unit is configured to take the sum of the size of the input parameters, the size of the output parameters and the size of the calculation parameters as the storage value of the storage space corresponding to the certain layer of the computing architecture; or to determine that the storage value of the storage space corresponding to the certain layer of the computing architecture comprises the storage value of the space storing the input parameters, the storage value of the space storing the calculation parameters, and the storage value of the space storing the output parameters.
7. The system of claim 6,
the allocation unit is specifically configured to determine the storage value of the maximum storage space among the storage spaces corresponding to the computing architectures of the layers, and to allocate an overall storage space such that the stored value of the overall storage space coincides with the stored value of the maximum storage space.
8. The system of claim 6,
the allocation unit is specifically configured to determine, among the storage spaces corresponding to the layers of the computing architecture, the stored value of the first maximum space storing the input parameters, the stored value of the second maximum space storing the calculation parameters, and the stored value of the third maximum space storing the output parameters, and to allocate an overall storage space such that the stored value of the overall storage space is the sum of the stored values of the first maximum space, the second maximum space and the third maximum space.
9. The system according to any one of claims 6 to 8,
the computing unit is specifically configured to store a computing parameter corresponding to a certain layer of computing architecture in a second storage space of the overall storage space; and the certain layer of calculation framework takes the first storage space in the whole storage space as a space for storing corresponding input parameters, calculates according to the input parameters and the calculation parameters to obtain output parameters, and stores the output parameters into a third storage space in the whole storage space.
10. The system of claim 9,
if the certain layer of computing architecture is a non-first layer of computing architecture, the first storage space is a space for storing output parameters of a previous layer of computing architecture of the certain layer of computing architecture, and the input parameters of the certain layer of computing architecture are the output parameters of the previous layer of computing architecture; the third storage space is a space for storing input parameters in the previous layer of computing architecture.
11. A storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the machine learning method of any one of claims 1 to 5.
12. A terminal device comprising a processor and a storage medium, the processor configured to implement instructions;
the storage medium is configured to store a plurality of instructions for loading by a processor and executing the machine learning method of any of claims 1 to 5.
CN201610898838.6A 2016-10-14 2016-10-14 Machine learning method and system Active CN106529679B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610898838.6A CN106529679B (en) 2016-10-14 2016-10-14 Machine learning method and system
PCT/CN2017/102836 WO2018068623A1 (en) 2016-10-14 2017-09-22 Machine learning method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610898838.6A CN106529679B (en) 2016-10-14 2016-10-14 Machine learning method and system

Publications (2)

Publication Number Publication Date
CN106529679A CN106529679A (en) 2017-03-22
CN106529679B true CN106529679B (en) 2020-01-14

Family

ID=58332197

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610898838.6A Active CN106529679B (en) 2016-10-14 2016-10-14 Machine learning method and system

Country Status (2)

Country Link
CN (1) CN106529679B (en)
WO (1) WO2018068623A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106529679B (en) * 2016-10-14 2020-01-14 腾讯科技(上海)有限公司 Machine learning method and system
FR3089649A1 (en) * 2018-12-06 2020-06-12 Stmicroelectronics (Rousset) Sas Method and device for determining the global memory size of a global memory area allocated to the data of a neural network
CN110287171B (en) * 2019-06-28 2020-05-26 北京九章云极科技有限公司 Data processing method and system

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1492330A (en) * 2002-10-24 2004-04-28 华为技术有限公司 Method for general windows program to operate journal information record
CN1627251A (en) * 2003-12-09 2005-06-15 微软公司 Accelerating and optimizing the processing of machine learning techniques using a graphics processing unit
CN102455943A (en) * 2010-10-19 2012-05-16 上海聚力传媒技术有限公司 Method for carrying out data sharing based on memory pool, and computer device
CN103617146A (en) * 2013-12-06 2014-03-05 北京奇虎科技有限公司 Machine learning method and device based on hardware resource consumption
CN104021088A (en) * 2014-06-24 2014-09-03 广东睿江科技有限公司 Log storage method and device
CN105488565A (en) * 2015-11-17 2016-04-13 中国科学院计算技术研究所 Calculation apparatus and method for accelerator chip accelerating deep neural network algorithm
CN105637541A (en) * 2013-10-11 2016-06-01 高通股份有限公司 Shared memory architecture for a neural simulator

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101047610B (en) * 2007-04-30 2012-07-11 华为技术有限公司 Data storage, reading, transmission method and management server and network node
JP5171118B2 (en) * 2007-06-13 2013-03-27 キヤノン株式会社 Arithmetic processing apparatus and control method thereof
EP3035204B1 (en) * 2014-12-19 2018-08-15 Intel Corporation Storage device and method for performing convolution operations
CN105869117B (en) * 2016-03-28 2021-04-02 上海交通大学 GPU acceleration method for deep learning super-resolution technology
CN106529679B (en) * 2016-10-14 2020-01-14 腾讯科技(上海)有限公司 Machine learning method and system

Also Published As

Publication number Publication date
CN106529679A (en) 2017-03-22
WO2018068623A1 (en) 2018-04-19

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant