WO2022022417A1 - Calibration method and apparatus, terminal device, and storage medium - Google Patents

Calibration method and apparatus, terminal device, and storage medium

Info

Publication number
WO2022022417A1
WO2022022417A1 (PCT/CN2021/108133, CN2021108133W)
Authority
WO
WIPO (PCT)
Prior art keywords
group
layer
calibrated
resources
layers
Prior art date
Application number
PCT/CN2021/108133
Other languages
English (en)
French (fr)
Inventor
李康
丁瑞强
李涵
祝夭龙
Original Assignee
北京灵汐科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京灵汐科技有限公司
Priority to US18/004,021 (granted as US11816547B2)
Publication of WO2022022417A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/0495: Quantised networks; Sparse networks; Compressed networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Definitions

  • the present invention relates to the technical field of data processing, and in particular, to a calibration method, a calibration device, a terminal device and a computer-readable storage medium.
  • Model quantization is a commonly used technique in the field of deep learning. By quantizing model parameters and inputs from high precision to low precision, such as from float32 (32-bit floating point) to int8 (8-bit integer), the operation speed of the model can be increased and the model size reduced.
  • to reduce the loss of precision during quantization, the model needs to be calibrated: typical application data is fed into the model (model inference) to obtain the dynamic range of the data to be quantized that is generated by each layer to be calibrated (which can be specified by the user), and the quantization factor is then determined from that dynamic range (statistic calculation).
  • the quantization factor is used to perform the quantization.
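As a minimal illustration of this calibration step (not code from the patent), assuming a symmetric int8 scheme and NumPy arrays for the per-layer data, the dynamic range and quantization factor could be computed as follows; the names `calibrate_layer` and `quantize` are illustrative:

```python
import numpy as np

def calibrate_layer(activations: np.ndarray) -> float:
    """Estimate a quantization factor for one layer to be calibrated.

    `activations` is the data produced by that layer when typical
    application data is run through the model (model inference).
    """
    # Dynamic range of the data to be quantized.
    low, high = float(activations.min()), float(activations.max())
    # Statistic calculation: map the dynamic range onto int8 symmetrically.
    return max(abs(low), abs(high)) / 127.0

def quantize(tensor: np.ndarray, scale: float) -> np.ndarray:
    """Quantize float data to int8 using the calibrated quantization factor."""
    return np.clip(np.round(tensor / scale), -128, 127).astype(np.int8)
```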
  • in current calibration algorithms, the layers to be calibrated are calibrated one after another in the direction of data transmission through the model, with a fixed single layer or a fixed set of adjacent layers calibrated in each pass.
  • Embodiments of the present invention provide a calibration method, a calibration device, a terminal device, and a computer-readable storage medium, which improve the speed of model calibration.
  • an embodiment of the present invention provides a calibration method, which includes: determining layer attribute information of each layer to be calibrated in a model; and determining, according to total available resources and the layer attribute information of each layer to be calibrated, the group in which each layer to be calibrated is located; wherein the layer attribute information of any layer to be calibrated includes the resources required by that layer, the resources required by the layer being the resources occupied when calibrating the layer, and the total available resources being the total resources used for calibration.
  • the determining, according to the total available resources and the layer attribute information of the layers to be calibrated, of the group in which each layer to be calibrated is located includes: from the layers to be calibrated that do not yet belong to a group, determining the layer with the largest required resources as the target layer; determining, at least according to the layer attribute information of the target layer, the group in which the target layer is located; subtracting the resources required by the target layer from the available resources of that group; and, if there are still layers to be calibrated that do not belong to a group, returning to the step of determining, from the layers to be calibrated that do not belong to a group, the layer with the largest required resources as the target layer.
  • the determining, at least according to the layer attribute information of the target layer, of the group in which the target layer is located includes: if there is a group whose available resources are greater than or equal to the resources required by the target layer, selecting one such group as the group of the target layer; if there is no such group, creating a group, determining the created group as the group of the target layer, and setting the available resources of the created group equal to the total available resources.
  • the selecting of one such group as the group of the target layer includes: from the groups whose available resources are greater than or equal to the resources required by the target layer, determining the group with the smallest cost value as the group of the target layer; the cost value of any group is the difference between a first time and a second time of that group, the first time being the time required to calibrate all layers currently in the group together with the target layer, and the second time being the time required to calibrate all layers currently in the group.
  • the determining, from the groups whose available resources are greater than or equal to the resources required by the target layer, of the group with the smallest cost value as the group of the target layer includes: determining the maximum sequence number of each group, the maximum sequence number of any group being the largest sequence number among all layers to be calibrated in that group, and the sequence number of any layer to be calibrated being the position of that layer in the model according to a preset processing order; if there is a group whose maximum sequence number is greater than the sequence number of the target layer, selecting one such group as the group of the target layer; and, if there is no such group, determining the group with the largest maximum sequence number as the group of the target layer.
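The grouping procedure described in the preceding items can be sketched in Python roughly as follows; this is a simplified illustration that assumes memory is the only resource and uses the sequence-number cost rule just described (all class and function names are illustrative, not from the patent):

```python
from dataclasses import dataclass, field

@dataclass
class Layer:
    seq: int     # position of the layer in the model (sequence number)
    mem: float   # resources (memory) the layer occupies during calibration

@dataclass
class Group:
    free: float                      # group available resources
    layers: list = field(default_factory=list)

    @property
    def max_seq(self) -> int:
        return max(l.seq for l in self.layers) if self.layers else 0

def group_layers(layers, total_mem):
    """Assign each layer to be calibrated to a group (one group = one calibration operation)."""
    groups = []
    # Always take the ungrouped layer with the largest required resources as the target layer.
    for target in sorted(layers, key=lambda l: l.mem, reverse=True):
        fitting = [g for g in groups if g.free >= target.mem]
        if not fitting:
            # No existing group can accommodate the target layer: create a new group
            # whose available resources equal the total available resources.
            chosen = Group(free=total_mem)
            groups.append(chosen)
        else:
            # Prefer a group whose maximum sequence number already exceeds the target's
            # (cost value close to 0); otherwise take the group with the largest
            # maximum sequence number (smallest increase in run time).
            zero_cost = [g for g in fitting if g.max_seq > target.seq]
            chosen = zero_cost[0] if zero_cost else max(fitting, key=lambda g: g.max_seq)
        chosen.layers.append(target)
        chosen.free -= target.mem    # subtract the layer's required resources
    return groups
```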
  • the resource is memory
  • after the groups have been determined, the method further includes: selecting an uncalibrated group and calibrating all layers to be calibrated in that group; and, if an uncalibrated group still remains, returning to the step of selecting an uncalibrated group and calibrating all layers to be calibrated in it.
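A minimal calibration loop over the resulting groups might then look like this; `calibrate_group` is a hypothetical callable standing in for one calibration operation (running the model up to the group's deepest layer and collecting the statistics of its layers):

```python
def calibrate_all(groups, calibrate_group):
    """Run one calibration operation per group until no uncalibrated group remains."""
    quantization_factors = {}
    for group in groups:                 # select an uncalibrated group
        stats = calibrate_group(group)   # calibrate all layers to be calibrated in it
        quantization_factors.update(stats)
    return quantization_factors
```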
  • an embodiment of the present invention provides a calibration apparatus, which includes: a first determination module for determining the layer attribute information of each layer to be calibrated in the model; and a second determination module for determining, according to the total available resources and the layer attribute information of each layer to be calibrated, the group to which each layer to be calibrated belongs.
  • an embodiment of the present invention provides a terminal device, which includes: one or more processors; and a storage device for storing one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors implement any one of the calibration methods provided by the embodiments of the present invention.
  • an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, any one of the calibration methods provided by the embodiments of the present invention is implemented.
  • Embodiments of the present invention provide a calibration method, apparatus, terminal device, and storage medium, in which the layer attribute information (the resources required by each layer) of every layer to be calibrated in the model is first determined; the grouping of the layers to be calibrated, that is, the set of layers corresponding to each calibration operation, is then determined from the required resources of each layer and the total available resources.
  • with this scheme, all layers to be calibrated can be reasonably grouped within the limit of the total available resources, so that the resources required in each calibration operation are as balanced and as large as possible and resources are fully used; in most cases the number of calibration operations (the number of groups) is reduced, which improves the computation speed of model calibration.
  • FIG. 1 is a schematic flowchart of a calibration method according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a process of determining a group of a layer to be calibrated in a calibration method provided by an embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of a calibration device according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a terminal device according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.
  • an embodiment of the present invention provides a calibration method.
  • the calibration method in the embodiment of the present invention can be applied to a model quantization scenario, that is, the calibration method is used to calibrate the layers to be calibrated in the model to obtain the quantization factor of each to-be-calibrated layer, and the quantization factor can be used to realize model quantization.
  • the calibration method in the embodiment of the present invention may be performed by a calibration device, where the calibration device may be implemented by software and/or hardware, and is generally integrated on a terminal device.
  • the terminal device includes but is not limited to devices such as mobile phones, computers, and personal digital assistants.
  • the model is any operation model established according to the deep learning technology, such as a neural network model.
  • Each model can handle certain problems, such as image recognition, speech recognition, etc.
  • Each model includes multiple layers arranged in sequence; each layer performs certain processing (such as convolution or fully connected processing) on the data that enters it and passes the resulting data onward (e.g., to the next layer, or out of the model).
  • the layer to be calibrated is a layer in the model that needs to be calibrated (to obtain a quantization factor), which may be all or part of the layers in the model.
  • the layer to be calibrated may be determined by user designation, etc., which is not limited in the present invention.
  • FIG. 1 is a schematic flowchart of a calibration method according to an embodiment of the present invention.
  • the calibration method includes the following steps S110 to S120.
  • the layer attribute information of any of the layers to be calibrated includes resources required by the layer, and the resources required by the layer are resources occupied when calibrating the layer to be calibrated.
  • the layer attribute information of each layer to be calibrated is first determined; the specific means of determination is not limited, and it may, for example, be determined from the configuration of the layer to be calibrated (such as the size of the input data, the size of the convolution kernel, the stride, and the parameters of its functions).
  • the layer attribute information includes at least the resources required by the layer, that is, the resources used to process the data generated by that layer during a calibration operation; in other words, the resources the layer to be calibrated must "occupy" in a single calibration operation.
  • the resource includes memory.
  • all resources include at least memory resources.
  • the above resources may also include other types of resources, such as computing resources of a processor.
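Purely as an illustration (the patent gives no formula), the memory occupied by one convolution layer's calibration data might be estimated from its configuration along these lines; the 'valid' padding, batch size of one, and float32 activations are assumptions:

```python
def estimated_activation_bytes(in_h, in_w, out_channels, kernel, stride, dtype_bytes=4):
    """Rough estimate of the memory needed to hold one layer's output during calibration."""
    out_h = (in_h - kernel) // stride + 1
    out_w = (in_w - kernel) // stride + 1
    return out_h * out_w * out_channels * dtype_bytes
```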
  • S120 Determine the group to which each layer to be calibrated belongs according to the total available resources and layer attribute information of each layer to be calibrated.
  • the total available resources are total resources used for calibration.
  • the layers to be calibrated are divided into different groups according to the total resources available in each calibration operation (the total available resources) and the resources each layer requires when it is calibrated (the resources required by the layer); that is, in the calibration method of the embodiments of the present disclosure, before any calibration operation is actually performed, the layers to be calibrated in each calibration operation can be determined in advance.
  • the layers to be calibrated in each group can be calibrated in the same calibration operation.
  • the sum of the resources required by all layers to be calibrated in each group must not exceed the total available resources (otherwise that calibration operation cannot be performed), but, subject to this requirement, it should be as large as possible, so that the layers assigned to each group use the total available resources as fully as possible and the average number of layers per group is as large as possible, that is, the number of groups needed (the number of calibration operations) is small.
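The constraint stated here is easy to check once a grouping has been produced; a tiny illustrative helper over plain lists of per-layer memory sizes:

```python
def grouping_is_feasible(group_mem_lists, total_available):
    """True if, in every group, the summed per-layer memory fits the total available resources."""
    return all(sum(mems) <= total_available for mems in group_mem_lists)

# Example: two groups holding 10 + 1 + 0.5 GB and 8 GB against a 16 GB budget.
assert grouping_is_feasible([[10, 1, 0.5], [8]], 16)
```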
  • the total resources may be all resources possessed by the calibration device, that is, the total resources that the calibration device can provide in each calibration operation. Of course, in different calibration operations, the resources (total available resources) of the calibration device can be reused.
  • the layer attribute information (the resources required by each layer) of every layer to be calibrated in the model is first determined; the grouping of the layers to be calibrated, that is, the set of layers corresponding to each calibration operation, is then determined from the required resources of each layer and the total available resources.
  • all layers to be calibrated can thus be reasonably grouped within the limit of the total available resources, so that the resources required in each calibration operation are as balanced and as large as possible and resources are fully used; in most cases the number of calibration operations (the number of groups) is reduced, which improves the computation speed of model calibration.
  • after the group of each layer to be calibrated is determined (S120), steps S130 to S140 are further included.
  • once the grouping is determined, the calibration process can start: each calibration operation calibrates all the layers to be calibrated in exactly one group, and the layers in different groups are calibrated separately (that is, in multiple calibration operations).
  • in this way, the resources of each calibration operation can be used as fully as possible, that is, as many layers as possible are calibrated in each calibration operation.
  • the determining, according to the total available resources and the layer attribute information of the layers to be calibrated, of the group in which each layer to be calibrated is located (S120) includes: from the layers to be calibrated that do not yet belong to a group, determining the layer with the largest required resources as the target layer; determining, at least according to the layer attribute information of the target layer, the group in which the target layer is located; subtracting the resources required by the target layer from the available resources of that group; and, if there are still layers to be calibrated that do not belong to a group, returning to the step of determining, from the layers to be calibrated that do not belong to a group, the layer with the largest required resources as the target layer.
  • the "layers to be calibrated that do not belong to the group” refers to the layers to be calibrated that have not been classified into a certain group when the above steps are performed, that is, the layers to be calibrated that have not been grouped as target layers.
  • the specific grouping process may include: from the layers to be calibrated that have not yet been grouped, selecting the one with the largest required resources (such as the memory occupied in a calibration operation) as the current target layer and assigning it to a group; after the assignment, subtracting the resources required by the target layer from the available resources of that group (the resources of the group that have not yet been "occupied"); and then checking whether any ungrouped layers to be calibrated remain: if so, selecting a target layer again from the ungrouped layers (the previous target layer has been grouped and is no longer among them), and otherwise ending.
  • the determining, at least according to the layer attribute information of the target layer, of the group in which the target layer is located includes: if there is a group whose available resources are greater than or equal to the resources required by the target layer, selecting one such group as the group of the target layer; if there is no such group, creating a group, determining the created group as the group of the target layer, and setting the available resources of the created group equal to the total available resources.
  • the grouping process for each target layer may include: if, among the currently existing groups, one or more groups have available resources larger than the resources required by the target layer (that is, there are existing groups that can accommodate the target layer), the target layer can be assigned to such a group (and the resources required by the target layer are subsequently subtracted from that group's available resources); if no such group exists (that is, no current group can accommodate the target layer), a new group needs to be "created" (that is, an additional calibration pass is required), the target layer is assigned to the new group, and the available resources of the new group are set equal to the total available resources (the resources required by the target layer are subsequently subtracted from them).
  • the selecting of one such group as the group of the target layer includes: from the groups whose available resources are greater than or equal to the resources required by the target layer, determining the group with the smallest cost value as the group of the target layer; the cost value of any group is the difference between a first time and a second time of that group, the first time being the time required to calibrate all layers currently in the group together with the target layer, and the second time being the time required to calibrate all layers currently in the group.
  • when there is a group that can accommodate the target layer, the target layer can be assigned to the group with the smallest cost value (of course, if only one group can accommodate the target layer, it is necessarily the group with the smallest cost value).
  • the "cost value" of a group indicates that after the target layer is divided into this group, the time of the corresponding calibration operation of this group is relative to the time of the corresponding calibration operation of this group before the target layer is divided into this group "increment".
  • the calibration operation time corresponding to the group does not increase or increases slightly after the classification, so as to shorten the calibration time.
  • the determining, from the groups whose available resources are greater than or equal to the resources required by the target layer, of the group with the smallest cost value as the group of the target layer includes: determining the maximum sequence number of each group, the maximum sequence number of any group being the largest sequence number among all layers to be calibrated in that group, and the sequence number of any layer to be calibrated being the position of that layer in the model according to a preset processing order; if there is a group whose maximum sequence number is greater than the sequence number of the target layer, selecting one such group as the group of the target layer; and, if there is no such group, determining the group with the largest maximum sequence number as the group of the target layer.
  • the sequence number of a layer to be calibrated indicates its position in the model along the preset data transmission direction; for example, it indicates that the layer is the first layer L1 of the model (sequence number 1), the second layer L2 (sequence number 2), the third layer L3 (sequence number 3), the fourth layer L4 (sequence number 4), and so on.
  • the cost value of a group may be determined specifically from the sequence numbers: if, in some of the groups that can accommodate the target layer, the largest sequence number of the layers already in the group (the group's maximum sequence number) is larger than the sequence number of the target layer, the target layer is assigned to such a group; if, in every group that can accommodate the target layer, the sequence numbers of the existing layers are all smaller than that of the target layer, the group whose maximum sequence number is largest is selected as the group of the target layer.
  • a calibration operation only needs to run the model up to the layer with the largest sequence number in its group, so if the sequence number of the target layer is smaller than the maximum sequence number of the layers already in a group (that is, the last layer that must be run in that group's calibration operation), then after the target layer is assigned to that group the corresponding calibration operation does not run "more layers" of the model; the running time of the model does not increase, the actual time of the calibration operation does not increase, and the cost value of that group is 0 (or close to 0).
  • the model "multi-run" in the corresponding calibration operation. ” has the least number of layers, that is, the least cost value.
  • the calibration method provided by the present invention can be abstracted as a bin packing problem.
  • the model has four layers L1, L2, L3, and L4 set in sequence, all of which are layers to be calibrated, and the total memory of the calibration device is 16G.
  • each group (each calibration operation) corresponds to a box, and the total available resources (the resources, such as memory, available in each calibration operation) correspond to the capacity of each box (the box can be reused across operations).
  • each layer to be calibrated corresponds to a pile of sand, and the resources it requires correspond to the amount of sand.
  • the process of grouping the layers to be calibrated is then equivalent to putting each pile of sand into a corresponding box.
  • the process must ensure that no box "overflows", while trying to fill each box as much as possible so as to reduce the total number of boxes required.
  • first, the memory required by each layer to be calibrated is determined (that is, the size of each pile of sand); then, in descending order of memory, each layer is placed into one of the already-opened boxes (whose capacity is the memory a single calibration operation can afford), choosing the box with the smallest cost value; or, when no opened box can hold a given layer to be calibrated, a new box is opened.
  • the smallest cost value means that, when the layer to be calibrated is added to that box, the overall computation time of the corresponding calibration operation does not increase or increases only slightly.
  • the calculation duration can be determined according to the order (sequence number) of data transmission in the model. For example, the calculation duration of L4 is greater than that of L3, the calculation duration of L3 is greater than that of L2, and the calculation duration of L2 is greater than that of L1. That is to say, according to the sequence of data transmission, the calculation duration of the later layer to be calibrated is greater than the calculation duration of the earlier layer to be calibrated.
  • the execution body of the calibration method provided by the present invention may be a calibration device of a compiler, and the calibration method includes the following steps:
  • S1: calculate the memory required by each of layers L1-L4; for example, the memory required by L1, L2, L3, and L4 is 8 GB, 1 GB, 10 GB, and 0.5 GB respectively (the amounts of the four piles of sand).
  • when L4 is packed, both boxes can accommodate it.
  • loading L4 into the first box changes the computation time from that of L3 (the second time) to that of L4 (the first time), a cost value smaller than that of loading L4 into the second box (which changes the computation time from that of L2 to that of L4); therefore L4 is loaded into the first box, leaving 4.5 GB in the first box.
  • an embodiment of the present invention provides a calibration apparatus 30 , and the apparatus 30 may be suitable for calibrating a layer to be calibrated in a model, wherein the apparatus 30 may be implemented by software and/or hardware , and are generally integrated on the terminal device.
  • the device 30 includes:
  • the first determination module 31 is used to determine the layer attribute information of each to-be-calibrated layer in the model
  • the second determination module 32 is configured to determine, according to the total available resources and the layer attribute information of each of the layers to be calibrated, the group to which each layer to be calibrated belongs.
  • the calibration apparatus 30 in the embodiment of the present invention can implement the calibration method described in any one of the embodiments of the present invention.
  • the determining, according to the total available resources and the layer attribute information of the layers to be calibrated, of the group in which each layer to be calibrated is located includes: from the layers to be calibrated that do not yet belong to a group, determining the layer with the largest required resources as the target layer; determining, at least according to the layer attribute information of the target layer, the group in which the target layer is located; subtracting the resources required by the target layer from the available resources of that group; and, if there are still layers to be calibrated that do not belong to a group, returning to the step of determining, from the layers to be calibrated that do not belong to a group, the layer with the largest required resources as the target layer.
  • the determining, at least according to the layer attribute information of the target layer, of the group in which the target layer is located includes: if there is a group whose available resources are greater than or equal to the resources required by the target layer, selecting one such group as the group of the target layer; if there is no such group, creating a group, determining the created group as the group of the target layer, and setting the available resources of the created group equal to the total available resources.
  • the selecting of one such group as the group of the target layer includes: from the groups whose available resources are greater than or equal to the resources required by the target layer, determining the group with the smallest cost value as the group of the target layer; the cost value of any group is the difference between a first time and a second time of that group, the first time being the time required to calibrate all layers currently in the group together with the target layer, and the second time being the time required to calibrate all layers currently in the group.
  • the determining, from the groups whose available resources are greater than or equal to the resources required by the target layer, of the group with the smallest cost value as the group of the target layer includes: determining the maximum sequence number of each group, the maximum sequence number of any group being the largest sequence number among all layers to be calibrated in that group, and the sequence number of any layer to be calibrated being the position of that layer in the model according to a preset processing order; if there is a group whose maximum sequence number is greater than the sequence number of the target layer, selecting one such group as the group of the target layer; and, if there is no such group, determining the group with the largest maximum sequence number as the group of the target layer.
  • the resource is memory
  • after the groups have been determined, the method further includes: selecting an uncalibrated group and calibrating all layers to be calibrated in that group; and, if an uncalibrated group still remains, returning to the step of selecting an uncalibrated group and calibrating all layers to be calibrated in it.
  • an embodiment of the present invention provides a terminal device 40 .
  • the terminal device 40 includes: one or more processors 41 (one processor 41 is taken as an example in FIG. 4) and a storage device 42; the storage device 42 is used to store one or more programs; the one or more programs are executed by the one or more processors 41, so that the one or more processors 41 implement the calibration method according to any one of the embodiments of the present invention.
  • the terminal device 40 may further include: an input device 43 and an output device 44 .
  • the processor 41 , the storage device 42 , the input device 43 and the output device 44 in the terminal device 40 may be connected through a bus or in other ways, and the connection through a bus is taken as an example in FIG. 4 .
  • the storage device 42 in the terminal device 40 can be used to store one or more programs, and the programs can be software programs, computer-executable programs, and modules.
  • for example, the program instructions/modules corresponding to the calibration method provided in the embodiments of the present invention (the modules in the calibration apparatus 30 shown in FIG. 3) include: the first determination module 31 and the second determination module 32.
  • the processor 41 executes various functional applications and data processing of the terminal device 40 by running the software programs, instructions and modules stored in the storage device 42, ie, implements the calibration method in the above method embodiments.
  • the storage device 42 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal device 40 and the like. Additionally, storage device 42 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some instances, storage device 42 may further include memory located remotely from processor 41, which may be connected to the device through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the input device 43 may be used to receive input numerical or character information, and to generate key signal input related to user settings and function control of the terminal device 40 .
  • the output device 44 may include a display device such as a display screen.
  • an embodiment of the present invention provides a computer-readable storage medium 50 on which a computer program is stored; when the computer program is executed by a processor, it is used to perform the calibration method of any one of the embodiments of the present invention.
  • the computer storage medium in the embodiments of the present invention may adopt any combination of one or more computer-readable mediums.
  • the computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium 50 .
  • the computer readable storage medium 50 may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above.
  • Computer-readable storage media 50 include: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), flash memory, optical fiber, a portable CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • Computer-readable storage medium 50 may be any tangible medium that contains or stores a program that can be used by or in connection with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a propagated data signal in baseband or as part of a carrier wave, with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium, other than the computer-readable storage medium 50, that can send, propagate, or transport a program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable medium may be transmitted using any suitable medium, including but not limited to: wireless, wire, optical fiber cable, radio frequency (RF), etc., or any suitable combination of the foregoing.
  • Computer program code for carrying out operations of the present invention may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (eg, using an Internet service provider to via Internet connection).

Abstract

A calibration method and apparatus, a terminal device, and a storage medium. The method includes: determining layer attribute information of each layer to be calibrated in a model (S110); and determining, according to total available resources and the layer attribute information of each layer to be calibrated, the group in which each layer to be calibrated is located (S120); the layer attribute information of any layer to be calibrated includes the resources required by that layer, which are the resources occupied when calibrating the layer, and the total available resources are the total resources used for calibration. With this method, all layers to be calibrated can be reasonably grouped within the limit of the total available resources, so that the resources required in each calibration operation are as balanced and as large as possible; resources are thus fully used, the number of calibration operations is reduced, and the computation speed of model calibration is improved.

Description

Calibration method and apparatus, terminal device, and storage medium
Technical Field
The present invention relates to the technical field of data processing, and in particular to a calibration method, a calibration apparatus, a terminal device, and a computer-readable storage medium.
Background
Model quantization is a commonly used technique in deep learning. By quantizing model parameters and inputs from high precision to low precision, for example from float32 (32-bit floating point) to int8 (8-bit integer), the operation speed of the model can be increased and the model size reduced.
To reduce the loss of precision during model quantization, the model needs to be calibrated: typical application data is fed into the model (model inference) to obtain the dynamic range of the data to be quantized that is generated by each layer to be calibrated (which can be specified by the user), and the quantization factor is then determined from this dynamic range (statistic calculation). The quantization factor is used to perform the quantization.
Deep-learning models involve a large amount of computation, the data produced by each layer has large dimensions (large tensors, large sizes), and calibrating this data requires a lot of memory; the memory of the calibration device is limited, so it cannot calibrate all layers to be calibrated at once. In current calibration algorithms, the layers to be calibrated are usually calibrated one after another in the direction of data transmission through the model, with a fixed single layer or a fixed set of adjacent layers calibrated each time.
However, different layers to be calibrated require different amounts of memory during calibration. This can leave the memory required by the layers in some calibration operations close to the upper limit of the calibration device while the memory required in other calibration operations is still very small; the resources of the calibration device cannot be fully used, and many calibration passes and a long time are needed.
Summary of the Invention
Embodiments of the present invention provide a calibration method, a calibration apparatus, a terminal device, and a computer-readable storage medium, which improve the speed of model calibration.
In a first aspect, an embodiment of the present invention provides a calibration method, which includes: determining layer attribute information of each layer to be calibrated in a model; and determining, according to total available resources and the layer attribute information of each layer to be calibrated, the group in which each layer to be calibrated is located; the layer attribute information of any layer to be calibrated includes the resources required by that layer, which are the resources occupied when calibrating the layer, and the total available resources are the total resources used for calibration.
In some embodiments, determining, according to the total available resources and the layer attribute information of each layer to be calibrated, the group in which each layer to be calibrated is located includes: from the layers to be calibrated that do not belong to a group, determining the layer with the largest required resources as the target layer; determining, at least according to the layer attribute information of the target layer, the group in which the target layer is located; subtracting the resources required by the target layer from the group available resources of the group in which the target layer is located; and, if there is still a layer to be calibrated that does not belong to a group, returning to the step of determining, from the layers to be calibrated that do not belong to a group, the layer with the largest required resources as the target layer.
In some embodiments, determining, at least according to the layer attribute information of the target layer, the group in which the target layer is located includes: if there is a group whose group available resources are greater than or equal to the resources required by the target layer, selecting one such group as the group of the target layer; if there is no such group, creating a group, determining the created group as the group of the target layer, and setting the group available resources of the created group equal to the total available resources.
In some embodiments, selecting, when there is a group whose group available resources are greater than or equal to the resources required by the target layer, one such group as the group of the target layer includes: from the groups whose group available resources are greater than or equal to the resources required by the target layer, determining the group with the smallest cost value as the group of the target layer; the cost value of any group is the difference between a first time and a second time of that group, the first time being the time required to calibrate all layers currently in the group together with the target layer, and the second time being the time required to calibrate all layers currently in the group.
In some embodiments, determining, from the groups whose group available resources are greater than or equal to the resources required by the target layer, the group with the smallest cost value as the group of the target layer includes: determining the maximum sequence number of each group, where the maximum sequence number of any group is the largest sequence number among all layers to be calibrated in that group, and the sequence number of any layer to be calibrated is the position of that layer in the model according to a preset processing order; if there is a group whose maximum sequence number is greater than the sequence number of the target layer, selecting one such group as the group of the target layer; and, if there is no group whose maximum sequence number is greater than the sequence number of the target layer, determining the group with the largest maximum sequence number as the group of the target layer.
In some embodiments, the resource is memory.
In some embodiments, after determining, according to the total available resources and the layer attribute information of each layer to be calibrated, the group in which each layer to be calibrated is located, the method further includes: selecting an uncalibrated group and calibrating all layers to be calibrated in that group; and, if an uncalibrated group still remains, returning to the step of selecting an uncalibrated group and calibrating all layers to be calibrated in it.
In a second aspect, an embodiment of the present invention provides a calibration apparatus, which includes: a first determination module for determining layer attribute information of each layer to be calibrated in a model; and a second determination module for determining, according to total available resources and the layer attribute information of each layer to be calibrated, the group in which each layer to be calibrated is located.
In a third aspect, an embodiment of the present invention provides a terminal device, which includes: one or more processors; and a storage device for storing one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors implement any one of the calibration methods provided by the embodiments of the present invention.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, any one of the calibration methods provided by the embodiments of the present invention is implemented.
Embodiments of the present invention provide a calibration method, apparatus, terminal device, and storage medium, in which the layer attribute information (the resources required by each layer) of every layer to be calibrated in the model is first determined; the grouping of the layers to be calibrated, that is, the set of layers corresponding to each calibration operation, is then determined from the required resources of each layer and the total available resources. With this technical solution, all layers to be calibrated can be reasonably grouped within the limit of the total available resources, so that the resources required in each calibration operation are as balanced and as large as possible and resources are fully used; in most cases the number of calibration operations (the number of groups) is reduced, which improves the computation speed of model calibration.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a calibration method provided by an embodiment of the present invention.
FIG. 2 is a schematic diagram of the process of determining the group of a layer to be calibrated in a calibration method provided by an embodiment of the present invention.
FIG. 3 is a schematic structural diagram of a calibration apparatus provided by an embodiment of the present invention.
FIG. 4 is a schematic structural diagram of a terminal device provided by an embodiment of the present invention.
FIG. 5 is a schematic structural diagram of a computer-readable storage medium provided by an embodiment of the present invention.
Detailed Description
The present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present invention, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Before the exemplary embodiments are discussed in more detail, it should be mentioned that some of them are described as processes or methods depicted as flowcharts. Although a flowchart describes the operations (or steps) as sequential processing, many of the operations may be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations may be rearranged. The processing may be terminated when its operations are completed, but may also have additional steps not included in the drawings. The processing may correspond to a method, a function, a procedure, a subroutine, a subprogram, and the like. Furthermore, the embodiments of the present invention and the features in the embodiments may be combined with each other as long as they do not conflict.
The term "including" and its variants as used in the present invention are open-ended, that is, "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment".
Note that concepts such as "first" and "second" mentioned in the present invention are only used to distinguish the corresponding contents and are not intended to limit their order or interdependence.
Note that the modifiers "a"/"one" and "multiple" mentioned in the present invention are illustrative rather than restrictive, and those skilled in the art should understand that, unless the context clearly indicates otherwise, they should be understood as "one or more".
In a first aspect, an embodiment of the present invention provides a calibration method.
The calibration method of the embodiments of the present invention can be applied to model quantization scenarios: the method is used to calibrate the layers to be calibrated in a model to obtain the quantization factor of each layer to be calibrated, and the quantization factors can be used to carry out model quantization.
The calibration method of the embodiments of the present invention may be executed by a calibration apparatus, which may be implemented in software and/or hardware and is generally integrated on a terminal device. In the embodiments of the present invention, the terminal device includes but is not limited to devices such as mobile phones, computers, and personal digital assistants.
Here, the model is any computation model built with deep-learning techniques, such as a neural network model. Each model can handle a certain kind of problem, such as image recognition or speech recognition. Each model includes multiple layers arranged in sequence; each layer performs certain processing (such as convolution or fully connected processing) on the data that enters it (such as the data input to the model, or the data output by the previous layer) and passes the resulting data onward (e.g., to the next layer, or out of the model).
A layer to be calibrated is a layer of the model that needs to be calibrated (to obtain a quantization factor); it may be all or some of the layers of the model. The layers to be calibrated may be determined by user designation or the like, which the present invention does not limit.
FIG. 1 is a schematic flowchart of a calibration method provided by an embodiment of the present invention. The calibration method includes the following steps S110 to S120.
S110: determine the layer attribute information of each layer to be calibrated in the model.
The layer attribute information of any layer to be calibrated includes the resources required by that layer, which are the resources occupied when calibrating the layer.
In this step, the layer attribute information of each layer to be calibrated is first determined; the specific means of determination is not limited, and it may, for example, be determined from the configuration of the layer to be calibrated (such as the size of the input data, the size of the convolution kernel, the stride, and the parameters of its functions).
The layer attribute information includes at least the resources required by the layer, that is, the resources used to process the data generated by that layer during a calibration operation; in other words, the resources the layer must "occupy" in a single calibration operation.
In some embodiments, the resource includes memory.
As one implementation of the embodiments of the present invention, all of the resources (total available resources, resources required by a layer, group available resources, and so on) include at least memory. Of course, these resources may also include other types of resources, such as the computing resources of a processor.
S120: determine, according to the total available resources and the layer attribute information of each layer to be calibrated, the group in which each layer to be calibrated is located.
The total available resources are the total resources used for calibration.
According to the total resources available in each calibration operation (the total available resources) and the resources each layer requires when it is calibrated (the resources required by the layer), the layers to be calibrated are divided into different groups; that is, in the calibration method of the embodiments of the present disclosure, before any calibration operation is actually performed, the layers to be calibrated in each calibration operation can be determined in advance.
Thus, the layers to be calibrated in each group can be calibrated in the same calibration operation.
The sum of the resources required by all layers to be calibrated in each group must not exceed the total available resources (otherwise that calibration operation cannot be performed), but, subject to this requirement, it should be as large as possible, so that the layers assigned to each group use the total available resources as fully as possible and the average number of layers per group is as large as possible, that is, the number of groups needed (the number of calibration operations) is small.
The total resources may be all the resources the calibration device has, that is, the total resources the calibration device can provide in each calibration operation. Of course, the resources of the calibration device (the total available resources) can be reused across different calibration operations.
In the embodiments of the present invention, the layer attribute information (the resources required by each layer) of every layer to be calibrated in the model is first determined; the grouping of the layers to be calibrated, that is, the set of layers corresponding to each calibration operation, is then determined from the required resources of each layer and the total available resources. With this technical solution, all layers to be calibrated can be reasonably grouped within the limit of the total available resources, so that the resources required in each calibration operation are as balanced and as large as possible and resources are fully used; in most cases the number of calibration operations (the number of groups) is reduced, which improves the computation speed of model calibration.
Referring to FIG. 1, in some embodiments, after the group in which each layer to be calibrated is located is determined according to the total available resources and the layer attribute information of each layer to be calibrated (S120), the method further includes the following steps S130 to S140.
S130: select an uncalibrated group and calibrate all layers to be calibrated in that group.
S140: if an uncalibrated group still remains, return to the step of selecting an uncalibrated group and calibrating all layers to be calibrated in it.
After the grouping is determined, the calibration process can start; each calibration operation calibrates all the layers to be calibrated in exactly one group, and the layers in different groups are calibrated separately (that is, in multiple calibration operations).
As above, compared with related techniques that calibrate a predetermined number of layers in a predetermined order in each pass, the approach of the embodiments of the present invention lets the resources of each calibration operation be used as fully as possible, that is, as many layers as possible are calibrated in each calibration operation, so that in most cases the number of groups is smaller, the number of calibration passes is smaller, and the total time required is shorter.
In some embodiments, determining, according to the total available resources and the layer attribute information of each layer to be calibrated, the group in which each layer to be calibrated is located (S120) includes: from the layers to be calibrated that do not belong to a group, determining the layer with the largest required resources as the target layer; determining, at least according to the layer attribute information of the target layer, the group in which the target layer is located; subtracting the resources required by the target layer from the group available resources of the group in which the target layer is located; and, if there is still a layer to be calibrated that does not belong to a group, returning to the step of determining, from the layers to be calibrated that do not belong to a group, the layer with the largest required resources as the target layer.
Here, the "layers to be calibrated that do not belong to a group" are the layers that have not yet been assigned to any group when the above steps are performed, that is, the layers that have not yet been taken as the target layer and grouped as above.
Referring to FIG. 2, the specific grouping process may include: from the layers to be calibrated that have not yet been grouped, selecting the one with the largest required resources (such as the memory occupied in a calibration operation) as the current target layer and assigning it to a group; after the assignment, subtracting the resources required by the target layer from the available resources of that group (the resources of the group that have not yet been "occupied"); and then checking whether any layers to be calibrated without a group remain; if so, selecting a target layer again from the ungrouped layers (the previous target layer has been grouped and is no longer among the ungrouped layers), and otherwise ending.
In some embodiments, determining, at least according to the layer attribute information of the target layer, the group in which the target layer is located includes: if there is a group whose group available resources are greater than or equal to the resources required by the target layer, selecting one such group as the group of the target layer; if there is no such group, creating a group, determining the created group as the group of the target layer, and setting the group available resources of the created group equal to the total available resources.
Referring to FIG. 2, the grouping process for each target layer may include: if, among the currently existing groups, one or more groups have available resources larger than the resources required by the target layer (that is, there are groups that can accommodate the target layer), the target layer can be assigned to such a group (and the resources required by the target layer are subsequently subtracted from that group's available resources); if no such group exists (that is, no current group can accommodate the target layer), a new group needs to be "created" (that is, an additional calibration pass is required), the target layer is assigned to the new group, and the available resources of the new group are set equal to the total available resources (the resources required by the target layer are subsequently subtracted from them).
In some embodiments, selecting, when there is a group whose group available resources are greater than or equal to the resources required by the target layer, one such group as the group of the target layer includes: from the groups whose group available resources are greater than or equal to the resources required by the target layer, determining the group with the smallest cost value as the group of the target layer; the cost value of any group is the difference between a first time and a second time of that group, the first time being the time required to calibrate all layers currently in the group together with the target layer, and the second time being the time required to calibrate all layers currently in the group.
Referring to FIG. 2, when there is a group that can accommodate the target layer, the target layer can be assigned to the group with the smallest cost value (of course, if only one group can accommodate the target layer, it is necessarily the group with the smallest cost value).
The "cost value" of a group indicates the increase in the time of that group's calibration operation after the target layer is assigned to it, relative to the time of the group's calibration operation before the target layer was assigned.
In other words, when the target layer is to be assigned to an existing group, it should be ensured that, after the assignment, the calibration time of that group does not increase or increases only slightly, so as to shorten the calibration time.
In some embodiments, determining, from the groups whose group available resources are greater than or equal to the resources required by the target layer, the group with the smallest cost value as the group of the target layer includes: determining the maximum sequence number of each group, where the maximum sequence number of any group is the largest sequence number among all layers to be calibrated in that group, and the sequence number of any layer to be calibrated is the position of that layer in the model according to a preset processing order; if there is a group whose maximum sequence number is greater than the sequence number of the target layer, selecting one such group as the group of the target layer; and, if there is no group whose maximum sequence number is greater than the sequence number of the target layer, determining the group with the largest maximum sequence number as the group of the target layer.
The sequence number of a layer to be calibrated indicates the position of that layer in the model along the preset data transmission direction; for example, it indicates that the layer is the first layer L1 of the model (sequence number 1), the second layer L2 (sequence number 2), the third layer L3 (sequence number 3), the fourth layer L4 (sequence number 4), and so on.
As one implementation of the embodiments of the present invention, the cost value of a group may be determined specifically from the sequence numbers: if, in some of the groups (of course, groups that can accommodate the target layer), the largest sequence number of the layers already in the group (the group's maximum sequence number) is larger than the sequence number of the target layer, the target layer is assigned to such a group; if, in every group that can accommodate the target layer, the sequence numbers of the existing layers are all smaller than that of the target layer, the group whose maximum sequence number is largest is selected as the group of the target layer.
During calibration, typical data needs to be fed into the device where the model resides and the model is run, so that each layer to be calibrated produces the data used for calibration, which is then input to the calibration apparatus. If the layers corresponding to a given calibration operation are only some of the layers of the model, that operation does not have to run all layers of the model; it only has to run up to the layer to be calibrated with the largest sequence number in that group (of course, all layers before that layer must be run, whether or not they are being calibrated).
Therefore, if the sequence number of the target layer is smaller than the maximum sequence number of the layers already in a group (that is, the last layer that must be run in that group's calibration operation), then after the target layer is assigned to that group, the corresponding calibration operation does not have to run "more layers" of the model; the running time of the model does not increase, the actual time of the calibration operation does not increase, and the cost value of that group is 0 (or close to 0).
Correspondingly, if the sequence number of the target layer is larger than the maximum sequence number of every group, the target layer should be assigned to the group whose maximum sequence number is largest, so that after the assignment the corresponding calibration operation runs the fewest "extra" layers of the model, that is, the cost value is smallest.
Illustratively, the calibration method provided by the present invention can be abstracted as a bin-packing problem.
For example, suppose the model has four layers L1, L2, L3, and L4 arranged in sequence, all of which are layers to be calibrated, and the total memory of the calibration device is 16 GB.
Each group (each calibration operation) corresponds to a box, and the total available resources (the resources, such as memory, available in each calibration operation) correspond to the capacity of each box (the box can also be understood as reusable); each layer to be calibrated corresponds to a pile of sand, and the resources it requires (the resources, such as memory, it occupies during a calibration operation) correspond to the amount of sand. Grouping the layers to be calibrated then corresponds to putting each pile of sand into a corresponding box; after each pile is put in, the remaining space of the box (the group's available resources, such as memory) shrinks. In this process no box may "overflow", and at the same time each box should be filled as much as possible so as to reduce the total number of boxes needed.
Accordingly, the rules for putting the sand into boxes are as follows:
First, determine the memory required by each layer to be calibrated (that is, the size of each pile of sand); then, in descending order of memory, place each layer into one of the already-opened boxes (that is, within the memory a single calibration operation can afford), choosing the box with the smallest cost value; or, when no opened box can hold a given layer to be calibrated, open a new box.
The smallest cost value means that, when this layer is added to the box, the overall computation time of the corresponding calibration operation does not increase or increases only slightly. The computation duration can be determined from the order (sequence numbers) of data transmission in the model; for example, the computation duration of L4 is greater than that of L3, that of L3 greater than that of L2, and that of L2 greater than that of L1. In other words, along the direction of data transmission, a later layer to be calibrated has a longer computation duration than an earlier one.
Illustratively, the calibration method provided by the present invention may be executed by the calibration device of a compiler, and the calibration method includes the following steps:
S1: compute the memory required by each of layers L1-L4; for example, the memory required by L1, L2, L3, and L4 is 8 GB, 1 GB, 10 GB, and 0.5 GB respectively (the amounts of the four piles of sand). Determine that the memory available in each calibration operation is 16 GB (the capacity of each box).
S2: sort the layers to be calibrated in descending order of memory: L3, L1, L2, L4.
S3: pack L3 (packing in descending order of memory); since there is no box yet, open the first box and put L3 into it, leaving 6 GB in the first box.
S4: pack L1; since the first box has only 6 GB left after L3 and cannot hold L1 (8 GB), open the second box and put L1 into it, leaving 8 GB in the second box.
S5: pack L2; traversing the two opened boxes shows that both can hold L2, so the cost of packing must be computed. The first box already holds L3, which comes after L2, so putting L2 into the first box does not increase the computation duration and the cost value is 0; the second box holds L1, which comes before L2, so putting L2 into the second box would change the computation duration to that of L2, a larger cost. Therefore L2 is put into the first box, leaving 5 GB in the first box.
S6: pack L4; again, both boxes can hold L4, and putting L4 into the first box changes the cost from the computation duration of L3 (the second time) to that of L4 (the first time), which is smaller than the cost of putting L4 into the second box (changing from the computation duration of L2 to that of L4). Therefore L4 is put into the first box, leaving 4.5 GB in the first box.
S7: calibrate the layers box by box; for example, in the first calibration operation calibrate L2, L3, and L4 in the first box, and in the second calibration operation calibrate L1 in the second box, so as to determine the corresponding quantization factors.
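The packing of steps S1-S7 can be replayed with a few lines of Python (sizes in GB); this is an illustrative hand-check of the example, not code from the patent:

```python
layers = {"L1": 8.0, "L2": 1.0, "L3": 10.0, "L4": 0.5}   # S1: memory required per layer
box_capacity = 16.0                                       # memory available per calibration operation

order = sorted(layers, key=layers.get, reverse=True)      # S2: descending by memory
assert order == ["L3", "L1", "L2", "L4"]

boxes = [["L3"], ["L1"]]        # S3, S4: L3 opens box 1; L1 does not fit and opens box 2
boxes[0] += ["L2", "L4"]        # S5, S6: both fit box 1, and box 1 has the smaller cost value

def remaining(box):
    return box_capacity - sum(layers[name] for name in box)

assert remaining(boxes[0]) == 4.5 and remaining(boxes[1]) == 8.0
# S7: calibration operation 1 covers L2, L3, L4; calibration operation 2 covers L1.
```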
In a second aspect, as shown in FIG. 3, an embodiment of the present invention provides a calibration apparatus 30. The apparatus 30 is suitable for calibrating the layers to be calibrated in a model; it may be implemented in software and/or hardware and is generally integrated on a terminal device.
The apparatus 30 includes:
a first determination module 31 for determining the layer attribute information of each layer to be calibrated in the model; and
a second determination module 32 for determining, according to the total available resources and the layer attribute information of each layer to be calibrated, the group in which each layer to be calibrated is located.
The calibration apparatus 30 of the embodiments of the present invention can implement the calibration method described in any one of the embodiments of the present invention.
In some embodiments, determining, according to the total available resources and the layer attribute information of each layer to be calibrated, the group in which each layer to be calibrated is located includes: from the layers to be calibrated that do not belong to a group, determining the layer with the largest required resources as the target layer; determining, at least according to the layer attribute information of the target layer, the group in which the target layer is located; subtracting the resources required by the target layer from the group available resources of the group in which the target layer is located; and, if there is still a layer to be calibrated that does not belong to a group, returning to the step of determining, from the layers to be calibrated that do not belong to a group, the layer with the largest required resources as the target layer.
In some embodiments, determining, at least according to the layer attribute information of the target layer, the group in which the target layer is located includes: if there is a group whose group available resources are greater than or equal to the resources required by the target layer, selecting one such group as the group of the target layer; if there is no such group, creating a group, determining the created group as the group of the target layer, and setting the group available resources of the created group equal to the total available resources.
In some embodiments, selecting, when there is a group whose group available resources are greater than or equal to the resources required by the target layer, one such group as the group of the target layer includes: from the groups whose group available resources are greater than or equal to the resources required by the target layer, determining the group with the smallest cost value as the group of the target layer; the cost value of any group is the difference between a first time and a second time of that group, the first time being the time required to calibrate all layers currently in the group together with the target layer, and the second time being the time required to calibrate all layers currently in the group.
In some embodiments, determining, from the groups whose group available resources are greater than or equal to the resources required by the target layer, the group with the smallest cost value as the group of the target layer includes: determining the maximum sequence number of each group, where the maximum sequence number of any group is the largest sequence number among all layers to be calibrated in that group, and the sequence number of any layer to be calibrated is the position of that layer in the model according to a preset processing order; if there is a group whose maximum sequence number is greater than the sequence number of the target layer, selecting one such group as the group of the target layer; and, if there is no group whose maximum sequence number is greater than the sequence number of the target layer, determining the group with the largest maximum sequence number as the group of the target layer.
In some embodiments, the resource is memory.
In some embodiments, after determining, according to the total available resources and the layer attribute information of each layer to be calibrated, the group in which each layer to be calibrated is located, the method further includes: selecting an uncalibrated group and calibrating all layers to be calibrated in that group; and, if an uncalibrated group still remains, returning to the step of selecting an uncalibrated group and calibrating all layers to be calibrated in it.
In a third aspect, referring to FIG. 4, an embodiment of the present invention provides a terminal device 40.
The terminal device 40 includes one or more processors 41 (one processor 41 is taken as an example in FIG. 4) and a storage device 42; the storage device 42 is used to store one or more programs; the one or more programs are executed by the one or more processors 41, so that the one or more processors 41 implement the calibration method according to any one of the embodiments of the present invention.
The terminal device 40 may further include an input device 43 and an output device 44.
The processor 41, storage device 42, input device 43, and output device 44 in the terminal device 40 may be connected by a bus or in other ways; connection by a bus is taken as an example in FIG. 4.
The storage device 42 in the terminal device 40, as a computer-readable storage medium, can be used to store one or more programs, which may be software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the calibration method provided by the embodiments of the present invention (for example, the modules in the calibration apparatus 30 shown in FIG. 3, including the first determination module 31 and the second determination module 32). The processor 41 runs the software programs, instructions, and modules stored in the storage device 42 to execute the various functional applications and data processing of the terminal device 40, that is, to implement the calibration method of the above method embodiments.
The storage device 42 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application required for at least one function, and the data storage area may store data created according to the use of the terminal device 40, and the like. In addition, the storage device 42 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the storage device 42 may further include memory located remotely from the processor 41, and such remote memory may be connected to the device through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 43 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the terminal device 40. The output device 44 may include a display device such as a display screen.
In a fourth aspect, referring to FIG. 5, an embodiment of the present invention provides a computer-readable storage medium 50 on which a computer program is stored; when the computer program is executed by a processor, it is used to perform the calibration method of any one of the embodiments of the present invention.
The computer storage medium of the embodiments of the present invention may use any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium 50. The computer-readable storage medium 50 may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of the computer-readable storage medium 50 include: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), flash memory, optical fiber, a portable CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the above. The computer-readable storage medium 50 may be any tangible medium that contains or stores a program that can be used by, or in connection with, an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium 50 that can send, propagate, or transport a program for use by, or in connection with, an instruction execution system, apparatus, or device.
Program code contained on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wire, optical cable, radio frequency (RF), and the like, or any suitable combination of the above.
Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made without departing from the scope of protection of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to the above embodiments and may include more other equivalent embodiments without departing from the concept of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

  1. A calibration method, characterized by comprising:
    determining layer attribute information of each layer to be calibrated in a model;
    determining, according to total available resources and the layer attribute information of each layer to be calibrated, the group in which each layer to be calibrated is located;
    wherein the layer attribute information of any layer to be calibrated comprises resources required by the layer, the resources required by the layer being the resources occupied when calibrating the layer to be calibrated, and the total available resources being the total resources used for calibration.
  2. The method according to claim 1, characterized in that determining, according to the total available resources and the layer attribute information of each layer to be calibrated, the group in which each layer to be calibrated is located comprises:
    from the layers to be calibrated that do not belong to a group, determining the layer to be calibrated with the largest required resources as a target layer;
    determining, at least according to the layer attribute information of the target layer, the group in which the target layer is located;
    subtracting the resources required by the target layer from the group available resources of the group in which the target layer is located;
    if there is still a layer to be calibrated that does not belong to a group, returning to the step of determining, from the layers to be calibrated that do not belong to a group, the layer to be calibrated with the largest required resources as the target layer.
  3. The method according to claim 2, characterized in that determining, at least according to the layer attribute information of the target layer, the group in which the target layer is located comprises:
    if there is a group whose group available resources are greater than or equal to the resources required by the target layer, selecting one such group as the group in which the target layer is located;
    if there is no group whose group available resources are greater than or equal to the resources required by the target layer, creating a group, determining the created group as the group in which the target layer is located, and setting the group available resources of the created group equal to the total available resources.
  4. The method according to claim 3, characterized in that selecting, if there is a group whose group available resources are greater than or equal to the resources required by the target layer, one such group as the group in which the target layer is located comprises:
    from the groups whose group available resources are greater than or equal to the resources required by the target layer, determining the group with the smallest cost value as the group in which the target layer is located;
    wherein the cost value of any group is the difference between a first time and a second time of the group, the first time being the time required to calibrate all layers currently to be calibrated in the group together with the target layer, and the second time being the time required to calibrate all layers currently to be calibrated in the group.
  5. The method according to claim 4, characterized in that determining, from the groups whose group available resources are greater than or equal to the resources required by the target layer, the group with the smallest cost value as the group in which the target layer is located comprises:
    determining a maximum sequence number of each group, wherein the maximum sequence number of any group is the largest sequence number among all layers to be calibrated in the group, and the sequence number of any layer to be calibrated is the position of that layer in the model according to a preset processing order;
    if there is a group whose maximum sequence number is greater than the sequence number of the target layer, selecting one such group as the group in which the target layer is located;
    if there is no group whose maximum sequence number is greater than the sequence number of the target layer, determining the group with the largest maximum sequence number as the group in which the target layer is located.
  6. The method according to any one of claims 1 to 5, characterized in that
    the resource is memory.
  7. The method according to claim 1, characterized in that, after determining, according to the total available resources and the layer attribute information of each layer to be calibrated, the group in which each layer to be calibrated is located, the method further comprises:
    selecting an uncalibrated group and calibrating all layers to be calibrated in the group;
    if there is still an uncalibrated group, returning to the step of selecting an uncalibrated group and calibrating all layers to be calibrated in the group.
  8. A calibration apparatus, characterized by comprising:
    a first determination module for determining layer attribute information of each layer to be calibrated in a model;
    a second determination module for determining, according to total available resources and the layer attribute information of each layer to be calibrated, the group in which each layer to be calibrated is located.
  9. A terminal device, characterized by comprising:
    one or more processors;
    a storage device for storing one or more programs;
    when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-7.
  10. A computer-readable storage medium, characterized in that a computer program is stored thereon, and when the computer program is executed by a processor, the method according to any one of claims 1-7 is implemented.
PCT/CN2021/108133 2020-07-29 2021-07-23 Calibration method and apparatus, terminal device, and storage medium WO2022022417A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/004,021 US11816547B2 (en) 2020-07-29 2021-07-23 Calibration method and apparatus, terminal device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010747179.2 2020-07-29
CN202010747179.2A CN111915017B (zh) Calibration method and apparatus, terminal device, and storage medium

Publications (1)

Publication Number Publication Date
WO2022022417A1 true WO2022022417A1 (zh) 2022-02-03

Family

ID=73287385

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/108133 WO2022022417A1 (zh) 2020-07-29 2021-07-23 Calibration method and apparatus, terminal device, and storage medium

Country Status (3)

Country Link
US (1) US11816547B2 (zh)
CN (1) CN111915017B (zh)
WO (1) WO2022022417A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111915017B (zh) 2020-07-29 2023-11-24 北京灵汐科技有限公司 Calibration method and apparatus, terminal device, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3382989A1 (en) * 2017-03-31 2018-10-03 Solarflare Communications Inc Network interface device
CN110058943A (zh) * 2019-04-12 2019-07-26 三星(中国)半导体有限公司 用于电子设备的内存优化方法和设备
CN110389824A (zh) * 2018-04-20 2019-10-29 伊姆西Ip控股有限责任公司 处理计算任务的方法、设备和计算机程序产品
CN110738316A (zh) * 2018-07-20 2020-01-31 北京三星通信技术研究有限公司 基于神经网络的操作方法、装置及电子设备
WO2020093306A1 (zh) * 2018-11-08 2020-05-14 北京比特大陆科技有限公司 神经网络层分组方法、装置、设备、存储介质及程序产品
CN111915017A (zh) * 2020-07-29 2020-11-10 北京灵汐科技有限公司 一种校准方法、装置、终端设备及存储介质

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220101133A1 (en) * 2020-09-29 2022-03-31 Qualcomm Incorporated Dynamic quantization for energy efficient deep learning

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3382989A1 (en) * 2017-03-31 2018-10-03 Solarflare Communications Inc Network interface device
CN110389824A (zh) * 2018-04-20 2019-10-29 伊姆西Ip控股有限责任公司 处理计算任务的方法、设备和计算机程序产品
CN110738316A (zh) * 2018-07-20 2020-01-31 北京三星通信技术研究有限公司 基于神经网络的操作方法、装置及电子设备
WO2020093306A1 (zh) * 2018-11-08 2020-05-14 北京比特大陆科技有限公司 神经网络层分组方法、装置、设备、存储介质及程序产品
CN110058943A (zh) * 2019-04-12 2019-07-26 三星(中国)半导体有限公司 用于电子设备的内存优化方法和设备
CN111915017A (zh) * 2020-07-29 2020-11-10 北京灵汐科技有限公司 一种校准方法、装置、终端设备及存储介质

Also Published As

Publication number Publication date
US20230196197A1 (en) 2023-06-22
CN111915017B (zh) 2023-11-24
US11816547B2 (en) 2023-11-14
CN111915017A (zh) 2020-11-10

Similar Documents

Publication Publication Date Title
US9766818B2 (en) Electronic system with learning mechanism and method of operation thereof
US20210304063A1 (en) Machine Learning Model For Micro-Service Compliance Requirements
US20240095538A1 (en) Privacy-preserving graphical model training methods, apparatuses, and devices
US9210219B2 (en) Systems and methods for consistent hashing using multiple hash rings
US11928599B2 (en) Method and device for model compression of neural network
US20090106747A1 (en) Dynamic class loading
US20220012592A1 (en) Methods and apparatus to perform weight and activation compression and decompression
WO2022022417A1 (zh) 一种校准方法、装置、终端设备及存储介质
WO2024051270A1 (zh) 任务执行的方法、装置、存储介质及电子设备
US20190391809A1 (en) Programs with serializable state
CN115081598B (zh) 算子处理方法及装置、电子设备、计算机可读存储介质
US20220374742A1 (en) Method, device and storage medium for running inference service platform
KR20220056621A (ko) 매니코어 시스템을 위한 뉴럴 네트워크 모델 처리의 병렬화 방법 및 장치
CN114841323A (zh) 神经网络计算图的处理方法及处理装置
CN116933886B (zh) 一种量子计算执行方法、系统、电子设备及存储介质
CN109377348A (zh) 应用于助贷业务系统的业务接口调用方法及助贷业务系统
US9311146B2 (en) Strategic placement of jobs for spatial elasticity in a high-performance computing environment
CN111738424A (zh) 神经网络处理方法、装置、电子设备及存储介质
US20210357753A1 (en) Method and apparatus for multi-level stepwise quantization for neural network
CN111582456B (zh) 用于生成网络模型信息的方法、装置、设备和介质
CN102063308A (zh) 一种用于地震勘探资料处理流程控制的方法
CN106407345A (zh) 一种脏数据更新方法及装置
Bengre et al. A learning-based scheduler for high volume processing in data warehouse using graph neural networks
US20190385091A1 (en) Reinforcement learning exploration by exploiting past experiences for critical events
US20230418822A1 (en) Method and system for configurable data analytics platform

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21848837

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16.05.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21848837

Country of ref document: EP

Kind code of ref document: A1