CN108288089A - Method and apparatus for generating convolutional neural networks - Google Patents
- Publication number
- CN108288089A CN108288089A CN201810084926.1A CN201810084926A CN108288089A CN 108288089 A CN108288089 A CN 108288089A CN 201810084926 A CN201810084926 A CN 201810084926A CN 108288089 A CN108288089 A CN 108288089A
- Authority
- CN
- China
- Prior art keywords
- weight
- input value
- quantization
- rounding
- initial
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
Embodiments of the present application disclose a method and apparatus for generating a convolutional neural network. One implementation of the method includes: obtaining initial input values and initial weights of a target convolutional layer of a convolutional neural network, to generate a first input value set and a first weight set respectively; performing quantization rounding, according to the quantization ratio of the current stage, on the first input value set and the first weight set, to generate a second input value set and a second weight set, where the second weight set includes the first weights that have been quantized and rounded; using the second weight set as the weights of the target convolutional layer, and determining whether the total quantization ratio of the current stage reaches a preset ratio value; and, in response to determining that the total quantization ratio of the current stage reaches the preset ratio value, generating and storing a target convolutional neural network. This embodiment implements multi-stage quantization of the target convolutional layer's weights, which helps improve the flexibility of methods for generating convolutional neural networks.
Description
Technical field
Embodiments of the present application relate to the field of computer technology, in particular to the field of neural network technology, and more particularly to a method and apparatus for generating a convolutional neural network.
Background technology
The concept of deep learning is derived from research on artificial neural networks. Deep learning combines low-level features to form more abstract high-level representations of attribute categories or features, in order to discover distributed feature representations of data. Deep learning is a new field in machine learning research; its motivation is to build and simulate neural networks that analyze and learn like the human brain. It interprets data such as images, sound, and text by imitating the mechanisms of the human brain.
As with machine learning methods in general, deep machine learning methods are divided into supervised and unsupervised learning. The learning models established under the different learning frameworks differ considerably. For example, a convolutional neural network (CNN) is a machine learning model for deep supervised learning, while a deep belief network (DBN) is a machine learning model for unsupervised learning.
Summary of the invention
Embodiments of the present application propose a method and apparatus for generating a convolutional neural network.
In a first aspect, an embodiment of the present application provides a method for generating a convolutional neural network, including: obtaining initial input values and initial weights of a target convolutional layer of a convolutional neural network; generating a first input value set and a first weight set from the initial input values and initial weights, respectively; performing quantization rounding, according to the quantization ratio of the current stage, on each first input value in the first input value set and each first weight in the first weight set, to generate a second input value set and a second weight set, where the second weight set includes the first weights that have been quantized and rounded; using the second weight set as the weights of the target convolutional layer, and determining whether the total quantization ratio of the target convolutional layer's weights at the current stage reaches a preset ratio value; and, in response to determining that the total quantization ratio of the target convolutional layer's weights at the current stage reaches the preset ratio value, taking the convolutional neural network of the current stage as the target convolutional neural network and storing it.
In some embodiments, the method further includes: in response to determining that the total quantization ratio of the target convolutional layer's weights at the current stage has not reached the preset ratio value, training the convolutional neural network of the current stage according to a loss function, to adjust the first weights in the second weight set that have not yet been quantized and rounded, until the value of the loss function approaches an expected value; obtaining the quantization ratio of the next stage as the quantization ratio of the current stage; performing quantization rounding, according to the quantization ratio of the current stage, on the first input values in the second input value set and the first weights in the second weight set that have not yet been quantized and rounded, to update the second input value set and the second weight set; and using the updated second weight set as the weights of the target convolutional layer, and again determining whether the total quantization ratio of the target convolutional layer's weights at the current stage reaches the preset ratio value.
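The staged loop described in this embodiment — quantize part of the weights, retrain the remainder against the loss, raise the ratio, and repeat until the preset ratio value is reached — can be sketched as follows. This is an illustrative reconstruction rather than the patent's reference implementation: the stage schedule, the smallest-rounding-error selection, and the stubbed-out retraining step are all assumptions.

```python
import numpy as np

def multi_stage_quantize(weights, schedule=(0.25, 0.5, 0.75, 1.0), target_ratio=1.0):
    """Quantize `weights` to integers in stages; between stages the still-float
    weights would normally be fine-tuned against the loss (stubbed out here)."""
    w = weights.astype(np.float64).copy()
    quantized = np.zeros(w.size, dtype=bool)      # mask of already-rounded weights
    for ratio in schedule:
        k = int(round(ratio * w.size))            # total count to be integer after this stage
        # pick the not-yet-quantized weights with the smallest rounding error first
        err = np.abs(w - np.rint(w))
        err[quantized] = np.inf
        need = k - quantized.sum()
        if need > 0:
            idx = np.argsort(err)[:need]
            w.flat[idx] = np.rint(w.flat[idx])    # quantization rounding
            quantized[idx] = True
        if ratio >= target_ratio:                 # total quantization ratio reached
            break
        # ... here the unquantized weights would be retrained w.r.t. the loss ...
    return w, quantized

w = np.array([0.2, 1.7, 2.4, 3.9, 5.1, 6.6])
qw, mask = multi_stage_quantize(w, schedule=(0.5, 1.0))
```

After the final stage every weight is an integer; with an intermediate schedule, only the selected fraction is rounded and the rest remain floating point for retraining.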
In some embodiments, generating the first input value set and the first weight set from the initial input values and initial weights includes: according to the number of bits of the quantization encoding, uniformly dividing the range of the initial input values and the range of the initial weights into a preset number of subintervals, where the preset number is positively correlated with the number of bits of the quantization encoding; and generating the first input value set and the first weight set from the input values and weights located in the preset number of subintervals.
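As a minimal sketch of this uniform partition: with b-bit encoding the preset number of subintervals can be taken as 2^b (one concrete choice of the "positive correlation"; the clipping of boundary values is also an assumption), and each value mapped to the index of the subinterval containing it.

```python
import numpy as np

def uniform_partition(values, bits=8):
    """Map each value to the index of its uniform subinterval.
    The number of subintervals, 2**bits, grows with the encoding width."""
    n = 2 ** bits                           # preset number, positively correlated with `bits`
    lo, hi = values.min(), values.max()
    edges = np.linspace(lo, hi, n + 1)      # n equal-length subintervals over [lo, hi]
    idx = np.digitize(values, edges[1:-1])  # subinterval index in [0, n-1]
    return idx, edges

vals = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
idx, edges = uniform_partition(vals, bits=2)  # 4 subintervals of length 0.5
```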
In some embodiments, performing quantization rounding on each first input value in the first input value set and each first weight in the first weight set, to generate the second input value set and the second weight set, includes: performing quantization rounding on each first input value in the first input value set according to a preset quantization method, and taking the first input values after quantization rounding as second input values, to generate the second input value set; and, according to the distribution probability of each first weight in the first weight set, performing upper rounding or lower rounding on each first weight in the first weight set, and taking the first weights after rounding as second weights, to generate the second weight set.
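One common reading of rounding each first weight up or down "according to its distribution probability" is stochastic rounding, with the fractional part of the weight as the probability of rounding up, which makes the rounding unbiased in expectation. The patent leaves the distribution unspecified, so this choice is an assumption:

```python
import numpy as np

def stochastic_round(w, rng=None):
    """Round each weight up with probability equal to its fractional part,
    down otherwise, so the rounding is unbiased in expectation."""
    rng = np.random.default_rng(0) if rng is None else rng
    floor = np.floor(w)
    frac = w - floor                    # probability of choosing the upper rounding
    up = rng.random(w.shape) < frac
    return floor + up                   # upper rounding where `up`, lower rounding elsewhere

w = np.array([0.1, 0.5, 0.9, 2.0])
rounded = stochastic_round(w)
```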
In some embodiments, generating the first input value set and the first weight set from the initial input values and initial weights includes: according to a preset first value, dividing the range of the initial input values and the range of the initial weights into a preset number of subintervals of unequal length; taking the first value as the base, computing the logarithm of each initial input value and taking the result as a first input value, to generate the first input value set; and, from the initial weights located in each subinterval, determining an interval weight to serve as a first weight, to generate the first weight set.
In some embodiments, performing quantization rounding on each first input value in the first input value set and each first weight in the first weight set, to generate the second input value set and the second weight set, includes: performing quantization rounding on each first input value in the first input value set, raising the first value to the power of each first input value after quantization rounding, and taking the result as a second input value, to generate the second input value set; and, in the order of the subintervals, assigning a serial number to each first weight in turn to generate a lookup table, and taking the serial number corresponding to each first weight as a second weight, to generate the second weight set, where the serial numbers are integers and are stored in the lookup table together with the first weights in the form of key-value pairs.
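One way to read this embodiment: log-domain input values are rounded and the base is raised back to the rounded exponent, while interval weights are replaced by integer serial numbers resolved through a key-value lookup table. A sketch under those assumptions (the base and the example weights are illustrative):

```python
import math

def log_quantize_inputs(inputs, base=2):
    """Round each log-domain input value, then take base**rounded as the second input value."""
    return [base ** round(math.log(x, base)) for x in inputs]

def build_lookup(interval_weights):
    """Assign integer serial numbers to the interval weights in subinterval order.
    The lookup table stores serial -> weight as key-value pairs; the serials
    themselves serve as the (integer) second weights."""
    table = {i: w for i, w in enumerate(interval_weights)}
    second_weights = list(table.keys())
    return table, second_weights

inputs = [3.0, 5.0, 17.0]
q = log_quantize_inputs(inputs, base=2)       # rounded exponents 2, 2, 4
table, serials = build_lookup([0.1, 0.4, 0.9])
```

At inference time a serial number is mapped back to its interval weight by a table lookup, so only small integers need to be stored per weight.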
In some embodiments, performing quantization rounding, according to the quantization ratio of the current stage, on each first input value in the first input value set and each first weight in the first weight set includes: selecting, from the first input value set and the first weight set respectively, a corresponding number of first input values and first weights, and performing quantization rounding on them, where the selection may proceed in descending order of magnitude, starting from the large end, or in ascending order of quantization error, starting from the small-error end.
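The two selection orders mentioned — by magnitude from the large end, or by quantization error from the small end — might be sketched like this (round-to-nearest as the error measure and the tie-breaking behavior are assumptions):

```python
import numpy as np

def select_for_quantization(values, ratio, by="magnitude"):
    """Return the indices of the fraction `ratio` of values to quantize first."""
    k = int(round(ratio * len(values)))
    if by == "magnitude":                        # descending by |value|, from the large end
        order = np.argsort(-np.abs(values))
    else:                                        # ascending by rounding error, from the small end
        order = np.argsort(np.abs(values - np.rint(values)))
    return order[:k]

v = np.array([0.9, -3.2, 1.1, 2.05])
big = select_for_quantization(v, 0.5, by="magnitude")  # the two largest magnitudes
low = select_for_quantization(v, 0.5, by="error")      # the two smallest rounding errors
```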
In some embodiments, the method further includes: obtaining initial input information of the target convolutional layer of the target convolutional neural network; performing quantization rounding on the initial input information to obtain integer input values; and feeding the integer input values into the target convolutional layer and performing a convolution operation with the weights of the target convolutional layer, to generate output information.
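At inference time, the quantized layer thus performs convolution on integer inputs with integer weights. A minimal 1-D illustration (round-to-nearest input quantization and the shapes are assumptions; a real layer would also carry scale factors back to floating point):

```python
import numpy as np

def quantized_conv1d(float_inputs, int_weights):
    """Round the incoming activations to integers, then slide the layer's
    (already integer) weights over them, keeping all arithmetic in integers."""
    x = np.rint(float_inputs).astype(np.int64)   # quantization rounding of the input
    w = np.asarray(int_weights, dtype=np.int64)
    n = x.size - w.size + 1
    return np.array([int(x[i:i + w.size] @ w) for i in range(n)])

out = quantized_conv1d([0.9, 2.1, 2.9, 4.2], [1, 0, -1])
```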
In a second aspect, an embodiment of the present application provides an apparatus for generating a convolutional neural network, including: a first obtaining unit, configured to obtain initial input values and initial weights of a target convolutional layer of a convolutional neural network; a first generation unit, configured to generate a first input value set and a first weight set from the initial input values and initial weights, respectively; a first quantization unit, configured to perform quantization rounding, according to the quantization ratio of the current stage, on each first input value in the first input value set and each first weight in the first weight set, to generate a second input value set and a second weight set, where the second weight set includes the first weights that have been quantized and rounded; a determination unit, configured to use the second weight set as the weights of the target convolutional layer and determine whether the total quantization ratio of the target convolutional layer's weights at the current stage reaches a preset ratio value; and a second generation unit, configured to, in response to determining that the total quantization ratio of the target convolutional layer's weights at the current stage reaches the preset ratio value, take the convolutional neural network of the current stage as the target convolutional neural network and store it.
In some embodiments, the apparatus is further configured to: in response to determining that the total quantization ratio of the target convolutional layer's weights at the current stage has not reached the preset ratio value, train the convolutional neural network of the current stage according to a loss function, to adjust the first weights in the second weight set that have not yet been quantized and rounded, until the value of the loss function approaches an expected value; obtain the quantization ratio of the next stage as the quantization ratio of the current stage; perform quantization rounding, according to the quantization ratio of the current stage, on the first input values in the second input value set and the first weights in the second weight set that have not yet been quantized and rounded, to update the second input value set and the second weight set; and use the updated second weight set as the weights of the target convolutional layer, again determining whether the total quantization ratio of the target convolutional layer's weights at the current stage reaches the preset ratio value.
In some embodiments, the first generation unit is further configured to: according to the number of bits of the quantization encoding, uniformly divide the range of the initial input values and the range of the initial weights into a preset number of subintervals, where the preset number is positively correlated with the number of bits of the quantization encoding; and generate the first input value set and the first weight set from the input values and weights located in the preset number of subintervals.
In some embodiments, the first quantization unit is further configured to: perform quantization rounding on each first input value in the first input value set according to a preset quantization method, and take the first input values after quantization rounding as second input values, to generate the second input value set; and, according to the distribution probability of each first weight in the first weight set, perform upper rounding or lower rounding on each first weight in the first weight set, and take the first weights after rounding as second weights, to generate the second weight set.
In some embodiments, the first generation unit is further configured to: according to a preset first value, divide the range of the initial input values and the range of the initial weights into a preset number of subintervals of unequal length; taking the first value as the base, compute the logarithm of each initial input value and take the result as a first input value, to generate the first input value set; and, from the initial weights located in each subinterval, determine an interval weight to serve as a first weight, to generate the first weight set.
In some embodiments, the first quantization unit is further configured to: perform quantization rounding on each first input value in the first input value set, raise the first value to the power of each first input value after quantization rounding, and take the result as a second input value, to generate the second input value set; and, in the order of the subintervals, assign a serial number to each first weight in turn to generate a lookup table, and take the serial number corresponding to each first weight as a second weight, to generate the second weight set, where the serial numbers are integers and are stored in the lookup table together with the first weights in the form of key-value pairs.
In some embodiments, the first quantization unit is further configured to: according to the quantization ratio of the current stage, select a corresponding number of first input values and first weights from the first input value set and the first weight set respectively, and perform quantization rounding on them, where the selection may proceed in descending order of magnitude, starting from the large end, or in ascending order of quantization error, starting from the small-error end.
In some embodiments, the apparatus further includes: a second obtaining unit, configured to obtain initial input information of the target convolutional layer of the target convolutional neural network; a second quantization unit, configured to perform quantization rounding on the initial input information to obtain integer input values; and a third generation unit, configured to feed the integer input values into the target convolutional layer and perform a convolution operation with the weights of the target convolutional layer, to generate output information.
In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; and a storage apparatus for storing one or more programs, which, when executed by the one or more processors, cause the one or more processors to implement the method described in any embodiment of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the method described in any embodiment of the first aspect.
The method and apparatus for generating a convolutional neural network provided by the embodiments of the present application obtain the initial input values and initial weights of a target convolutional layer of a convolutional neural network, from which a first input value set and a first weight set can be generated respectively. Then, according to the quantization ratio of the current stage, quantization rounding can be performed on each first input value in the first input value set and each first weight in the first weight set, to generate a second input value set and a second weight set, where the second weight set includes the first weights that have been quantized and rounded. The second weight set can then be used as the weights of the target convolutional layer; that is, at least part of the target convolutional layer's weights are converted into integer weights. At the same time, it can be determined whether the total quantization ratio of the target convolutional layer's weights at the current stage reaches a preset ratio value. If it is determined that the total quantization ratio of the target convolutional layer's weights at the current stage reaches the preset ratio value, the convolutional neural network of the current stage can be taken as the target convolutional neural network and stored. In the target convolutional layer of the target convolutional neural network at this point, the overall proportion of integer weights is the above preset ratio value. This embodiment implements multi-stage quantization of the target convolutional layer's weights, which enriches methods for generating convolutional neural networks and helps improve their flexibility.
Description of the drawings
Other features, objects, and advantages of the present application will become more apparent from the detailed description of non-restrictive embodiments given below with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture diagram to which the present application can be applied;
Fig. 2 is a flowchart of one embodiment of the method for generating a convolutional neural network according to the present application;
Fig. 3 is a structural schematic diagram of one embodiment of the apparatus for generating a convolutional neural network according to the present application;
Fig. 4 is a structural schematic diagram of a computer system adapted to implement the electronic device of the embodiments of the present application.
Detailed description of the embodiments
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are used only to explain the relevant invention and do not limit it. It should also be noted that, for convenience of description, the accompanying drawings show only the parts relevant to the invention.
It should be noted that, in the absence of conflict, the embodiments of the present application and the features in the embodiments can be combined with each other. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which the method or apparatus for generating a convolutional neural network of the present application can be applied.
As shown in Fig. 1, the system architecture 100 may include terminals 101, 102, 103, a network 104, and a server 105. The network 104 provides the medium for communication links between the terminals 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber-optic cables.
Users can use the terminals 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages. Various client applications can be installed on the terminals 101, 102, 103, such as neural-network training applications, web browsers, search applications, shopping applications, and instant messaging tools.
The terminals 101, 102, 103 can be various electronic devices with display screens, including but not limited to smartphones, tablet computers, e-book readers, laptop computers, and desktop computers.
The server 105 can be a server that provides various services, such as a background server providing support for the applications displayed on the terminals 101, 102, 103. The background server can analyze and process the initial input values and initial weights of the target convolutional layer sent by the terminals 101, 102, 103, so as to train the convolutional neural network, and can send the processing result (for example, the generated target convolutional neural network) to the terminals 101, 102, 103. In the target convolutional layer of the target convolutional neural network, at least part of the weights are integers.
It should be noted that the method for generating a convolutional neural network provided by the embodiments of the present application is generally executed by the server 105; correspondingly, the apparatus for generating a convolutional neural network is generally located in the server 105.
It should be understood that the numbers of terminals, networks, and servers in Fig. 1 are only illustrative. Any number of terminals, networks, and servers can be used as required.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for generating a convolutional neural network according to the present application is illustrated. The method for generating a convolutional neural network may include the following steps:
Step 201: obtain the initial input values and initial weights of a target convolutional layer of a convolutional neural network.
In this embodiment, the electronic device on which the method for generating a convolutional neural network runs (for example, the server 105 shown in Fig. 1) can obtain the initial input values and initial weights of the target convolutional layer of the convolutional neural network by a variety of methods — for example, via a wired or wireless connection, from a terminal in communication with it (such as the terminals 101, 102, 103 shown in Fig. 1), from a database server, or from a cloud server.
In this embodiment, the convolutional neural network can be a convolutional neural network with various functions or uses — for example, a convolutional neural network for face detection or face recognition. It can be a trained convolutional neural network or one still to be trained. A convolutional neural network is generally a feed-forward neural network whose artificial neurons respond to surrounding units within part of the coverage area; it performs outstandingly for large-scale image processing. A convolutional neural network may generally include convolutional layers, pooling layers, fully connected layers, and so on. The target convolutional layer can be any convolutional layer in the target convolutional neural network.
In this embodiment, the initial input values can be arbitrary input information fed into the target convolutional layer — for example, input information describing a face image, or the output information of a preceding convolutional layer. The initial weights can be arbitrary weights of the target convolutional layer — for example, manually set initial weights, or weights corrected by methods such as the back-propagation algorithm. Here, the initial input values and initial weights can be integer values and/or floating-point values.
It should be noted that, when the convolution kernel of a certain convolutional layer is 1 × 1, the output of that layer has the same size as its input. In that case, in order to avoid the error introduced by quantizing the layer's weights to integers, the weights of that layer may be left unquantized. That is, the target convolutional layer may exclude convolutional layers whose convolution kernel is 1 × 1. In addition, the storage location of the convolutional neural network is not limited in this application; for example, it can be stored locally on the electronic device, or on a database server or cloud server.
Step 202: generate a first input value set and a first weight set from the initial input values and initial weights, respectively.
In this embodiment, the electronic device can use various methods to generate the first input value set and the first weight set from the initial input values and initial weights obtained in step 201.
In some optional implementations of this embodiment, the electronic device can directly take each initial input value as a first input value, to generate the first input value set, and directly take each initial weight as a first weight, to generate the first weight set.
Optionally, the electronic device can also divide the range of the initial weights into a certain number of subintervals; then, from the initial weights located in each subinterval, it can determine an interval weight to serve as a first weight, thereby generating the first weight set. The division method may include at least one of random, equal-interval, or unequal-interval division. The number of subintervals can be set according to actual conditions.
Further, the electronic device can use a uniform quantization method: first, according to the number of bits of the quantization encoding, the range of the initial input values and the range of the initial weights are each uniformly divided into a preset number of subintervals, where the preset number is positively correlated with the number of bits of the quantization encoding. Then, the first input value set and the first weight set can be generated from the input values and weights located in the preset number of subintervals. Here, the number of bits of the quantization encoding can be the number of bits of the encoding used by the electronic device to record and transmit information, such as the number of binary or decimal digits.
For example, the electronic device can directly divide the range of the initial input values uniformly into the preset number of subintervals, and likewise directly divide the range of the initial weights uniformly into the preset number of subintervals. Then each initial input value located in the subintervals can be taken as a first input value, to generate the first input value set, and each initial weight located in the subintervals can be taken as a first weight, to generate the first weight set. Alternatively, for each subinterval, an interval weight can be determined from the initial weights in that subinterval and used as the first weight, to generate the first weight set. The determination method may include computing the arithmetic mean or the median of the initial weights located in the subinterval.
As another example, the electronic device can first scale the range of the initial input values to a preset range, and then divide that preset range uniformly into the preset number of subintervals. Each input value in the preset range — that is, each scaled initial input value — can then be taken as a first input value, to generate the first input value set. Likewise, the range of the initial weights can be scaled to the preset range, which is then divided uniformly into the preset number of subintervals; each weight in the preset range — that is, each scaled initial weight — can be taken as a first weight, to generate the first weight set. The boundary values of the preset range can be integers.
As an example, suppose the quantization encoding has 8 bits, i.e., the binary encoding is 8 bits wide. If the range of the initial weights is [a, b], the electronic device can first scale this range into [0, 2^8]; the range [0, 2^8] can then be divided uniformly into 256 subintervals. Each subinterval then has length 1, i.e., the subintervals are [0, 1], [1, 2], [2, 3], ..., [255, 256].
Alternatively, the electronic device can first normalize [a, b] to the range [0, 1]; the normalized weight range can then be divided uniformly into 256 subintervals, i.e., subintervals of length 1/256, and afterwards scaled to [0, 2^8] so that the boundary values of the subintervals are integers, i.e., [0, 1], [1, 2], [2, 3], and so on.
In some optional implementations of this embodiment, the electronic device can also use a logarithmic quantization method: according to a preset first value, the range of the initial input values and the range of the initial weights are divided into a preset number of subintervals of unequal length; taking the first value as the base, the logarithm of each initial input value is computed and the result taken as a first input value, to generate the first input value set; and, from the initial weights located in each subinterval, an interval weight is determined as a first weight, to generate the first weight set. Here the determination method is likewise not limited. The first value can be a positive integer other than 1.
For example, suppose the quantization encoding has 8 bits, i.e., the binary encoding is 8 bits wide, the range of the initial input values x is [a, b], and the first value is r. The electronic device can then divide the range into 256 subintervals of lengths (b-a)/r, (b-a)/r^2, (b-a)/r^3, ..., (b-a)/r^256, and take log_r x as the first input value, thereby generating the first input value set.
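The worked example above — geometric subintervals of lengths (b-a)/r^k and log-domain first input values — can be checked numerically as follows. The concrete values of a, b, and r below are illustrative, not taken from the patent:

```python
import math

def geometric_lengths(a, b, r, n):
    """Subinterval lengths (b-a)/r, (b-a)/r**2, ..., (b-a)/r**n."""
    return [(b - a) / r ** k for k in range(1, n + 1)]

def log_first_input(x, r):
    """First input value log_r(x) for an initial input value x."""
    return math.log(x, r)

lengths = geometric_lengths(0.0, 8.0, 2, 3)   # 4.0, 2.0, 1.0
v = log_first_input(8.0, 2)                   # log_2 8 = 3
```

Note that the subintervals shrink geometrically toward one end of the range, which matches the log-domain representation: equal steps in log_r(x) correspond to subintervals of these unequal lengths.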
Step 203: according to the quantization ratio of the current stage, perform quantization rounding respectively on each first input value in the first input value set and each first weight in the first weight set, generating a second input value set and a second weight set.
In the present embodiment, the electronic device may, according to the quantization ratio of the current stage, select corresponding numbers of first input values and first weights from the first input value set and the first weight set respectively and perform quantization rounding on them, thereby generating the second input value set and the second weight set. The specific selection method is not limited: for example, the values may be selected at random; they may be selected in descending order of numerical value, starting from the larger values; or they may be selected in ascending order of quantization error, starting from the smaller errors. The quantization method here is likewise not limited.
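One of the selection rules the passage allows (ascending quantization error, smallest errors first) can be sketched as follows. The function name and the error measure |v - round(v)| are illustrative assumptions, not fixed by the application.

```python
def select_for_quantization(values, ratio):
    """Pick ratio * len(values) indices to quantize at this stage,
    taking the values whose rounding error |v - round(v)| is smallest."""
    n = int(len(values) * ratio)
    by_error = sorted(range(len(values)),
                      key=lambda i: abs(values[i] - round(values[i])))
    return set(by_error[:n])
```

For example, with values [0.9, 0.5, 1.1, 2.0] and a 50% ratio, the two values closest to an integer (2.0 and 0.9) are selected first.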
In the present embodiment, the second input value set may include first input values after quantization rounding, and the second weight set may include first weights after quantization rounding. It should be noted that before all first input values in the first input value set have been quantized, the second input value set may also include unquantized first input values; likewise, before all first weights in the first weight set have been quantized, the second weight set may also include unquantized first weights.
It can be understood that the manner of obtaining the quantization ratio of the current stage is not limited in this application. For example, it may be obtained by the electronic device from a local preset parameter file, obtained by the electronic device from a cloud server, or sent by a user to the electronic device through a terminal.
In some optional implementations of the present embodiment, for the uniform quantization method, the electronic device may perform quantization rounding on each first input value in the first input value set according to a preset quantization method, and take each quantization-rounded first input value as a second input value to generate the second input value set. The preset quantization method may be configured according to actual conditions; for example, the round-half-up method may be used.
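The round-half-up rule mentioned above can be sketched as follows. Note that Python's built-in `round` uses round-half-to-even (banker's rounding), so the rule is written out explicitly; names are illustrative.

```python
import math

def round_half_up(x):
    """Round-half-up: values with fractional part >= 0.5 round toward +inf."""
    return math.floor(x + 0.5)

def quantize_inputs(first_inputs):
    """Quantization-round every first input value to obtain second input values."""
    return [round_half_up(x) for x in first_inputs]
```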
Further, the electronic device may, according to the distribution probability of each first weight in the first weight set, perform randomized rounding (upper rounding or lower rounding) on each first weight in the first weight set, and take each rounded first weight as a second weight to generate the second weight set. For example, for the first weights in the subinterval [0, 1], if at least half of them are distributed in [0.5, 1], each first weight in that subinterval may be rounded up, i.e., all quantization-rounded to 1. This helps reduce the quantization error.
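The majority rule in the example above (round a whole subinterval up when at least half of its weights lie in the upper half) can be sketched as follows; the function name and default bounds are illustrative.

```python
def round_subinterval(weights, lo=0.0, hi=1.0):
    """If at least half of the weights in [lo, hi] lie in the upper half,
    round every weight in the subinterval up to hi; otherwise down to lo."""
    mid = (lo + hi) / 2.0
    in_upper = sum(1 for w in weights if w >= mid)
    target = hi if 2 * in_upper >= len(weights) else lo
    return [target for _ in weights]
```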
It can be understood that using the method of evenly dividing subintervals can make the post-quantization weight distribution more uniform. Meanwhile, using randomized rounding can offset the shrinkage of the overall parameter distribution space caused by precision loss, thereby enlarging the distribution space of the parameters.
Optionally, for the logarithmic quantization method, the electronic device may perform quantization rounding on each first input value in the first input value set, i.e., compute round(log_r(x)), and then, taking each quantization-rounded first input value as an exponent, compute the corresponding power of the first numerical value, i.e., r^n where n = round(log_r(x)), as the second input value, thereby generating the second input value set. Meanwhile, the electronic device may assign serial numbers to each first weight successively in the order of the subintervals, generating a lookup table, and may take the serial number corresponding to each first weight as a second weight, thereby generating the second weight set. The serial numbers are integers, and in the lookup table each serial number is stored with its first weight in the form of a key-value pair.
For example, ordered by numerical value, since [a, a + (b-a)/r] is the first subinterval, the serial number corresponding to a first weight in that subinterval may be 1, and the serial numbers corresponding to first weights located in the other subintervals may be 2, 3, 4, and so on in turn. That is, by replacing weights with integer serial numbers, the storage space required by the convolutional neural network can be reduced. At the same time, performing actual quantization rounding on the weights can be avoided, which simplifies the processing procedure and improves operation efficiency, and the error that quantization rounding would introduce can also be avoided.
Step 204: take the second weight set as the weights of the target convolutional layer, and determine whether the total quantization ratio of the weights of the target convolutional layer in the current stage reaches a preset ratio value.
In the present embodiment, the electronic device may take each weight in the second weight set generated in step 203 as a weight of the target convolutional layer of the convolutional neural network, and may determine whether the total quantization ratio of the weights of the target convolutional layer in the current stage reaches the preset ratio value (e.g., 100%). If the total quantization ratio of the weights of the target convolutional layer in the current stage reaches the preset ratio value, step 205 may be executed next; if it does not reach the preset ratio value, step 206 may be executed. It can be understood that the total quantization ratio of the weights of the target convolutional layer in the current stage may refer to the proportion, among all weights of the target convolutional layer, of the weights that have been quantized up to the current stage, i.e., the proportion of quantized first weights in the second weight set.
Step 205: in response to determining that the total quantization ratio of the weights of the target convolutional layer in the current stage reaches the preset ratio value, take the convolutional neural network of the current stage as the target convolutional neural network, and store the target convolutional neural network.
In the present embodiment, if it is determined in step 204 that the total quantization ratio of the weights of the target convolutional layer in the current stage reaches the preset ratio value, the quantization process is complete. The electronic device may then take the convolutional neural network of the current stage (i.e., the one whose target-convolutional-layer weights form the second weight set and whose total quantization ratio reaches the above preset ratio value) as the target convolutional neural network, and may store the target convolutional neural network. It can be understood that during the generation of the target convolutional neural network, the above preset ratio value may usually be a constant.
Step 206: in response to determining that the total quantization ratio of the weights of the target convolutional layer in the current stage has not reached the preset ratio value, train the convolutional neural network of the current stage according to a loss function, so as to adjust the first weights in the second weight set that have not been quantization-rounded, until the value of the loss function approaches an expected value.
In the present embodiment, if it is determined in step 204 that the total quantization ratio of the weights of the target convolutional layer in the current stage has not reached the preset ratio value, the quantization process is not yet complete. The electronic device may then train the convolutional neural network of the current stage according to the loss function, adjusting the first weights in the second weight set that have not been quantization-rounded, until the value of the loss function approaches the expected value. The expected value may be set according to actual conditions. That is, during multi-stage quantization, a first weight remains unchanged once it has been quantized.
It should be noted that, besides a loss function, other training methods may also be used to make the convolutional neural network of the current stage (i.e., under the quantization ratio of the current stage) stable or convergent.
Step 207: obtain the quantization ratio of the next stage as the quantization ratio of the current stage.
In the present embodiment, after the convolutional neural network of the current stage has been trained to convergence in step 206, the electronic device may obtain the quantization ratio of the next stage as the quantization ratio of the current stage, so as to carry out the quantization process of the next stage.
Step 208: according to the quantization ratio of the current stage, perform quantization rounding respectively on the first input values in the second input value set that have not been quantization-rounded and on the first weights in the second weight set that have not been quantization-rounded, so as to update the second input value set and the second weight set.
In the present embodiment, the electronic device may, according to the quantization ratio of the current stage in step 207, select a corresponding number of the not-yet-rounded first input values in the second input value set, perform quantization rounding on them, and update the second input value set. Meanwhile, it may select a corresponding number of the not-yet-rounded first weights in the second weight set, perform quantization rounding on them, and update the second weight set. The selection method here may be the same as the selection method described above.
That is, the multi-stage quantization in the present embodiment uses incremental quantization: the next stage of quantization is carried out on the basis of what was quantized in the previous stage. For example, if the quantization ratios of two stages are 20% and 50% in turn, then the first stage may quantize up to 20% of the weights, and the second stage may quantize up to 50% in total (i.e., quantize a further 30% on the basis of the first stage), or the second stage may quantize up to 70% in total (i.e., quantize a further 50% on the basis of the first stage).
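The incremental scheme described above can be sketched as a loop over per-stage total ratios. The smallest-error selection rule and the `train_step` placeholder (standing in for the inter-stage retraining of steps 206-207) are illustrative assumptions.

```python
import math

def multistage_quantize(weights, stage_ratios, train_step=None):
    """Incremental multi-stage quantization sketch: at each stage, round
    enough additional weights that the quantized fraction reaches that
    stage's total ratio; weights already quantized stay fixed forever."""
    n = len(weights)
    quantized = [False] * n
    for ratio in stage_ratios:                      # e.g. [0.2, 0.5, 1.0]
        target = int(math.ceil(n * ratio))
        # among not-yet-quantized weights, pick those with smallest rounding error
        todo = sorted((i for i in range(n) if not quantized[i]),
                      key=lambda i: abs(weights[i] - round(weights[i])))
        for i in todo[:max(0, target - sum(quantized))]:
            weights[i] = float(round(weights[i]))
            quantized[i] = True
        if train_step and sum(quantized) < n:
            train_step(weights, quantized)          # retrain the float weights only
    return weights
```

Running the sketch with ratios [0.5, 1.0] quantizes half the weights in the first pass and the remainder in the second, mirroring the 20%/50% example in the text.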
Step 209: take the updated second weight set as the weights of the target convolutional layer, and determine whether the total quantization ratio of the weights of the target convolutional layer in the current stage reaches the preset ratio value.
In the present embodiment, the electronic device may take the weights in the updated second weight set as the weights of the target convolutional layer, thereby updating the weights of the target convolutional layer. Meanwhile, it may continue to determine whether the total quantization ratio of the weights of the target convolutional layer in the current stage reaches the above preset ratio value. If it is determined that the total quantization ratio of the weights of the target convolutional layer in the current stage reaches the preset ratio value, the process may return to step 205; if it is determined that the ratio has not reached the preset ratio value, step 206 may be executed again.
It can be understood that the method in the present embodiment can generate the target convolutional neural network through multi-stage quantization. In this way, for different quantization stages (quantization ratios), the processing precision of the convolutional neural network can be adjusted, which helps improve the flexibility of training. Meanwhile, during multi-stage quantization, problems can be discovered and resolved more conveniently.
In addition, to improve the efficiency of generating the target convolutional neural network, in the above embodiments multi-stage quantization may be applied only to the weights of the target convolutional layer, while all the initial input values of the target convolutional layer may be quantization-rounded at once to generate the second input value set. In that case the second input value set contains no first input values that have not been quantization-rounded. Thus, in each subsequent quantization stage, the second input value set need not be processed or updated again, which simplifies the processing procedure. At the same time, since the input values of the target convolutional layer (i.e., the second input value set) remain unchanged in each quantization stage, this helps ensure the precision of the weights of the target convolutional layer of the generated target convolutional neural network.
In some application scenarios, after the target convolutional neural network is generated, the electronic device may further obtain the initial input information of the target convolutional layer of the target convolutional neural network; quantization rounding may then be performed on the initial input information to obtain integer input values; afterwards, the integer input values may be input into the target convolutional layer and convolved with the weights of the target convolutional layer to generate output information. Here the weights of the target convolutional layer are integers. That is, after the input information is converted to integers, operations between integers can be realized, which helps improve operation efficiency.
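A minimal sketch of the resulting all-integer convolution (1-D, valid mode, written as cross-correlation in the usual CNN convention; names are illustrative):

```python
def int_conv1d(inputs, kernel):
    """Valid-mode 1-D integer convolution: with both inputs and weights
    quantized to integers, every multiply-accumulate is integer arithmetic."""
    k = len(kernel)
    return [sum(inputs[i + j] * kernel[j] for j in range(k))
            for i in range(len(inputs) - k + 1)]
```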
It can be understood that the target convolutional neural network obtained by the above method can convert floating-point operations into fixed-point operations, which can reduce the occupancy of memory space while helping improve operation speed. Experiments show that for a general-purpose CPU (Central Processing Unit), the processing speed can be raised to roughly twice the original; for an FPGA (Field-Programmable Gate Array), the processing speed can be roughly on par with a graphics processor (Graphics Processing Unit, GPU), while energy consumption can be reduced.
In addition, to make the post-quantization weight distribution more uniform, the electronic device may also handle outliers among the weights before quantization. Specifically, first, the electronic device may collect statistics on the distribution of the weights; then, according to the distribution information, it may determine whether there are weights satisfying a preset condition; then, if it is determined that weights satisfying the preset condition exist, those weights may be processed. The processing may be deleting or scaling the weights that satisfy the preset condition.
The scaling method here is not limited: it may be proportional scaling, or scaling to a target weight value, etc. The preset condition may likewise be set according to actual conditions; for example, it may be that a weight value exceeds the average weight value by a preset ratio (e.g., 5%), or it may be the weights in a certain proportion (e.g., the top 5%) at the large end when sorted in descending order of weight value. It should be noted that the above processing may be included in every training iteration, or may be carried out once after every certain number of training iterations.
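One of the example preset conditions above (a weight exceeding the mean by more than a given ratio), combined with the scale-to-a-target handling, can be sketched as follows; the 5% threshold and the choice of clipping to the limit are illustrative assumptions.

```python
def clip_weight_outliers(weights, ratio=0.05):
    """Scale any weight exceeding mean * (1 + ratio) down to that limit,
    making the distribution easier to quantize uniformly."""
    mean = sum(weights) / len(weights)
    limit = mean * (1.0 + ratio)
    return [min(w, limit) for w in weights]
```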
The method for generating a convolutional neural network provided in this embodiment obtains the initial input values and initial weights of the target convolutional layer of a convolutional neural network, so that the first input value set and the first weight set can be generated respectively. Then, according to the quantization ratio of the current stage, quantization rounding can be performed respectively on each first input value in the first input value set and each first weight in the first weight set, so as to generate the second input value set and the second weight set, where the second weight set includes first weights after quantization rounding. Afterwards, by taking the second weight set as the weights of the target convolutional layer, at least some of the weights of the target convolutional layer can be converted into integer weights. Meanwhile, it can be determined whether the total quantization ratio of the weights of the target convolutional layer in the current stage reaches a preset ratio value. If it is determined that the total quantization ratio of the weights of the target convolutional layer in the current stage reaches the preset ratio value, the convolutional neural network of the current stage can be taken as the target convolutional neural network, and the target convolutional neural network can be stored. At that point, in the target convolutional layer of the target convolutional neural network, the total proportion of integer weights is the above preset ratio value. This embodiment can realize multi-stage quantization of the weights of the target convolutional layer, which can enrich the methods for generating convolutional neural networks and helps improve the flexibility of the generation method.
With further reference to Fig. 3, as an implementation of the methods shown in the above figures, this application provides an embodiment of an apparatus for generating a convolutional neural network. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus may specifically be applied in various electronic devices.
As shown in Fig. 3, the apparatus 300 for generating a convolutional neural network of the present embodiment may include: a first acquisition unit 301, configured to obtain the initial input values and initial weights of the target convolutional layer of a convolutional neural network; a first generation unit 302, configured to generate a first input value set and a first weight set respectively according to the initial input values and the initial weights; a first quantization unit 303, configured to perform quantization rounding, according to the quantization ratio of the current stage, respectively on each first input value in the first input value set and each first weight in the first weight set, generating a second input value set and a second weight set, where the second weight set includes first weights after quantization rounding; a determination unit 304, configured to take the second weight set as the weights of the target convolutional layer and determine whether the total quantization ratio of the weights of the target convolutional layer in the current stage reaches a preset ratio value; and a second generation unit 305, configured to, in response to determining that the total quantization ratio of the weights of the target convolutional layer in the current stage reaches the preset ratio value, take the convolutional neural network of the current stage as the target convolutional neural network and store the target convolutional neural network.
In the present embodiment, for the specific implementations and the beneficial effects of the first acquisition unit 301, the first generation unit 302, the first quantization unit 303, the determination unit 304, and the second generation unit 305, reference may be made respectively to the descriptions of step 201, step 202, step 203, step 204, and step 205 in the embodiment shown in Fig. 2, and details are not repeated here.
In some optional implementations of the present embodiment, the apparatus may be further configured to: in response to determining that the total quantization ratio of the weights of the target convolutional layer in the current stage has not reached the preset ratio value, train the convolutional neural network of the current stage according to a loss function, so as to adjust the first weights in the second weight set that have not been quantization-rounded, until the value of the loss function approaches an expected value; obtain the quantization ratio of the next stage as the quantization ratio of the current stage; according to the quantization ratio of the current stage, perform quantization rounding respectively on the first input values in the second input value set that have not been quantization-rounded and on the first weights in the second weight set that have not been quantization-rounded, so as to update the second input value set and the second weight set; and take the updated second weight set as the weights of the target convolutional layer and determine whether the total quantization ratio of the weights of the target convolutional layer in the current stage reaches the preset ratio value.
Optionally, the first generation unit 302 may be further configured to: according to the number of quantization-encoding bits, evenly divide the range of the initial input values and the range of the initial weights respectively into a preset number of subintervals, where the preset number is positively correlated with the number of quantization-encoding bits; and generate the first input value set and the first weight set respectively according to each input value and each weight located in the preset number of subintervals.
Further, the first quantization unit 303 may be further configured to: perform quantization rounding on each first input value in the first input value set according to a preset quantization method, and take each first input value after quantization rounding as a second input value to generate the second input value set; and, according to the distribution probability of each first weight in the first weight set, perform upper rounding or lower rounding on each first weight in the first weight set, and take each first weight after rounding as a second weight to generate the second weight set.
Optionally, the first generation unit 302 may also be further configured to: according to a preset first numerical value, divide the range of the initial input values and the range of the initial weights into a preset number of subintervals of unequal length; using the first numerical value as the base, compute the logarithm of each initial input value and take the result as a first input value to generate the first input value set; and, according to the initial weights located in each subinterval, determine an interval weight as a first weight to generate the first weight set.
Further, the first quantization unit 303 may also be further configured to: perform quantization rounding on each first input value in the first input value set, take each first input value after quantization rounding as an exponent, and compute the corresponding power of the first numerical value as a second input value to generate the second input value set; and, in the order of the subintervals, successively assign serial numbers to the first weights to generate a lookup table, taking the serial number corresponding to each first weight as a second weight to generate the second weight set, where the serial numbers are integers and each serial number is stored with its first weight in the lookup table in the form of a key-value pair.
In some embodiments, the first quantization unit 303 may be further configured to: according to the quantization ratio of the current stage, select corresponding numbers of first input values and first weights from the first input value set and the first weight set respectively and perform quantization rounding on them, where the selection includes selecting in descending order of numerical value, starting from the larger values, or selecting in ascending order of quantization error, starting from the smaller errors.
In addition, the apparatus 300 may also include: a second acquisition unit (not shown), configured to obtain the initial input information of the target convolutional layer of the target convolutional neural network; a second quantization unit (not shown), configured to perform quantization rounding on the initial input information to obtain integer input values; and a third generation unit (not shown), configured to input the integer input values into the target convolutional layer and perform a convolution operation with the weights of the target convolutional layer to generate output information.
Referring to Fig. 4, it shows a structural schematic diagram of a computer system 400 suitable for implementing an electronic device of the embodiments of the present application. The electronic device shown in Fig. 4 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in Fig. 4, the computer system 400 includes a central processing unit (CPU) 401, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 402 or a program loaded from a storage section 408 into a random access memory (RAM) 403. The RAM 403 also stores the various programs and data required for the operation of the system 400. The CPU 401, the ROM 402, and the RAM 403 are connected to one another through a bus 404. An input/output (I/O) interface 405 is also connected to the bus 404.
The following components are connected to the I/O interface 405: an input section 406 including a touch screen, a keyboard, a mouse, etc.; an output section 407 including a cathode-ray tube (CRT), a liquid crystal display (LCD), a loudspeaker, etc.; a storage section 408 including a hard disk, etc.; and a communication section 409 including a network interface card such as a LAN card or a modem. The communication section 409 performs communication processing via a network such as the Internet. A drive 410 is also connected to the I/O interface 405 as needed. A removable medium 411, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, is mounted on the drive 410 as needed, so that a computer program read from it can be installed into the storage section 408 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 409, and/or installed from the removable medium 411. When the computer program is executed by the central processing unit (CPU) 401, the above functions defined in the method of the present application are executed. It should be noted that the computer-readable medium of the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium containing or storing a program that can be used by, or in connection with, an instruction execution system, apparatus, or device. In the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any suitable medium, including but not limited to wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of the systems, methods, and computer program products according to various embodiments of the present application. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for realizing the specified logical functions. It should also be noted that in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two successively represented boxes may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in a block diagram and/or flowchart, and any combination of boxes in a block diagram and/or flowchart, may be realized by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be realized by means of software or by means of hardware. The described units may also be provided in a processor; for example, a processor may be described as including an acquisition unit, a quantization unit, and a generation unit. The names of these units do not, in some cases, constitute a limitation on the units themselves; for example, the acquisition unit may also be described as "a unit for obtaining the initial input information of the target convolutional layer of the target convolutional neural network".
As another aspect, the present application also provides a computer-readable medium, which may be included in the electronic device described in the above embodiments, or may exist independently without being assembled into the electronic device. The above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: obtain the initial input values and initial weights of the target convolutional layer of a convolutional neural network; generate a first input value set and a first weight set respectively according to the initial input values and the initial weights; according to the quantization ratio of the current stage, perform quantization rounding respectively on each first input value in the first input value set and each first weight in the first weight set, generating a second input value set and a second weight set, where the second weight set includes first weights after quantization rounding; take the second weight set as the weights of the target convolutional layer, and determine whether the total quantization ratio of the weights of the target convolutional layer in the current stage reaches a preset ratio value; and, in response to determining that the total quantization ratio of the weights of the target convolutional layer in the current stage reaches the preset ratio value, take the convolutional neural network of the current stage as the target convolutional neural network and store the target convolutional neural network.
The above description is only a preferred embodiment of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, technical solutions formed by substituting the above features with (but not limited to) technical features with similar functions disclosed in this application.
Claims (18)
1. A method for generating a convolutional neural network, comprising:
obtaining initial input values and initial weights of a target convolutional layer of a convolutional neural network;
generating a first input value set and a first weight set from the initial input values and the initial weights, respectively;
performing quantization rounding on each first input value in the first input value set and each first weight in the first weight set, respectively, according to a quantization ratio of a current stage, to generate a second input value set and a second weight set, wherein the second weight set comprises first weights that have undergone quantization rounding;
using the second weight set as the weights of the target convolutional layer, and determining whether a total quantization ratio of the weights of the target convolutional layer at the current stage reaches a preset ratio value; and
in response to determining that the total quantization ratio of the weights of the target convolutional layer at the current stage reaches the preset ratio value, using the convolutional neural network of the current stage as a target convolutional neural network, and storing the target convolutional neural network.
2. The method according to claim 1, wherein the method further comprises:
in response to determining that the total quantization ratio of the weights of the target convolutional layer at the current stage does not reach the preset ratio value, training the convolutional neural network of the current stage according to a loss function, to adjust the first weights in the second weight set that have not undergone quantization rounding, until the value of the loss function approaches a desired value; obtaining the quantization ratio of a next stage as the quantization ratio of the current stage; performing quantization rounding, according to the quantization ratio of the current stage, on the first input values in the second input value set that have not undergone quantization rounding and the first weights in the second weight set that have not undergone quantization rounding, respectively, to update the second input value set and the second weight set; and using the updated second weight set as the weights of the target convolutional layer, and determining whether the total quantization ratio of the weights of the target convolutional layer at the current stage reaches the preset ratio value.
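The retraining step of claim 2 adjusts only the weights that have not yet been quantization-rounded. One way to realize this, sketched below under assumptions of my own (the `grad_fn` interface returning a loss and its gradient, the gradient-masking approach, and all names are illustrative, not from the patent), is to freeze the already-quantized weights by masking their gradient updates.

```python
import numpy as np

def finetune_unquantized(w, quantized_mask, grad_fn, lr=0.1, steps=200, tol=1e-6):
    """Adjust only the not-yet-quantized weights by the loss gradient;
    already quantized weights stay fixed. Stops when the loss stops
    changing (i.e. tends toward its desired value)."""
    w = np.asarray(w, dtype=float).copy()
    prev = np.inf
    for _ in range(steps):
        loss, grad = grad_fn(w)
        w[~quantized_mask] -= lr * grad[~quantized_mask]  # freeze quantized weights
        if abs(prev - loss) < tol:  # loss has converged
            break
        prev = loss
    return w
```

In a real network the gradient mask would be applied per layer inside the training framework; the scalar loop above only illustrates the masking idea.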
3. The method according to claim 1, wherein the generating a first input value set and a first weight set from the initial input values and the initial weights, respectively, comprises:
evenly dividing the range of the initial input values and the range of the initial weights, respectively, into a preset number of subintervals according to the bit width of the quantization coding, wherein the preset number is positively correlated with the bit width of the quantization coding; and
generating the first input value set and the first weight set, respectively, from the input values and weights in the preset number of subintervals.
4. The method according to claim 3, wherein the performing quantization rounding on each first input value in the first input value set and each first weight in the first weight set, respectively, to generate a second input value set and a second weight set, comprises:
performing quantization rounding on each first input value in the first input value set according to a preset quantization method, and taking the first input values after quantization rounding as second input values, to generate the second input value set; and
rounding each first weight in the first weight set up or down according to the distribution probability of each first weight in the first weight set, and taking the first weights after rounding as second weights, to generate the second weight set.
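Claims 3 and 4 together describe a uniform partition of the value range followed by probabilistic up/down rounding. Below is a minimal sketch; the patent does not fix the exact probability rule behind "distribution probability", so the unbiased stochastic-rounding interpretation (round up with probability equal to the fractional part) and all function names are assumptions.

```python
import numpy as np

def uniform_bins(values, bits):
    """Claim 3 sketch: divide the value range evenly into 2**bits subintervals,
    so the subinterval count grows with the quantization-coding bit width."""
    values = np.asarray(values, dtype=float)
    n = 2 ** bits
    edges = np.linspace(values.min(), values.max(), n + 1)
    idx = np.clip(np.digitize(values, edges[1:-1]), 0, n - 1)
    return idx, edges

def stochastic_round(x, rng=None):
    """Claim 4 sketch (one reading): round up with probability equal to the
    fractional part, down otherwise, so rounding is unbiased in expectation."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    floor = np.floor(x)
    frac = x - floor
    return floor + (rng.random(x.shape) < frac)
```

For a weight of 1.25, for instance, roughly a quarter of the draws round up to 2.0 and the rest down to 1.0, so the mean stays near 1.25.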
5. The method according to claim 1, wherein the generating a first input value set and a first weight set from the initial input values and the initial weights, respectively, comprises:
dividing the range of the initial input values and the range of the initial weights into a preset number of subintervals of different lengths according to a preset first numerical value;
calculating the logarithm of each initial input value with the first numerical value as the base, and taking the calculation results as first input values, to generate the first input value set; and
determining an interval weight from the initial weights located in each subinterval, as a first weight, to generate the first weight set.
6. The method according to claim 5, wherein the performing quantization rounding on each first input value in the first input value set and each first weight in the first weight set, respectively, to generate a second input value set and a second weight set, comprises:
performing quantization rounding on each first input value in the first input value set, and calculating the power of the first numerical value with each first input value after quantization rounding as the exponent, as a second input value, to generate the second input value set; and
establishing serial numbers for the first weights in sequence according to the order of the subintervals, to generate a lookup table, and taking the serial number corresponding to each first weight as a second weight, to generate the second weight set, wherein the serial numbers are integers, and the serial numbers and the first weights are stored in the lookup table in the form of key-value pairs.
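Claims 5 and 6 describe logarithmic quantization of the inputs plus a serial-number lookup table for the weights. The sketch below is illustrative only: base 2 stands in for the "preset first numerical value", and the function names are my own.

```python
import math

def log_quantize_inputs(inputs, base=2.0):
    """Claims 5-6 sketch: take the base-b logarithm of each input (first
    input value), round the exponent to an integer, and output base raised
    to the rounded exponent (second input value)."""
    out = []
    for v in inputs:
        exponent = round(math.log(v, base))  # quantization-round the log-domain value
        out.append(base ** exponent)
    return out

def build_weight_lookup(interval_weights):
    """Claim 6 sketch: number the representative weight of each subinterval
    in order; the lookup table stores integer serial numbers (the second
    weights) and first weights as key-value pairs."""
    return {serial: w for serial, w in enumerate(interval_weights)}
```

With a lookup table, the layer only stores small integer serial numbers per weight and resolves them to first weights through the table, which is where the storage saving comes from.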
7. The method according to claim 1, wherein the performing quantization rounding, according to the quantization ratio of the current stage, on each first input value in the first input value set and each first weight in the first weight set, respectively, comprises:
selecting, according to the quantization ratio of the current stage, corresponding numbers of first input values and first weights from the first input value set and the first weight set, respectively, for quantization rounding, wherein the selecting includes selecting from the large-value end in descending order of numerical value, or selecting from the small-error end in ascending order of quantization error.
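The two selection orders of claim 7 can be sketched as a single helper. This is an illustrative Python sketch under my own naming (`select_for_quantization`, `mode`); the quantization error is taken as distance to the nearest integer, which is one possible reading.

```python
import numpy as np

def select_for_quantization(values, ratio, mode="magnitude"):
    """Claim 7 sketch: choose ratio * len(values) entries for this stage's
    quantization rounding, either from the large-magnitude end (descending
    by value) or from the small-error end (ascending by rounding error)."""
    values = np.asarray(values, dtype=float)
    k = int(round(ratio * values.size))
    if mode == "magnitude":
        order = np.argsort(-np.abs(values))                    # largest first
    else:
        order = np.argsort(np.abs(values - np.round(values)))  # smallest rounding error first
    return np.sort(order[:k])  # indices of the selected entries
```

Magnitude-first quantizes the weights that dominate the convolution sums early; error-first quantizes the weights whose rounding perturbs the network least.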
8. The method according to any one of claims 1-7, wherein the method further comprises:
obtaining initial input information of the target convolutional layer of the target convolutional neural network;
performing quantization rounding on the initial input information to obtain integer input values;
inputting the integer input values into the target convolutional layer, and performing a convolution operation with the weights of the target convolutional layer, to generate output information.
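The inference path of claim 8, integer inputs convolved with integer weights, can be sketched as follows. A 1-D valid convolution stands in for the convolutional layer, and the `scale` dequantization factor is an assumption; neither appears in the claim text.

```python
import numpy as np

def integer_conv1d(inputs, int_weights, scale=1.0):
    """Claim 8 sketch: quantization-round the incoming activations to
    integers, then run the convolution on integer operands only, using the
    layer's already-integer weights."""
    x = np.round(np.asarray(inputs, dtype=float) / scale).astype(np.int64)  # integer input values
    w = np.asarray(int_weights, dtype=np.int64)
    n = x.size - w.size + 1
    return np.array([np.dot(x[i:i + w.size], w) for i in range(n)])
```

Because both operands are integers, the multiply-accumulate loop can run on integer arithmetic units, which is the usual motivation for quantizing activations as well as weights.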
9. An apparatus for generating a convolutional neural network, comprising:
a first acquiring unit, configured to obtain initial input values and initial weights of a target convolutional layer of a convolutional neural network;
a first generating unit, configured to generate a first input value set and a first weight set from the initial input values and the initial weights, respectively;
a first quantifying unit, configured to perform quantization rounding on each first input value in the first input value set and each first weight in the first weight set, respectively, according to the quantization ratio of the current stage, to generate a second input value set and a second weight set, wherein the second weight set comprises first weights that have undergone quantization rounding;
a determining unit, configured to use the second weight set as the weights of the target convolutional layer, and determine whether the total quantization ratio of the weights of the target convolutional layer at the current stage reaches a preset ratio value; and
a second generating unit, configured to, in response to determining that the total quantization ratio of the weights of the target convolutional layer at the current stage reaches the preset ratio value, use the convolutional neural network of the current stage as a target convolutional neural network, and store the target convolutional neural network.
10. The apparatus according to claim 9, wherein the apparatus is further configured to:
in response to determining that the total quantization ratio of the weights of the target convolutional layer at the current stage does not reach the preset ratio value, train the convolutional neural network of the current stage according to a loss function, to adjust the first weights in the second weight set that have not undergone quantization rounding, until the value of the loss function approaches a desired value; obtain the quantization ratio of a next stage as the quantization ratio of the current stage; perform quantization rounding, according to the quantization ratio of the current stage, on the first input values in the second input value set that have not undergone quantization rounding and the first weights in the second weight set that have not undergone quantization rounding, respectively, to update the second input value set and the second weight set; and use the updated second weight set as the weights of the target convolutional layer, and determine whether the total quantization ratio of the weights of the target convolutional layer at the current stage reaches the preset ratio value.
11. The apparatus according to claim 9, wherein the first generating unit is further configured to:
evenly divide the range of the initial input values and the range of the initial weights, respectively, into a preset number of subintervals according to the bit width of the quantization coding, wherein the preset number is positively correlated with the bit width of the quantization coding; and
generate the first input value set and the first weight set, respectively, from the input values and weights in the preset number of subintervals.
12. The apparatus according to claim 11, wherein the first quantifying unit is further configured to:
perform quantization rounding on each first input value in the first input value set according to a preset quantization method, and take the first input values after quantization rounding as second input values, to generate the second input value set; and
round each first weight in the first weight set up or down according to the distribution probability of each first weight in the first weight set, and take the first weights after rounding as second weights, to generate the second weight set.
13. The apparatus according to claim 9, wherein the first generating unit is further configured to:
divide the range of the initial input values and the range of the initial weights into a preset number of subintervals of different lengths according to a preset first numerical value;
calculate the logarithm of each initial input value with the first numerical value as the base, and take the calculation results as first input values, to generate the first input value set; and
determine an interval weight from the initial weights located in each subinterval, as a first weight, to generate the first weight set.
14. The apparatus according to claim 13, wherein the first quantifying unit is further configured to:
perform quantization rounding on each first input value in the first input value set, and calculate the power of the first numerical value with each first input value after quantization rounding as the exponent, as a second input value, to generate the second input value set; and
establish serial numbers for the first weights in sequence according to the order of the subintervals, to generate a lookup table, and take the serial number corresponding to each first weight as a second weight, to generate the second weight set, wherein the serial numbers are integers, and the serial numbers and the first weights are stored in the lookup table in the form of key-value pairs.
15. The apparatus according to claim 9, wherein the first quantifying unit is further configured to:
select, according to the quantization ratio of the current stage, corresponding numbers of first input values and first weights from the first input value set and the first weight set, respectively, for quantization rounding, wherein the selecting includes selecting from the large-value end in descending order of numerical value, or selecting from the small-error end in ascending order of quantization error.
16. The apparatus according to any one of claims 9-15, wherein the apparatus further comprises:
a second acquiring unit, configured to obtain initial input information of the target convolutional layer of the target convolutional neural network;
a second quantifying unit, configured to perform quantization rounding on the initial input information to obtain integer input values; and
a third generating unit, configured to input the integer input values into the target convolutional layer, and perform a convolution operation with the weights of the target convolutional layer, to generate output information.
17. An electronic device, comprising:
one or more processors; and
a storage device for storing one or more programs;
wherein when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method according to any one of claims 1-8.
18. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810084926.1A CN108288089A (en) | 2018-01-29 | 2018-01-29 | Method and apparatus for generating convolutional neural networks |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108288089A true CN108288089A (en) | 2018-07-17 |
Family
ID=62835926
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810084926.1A Pending CN108288089A (en) | 2018-01-29 | 2018-01-29 | Method and apparatus for generating convolutional neural networks |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108288089A (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109117742A (en) * | 2018-07-20 | 2019-01-01 | 百度在线网络技术(北京)有限公司 | Gestures detection model treatment method, apparatus, equipment and storage medium |
CN109117742B (en) * | 2018-07-20 | 2022-12-27 | 百度在线网络技术(北京)有限公司 | Gesture detection model processing method, device, equipment and storage medium |
CN109308194A (en) * | 2018-09-29 | 2019-02-05 | 北京字节跳动网络技术有限公司 | Method and apparatus for storing data |
CN109754074A (en) * | 2018-12-29 | 2019-05-14 | 北京中科寒武纪科技有限公司 | A kind of neural network quantization method, device and Related product |
US11699073B2 (en) | 2018-12-29 | 2023-07-11 | Cambricon Technologies Corporation Limited | Network off-line model processing method, artificial intelligence processing device and related products |
CN110222821A (en) * | 2019-05-30 | 2019-09-10 | 浙江大学 | Convolutional neural networks low-bit width quantization method based on weight distribution |
CN110222821B (en) * | 2019-05-30 | 2022-03-25 | 浙江大学 | Weight distribution-based convolutional neural network low bit width quantization method |
CN112085183A (en) * | 2019-06-12 | 2020-12-15 | 上海寒武纪信息科技有限公司 | Neural network operation method and device and related product |
CN112085183B (en) * | 2019-06-12 | 2024-04-02 | 上海寒武纪信息科技有限公司 | Neural network operation method and device and related products |
WO2021036908A1 (en) * | 2019-08-23 | 2021-03-04 | 安徽寒武纪信息科技有限公司 | Data processing method and apparatus, computer equipment and storage medium |
WO2021036890A1 (en) * | 2019-08-23 | 2021-03-04 | 安徽寒武纪信息科技有限公司 | Data processing method and apparatus, computer device, and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108288089A (en) | Method and apparatus for generating convolutional neural networks | |
CN108229663A (en) | Method and apparatus for generating convolutional neural networks | |
US11620532B2 (en) | Method and apparatus for generating neural network | |
CN108304919A (en) | Method and apparatus for generating convolutional neural networks | |
CN110880036B (en) | Neural network compression method, device, computer equipment and storage medium | |
CN110766142A (en) | Model generation method and device | |
CN110852438B (en) | Model generation method and device | |
CN108256632A (en) | Information processing method and device | |
EP2531959B1 (en) | Organizing neural networks | |
JP7287397B2 (en) | Information processing method, information processing apparatus, and information processing program | |
CN113705610A (en) | Heterogeneous model aggregation method and system based on federated learning | |
CN109190754A (en) | Quantitative model generation method, device and electronic equipment | |
JP7354463B2 (en) | Data protection methods, devices, servers and media | |
CN109388779A (en) | Neural network weight quantization method and device | |
CN105389454A (en) | Predictive model generator | |
CN106776925A (en) | Method, server and system for predicting the gender of a mobile terminal user | |
CN112768056A (en) | Disease prediction model establishing method and device based on joint learning framework | |
CN106778843A (en) | Method, server and system for predicting the gender of a mobile terminal user | |
CN116684330A (en) | Traffic prediction method, device, equipment and storage medium based on artificial intelligence | |
CN116861259B (en) | Training method and device of reward model, storage medium and electronic equipment | |
CN108509179A (en) | Method and apparatus for generating model | |
CN110489435B (en) | Data processing method and device based on artificial intelligence and electronic equipment | |
CN109670579A (en) | Model generating method and device | |
CN114700957B (en) | Robot control method and device with low model computational-power requirements | |
CN113762687B (en) | Personnel scheduling method and device in warehouse |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||