CN108288089A - Method and apparatus for generating convolutional neural networks - Google Patents


Info

Publication number
CN108288089A
Authority
CN
China
Prior art keywords
weight
input value
quantization
rounding
initial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810084926.1A
Other languages
Chinese (zh)
Inventor
姜志超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201810084926.1A priority Critical patent/CN108288089A/en
Publication of CN108288089A publication Critical patent/CN108288089A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Abstract

An embodiment of the present application discloses a method and apparatus for generating convolutional neural networks. One specific implementation of the method includes: obtaining the initial input values and initial weights of a target convolutional layer of a convolutional neural network, and generating a first input value set and a first weight set from them; performing quantization rounding on the first input value set and the first weight set according to the quantization ratio of the current stage, generating a second input value set and a second weight set, where the second weight set includes first weights that have been quantization-rounded; setting the second weight set as the weights of the target convolutional layer, and determining whether the total quantization ratio of the current stage reaches a preset ratio value; and, in response to determining that the total quantization ratio of the current stage reaches the preset ratio value, generating and storing the target convolutional neural network. This embodiment enables multi-stage quantization of the target convolutional layer's weights, which helps improve the flexibility of methods for generating convolutional neural networks.

Description

Method and apparatus for generating convolutional neural networks
Technical field
Embodiments of the present application relate to the field of computer technology, in particular to the field of neural network technology, and more particularly to a method and apparatus for generating convolutional neural networks.
Background technology
The concept of deep learning originates from research on artificial neural networks. Deep learning forms more abstract high-level representations (attribute categories or features) by combining low-level features, in order to discover distributed feature representations of data. Deep learning is a new field within machine learning research; its motivation is to build and simulate neural networks with which the human brain analyzes and learns, imitating the mechanisms of the human brain to interpret data such as images, sound, and text.
As with machine learning methods in general, deep machine learning methods divide into supervised learning and unsupervised learning, and the models built under different learning frameworks differ greatly. For example, a convolutional neural network (Convolutional Neural Network, CNN for short) is a deep machine learning model under supervised learning, while a deep belief network (Deep Belief Net, DBN for short) is a machine learning model under unsupervised learning.
Summary of the invention
Embodiments of the present application propose a method and apparatus for generating convolutional neural networks.
In a first aspect, an embodiment of the present application provides a method for generating convolutional neural networks, including: obtaining the initial input values and initial weights of a target convolutional layer of a convolutional neural network; generating a first input value set and a first weight set from the initial input values and initial weights, respectively; performing quantization rounding, according to the quantization ratio of the current stage, on each first input value in the first input value set and each first weight in the first weight set, generating a second input value set and a second weight set, where the second weight set includes first weights that have been quantization-rounded; setting the second weight set as the weights of the target convolutional layer, and determining whether the total quantization ratio of the target convolutional layer's weights at the current stage reaches a preset ratio value; and, in response to determining that the total quantization ratio of the target convolutional layer's weights at the current stage reaches the preset ratio value, taking the convolutional neural network of the current stage as the target convolutional neural network and storing the target convolutional neural network.
In some embodiments, the method further includes: in response to determining that the total quantization ratio of the target convolutional layer's weights at the current stage has not reached the preset ratio value, training the convolutional neural network of the current stage according to a loss function to adjust the first weights in the second weight set that have not been quantization-rounded, until the value of the loss function approaches an expected value; obtaining the quantization ratio of the next stage as the quantization ratio of the current stage; performing quantization rounding, according to the quantization ratio of the current stage, on the first input values in the second input value set and the first weights in the second weight set that have not yet been quantization-rounded, to update the second input value set and the second weight set; and setting the updated second weight set as the weights of the target convolutional layer, and determining whether the total quantization ratio of the target convolutional layer's weights at the current stage reaches the preset ratio value.
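The staged procedure just described (freeze a growing fraction of the weights as integers, retrain the remaining floating-point weights, and repeat until the preset ratio is reached) can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the largest-magnitude-first order, the plain integer rounding, and the no-op retraining callback are all assumptions made for the example.

```python
def incremental_quantize(weights, stage_ratios, retrain):
    """Quantize weights in stages: at each stage, freeze enough of the
    largest remaining weights (rounded to integers) to reach that
    stage's ratio, then retrain the still-floating weights."""
    frozen = set()
    for ratio in stage_ratios:  # e.g. 0.5, then 1.0
        target = round(len(weights) * ratio)
        remaining = sorted((i for i in range(len(weights)) if i not in frozen),
                           key=lambda i: -abs(weights[i]))
        for i in remaining[: target - len(frozen)]:
            weights[i] = round(weights[i])  # quantization rounding
            frozen.add(i)
        retrain(weights, frozen)  # adjust only the unfrozen weights

    return weights

w = incremental_quantize([1.4, -0.6, 2.2, 0.9], [0.5, 1.0],
                         retrain=lambda w, f: None)  # no-op stand-in
print(w)  # [1, -1, 2, 1]
```

In a real training loop the `retrain` callback would run gradient steps against the loss function while keeping the frozen integer weights fixed; here it is a stand-in to keep the sketch self-contained.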
In some embodiments, generating the first input value set and the first weight set from the initial input values and initial weights includes: uniformly dividing the range of the initial input values and the range of the initial weights, according to the number of bits of the quantization encoding, into a preset number of subintervals, where the preset number is positively correlated with the number of bits of the quantization encoding; and generating the first input value set and the first weight set from the input values and weights located in the preset number of subintervals.
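As a rough illustration of the uniform division just described, where the subinterval count grows with the bit width of the encoding (the concrete range, bit width, and return format are assumptions of this sketch, not from the patent):

```python
def uniform_subintervals(lo, hi, bits):
    """Split [lo, hi] into 2**bits equal subintervals."""
    n = 2 ** bits
    step = (hi - lo) / n
    return [(lo + i * step, lo + (i + 1) * step) for i in range(n)]

intervals = uniform_subintervals(0.0, 256.0, 8)
print(len(intervals))  # 256
print(intervals[0])    # (0.0, 1.0)
```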
In some embodiments, performing quantization rounding on each first input value in the first input value set and each first weight in the first weight set, generating the second input value set and the second weight set, includes: performing quantization rounding on each first input value in the first input value set according to a preset quantization method, and taking each first input value after quantization rounding as a second input value, generating the second input value set; and rounding each first weight in the first weight set up or down according to its distribution probability, and taking each rounded first weight as a second weight, generating the second weight set.
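Rounding a weight up or down "according to its distribution probability" can be read as stochastic rounding; the sketch below follows that reading, and the choice of the fractional part as the round-up probability is an assumption, not something the patent states.

```python
import math
import random

def stochastic_round(w, rng):
    """Round w up with probability equal to its fractional part,
    down otherwise, so the result is unbiased in expectation."""
    frac = w - math.floor(w)
    return math.floor(w) + (1 if rng.random() < frac else 0)

rng = random.Random(0)
samples = [stochastic_round(2.25, rng) for _ in range(10000)]
print(sum(samples) / len(samples))  # close to 2.25
```

Because the expected value of the rounded weight equals the original weight, this kind of rounding avoids the systematic bias that always rounding to the nearest integer can introduce across a large weight set.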
In some embodiments, generating the first input value set and the first weight set from the initial input values and initial weights includes: dividing the range of the initial input values and the range of the initial weights, according to a preset first value, into a preset number of subintervals of differing lengths; computing the logarithm of each initial input value with the first value as the base, taking the result as a first input value, and generating the first input value set; and determining, for each subinterval, an interval weight from the initial weights located in that subinterval, to serve as a first weight, thereby generating the first weight set.
In some embodiments, performing quantization rounding on each first input value in the first input value set and each first weight in the first weight set, generating the second input value set and the second weight set, includes: performing quantization rounding on each first input value in the first input value set, computing the power of the first value with each quantization-rounded first input value as the exponent, and taking the result as a second input value, thereby generating the second input value set; and establishing a serial number for each first weight in turn, according to the order of the subintervals, generating a lookup table, and taking the serial number corresponding to each first weight as a second weight, generating the second weight set, where the serial numbers are integers and are stored with the first weights in the lookup table in the form of key-value pairs.
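One possible reading of the logarithmic scheme and lookup table above, sketched in Python: an input is carried into the log domain, rounded to an integer exponent, and dequantized as a power of the base, while weights are stored as integer serial numbers keying a lookup table. The base, the codebook values, and the serial-number layout are invented for illustration.

```python
import math

def log_quantize(x, base=2):
    """Quantize-round x in the log domain: k = round(log_base(x)),
    then dequantize as base**k."""
    k = round(math.log(x, base))
    return k, base ** k

k, approx = log_quantize(10.0)
print(k, approx)  # 3 8  (log2(10) ~ 3.32 rounds to 3)

# Weights stored as integer serial numbers into a lookup table
codebook = {0: 0.11, 1: 0.48, 2: 0.93}  # serial -> representative interval weight
second_weights = [1, 0, 2, 1]           # the "second weight set": integer serials
print([codebook[s] for s in second_weights])  # [0.48, 0.11, 0.93, 0.48]
```

Storing only integer serials and one shared codebook per layer is what makes this representation compact: the floating-point interval weights appear once, however many weights reference them.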
In some embodiments, performing quantization rounding, according to the quantization ratio of the current stage, on the first input values in the first input value set and the first weights in the first weight set includes: selecting, according to the quantization ratio of the current stage, a corresponding number of first input values and first weights from the first input value set and the first weight set respectively, and performing quantization rounding on them, where the selection includes choosing in descending order of value, starting from the larger end, or in ascending order of quantization error, starting from the smaller-error end.
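The descending-order-of-value selection described here might look like the following minimal sketch, which picks the current stage's share of weights largest-magnitude first (using magnitude rather than signed value is an assumption of the sketch):

```python
def select_for_quantization(weights, ratio):
    """Indices of the top `ratio` fraction of weights by magnitude."""
    n = max(1, round(len(weights) * ratio))
    order = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))
    return sorted(order[:n])

w = [0.1, -2.5, 0.7, 1.9, -0.3, 0.05]
print(select_for_quantization(w, 0.5))  # [1, 2, 3]
```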
In some embodiments, the method further includes: obtaining initial input information for the target convolutional layer of the target convolutional neural network; performing quantization rounding on the initial input information to obtain integer input values; and feeding the integer input values into the target convolutional layer and performing convolution operations with the weights of the target convolutional layer, generating output information.
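The inference path described (quantize-round the incoming values, then convolve them with the layer's integer weights) might look like this in one dimension. The 1-D correlation and the example values are illustrative only; a real convolutional layer works on multi-channel 2-D tensors.

```python
def int_conv1d(inputs, weights):
    """Quantize-round the inputs, then correlate with integer weights."""
    x = [round(v) for v in inputs]  # integer input values
    k = len(weights)
    return [sum(x[i + j] * weights[j] for j in range(k))
            for i in range(len(x) - k + 1)]

out = int_conv1d([1.2, 2.7, 0.9, 3.4], [1, -1])
print(out)  # [-2, 2, -2]
```

With both operands integer, the inner products use only integer multiply-accumulates, which is the practical payoff of quantizing weights and inputs alike.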
In a second aspect, an embodiment of the present application provides an apparatus for generating convolutional neural networks, including: a first acquisition unit, configured to obtain the initial input values and initial weights of a target convolutional layer of a convolutional neural network; a first generation unit, configured to generate a first input value set and a first weight set from the initial input values and initial weights, respectively; a first quantization unit, configured to perform quantization rounding, according to the quantization ratio of the current stage, on each first input value in the first input value set and each first weight in the first weight set, generating a second input value set and a second weight set, where the second weight set includes first weights that have been quantization-rounded; a determination unit, configured to set the second weight set as the weights of the target convolutional layer and determine whether the total quantization ratio of the target convolutional layer's weights at the current stage reaches a preset ratio value; and a second generation unit, configured to, in response to determining that the total quantization ratio of the target convolutional layer's weights at the current stage reaches the preset ratio value, take the convolutional neural network of the current stage as the target convolutional neural network and store the target convolutional neural network.
In some embodiments, the apparatus is further configured to: in response to determining that the total quantization ratio of the target convolutional layer's weights at the current stage has not reached the preset ratio value, train the convolutional neural network of the current stage according to a loss function to adjust the first weights in the second weight set that have not been quantization-rounded, until the value of the loss function approaches an expected value; obtain the quantization ratio of the next stage as the quantization ratio of the current stage; perform quantization rounding, according to the quantization ratio of the current stage, on the first input values in the second input value set and the first weights in the second weight set that have not yet been quantization-rounded, to update the second input value set and the second weight set; and set the updated second weight set as the weights of the target convolutional layer, and determine whether the total quantization ratio of the target convolutional layer's weights at the current stage reaches the preset ratio value.
In some embodiments, the first generation unit is further configured to: uniformly divide the range of the initial input values and the range of the initial weights, according to the number of bits of the quantization encoding, into a preset number of subintervals, where the preset number is positively correlated with the number of bits of the quantization encoding; and generate the first input value set and the first weight set from the input values and weights located in the preset number of subintervals.
In some embodiments, the first quantization unit is further configured to: perform quantization rounding on each first input value in the first input value set according to a preset quantization method, and take each first input value after quantization rounding as a second input value, generating the second input value set; and round each first weight in the first weight set up or down according to its distribution probability, and take each rounded first weight as a second weight, generating the second weight set.
In some embodiments, the first generation unit is further configured to: divide the range of the initial input values and the range of the initial weights, according to a preset first value, into a preset number of subintervals of differing lengths; compute the logarithm of each initial input value with the first value as the base, take the result as a first input value, and generate the first input value set; and determine, for each subinterval, an interval weight from the initial weights located in that subinterval, to serve as a first weight, thereby generating the first weight set.
In some embodiments, the first quantization unit is further configured to: perform quantization rounding on each first input value in the first input value set, compute the power of the first value with each quantization-rounded first input value as the exponent, and take the result as a second input value, thereby generating the second input value set; and establish a serial number for each first weight in turn, according to the order of the subintervals, generate a lookup table, and take the serial number corresponding to each first weight as a second weight, generating the second weight set, where the serial numbers are integers and are stored with the first weights in the lookup table in the form of key-value pairs.
In some embodiments, the first quantization unit is further configured to: select, according to the quantization ratio of the current stage, a corresponding number of first input values and first weights from the first input value set and the first weight set respectively, and perform quantization rounding on them, where the selection includes choosing in descending order of value, starting from the larger end, or in ascending order of quantization error, starting from the smaller-error end.
In some embodiments, the apparatus further includes: a second acquisition unit, configured to obtain initial input information for the target convolutional layer of the target convolutional neural network; a second quantization unit, configured to perform quantization rounding on the initial input information to obtain integer input values; and a third generation unit, configured to feed the integer input values into the target convolutional layer and perform convolution operations with the weights of the target convolutional layer, generating output information.
In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; and a storage apparatus for storing one or more programs, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method described in any embodiment of the first aspect above.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the method described in any embodiment of the first aspect above.
The method and apparatus for generating convolutional neural networks provided by the embodiments of the present application obtain the initial input values and initial weights of a target convolutional layer of a convolutional neural network, so that a first input value set and a first weight set can be generated respectively. Then, according to the quantization ratio of the current stage, quantization rounding can be performed on each first input value in the first input value set and each first weight in the first weight set, generating a second input value set and a second weight set, where the second weight set includes first weights that have been quantization-rounded. The second weight set can then be set as the weights of the target convolutional layer; that is, at least some of the target convolutional layer's weights are converted into integer weights. At the same time, it can be determined whether the total quantization ratio of the target convolutional layer's weights at the current stage reaches a preset ratio value. If it is determined that it does, the convolutional neural network of the current stage can be taken as the target convolutional neural network and stored. In the target convolutional layer of the target convolutional neural network at that point, the total proportion of integer weights is the preset ratio value above. This embodiment enables multi-stage quantization of the target convolutional layer's weights, enriching the methods for generating convolutional neural networks and helping to improve their flexibility.
Description of the drawings
Other features, objects, and advantages of the present application will become more apparent by reading the following detailed description of non-restrictive embodiments with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture diagram in which the present application may be applied;
Fig. 2 is a flowchart of one embodiment of the method for generating convolutional neural networks according to the present application;
Fig. 3 is a structural schematic diagram of one embodiment of the apparatus for generating convolutional neural networks according to the present application;
Fig. 4 is a structural schematic diagram of a computer system suitable for implementing the electronic device of the embodiments of the present application.
Detailed description of the embodiments
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are used only to explain the relevant invention, and are not restrictions on that invention. It should also be noted that, for convenience of description, only the parts relevant to the invention are shown in the drawings.
It should be noted that, in the absence of conflict, the embodiments of the present application and the features in the embodiments may be combined with each other. The present application is described in detail below with reference to the accompanying drawings in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 in which the method for generating convolutional neural networks or the apparatus for generating convolutional neural networks of the present application may be applied.
As shown in Fig. 1, the system architecture 100 may include terminals 101, 102, and 103, a network 104, and a server 105. The network 104 serves as the medium providing communication links between the terminals 101, 102, 103 and the server 105, and may include various connection types, such as wired links, wireless communication links, or fiber-optic cables.
A user may interact with the server 105 via the network 104 using the terminals 101, 102, 103, so as to receive or send messages. Various client applications may be installed on the terminals 101, 102, 103, such as neural-network training applications, web browsers, search applications, shopping applications, and instant messaging tools.
The terminals 101, 102, 103 may be various electronic devices with display screens, including but not limited to smartphones, tablet computers, e-book readers, laptop computers, and desktop computers.
The server 105 may be a server providing various services, such as a background server supporting the applications displayed on the terminals 101, 102, 103. The background server may analyze and process the initial input values and initial weights of the target convolutional layer sent by the terminals 101, 102, 103, thereby training the convolutional neural network, and may send the processing results (for example, the generated target convolutional neural network) to the terminals 101, 102, 103. At least some of the weights of the target convolutional layer of the target convolutional neural network are integers.
It should be noted that the method for generating convolutional neural networks provided by the embodiments of the present application is generally executed by the server 105; accordingly, the apparatus for generating convolutional neural networks is generally arranged in the server 105.
It should be understood that the numbers of terminals, networks, and servers in Fig. 1 are merely illustrative. There may be any number of terminals, networks, and servers according to implementation needs.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for generating convolutional neural networks according to the present application is shown. The method for generating convolutional neural networks may include the following steps:
Step 201: obtain the initial input values and initial weights of the target convolutional layer of a convolutional neural network.
In this embodiment, the electronic device on which the method for generating convolutional neural networks runs (such as the server 105 shown in Fig. 1) may obtain the initial input values and initial weights of the target convolutional layer of the convolutional neural network in various ways, for example over a wired or wireless connection from a communicatively connected terminal (such as the terminals 101, 102, 103 shown in Fig. 1), a database server, or a cloud server.
In this embodiment, the convolutional neural network may serve any of various functions or purposes, for example face detection or face recognition. It may be an already trained convolutional neural network or one yet to be trained. A convolutional neural network is usually a feed-forward neural network whose artificial neurons respond to surrounding units within a partial coverage area; such networks perform outstandingly on large-scale image processing. A convolutional neural network may generally include convolution layers, pooling layers, fully connected layers, and so on. The target convolutional layer may be any convolutional layer in the target convolutional neural network.
In this embodiment, the initial input values may be any input information fed into the target convolutional layer, for example input information describing a face image, or the output of a preceding convolutional layer. The initial weights may be any weights of the target convolutional layer, for example manually set initial weights, or weights corrected by methods such as back-propagation. The initial input values and initial weights may be integer and/or floating-point values.
It should be noted that when the convolution kernel of a convolutional layer is 1 × 1, the output of that layer has the same size as its input. In that case, to avoid the error that quantizing that layer's weights to integers would introduce, those weights may be left unquantized; that is, the target convolutional layer may exclude convolutional layers whose kernel is 1 × 1. In addition, the storage location of the convolutional neural network is not limited in the present application: it may be stored locally on the electronic device, or on a database server or cloud server.
Step 202: generate a first input value set and a first weight set from the initial input values and initial weights, respectively.
In this embodiment, based on the initial input values and initial weights obtained in step 201, the electronic device may use various methods to generate the first input value set and the first weight set.
In some optional implementations of this embodiment, the electronic device may directly take each initial input value as a first input value to generate the first input value set, and directly take each initial weight as a first weight to generate the first weight set.
Optionally, the electronic device may instead divide the range of the initial weights into a certain number of subintervals and then, from the initial weights in each subinterval, determine an interval weight to serve as a first weight, thereby generating the first weight set. The division may be random, at equal intervals, at unequal intervals, or a combination of these, and the number of subintervals may be set according to actual conditions.
Further, the electronic device may use a uniform quantization method. First, according to the number of bits of the quantization encoding, the range of the initial input values and the range of the initial weights are each uniformly divided into a preset number of subintervals, where the preset number is positively correlated with the number of bits of the quantization encoding. The first input value set and the first weight set may then be generated from the input values and weights located in those subintervals. The number of bits of the quantization encoding here is the width of the encoding the electronic device uses to record and transmit information, such as the number of binary or decimal digits.
For example, the electronic device may directly divide the range of the initial input values uniformly into the preset number of subintervals, and likewise the range of the initial weights. Each initial input value located in those subintervals may then be taken as a first input value to generate the first input value set, and each initial weight as a first weight to generate the first weight set. Alternatively, for each subinterval an interval weight may be determined from the initial weights it contains (for example, their mean or median) to serve as a first weight, thereby generating the first weight set.
As another example, the electronic device may first scale the range of the initial input values to a preset range and then uniformly divide that preset range into the preset number of subintervals. Each value in the preset range, i.e. each scaled initial input value, may then be taken as a first input value to generate the first input value set. Likewise, the range of the initial weights may be scaled to the preset range, which is uniformly divided into the preset number of subintervals; each scaled initial weight then serves as a first weight in the first weight set. The boundary values of the preset range may be integers.
As a concrete example, suppose the quantization encoding is 8-bit binary and the range of the initial weights is [a, b]. The electronic device may first scale this range to [0, 2^8] and then uniformly divide [0, 2^8] into 256 subintervals. Each subinterval then has length 1: [0, 1], [1, 2], [2, 3] ... [255, 256].
Alternatively, the electronic device may first normalize [a, b] to [0, 1], uniformly divide the normalized range into 256 subintervals of length 1/256, and then scale up to [0, 2^8] so that the subinterval boundaries are integers: [0, 1], [1, 2], [2, 3], and so on.
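The scaling step in the worked example above can be sketched as a simple affine map from [a, b] onto [0, 2^8]; the concrete range [-1, 1] below is an assumption for illustration.

```python
def scale_to_levels(w, a, b, bits=8):
    """Map w in [a, b] onto [0, 2**bits] so subinterval bounds are integers."""
    return (w - a) / (b - a) * (2 ** bits)

a, b = -1.0, 1.0
print(scale_to_levels(a, a, b))    # 0.0
print(scale_to_levels(b, a, b))    # 256.0
print(scale_to_levels(0.0, a, b))  # 128.0
```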
The present embodiment some optionally in realization method, electronic equipment can also use logarithmic quantization method, i.e. root According to preset first numerical value, the range of the range of initial input value and initial weight is divided into preset number siding-to-siding block length not Same subinterval;Using the first numerical value as the truth of a matter, the logarithm of each initial input value is calculated, using result of calculation as the first input value, Generate the first input value set;According to each initial weight positioned at each subinterval, determination section weight, using as the first weight, Generate the first weight set.Here determination method does not limit equally.Wherein, the first numerical value can be the positive integer other than 1.
For example, the number of bits of the quantization encoding may be 8, i.e., the binary code is 8 bits wide. Suppose the range of an initial input value x is [a, b] and the first value is r. The electronic device may then divide this range into 256 subintervals of lengths (b−a)/r, (b−a)/r^2, (b−a)/r^3, ..., (b−a)/r^256, respectively. Then, log_r x may be taken as the first input value, thereby generating the first input value set.
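The logarithmic scheme above can be sketched as follows; the function names are illustrative, and the positivity filter is an assumption, since log_r x is only defined for positive x.

```python
import math


def log_quantize_inputs(xs, r=2):
    """Take log_r(x) of each initial input value as its first input
    value, per the logarithmic method described above."""
    return [math.log(x, r) for x in xs if x > 0]


def log_subinterval_lengths(a, b, r=2, n=256):
    """Lengths (b-a)/r, (b-a)/r**2, ..., (b-a)/r**n of the n
    unequal-length subintervals described above."""
    return [(b - a) / r ** k for k in range(1, n + 1)]
```

With r = 2 the subinterval lengths halve at each step, which concentrates resolution near one end of the range.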
Step 203: according to the quantization ratio of the current stage, perform quantization rounding on each first input value in the first input value set and each first weight in the first weight set, respectively, to generate a second input value set and a second weight set.
In the present embodiment, the electronic device may, according to the quantization ratio of the current stage, select corresponding numbers of first input values and first weights from the first input value set and the first weight set, respectively, and perform quantization rounding on them, thereby generating the second input value set and the second weight set. The specific selection method is not limited: the selection may be random; it may proceed in descending numerical order, starting from the largest value; or it may proceed in ascending order of quantization error, starting from the smallest error. The quantization method is likewise not limited here.
In the present embodiment, the second input value set may include first input values after quantization rounding, and the second weight set may include first weights after quantization rounding. It should be noted that, before all first input values in the first input value set have been quantized, the second input value set may also include unquantized first input values. Similarly, before all first weights in the first weight set have been quantized, the second weight set may also include unquantized first weights.
It can be understood that the manner of obtaining the quantization ratio of the current stage is not limited in the present application. For example, it may be obtained by the electronic device from a local preset parameter file, obtained by the electronic device from a cloud server, or sent by a user to the electronic device through a terminal.
In some optional implementations of the present embodiment, for the uniform quantization method, the electronic device may perform quantization rounding on each first input value in the first input value set according to a preset quantization method, and take the first input values after quantization rounding as second input values to generate the second input value set. The preset quantization method may be set according to actual conditions; for example, round-to-nearest (rounding half up) may be used.
Further, the electronic device may, according to the distribution probability of the first weights in the first weight set, perform randomized rounding (rounding up or rounding down) on each first weight in the first weight set, and take the first weights after rounding as second weights to generate the second weight set. For example, for the first weights in the subinterval [0, 1], if at least half of them are distributed in [0.5, 1], each first weight in this subinterval may be rounded up, i.e., all are rounded to 1. This helps reduce the quantization error.
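The distribution-based rounding above can be sketched as follows, under the simplifying assumption that all the given first weights fall in one unit subinterval; the majority test and the function name are illustrative.

```python
def round_subinterval(weights):
    """Round every first weight in one unit subinterval [k, k+1] the
    same way: up if at least half of the weights lie in the upper half
    [k + 0.5, k + 1], down otherwise, per the example above."""
    if not weights:
        return []
    base = int(min(weights))  # assumes all weights share one unit subinterval
    upper = sum(1 for w in weights if w - base >= 0.5)
    target = base + 1 if 2 * upper >= len(weights) else base
    return [target] * len(weights)
```

When most of the mass sits in the upper half, rounding the whole subinterval up keeps the aggregate quantization error smaller than rounding each weight independently downward.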
It can be understood that using the method of evenly dividing subintervals can make the weight distribution after quantization more uniform. Meanwhile, using randomized rounding can counteract the shrinkage of the overall parameter distribution space caused by precision loss, thereby improving the distribution space of the parameters.
Optionally, for the logarithmic quantization method, the electronic device may perform quantization rounding on each first input value in the first input value set, i.e., compute round(log_r x), and then raise the first value to the power of each rounded first input value, i.e., compute r^n with n = round(log_r x), as the second input value, thereby generating the second input value set.
Meanwhile electronic equipment can establish serial number to each first weight successively according to the sequence in each subinterval, generate inquiry Table.And it can be using the corresponding serial number of each first weight as the second weight, to generate the second weight set.Wherein, serial number Integer.And serial number is stored with the first weight in the form of key-value pair in inquiry table.
For example, by numerical magnitude, since [a, (b−a)/r] is the first subinterval, the serial number corresponding to a first weight located in this subinterval may be 1, and the serial numbers corresponding to first weights located in the other subintervals may be 2, 3, 4, and so on, in sequence. That is, the weights are replaced by integer serial numbers, which can reduce the storage space occupied by the convolutional neural network. At the same time, actual quantization rounding of the weights can be avoided, which simplifies the processing and improves operational efficiency, and also avoids the error that quantization rounding would introduce.
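The serial-number query table above can be sketched as follows; representing the subintervals by an ascending list of right endpoints and locating each weight with a binary search are implementation assumptions, and for simplicity the sketch assumes at most one representative weight per subinterval.

```python
import bisect


def weight_serials(weights, boundaries):
    """Replace each first weight by the 1-based serial number of the
    subinterval it falls in, and build the query table that stores
    serial -> weight key-value pairs, per the scheme above.

    `boundaries` holds the ascending right endpoints of the subintervals.
    """
    serials = [bisect.bisect_left(boundaries, w) + 1 for w in weights]
    table = {s: w for s, w in zip(serials, weights)}
    return serials, table
```

Convolution can then be carried out on the integer serials, with the table consulted only when the original weight value is needed.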
Step 204: take the second weight set as the weights of the target convolutional layer, and determine whether the total quantization ratio of the weights of the target convolutional layer at the current stage reaches a preset ratio value.
In the present embodiment, the electronic device may take each weight in the second weight set generated in step 203 as a weight of the target convolutional layer of the convolutional neural network, and may determine whether the total quantization ratio of the weights of the target convolutional layer at the current stage reaches the preset ratio value (e.g., 100%). If it does, step 205 may be executed; if it does not, step 206 may be executed. It can be understood that the total quantization ratio of the weights of the target convolutional layer at the current stage may refer to the proportion, among the weights of the target convolutional layer, of the weights that have been quantized up to the current stage, i.e., the proportion of quantized first weights in the second weight set.
Step 205: in response to determining that the total quantization ratio of the weights of the target convolutional layer at the current stage reaches the preset ratio value, take the convolutional neural network of the current stage as the target convolutional neural network, and store the target convolutional neural network.
In the present embodiment, if it is determined in step 204 that the total quantization ratio of the weights of the target convolutional layer at the current stage reaches the preset ratio value, the quantization process is complete. The electronic device may then take the convolutional neural network of the current stage — i.e., the convolutional neural network whose target convolutional layer weights are the second weight set and whose total quantization ratio reaches the above preset ratio value — as the target convolutional neural network, and store it. It can be understood that, during the generation of the target convolutional neural network, the above preset ratio value can usually be a constant value.
Step 206: in response to determining that the total quantization ratio of the weights of the target convolutional layer at the current stage does not reach the preset ratio value, train the convolutional neural network of the current stage according to a loss function to adjust the first weights in the second weight set that have not undergone quantization rounding, until the value of the loss function approaches an expected value.
In the present embodiment, if it is determined in step 204 that the total quantization ratio of the weights of the target convolutional layer at the current stage does not reach the preset ratio value, the quantization process is not yet complete. The electronic device may then train the convolutional neural network of the current stage according to the loss function to adjust the first weights in the second weight set that have not undergone quantization rounding, until the value of the loss function approaches the expected value. The expected value may be set according to actual conditions. That is, during the multistage quantization process, a first weight, once quantized, may remain unchanged.
It should be noted that, besides using a loss function, other training methods may also be used to make the convolutional neural network of the current stage (i.e., under the quantization ratio of the current stage) stable or convergent.
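The training of step 206 can be sketched as a masked update in which weights quantized at an earlier stage stay fixed and only the not-yet-quantized first weights move along the loss gradient; the plain SGD rule and the names are assumptions made for illustration.

```python
def masked_update(weights, grads, quantized_mask, lr=0.1):
    """One training step per step 206: weights already quantized in an
    earlier stage are frozen; the remaining first weights are adjusted
    by a simple gradient-descent rule (an illustrative choice)."""
    return [w if frozen else w - lr * g
            for w, g, frozen in zip(weights, grads, quantized_mask)]
```

Iterating this update until the loss approaches the expected value lets the unquantized weights compensate for the error introduced by the weights frozen at integer values.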
Step 207: obtain the quantization ratio of the next stage as the quantization ratio of the current stage.
In the present embodiment, after the training of the convolutional neural network of the current stage converges in step 206, the electronic device may obtain the quantization ratio of the next stage as the quantization ratio of the current stage, so as to carry out the quantization process of the next stage.
Step 208: according to the quantization ratio of the current stage, perform quantization rounding on the first input values in the second input value set that have not undergone quantization rounding and on the first weights in the second weight set that have not undergone quantization rounding, respectively, to update the second input value set and the second weight set.
In the present embodiment, the electronic device may, according to the quantization ratio of the current stage obtained in step 207, select a corresponding number of first input values from those in the second input value set that have not undergone quantization rounding, perform quantization rounding on them, and update the second input value set. Meanwhile, it may select a corresponding number of first weights from those in the second weight set that have not undergone quantization rounding, perform quantization rounding on them, and update the second weight set. The selection method here may be the same as the selection method described above.
That is, the multistage quantization in the present embodiment is incremental: quantization at the next stage is carried out on the basis of the quantization of the previous stage. For example, if the quantization ratios of two stages are 20% and 50% in sequence, the first stage may quantize to 20% (i.e., 20% quantized), and the second stage may quantize to 50% (i.e., a further 30% quantized on the basis of the first stage), or the second stage may quantize to 70% (i.e., a further 50% quantized on the basis of the first stage).
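The incremental multistage quantization above can be sketched as follows; picking the largest-magnitude weights first is one of the selection orders the embodiment allows, and rounding to the nearest integer stands in for the unspecified quantization rounding.

```python
def incremental_quantize(weights, stage_ratios):
    """Quantize a weight list in stages. Each ratio in stage_ratios is
    cumulative, so a later stage only quantizes the additional fraction
    on top of what earlier stages already fixed (incremental scheme)."""
    n = len(weights)
    quantized = [False] * n
    for ratio in stage_ratios:
        target = int(round(ratio * n))      # total count quantized so far
        todo = target - sum(quantized)
        # order the not-yet-quantized indices by magnitude, largest first
        order = sorted((i for i in range(n) if not quantized[i]),
                       key=lambda i: -abs(weights[i]))
        for i in order[:todo]:
            weights[i] = float(round(weights[i]))
            quantized[i] = True
    return weights, quantized
```

Between stages, the still-unquantized entries would be retrained (step 206) before the next call fixes more of them.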
Step 209: take the updated second weight set as the weights of the target convolutional layer, and determine whether the total quantization ratio of the weights of the target convolutional layer at the current stage reaches the preset ratio value.
In the present embodiment, electronic equipment can be using the weight in newer second weight set as target convolutional layer Weight updates the weight of target convolutional layer.Meanwhile it can continue to determine total amount of the weight in the current generation of target convolutional layer Whether change ratio reaches above-mentioned preset ratio value.If it is determined that total quantization ratio of the weight of target convolutional layer in the current generation reaches Preset ratio value can then return to step 205;If it is determined that total quantization ratio of the weight of target convolutional layer in the current generation Not up to preset ratio value can then continue to execute step 206.
It can be understood that the method in the present embodiment can generate the target convolutional neural network through multistage quantization. In this way, for different quantization stages (quantization ratios), the processing accuracy of the convolutional neural network can be adjusted, which helps improve the flexibility of training. Meanwhile, the multistage quantization process facilitates the discovery and resolution of problems.
In addition, to improve the efficiency of generating the target convolutional neural network, in the above embodiments multistage quantization may be applied only to the weights of the target convolutional layer, while all the initial input values of the target convolutional layer may be quantization-rounded at once to generate the second input value set. In this case, the second input value set contains no first input values that have not undergone quantization rounding. Thus, in each subsequent quantization stage, the second input value set need not be processed and updated again, which simplifies the process. Meanwhile, since the input values of the target convolutional layer (i.e., the second input value set) remain unchanged in each quantization stage, this helps ensure the precision of the weights of the target convolutional layer of the generated target convolutional neural network.
In some application scenarios, after the target convolutional neural network is generated, the electronic device may further obtain initial input information of the target convolutional layer of the target convolutional neural network; then perform quantization rounding on the initial input information to obtain integer input values; and afterwards input the integer input values into the target convolutional layer and perform a convolution operation with the weights of the target convolutional layer, which are integers, to generate output information. That is, after the input information is converted to integers, integer-to-integer operations can be realized, which helps improve operational efficiency.
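The integer-only inference described above can be sketched as a one-dimensional valid convolution; restricting to the 1-D case and the function name are simplifying assumptions.

```python
def int_conv1d(inputs, int_weights):
    """Round the initial input information to integers, then convolve
    (valid mode) with the already-integer weights of the target
    convolutional layer, so every multiply-accumulate is integer-only."""
    xs = [int(round(v)) for v in inputs]
    k = len(int_weights)
    return [sum(xs[i + j] * int_weights[j] for j in range(k))
            for i in range(len(xs) - k + 1)]
```

Because both operands are integers, the inner loop maps directly onto fixed-point hardware paths, which is the source of the speed and energy gains discussed below.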
It can be understood that the target convolutional neural network obtained by the above method can convert floating-point operations into fixed-point operations, which reduces the memory footprint while helping improve operation speed. Experiments have found that, for a general-purpose CPU (Central Processing Unit), the processing speed can be roughly doubled; for an FPGA (Field-Programmable Gate Array), the processing speed can be roughly on par with that of a graphics processing unit (GPU), and energy consumption can be reduced.
In addition, to make the weight distribution after quantization more uniform, the electronic device may also handle outliers among the weights before quantization. Specifically, first, the electronic device may gather statistics on the distribution of the weights; then, according to the distribution information, it may determine whether there are weights satisfying a preset condition; and then, if it is determined that there are weights satisfying the preset condition, it may process those weights. The processing may be deleting or scaling the weights that satisfy the preset condition.
The scaling method here is not limited; it may be proportional scaling, or scaling to a target weight value, etc. The preset condition may likewise be set according to actual conditions; for example, it may be that a weight value exceeds the average weight value by a preset ratio (e.g., 5%), or it may select a certain proportion of the weights at the large end when the weights are arranged in descending order (e.g., the top 5%). It should be noted that the above processing may be embedded in each training iteration, or may be carried out at intervals of a certain number of training iterations.
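The outlier handling above can be sketched as follows, under the second example condition (the top fraction of weights by value); clamping outliers to the largest remaining weight is one concrete form of "scaling to a target weight value", and the names are illustrative.

```python
def clip_outliers(weights, ratio=0.05):
    """Scale the largest `ratio` fraction of weights (by value) down to
    the largest non-outlier value before quantization, per the outlier
    handling described above. Deleting instead of scaling is the other
    option the text mentions."""
    n_out = int(len(weights) * ratio)
    if n_out == 0:
        return list(weights)
    cutoff = sorted(weights)[-n_out - 1]    # largest non-outlier value
    return [min(w, cutoff) for w in weights]
```

Removing the long tail this way shrinks the range [a, b] that the subintervals must cover, so the same number of quantization levels resolves the bulk of the weights more finely.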
In the method for generating a convolutional neural network provided in this embodiment, the initial input values and initial weights of the target convolutional layer of a convolutional neural network are obtained, so that the first input value set and the first weight set can be generated respectively. Then, according to the quantization ratio of the current stage, quantization rounding can be performed on each first input value in the first input value set and each first weight in the first weight set, respectively, to generate the second input value set and the second weight set, where the second weight set includes first weights after quantization rounding. Afterwards, the second weight set can be taken as the weights of the target convolutional layer, converting at least some of the weights of the target convolutional layer into integer weights. Meanwhile, it can be determined whether the total quantization ratio of the weights of the target convolutional layer at the current stage reaches a preset ratio value. If it is determined that the total quantization ratio reaches the preset ratio value, the convolutional neural network of the current stage can be taken as the target convolutional neural network and stored. In the target convolutional layer of the target convolutional neural network at this time, the total proportion of integer weights is the above preset ratio value. This embodiment realizes multistage quantization of the weights of the target convolutional layer, which enriches the methods for generating convolutional neural networks and helps improve the flexibility of such methods.
With further reference to Fig. 3, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an apparatus for generating a convolutional neural network. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus may be applied to various electronic devices.
As shown in Fig. 3, the apparatus 300 for generating a convolutional neural network of the present embodiment may include: a first acquisition unit 301, configured to obtain the initial input values and initial weights of the target convolutional layer of a convolutional neural network; a first generation unit 302, configured to generate a first input value set and a first weight set, respectively, according to the initial input values and initial weights; a first quantization unit 303, configured to perform quantization rounding, according to the quantization ratio of the current stage, on each first input value in the first input value set and each first weight in the first weight set, respectively, to generate a second input value set and a second weight set, where the second weight set includes first weights after quantization rounding; a determination unit 304, configured to take the second weight set as the weights of the target convolutional layer and determine whether the total quantization ratio of the weights of the target convolutional layer at the current stage reaches a preset ratio value; and a second generation unit 305, configured to, in response to determining that the total quantization ratio of the weights of the target convolutional layer at the current stage reaches the preset ratio value, take the convolutional neural network of the current stage as the target convolutional neural network and store the target convolutional neural network.
In the present embodiment, for the specific implementations and resulting advantageous effects of the first acquisition unit 301, the first generation unit 302, the first quantization unit 303, the determination unit 304, and the second generation unit 305, reference may be made to the related descriptions of steps 201, 202, 203, 204, and 205 in the embodiment shown in Fig. 2, which are not repeated here.
In some optional implementations of the present embodiment, the apparatus may be further configured to: in response to determining that the total quantization ratio of the weights of the target convolutional layer at the current stage does not reach the preset ratio value, train the convolutional neural network of the current stage according to a loss function to adjust the first weights in the second weight set that have not undergone quantization rounding, until the value of the loss function approaches an expected value; obtain the quantization ratio of the next stage as the quantization ratio of the current stage; according to the quantization ratio of the current stage, perform quantization rounding on the first input values in the second input value set that have not undergone quantization rounding and the first weights in the second weight set that have not undergone quantization rounding, respectively, to update the second input value set and the second weight set; and take the updated second weight set as the weights of the target convolutional layer and determine whether the total quantization ratio of the weights of the target convolutional layer at the current stage reaches the preset ratio value.
Optionally, the first generation unit 302 may be further configured to: according to the number of bits of the quantization encoding, evenly divide the range of the initial input values and the range of the initial weights into a preset number of subintervals, respectively, where the preset number is positively correlated with the number of bits of the quantization encoding; and, according to the input values and weights in the preset number of subintervals, generate the first input value set and the first weight set, respectively.
Further, the first quantization unit 303 may be further configured to: perform quantization rounding on each first input value in the first input value set according to a preset quantization method, and take the first input values after quantization rounding as second input values to generate the second input value set; and, according to the distribution probability of the first weights in the first weight set, round each first weight in the first weight set up or down, and take the first weights after rounding as second weights to generate the second weight set.
Optionally, the first generation unit 302 may also be further configured to: according to a preset first value, divide the range of the initial input values and the range of the initial weights into a preset number of subintervals of unequal length; taking the first value as the base, compute the logarithm of each initial input value and take the result as a first input value to generate the first input value set; and, according to the initial weights located in each subinterval, determine a subinterval weight as a first weight to generate the first weight set.
Further, the first quantization unit 303 may also be further configured to: perform quantization rounding on each first input value in the first input value set, raise the first value to the power of each rounded first input value, and take the result as a second input value to generate the second input value set; and, in the order of the subintervals, assign a serial number to each first weight in sequence to generate a query table, and take the serial number corresponding to each first weight as a second weight to generate the second weight set, where the serial numbers are integers and, in the query table, each serial number is stored with its first weight in the form of a key-value pair.
In some embodiments, the first quantization unit 303 may be further configured to: according to the quantization ratio of the current stage, select corresponding numbers of first input values and first weights from the first input value set and the first weight set, respectively, and perform quantization rounding on them, where the selection includes selecting in descending numerical order starting from the largest value, or selecting in ascending order of quantization error starting from the smallest error.
In addition, the apparatus 300 may further include: a second acquisition unit (not shown), configured to obtain initial input information of the target convolutional layer of the target convolutional neural network; a second quantization unit (not shown), configured to perform quantization rounding on the initial input information to obtain integer input values; and a third generation unit (not shown), configured to input the integer input values into the target convolutional layer and perform a convolution operation with the weights of the target convolutional layer to generate output information.
Referring to Fig. 4, it shows a schematic structural diagram of a computer system 400 of an electronic device suitable for implementing the embodiments of the present application. The electronic device shown in Fig. 4 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present application.
As shown in Fig. 4, the computer system 400 includes a central processing unit (CPU) 401, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 402 or a program loaded from a storage section 408 into a random access memory (RAM) 403. The RAM 403 also stores various programs and data required for the operation of the system 400. The CPU 401, the ROM 402, and the RAM 403 are connected to one another through a bus 404. An input/output (I/O) interface 405 is also connected to the bus 404.
The following components are connected to the I/O interface 405: an input section 406 including a touch screen, a keyboard, a mouse, etc.; an output section 407 including a cathode-ray tube (CRT), a liquid crystal display (LCD), etc., and a speaker, etc.; a storage section 408 including a hard disk, etc.; and a communication section 409 including a network interface card such as a LAN card or a modem. The communication section 409 performs communication processing via a network such as the Internet. A drive 410 is also connected to the I/O interface 405 as needed. A removable medium 411, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 410 as needed, so that a computer program read therefrom is installed into the storage section 408 as needed.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the methods shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 409, and/or installed from the removable medium 411. When the computer program is executed by the central processing unit (CPU) 401, the above-mentioned functions defined in the methods of the present application are executed. It should be noted that the computer-readable medium of the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example — but is not limited to — an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any appropriate combination of the above. In the present application, a computer-readable storage medium may be any tangible medium containing or storing a program that can be used by, or in combination with, an instruction execution system, apparatus, or device. In the present application, a computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any appropriate combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; such a computer-readable medium can send, propagate, or transmit a program for use by, or in combination with, an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any suitable medium, including but not limited to wireless, wire, optical cable, RF, etc., or any appropriate combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing a specified logic function. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two boxes shown in succession may actually be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes therein, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or by hardware. The described units may also be provided in a processor, which may, for example, be described as: a processor including an acquisition unit, a quantization unit, and a generation unit. The names of these units do not, under certain circumstances, constitute a limitation on the units themselves; for example, the acquisition unit may also be described as "a unit for obtaining initial input information of the target convolutional layer of the target convolutional neural network".
As another aspect, the present application also provides a computer-readable medium, which may be included in the electronic device described in the above embodiments, or may exist separately without being assembled into the electronic device. The computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, they cause the electronic device to: obtain the initial input values and initial weights of the target convolutional layer of a convolutional neural network; generate a first input value set and a first weight set, respectively, according to the initial input values and initial weights; according to the quantization ratio of the current stage, perform quantization rounding on each first input value in the first input value set and each first weight in the first weight set, respectively, to generate a second input value set and a second weight set, where the second weight set includes first weights after quantization rounding; take the second weight set as the weights of the target convolutional layer, and determine whether the total quantization ratio of the weights of the target convolutional layer at the current stage reaches a preset ratio value; and, in response to determining that the total quantization ratio of the weights of the target convolutional layer at the current stage reaches the preset ratio value, take the convolutional neural network of the current stage as the target convolutional neural network and store the target convolutional neural network.
The above description is only a preferred embodiment of the present application and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, but should also cover, without departing from the above inventive concept, other technical solutions formed by any combination of the above technical features or their equivalents, for example, technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the present application.

Claims (18)

1. A method for generating a convolutional neural network, comprising:
obtaining initial input values and initial weights of a target convolutional layer of a convolutional neural network;
generating a first input value set and a first weight set from the initial input values and the initial weights, respectively;
performing, according to a quantization ratio of a current stage, quantization rounding on each first input value in the first input value set and each first weight in the first weight set, respectively, to generate a second input value set and a second weight set, wherein the second weight set comprises the first weights after quantization rounding;
using the second weight set as weights of the target convolutional layer, and determining whether a total quantization ratio of the weights of the target convolutional layer at the current stage reaches a preset ratio value; and
in response to determining that the total quantization ratio of the weights of the target convolutional layer at the current stage reaches the preset ratio value, taking the convolutional neural network of the current stage as a target convolutional neural network, and storing the target convolutional neural network.
2. The method according to claim 1, wherein the method further comprises:
in response to determining that the total quantization ratio of the weights of the target convolutional layer at the current stage does not reach the preset ratio value, training the convolutional neural network of the current stage according to a loss function to adjust the first weights in the second weight set that have not undergone quantization rounding, until the value of the loss function approaches an expected value; obtaining a quantization ratio of a next stage as the quantization ratio of the current stage; performing, according to the quantization ratio of the current stage, quantization rounding on the first input values in the second input value set that have not undergone quantization rounding and on the first weights in the second weight set that have not undergone quantization rounding, respectively, to update the second input value set and the second weight set; and using the updated second weight set as the weights of the target convolutional layer, and determining whether the total quantization ratio of the weights of the target convolutional layer at the current stage reaches the preset ratio value.
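Claims 1 and 2 together describe a staged, incremental quantization loop: at each stage a larger fraction of the weights is quantization-rounded and frozen, the still-floating weights may be retrained, and the loop stops once the total quantization ratio reaches the preset value. The following is a minimal NumPy sketch of that loop; the quantization step `STEP`, the stage ratios, and the `retrain` callback are illustrative assumptions not fixed by the claims.

```python
import numpy as np

STEP = 2 ** -4  # hypothetical fixed quantization grid; the claims leave the method open

def quantize_round(values, step=STEP):
    """Snap values onto a uniform grid (one possible 'preset quantization method')."""
    return np.round(values / step) * step

def staged_quantization(weights, stage_ratios=(0.5, 0.75, 1.0), retrain=None):
    """Quantize an increasing fraction of the weights at each stage; between
    stages the still-floating weights may be retrained (claim 2)."""
    w = weights.astype(float)          # astype copies, so the caller's array is untouched
    flat = w.ravel()                   # view into w
    done = np.zeros(flat.size, dtype=bool)   # which weights are already quantization-rounded
    for ratio in stage_ratios:               # quantization ratio of each stage
        need = int(round(ratio * flat.size)) - done.sum()
        mag = np.abs(flat).copy()
        mag[done] = -1.0                     # never re-pick already-quantized weights
        idx = np.argsort(-mag)[:need]        # largest magnitudes first (one order from claim 7)
        flat[idx] = quantize_round(flat[idx])
        done[idx] = True
        if not done.all() and retrain is not None:
            flat[~done] = retrain(flat[~done])   # adjust only the unquantized weights
    return w, done.reshape(w.shape)
```

With the default ratios the final stage quantizes everything, so the returned mask is all-True and every weight lies on the `STEP` grid.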
3. The method according to claim 1, wherein the generating a first input value set and a first weight set from the initial input values and the initial weights, respectively, comprises:
evenly dividing, according to a number of bits of a quantization code, the range of the initial input values and the range of the initial weights into a preset number of subintervals, respectively, wherein the preset number is positively correlated with the number of bits of the quantization code; and
generating the first input value set and the first weight set, respectively, from the input values and the weights in the preset number of subintervals.
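The partition step of claim 3 ties the subinterval count to the bit width of the quantization code; taking the count as `2**bits` equal-length subintervals is one interpretation of "positively correlated". A sketch, with NumPy as an assumed implementation language:

```python
import numpy as np

def uniform_subintervals(values, bits):
    """Evenly divide the value range into 2**bits equal-length subintervals.
    The subinterval count grows with the bit width of the quantization code,
    matching the 'positively correlated' relation in claim 3."""
    n = 2 ** bits
    edges = np.linspace(values.min(), values.max(), n + 1)
    return edges  # subinterval i spans [edges[i], edges[i + 1])

# e.g. 3-bit code over the range [-1, 1] gives 8 subintervals of length 0.25
edges = uniform_subintervals(np.array([-1.0, 0.2, 0.9, 1.0]), bits=3)
```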
4. The method according to claim 3, wherein the performing quantization rounding on each first input value in the first input value set and each first weight in the first weight set, respectively, to generate a second input value set and a second weight set comprises:
performing quantization rounding on each first input value in the first input value set according to a preset quantization method, and taking the first input values after quantization rounding as second input values, to generate the second input value set; and
rounding each first weight in the first weight set up or down according to a distribution probability of each first weight in the first weight set, and taking the first weights after rounding as second weights, to generate the second weight set.
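One common reading of the probabilistic rounding in claim 4 is stochastic rounding: each weight rounds up with probability equal to its fractional position within the quantization step, making the rounding unbiased in expectation. A sketch under that assumption (the `step` grid is illustrative, not recited by the claim):

```python
import numpy as np

def stochastic_round(w, step=2 ** -4, rng=None):
    """Round each weight up or down probabilistically: the chance of rounding
    up equals the fractional position within the quantization step, so the
    rounded weight equals the original in expectation."""
    rng = np.random.default_rng() if rng is None else rng
    scaled = w / step
    low = np.floor(scaled)
    frac = scaled - low                 # in [0, 1): probability of rounding up
    up = rng.random(w.shape) < frac     # True -> round up, False -> round down
    return (low + up) * step
```

Values already on the grid have zero fractional part and therefore never move; all outputs land exactly on multiples of `step`.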
5. The method according to claim 1, wherein the generating a first input value set and a first weight set from the initial input values and the initial weights, respectively, comprises:
dividing, according to a preset first value, the range of the initial input values and the range of the initial weights into a preset number of subintervals of different lengths;
calculating a logarithm of each initial input value with the first value as a base, and taking the calculation results as first input values, to generate the first input value set; and
determining an interval weight from the initial weights located in each subinterval, as a first weight, to generate the first weight set.
6. The method according to claim 5, wherein the performing quantization rounding on each first input value in the first input value set and each first weight in the first weight set, respectively, to generate a second input value set and a second weight set comprises:
performing quantization rounding on each first input value in the first input value set, and calculating a power of the first value with each first input value after quantization rounding as an exponent, as a second input value, to generate the second input value set; and
establishing serial numbers for the first weights in sequence according to the order of the subintervals to generate a lookup table, and taking the serial number corresponding to each first weight as a second weight, to generate the second weight set, wherein the serial numbers are integers, and the serial numbers are stored in the lookup table in the form of key-value pairs with the first weights.
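Claims 5 and 6 can be read as logarithmic quantization for the inputs (round the base-b logarithm to an integer exponent, then reconstruct b**exponent) plus an integer lookup table for the per-subinterval representative weights. A sketch, with `BASE = 2.0` as an assumed "preset first value":

```python
import numpy as np

BASE = 2.0  # hypothetical 'preset first value'; the claims do not fix it

def log_quantize_inputs(x):
    """Claims 5/6 for inputs: take log_BASE of each positive input, round to an
    integer exponent, then reconstruct BASE**exponent as the second input value."""
    exponents = np.round(np.log(x) / np.log(BASE))  # rounded first input values
    return BASE ** exponents                        # second input values

def build_weight_lookup(interval_weights):
    """Claim 6 for weights: number the per-subinterval representative weights in
    subinterval order. The integer serial number becomes the second weight, and
    the lookup table stores serial number -> representative weight as key-value
    pairs."""
    table = {i: w for i, w in enumerate(interval_weights)}
    serials = np.arange(len(interval_weights))
    return serials, table
```

At inference, the small integer serial numbers are carried through the network and mapped back to real weights via the table.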
7. The method according to claim 1, wherein the performing, according to the quantization ratio of the current stage, quantization rounding on each first input value in the first input value set and each first weight in the first weight set, respectively, comprises:
selecting, according to the quantization ratio of the current stage, corresponding numbers of first input values and first weights from the first input value set and the first weight set, respectively, for quantization rounding, wherein the selecting comprises selecting from the large-value end in descending order of value, or selecting from the small-error end in ascending order of quantization error.
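Claim 7 recites two selection orders for each stage: from the large-value end by descending value, or from the small-error end by ascending quantization error. They might be sketched as follows; the `step` grid used to measure rounding error is an assumption.

```python
import numpy as np

def select_for_quantization(values, ratio, by="magnitude", step=2 ** -4):
    """Pick round(ratio * n) indices to quantization-round this stage, either
    from the large-value end (descending magnitude) or from the small-error
    end (ascending rounding error), as recited in claim 7."""
    n = int(round(ratio * values.size))
    if by == "magnitude":
        order = np.argsort(-np.abs(values))                   # largest values first
    else:
        err = np.abs(values - np.round(values / step) * step)  # distance to the grid
        order = np.argsort(err)                                # smallest error first
    return order[:n]
```

The two orders generally pick different weights: large-magnitude selection freezes the most influential weights early, while small-error selection freezes the weights that quantization perturbs least.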
8. The method according to any one of claims 1-7, wherein the method further comprises:
obtaining initial input information of the target convolutional layer of the target convolutional neural network;
performing quantization rounding on the initial input information to obtain integer input values; and
inputting the integer input values into the target convolutional layer, and performing a convolution operation with the weights of the target convolutional layer, to generate output information.
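Claim 8 describes the inference path: quantization-round the incoming activations to integers, run the convolution on integers, and rescale the accumulator. A 1-D sketch, where `np.convolve` and the shared `scale` stand in for the layer's actual convolution operation and quantization parameters:

```python
import numpy as np

def integer_conv1d(inputs, kernel, scale=2 ** -4):
    """Quantization-round activations and weights to integers, accumulate with
    pure integer multiply-adds, then rescale back to the real-valued range."""
    int_inputs = np.round(inputs / scale).astype(np.int64)   # integer input values
    int_kernel = np.round(kernel / scale).astype(np.int64)   # integer weights
    acc = np.convolve(int_inputs, int_kernel, mode="valid")  # integer MACs only
    return acc * scale * scale                               # undo both scalings
```

When both operands already lie on the `scale` grid, the integer result matches the floating-point convolution exactly; otherwise it matches up to rounding error.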
9. An apparatus for generating a convolutional neural network, comprising:
a first acquisition unit configured to obtain initial input values and initial weights of a target convolutional layer of a convolutional neural network;
a first generation unit configured to generate a first input value set and a first weight set from the initial input values and the initial weights, respectively;
a first quantization unit configured to perform, according to a quantization ratio of a current stage, quantization rounding on each first input value in the first input value set and each first weight in the first weight set, respectively, to generate a second input value set and a second weight set, wherein the second weight set comprises the first weights after quantization rounding;
a determination unit configured to use the second weight set as weights of the target convolutional layer, and determine whether a total quantization ratio of the weights of the target convolutional layer at the current stage reaches a preset ratio value; and
a second generation unit configured to, in response to determining that the total quantization ratio of the weights of the target convolutional layer at the current stage reaches the preset ratio value, take the convolutional neural network of the current stage as a target convolutional neural network, and store the target convolutional neural network.
10. The apparatus according to claim 9, wherein the apparatus is further configured to:
in response to determining that the total quantization ratio of the weights of the target convolutional layer at the current stage does not reach the preset ratio value, train the convolutional neural network of the current stage according to a loss function to adjust the first weights in the second weight set that have not undergone quantization rounding, until the value of the loss function approaches an expected value; obtain a quantization ratio of a next stage as the quantization ratio of the current stage; perform, according to the quantization ratio of the current stage, quantization rounding on the first input values in the second input value set that have not undergone quantization rounding and on the first weights in the second weight set that have not undergone quantization rounding, respectively, to update the second input value set and the second weight set; and use the updated second weight set as the weights of the target convolutional layer, and determine whether the total quantization ratio of the weights of the target convolutional layer at the current stage reaches the preset ratio value.
11. The apparatus according to claim 9, wherein the first generation unit is further configured to:
evenly divide, according to a number of bits of a quantization code, the range of the initial input values and the range of the initial weights into a preset number of subintervals, respectively, wherein the preset number is positively correlated with the number of bits of the quantization code; and
generate the first input value set and the first weight set, respectively, from the input values and the weights in the preset number of subintervals.
12. The apparatus according to claim 11, wherein the first quantization unit is further configured to:
perform quantization rounding on each first input value in the first input value set according to a preset quantization method, and take the first input values after quantization rounding as second input values, to generate the second input value set; and
round each first weight in the first weight set up or down according to a distribution probability of each first weight in the first weight set, and take the first weights after rounding as second weights, to generate the second weight set.
13. The apparatus according to claim 9, wherein the first generation unit is further configured to:
divide, according to a preset first value, the range of the initial input values and the range of the initial weights into a preset number of subintervals of different lengths;
calculate a logarithm of each initial input value with the first value as a base, and take the calculation results as first input values, to generate the first input value set; and
determine an interval weight from the initial weights located in each subinterval, as a first weight, to generate the first weight set.
14. The apparatus according to claim 13, wherein the first quantization unit is further configured to:
perform quantization rounding on each first input value in the first input value set, and calculate a power of the first value with each first input value after quantization rounding as an exponent, as a second input value, to generate the second input value set; and
establish serial numbers for the first weights in sequence according to the order of the subintervals to generate a lookup table, and take the serial number corresponding to each first weight as a second weight, to generate the second weight set, wherein the serial numbers are integers, and the serial numbers are stored in the lookup table in the form of key-value pairs with the first weights.
15. The apparatus according to claim 9, wherein the first quantization unit is further configured to:
select, according to the quantization ratio of the current stage, corresponding numbers of first input values and first weights from the first input value set and the first weight set, respectively, for quantization rounding, wherein the selecting comprises selecting from the large-value end in descending order of value, or selecting from the small-error end in ascending order of quantization error.
16. The apparatus according to any one of claims 9-15, wherein the apparatus further comprises:
a second acquisition unit configured to obtain initial input information of the target convolutional layer of the target convolutional neural network;
a second quantization unit configured to perform quantization rounding on the initial input information to obtain integer input values; and
a third generation unit configured to input the integer input values into the target convolutional layer, and perform a convolution operation with the weights of the target convolutional layer, to generate output information.
17. An electronic device, comprising:
one or more processors; and
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-8.
18. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1-8.
CN201810084926.1A 2018-01-29 2018-01-29 Method and apparatus for generating convolutional neural networks Pending CN108288089A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810084926.1A CN108288089A (en) 2018-01-29 2018-01-29 Method and apparatus for generating convolutional neural networks

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810084926.1A CN108288089A (en) 2018-01-29 2018-01-29 Method and apparatus for generating convolutional neural networks

Publications (1)

Publication Number Publication Date
CN108288089A true CN108288089A (en) 2018-07-17

Family

ID=62835926

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810084926.1A Pending CN108288089A (en) 2018-01-29 2018-01-29 Method and apparatus for generating convolutional neural networks

Country Status (1)

Country Link
CN (1) CN108288089A (en)


Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109117742A (en) * 2018-07-20 2019-01-01 百度在线网络技术(北京)有限公司 Gesture detection model processing method, apparatus, device and storage medium
CN109117742B (en) * 2018-07-20 2022-12-27 百度在线网络技术(北京)有限公司 Gesture detection model processing method, device, equipment and storage medium
CN109308194A (en) * 2018-09-29 2019-02-05 北京字节跳动网络技术有限公司 Method and apparatus for storing data
CN109754074A (en) * 2018-12-29 2019-05-14 北京中科寒武纪科技有限公司 Neural network quantization method, apparatus and related product
US11699073B2 (en) 2018-12-29 2023-07-11 Cambricon Technologies Corporation Limited Network off-line model processing method, artificial intelligence processing device and related products
CN110222821A (en) * 2019-05-30 2019-09-10 浙江大学 Convolutional neural networks low-bit width quantization method based on weight distribution
CN110222821B (en) * 2019-05-30 2022-03-25 浙江大学 Weight distribution-based convolutional neural network low bit width quantization method
CN112085183A (en) * 2019-06-12 2020-12-15 上海寒武纪信息科技有限公司 Neural network operation method and device and related product
CN112085183B (en) * 2019-06-12 2024-04-02 上海寒武纪信息科技有限公司 Neural network operation method and device and related products
WO2021036908A1 (en) * 2019-08-23 2021-03-04 安徽寒武纪信息科技有限公司 Data processing method and apparatus, computer equipment and storage medium
WO2021036890A1 (en) * 2019-08-23 2021-03-04 安徽寒武纪信息科技有限公司 Data processing method and apparatus, computer device, and storage medium

Similar Documents

Publication Publication Date Title
CN108288089A (en) Method and apparatus for generating convolutional neural networks
CN108229663A (en) Method and apparatus for generating a convolutional neural network
US11620532B2 (en) Method and apparatus for generating neural network
CN108304919A (en) Method and apparatus for generating convolutional neural networks
CN110880036B (en) Neural network compression method, device, computer equipment and storage medium
CN110766142A (en) Model generation method and device
CN110852438B (en) Model generation method and device
CN108256632A (en) Information processing method and device
EP2531959B1 (en) Organizing neural networks
JP7287397B2 (en) Information processing method, information processing apparatus, and information processing program
CN113705610A (en) Heterogeneous model aggregation method and system based on federated learning
CN109190754A (en) Quantization model generation method, apparatus and electronic device
JP7354463B2 (en) Data protection methods, devices, servers and media
CN109388779A (en) Neural network weight quantization method and neural network weight quantization apparatus
CN105389454A (en) Predictive model generator
CN106776925A (en) Method, server and system for predicting the gender of a mobile terminal user
CN112768056A (en) Disease prediction model establishing method and device based on joint learning framework
CN106778843A (en) Method, server and system for predicting the gender of a mobile terminal user
CN116684330A (en) Traffic prediction method, device, equipment and storage medium based on artificial intelligence
CN116861259B (en) Training method and device of reward model, storage medium and electronic equipment
CN108509179A (en) Method and apparatus for generating model
CN110489435B (en) Data processing method and device based on artificial intelligence and electronic equipment
CN109670579A (en) Model generating method and device
CN114700957B (en) Robot control method and apparatus with low model computational power requirements
CN113762687B (en) Personnel scheduling method and device in warehouse

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination