CN109961141A - Method and apparatus for generating a quantized neural network - Google Patents
Method and apparatus for generating a quantized neural network
- Publication number
- CN109961141A CN201910288941.2A
- Authority
- CN
- China
- Prior art keywords
- neural network
- quantization
- parameter
- network
- floating
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Neurology (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Machine Translation (AREA)
Abstract
Embodiments of the disclosure disclose a method and apparatus for generating a quantized neural network. One specific embodiment of the method includes: obtaining a training sample set and an initial neural network; converting the original floating-point network parameters in the initial neural network into integer network parameters; generating an initial quantized neural network based on the converted integer network parameters; selecting a training sample from the training sample set and performing the following training step: using the sample information in the training sample as the input of the initial quantized neural network and the sample result in the training sample as the desired output of the initial quantized neural network, training the initial quantized neural network; and, in response to determining that training of the initial quantized neural network is complete, generating a quantized neural network based on the trained initial quantized neural network. This embodiment helps reduce the storage space occupied by the neural network and the CPU consumption when the neural network performs information processing, thereby improving the efficiency of information processing.
Description
Technical field
Embodiments of the disclosure relate to the field of computer technology, and in particular to a method and apparatus for generating a quantized neural network.
Background
At present, to accelerate the training of a neural network, a BN (Batch Normalization) layer is usually connected after a convolutional layer of the neural network. Before the output of the convolutional layer is passed to other layers, the BN layer normalizes that output, thereby improving the convergence speed of the neural network.
The BN layer includes normalization parameters that are multiplied with the output of the convolutional layer to normalize it. The output of the convolutional layer is obtained by convolving the weights of the convolutional layer with the input of the convolutional layer. In practice, the data type of both the weights and the normalization parameters is usually floating-point.
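As context for the weight-times-normalization-parameter product that the disclosure later quantizes, the multiplicative part of a BN layer can be folded into the convolution weights. The sketch below is illustrative only: the scale form gamma / sqrt(var + eps) is the standard BN formulation and is our assumption, since the disclosure does not spell out how the normalization parameter is computed.

```python
import numpy as np

def fold_bn_scale(conv_weight, bn_gamma, bn_running_var, eps=1e-5):
    """Fold the multiplicative part of batch normalization into the
    convolution weights: w' = w * gamma / sqrt(var + eps), applied
    per output channel. conv_weight has shape (out_ch, in_ch, kh, kw)."""
    scale = bn_gamma / np.sqrt(bn_running_var + eps)
    return conv_weight * scale[:, None, None, None]

w = np.ones((2, 3, 3, 3), dtype=np.float32)
gamma = np.array([2.0, 0.5], dtype=np.float32)
var = np.ones(2, dtype=np.float32)
folded = fold_bn_scale(w, gamma, var, eps=0.0)
print(folded[0, 0, 0, 0], folded[1, 0, 0, 0])  # 2.0 0.5
```

After folding, the conv-plus-BN pair behaves as a single convolution whose parameters are the products described in the text.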
Summary of the invention
Embodiments of the disclosure propose a method and apparatus for generating a quantized neural network.
In a first aspect, embodiments of the disclosure provide a method for generating a quantized neural network, the method including: obtaining a training sample set and an initial neural network, where each training sample includes sample information and a sample result predetermined for the sample information, the initial neural network includes original floating-point network parameters, and an original floating-point network parameter is the product of a floating-point weight of a convolutional layer in the initial neural network and a floating-point normalization parameter of the batch normalization layer connected to the convolutional layer; converting the original floating-point network parameters in the initial neural network into integer network parameters; generating an initial quantized neural network based on the converted integer network parameters; selecting a training sample from the training sample set and performing the following training step: using the sample information in the selected training sample as the input of the initial quantized neural network and the sample result in the selected training sample as the desired output of the initial quantized neural network, training the initial quantized neural network; and, in response to determining that training of the initial quantized neural network is complete, generating a quantized neural network based on the trained initial quantized neural network.
In some embodiments, generating the initial quantized neural network based on the converted integer network parameters includes: converting the converted integer network parameters into floating-point network parameters, and determining the initial neural network including the resulting floating-point network parameters as the initial quantized neural network.
In some embodiments, converting the original floating-point network parameters in the initial neural network into integer network parameters includes: converting the floating-point weights corresponding to the original floating-point network parameters into integer weights, and converting the floating-point normalization parameters corresponding to the original floating-point network parameters into integer normalization parameters; and multiplying the converted integer weights and integer normalization parameters to obtain the integer network parameters.
In some embodiments, the method further includes: in response to determining that training of the initial quantized neural network is not complete, performing the following steps: selecting a training sample from the training samples in the training sample set that have not yet been selected; adjusting the parameters of the initial quantized neural network to obtain new floating-point network parameters; converting the new floating-point network parameters into new integer network parameters, and generating a new initial quantized neural network based on the new integer network parameters; and continuing the training step using the most recently selected training sample and the newly generated initial quantized neural network.
In some embodiments, generating a new initial quantized neural network based on the new integer network parameters includes: converting the new integer network parameters into floating-point network parameters, and determining the initial quantized neural network including the resulting floating-point network parameters as the new initial quantized neural network.
In some embodiments, the method further includes: sending the quantized neural network to a user terminal, so that the user terminal stores the received quantized neural network.
In a second aspect, embodiments of the disclosure provide a method for processing information, the method including: obtaining information to be processed and a target quantized neural network, where the target quantized neural network is generated using the method of any embodiment of the first aspect; and inputting the information to be processed into the target quantized neural network to obtain and output a processing result.
In a third aspect, embodiments of the disclosure provide an apparatus for generating a quantized neural network, the apparatus including: a first obtaining unit configured to obtain a training sample set and an initial neural network, where each training sample includes sample information and a sample result predetermined for the sample information, the initial neural network includes original floating-point network parameters, and an original floating-point network parameter is the product of a floating-point weight of a convolutional layer in the initial neural network and a floating-point normalization parameter of the batch normalization layer connected to the convolutional layer; a conversion unit configured to convert the original floating-point network parameters in the initial neural network into integer network parameters; a generation unit configured to generate an initial quantized neural network based on the converted integer network parameters; and a first execution unit configured to select a training sample from the training sample set and perform the following training step: using the sample information in the selected training sample as the input of the initial quantized neural network and the sample result in the selected training sample as the desired output of the initial quantized neural network, training the initial quantized neural network; in response to determining that training of the initial quantized neural network is complete, generating a quantized neural network based on the trained initial quantized neural network.
In some embodiments, the generation unit is further configured to: convert the converted integer network parameters into floating-point network parameters, and determine the initial neural network including the resulting floating-point network parameters as the initial quantized neural network.
In some embodiments, the conversion unit includes: a conversion module configured to convert the floating-point weights corresponding to the original floating-point network parameters into integer weights, and to convert the floating-point normalization parameters corresponding to the original floating-point network parameters into integer normalization parameters; and a multiplication module configured to multiply the converted integer weights and integer normalization parameters to obtain the integer network parameters.
In some embodiments, the apparatus further includes: a second execution unit configured to, in response to determining that training of the initial quantized neural network is not complete, perform the following steps: selecting a training sample from the training samples in the training sample set that have not yet been selected; adjusting the parameters of the initial quantized neural network to obtain new floating-point network parameters; converting the new floating-point network parameters into new integer network parameters, and generating a new initial quantized neural network based on the new integer network parameters; and continuing the training step using the most recently selected training sample and the newly generated initial quantized neural network.
In some embodiments, the second execution unit is further configured to: convert the new integer network parameters into floating-point network parameters, and determine the initial quantized neural network including the resulting floating-point network parameters as the new initial quantized neural network.
In some embodiments, the apparatus further includes: a sending unit configured to send the quantized neural network to a user terminal, so that the user terminal stores the received quantized neural network.
In a fourth aspect, embodiments of the disclosure provide an apparatus for processing information, the apparatus including: a second obtaining unit configured to obtain information to be processed and a target quantized neural network, where the target quantized neural network is generated using the method of any embodiment of the first aspect; and an input unit configured to input the information to be processed into the target quantized neural network to obtain and output a processing result.
In a fifth aspect, embodiments of the disclosure provide an electronic device including: one or more processors; and a storage apparatus on which one or more programs are stored, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any embodiment of the first or second aspect.
In a sixth aspect, embodiments of the disclosure provide a computer-readable medium on which a computer program is stored, where the program, when executed by a processor, implements the method of any embodiment of the first or second aspect.
The method and apparatus for generating a quantized neural network provided by embodiments of the disclosure obtain a training sample set and an initial neural network, where the initial neural network includes original floating-point network parameters; convert the original floating-point network parameters in the initial neural network into integer network parameters, and generate an initial quantized neural network based on the converted integer network parameters; and finally select a training sample from the training sample set and perform the following training step: using the sample information in the selected training sample as the input of the initial quantized neural network and the sample result in the selected training sample as the desired output of the initial quantized neural network, train the initial quantized neural network; in response to determining that training of the initial quantized neural network is complete, generate a quantized neural network based on the trained initial quantized neural network. In this way, the floating-point network parameters of the neural network are converted into integer network parameters during the training process, which imposes a quantization constraint on the network parameters of the neural network. This helps reduce the storage space occupied by the neural network and the CPU consumption when the neural network performs information processing, improving the efficiency of information processing. Moreover, compared with the prior-art approach of directly quantizing the network parameters of an already trained neural network to generate a quantized neural network, the scheme of the disclosure can reduce the precision loss caused by quantizing the network parameters and improve the accuracy of the quantized neural network. In turn, an electronic device that processes information using the quantized neural network of the disclosure can process information more accurately than a prior-art electronic device that processes information using a quantized neural network.
Brief description of the drawings
Other features, objects, and advantages of the disclosure will become more apparent upon reading the following detailed description of non-restrictive embodiments with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture diagram to which an embodiment of the disclosure may be applied;
Fig. 2 is a flowchart of one embodiment of the method for generating a quantized neural network according to the disclosure;
Fig. 3 is a schematic diagram of an application scenario of the method for generating a quantized neural network according to an embodiment of the disclosure;
Fig. 4 is a flowchart of one embodiment of the method for processing information according to the disclosure;
Fig. 5 is a structural schematic diagram of one embodiment of the apparatus for generating a quantized neural network according to the disclosure;
Fig. 6 is a structural schematic diagram of one embodiment of the apparatus for processing information according to the disclosure;
Fig. 7 is a structural schematic diagram of a computer system adapted to implement an electronic device of embodiments of the disclosure.
Detailed description
The disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are used only to explain the related invention, not to limit it. It should also be noted that, for ease of description, only the parts relevant to the related invention are shown in the drawings.
It should be noted that, in the absence of conflict, the embodiments of the disclosure and the features in the embodiments may be combined with each other. The disclosure is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method for generating a quantized neural network, the apparatus for generating a quantized neural network, the method for processing information, or the apparatus for processing information of the disclosure may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as web browser applications, shopping applications, search applications, instant messaging tools, email clients, and social platform software.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and so on. When the terminal devices 101, 102, 103 are software, they may be installed in the electronic devices listed above. They may be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module. No specific limitation is made here.
The server 105 may be a server that provides various services, such as a model processing server that processes an initial neural network sent by the terminal devices 101, 102, 103. The model processing server may analyze and otherwise process the received data, such as the initial neural network, and feed the processing result (for example, the quantized neural network) back to the terminal devices.
It should be noted that the method for generating a quantized neural network provided by embodiments of the disclosure is generally performed by the server 105, and correspondingly the apparatus for generating a quantized neural network is generally disposed in the server 105; in addition, the method for processing information provided by embodiments of the disclosure is generally performed by the terminal devices 101, 102, 103, and correspondingly the apparatus for processing information is generally disposed in the terminal devices 101, 102, 103.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module. No specific limitation is made here.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers as required by the implementation.
With continued reference to Fig. 2, a process 200 of one embodiment of the method for generating a quantized neural network according to the disclosure is shown. The method for generating a quantized neural network includes the following steps:
Step 201: obtain a training sample set and an initial neural network.
In this embodiment, the executing body of the method for generating a quantized neural network (for example, the server shown in Fig. 1) may obtain the training sample set and the initial neural network from a remote or local source through a wired or wireless connection. Each training sample in the training sample set includes sample information and a sample result predetermined for the sample information. The sample information may be information that the initial neural network can process, and may include but is not limited to at least one of the following: text, image, audio, video. For example, if the initial neural network is a neural network for face recognition, the sample information may be a sample face image. The sample result is the expected result that can be obtained by processing the sample information with the initial neural network (for example, gender information characterizing the gender of the person in the sample face image).
The initial neural network may be an untrained neural network or a trained neural network. The function of the initial neural network, in other words its inputs and outputs, may be predetermined. In turn, the above executing body may obtain a training sample set for training the initial neural network.
In this embodiment, the initial neural network includes original floating-point network parameters. An original floating-point network parameter is the product of a floating-point weight of a convolutional layer in the initial neural network and a floating-point normalization parameter of the batch normalization layer connected to the convolutional layer.
Specifically, the initial neural network includes a convolutional layer and a batch normalization layer. The convolutional layer includes floating-point weights. The floating-point weights may be convolved with the input of the convolutional layer to obtain the output of the convolutional layer. The batch normalization layer may be connected to the convolutional layer and normalizes the output of the convolutional layer. Specifically, the batch normalization layer includes floating-point normalization parameters that are multiplied with the output of the convolutional layer to normalize it. Here, the convolutional layer and the batch normalization layer may be regarded as a single network structure, whose output is determined by the product of the output of the convolutional layer and the floating-point normalization parameters. Since the output of the convolutional layer is determined by the convolution of the floating-point weights with the input of the convolutional layer, the output of this network structure may instead be determined by first multiplying the floating-point normalization parameters and the floating-point weights, and then convolving the result with the input of the convolutional layer. It can be understood that the input of the convolutional layer is the input variable of this network structure, and the product of the floating-point normalization parameters and the floating-point weights is the parameter of this network structure. Accordingly, in this embodiment, the product of the floating-point normalization parameters and the floating-point weights may be determined as a floating-point network parameter, and the original floating-point network parameters are the network parameters included in the initial neural network that are to be quantized.
In practice, quantization of floating-point data refers to converting the floating-point data into integer data within a certain value range. The value range is limited by the number of bits of the integer data. For example, if the target integer data is 8 bits wide (i.e., 8bit), the value range is [0, 255]. It should be noted that, in this embodiment, when the original floating-point network parameters are quantized, the number of bits of the target integer network parameters may be predetermined by a technician.
It can be understood that, for the same number of bits, floating-point data has higher precision than integer data, since floating-point data can record the information after the decimal point. Integer data, which does not record information after the decimal point, occupies less storage space, and calculations using integer data are faster.
It should be noted that, to obtain higher precision, the weights and normalization parameters in a prior-art neural network are typically stored as floating-point.
Step 202: convert the original floating-point network parameters in the initial neural network into integer network parameters.
In this embodiment, based on the initial neural network obtained in step 201, the executing body may convert the original floating-point network parameters in the initial neural network into integer network parameters.
Specifically, the executing body may first determine the number of bits of the target integer network parameters, and then use various existing methods to convert the original floating-point network parameters in the initial neural network into integer network parameters. It can be understood that converting the original floating-point network parameters in the initial neural network into integer network parameters is equivalent to adding a quantization constraint to the initial neural network.
As an example, suppose the original floating-point network parameters include the value "21.323", and the number of bits of the target integer weights is predetermined to be eight, so that the value range of the integer network parameters is [0, 255]. The value "21.323" in the original floating-point network parameters can then be directly converted into the integer network parameter "21" by rounding to the nearest integer.
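The round-to-nearest conversion in the disclosure's "21.323" example can be sketched as follows; clamping to the 8-bit range is implied by the stated value range (a minimal illustration, not code from the disclosure):

```python
def quantize(x, num_bits=8):
    """Round a floating-point value to the nearest integer and clamp it
    to the integer value range [0, 2**num_bits - 1]."""
    hi = 2 ** num_bits - 1
    return max(0, min(hi, int(round(x))))

print(quantize(21.323))  # 21
print(quantize(300.0))   # 255 (clamped to the top of the 8-bit range)
print(quantize(-1.5))    # 0   (clamped to the bottom)
```

The bit width `num_bits` corresponds to the number of bits predetermined by the technician in the text above.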
In some optional implementations of this embodiment, the executing body may convert the original floating-point network parameters in the initial neural network into integer network parameters through the following steps. First, the executing body may convert the floating-point weights corresponding to the original floating-point network parameters into integer weights, and convert the floating-point normalization parameters corresponding to the original floating-point network parameters into integer normalization parameters. Then, the executing body may multiply the converted integer weights and integer normalization parameters to obtain the integer network parameters. This implementation first adds a quantization constraint to the floating-point weights and floating-point normalization parameters, and then uses the quantization-constrained integer weights and integer normalization parameters to obtain the integer network parameters. In this way, the precision loss of the quantized integer network parameters can be reduced, helping to improve the accuracy of the quantized initial neural network.
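Assuming the same round-to-nearest, clamp-to-range rule as the disclosure's "21.323" example (our assumption; the disclosure does not fix the rounding rule), quantizing the two factors separately and then multiplying can be sketched as:

```python
def round_clamp(x, num_bits=8):
    """Round to the nearest integer and clamp to [0, 2**num_bits - 1]."""
    hi = 2 ** num_bits - 1
    return max(0, min(hi, int(round(x))))

def quantize_product(weight, norm_param, num_bits=8):
    """Quantize the floating-point weight and the floating-point
    normalization parameter separately, then multiply the integer
    results to obtain the integer network parameter."""
    return round_clamp(weight, num_bits) * round_clamp(norm_param, num_bits)

# Quantizing the factors first can give a different integer network
# parameter than quantizing their product directly:
print(quantize_product(10.4, 2.1))  # 10 * 2 = 20
print(round_clamp(10.4 * 2.1))      # round(21.84) = 22
```

The hypothetical values 10.4 and 2.1 are ours, chosen only to show that the two orders of operations can differ.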
Step 203: generate an initial quantized neural network based on the converted integer network parameters.
In this embodiment, based on the integer network parameters obtained by conversion in step 202, the executing body may generate an initial quantized neural network.
Specifically, the executing body may directly determine the initial neural network including the converted integer network parameters as the initial quantized neural network; alternatively, the executing body may further process the initial neural network including the converted integer network parameters, and determine the processed initial neural network as the initial quantized neural network.
In some optional implementations of this embodiment, the executing body may generate the initial quantized neural network through the following steps: the executing body may convert the converted integer network parameters back into floating-point network parameters, and determine the initial neural network including the resulting floating-point network parameters as the initial quantized neural network.
Here, converting the integer network parameters into floating-point network parameters is the inverse of the above conversion of the original floating-point network parameters into integer network parameters; the converted integer network parameters may be converted back with reference to the steps that converted the original floating-point network parameters into integer network parameters, obtaining the floating-point network parameters.
Continuing the example above, the converted integer network parameter is "21". From the original floating-point network parameter "21.323", it can be seen that the floating-point network parameters are accurate to three decimal places, so the integer network parameter "21" can be converted into the floating-point network parameter "21.000".
It should be noted that floating-point data can have higher precision than integer data. Therefore, after the quantization constraint has been added to the initial neural network, converting the integer network parameters back into floating-point network parameters helps improve the training precision in the subsequent training of the initial neural network and obtain more accurate training results.
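The quantize-then-dequantize round trip of steps 202 and 203 resembles what the quantization-aware-training literature calls "fake quantization" (this naming is ours, not the disclosure's); a minimal sketch under the same round-to-nearest rule as the "21.323" example:

```python
def fake_quantize(x, num_bits=8):
    """Apply the quantization constraint to a floating-point parameter:
    round and clamp to [0, 2**num_bits - 1] (the float -> int step), then
    convert the integer back to floating-point so that training proceeds
    on values that already carry the quantization constraint."""
    hi = 2 ** num_bits - 1
    q = max(0, min(hi, int(round(x))))  # step 202: float -> int
    return float(q)                     # step 203: int -> float

print(fake_quantize(21.323))  # 21.0, matching the "21.323" -> "21.000" example
```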
Step 204: select a training sample from the training sample set, and perform the following training step: using the sample information in the selected training sample as the input of the initial quantized neural network and the sample result in the selected training sample as the desired output of the initial quantized neural network, train the initial quantized neural network; in response to determining that training of the initial quantized neural network is complete, generate a quantized neural network based on the trained initial quantized neural network.
In the present embodiment, based on the training sample set obtained in step 201, above-mentioned executing subject can be from training sample
It concentrates and chooses training sample, and execute following training step:
It step 2041, will using the sample information in selected training sample as the input of quantization inceptive neural network
Desired output of the sample results as quantization inceptive neural network in selected training sample, to quantization inceptive neural network
It is trained.
Here, the execution body may train the quantized initial neural network using a machine learning method. Specifically, the execution body inputs the sample information into the quantized initial neural network to obtain an actual result, and then calculates the difference between the obtained actual result and the sample result in the training sample using a preset loss function. For example, the L2 norm may be used as the loss function to calculate the difference between the obtained actual result and the sample result in the training sample.
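The L2-norm difference mentioned above can be sketched as follows; the function name and vector representation are illustrative assumptions:

```python
import math

def l2_difference(actual, expected):
    """Difference between the actual result output by the quantized
    initial network and the sample result, using the L2 norm of the
    element-wise difference as the preset loss function."""
    return math.sqrt(sum((a - e) ** 2 for a, e in zip(actual, expected)))

diff = l2_difference([3.0, 4.0], [0.0, 0.0])
print(diff)  # 5.0
```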
Step 2042: in response to determining that training of the quantized initial neural network is completed, generate the quantized neural network based on the trained quantized initial neural network.
Specifically, the execution body may determine whether the training of the quantized initial neural network satisfies a preset completion condition; if so, it may determine that training of the quantized initial neural network is completed. The completion condition may include, but is not limited to, at least one of the following: the training time exceeds a preset duration; the number of training iterations exceeds a preset count; the calculated difference is less than a preset difference threshold.
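The completion condition above (any one criterion suffices) can be sketched as follows; all threshold values are illustrative placeholders, not values from the disclosure:

```python
def training_completed(elapsed_s: float, iterations: int, diff: float,
                       max_s: float = 3600.0, max_iter: int = 10_000,
                       diff_threshold: float = 1e-3) -> bool:
    """True if any of the listed completion criteria is met: training
    time over the preset duration, iteration count over the preset
    count, or calculated difference below the preset threshold."""
    return (elapsed_s > max_s
            or iterations > max_iter
            or diff < diff_threshold)

print(training_completed(10.0, 5, 0.0005))  # True: difference below threshold
print(training_completed(10.0, 5, 0.5))     # False: no criterion met
```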
In this embodiment, the execution body may, in response to determining that training is completed, generate the quantized neural network based on the trained quantized initial neural network. Here, the quantized neural network is the trained neural network whose included network parameters are integer network parameters.
Specifically, in response to determining that the network parameters in the trained quantized initial neural network are integer network parameters, the execution body may directly determine the trained quantized initial neural network as the quantized neural network; in response to determining that the network parameters in the trained quantized initial neural network are floating-point network parameters, the execution body may convert the floating-point network parameters in the trained quantized initial neural network into integer network parameters, and then determine the trained quantized initial neural network including the converted integer network parameters as the quantized neural network.
In this embodiment, the execution body may also, in response to determining that training of the quantized initial neural network is not completed, perform the following steps: select a training sample from the unselected training samples included in the training sample set; adjust the parameters of the quantized initial neural network to obtain new floating-point network parameters; convert the new floating-point network parameters into new integer network parameters, and generate a new quantized initial neural network based on the new integer network parameters; and continue to perform the above training steps (steps 2041-2042) using the most recently selected training sample and the most recently generated quantized initial neural network.
Here, various implementations may be used to adjust the parameters of the quantized initial neural network based on the calculated difference between the actual result and the sample result in the training sample. For example, the BP (Back Propagation) algorithm and the SGD (Stochastic Gradient Descent) algorithm may be used to adjust the parameters of the quantized initial neural network. It should be noted that, when adjusting the parameters, in order not to affect training convergence and to ensure that training can be performed repeatedly, the parameters are usually adjusted as floating-point values. Therefore, the quantized initial neural network obtains new floating-point network parameters after parameter adjustment; in turn, the execution body may convert the new floating-point network parameters into new integer network parameters, so as to again add the quantization constraint to the quantized initial neural network including the new floating-point network parameters, generating a new quantized initial neural network.
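One iteration of the loop just described — adjust in floating point, then re-quantize to re-impose the constraint — can be sketched as follows. The SGD-style update, learning rate, and round-to-nearest re-quantization are illustrative assumptions:

```python
def adjust_and_requantize(float_params, grads, lr=0.1):
    """Adjust the parameters as floating-point values (so training
    convergence is not affected), then convert the new floating-point
    parameters into new integer parameters to restore the
    quantization constraint."""
    new_float = [p - lr * g for p, g in zip(float_params, grads)]  # SGD step
    new_int = [round(p) for p in new_float]                        # re-quantize
    return new_float, new_int

f, q = adjust_and_requantize([2.134], [1.0])
# f is approximately [2.034]; q is [2]
```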
Specifically, the execution body may generate the new quantized initial neural network in various manners based on the new integer network parameters. For example, the quantized initial neural network including the new integer network parameters may be directly determined as the new quantized initial neural network. Alternatively, the execution body may also process the quantized initial neural network including the new integer network parameters, and determine the processed quantized initial neural network as the new quantized initial neural network.
In some optional implementations of this embodiment, the execution body may generate the new quantized initial neural network by the following steps: the execution body may convert the new integer network parameters into floating-point network parameters, and determine the quantized initial neural network including the obtained floating-point network parameters as the new quantized initial neural network.
In some optional implementations of this embodiment, the execution body may send the quantized neural network to a user terminal, so that the user terminal stores the received quantized neural network. Here, the quantized neural network to which the quantization constraint has been added occupies less storage space; through this implementation, the storage resources of the user terminal can be saved.
With continued reference to Fig. 3, Fig. 3 is a schematic diagram of an application scenario of the method for generating a quantized neural network according to this embodiment. In the application scenario of Fig. 3, a server 301 may first obtain a training sample set 302 and an initial neural network 303, where a training sample in the training sample set 302 includes sample information and a sample result predetermined for the sample information. The initial neural network 303 includes an original floating-point network parameter 304 (e.g., "2.134"). The original floating-point network parameter 304 is the product of the floating-point weight of a convolutional layer in the initial neural network 303 and the floating-point normalization parameter of the batch normalization layer connected to the convolutional layer. Then, the server 301 converts the original floating-point network parameter 304 in the initial neural network 303 into an integer network parameter 305 (e.g., "2"). Next, the server 301 generates a quantized initial neural network 306 based on the obtained integer network parameter 305. Finally, the server 301 may select a training sample 3021 from the training sample set 302 and perform the following training steps: using the sample information 30211 in the selected training sample 3021 as the input of the quantized initial neural network 306 and the sample result 30212 in the selected training sample 3021 as the desired output of the quantized initial neural network 306, train the quantized initial neural network 306; in response to determining that training of the quantized initial neural network 306 is completed, generate a quantized neural network 307 based on the trained quantized initial neural network 306.
In the method provided by the above embodiment of the disclosure, the floating-point network parameters in the neural network are converted into integer network parameters during training, thereby adding a quantization constraint to the network parameters of the neural network. This helps reduce the storage space occupied by the neural network and the CPU consumption when the neural network performs information processing, improving the efficiency of information processing. Moreover, compared with the prior-art scheme of directly quantizing the network parameters of an already trained neural network to generate a quantized neural network, the scheme of the disclosure can reduce the precision loss caused by quantizing the network parameters and improve the accuracy of the quantized neural network. In turn, an electronic device performing information processing using the quantized neural network of the disclosure can have a more accurate information processing function than a prior-art electronic device performing information processing using a quantized neural network.
With further reference to Fig. 4, a flow 400 of an embodiment of a method for processing information is illustrated. The flow 400 of the method for processing information includes the following steps:
Step 401: obtain to-be-processed information and a target quantized neural network.
In this embodiment, the execution body of the method for processing information (e.g., the terminal device shown in Fig. 1) may obtain the to-be-processed information and the target quantized neural network remotely or locally through a wired or wireless connection. The target quantized neural network is generated using the method of any embodiment corresponding to Fig. 2, and is the quantized neural network to be used for information processing. The to-be-processed information may be information that the target quantized neural network is capable of processing, and may include, but is not limited to, at least one of the following: text, image, audio, video. As an example, if the target quantized neural network is a model for face recognition, the to-be-processed information may be a face image. The to-be-processed information may be stored in the execution body in advance, or may be sent to the execution body by another electronic device. The processing result may be the output result of the target quantized neural network.
Step 402: input the to-be-processed information into the target quantized neural network to obtain and output a processing result.
Specifically, the execution body may input the to-be-processed information into the target quantized neural network to obtain the processing result output by the target quantized neural network.
Here, after obtaining the processing result, the execution body may output the processing result. Specifically, the execution body may output the processing result to another communicatively connected electronic device, or may output and display the processing result.
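Step 402 can be sketched as follows; the function name and the stand-in model are purely illustrative assumptions (any callable quantized network would take their place):

```python
def process_information(info, target_quantized_network):
    """Feed the to-be-processed information into the target quantized
    neural network and return the processing result it outputs."""
    return target_quantized_network(info)

# stand-in "network" that labels its input, for demonstration only
result = process_information("face-image", lambda x: {"input": x, "label": 0})
print(result)
```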
In practice, since the storage space of a user terminal is small while a typical neural network consumes a large amount of storage resources, neural networks are usually difficult to apply on user terminals. The method provided by the embodiment of the disclosure uses the quantized neural network generated by any embodiment corresponding to Fig. 2, so that the quantized neural network is suitable for user terminals while helping reduce the consumption of the user terminal's storage resources. Moreover, when the user terminal performs information processing using the quantized neural network, since the complexity of the quantized neural network is low, the efficiency with which the user terminal performs information processing can be improved and the consumption of the user terminal's CPU reduced. Furthermore, since the quantized neural network sent to the user terminal is obtained by adding the quantization constraint during training, its precision loss is smaller than that of a prior-art quantized neural network generated by adding a quantization constraint to an already trained neural network; in turn, the user terminal can achieve more accurate information processing and output using the quantized neural network of the disclosure.
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the disclosure provides an embodiment of an apparatus for generating a quantized neural network. The apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus may be specifically applied in various electronic devices.
As shown in Fig. 5, the apparatus 500 for generating a quantized neural network of this embodiment includes: a first obtaining unit 501, a conversion unit 502, a generation unit 503, and a first execution unit 504. The first obtaining unit 501 is configured to obtain a training sample set and an initial neural network, where a training sample includes sample information and a sample result predetermined for the sample information, the initial neural network includes an original floating-point network parameter, and the original floating-point network parameter is the product of the floating-point weight of a convolutional layer in the initial neural network and the floating-point normalization parameter of the batch normalization layer connected to the convolutional layer. The conversion unit 502 is configured to convert the original floating-point network parameter in the initial neural network into an integer network parameter. The generation unit 503 is configured to generate a quantized initial neural network based on the obtained integer network parameter. The first execution unit 504 is configured to select a training sample from the training sample set and perform the following training steps: using the sample information in the selected training sample as the input of the quantized initial neural network and the sample result in the selected training sample as the desired output of the quantized initial neural network, train the quantized initial neural network; in response to determining that training of the quantized initial neural network is completed, generate a quantized neural network based on the trained quantized initial neural network.
In this embodiment, the first obtaining unit 501 of the apparatus 500 for generating a quantized neural network may obtain the training sample set and the initial neural network remotely or locally through a wired or wireless connection. A training sample in the training sample set includes sample information and a sample result predetermined for the sample information. The sample information is information that the initial neural network can process and may include, but is not limited to, at least one of the following: text, image, audio, video. The initial neural network may be an untrained neural network, or may be a trained neural network.
In this embodiment, the initial neural network includes an original floating-point network parameter. The original floating-point network parameter is the product of the floating-point weight of the convolutional layer in the initial neural network and the floating-point normalization parameter of the batch normalization layer connected to the convolutional layer.
In this embodiment, based on the initial neural network obtained by the first obtaining unit 501, the conversion unit 502 may convert the original floating-point network parameter in the initial neural network into an integer network parameter.
In this embodiment, based on the integer network parameter obtained by the conversion unit 502, the generation unit 503 may generate the quantized initial neural network.
In this embodiment, based on the training sample set obtained by the first obtaining unit 501, the first execution unit 504 may select a training sample from the training sample set and perform the following training steps: using the sample information in the selected training sample as the input of the quantized initial neural network and the sample result in the selected training sample as the desired output of the quantized initial neural network, train the quantized initial neural network; in response to determining that training of the quantized initial neural network is completed, generate the quantized neural network based on the trained quantized initial neural network.
In some optional implementations of this embodiment, the generation unit 503 may be further configured to: convert the obtained integer network parameter into a floating-point network parameter, and determine the initial neural network including the obtained floating-point network parameter as the quantized initial neural network.
In some optional implementations of this embodiment, the conversion unit 502 may include: a conversion module (not shown), configured to convert the floating-point weight corresponding to the original floating-point network parameter into an integer weight, and convert the floating-point normalization parameter corresponding to the original floating-point network parameter into an integer normalization parameter; and a product module (not shown), configured to compute the product of the obtained integer weight and integer normalization parameter to obtain the integer network parameter.
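The conversion module and product module just described can be sketched as follows; round-to-nearest is an assumed quantization rule, and scalar parameters stand in for per-channel tensors:

```python
def conversion_module(float_weight: float, float_norm_param: float):
    """Convert the floating-point weight and the floating-point batch
    normalization parameter into their integer counterparts."""
    return round(float_weight), round(float_norm_param)

def product_module(int_weight: int, int_norm_param: int) -> int:
    """Multiply the integer weight by the integer normalization
    parameter to obtain the integer network parameter."""
    return int_weight * int_norm_param

w, g = conversion_module(2.134, 3.7)  # 2 and 4
param = product_module(w, g)          # 8
print(param)
```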
In some optional implementations of this embodiment, the apparatus 500 may further include: a second execution unit (not shown), configured to, in response to determining that training of the quantized initial neural network is not completed, perform the following steps: select a training sample from the unselected training samples included in the training sample set; adjust the parameters of the quantized initial neural network to obtain new floating-point network parameters; convert the new floating-point network parameters into new integer network parameters, and generate a new quantized initial neural network based on the new integer network parameters; and continue to perform the training steps using the most recently selected training sample and the most recently generated quantized initial neural network.
In some optional implementations of this embodiment, the second execution unit may be further configured to: convert the new integer network parameters into floating-point network parameters, and determine the quantized initial neural network including the obtained floating-point network parameters as the new quantized initial neural network.
In some optional implementations of this embodiment, the apparatus 500 may further include: a sending unit (not shown), configured to send the quantized neural network to a user terminal, so that the user terminal stores the received quantized neural network.
It can be understood that all the units recorded in the apparatus 500 correspond to the steps in the method described with reference to Fig. 2. Accordingly, the operations, features, and beneficial effects described above for the method are equally applicable to the apparatus 500 and the units included therein, and are not repeated here.
In the apparatus 500 provided by the above embodiment of the disclosure, the floating-point weights in the neural network are converted into integer weights during training, thereby adding a quantization constraint to the weights of the neural network. This helps reduce the storage space occupied by the neural network and the CPU consumption when the neural network performs information processing, improving the efficiency of information processing. Moreover, compared with the prior-art scheme of directly quantizing the weights of an already trained neural network to generate a quantized neural network, the scheme of the disclosure can reduce the precision loss caused by quantizing the weights and improve the accuracy of the quantized neural network. In turn, an electronic device performing information processing using the quantized neural network of the disclosure can have a more accurate information processing function than a prior-art electronic device performing information processing using a quantized neural network.
With further reference to Fig. 6, as an implementation of the methods shown in the above figures, the disclosure provides an embodiment of an apparatus for processing information. The apparatus embodiment corresponds to the method embodiment shown in Fig. 4, and the apparatus may be specifically applied in various electronic devices.
As shown in Fig. 6, the apparatus 600 for processing information of this embodiment includes: a second obtaining unit 601 and an input unit 602. The second obtaining unit 601 is configured to obtain to-be-processed information and a target quantized neural network, where the target quantized neural network is generated using the method of any embodiment corresponding to Fig. 2. The input unit 602 is configured to input the to-be-processed information into the target quantized neural network to obtain and output a processing result.
In this embodiment, the second obtaining unit 601 of the apparatus 600 for processing information may obtain the to-be-processed information and the target quantized neural network remotely or locally through a wired or wireless connection. The target quantized neural network is generated using the method of any embodiment corresponding to Fig. 2 and is the quantized neural network to be used for information processing. The to-be-processed information may be information that the target quantized neural network is capable of processing, and may include, but is not limited to, at least one of the following: text, image, audio, video.
In this embodiment, the input unit 602 may input the to-be-processed information into the target quantized neural network to obtain and output the processing result output by the target quantized neural network.
It can be understood that all the units recorded in the apparatus 600 correspond to the steps in the method described with reference to Fig. 4. Accordingly, the operations, features, and beneficial effects described above for the method are equally applicable to the apparatus 600 and the units included therein, and are not repeated here.
Using the quantized neural network generated in any embodiment corresponding to Fig. 2, the apparatus 600 provided by the above embodiment of the disclosure makes the quantized neural network suitable for user terminals while helping reduce the consumption of the user terminal's storage resources. Moreover, when the user terminal performs information processing using the quantized neural network, since the complexity of the quantized neural network is low, the efficiency with which the user terminal performs information processing can be improved and the consumption of the user terminal's CPU reduced. In addition, since the quantized neural network sent to the user terminal is obtained by adding the quantization constraint during training, its precision loss is smaller than that of a prior-art quantized neural network generated by adding a quantization constraint to an already trained neural network; in turn, the user terminal can achieve more accurate information processing and output using the quantized neural network of the disclosure.
Referring now to Fig. 7, a structural schematic diagram of an electronic device 700 (e.g., the terminal device or server shown in Fig. 1) suitable for implementing the embodiments of the disclosure is shown. The terminal device in the embodiments of the disclosure may include, but is not limited to, mobile terminals such as mobile phones, laptops, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and vehicle-mounted terminals (e.g., vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in Fig. 7 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the disclosure.
As shown in Fig. 7, the electronic device 700 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 701, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage device 708 into a random access memory (RAM) 703. The RAM 703 also stores various programs and data required for the operation of the electronic device 700. The processing device 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
In general, the following devices may be connected to the I/O interface 705: an input device 706 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 707 including, for example, a liquid crystal display (LCD), speaker, vibrator, etc.; a storage device 708 including, for example, a magnetic tape, hard disk, etc.; and a communication device 709. The communication device 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. Although Fig. 7 shows the electronic device 700 with various devices, it should be understood that it is not required to implement or have all of the devices shown; more or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 709, installed from the storage device 708, or installed from the ROM 702. When the computer program is executed by the processing device 701, the above-described functions defined in the method of the embodiment of the disclosure are performed.
It should be noted that the computer-readable medium described in the disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the disclosure, the computer-readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus, or device. In the disclosure, the computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. The computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium; the computer-readable signal medium may send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted with any suitable medium, including but not limited to: electric wire, optical cable, RF (radio frequency), etc., or any suitable combination of the above.
The above computer-readable medium may be included in the above electronic device; it may also exist separately without being assembled into the electronic device. The above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: obtain a training sample set and an initial neural network, where a training sample includes sample information and a sample result predetermined for the sample information, the initial neural network includes an original floating-point network parameter, and the original floating-point network parameter is the product of the floating-point weight of a convolutional layer in the initial neural network and the floating-point normalization parameter of the batch normalization layer connected to the convolutional layer; convert the original floating-point network parameter in the initial neural network into an integer network parameter, and generate a quantized initial neural network based on the obtained integer network parameter; select a training sample from the training sample set, and perform the following training steps: using the sample information in the selected training sample as the input of the quantized initial neural network and the sample result in the selected training sample as the desired output of the quantized initial neural network, train the quantized initial neural network; in response to determining that training of the quantized initial neural network is completed, generate a quantized neural network based on the trained quantized initial neural network.
In addition, when the one or more programs are executed by the electronic device, the electronic device may further: obtain to-be-processed information and a target quantized neural network, where the target quantized neural network is generated using the method of any one of the embodiments corresponding to Fig. 2; and input the to-be-processed information into the target quantized neural network to obtain and output a processing result.
Computer program code for carrying out the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should further be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by special-purpose hardware-based systems that perform the specified functions or operations, or by combinations of special-purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. The name of a unit does not, in some cases, constitute a limitation on the unit itself; for example, the first acquisition unit may also be described as "a unit that obtains a training sample set and an initial neural network".
The above description is merely a preferred embodiment of the present disclosure and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the disclosure is not limited to technical solutions formed by the specific combination of the above technical features, but also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the disclosed concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present disclosure.
Claims (16)
1. A method for generating a quantized neural network, comprising:
obtaining a training sample set and an initial neural network, wherein a training sample includes sample information and a sample result predetermined for the sample information, the initial neural network includes an original floating-point network parameter, and the original floating-point network parameter is a product of a floating-point weight of a convolutional layer in the initial neural network and a floating-point normalization parameter of a batch normalization layer connected to the convolutional layer;
converting the original floating-point network parameter in the initial neural network into an integer network parameter;
generating a quantized initial neural network based on the resulting integer network parameter;
selecting a training sample from the training sample set and performing the following training step: using the sample information in the selected training sample as an input of the quantized initial neural network and the sample result in the selected training sample as a desired output of the quantized initial neural network, training the quantized initial neural network; and, in response to determining that training of the quantized initial neural network is completed, generating a quantized neural network based on the trained quantized initial neural network.
2. The method according to claim 1, wherein generating the quantized initial neural network based on the resulting integer network parameter comprises:
converting the resulting integer network parameter into a floating-point network parameter, and determining the initial neural network that includes the resulting floating-point network parameter as the quantized initial neural network.
3. The method according to claim 1, wherein converting the original floating-point network parameter in the initial neural network into the integer network parameter comprises:
converting the floating-point weight corresponding to the original floating-point network parameter into an integer weight, and converting the floating-point normalization parameter corresponding to the original floating-point network parameter into an integer normalization parameter;
multiplying the resulting integer weight and integer normalization parameter to obtain the integer network parameter.
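The conversion recited in claim 3 can be illustrated as follows. This is a sketch under stated assumptions: a per-tensor symmetric integer mapping and the tiny example shapes are not fixed by the claim, and all names are hypothetical.

```python
import numpy as np

def to_int(x, bits=8):
    # Map a float tensor onto a signed integer grid (an assumed scheme;
    # the claim does not prescribe a particular mapping).
    qmax = 2 ** (bits - 1) - 1
    scale = float(np.max(np.abs(x))) / qmax
    return np.round(x / scale).astype(np.int64), scale

# Floating-point weight of a conv layer (2 output channels, flattened) and the
# per-channel floating-point normalization parameter of its batch norm layer.
float_weight = np.array([[0.8, -0.4], [0.2, 0.6]], dtype=np.float32)
float_norm = np.array([1.5, 0.5], dtype=np.float32)

int_weight, w_scale = to_int(float_weight)
int_norm, n_scale = to_int(float_norm)

# Integer network parameter = product of the two integer tensors; the
# combined scale w_scale * n_scale relates it back to the float product.
int_param = int_weight * int_norm[:, None]
recovered = int_param * (w_scale * n_scale)
```

Because both factors live on integer grids, their product is again an integer tensor, and a single combined scale recovers an approximation of the original floating-point network parameter.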
4. The method according to claim 1, further comprising:
in response to determining that training of the quantized initial neural network is not completed, performing the following steps: selecting a training sample from unselected training samples included in the training sample set; adjusting a parameter of the quantized initial neural network to obtain a new floating-point network parameter; converting the new floating-point network parameter into a new integer network parameter, and generating a new quantized initial neural network based on the new integer network parameter; and continuing to perform the training step using the most recently selected training sample and the most recently generated quantized initial neural network.
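The iterative procedure of claim 4 — adjust the floating-point parameter, re-quantize it, rebuild the quantized network, and repeat — can be sketched on a toy linear model. Everything here (the model, the learning rate, the gradient-based adjustment, the integer grid) is an illustrative assumption, not the claimed implementation.

```python
import numpy as np

def quantize_dequantize(w, bits=8):
    # Round-trip through the integer grid: the new floating-point parameter
    # is converted to integers and back to floats to form the new quantized
    # initial network (as in claims 4-5).
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    return np.round(w / scale) * scale

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 2))            # sample information
true_w = np.array([3.0, 0.5])
y = X @ true_w                          # predetermined sample results

w = np.array([1.0, -1.0])               # floating-point parameter being adjusted
lr = 0.1
for _ in range(200):
    wq = quantize_dequantize(w)         # new quantized initial network
    grad = 2.0 * X.T @ (X @ wq - y) / len(X)  # loss gradient at the quantized point
    w -= lr * grad                      # adjust the floating-point parameter
final_w = quantize_dequantize(w)        # quantized parameter once training completes
```

The float parameter is the quantity being optimized, while the loss is always evaluated through its quantized image, so the trained network remains accurate after the final integer conversion.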
5. The method according to claim 4, wherein generating the new quantized initial neural network based on the new integer network parameter comprises:
converting the new integer network parameter into a floating-point network parameter, and determining the quantized initial neural network that includes the resulting floating-point network parameter as the new quantized initial neural network.
6. The method according to any one of claims 1-5, further comprising:
sending the quantized neural network to a user terminal, so that the user terminal stores the received quantized neural network.
7. A method for processing information, comprising:
obtaining to-be-processed information and a target quantized neural network, wherein the target quantized neural network is generated using the method according to any one of claims 1-6;
inputting the to-be-processed information into the target quantized neural network to obtain and output a processing result.
8. An apparatus for generating a quantized neural network, comprising:
a first acquisition unit, configured to obtain a training sample set and an initial neural network, wherein a training sample includes sample information and a sample result predetermined for the sample information, the initial neural network includes an original floating-point network parameter, and the original floating-point network parameter is a product of a floating-point weight of a convolutional layer in the initial neural network and a floating-point normalization parameter of a batch normalization layer connected to the convolutional layer;
a conversion unit, configured to convert the original floating-point network parameter in the initial neural network into an integer network parameter;
a generation unit, configured to generate a quantized initial neural network based on the resulting integer network parameter;
a first execution unit, configured to select a training sample from the training sample set and perform the following training step: using the sample information in the selected training sample as an input of the quantized initial neural network and the sample result in the selected training sample as a desired output of the quantized initial neural network, training the quantized initial neural network; and, in response to determining that training of the quantized initial neural network is completed, generating a quantized neural network based on the trained quantized initial neural network.
9. The apparatus according to claim 8, wherein the generation unit is further configured to:
convert the resulting integer network parameter into a floating-point network parameter, and determine the initial neural network that includes the resulting floating-point network parameter as the quantized initial neural network.
10. The apparatus according to claim 8, wherein the conversion unit comprises:
a conversion module, configured to convert the floating-point weight corresponding to the original floating-point network parameter into an integer weight, and convert the floating-point normalization parameter corresponding to the original floating-point network parameter into an integer normalization parameter;
a multiplication module, configured to multiply the resulting integer weight and integer normalization parameter to obtain the integer network parameter.
11. The apparatus according to claim 8, further comprising:
a second execution unit, configured to, in response to determining that training of the quantized initial neural network is not completed, perform the following steps: selecting a training sample from unselected training samples included in the training sample set; adjusting a parameter of the quantized initial neural network to obtain a new floating-point network parameter; converting the new floating-point network parameter into a new integer network parameter, and generating a new quantized initial neural network based on the new integer network parameter; and continuing to perform the training step using the most recently selected training sample and the most recently generated quantized initial neural network.
12. The apparatus according to claim 11, wherein the second execution unit is further configured to:
convert the new integer network parameter into a floating-point network parameter, and determine the quantized initial neural network that includes the resulting floating-point network parameter as the new quantized initial neural network.
13. The apparatus according to any one of claims 8-12, further comprising:
a sending unit, configured to send the quantized neural network to a user terminal, so that the user terminal stores the received quantized neural network.
14. An apparatus for processing information, comprising:
a second acquisition unit, configured to obtain to-be-processed information and a target quantized neural network, wherein the target quantized neural network is generated using the method according to any one of claims 1-6;
an input unit, configured to input the to-be-processed information into the target quantized neural network to obtain and output a processing result.
15. An electronic device, comprising:
one or more processors; and
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-7.
16. A computer-readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method according to any one of claims 1-7.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910288941.2A CN109961141A (en) | 2019-04-11 | 2019-04-11 | Method and apparatus for generating quantization neural network |
PCT/CN2020/078586 WO2020207174A1 (en) | 2019-04-11 | 2020-03-10 | Method and apparatus for generating quantized neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109961141A true CN109961141A (en) | 2019-07-02 |
Family
ID=67026033
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910288941.2A Pending CN109961141A (en) | 2019-04-11 | 2019-04-11 | Method and apparatus for generating quantization neural network |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109961141A (en) |
WO (1) | WO2020207174A1 (en) |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107590460B (en) * | 2017-09-12 | 2019-05-03 | 北京达佳互联信息技术有限公司 | Face classification method, apparatus and intelligent terminal |
CN108509179B (en) * | 2018-04-04 | 2021-11-30 | 百度在线网络技术(北京)有限公司 | Method for detecting human face and device for generating model |
CN109165736B (en) * | 2018-08-08 | 2023-12-12 | 北京字节跳动网络技术有限公司 | Information processing method and device applied to convolutional neural network |
CN109961141A (en) * | 2019-04-11 | 2019-07-02 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating quantization neural network |
2019
- 2019-04-11 CN CN201910288941.2A patent/CN109961141A/en active Pending

2020
- 2020-03-10 WO PCT/CN2020/078586 patent/WO2020207174A1/en active Application Filing
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020207174A1 (en) * | 2019-04-11 | 2020-10-15 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating quantized neural network |
CN110443165A (en) * | 2019-07-23 | 2019-11-12 | 北京迈格威科技有限公司 | Neural network quantization method, image-recognizing method, device and computer equipment |
CN110443165B (en) * | 2019-07-23 | 2022-04-29 | 北京迈格威科技有限公司 | Neural network quantization method, image recognition method, device and computer equipment |
CN110852421A (en) * | 2019-11-11 | 2020-02-28 | 北京百度网讯科技有限公司 | Model generation method and device |
CN110852421B (en) * | 2019-11-11 | 2023-01-17 | 北京百度网讯科技有限公司 | Model generation method and device |
CN111340226A (en) * | 2020-03-06 | 2020-06-26 | 北京市商汤科技开发有限公司 | Training and testing method, device and equipment for quantitative neural network model |
CN111340226B (en) * | 2020-03-06 | 2022-01-25 | 北京市商汤科技开发有限公司 | Training and testing method, device and equipment for quantitative neural network model |
CN112308226A (en) * | 2020-08-03 | 2021-02-02 | 北京沃东天骏信息技术有限公司 | Quantization of neural network models, method and apparatus for outputting information |
CN112308226B (en) * | 2020-08-03 | 2024-05-24 | 北京沃东天骏信息技术有限公司 | Quantization of neural network model, method and apparatus for outputting information |
CN113011569A (en) * | 2021-04-07 | 2021-06-22 | 开放智能机器(上海)有限公司 | Offline quantitative parameter filling method and device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2020207174A1 (en) | 2020-10-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109961141A (en) | Method and apparatus for generating quantization neural network | |
WO2020155907A1 (en) | Method and apparatus for generating cartoon style conversion model | |
CN109902186A (en) | Method and apparatus for generating neural network | |
CN108416310B (en) | Method and apparatus for generating information | |
CN108197652B (en) | Method and apparatus for generating information | |
CN109858445A (en) | Method and apparatus for generating model | |
CN108022586A (en) | Method and apparatus for controlling the page | |
CN110009101A (en) | Method and apparatus for generating quantization neural network | |
CN109993150A (en) | The method and apparatus at age for identification | |
WO2022121801A1 (en) | Information processing method and apparatus, and electronic device | |
CN110084317A (en) | The method and apparatus of image for identification | |
CN112634928A (en) | Sound signal processing method and device and electronic equipment | |
CN109800730A (en) | The method and apparatus for generating model for generating head portrait | |
CN109165736A (en) | Information processing method and device applied to convolutional neural networks | |
CN110059623A (en) | Method and apparatus for generating information | |
CN109829164A (en) | Method and apparatus for generating text | |
CN109918530A (en) | Method and apparatus for pushing image | |
CN112149699A (en) | Method and device for generating model and method and device for recognizing image | |
CN110046571A (en) | The method and apparatus at age for identification | |
WO2021068493A1 (en) | Method and apparatus for processing information | |
CN112241761A (en) | Model training method and device and electronic equipment | |
CN110288683B (en) | Method and device for generating information | |
CN110008926A (en) | The method and apparatus at age for identification | |
CN111008213A (en) | Method and apparatus for generating language conversion model | |
CN109670579A (en) | Model generating method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190702 |