CN109961141A - Method and apparatus for generating quantization neural network - Google Patents

Info

Publication number
CN109961141A
CN109961141A
Authority
CN
China
Prior art keywords
neural network
quantization
parameter
network
floating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910288941.2A
Other languages
Chinese (zh)
Inventor
刘阳 (Liu Yang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd filed Critical Beijing ByteDance Network Technology Co Ltd
Priority to CN201910288941.2A (patent CN109961141A)
Publication of CN109961141A
Priority to PCT/CN2020/078586 (patent WO2020207174A1)
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

Embodiments of the disclosure disclose a method and apparatus for generating a quantized neural network. A specific embodiment of the method includes: acquiring a training sample set and an initial neural network; converting the original floating-point network parameters in the initial neural network into integer network parameters; generating a quantized initial neural network based on the integer network parameters obtained by the conversion; selecting a training sample from the training sample set and performing the following training step: taking the sample information in the training sample as the input of the quantized initial neural network, taking the sample result in the training sample as the desired output of the quantized initial neural network, and training the quantized initial neural network; and, in response to determining that training of the quantized initial neural network is completed, generating a quantized neural network based on the trained quantized initial neural network. The embodiment helps to reduce the storage space occupied by the neural network and the CPU consumption when the neural network is used for information processing, thereby improving the efficiency of information processing.

Description

Method and apparatus for generating quantization neural network
Technical field
Embodiments of the disclosure relate to the field of computer technology, and more particularly to a method and apparatus for generating a quantized neural network.
Background technique
Currently, to accelerate the training of a neural network, a BN (Batch Normalization) layer is usually connected after a convolutional layer included in the neural network. The BN layer normalizes the output of the convolutional layer before passing it to other layers, thereby improving the convergence speed of the neural network.
The BN layer includes a normalization parameter that is multiplied with the output of the convolutional layer in order to normalize that output. The output of the convolutional layer is obtained by convolving the weight of the convolutional layer with the input of the convolutional layer. In practice, the data type of both the weight and the normalization parameter is usually floating-point.
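Because the BN multiplier is applied to the output of a linear convolution, it can be folded into the convolution weight, which is why the disclosure later treats the product of the two as a single network parameter. A minimal sketch in Python with NumPy; the function and variable names are assumptions for illustration, not taken from the patent:

```python
import numpy as np

def fold_bn_into_conv(weight: np.ndarray, gamma: np.ndarray) -> np.ndarray:
    """Fold a per-channel BN scale (gamma) into conv weights.

    gamma * (weight (*) x) == (gamma * weight) (*) x,
    so the product gamma * weight is the single parameter
    that can later be converted to integer form.
    """
    # weight: (out_channels, in_channels, kh, kw); gamma: (out_channels,)
    return weight * gamma[:, None, None, None]

w = np.ones((2, 3, 3, 3), dtype=np.float32)
g = np.array([0.5, 2.0], dtype=np.float32)
folded = fold_bn_into_conv(w, g)
assert folded[0].max() == 0.5 and folded[1].min() == 2.0
```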
Summary of the invention
Embodiments of the disclosure propose a method and apparatus for generating a quantized neural network.
In a first aspect, embodiments of the disclosure provide a method for generating a quantized neural network, the method including: acquiring a training sample set and an initial neural network, where a training sample includes sample information and a sample result predetermined for the sample information, the initial neural network includes original floating-point network parameters, and an original floating-point network parameter is the product of a floating-point weight of a convolutional layer in the initial neural network and a floating-point normalization parameter of a batch normalization layer connected to the convolutional layer; converting the original floating-point network parameters in the initial neural network into integer network parameters; generating a quantized initial neural network based on the integer network parameters obtained by the conversion; and selecting a training sample from the training sample set and performing the following training step: taking the sample information in the selected training sample as the input of the quantized initial neural network, taking the sample result in the selected training sample as the desired output of the quantized initial neural network, and training the quantized initial neural network; and, in response to determining that training of the quantized initial neural network is completed, generating a quantized neural network based on the trained quantized initial neural network.
In some embodiments, generating the quantized initial neural network based on the integer network parameters obtained by the conversion includes: converting the integer network parameters obtained by the conversion back into floating-point network parameters, and determining the initial neural network including the floating-point network parameters obtained by the conversion as the quantized initial neural network.
In some embodiments, converting the original floating-point network parameters in the initial neural network into integer network parameters includes: converting the floating-point weight corresponding to an original floating-point network parameter into an integer weight, and converting the floating-point normalization parameter corresponding to the original floating-point network parameter into an integer normalization parameter; and multiplying the integer weight and the integer normalization parameter obtained by the conversion to obtain an integer network parameter.
In some embodiments, the method further includes: in response to determining that training of the quantized initial neural network is not completed, performing the following steps: selecting a training sample from the unselected training samples included in the training sample set; adjusting the parameters of the quantized initial neural network to obtain new floating-point network parameters; converting the new floating-point network parameters into new integer network parameters, and generating a new quantized initial neural network based on the new integer network parameters; and continuing to perform the training step using the most recently selected training sample and the newly generated quantized initial neural network.
In some embodiments, generating a new quantized initial neural network based on the new integer network parameters includes: converting the new integer network parameters into floating-point network parameters, and determining the quantized initial neural network including the floating-point network parameters obtained by the conversion as the new quantized initial neural network.
In some embodiments, the method further includes: sending the quantized neural network to a user terminal, so that the user terminal stores the received quantized neural network.
In a second aspect, embodiments of the disclosure provide a method for processing information, the method including: acquiring information to be processed and a target quantized neural network, where the target quantized neural network is generated using the method of any embodiment of the first aspect; and inputting the information to be processed into the target quantized neural network to obtain and output a processing result.
In a third aspect, embodiments of the disclosure provide an apparatus for generating a quantized neural network, the apparatus including: a first acquisition unit configured to acquire a training sample set and an initial neural network, where a training sample includes sample information and a sample result predetermined for the sample information, the initial neural network includes original floating-point network parameters, and an original floating-point network parameter is the product of a floating-point weight of a convolutional layer in the initial neural network and a floating-point normalization parameter of a batch normalization layer connected to the convolutional layer; a conversion unit configured to convert the original floating-point network parameters in the initial neural network into integer network parameters; a generation unit configured to generate a quantized initial neural network based on the integer network parameters obtained by the conversion; and a first execution unit configured to select a training sample from the training sample set and perform the following training step: taking the sample information in the selected training sample as the input of the quantized initial neural network, taking the sample result in the selected training sample as the desired output of the quantized initial neural network, and training the quantized initial neural network; and, in response to determining that training of the quantized initial neural network is completed, generating a quantized neural network based on the trained quantized initial neural network.
In some embodiments, the generation unit is further configured to: convert the integer network parameters obtained by the conversion back into floating-point network parameters, and determine the initial neural network including the floating-point network parameters obtained by the conversion as the quantized initial neural network.
In some embodiments, the conversion unit includes: a conversion module configured to convert the floating-point weight corresponding to an original floating-point network parameter into an integer weight, and convert the floating-point normalization parameter corresponding to the original floating-point network parameter into an integer normalization parameter; and a multiplication module configured to multiply the integer weight and the integer normalization parameter obtained by the conversion to obtain an integer network parameter.
In some embodiments, the apparatus further includes: a second execution unit configured to, in response to determining that training of the quantized initial neural network is not completed, perform the following steps: selecting a training sample from the unselected training samples included in the training sample set; adjusting the parameters of the quantized initial neural network to obtain new floating-point network parameters; converting the new floating-point network parameters into new integer network parameters, and generating a new quantized initial neural network based on the new integer network parameters; and continuing to perform the training step using the most recently selected training sample and the newly generated quantized initial neural network.
In some embodiments, the second execution unit is further configured to: convert the new integer network parameters into floating-point network parameters, and determine the quantized initial neural network including the floating-point network parameters obtained by the conversion as the new quantized initial neural network.
In some embodiments, the apparatus further includes: a sending unit configured to send the quantized neural network to a user terminal, so that the user terminal stores the received quantized neural network.
In a fourth aspect, embodiments of the disclosure provide an apparatus for processing information, the apparatus including: a second acquisition unit configured to acquire information to be processed and a target quantized neural network, where the target quantized neural network is generated using the method of any embodiment of the first aspect; and an input unit configured to input the information to be processed into the target quantized neural network to obtain and output a processing result.
In a fifth aspect, embodiments of the disclosure provide an electronic device, including: one or more processors; and a storage device on which one or more programs are stored, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any embodiment of the first or second aspect.
In a sixth aspect, embodiments of the disclosure provide a computer-readable medium on which a computer program is stored, where the program, when executed by a processor, implements the method of any embodiment of the first or second aspect.
The method and apparatus for generating a quantized neural network provided by embodiments of the disclosure acquire a training sample set and an initial neural network, where the initial neural network includes original floating-point network parameters; then convert the original floating-point network parameters in the initial neural network into integer network parameters and generate a quantized initial neural network based on the integer network parameters obtained by the conversion; and finally select a training sample from the training sample set and perform the following training step: taking the sample information in the selected training sample as the input of the quantized initial neural network, taking the sample result in the selected training sample as the desired output of the quantized initial neural network, and training the quantized initial neural network; and, in response to determining that training of the quantized initial neural network is completed, generating a quantized neural network based on the trained quantized initial neural network. Thus, during the training of the neural network, the floating-point network parameters in the neural network are converted into integer network parameters, which adds a quantization constraint to the network parameters of the neural network. This helps to reduce the storage space occupied by the neural network and the CPU consumption when the neural network is used for information processing, improving the efficiency of information processing. Moreover, compared with the prior art, in which the network parameters of an already-trained neural network are quantized directly to generate a quantized neural network, the scheme of the disclosure can reduce the precision loss caused by quantizing the network parameters and improve the accuracy of the quantized neural network; in turn, an electronic device that processes information using the quantized neural network of the disclosure can provide a more accurate information processing function than an electronic device that processes information using a prior-art quantized neural network.
Detailed description of the invention
Other features, objects and advantages of the disclosure will become more apparent by reading the following detailed description of non-restrictive embodiments with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture diagram to which an embodiment of the disclosure may be applied;
Fig. 2 is a flowchart of an embodiment of the method for generating a quantized neural network according to the disclosure;
Fig. 3 is a schematic diagram of an application scenario of the method for generating a quantized neural network according to an embodiment of the disclosure;
Fig. 4 is a flowchart of an embodiment of the method for processing information according to the disclosure;
Fig. 5 is a structural schematic diagram of an embodiment of the apparatus for generating a quantized neural network according to the disclosure;
Fig. 6 is a structural schematic diagram of an embodiment of the apparatus for processing information according to the disclosure;
Fig. 7 is a structural schematic diagram of a computer system suitable for implementing the electronic device of an embodiment of the disclosure.
Specific embodiment
The disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It is to be understood that the specific embodiments described here are used only to explain the related invention, rather than to limit the invention. It should also be noted that, for convenience of description, only the parts relevant to the related invention are shown in the drawings.
It should be noted that, in the absence of conflict, the embodiments in the disclosure and the features in the embodiments may be combined with each other. The disclosure is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 in which embodiments of the method for generating a quantized neural network, the apparatus for generating a quantized neural network, the method for processing information, or the apparatus for processing information of the disclosure may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102 and 103, a network 104 and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102 and 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
A user may use the terminal devices 101, 102 and 103 to interact with the server 105 through the network 104 to receive or send messages. Various communication client applications, such as web browser applications, shopping applications, search applications, instant messaging tools, email clients and social platform software, may be installed on the terminal devices 101, 102 and 103.
The terminal devices 101, 102 and 103 may be hardware or software. When the terminal devices 101, 102 and 103 are hardware, they may be various electronic devices, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop portable computers and desktop computers. When the terminal devices 101, 102 and 103 are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, for providing distributed services) or as a single piece of software or software module, which is not specifically limited here.
The server 105 may be a server providing various services, for example a model processing server that processes the initial neural network sent by the terminal devices 101, 102 and 103. The model processing server may analyze and otherwise process data such as the received initial neural network, and feed the processing result (for example, a quantized neural network) back to the terminal devices.
It should be noted that the method for generating a quantized neural network provided by embodiments of the disclosure is generally performed by the server 105; correspondingly, the apparatus for generating a quantized neural network is generally provided in the server 105. In addition, the method for processing information provided by embodiments of the disclosure is generally performed by the terminal devices 101, 102 and 103; correspondingly, the apparatus for processing information is generally provided in the terminal devices 101, 102 and 103.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, for providing distributed services) or as a single piece of software or software module, which is not specifically limited here.
It should be understood that the numbers of terminal devices, networks and servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks and servers according to implementation needs.
With continued reference to Fig. 2, a flow 200 of an embodiment of the method for generating a quantized neural network according to the disclosure is shown. The method for generating a quantized neural network includes the following steps:
Step 201: acquire a training sample set and an initial neural network.
In this embodiment, the executing body of the method for generating a quantized neural network (for example, the server shown in Fig. 1) may acquire the training sample set and the initial neural network remotely or locally through a wired or wireless connection. A training sample in the training sample set includes sample information and a sample result predetermined for the sample information. The sample information may be the information that the initial neural network is to process, and may include but is not limited to at least one of the following: text, image, audio and video. For example, if the initial neural network is a neural network for face recognition, the sample information may be a sample face image, and the sample result is the expected result of processing the sample information with the initial neural network (for example, gender information characterizing the gender of the person in the sample face image).
The initial neural network may be an untrained neural network or a trained neural network. The function of the initial neural network, in other words its input and output, may be predetermined. In turn, the executing body may acquire the training sample set used for training the initial neural network.
In this embodiment, the initial neural network includes original floating-point network parameters. An original floating-point network parameter is the product of a floating-point weight of a convolutional layer in the initial neural network and a floating-point normalization parameter of the batch normalization layer connected to the convolutional layer.
Specifically, the initial neural network includes a convolutional layer and a batch normalization layer. The convolutional layer includes a floating-point weight, which may be convolved with the input of the convolutional layer to obtain the output of the convolutional layer. The batch normalization layer may be connected to the convolutional layer and used to normalize the output of the convolutional layer. Specifically, the batch normalization layer includes a floating-point normalization parameter that is multiplied with the output of the convolutional layer to normalize that output. Here, the convolutional layer and the batch normalization layer may be regarded as one network structure, whose output is determined by the product of the output of the convolutional layer and the floating-point normalization parameter. Since the output of the convolutional layer is determined by the convolution of the floating-point weight and the input of the convolutional layer, the output of this network structure may be determined by multiplying the floating-point normalization parameter and the floating-point weight, and convolving the product with the input of the convolutional layer. It can be understood that the input of the convolutional layer is the input variable of this network structure, and the product of the floating-point normalization parameter and the floating-point weight is the parameter of this network structure. In turn, in this embodiment, the product of the floating-point normalization parameter and the floating-point weight may be determined as a floating-point network parameter, and the original floating-point network parameters are the network parameters, included in the initial neural network, that are to be quantized.
In practice, quantizing floating-point data refers to converting it into integer data within a certain value range. Here, the value range is limited by the number of bits of the integer data. For example, if the integer data to be converted to is 8 bits (8 bit), the value range is (0, 255). It should be noted that, in this embodiment, the number of bits of the integer network parameters into which the original floating-point network parameters are to be quantized may be predetermined by a technician.
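The bit-width-to-range relationship described above can be sketched as follows (Python; unsigned integers, matching the 8-bit example in the text):

```python
def quant_range(num_bits: int) -> tuple[int, int]:
    """Value range representable by an unsigned integer of num_bits bits."""
    return 0, (1 << num_bits) - 1

assert quant_range(8) == (0, 255)  # the 8-bit case from the text
assert quant_range(4) == (0, 15)
```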
It can be understood that, compared with integer data of the same number of bits, floating-point data has higher precision because it can record data information after the decimal point. Integer data, which does not record information after the decimal point, occupies less storage space, and calculations using integer data are faster.
It should be noted that, in order to obtain higher precision, the weights and normalization parameters in prior-art neural networks are typically stored as floating-point.
Step 202: convert the original floating-point network parameters in the initial neural network into integer network parameters.
In this embodiment, based on the initial neural network acquired in step 201, the executing body may convert the original floating-point network parameters in the initial neural network into integer network parameters.
Specifically, the executing body may first determine the number of bits of the integer network parameters to be converted to, and then convert the original floating-point network parameters in the initial neural network into integer network parameters using any of various existing methods. It can be understood that converting the original floating-point network parameters in the initial neural network into integer network parameters is equivalent to adding a quantization constraint to the initial neural network.
As an example, an original floating-point network parameter includes the value "21.323", and the number of bits of the integer weight to be converted to is predetermined to be eight, so the value range of the integer network parameter is determined to be (0, 255); the value "21.323" in the original floating-point network parameter may then be directly converted into the integer network parameter "21" by rounding.
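The rounding step in this example can be sketched as follows (Python; clamping out-of-range values into the bit-width range is an added safeguard, not stated in the text):

```python
def quantize(value: float, num_bits: int = 8) -> int:
    """Round a floating-point parameter to the nearest integer in [0, 2^n - 1]."""
    lo, hi = 0, (1 << num_bits) - 1
    q = int(round(value))
    return max(lo, min(hi, q))  # clamp into the representable range

assert quantize(21.323) == 21   # the example from the text
assert quantize(300.0) == 255   # out-of-range values are clamped (assumption)
```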
In some optional implementations of this embodiment, the executing body may convert the original floating-point network parameters in the initial neural network into integer network parameters through the following steps: first, the executing body may convert the floating-point weight corresponding to an original floating-point network parameter into an integer weight, and convert the floating-point normalization parameter corresponding to the original floating-point network parameter into an integer normalization parameter; then, the executing body may multiply the integer weight and the integer normalization parameter obtained by the conversion to obtain the integer network parameter. This implementation first adds a quantization constraint to the floating-point weight and the floating-point normalization parameter, and then uses the quantization-constrained integer weight and integer normalization parameter to obtain the integer network parameter. In this way, the precision loss of the quantized integer network parameter can be reduced, which helps to improve the accuracy of the quantized initial neural network.
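The two orders of operation can be contrasted in a small sketch (Python; the rounding rule and example values are assumptions). The point is only that quantizing the factors first and then multiplying, as in the optional implementation above, can yield a different integer than quantizing the product directly:

```python
def q(x: float) -> int:
    """Round to nearest integer (rounding rule is an assumption)."""
    return int(round(x))

weight, gamma = 1.4, 1.4  # floating-point weight and normalization parameter

product_then_quantize = q(weight * gamma)     # q(1.96) -> 2
quantize_then_product = q(weight) * q(gamma)  # 1 * 1  -> 1

# The two orders can yield different integers; the optional implementation
# quantizes the factors first and then multiplies.
assert product_then_quantize == 2
assert quantize_then_product == 1
```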
Step 203: generate a quantized initial neural network based on the integer network parameters obtained by the conversion.
In this embodiment, based on the integer network parameters obtained by the conversion in step 202, the executing body may generate a quantized initial neural network.
Specifically, the executing body may directly determine the initial neural network including the integer network parameters obtained by the conversion as the quantized neural network; alternatively, the executing body may process the initial neural network including the integer network parameters obtained by the conversion, and determine the processed initial neural network as the quantized initial neural network.
In some optional implementations of this embodiment, the executing body may generate the quantized initial neural network through the following steps: the executing body may convert the integer network parameters obtained by the conversion back into floating-point network parameters, and determine the initial neural network including the floating-point network parameters obtained by the conversion as the quantized initial neural network.
Here, converting an integer network parameter into a floating-point network parameter is the inverse of the above conversion of an original floating-point network parameter into an integer network parameter; the integer network parameter obtained by the conversion may be converted, with reference to the step of converting the original floating-point network parameter into the integer network parameter, to obtain the floating-point network parameter.
Continuing the example above, the integer network parameter obtained by the conversion is "21". From the original floating-point network parameter "21.323", it can be seen that the floating-point network parameter is accurate to three decimal places, so the integer network parameter "21" may be converted here into the floating-point network parameter "21.000".
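The quantize-then-dequantize round trip in this running example can be sketched as follows (Python; the function names and the decimal-precision convention are assumptions):

```python
def quantize(value: float) -> int:
    """Round a floating-point parameter to the nearest integer."""
    return int(round(value))

def dequantize(q: int, decimals: int = 3) -> float:
    """Represent the integer parameter as a float at the original precision."""
    return round(float(q), decimals)

original = 21.323
q = quantize(original)    # 21
restored = dequantize(q)  # 21.000 -- the quantization constraint remains

assert q == 21
assert restored == 21.0
```

Note that the restored value carries the quantization constraint: it is an exact integer expressed in floating-point form, which is what lets the subsequent floating-point training proceed while the parameters stay near the quantization grid.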
It should be noted that real-coded GA can have higher precision compared to integer type data.So being initial Neural network addition quantifies after constraining, then converts floating type network parameter for integer type network parameter, facilitates subsequent right In the training process of initial neural network, training precision is improved, more accurate training result is obtained.
Step 204: select a training sample from the training sample set and perform the following training steps: use the sample information in the selected training sample as the input of the quantized initial neural network, use the sample result in the selected training sample as the expected output of the quantized initial neural network, and train the quantized initial neural network; in response to determining that training of the quantized initial neural network is complete, generate the quantized neural network based on the trained quantized initial neural network.
In this embodiment, based on the training sample set obtained in step 201, the above-mentioned execution body may select a training sample from the training sample set and perform the following training steps:
Step 2041: use the sample information in the selected training sample as the input of the quantized initial neural network, use the sample result in the selected training sample as the expected output of the quantized initial neural network, and train the quantized initial neural network.
Here, the above-mentioned execution body may train the quantized initial neural network using a machine learning method. Specifically, the execution body inputs the sample information into the quantized initial neural network to obtain an actual result, and then computes the difference between the obtained actual result and the sample result in the training sample using a preset loss function; for example, the L2 norm may be used as the loss function to compute the difference between the obtained actual result and the sample result in the training sample.
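As a concrete illustration of the loss computation just described, the L2 norm of the difference between the actual result and the sample result might be computed as below. This is a sketch; treating both results as equal-length numeric vectors is an assumption:

```python
import math

def l2_difference(actual, expected):
    """L2 norm of the element-wise difference between the actual result
    produced by the quantized initial neural network and the sample result."""
    return math.sqrt(sum((a - e) ** 2 for a, e in zip(actual, expected)))

# e.g. an error vector of (3, -4) has L2 norm 5
diff = l2_difference([3.0, 0.0], [0.0, 4.0])
```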
Step 2042: in response to determining that training of the quantized initial neural network is complete, generate the quantized neural network based on the trained quantized initial neural network.
Specifically, the above-mentioned execution body may determine whether the current training of the quantized initial neural network satisfies a preset completion condition; if so, it may determine that training of the quantized initial neural network is complete. The completion condition may include, but is not limited to, at least one of the following: the training time exceeds a preset duration; the number of training iterations exceeds a preset count; the computed difference is less than a preset difference threshold.
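The completion condition described above can be checked as in the following sketch; the threshold values are illustrative placeholders, not values from the disclosure:

```python
def training_complete(elapsed_seconds, iterations, difference,
                      max_seconds=3600.0, max_iterations=10000,
                      difference_threshold=1e-3):
    """Training is complete when at least one preset condition holds:
    the training time exceeds a preset duration, the iteration count
    exceeds a preset count, or the computed difference between actual
    result and sample result falls below a preset threshold."""
    return (elapsed_seconds > max_seconds
            or iterations > max_iterations
            or difference < difference_threshold)
```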
In this embodiment, the above-mentioned execution body may, in response to determining that training is complete, generate the quantized neural network based on the trained quantized initial neural network. Here, the quantized neural network is a trained neural network whose network parameters are integer network parameters.
Specifically, in response to determining that the network parameters in the trained quantized initial neural network are integer network parameters, the above-mentioned execution body may directly determine the trained quantized initial neural network as the quantized neural network; in response to determining that the network parameters in the trained quantized initial neural network are floating-point network parameters, the execution body may convert those floating-point network parameters into integer network parameters, and then determine the trained quantized initial neural network containing the converted integer network parameters as the quantized neural network.
In this embodiment, the above-mentioned execution body may also, in response to determining that training of the quantized initial neural network is not complete, perform the following steps: select a training sample from the unselected training samples in the training sample set; adjust the parameters of the quantized initial neural network to obtain new floating-point network parameters; convert the new floating-point network parameters into new integer network parameters and generate a new quantized initial neural network based on the new integer network parameters; and continue the above training steps (steps 2041-2042) using the most recently selected training sample and the most recently generated quantized initial neural network.
Here, various implementations may be used to adjust the parameters of the quantized initial neural network based on the difference between the computed actual result and the sample result in the training sample. For example, the BP (Back Propagation) algorithm and the SGD (Stochastic Gradient Descent) algorithm may be used to adjust the parameters of the quantized initial neural network. It should be noted that, when adjusting the parameters, in order not to affect training convergence and to ensure that training can be performed repeatedly, the parameters are usually adjusted as floating-point values. The quantized initial neural network therefore obtains new floating-point network parameters after the adjustment; the above-mentioned execution body may then convert the new floating-point network parameters into new integer network parameters, so as to again add the quantization constraint to the quantized initial neural network containing the new floating-point network parameters, generating a new quantized initial neural network.
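One iteration of the adjust-then-requantize cycle described above might look like the following sketch: a plain SGD update on a flat parameter list, with the learning rate and gradient handling as illustrative assumptions:

```python
def adjust_and_requantize(float_params, grads, lr=0.01):
    """Adjust the parameters in floating point (so that training can
    converge and be repeated), then round them back to integers to
    re-impose the quantization constraint for the next iteration."""
    new_float = [p - lr * g for p, g in zip(float_params, grads)]
    new_int = [round(p) for p in new_float]
    return new_float, new_int

# e.g. one SGD step on the example parameter 21.323 with gradient 2.0
floats, ints = adjust_and_requantize([21.323], [2.0], lr=0.1)
```

The design point is that two copies of each parameter exist during training: the floating-point copy accumulates the small gradient updates, while the integer copy is what the quantized network actually carries.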
Specifically, the above-mentioned execution body may generate the new quantized initial neural network in various ways based on the new integer network parameters. For example, the quantized initial neural network containing the new integer network parameters may be directly determined as the new quantized initial neural network. Alternatively, the execution body may process the quantized initial neural network containing the new integer network parameters and determine the processed quantized initial neural network as the new quantized initial neural network.
In some optional implementations of this embodiment, the above-mentioned execution body may generate the new quantized initial neural network as follows: the execution body may convert the new integer network parameters into floating-point network parameters, and determine the quantized initial neural network containing the resulting floating-point network parameters as the new quantized initial neural network.
In some optional implementations of this embodiment, the above-mentioned execution body may send the quantized neural network to a user terminal so that the user terminal stores the received quantized neural network. Here, a quantized neural network to which the quantization constraint has been added occupies less storage space, so this implementation can save storage resources on the user terminal.
With continued reference to Fig. 3, Fig. 3 is a schematic diagram of one application scenario of the method for generating a neural network according to this embodiment. In the application scenario of Fig. 3, a server 301 first obtains a training sample set 302 and an initial neural network 303, where each training sample in the training sample set 302 includes sample information and a sample result predetermined for that sample information. The initial neural network 303 includes an original floating-point network parameter 304 (e.g. "2.134"), which is the product of the floating-point weight of a convolutional layer in the initial neural network 303 and the floating-point normalization parameter of the batch normalization layer connected to that convolutional layer. The server 301 then converts the original floating-point network parameter 304 in the initial neural network 303 into an integer network parameter 305 (e.g. "2"). Next, the server 301 generates a quantized initial neural network 306 based on the converted integer network parameter 305. Finally, the server 301 may select a training sample 3021 from the training sample set 302 and perform the following training steps: use the sample information 30211 in the selected training sample 3021 as the input of the quantized initial neural network 306, use the sample result 30212 in the selected training sample 3021 as the expected output of the quantized initial neural network 306, and train the quantized initial neural network 306; in response to determining that training of the quantized initial neural network 306 is complete, generate a quantized neural network 307 based on the trained quantized initial neural network 306.
In the method provided by the above embodiment of the present disclosure, the floating-point network parameters of the neural network are converted into integer network parameters during training, thereby adding a quantization constraint to the network parameters of the neural network. This helps reduce the storage space occupied by the neural network and the CPU consumption when the neural network is used for information processing, improving the efficiency of information processing. Moreover, compared with the prior-art approach of generating a quantized neural network by directly quantizing the network parameters of a trained neural network, the scheme of the present disclosure can reduce the precision loss caused by quantizing the network parameters and improve the accuracy of the quantized neural network. In turn, an electronic device that performs information processing using the quantized neural network of the present disclosure can have a more accurate information processing function than a prior-art electronic device that performs information processing using a quantized neural network.
With further reference to Fig. 4, Fig. 4 illustrates a flow 400 of one embodiment of a method for processing information. The flow 400 of the method for processing information includes the following steps:
Step 401: obtain information to be processed and a target quantized neural network.
In this embodiment, the execution body of the method for processing information (e.g. the terminal device shown in Fig. 1) may obtain the information to be processed and the target quantized neural network from a remote or local source via a wired or wireless connection. Here, the target quantized neural network is generated using the method of any embodiment among the embodiments corresponding to Fig. 2, and is the quantized neural network to be used for information processing. The information to be processed may be any information that the target quantized neural network is capable of processing, and may include, but is not limited to, at least one of the following: text, image, audio, video. As an example, if the target quantized neural network is a model for face recognition, the information to be processed may be a face image. The information to be processed may be pre-stored in the above-mentioned execution body, or may be sent to it by another electronic device. The processing result may be the output result of the target quantized neural network.
Step 402: input the information to be processed into the target quantized neural network, obtain a processing result, and output it.
Specifically, the above-mentioned execution body may input the information to be processed into the target quantized neural network to obtain the processing result output by the target quantized neural network.
Here, after obtaining the processing result, the execution body may output it. Specifically, the execution body may output the processing result to another electronic device with which it communicates, or may output and display the processing result.
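Under the assumption that the target quantized neural network is exposed as a callable, the step just described reduces to a single call. The lambda below is an illustrative stand-in for a real quantized network, not part of the disclosure:

```python
def process(info, target_quantized_network):
    """Feed the information to be processed into the target quantized
    neural network and return its output as the processing result."""
    return target_quantized_network(info)

# illustrative stand-in network that doubles each input value
result = process([1.0, 2.0], lambda x: [2 * v for v in x])
```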
In practice, because the storage space of a user terminal is relatively small while a typical neural network consumes a large amount of storage resources, neural networks are usually difficult to apply on user terminals. The method provided by the embodiments of the present disclosure uses a quantized neural network generated by any embodiment corresponding to Fig. 2, so that the quantized neural network is suitable for the user terminal while helping reduce the consumption of the user terminal's storage resources. Moreover, when the user terminal performs information processing using the quantized neural network, the low complexity of the quantized neural network can improve the efficiency with which the user terminal processes information and reduce the consumption of the user terminal's CPU. In addition, because the quantized neural network sent to the user terminal is obtained by adding the quantization constraint during training, its precision loss is smaller than that of a prior-art quantized neural network generated by adding a quantization constraint to a trained neural network; the user terminal can therefore achieve more accurate information processing and output using the quantized neural network of the present disclosure.
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the present disclosure provides one embodiment of an apparatus for generating a quantized neural network. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus may be applied to various electronic devices.
As shown in Fig. 5, the apparatus 500 for generating a quantized neural network of this embodiment includes: a first acquisition unit 501, a conversion unit 502, a generation unit 503 and a first execution unit 504. The first acquisition unit 501 is configured to obtain a training sample set and an initial neural network, where each training sample includes sample information and a sample result predetermined for that sample information, and the initial neural network includes an original floating-point network parameter which is the product of the floating-point weight of a convolutional layer in the initial neural network and the floating-point normalization parameter of the batch normalization layer connected to the convolutional layer. The conversion unit 502 is configured to convert the original floating-point network parameter in the initial neural network into an integer network parameter. The generation unit 503 is configured to generate a quantized initial neural network based on the converted integer network parameter. The first execution unit 504 is configured to select a training sample from the training sample set and perform the following training steps: use the sample information in the selected training sample as the input of the quantized initial neural network, use the sample result in the selected training sample as the expected output of the quantized initial neural network, and train the quantized initial neural network; in response to determining that training of the quantized initial neural network is complete, generate a quantized neural network based on the trained quantized initial neural network.
In this embodiment, the first acquisition unit 501 of the apparatus 500 for generating a quantized neural network may obtain the training sample set and the initial neural network from a remote or local source via a wired or wireless connection. Each training sample in the training sample set includes sample information and a sample result predetermined for that sample information. The sample information is information that the initial neural network can process and may include, but is not limited to, at least one of the following: text, image, audio, video. The initial neural network may be an untrained neural network or a trained neural network.
In this embodiment, the initial neural network includes an original floating-point network parameter, which is the product of the floating-point weight of a convolutional layer in the initial neural network and the floating-point normalization parameter of the batch normalization layer connected to the convolutional layer.
In this embodiment, based on the initial neural network obtained by the first acquisition unit 501, the conversion unit 502 may convert the original floating-point network parameter in the initial neural network into an integer network parameter.
In this embodiment, based on the integer network parameter converted by the conversion unit 502, the generation unit 503 may generate a quantized initial neural network.
In this embodiment, based on the training sample set obtained by the first acquisition unit 501, the first execution unit 504 may select a training sample from the training sample set and perform the following training steps: use the sample information in the selected training sample as the input of the quantized initial neural network, use the sample result in the selected training sample as the expected output of the quantized initial neural network, and train the quantized initial neural network; in response to determining that training of the quantized initial neural network is complete, generate a quantized neural network based on the trained quantized initial neural network.
In some optional implementations of this embodiment, the generation unit 503 may be further configured to: convert the converted integer network parameters into floating-point network parameters, and determine the initial neural network containing the resulting floating-point network parameters as the quantized initial neural network.
In some optional implementations of this embodiment, the conversion unit 502 may include: a conversion module (not shown in the figure), configured to convert the floating-point weight corresponding to the original floating-point network parameter into an integer weight and to convert the floating-point normalization parameter corresponding to the original floating-point network parameter into an integer normalization parameter; and a product module (not shown in the figure), configured to multiply the converted integer weight and integer normalization parameter to obtain the integer network parameter.
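The conversion module and product module just described can be sketched as follows. This is illustrative only; treating the convolution weight and batch-norm normalization parameter as single scalars is a simplification:

```python
def integer_network_parameter(float_weight, float_norm_param):
    """Conversion module: round the convolutional layer's floating-point
    weight and the batch normalization layer's floating-point
    normalization parameter to integers.
    Product module: multiply the two integers to obtain the
    integer network parameter."""
    int_weight = round(float_weight)
    int_norm_param = round(float_norm_param)
    return int_weight * int_norm_param

# e.g. weight 2.134 and normalization parameter 1.2 give 2 * 1 = 2
param = integer_network_parameter(2.134, 1.2)
```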
In some optional implementations of this embodiment, the apparatus 500 may further include: a second execution unit (not shown in the figure), configured to, in response to determining that training of the quantized initial neural network is not complete, perform the following steps: select a training sample from the unselected training samples in the training sample set; adjust the parameters of the quantized initial neural network to obtain new floating-point network parameters; convert the new floating-point network parameters into new integer network parameters and generate a new quantized initial neural network based on the new integer network parameters; and continue the training steps using the most recently selected training sample and the most recently generated quantized initial neural network.
In some optional implementations of this embodiment, the second execution unit may be further configured to: convert the new integer network parameters into floating-point network parameters, and determine the quantized initial neural network containing the resulting floating-point network parameters as the new quantized initial neural network.
In some optional implementations of this embodiment, the apparatus 500 may further include: a sending unit (not shown in the figure), configured to send the quantized neural network to a user terminal so that the user terminal stores the received quantized neural network.
It can be understood that the units recorded in the apparatus 500 correspond to the steps of the method described with reference to Fig. 2. Therefore, the operations, features and beneficial effects described above for the method also apply to the apparatus 500 and the units included therein, and are not repeated here.
In the apparatus 500 provided by the above embodiment of the present disclosure, the floating-point weights of the neural network are converted into integer weights during training, thereby adding a quantization constraint to the weights of the neural network. This helps reduce the storage space occupied by the neural network and the CPU consumption when the neural network is used for information processing, improving the efficiency of information processing. Moreover, compared with the prior-art approach of generating a quantized neural network by directly quantizing the weights of a trained neural network, the scheme of the present disclosure can reduce the precision loss caused by quantizing the weights and improve the accuracy of the quantized neural network. In turn, an electronic device that performs information processing using the quantized neural network of the present disclosure can have a more accurate information processing function than a prior-art electronic device that performs information processing using a quantized neural network.
With further reference to Fig. 6, as an implementation of the methods shown in the above figures, the present disclosure provides one embodiment of an apparatus for processing information. This apparatus embodiment corresponds to the method embodiment shown in Fig. 4, and the apparatus may be applied to various electronic devices.
As shown in Fig. 6, the apparatus 600 for processing information of this embodiment includes: a second acquisition unit 601 and an input unit 602. The second acquisition unit 601 is configured to obtain information to be processed and a target quantized neural network, where the target quantized neural network is generated using the method of any embodiment among the embodiments corresponding to Fig. 2. The input unit 602 is configured to input the information to be processed into the target quantized neural network, obtain a processing result and output it.
In this embodiment, the second acquisition unit 601 of the apparatus 600 for processing information may obtain the information to be processed and the target quantized neural network from a remote or local source via a wired or wireless connection. Here, the target quantized neural network is generated using the method of any embodiment among the embodiments corresponding to Fig. 2, and is the quantized neural network to be used for information processing. The information to be processed may be any information that the target quantized neural network is capable of processing, and may include, but is not limited to, at least one of the following: text, image, audio, video.
In this embodiment, the input unit 602 may input the information to be processed into the target quantized neural network, obtain the processing result output by the target quantized neural network, and output it.
It can be understood that the units recorded in the apparatus 600 correspond to the steps of the method described with reference to Fig. 4. Therefore, the operations, features and beneficial effects described above for the method also apply to the apparatus 600 and the units included therein, and are not repeated here.
The apparatus 600 provided by the above embodiment of the present disclosure uses the quantized neural network generated by any embodiment corresponding to Fig. 2, so that the quantized neural network is suitable for the user terminal while helping reduce the consumption of the user terminal's storage resources. Moreover, when the user terminal performs information processing using the quantized neural network, the low complexity of the quantized neural network can improve the efficiency with which the user terminal processes information and reduce the consumption of the user terminal's CPU. In addition, because the quantized neural network sent to the user terminal is obtained by adding the quantization constraint during training, its precision loss is smaller than that of a prior-art quantized neural network generated by adding a quantization constraint to a trained neural network; the user terminal can therefore achieve more accurate information processing and output using the quantized neural network of the present disclosure.
Referring now to Fig. 7, Fig. 7 shows a schematic structural diagram of an electronic device 700 (e.g. the terminal device or server shown in Fig. 1) suitable for implementing the embodiments of the present disclosure. The terminal device in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, laptops, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players) and vehicle-mounted terminals (e.g. vehicle navigation terminals), and fixed terminals such as digital TVs and desktop computers. The electronic device shown in Fig. 7 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in Fig. 7, the electronic device 700 may include a processing device (e.g. a central processing unit, a graphics processor, etc.) 701, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage device 708 into a random access memory (RAM) 703. The RAM 703 also stores various programs and data required for the operation of the electronic device 700. The processing device 701, the ROM 702 and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
In general, the following devices may be connected to the I/O interface 705: an input device 706 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, etc.; an output device 707 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.; a storage device 708 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 709. The communication device 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. Although Fig. 7 shows an electronic device 700 with various devices, it should be understood that it is not required to implement or have all the devices shown; more or fewer devices may alternatively be implemented or provided.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, the embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded from a network and installed through the communication device 709, or installed from the storage device 708, or installed from the ROM 702. When the computer program is executed by the processing device 701, the above-described functions defined in the method of the embodiments of the present disclosure are performed.
It should be noted that the computer-readable medium described in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program may be used by or in connection with an instruction execution system, apparatus or device. Also in the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; the computer-readable signal medium can send, propagate or transmit a program for use by or in connection with an instruction execution system, apparatus or device. The program code contained on the computer-readable medium may be transmitted by any suitable medium, including but not limited to: electrical wire, optical cable, RF (radio frequency), etc., or any suitable combination of the above.
The above computer-readable medium may be included in the above electronic device, or may exist separately without being assembled into the electronic device. The above computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device is caused to: obtain a training sample set and an initial neural network, where each training sample includes sample information and a sample result predetermined for that sample information, and the initial neural network includes an original floating-point network parameter which is the product of the floating-point weight of a convolutional layer in the initial neural network and the floating-point normalization parameter of the batch normalization layer connected to the convolutional layer; convert the original floating-point network parameter in the initial neural network into an integer network parameter, and generate a quantized initial neural network based on the converted integer network parameter; select a training sample from the training sample set and perform the following training steps: use the sample information in the selected training sample as the input of the quantized initial neural network, use the sample result in the selected training sample as the expected output of the quantized initial neural network, and train the quantized initial neural network; in response to determining that training of the quantized initial neural network is complete, generate a quantized neural network based on the trained quantized initial neural network.
In addition, when the one or more programs are executed by the electronic device, the electronic device may also be caused to: obtain information to be processed and a target quantized neural network, wherein the target quantized neural network is generated using the method of any embodiment corresponding to Fig. 2; and input the information to be processed into the target quantized neural network to obtain and output a processing result.
Computer program code for carrying out the operations of the present disclosure may be written in one or more programming languages, or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, or C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, program segment, or portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. The name of a unit does not, in some cases, constitute a limitation on the unit itself; for example, the first acquisition unit may also be described as "a unit for obtaining a training sample set and an initial neural network".
The above description is only a preferred embodiment of the present disclosure and an explanation of the applied technical principles. Those skilled in the art should understand that the scope of the disclosure is not limited to technical solutions formed by the specific combination of the above technical features, but should also cover, without departing from the disclosed concept, other technical solutions formed by any combination of the above technical features or their equivalents, for example, technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the present disclosure.

Claims (16)

1. A method for generating a quantized neural network, comprising:
obtaining a training sample set and an initial neural network, wherein a training sample includes sample information and a sample result predetermined for the sample information, the initial neural network includes original floating-point network parameters, and an original floating-point network parameter is the product of a floating-point weight of a convolutional layer in the initial neural network and a floating-point normalization parameter of a batch normalization layer connected to the convolutional layer;
converting the original floating-point network parameters in the initial neural network into integer network parameters;
generating a quantized initial neural network based on the integer network parameters obtained by the conversion;
selecting a training sample from the training sample set, and performing the following training step: using the sample information in the selected training sample as the input of the quantized initial neural network and the sample result in the selected training sample as the desired output of the quantized initial neural network, training the quantized initial neural network; and, in response to determining that the training of the quantized initial neural network is complete, generating a quantized neural network based on the trained quantized initial neural network.
2. The method according to claim 1, wherein generating a quantized initial neural network based on the integer network parameters obtained by the conversion comprises:
converting the integer network parameters obtained by the conversion into floating-point network parameters, and determining the initial neural network containing the floating-point network parameters obtained by the conversion as the quantized initial neural network.
3. The method according to claim 1, wherein converting the original floating-point network parameters in the initial neural network into integer network parameters comprises:
converting the floating-point weight corresponding to an original floating-point network parameter into an integer weight, and converting the floating-point normalization parameter corresponding to the original floating-point network parameter into an integer normalization parameter; and
multiplying the integer weight and the integer normalization parameter obtained by the conversion to obtain an integer network parameter.
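The two-step conversion of claim 3 can be sketched as follows. This is an illustrative NumPy fragment under the assumption of symmetric int8 quantization with per-tensor scales (the claim itself fixes neither): quantizing the weight and the normalization parameter separately and multiplying the integers yields an integer network parameter whose effective scale is simply the product of the two scales.

```python
import numpy as np

def to_int8(x):
    # Assumed symmetric int8 quantizer; the claim does not fix a scheme.
    max_abs = float(np.abs(x).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8), scale

w = np.array([0.5, -1.2, 0.8], dtype=np.float32)  # floating-point weights
n = 0.9                                           # floating-point normalization parameter

w_int, w_scale = to_int8(w)
n_int, n_scale = to_int8(np.array([n]))

# Integer network parameter: product of the two integer quantities,
# accumulated in int32; its effective scale is the product of the scales.
param_int = w_int.astype(np.int32) * int(n_int[0])
param_float = param_int * (w_scale * n_scale)   # matches w * n up to rounding
```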
4. The method according to claim 1, wherein the method further comprises:
in response to determining that the training of the quantized initial neural network is not complete, performing the following steps: selecting a training sample from the unselected training samples included in the training sample set; adjusting the parameters of the quantized initial neural network to obtain new floating-point network parameters; converting the new floating-point network parameters into new integer network parameters, and generating a new quantized initial neural network based on the new integer network parameters; and continuing to perform the training step using the most recently selected training sample and the most recently generated quantized initial neural network.
5. The method according to claim 4, wherein generating a new quantized initial neural network based on the new integer network parameters comprises:
converting the new integer network parameters into floating-point network parameters, and determining the quantized initial neural network containing the floating-point network parameters obtained by the conversion as the new quantized initial neural network.
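The iterative loop of claims 4 and 5 — adjust the floating-point parameters, convert them to new integer parameters, dequantize, and continue training — can be illustrated on a toy quadratic loss. This is a hypothetical sketch, not the claimed training procedure: the single parameter vector standing in for a network, the loss, and the gradient step are all illustrative stand-ins.

```python
import numpy as np

def fake_quant(x, levels=127):
    # Quantize to integers and straight back to floating point (assumed scheme).
    max_abs = float(np.abs(x).max())
    scale = max_abs / levels if max_abs > 0 else 1.0
    return np.clip(np.round(x / scale), -levels, levels) * scale

rng = np.random.default_rng(1)
w = rng.normal(size=(4,))                   # floating-point network parameters
target = np.array([1.0, -0.5, 0.25, 2.0])  # stands in for the desired output
lr = 0.1

for _ in range(200):
    wq = fake_quant(w)           # new quantized initial network for this step
    grad = 2.0 * (wq - target)   # gradient of the toy loss ||wq - target||^2
    w = w - lr * grad            # adjust the floating-point parameters
loss = float(np.sum((fake_quant(w) - target) ** 2))
```

Because every forward pass runs through freshly re-quantized parameters, the loop converges to floating-point parameters whose quantized version fits the target, which is the point of re-quantizing after each adjustment.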
6. The method according to any one of claims 1-5, wherein the method further comprises:
sending the quantized neural network to a user terminal, so that the user terminal stores the received quantized neural network.
7. A method for processing information, comprising:
obtaining information to be processed and a target quantized neural network, wherein the target quantized neural network is generated using the method according to any one of claims 1-6; and
inputting the information to be processed into the target quantized neural network to obtain and output a processing result.
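For intuition on why the integer parameters of claim 1 help at inference time (claim 7), the following hypothetical fragment runs the accumulation in integers and applies a single floating-point rescale at the end. The int8 dot product and the names are assumed implementation details, not something the claims prescribe.

```python
import numpy as np

def to_int8(x):
    # Assumed symmetric int8 quantizer (illustrative only).
    max_abs = float(np.abs(x).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    return np.clip(np.round(x / scale), -127, 127).astype(np.int32), scale

w = np.array([0.5, -1.2, 0.8])   # network parameters
x = np.array([1.0, 2.0, -1.0])   # information to be processed

w_q, w_s = to_int8(w)
x_q, x_s = to_int8(x)

# Integer multiply-accumulate, one floating-point rescale at the end.
y = int(np.dot(w_q, x_q)) * (w_s * x_s)   # approximates np.dot(w, x)
```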
8. An apparatus for generating a quantized neural network, comprising:
a first acquisition unit configured to obtain a training sample set and an initial neural network, wherein a training sample includes sample information and a sample result predetermined for the sample information, the initial neural network includes original floating-point network parameters, and an original floating-point network parameter is the product of a floating-point weight of a convolutional layer in the initial neural network and a floating-point normalization parameter of a batch normalization layer connected to the convolutional layer;
a conversion unit configured to convert the original floating-point network parameters in the initial neural network into integer network parameters;
a generation unit configured to generate a quantized initial neural network based on the integer network parameters obtained by the conversion; and
a first execution unit configured to select a training sample from the training sample set and perform the following training step: using the sample information in the selected training sample as the input of the quantized initial neural network and the sample result in the selected training sample as the desired output of the quantized initial neural network, training the quantized initial neural network; and, in response to determining that the training of the quantized initial neural network is complete, generating a quantized neural network based on the trained quantized initial neural network.
9. The apparatus according to claim 8, wherein the generation unit is further configured to:
convert the integer network parameters obtained by the conversion into floating-point network parameters, and determine the initial neural network containing the floating-point network parameters obtained by the conversion as the quantized initial neural network.
10. The apparatus according to claim 8, wherein the conversion unit comprises:
a conversion module configured to convert the floating-point weight corresponding to an original floating-point network parameter into an integer weight, and convert the floating-point normalization parameter corresponding to the original floating-point network parameter into an integer normalization parameter; and
a multiplication module configured to multiply the integer weight and the integer normalization parameter obtained by the conversion to obtain an integer network parameter.
11. The apparatus according to claim 8, wherein the apparatus further comprises:
a second execution unit configured to, in response to determining that the training of the quantized initial neural network is not complete, perform the following steps: selecting a training sample from the unselected training samples included in the training sample set; adjusting the parameters of the quantized initial neural network to obtain new floating-point network parameters; converting the new floating-point network parameters into new integer network parameters, and generating a new quantized initial neural network based on the new integer network parameters; and continuing to perform the training step using the most recently selected training sample and the most recently generated quantized initial neural network.
12. The apparatus according to claim 11, wherein the second execution unit is further configured to:
convert the new integer network parameters into floating-point network parameters, and determine the quantized initial neural network containing the floating-point network parameters obtained by the conversion as the new quantized initial neural network.
13. The apparatus according to any one of claims 8-12, wherein the apparatus further comprises:
a sending unit configured to send the quantized neural network to a user terminal, so that the user terminal stores the received quantized neural network.
14. An apparatus for processing information, comprising:
a second acquisition unit configured to obtain information to be processed and a target quantized neural network, wherein the target quantized neural network is generated using the method according to any one of claims 1-6; and
an input unit configured to input the information to be processed into the target quantized neural network to obtain and output a processing result.
15. An electronic device, comprising:
one or more processors; and
a storage device on which one or more programs are stored,
which, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-7.
16. A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-7.
CN201910288941.2A 2019-04-11 2019-04-11 Method and apparatus for generating quantization neural network Pending CN109961141A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910288941.2A CN109961141A (en) 2019-04-11 2019-04-11 Method and apparatus for generating quantization neural network
PCT/CN2020/078586 WO2020207174A1 (en) 2019-04-11 2020-03-10 Method and apparatus for generating quantized neural network


Publications (1)

Publication Number Publication Date
CN109961141A true CN109961141A (en) 2019-07-02

Family

ID=67026033

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910288941.2A Pending CN109961141A (en) 2019-04-11 2019-04-11 Method and apparatus for generating quantization neural network

Country Status (2)

Country Link
CN (1) CN109961141A (en)
WO (1) WO2020207174A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110443165A (en) * 2019-07-23 2019-11-12 北京迈格威科技有限公司 Neural network quantization method, image-recognizing method, device and computer equipment
CN110852421A (en) * 2019-11-11 2020-02-28 北京百度网讯科技有限公司 Model generation method and device
CN111340226A (en) * 2020-03-06 2020-06-26 北京市商汤科技开发有限公司 Training and testing method, device and equipment for quantitative neural network model
WO2020207174A1 (en) * 2019-04-11 2020-10-15 北京字节跳动网络技术有限公司 Method and apparatus for generating quantized neural network
CN112308226A (en) * 2020-08-03 2021-02-02 北京沃东天骏信息技术有限公司 Quantization of neural network models, method and apparatus for outputting information
CN113011569A (en) * 2021-04-07 2021-06-22 开放智能机器(上海)有限公司 Offline quantitative parameter filling method and device, electronic equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107590460B (en) * 2017-09-12 2019-05-03 北京达佳互联信息技术有限公司 Face classification method, apparatus and intelligent terminal
CN108509179B (en) * 2018-04-04 2021-11-30 百度在线网络技术(北京)有限公司 Method for detecting human face and device for generating model
CN109165736B (en) * 2018-08-08 2023-12-12 北京字节跳动网络技术有限公司 Information processing method and device applied to convolutional neural network
CN109961141A (en) * 2019-04-11 2019-07-02 北京字节跳动网络技术有限公司 Method and apparatus for generating quantization neural network

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020207174A1 (en) * 2019-04-11 2020-10-15 北京字节跳动网络技术有限公司 Method and apparatus for generating quantized neural network
CN110443165A (en) * 2019-07-23 2019-11-12 北京迈格威科技有限公司 Neural network quantization method, image-recognizing method, device and computer equipment
CN110443165B (en) * 2019-07-23 2022-04-29 北京迈格威科技有限公司 Neural network quantization method, image recognition method, device and computer equipment
CN110852421A (en) * 2019-11-11 2020-02-28 北京百度网讯科技有限公司 Model generation method and device
CN110852421B (en) * 2019-11-11 2023-01-17 北京百度网讯科技有限公司 Model generation method and device
CN111340226A (en) * 2020-03-06 2020-06-26 北京市商汤科技开发有限公司 Training and testing method, device and equipment for quantitative neural network model
CN111340226B (en) * 2020-03-06 2022-01-25 北京市商汤科技开发有限公司 Training and testing method, device and equipment for quantitative neural network model
CN112308226A (en) * 2020-08-03 2021-02-02 北京沃东天骏信息技术有限公司 Quantization of neural network models, method and apparatus for outputting information
CN113011569A (en) * 2021-04-07 2021-06-22 开放智能机器(上海)有限公司 Offline quantitative parameter filling method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
WO2020207174A1 (en) 2020-10-15

Similar Documents

Publication Publication Date Title
CN109961141A (en) Method and apparatus for generating quantization neural network
WO2020155907A1 (en) Method and apparatus for generating cartoon style conversion model
CN108416310B (en) Method and apparatus for generating information
CN109858445A (en) Method and apparatus for generating model
CN108197652B (en) Method and apparatus for generating information
CN108022586A (en) Method and apparatus for controlling the page
CN109993150A (en) The method and apparatus at age for identification
WO2022121801A1 (en) Information processing method and apparatus, and electronic device
CN109165736A (en) Information processing method and device applied to convolutional neural networks
CN110059623A (en) Method and apparatus for generating information
CN109800730A (en) The method and apparatus for generating model for generating head portrait
CN110009101A (en) Method and apparatus for generating quantization neural network
CN109829164A (en) Method and apparatus for generating text
WO2021068493A1 (en) Method and apparatus for processing information
CN112149699B (en) Method and device for generating model and method and device for identifying image
CN109934142A (en) Method and apparatus for generating the feature vector of video
CN110008926A (en) The method and apparatus at age for identification
CN111008213A (en) Method and apparatus for generating language conversion model
CN114420135A (en) Attention mechanism-based voiceprint recognition method and device
CN112241761A (en) Model training method and device and electronic equipment
CN110046670B (en) Feature vector dimension reduction method and device
CN110335237A (en) For generating the method, apparatus and the method, apparatus of image for identification of model
CN109840072A (en) Information processing method and device
CN113823312B (en) Speech enhancement model generation method and device, and speech enhancement method and device
WO2022121800A1 (en) Sound source positioning method and apparatus, and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20190702)