WO2019120114A1 - Data fixed point processing method, device, electronic apparatus and computer storage medium - Google Patents

Info

Publication number
WO2019120114A1
WO2019120114A1 · PCT/CN2018/120512 · CN 2018120512 W
Authority
WO
WIPO (PCT)
Prior art keywords
feature map
weight parameter
original
convolutional neural
neural network
Application number
PCT/CN2018/120512
Other languages
French (fr)
Chinese (zh)
Inventor
牟永强
韦国恒
严蕤
田第鸿
Original Assignee
深圳励飞科技有限公司
Application filed by 深圳励飞科技有限公司
Publication of WO2019120114A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • the present invention relates to the field of computer technologies, and in particular, to a data fixed point processing method and apparatus, an electronic device, and a computer storage medium.
  • Convolutional neural networks are an important class of machine learning algorithms. As research has progressed, the recognition accuracy of convolutional neural networks has continued to increase, and the computational and storage complexity of the algorithms has increased along with it.
  • GPU (Graphics Processing Unit)
  • DSP (Digital Signal Processing)
  • ARM (Advanced RISC Machines)
  • an algorithm running on an embedded platform requires fixed-point processing. This process causes different degrees of precision loss depending on the quantization bit width: a high quantization bit width yields a smaller precision loss but slower operation, while a low quantization bit width yields a larger precision loss but faster operation, so a balance generally has to be struck between the two.
  • deep learning algorithms are generally of high complexity, and usually only a relatively low quantization bit width can be used to ensure operating efficiency on a terminal device, which causes a relatively large loss of precision. It can be seen that how to use a lower quantization bit width while ensuring a small loss of precision is a technical problem to be solved.
  • a data fixed-point processing method comprising:
  • the original weight parameter is replaced with the second weight parameter
  • the original feature map is replaced with the second feature map
  • performing fixed-point processing on the original weight parameter and the original feature map separately to obtain the first weight parameter and the first feature map includes:
  • performing truncation on the weight parameter after rounding to obtain the first weight parameter includes:
  • if the weight parameter after rounding is greater than the maximum weight parameter, determining the maximum weight parameter as the first weight parameter; if the weight parameter after rounding is smaller than the minimum weight parameter, determining the minimum weight parameter as the first weight parameter;
  • Performing truncation on the rounded feature map to obtain the first feature map includes:
  • the parameter quantization bit width is 8 bits
  • the feature quantization bit width is 8 bits
  • the method further includes:
  • the method further includes:
  • a second convolutional neural network model is generated.
  • the first convolutional neural network model is a floating point model
  • the second convolutional neural network model is a fixed point model
  • a data fixed point processing device comprising:
  • An extracting unit configured to extract, from the trained first convolutional neural network model, an original weight parameter and an original feature map of the i-th data layer of the first convolutional neural network model, where i is an initialized network layer number and i is a positive integer;
  • a processing unit configured to perform a fixed-point processing on the original weight parameter and the original feature map, respectively, to obtain a first weight parameter and a first feature map
  • a converting unit configured to convert the first weight parameter and the first feature map into a floating point number, respectively, to obtain a second weight parameter and the second feature map
  • a replacement unit configured to replace the original weight parameter with the second weight parameter, and replace the original feature map with the second feature map.
  • An electronic device comprising a processor and a memory, the processor being configured to execute a computer program stored in the memory to implement the data fixed-point processing method.
  • a computer readable storage medium storing at least one instruction that, when executed by a processor, implements the data fixed point processing method.
  • the electronic device may first extract the original weight parameter and the original feature map of the i-th data layer of the first convolutional neural network model, where i is an initialized network layer number and i is a positive integer; perform fixed-point processing on the original weight parameter and the original feature map respectively to obtain a first weight parameter and a first feature map; further, the electronic device may convert the first weight parameter and the first feature map into floating point numbers to obtain a second weight parameter and a second feature map; and further, the electronic device may replace the original weight parameter with the second weight parameter and replace the original feature map with the second feature map.
  • the electronic device can perform fixed-point processing on the weight parameter and the feature map of a data layer of the first convolutional neural network model to obtain a weight parameter and a feature map whose data structure is a fixed-point number, and then convert them into floating point numbers that replace the original weight parameter and the original feature map, so that they can run on an existing floating-point framework.
  • the processed weight parameters and feature maps reach the precision range of fixed-point numbers while remaining compatible with the floating-point framework, thereby ensuring that a small loss of precision can be achieved at a lower quantization bit width.
  • FIG. 1 is a flow chart of a preferred embodiment of a data fixed point processing method disclosed by the present invention.
  • FIG. 2 is a flow chart of another preferred embodiment of a data fixed-point processing method disclosed by the present invention.
  • FIG. 3 is a functional block diagram of a preferred embodiment of a data fixed point processing apparatus disclosed by the present invention.
  • FIG. 4 is a schematic structural diagram of an electronic device according to a preferred embodiment of the present invention for implementing a data fixed-point processing method.
  • the electronic device is a device capable of automatically performing numerical calculation and/or information processing according to preset or pre-stored instructions
  • the hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), embedded devices, and the like.
  • the electronic device includes, but is not limited to, any electronic product that can interact with a user through a keyboard, a mouse, a remote controller, a touch pad, or a voice control device, such as a personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), or an Internet Protocol Television (IPTV).
  • FIG. 1 is a flowchart of a preferred embodiment of a data fixed-point processing method according to the present disclosure. The order of the steps in the flowchart may be changed according to different requirements, and some steps may be omitted.
  • the electronic device extracts the original weight parameter and the original feature map of the i-th data layer of the first convolutional neural network model.
  • the original weight parameter is the weight parameter of the i-th data layer of the trained first convolutional neural network model, and the original feature map is the feature map of the i-th data layer of the trained first convolutional neural network model. Each data layer may have multiple weight parameters, and each data layer may output multiple feature maps; the specific number depends on the structure of the first convolutional neural network model.
  • the deep learning framework used in the present invention uses floating point as its basic data structure, which may include, but is not limited to, Caffe, MxNet, TensorFlow, and the like.
  • the first convolutional neural network model completed by the training is a floating point model, that is, the data structure of each data layer of the first convolutional neural network model is a floating point number.
  • the feature map of the i-th data layer is generally convolved with the weight parameter of the (i+1)-th data layer to generate the feature map of the (i+1)-th data layer.
  • the electronic device performs a fixed-point processing on the original weight parameter and the original feature map to obtain a first weight parameter and a first feature map.
  • in the present invention, fixed-point processing is required in order to operate on an embedded platform.
  • the present invention can train the first convolutional neural network model using any floating point framework, for example using Caffe as the basic training framework, ResNet-18 as the training network, and MNIST as the data set.
  • the electronic device may perform fixed-point processing on the original weight parameter and the original feature map to obtain a first weight parameter and a first feature map.
  • the original weight parameter can be fixed-pointed by using the 8-bit parameter quantization bit width
  • the original feature map is fixed-pointed by using the 8-bit feature quantization bit width
  • the data structure of the first weight parameter after fixed-point processing is a fixed-point number, and the data structure of the first feature map after fixed-point processing is a fixed-point number.
  • the electronic device separately performs a fixed-point processing on the original weight parameter and the original feature map, and obtaining the first weight parameter and the first feature map includes:
  • performing truncation on the weight parameter after rounding to obtain the first weight parameter includes:
  • if the weight parameter after rounding is greater than the maximum weight parameter, determining the maximum weight parameter as the first weight parameter; if the weight parameter after rounding is smaller than the minimum weight parameter, determining the minimum weight parameter as the first weight parameter;
  • Performing truncation on the rounded feature map to obtain the first feature map includes:
  • the preset parameter quantization bit width and the preset feature quantization bit width refer to a lower quantization bit width. According to a number of experiments, the lower quantization bit width applicable to the present invention may be 6 to 12 bits; for example, the parameter quantization bit width may be 8 bits and the feature quantization bit width may be 8 bits. It should be noted that, because different convolutional neural networks have different structures, the range of lower quantization bit widths will differ, and the precision loss caused by different quantization bit widths will differ; the lower quantization bit width provided by the present invention is only a lower range obtained from experimental data and does not represent the range of lower quantization bit widths applicable to all convolutional neural networks or framework models.
  • each data layer has multiple weight parameters
  • the range representable by 8 bits is [-128, 127], so the scaling factor α should satisfy α·p_max ≤ 127 and α·p_min > -128; usually the scaling factor α is an integer. According to the above constraints, an appropriate scaling factor α can be uniquely determined.
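As a rough sketch of how such an integer scaling factor could be found (the function name, the strategy of taking the largest admissible integer, and the default bit width are illustrative assumptions, not specified by the disclosure):

```python
def choose_scale(p_max, p_min, bit_width=8):
    """Pick the largest integer scaling factor alpha such that the scaled
    weight range still fits in the signed range, [-128, 127] for 8 bits.
    The disclosure only states the two inequalities alpha * p_max <= 127
    and alpha * p_min > -128; assumes p_max > 0."""
    q_max = 2 ** (bit_width - 1) - 1   # 127 for 8 bits
    q_min = -2 ** (bit_width - 1)      # -128 for 8 bits
    alpha = 1
    # Grow alpha while both constraints still hold for the next candidate.
    while (alpha + 1) * p_max <= q_max and (alpha + 1) * p_min > q_min:
        alpha += 1
    return alpha
```

For example, weights spanning [-0.4, 0.5] would yield α = 254, since 254 · 0.5 = 127 exactly saturates the positive bound.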
  • each of the original weight parameters may be sequentially subjected to quantization processing and rounding processing. Specifically, each of the original weight parameters is multiplied by a scaling factor, and the multiplied result is rounded off.
  • the weight parameter after rounding can be truncated to obtain the first weight parameter.
  • performing truncation on the weight parameter after rounding to obtain the first weight parameter includes: if the weight parameter after rounding is greater than the maximum weight parameter, determining the maximum weight parameter as the first weight parameter; if the weight parameter after rounding is smaller than the minimum weight parameter, determining the minimum weight parameter as the first weight parameter. That is, when a parameter in the original weight parameter M is greater than p_max, that parameter is re-assigned to p_max, and when a parameter in the original weight parameter M is less than p_min, that parameter is re-assigned to p_min.
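The quantization, rounding, and truncation steps described above might be sketched as follows (the function name and the use of NumPy are assumptions; clamping is done here in the scaled domain, which for 8 bits corresponds to the [-128, 127] range):

```python
import numpy as np

def fix_point_quantize(weights, alpha, bit_width=8):
    """Quantize: multiply by the scaling factor, round to the nearest
    integer, then truncate (clamp) to the representable signed range."""
    q_max = 2 ** (bit_width - 1) - 1   # 127 for 8 bits
    q_min = -2 ** (bit_width - 1)      # -128 for 8 bits
    scaled = np.asarray(weights, dtype=np.float64) * alpha  # quantization step
    rounded = np.rint(scaled)                               # rounding step
    return np.clip(rounded, q_min, q_max)                   # truncation step
```

With α = 100, for instance, a weight of -2.0 scales to -200 and is truncated to -128, while 0.004 rounds to 0.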
  • each data layer has multiple feature maps
  • the electronic device can perform fixed-point processing on the original feature map in a manner similar to the fixed-point processing of the original weight parameter to obtain the first feature map; the details are not repeated here.
  • the electronic device converts the first weight parameter and the first feature map into floating point numbers respectively, and obtains a second weight parameter and the second feature map.
  • the data structure of the first weight parameter obtained by the electronic device and the data structure of the first feature map are both fixed-point numbers; a fixed-point number cannot be operated on by the existing floating-point framework, so the data structure needs to be converted.
  • the electronic device may divide the first weight parameter by a first scaling factor to obtain a second weight parameter whose data structure is a floating point number.
  • the electronic device may divide the first feature map by a second scaling factor to obtain a second feature map whose data structure is a floating point number.
  • the second weight parameter and the second feature map are of floating point data types, but both reach the precision range of fixed-point numbers, so they can not only be applied to the existing floating-point framework but also achieve a small loss of precision at lower bit widths.
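A minimal sketch of this conversion back to floating point, dividing the fixed-point values by the scaling factor used during quantization (the function name is hypothetical):

```python
def to_float(fixed_values, alpha):
    """Convert fixed-point integers back to floating point by dividing by
    the scaling factor. The results are float-typed but carry only
    fixed-point precision, so they can run on a floating-point framework."""
    return [v / alpha for v in fixed_values]
```

For example, with α = 100 the fixed-point values 50 and -128 map back to 0.5 and -1.28.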
  • the electronic device replaces the original weight parameter with the second weight parameter, and replaces the original feature map with the second feature map.
  • for each second weight parameter, the electronic device needs to replace the corresponding original weight parameter with the second weight parameter, and for each second feature map, the electronic device needs to replace the corresponding original feature map with the second feature map.
  • the above replacement concerns the original weight parameter and the original feature map on the i-th data layer of the first convolutional neural network. By analogy, the electronic device can perform fixed-point processing on the other data layers of the first convolutional neural network in the same way as on the i-th data layer, until the entire first convolutional neural network is completed.
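The layer-by-layer procedure might be sketched as follows; the dict-based model structure and all names here are hypothetical, and the per-layer scale-factor choice is a simplification of the constraint-based choice described earlier:

```python
def quantize_dequantize(values, bit_width=8):
    """Round-trip a list of floats through the fixed-point domain:
    scale, round, clamp, then divide back by the scale factor."""
    q_max = 2 ** (bit_width - 1) - 1          # 127 for 8 bits
    p_abs = max(abs(v) for v in values) or 1.0
    alpha = max(1, int(q_max / p_abs))        # integer scale factor (sketch)
    out = []
    for v in values:
        q = round(v * alpha)                  # rounding step
        q = max(-q_max - 1, min(q_max, q))    # truncation to [-128, 127]
        out.append(q / alpha)                 # back to float for replacement
    return out

def fix_point_model(model, bit_width=8):
    """Replace each data layer's weights and feature maps with their
    quantize-dequantize versions, layer by layer, until the whole model
    has been processed (hypothetical dict-based model interface)."""
    for layer in model.values():
        layer['weights'] = quantize_dequantize(layer['weights'], bit_width)
        layer['feature_map'] = quantize_dequantize(layer['feature_map'], bit_width)
    return model
```

Values that already saturate the representable range, such as the per-layer maximum, survive the round trip exactly, while intermediate values pick up a small quantization error.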
  • the electronic device may first extract the original weight parameter and the original feature map of the i-th data layer of the first convolutional neural network model, where i is an initialized network layer number and i is a positive integer; perform fixed-point processing on the original weight parameter and the original feature map respectively to obtain a first weight parameter and a first feature map; further, the electronic device may convert the first weight parameter and the first feature map into floating point numbers to obtain a second weight parameter and a second feature map; and further, the electronic device may replace the original weight parameter with the second weight parameter and replace the original feature map with the second feature map.
  • the electronic device can perform fixed-point processing on the weight parameter and the feature map of a data layer of the first convolutional neural network model to obtain a weight parameter and a feature map whose data structure is a fixed-point number, and then convert them into floating point numbers that replace the original weight parameter and the original feature map, so that they can run on an existing floating-point framework.
  • the processed weight parameters and feature maps reach the precision range of fixed-point numbers while remaining compatible with the floating-point framework, thereby ensuring that a small loss of precision can be achieved at a lower quantization bit width.
  • FIG. 2 is a flowchart of another preferred embodiment of a data fixed-point processing method according to the present disclosure.
  • the order of the steps in the flowchart may be changed according to different requirements, and some steps may be omitted.
  • the electronic device extracts the original weight parameter and the original feature map of the i-th data layer of the first convolutional neural network model.
  • the original weight parameter is the weight parameter of the i-th data layer of the trained first convolutional neural network model, and the original feature map is the feature map of the i-th data layer of the trained first convolutional neural network model. Each data layer may have multiple weight parameters, and each data layer may output multiple feature maps; the specific number depends on the structure of the first convolutional neural network model.
  • the deep learning framework used in the present invention uses floating point as its basic data structure, which may include, but is not limited to, Caffe, MxNet, TensorFlow, and the like.
  • the first convolutional neural network model completed by the training is a floating point model, that is, the data structure of each data layer of the first convolutional neural network model is a floating point number.
  • the feature map of the i-th data layer is generally convolved with the weight parameter of the (i+1)-th data layer to generate the feature map of the (i+1)-th data layer.
  • the electronic device performs fixed-point processing on the original weight parameter and the original feature map to obtain a first weight parameter and a first feature map.
  • in the present invention, fixed-point processing is required in order to operate on an embedded platform.
  • the present invention can train the first convolutional neural network model using any floating point framework, for example using Caffe as the basic training framework, ResNet-18 as the training network, and MNIST as the data set.
  • the electronic device may perform fixed-point processing on the original weight parameter and the original feature map to obtain a first weight parameter and a first feature map.
  • the original weight parameter can be fixed-pointed by using the 8-bit parameter quantization bit width
  • the original feature map is fixed-pointed by using the 8-bit feature quantization bit width
  • the data structure of the first weight parameter after fixed-point processing is a fixed-point number, and the data structure of the first feature map after fixed-point processing is a fixed-point number.
  • the electronic device separately performs a fixed-point processing on the original weight parameter and the original feature map, and obtaining the first weight parameter and the first feature map includes:
  • the parameter quantization bit width is 8 bits, and the feature quantization bit width is 8 bits.
  • performing truncation on the weight parameter after rounding to obtain the first weight parameter includes:
  • if the weight parameter after rounding is greater than the maximum weight parameter, determining the maximum weight parameter as the first weight parameter; if the weight parameter after rounding is smaller than the minimum weight parameter, determining the minimum weight parameter as the first weight parameter;
  • Performing truncation on the rounded feature map to obtain the first feature map includes:
  • each data layer has multiple weight parameters
  • the range representable by 8 bits is [-128, 127], so the scaling factor α should satisfy α·p_max ≤ 127 and α·p_min > -128; usually the scaling factor α is an integer. According to the above constraints, an appropriate scaling factor α can be uniquely determined.
  • each of the original weight parameters may be sequentially subjected to quantization processing and rounding processing. Specifically, each of the original weight parameters is multiplied by a scaling factor, and the multiplied result is rounded off.
  • the weight parameter after rounding can be truncated to obtain the first weight parameter.
  • performing truncation on the weight parameter after rounding to obtain the first weight parameter includes: if the weight parameter after rounding is greater than the maximum weight parameter, determining the maximum weight parameter as the first weight parameter; if the weight parameter after rounding is smaller than the minimum weight parameter, determining the minimum weight parameter as the first weight parameter. That is, when a parameter in the original weight parameter M is greater than p_max, that parameter is re-assigned to p_max, and when a parameter in the original weight parameter M is less than p_min, that parameter is re-assigned to p_min.
  • each data layer has multiple feature maps
  • the electronic device can perform fixed-point processing on the original feature map in a manner similar to the fixed-point processing of the original weight parameter to obtain the first feature map; the details are not repeated here.
  • the electronic device converts the first weight parameter and the first feature map into a floating point number, respectively, to obtain a second weight parameter and the second feature map.
  • the data structure of the first weight parameter obtained by the electronic device and the data structure of the first feature map are both fixed-point numbers; a fixed-point number cannot be operated on by the existing floating-point framework, so the data structure needs to be converted.
  • the electronic device may divide the first weight parameter by a first scaling factor to obtain a second weight parameter whose data structure is a floating point number.
  • the electronic device may divide the first feature map by a second scaling factor to obtain a second feature map whose data structure is a floating point number.
  • the second weight parameter and the second feature map are of floating point data types, but both reach the precision range of fixed-point numbers, so they can not only be applied to the existing floating-point framework but also achieve a small loss of precision at lower bit widths.
  • the electronic device replaces the original weight parameter with the second weight parameter, and replaces the original feature map with the second feature map.
  • for each second weight parameter, the electronic device needs to replace the corresponding original weight parameter with the second weight parameter, and for each second feature map, the electronic device needs to replace the corresponding original feature map with the second feature map.
  • the electronic device determines whether the first convolutional neural network model has completed the fixed-point process; if not, steps S26 to S27 are performed, and if yes, step S27 is performed.
  • the fixed-point process of the entire network model is described below.
  • the first convolutional neural network model is a floating point model
  • the second convolutional neural network model is a fixed point model
  • the electronic device needs to perform fixed-point processing on the weight parameter and the feature map of each data layer of the first convolutional neural network; after all data layers in the first convolutional neural network model have been processed, a second convolutional neural network model can be generated.
  • the second convolutional neural network model is compatible with existing traditional floating-point frameworks, and a small loss of precision can be guaranteed at a lower bit width.
  • FIG. 3 is a functional block diagram of a preferred embodiment of a data fixed point processing apparatus according to the present invention.
  • the data fixed-point processing device described in FIG. 3 is used to perform some or all of the steps in the data fixed-point processing method described in FIG. 1 or FIG. 2.
  • a unit referred to in the present invention is a series of computer program segments that can be executed by a processor, can perform a fixed function, and are stored in a memory. In this embodiment, the functions of the respective units are described in detail in subsequent embodiments.
  • the data fixed-point processing device 11 depicted in FIG. 3 may include:
  • the extracting unit 101 is configured to extract, from the trained first convolutional neural network model, an original weight parameter and an original feature map of the i-th data layer of the first convolutional neural network model, where i is an initialized network layer number and i is a positive integer;
  • the original weight parameter is the weight parameter of the i-th data layer of the trained first convolutional neural network model, and the original feature map is the feature map of the i-th data layer of the trained first convolutional neural network model. Each data layer may have multiple weight parameters, and each data layer may output multiple feature maps; the specific number depends on the structure of the first convolutional neural network model.
  • the deep learning framework used in the present invention uses floating point as its basic data structure, which may include, but is not limited to, Caffe, MxNet, TensorFlow, and the like.
  • the first convolutional neural network model completed by the training is a floating point model, that is, the data structure of each data layer of the first convolutional neural network model is a floating point number.
  • the feature map of the i-th data layer is generally convolved with the weight parameter of the (i+1)-th data layer to generate the feature map of the (i+1)-th data layer.
  • the processing unit 102 is configured to perform a fixed-point processing on the original weight parameter and the original feature map to obtain a first weight parameter and a first feature map;
  • in the present invention, fixed-point processing is required in order to operate on an embedded platform.
  • the present invention can train the first convolutional neural network model using any floating point framework, for example using Caffe as the basic training framework, ResNet-18 as the training network, and MNIST as the data set.
  • the electronic device may perform fixed-point processing on the original weight parameter and the original feature map to obtain a first weight parameter and a first feature map.
  • the original weight parameter can be fixed-pointed by using the 8-bit parameter quantization bit width
  • the original feature map is fixed-pointed by using the 8-bit feature quantization bit width
  • the data structure of the first weight parameter after fixed-point processing is a fixed-point number, and the data structure of the first feature map after fixed-point processing is a fixed-point number.
  • the converting unit 103 is configured to convert the first weight parameter and the first feature map into a floating point number, respectively, to obtain a second weight parameter and the second feature map;
  • the data structure of the first weight parameter obtained by the electronic device and the data structure of the first feature map are both fixed-point numbers; a fixed-point number cannot be operated on by the existing floating-point framework, so the data structure needs to be converted.
  • the electronic device may divide the first weight parameter by a first scaling factor to obtain a second weight parameter whose data structure is a floating point number.
  • the electronic device may divide the first feature map by a second scaling factor to obtain a second feature map whose data structure is a floating point number.
  • the second weight parameter and the second feature map are of floating point data types, but both reach the precision range of fixed-point numbers, so they can not only be applied to the existing floating-point framework but also achieve a small loss of precision at lower bit widths.
  • the replacing unit 104 is configured to replace the original weight parameter with the second weight parameter, and replace the original feature map with the second feature map.
  • for each second weight parameter, the electronic device needs to replace the corresponding original weight parameter with the second weight parameter, and for each second feature map, the electronic device needs to replace the corresponding original feature map with the second feature map.
  • the above replacement concerns the original weight parameter and the original feature map on the i-th data layer of the first convolutional neural network. By analogy, the electronic device can perform fixed-point processing on the other data layers of the first convolutional neural network in the same way as on the i-th data layer, until the entire first convolutional neural network is completed.
  • the processing unit 102 performing fixed-point processing on the original weight parameter and the original feature map respectively to obtain the first weight parameter and the first feature map includes:
  • truncating the rounded weight parameter to obtain the first weight parameter includes:
  • if the rounded weight parameter is greater than the maximum weight parameter, determining the maximum weight parameter as the first weight parameter; if the rounded weight parameter is smaller than the minimum weight parameter, determining the minimum weight parameter as the first weight parameter;
  • Performing truncation on the rounded feature map to obtain the first feature map includes:
  • the parameter quantization bit width is 8 bits, and the feature quantization bit width is 8 bits.
  • each data layer has multiple weight parameters
  • the range representable by 8 bits is [-128, 127], so the scaling factor α should satisfy α·p_max ≤ 127 and α·p_min > -128; usually the scaling factor α is an integer. According to the above constraints, an appropriate scaling factor α can be uniquely determined.
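Under the constraints above, one way to pick the scaling factor can be sketched as follows; the helper name `scaling_factor`, the inclusive treatment of the bounds, and the example values of p_max and p_min are assumptions for illustration, not taken from the patent.

```python
# One reasonable reading of the scaling-factor rule above: choose the largest
# integer alpha keeping [alpha * p_min, alpha * p_max] inside the 8-bit range.
import math

def scaling_factor(p_max, p_min, q_max=127, q_min=-128):
    """Largest integer alpha with alpha*p_max <= q_max and alpha*p_min >= q_min."""
    bound = min(q_max / p_max if p_max > 0 else math.inf,
                q_min / p_min if p_min < 0 else math.inf)
    return int(math.floor(bound))

alpha = scaling_factor(p_max=0.8, p_min=-0.5)
print(alpha)  # 158  (127 / 0.8 = 158.75; -128 / -0.5 = 256; the tighter bound wins)
```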
  • each of the original weight parameters may be sequentially subjected to quantization and rounding; specifically, each original weight parameter is multiplied by the scaling factor, and the multiplied result is rounded.
  • the rounded weight parameter can then be truncated to obtain the first weight parameter.
  • truncating the rounded weight parameter to obtain the first weight parameter includes: if the rounded weight parameter is greater than the maximum weight parameter, determining the maximum weight parameter as the first weight parameter; if the rounded weight parameter is smaller than the minimum weight parameter, determining the minimum weight parameter as the first weight parameter. That is, when a parameter in the original weight parameter set M is greater than p_max, that parameter is reassigned to p_max, and when a parameter in M is less than p_min, that parameter is reassigned to p_min.
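The quantize–round–truncate sequence described above can be sketched as follows; clamping to the signed 8-bit range is one reasonable reading of the truncation step, and the function name and example values are illustrative, not the patent's implementation.

```python
# Minimal sketch of the quantize -> round -> truncate step described above,
# clamping to the signed 8-bit range as one reading of the truncation rule.
def quantize_weights(weights, alpha, q_min=-128, q_max=127):
    out = []
    for w in weights:
        q = round(w * alpha)           # quantize by the scaling factor, then round
        q = max(q_min, min(q_max, q))  # truncate to the representable range
        out.append(q)
    return out

original = [-0.5, -0.031, 0.0, 0.25, 0.8]  # original floating-point weights
alpha = 158                                 # scaling factor (assumed)
print(quantize_weights(original, alpha))    # [-79, -5, 0, 40, 126]
```

Note that Python's built-in `round` uses round-half-to-even; a strict "round half away from zero" variant would only differ on exact .5 ties.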
  • each data layer has multiple feature maps
  • the electronic device can perform fixed-point processing on the original feature map by a method similar to the fixed-point processing of the original weight parameter to obtain the first feature map; the details are not repeated here.
  • the data fixed point processing apparatus 11 described in FIG. 3 may further include:
  • a determining unit configured to determine whether the first convolutional neural network model has completed the fixed-point process
  • the data fixed point processing apparatus 11 described in FIG. 3 may further include:
  • a generating unit configured to generate a second convolutional neural network model after the end of the fixed-point process of all the data layers of the first convolutional neural network model.
  • the original weight parameter and the original feature map of the i-th data layer of the first convolutional neural network model may be extracted first.
  • i is the initialized network layer number, and i is a positive integer; the original weight parameter and the original feature map are respectively subjected to fixed-point processing to obtain a first weight parameter and a first feature map; further, the first weight parameter and the first feature map may be converted into floating point numbers to obtain a second weight parameter and a second feature map; still further, the original weight parameter may be replaced with the second weight parameter, and the original feature map replaced with the second feature map.
  • the electronic device can perform fixed-point processing on the weight parameters and feature maps of the data layers of the first convolutional neural network model to obtain weight parameters and feature maps whose data structure is a fixed point number, and then convert them into floating point numbers to replace the original weight parameters and feature maps.
  • in this way, on the basis of the existing floating point framework, the processed weight parameters and feature maps reach the precision range of fixed point numbers while remaining compatible with the floating point framework, thereby ensuring a small precision loss at a lower quantization bit width.
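The whole per-layer procedure summarized above (quantize to fixed point, convert back to floating point, replace the originals, repeat for each data layer) can be sketched as follows; the toy model, the per-layer scaling factors, and the function name are assumptions for illustration, not the patent's implementation.

```python
# End-to-end sketch of the per-layer procedure summarized above: quantize the
# layer's values to fixed point, then divide by the scaling factor so the
# replacement values are floats carrying only fixed-point precision.
def fixed_point_layer(values, alpha, q_min=-128, q_max=127):
    fixed = [max(q_min, min(q_max, round(v * alpha))) for v in values]  # quantize
    return [q / alpha for q in fixed]                                   # back to float

model = {  # toy "model": layer index -> weight list (illustrative values)
    1: [-0.5, 0.25, 0.8],
    2: [0.1, -0.2],
}
alphas = {1: 100, 2: 500}  # per-layer scaling factors (assumed)

for i in sorted(model):    # process the model layer by layer
    model[i] = fixed_point_layer(model[i], alphas[i])

print(model[1])  # [-0.5, 0.25, 0.8]  (these values fall exactly on the grid)
```

Because each replacement value equals an integer divided by the layer's scaling factor, the replaced model still runs on a floating point framework while carrying only fixed-point precision.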
  • the above-described integrated unit implemented in the form of a software function module can be stored in a computer readable storage medium.
  • the computer readable storage medium can store a computer program, which when executed by the processor, can implement the steps of the various method embodiments described above.
  • the computer program comprises computer program code, which may be in the form of source code, object code form, executable file or some intermediate form.
  • the computer readable storage medium may include any entity or device capable of carrying the computer program code: a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), electrical carrier signals, telecommunication signals, and software distribution media.
  • the content contained in the computer readable storage medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunication signals.
  • FIG. 4 is a schematic structural diagram of an electronic device according to a preferred embodiment of the present invention for implementing the data fixed point processing method.
  • the electronic device 1 includes a memory 12 and a processor 13. Those skilled in the art can understand that the schematic diagram shown in FIG. 4 is only an example of the electronic device 1 and does not constitute a limitation on it; the electronic device 1 may include more or fewer components than illustrated, combine some components, or have different components; for example, it may also include input and output devices, network access devices, buses, and the like.
  • the electronic device 1 includes, but is not limited to, any electronic product that can interact with a user through a keyboard, a mouse, a remote controller, a touch panel, or a voice control device, for example, a personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), a game console, an Internet Protocol Television (IPTV), a smart wearable device, and the like.
  • the network in which the electronic device 1 is located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a virtual private network (VPN), and the like.
  • the memory 12 optionally includes one or more computer readable storage media for storing the program of the data fixed point processing method and various data, and for achieving high-speed, automatic access to the program or data during operation.
  • the memory 12 optionally includes a high speed random access memory, and optionally also a non-volatile memory, such as one or more magnetic disk storage devices, flash memory devices, or other non-volatile solid state memory devices.
  • the processor 13, also referred to as a central processing unit (CPU), is a very large-scale integrated circuit that serves as the computing core (Core) and control unit of the electronic device 1.
  • the processor 13 can execute an operating system of the electronic device 1 and various installed application programs, program codes, and the like, such as the data fixed point processing device 11.
  • the memory 12 in the electronic device 1 stores a plurality of instructions to implement a data fixed point processing method
  • the processor 13 may execute the plurality of instructions to implement:
  • the original weight parameter is replaced with the second weight parameter
  • the original feature map is replaced with the second feature map
  • performing fixed-point processing on the original weight parameter and the original feature map respectively to obtain the first weight parameter and the first feature map includes:
  • truncating the rounded weight parameter to obtain the first weight parameter includes:
  • if the rounded weight parameter is greater than the maximum weight parameter, determining the maximum weight parameter as the first weight parameter; if the rounded weight parameter is smaller than the minimum weight parameter, determining the minimum weight parameter as the first weight parameter;
  • Performing truncation on the rounded feature map to obtain the first feature map includes:
  • the parameter quantization bit width is 8 bits
  • the feature quantization bit width is 8 bits
  • processor 13 may execute the multiple instructions to implement:
  • a second convolutional neural network model is generated.
  • the electronic device may first extract the original weight parameter and the original feature map of the i-th data layer of the first convolutional neural network model.
  • i is the initialized network layer number, and i is a positive integer; the original weight parameter and the original feature map are respectively subjected to fixed-point processing to obtain a first weight parameter and a first feature map; further, the electronic device may convert the first weight parameter and the first feature map into floating point numbers to obtain a second weight parameter and a second feature map; still further, the electronic device may replace the original weight parameter with the second weight parameter and replace the original feature map with the second feature map.
  • the electronic device can perform fixed-point processing on the weight parameters and feature maps of the data layers of the first convolutional neural network model to obtain weight parameters and feature maps whose data structure is a fixed point number, and then convert them into floating point numbers to replace the original weight parameters and feature maps.
  • in this way, on the basis of the existing floating point framework, the processed weight parameters and feature maps reach the precision range of fixed point numbers while remaining compatible with the floating point framework, thereby ensuring a small precision loss at a lower quantization bit width.
  • modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional module in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of hardware plus software function modules.


Abstract

Disclosed is a data fixed point processing method, the method comprising: for a trained first convolutional neural network model, extracting the original weight parameter and the original feature map of the i-th data layer of the first convolutional neural network model, wherein i is the initialized network layer number and a positive integer; performing fixed-point processing on the original weight parameter and the original feature map respectively to obtain a first weight parameter and a first feature map; converting the first weight parameter and the first feature map into floating point numbers respectively to obtain a second weight parameter and a second feature map; and replacing the original weight parameter with the second weight parameter and the original feature map with the second feature map. The present invention can ensure that a smaller precision loss is achieved at a lower quantization bit width.

Description

Data fixed point processing method, device, electronic device and computer storage medium
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on December 21, 2017, with application number 201711392617.2 and entitled "Data fixed point processing method, device, electronic device and computer storage medium", the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the field of computer technologies, and in particular, to a data fixed point processing method and apparatus, an electronic device, and a computer storage medium.
Background
Convolutional neural networks are an important class of machine learning algorithms. With continued research, the recognition accuracy of convolutional neural networks keeps improving, and the computational and storage complexity of the algorithms grows accordingly.
With the large-scale popularity of deep learning and Graphics Processing Unit (GPU) technology, tasks such as video-based object detection and recognition have gradually emerged in people's view and are constantly changing our lives. Although GPU technology has powerful computing capability, its power consumption and area are very high, which limits its application in many fields, especially its extension to terminal products. Digital Signal Processing (DSP) and ARM (Advanced RISC Machines) processors have great advantages in power consumption and area and have very broad application prospects in many terminal products, but because the computing resources of these platforms are limited, fixed point numbers are generally used as the basic data structure.
Generally speaking, running an algorithm on an embedded platform requires a fixed-point conversion process. This process causes different degrees of precision loss depending on the quantization bit width: a high quantization bit width brings a smaller precision loss but slower running efficiency, while a low quantization bit width brings a higher precision loss but faster running efficiency, so a balance generally needs to be struck between the two. Because deep learning algorithms are generally complex, usually only a relatively low quantization bit width can be used to guarantee running efficiency on terminal devices, which causes a relatively large precision loss. It can be seen that how to use a lower quantization bit width while guaranteeing a small precision loss is a technical problem to be urgently solved.
Summary of the invention
In view of the above, it is necessary to provide a data fixed point processing method, apparatus, electronic device and computer storage medium that can ensure a small precision loss at a lower quantization bit width.
A data fixed point processing method, the method comprising:
for a trained first convolutional neural network model, extracting the original weight parameter and the original feature map of the i-th data layer of the first convolutional neural network model, wherein i is the initialized network layer number and i is a positive integer;
performing fixed-point processing on the original weight parameter and the original feature map respectively to obtain a first weight parameter and a first feature map;
converting the first weight parameter and the first feature map into floating point numbers respectively to obtain a second weight parameter and a second feature map;
replacing the original weight parameter with the second weight parameter, and replacing the original feature map with the second feature map.
In a possible implementation manner, performing fixed-point processing on the original weight parameter and the original feature map respectively to obtain the first weight parameter and the first feature map includes:
determining a maximum weight parameter and a minimum weight parameter from the original weight parameters; determining a first scaling factor according to the maximum weight parameter, the minimum weight parameter and a preset parameter quantization bit width; according to the first scaling factor, sequentially performing quantization and rounding on each of the original weight parameters, and truncating the rounded weight parameters to obtain the first weight parameter;
determining a maximum feature map and a minimum feature map from the original feature maps; determining a second scaling factor according to the maximum feature map, the minimum feature map and a preset feature quantization bit width; according to the second scaling factor, sequentially performing quantization and rounding on each of the original feature maps, and truncating the rounded feature maps to obtain the first feature map.
In a possible implementation manner, truncating the rounded weight parameter to obtain the first weight parameter includes:
if the rounded weight parameter is greater than the maximum weight parameter, determining the maximum weight parameter as the first weight parameter; if the rounded weight parameter is smaller than the minimum weight parameter, determining the minimum weight parameter as the first weight parameter;
truncating the rounded feature map to obtain the first feature map includes:
if the rounded feature map is greater than the maximum feature map, determining the maximum feature map as the first feature map; if the rounded feature map is smaller than the minimum feature map, determining the minimum feature map as the first feature map.
In a possible implementation manner, the parameter quantization bit width is 8 bits, and the feature quantization bit width is 8 bits.
In a possible implementation manner, the method further includes:
determining whether the first convolutional neural network model has completed the fixed-point process;
if not, setting i = i + 1 and performing the fixed-point process of the first convolutional neural network model.
In a possible implementation manner, the method further includes:
after the fixed-point process of all data layers of the first convolutional neural network model ends, generating a second convolutional neural network model.
In a possible implementation manner, the first convolutional neural network model is a floating point model, and the second convolutional neural network model is a fixed point model.
A data fixed point processing apparatus, the apparatus comprising:
an extracting unit, configured to, for a trained first convolutional neural network model, extract the original weight parameter and the original feature map of the i-th data layer of the first convolutional neural network model, wherein i is the initialized network layer number and i is a positive integer;
a processing unit, configured to perform fixed-point processing on the original weight parameter and the original feature map respectively to obtain a first weight parameter and a first feature map;
a converting unit, configured to convert the first weight parameter and the first feature map into floating point numbers respectively to obtain a second weight parameter and a second feature map;
a replacing unit, configured to replace the original weight parameter with the second weight parameter, and replace the original feature map with the second feature map.
An electronic device, comprising a processor and a memory, the processor being configured to execute a computer program stored in the memory to implement the data fixed point processing method.
A computer readable storage medium storing at least one instruction which, when executed by a processor, implements the data fixed point processing method.
According to the above technical solutions, in the present invention, for a trained first convolutional neural network model, the electronic device may first extract the original weight parameter and the original feature map of the i-th data layer of the first convolutional neural network model, where i is the initialized network layer number and i is a positive integer; perform fixed-point processing on the original weight parameter and the original feature map respectively to obtain a first weight parameter and a first feature map; further, the electronic device may convert the first weight parameter and the first feature map into floating point numbers to obtain a second weight parameter and a second feature map; still further, the electronic device may replace the original weight parameter with the second weight parameter and replace the original feature map with the second feature map. It can be seen that, through the embodiments of the present invention, the electronic device can perform fixed-point processing on the weight parameters and feature maps of the data layers of the trained first convolutional neural network model to obtain weight parameters and feature maps whose data structure is a fixed point number, and then convert them back into floating point numbers to replace the original weight parameters and feature maps. On the basis of the existing floating point framework, the fixed-point processing of the original weight parameters and feature maps makes the processed weight parameters and feature maps reach the precision range of fixed point numbers while remaining compatible with the floating point framework, thereby ensuring a small precision loss at a lower quantization bit width.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
FIG. 1 is a flowchart of a preferred embodiment of a data fixed point processing method disclosed by the present invention.
FIG. 2 is a flowchart of another preferred embodiment of a data fixed point processing method disclosed by the present invention.
FIG. 3 is a functional module diagram of a preferred embodiment of a data fixed point processing apparatus disclosed by the present invention.
FIG. 4 is a schematic structural diagram of an electronic device according to a preferred embodiment of the present invention for implementing the data fixed point processing method.
Detailed description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
To make the above objects, features and advantages of the present invention clearer and easier to understand, the present invention is further described in detail below with reference to the accompanying drawings and specific embodiments.
The electronic device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The electronic device includes, but is not limited to, any electronic product that can interact with a user through a keyboard, a mouse, a remote controller, a touch pad, or a voice control device, for example, a personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), an Internet Protocol Television (IPTV), and the like.
Referring to FIG. 1, FIG. 1 is a flowchart of a preferred embodiment of a data fixed point processing method disclosed by the present invention. The order of the steps in the flowchart may be changed according to different requirements, and some steps may be omitted.
S11. For the trained first convolutional neural network model, the electronic device extracts the original weight parameter and the original feature map of the i-th data layer of the first convolutional neural network model.
Here, i is the initialized network layer number, and i is a positive integer. The original weight parameter is the weight parameter of the i-th data layer of the trained first convolutional neural network model, and the original feature map is the feature map of the i-th data layer of the trained first convolutional neural network model. Each data layer may have multiple weight parameters, and each data layer may output multiple feature maps; the specific numbers depend on the structure of the first convolutional neural network model.
The deep learning frameworks used in the present invention all adopt floating point as their basic data structure; such frameworks may include, but are not limited to, Caffe, MxNet, TensorFlow, and the like. The trained first convolutional neural network model is a floating point model, that is, the data structure of each data layer of the first convolutional neural network model is a floating point number.
In the training process of the first convolutional neural network model, the feature map of the i-th data layer is usually convolved with the weight parameter of the (i+1)-th data layer to determine the feature map of the (i+1)-th data layer, and so on, until the first convolutional neural network model is generated.
S12. The electronic device performs fixed-point processing on the original weight parameter and the original feature map respectively to obtain a first weight parameter and a first feature map.
In the present invention, a fixed-point conversion process is needed in order to run on an embedded platform. The present invention can train the first convolutional neural network model using any floating point framework, for example using Caffe as the basic training framework, ResNet18 as the training network, and MNIST as the data set.
After extracting the original weight parameter and the original feature map of the i-th data layer of the first convolutional neural network model, the electronic device may perform fixed-point processing on the original weight parameter and the original feature map respectively to obtain a first weight parameter and a first feature map.
An 8-bit parameter quantization bit width may be used for the fixed-point processing of the original weight parameter, and an 8-bit feature quantization bit width for the original feature map; the data structures of the first weight parameter and the first feature map after fixed-point processing are fixed point numbers.
Specifically, the electronic device performing fixed-point processing on the original weight parameters and the original feature maps separately to obtain the first weight parameters and the first feature maps includes:
determining a maximum weight parameter and a minimum weight parameter from the original weight parameters; determining a first scaling factor according to the maximum weight parameter, the minimum weight parameter, and a preset parameter quantization bit width; quantizing and rounding each of the original weight parameters in turn according to the first scaling factor, and truncating the rounded weight parameters to obtain the first weight parameters; and
determining a maximum feature map and a minimum feature map from the original feature maps; determining a second scaling factor according to the maximum feature map, the minimum feature map, and a preset feature quantization bit width; quantizing and rounding each of the original feature maps in turn according to the second scaling factor, and truncating the rounded feature maps to obtain the first feature maps.
Specifically, truncating the rounded weight parameters to obtain the first weight parameters includes:
if a rounded weight parameter is greater than the maximum weight parameter, determining the maximum weight parameter as the first weight parameter; and if a rounded weight parameter is less than the minimum weight parameter, determining the minimum weight parameter as the first weight parameter.
Truncating the rounded feature maps to obtain the first feature maps includes:
if a rounded feature map is greater than the maximum feature map, determining the maximum feature map as the first feature map; and if a rounded feature map is less than the minimum feature map, determining the minimum feature map as the first feature map.
Here, the preset parameter quantization bit width and the preset feature quantization bit width both refer to a relatively low quantization bit width. According to repeated experiments, the low quantization bit widths applicable to the present invention may lie in the range of 6 to 12 bits; for example, both the parameter quantization bit width and the feature quantization bit width may be 8 bits. It should be noted that, because different convolutional neural networks have different structures, the applicable range of low quantization bit widths differs, and the precision loss caused by different quantization bit widths also differs; the low quantization bit width provided by the present invention is merely a range obtained from experimental data and does not represent the range of low quantization bit widths applicable to all convolutional neural networks or framework models.
In this embodiment, each data layer has multiple weight parameters. The electronic device may determine the maximum weight parameter and the minimum weight parameter from the original weight parameters M extracted from the i-th data layer, e.g., p_max = max(M) and p_min = min(M). Further, according to the maximum weight parameter, the minimum weight parameter, and the preset parameter quantization bit width, a suitable scaling factor α is found to ensure that the original weight parameters M can be expressed with the preset parameter quantization bit width (e.g., 8 bits), as follows:
The range of values representable with 8 bits is [-128, 127], so the scaling factor α should satisfy α·p_max ≤ 127 and α·p_min > -128. The scaling factor α is usually an integer, and under the above constraints a suitable scaling factor α can be uniquely determined.
Further, according to the determined scaling factor, each of the original weight parameters may be quantized and rounded in turn. Specifically, each original weight parameter is multiplied by the scaling factor, and the product is rounded to the nearest integer.
Furthermore, the rounded weight parameters may be truncated to obtain the first weight parameters. Specifically, truncating the rounded weight parameters to obtain the first weight parameters includes: if a rounded weight parameter is greater than the maximum weight parameter, determining the maximum weight parameter as the first weight parameter; and if a rounded weight parameter is less than the minimum weight parameter, determining the minimum weight parameter as the first weight parameter. That is, when a parameter in the original weight parameters M is greater than p_max, that parameter is reassigned to p_max; when a parameter in M is less than p_min, that parameter is reassigned to p_min.
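The scale-round-truncate procedure above can be sketched as follows. This is an illustrative implementation, not the patent's reference code: the function name and NumPy types are assumptions, the truncation is interpreted here as clipping in the quantized domain to round(α·p_min) and round(α·p_max), and NumPy's round-half-to-even tie-breaking stands in for the unspecified rounding rule.

```python
import numpy as np

def quantize_tensor(m, bit_width=8):
    """Fixed-point quantization of one data layer's tensor (a sketch).

    Steps: find p_min/p_max, pick the largest integer scaling factor
    alpha with alpha*p_max <= 2**(bit_width-1)-1 and
    alpha*p_min > -2**(bit_width-1), then scale, round, and truncate.
    """
    p_max, p_min = float(m.max()), float(m.min())
    q_hi = 2 ** (bit_width - 1) - 1        # 127 for 8 bits
    q_lo = -(2 ** (bit_width - 1))         # -128 for 8 bits
    bounds = []
    if p_max > 0:
        bounds.append(q_hi / p_max)        # enforces alpha * p_max <= 127
    if p_min < 0:
        bounds.append(q_lo / p_min)        # enforces alpha * p_min > -128
    alpha = int(min(bounds)) if bounds else 1
    if p_min < 0 and alpha * p_min <= q_lo:
        alpha -= 1                         # keep the strict inequality
    # quantize, round to the nearest integer, then truncate to the
    # quantized extremes corresponding to p_min and p_max
    q = np.round(m * alpha)
    q = np.clip(q, np.round(alpha * p_min), np.round(alpha * p_max))
    return q.astype(np.int32), alpha
```

For example, a layer whose weights span [-0.5, 1.0] yields α = 127, so 1.0 maps to 127 and -0.5 maps to -64, both within the 8-bit range.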
Similarly, the output of each data layer has multiple feature maps; the electronic device may perform fixed-point processing on the original feature maps in a manner analogous to that applied to the original weight parameters to obtain the first feature maps, which is not repeated here.
S13. The electronic device converts the first weight parameters and the first feature maps into floating-point numbers, respectively, obtaining second weight parameters and second feature maps.
In the present invention, the data structures of the first weight parameters and of the first feature maps obtained by the electronic device are both fixed-point numbers; fixed-point numbers cannot run on existing floating-point frameworks, so the data structure needs to be converted.
Specifically, the electronic device may divide the first weight parameters by the first scaling factor to obtain second weight parameters whose data structure is a floating-point number; likewise, the electronic device may divide the first feature maps by the second scaling factor to obtain second feature maps whose data structure is a floating-point number. Although the second weight parameters and the second feature maps are of floating-point data type, both are limited to the precision range of fixed-point numbers; they are therefore not only applicable to existing floating-point frameworks but also achieve a small performance loss at lower bit widths.
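The conversion back to floating point is a single division by the scaling factor; a minimal sketch (function name and dtypes assumed, not the patent's reference code):

```python
import numpy as np

def dequantize_tensor(q, alpha):
    """Divide fixed-point values by the scaling factor, yielding floats
    that carry only fixed-point precision but can run unchanged on a
    floating-point framework."""
    return q.astype(np.float32) / np.float32(alpha)

# Example: 8-bit values quantized with alpha = 127 map back close to
# the original floats (1.0 is recovered exactly, -0.5 becomes -64/127).
w2 = dequantize_tensor(np.array([-64, 32, 127], dtype=np.int32), 127)
```

These dequantized floats are what replace the original weight parameters and feature maps in step S14.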
S14. The electronic device replaces the original weight parameters with the second weight parameters, and replaces the original feature maps with the second feature maps.
In the present invention, there are multiple original weight parameters and, correspondingly, multiple second weight parameters; likewise, there are multiple original feature maps and, correspondingly, multiple second feature maps. For each original weight parameter, the electronic device replaces it with the corresponding second weight parameter; for each original feature map, the electronic device replaces it with the corresponding second feature map.
The above replacement covers the original weight parameters and original feature maps of the i-th data layer of the first convolutional neural network. By analogy, the electronic device may apply the same fixed-point processing to the other data layers of the first convolutional neural network until the fixed-point conversion of the entire first convolutional neural network is complete.
In the method flow described in FIG. 1, for the trained first convolutional neural network model, the electronic device may first extract the original weight parameters and original feature maps of the i-th data layer of the first convolutional neural network model, where i is the initialized number of network layers and i is a positive integer; perform fixed-point processing on the original weight parameters and the original feature maps separately to obtain first weight parameters and first feature maps; further, convert the first weight parameters and the first feature maps into floating-point numbers, respectively, to obtain second weight parameters and second feature maps; and furthermore, replace the original weight parameters with the second weight parameters and the original feature maps with the second feature maps. It can be seen that, through the embodiments of the present invention, for a trained first convolutional neural network model, the electronic device can perform fixed-point processing on the weight parameters and feature maps of the data layers of the model to obtain weight parameters and feature maps whose data structure is a fixed-point number, and then convert them back into floating-point numbers to replace the original weight parameters and feature maps. On the basis of existing floating-point frameworks, this fixed-point processing enables the processed weight parameters and feature maps to reach the precision range of fixed-point numbers while remaining compatible with floating-point frameworks, thereby ensuring a small precision loss at lower quantization bit widths.
Referring to FIG. 2, FIG. 2 is a flowchart of another preferred embodiment of a data fixed-point processing method disclosed in the present invention. The order of the steps in the flowchart may be changed, and some steps may be omitted, according to different requirements.
S21. For the trained first convolutional neural network model, the electronic device extracts the original weight parameters and original feature maps of the i-th data layer of the first convolutional neural network model.
Here, i is the initialized number of network layers, and i is a positive integer. The original weight parameters are the weight parameters of the i-th data layer of the trained first convolutional neural network model, and the original feature maps are the feature maps of that layer. Each data layer may have multiple weight parameters, and each data layer may output multiple feature maps; the specific numbers depend on the structure of the first convolutional neural network model.
The deep learning frameworks used in the present invention all adopt floating point as their basic data structure; such frameworks may include, but are not limited to, Caffe, MxNet, and TensorFlow. The trained first convolutional neural network model is a floating-point model, that is, the data structure of every data layer of the first convolutional neural network model is a floating-point number.
During training of the first convolutional neural network model, the feature maps of the i-th data layer are typically convolved with the weight parameters of the (i+1)-th data layer to determine the feature maps of the (i+1)-th data layer, and so on, until the first convolutional neural network model is generated.
S22. The electronic device performs fixed-point processing on the original weight parameters and the original feature maps separately, obtaining first weight parameters and first feature maps.
In the present invention, a fixed-point conversion process is required in order to run on an embedded platform. The first convolutional neural network model may be trained with any floating-point framework, for example using Caffe as the base training framework, ResNet18 as the training network, and MNIST as the data set.
After extracting the original weight parameters and the original feature maps of the i-th data layer of the first convolutional neural network model, the electronic device may perform fixed-point processing on them separately to obtain the first weight parameters and the first feature maps.
Here, the original weight parameters may be quantized with an 8-bit parameter quantization bit width, and the original feature maps with an 8-bit feature quantization bit width; after this fixed-point processing, the data structures of both the first weight parameters and the first feature maps are fixed-point numbers.
Specifically, the electronic device performing fixed-point processing on the original weight parameters and the original feature maps separately to obtain the first weight parameters and the first feature maps includes:
determining a maximum weight parameter and a minimum weight parameter from the original weight parameters; determining a first scaling factor according to the maximum weight parameter, the minimum weight parameter, and a preset parameter quantization bit width; quantizing and rounding each of the original weight parameters in turn according to the first scaling factor, and truncating the rounded weight parameters to obtain the first weight parameters; and
determining a maximum feature map and a minimum feature map from the original feature maps; determining a second scaling factor according to the maximum feature map, the minimum feature map, and a preset feature quantization bit width; quantizing and rounding each of the original feature maps in turn according to the second scaling factor, and truncating the rounded feature maps to obtain the first feature maps.
Here, the parameter quantization bit width is 8 bits, and the feature quantization bit width is 8 bits.
Specifically, truncating the rounded weight parameters to obtain the first weight parameters includes:
if a rounded weight parameter is greater than the maximum weight parameter, determining the maximum weight parameter as the first weight parameter; and if a rounded weight parameter is less than the minimum weight parameter, determining the minimum weight parameter as the first weight parameter.
Truncating the rounded feature maps to obtain the first feature maps includes:
if a rounded feature map is greater than the maximum feature map, determining the maximum feature map as the first feature map; and if a rounded feature map is less than the minimum feature map, determining the minimum feature map as the first feature map.
In this embodiment, each data layer has multiple weight parameters. The electronic device may determine the maximum weight parameter and the minimum weight parameter from the original weight parameters M extracted from the i-th data layer, e.g., p_max = max(M) and p_min = min(M). Further, according to the maximum weight parameter, the minimum weight parameter, and the preset parameter quantization bit width, a suitable scaling factor α is found to ensure that the original weight parameters M can be expressed with the preset parameter quantization bit width (e.g., 8 bits), as follows:
The range of values representable with 8 bits is [-128, 127], so the scaling factor α should satisfy α·p_max ≤ 127 and α·p_min > -128. The scaling factor α is usually an integer, and under the above constraints a suitable scaling factor α can be uniquely determined.
Further, according to the determined scaling factor, each of the original weight parameters may be quantized and rounded in turn. Specifically, each original weight parameter is multiplied by the scaling factor, and the product is rounded to the nearest integer.
Furthermore, the rounded weight parameters may be truncated to obtain the first weight parameters. Specifically, truncating the rounded weight parameters to obtain the first weight parameters includes: if a rounded weight parameter is greater than the maximum weight parameter, determining the maximum weight parameter as the first weight parameter; and if a rounded weight parameter is less than the minimum weight parameter, determining the minimum weight parameter as the first weight parameter. That is, when a parameter in the original weight parameters M is greater than p_max, that parameter is reassigned to p_max; when a parameter in M is less than p_min, that parameter is reassigned to p_min.
Similarly, the output of each data layer has multiple feature maps; the electronic device may perform fixed-point processing on the original feature maps in a manner analogous to that applied to the original weight parameters to obtain the first feature maps, which is not repeated here.
S23. The electronic device converts the first weight parameters and the first feature maps into floating-point numbers, respectively, obtaining second weight parameters and second feature maps.
In the present invention, the data structures of the first weight parameters and of the first feature maps obtained by the electronic device are both fixed-point numbers; fixed-point numbers cannot run on existing floating-point frameworks, so the data structure needs to be converted.
Specifically, the electronic device may divide the first weight parameters by the first scaling factor to obtain second weight parameters whose data structure is a floating-point number; likewise, the electronic device may divide the first feature maps by the second scaling factor to obtain second feature maps whose data structure is a floating-point number. Although the second weight parameters and the second feature maps are of floating-point data type, both are limited to the precision range of fixed-point numbers; they are therefore not only applicable to existing floating-point frameworks but also achieve a small performance loss at lower bit widths.
S24. The electronic device replaces the original weight parameters with the second weight parameters, and replaces the original feature maps with the second feature maps.
In the present invention, there are multiple original weight parameters and, correspondingly, multiple second weight parameters; likewise, there are multiple original feature maps and, correspondingly, multiple second feature maps. For each original weight parameter, the electronic device replaces it with the corresponding second weight parameter; for each original feature map, the electronic device replaces it with the corresponding second feature map.
S25. The electronic device determines whether the first convolutional neural network model has completed the fixed-point conversion process; if not, steps S26 to S27 are performed; if so, step S27 is performed.
In the present invention, the electronic device needs to perform fixed-point processing on the data of every data layer of the first convolutional neural network model. Therefore, after completing the fixed-point processing of the i-th data layer, the electronic device further determines whether the first convolutional neural network model has completed the fixed-point conversion process; if not, the electronic device processes the data of the next data layer, i.e., sets i = i + 1 and iterates until the fixed-point conversion of the first convolutional neural network model is complete.
S26. The electronic device sets i = i + 1 and continues the fixed-point conversion process of the first convolutional neural network model.
S27. After the fixed-point conversion of all data layers of the first convolutional neural network model is finished, a second convolutional neural network model is generated.
Here, the first convolutional neural network model is a floating-point model, and the second convolutional neural network model is a fixed-point model.
In the method flow described in FIG. 2, the electronic device performs fixed-point processing on the weight parameters and feature maps of every data layer of the first convolutional neural network. After the fixed-point conversion of all data layers of the first convolutional neural network model is finished, a second convolutional neural network model can be generated; the second convolutional neural network model is compatible with existing traditional floating-point frameworks while ensuring a small precision loss at lower bit widths.
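The S21 to S27 loop over all data layers can be sketched end to end as below. The layer representation (a list of dicts holding NumPy arrays) is hypothetical, and for brevity a real-valued symmetric scaling factor q_hi / max|t| stands in for the integer scaling factor α described above:

```python
import numpy as np

def fixed_point_pass(layers, bits=8):
    """Quantize, then dequantize, the weights and feature maps of every
    data layer in place, yielding a model whose floats carry only
    `bits`-bit fixed-point precision (the second model of step S27)."""
    q_hi = 2 ** (bits - 1) - 1
    q_lo = -(2 ** (bits - 1))
    for layer in layers:                        # i = 1, 2, ... (steps S25/S26)
        for key in ("weights", "features"):
            t = layer[key]
            m = np.abs(t).max()
            alpha = q_hi / m if m > 0 else 1.0  # simplified symmetric scale
            q = np.clip(np.round(t * alpha), q_lo, q_hi)  # step S22
            layer[key] = q / alpha                        # steps S23/S24
    return layers
```

After the pass, every tensor is still floating point, so the model loads into the original framework unchanged, but each value is one of at most 2^bits representable levels per tensor.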
Referring to FIG. 3, FIG. 3 is a functional block diagram of a preferred embodiment of a data fixed-point processing apparatus disclosed in the present invention. The data fixed-point processing apparatus described in FIG. 3 is configured to perform some or all of the steps of the data fixed-point processing method described in FIG. 1 or FIG. 2. A unit referred to in the present invention is a series of computer program segments that can be executed by a processor, can perform a fixed function, and are stored in a memory. In this embodiment, the functions of the units are described in detail in the following description.
The data fixed-point processing apparatus 11 described in FIG. 3 may include:
an extraction unit 101, configured to extract, for a trained first convolutional neural network model, the original weight parameters and original feature maps of the i-th data layer of the first convolutional neural network model, where i is the initialized number of network layers and i is a positive integer;
Here, i is the initialized number of network layers, and i is a positive integer. The original weight parameters are the weight parameters of the i-th data layer of the trained first convolutional neural network model, and the original feature maps are the feature maps of that layer. Each data layer may have multiple weight parameters, and each data layer may output multiple feature maps; the specific numbers depend on the structure of the first convolutional neural network model.
The deep learning frameworks used in the present invention all adopt floating point as their basic data structure; such frameworks may include, but are not limited to, Caffe, MxNet, and TensorFlow. The trained first convolutional neural network model is a floating-point model, that is, the data structure of every data layer of the first convolutional neural network model is a floating-point number.
During training of the first convolutional neural network model, the feature maps of the i-th data layer are typically convolved with the weight parameters of the (i+1)-th data layer to determine the feature maps of the (i+1)-th data layer, and so on, until the first convolutional neural network model is generated.
a processing unit 102, configured to perform fixed-point processing on the original weight parameters and the original feature maps separately to obtain first weight parameters and first feature maps;
In the present invention, a fixed-point conversion process is required in order to run on an embedded platform. The first convolutional neural network model may be trained with any floating-point framework, for example using Caffe as the base training framework, ResNet18 as the training network, and MNIST as the data set.
After extracting the original weight parameters and the original feature maps of the i-th data layer of the first convolutional neural network model, the electronic device may perform fixed-point processing on them separately to obtain the first weight parameters and the first feature maps.
Here, the original weight parameters may be quantized with an 8-bit parameter quantization bit width, and the original feature maps with an 8-bit feature quantization bit width; after this fixed-point processing, the data structures of both the first weight parameters and the first feature maps are fixed-point numbers.
The converting unit 103 is configured to convert the first weight parameter and the first feature map into floating-point numbers, respectively, obtaining a second weight parameter and a second feature map.
In the present invention, the data structures of the first weight parameter and of the first feature map obtained by the electronic device are both fixed-point numbers, and fixed-point numbers cannot run on existing floating-point frameworks, so the data structure must be converted.
Specifically, the electronic device may divide the first weight parameter by a first scaling factor to obtain a second weight parameter whose data structure is a floating-point number; likewise, the electronic device may divide the first feature map by a second scaling factor to obtain a second feature map whose data structure is a floating-point number. Although the second weight parameter and the second feature map are of floating-point data type, both are confined to the precision range of the fixed-point numbers, so they are not only usable on existing floating-point frameworks but also incur only a small performance loss at a low bit width.
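A minimal sketch of this conversion back to floating point, assuming a hypothetical scaling factor of 64 and made-up fixed-point values (the patent specifies only the division itself):

```python
import numpy as np

alpha = 64.0                                   # hypothetical first scaling factor
q_weights = np.array([-128, -2, 0, 32, 127])   # first weight parameters (fixed-point values)

# Second weight parameters: floating-point numbers, but restricted to the
# precision grid of the 8-bit fixed-point representation (multiples of 1/alpha).
second_weights = q_weights / alpha
# values: -2.0, -0.03125, 0.0, 0.5, 1.984375
```

Because every resulting value is a multiple of 1/α, the floats carry no more precision than the underlying 8-bit fixed-point numbers, which is exactly the property the text relies on.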
The replacing unit 104 is configured to replace the original weight parameters with the second weight parameters, and to replace the original feature map with the second feature map.
In the present invention, there are multiple original weight parameters and, correspondingly, multiple second weight parameters; likewise, there are multiple original feature maps and, correspondingly, multiple second feature maps. For each original weight parameter, the electronic device replaces it with the corresponding second weight parameter; for each original feature map, the electronic device replaces it with the corresponding second feature map.
The above replacement concerns the original weight parameters and the original feature map on the i-th data layer of the first convolutional neural network. By analogy, the electronic device can apply the same fixed-point processing to the other data layers of the first convolutional neural network, until the fixed-point conversion of the entire first convolutional neural network is complete.
Optionally, the processing unit 102 performing fixed-point processing on the original weight parameters and the original feature map separately to obtain the first weight parameter and the first feature map includes:
determining a maximum weight parameter and a minimum weight parameter from the original weight parameters; determining a first scaling factor according to the maximum weight parameter, the minimum weight parameter, and a preset parameter quantization bit width; and, according to the first scaling factor, quantizing and rounding each original weight parameter in turn and truncating the rounded weight parameters to obtain the first weight parameter;
determining a maximum feature map and a minimum feature map from the original feature map; determining a second scaling factor according to the maximum feature map, the minimum feature map, and a preset feature quantization bit width; and, according to the second scaling factor, quantizing and rounding each original feature map in turn and truncating the rounded feature maps to obtain the first feature map.
The truncating of the rounded weight parameters to obtain the first weight parameter includes:
if a rounded weight parameter is greater than the maximum weight parameter, determining the maximum weight parameter as the first weight parameter; if a rounded weight parameter is less than the minimum weight parameter, determining the minimum weight parameter as the first weight parameter.
The truncating of the rounded feature maps to obtain the first feature map includes:
if a rounded feature map is greater than the maximum feature map, determining the maximum feature map as the first feature map; if a rounded feature map is less than the minimum feature map, determining the minimum feature map as the first feature map.
The parameter quantization bit width is 8 bits, and the feature quantization bit width is 8 bits.
In this embodiment, each data layer has multiple weight parameters. The electronic device can determine the maximum weight parameter and the minimum weight parameter from the original weight parameters M extracted from the i-th data layer, for example p_max = max(M) and p_min = min(M). Further, according to the maximum weight parameter, the minimum weight parameter, and the preset parameter quantization bit width, a suitable scaling factor α is found, ensuring that the original weight parameters M can be expressed with the preset parameter quantization bit width (e.g. 8 bits). The specific method is as follows:
The range of values representable with 8 bits is [-128, 127]; the scaling factor α should therefore satisfy α·p_max ≤ 127 and α·p_min > -128, and the scaling factor α is usually an integer. Under these constraints, a suitable scaling factor α can be uniquely determined.
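One way to realize this search in code — a sketch assuming p_max > 0, and taking the largest admissible integer (the choice of "largest" is an assumption; the patent states only the two constraints):

```python
def first_scaling_factor(p_max, p_min, qmax=127, qmin=-128):
    """Largest positive integer alpha with alpha * p_max <= qmax and
    alpha * p_min > qmin, so that every weight in [p_min, p_max]
    maps into the 8-bit range after scaling. Assumes p_max > 0."""
    alpha = 1
    while (alpha + 1) * p_max <= qmax and (alpha + 1) * p_min > qmin:
        alpha += 1
    return alpha
```

For example, with p_max = 0.9 and p_min = -0.5 this yields α = 141, since 141 × 0.9 = 126.9 ≤ 127 while 142 × 0.9 = 127.8 would overflow the positive end of the 8-bit range.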
Further, according to the determined scaling factor, each original weight parameter can be quantized and rounded in turn. Specifically, each original weight parameter is multiplied by the scaling factor, and the product is rounded to the nearest integer.
Further still, the rounded weight parameters can be truncated to obtain the first weight parameter. Specifically, truncating the rounded weight parameters to obtain the first weight parameter includes: if a rounded weight parameter is greater than the maximum weight parameter, determining the maximum weight parameter as the first weight parameter; if a rounded weight parameter is less than the minimum weight parameter, determining the minimum weight parameter as the first weight parameter. That is, when a parameter in the original weight parameters M is greater than p_max, it is reassigned to p_max; when a parameter in M is less than p_min, it is reassigned to p_min.
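Putting the three steps together — multiply by the scaling factor, round, truncate — a sketch for the weight path; here truncation is read as clipping the scaled, rounded values to the 8-bit range [-128, 127], i.e. the scaled images of p_min and p_max (one possible reading of the truncation described above; the values and α = 64 are hypothetical):

```python
import numpy as np

def quantize_weights(weights, alpha):
    """Multiply each original weight by the scaling factor, round to
    the nearest integer, then truncate (clip) to the admissible
    8-bit range before storing as a fixed-point number."""
    q = np.rint(np.asarray(weights, dtype=np.float64) * alpha)
    return np.clip(q, -128, 127).astype(np.int8)

w = np.array([-2.1, -0.03, 0.0, 0.5, 1.1])
q = quantize_weights(w, 64)
# values: -128, -2, 0, 32, 70  (the -2.1 entry is truncated to -128)
```

Note that only the out-of-range entry is affected by truncation; all in-range weights are represented exactly on the 1/α grid after the later division back to floating point.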
Similarly, the output of each data layer has multiple feature maps, and the electronic device can perform fixed-point processing on the original feature maps in the same manner as for the original weight parameters to obtain the first feature map, which is not described again here.
Optionally, the data fixed-point processing apparatus 11 described in FIG. 3 may further include:
a determining unit, configured to determine whether the first convolutional neural network model has completed the fixed-point conversion process;
a determining-and-executing unit, configured to set i = i + 1 when the determining unit determines that the first convolutional neural network model has not completed the fixed-point conversion process, and to execute the fixed-point conversion process of the first convolutional neural network model.
Optionally, the data fixed-point processing apparatus 11 described in FIG. 3 may further include:
a generating unit, configured to generate a second convolutional neural network model after the fixed-point conversion process of all data layers of the first convolutional neural network model has ended.
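The interplay of the determining, executing, and generating units can be sketched as a simple loop (the per-layer routine `quantize_layer` and the list-of-layers model representation are hypothetical stand-ins for the per-layer procedure described earlier):

```python
def fixpoint_model(layers, quantize_layer):
    """Run the fixed-point process layer by layer: while the model is
    not yet fully converted, process data layer i and set i = i + 1;
    once every data layer is done, the result is the second model."""
    i = 1
    while i <= len(layers):                    # conversion not yet complete
        layers[i - 1] = quantize_layer(layers[i - 1])
        i = i + 1                              # determining-and-executing unit
    return layers                              # second convolutional neural network model

# Toy model: each "layer" is one weight; quantize with alpha = 64, then
# divide back, so each value snaps to the nearest multiple of 1/64.
second_model = fixpoint_model([0.5, -1.2, 3.0], lambda w: round(w * 64) / 64)
```

Here 0.5 and 3.0 are already on the 1/64 grid and survive unchanged, while -1.2 snaps to -77/64.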
In the data fixed-point processing apparatus described in FIG. 3, for a trained first convolutional neural network model, the original weight parameters and the original feature map of the i-th data layer of the first convolutional neural network model can first be extracted, where i is the initialized network-layer number and i is a positive integer; fixed-point processing is performed on the original weight parameters and the original feature map separately to obtain the first weight parameter and the first feature map; further, the first weight parameter and the first feature map can be converted into floating-point numbers, respectively, to obtain the second weight parameter and the second feature map; further still, the original weight parameters can be replaced with the second weight parameters, and the original feature map replaced with the second feature map. It can be seen that, through the embodiments of the present invention, for a trained first convolutional neural network model, the electronic device can perform fixed-point processing on the weight parameters and feature maps of the model's data layers, obtaining weight parameters and feature maps whose data structures are fixed-point numbers, and then convert these fixed-point weight parameters and feature maps back into floating-point numbers to replace the originals. Thus, on the basis of an existing floating-point framework, the fixed-point processing of the original weight parameters and feature maps confines the processed weight parameters and feature maps to the precision range of fixed-point numbers while remaining compatible with the floating-point framework, ensuring only a small loss of precision at a low quantization bit width.
The integrated unit implemented in the form of a software functional module described above may be stored in a computer-readable storage medium. The computer-readable storage medium may store a computer program which, when executed by a processor, implements the steps of the method embodiments described above. The computer program includes computer program code, which may be in source-code form, object-code form, an executable file, some intermediate form, or the like. The computer-readable storage medium may include: any entity or apparatus capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable storage medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
As shown in FIG. 4, FIG. 4 is a schematic structural diagram of an electronic device according to a preferred embodiment of the data fixed-point processing method of the present invention. The electronic device 1 includes a memory 12 and a processor 13. Those skilled in the art will understand that the schematic diagram shown in FIG. 4 is merely an example of the electronic device 1 and does not constitute a limitation on the electronic device 1, which may include more or fewer components than illustrated, combine certain components, or have different components; for example, the electronic device 1 may further include input/output devices, network access devices, buses, and the like.
The electronic device 1 includes, but is not limited to, any electronic product capable of human-computer interaction with a user by means of a keyboard, mouse, remote control, touch panel, voice-control device, or the like, for example a personal computer, a tablet computer, a smartphone, a personal digital assistant (PDA), a game console, an Internet Protocol television (IPTV), or a smart wearable device. The network in which the electronic device 1 is located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a virtual private network (VPN), and the like.
The memory 12 optionally includes one or more computer-readable storage media for storing a program of the data fixed-point processing method and various data, and enables high-speed, automatic access to the program or data during operation. The memory 12 optionally includes high-speed random access memory, and optionally also non-volatile memory, such as one or more magnetic-disk storage devices, flash memory devices, or other non-volatile solid-state memory devices.
The processor 13, also called a central processing unit (CPU), is a very-large-scale integrated circuit and is the computing core (Core) and control unit (Control Unit) of the electronic device 1. The processor 13 can execute the operating system of the electronic device 1 and the various installed applications, program code, and the like, such as the data fixed-point processing apparatus 11.
With reference to FIG. 1 or FIG. 2, the memory 12 in the electronic device 1 stores multiple instructions to implement a data fixed-point processing method, and the processor 13 can execute the multiple instructions to implement:
for a trained first convolutional neural network model, extracting the original weight parameters and the original feature map of the i-th data layer of the first convolutional neural network model, where i is the initialized network-layer number and i is a positive integer;
performing fixed-point processing on the original weight parameters and the original feature map separately to obtain a first weight parameter and a first feature map;
converting the first weight parameter and the first feature map into floating-point numbers, respectively, to obtain a second weight parameter and a second feature map;
replacing the original weight parameters with the second weight parameters, and replacing the original feature map with the second feature map.
In an optional implementation, performing fixed-point processing on the original weight parameters and the original feature map separately to obtain the first weight parameter and the first feature map includes:
determining a maximum weight parameter and a minimum weight parameter from the original weight parameters; determining a first scaling factor according to the maximum weight parameter, the minimum weight parameter, and a preset parameter quantization bit width; and, according to the first scaling factor, quantizing and rounding each original weight parameter in turn and truncating the rounded weight parameters to obtain the first weight parameter;
determining a maximum feature map and a minimum feature map from the original feature map; determining a second scaling factor according to the maximum feature map, the minimum feature map, and a preset feature quantization bit width; and, according to the second scaling factor, quantizing and rounding each original feature map in turn and truncating the rounded feature maps to obtain the first feature map.
In an optional implementation, truncating the rounded weight parameters to obtain the first weight parameter includes:
if a rounded weight parameter is greater than the maximum weight parameter, determining the maximum weight parameter as the first weight parameter; if a rounded weight parameter is less than the minimum weight parameter, determining the minimum weight parameter as the first weight parameter;
and truncating the rounded feature maps to obtain the first feature map includes:
if a rounded feature map is greater than the maximum feature map, determining the maximum feature map as the first feature map; if a rounded feature map is less than the minimum feature map, determining the minimum feature map as the first feature map.
In an optional implementation, the parameter quantization bit width is 8 bits and the feature quantization bit width is 8 bits.
In an optional implementation, the processor 13 can execute the multiple instructions to implement:
determining whether the first convolutional neural network model has completed the fixed-point conversion process;
if not, setting i = i + 1 and executing the fixed-point conversion process of the first convolutional neural network model.
In an optional implementation, the processor 13 can execute the multiple instructions to implement:
generating a second convolutional neural network model after the fixed-point conversion process of all data layers of the first convolutional neural network model has ended.
Specifically, for the processor 13's implementation of the above instructions, reference may be made to the description of the relevant steps in the embodiments corresponding to FIG. 1 or FIG. 2, which is not repeated here.
In the electronic device 1 described in FIG. 4, for a trained first convolutional neural network model, the electronic device can first extract the original weight parameters and the original feature map of the i-th data layer of the first convolutional neural network model, where i is the initialized network-layer number and i is a positive integer; fixed-point processing is performed on the original weight parameters and the original feature map separately to obtain the first weight parameter and the first feature map; further, the electronic device can convert the first weight parameter and the first feature map into floating-point numbers, respectively, to obtain the second weight parameter and the second feature map; further still, the electronic device can replace the original weight parameters with the second weight parameters, and replace the original feature map with the second feature map. It can be seen that, through the embodiments of the present invention, for a trained first convolutional neural network model, the electronic device can perform fixed-point processing on the weight parameters and feature maps of the model's data layers, obtaining weight parameters and feature maps whose data structures are fixed-point numbers, and then convert these fixed-point weight parameters and feature maps back into floating-point numbers to replace the originals. Thus, on the basis of an existing floating-point framework, the fixed-point processing of the original weight parameters and feature maps confines the processed weight parameters and feature maps to the precision range of fixed-point numbers while remaining compatible with the floating-point framework, ensuring only a small loss of precision at a low quantization bit width.
In the several embodiments provided by the present invention, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division into modules is only a division by logical function, and other divisions are possible in actual implementation.
The modules described as separate components may or may not be physically separate, and components displayed as modules may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiments.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It is evident to those skilled in the art that the present invention is not limited to the details of the above exemplary embodiments and that the present invention can be realized in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in all respects as exemplary and non-limiting, and the scope of the invention is defined by the appended claims rather than by the above description; it is therefore intended that all changes falling within the meaning and range of equivalents of the claims be embraced by the invention. No reference sign in the claims shall be construed as limiting the claim concerned. Furthermore, it is evident that the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or apparatuses recited in the system claims may also be implemented by one unit or apparatus through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solutions of the present invention. Although the present invention has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from the spirit and scope of those technical solutions.

Claims (10)

  1. A data fixed-point processing method, characterized in that the method comprises:
    for a trained first convolutional neural network model, extracting original weight parameters and an original feature map of an i-th data layer of the first convolutional neural network model, wherein i is an initialized network-layer number and i is a positive integer;
    performing fixed-point processing on the original weight parameters and the original feature map separately to obtain a first weight parameter and a first feature map;
    converting the first weight parameter and the first feature map into floating-point numbers, respectively, to obtain a second weight parameter and a second feature map; and
    replacing the original weight parameters with the second weight parameters, and replacing the original feature map with the second feature map.
  2. The method according to claim 1, characterized in that performing fixed-point processing on the original weight parameters and the original feature map separately to obtain the first weight parameter and the first feature map comprises:
    determining a maximum weight parameter and a minimum weight parameter from the original weight parameters; determining a first scaling factor according to the maximum weight parameter, the minimum weight parameter, and a preset parameter quantization bit width; and, according to the first scaling factor, quantizing and rounding each original weight parameter in turn and truncating the rounded weight parameters to obtain the first weight parameter; and
    determining a maximum feature map and a minimum feature map from the original feature map; determining a second scaling factor according to the maximum feature map, the minimum feature map, and a preset feature quantization bit width; and, according to the second scaling factor, quantizing and rounding each original feature map in turn and truncating the rounded feature maps to obtain the first feature map.
  3. The method according to claim 2, characterized in that truncating the rounded weight parameters to obtain the first weight parameter comprises:
    if a rounded weight parameter is greater than the maximum weight parameter, determining the maximum weight parameter as the first weight parameter; if a rounded weight parameter is less than the minimum weight parameter, determining the minimum weight parameter as the first weight parameter;
    and truncating the rounded feature maps to obtain the first feature map comprises:
    if a rounded feature map is greater than the maximum feature map, determining the maximum feature map as the first feature map; if a rounded feature map is less than the minimum feature map, determining the minimum feature map as the first feature map.
  4. The method according to claim 2, characterized in that the parameter quantization bit width is 8 bits and the feature quantization bit width is 8 bits.
  5. The method according to any one of claims 1 to 4, characterized in that the method further comprises:
    determining whether the first convolutional neural network model has completed the fixed-point conversion process; and
    if not, setting i = i + 1 and executing the fixed-point conversion process of the first convolutional neural network model.
  6. The method according to claim 5, characterized in that the method further comprises:
    generating a second convolutional neural network model after the fixed-point conversion process of all data layers of the first convolutional neural network model has ended.
  7. The method according to claim 6, characterized in that the first convolutional neural network model is a floating-point model and the second convolutional neural network model is a fixed-point model.
  8. A data fixed-point processing device, wherein the data fixed-point processing device comprises:
    an extracting unit, configured to extract, for a trained first convolutional neural network model, an original weight parameter and an original feature map of an i-th data layer of the first convolutional neural network model, wherein i is an initialized number of network layers, and i is a positive integer;
    a processing unit, configured to perform fixed-point processing on the original weight parameter and the original feature map respectively, to obtain a first weight parameter and a first feature map;
    a converting unit, configured to convert the first weight parameter and the first feature map into floating-point numbers respectively, to obtain a second weight parameter and a second feature map;
    a replacement unit, configured to replace the original weight parameter with the second weight parameter, and to replace the original feature map with the second feature map.
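The extracting, processing, converting, and replacement units of claim 8 together amount to a per-layer quantize-then-dequantize round trip. A minimal sketch, assuming NumPy, a max-absolute-value scaling rule, and a hypothetical dict-based layer representation — none of which are fixed by the claims:

```python
import numpy as np

def fake_quantize(tensor, bit_width=8):
    """Quantize a float tensor to fixed point and convert it back to float.

    The scale is derived from the tensor's largest magnitude so the
    fixed-point range is fully used (an assumed scaling rule)."""
    q_max = 2 ** (bit_width - 1) - 1              # 127 for 8 bits
    scale = float(np.max(np.abs(tensor))) / q_max
    if scale == 0.0:                              # all-zero tensor: nothing to scale
        scale = 1.0
    # fixed-point ("first") values: round, then truncate to the representable range
    fixed = np.clip(np.round(tensor / scale), -q_max - 1, q_max)
    # floating-point ("second") values that replace the originals
    return fixed * scale

def fixed_point_layer(layer, bit_width=8):
    """One per-layer iteration: the original weight parameter and feature
    map are replaced by their de-quantized counterparts."""
    layer["weights"] = fake_quantize(layer["weights"], bit_width)
    layer["feature_map"] = fake_quantize(layer["feature_map"], bit_width)
    return layer
```

Running the whole model once per data layer, as claim 5 describes with i = i + 1, would apply `fixed_point_layer` to every layer in turn before the fixed-point model of claim 7 is generated.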
  9. An electronic device, wherein the electronic device comprises a processor and a memory, the processor being configured to execute a computer program stored in the memory to implement the data fixed-point processing method according to any one of claims 1 to 7.
  10. A computer-readable storage medium, wherein the computer-readable storage medium stores at least one instruction which, when executed by a processor, implements the data fixed-point processing method according to any one of claims 1 to 7.
PCT/CN2018/120512 2017-12-21 2018-12-12 Data fixed point processing method, device, electronic apparatus and computer storage medium WO2019120114A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711392617.2 2017-12-21
CN201711392617.2A CN108053028B (en) 2017-12-21 2017-12-21 Data fixed-point processing method and device, electronic equipment and computer storage medium

Publications (1)

Publication Number Publication Date
WO2019120114A1 true WO2019120114A1 (en) 2019-06-27

Family

ID=62131187

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/120512 WO2019120114A1 (en) 2017-12-21 2018-12-12 Data fixed point processing method, device, electronic apparatus and computer storage medium

Country Status (2)

Country Link
CN (1) CN108053028B (en)
WO (1) WO2019120114A1 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110363297A (en) * 2019-07-05 2019-10-22 上海商汤临港智能科技有限公司 Neural metwork training and image processing method, device, equipment and medium
CN111310890A (en) * 2020-01-19 2020-06-19 深圳云天励飞技术有限公司 Deep learning model optimization method and device and terminal equipment
CN111368978A (en) * 2020-03-02 2020-07-03 开放智能机器(上海)有限公司 Precision improving method for offline quantization tool
CN111783957A (en) * 2020-07-02 2020-10-16 厦门美图之家科技有限公司 Model quantitative training method and device, machine-readable storage medium and electronic equipment
CN112308105A (en) * 2019-08-02 2021-02-02 北京图森智途科技有限公司 Target detection method, target detector and related equipment
CN112561779A (en) * 2019-09-26 2021-03-26 北京字节跳动网络技术有限公司 Image stylization processing method, device, equipment and storage medium
CN113177471A (en) * 2021-04-28 2021-07-27 Oppo广东移动通信有限公司 Motion detection method, motion detection device, electronic device, and storage medium

Families Citing this family (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11437032B2 (en) 2017-09-29 2022-09-06 Shanghai Cambricon Information Technology Co., Ltd Image processing apparatus and method
CN108053028B (en) * 2017-12-21 2021-09-14 深圳励飞科技有限公司 Data fixed-point processing method and device, electronic equipment and computer storage medium
US11709672B2 (en) 2018-02-13 2023-07-25 Shanghai Cambricon Information Technology Co., Ltd Computing device and method
US11630666B2 (en) 2018-02-13 2023-04-18 Shanghai Cambricon Information Technology Co., Ltd Computing device and method
EP3617959B1 (en) 2018-02-13 2021-08-04 Shanghai Cambricon Information Technology Co., Ltd Computing device and method
CN116991225A (en) 2018-02-14 2023-11-03 上海寒武纪信息科技有限公司 Control device, method and equipment of processor
WO2019218896A1 (en) 2018-05-18 2019-11-21 上海寒武纪信息科技有限公司 Computing method and related product
US20210350210A1 (en) * 2018-07-30 2021-11-11 Intel Corporation Method and apparatus for keeping statistical inference accuracy with 8-bit winograd convolution
CN109344840B (en) * 2018-08-07 2022-04-01 深圳市商汤科技有限公司 Image processing method and apparatus, electronic device, storage medium, and program product
KR102519467B1 (en) 2018-08-28 2023-04-06 캠브리콘 테크놀로지스 코퍼레이션 리미티드 Data pre-processing method, device, computer equipment and storage medium
CN110929837B (en) * 2018-09-19 2024-05-10 北京搜狗科技发展有限公司 Associated word prediction method and device
US11703939B2 (en) 2018-09-28 2023-07-18 Shanghai Cambricon Information Technology Co., Ltd Signal processing device and related products
CN111126558B (en) * 2018-10-31 2024-04-02 嘉楠明芯(北京)科技有限公司 Convolutional neural network calculation acceleration method and device, equipment and medium
CN111191783B (en) * 2018-11-15 2024-04-05 嘉楠明芯(北京)科技有限公司 Self-adaptive quantization method and device, equipment and medium
CN111353517B (en) * 2018-12-24 2023-09-26 杭州海康威视数字技术股份有限公司 License plate recognition method and device and electronic equipment
CN109697083B (en) * 2018-12-27 2021-07-06 深圳云天励飞技术有限公司 Fixed-point acceleration method and device for data, electronic equipment and storage medium
CN111385462A (en) 2018-12-28 2020-07-07 上海寒武纪信息科技有限公司 Signal processing device, signal processing method and related product
CN111383157B (en) * 2018-12-29 2023-04-14 北京市商汤科技开发有限公司 Image processing method and device, vehicle-mounted operation platform, electronic equipment and system
CN111695671B (en) * 2019-03-12 2023-08-08 北京地平线机器人技术研发有限公司 Method and device for training neural network and electronic equipment
CN111832739B (en) 2019-04-18 2024-01-09 中科寒武纪科技股份有限公司 Data processing method and related product
US11934940B2 (en) 2019-04-18 2024-03-19 Cambricon Technologies Corporation Limited AI processor simulation
CN110135568B (en) * 2019-05-28 2022-03-04 赵恒锐 Full-integer neural network method applying bounded linear rectification unit
US11675676B2 (en) 2019-06-12 2023-06-13 Shanghai Cambricon Information Technology Co., Ltd Neural network quantization parameter determination method and related products
US11676028B2 (en) 2019-06-12 2023-06-13 Shanghai Cambricon Information Technology Co., Ltd Neural network quantization parameter determination method and related products
CN110309877B (en) * 2019-06-28 2021-12-07 北京百度网讯科技有限公司 Feature map data quantization method and device, electronic equipment and storage medium
CN110298438B (en) * 2019-07-05 2024-04-26 北京中星微电子有限公司 Neural network model adjusting method and device
WO2021036904A1 (en) 2019-08-23 2021-03-04 安徽寒武纪信息科技有限公司 Data processing method, apparatus, computer device, and storage medium
US20210064985A1 (en) * 2019-09-03 2021-03-04 International Business Machines Corporation Machine learning hardware having reduced precision parameter components for efficient parameter update
CN111461302B (en) * 2020-03-30 2024-04-19 嘉楠明芯(北京)科技有限公司 Data processing method, device and storage medium based on convolutional neural network
CN111985495B (en) * 2020-07-09 2024-02-02 珠海亿智电子科技有限公司 Model deployment method, device, system and storage medium
CN113255901B (en) * 2021-07-06 2021-10-08 上海齐感电子信息科技有限公司 Real-time quantization method and real-time quantization system
CN113593538B (en) * 2021-09-02 2024-05-03 北京声智科技有限公司 Voice characteristic classification method, related equipment and readable storage medium
CN114065913A (en) * 2021-10-28 2022-02-18 深圳云天励飞技术股份有限公司 Model quantization method and device and terminal equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105760933A (en) * 2016-02-18 2016-07-13 清华大学 Method and apparatus for fixed-pointing layer-wise variable precision in convolutional neural network
CN107292334A (en) * 2017-06-08 2017-10-24 北京深瞐科技有限公司 Image-recognizing method and device
CN107330515A (en) * 2016-04-29 2017-11-07 北京中科寒武纪科技有限公司 A kind of apparatus and method for performing artificial neural network forward operation
CN108053028A (en) * 2017-12-21 2018-05-18 深圳云天励飞技术有限公司 Data fixed point processing method, device, electronic equipment and computer storage media

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102981854A (en) * 2012-11-16 2013-03-20 天津市天祥世联网络科技有限公司 Neural network optimization method based on floating number operation inline function library
WO2016039651A1 (en) * 2014-09-09 2016-03-17 Intel Corporation Improved fixed point integer implementations for neural networks
US9563825B2 (en) * 2014-11-20 2017-02-07 Adobe Systems Incorporated Convolutional neural network using a binarized convolution layer
CN106570559A (en) * 2015-10-09 2017-04-19 阿里巴巴集团控股有限公司 Data processing method and device based on neural network
US10831444B2 (en) * 2016-04-04 2020-11-10 Technion Research & Development Foundation Limited Quantized neural network training and inference
US10621486B2 (en) * 2016-08-12 2020-04-14 Beijing Deephi Intelligent Technology Co., Ltd. Method for optimizing an artificial neural network (ANN)
CN106326939A (en) * 2016-08-31 2017-01-11 深圳市诺比邻科技有限公司 Parameter optimization method and system of convolutional neural network
CN106502626A (en) * 2016-11-03 2017-03-15 北京百度网讯科技有限公司 Data processing method and device
CN106855952B (en) * 2016-12-29 2020-08-18 北京旷视科技有限公司 Neural network-based computing method and device
CN107451658B (en) * 2017-07-24 2020-12-15 杭州菲数科技有限公司 Fixed-point method and system for floating-point operation
CN107480770B (en) * 2017-07-27 2020-07-28 中国科学院自动化研究所 Neural network quantization and compression method and device capable of adjusting quantization bit width


Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110363297A (en) * 2019-07-05 2019-10-22 上海商汤临港智能科技有限公司 Neural metwork training and image processing method, device, equipment and medium
CN112308105A (en) * 2019-08-02 2021-02-02 北京图森智途科技有限公司 Target detection method, target detector and related equipment
CN112308105B (en) * 2019-08-02 2024-04-12 北京图森智途科技有限公司 Target detection method, target detector and related equipment
CN112561779A (en) * 2019-09-26 2021-03-26 北京字节跳动网络技术有限公司 Image stylization processing method, device, equipment and storage medium
CN112561779B (en) * 2019-09-26 2023-09-29 北京字节跳动网络技术有限公司 Image stylization processing method, device, equipment and storage medium
CN111310890A (en) * 2020-01-19 2020-06-19 深圳云天励飞技术有限公司 Deep learning model optimization method and device and terminal equipment
CN111310890B (en) * 2020-01-19 2023-10-17 深圳云天励飞技术有限公司 Optimization method and device of deep learning model and terminal equipment
CN111368978A (en) * 2020-03-02 2020-07-03 开放智能机器(上海)有限公司 Precision improving method for offline quantization tool
CN111368978B (en) * 2020-03-02 2023-03-24 开放智能机器(上海)有限公司 Precision improving method for offline quantization tool
CN111783957A (en) * 2020-07-02 2020-10-16 厦门美图之家科技有限公司 Model quantitative training method and device, machine-readable storage medium and electronic equipment
CN111783957B (en) * 2020-07-02 2024-05-03 厦门美图之家科技有限公司 Model quantization training method and device, machine-readable storage medium and electronic equipment
CN113177471A (en) * 2021-04-28 2021-07-27 Oppo广东移动通信有限公司 Motion detection method, motion detection device, electronic device, and storage medium

Also Published As

Publication number Publication date
CN108053028B (en) 2021-09-14
CN108053028A (en) 2018-05-18

Similar Documents

Publication Publication Date Title
WO2019120114A1 (en) Data fixed point processing method, device, electronic apparatus and computer storage medium
WO2020233130A1 (en) Deep neural network compression method and related device
CN107239826A (en) Computational methods and device in convolutional neural networks
WO2019007406A1 (en) Data processing apparatus and method
CN114022882B (en) Text recognition model training method, text recognition device, text recognition equipment and medium
CN113657399A (en) Training method of character recognition model, character recognition method and device
WO2021053381A1 (en) Compression and acceleration method for neural network model, and data processing method and apparatus
US20220148239A1 (en) Model training method and apparatus, font library establishment method and apparatus, device and storage medium
CN113392962A (en) Method, device and circuit for decoding weights of neural network
CN111341299B (en) Voice processing method and device
EP3425630A1 (en) Electronic device-awakening method and apparatus, device and computer-readable storage medium
CN112989935A (en) Video generation method, device, equipment and storage medium
CN110942763A (en) Voice recognition method and device
US11568254B2 (en) Electronic apparatus and control method thereof
EP4390725A1 (en) Video retrieval method and apparatus, device, and storage medium
JP7408723B2 (en) Neural network processing unit, neural network processing method and device
CN112002311A (en) Text error correction method and device, computer readable storage medium and terminal equipment
KR20210129605A (en) Text key information extracting method, apparatus, electronic device and storage medium
KR20150105847A (en) Method and Apparatus for detecting speech segment
CN116543076A (en) Image processing method, device, electronic equipment and storage medium
JP2019525233A (en) Speech recognition method and apparatus
CN113360683A (en) Method for training cross-modal retrieval model and cross-modal retrieval method and device
CN117151178A (en) FPGA-oriented CNN customized network quantification acceleration method
US20220113943A1 (en) Method for multiply-add operations for neural network
CN115588227A (en) Emotion recognition method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18890109

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 23.10.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18890109

Country of ref document: EP

Kind code of ref document: A1