CN116720563B - Method and device for improving fixed-point neural network model precision and electronic equipment - Google Patents

Method and device for improving fixed-point neural network model precision and electronic equipment

Info

Publication number
CN116720563B
CN116720563B (application CN202211138548.3A)
Authority
CN
China
Prior art keywords
point
data
fixed
convolution
channel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211138548.3A
Other languages
Chinese (zh)
Other versions
CN116720563A (en)
Inventor
杨逸帆
董云鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Honor Device Co Ltd
Original Assignee
Honor Device Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Honor Device Co Ltd filed Critical Honor Device Co Ltd
Priority to CN202211138548.3A priority Critical patent/CN116720563B/en
Publication of CN116720563A publication Critical patent/CN116720563A/en
Application granted granted Critical
Publication of CN116720563B publication Critical patent/CN116720563B/en

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/15Correlation function computation including computation of convolution operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/483Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides a method, a device, and electronic equipment for improving the precision of a fixed-point neural network model. The method comprises the following steps: inputting first fixed-point data into a channel-by-channel convolution module, and performing a channel-by-channel convolution operation on the first fixed-point data using a channel-by-channel convolution layer to obtain second fixed-point data; inputting the second fixed-point data into a first point-by-point convolution module, performing a point-by-point convolution operation on the second fixed-point data using a first point-by-point convolution layer, normalizing the result using a first batch normalization layer, and mapping it with a first nonlinear activation function to obtain first floating-point data; and quantizing the first floating-point data into fixed-point numbers using a first pseudo-quantization node to obtain third fixed-point data. The method can control the range of the parameters to be quantized in the model and reduce the precision loss of the quantized model. Without increasing the model parameter count, a better fixed-point effect can be obtained and the expression capability of the model improved.

Description

Method and device for improving fixed-point neural network model precision and electronic equipment
Technical Field
The embodiment of the application relates to the field of convolutional neural networks, in particular to a method and device for improving the accuracy of a fixed-point neural network model and electronic equipment.
Background
With the development of deep learning, a large number of artificial intelligence (artificial intelligence, AI) models can be deployed to run on end-side devices (e.g., smartphones, tablets, personal computers, smart screens, smart televisions, and smart wearable devices). For example, a convolutional neural network (convolutional neural network, CNN) may be deployed on an end-side device and used to extract visual features of images or videos for purposes such as face recognition, face tracking, and key feature point detection.
An artificial intelligence model is generally a floating-point model, and the calculations involved in running a floating-point model are floating-point calculations. Because floating-point operations are slow and memory-intensive, when a model is deployed on an end-side device, the floating-point model is first quantized into a fixed-point model and then deployed. Replacing floating-point calculation with fixed-point calculation reduces both the latency and the power consumption of the end-side device.
However, since quantization is a mapping from one range to another, the size of the floating-point range affects the precision of the quantized fixed-point numbers and, in turn, the representation capability of the model. In existing models, the quantization range of some layers is too large, which distorts the data distribution of those layers and degrades the overall quantization precision. Existing model quantization therefore suffers from large precision loss and poor model representation capability.
Disclosure of Invention
The embodiment of the application provides a method and device for improving the precision of a fixed-point neural network model and electronic equipment, so as to solve the problem of poor precision of the existing neural network model after quantization.
In a first aspect, an embodiment of the present application provides a method for improving the accuracy of a fixed-point neural network model, where the model includes a channel-by-channel convolution module and a first point-by-point convolution module, the channel-by-channel convolution module includes a channel-by-channel convolution layer, and the first point-by-point convolution module includes a first point-by-point convolution layer, a first batch normalization layer, and a first nonlinear activation function. The method comprises the following steps: inputting first fixed-point data into the channel-by-channel convolution module, and performing a channel-by-channel convolution operation on the first fixed-point data using the channel-by-channel convolution layer to obtain second fixed-point data; wherein: the first fixed-point data and the second fixed-point data both comprise data of n channels, and n is greater than or equal to 1; the channel-by-channel convolution layer comprises n convolution kernels in one-to-one correspondence with the n channels; the n convolution kernels each perform a convolution operation with the data of the corresponding channel in the first fixed-point data to obtain the data of the n channels in the second fixed-point data; inputting the second fixed-point data into the first point-by-point convolution module, performing a point-by-point convolution operation on the second fixed-point data using the first point-by-point convolution layer, normalizing the result using the first batch normalization layer, and mapping it with the first nonlinear activation function to obtain first floating-point data; wherein: the first floating-point data comprises data of m channels, where m is greater than or equal to 1; the first point-by-point convolution layer comprises m convolution kernels, each of length 1, width 1, and height n, in one-to-one correspondence with the m channels; the m convolution kernels each perform a convolution operation on the second fixed-point data; and quantizing the first floating-point data into fixed-point numbers using a first pseudo-quantization node to obtain third fixed-point data.
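The pipeline of the first aspect (channel-by-channel convolution, then point-by-point convolution, then a pseudo-quantization node) can be illustrated with a minimal NumPy sketch. This is not the patented implementation; the function names and the simple min/max quantization scheme are illustrative assumptions.

```python
import numpy as np

def depthwise_conv(x, kernels):
    # Channel-by-channel convolution: n kernels, one per channel; each kernel
    # convolves only the data of its corresponding channel.
    n, h, w = x.shape
    kh, kw = kernels.shape[1:]
    out = np.zeros((n, h - kh + 1, w - kw + 1))
    for c in range(n):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[c, i, j] = np.sum(x[c, i:i + kh, j:j + kw] * kernels[c])
    return out

def pointwise_conv(x, kernels):
    # Point-by-point convolution: m kernels of length 1, width 1, height n;
    # each output channel mixes all n input channels at a single position.
    return np.einsum('mc,chw->mhw', kernels, x)

def fake_quantize(x, num_bits=8):
    # Pseudo-quantization node (sketch): map floats onto [0, 2^b - 1].
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = max((x.max() - x.min()) / (qmax - qmin), 1e-8)
    zero_point = int(round(qmin - x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
    return q.astype(np.uint8), scale, zero_point
```

For example, data of n = 2 channels passed through 2 depthwise kernels and then m = 3 pointwise kernels yields 3 output channels, which the pseudo-quantization node maps to uint8.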
The embodiment of the application provides a method for improving the precision of a fixed-point neural network model, in which first floating-point data is obtained after first fixed-point data is processed by the channel-by-channel convolution layer, the first point-by-point convolution layer, the first batch normalization layer, and the first nonlinear activation function; the method can control the degree of dispersion of the first floating-point data. The method can then quantize the first floating-point data, whose degree of dispersion is small, to obtain the third fixed-point data. Therefore, the precision loss of the quantized model can be reduced and the expression capability of the model improved.
In some implementations, the first fixed-point data is requantized from low-bit data to high-bit data using a second pseudo-quantization node to obtain first sub-data, where the data type of the first sub-data is a fixed-point number, the data type of the low-bit data is uint8 or uint16, and the data type of the high-bit data is uint32; the convolution kernels of the channel-by-channel convolution layer, carrying the first fixed-point weights, are multiplied with the target positions of the first fixed-point data to obtain a plurality of second sub-data, where the data type of the second sub-data is a fixed-point number; and the second sub-data is quantized from high-bit data to low-bit data using the second pseudo-quantization node to obtain the second fixed-point data.
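The low-bit-to-high-bit requantization described above keeps the multiply-accumulate from overflowing. A minimal integer-only sketch (an assumption, not the patented implementation; the power-of-two `shift` requantization is hypothetical):

```python
import numpy as np

def fixed_point_depthwise(x_u8, w_s8, shift):
    # Integer-only channel-by-channel convolution: widen low-bit uint8
    # activations to a high-bit 32-bit type, multiply-accumulate against
    # fixed-point weights at each target position, then requantize the
    # accumulator back to low-bit uint8.
    x = x_u8.astype(np.int32)   # low-bit -> high-bit (second pseudo-quant node)
    w = w_s8.astype(np.int32)
    n, h, wd = x.shape
    kh, kw = w.shape[1:]
    acc = np.zeros((n, h - kh + 1, wd - kw + 1), dtype=np.int32)
    for c in range(n):
        for i in range(acc.shape[1]):
            for j in range(acc.shape[2]):
                acc[c, i, j] = np.sum(x[c, i:i + kh, j:j + kw] * w[c])
    # high-bit -> low-bit: requantize the 32-bit accumulator back to uint8
    return np.clip(acc >> shift, 0, 255).astype(np.uint8)
```

All arithmetic here is integer arithmetic, which is the point of the scheme: the end-side device never touches floating point inside the convolution.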
In this way, the first fixed-point data forms the second fixed-point data after convolution, and subsequent calculations involving the second fixed-point data can be fixed-point calculations, which improves the operation speed of the end-side device.
In some implementations, after the step of multiplying the target positions of the first fixed-point data with the convolution kernels of the channel-by-channel convolution layer carrying the first fixed-point weights to obtain a plurality of second sub-data, the method further comprises: adding each second sub-data to a first bias to obtain a plurality of third sub-data, where the data type of the first bias is a fixed-point number; and quantizing all the third sub-data from high-bit data to low-bit data using the second pseudo-quantization node to obtain the second fixed-point data.
In some implementations, before the step of performing the point-by-point convolution operation on the second fixed-point data using the first point-by-point convolution layer, the method further includes: quantizing the second floating-point weight corresponding to the first point-by-point convolution layer into a second fixed-point weight using a fourth pseudo-quantization node. Thus, the multiplication of the first point-by-point convolution layer is a fixed-point calculation, which improves the operation speed of the end-side device.
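Weight quantization by a pseudo-quantization node might look like the following symmetric per-tensor sketch (an illustrative assumption; the patent does not specify the exact scheme):

```python
import numpy as np

def fake_quantize_weights(w_float, num_bits=8):
    # Quantize a floating-point weight tensor into a signed fixed-point
    # weight plus a scale, so the convolution can multiply in integers.
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 127 for int8
    scale = np.max(np.abs(w_float)) / qmax    # assumes w_float is not all zero
    w_fixed = np.clip(np.round(w_float / scale), -qmax - 1, qmax).astype(np.int8)
    return w_fixed, scale
```

The fixed-point weight participates in integer multiply-accumulates; the scale is folded back in when the result is requantized.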
In one implementation, before the step of performing the channel-by-channel convolution operation on the first fixed-point data using the channel-by-channel convolution layer to obtain the second fixed-point data, the method further includes: acquiring floating-point data of n channels corresponding to an original input picture; and quantizing the floating-point data of the n channels into fixed-point numbers to obtain the first fixed-point data.
In one implementation, the model further includes a second point-by-point convolution module comprising a second point-by-point convolution layer, a second batch normalization layer, and a second nonlinear activation function. Before the step of performing the channel-by-channel convolution operation on the first fixed-point data using the channel-by-channel convolution layer to obtain the second fixed-point data, the method further includes: obtaining floating-point data of k channels corresponding to an original input picture, where k is less than or equal to n; quantizing the floating-point data of the k channels into fixed-point numbers to obtain original fixed-point data; inputting the original fixed-point data into the second point-by-point convolution module, performing a point-by-point convolution operation on the original fixed-point data using the second point-by-point convolution layer, batch-normalizing the result using the second batch normalization layer, and mapping it with the second nonlinear activation function to obtain second floating-point data; wherein: the second floating-point data comprises data of n channels; the second point-by-point convolution layer comprises n convolution kernels, each of length 1, width 1, and height k, in one-to-one correspondence with the n channels; the n convolution kernels each perform a convolution operation on the original fixed-point data; and quantizing the second floating-point data into fixed-point numbers using a fifth pseudo-quantization node to obtain the first fixed-point data. In this way, a model with high accuracy and good expression capability can be obtained.
In one implementation, before the step of performing the point-by-point convolution operation on the original fixed-point data using the second point-by-point convolution layer, the method further includes: quantizing the third floating-point weight corresponding to the second point-by-point convolution layer into a third fixed-point weight using a sixth pseudo-quantization node. Thus, the multiplication of the second point-by-point convolution layer is a fixed-point calculation, which improves the operation speed of the end-side device.
In one implementation, the first and second nonlinear activation functions are ReLU functions. The first and second nonlinear activation functions may sit in relatively early layers of the neural network model, or in any layer other than the last. If they were ReLU6 functions, ReLU6 would clip the output range, distorting the distribution of the output data in these earlier layers; the resulting output-data distribution is unfriendly to quantization and harms the representation capability of the quantized model. Therefore, in the embodiment of the application, the first and second nonlinear activation functions adopt the ReLU function, so that the range of the output of the first point-by-point convolution layer, and in particular its maximum value, is not limited; the output can then accurately express the information that the first floating-point data should express, preserving the representation capability of the model.
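The clipping effect described above is easy to see on a hypothetical set of early-layer activations with a wide positive range:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)    # unbounded above: large activations survive

def relu6(x):
    return np.minimum(np.maximum(x, 0.0), 6.0)   # clips everything above 6

# Hypothetical activations from an early layer.
acts = np.array([-2.0, 0.5, 3.0, 9.0, 14.0])
print(relu(acts))    # the quantization range can match the real data
print(relu6(acts))   # 9.0 and 14.0 collapse to 6.0, distorting the distribution
```

With ReLU6, distinct activations above 6 become indistinguishable, so the quantizer sees a distorted, narrower distribution than the layer actually produces.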
In a second aspect, an embodiment of the present application further provides a device for improving the accuracy of a fixed-point neural network model, where the model includes a channel-by-channel convolution module and a first point-by-point convolution module, the channel-by-channel convolution module includes a channel-by-channel convolution layer, and the first point-by-point convolution module includes a first point-by-point convolution layer, a first batch normalization layer, and a first nonlinear activation function. The device comprises: a first convolution module, configured to input first fixed-point data into the channel-by-channel convolution module and perform a channel-by-channel convolution operation on the first fixed-point data using the channel-by-channel convolution layer to obtain second fixed-point data; wherein: the first fixed-point data and the second fixed-point data both comprise data of n channels, and n is greater than or equal to 1; the channel-by-channel convolution layer comprises n convolution kernels in one-to-one correspondence with the n channels; the n convolution kernels each perform a convolution operation with the data of the corresponding channel in the first fixed-point data to obtain the data of the n channels in the second fixed-point data; a second convolution module, configured to input the second fixed-point data into the first point-by-point convolution module, perform a point-by-point convolution operation on the second fixed-point data using the first point-by-point convolution layer, normalize the result using the first batch normalization layer, and map it with the first nonlinear activation function to obtain first floating-point data; wherein: the first floating-point data comprises data of m channels, where m is greater than or equal to 1; the first point-by-point convolution layer comprises m convolution kernels, each of length 1, width 1, and height n, in one-to-one correspondence with the m channels; the m convolution kernels each perform a convolution operation on the second fixed-point data; and a quantization module, configured to quantize the first floating-point data into fixed-point numbers using a first pseudo-quantization node to obtain third fixed-point data.
The embodiment of the application provides a device for improving the precision of a fixed-point neural network model, in which first floating-point data is obtained after the first fixed-point data is processed by the channel-by-channel convolution layer, the first point-by-point convolution layer, the first batch normalization layer, and the first nonlinear activation function; the device can control the degree of dispersion of the first floating-point data. The device can then quantize the first floating-point data, whose degree of dispersion is small, to obtain the third fixed-point data. Therefore, the precision loss of the quantized model can be reduced and the expression capability of the model improved.
In a third aspect, embodiments of the present application further provide a computer-readable storage medium having instructions stored therein, which when executed on a computer, cause the computer to perform the method for improving the accuracy of a fixed-point neural network model in the above aspects and their respective implementations.
In a fourth aspect, embodiments of the present application further provide an electronic device, including: a processor and a memory; the memory stores program instructions that, when executed by the processor, cause the electronic device to perform the method of improving the accuracy of the fixed-point neural network model in the above aspects and implementations thereof.
Drawings
FIG. 1 is a schematic diagram of a long real floating point;
FIG. 2 is a schematic diagram of an artificial intelligence model deployed at an end-side device;
fig. 3 is a schematic diagram of an electronic device according to an embodiment of the present application;
FIG. 4 is a software architecture block diagram of the electronic device 100 of an embodiment of the present application;
FIG. 5 is a schematic illustration of an application of a portrait light supplementing model;
FIG. 6 is a schematic diagram of a model quantization system;
FIG. 7 is a schematic diagram of quantization of a first fixed-point neural network model according to an embodiment of the present application;
FIG. 8 is a flowchart of a method for improving the accuracy of a fixed-point neural network model according to an embodiment of the present disclosure;
FIG. 9 is a schematic diagram of first fixed point data according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a channel-by-channel convolution operation provided by an embodiment of the present application;
FIG. 11 is a schematic diagram of a point-wise convolution operation provided by an embodiment of the present application;
FIG. 12 is a schematic diagram of quantization provided in an embodiment of the present application;
fig. 13 is a schematic structural diagram of a first fixed-point neural network model according to an embodiment of the present application;
FIG. 14 (a) is a schematic flow chart of a channel-by-channel convolution according to an embodiment of the present disclosure;
FIG. 14 (b) is a schematic flow chart of another channel-by-channel convolution provided in an embodiment of the present application;
FIG. 15 is a flowchart of determining first fixed point data according to an embodiment of the present application;
FIG. 16 is a diagram illustrating quantization of a second fixed-point neural network model according to an embodiment of the present application;
FIG. 17 is another schematic flow chart of a method for improving the accuracy of a fixed-point neural network model according to an embodiment of the present disclosure;
FIG. 18 is a schematic structural diagram of a second fixed-point neural network model according to an embodiment of the present disclosure;
fig. 19 is an application schematic diagram of a fixed-point portrait light supplementing model provided in an embodiment of the present application;
fig. 20 is a schematic diagram of a device for improving the accuracy of a fixed-point neural network model according to an embodiment of the present application.
Detailed Description
The terms "first", "second", "third", and the like in the description, the claims, and the drawings are used to distinguish between different objects and not to limit a specific order.
In the embodiments of the present application, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
The terminology used in the description of the embodiments of the application is for the purpose of describing particular embodiments only and is not intended to limit the application. The embodiments of the application are described in detail below with reference to the accompanying drawings.
To help those skilled in the art understand the technical solutions of the embodiments of the application, the technical terms involved in the embodiments are explained below.
1. A fixed-point number (fixed point) is a representation of a number used in computers in which the position of the decimal point is fixed by convention; both the integer part and the fractional part of the number are represented in binary. When the decimal point of a fixed-point number is fixed after the lowest digit of the number, the fixed-point number is called a fixed-point integer.
2. A floating-point number (floating point) is another representation of a number used in computers, approximating an arbitrary real number as an integer or fixed-point number multiplied by an integer power of a base. Specifically, a floating-point number is generally represented by three parts: the sign (S), the exponent (P), and the mantissa (M). The sign indicates whether the floating-point number is positive (sign 0) or negative (sign 1); the exponent is the integer power to which the base is raised; and the mantissa is an integer or fixed-point number. As shown in fig. 1, according to the IEEE 754 standard, a long-real (double-precision) floating-point number on a 64-bit computer is represented by a 1-bit sign, an 11-bit exponent, and a 52-bit mantissa.
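The 1/11/52-bit layout can be inspected directly. A small sketch using Python's standard `struct` module to split a 64-bit double into its three fields:

```python
import struct

def decompose_double(x):
    # Split a 64-bit IEEE 754 double into sign (1 bit), exponent (11 bits),
    # and mantissa (52 bits), matching the layout described above.
    bits = struct.unpack('>Q', struct.pack('>d', x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF          # biased by 1023
    mantissa = bits & ((1 << 52) - 1)
    return sign, exponent, mantissa
```

For example, 1.0 decomposes to sign 0, biased exponent 1023 (i.e., an actual exponent of 0), and mantissa 0.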
For the same machine word length (e.g., on a 64-bit computer), a floating-point number can represent a much larger range of data than a fixed-point number; and for the same word length, the precision of a floating-point number is higher than that of a fixed-point number.
Because a floating-point calculation is split into a calculation on the exponent part and a calculation on the mantissa part, floating-point calculations generally involve more steps, and correspondingly the calculation speed of floating-point numbers is lower than that of fixed-point numbers.
3. Quantization (quantize): mapping a set of numbers (e.g., floating-point numbers) within an original value range to another target value range through a mathematical transformation, forming numbers (e.g., fixed-point numbers) within the target range.
4. Model quantization is a technique for converting the floating-point calculations of a model into fixed-point calculations. Specifically, model quantization is a process of approximating the floating-point weights of a model, or the tensor data flowing through the model, which take continuous values (or a large number of possible discrete values), by a finite number of discrete values with low loss of inference precision. Model quantization effectively reduces the computational intensity and parameter size of a model, shrinking the model, reducing its memory consumption, and accelerating model inference.
The application scenario of the embodiments of the present application is described below with reference to the accompanying drawings.
FIG. 2 is a schematic diagram of a deployment scenario for an artificial intelligence model. As shown in fig. 2, with the development of deep learning, a large number of artificial intelligence models can be deployed to run on end-side devices (e.g., smartphones, tablets, personal computers, smart screens, smart televisions, and smart wearable devices). For example, a convolutional neural network may be deployed on an end-side device and used to extract visual features of an image or video, so as to achieve purposes such as face recognition, face tracking, and key feature point detection.
Fig. 3 is a schematic diagram of an electronic device according to an embodiment of the present application. The electronic device 100 may be an end-side device, including: processor 110, memory 120, universal serial bus (universal serial bus, USB) interface 130, charge management module 140, power management module 141, battery 142, antenna 1, antenna 2, mobile communication module 150, wireless communication module 160, audio module 170, speaker 170A, receiver 170B, microphone 170C, headset interface 170D, sensor module 180, keys 190, motor 191, camera 192, display 193, and subscriber identity module (subscriber identification module, SIM) card interface 194, among others. The sensor module 180 may include a touch sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a geomagnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, and the like. Among them, the gyro sensor 180B, the air pressure sensor 180C, the geomagnetic sensor 180D, the acceleration sensor 180E, and the like can be used to detect a motion state of an electronic apparatus, and thus, may also be referred to as a motion sensor.
It is to be understood that the structure illustrated in the embodiments of the present application does not constitute a specific limitation on the electronic device 100. In other embodiments of the present application, electronic device 100 may include more or fewer components than shown, or certain components may be combined, or certain components may be split, or different arrangements of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units, such as: the processor 110 may include an application processor (application processor, AP), a modem processor, a graphics processor (graphics processing unit, GPU), an image signal processor (image signal processor, ISP), a controller, a video codec, a digital signal processor (digital signal processor, DSP), a baseband processor, and/or a neural network processor (neural-network processing unit, NPU), etc. Wherein the different processing units may be separate devices or may be integrated in one or more processors.
Memory 120 may be used to store computer-executable program code that includes instructions. The memory 120 may include a stored program area and a stored data area. The storage program area may store an application program (such as a sound playing function, an image playing function, etc.) required for at least one function of the operating system, etc. The storage data area may store data created during use of the electronic device 100 (e.g., audio data, phonebook, etc.), and so on. In addition, the memory 120 may include a high-speed random access memory, and may also include a nonvolatile memory, such as at least one magnetic disk storage device, a flash memory device, a universal flash memory (universal flash storage, UFS), and the like. The processor 110 performs various functional applications and data processing of the electronic device 100 by executing instructions stored in the memory 120 and/or instructions stored in a memory provided in the processor.
The USB interface 130 is an interface conforming to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type-C interface, or the like. The USB interface 130 may be used to connect a charger to charge the electronic device 100, and may also be used to transfer data between the electronic device 100 and a peripheral device. It can also be used to connect a headset and play audio through the headset. The interface may also be used to connect other electronic devices, such as AR devices.
It should be understood that the interfacing relationship between the modules illustrated in the embodiments of the present application is only illustrative, and does not limit the structure of the electronic device 100. In other embodiments of the present application, the electronic device 100 may also use different interfacing manners, or a combination of multiple interfacing manners in the foregoing embodiments.
The charge management module 140 is configured to receive a charge input from a charger. The charger can be a wireless charger or a wired charger. In some wired charging embodiments, the charge management module 140 may receive a charging input of a wired charger through the USB interface 130. In some wireless charging embodiments, the charge management module 140 may receive wireless charging input through a wireless charging coil of the electronic device 100. The charging management module 140 may also supply power to the electronic device through the power management module 141 while charging the battery 142.
The power management module 141 is used to connect the battery 142, the charge management module 140, and the processor 110. The power management module 141 receives input from the battery 142 and/or the charge management module 140 and provides power to the processor 110, the memory 120, the display 193, the camera 192, the wireless communication module 160, and the like. The power management module 141 may also be configured to monitor parameters such as battery capacity, battery cycle count, and battery health (leakage, impedance). In other embodiments, the power management module 141 may also be provided in the processor 110. In other embodiments, the power management module 141 and the charge management module 140 may be disposed in the same device.
The wireless communication function of the electronic device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, a modem processor, a baseband processor, and the like.
The antennas 1 and 2 are used for transmitting and receiving electromagnetic wave signals. Each antenna in the electronic device 100 may be used to cover a single or multiple communication bands. Different antennas may also be multiplexed to improve the utilization of the antennas. For example: the antenna 1 may be multiplexed into a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
The mobile communication module 150 may provide a solution for wireless communication including 2G/3G/4G/5G, etc., applied to the electronic device 100. The mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (low noise amplifier, LNA), etc. The mobile communication module 150 may receive electromagnetic waves from the antenna 1, perform processes such as filtering, amplifying, and the like on the received electromagnetic waves, and transmit the processed electromagnetic waves to the modem processor for demodulation. The mobile communication module 150 can amplify the signal modulated by the modem processor, and convert the signal into electromagnetic waves through the antenna 1 to radiate. In some embodiments, at least some of the functional modules of the mobile communication module 150 may be disposed in the processor 110. In some embodiments, at least some of the functional modules of the mobile communication module 150 may be provided in the same device as at least some of the modules of the processor 110.
The modem processor may include a modulator and a demodulator. The modulator is used for modulating the low-frequency baseband signal to be transmitted into a medium-high frequency signal. The demodulator is used for demodulating the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then transmits the demodulated low frequency baseband signal to the baseband processor for processing. The low frequency baseband signal is processed by the baseband processor and then transferred to the application processor. The application processor outputs sound signals through an audio device (not limited to the speaker 170A, the receiver 170B, etc.), or displays images or videos through the display screen 193. In some embodiments, the modem processor may be a stand-alone device. In other embodiments, the modem processor may be provided in the same device as the mobile communication module 150 or other functional module, independent of the processor 110.
The wireless communication module 160 may provide solutions for wireless communication including wireless local area network (wireless local area networks, WLAN) (e.g., wireless fidelity (wireless fidelity, wi-Fi) network), bluetooth (BT), global navigation satellite system (global navigation satellite system, GNSS), frequency modulation (frequency modulation, FM), near field wireless communication technology (near field communication, NFC), infrared technology (IR), etc., as applied to the electronic device 100. The wireless communication module 160 may be one or more devices that integrate at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, modulates the electromagnetic wave signals, filters the electromagnetic wave signals, and transmits the processed signals to the processor 110. The wireless communication module 160 may also receive a signal to be transmitted from the processor 110, frequency modulate it, amplify it, and convert it to electromagnetic waves for radiation via the antenna 2.
In some embodiments, antenna 1 and mobile communication module 150 of electronic device 100 are coupled, and antenna 2 and wireless communication module 160 are coupled, such that electronic device 100 may communicate with a network and other devices through wireless communication techniques. The wireless communication techniques may include the Global System for Mobile communications (global system for mobile communications, GSM), general packet radio service (general packet radio service, GPRS), code division multiple access (code division multiple access, CDMA), wideband code division multiple access (wideband code division multiple access, WCDMA), time division code division multiple access (time-division code division multiple access, TD-SCDMA), long term evolution (long term evolution, LTE), BT, GNSS, WLAN, NFC, FM, and/or IR techniques, among others. The GNSS may include a global satellite positioning system (global positioning system, GPS), a global navigation satellite system (global navigation satellite system, GLONASS), a beidou satellite navigation system (beidou navigation satellite system, BDS), a quasi zenith satellite system (quasi-zenith satellite system, QZSS) and/or a satellite based augmentation system (satellite based augmentation systems, SBAS).
The electronic device 100 implements display functions through a GPU, a display screen 193, an application processor, and the like. The GPU is a microprocessor for image processing, and is connected to the display 193 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
The display 193 is used to display images, videos, and the like. The display 193 includes a display panel. The display panel may employ a liquid crystal display (liquid crystal display, LCD), an organic light-emitting diode (organic light-emitting diode, OLED), an active-matrix organic light-emitting diode (active-matrix organic light emitting diode, AMOLED), a flexible light-emitting diode (flex light-emitting diode, FLED), a Mini LED, a Micro LED, a Micro-OLED, a quantum dot light-emitting diode (quantum dot light emitting diodes, QLED), or the like. In some embodiments, electronic device 100 may include 1 or N display screens 193, N being a positive integer greater than 1.
The electronic device 100 may implement photographing functions through an ISP, a camera 192, a video codec, a GPU, a display screen 193, an application processor, and the like.
The ISP is used to process the data fed back by the camera 192. For example, when photographing, the shutter is opened, light is transmitted to the camera photosensitive element through the lens, the optical signal is converted into an electric signal, and the camera photosensitive element transmits the electric signal to the ISP for processing and is converted into an image visible to naked eyes. ISP can also optimize the noise, brightness and skin color of the image. The ISP can also optimize parameters such as exposure, color temperature and the like of a shooting scene. In some embodiments, the ISP may be located in the camera 192.
The camera 192 is used to capture still images or video. The object generates an optical image through the lens and projects the optical image onto the photosensitive element. The photosensitive element converts the optical signal into an electrical signal, which is then transferred to the ISP to be converted into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard RGB, RYYB, YUV, or the like format. In some embodiments, the electronic device 100 may include 1 or N cameras 192, N being a positive integer greater than 1.
The electronic device 100 may implement audio functions through an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, an application processor, and the like. Such as music playing, recording, etc.
The touch sensor 180A is also referred to as a "touch device". The touch sensor 180A may be disposed on the display 193, and the touch sensor 180A and the display 193 form a touch screen, also referred to as a "touch screen". The touch sensor 180A is used to detect a touch operation acting on or near it. The touch sensor may communicate the detected touch operation to the application processor to determine the touch event type. Visual output related to the touch operation may be provided through the display 193. In other embodiments, the touch sensor 180A may also be disposed on a surface of the electronic device 100 at a location different from that of the display 193.
The gyro sensor 180B may be used to determine the motion posture of the electronic device 100. In some embodiments, the angular velocity of the electronic device 100 about three axes (i.e., the x, y, and z axes) may be determined by the gyro sensor 180B. The gyro sensor 180B may be used for photographing anti-shake. For example, when the shutter is pressed, the gyro sensor 180B detects the shake angle of the electronic device 100, calculates the distance that the lens module needs to compensate according to the angle, and makes the lens counteract the shake of the electronic device 100 through reverse motion, thereby realizing anti-shake. The gyro sensor 180B may also be used in navigation and somatosensory gaming scenarios.
The air pressure sensor 180C is used to measure air pressure. In some embodiments, electronic device 100 calculates altitude from barometric pressure values measured by barometric pressure sensor 180C, aiding in positioning and navigation.
The geomagnetic sensor 180D includes a Hall sensor. The electronic device 100 may detect the opening and closing of a flip cover using the geomagnetic sensor 180D. In some embodiments, when the electronic device 100 is a flip (clamshell) device, the electronic device 100 may detect the opening and closing of the flip according to the geomagnetic sensor 180D. Features such as automatic unlocking upon flip opening may then be set according to the detected open or closed state of the holster or of the flip.
The acceleration sensor 180E may detect the magnitude of acceleration of the electronic device 100 in various directions (typically along three axes). The magnitude and direction of gravity may be detected when the electronic device 100 is stationary. It may also be used to recognize the posture of the electronic device, for applications such as landscape/portrait screen switching and pedometers.
A distance sensor 180F for measuring a distance. The electronic device 100 may measure the distance by infrared or laser. In some embodiments, the electronic device 100 may range using the distance sensor 180F to achieve quick focus.
The proximity light sensor 180G may include, for example, a light emitting diode and a light detector, such as a photodiode. The light emitting diode may be an infrared light emitting diode. The electronic device 100 emits infrared light outward through the light emitting diode. The electronic device 100 detects infrared reflected light from nearby objects using a photodiode. When sufficient reflected light is detected, it may be determined that there is an object in the vicinity of the electronic device 100. When insufficient reflected light is detected, the electronic device 100 may determine that there is no object in the vicinity of the electronic device 100. The electronic device 100 can detect that the user holds the electronic device 100 close to the ear by using the proximity light sensor 180G, so as to automatically extinguish the screen for the purpose of saving power. The proximity light sensor 180G may also be used in holster mode, pocket mode to automatically unlock and lock the screen.
The fingerprint sensor 180H is used to collect a fingerprint. The electronic device 100 may use the collected fingerprint features to implement fingerprint unlocking, application lock access, fingerprint photographing, fingerprint call answering, and the like.
The temperature sensor 180J is used to detect temperature. In some embodiments, the electronic device 100 executes a temperature processing strategy using the temperature detected by the temperature sensor 180J. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold, the electronic device 100 reduces the performance of a processor located near the temperature sensor 180J, so as to reduce power consumption and implement thermal protection. In other embodiments, when the temperature is below another threshold, the electronic device 100 heats the battery 142 to prevent a low temperature from causing the electronic device 100 to shut down abnormally. In other embodiments, when the temperature is below a further threshold, the electronic device 100 boosts the output voltage of the battery 142 to avoid abnormal shutdown caused by low temperature.
The keys 190 include a power-on key, a volume key, etc. The keys 190 may be mechanical keys. Or may be a touch key. The electronic device 100 may receive key inputs, generating key signal inputs related to user settings and function controls of the electronic device 100.
The motor 191 may generate a vibration cue. The motor 191 may be used for incoming call vibration alerting as well as for touch vibration feedback. For example, touch operations acting on different applications (e.g., photographing, audio playing, etc.) may correspond to different vibration feedback effects. The motor 191 may also correspond to different vibration feedback effects by touch operations applied to different areas of the display screen 193. Different application scenarios (such as time reminding, receiving information, alarm clock, game, etc.) can also correspond to different vibration feedback effects. The touch vibration feedback effect may also support customization.
The SIM card interface 194 is used to connect to a SIM card. The SIM card may be inserted into the SIM card interface 194, or removed from the SIM card interface 194, to make contact with or separate from the electronic device 100. The electronic device 100 may support 1 or N SIM card interfaces, N being a positive integer greater than 1. The SIM card interface 194 may support a Nano SIM card, a Micro SIM card, etc. Multiple cards may be inserted into the same SIM card interface 194 simultaneously. The types of the multiple cards may be the same or different. The SIM card interface 194 may also be compatible with different types of SIM cards. The SIM card interface 194 may also be compatible with external memory cards. The electronic device 100 interacts with the network through the SIM card to implement functions such as calls and data communication. In some embodiments, the electronic device 100 employs an eSIM, i.e., an embedded SIM card. The eSIM card can be embedded in the electronic device 100 and cannot be separated from the electronic device 100.
The software system of the electronic device 100 may employ a layered architecture, an event driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. In this embodiment, taking an Android system with a layered architecture as an example, a software structure of the electronic device 100 is illustrated.
Fig. 4 is a software configuration block diagram of the electronic device 100 of the embodiment of the present application.
The layered architecture divides the software into several layers, each with a distinct role and division of labor. The layers communicate with each other through software interfaces. In some embodiments, the Android system is divided into four layers, from top to bottom: an application layer, an application framework layer, the Android runtime (Android runtime) and system libraries, and a kernel layer.
The application layer may include a series of application packages.
As shown in fig. 4, the application package may include applications such as battery management, camera, gallery, calendar, phone, map, navigation, music, video, and short message.
The application framework layer provides an application program interface (application programming interface, API) and programming framework for application programs of the application layer. The application framework layer includes a number of predefined functions.
As shown in FIG. 4, the application framework layer may include a window manager, an input manager InputManager, a sensor manager SensorManager, a phone manager, a resource manager, a notification manager, and so forth.
The input manager may be used to monitor input events of the user, such as click events, swipe events, etc., performed by the user's finger on the display screen 193 of the electronic device 100. By listening for input events, the electronic device 100 can determine whether the electronic device is being used.
The sensor manager is used to monitor data returned by various sensors in the electronic device, such as motion sensor data, proximity sensor data, temperature sensor data, and the like. Using the data returned by the various sensors, the electronic device can determine whether it is jittered, whether the display 193 is occluded, etc.
The Android runtime includes a core library and a virtual machine. The Android runtime is responsible for scheduling and management of the Android system.
The core library consists of two parts: one part is the functions that the java language needs to call, and the other part is the core library of Android.
The application layer and the application framework layer run in a virtual machine. The virtual machine executes java files of the application program layer and the application program framework layer as binary files. The virtual machine is used for executing the functions of object life cycle management, stack management, thread management, security and exception management, garbage collection and the like.
The system library may include a plurality of functional modules. For example: surface manager (surface manager), media Libraries (Media Libraries), three-dimensional graphics processing Libraries (e.g., openGL ES), 2D graphics engines (e.g., SGL), etc.
The surface manager is used to manage the display subsystem and provides a fusion of 2D and 3D layers for multiple applications.
The media libraries support playback and recording of a variety of commonly used audio and video formats, as well as still image files and the like. The media libraries may support a variety of audio and video encoding formats, such as MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG.
The three-dimensional graphic processing library is used for realizing three-dimensional graphic drawing, image rendering, synthesis, layer processing and the like.
The 2D graphics engine is a drawing engine for 2D drawing.
The kernel layer is a layer between hardware and software. The kernel layer includes at least a display driver, a camera driver, an audio driver, and a sensor driver.
It is to be understood that the structure illustrated in the embodiments of the present application does not constitute a specific limitation on the electronic device 100. In other embodiments of the present application, electronic device 100 may include more or fewer components than shown, or certain components may be combined, or certain components may be split, or different arrangements of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Fig. 5 is a schematic diagram of an application of a portrait light supplementing model. As shown in fig. 5, the portrait light supplementing model is a model that simulates lighting for a portrait in an image. In practical application, after the portrait light supplementing model is deployed on a smart phone, a user uses the smart phone to photograph and obtains an image (a) to be supplemented with light. The portrait light supplementing model then extracts features of the image to obtain a feature map of the portrait in the image to be supplemented with light, and performs simulated lighting processing on the portrait, such as adding texture features to the portrait, so as to improve the visual effect of the portrait in the image.
An artificial intelligence model is generally a floating point model, and the calculations involved in running a floating point model are floating point calculations. Because floating point operations are slow and consume much memory, when a model is deployed on an end-side device, the floating point model is quantized into a fixed point model before deployment. Replacing floating point calculation with fixed point calculation can reduce latency and power consumption on the end-side device. Fig. 6 is a schematic diagram of a model quantization system. As shown in fig. 6, the model quantization system 100 includes an artificial intelligence model 101 and a post-quantization module 102; the artificial intelligence model 101 is a floating point model, and the post-quantization module 102 is configured to perform the foregoing model quantization steps. After model training is complete, the post-quantization module 102 may quantize the floating point artificial intelligence model 101 into a fixed point model.
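As a minimal illustrative sketch (not the system of fig. 6), the mapping from a floating point range to unsigned fixed point numbers can be expressed as an affine quantization with a scale and a zero point; the function names and the min/max calibration below are assumptions made for illustration only:

```python
import numpy as np

def quantize(x, num_bits=8):
    """Affine (asymmetric) quantization of a float tensor to uint8,
    mapping [x.min(), x.max()] onto [0, 2^num_bits - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)   # assumes a non-constant tensor
    zero_point = int(round(qmin - x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return scale * (q.astype(np.float32) - zero_point)

weights = np.array([-0.62, 0.0, 0.35, 1.1], dtype=np.float32)
q, s, z = quantize(weights)
recovered = dequantize(q, s, z)
# |weights - recovered| is the quantization error: the wider the floating
# point range, the larger the scale, and the larger the error per value.
```

This makes the precision-loss argument in the next paragraph concrete: the error per value is bounded by the scale, which grows with the width of the floating point range being quantized.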
However, since quantization is a process of mapping from one range to another, the size of the floating point range affects the accuracy of the quantized fixed point numbers, and further affects the representation capability of the model. Because the quantization range in existing model quantization is large, existing model quantization suffers from large precision loss and poor model representation capability.
For example, the portrait light supplementing model shown in fig. 5 is a fixed point model quantized by the model quantization system 100, and suffers from poor accuracy. Specifically, the portrait in image (b) is overexposed, the portrait loses some skin texture and color, and the garment is also lit up, so the light supplementing effect is poor. It can be seen that quantization using the model quantization system 100 results in a large quantization error.
The embodiment of the application provides a method for improving the precision of a fixed-point neural network model, which can control the range of the parameters to be quantized in the model and reduce the precision loss of the quantized model. Without increasing the number of model parameters, a better fixed-point effect can be obtained and the representation capability of the model improved. The method provided by the embodiment of the application can be applied to a floating point model including a channel-by-channel convolution layer and/or a point-by-point convolution layer, so as to improve the quantization precision of the floating point model. The fixed point model formed after quantization can be deployed on end-side devices such as smart phones, tablet computers, personal computers, smart screens, smart televisions, and smart wearable devices; after deployment, the fixed point model not only enables the end-side device to achieve low latency and low power consumption, but also achieves the same working effect as the floating point model.
Fig. 7 is a schematic diagram of quantization of a first fixed-point neural network model according to an embodiment of the present application. As shown in fig. 7, the model provided in the embodiment of the present application may include a channel-by-channel convolution (depthwise convolution) module and a first point-by-point convolution (pointwise convolution) module. The channel-by-channel convolution module (Dw) includes a channel-by-channel convolution layer (conv 1); the first point-by-point convolution module (Pw 1) includes a first point-by-point convolution layer (conv 2), a first batch normalization (batch normalization) layer (BN 1), and a first nonlinear activation function (ReLU 1).
Fig. 8 is a flowchart of a method for improving accuracy of a fixed-point neural network model according to an embodiment of the present application. As shown in fig. 8, the method includes:
s201: inputting the first fixed point data into a channel-by-channel convolution module, and performing channel-by-channel convolution operation on the first fixed point data by utilizing a channel-by-channel convolution layer to obtain second fixed point data; wherein: the first fixed point data and the second fixed point data both comprise data of n channels, and n is greater than or equal to 1; the channel-by-channel convolution layer comprises n convolution kernels, and the n convolution kernels are in one-to-one correspondence with the n channels; the n convolution kernels are used for respectively carrying out convolution operation with the data of the corresponding channels in the first fixed-point data so as to obtain the data of the n channels in the second fixed-point data.
Wherein the first fixed point data is input shown in fig. 7.
In some implementations, the value of n may be 3, that is, the first fixed point data and the second fixed point data include 3 channels of data, and the channel-by-channel convolution layer includes 3 convolution kernels, the 3 convolution kernels corresponding to the 3 channels one to one. The first fixed point data may be, for example, data determined by a picture including three channels of red (red, R), green (G), blue (B).
Specifically, the image of the three channels of RGB may be stored in the end-side device as three matrices, where each value in each matrix is the RGB value of the corresponding pixel in the image, so that the first fixed-point data of the three channels may be obtained, and each channel is represented by one matrix. Fig. 9 exemplarily shows a piece of first fixed-point data determined by a picture including 5*5 pixels.
Furthermore, for each matrix forming the first fixed point data, the data types of the elements in the matrix are fixed point numbers, so that when convolution operation is performed, the end side device can perform fixed point number operation instead of floating point number operation, and the purpose of improving operation speed can be achieved. The data type of the elements in the matrix may be specifically an 8-bit unsigned integer (uint 8) or a 16-bit unsigned integer (uint 16), which is not specifically limited in this application.
Further, after the convolution operation, the data types of the elements in each matrix corresponding to the second fixed-point data are fixed-point numbers, which may specifically be uint8 or uint16, which is not limited in this application.
In the embodiment of the present application, the specific value of n may be determined according to practical situations, and is not limited to 3.
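As a small sketch of the representation described above (assuming numpy; the array layout and variable names are illustrative choices, not the patent's implementation), a 5*5-pixel RGB picture can be held as three uint8 matrices, one per channel:

```python
import numpy as np

rng = np.random.default_rng(0)
# Three 5*5 matrices, one per R/G/B channel; every element is a fixed-point
# number of type uint8 (uint16 would also satisfy the scheme above).
first_fixed_point_data = rng.integers(0, 256, size=(3, 5, 5), dtype=np.uint8)

n = first_fixed_point_data.shape[0]   # n = 3 channels, one matrix per channel
```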
Further, the convolution calculation process of the channel-by-channel convolution layer satisfies the following formula:
y=∑ax+b;
where x is input data (first fixed point data) of the channel-by-channel convolution layer, y is convolution calculation result (second fixed point data) of the channel-by-channel convolution layer, a is weight (first fixed point weight) of the channel-by-channel convolution layer, and b is offset (first offset) of the channel-by-channel convolution layer.
The specific convolution calculation process of the channel-by-channel convolution layer will be described in detail below, and will not be described in detail here.
Fig. 10 is a schematic diagram of a channel-by-channel convolution operation provided in an embodiment of the present application, and an exemplary process of channel-by-channel convolution is illustrated using 3-channel data as the first fixed-point data. As shown in fig. 10, the first fixed point data (3 channel input) of 3 channels may be input to a channel-by-channel convolution module, the channel-by-channel convolution layer may include 3 convolution kernels, and the sizes of the 3 convolution kernels are 3×3, and the 3 convolution kernels may respectively perform convolution operation on the data of 3 channels, where the 3 convolution kernels may achieve the effect of three Filters (Filters). After each convolution kernel convolves the data of its corresponding channel, a convolution result (Maps) of 3 channels may be obtained, and the convolution result of 3 channels may also be referred to as second fixed-point data.
It should be noted that, in the embodiment of the present application, the size of the convolution kernel corresponding to the channel-by-channel convolution layer is M×N×x×x, where M is the number of output channels of the convolution kernel, N is the number of input channels of the convolution kernel, and x×x is the size of the convolution kernel. Further, the size of the convolution kernel corresponding to the channel-by-channel convolution layer is preferably 3×3×3×3; specifically, the number of input channels and the number of output channels of a convolution kernel of this size are both 3, and the size of the convolution kernel is 3*3. The size of the convolution kernel corresponding to the channel-by-channel convolution layer may also be 3×3×7×7; specifically, the number of input channels and the number of output channels of a convolution kernel of this size are both 3, and the size of the convolution kernel is 7*7. It will be appreciated that for a channel-by-channel convolution layer, the number of channels of the input matrix (first fixed point data) is equal to the number of input channels of the convolution kernel (the number of convolution kernels of the channel-by-channel convolution layer), equal to the number of output channels of the convolution kernel, and equal to the number of channels of the output matrix (second fixed point data). The size of the convolution kernel corresponding to the channel-by-channel convolution layer can be determined according to actual needs and is not limited to these two cases.
In some implementations, where the pixel size of the picture corresponding to the first fixed point data is 5×5, the convolution kernel size of the channel-by-channel convolution layer is 3×3, and the number of channels of the first fixed point data is 3, the number of parameters of the channel-by-channel convolution layer is 3×3×3=27. The channel-by-channel convolution layer therefore involves few parameters, and its operation speed is high.
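To make the per-channel nature of the operation and the parameter count concrete, the following is a minimal numpy sketch of the channel-by-channel convolution described above. The stride of 1, the lack of padding, and the all-ones kernel values are illustrative assumptions, not part of the embodiment:

```python
import numpy as np

def depthwise_conv(x, kernels):
    """Channel-by-channel (depthwise) convolution, stride 1, no padding.
    x: (C, H, W) input; kernels: (C, k, k), one kernel per channel."""
    c, h, w = x.shape
    _, k, _ = kernels.shape
    out = np.zeros((c, h - k + 1, w - k + 1))
    for ch in range(c):  # each kernel convolves only its own channel
        for i in range(h - k + 1):
            for j in range(w - k + 1):
                out[ch, i, j] = np.sum(x[ch, i:i+k, j:j+k] * kernels[ch])
    return out

x = np.arange(3 * 5 * 5).reshape(3, 5, 5).astype(float)  # 5x5 picture, 3 channels
kernels = np.ones((3, 3, 3))                             # 3 kernels of size 3x3
maps = depthwise_conv(x, kernels)
print(maps.shape)    # (3, 3, 3): the channel count is preserved
print(kernels.size)  # 27 parameters, matching 3x3x3=27 in the text
```

Because each kernel touches only one channel, the layer has C×k×k parameters rather than C_out×C_in×k×k as in a standard convolution.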
S202: inputting the second fixed point data into a first point-by-point convolution module, performing point-by-point convolution operation on the second fixed point data by using a first point-by-point convolution layer, normalizing the second fixed point data by using a first normalization layer, and mapping the second fixed point data by using a first nonlinear activation function to obtain first floating point data; wherein: the first floating point data comprises data of m channels, wherein m is greater than or equal to 1; the first point-by-point convolution layer comprises m convolution kernels, wherein the m convolution kernels are 1 in length, 1 in width and n in height, and are in one-to-one correspondence with m channels; the m convolution kernels are used for respectively carrying out convolution operation on the second fixed-point data.
It will be appreciated that the output data of the channel-wise convolutional layer (second fixed point data) is the input data of the first point-wise convolutional layer.
In this embodiment, the value of m may be 3, that is, the first floating point data includes 3 channels of data. At this time, the first floating point data may be subjected to some post-processing to form an RGB three-channel picture, and the data types of the elements of the RGB three-channel picture data corresponding matrix may be floating point numbers. Alternatively, the first floating point data containing 3-channel data may also be input as input data into other models or other layers of the model. The value of m may be other than 3, and this is not particularly limited in this application.
In this embodiment of the present application, the first point-by-point convolution layer is configured to perform point-by-point convolution on the second fixed-point data, and fig. 11 is a schematic diagram of the point-by-point convolution operation provided in this embodiment of the present application, where the process of point-by-point convolution is illustrated by taking 3-channel data as the second fixed-point data. As shown in fig. 11, the second fixed point data (3 channel input) of the 3 channels may be input into the first point-by-point convolution layer, the first point-by-point convolution layer may include 4 convolution kernels, the sizes of the 4 convolution kernels are 1×1×3, and the 4 convolution kernels may each perform a convolution operation across the 3 channels of the second fixed point data, where the 4 convolution kernels may achieve the effect of four Filters (Filters). After the second fixed point data is convolved by the 4 convolution kernels, a convolution result (Maps) of 4 channels can be obtained.
It should be noted that the size of the convolution kernel corresponding to the first point-by-point convolution layer provided in the embodiment of the present application is M×N×1×1, where M is the number of output channels of the convolution kernel, N is the number of input channels of the convolution kernel (i.e., the height of the convolution kernel), and 1×1 is the length and width of the convolution kernel. Further, the size of the convolution kernel corresponding to the first point-by-point convolution layer is preferably 3×3×1×1; specifically, the number of input channels and the number of output channels of a convolution kernel of this size are both 3, and the spatial size of the convolution kernel is 1×1. It will be appreciated that for the first point-by-point convolution layer, the number of channels of the second fixed point data is equal to the number of input channels of the convolution kernel. The number of convolution kernels determines the number of output channels of the convolution kernel, and the number of output channels of the convolution kernel is equal to the number of channels of the first floating point data.
In some implementations, for example, if the convolution kernel size of the first point-by-point convolution layer is 4×3×1×1, then the number of parameters of the first point-by-point convolution layer is 4×3×1×1=12. Therefore, the first point-by-point convolution layer involves a small number of parameters and has a high calculation speed.
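As an illustration of the point-by-point convolution of fig. 11 and the parameter count above, the following numpy sketch computes 4 output channels from 3 input channels with 1×1 kernels. The random data and weights are illustrative assumptions:

```python
import numpy as np

def pointwise_conv(x, weights):
    """Point-by-point (1x1) convolution. x: (C_in, H, W); weights: (C_out, C_in).
    Each output channel is a weighted sum across input channels at every pixel."""
    return np.tensordot(weights, x, axes=([1], [0]))  # -> (C_out, H, W)

x = np.random.rand(3, 5, 5)  # second fixed point data, 3 channels
w = np.random.rand(4, 3)     # 4 kernels of size 1x1x3
maps = pointwise_conv(x, w)
print(maps.shape)  # (4, 5, 5): 4 output channels
print(w.size)      # 12 parameters, matching 4x3x1x1=12 in the text
```

Unlike the channel-by-channel layer, every output value here mixes all input channels, which is the cross-channel correction discussed later in the text.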
In the embodiment of the present application, the convolution calculation of the first point-by-point convolution layer satisfies the following formula:
y=∑ax+b;
where x is the input data of the first point-by-point convolution layer (the second fixed point data), y is the convolution calculation result of the first point-by-point convolution layer, a is the weight of the first point-by-point convolution layer (the second fixed point weight), and b is the offset of the first point-by-point convolution layer (the second offset).
In this embodiment of the present application, the first normalization layer is configured to perform batch normalization processing on the second fixed point data, and an exemplary description is given below of a calculation process of the first normalization layer.
The operation of the first normalization layer is as follows.

Input: the value of x for each channel (mini-batch): $B = \{x_{1\ldots m}\}$; parameters to be learned: $\gamma$, $\beta$, where $\gamma$ is a scaling parameter and $\beta$ is a translation parameter.

Output: $\{y_i = \mathrm{BN}_{\gamma,\beta}(x_i)\}$, the output data of the first normalization layer.

$\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i$ represents the calculation of the mean value for each channel (mini-batch), and $\mu_B$ represents the mean value of each channel. $\sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m}(x_i - \mu_B)^2$ represents the calculation of the variance of each channel (mini-batch), and $\sigma_B^2$ represents the variance of each channel. $\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}$ represents the normalization, where $\epsilon$ is a very small number used to ensure that the normalization operates normally even when the variance $\sigma_B^2$ is 0. $y_i = \gamma\hat{x}_i + \beta$ represents the translation and scaling.
It will be appreciated that the first normalization layer is operated on independently in each channel (mini-batch).
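The per-channel batch normalization computed by the first normalization layer can be sketched in numpy as follows; the input shape and the parameter values are illustrative assumptions:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Per-channel batch normalization. x: (C, H, W)."""
    mu = x.mean(axis=(1, 2), keepdims=True)   # mean of each channel
    var = x.var(axis=(1, 2), keepdims=True)   # variance of each channel
    x_hat = (x - mu) / np.sqrt(var + eps)     # normalize; eps guards against var == 0
    return gamma.reshape(-1, 1, 1) * x_hat + beta.reshape(-1, 1, 1)

x = np.random.rand(3, 4, 4)
y = batch_norm(x, gamma=np.ones(3), beta=np.zeros(3))
# with gamma=1, beta=0, each channel comes out with ~zero mean and ~unit variance
print(np.round(y.mean(axis=(1, 2)), 6))
```

Note that the mean and variance are computed independently per channel, which is exactly why a near-constant channel (variance near 0) after a channel-by-channel convolution would blow up without the epsilon term.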
It will be appreciated that the calculation of the first normalization layer includes subtracting the mean and dividing by the variance. Since the first normalization layer includes a variance calculation, the result of that calculation is a floating point number, which also makes the data output by the first point-by-point convolution module floating point data. Specifically, the data type of the first floating point data may be float, which is not specifically limited in this application.
In the embodiment of the present application, the second fixed point data may be obtained by the convolution operation of the channel-by-channel convolution layer in the channel-by-channel convolution module. In the first aspect, since the channel-by-channel convolution layer performs the convolution operation separately for each channel, the degree of dispersion of the elements in the matrix corresponding to the second fixed-point data is low; after the second fixed point data is input to the first point-by-point convolution layer, the degree of dispersion of the convolution results is therefore also relatively low. In the second aspect, since the elements in the matrix corresponding to the second fixed-point data are relatively close and the degree of dispersion is low, the variance of the second fixed-point data may tend toward 0. If a batch normalization layer were placed after the channel-by-channel convolution layer, an abnormally large value would be obtained after the batch normalization layer's divide-by-variance operation, since the batch normalization layer (BN layer) operates on a single channel (channel). When model quantization is performed, such an abnormal value enlarges the quantization range and affects the expression range of the model. Therefore, in the embodiment of the application, no batch normalization layer is arranged after the channel-by-channel convolution layer, so the accuracy of the quantization range can be ensured, and quantization loss of the model due to outliers is avoided.
Furthermore, because the first point-by-point convolution layer is a cross-channel convolution operation, cross-channel correction exists, even if the element values of corresponding matrixes in a single channel are similar or equal, the first batch normalization layer does not have the condition that the variance is 0 when each channel is subjected to batch normalization processing, and therefore the dispersion degree of batch normalization results among the channels is low. Therefore, the obtained data is processed by the channel-by-channel convolution layer, the first point-by-point convolution layer and the first normalization layer, and the degree of dispersion is low. The data is more quantization friendly. When quantization is carried out, the quantization range is smaller, the quantization is more accurate, and the characterization capability of the fixed-point model can be stronger.
In this embodiment of the present application, the first nonlinear activation function may map the second fixed point data, so that the first nonlinear activation function plays the role of activation, and the mapping step using the first nonlinear activation function can enhance the nonlinear expression capability of the model. In particular, the first nonlinear activation function may be a ReLU function. The function expression of the ReLU function is as follows:

$$\mathrm{ReLU}(x) = \max(0, x)$$

where x is the argument of the ReLU function and ReLU(x) is the value of the function. When x > 0, the output value of the function is x, and when x ≤ 0, the output value of the function is 0.
Since the first nonlinear activation function may be a relatively early layer of the neural network model, or any layer other than the last layer of the neural network model, if the first nonlinear activation function were a ReLU6 function, the ReLU6 function would limit the range of the output data and distort its distribution in an early layer, resulting in a distribution of output data that is not friendly to quantization and affecting the characterization capability of the model after quantization. Thus, in the embodiment of the present application, the first nonlinear activation function employs a ReLU function. Compared with the ReLU6 function, the ReLU function ensures that the range of the output result of the first point-by-point convolution layer is not compressed, so that the output result can faithfully express the information that the first floating point data should express, ensuring the representation capability of the model.
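The difference between ReLU and ReLU6 described above can be illustrated numerically; the sample values below are arbitrary. ReLU preserves the upper range of the output, while ReLU6 clips everything above 6:

```python
import numpy as np

relu = lambda x: np.maximum(0, x)                    # ReLU(x) = max(0, x)
relu6 = lambda x: np.minimum(np.maximum(0, x), 6)    # ReLU6 additionally caps at 6

x = np.array([-3.0, 0.0, 4.5, 9.2])
print(relu(x))   # [0.  0.  4.5 9.2]  upper range preserved
print(relu6(x))  # [0.  0.  4.5 6. ]  values above 6 are clipped
```

The clipped value 9.2 → 6.0 is exactly the kind of range compression that, per the text, would distort the floating point distribution seen by the pseudo quantization node.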
In some implementations, the first point-wise convolutional layer, the first batch of normalized layers, and the first nonlinear activation function may form a fusion operator. Specifically, the first normalization layer in the fusion operator may include two sublayers bn_a and bn_b, where bn_a calculates the mean and variance, bn_b performs normalization and scaling, and bn_a and bn_b are fused to the adjacent convolution layers (the channel-by-channel convolution layer and the first point-by-point convolution layer) and the activation layer (the layer where the first nonlinear activation function is located), respectively.
The calculation process of the fusion operator is as follows: firstly, reading an output result of a previous layer of convolution operation (channel-by-channel convolution) from a memory of the end side equipment, wherein the output result comprises second fixed point data, a mean value and a variance determined based on the second fixed point data, and a scaling parameter gamma and a translation parameter beta, so as to complete normalization and scaling calculation of BN_B. And inputting the result of BN_B into a first nonlinear activation function for calculation to obtain an activation result. And then inputting the activation result into the first point-by-point convolution layer to perform point-by-point convolution calculation to obtain a convolution result. Before writing the convolution result into the memory of the end-side device, the calculation of BN_A is completed, namely the mean value and the variance of the convolution result of the first point-by-point convolution are obtained.
It can be understood that the fusion operator completes the calculation process of 'normalized scaling- & gt active layer- & gt convolution layer- & gt calculating the mean and variance of the convolution result', and only needs to read the second fixed-point data and related parameters (mean, variance, scaling parameters and translation parameters) obtained by one-time channel-by-channel convolution calculation. And writing the result of the first point-by-point convolution calculation and related parameters (mean, variance, scaling parameters and translation parameters) back into the memory of the end-side device.
In the embodiment of the application, the fusion operator is utilized for calculation, so that the read-write operation of an intermediate result can be reduced, and the frequency of access operation is reduced.
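The data flow of the fusion operator ("normalized scaling → active layer → convolution layer → mean and variance of the convolution result") can be sketched as follows. The function and argument names are hypothetical, and the statistics passed in stand for the values read from memory alongside the depthwise result:

```python
import numpy as np

def fused_block(x, mu, var, gamma, beta, w, eps=1e-5):
    """Sketch of the fusion operator: one read of the depthwise result plus its
    statistics, one write of the pointwise result plus its statistics.
    BN_B (normalize + scale) -> ReLU -> pointwise conv -> BN_A (mean/var)."""
    x_hat = gamma.reshape(-1, 1, 1) * (x - mu) / np.sqrt(var + eps) \
            + beta.reshape(-1, 1, 1)                   # BN_B: normalization and scaling
    a = np.maximum(0, x_hat)                           # activation layer (ReLU)
    y = np.tensordot(w, a, axes=([1], [0]))            # first point-by-point convolution
    return y, y.mean(axis=(1, 2)), y.var(axis=(1, 2))  # BN_A: stats of the conv result

x = np.random.rand(3, 4, 4)  # depthwise result read from memory
y, mu_out, var_out = fused_block(
    x, x.mean(axis=(1, 2), keepdims=True), x.var(axis=(1, 2), keepdims=True),
    np.ones(3), np.zeros(3), np.random.rand(4, 3))
print(y.shape, mu_out.shape)  # (4, 4, 4) (4,)
```

Everything between the single read at the top and the single write of `y`, `mu_out`, `var_out` stays in registers/cache, which is the memory-access saving the text attributes to the fusion operator.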
S203: and quantizing the first floating point data into fixed point numbers by using the first pseudo quantization node to obtain third fixed point data.
Fig. 12 is a diagram of quantization provided in the embodiment of the present application. As shown in fig. 12, quantization is the process of representing 32-bit limited-range floating point data (float) with a data type of fewer bits (e.g., uint8 or uint16). Fig. 12 shows floating point numbers in the range $[\min(x_f), \max(x_f)]$ being quantized into the data range of uint8, i.e., the floating point numbers are mapped to 0–255. If the floating point numbers in $[\min(x_f), \max(x_f)]$ are instead quantized to uint16, they are mapped to 0–65535.
The first pseudo-quantization node (act quant 1) is a node for quantizing the first floating point data, and it can play the roles of reducing the size of the model, reducing the memory consumption of the model, and accelerating the inference speed of the model.
The quantization calculation of the first pseudo quantization node is as follows:

$$Q = \mathrm{round}\left(\frac{R}{S}\right) + Z$$

where Q represents the fixed point number after quantization, R represents the floating point number before quantization, Z represents the quantized fixed point value corresponding to the floating point number 0, and S is the minimum scale that can be represented after fixed point quantization.
The calculation formula of S is as follows:

$$S = \frac{R_{max} - R_{min}}{Q_{max} - Q_{min}}$$

where $R_{max}$ is the maximum value of the floating point numbers to be quantized, $R_{min}$ is the minimum value of the floating point numbers to be quantized, $Q_{max}$ is the maximum value of the quantized fixed point values, and $Q_{min}$ is the minimum value of the quantized fixed point values.
For the quantization process shown in fig. 12, $Q_{max} - Q_{min} = 255$, and the floating point numbers are quantized to the uint8 type. If it is desired to quantize the floating point numbers to the uint16 data type, then $Q_{max} - Q_{min} = 65535$. The data type to which the floating point numbers are quantized may be determined according to the actual situation, which is not particularly limited in this application.
The calculation formula of Z is as follows:

$$Z = Q_{max} - \frac{R_{max}}{S}$$
it will be appreciated that the process of quantization requires statistics of the range of floating point numbers to be quantized. In the embodiment of the application, the discrete degree of the first fixed point data is relatively low through the processing of the channel-by-channel convolution, the first point-by-point convolution, the first batch normalization and the first nonlinear activation function, so that the information expressed by each channel can be reserved during quantization, the precision loss of the quantized model is small, and the characterization capability of the model can be ensured.
In this way, when the third fixed point data is input to the other convolution layers, the calculation of the other convolution layers is still fixed point calculation, and the calculation speed of the end-side device can be improved.
Fig. 13 is a schematic structural diagram of a first fixed-point neural network model according to an embodiment of the present application, and based on the method according to the embodiment of the present application, a model structure as shown in fig. 13 may be obtained. The model includes a channel-by-channel convolution layer (Depthwise Convolution) and a point-by-point convolution layer (Pointwise Convolution); the point-by-point convolution layer in this embodiment is the first point-by-point convolution layer. The model further includes a batch normalization layer (Batch Normalization), which in this embodiment is the first batch normalization layer, and a nonlinear activation function (ReLU), which in this embodiment is the first nonlinear activation function.
The model structure shown in fig. 13 does not include a batch normalization layer or a nonlinear activation function after the channel-by-channel convolution layer, so the quantization range is not amplified by abnormal values when the convolution result of the channel-by-channel convolution layer is quantized, and the accuracy of the quantization range can be ensured. Moreover, the point-by-point convolution layer is followed by a nonlinear activation function, a nonlinear activation layer positioned before the linear layer (linear), so the nonlinear expression capability of the model can be ensured.
As can be seen from the above technical solutions, the embodiments of the present application provide a method for improving the accuracy of a fixed-point neural network model, where after first fixed-point data is processed by a channel-by-channel convolution layer, a first point-by-point convolution layer, a first batch of normalization layers, and a first nonlinear activation function, first floating point data can be obtained, and the method can control the degree of dispersion of the first floating point data. The method can also quantize the first floating point data with smaller discrete degree to obtain third fixed point data. Therefore, the precision loss of the quantized model can be reduced, and the expression capacity of the model can be improved.
The following describes the steps of the channel-by-channel convolution according to the embodiments of the present application with reference to the accompanying drawings.
Fig. 14 (a) is a schematic flow chart of a channel-by-channel convolution according to an embodiment of the present application. As shown in fig. 14 (a), step S201 includes the following steps S2011-S2013:
s2011: the second pseudo-quantization node is utilized to weight the first fixed point data from low bit data to high bit data, and first sub-data is obtained; the data type of the first sub data is fixed point number; the data type of the low-bit data is uint8 or uint16, and the data type of the high-bit data is uint32.
The first fixed point weight is the weight corresponding to the convolution kernel of the channel-by-channel convolution layer, and the data type of the first fixed point weight is uint8 or uint16, preferably uint8. The convolution operation involves multiplication and accumulation, but the range of data representable by low-bit fixed point numbers (e.g., uint8 or uint16) is relatively small, and if the products of low-bit fixed point numbers are accumulated directly, the data may go out of range. The embodiment of the present application therefore converts the first fixed point data to high-bit data, such as uint32, so that overflow can be avoided.
S2012: multiplying the target position of the first fixed point data by utilizing the first fixed point weight and the convolution check of the channel-by-channel convolution layer to obtain a plurality of second sub-data; the data type of the second sub data is fixed point number.
Wherein the convolution kernel slides on the input feature image (first fixed point data) when the convolution operation is performed on the channel-by-channel convolution layer. In the sliding process, the convolution kernel slides to which position of the first fixed point data is the target position.
It may be understood that the convolution kernel may traverse the first fixed point data according to a preset sliding step, and the sliding step may be 1, which is not specifically limited in the embodiment of the present application.
S2013: and quantizing the second sub data from the high-bit data to the low-bit data by using the second pseudo quantization node to obtain second fixed-point data.
The data type of the second fixed point data may be uint8 or uint16, and the specific data type depends on the data type of the second pseudo quantization node (act quant 2). The data type of the second fixed point data corresponds to the data type of the second pseudo quantization node. In this way, the data output by the second pseudo quantization node can be used for input into the layers of the other model.
It is understood that the step of adding the offset is not included in steps S2011-S2013, and the offset may be considered to be 0 at this time.
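Steps S2011–S2013 can be sketched as follows: the uint8 inputs and weights are widened to 32-bit before multiply-accumulation so the accumulator cannot overflow, and the result is then requantized to uint8. The requantization factor `out_scale` stands in for the second pseudo quantization node and is a hypothetical parameter; the offset is taken as 0, as in steps S2011–S2013:

```python
import numpy as np

def fixed_point_depthwise(x_q, w_q, out_scale=1/64, q_min=0, q_max=255):
    """uint8 depthwise conv: multiply-accumulate in 32 bits, requantize to uint8.
    x_q: (C, H, W) uint8; w_q: (C, k, k) uint8."""
    x32 = x_q.astype(np.int32)  # widen to high-bit data before accumulation (S2011)
    w32 = w_q.astype(np.int32)
    c, h, w = x32.shape
    k = w32.shape[1]
    out = np.zeros((c, h - k + 1, w - k + 1), dtype=np.int32)
    for ch in range(c):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):  # accumulating 9 uint8 products can
                out[ch, i, j] = np.sum(x32[ch, i:i+k, j:j+k] * w32[ch])  # exceed 16 bits
    # requantize the high-bit accumulator back to low-bit data (S2013)
    return np.clip(np.round(out * out_scale), q_min, q_max).astype(np.uint8)

x_q = np.full((3, 5, 5), 200, dtype=np.uint8)
w_q = np.full((3, 3, 3), 100, dtype=np.uint8)
y_q = fixed_point_depthwise(x_q, w_q)
print(y_q.shape, y_q.dtype)  # (3, 3, 3) uint8: safe despite 200*100*9 = 180000
```

Here each accumulator value is 180000, far beyond the uint16 range, which is exactly the overflow the widening in S2011 prevents.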
Fig. 14 (b) is a schematic flow chart of another channel-by-channel convolution provided in the embodiment of the present application, as shown in fig. 14 (b), after S2012, the embodiment of the present application may further include the following steps S2014-S2015.
S2014: and adding each second sub data with the first bias to obtain a plurality of third sub data, wherein the data type of the first bias is fixed point number.
The specific value of the first bias (bias 1) may be determined by the actual situation, which is not specifically limited in this application.
In this embodiment of the present application, a pseudo quantization node (bias quantization) may be further provided for the first bias, where the pseudo quantization node is used to quantize the first bias with the data type being the floating point number to the fixed point number.
S2015: and quantizing all the third sub-data from the high-bit data to the low-bit data by using the second pseudo-quantization node to obtain second fixed-point data.
In this way, the first fixed point data forms second fixed point data after convolution, and subsequent calculation related to the second fixed point data can adopt fixed point calculation, so that the operation speed of the end side equipment can be improved.
The quantization step of the third sub-data by the second pseudo-quantization node may refer to the foregoing, and will not be described herein. The data type of the second fixed point data may be uint8 or uint16, and in particular, what data type depends on the data type of the second pseudo quantization node, and the data type of the second fixed point data corresponds to the data type of the second pseudo quantization node.
In some implementations, prior to step S201, the method may further include the following step S301.
S301: the first floating point weight corresponding to the channel-by-channel convolution layer is quantized into the first fixed point weight by using a third pseudo quantization node.
For example, the first floating point weight (weight 1) corresponding to a convolution kernel of size 3×3 before quantization may be as follows:
filter1=[[-0.0266,-0.0724,-0.0256],[0.0129,0.0158,0.0272],[0.0237,0.0221,0.0399]]。
wherein the data type of the first floating point weight is float.
The quantization step of the third pseudo quantization node (wt quant 1) may refer to the foregoing, and will not be described herein. The data type of the first fixed point weight may be uint8 or uint16, preferably uint8.
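Applying the S and Z formulas from the quantization section to the example weights filter1 above gives a concrete weight quantization. Flattening the 3×3 kernel and targeting uint8 are illustrative choices consistent with the text:

```python
import numpy as np

# filter1 from the text, flattened (float weights of a 3x3 kernel)
w_f = np.array([-0.0266, -0.0724, -0.0256,
                 0.0129,  0.0158,  0.0272,
                 0.0237,  0.0221,  0.0399])

s = (w_f.max() - w_f.min()) / 255  # S = (R_max - R_min)/(Q_max - Q_min), uint8
z = round(255 - w_f.max() / s)     # Z = Q_max - R_max/S
w_q = np.clip(np.round(w_f / s + z), 0, 255).astype(np.uint8)  # first fixed point weight
w_back = s * (w_q.astype(float) - z)  # dequantize to inspect the quantization error

print(w_q)                              # uint8 weights spanning the full 0..255 range
print(np.abs(w_back - w_f).max() <= s)  # True: error within one quantization step
```

Because the nine weights span a narrow range, the per-weight error after round-tripping stays below one quantization step S, which is the small precision loss the text attributes to a well-controlled quantization range.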
Thus, the multiplication of the channel-by-channel convolution layer is fixed-point calculation, and the operation speed of the end-side equipment can be improved.
In some implementations, step S202 may be preceded by the following step S401.
S401: and the fourth pseudo quantization node is used for weighing the second floating point weight corresponding to the first point-by-point convolution layer into a second fixed point weight.
The data type of the second floating point weight (weight 2) may be float, and the data type of the second fixed point weight may be uint8 or uint16. The quantization step of the fourth pseudo quantization node (wt quant 2) may refer to the foregoing, and will not be described herein.
Thus, the multiplication of the first point-by-point convolution layer is fixed-point calculation, and the operation speed of the end-side device can be improved.
Fig. 15 is a schematic flowchart of determining first fixed point data according to an embodiment of the present application, as shown in fig. 15, before step S201, the embodiment of the present application further includes the following steps S401 to S402.
S401: and obtaining floating point data of n channels corresponding to the original input picture.
The original input picture may be a picture including three channels of RGB, and n may be 3.
S402: and quantizing floating point data of n channels corresponding to the original input picture into fixed point numbers to obtain first fixed point data.
The step of quantizing the floating point data of the n channels into fixed point numbers is a preprocessing step before inputting the first fixed point data into the channel-by-channel convolution module. The quantization may be performed by a pseudo quantization node, and specific quantization steps may refer to the foregoing, which is not described herein.
Fig. 16 is a schematic diagram of quantization of a second fixed-point neural network model according to an embodiment of the present application. As shown in fig. 16, the model provided in the embodiment of the present application further includes a second point-by-point convolution module (Pw 2) before the channel-by-channel convolution module and the first point-by-point convolution module, where the second point-by-point convolution module includes a second point-by-point convolution layer (conv 3), a second normalization layer (BN 2), and a second nonlinear activation function (ReLU 2).
Fig. 17 is another flow chart of a method for improving the accuracy of a fixed-point neural network model according to an embodiment of the present application. As shown in fig. 17, prior to step S201, the method further includes the following steps S501-S504.
S501: and obtaining floating point data of k channels corresponding to the original input picture.
The original input picture may be a picture including RGB three channels, and k may be 3. The original input picture is input shown in fig. 16.
In some implementations, the original input picture is applied to the second point-by-point convolution module, so the second point-by-point convolution module may perform an up-dimension operation on the original input picture, where as a result of the up-dimension, the number of channels n of the first fixed point data is greater than or equal to k.
S502: and quantizing floating point data of k channels corresponding to the original input picture into fixed point numbers to obtain the original fixed point data.
The quantization may be performed by a pseudo quantization node, and specific quantization steps may refer to the foregoing, which is not described herein.
In this embodiment of the present application, the data type of the original fixed-point data may be uint8 or uint16.
S503: inputting the original fixed point data into a second point-by-point convolution module, performing point-by-point convolution operation on the original fixed point data by using a second point-by-point convolution layer, performing batch normalization processing on the original fixed point data by using a second batch normalization layer, and mapping the original fixed point data by using a second nonlinear activation function to obtain second floating point data; wherein: the second floating point data comprises n channels of data, the second point-by-point convolution layer comprises n convolution kernels, wherein the length of the n convolution kernels is 1, the width is 1, the height is k, and the n convolution kernels are in one-to-one correspondence with the n channels; the n convolution kernels are used for respectively carrying out convolution operation on the original fixed-point data.
In this embodiment, the number n of convolution kernels of the second point-by-point convolution layer determines the number of output channels of the convolution kernel of the second point-by-point convolution layer, and thereby determines the number of channels of the second floating point data and, in turn, the number of channels of the first fixed point data. Further, the size of the convolution kernel corresponding to the second point-by-point convolution layer provided in the embodiment of the present application may be M×N×1×1, where M is the number of output channels of the convolution kernel, N is the number of input channels of the convolution kernel, and 1×1 is the spatial size of the convolution kernel. Illustratively, the number of convolution kernels of the second point-by-point convolution layer may be 32 and the number of channels of the original fixed point data may be 3. The second point-by-point convolution layer then has a convolution kernel size of 32×3×1×1; that is, the number of output channels of the convolution kernel is 32, the number of input channels of the convolution kernel is 3, and the spatial size of the convolution kernel is 1×1. The actual size of the convolution kernel of the second point-by-point convolution layer may be determined by the actual situation, which is not specifically limited in this application.
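The up-dimension effect of the second point-by-point convolution layer (k = 3 input channels lifted to n = 32 output channels by a 32×3×1×1 kernel) can be sketched as follows; the random data and weights are illustrative assumptions:

```python
import numpy as np

# hypothetical second point-by-point layer: a 32x3x1x1 kernel lifts a
# 3-channel input to 32 channels (the "up-dimension" operation in the text)
x = np.random.rand(3, 5, 5)              # original fixed point data, k = 3 channels
w = np.random.rand(32, 3)                # 32 kernels, each of size 1x1x3
y = np.tensordot(w, x, axes=([1], [0]))  # -> (32, 5, 5)
print(y.shape)  # (32, 5, 5): n = 32 >= k = 3
print(w.size)   # 96 parameters (32*3*1*1)
```

The spatial resolution is untouched; only the channel dimension grows, so the subsequent channel-by-channel convolution module receives first fixed point data with n ≥ k channels as stated in the text.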
Further, the convolution calculation process of the second point-by-point convolution layer satisfies the following formula:
y=∑ax+b;
where x is the input data of the second point-by-point convolution layer (the original fixed-point data), y is the convolution calculation result of the second point-by-point convolution layer, a is the weight of the second point-by-point convolution layer (the third fixed-point weight), and b is the offset of the second point-by-point convolution layer (the third offset).
The process of performing the point-by-point convolution by the second point-by-point convolution layer (conv 3) may refer to the foregoing, and will not be described herein.
In some implementations, the data type of the second floating point data may be float.
Further, the second normalization layer is configured to perform batch normalization processing on the original fixed-point data, and specific normalization steps may refer to the foregoing, which is not described herein.
In the embodiment of the present application, since the second point-by-point convolution layer is a cross-channel convolution operation, and there is cross-channel correction, even if the element values of the corresponding matrix in a single channel are similar or equal, the second normalization layer will not have a variance of 0 when normalization processing is performed on each channel, so that the batch normalization result between the channels has a low degree of dispersion, and therefore, the data obtained after processing by the second point-by-point convolution layer and the second batch normalization layer has a low degree of dispersion. The data is more friendly to quantization, and the quantization range is smaller when the quantization is carried out, so the quantization is more accurate, and the characterization capability of the fixed-point model can be stronger.
It can be understood that, since the second batch normalization layer includes a variance calculation, the calculation result may be a floating point number. The data output by the second point-by-point convolution module is therefore the second floating point data, and the data type of the second floating point data may be float, which is not specifically limited in this application.
Further, the second nonlinear activation function maps the original fixed-point data. This mapping serves as the activation step and enhances the nonlinear expression capability of the model.
In particular, the second nonlinear activation function may be a ReLU function; the function expression of the ReLU function may refer to the foregoing and is not repeated here. Since the second nonlinear activation function may sit in a relatively early layer of the neural network model, or in any layer other than the last, using a ReLU6 function here would limit the output range of that layer and distort the distribution of its output data at an early stage, producing a distribution that is unfriendly to quantization and affecting the characterization capability of the quantized model. Thus, in the embodiment of the present application, the second nonlinear activation function employs a ReLU function. Compared with the ReLU6 function, the ReLU function ensures that the range of the output of the second point-by-point convolution layer is not compressed, so the output can accurately express the information that the second floating point data should express, preserving the representation capability of the model.
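The range-compression argument can be illustrated directly (the activation values are arbitrary):

```python
def relu(x):
    return max(0.0, x)

def relu6(x):
    return min(max(0.0, x), 6.0)  # clips everything above 6

activations = [-2.0, 3.0, 9.5]
kept = [relu(v) for v in activations]      # range above 0 preserved
clipped = [relu6(v) for v in activations]  # values above 6 are compressed to 6
```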
In some implementations, the second point-wise convolutional layer, the second batch of normalized layers, and the second nonlinear activation function may form a fusion operator. The calculation process of the fusion operator can refer to the foregoing, and will not be described herein.
In the embodiment of the application, the fusion operator is utilized for calculation, so that the read-write operation of an intermediate result can be reduced, and the frequency of access operation is reduced.
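One common way to realize such a fusion operator is to fold the batch normalization parameters into the convolution weight and bias, so conv + BN + ReLU runs as a single operation with no intermediate result written out. The sketch below uses a single-channel toy case with made-up parameters and is an illustration, not the patent's exact operator:

```python
def fuse_conv_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    # Fold batch-norm (gamma, beta, running mean/var) into the conv
    # weight w and bias b, collapsing conv + BN into one affine op.
    scale = gamma / (var + eps) ** 0.5
    return w * scale, (b - mean) * scale + beta

def fused_conv_bn_relu(x, w, b):
    # Single fused operator: one multiply-add plus ReLU, no intermediate
    # tensor is stored between the three original layers.
    return max(0.0, w * x + b)

w_f, b_f = fuse_conv_bn(w=2.0, b=1.0, gamma=1.5, beta=0.5, mean=3.0, var=4.0)
y = fused_conv_bn_relu(4.0, w_f, b_f)
# Unfused reference: conv 2*4+1 = 9; BN (9-3)/2*1.5+0.5 = 5.0; ReLU -> 5.0
```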
S504: and quantizing the second floating point data into fixed point numbers by using a fifth pseudo quantization node to obtain the first fixed point data.
The fifth pseudo quantization node (act quant3) is used to quantize the second floating point data into the first fixed point data, so that the first fixed point data can be input into the channel-by-channel convolution module for fixed-point computation. For the specific quantization step, reference is made to the foregoing, and details are not repeated here.
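A pseudo-quantization node of this kind is commonly implemented as a "fake quantize" step: map to an integer grid, clamp, then dequantize. The sketch below assumes an asymmetric uint8 scheme with illustrative scale and zero-point values; it is a generic sketch, not the patent's exact node:

```python
def fake_quantize(x, scale, zero_point, qmin=0, qmax=255):
    q = round(x / scale) + zero_point  # map onto the integer grid
    q = max(qmin, min(qmax, q))        # clamp to the uint8 range
    return (q - zero_point) * scale    # dequantize for downstream use

xq = fake_quantize(0.337, scale=0.01, zero_point=128)  # snaps onto the grid
```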
It is understood that the first fixed point data in this embodiment is feature map (feature map) data.
In some implementations, prior to step S503, embodiments of the present application further include the following step S601.
S601: and the third floating point weight corresponding to the second point-by-point convolution layer is quantized into a third fixed point weight by using a sixth pseudo quantization node.
The sixth pseudo quantization node (wt quant3) is configured to quantize the third floating point weight (weight3) into the third fixed point weight, so that the multiplication in the second point-by-point convolution layer is a fixed-point calculation, which can increase the operation speed of the end-side device. The specific quantization step of the sixth pseudo quantization node may refer to the foregoing and is not repeated here.
In some implementations, the data type of the third setpoint weight may be either uint8 or uint16, preferably uint8.
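The preference for uint8 over uint16 trades precision for memory and speed; the sketch below (hypothetical weight values, a generic asymmetric quantization scheme) compares the worst-case round-trip error at the two bit widths:

```python
def max_quant_error(weights, bits):
    # Asymmetric quantization over the weight range; returns the largest
    # absolute error after a quantize/dequantize round trip.
    levels = 2 ** bits - 1
    scale = (max(weights) - min(weights)) / levels
    zp = round(-min(weights) / scale)
    roundtrip = [(max(0, min(levels, round(w / scale) + zp)) - zp) * scale
                 for w in weights]
    return max(abs(w - r) for w, r in zip(weights, roundtrip))

weights = [-0.8, -0.1, 0.05, 0.6, 1.2]
err8 = max_quant_error(weights, 8)    # coarser grid, larger error
err16 = max_quant_error(weights, 16)  # finer grid, smaller error
```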
It should be noted that, in the model structure determined by the foregoing method, the channel-by-channel convolution module includes neither a batch normalization process nor a nonlinear activation operation, in order to prevent the quantization range from being amplified by outliers. Meanwhile, the first point-by-point convolution module includes a first batch normalization layer and a first nonlinear activation function, which ensures the nonlinear expression capability of the model and yields a better fixed-point effect.
It can be understood that, in the embodiment of the present application, the first point-by-point convolution layer further includes an addition operation with a second offset (bias2). The data type of the value of the second offset may be a floating point number, in which case the addition operation may be performed after quantization; alternatively, the value of the second offset may be 0. This is not specifically limited in this application. For the step of performing the convolution operation in the first point-by-point convolution layer, reference may be made to steps S2011-S2015, which is not specifically limited in this application.
It can be understood that, when performing the convolution operation, the second point-by-point convolution layer in the embodiment of the present application further includes an addition operation with a third offset (bias3). The data type of the value of the third offset may be a floating point number, in which case the addition operation may be performed after quantization; alternatively, the value of the third offset may be 0. This is not specifically limited in this application. For the step of performing the convolution operation in the second point-by-point convolution layer, reference may be made to steps S2011-S2015, which is not specifically limited in this application.
Fig. 18 is a schematic structural diagram of a second fixed-point neural network model according to an embodiment of the present application; based on the method of the embodiment of the present application, the model structure shown in fig. 18 may be obtained. The model includes a point-wise convolution layer (Pointwise Convolution), which in this embodiment is the second point-by-point convolution layer; a batch normalization layer (Batch Normalization), which in this embodiment is the second batch normalization layer; and a nonlinear activation function (ReLU), which in this embodiment is the second nonlinear activation function. The model further includes a channel-wise convolution layer (Depthwise Convolution), which in this embodiment is the channel-by-channel convolution layer, and another point-wise convolution layer (Pointwise Convolution), which in this embodiment is the first point-by-point convolution layer. The model further includes a batch normalization layer (Batch Normalization), which in this embodiment is the first batch normalization layer, and a nonlinear activation function (ReLU), which in this embodiment is the first nonlinear activation function.
As shown in fig. 18, the model includes a second point-by-point convolution layer, a channel-by-channel convolution layer, and a first point-by-point convolution layer. After the channel-by-channel convolution layer, the model includes no batch normalization layer and no nonlinear activation function, so that when the convolution result of the channel-by-channel convolution layer is quantized, the quantization range is not amplified by outliers, ensuring the accuracy of the quantization range. In addition, the second point-by-point convolution module includes a nonlinear activation function, that is, a nonlinear activation layer located before the linear layers, which ensures the nonlinear expression capability of the model.
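The ordering constraint described above can be summarized as a small configuration sketch (layer names are illustrative, not the patent's identifiers): only the point-wise layers carry batch normalization and activation, while the depthwise layer goes straight to quantization.

```python
# Post-processing attached to each layer of the block; the channel-by-channel
# (depthwise) layer deliberately has no BN and no activation, so outliers
# cannot widen its quantization range.
block = [
    ("pointwise_conv_2", ["batch_norm", "relu", "fake_quant"]),
    ("depthwise_conv",   ["fake_quant"]),
    ("pointwise_conv_1", ["batch_norm", "relu", "fake_quant"]),
]
depthwise_ops = dict(block)["depthwise_conv"]
```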
It should be noted that, the model structure provided in the embodiment of the present application may further include other channel-by-channel convolution modules or point-by-point convolution modules besides the second point-by-point convolution module, the channel-by-channel convolution module, and the first point-by-point convolution module, which is not specifically limited in this application.
It should be noted that the model structure determined based on the method of the embodiment of the present application may be applied to a model for a high-level task (for example, a detection or classification task), and may also be applied to a model for a super-resolution task. In particular, when a model of the above type needs to be deployed on the end-side device in the form of a fixed-point model, the model structure may be modified into a model structure determined based on the method of the embodiment of the present application.
Fig. 19 is an application schematic diagram of the fixed-point portrait light filling model provided in the embodiment of the present application. As shown in fig. 19, the fixed-point portrait light filling model is a fixed-point model that simulates lighting for a portrait in an image. In practical application, after the portrait light filling model is deployed on a smart phone, a user photographs with the smart phone to obtain an image (a) to be light-filled. The portrait light filling model then extracts features of the image to obtain a feature map of the portrait, and performs simulated lighting processing on the portrait, such as adding texture features, to improve the visual effect of the portrait in the image.
The fixed-point portrait light filling model is determined based on the above method for improving the accuracy of the fixed-point neural network model. With continued reference to image (b) in fig. 19, the image produced by the portrait light filling model has a good visual effect and accurately reflects the skin texture and color of the portrait; because the light filling range is limited to the portrait and other parts of the image are not lit, the light filling effect is good. Therefore, the method for improving the precision of the fixed-point neural network model provided by the embodiment of the present application yields a fixed-point model with better precision and stronger characterization capability.
FIG. 20 is a schematic diagram of an apparatus for improving the accuracy of a fixed-point neural network model according to an embodiment of the present application, where, as shown in FIG. 20, the embodiment of the present application further provides an apparatus for improving the accuracy of a fixed-point neural network model, where the model includes a channel-by-channel convolution module and a first point-by-point convolution module, the channel-by-channel convolution module includes a channel-by-channel convolution layer, and the first point-by-point convolution module includes a first point-by-point convolution layer, a first normalization layer, and a first nonlinear activation function;
the device comprises:
a first convolution module 701, configured to input first fixed-point data into a channel-by-channel convolution module, and perform a channel-by-channel convolution operation on the first fixed-point data by using a channel-by-channel convolution layer to obtain second fixed-point data; wherein: the first fixed point data and the second fixed point data both comprise data of n channels, and n is greater than or equal to 1; the channel-by-channel convolution layer comprises n convolution kernels, and the n convolution kernels are in one-to-one correspondence with the n channels; the n convolution kernels are used for respectively carrying out convolution operation with the data of the corresponding channels in the first fixed-point data to obtain the data of n channels in the second fixed-point data;
The second convolution module 702 is configured to input second fixed-point data into the first point-by-point convolution module, perform a point-by-point convolution operation on the second fixed-point data using the first point-by-point convolution layer, normalize the second fixed-point data using the first normalization layer, and map the second fixed-point data using the first nonlinear activation function to obtain first floating-point data; wherein: the first floating point data comprises data of m channels, wherein m is greater than or equal to 1; the first point-by-point convolution layer comprises m convolution kernels, wherein the m convolution kernels are 1 in length, 1 in width and n in height, and are in one-to-one correspondence with m channels; the m convolution kernels are used for respectively carrying out convolution operation on the second fixed-point data;
the quantization module 703 is configured to quantize the first floating point data into a fixed point number by using the first pseudo quantization node, and obtain third fixed point data.
According to the technical scheme above, an apparatus for improving the precision of a fixed-point neural network model is provided. In the apparatus, after the first fixed-point data is processed by the channel-by-channel convolution layer, the first point-by-point convolution layer, the first batch normalization layer, and the first nonlinear activation function, the first floating point data is obtained. The apparatus can control the degree of dispersion of the first floating point data, thereby reducing the precision loss of the quantized model and improving the expression capability of the model.
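The one-kernel-per-channel behavior of the first convolution module can be sketched as follows (a trivial one-element kernel per channel, with illustrative values):

```python
def depthwise_conv(channels, kernels):
    # n kernels correspond one-to-one with n channels: output channel i
    # depends only on input channel i (no cross-channel mixing).
    assert len(channels) == len(kernels)
    return [k * c for k, c in zip(kernels, channels)]

out = depthwise_conv([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])  # per-channel scaling
```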
Embodiments of the present application also provide a computer-readable storage medium having stored therein program instructions that, when executed on a computer, cause the computer to perform the methods of the above aspects and implementations thereof.
The embodiment of the application also provides electronic equipment, which comprises: a processor and a memory; the memory stores program instructions that, when executed by the processor, cause the electronic device to perform the method of improving the accuracy of the fixed-point neural network model in the above aspects and implementations thereof.
The foregoing detailed description of the embodiments of the present application has further described the objects, technical solutions and advantageous effects thereof, and it should be understood that the foregoing is merely a specific implementation of the embodiments of the present application, and is not intended to limit the scope of the embodiments of the present application, and any modifications, equivalent substitutions, improvements, etc. made on the basis of the technical solutions of the embodiments of the present application should be included in the scope of the embodiments of the present application.

Claims (9)

1. The method for improving the precision of the fixed-point neural network model is characterized in that the model comprises a channel-by-channel convolution module and a first point-by-point convolution module, wherein the channel-by-channel convolution module comprises a channel-by-channel convolution layer, and the first point-by-point convolution module comprises a first point-by-point convolution layer, a first batch of normalization layers and a first nonlinear activation function;
The method comprises the following steps:
acquiring floating point data of n channels corresponding to an original input picture;
the floating point data of n channels corresponding to the original input picture are quantized into fixed-point numbers, and first fixed-point data are obtained;
inputting the first fixed point data into the channel-by-channel convolution module, and performing channel-by-channel convolution operation on the first fixed point data by utilizing the channel-by-channel convolution layer to obtain second fixed point data; wherein: the first fixed point data and the second fixed point data both comprise data of n channels, and n is greater than or equal to 1; the channel-by-channel convolution layer comprises n convolution kernels, and the n convolution kernels are in one-to-one correspondence with the n channels; the n convolution kernels are used for respectively carrying out convolution operation with the data of the corresponding channels in the first fixed-point data to obtain the data of n channels in the second fixed-point data;
the step of performing a channel-by-channel convolution operation on the first fixed point data by using a channel-by-channel convolution layer to obtain second fixed point data includes:
converting the first fixed point data from low-bit data to high-bit data by using a second pseudo quantization node to obtain first sub-data; the data type of the first sub-data is fixed point number; the data type of the low-bit data is uint8 or uint16, and the data type of the high-bit data is uint32;
multiplying the target position of the first fixed point data by the first fixed point weight of the convolution kernel of the channel-by-channel convolution layer to obtain a plurality of second sub-data; the data type of the second sub-data is fixed point number;
quantizing the second sub data from high-bit data to low-bit data by using the second pseudo quantization node to obtain the second fixed-point data;
inputting the second fixed point data into the first point-by-point convolution module, performing point-by-point convolution operation on the second fixed point data by using the first point-by-point convolution layer, performing normalization processing on the second fixed point data by using the first normalization layer, and mapping the second fixed point data by using the first nonlinear activation function to obtain first floating point data; wherein: the first floating point data comprises data of m channels, and m is greater than or equal to 1; the first point-by-point convolution layer comprises m convolution kernels, wherein the m convolution kernels are 1 in length, 1 in width and n in height, and the m convolution kernels are in one-to-one correspondence with the m channels; the m convolution kernels are used for respectively carrying out convolution operation on the second fixed-point data;
And quantizing the first floating point data into fixed point numbers by using a first pseudo quantization node to obtain third fixed point data.
2. The method for improving the accuracy of a fixed-point neural network model according to claim 1, wherein the step of multiplying the target position of the first fixed point data by the first fixed point weight of the convolution kernel of the channel-by-channel convolution layer to obtain a plurality of second sub-data further comprises:
adding each second sub data with a first bias to obtain a plurality of third sub data, wherein the data type of the first bias is a fixed point number;
and quantizing all the third sub-data into low-bit data from high-bit data by using the second pseudo quantization node to obtain the second fixed-point data.
3. The method for improving the accuracy of a fixed-point neural network model according to claim 2, wherein before the step of performing a channel-by-channel convolution operation on the first fixed-point data by using the channel-by-channel convolution layer to obtain the second fixed-point data, the method further comprises:
and the first floating point weight corresponding to the channel-by-channel convolution layer is quantized into the first fixed point weight by using a third pseudo quantization node.
4. The method for improving the accuracy of a fixed point neural network model according to claim 1, further comprising, before the step of performing a point-wise convolution operation on the second fixed point data using the first point-wise convolution layer:
quantizing the second floating point weight corresponding to the first point-by-point convolution layer into a second fixed point weight by using a fourth pseudo quantization node.
5. The method of claim 1, wherein the model further comprises a second point-wise convolution module comprising a second point-wise convolution layer, a second batch normalization layer, and a second nonlinear activation function;
before the step of performing a channel-by-channel convolution operation on the first fixed-point data by using the channel-by-channel convolution layer to obtain second fixed-point data, the method further includes:
obtaining floating point data of k channels corresponding to the original input picture;
the floating point data of k channels corresponding to the original input picture are quantized into fixed point numbers, and original fixed point data are obtained;
inputting the original fixed point data into the second point-by-point convolution module, performing point-by-point convolution operation on the original fixed point data by using a second point-by-point convolution layer, performing batch normalization processing on the original fixed point data by using the second batch normalization layer, and mapping the original fixed point data by using the second nonlinear activation function to obtain second floating point data; wherein: the second floating point data comprises data of n channels, and the second point-by-point convolution layer comprises n convolution kernels, wherein the length of the n convolution kernels is 1, the width of the n convolution kernels is 1, the height of the n convolution kernels is k, and the n convolution kernels are in one-to-one correspondence with the n channels; the n convolution kernels are used for respectively carrying out convolution operation on the original fixed-point data;
And quantizing the second floating point data into fixed point numbers by using a fifth pseudo quantization node to obtain the first fixed point data.
6. The method for improving the accuracy of a fixed point neural network model according to claim 5, further comprising, before the step of performing a point-wise convolution operation on the original fixed point data using a second point-wise convolution layer:
and the third floating point weight corresponding to the second point-by-point convolution layer is quantized into a third fixed point weight by using a sixth pseudo quantization node.
7. The method of claim 5, wherein the first nonlinear activation function and the second nonlinear activation function are ReLU functions.
8. The device for improving the precision of the fixed-point neural network model is characterized in that the model comprises a channel-by-channel convolution module and a first point-by-point convolution module, wherein the channel-by-channel convolution module comprises a channel-by-channel convolution layer, and the first point-by-point convolution module comprises a first point-by-point convolution layer, a first batch of normalization layers and a first nonlinear activation function;
the device comprises:
the quantization module is used for acquiring floating point data of n channels corresponding to the original input picture; the floating point data of n channels corresponding to the original input picture are quantized into fixed-point numbers, and first fixed-point data are obtained;
The first convolution module is used for inputting the first fixed point data into the channel-by-channel convolution module, and performing channel-by-channel convolution operation on the first fixed point data by utilizing the channel-by-channel convolution layer to obtain second fixed point data; wherein: the first fixed point data and the second fixed point data both comprise data of n channels, and n is greater than or equal to 1; the channel-by-channel convolution layer comprises n convolution kernels, and the n convolution kernels are in one-to-one correspondence with the n channels; the n convolution kernels are used for respectively carrying out convolution operation with the data of the corresponding channels in the first fixed-point data to obtain the data of n channels in the second fixed-point data;
the first convolution module is specifically configured to convert the first fixed point data from low-bit data to high-bit data by using a second pseudo quantization node to obtain first sub-data; the data type of the first sub-data is fixed point number; the data type of the low-bit data is uint8 or uint16, and the data type of the high-bit data is uint32;
multiply the target position of the first fixed point data by the first fixed point weight of the convolution kernel of the channel-by-channel convolution layer to obtain a plurality of second sub-data; the data type of the second sub-data is fixed point number;
Quantizing the second sub data from high-bit data to low-bit data by using the second pseudo quantization node to obtain the second fixed-point data;
the second convolution module is used for inputting the second fixed point data into the first point-by-point convolution module, performing point-by-point convolution operation on the second fixed point data by using the first point-by-point convolution layer, normalizing the second fixed point data by using the first normalization layer, and mapping the second fixed point data by using the first nonlinear activation function to obtain first floating point data; wherein: the first floating point data comprises data of m channels, and m is greater than or equal to 1; the first point-by-point convolution layer comprises m convolution kernels, wherein the m convolution kernels are 1 in length, 1 in width and n in height, and the m convolution kernels are in one-to-one correspondence with the m channels; the m convolution kernels are used for respectively carrying out convolution operation on the second fixed-point data;
the quantization module is further configured to quantize the first floating point data into fixed point numbers by using a first pseudo quantization node, and obtain third fixed point data.
9. An electronic device, comprising: a processor and a memory; the memory stores program instructions that, when executed by the processor, cause the electronic device to perform the method of improving the accuracy of a fixed-point neural network model of any of claims 1-7.
CN202211138548.3A 2022-09-19 2022-09-19 Method and device for improving fixed-point neural network model precision and electronic equipment Active CN116720563B (en)

Publications (2)

Publication Number Publication Date
CN116720563A CN116720563A (en) 2023-09-08
CN116720563B true CN116720563B (en) 2024-03-29

Family

ID=87870300

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211138548.3A Active CN116720563B (en) 2022-09-19 2022-09-19 Method and device for improving fixed-point neural network model precision and electronic equipment

Country Status (1)

Country Link
CN (1) CN116720563B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117077740B (en) * 2023-09-25 2024-03-12 荣耀终端有限公司 Model quantization method and device

Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107844828A (en) * 2017-12-18 2018-03-27 北京地平线信息技术有限公司 Convolutional calculation method and electronic equipment in neutral net
CN110929838A (en) * 2018-09-19 2020-03-27 杭州海康威视数字技术股份有限公司 Bit width localization method, device, terminal and storage medium in neural network
CN111344719A (en) * 2019-07-22 2020-06-26 深圳市大疆创新科技有限公司 Data processing method and device based on deep neural network and mobile device
CN111612147A (en) * 2020-06-30 2020-09-01 上海富瀚微电子股份有限公司 Quantization method of deep convolutional network
CN111695671A (en) * 2019-03-12 2020-09-22 北京地平线机器人技术研发有限公司 Method and device for training neural network and electronic equipment
CN111783974A (en) * 2020-08-12 2020-10-16 成都佳华物链云科技有限公司 Model construction and image processing method and device, hardware platform and storage medium
CN111832719A (en) * 2020-07-28 2020-10-27 电子科技大学 Fixed point quantization convolution neural network accelerator calculation circuit
CN112861602A (en) * 2020-12-10 2021-05-28 华南理工大学 Face living body recognition model compression and transplantation method based on depth separable convolution
CN113344188A (en) * 2021-06-18 2021-09-03 东南大学 Lightweight neural network model based on channel attention module
CN113408715A (en) * 2020-03-17 2021-09-17 杭州海康威视数字技术股份有限公司 Fixed-point method and device for neural network
CN113780523A (en) * 2021-08-27 2021-12-10 深圳云天励飞技术股份有限公司 Image processing method, image processing device, terminal equipment and storage medium
WO2022006919A1 (en) * 2020-07-10 2022-01-13 中国科学院自动化研究所 Activation fixed-point fitting-based method and system for post-training quantization of convolutional neural network
CN114091655A (en) * 2021-11-17 2022-02-25 上海瑾盛通信科技有限公司 Neural network quantization method, device, storage medium and terminal
CN114418121A (en) * 2022-01-25 2022-04-29 Oppo广东移动通信有限公司 Model training method, object processing method and device, electronic device and medium
CN114444667A (en) * 2022-02-09 2022-05-06 北京地平线信息技术有限公司 Method and device for training neural network and electronic equipment
WO2022111617A1 (en) * 2020-11-30 2022-06-02 华为技术有限公司 Model training method and apparatus
CN114692818A (en) * 2020-12-31 2022-07-01 合肥君正科技有限公司 Method for quantitatively improving model precision by low bit mixed precision

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102601604B1 (en) * 2017-08-04 2023-11-13 삼성전자주식회사 Method and apparatus for quantizing parameter of neural network
KR102564456B1 (en) * 2017-10-19 2023-08-07 삼성전자주식회사 Method and apparatus for quantizing parameter of neural network
US20200175351A1 (en) * 2018-11-30 2020-06-04 Robert Bosch Gmbh Bit interpretation for convolutional neural network input layer
GB2580171B (en) * 2018-12-21 2021-02-17 Imagination Tech Ltd Methods and systems for selecting quantisation parameters for deep neural networks using back-propagation

Patent Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107844828A (en) * 2017-12-18 2018-03-27 Beijing Horizon Information Technology Co., Ltd. Convolutional calculation method in neural network and electronic equipment
CN110929838A (en) * 2018-09-19 2020-03-27 Hangzhou Hikvision Digital Technology Co., Ltd. Bit-width fixed-point conversion method, device, terminal and storage medium in neural network
CN111695671A (en) * 2019-03-12 2020-09-22 Beijing Horizon Robotics Technology R&D Co., Ltd. Method and device for training neural network and electronic equipment
CN111344719A (en) * 2019-07-22 2020-06-26 SZ DJI Technology Co., Ltd. Data processing method and device based on deep neural network and mobile device
CN113408715A (en) * 2020-03-17 2021-09-17 Hangzhou Hikvision Digital Technology Co., Ltd. Fixed-point method and device for neural network
CN111612147A (en) * 2020-06-30 2020-09-01 Shanghai Fullhan Microelectronics Co., Ltd. Quantization method of deep convolutional network
WO2022006919A1 (en) * 2020-07-10 2022-01-13 Institute of Automation, Chinese Academy of Sciences Activation fixed-point fitting-based method and system for post-training quantization of convolutional neural network
CN111832719A (en) * 2020-07-28 2020-10-27 University of Electronic Science and Technology of China Fixed-point quantized convolutional neural network accelerator computation circuit
CN111783974A (en) * 2020-08-12 2020-10-16 Chengdu Jiahua Chain Cloud Technology Co., Ltd. Model construction and image processing method and device, hardware platform and storage medium
WO2022111617A1 (en) * 2020-11-30 2022-06-02 Huawei Technologies Co., Ltd. Model training method and apparatus
CN114595799A (en) * 2020-11-30 2022-06-07 Huawei Technologies Co., Ltd. Model training method and device
CN112861602A (en) * 2020-12-10 2021-05-28 South China University of Technology Face liveness recognition model compression and transplantation method based on depthwise separable convolution
CN114692818A (en) * 2020-12-31 2022-07-01 Hefei Ingenic Technology Co., Ltd. Method for improving model precision by low-bit mixed-precision quantization
CN113344188A (en) * 2021-06-18 2021-09-03 Southeast University Lightweight neural network model based on channel attention module
CN113780523A (en) * 2021-08-27 2021-12-10 Shenzhen Intellifusion Technologies Co., Ltd. Image processing method, image processing device, terminal equipment and storage medium
CN114091655A (en) * 2021-11-17 2022-02-25 Shanghai Jinsheng Communication Technology Co., Ltd. Neural network quantization method, device, storage medium and terminal
CN114418121A (en) * 2022-01-25 2022-04-29 Guangdong OPPO Mobile Telecommunications Corp., Ltd. Model training method, object processing method and device, electronic device and medium
CN114444667A (en) * 2022-02-09 2022-05-06 Beijing Horizon Information Technology Co., Ltd. Method and device for training neural network and electronic equipment

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference; Benoit Jacob et al.; arXiv:1712.05877v1 [cs.LG]; 1-14 *
Quantizing deep convolutional networks for efficient inference: A whitepaper; Raghuraman Krishnamoorthi; arXiv:1806.08342v1 [cs.LG]; 1-36 *
Research on fixed-point implementation of convolutional neural networks; Chen Junbao, Fang Xiangzhong; Information Technology (No. 07); 102-104+110 *
Quantization retraining method for convolutional neural networks based on dynamic mapping; Li Zesong, Dong Guangyu; Information Technology (No. 07); 92-99 *
Research on lightweight SSD network design for object detection; Feng Ye, Zhang Suofei, Wu Xiaofu; Journal of Signal Processing (No. 05); 128-134 *

Also Published As

Publication number Publication date
CN116720563A (en) 2023-09-08

Similar Documents

Publication Publication Date Title
CN111724293B (en) Image rendering method and device and electronic equipment
CN113132620B (en) Image shooting method and related device
US20230042397A1 (en) Machine learning model search method, related apparatus, and device
CN113538273B (en) Image processing method and image processing apparatus
CN113542580B (en) Method and device for removing light spots of glasses and electronic equipment
US20220245778A1 (en) Image bloom processing method and apparatus, and storage medium
CN116720563B (en) Method and device for improving fixed-point neural network model precision and electronic equipment
CN113810603A (en) Point light source image detection method and electronic equipment
CN112087649A (en) Equipment searching method and electronic equipment
EP4261675A1 (en) Screen display method and related apparatus
CN112308202A (en) Method for determining decision factors of convolutional neural network and electronic equipment
CN113810622B (en) Image processing method and device
CN116263971A (en) Image frame prediction method, electronic device, and computer-readable storage medium
CN114363482B (en) Method for determining calibration image and electronic equipment
CN116664630B (en) Image processing method and electronic equipment
CN115600653B (en) Neural network model deployment method and device
CN116993619B (en) Image processing method and related equipment
CN116703741B (en) Image contrast generation method and device and electronic equipment
CN114942741B (en) Data transmission method and electronic equipment
CN115760652B (en) Method for expanding dynamic range of image and electronic equipment
CN116048769B (en) Memory recycling method and device and terminal equipment
CN116074624B (en) Focusing method and device
CN117523077A (en) Virtual image generation method and device
CN115619628A (en) Image processing method and terminal device
CN117274663A (en) Target detection method and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant