CN113962370A - Fixed-point processing method and device for convolutional neural network and storage medium

Info

Publication number: CN113962370A
Application number: CN202111566030.5A
Authority: CN
Inventors: 郑伟; 杨广; 刘国清
Original assignee: Shenzhen Minieye Innovation Technology Co Ltd
Current assignee: Shenzhen Minieye Innovation Technology Co Ltd
Priority date: 2021-12-21
Filing date: 2021-12-21
Publication date: 2022-01-21

Abstract

The invention discloses a fixed-point processing method and device for a convolutional neural network and a storage medium, relates to the technical field of data processing, and solves the problem of reduced precision of the convolutional neural network after fixed-point processing. The method comprises the following steps: acquiring operation parameters and input parameters of a target convolutional layer, wherein when the target convolutional layer is the first layer, the input parameters are feature parameters of a multimedia resource input into the convolutional neural network, and when the target convolutional layer is a layer other than the first layer, the input parameters are the output result of the convolutional layer immediately preceding the target convolutional layer; performing fixed-point processing on the operation parameters to obtain a first fixed point number; determining the floating point number corresponding to the first fixed point number; determining a residual error according to the operation parameters and the floating point number; performing fixed-point processing on the residual error to obtain a second fixed point number; processing the input parameters by using the operation rule of the target convolutional layer and the first fixed point number to obtain a first result, and processing the input parameters by using the operation rule and the second fixed point number to obtain a second result; and determining an output result according to the first result and the second result.

Description

Fixed-point processing method and device for convolutional neural network and storage medium

Technical Field

The present invention relates to the field of data processing technologies, and in particular, to a fixed-point processing method and apparatus for a convolutional neural network, and a storage medium.

Background

A Convolutional Neural Network (CNN) is a network in which several convolutional layers are cascaded. It is an important network form in the currently popular field of deep learning and is widely applied to image classification, image detection, image segmentation and similar tasks in the field of image recognition.

In the prior art, the computation involved in a convolutional neural network is floating point arithmetic on numbers with fractional parts, and the convolutional neural network is usually deployed in low-power embedded hardware, so the embedded hardware processes the convolutional neural network inefficiently. To improve the processing efficiency of the embedded hardware and accelerate the network on that hardware, the convolutional neural network can be converted to fixed point; that is, the input parameters, the operation parameters and the output results of each convolutional layer are all expressed as integers (fixed point numbers). The network then processes data using integer computation only. Finally, the output results (integers) of the convolutional neural network may be restored to floating point numbers.

However, representing the parameters of the convolutional neural network (i.e., floating point numbers) as fixed point numbers involves rounding, which causes a loss of precision of the convolutional neural network. Therefore, how to improve the precision of a convolutional neural network after fixed-point processing is a difficult problem in the industry.

Disclosure of Invention

The invention provides a fixed-point processing method and device of a convolutional neural network and a storage medium, and solves the technical problem of reduced precision of the convolutional neural network after fixed-point processing.

In order to achieve the purpose, the invention adopts the following technical scheme:

In a first aspect, the present invention provides a fixed-point processing method for a convolutional neural network, where the convolutional neural network includes N cascaded convolutional layers, N being a positive integer, and the method includes:

acquiring operation parameters and input parameters of a target convolutional layer, wherein when the target convolutional layer is the first layer of the N convolutional layers, the input parameters are feature parameters of a multimedia resource input into the convolutional neural network; when the target convolutional layer is a layer other than the first layer of the N convolutional layers, the input parameters are the output result of the convolutional layer immediately preceding the target convolutional layer;

performing fixed-point processing on the operation parameter by adopting a first preset algorithm to obtain a first fixed-point number corresponding to the operation parameter;

determining floating point numbers corresponding to the first fixed point number by adopting a second preset algorithm corresponding to the first preset algorithm;

determining a residual error according to the operation parameter and the floating point number;

performing fixed-point processing on the residual error by adopting a first preset algorithm to obtain a second fixed-point number corresponding to the residual error;

processing the input parameters by using the operation rule of the target convolution layer and the first fixed point number to obtain a first result, and processing the input parameters by using the operation rule and the second fixed point number to obtain a second result;

and determining an output result of the target convolutional layer according to the first result and the second result.

The embodiment of the invention provides a fixed-point processing method and device for a convolutional neural network and a storage medium. After an operation parameter of a target convolutional layer is converted to a first fixed point number, the operation parameter can be decomposed into the floating point number corresponding to the first fixed point number and a residual error; that is, the operation parameter is the sum of that floating point number and the residual error. The residual error is then converted to fixed point to obtain a second fixed point number. The input parameters are processed with the first fixed point number and with the second fixed point number, each according to the operation rule of the target convolutional layer, to obtain a first result and a second result, and the output result of the target convolutional layer is obtained from the first result and the second result. Because the floating point number is determined from the first fixed point number by the second preset algorithm corresponding to the first preset algorithm, the floating point number is equivalent to the first fixed point number and its quantization can be regarded as lossless, so the quantization error of the operation parameter comes entirely from the residual error. Because the distribution range of the residual error is far smaller than that of the operation parameter, the error produced by converting the residual error to fixed point is far smaller than the error produced by converting the operation parameter itself, and the error of the output result of this method is therefore smaller than that of the prior art.

In one possible implementation, the method further includes:

processing characteristic parameters of the multimedia resources by adopting a convolutional neural network for multiple times, and performing fixed-point processing on operation parameters of the ith convolutional layer every time to obtain a plurality of first target output results, wherein i is an integer which is greater than or equal to 1 and less than or equal to N, and the first target output results are used for indicating identification information of the multimedia resources;

determining a first precision according to the plurality of first target output results;

and if the first precision meets a preset condition, determining the ith convolutional layer as a target convolutional layer.

In one possible implementation, if the first precision satisfies a preset condition, determining the ith convolutional layer as a target convolutional layer includes:

and if the first precision is smaller than a first preset threshold value, determining the ith convolutional layer as the target convolutional layer.

In one possible implementation, the method further includes:

processing the characteristic parameters of the multimedia resources by adopting the convolutional neural network for multiple times to obtain a plurality of second target output results, wherein the second target output results are used for indicating the identification information of the multimedia resources;

determining a second precision according to a plurality of second target output results;

wherein, if the first precision satisfies the preset condition, determining the ith convolutional layer as the target convolutional layer comprises:

and if the difference value between the second precision and the first precision is larger than a second preset threshold value, determining the ith convolutional layer as the target convolutional layer.
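The two selection criteria above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the function names, the dictionary layout and all numeric values are hypothetical, precision is assumed to be the fraction of correct identification results, and the second precision is assumed to come from runs in which layer i is not converted to fixed point.

```python
def precision(outputs, labels):
    """Fraction of target output results that match the expected labels (assumed metric)."""
    correct = sum(1 for o, y in zip(outputs, labels) if o == y)
    return correct / len(labels)

def select_by_threshold(first_precision, first_threshold):
    """Layers whose first precision is below the first preset threshold."""
    return sorted(i for i, p in first_precision.items() if p < first_threshold)

def select_by_drop(first_precision, second_precision, second_threshold):
    """Layers whose drop relative to the second precision exceeds the second preset threshold."""
    return sorted(i for i, p in first_precision.items()
                  if second_precision - p > second_threshold)

# Made-up first precisions keyed by layer index i (1..N),
# each measured with only layer i converted to fixed point.
first_precision = {1: 0.91, 2: 0.84, 3: 0.93, 4: 0.79}
second_precision = 0.94  # made-up precision of the reference runs

targets_a = select_by_threshold(first_precision, first_threshold=0.85)
targets_b = select_by_drop(first_precision, second_precision, second_threshold=0.05)
print(targets_a, targets_b)
```

With these made-up numbers both criteria pick the same layers; in general they can differ, since the second criterion depends on the reference precision.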

In a possible implementation manner, performing fixed-point processing on the operation parameter by using a first preset algorithm to obtain a first fixed-point number corresponding to the operation parameter, including:

and if the operation parameter is a floating point number, performing fixed point processing on the operation parameter by adopting a first preset algorithm to obtain a first fixed point number.

In a second aspect, the present invention provides a fixed-point processing apparatus for a convolutional neural network, the convolutional neural network including N cascaded convolutional layers, where N is a positive integer, the fixed-point processing apparatus including:

the acquiring unit is used for acquiring the operation parameters and the input parameters of the target convolutional layer, wherein when the target convolutional layer is the first layer of the N convolutional layers, the input parameters are feature parameters of a multimedia resource input into the convolutional neural network; when the target convolutional layer is a layer other than the first layer of the N convolutional layers, the input parameters are the output result of the convolutional layer immediately preceding the target convolutional layer;

the processing unit is used for performing fixed-point processing on the operation parameters acquired by the acquisition unit by adopting a first preset algorithm to obtain first fixed-point numbers corresponding to the operation parameters;

the determining unit is used for determining floating point numbers corresponding to the first fixed point numbers obtained by the processing unit by adopting a second preset algorithm corresponding to the first preset algorithm; determining a residual error according to the operation parameters and the floating point number acquired by the acquisition unit;

the processing unit is also used for performing fixed-point processing on the residual error determined by the determining unit by adopting a first preset algorithm to obtain a second fixed-point number corresponding to the residual error; processing the input parameters by using the operation rule of the target convolutional layer and the first fixed point number obtained by the processing unit to obtain a first result, and processing the input parameters by using the operation rule and the second fixed point number obtained by the processing unit to obtain a second result;

and the determining unit is also used for determining the output result of the target convolutional layer according to the first result and the second result obtained by the processing unit.

In a possible implementation manner, the processing unit is further configured to process feature parameters of the multimedia resource by using a convolutional neural network for multiple times, and perform fixed-point processing on the operation parameter of the ith convolutional layer each time to obtain multiple first target output results, where i is an integer greater than or equal to 1 and less than or equal to N, and the first target output results are used for indicating identification information of the multimedia resource;

the determining unit is further used for determining first precision according to a plurality of first target output results obtained by the processing unit; and if the first precision meets a preset condition, determining the ith convolutional layer as a target convolutional layer.

In a possible implementation manner, the determining unit is specifically configured to determine the ith convolutional layer as the target convolutional layer if the first precision is smaller than a first preset threshold.

In a possible implementation manner, the processing unit is further configured to process the feature parameters of the multimedia resource by using the convolutional neural network for multiple times to obtain multiple second target output results, where the second target output results are used to indicate identification information of the multimedia resource;

the determining unit is further used for determining second precision according to a plurality of second target output results obtained by the processing unit;

and the determining unit is specifically used for determining the ith convolutional layer as the target convolutional layer if the difference value between the second precision and the first precision is greater than a second preset threshold value.

In a possible implementation manner, the processing unit is specifically configured to perform fixed-point processing on the operation parameter by using a first preset algorithm to obtain a first fixed-point number if the operation parameter acquired by the acquisition unit is a floating-point number.

In a third aspect, the present invention provides a fixed-point processing apparatus for a convolutional neural network, the apparatus comprising a processor and a memory, the memory storing computer instructions executable by the processor, and the processor being configured to execute the computer instructions to implement the fixed-point processing method for a convolutional neural network according to the first aspect or any one of its possible implementations.

In a fourth aspect, the present invention provides a computer-readable storage medium in which computer instructions are stored; when the computer instructions are run on a fixed-point processing apparatus of a convolutional neural network, the apparatus executes the fixed-point processing method for a convolutional neural network according to the first aspect or any one of its possible implementations.

In a fifth aspect, the present invention provides a computer program product containing instructions which, when run on a computer, cause the computer to perform the fixed-point processing method for a convolutional neural network according to the first aspect or any one of its possible implementations.

Reference may be made to the detailed description of the first aspect and various implementations thereof for specific descriptions of the second to fifth aspects and various implementations thereof in the present disclosure; moreover, the beneficial effects of the second aspect to the fifth aspect and the various implementation manners thereof may refer to the beneficial effect analysis of the first aspect and the various implementation manners thereof, and are not described herein again.

Drawings

Fig. 1 is a schematic structural diagram of a fixed-point processing apparatus of a convolutional neural network according to an embodiment of the present invention;

fig. 2 is a schematic flow chart of a fixed-point processing method of a convolutional neural network according to an embodiment of the present invention;

fig. 3 is a second schematic flowchart of a method for performing a fixed-point processing on a convolutional neural network according to an embodiment of the present invention;

fig. 4 is a third schematic flowchart of a fixed-point processing method of a convolutional neural network according to an embodiment of the present invention;

fig. 5 is a fourth schematic flowchart of a fixed-point processing method of a convolutional neural network according to an embodiment of the present invention;

fig. 6 is a fifth schematic flowchart of a fixed-point processing method of a convolutional neural network according to an embodiment of the present invention;

fig. 7 is a schematic composition diagram of a fixed-point processing apparatus of a convolutional neural network according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In the following, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the embodiments of the present disclosure, "a plurality" means two or more unless otherwise specified. Additionally, the use of "based on" or "according to" is open and inclusive: a process, step, calculation, or other action that is "based on" or "according to" one or more stated conditions or values may in practice be based on additional conditions or on values beyond those stated.

At present, the calculation involved in a convolutional neural network is floating point arithmetic on numbers with fractional parts, and its computational complexity is high. Convolutional neural networks are generally deployed in electronic equipment that includes a low-power embedded chip; such a chip runs the floating point operations of the convolutional neural network inefficiently but runs fixed point operations efficiently. Therefore, before the convolutional neural network is run on the electronic equipment, the parameters involved in the convolutional neural network need to be converted to fixed point.

In the related art, a rounding operation is performed in the process of representing the parameter (i.e., floating point number) of the convolutional neural network as a fixed point number, which causes a loss of precision of the convolutional neural network.

In order to solve the above technical problem, embodiments of the present invention provide a fixed-point processing method and apparatus for a convolutional neural network, and a storage medium. After an operation parameter of a target convolutional layer is converted to a first fixed point number, the operation parameter can be decomposed into the floating point number corresponding to the first fixed point number and a residual error; that is, the operation parameter is the sum of that floating point number and the residual error. The residual error is then converted to fixed point to obtain a second fixed point number. The input parameters are processed with the first fixed point number and with the second fixed point number, each according to the operation rule of the target convolutional layer, to obtain a first result and a second result, and the output result of the target convolutional layer is obtained from the first result and the second result. Because the floating point number is determined from the first fixed point number by the second preset algorithm corresponding to the first preset algorithm, the floating point number is equivalent to the first fixed point number and its quantization can be regarded as lossless, so the quantization error of the operation parameter comes entirely from the residual error. Because the distribution range of the residual error is far smaller than that of the operation parameter, the error produced by converting the residual error to fixed point is far smaller than the error produced by converting the operation parameter itself, and the error of the output result of this method is therefore smaller than that of the prior art.
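Assuming the operation rule of the target convolutional layer is linear in the operation parameters, as a convolution is, the decomposition described above can be written as follows. The symbols are introduced here only for illustration: $x$ is the input, $a$ the operation parameter, $a'$ the floating point number recovered from the first fixed point number, $r$ the residual error, and $\hat{r}$ the floating point value recovered from the second fixed point number.

```latex
\begin{aligned}
a &= a' + r, \\
x * a &= x * a' + x * r \;\approx\; x * a' + x * \hat{r}, \\
\text{error} &= x * (r - \hat{r}).
\end{aligned}
```

Under this assumption the only remaining error is the one made when the residual error itself is converted to fixed point.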

For convenience of understanding, terms or nouns referred to in the embodiments of the present invention will be described first.

1. Convolutional layer: the convolutional neural network comprises a plurality of convolutional layers, and each convolutional layer consists of a plurality of convolution units. The purpose of the convolution operation is to extract different features of the input parameters. The first convolutional layer can only extract low-level features such as edges, lines and corners, while a network with more layers can iteratively extract more complex features from the low-level features.

2. Fixed-point quantization: a floating point number is a type of data that contains both an integer part and a fractional part and whose decimal point position is not fixed. A fixed point number is a binary number in which the position of the decimal point is fixed and remains unchanged. The process of converting a floating point number to a fixed point number is called fixed-point quantization.

For example, the fixed point number may be an eight-bit fixed point number, a sixteen-bit fixed point number, or a fixed point number with another bit width, which is not limited in the embodiment of the present invention. When floating point numbers are to be converted to eight-bit fixed point numbers, integers in the range [-128, 127] may be used to represent them: the floating point number is multiplied by a specified multiple and then rounded to an integer in [-128, 127], and that integer is the fixed-point result of the floating point number.

3. Lossless fixed-point quantization: in practical applications, when a floating point number is converted into a fixed point number and the fractional part affected by the conversion is less than 0.5, the quantization is generally regarded as lossless.
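The eight-bit conversion described under fixed-point quantization above can be illustrated with a short sketch. The function names and the multiple 64 are hypothetical choices for the example; clamping out-of-range values to [-128, 127] is also an assumption, since the text only states that the result lies in that range.

```python
def to_int8(value: float, scale: float) -> int:
    """Multiply by a scale factor, round, and clamp to the signed 8-bit range."""
    q = round(value * scale)
    return max(-128, min(127, q))

def from_int8(q: int, scale: float) -> float:
    """Recover an approximate floating point value from the integer."""
    return q / scale

scale = 64.0                      # hypothetical multiple; values in [-2, 2) fit in int8
q = to_int8(0.8125, scale)        # 0.8125 * 64 = 52 exactly
approx = from_int8(q, scale)
print(q, approx)                  # 52 0.8125
```

Values whose scaled magnitude exceeds the eight-bit range are saturated in this sketch, which is one source of precision loss in addition to rounding.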

The execution main body of the fixed-point processing method of the convolutional neural network provided by the embodiment of the invention is a fixed-point processing device of the convolutional neural network. The fixed-point processing device of the convolutional neural network may be an electronic device or a server, or may be a Central Processing Unit (CPU) in the electronic device or the server.

Fig. 1 is a schematic composition diagram of a fixed-point processing apparatus of a convolutional neural network according to an embodiment of the present invention. As shown in fig. 1, the fixed-point processing apparatus of the convolutional neural network may include: at least one processor 11, a memory 12, a communication interface 13, and a communication bus 14.

The processor 11 is the control center of the fixed-point processing device of the convolutional neural network, and may be a CPU, a microprocessor, or one or more integrated circuits for controlling the execution of the programs of the present invention.

For one embodiment, processor 11 may include one or more CPUs, such as CPU0 and CPU1 shown in FIG. 1. Also, as an example, the fixed-point processing device of the convolutional neural network may include a plurality of processors, such as the processor 11 and the processor 15 shown in fig. 1. Each of these processors may be a Single-core processor (Single-CPU) or a Multi-core processor (Multi-CPU).

The memory 12 may be, but is not limited to, a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 12 may be independent and coupled to the processor 11 via the communication bus 14. The memory 12 may also be integrated with the processor 11.

In a specific implementation, the memory 12 is used for storing data in the present invention and software programs for executing the present invention. The processor 11 may perform various functions of the fixed-point processing apparatus of the convolutional neural network by running or executing a software program stored in the memory 12 and calling data stored in the memory 12.

The communication interface 13 is any device, such as a transceiver, for communicating with other devices or communication networks, such as a Radio Access Network (RAN), a Wireless Local Area Network (WLAN), and the like. The communication interface 13 may include a receiving unit implementing a receiving function and a transmitting unit implementing a transmitting function.

The communication bus 14 may include a path to transfer information between the aforementioned components.

It is noted that the structure shown in fig. 1 does not constitute a limitation on the fixed-point processing device of the convolutional neural network, which may include more or fewer components than those shown in fig. 1, combine some components, or use a different arrangement of components.

Based on the above introduction of the structure of the fixed-point processing device of the convolutional neural network, an embodiment of the present invention provides a fixed-point processing method for a convolutional neural network. The method is applied to the fixed-point processing device of the convolutional neural network and is used when data are processed with the convolutional neural network stored in the device; the convolutional neural network may comprise N cascaded convolutional layers, where N is a positive integer. As shown in fig. 2, the fixed-point processing method of the convolutional neural network may include the following steps 201 to 207.

201. A fixed-point processing device of a convolutional neural network acquires an operation parameter and an input parameter of a target convolutional layer.

When the fixed-point processing device of the convolutional neural network needs to convert the parameters involved in the convolutional neural network to fixed point, it can obtain the operation parameters and the input parameters of each convolutional layer. In the embodiment of the present invention, obtaining the operation parameters and input parameters of any one convolutional layer of the convolutional neural network, referred to as the target convolutional layer, is taken as an example.

It is understood that the target convolutional layer is any one of the N convolutional layers included in the convolutional neural network. When the target convolutional layer is the first layer of the N convolutional layers, the input parameters are feature parameters of a multimedia resource input into the convolutional neural network; when the target convolutional layer is a layer other than the first layer of the N convolutional layers, the input parameters are the output result of the convolutional layer immediately preceding the target convolutional layer. Multimedia resources include, but are not limited to, audio, video, documents and images.

Optionally, when the convolutional neural network is applied to the field of speech recognition, the input parameter of the first layer of the N cascaded convolutional layers included in the convolutional neural network is a feature parameter obtained by performing feature extraction on a speech signal, and the feature parameter may be a frequency spectrum of the speech signal.

Optionally, when the convolutional neural network is applied to the field of document identification, the input parameter of the first layer in the N cascaded convolutional layers included in the convolutional neural network is a feature parameter obtained by performing feature extraction on a text, and the feature parameter may be a word appearing in the document and a frequency thereof.

Optionally, when the convolutional neural network is applied to the field of image recognition, the input parameter of the first layer in the N cascaded convolutional layers included in the convolutional neural network is a characteristic parameter obtained by performing feature extraction on an image, and the characteristic parameter may be a pixel value of a pixel point of the image.
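The rule for choosing the input parameters of each layer can be sketched as follows. The helper name and the toy layers are hypothetical stand-ins for real convolutional layers and serve only to show how data flow through the cascade.

```python
def run_cascade(features, layers):
    """Feed the feature parameters to layer 1 and each output to the next layer."""
    outputs = []
    current = features              # input parameters of the first layer
    for layer in layers:
        current = layer(current)    # output of layer k becomes the input of layer k + 1
        outputs.append(current)
    return outputs

# Toy stand-ins for two cascaded layers (not real convolutions).
double = lambda values: [2 * v for v in values]
shift = lambda values: [v + 1 for v in values]

pixel_values = [1, 2, 3]            # e.g. pixel values used as feature parameters
per_layer_outputs = run_cascade(pixel_values, [double, shift])
print(per_layer_outputs)            # [[2, 4, 6], [3, 5, 7]]
```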

It is understood that the operational parameters of the target convolution layer may include a plurality of parameters. As a possible implementation, the operation parameter may be a multidimensional matrix composed of a plurality of parameters.

202. The fixed-point processing device of the convolutional neural network performs fixed-point processing on the operation parameters by using a first preset algorithm to obtain first fixed point numbers corresponding to the operation parameters.

After the operation parameters of the target convolutional layer are obtained, the fixed-point processing device of the convolutional neural network may use the first preset algorithm to convert each of the plurality of parameters included in the operation parameters to fixed point, so as to obtain a first fixed point number corresponding to each parameter.

Optionally, the first preset algorithm may include formula (1):

（1）

wherein round is the rounding operation, a is any parameter included in the operation parameters, min(a) is the minimum value of all parameters included in the operation parameters, max(a) is the maximum value of all parameters included in the operation parameters, n is the fixed-point bit width, and m is the fixed point number corresponding to a.

It is understood that n may be 8, 16, 32, etc. The fixed-point bit width is the bit width of the binary number used to represent a floating point number; the integer value of that binary number is the fixed point number.
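One mapping that uses exactly the quantities named for formula (1), namely round, min(a), max(a) and n, is min-max affine quantization. The sketch below assumes that form; whether it matches formula (1) term for term is an assumption, and the function name is hypothetical. It produces unsigned codes in [0, 2^n - 1]; subtracting 2^(n-1) would shift them into the signed range.

```python
def quantize(params, n=8):
    """Assumed min-max mapping: m = round((a - min) / (max - min) * (2**n - 1))."""
    lo, hi = min(params), max(params)
    levels = (1 << n) - 1
    if hi == lo:
        return [0 for _ in params]          # degenerate case: all parameters equal
    # Note: Python's round() sends exact halves to the nearest even integer.
    return [round((a - lo) / (hi - lo) * levels) for a in params]

weights = [-1.0, -0.5, 0.25, 1.0]           # made-up operation parameters
codes = quantize(weights, n=8)
print(codes)                                # [0, 64, 159, 255]
```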

203. The fixed-point processing device of the convolutional neural network uses a second preset algorithm corresponding to the first preset algorithm to determine the floating point numbers corresponding to the first fixed point numbers.

After obtaining the first fixed point number corresponding to each parameter included in the operation parameters, the fixed-point processing device of the convolutional neural network may use the second preset algorithm corresponding to the first preset algorithm to determine the floating point number corresponding to each first fixed point number, that is, restore each first fixed point number to a floating point number.

Optionally, the second preset algorithm may include formula (2):

（2）

where a' is the floating-point number corresponding to the fixed-point number of a. The n in formula (2) is the same as the n in formula (1) above.
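Formula (2) is likewise not reproduced here. A minimal sketch, assuming it is the inverse of a min-max mapping, a' = min(a) + m * (max(a) - min(a)) / (2^n - 1), is shown below. The name `dequantize` and the sample codes are illustrative assumptions.

```python
def dequantize(codes, lo, hi, n_bits=8):
    """Restore integer codes to floats.

    Assumed inverse of the min-max mapping:
        a' = lo + m * (hi - lo) / (2**n - 1)
    where lo and hi are min(a) and max(a) of the original parameters.
    """
    levels = (1 << n_bits) - 1
    step = (hi - lo) / levels
    return [lo + m * step for m in codes]


# Hypothetical first fixed-point numbers for parameters spanning [-1.0, 1.0].
restored = dequantize([0, 64, 128, 159, 255], lo=-1.0, hi=1.0, n_bits=8)
print(restored)
```

The restored values lie on a grid with spacing (max(a) - min(a)) / (2^n - 1), which is why they generally differ slightly from the original parameters.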

204. The fixed-point processing device of the convolutional neural network determines the residual error according to the operation parameters and the floating-point numbers. Optionally, the residual error may be determined by using formula (3):

$$r = a - a' \qquad (3)$$

where r is the residual error.

It can be understood that the operation parameter of the target convolutional layer can then be expressed by formula (4):

$$a = a' + r \qquad (4)$$
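The decomposition in step 204 can be sketched as follows: each operation parameter is split into its restored floating-point value and a residual error, and adding the two recovers the original parameter. The min-max quantizer used here is an assumed stand-in for the first and second preset algorithms, and all names are illustrative.

```python
def quantize(values, n_bits=8):
    """Assumed min-max quantizer; returns integer codes, offset and scale."""
    lo, hi = min(values), max(values)
    levels = (1 << n_bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((a - lo) / scale) for a in values]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Assumed inverse mapping back to floating point."""
    return [lo + m * scale for m in codes]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]  # hypothetical operation parameters
codes, lo, scale = quantize(weights)
restored = dequantize(codes, lo, scale)

# Residual error: the part of each parameter lost by fixed-point processing.
residual = [a - a_prime for a, a_prime in zip(weights, restored)]

# Superposition: restored value plus residual gives back the parameter.
recombined = [a_prime + r for a_prime, r in zip(restored, residual)]
print(max(abs(r) for r in residual), scale / 2)
```

With rounding to the nearest code, each residual is bounded in magnitude by about half the quantization step, which is what makes a second fixed-point pass on the residual worthwhile.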

205. The fixed-point processing device of the convolutional neural network performs fixed-point processing on the residual error by using the first preset algorithm to obtain a second fixed-point number corresponding to the residual error.

It can be understood that when any parameter a included in the operation parameters is fixed-pointed, the maximum fixed-point error can be represented by formula (5):

（5）

where p is the maximum fixed-point error of a.

The residual error is the fixed-point error of a, so its maximum value is p and its minimum value is -p. Therefore, the maximum fixed-point error of the residual error can be represented by formula (6):

（6）

It can be understood that the maximum fixed-point error of the residual error is much smaller than the maximum fixed-point error of a.
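The claim that fixed-pointing the residual error introduces a much smaller error than fixed-pointing the parameter itself can be checked numerically. The sketch below assumes a min-max quantizer with rounding to the nearest code, since formulas (5) and (6) are not reproduced in this text. The random parameters and all names are illustrative.

```python
import random


def roundtrip(values, n_bits=8):
    """Quantize with an assumed min-max scheme and restore to floats."""
    lo, hi = min(values), max(values)
    levels = (1 << n_bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    return [lo + round((a - lo) / scale) * scale for a in values]


random.seed(0)
weights = [random.uniform(-1.0, 1.0) for _ in range(1000)]

# First pass: fixed-point the parameters and measure what is lost.
first = roundtrip(weights)
residual = [a - b for a, b in zip(weights, first)]
err_one_stage = max(abs(r) for r in residual)

# Second pass: fixed-point the residual and measure what is still lost.
second = roundtrip(residual)
err_two_stage = max(abs(r - s) for r, s in zip(residual, second))

print(err_one_stage, err_two_stage)
```

Because the residual spans only about one quantization step of the first pass, the step size of the second pass is smaller by roughly a factor of 2^n - 1, and the remaining error shrinks accordingly.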

206. The fixed-point processing device of the convolutional neural network processes the input parameters by using the operation rule of the target convolutional layer and the first fixed point number to obtain a first result, and processes the input parameters by using the operation rule and the second fixed point number to obtain a second result.

207. The fixed-point processing device of the convolutional neural network determines the output result of the target convolutional layer according to the first result and the second result.

It can be understood that, when the target convolutional layer is the first layer or an intermediate layer of the convolutional neural network, the output result of the target convolutional layer determined by the fixed-point processing device of the convolutional neural network is a fixed-point number (integer), and the fixed-point processing device needs to restore the output result to a floating-point number and input the floating-point number to the next convolutional layer.

When the target convolutional layer is the last layer of the convolutional neural network, the output result of the target convolutional layer determined by the fixed-point processing device of the convolutional neural network is also a fixed-point number (integer). The fixed-point processing device needs to restore the output result to a floating-point number. This floating-point number is the output result of the convolutional neural network and is used to indicate the identification information of the multimedia resource.
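Steps 206 and 207 can be illustrated with a single dot product standing in for the convolution. The text does not specify how the first and second results are combined, so the sketch below assumes a min-max quantizer and combines the two integer results using the scales and offsets of the two fixed-point passes. Every name and value is an illustrative assumption.

```python
def quantize(values, n_bits=8):
    """Assumed min-max quantizer; returns integer codes, offset and scale."""
    lo, hi = min(values), max(values)
    levels = (1 << n_bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    return [round((a - lo) / scale) for a in values], lo, scale


def int_dot(xs, ws):
    """Integer multiply-accumulate, standing in for the layer's operation rule."""
    return sum(x * w for x, w in zip(xs, ws))


inputs = [3, 1, 4, 1, 5]                  # hypothetical integer input parameters
weights = [0.12, -0.5, 0.33, 0.9, -0.71]  # hypothetical operation parameters

m1, lo1, s1 = quantize(weights)                    # first fixed-point numbers
restored = [lo1 + m * s1 for m in m1]              # corresponding floats
residual = [a - b for a, b in zip(weights, restored)]
m2, lo2, s2 = quantize(residual)                   # second fixed-point numbers

first_result = int_dot(inputs, m1)
second_result = int_dot(inputs, m2)

# Assumed combination: undo both affine mappings and add the two parts.
sum_x = sum(inputs)
one_stage = s1 * first_result + lo1 * sum_x
output = one_stage + s2 * second_result + lo2 * sum_x

exact = sum(x * w for x, w in zip(inputs, weights))
print(exact, one_stage, output)
```

In this sketch the heavy arithmetic (the two dot products) runs on integers, and only the final rescaling restores the result to a floating-point number.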

The embodiment of the present invention provides a fixed-point processing method and device for a convolutional neural network and a storage medium. After the operation parameter of the target convolutional layer is fixed-pointed, the operation parameter can be decomposed into the floating-point number corresponding to the first fixed-point number and a residual error; that is, the operation parameter is the superposition of that floating-point number and the residual error. The residual error is then fixed-pointed to obtain a second fixed-point number. The input parameters are processed with the first fixed-point number and with the second fixed-point number, each according to the operation rule of the target convolutional layer, to obtain a first result and a second result, and the output result of the target convolutional layer is obtained from the first result and the second result. Because the floating-point number is determined from the first fixed-point number by the second preset algorithm corresponding to the first preset algorithm, the floating-point number is equivalent to the first fixed-point number and can be regarded as quantized losslessly, so the quantization error of the operation parameter comes entirely from the residual error. Because the distribution range of the residual error is far smaller than that of the operation parameter, the error produced by fixed-pointing the residual error is far smaller than the error produced by fixed-pointing the operation parameter of the target convolutional layer directly. The error of the output result of this method is therefore smaller than that of the prior art, which improves the precision of the convolutional neural network after fixed-point processing.

Optionally, in the embodiment of the present invention, with reference to fig. 2 and as shown in fig. 3, before step 201 is executed, the fixed-point processing method of the convolutional neural network may further include the following steps 208 to 210.

208. The fixed-point processing device of the convolutional neural network processes the characteristic parameters of the multimedia resources by using the convolutional neural network multiple times, performing fixed-point processing on the operation parameters of the ith convolutional layer each time, to obtain a plurality of first target output results, where the first target output results are used to indicate the identification information of the multimedia resources.

The fixed-point processing device of the convolutional neural network may use the convolutional neural network to process the characteristic parameters of the multimedia resources multiple times. Each time, the operation parameters of the same convolutional layer, namely the ith convolutional layer, are fixed-pointed, and one first target output result is obtained. After the processing has been performed multiple times, a plurality of first target output results are obtained.

209. The fixed-point processing device of the convolutional neural network determines a first precision according to the plurality of first target output results.

After obtaining the plurality of first target output results, the fixed-point processing device of the convolutional neural network may determine which of the first target output results correctly indicate the identification information of the multimedia resource, and calculate the ratio of the number of correct results to the total number of first target output results. This ratio is the first precision.

210. If the first precision meets a preset condition, the ith convolutional layer is determined as the target convolutional layer.

After determining the first precision, the fixed-point processing device of the convolutional neural network may determine whether the first precision meets the preset condition. If the preset condition is met, the ith convolutional layer is determined as the target convolutional layer. If the preset condition is not met, it is determined that the ith convolutional layer is not the target convolutional layer.
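The first precision described in step 209 is a simple ratio of correct results to all results. A minimal sketch follows. The label strings and the function name `first_precision` are illustrative assumptions, and the "correct" identification information is assumed to be available as ground-truth labels.

```python
def first_precision(outputs, labels):
    """Ratio of first target output results that match the expected labels."""
    if not outputs:
        raise ValueError("at least one output result is required")
    correct = sum(1 for out, label in zip(outputs, labels) if out == label)
    return correct / len(outputs)


# Hypothetical identification results obtained with layer i fixed-pointed.
outputs = ["car", "person", "car", "truck"]
labels = ["car", "person", "truck", "truck"]

precision_i = first_precision(outputs, labels)
print(precision_i)
```

Repeating this measurement for each layer index i gives one precision value per layer, which the preset condition then uses to decide whether that layer is a target convolutional layer.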

Optionally, in the embodiment of the present invention, the fixed-point processing device of the convolutional neural network may determine whether the first precision meets the preset condition in either of the following two manners, which are not limiting.

Optionally, in the first manner, with reference to fig. 3 and as shown in fig. 4, step 210 may be implemented by step 211.

211. If the first precision is smaller than a first preset threshold, the ith convolutional layer is determined as the target convolutional layer.

If the first precision is smaller than the first preset threshold, the fixed-point processing device of the convolutional neural network determines that the first precision meets the preset condition, and therefore determines the ith convolutional layer as the target convolutional layer. If the first precision is greater than or equal to the first preset threshold, the preset condition is not met, and it is determined that the ith convolutional layer is not the target convolutional layer.

It should be noted that, when the convolutional neural network needs to improve the precision, the first preset threshold may be set to be larger, most convolutional layers in the convolutional neural network are determined as target convolutional layers according to the larger first preset threshold, and the method for performing fixed-point processing on the convolutional neural network according to the embodiment of the present invention is performed on each target convolutional layer, so that the precision can be improved.

When the convolutional neural network needs to increase the speed, the first preset threshold value may be set to be smaller, the number of target convolutional layers determined according to the smaller first preset threshold value is smaller, and the steps 201 to 207 are performed on the smaller number of target convolutional layers, so that the operation speed can be increased.
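The effect of the first preset threshold in step 211 can be sketched as a filter over per-layer precision values. The precision figures and threshold values below are made up for illustration, and `select_target_layers` is a hypothetical helper.

```python
def select_target_layers(precision_per_layer, first_threshold):
    """Return the layer indices whose first precision is below the threshold."""
    return [i for i, p in sorted(precision_per_layer.items()) if p < first_threshold]


# Hypothetical first precision measured with only layer i fixed-pointed.
precisions = {1: 0.97, 2: 0.91, 3: 0.88, 4: 0.95}

# A larger threshold marks more layers as targets (favoring precision).
accuracy_first = select_target_layers(precisions, first_threshold=0.96)

# A smaller threshold marks fewer layers as targets (favoring speed).
speed_first = select_target_layers(precisions, first_threshold=0.90)

print(accuracy_first, speed_first)
```

Only the layers returned by the filter would go through steps 201 to 207; the remaining layers would be fixed-pointed once, without residual decomposition.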

Optionally, in the second manner, with reference to fig. 3 and as shown in fig. 5, before step 201 is executed, the fixed-point processing device of the convolutional neural network may execute step 212 and step 213.

212. The fixed-point processing device of the convolutional neural network processes the characteristic parameters of the multimedia resources by adopting the convolutional neural network for multiple times to obtain a plurality of second target output results, and the second target output results are used for indicating the identification information of the multimedia resources.

213. The fixed-point processing device of the convolutional neural network determines a second precision according to the plurality of second target output results.

In this case, step 210 described above may be implemented by step 214.

214. If the difference between the second precision and the first precision is greater than a second preset threshold, the ith convolutional layer is determined as the target convolutional layer.

If the difference between the second precision and the first precision is greater than the second preset threshold, the fixed-point processing device of the convolutional neural network determines that the first precision meets the preset condition, and determines the ith convolutional layer as the target convolutional layer. If the difference between the second precision and the first precision is smaller than or equal to the second preset threshold, it is determined that the first precision does not meet the preset condition, and that the ith convolutional layer is not the target convolutional layer.

When the convolutional neural network needs to improve precision, the second preset threshold may be set smaller. When the convolutional neural network needs to increase speed, the second preset threshold may be set larger. For the reasoning, refer to the related description of step 211, which is not repeated here.
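The second manner compares a precision drop against the second preset threshold. The sketch below assumes that the second precision is a baseline measured on the same data and that the first precision is measured with only layer i fixed-pointed. The text does not state how the second target output results are produced, so this reading, the numbers and the names are all assumptions.

```python
def is_target_layer(first_precision, second_precision, second_threshold):
    """Step 214 as a predicate: the precision drop must exceed the threshold."""
    return (second_precision - first_precision) > second_threshold


second_precision = 0.97                      # hypothetical baseline precision
first_by_layer = {1: 0.965, 2: 0.91, 3: 0.95}  # hypothetical per-layer values

# A smaller threshold selects more layers (favoring precision).
strict = [i for i, p in sorted(first_by_layer.items())
          if is_target_layer(p, second_precision, second_threshold=0.01)]

# A larger threshold selects fewer layers (favoring speed).
loose = [i for i, p in sorted(first_by_layer.items())
         if is_target_layer(p, second_precision, second_threshold=0.03)]

print(strict, loose)
```

Compared with the first manner, this criterion is relative: a layer is selected because fixed-pointing it costs precision, not because its measured precision falls below an absolute level.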

In the embodiment of the present invention, if each convolutional layer executes steps 201 to 207, although the precision is improved, the computation amount of the convolutional neural network is increased by about 2 times, and the computation time is also increased accordingly. Therefore, in order to balance the operation accuracy and the operation time of the convolutional neural network, the target convolutional layer is screened by the preset condition. In this way, the steps 201 to 207 are executed only for the selected target convolutional layers, and the speed can be optimized while improving the accuracy.
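The trade-off described above can be made concrete with a rough operation count. The sketch assumes that a target convolutional layer performs its operation twice (once with the first fixed-point number and once with the second), so its cost roughly doubles, while other layers keep their original cost. The multiply-accumulate counts are invented for illustration.

```python
def total_macs(macs_per_layer, target_layers):
    """Estimate total multiply-accumulates when target layers run twice."""
    targets = set(target_layers)
    return sum(2 * macs if i in targets else macs
               for i, macs in macs_per_layer.items())


# Hypothetical per-layer multiply-accumulate counts.
macs = {1: 1000, 2: 4000, 3: 2000}

baseline = total_macs(macs, target_layers=[])
all_layers = total_macs(macs, target_layers=[1, 2, 3])
screened = total_macs(macs, target_layers=[2])

print(baseline, all_layers, screened)
```

Under this assumption, applying the method to every layer doubles the estimated cost, whereas screening confines the extra cost to the selected layers.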

Optionally, in this embodiment of the present invention, as shown in fig. 6 in combination with fig. 2, step 202 may specifically include step 215.

215. If the operation parameters are floating-point numbers, the fixed-point processing device of the convolutional neural network performs fixed-point processing on the operation parameters by using the first preset algorithm to obtain the first fixed-point numbers.

When the operation parameters acquired by the fixed-point processing device of the convolutional neural network are floating-point numbers, the operation parameters are fixed-pointed by using the first preset algorithm to obtain the first fixed-point numbers corresponding to the operation parameters. When the operation parameters acquired by the fixed-point processing device of the convolutional neural network are already fixed-point numbers, no fixed-point processing is needed, and the operation parameters are used directly in the next step.

In this way, by determining whether the operation parameters are floating-point numbers before the residual decomposition is performed, and by fixed-pointing only operation parameters that are floating-point numbers, the operation time can be further reduced and the operation efficiency improved.
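The check in step 215 can be sketched as a type test placed in front of the quantizer. The sketch assumes that fixed-point parameters arrive as Python integers and floating-point parameters as Python floats, and it uses an assumed min-max quantizer in place of the first preset algorithm. The function name is hypothetical.

```python
def to_first_fixed_point(params, n_bits=8):
    """Fixed-point only floating-point parameters; pass integers through."""
    if all(isinstance(p, int) for p in params):
        # Already fixed-point numbers: use them directly in the next step.
        return list(params)
    lo, hi = min(params), max(params)
    levels = (1 << n_bits) - 1
    if hi == lo:
        return [0 for _ in params]
    # Assumed min-max form of the first preset algorithm.
    return [round((a - lo) / (hi - lo) * levels) for a in params]


already_fixed = to_first_fixed_point([3, 7, 1])
newly_fixed = to_first_fixed_point([0.0, 0.2, 1.0])
print(already_fixed, newly_fixed)
```

Skipping the quantizer for parameters that are already integers avoids a redundant pass over the kernel, which is the saving the paragraph above describes.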

The above description mainly introduces the solution provided by the embodiment of the present invention from the perspective of the fixed-point processing device of the convolutional neural network. It can be understood that, to realize the above functions, the fixed-point processing device of the convolutional neural network includes hardware structures and/or software modules for performing the respective functions. Those skilled in the art will readily appreciate that, in combination with the exemplary algorithm steps described in the embodiments disclosed herein, the present invention can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and the design constraints of the solution. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementations should not be interpreted as departing from the scope of the present invention.

Fig. 7 is a schematic diagram illustrating a possible composition of the fixed-point processing device of the convolutional neural network according to the above embodiment. As shown in fig. 7, the fixed-point processing device of the convolutional neural network may include: an acquisition unit 71, a processing unit 72 and a determining unit 73.

The acquisition unit 71 is configured to acquire an operation parameter and an input parameter of a target convolutional layer, where the input parameter is a characteristic parameter of a multimedia resource input to the convolutional neural network when the target convolutional layer is the first layer of the N convolutional layers; when the target convolutional layer is a layer other than the first layer of the N convolutional layers, the input parameter is the output result of the convolutional layer preceding the target convolutional layer.

The processing unit 72 is configured to perform fixed-point processing on the operation parameter acquired by the acquisition unit 71 by using a first preset algorithm to obtain a first fixed-point number corresponding to the operation parameter.

The determining unit 73 is configured to determine, by using a second preset algorithm corresponding to the first preset algorithm, a floating-point number corresponding to the first fixed-point number obtained by the processing unit 72, and to determine the residual error according to the operation parameter acquired by the acquisition unit 71 and the floating-point number.

The processing unit 72 is further configured to perform fixed-point processing on the residual error determined by the determining unit 73 by using the first preset algorithm to obtain a second fixed-point number corresponding to the residual error; to process the input parameters by using the operation rule of the target convolutional layer and the first fixed-point number obtained by the processing unit 72 to obtain a first result; and to process the input parameters by using the operation rule and the second fixed-point number obtained by the processing unit 72 to obtain a second result.

The determining unit 73 is further configured to determine an output result of the target convolutional layer according to the first result and the second result obtained by the processing unit 72.

Optionally, the processing unit 72 is further configured to process the feature parameters of the multimedia resource by using the convolutional neural network for multiple times, and perform fixed-point processing on the operation parameter of the ith convolutional layer each time to obtain multiple first target output results, where i is an integer greater than or equal to 1 and less than or equal to N, and the first target output results are used to indicate the identification information of the multimedia resource.

The determining unit 73 is further configured to determine a first precision according to the plurality of first target output results obtained by the processing unit 72, and, if the first precision meets a preset condition, to determine the ith convolutional layer as the target convolutional layer.

Further, the determining unit 73 is specifically configured to determine the ith convolutional layer as the target convolutional layer if the first precision is smaller than a first preset threshold.

Optionally, the processing unit 72 is further configured to process the feature parameters of the multimedia resource by using the convolutional neural network for multiple times to obtain multiple second target output results, where the second target output results are used to indicate identification information of the multimedia resource.

The determining unit 73 is further configured to determine the second accuracy according to the plurality of second target output results obtained by the processing unit 72.

The determining unit 73 is specifically configured to determine the ith convolutional layer as the target convolutional layer if a difference between the second precision and the first precision is greater than a second preset threshold.

Further, the processing unit 72 is specifically configured to perform fixed-point processing on the operation parameter by using a first preset algorithm to obtain a first fixed-point number if the operation parameter acquired by the acquiring unit 71 is a floating-point number.

The above description is only an embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions within the technical scope of the present invention are intended to be covered by the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A fixed-point processing method of a convolutional neural network, wherein the convolutional neural network comprises N cascaded convolutional layers, N is a positive integer, and the method is characterized by comprising the following steps:

acquiring operation parameters and input parameters of a target convolutional layer, wherein when the target convolutional layer is the first layer of the N convolutional layers, the input parameters are characteristic parameters of multimedia resources input into the convolutional neural network; when the target convolutional layer is a layer other than the first layer of the N convolutional layers, the input parameter is an output result of the convolutional layer preceding the target convolutional layer;

performing fixed-point processing on the operation parameter by adopting a first preset algorithm to obtain a first fixed-point number corresponding to the operation parameter;

determining a floating-point number corresponding to the first fixed-point number by adopting a second preset algorithm corresponding to the first preset algorithm;

determining a residual error according to the operation parameter and the floating-point number;

performing fixed-point processing on the residual error by adopting the first preset algorithm to obtain a second fixed-point number corresponding to the residual error;

processing the input parameter by using the operation rule of the target convolutional layer and the first fixed-point number to obtain a first result, and processing the input parameter by using the operation rule and the second fixed-point number to obtain a second result;

and determining an output result of the target convolutional layer according to the first result and the second result.

2. The method of claim 1, further comprising:

processing the characteristic parameters of the multimedia resources by adopting the convolutional neural network for multiple times, and performing fixed-point processing on the operation parameters of the ith convolutional layer every time to obtain a plurality of first target output results, wherein i is an integer which is greater than or equal to 1 and less than or equal to N, and the first target output results are used for indicating the identification information of the multimedia resources;

determining a first precision according to the plurality of first target output results;

and if the first precision meets a preset condition, determining the ith convolutional layer as the target convolutional layer.

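3. The method according to claim 2, wherein determining the ith convolutional layer as the target convolutional layer if the first precision meets a preset condition comprises:

if the first precision is smaller than a first preset threshold, determining the ith convolutional layer as the target convolutional layer.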

4. The method of claim 2, further comprising:

processing the characteristic parameters of the multimedia resources by adopting the convolutional neural network for multiple times to obtain multiple second target output results, wherein the second target output results are used for indicating the identification information of the multimedia resources;

determining a second precision according to the plurality of second target output results;

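wherein determining the ith convolutional layer as the target convolutional layer if the first precision meets a preset condition comprises:

if a difference between the second precision and the first precision is greater than a second preset threshold, determining the ith convolutional layer as the target convolutional layer.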

5. The fixed-point processing method of the convolutional neural network according to any one of claims 1 to 4, wherein the fixed-point processing of the operation parameter by using a first preset algorithm to obtain a first fixed-point number corresponding to the operation parameter comprises:

if the operation parameter is a floating-point number, performing fixed-point processing on the operation parameter by adopting the first preset algorithm to obtain the first fixed-point number.

6. A fixed-point processing device for a convolutional neural network, the convolutional neural network including N cascaded convolutional layers, N being a positive integer, comprising:

the acquiring unit is used for acquiring operation parameters and input parameters of a target convolutional layer, and when the target convolutional layer is the first layer of the N convolutional layers, the input parameters are characteristic parameters of multimedia resources input into the convolutional neural network; when the target convolutional layer is a layer other than the first layer of the N convolutional layers, the input parameter is an output result of the convolutional layer preceding the target convolutional layer;

the processing unit is used for performing fixed-point processing on the operation parameter acquired by the acquisition unit by adopting a first preset algorithm to obtain a first fixed-point number corresponding to the operation parameter;

the determining unit is used for determining floating point numbers corresponding to the first fixed point numbers obtained by the processing unit by adopting a second preset algorithm corresponding to the first preset algorithm; determining a residual error according to the operation parameter and the floating point number acquired by the acquisition unit;

the processing unit is further configured to fix the residual error determined by the determining unit by using a first preset algorithm to obtain a second fixed point number corresponding to the residual error; processing the input parameter by using the operation rule of the target convolutional layer and the first fixed point number obtained by the processing unit to obtain a first result, and processing the input parameter by using the operation rule and the second fixed point number obtained by the processing unit to obtain a second result;

the determining unit is further configured to determine an output result of the target convolutional layer according to the first result and the second result obtained by the processing unit.

7. The convolutional neural network fixed-point processing apparatus as claimed in claim 6,

the processing unit is further configured to process the feature parameters of the multimedia resource by using the convolutional neural network for multiple times, perform fixed-point processing on the operation parameter of the ith convolutional layer each time to obtain multiple first target output results, where i is an integer greater than or equal to 1 and less than or equal to N, and the first target output results are used for indicating identification information of the multimedia resource;

the determining unit is further configured to determine a first precision according to the plurality of first target output results obtained by the processing unit; and if the first precision meets a preset condition, determining the ith convolutional layer as the target convolutional layer.

8. The convolutional neural network fixed-point processing apparatus as claimed in claim 7,

the determining unit is specifically configured to determine the ith convolutional layer as the target convolutional layer if the first precision is smaller than a first preset threshold.

9. The convolutional neural network fixed-point processing apparatus as claimed in claim 7,

the processing unit is further configured to process the feature parameters of the multimedia resources by using the convolutional neural network for multiple times to obtain multiple second target output results, where the second target output results are used to indicate identification information of the multimedia resources;

the determining unit is further configured to determine a second precision according to the plurality of second target output results obtained by the processing unit;

the determining unit is specifically configured to determine the ith convolutional layer as the target convolutional layer if a difference between the second precision and the first precision is greater than a second preset threshold.

10. The convolutional neural network fixed-point processing apparatus as claimed in any one of claims 6 to 9,

the processing unit is specifically configured to fix the operation parameter by using the first preset algorithm to obtain the first fixed point number if the operation parameter acquired by the acquisition unit is a floating point number.

11. A fixed-point processing apparatus for a convolutional neural network, comprising a processor and a memory, the memory storing computer instructions executable by the processor, the processor being configured to execute the computer instructions to implement the fixed-point processing method for a convolutional neural network according to any one of claims 1 to 5.

12. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein computer instructions which, when run on a fixed-point processing apparatus of a convolutional neural network, cause the fixed-point processing apparatus of the convolutional neural network to execute the fixed-point processing method of the convolutional neural network according to any one of claims 1 to 5.