CN111383157A

CN111383157A - Image processing method and device, vehicle-mounted operation platform, electronic equipment and system

Info

Publication number: CN111383157A
Application number: CN201811647408.2A
Authority: CN
Inventors: 程光亮; 温拓朴; 石建萍
Original assignee: Beijing Sensetime Technology Development Co Ltd
Current assignee: Beijing Sensetime Technology Development Co Ltd
Priority date: 2018-12-29
Filing date: 2018-12-29
Publication date: 2020-07-07
Anticipated expiration: 2038-12-29
Also published as: KR20210090249A; JP2022512211A; WO2020135601A1; CN111383157B

Abstract

The embodiments of the invention provide an image processing method and device, a vehicle-mounted operation platform, electronic equipment and a system. The method comprises: performing fixed-point processing on the floating-point network parameters of a convolutional neural network according to the fixed-point bit width hardware resource amount of an operation unit, wherein the network parameters comprise the convolution parameters and the layer output parameters of the convolutional neural network; acquiring an image to be processed; and controlling the operation unit to process the image according to the fixed-point network parameters to obtain a processing result of the image. The method enables efficient image processing with a convolutional neural network on a platform with limited hardware resources.

Description

Image processing method and device, vehicle-mounted operation platform, electronic equipment and system

Technical Field

The embodiment of the invention relates to computer technology, in particular to an image processing method, an image processing device, a vehicle-mounted operation platform, electronic equipment and an image processing system.

Background

Convolutional neural networks play an important role in computer vision tasks. For example, in the field of intelligent driving, lane line detection, lane fitting, and the like may be performed based on a convolutional neural network. Before a convolutional neural network runs on a platform with limited hardware resources, fixed-point processing is generally required. Fixed-point processing means converting the convolution parameters and intermediate layer results of the convolutional neural network from floating-point numbers to fixed-point numbers.

Disclosure of Invention

The embodiment of the invention provides an image processing method, an image processing device, a vehicle-mounted operation platform, electronic equipment and a system.

A first aspect of an embodiment of the present invention provides an image processing method, including:

performing fixed-point processing on the floating-point network parameters of a convolutional neural network according to the fixed-point bit width hardware resource amount of an operation unit, wherein the network parameters comprise the convolution parameters and the layer output parameters of the convolutional neural network;

acquiring an image to be processed;

and controlling the operation unit to process the image according to the network parameters after the fixed-point processing to obtain the processing result of the image.

Further, the performing fixed-point processing on the network parameter of the convolutional neural network according to the fixed-point bit width hardware resource amount of the operation unit includes:

clustering layer output parameters in the convolutional neural network to obtain a preset number of clustering results;

and performing second fixed-point processing according to the clustering result and the convolution parameters in the convolutional neural network and the fixed-point bit width hardware resource amount to obtain the second fixed-point convolutional neural network.

Further, the performing fixed-point processing on the network parameters of the convolutional neural network according to the fixed-point bit width hardware resource amount of the operation unit includes:

determining fixed-point parameters according to the network parameters of the convolutional neural network, wherein the fixed-point parameters identify the number of decimal places of the fixed-point numbers corresponding to the network parameters;

and performing first fixed-point processing on the network parameters according to the fixed-point parameters, a preset fixed-point function and the fixed-point bit width hardware resource amount to obtain a first fixed-point convolutional neural network, wherein the fixed-point function is a linear function with a preset slope, and the preset slope is greater than 0.

Further, the method also comprises the following steps:

clustering the layer output parameters in the first fixed-point convolutional neural network to obtain a preset number of clustering results;

and performing second fixed-point processing according to the clustering result, the convolution parameters in the first fixed-point convolutional neural network and the fixed-point bit width hardware resource amount to obtain a second fixed-point convolutional neural network.

Further, the determining a fixed-point parameter according to a network parameter of the convolutional neural network includes:

and determining the fixed-point parameters according to the decimal place of the convolution parameters and the decimal place of the layer output parameters.

Further, the determining the fixed-point parameter according to the decimal places of the convolution parameter and the decimal places of the layer output parameter includes:

and taking the maximum number of decimal places among the decimal places of the convolution parameters and the decimal places of the layer output parameters as the number of decimal places corresponding to the fixed-point parameter.

Further, the performing first fixed-point processing on the network parameters according to the fixed-point parameters, a preset fixed-point function and the fixed-point bit width hardware resource amount to obtain a first fixed-point convolutional neural network includes:

determining fixed-point levels of the network parameters according to the number of bits of the fixed-point bit width hardware resource amount and the fixed-point parameters;

and performing first fixed-point processing on the network parameters according to the fixed-point levels and the fixed-point function to obtain a first fixed-point convolutional neural network.

Further, the performing first fixed-point processing on the network parameters according to the fixed-point levels and the fixed-point function includes:

determining a difference between the value of the network parameter and a target level, wherein the target level is a fixed-point level closest to the value of the network parameter;

and inputting the difference value and the target level into the fixed-point function to obtain a result of performing the first fixed-point on the network parameter.

Further, the clustering to obtain a preset number of clustering results includes:

selecting the preset number of layer output parameters from the layer output parameters as initial clustering centers;

and performing clustering iteration processing according to the initial clustering center to obtain the preset number of clustering results.

Further, the performing of the second fixed-point processing includes:

determining a clustering center of the clustering result;

and performing second fixed-point processing according to the fixed-point bit width hardware resource quantity based on the cluster center and the convolution parameters in the convolution neural network or based on the cluster center and the convolution parameters in the convolution neural network after the first fixed-point processing to obtain a second fixed-point processing result, wherein the second fixed-point processing is used for fixing the cluster center to a fixed-point level closest to the cluster center.

Further, the method also comprises the following steps:

in response to the difference between the fixed point function and the target step function being greater than a preset difference, correcting the slope of the fixed point function;

and performing a new first fixed-point processing according to the corrected slope.

Further, the controlling the operation unit to process the image according to the network parameter after the fix-point processing includes:

controlling the operation unit to perform at least one of the following processes on the image according to the network parameters after the fixed-point processing:

fixed point multiplication, fixed point addition, and shift.

Further, the processing result of the image comprises at least one of the following items:

feature extraction results, segmentation results, classification results, object detection/tracking results.

A second aspect of an embodiment of the present invention provides an image processing apparatus, including:

the first processing module is used for performing fixed-point processing on the floating-point network parameters of a convolutional neural network according to the fixed-point bit width hardware resource amount of an operation unit, wherein the network parameters comprise the convolution parameters and the layer output parameters of the convolutional neural network;

the acquisition module is used for acquiring an image to be processed;

and the second processing module is used for controlling the operation unit to process the image according to the network parameters after the fixed-point processing to obtain the processing result of the image.

Further, the first processing module comprises:

the clustering unit is used for clustering layer output parameters in the convolutional neural network to obtain a preset number of clustering results;

and the second fixed-point unit is used for carrying out second fixed-point according to the clustering result and the convolution parameters in the convolutional neural network and the fixed-point bit width hardware resource quantity to obtain the convolutional neural network after the second fixed-point.

Further, the first processing module further includes:

the determining unit is used for determining fixed-point parameters according to network parameters of the convolutional neural network, and the fixed-point parameters are used for identifying decimal numbers of fixed-point numbers corresponding to the network parameters;

and the first fixed-point unit is used for carrying out first fixed-point processing on the network parameters according to the fixed-point parameters, a preset fixed-point function and the fixed-point bit width hardware resource amount to obtain a first fixed-point convolutional neural network, wherein the fixed-point function is a linear function with a preset slope, and the preset slope is greater than 0.

Further, the first processing module is specifically configured to:

Further, the determining unit is specifically configured to:

Further, the first fixed-point unit is specifically configured to:

determining a difference between the value of the network parameter and the target level, wherein the target level is a fixed-point level closest to the value of the network parameter; and

Further, the first processing module is specifically configured to:

determining a clustering center of the clustering result;

Further, the apparatus also comprises:

the updating module is used for correcting the slope of the fixed-point function when the difference between the fixed-point function and the target step function is larger than a preset difference, and performing a new first fixed-point processing according to the corrected slope.

Further, the second processing module is specifically configured to:

fixed point multiplication, fixed point addition, and shift.

A third aspect of the embodiments of the present invention provides a vehicle-mounted computing platform based on a field-programmable gate array (FPGA), including: a processor, an external memory, a memory and an FPGA operation unit;

network parameters after fixed-point processing of the convolutional neural network are stored in the external memory, wherein the network parameters comprise convolutional parameters and layer output parameters of the convolutional neural network;

the processor reads the network parameters of the fixed-point processing of the convolutional neural network into the memory, and inputs the data on the memory and the image to be processed into the FPGA arithmetic unit;

and the FPGA arithmetic unit carries out arithmetic processing according to the image to be processed and the network parameters of the fixed-point processing to obtain the processing result of the image.

Further, the processor is configured to:

determining a clustering center of the clustering result;

Further, the processor is configured to:

and performing a new first fixed-point processing according to the corrected slope.

Further, the processor is configured to:

fixed point multiplication, fixed point addition, and shift.

A fourth aspect of an embodiment of the present invention provides an electronic device, including:

a memory for storing program instructions;

a processor for calling and executing the program instructions in the memory to perform the method steps of the first aspect.

A fifth aspect of the embodiments of the present invention provides an intelligent driving system, including the electronic device described in the fourth aspect.

A sixth aspect of the present invention provides a readable storage medium, in which a computer program is stored, the computer program being configured to execute the method according to the first aspect.

According to the image processing method, the image processing device, the vehicle-mounted operation platform, the electronic device and the system, firstly, fixed-point processing is carried out on the convolutional neural network according to the fixed-point bit width hardware resource quantity of the operation unit, and after the image to be processed is obtained, the operation unit is controlled to process the image according to the network parameters after the fixed-point processing, so that efficient image processing is carried out on the platform with limited hardware resources by using the convolutional neural network.

Drawings

In order to more clearly illustrate the technical solutions of the present invention or the prior art, the following briefly introduces the drawings needed to be used in the description of the embodiments or the prior art, and obviously, the drawings in the following description are some embodiments of the present invention, and those skilled in the art can obtain other drawings according to the drawings without inventive labor.

Fig. 1 is a schematic flowchart of a first embodiment of an image processing method according to an embodiment of the present invention;

Fig. 2 is a schematic flowchart of a second embodiment of an image processing method according to an embodiment of the present invention;

Fig. 3 is a schematic flowchart of a third embodiment of an image processing method according to an embodiment of the present invention;

Fig. 4 is a schematic flowchart of a fourth embodiment of an image processing method according to an embodiment of the present invention;

Fig. 5 is a schematic flowchart of a fifth embodiment of an image processing method according to an embodiment of the present invention;

Fig. 6 is a schematic flowchart of a sixth embodiment of an image processing method according to an embodiment of the present invention;

Fig. 7 is a schematic flowchart of a seventh embodiment of an image processing method according to an embodiment of the present invention;

Fig. 8 is a block diagram of a first embodiment of an image processing apparatus according to an embodiment of the present invention;

Fig. 9 is a block diagram of a second embodiment of an image processing apparatus according to an embodiment of the present invention;

Fig. 10 is a block diagram of a third embodiment of an image processing apparatus according to an embodiment of the present invention;

Fig. 11 is a block diagram of a fourth embodiment of an image processing apparatus according to an embodiment of the present invention;

Fig. 12 is a schematic structural diagram of an FPGA-based vehicle-mounted computing platform according to an embodiment of the present invention;

Fig. 13 is a block diagram of an electronic device 1300 according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The image processing method provided by the embodiment of the invention can be applied to various scenes using the convolutional neural network for image processing. For example, the embodiment of the invention can be applied to the field of intelligent driving, such as automatic driving and auxiliary driving technologies. In the field of intelligent driving, lane line detection, lane line fitting and the like can be performed on the basis of the convolutional neural network, and then the method of the embodiment of the invention can be used for performing fixed-point operation on the convolutional neural network in the field of intelligent driving.

Fig. 1 is a flowchart of a first embodiment of an image processing method according to an embodiment of the present invention. The method may be executed by a processor, such as a Central Processing Unit (CPU), in an electronic device that performs image processing. As shown in Fig. 1, the method includes:

S101, performing fixed-point processing on the floating-point network parameters of the convolutional neural network according to the fixed-point bit width hardware resource amount of the operation unit, wherein the network parameters comprise the convolution parameters and the layer output parameters of the convolutional neural network.

Optionally, the operation unit may be a calculation unit supporting fixed-point operation. Taking a Field-Programmable Gate Array (FPGA) as an example, the arithmetic unit may be a Digital Signal Processor (DSP) in the FPGA.

Before the convolutional neural network runs on a platform with limited hardware resources, a fixed-point operation is required. Illustratively, an FPGA is an inexpensive and stable computing platform on which convolutional neural networks can be run. In order to exert the comprehensive advantages of hardware platforms such as an FPGA and the like in the aspects of low power consumption, accelerated operation and the like, the fixed-point bit width hardware resource amount of the operation unit is generally limited. In some cases, to achieve lower power consumption, the fixed-point bit-width hardware resource amount is selected to be as small as possible, for example, 8 bits or 4 bits or even less bit-width hardware resource amount is selected to implement fixed-point operation. However, the small amount of fixed-point bit width hardware resources often affects the operation speed, and for a platform requiring fast response or even real-time response, such as an automatic driving vehicle-mounted operation platform, the convolutional neural network needs to be optimized in terms of adapting to the amount of the fixed-point bit width hardware resources, so as to implement accelerated operation on the limited resource platform. In view of the above requirements, before the convolutional neural network is run on a platform with limited hardware resources, such as an FPGA, the fixed-point operation may be performed first to adapt to the amount of hardware fixed-point bit-width resources, so as to meet the requirements of low power consumption and fast response for the operation platform at the same time. Specifically, the convolution parameters and the intermediate layer results of the convolutional neural network are fixed to be represented by fixed point numbers. After the fixed-point operation, the multiply-add of the floating-point number may be converted to a multiply-add of the fixed-point number. 
On a platform such as an FPGA, fixed-point multiplication and addition can be executed directly by the DSP. Fixed-point processing of the network parameters of the convolutional neural network therefore reduces the hardware consumption of such a platform.

The convolutional neural network can be applied to an image processing process, and the embodiment of the invention provides an image processing method based on the convolutional neural network.

In the arithmetic unit, data processing is performed using a fixed-point bit width hardware resource amount. For example, the fixed-point bit width hardware resource amount of the DSP of the FPGA may be 4 bits to 8 bits, i.e., the DSP of the FPGA may support the calculation of the fixed-point number of 4 bits to 8 bits. Accordingly, in the fixed-point processing, the floating-point number needs to be fixed to a fixed-point number of 4 bits to 8 bits.
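As a minimal sketch of what fixing a floating-point value to a given bit width can look like, the following assumes a signed two's-complement format in which `total_bits` is the fixed-point bit width and `frac_bits` of those bits hold the fraction. The function names and the rounding and saturation choices are illustrative assumptions, not taken from the embodiment.

```python
def to_fixed(value: float, total_bits: int, frac_bits: int) -> int:
    """Quantize a float to a signed fixed-point integer code.

    Assumption: signed two's-complement layout with `frac_bits` fractional
    bits; out-of-range values saturate to the representable limits.
    """
    scale = 1 << frac_bits
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    code = round(value * scale)          # nearest representable level
    return max(lowest, min(highest, code))


def from_fixed(code: int, frac_bits: int) -> float:
    """Recover the real value represented by a fixed-point code."""
    return code / (1 << frac_bits)


if __name__ == "__main__":
    for bits in (8, 4):
        code = to_fixed(0.3, total_bits=bits, frac_bits=2)
        print(bits, code, from_fixed(code, frac_bits=2))
```

With fewer bits, fewer levels are available, so values are rounded more coarsely and saturate sooner.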

S102, acquiring an image to be processed.

Illustratively, under the scenes of lane line detection, lane line fitting and the like, a camera of the intelligent driving vehicle can acquire a road image, and the processor can acquire the road image acquired by the camera, wherein the road image is an image to be processed.

S103, controlling the operation unit to process the image according to the fixed-point network parameters to obtain the processing result of the image.

Optionally, the processing of the image may include at least one of fixed-point multiplication, fixed-point addition, and shift operations.
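The three operations can be illustrated with a small multiply-accumulate sketch. It assumes that weights and activations are integer codes sharing the same number of fractional bits (`frac_bits`); the helper names are hypothetical and do not describe the DSP's actual interface.

```python
def fixed_mul(a: int, b: int, frac_bits: int) -> int:
    """Fixed-point multiplication followed by a rescaling shift.

    The raw product carries 2 * frac_bits fractional bits, so an arithmetic
    right shift by frac_bits brings it back to the common format.
    """
    return (a * b) >> frac_bits


def fixed_mac(weights, activations, frac_bits: int) -> int:
    """Multiply-accumulate over integer codes, shifting once at the end."""
    acc = 0
    for w, x in zip(weights, activations):
        acc += w * x                 # fixed-point multiplication and addition
    return acc >> frac_bits          # shift back to frac_bits fractional bits


if __name__ == "__main__":
    # With 4 fractional bits: 1.5 -> 24, 2.0 -> 32, 0.5 -> 8, 1.0 -> 16
    result = fixed_mac([24, 8], [32, 16], frac_bits=4)
    print(result / 16)               # real value of 1.5 * 2.0 + 0.5 * 1.0
```

Shifting once after accumulation, rather than after every product, is a design choice of this sketch that avoids repeated truncation.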

Different image processing results can be obtained in different scenes. The processing result of the image may include at least one of: feature extraction results, segmentation results, classification results, and object detection/tracking results.

For example, the object detection result may be a detection result of a lane line.

In this embodiment, firstly, the convolutional neural network is fixed-point processed according to the fixed-point bit width hardware resource amount of the operation unit, and after the image to be processed is obtained, the operation unit is controlled to process the image according to the network parameter after the fixed-point processing, so that efficient image processing using the convolutional neural network on a platform with limited hardware resources is realized. By performing fixed-point processing on the convolutional neural network, accelerated operation can be realized on a platform with limited hardware resources, so that the requirements on low power consumption and quick response of an operation platform are met.

When the fixed-point processing is performed in step S101, a staged-approximation fixed-point method, a hybrid fixed-point method that clusters the layer output parameters of the convolutional neural network, or a combination of the two may be used.

In the staged-approximation fixed-point method, fixed-point parameters are first determined according to the network parameters of the convolutional neural network, and first fixed-point processing is then performed on the network parameters according to the fixed-point parameters, a preset fixed-point function and the fixed-point bit width hardware resource amount, so that a first fixed-point convolutional neural network is obtained.

In the hybrid fixed-point method that clusters the layer output parameters of the convolutional neural network, the layer output parameters in the convolutional neural network are clustered to obtain a preset number of clustering results, and second fixed-point processing is then performed according to the clustering results, the convolution parameters in the convolutional neural network and the fixed-point bit width hardware resource amount to obtain a second fixed-point convolutional neural network.

When the two are combined, the staged-approximation fixed-point processing may be performed first, followed by the hybrid fixed-point processing.

Each method is described below.

Fig. 2 is a schematic flowchart of a second embodiment of the image processing method according to the embodiment of the present invention. As shown in Fig. 2, the staged-approximation fixed-point process includes:

S201, determining a fixed-point parameter according to the network parameters of the convolutional neural network, wherein the fixed-point parameter identifies the number of decimal places of the fixed-point number corresponding to the network parameters.

Optionally, the input of the embodiment of the present invention is a full-precision floating-point convolutional neural network, that is, before the method of the embodiment of the present invention is executed, the network parameters in the convolutional neural network are represented by using floating-point numbers.

Optionally, the network parameters may specifically include convolution parameters and layer output parameters. The layer output parameter refers to an intermediate layer output result of the convolutional neural network, such as an intermediate layer characteristic diagram.

Optionally, the fixed-point parameter is used to identify a decimal number of the fixed-point number corresponding to the network parameter. Before the fixed-point is carried out by using the linear function, the decimal digit of the fixed-point number can be determined according to the network parameters in the convolutional neural network.

S202, according to the fixed-point parameters, a preset fixed-point function and the fixed-point bit width hardware resource quantity, performing first fixed-point on the network parameters to obtain a first fixed-point convolutional neural network, wherein the fixed-point function is a linear function with a preset slope.

Optionally, after the fixed-point parameter is determined, the fixed-point parameter, the fixed-point function, and the fixed-point bit width hardware resource amount of the arithmetic unit may be combined to perform first fixed-point.

Equation (1) below is an example of a fixed-point function.

$$x_s = x_q + a\,(x - x_q) \qquad (1)$$

Here a is the slope, x is the value of the network parameter, and x_q is the fixed-point level closest to x. When a is 1.0, x_s = x, which corresponds to no fixed-point processing. When a is 0, x_s = x_q, which corresponds to complete fixed-point processing; in this case the fixed-point function is locally a step function. In this embodiment, a is changed gradually from 1.0 to 0 through a staged process. It should be noted that the above fixed-point function is only one implementation, and a person skilled in the art may choose another linear function with a preset slope as the fixed-point function according to actual needs.
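The behaviour of the fixed-point function at the two ends of the slope range can be checked with a short sketch. It assumes that the fixed-point levels are the signed values representable with `total_bits` bits, `frac_bits` of them fractional; the function names are illustrative only.

```python
def nearest_level(x: float, frac_bits: int, total_bits: int = 8) -> float:
    """Return the fixed-point level closest to x (assumed signed format)."""
    scale = 1 << frac_bits
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    code = max(lowest, min(highest, round(x * scale)))
    return code / scale


def fixed_point_fn(x: float, a: float, frac_bits: int, total_bits: int = 8) -> float:
    """Linear fixed-point function x_s = x_q + a * (x - x_q)."""
    x_q = nearest_level(x, frac_bits, total_bits)
    return x_q + a * (x - x_q)


if __name__ == "__main__":
    for a in (1.0, 0.5, 0.0):
        print(a, fixed_point_fn(0.3, a, frac_bits=2))
```

With a = 1.0 the input is returned essentially unchanged, and with a = 0 the output is exactly the nearest level.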

Optionally, in a specific implementation, this step may be performed multiple times in stages. In each stage, a slope is first assigned to the fixed-point function, and the first fixed-point processing is performed based on that slope. After the first fixed-point processing of the stage is completed, the slope may be corrected and a new first fixed-point processing performed based on the corrected slope, and so on until the slope of the fixed-point function approaches the slope of the step function. For example, during training of the convolutional neural network, each training round performs the first fixed-point processing with a preset slope. In the next round, it is determined whether the difference between the fixed-point function and the target step function is greater than a preset difference, for example whether the slope difference is greater than the preset difference. If so, the first fixed-point processing continues with a new slope, and so on until the slope of the fixed-point function approaches the slope of the step function.

For example, treating the step function as a linear function with a slope of 0, a slope of 1 may be assigned to the fixed-point function in the initial stage, and the network parameters are fixed-pointed based on that function. In subsequent stages, the slope is gradually reduced and fixed-point processing is performed with the slope assigned in each stage, until the slope approaches 0.

Taking the example of executing fixed-point processing in the training process of the convolutional neural network, the process of executing fixed-point processing in the training process is as follows:

When the training result meets expectations, the slope assignment, fixed-point processing and neural network training of the second stage are performed based on the currently obtained convolutional neural network, and so on until the slope approaches 0. Through this process, the result of the fixed-point function gradually approaches the fully fixed-point result in stages.
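The staged procedure can be sketched as a loop over decreasing slopes. The linear slope schedule, the number of stages and the `retrain` placeholder are assumptions made for illustration; the text does not specify how the slope is reduced or how training is carried out between stages, and range saturation is omitted for brevity.

```python
def nearest_level(x: float, frac_bits: int) -> float:
    """Closest fixed-point level with frac_bits fractional bits (no saturation)."""
    scale = 1 << frac_bits
    return round(x * scale) / scale


def fixed_point_fn(x: float, a: float, frac_bits: int) -> float:
    """Linear fixed-point function x_s = x_q + a * (x - x_q)."""
    x_q = nearest_level(x, frac_bits)
    return x_q + a * (x - x_q)


def staged_slopes(num_stages: int):
    """Assumed schedule: slopes spaced evenly from 1.0 down to 0.0."""
    if num_stages < 2:
        raise ValueError("need at least two stages")
    return [1.0 - i / (num_stages - 1) for i in range(num_stages)]


def retrain(params):
    """Hypothetical placeholder for the per-stage network training."""
    return params


def staged_fixed_point(params, frac_bits: int, num_stages: int = 5):
    """Apply the fixed-point function stage by stage with a shrinking slope."""
    current = list(params)
    history = []
    for a in staged_slopes(num_stages):
        current = [fixed_point_fn(p, a, frac_bits) for p in current]
        current = retrain(current)       # training step between stages
        history.append((a, list(current)))
    return current, history


if __name__ == "__main__":
    final, history = staged_fixed_point([0.3, 1.1, -0.6], frac_bits=2)
    for a, values in history:
        print(a, values)
```

In the last stage the slope is 0, so every parameter lands exactly on a fixed-point level.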

In the embodiment, the convolution parameters and the layer output parameters of the convolution neural network are fixed by using the fixed-point function with the specific slope, so that the fixed-point result can gradually approach the complete fixed-point result, the precision loss during fixed-point is reduced, the fixed-point error is reduced, and the fixed-point accuracy is improved.

The following describes the hybrid fixed-point method that clusters the layer output parameters of a convolutional neural network, and the process of combining the two methods.

Fig. 3 is a schematic flowchart of a third embodiment of the image processing method according to the embodiment of the present invention. As shown in Fig. 3, the hybrid fixed-point method that clusters the layer output parameters of a convolutional neural network includes:

S301, clustering the layer output parameters in the convolutional neural network to obtain a preset number of clustering results.

If the above two methods are combined, the convolutional neural network described in this step is the first fixed-point convolutional neural network.

In this embodiment, the first fixed-point convolutional neural network is taken as an example for description.

Optionally, the above layer output parameters may be clustered by using a K-means clustering (Kmeans) method.

Optionally, after the first fixed-point processing, the number of decimal places of the resulting layer output parameters is the number of decimal places identified by the fixed-point parameter. On this basis, the layer output parameters are clustered.

For example, assuming that the layer output parameter is a fixed-point number of k+2 bits, k-bit clustering may be performed on the layer output parameters using the K-means method, where k is an integer greater than 0.

S302, performing second fixed-point processing according to the clustering results, the convolution parameters in the convolutional neural network and the fixed-point bit width hardware resource amount to obtain a second fixed-point convolutional neural network.

Alternatively, the second fixed-point processing may be fixed-point processing using a step function.

In this embodiment, by clustering the layer output parameters of the convolutional neural network and performing second stationing on the clustering result, the stationing bit width can be reduced on the premise of ensuring the stationing precision, and thus the bandwidth occupation is reduced. Illustratively, when the convolutional neural network according to the embodiment of the present invention is applied to an FPGA, after the layer output parameters are clustered and represented, the layer output parameters may be represented by using a lower number of bits, so that a corresponding target may be generated. For example, if K-means clustering is used to obtain a number of 16 clusters, each layer output parameter can be represented by using 4 bits. Furthermore, when the FPGA reads data, only 4 bits can be read, namely, only less bits are needed to be read than before clustering, and then real data can be obtained in a table look-up mode, so that the bandwidth of the FPGA is reduced.
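The clustering and table-lookup idea can be sketched with a small one-dimensional K-means written in plain Python. The initialization (evenly spaced picks from the sorted distinct values), the iteration cap and all function names are assumptions of this sketch; the cluster centers are snapped to the nearest fixed-point level to form the lookup table.

```python
def nearest_index(value: float, centers) -> int:
    """Index of the center closest to value."""
    return min(range(len(centers)), key=lambda j: abs(value - centers[j]))


def kmeans_1d(values, k: int, iters: int = 50):
    """Plain 1-D K-means; initial centers are chosen from the data itself."""
    uniq = sorted(set(values))
    if k < 1 or len(uniq) < k:
        raise ValueError("need at least k distinct values")
    step = (len(uniq) - 1) / (k - 1) if k > 1 else 0.0
    centers = [uniq[int(i * step)] for i in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in values:
            buckets[nearest_index(v, centers)].append(v)
        updated = [sum(b) / len(b) if b else centers[j]
                   for j, b in enumerate(buckets)]
        if updated == centers:           # converged
            break
        centers = updated
    return centers


def snap(center: float, frac_bits: int) -> float:
    """Fix a cluster center to the closest fixed-point level."""
    scale = 1 << frac_bits
    return round(center * scale) / scale


def bits_per_code(k: int) -> int:
    """Bits needed to index k clusters, e.g. 4 bits for 16 clusters."""
    return max(1, (k - 1).bit_length())


if __name__ == "__main__":
    outputs = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
    centers = kmeans_1d(outputs, k=2)
    table = [snap(c, frac_bits=2) for c in centers]        # lookup table
    codes = [nearest_index(v, centers) for v in outputs]   # stored indices
    print(bits_per_code(len(table)), codes, [table[c] for c in codes])
```

Only the small indices need to be transferred per parameter; the table of snapped centers is shared and read once.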

On the basis of the above embodiments, this embodiment relates to the process of determining the fixed-point parameter from the network parameters of the convolutional neural network.

Optionally, step S201 includes:

determining the fixed-point parameter according to the number of decimal places of the convolution parameters and the number of decimal places of the layer output parameters.

Optionally, before the method according to the embodiment of the present invention is executed, the convolution parameters and the layer output parameters in the convolutional neural network are floating-point numbers. In this embodiment, the fixed-point parameter may be determined according to the numbers of decimal places of these floating-point numbers.

In an optional implementation, the largest number of decimal places among those of the convolution parameters and the layer output parameters may be used as the number of decimal places corresponding to the fixed-point parameter.

For example, assuming that the convolutional neural network has one convolution parameter with a value of 1.1 and two layer output parameters with values of 1.23 and 3.354, the largest number of decimal places is 3, so the number of decimal places corresponding to the fixed-point parameter is determined to be 3.

In another optional implementation, the number of decimal places corresponding to the fixed-point parameter may be the number of decimal places that accounts for the largest proportion among the decimal places of the convolution parameters and the layer output parameters.

For example, assuming that there are two convolution parameters with values of 1.1 and 1.2 and two layer output parameters with values of 1.23 and 3.354, the number of decimal places with the largest proportion is 1 (two of the four parameters), so the number of decimal places corresponding to the fixed-point parameter is determined to be 1.
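Illustratively, the two optional ways of determining the number of decimal places may be sketched as follows. This is a minimal sketch: the function names are hypothetical, the parameters are assumed to be available as decimal strings so that their decimal places can be counted exactly, and the tie-breaking rule of the second function (first value seen wins) is an assumption not stated in this embodiment.

```python
from collections import Counter
from decimal import Decimal


def decimal_places(text):
    """Count the digits after the decimal point of a decimal string."""
    exponent = Decimal(text).as_tuple().exponent
    return max(0, -exponent)


def max_decimal_places(params):
    """First option: the largest number of decimal places among all parameters."""
    return max(decimal_places(p) for p in params)


def most_common_decimal_places(params):
    """Second option: the number of decimal places with the largest proportion.

    Assumption: on a tie, the count that was seen first is returned.
    """
    counts = Counter(decimal_places(p) for p in params)
    return counts.most_common(1)[0][0]


# Values taken from the two examples in the text.
first_option = max_decimal_places(["1.1", "1.23", "3.354"])
second_option = most_common_decimal_places(["1.1", "1.2", "1.23", "3.354"])
```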

On the basis of the above embodiments, this embodiment relates to the process of performing the first fixed-point processing on the network parameters.

Fig. 4 is a flowchart illustrating a fourth embodiment of the image processing method according to the embodiment of the present invention, and as shown in fig. 4, the step S202 includes:

S401, determining the fixed-point levels of the network parameters according to the number of bits of the fixed-point bit width hardware resource amount and the fixed-point parameter.

After the fixed-point parameter is obtained through the above process, the fixed-point level of the network parameter may be determined according to the bit number of the fixed-point bit width hardware resource amount and the fixed-point parameter.

For example, assuming that the number of bits of the fixed-point bit width hardware resource amount is 2, and the number of decimal places corresponding to the fixed-point parameter is determined to be 0 through the above process, it may be determined that the fixed-point levels of the network parameters include -2, -1, 0, and 1.
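Illustratively, the enumeration of fixed-point levels may be sketched as follows. This is a minimal sketch under assumptions that this embodiment does not state: the levels are assumed to follow a signed two's-complement range, and the spacing between levels is assumed to be `radix ** -frac_digits`, where the name `fixed_point_levels` and the `radix` parameter are hypothetical.

```python
def fixed_point_levels(bit_width, frac_digits, radix=2):
    """List every representable level for a signed fixed-point format.

    Assumptions (not stated in the source): two's-complement range and a
    level spacing of radix ** -frac_digits.
    """
    step = radix ** -frac_digits
    lowest = -(1 << (bit_width - 1))
    highest = (1 << (bit_width - 1)) - 1
    return [i * step for i in range(lowest, highest + 1)]


# The example in the text: 2 bits and 0 decimal places.
levels = fixed_point_levels(2, 0)
```

With 2 bits and 0 decimal places this reproduces the four levels of the example; other bit widths simply enlarge the range.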

S402, performing the first fixed-point processing on the network parameters according to the fixed-point levels and the fixed-point function, to obtain a first fixed-point convolutional neural network.

As an optional implementation, the first fixed-point processing may be performed on the network parameters in the following manner.

Fig. 5 is a schematic flowchart of a fifth embodiment of the image processing method according to the embodiment of the present invention, and as shown in fig. 5, the step S402 includes:

S501, determining a difference between the value of the network parameter and a target level, where the target level is the fixed-point level closest to the value of the network parameter.

S502, inputting the difference and the fixed-point level into the fixed-point function to obtain a result of performing the first fixed-point on the network parameter.

Optionally, after the fixed-point levels of the network parameters are determined, taking one network parameter A as an example, the first fixed-point processing may be performed on network parameter A as follows; the remaining network parameters are processed in the same way:

Assuming that network parameter A is a convolution kernel parameter, the fixed-point level closest to the convolution kernel parameter is determined first. Optionally, this level may be determined by calculating the differences between the convolution kernel parameter and the fixed-point levels.

For example, assuming that the fixed-point levels of the convolution kernel parameter are determined to be -2, -1, 0, and 1 through the above steps, and the value of the convolution kernel parameter is -1.2, the fixed-point level closest to the convolution kernel parameter is determined to be -1 by calculating the differences between the parameter and the levels.

Then, the difference and the target level are input into the fixed-point function to obtain the result of the first fixed-point processing of network parameter A.

Optionally, assume that the fixed-point function is expressed by the following formula (2).

$$L + S \cdot k \tag{2}$$

Here, L is the target level, S is the difference between network parameter A and the target level, and k is the slope corresponding to the current stage. Inputting the target level and the difference determined through the above process into formula (2) yields the first fixed-point result of network parameter A.

It should be noted that the above-mentioned fixed-point function is only one implementation manner, and those skilled in the art can determine a function related to the target level and the difference between the network parameter and the target level as the fixed-point function according to actual needs.
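Illustratively, steps S501 and S502 with the fixed-point function of formula (2) may be sketched as follows. This is a minimal sketch: the function name `first_fixed_point` and the slope value 0.5 are hypothetical, and the difference S is assumed to be signed (parameter value minus target level).

```python
def first_fixed_point(value, levels, slope):
    """Apply L + S * k to one network parameter.

    L is the fixed-point level closest to the value, S is the signed
    difference (value - L), and k is the slope. The slope value used in
    the example below is an assumption chosen for illustration.
    """
    target = min(levels, key=lambda level: abs(value - level))  # S501: closest level
    diff = value - target                                        # S501: difference
    return target + diff * slope                                 # S502: formula (2)


levels = [-2, -1, 0, 1]
result = first_fixed_point(-1.2, levels, slope=0.5)  # close to -1.1
```

Under formula (2), a slope of 0 would return the target level itself, so a smaller slope brings the result closer to the output of a step function.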

On the basis of the above embodiments, the present embodiment relates to a process of clustering the layer output parameters.

Fig. 6 is a schematic flow chart of a sixth embodiment of the image processing method according to the embodiment of the present invention, and as shown in fig. 6, the process of clustering in step S301 includes:

S601, selecting a preset number of layer output parameters from the layer output parameters as initial clustering centers.

S602, performing clustering iteration processing according to the initial clustering centers to obtain the preset number of clustering results.

For example, assuming that the preset number of bits is 6, that is, the network parameters are represented by 6 bits, 4-bit clustering may be performed on the layer output parameters in this embodiment. Since 4 bits can identify 16 numbers, the preset number may be 16.

Initially, 16 layer output parameters are selected from the layer output parameters according to a specific principle and used as the initial clustering centers. The remaining layer output parameters are then each assigned to a clustering center according to their degree of similarity to the initial clustering centers, that is, the difference (the smaller the difference, the higher the similarity), forming a plurality of clusters. Next, a new clustering center is calculated for each cluster, and the clustering iteration continues on the basis of the new clustering centers until the differences of the layer output parameters within each cluster satisfy the convergence condition, thereby obtaining 16 clustering results.
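Illustratively, the clustering iteration of S601 and S602 may be sketched for scalar layer output parameters as follows. This is a minimal sketch under stated assumptions: the function name `kmeans_1d` is hypothetical, the "specific principle" for choosing initial centers is assumed to be an even spread over the sorted values, and convergence is assumed to mean that no center moves by more than a tolerance.

```python
def kmeans_1d(values, k, max_iters=100, tol=1e-9):
    """Cluster scalar values into k clusters with K-means.

    Assumptions: initial centers are spread evenly over the sorted values,
    and iteration stops once no center moves by more than tol.
    """
    ordered = sorted(values)
    if k == 1:
        centers = [ordered[0]]
    else:
        centers = [ordered[(i * (len(ordered) - 1)) // (k - 1)] for i in range(k)]

    clusters = [[] for _ in range(k)]
    for _ in range(max_iters):
        # Assign every value to the center with the smallest difference.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Recompute each center as the mean of its cluster (keep it if empty).
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, clusters


data = [0.1, 0.2, 0.15, 5.0, 5.1, 4.9, 10.0, 10.2, 9.8]
centers, clusters = kmeans_1d(data, k=3)
```

For the 4-bit clustering of the example, `k` would be 16; three clusters are used here only to keep the data small.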

On the basis of the foregoing embodiments, this embodiment relates to the process of performing the second fixed-point processing on the clustering result.

Fig. 7 is a schematic flowchart of a seventh embodiment of an image processing method according to an embodiment of the present invention, and as shown in fig. 7, the step S302 includes:

S701, determining the clustering center of the clustering result.

After the above processing, a plurality of clusters is obtained. For each cluster, the second fixed-point result may be determined according to the process of S701 to S703 in this embodiment.

Assuming that one of the clusters is cluster 1, in this step, the cluster center of the cluster 1 is determined first.

Alternatively, the cluster center may be an average value of a plurality of layer output parameters included in the cluster 1.

S702, performing the second fixed-point processing on the clustering center to obtain a second fixed-point result of the clustering center, where the second fixed-point processing fixes the clustering center to the fixed-point level closest to it.

After the clustering center of cluster 1 is determined, the second fixed-point processing may be performed on it. The second fixed-point processing may use a step function, that is, the step function fixes the clustering center to the fixed-point level closest to the clustering center.

Illustratively, assuming that the clustering center of cluster 1 is 1.2, the second fixed-point processing using the step function fixes the clustering center to 1.

S703, taking the second fixed-point result of the clustering center as the second fixed-point result of the clustering result.

After the second fixed-point result of the clustering center of cluster 1 is determined, it can be used as the second fixed-point result of every layer output parameter included in cluster 1.
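Illustratively, steps S701 to S703 may be sketched as follows. This is a minimal sketch: the function names are hypothetical, the clustering center is taken as the average of the cluster members (the optional definition given above), and a tie between two equally close levels is assumed to resolve to the level listed first.

```python
def step_quantize(value, levels):
    """Step function: map a value to the closest fixed-point level.

    Assumption: on a tie, the level that appears first in levels is chosen.
    """
    return min(levels, key=lambda level: abs(value - level))


def second_fixed_point(clusters, levels):
    """Return one second fixed-point result per cluster."""
    results = []
    for members in clusters:
        center = sum(members) / len(members)          # S701: clustering center
        results.append(step_quantize(center, levels))  # S702: closest level
    return results                                     # S703: shared by all members


levels = [-2, -1, 0, 1]
clusters = [[1.1, 1.2, 1.3], [-1.6, -1.8]]  # hypothetical layer output parameters
results = second_fixed_point(clusters, levels)
```

The first cluster has a center of about 1.2 and is fixed to 1, which matches the example in the text.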

Because clustering uses fewer bits than the number of bits that represent the layer output parameters (for example, if the layer output parameters are represented by 6 bits, 4-bit clustering is performed), all layer output parameters can be represented with fewer bits once the second fixed-point processing is completed. Clustering therefore reduces the fixed-point bit width while maintaining the fixed-point precision, and thus reduces bandwidth occupation.

Fig. 8 is a block diagram of a first embodiment of an image processing apparatus according to the present invention, as shown in fig. 8, the apparatus includes:

A first processing module 801, configured to perform fixed-point processing on network parameters expressed by floating-point numbers in a convolutional neural network according to a fixed-point bit width hardware resource amount of an operation unit, where the network parameters include convolution parameters and layer output parameters of the convolutional neural network.

An obtaining module 802, configured to obtain an image to be processed.

A second processing module 803, configured to control the operation unit to process the image according to the network parameters after the fix-point processing, so as to obtain a processing result of the image.

The device is used for realizing the method embodiments, the realization principle and the technical effect are similar, and the details are not repeated here.

Fig. 9 is a block diagram of a second embodiment of an image processing apparatus according to the present invention, and as shown in fig. 9, the first processing module 801 includes:

a clustering unit 8011, configured to cluster the layer output parameters in the convolutional neural network to obtain a preset number of clustering results.

A second fixed-point unit 8012, configured to perform second fixed-point processing according to the clustering result and the convolution parameter in the convolutional neural network and according to the fixed-point bit width hardware resource amount, to obtain the convolutional neural network after second fixed-point processing.

Fig. 10 is a block configuration diagram of a third embodiment of an image processing apparatus according to an embodiment of the present invention, and as shown in fig. 10, the first processing module 801 further includes:

a determining unit 8013, configured to determine a fixed-point parameter according to a network parameter of the convolutional neural network, where the fixed-point parameter is used to identify a decimal number of a fixed-point number corresponding to the network parameter;

a first fixed-point unit 8014, configured to perform first fixed-point processing on the network parameters according to the fixed-point parameter, a preset fixed-point function, and the fixed-point bit width hardware resource amount, to obtain a first fixed-point convolutional neural network, where the fixed-point function is a linear function with a preset slope, and the preset slope is greater than 0.

In another embodiment, the first processing module 801 is specifically configured to:

In another embodiment, the determining unit 8013 is specifically configured to:

In another embodiment, the first fixed-point unit 8014 is specifically configured to:

and determining the fixed-point level of the network parameter according to the bit number of the fixed-point bit width hardware resource quantity and the fixed-point parameter.

determining a clustering center of the clustering result;

Fig. 11 is a block configuration diagram of a fourth embodiment of an image processing apparatus according to an embodiment of the present invention, and as shown in fig. 11, the apparatus further includes:

an updating module 804, configured to modify a slope of the fixed-point function when a difference between the fixed-point function and a target step function is greater than a preset difference; and performing new first localization according to the corrected slope.

In another embodiment, the second processing module 803 is specifically configured to:

fixed point multiplication, fixed point addition, and shift.

In another embodiment, the processing result of the image comprises at least one of:

Fig. 12 is a schematic structural diagram of an FPGA-based vehicle-mounted computing platform according to an embodiment of the present invention. As shown in fig. 12, the FPGA-based vehicle-mounted computing platform includes: a processor 1201, an external memory 1202, a memory 1203 and an FPGA operation unit 1204; wherein,

The external memory 1202 stores the network parameters of the convolutional neural network after the fixed-point processing, and the network parameters include the convolution parameters and the layer output parameters of the convolutional neural network.

The processor 1201 reads the fixed-point-processed network parameters of the convolutional neural network into the memory 1203, and inputs the data in the memory 1203 and the image to be processed into the FPGA operation unit 1204.

The FPGA operation unit 1204 performs operation processing according to the image to be processed and the fixed-point-processed network parameters, so as to obtain a processing result of the image.

The processor 1201 may also perform the method steps in the foregoing method embodiment, which may specifically refer to the foregoing method embodiment, and details are not described here.

Fig. 13 is a block diagram of an electronic device 1300 according to an embodiment of the present invention, and as shown in fig. 13, the electronic device 1300 includes:

a memory 1301 for storing program instructions.

The processor 1302 is configured to call and execute the program instructions in the memory 1301, and perform the method steps in the above method embodiments.

The embodiment of the invention further provides an intelligent driving system, which comprises the electronic device shown in fig. 13.

Those of ordinary skill in the art will understand that all or a portion of the steps of the above method embodiments may be implemented by hardware associated with program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the method embodiments described above. The aforementioned storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, or optical disks.

Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. An image processing method, comprising:

acquiring an image to be processed;

2. The method according to claim 1, wherein the performing a fixed-point processing on the network parameters of the convolutional neural network according to the fixed-point bit width hardware resource amount of the arithmetic unit includes:

3. The method according to claim 1, wherein the performing a fixed-point processing on the network parameters of the convolutional neural network according to the fixed-point bit width hardware resource amount of the arithmetic unit includes:

4. The method of claim 3, further comprising:

5. The method of claim 3 or 4, wherein determining the fixed-point parameters from the network parameters of the convolutional neural network comprises:

6. An image processing apparatus characterized by comprising:

the acquisition module is used for acquiring an image to be processed;

7. A vehicle-mounted operation platform based on a field programmable gate array (FPGA), characterized by comprising: a processor, an external memory, a memory and an FPGA operation unit;

8. An electronic device, comprising:

a memory for storing program instructions;

a processor for invoking and executing program instructions in said memory for performing the method steps of any of claims 1-5.

9. An intelligent driving system, comprising the electronic device of claim 8.

10. A readable storage medium, characterized in that a computer program is stored in the readable storage medium for performing the method of any of claims 1-5.